Wednesday, August 22, 2012

Rudimentary benchmarks for the Parallax Propeller.

I'm in the early stages of a big, ambitious new project that I've wanted to do for a long time. I've had some Propeller chips from Parallax sitting around for quite a while, so I thought I'd try to incorporate the 'Prop' as one of the many microcontrollers this project will have.

The Propeller is quite a unique processor; I'd even go so far as to say its design is exotic. What it lacks in the specialized hardware found in most other MCUs, it makes up for by having 8, that's eight, cores. You want I2C? You just dedicate a core to doing it in software. One of the few exceptions is something not found on many other MCUs: it has dedicated video generation circuitry.

A custom language, "Spin", has historically been the primary high-level language used on the chip; Parallax designed a byte-code interpreter for it into the ROM. You can also use assembly, especially for performance-critical code. C/C++ was for years an experimental language on the chip, but that has changed in the past year or so: there is now a full-fledged GCC port for it. I decided I'd give it a try.

Having been lured to try the chip by the recent advances in GCC, I decided that I'd first do some experimentation to test the performance and build sizes.

There are 3 C/C++ "memory models" used on Propeller chips: COG, LMM, and XMM. I'm not going to go in depth on the explanation of those; you can find a great deal of detail here. The TL;DR version: COG is internal memory private to each core (very fast); LMM (slower) is internal shared memory that must be read word by word into COG memory; XMM (slowest) is memory external to the chip. Each one has pros and cons. COG memory is very fast but very limited, at 2 kB per core. LMM gives you access to the 32 kB of shared memory, and XMM is virtually unlimited.
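To make those size limits a bit more concrete, here is a rough sketch in C that picks the fastest memory model a given amount of code could fit in. The enum and function names are made up for illustration, and the check ignores the space taken by the LMM/XMM kernels, the stack, and data, so treat it as a back-of-the-envelope aid and not as how the toolchain decides anything.

```c
#include <stddef.h>

/* Sizes from the text above: 2 kB private to each core, 32 kB shared. */
#define COG_RAM_BYTES (2u * 1024u)
#define HUB_RAM_BYTES (32u * 1024u)

/* Hypothetical names, for illustration only. */
typedef enum { MODEL_COG, MODEL_LMM, MODEL_XMM } memory_model;

/* Return the fastest model whose memory could hold `code_bytes` of code.
   Simplification: kernel, stack, and data overhead are ignored. */
memory_model fastest_model_that_fits(size_t code_bytes)
{
    if (code_bytes <= COG_RAM_BYTES)
        return MODEL_COG;
    if (code_bytes <= HUB_RAM_BYTES)
        return MODEL_LMM;
    return MODEL_XMM;   /* external memory, effectively unlimited */
}

/* Human-readable name for a model, e.g. for printing. */
const char *model_name(memory_model m)
{
    switch (m) {
    case MODEL_COG: return "COG";
    case MODEL_LMM: return "LMM";
    default:        return "XMM";
    }
}
```

For instance, a few hundred bytes of code would come back as COG, a 10 kB program as LMM, and anything over 32 kB as XMM.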

I decided to create a simple program in C, Spin, and assembly ("PASM"). This program simply toggles a pin on and off in a loop. I then measured the frequency on the pin to gauge the speed of the code.

Here are the results:
Language  Optimization  Memory model  Code size (bytes)  Blink speed (kHz)
Spin      -             -             28                 60
PASM      -             -             36                 6667
C         size          COG           180                2854
C         speed         COG           180                2854
C         mixed         COG           192                2854
C         none          COG           196                2221
C         size          LMM           2448               2856
C         speed         LMM           2448               2856
C         mixed         LMM           2456               635
C         none          LMM           2456               455
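One way to read the table is to convert each measured frequency into system clocks per on/off cycle of the pin. The sketch below does that in C, assuming the 80 MHz system clock that a 5 MHz crystal with a 16x PLL gives (the clock settings used in the listings). The function names are mine, not part of any Propeller library.

```c
#include <assert.h>

/* Assumed system clock: 5 MHz crystal x 16 PLL = 80 MHz = 80000 kHz. */
#define SYSCLK_KHZ (5000.0 * 16.0)

/* System clocks spent per on/off cycle of the pin, given the measured
   blink frequency in kHz (one cycle = one pass through the loop). */
double clocks_per_cycle(double blink_khz)
{
    assert(blink_khz > 0.0);
    return SYSCLK_KHZ / blink_khz;
}

/* How many times faster one measurement is than another. */
double speedup(double fast_khz, double slow_khz)
{
    assert(slow_khz > 0.0);
    return fast_khz / slow_khz;
}
```

With the numbers from the table, `clocks_per_cycle(6667)` comes out at about 12, which is what three 4-clock instructions would take. `clocks_per_cycle(2854)` is about 28 for the optimized C builds, and `clocks_per_cycle(60)` is roughly 1333 for Spin. `speedup(6667, 60)` puts PASM at around 111 times the Spin figure.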

Here is the Spin code I used:

CON
  _clkmode = xtal1 + pll16x     '5 MHz crystal x 16 PLL = 80 MHz system clock
  _xinfreq = 5_000_000

PUB LedOnOff

    dira[15] := 1               'make pin 15 an output
    repeat                      'toggle forever
        outa[15] := 1
        outa[15] := 0

And the PASM:

CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

PUB go

  cognew(@asm_entry, 0)   'launch assembly program into a COG

DAT
' Assembly program
asm_entry     org
              or        dira, DrivePinMask      'make pin 15 an output
:loop         or        outa, DrivePinMask      'drive the pin high
              andn      outa, DrivePinMask      'drive the pin low
              jmp       #:loop                  'repeat forever

DrivePinMask  long      $00008000               'bit 15 set

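And finally, the C code:

#include <propeller.h>

#define pin 15
#define mask (1 << pin)

int main(void)
{
    DIRA |= mask;           /* make pin 15 an output */

    while (1) {             /* toggle forever */
        OUTA |= mask;       /* drive the pin high */
        OUTA &= ~mask;      /* drive the pin low */
    }
    return 0;               /* never reached */
}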

I think I might go the route of a mixed Spin/PASM approach after all. I can get the extreme size savings of Spin where I need them, and the extreme speed of PASM where I need it. The purpose of the Propeller chip in this project will be to monitor and control four motors with encoders, and to provide a serial interface for external high-level control and feedback. I don't think I'll need to resort to an XMM model. Also, the test code was small enough that my metrics should be taken with a grain of salt: it's the embedded equivalent of "hello world", not really a great real-world representation.

One thing I've got bookmarked, but have not checked out yet, is the community-developed Propeller dev tools: there's a free PASM debugger, a PC-side Propeller emulator, and a few others that might help with debugging. I'm a big fan of debugging code; that's one reason I love AVRs with my AVR Dragon, where debugging is painless. The Prop has no special debugging hardware, so the debugger tools typically run a special piece of code on one cog that helps in debugging. That's clever, but it does put additional constraints on your application's free resources.


1 comment:

  1. PS. The green numbers in my spreadsheet are essentially what convinced me of the merits of going the traditional Propeller route. My planned functionality is small enough that writing in a proprietary, non-portable language is not as big of a deal. I won't be able to use shared headers for my communication protocol, though.

