Arduino FreeRTOS

Posted on November 24, 2015 by feilipu

For a long time I have been using the AVR port of FreeRTOS as the platform for my Arduino hardware habit. I’ve written (acquired, stolen, and corrupted) a plethora of different drivers and solutions for the various projects I’ve built over the last years. But, sometimes it would be nice to just try out a new piece of hardware in a solid multi-tasking environment without having to dive into the datasheets and write code. Also, when time is of the essence rewriting someone’s existing driver is just asking for stress and failure.

So recently, with an important hack-a-thon coming up, I thought it would be nice to build a robust FreeRTOS implementation that can just shim into the Arduino IDE and allow me to use the best parts of both environments, seamlessly.

Arduino IDE Core is just AVR

One of the good things about the Arduino core environment is that it is just the normal AVR environment with a simple Java IDE added. That means that all of the AVR command line tools used to build Arduino sketches will also just work my AVR port of FreeRTOS.

Some key aspects of the AVR FreeRTOS port have been adjusted to create the seamless integration with the Arduino IDE. These optimizations are not necessarily the best use of FreeRTOS, but they make the integration much easier.

FreeRTOS needs to have an interrupt timer to trigger the scheduler to check which task should be using the CPU, and to fairly distribute processing time among equivalent priority tasks. In the case of the Arduino environment all of the normal timers are configured in advance, and therefore are not available for use as the system_tick timer. However, all AVR ATmega devices have a watchdog timer which is driven by an independent 128kHz internal oscillator. Arduino doesn’t configure the watchdog timer, and conveniently the watchdog configuration is identical across the entire ATmega range. That means that the entire range of classic AVR based Arduino boards can be supported within FreeRTOS with one system_tick configuration.

The Arduino environment has only two entry point functions available for the user, setup() and loop(). These functions are written into an .ino file and are linked together with and into a main() function present in the Arduino libraries. The presence of a fixed main() function within the Arduino libraries makes it really easy to shim FreeRTOS into the environment.

The main() function in the main.c file contains a initVariant() weak attribute stub function prior to the internal Arduino initialisation setup() function. By implementing an initVariant() function execution can be diverted into the FreeRTOS environment, after calling the normal setup() initialisation, by simply continuing to start the FreeRTOS scheduler.

int main(void) // Normal Arduino main.cpp. Normal execution order.
{
init();
initVariant(); // Our initVariant() diverts execution from here.
setup(); // The Arduino setup() function.

for (;;)
{
loop(); // The Arduino loop() function.
if (serialEventRun) serialEventRun();
}
return 0;
}

Firstly, this initVariant() function is located in the variantHooks.cpp file in the FreeRTOS library. It replaces the weak attribute function definition in the Arduino core.

void initVariant(void)
{
setup(); // The Arduino setup() function.
vTaskStartScheduler(); // Initialise and run the FreeRTOS scheduler. Execution should never return to here.
}

Secondly, the FreeRTOS idle task is used to run the loop() function whenever there is no unblocked FreeRTOS task available to run. In the trivial case, where there are no configured FreeRTOS tasks, the loop() function will be run exactly as normal, with the exception that a short scheduler interrupt will occur every 15 milli-seconds (configurable). This function is located in the variantHooks.cpp file in the library.

void vApplicationIdleHook( void )
{
loop(); // The Arduino loop() function.
if (serialEventRun) serialEventRun();
}

Putting these small changes into the Arduino IDE, together with a single directory containing the necessary FreeRTOS v10.x.x files configured for AVR, is all that needs to be done to slide the FreeRTOS shim under the Arduino environment.

I have published the relevant files on Github where the commits can be browsed and the repository downloaded. The simpler solution is to install FreeRTOS using the Arduino Library Manager, or download the ZIP files from Github and install manually as a library in your Arduino IDE.

Getting Started with FreeRTOS

Ok, with these simple additions to the Arduino IDE via a normal Arduino library, we can get started.

Firstly in the Arduino IDE Library manager, from Version 1.6.8, look for the FreeRTOS library under the Type: “Contributed” and the Topic: “Timing”.

Ensure that the most recent FreeRTOS library is installed. As of writing that is v10.3.0-4.

Then under the Sketch->Include Library menu, ensure that the FreeRTOS library is included in your sketch. A new empty sketch will look like this.

Compile and upload this empty sketch. This will show you how much of your flash is consumed by the FreeRTOS scheduler. As a guide the following information was compiled using Arduino v1.8.5 on Windows 10.

// Device: loop() -> FreeRTOS | Additional Program Storage
// Uno: 444 -> 6592 | 20%
// Goldilocks: 502 -> 6684 | 5%
// Leonardo: 3462 -> 9586 | 23%
// Yun: 3456 -> 9580 | 23%
// Mega: 662 -> 6898 | 2%

Now test and upload the Blink sketch, with an underlying Real-Time Operating System. That’s all there is to having FreeRTOS running in your sketches. So simple.

Next Steps

Blink_AnalogRead.ino is a good way to take the next step as it combines two basic Arduino examples, Blink and AnalogRead into one sketch with in two separate tasks. Both tasks perform their duties, managed by the FreeRTOS scheduler.

#include <Arduino_FreeRTOS.h>

// define two tasks for Blink and AnalogRead
void TaskBlink( void *pvParameters );
void TaskAnalogRead( void *pvParameters );

// the setup function runs once when you press reset or power the board
void setup()
{
// Now set up two tasks to run independently.
xTaskCreate(
TaskBlink
, "Blink" // A name just for humans
, 128 // This stack size can be checked and adjusted by reading the Stack Highwater
, NULL
, 2 // Priority, with 3 (configMAX_PRIORITIES - 1) being the highest, and 0 being the lowest.
, NULL );

xTaskCreate(
TaskAnalogRead
, "AnalogRead"
, 128 // Stack size
, NULL
, 1 // Priority, with 3 (configMAX_PRIORITIES - 1) being the highest, and 0 being the lowest.
, NULL );

// Now the task scheduler, which takes over control of scheduling individual tasks, is automatically started.
}

void loop()
{
// Empty. Things are done in Tasks. Never Block or delay.
}

/*--------------------------------------------------*/
/*---------------------- Tasks ---------------------*/
/*--------------------------------------------------*/

void TaskBlink(void *pvParameters) // This is a task.
{
(void) pvParameters;

// initialize digital pin 13 as an output.
pinMode(13, OUTPUT);

for (;;) // A Task shall never return or exit.
{
digitalWrite(13, HIGH); // turn the LED on (HIGH is the voltage level)
vTaskDelay( 1000 / portTICK_PERIOD_MS ); // wait for one second
digitalWrite(13, LOW); // turn the LED off by making the voltage LOW
vTaskDelay( 1000 / portTICK_PERIOD_MS ); // wait for one second
}
}

void TaskAnalogRead(void *pvParameters) // This is a task.
{
(void) pvParameters;

// initialize serial communication at 9600 bits per second:
Serial.begin(9600);

for (;;)
{
// read the input on analog pin 0:
int sensorValue = analogRead(A0);
// print out the value you read:
Serial.println(sensorValue);
vTaskDelay(1); // one tick delay (15ms) in between reads for stability
}
}

Next there are a number of examples in the FreeRTOS Quick Start Guide.

One last important thing you can do is to reduce device power consumption by not using the default loop() function for anything more than putting the MCU to sleep. This code below can be used for simply putting the MCU into a sleep mode of your choice, while no tasks are unblocked. Remember that the loop() function shouldn’t ever disable interrupts and block processing.

#include <avr/sleep.h> // include the Arduino (AVR) sleep functions.

loop() // Remember that loop() is simply the FreeRTOS idle task. Something to do, when there's nothing else to do.
{
// Digital Input Disable on Analogue Pins
// When this bit is written logic one, the digital input buffer on the corresponding ADC pin is disabled.
// The corresponding PIN Register bit will always read as zero when this bit is set. When an
// analogue signal is applied to the ADC7..0 pin and the digital input from this pin is not needed, this
// bit should be written logic one to reduce power consumption in the digital input buffer.

#if defined(__AVR_ATmega640__) || defined(__AVR_ATmega1280__) || defined(__AVR_ATmega1281__) || defined(__AVR_ATmega2560__) || defined(__AVR_ATmega2561__) // Mega with 2560
DIDR0 = 0xFF;
DIDR2 = 0xFF;
#elif defined(__AVR_ATmega644P__) || defined(__AVR_ATmega644PA__) || defined(__AVR_ATmega1284P__) || defined(__AVR_ATmega1284PA__) // Goldilocks with 1284p
DIDR0 = 0xFF;

#elif defined(__AVR_ATmega328P__) || defined(__AVR_ATmega168__) || defined(__AVR_ATmega8__) // assume we're using an Arduino with 328p
DIDR0 = 0x3F;

#elif defined(__AVR_ATmega32U4__) || defined(__AVR_ATmega16U4__) // assume we're using an Arduino Leonardo with 32u4
DIDR0 = 0xF3;
DIDR2 = 0x3F;
#endif

// Analogue Comparator Disable
// When the ACD bit is written logic one, the power to the Analogue Comparator is switched off.
// This bit can be set at any time to turn off the Analogue Comparator.
// This will reduce power consumption in Active and Idle mode.
// When changing the ACD bit, the Analogue Comparator Interrupt must be disabled by clearing the ACIE bit in ACSR.
// Otherwise an interrupt can occur when the ACD bit is changed.
ACSR &= ~_BV(ACIE);
ACSR |= _BV(ACD);

// There are several macros provided in the header file to actually put
// the device into sleep mode.
// SLEEP_MODE_IDLE (0)
// SLEEP_MODE_ADC (_BV(SM0))
// SLEEP_MODE_PWR_DOWN (_BV(SM1))
// SLEEP_MODE_PWR_SAVE (_BV(SM0) | _BV(SM1))
// SLEEP_MODE_STANDBY (_BV(SM1) | _BV(SM2))
// SLEEP_MODE_EXT_STANDBY (_BV(SM0) | _BV(SM1) | _BV(SM2))

set_sleep_mode( SLEEP_MODE_IDLE );

portENTER_CRITICAL();
sleep_enable();

// Only if there is support to disable the brown-out detection.
#if defined(BODS) && defined(BODSE)
sleep_bod_disable();
#endif

portEXIT_CRITICAL();
sleep_cpu(); // good night.

// Ugh. I've been woken up. Better disable sleep mode.
sleep_reset(); // sleep_reset is faster than sleep_disable() because it clears all sleep_mode() bits.
}

o that’s all there is to it. There’s nothing more to do except to read the FreeRTOS Quick Start Guide.
Further reading with manicbug, and by searching on this site too.

General Usage

FreeRTOS has a multitude of configuration options, which can be specified from within the FreeRTOSConfig.h file. To keep commonality with all of the Arduino hardware options, some sensible defaults have been selected.

The AVR Watchdog Timer is used with to generate 15ms time slices, but Tasks that finish before their allocated time will hand execution back to the Scheduler. This does not affect the use of any of the normal Timer functions in Arduino.

Time slices can be selected from 15ms up to 500ms. Slower time slicing can allow the Arduino MCU to sleep for longer, without the complexity of a Tickless idle.

Watchdog period options:

WDTO_15MS
WDTO_30MS
WDTO_60MS
WDTO_120MS
WDTO_250MS
WDTO_500MS

Note that Timer resolution is affected by integer math division and the time slice selected. Trying to accurately measure 100ms, using a 60ms time slice for example, won’t work.

Stack for the loop() function has been set at 192 bytes. This can be configured by adjusting the configIDLE_STACK_SIZE parameter. It should not be less than the configMINIMAL_STACK_SIZE. If you have stack overflow issues, just increase it. Users should prefer to allocate larger structures, arrays, or buffers using pvPortMalloc(), rather than defining them locally on the stack. Or, just declare them as global variables.

Memory for the heap is allocated by the normal malloc() function, wrapped by pvPortMalloc(). This option has been selected because it is automatically adjusted to use the capabilities of each device. Other heap allocation schemes are supported by FreeRTOS, and they can used with additional configuration.

Never use the Arduino delay() function in your Tasks, as it burns CPU cycles and doesn’t place the Task into Blocked State, allowing the Scheduler to place other lower priority tasks into Running State. Use vTaskDelay() or vTaskDelayUntil() instead. Also never delay or Block the Arduino loop() as this is being run by the FreeRTOS Idle Task, which must never be blocked.

Errors

Stack Overflow: If any stack (for the loop() or) for any Task overflows, there will be a slow LED blink, with 4 second cycle.
Heap Overflow: If any Task tries to allocate memory and that allocation fails, there will be a fast LED blink, with 100 millisecond cycle.

Compatibility

ATmega328 @ 16MHz : Arduino UNO, Arduino Duemilanove, Arduino Diecimila, etc.
ATmega328 @ 16MHz : Adafruit Pro Trinket 5V, Adafruit Metro 328, Adafruit Metro Mini
ATmega328 @ 16MHz : Seeed Studio Stalker
ATmega328 @ 16MHz : Freetronics Eleven, Freetronics 2010
ATmega328 @ 12MHz : Adafruit Pro Trinket 3V
ATmega32u4 @ 16MHz : Arduino Leonardo, Arduino Micro, Arduino Yun, Teensy 2.0
ATmega32u4 @ 8MHz : Adafruit Flora, Bluefruit Micro
ATmega1284p @ 20MHz : Freetronics Goldilocks V1
ATmega1284p @ 24.576MHz : Seeed Studio Goldilocks V2, Seeed Studio Goldilocks Analogue
ATmega2560 @ 16MHz : Arduino Mega, Arduino ADK
ATmega2560 @ 16MHz : Freetronics EtherMega
ATmega2560 @ 16MHz : Seeed Studio ADK
ATmegaXXXX @ XXMHz : Anything with an ATmega MCU, really.

Files and Configuration

Arduino_FreeRTOS.h : Must always be #include first. It references other configuration files, and sets defaults where necessary.
FreeRTOSConfig.h : Contains a multitude of API and environment configurations.
FreeRTOSVariant.h : Contains the AVR specific configurations for this port of FreeRTOS.
heap_3.c : Contains the heap allocation scheme based on malloc(). Other schemes are available and can be substituted (heap_1.c, heap_2.c, heap_4.c, and heap_5.c) to get a smaller binary file, but they depend on user configuration for specific MCU choice.

ArduSat SD Card Prototyping

Posted on April 27, 2013 by feilipu

Since my last post on the ArduSat and the idea I had to use the Supervisor node, an ATmega2561, as the core of a centralised eXtended RAM system for the Client nodes, ATmega328p “Arduino” devices, I’ve been thinking and working on a solution for building a centralised non-volatile SD Card based storage solution.

With design, sometimes it is necessary to let an idea stew for a while before the right answer just sort of distils out of the soup. For the solution for this problem, this was the case. There was some thinking space required…

The Question

There are 16 Client nodes in the ArduSat platform. Each and any of them may wish to use the central SD Card to store information at the same, or at different times. How would it be possible to allow more than 16 files to be open on the one SD Card (connected to the Supervisor node) whilst maintaining consistency in the file system? How would access to the file system be scheduled?

The Tools

I have been using the ChaN FatFs file system libraries now for some time. They are fully featured and have a very clean design, fully separating the file system layer from the underlying physical media access layer (the drivers). This means that the file system tools can be implemented on many different architectures, with only changes to the driver layer (DiskIO) needed for each platform.

The Thought Process

My initial thought was that the Supervisor node should maintain the file system, and that I should write packaging for the FatFs file system commands to allow them to be remotely implemented across the SPI bus, in a similar manner as described in the XRAMFS post.

The idea of writing these “remote controls” for the file system commands was scary, as I recognised that there are 33 commands in the interface, and each of them has their own characteristics. Also, maintaining these interfaces would likely be problematic, as I would have to test each command extensively to ensure that there were no “thick thumb” errors introduced into the stable and proven FatFs library.

Some weeks passed…

Then at about 3am, I realised that the right answer was to write a “shim” between the standard FatF file system commands and the standard physical media drivers, and to have this shim operate across the SPI bus in exactly the same manner as the XRAMFS solution.

So, I wrote it.

The Solution

The solution separates the ChaN libraries into two parts. The file system part is resident on the Client node. Each Client node maintains its own view of the file system on the Supervisor SD Card. As the ChaN FatFs library is written for low memory devices, the file directory tree is refreshed each time a change in the working file is done. The Supervisor node only does the DiskIO under the command of each of the Clients.

There are only 5 relevant driver layer DiskIO commands. These commands are used in the Supervisor node to execute requests sent over the SPI bus from the individual Clients. Since there are only a small number of commands, and they are static and dependent on the architecture of the machine they’re running on, their functionality is quite constant. The Supervisor has no knowledge of the file system at all. It simply implements DiskIO commands on sectors of the SD Card as requested, one a time, as requested by Clients.

The Supervisor implementation simply expands on the existing Task loop established for the XRAMFS system, by adding in the 5 additional DiskIO commands. The added complexity, that the SD Card is accessed over the SAME SPI bus as the communications between Client and Supervisor, means that I had to introduce an interim “Pending” state for commands to allow the Client to wait for confirmation that a task has been completed or, in the case of disk_read or disk_ioctl, to recover the waiting data from the Supervisor.

The Client implementation inserts different shim DiskIO commands for the FatF system to call. These commands use the SPI bus to call the Supervisor, and enter a request. Some commands return immediately, allowing the Supervisor to continue with the command, once the command and any required data has been transferred. Other commands wait until they can retrieve information from the Supervisor, before returning to the FatF file system layer of the library.

In this solution, the XRAMFS was instrumental in simplifying the transfer of information. The exclusive availability of 16kB of RAM for each Client meant that disk_write or disk_read commands could cache their data in XRAMFS whilst it was actually written to or read from the SDCard. Because the RAM is available exclusively, there is no consideration that another Client may overwrite the results of a command, or that memory exhaustion may corrupt data.

The code is available at Sourceforge in the usual location.

How does it work?

When a Client program calls one of the FatFs library commands, it in turn calls one of the special ArduSat SPI DiskIO shim routines. These routines signal the Supervisor in the normal manner, and transfer any data associated with the command into the Page of XRAMFS assigned to the Client.

The Supervisor will then undertake the standard DiskIO command, retaining the result of the command and any data resulting from the command in XRAMFS.

Both Client DiskIO routines, and the Task running in the Supervisor are aware of the “Pending” state, which is where a DiskIO command has been completed on the Supervisor and there is data waiting in the XRAMFS for the Client to recover.

Once the Client DiskIO command completes, it returns the normal interface information to the calling FatFs command.

Here a monitor program on a Client is initialising the SD Card. If the Supervisor notices that the SD Card is not initialised, it will return Error, and then undertake to initialise the card. The second call for initialisation will then be successful. This decoupling method ensures that Clients cannot reinitialise the card, whilst other Clients may be using the Card.

The file system (on the Client) is then initialised Then, the SD Card status is read. Finally, the current working directory is read and printed.

In this screenshot, a file is opened for reading, and the file pointer set to the start of the file. A dump of the first 64 Bytes of the file is read and printed. Then the file is closed.

Here, the same file as above is opened for writing, and 45 bytes of 0x10 (16) are being written. The result is checked by opening the file for reading, and dumping the relevant bytes to the screen. Success!

Issues

The Client (Arduino) ATmega328p has so little Flash and RAM that implementing the FatFs consumes a significant proportion of the available resources. From the ChaN FatFs web site, at least 13 kByte of Flash (of 32 kByte on the Arduino), and 600 Bytes of RAM (of 2048 Bytes on the Arduino) are consumed by the library alone. This is excluding the working buffers necessary to prepare or process data for storage.

I was unable to fully test the FatFs solution, because of RAM and Flash limitations. I simply couldn’t turn on all the features. However, I have some confidence that the solution fully works, because the actual FatFs library is unchanged from the working solution that I’ve tested on the Arduino Mega platform. It is only the DiskIO routines that have been tampered with, and since they produce reliable results for some of the FatFs functions, there is every reason to believe they would work for all of the functions.

Thank you

Jon for providing a new Freetronics EtherMega, so that I could complete the prototyping work.

ArduSat and NanoSatisfi for running a great project, which inspired this thought process. Possibly, this work might be useful for one of the launches over the coming years.

ArduSat XRAMFS Prototyping

Posted on February 17, 2013 by feilipu

It is not every day that I get to tell the family I’m doing “rocket science”, but I hope over the past few days, it can be an exception. Space, the final frontier. In this case, it was a lack of space and the frontier it creates, that got me thinking.

At the recent Linux Conf AU Jon Oxer spoke about Freetronics’ efforts in designing the payload for the upcoming NanoSatisfi ArduSat1 launch (pictured below). Jon mentioned in the presentation that the AVR freeRTOS code compilation that I’ve been supporting is being used in the Supervisor node of that platform.

I immediately thought that it would be great to build a distributed cache RAM system to support each of the ATmega328p “Arduino” Client nodes, using the XRAM capabilities of the ATmega2561 Supervisor node. So, I did.

Using this prototype system, each Arduino Client node now has sole access to 32kByte of XRAMFS in addition to their 2kByte of internal RAM.

Initial performance measured is 422kByte/s throughput for the swap function. In other words, half of the entire Arduino RAM can be swapped with the contents of XRAMFS in just 4.74ms.

I’ve also the code for supporting a centralised SD Card on this platform to Sourceforge AVRfreeRTOS, and written about it at ArduSat SD Card Prototyping.

Background

In working with the Arduino hardware I’ve found that the severe limitation in RAM space causes constraints on what can be done. For example, Ethernet, USB and other modern applications need kBytes of buffer to be used effectively, and the ATmega328p used as the Arduino Uno platform supports a total of only 2kB RAM.

Using the Arduino Mega (or Android ADK hardware) has been the saviour of that situation for me, offering an identical environment, but 8kByte of RAM as a playground. And, most importantly, the ability to directly connect 0 wait-state external SRAM.

This XRAM capability of the ATmega2560 and ATmega2561 has been exploited by Rugged Circuits in their QuadRam module, which offers 512kByte of SRAM in one small package.

Therefore, using common off the shelf technology, I had the materials available to test the theory that building a XRAMFS system, to support the ArduSat platform, would work.

This allows each ArduSat Client to store 16 TIMES more data than it can currently access, and have access to that data at a relatively high speed from a medium not subject to wear (such as for example an SD card).

Ingredients & Build

This section looks at the ingredients and how to construct the prototype.

Supervisor Node – Arduino Mega / Freetronics EtherMega / Android ADK

The ArduSat Supervisor node is based on the ATmega2561 MCU, because it is significantly smaller than the ATmega2560 MCU used in the Arduino Mega platform. The only difference between the two chips is that the ATmega2561 doesn’t provide as many Ports, and has only 64 Pins versus 100 Pins on the ATmega2560.

For this prototyping, the ATmega2560 is necessary, because I elected to use pin change interrupts as part of the bus protocol. Also, the Arduino Mega platform is readily available. I don’t even know where I’d go to get a ATmega2561 board…

The use of rainbow hook-up wire was essential for the success of the prototype.

Client Node – Arduino Uno / Freetronics Eleven

The ArduSat Client node is designed to be identical to the Arduino Uno platform, to ensure that it is absolutely easy for people to test code they intend to run in space. Therefore a variety of Arduino Uno devices are being used (basically, whatever I had around).

XRAM Module – Rugged Circuits QuadRAM

I’ve implemented using the Rugged Circuits QuadRAM and the MegaRAM previously. These modules slip over the end of the Arduino Mega platform, instantly enabling either 512kByte or 128kByte of zero wait state SRAM, mapped to the system address space. They also conveniently bring out the SPI interface onto through-hole for pins.

Something about the ability to create 16x 32kByte XRAM pages, linked with 16x Client nodes, seemed like synchronicity.

Layout

The prototype platform is designed to be the classic multi-slave SPI bus layout. This design is demonstrated in the AVR151 document and, in excerpt, is produced below.

Because of my decision to use the Pin Change Interrupts as part of the bus protocol, The Supervisor node (SPI Master) would use the Port K and Port J pins to fill the role of individual Slave Select (SS) pins. The Client nodes would each use their normal SS pin (PB2) to connect to the Supervisor.

In designing for 16x Client nodes, there is a limitation on Port J in that the good folks at Arduino determined not to break out all of the pins which, together with sharing PCINT8 with the Rx0 pin, significantly limits the number of Clients feasible on the prototype platform.

In practice, 8 Client nodes attached to all the pins on Pork K is the simple alternative. As luck (or good planning) would have it, those pins are all brought out onto one connector on the Arduino Mega platform, as evidenced by these pictures.

Amongst friends, a direct connection of the SPI SCK, MISO, and MOSI lines to all Clients is optimal. But in a shared environment, it would make sense to use FET bus isolation to keep Clients from physically attaching to the SPI bus until their SS line is held low by the Supervisor. A gram of hardware prevention can cure a tonne of software ill, as a “rogue” Client could otherwise potentially lock up the SPI bus for all, and the guys in the ISS won’t be happy if asked to hit the reset button.

Bus Protocol

Hey! – Yeah What? – This! – OK

That’s the protocol. Works in the home. Works in the office. Works the world over.

Information to this Saleae Logic chart below in Client Implementation section.

Hey!

The Supervisor node holds all the PCINT pins high. If a Client wants to initiate a Read/Write/Swap transaction, it will pull its SS line low for 30µs. This needs to be long enough for the Supervisor to register an interrupt and process it. If multiple Clients call out simultaneously, no problem, the Supervisor will grab all of the requests and push them onto a queue of requests to serve.

Yeah What?

At the next opportunity, the Supervisor serving task will pop a request off the queue, and identify which Client made the request. It will also check if there were other simultaneous requests, and push them back to the front of the queue.

The Supervisor then pulls the relevant Client SS line low. The Client has been listening for this, and at this point it enables its Slave interface to the SPI bus, and the two swap acknowledgements. When the Supervisor receives the ACK code, it knows the Client is ready, so it requests a command.

This!

When the Client (SPI Slave) has received the Supervisor ACK code, it prepares a command, and is prepared to either read, write or swap XRAMFS data under the command of the Supervisor (SPI Master).

The command set implemented by this protocol can be easily extended to include accessing other shared resources connected to the Supervisor node. This could include analogue sensors, SDCARD mass storage (though using the SPI bus would offer a degree of complexity), or serial interfaced devices.

At the end of one command, with the data transaction complete, a final byte is exchanged to ensure that the Client has remained in sync with the Supervisor, and the SPI bus is released by the Client. It is important the Client stays off the SPI bus. The Supervisor then processes the next Yeah What? request.

Supervisor Implementation – freeRTOS

The Supervisor is implemented as a freeRTOS task, using standard SPI bus libraries contained in my code base. These libraries (now that this project has worked them over) are about as optimised as is possible to write in C, and achieve a good throughput over the SPI bus.

There are two (or one) PCINT based Interrupt that reads the PCINT pins and pushes the raw pin state onto a queue. This process traps multiple simultaneous requests, overcoming any interrupt masking or race conditions. Currently 30µs are allowed for the interrupts to execute. 10µs has been tested, but depending on how long the Supervisor stays in “Critical” state (interrupts off) processing other (non XRAMFS) tasks this time can be adjusted.

From idle, the Supervisor takes only 90µs to 0.1ms to pop a request from the queue and action it. Under load, it could take as long as 64ms to action a request. As soon as the pin state is collected it is processed to identify which SS line triggered the call, and therefore which bank of XRAM should be enabled. Also, at this time I check that no additional requests are pending from the same pin state. If so, the remaining pin state is pushed back on the queue to get next time round.

The exchange of acknowledgements ensures that both sides are speaking SPI, and are set to proceed.

The command contains the action (read / write / swap / test), the address of the XRAMFS block, the size of the XRAMFS block, and a CRC byte.

The bus transaction speed is dependent on the SPI Master SCK clock divisor. Optimally, a SPI Slave can receive data at 1/4th of its system clock. Currently, it is set to one 1/8th, therefore theoretical performance is double that of the logic capture above.

Initially, I determined to calculate a CRC byte to store along with the data, but the calculation time is large compared to the transaction time, and therefore too costly to implement at the protocol level. The application should utilise the CRC when it recovers data to confirm that the data is intact, and not irradiated.

Also, error checking following the transfer could be implemented. But at this stage I think it is better to have the Client do all sanity and error checking of its own data.

Client Implementation – freeRTOS or Arduino IDE

The Client is implemented in freeRTOS as a simple library function, that is passed a command structure, and a pointer to local RAM to be Read/Write/Swap. Some details below.

typedef enum { Huh        = 0, // Client didn't issue us a command, so just break.
               Read       = 1, // read from XRAMFS
               Write      = 2, // write to XRAMFS
               Swap       = 3, // read from both XRAMFS &amp; local RAM, and swap
               Test       = 4  // do something else, to be determined
} RAMFSCommand; // from point of view of the client (Arduino 328p)

typedef struct        /* structure to hold the RAMFS info */
{ RAMFSCommand       ram_cmd;        // Read / Write / Swap / Test
  size_t             ram_addr;       // Address of first byte of RAM in a RAMFS (greater than RAM_START_ADDR)
  uint16_t           ram_size;       // Size of RAM block in RAMFS (less than RAM_COUNT or 32kByte)
  uint8_t            ram_crc8;       // Calculated CRC of stored data
} xRAMFSarray, * pRAMFSarray;

uint8_t ramfs_transfer_block( pRAMFSarray pRAMFS_block, uint8_t *data );

I used C and the freeRTOS platform because it is easiest for my environment, and I know it best. But, I’ll re-write it as a library in the Arduino IDE environment as needed. It won’t be too hard.

The client can use the XRAMFS malloc function to manage RAM allocation. A very simple malloc has been built, which can’t free XRAMFS. But, it can be simply ignored if desired and the command structure can be filled manually.

Initially, I implemented an interrupt driven semaphore system to manage the Yeah What? part of the bus protocol, but typically the Supervisor responds so quickly that the time to do several context swaps generated by the interrupt exceeded the time the Supervisor was prepared to wait. A simple wait loop keeps the Client on ready standby for 90µs so it can complete the transaction in the shortest time.

The Client code has no knowledge of where its XRAM is located on the Supervisor. Therefore the code is orthogonal and constant, irrespective which Client being used. This is a very useful feature where the author may not know in advance which ArduSat Client his code will be running upon.

Client application code should be written to make use of the Swap XRAMFS <-> RAM capability. This makes best use of the SPI bus features to combine Read and Write into one transaction, effectively doubling throughput over the Write plus Read combination.

The user interface (monitor) is just for initial testing. I’ll have to write a load generation rig to find out what this baby can do, but that can wait for the next post. The logic analyser has captured the result of the > r (read) command in the below command line sequence. We can see the 20µs (now 30µs) Hey! on the Slave Select, 90µs pass before the acknowledgement bytes are swapped (only one cycle needed), 6 bytes of command structure are passed (Read command is 0x01), and then the data is read out of XRAMFS to the Client.

Design Notes

The basis of every design: detailed functional specifications, hardware design, and user interface documentation. Oh, and scribbles much.

Updates

I’ve updated the code on 22 February to remove some oversights in the Client main program, and added the OK check byte to the protocol. Code as usual on AVRfreeRTOS on Sourceforge.

Updated on 23 February to include some error checking on Supervisor side (preventing malicious Client requests), and on Client side preventing hang if the Supervisor is AWOL. Also removed the aggressive SPI timing utilising receive double buffering, as it often caused errors, and had no performance effect.

Initial performance measured is about 422kByte/s throughput for the swap function. Specifically 4.73825ms is needed for a complete 2048Byte data payload transaction (including sync, command, & OK timing). This also includes freeRTOS task swapping, as the Supervisor task is run with interrupts enabled in normal mode.

Have fixed some code issues on 4 March, mainly around a few µs delays required to let things run their course.

Now the platform is running stable with 4x Clients. A video is here

And here is a screenshot of the 4x terminals.

April 27th – I’ve uploaded the code for supporting a centralised SD Card on this platform to Sourceforge AVRfreeRTOS, and written about it at ArduSat SD Card Prototyping.

	Beew on Windows 7 Starter with (up to)…
	Hugo K on Goldilocks Analogue –…
	feilipu on Arduino FreeRTOS
	ChronoX on Arduino FreeRTOS
	Stockman Pod Extreme… on Wrangler JL Rubicon – Bu…

Gone Bush

Stuff I need to write down.

Tag Archives: arduino mega