XBee Walkie Talkie

I’m building an advanced Arduino clone based on the AVR ATmega1284p MCU with some special features including a 12 bit DAC MCP4822, headphone amplifier, 2x SPI Memory (SRAM, EEPROM), and a SD Card. There are many real world applications for analogue outputs, but because the Arduino platform doesn’t have integrated DAC capability there are very few published applications for analogue signals. A Walkie Talkie is one example of using digital and analogue together to make a simple but very useful project.

Two Goldilocks Analogue prototypes with XBee radios, and Microphone amplifiers.

The actual Walkie Talkie functionality is really only a few lines of code, but it is built on a foundation of analogue input (sampling), analogue output on the SPI bus to the MCP4822 DAC, sample timing routines, and the XBee digital radio platform. Let’s start from the top and then dig down through the layers.

XBee Radio

I am using XBee Pro S2B radios, configured to communicate point to point. For the XBee Pro there needs to be one radio configured as the Coordinator, and the other as a Router. There are configuration guides on the Internet.

I have configured the radios to wait the maximum inter-character time before sending a packet, which implies that the packets will be sent only when full (84 bytes). This maximises the radio throughput. Raw throughput is 250 kbit/s, but the actual user data rate is limited to about 32 kbit/s. This has an impact on the sampling rate and therefore quality of speech that can be transmitted.

Using 8 bit samples, I have found that about 3 kHz sampling generates about as much data as can be transmitted without compression. I’m leaving compression for another project.

The XBee radios are configured in AT mode, which acts as a transparent serial pipe between the two endpoints. This is the simplest way to connect two devices via digital radio. And it allowed me to do simple testing, using wire, before worrying about whether the radio platform was working or not.

XBee Packet Reception in Purple

Looking at the tracing of a logic analyser, we can see the XBee data packets arriving on the (purple) Rx line of the serial port. The received packet data is stored into a ring buffer, and played out at a constant rate. I have allowed up to 255 bytes in the receive ring buffer, and this will be sufficient because the XBee packet size is 84 bytes.
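The receive path described above can be sketched as a small byte ring buffer. This is only an illustration, not the serial library code used in the project; the names `ring_t`, `ring_put`, and `ring_get` are hypothetical. It assumes a 256 slot array with one slot kept empty, which gives the 255 byte capacity mentioned above and lets the 8 bit indices wrap naturally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 256u  /* one slot stays empty, so capacity is 255 bytes */

typedef struct {
    uint8_t data[RING_SIZE];
    volatile uint8_t head;  /* next slot to write (serial Rx side) */
    volatile uint8_t tail;  /* next slot to read (sample timer side) */
} ring_t;

/* Store one received byte; returns false (byte dropped) if the buffer is full. */
static bool ring_put(ring_t *r, uint8_t byte)
{
    assert(r != NULL);
    uint8_t next = (uint8_t)(r->head + 1u);  /* 8 bit index wraps at 256 */
    if (next == r->tail)
        return false;
    r->data[r->head] = byte;
    r->head = next;
    return true;
}

/* Fetch the oldest byte; returns false if the buffer is empty. */
static bool ring_get(ring_t *r, uint8_t *byte)
{
    assert(r != NULL && byte != NULL);
    if (r->head == r->tail)
        return false;
    *byte = r->data[r->tail];
    r->tail = (uint8_t)(r->tail + 1u);
    return true;
}
```

With an 84 byte packet arriving in a burst and one byte consumed per sample period, a buffer of this size can hold roughly three packets before anything is dropped.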

The samples to be transmitted to the other device are transmitted on the (blue) Tx line, more or less in each sample period even though they are buffered before transmission. The XBee radio buffers these bytes for up to 0xFF inter-symbol periods (configuration), and only transmits a packet to the other endpoint when it has a full packet.

Sampling Rate

Looking at the bit budget for the transmission link, we need to calculate how much data can be transmitted without overloading the XBee radio platform, and causing sample loss. As we are not overtly compressing the voice samples, we have 8 bit samples times 3,000 Hz sampling or 24 kbit/s to transmit. This seems to work pretty well. I have tried 4 kHz sampling, but this is too close to the theoretical maximum, and doesn’t work too effectively.
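The bit budget arithmetic can be written down as a quick check. This is a sketch using the figures quoted above (about 32 kbit/s of user data, 84 byte packets, 8 bit samples); the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define LINK_BITS_PER_SEC 32000u  /* approximate XBee user data rate quoted above */
#define PACKET_BYTES      84u     /* full XBee packet payload quoted above */

/* Uncompressed stream rate in bits per second. */
static uint32_t stream_bits_per_sec(uint32_t sample_rate_hz, uint8_t bits_per_sample)
{
    return sample_rate_hz * bits_per_sample;
}

/* Share of the usable link consumed by the stream, in percent. */
static uint32_t link_load_percent(uint32_t sample_rate_hz, uint8_t bits_per_sample)
{
    return (stream_bits_per_sec(sample_rate_hz, bits_per_sample) * 100u) / LINK_BITS_PER_SEC;
}

/* Time needed to collect one full packet of 8 bit samples, in milliseconds. */
static uint32_t packet_fill_ms(uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0u);
    return (PACKET_BYTES * 1000u) / sample_rate_hz;
}
```

At 3,000 Hz the stream uses 24 kbit/s, or 75% of the link, and a packet fills every 28 ms. At 4,000 Hz the stream needs the full 32 kbit/s, which leaves no margin at all.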

Sample rate of 3,000 Hz seems to be the optimum.

Looking at the logic analyser, we can see the arrival of a packet of bytes commencing with 0x7E and 0x7C on the Rx line. Both the Microphone amplifier and the DAC output are biased around 0x7F(FF), so we can read that the signal levels captured and transmitted here are very low. The sample rate shown is 3,000 Hz.

Sample Processing

I have put a “ping” on one output to capture when the sampling interrupt is being processed (yellow). We can see that the amount of time spent in the interrupt processing is very small for this application, relative to the total time available. Possibly some kind of data compression could be implemented.

DAC and sample processing

During the sampling interrupt, there are two major activities, generating an audio output, by placing a sample onto the DAC, and then reading the ADC to capture an audio sample and transmit it to the USART buffer.

This is done by the audioCodec_dsp function, which is called from the code in a timer interrupt.

void audioCodec_dsp( uint16_t * ch_A, uint16_t * ch_B)
{
  /*----- Audio Rx -----*/
  // receive the most significant 8 bits of the sample from the Rx ring buffer.
  if ( xSerialGetChar( &xSerialPort, &mod7_value.u8[1] ) )
  {
    mod7_value.u8[0] = 0x00; // and pad out the least significant 8 bits with null.
  }
  else
  {
    mod7_value.u16 = 0x7FFF; // or put nulled signal on the output
  }
  *ch_A = *ch_B = mod7_value.u16; // write the sample out on A & B channel.

  /*----- Audio Tx -----*/
  AudioCodec_ADC( &mod7_value.u16 ); // get 10 bits of sample from the ADC with reference 2.56V Maximum.
  // transmit just the most significant 8 bits of the sample to the Tx buffer.
  xSerialPutChar( &xSerialPort, mod7_value.u8[1]);
}
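The definition of mod7_value is not shown here, so the following sketch expresses the same byte handling with plain shifts instead of a union. It assumes the ADC result is left-adjusted into 16 bits before the top byte is taken; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SILENCE_16 0x7FFFu  /* mid-scale value used when no byte is waiting */

/* Expand a received 8 bit sample to 16 bits, or return silence if none arrived. */
static uint16_t rx_byte_to_sample(bool have_byte, uint8_t rx_byte)
{
    if (!have_byte)
        return SILENCE_16;
    return (uint16_t)((uint16_t)rx_byte << 8);  /* low 8 bits padded with zero */
}

/* Reduce a 10 bit ADC reading to the 8 bit value sent over the radio. */
static uint8_t adc10_to_tx_byte(uint16_t adc10)
{
    assert(adc10 <= 0x03FFu);
    uint16_t left_adjusted = (uint16_t)(adc10 << 6);  /* assumed left-adjusted format */
    return (uint8_t)(left_adjusted >> 8);             /* keep the most significant 8 bits */
}
```

A mid-scale ADC reading of 0x1FF becomes 0x7F on the wire, which matches the idle values near 0x7E seen in the logic analyser trace.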

I am using the AVR 8 bit Timer0 to generate the regular sample intervals, by triggering an interrupt. By using a MCU FCPU frequency which is a binary multiple of the standard audio frequencies, we can generate accurate reproduction sampling rates by using only the 8 bit timer with a clock prescaler of 64. To generate odd audio frequencies, like 44,100 Hz, the 16 bit Timer1 can be used to get sufficient accuracy without requiring a clock prescaler.
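To see why the clock frequency matters, here is a sketch of the compare value calculation for an 8 bit timer. It assumes the 24.576 MHz clock mentioned in the wrap up, the prescaler of 64, and a timer that counts from zero up to a compare value (CTC style), which is my assumption about the configuration. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define F_CPU_HZ  24576000UL  /* MCU clock used on the Goldilocks Analogue */
#define PRESCALER 64UL        /* Timer0 clock prescaler */

/*
 * Returns the 8 bit compare value that gives exactly sample_rate_hz,
 * or -1 if the rate cannot be hit exactly with an 8 bit counter.
 */
static int16_t timer0_compare_for(uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0u);
    uint32_t tick_hz = (uint32_t)(F_CPU_HZ / PRESCALER);  /* 384,000 timer ticks per second */
    if (tick_hz % sample_rate_hz != 0u)
        return -1;                                        /* not an exact divisor */
    uint32_t ticks = tick_hz / sample_rate_hz;
    if (ticks == 0u || ticks > 256u)
        return -1;                                        /* does not fit in 8 bits */
    return (int16_t)(ticks - 1u);                         /* counter runs 0..compare */
}
```

Under these assumptions 3,000 Hz needs a compare value of 127 and 48 kHz needs 7, both exact. 44,100 Hz does not divide 384,000 evenly, which is why the 16 bit timer is the better choice for that rate.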

The ATmega1284p ADC is set to free-run mode, and is scaled down to 192 kHz. While this is close to the maximum acquisition speed documented for the ATmega ADC, it is still within the specification for 8 bit samples.
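As a rough check on the ADC figures, the sketch below assumes the 192 kHz ADC clock comes from dividing the 24.576 MHz MCU clock by 128, and that a normal conversion takes 13 ADC clock cycles as the ATmega datasheets describe. The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define ADC_CYCLES_PER_CONVERSION 13u  /* normal conversion length on the ATmega ADC */

/* ADC clock after the prescaler. */
static uint32_t adc_clock_hz(uint32_t f_cpu_hz, uint16_t prescaler)
{
    assert(prescaler > 0u);
    return f_cpu_hz / prescaler;
}

/* Completed conversions per second in free-run mode (rounded down). */
static uint32_t adc_conversions_per_sec(uint32_t adc_clk_hz)
{
    return adc_clk_hz / ADC_CYCLES_PER_CONVERSION;
}
```

That works out to roughly 14,700 conversions per second, so at a 3,000 Hz sample rate a fresh ADC result is always waiting when the timer interrupt reads it.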

ISR(TIMER0_COMPA_vect) __attribute__ ((hot, flatten));
ISR(TIMER0_COMPA_vect)
{
#if defined(DEBUG_PING)
  // start mark - check for start of interrupt - for debugging only (yellow trace)
  PORTD |= _BV(PORTD7); // Ping IO line.
#endif

  // MCP4822 data transfer routine
  // move data to the MCP4822 - done first for regularity (reduced jitter).
  DAC_out (ch_A_ptr, ch_B_ptr);

  // audio processing routine - do whatever processing on input is required - prepare output for next sample.
  // Fire the global audio handler which is a call-back function, if set.
  if (audioHandler!=NULL)
    audioHandler(ch_A_ptr, ch_B_ptr);

#if defined(DEBUG_PING)
  // end mark - check for end of interrupt - for debugging only (yellow trace)
  PORTD &= ~_BV(PORTD7);
#endif
}

This interrupt takes 14 us to complete, and is very short relative to the 333 us we have for each sample period. This gives us plenty of time to do other processing, such as running a user interface or further audio processing.

SPI Transaction

At the final level of detail, we can see the actual SPI transaction to output the incoming sample to the MCP4822 DAC.

SPI MCP4822 DAC Transaction

As I have built this application on the Goldilocks Analogue Prototype 2 which uses the standard SPI bus, the transaction is normal. My later prototypes are using the Master SPI Mode on USART 1 of the ATmega1284p, which slightly accelerates the SPI transaction through double buffering, and frees the normal SPI bus for simultaneous reading or writing to the SD Card or SPI Memory, for audio streaming. In the Walkie Talkie application there is no need to capture the audio, so there’s no down side to using the older prototypes and the normal SPI bus.


void DAC_out(const uint16_t * ch_A, const uint16_t * ch_B)
{
  DAC_command_t write;

  if (ch_A != NULL)
  {
    write.value.u16 = (*ch_A) >> 4;
    write.value.u8[1] |= CH_A_OUT;
  }
  else // ch_A is NULL so we turn off the DAC
  {
    write.value.u8[1] = CH_A_OFF;
  }

  SPI_PORT_SS_DAC &= ~SPI_BIT_SS_DAC; // Pull SS low to select the Goldilocks Analogue DAC.
  SPDR = write.value.u8[1]; // Begin transmission ch_A.
  while ( !(SPSR & _BV(SPIF)) );
  SPDR = write.value.u8[0]; // Continue transmission ch_A.

  if (ch_B != NULL) // start processing ch_B while we're doing the ch_A transmission
  {
    write.value.u16 = (*ch_B) >> 4;
    write.value.u8[1] |= CH_B_OUT;
  }
  else // ch_B is NULL so we turn off the DAC
  {
    write.value.u8[1] = CH_B_OFF;
  }

  while ( !(SPSR & _BV(SPIF)) ); // check we've finished ch_A.
  SPI_PORT_SS_DAC |= SPI_BIT_SS_DAC; // Pull SS high to deselect the Goldilocks Analogue DAC, and latch value into DAC.

  SPI_PORT_SS_DAC &= ~SPI_BIT_SS_DAC; // Pull SS low to select the Goldilocks Analogue DAC.
  SPDR = write.value.u8[1]; // Begin transmission ch_B.
  while ( !(SPSR & _BV(SPIF)) );
  SPDR = write.value.u8[0]; // Continue transmission ch_B.
  while ( !(SPSR & _BV(SPIF)) ); // check we've finished ch_B.
  SPI_PORT_SS_DAC |= SPI_BIT_SS_DAC; // Pull SS high to deselect the Goldilocks Analogue DAC, and latch value into DAC.
}
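The values of CH_A_OUT and CH_B_OUT are not shown above, so here is a sketch of how a 16 bit MCP4822 write command is laid out according to the Microchip datasheet: bit 15 selects the channel, bit 13 selects the gain, bit 12 enables the output, and bits 11 to 0 carry the sample. The 1x gain setting and the names used here are assumptions for illustration, not the values used in my library.

```c
#include <assert.h>
#include <stdint.h>

#define MCP4822_CH_B    0x8000u  /* bit 15: 0 = channel A, 1 = channel B */
#define MCP4822_GAIN_1X 0x2000u  /* bit 13: 1 = 1x gain, 0 = 2x gain */
#define MCP4822_ACTIVE  0x1000u  /* bit 12: 1 = output active, 0 = shut down */

/* Build the command word for one channel from a 16 bit sample. */
static uint16_t mcp4822_command(uint8_t channel, uint16_t sample16)
{
    assert(channel <= 1u);
    uint16_t word = (uint16_t)(sample16 >> 4);   /* keep the top 12 bits */
    word |= MCP4822_GAIN_1X | MCP4822_ACTIVE;    /* assumed gain and power settings */
    if (channel == 1u)
        word |= MCP4822_CH_B;
    return word;
}
```

The high byte of the word goes out on the SPI bus first, followed by the low byte, which is the order DAC_out uses with u8[1] then u8[0].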

Wrap Up

Using a few pre-existing tools and a few lines of code, it is possible to quickly build a digitally encrypted walkie talkie, capable of communicating (understandable, but not high quality) voice. And, there ain’t no CB truckers going to be listening in on the family conversations going forward.

This was a test of adding microphone input based on the MAX9814 to the Goldilocks Analogue. I will be revising the Prototype 3 and will add in a microphone amplification circuit to support applications needing audio input, like this walkie talkie example, or voice changers, or vocal control music synthesizers.

I’m also running the ATmega1284p devices at the increased frequency of 24.576 MHz, over the standard rate of 20 MHz. This specific frequency allows very precise reproduction of audio samples from 48 kHz right down to 4 kHz (or even down to 1,500 Hz). The extra MCU clock cycles per sample period are very welcome when it comes to generating synthesised music.

Code as usual on Sourceforge AVR freeRTOS. Also, a call out to Shuyang at SeeedStudio, whose OPL is awesome, and is the source of many components and PCBs.

How To Force A Redirect To The Classic WordPress.com Editor Interface

feilipu:

Piss off Beep Beep Boop.

Originally posted on Diary of Dennis:

classic editor wordpress

The Solution To Use The Classic Editor

If you are a blogger at wordpress.com, this post here will help you to solve a big problem. As you have noticed, the decision makers at WordPress want to force you to use the recent new editor interface that is purely designed for mobile devices and for users who only create short-form content. This is of course a pain if you are a desktop user and if you like to create long-form content as well. In this post you will learn how to get back to the classic editor permanently.

In the new editor form, we had a link back to the classic editor but that link is now gone too. WordPress does not have the intention to give us the link back as you can read here in the forums. If you go through this huge forum thread, you will find out…

Fixed Gear Fixes

A few years ago, I was into fixed gear bicycles. I still am, but the number of kilometres I travel has dropped to zero. Cycling in Australia is too dangerous; it must be, because I have to wear a helmet even when I’m not racing.

Though my stories have been up for over 10 years at the Fixed Gear Gallery, I’m replicating them here in case of the unforeseen. Wouldn’t be the first time that the Internet has forgotten.

First build 2003(4)

Thanks for the great pictures and site. Inspired by what I’ve seen, I decided to fixx myself one too. My Giant CFR started life in ’95 with a 105 groupset. These aluminium lugged straight gauge carbon tube frames were pretty hot at the time as the Australian Institute of Sport used them for their training bikes (so the story went anyway). I added the Spinergy’s for the PBP in 1999.

stevens1

The next morph was in 2001 when most of the 105 group was swapped for DA/Ultegra, and the standard (heavy aero) carbon fork was swapped for the Profile AC. Easton supplied the post and the bars. The Easton bars are great; very springy and gentle on the hands. What’s trick about the fixxer conversion? Choosing 43×16 means that the chainstays only had to be filed by about 0.5mm. The Surly Fixxer matches nicely with the Spinergy hub, but there is a little box cutter carving of extra metal on both parts necessary.

Getting both brake levers to pull the front calliper meant converting a giro upper cable housing by drilling it out enough to pass both cable inners, and putting a star washer under the clamp stops them slipping.

stevens2

Fixxing Vertical Drop-Outs

Building a fixed gear bike out of my Giant CFR took a lot of thinking. Mostly, thinking about a new custom steel frame with everything perfect, Campy horizontal drop-outs, clearances, multiple braze-ons for long distance riding, etc. After about 6 months of almost constant thought about the perfect frame whilst riding around on my Giant CFR “fixed” in 39×15 (testing my perceived perfect gear of 42×16) I realised that I was potentially already riding my fixed gear bike. All I had to do was to convert the CFR and check whether this fixed gear riding is all that it’s cracked up to be…

*** Step 1, get rid of the rear brake calliper.

I never used it anyway. After a major high-side rear skid crash in ’01, on the way to hospital (my first helicopter flight) I resolved not to use it ever again. But, I didn’t want to cut up my left lever; I need it as I climb on the drops all the time. Also, I want to indicate when I make a centre of road turn, whilst braking. Solution: make both levers activate one calliper. Read on to see how this is done.

*** Step 2, which wheels?

Well I’ve had these Spinergy wheels for years. They are built with no dish and are about the stiffest and most aerodynamic things around. Build quality… hmmm. Let’s not speak about that. After finding the Surly Fixxer conversion on their web site, I thought use what you have… it’ll be the cheapest. But a UK Surly dealer thought that the Fixxer wouldn’t fit Spinergy, and in principle they are right.

hub

Surly Fixxer and Spinergy both use the Shimano freehub spline system, but they both have a slight twist on this “standard”. Both Surly and Spinergy put much more metal (taller fatter splines) into their sides of the interface than Shimano. The result is that although Fixxer fits Shimano hub body fits Spinergy free hub fits Spinergy hub body fits Shimano free hub, the final Fixxer fixed hub and Spinergy hub body interface doesn’t work.

How to fix this? Using a box cutter, scrape and shape the splines on both Fixxer and Spinergy hub body so they are narrower and not so tall. Took me about an hour, as I was careful and feeling my way. It could be done faster if you know how much metal to remove in advance.

Surly provides right bearing, axle and any number of spacers with the Fixxer. For the Spinergy hub I needed to retain the existing QR axle. The hub left bearing is a cartridge type that matches a step in the axle to generate tension for the standard right cup and cone bearing in the freehub. With the fixxer, just use their spacer nut which provides the bearing surface for their right side cartridge bearing. To space for 130mm dropouts I used one of their #2 spacers. The end result is great. The chainline is 41mm from the inside of the thread. I used a DA cog with the shoulder in, which results in a 47mm chain line. Perfectly lined up with the outside of my double DA chain ring. I wish I’d known this before I set it up. I spent a lot of sleepless nights worrying about chainline… perhaps I worry too much.

*** Step 3. What gear?

That’s a question that can be answered simply. Every Australian cyclist knows that (Sir) Oppy won the ’31 PBP using a 69″ fixed gear. So, if I was going to ride the PBP fixed in 2007 then it had to be 69″ too. So that made it simple… 42×16 it was. (Looking back, I rode several 600km brevets and one 1,000km brevet on my Cannondale with 44×16 in preparation, but in the end didn’t take part in the wet PBP 2007).
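For anyone who wants to check the gearing numbers, gear inches are the chainring to cog ratio multiplied by the wheel diameter in inches. The sketch below assumes a wheel of about 26.3 inches, roughly a 700c rim with a narrow tyre; that diameter is my assumption and not a measured figure, and the function name is made up.

```c
#include <assert.h>

#define WHEEL_DIAMETER_IN 26.3  /* assumed: roughly 700c with a narrow tyre */

/* Gear inches = (chainring teeth / cog teeth) * wheel diameter in inches. */
static double gear_inches(unsigned chainring, unsigned cog, double wheel_diameter_in)
{
    assert(cog > 0u);
    return ((double)chainring / (double)cog) * wheel_diameter_in;
}
```

With that wheel size, 42×16 comes out at about 69 inches, and the 44×16 used later on the Cannondale is a little over 72 inches.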

*** Step 4. Chain length?

I used both James Quinlan’s ssConvert, and FixMeUp! to work out that if I use 42×16 then COINCIDENTALLY, my chain should be tight. Well it didn’t work out quite right, because of the difficulty of measuring the chainstay length accurately. When I tried the 42 chainring that I ordered, the chain drooped like a laundry line hung with wet washing. A quick test of 42×17 showed me that 43×16 should be my magic number. When the 43 chainring arrived, things were a little tighter than I’d hoped, especially when I put on a new 8-speed chain (SRAM PC48). But, with chain tension a very little is often enough. I could see that only fractions of a mm separated me from success. 30min of filing, and testing every 20 or 30 strokes on each dropout, saw it fitting well.

both

I was careful to remove the same amount of metal from both sides, so that the wheel would still fit straight.

left

right

In the worst case, it’ll be difficult to take up tension as the chain wears. But if I keep using inexpensive chains then I won’t mind regular replacement of this part. Alternatively, I might find that I can keep the axle in the new “back” of the dropout (only 0.5mm back) with my QR tight, and this will fix the tensioning issue. I’ve no experience. We’ll see as things get on.

The Two Lever / One Brake Calliper Fix

Note YMMV. It works for me, it may not work for you. Particularly note: both Sheldon “nervous” Brown and Andrew “chary” Muzi don’t like the idea of having two inners going to one clamping bolt. If you have a problem with that, forget this fix. The issue is that if one cable slips, then the other will be loosened in the clamp and will probably slip too. You’ve been warned!

It is also possible to use a “London mod” which does a similar thing to my mod. The London mod puts a bolt in the calliper upper arm, and passes two cables AROUND the arm. I don’t like that solution because it creates a triangle of tension forces which strains (twists) the brake calliper unnecessarily, and doesn’t fix the two cable under one bolt issue.

stevens5

First, at your local bmx store, find a SST ORYG Giro Upper Cable as the basis (but it would work with the Odyssey giro upper cable too).

The right piece is a “giro upper cable”. Differentiating it from the lower cable is a barrel adjustment to allow fine tension adjustment on the brake lever on the single cable end. This barrel adjustment bolt is the same diameter and threading as that on the upper arm of the brake calliper.

Cut off all the cables on the giro upper cable. Crack open the hard plastic doubler; its two halves are only pressed together. Remove the blob of cast metal containing the cable remnants.

Drill out the hollow barrel adjustment bolt using successively larger diameter drills until two brake cable inners run freely through it. I think it was 3.5mm from memory. Because there’s already a hole through the centre, it’s really easy to keep the drill straight. Then using a large drill, 6mm from memory, drill into the head of the bolt to remove the head. All that is left is a hollow screw.

Screw this hollow screw back into the plastic doubler thread till it stops. I put the ugly drilled end in first. Remove the fine adjustment from the brake calliper arm, and screw the end of the doubler into the calliper.

Cut the outer cables from both levers to fit. Feed both inners through the doubler and press the two halves together again. I’ve pulled the two inner cables under the same side of the calliper retaining bolt.

IMPORTANT: Put a star washer (not a flat washer) on first under the cable inners. This will increase the bite on the cables, and prevents them from slipping. Do up the retaining clamp bolt tightly.

VERY IMPORTANT: Test this fix properly. Squeeze one lever harder than you’ll ever need. Then the other. Then both at once. Check that nothing’s slipping. Test this fix to death. If you don’t test and something is not right, then yours may follow.

Cannondale Track 2005

After nearly two years on my Giant CFR #632 it was finally time to buy a “real” fixed frame. I saw this Cannondale Track bike in a store window in my home town, and just had to get it, even though it’d been waiting for over two years to find an owner. The store manager was pretty happy to get it out of the shop…

PhillipStevens-1

Converting it from standard trim to my comfortable suit of gear (off #632) and dual brake lever system went problem free, but again the Spinergy/SurlyFixxer hub took some work. The Spinergy has a 130mm axle, but the frame required 120mm.

So using a hand drill as a lathe, I carefully cut exactly 5mm from the left axle end, and then removed about 4-5mm from the right threaded end. When re-inserting the left axle stop I found that it needed to be shortened by a few mm (nearly to the rubber O-ring seal) to allow it to seat properly into the shortened axle.

The 5mm axle spacer previously used on #632 is also now unnecessary.

PhillipStevens-2

The hollow axle requires a quick-release but they’re not available in 120mm, nor are they strong enough to hold the wheel back/straight. Both issues are overcome by using a BMX chain tensioner. Co-incidentally, the chain tensioner has a guide which locks against the front of the axle, so no shear force is carried by the quick release skewer… lucky break.

The drive train is 1/8″ and I’m currently using 44-16 on 165mm cranks. I’d found I needed to lengthen the gearing, and go with shorter cranks to keep up with others in the peloton.

Implementing NASA EEFS on AVR ATmega

I am building a variant of the Arduino platform which will have an analogue output capability in the form of a dual channel DAC, called Goldilocks Analogue. The DAC can be used to generate variable DC voltage levels that might be used as part of a PID control system, and it can also generate AC voltages up to about 50kHz if it can be fed with sufficient samples to produce the required signal. To generate a 44.1kHz audio signal the DAC has to receive a stream of data, with a new sample every 22us without fail.

44.1kHz samples using USART MSPI output.

Finding an answer to the question of how to reliably stream data to the DAC is the background to this post.

Looking for a way to structure and assemble a combination of many WAV files on a host PC for storage on the AVR ATmega MCU, I needed a system that would support:

  • Editing and assembly of files on a host PC (Linux, Windows, Mac), into a package.
  • Transferring a package of files to the AVR ATmega (Arduino) device very simply.
  • Reading and writing files on the storage medium very quickly, and without jitter.
  • Simple implementation in the avr-libc environment.

Initially I was looking at using the FAT File System on a SD Card to provide the required capability, but I found that SD Cards are quite slow when writing data to their FLASH medium, often taking 100ms or more to complete a write cycle. A SD Card read cycle also takes quite a long time, when the FAT file system must be inspected prior to reading or writing a specific block of information. The SD Card is great for storing Mega Bytes of information, but is not optimal for jitter free read and write applications.

So I started looking at chip storage based on the SPI bus as a mechanism to store large numbers of samples for playback, or to store large amounts of acquired data samples. There are many alternatives using different technologies for SPI storage devices. These range from EEPROM storage, through to SRAM and also newer FRAM technologies. Storage capabilities with up to 1Mbit seem to be quite good value. For my application 1Mbit of storage would allow about 16 seconds of reasonable quality audio to be retrieved with minimal issues for complexity, jitter, and delay.
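The 16 second figure follows from simple arithmetic if the audio is 8 bit mono sampled at 8 kHz, which is the assumption made in this sketch. A 1Mbit device is taken to hold 131,072 bytes. The function name is illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define ONE_MBIT_BYTES 131072u  /* 1 Mbit = 128 KiB of storage */

/* Playback time in milliseconds for a block of uncompressed samples. */
static uint32_t playback_ms(uint32_t storage_bytes, uint32_t sample_rate_hz,
                            uint8_t bytes_per_sample)
{
    assert(sample_rate_hz > 0u && bytes_per_sample > 0u);
    uint64_t bytes_per_sec = (uint64_t)sample_rate_hz * bytes_per_sample;
    return (uint32_t)(((uint64_t)storage_bytes * 1000u) / bytes_per_sec);
}
```

Under those assumptions the device holds about 16.4 seconds of audio. The same device would hold only about 1.5 seconds of 16 bit mono audio at 44.1 kHz.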

So I redesigned the Goldilocks Analogue to incorporate space to have two SPI memory (EEPROM, FRAM, SRAM) devices on the board.

Goldilocks Analogue - 2x SPI Memory Devices

Goldilocks Analogue

Implementing a method to read and write bytes to these storage devices is very straightforward. There are many libraries available supporting the SPI storage devices of various types. But none of them supported assembling a package of files on a host PC, and then transferring this to the AVR device in a simple manner. So the hunt for a solution to this issue brought me to the NASA EEFS solution.

NASA EEFS

NASA has been releasing their Core Flight System with Open Source licencing over the past few years. The Core Flight System (CFS) is a recognition that many satellite and deep space missions have very common core requirements and that successive missions were simply cloning previous mission software and then owning changes going forward, with learning being improved in a serial manner. The CFS enabled missions that were developing in parallel to push improvements in the platform CFS code back into the general solution for peer and successive missions to benefit from.

The CFS is layered and each layer hides its implementation, enabling the internals of the layer to be changed without affecting other layers’ implementation. Within the CFS Platform Abstraction Layer there is a module designed to support the management of flight software packages on non-volatile storage, called the EEPROM File System (EEFS).

The EEFS is a very small (approximately 2% of the flight software) piece of code that implements the storage and retrieval of all flight system software from flash storage devices. It was designed by NASA GSFC to support similar outcomes as what I needed for my application:

  • Generate a flight software (or general embedded system) executable image on the development workstation. This feature allows the embedded file system to be generated with a known CRC and loaded on to the target processor as a single image. This is a big advantage over formatting a file system on the image, then transferring each file to the file system on the target.
  • Prove that the file system is correct and reliable. Because the EEPROM file system is simple, the code size is small, making it easy to review and find errors.
  • Patch the files in the file system. Due to the simple layout of the EEPROM file system, it is very easy to patch the files in the file system, if the need arises. This can be helpful in deeply embedded systems such as satellite data systems.
  • Dump and understand the file system format. Because the EEPROM file system is simple, it is easy to dump the contents of the EEPROM or PROM memory and determine the contents of each file.

The EEFS is basically a configurable slot-based file system. The file system can be pre-configured with a certain number of empty files of known sizes, or known files with specific “spare bytes”, and written with a CRC into an image. The File Allocation Table is a fixed size and contains a fixed number of file slots, together with the location and maximum size for each slot. The File Headers for each slot contain all the information about each File. Changing a file does not impact the FAT, and therefore does not affect other files in the File System.
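To make the slot idea concrete, here is a much simplified sketch of a fixed size allocation table. It is not the NASA EEFS layout; the struct and function names, the field choices, and the slot count are all hypothetical and only illustrate why rewriting one file leaves the table and the other files untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SLOTS 8u  /* hypothetical fixed number of file slots */

typedef struct {
    uint32_t offset;    /* byte offset of the slot's file header within the image */
    uint32_t max_size;  /* largest file the slot can hold, in bytes */
} slot_entry_t;

typedef struct {
    uint32_t     crc;         /* checksum over the image, filled in by the host tool */
    uint32_t     slots_used;  /* number of slots laid out so far */
    slot_entry_t slot[MAX_SLOTS];
} slot_table_t;

/* Append a slot directly after the previous one; returns its index or -1 if full. */
static int table_add_slot(slot_table_t *t, uint32_t header_size, uint32_t max_size)
{
    assert(t != NULL);
    if (t->slots_used >= MAX_SLOTS)
        return -1;

    uint32_t offset = (uint32_t)sizeof(slot_table_t);  /* first slot follows the table */
    if (t->slots_used > 0u) {
        const slot_entry_t *prev = &t->slot[t->slots_used - 1u];
        offset = prev->offset + header_size + prev->max_size;
    }
    t->slot[t->slots_used].offset = offset;
    t->slot[t->slots_used].max_size = max_size;
    return (int)t->slots_used++;
}

/* A file can be rewritten in place as long as it still fits its slot. */
static bool slot_can_hold(const slot_table_t *t, uint32_t index, uint32_t new_size)
{
    assert(t != NULL && index < t->slots_used);
    return new_size <= t->slot[index].max_size;
}
```

Because every slot's position and maximum size is fixed when the image is built, growing a file up to its spare bytes never moves its neighbours.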

An EEFS image is created with a tool called geneepromfs, which is a command line tool compiled for the respective host upon which it is used. It reads an input file specifying the files that are to be assembled into the EEFS image, together with the number of empty file slots and their size, and it outputs a complete EEFS image ready to be burnt on the EEPROM, FRAM, or SRAM storage device.

So the EEFS looks like a perfect solution to my requirements. Let’s go to Github and clone the EEFS repository, and get started.

AVR Implementation of EEFS

The EEFS code is supplied for VxWorks or RTEMS platforms, along with a standalone implementation design for bare metal designs. To get the standalone design to work with the AVR ATmega, and my freeRTOS platform of choice, there were two major pieces of work.

Firstly, to develop a generalised SPI interface layer that would allow me to select the actual SPI device installed on the Goldilocks Analogue at compile time. This was necessary because each individual SPI storage device has slightly different command requirements (EEPROM ready check, different address byte numbers), and it made good sense to unify the interface into a single function with compile time options.
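A minimal sketch of the compile time selection idea follows. It only covers building the header for a read transaction, using the READ opcode 0x03 that is common to SPI EEPROM, SRAM, and FRAM parts, and a macro for the number of address bytes. The macro and function names are hypothetical and are not the ones in my library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef SPI_MEM_ADDR_BYTES
#define SPI_MEM_ADDR_BYTES 3  /* 3 for a 1Mbit part, 2 for smaller parts (assumed) */
#endif

#if (SPI_MEM_ADDR_BYTES != 2) && (SPI_MEM_ADDR_BYTES != 3)
#error "SPI_MEM_ADDR_BYTES must be 2 or 3"
#endif

#define SPI_MEM_CMD_READ 0x03u  /* READ opcode shared by common SPI memory devices */

/* Fill out[] with the opcode and address bytes; returns the header length. */
static size_t spi_mem_read_header(uint32_t addr, uint8_t out[1 + SPI_MEM_ADDR_BYTES])
{
    assert(out != NULL);
    size_t n = 0;
    out[n++] = SPI_MEM_CMD_READ;
#if SPI_MEM_ADDR_BYTES == 3
    out[n++] = (uint8_t)(addr >> 16);  /* extra address byte for larger devices */
#endif
    out[n++] = (uint8_t)(addr >> 8);
    out[n++] = (uint8_t)addr;
    return n;
}
```

The caller clocks out the header bytes and then reads data bytes back, so the rest of the file system code never needs to know which device is fitted.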

Secondly, I needed to revise the pointer calculations inherent in the EEFS code. The NASA GSFC code is based on the availability of 32 bit pointers, and does 32 bit calculations to locate information within the file system. But, on the AVR ATmega platform the inherent pointer size is 16 bits, and many of the advanced pointer arithmetic calculations used in the code would fail.
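The pointer problem can be shown with a small host side sketch. It compares address arithmetic done in 16 bits, as a native AVR pointer would do it, against the same sum carried in an explicit 32 bit integer. The names and the 128 KiB device size are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEVICE_BYTES 131072UL  /* assumed 1Mbit SPI memory, larger than 16 bits can address */

/* Address arithmetic as it behaves when the result is held in 16 bits. */
static uint16_t addr_add16(uint16_t base, uint16_t offset)
{
    return (uint16_t)(base + offset);  /* silently wraps past 0xFFFF */
}

/* The same sum carried in a 32 bit integer instead of a pointer. */
static uint32_t addr_add32(uint32_t base, uint32_t offset)
{
    uint32_t addr = base + offset;
    assert(addr < DEVICE_BYTES);       /* stays inside the device */
    return addr;
}
```

Adding 0x2000 to 0xF000 gives 0x1000 in 16 bits but the correct 0x11000 in 32 bits, which is why device addresses are better kept as 32 bit integers than as AVR pointers.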

When I finished the major work, I reduced the return values of most functions to 1 byte error codes, which shaved almost 2,000 bytes of program code off the end result. On the AVR ATmega platform, it is well worth saving 2,000 bytes.

I have built a simple FRAM test program that can write files from a SD Card to the EEFS SPI device, and then edit (read, modify, write) files on the EEFS SPI device for test purposes. This shows how the resulting EEFS library can be best used.

As usual code on Sourceforge AVRfreeRTOS, and also forked on AVR EEFS Github.

ATmega Arduino USART in SPI Master Mode MSPIM

The AVR ATmega MCUs used by the Arduino Uno and its clones and peers (Leonardo, Pro, Fio, LilyPad, etc) and the Arduino Mega have the capability to use their USART (Universal Synchronous/Asynchronous Receiver/Transmitter), also known as the Serial Port, as an additional SPI bus interface in SPI Master mode. This fact is noted in the datasheets of the ATmega328p, ATmega32u4, and the ATmega2560 devices at the core of the Arduino platforms, but until recently it hasn’t meant much to me.

Over the past 18 months I’ve been working on an advanced derivative of the Arduino platform, using an ATmega1284p MCU at its core. I consider the ATmega1284p device the “Goldilocks” of the ATmega family, and as such the devices I’ve built have carried that name. Recently I have been working on a platform which has some advanced analogue output capabilities incorporating the MCP4822 dual channel DAC, together with a quality headphone amplifier, and linear OpAmp for producing buffered AC and DC analogue signals. This is all great, but when it comes down to outputting continuous analogue samples to produce audio it is imperative that the sample train is not interrupted or the music simply stops!

The issue is that the standard configuration of the Arduino platform (over)loads the SPI interface with all of the SPI duties. In the case of the Goldilocks and other Arduino style devices I have ended up having the MicroSD card, some SPI EEPROM and SRAM, and the MCP4822 DAC all sharing the same SPI bus. This means that the input stream of samples from the MicroSD card is interfering and time-sharing with the output sample stream to the DAC. The MicroSD card has a lot of latency, often taking hundreds of milliseconds to respond to a command, whereas the DAC needs a constant stream of samples with no jitter and no more than 22us between each sample. That is a conflict that is difficult to resolve. Even using large buffers is not a solution, as when streaming audio it is easy to consume MBytes of information; which is orders of magnitude more than can be buffered anywhere on the ATmega platform.

Other solutions using a DAC to generate music have used a “soft SPI” and bit-banging techniques to work around the issue. But this creates a performance limitation as the maximum sample output rate is strongly limited by the rate at which the soft SPI port can be bit-banged. There has to be a better way.

USART in SPI mode

The better way to attach SPI Slave devices to the ATmega platform is referenced in this overlooked datasheet heading: “USART in SPI mode”. Using the USART in “Master SPI Mode” (MSPIM) may be limiting if you need the sole serial port to interact with the Arduino (ATmega328p), although once the program is loaded (in the case of using a bootloader) there is often no further need for the serial port. For debugging, however, a system built around its only USART running in SPI mode obviously becomes uncomfortable.

However in the case of the Goldilocks ATmega1284p MCU with two USARTs, the Arduino Leonardo with both USB serial and USART, and the Arduino Mega ATmega2560 MCU with four USARTs, there should be nothing to stop us converting their use to MSPIM buses according to need.

Excuse me for being effusive about this MSPIM capability in the AVR ATmega. It is not exactly a secret, as it is well documented and ages old, but it is a great feature that I’ve simply not previously explored. Now that I have explored it, I think it is worthwhile to write about my experience. I also think that many others have overlooked this USART MSPIM capability, given the dearth of objective review to be found on the ‘net.

Any ATmega datasheet goes into the detailed features and operation of the USART in SPI mode. I’ll go into some of the features in detail and what it means for use in real life.

  • Full Duplex, Three-wire Synchronous Data Transfer – The MSPIM does not rely on having a Slave Select line on a particular pin, and further it doesn’t rely on having both MOSI and MISO lines active at the same time. This means that it is possible to attach a SPI Slave device that doesn’t use the _SS to begin or end transactions with just two pins, being the XCK pin and the Tx pin. If a _SS is required (as in the MCP4822) then only three wires are required. The fact that the MISO (Rx) pin is optional saves precious pins too.
  • Master Operation – The MSPIM only works in SPI Master mode, which means that it is only really useful for connecting accessories. But in the Arduino world, that is what we are doing 99% of the time.
  • Supports all four SPI Modes of Operation (Mode 0, 1, 2, and 3) – yes, it does.
  • LSB First or MSB First Data Transfer (Configurable Data Order) – yes, it does.
  • Queued Operation (Double Buffered) – The MSPIM inherits the USART Tx double buffering capability. This is a function not available on the standard SPI interface and is a great thing. For example, to output a 16 bit command, two writes to the I/O register can follow each other immediately, and the resulting XCK has no delay between the bytes. To output a stream of bytes, the buffer empty flag can be used as a signal to load the next available byte; provided the next byte can be loaded within the 16 CPU clock cycles it takes to shift out one byte at FCPU/2, we can generate a constant stream of bytes. In contrast, transmission on the standard SPI interface is not buffered, and therefore in Master Mode we’re invariably wait-looping before sending the next byte. This wastes cycles between bytes, first in recognising completion and then in loading the next byte for transmission.
  • High Resolution Baud Rate Generator – yes, it is. The MSPIM baud rate can be set to any rate up to half the FCPU clock rate. Whilst there may be little need to run the MSPIM interface at less than the maximum for pure SPI transactions, it is possible to use this feature, together with double buffered transmission, to generate continuous arbitrary binary bit-streams at almost any rate.
  • High Speed Operation (fXCKmax = FCPU/2) – The MSPIM runs at exactly the same maximum clock speed as the standard SPI interface, but through the double buffering capability mentioned above the actual byte transmission rate can be significantly greater.
  • Flexible Interrupt Generation – The MSPIM has all the same interrupts as the USART from which it inherits its capabilities. In particular the differentiation between buffer space available flag / interrupt and transmission complete flag / interrupt capabilities make it possible to develop useful arbitrary byte streaming solutions.
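As a minimal sketch of the queued operation described in the list above, the following loop keeps the double buffer topped up by polling the UDRE1 (data register empty) flag. It assumes USART1 on the ATmega1284p has already been initialised for MSPIM and that any Slave Select handling happens elsewhere. The function name `mspim_stream` is hypothetical, and the `#else` branch only provides host-side stand-ins so the fragment compiles off-target; it is not real hardware behaviour.

```c
#include <stdint.h>
#include <stddef.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host-side stand-ins (assumption, for compiling off-target only). */
#define _BV(b) (1u << (b))
#define UDRE1  5
volatile uint8_t UCSR1A = 1u << UDRE1;  /* buffer always reports "empty" */
volatile uint8_t UDR1;                  /* remembers the last byte written */
#endif

/* Feed a block of bytes to the MSPIM transmitter, back to back. */
void mspim_stream(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* Wait until the transmit buffer can accept another byte. */
        while (!(UCSR1A & _BV(UDRE1)))
            ;
        UDR1 = buf[i];  /* Queue the byte; shifting continues in hardware. */
    }
}
```

Because the wait is on buffer space rather than on transmission complete, the previous byte is still being shifted out while the next one is loaded, which is what removes the gap between bytes.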

Implementation Notes

As the USART in normal mode and the USART in MSPIM mode are quite similar in operation there is little that needs to be written. The data sheet has a very simple initialisation code example, which in practice is sufficient for getting communications going. I would note that as there is no automatic Slave Select management, the _SS line needs to be manually configured as an output, and then set (high) appropriately until such time as the attached SPI device is to be addressed. Also note that the XCKn (USART synchronous clock output) needs to be set as an output before configuring the USART for MSPIM. Note too that the transmission complete flag (TXCn) is not automatically cleared by reading (it is only automatically cleared if an Interrupt is processed), and needs to be manually cleared before commencing a transmission (by writing a 1 to the TXCn bit) if you are planning to use it to signal transaction completion in your code. The Transmit and Receive Data Register (UDRn) is also not automatically cleared, and needs to be flushed before use if a receive transaction is to be synchronised to the transmitted bytes.

So the implementation of a simple initialisation code fragment looks like this:

SPI_PORT_DIR_SS |= _BV(SPI_BIT_SS); // Set SS as output pin (SPI_BIT_SS is a bit number).
SPI_PORT_SS |= _BV(SPI_BIT_SS);     // Pull SS high to deselect the SPI device.
UBRR1 = 0x0000;                     // Baud rate must be zero while the USART is switched into MSPIM.
DDRD |= _BV(PD4);                   // Setting the XCK1 port pin as output, enables USART SPI master mode (this pin for ATmega1284p)
UCSR1C = _BV(UMSEL11) | _BV(UMSEL10) | _BV(UCSZ10) | _BV(UCPOL1);
                                    // Set USART SPI mode of operation and SPI data mode 1,1. UCPHA1 = UCSZ10
UCSR1B = _BV(TXEN1);                // Enable the transmitter only (the Rx and all interrupt enable bits are left at 0).
                                    // Set baud rate. IMPORTANT: The Baud Rate must be set after the Transmitter is enabled.
UBRR1 = 0x0000;                     // Where maximum speed of FCPU/2 = 0x0000
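For clock rates below the FCPU/2 maximum, the UBRR value follows the synchronous master formula from the datasheet, fXCK = FCPU / (2 × (UBRR + 1)). The helpers below are a small sketch of that arithmetic; the function names are hypothetical, and the 16 MHz figure in the usage note is just the common Arduino Uno clock, used as an example.

```c
#include <stdint.h>

/* UBRR value giving the fastest XCK that does not exceed xck_hz.
 * Requests above FCPU/2 are clamped to UBRR = 0 (the maximum rate). */
uint16_t mspim_ubrr_for(uint32_t f_cpu_hz, uint32_t xck_hz)
{
    if (xck_hz == 0)
        return 4095;                       /* slowest setting; UBRR is 12 bits */

    uint32_t divisor = 2UL * xck_hz;
    /* Round the divider up so the real clock never exceeds the request. */
    uint32_t ubrr = (f_cpu_hz + divisor - 1) / divisor;
    if (ubrr > 0)
        ubrr -= 1;
    if (ubrr > 4095)
        ubrr = 4095;
    return (uint16_t)ubrr;
}

/* XCK frequency actually produced by a given UBRR value. */
uint32_t mspim_xck_hz(uint32_t f_cpu_hz, uint16_t ubrr)
{
    return f_cpu_hz / (2UL * ((uint32_t)ubrr + 1));
}
```

With a 16 MHz clock, a request for 8 MHz gives UBRR = 0, and a request for 1 MHz gives UBRR = 7.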

And a fragment of the code to transmit a 16 bit value looks like this. Note with this example there is no need to wait for the UDREn flag to be set between bytes, as we are only writing two bytes into the double transmit buffer. This means that the 16 clocks are generated on XCKn with no gap in delivery.

UCSR1A = _BV(TXC1);              // Clear the Transmit complete flag, all other bits should be written 0.
SPI_PORT_SS &= ~_BV(SPI_BIT_SS); // Pull SS low to select the SPI device.
UDR1 = write.value.u8[1];        // Begin transmission of first byte.
UDR1 = write.value.u8[0];        // Continue transmission with second byte.
while ( !(UCSR1A & _BV(TXC1)) ); // Check we've finished, by waiting for Transmit complete flag.
SPI_PORT_SS |= _BV(SPI_BIT_SS);  // Pull SS high to deselect the SPI device.
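The 16 bit value sent above is an MCP4822 write command. As a hedged sketch of how such a word could be assembled, the helper below follows the command layout in the MCP4822 datasheet (bit 15 selects channel A or B, bit 13 selects 1x gain, bit 12 keeps the output active, bits 11 to 0 carry the sample). The names `mcp4822_command`, `mcp4822_high_byte` and `mcp4822_low_byte` are hypothetical, and the shifts stand in for the `write.value.u8[]` union used in the fragment.

```c
#include <stdint.h>
#include <stdbool.h>

#define MCP4822_CHANNEL_B 0x8000u  /* bit 15: 0 = DAC A, 1 = DAC B       */
#define MCP4822_GAIN_1X   0x2000u  /* bit 13: 1 = 1x gain, 0 = 2x gain   */
#define MCP4822_ACTIVE    0x1000u  /* bit 12: 1 = output on, 0 = shutdown */

/* Build a write command with the output active and a 12 bit sample. */
uint16_t mcp4822_command(bool channel_b, bool gain_1x, uint16_t value12)
{
    uint16_t cmd = MCP4822_ACTIVE | (value12 & 0x0FFFu);
    if (channel_b)
        cmd |= MCP4822_CHANNEL_B;
    if (gain_1x)
        cmd |= MCP4822_GAIN_1X;
    return cmd;
}

/* The DAC expects the most significant byte first. */
uint8_t mcp4822_high_byte(uint16_t cmd) { return (uint8_t)(cmd >> 8); }
uint8_t mcp4822_low_byte(uint16_t cmd)  { return (uint8_t)(cmd & 0xFFu); }
```

The high byte would be written to UDR1 first, followed immediately by the low byte, matching the order of the two writes in the fragment.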

Results

Looking at the output generated by the two different SPI interfaces on the AVR ATmega, it is easy to see the features in action. In the first image we can see that the two bytes of the 16 bit information for the DAC are separated, as loading the next byte to be transmitted requires clock cycles AFTER the transmission completed SPIF flag has been raised.

DAC control using SPI bus.

In the case of the MSPIM output, we can’t recognise where the two bytes are separated, and the end of the transaction is triggered by the Transmit Complete (TXCn) flag. This example shows that the MSPIM can actually be faster than the standard SPI interface, even though the maximum clock speed in both cases is FCPU/2.

DAC control using USART MSPIM bus.

The final image shows the Goldilocks DAC generating a 44.1kHz output signal, with dual 12 bit outputs. Whilst this is not fully CD quality, comparisons with other DAC solutions available on the Arduino platform have been favourable.
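To get a feel for how much headroom a 44.1kHz dual channel stream leaves, here is a rough cycle budget sketch. The clock frequency is a parameter because the Goldilocks clock is not stated here; the 16 MHz value in the usage note is only the common Arduino Uno figure. The function names are hypothetical, and the shift time covers only clocking bits out on XCK, not Slave Select toggling or instruction overhead.

```c
#include <stdint.h>

/* Whole CPU cycles available between consecutive samples. */
uint32_t cpu_cycles_per_sample(uint32_t f_cpu_hz, uint32_t sample_rate_hz)
{
    return f_cpu_hz / sample_rate_hz;
}

/* CPU cycles spent shifting out `bits` bits at a given UBRR setting.
 * Each XCK period lasts 2 * (UBRR + 1) CPU cycles. */
uint32_t mspim_shift_cycles(uint32_t bits, uint16_t ubrr)
{
    return bits * 2UL * ((uint32_t)ubrr + 1);
}
```

At 16 MHz there are about 362 cycles per 44.1kHz sample, of which two 16 bit DAC commands at UBRR = 0 occupy 64 cycles of shifting time.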

44.1kHz samples using USART MSPI output.

Conclusion

I am now convinced to use the USART MSPIM capability for the Goldilocks Analogue, and I think that it is time to write some generalised MSPIM interface routines to go into my AVRfreeRTOS Sourceforge repository to make it easy to use this extremely powerful capability.