Goldilocks Analogue – Testing 4

Recap

I’ve been working on the Goldilocks Analogue now for so long that its been the centerpiece of my coding evenings for the past 18 months. This is the first time that I’ve designed a piece of hardware, and I’ve managed to make many mistakes (or learnings) along the way, so I think that every step of the process has been worthwhile.

Goldilocks Analogue Prototype 4 - Rotating

Goldilocks Analogue Prototype 4 – Rotating

In this project, I’ve learned about Digital to Analogue Converters, Operational Amplifiers, Voltage translation, Switching Power Supplies, and most importantly have gained a working knowledge of the Eagle PCB design tools.

Version History

The original Goldilocks Project was specifically about getting the ATmega1284p MCU onto a format equivalent to the Arduino Uno R3. The main goal was to get more SRAM and Flash memory into the same physical footprint used by traditional Arduino (pre-R3) and latest release Uno R3 shields.

The Goldilocks project achieved that goal, but the resulting ATmega platform still lacked one function that I believe is necessary; a high quality analogue capability. The world is analogue, but having an ADC capability, without having a corresponding digital-to-analogue capability, is like having a real world recorder (the ADC capability) with no means to playback and recover these real world recordings.

A major initiative of the Goldilocks is to bring an analogue capability to the Arduino platform via a DAC, so this project was called the Goldilocks Analogue. Following a Kickstarter campaign, the Goldilocks Analogue is now available on Tindie.

I sell on Tindie

Version 1 implemented the MCP4822 DAC buffered with an expensive, very musical, Burr Brown Operational Amplifier. Although the DAC performed exactly as designed, I neglected to provide the Op Amp with a negative supply rail, so it could not approach 0V required to reproduce the full range of output available (0v to 4.096V) from the DAC. That was a mistake.

Version 1 also reverted the USB to Serial interface to use a proper FTDI FT232R device from the ATmega32u2 MCU (the solution preferred now by Arduino). Using the ATmega32u2 (or ATmega16u2) to interface with anything not running at its own 16MHz CPU clock rate is broken and doesn’t work. This is because the ATmega devices don’t produce correct serial when their CPU clock is not a clean multiple of the USART required rate. I learned this the hard way with the Goldilocks Project.

Finally Version 1 implemented a buffered microSD card interface, using devices suitable to convert from 5V to 3.3V (SCK, MOSI, CS), and 3.3V to 5V (MISO) for the SPI interface. This was a correct implementation, but later I removed it because sometimes I’m really too smart for my own good.

Updated - Goldilocks Analogue

Goldilocks Analogue – Version P1

Some significant rework of the analogue section was required for Version 2. Firstly, I decided that audio was a significant use case for the Goldilocks Analogue so it was worth while reducing the cost of the Op Amp, and sharing that expenditure with a special purpose headphone amplifier, working in parallel with the Op Amp.

The TPA6132A2 headphone amplifier was found, and implemented into the prototype. This “DirectPath” device removes the need for large output capacitors in the signal path, as it provides a zero centred output from a single supply voltage.

Version 2 also had a lesser but fully adequate TS922A Op Amp for providing DC to full rate signals, buffering the MCP4822 DAC. I had learned that Op Amps need to be provided with a negative supply rail, if they are to achieve 0V output under load.

To create a negative supply rail for the Version 2 Op Amp, I used a pair of TPS60403 voltage inverters producing -5V from the main power supply, and then fed that signal into a TPS72301 regulator configured to produce -1.186V. This design worked very well, but needed 3 devices to produce the output and also quite a lot of board space.

Finally, for Version 2, I changed the microSD voltage translation to use the TXS0104 (TXB0104) level translation products. These special purpose devices enable the bidirectional transfer of signals over voltage transitions. Normally these two device types (TXB / TXS) work very well, but somehow I couldn’t ever get either version to work properly, which caused the microSD card to never work. This exercise was a failure, and was reverted to normal buffer logic for Version 3.

Goldilocks Analogue Version 2

Goldilocks Analogue – Version P2

Version 3 was supposed to be my final prototype prior to putting this into a crowd funding project. I made some fairly simple changes, and I was pretty happy with the outcome of Version 2 as it was.

Version 3 changed the MCP4822 DAC interface to use the USART1 interface present on the ATmega1284p MCU. Using the USART1 in MSPI Mode allows the DAC to operate independently of the normal SPI bus. This frees up timing constraints on the SPI interface so that slow microSD cards can be read and streamed to the DAC, or SPI graphics interfaces can be used as in the case of the synthesiser demonstration.

As part of the testing, I found that the ATmega1284p would operate successfully at 24.576MHz. This is a magic frequency because it allows an 8 bit timer to reproduce all of the major audio sampling rates related to 48kHz. I’ve tested using this frequency and everything is working fine.

Obviously there was some feature creep over the course of experimentation, so I decided to add two SPI memory device layouts to the PCB. This would allow up to 23LC1024 256kByte of SRAM to be addressed, together with AT25M01 256kByte of EEPROM, for example. Putting these layouts on the board meant that the JTAG interface had to be moved to the back of the PCB. This was actually not a bad thing, as it would have been impossible to use the JTAG interface with a Shield in place anyway.

I was asked to beef up the 3.3V supply capability of the Goldilocks Analogue, so I added a AP1117 device capable of up to 1A supply, upgrading from 150mA.

Goldilocks Analogue Version 3

Goldilocks Analogue – Version P3

Version 4 added the feature of amplified Microphone Input. Using the MAX9814 Mic Amp electret microphones present in the normal headsets used with smart-phones can act as an audio input to the Goldilocks Analogue. The amplified signal is presented on Pin A7, which falls outside of the normal Arduino Uno R3 footprint. For completeness, I also added a level shifted non-amplified signal (for line-in) to Pin A6.

Because of the extra microphone amplifier circuitry, I needed to reduce the footprint of the negative supply rail. So I used a regulated -3V device LTC1983 to produce the negative supply rail required for the Op Amp. There are now 3 regulators on the Goldilocks Analogue, 5V at up to 2.4A, 3.3V at up to 1A, and -3V at 100mA.

Version 4 also saw the last of the through hole components gone. Both the main crystal and the 32kHz clock crystal were reformatted into surface mount technology. This reduces the flexibility of the platform, but it makes manufacturing a bit easier.

Goldilocks Analogue - Version P4

Goldilocks Analogue – Version P4

Version 4 is the end. There will be some minor adjustments to the board for the production version. But, finally, I think this is it.

Unboxing

Prototype 4 back from manufacturing

Prototype 4 back from manufacturing

Errata

There is no power supply jumper provided. Add this to the BOM.

The EEPROM is built with SOIC8 Wide Body, but it should be Narrow Body. Fix the BOM.

Labels for the recommended power supply Voltage are partially obscured by power jack. Move the labels.

The main crystal doesn’t oscillate, because the Eagle library had an incorrect footprint.
Tested to work by rotating the crystal on its existing footprint.

Respecify the main MCU crystal to the Atmel recommended type.

Change C2 and C3 to 15pF in line with Atmel recommendation.

Change C1 and C11 to 12pF in line with Atmel recommendation.

Change R18 to 100kOhm, to bias the Analogue Line in correctly.

Add a simple LC filter to the -3.3V supply, using the known inductor and 10uF capacitor.

Tie Mic Gain to AVcc with a separable link, to produce 40dB default gain. This is the setting required for most normal microphones, so will improve the “out of box” experience for most users.

Testing

Front

Version P4 – Front

Back

Version P4 – Back

Power Supplies

First, looking at power supply noise, we’ve got a slightly better result for noise at the power supply over the Prototype 1. Prototype 1 used the EUP3476 Switched Mode supply device. Problems with obtaining ready supply of this device led to changing it to the AP6503, which is pin compatible but needs slightly different voltage selection resistors. In contrast to the Arduino Uno and other devices, this is very good.

We remember that 1mV represents the Voltage change in the least significant bit of the 12 bit MCP4822. The least significant bit of the ATmega1283p 10 bit ADC is about 4mV. Sampling voltages similar to the noise level inherent on the platform will not generate any further accuracy.

Power Supply GA 4

GA P4: 5V Power Supply Noise

Channel 1 (yellow) is 4.0mV of noise present at the output capacitor for the power supply, and represents the lowest noise inherent in the supply. Channel 2 (blue) is the 5.6mV supply noise present on a test pin closest to the MCU.

The significant improvement in noise level for the GA4 version at the MCU may be due to the currently decreased system clock rate, and therefore is subject to confirmation.

Power Supply GA1

GA P1: 5V Power Supply Noise

Checking the other power supplies on the board, Channel 1 (yellow) is the 3.3V positive supply, provided by a linear regulator. This supply is not used for analogue components, so the 4.3mV noise level is not critical.

Channel 2 (blue) below shows the -3V supply for the Operational Amplifier. This shows a 10mV supply voltage ramp, generated because it is a capacitive charge switching device. This is not particularly good, and it will be worth adding an LC filter to attempt to further smooth this supply, for the production Goldilocks Analogue.

Goldilocks Analogue Prototype 4 - 3.3V & -3V Supply Noise

GA P4: 3.3V & -3V Power Supply Noise

Microphone Input

Microphone amplifier works.

Interestingly, when the amplifier is set to 60dB gain (default setting, Channel 2, blue) it oscillates at 144kHz when not connected to anything, while it doesn’t do this with the normal gain setting of 40dB (Channel 1, yellow) based on the amplification required for typical smartphone headphones. In either case, as soon as the Mic input is grounded, the Mic output reduces to less than 4mVAC on the correct bias of 1.25VDC.

Suggest to make 40dB amplification the default setting for the production Goldilocks Analogue, but allowing the connection to be segmented for increased amplification if required.

GA4 - Mic Oscillation Comparison 60dB and 40dB

GA4 – Mic Oscillation Comparison at 40dB (CH1) and 60dB (CH2) amplification.

Line-In (PA6) seems to be biased at 1.07V rather than 1.25V. Check calculation again… Doh! R18 should be 100kOhm.

Analogue Output

The standard test that I’ve been using throughout the development is to feed in a 43.1Hz Sine wave generated from a 1024 value 16 bit LUT. The sampling rate is 44.1kHz, which is generated by Timer 1 to get the closest match.

The spectra and oscilloscope charts below can be directly compared to the testing done with previous prototype versions of the Goldilocks Analogue.

The below chart shows the sine wave generated at the output of the Op Amp. This is exactly as we would like to see, with no compression of either the 4.096V peak, or the 0V trough.

Goldilocks Analogue – 43Hz Sine Wave – Two Channels – One Channel Inverted

Goldilocks Analogue P4 – 43Hz Sine Wave – Two Channels – One Channel Inverted

Looking at the spectra generated up to 953Hz it is possible to see harmonics from the Sine Wave, and other low frequency noise.

The spectrum produced by the Goldilocks Analogue shows most distortion is below -70dB, and that the noise floor lies below -100dB.

Goldilocks Analogue – 43.1Hz Sine Wave – 953Hz Spectrum

Goldilocks Analogue P4 – 43.1Hz Sine Wave – 953Hz Spectrum

Comparing the same DAC output channel A from Op Amp (Channel 1 – Blue) with the same Headphone Amp Left (Channel 2 – Red), we see slightly more noise carriers.

Goldilocks Analogue P4 – 43.1Hz Sine Wave – 953Hz Spectrum

Goldilocks Analogue P4 – 43.1Hz Sine Wave – 953Hz Spectrum – Headphone in Red

In the spectrum out to 7.6kHz we are looking at the clearly audible range, which is the main use case for the device.

The Goldilocks Analogue has noise carriers out to around 4.5kHz, but they are all below -80dB. After 4.5kHz the only noise remains below -100dB.

Goldilocks Analogue – 43.1Hz Sine Wave – 7.6kHz Spectrum

Goldilocks Analogue P4 – 43.1Hz Sine Wave – 7.6kHz Spectrum

The spectra out to 61kHz should show a noise carrier generated by the reconstruction frequency of 44.1kHz.

The Goldilocks Analogue shows the spectrum maintains is low noise level below -90dB right out to the end of the audible range, and further out to the reconstruction carrier at 44.1kHz.

Goldilocks Analogue – 43.1Hz Sine Wave – 61kHz Spectrum

Goldilocks Analogue P4 – 43.1Hz Sine Wave – 61kHz Spectrum

The final spectrum shows the signal out to 976kHz. We’d normally expect to simply see the noise floor, beyond the 44.1kHz reconstruction carrier noise.

The Goldilocks Analogue has a noise carrier at around 210kHz. Aside from the single carrier mentioned, there is no further noise out to 976kHz.

Goldilocks Analogue – 43.1Hz Sine Wave – 976kHz Spectrum

Goldilocks Analogue P4 – 43.1Hz Sine Wave – 976kHz Spectrum

Analogue output works as specified. It can maintain the 72dB SNR required, of which it should theoretically be capable.

SPI Devices

MicroSD works.

SPI SRAM / EEPROM devices work.

Tests for REWORK

These tests are for the factory rework of the prototype boards.

Check 5V Power Supply

Plug 7-23VDC positive centre into J1.
CHECK: that there is 5VDC available on H15 DC Pin.

GAV4-5VDC

Check 3.3V and -3V Power Supply

Add a jumper to J1 from the centre pin to DC Pin.

CHECK: that 5VDC, 3.3VDC, and -3VDC is available at the test points indicated.

GA4V-Powered

Check for Internal RC Oscillator Function

Connect an AVRISP Mk2 in circuit programmer to the ICSP Socket.

Use the following command to enable the CLKO on PB1, and set other fuses correctly.

avrdude -pm1284p -cavrisp2 -Pusb -u -Ulfuse:w:0x82:m -Uhfuse:w:0xd8:m -Uefuse:w:0xfc:m

CHECK: Using an oscilloscope or counter, check that the CLKO signal on PB1 is between 7MHz and 8.1MHz (nominal value 8MHz).

GAV4-CLKO

Rework 24.576MHz Crystal & Capacitors

Identify Crystal Y2, and capacitors C2 & C3 and C1 & C11.

Screenshot from 2015-10-06 23:21:16

IMPORTANT: Y2 must be ROTATED 90 degrees (either direction) from standard footprint.

Rework Y2 to the 24.576MHz SMD Crystal Digikey 644-1053-1-ND device.

Rework C2 & C3 to 15pF.

Rework C1 & C11 to 12pF.

Check External Crystal Oscillator Function

Connect an AVRISP Mk2 in circuit programmer to the ICSP Socket.

Use the following command to enable the CLKO on PB1, and set other fuses to External Full Swing Oscillator with BOD.

avrdude	-pm1284p -cavrisp2 -Pusb -u -Ulfuse:w:0x97:m -Uhfuse:w:0xd8:m -Uefuse:w:0xfc:m

CHECK: Using an oscilloscope or counter, check that the CLKO signal on PB1 is 24.576MHz.

GAV4-CLKO

Use the following command to disable the CLKO on PB1, and set other fuses to External Full Swing Oscillator with BOD.

avrdude	-pm1284p -cavrisp2 -Pusb -u -Ulfuse:w:0xd7:m -Uhfuse:w:0xd8:m -Uefuse:w:0xfc:m

Rework R18

CHECK: DC voltage on Port A6 (ADC Input 6). Expected value 1.07VDC.

Identify resistor R18. R18 needs to be reworked to 100kOhm.

Screenshot from 2015-10-06 23:21:02

GAV4-Rework

CHECK: DC voltage on Port A6 (ADC Input 6). Confirm the expected value 1.25VDC.

Load Bootloader

Connect an AVRISP Mk2 in circuit programmer to the ICSP Socket.

Use the following command to program a bootloader using the provided HEX file.

avrdude -pm1284p -cavrisp2 -Pusb -Uflash:w:GABootMonitor.hex:a

Connect the Goldilocks Analogue to a USB port and open a serial terminal at 38400 baud to the attached FTDI USB /dev/ttyUSB0 interface.

CHECK: Enter !!! into the serial terminal, immediately following a board RESET (using the RESET button). This will enable the Boot Monitor. These are the boot monitor commands.

Bootloader>
Bootloader>H Help
0=Zero addr (for all commands)
?=CPU stats
@=EEPROM test
B=Blink LED
E=Dump EEPROM
F=Dump FLASH
H=Help
L=List I/O Ports
Q=Quit
R=Dump RAM
V=show interrupt Vectors
Y=Port blink

Bootloader>
Bootloader>? CPU stats
Goldilocks explorer stk500v2
Compiled on = Oct  7 2015
CPU Type    = ATmega1284P
__AVR_ARCH__= 51
GCC Version = 4.8.2
AVR LibC Ver= 1.8.0
CPU ID      = 1E9705
Low fuse    = D7
High fuse   = D8
Ext fuse    = FC
Lock fuse   = FF

Close the serial terminal program (disconnect from the /dev/ttyUSB0 device).

Load Test Program

With the USB interface connected to the Goldilocks Analogue, and the DTR switch in the right most position (closest to DTR text; opposite position to the pictured position).

GAV4-DTR

Use the following command to program a Test Suite using the provided HEX file.

avrdude	-pm1284p -cwiring -P/dev/ttyUSB0 -b38400 -D -v -Uflash:w:GATestSuite.hex:a

Plug a standard smartphone headset (with microphone) with a 3.5mm TRRS connector in LRGM (left, right, ground, mic) configuration into the socket.
CHECK: Sound (echo of input) should be heard from the headphones, when speaking into the microphone.

Connect the Goldilocks Analogue to a USB port and open a serial terminal at 38400 baud to the attached FTDI USB /dev/ttyUSB0 interface.
CHECK: Information on each Task’s stack “Highwater” mark should be seen on the serial terminal.

XBee Walkie Talkie

I’m building an advanced Arduino clone based on the AVR ATmega1284p MCU with some special features including a 12 bit DAC MCP4822, headphone amplifier, 2x SPI Memory (SRAM, EEPROM), and a SD Card. There are many real world applications for analogue outputs, but because the Arduino platform doesn’t have integrated DAC capability there are very few published applications for analogue signals. A Walkie Talkie is one example of using digital and analogue together to make a simple but very useful project.

Two Goldilocks Analogue prototypes with XBee radios, and Microphone amplifiers.

Two Goldilocks Analogue prototypes with XBee radios, and Microphone amplifiers.

The actual Walkie Talkie functionality is really only a few lines of code, but it is built on a foundation of analogue input (sampling), analogue output on the SPI bus to the MCP4822 DAC, sample timing routines, and the XBee digital radio platform. Let’s start from the top and then dig down through the layers.

XBee Radio

I am using XBee Pro S2B radios, configured to communicate point to point. For the XBee Pro there needs to be one radio configured as the Coordinator, and the other as a Router. There are configuration guides on the Internet.

I have configured the radios to wait the maximum inter-character time before sending a packet, which implies that the packets will be set only when full (84 bytes). This maximises the radio throughput. Raw throughput is 250 kbit/s, but the actual user data rate is limited to about 32 kbit/s. This has an impact on the sampling rate and therefore quality of speech that can be transmitted.

Using 10 bit samples companded using A-Law to 8 bit code words, I have found that about 3 kHz sampling generates about as much data as can be transmitted without compression. I’m leaving G.726 compression for another project.

The XBee radios are configured in AT mode, which acts as a transparent serial pipe between the two endpoints. This is the simplest way to connect two devices via digital radio. And it allowed me to do simple testing, using wire, before worrying about whether the radio platform was working or not.

XBee Packet Reception in Purple

XBee Packet Reception in Purple

Looking at the tracing of a logic analyser, we can see the XBee data packets arriving on the (purple) Rx line of the serial port. The received packet data is stored into a ring buffer, and played out at a constant rate. I have allowed up to 255 bytes in the receive ring buffer, and this will be sufficient because the XBee packet size is 84 bytes.

The samples to be transmitted to the other device are transmitted on the (blue) Tx line, more or less in each sample period even though they are buffered before transmission. The XBee radio buffers these bytes for up to 0xFF inter-symbol periods (configuration), and only transmits a packet to the other endpoint when it has a full packet.

Sampling Rate

Looking at the bit budget for the transmission link, we need to calculate how much data can be transmitted without overloading the XBee radio platform, and causing sample loss. As we are not overtly compressing the voice samples, we have 8 bit samples times 3,000 Hz sampling or 24 kbit/s to transmit. This seems to work pretty well. I have tried 4 kHz sampling, but this is too close to the theoretical maximum, and doesn’t work too effectively.

Sample rate of 3,000 Hz seems to be the optimum.

Sample rate of 3,000 Hz seems to be the optimum.

Looking at the logic analyser, we can see the arrival of a packet of bytes commencing with 0x7E and 0x7C on the Rx line. Both the Microphone amplifier and the DAC output are biased around 0x7F(FF), so we can read that the signal levels captured and transmitted here are very low. The sample rate shown is 3,000 Hz.

Sample Processing

I have put a “ping” on one output to capture when the sampling interrupt is being processed (yellow). We can see that the amount of time spent in the interrupt processing is very small for this application, relative to the total time available. Possibly some kind of data compression could be implemented.

DAC and sample processing

DAC and sample processing

During the sampling interrupt, there are two major activities, generating an audio output, by placing a sample onto the DAC, and then reading the ADC to capture an audio sample and transmit it to the USART buffer.

This is done by the audioCodec_dsp function, which is called from the code in a timer interrupt.

void audioCodec_dsp( uint16_t * ch_A,  uint16_t * ch_B)
{
	int16_t xn;
	uint8_t cn;
	/*----- Audio Rx -----*/

	/* Get the next character from the ring buffer. */

	if( ringBuffer_IsEmpty( (ringBuffer_t*) &(xSerialPort.xRxedChars) ) )
	{
		cn = 0x80 ^ 0x55; // put A-Law nulled signal on the output.
	}
	else if (ringBuffer_GetCount( &(xSerialPort.xRxedChars) ) > (portSERIAL_BUFFER_RX>>1) ) // if the buffer is more than half full.
	{
		cn = ringBuffer_Pop( (ringBuffer_t*) &(xSerialPort.xRxedChars) ); // pop two samples to catch up, discard first one.
		cn = ringBuffer_Pop( (ringBuffer_t*) &(xSerialPort.xRxedChars) );
	}
	else
	{
		cn = ringBuffer_Pop( (ringBuffer_t*) &(xSerialPort.xRxedChars) ); // pop a sample
	}

	alaw_expand1(&cn, &xn);	// expand the A-Law compression

	*ch_A = *ch_B = (uint16_t)(xn + 0x7fff); // move the signal to positive values, put signal out on A & B channel.

	/*----- Audio Tx -----*/

	AudioCodec_ADC( &mod7_value.u16 );	// sample is 10bits left justified.

	xn = mod7_value.u16 - 0x7fe0;	// center the sample to 0 by subtracting 1/2 10bit range.

	IIRFilter( &tx_filter, &xn);	// filter transmitted sample train

	alaw_compress1(&xn, &cn);	// compress using A-Law

	xSerialPutChar( &xSerialPort, cn);	// transmit the sample
}

I am using the AVR 8 bit Timer0 to generate the regular sample intervals, by triggering an interrupt. By using a MCU FCPU frequency which is a binary multiple of the standard audio frequencies, we can generate accurate reproduction sampling rates by using only the 8 bit timer with a clock prescaler of 64. To generate odd audio frequencies, like 44,100 Hz, the 16 bit Timer1 can be used to get sufficient accuracy without requiring a clock prescaler.

The ATmega1284p ADC is set to free-run mode, and is scaled down to 192 kHz. While this is close to the maximum acquisition speed documented for the ATmega ADC, it is still within the specification for 8 bit samples.

ISR(TIMER0_COMPA_vect) __attribute__ ((hot, flatten));
ISR(TIMER0_COMPA_vect)
{
#if defined(DEBUG_PING)
  // start mark - check for start of interrupt - for debugging only (yellow trace)
  PORTD |= _BV(PORTD7); // Ping IO line.
#endif

  // MCP4822 data transfer routine
  // move data to the MCP4822 - done first for regularity (reduced jitter).
  DAC_out (ch_A_ptr, ch_B_ptr);

  // audio processing routine - do whatever processing on input is required - prepare output for next sample.
  // Fire the global audio handler which is a call-back function, if set.
  if (audioHandler!=NULL)
    audioHandler(ch_A_ptr, ch_B_ptr);

#if defined(DEBUG_PING)
  // end mark - check for end of interrupt - for debugging only (yellow trace)
  PORTD &= ~_BV(PORTD7);
#endif
}

This interrupt takes 14 us to complete, and is very short relative to the 333 us we have for each sample period. This gives us plenty of time to do other processing, such as running a user interface or further audio processing.

SPI Transaction

At the final level of detail, we can see the actual SPI transaction to output the incoming sample to the MCP4822 DAC.

SPI DAC Transaction

SPI MCP4822 DAC Transaction

As I have built this application on the Goldilocks Analogue Prototype 2 which uses the standard SPI bus, the transaction is normal. My later prototypes are using the Master SPI Mode on USART 1 of the ATmega1284p, which slightly accelerates the SPI transaction through double buffering, and frees the normal SPI bus for simultaneous reading or writing to the SD Card or SPI Memory, for audio streaming. In the Walkie Talkie application there is no need to capture the audio, so there’s no down side to using the older prototypes and the normal SPI bus.


void DAC_out(const uint16_t * ch_A, const uint16_t * ch_B)
{
  DAC_command_t write;

  if (ch_A != NULL)
  {
    write.value.u16 = (*ch_A) >> 4;
    write.value.u8[1] |= CH_A_OUT;
  }
  else // ch_A is NULL so we turn off the DAC
  {
    write.value.u8[1] = CH_A_OFF;
  }

  SPI_PORT_SS_DAC &= ~SPI_BIT_SS_DAC; // Pull SS low to select the Goldilocks Analogue DAC.
  SPDR = write.value.u8[1]; // Begin transmission ch_A.
  while ( !(SPSR & _BV(SPIF)) );
  SPDR = write.value.u8[0]; // Continue transmission ch_A.

  if (ch_B != NULL) // start processing ch_B while we're doing the ch_A transmission
  {
    write.value.u16 = (*ch_B) >> 4;
    write.value.u8[1] |= CH_B_OUT;
  }
  else // ch_B is NULL so we turn off the DAC
  {
    write.value.u8[1] = CH_B_OFF;
  }

  while ( !(SPSR & _BV(SPIF)) ); // check we've finished ch_A.
  SPI_PORT_SS_DAC |= SPI_BIT_SS_DAC; // Pull SS high to deselect the Goldilocks Analogue DAC, and latch value into DAC.

  SPI_PORT_SS_DAC &= ~SPI_BIT_SS_DAC; // Pull SS low to select the Goldilocks Analogue DAC.
  SPDR = write.value.u8[1]; // Begin transmission ch_B.
  while ( !(SPSR & _BV(SPIF)) );
  SPDR = write.value.u8[0]; // Continue transmission ch_B.
  while ( !(SPSR & _BV(SPIF)) ); // check we've finished ch_B.
  SPI_PORT_SS_DAC |= SPI_BIT_SS_DAC; // Pull SS high to deselect the Goldilocks Analogue DAC, and latch value into DAC.
}

Wrap Up

Using a few pre-existing tools and a few lines of code, it is possible to quickly build a digitally encrypted walkie talkie, capable of communicating (understandable, but certainly not high quality) voice. And, there ain’t no CB truckers going to be listening in on the family conversations going forward.

This was a test of adding microphone input based on the MAX9814 to the Goldilocks Analogue. I will be revising the Prototype 3 and will add in a microphone amplification circuit to support applications needing audio input, like this walkie talkie example, or voice changers, or vocal control music synthesizers.

I’m also running the ATmega1284p devices at the increased frequency of 24.576 MHz, over the standard rate of 20 MHz. This specific frequency allows very precise reproduction of audio samples from 48 kHz down right down to 4 kHz (or even down to 1,500 Hz). The extra MCU clock cycles per sample period are very welcome when it comes to generating synthesised music, or encoding audio samples.

Code as usual on Github AVRfreeRTOS repository. Also, a call out to Shuyang at SeeedStudio who’s OPL is awesome, and is the source of many components and PCBs.

Implementing NASA EEFS on AVR ATmega

I am building a variant of the Arduino platform which will have an analogue output capability in the form of a dual channel DAC, called Goldilocks Analogue. The DAC can be used to generate variable DC voltage levels that might be used as part of a PID control system, and it can also generate AC voltages up to about 50kHz if it can be fed with sufficient samples to produce the required signal. To generate a 44.1kHz audio signal the DAC has to receive a stream of data, with a new sample every 22us without fail.

44.1kHz samples using USART MSPI output.

44.1kHz samples using USART MSPI output.

Finding an answer to the question of how to reliably stream data to the DAC is the background to this post.

Looking for a way to structure and assemble a combination of many WAV files on a host PC for storage onto to the AVR ATmega MCU, I needed a system that would support:

  • Editing and assembly of files on a host PC (Linux, Windows, Mac), in to a package.
  • Transferring a package of files to the AVR ATmega (Arduino) device very simply.
  • Can read and write files to the storage medium very quickly, and without jitter.
  • Simple implementation in the avr-libc environment.

Initially I was looking at using the FAT File System on a SD Card to provide the required capability, but I found that SD Cards are quite slow when writing data to their FLASH medium. Often taking 100ms or more to complete a write cycle. A SD Card read cycle also takes quite a long time, when the FAT file system must be inspected prior to reading or writing a specific block of information. The SD Card is great for storing Mega Bytes of information, but is not optimal for jitter free read and write applications.

So I started looking at chip storage based on the SPI bus as a mechanism to store large numbers of samples for playback, or to store large amounts of acquired data samples. There are many alternatives using different technologies for SPI storage devices. These range from EEPROM storage, through to SRAM and also newer FRAM technologies. Storage capabilities with up to 1Mbit seem to be quite good value. For my application 1Mbit of storage would allow about 16 seconds of reasonable quality audio to be retrieved with minimal issues for complexity, jitter, and delay.

So I redesigned the Goldilocks Analogue to incorporate space to have two SPI memory (EEPROM, FRAM, SRAM) devices on the board.

Goldilocks Analogue - 2x SPI Memory Devices

Goldilocks Analogue – 2x SPI Memory Devices

Goldilocks Analogue

Goldilocks Analogue

Implementing a method to read and write bytes to these storage devices is very straightforward. There are many libraries available supporting the SPI storage devices of various types. But none of them supported assembling a package of files on a host PC, and then transferring this to the AVR device in a simple manner. So the hunt for a solution to this issue brought me to the NASA EEFS solution.

NASA EEFS

NASA has been releasing their Core Flight System with Open Source licencing over the past few years. The Core Flight System (CFS) is a recognition that many satellite and deep space missions have very common core requirements and that successive missions were simply cloning previous mission software and then owning changes going forward, with learning being improved in a serial manner. The CFS enabled missions that were developing in parallel to push improvements in the platform CFS code back into the general solution for peer and successive missions to benefit from.

The CFS is layered and each layer hides its implementation, enabling the internals of the layer to be changed without affecting other layers’ implementation. Within the CFS Platform Abstraction Layer there is a module designed to support the management of flight software packages on non-volatile storage, called the EEPROM File System (EEFS).

The EEFS is a very small (approximately 2% of the flight software) piece of code that implements the storage and retrieval of all flight system software from flash storage devices. It was designed by NASA GSFC to support similar outcomes as what I needed for my application:

  • Generate a flight software (or general embedded system) executable image on the development workstation. This feature allows the embedded file system to be generated with a known CRC and loaded on to the target processor as a single image. This is a big advantage over formatting a file system on the image, then transferring each file to the file system on the target.
  • Prove that the file system is correct and reliable. Because the EEPROM file system is simple, the code size is small, making it easy to review and find errors.
  • Patch the files in the file system. Due to the simple layout of the EEPROM file system, it is very easy to patch the files in the file system, if the need arises. This can be helpful in deeply embedded systems such as satellite data systems.
  • Dump and understand the file system format. Because the EEPROM file system is simple, it is easy to dump the contents of the EEPROM or PROM memory and determine the contents of each file.

The EEFS is basically a configurable slot-based file system. The file system can be pre-configured with a certain number empty files of known sizes, or known files with specific “spare bytes”, and written with a CRC into an image. The File Allocation Table is a fixed size and contains a fixed number of file slots, together with the location and maximum size for each slot. The File Headers for each slot contain all the information about each File. Changing a file does not impact the FAT, and therefore does not affect other files in the File System.

An EEFS image is created with a tool called geneepromfs, which is a command line tool compiled for the respective host upon which it is used. It reads an input file specifying the files that are to be assembled into the EEFS image, together with the number of empty file slots and their size, and it outputs a complete EEFS image ready to be burnt on the EEPROM, FRAM, or SRAM storage device.

So the EEFS looks like a perfect solution to my requirements. Let’s go to Github and clone the EEFS repository, and get started.

AVR Implementation of EEFS

The EEFS code is supplied for VxWorks or RTEMS platforms, along with a standalone implementation design for bare metal designs. To get the standalone design to work with the AVR ATmega, and my freeRTOS platform of choice, there were two major pieces of work.

Firstly, to develop a generalised SPI interface layer that would allow me to select the actual SPI device installed on the Goldilocks Analogue at compile time. This was necessary because each individual SPI storage device has slightly different command requirements (EEPROM ready check, different address byte numbers), and it made good sense to unify the interface into a single function with compile time options.

Secondly, I needed to revise the pointer calculations inherent in the EEFS code. The NASA GSFC code is based on the availability of 32 bit pointers, and does 32 bit calculations to locate information within the file system. But, on the AVR ATmega platform the inherent pointer size is 16 bits, and many of the advanced pointer arithmetic calculations used in the code would fail.

When I finished the major work, I reduced the return values of most functions to 1 byte error codes, which shaved almost 2,000 bytes of program code off the end result. On the AVR ATmega platform, it is well worth saving 2,000 bytes.

I have built a simple FRAM test program that can write files from a SD Card to the EEFS SPI device, and then edit (read, modify, write) files on the EEFS SPI device for test purposes. This shows how the resulting EEFS library can be best used.

As usual code on Sourceforge AVRfreeRTOS, and also forked on AVR EEFS Github.

Melding freeRTOS with ChaN’s FatF & HD44780 LCD on Freetronics EtherMega

This post is a bit of a mixed bag, describing some software and hardware integration, together with some raving about a great tool I’ve been using. So, let’s get started.

Platform

Some time ago I got a Freetronics EtherMega, which is essentially an Arduino Mega2560 with an integrated Wiznet W5100 Ethernet interface, and a MicroSD card cage. I’ve introduced this product and the use of freeRTOS here.

EtherMega (Arduino Mega 2560) and freeRTOS

One great thing about the ATmega2560 used in the EtherMega, and the Arduino Mega2560, is the availability of an external memory bus. I’ve been using a Rugged Circuits QuadRAM, and now have ordered three more of their MegaRAM devices, and intend to make the ATmega2560 my standard platform. Why three? Well everyone knows that good things come in threes.

QuadRAM (512kByte) on Freetronics EtherMega (Arduino) ATmega2560 with freeRTOS

I’m actually preferring the Rugged Circuits MegaRAM which has only 128kBytes of RAM, so it won’t be as flexible for bank switching as its big brother. Also its chip select line is reversed (note to self to fix this in the driver). But, simply having 64kBytes of normal extended RAM plus another 56kBytes of special purpose (bank switched) RAM seems like it wil be sufficient for the duration. I’ve bought a couple to go on some Android ADK devices, that I’ll write about soon.

Recently, I also acquired a Freetronics 16×2 LCD-keypad-shield to use as a drop-on display for debugging and status, and anything really. It works really nicely and with its single pin switch analogue interface (which will be useful for navigation). Unfortunately there is a conflict between the SD Card device select pin on the EtherMega (Arduino pin D4) and one of the data pins on the 16×2 LCD.

My rectification of this pin usage conflict can be seen on the pictures below, where the yellow wire joins Pin D4 to Pin D2. What can’t be seen is that the leg of Pin D4 has been cut off, so it doesn’t insert into the EtherMega, so there is no elecrical connection between the D4 pin on the 16×2 LCD, and the D4 Pin used on the EtherMega.

P1010882P1010884P1010883P1010886

Tools

Recently I purchased a Saleae Logic to use in developing. I have got to say that this is probably the best $149 that I have spent on any tool, ever. Having the ability to capture long periods (minutes) of data, with 24MHz resolution, and zoom, shrink, drag, flick around in it, and also compare many windows of alternative samples is just so great. It saves so much time being able to simply “see” what the device is actually doing on the SPI, I2C and serial ports, simultaneously, is well great. But I already said that.

P1010781

Software – FatF File System

As usual, the code is on AVRfreeRTOS on Sourceforge.

My main work was to put the existing ChaN’s FatFs Generic FAT File System Module v0.9 into my existing ATmega2560 freeRTOS system, using my existing libraries, and generally being fully integrated into the system, as a plaform for some further work.

This was fairly time consuming (until I got my Saleae Logic), because the SPI bus transfers required to initiate and drive a SD card are complex, and depend on which version of SD card (MMC, SD, HDSD) is being used.

Now that everything is working, I’ve also done some SPI optimisation, to speed up multi-byte SPI bus transfers used for reading and writing to the SD card.

In testing with a Freetronics EtherMega driving an 4GByte HDSD card the system achieved the following results.

  • Byte transfer cycle time: MOSI 3.750us, MISO 3.6250us
  • Multibyte transfer cycle time: MOSI 1.3333us, MISO 1.3750us
  • Gross Performance increase: MOSI 2.8x, MISO 2.64x

Measured performance for a multi-MegaByte file copy is about 140kBytes/s which includes both read and write operations to the same SD card.

Software – HD44780 LCD

As usual, the code is on AVRfreeRTOS on Sourceforge.

Also, as I had purchased a 16×2 LCD display and I wanted to implement a flexible solution for display, I also ported the Control Module for HD44780 Character LCD into my system.

This was pretty straightforward, once I’d recognised the pin confict issue between the two Freetronics devices, and perhaps the most interesting things to say are:

  • Using a macro to control the pin assignment means that it is very easy to change the pins used for any display type. Simply renumber them in the macro and it is done.
  • Also, using the standard avr-libc stdio utility vsprintf formatting allows me to choose how much library I want to bring in. The standard library doesn’t support float formatting, but with a simple link switch either a simpler (smaller) or more fully featured (larger) standard library can be included. I also use the standard avr-libc tools for the serial port, so there is no additional overhead specifically for the LCD.

Wiznet W5100

Now I’ve finished the W5100 drivers from Wiznet, incorporating their new v1.6 changes (because they screwed up the silicon ARP state machine). And also, a fix for a subtle bug caused
by writing to the W5100 Tx buffer before it was finished with a previous transmission. This was fixed by checking the Tx read pointer and comparing it with the Tx write pointer. When the chip is idle, they are the same. That took me 3 weeks to isolate.

Now the fun part starts, which will be to re-learn the IP protocol suites, through re-implementing some of the standard network tools, like HTTP, FTP, NTP, DHCP, DNS, that we just take for granted. DHCP is done. Ping is done. HTTP is done.

Here is a web server running on this platform.

EtherMega server

QuadRAM (512kByte) on Freetronics EtherMega (Arduino) ATmega2560 with freeRTOS

I’ve been spending some time integrating the Rugged Circuits QuadRAM extension for the Arduino Mega (or Freetronics EtherMega) into the freeRTOS environment. And, it is now my standard environment. Actually, the MegaRAM, a slightly cheaper 128kByte version is my standard, as I’ve not found an application yet that needs more than 64kBytes total RAM. But, that will happen.

QuadRAM is available from Rugged Circuits again, after a long intermission.

Without adding any complexity into the environment, I can address up to 56kBytes of heap variables, effectively leaving the entire 8kbyte of internal RAM for the stack. With no complexity or overhead.

In addition, with some simple commands within a task can implement bank switching to access up to 7 additional banks of RAM, each up to 56kBytes.

A further degree of integration into freeRTOS is to completely automate memory bank switching, to give each of either 7 or 15 tasks a bank of RAM for its exclusive use. But this is a goal for the next few months.

Here are some pictures.

P1010781P1010782P1010783

And here are the links to the products described.

Rugged Circuits QuadRAM

Freetronics EtherMega

I’m very happy with my Saleae Logic too. Must write a review on this, one day. Great tool, more useful every day.

The described code is all available on Sourceforge, if you’re intending to try this at home.

The only finished example using the memory routines is the MegaSD project. Take a look at it on Sourceforge to see how to use the extra RAM.

HOW TO

What do we have to do to get this build working? Well it is pretty simple really, once everything is figured out. It is really only three steps.

  1. Initialise the RAM, and tell the AVR that it should enable its extended RAM bus.
  2. Tell the compiler that you’re moving the heap into the new extended RAM address space.
  3. Tell freeRTOS to move its heap to the new extended RAM address space.

Initialise the RAM

The RAM should be initialised prior to building the standard C environment that comes along for the ride. It can be done in the .init1 (using assembler) or in .init3 in C. I built both methods, but elected to just use the C method, as it is more maintainable (legible).

There are a number of references for this code. Some of the older ones refer incorrectly to a MCUCR register. That is not correct for the ATmega2560.

This example covers what Rugged Circuits suggest for their testing of the QuadRAM, but it doesn’t put the initialisation into .init3, which is needed to make the initialisation before heap is assigned. It makes the initialisation carefree.

// put this C code into .init3 (assembler could go into .init1)
void extRAMinit (void) __attribute__ ((used, naked, section (".init3")));
void extRAMinit (void)
{
// Bits PL7, PL6, PL5 select the bank
// We're assuming with this initialisation we want to have 8 banks of 56kByte (+ 8kByte inbuilt unbanked)
DDRL  |= 0xE0;
PORTL &= ~0xE0;  // Select bank 0
// PD7 is RAM chip enable, active low. Enable it now.
DDRD |= _BV(PD7);
PORTD &= ~_BV(PD7);
// Enable XMEM interface
XMCRA = _BV(SRE); // Set the SRE bit, to enable the interface
XMCRB = 0x00;

To ensure that this .init3 function, that I’ve put into lib_ext_ram, is included in your linked code, we need to call something from the lib_ext_ram library. If you’re planning to use the banks of RAM, then this is easy as you’ll naturally be calling the bank switching functions.

However, if you only want to use the extra 56kByte of RAM for simplicity (it is after all 7x more than you have available with just the internal RAM), then just call this function once from somewhere, possibly main(). I have added it to the freeRTOS stack initialisation function in port.c, so I don’t need to see it ever again.

extRAMcheck();

It returns the XMCRA value, that can be tested if you desire. But there’s no need as things will anyway have gone badly pear shaped if the RAM is not properly enabled. Calling this once is all that is needed to ensure that the .init3 code is properly inserted into the linked code.

Note: that the above code is specific to the QuadRAM device. The MegaRAM device has different IO in use, and the differences are noted in my code on Sourceforge.

Move the heap

The standard C heap has to be moved to the new location above the stack. There are other memory allocation options, but in my opinion this is the most sensible one and the only one I’m planning to implement.

The __heap_start and __heap_end symbols describe the addresses occupied by the extended RAM, and inform malloc() of the location of the heap. This is described in more detail here http://www.nongnu.org/avr-libc/user-manual/malloc.html. This is a great diagram showing the situation.

Malloc-x2
avr-gcc -Wl,-Map,MegaSDTest.map -Wl,--gc-sections -Wl,--section-start=.ext_ram_heap=0x802200  -Wl,--defsym=__heap_start=0x802200,--defsym=__heap_end=0x80ffff -mmcu=atmega2560 -o "MegaSDTest.elf" $(OBJS) $(USER_OBJS) $(LIBS)

Tell freeRTOS

Now freeRTOS has to be made aware of these changes to the heap location. There are three heap management options available for the AVR port. The two most memory economical options use a fixed array of memory defined in the .data section on initialisation. Clearly, this is not going to be useful. For the third option heap_3.c, which uses malloc(), we have nothing more to do.

However, getting heap_1.c and/or heap_2.c to work is not that complicated either. There are three parts to this. Firstly, creating a new section name, and locating it at the start of the desired heap space. We’ve already done that, above, with the –section-start command. The forth option heap_4.c has also been implemented.

Secondly, we have to make a small modification to both heap_1.c heap_2.c and heap_4.c to inform the compiler that the freeRTOS heap will be located at this .ext_ram_heap location. That is done in this manner (heap_2.c shown).

static union xRTOS_HEAP
{
//... edited ...
unsigned char ucHeap[ configTOTAL_HEAP_SIZE ];
//... edited ...
#if defined(portEXT_RAM)
} xHeap  __attribute__((section(".ext_ram_heap"))); // Added this section to get heap to go to the ext memory.
#else
} xHeap;
#endif

And finally, now we’ve (probably, just because we can) allocated a large (up to 32kByte maximum) freeRTOS heap, we need to ensure that the loader omits this section from its preparations for writing the .hex file to flash (in a similar manner to the way the .eeprom section is removed).

avr-objcopy --remove-section=.eeprom --remove-section=.ext_ram_heap -O ihex MegaSDTest.elf  "MegaSDTest.hex"

Something to watch for is that none of your other code is calling malloc(), because if it does its memory allocations will collide with the freeRTOS heap. Either check that malloc() is not being linked in or, for the paranoid, just assign the heap_1.c heap_2.c or heap_4.c heap to a region separate to your new malloc() heap addresses.

And that’s all there is to getting an easy 512kByte of fast no-wait-state RAM on your Freetronics EtherMega or Arduino Mega2560. Enjoy!

EtherMega (Arduino Mega 2560) and freeRTOS

This is an unboxing and porting review of the Freetronics EtherMega.

Ethermega-production-sample-2_large

It is an Arduino Mega 2560 compatible product, with the added goodness of a Wiznet W5100 Ethernet module and a MicroSD card cage on board.

http://www.freetronics.com/products/ethermega-arduino-mega-2560-compatible-with-onboard-ethernet

Obviously, my first job is to port the freeRTOS code to run on the EtherMega ATmega2560 microcontroller. There are some things about the ATmega2560 that are different from other ATmega devices, making it necessary to modify the standard freeRTOS code for it to work correctly. But the main difference is that because of the size of its flash memory, it has a 3 byte program counter (rather than 2 byte in every other ATmega device).

There are only a few references to making the ATmega2560 work. This reference is the most compelling guide.

http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&t=70387

Stu_San has proposed a port, and I have used it as an accurate guide to getting things running. There are some changes. freeRTOS has changed significantly with regard to how the port.c file is built. Also, I’ve already implemented options for Timer selection. So the resultant code on the Sourceforge AVRfreeRTOS Project is slightly different to what Stu_San proposed.

https://sourceforge.net/projects/avrfreertos/

The avr-linker script needs to have a small addition to support Stu_San’s suggestions. For the purposes of simplicity (only change a little as possible, to get things working), I’ve only included the .task tag into the avr6.x file, that will replace the normal avr-binutils file in /usr/lib/ldscripts

UPDATE: as of avrlibc 1.8.0 there is a .lowtext section in avr6.x included at the bottom of flash, which is exactly what we need for this outcome. Hence, in the port for freeRTOS800 I’ve converted from .task to use .lowtext, which means there is no need to change or modify the avr6.x file. It will just work automagically.

At this stage the EtherMega is nicely flashing its LED at me, which means that the scheduler and IO addressing using my digitalAnalogue tools are working correctly. But, I’m sure there are many things still to improve over the coming days and weeks.

Update – 25 January 2012

Corrected an error calculating the serial port baud rate. The (all) AVR data sheet has an error in it, that only exhibits at certain CPU frequencies. Unfortunately, I’ve normally only used Arduino and Freetronics kit over-clocked to 22.1184MHz, which provides perfect baud rates with no error. This CPU frequency does not exercise the faulty data sheet calculation, to generate errors.

WormFood’s AVR Baud Rate Calculator

The standard Arduino CPU frequency of 16MHz causes the serial port to run too slow by 3.7% at 115200 baud (well outside the recommended range of 2%), and more importantly the data sheet calculation chooses the wrong set up value for the UBRR value driving the baud clock, causing the serial port to become unworkable. The calculation method has been improved in the AVR <util/setbaud.h> file, and that calculation method is now used in my serial library. New code uploaded.

Update – 8 March 2012

Completed the ChaN FatF library port to freeRTOS for driving the microSD card. Supports both SD and HCSD. Code in usual place.

Update – 5 April 2012

Completed the Ethernet library port to freeRTOS for driving the Wiznet W5100. Supports TCP, UDP and Raw IP modes. Code in usual place.

The standard Arduino Mega stk500v2 bootloader used by default in the Freetronics EtherMega does not work when loading more than 64kB into flash. I’ve found a modified version by msproul that works, or at least does now that I’ve changed it a little more, to be found here.