Software MP3 decoder for ATmega/ATxmega


by Horst M. (horst)



8 bit AVR too slow to decode MP3 in software?
Well, that's not true anymore - at least not as a general statement.

After the first proof of concept (http://embdev.net/topic/370373) I've 
implemented the remaining features to fully *) decode single channel 
MPEG 2.5 Layer 3/LSF. Besides that, a few nasty bugs were fixed (but 
don't worry, there are still some left for everybody).
You can feed any single channel MP3 file with a sampling frequency of 8, 
11.025 or 12 kHz and up to 64 kbit/s into the decoder.
There are various options you can select in order to adapt the resource 
requirements of the decoder.
First of all: the decoder requires 8 kBytes of RAM, so there is no 
chance of getting it to run on smaller controllers.
If everything's turned on you'll need approx. 45 kBytes of Flash. You 
can strip it down to 32 kBytes (or even less), but you must accept 
significant limitations.
The source code supports both ATmega and ATxmega; I've tested it on 
ATmega1284P and ATxmega128a3u. The code is also prepared to run on 
ATmega 640/1280/1281/2560/2561, but that hasn't been tested on real 
hardware.
On ATmega and ATxmega an 8 bit PWM output with oversampling has been 
implemented.
Alternatively on ATxmega the internal 12 bit DAC can be used, 
significantly improving the audio quality.
While in the latter case the output sampling frequency can be adapted 
dynamically by the decoder, for PWM output you have to set up your 
hardware appropriately by choosing a suitable controller clock to 
support a single fixed sampling frequency.
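
To check which controller clock suits a given sampling frequency, a rough PC-side calculation in Python may help. This is only a sketch: it assumes (this is not taken from the project source) that the 8 bit PWM timer runs without prescaler and counts 0..255, so one PWM period lasts 256 clock cycles. The function name and the candidate clocks are made up for illustration.

```python
def pwm_oversampling(f_clk_hz, f_sample_hz, pwm_top=255):
    """Return (pwm_rate_hz, oversampling_factor) for a free-running PWM.

    Assumption: no prescaler, timer counts 0..pwm_top, so one PWM
    period takes pwm_top + 1 clock cycles.
    """
    pwm_rate = f_clk_hz / (pwm_top + 1)
    return pwm_rate, pwm_rate / f_sample_hz


# Hypothetical candidate clocks, checked against 8 kHz sampling frequency.
for f_clk in (16_000_000, 16_384_000, 18_432_000):
    rate, factor = pwm_oversampling(f_clk, 8000)
    kind = "integer" if factor.is_integer() else "not an integer"
    print(f"{f_clk} Hz clock: PWM rate {rate} Hz, factor {factor} ({kind})")
```

Under these assumptions a clock that gives an integer oversampling factor is the natural choice for a fixed sampling frequency.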
Because of the limited memory and processing capacity of the platform, 
the decoder is based on 16 bit fixed point arithmetic, hence it won't 
provide the same dynamic range as a 32 bit implementation.
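
For readers new to fixed point, here is a minimal Python model of a 16 bit multiply. It assumes the common Q15 format (1 sign bit, 15 fraction bits) with rounding and saturation; the actual number formats and rounding used in the decoder's assembly macros may differ, and both function names are invented for this sketch.

```python
def to_q15(x):
    """Convert a float in [-1.0, 1.0) to a saturated Q15 integer."""
    return max(-32768, min(32767, round(x * 32768)))


def q15_mul(a, b):
    """Multiply two Q15 values, round to nearest, saturate to 16 bit."""
    product = a * b                       # needs up to 31 bits
    result = (product + (1 << 14)) >> 15  # round, then rescale to Q15
    return max(-32768, min(32767, result))


half = to_q15(0.5)
print(q15_mul(half, half))  # 0.25 in Q15
```

Every product has to be cut back to 16 bits, which is where the reduced dynamic range compared to a 32 bit decoder comes from.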
Using higher sampling frequencies with high bitrates might drive the 
decoder quickly into overload (you'll notice stuttering audio), if the 
clock frequency isn't high enough; for instance a 16 MHz clock won't be 
sufficient for 12 kHz sampling frequency with reasonable bitrate.
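
The overload condition can be put into numbers with a small Python sketch. It relies on the fact that an MPEG 2/2.5 Layer 3 frame holds 576 samples; the function names are illustrative, and how many cycles the decoder really needs per frame depends on the bitrate and the enabled options, so that figure is not given here.

```python
SAMPLES_PER_FRAME = 576  # MPEG 2 / 2.5 Layer 3 (LSF)


def frame_time_ms(f_sample_hz):
    """Playing time of one MP3 frame in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / f_sample_hz


def cycles_per_frame(f_clk_hz, f_sample_hz):
    """Clock cycles available to decode one frame in real time."""
    return f_clk_hz * SAMPLES_PER_FRAME / f_sample_hz


# 16 MHz at 12 kHz: 48 ms per frame, 768000 cycles per frame
print(frame_time_ms(12000), cycles_per_frame(16_000_000, 12000))
# 16 MHz at 8 kHz: 72 ms per frame, 1152000 cycles per frame
print(frame_time_ms(8000), cycles_per_frame(16_000_000, 8000))
```

If the decoder needs more cycles for a frame than this budget allows, the output buffer runs empty and the audio stutters.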


I've used AVR Studio 4.19 for the project.
The assembly code provides three major options for handling the MP3 
input and the decoded output.
Before you can include your own MP3 file in the code, it has to be 
converted using some bin2inc utility.
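
If no bin2inc tool is at hand, a few lines of Python can do the conversion. This is a sketch under assumptions: it emits plain `.db` lines for the AVR assembler, and the label name `mp3data` and the 16 bytes per line are placeholders, since the include layout the project expects isn't described here.

```python
def bin2inc(data, label="mp3data", per_line=16):
    """Render bytes as AVR assembler .db lines.

    Flash is word organized, so the data is padded to an even length
    (per_line must be even as well to keep every line word aligned).
    """
    if len(data) % 2:
        data += b"\x00"
    lines = [f"{label}:"]
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("    .db " + ", ".join(f"0x{b:02x}" for b in chunk))
    return "\n".join(lines) + "\n"


print(bin2inc(b"\xff\xf3\x01"), end="")
```

For a real file, read it with `open(name, "rb").read()` and write the returned text to an include file.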

1. You can just use the AVR Studio simulator to play with the code, no 
need for real AVR hardware.

I've implemented and debugged the decoder this way.
For the ATxmega code you have to use simulator V2. Note that simulator 
V2 is much slower (really) than V1. To test any modification of the code 
I recommend using V1 unless something specific to ATxmega must be 
tested.
The simulator provides a function to write the output of a parallel port 
into a file.
To activate it for simulator V1 use menu "Debug"->"AVR Simulator 
Options"->"Stimuli and logging" and enter an output file name for PORTA.
For simulator V2 use menu "Debug"->"AVR Simulator 2 Options", check "Run 
Stimuli File" and select the file "Portalog.stim" included in the 
project folder. This will write the output into the file "output.log"; 
you should delete this file every time before you run the code again in 
the simulator.
When running the code, the decoded samples will be written into the 
output file in a special format.
To make use of it I've provided three small Gawk scripts for conversion.
Using "gawk -f convoutv1.awk simulator_output_file >samples.txt" (for 
simulator V1) or "gawk -f convoutv2.awk output.log >samples.txt" (for 
simulator V2) will convert the special format to a text file containing 
lines with 16 bit hexadecimal values. You may also use these scripts to 
convert other output data; that was quite helpful for array dumps during 
troubleshooting.
With "gawk -f mkbin.awk -v BINMODE=3 samples.txt >samples.raw" the 
samples will be written as binary PCM data.
I've used Audacity to import the raw data. It should recognize the main 
parameters itself (Signed 16 bit PCM, Big Endian, one channel), you just 
have to enter the sampling frequency.
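
For those without Gawk, the last conversion step can be approximated in Python. The sketch assumes that samples.txt holds one 16 bit hexadecimal value per line, as described above, and writes each value as big endian 16 bit PCM; it is not a port of mkbin.awk, whose exact behaviour isn't shown here, and the function name is invented.

```python
import struct


def hex_lines_to_pcm(text):
    """Turn whitespace-separated 16 bit hex values into big endian PCM."""
    out = bytearray()
    for token in text.split():
        value = int(token, 16) & 0xFFFF
        out += struct.pack(">H", value)  # high byte first
    return bytes(out)


raw = hex_lines_to_pcm("0001\nFFFE\n8000\n")
# Interpreted as signed 16 bit, as Audacity does on import:
print(struct.unpack(">3h", raw))  # (1, -2, -32768)
```

Writing the returned bytes to a file opened with mode "wb" gives raw data that can be imported as described.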
If you want to go deeper into the code, I strongly recommend getting the 
sources of libmad/madplay and the Helix decoder, getting them to run on 
your PC and using them as a reference.

2. Play back an MP3 file included directly in the code.

To use this option on ATmega, a high-impedance speaker or headphones 
have to be connected to pins OC1A (direct output) and OC1B (inverted 
output) via a simple RC low pass filter. Try 47 Ohms and 2.2 or 4.7 uF. 
To suppress DC, connect a 100 uF capacitor in series. Of course, this 
is just the simplest design. Higher order low pass filters will provide 
better quality. Playback will start after reset.
Keep in mind that an 8 bit PWM for audio playback only provides a small 
dynamic range, so the use of optimized/normalized full scale source 
material is recommended.
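
To get a feeling for the suggested filter values, the corner frequency of a first order RC low pass, f = 1/(2*pi*R*C), can be computed quickly. The sketch treats the filter as unloaded; a real speaker or headphone load will shift the result, so take the numbers as rough guidance only.

```python
import math


def rc_cutoff_hz(r_ohm, c_farad):
    """-3 dB corner frequency of an unloaded first order RC low pass."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


# 47 Ohms with 2.2 uF gives roughly 1.5 kHz, with 4.7 uF roughly 0.7 kHz
for c in (2.2e-6, 4.7e-6):
    print(f"{c * 1e6:.1f} uF: {rc_cutoff_hz(47, c):.0f} Hz")
```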
On ATxmega there are two identical outputs: OC0A/OC0B of Timer D0 for 
PWM, or both DAC outputs. You have to connect a low pass to both of 
them; the output signals are referenced to ground. Of course, there's no 
real need to use both channels as the source code doesn't support stereo 
files, so you can remove the second channel if you want.
Using double output buffers makes it possible on ATxmega to feed the DAC 
via DMA in double-buffered mode and to offload the processor, but I 
didn't implement this.

3. Play back an MP3 file read from serial SPI flash.

To use this option you need the output circuitry described with option 
2.
The PIND.3 (ATmega) or PINE.5 (ATxmega) input has to be tied to logic H 
or L to select between upload and playback.
Additionally a 64 MBit serial flash chip is required, I've used Winbond 
W25Q64BV (it's available in PDIP package). Connect it to the 
controller's SPI interface (SPIC on ATxmega128a3u), using its /SS output 
(PORTB.4 on ATmega1284P, PORTC.4 on ATxmega128a3u) as /CS on the flash 
chip. Note that the W25Q64BV supports a supply voltage of 3.6 V max.
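
As background on how such a flash chip is read, here is a small Python model of the byte sequence a controller sends over SPI. It uses the standard "Read Data" instruction of the W25Q family (opcode 03h followed by a 24 bit address, high byte first); whether the decoder uses exactly this instruction is an assumption, and the names are made up for the sketch.

```python
FLASH_SIZE = 64 * 1024 * 1024 // 8  # 64 MBit = 8388608 bytes
CMD_READ_DATA = 0x03                # W25Q "Read Data" opcode


def read_command(address):
    """Bytes to clock out (with /CS low) before reading from address."""
    if not 0 <= address < FLASH_SIZE:
        raise ValueError("address outside the 64 MBit device")
    return bytes([
        CMD_READ_DATA,
        (address >> 16) & 0xFF,
        (address >> 8) & 0xFF,
        address & 0xFF,
    ])


print(read_command(0).hex())  # 03000000
```

After these four bytes the chip keeps delivering consecutive data bytes for as long as /CS stays low and the clock runs, which suits streaming a file that starts at address 0.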
If you own a programmer for the serial flash you can use it to store an 
MP3 file, starting with address 0.
Alternatively a quick & dirty mini terminal has been added to the code 
to upload an MP3 file into the serial flash. For this option an RS232 
interface has to be connected to USART0 (ATmega) or USARTC0 (ATxmega).
In either case a two-byte value - low byte first - has to be inserted at 
the beginning of the MP3 file, holding the number of MP3 frames (use a 
tool like mp3guessenc to get the number of frames).
If you decide to use the built-in upload terminal, connect your PC to 
the RS232 interface using 115k2 8n2 (the speed can be changed) and no 
flow control (I recommend TeraTerm or ZOC). Keep PIND.3 at logic H 
(ATmega) or PINE.5 at logic L (ATxmega) and reset the controller. The 
built-in terminal should respond and erase any current flash memory 
content. When prompted, start the upload in BINARY mode (simple file 
upload, no transmission protocol like X- or Z-Modem). Once the upload 
has finished, put a logic L on PIND.3 (ATmega) or logic H on PINE.5 
(ATxmega), reset the controller and playback should start.
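
Prepending the two-byte frame count described above can be done with a short Python helper. This is a sketch: the frame count itself still has to come from a tool like mp3guessenc, and the function name is invented.

```python
import struct


def add_frame_count(mp3_bytes, frame_count):
    """Prefix the MP3 data with the frame count, low byte first."""
    if not 0 <= frame_count <= 0xFFFF:
        raise ValueError("frame count must fit into two bytes")
    return struct.pack("<H", frame_count) + mp3_bytes


image = add_frame_count(b"\xff\xf3", 0x1234)
print(image.hex())  # 3412fff3
```

The result is the image to store at address 0, either with a programmer or via the upload terminal. Note that a two-byte count limits a file to 65535 frames.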

There are several additional flags to control whether a particular 
function is optimized for speed or for lower flash memory usage. They 
basically switch between inserting fully expanded macros (most of them 
are multiply operations) and plain subroutine calls. Depending on the 
controller's clock frequency you can save approx. half of the code 
memory (the size of the tables won't be reduced), and there's still room 
to save a few more bytes because, for instance, currently unrolled loops 
remain unrolled.
To keep an eye on the processor utilization you can enable a check of 
how much time per MP3 frame is not consumed by the decoder. The decoder 
then reports the idle time in sample ticks via the serial interface.
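
The reported idle time can be turned into a load figure with a one-liner. The sketch assumes that one sample tick equals one output sample period, so that a frame lasts 576 ticks; that interpretation and the function name are assumptions, not something taken from the source code.

```python
def cpu_load(idle_ticks, ticks_per_frame=576):
    """Fraction of a frame period the decoder was busy (0.0 .. 1.0)."""
    if not 0 <= idle_ticks <= ticks_per_frame:
        raise ValueError("idle time outside one frame period")
    return 1.0 - idle_ticks / ticks_per_frame


print(cpu_load(144))  # 0.75, i.e. 75 % of the frame time was used
```

A value creeping towards 1.0 would mean there is hardly any headroom left before the audio starts to stutter.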

You can separately disable the support of short blocks as well as the 
bit reservoir.
In this case you'll need specially crafted MP3 files. Use the Lame 
encoder (older version, e.g. 3.902) with option --noshort (no short 
blocks) and/or --nores (no bit reservoir). Note that with short block 
support disabled only 8 or 12 kHz sampling frequency is allowed.
I've included a short audio sample encoded with different settings and 
sampling frequencies so that you can instantly play around with the 
available options.



I definitely won't extend the code to support "regular" mp3 files (MPEG 
1/MPEG 2 Layer 3) with higher sampling frequencies and bitrates, as 
this would require even more RAM and a higher controller clock frequency.
But to check what's possible with the code I've implemented a special 
feature.
You can select to use double or even quadruple output speed.
Of course, the controller's clock frequency has to be increased 
appropriately. 32 MHz on ATxmega should be sufficient for double output 
speed, for quadruple speed you'll need 64 MHz.
BTW: I've overclocked different USB capable ATxmegas up to 64 MHz@3.3V; 
they seemed to run stable at room temperature (OK, I didn't let them run 
for days).
You can even try it with ATmega1284P, but use an external crystal 
oscillator; a plain crystal will most likely not work. I was able to run 
ATmega644PA/1284P at 30 MHz@5V, also stable at room temp.
But now for the procedure:
Load an audio file - say, a ripped CD track - into Audacity. The 
original sampling frequency will be 44100 Hz.
First, make a single channel track using "Tracks"->"Stereo Track to 
Mono".
For quadruple output speed, just modify the sampling frequency using the 
track's properties menu "Set Rate" and set it to 11025 Hz. Also set the 
Project Rate to 11025 Hz. Now export as MP3, using "Average" or 
"Constant" with 32 kbit/s.
For double speed first resample to 22050 Hz using "Tracks"->"Resample", 
followed by the actions described above.
For quadruple speed on ATxmega with 12 bit DACs the sound quality is 
fairly good. For me it was quite the same as the original file on the PC 
(individual evaluation may vary :-).
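
The relation behind the procedure above is simple arithmetic: the file is labelled as 11025 Hz, and the real output rate is that value times the speed factor. A small Python sketch (function name invented) gives the rate to resample to before setting the track rate to 11025 Hz:

```python
def resample_target_hz(speed_factor, header_rate_hz=11025):
    """Real sampling rate the audio must have before it is relabelled."""
    if speed_factor not in (1, 2, 4):
        raise ValueError("only single, double and quadruple speed exist")
    return header_rate_hz * speed_factor


for factor in (1, 2, 4):
    print(factor, resample_target_hz(factor))
```

For quadruple speed the result equals the 44100 Hz of a CD track, which is why no resampling step is needed in that case.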





*) With one exception - mixed blocks are not supported.
This kind of block seems to be used very rarely - if at all. I've 
examined almost 50,000 mp3 files from various sources and didn't find a 
single one containing mixed blocks. At least the most common encoders 
apparently don't make use of them.
If your favourite encoder, of all things, actually uses mixed blocks, 
I call it bad luck. Get another encoder.
