Streaming Audio
Primer (Part 4):
Encoding Audio Files into MP3 and VQF Formats
Overview
The next major step is to compress that whopping 250 Mbyte WAV
file into a 6.5 Mbyte file, or maybe an even smaller 2 Mbyte file.
How do you do such wonderful things? Simple: use either an MP3 or
a VQF encoder. You may be asking, "That sounds nice, but which one
should I use, MP3 or VQF, and what's the difference?" That's
a good question. Here are a few thoughts to help you decide.
MP3 versus VQF
First you must decide which audio compression/encoding format
you will use. Both rest on the same basic idea: analyze the frequency
content of the audio and compress it by encoding only the audible
content. Remarkably, this can reduce file size by about 10:1 with
virtually no loss in quality. And if you are willing to sacrifice
quality, you can achieve compression of almost 100:1 at the extreme.
So what's the difference?
- MP3: The MP3 format, which stands for MPEG Audio
Layer 3, has been around for a while. It was introduced with the
first generation of the standard MPEG format (MPEG-1), which is used
to compress both audio and video, and was extended to lower sampling
rates in the second generation (MPEG-2). Most video capture boards
save files in MPEG format. MP3 is by far the more popular of the two
formats. Many expect it to take over the music business and
revolutionize how we listen to music. Already many companies
sell handheld or wristwatch players that play MP3 files from
flash memory instead of CDs. In general (except at really low sampling
rates), MP3 has slightly better quality than VQF. Also, there
is an abundant supply of players, encoders, and other related
software for the MP3 format because of its immense popularity.
Check out the web
site of the guys who invented the MP3 format and are now busy
on MP4.
- VQF: The newcomer to the audio compression scene,
VQF has yet to reach the popularity of MP3. However, it has
advantages that may cause it to eventually topple MP3. It has
slightly better compression than MP3 (maybe 5% to 10%). Also,
the free encoder has an 11,000 Hz (11 kHz) / 8,000 bps (8 kbits
per second) setting that produces a phenomenally small file of
only 2 Mbytes. Although the quality is significantly lower than
at higher bit rates (such as 16 or 20 kbps at 11 kHz or 12 kHz),
it may be an option for those who have limited web space
or who worry that larger file sizes will discourage people from
downloading the file. The MP3 encoders also support this low
rate, but surprisingly have much worse quality at this
particular setting.
So, which one? That's up to you. I prefer MP3 because of its
widespread acceptance. The popularity of MP3 and the abundance
of related software also make it much more attractive to the majority
of people. However, some prefer the slightly smaller file sizes
that VQF has to offer. The bottom line is that both will work,
and you will win no matter which choice you make.
Encoding the WAV file
First you must download an MP3 or VQF compressor. Start with the
Fraunhofer Institute's web page on its official, licensed
MP3 encoders. I use AudioActive's Production Studio Lite, which
retails for about $35. A free 30-day trial download has been offered
in the past, but it is currently discontinued. If you want a free
VQF encoder, you can download a free Linux version, or a free
90-day trial version from Yamaha.
Once you have downloaded an encoder program, you can use it
to encode the audio file into either the MP3 or the VQF format,
depending on which encoder you selected. Just to get you started,
try setting the compression preferences to the 16,000 Hz sampling
rate, 20 kbps, the Mono setting, and the MP3 output format (using
AudioActive's trial version). Compress away! This will take a
while too. Depending on the quality of your compressor, the quality
and size of your final MP3 file will vary. These settings
(20 kbps, 16,000 Hz for MP3 encoding) produce a comparatively
small file (6.5 Mbytes for 45 minutes) and strike a good balance
between file size and audio quality. This is what I use for all of my
recordings. For the VQF encoder, try the 11 kHz / 8 kbps setting to
produce the small 2 Mbyte files. For recordings of speech, stereo
is not required, since most speech is recorded in mono anyway.
Stereo would also need a higher bit rate, and so a larger file
(up to double), to reach the same quality.
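The file sizes quoted above follow from simple arithmetic: bit rate
times duration, divided by 8 bits per byte. The sketch below works
through it for the 45-minute example. The source WAV format (48,000 Hz,
16-bit, mono) and the reading of "Mbyte" as 2**20 bytes are assumptions
made for illustration, and file header overhead is ignored.

```python
# Rough size estimates; container and header overhead is ignored.
MBYTE = 1024 * 1024  # assumption: "Mbyte" means 2**20 bytes


def wav_size_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size of uncompressed PCM audio."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds


def encoded_size_bytes(bit_rate_bps, seconds):
    """Size of a constant-bit-rate stream: bits per second * seconds / 8."""
    return bit_rate_bps * seconds / 8


seconds = 45 * 60  # the 45-minute recording used in the text

# Assumed source format (not stated in the text): 48 kHz, 16-bit, mono.
wav = wav_size_bytes(48_000, 16, 1, seconds)
mp3 = encoded_size_bytes(20_000, seconds)  # 20 kbps MP3 setting
vqf = encoded_size_bytes(8_000, seconds)   # 8 kbps VQF setting

for name, size in [("WAV", wav), ("MP3 20 kbps", mp3), ("VQF 8 kbps", vqf)]:
    print(f"{name}: {size / MBYTE:.1f} Mbytes, ratio {wav / size:.0f}:1")
```

Under these assumptions the 20 kbps file comes out close to the
6.5 Mbytes quoted above, the 8 kbps file lands somewhat above 2 Mbytes,
and the 8 kbps ratio approaches the 100:1 extreme mentioned earlier.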
If you are going to provide streaming audio on your site, then your
primary concern in selecting an encoding rate is the bits-per-second
(bps or "bit rate") parameter. You must make sure that
the bit rate you select is below the modem connection speed you
want to support. For example, any modem that connects at 28.8
kbps should easily be able to handle a streamed audio file at
the 20 kbps rate. The 24 kbps rate should also be possible, but
any momentary hiccup in the network, which is not uncommon, may
cause a slight pause in the playback. Because of this, I prefer
the 20 kbps set of encoding rates. Typically the higher sampling
frequency (22,050 Hz, 16,000 Hz, etc.) within a given bit rate
is better. Also, it helps to plan ahead and choose the original
recording rate (48,000 Hz, 44,100 Hz, etc.) so that it is an
integer multiple of the encoding rate (48,000 / 16,000
= 3, an integer with no remainder). This makes for a better encoding.
Encoding rates that do not divide the original rate evenly must be
"resampled". That is why "Resample" is listed by some frequency
choices, but not all, in the "Encoding Properties" window in
AudioActive's Production
Studio. Planning ahead so that resampling is not required produces
superior results.
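Both rules of thumb in this paragraph can be checked with a few lines
of code. This is only a sketch: the function names are hypothetical,
and the 25% headroom figure is an illustrative assumption chosen to
match the preference for 20 kbps over 24 kbps on a 28.8 kbps modem,
not a measured value.

```python
def fits_connection(bit_rate_bps, modem_bps, headroom=0.25):
    """Return True if the stream leaves `headroom` of the link unused.

    The 25% default is an illustrative assumption, not a measured figure.
    """
    return bit_rate_bps <= modem_bps * (1 - headroom)


def needs_resample(recording_hz, encoding_hz):
    """Return True if the recording rate is not an integer multiple
    of the encoding rate (the case the text calls "resampled")."""
    return recording_hz % encoding_hz != 0


for rate in (20_000, 24_000):
    print(rate, "bps over a 28.8k modem:", fits_connection(rate, 28_800))

for rec, enc in [(48_000, 16_000), (44_100, 16_000), (44_100, 22_050)]:
    print(rec, "->", enc, "resample needed:", needs_resample(rec, enc))
```

With that headroom, 20 kbps passes and 24 kbps does not, even though
24 kbps is nominally below the modem speed. Likewise, a 48,000 Hz
recording divides evenly into 16,000 Hz, while a 44,100 Hz recording
does not, so it pairs better with the 22,050 Hz encoding rate.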
Once the encoder finishes crunching, listen to the result to see if
you are satisfied with the output. If not, try different compression
levels to get the results you like. Just don't forget to balance
file size and bit rate against quality. Once you have completed
the encoding step, you are ready to update your web site
with your new audio files!
Reprinted with permission
from author Trevor Bowen,
whose Web
site contains good information on utilizing the digital medium.