Compressing a WAV file to MP3
● 1. Introduction
First, I don't intend to explain how the MP3 algorithm works. My goal is to explain how to use an existing encoder with BCB (Borland C++Builder).
● 2. Choosing the mp3 encoder
There are tons of MP3 encoders. Some of them are free, others are not. Some are fast but produce an awful result. Others are slow but give excellent audio quality. The ideal would be a free, reasonably fast encoder that gives high audio quality, all at the same time.

Good news: this pearl exists, but we have to look for it in the GNU world. There is a GNU project called LAME, for "LAME Ain't an MP3 Encoder", under the GPL license. The official web site of the LAME project is http://www.mp3dev.org/mp3/ Moreover, as it is a GNU project, we have access to the source, and there is a version compiled for Win32 as a DLL.

Among all the other encoders, I want to mention two. The first is FRAUNHOFER, because it is a fast and excellent encoder: http://www.iis.fhg.de/ It is not free, though. The second is the encoder from Xing Tech: http://www.xingtech.com/ It is very fast, but the audio result is awful, so don't use it unless speed is all you care about.

Note: the LAME encoder has a limitation. The sample rate must be 32000, 44100 or 48000 Hz.
● 3. Some information about the WAV format
A wav file is just a collection of chunks. There is a format chunk which contains all the information about the samples: for instance, the sample rate, the number of channels (mono or stereo) and the number of bits per sample. There is also a chunk containing the data. In other words, this chunk contains all the samples. At the front of the file, there are 12 characters indicating that the file is a wav file: "RIFF", a 4-byte size, then "WAVE". The two chunks described above must be present in the file. There may be other chunks, but we just ignore them. They are not needed for our purpose. If you want to know more about wav files, take a look at http://www.wotsit.org/ for a complete description.

The format chunk:
struct FormatChunk
{
    char chunkID[4];
    long chunkSize;
    short wFormatTag;
    unsigned short wChannels;
    unsigned long dwSamplesPerSec;
    unsigned long dwAvgBytesPerSec;
    unsigned short wBlockAlign;
    unsigned short wBitsPerSample;
    // Note: there may be additional fields here, depending upon wFormatTag.
};
Above, you can see the struct representing the format chunk. The chunkID is always "fmt " with a trailing space (4 characters). It identifies the chunk. Every other chunk has such an ID. The chunkSize field contains the number of bytes in the chunk, not counting chunkID and chunkSize themselves. The format chunk must come before the data chunk; in practice it is the first chunk in the file, and the code in this article relies on that.
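The fields of the format chunk are not independent, which gives a cheap sanity check before encoding. The sketch below is only an illustration: PcmFormat, isConsistentPcm and makeCdQualityFormat are hypothetical names that do not appear in the article's code, the struct uses fixed-width types instead of long/short, and it assumes uncompressed PCM (wFormatTag equal to 1) with sample sizes that are a whole number of bytes.

```cpp
#include <cstdint>
#include <cstring>

// Fixed-width mirror of the article's FormatChunk (hypothetical name).
struct PcmFormat
{
    char     chunkID[4];        // "fmt " with a trailing space
    uint32_t chunkSize;         // bytes that follow chunkID and chunkSize
    uint16_t wFormatTag;        // 1 = uncompressed PCM
    uint16_t wChannels;
    uint32_t dwSamplesPerSec;
    uint32_t dwAvgBytesPerSec;
    uint16_t wBlockAlign;
    uint16_t wBitsPerSample;
};

// Returns true when the chunk looks like a consistent PCM format chunk.
bool isConsistentPcm(const PcmFormat& f)
{
    if (std::memcmp(f.chunkID, "fmt ", 4) != 0) return false;
    if (f.chunkSize < 16) return false;   // the six fields above take 16 bytes
    if (f.wFormatTag != 1) return false;  // assumption: PCM only
    if (f.wChannels == 0 || f.wBitsPerSample == 0) return false;
    if (f.wBitsPerSample % 8 != 0) return false;  // assumption: whole bytes

    // One block holds one sample for every channel.
    uint32_t blockAlign = f.wChannels * (f.wBitsPerSample / 8u);
    return f.wBlockAlign == blockAlign
        && f.dwAvgBytesPerSec == f.dwSamplesPerSec * blockAlign;
}

// Example values: 44100 Hz, stereo, 16 bits per sample.
PcmFormat makeCdQualityFormat()
{
    PcmFormat f;
    std::memcpy(f.chunkID, "fmt ", 4);
    f.chunkSize        = 16;
    f.wFormatTag       = 1;
    f.wChannels        = 2;
    f.dwSamplesPerSec  = 44100;
    f.wBitsPerSample   = 16;
    f.wBlockAlign      = 4;       // 2 channels * 2 bytes
    f.dwAvgBytesPerSec = 176400;  // 44100 * 4
    return f;
}
```

A file that fails this check is probably damaged or not plain PCM, and is better rejected before the encoder is initialized.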
The data chunk:

struct Chunk
{
    char chunkID[4];
    long chunkSize;
};

In the case of the data chunk, chunkID contains "data". The chunkSize field contains the size of the raw data (the samples) in bytes. The data begins just after chunkSize.
So, when we read a wav file, all we have to do is:
- read the first 12 characters to check that it is a real wav file.
- read the format chunk into a struct similar to the FormatChunk struct.
- skip the extra parameters in the format chunk, if any.
- skip all other chunks until we find the data chunk.
- read the raw data from the data chunk and carry on with the encoding.
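The steps above can be sketched independently of BCB on a file that has already been loaded into memory. This is a minimal sketch, not the article's implementation: findDataChunk, DataLocation, buildTinyWav and the small helpers are hypothetical names, and the "note" chunk is a made-up extra chunk used only to exercise the skipping logic. It relies on two properties of the RIFF container: sizes are stored little-endian, and a chunk with an odd size is followed by one padding byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Read a little-endian 32-bit value stored at offset pos.
static uint32_t readLE32(const Bytes& b, size_t pos)
{
    return uint32_t(b[pos]) | (uint32_t(b[pos + 1]) << 8)
         | (uint32_t(b[pos + 2]) << 16) | (uint32_t(b[pos + 3]) << 24);
}

struct DataLocation
{
    bool     found;
    size_t   offset;  // position of the first sample byte
    uint32_t size;    // size of the raw samples in bytes
};

// Check the 12-byte start ID, then walk the chunks until "data" is found.
DataLocation findDataChunk(const Bytes& wav)
{
    DataLocation none = { false, 0, 0 };
    if (wav.size() < 12) return none;
    if (std::memcmp(&wav[0], "RIFF", 4) != 0) return none;
    if (std::memcmp(&wav[8], "WAVE", 4) != 0) return none;

    size_t pos = 12;
    while (pos + 8 <= wav.size())
    {
        uint32_t size = readLE32(wav, pos + 4);
        if (std::memcmp(&wav[pos], "data", 4) == 0)
        {
            DataLocation hit = { true, pos + 8, size };
            return hit;
        }
        // Skip ID, size, payload and the padding byte of odd-sized chunks.
        pos += 8 + size_t(size) + (size & 1);
    }
    return none;
}

static void appendLE32(Bytes& b, uint32_t v)
{
    for (int i = 0; i < 4; ++i) b.push_back((v >> (8 * i)) & 0xFF);
}

static void appendChunk(Bytes& b, const char* id, const Bytes& payload)
{
    b.insert(b.end(), id, id + 4);
    appendLE32(b, uint32_t(payload.size()));
    b.insert(b.end(), payload.begin(), payload.end());
    if (payload.size() & 1) b.push_back(0);  // pad to an even length
}

// Build a tiny in-memory file: start ID, fmt, one odd-sized extra chunk, data.
Bytes buildTinyWav()
{
    Bytes wav;
    const char riff[] = "RIFF", wave[] = "WAVE";
    wav.insert(wav.end(), riff, riff + 4);
    appendLE32(wav, 0);                        // RIFF size, patched below
    wav.insert(wav.end(), wave, wave + 4);
    appendChunk(wav, "fmt ", Bytes(16, 0));
    appendChunk(wav, "note", Bytes(3, 'x'));   // hypothetical extra chunk
    appendChunk(wav, "data", Bytes(4, 0x7F));
    uint32_t riffSize = uint32_t(wav.size() - 8);
    for (int i = 0; i < 4; ++i) wav[4 + i] = (riffSize >> (8 * i)) & 0xFF;
    return wav;
}
```

Because the walker uses the stored chunkSize to step over every chunk, including "fmt ", any extra format parameters are skipped without special handling. A real reader would still copy the format fields out of the "fmt " chunk before moving on.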
● 4. Importing the DLL
The DLL used for the encoding is called lame_enc.dll. Unfortunately, this DLL was built with Microsoft Visual C++ 6. If we just create a lib file from the DLL and try to import that library into BCB, we get an 'Unresolved external' error at link time for each function we try to use. Because of the way the functions are declared, BCB expects function names with a leading underscore, and the names exported by the DLL have no such underscore. To resolve this issue, we must first create a def file from our DLL. Open a console window and type:
IMPDEF lame_enc.def lame_enc.dll
Open the lame_enc.def file with an editor (Notepad for instance) and modify it like this. This will create aliases for the functions:

LIBRARY     LAME_ENC.DLL

EXPORTS
    _beCloseStream     = beCloseStream
    _beDeinitStream    = beDeinitStream
    _beEncodeChunk     = beEncodeChunk
    _beInitStream      = beInitStream
    _beVersion         = beVersion
    _beWriteVBRHeader  = beWriteVBRHeader
    beCloseStream      @4
    beDeinitStream     @3
    beEncodeChunk      @2
    beInitStream       @1
    beVersion          @5
    beWriteVBRHeader   @6
Now we can create the lib file from our def file. We'll import that lib file into our project. To create the lib file, type:

implib lame_enc.lib lame_enc.def
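Editing the def file by hand is fine for six functions. For a DLL with many exports the same text could be generated. The function below is a hypothetical helper, not part of the BCB tools: it takes the export names and ordinals that IMPDEF listed and produces a def file in the layout shown above, with one underscore alias per export followed by the original entries.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Each export is a (name, ordinal) pair as listed by IMPDEF.
typedef std::vector<std::pair<std::string, int> > ExportList;

// Build the text of a def file that aliases every export with a
// leading underscore and keeps the original entries with ordinals.
std::string makeAliasDef(const std::string& dllName, const ExportList& exports)
{
    std::ostringstream out;
    out << "LIBRARY " << dllName << "\nEXPORTS\n";

    // Aliases first: _name = name
    for (std::size_t i = 0; i < exports.size(); ++i)
        out << "    _" << exports[i].first << " = " << exports[i].first << "\n";

    // Then the original names with their ordinals: name @n
    for (std::size_t i = 0; i < exports.size(); ++i)
        out << "    " << exports[i].first << " @" << exports[i].second << "\n";

    return out.str();
}
```

The returned string would be saved as lame_enc.def and passed to implib exactly as in the manual procedure.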
● 5. The code
First, you have to import the library into your project. Next, include the header file of the DLL in your unit. In the DLL header file, you have to add extern "C" in front of every exported function. Here is the header with the modifications (lame_enc.h):

/* bladedll.h

   +++++++++++++++++++++++++++
   +   Blade's Encoder DLL   +
   +++++++++++++++++++++++++++

   ------------------------------------------------------
   - Version 1.00 (7 November 1998) - Jukka Poikolainen -
   ------------------------------------------------------
*/

#ifndef ___BLADEDLL_H_INCLUDED___
#define ___BLADEDLL_H_INCLUDED___

#pragma pack(push)
#pragma pack(1)
/* encoding formats */
#define BE_CONFIG_MP3   0
#define BE_CONFIG_LAME  256

/* type definitions */

typedef unsigned long   HBE_STREAM;
typedef HBE_STREAM      *PHBE_STREAM;
typedef unsigned long   BE_ERR;

/* error codes */

#define BE_ERR_SUCCESSFUL                   0x00000000
#define BE_ERR_INVALID_FORMAT               0x00000001
#define BE_ERR_INVALID_FORMAT_PARAMETERS    0x00000002
#define BE_ERR_NO_MORE_HANDLES              0x00000003
#define BE_ERR_INVALID_HANDLE               0x00000004
#define BE_ERR_BUFFER_TOO_SMALL             0x00000005

/* other constants */

#define BE_MAX_HOMEPAGE 256

/* format specific variables */

#define BE_MP3_MODE_STEREO      0
#define BE_MP3_MODE_JSTEREO     1
#define BE_MP3_MODE_DUALCHANNEL 2
#define BE_MP3_MODE_MONO        3

#define MPEG1 1
#define MPEG2 0

#ifdef _BLADEDLL
#undef FLOAT
#include <Windows.h>
#endif

enum MPEG_QUALITY
{
    NORMAL_QUALITY = 0,
    LOW_QUALITY,
    HIGH_QUALITY,
    VOICE_QUALITY
};
typedef struct {
    DWORD dwConfig;                 // BE_CONFIG_XXXXX
                                    // Currently only BE_CONFIG_MP3 is supported
    union {

        struct {
            DWORD dwSampleRate;     // 48000, 44100 and 32000 allowed
            BYTE  byMode;           // BE_MP3_MODE_STEREO, BE_MP3_MODE_DUALCHANNEL
                                    // BE_MP3_MODE_MONO
            WORD  wBitrate;         // 32, 40, 48, 56, 64, 80, 96, 112, 128,
                                    // 160, 192, 224, 256 and 320 allowed
            BOOL  bPrivate;
            BOOL  bCRC;
            BOOL  bCopyright;
            BOOL  bOriginal;
        } mp3;                      // BE_CONFIG_MP3

        struct {
            // STRUCTURE INFORMATION
            DWORD dwStructVersion;
            DWORD dwStructSize;

            // BASIC ENCODER SETTINGS
            DWORD dwSampleRate;     // ALLOWED SAMPLERATE VALUES DEPENDS
                                    // ON dwMPEGVersion
            DWORD dwReSampleRate;   // DOWNSAMPLERATE, 0=ENCODER DECIDES
            INT   nMode;            // BE_MP3_MODE_STEREO, BE_MP3_MODE_DUALCHANNEL
                                    // BE_MP3_MODE_MONO
            DWORD dwBitrate;        // CBR bitrate, VBR min bitrate
            DWORD dwMaxBitrate;     // CBR ignored, VBR Max bitrate
            MPEG_QUALITY nQuality;  // Quality setting (NORMAL,HIGH,LOW,VOICE)
            DWORD dwMpegVersion;    // MPEG-1 OR MPEG-2
            DWORD dwPsyModel;       // FUTURE USE, SET TO 0
            DWORD dwEmphasis;       // FUTURE USE, SET TO 0

            // BIT STREAM SETTINGS
            BOOL  bPrivate;         // Set Private Bit (TRUE/FALSE)
            BOOL  bCRC;             // Insert CRC (TRUE/FALSE)
            BOOL  bCopyright;       // Set Copyright Bit (TRUE/FALSE)
            BOOL  bOriginal;        // Set Original Bit (TRUE/FALSE)

            // VBR STUFF
            BOOL  bWriteVBRHeader;  // WRITE XING VBR HEADER (TRUE/FALSE)
            BOOL  bEnableVBR;       // USE VBR ENCODING (TRUE/FALSE)
            INT   nVBRQuality;      // VBR QUALITY 0..9

            BYTE  btReserved[255];  // FUTURE USE, SET TO 0
        } LHV1;                     // LAME header version 1

        struct {
            DWORD dwSampleRate;
            BYTE  byMode;
            WORD  wBitrate;
            BYTE  byEncodingMethod;
        } aac;

    } format;

} BE_CONFIG;

struct BE_VERSION {
    // BladeEnc DLL Version number
    BYTE byDLLMajorVersion;
    BYTE byDLLMinorVersion;

    // BladeEnc Engine Version Number
    BYTE byMajorVersion;
    BYTE byMinorVersion;

    // DLL Release date
    BYTE byDay;
    BYTE byMonth;
    WORD wYear;

    // BladeEnc Homepage URL
    CHAR zHomepage[BE_MAX_HOMEPAGE + 1];
};
#ifndef _BLADEDLL
typedef BE_ERR (*BEINITSTREAM)   (BE_CONFIG*, PDWORD, PDWORD, PHBE_STREAM);
typedef BE_ERR (*BEENCODECHUNK)  (HBE_STREAM, DWORD, PSHORT, PBYTE, PDWORD);
typedef BE_ERR (*BEDEINITSTREAM) (HBE_STREAM, PBYTE, PDWORD);
typedef BE_ERR (*BECLOSESTREAM)  (HBE_STREAM);
typedef VOID   (*BEVERSION)      (BE_VERSION*);

#define TEXT_BEINITSTREAM   "beInitStream"
#define TEXT_BEENCODECHUNK  "beEncodeChunk"
#define TEXT_BEDEINITSTREAM "beDeinitStream"
#define TEXT_BECLOSESTREAM  "beCloseStream"
#define TEXT_BEVERSION      "beVersion"

/*
BE_ERR beInitStream(BE_CONFIG *beConfig, PDWORD dwSamples, PDWORD dwBufferSize, PHBE_STREAM phbeStream);
BE_ERR beEncodeChunk(HBE_STREAM hbeStream, DWORD nSamples, PSHORT pSamples, PBYTE pOutput, PDWORD pdwOutput);
BE_ERR beDeinitStream(HBE_STREAM hbeStream, PBYTE pOutput, PDWORD pdwOutput);
BE_ERR beCloseStream(HBE_STREAM hbeStream);
VOID   beVersion(BE_VERSION *beVersion);
*/
#else
extern "C" __declspec(dllexport) BE_ERR beInitStream(BE_CONFIG *beConfig, PDWORD dwSamples, PDWORD dwBufferSize, PHBE_STREAM phbeStream);
extern "C" __declspec(dllexport) BE_ERR beEncodeChunk(HBE_STREAM hbeStream, DWORD nSamples, PSHORT pSamples, PBYTE pOutput, PDWORD pdwOutput);
extern "C" __declspec(dllexport) BE_ERR beDeinitStream(HBE_STREAM hbeStream, PBYTE pOutput, PDWORD pdwOutput);
extern "C" __declspec(dllexport) BE_ERR beCloseStream(HBE_STREAM hbeStream);
extern "C" __declspec(dllexport) VOID   beVersion(BE_VERSION *beVersion);

#endif

#pragma pack(pop)

#endif
As you can see in the header above, the function declarations are only compiled when _BLADEDLL is defined, so you have to add #define _BLADEDLL to your .cpp file before including the header.
Below, you'll find the code of a little application which takes a wav file as input and encodes it to mp3. I don't give more explanations because the code is straightforward and commented. It is not very elegant, but it is only meant to show how to use the DLL.
File Format.h
//---------------------------------------------------------------------------
#ifndef Format_H
#define Format_H
//---------------------------------------------------------------------------
struct FormatChunk
{
    char chunkID[4];
    long chunkSize;
    short wFormatTag;
    unsigned short wChannels;
    unsigned long dwSamplesPerSec;
    unsigned long dwAvgBytesPerSec;
    unsigned short wBlockAlign;
    unsigned short wBitsPerSample;
    // Note: there may be additional fields here, depending upon wFormatTag.
};

// This is the start ID of a Wave file
// must contain 'RIFF' and 'WAVE'
char startID[12];

// contains the chunk id ('data', 'cue ' ...) and the chunk size
struct Chunk
{
    char chunkID[4];
    long chunkSize;
};

// a pointer to the samples in the data chunk
unsigned char *WaveformData;

//---------------------------------------------------------------------------
#endif
File Unit1.h
//---------------------------------------------------------------------------
#ifndef Unit1H
#define Unit1H
//---------------------------------------------------------------------------
#include <Classes.hpp>
#include <Controls.hpp>
#include <StdCtrls.hpp>
#include <Forms.hpp>
#include <Dialogs.hpp>
//---------------------------------------------------------------------------
class TForm1 : public TForm
{
__published:    // IDE-managed Components
    TOpenDialog *OpenDialog1;
    TEdit *FileEdit;
    TLabel *Label1;
    TButton *Browse;
    TButton *Encode;
    void __fastcall BrowseClick(TObject *Sender);
    void __fastcall EncodeClick(TObject *Sender);
private:        // User declarations
    AnsiString OutputFileName;
public:         // User declarations
    __fastcall TForm1(TComponent* Owner);
};
//---------------------------------------------------------------------------
extern PACKAGE TForm1 *Form1;
//---------------------------------------------------------------------------
#endif
File Unit1.cpp
//---------------------------------------------------------------------------
#include <vcl.h>
#pragma hdrstop

#include <string.h>   // strncmp
#include <fstream>
#include <iostream>
#include "Unit1.h"

#define _BLADEDLL // Don't forget it
#include "lame_enc.h"
#include "format.h"
//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner)
    : TForm(Owner)
{
}
//---------------------------------------------------------------------------
void __fastcall TForm1::BrowseClick(TObject *Sender)
{
    OpenDialog1->InitialDir = ExtractFileDir(Application->ExeName);
    if(OpenDialog1->Execute())
    {
        FileEdit->Text = OpenDialog1->FileName;
        OutputFileName = ChangeFileExt(OpenDialog1->FileName, ".mp3");
    }
}
//---------------------------------------------------------------------------
void __fastcall TForm1::EncodeClick(TObject *Sender)
{
    if(FileEdit->Text == "")
        return;

    std::ifstream fin(FileEdit->Text.c_str(), std::ios::binary);
    if(!fin)
        return;

    // read the 12 characters at the front of the file
    // and check that they identify a Wave file
    fin.read((char*)&startID, sizeof(startID));
    if(!fin || strncmp(startID, "RIFF", 4) != 0 || strncmp(startID + 8, "WAVE", 4) != 0)
    {
        Application->MessageBox("This is not a valid Wave file", "Wav2Mp3 ERROR", MB_OK);
        return;
    }
    // get the format chunk
    FormatChunk fc;
    fin.read((char*)&fc, sizeof(FormatChunk));
    // the first chunk MUST be the format chunk
    if(strncmp(fc.chunkID, "fmt ", 4) != 0)
    {
        Application->MessageBox("This is not a valid Wave file", "Wav2Mp3 ERROR", MB_OK);
        return;
    }
    if(fc.wFormatTag != 1)
    {
        Application->MessageBox("Cannot handle compressed Wave file", "Wav2Mp3 ERROR", MB_OK);
        return;
    }
    // the encoder is fed 16-bit samples (see InputBuffer below)
    if(fc.wBitsPerSample != 16)
    {
        Application->MessageBox("Only 16-bit samples are supported", "Wav2Mp3 ERROR", MB_OK);
        return;
    }

    // initialization of Mp3 encoder
    BE_CONFIG bc;
    bc.dwConfig = BE_CONFIG_MP3;
    // 32000, 44100 and 48000 are the only sample rates allowed
    // due to encoding limitations
    if(fc.dwSamplesPerSec == 32000 || fc.dwSamplesPerSec == 44100 || fc.dwSamplesPerSec == 48000)
        bc.format.mp3.dwSampleRate = fc.dwSamplesPerSec;
    else
    {
        Application->MessageBox("Unsupported sample rate", "Wav2Mp3 ERROR", MB_OK);
        return;
    }
    if(fc.wChannels == 1)
        bc.format.mp3.byMode = BE_MP3_MODE_MONO;
    else
        bc.format.mp3.byMode = BE_MP3_MODE_STEREO;
    // the resulting file length depends on this parameter
    // the higher the bitrate, the better the result
    bc.format.mp3.wBitrate = 192;
    bc.format.mp3.bCopyright = false;
    bc.format.mp3.bCRC = false;
    bc.format.mp3.bOriginal = false;
    bc.format.mp3.bPrivate = false;

    // skip extra format chunk parameters, if any
    // (chunks are padded to an even number of bytes)
    long nExtra = 8 + fc.chunkSize + (fc.chunkSize & 1) - long(sizeof(FormatChunk));
    if(nExtra > 0)
        fin.seekg(nExtra, std::ios::cur);

    // get next chunk
    Chunk chunk;
    fin.read((char*)&chunk, sizeof(Chunk));
    // skip every chunk that is not the data chunk
    while(fin && strncmp(chunk.chunkID, "data", 4) != 0)
    {
        fin.seekg(chunk.chunkSize + (chunk.chunkSize & 1), std::ios::cur);
        fin.read((char*)&chunk, sizeof(Chunk));
    }
    if(!fin)
    {
        Application->MessageBox("No data chunk found", "Wav2Mp3 ERROR", MB_OK);
        return;
    }

    // proceed with the encoding
    DWORD dwNumberOfSamples;
    DWORD dwOutputBufferLength;
    HBE_STREAM hStream;
    if(beInitStream(&bc, &dwNumberOfSamples, &dwOutputBufferLength, &hStream) != BE_ERR_SUCCESSFUL)
    {
        Application->MessageBox("Cannot perform compression", "Wav2Mp3 ERROR", MB_OK);
        return;
    }

    std::ofstream fout(OutputFileName.c_str(), std::ios::binary);
    char *Mp3Buffer = new char[dwOutputBufferLength];
    SHORT *InputBuffer = new SHORT[dwNumberOfSamples]; // SHORT = short = 16 bits
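    // chunk.chunkSize is the size of the samples in bytes
    long nBytesDone = 0;
    DWORD dwBytesEncoded = 0;
    bool ok = true;
    while(ok && nBytesDone < chunk.chunkSize)
    {
        // read at most one full input buffer; the last block may be shorter
        long nBytesToRead = chunk.chunkSize - nBytesDone;
        if(nBytesToRead > long(dwNumberOfSamples * 2))
            nBytesToRead = long(dwNumberOfSamples * 2);
        fin.read((char*)InputBuffer, nBytesToRead);
        long nBytesRead = long(fin.gcount());
        if(nBytesRead <= 0)
            break; // truncated file
        nBytesDone += nBytesRead;

        // each sample is 2 bytes
        if(beEncodeChunk(hStream, nBytesRead / 2, InputBuffer, (BYTE*)Mp3Buffer, &dwBytesEncoded) != BE_ERR_SUCCESSFUL)
        {
            Application->MessageBox("Cannot perform compression", "Wav2Mp3 ERROR", MB_OK);
            ok = false;
        }
        else
            fout.write(Mp3Buffer, dwBytesEncoded);
    }

    // flush the encoder and write the bytes it was still holding
    if(beDeinitStream(hStream, (BYTE*)Mp3Buffer, &dwBytesEncoded) == BE_ERR_SUCCESSFUL && ok)
        fout.write(Mp3Buffer, dwBytesEncoded);
    beCloseStream(hStream);

    // the buffers were allocated with new[], so release them with delete[]
    delete[] Mp3Buffer;
    delete[] InputBuffer;
}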
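A comment in EncodeClick notes that the length of the resulting file depends on the bitrate. For a constant-bitrate stream that relation is simple arithmetic, sketched below. wavDurationSeconds and estimateMp3Bytes are hypothetical helpers, not part of the encoder DLL, and the estimate ignores tags and the rounding of the stream to whole MP3 frames, so treat the result as approximate.

```cpp
#include <cstdint>

// Playing time of the raw samples, from the data chunk size and the
// dwAvgBytesPerSec field of the format chunk.
double wavDurationSeconds(uint32_t dataBytes, uint32_t avgBytesPerSec)
{
    if (avgBytesPerSec == 0)
        return 0.0;  // avoid dividing by zero on a malformed header
    return double(dataBytes) / avgBytesPerSec;
}

// Approximate size of a constant-bitrate MP3 stream.
// bitrateKbps is in kilobits per second (1 kbit = 1000 bits),
// the same unit as the wBitrate field.
uint64_t estimateMp3Bytes(double seconds, unsigned bitrateKbps)
{
    return uint64_t(seconds * bitrateKbps * 1000.0 / 8.0);
}
```

For example, one minute of 44100 Hz 16-bit stereo audio is 10,584,000 bytes of samples. At the 192 kbit/s used in the code above the estimate is 1,440,000 bytes, roughly one seventh of the input.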