

Kuncoro Triandono Mukti¹ and Anggunmeka Luhur Prasasti²

Computer Engineering,
Telkom University
Bandung, Indonesia
¹ kuncoroteem@gmail.com, ² anggunmeka@gmail.com

Abstract - Audio is one of the fastest growing kinds of multimedia data. The music industry in particular produces many large audio files and, as with image and video data, audio data must be compressed to address storage limits and the need for real-time access over computer networks. A smaller audio file reduces delay, so data transmission is faster. Traditional data compression is applied on computers because every symbol that appears is represented by a certain number of bits; compression reduces the number of bits needed for each symbol and thereby the storage space the data occupies. As with image compression, there are two kinds of audio compression techniques, lossy and lossless. For everyday consumption, lossy compression is more widely used because its compression ratio is high, that is, the resulting audio file is very small. This paper discusses the basic principles of audio compression, with emphasis on a comparison of algorithms and on audio file formats.

Keywords - Audio Compression, Lossless, Lossy.

I. INTRODUCTION

In general, data compression is the conversion of symbols into codes. Compression is achieved when the codes are smaller than the original symbols. The conversion is driven by a model: put simply, a model is a collection of data and rules that determines which code is output for a given input symbol [3].

Audio is one of the fastest growing kinds of multimedia data. The music industry in particular produces many large audio files and, as with image and video data, audio data must be compressed to address storage limits and the need for real-time access over computer networks. A smaller audio file reduces delay, so data transmission is faster [1]. Like most compression techniques, audio data compression, both lossy and lossless, exploits information redundancy through encoding, pattern recognition, and linear prediction, much as video compression does. In lossless compression the original data can be restored without any change, so the compression ratio cannot be too high if all data is to be recoverable in its original form. Lossy compression discards some of the information contained in the original data, so the output of decompression is not exactly the same as the original. Lossy schemes exploit the limits of the human senses, which perceive the environment only within a certain range: sound is divided by frequency into four groups, of which the range audible to the human ear is 20 Hz to 20,000 Hz.

Two of the most popular audio formats are FLAC and MP3. In terms of quality FLAC is better than MP3, yet in practice, in Indonesia at least, more people listen to MP3 (MPEG-1 Audio Layer 3) because of its small size. Compared with MP3, FLAC requires considerably more space. Uncompressed CD-quality audio uses a 44.1 kHz sampling rate, 16 bits per sample, and 2 channels (stereo), so one second of audio occupies 176,400 bytes and 60 seconds (1 minute) occupies about 10.584 MB. If the average song lasts about 4 minutes, one song takes about 42.336 MB, so one CD can hold only about 16 songs [2, 4].

Many compression algorithms are currently available, including Dynamic Markov Compression (DMC), Run Length Encoding (RLE), Lempel-Ziv-Welch (LZW), arithmetic coding, Huffman coding, Rice coding, Golomb coding, the Burrows-Wheeler (BW) transform, and others. Previous research suggests that the Huffman algorithm is faster and better for audio compression: according to [11], Huffman coding is better and faster and produces a higher PSNR than arithmetic coding, and according to [12], Huffman coding gives better compression results than LZW and DMC on binary files, multimedia files, image files, and already-compressed files.
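
The storage figures quoted for CD-quality audio can be checked with a few lines of arithmetic. The sketch below is only illustrative and rests on two assumptions that the text does not state: that 1 MB means 1,000,000 bytes and that a CD holds 700 MB. The constant and function names are hypothetical.

```python
# CD-quality PCM parameters quoted in the text.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

# Assumptions (not stated in the text): decimal megabytes and a 700 MB CD.
BYTES_PER_MB = 1_000_000
CD_CAPACITY_MB = 700


def pcm_bytes(seconds: int) -> int:
    """Size in bytes of uncompressed PCM audio of the given duration."""
    bytes_per_sample = BITS_PER_SAMPLE // 8
    return SAMPLE_RATE_HZ * bytes_per_sample * CHANNELS * seconds


bytes_per_second = pcm_bytes(1)
mb_per_minute = pcm_bytes(60) / BYTES_PER_MB
mb_per_song = pcm_bytes(4 * 60) / BYTES_PER_MB      # a 4-minute song
songs_per_cd = int(CD_CAPACITY_MB // mb_per_song)   # whole songs only

print(bytes_per_second, mb_per_minute, mb_per_song, songs_per_cd)
```

Under these assumptions the script reproduces the 176,400 bytes per second, 10.584 MB per minute, 42.336 MB per song, and 16 songs per CD given in the text.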

In this paper we first explain the basic theory of lossless and lossy compression, then the algorithms that can be used for compression, and finally compare the results of previous research.

Data compression is the process of encoding information using fewer bits (or other information-bearing units) than an unencoded representation would use, by means of a particular encoding system. There are two families of compression techniques, lossless and lossy. The following sections explain lossless compression, lossy compression, and audio data compression algorithms. The classification is shown in figure 1.

Figure 1. Classification of data compression

2.1. LOSSLESS

Lossless compression of audio data means that the compressed data can be decompressed to produce exactly the same data as the original, with no loss of quality at all. Lossless compression of audio data is broadly similar to generic lossless compression, with a compression ratio of about 50% to 60%, although it can reach 35% for orchestral music or for choral music with little noise.

A lossless compression program usually uses two different types of algorithm: the first builds a statistical model of the input data, and the second uses this model to map the input data to bit sequences in such a way that "probable" data produces shorter output than "improbable" data. The main encoding algorithms used to generate the bit sequences are Huffman coding and arithmetic coding [6].

Figure 2. Lossless compression

A. Usability of Lossless Compression

Lossless compression is used primarily for archiving and editing. For archiving, the best possible quality is naturally desired, and the same holds for editing: editing lossy-compressed data degrades the sound quality with every save. Lossless compression is therefore always used in sound engineering. In addition, lossless compression is commonly used by audiophiles, music fans who enjoy listening to high-quality music on high-quality hardware. Lossless compressed audio is also used as the source from which lossy audio is generated for distribution. With the falling cost of digital storage media and bandwidth, lossless compression is becoming increasingly popular among consumers.

B. Basic Principle of Lossless Compression

Lossless compression of audio data has two main stages: prediction and coding. Prediction uses the previous samples to predict the next sample. The difference between the predicted sample and the actual sample is then coded. Formats usually differ only in their prediction and/or coding techniques.

Audio formats that support lossless compression include Shorten and, among those commonly used today, Free Lossless Audio Codec (FLAC), Apple Lossless, MPEG-4 ALS, Monkey's Audio, WavPack, and True Audio.

Each format has its own compression stages. As an example, this paper explains the compression process of the FLAC format.

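The two stages of lossless audio compression, prediction followed by coding of the difference, can be sketched as follows. This is a minimal illustration and not the procedure of any particular format: it assumes the simplest possible predictor (each sample is predicted to equal the previous one) and a basic Rice code with a fixed, hand-picked parameter k. All function names are hypothetical.

```python
def predict_residuals(samples):
    """Predict each sample as the previous one and keep the difference."""
    residuals, previous = [], 0
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def reconstruct(residuals):
    """Invert predict_residuals by accumulating the differences."""
    samples, previous = [], 0
    for residual in residuals:
        previous += residual
        samples.append(previous)
    return samples


def zigzag(n):
    """Map signed to unsigned: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(u):
    return u // 2 if u % 2 == 0 else -((u + 1) // 2)


def rice_encode(values, k):
    """Rice code: unary quotient, a '0' terminator, then a k-bit remainder."""
    bits = []
    for value in values:
        u = zigzag(value)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        bits.append("1" * quotient + "0")
        if k:
            bits.append(format(remainder, f"0{k}b"))
    return "".join(bits)


def rice_decode(bits, k, count):
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1  # skip the '0' terminator
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((quotient << k) | remainder))
    return values


samples = [100, 102, 105, 107, 108, 108, 107, 105]
residuals = predict_residuals(samples)
bitstream = rice_encode(residuals, k=2)
decoded = reconstruct(rice_decode(bitstream, k=2, count=len(residuals)))
print(len(bitstream), "bits instead of", 16 * len(samples))
```

Because neighbouring samples are close in value, the residuals are small and the Rice code spends few bits on them, while decoding restores the samples exactly. A real encoder would choose the predictor and the parameter k per block rather than fixing them by hand.
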
Example Audio Compression on FLAC

The FLAC format is published by the Xiph.Org Foundation and exploits the high correlation between samples in audio data. FLAC uses linear prediction to convert samples into a series of numbers called residues, which are then stored with Golomb-Rice coding. The resulting compression ratio is 40% to 50%.

The compression process consists of several stages:

• Blocking. A block in FLAC is a run of samples spanning multiple channels. The block size may vary depending on several factors, including the sample rate, and it affects the compression ratio directly. If the block size is too small, many frames are needed and many bits are wasted on frame headers. If it is too large, the characteristics of the audio signal vary too much within a block, making it difficult to find an optimal predictor. FLAC limits the block size to between 16 and 65535 samples per block.
• Interchannel decorrelation. For stereo data, the left and right channels are often highly correlated, so there are several ways of storing the channels in a block:
  o Independent: both channels are encoded separately.
  o Mid-side: stores the average signal of both channels as the mid channel and the difference between the left channel and the right channel as the side channel.
  o Left-side: stores the left channel and the side channel.
  o Right-side: stores the right channel and the side channel.
  In certain cases, the left-side and right-side methods are the most efficient [15].
• Prediction. The encoder looks for an approximate mathematical description of the signal in each block. This description is generally much smaller than the signal itself. The prediction methods are known to both the encoder and the decoder, so the compressed data only needs to include the prediction parameters. FLAC uses four prediction methods:
  o Verbatim: the predicted signal is zero, so the residue is the same as the actual signal (no compression).
  o Constant: used when a channel in a block contains digital silence or a constant value. The encoding used is run-length.
  o Fixed linear prediction.
  o FIR linear prediction.
• Residual coding using Golomb-Rice coding. When the predictor cannot describe the signal exactly, the difference between the original signal and the predicted signal must be kept. This difference is called the residue. The effectiveness of the prediction can be judged from the size of the residue. The residue is stored in one of two ways with Rice coding:
  o Using one parameter for the entire residue. This parameter is based on the variance of the residual values.
  o Dividing the residue into several parts of the same length, each with its own parameter determined from the average value of the residue.

C. Algorithm for Lossless Compression

• Huffman Code
• Golomb Code
• Rice Code
• Tunstall Code
• Arithmetic Code
• Dictionary Code
• Run-Length Code

2.2. LOSSY

Lossy compression is a technique in which the decompressed data is not the same as the data before compression but is "good enough" to be used. Examples include MP3, streaming media, JPEG, MPEG, and WMA. Its advantage is a smaller size than lossless compression. In lossy compression of audio data, quality is reduced when the compressed data is decompressed. This quality degradation is called compression "artifacts".

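Why lossy compression cannot restore the original data can be seen in a small sketch. It uses uniform scalar quantization, one of the lossy techniques this paper lists, and measures the damage with the mean squared error (MSE). The step size and sample values are arbitrary illustrative choices, and the function names are hypothetical.

```python
def quantize(samples, step):
    """Uniform scalar quantization: keep only the index of the nearest level."""
    return [round(sample / step) for sample in samples]


def dequantize(indices, step):
    """Map level indices back to sample values; the rounding error stays lost."""
    return [index * step for index in indices]


def mean_squared_error(original, restored):
    """Average squared difference between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)


original = [100, 102, 105, 107, 108, 108, 107, 105]
step = 4  # a larger step gives smaller indices to store but a larger error

indices = quantize(original, step)
restored = dequantize(indices, step)

print(indices)
print(restored)
print(mean_squared_error(original, restored))
```

The indices span a much smaller range than the original samples, which is where the saving comes from, but several distinct input values collapse onto the same level, so no decoder can tell them apart afterwards.
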
Usually this technique removes pieces of data that are not really useful or not really perceived, so that people still consider the data usable even after it has been compressed, as with MP3. Examples of these methods are transform coding, wavelet coding, and others. Lossy compression is also called irreversible compression because the original data cannot be restored. The advantage of this technique is a high compression ratio compared with lossless methods [6].

Figure 3. Lossy compression

A. Usability of Lossy Compression

Lossy compression of audio data is very widely used, either directly (e.g. on MP3 players) or indirectly (in DVD video, digital television, video streaming, etc.). Music lovers use this kind of compression because it gives a very high compression ratio, between 0% and 100%, while the sound quality is still "good enough".

B. Basic Principle of Lossy Compression

Lossy compression of audio data relies primarily on psychoacoustics, which concerns the way the human ear perceives sound. The human ear can only capture sounds with frequencies between 20 Hz and 20,000 Hz. There are also several other basic techniques in lossy compression of audio data, namely:

• VOC file compression. This technique is very simple: it removes silent samples (no sound), such as the pauses between paragraphs in a speech or a moment of silence in parts of a song.
• Linear Predictive Coding (LPC), Code-Excited Linear Prediction (CELP).
• CELP is a further development, with a more complex analytical model that produces greater compression ratios and better sound quality. Somewhat as in lossless compression, in CELP the difference between the original sound and the analytical model is also stored, in compressed form.
• ADPCM. Put simply, the first sample is kept intact, while for each following sample only the difference from the previous sample is stored, which is generally not very large.
• MPEG. This technique uses psychoacoustic theory. If a part of the sound cannot be heard by the human ear, that part does not need to be encoded. A related psychoacoustic technique is noise shaping. High-frequency signals can only be heard by humans if they are loud, so noise is "hidden" in these high-frequency areas at low volume.

Audio formats that support lossy compression include the very popular MP3, the audio layer of MPEG (MPEG layer III), its further development AAC, and OGG. For speech data there are several formats, such as A-law / μ-law used in telephony, AMR on GSM, AMR-WB for CDMA, and so on.

C. Algorithm for Lossy Compression

• Scalar Quantization
• Vector Quantization

Audio file formats are grouped into three classes: free and open file formats, open file formats, and proprietary formats. Details are shown in table 1.

Table 1. Audio File Format [16]

Free and Open File Format

Format        Description
*.wav         The standard audio file container format, used mainly on Windows PCs. Commonly used for storing uncompressed (PCM), CD-quality sound files, which means that they can be large in size, around 10 MB per minute. Wave files can also contain data encoded with a variety of codecs to reduce the file size (for example the GSM or mp3 codecs). Wav files use a RIFF structure.
*.ogg         A free, open source container format supporting a variety of codecs, the most popular of which is the audio codec Vorbis. Vorbis offers compression similar to MP3 but is less popular.
*.mpc         Musepack or MPC (formerly known as MPEGplus, MPEG+ or MP+) is an open source lossy audio codec, specifically optimized for transparent compression of stereo audio at bitrates of 160 to 180 kbit/s. Musepack and Ogg Vorbis are rated as the two best available codecs for high-quality lossy audio compression in many double-blind listening tests. Nevertheless, Musepack is even less popular than Ogg Vorbis and nowadays is used mainly by audiophiles.
*.flac        A lossless compression codec. This format compresses losslessly, like zip but for audio. If you compress a PCM file to flac and then restore it, the result is a perfect copy of the original. (All the other codecs discussed here are lossy, which means a small part of the quality is lost.) The cost of this losslessness is that the compression ratio is not good. Flac is recommended for archiving PCM files where quality is important (e.g. broadcast or music use).
*.aiff        The standard audio file format used by Apple. It is like a wav file for the Mac.
*.raw         A raw file can contain audio in any codec but is usually used with PCM audio data. It is rarely used except for technical tests.
*.au          The standard audio file format used by Sun, Unix and Java. The audio in au files can be PCM or compressed with the μ-law, A-law or G729 codecs.
*.midi        An industry-standard protocol that enables electronic musical instruments, computers, and other equipment to communicate, control, and synchronize with each other.

Open File Format

Format        Description
*.gsm         Designed for telephony use in Europe, gsm is a very practical format for telephone-quality voice. It makes a good compromise between file size and quality. Note that wav files can also be encoded with the gsm codec.
*.dct         A variable codec format designed for dictation. It has dictation header information and can be encrypted (often required by medical confidentiality laws).
*.vox         The vox format most commonly uses the Dialogic ADPCM (Adaptive Differential Pulse Code Modulation) codec. Like other ADPCM formats, it compresses to 4 bits. Vox files are similar to wave files except that they contain no information about the file itself, so the codec sample rate and number of channels must first be specified in order to play a vox file.
*.aac         The Advanced Audio Coding format is based on the MPEG2 and MPEG4 standards. aac files are usually ADTS or ADIF containers.
*.mp4/m4a     MPEG-4 audio, most often AAC but sometimes MP2/MP3.
*.mmf         A Samsung audio format used for ringtones.

Proprietary Formats

Format        Description
*.mp3         The MPEG Layer-3 format is the most popular format for downloading and storing music. By eliminating portions of the audio file that are essentially inaudible, mp3 files are compressed to roughly one-tenth the size of an equivalent PCM file while maintaining good audio quality.
*.wma         The popular Windows Media Audio format owned by Microsoft. Designed with Digital Rights Management (DRM) abilities for copy protection.
*.wav         The older style Sony ATRAC format. It always has a .wav file extension. To open these files simply install the ATRAC3 drivers.
*.ra          A Real Audio format designed for streaming audio over the Internet. The .ra format allows files to be stored in a self-contained fashion on a computer, with all of the audio data contained inside the file.
*.ram         A text file that contains a link to the Internet address where the Real Audio file is stored. The .ram file contains no audio data.
*.dss         Digital Speech Standard files are an Olympus proprietary format. It is a fairly old and poor codec. Prefer gsm or mp3 where the recorder allows. It allows additional data to be held in the file header.
*.msv         A Sony proprietary format for Memory Stick compressed voice files.
(not given)   A Sony proprietary format for compressed voice files; commonly used by Sony dictation recorders.
(not given)   A proprietary version with Digital Rights Management developed by 3D Solar UK Ltd for use in music downloaded from their Tronme Music Store and interactive music and video player.
(not given)   A proprietary version of AAC in MP4 with Digital Rights Management developed by Apple for use in music downloaded from their iTunes Music Store.
*.iklax       An iKlax Media proprietary format, the iKlax format is a multi-track digital audio format allowing various actions on musical data, for instance on mixing and volume arrangements.
(not given)   A Musinaut proprietary format allowing play of different versions (or skins) of the same song.

An example of compression with different formats is shown in figure 4.

Figure 4. Compression with different file formats
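
Table 1 and the list of speech formats both mention μ-law, a companding scheme used in telephony. The sketch below illustrates the idea with the continuous μ-law curve (μ = 255) followed by plain 8-bit uniform quantization. It is a simplification for illustration only and not the exact segment-based encoding of the telephony standard; samples are assumed to be floats in the range -1 to 1, and all function names are hypothetical.

```python
import math

MU = 255        # companding constant commonly used for 8-bit mu-law
LEVELS = 256    # 8-bit codes


def compress(x):
    """Continuous mu-law curve: expands small amplitudes, squeezes large ones."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y):
    """Inverse of compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def uniform_encode(x):
    """Plain 8-bit uniform quantization of a value in [-1, 1]."""
    return min(LEVELS - 1, int((x + 1) / 2 * LEVELS))


def uniform_decode(code):
    """Return the midpoint of the quantization cell."""
    return (code + 0.5) / LEVELS * 2 - 1


def mulaw_encode(x):
    return uniform_encode(compress(x))


def mulaw_decode(code):
    return expand(uniform_decode(code))


quiet = 0.001  # a very quiet sample
uniform_error = abs(uniform_decode(uniform_encode(quiet)) - quiet)
mulaw_error = abs(mulaw_decode(mulaw_encode(quiet)) - quiet)
print(uniform_error, mulaw_error)
```

Both encoders spend 8 bits per sample, but companding places more quantization levels near zero, so quiet passages, which dominate speech, are reproduced with a smaller error.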

Table 2. Comparison of Lossy and Lossless Algorithms

Reference  Algorithm                    Compression Type  Source Format  Result Format  Results (%)
[1]        Huffman                      Lossless          *.aac          *.aac          3%
[1]        Huffman                      Lossless          *.flac         *.flac         1%
[1]        Huffman                      Lossless          *.midi         *.midi         1% - 35%
[1]        Huffman                      Lossless          *.mp3          *.mp3          1% - 3%
[1]        Huffman                      Lossless          *.wav          *.wav          10% - 27%
[2]        Huffman Shift Coding         Lossy             *.flac         *.mp3          90% - 95%
[3]        Huffman                      Lossless          *.mp3          *.mp3          1%
[3]        Huffman                      Lossless          *.wav          *.wav          17%
[3]        Huffman                      Lossless          *.wma          *.wma          2%
[4]        Huffman                      Lossless          *.wav          *.wav          20% - 40%
[5]        Arithmetic                   Lossy             *.mp3          *.mp3          2.4%
[5]        Arithmetic                   Lossy             *.wav          *.wav          11%
[6]        Arithmetic                   Lossy             *.aif          *.aic          18.443%
[6]        Arithmetic                   Lossy             *.au           *.aic          27.589%
[6]        Arithmetic                   Lossy             *.mid          *.aic          31.819%
[6]        Arithmetic                   Lossy             *.wav          *.aic          36.741%
[7]        Huffman                      Lossless          -              -              40.56%
[7]        Lempel-Ziv Welch (LZW)       Lossless          -              -              51.47%
[8]        Run Length Encoding (RLE)    Lossless          *.mp3          *.mp3          0.46%
[8]        Run Length Encoding (RLE)    Lossless          *.wav          *.wav          13.83%
[9]        Lempel-Ziv Welch (LZW)       Lossless          *.mp3          *.wav          3.73% - 29.56%
[10]       Huffman Shift Coding         Lossless          *.wav          *.wav          14.87%
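
Since Huffman coding appears in most rows of Table 2, a minimal sketch of it may help in reading the numbers. The code below builds a Huffman code for a byte string, checks that decoding restores the input, and reports a percentage. The table does not define how its "Results (%)" were computed, so the space-saving formula used here, 100 × (1 − compressed size / original size), is an assumption, and the size of the code table itself is ignored. All function names are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(data: bytes) -> dict:
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    frequencies = Counter(data)
    if not frequencies:
        return {}
    if len(frequencies) == 1:  # a single distinct symbol still needs one bit
        return {next(iter(frequencies)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)  # two least frequent subtrees
        w2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(data: bytes, code: dict) -> str:
    return "".join(code[symbol] for symbol in data)


def huffman_decode(bits: str, code: dict) -> bytes:
    inverse = {c: s for s, c in code.items()}
    decoded, current = bytearray(), ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix property: first match is the symbol
            decoded.append(inverse[current])
            current = ""
    return bytes(decoded)


def space_saving_percent(original_bits: int, compressed_bits: int) -> float:
    """Assumed metric: share of the original size that was saved."""
    return 100 * (1 - compressed_bits / original_bits)


data = b"aaaaaaaabbbbccd"
code = huffman_code(data)
bits = huffman_encode(data, code)
print(code, len(bits), space_saving_percent(8 * len(data), len(bits)))
```

Frequent symbols receive short codes and rare symbols long ones, which is why the gain depends so strongly on the input: data whose byte values are already close to uniformly distributed, such as MP3 or FLAC files, leaves little for Huffman coding to exploit, in line with the low percentages reported for those formats.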

In study [1], compression and decompression were tested on 30 sample files to observe the compression ratio and time. The test concluded that, ordered from highest to lowest compression ratio, the formats rank as follows: *.midi, *.wav, *.mp3, *.aac, *.flac. In contrast to studies [1], [3], [4], and [7], which used the plain Huffman algorithm, study [2] used the Huffman Shift Coding algorithm to perform lossy compression and obtained a very satisfactory compression ratio of over 90%. Studies [5] and [6], which used arithmetic coding for lossy compression, reached a highest compression ratio of only 36.741%, for the *.wav format. Studies [7] and [9] used Lempel-Ziv Welch (LZW) for lossless compression, and the highest compression ratio was 51.47%. Of all the algorithms compared, Run Length Encoding (RLE) in study [8] gave the lowest result, a compression ratio of only 0.46%.

V. CONCLUSION

From previous research it can be concluded that Huffman coding is currently the most popular algorithm for lossless compression. With today's advances in data storage technology, a variety of lossless compression schemes are being developed, because storage has become relatively cheap and listeners want a comfortable listening experience.

Among the formats used as test objects, *.wav is the format with the best compression results compared with the other formats.

Grouped by type and algorithm, lossless compression is best served by Lempel-Ziv Welch (LZW), as shown by research [7], and lossy compression by the Huffman Shift Coding algorithm, which achieves a compression ratio of over 90% [2].

Finally, all of this is based on previous research. At present, compression applications and applications that use audio compression, such as Format Factory, Xilisoft, Instagram, Facebook, Line, and WhatsApp, each use their own compression, which produces different results even though the resulting data sizes do not differ much. The compression percentage is therefore not very important, since it can be adjusted. What matters is a good compression result. For example, the MSE (mean squared error) measures the difference between the original file and the compression result: the smaller the value, the better the result. If compression reaches 100% but the MSE is very large, the compression result is not good.

VI. REFERENCE

[1] Rendra Warsita, Rahmat Agus Setiawan, Yoannita, Rancang Bangun Aplikasi Kompresi Audio Berbasis Android Menggunakan Algoritma Huffman. 2015.

[2] Luthfi Firmansyah, Data Audio Compression Lossless FLAC Format to Lossy Audio MP3 Format with Huffman Shift Coding Algorithm. Fourth International Conference on Information and Communication Technologies (ICoICT). 2016.

[3] Ari Wibowo, Kompresi Data menggunakan Algoritma Huffman. Batam.

[4] Hari Purwanto, Penerapan Algoritma Huffman pada Kompresi file WAV. JURNAL VOL.2, No.2-40-59 Universitas Suryadarma. 2015.

[5] Uswatun Hasanah, IMPLEMENTASI PADA KOMPRESI FILE AUDIO VIA FTP (FILE TRANSFER PROTOCOL). Undergraduate thesis. 2017.

[6] Yahya Fathoni Amri, Analisis Kinerja Kompresi File audio menggunakan Algoritma Arithmethic Coding dengan bilangan integer. 2012.

[7] Rhen Anjerome Bedruz, Ana Riza F. Quiros, Comparison of Huffman Algorithm and Lempel-Ziv Algorithm for Audio, Image and Text Compression. 8th IEEE International Conference Humanoid, Nanotechnology, Information Technology Communication and Control, Environment and Management (HNICEM). 2015.

[8] Aditya Rahandi, Dian Rachmawati, Sajadin Sembiring, Analisis dan Implementasi Kompresi File Audio Dengan Menggunakan Algoritma Run Length Encoding (RLE). Jurnal Online Program Studi S1 Ilmu Komputer, Vol 1, No 1 (2012).

[9] Erwin Dwika Putra, Dedy Abdullah, Analisis Perbandingan Kompresi Gambar (*.bmp) dan Audio (*.wav) Menggunakan Algoritma Lempel Ziv Welch (LZW), Bengkulu. Amplifier Vol. 6 No. 2, May 2016.

[10] Prasetyo, Galang Bagus, Kompresi File Audio Wave Menggunakan Algoritma Huffman Shift Coding. Undergraduate thesis, Universitas Brawijaya. 2013.

[11] Venkatasekhar D., Aruna P, A Fast Fractal Image Compression Using Huffman Coding. Dept of Computer Science & Engg., Annamalai University, Annamalai Nagar, India. 2012.

[12] Sutoyo T, Teori Pengolahan Citra Digital. Andi, Yogyakarta. 2009.

[13] Daryanto, T. Sistem Multimedia dan Aplikasinya. Yogyakarta: Graha Ilmu. 2005.

[14] Benjamin, A. Music Compression Algorithms and Why You Should Care. Alexander Benjamin. 2010.

[15] Satrio Adi Rukmono, Kompresi Data Audio. Bandung. 2009.

[16] Audio File Format, accessed 2008.