Music Genre Recognition Using Neural Network
Government Engineering College, Wayanad
Athul K S
Jasmin Joseph
Roshna Raj V
Vishnu P K
December 3, 2018
Contents
1 Introduction
1.1 Purpose
1.2 Scope
1.3 Document Conventions and Acronyms
1.4 References
1.5 Overview
2 Overall Description
2.1 Product Perspective
2.1.1 User Interfaces
2.1.2 Software Interfaces
2.2 Product Functions
2.2.1 Initialize Service
2.2.2 Stop Service
2.2.3 Reading a music file
2.2.4 Preprocessing the file
2.2.5 Classification of genre
2.2.6 Playlist generator
2.2.7 Login
2.2.8 Logout
2.2.9 Give Ratings
2.3 Constraints
3 Specific Requirements
3.1 Functional Requirement Specification
3.1.1 Client
3.1.2 Server
3.2 Non-functional Requirements
3.2.1 Performance requirements
3.2.2 Design constraints
1 Introduction
1.1 Purpose
This document specifies a system that recognizes the genre of each song. The aim is to apply
machine learning, in the form of a neural network, to the task of music genre classification. The
document describes an efficient way to structure and organize the large number of music files
available on the internet. It explains the purpose and features of the system, its interfaces, what
the system will do, and the constraints under which it must operate. The environments into which
this project will be integrated are expected to have a large and varied user base, and many of
those users need such a system to find what they are looking for. Users may or may not be aware
that they need a recognition feature in the software or website they are using, but such a feature
can save them time when they search a large and varied collection of content. There may also be
users who do not know exactly what they are looking for, and this project can offer a useful
solution in those situations as well.
1.2 Scope
The goal is to design a genre recognition system for the music domain.
The product saves users time when they search a large and varied collection of content, where
finding what they need would otherwise take a long time.
1.4 References
1. http://www.tensorflow.org/api_docs/python
2. http://www.labrosa.ee.columbia.edu/millionsong
3. http://www.github.com/librosa/librosa
1.5 Overview
Section 2 of this document gives an overall description of the product. It describes the general
functionality, the expected user groups and the constraints of the product. Section 3 gives more
specific information about the functionality introduced in Section 2. Section 2 is a general view
of the system and should be used as a guide to Section 3. Section 3 is intended for developers and
testers and may be skipped by end users.
2 Overall Description
2.1 Product Perspective
We will use the TensorFlow framework to train a neural network on a data set of music recordings.
The data set contains features extracted from audio recordings (.au files, in this case), which
are used to classify the recordings by genre. Each example is classified as classic, rock, jazz,
blues, country, metal, pop, disco, reggae or hiphop. The audio files are read as mel spectrograms
and split into 3 s windows with 50% overlap, which results in a data set of size 19000x129x128x1
(samples x time x frequency x channel). We will also create models from different data sets,
depending on which features are selected. The attributes are duration, tempo, root mean square
(RMS) amplitude, sampling frequency, sampling rate, dynamic range, tonality and number of digital
errors. The main goal of this experiment is to train a neural network to classify these 10 genres
and to discover which of the observed features have an impact on classification. The data set
contains 1000 instances (100 of each genre), each with 8 numeric attributes and a genre name that
is one of the 10 classes listed above.
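As an illustration of one of the listed attributes, the following is a minimal sketch of how the RMS amplitude of a clip could be computed from its raw samples. The function name, the 22050 Hz sample rate and the synthetic test tone are assumptions made for this example; the document does not specify how the attributes are extracted.

```python
import math


def rms_amplitude(samples):
    """Root mean square amplitude: square root of the mean of the squared samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


# Assumed sample rate; not stated in this document.
SAMPLE_RATE = 22050

# One second of a synthetic 440 Hz sine wave with peak amplitude 1.0,
# standing in for decoded audio samples.
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

print(round(rms_amplitude(tone), 3))  # a unit sine has RMS 1/sqrt(2), about 0.707
```

In practice the samples would come from a decoded .au file rather than a generated tone.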
1. ML Kit
ML Kit (beta) brings Google's machine learning expertise to mobile developers in a powerful
and easy-to-use package.
Source: https://developers.google.com/ml-kit/
2. Firebase
Firebase is a mobile and web application development platform developed by Firebase, Inc.
Source: https://firebase.google.com/
1. Sampling:
The loaded 30 s audio file is split into 3 s windows with 50% overlap in order to reduce
the size of each input array.
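The sampling step can be sketched as follows. This is a minimal illustration that works on any sequence of samples; the function names and the 22050 Hz sample rate are assumptions, and the zero-filled list only stands in for a decoded 30 s clip.

```python
def window_starts(n_samples, window, overlap=0.5):
    """Return the start index of every full window that fits in the signal."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    hop = int(window * (1 - overlap))  # step between consecutive window starts
    if hop <= 0 or window > n_samples:
        return []
    return list(range(0, n_samples - window + 1, hop))


def split_windows(signal, window, overlap=0.5):
    """Split a sequence of samples into equally sized, overlapping windows."""
    return [signal[s:s + window] for s in window_starts(len(signal), window, overlap)]


# Assumed sample rate; not stated in this document.
SAMPLE_RATE = 22050

clip = [0.0] * (30 * SAMPLE_RATE)               # placeholder for a decoded 30 s clip
windows = split_windows(clip, 3 * SAMPLE_RATE)  # 3 s windows, 50% overlap

print(len(windows), len(windows[0]))  # 19 66150
```

With a 1.5 s step, a 30 s clip yields (30 - 3) / 1.5 + 1 = 19 windows, so 1000 clips give the 19000 samples mentioned in Section 2.1.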
2.2.7 Login
When the user logs in to the website, the Music Recognition System is informed, a recommendation
session for the user starts, and recommendations are generated.
2.2.8 Logout
When the user logs out from the website, the Music Recognition System is informed and the
associated recommendation session for the user is closed.
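The login and logout behaviour described above could be tracked with a small session registry. The sketch below is hypothetical: the class name, its methods and the in-memory dictionary are assumptions for illustration, not part of the specified design, and a real deployment would likely delegate authentication to a service such as Firebase.

```python
import uuid


class SessionRegistry:
    """Hypothetical in-memory map from logged-in users to active session ids."""

    def __init__(self):
        self._sessions = {}

    def on_login(self, user_id):
        """Open a session for the user, or return the one that is already open."""
        if user_id not in self._sessions:
            self._sessions[user_id] = uuid.uuid4().hex
        return self._sessions[user_id]

    def on_logout(self, user_id):
        """Close the user's session; return True if one was open."""
        return self._sessions.pop(user_id, None) is not None

    def is_active(self, user_id):
        """Report whether the user currently has an open session."""
        return user_id in self._sessions


registry = SessionRegistry()
session_id = registry.on_login("user-1")   # website reports a login
print(registry.is_active("user-1"))        # True
registry.on_logout("user-1")               # website reports a logout
print(registry.is_active("user-1"))        # False
```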
2.2.9 Give Ratings
After recognition, the user rates the result in the GUI of our system. These ratings are
stored to improve music recognition in the future.
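One possible way to store these ratings is sketched below. Everything in it is an assumption made for illustration: the class and method names, the in-memory storage and the 1 to 5 rating scale are not defined by this document.

```python
from collections import defaultdict


class RatingStore:
    """Hypothetical in-memory store for user ratings of recognition results."""

    def __init__(self, min_rating=1, max_rating=5):
        # The 1-5 scale is an assumption; the document does not define one.
        self.min_rating = min_rating
        self.max_rating = max_rating
        self._by_genre = defaultdict(list)

    def add(self, user_id, track_id, predicted_genre, rating):
        """Record one rating for the genre the system predicted."""
        if not self.min_rating <= rating <= self.max_rating:
            raise ValueError("rating out of range")
        self._by_genre[predicted_genre].append((user_id, track_id, rating))

    def mean_rating(self, genre):
        """Average rating for a predicted genre, or None if it has no ratings."""
        entries = self._by_genre.get(genre)
        if not entries:
            return None
        return sum(r for _, _, r in entries) / len(entries)


store = RatingStore()
store.add("user-1", "track-42", "jazz", 4)
store.add("user-2", "track-42", "jazz", 2)
print(store.mean_rating("jazz"))  # 3.0
```

A low average for a genre would point to predictions that users disagree with, which is one signal that could guide later retraining.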
2.3 Constraints
• The product needs user profile data during development, and regulatory policies may make it
difficult for developers to obtain sufficient real-time data.
• Millions of data records will be needed to test the software. At this stage developers will
need a large amount of disk space and computing clusters.
3 Specific Requirements
This section describes the software requirements in detail, in subsections covering interface
requirements, functional requirements and non-functional requirements.
3.1.1 Client
The client side of the system will be an application with a user interface that is integrated into
a music listening application. This application gathers information from users, monitors some of
their actions, and provides the connection to the server. It is the client-side interface of the
Music Recognition System and includes the functionality of the host music environment, such as
playing music.
3.1.2 Server
The server-side system will hold all of the preprocessed data. The trained model is retrained
in order to produce a more precise and accurate model.
• Speed: The system should generate and provide personalized recommendations to the users
in a reasonable time.
3.2.2 Design constraints
• Hardware Constraints: The system will be integrated with an Android application. To use the
recognition system, the user needs a personal mobile device or tablet with an internet
connection.