Software Requirements Specification: Version 1.0 Approved

Software Requirements
Specification
for
Automatic Subtitle Generator
Version 1.0 approved
Prepared by:
Vaishnavi Kalantri 111603027

Kryselle Martis 111603031
Shivani Patil 111603046
COEP TY Computer Engineering
1st November 2018
Copyright © 1999 by Karl E. Wiegers. Permission is granted to use, modify, and distribute this document.
Software Requirements Specification for Automatic Subtitle Generator
Table of Contents
Table of Contents
Revision History
1. Introduction
    1.1 Purpose
    1.2 Document Conventions
    1.3 Intended Audience and Reading Suggestions
    1.4 Product Scope
    1.5 References
2. Overall Description
    2.1 Product Perspective
    2.2 Product Functions
    2.3 User Classes and Characteristics
    2.4 Operating Environment
    2.5 Design and Implementation Constraints
    2.6 User Documentation
    2.7 Assumptions and Dependencies
3. External Interface Requirements
    3.1 User Interfaces
    3.2 Hardware Interfaces
    3.3 Software Interfaces
    3.4 Communication Interfaces
4. System Features
    4.1 Select File
    4.2 Upload File
    4.3 Customization
    4.4 User Verification
    4.5 Real-time Subtitle Generation
    4.6 Direct Speech Subtitle Generation
    4.7 Download .srt file
5. Other Nonfunctional Requirements
    5.1 Performance Requirements
    5.2 Security Requirements
    5.3 Scalability
    5.4 Maintainability
Appendix A: Glossary
Appendix B: Analysis Models
Appendix C: To Be Determined List
Revision History
Name                            Date            Reason For Changes    Version
Vaishnavi, Kryselle, Shivani    1st November    Original SRS          1.0
1. Introduction
1.1 Purpose
The main objective of this system is to provide an automated way to generate subtitles for audio
and video files. The system will save time and reduce the manual work otherwise needed to
produce subtitles.
The system first extracts the audio and then passes the extracted audio to the available speech
recognition API, which converts the speech to text. The text is saved in a file with the
extension “.srt”. This “.srt” file can be opened in a media player to view the subtitles along
with the video.
1.2 Document Conventions

Headings are identified with larger font size and use of bold characters.
Hyperlinks are identified with blue font colors and underlines.
1.3 Intended Audience and Reading Suggestions

This product is intended for people who have difficulty understanding videos because no text
description is available. The target audience could be people who are deaf, those trying to learn
new languages, or those who simply want to become more familiar with languages they already know.
This document will help end users of the software learn what the project can do.
1.4 Product Scope

The software will be user friendly and will ultimately give people their videos with subtitles
shown below the picture, matching the audio. It will offer several features that cater to different
user requirements, through the following four options:
1. The user can choose to download and save the .srt subtitle file to be played as and when
required.
2. The user can upload the video file and download the edited video file with the subtitle track
inserted.
3. The user can stream the video file and have the subtitles produced below it
simultaneously, in real time.
4. The user can convert various video file formats into .mp4 format.
1.5 References
• IEEE SRS template - https://web.cs.dal.ca/~hawkey/3130/srs_template-ieee.doc
• Abhinav Mathur, Tanya Saxena, Generating Subtitles Automatically using Audio Extraction
and Speech Recognition, 7th International Conference on Contemporary Computing (IC3),
2015.
• Sadaoki Furui, Li Deng, Mark Gales, Hermann Ney, and Keiichi Tokuda, Fundamental
Technologies in Modern Speech Recognition, Signal Processing, IEEE Signal Processing
Society, November 2012.
2. Overall Description
2.1 Product Perspective

As the name indicates, the Automatic Subtitle Generator generates subtitles for an audio or video
file automatically, without the need to manually select a subtitle (.srt) file.
The purpose of the system is to help people who have difficulty understanding a new language, or
who are deaf, to understand the video or audio with ease.
The system is a real-time application that generates subtitles as the audio or video plays. The
application also supports conversion of video files of various formats into .mp4 files.
2.2 Product Functions

The software will take as input an audio file, a video file, or direct speech, along with the
user's preferences. With the help of artificial intelligence, it will be able to perform the
following tasks:
• Subtitle generation for audio (.mp3 file)
• Subtitle generation for video (.mp4 file)
• Subtitle generation for direct speech
• Translation of the subtitles generated
• Formatting of the subtitles generated
• Conversion of .rmvb, .avi, .mov, .mkv, etc. video files into .mp4 files
2.3 User Classes and Characteristics

The application targets people who have difficulty understanding videos because no text
description is available. The target audience could be people who are deaf, or those trying to learn
new languages. The application will be user friendly and will help people watch and understand
videos with ease.
2.4 Operating Environment

The software will be implemented on a web server. It is expected to work smoothly on any web
browser when accessed by the user.
2.5 Design and Implementation Constraints

• The web application is constrained by the availability of an Internet connection.
• The application supports only English subtitles.
• Subtitles can be generated only for .mp3 or .mp4 files. Other formats need to be converted first.
2.6 User Documentation

N/A
2.7 Assumptions and Dependencies

• The user has strong Internet connectivity.
• The user uploads a file of the appropriate format.
• The uploaded file is of good and clear quality.
• The uploaded file does not have any pre-existing subtitles.
• The product depends on the text generated by the Google Cloud Speech API.
3. External Interface Requirements
3.1 User Interfaces
3.1.1 Homepage
• Top:
Buttons to select one of the following options:
1. Convert: To convert any video file format to .mp4 file
2. Generate subtitles: To add subtitle to specified audio/video file
• Top Right:
1. Download: To download a subtitle file
3.1.2 Convert Window
• Buttons for selecting the video file format to be converted:

1. .avi
2. .rmvb
3. .mkv
4. .mov
Once clicked:
• Text input: To specify the path of the file to be converted
• Download button: To download the converted file.
• Back button: To go back to the homepage and continue with subtitle generation
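The conversion behind this window could be sketched as follows. This is a minimal illustration, not the project's implementation: it only builds an FFMPEG argument list for the four formats listed above, and the function name `build_convert_command` and the codec choices (`libx264`, `aac`) are assumptions that do not come from this SRS.

```python
from __future__ import annotations

from pathlib import Path

# Formats offered by the Convert window in section 3.1.2.
CONVERTIBLE_FORMATS = {".avi", ".rmvb", ".mkv", ".mov"}


def build_convert_command(source_path: str) -> list[str]:
    """Return an ffmpeg argument list that converts source_path to .mp4.

    Hypothetical helper: the name and the codec choices are assumptions.
    """
    source = Path(source_path)
    if source.suffix.lower() not in CONVERTIBLE_FORMATS:
        raise ValueError(f"Unsupported format: {source.suffix or '(none)'}")
    target = source.with_suffix(".mp4")
    return [
        "ffmpeg",
        "-y",                # overwrite the output file if it already exists
        "-i", str(source),   # input file chosen by the user
        "-c:v", "libx264",   # assumed video codec (H.264)
        "-c:a", "aac",       # assumed audio codec
        str(target),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_convert_command("lecture.avi")))
```

The list can be handed to `subprocess.run` on a machine where FFMPEG is installed; returning it rather than executing it keeps the format check easy to test on its own.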
3.1.3 Generate Subtitle Window
• 3-option tick box: To specify audio file upload, video file upload, or direct
speech input
• Upload and Browse button: To select audio/video file to be uploaded
• Language drop-down menu: To select one of 120 available languages.
• Colour palette: To select the text colour of the subtitles (optional field)
• Font size: To select the size of the font (optional field)
• Google reCAPTCHA: To authenticate user
• Go button: To begin generation of subtitles

Once clicked:
• Play button: To play the video/audio generated with subtitles in real time
• Download button: To download the .srt file generated
• Back button: To go back to the homepage
3.2 Hardware Interfaces

For generating subtitles for direct speech, a microphone is required to provide the speech input,
and the device must be able to accept that input.
Other than this, the project being a web application, it has no direct hardware interface
requirements. The application can run on any device, such as a mobile phone, laptop, or personal
computer, that has Internet access and grants microphone permissions.
3.3 Software Interfaces

• Google Cloud Speech API: To generate real-time subtitles
• PyAudio: To generate a .wav file that contains the extracted audio
• FFMPEG: To attach the subtitle file to video/audio, convert audio/video files, compress
audio, etc.
• FFmpy: To wrap FFMPEG in a Python environment
• Flask: Python server-side web framework for building the application
• HTML: To build the application GUI
• Virtualenv: To create the virtual environment in which the app will operate
• Google App Engine Standard Environment: To deploy the application to the cloud and make
it accessible to anyone with the URL
3.4 Communication Interfaces

Since the data associated with the application is stored on Google servers, the data needed for the
application to function properly will have to be downloaded into a local cache via the HTTP and
FTP protocols.
4. System Features
4.1 Select File
4.1.1 Description and Priority

The user can select a video or audio file of any format for the generation of subtitles.
This step is mandatory and of the highest priority, as it determines the type of data on which
the program will run.
4.1.2 Stimulus/Response Sequences

Three tick boxes will be available right at the beginning to indicate whether the input will
be an audio file, a video file, or direct speech.
4.1.3 Functional Requirements

REQ-1-1: The three tick boxes “Audio file”, “Video file”, “Direct speech input”
4.2 Upload File
4.2.1 Description and Priority

The user can select the file to be uploaded if the audio or video option was chosen. For
direct speech input, the “Record” button has to be clicked to enable recording.
4.2.2 Stimulus/Response Sequences

If the audio/video file option is selected, there will be a “Browse” button to select the file to
be uploaded. For direct speech input, a “Record” button will be provided.
If an audio/video file is uploaded, the application will report an error to the user if the file is
detected to be corrupt or infected with a virus. If speech input is selected and the microphone
permissions on the user's device are not in order, an error is reported.
4.2.3 Functional Requirements

REQ-2-1: The “Browse” button to select audio/video file
REQ-2-2: User must have file present on device, or for direct speech, microphone
permissions should be enabled
4.3 Customization
4.3.1 Description and Priority

The user can define the language of the generated subtitles and the formatting options of the
text. These are additional functionalities and are not prioritized. If nothing is selected, the
defaults are English for the language, white for the text colour, and 15 for the text size.
4.3.2 Stimulus/Response Sequences

There will be a drop-down menu with all available languages, from which the required one can be
selected. Similarly, there will be drop-down menus for the text size and text colour of the
subtitles.
4.3.3 Functional Requirements

REQ-3-1: The “Select language” drop-down menu
REQ-3-2: The “Select text size” drop-down menu
REQ-3-3: The “Select text colour” drop-down menu
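The default handling described in this section could look roughly like the sketch below. The defaults (English, white, size 15) come from the description above; the class name `SubtitlePreferences`, the form field names, and the treatment of blank fields as "not selected" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubtitlePreferences:
    """Customization options with the defaults stated in this section."""
    language: str = "English"
    text_colour: str = "white"
    text_size: int = 15


def preferences_from_form(form: dict) -> SubtitlePreferences:
    """Build preferences from submitted form fields.

    The field names are hypothetical; missing or blank fields fall back
    to the defaults.
    """
    defaults = SubtitlePreferences()
    return SubtitlePreferences(
        language=form.get("language") or defaults.language,
        text_colour=form.get("text_colour") or defaults.text_colour,
        text_size=int(form.get("text_size") or defaults.text_size),
    )


if __name__ == "__main__":
    print(preferences_from_form({}))
    print(preferences_from_form({"language": "Hindi", "text_size": "20"}))
```

Keeping the defaults in one place means the three drop-down menus can remain optional without each one needing its own fallback logic.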
4.4 User Verification
4.4.1 Description and Priority

The user will have to correctly complete the reCAPTCHA provided by the app. This is of the
highest priority, as the uploaded/recorded file will not be processed unless this step is
completed successfully.
4.4.2 Stimulus/Response Sequences

The reCAPTCHA is a mandatory security step. Once it has been successfully
completed, the “Go” button will be enabled.
4.4.3 Functional Requirements

REQ-4-1: The browser will be reCAPTCHA enabled
REQ-4-2: The system should be resistant to bot attacks
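A server-side check for this step might be sketched as below, assuming the server verifies the token produced by the reCAPTCHA widget against Google's `siteverify` endpoint. The SRS does not describe the verification mechanism, so the function names and the decision to enable “Go” only when the response reports `success` are assumptions; the sketch builds the request and interprets a response body but does not send anything.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_siteverify_request(secret_key: str, client_token: str) -> Request:
    """Prepare the POST request that asks Google to validate a token.

    secret_key is the site's private key and client_token is the value the
    reCAPTCHA widget submits with the form (both supplied by the caller).
    """
    body = urlencode({"secret": secret_key, "response": client_token}).encode("ascii")
    return Request(SITEVERIFY_URL, data=body, method="POST")


def go_button_enabled(siteverify_body: str) -> bool:
    """Return True only if the verification response reports success."""
    try:
        result = json.loads(siteverify_body)
    except json.JSONDecodeError:
        return False
    return isinstance(result, dict) and result.get("success") is True


if __name__ == "__main__":
    request = build_siteverify_request("my-secret-key", "token-from-widget")
    print(request.get_method(), request.full_url)
    print(go_button_enabled('{"success": true}'))
```

Sending the prepared request with `urllib.request.urlopen` and passing the decoded body to `go_button_enabled` would complete the check; treating malformed responses as failures keeps the file from being processed when verification cannot be confirmed.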
4.5 Real-time Subtitle Generation
4.5.1 Description and Priority

This will start the backend functionalities that generate the final playable file. Once the video
with subtitles has been generated, the video file will be displayed on the screen with an option
to play. It is a mandatory step if the audio/video file option is selected.
4.5.2 Stimulus/Response Sequences

Once the “Go” button has been clicked, the back-end processes will start, and the audio is
extracted in .wav format with the help of PyAudio. The Google Cloud Speech API then converts
the audio (speech) into text and returns a text file with the subtitles. The .srt file is then
integrated with the video/audio using FFMPEG, and the resulting file, with the integrated
subtitles, will be played on clicking the “Play” button.
4.5.3 Functional Requirements

REQ-5-1: The “Go” button to start background subtitle generation
REQ-5-2: The PyAudio utility will extract the audio from the uploaded file.
REQ-5-3: The Google Cloud Speech API will convert this audio file into text and return
a .srt file
REQ-5-4: The FFMPEG utility will integrate this .srt file with the uploaded audio/video.
REQ-5-5: The “Play” button to prompt playing of the file integrated with the .srt file
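The extraction and integration steps could be sketched as two FFMPEG command lines. Note that this SRS names PyAudio for the extraction step; the sketch below uses FFMPEG for both steps instead, purely as an illustration. The function names, the 16 kHz mono .wav settings, and the `mov_text` subtitle codec for .mp4 output are assumptions.

```python
from __future__ import annotations


def build_extract_audio_command(media_path: str, wav_path: str) -> list[str]:
    """Argument list that writes the audio track of media_path to a .wav file."""
    return [
        "ffmpeg", "-y",
        "-i", media_path,
        "-vn",                   # ignore the video stream
        "-acodec", "pcm_s16le",  # 16-bit PCM audio
        "-ar", "16000",          # assumed sample rate in Hz
        "-ac", "1",              # mono, a common input for speech recognition
        wav_path,
    ]


def build_attach_subtitles_command(
    video_path: str, srt_path: str, output_path: str
) -> list[str]:
    """Argument list that adds srt_path as a subtitle track of an .mp4 file."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-i", srt_path,
        "-c", "copy",        # copy the audio and video streams unchanged
        "-c:s", "mov_text",  # subtitle codec accepted by the .mp4 container
        output_path,
    ]


if __name__ == "__main__":
    # Print the commands rather than running them.
    print(" ".join(build_extract_audio_command("talk.mp4", "talk.wav")))
    print(" ".join(build_attach_subtitles_command("talk.mp4", "talk.srt", "talk_subbed.mp4")))
```

The speech recognition call that sits between the two commands is left out, since its request format depends on the API version chosen.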
4.6 Direct Speech Subtitle Generation
4.6.1 Description and Priority

This will start the recording of the speech input. Once the black-screen video with subtitles
has been generated, the video file will be displayed on the screen with an option to play. It is
a mandatory step if the direct speech input option is selected.
4.6.2 Stimulus/Response Sequences

Once the “Record” button has been clicked, the Google Cloud Speech API will generate the
subtitles on the screen.
4.6.3 Functional Requirements

REQ-6-1: The “Record” button to start background subtitle generation
REQ-6-2: The Google Cloud Speech API will generate the subtitles for the speech.
4.7 Download .srt file
4.7.1 Description and Priority

This will start the download of the generated .srt file. It is optional.
4.7.2 Stimulus/Response Sequences

Once the “Download” button has been clicked, the download starts immediately.
4.7.3 Functional Requirements

REQ-7-1: The “Download” button to prompt download
REQ-7-2: User disk space availability
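For reference, an .srt file is plain text made of numbered cues, each with a `start --> end` time line in `HH:MM:SS,mmm` form followed by the subtitle text and a blank line. The sketch below shows one way the downloadable file could be assembled from timed text segments; the function names and the `(start, end, text)` tuple layout are assumptions, not part of this SRS.

```python
from __future__ import annotations


def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Render (start_seconds, end_seconds, text) tuples as SRT content."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by one blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "World")]))
```

The returned string can be written to a file with the .srt extension and served when the “Download” button is clicked.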
5. Other Nonfunctional Requirements
5.1 Performance Requirements

There are two main indicators of performance: accuracy and speed. The software will generate the
subtitle file (.srt format) quickly. Speed is measured in seconds per word, and the estimated
average is 0.4 seconds per word. The real-time feature will additionally reduce waiting time. The
software will be accurate and will ensure no lag between the speech and the subtitles displayed
below.
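As a worked example of the speed target, a transcript of 1,500 words (an assumed length, chosen only for illustration) at 0.4 seconds per word would take about 1,500 × 0.4 = 600 seconds, or 10 minutes, to process. The same estimate as a small sketch, with a hypothetical function name:

```python
SECONDS_PER_WORD = 0.4  # estimated average from section 5.1


def estimated_generation_seconds(
    word_count: int, seconds_per_word: float = SECONDS_PER_WORD
) -> float:
    """Estimate the subtitle generation time for a transcript of word_count words."""
    if word_count < 0:
        raise ValueError("word_count must not be negative")
    return word_count * seconds_per_word


if __name__ == "__main__":
    seconds = estimated_generation_seconds(1500)
    print(f"{seconds:.0f} seconds, about {seconds / 60:.0f} minutes")
```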
5.2 Security Requirements

The web application will provide a reCAPTCHA feature to prevent bot infiltration and unwanted
excessive traffic generated by viruses. The web application will also ensure that files being
converted are of an appropriate video/audio format and will check for attempts to upload
executable file formats.
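The format check could start with a file name filter like the sketch below. The accepted audio/video extensions are taken from sections 2.5 and 3.1.2; the list of executable extensions and the function name `check_upload` are assumptions. A check on the file name alone is only a first filter and does not inspect the file contents.

```python
from pathlib import Path

# Formats accepted for subtitle generation (section 2.5) and conversion (3.1.2).
SUBTITLE_INPUT_FORMATS = {".mp3", ".mp4"}
CONVERTIBLE_FORMATS = {".avi", ".rmvb", ".mkv", ".mov"}
# Illustrative, non-exhaustive list of executable extensions (an assumption).
EXECUTABLE_FORMATS = {".exe", ".bat", ".cmd", ".com", ".msi", ".sh"}


def check_upload(filename: str) -> str:
    """Return "subtitle" or "convert" for an accepted file name.

    Raises ValueError for executable or otherwise unsupported names.
    """
    path = Path(filename)
    # Look at every extension so that names like "movie.exe.mp4" are rejected.
    if {suffix.lower() for suffix in path.suffixes} & EXECUTABLE_FORMATS:
        raise ValueError("Executable file formats are not accepted")
    extension = path.suffix.lower()
    if extension in SUBTITLE_INPUT_FORMATS:
        return "subtitle"
    if extension in CONVERTIBLE_FORMATS:
        return "convert"
    raise ValueError(f"Unsupported file format: {extension or '(none)'}")


if __name__ == "__main__":
    for name in ("talk.mp3", "clip.MOV"):
        print(name, "->", check_upload(name))
```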
5.3 Scalability
The domain appspot.com (provided by the Google App Engine Standard Environment) is a multi-server
platform with a large capacity, so the number of users able to visit the site is not expected to
be a limiting factor.
5.4 Maintainability
The software is easy to maintain: as and when updates to the various components become available,
their features can be integrated to improve performance. Once updated, the web application will be
available to the user on refreshing the page, and no additional installation will be required.
Appendix A: Glossary
• Everywhere in the SRS, app means application.
• The application will run as a web application.
Appendix B: Analysis Models

N/A
Appendix C: To Be Determined List
• The detailing of the UI like fonts, colors, positioning of elements of the UI.
