APP(Machine Learning)
SUBMITTED BY- RAHUL BANSAL AND TUSHAR BAHETI
OBJECTIVE OF PROJECT:
To build a web application that generates a video from the content of a
news article.
Our app presents the article pictorially, which is more informative than a
block of text.
The video also helps the user understand the article more easily.
DESCRIPTION :
Tools/Platform
1. Python, NLP
2. Jupyter Notebook (Anaconda distribution)
3. Libraries - BeautifulSoup, Pillow, OpenCV, gTTS
Frontend
1. HTML, CSS, Javascript
Backend
1. Flask
Software Requirements:
The app starts with the user providing the URL of a news article.
Currently only Hindustan Times article URLs are accepted.
Beautiful Soup scrapes the article content from the given URL.
The text is summarized using Natural Language Processing.
Images are then downloaded from Google Images based on the summary text.
Subtitles are added to the downloaded images.
Audio is generated from the summarized text.
The images and audio are combined into the video, which is our final
product.
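The subtitle step in the workflow above could be sketched with Pillow roughly as follows. The report does not show how the subtitles are drawn, so the function name add_subtitle, the strip height, the margin and the colours are all assumptions made for illustration.

```python
from PIL import Image, ImageDraw, ImageFont


def add_subtitle(image, text, strip_height=24, margin=6):
    """Return a copy of `image` with `text` drawn on a dark strip at the bottom.

    Hypothetical helper: the layout values are illustrative, not taken
    from the project.
    """
    framed = image.convert("RGB")  # convert() returns a new image object
    draw = ImageDraw.Draw(framed)

    # A dark strip behind the text keeps it readable on any picture.
    strip_top = framed.height - strip_height
    draw.rectangle([0, strip_top, framed.width, framed.height], fill=(0, 0, 0))

    # Pillow's built-in font; a TrueType font would look better in practice.
    font = ImageFont.load_default()
    draw.text((margin, strip_top + margin), text, fill=(255, 255, 255), font=font)
    return framed
```

A downloaded picture would be opened with Image.open(path), passed through add_subtitle together with one sentence of the summary, and saved with .save(...) before the video is assembled.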
NewsPaper Headline:
https://www.hindustantimes.com/india-news/pm-modi-saudi-prince-mohammed-bin-salman-hold-bilateral-talks-statement-shortly/story-xK1ug5C0xlwJF4v3JrQPhP.html
Current Progress:
The first module of our project scrapes the article content from the given
news article URL.
For this we use Beautiful Soup to extract the text from the article page.
What is BeautifulSoup and how to use it:
Beautiful Soup is a Python package for parsing HTML and XML documents,
including documents with malformed markup such as non-closed tags (it is
named after "tag soup"). It creates a parse tree for each parsed page that
can be used to extract data from HTML, which is useful for web scraping.
Current releases support Python 3; version 4.9.3 was the last to support
Python 2.7.
To use Beautiful Soup, first install the beautifulsoup4 package using pip
or conda.
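As a minimal sketch of the scraping step, the snippet below parses a small HTML string with Beautiful Soup (installed with "pip install beautifulsoup4") and pulls out the headline and paragraph text. The markup in SAMPLE_HTML and the function name extract_article are hypothetical; the real Hindustan Times page uses its own tags and class names, and downloading the page is left out here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded article page.
SAMPLE_HTML = """
<html><body>
  <h1>Sample headline</h1>
  <div class="story">
    <p>First paragraph of the article.</p>
    <p>Second paragraph.</p>
    <p>Third paragraph.</p>
  </div>
  <script>console.log("not article text");</script>
</body></html>
"""


def extract_article(html):
    """Return (headline, body_text) extracted from an article page."""
    # "html.parser" is the parser bundled with Python, so no extra install.
    soup = BeautifulSoup(html, "html.parser")

    heading = soup.find("h1")
    headline = heading.get_text(strip=True) if heading else ""

    # Only <p> elements are collected, so script content is ignored.
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    return headline, "\n".join(t for t in paragraphs if t)


headline, body = extract_article(SAMPLE_HTML)
print(headline)
print(body)
```

For a live article the HTML string would come from an HTTP client instead of SAMPLE_HTML, and the selectors would need to be adapted to the page structure.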
Second Module
In the second module we apply a text summarization algorithm to the scraped
text.
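The report does not say which summarization algorithm is used. As one possible illustration, the sketch below implements a simple frequency-based extractive summarizer using only the standard library: each sentence is scored by how often its words occur in the whole text, and the top-scoring sentences are returned in their original order. The stop-word list and the name summarize are assumptions, not the project's code.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "in", "is",
    "it", "of", "on", "the", "to", "was", "were", "with",
}


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences of `text`, in original order."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Word frequencies over the whole text, ignoring stop words.
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if w not in STOP_WORDS)

    def score(sentence):
        # Stop words are absent from `freq`, so they contribute zero.
        return sum(freq[t] for t in re.findall(r"[a-z']+", sentence.lower()))

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore reading order
    return " ".join(sentences[i] for i in chosen)


sample = ("Cats sleep a lot. Cats and dogs are pets. "
          "The weather is mild. Dogs chase cats.")
print(summarize(sample, max_sentences=2))
# prints: Cats and dogs are pets. Dogs chase cats.
```

Because the score is a plain sum, long sentences are favoured; dividing by sentence length is a common adjustment when that is a problem.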
We use the google_image library to search for images on Google.
From these images we make the video that is shown to the user.
From the summarized text we generate audio using the gTTS library.
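One way to turn the images into a video is a slideshow written with OpenCV, where each image is held long enough for the whole sequence to match the length of the narration. The sketch below is an assumption about how this could be done, not the project's implementation: the names frames_per_image and write_slideshow, the frame size, the frame rate and the "mp4v" codec are all illustrative. Note that OpenCV's VideoWriter produces a silent file; attaching the audio track needs a separate tool, and the report does not say which one is used.

```python
import math


def frames_per_image(audio_seconds, image_count, fps=24):
    """Frames to hold each image so the slideshow lasts as long as the audio."""
    if image_count <= 0:
        raise ValueError("need at least one image")
    return max(1, math.ceil(audio_seconds * fps / image_count))


def write_slideshow(image_paths, out_path, audio_seconds, size=(640, 360), fps=24):
    """Write a silent slideshow video and return the number of frames written."""
    import cv2  # imported here so the timing helper works without OpenCV

    hold = frames_per_image(audio_seconds, len(image_paths), fps)
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # assumed codec choice
    writer = cv2.VideoWriter(out_path, fourcc, fps, size)
    written = 0
    try:
        for path in image_paths:
            frame = cv2.imread(path)
            if frame is None:  # unreadable or missing file
                continue
            frame = cv2.resize(frame, size)  # frames must match `size` (width, height)
            for _ in range(hold):
                writer.write(frame)
                written += 1
    finally:
        writer.release()
    return written


print(frames_per_image(audio_seconds=10, image_count=5, fps=24))  # prints: 48
```

With a 10 second narration, 5 images and 24 frames per second, each image is held for 48 frames, that is 2 seconds.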
gTTS
There are several APIs available for converting text to speech in Python.
One of them is gTTS (Google Text-to-Speech), a Python library that
interfaces with the text-to-speech API of Google Translate. gTTS is an
easy-to-use tool that converts the entered text into audio, which can be
saved as an MP3 file.
gTTS supports several languages, including English, Hindi, Tamil, French,
German and many more. The speech can be delivered at either of the two
available speeds, fast (the default) or slow. However, as of the latest
update, it is not possible to change the voice of the generated audio.
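A minimal sketch of the audio step with gTTS might look like the following. The gTTS class and its text, lang and slow parameters, and the save method, belong to the library; the helper names clean_for_speech and synthesize are hypothetical. Calling synthesize needs network access, because the audio is produced by Google's service.

```python
def clean_for_speech(text):
    """Collapse runs of whitespace so scraped line breaks are not passed on."""
    return " ".join(text.split())


def synthesize(text, out_path, lang="en", slow=False):
    """Save `text` as spoken audio in an MP3 file (requires network access)."""
    from gtts import gTTS  # imported here so the helper above works offline

    # lang="hi" would select Hindi; slow=True selects the slower speed.
    speech = gTTS(text=clean_for_speech(text), lang=lang, slow=slow)
    speech.save(out_path)
    return out_path
```

For example, synthesize(summary_text, "summary.mp3") would write the narration for the summarized article, where summary_text is the output of the summarization step.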