
Symbiosis Institute of Technology

Information Technology Department

April Internship Report

Submitted to: Prof. Rahul Joshi


Report by: Akash Dholaria
PRN: 16070122002
Batch: 2016-2020
Date: 1st April 2019
Company: Medialytics
Company Mentor: Mr. Anup B
Week 1
The main task for this week was to design an algorithm that analyses the trends in the
database for a given time period and returns the top 5 dominating trends if the time period
is already over, or the trends most likely to dominate in the future if the time period has
not ended yet.
All previous trends were processed to find their start and end times, using the unique IDs of
their entries in our database. I wrote a Python module for this.
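The start/end-time step can be illustrated with a minimal sketch. It assumes each database entry reduces to a (trend ID, timestamp) pair; the function name `trend_lifetimes` and the sample entries are hypothetical and are not taken from the actual module.

```python
from datetime import datetime


def trend_lifetimes(entries):
    """Map each trend ID to its (first_seen, last_seen) timestamps.

    `entries` is an iterable of (trend_id, seen_at) pairs; this shape is an
    assumption made for illustration, not the real database schema.
    """
    lifetimes = {}
    for trend_id, seen_at in entries:
        if trend_id not in lifetimes:
            lifetimes[trend_id] = (seen_at, seen_at)
        else:
            start, end = lifetimes[trend_id]
            lifetimes[trend_id] = (min(start, seen_at), max(end, seen_at))
    return lifetimes


# Illustrative entries only.
entries = [
    ("t1", datetime(2019, 3, 4, 9, 0)),
    ("t2", datetime(2019, 3, 4, 10, 0)),
    ("t1", datetime(2019, 3, 4, 13, 0)),
]
lifetimes = trend_lifetimes(entries)
```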
After processing, the trends were analysed using some basic statistical methods. I later
documented the results and presented them, along with the algorithm I came up with, to the
entire team. The algorithm was again implemented as a Python module.
The details of the algorithm can be found here:
https://docs.google.com/document/d/19K7SOo8qD6L5pKlN2lTI7aZVf4AYdyspsRweeFxHsik/edit
The algorithm observes the rank of each trend over a particular time interval. If the rank of
a trend decreased, it was not considered for our Top 3 trends; if the rank increased over a
period of time, the trend was given higher priority over the others, as it has a higher chance
of trending in the upcoming hours.
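The rank-based selection can be sketched roughly as follows. This is only one possible reading of the description: it assumes rank 1 is the most popular position, that a "rank increase" means moving towards rank 1, and that ties are broken by the latest rank. The function name `rising_trends` and the sample history are hypothetical.

```python
def rising_trends(rank_history, top_n=3):
    """Pick the trends whose rank improved the most over the interval.

    `rank_history` maps a trend to its ranks at successive observations,
    with 1 assumed to be the top rank. Trends that fell are excluded.
    """
    candidates = []
    for trend, ranks in rank_history.items():
        if len(ranks) < 2:
            continue  # not enough observations to see a direction
        improvement = ranks[0] - ranks[-1]  # positive when the trend climbed
        if improvement < 0:
            continue  # the trend dropped, so it is not considered
        candidates.append((trend, improvement, ranks[-1]))
    # Largest improvement first; ties go to the better (lower) latest rank.
    candidates.sort(key=lambda item: (-item[1], item[2]))
    return [trend for trend, _, _ in candidates[:top_n]]


# Illustrative rank histories only.
history = {
    "#election": [8, 5, 2],
    "#cricket": [1, 3, 6],
    "#budget": [4, 4, 3],
    "#movie": [10, 9, 9],
}
top_trends = rising_trends(history)
```

Here `#cricket` is dropped because it fell from rank 1 to rank 6, while `#election` leads because it climbed the most.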
These trends, along with the trends from Google and YouTube, were to be put in a table
generated daily to help the artists produce media artifacts based on these topics only.
To avoid meaningless, weekly repeating trends without real-world context (e.g.
#ThursdayThoughts), a list of stopwords was generated and used to keep such entries out of
the table.
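A stopword filter of this kind might look like the sketch below. Only #ThursdayThoughts comes from the report; the second stopword, the case-insensitive comparison and the function name `filter_trends` are assumptions added for illustration.

```python
# Recurring hashtags with no real-world context. "#mondaymotivation" is a
# hypothetical extra entry; only #ThursdayThoughts is named in the report.
STOPWORDS = {"#thursdaythoughts", "#mondaymotivation"}


def filter_trends(trends, stopwords=STOPWORDS):
    """Return the trends that are not in the stopword list (case-insensitive)."""
    return [trend for trend in trends if trend.lower() not in stopwords]


daily_trends = ["#ThursdayThoughts", "#Budget2019", "#IPL"]
table_rows = filter_trends(daily_trends)
```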

Week 2
This week started with refining the code we had created previously, which had the database
and API credentials hardcoded. The code was divided into proper packages and the credentials
were externalized using config files in Python.
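One common way to externalize credentials in Python is the standard-library `configparser` module. The sketch below is an assumption about how this could be done, not the project's actual layout: the section names, keys and placeholder values are all hypothetical.

```python
import configparser

# Stand-in for the contents of a config file kept outside version control.
# Section names, keys and values are placeholders for illustration.
SAMPLE_CONFIG = """
[database]
host = localhost
user = report_user
password = change-me

[twitter]
api_key = placeholder-key
"""


def load_settings(text):
    """Parse INI-style settings from a string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


settings = load_settings(SAMPLE_CONFIG)
db_host = settings["database"]["host"]
api_key = settings.get("twitter", "api_key")
```

In a real package the parser would read a file path with `parser.read(...)` instead of an inline string, so the same code runs against different credentials without modification.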
This week’s main target was to analyze the dumps of Twitter data we had been accumulating.
We worked as a team to complete this project. Since we collect the daily Twitter trends,
along with the tweets for each topic in a trend and their polarities, we decided to first
compute a weighted average polarity for each topic. For this we gave equal weight to
user_followers (the number of followers of the user who tweeted), favourite_count and
retweet_count for each tweet, calculated their average, and assigned it as the weight of
that tweet's polarity.
As an example, if Priyanka Chopra tweeted and her tweet had a very high retweet count,
favorite count and number of followers, the polarity of her tweet would be given much more
importance in that topic. We then averaged the polarities within each topic to find out how
positive or negative each topic was in that trend cycle.
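The weighting described above can be sketched as follows. This is one plausible reading, assuming the tweet weight is the plain mean of the three counts and the topic polarity is the weight-normalised sum; the function names and the sample tweets are hypothetical.

```python
def tweet_weight(tweet):
    """Mean of the three engagement counts, each given equal weight."""
    return (
        tweet["user_followers"]
        + tweet["favourite_count"]
        + tweet["retweet_count"]
    ) / 3


def topic_polarity(tweets):
    """Weighted average polarity of all tweets in one topic."""
    if not tweets:
        return 0.0  # assumption: an empty topic is treated as neutral
    weights = [tweet_weight(t) for t in tweets]
    total = sum(weights)
    if total == 0:
        # No engagement at all: fall back to an unweighted mean.
        return sum(t["polarity"] for t in tweets) / len(tweets)
    weighted_sum = sum(w * t["polarity"] for w, t in zip(weights, tweets))
    return weighted_sum / total


# Illustrative tweets only: one high-reach positive, one low-reach negative.
tweets = [
    {"user_followers": 2700, "favourite_count": 200, "retweet_count": 100, "polarity": 0.8},
    {"user_followers": 20, "favourite_count": 5, "retweet_count": 5, "polarity": -0.5},
]
polarity = topic_polarity(tweets)
```

With raw counts, follower numbers tend to dominate the weight because they are usually far larger than favourite or retweet counts; normalising each count first would be one way to balance them, though the report does not say whether this was done.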

Week 3
Having reached a good number of posts on Instagram, we were given the task of collecting and
analysing the data from our Instagram channel. The data was collected through the Instagram
Insights API; the JSON responses were parsed and dumped into a database whose schema we also
designed. This data was then to be used to analyse the audience across various categories by
means of exploratory data analysis. I wrote a tutorial for everyone in the office on using
Seaborn as the library for exploratory data analysis, and these were the results from their code:

Week 4

I took a short course on Social Media Intelligence on Coursera, which covered different ways
of relating the parameters of YouTube and Twitter trends to gain insights on other channels.

Here is the document containing my summary of the course:
https://docs.google.com/document/d/1Xibj6lVTuTBUSaK10sZMjSEB0cdcrqsQ204LxnSnjaU/edit
I applied what I learned in this course to find correlations between the views and likes of
YouTube videos and the impressions and likes of Instagram posts.
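The report does not say which correlation measure was used. As a minimal sketch, assuming the Pearson correlation coefficient, the relationship between two metrics could be computed like this; the function name `pearson` and the sample figures are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Illustrative figures only, not data from the internship.
video_views = [1200, 3400, 560, 7800, 2300]
video_likes = [90, 310, 40, 650, 150]
views_likes_r = pearson(video_views, video_likes)
```

A value close to 1 would indicate that videos with more views also tend to get more likes; a value near 0 would indicate no linear relationship.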

Summary:
The month focused on data processing and data science: building simple yet effective models
to gain insights into our social media trends, so that we can provide better guidance to the
digital marketing team and optimize the use of our digital marketing budget.
