
Big Data Analytics

Assignment

Submitted by: Vaibhav Singh
14B00033
CSC

Write a program in PySpark for each of the following questions:
1. To increment each number in a list by one.

l1 = sc.parallelize([1, 2, 3, 4, 5])
l1.collect()
l2 = l1.map(lambda x: x + 1)
l2.collect()

Output = [2,3,4,5,6]

2. To multiply each number in a list by 10.

l1 = sc.parallelize([1, 2, 3, 4, 5])
l1.collect()
l2 = l1.map(lambda x: x * 10)
l2.collect()

Output = [10,20,30,40,50]



3. To find the most commonly occurring words with their associated frequencies.

from operator import add

s=["a","b","a","c","a"]

s1=sc.parallelize(s)

s2 = s1.map(lambda x: (x, 1)).reduceByKey(add)

print(s2.collect())

Output = [("a",3),("b",1),("c",1)]



4. Find the frequency of each state:
State = ["delhi", "HP", "HR", "HR", "UP"]

from operator import add

s=["delhi","HP","HR","HR","UP"]

s1=sc.parallelize(s)

s2 = s1.map(lambda x: (x, 1)).reduceByKey(add)

print(s2.collect())

Output = [("delhi",1),("HP",1),("HR",2),("UP",1)]

5. To print the even numbers out of a list of numbers.

l1 = sc.parallelize([1, 2, 3, 4, 5, 6])
l1.collect()
l2 = l1.filter(lambda x: x % 2 == 0)
print(l2.collect())

Output = [2,4,6]

6. Write the Spark commands to perform a join operation between two files.
Each file contains a person's name, DOB, and age. Group the persons by age.

l1 = sc.textFile("/home/1.txt")
l2 = sc.textFile("/home/2.txt")
l3 = l1.map(lambda x: tuple(x.split()))
l4 = l3.map(lambda rec: (rec[0], rec[2]))   # (name, age)
l5 = l2.map(lambda x: tuple(x.split()))
l6 = l5.map(lambda rec: (rec[0], rec[2]))   # (name, age)
l7 = l6.join(l4)
print(l7.collect())

Output = [("a", ("23", "25")), ("s", ("20", "24")), ("m", ("21", "20"))]
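
The commands above only perform the join by name. The question also asks to group the persons by age; a minimal sketch of that extra step, assuming the same (name, DOB, age) file layout and reusing the hypothetical path, could be:

people = sc.textFile("/home/1.txt").map(lambda x: tuple(x.split()))
by_age = people.map(lambda rec: (rec[2], rec[0]))      # (age, name)
grouped = by_age.groupByKey().mapValues(list)          # age -> list of names
print(grouped.collect())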

7. Differentiate between map and flatMap.
Here is an example of the difference (shown in the Scala spark-shell):
val textFile = sc.textFile("README.md") // create an RDD of lines of text

// MAP:

textFile.map(_.length) // map over the lines:

res2: Array[Int] = Array(14, 0, 71, 0, 0, ...)

// -> one length per line

// FLATMAP:

textFile.flatMap(_.split(" ")) // split each line into words:

res3: Array[String] = Array(#, Apache, Spark, ...)

// -> multiple words per line, and multiple lines


// - but we end up with a single output array of words

map transforms an RDD of length N into another RDD of length N.
For example, it maps from N lines into N line-lengths.

flatMap (loosely speaking) transforms an RDD of length N into a collection of N collections,
then flattens these into a single RDD of results.
For example, flatMapping from a collection of lines to a collection of words:

["aa bb cc", "", "dd"] => [["aa","bb","cc"],[],["dd"]] => ["aa","bb","cc","dd"]

The input and output RDDs will therefore typically be of different sizes.

(You may need to call collect() on the RDDs generated in the examples above - I have
omitted this for clarity)
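
Since the assignment uses PySpark, the same contrast can be sketched in Python as well; the small in-memory list below is an assumed stand-in for README.md:

lines = sc.parallelize(["aa bb cc", "", "dd"])

# map: exactly one output element per input line (here, its length)
print(lines.map(lambda line: len(line)).collect())         # [8, 0, 2]

# flatMap: each line yields several words, flattened into one RDD
print(lines.flatMap(lambda line: line.split()).collect())  # ['aa', 'bb', 'cc', 'dd']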
