Добро пожаловать в Scribd!

Пропустить карусель

Consumer Project

Загружено:

Partha Sarathy

0% нашли этот документ полезным (0 голосов)

63 просмотров2 страницы

Project Hadoop

Авторское право

Доступные форматы

DOCX, PDF, TXT или читайте онлайн в Scribd

Поделиться этим документом

Поделиться или встроить документ

Параметры публикации

Этот документ был вам полезен?

Это неприемлемый материал?

Пожаловаться на этот документ

Project Hadoop

Авторское право:

Доступные форматы

Скачайте в формате DOCX, PDF, TXT или читайте онлайн в Scribd

Отметить как неприемлемый контент

0% нашли этот документ полезным (0 голосов)

63 просмотров2 страницы

Consumer Project

Загружено:

Partha Sarathy

Project Hadoop

Авторское право:

Доступные форматы

Скачайте в формате DOCX, PDF, TXT или читайте онлайн в Scribd

Отметить как неприемлемый контент

Перейти к странице

Вы находитесь на странице: 1из 2

Поиск в документе

Analyzing Consumer Complaints.

Hadoop

can be applied with many projects of various domains:

Data consolidation
Machine learning
Specialized analysis (domain specific)

This project comes under Data consolidation

Data consolidation
This is like "enterprise data hub" or "data lake." The idea is you
have disparate data sources, and you want to perform analysis across
them. This type of project consists of getting feeds from all the
sources (either real time or as a batch) and transferring them into
Hadoop. Sometimes this is the first step for becoming a data-driven
company. Data lakes usually materialize as files on HDFS and tables
in pig/Hive or Impala and then we can create simply pretty reports on
the data contained in hdfs.

About data:
This is publically available data set and it is about the consumer
complaints. In the process consumer files the complaints to consumer
forum and then consumer forum forwards the complaint to respective
company.
Below is the description of the data set
Column heading

inde
x
0

Description

Product
Sub-product
Issue
Sub-issue
Consumer complaint
narrative
Company public response

1
2
3
4
5

Type of the product

Sub product type
Issue faced by the consumer
Any sub issues if exists
Detailed description of complaint

Company
State

7
8

ZIP code
Submitted via

9
10

Companys public response to the

complaint
Name of the company
State from which consumer filed the
complaint
Zip code
Channel from which complaint was
submitted

Date received

date on which consumer filed the

complaint

Date sent to company

Company response to
consumer
Timely response?
Consumer disputed?
Complaint ID

Date on which consumer forum

forwarded the complaint to company
Companys
response to the consumer

13
14
15

Unique complaint id

This data is comma delimited.

In some rows there are few columns which are enclosed in double quotes
and have many commas and due to this the same column gets spitted in to
many columns for ex:

Sample record:
10/16/2015,Debt collection,"Other (phone, health club, etc.)",Cont'd
attempts collect debt not owed,Debt was discharged in
bankruptcy,,,"Convergent Resources, Inc.",OH,438XX,Web,10/16/2015,Closed
with explanation,Yes,,1612132
This entire column "Other (phone, health club, etc.)"
Should be product but, if we split this file based on comma then this
column will be splitted into 3 columns which will result in wrong
outputs.
In order to tackle this we should remove the commas present only inside
double quotes.
Since Hadoop is used to handle big data its recommended to use java map
reduce to remove unnecessary commas.

Note: work on this problem statements after doing the data cleaning as
mentioned above.
Problem statement.
1. Write a pig script to find no of complaints which got timely
response
2. Write a pig script to find no of complaints where consumer forum
forwarded the complaint same day they received to respective
company
3. Write a pig script to find list of companies toping in complaint
chart (companies with maximum number of complaints)
4. Write a pig script to find no of complaints filed
with product
type has "Debt collection" for the year 2015

Technologies used:
Flume
Map reduce
Pig

Вам также может понравиться

Web Dev Syllabus PDF
Документ5 страниц
Web Dev Syllabus PDF
Kapten Belalang
Оценок пока нет
Single Node Hadoop 2.x Cluster Setup On CentOS V0.3
Документ23 страницы
Single Node Hadoop 2.x Cluster Setup On CentOS V0.3
Partha Sarathy
Оценок пока нет
Tamil Bible தமிழ் வேதாகமம்
Документ1 467 страниц
Tamil Bible தமிழ் வேதாகமம்
arputhaa2835
100% (1)
Goverment Schools
Документ2 страницы
Goverment Schools
Partha Sarathy
Оценок пока нет
English Bhajan Ayyappan
Документ69 страниц
English Bhajan Ayyappan
banalanaveen
94% (17)
ACS Unit 1
Документ9 страниц
ACS Unit 1
Partha Sarathy
100% (1)
Micro Controller
Документ6 страниц
Micro Controller
Partha Sarathy
Оценок пока нет
List of Matric HSS
Документ18 страниц
List of Matric HSS
glen4u
Оценок пока нет
Life and Employability Skill Practical
Документ5 страниц
Life and Employability Skill Practical
Partha Sarathy
50% (2)
Formula Sheet
Документ1 страница
Formula Sheet
Saleem Sheikh
Оценок пока нет
Question Paper Code:: Reg. No.
Документ6 страниц
Question Paper Code:: Reg. No.
SindhujaSindhu
Оценок пока нет
PI Controller
Документ1 страница
PI Controller
Partha Sarathy
Оценок пока нет
Introduction To Simulink
Документ51 страница
Introduction To Simulink
Sanveer Krishna
Оценок пока нет
AC Circuits: Inductive Reactance, Capacitive Reactance, Impedance and Admittance
Документ17 страниц
AC Circuits: Inductive Reactance, Capacitive Reactance, Impedance and Admittance
Partha Sarathy
Оценок пока нет
Wind Energy Systems (Johnson)
Документ449 страниц
Wind Energy Systems (Johnson)
Mario Shawn Hayden Jr
100% (1)
DDUGKY Guidelines 9dec2014
Документ68 страниц
DDUGKY Guidelines 9dec2014
Partha Sarathy
Оценок пока нет
Journal
Документ6 страниц
Journal
Partha Sarathy
Оценок пока нет
PIC16F84A DataSheet
Документ88 страниц
PIC16F84A DataSheet
Emanuele
Оценок пока нет
Gunsmithing and Tool Making Bible by Harold Hoffman
Документ294 страницы
Gunsmithing and Tool Making Bible by Harold Hoffman
Erik Green
100% (7)
Simulation
Документ15 страниц
Simulation
Partha Sarathy
Оценок пока нет
25 - Vvimp Interview Questions
Документ33 страницы
25 - Vvimp Interview Questions
Japneet Singh
Оценок пока нет
RDBMS Vs ODBMS
Документ3 страницы
RDBMS Vs ODBMS
Nilofer Shaikh
100% (1)
First Phase Final
Документ21 страница
First Phase Final
Partha Sarathy
Оценок пока нет
SQL 1 - Volume III
Документ40 страниц
SQL 1 - Volume III
vishav_20003214
Оценок пока нет
A State-Space Modeling of Non - Ideal
Документ8 страниц
A State-Space Modeling of Non - Ideal
Partha Sarathy
Оценок пока нет
SQL 1 - Volume III
Документ40 страниц
SQL 1 - Volume III
vishav_20003214
Оценок пока нет
PMP An Overview
Документ28 страниц
PMP An Overview
Partha Sarathy
Оценок пока нет
Alternator
Документ9 страниц
Alternator
Partha Sarathy
Оценок пока нет
The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life
От Everand
The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life
Mark Manson
Рейтинг: 4 из 5 звезд
4/5 (5783)
The Yellow House: A Memoir (2019 National Book Award Winner)
От Everand
The Yellow House: A Memoir (2019 National Book Award Winner)
Sarah M. Broom
Рейтинг: 4 из 5 звезд
4/5 (98)
Never Split the Difference: Negotiating As If Your Life Depended On It
От Everand
Never Split the Difference: Negotiating As If Your Life Depended On It
Chris Voss
Рейтинг: 4.5 из 5 звезд
4.5/5 (838)
Shoe Dog: A Memoir by the Creator of Nike
От Everand
Shoe Dog: A Memoir by the Creator of Nike
Phil Knight
Рейтинг: 4.5 из 5 звезд
4.5/5 (537)
The Emperor of All Maladies: A Biography of Cancer
От Everand
The Emperor of All Maladies: A Biography of Cancer
Siddhartha Mukherjee
Рейтинг: 4.5 из 5 звезд
4.5/5 (271)
Fear: Trump in the White House
От Everand
Fear: Trump in the White House
Bob Woodward
Рейтинг: 3.5 из 5 звезд
3.5/5 (738)
Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race
От Everand
Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race
Margot Lee Shetterly
Рейтинг: 4 из 5 звезд
4/5 (890)
The Little Book of Hygge: Danish Secrets to Happy Living
От Everand
The Little Book of Hygge: Danish Secrets to Happy Living
Meik Wiking
Рейтинг: 3.5 из 5 звезд
3.5/5 (399)
Team of Rivals: The Political Genius of Abraham Lincoln
От Everand
Team of Rivals: The Political Genius of Abraham Lincoln
Doris Kearns Goodwin
Рейтинг: 4.5 из 5 звезд
4.5/5 (234)
Yes Please
От Everand
Yes Please
Amy Poehler
Рейтинг: 4 из 5 звезд
4/5 (1888)
Grit: The Power of Passion and Perseverance
От Everand
Grit: The Power of Passion and Perseverance
Angela Duckworth
Рейтинг: 4 из 5 звезд
4/5 (587)
Devil in the Grove: Thurgood Marshall, the Groveland Boys, and the Dawn of a New America
От Everand
Devil in the Grove: Thurgood Marshall, the Groveland Boys, and the Dawn of a New America
Gilbert King
Рейтинг: 4.5 из 5 звезд
4.5/5 (265)
A Heartbreaking Work Of Staggering Genius: A Memoir Based on a True Story
От Everand
A Heartbreaking Work Of Staggering Genius: A Memoir Based on a True Story
Dave Eggers
Рейтинг: 3.5 из 5 звезд
3.5/5 (231)
On Fire: The (Burning) Case for a Green New Deal
От Everand
On Fire: The (Burning) Case for a Green New Deal
Naomi Klein
Рейтинг: 4 из 5 звезд
4/5 (72)
Elon Musk: Tesla, SpaceX, and the Quest for a Fantastic Future
От Everand
Elon Musk: Tesla, SpaceX, and the Quest for a Fantastic Future
Ashlee Vance
Рейтинг: 4.5 из 5 звезд
4.5/5 (474)
Principles: Life and Work
От Everand
Principles: Life and Work
Ray Dalio
Рейтинг: 4 из 5 звезд
4/5 (599)
Rise of ISIS: A Threat We Can't Ignore
От Everand
Rise of ISIS: A Threat We Can't Ignore
Jay Sekulow
Рейтинг: 3.5 из 5 звезд
3.5/5 (137)
The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers
От Everand
The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers
Ben Horowitz
Рейтинг: 4.5 из 5 звезд
4.5/5 (344)
The Unwinding: An Inner History of the New America
От Everand
The Unwinding: An Inner History of the New America
George Packer
Рейтинг: 4 из 5 звезд
4/5 (45)
Steve Jobs
От Everand
Steve Jobs
Walter Isaacson
Рейтинг: 4.5 из 5 звезд
4.5/5 (806)
The World Is Flat 3.0: A Brief History of the Twenty-first Century
От Everand
The World Is Flat 3.0: A Brief History of the Twenty-first Century
Thomas L. Friedman
Рейтинг: 3.5 из 5 звезд
3.5/5 (2219)
Angela's Ashes: A Memoir
От Everand
Angela's Ashes: A Memoir
Frank McCourt
Рейтинг: 4.5 из 5 звезд
4.5/5 (440)
The Gifts of Imperfection: Let Go of Who You Think You're Supposed to Be and Embrace Who You Are
От Everand
The Gifts of Imperfection: Let Go of Who You Think You're Supposed to Be and Embrace Who You Are
Brené Brown
Рейтинг: 4 из 5 звезд
4/5 (1090)
John Adams
От Everand
John Adams
David McCullough
Рейтинг: 4.5 из 5 звезд
4.5/5 (2409)
Bad Feminist: Essays
От Everand
Bad Feminist: Essays
Roxane Gay
Рейтинг: 4 из 5 звезд
4/5 (1015)
The Glass Castle: A Memoir
От Everand
The Glass Castle: A Memoir
Jeannette Walls
Рейтинг: 4.5 из 5 звезд
4.5/5 (1711)
The Outsider: A Novel
От Everand
The Outsider: A Novel
Stephen King
Рейтинг: 4 из 5 звезд
4/5 (1800)
The Woman in Cabin 10
От Everand
The Woman in Cabin 10
Ruth Ware
Рейтинг: 3.5 из 5 звезд
3.5/5 (2322)
A Man Called Ove: A Novel
От Everand
A Man Called Ove: A Novel
Fredrik Backman
Рейтинг: 4.5 из 5 звезд
4.5/5 (4609)
The Sympathizer: A Novel (Pulitzer Prize for Fiction)
От Everand
The Sympathizer: A Novel (Pulitzer Prize for Fiction)
Viet Thanh Nguyen
Рейтинг: 4.5 из 5 звезд
4.5/5 (119)
The Light Between Oceans: A Novel
От Everand
The Light Between Oceans: A Novel
M.L. Stedman
Рейтинг: 4.5 из 5 звезд
4.5/5 (789)
Brooklyn: A Novel
От Everand
Brooklyn: A Novel
Colm Tóibín
Рейтинг: 3.5 из 5 звезд
3.5/5 (1937)
Wolf Hall: A Novel
От Everand
Wolf Hall: A Novel
Hilary Mantel
Рейтинг: 4 из 5 звезд
4/5 (3811)
Manhattan Beach: A Novel
От Everand
Manhattan Beach: A Novel
Jennifer Egan
Рейтинг: 3.5 из 5 звезд
3.5/5 (791)
Little Women
От Everand
Little Women
Louisa May Alcott
Рейтинг: 4 из 5 звезд
4/5 (104)
The Perks of Being a Wallflower
От Everand
The Perks of Being a Wallflower
Stephen Chbosky
Рейтинг: 4.5 из 5 звезд
4.5/5 (2099)
The Art of Racing in the Rain: A Novel
От Everand
The Art of Racing in the Rain: A Novel
Garth Stein
Рейтинг: 4 из 5 звезд
4/5 (4193)
A Tree Grows in Brooklyn
От Everand
A Tree Grows in Brooklyn
Betty Smith
Рейтинг: 4.5 из 5 звезд
4.5/5 (1929)
Her Body and Other Parties: Stories
От Everand
Her Body and Other Parties: Stories
Carmen Maria Machado
Рейтинг: 4 из 5 звезд
4/5 (821)
Sing, Unburied, Sing: A Novel
От Everand
Sing, Unburied, Sing: A Novel
Jesmyn Ward
Рейтинг: 4 из 5 звезд
4/5 (1103)
The Constant Gardener: A Novel
От Everand
The Constant Gardener: A Novel
John le Carré
Рейтинг: 3.5 из 5 звезд
3.5/5 (104)
IPC in Linux
Документ4 страницы
IPC in Linux
Barry AnEngineer
Оценок пока нет
Review: Security, Audit and Control Features: Oracle Database
Документ2 страницы
Review: Security, Audit and Control Features: Oracle Database
Emmanuel Enrique
Оценок пока нет
Ipp - Bca - 3rd Sem - 2019
Документ2 страницы
Ipp - Bca - 3rd Sem - 2019
derit83353
Оценок пока нет
SQL Server System Databases
Документ2 страницы
SQL Server System Databases
Elias Alam
Оценок пока нет
CICS condition codes not handled
Документ3 страницы
CICS condition codes not handled
Balakrishna Sambari
Оценок пока нет
07.10.2019 2019 - 20 NX Project ID Selection When Chassis Is Changed at The Field
Документ6 страниц
07.10.2019 2019 - 20 NX Project ID Selection When Chassis Is Changed at The Field
zkova
Оценок пока нет
Cloud Based Face Recognition and Speech Recognition For Access Control Application
Документ5 страниц
Cloud Based Face Recognition and Speech Recognition For Access Control Application
International Journal of Innovative Science and Research Technology
Оценок пока нет
More Details On Data Models
Документ23 страницы
More Details On Data Models
chitraalavani
Оценок пока нет
Management Information Systems For The Information Age 9th Edition Haag Solutions Manual 1
Документ36 страниц
Management Information Systems For The Information Age 9th Edition Haag Solutions Manual 1
sharon
100% (44)
EAI FOR ENTERPRISE APPLICATION INTEGRATION
Документ12 страниц
EAI FOR ENTERPRISE APPLICATION INTEGRATION
Rishin Shah
Оценок пока нет
P3 4311901040
Документ7 страниц
P3 4311901040
Maz Eno
Оценок пока нет
Django Made Easy Build and Deploy Reliable Django Applications
Документ250 страниц
Django Made Easy Build and Deploy Reliable Django Applications
Mahmood Ghaleb Al-Bashayreh
Оценок пока нет
1.fundamentals of Agile
Документ42 страницы
1.fundamentals of Agile
alagar samy
Оценок пока нет
Archer VR600 (B) (EU) (AU) - V2 - UG
Документ122 страницы
Archer VR600 (B) (EU) (AU) - V2 - UG
Moose
Оценок пока нет
Current Log
Документ3 страницы
Current Log
khoirun nissa
Оценок пока нет
Essbase Clustering Part 1
Документ10 страниц
Essbase Clustering Part 1
nas4ru
Оценок пока нет
Social Media Concerns
Документ2 страницы
Social Media Concerns
usgsa
Оценок пока нет
8.4 File Handling
Документ3 страницы
8.4 File Handling
wildcashnumber01
Оценок пока нет
Untitled
Документ31 страница
Untitled
Sleepy Ash
Оценок пока нет
Mis
Документ11 страниц
Mis
Tigran
0% (1)
Object Oriented Design
Документ37 страниц
Object Oriented Design
Ganesh Thapa
Оценок пока нет
5-Effects of Using ICT: Engineer Amina Dessouky Information & Communication Technology 1
Документ18 страниц
5-Effects of Using ICT: Engineer Amina Dessouky Information & Communication Technology 1
Ghadeer Alshoum
Оценок пока нет
Arcgis For Mobile PDF
Документ8 страниц
Arcgis For Mobile PDF
Anbu Mujahidin
Оценок пока нет
The Global Internet Phenomena Report October 2018
Документ23 страницы
The Global Internet Phenomena Report October 2018
Iacob Adi
Оценок пока нет
Assignment Cloud Computing
Документ7 страниц
Assignment Cloud Computing
WasimaNahiyan
100% (1)
Centura Conexion
Документ230 страниц
Centura Conexion
arturodvs
Оценок пока нет
InformaticsPractices PRE BOARD23 24
Документ8 страниц
InformaticsPractices PRE BOARD23 24
ashu890135
Оценок пока нет
Cybersecurity Challenges and Emerging Trends in Latest Technologies
Документ7 страниц
Cybersecurity Challenges and Emerging Trends in Latest Technologies
PoulomiDas
Оценок пока нет
Siebel Loyalty Modules
Документ2 страницы
Siebel Loyalty Modules
Mohamed Abrar
Оценок пока нет
Use The Internet As A Tool For Credible Research and Information Gathering To Best Achieve Specific Class Objectives or Address Situational
Документ45 страниц
Use The Internet As A Tool For Credible Research and Information Gathering To Best Achieve Specific Class Objectives or Address Situational
Sheraine Trienta
Оценок пока нет