
Leaping into The Career of Data Science

4/27/20 Machine Learning By Sathish Yellanki Slide No : 1


Data Engineer Versus Data Scientist



Let Us Understand Who a Data Engineer Is



• Data Engineers Build And Optimize The Systems That Allow Data Scientists
And Analysts To Perform Their Work.
• A Data Engineer Should Ensure That Any Data is Properly
  • Received
  • Transformed
  • Stored
  • Made Accessible
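
The four stages above (Received, Transformed, Stored, Made Accessible) can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: the CSV text, the `payments` table, and every function name are hypothetical and do not come from the slides.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would arrive from a log, file, or API.
RAW_CSV = "user_id,amount\n1, 10.50\n2,n/a\n3,7.25\n"


def receive(raw_text):
    """Receive: parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: cast types and drop rows whose values are not numeric."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append((int(row["user_id"]), float(row["amount"])))
        except ValueError:
            continue  # skip malformed records instead of failing the whole batch
    return cleaned


def store(conn, records):
    """Store: persist the cleaned records in a table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (user_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()


def total_amount(conn):
    """Make accessible: expose a simple query that analysts can call."""
    return conn.execute("SELECT SUM(amount) FROM payments").fetchone()[0]


conn = sqlite3.connect(":memory:")
store(conn, transform(receive(RAW_CSV)))
print(total_amount(conn))
```

The in-memory SQLite database stands in for whatever storage layer a real team would use; the point is only that each stage has a single, testable responsibility.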
Data Engineer Responsibilities
• Establish The Foundation Architecture For Data Analysts and Data Scientists
• Take Responsibility To Construct The Data Pipelines, To Handle Huge Data
• Should Understand The Entire Software Development Life Cycle
• Should Keep Focus on
  • Leveraging Data Tools
  • Maintaining Databases
  • Creating and Managing Data Pipelines
• Should Develop a Mindset For Building and Optimizing Applications
What Are The Tasks of a Data Engineer?
• Building APIs For Data Consumption.
• Integrating External OR New Datasets into Existing Data Pipelines.
• Applying Feature Transformations For Machine Learning Models on New Data.
• Continuously Monitoring And Testing The System To Ensure Optimized Performance.
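
As one illustration of applying feature transformations on new data, the sketch below learns standardization parameters from training values and reuses the same parameters on values that arrive later. The function names and the sample ages are hypothetical, and standardization is just one possible transformation, not one the slides prescribe.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn the mean and population standard deviation from training data."""
    return mean(values), pstdev(values)


def apply_scaler(values, params):
    """Standardize values using parameters learned earlier, not recomputed ones."""
    mu, sigma = params
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - mu) / sigma for v in values]


training_ages = [20, 30, 40]   # hypothetical training feature
new_ages = [30, 50]            # hypothetical data arriving later

params = fit_scaler(training_ages)
scaled_new = apply_scaler(new_ages, params)
print(scaled_new)
```

Reusing the stored parameters keeps new data on the same scale the model saw during training.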
Finally, What is Data Engineering?

[Diagram: Software Engineering, Business Intelligence, Big Data Abilities]

Services Provided By a Data Engineer


Data Ingestion
• “Scraping” Databases, Loading Logs, Fetching Data From External Stores OR APIs.
Metric Computation
• Frameworks To Compute And Summarize Engagement, Growth OR Segmentation Related Metrics.
Anomaly Detection
• Automating Data Consumption to Alert People on Anomalous Events OR Changing Trends.
Metadata Management
• Allow Generation And Consumption of Metadata, Making it Easy To Find Information in The Data Warehouse (DWH).
Experimentation
• A/B Testing And Experimentation Frameworks For The Company’s Analytics, With A Significant Data
Engineering Component Integrated into Them.
Instrumentation
• Log Events And The Attributes Related To Every Event, Making Sure That High-Quality Data is Captured
Upstream.
Dependencies
• Establish Pipelines That Are Specialized in Understanding Series of Actions Over Time, Allowing
Analysts To Understand User Behaviors.
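
To make the Anomaly Detection service more concrete, here is a minimal sketch that flags a metric value when it deviates strongly from its recent history. The daily counts, the window size, and the threshold are hypothetical choices for illustration; real systems often use more robust methods.

```python
from statistics import mean, stdev

# Hypothetical daily active-user counts; the last value is an injected spike.
daily_active_users = [100, 102, 98, 101, 99, 103, 100, 180]


def find_anomalies(values, window=5, threshold=3.0):
    """Return the indexes whose value deviates from the trailing-window mean
    by more than `threshold` sample standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # A flat history has sigma == 0; skip it rather than flag everything.
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


print(find_anomalies(daily_active_users))
```

In an alerting setup, the returned indexes would be mapped back to dates and sent to the people who own the metric.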
Learning To Be a Data Engineer
• Data Engineers Must Focus More on Learning
  • Data Modeling Techniques
  • Relational And Non-Relational Database Theory And Practice
  • Database Clustering Tools And Techniques
  • ETL Design
  • Architectural Projections
Salary Projections



Let Us Understand Who a Data Analyst Is



• A Big Data Analyst Reviews, Analyzes And Reports on Big Data Stored And
Maintained by an Organization.
• Big Data Analysts Use
  • Manual Techniques
  • Automated Big Data Analysis/Analytics Software
• Big Data Analysts Analyze
  • Large Amounts of Raw & Unstructured Data
• A Big Data Analyst's Main Intent is To Find
  • Business Insight
  • Intelligence
  • Useful Information
Big Data Analyst Responsibilities
• Should be Well Versed in Big Data Concepts
• Should Possess Knowledge & Skills in Using
  • Database Querying Languages
  • Big Data Analytics Software
• Should Have a Good Understanding of
  • Data Mining
  • Data Extraction Techniques
• Should Usually Work in Coordination With
  • Data Scientists
  • Database Developers/Administrators
  • Management Team
Big Data Analyst Skills
• A High Level of Mathematical Ability.
• Programming Languages, Such As
  • Oracle SQL OR Any SQL Flavor
  • Python
  • R Language
  • Java OR Scala
• Good Ability To
  • Analyze The Data and Business
  • Model The Data For Business
  • Interpret The Data in The Business
• Problem-Solving Skills With Design of Algorithms
• A Methodical And Logical Approach
• Should Have Good Ability To
  • Plan The Work
  • Meet Deadlines
• Develop Good Accuracy and Attention To Detail
• Interpersonal Skills
• Team Working Skills
• Written & Verbal Communication Skills
Let Us Understand Who a Data Scientist Is



• Data Science is a Study Which Involves Extracting Knowledge From Data
• A Data Scientist Should Have the Skill to Turn Raw Data into Valuable
Insights That An Organization Needs.
• A Data Scientist Should Find The Valuable Insights That Can Help The
Business Owner Grow And Compete.
• Data Scientist Should Have the Skill to Interpret And Analyze the Data From
Multiple Sources To Come Up With Imaginative Solutions To Problems.
• Data Scientist Should Use Their Strong Business Sense Along With An Ability
To Communicate Findings To Both Business And IT.
• Should Have the Leadership That Can Influence “How An Organization
Approaches A Business Challenge”.
• Data Scientists May Have Different Functions Depending on The
Industry/Sector They Work in.
• Should Have the Ability To Combine Practical Skills Such as Coding And
Mathematics With The Ability To Analyze Statistics.
• Should Have the Ability to Model the Data in the Interest of the Business
Growth and Targets.
• A Data Scientist Should Eliminate the Noise and Identify the Canonical
Representative Data Points.
• A Data Scientist Generalizes the Data Model to be Able to Make Useful
Statistical Predictions.
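
The last two points, looking past noise and generalizing a model to make predictions, can be illustrated with an ordinary least-squares line fit. This is a minimal sketch under stated assumptions: the spend and sales numbers are invented, and a straight line is only one of many possible models.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Use the fitted line to estimate y for an unseen x."""
    return slope * x + intercept


# Hypothetical noisy observations: advertising spend versus sales.
spend = [1, 2, 3, 4, 5]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(spend, sales)
print(round(slope, 2), round(intercept, 2))
print(round(predict(6, slope, intercept), 2))
```

The fitted slope summarizes the trend across all points instead of following any single noisy observation, which is the sense in which the model generalizes.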
Data Scientist Responsibilities
• Should Use Strong Business Acumen
• For Useful Insights, Should Have Great Ability To
  • Communicate Findings
  • Mine Vast Amounts of Data
  • Use Insights To Influence How An Organization Approaches Business
  Challenges
• To Solve Problems, Use A Combined Knowledge of
  • Computer Science And Applications
  • Modeling
  • Statistics
  • Analytics
  • Mathematics
• Extract Data From Multiple Sources, Which Can be
  • Unstructured
  • Semi-Structured
  • Structured
• Sift And Analyze Data From Multiple Angles, Looking For Trends That
Highlight Problems OR Opportunities
• Communicate Important Information And Insights To Business And IT Leaders
• Make Recommendations To Adapt Existing Business Strategies
Key Skills For Data Scientists (Non-Technical)
• Problem-Solving Skills
• Communication Skills
• Teamwork Skills
• Investigative Skills
• Interest in Statistics
• Interest in Predicting Trends and Identifying Patterns
• Innovative Thinking
• Observation Skills
• Critical Thinking
Key Skills For Data Scientists (Technical)
• Java OR Scala Coding
• Python Coding
• R Programming
• Understanding of The Hadoop Platform
• SQL Database/Coding
• Apache Spark
• Machine Learning and AI
• Data Visualization With Reporting Tools
• Design of Algorithms
• Advanced Statistics
Let Us Get More Insights

Thank You Very Much

