
E6893 Big Data Analytics: Lecture 13

Cognitive Mobile Analysis

Ching-Yung Lin, Ph.D.


IBM Chief Scientist, Graph Computing
Adjunct Professor, Dept. of Electrical Engineering and Computer Science

December 8th, 2016


E6893 Big Data Analytics Lecture 13: Cognitive Mobile Analysis

2016 CY Lin, Columbia University

Advanced Big Data Analytics Projects


If you are interested in doing advanced research in Spring 2017 and beyond in the following
areas, please let me know (or contact the researchers listed on the next few pages). You may
also take the Advanced Big Data Analytics class and do this research as your course project:
Large-Scale Graph Analysis
Machine Reasoning, including Bayesian Networks and Game Theory
Mobile Vision
Robot Vision and Cognitive Interaction
Large-Scale Visualization
Deep Learning on Mobile Devices
etc. (see the next few slides)

System G Team

2016 IBM Corporation

Project : Deep Learning on Graphs


Contact: Toyo Suzumura, tsuzumrua@us.ibm.com
Project Description
This project focuses on how deep learning can be applied to graph classification and to
finding anomalous nodes in graphs, especially real-world massive graphs with properties
and time-evolving characteristics.
Required Skills: Experience with any deep learning framework, such as TensorFlow.

[Figure labels: Big Graph (e.g., transaction graph in the financial domain); Feature Extraction; Machine Learning Classifiers; Deep Learning/Neural Network; Learning Models; Cognitive Reasoning; Risk Prediction]

Potential Project Directions

Contact: Dr. Guangnan Ye, gye@us.ibm.com


Projects:
Project 1: Robot Intelligence
Description: Improving computer vision and speech recognition capabilities on
robots such as Nao and Pepper.
Prior experience required: Knowledge of computer vision and machine learning.
Project 2: Content analysis and verification from web resources
Description: Given a product, person, or company entity, determine the
reasonable content based on information sources available on the Internet.
Prior experience required: Knowledge of web crawlers, NLP, and machine learning.
Project 3: Recognition in computer vision
Description: OCR, Face Analysis, Object Recognition, Event Recognition, etc.
Prior experience required: Knowledge of computer vision and machine learning.


Potential Project Directions

Contact: Dr. Conglei Shi, shiconglei@us.ibm.com


Projects:
Interactive Machine Learning
Description: In this project, we want to build a visual analytics system to better
understand a machine learning model, explore its training and testing process,
and improve its performance.
Prior experience required: Experience with machine learning and familiarity with at
least one machine learning toolkit.


Potential Project Directions

Contact: Jason Crawford, ccjason@us.ibm.com


Projects:
Graph Query Language for SQL users
Description: In this project, we want to build a novel graph query language that is
applicable to all kinds of applications and friendly to traditional SQL users.
Prior experience required: Relational databases and graph databases.


Potential Project Directions

Contact: Danny Yeh, dlyeh@us.ibm.com


Projects:
Robo-Advisor for Wealth Management
Description: In this project, we want to test various strategies for building a
Robo-Advisor, which involves four steps of wealth management: financial data
gathering and processing, person/user understanding, personal portfolio
management, and recommendation of changes.
Prior experience required: Big Data Analytics


Network / Graph is the way we remember, we associate, and we reason.

[Figure labels: Observation, Perception, Memory, Reasoning, Strategy, Judgement]

System G Graph Computing for Machine Intelligence


IBM System G: a brand name approved by HQ in April 2014.
Based on 30+ graph-related projects; 150+ papers; ~40 patents; ~10 best paper awards; ~$25M research funding.
Accomplishments in 2013 ($100M+ contribution - SmallBlue) and 2014 (Scientific contribution - Social & Cognitive Network Science).

Connecting the Dots and Reasoning Big Data
http://systemg.research.ibm.com

[Figure labels: Observation; Perception; Memory; Reasoning & Strategy; Judgement; Graph Database; Graph Analytics; Graphical Models; Memory; Relationship, Perception & Contextual Analysis; Machine Reasoning & Deep Learning]

System G Tools provide Building Blocks for Cognitive Solutions

Machine Learning: Deep Learning Tools; Visual and Text Sentiment Tools; Anomaly Detection Tools
Machine Reasoning: Bayesian Networks; Game Theory Tools; Multimodal Analysis Platform
Graph Middleware: Parallel Prog. Lib.; Power Optimization; GPU Optimization
Mobile Cognition: iOS Cognition Tools; Robot Cognition Tools
Graph Analytics: Topological Analysis; Matching and Search; Path and Flow
Spatiotemporal Analytics: Spatiotemporal Mining; Spatiotemporal Indexing
Graph Database: Native Store; GBase
Graph Visualization: Multivariate Graph; Dynamic Graph; Big Graph

Technology layers: 1. Graph Database Technologies; 2. Network Analytics Technologies; 3. Machine Learning Technologies; 4. Machine Reasoning Technologies

[Figure labels: Sensing & Observation; Perception & Representation; Memory; Reasoning & Strategy; Judgment]

IBM System G Application Use Cases


1. System G for Expertise Location
2. System G for Recommendation
3. System G for Commerce
4. System G for Financial Analysis
5. System G for Social Media Monitoring
6. System G for Telco Customer Analysis
7. System G for Watson
8. System G for Data Exploration and Visualization
9. System G for Personalized Search
10. System G for Anomaly Detection (Espionage, Sabotage, etc.)
11. System G for Fraud Detection
12. System G for Cybersecurity
13. System G for Sensor Monitoring (Smarter another Planet)
14. System G for Cellular Network Monitoring
15. System G for Cloud Monitoring
16. System G for Code Life Cycle Management
17. System G for Traffic Navigation
18. System G for Image and Video Semantic Understanding
19. System G for Genomic Medicine
20. System G for Brain Network Analysis
21. System G for Data Curation
22. System G for Near Earth Object Analysis


Deep Learning on Mobile Object Detection & Recognition

Challenges

How to do Machine Reasoning? Five Layers of Understanding.

Deterministic Classification + Learning Inference
Layers: Cognition Layer; Semantics Layer; Concept Layer; Feature Layer; Sensor Layer
Sensor-layer data: HR records, travel records, badge/location records, phone records, mobile records; transmitted images, speech content, video content
[Figure legend: observations; hidden states]

Multi-Scale Deep Convolutional Neural Network for Fast Object Detection

Where are Pedestrians?

Deep Learning + Graph Contextual Analysis

Demo: Detecting Cars and Pedestrians in a complex scenario

Comparison to the State-of-the-Art on the KITTI benchmark test set

Test platform: a single CPU core (2.40 GHz) of an Intel Xeon E5-2630 server with 64 GB of RAM. An NVIDIA Titan GPU was
used for CNN computations.

Demo: Multi-Scale Deep Convolutional Neural Network for Fast Object Detection

System G Mobile Cognition: Enabling AI right on the Edge

Created a novel graph computing and deep learning framework on iOS devices and NAOqi
robots, including:
generic object recognition, event recognition, face recognition, visual sentiment
recognition, and document recognition
graph database

Novel deep learning work that speeds up image computation by utilizing the GPUs on iOS devices: 195x or 1657x faster

iPad Pro: classification rate ~13 frames/sec (on ~1000 classes)
iPhone 6s: ~7 frames/sec

Example - Generic Object Recognition running natively on mobile device (iPhone)

Demo - Document Detection (Warning!! Potential Sensitive Info Leakage!)

IBM System G EventNet: Generic Event Recognition on Videos

Demo: System G video event recognition and search example

Event Detection Baseline

[Figure labels: Training Videos; Feature Extractions; Low-level features: SIFT (Visual), STIP (Motion), MFCC (Audio); Mid-level Concept; Deep Learning; Fusion: Early Fusion, Late Fusion; Classifiers: SVM, Decision Tree; Output]
Example events: Attempting board trick; Feeding an animal; Landing a fish
Mid-level Feature Representation

Decompose an event into concepts, e.g.: speech, sound; running, jumping; person, board; park, street

Events Classification Framework

[Figure labels: Key frames; Feature Extracting; Event Classifier: Pair-Activity Event Classifier, Embrace Classifier, PeopleSplitUp Classifier, PeopleMeet Classifier, PersonRuns Classifier; Preliminary Events: Detected Embrace, Detected PeopleSplitUp, Detected PeopleMeet, Detected PersonRuns; Postprocessing; Event Merging; Event Identifying: Backwards Search, Forwards Search]
Automatic Video Event Tagging

4,490 trained event models and a library of 95K videos
[Figure labels: Reasoned Event; Reasoned Event Concepts]

Demo: IBM System G EventNet

Examples of our Previous Work on Abnormal Video Event Analysis

Event: Abnormal Behavior (Surveillance Video)
TRECVID Surveillance Event Detection (SED) Evaluation 2008-2016

Event: Making a bomb (Consumer Video)
TRECVID Multimedia Event Detection (MED) Evaluation 2010-2016

Detection and Tracking of Head, Shoulder, and Body

Fusion of head-shoulder and body detection
Adjust the detector's search scales

Detection Results

Complex scenario: how to predict people's actions?

System G Anomaly Detection and Machine Reasoning Platform

Since 2009, U.S. Justice Department lawyers have pursued at least
19 cases of corporate espionage,
including GM, Ford, Motorola, DuPont, ...
Impact on the economy and jobs (WSJ, Feb 21, 2013)

[Figure labels: 2009, 2010, 2013, 2014]

System G Reasoning and Predictive Pipeline is composed of hundreds of cognitive analyzers

Game Theory may help decision making in complex scenarios

Forecasting what will happen based on our own and others' potential actions.
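As a toy illustration of forecasting outcomes from both sides' potential actions, the sketch below enumerates the pure-strategy Nash equilibria of a two-player game. The payoff values (a standard prisoner's-dilemma layout) and the action names are hypothetical and are not taken from System G.

```python
# payoffs[(our_action, their_action)] = (our_payoff, their_payoff); toy values
payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}
actions = ["cooperate", "defect"]

def pure_nash_equilibria(payoffs, actions):
    """Return the action pairs where neither player gains by deviating alone."""
    equilibria = []
    for ours in actions:
        for theirs in actions:
            our_pay, their_pay = payoffs[(ours, theirs)]
            # We stay if no alternative action of ours pays more against `theirs`.
            we_stay = all(payoffs[(alt, theirs)][0] <= our_pay for alt in actions)
            # They stay if no alternative action of theirs pays more against `ours`.
            they_stay = all(payoffs[(ours, alt)][1] <= their_pay for alt in actions)
            if we_stay and they_stay:
                equilibria.append((ours, theirs))
    return equilibria

equilibria = pure_nash_equilibria(payoffs, actions)
print(equilibria)
```

With these toy payoffs the only stable outcome is mutual defection, even though mutual cooperation would pay both players more.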


Acceleration of Neural Network on Mobile Devices

Richard Chen, Ruichi Yu*, Larry Lai
IBM T. J. Watson Research Center
*Ruichi Yu is an IBM summer intern this year.

Summary of Acceleration of Neural Network on Mobile Devices

Porting a deep convolutional neural network to iOS devices with near real-time computation
Reduce algorithmic complexity with negligible performance degradation.
Utilize computational hardware to achieve better performance.

IBM :: 2015 IBM Corporation


Outline

Background
Full Network Acceleration and Compression
  Kernel Importance Measurement
  Algorithmic Performance Evaluation
  Computation Complexity Assessment
Acceleration on iOS Mobile Devices
  Metal API for GPU Programming
  Computation Speed Evaluation

Methods for Running CNNs on Mobile Devices


Sending CNN jobs to the cloud

Accelerating CNN on the local device
Apple A9X SoC, 12-core GPU

How to trade off between algorithmic complexity and performance?
How to utilize hardware effectively?

Problems for Running CNNs on Mobile Devices


Statistics of some popular CNNs

Model     | Model Size (Storage) | Weights (Memory) | Mult.s (Speed)
AlexNet   | 243MB                | 61M              | 725M
VGG-S     | 393MB                | 103M             | 2640M
VGG-16    | 552MB                | 138M             | 15484M
GoogLeNet | 51MB                 | 6.9M             | 1566M

Reference:
Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications
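A rough sanity check on these statistics: if every weight is stored as a 32-bit float (an assumption; the slide does not state the storage format), the storage size is about four bytes per weight. The sketch below compares that estimate with the reported sizes. The estimate matches VGG-16 exactly and AlexNet closely, but not VGG-S or GoogLeNet, so the reported sizes presumably reflect how each model file was actually serialized.

```python
# Values copied from the statistics above: (weights in millions, reported size in MB)
models = {
    "AlexNet":   (61.0, 243),
    "VGG-S":     (103.0, 393),
    "VGG-16":    (138.0, 552),
    "GoogLeNet": (6.9, 51),
}

BYTES_PER_WEIGHT = 4  # assumption: 32-bit floating-point weights

def float32_size_mb(weights_millions):
    """Estimated storage in MB (10^6 bytes) for the given number of weights (in millions)."""
    return weights_millions * BYTES_PER_WEIGHT

estimates = {name: float32_size_mb(w) for name, (w, _) in models.items()}
for name, (w, reported) in models.items():
    print(f"{name:10s} estimated {estimates[name]:6.1f} MB, reported {reported} MB")
```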


Computational Resource on iPhone and iPad

                           | iPhone 6S (Plus)       | iPad Air 2              | iPad Pro (12.9/9.7)     | iPhone 7 (Plus)
SoC                        | A9                     | A8X                     | A9X                     | A10 Fusion
CPU                        | 2x Twister @ 1.85 GHz  | 3x Typhoon @ 1.5 GHz    | 2x Twister @ 2.26 GHz   | 4-core
GPU                        | PVR GT7600 (6 cluster) | PVR GXA6850 (8 cluster) | PVR 12 Cluster Series 7 | 6 cluster GPU?
RAM (shared memory)        | 2GB LPDDR4             | 2GB LPDDR3              | 4GB LPDDR4              | 3GB on Plus?
Memory bus width           | 64-bit                 | 128-bit                 | 128-bit                 | (not listed)
Max # of threads per group | 512                    | 512                     | 512                     | (not listed)

IBM :: IBM Confidential :: 2015 IBM Corporation


Methods for Running CNNs on Mobile Devices


Sending CNN
jobs to cloud

Compression
(pruning) of CNN

Speeding up CNN


Think Different

All existing methods can be viewed as approximations of an overly-redundant CNN, but do we really need such a CNN as the starting point?

[Figure text: CNN Slimming!]

Good to be Slim

Slim CNN leads to:
less storage space
less memory usage
less computation
less power consumption

Compared with others:
Full-Network Acceleration and Compression
We think one step ahead

Being Slim is Hard

Training a small model from scratch
Random pruning

Feature Selection on CNN

CNNs can be viewed as a set of "overly-redundant" feature extractors

features


A Method for Pruning Redundant Neurons and Kernels of Deep Convolutional Neural Networks

A pre-trained CNN -> Extract CNN Responses -> Measure the Importance of Feature Extractors -> Prune Model -> Fine-tuning

A Method for Pruning Redundant Neurons and Kernels of Deep Convolutional Neural Networks

A pre-trained CNN -> Extract Responses of a High-level Layer -> Measure the Importance of Feature Extractors -> Backpropagate the Importance & Prune Model -> Fine-tuning

Intractable -> tractable
Inconsistent -> consistent

[Figure labels: Forward Propagation; Input layers; Response (at each stage); FC layers; Importance Score Back Propagation and Pruning]

Fine-tuning the Pruned Model

The pruned model consists of important feature extractors, but will suffer a loss
of accuracy due to the loss of redundant features.

Feature selection gives a good starting point on the learning curve.

Fine-tune the pruned model with a lower learning rate to recover the
performance.
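The "measure importance, then prune" steps can be illustrated with a minimal sketch. The slides do not say how importance is computed, so the mean absolute response used here is only a placeholder assumption, and the function names and toy responses are hypothetical. Importance back-propagation and fine-tuning are not shown.

```python
def kernel_importance(responses):
    """Placeholder importance: mean absolute response of each kernel over all samples.

    responses[s][k] is the response of kernel k on sample s.
    """
    n_samples = len(responses)
    n_kernels = len(responses[0])
    return [sum(abs(responses[s][k]) for s in range(n_samples)) / n_samples
            for k in range(n_kernels)]

def prune(importance, keep_ratio):
    """Return the indices (in ascending order) of the most important kernels to keep."""
    n_keep = max(1, int(round(len(importance) * keep_ratio)))
    ranked = sorted(range(len(importance)), key=lambda k: importance[k], reverse=True)
    return sorted(ranked[:n_keep])

# Toy responses: 3 samples x 4 kernels; kernels 1 and 2 respond weakly.
responses = [
    [0.9, 0.0, 0.1, 0.7],
    [0.8, 0.1, 0.0, 0.6],
    [1.0, 0.0, 0.2, 0.9],
]
scores = kernel_importance(responses)
kept = prune(scores, keep_ratio=0.5)
print(scores, kept)
```

After pruning, only the kept kernels (and their weights) would be copied into the slimmer model, which is then fine-tuned with a lower learning rate as described above.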

Evaluation

Metal: iOS GPU Language

Programming Model of Apple GPU

Programming in the Metal language
Most of its syntax is compatible with C++14.
A unified programming language interface for graphics and data-parallel computation workloads.
Single Instruction Multiple Threads (SIMT) programming style
Every thread performs the same simple instruction.

Execution Model of Apple GPU

It integrates support for both graphics and compute operations.
Three command encoders:
Render Command Encoder: graphics rendering
Compute Command Encoder: data-parallel compute processing
Blit Command Encoder: transfers data between resources

Multi-threaded command encoding is supported.
*Group commands of the same type together where possible.

Diagram of iOS GPU Execution


Metal Programming, Kernel Function

Compute command:
Two parameters, threadsPerGroup and numThreadgroups, determine the number of threads. (Both are 3-D values.)
The total of all threadgroup memory allocations must not exceed 16 KB (shared memory within one threadgroup).
Shared memory is used to sync data across different threads.

Four attribute qualifiers are used to access data in a kernel function:
thread_position_in_grid, threadgroup_position_in_grid,
thread_position_in_threadgroup, thread_index_in_threadgroup
The maximum number of threads in a group is 512 (device-dependent).

For a 48x48 image:
threadGroupCount = MTLSizeMake(16, 16, 1)
threadGroups = MTLSizeMake(3, 3, 1)
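The arithmetic behind the 48x48 example can be checked with a short sketch: a 16x16x1 threadgroup holds 256 threads, which is under the 512-thread limit, and a 3x3 grid of such groups covers the image with exactly one thread per pixel. The helper name `threadgroup_counts` is hypothetical; it is plain Python, not a Metal API.

```python
def threadgroup_counts(width, height, threads_per_group=(16, 16, 1), max_threads=512):
    """Return the (x, y, z) number of threadgroups needed to cover a width x height grid."""
    tx, ty, tz = threads_per_group
    if tx * ty * tz > max_threads:
        raise ValueError("threadgroup exceeds the per-group thread limit")
    groups_x = (width + tx - 1) // tx    # ceiling division
    groups_y = (height + ty - 1) // ty
    return (groups_x, groups_y, 1)

groups = threadgroup_counts(48, 48)
total_threads = groups[0] * groups[1] * 16 * 16
print(groups, total_threads)
```

When the image size is not a multiple of the threadgroup size (for example 50x50), the ceiling division launches extra threads, so the kernel would need a bounds check on its grid position.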

Sigmoid Function Speedup

The syntax might change in Swift 3.

Performing the sigmoid function on 224 data points on iPad Pro.
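In the SIMT style, each GPU thread would apply the sigmoid to the one element selected by its thread index. The sketch below simulates that per-thread kernel sequentially in Python as a CPU reference. It is not Metal code, and the function names are hypothetical.

```python
import math

def sigmoid_kernel(data, out, thread_id):
    """Work done by one thread: apply the sigmoid to a single element."""
    x = data[thread_id]
    if x >= 0:
        out[thread_id] = 1.0 / (1.0 + math.exp(-x))
    else:
        # Equivalent form that avoids overflow in exp() for very negative inputs.
        e = math.exp(x)
        out[thread_id] = e / (1.0 + e)

def run_grid(data):
    """Run the kernel once per element; on a GPU these calls would run in parallel."""
    out = [0.0] * len(data)
    for thread_id in range(len(data)):
        sigmoid_kernel(data, out, thread_id)
    return out

result = run_grid([-2.0, 0.0, 2.0])
print(result)
```

Because every element is independent, the loop in `run_grid` maps directly onto a one-dimensional grid of threads.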


Image Recognition Convolution Layer

Computation-intensive layer.
Explore ways to reuse data in cache to improve speed.
Many implementation tricks can reduce overhead in GPU computations,
e.g., applying compile-time optimizations in your code.

Evaluation on iOS Mobile Devices

The p- prefix denotes a pruned/compressed CNN.

[Figure labels: speedups of 2.0x, 2.4x, 1.6x, 1.6x]
