How to Get Started with Artificial Intelligence: A Guide for Enterprises
Machine learning services on public clouds

Publication Date: 16 Nov 2016 | Product code: IT0014-003161

Michael Azoff

Summary
Catalyst
Artificial intelligence (AI) may well become the largest IT wave of the computer era in the decade
ahead as the latest advances push the field to new levels of accuracy and innovation. The potential
for AI to infuse applications, services, and products is huge. To take just one example, the automobile
industry will be unrecognizably transformed by autonomous vehicles, and AI is playing a significant
role in this. Machine learning (ML) and deep learning (DL) are key AI technologies at the root of these
innovations. This report provides a starting point for enterprises building an internal capability to
exploit AI for their products and services by examining the ML services available on the major public
clouds that can support such an effort. The innovation taking place in hardware acceleration is also
examined.

Ovum view
AI is a large field, but the subfield of ML and the DL sub-branch are making the largest advances in
recent years. To support a new generation of users interested in exploiting this technology, the major
public clouds provided by Amazon, IBM, Microsoft, and Google offer ML services. The spread of DL
into enterprise and industrial applications is also rapidly taking place. As a result of successful activity
by early adopters of AI/ML, Ovum expects interest to widen and touch most verticals. The digital
transformation that is affecting all enterprises, turning traditional businesses into software-centric
businesses, will accelerate the adoption of AI.

The ML technology available on public clouds has been established for some years, and with the
latest advances in DL, it will now be added to ML services. The services vary in how they target their
users, who tend to fall into three distinct categories: neural network experts (for example, ML
graduates from universities), data scientists with deep business domain knowledge but not
necessarily with ML expertise, and software developers who wish to build applications making use of
AI and ML and typically without ML expertise. There will also be use cases that overlap the three
categories.

Key messages
• This is a good time for enterprises new to AI to initiate proof-of-concept trials because AI will
grow in importance and become possibly the largest technology wave yet.
• Enterprises are already infusing ML and DL into their services and products.
• The major cloud providers vary in the respective ML user categories they target.
• The services from the public cloud providers can help businesses new to AI kick-start their
entry into this space.
• The verticals currently exploiting AI include automotive with autonomous driving systems,
aerospace with drone technology, healthcare with data and information analysis, and social
networking with language processing.
• AI algorithms require hardware acceleration, and Nvidia GPUs are the market leaders in
accelerating DL neural networks. Ovum expects this market to become more competitive in
2017 with new players emerging to chase what will be a multi-billion dollar accelerator
industry.

© Ovum. All rights reserved. Unauthorized reproduction prohibited.

Recommendations
Recommendations for all enterprises
AI and in particular ML and DL will transform the world, and every business needs to evaluate this
technology to see how it can improve their offerings. Enterprises must also look to see how their
competitors are likely to react to this technology, and should also look at adjoining domains that are
further ahead in their adoption of AI.

The AI applications we currently see being actively pursued include autonomous driving; human-
computer communication (including in robots) such as natural language processing; healthcare
applications such as drug discovery, robotic surgery, doctor-on-demand smartphone services, and
drone assisted remote region aid delivery; the processing of big data with AI-based analytics where
there is an urgent need for AI to help automate big data processing; and cyber-security.

AI will impact every vertical including IT itself. AI is already being used to analyze performance-
monitoring streaming metrics to optimize the running of a data center. Google DeepMind turned its DL
technology to optimize the cooling system in the company’s data centers. By taking in sensor data
such as temperature and compute traffic flow, and by controlling windows, cooling water flows, and
fans, Google was able to reduce cooling energy demand by 40%, and to reduce the overall energy
cost of the data center by 15%. There is even research to train DL systems to program computers,
impacting how software will be created in future.

As a first step, enterprises should initiate reports on how AI can bring benefits, as well as the challenges
AI poses to their industry, and what measures they should take to ensure they are abreast of the
technology. A next step will be to initiate proof-of-concept projects to trial AI technology.

Recommendations for AI technology vendors


The adoption of AI technology will grow rapidly with the fast pace of technology transfer from research
into products and services. Key to the growth of AI will be the availability of dedicated
microprocessors that will further accelerate the training of DL and ML, as well as low-power features
for embedding trained solutions working in inference mode inside products. The pace of evolution in
this field is expected to be rapid, with the new hardware allowing more ambitious AI/ML architectures
and algorithms to be developed.

There will be a demand for AI services to assist businesses that lack expertise. While early adopters
will make use of ML cloud services and the wealth of material available in the open AI communities,
there will be a business opportunity to bring AI to the substantial number of laggards that have yet to
awaken to the possibilities of AI.

Public cloud machine learning services


Amazon Machine Learning
Amazon has been using advanced ML algorithms for many years and its ML services are based on
the ones it has perfected for internal use, such as those providing customers with product
recommendations in the Amazon retail business. Amazon defines ML as a part of artificial intelligence
that concerns systems that learn from observation. This is distinct from cognitive science, which it
views as being concerned with the study of human intelligence, for example building
intelligent systems that can emulate human intelligence. ML offers algorithms that can be applied to
data-intensive (big data, for example) problems where there is a need to perform classification,
perform regression across variables, or make predictions. While cognitive science aims to mimic
human intelligence, an ML approach is akin to how a plane flies: not by flapping its wings like a bird but
through novel man-made invention. Ultimately a plane and a bird achieve the same function,
but how they achieve it is vastly different. ML algorithms achieve intelligence but not necessarily in the
same way as a human brain. Neural networks do draw their inspiration from a model of the
brain, but while they share similarities with it, the training of and inferencing with neural networks is a purely
man-made innovation. The reason, of course, is that despite recent successes in neuroscience we still
don’t know a lot about how intelligence in the brain arises.

The target audience for Amazon ML is users who are expert in their application domain but are new
to ML and AI. Using the service requires no prior knowledge of ML algorithms. Furthermore, this
information is kept hidden and Amazon publishes no details of which algorithms are used “under the
hood”, but says state-of-the-art optimization techniques are used, such as elastic net regularization, to
train the ML algorithms.
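
Elastic net regularization blends an L1 and an L2 penalty on the model weights. Amazon does not publish its implementation, so the following is only a minimal sketch of one commonly used parameterization. The function names and the `lam` and `alpha` parameters are illustrative assumptions and are not part of Amazon ML.

```python
def elastic_net_penalty(weights, lam, alpha):
    """Penalty = lam * (alpha * L1 + (1 - alpha) / 2 * L2^2).

    Assumed glmnet-style convention: alpha=1 is pure L1 (lasso),
    alpha=0 is pure L2 (ridge). This is not Amazon ML's published formula.
    """
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


def regularized_loss(data_loss, weights, lam=0.1, alpha=0.5):
    # The training objective adds the penalty to the fit error, so
    # larger weights are only kept when they reduce the error enough.
    return data_loss + elastic_net_penalty(weights, lam, alpha)
```

A larger `lam` corresponds to a stronger amount of regularization, which is the kind of setting a service might expose as a coarse scale rather than a raw number.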

There is an expert mode available that allows a data scientist familiar with data processing concepts
to perform some fine-tuning, which overrides some of the automated optimization. The expert mode
exposes a recipe language where features are generated and experts can perform some algorithm
customization. For example, it is possible to specify the amount of regularization on a three-point
scale, so although these parameters are optimized, having domain knowledge allows an expert to set
the best values. There is, however, no selection of training algorithm possible, such as choosing a
different class of predictor. The default ML service is designed to perform optimization of model
parameters in order to remove this burden from the user.

Amazon ML democratizes the use of ML, and startups with domain expertise but no background in ML
can achieve the same results as a large enterprise with a team of data scientists. When using
Amazon ML, the user needs to have an idea of the model they want to create and understand their
data, for example which columns are the most predictive variables. The service can provide
results in real time through a REST-based API, or can perform batch processing of data.
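
As a rough illustration of the real-time path, the sketch below wraps a single prediction request. It assumes the AWS SDK for Python (boto3) and its `machinelearning` client; the helper name, model ID, endpoint URL, and record fields are hypothetical placeholders rather than values from this report.

```python
def predict_realtime(client, model_id, endpoint_url, record):
    """Request one real-time prediction from an Amazon ML model.

    `client` is expected to behave like boto3.client("machinelearning").
    Records map attribute names to string values, so every value is
    converted to a string before it is sent.
    """
    response = client.predict(
        MLModelId=model_id,
        Record={name: str(value) for name, value in record.items()},
        PredictEndpoint=endpoint_url,
    )
    return response["Prediction"]


# Usage (hypothetical identifiers; requires AWS credentials):
#   import boto3
#   client = boto3.client("machinelearning")
#   prediction = predict_realtime(
#       client,
#       "ml-EXAMPLEMODELID",
#       "https://realtime.machinelearning.us-east-1.amazonaws.com",
#       {"basket_size": 3, "country": "UK"},
#   )
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before any cloud resources exist.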

For ML experts who wish to customize the inner workings of ML algorithms, there is the option to use
the facilities of AWS to create a workspace and run one of the many ML and deep learning libraries
available. AWS supports GPUs for accelerating processing, and also supports libraries including
Spark and Spark ML. In addition, Amazon has open-sourced a deep learning library it has developed:
Deep Scalable Sparse Tensor Network Engine (DSSTNE).

One of the challenges in applying deep learning on large data sets is that the data may not fit on one
GPU. GPU RAM at the higher end can reach 384GB, but this doesn’t compare with CPU RAM that
can reach 3TB. The advantage of GPU over CPU in deep learning is a factor of 1,000 in speed, but
this advantage is reduced every time data needs to be swapped between CPU and GPU, even using
the latest processor links that Nvidia, for example, has created. Cluster management software that
can manage and control an application distributed across multiple servers is necessary, and Apache
Spark and Spark ML can perform this. However, for deep learning it is even better to be able to
distribute workloads across many CPUs and GPUs. Amazon offers these features automatically in
Amazon ML and also in DSSTNE.
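
The way data swapping erodes the GPU advantage can be sketched with simple arithmetic. The function below is a back-of-the-envelope model, not a benchmark: the function name and the example timings are assumptions chosen only to show the effect.

```python
def effective_speedup(cpu_seconds, gpu_compute_speedup, transfer_seconds):
    """Overall speedup once CPU-to-GPU data movement is included."""
    gpu_seconds = cpu_seconds / gpu_compute_speedup + transfer_seconds
    return cpu_seconds / gpu_seconds


# Illustrative numbers only: a job taking 1,000 s on CPU with a
# 1,000x compute speedup, plus a growing amount of swap time.
for transfer in (0.0, 1.0, 4.0):
    print(transfer, effective_speedup(1000.0, 1000.0, transfer))
```

Under these assumed numbers, even a few seconds of transfer time dominates the one second of GPU compute, which is why distributing the workload so that data stays resident on the accelerators matters.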

The goals that Amazon sets itself in pushing ML technology forward relate to solving e-commerce
problems, for example how to economically translate online retail sites for visitors who
speak a different language. There are advantages in using Amazon ML, because it connects to the
wider AWS ecosystem, such as AWS Lambda, the event-processing environment. For example, ML
forecasts can be triggered by the completion of data aggregation tasks, and the forecasts can then be
used to automatically order fresh supplies.
Figure 1: Benefits of Amazon ML

Source: Amazon

IBM Watson Cognitive Services and Power-based Computing


Watson-driven Cognitive Services
The terminology IBM prefers in relation to AI is cognitive X: cognitive business solutions, cognitive era,
cognitive computing, and so on. Within IBM, to be recognized as an IBM Cognitive product or service
the solution must satisfy four criteria: understand, reason, learn, and interact with humans in a natural
way. The solution may have further advanced elements but these four are essential. The cognitive
solution must deliver insights to the user organization and deliver information faster than other
methods. Machine learning and deep learning are recognized as subsets within the cognitive
approach, and cognitive computing sits within the broader field of AI. IBM also prefers the term
augmented intelligence to artificial intelligence, and one can see the advantages from a marketing
perspective because it emphasizes technology to assist humans rather than replace them. IBM also
emphasizes acceleration rather than automation in how this technology is deployed, again helping
humans work faster, rather than necessarily replacing them.

The Watson field scope includes the combination of AI and robotics, where the emphasis with robotics
is also on helping humans. A good example is the robot concierge assistant at Hilton Hotels. For IBM,
it is a case of seeing where the market is heading in the use of AI and robotics. Robotics is not,
however, a core activity at IBM.

The Watson services are targeted at users who wish to make use of cognitive technology as
painlessly and swiftly as possible without requiring expertise in AI. The target is a higher-level user,
with the aim to enable the building of full-scale and deployable cognitive solutions. This is clearly a
much broader market than just those with the expertise to build their own AI solutions from the bottom
up. Nevertheless, experts in AI do find using Watson a time-saver.

The Watson API uses ML and DL underneath, but these details are not exposed to the user. IBM is
continually evolving and improving the solutions that underlie Watson. For example, the original
Watson that won Jeopardy! used a non-AI solution called DeepQA. If Watson were to play Jeopardy!
today, the technology behind it would be based on DL, which is also the dominant technology in
image and speech analysis.

One of the notable improvements with Watson is that Q&A with it can be performed over a protracted
conversation, so that the machine can understand a thread of conversation. A challenge that IBM is
involved in is the creation of a Watson debater that will perform in a machine-versus-human debate.

Watson has evolved into a cloud offering that accumulates information over domains of knowledge,
such as law, medicine, the various branches of science and engineering, and so on. This knowledge
is expanded with new academic research in the various domains as it is published. The user therefore
benefits from Watson’s knowledge being continually updated and evolved.

IBM is also involved in producing next-generation microchips for supporting AI in projects such as
TrueNorth/SyNAPSE and Celia (Cognitive Environment Laboratory for Intelligent Agent). Celia is
being used to provide recommendations in mergers and acquisitions by applying Watson to big data
on financial information.

Most of the major application areas for Watson involve big data and the need to automate analysis of
information within it. Creating new industry solutions with Watson involves amassing a “corpus” of
data relevant for the task at hand, with human intervention to cull irrelevant material, manually
“curating the content”. Watson then pre-processes the corpus to create indices and other metadata to
be used in later analysis, a process called ingestion that also produces knowledge graphs. A human
expert then trains Watson to interpret the ingested information using ML and DL. Training may involve
matching question-and-answer pairs, building knowledge within Watson. The Watson solution is then
able to make recommendations and suggestions backed by its corpus of evidence, and it continually
updates itself through new user interactions.

For application developers on BlueMix, Watson connects seamlessly with the other services on the
cloud.
IBM Power-based Computing
Note: the comparative performance benefits quoted below are based on statements by IBM.

IBM produces a microprocessor chip, Power, that competes with Intel, the dominant market player for
PCs and supercomputers. Power uses a RISC instruction set and is being re-architected for a new
generation of applications. The current version is Power8, with Power9 due in 2017, and continues
with the shift of focus to support big data, ML, and cognitive workloads. There is support for the open
source clustering framework Apache Spark, which has an advanced “directed acyclic graph” (DAG) execution
engine that supports acyclic data flow and in-memory computing, as well as distributed computing
features. Spark also has a ML library, MLlib, which is able to exploit the scalability of Spark. IBM has
created a Spark technology center, recognizing the disruptive nature of this technology. IBM itself is a
heavy user of Spark, and ML is an important use case for Spark.

In addition, Power8 supports Nvidia’s NVLink advanced communications pipeline technology between
CPU and GPU that has higher data throughput than the Intel x86 PCI standard, with 16GB per sec
per direction, a factor of five improvement over PCI. Nvidia K40 and K80 GPUs are attached over the
PCI bus.

Power9 will support the next-generation NVLink 2.0, which will offer a 33% increase in communication
speed between CPU and GPU and improvements in performance. Also on the roadmap are
enhancements to IBM’s Coherent Accelerator Processor Interface (CAPI) which will extend to NVLink
2.0, such as full bandwidth advantages and full coherence between CPU and accelerator, instead of
requiring explicit directives by the programmer.

Power9 CPU and Nvidia Pascal GPU make a powerful combination for DL and big data applications
where large amounts of data need to be moved. The Power9 roadmap is for a scaled-out version in
H2 2017, and for a scaled-up version in late 2018.

IBM is open to new technology that will enhance the Power platform and will partner with AMD and
Intel if they release new-generation chips. IBM established the OpenPower Foundation to support
collaborative development on the platform, and Nvidia is a founding member. Innovative partners
working on Power include Kinetica, an in-memory distributed database that exploits clusters of GPUs.
Using Kinetica significantly reduces the database footprint.

The Power platform’s CAPI features allow the connecting of field-programmable gate array (FPGA)
accelerator chips in a variety of application areas including financial services, healthcare, retail,
computer vision, ML, and others.

Power has 120GB of on-chip cache built in for in-memory big data applications. It also has on-chip
encryption for security and IP protection. The next-generation Power10 is already being designed for
launch in 2020.

The use of Power and its large in-memory data store and direct access to CAPI Flash allows
organizations to reduce their need for x86 clusters. For example, a customer was able to reduce a
cluster of 25 x86 CPU nodes to a single Power8 chip with CAPI Flash.

Using Power on a Spark 1TB logistic regression ML example, it was found that memory bandwidth
increased four times compared with x86 CPUs. The cache size (from L1 to L4) is four times greater
than on Intel CPUs, making Power a good choice for big data applications, and also ML, with the
advantage that data can be held in memory during the training iterations. The higher supported thread
density also allows more data to be pushed over the network.
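
To illustrate why holding data in memory helps during training iterations, here is a minimal pure-Python sketch of logistic regression trained by batch gradient descent. It is not the Spark MLlib implementation; the function names, learning rate, and epoch count are assumptions chosen for a tiny one-feature example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, epochs=200, learning_rate=0.5):
    """Batch gradient descent for one-feature logistic regression."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Every epoch re-reads the full data set, which is why keeping
        # it in memory (rather than reloading it) pays off.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    # Classify as 1 when the modeled probability is at least 0.5.
    return 1 if sigmoid(w * x + b) >= 0.5 else 0
```

With 200 epochs the data set is scanned 200 times, so the cost of each scan, and therefore memory bandwidth and cache size, directly affects total training time.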

Watson services are available on Power servers on BlueMix and all the major DL libraries are
supported on the platform, including Torch, Theano, DL4J, Caffe, Computational Network Toolkit
(CNTK), and TensorFlow. With one click these frameworks are installed for an easy start on Power.
Figure 2: IBM Watson services

Source: IBM

A good example of the advantages in using GPUs with Power is demonstrated in a Spark model for
predicting adverse drug reactions. Figure 3 shows how the Learn Building Time (training the model)
was reduced from 701.79 secs to 28.12 secs, a 25x improvement in speed. The data points in the figure
are in seconds.
Figure 3: Power: without GPUs (top row) and with GPUs (bottom row)

Source: IBM
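
The speed improvement quoted for the Learn Building Time can be checked directly from the two timings. The variable names below are illustrative; the two figures are the ones reported in the text.

```python
without_gpu_seconds = 701.79  # Learn Building Time, Power without GPUs
with_gpu_seconds = 28.12      # Learn Building Time, Power with GPUs

speedup = without_gpu_seconds / with_gpu_seconds
time_saved = 1.0 - with_gpu_seconds / without_gpu_seconds

print(f"speedup: {speedup:.2f}x")       # about 25x
print(f"time saved: {time_saved:.1%}")  # about 96%
```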

Google’s AI and ML activities


Google Cloud Prediction API and Google Cloud ML
Google’s founders have a life-long interest in AI, and learning how to make better search returns is
clearly a perfect task for a machine. Initially the various product streams within Google developed
their own AI technology, but around 2012 Andrew Ng (now head of AI at Baidu) reorganized AI activity
into a horizontal team that the various product teams could tap for AI expertise. Currently there are
more than 100 Google products using AI. The goal of this AI capability, which is located at Google HQ
in Silicon Valley, is around machine learning, training models to make predictions on data sets.
Google acquisition DeepMind, which is based in London, has a different agenda. Its goal is more
research-oriented, creating cutting-edge AI technology. For example, it investigates work-
management techniques to increase the rate of cognition per unit of data. Its technology is passed on
to the Mountain View-based ML team when it’s ready for delivering business value in real-world
applications.

Google’s AI technology is being pushed in a number of ways and the ML services on Google Cloud
are being heavily invested in, with the services currently available to the public in beta. Google Cloud
Prediction API is a ML platform that offers APIs in a number of areas including fully trained models
that used Google data and were built by Google scientists. This allows developers who need a vision
feature, for example, to use the models out of the box without needing to be expert in ML. The
application categories include vision, natural language processing, and speech translation. These
cognition APIs improve the human-machine interaction, making communicating with machines as
natural as possible, and they are designed to enable the machine to understand the intent of the
human.

At the other end of the spectrum, for full flexibility and custom design there is the Google Cloud ML,
which offers TensorFlow, a leading deep learning library. TensorFlow has its origins in the Google
Brain project, which continues as a research-oriented AI project. The target audience for Google
Cloud ML comprises data scientists who have their own data and wish to build their own ML models.
Both services are used internally within Google and are offered to external users through
easy-to-use interfaces.

The target audiences for Google’s ML services are currently at these two ends of the spectrum. If the
API ML services have labeled the data of interest to the user, these are excellent starting points, but
for other applications you would need to be an expert in data science and neural networks to build
fresh models with Google Cloud ML and TensorFlow. Over time Google plans to fill out the middle
ground with tools that serve users with a range of ML skills. Google expects commercial tooling for
ML in the market to settle in the next year or two as experience is gained with what works best for
users.

The ML use cases Google envisages include retail applications, such as improving customer sales
on websites by optimizing UI layout, and monitoring call center telephone conversations for
quality-control purposes, for example by running sentiment analysis to determine whether callers ended
their conversation sounding happy or not. Natural language translation in real time is another strong ML
use case, as is detecting which references are made to products. These types of unstructured and analog
forms of data are ideal for analysis by ML.

When Google Cloud ML goes fully live it will abstract away the hardware it runs on, so all hardware
concerns will be managed automatically, and this includes balancing workloads across GPUs
(currently, Nvidia high-end GPUs are essential accelerators for training deep neural networks), and
Google’s own Google Tensor Processing Units (TPUs), which are hardware inferencing accelerators.
TPUs were used by AlphaGo, the DeepMind Go-playing system that beat world champion Lee Sedol.
Google is currently working on a next-generation TPU that will support its ML services. It sees further
possible improvements in accelerating TensorFlow with dedicated hardware for training and
inferencing deep learning systems. Linking racks of AI-dedicated hardware into an AI supercomputer
will take AI to the next level in its evolution.
Figure 4: Experiment with Google TensorFlow at playground.tensorflow.org

Source: playground.tensorflow.org

Microsoft Azure Machine Learning, Cognitive Services and Deep Learning Library Computational Network Toolkit (CNTK)
Microsoft has been researching advanced ML for over 20 years for its internal needs. It has invested
$15bn in its Azure cloud infrastructure, and in Microsoft’s recent transformation through CEO Satya
Nadella it is embracing open source software (OSS). In particular for ML, relevant OSSes are
Hadoop, Spark, and the statistics language R. In 2015 Microsoft acquired Revolution Analytics, a key
provider of services for the OSS R project, and R support now comes out of the box in SQL Server
2016. Today Microsoft says 80% of Fortune 500 companies are on the Microsoft Cloud, and its
objective is to achieve a $20bn annualized revenue stream on commercial cloud by 2018.

AI and ML specifically are a major competency target for Microsoft, which is spending $12bn of its
R&D budget on this. Its aim is to provide analytics wherever the data is situated, on the cloud or on-
premise. An outcome of this investment is the Cortana intelligence suite for big data, offering
advanced analytics tools for ingesting data and orchestrating other tools for data analytics, ML, and
predictive analytics, enabling engineers to build their own models. The Azure ML target audience
resides in enterprises and is split between data scientists, data engineers, and developers.

One of the key application areas for Microsoft is cognitive services, including vision, speech,
language, knowledge, and search, all exploiting ML. This work was previously conducted under the
Project Oxford name. The most typical envisaged users are application developers who do not need
to be experts in ML, and do not possess deep knowledge of neural networks.

The idea behind Azure ML is to democratize ML by making it easy to use, where basic statistics
knowledge is enough to build models with the use of simple drag-and-drop tools.

Microsoft also has ready-to-use pre-trained ML APIs, such as speech, knowledge, vision, language,
and search, all using simple REST calls. These APIs are available for developers to build applications
infused with ML, such as, for example, intelligent agents capable of complex tasks, using the Cortana
Intelligence Suite, Bot Framework, or Cortana, Microsoft’s personal assistant technology. Developers
and analysts can also consume ML models directly within Microsoft Excel.

The ML library exposed on Azure through the service tools benefits from extensive research conducted
within Microsoft, and these services are being continually enhanced. With Microsoft’s support for R,
experts as well as aspiring data scientists will find their needs supported. Microsoft made
improvements on top of open source R, including:
• Removed certain limitations, such as memory limit.
• Added distributed computing with R, allowing large amounts of data to be processed in
parallel with R.
• Deployed it on multiple systems including Spark, Linux, and Windows without re-coding.

The R community currently has 3.5 million active users, and therefore brings a potentially large
community to Azure with support for R on its cloud and tools. Azure also provides space for data
scientists to share R code as part of an experiment in a gallery, Cortana Intelligence Gallery,
gallery.cortanaintelligence.com.
Deep learning with Computational Network Toolkit (CNTK)
DL is not explicitly available on Azure ML because it only allows for simple neural networks. For users
who want a more advanced hands-on approach, and especially for DL, Microsoft has open-sourced a
DL library developed internally called Computational Network Toolkit (CNTK). CNTK offers high-level
support for building models with advanced DL architectures such as Convolutional Neural Networks
(CNN) and Long Short-Term Memory (LSTM) neural networks. Users can start with pre-built CNTK
binaries, or build the library from source code. Importantly, CNTK supports GPU accelerators at
multiple levels, including single-GPU, multi-GPU, and multi-machine-multi-GPU with built-in support
for massive data sets. CNTK is also extensible, allowing custom computation nodes to be plugged in,
making it both a production-grade and a research-grade DL library. Many cognitive services available
on Azure have used CNTK. CNTK can be used on Azure Series N GPU VMs as IaaS.
Figure 5: Microsoft and partner-led solutions using Cortana Intelligence

Source: Microsoft

Hardware advances to accelerate AI applications


Vendors with microprocessors designed for AI
Introduction
Running machine learning algorithms on traditional CPU-based computers is too slow to provide
responses in real time or near real time during production use (inferencing), and far too slow for training
neural networks. The market for accelerating deep learning systems was defined by Nvidia with its
GPUs, and the company is the dominant player. There is another accelerator available on the market
from KnuEdge, which has only recently gone public. Two additional players, Intel and Graphcore, will
enter the market in 2017, so competition in the accelerator market will heat up, but clearly
Nvidia is not standing still and continues to innovate.

The competition in the accelerator market, for what will turn into a billion-dollar industry in the next few
years as AI-infused products become commonplace, is healthy and will take AI/ML to the next level.
The interesting trend is that it is no longer just about hardware. For example, Nvidia offers a deep
learning platform that includes sophisticated vision and high-throughput processing software, and the
use of Field-Programmable Gate Array (FPGA) processors by vendors including Intel and Microsoft
provides another option between programmability and hard-wiring.

As accelerators improve in computational capability, the possibilities for innovation on the theoretical
and algorithmic side open up, so hardware and software advances combine to yield
better-performing AI systems. Ovum expects to see the hardware acceleration market take software
AI innovation to new heights, with DL expanding to embrace more ambitious architectures.

On the hardware side, the variables in contention are data throughput, processing speed (in training
and inferencing), power consumption, and cost. The capability of the AI hardware units to form
distributed systems across memory and across servers is also an important consideration.
Intel and Nervana
Nervana was acquired by Intel in August 2016, and Ovum will update this entry once Intel goes public
with its plans for the acquisition. What follows describes Nervana as it stood prior to the acquisition.

Nervana Cloud runs on a deep learning framework called Neon, which is downloadable from Nervana’s
website or GitHub. Nervana has spent a lot of time on the low-level primitives, optimizing them for
speed on existing hardware platforms such as GPUs. There are also some algorithmic advances that
offer better ways to perform the fundamental operations, including convolution and matrix
multiplication. These are the operations that need to be accelerated to make deep learning models
train quickly.

Neon is an open source, Python-based set of libraries for developing deep learning models.
Nervana says that Neon is more than twice as fast as other deep learning frameworks such as Caffe
and Theano. Neon’s performance advantage is the result of assembler-level optimization, multi-GPU
support, optimized data-loading, and the use of the Winograd algorithm for computing convolutions.
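
To give a feel for why the Winograd approach saves work, the sketch below shows its smallest
one-dimensional form, usually written F(2,3): two outputs of a three-tap filter are computed with four
multiplications instead of six. This is an illustration only and not Neon’s code; Neon’s kernels
operate on two-dimensional tiles in hand-tuned GPU assembler, and the function names here are
hypothetical.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter slid over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs with 4 multiplications between data and filter terms."""
    # Filter transform: depends only on the filter, so it can be computed
    # once and reused for every tile of input data.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Data transform (additions only) followed by the 4 multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


# Illustrative values, chosen arbitrarily.
data = [1.0, 2.0, 3.0, 4.0]
taps = [1.0, 0.0, -1.0]
print(direct_f23(data, taps))
print(winograd_f23(data, taps))
```

The saving looks small here, but in two dimensions the same idea reduces the multiplication count by
a larger factor, which is where the benefit for convolutional networks comes from.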

Once a model is written in Neon, it can be run on low-powered hardware such as a laptop or other
CPU-only machine, and it will work, but slowly. It can also be run on in-house GPU hardware, but most
enterprise customers will prefer to use Nervana’s Cloud Platform. The platform uses multiple optimally
connected GPUs within a system. The user interacts with it through Neon, and can submit a job to
the cloud and run it against data already stored on AWS.

The cloud platform uses Nervana’s own physical machines, but these are linked to AWS by a very fast
connection. To the user it looks like just an API call within AWS, one that runs deep learning much
faster than AWS itself would. Nervana hides all the complexity that makes things fast, distributing
the loads across multiple GPUs within a server and multiple nodes within the cloud.
Figure 6: Nervana Deep Learning Platform

Source: Nervana

Nervana Cloud has an easy-to-use interface that allows users to kick off training with a click of a
button and track progress on the dashboard. All the capabilities of Nervana Cloud are also accessible
via a command line interface called ncloud.

The Nervana Engine (due 2017) is an ASIC that is custom-designed and optimized for deep learning.
It will include everything required for deep learning with optimizations resulting in a 10-fold increase in
training speed.
Graphcore
Graphcore, a startup in the AI accelerator space based in Bristol, UK, launched publicly on October
31, 2016, announcing a successful and oversubscribed series A venture funding round of more than $30m,
a record in the UK. The names backing the venture have stellar backgrounds. Key funders include those
linked to high-technology firms, such as Bosch and Samsung, and VC firms with a history in successful
ventures, including Apple and Google. The result is that Graphcore is able to call on first-class
expertise and a deep network in the high-technology industry to help steer the company as it readies
the launch of its flagship product in 2017, an AI accelerator microprocessor called the Intelligent
Processing Unit (IPU). The company’s location in Bristol is not by chance. CEO Nigel Toon and CTO
Simon Knowles have a background in custom processors, and many of their team trace their roots to
Inmos, the Bristol-based company behind the Transputer.

The AI accelerator market, in particular for training and inference on deep neural networks, is
forecast to grow from its current million-dollar bracket to a billion-dollar industry in about three
to five years. The success of general-purpose GPUs for deep learning neural networks has made Nvidia
the dominant player in AI acceleration. It is now the turn of new hardware innovations dedicated to
AI to up the game, and Graphcore promises to deliver this. The competition in the accelerator market
bodes well for AI systems designers, and will lead to the next round of innovation.

KnuEdge
KnuEdge was formed in 2005 by Dan Goldin, who served as NASA Administrator (the agency’s highest
role) from 1992 to 2001. KnuEdge’s vision was to use neurological principles to solve new challenges,
and two areas have been its focus: voice recognition for human-machine interfaces, and new hardware
architecture for neural computing. The company is funded by private investors and spent a decade in
stealth mode developing its technology, but it has already earned revenue in excess of $20m and is
now public about its business. It has two business units: KnuVerse for voice biometrics, and KnuPath
for neural computing.

KnuPath has created a technology called LambdaFabric, and the first generation of this is the
KnuPath Hermosa processor. LambdaFabric includes a low-latency, high-throughput network router
for connecting Hermosa processors, with plans to connect GPUs, FPGAs, and/or CPUs. The
Hermosa chip is organized as a hierarchy: eight tiny digital signal processors (tDSPs) form a
cluster with associated memory and network communications, eight clusters form a Supercluster, and
four Superclusters form a KnuPath Hermosa processor. In total there are 256 tDSP cores per chip, with
each one handling a processing thread. Multiple Hermosa chips run in parallel so that, for example,
1,000 cores can work simultaneously on one application. A significant feature of the chip is
its low power consumption: at around 34W, it draws less power than rival GPUs. LambdaFabric is also
differentiated by its extremely low latency (<450ns even rack-to-rack), and by the Multi-Program
Multi-Data (MPMD) capability of the Hermosa processor. KnuPath is currently shipping a two-processor
Reference Design Kit (RDK) through www.KnuPath.com.
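
The core counts quoted above follow directly from the hierarchy, and a minimal sketch of the
arithmetic is given here. The constant and function names are hypothetical and chosen for
illustration; only the three multipliers and the 1,000-core example come from the text.

```python
import math

# Hierarchy multipliers as described in the text.
TDSPS_PER_CLUSTER = 8
CLUSTERS_PER_SUPERCLUSTER = 8
SUPERCLUSTERS_PER_CHIP = 4


def cores_per_chip():
    """tDSP cores on one Hermosa processor."""
    return TDSPS_PER_CLUSTER * CLUSTERS_PER_SUPERCLUSTER * SUPERCLUSTERS_PER_CHIP


def chips_needed(target_cores):
    """Smallest number of chips that provides at least target_cores cores."""
    return math.ceil(target_cores / cores_per_chip())


per_chip = cores_per_chip()
chips = chips_needed(1000)
print("cores per chip:", per_chip)
print("chips for 1,000 cores:", chips, "giving", chips * per_chip, "cores")
```

On these figures, the 1,000-core example corresponds to four chips working in parallel.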
Figure 7: LambdaFabric Neural Computing

Source: KnuEdge

Nvidia
The market leader in DL acceleration is Nvidia, with its latest-generation Pascal GPU and platforms
built on this technology. General-purpose computing on GPUs (GPGPU) was conceived to boost
computer performance once CPU technology hit a performance ceiling (build CPUs any smaller and
faster and they literally melt). Nvidia introduced its CUDA parallel-computing interface for its
high-end GPUs to serve this role in wide-ranging applications. Jürgen Schmidhuber’s team at the Swiss
AI lab (IDSIA) and Geoffrey Hinton’s team at the University of Toronto ported their DL systems to
CUDA processors and saw a massive boost in performance that has subsequently allowed the AI field to
evolve rapidly. In 2012 the Hinton team’s AlexNet won the ImageNet visual recognition challenge, a
breakthrough for DL. They used the Nvidia Kepler GPU architecture, which gave a 40x acceleration in
training. Since then Nvidia has reoriented its company mission toward supporting AI, and deep learning
systems in particular. The current top GPU specification is the Nvidia Pascal GP100, which offers a
65x performance improvement over Kepler.

Nvidia recently announced its most advanced product yet, targeted at AI. Xavier is a system-on-chip
(SOC) AI supercomputer based on the next-generation GPU architecture (Volta, the successor to Pascal)
to be released in 2017. Xavier is designed to accelerate the adoption and embedding of deep learning
AI in advanced systems such as autonomous driving for cars, and will replace Nvidia’s current
Deep Learning Drive PX2, offering greater computational capability, lower power consumption, and a
smaller physical footprint.

To support the transfer of ML applications from the lab into hardware products, Nvidia has launched a
series of embedded system development kits under the Jetson brand. The latest, Jetson TX1, is a
supercomputer the size of a credit card, combining a 64-bit ARM CPU and a 256-core Maxwell GPU
capable of 1 teraFLOPS. The TX1 is designed to run DL systems in inference mode only, which is what
allows it to be so small. The hardware needed to train DL models for the TX1 is naturally much
larger. Many use cases for the TX1 exist, including Internet of Things-connected intelligent machines.
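
As a rough sanity check on how 256 cores can reach about 1 teraFLOPS, peak throughput can be
estimated as cores multiplied by clock rate multiplied by operations per core per cycle. The sketch
below makes that arithmetic explicit. Only the core count comes from the text: the clock rate of
about 1 GHz, the counting of a fused multiply-add as two operations, and the packing of two
half-precision values per core are assumptions made here for illustration, and the names are
hypothetical.

```python
def peak_flops(cores, clock_hz, flops_per_core_per_cycle):
    """Theoretical peak floating-point operations per second."""
    return cores * clock_hz * flops_per_core_per_cycle


CORES = 256               # from the text
ASSUMED_CLOCK_HZ = 1.0e9  # assumption: roughly 1 GHz GPU clock
FMA_FLOPS = 2             # assumption: one fused multiply-add counts as 2 operations
FP16_PACKING = 2          # assumption: two half-precision values processed per core

single_precision = peak_flops(CORES, ASSUMED_CLOCK_HZ, FMA_FLOPS)
half_precision = peak_flops(CORES, ASSUMED_CLOCK_HZ, FMA_FLOPS * FP16_PACKING)

print(f"single precision peak: {single_precision / 1e12:.3f} TFLOPS")
print(f"half precision peak:   {half_precision / 1e12:.3f} TFLOPS")
```

Under these assumptions the 1 teraFLOPS figure corresponds to half-precision arithmetic, which is
generally adequate for inference; real workloads will achieve less than the theoretical peak.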

Nvidia is supporting education in AI and DL through its Deep Learning Institute. It is partnering
with online learning providers Coursera and Udacity, as well as with Microsoft, to deliver self-paced
courses.
Field-Programmable Gate Array (FPGA) for AI systems
FPGA accelerator technology is not new, but there has been a resurgence of interest in its use for
deep learning as a flexible alternative to running AI software systems on CPUs plus fixed hardware
accelerators. AI practitioners agree that the next few years will see the pace of evolution accelerate,
which makes building specialized AI hardware a risky undertaking: the field is likely to move so fast
that the hardware could rapidly become outdated. On the other hand, the current best practice of
using a combination of CPU and GPU may not be powerful enough for the most advanced AI
applications. FPGAs balance the efficiency of hardware with the advantage of being
reprogrammable when improved AI system designs and algorithms appear. The reprogramming is a
non-trivial exercise but is not as painful as creating a new piece of hardware.

FPGAs are already in use on the major clouds, with Amazon, Baidu, Google, and Microsoft using them
for network traffic processing. The next major application area will see FPGAs used for AI and DL
systems as AI becomes an essential ingredient in cloud services. Microsoft has created an internal
FPGA capability through its Project Catapult, and Intel, which acquired Altera, a major FPGA player,
for $16.7bn in December 2015, has this market in its sights.

Appendix
Further reading
Making artificial intelligence applications safe for humans, IT0022-000801, October 2016.

Nvidia announces the most advanced AI computer in a SOC: Xavier, IT0022-000800, October 2016.

The next chip arms race will be to power machine learning, IT0022-000725, June 2016.

Nvidia bets on deep learning, T0022-000675, April 2016.

DeepMind AlphaGo and general artificial intelligence: are we there yet? IT0022-000653, March 2016.

Google DeepMind achieves artificial intelligence (AI) milestone, IT0022-000639, March 2016.

Digital Economy 2025: Technology Outlook, TE0009-001466, October 2015.

Machine learning in business use cases: Artificial intelligence solutions that can be applied today,
IT0022-000335, April 2015.

Author
Michael Azoff, Principal Analyst, Software Infrastructure Group

michael.azoff@ovum.com

Ovum Consulting
We hope that this analysis will help you make informed and imaginative business decisions. If you
have further requirements, Ovum’s consulting team may be able to help you. For more information
about Ovum’s consulting capabilities, please contact us directly at consulting@ovum.com.

Copyright notice and disclaimer


The contents of this product are protected by international copyright laws, database rights and other
intellectual property rights. The owner of these rights is Informa Telecoms and Media Limited, our
affiliates or other third party licensors. All product and company names and logos contained within or
appearing on this product are the trademarks, service marks or trading names of their respective
owners, including Informa Telecoms and Media Limited. This product may not be copied, reproduced,
distributed or transmitted in any form or by any means without the prior permission of Informa
Telecoms and Media Limited.

Whilst reasonable efforts have been made to ensure that the information and content of this product
was correct as at the date of first publication, neither Informa Telecoms and Media Limited nor any
person engaged or employed by Informa Telecoms and Media Limited accepts any liability for any
errors, omissions or other inaccuracies. Readers should independently verify any facts and figures as
no liability can be accepted in this regard – readers assume full responsibility and risk accordingly for
their use of such information and content.

Any views and/or opinions expressed in this product by individual authors or contributors are their
personal views and/or opinions and do not necessarily reflect the views and/or opinions of Informa
Telecoms and Media Limited.
