
MAY 2019 | WWW.THE-SCIENTIST.COM

AI TACKLES
BIOLOGY
HOW MACHINE LEARNING WILL
REVOLUTIONIZE SCIENCE AND MEDICINE

BUILDING BRAINS
WITH SILICON

ALGORITHMS
FOR CANCER

DATA DIVING
IN BIOLOGICAL
IMAGES

PLUS
ARTIFICIALLY INTELLIGENT
DRUG DEVELOPMENT
HUNT FOR 10-PLEX
BIOMARKERS
WITH UNMATCHED
SENSITIVITY
With Quanterix SP-X™, unparalleled Simoa™ 10-plex detection of circulating protein biomarkers is now possible at the earliest stages of disease progression – even at healthy baseline levels.

Introducing the Quanterix SP-X Imaging and Analysis System, [...] detection at both acute and baseline levels. With the new SP-X, oncology and immuno-oncology researchers and others who rely on robust multiplexing capabilities now have access to next generation Simoa planar technology in an easy-to-use [...] research, and ultimately accelerate drug approvals.

Visit quanterix.com/SP-X for more information.

Quanterix.com | © 2018 Quanterix, Inc. SP-X™ is a registered trademark of Quanterix, Inc.


For research use only. Not for diagnostic procedures.
Vi-CELL BLU
Cell Viability/Counting Analyzer
does more work
in less time, using
fewer resources

• 50% faster sample processing speeds in FAST mode
• sample volumes as low as 170 μL in FAST mode
• improved instrument-to-instrument comparability
• greater sample capacity: choose between a 24-position carousel and a 96-well plate loader
• compact 21.2" x 16.5" footprint

Different by (your) design


The new Vi-CELL BLU Cell Viability Analyzer from Beckman Coulter Life Sciences.
Before we upgraded our Vi-CELL XR, we asked labs around the world what improvements they wanted
to see. As a result, our new Vi-CELL BLU is a next-generation cell viability analyzer with faster processing
speeds, expanded sample capacity (choose between a 24-position carousel and a 96-well plate), minimized
sample test volumes (170 μL in FAST mode), improved instrument-to-instrument comparability, and a
smaller footprint with on-board computer and monitor (just 16.5” x 21”, 42 cm x 54 cm).
Learn more about Vi-CELL BLU now, at beckman.com/blu

© 2019 Beckman Coulter, Inc. All rights reserved. Beckman Coulter and the stylized logo are trademarks or
registered trademarks of Beckman Coulter, Inc. in the United States and other countries.
PART-3390ADV02.18
be INSPIRED
drive DISCOVERY
stay GENUINE

You’ll be
thrilled to pieces.
NEBNext® Ultra™ II FS DNA Library Prep Kit
Novel Enzymatic Fragmentation System

Do you need a faster, more reliable solution for DNA fragmentation and library construction? The NEBNext Ultra II FS DNA Library Prep Kit meets the dual challenge of generating high quality next gen sequencing libraries from ever-decreasing input amounts AND simple scalability. Enzymatic shearing increases library yields by reducing DNA damage and sample loss. Our novel fragmentation reagent is combined with end repair and dA-tailing reagents, and a single protocol is used for a wide range of input amounts (100 pg – 500 ng) and sample types.

You’ll be thrilled to pieces with the result – a reliable, flexible, high-quality library prep that is fast and scalable.

Visit NEBNextUltraII.com to request your sample today. For more information, please email us at gbd@neb.com.

NEBNext Ultra II FS DNA produces the highest yields, from a range of input amounts: library yield (nM) was compared for NEBNext Ultra II FS, Kapa™ HyperPlus, Covaris®, and Illumina® Nextera® at DNA inputs of 100 pg, 500 pg, 1 ng, 50 ng, 100 ng, and 500 ng (13, 10, 9, 5, 4, and 3 PCR cycles, respectively). Libraries were prepared from Human NA19240 genomic DNA using the input amounts and numbers of PCR cycles shown. For NEBNext Ultra II FS, a 20-minute fragmentation time was used. For Kapa HyperPlus libraries, input DNA was cleaned up with 3X beads prior to library construction, as recommended, and a 20-minute fragmentation time was used. Illumina recommends 50 ng input for Nextera, and not an input range; therefore, only 50 ng was used in this experiment. “Covaris” libraries were prepared by shearing each input amount in 1X TE Buffer to an insert size of ~200 bp using a Covaris instrument, followed by library construction using the NEBNext Ultra II DNA Library Prep Kit (NEB #E7645). Error bars indicate standard deviation for an average of 3–6 replicates performed by 2 independent users.

One or more of these products are covered by patents, trademarks and/or copyrights owned or controlled by New England Biolabs, Inc. The use of these products may require you to obtain additional third party intellectual property rights for certain applications. ILLUMINA® and NEXTERA® are registered trademarks of Illumina, Inc. KAPA™ is a trademark of Kapa Biosystems. COVARIS® is a registered trademark of Covaris, Inc. © Copyright 2017, New England Biolabs, Inc.; all rights reserved.
MAY 2019

Contents
THE SCIENTIST THE-SCIENTIST.COM VOLUME 33 NUMBER 05
Features
ON THE COVER: © ANDRIY ONUFRIYENKO, GETTY IMAGES

22 Run Program: Cancer
Using input from images to -omes, artificial intelligence can find patterns in tumors and generate prognoses where human clinicians see only a jumble of data.
BY AMBER DANCE

28 A Silicon Brain
Biology-inspired computer chips may help scientists better simulate neurological function.
BY SANDEEP RAVINDRAN

36 A Deeper Look
With the help of computer programs that learn from experience, researchers look for meaning in vast volumes of image data.
BY JEF AKST AND CAROLYN WILKE



MAY 2019

Department Contents

10 FROM THE EDITOR
Cerebral Inception
Our brains have evolved to the point where we can build artificial brains that might help us understand our brains. Let that sink in.
BY BOB GRANT

12 THE BASICS
Artificial Intelligence: An Introduction
A brief history of AI, machine learning, artificial neural networks, and deep learning
BY JEF AKST

13 NOTEBOOK
Going Through the Motions; Listening for Your Health; A Better Beehive; Into the Cell

20 MODUS OPERANDI
Microbiology Meets Machine Learning
Artificially intelligent software augments the study of host-pathogen interactions.
BY RUTH WILLIAMS

21 CRITIC AT LARGE
AI Versus Animal Testing
Machine learning could be the key to reducing the use of animals in experiments.
BY THOMAS HARTUNG

46 THE LITERATURE
AI for filtering out sequence noise; algorithms predict CRISPR repairs

48 PROFILE
Bob Murphy: Autopilot Advocate
BY SHAWNA WILLIAMS

51 SCIENTIST TO WATCH
Nick Turk-Browne: Pattern Seeker
BY CATHERINE OFFORD

52 LAB TOOLS
Automating the Fight Against Superbugs
Antibiotic resistance is on the rise. Can AI help?
BY AMBER DANCE

56 BIO BUSINESS
Computing a Cure
The pharmaceutical industry is looking to machine learning to overcome complex challenges in drug development.
BY BIANCA NOGRADY

60 READING FRAMES
Robots and Eureka Moments
Artificial intelligence can be trained to recognize patterns and predict results. But will machine learning ever be able to make novel scientific discoveries?
BY KARTIK HOSANAGAR

64 FOUNDATIONS
Learning Machine, 1951
BY JEF AKST

IN EVERY ISSUE
9 CONTRIBUTORS
11 SPEAKING OF SCIENCE
62 THE GUIDE
63 RECRUITMENT

ANSWER TO PUZZLE ON PAGE 11
Across answers: STIRRUP, PAPER, LLAMA, ANNELID, OXIDIZE, YUCCA, ANGKOR WAT, ORDER, FORTRAN, ISOTOPE, HEGEL, NODES, TIGRESS



MAY 2019

Online Contents

THIS MONTH AT THE-SCIENTIST.COM:

VIDEO
Murphy on AI for Biology
See Robert Murphy of Carnegie Mellon University discuss the revolutionary application of machine learning to biomedical research.

VIDEO
A Window on Memory
Watch Princeton University’s Nicholas Turk-Browne describe his research on how the human brain makes, stores, and adjusts memories.

SLIDESHOW
AI Photo Album
Peruse some images that artificial intelligence is sifting through for clues about the biology they depict.

AS ALWAYS, FIND BREAKING NEWS EVERY DAY ON OUR WEBSITE.

Coming in June



HERE’S WHAT YOU’LL FIND IN NEXT MONTH’S ISSUE

• The new frontier of quantum biology

• How the body tolerates disease instead of fighting it

• Can microbiome-produced antigens lead to autoimmunity?

• Inclusivity in academia

AND MUCH MORE

WE REMOVED THE COMPRESSORS.
YOU’RE WELCOME.

For decades, you’ve protected your life’s work with technology destined to fail. Like you,
we don’t accept the status quo. Which is why we invented the first and only compressor-free,
ultra-low temperature freezer. With the uncomplicated, free-piston Stirling engine, there’s virtually
nothing to fail. It’s an engine that has created breakthroughs in every aspect of ULT technology:
in performance, energy savings, sustainability, total cost of ownership and sample safety.

Learn more about breakthroughs in ULT technology at NoCompressors.com.


415 Madison Avenue,
Suite 1508,
New York, NY
10017
E-mail: info@the-scientist.com

EDITORIAL
EDITOR-IN-CHIEF
Bob Grant
rgrant@the-scientist.com
MANAGING EDITOR
Jef Akst
jakst@the-scientist.com
SENIOR EDITOR
Kerry Grens
kgrens@the-scientist.com
ASSOCIATE EDITORS
Catherine Offord
cofford@the-scientist.com
Shawna Williams
swilliams@the-scientist.com
Ashley Yeager
ayeager@the-scientist.com
CONTRIBUTING EDITOR
Alla Katsnelson
COPY EDITOR
Annie Gottlieb
CORRESPONDENTS
Anna Azvolinsky
Abby Olena
Ruth Williams
INTERNS
Chia-Yi Hou
Carolyn Wilke
SENIOR SCIENTIFIC TECHNICAL EDITOR
Nathan Ni
nni@the-scientist.com
SCIENTIFIC TECHNICAL EDITORS
Kathryn Loydall
kloydall@the-scientist.com
SOCIAL MEDIA EDITOR
Lisa Winter
lwinter@the-scientist.com

DESIGN AND PRODUCTION
PRODUCTION MANAGER
Greg Brewer
gregb@the-scientist.com
ART DIRECTOR
Erin Lemieux
elemieux@the-scientist.com
INTERIM ART DIRECTOR
Luiza Augusto
laugusto@labx.com
VIDEO PRODUCTION COORDINATOR
Ryan Kyle
rkyle@labx.com
CREATIVE SERVICES DIRECTOR
Elizabeth Young
eyoung@the-scientist.com

ADVERTISING, MARKETING, ADMINISTRATION
ASSOCIATE SALES DIRECTOR, KEY ACCOUNTS
Ashley Haire
ashleyh@the-scientist.com
SENIOR ACCOUNT EXECUTIVES
Northeast US, Eastern Canada, Europe, ROW, Careers/Recruitment
Melanie Dunlop
melanied@the-scientist.com
Western US and Western Canada
Karen Evans
kevans@the-scientist.com
ACCOUNT EXECUTIVE
Midwest and Southeast US
Anita Bell
abell@the-scientist.com
DIRECTOR OF MARKETING
Alex Maranduik
amaranduik@labx.com
AUDIENCE DEVELOPMENT SPECIALIST
Matthew Gale
mgale@labx.com
EVENTS MANAGER
Cayley Thomas
cayleyt@labx.com
WEBINARS & SOCIAL MEDIA COORDINATOR
Meaghan Brownley
mbrownley@labx.com
SALES AND MARKETING COORDINATOR
Katie Prud’homme
katiep@the-scientist.com
SOCIAL MEDIA COORDINATOR
Catherine Rocheleau
crocheleau@labx.com

MANAGEMENT AND BUSINESS
PRESIDENT
Bob Kafato
bobk@labx.com
GENERAL MANAGER
Ken Piech
kenp@labx.com
MANAGING PARTNER
Mario Di Ubaldi
mariod@the-scientist.com
VICE PRESIDENT, GROUP PUBLISHING DIRECTOR
Robert S. D’Angelo
rdangelo@the-scientist.com

EDITORIAL ADVISORY BOARD
Roger Beachy, Donald Danforth Plant Science Center
Steven A. Bent, Foley and Lardner LLP
Deborah Blum, University of Wisconsin
Annette Doherty, Pfizer Global Research and Development
Kevin Horgan, GE Healthcare
Steve Jackson, University of Cambridge
Simon Levin, Princeton University Center for BioComplexity
Edison Liu, Genome Institute of Singapore
Peter Raven, Missouri Botanical Garden
Joseph Schlessinger, Yale University School of Medicine
J. Craig Venter, J. Craig Venter Institute
Marc Vidal, Dana Farber Cancer Institute, Harvard University
H. Steven Wiley, Biomolecular Systems, Pacific Northwest National Laboratory
Alastair J.J. Wood, Symphony Capital

CUSTOMER SERVICE
info@the-scientist.com

SUBSCRIPTION RATES & SERVICES
In the United States & Canada individual subscriptions: $39.95. Rest of the world: air cargo add $25.
For assistance with a new or existing subscription please contact us at:
Phone: 847.513.6029
Fax: 847.763.9674
E-mail: thescientist@halldata.com
Mail: The Scientist, PO Box 2015, Skokie, Illinois 60076
For institutional subscription rates and services visit www.the-scientist.com/info/subscribe or e-mail institutions@the-scientist.com

LIST RENTALS
Contact Statlistics, Jennifer Felling at 203-778-8700 or j.felling@statlistics.com

REPRINTS
Contact Katie Prud’homme at katiep@the-scientist.com

PERMISSIONS
For photocopy and reprint permissions, contact Copyright Clearance Center at www.copyright.com

POSTMASTER: Send address changes to The Scientist, PO Box 2015, Skokie, Illinois 60076. Canada Publications Agreement #40641071. The Scientist is indexed in Current Contents, Science Citation Index, BasicBIOSIS, and other databases. Articles published in The Scientist reflect the views of their authors and are not the official views of the publication, its editorial staff, or its ownership. The Scientist is a registered trademark of LabX Media Group Inc. The Scientist® (ISSN 0890-3670) is published monthly. Advertising Office: The Scientist, 415 Madison Avenue, Suite 1508, New York, NY 10017. Periodical Postage Paid at New York, NY, and at additional mailing offices.



MAY 2019

Contributors
As a kid, Molly Mendoza loved to watch cartoons such as Sailor Moon and Speed Racer and to read Japanese
comics. “From a really young age I wanted to tell stories with pictures,” she recalls. She didn’t really consider
making a career out of it, though, until her high school art teachers and college professors told her she had tal-
ent. With their encouragement, she decided to pursue a bachelor of fine arts degree at the Pacific Northwest
College of Art in Portland, Oregon. After graduating in the winter of 2014, Mendoza went straight into free-
lancing, creating banner images for different tutorials within the creative suite of Adobe, as well as illustrating
for publications such as Nautilus and The Scientist. In May 2015, her work graced the cover of The Scientist’s
special issue on HIV. And on page 22 of this issue, she illustrates how artificial intelligence (AI) technologies
are transforming masses of data on cancer into meaningful information for patients.
In addition to freelancing, Mendoza works part-time at Land Gallery, an online and brick-and-mortar
retail space for independent artists in and around Portland, where she has settled. And this July, she will pub-
lish her first graphic novel, Skip, about two people hopping between dimensions as they try to find their way
home. “It’s an adventure story,” she says, but it’s also more than that. “Along the way they realize each other’s
flaws and fears but also what about them makes them special and what they should be proud of.”

Kartik Hosanagar, professor of technology and digital business at the Wharton School of the University
of Pennsylvania, has had a front row seat for the artificial-intelligence revolution. In addition to studying
how analytics and algorithms affect the behavior of consumers and society, he’s launched his own compa-
nies, built around automating our lives. For example, in 2005 he cofounded Yodle, a marketing platform
for small businesses that was designed to take the place of traditional phonebook searches performed
by people looking for particular products or services. Powered by Hosanagar’s custom-built algorithms,
Yodle was acquired by web.com in 2016.
Although he doesn’t specifically research the interface of AI and biology, Hosanagar says that he has
an ongoing study that looks at how healthcare professionals incorporate AI into their work. He says his
research asks questions such as, “What does transparency or model interpretations do to an expert’s deci-
sion to adopt [AI technology]?”
Hosanagar took two years to write his first book, A Human’s Guide to Machine Intelligence. He says that his
impetus for penning the work was to demystify AI for a lay audience. “I felt like there’s a lot of excitement about
AI, but a lot of the discussion is either incomplete or often imprecise.” On page 60 he writes about the potential
for AI to make scientific discoveries, something he says could happen in the near future. “It’s a matter of maybe
two to three years before we start seeing the first good examples coming out.”

Though he had spent much of his early life with a foot in two worlds—science and writing—Sandeep
Ravindran thought he would have to eventually choose between them. Upon completing a bachelor’s degree in
biological sciences at Cornell University, Ravindran simultaneously considered PhD programs and master of fine
arts programs in creative writing. He ultimately decided to pursue his PhD in microbiology and immunology at
Stanford University. But a talk by a science writer late in his training made Ravindran realize he could combine
the best of both worlds. “I decided right then that [science writing was] kind of perfect for me,” he says.
After finishing his doctorate, Ravindran completed internships at Popular Science and Science Illustrated,
before pursuing additional training through the UC Santa Cruz science communication program. After com-
pleting internships at a host of outlets including the San Jose Mercury News and Science News, he worked for
two years as a science writer at PNAS. Now Ravindran works as a freelance science journalist based out of
Washington, DC, delving into any topics that capture his interest, which often lie in the realm of life sciences.
“I’m still fascinated by biology in general, both in terms of how the natural world finds ways, through evolu-
tion, to solve problems, and how humans emulate those ways using technology,” he says. Over the past 3 years,
Ravindran has written several features and Lab Tools pieces for The Scientist. Find his latest feature about
brain-inspired computing on page 28 of this issue.

FROM THE EDITOR

Cerebral Inception
Our brains have evolved to the point where we can build artificial
brains that might help us understand our brains. Let that sink in.

BY BOB GRANT

You’d think that overseeing an entire issue of The Scientist focused on artificial intelligence would cause my mind to wander far into the future—robotic researchers formulating digital hypotheses, whizzing about in sleek, metallic labs. But immersing myself in stories about the novel insights and deep analyses enabled by smart instruments and machine learning did not transport me into a vision of science in the 23rd century.

Instead, I found myself thinking of the distant past, of a time when the first micro-vibrations of life were roiling the raw muck of early Earth. Rather than the grand sweep of what artificial intelligence may bring about—faster and more economical data processing, new insights, novel discoveries, and revolutionized workflows and transportation systems—I thought of the original form of intelligence on our planet. Intelligence on the molecular scale.

The Miller-Urey experiment of the 1950s began to shed mechanistic light on the dark mysteries of how Earth changed from an inanimate sphere to a planet bursting with life. But those famed researchers could zap into existence only amino acids, the building blocks of proteins. Later experiments pushed the chemical evolution toward life further by generating nitrogenous bases, the building blocks of RNA and DNA. But researchers have not yet succeeded in demonstrating a route from raw chemical materials to those crucial macromolecules. And it was through RNA, DNA, or both that life really burst from the starting blocks, those plucky nucleic acids acquiring a sort of self-motivation to replicate. That, I believe, was the dawn of intelligence. RNA and DNA, past and present, harbor a chemical drive to persist and reproduce that has woven a thread of life on Earth through the eons, unbroken.

Fast-forward the evolutionary clock by 4 billion or so years, and you arrive at today. In this issue, you’ll read of fascinating insights into basic biology, land management, and clinical practice facilitated by AI algorithms. Such models are on the precipice of helping physicians diagnose and treat cancer (see “Run Program: Cancer,” page 22), predicting wildfire spread (see “A Deeper Look,” page 36), and improving the performance of CRISPR-Cas9 genome editing (see “CRISPR Predictions,” page 46).

But it is the construction of advanced neural networks that attempt to mirror the architecture and function of the human brain (see “A Silicon Brain,” page 28) that truly blows my meaty mind. Scientists have progressed to the point of assembling models of brains, built with inanimate components, that simulate the functioning of their biological counterparts. These brain mimics can perform impressive feats of pattern recognition, calculation, and decision making. Some even border on human-like intelligence, learning and adapting their behavior based on inputs they receive through sophisticated sensors. As researchers and engineers develop AI machines, from self-driving cars to self-driving microscopes (see “Autopilot Advocate,” page 48), at a breakneck pace, it’s how this technology informs our understanding of human biology that interests me most.

The intellectual feedback loop involved is what really gets me. Biology has evolved in fits and starts—species coming and going, adaptations developed and obviated by natural selection. Eventually, curious apes appeared on the scene, and, by dint of luck and environmental happenstance, their lineage branched and changed until our species, Homo sapiens, arrived, persisted, and spread. In the space of 1,000 or so centuries, we became a singular force in the biosphere. Although not the most numerous organism on Earth, our species is certainly the one that has made the most forceful and lasting changes to our planet and its inhabitants. And now we’ve arrived at a point where a human brain, made from that same muck that gave rise to the first life forms, can conceive of, build, and manipulate models of itself that serve to teach that same lump of biological matter about its own inner workings. The master is becoming the student.

Editor-in-Chief
eic@the-scientist.com

QUOTES

Speaking of Science
This is a technology with the potential to change history for all of us. The question is, ‘Can we have the good without the bad?’
—Former Google AI chief Fei-Fei Li speaking in mid-March at the opening of Stanford University’s new Institute for Human-Centered Artificial Intelligence (March 19)

Will it be possible, in the foreseeable future, to build a machine that can discover physics or mathematics that the brightest humans alive are not able to do on their own, using biological hardware? Will the future of science eventually necessarily be driven by machines that operate on a level that we can never reach? I don’t know. It’s a good question.
—Kevin Schawinski, an astrophysicist and CEO/founder of artificial-intelligence company Modulos, talking to Quanta Magazine about how AI is poised to change science (March 11)

BY EMILY COX AND HENRY RATHVON
Note: The answer grid will include every letter of the alphabet.

ACROSS
1. Innermost ossicle of the middle ear
5. Nest material for some wasps
8. Camel’s South American cousin
9. Segmented worm
10. Form rust, say
11. Shrub pollinated by a moth
12. Temple pictured on the Cambodian flag (2 wds.)
16. Carnivora, for dogs, e.g.
18. Programming language of the 1950s
20. Variant of a chemical element
21. German thinker who wrote Science of Logic
22. Tree-graft sites
23. Mother of striped cubs

DOWN
1. Swimmer using a ladder, maybe
2. Modern-day Mesopotamian
3. Vocalizing like a 23-Across
4. Relevant phenomenon in drug testing (2 wds.)
5. Flower in the violet family
6. Fetid ferret relative
7. Jellyfish’s kind of symmetry
12. Robot in human form
13. Wild sub-Saharan pig
14. Like the Galilean moons
15. Quadrilateral quartet
17. Underground network
19. Elephant loner


Answer key on page 5

THE BASICS

Artificial Intelligence: An Introduction


A brief history of AI, machine learning, artificial neural networks, and deep learning

BY JEF AKST

The term “artificial intelligence” dates back to the mid-1950s, when mathematician John McCarthy, widely recognized as the father of AI, used it to describe machines that do things people might call intelligent. He and Marvin Minsky, whose work was just as influential in the AI field, organized the Dartmouth Summer Research Project on Artificial Intelligence in 1956. A few years later, with McCarthy on the faculty, MIT founded its Artificial Intelligence Project, later the AI Lab. It merged with the Laboratory for Computer Science (LCS) in 2003 and was renamed the Computer Science and Artificial Intelligence Laboratory, or CSAIL. Now a ubiquitous part of modern society, AI refers to any machine that is able to replicate human cognitive skills, such as problem solving.

Over the second half of the 20th century, machine learning emerged as a powerful AI approach that allows computers to, as the name implies, learn from input data without having to be explicitly programmed. One technique used in machine learning is a neural network, which draws inspiration from the biology of the brain, relaying information between layers of so-called artificial neurons. The very first artificial neural network was created by Minsky as a graduate student in 1951 (see “Learning Machine, 1951” on page 64), but the approach was limited at first, and even Minsky himself soon turned his focus to other approaches for creating intelligent machines. In recent years, neural networks have made a comeback, particularly for a form of machine learning called deep learning, which can use very large, complex neural networks.
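As a concrete illustration of that idea, inferring a rule from examples rather than hand-coding it, here is a minimal, hypothetical Python sketch using the scikit-learn library. The toy data, labels, and task are invented purely for illustration.

```python
# A toy illustration of "learning from data without being explicitly programmed."
from sklearn.linear_model import LogisticRegression

# Each example is [hours of daylight, temperature in deg C]; the (invented) label
# is 1 if some event of interest occurred that day and 0 if it did not.
X = [[8, 15], [9, 17], [14, 25], [15, 27], [13, 24], [7, 14]]
y = [0, 0, 1, 1, 1, 0]

# Instead of writing an explicit rule ("if temperature > 22 then ..."),
# the algorithm estimates a decision rule from the labeled examples.
model = LogisticRegression()
model.fit(X, y)

# The fitted model can then make predictions about days it has never seen.
print(model.predict([[12, 23]]))
```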

ARTIFICIAL INTELLIGENCE
An attribute of machines that embody a form of intelligence, rather than simply carrying out computations that are input by human users.
Early applications of AI included machines that could play games such as checkers and chess and programs that could analyze and reproduce language.

MACHINE LEARNING
An approach to AI in which an algorithm learns to make predictions from data that is fed into the system.
From personalized news feeds to traffic prediction maps, most people in developed countries use machine learning–based technologies every day.

NEURAL NETWORKS
A machine learning approach in which algorithms process signals via interconnected nodes called artificial neurons.
Because they mimic the architecture of biological nervous systems, artificial neural networks are the obvious method of choice for modeling the brain.

DEEP LEARNING
A form of machine learning that often uses a network with many layers of computation—a deep neural network—enabling an algorithm to powerfully analyze the input data.
Deep neural networks are responsible for self-driving vehicles, which learn to recognize traffic signs, as well as for voice-controlled virtual assistants.
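To give a feel for what an artificial neural network looks like in code, below is a minimal sketch in Python and NumPy of a two-layer network trained by gradient descent on a toy problem, the XOR function. It is illustrative only; real deep learning systems stack many more layers and rely on specialized libraries, and every number and name here is an assumption made for the example.

```python
import numpy as np

# Toy training data: the XOR function, a classic problem that a single
# artificial neuron cannot solve but a small two-layer network can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))  # input -> 4 hidden neurons
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))  # hidden -> 1 output neuron

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward pass: signals flow through the layers of artificial neurons.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Backward pass: nudge every weight to reduce the prediction error.
    error = output - y
    grad_out = error * output * (1 - output)
    grad_hidden = (grad_out @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * hidden.T @ grad_out
    b2 -= 0.5 * grad_out.sum(axis=0, keepdims=True)
    W1 -= 0.5 * X.T @ grad_hidden
    b1 -= 0.5 * grad_hidden.sum(axis=0, keepdims=True)

# After training, the outputs should approach the targets [0, 1, 1, 0].
print(np.round(output, 2))
```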

NEWS AND ANALYSIS

Notebook MAY 2019

Going Through the Motions

A FLY’S TRACKS: An algorithm that automatically tracks an animal’s limb movements could help scientists study the link between brain and behavior.
MURTHY LAB AND SHAEVITZ LAB, PRINCETON UNIVERSITY

It takes an average of 17 minutes for a fruit fly couple to move from meeting to mating, says Talmo Pereira, a PhD student studying neuroscience in Joshua Shaevitz’s lab at Princeton University. The encounter is marked by “lots of complex stages, arguably more complex than human courtship,” he says. A male and a female Drosophila melanogaster first size each other up through an exchange of pheromones. If they’re compatible, the male chases the female down and woos her by “singing” with a wing that he vibrates in particular patterns to form the notes of his ballad. Then the partners dance, running and circling each other. Finally, the male attempts to copulate and the female accepts or rejects.

Pereira is studying how the courtship song and dance are represented in the brains of the flies. Along the way, he and colleagues developed a powerful method to track animal behavior. Their tool, LEAP Estimates Animal Pose (LEAP), harnesses a type of artificial intelligence called a deep neural network, essentially a “fancy machine that can learn to do any . . . arbitrary operation that you can train it to do,” says Diego Aldarondo, currently a PhD student at Harvard University, who built the tool with Pereira during his undergraduate studies at Princeton. “We developed all this crazy artificial intelligence just to try to understand fly sex,” jokes Pereira. “Or not even sex really, just what leads up to it.”


Traditionally, researchers have collected data on animal movements by clicking through video footage, frame by frame, and labeling the body parts of interest. It’s a laborious process that can take graduate students or volunteers hours upon hours, says Ben de Bivort, a behavioral neuroscientist at Harvard who was not part of the study. And such herculean efforts produce small datasets compared to other methods that researchers employ in their studies of animal biology, such as genomics or high-resolution neural recording, he says. “So measurement of the behavior was always kind of a bottleneck.”

Another option is to glue markers onto an animal’s limbs and then use computer software to track them from video footage. “Imagine like the markers you would put on Andy Serkis to make Gollum in Lord of the Rings,” says Gordon Berman, a theoretical biophysicist at Emory University who did a postdoc with Shaevitz but was not involved in the LEAP project. Unfortunately, animals are pretty good at grooming them off, and “putting these markers on a fly . . . is rather difficult,” he says.

Pereira says that watching actors in motion capture suits was in fact what got him thinking about how to track the flies. But as he dug into the literature, he realized that scientists were already starting to capture animal motion without using markers. Aldarondo, meanwhile, was working on motion capture algorithms in a computer science class, and, after chatting about it with Pereira, he decided to apply neural networks to his lab’s footage of individual fruit flies as a course project.

In the first attempt, Aldarondo and another student in the course labeled thousands of frames of video with points denoting fly body part locations and then used those frames to train the network to recognize the body parts automatically. After the course ended, he and Pereira continued working on the project, tweaking the algorithm to automate more of the process. At the end of last year, they published a version of the tool that needs far fewer frames—around 100—to achieve up to 95 percent accuracy in tracking 32 points on a fly’s body. In their report, the researchers used LEAP to track all six of a fly’s legs, plus its wings, body, and head. They also applied their tool to capture the limb movements of a mouse (Nat Methods, 16:117–25, 2019).

LEAP’s success comes from a combination of human and artificial input. After receiving a set of labeled video frames, it uses them to learn how points are placed according to each image’s features, and then spits out labels for the next set of frames, which a researcher then reviews. The tool’s guesses may not be great the first time, but correcting the program helps it get smarter. After a few rounds of back and forth between LEAP and a human, the program has learned enough to correctly identify the parts—even over the course of less than one day. De Bivort describes the process as “using the algorithm to produce the data to make a better algorithm.”

“It’s surprisingly easy. It’s obviated a lot of my hard-won image processing skills over the years,” says Berman, who uses the tool in his own research on flies and prairie voles. “What used to take months and months of work now takes a couple of weeks, if that.”

Another artificially intelligent method for motion capture, DeepLabCut, developed by a separate group of Harvard researchers, appeared around the same time as LEAP, also for tracking mice and fruit flies. Each tool has its advantages: LEAP requires less time to train, but DeepLabCut uses a bigger neural network and performs better than LEAP on cluttered or lower-quality images, says Berman. But both have the strength of being applicable beyond the species they were first developed for. So far, multiple research groups have used LEAP to track the motion of mice, rats, grasshoppers, spiders, ants, fish, and more, says Pereira.

Both tools could have applications in everything from behavioral ecology to medical research, where they could help study disorders such as autism that are associated with stereotyped movements, says de Bivort. They’ll also help neuroscientists probe the connections between the brain and behavior, he adds. “Maybe the biggest question in neuroscience is: How does the brain produce behavior? Because that’s what the brain is for,” de Bivort says. “It’s not exaggerating to say that these tools are a big deal for our field now.”
—Carolyn Wilke

Listening for Your Health

Speech is a window into our brains—and not just when we’re healthy. When neurological issues arise, they often manifest in the things we say and how we say them.

IBM computer scientist Guillermo Cecchi came to appreciate just how important language is in medicine as a psychiatry postdoc at Weill Cornell Medicine in the early 2000s. Despite advances in brain imaging, “it’s still [through] behavior, and fundamentally through language, that we assess mental health,” he says. “And we deal with it through therapy. . . . Language is essential for that.”

In the digital age, hardware and software are available for “natural language processing”—a type of artificial intelligence (AI) pioneered by IBM’s Watson that extracts useful content from written and spoken language. But while companies such as Google and Facebook use language processing to evaluate our social media interactions, emails, and browsing histories in order to personalize the ads we see in our news feeds, the tools have yet to be harnessed for medical applications. In the clinic, “all that technology is completely ignored,” says Cecchi. “We still judge the language production of the patient on a subjective basis.”

With recent advances in AI, that’s starting to change, and Cecchi’s team at IBM is one of several groups now developing machine learning algorithms to analyze patient language. “I would say in the last five years there’s been an explosion of interest,” he says. The approaches are all in the earliest stages of development, with most models suffering from small training and testing datasets. But several studies have yielded promising results across a range of psychiatric and neurological conditions.

One area that Cecchi has explored is the prediction of schizophrenia and psychosis onset. Because in these conditions “it’s your thought process that is disordered,” as Cecchi explains, their connection to language is intimate. A few years ago, Cecchi and his colleagues developed a machine learning algorithm to analyze two features of speech known to be affected in psychosis: syntactic complexity and semantic coherence, a technical term for the flow of meaning.

In two small-scale validation studies, Cecchi’s team trained the algorithm using transcripts of interviews with patients, and showed that the resulting model could predict, with 85–100 percent accuracy, if psychosis onset was imminent (in the next two years) in young, high-risk patients (npj Schizophr, 1:15030, 2015; World Psychiatry, 17:67–75, 2018).

The model is a long way from clinical use, cautions Cecchi, noting that the studies included data from just a few dozen subjects. “We need to reach sample sizes [of] several thousand to say we are absolutely sure this is working.” But he suspects that more work will support the use of such AI-based approaches, not only for helping psychiatrists diagnose psychosis, but also for monitoring patients who suffer from psychotic disorders.

And it’s not just psychosis, he emphasizes. “The major disorders affecting our society—depression, PTSD, addiction, and then neurological disorders, Alzheimer’s disease, Parkinson’s disease—all of them leave a mark in language.” A few years ago, for example, his group developed machine learning models that predict Parkinson’s diagnoses and severity with about 75 percent accuracy based on transcripts of patients describing their typical day (Brain Lang, 162:19–28, 2016).

While the linguistic content of speech can reveal a lot about how a person’s brain is functioning, other aspects of spoken language, such as voice, tone, and intonation, could provide additional clues about a person’s physical and mental health. “If you have a cold, the sound of your voice changes,” notes MIT computer scientist James Glass, who has investigated AI analyses of speech for detection of cognitive impairment and depression.

Monitoring people’s health by listening to the sound of their voice is the focus of researchers at Sonde Health, a startup based in Boston, Massachusetts, that aims to integrate voice-analysis technology into consumer devices such as Google Home and Amazon Echo. Company cofounder Jim Harper says the team has already developed machine learning models to predict more than 15 conditions, including neurological, respiratory, and muscular or cardiac disorders, based on the acoustic properties of short fragments of speech. The early models work “almost as well as existing measurements,” Harper says, noting that the company is already in talks with the US Food and Drug Administration about its model for detecting depression and hopes to begin a clinical study within the year.

Examining qualities of speech, such as the tone and expressiveness of a person’s voice, can be particularly revealing for identifying movement disorders such as Parkinson’s disease, which can disrupt the functioning of muscles involved in speech. After being diagnosed, some Parkinson’s patients will recognize that one of the first symptoms was a flat speaking tone, for example.
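As a schematic illustration of the general recipe these groups describe (reduce each recording to measurable properties of speech, then train a model to map those properties to a clinical label), here is a toy Python sketch. The features, recordings, and labels below are invented placeholders; the published studies rely on far richer linguistic and acoustic measurements and on carefully collected patient data.

```python
# Toy sketch of acoustic-feature-based classification. Random noise stands in
# for real recordings, and the two features are crude stand-ins for the kinds
# of acoustic measurements (loudness variability, voicing) used in real studies.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def toy_features(waveform):
    """Reduce a 1-D audio signal to two numbers: energy variability and the
    rate of sign changes (a rough proxy for how flat or lively it sounds)."""
    energy_variability = float(np.var(waveform ** 2))
    zero_crossing_rate = float(np.mean(np.abs(np.diff(np.sign(waveform)))) / 2)
    return [energy_variability, zero_crossing_rate]

# Pretend we have 20 short recordings: odd indices from patients, even from controls.
recordings = [rng.normal(scale=1.0 + 0.5 * (i % 2), size=16000) for i in range(20)]
labels = [i % 2 for i in range(20)]  # 1 = patient, 0 = control (invented)

features = [toy_features(w) for w in recordings]
model = LogisticRegression().fit(features, labels)

# A model trained this way could then score a new, unseen recording.
print(model.predict([toy_features(rng.normal(size=16000))]))
```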


Laureano Moro-Velázquez, a telecommunications engineer at Johns Hopkins University, and his colleagues are using machine learning to analyze phonemes—the discrete sounds that compose speech—as a means of diagnosis. This February, the team published a model, trained with recordings of sentences recited by about 100 people with Parkinson’s and 100 controls, that could determine, with more than 80 percent accuracy, whether or not someone had the disease (Biomed Signal Process Control, 48:205–20, 2019).

While several groups are investigating either linguistic or acoustic elements of spoken language, a combination of the two often yields the best results. Cecchi’s group, for example, recently used an analysis of recordings of Parkinson’s patients’ speech—considering both acoustic and linguistic features—to successfully identify who was taking the drug levodopa (bioRxiv, doi:10.1101/420422, 2018).

And Tuka Alhanai, a graduate student in Glass’s lab at MIT, has developed a machine learning model to extract clues about whether a person is depressed from text and audio recordings of interviews. The model learned—both from the words used to respond to questions and from other features of speech, such as its speed—to predict, with 77 percent accuracy, depression in 142 patients whose data were held in a public repository, according to results Alhanai presented at the Interspeech conference in September 2018.

For now, the use of speech analysis is still in the proof-of-concept stage, whatever aspects are analyzed. “I think everything suffers from small databases,” says Glass. “Do it on something ten or a hundred times bigger, and I’ll pay more attention.” But if validation of the early-stage work proves as successful as many in the field anticipate and hope, he says, “I think it just opens up new opportunities to complement existing techniques and maybe provide more comprehensive [care].”
—Jef Akst

A Better Beehive

When high school student Jade Greenberg heard about what was happening to America’s bee populations, she decided to take action. Last year, Greenberg, then a junior at Pascack Hills High School in New Jersey, learned from a local beekeeper about a tiny reddish-brown mite that is posing a serious threat to the honey bees (Apis mellifera) used across the US for pollinating various crops. But what began as a school project to build a hive that might help boost bee health soon turned into a collaboration with two tech companies to use artificial intelligence (AI) and tracking systems to tackle the problem.

YOU ARE NOT A ROBOT...


SO DON‘T ACT LIKE ONE
FREE YOURSELF FROM
ROUTINE PIPETTING

ASSIST PLUS Automating Multichannel Pipettes


Making hands-free serial dilutions, reagent additions and sample reformatting very
affordable for every lab. Compatible with all Integra‘s electronic pipettes from 4 to 16
channels for consistent results and unbeatable ergonomics.

VIAFLO - Electronic Pipettes VOYAGER - Adjustable Tip Spacing Pipettes www.integra-biosciences.com


The varroa mite (Varroa destructor) was brought over to North America from Southeast Asia decades ago and has been decimating not only colonies tended by beekeepers, but also feral colonies—those that were started in the wild by bees originally from hives kept by humans. The pest reproduces within honey bee colonies, latching onto the insects and feeding off their fat body, a tissue similar in function to the mammalian liver. Over time, the parasite weakens bees’ immune systems, making them more susceptible to viruses and pesticides.

“Even if we solve the other problems contributing to honey bee loss, like pesticides and poor nutrition, colonies will still be lost if varroa is not under control,” says honey bee researcher Gloria DeGrandi-Hoffman of the US Department of Agriculture’s Agricultural Research Service (USDA-ARS) Carl Hayden Bee Research Center in Tucson, Arizona.

There are hints that the practice of beekeeping itself may be contributing to the problem. DeGrandi-Hoffman and her colleagues, for example, recently found that the current commercial beekeeping practice of preventing swarming—when a queen bee and a group of worker bees leave their original colony to form a new one—may have exerted selection pressure on the mites to find new ways to disperse among bee colonies (Environ Entomol, 46:737–46, 2017).

Greenberg was interested in the role of another aspect of beekeeping: hive design. Feral bees typically nest in tree cavities or other structures, forming layers of wax comb to fill the available area. By contrast, the most widely used manmade beehive in North America, the Langstroth hive, is essentially a stacked, wooden file cabinet with removable frames on which bees build their combs, making it easy for beekeepers to collect honey and monitor the colony’s health.

Greenberg wondered if the hive design could be affecting the bees’ susceptibility to mite infestations. “There is research on environmental factors, such as where the bees forage and how they are affected by pesticides,” she says. “But there is very little information on whether or not the shape of the commercial hive has an effect on [mite infestation rates].”

Taking advantage of a school science fair as an occasion for the project, Greenberg set about designing her own beehive, based on the so-called Sun Hive. Invented in the 1980s by Günther Mancke, a German sculptor who studied natural beehives, Sun Hives “are handmade hives and not commercially viable,” Greenberg says. “I decided to riff off of Mancke’s design and make something that could be both commercially viable and also healthier for bees.”

Greenberg incorporated sensors to monitor hive weight, temperature, and humidity, along with video monitoring technology, while still retaining the necessary features for commercial beekeeping, such as removable frames. After a few months of honing the design on the computer, Greenberg was considering building a prototype of the 42-liter structure using a 3-D printer when her father, a solutions engineer at artificial intelligence company Kinetica, made a suggestion.

Kinetica uses AI and other technologies to analyze extremely large data sets on everything from financial markets to the movements of fleets of trucks as they drive around the US. Jonathan Greenberg had told colleagues about his daughter’s project and, to his surprise, found they were keen to get involved. One of those colleagues was Jacci Cenci, a solutions architect at NVIDIA, a tech company focused on AI and graphics. Cenci offered to mentor Greenberg on machine learning technologies and provide her with real-world data to analyze bee health.

BEES, DEBUGGED: Researchers trained machine learning algorithms to distinguish between a healthy honey bee and one covered in parasitic mites.
JONATHAN GREENBERG


Working with a beekeeper in California, the team placed sensors on a single Langstroth hive containing a healthy bee colony to monitor its temperature, humidity, and weight, and installed a camera to capture images. The researchers then created an analytical tool using Kinetica technology to detect bees in the images and to analyze environmental factors such as temperature, humidity, and weather conditions that might be contributing to varroa mites’ ability to infect the colony. The effort was an initial proof of concept, says Jonathan Greenberg. “This was about figuring out how the technology can work to more effectively monitor parasites for Jade’s hive experiments in the field.”

The group also came up with a better way to monitor mite infection in honey bee colonies. Usually, “the way beekeepers assess whether there is a mite infestation is invasive,” says Samuel Ramsey, a research entomologist at the USDA Bee Research Laboratory in Beltsville, Maryland, who was not involved in the project. “We take 300 bees, put them in a jar, sprinkle powdered sugar on them, shake the jar, and count how many mites fall off of the bees.”

Instead, Greenberg’s team captured images in the California hive every 10 seconds and then used NVIDIA’s machine learning to detect the presence of varroa mites on bees. To do this, the team used individual frames from videos of the healthy beehive, and from a different beehive where the varroa mite had taken up residence, to train a computer algorithm.

The data collection and analysis tools that Cenci, Greenberg, and their colleagues built are all still in the prototype stage. But Ramsey says that it has been a step toward figuring out whether Langstroth hives contribute to the mite infestation problem—and if Greenberg’s hive design could aid the bees in staving off the mites with their natural defenses. “Designing a better honey bee hive would be wonderful because we are putting these bees in an unnatural situation, the Langstroth hive,” he says. “It’s an old technology that wasn’t designed to address current issues like mite infestation.”

The project won the engineering category at the Nokia Bell Labs North Jersey Regional Science Fair, and Greenberg was a finalist at the Intel International Science and Engineering Fair. She says she now hopes to measure physical properties such as ventilation and heat distribution in her prototype hive, in the Langstroth hive, and in wild beehives to understand how hive design might help or hinder mite infestations.

“I’m always excited when younger students bring passion and a youthful, new way to look into established systems,” says Ramsey. “That is certainly a way to make sure there is progress.” DeGrandi-Hoffman agrees. “It’s great to see young students with enthusiasm and confidence to tackle complicated problems with creative solutions.”

For Greenberg, the biggest surprise was how few researchers are studying biological problems such as honey bee health from an engineering perspective. Her solution? To major in bioengineering when she starts university this fall.
—Anna Azvolinsky

Into the Cell

For cell biologists, fluorescence microscopy is an invaluable tool. Fusing dyes to antibodies or inserting genes coding for fluorescent proteins into the DNA of living cells can help scientists pick out the location of organelles, cytoskeletal elements, and other subcellular structures from otherwise impenetrable microscopy images. But this technique has its drawbacks. There are limits to the number of fluorescent tags that can be introduced into a cell, and side effects such as phototoxicity—damage caused by repeated exposure to light—can hinder researchers’ ability to conduct live cell imaging.

These issues were on biomedical engineer Greg Johnson’s mind when he joined the Allen Institute for Cell Science in Seattle in 2016. Johnson, whose doctoral work at Carnegie Mellon University had focused on creating computational tools to model cellular structures (see “Autopilot Advocate” on page 48), was hired as part of a group of researchers working to build a 3-D model of a cell. According to Johnson, one of the key aims of the project, dubbed the “Allen Integrated Cell,” was to develop a tool to help visualize changes in the spatial organization of cells as they move from one state to another—for example, from a pluripotent stem cell to a differentiated heart cell.

“Because of technological limitations, we can only see a few things in the cells at once,” Johnson says. “So we wanted to figure out ways that we could, at the very least, predict the organization of many more structures from the data that we already have.”

Specifically, they wanted to develop a method to identify a living cell’s components in images taken using brightfield microscopy. This technique is simpler and cheaper than fluorescent microscopy, but has a major disadvantage—it produces images that appear only in shades of gray, making a cell’s internal structures difficult to decipher. So the scientists decided to create a computer algorithm that could combine the benefits of both methods by learning how to detect and tag cellular structures the way fluorescent labels can, but in brightfield images instead.

To do this, the team turned to deep learning, an artificial intelligence (AI) approach where algorithms learn to identify patterns in datasets. They trained convolutional neural networks—a deep learning approach typically used to analyze and classify images—to identify similarities between brightfield and fluorescence microscopy images of several cellular components, including the nuclear envelope, cell membrane, and mitochondria.
and microtubules. By applying the tech- he adds. The researchers were also able Another group, which included Steve
nique to a series of brightfield images to use their deep learning algorithm to Finkbeiner, a neuroscientist at the Glad-
and merging the outputs, "we [were able to get] this beautiful time-lapse of all these cell parts moving around and interacting with each other," Johnson tells The Scientist.

"The cool thing about this technology is that it can be applied so broadly."
—Steve Finkbeiner, Gladstone Institutes

"Seeing this work in a movie, in a live cell, in 3D, was really jaw dropping," says Rick Horwitz, executive director of the Allen Institute for Cell Science, who wasn't directly involved in the project. "It was really a bit like magic."

Laura Boucheron, an electrical engineer at New Mexico State University who was not involved in the work but coauthored an accompanying perspective article in the same issue of Nature Methods, tells The Scientist that the results were "shockingly impressive." She adds that the images generated by the algorithm are "remarkably similar" to those produced using fluorescence microscopy. "The brightfield images are, to a human, visually not particularly interesting. They are not as clear—in terms of the structures present—as fluorescent images," Boucheron says. "But based on the results, clearly there is information [in the brightfield images] that the network is learning to interpret."

Johnson notes that a big upside to his team's method is that, contrary to the common belief that deep learning algorithms require thousands of images to learn, this tool could be trained with just dozens. "This is something that a graduate student can gather in an afternoon," …

… identify the location of proteins that make up myelin—the protective sheath around neurons—in 2-D electron microscope images.

Still, the method has some limitations. According to Johnson, one key issue is that the technique does not work on all cellular structures, because some simply do not appear in images taken with certain forms of microscopy. In their recent study, for example, the algorithm had difficulty identifying a few structures in brightfield images, including Golgi apparatuses and desmosomes, junctions that hold cells together. Another limitation is that, while the tool requires a relatively small training set, a model trained on images from one microscope might not work on images gathered from another.

The team is now investigating some potential applications of the technique. Horwitz suggests that, in addition to being able to make imaging studies faster and cheaper, the tool could eventually be applied in pathology to help identify sick cells or to rapidly identify how cellular structures change in diseased states.

Steve Finkbeiner of the Gladstone Institutes and the University of California, San Francisco, and his colleagues at Google, developed a similar AI-based cell-labeling technique last year (Cell, 173:792–803.e19). "I think the cool thing about this technology is that it can be applied so broadly, and I don't think we have a great feel yet for what the limits are," Finkbeiner says.

Boucheron notes that she is currently investigating applications of these AI-based approaches to image analysis in both biology and astronomy—another field where researchers rely on a variety of instruments to capture and analyze natural phenomena. "I work with astronomy data quite a bit, and particularly with solar images," she explains. "I've been looking for several years for ways to kind of translate between some of the different [instruments] that are used to image the Sun."

Techniques that apply deep learning to image analysis could be useful wherever a microscope or telescope is used, Horwitz says. This latest study is "just the tip of the iceberg."
—Diana Kwon

NEW PERSPECTIVE: Researchers used an algorithm to predict the location of structures in a human cell from images taken with brightfield microscopy. Cellular components include the nuclear envelope and DNA (gray), mitochondria (purple), actin (yellow), endoplasmic reticulum (blue), and microtubules (tan).
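For readers curious what training such a label-free prediction model involves in practice, here is a minimal sketch, assuming the TensorFlow/Keras library and synthetic stand-ins for paired brightfield and fluorescence images; it illustrates the general approach, not the Allen Institute's or Google's published code.

```python
# Minimal sketch of label-free prediction: fit a small convolutional network
# to map brightfield patches onto a fluorescence channel. Synthetic arrays
# stand in for real, registered microscope image pairs.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
n_pairs, size = 48, 64                                   # "a few dozen" training pairs
brightfield = rng.random((n_pairs, size, size, 1)).astype("float32")
fluorescence = (np.roll(brightfield, 2, axis=1) * 0.8).astype("float32")  # toy target

model = tf.keras.Sequential([
    tf.keras.Input(shape=(size, size, 1)),
    tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(1, 3, padding="same"),         # predicted fluorescence channel
])
model.compile(optimizer="adam", loss="mse")               # pixel-wise regression
model.fit(brightfield, fluorescence, epochs=5, batch_size=8, verbose=0)

predicted = model.predict(brightfield[:1])                # "virtual labeling" of a new image
print(predicted.shape)                                    # (1, 64, 64, 1)
```

Published tools of this kind typically use much deeper, U-Net-style networks and carefully registered image pairs, but the training loop is conceptually the same.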

The IsoLight:
Illuminating
The Small Things
That Make a Big
Difference
IsoPlexis, with industry leaders, has uniquely revealed differences
between Non-Responders and Responders in a variety of
immunotherapy contexts. Follow our data to see how the award winning
IsoLight is making a difference throughout immuno-oncology at
www.IsoPlexis.com/CellT-P
COMING SOON
Off-target Effects in CRISPR-Cas9 Genome Editing: Securing Specificity
A cause for concern regarding the popular CRISPR-Cas9 genome editing technology is the potential occurrence of off-target effects. RNA-guided
Cas9 may cleave DNA sequences that are not exact complements of the guide strand when either strand harbors bulges due to insertions or
deletions. Undoubtedly, a better understanding of the specificity of CRISPR-Cas9 editing can help develop strategies to minimize off-target cleavage.
To take a closer look at the specificity and possible causes of off-target effects in the CRISPR-Cas9 system, The Scientist is bringing together
researchers who will summarize their work on off-target effects limiting the applications of Cas9-mediated genome modification and discuss
strategies for predicting and preventing such off-target effects. Attendees will have the unique opportunity to ask experts about their experience with
off-target effects of the CRISPR-Cas9 system.

LAURYL NUTTER, PhD
Associate Director, Model Production and Cryopreservation & Recovery
The Centre for Pharmacogenomics
The Hospital for Sick Children

DAVID J. SEGAL, PhD
Professor, Dept. of Biochemistry and Molecular Medicine, UC Davis School of Medicine
Professor, Dept. of Pharmacology, UC Davis MIND Institute

TUESDAY, MAY 7
2:30 – 4:00 PM EASTERN TIME

REGISTER NOW! www.the-scientist.com/crisprcas9offtarget
The webinar video will also be available at this link.

TOPICS TO BE COVERED:
• Whole genome sequencing to assess Cas9 off-target effects in genetically engineered mice
• Will elimination of off-target effects be good enough?

WEBINAR SPONSORED BY:

COMING SOON
Systems Immunology: Understanding Responses to Vaccination and Infection
Systems immunology is a new and rapidly developing field of research providing quantitative molecular profiles of patients’ immune responses,
made possible by recent advances in multiple technological platforms. The human-centered approach has invigorated vaccine research and
shown promise to enhance our knowledge of the systemic immune response to vaccination and infection that was not possible with classical,
animal-model based approaches. Techniques that make use of peripheral blood to survey systemic immune responses include next-generation
sequencing coupled to bioinformatic analysis, gene and protein microarrays, metabolomics, multiparameter flow cytometry, mass cytometry,
and multiplex cytokine and chemokine assays. The Scientist  is bringing together a panel of experts to present their research in systems
immunology and provide insight into this exciting new field.

KATHRYN MILLER-JENSEN, PhD
Associate Professor, Departments of Biomedical Engineering and Molecular, Cellular, and Developmental Biology
Yale University

EDANA CASSOL, PhD
Assistant Professor, Department of Health Sciences
Carleton University

TUESDAY, MAY 21
2:30 – 4:00 PM EASTERN TIME

REGISTER NOW! www.the-scientist.com/systemsimmunology
The webinar video will also be available at this link.

TOPICS TO BE COVERED:
• New insights into immune responses to infection through data-driven analysis of multiplexed single-cell data
• Using systems immunology to understand innate immune dysfunction in HIV infection

WEBINAR SPONSORED BY:


MODUS OPERANDI

Microbiology Meets Machine Learning

Artificially intelligent software augments the study of host-pathogen interactions.

BY RUTH WILLIAMS

Computational systems called neural networks—based on the learning processes of biological brains—enable a form of machine learning that has the potential to help researchers interpret biological and medical images. Scientists who study how pathogens interact with host cells are now beginning to harness such technology.

"Most people in the [pathogen-host interactions] field were just manually counting—literally sitting there and assessing how many [parasites] per cell, how many in one of these vacuoles," and so on, says parasitologist Eva Frickel of the Francis Crick Institute in London. "My students were losing hours and hours, days and weeks counting these events."

Neural networks are used for all manner of image-processing tasks, such as face recognition, diagnostics, and self-driving cars, so Frickel thought such a system might offer a solution to her team's problem. She teamed up with computational biologist Artur Yakimovich of the Medical Research Council's Laboratory for Molecular Cell Biology to make a human-like host-pathogen analyzer.

Yakimovich, Frickel, and colleagues started with an existing open-source neural network–based analytics platform called KNIME (for Konstanz Information Miner) and tweaked its algorithms to process images of host cells and their pathogens. The system required training with thousands of example images, and once it was up and running, the team gave the system a name: HRMAn, pronounced Herman, for Host Response to Microbe Analysis.

They've used the software to analyze Toxoplasma gondii and Salmonella enterica infections in a variety of human cell lines. Other high-throughput image-analysis software may be capable of identifying which cells contain pathogens, but HRMAn comes into its own, says Frickel, in its ability to identify multiple visual characteristics of pathogens and host cells at once and detect patterns in the images.

Indeed, the team demonstrated that HRMAn could simultaneously recognize pathogen killing, replication, and a variety of cellular defense processes—just as a trained scientist might, but the computer had far higher throughput and greater statistical strength, and did not require tea breaks and sleep.

"It's a tool that goes beyond our ability as humans to process and to interpret image data," says parasitologist Adrian Hehl of the University of Zurich who was not involved in the study (eLife, 8:e40560, 2019).

DETECTING INVADERS: A neural network–based machine learning program called HRMAn spots several features of cells to determine whether they are infected with a pathogen (in this case, T. gondii), such as the number of vacuoles, which changes with infection, and the presence of host defense proteins and of the pathogen and its proteins. Labeled in the diagram: host defense proteins, Toxoplasma gondii, pathogen-derived proteins, and vacuoles.

AT A GLANCE

IMAGE ANALYSIS SOFTWARE: Harmony from PerkinElmer
TYPES OF IMAGES PROCESSED AND HARDWARE REQUIRED: Images captured on an Operetta CLS or Opera Phenix high-content screening system
USES: Analyzing a large variety of cell phenotypes, including host-pathogen interactions
AI COMPONENT? No
COST: Comes free with the Operetta CLS and Opera Phenix high-content screening systems, which themselves cost several hundred thousand dollars each

IMAGE ANALYSIS SOFTWARE: HRMAn
TYPES OF IMAGES PROCESSED AND HARDWARE REQUIRED: Images collected on any fluorescent microscope or high-content screening system
USES: Currently trained to detect Toxoplasma gondii and Salmonella enterica interacting with human cells, but could be adapted for a variety of host-pathogen interactions
AI COMPONENT? Yes
COST: Free, open source
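For a concrete sense of the kind of decision the AI component makes here, the sketch below trains a tiny convolutional network to call single-cell image crops infected or uninfected. It assumes TensorFlow/Keras and random synthetic images; HRMAn itself is built on the KNIME platform and quantifies much more, including vacuole counts and host defense proteins.

```python
# Toy illustration of neural network-based infection scoring on single-cell
# crops. Synthetic images stand in for real microscopy data; this is a
# schematic stand-in, not the HRMAn pipeline.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(1)
n_cells, size = 200, 32
images = rng.random((n_cells, size, size, 1)).astype("float32")
infected = (rng.random(n_cells) > 0.5).astype("float32")
images[infected == 1] += 0.3            # crude stand-in for parasite signal

model = tf.keras.Sequential([
    tf.keras.Input(shape=(size, size, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P(infected)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(images, infected, epochs=5, batch_size=16, verbose=0)
print(model.evaluate(images, infected, verbose=0))    # [loss, accuracy]
```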

CRITIC AT LARGE

AI Versus Animal Testing


Machine learning could be the key to reducing the use of animals in experiments.

BY THOMAS HARTUNG

There are more than 100,000 chemicals in consumer products. For the vast majority, there is very little information about their toxicity. Traditionally, researchers will test chemicals of interest in animals. As an extreme example, a pesticide undergoes about 30 animal tests, costing about $20 million and consuming more than 10,000 mice, rats, rabbits, and dogs over five years. About 20 kilograms of the chemical are needed for this testing; obtaining such a volume can be quite a challenge for a substance not yet on the market. Other substances receive less scrutiny, but even products with lower regulatory standards, such as industrial chemicals, can require $5 million worth of animal testing before entering the marketplace.

Our group, the Center for Alternatives to Animal Testing (CAAT) at Johns Hopkins University, sought a better way. As so many biologists are doing these days, we turned to intelligent computer programs for help. We showed that artificial intelligence (AI) could mine existing data on chemical toxicity and generate new information. In 2016, we compiled a database of 800,000 toxicological studies on more than 10,000 chemicals registered under the European REACH legislation for industrial chemicals, and used it to feed an advanced predictive algorithm that enabled us to predict the toxicity of any chemical without setting foot in the animal lab.

The software takes advantage of the power of big data and transfer learning, a machine learning method that applies information from one task or set of items to another. Similar chemicals have similar properties. Based on that principle, the software builds a map of the chemical universe. Similar chemicals are put close to each other, dissimilar ones more distant. Then, the model can place new chemicals on the map, assess what is known about their neighbors, and from that information surmise their potentially harmful health and environmental effects. The more data are fed into the model, the more powerful it becomes.
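The read-across logic described here, predicting an untested chemical's hazard from the known properties of its structural neighbors, can be sketched in a few lines. The binary "fingerprints" and toxicity labels below are random stand-ins, and a nearest-neighbor classifier is a simplification of the actual similarity map.

```python
# Schematic read-across: predict toxicity of a query chemical from its
# nearest structural neighbors. Random binary fingerprints and labels stand
# in for real chemical descriptors and animal-test outcomes.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)
n_chemicals, n_bits = 1000, 256
fingerprints = rng.integers(0, 2, size=(n_chemicals, n_bits)).astype(bool)
is_toxic = rng.integers(0, 2, size=n_chemicals)          # known test outcomes

# Jaccard distance on binary fingerprints approximates Tanimoto similarity.
model = KNeighborsClassifier(n_neighbors=5, metric="jaccard")
model.fit(fingerprints, is_toxic)

query = rng.integers(0, 2, size=(1, n_bits)).astype(bool)  # an untested chemical
print(model.predict(query), model.predict_proba(query))
```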

In collaboration with Underwriters Laboratories (UL), a global safety consulting and certification company, the database has expanded to more than 10 million chemical structures, more than 300,000 of which are annotated with biological and chemicophysical data and some 50,000 of those also include animal data. Using an Amazon cloud server, it took two days to analyze the similarities and differences among the 10 million chemicals to place them on a map. Applying this to 190,000 classified chemicals based on animal tests, the computer correctly predicted the outcome of toxicity studies 87 percent of the time—exceeding the 70 percent probability of animal tests to find a toxic substance again in a repeat animal test. Traditional testing for the nine different hazard classifications currently analyzed by the model consumes 57 percent of all animals in safety testing in Europe, or about 600,000 animals per year. These include assays such as skin and eye irritation/corrosion, skin sensitization, acute toxicity, and mutagenicity.

Some of the chemicals in the database have already been tested excessively, causing unnecessary loss of animal life. For instance, two chemicals were tested more than 90 times in rabbit eyes; 69 substances were tested more than 45 times in the same eye test. Our AI approach could drastically reduce the need for these common animal toxicity tests, and save quite a bit of money in the process. Running the supercomputer to create the map cost about $5,000, and computer costs per prediction are now negligible. Most importantly, beyond the regulation of chemicals being brought to market, chemists can identify hazards before the compounds are ever synthesized. And when a toxic chemical needs to be replaced, the AI approach can help ensure that less-toxic substances are chosen.

The power of this approach depends on the availability of data, and its implementation will only work for toxic properties that are directly dependent on chemical structure. Where the underlying biology is complex—such as links between a substance and cancer, or insults to a developing embryo—more information is required. Here, large standardized databases are rare, with the notable exception of some US agencies' robot-testing programs. These programs have subjected about 2,000 chemicals to more than 700 cellular assays, and nearly 10,000 chemicals to about 60 assays. With computing power now more accessible and far more affordable, AI is the solution to many real-world problems, such as limiting the need for animals in safety testing and biological research.

Thomas Hartung is chair for Evidence-based Toxicology at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland, where he is a professor of molecular microbiology and immunology. Hartung is also a professor of pharmacology and toxicology at the University of Konstanz in Germany. He directs the Centers for Alternatives to Animal Testing (CAAT) of both universities.

Run Program:
Cancer
Using input from images to -omes, artificial intelligence can find patterns in tumors
and generate prognoses where human clinicians see only a jumble of data.

BY AMBER DANCE

It's the question on every cancer patient's mind: How long have I got? Genomicist Michael Snyder wishes he had answers.

For now, all physicians can do is lump patients with similar cancers into large groups and guess that they'll have the same drug responses or prognoses as others in the group. But their methods of assigning people to these groups are coarse and imperfect, and often based on data collected by human eyeballs.

"When pathologists read images, only sixty percent of the time do they agree," says Snyder, director of the Center for Genomics and Personalized Medicine at Stanford University. In 2013, he and then–graduate student Kun-Hsing Yu wondered if artificial intelligence could provide more-accurate predictions.

Yu fed histology images into a machine learning algorithm, along with pathologist-determined diagnoses, training it to distinguish lung cancer from normal tissue, and two different types of lung cancer from each other. Then he fed in survival data for those slides, letting the system learn how that information correlated with the images. Finally, he added in new slides that the model hadn't seen before, and asked the all-important longevity question.

The computer could predict who would live for shorter or longer than average survival times for those particular cancers—something pathologists struggle to do.1 "It worked surprisingly well," says Yu, now an instructor at Harvard Medical School.

But Snyder and Yu thought they could do more. Snyder's lab works on -omics, too, so they decided to offer the computer not just the slides, but also tumor transcriptomes. With these data combined, the model predicted patient survival even better than images or transcriptomes alone, with more than 80 percent accuracy.2 Today, pathologists normally make survival predictions based on visual evaluations of tissue micrographs, from which they assess a tumor's stage—its size and extent—and grade, the likelihood that it will grow and spread further. But pathologists don't always agree, and tumor grade doesn't always predict survival accurately.

Snyder and Yu aren't the only researchers who are recognizing the power of AI to analyze cancer-related datasets of images, of -omes, and most recently, of both combined. Although these tools have a long way to go before they reach the clinic, AI approaches stand to yield a precise diagnosis quickly, predict which treatments will work best for which patients, and even forecast survival. For now, some of those applications are still "science fiction," says Andrea Sottoriva, a computational biologist at London's Institute of Cancer Research who's working on AI to predict cancer evolution and choose the right drugs to treat a given tumor. "We aim to change that."

INPUT: Images, OUTPUT: Diagnosis
Finding and treating cancer before it progresses too far can be key to increased survival. When it comes to cervical cancer, for example, early detection leads to five-year survival rates of more than 90 percent. Doctors can fry, freeze, or excise precancerous cells in the top four millimeters of the cervix's transformation zone, a ring of tissue surrounding the cervix where cancer most often arises. Once the cancer metastasizes, however, survival rates drop to 56 percent or lower over five years.

"When pathologists read images, only sixty percent of the time do they agree."
—Michael Snyder, Stanford University

Early treatment is commonplace in developed nations, where women get regular Pap smears to check for abnormal cervical cells and tests for the human papillomavirus that causes the cancer. But in the developing world, such screenings are rare. There is a cheaper test—health care workers coat a woman's cervix in acetic acid, looking for telltale white areas that could indicate cancer—but "this technique is so inaccurate," says medical epidemiologist Mark Schiffman of the National Cancer Institute. As a result, some healthy women undergo treatment while others might have their precancerous cells missed, leading to cancer that requires more-radical treatments, such as chemotherapy, radiation, or hysterectomy.

Schiffman and other groups have been trying to find a way to make acetic acid screening more accurate—for example, by imaging with spectra other than white light. Schiffman's team had accumulated thousands of cervix pictures from diverse sources in the US and Costa Rica, including photos taken by health-care professionals with a magnifying camera called a colposcope or with a cellphone. But he was about to give up. "We couldn't make it be really as sensitive or as accurate or as reproducible as the other [tests]."

Then, near the end of 2017, a nonprofit associated with the Bill & Melinda Gates Foundation called Global Good reached out. The organization wanted to try machine learning on Schiffman's image collection, to see if a computer could provide diagnoses when physicians could not.

So Schiffman teamed up with Global Good and other collaborators to use a particular kind of machine learning, called a convolutional neural network, to analyze the cervix images. The goal of the algorithm was to identify features in the images—for example, how similar or dissimilar side-by-side pixels tend to be—that help it get the right diagnosis. At the start, its accuracy was no better than chance. As it analyzed more and more images, it weighed those features to help it find the answer. "It's a process of getting hotter, hotter, colder, colder, oh yeah, hotter, hotter . . . until it gets as close as it possibly can," explains Schiffman.
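The training-and-evaluation cycle Schiffman describes can be sketched with scikit-learn: fit a model on one portion of labeled images and judge it only on images it has never seen. A logistic regression on random pixel values stands in here for the team's convolutional neural network, and the "standard measure of accuracy" reported in such studies is often the area under the ROC curve.

```python
# Train on one portion of labeled images, evaluate on a held-out portion.
# Random arrays stand in for real cervical images; a simple classifier
# stands in for the convolutional neural network used in the study.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
n_images = 600
pixels = rng.random((n_images, 32 * 32))            # flattened image stand-ins
labels = rng.integers(0, 2, size=n_images)          # 0 = healthy, 1 = precancer/cancer

X_train, X_test, y_train, y_test = train_test_split(
    pixels, labels, test_size=0.3, random_state=0)  # 70/30 split

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]
print("held-out AUC:", roc_auc_score(y_test, scores))  # ~0.5 on random data
```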
HOW AI TAKES ON CANCER
Scientists have been using two main forms of clinical data to predict cancer outcomes: images (either photographs, as in the case of skin cancer, or pathology slides) and -omes of various sorts. Applying ever-more sophisticated machine learning approaches to these datasets can yield accurate diagnoses and prognoses, and even infer how tumors evolve (yellow arrows). Now, scientists are finding that images can predict -omics (blue arrows). Combining the two data sources gives researchers even better predictions of how long a cancer patient will live (thick purple arrows). The ultimate goal of these algorithms, currently under development in basic biology labs, is to help doctors select treatments and forecast survival.

In the graphic, direct imaging, histopathology slides, and genomes, transcriptomes, and proteomes all feed into machine learning, whose outputs include diagnosis/tumor grade, prognosis/survival, tumor evolution, and best drug treatments.
The team started with cervix images collected over seven years in Costa Rica from more than 9,000 women. Schiffman had also amassed data from more-accurate screening tests in these women, along with 18 years' worth of follow-up information on precancer or cancer diagnoses. The researchers used 70 percent of the complete dataset to train the model, then tested its performance on the images only from the remaining 30 percent. Schiffman couldn't believe the results: machine learning distinguished between healthy tissue, precancer, and cancer, scoring 91 percent on a standard measure of the accuracy of machine learning predictions. A human visual inspection, in contrast, only scored 69 percent.3 "I've never seen anything this accurate," says Schiffman. He was sure there was some mistake.

The group checked its work and asked collaborators at the National Library of Medicine to independently verify the technique. There was no error: the machine really was that good at identifying precancer and cancer. Armed with this new tool, Schiffman hopes to develop a low-cost screening test for cervical cancer coupling a cell phone–type camera with machine-based image analysis. First, he wants to train his algorithm on tens of thousands of cell-phone cervix images from all over the world.

He's not the only one eyeing smartphones for cancer diagnosis. Skin lesions—which might be cancerous or benign—are right on the surface, and anybody can snap a shot. Researchers at Stanford University built a database of almost 130,000 photographs of skin lesions and used it to train a convolutional neural network to distinguish between benign bumps and three different kinds of malignant lesions, with at least 91 percent accuracy. The algorithm outperformed the majority of 21 dermatologists asked to assess the same pictures.4

"I've never seen anything this accurate."
—Mark Schiffman, National Cancer Institute

A major challenge to creating predictive models of cancer is acquiring enough high-quality data. When the Stanford team compiled images of skin cancer from Stanford Medical School and from the internet, the angles, zooms, and lighting all varied. The researchers had to translate labels from a variety of languages, then work with dermatologists to correctly classify the lesions into more than 2,000 disease categories.

And, of course, most cancers require more than a smartphone camera to see what's going on. Observing individual cells in tumors requires microscopy. Scientists would also like to incorporate as much information as possible about a person's clinical treatments and responses, plus molecular data such as genomes, but that too can be hard to come by, says Yu. "Rarely will we find a patient with all the data we want."

INPUT: Images + -Omes, OUTPUT: Survival
As Snyder and Yu have found, -omics data, when available, can provide information about the molecular pathways involved in a given cancer that may help identify cancer type, survival, or likely response to treatment. In their initial, image-based studies, the researchers had 2,186 lung tissue slides, disease classifications from human pathologists, and patient survival times. The researchers used a computer algorithm to extract from those images nearly 10,000 features, such as cell shape or size, which they used to train several machine learning algorithms.

One approach that worked well is called Random Forest. It generates hundreds of possible decision trees; then those "trees" vote on the answer, and the majority rules. This algorithm was more than 75 percent accurate in distinguishing between healthy tissue and the two cancer types, and it could predict who fell into the high- or low-survival group with greater accuracy than models based solely on the cancer's stage.1 "This is something that goes beyond the current pathological diagnosis," says Yu.

In their follow-up study, the researchers ran their trained image analysis algorithm on histopathology slides from 538 people with lung cancer, then added transcriptomes and proteomes from those same patients, and asked the "random forest" to vote on the grade of their cancers. The expression levels of 15 genes predicted cancer grade with 80 percent accuracy. These genes turned out to be involved in processes such as DNA replication, cell cycle regulation, and p53 signaling—all known to play roles in cancer biology. The team also identified 15 proteins—not the ones encoded by the 15 genes—involved in cell development and cancer signaling that predicted grade with 81 percent accuracy. While the researchers didn't compare this to human performance, one study of pathologists found 79 percent agreement on lung adenocarcinoma grading5—suggesting the machine and humans were equally accurate. But the machine was going further, apparently homing in on the specific gene-expression factors driving a cancer's progression.

Finally, the researchers asked the computer to predict survival based on gene expression, cancer grade, and patient age. With all those data, the model achieved greater than 80 percent accuracy, correctly sorting cases into long-term and short-term survivors better than human pathologists, transcriptomes, or images alone.2

Inspired by Snyder and Yu's work, Aristotelis Tsirigos and colleagues at New York University School of Medicine also sought to link images to genetics in lung cancer, using 1,634 slides of healthy or cancerous lung tissue. Based on images alone, their convolutional neural network was able to distinguish adenocarcinoma from squamous cell carcinoma with about 97 percent accuracy. Then, the team fed the algorithm data on the 10 most commonly mutated genes in lung adenocarcinoma, and the computer learned to predict the presence of six of those mutations from the pathology slides with accuracy ranging from 73 to 86 percent.6

"It works quite well," commented Sottoriva, who was not involved in the work. "As a start, it's quite exciting."

Of course, doctors and scientists don't need to identify mutations via imaging; other tests are more straightforward and more accurate, with genetic sequencing providing a nearly perfect readout of the cancer's genome. This study, explains Tsirigos, serves to demonstrate that genetics and image features are related in predictable ways. Now, he's working to combine histopathology and molecular information to predict patient outcomes, as Yu and Snyder's group did. These kinds of methods should work for any cancer type, says Tsirigos, as long as researchers have the right data to input.
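A minimal sketch of the Random Forest approach described above, assuming scikit-learn and random numbers in place of the roughly 10,000 shape and size features extracted from each slide:

```python
# Random Forest in the spirit described above: many decision trees vote,
# and the majority rules. Random numbers stand in for features extracted
# from pathology slides.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_slides, n_features = 300, 500
features = rng.random((n_slides, n_features))
tissue_class = rng.integers(0, 3, size=n_slides)    # healthy or one of two cancer types

forest = RandomForestClassifier(n_estimators=200)   # hundreds of trees cast votes
print(cross_val_score(forest, features, tissue_class, cv=5).mean())
```

With real slide features in place of noise, the same few lines can also be pointed at a survival label (long- versus short-term) instead of tissue class.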

INPUT: -Omes, OUTPUT: Tumor Evolution
-Omics data are also useful on their own, even without images. For example, Sottoriva and colleagues are using genomics to understand tumor evolution. One tumor is typically made up of multiple cell lineages all derived from the same original cancer cell. To effectively treat cancer, it's important to understand this heterogeneity and the manner in which a tumor evolved. If a treatment works on only a portion of a tumor, the cancer will come back. "It's a real matter of life and death," says Guido Sanguinetti, a computer scientist at the University of Edinburgh and a collaborator on the tumor evolution studies.

By sampling multiple parts of an individual tumor, researchers can infer what evolutionary paths the cancer took; it's akin to sampling modern human genomes to trace various populations back to ancestral groups. (See "Tracking the Evolutionary History of a Tumor," The Scientist, April 2017.) Tumors from different patients, even with the same kind of cancer, tend to have wildly different evolutionary trees. Sanguinetti, Sottoriva, and colleagues think that if they can find common pathways that cancer tends to follow, oncologists could use that information to categorize people who are likely to have similar disease progression, or to respond similarly to drugs.

To find those common evolutionary trees, the researchers used a form of machine learning called transfer learning. The algorithm looks at all the trees from patients' genomes simultaneously, sharing information between them to find a solution compatible with the whole group, explains Sanguinetti. They called their tool REVOLVER, for Repeated Evolution in cancer. As a first test, they invented fictional tumor evolutionary trees. When they fed REVOLVER genomics data based on those made-up trees, it did spit out a phylogeny that matched the invented trees.
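REVOLVER's joint, transfer-learning treatment of all patients' trees is beyond a few lines of code, but the underlying question, which ordered steps recur across patients, can be illustrated with a toy tally of driver-gene orderings. The trajectories below are invented.

```python
# Drastically simplified illustration of looking for repeated evolutionary
# steps across patients. Each fictional patient's tree is reduced to an
# ordered list of driver mutations; REVOLVER itself jointly fits all trees
# rather than simply counting edges like this.
from collections import Counter

patient_trajectories = [
    ["APC", "KRAS", "PIK3CA"],     # invented example trajectories
    ["APC", "KRAS", "TP53"],
    ["APC", "TP53"],
    ["KRAS", "APC", "PIK3CA"],
]

edge_counts = Counter()
for trajectory in patient_trajectories:
    for earlier, later in zip(trajectory, trajectory[1:]):
        edge_counts[(earlier, later)] += 1

# Orderings shared by many patients hint at a recurrent sequence of events.
for (earlier, later), count in edge_counts.most_common(3):
    print(f"{earlier} -> {later}: seen in {count} patients")
```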
To validate the tool in a well-known form of cancer evolution, the researchers turned to the transition to malignancy in colorectal cancer. This happens as a benign adenoma accumulates mutations in known driver genes: for example, in APC, then KRAS, then PIK3CA. The researchers fed REVOLVER a set of genomes from nine real benign adenomas and 10 malignant carcinomas. Sure enough, the model drew phylogenetic trees that matched the adenoma-to-carcinoma transition.

The group then analyzed tumor samples for which the evolution was less well understood. In genomes from 99 people with non-small-cell lung cancer, REVOLVER identified 10 potential clusters of patients, based on the sequence of mutations tumors accumulated. People in some of those clusters lived for less than 150 days, while those placed in other clusters survived much longer, suggesting the categories have prognostic value. Similarly, REVOLVER found six clusters among 50 breast cancer tumors with varying levels of survival between clusters.7 "We didn't expect to find groups, really," says Sottoriva. "These results tell us that evolution in cancer can be quite predictable."

Medicine runs on those kinds of predictable patterns, says Sottoriva. And AI is a powerful tool to help identify patterns that are clinically relevant. Moreover, by selectively eliminating certain pieces of data from the model's input and seeing if its accuracy drops, bioinformaticians are starting to figure out what features the computers are using to distinguish those patterns, says Tsirigos.

Current AI applications for cancer research are just the beginning. Future algorithms may incorporate not only -omes and images, but other data about treatment outcomes, progression, and anything else scientists can get their hands on.

"At the end of the day," says Snyder, "when dealing with complicated diseases like cancer, you want every bit of information."

Amber Dance is a freelance science journalist living in the Los Angeles area.

References
1. K.H. Yu et al., "Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features," Nat Commun, 7:12474, 2016.
2. K.H. Yu et al., "Association of omics features with histopathology patterns in lung adenocarcinoma," Cell Syst, 5:620–27.e3, 2017.
3. L. Hu et al., "An observational study of deep learning and automated evaluation of cervical images for cancer screening," J Natl Cancer Inst, doi:10.1093/jnci/djy225, 2019.
4. A. Esteva et al., "Dermatologist-level classification of skin cancer with deep neural networks," Nature, 542:115–18, 2017.
5. Y. Nakazato et al., "Nuclear grading of primary pulmonary adenocarcinomas," Cancer, 116:2011–19, 2010.
6. N. Coudray et al., "Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning," Nat Med, 24:1559–67, 2018.
7. G. Caravagna et al., "Detecting repeated cancer evolution from multi-region tumor sequencing data," Nat Methods, 15:707–14, 2018.

A Silicon Brain

Biology-inspired computer chips may help scientists better simulate neurological function.

BY SANDEEP RAVINDRAN

SMART CHIP: A neuromorphic chip designed by the Heidelberg group of physicist Karlheinz Meier. The chip features 384 artificial neurons connected by 100,000 synapses, and operates approximately 100,000 times faster than the speed at which the brain computes.


In 2012, computer scientist Dharmendra Modha used a powerful supercomputer to simulate the activity of more than 500 billion neurons—more, even, than the 85 billion or so neurons in the human brain. It was the culmination of almost a decade of work, as Modha progressed from simulating the brains of rodents and cats to something on the scale of humans.

The simulation consumed enormous computational resources—1.5 million processors and 1.5 petabytes (1.5 million gigabytes) of memory—and was still agonizingly slow, 1,500 times slower than the brain computes. Modha estimates that to run it in biological real time would have required 12 gigawatts of energy, about six times the maximum output capacity of the Hoover Dam. "And yet, it was just a cartoon of what the brain does," says Modha, chief scientist for brain-inspired computing at IBM Almaden Research Center in northern California. The simulation came nowhere close to replicating the functionality of the human brain, which uses about the same amount of power as a 20-watt lightbulb.

Since the early 2000s, improved hardware and advances in experimental and theoretical neuroscience have enabled researchers to create ever larger and more-detailed models of the brain. But the more complex these simulations get, the more they run into the limitations of conventional computer hardware, as illustrated by Modha's power-hungry model.

Modha's human brain simulations were run on Lawrence Livermore National Lab's Blue Gene/Q Sequoia supercomputer, an immensely powerful conglomeration of traditional computer hardware: it's powered by a whole lot of conventional computer chips, fingernail-size wafers of silicon containing millions of transistors. The rules that govern the structure and function of traditional computer chips are quite different from that of our brains.

But the fact that computers "think" very differently than our brains do actually gives them an advantage when it comes to tasks like number crunching, while making them decidedly primitive in other areas, such as understanding human speech or learning from experience. If scientists want to simulate a brain that can match human intelligence, let alone eclipse it, they may have to start with better building blocks—computer chips inspired by our brains.

"It's one of the most ambitious technological puzzles that we can take on—reverse engineering the brain."
—Mike Davies, Intel

So-called neuromorphic chips replicate the architecture of the brain—that is, they talk to each other using "neuronal spikes" akin to a neuron's action potential. This spiking behavior allows the chips to consume very little power and remain power-efficient even when tiled together into very large-scale systems.

"The biggest advantage in my mind is scalability," says Chris Eliasmith, a theoretical neuroscientist at the University of Waterloo in Ontario. In his book How to Build a Brain, Eliasmith describes a large-scale model of the functioning brain that he created and named Spaun.1 When Eliasmith ran Spaun's initial version, with 2.5 million "neurons," it ran 20 times slower than biological neurons even when the model was run on the best conventional chips. "Every time we add a couple of million neurons, it gets that much slower," he says. When Eliasmith ran some of his simulations on digital neuromorphic hardware, he found that they were not only much faster but also about 50 times more power-efficient. Even better, the neuromorphic platform became more efficient as Eliasmith simulated more neurons. That's one of the ways in which neuromorphic chips aim to replicate nature, where brains seem to increase in power and efficiency as they scale up from, say, 300 neurons in a worm brain to the 85 billion or so of the human brain.

Neuromorphic chips' ability to perform complex computational tasks while consuming very little power has caught the attention of the tech industry. The potential commercial applications of neuromorphic chips include power-efficient supercomputers, low-power sensors, and self-learning robots. But biologists have a different application in mind: building a fully functioning replica of the human brain.

Many of today's neuromorphic systems, from chips developed by IBM and Intel to two chips created as part of the European Union's Human Brain Project, are also available to researchers, who can remotely access them to run their simulations. Researchers are using these chips to create detailed models of individual neurons and synapses, and to decipher how units come together to create larger brain subsystems. The chips allow neuroscientists to test theories of how vision, hearing, and olfaction work on actual hardware, rather than just in software. The latest neuromorphic systems are also enabling researchers to begin the far more challenging task of replicating how humans think and learn.

It's still early days, and truly unlocking the potential of neuromorphic chips will take the combined efforts of theoretical, experimental, and computational neuroscientists, as well as computer scientists and engineers. But the end goal is a grand one—nothing less than figuring out how the components of the brain work together to create thoughts, feelings, and even consciousness.

"It's one of the most ambitious technological puzzles that we can take on—reverse engineering the brain," says computer engineer Mike Davies, director of Intel's neuromorphic computing lab.

It's all about architecture
Caltech scientist Carver Mead coined the term "neuromorphic" in the 1980s, after noticing that, unlike the digital transistors that are the building blocks of modern computer chips, analog transistors more closely mirror the biophysics of neurons. Specifically, very tiny currents in an analog circuit—so tiny that the circuit was effectively "off"—exhibited dynamics similar to the flow of ions through channels in biological neurons when that flow doesn't lead to an action potential.

POWER GRID
Neuromorphic hardware takes a page from the architecture of animal nervous systems, relaying signals via spiking that is akin to the action potentials of biological neurons. This feature allows the hardware to consume far less power and run brain simulations orders of magnitude faster than conventional chips.

CONVENTIONAL HARDWARE
Supercomputer chip: not analogous to neural function; runs slower than biological neural networks; draws power on the scale of megawatts.
Desktop computer chip: not analogous to neural function; runs vastly slower than biological neural networks; draws 50–100 watts.

BRAIN
~85 billion neurons and 1 quadrillion synapses; runs on about 20 watts.

NEUROMORPHIC HARDWARE
SpiNNaker: variable number of artificial neurons, typically ~1,000, with 1 million synapses per 1,000 neurons; runs at the speed of biological neural networks; uses ~20 watts.
BrainScaleS: 512 artificial neurons and 128,000 synapses; 10,000x faster than biological neural networks; uses 1 watt.
Loihi: ~130,000 artificial neurons and 130 million synapses; runs at the speed of biological neural networks; uses tens of milliwatts.
IBM's TrueNorth: ~1 million artificial neurons and 256 million synapses; runs at the speed of biological neural networks; uses tens of milliwatts.

For scale, the graphic pairs these figures with everyday reference points: the Hoover Dam produces up to 2 gigawatts, a wind turbine produces ~2–3 megawatts, an incandescent light bulb uses 50–100 watts, the laser in a CD/DVD player uses ~5–10 milliwatts, and a hearing aid uses less than 1 milliwatt.
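The event-driven principle behind these figures, a unit that integrates its inputs and fires only when a threshold is crossed, can be captured in a few lines of Python. The leaky integrate-and-fire model below is a textbook simplification with arbitrary parameters, not the circuit used by any of the chips above.

```python
# A leaky integrate-and-fire neuron: input current is integrated until the
# membrane potential crosses a threshold, a spike is emitted, and the
# potential resets. Parameters are illustrative toy values.
import numpy as np

dt, steps = 1e-3, 1000                              # 1 ms steps, 1 s of simulated time
tau, v_rest, v_thresh, v_reset = 0.02, 0.0, 1.0, 0.0
rng = np.random.default_rng(0)
input_current = rng.random(steps) * 3.0

v, spikes = v_rest, []
for t in range(steps):
    dv = (-(v - v_rest) + input_current[t]) * dt / tau   # leaky integration
    v += dv
    if v >= v_thresh:                               # threshold crossed: emit a spike
        spikes.append(t * dt)
        v = v_reset
print(f"{len(spikes)} spikes in 1 s of simulated activity")
```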


Intrigued by the work of Mead and his colleagues, Giacomo Indiveri decided to do his postdoctoral research at Caltech in the mid-1990s. Now a neuromorphic engineer at the University of Zurich in Switzerland, Indiveri runs one of the few research groups continuing Mead's approach of using low-current analog circuits. Indiveri and his team design the chips' layout by hand, a process that can take several months. "It's pencil and paper work, as we try to come up with elegant solutions to implementing neural dynamics," he says. "If you're doing analog, it's still very much an art."

Once they finalize the layout, they email the design to a foundry—one of the same precision metal-casting factories that manufacture chips used in smartphones and computers. The end result looks roughly like a smartphone chip, but it functions like a network of "neurons" that propagate spikes of electricity through several nodes. In these analog neuromorphic chips, signals are relayed via actual voltage spikes that can vary in their intensity. As in the brain, information is conveyed through the timing of the spikes from different neurons.

"If you show the output of one of these neurons to a neurophysiologist, he will not be able to tell you whether it's coming from a silicon neuron or from a biological neuron," says Indiveri.

"If you're doing analog, it's still very much an art."
—Giacomo Indiveri, University of Zurich

These silicon neurons represent an imperfect attempt to replicate the nervous system's organic wetware. Biological neurons are mixed analog-digital systems; their action potentials mimic the discrete pulses of digital hardware, but they are also analog in that the voltage levels in a neuron influence the information being transmitted.

Analog neuromorphic chips feature silicon neurons that closely resemble the physical behavior of biological neurons, but their analog nature also makes the signals they transmit less precise. While our brains have evolved to compensate for their imprecise components, researchers have instead taken the basic concept into the digital realm. Companies such as IBM and Intel have focused on digital neuromorphic chips, whose silicon neurons replicate the way information flows in biological neurons but with different physics, for the same reason that conventional digital chips have taken over the vast majority of our computers and electronics—their greater reliability and ease of manufacture.

But these digital chips maintain their neuromorphic status in the way they capture the brain's architecture. In these digital neuromorphic chips, the spikes come in the form of packets of information rather than actual pulses of voltage changes. "That's just dramatically different from anything we conventionally design in computers," says Intel's Davies.

Whatever form the spikes take, the system only relays the message when inputs reach a certain threshold, allowing neuromorphic chips to sip rather than guzzle power. This is similar to the way the brain's neurons communicate whenever they're ready rather than at the behest of a timekeeper. Conventional chips, on the other hand, are mostly linear, shuttling data between memory hardware, where data are stored, and a processor, where data are computed, under the control of a strict internal clock.

When Modha was designing IBM's neuromorphic chip, called TrueNorth, he first analyzed long-distance wiring diagrams of the brain, which map how different brain regions connect to each other, in macaques and humans.2 "It really began to tell us about the long-distance connectivity, the short-distance connectivity, and about the neuron and synapse dynamics," he says. By 2011, Modha created a chip that contained 256 silicon neurons, the same scale as the brain of the worm C. elegans. Using the latest chip fabrication techniques, Modha packed in the neurons more tightly to shrink the chip, and tiling 4,096 of these chips together resulted in the 2014 release of TrueNorth, which contains 1 million synthetic neurons—about the scale of a honeybee brain—and consumes a few hundred times less power than conventional chips.3

Neuromorphic chips such as TrueNorth have a very high degree of connectivity between their artificial neurons, similar to what is seen in the mammalian brain. The massively parallel human brain's 85 billion neurons are highly interconnected via approximately 1 quadrillion synapses. TrueNorth is quite a bit simpler—containing 256 million "synapses" connecting its 1 million neurons—but by tiling together multiple TrueNorth chips, Modha has created two larger systems: one that simulates 16 million neurons and 4 billion synapses, and another with 64 million neurons and 16 billion synapses. More than 200 researchers at various institutions currently have access to TrueNorth for free.

In addition to their highly interconnected and spiking qualities, neuromorphic chips copy another feature of biological nervous systems: unlike traditional computer chips, which keep their processors and memory in separate locations, neuromorphic chips tend to have lots of tiny processors, each with small amounts of local memory. This configuration is similar to the organization of the human brain, where neurons simultaneously handle both data storage and processing. Researchers think that this element of neuromorphic architecture could enable models built with these chips to better replicate human learning and memory.

Learning ability was a focus of Intel's Loihi chip, first announced in September 2017 and shared with researchers in January of last year.4 Designed to simulate about 130,000 neurons and 130 million synapses, Loihi incorporates models of spike-timing-dependent plasticity (STDP), a mechanism by which synaptic strength is mediated in the brain by the relative timing of pre- and post-synaptic spikes. One neuron strengthens its connection to a second neuron if it fires just before the second neuron, whereas the connection strength is weakened if the firing order is reversed. These changes in synaptic strength are thought to play an important role in learning and memory in human brains.
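The STDP rule described above can be written down compactly. The sketch below uses the common exponential, pair-based form with made-up constants; Loihi's programmable learning rules are more general than this.

```python
# Pair-based spike-timing-dependent plasticity (STDP): if the presynaptic
# neuron fires shortly before the postsynaptic one, the synapse is
# strengthened; if it fires after, the synapse is weakened. Time constants
# and learning rates are illustrative only.
import math

def stdp_weight_change(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=0.020):
    """Weight update for one pre/post spike pair (times in seconds)."""
    dt = t_post - t_pre
    if dt > 0:    # pre fired first: potentiation
        return a_plus * math.exp(-dt / tau)
    if dt < 0:    # post fired first: depression
        return -a_minus * math.exp(dt / tau)
    return 0.0

weight = 0.5
weight += stdp_weight_change(t_pre=0.010, t_post=0.015)   # pre leads: strengthen
weight += stdp_weight_change(t_pre=0.030, t_post=0.022)   # post leads: weaken
print(round(weight, 4))
```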

Davies, who led the development of Loihi, says the intention is to capture the rapid, lifelong learning that our brains are so good at and current artificial intelligence models are not. Like TrueNorth, Loihi is being distributed to various researchers. As more groups use these chips to model the brain, Davies says, "hopefully some of the broader principles of what's going on that explain some of the amazing capability we see in the brain might become clear."

Neuromorphics for neuroscience
For all their potential scientific applications, TrueNorth and Loihi aren't built exclusively for neuroscientists. They're primarily research chips aimed at testing and optimizing neuromorphic architecture to increase its capabilities and ease of use, as well as exploring various potential commercial applications. Those range from voice and gesture recognition to power-efficient robotics and on-device machine learning models that could power smarter smartphones and self-driving cars. The EU's Human Brain Project, on the other hand, has developed two neuromorphic hardware systems with the explicit goal of understanding the brain.

BrainScaleS, launched in 2016,5 combines many chips on large wafers of silicon—more like ultra-thin frisbees than fingernails. Each wafer contains 384 analog chips, which operate rather like souped-up versions of Indiveri's analog chips, optimized for speed rather than low power consumption. Together they simulate about 200,000 neurons and 49 million synapses per wafer.

BrainScaleS, along with the EU's other neuromorphic system, called SpiNNaker, benefits from being part of the Human Brain Project's large community of theoretical, experimental, and computational neuroscientists. Interactions with this community guide the addition of new features that might help scientists, and allow new discoveries from both systems to quickly ripple back into the field.

Steve Furber, a computer engineer at the University of Manchester in the UK, conceived of SpiNNaker 20 years ago, and he's been designing it for more than a decade. After toiling away at the small digital chips underlying SpiNNaker for about six years, Furber says, he and his colleagues achieved full functionality in 2011. Ever since, the research team has been assembling the chips into machines of ever-increasing size, culminating in the million-processor machine that was switched on in late 2018.6 Furber expects that SpiNNaker should be able to model the 100 million neurons in a mouse brain in real time—something conventional supercomputers would do about a thousand times slower.

BUILDING BLOCKS: Each SpiNNaker chip is packaged together with memory (above), then tiled together into larger devices, such as the 48-node board on the right. Multiple boards can be connected together to form larger SpiNNaker systems (below).

Access to the EU Human Brain Project systems is currently free to academic labs. Neuroscientists are starting to run their own programs on the SpiNNaker hardware to simulate high-level processing in specific subsystems of the brain, such as the cerebellum, the cortex, or the basal ganglia. For example, researchers are trying to simulate a small repeating structural unit—the cortical microcolumn—found in the outer layer of the brain responsible for most higher-level functions. "The microcolumn is small, but it still has 80,000 neurons and a quarter of a billion synapses, so it's not a trivial thing to model," says Furber.

Next, he adds, "we're beginning to sort of think system-level as opposed to just individual brain regions," inching closer to a full-scale model of the 85 billion–neuron organ that powers human intelligence.

Mimicking the brain
Modeling the brain using neuromorphic hardware could reveal the basic principles of neuronal computation, says Richard Granger, a computational neuroscientist at Dartmouth College. Neuroscientists can measure the biophysical and chemical properties of neurons in great detail, but it's hard to know which of these properties are actually important for the brain's computational abilities. Although the materials used in neuromorphic chips are nothing like the cellular matter of human brains, models using this new hardware could reveal computational principles for how the brain shuttles and evaluates information.

Replicating simple neural circuits in silicon has helped Indiveri discover hidden

BUILDING A FUNCTIONAL MODEL OF THE BRAIN
Neuromorphic technology is powering ever bigger and more-complex brain models, which had begun to reach their limits with modern supercomputing. Spaun is one example. The 2.5 million–neuron model recapitulates the structure and functions of several features of the human brain to perform a variety of cognitive tasks. Much like humans, it can more easily remember a short sequence of numbers than a long sequence, and is better at remembering the first few and last few numbers than the middle numbers. While researchers have run parts of the current Spaun model on conventional hardware, neuromorphic chips will be crucial for efficiently executing larger, more-complicated versions now in development.

The diagram labels the modeled brain areas (visual cortex, temporal cortex, parietal area, prefrontal cortex, and motor cortex) and six numbered stages: (1) visual input, (2) information encoding, (3) working memory, (4) information decoding, (5) motor processing, and (6) motor output.

When shown a series of numbers (1), Spaun's visual system compresses and encodes the image of each number as a pattern of signals akin to the firing pattern of biological neurons. Information about the content of each image—the concept of the number—is then encoded as another spiking pattern (2), before it is compressed and stored in working memory along with information about the number's position in the sequence (3). This process mimics the analysis and encoding of visual input in the visual and temporal cortices and storage in the parietal and prefrontal cortices.

When asked to recall the sequence sometime later, the model's information decoding area decompresses each number, in turn, from the stored list (4), and the motor processing system maps the resulting concept to a motor command firing pattern (5). Finally, the motor system translates the motor command for a computer simulation of a physical arm to draw each number (6). This is akin to human recall, where the parietal cortex draws memories from storage areas of the brain such as the prefrontal cortex and translates them into behavior in the motor area.
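To make the encode-store-recall loop above concrete, here is a toy Python sketch that stores a short digit sequence as a single high-dimensional vector and recalls it in order. It borrows the circular-convolution binding idea associated with Eliasmith's vector-based architecture, but it is only an illustration, far simpler than Spaun's spiking implementation.

```python
# Toy vector memory: each digit and each list position is a random
# high-dimensional vector; position and digit are bound by circular
# convolution, bound pairs are summed into one memory trace, and recall
# compares the unbound result with the known digit vectors.
import numpy as np

rng = np.random.default_rng(0)
dim = 512

def unit_vector():
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

def bind(a, b):                      # circular convolution via FFT
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=dim)

def unbind(trace, key):              # correlate with the approximate inverse of the key
    inv = np.concatenate(([key[0]], key[:0:-1]))
    return bind(trace, inv)

digits = {d: unit_vector() for d in "0123456789"}
positions = [unit_vector() for _ in range(4)]

sequence = "3147"
memory = sum(bind(positions[i], digits[d]) for i, d in enumerate(sequence))

recalled = ""
for pos in positions:
    estimate = unbind(memory, pos)
    recalled += max(digits, key=lambda d: np.dot(digits[d], estimate))
print(recalled)                      # usually recovers "3147" at this dimensionality
```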
benefits of the brain's design. He once gave a PhD student a neuromorphic chip that had the ability to model spike frequency adaptation, a mechanism that allows humans to habituate to a constant stimulus. Crunched for space on the chip, the student decided not to implement this feature. However, as he worked to lower the chip's bandwidth and power requirements, he ended up with something that looked identical to the spike frequency adaptation that he had removed. Indiveri and his colleagues have also discovered that the best way to send analog signals over long distances is to represent them not as a continuously variable stream, for example, but as a sequence, or train, of spikes, just like neurons do. "What's used by neurons turns out to be the optimal technique to transmit signals if you want to minimize power and bandwidth," Indiveri says.

Neuromorphic hardware can also allow researchers to test their theories about brain functioning. Cornell University computational neuroscientist Thomas Cleland builds models of the olfactory bulb to elucidate the principles underpinning our sense of smell. Using the Loihi chip enabled him to build hardware models that were fast enough to mimic biology. When given data from chemosensors—serving as artificial versions of our scent receptors—the system learned to recognize odors after being exposed to just one sample, outperforming traditional machine learning approaches and getting closer to humans' superior sniffer.

"By successfully mapping something like that and actually showing it working in a neuromorphic chip, it's a great confirmation that you really do understand the system," says Davies.

Cleland's olfaction models didn't always work out as expected, but those "failed" experiments were just as revealing. The odor inputs sometimes looked different to the sensors than the model predicted, possibly because the odors were more complex or noisier than expected, or because temperature or humidity interfered with the sensors. "The input gets kind of wonky, and we know that that doesn't fool our nose," he says. The researchers discovered that by paying attention to previously overlooked "noise" in the odor inputs, the olfactory system model could correctly detect a wider variety of inputs. The results led Cleland to update his model of olfaction, and researchers can now look at biological systems to see if they use this previously unknown technique to recognize complex or noisy odors.

Cleland is hoping to scale up his model, which runs in biological real time, to analyze odor data from hundreds or even thousands of sensors, something that could take days to run on non-neuromorphic hardware. "As long as we can put the algorithms onto the neuromorphic chip, then it scales beautifully," he says. "The most exciting thing for me is being able to run these 16,000-sensor datasets to see how good the algorithm is going to get when we do scale up."

SpiNNaker, TrueNorth, and Loihi can all run simulations of neurons and the brain at the same speed at which they occur in biology, meaning researchers can use these chips to recognize stimuli—such as images, gestures, or voices—as they occur, and process and respond to them immediately. Apart from allowing Cleland's artificial nose to process odors, these capabilities could enable robots to sense and react to their environment in real time while consuming very little power. That's a huge step up from most conventional computers.

For some applications, such as modeling learning processes that can take weeks, months, or even years, it helps to be a bit faster. That's where BrainScaleS comes in, with its ability to operate at about 1,000–10,000 times faster than biological brains. And the system is only getting more advanced. It's in the process of being upgraded to BrainScaleS2, with new processors developed in close collaboration with neuroscientists.

The new system will be able to better emulate learning and model chemical processes, such as the effects of dopamine on learning, that are not replicated in other neuromorphic systems. The researchers say it will also be able to model various kinds of neurons, dendrites, and ion channels, and features of structural plasticity such as the loss and growth of synapses. Maybe one day the system will even be able to approximate human learning and intelligence. "To understand biological intelligence is, I think, by far the biggest problem of the century," says Johannes Schemmel, a biophysicist at Heidelberg University who helped develop BrainScaleS. (See "The Intelligence Puzzle," The Scientist, November 2018.)

Current artificial intelligence systems still trail the brain when it comes to flexibility and learning ability. "Google's networks became very good at recognizing images of cats once they were shown 10 million images of cats, but if you show my two-year-old grandson one cat he will recognize cats for the rest of his life," says Furber.

With advances to Loihi set to roll out later this year, Eliasmith hopes to be able to add more high-level cognitive and learning behaviors to his Spaun model. He says he's particularly excited to try to accurately model how humans can quickly and easily learn a cognitive task, such as a new board game. Famed AI gamers such as AlphaGo must model millions of games to learn how to play well.

It's still unclear whether replicating human intelligence is just a matter of building larger and more detailed models of the brain. "We just don't know if the way that we're thinking about the brain is somehow fundamentally flawed," says Eliasmith. "We won't know how far we can get until we have better hardware that can run these things in real time with hundreds of millions of neurons," he says. "That's what I think neuromorphics will help us achieve."

Sandeep Ravindran is a freelance science journalist living in New York City.

References
1. C. Eliasmith et al., "A large-scale model of the functioning brain," Science, 338:1202–05, 2012.
2. D.S. Modha, R. Singh, "Network architecture of the long-distance pathways in the macaque brain," PNAS, 107:13485–90, 2010.
3. P.A. Merolla et al., "A million spiking-neuron integrated circuit with a scalable communication network and interface," Science, 345:668–73, 2014.
4. M. Davies et al., "Loihi: A neuromorphic manycore processor with on-chip learning," IEEE Micro, 38:82–99, 2018.
5. J. Schemmel et al., "A wafer-scale neuromorphic hardware system for large-scale neural modeling," Proc 2010 IEEE Int Symp Circ Sys, 2010.
6. S.B. Furber et al., "The SpiNNaker Project," Proc IEEE, 102:652–65, 2014.

A Deeper Look
With the help of computer programs that learn
from experience, researchers look for meaning
in vast volumes of image data.

BY JEF AKST

Six years ago, Steve Finkbeiner of the Gladstone Insti-
tutes and the University of California, San Francisco,
got a call from Google. He and his colleagues had
invented a robotic microscopy system to track single cells over
time, amassing more data than they knew what to do with. It
was exactly the type of dataset that Google was looking for to
apply its deep learning approach, a state-of-the-art form of
artificial intelligence (AI).

“We generated enough data to be interesting, is basically what they said,” Finkbeiner recalls of the phone conversation. “They were interested in blue-sky ideas—problems that either humans didn’t think would even be possible or things that a computer could do ten times better or faster.”

One application that came to Finkbeiner’s mind was to have a neural network—an algorithm that commonly underlies deep learning approaches (see “Primer” on page 12)—examine microscopy images and draw from them information that researchers had been unable to visually identify. For example, images of unlabeled cells taken with basic light microscopes do not reveal many details beyond the cell’s overall size and shape. If a researcher is interested in a particular biological process, she will typically use stains or fluorescent tags to look for finer-scale cellular or molecular structures. Maybe, Finkbeiner thought, a computer model could learn to see these fine details that scientists couldn’t when looking at untreated samples.

In April of last year, he and his colleagues published their results: after some training, during which the deep learning networks were shown pairs of labeled and unlabeled images, the models were able to differentiate cell types and identify subcellular structures such as nuclei and neuronal dendrites.1 “The upshot of that article: the answer was, resoundingly, yes. You could take images of unlabeled cells and . . . predict images of labeled and unlabeled structures,” says Finkbeiner. “It’s a way to, almost for free, get a lot more information [on] the cells you’re studying.” The very next month, a second team of researchers similarly showed that a deep learning model could identify stem cell–derived endothelial cells without staining.2 (See “Into the Cell” on page 18.)

PAINT BY NUMBERS: Using unlabeled images, a neural network identifies neurons derived from induced pluripotent stem cells (green) in a culture containing diverse cell types (nuclei in blue). The error map highlights pixels where the model’s prediction was too bright (magenta) or too dim (teal). Scale bars = 40 µm. (Panels: BRIGHTFIELD, TRUE NEURON LABEL, PREDICTED NEURON LABEL, ERROR MAP.)

Deep learning is really dominant at the moment. It’s really changing the field of image analysis.
—Peter Horvath, Hungarian Academy of Sciences

Such is the power of AI, and deep learning in particular. The approach draws inspiration from the neuron-based architecture of the animal nervous system, with many layers of interconnected nodes that communicate information and learn from experience. From basic research needs, such as cell type identification, to public health and biomedical applications, sophisticated machine learning models are changing how researchers interact with visual data.

Because of their simplicity and power, these approaches are quickly becoming popular for biological image analyses. “The use of deep learning in microscopy—there was really almost nothing in 2017,” says Samuel Yang, a research scientist at Google who collaborated on the cell microscopy project with Finkbeiner. “In 2018, it [took] off.”

“Deep learning is really dominant at the moment,” agrees Peter Horvath, a computational cell biologist at the Biological Research Centre of the Hungarian Academy of Sciences. “It’s really changing the field of image analysis.”

Identity crisis
There are many ways scientists design intelligent computer systems to identify cells of interest. Last year, Aydogan Ozcan of the University of California, Los Angeles, and colleagues built sensors that focused a deep learning network on microscopy images generated by a mobile phone microscope to screen the air for pollen and fungal spores. The model accurately identified more than 94 percent of samples of the five bioaerosols it was trained on.3 “This is where AI is getting really powerful,” says Ozcan, who notes that traditional screening methods require sending samples out for analysis, while this method can be done on the spot. In 2017, a group of Polish researchers used deep learning algorithms

to classify bacterial genera and species in microscopy images, an important part of basic and applied research in fields such as agriculture, food safety, and medicine.4

In the biomedical sciences, AI approaches that distinguish among human cell types could improve diagnosis. Finkbeiner has partnered with the Michael J. Fox Foundation for Parkinson's Research to create a neural network that can differentiate between stem cell–derived neurons from patients with Parkinson's disease and those from healthy controls. Traditionally, researchers have identified a variable that differs between diseased and healthy cells, but the AI model can learn to use any number of qualities in the image to make its assessment, Finkbeiner says. This multivariate approach should reveal a more complete view of the cell's health, he says, and help researchers screen for treatments that move a cell toward a healthy state using this more holistic perspective. "With deep learning, you can really transform the way screening is done."

The use of AI for analysis of microscopy images is nothing new to the cancer field, which has been applying machine learning approaches to analyze micrographs of biopsy samples for more than a decade. In the last couple of years, deep learning has become a popular tool to try to improve diagnoses and treatment regimes. (See "Run Program: Cancer" on page 22.) After training on images that have been graded by a pathologist, these models can learn to classify various tumors, including lung and ovarian, and to predict cancer progression, often more accurately than clinicians can.

It's still pretty early days in the field. But already, we've been pretty blown away by how powerful the approach is.
—Steve Finkbeiner, Gladstone Institutes and University of California, San Francisco

The key to these artificially intelligent models is the computer's agnostic approach, says Carlos Cordon-Cardo, a physician scientist at the Icahn School of Medicine at Mount Sinai who last year coauthored a study on a machine learning algorithm that graded prostate cancers with a high level of accuracy. "Rather than force the markers, [it's] more of an open box; the algorithm selects what may be the best players."

Which elements of the cellular environment are important for a deep learning model to make accurate evaluations is often unclear. "It learns by itself. . . . I cannot say what the neural network does," says Christophe Zimmer, a computational biophysicist at the Pasteur Institute in Paris. "And the neural network cannot tell me, unfortunately."

Nonetheless, that information is valuable, as it could provide insights into disease processes that might in turn lead to better diagnostics or treatments. So researchers will "try to break the model," Finkbeiner explains. By taking away or blurring parts of the image, scientists can see what reduces the model's performance. This, then, points to those elements as being key for the model's assessments, yielding clues about what the models "see" in the images that might have some biological underpinning.

"It's one thing to produce data but another to produce knowledge," says Cordon-Cardo.

PREDICTING PROSTATE CANCER PROGRESSION: The normal prostate contains microscopic glands that appear as ring-like or circle structures (grey in leftmost image). Artificial intelligence (AI)–based methods, including deep learning approaches, may identify abnormalities in these and other structures to predict clinical outcomes from histological images of prostate cancer. In the far right image, red is used to denote complete ring structures while green denotes abnormal ring fragments indicative of cancer. This information can be used to predict the cancer's grade, a measure of disease severity. AI algorithms can also evaluate levels of the androgen receptor (blue in center image), which, when present and activated, drives tumor growth and cell proliferation (bright yellow).

AI improves resolution
Using deep learning to analyze microscopy images is just one example of how AI can influence visual data, says Ozcan. "This is the first wave—let there be an image captured as before . . .


and we'll act on [it] better than before. The second wave is even more exciting to me. That is how AI is helping us with image generation."

Finkbeiner's work with Google is one example. The deep neural network can, using brightfield images, generate images that look like fluorescence micrographs. And Ozcan and his colleagues have used deep learning approaches to improve the quality of smartphone microscopes5 and create holographic reconstructions to visualize objects in three dimensions based on a single snapshot.6 Most recently, his group enhanced basic images taken with benchtop microscopes into images of a quality similar to or better than those taken by Nobel Prize–winning super-resolution technologies.7 (See "AI Networks Generate Super-Resolution from Basic Microscopy," The Scientist, December 17, 2018.) These improved images can then be fed into deep learning models that analyze their content, Ozcan says.

"It's still pretty early days in the field," says Finkbeiner. But already, "we've been pretty blown away by how powerful the approach is." His team is now adapting its robotic system to integrate AI directly into microscopes, teaching them to recognize individual cells not by their physical location but by "facial" recognition. One day, he says, researchers may be able to program the system to conduct experiments autonomously—delivering drugs to certain cells based on what it observes, for example—which "could accelerate discovery and lead to less bias," Finkbeiner says. (See "Profile" on page 54.)

In addition to being powerful, AI paired with microscopy is user-friendly. "Deep learning is just so powerful and so easy to use just kind of out of the box," says Yang, if you have the right problem and data. He says he was surprised the first time a colleague showed him that a deep neural network model trained on images of dogs, cats, and cars from the internet could be adapted to cluster microscopy images of breast cancer cells "almost perfectly" based on what type of drug the cells had been treated with. For cellular applications, most researchers will recommend their models be trained on relevant sets of images to ensure accuracy, but experts say the approach itself is essentially transferrable.

"If there is a state-of-the-art paper in computer vision, it doesn't really take any modification for it to be applicable to biology," says Allen Goodman, a computer scientist at the Broad Institute of MIT and Harvard. "It's Google, Apple, etc.—we can benefit from these methods almost immediately."

UPSAMPLING: Deep learning approaches can transform images taken with a smartphone microscope into an image akin to a micrograph taken with a benchtop instrument (left), or turn such standard micrographs into super-resolution captures (top). (Panels: MODEL INPUT, MODEL OUTPUT, GROUND TRUTH.)

References
1. E.M. Christiansen et al., "In silico labeling: Predicting fluorescent labels in unlabeled images," Cell, 173:P792–803.E19, 2018.
2. D. Kusumoto et al., "Automated deep learning–based system to identify endothelial cells derived from induced pluripotent stem cells," Stem Cell Rep, 10:P1687–95, 2018.
3. Y. Wu et al., "Label-free bioaerosol sensing using mobile microscopy and deep learning," ACS Photon, 5:4617–27, 2018.
4. B. Zieliński et al., "Deep learning approach to bacterial colony classification," PLOS ONE, 12:e0184554, 2017.
5. Y. Rivenson et al., "Deep learning enhanced mobile-phone microscopy," ACS Photon, 5:2354–64, 2018.
6. Y. Rivenson et al., "Phase recovery and holographic image reconstruction using deep learning in neural networks," Light-Sci Appl, 7:17141, 2018.
7. H. Wang et al., "Deep learning enables cross-modality super-resolution in fluorescence microscopy," Nat Methods, 16:103–10, 2019.

A Wider View
AI makes inroads in analysis of images of the
world on a macroscopic scale.

BY CAROLYN WILKE

Artificial intelligence is changing how researchers
examine the microscopic biological world. At the
same time, machine learning approaches are being
applied to images on greater scales. From snapshots of the
brain and other organs to satellite images of Earth’s surface,
intelligent computer programs can spot trends or features
of complex systems that escape visual detection by experts.

Plant Spies
Instead of watering or applying pesticides to an entire field, farmers may be able to be more selective, thanks to AI's ability to spot plants in need. Scientists and engineers from the Spanish National Research Council (CSIC) in Córdoba, Spain, deploy drones equipped with cameras to map areas of land and snap pictures that can be mined for information using machine learning and object-based image analysis, a method for grouping pixels into objects.

The team trained the model by providing information about how much water plants had received along with a set of pictures of the plants. In preliminary tests presented at the SPIE Commercial + Scientific Sensing and Imaging meeting in April 2018, the algorithm could discriminate between images of well-hydrated versus water-stressed hydrangea and butterfly bush (doi:10.1117/12.2304739, 2018). These are "differences that are not obvious in the field," says Jose Peña, one of the CSIC researchers.

The team is also turning to AI to locate tiny weeds lurking in the field, allowing farmers to target small areas with herbicides. For this application, the researchers employed a machine learning approach called Random Forest that learns from photos of the field. To differentiate between crops and weeds, the model uses context clues such as size or the placement of the plant in or outside of a row, achieving nearly 90 percent accuracy in one patch of sunflowers (Remote Sens, 10:285, 2018). The researchers are working to optimize the algorithms, hoping to create AI-enabled "smart sprayers" that are more efficient at detecting and spraying weeds as they image rows of crops than ones currently on the market.
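As a rough illustration of the Random Forest idea described above, the sketch below trains a classifier on the kind of context clues the article mentions—plant size and placement relative to a crop row. The features, thresholds, and data are invented for illustration; this is not the CSIC team's pipeline.

# Hedged sketch: Random Forest crop-vs-weed classifier on made-up features
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
plant_area_cm2 = rng.uniform(1, 400, n)       # object size from the segmented image
dist_to_row_cm = rng.uniform(0, 40, n)        # placement relative to the crop row
greenness_index = rng.uniform(0.2, 0.9, n)    # simple spectral feature
X = np.column_stack([plant_area_cm2, dist_to_row_cm, greenness_index])
# toy labeling rule: small plants far from a row are treated as weeds
y = ((plant_area_cm2 < 100) & (dist_to_row_cm > 10)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))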

BIRD'S EYE VIEW: A drone flies over an experimental field of ornamental plants while snapping photos with a camera attached to its underside (above). The images are fed into an object-based image analysis algorithm that predicts the water stress condition of each plant in several processing stages (left). The original image (top left) is segmented by the algorithm (top right), grouped into plants (bottom left), and edges are removed (bottom right), before extracting spectral data that are translated to a water stress rating.

Camera Traps
Aided by cameras with shutters triggered by motion, animal
researchers can keep an eye on their field sites even from far
away. But such camera traps snap away at anything that passes
by, and it still takes a lot of human effort to slog through photos
to identify the animals and make note of what they’re doing. (See
“Wild Ones,” The Scientist, April 2017.)
A tool developed last year by researchers at the University
of Wyoming showed that AI—along with tens of thousands
of volunteers—can help with this task. In a project called
Snapshot Serengeti, citizen scientists labeled 3.2 million
pictures—tagging them with information such as the spe-
cies present, numbers of individuals captured in the shots,
and the animals’ behavior. The researchers then developed a
convolutional neural network, a deep learning program that
mimics the way the brain makes connections, and showed
it the images that had been annotated by online volunteers.
After training, the model correctly identified roughly 94 per-
cent of the images (PNAS, 115:E5716–25, 2018).
The model can process the majority of the images, matching
the accuracy of human assessments, and hand the tough ones off to



experts, says Mohammad (Arash) Norouzzadeh, the first author of
the group’s work and a PhD student at the University of Wyoming.
Of course, some images are tricky even for people, he adds. For
instance, when an animal is moving, is too close to or too far from
the camera, or is partly out of frame, the picture may not receive
enough labels to be used for training. The next step is to develop
more-advanced algorithms that can extract information from cam-
era trap projects with less training, Norouzzadeh says.

NOW YOU SEE ME: The neural network labels this image "zebra, moving."
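A minimal sketch of the kind of convolutional classifier used in such camera-trap work appears below. It is not the Snapshot Serengeti architecture; the species count and the randomly generated "images" are placeholders standing in for the annotated photo collection.

# Hedged sketch: tiny CNN classifier for camera-trap photos (placeholder data)
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

n_species = 48                                                  # assumed label count
images = np.random.rand(64, 224, 224, 3).astype("float32")      # stand-in for photos
labels = np.random.randint(0, n_species, 64)                    # stand-in for annotations

model = models.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.GlobalAveragePooling2D(),
    layers.Dense(n_species, activation="softmax"),              # one score per species
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(images, labels, epochs=1, batch_size=16)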

Forecasting Wildfires
Satellite cameras capture wildfires blazing across the land. With physics-based models. Tackling wildfire spread by apply-
these pictures in hand, researchers at the University of Waterloo ing AI to real world datasets presents a challenge because
in Ontario, Canada, are using an AI strategy called reinforcement the data are noisy. “The real world has a lot of complications
learning to create models that predict how the fires spread. In an that we don’t anticipate,” says study coauthor Mark Crowley,
iterative process using images from previous fires, the model receives a computer scientist at Waterloo. “It will push us to find bet-
images showing a fire’s location every 16 days, the length of time ter algorithms or software.”
between satellite pictures. The model then predicts the next 16 days’ Other problems that reinforcement learning could study in
spread and receives feedback about the accuracy of its prediction, the environmental realm involve the spread of infectious dis-
improving the model’s understanding of how fires move. As it pro- ease and climate, says Crowley. Researchers have already applied
gresses, the model learns “rules” that wildfires follow—for example, other machine learning techniques to predict flooding (Neural
that fire stops when it meets a lake (Front ICT, 5:6, 2018). Comput Appl, 27:1129–41, 2016) and drought (Geomat Nat Haz
The researchers found that their model holds up well Risk, 8:1080–102, 2017), and computer scientists are continu-
against others developed by machine learning and against ally working to make their tools more powerful.
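The 16-day predict-and-correct cycle can be sketched as below. The Waterloo group formulates the problem as reinforcement learning over satellite imagery; this toy loop only shows the feedback structure—predict the next burn mask, score the prediction against the next observed image, and adjust—using made-up data and a deliberately crude two-parameter spread model.

# Schematic only: iterate prediction and feedback on hypothetical burn masks
import numpy as np

rng = np.random.default_rng(1)
frames = rng.random((5, 64, 64)) < 0.3       # fake fire masks, one per 16-day interval
weights = np.zeros(2)                        # crude model: burning now, burning neighbors

def neighbors_on_fire(mask):
    padded = np.pad(mask.astype(float), 1)
    return (padded[:-2, 1:-1] + padded[2:, 1:-1] +
            padded[1:-1, :-2] + padded[1:-1, 2:])

def predict(mask, w):
    score = w[0] * mask + w[1] * neighbors_on_fire(mask)
    return score > 0.5

for t in range(len(frames) - 1):
    guess = predict(frames[t], weights)
    accuracy = (guess == frames[t + 1]).mean()            # feedback signal
    err = frames[t + 1].astype(float) - guess.astype(float)
    weights[0] += 0.01 * (err * frames[t]).mean()          # nudge toward observed spread
    weights[1] += 0.01 * (err * neighbors_on_fire(frames[t])).mean()
    print(f"cycle {t}: accuracy {accuracy:.2f}")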

Brain Scans
Scientists use three-dimensional MRI scans to track how a brain showing signs of Alzheimer's disease changes over time. With the goal of improving Alzheimer's disease diagnoses, researchers from Columbia University and Carnegie Mellon University trained a convolutional neural network to mine those scans. After being shown brain scans from patients with the disease and from healthy controls, the model could differentiate between diseased and healthy brains in scans it hadn't previously seen with 93 percent accuracy (bioRxiv, doi:10.1101/456277, 2018). Another model that analyzed 2-D slices extracted from the 3-D scans also had good success, indicating that patients could someday receive a diagnosis from a shorter scan that images less of the brain, according to the researchers.

"A lot of neurologic and psychiatric conditions are . . . localized to certain areas of the brain," says Frank Provenzano, one of the study's authors. In addition to improving Alzheimer's diagnoses, deep learning approaches could shed light on which regions of the brain drive the disease. Alzheimer's shrinks the brain, causing changes to several parts of the organ, so the scientists were surprised that their model identified one region—the hippocampal formation, which includes the hippocampus and nearby entorhinal cortex—as the major contributor to its prediction of Alzheimer's.

Researchers are using similar AI approaches in other health-care contexts, including assessing MRI images of infants' brains to assess the risk of autism (Nature, 542:348–51, 2017), analyzing tumors in the liver (Radiology: Artificial Intelligence, 1:e180019, 2019), and scanning the eye to probe retinal disease (Brit J Ophthalmol, 103:167–75, 2019).

AI SCANS: A 3-D rendering of the human brain shows regions of the brain in different colors. The most important areas to a deep learning method's classification of Alzheimer's disease are the hippocampus (yellow) and entorhinal cortex (blue), shown in the inset.

FAST-MOVING FIRES: Satellite imagery showing the Fort McMurray Fire in Alberta, Canada, in 2016 (left) is analyzed by a reinforcement model that predicts the fire's movement (right). Correctly predicted fire pixels are red, blue pixels were predicted incorrectly, and black pixels were correctly labeled not ablaze.

EDITOR’S CHOICE PAPERS

The Literature
GENETICS

CRISPR Predictions
THE PAPERS
M.W. Shen et al., “Predictable and precise
template-free CRISPR editing of pathogenic
variants,” Nature, 563:646–51, 2018.
F. Allen et al., "Predicting the mutations generated by repair of Cas9-induced double-strand breaks," Nat Biotechnol, 37:64–72, 2019.

CRYSTAL BALL: CRISPR guide RNAs target specific spots in the genome for the Cas9 enzyme to cut, forming a double-strand break. A machine learning algorithm predicts which types of repairs will be made at a site targeted by a specific guide RNA. Possibilities include an insertion of a single base pair (a), a small deletion (b), or a larger change known as a microhomology deletion (c).

During gene editing with CRISPR technology, the Cas9 scissors that cut DNA home in on the right spot to snip with the help of guide RNA. The way the genetic material is stitched back together afterward isn't terribly precise, though; in fact, scientists have long thought that without a template, the process is random. However, "there's been anecdotal evidence that cells don't repair DNA randomly," geneticist Richard Sherwood of Brigham and Women's Hospital tells The Scientist. A 2016 paper also suggested patterns in the repairs. Sherwood wondered if artificial intelligence could predict these outcomes.

In a study published last year in Nature, Sherwood and colleagues describe how they trained a machine learning algorithm called inDelphi to predict repairs made to DNA snipped with Cas9, using experimental data from 1,872 target sequences cut and then restitched in mouse and human cell lines. The algorithm showed that 5–11 percent of the guide RNAs used induced a single, predictable repair genotype in the human genome in more than 50 percent of editing products. In other words, the edits aren't random, the team reports.

Separately, Felicity Allen and Leopold Parts of the Wellcome Sanger Institute in the UK and colleagues created an algorithm called FORECasT (favored outcomes of repair events at Cas9 targets) to do the same thing. Based on a library of 41,630 guide RNAs and the sequences of the targeted loci before and after repair—a dataset that totaled more than 1 billion repairs in various cell types—the model showed that the majority of repairs are either single base insertions, small deletions, or longer deletions called microhomology-mediated deletions, and are based on specific sequences that exist at the Cas9-cut site. The algorithm was then able to use the sequences that determine each repair to predict Cas9 editing outcomes, the researchers reported in Nature Biotechnology. The predicted repairs are similar to Sherwood's, but based on much more data, Allen and Parts say.

"It's the right place and the right time for these predictions to occur," says Rich Stoner, the chief science officer at Synthego, a genome engineering company interested in developing repair-prediction programs, similar to inDelphi, FORECasT, and a third one called SPROUT, for commercialization. However, Stoner notes, a still-unpublished analysis of the three algorithms' results reveals that at times they all made vastly different predictions for the same cuts in the same types of cells, suggesting that the algorithms' accuracy needs improvement.

Accurate predictions of sequence repair could allow researchers to computationally predict the precise guide RNAs that will reproduce exact human patient mutations, leading to the development of better research models to study genetic disease. Sherwood and his colleagues also showed that their algorithm could predict which guide RNAs would be needed to—without a repair template—correct disease-causing mutations found in human patients, a clinical application of CRISPR that is still years, if not decades, from becoming a reality. The predicted repairs worked on cell lines from patients with a rare genetic disease that causes a blood clotting deficiency and albinism, and another that includes growth failure and nervous system deterioration.

Next, Sherwood says, "we would want to test whether we can fix disease-causing mutations in animal models, with an eventual goal of doing so for human patients."
—Ashley Yeager
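The core idea—learn a mapping from the sequence around a Cas9 cut site to the likely repair outcome—can be illustrated with a generic classifier, as in the hedged sketch below. It is a toy stand-in, not inDelphi, FORECasT, or SPROUT, and the sequences and labels are random placeholders.

# Toy sketch: sequence context around a cut site -> repair-class probabilities
import numpy as np
from sklearn.linear_model import LogisticRegression

BASES = "ACGT"
rng = np.random.default_rng(2)

def one_hot(seq):
    # encode each base of the window as four 0/1 features
    return np.array([[b == base for base in BASES] for b in seq], float).ravel()

# hypothetical training data: 20-bp windows around the cut site, labeled
# 0 = 1-bp insertion, 1 = small deletion, 2 = microhomology-mediated deletion
seqs = ["".join(rng.choice(list(BASES), 20)) for _ in range(500)]
labels = rng.integers(0, 3, 500)

X = np.array([one_hot(s) for s in seqs])
clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)
print(clf.predict_proba(X[:1]))          # predicted repair-class probabilities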

CLEANING UP RNA SEQ DATA: Computational methods help researchers reduce noise in the data generated during single-cell RNA sequencing.

TECHNIQUES

Intelligent Data Miners


THE PAPERS
G. Eraslan et al., "Single-cell RNA-seq denoising using a deep count autoencoder," Nat Commun, 10:390, 2019.
M. Büttner et al., "A test metric for assessing single-cell RNA-seq batch correction," Nat Methods, 16:43–49, 2019.

In the not-so-distant past, researchers had to pool thousands of cells together for bulk RNA sequencing, yielding an averaged snapshot of gene expression. But advances in technology and significant reductions in cost now enable scientists to sequence RNA from single cells, unleashing a flood of transcription data.

"It used to be that you had to wait for the biologists [to generate data for analysis], but now we are the slow guys," says Fabian Theis, a computational biologist at Helmholtz Zentrum München in Germany. "There's just so much data to analyze that we can't keep up."

One difficulty with single-cell RNA sequencing data is separating meaningful variation from noise. For instance, a gene may appear "silent" because it is not expressed or because its expression was missed for technical reasons. Theis and colleagues try to cut that noise with an artificial intelligence algorithm called a deep count autoencoder (DCA), an artificial neural network that can compress gene expression data into fewer dimensions, distilling the information down to its most important relationships.

To see how well these parameters capture the full picture, the algorithm recreates a full-size dataset and compares it to the original, noting the differences. It repeats this process, stopping once the program doesn't achieve an improvement after 25 cycles. In two trials with simulated data, the scientists removed some sequences from the database to introduce noise and found that the autoencoder could recover cell-type information that had become masked. In their Nature Communications paper published early this year, they also applied the algorithm to several examples of real transcriptome data, using it in one case to identify the cell types in a blood sample—a task required for various medical and research applications.

Another group published an AI-based tool similar to DCA a few months earlier, which could also recover data hidden by noise and cluster cells into subgroups based on their mRNA (Nat Methods, 15:1053–58, 2018). Compared with more-traditional statistical approaches, the autoencoder techniques are "a more universal approach, quite elegant, where you . . . can let the machine learning take care of fitting all the parameters," says Peter Kharchenko, a computational biologist at Harvard University who was not involved in this research. Plus, he adds, they're very flexible and scalable. "The huge advantage of these [models] is that it's easier to build on these tools."

In addition to extracting more information from individual data sets, researchers can combine data from different days and augment their pool of sequences with ones from other labs. Considering multiple data sets together allows researchers to view a more complete landscape of cellular biology, says Zhichao Miao, a computational postdoc at the European Molecular Biology Laboratory–European Bioinformatics Institute. However, sequence reads in different datasets are influenced not only by biological variation in the cells being sequenced—differences over time or between treatment groups—but by unintended variation in experimental conditions and in the methodology, such as the sequencing protocol used.

Scientists use statistical approaches to remove such technical noise—so-called batch effects—while trying to capture and dissect biological variation. But there wasn't a quantitative metric for measuring how "batchy" the data remain, says Miao. So he, Theis, and their collaborators developed a method called k-nearest-neighbor batch-effect test, or kBET, that determines variance in the datasets and scores the different approaches on how well they eliminate batch effects to leave data that are "well mixed."

The kBET work, published online in Nature Methods last December, is a "good step forward," says Kharchenko, but new approaches may be required to evaluate batchiness in highly variable datasets, such as RNA sequences from cancer patients and healthy people. "If you consider the difference between samples to be technical variability, then it's clear what you want to do with it. You want to remove it," says Kharchenko, whose lab is developing tools for analyzing datasets with more-systematic differences. "If we move to that more challenging setting [of comparing very different groups], then the question itself becomes a little bit less obvious."
—Carolyn Wilke
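An autoencoder of the kind described in this piece can be sketched in a few lines: compress each cell's expression profile through a low-dimensional bottleneck and train the network to reconstruct the original matrix. The toy below uses random counts and a plain mean-squared-error loss; DCA itself pairs this architecture with a count-aware noise model, and every size here is arbitrary.

# Hedged sketch: compress-and-reconstruct autoencoder on a fake counts matrix
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

n_cells, n_genes, latent = 500, 2000, 32
counts = np.random.poisson(1.0, size=(n_cells, n_genes)).astype("float32")
x = np.log1p(counts)                               # simple normalization placeholder

inp = layers.Input(shape=(n_genes,))
encoded = layers.Dense(latent, activation="relu")(inp)        # bottleneck
decoded = layers.Dense(n_genes, activation="relu")(encoded)   # reconstruction
autoencoder = models.Model(inp, decoded)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x, x, epochs=3, batch_size=64, verbose=0)

denoised = autoencoder.predict(x)                  # reconstruction as denoised profile
embedding = models.Model(inp, encoded).predict(x)  # low-dimensional view of each cell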

PROFILE

Autopilot Advocate
Carnegie Mellon computational biologist Bob Murphy is betting that high-throughput experiments
orchestrated by machine learning algorithms will help crack biology’s mysteries.

BY SHAWNA WILLIAMS

It was the code that hooked Bob Murphy on biology. At age 13, he visited New York City's American Museum of Natural History with his parents and picked up a copy of Isaac Asimov's book The Genetic Code in the gift shop. After reading it, "I came downstairs, and I told my parents, 'I know what I want to do with my life,'" he recalls. "I was just fascinated by the idea that you could decode biological information, that biological systems were built on this DNA material that could be converted into RNAs and proteins."

That was in the mid-1960s. Murphy, true to his word, went on to study biochemistry at Columbia University. Then in 1974, when he was in graduate school at Caltech, he encountered another type of code that would shape his career. One day, after he'd extracted proteins from chromatin and run polyacrylamide gels of the samples, he and others in his lab were "having discussions about how to analyze [the gels], and somebody said, 'You know, that's the kind of thing that you could use a computer for,'" Murphy recalls. "And I literally said, 'What do you mean, use a computer?'"

Murphy was introduced to the computing lab at Caltech, and was soon learning to code—first in BASIC, and later in Fortran and MACRO-11. His advisor, James Bonner, acquired the lab's first minicomputer during Murphy's time there, and Murphy remembers connecting the lab's spectrophotometers and other instruments to the computer and developing software to allow automated data collection. With that computational assistance, he studied chromatin structure, determining that the distances between DNA-histone complexes called nucleosomes change during DNA replication. It was the start of a career devising computer programs to answer biological questions.

HARNESSING BIG DATA
Murphy earned his PhD in 1980 and returned to Columbia for a postdoc with Charles Cantor. In Cantor's lab, Murphy started working on identifying interactions between histones, which led him to the question of how cells take up certain chromatin components in the first place. He tackled the problem using a fluorescence cytometer that Cantor had just acquired for the lab. "You could put a suspension of cells in the instrument, and it would flow them through a laser and then measure the fluorescence given off by each cell," Murphy explains. "This was a very early high-throughput, data-generating instrument, and it was perfect for the kinds of things that I was interested in, especially because it produced a lot of data quickly." Using fluorescent probes, he and Cantor were among the first to show that the compartments containing molecules that have been gobbled from the cell's surface via endocytosis undergo a rapid drop in pH. The researchers also detailed the kinetics of what happens to those materials once a compartment enters the cell.

When it came time to apply for faculty jobs, Murphy saw an ad that seemed too perfect to pass up: Carnegie Mellon University (CMU) in Pittsburgh was starting a new center for applying fluorescence to biology experiments and was looking for professors. Murphy joined CMU in 1983 and has never left.

Murphy's early work at CMU continued in a similar vein to his research in Cantor's lab, using fluorescence spectrometry to trace the kinetics of what happens to materials after they are endocytosed into cells. For example, his team found strong evidence that the transmembrane protein Na+,K+–ATPase regulates acidification that he and Cantor had identified in endosomes.

SEEING IS BELIEVING
In the course of his fluorescence spectrometry research, Murphy learned that if one of his studies found, for example, that a particular cargo would end up inside a specific type of cellular compartment, reviewers would ask what the compartment looked like. "My first reaction was, well . . . the point here is to study the kinetics and show the biochemistry of this," he says. But eventually he relented. "We would then just go and do a very simple microscopy experiment, and that would make the reviewers happy. It actually hadn't changed the story, but there was a picture to go along with the story."

Similarly, at scientific talks he went to in the early 1990s, Murphy would see microscopy images given center stage—and, in his view, used to support models of what happened inside cells via tenuous reasoning. "I kept saying to myself, somebody has to try to find a way to make these images into something closer to data, something that you actually can operate on. . . . At a certain point I decided that we would try to do it."

Murphy "is one of these people who's brilliant, and he's almost always right," says Mario Roederer, who was Murphy's first graduate student in the mid-1980s and is now an immunologist at the National Institute of Allergy and Infectious Diseases. "His work ethic was amazing."

That work ethic served Murphy well as he and then MD/PhD student Michael Boland went about trying to computationally [. . .]
cells that contained the same labeled protein. It was a machine
learning approach to visually identifying organelles within cells that
removed the need for a human to first make the determination of
which organelles were shown in training images. As a result, it could
potentially achieve greater accuracy than people could.
“When we started, I would go to cell biology meetings and talk
about the idea of recognizing what organelle a protein is in by this
automated approach, and the reaction most people had was . . . ‘No,
you have to go to grad school in cell biology to be able to tell the dif-
ference between [organelles].’ It took a lot of subsequent work to
convince people that this was a viable approach,” Murphy recalls.
His group’s first paper on the image recognition program came
out in 1997, and eventually his team was able to achieve “basically
perfect” accuracy with it, he says. Yet all was not smooth sailing.
He had originally hoped the program would learn to recognize pro-
teins with similar distribution patterns as members of a common
class—for example, to group different lysosome-localizing proteins
in the same bucket. Instead, the program picked up on subtle dif-
ferences in the distribution patterns, such that "almost every new protein we looked at wasn't readily recognized as being one of those classes," he explains. Rather than continue trying to develop a classification model, the group switched gears, embracing the messy complexity of protein dynamics. They trained a model to break down each protein's distribution into a mixture of fundamental patterns, so that it could determine which organelles a protein was likely to be found in.

ROBERT MURPHY
Ray and Stephanie Lane Professor of Computational Biology, Carnegie Mellon University
Computational biology department head, Carnegie Mellon University
Senior Fellow, Allen Institute for Cell Science
Fellow, American Institute of Medical and Biological Engineering, 2007
Senior Member, Institute of Electrical and Electronics Engineers, 2007
Honorary Professor of Biology, Albert Ludwig University of Freiburg, 2011
Senior Member, International Society for Computational Biology, 2018

Greatest Hits
• Found that ligands are acidified rapidly after being endocytosed in mouse cells.
• Identified the transmembrane enzyme Na+,K+–ATPase as a key regulator of endosome acidification.
• Trained an artificial neural network to recognize from microscopy images where a protein had localized in HeLa cells, and later built a model to predict a given protein's distribution in cells.
• Showed that a machine learning algorithm with control of a robotic liquid handler and microscope can be used to predict the effects of a set of chemical compounds on the subcellular localization of a set of proteins while directly testing just less than a third of the possible combinations.
• Founded CMU's computational biology department in 2009, as well as the world's first master's program in automated science, which will begin this fall.

A NEW PARADIGM
Murphy's fascination with biology has remained, even as the idea that first attracted him to the field—that life is a code that can be cracked, with each gene corresponding to a single function—has failed the test of time. "The way we used to think about these kinds of problems was to try to simplify things as much as possible—reductionism—but it's been at least 20 years since we've realized that that's not going to work, that biological systems are complex systems," he told The Scientist late last year, in an interview in his corner office in one of Carnegie Mellon's newer buildings, overlooking the Steel City. Murphy talked about slides projected on a large wall-mounted flat-screen TV with the polish of someone practiced at conveying the reasons for his excitement to nonspecialists. Given biology's complexity, he says, it's just not possible to do all of the experiments needed to figure out all the interactions that govern each of a cell's functions in a traditional, hypothesis-driven way. "We need to have a way in which we can

only do only the experiments that we need to do and not do all possible experiments."

There are no hard and fast rules in biology, as there are in physics, Murphy elaborated in a later phone interview, because there are always exceptions. Computational models provide a way forward for biologists. For example, let's say a researcher had 96 drug candidates and wanted to know how they'd act on 96 different proteins within a cell line. Doing 9,216 experiments is out of the question, so instead, the researcher aims to do some fraction of those experiments, and use the results to train a machine learning program to model what the outcome of the others would have been.

That training process will be most effective if it's active rather than passive, Murphy says. That means that, rather than feeding a computer program a large data set for training, he wants researchers to hand the reins over to the program from the get-go, enabling it to determine which experiments' results would be most useful for improving its model—and then, to go get those results. By hooking up computers that run machine learning programs to instruments such as robotic liquid handlers and microscopes, and putting needed starting materials, such as drug candidates and cell lines, within robot arm's reach, his group has created automated setups that can construct and continuously refine experiments in response to research questions. In the drug candidate and cell-line example, the program starts out running experiments with nearly random combinations of the two. As it runs microarray analyses to "see" the results, it builds a model predicting what the results would be of all possible combinations. The program then interrogates that model to see which of its predictions are the most uncertain, and runs those experiments, using the results to further refine the model. As the cycle repeats, the accuracy of the model increases.

"It doesn't need us. It's the same way that a self-driving car works, in that you get in the self-driving car, and you tell it, 'I want to go to Cleveland,' and it figures it out. You don't tell it how to get to Cleveland, you just tell it what your goal is," Murphy says. Accordingly, he's dubbed these active machine learning setups "self-driving instruments."

The machine doesn't need us. It's the same way that a self-driving car works, in that you get in the self-driving car, and you tell it, "I want to go to Cleveland," and it figures it out. You don't tell it how to get to Cleveland, you just tell it what your goal is.
—Bob Murphy, CMU

DRIVING INTO THE FUTURE
Not surprisingly, realizing this ambitious vision of experiments with minimal human oversight requires plenty of people-powered planning, setup, and tinkering. Even as his own lab works out the kinks, Murphy has been working on launching a new CMU master's program in automated science to teach trainees how to set up and maintain self-driving instruments. The first master's program students will arrive this fall to an array of new equipment to practice on, including robotic liquid handling, microscopy, and nucleic acid extraction instruments. Murphy expects the program's graduates to fan out to industry employers and national labs, or to go on to earn PhDs and eventually set up their own research labs that harness artificial intelligence and automation.

The students in the new program should be prepared for a challenge. Both Roederer and Greg Johnson, who completed his PhD in Murphy's lab in 2016 and is now a scientist at the Allen Institute for Cell Science in Seattle, say Murphy is a tough and rigorous mentor. In one-on-one meetings, Johnson recalls Murphy repeatedly saying, "What is the question that you're asking, and why is it important?"—and how does that jibe with the computational methods used to address it. "[He] definitely was the first person who I met who had a clear articulation of the tight coupling between computational modeling and biological research," Johnson says.

Murphy expects that the utility of such modeling, coupled with self-driving instruments, will extend to experiments on many different species. He and colleague Joshua Kangas have already collaborated with plant biologists to automate an experiment on how various chemicals affect the growth of Arabidopsis protoplast cultures, and while that particular study didn't work out as hoped, Murphy and Kangas think machines could one day steer such plant experiments, and perhaps similar experiments in animals. Alternatively, a machine learning program could produce a readout of instructions for, say, mouse experiments it needs to refine its model, and a researcher could perform the experiments and feed the data back into the program. "The methods that we're working on are generalizable, and we try to find collaborations that will test the parts of the methods. . . that we think might need the most work," Murphy says.

Using images along with machine learning, Murphy thinks biology will advance toward a much more detailed structural understanding of spatial relationships within cells—where a given protein will be found in relationship to the cell membrane, for example, or to microtubules. Having that foundation will make it feasible to investigate how perturbations such as mutations or drug candidates change those orientations. Building comprehensive computational models of cells' spatial relationships will require the work of many labs, Murphy says, and he thinks the role of his lab is to "develop tools that will enable that, and describe a way in which we think that this task could be done." In one recent step in that direction, he and a graduate student worked with Seema Lakdawala of the University of Pittsburgh School of Medicine to use images to train a model of the spatial relationships of different segments of influenza RNA to predict how they likely come together within an infected cell to produce new infectious particles.

Ultimately, Murphy says, "We really believe that these self-driving instruments are going to change the way science is done."

SCIENTIST TO WATCH

Nick Turk-Browne: Pattern Seeker


Professor, Department of Psychology, Yale University, Age: 37

BY CATHERINE OFFORD

When Nick Turk-Browne was a teenager, he read V.S. Ramachandran's Phantoms in the Brain, a gift from his father, and was fascinated by the book's message that "our experience of the world is constructed by our brain," he says. "It blew my mind. It got me very much interested in this question of how do we experience the world, and how does our brain construct that experience?"

This fascination has been a guiding force for Turk-Browne, now a professor of cognitive neuroscience at Yale University. After graduating from the University of Toronto in 2004, he moved to Yale's psychology department for a PhD, and "was a superstar from the beginning," recalls psychologist Marvin Chun, one of Turk-Browne's advisors. "He absolutely is one of the most productive and creative PhD students I've mentored in my entire career."

In grad school, Turk-Browne focused on how the brain orchestrates statistical learning. Unlike episodic memory, which allows conscious recall of snippets of information such as phone numbers or birthdays, statistical learning involves the more subconscious association of events or objects, such as remembering a book with the table it's placed on, or a person with the building they're usually seen in. "It's about extracting patterns, or regularities, across experiences," explains Turk-Browne. This type of learning is critical "to generate expectations about the world, to interact, to behave in an efficient manner when confronted with new situations."

Using behavioral experiments and functional MRI (fMRI) brain scans, Turk-Browne found that the hippocampus, an area usually linked to episodic memory, unexpectedly also showed activity associated with statistical learning, even when people weren't paying attention to learning tasks.1 "The fact that the hippocampus was engaged in this relatively unconscious process of statistical learning was very surprising."

In 2009, straight out of grad school, Turk-Browne started his own lab at Princeton University. "We'd never seen someone who had been that productive, not just in terms of number of papers but in terms of the substance, as a graduate student," says Princeton psychologist Ken Norman. "It was clear that he was ready to leap right into a faculty position."

A few years in, Turk-Browne and grad student Anna Schapiro found that, during statistical learning, the hippocampus and nearby brain regions represent objects based on how they're temporally associated with one another—whether they're seen together, for example—rather than on function or appearance—say, associating books with other books.2 "We think that this is a basic mechanism for how the hippocampus is able to store these sorts of regularities," Turk-Browne says.

Using neural network models of the hippocampus, Turk-Browne, Schapiro, and Norman uncovered evidence for distinct anatomical pathways for episodic and statistical learning.3 Turk-Browne has also worked on novel tools such as BrainIAK, machine learning software developed with Intel that runs advanced analyses of fMRI data. In 2017, he returned to Yale as a professor and, as expected, he's an "outstanding teacher and departmental citizen," says Chun, who adds that he frequently finds himself learning from his former student. "It's a real thrill to have him back."

REFERENCES
1. N.B. Turk-Browne et al., "Neural evidence of statistical learning: Efficient detection of visual regularities without awareness," J Cogn Neurosci, 21:1934–45, 2009. (Cited 315 times)
2. A.C. Schapiro et al., "Shaping of object representations in the human medial temporal lobe based on temporal regularities," Curr Biol, 22:P1622–27, 2012. (Cited 220 times)
3. A.C. Schapiro et al., "Complementary learning systems within the hippocampus: A neural network modelling approach to reconciling episodic memory with statistical learning," Philos Trans R Soc B, 372:20160049, 2017. (Cited 62 times)

LAB TOOLS

Automating the Fight Against Superbugs


Antibiotic resistance is on the rise. Can AI help?

BY AMBER DANCE

Should you be so unlucky as to wind up in the hospital with a drug-resistant bacterial infection, doctors will need to figure out which antimicrobial drug has the best chance of killing your particular pathogen. With antibiotic resistance on the rise—and predicted to kill 10 million people per year by 2050—it's not always an easy choice.

It would help clinicians to be able to mine your superbug's genome for DNA sequences that indicate susceptibility or resistance to antibiotics. As a step toward that goal, bioinformaticians are tapping artificial intelligence to identify the most relevant sequences. They're making progress, thanks to databases stuffed with thousands of genomes from different strains of pathogenic bacteria, along with corresponding data on whether those strains were susceptible or resistant to dozens of antibiotics.

Some researchers are training machine learning algorithms to identify known drug resistance genes in new strains of a pathogen. Others are using AI to hunt for entirely new resistance genes, seeking a better understanding of how bacteria fight off drug treatments. And some are moving into metagenomes, aiming to understand the resistance profile of environments such as wastewater effluent.

Challenges remain before AI is able to prescribe your antibiotics, though, says James Davis, a computational biologist at Argonne National Laboratory. For one, fast, point-of-care sequencing remains expensive—and less accurate than slower conventional methods. For another, databases are often skewed toward resistant strains, because hospitals sequence the most difficult cases, but including genomes from antibiotic-susceptible strains would help the algorithms perform better, he says.

Here, The Scientist profiles three recent studies applying machine learning to the antibiotic resistance problem.

DRUGS & DOSES

What drugs have the best shot at curing an infection? Some scientists have relied on known antimicrobial resistance genes and proteins to match bacterial strains with drugs that are most likely to kill them. Davis says AI can do better, by analyzing entire genomes for both known and potentially unknown genes related to drug resistance or susceptibility. He and his team developed a machine learning approach to identify key differences between resistant and susceptible strains, and thus predict the drug-response profile of novel strains. The algorithm may also help scientists identify novel resistance genes.

The researchers recently tested their approach on Salmonella, a top cause of food poisoning (J Clin Microbiol, 57:e01260–18, 2019). Though the infection usually isn't severe, strains resistant to antibiotics can make people sicker.

INPUT
The researchers used 5,278 Salmonella genomes from the US Food and Drug Administration's National Antimicrobial Resistance Monitoring System, along with so-called minimum inhibitory concentrations, or MICs, for 15 antibiotics—that is, the lowest amount needed to block growth of each strain in the lab. All the bacteria had been isolated from raw meat and poultry for sale or from livestock being slaughtered for food.

The researchers used a program called the K-mer Counter (KMC) to divide each of those genomes into overlapping 10-mers. For example, if a hypothetical sequence started with AAAAAGGGGGTTTTTCCCCC, the first 10-mers would be AAAAAGGGGG, AAAAGGGGGT, AAAGGGGGTT, and so on, starting one base farther along each time. Then the computer counted how many times a given 10-mer appeared in each genome: the number of AAAAAGGGGGs, AAAAGGGGGTs, AAAGGGGGTTs, and so forth. These were the features fed into the machine learning algorithm, along with MIC data, to train it to predict MICs on its own.

MACHINE LEARNING
The team applied a machine learning algorithm called extreme gradient boosting (XGBoost). Using those 10-mer counts, the computer designs decision trees to predict the right MICs. Each decision point uses one of the 10-mers to help it classify a given genome as resistant or susceptible to various drugs. The algorithm then assigns different levels of importance to each 10-mer, and designs trees repeatedly, in rounds called "boosts," until it gets the lowest error it can for its MIC predictions compared to the true MICs. The researchers ran the algorithm 10 times, each time leaving out a different tenth of their dataset. They'd train the computer on the other 90 percent of the data, then use the remaining ten percent to test its accuracy.

OUTPUT
When given an entirely new genome, the program predicts which drugs the strain will be resistant or susceptible to, along with the relevant dose. In the team's test, with strains in the reserved 10 percent, the algorithm was 95 percent accurate.
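To make the workflow concrete, here is a minimal sketch, in Python, of the k-mer counting and gradient-boosting idea described above. The synthetic genomes, the log2 encoding of MICs, and the XGBoost settings are illustrative assumptions rather than the team's actual pipeline (their code is linked under TRY IT below).

```python
# Sketch only: synthetic genomes and MICs stand in for the real surveillance data.
from collections import Counter

import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

K = 10  # length of each overlapping k-mer

def kmer_counts(sequence, k=K):
    """Count every overlapping k-mer in one genome sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

# Synthetic stand-ins for genome sequences and minimum inhibitory concentrations
rng = np.random.default_rng(0)
genomes = ["".join(rng.choice(list("ACGT"), size=2000)) for _ in range(50)]
mics = 2.0 ** rng.integers(-3, 6, size=50)   # placeholder MICs on a twofold dilution scale

# Turn per-genome k-mer counts into a sparse numeric feature matrix
vectorizer = DictVectorizer()
X = vectorizer.fit_transform([kmer_counts(g) for g in genomes])
y = np.log2(mics)                            # predict the MIC on a log2 scale

# Gradient-boosted decision trees, scored by 10-fold cross-validation:
# train on 90 percent of the strains, test on the held-out 10 percent
model = XGBRegressor(n_estimators=200, max_depth=4)
scores = cross_val_score(model, X, y, cv=10, scoring="neg_mean_absolute_error")
print("mean absolute error (log2 dilutions):", -scores.mean())
```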
By rerunning their tests with 15-mers for each pathogen genome, and considering each antibiotic individually, the researchers identified DNA snippets associated with resistance or susceptibility to each drug. Comparing those 15-mers to the Salmonella sequence, the researchers started to figure out which genes were most important in making these predictions. In fact, many of the genes the algorithm chose corresponded to known drug resistance genes, indicating that the algorithm was on the right track. But not all pointed to well-understood resistance genes, suggesting the AI might be picking up on genetic features as yet unknown to scientists that are also associated with resistance. "There's biology there that's worth studying," says Davis.

INTELLIGENT DESIGN: Researchers are using artificial intelligence to explore the DNA sequences of bacteria that infect people and contaminate the environment in order to identify known and new drug resistance genes. (Infographic panels: bacteria from a patient; DNA sequences; AI to identify resistance genes; ideal prescription; wastewater or other environmental sample; sequences of novel resistance genes.)
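Continuing the sketch above, the importance scores that the boosted trees assign to each k-mer can be pulled out after training and used to flag candidate resistance markers, loosely mirroring the 15-mer analysis just described. The ranking below is purely illustrative and reuses the variables from the earlier sketch (it also assumes scikit-learn 1.0 or later for get_feature_names_out).

```python
# Continuation of the sketch above: rank k-mers by their learned importance.
import pandas as pd

model.fit(X, y)  # refit on the full synthetic dataset from the previous sketch
importance = pd.Series(model.feature_importances_,
                       index=vectorizer.get_feature_names_out())
print(importance.sort_values(ascending=False).head(20))  # candidate resistance-linked k-mers
```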

PRO
• The machine learning algorithm is not biased by a list of known resistance genes, or even protein-coding genes, allowing it to identify new genetic factors potentially involved in resistance across the entire genome.

CON
• The machine identifies 10- and 15-mers associated with drug responses, but it's not immediately clear which genes are relevant, or whether an individual sequence promotes resistance or susceptibility. Davis adds that it is usually possible to infer this information once he compares the 10- or 15-mers to the bacterial sequence.

TRY IT
Download the code at https://github.com/PATRIC3/mic_prediction.

GENE PROSPECTING

Researchers studying microbial resistance have generally focused on gene products that directly interact with the drug in question. But other kinds of genes—for example, genes that affect the permeability of the bacterial cell wall, or how the cell pumps out toxins and waste—might also influence susceptibility to antimicrobials.

Erol Kavvas, a bioengineering graduate student at the University of California, San Diego, hunted for novel resistance genes in the genome of Mycobacterium tuberculosis (Nat Commun, 9:4306, 2018). This bacterium infects some 10 million people worldwide each year, and more than 500,000 of these infections are resistant to commonly-prescribed antibiotics. "There's a lot of complexity to drug resistance in TB," Kavvas says.

INPUT
The researchers used 1,595 M. tuberculosis genomes from the Pathosystems Resource Integration Center (PATRIC) database, plus whether each genome was from a strain resistant or susceptible to 13 different antibiotics.

First, the researchers determined the pangenome—the full list of every possible protein-coding gene—from all the M. tuberculosis strains in their dataset. Based on this list, they identified all possible alleles that could potentially be present in a given TB genome. Then, they noted whether the genome of each individual strain possessed each allele or not. Together with the resistance data, these yes-or-no allele lists created a multidimensional matrix.

MACHINE LEARNING
Kavvas applied an approach called support vector machine, or SVM. The algorithm is designed to group similar data and draw boundaries between the groups. For example, for a simple, two-dimensional input matrix with just two kinds of variables, it might draw lines between groups. For the multidimensional matrix Kavvas created, it draws a multidimensional divider, called a hyperplane, between resistant and susceptible strains. To identify the most important genes for resistance, Kavvas also applied a technique called the L1-norm. Simply put, he told the computer to use a small number of genes to draw the boundary.
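A minimal sketch of that idea in Python: a linear support vector machine with an L1 penalty applied to a binary allele presence/absence matrix. The matrix, labels, and solver settings here are random stand-ins for illustration, not the PATRIC data or the study's actual configuration.

```python
# Sketch only: an L1-penalized linear SVM on a synthetic allele matrix.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
n_strains, n_alleles = 200, 5000
X = rng.integers(0, 2, size=(n_strains, n_alleles))   # 1 = strain carries this allele
y = rng.integers(0, 2, size=n_strains)                # 1 = resistant, 0 = susceptible

# The L1 penalty drives most allele weights to exactly zero, so the separating
# hyperplane is drawn using only a small number of alleles
clf = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000)
clf.fit(X, y)

used = np.flatnonzero(clf.coef_[0])                    # alleles with nonzero weight
print(f"{used.size} alleles used out of {n_alleles}")
```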

OUTPUT
The algorithm provides a list of genetic
mutations involved in resistance to each
drug, ranked by order of importance.
Overall, Kavvas identified 33 known drug
resistance genes; this information could
help doctors choose the right drug for a
patient with TB.
He also found 24 novel resistance
genes, many of which are involved in
metabolism and cell wall processes.
He hopes experimental biologists will
pick up on his results and work out how
those genes help neutralize antibiotics.
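Continuing the SVM sketch above, a crude stand-in for that ranked list can be produced by sorting the surviving alleles by the magnitude of their weights; the study's actual ranking procedure may differ.

```python
# Continuation of the SVM sketch: order candidate alleles by weight magnitude.
import numpy as np

order = np.argsort(-np.abs(clf.coef_[0]))
top = [int(i) for i in order if clf.coef_[0][i] != 0][:10]
print("top candidate resistance alleles (column indices):", top)
```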

PRO
• Many models are biased by using a
standard reference genome, which
may or may not represent the most
common strains in circulation. By
using the pangenome instead, the
team avoided this bias.

CON
• So far, Kavvas has only included variants in protein-coding genes, so he could miss relevant non-protein-coding elements, such as genes for regulatory RNAs, in other parts of the genome.

TRY IT
Download the code at https://github.com/erolkavvas/microbial_AMR_ML/.

GET META, GO DEEP

Microbes pick up new drug resistance genes from other bacteria, swapping them like trading cards. The swap meet goes down in places where microbes mix, such as wastewater from hospitals or farms where antibiotic use is high. Even after water treatment, traces of resistance-related DNA remain.

To assess the risk in water samples, researchers often compile metagenomes—that is, all the DNA within a microbial community—then look for known, individual antibiotic resistance genes that are homologous to sequences in their sample. But making those comparisons requires defining a threshold of similarity—say, 50-90 percent—that counts as close enough to call a bit of DNA a resistance gene. Researchers often set high, stringent thresholds, resulting in a high rate of false negatives, says Liqing Zhang, a bioinformatician at Virginia Tech. That is, many true resistance genes are overlooked.

Zhang and colleagues developed a new tool to assess the resistance genes in environmental samples. Called DeepARG (for antibiotic resistance genes), it compares the environmental DNA to all known resistance genes, one at a time, instead of to a single, most-homologous gene (Microbiome, 6:23, 2018). That helps because it focuses the comparison on broad categories of resistance genes and what they have in common, so the algorithm can identify novel genes that share those common features.

INPUT
First, the researchers built a database of known resistance genes and which of 30 different drugs they affect, collected from three sources: the Comprehensive Antibiotic Resistance Database (CARD), the Antibiotic Resistance Genes Database (ARDB), and the Universal Protein Resource (UNIPROT). They call the database DeepARG-DB.

They then used 70 percent of the 10,602 genes from UNIPROT to train the machine learning algorithm. To develop the input data, they had the computer compare the sequence of each gene from UNIPROT, individually, to the known resistance genes from the other two databases. The result was a list of thousands of similarity scores for each UNIPROT gene.

MACHINE LEARNING
Zhang's group used a deep learning model. These types of algorithms are inspired by how the human brain is thought to work, and they assign different weights to inputs to come up with the most accurate output. During the training, the computer figured out how to weight those similarity scores to make the best predictions of antibiotic resistance category for each UNIPROT gene.

The researchers built two different models for different kinds of DNA sequences. DeepARG-SS works for short reads, of 100 base pairs or so, like the reads one typically gets from metagenomic sequences. DeepARG-LS works with longer, gene-length reads.

OUTPUT
When tested with the remaining 30 percent of UNIPROT sequences it hasn't been trained on, Zhang's algorithm generates the probability that each sequence reflects a gene for resistance to each of the 30 categories of antibiotics. It was able to identify antibiotic resistance genes with low rates of both false negatives and false positives.

DeepARG's predictions match well with other reports. The researchers compared their results to a recently published list of 76 new antibiotic resistance genes (Microbiome, 5:134, 2017). "Sure enough, yes, we predicted 65 of them," says Zhang.

Her collaborators can now apply DeepARG to assess wastewater and other environmental samples. For example, they can test how wastewater treatment alters the profile of resistance genes.

PRO
• DeepARG does not require strict cutoffs to identify genes as related to drug resistance, so it provides fewer false negatives than standard comparisons.

CON
• The database only considers entire genes as resistance-related or not; it lacks the resolution to identify single nucleotide polymorphisms associated with resistance, or mutations that indirectly influence resistance pathways.

TRY IT
Download DeepARG-DB, or compare your sequences to it, at https://bench.cs.vt.edu/deeparg; or download the code at https://bitbucket.org/gusphdproj/deeparg-ss.
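As a rough illustration of the classifier described under MACHINE LEARNING and OUTPUT above, the sketch below trains a small feed-forward network on synthetic similarity-score vectors and reports held-out accuracy. DeepARG's real architecture, features, and training regime differ; this shows only the general shape of the task.

```python
# Toy stand-in for DeepARG: assign each gene to one of 30 antibiotic categories
# from its vector of similarity scores against known resistance genes.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
n_genes, n_scores, n_categories = 2000, 500, 30
X = rng.random((n_genes, n_scores))                # synthetic similarity scores per gene
y = rng.integers(0, n_categories, size=n_genes)    # synthetic antibiotic-category labels

# 70 percent of genes for training, 30 percent held out, as in the paper
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=50)
net.fit(X_train, y_train)

probabilities = net.predict_proba(X_test)          # one probability per category per gene
print("held-out accuracy:", net.score(X_test, y_test))
```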

BIO BUSINESS

Computing a Cure
The pharmaceutical industry is looking to machine learning to
overcome complex challenges in drug development.

BY BIANCA NOGRADY

Australia's government drug
safety watchdog sounded the
alarm about the oral antifungal
agent terbinafine in 1996. The drug,
sold under the brand name Lamisil by
pharma giant Novartis, had come onto
the market in 1993 for the treatment of
fungal skin infections and thrush. But
three years later, the agency had received
a number of reports of suspected adverse
reactions, including 11 reports of liver
toxicity. By 2008, three deaths from liver
failure and 70 liver reactions had been
pinned on oral terbinafine.
Researchers in Canada identified
the biochemical culprit behind terbin-
afine’s liver toxicity—a compound called
TBF-A that appeared to be a metabolite
of terbinafine—in 2001. Clinicians
quickly learned to monitor and manage
this potential toxicity during treatment,
but no one could work out how the
compound actually formed in the liver,
or could experimentally reproduce its
synthesis from terbinafine in the lab.
Then, in 2018, graduate student Na
Le Dang at Washington University in
St. Louis hit upon a way to use artificial
intelligence (AI)—specifically, a machine
learning algorithm—to work out the pos-



sible biochemical pathways terbinafine takes when it is metabolized by the liver. Trained on large numbers of known metabolic pathways, the algorithm had learned the most likely outcomes when different types of molecules were broken down in the organ. With that information, it was able to identify what no human could: that the metabolism of terbinafine to TBF-A was a two-step process.

Two-step metabolites are much more difficult than direct metabolites to detect experimentally, which is likely why this potentially lethal outcome wasn't flagged until after the product was on the market, says S. Joshua Swamidass, a physician scientist and computational biologist at Washington University and Dang's supervisor. The discovery not only shed light on a long-standing biochemical mystery, but showed how the use of AI could more broadly aid drug discovery and development.

Given enough data, machine learning algorithms can identify patterns, and then use those patterns to make predictions or classify new data much faster than any human. "A lot of the questions that are really facing drug development teams are no longer the sorts of questions that people think that they can handle from just sorting through data in their heads," Swamidass says. "There's got to be some sort of systematic way of looking at large amounts of data . . . to answer questions and to get insight into how to do things."

AI is well positioned to handle the complexity of the rules that must be applied to understand these data, too, says Regina Barzilay, a computer scientist at MIT and a scientific advisor for drugmaker Janssen, a subsidiary of Johnson & Johnson. "When we study chemistry, we definitely study a lot of rules and we understand the mechanism, but sometimes they're really, really complex," she says. "If [a] machine is provided with a lot of data, and the problem is formulated correctly, it has a chance to capture patterns which humans may not be able to capture."

Machine learning couldn't have matured at a better time for the pharmaceutical industry, says Shahar Keinan, cofounder and chief scientific officer at Cloud Pharmaceuticals, a company focusing on AI-based drug discovery. She argues that the steady decline in the number of new drug targets, new mechanisms, and novel first-in-class drugs coming to market each year is an indication that the current system of drug discovery is insufficient to meet modern challenges. "Right now, you need to do more work to get to those kind of first-in-class [drugs]," she says. "The way to overcome that . . . is to find something new in what [data] we have, and this is where artificial intelligence will come in."

The pharmaceutical industry and its investors seem to agree. Last year Cloud Pharmaceuticals entered into a drug discovery collaboration with pharma giant GlaxoSmithKline. And UK-based BenevolentAI, which describes its mission as "accelerating the journey from data to medicine" in areas ranging from drug discovery to clinical development, recently earned an eyebrow-raising multibillion-dollar valuation.

Designing better drugs
More than 450 medicines worldwide have been withdrawn from the market post-approval in the last half century as a result of adverse reactions, with liver toxicity the most common side effect. But the metabolism of compounds by organs such as the liver is extremely complex and, as in the case of terbinafine, difficult to anticipate.

This is exactly the sort of problem that machine learning can help solve—and the data are already available to help that process. For example, the US federal government's Tox21 program, a collaboration among the Environmental Protection Agency, the National Institutes of Health, and the Food and Drug Administration, maintains a large data set of molecules and their toxicity against key human proteins—perfect fodder for AI to digest in search of patterns of association between structure, properties, function, and possible toxic effects, Keinan says.

Cloud Pharmaceuticals is one company that makes use of these data as part of their workflow. "Now you can train a machine learning algorithm on this data set, then a new molecule comes along, and you can just use your prediction to say, 'Is this molecule toxic or not?'" says Keinan.

As well as identifying potential toxicities, machine learning algorithms could predict how a candidate molecule will respond to different physical and chemical environments, and so help drug developers understand how that molecule might behave in various tissues in the human body.

A group led by physical chemist Scott Hopkins of the University of Waterloo, in collaboration with researchers at Pfizer, has been training an algorithm to do just that, using data on 89 small-molecule drug candidates obtained by performing a type of spectrometry that measures how quickly molecules absorb or lose water.

"If our drug molecule absorbs a lot of water very quickly and doesn't give it up, it tells us that that drug is going to be very soluble in water," Hopkins says. "It's going to dissolve easily in your stomach and enter your bloodstream fairly quickly." After the algorithm learned the associations between certain molecular structures and solubility from those 89 molecules, it was able to accurately predict the key properties of a similar molecule, the team reported at the end of last year (Nat Commun, 9:5096).

Although screening for potential toxicities and for biochemical properties are essential sub-tasks in the process of drug development, a more tantalizing question is whether AI could suggest the structure of a new therapeutic molecule from scratch.

At Maryland-based biotech Insilico Medicine, CEO Alexander Zhavoronkov and colleagues are using a type of algorithm called a generative adversarial network to help develop entirely novel small-molecule organic compounds aimed at treating everything from cancer to metabolic disease to neurodegenerative disorders. This algorithm consists of two deep neural networks pitted against one another, Zhavoronkov explains.

The first deep neural network has the challenge of coming up with outputs—molecular structures—in response to a series of inputs, namely the desired functional and biochemical characteristics of those structures, such as solubility, targets, or bioavailability. The other deep neural network has the job of critiquing those outputs. "They engage in this adversarial game," Zhavoronkov says. "They basically kind of compete with each other . . . and in this process, after many, many iterations, they learn to generate something new."
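The two-network setup Zhavoronkov describes can be sketched in a few dozen lines of Python with PyTorch. In the toy example below the "molecules" are just 32-number descriptor vectors rather than real chemical structures, and the "known" compounds are random data; it is meant only to show how a generator and a critic push against each other, not how Insilico Medicine's system works.

```python
# Toy generative adversarial network: descriptor vectors stand in for molecules.
import torch
from torch import nn

DESC, NOISE, BATCH = 32, 16, 64   # descriptor size, random-seed size, batch size

generator = nn.Sequential(nn.Linear(NOISE, 64), nn.ReLU(), nn.Linear(64, DESC))
critic = nn.Sequential(nn.Linear(DESC, 64), nn.ReLU(), nn.Linear(64, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
c_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

real_molecules = torch.randn(512, DESC)   # stand-in for known desirable compounds

for step in range(200):
    real = real_molecules[torch.randint(0, 512, (BATCH,))]
    fake = generator(torch.randn(BATCH, NOISE))

    # 1) The critic learns to tell real compounds from generated ones
    c_loss = (loss_fn(critic(real), torch.ones(BATCH, 1))
              + loss_fn(critic(fake.detach()), torch.zeros(BATCH, 1)))
    c_opt.zero_grad()
    c_loss.backward()
    c_opt.step()

    # 2) The generator learns to produce outputs the critic accepts as real
    g_loss = loss_fn(critic(fake), torch.ones(BATCH, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

new_candidates = generator(torch.randn(5, NOISE))   # five "novel" descriptor vectors
```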

Insilico Medicine recently synthesized its first molecule from this process in partnership with China-based pharmaceutical company Wuxi AppTec. The two companies have now entered into a collaboration agreement to test other candidate molecules designed by the adversarial network approach for multiple orphan drug targets.

Other companies, such as Exscientia, a UK-based, AI-driven drug discovery company, are bringing machine learning to bear on the entire process of drug discovery. Exscientia is building a platform that aims to mimic the decision making of a medicinal chemist while also learning from human medicinal chemists' input. "It's really our ambition to automate as much as possible, [so] the chemists are just focusing on the much higher level, the difficult problems, the strategy," says Adrian Schreyer, chief technology officer at the company. This means the biological targets and the molecules designed to bind them will all be influenced by the outputs of their AI platform.

But there is still a human hand on the tiller. The algorithm "is fairly autonomous in terms of generating compounds, [but] the final selection is done by chemists," Schreyer says. This is partly because the patent system demands a human name to an invention, but also because the selections that the chemists make from the AI-generated options can be fed back into the algorithm to fine-tune its decision-making protocol for the next time. "If you see twenty to thirty compounds and [humans] pick ten . . . by virtue of the decision making you can see what's preferable, you can use that process itself to build a machine learning model."

Exscientia's approach has generated considerable interest from the pharmaceutical industry, with Roche investing up to $67 million in a drug-discovery collaboration with the company. Another recent financing round has earned the company $26 million in investment from companies including Celgene Corporation and GT Healthcare Capital Partners.

The machine's limit
Despite progress in using AI for drug development, it's not time for humans to step out of the process just yet, says Barzilay. For now, she explains, AI in pharma is analogous to a smart kitchen: "You have a microwave, and you have a coffee machine, and you have this and that, but none of it actually cooks you the dinner. . . . You need to put it all together to make a dinner, even though all these things can help you to do it faster and better."

While Barzilay and colleagues are part of a pharma-funded research collaboration with MIT to bring AI into drug discovery, she points out that machine learning has not yet brought a molecule to market. "It just assists humans to do a faster and better job in various sub-tasks, but we still don't have AI coming up with new drugs at this point," she says.

Artificial intelligence is also limited by the quality of the inputs—in particular, the quality of the data that it learns from, says Schreyer. "If the data is flawed, then the results are likely to be flawed as well, which can be a problem if you get data from third parties," he says.

HUMAN VERSUS MACHINE


Machine learning has the capacity to analyze vast quantities of biochemical data, and do so much faster and more accurately than a
human brain. But just how much better than a person can an algorithm really perform?
MIT computer scientist Regina Barzilay and her colleagues have attempted to answer this question by pitting a machine learning
algorithm against ten human chemists challenged with predicting the outcomes of organic reactions.
Participants and the algorithm were given 80 sample organic reactions and asked what the most likely product of each of those
reactions would be. The algorithm had been trained on a database of organic reactions from the US Patent and Trademark Office, while
eight of the ten humans were graduate, postdoctoral, and professor-level chemists with considerable experience in organic chemistry,
and the other two were graduate students in chemical engineering (ACS Cent Sci, 3:434–43, 2017).
The algorithm was, on average, more accurate than the chemists: the scientists’ average accuracy was 48.2 percent, while the algo-
rithm scored 69.1 percent. In fact, there was just one person who prevented artificial intelligence from claiming a total victory, Barzilay
says. “The only human who actually outperformed the machine, and not by a big margin, was the head of our chemistry department.”
Experimental data are not perfect measurements of the real world, and there are always assumptions and fitting involved. If the algorithm and its users do not sufficiently take those biases and weightings into account, then the outputs will also be biased.

Sometimes data can be lacking altogether, particularly for recently discovered molecules. At Cloud Pharmaceuticals, researchers are getting around this problem by using computational chemistry to design a range of molecules that could fit a particular biological target, then using that range as the data set that the machine learning algorithm can learn from to help design new candidate molecules.

Another issue for some researchers is the so-called "black box" nature of the algorithms themselves: data goes in, answers come out, but the internal workings remain a mystery. Swamidass says that this presents a fundamental conflict for scientists. "What does it mean to be applying black-box methods in fields that are all about understanding what's inside of a black box?" he says. "We don't want to just have something that works in science, we want to understand the underlying processes there."

But Barzilay says that the relationships between inputs and outputs might often go beyond human comprehension. "When we're talking about complex organic reactions with different solvents and different temperatures and other things, then it becomes really difficult even for the experienced human to predict what is the right outcome, so what does it mean [to] give an explanation?"

There is unquestionably a lot of hype around the potential of AI in drug discovery—Swamidass likens this period to the internet boom of the late 1990s. But many, such as Schreyer, are excited about the possibilities that this new technology offers, particularly when it comes to finding novel therapeutics for difficult-to-treat diseases.

"If you can improve . . . how efficiently you do drug discovery fivefold, tenfold, or even more," he says, "then from an economic perspective . . . you're able to take on more risky projects because the cost of failure is much lower."

Bianca Nogrady is a freelance science writer based in Sydney, Australia.



READING FRAMES

Robots and Eureka Moments


Artificial intelligence can be trained to recognize patterns and predict results.
But will machine learning ever be able to make novel scientific discoveries?

BY KARTIK HOSANAGAR

Most of the practical AI success stories in recent years have involved what computer scientists call supervised machine learning: the use of labeled datasets to train algorithms to automate what had been a human activity. For example, take a dataset of symptoms and test results of thousands of patients, along with their eventual diagnosis by doctors, and train an algorithm to learn the patterns in the dataset—that is, which symptoms and clinical markers predict which diseases. Similarly, take a dataset of labeled images and train an algorithm to recognize people's faces. These successes show that machine learning can, with the right training data, approximate tacit human knowledge. But is it possible for AI to extract knowledge unknown even to experts? Can we automate something like scientific discovery?

One potential approach, which I discuss in my recently published book A Human's Guide to Machine Intelligence, came from the late Don Swanson, an information scientist at the University of Chicago. Swanson was reading about the Inuit diet when one detail of it—high fish consumption—caught his attention. The research suggested that elevated fish oil intake increases blood flow, reduces blood vessels' reactions to cold, and dampens platelet-triggered clotting. The opposite of these changes in the blood system, Swanson happened to know, were all associated with Raynaud's disease, a syndrome that causes blood vessels to constrict in response to low temperatures or stress. Fish oil, Swanson hypothesized, might help treat Raynaud's disease.

Swanson found many studies that confirmed the observations that 1) fish oil improved blood circulation and 2) Raynaud's disease was associated with poor blood circulation. But none of the existing research suggested that fish oil could be an effective treatment for Raynaud's. In 1986, Swanson wrote a research paper proposing the hypothesis. In 1989, a clinical study conducted in the rheumatology clinic at Albany Medical College confirmed Swanson's hypothesis.

Swanson's main insight was that new knowledge could be uncovered by connecting disparate fields of knowledge: If A (fish oil) was related to B (blood flow) and B was related to C (Raynaud's symptoms), then there might be a potential relationship between A and C.

Swanson and University of Illinois at Chicago psychiatry professor Neil Smalheiser developed a computer program called Arrowsmith that plucked out such hypotheses from medical research databases, with a focus on theories generated out of links between disparate specialties. Swanson later hypothesized a relationship between magnesium deficiency and migraine headaches that was also supported by subsequent clinical research.

Over the years, Arrowsmith has had limited effect, but Swanson's early foray suggests that finding relationships in data from disparate fields can help tap into undiscovered knowledge hidden in data. Although Swanson's efforts were fully manual, such a process can indeed be automated to help uncover knowledge that scientists might not have discovered yet.

An alternative approach is illustrated by Google DeepMind's Go-playing software AlphaGo Zero. While the original version of the software was trained heavily on past games played by human Go players, AlphaGo Zero didn't bother studying human moves; instead, its entire training dataset was self-generated. The software, armed with basic rules of Go, played millions of games against itself. Next, it analyzed those games to figure out which moves helped and which ones hurt.

While supervised learning relies on cleanly labeled training data, AlphaGo Zero learned from data generated by an algorithm itself via exploration, an approach known as reinforcement learning. Such algorithms explore different actions and learn which actions lead to a better performance. Instead of being restricted to analyzing the data already obtained, the approach can explore the space of potential actions and prioritize what to test next. This ability to take stock of multiple hypotheses and explore them (that is, conduct experiments and acquire data to validate hypotheses), all while recognizing the cost of exploration, can be a big boost to scientific discovery. For example, drug discovery relies on coming up with millions of candidate molecules and running a series of experiments to identify if some molecule seems to work.

While AI is automating routine tasks in a variety of industries, there is great promise for its application in science as well.

A Human's Guide to Machine Intelligence, Viking, March 2019

Kartik Hosanagar is the John C. Hower Professor at the Wharton School of the University of Pennsylvania, where he studies technology and the digital economy. Read an excerpt of A Human's Guide to Machine Intelligence at the-scientist.com.

FOUNDATIONS

Learning Machine, 1951


BY JEF AKST

As an undergraduate at Harvard in the late 1940s and in his first year of grad school at Princeton in 1950, Marvin Minsky pondered how to build a machine that could learn. At both universities, Minsky studied mathematics, but he was curious about the human mind—what he saw as the most profound mystery in science. He wanted to better understand intelligence by recreating it.

In the summer of 1951, he got his chance. George Miller, an up-and-coming psychologist at Harvard, secured funding for Minsky to return to Boston for the summer and build his device. Minsky enlisted the help of fellow Princeton graduate student Dean Edmonds, and the duo crafted what would become known as the first artificial neural network. Called SNARC, for stochastic neural-analog reinforcement calculator, the network included 40 interconnected artificial neurons, each of which had short-term and long-term memory of sorts. The short-term memory came in the form of a capacitor, a piece of hardware that stores electrical energy, that could remember for a few seconds if the neuron had recently relayed a signal. Long-term memory was handled by a potentiometer, or volume knob, that would increase a neuron's probability of relaying a signal if it had just fired when the system was "rewarded," either manually or through an automated electrical signal.

Minsky and Edmonds tested the model's ability to learn a maze. The details of how the young researchers tracked the output are unclear, but one theory is that they observed, through an arrangement of lights, how a signal moved through the network from a random starting place in the neural network to a predetermined finish line. The duo referred to the signal as "rats" running through a maze of tunnels. When the rats followed a path that led toward the finish line, the system adjusted to increase the likelihood of that firing pattern happening again. Sure enough, the rats began making fewer wrong turns. Multiple rats could run at once to increase the speed at which the system learned.

"We sort of quit science for a while to watch the machine," Minsky told the New Yorker in a 1981 profile. "We were amazed that it could have several activities going on at once in its little nervous system."

Artificial neural networks continued to advance over the next seven decades, but it's only been since the turn of the 21st century that the approach has taken center stage in the field of artificial intelligence. Deep neural networks, which involve many layers of computation, are now powering myriad applications, including ever-more-realistic models of the human brain. (See "The Silicon Brain" on page 28.) But nearly 70 years ago, Minsky felt neural networks were too limited because they didn't have a sense of expectation, which is critical for filling in gaps in human perception. He shifted his focus to symbolic artificial intelligence, which, instead of trying to mimic the brain's approach to information processing, draws inspiration from the nature of human thought, doing computations based on high-level concepts that are interpretable by people.

Minsky, who died in 2016, did think that neural networks would be one component of a truly intelligent machine, notes his daughter, interactive computing expert Margaret Minsky, a visiting professor at New York University Shanghai. She wonders what he would think about that today. "Given the things [artificial neural networks] do and don't do, and how they fit into systems now, I really want to know what he would say."

ARTIFICIAL INTELLIGENCE TAKES BABY STEPS: The SNARC machine included 40 artificial neurons (one pictured), which were interconnected via a plugboard and held in racks in a contraption about the size of a grand piano. At one end of the neuron was a potentiometer (bar on far right), a sort of volume knob that could adjust the probability that an incoming signal would result in an outgoing signal. If the neuron did fire, a capacitor (red) on the other end of the neuron retained a memory of the firing for a few seconds. If the system was "rewarded"—either by the researchers pushing a button or an electrical signal from an outside circuit—a chain connected to the volume knobs for all 40 neurons would crank. This would cause the volume knob to increase the future probability of the neuron firing, but only if a magnetic clutch had been engaged by a recent firing.

