Storing Digital Binary Data in Cellular DNA: The New Paradigm
Ebook, 1,042 pages, 19 hours

About this ebook

Storing Digital Binary Data in Cellular DNA demonstrates that current digital information storage systems have short lifespans and limited capacity, and that the production and consumption of data are outpacing the supply of storage. Author Rocky Termanini explains the DNA system and how it encodes vast amounts of data, then presents information on the emergence of DNA as a storage technology for the ever-growing stream of data being produced and consumed. The book will be of interest to a range of readers looking to understand this game-changing technology, including researchers in computer science, biomedical engineers, geneticists, physicians, clinicians, law enforcement and cybersecurity experts.
  • Presents a comprehensive reference on the fascinating and emerging technology of DNA storage
  • Helps readers understand key concepts on how DNA works as an information storage system
  • Provides readers with key information on the technologies used to work with DNA data encoding, such as CRISPR
  • Covers emerging areas of application and ethical concern, such as Smart Cities, cybercrime and cyberwarfare
  • Includes coverage of synthesizing DNA-encoded data, sequencing DNA-encoded data, and fusing DNA with Digital Immunity Ecosystems (DIE)
Language: English
Release date: Aug 18, 2020
ISBN: 9780128234587
Author

Rocky Termanini

Dr. Rocky Termanini, CEO of MERIT CyberSecurity Group, is a subject matter expert in IT security and brings 46 years of cross-industry experience at national and international levels. He received his Ph.D. in Computer Science, Artificial Intelligence, from Yale University. He is the designer of the "Cognitive Early-Warning Predictive System" and "The Smart Vaccine™" which replicates the human immune system to protect the critical infrastructures against future cyber wars. Dr. Termanini spent five years in the Middle East working as a security consultant in Saudi Arabia, Bahrain, and the UAE. Professor Termanini’s teaching experience spans over 30 years. He taught Information Systems courses at Connecticut State University, Quinnipiac University, University of Bahrain, University College of Bahrain, Abu Dhabi University, and lectured at Zayed University in Dubai. Dr. Rocky Termanini is a senior advisor to the Department of Homeland Security and other Federal Law Enforcement agencies, as well as an advisor to the FBI on Cyber-terrorism and global malware. Dr. Termanini was the security manager of the Saudi e-Government project for the Saudi Ministry of Interior. Presently, Dr. Termanini helps companies set up cyber-security plans to protect their information assets and to monitor employee loyalty. He is a visiting professor to several universities in the Persian Gulf region, giving short courses in Digital Forensics and ethical hacking. Dr. Termanini has experience in preparing DARPA solicitations for cyber security grants and is the author of two books on Cybersecurity from CRC Press. He is the author of Storing Digital Binary Data into Cellular DNA: The New Paradigm from Elsevier Academic Press.

    Book preview

    Storing Digital Binary Data in Cellular DNA - Rocky Termanini

    Storing Digital Binary Data in Cellular DNA

    The New Paradigm

    Dr. Rocky Termanini

    Table of Contents

    Cover image

    Title page

    Copyright

    Dedication

    About the author

    Acknowledgments

    Prologue

    Chapter 1. Discovery of the book of life—DNA

    Initial thoughts

    DNA—the code of life

    DNA, the Columbus discovery

    The DNA pioneers

    DNA as an organic data castle

    Music is also one of the DNA's talents

    Appendix

    How was DNA first discovered and who discovered it? Read on to find out …

    Glossary (courtesy of MERIT CyberSecurity—archive)

    Chapter 2. The amazing human DNA system explained

    Magical DNA found in King Tut study

    Fighting chaos

    Anatomy of DNA

    The central dogma of genetics

    The genetic code

    Properties of the genetic code

    Another blessing of DNA, biological profiling

    The holy grail of DNA: gene editing

    The yogurt story

    How gene editing works

    Anatomy of CRISPR

    Social and ethical issues in DNA fingerprinting

    DNA's other dark side, the hacking nightmare

    DNA is a warm place for music

    DNA can hack computers

    Appendices

    Chapter 3. The miraculous anatomy of the digital immunity ecosystem

    Introduction

    What is the Smart Vaccine?

    Smart cities are like the human body

    What is a smart city?

    CEWPS is the intelligent smart shield of the smart city

    The 3D nanoattack scenario

    Anatomy of CEWPS and its intelligent components

    Anatomical composition of CEWPS (the digital immunity ecosystem)

    The critical infrastructures in smart cities

    What is a critical infrastructure?

    The smart grid model

    Appendices

    Chapter 4. Hacking DNA genes—the real nightmare

    A glimpse of the bright future

    Stuxnet is the devil's key to hell

    The DNA Stuxnet (DNAXNET)

    Criminals could alter their DNA to evade justice with new genetic editing tools

    MyHeritage website leakage

    Appendices

    Two attack strategies

    The crack in the door

    Damages

    Historical background

    Chapter 5. The digital universe with DNA—the magic of CRISPR

    For your own information

    What is our digital universe?

    How big is our digital universe?

    Why is the digital universe growing so fast?

    Data storage capacity is becoming asymptotic

    Dubai is the magical smart city with its Achilles' heel

    Dubai embracing DNA storage

    The hyper data center of the world

    Five types of data centers

    The top 10 hyperscale data centers

    How did DNA digital storage start?

    From DNA genetic code to DNA binary code

    Why ASCII was so important to DNA coding—A glimpse of history

    The Huffman compression rule

    Example: the gopher message

    Summary of the data storage in DNA

    The CRISPR magic

    How to hack DNA

    Anatomy of CRISPR—the smart cleaver

    Artificial intelligence–centric text editing

    Ethical concerns

    CRISPR is the Holy Grail of data deluge

    Appendices

    Chapter 6. Getting DNA storage on board: starting with data encoding

    Some mathematical ideas that we all need to know

    And DNA does that!

    Data nomenclature

    The random access method

    Other existing encoding methods

    DNA storage with random access

    The simulation method

    Comparison of reliability versus density

    DNA the Rosetta stone

    Silicon is getting scarce

    Artificial gene synthesis

    Amazon's flying warehouses

    Church's DNA storage

    DNA computing—the tables turned

    DNA is the new supercomputer

    Appendix 6.A Glossary of DNA encoding (from MERIT CyberSecurity library)

    Chapter 7. Synthesizing DNA-encoded data

    One trillionth of a gram!

    The DNA writer

    DNA Fountain software strategy

    Fountain software architecture

    A reliable and efficient DNA storage architecture

    DNA computing is around the corner

    The Adleman discovery

    Dr. Jian-Jun Shu discovery

    The BLAST algorithm software

    BLAST software architecture

    One word on FASTA software system

    The story of binary code

    Nondeterministic universal Turing machine

    Video and audio media features

    Digital video

    Binary to DNA code, revisited

    Mechanics of the buffer overflow

    The creative mind of the hacker

    The next generation of DNA hacking

    CRISPR, the gene editor

    Who is the DNA cyber hacker?

    Type 1: artificial intelligence–powered malware

    Type 2: nanopowered malware

    DNA hacking with nanorobots

    How could DNA attack a computer?

    Appendices

    Chapter 8. Sequencing DNA-encoded data

    The grandiose design of our digital universe in the 21st century

    Smart City ontology

    The bright sun of DNA is coming up

    How to retrieve your Illumina Solexa sequencing data

    What is the Illumina method of DNA sequencing?

    Illumina DNA sequencing operations?

    The disruptive industry of digital DNA sequencing

    From cell atlas project to DNA storage libraries

    Blockchained cannabis DNA

    The hidden second code in our DNA

    Get on the A-train for blockchain

    The sunrise

    Unseen sinkholes

    Blockchain's competition: Ethereum

    Disadvantages of blockchain

    Malware is hovering over DNA code

    Case 1: blockchained malware inside DNA

    Case 2: biohacking—malware hidden in DNA

    Case 3: DNA malware trafficking

    Appendices

    Chapter 9. Decoding back to binary

    Introduction

    Dynamic equilibrium

    Structure DNA

    The central dogma of genetics

    Key players in DNA synthesis/sequencing and storage

    Academic research

    Research consortium

    Industry

    Start-ups

    Iridia

    US Government

    Intelligence Advanced Research Projects Activity

    National Institutes of Health

    Foreign research

    Decoding DNA sequence back into binary

    Copying DNA sequences with polymerase chain reaction

    BioEdit software, sequence editing

    The next-generation sequencing technology

    Malware Technology

    DNA malware

    Blockchain malware

    Appendices

    Chapter 10. Fusing DNA with digital immunity ecosystem

    How did human immunity come about?

    Plague at the Siege of Caffa, 1346

    The plague of Athens, 430 BC

    DNA digital storage meets the Digital Immunity Ecosystem

    Anatomy of Digital Immunity Ecosystem and its intelligent components

    Anatomical composition of Digital Immunity Ecosystem

    The smart grid model

    Cryptology evolution of time

    DNA cryptology

    The encryption algorithm

    Malware going after DNA storage

    DNA computing applications

    DNA computer

    Case study: demographic and data storage growth of Dubai

    Dubai digital data forecast

    Appendices

    Chapter 11. DNA storage heading for smart city

    Introduction

    Why do we need molecular information storage?

    Smart city needs smart data

    The smart city will switch from hardware to bioware

    DNA potentialities

    Concluding revelation

    Appendices

    Chapter 12. DNA Data and Social Crime

    Sources of social crime

    Poverty as a pervasive social crime

    Key information stored in DNA data storage

    DNA can interpret the behavior of a mass killer

    Smart cities and eradication of cybercrime

    Final thought

    Some interesting numbers about DNA data storage

    Some interesting numbers about datacenter power consumption

    Justification for using DNA storage

    Appendices

    Chapter 13. DNA data and cybercrime and cyberterrorism

    Opening thoughts

    DNA is our binary Holy Grail of data storage

    Behavior of the cybercriminal

    Adding insult to injury

    Anatomy of cyberterrorism

    Cybercrime runs on steroids; antivirus technology runs on diesel

    Cybercrime data repositories

    The storage supply is killing the storage demand

    DNA is the holy grail of digital storage

    Back to smart city

    What is cyberterrorism?

    DNA is the holy grail of smart city

    Appendices

    Biochemistry-based information technology

    Chapter 14. DNA is a time storage machine for 10,000 years

    A special genre of time machine

    DNA time clock can predict when we will die

    The link between biological clock and mortality

    The telomere story

    Time travel is within reach

    The amazing storage phenomenon

    Storage evolution over time

    DNA storage random access retrieval

    Appendices

    Chapter 15. DNA and religion

    DNA and religion galaxies are intersecting

    We all ride our personal boat

    The blessing of bacteria

    Peace between science and religion—injecting scriptural DNA into the body

    The magic of CRISPR

    DNA and bacteria are allies

    The magic of CRISPR, the futuristic phenomenon

    Stepping into God's domain

    What is more important—DNA or religion?

    Personal and medical data used as DNA fingerprint

    DNA is human future diary, cannot be fooled

    Finally, CODIS was born

    Now religion speaks about DNA

    The story of evolution

    Concluding thoughts

    The last words of Einstein

    Appendices

    Advantages of knowing DNA

    Index

    Copyright

    Academic Press is an imprint of Elsevier

    125 London Wall, London EC2Y 5AS, United Kingdom

    525 B Street, Suite 1650, San Diego, CA 92101, United States

    50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

    The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

    Copyright © 2020 Elsevier Inc. All rights reserved.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    Library of Congress Cataloging-in-Publication Data

    A catalog record for this book is available from the Library of Congress

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library

    ISBN: 978-0-323-85222-7

    For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

    Publisher: Mara Conner

    Acquisitions Editor: Chris Katsaropoulos

    Editorial Project Manager: Ana Claudia A. Garcia

    Production Project Manager: Sruthi Satheesh

    Cover Designer: Victoria Pearson

    Typeset by TNQ Technologies

    Dedication

    I dedicate this book to the young generation of the Edisons, Turings, Wieners, Hawkings, Venters, Watsons, Cricks, Churches, Franklins, Doudnas, Charpentiers, Zhangs, Ehrlichs, Zielinskis, and the rest of the genetics pioneers, such as Illumina, Twist Bioscience, Sandia National Laboratories, and the University of Washington with Microsoft Research (Molecular Information Systems Lab), who are still in the lab working to enhance our living.

    About the author

    Dr. Rocky Termanini, CEO of MERIT CyberSecurity Group, is a subject matter expert in IT security, artificial intelligence, nanotechnology, machine and deep learning, and DNA digital storage. He is a member of the San Francisco Electronic Cyber Crime Task Force. He brings 46 years of cross-industry experience at national and international levels. He is the designer of the Cognitive Early-Warning Predictive System and The Smart Vaccine™, which replicates the human immune system to protect the critical infrastructures against future cyberwars. Dr. Termanini spent 5 years in the Middle East working as a security consultant in Saudi Arabia, Bahrain, and the UAE. Professor Termanini’s teaching experience spans over 30 years. He has taught Information Systems courses at Connecticut State University, Quinnipiac University, University of Bahrain, University College of Bahrain, and Abu Dhabi University and has lectured at Zayed University in Dubai.

    Acknowledgments

    DNA is a magic word, charged with many mysteries. DNA data storage is a field in its infancy, and it is rapidly growing into the new paradigm that will shift storage from magnetic to molecular. I have talked and listened to many people about this subject, absorbed a lot of new ideas, and learned about the dark corners of this technology, which are just as exciting as the recent discoveries about black holes. I salute these people, and I acknowledge their contributions to the world of DNA.

    Ms. Lina Termanini

    Senior Manager, Global Methods, Ernst & Young, San Francisco, CA

    Dr. Zafer Termanini

    Orthopedic Surgeon, Saint Lucie, Florida

    Dr. Sami Termanini

    General Medicine, Dublin, Ireland

    Ms. Mia Termanini Williams

    Business Consultant, Saint Lucie, Florida

    Radwan Termanini

    Senior Economist, Doha, Qatar

    Samir Termanini

    Attorney at Law, Computer Science Expert, Newark, New Jersey

    Dr. Eesa Bastaki

    President of University of Dubai, Dubai, UAE

    Dr. Bushra Al Blooshi

    Director of Research and Innovation, Dubai Electronic Security Center, Dubai, UAE

    Dr. Charles J. Bentz

    Fanno Creek Clinic, Portland OR

    Dr. George M. Church

    Professor of Genetics, DNA storage inventor, Harvard Medical School, Boston, MA

    Dr. Feng Zhang

    Biochemist CRISPR inventor, MIT, Boston, MA

    Dr. Abdul Rahim Sabouni

    President & CEO of the Emirates College of Technology (ECT), Abu Dhabi, UAE

    Dr. Hussain Al-Ahmad

    Dean of the College of Engineering at University of Dubai

    Dr. Sameera Almulla

    Board Member of the Emirates Science Club, Dubai, UAE

    Dr. Howard Zeiger

    Internal Medicine, John Muir Medical Group, Alamo, CA

    Dr. Matthew DeVane

    Physician, Cardiovascular Consultants Medical Group, San Ramon, CA

    Alison Ryan, PA-C

    Dr. Robert Robles

    Diablo Valley Oncology and Hematology, California Cancer and Research Institute, Pleasant Hill, CA

    Dr. Siripong Malasri

    Dean of Engineering, Christian Brothers University, Memphis, TN

    Dr. Judson Brandeis

    Brandeis MD Clinique, San Ramon, CA

    Colonel Khaled Nasser Alrazooki

    Ministry of Interior, Dubai, UAE

    Ms. Akila Kesavan

    Executive Director, Ernst & Young, San Francisco, CA

    Dr. Yigal Arens

    Director USC/Information Science Institute, Los Angeles, CA

    Ms. Kristina Nyzell

    CEO, Disruptive play Consulting, Malmo, Sweden

    Meshal Bin Hussain

    CIO, Ministry of Finance, Abu Dhabi, UAE

    John & Danielle Cosgrove

    Cosgrove Computer Systems, El Segundo, CA

    Ms. Amna Almadhoob

    Senior Security Researcher, AMX Middle East, Bahrain

    I would also like to thank the rest of my visionary and creative friends for their gracious assistance. Finally, I am indebted to, and gratefully thank, my family and all the people who own a part of this book. My special thanks and gratitude go to the Academic Press staff (Ana Claudia Abad Garcia, Sruthi Satheesh, Chris Katsaropoulos, and Swapna Praveen), who gave all of us the chance to enjoy reading the book.

    Dr. Rocky Termanini, CA 94595

    CRISPR pioneers (from left to right): George Church, Jennifer Doudna, Feng Zhang, and Emmanuelle Charpentier.

    My highest esteem goes to these four ambassadors of God's DNA for their contribution to humanity and for giving doomed patients hope for a healthy life. They advanced biomedicine 100 years into the future.

    Prologue

    The most remarkable property of the universe is that it has spawned creatures able to ask questions.

    —Stephen Hawking, Illustrated Theory of Everything: The Origin and Fate of the Universe

    The first ultra-intelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.

    —Irving J. Good, 1963

    Microsoft CEO Satya Nadella says underwater data centers will play a major role in expanding the firm's global cloud computing platform.

    DNA is much more than the sum of two words. Deoxyribonucleic acid is the biological schematic that dictates the shape of our cheekbones, and whether we sneeze and wheeze every spring when pollen pops. It is the miracle that reminds all of us of the moment when Jesus spoke to the man who could not walk. "I tell you," He said, "get up. Take your mat and go home." The man got up and took his mat. Then he walked away while everyone watched. All the people were amazed. They praised God and said, "We have never seen anything like this!" —Mark 2:6–12

    It all started when I read Dr. George Church's book Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves, which sparked a real obsession in me to learn more about how DNA works and how artificial intelligence (AI) and nanotechnology have pushed the envelope. The book is truly the most compelling bit of prophecy since the Old Testament first came out in hardback. I then read about the magic of CRISPR (clustered regularly interspaced short palindromic repeats) and how it ignited a revolution. Dr. Feng Zhang from MIT is the master of CRISPR, the formidable gene editing tool.

    In 2012, Dr. Jennifer Doudna and Dr. Emmanuelle Charpentier were the first to propose that CRISPR/Cas9 (enzymes from bacteria that control microbial immunity) could be used for programmable editing of genomes, which is now considered one of the most significant discoveries in the history of biology. It is mind-blowing to realize that in just 5 years, CRISPR could be used to eliminate mutated genes in a patient with muscular dystrophy (MD) and insert healthy genes into his DNA. Pretty soon, his muscles will be rejuvenated, and he will be able to walk! We can softly say that CRISPR is the hand of Jesus, resurrected.

    As I got deeper into my DNA research, I came to an amazing, mind-stretching revelation: DNA nucleotides (the building blocks A, C, T, and G) can be used not only to make the body's proteins but also to encode binary data from our business world. This is going to push aside the world of magnetic and silicon storage. The origin of this idea goes back to 1964, when Mikhail Neiman, a Soviet physicist, published his works in the journal Radiotechnika. Neiman extended his work to the possibility of recording, storing, and retrieving information on DNA molecules. He explained that he got the idea from an interview with Norbert Wiener, an American cyberneticist, mathematician, and philosopher, published in 1964.

    Innovation is like a wave: you cannot stop it, but you can ride it. Here is a story about innovation: The Polynesians were master navigators who traveled with neither compass nor sextant. They learned to read the pattern formed by waves. They observed that when waves hit an island, some are reflected back while others are deflected but continue on in a modified form. Each navigator used the motion of the canoe to feel the wave across the ocean.

    My research led me to the journal Science, which published a research paper by Dr. George Church and colleagues at Harvard University. In the paper, the authors described an experiment in which digital information, including an HTML draft of Dr. Church's 53,400-word book Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves, was encoded into a four-letter code based on the four DNA nucleotides (A, T, C, and G). The encoding reached a storage density of about 5.5 petabits (5.5 × 10¹⁵ bits) per mm³ of DNA! Dr. Church used a simple translation code in which binary bits were mapped one-to-one to DNA bases. The results showed that, in addition to its other functions, DNA can also serve as a storage medium, like hard drives and magnetic tapes.
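
    The one-bit-per-base idea described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the paper: the assignment of A or C to bit 0 and G or T to bit 1, the rule for choosing between the two candidate bases, and the function names `encode` and `decode` are all assumptions made for this example.

```python
# Assumed mapping for illustration: bit 0 -> A or C, bit 1 -> G or T.
ZERO_BASES = "AC"
ONE_BASES = "GT"


def encode(data: bytes) -> str:
    """Map every bit of `data` to one DNA base (one bit per base)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    strand = []
    for bit in bits:
        choices = ONE_BASES if bit == "1" else ZERO_BASES
        base = choices[0]
        # Use the alternative base when the previous base is the same,
        # so the strand never repeats a base twice in a row.
        if strand and strand[-1] == base:
            base = choices[1]
        strand.append(base)
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Recover the original bytes from a strand produced by `encode`."""
    bits = []
    for base in strand:
        if base in ONE_BASES:
            bits.append("1")
        elif base in ZERO_BASES:
            bits.append("0")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    bit_string = "".join(bits)
    return bytes(int(bit_string[i:i + 8], 2) for i in range(0, len(bit_string), 8))


if __name__ == "__main__":
    strand = encode(b"DNA")
    print(strand)          # 24 bases, one per bit
    print(decode(strand))  # b'DNA'
```

    Having two candidate bases per bit costs nothing in density here, and it lets the encoder avoid long runs of the same base, which is the design choice this sketch illustrates.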

    The March 2018 issue of MIT Technology Review magazine published a detailed article discussing the work of Dr. Andrew Phillips, head of the Biological Computation Group at Microsoft, who is currently developing methods and software for understanding and programming information processing in biological systems. The article described a new disruptive technology in which DNA could be the new habitat of binary data. My focus shifted to how we can fuse the Digital Immunity Ecosystem (DIE), which is a replication of the human immune system, with DNA binary data storage. I wrote to Dr. Bastaki, president of the University of Dubai, mentioned the exciting news about DNA as the new paradigm of data storage, and suggested we write a paper on how we can use DNA binary storage as the future storage facility. He replied: "Truly, this is an excellent work. As a matter of fact, DDS will be a critical component of the DIE, the futuristic molecular security system to protect the critical systems in the city." He encouraged me to write a paper highlighting the benefits of DNA storage technology, which will replace magnetic and silicon storage options. DNA will have all the advantages that come with storage capacity, integrity, and authentication. This is a marvelous project of innovation and creativity. Innovation rocks: it is a process and not an accident.

    So, I decided to write this book and highlight the storage issues of our present digital universe, which are manifested in exponential, out-of-control growth. Organizations worldwide, large and small, whose IT infrastructures transport, store, secure, and replicate these bits, have little choice but to employ ever more sophisticated techniques for information management, security, search, and storage. The journey of data storage began with bones, rocks, and paper. It then moved on to punched cards, magnetic tapes, gramophone records, floppies, and so forth. Afterward, as technology developed, optical discs (CDs, DVDs, and Blu-ray discs) and flash drives came into operation. All of these are subject to decay. Being nonbiodegradable, they pollute the environment, and they consume energy and release large amounts of heat during operation. The world keeps crunching data, and dancing to its music, while our digital universe is slipping into the sunset. The storage technology vendors are looking at the problem myopically, diversifying and relabeling their products. In the past year, signs of real change have become more visible, and recent news has brought to light alarming issues for every silicon user. The shortage of silicon supply is driving prices higher, and global demand is expected to continue. Many big elite storage companies (Pure Storage, Veritas, Western Digital, Seagate, NetApp, Hewlett Packard Enterprise, Hitachi Data Systems) and major suppliers remain concerned about their supply chain in the short term and the bottom-line impact in the future.

    Today, data have become the new oil, meaning that we no longer regard the information we store as merely a cost of doing business, but a valuable asset and a potential source of competitive advantage. It has become the fuel that powers advanced technologies such as machine learning (ML). To meet the demand of big data storage, companies such as Microsoft, IBM, Facebook, and Apple are looking beyond silicon for solutions. The next-generation data storage market is expected to be valued at $144.76 billion by 2024.

    The big escape to the cloud

    We are now in the middle of a storage war, creating confusion and havoc among leaders of the business world. The big computing companies are luring their strategic customers to migrate to their cloud. Leaders are actively refactoring legacy applications and developing cloud-native applications from the ground up. The sales cliché is: "Start with a cloud-native approach and build, modernize and migrate without being locked-in… Unlock the value of your data in new ways and accelerate your journey to AI…"

    Basically, cloud computing is a model or infrastructure that provisions resources dynamically and makes them available as services over the Internet. Cloud services include infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). There are three major types of cloud computing models: private clouds, public clouds, and hybrid clouds. The key criteria for evaluation include the underlying infrastructure, redundancies, connectivity, uptime, service-level agreement (SLA), and turnaround services. These six critical performance areas must be thoroughly examined by organizations before they fully jump into the cloud. The fundamental benefits of the "as a service" model are well known and include a shift from capital to operational expenditure (capex to opex), often leading to lower total cost of ownership (TCO); access for businesses of all sizes to up-to-date technology; maintenance by service providers that can leverage economies of scale; scalability according to business requirements; fast implementation times for new applications and business processes; and the freeing up of staff and resources for other projects and priorities.

    In 1963, Sam Wyly founded University Computing Company (UCC) as a data processing service bureau on the campus of Southern Methodist University in Dallas, Texas. It was the first pseudo cloud (prior to the Internet) and offered two software-as-a-service options: a tape management system and a job scheduler. The cloud gained popularity as companies gained a better understanding of its services and usefulness. In 2006, Amazon launched Amazon Web Services, which offers online services to other websites, or clients. The same year, Google joined the cloud with a spreadsheet and word processor. Then IBM, Microsoft, and Oracle jumped on the cloud bandwagon.

    Cloud hardware

    From the 1960s through the 1990s, up to the commercialization of the Internet, service bureaus relied on water-cooled, time-sharing legacy systems. Telecommunications companies primarily offered only dedicated, point-to-point data circuits to their users. Beginning in the 1990s, however, they began expanding their offerings to include virtual private network (VPN) services.

    Then the greed in embracing the dot-com market spread, climaxing in June 2000 with the burst of the bubble; half of the dot-com companies vanished. Layoffs of programmers resulted in a glut in the job market. Office equipment liquidation turned sour. University enrollment for computer-related degrees dropped noticeably. Anecdotes of unemployed programmers going back to school to become accountants or lawyers were common. Companies such as Amazon, Google, IBM, and Microsoft started to deploy the new cloud model, Everything as a Service (XaaS), supported by massive hyperscale data centers.

    Since the invention of the integrated circuit approximately 60 years ago, computer chip manufacturers have been able to pack more transistors onto a single piece of silicon every year. Moore's law worked fine for 40 years while the number of transistors was doubling every 24 months. The first chip had 10 transistors (10 microns); today the most complex silicon chips can hold a billion times more transistors (12 nanometers), and the cost of a chip went from millions to billions. Because of the severe limitations in packing, chip design, cost, power density, and clock speed all suffered. Tech companies threw in the towel on further shrinking line widths because of the diminishing returns. This also means that the future direction of innovation on silicon is no longer predictable. It is worth remembering that human brains have had 100 billion neurons for at least the last 35,000 years, yet we have learned to do a lot more with the same computing power. The same will hold true with semiconductors: we are going to figure out radically new ways to use those 10 billion transistors.
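
    The doubling arithmetic behind Moore's law can be checked with a short calculation. The sketch below is only an illustration: it assumes a clean doubling every 24 months, as stated above, and the function names growth_factor and years_for_factor are hypothetical helpers, not anything defined in this book.

```python
import math

# Assumption taken from the text: transistor counts double every 24 months.
DOUBLING_PERIOD_YEARS = 2


def growth_factor(years, period=DOUBLING_PERIOD_YEARS):
    """Multiplicative growth after `years` of doubling once per `period`."""
    return 2 ** (years / period)


def years_for_factor(factor, period=DOUBLING_PERIOD_YEARS):
    """Years of steady doubling needed to grow by `factor`."""
    return period * math.log2(factor)


forty_year_growth = growth_factor(40)   # 20 doublings
years_to_billion = years_for_factor(1e9)

print(f"Growth over 40 years: {forty_year_growth:,.0f}x")
print(f"Years needed for a billion-fold increase: {years_to_billion:.1f}")
```

    Under this assumption, 40 years of doubling gives about a million-fold increase, and a billion-fold increase takes just under 60 years, which is roughly the age of the integrated circuit.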

    Why is silicon important?

    Being aware of these painful constraints, the cloud data center changed direction and decided to move into the all-flash domain. An all-flash array (AFA) is a storage infrastructure that contains only flash memory drives instead of spinning-disk drives. All-flash storage is also referred to as a solid-state array (SSA). AFAs and SSAs offer speed, high performance, and agility for your business. Although modern data centers are looking to AFAs as a solution to performance and capacity demands, not every AFA is created equal. It is important to understand the difference between purpose-built arrays and retrofit arrays. Retrofits attempt to combine all-flash with 20-year-old disk-based architectures, preventing customers from getting the best return on investment and exposing shortcomings in performance, reliability, and simplicity.

    The silicon addiction

    We love our electronics, and most of us buy a new phone, computer, or laptop every year or two. With this, we expect to buy a faster, more intelligent device. The microchips inside our electronics are currently made up of silicon, an abundant material found in sand. The same silicon chip is the prime component in all our computing devices, from our laptop all the way to AFA or SSA, hard drive controllers, and memory and processing units that make up our digital universe. Hyperscale data centers, both conventional and cloud-centric, will realize sometime soon that silicon-based chips will no longer be able to provide devices with the extra speed and functionality that buyers demand. It will be a frightening—a fourth of July—shock to the business world to learn that DNA, the genetic material inside every human cell, is a leading contender to fill silicon's shoes. All the public, private, hyper, Kubernetes, and containers cloud providers will have a Himalayan storage problem that looms on the horizon of 2022.

    One of the remarkable ironies of digital technology is that every step forward creates new challenges for storing and managing data. In the analog world, a piece of paper or a photograph never becomes obsolete, but it deteriorates and eventually disintegrates. In the digital world, bits and bytes theoretically last forever, but the underlying media (floppy disks, tapes, and drive formats, as well as the codecs used to play audio and video files) become obsolete, usually within a few decades. Once the machine or medium is outdated, it is difficult, if not impossible, to access, retrieve, or view the file. Yaniv Erlich, assistant professor of computer science at Columbia University and a core member of the New York Genome Center, observes, "Digital obsolescence is a very real problem. There is a constant need to migrate to new technologies that do not always support the old technologies."

    DNA—the holy grail

    We can all respectfully consider the wheel the most important invention by humankind: while most other inventions have been derived from nature itself, the wheel is 100% a product of human imagination. Now here is a natural invention that will change the computing world. It is hard to imagine that 10 trillion DNA molecules could fit into the size of a marble. The other bombshell is that all these molecules can process data simultaneously. In other words, we can perform 10 trillion calculations at the same time in a small space! Just imagine: in 0.04 ounce (1 g) of artificial DNA, we can store the data of some 3,000,000 CDs. In an article in the August 2016 issue of The Scientist (https://www.the-scientist.com/magazine/issue/human-evolution-30-8), Katherine S. Pollard said, "A mere milligram of the molecule could encode the complete text of every book in the Library of Congress and have plenty of room to spare." In 2014, researchers published a study in the journal Supercomputing Frontiers and Innovations estimating the storage capacity of the Internet at 10²⁴ bytes, which, encoded in DNA, could fit into a glass cookie jar. If the sky is the limit, DNA comes close to it!

    DNA data storage and computing are truly disruptive technologies that stand ahead of all technical inventions combined. This book is moderately technical, and I followed the style Ray Kurzweil used in writing his great book, The Singularity Is Near. This book is easy to read, but it covers this new paradigm, which will give informatics a new dimension. I imagine that in a couple of decades the business world will have a universe of Internets, and computing will go down to the molecular level, driven by AI smart nanobots that will boost our intelligence and conquer many chronic diseases. Hopefully, man and machine will coexist in harmony.

    During the writing of this book, I modestly admit that this work is the collective effort of many meetings with physicians, geneticists, computer engineers, academicians, biologists, law enforcement agents, and devout religious and agnostic partisans. The algebraic sum of all this was enigmatic yet enlightening.

    I started out by talking briefly about the human immune system, which is an incredible autonomic ensemble of smart components and arteries that interoperate with perfect orchestration. We will use the human immune system as the reference architecture model for our digital immunity. The Digital Immunity Ecosystem (DIE) is actually a replication of the human immune system. It is built with futuristic and disruptive architecture that represents the next generation of cyberdefense—not just security—for smart cities. This book is the intersection of AI, nanotechnology, and ML. I describe the expansion of our digital universe with its gluttony of data storage. I introduce DNA synthesis and sequencing systems to capture and store binary data. I present the model of fusing digi-informatics with DNA informatics to create a holistic smart ecosystem. I cover in detail DNA hacking and cryptography. With every chapter, I include testimonies and references to support my arguments. The combined systems into one homogenous platform rely heavily on DNA computing algorithms, molecular AI, and nanobots energized with ML for predictive analytics. In the end, I cover—with interesting scientific discussions—three important topics that are pathologically tied to DNA: crime, time, and religion.

    I am confident that this book will be more interesting to computing people than to medical specialists. A disruptive technology such as DNA data storage (DDS) is one that displaces an established technology and shakes up the industry, similar to a groundbreaking product that creates a completely new industry. DDS will start with baby steps and will eventually make sense to the residents of the digital universe. I give credit to and salute all the pioneers who developed this marvelous technology and connected its dots. I hope the leaders of the computing major leagues start investing more time, money, and effort in DDS instead of building more hyperscale data centers.

    Dr. Rocky Termanini, Walnut Creek, CA 94595

    Chapter 1: Discovery of the book of life—DNA

    Abstract

    In the beginning, there was water, and then, DNA was born. The discovery of fire was necessary to our survival; the Internet has accelerated our information ubiquity and spread knowledge at a global level. DNA was a big league by itself. DNA gave us a knife to alter our genes and heal us from bedeviled diseases. Now, DNA is offering us a new direction in storing real data for thousands of years. The book gives us a detailed portraiture on how we can preserve our civilization for 10,000 years. It is the tip of the history iceberg.

    Keywords

    Binary code; Digital universe; DNA code; Genetics; Helix; Nucleotide

    When looking for a needle in a haystack, the optimist wears gloves.

    The Little Book of Things to Keep in Mind.

    DNA is like a computer program but far, far more advanced than any software ever created.

    Bill Gates, The Road Ahead.

    If at first an idea does not sound absurd, then there is no hope for it.

    Albert Einstein.

    DNA is not predestined. Genes are reprogrammable.

    Science Magazine (March 21, 2019).

    Today we are at the point in science and technology where we humans can reduplicate and then improve what nature has already accomplished. We too can turn the inorganic into the organic. We too can read and interpret genomes—as well as modify them. And we too can create genetic diversity, adding to the considerable sum of it that nature has already produced …

    Dr. George Church, from Regenesis.

    The nitrogen in our DNA, the calcium in our teeth, the iron in our blood, the carbon in our apple pies were made in the interiors of collapsing stars. We are made of starstuff.

    Carl Sagan.

    Hmmm … If the DNA of one human cell is stretched out, it would be almost 6 feet long and contain over three billion base pairs. How does all this fit into the nucleus of one cell?

    Rosalind Franklin: DNA from the Beginning.

    Initial thoughts

    Imagine a world where you could go outside, take a leaf from a tree, put it through your personal DNA sequencer, and get data such as music, videos, or computer programs from it. Well, this is all possible now. It has not been done on a large scale because it is still quite expensive to create DNA strands, but it is possible.

    DNA—the code of life

    No car can cross the African Sahara without spare tires and ample fuel. The same analogy applies to DNA data storage. No organization, small or large, can successfully compete without deep data lakes and scalable data warehouses to accommodate machine learning and predictive analytics. The idea of storing digital data in the DNA of a living cell is truly revolutionary and mind blowing. It is the optimal option since conventional storage is asymptotically going to hit the wall soon. We can encode images and movies into synthetic DNA, which acts as a molecular recorder and collects data over time.

    Imagine riding on a time machine for 400 years into the future and looking back at movies and images from 2018. It would be refreshingly electrifying. If DNA storage had existed some 3000 years ago, we would have been able to witness how Moses crossed the Red Sea and the horrific crucifixion of Jesus. Both events would create an unforgettable posttraumatic mental disorder. Converting binary information to DNA code is a great historical Rosetta Stone that will carry our digital universe into the future. This is an amazing mind-stretching dream.

    Even DNA has a special day, called DNA Day! So, what is so hypnotic about DNA? It is not an exaggeration to say that DNA holds discrete information about our birth and death. Scientists are working every single day to learn more about mutations and mechanisms that could confirm or refute DNA mysteries and provide access to important DNA data.

    Modern science and medicine have taught us that genetics dictates every portion of our lives in the sense of health and what we can expect in wellness or disease. It seems reasonable to say that if medicine cannot identify the cause and cure of an illness, then genetics will have an answer for it. Genetics' original theory was reasonable because people age on a daily basis, and no method for stopping the aging clock had been found. It must be that each cell was permanently stamped by time and could never be reverted or made younger. Aging, and every other factor related to the body, was a predestined program unfolding into its unknown final stop as time passed. Dr. Ian Wilmut was the geneticist who changed that line of thinking. Dr. Wilmut's research finally proved that the human cell, the basis of all life, was not predestined to live or die by the secret code of the genes. Cells were not permanently stamped in time. In fact, this new research proved that all the genetic information encoded as your inheritance can be reprogrammed and enhanced to extend longevity and eliminate the human misery from disease. Fig. 1.1 illustrates the magic of our genetic code.

    Deoxyribonucleic acid, known as DNA, is a bewildering engineering marvel. It is a mechanism that has been at work on the earth for billions of years: storing information in the form of DNA strands. This fundamental process is what has allowed all living species, plants and animals alike, to live on from generation to generation. We would not be here on the earth without this magic molecule.

    The information in your genes is presently programmable; this is no longer a fear. Research has now demonstrated that your destiny is not predefined: It is set by the choices you make every day.

    Even more real than the fact that your genes are programmable in the present is the fact that the way the information in your genes is interpreted is changeable on a daily basis.

    Discovery is an exquisite drive to explore and learn about the unknown. It is true that Archimedes' Eureka was also a very emotional shout. Scientific discoveries may seem like sudden breakthroughs, but new findings do not come out of nowhere. Each breakthrough is made possible by the work that came before it. Some scientific discoveries are a bit like putting together the pieces of a puzzle. Many different researchers discover important bits of evidence—pieces of the puzzle—and the sudden breakthrough arises when one person sees how the puzzle pieces logically fit together. In the case of DNA, new findings and technological advances have made so many new puzzle pieces available that the odds of someone putting them together seem quite high. Making this final leap often involves a brilliant insight—but it is important to recognize all the clues, which made that insight possible.

    Figure 1.1 The normal text code is translated into ASCII binary code, and then each combination of 01, 00, 10, 11 is converted into DNA code.
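
    The conversion described in the Figure 1.1 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the caption does not say which base stands for which bit pair, so the mapping used here (00 to A, 01 to C, 10 to G, 11 to T) is an assumed one, and the names text_to_dna and dna_to_text are hypothetical.

```python
# Assumed 2-bit-to-base mapping; the figure caption does not specify one.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def text_to_dna(text):
    """Encode ASCII text as a DNA string, two bits per base."""
    # Step 1: text -> ASCII bytes -> one long string of 0s and 1s.
    bits = "".join(f"{byte:08b}" for byte in text.encode("ascii"))
    # Step 2: each pair of bits becomes one base.
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_text(dna):
    """Decode a DNA string produced by text_to_dna back into text."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


encoded = text_to_dna("DNA")
print(encoded)
print(dna_to_text(encoded))
```

    Under this assumed mapping, the three letters of "DNA" become the twelve bases CACACATGCAAC, four bases per ASCII character. Practical schemes typically add error correction and avoid long runs of a single base, which this sketch does not attempt.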

    If we compare the human body to a building, the body's complete plan and project down to its minute technical detail is present in DNA, which is in the nucleus of each cell. All the developmental phases of a human being in the mother's womb and after birth take place within the outlines of a predetermined program.

    We are made up of many, many cells that we cannot see, and each cell has a job. Some clusters of cells make up our muscles, some make up our bones, and all together, they make our bodies! But how does each cell know what to do? That is where DNA comes in. It tells the cells what to do.

    We are all made of trillions of cells. There are around 2.5 billion cells in one of your hands, but they are tiny—so tiny that we cannot see them. If every cell in your hand was the size of a grain of sand, your hand would be the size of a school bus!

    Each cell has its own job, just like humans do. Some cells help us detect light and see, other cells help us touch, some help us hear, others carry oxygen around, and others help us digest food by secreting enzymes. There are over 200 cell types in the body—that is 200 different jobs!

    DNA, the Columbus discovery

    The discovery of DNA took a lot of patience and sweat. It happened in increments stretching from the 19th century through the 1940s and 1950s, until today. We cannot give credit to one scientist; the work continued from different angles and in stages. The voyage to unpack DNA's fundamental secrets began in 1865 in a monastery where a humble monk named Gregor Mendel had a holy moment when he observed how pea plants inherited their parents' traits. Mendel did not know that what was being transferred was deoxyribonucleic acid, or DNA.

    In 1998, after a series of experiments, scientists revealed that gene expression is controlled by a phenomenon called RNA interference. This process defends against viruses that try to insert themselves into DNA and control gene expression. This discovery was awarded a Nobel Prize in 2006 and has directly led to research on silencing genes that cause problems for the body, such as the gene causing high blood pressure.

    Before cells divide, they must double their DNA so that each cell gets identical copies of the DNA strands. DNA replication helps assure that the bases are copied correctly. Enzymes carry out the process. So DNA is actually a biological USB drive, a carrier of genetic instructions for the development and life of an organism to another organism.

    So one exciting discovery led to another astonishing revelation until the human genome was decoded. The human genome, the entirety of our DNA, was described by former president Bill Clinton as "without a doubt … the most important, most wondrous map ever produced by humankind."

    Scientists dutifully continue to follow the map and step into uncharted territory, working devotedly to connect the dots. New connections between genes and diseases (yesterday Alzheimer's, today cancer, tomorrow perhaps depression) are being brought to light, and the search for knowledge keeps marching on. It is no theoretical endeavor.

    The DNA pioneers

    Let us, for a moment, appreciate the excruciating sweat work of these dedicated DNA explorers crumbling the mystery of DNA, the blueprint of life (see Fig. 1.2).

    1865—the Czechoslovakian monk Gregor Mendel, called the father of genetics, discovered how pea plants inherited parents' traits. The law of inheritance was named after him.

    1869—Friedrich Miescher, the daddy of DNA, discovered that every cell carries the same DNA goo, as hereditary information.

    1910—Thomas Hunt Morgan, with his fruit fly experiment, proved that chromosomes are sex specific (XX for females, XY for males).

    1928—Frederick Griffith discovered genetic transformation, which underpins much of the genetic engineering being done today.

    And then the big revelation came up when the digital universe merged with the biomedicine universe: DNA emerged as the prospective medium for data storage with its striking features.

    1931—Barbara McClintock, Nobel Prize Laureate of 1983, known as the foremost cytogeneticist in the world, is best known for her research and resulting theories on jumping genes.

    1944—Oswald Avery discovered that DNA carries a cell's genetic material and can be altered through transformation, which consists of chemical transformation, electroporation, or particle bombardment.

    1953—Rosalind Franklin was a British chemist best known for her role in the discovery of the structure of DNA.

    1953—James Watson and Francis Crick contributed to the discovery of DNA structure and the spiral ladder, also known as the double helix.

    Figure 1.2 The chronology of the luminaries and their great vision to help humanity. The adventure is just beginning!.

    1966—Marshall Nirenberg, a Physiology Nobel Laureate, was able to break the genetic code and describe how it operates in protein synthesis.

    1973—Stanley Cohen pioneered in cut-and-paste DNA, the first step in reengineering an organism's DNA.

    1975—Experiments have shown that every cell contains an organism's entire genetic manual, its genome.

    1977—Frederick Sanger helped geneticists read the DNA manual. DNA's four nucleotides repeat millions, even billions, of times within a genome.

    1987—Francis Collins, director of the National Institute of Health, is a physician-geneticist who discovered the genes associated with a number of diseases and led the Human Genome Project.

    1988—DNA was introduced into crime forensics. All states in the United States have DNA banks of criminals.

    1996—DNA cloning of mammals opened up new intriguing possibilities.

    2003—J. Craig Venter and the Institute of Health are competing for the Human Genome Project.

    2010—A new science emerged with the name epigenetics, which is the study of how genes are influenced by outside forces, like the environment of lifestyle.

    DNA as an organic data castle

    While geneticists are still on a voyage to unpack the fundamental secrets of DNA, new opportunities have emerged and ushered in a bright new future in data informatics. A systematic process is being formulated to store digital data into DNA. This is a great discovery of biblical proportion. We are fast approaching a new era of the Data Age. We can assign the term data universe to the total data created, captured, processed, stored, transferred, and replicated on our planet for the past 60 years.

    The science behind storing data in DNA has been proven. Researchers have demonstrated that DNA is a scalable, random access, and error-free data storage system. DNA also remains stable for thousands of years and offers utility in long-term data storage. Advancements in next-generation sequencing have enabled rapid and error-free readout of data stored in DNA. As the data storage crisis worsens in the coming years, as shown in Fig. 1.3, DNA will be used to store vast amounts of data in a highly dense medium.

    Data created by the Internet alone amount to 90% of the total sum. One of the primary reasons is that we have a gargantuan affinity for knowledge generated from data. We are going to have a serious problem of austerity (I call it a data stampede) where good data will devour good data! IDC forecasts that by 2025 the data universe will have its big bang when data grow to 163 zettabytes (ZB, that is, 163 × 10²¹ bytes). That is 10 times the 16.1 ZB of data generated in 2016! Magnetic media will reach their asymptotic scalability soon; that is the time we praise our Holy Grail, DNA!

    One of the reasons why DNA is considered a better storage system is that 215 petabytes (215 million gigabytes) can be stored in just 1 g of DNA. DNA is apocalypse-proof because, even after global disasters, one thing that we can always preserve and store is DNA. In 1965, Mikhail Samoilovich Neiman, a Russian physicist, was the first to propose storing information in, and retrieving it from, DNA molecules. This technology was known as MNeimON (Mikhail Neiman Oligonucleotides). Fig. 1.4 is an example of how a flash USB stick can be used to store 44 × 10²¹ bytes of data in 1 g of DNA.
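
    The figures quoted in this section can be combined in a quick back-of-the-envelope calculation. The sketch below assumes decimal units (1 PB = 10¹⁵ bytes, 1 ZB = 10²¹ bytes) and takes the 215 petabytes per gram density and the IDC numbers at face value; all variable names are illustrative only.

```python
# Decimal (SI) units are assumed throughout.
GIGABYTE = 10**9
PETABYTE = 10**15
ZETTABYTE = 10**21

# Figures quoted in the text.
density_bytes_per_gram = 215 * PETABYTE   # 215 PB in 1 g of DNA
forecast_2025_bytes = 163 * ZETTABYTE     # IDC forecast for 2025
generated_2016_bytes = 161 * 10**20       # 16.1 ZB generated in 2016

# 215 PB expressed in gigabytes (the text says "215 million gigabytes").
density_in_gigabytes = density_bytes_per_gram // GIGABYTE

# Growth of the data universe between 2016 and 2025.
growth_ratio = forecast_2025_bytes / generated_2016_bytes

# Grams of DNA needed to hold the whole 2025 forecast at the quoted density.
grams_needed = forecast_2025_bytes / density_bytes_per_gram

print(f"215 PB = {density_in_gigabytes:,} GB")
print(f"2025 forecast is {growth_ratio:.1f}x the 2016 volume")
print(f"DNA needed for 163 ZB: {grams_needed / 1000:,.0f} kg")
```

    At the quoted density, the entire 163 ZB forecast would occupy roughly 758 kg of DNA, which gives a sense of scale for the comparison with magnetic media.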

    Figure 1.3 A picture is worth 1000 words. The solid black line shows what we have today as our daily consumption of data from workplace and home. The thinner black lines represent the information that we store, and we can see that we cannot store everything we want, because we are going to run out of magnetic storage. The dashed line represents the DNA-coded storage, which shows that we have plenty of room to store all our information for the next 1000 years, without running out of room.

    Figure 1.4 The magic DNA flash USB, which is capable of storing 44 × 10²¹ bytes of data in 1 g of DNA. It would take 2.6 × 10⁹ hard drive devices, 227 × 10⁶ magnetic tapes, or 3 × 10⁶ Blu-ray discs to store the same amount of data.

    Music is also one of the DNA's talents

    Here is a scenario that will surprise you: Assume that you could store the whole library of the famous Arabic singer Feirouz, or Kathem Al-Saher, in 30 g of live DNA. Then, 500 years from now (even 3000 years from now), Arabic music fans would be overwhelmed with the beauty of these old songs. One of the most intriguing methods of storing music in DNA is called music of the spheres. The method uses bioinformatics technology, developed by Dr. Nick Goldman and Charlotte Jarvis. They took music from the Kreutzer Quartet, the recording of which has been encoded into DNA, and stored it as digital information in synthetic DNA molecules. Goldman and Jarvis suspended the DNA in a soap solution. The soap bubbles would fill the air, pop on visitors' skin, and bathe the room in music. It might sound far-fetched, but the technology to encode music in synthetic DNA was developed by Goldman and his team a few years ago, imitating the binary method computers used to store information digitally and swapping the 0s and 1s for the base chemicals that form DNA sequences. Music of the spheres follows on from a similar project undertaken by Jarvis a few years ago when she encoded simple sentences in the DNA of bacteria.

    Appendix

    Dr. Rosalind Franklin (history was unfair for her).

    Rosalind Franklin was a British chemist best known for her role in the discovery of the structure of DNA. This amazing woman also pioneered the use of X-ray diffraction. She overcame personal and societal strife to make one of the greatest discoveries in science. Today, July 25th would have marked her 97th birthday. And so, it seems only fitting to honor her life and contribution to science.

    Rosalind Franklin made a crucial contribution to the discovery of the double helix structure of DNA, but some would say she got a raw deal. Biographer Brenda Maddox called her the Dark Lady of DNA, based on a once-disparaging reference to Franklin by one of her coworkers. Unfortunately, this negative appellation undermined the positive impact of her discovery. Indeed, Franklin is in the shadows of science history, for while her work on DNA was crucial to the discovery of its structure, her contribution to that landmark discovery is little known.

    Franklin was born on July 25, 1920, in London, to a wealthy Jewish family who valued education and public service. At age 18 years, she enrolled in Newnham Women's College at Cambridge University, where she studied physics and chemistry. After Cambridge, she went to work for the British Coal Utilization Research Association, where her work on the porosity of coal became her Ph.D. thesis, and later it would allow her to travel the world as a guest speaker.

    In 1946, Franklin moved to Paris, where she perfected her skills in X-ray crystallography, which would become her life's work. Although she loved the freedom and lifestyle of Paris, she returned after 4 years to London to accept a job at King's College.

    Franklin worked hard and played hard. She was an intrepid traveler and avid hiker with a great love of the outdoors. She enjoyed spirited discussions of science and politics. Friends and close colleagues considered Franklin a brilliant scientist and a kindhearted woman. However, she could also be short-tempered and stubborn, and some fellow scientists found working with her to be a challenge. Among them was Maurice Wilkins, the man she was to work alongside at King's College.

    A misunderstanding resulted in immediate friction between Wilkins and Franklin, and their clashing personalities served to deepen the divide. The two were to work together on finding the structure of DNA, but their conflicts led to them working in relative isolation. While this suited Franklin, Wilkins went looking for company at the Cavendish laboratory in Cambridge, where his friend Francis Crick was working with James Watson on building a model of the DNA molecule. Unknown to Franklin, Watson and Crick saw some of her unpublished data, including the beautiful photo 51, shown to Watson by Wilkins. This X-ray diffraction picture of a DNA molecule was Watson's inspiration (the pattern was clearly a helix). Using Franklin's photograph and their own data, Watson and Crick created their famous DNA model. Franklin's contribution was not acknowledged, but after her death, Crick said that her contribution had been critical.

    Franklin moved to Birkbeck College where, ironically, she began working on the structure of the tobacco mosaic virus, building on research that Watson had done before his work on DNA. During the next few years, she did some of the best and most important work of her life, and she traveled the world talking about coal and virus structure. However, just as her career was peaking, it was cut tragically short when she died of ovarian cancer at age 37 years.

    How was DNA first discovered and who discovered it? Read on to find out …

    The story of the discovery of DNA begins in the 1800s. The molecule now known as DNA was first identified in the 1860s by a Swiss chemist named Johann Friedrich Miescher. Miescher set out to research the key components of white blood cells, part of our body's immune system. The main source of these cells was pus-coated bandages collected from a nearby medical clinic.

    Miescher carried out experiments using salt solutions to understand more about what makes up white blood cells. He noticed that, when he added acid to a solution of the cells, a substance separated from the solution. This substance then dissolved again when an alkali was added. When investigating this substance, he realized that it had unexpected properties different from those of the proteins he was familiar with. Miescher called this mysterious substance nuclein, because he believed it had come from the cell nucleus. Unbeknownst to him, Miescher had discovered the molecular basis of all life: DNA. He then set about finding ways to extract it in its pure form.

    Miescher was convinced of the importance of nuclein and came very close to uncovering its elusive role, despite the simple tools and methods available to him. However, he lacked the skills to communicate and promote what he had found to the wider scientific community. Ever the perfectionist, he hesitated for long periods of time between experiments before he published his results in 1874. Before then, he primarily discussed his findings in private letters to friends. As a result, it was many decades before Miescher's discovery was fully appreciated by the scientific community.

    Glossary (courtesy of MERIT CyberSecurity—archive)

    Allele: One of several possible versions of a gene. Each one contains a distinct variation in its DNA sequence. For example, a deleterious allele is a form of a gene that leads to disease.

    Amino acid: The chemical building block of proteins. During translation, different amino acids are strung together to form a chain that folds into a protein.

    Archaea: Microbes that look similar to bacteria but are actually more closely related to eukaryotes, such as humans. Archaea are single-celled organisms that do not have a nucleus and can only be seen with a microscope. They are found in many different habitats, and many of the first known examples were found in extreme environments.

    Bacteria: An abundant type of microbe. These single-celled organisms are invisible to the naked eye, do not have a nucleus, and can have many shapes. They are found in all types of environments, from Arctic soil to inside the human body. Most bacteria are not harmful to human health, but certain pathogenic bacteria can cause illness.

    Base: The four letters of the genetic code (A, C, T, and G) are chemical groups called bases or nucleobases. A = adenine, C = cytosine, T = thymine, and G = guanine. Instead of thymine, RNA contains a base called uracil (U).

    Base pair: Different chemicals known as bases or nucleobases are found on each strand of DNA. Each base has a chemical attraction for a particular partner base, known as its complement. C matches up with G, whereas A pairs with T or U. These bonded genetic letters are called base pairs. Two strands of DNA can zip together to form a double helix shape when complementary bases match up to form base pairs.

    Cancer: A type of disease caused by uncontrolled growth of cells. Cancerous cells may form clumps or masses known as tumors and can spread to other parts of the body through a process known as metastasis.

    Cas: Abbreviation of CRISPR-associated; may refer to genes (cas) or proteins (Cas) that protect bacteria and archaea from viral infection.

    Cas9: A protein derived from the CRISPR-Cas bacterial immune system that has been coopted for genome engineering. Uses an RNA molecule as a guide to find a complementary DNA sequence. Once the target DNA is identified, Cas9 cuts both strands. It has been compared with molecular scissors or
