Академический Документы
Профессиональный Документы
Культура Документы
of Nucleic Acids
Revised edition
C.F.A. Bryce* and D. Pacini
5
4
3 2
5
4
3 2
5
4
3 2
5
4
3 2
From thi s fi gure i t can be seen that the phosphodi ester bond i s
formed between the 5carbon of one sugar molecule and the 3 carbon of
the next. The doubl e-stranded nature of DNA i s shown schemati cal l y
above.
Si nce i t woul d be very ti me consumi ng to have to wri te out such a
structure every ti me i t was needed, abbrevi ated forms have been
introduced. These are essentially of three types:
I f the P i s on the l eft of a base then i t i s attached to the 5-carbon
(five); if it is on the right then it is attached to the 3-carbon (three).
Nucleic acids:structure and function 11
Sugar
Phosphate
group
Phosphodiester
bond
A T
C G
G C
T A
ApGpCpGpUp
5
AGCGU
3
DNA isolation,base composition and structure
Isolation
I n order to study the chemistry of DNA it must be isolated and purified
from the nucleus of a cell. I n the isolation of DNA from cell samples, the
major contami nant i s histone proteinwhi ch i s cl osel y associ ated wi th
DNA to form visible chromatinthreads in the nucleus. To remove this
contaminant the cell is first deproteinized by treatment with phenol and
the DNA is precipitated using ice-cold ethanol. A more detailed account
of the i sol ati on of DNA i s gi ven i n one of the practi cal schedul es l ater
(see Chapter 8) and in BASC Booklet 7, Recombinant DNA Technology.
Once isolated, DNA can be broken down or degraded by either: (i)
treatment wi th perchloric acidat 100C, whi ch breaks down the DNA
to i ts consti tuent bases; or (i i ) treatment wi th enzymes(DNase, phos-
phodi esterase, etc.) whi ch break down the DN A to i ts consti tuent
nucleotides.
The mixture of either of these products can be separated by chroma-
tography or el ectrophoresi s and the amount of each of the consti tuent
bases can be measured.
Composition
Usi ng the techni ques outl i ned above i t was noted by Chargaffand co-
workers in the late 1940s that:
(i ) the base composi ti on of an organi sms DNA i s characteri sti c of that
organism;
(i i ) di fferent cel l s or ti ssues of the same organi sm have i denti cal base
composition;
(i i i ) cl osel y rel ated organi sms exhi bi t si mi l ar base composi ti ons (thi s
property is used in biology as a basis for chemical taxonomy)
(iv) the majority of DNA molecules exhibit certain chemical regularities
that are defi ned as Chargaffs base pairingrul es (base equivalence),
namel y the amount of A equal s the amount of T and the amount of G
equal s the amount of C, as can be seen (wi thi n the bounds of experi -
mental error) from the data shown in Table 2.
I n consi deri ng the data shown i n Tabl e 2, i n terms of base equi v-
al ence i t i s necessary to combi ne the val ues for cytosi ne and 5-methyl
cytosine where these apply. The only exception to base equivalence is the
12 Isolation and structure of nucleic acids
DNA from the virus X174. Can you think of an explanation? (See the
Structure section in this chapter for the solution.)
Structure
I n 1947 Astbury studied DNA using the technique of X-ray diffraction,
shown schematically below.
When the beam of X-rays collides with an electron-dense region in
the DN A, the beam i s di ffracted. From the pattern of the di ffracted
beam i t i s possi bl e to determi ne the arrangement of atoms i n the DNA.
I n this way, Astbury was able to deduce that there was a regular feature
occurring within the DNA molecule every 3.34 . Such a regular feature
is referred to as a crystallographic repeator identity period.
Nucleic acids:structure and function 13
X-ray
source
Diffracted
beams
Photographic
film
DNA
crystal
Table 2.The percentage composition of bases in the DNA from a
variety of organisms
Source of DNA Adenine Guanine Cytosine Thymine 5-Methyl cytosine
Cattle sperm 28.7 22.2 20.7 27.2 1.3
Rat bone marrow 28.6 21.4 20.4 28.4 1.1
Sea urchin 32.8 17.7 17.3 32.1 1.1
Wheat germ 27.3 22.7 16.8 27.1 6.0
Yeast 31.3 18.7 18.1 31.9
Human sperm 30.9 19.9 19.8 29.4
Escherichia coli 26.0 24.9 23.9 25.2
X174 Phage virus 24.3 24.5 18.2 32.3
Later work by Frankl i n, Gosl i n, Wi l ki ns and co-workers (1953)
demonstrated that there were two forms of DN A, a crystal l i ne form
known as the A formand a paracrystalline form known as the B form.
Based on the X-ray di ffracti on work and Chargaffs base equi v-
alence rules, Watson and Crick in 1953 deduced the structure of DNA as
a double helix. Thi s compri sed two strands of sugarphosphate
backbone with the bases positioned in between, with adenine (A) on one
strand bei ng hydrogen (or H-) bondedto a thymi ne (T) on the other,
and a guani ne (G) on one strand bei ng H-bonded to a cytosi ne (C) on
the other. The H -bondi ng arrangement of these base pai rs i s shown
below.
A si ngl e-ri nged pyri mi di ne base opposi te a doubl e-ri nged puri ne
base woul d ensure equal di stance between the two pol y nucl eoti de
strands but why AT and GC speci fi cal l y ? The answer l i es i n the
number of H-bonds that can be formed between the bases as shown (see
also BASC Booklet 7, Recombinant DNA Technology).
14 Isolation and structure of nucleic acids
N
O
H
H
H
N
N
N N
CH
3
N N
O
N
N
H
H
H
H
N
N N
N
H
N
O
N
O
A T
G C
Sugar
Sugar
Sugar
Sugar
H-bond
Molecular impostor that cannot form hydrogen bonds
In a recent study a synthetic chemical which resembled thymine but which had a
benzene ring instead of the pyrimidine ring and fluorine atoms in place of the
oxygen and nitrogen atoms was used in an assay with DNA polymerase.Much to
the researchers surprise,a DNA helix was produced thus raising doubts on the
role of hydrogen bonds in the DNA structure. A further study showed that mol-
ecular mimics which have the wrong shape distort the helix and that this may rep-
resent the key feature of the structural stability.
New Scientist (22 February 1997)
Because the two pol ynucl eoti de strands have di recti on, two con-
fi gurati ons are possi bl e, one i n whi ch both strands run i n the same
di recti on, whi ch i s known as a parallel arrangement, and one i n whi ch
the strands run i n the opposi te di recti on, whi ch i s known as an
antiparallel arrangement. I t has been shown that the DN A dupl exes
found in nature have their chains in an antiparallel arrangement. This has
a profound si gni fi cance when we come to consi der the bi ol ogi cal
function of the molecules (see DNA replication section).
I n coi l i ng the backbone i nto a hel i cal confi gurati on, two arrange-
ments i n space are possi bl e, one a right-handed helix, the other a l eft-
handed hel i x. The di fference between these two arrangements can be
easily demonstrated using two pipe cleaners. I t has been shown that the
doubl e hel i x of DN A found i n nature i s i n the ri ght-handed confi gu-
ration.
Nucleic acids:structure and function 15
Left-handed Right-handed
Subsequent studi es usi ng space-fi l l i ng model s showed that there
were two groovesi n the DN A mol ecul e, one shal l ow (approx. 12 )
and one deep (approx. 22 ). This feature was found to be important in
rel ati on to the bi ndi ng of certai n anti bi oti cs, pol ymerase enzymes, etc.
The pri nci pal features of the structure of DNA are summari zed i n the
schematic drawing shown below.
Earl i er i n thi s chapter i t was observed that the DNA i sol ated from
the vi rus X174 di d not exhi bi t base equi val ence. The reason for thi s i s
that the DNA of this virus is single-strandedand not the normal duplex
form. This has a significance for its mode of replication.
Triple helices are also possible
Triple helical forms of DNA are now known to exist in vivoand these would
appear to have a key function in certain biological processes,in controlling gene
expression and as tools in genome mapping strategies.Such molecules are now
being actively investigated.
Triple-helical nucleic acids(V.N.Soyfer,1996,ISBN 0-387-94495-8)
DNA as the logic for new breed of computers
The notion of organic computers has been with us for some time now,but
recently the use of short DNA fragments to simulate the logic AND and OR
gates of a computer has been investigated.The claim is that this may now help to
produce a new form of computer that would significantly out-perform those cur-
rently based on silicon.
New Scientist (8 February 1997)
16 Isolation and structure of nucleic acids
Sugarphosphate
backbone
Hydrogen
bonds
Base
3
3
5
5
Shallow groove Deep groove
1 helical turn 3.4 nm
Breast tissue traps carcinogens
From recent research work it would appear that the fatty tissue that constitutes
80%of womens breasts is capable of absorbing fat-soluble organic molecules,
some of which are carcinogenic,causing DNA damage and mutation.It would
appear that certain foodstuffs may be the source of these carcinogens and as such
it might be impossible or difficult to avoid ingesting these.
New Scientist (7 December 1996)
RNA isolation,base composition and structure
Isolation
The i sol ati on procedure i s very si mi l ar to that for DN A i sol ati on
al though i t i s essenti al to i ncl ude an RNase i nhi bi tor to prevent degra-
dati on of the RNA. Unl i ke DNA, whi ch exi sts as a more or l ess homo-
geneous molecule for any one cell, RNA exists as a family of molecules,
each member of which has a distinctive structure and function (see later
sections in this chapter). Because of this wide variety of RNA species it
is necessary to separate further the initial RNA pool into its constituent
molecules. The methods employed include:
density-gradient centrifugation;
ion-exchange chromatography;
Nucleic acids:structure and function 17
Table 3.DNA structure and function
Structural feature Function
Strong,covalently bonded Preserves the base sequence of genetic
sugarphosphate backbone information during the lifetime of the cell
and allows shortening of the DNA during
chromosome formation in cell division
Complementary purinepyrimidine Keeps an equal distance between the two
base pairing polynucleotide strands,increasing the stability
of the double helix
Numerous but weak H-bonds Enhance stability of the helix but are easily
between complementary bases broken by enzymes to allow DNA replication
and transcription
Large,insoluble molecule Restricts DNA to the nucleus,protecting it
(molecular mass approx.100000 units) from biochemical damage,thus preserving
the genetic code
gel filtration;
electrophoresis.
Base composition
Purified RNA molecules can be degraded by chemical means (alkali) or
by enzymes (RNase) and the base composition determined in a manner
similar to that described for DNA (see section on DNA isolation in this
chapter). One major difference between DNA and RNA is that in RNA
there are a fai rl y l arge number of mi nor bases. Al so, i n the majori ty of
cases there i s no base equi val ence, si gni fyi ng that RNA mol ecul es are
usually single-stranded.
Structure
There are three major forms of RN A: ribosomal RNA (rRN A),
transfer RNA(tRNA) and messenger RNA(mRNA) and these exist in
al l the major l i fe forms, together wi th other di sti nct RN A mol ecul es
whi ch are not uni versal . rRN A accounts for about 80% of the total
cellular RNA and is associated with protein to form the cytoplasmic par-
ti cl es known as ribosomes. As shown bel ow the ri bosome i tsel f can be
consi dered as two subuni ts, a l arge and a smal l , both of whi ch are
RN Aprotei n associ ati ons wi th about 65% RN A: 35% protei n. The
RNA has high molecular mass and is metabolically stable.
Transfer RNA i s the next most abundant speci es and accounts for
about 15% of the total RNA. These mol ecul es are of much l ower mol -
ecul ar mass than the rRNA and are al so referred to as 4S RNA. As we
shal l see l ater, these mol ecul es functi on as adaptorsfor ami no aci ds i n
18 Isolation and structure of nucleic acids
Front view Side view
30 S subunit
50 S subunit
the course of protei n synthesi s and many di fferent tRNAs exi st, each
being specificfor one amino acid. The tRNA molecule is single-stranded
but the chai n fol ds back on i tsel f i n a very di sti ncti ve way (see di agram
below) such that about 5060% of the structure is base-paired.
Most of the remai nder of the cel l RNA (l ess than 5%) i s accounted
for by mRN A whi ch, i n eukary otes, ori gi nates i n the nucl eus and
mi grates to the cytopl asm duri ng protei n synthesi s (see Chapters 5 and
6). I t is of high molecular mass and is metabolically very labile, i.e. easily
broken down. As we shal l see i n a l ater secti on (see Chapters 5 and 6),
mRNA is centrally involved in the transfer and expression of the genetic
information and is responsible for the sequence of amino acids in each of
the di fferent protei ns i n the cel l . Because the si ze of the messenger i s
variable, no S-value is associated with it.
Note: the S-values refer to how fast a particular molecule sediments in an
ultracentrifuge and the S refers to Svedberg units, the unit of mea-
surement in such studies. Clearly this relates both to the massand shape
of a molecule in that an object of greater mass will move faster than one
with lower mass but the same shape, and one with a small volume will
move faster than one with a similar mass but larger volume (i.e. less
dense).
Nucleic acids:structure and function 19
Loop 3
Loop 1
Loop 2
Anticodon
3
5
RNAs which act as enzymes
In the late 1980s it was shown that certain RNA molecules could act as enzymes,
capable of splicing out specific sequences of RNA either on itself or other RNA
strands.The name given to such molecules was ribozymeand this work earned
a Nobel Prize for one of the researchers (Tom Cech of the University of
Colorado).Two of these ribozymes are now being clinically tested as potential
treatment against HIV (see BASC Booklet 3,Enzymes and their Role in
Biotechnology,for further details).
New Scientist (7 December 1996)
A role for RNA in the treatment of debilitating disease
Specific RNA molecules are being selected from pools of randomly generated and
synthesized RNA as a means of treating the debilitating disease myasthenia gravis.
Sufferers of this disease produce auto-antibodies which block the normal signals
from nerves to muscles by blocking the acetylcholine receptor.It has been shown
that specific RNA molecules are able to bind to the antibodies,so interfering with
their binding to the receptors and hence restoring the normal nerve-to-muscle
signals (see BASC Booklets 5,Immunology,and 9,The Biochemical Basis of Disease
for further information).
New Scientist (18 January 1997)
20 Isolation and structure of nucleic acids
Table 4.Differences between DNA and RNA structure and properties
DNA RNA
Deoxyribose sugar Ribose sugar
Thymine base Uracil base
Double-stranded helix Single-stranded molecule
Very large molecular mass Much smaller molecular mass
Insoluble Soluble
3
DNA replication
Learning objectives
Each student should, without reference to his or her notes, be able to:
explain the biological significance of DNA replication;
explain the suitability of DNA structure for replication;
outline the main features of semi-conservative replication of DNA;
outline the Meselson and Stahl experiment; and
name the major enzymes involved in replication.
Introduction
I n the previ ous chapter i t was shown how we knew that the geneti c
information of a cell is contained in the DNA of that cell. For the cell to
di vi de and produce daughter cel l s i n mi tosi s and mei osi s i t i s essenti al
that the DNA is copied (replicated) and an identical copy is passed to the
daughter cel l . DNA i s repl i cated duri ng i nterphase of both mi tosi s and
meiosis.
The basi c resul ts on whi ch our present day assumpti ons of the
mechanism of DNA replication are built are:
(i ) the evi dence that the geneti c i nformati on was contai ned wi thi n the
nucleic acids;
(i i ) the fact that most DNA consi sts of two strands of pol ynucl eoti de
chains, each of which consists of deoxyribonucleotide residues joined by
35 phosphodiester bonds; and
(i i i ) the fact that H-bonds occur between the bases i n the two strands,
adeni ne i n one strand i s al way s bonded to thy mi ne i n the other and
likewise cytosine with guanine.
21
Each of these we have al ready consi dered, and thus, i f we are gi ven
the sequence of bases i n one strand, we coul d i mmedi atel y wri te down
the sequence of bases in the complementary strand.
Basically, there are five theoretically possible modes of replication.
Conservative
By thi s mechani sm one daughter cel l recei ves the ori gi nal DN A
molecule whilst the other receives a completely new copy.
Semi-conservative
H ere the ori gi nal DN A mol ecul e i s spl i t i nto two strands and each
strand acts as a template on which a new strand is synthesized.
Non-conservative
H ere the ori gi nal DN A i s destroyed compl etel y duri ng the course of
synthesis of two new identical DNA molecules.
Dispersive
H ere the ori gi nal DN A mol ecul e i s di spersed or di stri buted i nto al l
nascent chains.
End-to-end
Here the ori gi nal DNA mol ecul e i s present as hal f of one chai n and the
other half of the complementary chain for both nascent molecules.
Note: A-level students are only expected to consider evidence for or
against the first two possible modes of replication outlined above.
The fi ve possi bl e modes of repl i cati on are shown schemati cal l y i n the
figure which follows.
I n fact it hasbeen proved that DNA replicates semi-conservatively,
fi rst i n E. coli, and subsequentl y i n al l hi gher organi sms as wel l : i t has
been shown to be the case for eukaryotic cells, including human cells, in
22 DNA replication
ti ssue cul ture. The ori gi nal experi ment whi ch proved thi s mechani sm
was carried out by Meselson and Stahl in 1958.
The Meselson and Stahl experiment
(a) Meselson and Stahl grew E. coli in a medium containing
15
NH
4
Cl (i.e.
15
N is the heavy isotope of nitrogen). The result of this was that
15
N was
i ncorporated i nto newl y synthesi zed DNA (as wel l as other ni trogen-
containing compounds).
(b) They al l owed the cel l s to grow for many generati ons i n the
15
N H
4
Cl so that al l the DN A was heavy that i s, had a greater
density than normal. This heavy DNA could be distinguished from the
normal DNA by centri fugati on i n a caesium chloride (CsCl) density
gradient. Note that
15
N is not a radioactive isotope.
(c) They then transferred the cel l s i nto a medi um wi th normal
14
NH
4
Cl and took samples at various definite time intervals as the cells
mul ti pl i ed, and extracted the DNA that remai ned as doubl e-stranded
helices.
Nucleic acids:structure and function 23
Parent DNA
New DNA
Parent DNA
Conservative
Semi-conservative
Non-conservative
End-to-end
Dispersive
The various samples were then run independently on CsCl gradients
to measure the densities of DNA present. The results of this experiment
were as follows:
Thus the DNA that was extracted from the cul ture one generati on
after the transfer from
15
N to
14
N medi um (i .e. after 20 mi nutes) had a
hybrid or intermediate density, exactly half-way between that of DNA
contai ni ng onl y
15
N and that contai ni ng onl y
14
N; each mol ecul e must
therefore have contained one old and one new strand.
DNA extracted from the cul ture after another generati on (i .e. after
40 minutes) was composed of equal amounts of this hybrid DNA and of
l i ght DN A. As ti me went on, more and more l i ght DN A accu-
mulated.
The results of the Meselson and Stahl experiment are not consistent
with non-conservative (only
14
N-DNA bands would be expected from
20 minutes onwards) or conservative (one band of
14
N-DNA and one of
15
N-DNA at 20 minutes would be expected) modes of replication.
Thus far, however, the data are consi stent wi th the three remai ni ng
modes of replication. To distinguish between these, and hence determine
the mode of repl i cati on, Mesel son and Stahl carri ed out an addi ti onal
24 DNA replication
Light
Intermediate
Heavy
0 Generation 1 2 3 4 etc
0
Time interval
following transfer (minutes)
20 40 60 80
D
e
n
s
i
t
y
o
f
m
o
l
e
c
u
l
e
s
a
f
t
e
r
c
e
n
t
r
i
f
u
g
a
t
i
o
n
experi ment i n whi ch they took the DN A i sol ated at 20 mi nutes after
transfer to
14
N -medi um, strand separated thi s and ran the resul tant
sampl e on a CsCl gradi ent. For di spersi ve or end-to-end model s we
would predict a single hybrid band, whereas for semi-conservative repli-
cati on we woul d expect two bands, one l i ght, one heavy. The l atter was
found to be the case and hence DNA replication is a semi-conservative
process.
Mechanism of replication
Watson and Cri ck, i n thei r 1953 paper i n whi ch they proposed the
doubl e-hel i cal structure for DN A, concl ude as fol l ows: I t has not
escaped our noti ce that the speci fi c pai ri ng we have postul ated i mme-
diately suggests a possible copying mechanism for the genetic material .
The i mpl i cati on was that, owi ng to the stri ct compl ementary nature of
the strands, the base sequence of one strand woul d determi ne the base
sequence of the other. I t coul d be i magi ned therefore that repl i cati on
could follow a scheme as shown below.
Nucleic acids:structure and function 25
5
3
3
3
3
3
3
3
3
5
5
5
5
5
5
5
The problem, however, with this scheme is that the energy required
for the first step, namely strand separation, would be far too high to be
plausible. This led to the suggestion that unfolding took place a little at a
time, so forming a replication forkas shown.
Here, as el sewhere i n bi ochemi cal reacti ons, an enzyme i s used as a
catalyst. Enzymes lower the energy needed to activate a reaction, making
i t more l i kel y to occur. For E. coli thi s i s an enzyme cal l ed DNA poly-
merase III. The simplest scheme would be for polymerase molecules to
attach themselves to the ends of the single-stranded DNA template and
to synthesize a complimentary strand of each.
The first problem encountered with this proposal arose because the
polymerase enzyme was found to operate in one direction only, that is it
synthesized the nascent DNA strand in the direction 5!3. This meant
that one of the parent strands (called the leading strand) could be copied
i n a continuousfashi on, whereas the other (cal l ed the lagging strand)
could only be copied in a discontinuousway as a series of short lengths
(approx. 1000 bases l ong) of DNA, each of whi ch i s cal l ed an Okazaki
fragment (see Fi gure bel ow). I t was al so di scovered earl y on that the
DN A pol y merase enzy me coul d not actual l y sy nthesi ze a DN A
molecule from individual DNA nucleotides but instead required a short
stretch of existing DNA helix duplex on which to build. I n this sense it is
truly a DNA-dependentDNA polymerase I I I .
26 DNA replication
3
5
3
3
5
5
RNA
primer
Helix-destabilizing protein
DNA helicase
DNA gyrase
DNA
polymerase III
3
5
The full details of the process of replication are beyond the scope of
the A-l evel syl l abus but i t may be of i nterest to know the rol es of some
of the major enzy mes/protei ns i nvol ved i n the process. The detai l s
required for A-level are given in bold in Table 5.
Exposure to i oni zi ng radi ati on and carci nogeni c chemi cal s, e.g.
phenol s i n ci garette tar, probabl y overwhel m DNA-repai r enzymes
resulting in gene mutation and cancerous cells.
Nucleic acids:structure and function 27
Table 5. Enzymes involved in DNA replication
Enzyme Function
DNA polymerase III The replicase enzyme
DNA polymerase I A repair enzyme involved in removing errors and filling
in gaps in the sequence
DNA ligase Seals nicks in the sugar-phosphate backbone
RNA polymerase Make RNA primers to get the DNA polymerase III started
or RNA primase
RNase H Removes the RNA primers once they have completed
their function
DNA helicase and Unwind the DNA duplex
DNA gyrase
Helix-destabilizing protein Prevents the unwound DNA from immediately rewinding
4
The genetic code
Learning objectives
Each student should, without reference to his or her notes, be able to:
state that the unit of information is the codon a sequence of three
bases (on DNA and mRNA) representing one amino acid;
state that si nce there are four bases there are 64 possi bl e tri pl et
codons;
state that all codons have been assigned to amino acids or start/stop
signals;
give details of the characteristics of the genetic code; and
predi ct the ami no aci d sequence of a protei n from an mRN A or
DNA base sequence (with the aid of a copy of the genetic code).
Introduction
The mi ddl e to l ate 1950s was a parti cul arl y acti ve and producti ve phase
wi th respect to the research to ai d our understandi ng of the process of
gene expression. At about this time Crick formulated the central dogma
which, in the earliest forms, was expressed as:
Thus far, we have seen that the genetic information which dictates a
cells phenotype is stored in the DNA and this can be passed to progeny
29
Replication DNA RNA Protein
Translation Transcription
cells via the process of replication. We will now consider how this stored
information is expressed to yield the specific phenotype.
A key feature in this respect is the concept of a genewhich has been
defi ned operati onal l y as a regi on of the genomethat segregates as a
si ngl e uni t duri ng mei osi s and gi ves ri se to a defi nabl e phenotypi c trai t
such as a red or whi te eye i n Drosophilaor a round or wri nkl ed seed i n
peas .
The genome is the sum of all DNA nucleotide/base sequences in an
individual organism or species.
Later work demonstrated that a gene or gene regioncorresponds to
a l ength of DNA on the genome that contai ns the i nformati on for the
synthesis of a single protein. This led to the useful concept that one gene
corresponds to one protein.
H owever, as a great many bi ol ogi cal l y acti ve protei ns compri se
more than one pol ypepti de chai n, i t i s more correct to consi der thi s as
one gene one polypeptide chain.
Although serving as a useful memory aid, even this statement is not
enti rel y correct si nce some genes code for si ngl e structural RNA mol -
ecul es, such as mRNA and tRNA, rather than pol ypepti de chai ns and,
also, more recent studies have revealed additional complexities in terms
of the structural organi zati on of the geneti c i nformati on. For the
purposes of the present section, however, it is acceptable to treat the gene
in terms of the one gene one polypeptide chain hypothesis.
Building a human chromosome
The first synthetic chromosome,that of yeast,was built about a decade ago.To
extend this to the synthesis of a human chromosome required detailed consid-
eration of further structural complexities.Thus,in addition to the structural genes
there is a need for telomeres,long repeating DNA sequences which protect the
ends of the chromosome and prevent them binding together,and centromeres,
which provide the scaffolding that enables duplicate chromosomes to split during
cell division.Using this approach,totally artificial chromosomes have been
produced and it is hoped that these may help in gene therapy in the future.
New Scientist (5 April 1997)
30 The genetic code
To transfer the geneti c i nformati on from DN A to a sequence of
ami no aci ds that consti tutes a pol ypepti de chai n i nvol ves two di screte
but consecutive processes:
(i ) transcription, i n whi ch an mRN A strand of nucl eoti des i s sy n-
thesized from one of the strands of the DNA in the gene region; and
(ii) translation, in which the genetic information coded as a sequence of
bases in mRNA is translated or converted into a sequence of amino acids
to form a discrete polypeptide chain.
The gene regions specific to protein synthesis collectively only rep-
resent a small part of the genome for many cells. The remaining regions
are accounted for as non-coding/nonsense sequences, multiple repeat
sequences, control regionsand regions specifying the synthesis of func-
ti onal RNA mol ecul es such as tRNA and rRNA. A summary of these
features is shown below.
How many bases code for an amino acid?
I n the next section we shall consider details of the mechanism by which
the i nformati on coded i n the mRNA i s used to synthesi ze protei ns of a
speci fi c sequence. An i mportant concept underpi nni ng thi s topi c i s the
nature of the code used to translate the sequence of bases in the mRNA
mol ecul e i nto a sequence of ami no aci ds i n a pol ypepti de chai n. Thi s
code is referred to as the genetic codeand in the present section we shall
consider the general properties of the code.
For mRNA to control the synthesis of specific proteins we must ask
how a mol ecul e made up from onl y four di fferent ki nds of base can
determine the order of about 20 different amino acids. Obviously there
cannot be a 1:1 correspondence between the RNA bases and the ami no
Nucleic acids:structure and function 31
Transcription
Control region for gene A Multiple copies of gene C Control region for genes B1 and B2
Translation
gene A
Protein A Protein B (dimer)
Non-coding
sequences
mRNA mRNA mRNA tRNA
gene B1 gene B2
aci ds. Si mi l arl y there cannot be a 2:1 correspondence, si nce thi s onl y
allows for 16 different arrangements (44). I f, however, there are three
bases codi ng for each ami no aci d, the number of possi bl e tri pl ets i s 64
(4 4 4), whi ch woul d be enough combi nati ons to code for each
ami no aci d wi th pl enty to spare. Cl ear-cut evi dence has been produced
that the triplet theoryi s correct and we wi l l now l ook at the evi dence
for such a view. The name given to the sequence of three nucleotides is a
codon.
Evidence for a triplet code
The evidence for a triplet code comes from genetic studies by Crick and
co-workers usi ng mutati ons caused by acri di ne orange or profl avi ne i n
frui t fl i es. The mode of acti on of these compounds i s not ful l y
understood but the net effect i s to cause the i nserti on of an addi ti onal
base i nto a DNA strand or to del ete one of the bases i n a DNA strand,
possibly during DNA replication.
I t i s easi er to expl ai n the resul ts of these studi es by consi deri ng a
simple model and studying its behaviour. I f we assume a DNA sequence
to be a series of triplet codes of bases, or codons, for five amino acids:
CAT CAT CAT CAT CAT
The insertion of a base, T:
CAT TCA TCA TCA TCA
Or deletion of a base, C, from the original sequence:
CAT ATC ATC ATC ATC
The resul t of a si ngl e mutati on (ei ther i nserti on or del eti on) i s to
cause the genetic message/amino acid sequence everywhere to the right
of the mutation to be misread.
The result would almost certainly be a defective protein or enzyme,
especi al l y i f the ami no aci d sequence i s i n or near the acti ve si te of an
enzyme.
Al so i n these studi es i t was shown that i f we i nsert an extra base,
then delete another base a little further on in the sequence, the net result
is an almost identical product (demonstrate thi s for y oursel f). The
important point here is that the insertion and deletion are relatively close
32 The genetic code
together (why ?). There are al l sorts of other way s of combi ni ng
mutation, but the net results provide very strong evidence in favour of a
tri pl et code. These resul ts al so provi de addi ti onal data concerni ng the
mode of translating the code.
Deciphering the code
The probl em of deci pheri ng the code, that i s assi gni ng ami no aci ds to
each of the codons, has been approached usi ng a vari ety of techni ques,
the most productive being the ribosome-binding techniquedevised by
Leder and Ni renberg. Thi s method, l i ke many of the others, makes use
of synthetic mRNA, whi ch i n thi s case i s onl y three bases l ong, i .e. a
si ngl e codon. Cl earl y the advantage of usi ng an mRN A of known
sequence is that it makes data interpretation a relatively straightforward
process.
The ribosome-binding technique involved mixing a sample of pure
syntheti c codon wi th ri bosomes and then, i n turn, each of the di fferent
amino acids (which had been radioactively labelled), each linked to their
correspondi ng tRN A. Each mi xture was fi l tered through a Mi l l i pore
fi l ter whi ch caused the ri bosome and any bound components to be
retained on the filter. The filter could then be tested for radioactivity. I f
we consi der as an exampl e the tri pl et UUG, then i t was found that the
sample which contained radioactive serine did not bind to the filter, and
likewise for the amino acids tryptophan, glycine, methionine, tyrosine,
etc. I n fact, the onl y sampl e whi ch retai ned radi oacti vi ty on the fi l ter
occurred when leucine was used (see diagram below) and therefore it can
be concluded that the codon UUG codes for leucine (Leu).
Nucleic acids:structure and function 33
UUG
UUG
AAC
AAC
Leu* Leu*
Ribosome
Synthetic
mRNA codon
Anti-codon
Amino acid tRNA molecule