Академический Документы
Профессиональный Документы
Культура Документы
SECTION
17
Experimental
Biochemistry
Learning Objectives
✓ How are proteins purified?
✓ How is the primary structure of a protein determined?
✓ What immunological techniques are used in biochemistry laboratories?
✓ How can a protein-encoding gene be cloned?
✓ How can a DNA molecule be sequenced and amplified?
Y ou have learned much about how cells, tissues, and organisms function in
your study of biochemistry. What you have learned has been presented as
facts in a textbook. But everything that is now believed to be true was at some
point simply an experimental observation and often a controversial one at that.
For instance, recall the rejection letter that Hans Krebs received when he first sub-
mitted a paper describing the citric acid cycle (p. 290). This section describes some
of the techniques used by researchers to tease information out of the cell.
Although scientists are most interested in how biochemistry take place in an
organism, this goal is technically very difficult to achieve. How can we learn about
a particular biomolecule—a protein, for instance—when it is surrounded by thou-
sands of other molecules and interacting with them? To circumvent this difficulty,
Tymo_c40_620-637hr 2-12-2008 10:12 Page 621
Ser Tyr Gln Val lle Cys Arg Asp Glu Lys Thr Gln Met lle Tyr Gln Gln His Gln Ser
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Trp Leu Arg Pro Val Leu Arg Ser Asn Arg Val Glu Tyr Cys Trp Cys Asn Ser Gly Arg
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
Ala Gln Cys His Ser Val Pro Val Lys Ser Cys Ser Glu Pro Arg Cys Phe Asn Gly Gly
41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Thr Cys Gln Gln Ala Leu Tyr Phe Ser Asp Phe Val Cys Gln Cys Pro Glu Gly Phe Ala
61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
Gly Lys Cys Cys Glu lle Asp Thr Arg Ala Thr Cys Tyr Glu Asp Gln Gly lle Ser Tyr
81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
Arg Gly Asp Trp Ser Thr Ala Glu Ser Gly Ala Glu Cys Thr Asp Trp Gln Ser Ser Ala
101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
Leu Ala Gln Lys Pro Tyr Ser Gly Arg Arg Pro Asp Ala lle Arg Leu Gly Leu Gly Asn
121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140
His Asn Tyr Cys Arg Asn Pro Asp Arg Asp Ser Lys Pro Trp Cys Tyr Val Phe Lys Ala
141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160
Gly Lys Tyr Ser Ser Glu Phe Cys Ser Thr Pro Ala Cys Ser Glu Gly Asn Ser Asp Cys
161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
at least initially, the first approach to understanding how any biomolecule works Tyr
181
Phe
182
Gly
183
Asn
184
Gly
185
Ser
186
Ala
187
Tyr
188
Arg
189
Gly
190
Thr
191
His
192
Ser
193
Leu
194
Thr
195
Glu
196
Ser
197
Gly
198
Ala
199
Ser
200
Cys Leu Pro Trp Asn Ser Met lle Leu lle Gly Lys Val Tyr Thr Ala Gln Asn Pro Ser
201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220
is to isolate and purify the biomolecule and examine its biochemically properties Cys
221
Gln
222
Ala
223
Leu
224
Gly
225
Leu
226
Gly
227
Lys
228
His
229
Asn
230
Tyr
231
Cys
232
Arg
233
Asn
234
Pro
235
Asp
236
Gly
237
Asp
238
Ala
239
Lys
240
Pro Trp Cys His Val Leu Lys Asn Arg Arg Leu Thr Trp Glu Tyr Cys Asp Val Pro Ser
in vitro. Toward this end, we will first examine how proteins, the workhorses of bio- 241
Cys
261
242
Ser
262
243
Thr
263
244
Cys
264
245
Gly
265
246
Leu
266
247
Arg
267
248
Gln
268
249
Tyr
269
250
Ser
270
251
Gln
271
252
Pro
272
253
Gln
273
254
Phe
274
255
Arg
275
256
lle
276
257
Lys
277
258
Gly
278
259
Gly
279
260
Leu
280
chemistry, are purified. Protein purification is both a science and an art. Phe
281
Pro
Ala
282
Gly
Asp
283
Glu
lle
284
Arg
Ala
285
Phe
Ser
286
Leu
His
287
Cys
Pro
288
Gly
Trp
289
Gly
Gln
290
lle
Ala
291
Leu
Ala
292
lle
Leu
293
Ser
Phe
294
Ser
Ala
295
Cys
Ala
296
Trp
Ala
297
lle
Ala
298
Leu
Ala
299
Ser
Ser
300
Ala
301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320
Researchers take advantage of often slight differences in the physical character- Ala
321
His
322
Cys
323
Phe
324
Gln
325
Glu
326
Arg
327
Phe
328
Pro
329
Pro
330
His
331
His
332
Leu
333
Thr
334
Val
335
Trp
336
lle
337
Leu
338
Ser
339
Ala
340
Tyr Arg Val Val Pro Gly Glu Glu Glu Gln Lys Phe Glu Val Glu Lys Tyr lle Val Lys
istics of similar proteins to separate them from one another. After a protein has 341
Lys
361
342
Glu
362
343
Phe
363
344
Asp
364
345
Asp
365
346
Asp
366
347
Thr
367
348
Tyr
368
349
Asp
369
350
Asn
370
351
Asp
371
352
lle
372
353
Ala
373
354
Leu
374
355
Leu
375
356
Gln
376
357
Leu
377
358
Lys
378
359
Ser
379
360
Asp
380
Leu Gln Leu Pro Asp Thp Thr Glu Cys Glu Leu Ser Gly Tyr Gly Lys His Glu Ala Leu
401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420
ture. Knowing the primary structure of a protein can be a source of insight into Ser
421
Pro
422
Phe
423
Tyr
424
Ser
425
Glu
426
Arg
427
Leu
428
Lys
429
Glu
430
Ala
431
His
432
Val
433
Arg
434
Leu
435
Tyr
436
Pro
437
Ser
438
Ser
439
Arg
440
Cys Thr Ser Gln His Leu Leu Asn Arg Thr Val Thr Asp Asn Met Leu Cys Ala Gly Asp
441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460
the structure and function of the protein and allows us to compare it with other Thr
461
Arg
462
Ser
463
Gly
464
Gly
465
Pro
466
Gln
467
Ala
468
Asn
469
Leu
470
His
471
Asp
472
Ala
473
Cys
474
Gln
475
Gly
476
Asp
477
Ser
478
Gly
479
Gly
480
Pro Leu Val Cys Leu Asn Asp Gly Arg Met Tyr Leu Val Gly lle lle Ser Trp Gly Leu
similar proteins. Protein purification and primary structure determination are the 481
Gly
482
Cys
483
Gly
484
Gln
485
Lys
486
Asp
487
Val
488
Pro
489
Gly
490
Val
491
Tyr
492
Thr
493
Lys
494
Val
495
Thr
496
Asn
497
Tyr
498
Leu
499
Asp
500
Trp
501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520
In Chapter 41, we will examine additional techniques for the investigation of Chapter 40: Techniques in
biomolecules. We will explore powerful immunological techniques that can be Protein Biochemistry
used to further investigate proteins as well as other biomolecules. These same
techniques are also useful in clinical settings for diagnosis and treatment. Finally,
we will investigate recombinant DNA technology—the tools and techniques that
allow researchers to move and connect genes and large pieces of DNA. We will
learn how genes are cloned and look into the variety of experimental and clinical
opportunities that cloning provides.
CHAPTER
40 Techniques in Protein
Biochemistry
40.1 The Proteome Is the Functional Ser
1
Tyr
2
Gln
3
Val
4
lle
5
Cys
6
Arg
7
Asp
8
Glu
9
Lys
10
Thr
11
Gln
12
Met
13
lle
14
Tyr
15
Gln
16
Gln
17
His
18
Gln
19
Ser
20
Function Gly Lys Cys Cys Glu lle Asp Thr Arg Ala Thr Cys Tyr Glu Asp Gln Gly lle Ser Tyr
81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
Facilitates an Understanding of Leu Ala Gln Lys Pro Tyr Ser Gly Arg Arg Pro Asp Ala lle Arg Leu Gly Leu Gly Asn
121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140
Protein Function His Asn Tyr Cys Arg Asn Pro Asp Arg Asp Ser Lys Pro Trp Cys Tyr Val Phe Lys Ala
141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160
Gly Lys Tyr Ser Ser Glu Phe Cys Ser Thr Pro Ala Cys Ser Glu Gly Asn Ser Asp Cys
161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
Tyr Phe Gly Asn Gly Ser Ala Tyr Arg Gly Thr His Ser Leu Thr Glu Ser Gly Ala Ser
181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200
Cys Leu Pro Trp Asn Ser Met lle Leu lle Gly Lys Val Tyr Thr Ala Gln Asn Pro Ser
201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220
Cys Gln Ala Leu Gly Leu Gly Lys His Asn Tyr Cys Arg Asn Pro Asp Gly Asp Ala Lys
221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240
Pro Trp Cys His Val Leu Lys Asn Arg Arg Leu Thr Trp Glu Tyr Cys Asp Val Pro Ser
241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260
Cys Ser Thr Cys Gly Leu Arg Gln Tyr Ser Gln Pro Gln Phe Arg lle Lys Gly Gly Leu
261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280
Phe Ala Asp lle Ala Ser His Pro Trp Gln Ala Ala Leu Phe Ala Ala Ala Ala Ala Ser
281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300
Pro Gly Glu Arg Phe Leu Cys Gly Gly lle Leu lle Ser Ser Cys Trp lle Leu Ser Ala
301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320
Ala His Cys Phe Gln Glu Arg Phe Pro Pro His His Leu Thr Val Trp lle Leu Ser Ala
321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340
Tyr Arg Val Val Pro Gly Glu Glu Glu Gln Lys Phe Glu Val Glu Lys Tyr lle Val Lys
341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360
Lys Glu Phe Asp Asp Asp Thr Tyr Asp Asn Asp lle Ala Leu Leu Gln Leu Lys Ser Asp
361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380
Ser Ser Arg Cys Ala Gln Glu Ser Ser Val Val Arg Thr Val Cys Leu Pro Pro Ala Asp
381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400
Leu Gln Leu Pro Asp Thp Thr Glu Cys Glu Leu Ser Gly Tyr Gly Lys His Glu Ala Leu
401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420
Ser Pro Phe Tyr Ser Glu Arg Leu Lys Glu Ala His Val Arg Leu Tyr Pro Ser Ser Arg
421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440
Cys Thr Ser Gln His Leu Leu Asn Arg Thr Val Thr Asp Asn Met Leu Cys Ala Gly Asp
441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460
Thr Arg Ser Gly Gly Pro Gln Ala Asn Leu His Asp Ala Cys Gln Gly Asp Ser Gly Gly
461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480
Pro Leu Val Cys Leu Asn Asp Gly Arg Met Tyr Leu Val Gly lle lle Ser Trp Gly Leu
481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500
Gly Cys Gly Gln Lys Asp Val Pro Gly Val Tyr Thr Lys Val Thr Asn Tyr Leu Asp Trp
501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520
M uch of our study of biochemistry has focused on protein structure and func-
tion. We have observed that proteins are indeed the workhorses of the cell.
All of the information that we have learned about proteins raises an interesting
question: How do we know what we know about proteins? The first step toward
learning how proteins work in the cell is to learn how they work outside the cell,
in vitro. To do so, the proteins must be separated from all of the other constituents
of the cell so that their biochemical properties can be identified and characterized.
In other words, the protein must be purified.
In this chapter, we will examine some of the key techniques of protein purifi-
cation. All of these techniques take advantage of biochemical properties unique
to each protein. Then, we will learn how one crucial property of proteins—amino
acid sequence, or primary structure—is elucidated.
622
Tymo_c40_620-637hr 2-12-2008 10:12 Page 623
623
40.1 The Proteome Is the Functional Representation of the 40.2 Protein Purification
Genome
Every year, researchers are increasing their knowledge of the exact DNA base
sequences and volume of information contained in the genomes of many organ-
isms. For example, researchers recently concluded that the roundworm Caenorhab-
ditis elegans has a genome of 97 million bases and about 19,000 protein-encoding
genes, whereas that of the fruit fly Drosophilia melanogaster contains 180 million
bases and about 14,000 genes. The completely sequenced human genome contains
3 billion bases and about 25,000 genes. But this genomic knowledge is analogous
to a list of parts for a car: it does not explain which parts are present in different
components or how the parts work together. A new word, the proteome, has been
coined to signify a more complex level of information content—the level of func-
tional information, which encompasses the types, functions, and interactions of
proteins that yield a functional unit.
The term proteome is derived from proteins expressed by the genome. The
genome provides a list of gene products that could be present, but only a subset of
these gene products will actually be expressed in a given biological context. The
proteome tells us what is functionally present—for example, which proteins inter-
act to form a signal-transduction pathway or an ion channel in a membrane.
Unlike the genome, the proteome is not a fixed characteristic of the cell. Rather,
because it represents the functional expression of information, it varies with cell
type, developmental stage, and environmental conditions, such as the presence of
hormones. Almost all gene products are proteins that can be chemically modified
in a variety of ways. Furthermore, these proteins do not exist in isolation; they often
interact with one another to form complexes with specific functional properties.
An understanding of the proteome is acquired by investigating, characteriz-
ing, and cataloging proteins. In some, but not all, cases, this process begins by sep-
arating a particular protein from all other biomolecules in the cell.
–
O O –
C O O
Lactate
dehydrogenase C
HO C H + NAD+ + NADH + H+
C
CH3 O CH3
Lactate Pyruvate
Tymo_c40_620-637hr 2-12-2008 10:12 Page 624
Centrifuge
at 500 × g
for 10 minutes
Supernatant
Homogenate
forms Pellet: Nuclear
10,000 × g
20 minutes
fraction
Cytosol
(soluble proteins)
Pellet: Microsomal
fraction
Figure 40.1 Differential centrifugation. Cells are disrupted in a homogenizer and the
resulting mixture, called the homogenate, is centrifuged in a step-by-step fashion of
increasing centrifugal force. The denser material will form a pellet at lower centrifugal force
than will the less-dense material. The isolated fractions can be used for further purification.
[Photographs courtesy of Dr. S. Fleischer and Dr. B. Fleischer.]
Tymo_c40_620-637hr 2-12-2008 10:12 Page 625
Protein solubility
Size, Charge, and Binding Affinity Salting out
Proteins are purified on the bases of differences in solubility, size, charge, and spe-
cific binding affinity. Usually, protein mixtures are subjected to a series of separa- Salting in
tions, each based on a different property.
Salting out. Most proteins require some salt to dissolve, a process called salt-
Salt concentration
ing in. However, most proteins precipitate out of solution at high salt concen-
trations, an effect called salting out (Figure 40.2). Salting out is due to Figure 40.2 The dependency of protein
competition between the salt ions and the protein for water to keep the protein solubility on salt concentration. The graph
in solution (water of solvation). The salt concentration at which a protein pre- shows how altering the salt concentration
cipitates differs from one protein to another. Hence, salting out can be used to affects the solubility of a hypothetical
protein. Different proteins will display
fractionate a mixture of proteins. Unfortunately, many proteins lose their activ- different curves.
ity in the presence of such high concentrations of salt. However, the salt can be
removed by the process of dialysis. The protein–salt solution is placed in a small
bag made of a semipermeable membrane, such as a cellulose membrane, with
pores (Figure 40.3). Proteins are too large to fit through the
pores of the membrane, whereas smaller molecules and ions
such as salts can escape through the pores and emerge in the
medium outside the bag (the dialysate).
the basis of their net charge by ion-exchange chromatogra- Figure 40.3 Dialysis. Protein molecules (red) are retained within
phy. If a protein has a net positive charge at pH 7, it will usu- the dialysis bag, whereas small molecules (blue) diffuse into the
ally bind to a column of beads containing negatively charged surrounding medium.
Tymo_c40_620-637hr 2-12-2008 10:12 Page 626
626
40 Techniques in Protein Carbohydrate
Biochemistry polymer bead
Small molecules
enter the
aqueous spaces
within beads
Protein
sample Large
molecules
cannot enter
Molecular beads
exclusion
gel
Flow direction
carboxylate groups, whereas a negatively charged protein will not bind to the
column (Figure 40.5). A positively charged protein bound to such a column
can then be released by increasing the concentration of salt in the buffer
poured over the column. The positively charged ions of the salt compete with
positively charged groups on the protein for binding to the column. Likewise,
a protein with a net negative charge will be bound to ion-exchange beads car-
rying positive charges and can be eluted from the column with the use of a
buffer containing salt.
0.24
Glucose-binding
protein attaches G
G 5
to glucose
residues (G) on 0.20 1
G
beads G
G
Absorbance at 220 nm
G 0.16
GG
0.12
G 23 4
G
0.08
Addition of
glucose (G)
0.04
G
G 0
G 0 5 10
G
GG Time (minutes)
G
G
Figure 40.7 High-pressure liquid
chromatography (HPLC). Gel filtration by
HPLC clearly defines the individual
GG G Figure 40.6 Affinity chromatography. proteins because of its greater resolving
Glucose-binding G
proteins are G Affinity chromatography of power. Proteins are detected by their
released on G absorbance of 220-nm light waves:
concanavalin A (shown in yellow) on a
addition of GG solid support containing covalently (1) thyroglobulin (669 kd), (2) catalase
glucose
attached glucose residues (G). (232 kd), (3) bovine serum albumin
(67 kd), (4) ovalbumin (43 kd), and
(5) ribonuclease (13.4 kd). [After K. J. Wilson
and T. D. Schlabach. In Current Protocols in
Proteins Can Be Separated by Gel Electrophoresis and Displayed Molecular Biology, vol. 2, suppl. 41, F. M.
How can we tell whether a purification scheme is effective? One way is to demon- Ausubel, R. Brent, R. E. Kingston, D. D. Moore,
strate that the specific activity rises with each purification step. Another is to visu- J. G. Seidman, J. A. Smith, and K. Struhl, Eds.
(Wiley, 1998), p. 10.14.1.]
alize the number of proteins present at each step. The technique of gel
electrophoresis makes the latter method possible.
A molecule with a net charge will move in an electric field, a phenomenon
termed electrophoresis. The distance and speed that a protein moves in elec-
trophoresis depends on the electric-field strength, the net charge on the protein,
which is a function of the pH of the electrophoretic solution, and the shape of the
protein. Electrophoretic separations are nearly always carried out in gels, such as
polyacrylamide, because the gel serves as a molecular sieve that enhances separa-
tion. Molecules that are small compared with the pores in the gel readily move
through the gel, whereas molecules much larger than the pores are almost immo-
bile. Intermediate-size molecules move through the gel with various degrees of
ease. The electrophoresis of proteins is performed in a thin, vertical slab of poly-
acrylamide. The direction of flow is from top to bottom (Figure 40.8).
Proteins can be separated largely on the basis of mass by electrophoresis in a
polyacrylamide gel in the presence of the detergent sodium dodecyl sulfate (SDS).
The negatively charged SDS denatures proteins and binds to the denatured protein
at a constant ratio of one SDS molecule for every two amino acids in the protein.
The negative charges on the many SDS molecules bound to the
protein “swamp” the normal charge on the protein and cause all
proteins to have the same charge-to-mass ratio. Thus, proteins will Na+ SO3–
differ only in their mass. Finally, a sulfhydryl agent such as mercap- O
toethanol is added to reduce disulfide bonds and completely lin- Sodium dodecyl sulfate
earize the proteins. The SDS–protein complexes are then subjected (SDS)
627
Tymo_c40_620-637hr 2-12-2008 10:12 Page 628
Electrophoresis
Direction of
+ electrophoresis
Porous gel
to electrophoresis. When the electrophoresis is complete, the proteins in the gel can
be visualized by staining them with silver or a dye such as Coomassie blue, which
reveals a series of bands (Figure 40.9). Small proteins move rapidly through the gel,
whereas large proteins stay at the top, near the point of application of the mixture.
(B)
Low pH High pH
(+) (−)
Low pH
(+)
SDS–polyacrylamide slab
SDS-PAGE
Figure 40.11 Two-dimensional gel electrophoresis. (A) A protein sample is initially
fractionated in one direction by isoelectric focusing as described in Figure 40.10. The
isoelectric focusing gel is then attached to an SDS–polyacrylamide gel, and electrophoresis
is performed in the second direction, perpendicular to the original separation. Proteins with
the same pI value are now separated on the basis of mass. (B) Proteins from E. coli were
separated by two-dimensional gel electrophoresis, resolving more than a thousand different
proteins. The proteins were first separated according to their isoelectric pH in the
horizontal direction and then by their apparent mass in the vertical direction. [(B) Courtesy of
Dr. Patrick H. O’Farrell.]
the vertical direction on the basis of mass. More than a thousand different pro-
teins in the bacterium Escherichia coli can be resolved in a single experiment by
two-dimensional electrophoresis (Figure 40.11).
629
Tymo_c40_620-637hr 2-12-2008 10:12 Page 630
• Total Activity. The enzyme activity for the fraction is obtained by measuring
the enzyme activity in the volume of fraction used in the assay and multiply-
ing by the fraction’s total volume.
• Specific Activity. This parameter, obtained by dividing total activity by total pro-
tein, enables us to measure the degree of purification by comparing specific
activities after each purification step. Recall that the goal of a purification
scheme is to maximize specific activity.
• Yield. This parameter is a measure of the total activity retained after each
purification step as a percentage of the activity in the crude extract. The
amount of activity in the initial extract is taken to be 100%.
• Purification level. This parameter is a measure of the increase in purity and is
obtained by dividing the specific activity, calculated after each purification step,
by the specific activity of the initial extract.
As we see in Table 40.1, several purification steps can lead to several thousand-
fold purification. Inevitably, in each purification step, some of the protein of inter-
est is lost, and so our overall yield is 35%. A good purification scheme takes into
account purification levels as well as yield.
The SDS-PAGE depicted in Figure 40.12 shows that, if we load the same
QUICK QUIZ 2 What physical amount of protein onto each lane after each step, the number of bands decreases
differences among proteins allow for in proportion to the level of purification and the amount of protein of interest
their purification? increases as a proportion of the total protein present.
631
40.3 Determining Primary Structure
R
O N
R NH2
O O
O OH
O
O
OH
Figure 40.13 Fluorescent derivatives of amino acids. Fluorescamine reacts with the ␣-amino
group of an amino acid to form a fluorescent derivative.
proteins with those isolated from patients with pathological conditions allows an
understanding of the molecular basis of diseases.
Let us examine first how we can sequence a simple peptide, such as
Ala-Gly-Asp-Phe-Arg-Gly
The first step is to determine the amino acid composition of the peptide. The pep-
tide is hydrolyzed into its constituent amino acids by heating it in strong acid. The
individual amino acids can then be separated by ion-exchange chromatography
and visualized by treatment with fluorescamine, which reacts with the ␣-amino
group to form a highly fluorescent product (Figure 40.13).
The concentration of an amino acid in solution is proportional to the fluorescence
of the solution. The solution is then run through a column. The amount of buffer
required to remove the amino acid from the column is compared with the elution
pattern of a standard mixture of amino acids, revealing the identity of the amino
acid in the solution (Figure 40.14). The composition of our peptide is
(Ala, Arg, Asp, Gly2, Phe)
The parentheses denote that this is the amino acid composition of the peptide,
not its sequence.
The sequence of a protein can then be determined by a process called the
Edman degradation. The Edman degradation sequentially removes one residue at
Phe
Asp
Leu
Glu
Cys
Pro
Arg
Ser
Thr
Gly
His
Ala
Lys
Val
Tyr
lle
O
H H
EDMAN DEGRADATION H2N Asp Phe Arg Gly
N + N
C H
1 2 3 4 5 H3C H O
S
Phenyl isothiocyanate Ala Gly
Labeling
Labeling
First
1 2 3 4 5 round
O
H H H H
Release N N Asp Phe Arg Gly
N
H
1 2 3 4 5 H3C H O
S
Labeling
Release
2 3 4 5 Second
round
S
Release
H H
NH Asp Phe Arg Gly
2 3 4 5 N + H2N
H
O
O CH3
Figure 40.15 The Edman degradation. The labeled amino-terminal residue (PTH–alanine in
the first round) can be released without hydrolyzing the rest of the peptide. Hence, the
amino-terminal residue of the shortened peptide (Gly-Asp-Phe-Arg-Gly) can be determined in
the second round. Three more rounds of the Edman degradation reveal the complete
sequence of the original peptide.
a time from the amino end of a peptide (Figure 40.15). Phenyl isothiocyanate
reacts with the terminal amino group of the peptide, which then cyclizes and
breaks off the peptide, yielding an intact peptide shortened by one amino acid.
The cyclic compound is a phenylthiohydantoin (PTH)–amino acid, which can be
identified by chromatographical procedures. The Edman procedure can then be
repeated sequentially to yield the amino acid sequence of the peptide.
632
Tymo_c40_620-637hr 2-12-2008 10:12 Page 633
Figure 40.16 Overlap peptides. The peptide obtained by chymotryptic digestion overlaps
two tryptic peptides, establishing their order.
634 proteins apparently have arisen by the duplication of primordial genes. For exam-
40 Techniques in Protein Biochemistry ple, calmodulin, a ubiquitous calcium sensor in eukaryotes, contains four similar
calcium-binding modules that arose by gene duplication (Figure 40.17).
4. Many proteins contain amino acid sequences that serve as signals designating
their destinations or controlling their processing. For example, a protein destined for
export from a cell or for location in a membrane contains a signal sequence, a
stretch of about 20 hydrophobic residues near the amino terminus that directs the
protein to the appropriate membrane. Another protein may contain a stretch of
amino acids that functions as a nuclear localization signal, directing the protein to
the nucleus.
5. Sequence data allow a molecular understanding of diseases. Many diseases
are caused by mutations in DNA that result in alterations in the amino acid
N C sequence of a particular protein. These alterations often compromise the protein’s
function. For instance, sickle-cell anemia is caused by a change in a single amino
Figure 40.17 Repeating motifs in a acid in the primary structure of the  chain of hemoglobin. Approximately 70% of
protein chain. Calmodulin, a calcium the cases of cystic fibrosis are caused by the deletion of one particular amino acid
sensor, contains four similar units (shown
from the 1480-amino-acid-containing protein that controls chloride transport
in red, yellow, blue, and orange) in a single
polypeptide chain. Notice that each unit across cell membranes. Indeed, a major goal of biochemistry is to elucidate the mol-
binds a calcium ion (shown in green). ecular basis of disease with the hope that this understanding will lead to effective
[Drawn from 1CLL.pdb.] treatment.
Clinical Insight
Understanding Disease at the Molecular Level: Sickle-Cell Anemia
Results from a Single Amino Acid Change
Studies with mutations of hemoglobin have provided many examples showing
that alterations in the primary structure of a protein result in pathological condi-
tions. The role of hemoglobin, which is contained in red blood cells, is to bind
oxygen in the lungs and to transport and release oxygen to tissues that require oxy-
gen for combusting fuels. If the ability of hemoglobin to carry and release oxygen
is somehow compromised, anemia results. Anemia is characterized by a host of
symptoms, most commonly fatigue. A well-studied example of anemia is sickle-
cell anemia, which is most commonly found in people from sub-Saharan Africa
or their descendants (p. 113). Recall from Chapter 8 that the defining feature of
sickle-cell anemia is that the red blood cells adopt a sickle shape after the hemo-
globin has released its bound oxygen. These sickled cells clog small capillaries and
impair blood flow. The results may be painful swelling of the extremities and a
higher risk of stroke or bacterial infection (owing to poor circulation). The sick-
led red cells also do not remain in circulation as long as normal cells do, leading
to anemia.
What is the molecular defect associated with sickle-cell anemia? A single
amino acid substitution in the  chain of hemoglobin is responsible—namely,
the substitution of a valine residue for a glutamate residue in position 6. The
mutated form is referred to as hemoglobin S (Hb S). In people with sickle-cell ane-
mia, both copies of the hemoglobin -chain (Hb B) gene are mutated. The Hb S
Figure 40.18 Sickled red blood cells substitution substantially decreases the solubility of deoxyhemoglobin, although
trapped in capillaries. The micrograph it does not markedly alter the properties of oxyhemoglobin. Hence, sickling takes
shows sickled red blood cell trapped in tiny place in the small capillaries after the hemoglobin has released its oxygen to the
blood vessels called capillaries. The
tissues. The deoxyhemoglobin molecules associate with one another and form
trapped cells impede the flow of blood
through the tissues. Consequently, the insoluble aggregates that deform the cell, leading to the characteristic sickle shape
cells are deprived of oxygen and the (Figure 40.18). Thus, sickle-cell anemia, like the spongiform encephalopathies
tissues are damaged. [Courtesy of National discussed earlier (Chapter 4), is a pathological condition resulting from protein
Heart, Lung and Blood Institute.] aggregation. ■
Tymo_c40_620-637hr 2-12-2008 10:12 Page 635
SUMMARY 635
Answers to Quick Quizzes
40.1 The Proteome Is the Functional Representation of the Genome
The rapid progress in gene sequencing has advanced another goal of
biochemistry—the elucidation of the proteome. The proteome is the com-
plete set of proteins expressed and includes information about how they are
modified, how they function, and how they interact with other molecules.
Unlike the genome, the proteome is not static and varies with cell type,
developmental stage, and environmental conditions.
40.2 The Purification of Proteins Is the First Step in Understanding Their
Function
Proteins can be separated from one another and from other molecules on
the basis of such characteristics as solubility, size, charge, and binding affin-
ity. SDS-PAGE separates the polypeptide chains of proteins under denatur-
ing conditions largely according to mass. Proteins can also be separated
electrophoretically on the basis of net charge by isoelectric focusing in a pH
gradient.
40.3 Determining Primary Structure Facilitates an Understanding of Protein
Function
The amino acid composition of a protein can be ascertained by hydrolyz-
ing the protein into its constituent amino acids. The amino acids can be
separated by ion-exchange chromatography and quantitated by their reac-
tion with fluorescamine. Amino acid sequences can be determined by
Edman degradation, which removes one amino acid at a time from the
amino end of a peptide. Phenyl isothiocyanate reacts with the terminal
amino group to form a phenylthiohydantoin–amino acid and a peptide
shortened by one residue. Longer polypeptide chains are broken into
shorter ones for analysis by specifically cleaving them with a reagent that
breaks the peptide at specific sites. Amino acid sequences are rich in infor-
mation concerning the kinship of proteins, their evolutionary relations, and
diseases produced by mutations. Knowledge of a sequence provides valu-
able clues to conformation and function.
Key Terms
proteome (p. 623) ion-exchange chromatography isoelectric focusing (p. 628)
assay (p. 623) (p. 625) two-dimensional electrophoresis
homogenate (p. 625) affinity chromatography (p. 626) (p. 629)
salting in (p. 625) high-pressure liquid chromatography Edman degradation (p. 631)
salting out (p. 625) (HPLC) (p. 626) phenyl isothiocyanate (p. 632)
dialysis (p. 625) gel electrophoresis (p. 627) overlap peptide (p. 633)
gel-filtration chromatography (p. 625) isoelectric point (p. 628)
Problems
1. Salting out. Why do proteins precipitate at high salt (b) Under what conditions might the statement in part a be
concentrations? incorrect?
2. Salting in. Although many proteins precipitate at high salt (c) Some proteins migrate anomalously in SDS-PAGE gels.
concentrations, some proteins require salt in order to dissolve For instance, the molecular weight determined from an
in water. Explain why some proteins require salt to dissolve. SDS-PAGE gel is sometimes very different from the molec-
3. Competition for water. What types of R groups would ular weight determined from the amino acid sequence. Sug-
compete with salt ions for water of solvation? gest an explanation for this discrepancy.
4. Column choice. (a) The octapeptide AVGWRVKS was 10. A question of efficiency. The Edman method of protein
digested with the enzyme trypsin. Would ion-exchange or sequencing can be used to determine the sequence of pro-
gel-filtration chromatography be most appropriate for sep- teins no longer than approximately 50 amino acids. Why is
arating the products? Explain. (b) Suppose that the peptide this length limitation the case?
had, instead, been digested with chymotrypsin. What would
be the optimal separation technique? Explain. Chapter Integration Problem
11. Quaternary structure. A protein was purified to homo-
5. Frequently used in shampoos. The detergent sodium
geneity. Determination of the mass by gel-filtration chro-
dodecyl sulfate (SDS) denatures proteins. Suggest how SDS
matography yields 60 kd. Chromatography in the presence
destroys protein structure.
of urea yields a 30-kd species. When the chromatography is
6. Making more enzyme? In the course of purifying an repeated in the presence of urea and -mercaptoethanol, a
enzyme, a researcher performs a purification step that single molecular species of 15 kd results. Describe the struc-
results in an increase in the total activity to a value greater ture of the molecule.
than that present in the original crude extract. Explain how
the amount of total activity might increase. Data Interpretation Problems
7. Protein purification problem. Complete the following 12. Protein sequencing 1. Determine the sequence of hexa-
table. peptide on the basis of the following data. Note: When the
Specific sequence is not known, a comma separates the amino acids.
Total Total activity Purifi- (See Table 40.2.)
Purification protein activity (units cation Yield Amino acid composition: (2R,A,S,V,Y)
procedure (mg) (units) mg-1) level (%)
N-terminal analysis of the hexapeptide: A
Crude 20,000 4,000,000 1 100 Trypsin digestion: (R,A,V) and (R,S,Y)
extract
Carboxypeptidase digestion: no digestion
(NH4)2SO4 5,000 3,000,000
precipitation
Chymotrypsin digestion: (A,R,V,Y) and (R,S)
DEAE– 1,500 1,000,000 13. Protein sequencing 2. Determine the sequence of a pep-
cellulose tide consisting of 14 amino acids on the basis of the follow-
chromatography ing data.
Gel-filtration 500 750,000 Amino acid composition: (4S,2L,F,G,I,K,M,T,W,Y)
chromatography N-terminal analysis: S
Affinity 45 675,000 Carboxypeptidase digestion: L
chromatography Trypsin digestion: (3S,2L,F,I,M,T,W) (G,K,S,Y)
Chymotrypsin digestion: (F,I,S) (G,K,L) (L,S) (M,T)
8. Dialysis. Suppose that you precipitate a protein with (S,W) (S,Y)
1 M (NH4)2SO4, and you wish to reduce the concentration N-terminal analysis of (F,I,S) peptide: S
of the (NH4)2SO4. You take 1 ml of your sample and dialyze Cyanogen bromide treatment: (2S,F,G,I,K,L,M*,T,Y)
it in 1000 ml of buffer. At the end of dialysis, what is the con- (2S,L,W)
centration of (NH4)2SO4 in your sample? How could you M*, methionine detected as homoserine
further lower the (NH4)2SO4 concentration?
9. Charge to mass. (a) Proteins treated with a sulfhydryl
reagent such as -mercaptoethanol and dissolved in sodium
dodecyl sulfate have the same charge-to-mass ratio. Explain.
Tymo_c40_620-637hr 2-12-2008 10:12 Page 637
Answers to Problems
1. If the salt concentration becomes too high, the salt ions 8. The sample was diluted 1000-fold. The concentration
interact with the water molecules. Eventually, there are not after dialysis is thus 0.001 M or 1 mM. You could reduce the
enough water molecules to interact with the protein, and salt concentration by dialyzing your sample, now 1 mM, in
the protein precipitates. more buffer free of (NH4)2SO4.
2. If there is lack of salt in a protein solution, the proteins 9. (a) Because one SDS molecule binds to a protein for
may interact with one another—the positive charges on every two amino acids in the proteins, in principle, all pro-
one protein with the negative charges on another or several teins will have the same charge-to-mass ratio. For instance,
others. Such an aggregate becomes too large to be solublized a protein consisting of 200 amino acids will bind 100 SDS
by water alone. If salt is added, the salt neutralizes the charges molecules, whereas a protein consisting of 400 amino acids
on the proteins, preventing protein–protein interactions. will bind 200 SDS molecules. The average mass of an amino
3. Charged and polar R groups on the surface of the acid is 110, and there is one negative charge per SDS mole-
enzyme. cule. Thus, the charge-to-mass ratio of both proteins is the
4. (a) Trypsin cleaves after arginine (R) and lysine (K), same—0.0045. (b) The statement might be incorrect if the
generating AVGWR, VK, and S. Because they differ in size, protein contains many charged amino acids. (c) The protein
these products could be separated by molecular exclusion may be modified. For instance, serine, threonine, and tyro-
chromatography. sine may have phosphoryl groups attached.
(b) Chymotrypsin, which cleaves after large aliphatic or 10. Because the cleavage does not occur every time for each
aromatic R groups, generates two peptides of equal size peptide being sequenced. Consequently, after many repeti-
(AVGW) and (RVKS). Separation based on size would not be tions (approximately 50), many different peptides are
effective. The peptide RVKS has two positive charges (R and releasing different amino acids at the same time. To illus-
K), whereas the other peptide is neutral. Therefore, the two trate this point, assume that each sequencing step is 98%
products could be separated by ion-exchange chromatography. efficient. The proportion of correct amino acids released
after 50 rounds is 0.9850, or 0.4—a hopelessly impure mix.
5. The long hydrophobic tail on the SDS molecule (p. 627)
disrupts the hydrophobic interactions in the interior of the 11. Treatment with urea disrupts noncovalent bonds. Thus,
protein. The protein unfolds, with the hydrophobic R groups the original 60-kd protein must be made of two 30-kd sub-
now interacting with the SDS rather than with one another. units. When these subunits are treated with urea and
-mercaptoethanol, a single 15-kd species results, suggest-
6. An inhibitor of the enzyme being purified might have
ing that disulfide bonds link the 30-kd subunits.
been present and subsequently removed by a purification
step. This removal would lead to an apparent increase in the 12. N terminal: A
total amount of enzyme present. Trypsin digestion: Cleaves at R. Only two peptides are pro-
duced. Therefore, one R must be internal and the other
7.
Specific
must be the C-terminal amino acid. Because A is N termi-
Total Total activity Purifi- nal, the sequence of one of the peptides is AVR.
Purification protein activity (units cation Yield Carboxypeptidase digestion: No digestion confirms that R
procedure (mg) (units) mg–1) level (%) is the C-terminal amino acid.
Chymotrypsin digestion: Cleaves only at Y. Combined with
Crude 20,000 4,000,000 200 1 100 the preceding information, chymotrypsin digestion tells us
extract that the sequences of the two peptides are AVRY and SR.
(NH4)2SO4 5,000 3,000,000 600 3 75 Thus the complete peptide is AVRYSR.
precipitation
DEAE– 1,500 1,000,000 667 3.3 25
13. First amino acid: S
cellulose Last amino acid: L
chromatography Cyanogen bromide cleavage: M is 10th position, C-terminal
residues are: (2S,L,W)
Size- 500 750,000 1,500 7.5 19
exclusion
N-terminal residues: (G,K,S,Y), tryptic peptide, ends in K
chromatography N-terminal sequence: SYGK
Chymotryptic peptide order: (S,Y), (G,K,L), (F,I,S), (M,T),
Affinity 45 675,000 15,000 75 17
(S,W), (S,L)
chromatography
Sequence: SYGKLSIFTMSWSL