
BIOLOGICAL AND BIOMEDICAL INFRARED SPECTROSCOPY

Advances in Biomedical Spectroscopy


Spectroscopic methods play an increasingly important role in studying the molecular details of complex biological systems in health and disease. However, no single spectroscopic method can provide all the desired information on aspects of molecular structure and function in a biological system. Choice of technique will depend on circumstance: some techniques can be carried out both in vivo and in vitro, others cannot; some have timescales of seconds and others of picoseconds; some require use of a perturbing probe molecule while others do not. Each volume in this series will provide a state-of-the-art account of an individual spectroscopic technique in detail. Theoretical and practical aspects of each technique, as applied to the characterisation of biological and biomedical systems, will be comprehensively covered so as to highlight advantages, disadvantages, practical limitations and future potential. The volumes are intended for use by research workers in both academic and applied research, and by graduate students working on biological or biomedical problems.

Series Editor:
Dr. Parvez I. Haris
De Montfort University, Leicester, United Kingdom

Volume 2
Recently published in this series
Vol. 1. B.A. Wallace and R.W. Janes (Eds.), Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism Spectroscopy

ISSN 1875-0656

Biological and Biomedical Infrared Spectroscopy

Edited by

Andreas Barth
Stockholm University, Stockholm, Sweden

and

Parvez I. Haris
De Montfort University, Leicester, UK

Amsterdam • Berlin • Tokyo • Washington, DC

© 2009 The authors and IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-60750-045-2
Library of Congress Control Number: 2009932637

Publisher
IOS Press BV
Nieuwe Hemweg 6B
1013 BG Amsterdam
Netherlands
fax: +31 20 687 0019
e-mail: order@iospress.nl

Distributor in the UK and Ireland
Gazelle Books Services Ltd.
White Cross Mills
Hightown
Lancaster LA1 4XS
United Kingdom
fax: +44 1524 63232
e-mail: sales@gazellebooks.co.uk

Distributor in the USA and Canada
IOS Press, Inc.
4502 Rachael Manor Drive
Fairfax, VA 22032
USA
fax: +1 703 323 3668
e-mail: iosbooks@iospress.com

LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.

PRINTED IN THE NETHERLANDS

Biological and Biomedical Infrared Spectroscopy A. Barth and P.I. Haris (Eds.) IOS Press, 2009 2009 The authors and IOS Press. All rights reserved.

Series Preface
In the post-genomic era there is a great need to understand the structure and dynamics of macromolecules, not just single molecules but also their multiple interactions as part of a systems biology approach. It is therefore not surprising that in recent years several Nobel Prizes have been awarded to scientists who have developed well-established analytical techniques for the study of biological and medical systems; these include mass spectrometry, NMR spectroscopy and magnetic resonance imaging. There is no doubt that the development of new analytical techniques and the effective utilisation of existing methods are vital for obtaining a better picture of the molecular details of complex biological systems in both health and disease. Such progress is important for disease diagnosis and drug discovery processes. However, the complexity of biological systems is such that no single experimental method can provide information on all aspects of molecular structure and function. There are a large number of spectroscopic methods that can be used in the analysis of biological systems. Some can be used to carry out analysis in both in vivo and in vitro settings whereas others are restricted, at least currently, to one particular environment. The timescales of many of these techniques can be very different. Some require the use of potentially perturbing probe molecules, whereas others do not. Clearly, no single technique is perfect and each has its respective advantages and disadvantages. Consequently, a serious scientist would not be fully satisfied with the analysis of a particular system based on results from a single technique. Ideally, one should use a battery of techniques before drawing a final conclusion.
Considering the wide array of techniques available for analysis of biological systems, producing a single book on one particular spectroscopic technique would not be sufficient to meet the needs of scientists engaged in understanding biological molecules and their interactions. Therefore, I decided to produce a series of books on emerging and established spectroscopic methods to serve the needs of academics, industrial scientists and graduate students who are currently using or seeking to use a particular spectroscopic method in their research work. The books are intended to present advances in theoretical and practical aspects of each technique, as applied to the characterisation of biological and biomedical systems, highlighting advantages, disadvantages and potential pitfalls. The first volume of the series provides a comprehensive discussion of the state-of-the-art methods in Circular Dichroism spectroscopic analysis of biological systems. The volume was edited by Bonnie Ann Wallace (Birkbeck College, University of London) and Robert William Janes (Queen Mary & Westfield College, University of London). Bonnie Wallace has been awarded the 2010 AstraZeneca Award by the Biochemical Society, UK and the 2010 Interdisciplinary Award by the Royal Society of Chemistry, UK for her work on the development of Synchrotron Radiation-based Circular Dichroism Spectroscopy for biological studies. The current volume is devoted to the application of infrared spectroscopy in biological and biomedical studies. It is edited by Andreas Barth (Stockholm University) and myself and brings together contributions from leading experts in infrared spectroscopy.


Finally, I would like to thank all the editors for their hard work in bringing together leading experts in their field to make contributions that ultimately result in the production of each volume in the series.

Parvez I. Haris
Leicester, United Kingdom


Preface
This book aims to provide an insight into some of the key areas where infrared spectroscopy has been successfully applied to understand important biological and biomedical processes. It highlights the latest advances and the directions for the future. The book provides a historical framework for the development of biological infrared spectroscopy. Key methodologies that are in current use and latest advances, in both theoretical and practical aspects, are discussed. Examples of applications, ranging from characterisation of individual macromolecules (DNA, RNA, lipids, proteins) to complex systems such as human tissues, cells and whole organisms are covered. The main focus is in the mid-infrared region as the vast majority of studies are conducted in this region. However, there is increasing use of the near-infrared region for biomedical application and hence a chapter is devoted to this part of the infrared spectrum. Biological spectroscopy is a highly interdisciplinary field of research requiring involvement of life scientists and analytical chemists. Advances in instrumentation technology and methods for analysis and interpretation of the spectroscopic data require input from multiple disciplines including Chemistry, Physics, Mathematics, Computer Science and Engineering. It is this co-operation between scientists from diverse disciplines that ultimately results in the utilisation of a physical technique for understanding the molecular details of biological processes and systems. Such co-operation is vital if spectroscopists are to play a significant role in the analysis of the vast number of genes and proteins that are being identified by the various genome sequencing projects. Currently, it is not impossible for a gene sequencing laboratory to produce as much data in less than a week as was produced by Shakespeare in his entire life-time. 
However, an understanding of the molecular details of the genes and proteins identified, and their diverse interactions, requires the application of biophysical techniques such as infrared spectroscopy. Continued technological development in spectroscopic methods is vital to keep pace with the breathtaking advances in the field of molecular biology.

Nearly 400 years ago Shakespeare described the seven ages of life in the following manner: "All the world's a stage, And all the men and women merely players: They have their exits and their entrances; And one man in his time plays many parts, His acts being seven ages." Using this as an analogy, Laitinen in 1973 wrote an editorial in Analytical Chemistry describing the seven ages of an analytical method (H.A. Laitinen, Anal. Chem. 45 (1973) 2305). He used infrared spectroscopy as an example to illustrate how it had reached its seventh age. His description of this seventh age is as follows: "Seventh, a period of senescence occurs as other methods of greater speed, economy, convenience, sensitivity, selectivity, etc., surpass the method under consideration." It is surprising that Laitinen chose infrared spectroscopy as his example, since at that time the first commercial Fourier transform infrared spectrometers were being delivered to laboratories around the world. As such it was a very exciting time for infrared spectroscopy. Indeed, a year earlier, in 1972, Peter Griffiths published a letter in the same journal entitled "Trading rules in infrared Fourier transform spectroscopy" (P.R. Griffiths, Anal. Chem., 44 (1972), 1909). As an editor of the journal, Laitinen must have been aware of the revolution taking place in infrared spectroscopy. The widespread availability of FT instruments and the use of computers for recording and analysis of infrared spectra heralded a new era in infrared spectroscopy. It was now possible to analyse biological molecules, in aqueous media, at high speeds and at high resolutions, which was virtually impossible with dispersive instruments.

Far from reaching its seventh age, infrared spectroscopy is a vibrant methodology playing a central role in some of the latest discoveries in biology and medicine, including some recent Nobel Prize winning work. For example, Stanley Prusiner was awarded the Nobel Prize for Physiology or Medicine in 1997 and infrared spectroscopy played an important role in his work. In a section of his Nobel lecture (S.B. Prusiner, Proc. Natl. Acad. Sci. USA, 95 (1998), 13363–13383) he states the following: "For more than 25 years, it had been widely accepted that the amino acid sequence specifies one biologically active conformation of a protein. Yet in scrapie we were faced with the possibility that one primary structure for PrP might adopt at least two different conformations to explain the existence of both PrPC and PrPSc. When the secondary structures of the PrP isoforms were compared by optical spectroscopy, they were found to be markedly different. Fourier-transform infrared (FTIR) and circular dichroism (CD) studies showed that PrPC contains about 40% α-helix and little β-sheet, whereas PrPSc is composed of about 30% α-helix and 45% β-sheet. Nevertheless, these two proteins have the same amino acid sequence!"
It is noteworthy that the abnormal form of the prion protein (PrPSc) misfolds and forms aggregates that are virtually impossible to characterise using X-ray crystallography, NMR and CD spectroscopy. In order to overcome this problem, Prusiner and co-workers used infrared spectroscopy to obtain direct evidence for an increase in β-sheet structure in the PrPSc aggregates. In recent years infrared spectroscopy has been going through a renaissance catalysed by some exciting developments in technology. These include the use of bright synchrotron radiation for recording infrared spectra. The latest breakthroughs also include the development of two-dimensional infrared spectroscopy and the ability to record infrared spectra at ultrafast speeds. There are also some major advances in theoretical analysis that are enabling a better interpretation of the infrared spectra of biological molecules. Considering these advances, we felt it would be timely to produce a book that brings together some of the key developments in the field. The book is intended for both experts and those who are new to the field of biological infrared spectroscopy. It should be particularly beneficial for graduate students and research scientists in both industry and academia. Finally, we would like to thank all the authors who have contributed to this volume. Without their co-operation it would not have been possible to accomplish this task.

Andreas Barth (Stockholm University, Stockholm, Sweden)
Parvez I. Haris (De Montfort University, Leicester, UK)


List of Contributors
Tsutomu ARAKAWA
    Alliance Protein Laboratory, Inc., 3957 Corte Cancion, Thousand Oaks, CA 91360, USA
Andreas BARTH
    Department of Biochemistry and Biophysics, The Arrhenius Laboratories for Natural Sciences, Stockholm University, S-10691, Stockholm, Sweden
Petr BOUŘ
    Institute of Organic Chemistry and Biochemistry, Academy of Sciences, Czech Republic
Sirinart CHIO-SRICHAN
    SOLEIL Synchrotron, Saint-Aubin BP 48, 91192 Gif-sur-Yvette Cedex, France
Jun-Ho CHOI
    Department of Chemistry and Center for Multidimensional Spectroscopy, Korea University, Seoul 136-701, Korea
Minhaeng CHO
    Department of Chemistry and Center for Multidimensional Spectroscopy, Korea University, Seoul 136-701, Korea
Paul DUMAS
    SOLEIL Synchrotron, Saint-Aubin BP 48, 91192 Gif-sur-Yvette Cedex, France
Heinz FABIAN
    Robert Koch-Institute, Nordufer 20, D-13353 Berlin
Michael D. FAYER
    Department of Chemistry, Stanford University, Stanford, CA 94305-5080, USA
Ilya J. FINKELSTEIN
    Department of Chemistry, Stanford University, Stanford, CA 94305-5080, USA
Erik GOORMAGHTIGH
    Laboratory for the Structure and Function of Biological Membranes, Center for Structural Biology and Bioinformatics, Université Libre de Bruxelles, CP 206/2, Boulevard du Triomphe, B-1050 Brussels, Belgium
Parvez I. HARIS
    Faculty of Health & Life Sciences, De Montfort University, Leicester, UK
Joachim A. HERING
    Department of Computer Science, University of Applied Sciences Ulm, Prittwitzstraße 10, 89075 Ulm, Germany
Haruto ISHIKAWA
    Department of Chemistry, Stanford University, Stanford, CA 94305-5080, USA
David JOLY
    Department of Chemistry-Biology, University of Québec at Trois-Rivières, C.P. 500, Trois-Rivières (Québec) Canada
Lene JORGENSEN
    Biomacromolecular Drug Delivery, Department of Pharmaceutics and Analytical Chemistry, Faculty of Pharmaceutical Sciences, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen, Denmark
Timothy A. KEIDERLING
    Department of Chemistry, University of Illinois at Chicago, USA
Seongheun KIM
    Department of Chemistry, Stanford University, Stanford, CA 94305-5080, USA
Jan KUBELKA
    Department of Chemistry, University of Wyoming, USA
Peter LASCH
    Robert Koch-Institute, Nordufer 20, D-13353 Berlin, Germany
Tiansheng LI
    PacificBio Inc., 1152 Tourmalin Drive, Newbury Park, CA 91320, USA
Andrew MACNAB
    Faculty of Medicine, Departments of Pediatrics and Urologic Sciences, Director Near-Infrared Study Group, University of British Columbia, Bladder Care Centre, Canada
Lisa M. MILLER
    National Synchrotron Light Source, Brookhaven National Laboratory, Upton, NY 11973 USA
Dieter NAUMANN
    Robert Koch-Institute, Nordufer 20, D-13353 Berlin, Germany
Christophe N. NSOUKPO-KOSSI
    Department of Chemistry-Biology, University of Québec at Trois-Rivières, C.P. 500, Trois-Rivières (Québec) Canada G9A 5H7
Heidar-Ali TAJMIR-RIAHI
    Department of Chemistry-Biology, University of Québec at Trois-Rivières, C.P. 500, Trois-Rivières (Québec) Canada G9A 5H7
Mark J. TOBIN
    Australian Synchrotron, 800 Blackburn Road, Clayton, Victoria, 3168, Australia
Marco VAN DE WEERT
    Biomacromolecular Drug Delivery, Department of Pharmaceutics and Analytical Chemistry, Faculty of Pharmaceutical Sciences, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen, Denmark
Willem F. WOLKERS
    Institute of Multiphase Processes, Leibniz Universität Hannover, Hannover, Germany


Contents
Series Preface
    Parvez I. Haris
Preface
    Andreas Barth and Parvez I. Haris
List of Contributors
Infrared Spectroscopy Past and Present
    Andreas Barth and Parvez Haris
The Study of Protein Reactions by Reaction-Induced Infrared Difference Spectroscopy
    Andreas Barth
Ultrafast 2D-IR Vibration Echo Spectroscopy of Proteins
    Haruto Ishikawa, Seongheun Kim, Ilya J. Finkelstein and Michael D. Fayer
FTIR Data Processing and Analysis Tools
    Erik Goormaghtigh
FTIR Spectroscopy for Analysis of Protein Secondary Structure
    Joachim A. Hering and Parvez I. Haris
Infrared Spectroscopy of Protein Pharmaceuticals
    Marco van de Weert and Lene Jorgensen
Quantum Mechanical Calculations of Peptide Vibrational Force Fields and Spectral Intensities
    Jan Kubelka, Petr Bouř and Timothy A. Keiderling
Computational Linear and Nonlinear IR Spectroscopy of Amide I Vibrations in Proteins
    Jun-Ho Choi and Minhaeng Cho
Application of Isotope-Edited FTIR Spectroscopy to the Study of Protein-Protein Interactions
    Tiansheng Li and Tsutomu Arakawa
Biomedical FTIR Spectroscopy of Lipids
    Willem F. Wolkers
Structural Analysis of Protein-DNA and Protein-RNA Interactions by FTIR Spectroscopy
    H.A. Tajmir-Riahi, C.N. Nsoukpo-Kossi and D. Joly
FTIR Spectroscopy of Cells, Tissues and Body Fluids
    Dieter Naumann, Heinz Fabian and Peter Lasch
Biomedical Applications of Near Infrared Spectroscopy
    Andrew Macnab
The Use of Synchrotron Radiation for Biomedical Applications of Infrared Microscopy
    Lisa M. Miller, Mark J. Tobin, Sirinart Chio-Srichan and Paul Dumas
Author Index

Biological and Biomedical Infrared Spectroscopy A. Barth and P.I. Haris (Eds.) IOS Press, 2009 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-045-2-1

Infrared Spectroscopy Past and Present


Andreas BARTH a,1 and Parvez HARIS b,2
a Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
b Faculty of Health and Life Sciences, De Montfort University, Leicester, United Kingdom

Abstract. The history of infrared spectroscopy as well as current technology and applications are reviewed.

Keywords. FTIR, infrared, spectroscopy, history, Herschel, Melloni, Langley

1 Corresponding Author: Andreas Barth, Department of Biochemistry and Biophysics, The Arrhenius Laboratories for Natural Sciences, Stockholm University, S-10691 Stockholm, Sweden; E-mail: barth@dbb.su.se.
2 Corresponding Author: Parvez I. Haris, Faculty of Health and Life Sciences, De Montfort University, Leicester, United Kingdom; E-mail: pharis@dmu.ac.uk.

1. Early Days

1.1. The Discovery

"It is sometimes of great use in natural philosophy, to doubt of things that are commonly taken for granted; especially as the means of resolving any doubt, when once it is entertained, are often within our reach" [1]. This timeless expression of critical thinking by William Herschel (1738–1822) introduces his four publications in 1800 dealing with "the rays that occasion heat" [2]. For the second article [3], he is generally credited with the discovery of infrared radiation, although some ascribe this to Carl Wilhelm Scheele (1742–1786) somewhat before 1777 or Marc-Auguste Pictet (1752–1825) in 1790 [4]. By 1800, Herschel was a respected astronomer, famous for his discovery of the planet Uranus in 1781. He was born in 1738 in Hanover, Germany [5,6] as one of ten children of a musician in the Hanoverian Guard Band. In 1757 he emigrated to England, where he later changed his name from Friedrich Wilhelm to William, which became Sir William in 1816 when he was knighted. Until 1782 he earned his living as a professional musician but became increasingly interested in astronomy, which was initially a hobby that he pursued together with his sister Caroline Lucretia Herschel (1750–1848). Not satisfied with the available and affordable telescopes [7], he constructed his own and became a professional telescope maker [6,8]. In 1782, the English king George III appointed him as his court astronomer, following Herschel's discovery of Uranus and his suggestion to name the newly discovered planet after the king. This position was associated with an income [5,6], which enabled him to pursue his astronomical studies without the distraction of earning his living with music. William Herschel was a dedicated researcher who is known to have worked once for three days and nights in a row without a break [6]. In 1787 Caroline also received a salary from the king as William's assistant [7], which made her the first woman in England in a paid scientific position [5]. She also pursued independent studies and received several distinctions for her scientific work in the later stage of her long life [5,7].

In Herschel's observations of the sun, the heat generated by reflecting telescopes was a problem. In other words, infrared radiation made itself known as a nuisance. This caused its entry into science, since Herschel's aim to reduce the heat [6,8,9] led to the discovery of invisible radiation beyond the red light. In his systematic study of the heat effect, he dispersed the solar spectrum with a prism and measured the heat with thermometers. The first article [1] explored the visible spectral range, as others did before him [10], solved the heat problem by introducing blue and green coloured glasses into the telescope and speculated on the possibility that the maximum of heat radiation is found beyond the red light. The second article, dated March 17, 1800, reported the detection of infrared radiation with the apparatus shown in Fig. 1: "the four last experiments prove, that the maximum of the heating power is vested among the invisible rays" [3]. After measuring the infrared spectrum at three different wavelengths (later he added a fourth data point), he then probed the ultraviolet region where no radiation had been detected before: "so fine a day, with regard to clearness of sky and perfect calmness, was not to be expected often, at this time of the year; I therefore hastened to make a trial of the other extreme of the prismatic spectrum." However, he found no effect.
Ultraviolet radiation was eventually discovered one year later by Johann Wilhelm Ritter (1776–1810) [10]. Herschel continued in his third article to prove that heat radiation obeys the optical laws of reflection and refraction and introduced a candle and a chimney fire as new sources of radiation [2]. In the fourth article [11], he measured spectral distributions of light and heat, observed that near-infrared radiation is less scattered than visible light, designed the first double beam instruments and used them to study the near-infrared absorption of several substances, including water and some alcoholic beverages. The double beam apparatus for the candle experiments is shown in Fig. 2.

Nowadays it is well appreciated that light sensation and heating effect are two aspects of the same kind of radiation. Light caused the temperature rise observed by Herschel in the visible spectral range. In the invisible spectral region beyond the red light, heating was due to infrared radiation, which differs from visible light only by its longer wavelength but not by its nature. It is invisible because our eyes are not sensitive to it, but it can be detected by its heating effect. All this was not obvious around 1800. Nevertheless, Herschel at first considered infrared radiation to be a spectral extension of visible light and used for it the expression "invisible light" [1] that differs only in "momentum" [1] from visible light: "radiant heat will at least partly, if not chiefly, consist, if I may be permitted the expression, of invisible light; that is to say, of rays coming from the sun, that have such a momentum as to be unfit for vision" [1]. The reason for being invisible was correctly attributed to the properties of the eye: "it is highly probable, that the organs of sight are only adapted to receive impressions from [light] particles of a certain momentum" [1]. This argument is repeated in the second article where the question "whether light be essentially different from radiant heat" is explicitly posed [3].


Figure 1. The experiment that discovered infrared radiation in 1800. A prism dispersed sunlight, the spectrum fell on a table and a moveable stand with mounted thermometers. Thermometers 1 and 2 were exposed to the radiation, whereas thermometer 3 served as a control [3].

Figure 2. The first double beam instrument. A candle served as light source, two thermometers as detectors. The sample was placed between one of the thermometers and the candle while the second thermometer served as control. The apparatus on the table (labelled Fig. 2) was used to cover the two holes between candle and thermometers simultaneously [11].

Figure 3. The spectra of heat radiation (shaded, labelled with S) and of light (labelled with R) as measured by Herschel [11]. The spectra are dispersed horizontally with wavelengths decreasing from left to right.

However, the experiments described in his fourth article [11] made him change his mind [10,12] and he regarded light and heat radiation as different phenomena thereafter. One argument for this arose from his measurement of the spectral distributions of brightness and heating effect, shown in Fig. 3, which seemed to indicate maxima at different spectral positions. Discussing the spectra from larger to smaller wavelengths he concluded that "those who would have the rays of heat also to do the office of light must be obliged to maintain the following arbitrary and revolting positions; namely, that a set of rays conveying heat, should all at once, in a certain part of the spectrum, begin to give a small degree of light; that this newly acquired power of illumination should increase, while the power of heating is on the decline; that when the illuminating principle is come to a maximum, it should, in its turn, also decline very rapidly, and vanish at the same time with the power of heating. How can effects that are so opposite be ascribed to the same cause? first of all, heat without light; next to this, decreasing heat, but increasing light; then again, decreasing heat and decreasing light" [11].

We know now why the maximum of the heat radiation lay in the infrared spectral region in Herschel's experiment. His thermometer bulb sampled a larger spectral range in the infrared than in the visible spectral range because the dispersion of glass prisms decreases with increasing wavelength. If this is corrected for, the maximum of the heat effect lies near 600 nm in the orange region of the spectrum [9,13].

The second argument for seeing light and heat radiation as independent phenomena stemmed from their different absorptions by glass filters and other substances. This is due to the wavelength dependence of absorption, a factor not considered in Herschel's long quantitative discussion of light and heat absorption. So he concluded that heat and light seem to be entirely unconnected. However, he was close to doing the decisive experiment, stating that the answer to the question "whether heat and light can be occasioned by the same rays or not" [11] lies in the visible spectral range: "it can only become a subject of inquiry, whether some of these heat-making rays may not have a power of rendering objects visible, superadded to their now already established power of heating bodies." So he asked "is the heat which has the refrangibility of the red rays occasioned by the light of these rays?" and studied the absorption of the heat of red light by various substances. Surprisingly, in light of the preceding meticulous experiments, he did not relate this effect quantitatively to the perceived brightness of the transmitted red light. Instead he resorted to hand-waving when discussing a dark-red glass that absorbed close to 70% of the heat generated by red light: "I am assured that red glass does not stop red rays. Indeed the appearance of objects seen through such coloured glasses will be a sufficient proof to every one that they transmit red light in abundance" [11]. Here he was probably deceived by the complicated non-linear relationship between light intensity and perceived brightness [14], which means that a relatively large loss of light intensity might remain unnoticed. Ignorant of this possible source of error he concluded: "here we have a direct and simple proof, in the case of the red glass, that the rays of light are transmitted, while those of heat are stopped, and that thus they have nothing in common but a certain equal degree of refrangibility" [11].
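
The dispersion argument given earlier for why Herschel's heat maximum appeared in the infrared can be sketched in one relation. The symbols are illustrative assumptions and do not appear in the original: x is the position along the dispersed spectrum, λ(x) the wavelength that the prism sends to that position, E_λ the solar power per unit wavelength, Δx the width of the thermometer bulb and P(x) the power it collects.

```latex
\[
P(x) \;\approx\; E_\lambda\bigl(\lambda(x)\bigr)\,\Delta\lambda
\;=\; E_\lambda\bigl(\lambda(x)\bigr)\,
\frac{\Delta x}{\left|\mathrm{d}x/\mathrm{d}\lambda\right|}
\]
```

Since the dispersion |dx/dλ| of a glass prism decreases with increasing wavelength, the factor 1/|dx/dλ| grows toward the infrared, so the maximum of P(x) lies at a longer wavelength than the maximum of E_λ. Dividing the thermometer readings by this factor would be one way of making the correction mentioned in the text.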
The separation of radiation into three different kinds, namely visible light, heat radiation and radiation producing chemical effects (UV light), was widely accepted during the first half of the 19th century [12,15] and later abandoned in favour of a unified theory of radiation. Early advocates of the latter were André Marie Ampère (1775–1836) between 1832 and 1835 [10,15,16], Macedonio Melloni (1798–1854) around 1842 [10,12,15,16] after a U-turn in his interpretation, and Sir John Frederick William Herschel (1792–1871), the only child of William, between 1835 and 1845 [16]. More detailed accounts of the arguments for and against the unified theory of radiation have been published [10,12,15,17]. When Herschel discovered infrared radiation, most scientists did not accept the wave theory of light and the concept that light of a certain colour corresponds to radiation of a certain wavelength. This did not change for more than a decade, even though the wavelength of visible light was measured by Thomas Young (1773–1829) shortly afterwards [10,18]. Since light was dispersed by prisms, its wavelength was not directly accessible in most of the infrared experiments until the 1950s. By observing interference fringes or by using gratings, wavelength information could be obtained. In this way, the calibrated spectral range was extended slowly throughout the 19th century to 1.445 µm in 1847 by Jean Bernard Leon Foucault (1819–1868) and Armand Hippolyte Louis Fizeau (1819–1896) [10], to 1.9 µm in 1859 by J. Müller [19], to 7 µm in the mid-infrared range (2.5 µm to 50 µm) by Paul Quentin Desains (1817–1885) and Pierre Curie (1859–1906) in 1880 [10,19], and to 150 µm in the far-infrared range (50 µm to 1000 µm) in 1897 by Heinrich Rubens [10,19]. The mid-infrared spectral range attracted most attention thereafter, whereas only a handful of articles were published in the far-infrared range up to 1938 [20] and only 91 publications in the near-infrared range until 1950 [21].
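
The wavelength limits quoted above are often expressed as wavenumbers in infrared work. As a minimal sketch, the conversion can be written as follows; the function and variable names are hypothetical and chosen only for illustration, and the numerical limits are the ones quoted in the text.

```python
"""Convert the wavelength limits quoted in the text to wavenumbers."""


def wavelength_um_to_wavenumber_cm1(wavelength_um: float) -> float:
    """Return the wavenumber in cm^-1 for a wavelength given in micrometres."""
    if wavelength_um <= 0:
        raise ValueError("wavelength must be positive")
    # 1 cm = 10^4 um, so wavenumber [cm^-1] = 10^4 / wavelength [um]
    return 1.0e4 / wavelength_um


# Range limits as quoted in the text (micrometres); names are illustrative.
BANDS_UM = {
    "mid-infrared": (2.5, 50.0),
    "far-infrared": (50.0, 1000.0),
}

# Longest calibrated wavelength by year, as quoted in the text (micrometres).
MILESTONES_UM = {1847: 1.445, 1859: 1.9, 1880: 7.0, 1897: 150.0}


def band_in_wavenumbers(name: str) -> tuple[float, float]:
    """Return (high, low) wavenumber limits in cm^-1 for a named band."""
    short_um, long_um = BANDS_UM[name]
    return (
        wavelength_um_to_wavenumber_cm1(short_um),
        wavelength_um_to_wavenumber_cm1(long_um),
    )


if __name__ == "__main__":
    for name in BANDS_UM:
        high, low = band_in_wavenumbers(name)
        print(f"{name}: {high:g} to {low:g} cm^-1")
    for year, wavelength in MILESTONES_UM.items():
        wavenumber = wavelength_um_to_wavenumber_cm1(wavelength)
        print(f"{year}: {wavelength} um = {wavenumber:.0f} cm^-1")
```

With these limits the mid-infrared range of 2.5 µm to 50 µm corresponds to 4000 to 200 cm^-1, and the far-infrared range of 50 µm to 1000 µm to 200 to 10 cm^-1.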


1.2. The Long Way to Modern Instruments Sir John Frederick William Herschel (17921871) followed in the footsteps of his father and became himself a renown astronomer. He devised an ingenious way to render infrared radiation visible and reported 1840 probably the first multichannel infrared spectrometer [22]. The publication deals primarily with photographic experiments and foresees already naturally coloured photographic images. Note III of this work is concerned with a process for rendering visible the calorific spectrum by its effect on paper properly prepared. The process is described as follows: It is well known to artists in water colours, that their tints, when freshly laid on and wet, are deeper and darker than they ultimately become on drying, a change which must be allowed for in the colouring, or the effect will be spoiled. If a paper so over-coloured be dried unequally, those parts which are dry first appear lighter than the rest. He used this evaporation effect to afford a visible picture of the thermic spectrum. In his experiment, dispersed sunlight fell on a specially prepared paper: one side of this paper is to be smoked in the flame of oil of turpentine, or over a candle burning with a smoky flame, by drawing it often and quickly through the flame, giving it time to cool between each exposure, till it is coated on the under side with a film of deposited black, as nearly uniform as possible. The paper presents its white side to the incident spectrum. Then a flat brush, equal in breadth to the paper, dipped in good rectified spirit of wine, is to be passed over the white surface till the paper is completely saturated, which will be indicated by its acquiring a uniform blackness in place of the white it at first exhibited. After a few moments exposure, a whitish spot begins to appear considerably below the extreme red end of the luminous spectrum, (supposing the violet end uppermost). As shown in Fig. 
4, five white spots were produced on the paper, indicating regions of transmission for solar radiation. Herschel already discussed that the atmospheres of the sun and the earth might cause the observed pattern: "The gaseous media through which the rays have reached their point of action, are the atmospheres of the sun and earth. The effect of the former is beyond our control, unless we could carry our experiments to such a point of delicacy as to operate separately on rays emanating from the centre and borders of the sun's disc. That of the earth's, though it cannot be eliminated any more than in the case of the sun's, may yet be varied to a considerable extent by experiments made at great elevations and under a vertical sun, and compared with others where the sun is more oblique, the situation lower, and the atmospheric pressure of a temporarily high amount. Should it be found that this cause is in reality concerned in the production of the spots, we should see reason to believe that a large portion of solar heat never reaches the earth's surface, and that what is incident on the summits of lofty mountains differs not only in quantity, but also in quality, from what the plains receive" [22]. As this example also demonstrates, exploration of the infrared spectral range was impeded by the lack of sensitive detectors. This situation improved slowly during the 19th century. The first two important innovations were made by Macedonio Melloni (1798–1854) in 1830 [24,25] and by Samuel Pierpont Langley (1834–1906) in 1880 [10,19,26]. Melloni developed a thermopile detector for heat radiation which was based on the discovery of the thermoelectric effect by Thomas Johann Seebeck (1770–1831) in 1821 [10] and which was inspired by a thermopile thermometer developed by his friend Leopoldo Nobili (1784–1835) by 1830 [24]. Melloni's detector could detect the radiation from a person 6–10 m away and was 40 times more sensitive than


Figure 4. The near-infrared spectrum of solar radiation recorded with an early multichannel infrared spectrometer in 1840 by Sir John Herschel [22]. Top (labelled Fig. 2): light intensity made visible by evaporation of alcohol from a soaked paper. The light is dispersed horizontally with longer wavelengths on the left hand side. The y-axis indicates yellow light (~580 nm). The five spots are transmission maxima of solar radiation in the near-infrared spectral region. The dark spaces between them are regions of absorption by atmospheric gases [23] (unless the glass prism also contributes), in particular by water vapour and CO2. Bottom (labelled Fig. 3): visible spectrum as seen with the naked eye [22]. A reasonable interpretation of the top spectrum seems to be that the waist between the first two spots, counted from the visible end, is due to the several atmospheric absorptions around 750 nm, and that the dark regions between the three following pairs of neighbouring spots are due to the water vapour absorptions around 930, 1130 and 1400 nm, respectively. The recorded spectrum therefore seems to extend out to 1600–1700 nm into the near-infrared spectral region. This interpretation is in line with the fact that shorter wavelengths produce a larger spread of a fixed wavelength interval on the recording paper than longer wavelengths. Between 400 and 580 nm the spread is estimated to be ~2-fold larger than between 580 and 750 nm and ~4-fold larger than between 750 and 930 nm, in accordance with known values [9,13].

thermometers [24,25]. In contrast to William Herschel's passing interest, Melloni studied heat radiation throughout his scientific career. Amongst his other achievements were the discovery of the transparency of rock salt (NaCl) for infrared radiation in 1833 [27–29], which was of eminent importance for expanding the accessible spectral range, the use of the first [8] rock salt prism in the first mid-infrared spectrometer in 1833 [27–29], and the detection of infrared radiation from the moon in 1846 [25] and of variations in the earth's atmosphere in 1852 [10,25]. The next milestone in the improvement of infrared detectors took place in 1880 [10,19,26] when Samuel Pierpont Langley (1834–1906), who later also became an aeronautic pioneer, developed his first bolometer. This detector measured radiation intensity via a resistance change of a small metal strip in a Wheatstone bridge, a measuring principle which had been demonstrated before by Adolf Ferdinand Svanberg (1806–1857) in 1851 [30]. Langley's bolometer could detect temperature differences of 10⁻⁵ °C [10,26]. Langley and his assistant Charles Greeley Abbot further improved the sensitivity 400-fold by 1898 and enabled detection of a cow at a distance of 400 m in 1901 or of a temperature difference of 10⁻⁸ °C [26,31]. Further technical development of dispersive infrared spectrometers is beyond the scope of this overview. However, a remark on the effort needed to obtain an infrared spectrum might be of interest. Langley measured more than two weeks for some of the data points of his solar spectrum in the early 1880s [10,26]. Coblentz around 1905 needed only 1.5 min per data point or four hours for a spectrum from 5000 to


670 cm⁻¹ [8,32], a situation which did not improve for nearly half a century [33,34] until the second generation of commercial instruments became available after the Second World War. They reduced the measuring time to 20 min [35]. This is still 1000 times slower than modern rapid scanning Fourier transform spectrometers, which can acquire a spectrum in as little as 10 ms or high quality spectra within seconds. The obstacles for the early investigators implied that only a few spectrometers world-wide were operated by specialised physicists for much of the first half of the 20th century. The instruments and their accessories were mostly custom-built. This is very different from the present situation where routine measurements can be run by personnel with little training and spectrometers as well as ample accessories are commercially available for the entire field of bioanalytics. The first commercial infrared spectrometer was produced in 1913 [35–37] by the English company Adam Hilger Ltd, which continued manufacturing infrared spectrometers until 1974 [37]. In 1936 American Cyanamid Co. started a small scale production of infrared spectrometers for industry [8,38]. During World War II, the US government commissioned infrared spectrometers from National Technical Laboratories (later Beckman Instruments; model IR-1 from 1942, see Fig. 5) and Perkin-Elmer (model 12A from 1944). Figure 6 shows the successor model 12C. "These early models were dc instruments that would operate only in humidity- and temperature-controlled atmospheres. Because central air conditioning was still a few years away, the first industrial spectroscopists tended to barricade themselves in their specially controlled rooms, letting no one else in lest the spectrum being recorded be ruined. As a result, spectroscopists earned the reputation of being recluses who spoke only to other spectroscopists" [38]. What it was like to operate these early commercial instruments is vividly described by F.A. 
Miller [35], using the example of the Perkin-Elmer instrument 12B, which was upgraded from model 12A by the addition of an automatic spectrum recorder. This simplified the measurements enormously compared to the previous manual recording of every data point. The instrument was a single beam spectrometer, which meant that the spectra with and without sample had to be recorded one after the other. The instrument was "not linear in anything useful – µm, cm⁻¹, %T, or absorbance. Extensive replotting of the raw data was therefore necessary to obtain a real spectrum" [35]. Marks that were automatically drawn on the abscissa of the chart paper had to be calibrated against gas reference spectra in order to translate them into wavelength or wavenumber. The ordinate recorded the signal of the thermocouple detector, which was determined only partly by sample absorption but also by lamp and spectrometer characteristics, atmospheric absorption and any other heat change close to the spectrometer. "If one lit a match and held it near the instrument housing, the pen moved" [35], or in a laboratory that was heated by steam radiators: "when the steam came on, the baseline climbed upward and went off the paper if zero was not reset" [35]. Therefore, the signal without measuring light had to be checked frequently and the measuring light intensity manually evaluated as the difference between the signal with and without measuring light. In consequence "one had to work hard to run and plot two spectra a day" [35]. An anecdote says that students of MIT, who had to evaluate the original recordings of such a spectrometer, complained to Arthur C. Cope, chairman of the department, but Cope said that it was good experience for them. Finally one day Cope said that he would replot a spectrum to show the students that he was willing to do what he asked of them. He took the material home that weekend, and the story is that when he came in on


Figure 5. Photograph of Beckman IR-1, one of the first commercial infrared spectrometers. Photograph courtesy of Heritage Exhibitions.

Monday he authorized the purchase of a Baird double-beam instrument [35]. This instrument avoided these troubles. The introduction of double-beam spectrometers with chopped infrared radiation shortly after World War II was a vast improvement in convenience and speed [35]. Detecting the signal with and without measuring light in rapid succession avoided the problem of thermal drift of the older, dc-operated instruments and increased the sensitivity. This and other improvements were greeted in 1948 by V.Z. Williams, who stated that "the infrared spectrometer is no longer a capricious instrument which must be housed in the sub-basement for mechanical and thermal stability. It has changed in appearance from the bulky, wax-sealed box and accessory equipment which required a fair sized room to [a] compact unit which [occupies] the space of a standard desk" [16]. The double beam spectrometers of the 1950s were expensive instruments. A review in 1963 [39] gives a price of 15 000 $ for a good spectrometer a couple of years earlier, a price that corresponds to 110 000 $ in May 2008 taking into account 640% inflation from January 1960 [40]. An example is the first widely used commercial infrared spectrometer, the Perkin-Elmer 21 (see Fig. 7), which was officially introduced in 1950. The design and flexibility of this double-beam spectrometer made infrared spectroscopy readily accessible to non-specialists. This led to a rapid growth in the application of infrared spectroscopy for analysis of diverse chemical and biological systems. It would not be incorrect


Figure 6. Photograph of RDB Fraser next to an infrared spectrometer in 1958 at the CSIRO Division of Protein Chemistry in Melbourne, Australia. This spectrometer is a Perkin-Elmer Model 12C modified for use with a selenium film polarizer, and a 0.8 NA reflecting microscope (the latter shown detached on the top of the pen recorder). Photograph courtesy of RDB Fraser.

Figure 7. Photograph of Juana Bellanato next to a Perkin-Elmer model 21 spectrometer. The photograph was taken in 1956 when she was working in the Physical Chemistry Institute of the University of Freiburg (Germany) as a postdoctoral researcher with Prof. Mecke and Dr. E. Schmid. She was one of the first scientists to apply infrared spectroscopy for characterisation of lipids. The photograph is reproduced with the permission of Juana Bellanato.


to state that this provided the basis for laying the early foundations of biological infrared spectroscopy. During the late 1940s and 1950s, there were a large number of publications that used dispersive infrared instruments to analyse nucleic acids, lipids, peptides, proteins, carbohydrates, bacteria, viruses and human tissues (see later). Indeed, most of the systems that are currently being studied using the sophisticated FT instruments had already been investigated using dispersive instruments in the late 1940s and 1950s.

1.3. Discovery of the Usefulness of Infrared Spectroscopy for Chemical Analysis

Infrared spectroscopy is one of the classical methods for structure determination of small molecules. This standing is due to its sensitivity to the chemical composition and architecture of molecules. Bond lengths and bond angles can be measured and bond distortions detected with picometer precision. Apart from that, redox state, interactions with the environment like hydrogen bonding and electric fields as well as conformational freedom are reflected in the spectra. This usefulness of infrared spectroscopy for chemical analysis was discovered mainly at the end of the 19th and the beginning of the 20th century. The first indication, however, came as early as 1833. Melloni studied the transmission of heat radiation through substances. He used light sources at different temperatures, which provided radiation with different spectral compositions according to Wien's displacement law. He found that each substance had characteristic transmission properties [27–29]. "With this it may be said that analytical infrared spectroscopy was born" [25]. Fifty years later, Sir William de Wiveleslie Abney (1843–1920), also known for his fundamental work in photography, pioneered the use of infrared spectroscopy for chemical analysis [8]. 
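Wien's displacement law, invoked above to explain why sources at different temperatures deliver radiation of different spectral composition, can be illustrated numerically. The following Python sketch is not taken from the sources cited in this chapter: the function name and the listed source temperatures are illustrative assumptions, chosen only to show the trend that hotter sources emit with a peak at shorter wavelength.

```python
# Wien's displacement law: lambda_max = b / T.
# b is the Wien displacement constant, expressed here in micrometre * kelvin.
WIEN_B_UM_K = 2897.771955

def wien_peak_um(temperature_k):
    """Return the wavelength (in micrometres) of peak blackbody emission."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return WIEN_B_UM_K / temperature_k

# Hypothetical source temperatures, NOT values documented for Melloni's
# experiments; they only span a range from warm to incandescent bodies.
example_sources_k = {
    "vessel of boiling water": 373.15,
    "heated metal plate": 673.15,
    "incandescent wire": 1500.0,
}

for name, temperature in example_sources_k.items():
    print(f"{name}: peak emission near {wien_peak_um(temperature):.2f} um")
```

Under these assumed temperatures the peak moves from the mid-infrared towards the near-infrared as the source gets hotter, which is why a substance can transmit the radiation of one source well and that of another poorly.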
In 1881, Abney and Edward Robert Festing published the first [21] near-infrared spectra of organic compounds, reporting amongst others the absorption of 48 organic liquids [41] up to about 1.3 µm [10]. In this spectral region predominantly the absorption due to overtones of C-H stretching vibrations was observed [8]. Bands specific for aromatic and ethyl groups were found which indicated the analytical potential of the method as pointed out by the authors: "We may say, however, it seems highly probable by this delicate mode of analysis that the hypothetical position of any hydrogen which is replaced may be identified, a point which is of prime importance in organic chemistry" and "It seems to us that the spectra leave as definite characters to read as are to be found in hieroglyphics, and we venture to think that we have given a clue to enable them to be deciphered" [41]. The characteristic absorption spectra provided even a tentative interpretation of the solar spectrum: "in two instances at least, a study of the absorption spectra of organic bodies has to some extent thrown a glimmering of meaning on some of the absorption lines of the solar spectrum …". One of the authors of this chapter (AB) was pleased to find out that Knut Johann Ångström (1857–1910) recorded one of the first [8] mid-infrared spectra of organic liquids and gases in 1890 [42] while being employed by the predecessor of AB's university, Stockholms Högskola. Three years later, William Henry Julius (1860–1925) also published mid-infrared spectra of organic compounds [8,32] and suggested that the absorption is due to internal motions in the molecule and that the internal structure of the molecule determines the spectrum [8]. The concept that many chemical groups absorb in narrow frequency bands was finally established by the work of William Weber Coblentz (1873–1962). Not satisfied with the available data he wrote in


1904 [43]: "The investigation of absorption spectra far into the infra-red has never been made in a thoroughly systematic manner. This is no doubt due to the enormous difficulties to be encountered and the slowness with which observational data can be obtained, so that usually after investigating half a dozen compounds, the results have been given to the public. As a consequence the agreement in the location of certain absorption bands are not always convincing." Accordingly he measured spectra of more than a hundred compounds around 1904 [43] and published a list of group frequencies [44]. Surprisingly, it took about 30 years until the usefulness of infrared spectroscopy began to be realised in industry [8,16,36,45] and in medical research [8] during the 1930s. The same time period saw the onset of biological work as described in the next section. The breakthrough of infrared spectroscopy came in World War II, where it was employed in three programs of the US and UK governments: for quality control in the production of synthetic rubber [4,8,35], for the analysis of petroleum [4,8,35], for example to trace the origin of gasoline used by the German air force [4,46], and to resolve the structure of penicillin [8,35,36]. As a consequence, the number of instruments rocketed from around 10 to several hundred in the USA [4,47,48]. After World War II, with commercial availability of infrared spectrometers, there was rapid growth in the use of infrared spectroscopy for analysis of organic molecules. The work from this period has been described in the following manner by Williams in 1951 [49]: The period since World War II has resulted in a steady growth throughout the infrared field. This growth has been mainly one of utilization and extension of wartime developments and techniques rather than one of fundamentally new discoveries. Even so, the process of utilization and extension has been so broad that the field has altered radically in the last five years. 
An example of this alteration is the acceptance of the infrared spectrum by the organic chemist. [The] issue of the Journal of the American Chemical Society, January, 1950, shows a large number of infrared spectra of organic materials …. The interesting point is that, in many cases, no description at all is given of the spectrometer or the sampling conditions used. At some sites, evidently, the infrared spectrometer has been so successful as to reach a stage of oblivion, and now ranks with the distillation column or gravimetric balance as standard laboratory furniture. Actually, this is to be expected since there are probably over 1000 infrared spectrometers in use today.

1.4. From Chemistry to Biology

The high information content in an infrared spectrum is of use also for biological systems. This makes infrared spectroscopy a valuable tool for the investigation of structure and function of biomolecules and of cells and tissue. One of the first infrared studies of biological systems was performed by Nobili and Melloni with their new, sensitive thermopile detector for heat radiation. They investigated more than 400 insects, discovered that "caterpillars have a higher temperature than the butterflies and chrysalides which proceed from them" [50] and related this to their higher metabolic activity: "the insect, in the first period of its life, where its nourishment is abundant and its growth rapid, converts into carbonic acid a much greater quantity of oxygen, than at subsequent periods. … the heat of the animal will vary, so to speak, proportionally to the quantity of oxygen employed in the act of respiration." Nobili and Melloni pointed out that their method was non-invasive and thus superior to previous experiments which


"did not give the temperature of the animal in the natural state, but in a maimed and suffering condition" whereas their "thermo-multiplier offers the means of repeating these experiments without incurring any of the inconveniences above alluded to" [50]. Even today, the non-invasiveness of infrared spectroscopy is one of its striking advantages, since no artificial spectroscopic probes have to be introduced into the sample that might alter its properties. The most abundant biological molecule, water, was exposed to infrared radiation already by William Herschel. Melloni studied amongst other substances cowhorn, citric acid, sugar and ice [27–29]. Ernest Fox Nichols (1869–1924) recorded near-infrared spectra of chlorophyll and hemoglobin in 1892 [51], which were, however, dominated by solvent absorption. Pioneering work was also done by Coblentz, who studied fatty acids in 1904 [43]. Between 1930 and 1950, at the same time as industry became interested in infrared spectroscopy, research in the mid-infrared spectral range started on all kinds of biomolecules and on larger biological systems: tissue in 1933 [52,53]; polysaccharides [54], amino acids (some of them in aqueous solution) [55,56], polypeptides [55], and proteins (some in water) in 1935 [54,57,58]; the simple fatty acid acetic acid in 1936 [59], longer fatty acids in 1940 [60], and steroids in 1946 [8]; vitamin C in aqueous solution in 1937 [61]; and nucleic acids in 1948 [62]. The penicillin program during World War II led to the elucidation of its structure by infrared spectroscopy [8,36]. Otherwise, the war years had a negative impact on biological infrared spectroscopy, as some of the scientists had to focus on war-related projects; this is evident from a survey of the literature published during that period. For example, in one paper published in 1946 by Furchgott et al. 
[63], the authors wrote the following as a footnote: "Infra-red analysis of the steroids was begun in this laboratory by Carl Herget and Ephraim Shorr, and was the subject of a brief report in 1941. At that time Dr. Herget left this work in order to engage in war research at the Underwater Sound laboratory, Harvard University." Over the last 30 years the application of infrared spectroscopy to biological problems has been accelerated by the following key factors, which will be discussed in more detail below.

1. Advances in instrumentation, especially the commercial availability of FTIR spectrometers, have been the single most important factor in the rapid growth in the application of infrared spectroscopy for biological analysis.

2. The use of computers has also been an important development since it made possible digital spectral subtraction, especially of the absorption of water, for obtaining difference spectra. However, it is important to stress that the use of computers for analysis of infrared spectra began before the advent of FT instruments.

3. Advances in chemical and molecular biological methods enabled the production of samples with targeted alterations for structure-activity and spectral interpretation studies. The ability to carry out site-directed mutagenesis, chemically synthesise peptides and introduce isotopically labelled groups probably had the greatest impact.

1.5. Emergence of FTIR Instruments and Computer Controlled Spectrometers Triggers Renaissance in Biological Infrared Spectroscopy

In the 1960s, Jones [64] and Savitzky [65] were pioneers in the development of mathematical and computational approaches for analysis of infrared spectral data. The effort of these scientists, and many others, led to the development of computer aided


dispersive instruments in the 1970s. This was happening at virtually the same time as Fourier transform infrared (FTIR) spectrometers were entering the market. It is interesting to note, however, that despite the appearance of FT instruments, some in the scientific community dismissed the idea that dispersive instruments were confined to history. As an example, James Mattson stated the following in one of his articles [66]: "Infrared spectrophotometry has long had the reputation of being semiquantitative. For this reason, some of the attention being lavished on Fourier transform infrared instrumentation is derived from misconceptions regarding conventional dispersive instrumentation." He then goes on to predict a bright future for dispersive instruments: "The dispersive instrument manufacturers are moving slowly into the world of today, an era of sophisticated, computer-based data acquisition and reduction. As they begin redesigning their 10-year old dispersive instrumentation with low-cost mini- or microcomputers, the field will enjoy a second childhood." However, the predictions of Mattson [66] did not come to fruition and virtually all spectrometer manufacturers are currently producing FT-instruments instead of dispersive instruments. Mattson is no longer engaged in scientific research but in a recent communication he stated the following about dispersive instruments: "About 10 years ago, I visited the Rosenstiel School of Marine & Atmospheric Science campus and saw my beloved Perkin-Elmer 180 sitting in a hallway because nobody knew what to do with it. That was a great machine. NSF (National Science Foundation) paid a lot of money for it. There was a world of things I could have done and would have done if that machine had been under my control. So many problems; so much to be done. I assume there are many research-quality spectrophotometers sitting around university laboratories doing nothing." 
Although dispersive instruments are no longer routinely used for recording infrared spectra, they are by no means obsolete as they can be valuable for specialised applications such as time-resolved studies. The first commercial FTIR spectrometer (Model FTS-14), totally computer controlled, was introduced by Digilab in 1969. This led to a surge of applications similar to what was observed when the first commercial dispersive infrared spectrometers gained widespread use in the late 1940s and through the 1950s. However, the difference this time was that the advance in instrumentation caught the attention of scientists who had been waiting to make use of infrared spectroscopy for analysis of biological systems in aqueous media. As such, it would not be incorrect to state that the advent of FTIR spectrometers had a greater impact on life science research than on chemical science. Indeed, the first FTIR paper to appear in the literature database (ISI) was a biological study reported in 1972 [67]. Figure 8 shows a Digilab FTIR spectrometer in James Alben's laboratory that has been used for studying protein-ligand interactions by infrared difference spectroscopy.

Figure 8. A photograph of a Digilab FTIR instrument with modified data system displaying CO bound to hemoglobin minus its photoproduct. The photograph, courtesy of James Alben, was taken in Alben's laboratory sometime in the early 1980s.

Recent communication with James Alben provided some insight into his experience of using both dispersive instruments and FT-instruments. Firstly, he described his early studies with dispersive instruments in the following manner: "The need to understand the roles of iron, copper and oxygen in respiration led to studies with Winslow Caughey at Johns Hopkins, on the coordination chemistry of transition metal porphyrin complexes, and the effects of ligand field strength. We characterized a range of porphyrin derivatives by infrared spectroscopy by use of a sodium chloride prism instrument (Perkin-Elmer Model 21) which allowed a 5x expansion of %Transmission. Inadequacies of measurement led to use of a Perkin-Elmer 400 grating instrument, courtesy of Ellis Lippincot at the University of Maryland, where we first observed carbon monoxide coordinated to iron in hemoglobin." An example of a Perkin-Elmer 21 instrument that Alben refers to is shown in Fig. 7 from a photograph taken in 1956. In the early 1970s, Alben had access to FT-instrumentation [67]. He states the following about his experience with FT-instruments: "A major breakthrough in spectral resolution, signal/noise, and baseline stability, came with incorporation of a Block Engineering interferometer into a bench-top instrument by Tom Dunn Associates (later to become Digilab Division of Block Engineering). This instrument was originally delivered with a Data General Nova minicomputer that contained 16 kilobytes of core memory (3 microsecond cycle time) but no disk drive. Three-fourths of that memory was eventually exchanged for a 128 kilobyte head-per-track disk drive that permitted data collection at 0.5 cm⁻¹ resolution and 32 bit Fourier transforms. The signal/noise obtained was ten-fold better than that of the best grating instrument." With the commercial availability of FT-instruments, an increasing number of scientists began using the technique for studying different types of biomolecules. The first application of FTIR spectroscopy to the analysis of lipids in aqueous media was reported by Jack Koenig's and Henry Mantsch's groups [68,69]. However, not everyone had the necessary funds to purchase a new FTIR spectrometer, and hence many leading groups continued to use their dispersive instruments well into the mid-1980s. Following is a quote from Michael Byler, who was working with Heino Susi, regarding his desire to purchase an FT-instrument (see photograph of Michael Byler in front of his new FTIR spectrometer, Fig. 9): At this time, commercial mid-IR Fourier-transform spectrometers (based on the principle of the Michelson interferometer) had been on the market for just a few years.


Figure 9. Photograph of Michael Byler, colleague of Heino Susi, proudly sitting in front of his newly acquired FTIR spectrometer at the U.S. Department of Agriculture, the Eastern Regional Research Center in Wyndmoor, PA in August 1981. Photograph courtesy of Michael Byler.

According to the literature, they offered numerous advantages over even the best dispersive instrument. But they were expensive: more than $80000, probably equivalent to at least $200000 in today's funds. At the time, USDA management had other spending priorities and Susi was rather sanguine about the prospect of them granting us such a sum of money. Even our second choice, a Perkin-Elmer 180, perhaps the best all round dispersive instrument then available, was priced at over $50000. Nonetheless each new budget year we added our request for a new IR to management's instrumentation wish list. Suddenly in 1980, sufficient funds became available to purchase a new Nicolet 7199 FTIR. Ironically, at the time both of us were so deeply involved with other assigned research, that we did not attempt our first protein spectrum until more than two years later. Interestingly, standard FTIR spectrometers are currently sold at the same nominal price as the dispersive spectrometers of the 1960s, implying that the better performance of the present instruments can be obtained at ~7-times lower cost. It would be plausible to state that the 1980s mark the starting point of the new era of biological infrared spectroscopy, with highly active research groups emerging in different parts of the world. Rapid data acquisition and greater access to instruments made infrared spectroscopy accessible to a wider community of scientists and not only to chemists and physicists. It was now much more common for infrared spectrometers to be found in biochemistry and life science laboratories in universities and research institutes around the world.
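The "~7-times lower cost" estimate can be retraced from the figures quoted earlier in this chapter: a price of 15 000 $ around 1960 and 640% inflation between January 1960 and May 2008. The short Python sketch below does this arithmetic. It assumes that "640% inflation" means a price increase of 640% of the original value (a factor of 7.4); the function and variable names are illustrative and not part of any cited source.

```python
def adjust_for_inflation(price, inflation_percent):
    """Scale a historical price by a cumulative inflation given in percent."""
    return price * (1.0 + inflation_percent / 100.0)

# Figures quoted in the chapter text (refs. [39] and [40]).
price_1960_usd = 15_000
inflation_1960_to_2008_percent = 640

price_in_2008_usd = adjust_for_inflation(
    price_1960_usd, inflation_1960_to_2008_percent
)

# If a modern instrument sells at the same nominal price as the 1960
# instrument, the real cost has dropped by this factor.
real_cost_ratio = price_in_2008_usd / price_1960_usd

print(f"1960 price in 2008 dollars: about {price_in_2008_usd:,.0f} $")
print(f"real cost ratio: about {real_cost_ratio:.1f}")
```

Under this reading the adjusted price is close to the 110 000 $ given in the text, and the ratio of roughly 7 matches the stated cost reduction.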


1.6. Use of Computers in Infrared Spectroscopy

As mentioned earlier, the use of computers in biological infrared spectroscopy has been a major contributor to the growth of biological applications. Even before the advent of FT-instruments, computer aided infrared spectroscopy demonstrated that difference spectra can be obtained by digitally subtracting one spectrum from another and that the broad infrared bands can be analysed using mathematical and computational tools to reveal subtle details of molecular structure. The following examples illustrate biomolecular studies with dispersive instruments coupled to microprocessors. The first also highlights the importance of collaboration between academia and industry, which played a significant role in advancing biological infrared spectroscopy. For example, Dennis Chapman at London University collaborated with Perkin-Elmer in England in his first studies of proteins and membranes in H2O [70]. Juan Gomez-Fernandez, a co-author of this paper, in recent communication with one of us, made the following remarks about the background to this study whilst working with Dennis Chapman in London (references removed): In 1978, Mantsch and his group began the application of FT-IR spectroscopy to the study of aqueous lipids and soon after they reported the possibility of water subtraction from aqueous lipid samples. When early in the summer of 1979 I returned to spend the summer period working with Dennis, I have seen these papers and I showed them to Dennis. He rapidly reacted realizing the wealth of possibilities that this technical advance could permit. At that moment Dennis did not know of any FT-IR spectrometer available to us, but he has a very good knowledge of Perkin-Elmer innovations, among other reasons because he lived in Beaconsfield at a very short distance of a Perkin-Elmer factory. Dennis knew that a grating infrared spectrometer controlled by a data station has been just introduced. 
It should be commented, at this point, that computers were in its infancy in 1979, and its use was still very rare. The data station was a very primitive computer, but it was sufficient to permit the digital acquisition of spectra and to work with them, performing, for example, subtractions. Dennis quickly called Mary Barnard, a scientist working for Perkin-Elmer, and he concerted a visit to the factory. Rapidly my colleague Felix Goñi and I prepared samples containing just lipids and also protein-lipids samples. In a few days we were taking our first spectrum.

The study by Chapman and co-workers described above [70] was conducted using thin-pathlength transmission cells. In contrast, Mattson et al. [71] used internal reflection spectrometry to record infrared spectra of proteins in aqueous solution. They also used a minicomputer, interfaced to a dispersive Perkin-Elmer 180 spectrometer, to obtain the protein spectra in H2O and to carry out digital subtraction of the overlapping water absorbance.

Despite the first encounters between computers and infrared spectrometers in the 1960s, it was not until the mid-1970s, with the advent of Fourier transform instruments, that the marriage between computers and infrared spectrometers truly began. It was now possible to record spectra very easily, and what was needed were mathematical tools to process and analyse the data. One of the fundamental advantages of computerisation has been the ability to digitally subtract the absorbance of H2O from spectra of aqueous samples and thereafter to analyse the broad infrared spectra of biomolecules with computational and statistical tools. The other major problem in biological infrared spectroscopy is the overlap of peaks arising from different structures and groups in complex macromolecules. Here again, the development of computer programmes for unmasking the complex band envelope played a pivotal role in proliferating the use of the technique for the analysis of biological molecules. In the 1980s the Fourier self-deconvolution method for analysis of infrared spectra was introduced by Kauppinen et al. [72]. This is a mathematical procedure for resolving overlapping component bands in a complex spectrum. It is also referred to as resolution enhancement, although the experimental spectral resolution remains unchanged. At virtually the same time, the FTIR spectrometer manufacturers were busy developing their own suites of software programmes to aid the analysis of infrared spectral data. Virtually all the FTIR manufacturers provided programmes for spectral subtraction, smoothing, derivative and deconvolution analysis. The availability of such software played an important role in broadening the application of the technique to the analysis of subtle changes in infrared spectra of biomolecules. Below is a quote from Michael Byler (with references removed) indicating the helpful role played by spectrometer manufacturers in providing software programmes for data analysis:

After checking with the manufacturer, I learned that a modified version of the deconvolution algorithm of Kauppinen et al. was available. We ordered the software and I soon learned how to apply it to a variety of spectra. About the same time, I had learned that comparable band narrowing could also be achieved by means of calculating the second derivative of a spectrum. Each method presented its own advantages and disadvantages, but for initial exploration, the theory and application of differentiation proved to be simpler. Unlike deconvolution, analytical calculation of a derivative of a spectrum requires no a priori knowledge of any band parameters.
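
The band narrowing by differentiation mentioned in the quote above can be illustrated with a small numerical sketch. It is not taken from any of the cited papers: the two Lorentzian bands, their positions (1645 and 1655 cm⁻¹), their widths and the helper names are all hypothetical choices made for illustration, and NumPy is assumed to be available. The two bands overlap so strongly that their sum shows a single maximum, yet the second derivative shows two separate minima close to the component positions.

```python
import numpy as np

def lorentzian(x, center, fwhm, height=1.0):
    """Lorentzian band; fwhm is the full width at half maximum."""
    hw = fwhm / 2.0
    return height * hw**2 / ((x - center) ** 2 + hw**2)

def second_derivative(y, dx):
    """Central-difference second derivative on a uniform grid.
    The two end points are left at zero."""
    d2 = np.zeros_like(y)
    d2[1:-1] = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / dx**2
    return d2

def local_minima(y, threshold):
    """Indices of interior points lower than both neighbours and below threshold."""
    mid = y[1:-1]
    mask = (mid < y[:-2]) & (mid < y[2:]) & (mid < threshold)
    return np.where(mask)[0] + 1

# Hypothetical synthetic spectrum: two equal bands 10 cm-1 apart, each 20 cm-1 wide.
step = 0.5
wavenumber = np.arange(1600.0, 1700.0 + step, step)
spectrum = (lorentzian(wavenumber, 1645.0, 20.0)
            + lorentzian(wavenumber, 1655.0, 20.0))

d2 = second_derivative(spectrum, step)

# Bands appear as negative peaks in the second derivative; keep only the
# pronounced minima (the 30 % cut-off is an arbitrary illustrative choice).
band_positions = wavenumber[local_minima(d2, threshold=0.3 * d2.min())]
apparent_maximum = wavenumber[np.argmax(spectrum)]
```

In this sketch `apparent_maximum` lies midway between the two components, whereas `band_positions` recovers both of them. Real spectra contain noise, which differentiation amplifies, so some smoothing is normally needed before a procedure of this kind is applied.
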
One of the first applications of resolution enhancement methods to the infrared analysis of proteins was made by Susi and Byler in 1983 [73]. They obtained second-derivative Fourier transform infrared spectra of native and denatured soluble proteins in deuterium oxide between 1350 and 1800 cm⁻¹. They state in their paper:

In the second derivative spectra, clearly resolved peaks are observed which can be associated with the alpha-helix, beta-strands, and turns. No protein spectra with such resolution have heretofore been reported. The data appear to present the first direct spectroscopic evidence of turns in a native protein.

This first paper was important in showing the usefulness of second-derivative analysis for proteins, and it prompted a plethora of studies during the late 1980s. The infrared analyses by Byler and Susi were conducted on proteins dissolved in 2H2O. The resolution enhancement techniques enabled a detailed analysis of bands, making it possible to compare spectra of proteins recorded in both H2O and 2H2O. As a consequence, the complications associated with the interpretation of spectral data for samples in 2H2O were highlighted for the first time by Olinger et al. [74]. The authors state:

This paper represents the first example of the use of deconvoluted Fourier transform infrared spectra in conjunction with hydrogen-deuterium exchange in order to aid in the assignment of a protein's infrared bands.

The paper by Haris et al. [75] states:

The results show that it is necessary to be cautious in making band assignments based on exchange methods unless the extent of exchange is known. Furthermore, it is seen that the combination of Fourier transform infrared spectroscopy and hydrogen-deuterium exchange is a powerful technique for revealing small differences in protein secondary structure.

Before ending this section, it is important to state that long before the use of the so-called resolution enhancement procedures, difference spectroscopy (see the section on difference spectroscopy and the Chapter by Barth) was already being used to probe subtle changes in biomolecular structure. However, these studies were mainly restricted to systems that can be triggered from one state to another without re-assembling the infrared cell, since re-assembly may change the pathlength and so produce artefacts. Difference spectroscopy became more popular with the development of FT instruments, since spectra with a high signal-to-noise ratio could be obtained rapidly.

2. Dealing with H2O in Infrared Analysis of Biological Systems

This section summarises the historical context of some of the key issues in dealing with H2O absorbance, which has been a major hindrance for biological infrared studies. The major challenge in recording infrared spectra of biological molecules was the strong absorbance of H2O over much of the mid-infrared region. Early infrared measurements in H2O were restricted to the near-infrared region and to selected window regions in the mid-infrared. To illustrate the problems encountered in biological infrared spectroscopy in the 1950s, we include below quotes from Barer, working at Oxford University, on his bid to use infrared microspectroscopy for biological studies. In a discussion of the Faraday Society, Barer [76] wrote the following on the problem of water absorbance and possible ways to overcome it:

The next question which is of considerable interest to the biologist is whether it will ever be possible to apply the method to the study of living cells. The difficulties here are formidable. With few exceptions, living cells must be examined in an aqueous medium, and all cells contain water. Water possesses a number of strong absorption bands in the infra-red region, and, indeed, workers in this field know all too well that the effect of even the small amount of water vapour normally present in the atmosphere can be very disturbing. It might conceivably be possible to work with extremely thin films of water and at wavelengths at which the absorption due to such films would not overshadow everything else. Another possibility is to use heavy water, which has a rather different absorption spectrum, provided that it did not affect the structure and viability of the cell. In this way, by the use of two different media it might be possible to derive the absorption spectrum of the object itself.
Another factor to be considered is the possible action of the absorbed radiation on the cell. I have carried out preliminary observations on the action of short-wave infra-red radiations on living cells and the results suggest that they do not tolerate this treatment very well. For all these reasons it must be admitted that the prospect of applying the method to living material is extremely remote.

Clearly, measurement in aqueous media has been a major hindrance to the application of infrared spectroscopy in biology, a fact that was repeatedly highlighted in many publications from the 1930s until the mid-1970s. The first recording of the infrared spectrum of H2O was reported in 1895 [77]. In 1911, Coblentz reported a study of the interaction between gelatin and water, investigating bands associated with both of these molecules [78]. After Coblentz, a number of scientists were very active in analysing the interaction between water and biomolecules. They published a series of papers between the late 1930s and the mid-1940s on proteins, carbohydrates and amino acids. For example, Buswell and Rodebush analysed biomolecules in water in the late 1930s [79]. Ellis & Bath [80] obtained near-infrared spectra of oven-dried gelatin and of gelatin saturated with H2O and 2H2O vapour, respectively. These authors continued their infrared studies on proteins and carbohydrates during the 1940s. Towards the late 1940s there was greater confidence in the ability to carry out analysis in H2O and 2H2O. 2H2O provides further window regions for assessing bands arising from biological molecules, which was appreciated in the late 1940s [81]. Gore et al. state the following in one of their articles [81]:

One of the widespread misconceptions concerning the application of infrared spectrometry in organic analysis is that it is impossible or at least difficult to obtain spectra of aqueous solutions. On the contrary, it is often of extreme value to observe spectra in aqueous solution, at various hydrogen ion concentrations, especially in the case of carboxylic acids and amino acids.

One of the key pioneers of the application of infrared spectroscopy to biomolecules in aqueous solution (2H2O) was Henri Lenormant [82], who was one of the first to record spectra of biomolecules in solution by dissolving them in 2H2O. Subsequently, he collaborated with Blout, which led to the publication of a number of papers on the analysis of biomolecules in aqueous systems [83]. Lenormant and Blout were very active in infrared measurements in 2H2O during the 1950s. In the late 1950s, Parker and co-workers [84,85] took advantage of advances in instrumentation and the availability of barium fluoride windows (for containing aqueous solutions) to obtain spectra of biomolecules in H2O. Parker reported infrared spectra of H2O solutions saturated with particular biomolecules, such as amino acids and proteins, so that peaks in the window region (approx. 1550–950 cm⁻¹), where H2O does not absorb strongly, could be monitored.

In the 1960s, Susi and co-workers [86] were among the first to attempt to analyse the amide I band, which occurs at virtually the same frequency as the O-H bending vibration of the H2O molecule. Up to that point, the vast majority of studies of biomolecules in H2O avoided regions containing strong absorption bands of water. The following text from the 1967 paper of Susi et al. [86] highlights the difficulties they had to overcome in order to observe the amide I band in H2O:

Measurements of amide I and amide II frequencies in H2O solution are possible only by extremely careful differential procedures; these are not easily adopted for routine investigations. Absorption by the solvent was cancelled by repeatedly adjusting the path length of the reference cell and measuring the differential absorption at various wave length settings until a spectrum was obtained which showed only characteristic polypeptide bands. To prevent aggregation of the solute, it was necessary to use dilute solutions, while the intense H2O absorption precluded the use of cell thicknesses larger than 0.01 mm. An ordinate scale expansion device, which forms an integral part of the employed spectrophotometer, was therefore used. Although data obtained in this manner are not as precise as corresponding data obtained in D2O solution, repeated experiments led to reproducible and internally consistent results. The tedious and time-consuming experiments were carried out in order to obtain information concerning the effect of dissolution of proteins in water without simultaneously introducing additional variables associated with deuteration.

To the best of our knowledge this was the only paper in which Susi reported the analysis of the amide I band for proteins dissolved in H2O, despite the fact that from 1980 he had access to a Fourier transform instrument (personal communication with his colleague Michael Byler). This is not surprising if one reads the following statements taken from a book chapter [87] he wrote with Michael Byler (reference removed):

When FTIR first became available, it was thought that the increased sensitivity would render infrared spectroscopy of proteins feasible even in H2O solution. Instead of the earlier differential methods, FTIR permitted one to obtain the solution spectrum and the solvent spectrum separately, and then to subtract the latter from the former, to obtain the spectrum of the pure solute. Unfortunately, neither the subtraction nor differential procedures are straightforward as they first appear.

The chapter then goes on to comment on the first attempt to use an FTIR instrument for analysis of the amide I band of proteins in H2O, by Koenig and Tabb in 1980 [88]. Susi and Byler noted that the amide I frequencies observed by Koenig and Tabb appear to be uncertain and are actually in error for ribonuclease, evidently because of the difficulties inherent in subtracting the strong H2O band which absorbs close to 1640 cm⁻¹. On this basis, they make the following recommendation:

We strongly suggest that protein structure studies based on the important amide I band which absorbs at 1620–1680 cm⁻¹ be carried out in D2O solution wherever possible.

Certainly, at low protein concentrations water subtraction is difficult and can be rather subjective. However, an increasing number of protein spectra were being recorded in H2O, and the similarity between these spectra and those recorded in 2H2O showed that H2O subtraction can be done accurately. As an example, Chapman and co-workers digitally subtracted H2O absorbance from spectra of proteins and lipids in 1980 [70]. They were excited about their new-found ability to use computer programs for subtracting water absorbance and wrote the following in their paper [70]:

A Perkin-Elmer infrared data station associated with a simple IR spectrometer (model 298) is shown to give excellent results with aqueous model and biomembrane systems.

FTIR spectrometers were on the market when Chapman reported the above study; unfortunately, he did not get a grant to purchase an FT instrument until 1985. Although he took advantage of the microprocessor for subtraction of the water absorbance using a dispersive instrument, he was looking forward to using FT instruments, as he was not entirely happy with the performance of the dispersive instrument. This is quite evident from the following statement taken from a book chapter [89] by Chapman and Goñi (reference removed):

Perkin-Elmer introduced in 1975 the first microprocessor-controlled commercial dispersion infrared spectrometer, and advantage was taken of this facility for lipid studies a few years later. However, the main problems arising from conventional dispersive spectrometry were only overcome by the use of interferometric methods.

In contrast to Chapman, Koenig's group had access to FT instruments almost as soon as they became commercially available. They took advantage of this advance in technology and were among the main pioneers in demonstrating the practicality of digital removal of overlapping water absorption bands from spectra of biopolymers in H2O solution. By 1980, Koenig and co-workers had published FTIR spectra of several globular proteins in aqueous solution after digital subtraction of water [88]. The FTIR spectra of proteins in solution were also reported in a PhD thesis by Tabb [90]. During this period, the number of studies in H2O was still rather limited and the vast majority of FTIR studies were conducted in 2H2O, although computer-aided digital subtraction of 2H2O from spectra of protein solutions was also carried out. Koenig wrote the following in one article (references removed) [91]:

One of the spectral processing operations most widely used, beyond the simple computation of transmission and absorbance spectra, is the digital subtraction of absorbance spectra in order to reveal or emphasize subtle differences between two materials. Numerous applications of this procedure to polymer systems have been presented previously, and it is this digital subtraction capability more than any other single factor that has inspired subsequent investigation of polymeric materials by Fourier transform infrared spectroscopy.

In a recent personal communication, Koenig said the following regarding his work on water subtraction:

Three things came together which allowed spectral subtraction of water from aqueous solutions of proteins. First, FTIR produced an infrared signal that was stronger and digital compared to the dispersion. Secondly, there were (and we wrote some of them) computer programs written which could scale the signal linearly and allowed digital subtraction. Finally, researchers were convinced that the IR spectra of an aqueous solution of a protein was useful in understanding the structure of proteins.

In 1986, Byler and Susi published a paper on quantification of protein secondary structure [92]. A few years later, quantitative analysis of the secondary structure of proteins from infrared spectra recorded in H2O was successfully achieved. Three groups of scientists, working independently, published papers at virtually the same time demonstrating the potential of quantifying the secondary structure of proteins from spectra recorded in H2O. These were the studies by Dong et al. 1989 [93, received by journal on July 11, 1989], Lee et al. 1990 [94, received by journal on 5 Oct. 1989] and Dousseau and Pezolet 1990 [95, received by journal on 14 March 1990]. Dong et al. [93] quantified the secondary structure from the intensities of the peaks in the second-derivative spectra of proteins.
This is similar to the method first developed by Susi & Byler in 1986 [92; see also the section on quantitative analysis], which used the intensities of component bands obtained by curve fitting of deconvolved infrared spectra of proteins in 2H2O. In contrast, others used factor analysis [94] and the partial least squares method [95]. All these methods showed good agreement between the X-ray data and the infrared data for proteins recorded in H2O. After these pioneering studies, other methods for improving the quantification of protein secondary structure from infrared spectra have been reported. These include methods that take into account the overlapping absorbance of amino acids in the amide I region. Other developments include the use of artificial intelligence techniques such as neural networks and genetic algorithms for quantification of protein secondary structure [96,97].
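
The digital water subtraction that runs through this section, in which a solvent spectrum is scaled linearly and subtracted from the solution spectrum, can be sketched as follows. This is only an illustration under stated assumptions and not the procedure of any of the cited groups. The spectra are synthetic Gaussian bands, the function names are hypothetical, and the scaling factor is fitted by least squares in a reference region (here 2000–2250 cm⁻¹, around the water combination band) that is assumed to contain no solute bands. NumPy is assumed to be available.

```python
import numpy as np

def gaussian(x, center, sigma, height):
    """Gaussian band; sigma is the standard deviation."""
    return height * np.exp(-0.5 * ((x - center) / sigma) ** 2)

def subtraction_factor(solution, solvent, mask):
    """Least-squares factor k minimising |solution - k * solvent| inside mask."""
    s = solvent[mask]
    return float(np.dot(solution[mask], s) / np.dot(s, s))

def subtract_solvent(solution, solvent, mask):
    """Return (solution - k * solvent, k) with k fitted inside mask."""
    k = subtraction_factor(solution, solvent, mask)
    return solution - k * solvent, k

wavenumber = np.arange(1500.0, 2300.0 + 1.0, 1.0)

# Synthetic stand-ins (assumptions): a strong water band near 1645 cm-1,
# a weak water combination band near 2125 cm-1, and a weak solute band
# near 1655 cm-1 that is buried under the water absorbance.
water = (gaussian(wavenumber, 1645.0, 40.0, 1.0)
         + gaussian(wavenumber, 2125.0, 100.0, 0.1))
solute = gaussian(wavenumber, 1655.0, 15.0, 0.05)
solution = 0.9 * water + solute  # water contributes slightly less in the solution

# Region assumed to contain only solvent absorbance.
reference_region = (wavenumber >= 2000.0) & (wavenumber <= 2250.0)

difference, k = subtract_solvent(solution, water, reference_region)
```

With these synthetic inputs the fitted factor `k` returns the 0.9 used to build the solution spectrum, and `difference` reproduces the solute band. With measured spectra the choice of reference region and the judgement of when the baseline is flat are much less clear-cut, which is the subjectivity noted above for low protein concentrations.
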

3. Infrared Sampling Techniques

Infrared spectra of biomolecules can be recorded in transmission, reflection, emission and photoacoustic modes. Of these, transmission and reflection are the most widely used for biological applications. A brief discussion of the historical background to sampling methods is given below.

3.1. Analysis of Biological Samples by Transmission Infrared Spectroscopy

Transmission has been the most established and widely used method for recording spectra of biological samples. Many of the early infrared studies on biomolecules were conducted in the solid state and involved producing thin films on infrared-transmitting substrates. Samples were mulled in various mineral oils (for example, Nujol) and then spread on infrared-transmitting disks. High-vacuum sublimation onto infrared-transmitting disks and casting of aqueous solutions to give films on silver chloride disks were also used. The problems encountered by scientists wanting to analyse biological molecules in solution are evident from statements made in various publications between the 1940s and the early 1970s. The following is a quote from a paper by Sutherland and co-workers [98]:

For accurate analytical work using infrared spectra, the sample should be either in the gaseous or liquid state, although solids which can be obtained as thin plates or films of which the thicknesses can be accurately determined are acceptable. Solvents having no intense absorption over the spectral regions investigated are very scarce in infrared work, and none of these was suitable for this problem. Estimations were accordingly made on the solid material suspended as a paste in liquid paraffin (Nujol). The acetamido acids were finely ground in a mortar and well mixed with an approximately equal quantity of Nujol until a smooth homogeneous paste was obtained.

Another method of recording infrared spectra of molecules in the solid state, which did not involve suspending samples in Nujol or producing a thin film, was introduced in 1952 by Stimson & O'Donnell [99], who recorded infrared spectra in the solid state using KBr. Sister Miriam Michael Stimson and Sister Marie Joannes O'Donnell were Catholic nuns working at the Research Laboratories of the Institutum Divi Thomae, Cincinnati, Ohio. They were the first to develop the KBr disk approach for recording infrared spectra in the solid state and reported their findings at a meeting in 1951 [cited in ref 99]. The authors wrote the following in their article:

A method has been developed whereby the usual nujol mull employed for the study of solid organic compounds in the infrared region of the spectrum may be obviated.

They chose two biologically relevant molecules for their analysis, namely cytosine and isocytosine.
The compounds were mixed with KBr and transparent disks were produced by applying high pressure. This method of recording infrared spectra in the solid state gained increasing popularity and continues to be used today. It is interesting that the work of Stimson and O'Donnell in analysing the infrared spectra of DNA bases, using KBr disks, played an important role in the structural elucidation of DNA bases and the double helix. It is also important to note that Scheidt, from the Max-Planck Institute for Biochemistry in Tübingen, Germany, independently contributed to the development of the KBr method for recording infrared spectra [100].

3.2. Emergence of ATR Method of Recording Infrared Spectra

In the early 1960s, internal reflection spectroscopy (IRS), also known as attenuated total reflection (ATR) spectroscopy, came onto the scene as a serious alternative to transmission measurements. Two people played a major role in the development of this technique: Fahrenfort [101] and Harrick [102], both of whom were working independently in the industrial sector during the late 1950s. Fahrenfort was working on organic compounds at the Royal Dutch Shell laboratories in Amsterdam, whereas Harrick was working on semiconductors at the Philips Laboratories in New York. Fahrenfort was the first to present a paper, entitled Attenuated Total Reflection - A New Principle for the Production of Useful Infrared Reflection Spectra of Organic Compounds, at a spectroscopy meeting in Bologna in 1959 [101]. The technique was used for the analysis of aqueous solutions in 1963 [103]. One of the first biomedical applications of the ATR method was reported in 1964, when Scheuplein [104] investigated the quantitative determination of skin reflectance using ATR-infrared spectroscopy. An increasing number of scientists were attracted to this new method of recording infrared spectra. Parker and Ans [105] recorded infrared spectra of human and animal tissues and noted the advantages of the technique in the following manner:

Little preparation of the samples is necessary. Single-reflection spectra have shown the application of the method for examining tissues, both normal and diseased. The changes in tissue chemistry produced by the diseased state are evident. Multiple reflection has afforded more intense bands and increased the resolution of the spectra. The tissues examined were human fat, spleen, and aorta, and rat heart, chicken heart, and veal heart (endocardium and atrium).

Currently, a wide array of biological and biomedical studies utilising the ATR method are being reported. The infrared spectroscopy community is divided into two main groups: those who mainly use the ATR method and those who favour the transmission method. Below is a quote from a paper by Khurana and Fink [106], who justify favouring the ATR technique over the transmission method (references removed):

Hydrated thin-film ATR-FTIR was chosen over transmission FTIR because of its technical superiority. For example, the ATR mode has a much higher signal-to-noise ratio, data acquisition is much faster and easier, the samples can be in H2O rather than 2H2O (2H2O may affect protein conformation in some cases), and analysis of the spectra is facilitated by the absence of liquid water. Thin-film ATR-FTIR spectra of proteins are comparable to those obtained by transmission FTIR, and the secondary structure analysis by both methods gives equivalent results.
Proteins in the thin films are fully hydrated and hence would be assumed to exist in their native conformation.

Others hold different views, and below is a quote from Arbely et al. [107], who consider the transmission method to be easier than the ATR method:

In general it is easier to undertake transmission measurements versus ATR studies. This is accompanied by the nontrivial theoretical considerations that one has to undertake when performing ATR measurements. As an example, one can note the debate that exists over the use of thin-film versus thick-film approximations.

The use of the ATR method for protein analysis has been strongly criticised by the group of Henry Mantsch at the National Research Council in Canada, who were concerned that protein adsorption on the ATR crystal could lead to serious errors in the interpretation of protein infrared spectra [108]. However, others have suggested that, with adequate care, the ATR method can be highly effective in the analysis of proteins [109]. These authors have for many years been actively using the ATR method for the analysis of biomolecules, especially membrane proteins. The ATR method is certainly attractive for determining the orientation of proteins in membranes, although such analysis can equally be done using the transmission method [107,110]. Few studies in the literature have systematically compared the transmission and ATR methods for recording infrared spectra of biomolecules [e.g. 111]. Van Weert et al. [111] recorded FTIR spectra of five well-known proteins (bovine serum albumin, lysozyme, ribonuclease A, α-lactalbumin, immunoglobulin G) using several different sampling methods. Transmission spectra were recorded for solid proteins in KBr pellets and for protein solutions (in both 2H2O and H2O) in a liquid cell. ATR spectra of the protein solutions (in both 2H2O and H2O) were also recorded.
The results showed that environmental effects and the physical nature of the sampling technique can significantly influence the infrared spectra of proteins. Hence, it was suggested that spectral data obtained using different sampling methods should be compared with extreme care. Ideally, infrared spectra of a biomolecule should only be compared when they have been recorded using the same sampling technique. Differences between the spectra of a protein recorded in solution and in the solid state may arise from the change in the phase of the sample rather than from an alteration in molecular structure. This fact was recognised over sixty years ago, when H.W. Thompson, working in the Physical Chemistry Laboratory at Oxford University, discussed how the infrared spectrum of a sample can alter due to a change in its physical state [112]. Thompson wrote the following in his article (reference removed):

As pointed out in an earlier paper, the passage from one physical state to another will involve a change in both the potential energy function of the system and of selection rules, and with a long branched paraffin chain the frequencies and intensities of the bands may be expected to change.

The ATR versus transmission debate has been ongoing in the literature for some time. It appears that some consensus has been reached on this issue and that both methods can give consistent results. The ATR method can offer advantages for stopped-flow measurements and for studies of protein-ligand interactions, as it allows one of the reactants to be immobilised on the surface of the ATR cell.

3.3. Infrared Microspectroscopy

Barer et al. [113] were the first to apply infrared microspectroscopy to biological problems, in 1949, using a reflecting microscope. Regarding the use of infrared microspectroscopy for biological applications, it is appropriate to refer back to Barer [76], who noted that the interpretation of spectra of biological tissues is not easy.
He expressed this concern by first discussing how difficult it is to interpret the spectrum of a single molecule, let alone a complex mixture of macromolecules. On this he states the following:

the infra-red absorption spectrum gives what is essentially information concerning the presence or absence of certain specific chemical groups such as OH, CH, NH, CO, etc. In a complex molecule such as a protein, many such groups will be present and it may be wondered whether it is possible to distinguish different proteins by their infra-red spectra. This subject is still in its infancy.

Noting the complexity of interpreting protein infrared spectra, Barer [76] goes on to state the following:

If this is the case with single proteins, how much more complicated will be the position in the living cell in which we have a mixture of proteins, lipoids, nucleic acids and other substances? Any observations on the infra-red spectroscopy of cells or tissues will almost certainly be largely empirical until the fundamental data on these classes of substances have been obtained and interpreted.

He [76] then pointed out why the situation is even more difficult in infrared microspectroscopy than in traditional transmission measurements. He was very much aware of the limited technology accessible to him and noted the following:

The difficulties of examining such complex material are made even greater by using a microscope, for the amount of energy available is less and there is a consequent loss of spectral resolving power due to having to increase slit widths, and if the amplification of the small signals available is pushed to its limit the noise level and base-line instability may make quantitative measurements of transmission unreliable. If the recognition of different parts of the cell or different types of cells is to depend on small differences of frequency and height of absorption bands, some improvement in the performance of infra-red spectrophotometers, particularly in the detecting devices, is highly desirable, and until such improvements have taken place, the full value of the technique as applied to cytological material will not be realized.

However, Barer [76] concludes on a more optimistic note:

These conclusions may seem to be pessimistic but nevertheless there are a number of ways in which infra-red microspectrophotometry could be applied to cytological material. In the first place, we can attempt to alter the chemical constitution of the cell by various procedures. One method of approach is to extract materials from the cell and to follow the structural and spectroscopic changes which occur. Preliminary work on the extraction of nerve tissue with lipid solvents has already been reported and comparisons between the spectra from unmyelinated nerves from various species have been made with those from extracted myelinated nerves.

Barer's article [76] clearly points out the major difficulties that one encounters when using infrared spectroscopy for the analysis of complex biological systems. The first commercial infrared microscope was produced in 1953 by Perkin-Elmer. Thanks to continued developments in technology, infrared microspectroscopy is now a highly successful and productive area of research. Powerful FTIR spectrometers with highly sensitive detectors, in conjunction with sophisticated computational and mathematical techniques, have made infrared microspectroscopy a formidable technique for the characterisation of complex biological systems. The Bio-Rad Digilab Division in Cambridge, MA, was the first to produce FTIR spectrometers coupled to a microscope, which entered the market in 1983.
In one of the first articles on FTIR microspectroscopy, Krishnan [114], from Bio-Rad, described the accessory in the following manner: "The accessory consists of an all-reflecting infrared microscope coupled with a high sensitivity, small area mercury-cadmium-telluride (MCT) detector." Since the mid-1980s nearly all the major infrared manufacturers have provided the accessories needed for carrying out microspectroscopic measurements with their FTIR instruments. As a consequence, there has been a surge in the application of FTIR microspectroscopy to biological studies. Some of the latest applications are discussed by Naumann, Fabian and Lasch in a later Chapter of this book.

The need for large quantities of sample to obtain good quality infrared spectra has been a major hurdle for some biological applications, because sufficient material is often not available. The development of FTIR microspectroscopy offered the possibility of overcoming this problem. Vogel and coworkers [115] took advantage of this advance and published a paper entitled "Downscaling Fourier Transform Infrared Spectroscopy to the Micrometer and Nanogram Scale: Secondary Structure of Serotonin and Acetylcholine Receptors". In this study they used FTIR microspectroscopy to analyse the receptors using micrometer-sized, fully hydrated protein films. They described the advantage of this method in the following manner: "Because this novel procedure requires only nanogram quantities of membrane proteins, which is 4-5 orders of magnitude less than the amount of protein typically used for conventional FTIR spectroscopy, it opens the possibility to access the structure and dynamics of many important mammalian receptor proteins."

We are currently witnessing a renaissance in infrared microspectroscopy with the use of bright synchrotron radiation for recording spectra (see Chapter by Dumas, Miller and co-workers). This is likely to make the technique amenable to the analysis of the most difficult biological samples at high resolution.

3.4. Development of Specialised Accessories for Recording Infrared Spectra of Biomolecules

Accessories for recording infrared spectra of biomolecules have been developed by both industrial and academic laboratories. For example, a diverse range of ATR accessories has been developed and is commercially available. ATR accessories can be either vertical or horizontal. The vertical accessories are restricted to samples in the solid state, whereas the horizontal ATR accessories can be used to measure samples in both solution and solid states. One of the popular ATR devices, known as the cylindrical infrared reflection cell for liquid evaluation, has been widely used for many years. Subsequently, other devices have gained popularity, including the Golden Gate ATR System.

Accessories for carrying out measurements in transmission mode have also been developed. Commercial cells designed to simplify analysis of biomolecules in H2O have been produced, such as Confocheck (Bruker Optics) and BioCell (BioTools). These cells contain infrared-transmitting windows with a fixed pathlength (6-7 µm) and are suitable for recording infrared spectra of molecules in H2O without having to worry about pathlength changes. Protein samples can be injected into the cells directly, instead of having to take the infrared windows apart, load the sample and reassemble the cell. They are particularly useful for those who are new to the analysis of biomolecules in H2O. A disadvantage of these devices is that they are more difficult to clean and often require a large volume of sample.

Cells designed for specialised applications include reaction-induced cells, which allow a chemical reaction to occur in situ, within the cell, without having to take the infrared cell windows apart.
For example, transmission cells have been developed for recording spectra of a protein in the reduced and oxidised states, enabling in situ monitoring of products and intermediates in biological redox processes. The availability of optically transparent electrodes was the important development that made it possible to record infrared spectra directly through the electrode [116]. Mark and Pons (1966) [116] were among the first to record infrared spectra of molecules at the electrode surface during electrolysis. Some two decades later, Mäntele and coworkers extended this approach to biological studies [117]. They constructed an infrared spectroelectrochemical cell that enabled the coupling of biological electrochemical reactions with infrared spectroscopy. The cell is suitable for studies of proteins in aqueous solution [118] and is particularly useful for infrared difference spectroscopic analysis of redox events (see Chapter by Barth).

Fourier transform infrared photoacoustic spectroscopy (PAS) is a more recent addition to the sampling techniques available to infrared spectroscopists. The method has been demonstrated to be useful for characterisation of biological samples in the solid state, with the advantage that it involves no sample preparation [119]. More recently, surface enhanced infrared spectroscopy [120] has been developed, which can be used to obtain spectra from very small amounts of sample. The principle of this technique is analogous to the better known surface enhanced Raman spectroscopy. This new approach can be used for analysis of protein monolayers.

Attempts to develop stopped-flow technology for investigating the kinetics of biochemical reactions, using spectroscopic techniques in general, started in the 1950s [121]. Infrared spectroscopy has lagged behind UV, fluorescence and CD spectroscopy for stopped-flow studies of biological systems. There have been few infrared studies of biological systems which require rapid mixing of aqueous solutions. This is mainly due to the high pressure required to flow a solution through short path length infrared transmission cells (5-50 µm) and to the viscosity of the concentrated solutions. Wharton and co-workers [122] were among the first to attempt to produce a stopped-flow device for rapid mixing of solutions for analysis of enzymatic reactions using FTIR spectroscopy. The authors carried out their measurements in 2H2O and predicted further application of this approach for monitoring enzymatic reactions. Stopped-flow studies using the ATR method are more widely used than those in transmission mode [e.g. 123].

4. Spectral Acquisition & Processing Methods

In order to obtain specific types of information on molecular structure using infrared spectroscopy, different methods have been developed for recording infrared spectra of biomolecules. Here, some examples of commonly used methods will be provided.

4.1. Polarised Infrared Spectroscopy

One of the modes of recording infrared spectra that has been extremely valuable for biological studies is polarised infrared spectroscopy. One of the first uses of polarised infrared spectroscopy for determining the orientation of polymers and proteins was reported in 1947 by Mann and Thompson [124] from Oxford University. Elliott and Ambrose [125] were among the first scientists to use polarised infrared spectroscopy for monitoring protein folding. The work of Bruce Fraser [126] using polarised infrared spectroscopy is also important, especially with respect to polarised infrared spectroscopic analysis of DNA and fibrous proteins. The method was productively used to determine the orientation of fibrous proteins [127,128] and lipids [129]. In 1966, Wallach and Zahler [130] attempted to determine the orientation of bacteriorhodopsin in purple membranes using polarised infrared spectroscopy. However, the findings of this study were inconclusive, and in 1979 Rothschild and Clark [110] carried out a more detailed polarised infrared study to determine the orientation of bacteriorhodopsin in purple membranes. Unlike the previous studies, this [110] was the first to use an FT instrument: Rothschild and Clark used a dual beam FTIR spectrometer (Digilab FTS-14). In a recent communication, Rothschild made the following remarks about his early work (references removed): "in collaboration with Wim DeGrip, we demonstrated that opsin, the form of rhodopsin which lacks the retinal chromophore, had very little beta-structure but is predominantly alpha-helical. An important goal was then to determine how these alpha-helices were arranged in the membrane. For this purpose, we worked with Noel Clark, who was on the faculty at Harvard University, to develop a technique for orienting photoreceptor and other membranes using an ultracentrifuge which we termed isopotential spin-drying. This method enabled us to demonstrate using polarized FTIR that both bacteriorhodopsin (BR) in purple membrane, the light-driven proton pump, and rhodopsin in the photoreceptor membrane contained a bundle of alpha-helices oriented predominantly perpendicular to the membrane plane."

These early studies laid the foundation for the subsequent application of polarised infrared spectroscopy to determine the orientation of biological molecules. The determination of the orientation of peptides and proteins in lipid membranes has been the most active and fruitful area of polarised infrared spectroscopy. Few techniques are readily amenable to structural determination of proteins in membranes, and a survey of the literature clearly shows that many studies on membrane proteins have relied on infrared spectroscopy to give an idea of the orientation of the protein and its secondary structural elements with respect to the lipid bilayer. The latest advance in polarised infrared spectroscopy of biomolecules has been the effective use of isotopically labelled residues to determine the orientation of specific residues in peptides and proteins and to probe whether they undergo conformational changes due to interaction with other molecules or alterations in the functional state of a protein [for a review see 131].

4.2. Biological Infrared Difference Spectroscopy

Infrared difference spectroscopy was used to analyse biomolecules long before the advent of FT instruments. However, the problem of water absorbance is a recurrent theme in biological infrared applications and continued to be reported until the 1970s. Here is a statement from one study [132], published in the late 1960s, which highlights the authors' attempt to overcome the problem of water absorbance and their hopes for the future: "The application of infrared spectroscopy to the study of enzyme action has been severely limited by the very strong infrared absorption of aqueous protein solutions. However, all aqueous protein solutions have relatively high transmittance in the range, 2000 to 2800 cm⁻¹. As technological progress continually brings us more intense light sources and better radiation detectors we have a better chance of seeing how enzymes work by looking through this infrared window." These authors measured the difference infrared spectrum of CO2-equilibrated bovine carbonic anhydrase against the ethoxzolamide- or azide-inhibited enzyme. They obtained a band at 2341 cm⁻¹ due to the antisymmetric stretching of the CO2 molecule bound to a hydrophobic surface at the active site of the enzyme. They were able to monitor this band in aqueous solution since H2O does not absorb strongly in this region. Thus during the 1960s and early 1970s, before FT instruments became widely accessible, biological infrared difference spectroscopy in aqueous media was also restricted to the window regions where H2O does not absorb strongly.

In the early 1960s, infrared difference spectroscopy was used to monitor changes in protein structure [133]. For example, infrared difference spectra of protein samples recorded at low and high pD values were obtained, which showed differences in the Amide I and Amide II bands. The spectra were recorded in 2H2O and the changes in both of these bands were attributed to exchange of hydrogens for deuteriums on the peptide amide groups.

The proteins that have been most extensively studied by FTIR difference spectroscopy are those that are available in large quantities and, even more importantly, contain a chromophore that can be switched from one state to the other without having to disturb the sample in the infrared cell. This approach of triggering a protein reaction directly in the infrared cell is called reaction-induced infrared difference spectroscopy.
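
The arithmetic behind a reaction-induced difference spectrum can be illustrated with a minimal sketch. It assumes that single-beam intensity spectra are recorded on the same wavenumber grid before and after the reaction is triggered, so that the difference absorbance is -log10(I_after / I_before). The function name, the constant background absorbance and the small synthetic band placed at 2341 cm⁻¹ are illustrative assumptions, not data from the studies cited above.

```python
import numpy as np


def difference_absorbance(intensity_before, intensity_after):
    """Return A_after - A_before = -log10(I_after / I_before) point by point."""
    before = np.asarray(intensity_before, dtype=float)
    after = np.asarray(intensity_after, dtype=float)
    if before.shape != after.shape:
        raise ValueError("both spectra must share the same wavenumber grid")
    if np.any(before <= 0) or np.any(after <= 0):
        raise ValueError("intensities must be strictly positive")
    return -np.log10(after / before)


# Synthetic example (all values are assumptions chosen for illustration).
wavenumber = np.linspace(2300.0, 2380.0, 161)        # cm^-1, 0.5 cm^-1 steps
background = np.full_like(wavenumber, 0.8)            # large, unchanged absorbance
band = 0.002 * np.exp(-0.5 * ((wavenumber - 2341.0) / 3.0) ** 2)  # small new band

intensity_before = 10.0 ** (-background)
intensity_after = 10.0 ** (-(background + band))

delta_a = difference_absorbance(intensity_before, intensity_after)
peak_position = wavenumber[np.argmax(delta_a)]
```

Because the sample is not disturbed between the two recordings, the large background absorbance cancels and only the small band that changed remains in `delta_a`.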

The most common approach has been to use light to switch the protein from one functional state to another (see Chapter by Barth). Hence the proteins best studied by FTIR spectroscopy are light-driven proteins such as bacteriorhodopsin, rhodopsin and the photosynthetic reaction centres. Several groups have worked on these proteins for three to four decades and have obtained valuable information that has played a pivotal role in our understanding of their mechanism of action. Thus, for example, the 3D structures of bacteriorhodopsin and rhodopsin have given the overall molecular structure, but information from infrared spectroscopic analysis of these proteins at different stages of their functional cycles has been vital for understanding the conformational changes of individual amino acid residues and of the overall protein backbone. One of the infrared spectroscopists who contributed significantly to understanding the structure of bacteriorhodopsin and rhodopsin is Kenneth Rothschild, who in a recent communication stated the following regarding his early infrared difference spectroscopic measurements (references removed): "Around 1975 I realized that the increased sensitivity of FTIR spectroscopy might enable conformational changes of membrane proteins to be detected even down to the level of individual amino acid side chains. Bacteriorhodopsin and rhodopsin provided an ideal system to test this idea since they could be activated by light. Attempts by us in this direction began in 1976 but it wasn't until we acquired an MX-1 Nicolet FTIR Spectrometer in the summer of 1981 in my lab at Boston University that we were able to successfully detect for the first time small bands which we were able to assign using isotope labeling and resonance Raman spectroscopy to changes in individual molecular groups."

Cytochrome c oxidase is another protein that has been extensively studied by FTIR spectroscopy because it can be obtained in large quantities.
Furthermore, it can be converted to different states and bound to different ligands, and the structures of the ligands and the protein can be analysed simultaneously by infrared spectroscopy. Since the 1960s Alben and Caughey [134] have been studying ligand binding to heme-containing proteins, and one of their earliest studies investigated carbon monoxide binding to human red blood cells as well as to isolated haemoglobin and the heme carbonyl [134]. They have contributed significantly to our understanding of several respiratory proteins and their interaction with different ligands.

4.3. Two-Dimensional Infrared Spectroscopy

FTIR has lagged behind NMR spectroscopy in the development of recording methods that simplify spectra, ease band assignment and provide additional information. Historically, the development of two-dimensional infrared spectroscopy has been influenced by the much earlier work on two-dimensional NMR spectroscopy. The first report on two-dimensional infrared spectroscopy was published in 1986 by Noda [135], who was working at the Procter and Gamble Company in the USA. However, due to the specialised nature of the technique it did not gain widespread application until the 1990s, when it became increasingly apparent that two-dimensional infrared spectroscopy can help unmask the overlapping signals that often congest infrared spectra of biological molecules. The efforts of Noda and Ozaki have been pivotal in raising awareness of the potential offered by 2D-IR spectroscopy [e.g. 136].
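
How a correlation analysis can separate overlapping signals may be sketched as follows. The sketch assumes the discrete, generalised form of Noda's 2D correlation analysis as it is commonly formulated: a series of spectra recorded under a stepwise perturbation is mean-centred, the synchronous map is the covariance of the dynamic spectra, and the asynchronous map uses the Hilbert-Noda transformation matrix. The function name and the synthetic three-band data set are illustrative assumptions and are not taken from references [135] or [136].

```python
import numpy as np


def generalized_2d_correlation(spectra):
    """Synchronous and asynchronous 2D correlation maps.

    spectra: array of shape (m, n), m spectra recorded at m perturbation
    steps (assumed evenly spaced), each with n wavenumber points.
    """
    y = np.asarray(spectra, dtype=float)
    if y.ndim != 2 or y.shape[0] < 2:
        raise ValueError("need a 2-D array containing at least two spectra")
    m = y.shape[0]
    dynamic = y - y.mean(axis=0)              # dynamic (mean-centred) spectra

    # Hilbert-Noda matrix: 0 on the diagonal, 1 / (pi * (k - j)) elsewhere.
    j, k = np.indices((m, m))
    gap = k - j
    hilbert_noda = np.zeros((m, m))
    nonzero = gap != 0
    hilbert_noda[nonzero] = 1.0 / (np.pi * gap[nonzero])

    synchronous = dynamic.T @ dynamic / (m - 1)
    asynchronous = dynamic.T @ hilbert_noda @ dynamic / (m - 1)
    return synchronous, asynchronous


# Synthetic example: two bands change in step, a third follows a different course.
wavenumber = np.linspace(1600.0, 1700.0, 101)         # cm^-1, 1 cm^-1 steps
steps = np.linspace(0.0, 1.0, 11)                     # perturbation variable


def band(centre, width=5.0):
    return np.exp(-0.5 * ((wavenumber - centre) / width) ** 2)


spectra = (np.outer(steps, band(1630.0))
           + np.outer(steps, band(1655.0))
           + np.outer(steps ** 2, band(1685.0)))

synchronous, asynchronous = generalized_2d_correlation(spectra)
```

In this synthetic case the asynchronous intensity between the two bands that change in step is essentially zero, whereas the band with a different time course gives a clear asynchronous cross peak, which is the property used to tell overlapping contributions apart.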

A few years after the work of Noda (1986) [135], another important breakthrough in 2D-IR spectroscopy came when Tanimura and Mukamel [137] reported nonlinear optical 2D-IR spectroscopy based on ultrafast laser pulses. This approach is similar in concept to that used for obtaining 2D-NMR spectra, which involves the application of radiofrequency pulses. Because conducting 2D-IR spectroscopy with ultrafast pulses is non-trivial, only a few groups currently use this approach, compared with the much more widespread use of the generalised 2D correlation spectroscopy developed by Noda (1986) [135]. Nevertheless, several important studies based on the pulse photon echo approach have been reported [137]. Applications include studies of the structure and dynamics of peptides and fingerprinting of peptides and proteins [e.g. 138, 139 and Chapter by Ishikawa, Kim, Finkelstein and Fayer]. Although the development of 2D infrared spectroscopy is a major advance, we are still a very long way from being able to determine the complete 3D structure of a protein in the way that can be achieved using NMR spectroscopy or X-ray crystallography. However, the use of powerful computers to calculate vibrational modes of peptides and some small proteins is promising, and in conjunction with 2D-IR techniques it may one day be possible to elucidate the complete 3D structure of a protein.

4.4. Time-Resolved Infrared Spectroscopy

Before the advent of FT instruments, time-resolved infrared measurements on biomolecules were conducted using dispersive spectrometers. In the study of bacteriorhodopsin and rhodopsin this involved the use of flash photolysis in conjunction with infrared measurements at a specific wavelength. This approach provided sub-millisecond time resolution [140]. The disadvantage of this approach is the need for substantial signal averaging at each individual wavelength in order to cover a significant infrared spectral region.
However, with FT instruments it is possible to record a large infrared spectral region simultaneously. Taking advantage of this, Rothschild and co-workers first reported in 1985 [141] a method based on sweeping the moving mirror of the interferometer rapidly enough to obtain an FTIR spectrum from 0 to 2000 cm⁻¹ with 8 cm⁻¹ resolution within 5 milliseconds. The step scan technique improved the time resolution of FTIR spectrometers to the sub-microsecond range. This technique records time-resolved intensity changes from a kinetic experiment at fixed mirror positions of the FTIR interferometer. The measurement is repeated for all mirror positions needed to construct time-resolved spectra of a desired spectral resolution. A requirement for the step scan technique is that the experiment can be accurately reproduced at least several hundred times, since kinetic traces typically have to be sampled at about 600 mirror positions for 4 cm⁻¹ optical resolution [142]. However, the time resolution of the step scan technique was still not sufficient to access some of the very fast biological processes. Hochstrasser and co-workers have taken advantage of advances in laser technology to carry out biological studies on proteins using infrared spectroscopy at femtosecond resolution [143]. More recently, ultrafast infrared spectroscopy has been used to probe the folding of peptides and proteins. The Chapter by Ishikawa, Kim, Finkelstein and Fayer, later in this book, discusses advances in ultrafast infrared spectroscopy as applied to biological systems.
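
The data handling of the step scan technique described above can be sketched in a few lines: one kinetic trace is stored per mirror position, the traces are regrouped into one interferogram per time point, and each interferogram is Fourier transformed. This is a simplified sketch under stated assumptions: the interferogram is taken as one-sided and starting at zero path difference, magnitude spectra are used instead of phase-corrected ones, and no apodisation is applied. The function name and all numbers (600 positions, a 0.25 cm maximum path difference, a single synthetic band at 800 cm⁻¹ decaying with a 2 µs time constant) are chosen for arithmetic convenience and are not instrument settings from [142].

```python
import numpy as np


def step_scan_spectra(traces, sampling_interval_cm):
    """Turn step scan kinetic traces into time-resolved magnitude spectra.

    traces: array of shape (n_positions, n_times); row p is the detector
    signal versus time recorded at mirror position p.
    Returns the wavenumber axis (cm^-1) and spectra of shape (n_times, n_freq).
    """
    traces = np.asarray(traces, dtype=float)
    if traces.ndim != 2:
        raise ValueError("traces must have shape (n_positions, n_times)")
    interferograms = traces.T                      # one interferogram per time point
    spectra = np.abs(np.fft.rfft(interferograms, axis=1))
    wavenumber = np.fft.rfftfreq(traces.shape[0], d=sampling_interval_cm)
    return wavenumber, spectra


# Synthetic experiment (every value below is an assumption for illustration).
n_positions = 600
max_path_difference_cm = 0.25                      # gives 4 cm^-1 point spacing
sampling_interval_cm = max_path_difference_cm / n_positions
path_difference = np.arange(n_positions) * sampling_interval_cm

times = np.array([0.0, 1e-6, 2e-6, 5e-6])          # seconds after the trigger
tau = 2e-6                                         # decay time of the band
band_cm1 = 800.0

interferogram = np.cos(2.0 * np.pi * band_cm1 * path_difference)
traces = np.outer(interferogram, np.exp(-times / tau))

wavenumber, spectra = step_scan_spectra(traces, sampling_interval_cm)
peak_positions = wavenumber[np.argmax(spectra, axis=1)]
```

The sketch also makes the reproducibility requirement explicit: each row of `traces` comes from a separate repetition of the experiment, so any drift between repetitions ends up as distortion in every reconstructed spectrum.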

5. Application of Infrared Spectroscopy for Characterisation of Biological Systems

A historical perspective on the application of infrared spectroscopy to biological molecules, cells, tissues and organisms is provided. Furthermore, obtaining quantitative information from infrared spectra of biological molecules is discussed.

5.1. Protein Studies

Stair and Coblentz in 1935 [144] used infrared spectroscopy to characterise plant and animal tissues; this included analysis of carbohydrates and proteins. Later, in the late 1940s, Klotz, Gruen and co-workers recorded infrared spectra of purified, crystallised proteins using a Beckman IR-2 instrument [145]. They were concerned that some of the earlier infrared studies had been conducted on samples of questionable homogeneity and were restricted to a narrow region of the infrared spectrum. They produced films of the proteins and peptides on an appropriate supporting plate, such as silver chloride, after evaporation in a vacuum desiccator. They analysed the amide A, amide I and amide II bands as well as some side chain absorbance bands, such as those arising from tyrosine. Besides soluble proteins, they analysed small peptides such as gramicidin. Films of the soluble proteins were made by drying aqueous solution onto silver chloride films. Films of gramicidin were produced by dissolving the peptide in 95% ethanol and then producing a dry film on silver chloride plates. Dieter Gruen [145] is one of the early pioneers of biological infrared spectroscopy whose main area of scientific interest was disrupted by the war. After graduation, Gruen worked on the Manhattan Project at Oak Ridge, Tennessee, helping to separate uranium isotopes using calutrons. After the war, he returned to Northwestern to do a Master's thesis with Irving Klotz before continuing with his Ph.D. studies at the University of Chicago.
In a personal communication, Dieter Gruen (currently Argonne Distinguished Fellow at Argonne National Laboratory) said the following about his early infrared studies of proteins at Northwestern University, more than sixty years after the work was published: "The chemistry department had just acquired a Beckman IR2 spectrometer and I was one of the first to make use of it. Most of the data were obtained by doing manual scans, an arduous procedure considering today's FTIR instrumentation. Those were exciting times since there was virtually no data of this kind available at the time and we were helping to lay the foundations on which future generations of chemists and biochemists were able to build. The enormous advances in instrumentation including IR have helped to make biology into the magnificent edifice it is today and we sometimes forget the role played by the pioneers in a field."

In the UK, the pioneering work of Sutherland [e.g. 146], Elliott and Ambrose [e.g. 125,147] and Fraser [148] was important in laying the foundation of biomolecular infrared spectroscopy. The early work of Elliott and Ambrose in the 1950s [125,147] was important in demonstrating that the two main types of protein secondary structure can be distinguished by infrared spectroscopy. The work was performed with proteins in the solid state, and for the next 15 years few studies exploited this potential. The major advance came when Heino Susi measured proteins in 2H2O and analysed a model polypeptide (poly-L-lysine) in different conformations [86]. Susi's group persisted in using infrared spectroscopy to study proteins between 1966 and 1986. They continued to publish papers that motivated others to enter the field, although some had been using infrared spectroscopy for analysis of biomolecules long before Susi. For example, Dennis Chapman, Henry Mantsch and others, who had been using infrared spectroscopy for analysis of lipids for many years [149], also started to work on proteins. Susi's work on proteins in 2H2O made the technique more attractive to biologists, as it represented a significant step forward from looking at biomolecules in the dry solid state. However, measurement in 2H2O had its limitations, as mentioned earlier, since 1H-2H exchange of amide protons can cause complications in the assignment of peaks [74,75]. A review article by Hvidt and Nielsen [150] discusses in detail the use of infrared spectroscopy for monitoring 1H-2H exchange in proteins, peptides and small molecules. During the 1960s stopped-flow methods were used to monitor 1H-2H exchange in N-methylacetamide with half-lives of up to a few seconds [150]. With modern FT instruments it is now possible to follow hydrogen-deuterium exchange at faster speeds and to monitor complex macromolecular interactions [e.g. 151].

As one would expect, mistakes are clearly evident in some of the early infrared studies of biological molecules. For example, it was not uncommon to see infrared papers on proteins whose spectra contained peaks from water vapour, some of which were erroneously assigned to protein structure. In due course, it became increasingly clear that purging the sample compartment with dry air or nitrogen was vital to reduce overlap of peaks from water vapour. Some suggested subtraction of a water vapour spectrum from the protein spectrum. Others were hesitant to do this, since under- or over-subtraction can result in either the appearance or the elimination of peaks in the final spectrum.
Spectrometer manufacturers also contributed towards resolving the problem of water vapour overlap in protein studies.

In modern biological infrared spectroscopy, spectra of only a single protein or a few proteins are typically recorded and published in any one study, due to various difficulties. Therefore the field lacks the type of systematic study that Coblentz felt was necessary when he was analysing organic molecules in the early 1900s. It is true that attempts are being made by various authors to record systematically the infrared spectra of a large number of proteins. Indeed, a suggestion has been made to produce an infrared protein spectra database similar to the Protein Data Bank, which contains the X-ray and NMR structures of proteins. Unfortunately, progress in this field is still slow due to the much smaller number of scientists engaged in protein infrared spectroscopy compared to NMR or X-ray crystallography.

5.2. Lipid Studies

Some of the first studies on lipid-like molecules using infrared spectroscopy were reported by Coblentz [43] and Lecomte [152]. Several workers reported studies employing infrared spectroscopy for analysis of fats during the late 1940s [153,154]. However, some were not satisfied with the quality of the early studies, as is evident from an article by O'Connor et al. [155], who begin their paper with the following sentences: "Before many successful applications of infrared spectroscopy to fatty acid and vegetable oil chemistry can be made, extensive spectral data on a large number of pure reference compounds will be required. Heretofore the scant spectral data available on fatty acids, esters, and triglycerides have consisted either of measurements made over a very limited range of the infrared spectrum or have been obtained with compounds of undescribed purity." However, the authors also noted in the article some comprehensive studies [e.g. 154] which started to appear while they were working. This is described in the following manner (reference removed): "After the work to be described in this communication was well under way, Shreve, Heether, Knight, and Swern presented infrared absorption data on a number of long chain saturated and mono-unsaturated fatty acids, methyl esters, and alcohols. Their data constitute the most complete study of the infrared spectral properties of fatty acids and esters which has yet been described."

The number of studies on lipids, fats and oils using infrared spectroscopy was still rather limited. This is evident from a review published by Binkerd and Harwood in 1950 [156], who state the following regarding the lack of infrared studies on fats and oils: "The literature contains many references to the use of infrared spectroscopy, but little is to be found in regard to its application to fat and oil chemistry." They concluded their review [156] by adding the following statements: "In fatty-acid chemistry, as in all organic chemistry, infrared spectroscopy has already become an indispensable tool. It is being applied to both theoretical and practical problems and has only begun to demonstrate its real value."

Figure 10. Photograph of Dennis Chapman describing the structure of a phospholipid molecule to HRH the Prince of Wales (photograph taken a few years before his death in 1999 [see 157]).

Almost at the same time as the above articles were published, several detailed infrared studies on lipids and lipoproteins were reported in the literature [e.g. 158,159]. Norman Jones, based at the National Research Council in Canada, made important contributions to the analysis of lipids using infrared spectroscopy [160]. From the late 1950s, Dennis Chapman (see Fig. 10), based at Unilever in the UK, was one of the most active scientists applying infrared spectroscopy to the analysis of lipids, especially for monitoring lipid polymorphism.
In one of his first publications [149], he wrote the following, which indicates the limited number of studies on lipids and the absence of studies on lipid polymorphism at that time (references removed): "Whilst infrared spectra of some monoglycerides have been reported, none of this work was concerned with the various polymorphic forms, and the spectra given there are mainly solution spectra or those of the stable β-form. Up to the present, spectra of 2-monoglycerides have not been reported."

Chapman successfully demonstrated in this study [149] the usefulness of infrared spectroscopy for monitoring lipid polymorphism and concluded the paper in the following manner: "In conclusion, the investigation has shown that infrared spectra can provide a good standard for observing and following polymorphic transitions. This method for studying polymorphic transitions has some advantages over the X-ray method, notably the speed of obtaining the spectra and the information from them about the bonding of the groups. As a supplement to the X-ray technique it is of obviously great potential value." Chapman continued his work on infrared studies of lipids well into the 1990s. During the late 1970s Henry Mantsch at the National Research Council in Canada published many papers on lipids and lipid-protein interactions, mainly using FTIR instruments. He published his first infrared analysis of lipids using a dispersive instrument in 1978 [161]. In the same year he published a paper on the analysis of lipids using an FT instrument [69]. Infrared spectroscopy continues to be a powerful tool for the analysis of lipids and of the interaction of lipids with diverse molecules [for a review see 162 and also see Chapter by Wolkers].

5.3. Studies of Nucleic Acids and Carbohydrates

The vast majority of infrared spectroscopic studies on biomolecules have focused on the characterisation of proteins and lipids. Far fewer studies have been reported on nucleic acids and carbohydrates. Nevertheless, infrared spectroscopy offers a number of advantages for studying these latter macromolecules. The first studies on nucleic acids can be traced back to the late 1940s and 1950s [62,164]. The spectra of the nucleic acids were recorded in the solid state.
Blout and Fields [164] noted the following in their article regarding their inability to measure nucleic acids in aqueous systems:

It should be noted that because of the very slight solubility of the nucleic acids, nucleotides, and nucleosides in any but aqueous solvents and the rather strong absorption of infra-red by water in the region 2 to 15 μ except in very thin layers it is necessary to measure these materials in the solid state. We have used the following techniques: (a) casting of a concentrated aqueous solution on silver chloride disks, followed by removal of the water, leaving a continuous film; (b) evaporation of the material in high vacuum upon sodium chloride disks; (c) finely divided powders on sodium chloride disks; and (d) powders mulled into mineral oil.

A few years later the situation changed, and Blout, in conjunction with Lenormant, published infrared spectra of nucleic acids, proteins and even bacteria in aqueous media (H2O and 2H2O) [165]. Others who contributed to the early infrared studies of nucleic acids include Fraser, who was the first to study oriented films of DNA using infrared spectroscopy [126,166]. Subsequently, Sutherland and Tsuboi in 1957 also studied DNA using polarised infrared spectroscopy [167]. In the latter study, infrared spectra of oriented films of sodium deoxyribonucleate were recorded using polarised radiation under varying degrees of relative humidity. The authors also recorded spectra of films that had been deuterated by vapour-phase exchange with 2H2O. In the 1960s, Lord, Falk [168] and Thomas [169] made important contributions to the analysis of DNA using infrared spectroscopy. Thomas reported an infrared spectroscopic method [169] that can be applied to determine the fraction of Watson-Crick base pairs at a given temperature in any RNA containing the four common bases in known ratios.

In the 1970s a number of scientists were highly active in applying infrared spectroscopy to the study of DNA [170]. Taillandier and co-workers were among the first to apply FTIR instruments to the analysis of DNA [171]. The advent of FT instruments led to a renaissance in infrared spectroscopy of nucleic acids [see e.g. 172-174].

With respect to carbohydrates, one of the first studies using infrared spectroscopy was reported in 1933 [54]. Work on this class of biomolecules continued after World War II [175]. Goulden studied the interaction between proteins and carbohydrates in 1956 [176]. FTIR spectroscopy has been applied to the characterisation of diverse carbohydrates [see e.g. reference 177].

5.4. Infrared Spectroscopy of Cells, Tissues and Intact Organisms

The appearance of commercial infrared instruments in the late 1940s attracted many scientists to apply infrared spectroscopy to the analysis of complex systems, ranging from bacteria to human fingers. Some examples of the pioneering studies are described below. Hardy and Muschenheim in 1934 [178] used infrared spectroscopy to characterise human skin by recording emission, reflection and transmission spectra. For the emission and reflectance measurements, spectra were recorded from a human finger: the authors placed the subject's finger immediately in front of the spectrometer slit. Using this approach they also compared the reflectance spectra of subjects belonging to two different racial groups. From the spectral analysis, the authors [178] concluded the following:

It is also significant that the amount of reflection in the infra-red is about the same for negro skin as it is for white skin.

For recording transmission spectra of human skin it is obviously not possible to use human fingers; the authors therefore used fragments of human skin from surgical amputations. The authors were careful to produce a very thin sample in order to reduce absorption by the skin and water.
Regarding water absorbance, the authors noted the following:

The small amount of water on the tissue during the time of measurement could not have represented a film of greater thickness than 0.05 mm. Such a thickness transmits the infra-red readily.

The authors [178] were optimistic about the potential of infrared spectroscopy for the analysis of human tissue and made the following statement in their conclusions:

The infra-red transmission spectrum of skin has a characteristic fine structure which may prove to be of physiological interest.

However, despite this optimism, the problem of studying biological tissues containing water was evident from their work. Indeed, this was clearly shown by these authors in a later paper [179], where they state:

The absorption spectrum of normally wet skin is essentially that of liquid water. Upon drying, other absorption bands not due to water become evident.

The study by Hardy and Muschenheim (1934) [178] is one of the detailed infrared studies on human tissues reported in the literature, although briefer studies had been reported earlier. For example, as far back as 1927, infrared spectroscopy was used for biomedical applications [180]. Cartwright in 1930 [181] reported on the infra-red transmission of the flesh. Pearson and Norris in 1933 [182] used infrared spectroscopy to analyse horny layers of human skin. When commercial instruments became available, a large number of scientists engaged in biomedical studies, including the characterisation of bacteria, viruses and human tissues and studies on metabolic biochemistry. Taking advantage of the Perkin-Elmer 12A instrument, Norman Jones and colleagues [183] wrote the following in their paper:

It is to be expected that the technique will soon be used extensively in medical and biological research, since adequate instruments are now available commercially. In this paper is described the use of infra-red spectrometry as an analytical tool for the detection and characterization of small quantities of steroid metabolites in urinary fractions.

To aid identification of the steroids in complex systems, the same group of authors published a paper in 1949 in which they reported the use of deuterium-labelled steroids in infrared studies of metabolism [184]. This is one of the first examples of isotope-edited biological infrared spectroscopy. This series of studies is a very good example of collaboration between scientists from different disciplines, with academic and industrial backgrounds, coming together to solve a problem. Norman Jones was an infrared spectroscopist from the Chemistry department at the National Research Council in Canada. In contrast, Dobriner and Lieberman were from the Sloan-Kettering Institute for Cancer Research in New York and were well known for their work on steroid metabolism in health and disease. Williams and Barnes were based at the American Cyanamid Company and engaged in industrial applications of infrared spectroscopy. In the field of infrared spectroscopy, close co-operation between academia and industry played an important role in advancing the application of this technique to a diverse range of systems. No doubt the financial strength of the industrial partners played a useful role in obtaining access to the latest instruments and expensive chemicals, which were often out of reach for many in university departments. Infrared spectra of tissues and cells have been recorded by a number of workers [e.g.
185-187] and the technique was applied to the study of viruses [188] and bacteria [e.g. 189] in the 1950s. The advent of FT instruments and associated technology has led to a renaissance in the application of infrared spectroscopy to the analysis of cells and tissues [for reviews see 190, 191; see also the Chapter by Naumann, Fabian and Lasch later in this book].

5.5. Quantitative Information from Infrared Spectra of Biological Systems

Edsall in 1938 showed the possibility of using Raman spectroscopy for distinguishing between different amino acids by monitoring differences in their vibrational frequencies [192]. This concept was subsequently extended to the analysis of amino acid mixtures by Buswell and Gore (1942) [193] and also by Sutherland and co-workers in 1948 [98]. Buswell and Gore were among the first to attempt to obtain quantitative information on proteins using infrared spectroscopy [193]. Their aim was to see whether it was possible to separate the bands from the different amino acids and determine their extinction coefficients, which would enable them to estimate the content of specific amino acids. They carried out their analysis on salmine, which is a water-soluble protein. The authors noted that they were not able to record the spectra of the protein in the dissolved state since the protein was not sufficiently soluble in infrared-transparent solvents. Therefore, they recorded the spectra by producing solid films on thin microscope cover glasses [193]. Some six years later, Sutherland and co-workers [98], working in Cambridge, tried to estimate the isoleucine/leucine ratio for protein hydrolysates as well as for control samples. These pioneering studies were some of the first attempts towards obtaining quantitative information from protein infrared spectra. The first quantification of protein secondary structure content from infrared spectra can be attributed to the work led by Susi and co-workers in the 1980s [92]. Below is a quote (with references omitted) from Michael Byler regarding their early attempts at obtaining quantitative information on protein secondary structure:

Because of his experience in working with Fourier deconvolution of infrared spectra of a variety of substances, we sought the collaboration of Professor Peter R. Griffiths, then at the University of California at Riverside. He kindly agreed. Griffiths and his graduate student W.-J. Yang wished to focus on the deconvolution of diffuse reflectance spectra of solid proteins; we continued our work with D2O solutions. Because deconvolution requires a user to make subjective choices regarding the two parameters to be employed to obtain optimum band narrowing (band shape, band width, and the resolution-enhancement factor K) many spectroscopists had serious misgivings about how trustworthy any data obtained by this technique would ultimately prove to be. Our joint investigation indicated that spectra measured and manipulated independently at two different laboratories on samples of different origin [gave] similar results. Nevertheless, care must be taken to ensure that band intensities in the original are not saturated or distorted. In addition, because in principle deconvolution does not alter the area under a peak, we felt that this new method should enable quantitative estimates of the proportion of each conformation in a protein to be calculated.

Byler and Susi pursued a more detailed study on secondary structure quantification from protein infrared spectra, which was published in 1986 [92]. This can be considered one of the landmark papers in modern protein FTIR spectroscopy in relation to protein secondary structure analysis. In this study, the authors estimated the secondary structure of 21 proteins [92].
They deconvolved the spectra of the proteins, which were recorded in 2H2O, and used a curve-fitting method to estimate the content of helical and beta-structure. The accuracy of the prediction was very good and this encouraged others to use the approach in their own studies. It is not surprising that the paper has been cited nearly 1,000 times and is probably the most highly cited biological FTIR spectroscopy paper in the scientific literature. As discussed earlier in this Chapter, after this study by Susi and Byler, a number of authors showed that quantification can also be done for proteins in H2O [93,94,96,97].

Some quantitative studies on lipids were also reported in the 1950s. For example, Freeman et al. (1953) [158] reported how the intensity of the lipid ester carbonyl band and certain strong bands of proteins can be used to obtain quantitative information on the lipid-protein ratio in lipoproteins. With respect to carbohydrates, the potential of infrared spectroscopy for the quantitative determination of nitrate groups in nitrocellulose was reported by Kuhn in 1950 [175]. Infrared spectroscopy-based methods have been developed for the quantitative analysis of proteins adsorbed on surfaces such as biomaterials. In one early study, ATR spectra of surface-adsorbed proteins were correlated with measurements made with 125I-labeled proteins. The authors demonstrated a linear correlation between the intensity of the major infrared bands of proteins and the quantity of protein [194]. Early attempts have also been made to obtain quantitative information on bacteria using infrared spectroscopy [e.g. 187]. Currently, methods for the quantitative analysis of changes in complex systems such as cells and tissues are also being developed. Sophisticated statistical and computational tools are being developed to distinguish between normal and diseased tissues, to identify and classify bacteria, etc. [e.g. 190,191 and Chapter by Naumann, Fabian and Lasch].
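
The band-area idea behind the curve-fitting approach described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the deconvolved amide I contour has already been fitted with Gaussian component bands and that each band has been assigned to a structure class by the user. The band heights, widths and labels are made-up values, not data from [92], and the function names are hypothetical. The fraction of each structure is taken as its share of the total fitted band area, which relies on the point made in Byler's account that deconvolution in principle leaves band areas unchanged.

```python
import math
from collections import defaultdict

# Area of a Gaussian band = height * FWHM * sqrt(pi / (4 ln 2)).
GAUSS_AREA_FACTOR = math.sqrt(math.pi / (4.0 * math.log(2.0)))


def gaussian_area(height, fwhm):
    """Integrated area of a Gaussian band from its height and full width at half maximum."""
    return height * fwhm * GAUSS_AREA_FACTOR


def structure_fractions(bands):
    """Return the fractional band area per structure label.

    bands: iterable of (label, height, fwhm) tuples for fitted component bands.
    The labels are supplied by the user; no band assignment is made here.
    """
    areas = defaultdict(float)
    for label, height, fwhm in bands:
        areas[label] += gaussian_area(height, fwhm)
    total = sum(areas.values())
    if total <= 0.0:
        raise ValueError("total band area must be positive")
    return {label: area / total for label, area in areas.items()}


# Hypothetical fitted bands (label, height in absorbance units, FWHM in cm-1).
example_bands = [
    ("helix", 0.80, 18.0),
    ("sheet", 0.45, 14.0),
    ("sheet", 0.20, 12.0),
    ("turn", 0.25, 16.0),
]

for label, fraction in structure_fractions(example_bands).items():
    print(f"{label}: {100.0 * fraction:.1f} % of amide I area")
```

Summing areas per label lets several component bands contribute to one structure class. Treating area fractions as structure fractions also assumes equal absorption coefficients for all conformations, which is a simplification.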
Quantitative information on the strength of molecular interactions and of covalent bonds can also be obtained and is discussed in the Chapter by Barth.

6. Impact of Advances in Chemical & Molecular Biological Methods for Biological Infrared Spectroscopy

Developments in synthetic chemistry have played a major role in extending the potential of infrared spectroscopy. Studying the effect of amino acid sequence on protein structure and stability by infrared spectroscopy has become feasible because peptides can be readily synthesised using automated peptide synthesisers. Woodward and Schramm in 1947 were the first to successfully synthesise peptides [195]. It is interesting that the peptides synthesised by these workers were subjected to infrared spectroscopic analysis by Sutherland in the same year [146]. From a historical perspective, this is probably the first study reported in the literature that analysed the structure of a synthetic polypeptide and also made a comparison with a naturally occurring protein. The potential offered by peptide synthesis for structure characterisation and interpretation of infrared spectral data is nicely summarised by Sutherland and co-workers [196] in the following manner (references removed):

The recent synthesis of polypeptides from given amino-acids by Woodward and Schramm, following the neglected Leuchs polymerization, has reopened a promising line of attack on the problem of protein structure. Comparison of the properties of such synthetic polypeptides of known composition with those of proteins built of the same amino-acids should prove of great value, and in particular should help considerably in the interpretation of X-ray and infra-red data, which are still so imperfectly understood owing to the complexity of even the simplest protein.

The peptide synthesis work developed by Woodward and Schramm and the subsequent spectroscopic work pioneered by Sutherland in the late 1940s have now developed into a highly productive area of research [197].
In recent years synthetic peptides have been extensively investigated using infrared spectroscopy, especially in the study of complex proteins such as amyloids, prions and membrane proteins.

Isotopically labelled molecules have been used in biological infrared spectroscopy for many years. The use of 2H2O instead of H2O, to obtain a window in the amide I region and to monitor 1H-2H exchange of amide protons, is the best-known application of isotope-edited biological infrared spectroscopy, dating back to the late 1930s. One of the first studies recording infrared spectra of a mixture of H2O and 2H2O was reported in 1934 [e.g. 198]. In lipid work, deuterated and 13C-labelled lipids have been used for probing specific regions of lipid membranes since the 1970s, especially using NMR spectroscopy [199]. However, the use of labelled peptides and proteins is more recent and became possible once isotopically labelled amino acids were available for peptide synthesis. It is for this reason that there were very few studies with isotopically labelled peptides and proteins until the early 1990s, other than studies of proteins in 2H2O to monitor 1H-2H exchange. Advances in synthetic peptide chemistry, the development of automated peptide synthesisers, recombinant DNA technology and bacterial expression of proteins have all been instrumental in making it possible to obtain isotopically labelled peptides and proteins. The first FTIR study on 13C-labelled peptides was reported in 1991 [200]. Far fewer studies have been possible with isotopically labelled proteins. The first study investigating a protein-protein interaction in which one of the proteins was fully labelled with 13C and 15N was reported in 1992 [201]. Isotopically labelled proteins provide immense potential for the interpretation of infrared spectra and also for obtaining detailed information at the single-residue level. Unfortunately, due to the expense associated with producing labelled proteins, the number of infrared studies employing labelled proteins continues to be rather limited. Nevertheless, the method is being successfully employed for the characterisation of a number of complex systems such as membrane proteins and insoluble aggregates of proteins. In one study polarised infrared spectroscopy was used to determine the local amide orientation in an ordered insoluble protein. The work involved labelling individual amide carbonyl carbons with 13C, which enabled the systematic assignment of amide I modes of specific amino acid residues [202].

The other major help in advancing biological applications of infrared spectroscopy has been the ability to carry out site-directed mutagenesis studies on proteins using recombinant DNA technology. This ability has been instrumental in our understanding of the structure-function relationships of several proteins. A good example of highly successful research in this area is the large number of studies on bacteriorhodopsin. Other advances that have been helpful for biological infrared spectroscopy include the synthesis of photolabile caged compounds that release a specific ion (for example calcium) or molecule (e.g. ATP) in response to a light trigger, enabling the spectra of a protein to be recorded in more than one state so that a difference spectrum can be obtained (see Chapter by Barth).
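
The usefulness of isotope labelling rests on the mass dependence of vibrational frequencies. As a rough sketch, assuming the labelled group behaves as an isolated diatomic harmonic oscillator with an unchanged force constant, the wavenumber scales with the inverse square root of the reduced mass. Real amide modes are coupled vibrations, so the numbers are only estimates. The function names, the integer atomic masses and the 1650 cm-1 starting value below are illustrative assumptions and are not taken from the studies cited above.

```python
import math


def reduced_mass(m1, m2):
    """Reduced mass of a two-atom oscillator (same units as the inputs)."""
    return m1 * m2 / (m1 + m2)


def isotope_shifted_wavenumber(wavenumber, masses, labelled_masses):
    """Estimate a band position after isotope substitution.

    Harmonic diatomic approximation with an unchanged force constant:
    the wavenumber is proportional to 1 / sqrt(reduced mass).
    """
    mu = reduced_mass(*masses)
    mu_labelled = reduced_mass(*labelled_masses)
    return wavenumber * math.sqrt(mu / mu_labelled)


# Illustrative case: a band near 1650 cm-1 treated as a pure C=O stretch,
# with 12C replaced by 13C (integer masses used as an approximation).
unlabelled = 1650.0
labelled = isotope_shifted_wavenumber(unlabelled, (12.0, 16.0), (13.0, 16.0))
print(f"12C=O at {unlabelled:.0f} cm-1 -> 13C=O near {labelled:.0f} cm-1")
print(f"estimated downshift: {unlabelled - labelled:.1f} cm-1")
```

Under these assumptions the estimated downshift for a 13C-labelled carbonyl is roughly 35 to 40 cm-1, which is enough to move the band of a labelled residue clear of the main amide I contour. The same function can be applied to 1H-2H exchange by passing, for example, (14.0, 1.0) and (14.0, 2.0) for an N-H group.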

7. Theoretical Analysis of Protein Infrared Spectra

Theoretical analysis of infrared spectra has been vital for understanding the relationship between molecular structure and infrared spectra. This approach generally involves calculating the vibrational frequencies of a molecule from its molecular structure and correlating them with experimental data. Although this approach has been successful with small molecules, it has been more challenging with complex molecules such as peptides, proteins and nucleic acids. Herzberg (see Fig. 11) was one of the first to use theoretical methods for the analysis of infrared vibrations [203]. He was awarded the Nobel Prize in Chemistry in 1971 and had a profound effect on a later generation of infrared spectroscopists, especially those at the National Research Council, Ottawa, Canada, including Henry Mantsch (see Fig. 11). In a recent communication, Henry Mantsch stated the following regarding the views of Herzberg, compared to others, on the application of infrared spectroscopy to the analysis of biological systems:

Until the end of his long life Herzberg was often wondering how far we had taken his molecular spectroscopy, unlike some of his purist colleagues who accused me of having sacrificed infrared spectroscopy on the altar of biology.

Scientists who pioneered the theoretical analysis of infrared spectra of complex polymers, with relevance to biological systems, include Sutherland [204], Miyazawa [205] and Krimm [206]. Miyazawa's work on the normal vibrations of N-methylacetamide has been important for understanding the infrared spectra of peptides and proteins [207]. In an article published in 1950 [204], Sutherland summarised the key problems encountered in the theoretical analysis of large molecules:

Figure 11. Photograph of Henry Mantsch discussing energy levels with Herzberg on an old fashioned blackboard. Photograph courtesy of Henry Mantsch.

During the past ten years or so infra-red spectroscopy has changed from a branch of molecular physics dealing with the structural features of the smaller, and generally inorganic, molecules to a subject of considerable interest to the organic chemist and the biochemist, and of growing interest to the biologist. Whereas it is possible to give a complete analysis of the spectrum of the ammonia molecule, from which the bond lengths and angles can be deduced with an accuracy of 1 part in 1000 and the height of the potential barrier inhibiting inversion is known to within a few per cent, the most that can be stated at present about a large organic molecule or polymer is that it does or does not contain certain chemical groups, and even this statement has frequently to be hedged about with qualifications.

He concluded his article [204] by identifying some key areas where much work needs to be carried out:

(i) Intermolecular forces (including hydrogen bonding) encountered in solution, in the liquid state and in solids, both amorphous and crystalline.
(ii) Weak intramolecular forces such as occur in internal hydrogen bonding and in restricted rotation about a single bond.
(iii) Changes of charge distribution in characteristic groups caused by the presence of strongly electronegative or electropositive groups in certain positions relative to the group under consideration.
(iv) Changes in group force constants due to changes in bond hybridization, arising from different environments of a particular group in different molecules.

The effects of all these factors on the intensities as well as on the positions of characteristic frequencies must be studied. Finally the need for more theoretical work on the determination of the force constants governing the frequencies of characteristic groups must be stressed. In the last analysis it is the force constants and not the frequencies which are the fundamental physical constants of groups of atoms in large molecules.

Figure 12. Photograph showing Yuri Chirgadze with visiting guests in a laboratory at the Institute of Protein Research during the IV International Congress of Biophysics, Moscow, 1972. From left to right: post graduate student of Harvard Medical School Ms B. Doyle and Prof. E. Blout (Boston, USA), Dr. Y. Chirgadze, wife of Prof. E. Blout, and Prof. S. Krimm (USA). Photograph courtesy of Yuri Chirgadze.

These recommendations of Sutherland [204] are as important today as they were nearly sixty years ago when he wrote the above statements. Samuel Krimm (see Fig. 12), who continues to be active in the field to this day, worked with Sutherland at Michigan on the theoretical analysis of infrared spectra of high polymers from the early 1950s. In a recent communication, he summarised some of his key contributions to this field (references removed):

My own efforts in biological IR spectroscopy began during my spectroscopic studies on synthetic polymers, initiated during a postdoc with Gordon Sutherland at Michigan starting in 1950. My early efforts to relate spectra to conformation dealt with extensions of the Miyazawa and Blout interaction treatment of amide modes. I also presented spectral evidence for C-H···O hydrogen bonding in polyglycine II. But I then began to feel that the only solid way to correlate spectra and structure was through normal mode calculations, and since this required a good force field we began to prepare the way with an analysis of a model system, N-methylacetamide, choosing to develop a valence force field rather than the Urey-Bradley field being used by Shimanouchi and collaborators. From this emerged our first polypeptide calculations, the concept of transition dipole coupling to account for amide I splittings, and the beginning of a series of some 50 detailed papers on peptides and polypeptides, part of which is summarized in the review article in Adv. Protein Chem. that tried to emphasize the power of vibrational spectroscopy in studying protein structure. Despite my recent work on developing spectroscopically accurate molecular mechanics energy functions, I am very pleased that we have come up with a new idea for using spectra to determine conformation: the sensitivity of the C(alpha)-D(alpha) stretch frequency to the phi,psi angles at the C(alpha) atom, which led to confirmation of the presence of C-H···O(water) hydrogen bonding in aqueous solution.

Advances in computational approaches for calculating vibrational spectra of biomolecules will not only aid a better understanding of spectra-structure relationship but will also enable the determination of macromolecular 3D structures at high resolutions. Chapters by Kubelka, Bour and Keiderling and by Choi and Cho in this book discuss the latest advances in the theoretical analysis of protein spectra.

8. Growth of Biological Infrared Spectroscopy

8.1. Biological Infrared Spectroscopy from an International Perspective

This section provides a brief historical account of key developments in the application and spread of biological infrared spectroscopy at the international level. The main countries involved in laying the foundation of biological infrared spectroscopy were the USA, the UK, France and Japan, with major contributions occurring from the 1940s through to the 1960s. Some of the contributions made by scientists from these countries have been highlighted earlier in this Chapter. From the 1960s other countries also entered the field, including Germany and the USSR, and scientists from these countries made significant contributions to the development of biological infrared spectroscopy. One of the key pioneers in protein infrared spectroscopy from the USSR was Yuri Chirgadze, who had a significant impact on infrared studies of proteins and published his first paper in this area in 1961 [208]. In a recent communication, he stated the following regarding how he first started applying infrared spectroscopy to biological studies (references removed):

Last half of 1958 I was a graduate student and started to work in the laboratory of Dr Natalia S. Andreeva, who was a pioneer in solving X-ray crystal structure of small biological molecules (dipeptides), and then she was the first Soviet scientist who began to solve X-ray structure of a globular protein, pepsin. That time she offered me to apply IR spectroscopy to analyze the structure of model peptides related to structural genesis of a collagen molecule, i.e. oligopeptides containing imino acids proline and hydroxiproline. This was a theme of my University Diploma Thesis. As far as I know, it was the first attempt to apply method of IR spectroscopy for studying peptides in the USSR.
A result of this work was published in my first scientific paper in 1961 in Russian journals: Physical Chemistry and Biophysics. The instrumental apparatus for this work was a single beam prism IR spectrometer IKS-1 manufactured serially by the optical and mechanical corporation LOMO in Leningrad.

Figure 12 shows a photograph of Chirgadze with some visiting scientists from the USA, including leading pioneers in infrared spectroscopy of proteins, Krimm and Blout. Chirgadze also commented on how the support of science in the former USSR, the ability to purchase foreign instruments, and interactions with scientists from other countries played a valuable role in their research activities:

infrared spectroscopy in the Soviet Union was intensively developed both experimentally and theoretically. At that time the biological science was strongly supported by the government. And the striking example of this was the foundation of a number of specific small scientific towns around Moscow, Leningrad (today Saint-Petersburg), Novosibirsk and other cities. During all that time, we had an essential financial support for purchasing modern commercial spectrophotometers and other special equipment from the Soviet Union, Germany (DDR), USA, Japan etc. It is very important that we could present our results at many conferences and congresses in the Soviet Union and abroad. I was personally introduced to famous scientists in the field Prof L. Pauling (USA), Prof. J. Watson and Prof. F. Crick (USA), Prof. E. Blout (USA), when they visited the Soviet Union and Prof. T. Miyazawa (Japan) when I visited shortly Osaka University in 1970.

As one would expect, the use of Fourier transform instruments was largely determined by the accessibility and affordability of these spectrometers. It is no coincidence that the first use of FT instruments by the academic community involved close interaction with instrument manufacturers. The manufacturers were keen to sell their machines by highlighting the power of the new FT instruments, and in this they found a natural ally in research scientists who were keen to analyse their difficult samples using these modern machines. As already mentioned, the first commercial FT instruments were produced by Digilab. Hence, most of the first FTIR studies on biological molecules were conducted using Digilab machines. Furthermore, because the company was based in Boston (USA), the first infrared studies on biological systems using FT instruments were carried out by scientists based in the USA. For example, Alben, who published the first FTIR paper on a protein molecule, used a Digilab FTS14 instrument (see Fig. 8). Rothschild, who carried out the first FTIR study on a membrane protein, also used a Digilab instrument. After the scientists from the USA, it was the Canadians who were quick to use FT instruments for biological studies. The European scientific community entered the field of infrared spectroscopy of biological systems using FT instruments some 6-8 years after the North Americans. The first biological application was reported by Siebert, Mäntele and Kreutz in 1980 [209] from the University of Freiburg in West Germany.
The biological FTIR spectroscopy community based in Freiburg, Germany, led the field in Europe at that time. In the UK, Chapman and Belton were among the first to use FT instruments for the analysis of biological molecules [75,210,211]. Although papers dealing with infrared studies on biological systems were being published by scientists in the former USSR, there are no reports on the use of FT instruments for biological studies until well after the end of communism. One of the first applications of FTIR instruments to biological studies in Japan was reported in 1978 [212]. The authors of that study used FTIR spectroscopy to analyse plasma proteins adsorbed on polymer surfaces. Japan continues to be the leading country in Asia in the field of biological spectroscopy. Indeed, Japanese scientists have played a significant role in advancing the theoretical and practical aspects of biological infrared spectroscopy since the late 1940s. Amongst the Asian countries, biological infrared spectroscopy has become more common in China.

8.2. Growth of Conferences Focusing on Biological Infrared Spectroscopy

The late 1980s and the 1990s were a particularly exciting time to be involved in infrared spectroscopy of biomolecules. Several international conferences started in the 1980s and became venues for dialogue, debate and dissemination of the latest advances in the field. One such meeting, which attracted many infrared spectroscopists engaged in biological studies, was the European Conference on the Spectroscopy of Biological Molecules (ECSBM). The first ECSBM was held in Reims in 1985 and the latest one (the XIII in the series) was held in Paris in 2007. Selected contributions from the last conference were published in Spectroscopy - Biomedical Applications. Currently, many other conferences have proliferated, and some focus exclusively on infrared spectroscopy.

8.3. The Infrared Spectroscopy Literature

That infrared spectroscopy had already entered the realm of the life sciences is clearly evident from the number of publications that appeared between the 1930s and the 1950s in journals devoted to advances in subjects such as biochemistry, physiology and microbiology. A significant number of early biological infrared papers were published in journals such as Proceedings of the Royal Society of London, Discussions of the Faraday Society, Biochemical Journal, Nature, Science, Applied Spectroscopy, Journal of Biological Chemistry, Bulletin of the Chemical Society of Japan, Journal of the American Oil Chemists' Society, Journal of Bacteriology and The Journal of Chemical Physics. At present, many of the biological applications of infrared spectroscopy are published in journals such as Biochemistry, Biochimica et Biophysica Acta, Applied Spectroscopy, Biophysical Journal, European Journal of Biochemistry and FEBS Letters. With further growth of biological infrared spectroscopy, specialised journals devoted to this area started to appear. For example, Vibrational Spectroscopy appeared in 1990, followed by Biospectroscopy in 1995. Existing journals also changed their focus to meet the growth in biological applications and keep pace with the changing times. For example, one of the authors (PIH) of this article took over the editor-in-chief role of Spectroscopy - An International Journal and changed its focus towards publication of biological and biomedical applications. Continuing this development, the name of the journal has recently been modified to Spectroscopy - Biomedical Applications. Modern biological infrared spectroscopy is still a relatively young and highly specialised field. Nevertheless, biological infrared spectroscopy publications are among the highly cited papers in the literature.
The most cited infrared spectroscopy paper dealing with the analysis of a biological molecule is related to the Nobel Prize winning work by Prusiner on the prion protein [213], although this paper is not exclusively on infrared spectroscopy. The most highly cited paper dealing exclusively with biological infrared spectroscopy is the Byler and Susi paper [92], in which they estimated the secondary structure of 21 proteins from their deconvolved spectra. Two other papers exclusively on biological infrared spectroscopy are also highly cited. Whilst being an excellent source of information for the infrared spectroscopist, both of these papers are critical of the misuse of infrared spectroscopy. The more cited of the two (570 citations) is by Surewicz, Mantsch and Chapman [214] and deals with the need for caution in protein secondary structure analysis by FTIR spectroscopy. The second (525 citations), published two years later by Mantsch and Jackson [215], is a more detailed article discussing the use and misuse of infrared spectroscopy in the determination of protein structure. These papers from Mantsch's group have been cited far more often than their other FTIR papers, the closest being an earlier review [216] that focused on the potential offered by infrared spectroscopy for biological studies (over 470 citations). The critical assessment articles by Mantsch, Chapman and co-workers were particularly important at a time when an increasing number of people were entering the field and some of the new and unwary users needed guidance not only on the potential of the technique, but also on the problems and pitfalls that one should be aware of.


A. Barth and P. Haris / Infrared Spectroscopy Past and Present

9. Further Reading

Studying the history of infrared spectroscopy is truly fascinating, and it is easy to become immersed in past arguments and the ingenious experiments of our scientific ancestors. For those interested in this experience, some of the original articles from the 19th century are available via AB's web site (http://www.dbb.su.se/Faculty/Andreas_Barth). Besides the cited original literature, this overview relies on previous compilations. We found refs. [4,8,10,19] particularly valuable. The book chapter by R.N. Jones has been republished as a series of three articles [217-219]. Accounts of the work by Herschel [6], Melloni [15,25], Langley [26] and Coblentz [32,220] are also available.

Jones labels the time period from 1960 to 1985 "the age of the acronym". Acronyms are still used extensively and Jones' criticism is well worth consideration: "extreme specialisation has encouraged the use of new jargons which make it increasingly difficult for analytical spectroscopists to communicate effectively across these self-erected barriers. The custom has also developed to identify these narrow fields by acronyms or lettered abbreviations, often to the extent of using them exclusively and dropping the descriptive name, even in the titles of publications" [8]. He demonstrates his point by listing 64 acronyms, some of which have several meanings (e.g. MIR stands for multiple internal reflection or mid infrared) and others are confusingly similar (e.g. FT-IR for Fourier transform infrared and FTIR for frustrated total internal reflection).

Infrared spectroscopic work on biological systems was reviewed as early as 1940 by Loofbourow [221] in an article titled "Borderland problems in biology and physics". Interestingly, the author decided not to use the title "biophysics" instead, because this term did not self-evidently include the use of physical methods at that time. This has changed, and the expression "biophysics" nowadays means physical methods and physical principles applied to biology and biochemistry, as suggested by Loofbourow [221]. Early work on proteins and their constituents was reviewed in 1952 by Sutherland [222]. The history of biological applications was summarised by Durig and Gerson in 1979 [223], that of lipids by Mantsch in 1998 [46] and that of near-infrared spectroscopy by McClure in 2003 [21].

10. Future Directions

H.A. Laitinen wrote an editorial in Analytical Chemistry in 1973 [224] in which he drew an analogy between Shakespeare's seven ages of man and developments in analytical techniques. Using infrared spectroscopy as an example, he stated that this technique had reached its final stage, the seventh age (see preface). This judgement is far from reality, as is evident from some of the very recent advances in technology that enable the use of infrared spectroscopy to probe complex biological systems at the level of a single residue, which was previously unimaginable. Future advances will not be restricted to improvements in instrumentation and associated accessories; advances in chemistry and molecular biology will also be harnessed to enable more challenging problems to be addressed. This will include site-specific isotopic labelling of proteins and peptides for the detailed elucidation of structure and macromolecular interactions. There are exciting advances in synchrotron radiation infrared spectroscopy, which are likely to herald a renaissance in biological infrared spectroscopy. In conjunction with computational and statistical tools, infrared spectroscopy will extend its power in the detection, identification, quantification and classification of molecular changes in complex systems and environments at high resolution. It will enable highly detailed fingerprinting of diverse chemical components in complex systems such as cells, tissues and whole organisms. Infrared spectroscopy is likely to play a central role in the rapidly emerging fields of systems biology, metabolomics and proteomics. It may not be long before we see infrared spectroscopy used as a diagnostic tool from the clinic to the bedside.

Finally, we would like to end our article by quoting R.D.B. Fraser (see Fig. 6), one of the early pioneers in biological infrared spectroscopy, who also used X-ray diffraction. Whilst based at King's College, London, Fraser made a significant contribution to the elucidation of DNA structure, which is now widely recognised [225]. One of us (PIH) asked him if he considered infrared spectroscopy as second fiddle to X-ray diffraction. His response to the question is given below:

"Infrared spectroscopy has always been a powerful tool in the analysis of materials of unknown structure and was used extensively in the study of synthetic polypeptides and fibrous proteins to assess the proportions of alpha helix, beta structure and coil. Ambrose and Elliott were the pioneers in this application and also in the use of polarized radiation to determine the orientation of the chain axes in the alpha and beta sections. About the same time I found, using a high resolution spectrometer that Bill Price and I had developed, that the NH stretching frequency in collagen was about 30 wave numbers higher than in other fibrous proteins confirming that the conformation of the polypeptide chain was substantially different from that of the alpha or beta forms. At that time the precise conformation of the polypeptide chains in any protein were unknown and infrared studies were a valuable tool in the construction of plausible models. Remember that it is still only possible to obtain electron density maps at atomic resolution from X-ray diffraction patterns for crystalline proteins. With the fibre-type X-ray patterns obtained from fibrous proteins such as muscle, tendon and hair, trial and error methods still have to be used. The search for the structure of feather keratin started in 1932 with the observation by Astbury and Street that the X-ray pattern resembled that of the stretched wool (beta) rather than wool (alpha) and evidence from polarized ir studies played a vital part in the development of a model that accounted for the broad features of the X-ray diffraction pattern. Even today our latest model still relies on the IR study that showed that feather keratin is based on the anti parallel chain pleated sheet. So I would say complementary rather than second-fiddle."

11. Acknowledgements and Apologia

Owing to limitations of time and space, it has not been possible to cover the work of all the scientists who have made important contributions to the field of biological infrared spectroscopy. We consider this article a work in progress and hope to publish more in this area in the future, including key contributions that we may have omitted. In this context, we welcome any advice, comments, information and corrections from our readers. In order to avoid confusion, we have deleted references within quotes from various scientists and within sections taken from the published literature.

AB would like to thank the staff of the Chemistry library of Stockholm University for their professional help with the endless requests for external loans. PIH would like to thank RDB Fraser, Dieter Gruen, Juana Bellanato, Samuel Krimm, James Alben, Jack Koenig, Henry Mantsch, Yu Chirgadze, Michael D. Byler, Kenneth Rothschild, Juan Gomez-Fernandez and James Mattson for taking their valuable time to answer many questions about their early contributions to biological infrared spectroscopy and, in some cases, for providing photographs. We would also like to thank Pat Ashton, Senior Communications Specialist at the Heritage Exhibit, for providing the photograph of the Beckman IR-1 spectrometer.

References
[1] W. Herschel, Philos. Trans. Roy. Soc. London 90 (1800), 255-283.
[2] W. Herschel, Philos. Trans. Roy. Soc. London 90 (1800), 293-326.
[3] W. Herschel, Philos. Trans. Roy. Soc. London 90 (1800), 284-292.
[4] Y.M. Rabkin, Isis 78 (1987), 31-54.
[5] Hutchinson dictionary of scientific biography. Helicon Publishing, Abingdon, 2005.
[6] E.S. Barr, Infrared Phys. Technol. 1 (1961), 1-4.
[7] Encyclopaedia Britannica Online Version.
[8] R.N. Jones, Analytical applications of vibrational spectroscopy - a historical review, in: Chemical, biological and industrial applications of infrared spectroscopy, ed. J.R. Durig, John Wiley & Sons, Chichester, 1985, 1-50.
[9] A.M.C. Davies, Spectrosc. Eur. 12 (2000), 10-16.
[10] E.S. Barr, Am. J. Phys. 28 (1960), 42-54.
[11] W. Herschel, Philos. Trans. Roy. Soc. London 90 (1800), 437-538.
[12] H. Chang and S. Leonelli, Stud. Hist. Phil. Sci. 36 (2005), 477-508.
[13] T. Chester, Reconciling the Herschel experiment, http://home.znet.com/schester/calculations/herschel/index.html.
[14] D. Purves, S.M. Williams, S. Nundy, and R.B. Lotto, Physiol. Rev. 111 (2004), 142-158.
[15] K. Hentschel, NTM 13 (2005), 216-237.
[16] V.Z. Williams, Rev. Sci. Instrum. 19 (1948), 135-178.
[17] H. Chang and S. Leonelli, Stud. Hist. Phil. Sci. 36 (2005), 686-705.
[18] E.S. Barr, Phys. Teach. 5 (1967), 53-60 (reprint of Appl. Opt. 2 (1963), 639).
[19] R.A. Smith, F.E. Jones, and R.P. Chasmar, The detection and measurement of infra-red radiation. Clarendon Press, Oxford, 1968.
[20] H.M. Randall, Rev. Mod. Phys. 10 (1938), 72-85.
[21] W.F. McClure, J. Near Infrared Spectrosc. 11 (2003), 487-518.
[22] J.F.W. Herschel, Philos. Trans. Roy. Soc. London 130 (1840), 1-59.
[23] C.A. Gueymard, D. Myers, and K. Emery, Sol. Energy 73 (2002), 443-467.
[24] L. Nobili, Ann. Phys. Chem. 20 (Vol. 96 of the whole series) (1830), 245-252 (German translation from Bibliothèque universelle 44 (1830), 225).
[25] E.S. Barr, Infrared Phys. 2 (1962), 67-73.
[26] E.S. Barr, Infrared Phys. 3 (1963), 195-206.
[27] M. Melloni, Ann. Chim. Phys. 55 (1833), 337-397.
[28] M. Melloni, Ann. Phys. Chem. 35 (1835), 385-413 and 529-578 (German translation of Ann. Chim. Phys. 55 (1833), 337-397).
[29] M. Melloni, Taylor's Sci. Mem. 1 (1837), 39-74 (English translation of Ann. Chim. Phys. 55 (1833), 337-397).
[30] A.F. Svanberg, Ann. Phys. 160 (1851), 411-418.
[31] A. Rogalski, Infrared Phys. Technol. 43 (2002), 187-210.
[32] R.N. Jones, Appl. Opt. 2 (1963), 1090-1097.
[33] M. Davies, Infra-red spectroscopy and molecular structure. Elsevier, Amsterdam, 1963.
[34] H. Gershinowitz and E.B. Wilson Jr., J. Chem. Phys. 6 (1938), 197-200.
[35] F.A. Miller, Anal. Chem. 64 (1992), 824A-831A.
[36] B. Schrader, Nachr. Chem. Tech. Lab. 47 (1999), 1019-1022.
[37] N. Sheppard, Anal. Chem. 64 (1992), 877A-883A.
[38] P.A. Wilks Jr., Anal. Chem. 64 (1992), 833A-838A.
[39] W.P. Jencks, Methods Enzymol. 6 (1963), 914-928.
[40] Inflation calculator, http://inflationdata.com/Inflation/Inflation_Calculators/Inflation_Rate_Calculator.asp.


[41] W. de W. Abney and E.R. Festing, Philos. Trans. Roy. Soc. London 172 (1881), 887-918.
[42] K. Ångström, Öfv. Kongl. Vet. Akad. Förh. 47 (1890), 331-352.
[43] W.W. Coblentz, Astrophys. J. 20 (1904), 207-223.
[44] W.W. Coblentz, Phys. Rev. (Series 1) 20 (1905), 273-291 and 337-363.
[45] K.F. Luft, Angew. Chem. 19 (1947), 2-12.
[46] H.H. Mantsch, Chem. Phys. Lipids 96 (1998), 3-7.
[47] J. Lecomte, Le rayonnement infrarouge. Tome II. La spectrométrie infrarouge et ses applications physico-chimiques. Gauthier-Villars, Paris, 1949.
[48] R.B. Barnes, R. Perkin, J.A. Sanderson, and M.E. Warga, Physics Today June (1966), 115-117.
[49] V.Z. Williams, Appl. Spectrosc. 6 (1951), 3-29.
[50] L. Nobili and M. Melloni, Am. J. Sci. 23 (1833), 185-190 (English translation of Ann. Chim. Phys. 48, 198).
[51] E.F. Nichols, Phys. Rev. 1 (1893), 1-18.
[52] F. Rcker, Pflügers Arch. Eur. J. Physiol. 231 (1933), 742-749.
[53] E.R. Blout and R.C. Mellors, Science 110 (1949), 137-138.
[54] R. Stair and W.W. Coblentz, J. Res. Natl. Bur. Stand. 15 (1935), 295-316.
[55] E. Heintz, C. R. Acad. Sci. 201 (1935), 1478-1480.
[56] N. Wright, J. Biol. Chem. 120 (1937), 641-646.
[57] F. Vlès and E. Heintz, Comptes Rendus 200 (1935), 1927-1929.
[58] S.E. Darmon and G.B.B.M. Sutherland, J. Am. Chem. Soc. 69 (1947), 2074.
[59] R.H. Gillette and F. Daniels, J. Am. Chem. Soc. 58 (1936), 1139-1142.
[60] R.C. Herman, J. Chem. Phys. 8 (1940), 252-258.
[61] D. Williams and L.H. Rogers, J. Am. Chem. Soc. 59 (1937), 1422-1423.
[62] E.R. Blout and M. Fields, Science 107 (1948), 252.
[63] R.F. Furchgott, H. Rosenkrantz and E. Shorr, J. Biol. Chem. 163 (1946), 375-386.
[64] R.N. Jones (Editor), Computer programs in infrared spectrophotometry. NRC Bull. 11, 12 (1968).
[65] A. Savitzky and M.J.E. Golay, Anal. Chem. 36 (1964), 1627-1639.
[66] J.S. Mattson, Anal. Chem. 49 (1977), 470-478.
[67] L.Y. Fager and J.O. Alben, Biochemistry 11 (1972), 4786-4792.
[68] D.G. Cameron and H.H. Mantsch, Biochem. Biophys. Res. Commun. 83 (1978), 886-92.
[69] S.M. Greenwald, A.J. Hancock, H.Z. Sable, L. D'Esposito and J.L. Koenig, Chem. Phys. Lipids 18 (1977), 154-69.
[70] D. Chapman, J.C. Gomez-Fernandez, F.M. Goni and M.J. Barnard, Biochem. Biophys. Methods 2 (1980), 315-323.
[71] J.S. Mattson, C.A. Smith and K.E. Paulsen, Anal. Chem. 47 (1975), 736-738.
[72] J.K. Kauppinen, D.J. Moffatt, H.H. Mantsch and D.G. Cameron, Appl. Spectrosc. 35 (1981), 271-276.
[73] H. Susi and D.M. Byler, Biochem. Biophys. Res. Commun. 115 (1983), 391-397.
[74] J.M. Olinger, D.M. Hill, R.J. Jakobsen and R.S. Brody, Biochim. Biophys. Acta 869 (1986), 89-98.
[75] P.I. Haris, D.C. Lee and D. Chapman, Biochim. Biophys. Acta 874 (1986), 255-265.
[76] R. Barer, Discussions of the Faraday Society (1950), 369-378.
[77] Aschkinass, Ann. der Phys. 55 (1895), 401-431.
[78] W.W. Coblentz, J. Franklin Inst. 172 (1911), 309.
[79] A.M. Buswell, K. Krebs and W.H. Rodebush, J. Am. Chem. Soc. 59 (1937), 2603-2605.
[80] J.W. Ellis and J. Bath, J. Chem. Phys. 6 (1938), 723-729.
[81] R.C. Gore, R.B. Barnes, and E. Petersen, Anal. Chem. 21 (1949), 382-386.
[82] H.J. Lenormant, J. Physiol. (Paris) 42 (1950), 639-640.
[83] E.R. Blout and H.J. Lenormant, J. Opt. Soc. Am. 43 (1953), 1093-1095.
[84] F.S. Parker, Appl. Spectrosc. 12 (1958), 163-166.
[85] F.S. Parker and D.M. Kirschenbaum, Nature 187 (1960), 386-388.
[86] H. Susi, S.N. Timasheff and L. Stevens, J. Biol. Chem. 242 (1967), 5460-5466.
[87] H. Susi and D.M. Byler, in: Methods for protein analysis, by John P. Cherry and Robert A. Barford, American Oil Chemists' Society, 1988, 235-255.
[88] J.L. Koenig and D.L. Tabb, in: Analytical Applications of FT-IR to Molecular and Biological Systems, ed. J.R. Durig, D. Reidel, Boston, 1980, 241-255.
[89] D. Chapman and F.M. Goni, in: The Lipid Handbook, by F.D. Gunstone, John L. Harwood and Fred B. Padley, CRC Press, 1994, 487-504.
[90] D.L. Tabb, Ph.D. Thesis, Case Western Reserve University, 1974.
[91] J.L. Koenig and M.K. Antoon, Appl. Opt. 17 (1978), 1374-1385.
[92] D.M. Byler and H. Susi, Biopolymers 25 (1986), 469-487.
[93] A. Dong, P. Huang and W.S. Caughey, Biochemistry 29 (1989), 3303-3308.


[94] D.C. Lee, P.I. Haris, D. Chapman and R.C. Mitchell, Biochemistry 29 (1990), 9185-9193.
[95] F. Dousseau and M. Pezolet, Biochemistry 29 (1990), 8771-8779.
[96] M. Severcan, F. Severcan and P.I. Haris, J. Mol. Struc. 565-566 (2001), 383-387.
[97] J.A. Hering, P.R. Innocent and P.I. Haris, Spectroscopy - An Int. J. 16 (2002), 53-69.
[98] S.E. Darmon, G.B.B.M. Sutherland, and G.R. Tristram, Biochem. J. 42 (1948), 508-516.
[99] M.M. Stimson and M.J. O'Donnell, J. Am. Chem. Soc. 74 (1952), 1805-1808.
[100] U. Scheidt and H.Z. Rheinwein, Naturforsch. 7b (1952), 270.
[101] J. Fahrenfort, in: Molecular Spectroscopy, Proceedings IV Int. Meeting Bologna, 1959, A. Mangini, Ed. (Pergamon, London), 2 (1962), 437.
[102] N.J. Harrick, Ann. N.Y. Acad. Sci. 101 (1963), 928-959.
[103] B. Katlafsky and R.E. Keller, Anal. Chem. 35 (1963), 1665-1670.
[104] R.J. Scheuplein, J. Soc. Cosm. Chem. 15 (1964), 111-122.
[105] F.S. Parker and R. Ans, Anal. Biochem. 18 (1967), 414-422.
[106] R. Khurana and A.L. Fink, Biophys. J. 78 (2000), 994-1000.
[107] E. Arbely, I. Kass, and I.T. Arkin, Biophys. J. 85 (2003), 2476-2483.
[108] M. Jackson and H.H. Mantsch, Appl. Spectrosc. 46 (1992), 699-701.
[109] E. Goormaghtigh, V. Raussens and J.M. Ruysschaert, Biochim. Biophys. Acta 1422 (1999), 105-185.
[110] K.J. Rothschild and N.A. Clark, Biophys. J. 25 (1979), 473-487.
[111] M. van de Weert, P.I. Haris, W.E. Hennink and D.J.A. Crommelin, Anal. Biochem. 297 (2001), 160-169.
[112] H.W. Thompson, Nature 158 (1946), 234.
[113] R. Barer, A.R.H. Cole and H.W. Thompson, Nature 163 (1949), 198-201.
[114] K. Krishnan, American Chemical Society, Polymer Preprints, Division of Polymer Chemistry 25 (1984), 182-184.
[115] P. Rigler, W.P. Ulrich, R. Hovius, E. Ilegems, H. Pick and H. Vogel, Biochemistry 42 (2003), 14017-14022.
[116] H.B. Mark Jr. and B.S. Pons, Anal. Chem. 38 (1966), 119-121.
[117] W. Mäntele, A. Wollenweber, F. Rashwan, J. Heinze, E. Nabedryk, G. Berger and J. Breton, Photochem. Photobiol. 47 (1988), 451-455.
[118] D. Moss, E. Nabedryk, J. Breton and W. Mäntele, Eur. J. Biochem. 187 (1990), 565-572.
[119] M.G. Rockley, D.M. Davis and H.H. Richardson, Science 21 (1980), 918-920.
[120] A. Hartstein, J.R. Kirtley and J.C. Tsang, Phys. Rev. Lett. 45 (1980), 201-204.
[121] B. Chance, Rev. Sci. Instrum. 22 (1951), 619-627.
[122] A.J. White, K. Drabble and C.W. Wharton, Biochem. J. 306 (1995), 843-849.
[123] B.C. Dunn, J.R. Marda, and E.M. Eyring, Appl. Spectrosc. 56 (2002), 751-755.
[124] J. Mann and H.W. Thompson, Proc. Royal Soc. London. Series A 192 (1948), 489-497.
[125] A. Elliott and E.J. Ambrose, Discuss. Faraday Soc. 9 (1950), 246-251.
[126] M.J. Fraser and R.D.B. Fraser, Nature 167 (1951), 761-762.
[127] R.D.B. Fraser and T.P. MacRae, in: Conformation in Fibrous Proteins and Related Synthetic Polypeptides. Academic Press, Inc., New York, 1973.
[128] R.D.B. Fraser, J. Chem. Phys. 21 (1953), 1511-1515.
[129] H. Akutsu, Y. Kyogoku, H. Nakahara and K. Fukuda, Chem. Phys. Lipids 15 (1975), 222-242.
[130] D.F.H. Wallach and P.H. Zahler, Proc. Natl. Acad. Sci. 56 (1966), 1552-1559.
[131] A. Kukol, Spectroscopy - An Int. J. 19 (2005), 1-16.
[132] M.E. Riepe and J.H. Wang, J. Biol. Chem. 243 (1968), 2779-2787.
[133] W.J. Leonard, K.K. Vijai and J.F. Foster, J. Biol. Chem. 238 (1963), 1984-1988.
[134] J.O. Alben and W.S. Caughey, Biochemistry 7 (1968), 175-183.
[135] I. Noda, Bull. Am. Phys. Soc. 31 (1986), 520.
[136] I. Noda, Y. Liu and Y. Ozaki, J. Phys. Chem. 100 (1996), 8674-8680.
[137] Y. Tanimura and S. Mukamel, J. Chem. Phys. 99 (1993), 9496-9511.
[138] F. Fournier, E.M. Gardner, R. Guo, P.M. Donaldson, L.M.C. Barter, D.J. Palmer, C.J. Barnett, K.R. Willison, I.R. Gould and D.R. Klug, Anal. Biochem. 374 (2008), 358-365.
[139] C. Kolano, J. Helbing, M. Kozinski, W. Sander and P. Hamm, Nature 444 (2006), 469-472.
[140] F. Siebert and W. Mäntele, Biophys. Struct. Mech. 6 (1980), 147-164.
[141] M.S. Braiman, P.L. Ahl and K.J. Rothschild, in: Spectroscopy of Biological Molecules, eds. A.J. Alix, L. Bernard and M. Manfait (Wiley-Interscience, New York), 1985, 57-59.
[142] W. Uhmann, A. Becker, C. Taran and F. Siebert, Appl. Spectrosc. 45 (1991), 390-397.
[143] P.A. Anfinrud, C. Han, J.N. Moore, P.A. Hansen, and R.M. Hochstrasser, in: Ultrafast Phenomena VI, eds. T. Yajima, K. Yoshihara, C.B. Harris and S. Shionoya, Springer Verlag, Berlin, 1988, 442-446.
[144] R. Stair and W.W. Coblentz, J. Research Nat. Bur. Standards 15 (1935), 295-316.
[145] I.M. Klotz, P. Griswold and D.M. Gruen, J. Am. Chem. Soc. 71 (1949), 1615-1620.
[146] S.E. Darmon and G.B.B.M. Sutherland, J. Amer. Chem. Soc. 69 (1947), 2074.


[147] A. Elliott and E.J. Ambrose, Nature 165 (1950), 921-922.
[148] R.D.B. Fraser and W.C. Price, Nature 170 (1952), 490-491.
[149] D. Chapman, J. Chem. Soc. (1956), 55-60.
[150] A. Hvidt and S.O. Nielsen, Adv. Protein Chem. 21 (1966), 287-386.
[151] S. Meskers, J.M. Ruysschaert, and E. Goormaghtigh, J. Am. Chem. Soc. 121 (1999), 5115-5122.
[152] J. Lecomte, Le Spectre Infrarouge, Paris, 1928.
[153] P.C. Rao and B.F. Daubert, J. Am. Chem. Soc. 70 (1948), 1102-1104.
[154] O.D. Shreve, M.R. Heether, H.B. Knight and D. Swern, Anal. Chem. 22 (1950), 1498-1501.
[155] R.T. O'Connor, E.T. Field and W.S. Singleton, J. Am. Oil Chem. Soc. 28 (1951), 154-160.
[156] E.F. Binkerd and H.J. Harwood, J. Am. Oil Chem. Soc. 27 (1950), 60-62.
[157] P.I. Haris, Trends Biochem. 25 (2000), 104-105.
[158] N.K. Freeman, F.T. Lindgren, Y.C. Ng and A.V. Nichols, J. Biol. Chem. 203 (1953), 293-304.
[159] M.J. Barcelo and J. Bellanato, Anal. R. Soc. Esp. Fis. Quim. (Madrid) 49 (1953), 557.
[160] R.G. Sinclair, A.F. McKay and R.N. Jones, J. Am. Chem. Soc. 74 (1952), 2570-2575.
[161] S. Sunder, D. Cameron, H.H. Mantsch and H.J. Bernstein, Can. J. Chem. 56 (1978), 2121.
[162] H.H. Mantsch and R.N. McElhaney, Chem. Phys. Lipids 57 (1991), 213-226.
[163] C. Clark, Appl. Spectrosc. 6 (1951), 14-17.
[164] E.R. Blout and M. Fields, J. Biol. Chem. 178 (1949), 335-43.
[165] E.R. Blout and H. Lenormant, J. Opt. Soc. Am. 43 (1953), 1093-1095.
[166] R.D.B. Fraser, Nature 170 (1952), 491.
[167] G.B.B.M. Sutherland and M. Tsuboi, Proc. Roy. Soc. (London). Series A 239 (1957), 446-463.
[168] M. Falk, K.A. Hartman and R.C. Lord, J. Am. Chem. Soc. 85 (1963), 387-391.
[169] G.J. Thomas Jr., Biopolymers 7 (1969), 325-334.
[170] J. Liquier, M. Pinot-Lafaix, E. Taillandier and J. Brahms, Biochemistry 14 (1975), 4191-4197.
[171] J. Liquier, A. Mchami and E. Taillandier, J. Biomol. Struct. Dyn. 7 (1989), 119-26.
[172] E. Taillandier and J. Liquier, Methods Enzymol. 211 (1992), 307-335.
[173] D.C. Malins, N.L. Polissar and S.J. Gunselman, Proc. Natl. Acad. Sci. U.S.A. 94 (1997), 3611-3615.
[174] J.F. Arakawa, J.F. Neault and H.A. Tajmir-Riahi, Biophys. J. 81 (2001), 1580-1587.
[175] L.P. Kuhn, Anal. Chem. 22 (1950), 276-283.
[176] J.D.S. Goulden, Nature 177 (1956), 85-86.
[177] M. Kačuráková and R.H. Wilson, Carbohyd. Poly. 44 (2001), 291-303.
[178] J.D. Hardy and C.J. Muschenheim, J. Clin. Invest. 5 (1934), 817-831.
[179] J.D. Hardy and C.J. Muschenheim, J. Clin. Invest. 1 (1936), 1-9.
[180] W.E. Pauli and I. Ivancevic, Strahlentherapie 25 (1927), 532.
[181] C.H. Cartwright, J. Opt. Soc. Am. 20 (1930), 81.
[182] A.R. Pearson and R.E. Norris, Brit. J. Radiol. 6 (1933), 480.
[183] K. Dobriner, S. Lieberman, C.P. Rhoads, R.N. Jones, V.Z. Williams, RB. J. Biol. Chem. 172 (1948), 297-311.
[184] K. Dobriner, T.H. Kritchevsky, D.K. Fukushima, S. Lieberman, T.F. Gallagher, J.D. Hardy, R.N. Jones and G. Cilento, Science 109 (1949), 260-261.
[185] E.R. Blout and R.C. Mellors, Science 110 (1949), 137-138.
[186] H.P. Schwarz, H.E. Riggs, C. Glick, W. Cameron, E. Beyer, Jaffe and L. Trombetta, Proc. Soc. Exp. Biol. Med. 76 (1951), 267-72.
[187] H.M. Randall, D.W. Smith, A.C. Colm and W.J. Nungester, Am. Rev. Tuberc. 63 (1951), 372-380.
[188] M. Pollard, F.B. Engley Jr., R.F. Redmond, H.I. Chinn and R.B. Mitchell, Proc. Soc. Exptl. Biol. Med. 81 (1952), 10-11.
[189] S. Levine, H.J.R. Stevenson, L.A. Chambers and B.A. Kenner, J. Bacteriol. 65 (1953), 10-15.
[190] D. Naumann, Appl. Spectrosc. Rev. 36 (2001), 239-298.
[191] M. Jackson, M.G. Sowa and H.H. Mantsch, Biophys. Chem. 68 (1997), 109-125.
[192] J.T. Edsall, Cold Spr. Harb. Symp. Quant. Biol. 6 (1938), 40.
[193] A.M. Buswell and R.C. Gore, J. Phys. Chem. 46 (1942), 575-581.
[194] D.J. Fink, T.B. Hutson, K.K. Chittur and R.M. Gendreau, Anal. Biochem. 165 (1987), 147-54.
[195] R.B. Woodward and C.H. Schramm, J. Am. Chem. Soc. 69 (1947), 1551-1552.
[196] W.T. Astbury, C.E. Dalgliesh, S.E. Darmon and G.B.B.M. Sutherland, Nature 164 (1949), 440-441.
[197] P.I. Haris and D. Chapman, Biopolymers 37 (1995), 251-263.
[198] J.W. Ellis and B.W. Sorge, J. Chem. Phys. 2 (1934), 559.
[199] E. Oldfield, D. Chapman, and W. Derbyshire, FEBS Lett. 16 (1971), 102-104.
[200] L. Tadesse, R. Nazarbaghi and L. Walters, J. Am. Chem. Soc. 113 (1991), 7036-7037.
[201] P.I. Haris, G.T. Robillard, A.A. Van Dijk and D. Chapman, Biochemistry 31 (1992), 6279-6284.
[202] T.S. Anderson, J. Hellgeth and P.T. Lansbury Jr., J. Am. Chem. Soc. 118 (1996), 6540-6546.


[203] G. Herzberg, Molecular Spectra and Molecular Structure: II Infrared and Raman Spectra of Polyatomic Molecules, D. Van Nostrand Co. Inc., New York, 1945.
[204] G.B.B.M. Sutherland, Discuss. Faraday Soc. 9 (1950), 274-281.
[205] T. Miyazawa, T. Shimanouchi and S. Mizushima, J. Chem. Phys. 24 (1956), 408.
[206] S. Krimm, J. Mol. Biol. 4 (1962), 528-40.
[207] T. Miyazawa, J. Chem. Phys. 32 (1960), 1647-1652.
[208] Y.N. Chirgadze, L.A. Gribov, N.S. Andreeva and N.E. Shutzkever, Zhurnal Fizicheskoy Khemie (Moscow) 35 (1961), 755-760.
[209] F. Siebert and W. Mäntele, Eur. J. Biochem. 130 (1983), 565-73.
[210] D.C. Lee and D. Chapman, Biosci. Rep. 6 (1986), 235-256.
[211] P.S. Belton, R.H. Wilson and D.H. Chenery, Int. J. Biol. Macro. 8 (1986), 247-251.
[212] T. Matsui, S. Tanaka and T. Akaike, J. Bioeng. 2 (1978), 539-541.
[213] K.-M. Pan, M. Baldwin, J. Nguyen, M. Gasset, A. Serban, D. Groth, I. Mehlhorn, Z. Huang, R.J. Fletterick, F.E. Cohen and S.B. Prusiner, Proc. Natl. Acad. Sci. USA 90 (1993), 10962-10966.
[214] W.K. Surewicz, H.H. Mantsch and D. Chapman, Biochemistry 32 (1993), 389-394.
[215] M. Jackson and H.H. Mantsch, Crit. Rev. Biochem. Mol. Biol. 30 (1995), 95-120.
[216] W.K. Surewicz and H.H. Mantsch, Biochimica et Biophysica Acta 952 (1988), 115-130.
[217] R.N. Jones, Eur. Spec. News 70 (1987), 10-20.
[218] R.N. Jones, Eur. Spec. News 72 (1987), 10-20.
[219] R.N. Jones, Eur. Spec. News 74 (1987), 20-34.
[220] E.K. Plyler, Appl. Spectrosc. 16 (1962), 73-77.
[221] J.R. Loofbourow, Rev. Mod. Phys. 12 (1940), 267-358.
[222] G.B.B.M. Sutherland, Adv. Prot. Chem. 7 (1952), 291-318.
[223] J.R. Durig and D.J. Gerson, Historical survey of the infrared and Raman spectroscopic study of biological molecules, in: Infrared and Raman spectroscopy of biological molecules, ed. T.M. Theophanides, D. Reidel Publishing Company, Dordrecht, 1979, 35-43.
[224] H.A. Laitinen, Anal. Chem. 45 (1973), 2305.
[225] A.C. Steven and W. Baumeister, J. Struc. Biol. 145 (2004), 181-183.

Biological and Biomedical Infrared Spectroscopy
A. Barth and P.I. Haris (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-045-2-53


The Study of Protein Reactions by Reaction-Induced Infrared Difference Spectroscopy


Andreas BARTH¹
Department of Biochemistry and Biophysics, Stockholm University

Abstract. Reaction-induced infrared difference spectroscopy of proteins is reviewed. This technique enables detailed characterization of enzyme function at the level of single bonds of proteins, cofactors or substrates. The chapter discusses methods to initiate protein reactions in infrared samples, general aspects of spectra interpretation, measurements of enzyme activity, and studies of protein function using the Ca²⁺ pump as an example.

Keywords. FTIR, protein structure, protein function, SERCA, enzyme activity, ligand binding.

1. Difference Spectroscopy - the Technique of Choice to Monitor Single Bonds in Large Proteins Elucidating the molecular mechanism of protein reactions is a major challenge for the life science community. Applying infrared spectroscopy to the study of protein reactions combines several of its advantages: (i) high time resolution (< 1 s), (ii) universal applicability from small soluble proteins to large membrane proteins, (iii) the high molecular information content, and (iv) a sensitivity high enough to detect a change in bond strength of a single bond in a large protein. In favourable cases, a protein reaction can be observed in the infrared absorption spectrum [1-4]. In most cases however, the effects are too small to be obvious. In addition, the information provided by an absorption spectrum is limited because it is composed of many overlapping bands. The key to obtain detailed structural information is to reduce the number of groups that contribute to a spectrum. This can be done by difference techniques which are particularly suited for protein reactions. Infrared difference spectroscopy monitors the changes in infrared absorption associated with a reaction, or in other words, it records the infrared difference spectrum of the reaction. A difference spectrum can be obtained by carefully subtracting the spectrum of a sample where the protein is in a particular state A from a spectrum where it is in a state B, see for example refs. [1, 5-12]. However, the absorbance changes usually observed
1 Corresponding author: Andreas Barth, Department of Biochemistry and Biophysics, The Arrhenius Laboratories for Natural Sciences, Stockholm University, S-10691 Stockholm, Sweden; E-mail: barth@dbb.su.se

A. Barth / The Study of Protein Reactions by Reaction-Induced Infrared Difference Spectroscopy

for protein reactions are very small, on the order of 0.1% of the maximum absorbance. This is illustrated in Fig. 1. In consequence, subtracting spectra obtained from different protein samples does not generally allow the sensitive detection of the small absorbance changes between two protein states. Instead, the protein reaction of interest has to be initiated directly in the cuvette. This technique is termed reaction-induced infrared difference spectroscopy. For reviews see refs. [13-24].

In a typical experiment that employs reaction-induced difference spectroscopy, the protein is prepared in the stable state A and the absorbance of this state is measured. Then the reaction is triggered, the protein proceeds to state B and the absorbance is recorded again. Instead of a single state B, the protein may also adopt a sequence of transient states in the course of the reaction. In that case the interconversion between the product states B1, B2, etc. can be followed by time-resolved methods. From the spectrum recorded before the start of the reaction (state A) and the spectra recorded during and after the reaction (state(s) B), difference spectra are calculated. They originate only from those groups that are affected by the reaction. All passive residues are invisible in the difference spectrum, which means that the number of observed groups is dramatically reduced compared to the absorption spectrum. Therefore, a difference spectrum exhibits details of the reaction mechanism on the molecular level despite a large background absorption.
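
The subtraction described above can be sketched in a few lines of Python. This is only an illustration with synthetic numbers, not data from this chapter: the function names `gaussian` and `difference_spectrum`, the Gaussian band shape and all band parameters are assumptions, chosen so that the absorbance change is about 0.1% of the maximum absorbance.

```python
import math

def gaussian(x, center, height, width):
    """Gaussian band evaluated at wavenumber x (cm-1); shape is an assumption."""
    return height * math.exp(-((x - center) / width) ** 2)

def difference_spectrum(abs_a, abs_b):
    """Absorbance of state B minus absorbance of state A, point by point."""
    if len(abs_a) != len(abs_b):
        raise ValueError("spectra must share the same wavenumber axis")
    return [b - a for a, b in zip(abs_a, abs_b)]

wavenumbers = list(range(1500, 1801))  # cm-1, 1 cm-1 spacing

# State A: one large background band (hypothetical, amide I-like).
state_a = [gaussian(x, 1650, 1.0, 30) for x in wavenumbers]

# State B: identical background plus a tiny band from a single affected group.
state_b = [a + gaussian(x, 1710, 0.001, 8) for x, a in zip(wavenumbers, state_a)]

delta = difference_spectrum(state_a, state_b)
peak = max(delta)
peak_position = wavenumbers[delta.index(peak)]
relative_size = peak / max(state_a)  # size of the change relative to the background

print(peak_position, relative_size)
```

In this toy case the background band cancels completely and only the small band of the affected group remains in the difference spectrum, which is the situation sketched in Fig. 1.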

Figure 1. The need for reaction-induced difference spectroscopy. Panel (a) compares absorption (full line) and difference spectrum (dashed line) of a membrane protein (Ca2+-ATPase) in 2H2O. The absorption spectrum exhibits prominent bands due to the C=O stretching vibration of lipids (~1730 cm-1) and the amide I vibration of proteins (~1650 cm-1). The difference spectrum appears as a flat line when viewed on the same scale as the absorption spectrum, although it has been recorded for a reaction with relatively large absorbance changes. Panel (b) shows the same difference spectrum on a scale expanded 100-fold. Positive and negative bands in the difference spectrum are due to absorbance changes associated with the protein reaction. There is no obvious noise in the spectrum, demonstrating the sensitivity of infrared difference spectroscopy.

This chapter first discusses technical aspects of reaction-induced difference spectroscopy, then general aspects of spectra interpretation and finally two exemplary applications: enzyme activity measurements and the study of protein function.

2. Bottleneck 1 of Infrared Difference Spectroscopy: Concentrated Protein Samples Are Needed

One drawback of infrared spectroscopy of aqueous solutions is the strong absorption of water in the mid-infrared spectral region (near 1645 cm-1) [25], which overlaps the important amide I band of proteins and some side chain bands (see section 4.3). When these protein bands are of interest, the strong water absorption demands a short pathlength for aqueous samples in transmission experiments, typically around 5 µm. This implies a relatively high protein concentration for studies of protein reactions in order to be able to detect individual molecular groups in the spectrum. A desirable protein concentration is 1 mM or ~100 mg/ml, which is nearly as high as the protein concentration found in cells [26]. Using 2H2O, the pathlength can be increased to 50 µm and the concentration lowered because the water band is downshifted to ~1210 cm-1. Nevertheless, proteins often have to be concentrated for infrared spectroscopy, for example by partially drying the protein sample on an infrared window in a stream of nitrogen or in vacuum, by centrifugation and subsequent transfer of the pellet onto infrared windows, by direct centrifugation onto an infrared window, by incubation in an atmosphere of constant humidity defined by a saturated salt solution, or by micro-concentrators in the case of soluble proteins or solubilised membrane proteins. Cooling of the sample might help to maintain protein functionality during concentration.

An alternative to transmission experiments is the attenuated total reflectance (ATR) technique [19, 27-29].
Compared to transmission experiments, it avoids the handling problems caused by the required short pathlength. In an ATR experiment a sample is placed on a crystal. The total reflectance process at the crystal surface senses the absorption of the sample in a layer that extends about one wavelength away from the crystal surface, which means that the optical thickness of the sample is small enough for measurements of aqueous solutions. The sample is usually a protein film [28, 30-32], often prepared by drying, with a buffer solution above it. The thickness of the buffer layer does not influence the measured spectrum. The advantage of ATR spectroscopy is that the buffer can be exchanged, which makes sample manipulations relatively simple and the method very flexible. The disadvantage may be that it is difficult to prepare a stable film. An unstable protein layer is easily disturbed upon buffer exchange, which makes the recording of small absorbance changes impossible.

3. Bottleneck 2 of Infrared Difference Spectroscopy: Triggering Protein Reactions

A crucial problem in reaction-induced difference spectroscopy is how to trigger the protein reaction of interest. The number of methods for this has constantly increased in the last decade and the main approaches are summarised in the following.

Light-induced infrared difference spectroscopy was the first of these techniques to be applied to proteins, from the early 1980s onward [33-35]. Here, continuous illumination or a light flash induces a reaction in photosensitive proteins, like bacteriorhodopsin [14, 15,
36, 37] or photosynthetic reaction centres [38-41]. From spectra recorded before and during/after illumination, light-induced difference spectra can be calculated.

Concentration jump techniques are required to study the effects of ions or molecules on proteins. Three different approaches have been applied to generate a concentration jump in an infrared sample: (i) the ATR technique, (ii) the infrared variants of the stopped-flow and continuous-flow techniques, and (iii) the photolytic release of effector substances from biologically silent precursors (termed caged compounds).

ATR has the advantage that manipulation of medium composition is straightforward when a stable protein film can be prepared. For example, ligands can be added or the pH can be changed. Examples are studies of ligand binding to the nicotinic acetylcholine receptor [42, 43], to the gastric H+/K+-ATPase [44], to transhydrogenase [45], and of protein-protein interaction between transducin and rhodopsin [46]. A recent extension of the ATR technique separates a sample compartment close to the ATR crystal from a reservoir by a dialysis membrane. In this way the medium in the reservoir can be altered without disturbing the protein sample in the sample compartment [46-49]. A protein film is not required, making the technique also applicable to soluble proteins.

Rapid mixing techniques are difficult to apply in infrared spectroscopy because of the viscous consistency of a concentrated protein solution and the small pathlength of less than 10 µm for measurements in 1H2O. Nevertheless, successful applications of mixing devices have been reported [50-53], as reviewed recently [54].

A photolytically induced concentration jump can be achieved with photosensitive molecules that release a compound of interest upon illumination in the UV spectral range (300-350 nm). These molecules are termed caged compounds and have been used for 30 years to study biological reactions [55, 56].
In its caged form, the effector compound is modified such that it does not react with the protein of interest (see Fig. 2 for caged ATP). Photolysis of the caged compound leads to a sudden concentration jump of the free effector substance on the µs to ms timescale (< 10 ms for caged ATP of Fig. 2), which initiates the protein reaction. A recent extension of the approach uses helper enzymes to induce a series of consecutive reactions [57]. Binding of the effector substance to a protein and subsequent conformational changes alter the infrared spectrum [58]. In addition to protein and effector molecule bands, the photolysis reaction is reflected in the difference spectra. In infrared studies of proteins, caged nucleotides, caged Ca2+ (Nitr-5 or DMNitrophen) and caged electrons (described in the photoreduction section) [59] have been used most often. The studies have dealt with two main aspects, molecule-protein recognition and the molecular basis of enzyme function. Most of these studies have been done on the sarcoplasmic reticulum Ca2+-ATPase (see section 6), which was also the first enzyme to be studied with this technique [58]. Other proteins studied are alkaline phosphatase [60], annexin VI [61], DNaK [62], glutamate receptor [63], GroEL [64], kinases [65-67], Ras [68-70], and RecA [71].

Temperature- and pressure-jumps can be used to study folding and unfolding of proteins. The temperature jump is generated either by injecting a protein solution into a cuvette that is held at a different temperature (time resolution 100 ms) [72, 73], or by a laser pulse near 2 µm that is absorbed by the sample solvent 2H2O [74-77] (temperature maximum reached after 20 ns [74]), or by a visible laser pulse that excites a heat-transducing dye [78, 79]. Pressure-jump experiments [80, 81] require a cell that allows the sample to be set under pressures of several kbar and have an experimental dead time of about 20 s.

Figure 2. Infrared difference spectroscopy with caged compounds, using the example of caged ATP. (a) Photolysis of caged ATP. Caged ATP is modified at the γ-phosphate so that it does not react with the ATPase. Upon a UV flash (300-350 nm), the caged molecule photolyses, which leads to a sudden concentration jump of free ATP (< 10 ms). (b) Infrared difference spectra upon release of ATP from caged ATP in the presence and absence of protein. From a spectrum recorded before ATP release and spectra recorded after ATP release, difference spectra are calculated (absorbance after ATP release minus absorbance before ATP release). Bands in the difference spectra originate only from those groups that are affected by ATP release, with positive bands being characteristic of the state(s) after ATP release and negative bands characteristic of the initial state. All "passive" residues are invisible in the difference spectrum which, therefore, highlights what happens to the "active" residues and exhibits details of the molecular reaction mechanism despite a large background absorption. In addition to protein and ATP bands, the photolysis reaction is reflected in the difference spectra. Bold line: spectrum of caged ATP photolysis in the absence of protein. The main bands at 1524 and 1342 cm-1 have been assigned to the antisymmetric and symmetric stretching vibrations of the nitro group of caged ATP, respectively, and those below 1270 cm-1 to a diminution of electron density in the phosphate P-O bonds upon photolysis [58]. Further information can be found in refs. [82-87]. Thin line: ATP release in a Ca2+-ATPase sample. Two reactions contribute to the signals: (i) caged ATP photolysis and (ii) the transition of the Ca2+ loaded ATPase Ca2E1 to the E2P phosphoenzyme where Ca2+ has been pumped and released.
Tentative assignments of selected infrared difference bands are given. Prot: bands of protonated carboxyl groups due to protonation of acidic Ca2+ ligands upon Ca2+ release. The right and the left bands are characteristic of carbonyl groups with and without hydrogen bonding, respectively. Ca2+ release: two pairs of bands, each consisting of a positive and a negative component, originating from a change in absorption of carboxylate groups due to Ca2+ release; phosphate: band of the E2P phosphate group; conf: bands predominantly due to conformational changes of the protein backbone. Reprinted in modified form from [23]. © 2002 American Chemical Society.

Equilibrium electrochemistry can be used to initiate redox reactions of proteins. Infrared investigations became possible with the development of an ultra-thin-layer spectroelectrochemical cell suitable for protein investigations in aqueous solution [88]. The spectroelectrochemical cell permits control of the redox state of proteins in the infrared cuvette by applying a potential to a working electrode, which changes the potential of the sample volume that is probed by the infrared beam. By applying a potential step at the working electrode, a redox reaction can be triggered in the electrochemical cell. From the infrared absorption spectra before and after the potential step, a difference spectrum can be calculated that reflects only the redox reaction of the protein. The electrochemical cell is particularly useful when a protein contains several redox active cofactors with different midpoint potentials, as is often the case for proteins involved in photosynthesis and respiration. Since the method allows precise control of the sample redox state, it is possible to selectively induce the redox reaction of only one particular cofactor, i.e. to "dial-a-cofactor" [21]. In a study of photosystem I, for example, careful experimentation has avoided the oxidation of the abundant antenna chlorophylls and has allowed the exclusive titration of signals of the primary electron donor P700 (a chlorophyll dimer) [89]. While much of the initial work has been on photosynthetic proteins [21, 89-91], recent work has concentrated on cytochrome c oxidases [92-100]. More information on this method can be found in reviews [23, 24, 40, 100].

Photoreduction is a second way to induce redox reactions. It is based on photoexcitable electron donors like riboflavin [101, 102] or Ru2+ complexes [103], for which the expression "caged electron" has been coined [59].
Light absorption converts these systems directly or indirectly to a highly reducing state which can transfer one electron to a protein. The principle is similar to that of photosynthetic reaction centres, where photoexcitation of the primary donor produces a highly reducing excited state and initiates electron transfer reactions within the protein. Photoreduction has been used in studies of aa3 cytochrome c oxidase of Rhodobacter sphaeroides [59], bo3 [59, 104, 105] and bd [106] ubiquinol oxidases of Escherichia coli and cytochrome P450cam [103].

4. How to Make Sense of Infrared Difference Spectra

4.1. The Origin of Difference Bands

Bands appear in difference spectra of protein reactions for several reasons: the chemical structure might change (protonation/deprotonation, catalytic reactions), or the three-dimensional structure of the protein or cofactor might change. The former gives rise to a different absorption spectrum before and after the reaction. The latter changes vibrational coupling between neighbouring groups or the environment around particular functional groups, causing band shifts and changes in the absorption index. In many cases, negative bands in a difference spectrum are characteristic of the state before the reaction and positive bands of the state(s) after the reaction. This interpretation might be misleading in the case of a change in absorption index, since only the strength of absorption is affected but not the band position.
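
The two cases distinguished above, a band shift and a change in absorption index, can be compared in a short numerical sketch. It is purely illustrative: the Gaussian band shape, the band positions (1725 and 1722 cm-1), the width and the 10% intensity change are invented values, and the helper names `band` and `extremum_positions` are hypothetical.

```python
import math

def band(x, center, height, width=8.0):
    """Gaussian absorption band at wavenumber x (cm-1); shape is an assumption."""
    return height * math.exp(-((x - center) / width) ** 2)

wavenumbers = [1700 + 0.5 * i for i in range(101)]  # 1700 to 1750 cm-1

# Case 1: the band shifts from 1725 (before) to 1722 cm-1 (after).
shift_diff = [band(x, 1722, 1.0) - band(x, 1725, 1.0) for x in wavenumbers]

# Case 2: the band stays at 1725 cm-1, but its intensity increases by 10%.
intensity_diff = [band(x, 1725, 1.1) - band(x, 1725, 1.0) for x in wavenumbers]

def extremum_positions(diff):
    """Wavenumbers of the most negative and the most positive point."""
    lowest = wavenumbers[diff.index(min(diff))]
    highest = wavenumbers[diff.index(max(diff))]
    return lowest, highest

shift_min_pos, shift_max_pos = extremum_positions(shift_diff)
intensity_max_pos = wavenumbers[intensity_diff.index(max(intensity_diff))]
has_negative_lobe = min(intensity_diff) < 0

print(shift_min_pos, shift_max_pos, intensity_max_pos, has_negative_lobe)
```

In the first case the difference spectrum shows a negative and a positive lobe, which can be read as "before" and "after". In the second case only a positive band appears, at a position where the group absorbs in both states, so reading it as characteristic of the final state alone would be misleading. For overlapping bands, as in this toy example, the extrema of the difference signal also need not coincide exactly with the band positions of the two states.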

4.2. The Difference Spectrum Seen as a Fingerprint of Conformational Change

Infrared difference spectra usually contain many difference bands, which indicates the wealth of information that is encoded in the spectrum. However, extracting this information is often difficult. A simple first approach is to regard the spectra as a characteristic fingerprint of the conformational change. The spectral signature of a change in structure can then be used to detect and define transient states of a protein. Similar approaches have a long tradition in fluorescence and absorption spectroscopy. The approach can be used to study reaction intermediates and to classify and quantify conformational changes. It has even provided molecular information in a study that mapped substrate-protein interactions [107] (see section 6.3 in this chapter).

Intermediates. From the time course it is possible to evaluate the number of intermediates in the reaction. Here, time-resolved vibrational spectroscopy has the advantage that the observation is not restricted to a limited number of chromophores (e.g. Trp residues) or to an extrinsic fluorescence label, which will largely reflect local changes in the vicinity of the chromophore(s) and may miss conformational changes occurring in distant regions of the protein. Instead, in vibrational spectroscopy all carbonyl chromophores of the backbone amide groups are monitored, and this will reveal any change in backbone conformation even if very small. Additionally, it is possible to follow the fate of individual catalytically "acting" groups in the same experiment. Thus, infrared spectroscopy simultaneously looks locally at the catalytic site and at the protein as a whole. This approach has been used in studies of the Ca2+-ATPase pump mechanism [108], of the photoactive yellow protein photocycle [109], and of protein folding [53, 74, 110, 111].

Similar conformational changes.
From the shape of the spectra, conformational changes can be classified according to their similarity. This can be used to compare different preparations of a protein or related partial reactions [42, 89, 95, 112].

The extent of conformational change. From the magnitude of the difference signals, the extent of conformational change in a protein reaction may be estimated [108, 113-115] using the amide I region of the spectrum (1700 to 1610 cm-1), which is due to the absorption of the backbone carbonyl groups. Their absorption maximum depends on secondary structure due to transition dipole coupling [116] and through-bond coupling [117, 118], as well as on the strength of hydrogen bonding [119-121]. On this basis, the amplitude of the infrared difference signals in the amide I region can be used to estimate the change of secondary structure. Proceeding along this line, one has to consider the following:

Signals of conformational changes may overlap in such a way that they cancel each other, leading to an underestimation of the extent of structural change. Therefore, the infrared difference spectrum reveals only the net change of secondary structure. A worst case scenario is shown in Fig. 3a, where nearly all residues change their secondary structure, but the net change is zero.

Movements of rigid domains are not visible; only the working portion that changes its backbone geometry is reflected in the difference spectra. Thus, it may be misleading to use terms such as "large" and "small conformational change", since considerable movements of rigid domains may originate from very small flexible parts of a protein, like hinge regions that comprise only a few residues. An example is shown in Fig. 3b. Movement of the rigid domains (shown in grey) does not lead to signals in the infrared difference spectrum. Only the flexible part (shown in black), where the conformational change alters the
relative orientation of neighbouring amide groups, gives rise to infrared difference signals.

Since transition dipole coupling leads to delocalised amide I modes, a simple linear relationship between signal magnitude and secondary structure change is not expected when individual residues change their secondary structure. The sensitivity towards conformational changes, however, seems to be very high. For example, if an α-helix shortens, this affects not only the amide modes of the backbone portion that unwinds, but also those of the remaining helix [122].

In addition to a secondary structure change, more subtle changes within a persisting secondary structure will also manifest in the spectrum. Examples are a change in hydrogen bonding to the C=O oxygens, a change in the twist of a β-sheet or bending of an α-helix.

Signals due to amino acid side chains may overlap, although the amide I mode has a strong extinction coefficient [123, 124] which is generally larger than that of amino acid side chains in the amide I region [125, 126].

Several approaches have been proposed to quantify the number of residues participating in a secondary structure change [108, 113-115]. In spite of the implications and limitations discussed above, the approaches nevertheless seem to provide realistic estimates of the net secondary structure change [22, 127, 128].

Figure 3. Quantifying the extent of conformational change with infrared difference spectroscopy. (a) Worst case scenario: the protein undergoes a large conformational change, but the net change of secondary structure is zero since the N-terminal β-sheet converts into an α-helix and the C-terminal α-helix into a β-sheet. Infrared difference spectroscopy would not detect that conformational change - only the net change is detected. (b) Movement of rigid domains is invisible for infrared difference spectroscopy. When they move relative to each other, only the working part of the protein that causes the movement (shown in black) shows up in the spectrum. A large change in shape of a protein may therefore be accompanied only by small infrared absorbance changes.
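
The cancellation in the worst case scenario of Fig. 3a can be mimicked with a deliberately crude toy model. The sketch assumes that every residue contributes one identical, independent amide I band whose position depends only on its secondary structure. This additive picture ignores the delocalisation of amide I modes discussed in the text, and the band positions (1655 and 1630 cm-1), the width and the residue numbers are illustrative assumptions, as are all names in the code.

```python
import math

# Assumed amide I band positions (cm-1) for the two structure types.
BAND_POSITION = {"helix": 1655.0, "sheet": 1630.0}

def residue_band(x, center, width=10.0):
    """Unit-height Gaussian band of one residue; shape is an assumption."""
    return math.exp(-((x - center) / width) ** 2)

def amide_i(x, structures):
    """Additive toy spectrum: residue count times band, per structure type."""
    return sum(structures.count(kind) * residue_band(x, pos)
               for kind, pos in BAND_POSITION.items())

wavenumbers = list(range(1600, 1701))

def difference(before, after):
    return [amide_i(x, after) - amide_i(x, before) for x in wavenumbers]

# Fig. 3a-like case: N-terminal sheet becomes helix, C-terminal helix becomes sheet.
before = ["sheet"] * 50 + ["helix"] * 50
after_swap = ["helix"] * 50 + ["sheet"] * 50
changed_residues = sum(a != b for a, b in zip(before, after_swap))
swap_diff = difference(before, after_swap)

# Net change: 20 residues of the N-terminal sheet become helical, the rest is unchanged.
after_net = ["helix"] * 20 + ["sheet"] * 30 + ["helix"] * 50
net_diff = difference(before, after_net)

print(changed_residues, max(abs(v) for v in swap_diff), max(net_diff))
```

In the swap case every residue changes its secondary structure, yet the toy difference spectrum is flat, whereas the conversion of only 20 residues gives a clear positive and negative band pair.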

Usually, signals of protein backbone perturbations have been found to be rather small, as shown for the electron transfer reactions of the photosynthetic reaction centre [39, 129], cytochrome c oxidase [92], cytochrome c [88, 130], bacterial cytochrome c3 [131], and myoglobin [132]. This indicates that the protein often provides an optimised solvent [129] rather than acting via a considerable reorganisation of secondary structure. On the other hand, small net secondary structure changes have also been observed for the Ca2+-ATPase [108], which alters the relative arrangement of its domains during catalysis. Therefore, small infrared signals in the amide I region do not rule out a considerable change in protein shape.

4.3. The Absorption of Amino Acid Side Chains

Amino acid side chains are often at the heart of the molecular mechanism of proteins. Thus, side chain absorption can provide very valuable information, in particular when it is possible to follow the fate of the participating groups in a single time-resolved experiment. The aim of this kind of research is to identify the catalytically important side chains and to deduce their environmental and structural changes from the spectrum in order to understand the molecular reaction mechanism. For example, information may be obtained on the protonation state, coordination of cations and hydrogen bonding. Table 1 in the appendix gives an overview of the infrared absorption of amino acid side chains in 1H2O and 2H2O [24, 133]. Only the strongest bands are listed in Table 1, or those in a spectral window free of overlap by bands from other groups. The absorption of a side chain in a protein may deviate significantly from its absorption in solution or in a crystal. The special environment provided by a protein is able to modulate strength and polarity of bonds, thus changing the vibrational frequency and the absorption coefficient.
Therefore, the band positions given in the table should be regarded only as guidelines for the interpretation of spectra. It may be mentioned here that the pKa of acidic residues in proteins may also differ significantly from solution values. An example is Asp-96 of bacteriorhodopsin, for which a pKa > 12 has been found [134].

4.4. Molecular Interpretation: Band Assignment

To fully exploit the information in an infrared difference spectrum, the spectroscopist needs to know which molecular group causes a given feature in the spectrum. Assignment of infrared bands to specific molecular groups is possible by studying model compounds, by chemical modifications of cofactors or ligands, by site-directed mutagenesis and by isotopic labeling of ligands, cofactors and amino acids.

Model spectra. Contributions of cofactors or substrate molecules to the infrared spectrum can be identified by normal mode calculations or by comparison with the spectra of the isolated molecules or model compounds in an appropriate environment. Examples are chlorophyll studies [135-137].

Site-directed mutagenesis is a very powerful approach. Ideally, an infrared signal due to a specific amino acid is missing when this amino acid has been selectively replaced. The missing signal can then be assigned to the mutated amino acid. However, mutagenesis cannot be applied to crucial amino acids because their mutation abolishes protein function. Also, a mutation may exert wide-spread conformational effects on the protein which extensively modify the structural changes and thus the infrared
difference spectrum. Therefore, the effect of mutation on infrared difference spectra has to be evaluated very carefully.

Isotopic labeling has been used as a powerful tool since the early days of infrared spectroscopy on proteins in order to observe a specific group in a large protein [5, 7]. It avoids perturbations of protein structure that might be introduced by mutagenesis and allows the labeling of crucial amino acids that cannot be mutated without loss of function. Due to the mass effect on vibrational frequencies, infrared absorption bands of a labeled group are shifted with respect to those of the unlabeled groups and can be identified in the spectrum. Ligands, cofactors and protein side chains as well as backbone groups can be labeled.

Labeling ligands is very informative when the interactions between ligands and proteins are investigated. Such studies include the binding of CO [5] and O2 [138] to hemoglobin, of O2 [138] to myoglobin, of carbonyl groups to triosephosphate isomerase [7] and phospholipase A2 [139], and of phosphate groups to Ras [68] and Ca2+-ATPase [140]. Protein cofactors have also been labeled, for example for bacteriorhodopsin [36, 141], photosynthetic reaction centres [38, 40, 41] and cytochrome c oxidase [94]. In favourable cases the substrate can transfer a labeled group to the protein, which can then be studied in its protein environment. This approach has been used to study the acyl enzyme of serine proteases [20] and the phosphate group of the phosphoenzyme intermediates of the Ca2+-ATPase [142, 143].

Amino acids in proteins can be labeled in various ways. 1H/2H exchange is simply done by replacing 1H2O by 2H2O, which exchanges the protons of accessible acidic groups, like OH, NH and SH groups, for deuterons [144]. The observed characteristic band shifts often allow the assignment of these bands to peptide groups or to specific amino acid side chains.
An additional advantage is the shift of the strong water absorption away from the amide I region (1610-1700 cm-1), which is sensitive to protein structure. Recombinant proteins can be labeled uniformly with, for example, 13C or 15N [145], all amino acids of one type can be labeled, or a label can be placed specifically on one particular amino acid [146, 147]. This site-directed labeling is the most powerful interpretation tool. Unfortunately, it requires great effort and is usually not feasible.

4.5. Molecular Interpretation: Quantification

Once a difference band is assigned to a specific molecular group, it is evident that this group participates in the studied protein reaction. Furthermore, its frequency, or the wavenumber of the corresponding infrared band, provides precise information on a number of bond parameters and other molecular properties [148]. The quantitative interpretation becomes even more powerful with the increased ease of quantum chemical calculations of the vibrational spectrum [149]. For enzymes, the information obtained from calculations reveals how the environment shapes the molecular properties of catalytically active groups. This is the information needed to unveil the secrets behind the amazing catalytic power of enzymes.
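
The link between band position, bond properties and isotopic mass can be made concrete with the simplest possible model, a diatomic harmonic oscillator, in which the wavenumber is proportional to the square root of the force constant divided by the reduced mass. The sketch below uses this textbook approximation to estimate the shift expected when 12C is replaced by 13C in an isolated C=O oscillator. It is an assumption-laden estimate only: real amide and side chain modes are coupled vibrations, the starting wavenumber of 1650 cm-1 is an arbitrary example, and the function names are hypothetical.

```python
import math

def reduced_mass(m1, m2):
    """Reduced mass of a two-atom oscillator (atomic mass units)."""
    return m1 * m2 / (m1 + m2)

def shifted_wavenumber(wavenumber, masses_unlabeled, masses_labeled):
    """Wavenumber after labeling, assuming an unchanged force constant.

    For a harmonic diatomic oscillator the wavenumber scales with
    1 / sqrt(reduced mass), so only the mass ratio is needed.
    """
    mu_old = reduced_mass(*masses_unlabeled)
    mu_new = reduced_mass(*masses_labeled)
    return wavenumber * math.sqrt(mu_old / mu_new)

# Approximate isotope masses in atomic mass units.
c12_o16 = (12.000, 15.995)
c13_o16 = (13.003, 15.995)

unlabeled = 1650.0  # cm-1, arbitrary example value
labeled = shifted_wavenumber(unlabeled, c12_o16, c13_o16)
downshift = unlabeled - labeled

print(round(labeled, 1), round(downshift, 1))
```

Under these assumptions the labeled band moves down by a few tens of cm-1, which is why a labeled group can be picked out from the unlabeled background. Conversely, at fixed masses a change in wavenumber reports on a change in the effective force constant, that is, on bond strength.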

5. Enzyme Activity Measurements

Enzymes are not only fundamental to life, they are also used in many biotechnological processes. Therefore, enzymatic activity is an important parameter. It is usually measured indirectly because substrate and product cannot be distinguished in the UV or visible range of the spectrum. Therefore, coloured or fluorescent substrate analogues have been developed, or the enzymatic reaction of interest is coupled to auxiliary enzymatic reactions that can be followed in the UV or visible spectral range. In contrast, infrared spectroscopy can provide a direct, "on-line" monitor of enzymatic reactions, because the infrared spectra of educts and products of an enzymatic reaction are often different.

Measurements of enzyme activity with infrared spectroscopy are relatively straightforward and of interest even for researchers whose focus is to study the conformational changes associated with a protein reaction. This is because the infrared signals of the conversion of substrate to product allow an activity control in the same experiment that is used to monitor conformational changes. Examples of infrared spectroscopic measurements of enzyme activity are urea hydrolysis by urease [193], cefoxitin hydrolysis by β-lactamase [150], deacylation of cinnamoyl-chymotrypsin [151], ATP hydrolysis by the Ca2+-ATPase [152], oxidation of D-glucose by glucose oxidase [153], hydrolysis of amides [154, 155] and synthesis of hydroxamic acid derivatives [155] by amidase, and the reaction of α-ketoglutarate and Ala to Glu and pyruvate by glutamic-pyruvic transaminase [156].

In the example discussed here, ATP hydrolysis by the Ca2+-ATPase has been followed with infrared spectroscopy [152]. The different infrared absorption of substrates and products could be exploited to measure enzyme activity with only 7.5 µg enzyme needed. Hydrolysis of ATP involves the net transformation of one PO2- group into a PO32- group as shown in Fig. 4a.
This is accompanied by a decrease in electron density in the terminal P-O bonds leading to a decrease in the vibrational frequency and thus a change in the vibrational spectrum. Fig. 4b shows the spectrum of the substrate ATP and that of the products ADP and Pi in a 1:1 mixture. Near 1230 cm-1 the antisymmetric stretching vibration of the PO2- group absorbs [157] and the - and the -PO2- group of ATP give rise to a prominent ATP band. This band is significantly reduced for the ADP and Pi mixture, since only the -PO2- group of ADP contributes. In contrast, near 1080 cm-1 the products absorb more strongly. This is the spectral region of the asymmetric stretching vibration of the PO32- group [157] and there are two PO32groups in the products ADP and Pi but only one in the substrate ATP. When an enzymatic reaction is followed by infrared spectroscopy, first a reference spectrum is recorded that represents the sample with substrate, in this case ATP. Then the reaction is started and successive spectra are recorded until the reaction is complete. From these spectra, the reference spectrum is subtracted to obtain difference spectra that only reflect the changes of absorption that occur in the course of the reaction. From the absorption spectra of ATP and the products ADP and Pi, a difference spectrum can be calculated that models ATP hydrolysis: absorbance of ADP and Pi minus the absorbance of ATP. This difference spectrum is shown in Fig. 4b. The following difference bands are observed: (i) a negative band near 1230 cm-1 reflecting the disappearing ATP band at 1230 cm-1 and (ii) a positive band near 1080 cm-1 reflecting


A. Barth / The Study of Protein Reactions by Reaction-Induced Infrared Difference Spectroscopy

the increased absorbance of the products formed. This shows that both the substrate concentration of ATP and the product concentrations of ADP and Pi can be followed when ATP hydrolysis is monitored with infrared spectroscopy.

The difference bands discussed above are expected for ATP hydrolysis catalysed by an enzyme. An example is given in Fig. 4c: difference spectra recorded after the release of ATP from caged ATP in the presence of the Ca2+-ATPase. They show bands due to two reactions: hydrolysis of ATP and photolysis of caged ATP. The first spectrum is dominated by photolysis bands; subsequent changes are due to the hydrolysis of ATP. As expected for ATP hydrolysis, a negative PO2- band evolves near 1230 cm-1 and a positive PO32- band near 1080 cm-1.
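The subtraction procedure described above can be sketched numerically. The following Python snippet is a minimal illustration, not the analysis used in [152]: the Gaussian band shapes, the band width and the band heights are invented for the example, and only the band positions near 1230 and 1080 cm-1 are taken from the text. It builds toy spectra for ATP and for the ADP plus Pi mixture, simulates a sample at several stages of hydrolysis and subtracts the reference spectrum.

```python
import numpy as np

def gaussian_band(wavenumber, centre, height, width=15.0):
    """Gaussian band shape; the width (cm^-1) is an illustrative assumption."""
    return height * np.exp(-0.5 * ((wavenumber - centre) / width) ** 2)

# Wavenumber axis covering the phosphate region (cm^-1), 0.5 cm^-1 steps.
wn = np.linspace(1000.0, 1300.0, 601)

# Toy absorbance spectra: band heights are invented for illustration only.
atp = gaussian_band(wn, 1230.0, 1.0) + gaussian_band(wn, 1080.0, 0.5)
products = gaussian_band(wn, 1230.0, 0.5) + gaussian_band(wn, 1080.0, 1.0)

def difference_spectra(spectra, reference):
    """Subtract the reference spectrum from every recorded spectrum."""
    return np.asarray(spectra) - np.asarray(reference)

# Simulated reaction: the sample composition shifts from ATP to products.
progress = np.array([0.0, 0.25, 0.5, 0.75, 1.0])  # fraction hydrolysed
recorded = [(1.0 - p) * atp + p * products for p in progress]
diff = difference_spectra(recorded, reference=atp)

# Read the difference absorbance at the two marker bands.
i_1230 = int(np.argmin(np.abs(wn - 1230.0)))
i_1080 = int(np.argmin(np.abs(wn - 1080.0)))
for p, d in zip(progress, diff):
    print(f"hydrolysed {p:4.2f}: dA(1230) = {d[i_1230]:+.3f}, "
          f"dA(1080) = {d[i_1080]:+.3f}")
```

With these toy spectra the difference absorbance is negative at 1230 cm-1 and positive at 1080 cm-1, and both grow in proportion to the fraction of ATP hydrolysed. This proportionality is what makes a band amplitude usable as a measure of reaction progress.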

Figure 4. Infrared spectra of ATP hydrolysis showing that it is possible to measure ATPase activity with infrared spectroscopy. (a) Hydrolysis of ATP. (b) Model spectra: infrared absorption spectrum of 100 mM ATP (bold line), of 100 mM ADP plus 100 mM Pi (thin line) and difference spectrum (dotted line): absorbance of ADP and Pi minus absorbance of ATP. (c) ATPase activity measurement: infrared difference spectra induced by the release of ATP from caged ATP in the presence of Ca2+-ATPase. Five subsequent spectra of a typical sample are shown that monitor the progress of the hydrolysis reaction. The average time of recording is indicated. Reprinted from [152] with permission from Sage, Inc.


The PO2- band was used to measure enzyme activity with infrared spectroscopy [152]. The resulting specific activity determined from 6 measurements was 4.1 ± 0.5 µmol mg-1 min-1. The major source of error seems to be the handling of minute protein and substrate volumes (1 µl), which can be minimised when automated mixing devices are used. In spite of this limitation, the specific activity obtained by infrared spectroscopy agrees within the stated errors with the result of an independent activity measurement, which gave 4.6 ± 0.3 µmol mg-1 min-1 (5 measurements).

This example shows that infrared spectroscopy can be used to measure enzyme activity with good accuracy. The amount of enzyme needed (here less than 10 µg) is comparable to that of current methods and considerably less than that needed for infrared studies of the molecular function of proteins. The advantages of infrared spectroscopy are: (i) no activity assay is required since substrates and products are monitored directly, (ii) the experimental conditions are not limited to those necessary for an activity assay and (iii) the infrared method has wide applicability since many enzymatic reactions lead to changes in the infrared spectrum. These advantages are not limited to the specific approach used here to start the reaction, but also apply to the more general mixing techniques. The ongoing development of mixing cuvettes will facilitate infrared activity measurements considerably. Measuring enzyme activity with infrared spectroscopy will therefore become a widespread method in the immediate future.

6. Protein Function: the Ca2+ Pump

6.1. The Ca2+ Pump

P-type ATPases are major players in the primary active transport of ions across biological membranes. Their name derives from the fact that these enzymes become phosphorylated by ATP during the transport cycle.
One of the best characterized members of this family is the Ca2+-ATPase of the sarcoplasmic reticulum (SR) membrane from skeletal muscle (SERCA1a) [158], which serves as a model for the whole family of P-type ATPases. For reviews see [159-163]. The SR Ca2+-ATPase transports Ca2+ from the cytoplasm of muscle cells into the SR lumen, which relaxes a flexed muscle. Protons are countertransported in exchange for Ca2+. Active transport of two Ca2+ is fuelled by the free energy from the hydrolysis of one molecule of ATP, which is used with up to 100% efficiency [164].

A simplified version of the reaction sequence is given in Fig. 5. The Ca2+-free ATPase exists in a pH dependent equilibrium between an E2 and an E1 form. It binds two cytosolic Ca2+ to the two high affinity Ca2+ binding sites in exchange for protons. Ca2+ binding enables ATP to phosphorylate Asp-351 of the ATPase, which occludes the bound Ca2+. At least two phosphoenzyme intermediates (Ca2E1P and E2P) with different properties are formed consecutively. The first phosphoenzyme intermediate, Ca2E1P, is ADP-sensitive, i.e. it dephosphorylates with ADP to form ATP; the second phosphoenzyme intermediate, E2P, is ADP-insensitive and dephosphorylates by reaction with water. Phosphoenzyme conversion from Ca2E1P to E2P is accompanied by release of Ca2+ into the SR lumen and uptake of protons from the SR lumen. Hydrolysis of E2P and regeneration of the high affinity Ca2+ binding sites complete the reaction cycle.

The Ca2+-ATPase serves as an example to illustrate the study of protein function with infrared difference spectroscopy. The ATPase was the first protein to be studied by a


photolytically induced concentration jump [58]. Caged ATP [58, 108, 165, 166] and caged Ca2+ [114, 167, 168] have been predominantly used. The rapid scan technique with a time resolution of 65 ms is sufficient to kinetically resolve the main intermediates in the pump cycle after ATP release [108]. The functionality of the ATPase in the infrared samples has been demonstrated by a number of control experiments, for example Ca2+ uptake [165] and intrinsic fluorescence measurements [168], and by control experiments with inhibitors [114, 165-167, 169].

6.2. Making Use of the Fingerprint Approach

Regarding the infrared signals in the amide I region as a fingerprint of the conformational change has enabled several conclusions on the transport mechanism. From the infrared data, it does not seem possible to distinguish between minor and major secondary structure changes in the catalytic cycle of the Ca2+-ATPase. This is in contrast to what would be expected from the classical model of the ATPase reaction cycle by de Meis and Vianna [170], which is based on only two main protein conformations, E1 and E2, but is in line with the recent structural data [160, 162]. Instead, all main reaction steps studied are associated with secondary structure changes of comparable magnitude, with that of phosphorylation somewhat smaller [108]. The clear detection of a conformational change upon phosphorylation is particularly interesting, because it is missed by X-ray crystallography [171]. In addition, evidence has been obtained for a pH dependent conformational change of the protein that affects the Ca2E1 → E2P transition [172].

Figure 5. Simplified reaction cycle of the Ca2+-ATPase. The cartoons illustrate accessibility and protonation state of the Ca2+ binding sites. For clarity only one Ca2+ and one H+ is indicated. Cyt stands for cytoplasm and L for lumen.
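The simplified reaction sequence described in the text can also be written down as a small data structure. The Python sketch below is purely illustrative: the names `CYCLE` and `walk_cycle` are hypothetical, the step descriptions paraphrase the text, and no kinetic information is implied.

```python
# Each step: (state before, state after, event as described in the text).
CYCLE = [
    ("E2", "E1", "pH dependent equilibrium of the Ca2+-free ATPase"),
    ("E1", "Ca2E1", "binding of two cytosolic Ca2+ in exchange for protons"),
    ("Ca2E1", "Ca2E1P", "phosphorylation of Asp-351 by ATP, Ca2+ occluded"),
    ("Ca2E1P", "E2P", "Ca2+ release into the SR lumen, proton uptake"),
    ("E2P", "E2", "hydrolysis of the phosphoenzyme"),
]

def walk_cycle(start="E2"):
    """Follow the steps once around the cycle and return the visited states."""
    next_state = {before: after for before, after, _ in CYCLE}
    states = [start]
    while True:
        state = next_state[states[-1]]
        states.append(state)
        if state == start:
            return states

print(" -> ".join(walk_cycle()))
```

Starting from any state in the list, `walk_cycle` returns to that state after five steps, which mirrors the closed cycle of Fig. 5.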


For all partial reactions investigated, the overall backbone conformational changes proceed at the same time as the local perturbations of side chains [108]. An analysis of the kinetics has not detected one of the postulated intermediates in the reaction cycle [108]. It was therefore concluded that it is either short-lived or does not exist.

Difference spectra of the two Ca2+ release reactions from the phosphorylated (Ca2E1P → E2P) and the unphosphorylated enzyme (Ca2E1 → E1/E2) show striking similarity [112], from which similar conformational changes have been concluded. Since difference spectra of a reaction contain information on protein structure, side chain protonation, hydrogen bonding and Ca2+ binding mode of the initial and the final state, the observed similarity suggests that the occupied and unoccupied Ca2+ binding sites are most likely the same in the two reactions. Thus, the infrared spectra favour a model with only one pair of binding sites for Ca2+. This is in contrast to the model by Jencks [173], which proposes two different pairs of sites for cytoplasmic high affinity and lumenal low affinity binding, respectively. However, it is in agreement with mutagenesis studies [174] and with crystal structures of different ATPase states [175, 176].

6.3. Nucleotide Binding Depends on Individual Interactions

One line of studies has been to map interactions between the protein and the substrate ATP. Fig. 6a shows infrared absorbance changes induced by nucleotide (NTP) binding to the Ca2+-ATPase. The spectra reveal the difference in absorbance between the initial nucleotide-free state Ca2E1 and the nucleotide-ATPase complexes Ca2E1NTP. The difference spectra reflect conformational changes of the protein backbone in the amide I region (1700-1610 cm-1). The signals near 1693, 1641 and 1628 cm-1 are characteristic of β-sheets, those near 1665 cm-1 are suggestive of turns, and those near 1653 cm-1 are indicative of α-helical structures.
The spectra indicate that α-helices, β-sheets and turns are affected by nucleotide binding [107]. Close ATP analogs (Fig. 6b) produce nucleotide binding spectra that are different from that obtained with ATP (Fig. 6a). Therefore, the conformational change upon nucleotide binding depends to a surprising degree on individual interactions between ATPase and nucleotide [107, 177]. The lack of an individual interaction produces more than just local adjustments: it affects the entire conformation of the nucleotide-ATPase complex. Surprisingly, modifications at opposite ends of the ATP molecule, which interact with different domains, produce similar effects. In particular, omission of the γ-phosphate [177] and modification of the amino group [107] both reduce the conformational change, with the latter modification having a more dramatic effect. This suggests a concerted conformational change upon ATP binding for which all interactions need to be in place [107]. As a consequence, the (average) structure of the nucleotide-ATPase complex is characteristic of the nucleotide bound. From the sensitivity of the conformational change to individual interactions it has been concluded that the ATPase interacts with the γ-phosphate [177], the ribose hydroxyls and the amino function [107] of ATP. The interactions identified by infrared spectroscopy have later been confirmed by X-ray crystallography [171, 178].
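The amide I band positions quoted above for the nucleotide binding spectra can be collected in a small lookup, which makes the fingerprint reading of a difference spectrum explicit. This is a minimal sketch: the function name `assign_amide_i` and the matching window `tolerance` are hypothetical, and such assignments are only indicative, as the wording "characteristic", "suggestive" and "indicative" in the text implies.

```python
# Band positions (cm^-1) and tentative assignments quoted in the text
# for the amide I region of the nucleotide binding spectra.
AMIDE_I_MARKERS = {
    1693.0: "beta-sheet",
    1665.0: "turn",
    1653.0: "alpha-helix",
    1641.0: "beta-sheet",
    1628.0: "beta-sheet",
}

def assign_amide_i(position, tolerance=4.0):
    """Return the tentative assignment for a band position, or None.

    `tolerance` (cm^-1) is a hypothetical matching window, not a value
    taken from the chapter.
    """
    nearest = min(AMIDE_I_MARKERS, key=lambda marker: abs(marker - position))
    if abs(nearest - position) <= tolerance:
        return AMIDE_I_MARKERS[nearest]
    return None

for band in (1652.0, 1629.5, 1600.0):
    print(band, assign_amide_i(band))
```

A position that matches none of the marker bands within the window returns `None`, which is a reminder that a band outside these ranges is left unassigned rather than forced into a category.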


Figure 6. Infrared absorbance changes induced by nucleotide binding to the ATPase. (a) Difference spectra of nucleotide binding to the Ca2+-ATPase (Ca2E1 → Ca2E1NTP) obtained with ATP, 2'-deoxyATP, 3'-deoxyATP and ITP [107]. Results for ADP [177] and the close ATP analog AMPPNP (β,γ-imidoadenosine 5'-triphosphate) [107] are not shown. Labels indicate the band positions of the ATP binding spectrum. (b) Structures of ATP and ATP analogues highlighting the modified functional groups of ATP. The functional groups interact with different domains of the ATPase. Light and dark grey indicate interaction with the N and P domain, respectively. Each modification of ATP affects the binding induced conformational change. Thus all modified groups are involved in important interactions with the ATPase. Reprinted in modified form from [127]. © 2006 Nova Science Publishers.

Binding of ATP closes a cleft between the nucleotide-binding (N) and phosphorylation (P) domains of the ATPase [160, 179]. This movement delivers the γ-phosphate to the phosphorylation site Asp-351 in the P domain [171, 178]. The two domains are bridged by ATP in the ATP-ATPase complex (Ca2E1ATP), as sketched in Fig. 6b. This bridging function of ATP provides an explanation for the drastic structural effects of modifying the ribose 3'-OH and the adenine amino function. Interactions of both groups stabilise the closed conformation of the complex: the 3'-OH directly via an interaction with Arg-678 in the P domain, and the amino group indirectly because its interaction with Glu-442 of the N domain seems to position the ribose hydroxyls such that interaction with the P domain is possible. When inosine triphosphate (ITP) is used


instead of ATP, Glu-442 will repel the negative partial charge on the inosine oxygen, reorient the inosine moiety and thereby sacrifice interactions of the ribose hydroxyls with the P domain. This explains the weaker binding of ITP and the smaller extent of conformational change upon ITP binding [107].

The results discussed so far in this chapter have been obtained by monitoring the conformational change of the peptide backbone. However, the absorption of the bound nucleotide can be observed directly if isotopic labeling is used to identify difference bands of specific nucleotide groups. Since the vibrational frequency depends on the masses of the vibrating atoms, isotopic labeling shifts the bands of the labeled group, which can then be identified in the spectrum. In studies of nucleotide binding to the Ca2+-ATPase, the β- and γ-phosphates have been labeled. The positions of the β- and γ-phosphate bands indicate that the P-O bond strengths of bound β- and γ-phosphate are similar to those of ATP in aqueous solution, i.e. that the hydrogen bonds of ATP to water are largely replaced by interactions with the protein for bound ATP [140]. Compared to GTP binding to Ras [68, 69, 180, 181], the phosphate bands of ATP bound to the ATPase are generally found at similar positions. However, the extent of vibrational coupling between different phosphate groups seems to be different: they are largely decoupled in the GTP-Ras complex, whereas they are coupled for ATP bound to the ATPase.

Transfer of the nucleotide γ-phosphate to Asp-351 follows nucleotide binding and also depends on the type of nucleotide [128]. An interesting feature has been observed with ITP: its phosphorylation spectrum shows additional signals in the amide I region as compared to ATP, which are similar to nucleotide binding signals [128].
Thus it seems that upon phosphorylation with ITP the enzyme catches up on a conformational change that cannot be achieved by ITP binding because the interactions between protein and base moiety are impaired.

ADP dissociation from the phosphoenzyme Ca2E1P results in conformational changes which are the reverse of those induced by ADP binding to the unphosphorylated ATPase Ca2E1 [57]. This is indicated by infrared experiments that accumulate Ca2E1P and in which ADP is removed by the helper enzyme apyrase. Upon dissociation of ADP from the phosphoenzyme, the conformation relaxes partially back to that of the unphosphorylated state Ca2E1. Thus, ADP plays an important role in stabilizing the closed conformation of Ca2E1P [57, 128]. ADP dissociation from Ca2E1P does not trigger the transition to E2P [57] as proposed previously, since the spectral characteristics of E2P [108, 142, 182] are not observed upon ADP dissociation.

6.4. Protonation of the Empty Ca2+ Binding Sites Has Been Observed Directly

Amongst the side chain groups, protonated carboxyl groups are often the infrared spectroscopist's favourite. They absorb in a region (1700-1800 cm-1) that is usually free from the absorption of other groups and thus can readily be assigned. The ability of infrared spectroscopy to detect protonation states of amino acids is important, since protons are invisible in X-ray crystallography but proton transfer steps are often an essential part of the enzymatic reaction mechanism.

For the Ca2+-ATPase, infrared bands of protonated Asp and Glu residues of E2P [142, 165, 182] and of the Ca2+-free ATPase (E1 or E2) [112, 114, 167, 168] have been observed upon Ca2+ release from the phosphoenzyme and Ca2+ binding to the unphosphorylated ATPase, respectively. The E2P bands are marked with "prot" in Fig. 2. They are composed of at least five overlapping bands. Three of the E2P bands are


pH dependent and titrate with a pKa value near 8.3 [172], which is similar to the apparent pKa value of residues binding lumenal H+ for proton countertransport [183-185]. This similarity of the pKa values supports the earlier interpretation that the bands originate from the protonation of carboxyl groups in the Ca2+ binding sites [114, 168], which are involved in H+ transport [112]. The spectral position of the bands indicates that some of the carboxyl carbonyl oxygens are hydrogen bonded while others are not. This observation has enabled a tentative assignment of the signals to Glu-771 (H-bonded), Asp-800 (some conformers H-bonded, some not) and Glu-908 (H-bonded) by multiconformation continuum electrostatics calculations [172]. A definitive assignment has to await experiments with site-directed mutants.

6.5. The Environment of the Phosphate Group Facilitates Phosphate Transfer

The ATPase is one of the many examples in which phosphorylation controls biochemical reactions. The ATPase phosphoenzymes have different properties, which is essential for coupling ATP hydrolysis to Ca2+ transport [186]. For example, E2P dephosphorylates faster with water than Ca2E1P and the model compound acetyl phosphate. This is required for the fast progression of the pump cycle and therefore for the efficient removal of Ca2+ from the cytoplasm of muscle cells. Obviously, the environment of the phosphate group is important in controlling the dephosphorylation properties.

Bond properties and interactions of the phosphate group have been characterised by infrared spectroscopy. The essential step here is to identify the phosphate absorption in the difference spectra with the help of isotopic substitution. Work on Ca2E1P [140, 187] and an initial study on E2P [142] have compared infrared difference spectra obtained with labeled and unlabeled γ-phosphate of ATP. This approach has identified isotope-sensitive bands of Ca2E1P and E2P which can be assigned to the phosphate group.
The band positions are different for the two phosphoenzymes, indicating a conformational change that directly affects geometry and electron density of the phosphate group and makes the environment in E2P more hydrophobic [142]. The complete set of three E2P P-O stretching vibrations has been determined in an isotope exchange experiment which is more sensitive than the comparison of spectra obtained with different isotopes of ATP [188]. The experiment has observed an oxygen isotope exchange at the phosphate group that is catalysed by the ATPase [189]. It provides an infrared spectrum at atomic resolution in a crowded spectral region [143, 188] which reveals the three stretching vibrations of the transiently bound phosphate group in spite of a background absorption of 50 000 protein vibrations. The spectrum is shown in Fig. 7a. Bands of the terminal P-O stretching vibrations of the unlabeled phosphate group are found at 1194, 1137, and 1115 cm-1.
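The isotope effect that underlies this kind of experiment can be illustrated with the simplest possible model, a diatomic harmonic oscillator, in which the wavenumber scales with the inverse square root of the reduced mass. The Python sketch below applies this to a hypothetical isolated P-O oscillator for a 16O to 18O substitution. It is a rough upper-bound estimate under that assumption, not a prediction of the observed band positions, because the real P-O stretching vibrations of a phosphate group are coupled modes.

```python
import math

# Atomic masses in unified atomic mass units (standard values, rounded).
MASS_P = 30.974
MASS_O16 = 15.995
MASS_O18 = 17.999

def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)

def shifted_wavenumber(wavenumber, m_light=MASS_O16, m_heavy=MASS_O18,
                       m_partner=MASS_P):
    """Diatomic harmonic estimate: wavenumber scales with 1/sqrt(reduced mass)."""
    ratio = math.sqrt(reduced_mass(m_partner, m_light)
                      / reduced_mass(m_partner, m_heavy))
    return wavenumber * ratio

# Band positions of the unlabeled phosphate group quoted in the text.
for band in (1194.0, 1137.0, 1115.0):
    shifted = shifted_wavenumber(band)
    print(f"{band:.0f} cm^-1 -> about {shifted:.0f} cm^-1 "
          f"(shift {shifted - band:+.0f} cm^-1)")
```

In this model the labeled bands move to lower wavenumber by a few percent, which is why an isotope exchange shows up as pairs of negative and positive bands in a difference spectrum.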


The information on the E2P phosphate vibrations has been evaluated using a correlation between P-O frequency and P-O bond valence, the bond valence model and empirical correlations to calculate P-O bond strengths and lengths, and the dissociation energy of the bridging P-O bond [143]. Compared to the model compound acetyl phosphate, structure and charge distribution of the E2P aspartyl phosphate somewhat resemble the transition state in a dissociative phosphate transfer reaction: the aspartyl phosphate of E2P has 0.02 Å shorter terminal P-O bonds and a 0.09 Å longer bridging P-O bond, which is ~20% weaker and has 64-90 kJ/mol less bond energy [143]. These findings are summarised in Fig. 7b. Similar effects have been concluded for Ca2E1P [140, 190], the values of which are between those of acetyl phosphate and E2P, but closer to those of E2P. Interestingly, the differences between acetyl phosphate and the phosphoenzymes in the bridging P-O equilibrium bond length (max. 0.09 Å) are comparable to the bond length fluctuations in the vibrational ground state (0.06 Å) [190].
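The link between a bond length change and a change in bond valence can be made concrete with the exponential form of the bond valence model, s = exp((r0 - r) / b). In a ratio of two valences the reference length r0 cancels, so only the length change and the constant b are needed. The sketch below assumes b = 0.37 Å, a value commonly used in the bond valence literature; whether the evaluation in [143] used exactly this parametrisation is not stated in the text, so the numbers are illustrative only.

```python
import math

B_PARAMETER = 0.37  # Angstrom; commonly used constant, an assumption here

def valence_ratio(delta_length, b=B_PARAMETER):
    """Ratio of bond valences after/before a bond length change (Angstrom).

    Based on s = exp((r0 - r) / b); r0 cancels in the ratio.
    """
    return math.exp(-delta_length / b)

bridging = valence_ratio(+0.09)  # bridging P-O bond, 0.09 Angstrom longer
terminal = valence_ratio(-0.02)  # terminal P-O bonds, 0.02 Angstrom shorter
print(f"bridging bond valence changes by {100 * (bridging - 1):+.0f}%")
print(f"terminal bond valence changes by {100 * (terminal - 1):+.0f}%")
```

Under these assumptions an elongation of 0.09 Å lowers the bond valence by roughly a fifth, which is of the same size as the "~20% weaker" bridging bond quoted above, while the 0.02 Å shortening of the terminal bonds raises their valence by only a few percent.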

Figure 7. Determination of phosphate bond properties by infrared spectroscopy. (a) Infrared difference spectrum of the E2P16O3 → E2P18O3 isotope exchange at the phosphate group [143], calculated by subtracting the spectrum before exchange from the spectrum after exchange. Reprinted from [127]. (b) Phosphate bond parameters for the model compound acetyl phosphate and E2P [143]. Italic numbers above bonds give bond lengths in Å, normal print numbers below bonds give bond valences in vu [192]. The values have been rounded to two decimal digits. Reprinted in modified form from [127]. © 2006 Nova Science Publishers.
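A bond energy difference such as the 64-90 kJ/mol quoted above for the bridging P-O bond can be translated into a rate factor with an Arrhenius-type expression, exp(ΔE / RT). The sketch below makes two assumptions that are not taken from the text: that the full energy difference lowers the activation barrier, and that the temperature is 298.15 K. It is an order-of-magnitude estimate, not the calculation of the original work.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ mol^-1 K^-1

def rate_enhancement(delta_energy_kj_mol, temperature_k=298.15):
    """Arrhenius-type factor exp(dE / RT) for a barrier lowered by dE.

    Assumes the whole bond energy difference appears in the activation
    energy; the default temperature is an illustrative choice.
    """
    return math.exp(delta_energy_kj_mol / (GAS_CONSTANT * temperature_k))

for delta_e in (64.0, 90.0):
    exponent = math.log10(rate_enhancement(delta_e))
    print(f"{delta_e:.0f} kJ/mol -> rate factor of about 10^{exponent:.1f}")
```

With these assumptions the two ends of the energy range correspond to rate factors between 10^11 and 10^16. The result is sensitive to temperature: a lower temperature increases the factor for the same energy difference.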


The destabilisation of the bridging P-O bond is a ground state property of Ca2E1P and E2P. This finding is consistent with the view that part of the catalytic power of enzymes derives from ground state properties, in particular from favouring near-attack conformations in which the arrangement of reacting atoms is similar to that in the transition state [191]. The weaker bridging P-O bond of E2P accounts for a 10^11- to 10^15-fold hydrolysis rate enhancement, implying that P-O bond destabilisation facilitates phosphoenzyme hydrolysis.

P-O bond destabilisation is caused by a shift of non-covalent interactions from the phosphate oxygens to the aspartyl oxygens. It has therefore been proposed [143] that the relative strength of non-covalent bonding to the phosphate and aspartyl oxygens is one of the key factors that tunes the hydrolysis rate of the ATPase phosphoenzymes and related phosphoproteins. Weaker bonding to the phosphate oxygens and stronger bonding to the aspartyl oxygens weakens the bridging P-O bond, which dramatically increases the catalytic power of the enzyme. Weakening and elongation of the bridging P-O bond is not accomplished by external mechanical forces that pull the bond apart. Instead it is an in-built response of aspartyl phosphate to a shift of interactions from phosphate to aspartyl oxygens, for which only subtle changes in distances are required. This provides an elegant "handle" for the enzyme to control hydrolysis.

7. Appendix
Table 1. Overview of amino acid side chain infrared bands. Extended and corrected version of a previous compilation [133]. For more information and references see [24, 133]. If available, parameters of infrared spectra of amino acid side chains are given. If not, data are taken from infrared spectra of model compounds or from Raman spectra. Band positions are given for 1H2O and 2H2O; the value in brackets is the absorption coefficient (extinction coefficient) ε. The shift upon 1H/2H exchange is given when a compound in both solvents is compared in the original work. The listing of internal coordinate contributions to a normal mode is according to their contribution to the potential energy of the normal mode (if specified in the literature): if the contribution of an internal coordinate to the potential energy of a normal vibration is ≥ 70%, only that coordinate is listed. Two coordinates are listed if their contribution together is ≥ 70%. In all other cases those three coordinates that contribute strongest to the potential energy are listed. If no assignment is listed, no or multiple assignments are given in the original publications. Vibrations dominated by amide group motions are not included. ν: stretching vibration, νs: symmetric stretching vibration, νas: antisymmetric stretching vibration, δ: in-plane bending vibration, δas: asymmetric in-plane bending vibration, w: wagging vibration, t: twisting vibration, r: rocking vibration.

Assignment | Band position / cm-1, (ε / M-1 cm-1) in 1H2O | Band position / cm-1, (ε / M-1 cm-1) in 2H2O
Cys, ν(SH) | 2551 | 1849
Asp, ν(C=O) | 1716 (280) | 1713 (290)
Glu, ν(C=O) | 1712 (220) | 1706 (280)
Asn, ν(C=O) | 1677-1678 (310-330) | 1648 (570)
Arg, νas(CN3H5+) | 1652-1695 (420-490) | 1605-1608 (460)
Gln, ν(C=O) | 1668-1687 (360-380) | 1635-1654 (550)
Arg, νs(CN3H5+) | 1614-1663 (300-340) | 1581-1586 (500)
HisH2+, ν(C=C) | 1631 (250) | 1600 (35), 1623 (16)


Lys, as(NH3+) Tyr-OH, (CC) (CH) Asn, (NH2) Trp, (CC), (C=C) Tyr-O , (CC) Tyr-OH, (CC) Gln, (NH2) HisH, (C=C) Asp, as(COO ) Glu, as(COO ) Lys, s(NH3 ) Tyr-OH, (CC), (CH) Trp, (CN), (CH), (NH) Tyr-O , (CC), (CH) Trp, (CC), (CH) Phe, (CCring) as(CH3) (CH2) Pro, (CN) Trp, (CH), (CC), (CN) His , (CH3), (CN) Trp, (NH), (CC), (CH) Gln, (CN) Glu, s(COO ) Asp, s(COO ) s(CH3) Trp Trp (CH) Trp, (NH), (CN), (CH) Tyr-O-, (C-O), (CC) Asp, Glu, (COH) Trp, (CH), (CC) Tyr-OH (C-O), (CC) His, (CH), (CN), (NH) Trp, (CC)
+ -

1626-1629 (60-130) 1614-1621 (85-150) 1612-1622 (140-160) 1622 1599-1602 (160) 1594-1602 (70-100) 1586-1610 (220-240) 1575,1594 (70) 1574-1579 (290-380) 1556-1560 (450-470) 1526-1527 (70-100) 1516-1518 (340-430) 1509 1498-1500 (700) 1496 1494 (80) 1445-1480 1425-1475 1400-1465 1462 1439 1412-1435 1410 1404 (316) 1402 (256) 1375 or 1368, 1385 1352-1361 1334-1342 1315-1350 1276 1269-1273 (580) 1264-1450 1245 1235-1270 (200) 1217, 1229, 1199 1203

1201 1612-1618 (160)

1618 1603 (350) 1590-1591 (<50) 1163 1569, 1575 1584 (820) 1567 (830) 1170 1513-1517 (500)

1498-1500 (650)

1455 (200) 1439 1382 1409 1407 1404

1334 (100)

955-1058

1248-1265 (150) 1217, 1223, 1239


Ser, (COH) or (CO2H), (CO) w(CH2) Tyr-OH, (COH) Asp,Glu, (C-O) His, (CN), (CH) Trp, (CH), (NC) Trp, (NC), (CH), (CC) t(CH2) Thr, (C-O) Ser, (C-O) Trp, (CC), (CH) Ser, (CO) or (CC) Ser, (CO), (CO H) Thr, (CO H) r(CH2)
2 2

1181-1420 1170-1382 1169-1260 (200) 1160-1253 1104,1090,1106,1094 1092 1064 1063-1295 1075-1150 1030 1012-1016 983

875-985

913 1250-1300 1104,1096,1107,1110

1023 1012

940 865-942 724-1174

References
[1] J. Trewhella, W.K. Liddle, D.B. Heidorn, and N. Strynadka, Biochemistry 28 (1989), 1294-1301.
[2] M. Jackson, P.I. Haris, and D. Chapman, Biochemistry 30 (1991), 9681-9686.
[3] M. Nara, M. Tasumi, M. Tanokura, T. Hiraoki, M. Yazawa, and A. Tsutsumi, FEBS Lett. 349 (1994), 84-88.
[4] H. Fabian, T. Yuan, H.J. Vogel, and H.H. Mantsch, Eur. Biophys. J. 24 (1996), 195-201.
[5] J.O. Alben and W.S. Caughey, Biochemistry 7 (1968), 175-183.
[6] M.E. Riepe and J.H. Wang, J. Biol. Chem. 243 (1968), 2779-2787.
[7] J.G. Belasco and J.R. Knowles, Biochemistry 19 (1980), 472-477.
[8] P. Tonge, G.R. Moore, and C.W. Wharton, Biochem. J. 258 (1989), 599-605.
[9] P.J. Tonge, M. Pusztai, A.J. White, C.W. Wharton, and P.R. Carey, Biochemistry 30 (1991), 4790-4795.
[10] G. Zundel, Adv. Chem. Phys. 111 (2000), 1-217.
[11] G. Iliadis, G. Zundel, and B. Brzezinski, Biospectroscopy 3 (1997), 291-297.
[12] F. Bartl, D. Palm, R. Schinzel, and G. Zundel, Eur. Biophys. J. 28 (1999), 200-207.
[13] M.S. Braiman and K.J. Rothschild, Annu. Rev. Biophys. Biophys. Chem. 17 (1988), 541-570.
[14] K. Gerwert, Biol. Chem. 380 (1999), 931-935.
[15] J. Heberle, Recent Res. Devel. Applied Spectroscopy 2 (1999), 147-159.
[16] C. Jung, J. Mol. Recognit. 13 (2000), 325-351.
[17] R. Vogel and F. Siebert, Curr. Opin. Chem. Biol. 4 (2000), 518-523.
[18] S. Kim and B.A. Barry, J. Phys. Chem. 105 (2001), 4072-4083.
[19] K. Fahmy, Recent Res. Devel. Chem. 2 (2001), 1-17.
[20] C.W. Wharton, Nat. Prod. Rep. 17 (2000), 447-453.
[21] W. Mäntele, TIBS 18 (1993), 197-202.
[22] A. Barth and C. Zscherp, Quart. Rev. Biophys. 35 (2002), 369-430.
[23] C. Zscherp and A. Barth, Biochemistry 40 (2001), 1875-1883.
[24] A. Barth, Biochim. Biophys. Acta 1767 (2007), 1073-1101.
[25] S.Y. Venyaminov and F.G. Prendergast, Anal. Biochem. 248 (1997), 234-245.
[26] R.J. Ellis, Curr. Opin. Struct. Biol. 11 (2001), 114-119.
[27] U.P. Fringeli, In situ infrared attenuated total reflection membrane spectroscopy, in: Internal reflection spectroscopy, ed. F.M.J. Mirabella, Marcel Dekker, Inc., New York, 1992, 255-324.


[28] E. Goormaghtigh, V. Raussens, and J.-M. Ruysschaert, Biochim. Biophys. Acta 1422 (1999), 105-185.
[29] L.K. Tamm and S.A. Tatulian, Quart. Rev. Biophys. 30 (1997), 365-429.
[30] K.A. Oberg and A.L. Fink, Anal. Biochem. 256 (1998), 92-106.
[31] P. Rigler, W.P. Ulrich, P. Hoffmann, M. Mayer, and H. Vogel, CHEMPHYSCHEM 4 (2003), 268-275.
[32] P. Rigler, W.P. Ulrich, and H. Vogel, Langmuir 20 (2004), 7901-7903.
[33] F. Siebert, W. Mäntele, and W. Kreutz, Biophys. Struct. Mech. 6 (1980), 139-146.
[34] J.O. Alben, D. Beece, S.F. Bowne, L. Eisenstein, H. Frauenfelder, D. Good, M.C. Marden, P.P. Moh, L. Reinisch, A.H. Reynolds, and K.T. Yue, Phys. Rev. Lett. 44 (1980), 1157-1160.
[35] K.J. Rothschild, M. Zagaeski, and W.A. Cantore, Biochem. Biophys. Res. Commun. 103 (1981), 483-489.
[36] K.J. Rothschild, J. Bioenerg. Biomembr. 24 (1992), 147-167.
[37] A. Maeda, Israel J. Chem. 35 (1995), 387-400.
[38] J. Breton, Biochim. Biophys. Acta 1507 (2001), 180-193.
[39] W. Mäntele, Infrared vibrational spectroscopy of the photosynthetic reaction center, in: The photosynthetic reaction center, eds. J. Deisenhofer and J.R. Norris, Vol. 2, Academic Press, San Diego, 1993, 239-283.
[40] W. Mäntele, Infrared vibrational spectroscopy of reaction centers, in: Anoxygenic Photosynthetic Bacteria, eds. E. Blankenship, M.T. Madigan, and C.E. Bauer, Kluwer Academic Publishers, Dordrecht, 1995, 627-647.
[41] E. Nabedryk, Light-induced Fourier transform infrared difference spectroscopy of the primary electron donor in photosynthetic reaction centers, in: Infrared spectroscopy of biomolecules, eds. H.H. Mantsch and D. Chapman, Wiley-Liss, New York, 1996, 39-82.
[42] J.E. Baenziger, K.W. Miller, and K.J. Rothschild, Biochemistry 32 (1993), 5448.
[43] J.E. Baenziger, K.W. Miller, and K.J. Rothschild, Biophys. J. 61 (1992), 983-992.
[44] F. Scheirlinckx, V. Raussens, J.-M. Ruysschaert, and E. Goormaghtigh, Biochem. J. 382 (2004), 121-129.
[45] M. Iwaki, N.P.J. Cotton, P.G. Quirk, P.R. Rich, and J.B. Jackson, J. Am. Chem. Soc. 128 (2006), 2621-2629.
[46] K. Fahmy, Biophys. J. 75 (1998), 1306-1318.
[47] E. Agic, O. Klein, and W. Mäntele, Binding and interaction of effector molecules to proteins studied with an attenuated total reflection infrared (ATR-IR) microdialysis cell, in: Book of abstracts: 10th European conference on the spectroscopy of biological molecules, eds. B. Szalontai and Z. Kóta, JATEPress, Szeged, 2003, 93.
[48] S. Gourion-Arsiquaud, S. Chevance, P. Bouyer, L. Garnier, J.-L. Montillet, A. Bondon, and C. Berthomieu, Biochemistry 44 (2005), 8652-8663.
[49] M. Krasteva, S. Kumar, and A. Barth, Spectroscopy 20 (2006), 89-94.
[50] A.J. White, K. Drabble, and C.W. Wharton, Biochem. J. 306 (1995), 843-849.
[51] P. Hinsmann, M. Haberkorn, J. Frank, P. Svasek, M. Harasek, and B. Lendl, Appl. Spectrosc. 55 (2001), 241-251.
[52] R. Masuch and D.A. Moss, Stopped flow system for FTIR difference spectroscopy of biological macromolecules, in: Spectroscopy of biological molecules: new directions, eds. J. Greve, G.J. Puppels, and C. Otto, Kluwer Academic Publishers, Dordrecht, 1999, 689-690.
[53] E. Kauffmann, N.C. Darnton, R.H. Austin, C. Batt, and K. Gerwert, Proc. Natl. Acad. Sci. USA 98 (2001), 6646-6649.
[54] H. Fabian and D. Naumann, Methods 34 (2004), 28-40.
[55] J.H. Kaplan, B. Forbush, and J.F. Hoffman, Biochemistry 17 (1978), 1929-1935.
[56] M. Goeldner and R. Givens, eds., Dynamic studies in biology. Wiley-VCH, Weinheim, 2005.
[57] M. Liu, E.-L. Karjalainen, and A. Barth, Biophys. J. 88 (2005), 3615-3624.
[58] A. Barth, W. Mäntele, and W. Kreutz, FEBS Lett. 277 (1990), 147-150.
[59] M. Lübben and K. Gerwert, FEBS Lett. 397 (1996), 303-307.
[60] L. Zhang, R. Buchet, and G. Azzar, Biophys. J. 86 (2004), 3873-3881.
[61] J. Bandorowicz-Pikula, A. Wrzosek, M. Danieluk, S. Pikula, and R. Buchet, Biochem. Biophys. Res. Commun. 263 (1999), 775-779.
[62] F. Moro, V. Fernandez-Saiz, and A. Muga, Protein Sci. 15 (2006), 223-233.
[63] H. Fabian, D. Chapman, and H.H. Mantsch, New trends in isotope-edited infrared spectroscopy, in: Infrared spectroscopy of biomolecules, eds. H.H. Mantsch and D. Chapman, Wiley-Liss, New York, 1996, 341-352.
[64] F. Von Germar, A. Galán, O. Llorca, J.L. Carrascosa, J.M. Valpuesta, W. Mäntele, and A. Muga, J. Biol. Chem. 274 (1999), 5508-5513.
[65] C. Raimbault, R. Buchet, and C. Vial, Eur. J. Biochem. 240 (1996), 134-142.
[66] C. Raimbault, F. Besson, and R. Buchet, Eur. J. Biochem. 244 (1997), 343-351.

76

A. Barth / The Study of Protein Reactions by Reaction-Induced Infrared Difference Spectroscopy

[67] E.M. White, A.R. Holland, and G. MacDonald, Biochemistry 47 (2008), 84-91. [68] V. Cepus, A.J. Scheidig, R.S. Goody, and K. Gerwert, Biochemistry 37 (1998), 10263-10271. [69] X. Du, H. Frei, and S.-H. Kim, J. Biol. Chem. 275 (2000), 8492-8500. [70] H. Cheng, S. Sukal, H. Deng, T.S. Leyh, and R. Callender, Biochemistry 40 (2001), 4035-4043. [71] B.C. Butler, R.H. Hanchett, H. Rafailov, and G. MacDonald, Biophys. J. 82 (2002), 2198-2210. [72] J. Backmann, H. Fabian, and D. Naumann, FEBS Lett. 364 (1995), 175-178. [73] D. Reinstdler, H. Fabian, J. Backmann, and D. Naumann, Biochemistry 35 (1996), 15822-15830. [74] R.B. Dyer, F. Gai, W.H. Woodruff, R. Gilmanshin, and R.H. Callender, Acc. Chem. Res. 31 (1998), 709-716. [75] R. Gilmanshin, S. Williams, R.H. Callender, W.H. Woodruff, and R.B. Dyer, Biochemistry 36 (1997), 15006-15012. [76] S. Williams, T.P. Causgrove, R. Gilmanshin, K.S. Fang, R.H. Callender, W.H. Woodruff, and R.B. Dyer, Biochemistry 35 (1996), 691-697. [77] J. Wang and M.A. El-Sayed, Biophys. J. 76 (1999), 2777-2783. [78] C.M. Phillips, Y. Mizutani, and R.M. Hochstrasser, Proc. Natl. Acad. Sci. USA 92 (1995), 7292-7296. [79] A.P. Ramajo, S.A. Petty, and M. Volk, Chem. Phys. 323 (2006), 11-20. [80] G. Panick, R. Malessa, R. Winter, G. Rapp, K.J. Frye, and C.A. Royer, J. Mol. Biol. 275 (1998), 389402. [81] G. Panick and R. Winter, Biochemistry 39 (2000), 1862-1869. [82] A. Barth, K. Hauser, W. Mntele, J.E.T. Corrie, and D.R. Trentham, J. Am. Chem. Soc. 117 (1995), 10311-10316. [83] A. Barth, J.E.T. Corrie, M.J. Gradwell, Y. Maeda, W. Mntele, T. Meier, and D.R. Trentham, J. Am. Chem. Soc. 119 (1997), 4149-4159. [84] V. Cepus, C. Ulbrich, C. Allin, A. Troullier, and K. Gerwert, Methods Enzymol. 291 (1998), 223-245. [85] A. Barth, Time-resolved IR spectroscopy with caged compounds: An introduction, in: Dynamic studies in biology: Phototriggers, photoswitches and caged biomolecules, eds. M. Goeldner and R.S. Givens, Wiley-VCH, Weinheim, 2005, 369-399. 
[86] V. Jayaraman, IR spectroscopy with caged compounds: Selected applications, in: Dynamic studies in biology: Phototriggers, photoswitches and caged biomolecules, eds. M. Goeldner and R.S. Givens, Wiley-VCH, Weinheim, 2005, 400-410. [87] A. Barth and C. Zscherp, FEBS Lett. 477 (2000), 151-156. [88] D. Moss, E. Nabedryk, J. Breton, and W. Mntele, Eur. J. Biochem. 187 (1990), 565-572. [89] E. Hamacher, J. Kruip, M. Rgner, and W. Mntele, Spectrochim. Acta A 52 (1996), 107-121. [90] M. Leonhard and W. Mntele, Biochemistry 32 (1993), 4532-4538. [91] M. Bauscher, M. Leonhard, D.A. Moss, and W. Mntele, Biochim. Biophys. Acta 1183 (1993), 59-71. [92] P. Hellwig, B. Rost, U. Kaiser, C. Ostermeier, H. Michel, and W. Mntele, FEBS Lett. 385 (1996), 5357. [93] P. Hellwig, J. Behr, C. Ostermeier, O.M. Richter, U. Pfitzner, A. Odenwald, B. Ludwig, H. Michel, and W. Mntele, Biochemistry 37 (1998), 7390-7399. [94] J. Behr, P. Hellwig, W. Mntele, and H. Michel, Biochemistry 37 (1998), 7400-7406. [95] P. Hellwig, C. Ostermeier, H. Michel, B. Ludwig, and W. Mntele, Biochim. Biophys. Acta 1409 (1998), 107-112. [96] P. Hellwig, T. Soulimane, G. Buse, and W. Mntele, FEBS Lett. 458 (1999), 83-86. [97] B. Rost, J. Behr, P. Hellwig, O.M. Richter, B. Ludwig, H. Michel, and W. Mntele, Biochemistry 38 (1999), 7565-7571. [98] J. Behr, H. Michel, W. Mntele, and P. Hellwig, Biochemistry 39 (2000), 1356-1363. [99] E.D. Dodson, X.J. Zhao, W.S. Caughey, and C.M. Elliott, Biochemistry 35 (1996), 444-452. [100] R.B. Gennis, FEBS Lett. 555 (2003), 2-7. [101] G. Tollin, J. Bioenerg. Biomembr. 27 (1995), 303-309. [102] R. Traber, H.E.A. Kramer, and P. Hemmerich, Biochemistry 21 (1982), 1687-1693. [103] J. Contzen and C. Jung, Biochemistry 38 (1999), 16253-16260. [104] M. Lbben, A. Prutsch, B. Mamat, and K. Gerwert, Biochemistry 38 (1999), 2048-2056. [105] Y. Yamazaki, H. Kandori, and T. Mogi, J. Biochem. 126 (1999), 194-199. [106] Y. Yamazaki, H. Kandori, and T. Mogi, J. Biochem. 125 (1999), 1131-1136. 
[107] M. Liu and A. Barth, J. Biol. Chem. 278 (2003), 10112-10118. [108] A. Barth, F. von Germar, W. Kreutz, and W. Mntele, J. Biol. Chem. 271 (1996), 30637-30646. [109] R. Brudler, R. Rammelsberg, T.T. Woo, E.D. Getzoff, and K. Gerwert, Nature Struct. Biol. 8 (2001), 265-270. [110] D. Reinstdler, H. Fabian, and D. Naumann, Proteins 34 (1999), 303-316. [111] A. Troullier, D. Reinstdler, Y. Dupont, D. Naumann, and V. Forge, Nature Struct. Biol. 7 (2000), 7886.

A. Barth / The Study of Protein Reactions by Reaction-Induced Infrared Difference Spectroscopy

77

[112] A. Barth, W. Mntele, and W. Kreutz, J. Biol. Chem. 272 (1997), 25507-25510. [113] R.S. Chittock, S. Ward, A.-S. Wilkinson, P. Caspers, B. Mensch, M.G.P. Page, and C.W. Wharton, Biochem. J. 338 (1999), 153-159. [114] A. Troullier, K. Gerwert, and Y. Dupont, Biophys. J. 71 (1996), 2970-2983. [115] F. Scheirlinckx, R. Buchet, J.-M. Ruysschaert, and E. Goormaghtigh, Eur. J. Biochem. 268 (2001), 3644-3653. [116] S. Krimm and Y. Abe, Proc. Natl. Acad. Sci. USA 69 (1972), 2788-2792. [117] P. Hamm, M. Lim, W.F. DeGrado, and R.M. Hochstrasser, Proc. Natl. Acad. Sci. USA 96 (1999), 2036-2041. [118] H. Torii and M. Tasumi, J. Raman Spectrosc. 229 (1998), 81-86. [119] H. Torii, T. Tatsumi, T. Kanazawa, and M. Tasumi, J. Phys. Chem. B 102 (1998), 309-314. [120] J.R. Parrish and E.R. Blout, Biopolymers 11 (1972), 1001-1020. [121] S.T.R. Walsh, R.P. Cheng, W.W. Wright, D.O.V. Alonso, V. Daggett, J.M. Vanderkooi, and W.F. DeGrado, Protein Sci. 12 (2003), 520-531. [122] N.A. Nevskaya and Y.N. Chirgadze, Biopolymers 15 (1976), 637-648. [123] S.Y. Venyaminov and N.N. Kalnin, Biopolymers 30 (1990), 1259-1271. [124] Y.N. Chirgadze, B.V. Shestopalov, and S.Y. Venyaminov, Biopolymers 12 (1973), 1337-1351. [125] S.Y. Venyaminov and N.N. Kalnin, Biopolymers 30 (1990), 1243-1257. [126] Y.N. Chirgadze, O.V. Fedorov, and N.P. Trushina, Biopolymers 14 (1975), 679-694. [127] A. Barth, Infrared spectroscopy, in: Methods in protein structure and stability analysis: vibrational spectroscopy, eds. V.N. Uversky and E.A. Permyakov, Nova Science Publishers, New York, 2007, 69151. [128] M. Liu and A. Barth, J. Biol. Chem. 279 (2004), 49902-49909. [129] W. Mntele, Infrared and Fourier-transform infrared spectroscopy, in: Biophysical techniques in photosynthesis, eds. J. Amesz and A.J. Hoff, Kluwer Academic Publishers, Dordrecht, 1996, 137-160. [130] D.D. Schlereth and W. Mntele, Biochemistry 32 (1993), 1118-1126. [131] D.D. Schlereth, V.M. Fernandez, and W. 
Mntele, Biochemistry 32 (1993), 9199-9208. [132] D.D. Schlereth and W. Mntele, Biochemistry 31 (1992), 7494-7502. [133] A. Barth, Prog. Biophys. Mol. Biol. 74 (2000), 141-173. [134] C. Zscherp, R. Schlesinger, J. Tittor, D. Oesterhelt, and J. Heberle, Proc. Natl. Acad. Sci. USA 96 (1999), 5498-5503. [135] J.J. Katz, L.L. Shipman, T.M. Cotton, and T.R. Janson, Chlorophyll aggregation: coordination interactions in chlorophyll monomers, dimers, and oligomers, in: The porphins, ed. D. Dolphin, Vol. 5, Academic Press, New York, 1978, 401-458. [136] J.J. Katz, R.C. Dougherty, and L.J. Boucher, Infrared and nuclear magnetic resonance spectroscopy of chlorophyll, in: The chlorophylls, eds. L.P. Vernon and G.R. Seely, Academic Press, New York, 1966, 185-251. [137] M. Lutz and W. Mntele, Vibrational spectroscopy of chlorophylls, in: Chlorophylls, ed. H. Scheer, CRC Press, Boca Raton, Florida, 1991, 855-902. [138] W.T. Potter, M.P. Tucker, R.A. Houtchens, and W.S. Caughey, Biochemistry 26 (1987), 4699-4707. [139] P.K. Slaich, W.U. Primrose, D.H. Robinson, C.W. Wharton, A.J. White, K. Drabble, and G.C.K. Roberts, Biochem. J. 288 (1992), 167-173. [140] M. Liu, M. Krasteva, and A. Barth, Biophys. J. 89 (2005), 4352-4363. [141] K. Gerwert and F. Siebert, EMBO J. 5 (1986), 805-811. [142] A. Barth, J. Biol. Chem. 274 (1999), 22170-22175. [143] A. Barth and N. Bezlyepkina, J. Biol. Chem. 279 (2004), 51888-51896. [144] S.W. Englander and N.R. Kallenbach, Quart. Rev. Biophys. 4 (1984), 521-655. [145] P.I. Haris, G.T. Robillard, A.A. Van Dijk, and D. Chapman, Biochemistry 31 (1992), 6279-6284. [146] S. Sonar, C.P. Lee, M. Coleman, N. Patel, X. Liu, T. Marti, H.G. Khorana, U.L. RajBhandary, and K.J. Rothschild, Struct.Biol. 1 (1994), 512-517. [147] J.L. Spudich, Struct.Biol. 1 (1994), 495-496. [148] M.A. Palafox, Trends Appl. Spectrosc. 2 (1998), 37-57. [149] R.J. Meier, Vibrational Spectroscopy 43 (2007), 26-37. [150] J. Fisher, J.G. Belasco, S. Khosla, and J.R. 
Knowles, Biochemistry 19 (1980), 2895-2901. [151] A.J. White, K. Drabble, S. Ward, and C.W. Wharton, Biochem. J. 287 (1992), 317-323. [152] D. Thoenges and A. Barth, J. Biomol. Screen. 7 (2002), 353-357. [153] K. Karmali, A. Karmali, A. Teixeira, and M.J.M. Curto, Anal. Biochem. 333 (2004), 320-327. [154] R. Pacheco, M.L.M. Serralheiro, A. Karmali, and P.I. Haris, Anal. Biochem. 322 (2003), 208-214. [155] R. Pacheco, A. Karmali, M.L.M. Serralheiro, and P.I. Haris, Anal. Biochem. 346 (2005), 49-58. [156] W. Wright and J.M. Vanderkooi, Biospectroscopy 3 (1997), 457-467. [157] H. Takeuchi, H. Murata, and I. Harada, J. Am. Chem. Soc. 110 (1988), 392-397.

78

A. Barth / The Study of Protein Reactions by Reaction-Induced Infrared Difference Spectroscopy

[158] W. Hasselbach and M. Makinose, Biochem. Z. 333 (1961), 518-528. [159] J.P. Andersen, Biochim. Biophys. Acta 988 (1989), 47-72. [160] C. Toyoshima and G. Inesi, Annu. Rev. Biochem. 73 (2004), 269-292. [161] H.-J. Apell, Bioelectrochemistry 63 (2004), 149-156. [162] J.V. Mller, P. Nissen, T.L.-M. Srensen, and M. le Maire, Curr. Opin. Struct. Biol. 15 (2005), 387393. [163] A. Barth, Spectroscopy (2008), in press. [164] W. Hasselbach and W. Waas, Ann. N. Y. Acad. Sci. 402 (1982), 459-469. [165] A. Barth, W. Mntele, and W. Kreutz, Biochim. Biophys. Acta 1057 (1991), 115-123. [166] R. Buchet, I. Jona, and A. Martonosi, Biochim. Biophys. Acta 1104 (1992), 207-214. [167] R. Buchet, I. Jona, and A. Martonosi, Biochim. Biophys. Acta 1069 (1991), 209-217. [168] H. Georg, A. Barth, W. Kreutz, F. Siebert, and W. Mntele, Biochim. Biophys. Acta 1188 (1994), 139150. [169] M. Liu and A. Barth, Biophys. J. 85 (2003), 3262-3270. [170] L. De Meis and A. Vianna, Annu. Rev. Biochem. 48 (1979), 275-292. [171] T.L.-M. Srensen, J.V. Mller, and P. Nissen, Science 304 (2004), 1672-1675. [172] J. Andersson, K. Hauser, E.-L. Karjalainen, and A. Barth, Biophys. J. in press (2008). [173] W.P. Jencks, Biosci. Rep. 15 (1995), 283-287. [174] J.P. Andersen, Biosci. Rep. 15 (1995), 243-261. [175] C. Toyoshima and H. Nomura, Nature 418 (2002), 605-611. [176] C. Toyoshima, H. Nomura, and T. Tsuda, Nature 432 (2004), 361-368. [177] M. Liu and A. Barth, Biopolymers (Biospectroscopy) 67 (2002), 267-270. [178] C. Toyoshima and T. Mizutani, Nature 430 (2004), 529-535. [179] D.L. Stokes and N.M. Green, Annu. Rev. Biophys. Biomol. Struct. 32 (2003), 445-468. [180] C. Allin and K. Gerwert, Biochemistry 40 (2001), 3037-3046. [181] H. Cheng, S. Sukal, R. Callender, and T.S. Leyh, J. Biol. Chem. 276 (2001), 9931-9935. [182] A. Barth, W. Kreutz, and W. Mntele, Biochim. Biophys. Acta 1194 (1994), 75-91. [183] X. Yu, L.N. Hao, and G. Inesi, J. Biol. Chem. 269 (1994), 16656-16661. [184] C. Peinelt and H.J. 
Apell, Biophys. J. 82 (2002), 170-181. [185] F. Tadini-Buoninsegni, G. Bartolommei, M.R. Moncelli, R. Guidelli, and G. Inesi, J. Biol. Chem. 281 (2006), 37720-37727. [186] W.P. Jencks, Methods Enzymol. 171 (1989), 145-164. [187] A. Barth and W. Mntele, Biophys. J. 75 (1998), 538-544. [188] A. Barth, Biopolymers (Biospectroscopy) 67 (2002), 237-241. [189] T. Kanazawa and P.D. Boyer, J. Biol. Chem. 248 (1973), 3163-3172. [190] J. Andersson and A. Barth, Biopolymers 82 (2006), 353-357. [191] T.C. Bruice and F.C. Lightstone, Acc. Chem. Res. 32 (1999), 127-136. [192] I.D. Brown, The chemical bond in inorganic chemistry. The bond valence model. Oxford University Press, Oxford, 2002. [193] W.P. Jencks, Methods Enzymol. 6 (1963), 914-928.

Biological and Biomedical Infrared Spectroscopy, A. Barth and P.I. Haris (Eds.), IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-045-2-79

Ultrafast 2D-IR Vibration Echo Spectroscopy of Proteins


Haruto ISHIKAWA, Seongheun KIM, Ilya J. FINKELSTEIN, and Michael D. FAYER¹
Department of Chemistry, Stanford University, Stanford, CA 94305-5080, USA

Abstract. The investigation of protein structural dynamics on short time scales (100 fs to 100 ps) using ultrafast two-dimensional infrared (2D-IR) vibrational echo spectroscopy is presented. Under thermal equilibrium conditions, a protein's structure is constantly fluctuating among conformations associated with different positions on the broad, rough minimum of the free energy landscape. Although different conformational substates may be apparent in a vibrational absorption spectrum, linear IR absorption spectra cannot provide information on structural dynamics because dynamical information is masked by inhomogeneous broadening of the line shapes. 2D-IR vibrational echo spectroscopy makes structural fluctuations a direct experimental observable. Changes in structure manifest themselves through the time evolution of the 2D-IR line shape (spectral diffusion). Here, details of the experimental method are outlined, including the pulse sequence, heterodyne detection to provide full phase information, and the extraction of the molecular dynamics from 2D-IR spectra. The method and the nature of the information that can be obtained are illustrated with four examples: the influence of mutations on myoglobin dynamics, differences between the dynamics of neuroglobin and myoglobin, the effect of the disulfide bond in neuroglobin on its structural dynamics, and how substrate binding to the enzyme horseradish peroxidase influences its structural fluctuations.

Keywords. Two-dimensional infrared; vibrational echo; protein dynamics; heme proteins; neuroglobin; horseradish peroxidase; myoglobin

¹ Corresponding Author: E-mail: fayer@stanford.edu

Introduction

Proteins and other biological molecules are dynamic structures, and their dynamics are intimately related to their function. For example, the diffusion of a ligand through a protein like myoglobin or hemoglobin to the active site is made possible by structural fluctuations[1,2]. Even under thermal equilibrium conditions, proteins are never at rest. A folded protein in a particular structure occupies a minimum on the free energy landscape[3,4]. There may be more than one minimum. Each minimum corresponds to a different conformation called a conformational substate (see Figure 1)[3,4]. Within the minimum for a particular conformational substate, the free energy landscape is broad and rough, with many local minima separated by low barriers that have a wide range of heights. Cryochemical experiments confirm that a protein in a substate has a variety of structures, which are represented in Figure 1 by the shallow local minima in a particular substate valley[5,6]. At normal biological temperatures under thermal equilibrium conditions, there is sufficient thermal energy to produce transitions among these minima, and these transitions are responsible for protein structural fluctuations. As indicated in Figure 1, there can be hierarchies of barrier heights, giving rise to structural fluctuations on different time scales.

The linear IR absorption spectrum of a protein can provide information on aspects of protein structure. For example, CO bound to the active site of myoglobin (MbCO) displays three CO peaks in the IR absorption spectrum of the CO stretch region (~1950 cm⁻¹)[7]. These correspond to three substates of the protein that produce distinct configurations of the distal histidine (His64)[8-12]. However, the IR absorption spectrum does not provide information on protein dynamics. The absorption bands in MbCO and other proteins are inhomogeneously broadened because of the large number of structural configurations associated with the energy landscape surrounding the minimum for each substate (see Figure 1). The width of the absorption band reflects the distribution of protein configurations but does not give information on interconversion among configurations.

Ultrafast two-dimensional infrared (2D-IR) vibrational echo spectroscopy can obtain information on protein dynamics and structure that cannot be obtained from the IR absorption spectrum alone. The ultrafast vibrational echo method was first applied as a one-dimensional experiment in 1993[13] and first applied to proteins in 1996[14-16].
Since these early experiments[17,18], vibrational echo spectroscopy has made major advances in both the nature of the technique and the range of applications. It has become a full two dimensional spectroscopy akin to 2D NMR[19] but operating on time scales many orders of magnitude faster and directly examining the structural degrees of freedom of complex molecular systems[20,21].
[Figure 1 labels: folded protein; substates; energy landscape with low barriers, fast fluctuations; energy landscape with moderate barriers, moderately fast fluctuations.]

Figure 1. Schematic illustration of a conformational energy landscape showing the folded protein well with two substate minima and the energy landscape structure about the minima. The hierarchies of barrier height give rise to structural fluctuations on different time scales.

Ultrafast 2D-IR vibrational echo spectroscopy has been applied to a wide variety of problems. Because of the nature of the method, it is a useful tool for the study of problems involving rapid dynamics under thermal equilibrium conditions in condensed phases. Such problems are ubiquitous in nature and difficult to study by other means. 2D-IR experiments can have a temporal resolution of < 50 fs, which is sufficiently fast to study the fastest chemical processes. The vibrational excitation associated with 2D-IR experiments produces negligible perturbations of molecular systems. Vibrational excitation does not change the chemical properties of the samples, in contrast to electronic excitation, which may produce substantial perturbations. 2D-IR experiments can also be useful as a tool for chemical structural analysis by revealing the relationship among different mechanical degrees of freedom of a molecular or biomolecular system. 2D-IR vibrational echo experiments have been applied to study fast chemical exchange reactions and solution dynamics[22-24], water dynamics[25-30], hydrogen network evolution[31], intramolecular vibrational energy relaxation[32], and, of particular interest here, protein structure and dynamics[12,20,33-53].

The pulse sequence in a 2D-IR vibrational echo experiment induces and then probes the coherent evolution of excitations (vibrations) of a molecular system. The signal is generated by a sequence of three ultrashort IR pulses tuned to the vibrational transitions of interest. The first pulse in the sequence causes the vibrational modes of an ensemble of molecules to oscillate, initially all with the identical phase. The second pulse in some sense labels the frequencies of the molecules initially excited by the first pulse. During the period between the second and third pulses, structural changes in the system cause the frequencies of the labeled molecules to change.
The third pulse begins the readout of the molecular frequencies and generates the observable signal, a fourth pulse, the vibrational echo. The characteristic spectrum, obtained by observing the frequencies, intensities, and phases of the vibrational echo and Fourier transforming into the frequency domain, is sensitive to changes in the environments of individual molecules during the experiment, even if the aggregate populations in distinct environments do not change. For example, the structural fluctuations of a protein or the formation and dissociation of molecular complexes under thermal equilibrium conditions can be observed.

In this chapter, the theoretical background, methodology, and recent progress of 2D-IR vibrational echo measurements on heme proteins are described. For heme proteins, 2D-IR vibrational echo experiments use the heme-ligated CO vibration as a direct sensor of protein dynamics[11,12,20,36,47-49,54-57]. The linear IR absorption spectrum of the CO stretching mode of a heme protein generally displays several bands that reflect structural differences, i.e., distinct structural substates[7,58]. While the linear IR absorption spectrum cannot provide information on a protein's structural dynamics, the time evolution of the 2D-IR spectra (spectral diffusion) of the CO bands reveals the fast protein structural fluctuations of the substates. The heme-ligated CO absorption bands are observed in the 1900 cm⁻¹ to 2000 cm⁻¹ range, which is separate from the absorption of other groups.

Three protein systems will be discussed. Myoglobin and myoglobin mutants will be used to demonstrate how small changes in the amino acid sequence can have significant effects on protein dynamics as sensed by CO bound at the active site[51]. Neuroglobin, a recently discovered heme protein found in the brain and nerve tissue[59], is compared to myoglobin, and it is used to study the influence that removing a disulfide bond has on protein dynamics[51,53].
Finally, the enzyme horseradish peroxidase is studied with and without a bound substrate. It is demonstrated that substrate binding makes substantial changes in the enzyme's dynamics[36].

1. Methodology

The ultrafast 2D-IR vibrational echo measurement involves three femtosecond IR pulses that are tuned to the frequency of the vibrational modes of interest[60-62]. Because the pulses are very short, they have a broad bandwidth that can simultaneously excite a number of vibrational modes or a broad spectral feature. The femtosecond IR pulses employed in the experiments are generated using a Ti:Sapphire regeneratively amplified laser/optical parametric amplifier (OPA) system[63]. The spectral widths of the heme protein-CO transitions discussed below are relatively narrow (10-20 cm⁻¹), and the spread in the transition frequencies that occur in the various systems is ~100 cm⁻¹. Therefore, the pulse durations in the IR are tailored to produce the appropriate bandwidth for the experiments. The regenerative amplifier produces transform-limited ~100 fs, ~0.5 mJ pulses centered at ~800 nm at a 1 kHz repetition rate. These are used to pump an IR OPA. The bandwidth and pulse duration of the IR pulses are 150 cm⁻¹ and 100 fs, respectively.

A 2D-IR vibrational echo spectroscopy setup is illustrated schematically in Figure 2. The three pulses have wave vectors k1, k2, and k3, with a variable delay time τ between pulses 1 and 2 and a variable delay time Tw between pulses 2 and 3. The vibrational echo pulse is detected in the ke = k2 + k3 - k1 phase-matched direction. When the vibrational echo pulse is sent directly into an IR detector, its intensity is measured but phase information is lost. To perform Fourier transforms from the time domain into the frequency domain, phase relationships are necessary. To obtain the vibrational echo signal E-field rather than its intensity, the echo pulse is overlapped with a fifth pulse called the local oscillator (LO). The function of the LO is to phase resolve and optically heterodyne amplify the vibrational echo signal.
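The quoted pairing of a 100 fs pulse duration with a 150 cm⁻¹ bandwidth can be checked with the time-bandwidth product of a transform-limited pulse. The sketch below is only an illustration: it assumes a Gaussian intensity envelope (time-bandwidth product 0.441), which the text does not state, and the function name is hypothetical.

```python
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def transform_limited_bandwidth_cm1(fwhm_fs, tbp=0.441):
    """Return the FWHM spectral width (cm^-1) of a transform-limited pulse.

    fwhm_fs: pulse duration (intensity FWHM) in femtoseconds.
    tbp: time-bandwidth product; 0.441 holds for a Gaussian envelope,
         which is an assumption made for this illustration.
    """
    delta_nu_hz = tbp / (fwhm_fs * 1e-15)  # spectral width in Hz
    return delta_nu_hz / C_CM_PER_S        # convert Hz to wavenumbers


if __name__ == "__main__":
    width = transform_limited_bandwidth_cm1(100.0)
    print(f"100 fs pulse: about {width:.0f} cm^-1 FWHM")
```

Under this assumption a 100 fs pulse spans roughly 147 cm⁻¹, consistent with the ~150 cm⁻¹ quoted above and wide enough to cover the ~100 cm⁻¹ spread of CO transition frequencies.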
In the heterodyne detected vibrational echo experiments, the vibrational echo pulse, which is overlapped with the LO pulse, is passed through a monochromator and then detected with a 32-element HgCdTe (MCT) IR array detector.

The following is a qualitative description of the vibrational echo experiment. The first pulse excites the molecules to a coherent superposition of the ground state (0) and first excited state (1), with all of the vibrational oscillators initially oscillating in phase at their initial frequencies. The phase relationships among the oscillators decay quickly because of inhomogeneous broadening of the spectral line (the range of transition frequencies across the spectroscopic line), with additional contributions from fast fluctuations of the transition frequencies caused by structural dynamics of the system. This initial loss of phase relationships is called the free induction decay (FID). The second pulse transfers the initial coherent superposition state of each molecule into a population state in either the 0 or the 1 state. Because of structural evolution of the system during the population period, Tw, the molecular oscillators (the CO stretch for the experiments discussed below) undergo frequency shifts, called spectral diffusion, which cause molecules to lose memory of their initial frequencies. The third pulse again generates coherent superposition states of the oscillators. Initially, the oscillators are not in phase, but the pulse sequence initiates a rephasing process. If some memory of the initial frequency is retained, the vibrational echo pulse is generated at a time τ after the third pulse, because the vibrational oscillators are again oscillating in phase. Decay of memory of the initial frequencies of the molecular oscillators caused by structural evolution of the system (spectral diffusion) causes the 2D spectrum to change shape. Even if all of the memory of the initial frequencies has been lost because of rapid structural evolution, there is still a vibrational echo signal, but the 2D spectrum is symmetrical, and its width and shape reflect the width and shape of the absorption spectrum. Observation of the vibrational echo is limited by the vibrational lifetime (T1). Decay of the excited vibrations to the ground state causes the amplitude of the signal to decay. If the vibrational lifetime causes the vibrational echo signal to decay to zero before spectral diffusion is complete (all structures have been sampled), the 2D spectrum will not reach its symmetrical shape.

In short, the first pulse labels the initial structures of the species in the sample and initiates the first coherence period. The second pulse ends the first coherence period τ and starts clocking the waiting time, during which the frequency-labeled molecular oscillators experience structural dynamics that can cause them to evolve to different frequencies. The third pulse ends the waiting period Tw and begins a third period of length τ, which ends with emission of the vibrational echo pulse. The echo signal reads out information about the final structures experienced by the labeled oscillators. A 2D vibrational echo spectrum is obtained with the initial labeled frequencies as one axis (ω_τ) and the final frequencies of the sample as the other axis (ω_m).

Figure 2. Schematic layout of the 2D-IR vibrational echo experiment. The three mid-IR pulses have wave vectors k1, k2, and k3, with variable delay time τ between pulses 1 and 2 (k1 and k2) and variable delay time Tw between pulses 2 and 3 (k2 and k3). The vibrational echo signal is detected in the phase-matched direction. The emitted vibrational echo pulse and the local oscillator pulse are combined to enable the measurement of both intensity and phase information.
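The dephasing and rephasing steps of the qualitative description can be illustrated with a toy ensemble calculation. This is a minimal sketch under stated assumptions, not the authors' analysis: each oscillator gets a Gaussian-distributed frequency offset, spectral diffusion during Tw is reduced to a single correlation coefficient `corr` between the offsets before and after the waiting time, and homogeneous dephasing and the vibrational lifetime are ignored. The function name and all parameter values are hypothetical.

```python
import numpy as np


def echo_amplitude(tau, t3, corr, sigma=2.0, n=5000, seed=0):
    """Ensemble-averaged signal amplitude versus time t3 after pulse 3.

    tau:   delay between pulses 1 and 2 (ps).
    t3:    array of times after the third pulse (ps).
    corr:  correlation between an oscillator's frequency offset before
           and after Tw (1 = full memory, 0 = memory lost).
    sigma: inhomogeneous width in rad/ps (illustrative value).
    """
    rng = np.random.default_rng(seed)
    d1 = rng.normal(0.0, sigma, n)  # offsets during the first period
    # Offsets during the third period, partially correlated with d1.
    d3 = corr * d1 + np.sqrt(1.0 - corr**2) * rng.normal(0.0, sigma, n)
    # Phase gained during t3 minus the phase accumulated during tau.
    phase = np.outer(t3, d3) - tau * d1
    return np.abs(np.exp(1j * phase).mean(axis=1))


if __name__ == "__main__":
    t3 = np.linspace(0.0, 2.0, 201)
    with_memory = echo_amplitude(1.0, t3, corr=1.0)
    no_memory = echo_amplitude(1.0, t3, corr=0.0)
    print("rephasing peak at t3 =", t3[np.argmax(with_memory)], "ps")
    print("amplitude at t3 = tau without memory:", no_memory[100])
```

With full frequency memory the oscillators come back into phase at t3 = τ, which is the echo. When the offsets before and after Tw are uncorrelated, the rephasing peak at t3 = τ disappears in this model.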

An example is given in Figure 3 for the CO adduct of the L29F mutant of Mb[51]. Figure 3A shows the linear FTIR absorption spectrum of the CO stretching mode of CO bound to the ferrous heme iron in the L29F mutant. The single peak indicates that the L29FCO mutant exists in a single conformational substate. In the 2D vibrational echo spectrum there are two frequency axes, so two Fourier transforms are required to obtain the 2D frequency spectrum. The vibrational echo signal is measured as a function of one frequency variable, ω_m, and two time variables, τ and Tw. One of the two Fourier transforms is performed by the monochromator, which resolves the combined vibrational echo-local oscillator wave packet into its frequency components. This Fourier transform provides the vertical axis, ω_m (m for monochromator), of the spectrum, corresponding to the time between the third pulse and the vibrational echo pulse. The ω_m axis is the axis corresponding to the frequencies of the vibrational echo emission. The other axis is obtained by scanning τ. As τ is scanned, the vibrational echo pulse moves in time relative to the fixed local oscillator, producing an interferogram. There is one such interferogram for each frequency ω_m. These interferograms are numerically Fourier transformed to provide the data along the ω_τ axis. The ω_τ axis corresponds to the frequency of the first interaction of the radiation field (first pulse) with the molecular oscillators. Two-dimensional vibrational echo spectra are recorded as a function of Tw. The amplitude is depicted as a function of both ω_τ and ω_m, which correspond to the ω1 and ω3 axes, respectively, in 2D NMR. Details of the method, including phase error corrections, have been presented[63,64]. The Tw dependent spectral changes in 2D-IR provide direct information on the time evolution of the protein through the influence of the structural changes on the frequency of the CO vibrational mode.
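The numerical Fourier transform along τ can be sketched for a single monochromator pixel. This is a hedged illustration, not the published procedure, which includes phase corrections[63,64]. It assumes the heterodyne cross term, proportional to 2 Re(E_LO* E_echo), behaves as a damped cosine in τ. The 1933 cm⁻¹ center is taken from the L29F data discussed later in the chapter, while the 1 ps decay time, the 2 fs step, and the function names are invented for the example.

```python
import numpy as np

C_CM_PER_FS = 2.99792458e-5  # speed of light in cm/fs


def interferogram(tau_fs, center_cm1=1933.0, decay_fs=1000.0):
    """Toy heterodyne cross term at one omega_m pixel as a function of tau."""
    omega = 2.0 * np.pi * C_CM_PER_FS * center_cm1  # angular frequency, rad/fs
    return np.cos(omega * tau_fs) * np.exp(-tau_fs / decay_fs)


def spectrum_along_tau(signal, dt_fs, n_fft=16384):
    """Zero-padded real FFT; returns (wavenumber axis in cm^-1, amplitude)."""
    amplitude = np.abs(np.fft.rfft(signal, n=n_fft))
    # rfftfreq gives cycles per fs; dividing by c converts to cm^-1.
    wavenumber = np.fft.rfftfreq(n_fft, d=dt_fs) / C_CM_PER_FS
    return wavenumber, amplitude


tau = np.arange(0.0, 4000.0, 2.0)  # tau scan in 2 fs steps (hypothetical)
axis_cm1, amp = spectrum_along_tau(interferogram(tau), dt_fs=2.0)
peak_cm1 = axis_cm1[np.argmax(amp)]

if __name__ == "__main__":
    print(f"peak along the omega_tau axis near {peak_cm1:.0f} cm^-1")
```

Repeating this transform for every ω_m pixel of the array detector would fill in the ω_τ axis of the 2D spectrum column by column.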

[Figure 3. Panel A: linear absorption spectrum, absorbance (norm) versus frequency (cm⁻¹), 1900 to 1960 cm⁻¹. Panel B: 2D-IR spectrum at Tw = 0.25 ps, ω_m (cm⁻¹) versus ω_τ (cm⁻¹), 1900 to 1940 cm⁻¹.]

Figure 3. 2D-IR measurement on the CO stretch of CO bound to the L29F Mb mutant. (A) The linear FT-IR spectrum of L29FCO. (B) 2D-IR vibrational echo spectrum of L29FCO at Tw = 0.25 ps. There is a positive going peak (labeled +) on the diagonal and a negative going peak (labeled -) off the diagonal; these correspond to vibrational echo emission at the 0-1 and 1-2 vibrational transitions, respectively.

The 2D-IR vibrational echo spectrum of L29FCO at Tw = 0.25 ps is shown in Figure 3B. There is a positive going peak on the diagonal and a negative going peak off the diagonal, which correspond to vibrational echo emission at the 0-1 and 1-2 vibrational transitions, respectively. The off-diagonal 1-2 band is shifted along the ω_m axis by the vibrational anharmonicity of the CO stretching mode. The band on the diagonal has ω_τ = ω_m; that is, the frequency of the initial excitation (first pulse) is equal to the frequency of the echo emission. The off-diagonal 1-2 peak arises as follows. The first pulse excites a coherent superposition state of the 0-1 transition. The second pulse produces a population in the 1 state. The third pulse couples the 1 state to the 2 state, which produces a coherent superposition of the 1 and 2 vibrational states and leads to vibrational echo emission at the 1-2 transition frequency. The band is off-diagonal because the first interaction is at the ω_τ frequency of the 0-1 transition, but the echo emission is at the ω_m frequency of the 1-2 transition, resulting in a shift to lower frequency along ω_m by the vibrational anharmonicity, ~25 cm⁻¹. The 1-2 band is negative going, while the 0-1 band is positive going. Because the dynamical information obtained from the off-diagonal 1-2 bands is the same as that obtained from the 0-1 bands, only the 0-1 vibrational transition peaks are analyzed below.
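As a worked illustration of the peak positions, take the 0-1 frequency of 1933 cm⁻¹ reported for L29FCO in the Data Analysis section and the approximate anharmonicity of 25 cm⁻¹ quoted above. The symbols ω_01 and Δ are introduced here only for this example.

```latex
\begin{aligned}
\text{0--1 band:}\quad & (\omega_\tau,\ \omega_m) = (\omega_{01},\ \omega_{01}) = (1933,\ 1933)\ \mathrm{cm^{-1}} \\
\text{1--2 band:}\quad & (\omega_\tau,\ \omega_m) = (\omega_{01},\ \omega_{01} - \Delta) \approx (1933,\ 1933 - 25)\ \mathrm{cm^{-1}} = (1933,\ 1908)\ \mathrm{cm^{-1}}
\end{aligned}
```

Both bands share the same ω_τ coordinate, so the 1-2 band should sit directly below the diagonal 0-1 band, near ω_m ≈ 1908 cm⁻¹.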

2. Data Analysis

Figure 4A shows an example of 2D-IR spectra for CO bound to the L29F mutant of Mb at several values of Tw. Only the 0-1 transition region is shown. The peak is located on the diagonal at (ωτ, ωm) = (1,933 cm-1, 1,933 cm-1). The position of the peak is time-independent, but the band shape changes. As Tw increases, the band goes from highly elongated along the diagonal to less elongated and increasingly broad along the ωτ axis. In the long-time limit, the band would become round. The change in band shape is caused by spectral diffusion arising from protein structural fluctuations. To analyze the time evolution of the 2D-IR vibrational echo spectrum, the inverse of the center line slope (CLS) is used[21,51,53,65]. A slice through the 2D spectrum at a particular ωm value is projected onto the ωτ axis to give a spectrum, and the peak position on the ωτ axis is determined. The result is a point with coordinates (ωm, ωτ). Many such slices are taken, and the set of (ωm, ωτ) points forms a line, the center line. Examples of center lines are shown in Figure 4A as the white dotted lines in the panels with Tw = 0.5 and 32 ps. The slope of the center line changes as Tw increases and spectral diffusion makes the band more symmetrical. It has been demonstrated theoretically that the change in the inverse of the slope of the center line (referred to as the CLS) is directly related to the underlying dynamics of the system[21,65]. In the absence of a homogeneous broadening component (see below), the initial slope would be that of a line at 45°, that is, a slope of 1. At sufficiently long time, the shape of the 2D spectrum is symmetrical, and the center line is vertical with an infinite slope. Therefore, the inverses of the center line slopes range from 1 to 0. In Figure 4A, it can be seen that for Tw = 32 ps the slope of the center line is steeper than at Tw = 0.5 ps. The Tw-dependent CLS values for CO bound to L29F mutant Mb are shown in Figure 4B.
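The slicing procedure described above can be illustrated with a minimal sketch. It does not use measured data: the band is modeled, purely as an assumption, by a correlated two-dimensional Gaussian centered at 1,933 cm-1, and the function names, grid spacing, and band width are hypothetical choices. For this model line shape, the peak of each ωm slice lies at an ωτ offset equal to the correlation coefficient times the ωm offset, so the fitted inverse slope should return the correlation that was put in.

```python
import numpy as np


def synthetic_band(w_tau, w_m, center, sigma, corr):
    """Illustrative 0-1 band: a correlated 2D Gaussian (assumed line shape)."""
    x = w_tau[np.newaxis, :] - center   # excitation-axis offset (columns)
    y = w_m[:, np.newaxis] - center     # emission-axis offset (rows)
    q = (x**2 - 2.0 * corr * x * y + y**2) / (2.0 * sigma**2 * (1.0 - corr**2))
    return np.exp(-q)


def inverse_center_line_slope(spectrum, w_tau, w_m, center, half_range):
    """Find the w_tau peak of each w_m slice near the band center and fit a line.

    The returned value is d(w_tau peak)/d(w_m), i.e. the inverse of the slope
    of the center line when w_m is plotted on the vertical axis.
    """
    near_peak = np.abs(w_m - center) <= half_range
    peak_w_tau = w_tau[np.argmax(spectrum[near_peak, :], axis=1)]
    slope, _intercept = np.polyfit(w_m[near_peak] - center, peak_w_tau - center, 1)
    return slope


w_tau = np.linspace(1913.0, 1953.0, 2001)   # 0.02 cm-1 grid along w_tau
w_m = np.linspace(1913.0, 1953.0, 81)       # one slice every 0.5 cm-1 along w_m

# A large correlation mimics short Tw; a small one mimics long Tw.
for corr in (0.6, 0.2):
    band = synthetic_band(w_tau, w_m, center=1933.0, sigma=5.0, corr=corr)
    cls = inverse_center_line_slope(band, w_tau, w_m, center=1933.0, half_range=3.0)
    print(f"input correlation {corr:.2f} -> CLS {cls:.3f}")
```

Restricting the fit to slices near the band center is a design choice of this sketch; with real spectra the usable range would depend on the signal-to-noise ratio and on overlap with neighboring bands.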
A quantitative description of the amplitudes and time scales of frequency fluctuations of a vibrational oscillator is provided by the frequency-frequency correlation function (FFCF)[26,61,66,67]. The FFCF connects the experimental observables to the underlying dynamics in the sample. The FFCF is a joint probability distribution that the frequency has a certain initial value at t = 0 and another value at time t. The CLS method is used for extracting the FFCF from the Tw dependence of the 2D-IR spectra[21,65]. Once the FFCF is known, all linear and non-linear optical experimental observables can be calculated using time-dependent diagrammatic perturbation theory. Conversely, the FFCF can be extracted from 2D-IR vibrational echo spectra with additional input from the linear FT-IR absorption spectrum, including the Tw-independent homogeneous component. A multiexponential form of the FFCF, C(t), is used to model the multi-time-scale dynamics of the protein structural fluctuations and has been found to reproduce the influence of structural dynamics on the CO frequency in heme proteins[11,51,55,56,68]. The FFCF has the form

$$C(t) = \sum_{i=1}^{n} \Delta_i^{2}\, e^{-t/\tau_i} + \Delta_s^{2} \qquad (1)$$
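As a rough illustration of Equation (1), the sketch below evaluates the multiexponential FFCF for the L29F amplitudes and time constants listed in Table 1 (2.8 cm-1 with 1.7 ps, and 2.2 cm-1 with 66 ps, with no static term). The function name `ffcf` and its argument names are hypothetical, and the motionally narrowed (homogeneous) contribution is deliberately left out, since only T2* is known for it.

```python
import numpy as np


def ffcf(t_ps, deltas_cm, taus_ps, delta_static_cm=0.0):
    """Evaluate C(t) = sum_i Delta_i^2 exp(-t/tau_i) + Delta_s^2, in cm^-2."""
    t = np.asarray(t_ps, dtype=float)[..., np.newaxis]
    deltas = np.asarray(deltas_cm, dtype=float)
    taus = np.asarray(taus_ps, dtype=float)
    return np.sum(deltas**2 * np.exp(-t / taus), axis=-1) + delta_static_cm**2


# L29F parameters as listed in Table 1; no static term is reported for L29F.
times = np.array([0.0, 1.7, 66.0, 200.0])
c_l29f = ffcf(times, deltas_cm=[2.8, 2.2], taus_ps=[1.7, 66.0])

for t, c in zip(times, c_l29f):
    print(f"t = {t:6.1f} ps  C(t) = {c:7.3f} cm^-2  C(t)/C(0) = {c / c_l29f[0]:.3f}")
```

Passing a nonzero `delta_static_cm` gives a correlation function that levels off at the square of the static amplitude instead of decaying to zero.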

The Δi and τi terms are the amplitudes and correlation times, respectively, of the frequency fluctuations induced by protein structural dynamics. τi reflects the time scale of a set of structural fluctuations, and Δi is the range of CO frequencies sampled due to those structural fluctuations. The experimental time window is several times the vibrational lifetime, T1, because lifetime decay of the excited vibrational state reduces the signal to zero. The 2D-IR vibrational echo experiment is sensitive to fluctuations a few times longer than this window, i.e., a hundred picoseconds or more depending on the sample, because some portion of the slower fluctuations will occur in the experimental window if their time scale is not too slow[69]. Protein structural dynamics that are sufficiently slow will appear as static inhomogeneous broadening, which is reflected in C(t) by Δs, a static term. In obtaining the FFCF from the data, the Δi and the τi are determined. However, for ultrafast dynamics in the motionally narrowed limit (Δτ < 1), only the product, Δ²τ = 1/T2* or Γ* = 1/(πT2*), can be obtained[56,70]. T2* is called the pure dephasing time, which gives rise to the pure homogeneous linewidth, Γ*. The total dephasing time T2 is

$$\frac{1}{T_2} = \frac{1}{T_2^{*}} + \frac{1}{2T_1} + \frac{1}{3T_{or}} \qquad (2)$$

where Γ* = 1/(πT2*) is the Lorentzian homogeneous line width. T1 and Tor are the vibrational lifetime and orientational relaxation time, respectively. Because the rotational diffusion of the protein is very slow relative to the vibrational lifetime, the orientational relaxation contribution can be neglected. T2 is determined from the CLS with use of the linear absorption spectrum, and T1 is obtained from independent IR pump-probe experiments. Therefore, the pure dephasing time, T2*, is obtained using 1/T2 = 1/T2* + 1/(2T1). The homogeneous contribution does not depend on Tw. It manifests itself as a deviation of the CLS from 1 at Tw = 0. The CLS data points in Figure 4B are fit to a biexponential function. A biexponential without a static term, Δs, is sufficient to describe the Tw-dependent protein dynamics sensed by CO bound at the active site of the L29F mutant. The homogeneous contribution is obtained from the Tw = 0 value of the CLS and the linear absorption spectrum as described in detail previously[21,65]. The fit is shown as the line in Figure 4B. The FFCF parameters obtained from the 2D-IR and linear FT-IR spectra, together with the T2* and T1 values, are given in Table 1.

[Figure 4 image not reproduced. Panel A: 2D-IR spectra at Tw = 0.5, 4, 16, and 32 ps, with axes ωτ (cm-1) and ωm (cm-1). Panel B: CLS versus Tw (ps).]

Figure 4. Time dependent spectral diffusion of L29F Mb mutant. (A) A series of 2D-IR spectra for L29FCO at several Tw values. Only the 0-1 transition is shown, and the white dashed lines are the center lines (see text). (B) Tw dependent CLS data (circles) for L29FCO. The solid curve is calculated with the FFCF obtained from the CLS data.

The FFCF parameters show that the L29FCO band has a fast (1.7 ps) decay followed by a slower decay of 66 ps. Because L29FCO does not have a static term, all possible structures that give rise to the inhomogeneously broadened CO absorption band are sampled in ~200 ps. This is an interesting and important result. The equilibrium structural fluctuations of L29F have sampled all structural configurations about the folding minimum in ~200 ps. The results demonstrate that there are no slower components so long as the protein remains in the folded substates. As illustrated in Figure 1, the very fast fluctuations arise from transitions between minima with very small barrier heights, while the slower fluctuations involve transitions between a set of minima separated by higher barriers.
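The biexponential description of the CLS decay can be sketched as follows. This is not the authors' fitting code: the data are synthetic and noise-free, the two amplitudes (0.30 and 0.40) and the Tw sampling points are hypothetical, and only the time constants (1.7 ps and 66 ps) come from the text. The sketch uses `scipy.optimize.curve_fit` to recover the time constants and then reports how much of the decay remains at 200 ps.

```python
import numpy as np
from scipy.optimize import curve_fit


def biexponential(tw, a1, tau1, a2, tau2):
    """CLS model without a static offset: two exponential decays."""
    return a1 * np.exp(-tw / tau1) + a2 * np.exp(-tw / tau2)


# Synthetic, noise-free "data"; the amplitudes are illustrative assumptions.
tw = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 48.0, 64.0, 100.0])
cls_data = biexponential(tw, 0.30, 1.7, 0.40, 66.0)

# Fit with a deliberately imperfect starting guess and non-negative parameters.
popt, _pcov = curve_fit(
    biexponential, tw, cls_data,
    p0=(0.2, 1.0, 0.3, 30.0), bounds=(0.0, np.inf),
)
a1, tau1, a2, tau2 = popt

# Fraction of the Tw = 0 model value that is still present at 200 ps.
remaining = biexponential(200.0, *popt) / (a1 + a2)
print(f"tau1 = {tau1:.2f} ps, tau2 = {tau2:.1f} ps")
print(f"fraction of the decay remaining at 200 ps: {remaining:.3f}")
```

With a 66 ps slow component and no offset, only a few percent of the correlation survives at 200 ps, which is the sense in which the structures are described as fully sampled on that time scale.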

Table 1. Experimental parameters for the proteins studied

protein             T2* (ps)  Δ1 (cm-1)  τ1 (ps)  Δ2 (cm-1)  τ2 (ps)  Δs (cm-1)  T1 (ps)
Mb L29F             4.8       2.8        1.7      2.2        66       -          15.8
Mb H64V             7.7       2.1        5.2      -          -        2.7        24.1
wild-type Ngb N3    5.0       1.9        2.0      2.7        14       3.1        19.3
wild-type Ngb N0    11.9      1.8        11.5     -          -        3.5        18.4
reduced Ngb N3      6.0       3.0        1.3      3.3        22       2.9        16.0
reduced Ngb N0      11.0      3.0        3.7      -          -        4.3        16.1
HRP free red        14.0      3.1        1.5      5.6        21       -          8.0
HRP BHA             7.6       2.3        4.4      -          -        1.9        8.0
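The relations between T2*, T1, T2, and Γ* given in the text can be checked numerically. The sketch below is a worked illustration only: it takes the T2* and T1 columns of Table 1, neglects the orientational term as the text does, and converts the linewidth to wavenumbers using the speed of light in cm/ps. The function and variable names are hypothetical.

```python
import math

C_CM_PER_PS = 2.99792458e-2  # speed of light in cm/ps


def total_dephasing_time(t2_star_ps, t1_ps):
    """1/T2 = 1/T2* + 1/(2 T1); the orientational term is neglected."""
    return 1.0 / (1.0 / t2_star_ps + 1.0 / (2.0 * t1_ps))


def pure_homogeneous_linewidth_cm(t2_star_ps):
    """Gamma* = 1/(pi T2*), converted from ps^-1 to cm^-1."""
    return 1.0 / (math.pi * C_CM_PER_PS * t2_star_ps)


# (protein, T2* in ps, T1 in ps), taken from Table 1.
table_1 = [
    ("Mb L29F", 4.8, 15.8),
    ("Mb H64V", 7.7, 24.1),
    ("wild-type Ngb N3", 5.0, 19.3),
    ("wild-type Ngb N0", 11.9, 18.4),
]

for protein, t2_star, t1 in table_1:
    t2 = total_dephasing_time(t2_star, t1)
    gamma_star = pure_homogeneous_linewidth_cm(t2_star)
    print(f"{protein:18s} T2 = {t2:5.2f} ps  Gamma* = {gamma_star:4.2f} cm-1")
```

Because T1 is several times longer than T2* for these proteins, the lifetime term changes T2 only modestly relative to T2*.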


3. Applications

3.1. Comparison of the protein dynamics for the different conformational substates

Proteins do not necessarily fold into a single tertiary structure. As discussed in connection with Figure 1, these somewhat different structures are called conformational substates[3,4]. At ambient temperature, a protein can convert from one substate conformation to another. The different conformational substates usually show different reaction rates for the protein function. Differences in protein structural dynamics for different substates have been revealed by several kinds of experiments, such as NMR, flash photolysis, kinetic hole burning, fluorescence, and resonance Raman spectroscopy[71-75]. Here, using 2D-IR vibrational echo spectroscopy, we show that the differences in fast dynamics can be measured and quantified. The vibrational echo experiment is sensitive to both the local heme pocket dynamics and the global protein dynamics[11,12]. The CO transition frequencies are exquisitely sensitive to electric fields[7,10,68,76-78]. The protein is composed of charged, polar, and relatively non-polar groups. Structural fluctuations produce fluctuating electric fields at the CO, which couple to the CO transition frequency through the Stark effect[11,12,68]. Therefore, the vibrational frequency fluctuations measured by the 2D-IR vibrational echo experiment and quantified through the determination of the FFCF are determined by motions throughout the protein[11,12].

[Figure 5 image not reproduced. Panel A: closed and open conformations of the active site, with Leu29, His64, Val68, Ile107, His93, and CO labeled. Panel B: normalized absorbance versus frequency (cm-1), with bands labeled L29F (A3) and H64V (A0).]

Figure 5. Substate dependent spectral changes in MbCO. (A) 3D structure of the active site of CO bound to wild-type Mb for the open (right) and closed (left) conformations (Protein Data Bank). The heme and some selected amino acid residues are shown. (B) The linear FT-IR spectra of the L29FCO and H64VCO mutants.


Myoglobin is a small globular heme protein of 153 amino acids that is found in mammalian muscle tissue. Heme can reversibly bind a number of small gaseous molecules such as O2, CO, and NO. The heme-bound CO shows a number of stretching bands in the mid-IR region between 1,900 and 2,000 cm-1[7]. In the linear FT-IR absorption spectrum of the CO stretch of wild-type MbCO, there are three well-known bands, denoted A0, A1, and A3. Mutant studies have shown that the distal histidine, His64, plays a prominent role in determining the CO stretching bands in Mb[7,77]. The physiological importance of the substates has been attributed to either steric hindrance of CO or stabilization of O2 binding by a hydrogen bond with the distal histidine[79-81]. The A1 and A3 conformations have the distal histidine localized in the heme pocket (closed), resulting in the distal histidine interacting significantly with the heme-bound CO (Figure 5A, left panel)[11]. The A0 substate has the distal histidine rotated out of the pocket (open), and there is little interaction between the distal histidine and the ligand (Figure 5A, right panel)[76]. The A1 substate of wild-type MbCO is predominantly populated and dominates the FT-IR absorption spectrum. The structural dynamics for the A1 and A3 conformations of wild-type MbCO have been measured previously using vibrational echo spectroscopy[11,12,20]. However, the A3 band overlaps substantially with the much larger A1 band, and the A0 band is too small in wild-type MbCO to be useful for quantitative 2D-IR vibrational echo experiments. To compare the dynamics of conformational substates in MbCO, the mutant Mb proteins L29F and H64V were used. As shown in the absorption spectrum (Figure 5B), the L29F mutant has one major band at 1,932 cm-1 corresponding to the A3 conformation.
Because position 29 is close to the distal histidine, the replacement of leucine by phenylalanine changes the distribution of the conformational substates. The H64V mutant protein, with the distal histidine replaced by a valine, mimics the situation in which the distal histidine is not in the heme pocket. The linear FT-IR absorption band for H64V at 1,968 cm-1 corresponds to the A0 conformation in wild-type Mb. Figures 4A and 6A show the 2D-IR vibrational echo spectra of CO bound to L29F and H64V Mb, respectively, at several values of Tw[51]. The diagonal 0-1 transition bands and representative center lines are shown. It is clear that the Tw-dependent change in elongation of the CO stretching band (spectral diffusion) is slower for H64V than for L29F. It can also be seen that for Tw = 32 ps the slope of the center line for L29F is steeper than that of H64V. Therefore, the inverse of the slope of the center line, which is related to the FFCF, is closer to zero for L29F. These results indicate that the different conformational substates in Mb have different rates of structural-fluctuation-induced spectral diffusion. The Tw-dependent CLS data (points) and the fits (solid curves) for the L29F and H64V proteins are presented in Figure 6B[51]. (Data were acquired and fit out to 100 ps, but the longer-time points are not shown so that the fast component can be seen more readily.) The fits combined with the linear absorption spectrum yield the FFCF. It is clear from inspection of the data that the dynamics of the two Mb substates are very different. The spectral diffusion (structural fluctuations) of L29F is significantly faster than that of H64V. The differences in the dynamics of the two proteins can be quantified through their FFCFs. The FFCF parameters are given in Table 1[51]. The motionally narrowed component, characterized by T2*, is slower for H64V than for L29F. However, because this contribution to the 2D line shape is motionally narrowed, it is not possible to separate the magnitude of the frequency fluctuations (Δ) and the decay time (τ). The other components of the FFCF are more informative. The fast decay components of the FFCF (Δ1 and τ1) are very different for L29F and H64V. The decay constant for L29F is a factor of three faster than that of H64V, and the amplitudes of the fast components are similar. L29F has a τ2 decay component of 66 ps, while H64V's slower component is static. A static component means that the dynamics are too slow for the 2D-IR experiment to measure within the time window of the experiment, which is limited by the vibrational lifetime, T1. T1 = 24.1 ps for H64V, and typically we can obtain useful data for times as long as ~5T1. Therefore, the static component is significantly slower than several hundred ps. The amplitudes Δ2 for L29F and Δs for H64V are about the same. In L29F, the structural fluctuations sample all accessible configurations about the protein folding minimum in several hundred ps. That is, all configurations of the protein that give rise to the L29FCO absorption line (Figure 5) are sampled rapidly. In contrast, only about half of the H64V configurations are sampled on the same time scale. The FFCFs of the two proteins demonstrate quantitatively what can clearly be seen in Figure 6B: the structural fluctuation dynamics of H64V are significantly slower than those of L29F. The structural fluctuations sensed by the CO bound at the active site of heme proteins are both global and local[11,12]. The major difference between L29F and H64V is the removal of the distal histidine and its replacement with valine, which, in contrast to histidine, has a small non-polar side group. Previous vibrational echo experiments and molecular dynamics (MD) simulations show that the A3 substate of MbCO (the equivalent of L29F) has the protonated epsilon nitrogen (Nε-H) of the distal histidine's imidazole side group closely associated with, and probably hydrogen bonded to, the CO ligand[11,12].
The results presented here indicate that removal of this interaction enables the protein to adopt a different conformation that is significantly less flexible and has greatly reduced fast (1-100 ps) structural fluctuations. Because of the great similarities between Mb's A3 substate and L29F, and between Mb's A0 substate and H64V, it is reasonable to assume that Mb's A3 and A0 substates behave in the same manner as that observed for L29F and H64V, respectively.

[Figure 6 image not reproduced. Panel A: 2D-IR spectra at Tw = 0.5, 4, 16, and 32 ps, with axes ωτ (cm-1) and ωm (cm-1). Panel B: CLS versus Tw (ps), with curves labeled H64V (A0) and L29F (A3).]

Figure 6. Comparison of the protein dynamics of L29FCO and H64VCO. (A) 2D-IR spectra of H64VCO Mb at various times, Tw. Only the 0-1 transition is shown and white dashed lines are the center lines. (B) Tw dependent CLS for L29FCO (circles) and H64VCO (triangles). The calculated solid curves are from the FFCF obtained from the CLS data.


3.2. Neuroglobin

Neuroglobin (Ngb) is a recently discovered family of vertebrate globin proteins that is expressed predominantly in the nervous system[59]. Comparison of Ngb with vertebrate Mb and Hb sequences shows only minor similarities at the amino acid level (<25% identity), but Ngb features the conserved globin fold and contains heme[59]. Ngb has been hypothesized to facilitate O2 diffusion to the mitochondria. However, the concentration of Ngb in the brain is too low for it to play the role that Mb plays in the muscles. Since the expression level of Ngb is increased under hypoxic conditions, Ngb may be involved in the neuronal response to ischemia[82,83]. Another possibility is that Ngb detoxifies reactive oxygen species or is involved in signal transduction in the brain[84-86]. A distinctive structural feature of Ngb is that the heme iron is hexacoordinated in both the ferrous and ferric forms[87]. An external gaseous ligand must compete with the sixth ligand, the distal histidine, for binding. While the ligand binding process in Ngb is distinct, several key residues in the heme pocket of Mb are conserved in Ngb (Figure 7A). However, leucine at position 29 in Mb is replaced by phenylalanine at position 28 in Ngb. The role of phenylalanine at position 28 in Ngb has not been elucidated, although many nonsymbiotic plant Hbs that have hexacoordinated binding to the heme iron also have phenylalanine at the same position[88]. Another feature of Ngb, revealed by structural analysis, is that human Ngb has an intramolecular disulfide bond that affects its oxygen affinity[89]. The human Ngb sample discussed here contains this intramolecular disulfide bond. As shown in Figure 7B, the CO adduct of wild-type Ngb that contains a disulfide bond has two CO absorption bands at 1,933 cm-1 and 1,968 cm-1 that correspond to the A3 and A0 conformations in Mb, respectively[51,90]. These bands have been called N3 and N0[51,90].
The N3 band closely corresponds to the absorption band at 1,932 cm-1 of the L29F mutant of Mb. Because the leucine at position 29 in Mb is replaced by phenylalanine in Ngb, the L29F mutant Mb mimics the heme pocket structure in Ngb. The N0 band arises from the structure in which the distal histidine is rotated out of the heme pocket[90,91]. The H64V mutant of Mb mimics this situation and has its CO absorption band at 1,968 cm-1.

[Figure 7 image not reproduced. Panel A: active-site structure with Phe28, Phe42, His64, Val68, and His96 labeled. Panel B: normalized absorbance versus frequency (cm-1), with bands labeled L29F, N3, H64V, and N0.]

Figure 7. Structural and spectral comparison of Ngb and Mb. (A) 3D structure of the active site of NgbCO (light gray) and L29FCO Mb (dark gray) proteins (Protein Data Bank). The heme and some selected amino acid residues are shown. The amino acid residue numbers are based on Ngb. (B) The linear FT-IR spectra of wild-type NgbCO (solid curve), L29FCO (dashed curve) and H64VCO (dot-dashed curve).


[Figure 8 image not reproduced. Panel A: 2D-IR spectra at Tw = 0.5, 4, 16, and 32 ps, with axes ωτ (cm-1) and ωm (cm-1). Panel B: CLS versus Tw (ps), with curves labeled N0 and N3.]

Figure 8. Structural dynamics of N3 and N0 substates of wild-type NgbCO. (A) 2D-IR spectra of wild-type NgbCO as a function of Tw. The positive going peaks (labeled +) on the diagonal and negative going peaks (labeled −) off diagonal correspond to vibrational echo emission at 0-1 and 1-2 vibrational transition frequencies, respectively. (B) Tw dependent CLS for N3 (circles) and N0 (triangles) conformations of wild-type NgbCO. The solid curves are calculated with the FFCF obtained from the CLS data.

To compare the structural dynamics of wild-type Ngb and the Mb mutants, the 2D-IR vibrational echo spectrum of CO bound to wild-type Ngb was measured[51]. Figure 8A presents 2D-IR spectra of CO bound to Ngb at several values of Tw. There are two positive-going bands on the diagonal (N3 and N0), which correspond to the 0-1 vibrational transitions. The negative-going bands arise from vibrational echo emission at the 1-2 vibrational transition. In Figure 8A, the peaks are normalized to the largest peak in each panel. As discussed above for the 2D-IR vibrational echo spectra of the mutant Mbs, the bands go from highly elongated along the diagonal to less elongated and increasingly broad along the ωτ axis at long Tw. The Tw-dependent CLS values for the N3 and N0 bands of wild-type Ngb are presented in Figure 8B[51]. The difference between the CLS decays of the N3 and N0 bands is qualitatively similar to the difference between the L29F and H64V mutant Mb (Figure 6B). These results indicate that the configuration of the distal histidine has a large effect on the structural dynamics, and therefore on the spectral diffusion, of both Ngb and Mb. When the distal histidine is out of the heme pocket, a strong interaction between the distal histidine and the ligated CO is eliminated. In both Ngb and Mb, the configuration of the distal histidine is closely related to the fast protein fluctuations. The change in structure associated with the presence or absence of the distal histidine side group in the pocket influences the local and global protein structural dynamics. Although the qualitative nature of the relationship of the dynamics of N3/N0 to L29F/H64V is the same, the FFCF parameters are quite different (Table 1)[51]. The τ1 value for the N3 band is similar to that of L29F mutant Mb, but the N3 longer-time-scale dynamics are quite different.
The longer-time-scale equilibrium structural fluctuations of L29F show a complete sampling of all structures in several hundred ps, with a time constant of 66 ps. In contrast, N3 has a transition (14 ps) to a static component of the FFCF. Therefore, N3 has very slow dynamics that are not present in L29F. A substantial fraction of the structures in the N3 substate are sampled on times long compared to 100 ps. Therefore, in spite of the near identity of the heme pocket region of the protein, the structural fluctuations of N3 have a very slow component compared to L29F. It is possible that the fast component of both L29F and N3 arises from the motions of the distal histidine and that the differences in the FFCFs arise from differences in the global structural dynamics of the protein. The dynamics of the N0 substate in wild-type Ngb are also different from those of H64V, although neither has the distal histidine side group in the heme pocket. Although both have static components of similar amplitude, the τ1 values for N0 and H64V are 11.5 ps and 5.2 ps, respectively. Because the distal histidine is absent from the heme pocket in both proteins, these measurements also support the idea that the observed structural fluctuations of wild-type Ngb and Mb have contributions from global motions rather than only very local interactions with the side group of the distal histidine at the active site. Ultrafast 2D-IR vibrational echo experiments on wild-type Ngb reveal that the protein fluctuations in the globin family proteins differ in the picosecond time region. Four types of globins have been discovered in humans and other vertebrates. They have a globin fold with several conserved key residues around the heme. However, they have distinct roles in the human body. Hb transports O2 in red blood cells; Mb in muscle provides O2 to mitochondria; Ngb and cytoglobin (Cgb) are two newly discovered globin family proteins[59,92,93], which may provide O2 for mitochondria and may be involved in signal transduction and/or scavenging of reactive oxygen species. The differences in Ngb and Mb dynamics may be related to their distinct functions.

3.3. The Influence of the Disulfide Bond on Fast Protein Dynamics

Intramolecular covalent disulfide bonds take part in the regulation of protein folding, stability, and activity[94-96]. Because disulfide bonds are rigid structural elements in proteins, local and global protein structural fluctuations might be expected to be regulated by such bonds. Disulfide bond dependent structural configurations have been investigated by NMR and molecular dynamics (MD) simulation studies[97-101]. However, the effect of the bonds on the fast protein dynamics has not been well characterized experimentally.

[Figure 9 image not reproduced. Panel A: structure with Cys46, Cys55, Cys120, and CO labeled. Panel B: CLS versus Tw (ps), with curves labeled N0 and N3.]

Figure 9. Structure and disulfide bond dependent dynamics of NgbCO. (A) 3D structure of human Ngb (positions Cys55 and Cys120 are mutated to Ser; position Cys46 is mutated to Gly) (Protein Data Bank). Cys46 and Cys55 of human Ngb form a disulfide bond. (B) Tw dependent CLS data for wild-type (filled symbols) and reduced (open symbols) NgbCO. The curves through the data points (wild-type Ngb solid; reduced Ngb dashed) are calculated with the FFCF obtained from the CLS data.


The structural analysis of human Ngb has shown the presence of an intramolecular disulfide bond that regulates the O2 affinity in vitro (see Figure 9A)[89]. Although the physiological role of the disulfide bond in human Ngb is not clear, hypoxia may induce the reduction of the disulfide bond and result in a subsequent release of O2[89]. These results suggest that the formation of the intramolecular disulfide bond stresses the protein, and that breaking the disulfide bond provides additional structural degrees of freedom, resulting in a decreased affinity for O2 binding. Because Ngb contains a disulfide bond and heme, 2D-IR vibrational echo spectroscopy can investigate the effect of an intramolecular disulfide bond on dynamics via the spectral diffusion of the heme-ligated CO stretching band. As discussed above, 2D-IR vibrational echo spectroscopy is sensitive not only to very local interactions at the active site but also to global motions of proteins. Comparison between human Ngb with and without the disulfide bond provides insights into disulfide-dependent structural modulation. To reveal the structural regulation by the disulfide bond in human Ngb, the disulfide bond is eliminated by reduction. The linear FT-IR spectrum of reduced Ngb is almost identical to that of wild-type Ngb, which has two CO bands at 1,933 cm-1 and 1,967 cm-1 (see Figure 7B)[53]. Figure 9B shows a comparison of the 2D-IR vibrational echo CLS data for wild-type Ngb (filled symbols) and reduced Ngb (open symbols)[53]. It is clear that reduction of the disulfide bond has increased the rate of structural fluctuations on fast time scales for both the N3 and N0 conformations compared to those of the wild-type Ngb protein. It is important to note that the disulfide bond is ~20 Å from the CO bound to the heme[102-104].
Therefore, the observed changes in the dynamics with the elimination of the disulfide bond are most likely global modifications of the protein fluctuations rather than alterations that occur very locally in the heme pocket. A quantitative description of the time scales of the N3 and N0 bands is provided by the FFCFs (Table 1)[53]. The curves through the data points (wild-type Ngb solid curves; reduced Ngb dashed curves) are calculated from the FFCF obtained using the CLS method. The N3 substates of the wild-type and reduced Ngb have a homogeneously broadened component (T2*), two dynamic components (τ1 and τ2) that give rise to spectral diffusion within the time window of the experiment, and a static component (Δs). For both N3 and N0, reduction of the disulfide bond leaves T2* essentially unchanged. Therefore, the ultrafast, small, very local motions of the protein are unchanged by elimination of the disulfide bond. The major change upon reduction of the disulfide bond is in the time constant τ1 of both N3 and N0. Elimination of the disulfide bond reduces the fast time constant τ1 of N3 by ~1/3 but reduces that of N0 by a factor of ~3. The fast dynamics of the N0 substate of the wild-type protein are much slower than those of the N3 substate, probably because there is no contribution to the spectral diffusion from the imidazole side group of the distal histidine. Without the contribution from the distal histidine, the N0 band is more sensitive to the global structural fluctuations of the protein. These results show that the disulfide bond significantly reduces fast structural fluctuations of the protein. The N0 band does not display an intermediate-time-scale decay. Both the wild-type and reduced N0 bands have significant static components. The N3 band of both the wild-type and reduced proteins has an intermediate decay, which is actually faster for the wild-type.
This may occur because some of the intermediate-time-scale fluctuations of the reduced protein have become part of the fast component, and some fluctuations that were too slow to be observed and were part of the wild-type static component have become faster with the elimination of the disulfide bond and now contribute on the intermediate time scale. While the rate of fast structural fluctuations is increased in reduced Ngb, the reduction of the disulfide bond does not change the linear FT-IR absorption spectrum of the CO bound at the active site. The lack of change in the linear FT-IR spectrum indicates that the distribution of structures in the vicinity of the folded protein free energy minimum that are sampled under thermal equilibrium conditions is not altered significantly. These results suggest that the disulfide bond in Ngb regulates the heights of the relatively low energy landscape barriers that determine the rate of the fast structural fluctuations. Disruption of the disulfide bond in Ngb lowers the O2 affinity in vitro[89]. The disulfide bond's influence on the protein dynamics of Ngb may play a physiological role.
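The changes in the fast time constant described above can be checked directly against the τ1 values in Table 1. The short sketch below is only a worked arithmetic illustration; the helper name `tau1_change` and the dictionary layout are hypothetical.

```python
def tau1_change(tau_wild_type_ps, tau_reduced_ps):
    """Return (speed-up factor, fractional decrease) of the fast time constant."""
    factor = tau_wild_type_ps / tau_reduced_ps
    fractional_decrease = 1.0 - tau_reduced_ps / tau_wild_type_ps
    return factor, fractional_decrease


# tau1 values in ps from Table 1 for wild-type and reduced Ngb.
tau1_ps = {
    "N3": {"wild-type": 2.0, "reduced": 1.3},
    "N0": {"wild-type": 11.5, "reduced": 3.7},
}

for substate, tau in tau1_ps.items():
    factor, decrease = tau1_change(tau["wild-type"], tau["reduced"])
    print(f"{substate}: tau1 {tau['wild-type']} -> {tau['reduced']} ps, "
          f"{factor:.2f}x faster, {decrease:.0%} shorter")
```

The N3 time constant shortens by roughly one third, whereas the N0 time constant shortens by about a factor of three, in line with the description in the text.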
3.4. The Influence of Substrate Binding on Enzyme Dynamics

Protein fluctuations are intimately coupled to the binding of a substrate to an enzyme[105,106]. An enzyme has many local structural minima on the free energy landscape and is constantly fluctuating among them[3,4]. Substrate binding to an enzyme induces conformational changes in the protein that may produce either a significant shift of the energy landscape minimum or the generation of what should be considered a new substate. These structural changes may be necessary for substrate binding and the subsequent chemical reaction. Insights into the influence of substrate binding on an enzyme can be gained by observing the effect of binding on the protein's dynamics. Horseradish peroxidase (HRP) is a heme-containing enzyme that catalyzes the oxidation of a variety of organic molecules in the presence of hydrogen peroxide as the oxidizing agent[107]. HRP is widely used in analytical biochemistry and biotechnology[108-110]. To characterize the influence of substrate binding on enzyme dynamics, 2D-IR vibrational echo experiments were conducted on substrate-free HRP and substrate-bound HRP[36]. Since the heme in both substrate-free and substrate-bound HRP can bind CO, the CO spectral diffusion can be used as a probe of protein dynamics in the same manner as discussed for Mb and Ngb above. Five substrates, which are benzhydroxamic acid (BHA) analogs, have been studied with 2D-IR vibrational echo spectroscopy[36]. Figure 10A shows the protein structure in the region of the active site with BHA bound.

[Figure 10: (A) structure of the HRP active site with His42, Arg38, His170 and the bound BHA labeled; (B) normalized absorbance versus frequency (cm-1) over 1880-1960 cm-1, with the chemical structure of BHA shown as an inset.]

Figure 10. The influence of substrate binding to HRP. (A) 3D structure of the active site of HRP with the substrate BHA bound (Protein Data Bank). (B) The linear FT-IR spectra of HRPCO in the substrate-free (solid curve) and BHA-bound (dashed curve) forms. The structure of BHA is also shown.


H. Ishikawa et al. / Ultrafast 2D-IR Vibration Echo Spectroscopy of Proteins

[Figure 11: (A) 2D-IR spectra of substrate-free HRPCO at Tw = 0.5, 2, 8 and 32 ps, plotted as ωm (cm-1) versus ωτ (cm-1) over roughly 1890-1930 cm-1; (B) CLS versus Tw (0-30 ps) for HRPCO with no substrate (red state) and with the BHA substrate bound.]

Figure 11. The influence of substrate binding on HRP dynamics. (A) 2D-IR spectra of substrate-free HRPCO as a function of Tw. The positive going peaks (labeled +) on the diagonal and negative going peaks (labeled −) off diagonal correspond to vibrational echo emission at the 0-1 and 1-2 vibrational transition frequencies, respectively. (B) Tw dependent CLS data for HRPCO without substrate (squares) and HRPCO with the substrate BHA bound (triangles). The curves through the data points (no substrate, solid; with substrate, dashed) are calculated with the FFCF obtained from the CLS data.

Figure 10B shows the linear FT-IR spectra of the substrate-free and BHA-bound HRPCO[36]. Free HRP has two spectroscopically distinct substates, with CO absorption bands at 1,903 cm-1 and 1,934 cm-1. Previous studies indicated that the heme-ligated CO in the red state (absorbing at 1,903 cm-1) is nearly normal to the heme plane and has a strong interaction with the histidine residue and a weaker one with the arginine residue in the heme pocket[111,112]. In the blue state (absorbing at 1,934 cm-1), the CO ligand has a strong interaction with the arginine and a weaker one with the histidine[112]. When HRP binds BHA, the CO stretch collapses to a single band at 1,909 cm-1[111,113]. Binding of each of the five substrates studied results in a single band. These bands have slightly different center frequencies, and all strongly overlap the red band of free HRP. Therefore, the single substrate-bound substate is more closely related to the red state of free HRP than to the blue state. For brevity, only the red state of free HRP is discussed here; both free substates and all five substrates have been discussed previously[36]. Figure 11A shows 2D-IR spectra of CO bound to substrate-free HRP at several values of Tw. There are two positive going bands on the diagonal (red and blue substates), which correspond to the 0-1 vibrational transitions. The negative going bands arise from the 1-2 vibrational transitions. The Tw dependent CLS data for the red state of substrate-free HRPCO (squares) and for HRP with BHA bound (triangles) are presented in Figure 11B[65]. The change in the dynamics upon substrate binding is dramatic. The curves through the data are calculated with the FFCF determined from the CLS data. In addition to a motionally narrowed component, the red substate of substrate-free HRP has two time scales, τ1 = 1.5 ps and τ2 = 21 ps, and no static term (see Table 1). All possible structures in the red substate of HRP without a bound substrate are therefore sampled in less than 100 ps.
However, in addition to a motionally narrowed component, HRP with BHA bound has a single decay (τ1 = 4.4 ps) and a static term. Therefore, substrate binding causes the fast component to slow by a factor of ~3 and the slow component to become static within the experimental time window, which indicates that the slower dynamics occur on time scales longer than 100 ps. The other HRP-substrate complexes also exhibit a single time constant of ~2 ps to ~5 ps and a static term[36]. Detailed comparisons of the 2D-IR vibrational echo data for substrate-bound and substrate-free HRP with those for Mb mutants (see Figure 6B) suggest that a major effect of substrate binding is to lock up the distal arginine and distal histidine. The decay of the free HRP data is qualitatively similar to that of L29F, while the decay of the BHA-bound HRP data is qualitatively similar to that of H64V[36]. Although Mb has only a distal histidine, the difference in the dynamics of L29F and H64V, which lacks the distal histidine, suggests that a significant change is caused by the absence of distal histidine side group motion in H64V. HRP with a substrate bound still has both distal residues, but if their motions are greatly impeded on a fast time scale, the effect on the vibrational echo data will be similar. Substrate binding produces a new protein conformation (substate) that has greatly reduced dynamics. The changes will be both local, in the pocket, and global. Since the distal residues play an important role in every step of the enzymatic cycle of HRP, the conformational restrictions induced by substrate binding may be important for the subsequent enzymatic reaction.
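The time constants quoted above can be made concrete with a short numerical sketch. It is an illustration only, not the analysis used in the cited work: it models the Tw dependent decay as a sum of exponentials plus an optional static offset, using the time constants given in the text (1.5 ps and 21 ps for substrate-free HRP; 4.4 ps plus a static term for BHA-bound HRP). The amplitudes are hypothetical placeholders (the fitted values belong to Table 1 and are not reproduced here), and the function name `cls_decay` is our own.

```python
import math

def cls_decay(tw_ps, components, static=0.0):
    """Sum of exponential decays plus a static offset at waiting time tw_ps.

    components: list of (amplitude, time_constant_ps) pairs.
    static: offset that does not decay within the experimental window.
    """
    return static + sum(a * math.exp(-tw_ps / tau) for a, tau in components)

# Time constants are taken from the text; the amplitudes and the static
# offset are HYPOTHETICAL placeholders chosen only for illustration.
free_hrp = {"components": [(0.3, 1.5), (0.4, 21.0)], "static": 0.0}
bha_hrp = {"components": [(0.2, 4.4)], "static": 0.5}

for tw in (0.5, 2.0, 8.0, 32.0, 100.0):
    free = cls_decay(tw, **free_hrp)
    bound = cls_decay(tw, **bha_hrp)
    print(f"Tw = {tw:6.1f} ps   free: {free:.3f}   BHA-bound: {bound:.3f}")

# Ratio of the fast time constants: the "factor of ~3" slowdown on binding.
slowdown = 4.4 / 1.5
print(f"fast-component slowdown: {slowdown:.2f}x")
```

With no static term the substrate-free curve decays toward zero at long Tw, whereas the BHA-bound curve levels off at the static offset, which is the qualitative difference between the two data sets in Figure 11B.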

4. Concluding Remarks

2D-IR vibrational echo spectroscopy has made it possible to investigate fast protein dynamics under thermal equilibrium conditions. Here we have briefly described the application of 2D-IR vibrational echo methods to the study of heme protein dynamics. The experiments explicated the influences on dynamics of (1) different protein conformations, (2) different families of globin proteins, (3) the elimination of an intramolecular disulfide bond, and (4) substrate binding to an enzyme[20,36,51,53]. In these experiments, CO bound to the heme iron was used as a reporter of protein dynamics. The CO provides a well defined vibrational chromophore that reports on both global and local protein structural fluctuations. Many other topics have been studied by a variety of research groups using 2D-IR spectroscopy[37,38,50,114-116]. Comparison of 2D-IR vibrational echo spectroscopy with other spectroscopic methods for the study of protein dynamics makes clear the utility of the 2D-IR vibrational echo technique. NMR spectroscopy provides extremely high resolution structural information. However, the time scales accessible to NMR are much slower than those accessible to 2D-IR vibrational echo spectroscopy. While other optical methods, such as time resolved UV/vis and resonance Raman spectroscopy, can operate on ultrafast time scales, they are generally limited to photoinduced processes rather than measuring dynamics under thermal equilibrium conditions. To understand the relationship between the dynamics and function of biological molecules, dynamics under thermal equilibrium conditions are important. Therefore, 2D-IR vibrational echo spectroscopy is a useful method for understanding structural dynamics and the nature of molecular interactions in biological molecules. In the experiments presented here, a single CO at a known location was used as the vibrational probe. Experiments can also be conducted on, for example, the amide I band, but these lack location specificity for investigating dynamics.
The types of experiments conducted here can be expanded beyond CO as a single site vibrational probe by using artificially added vibrational dynamics probes, much in the manner that spin labels have been used in ESR. Recent progress in molecular biology makes it possible to generate proteins containing unnatural amino acid residues[117]. For example, proteins containing cyano-phenylalanine, azido-tyrosine, azido-phenylalanine, or azido-alanine have been synthesized[118-121]. All of these are IR active. Heme-ligated azide has been used for vibrational echo measurements[122]. The advantage of introducing an unnatural amino acid residue carrying an IR probe is that the position of the probe within the protein can be varied. Another approach is the introduction of a substrate carrying an IR probe into an enzyme. Such substrate probes will provide dynamical data from the vantage point of the bound substrate.

Over the past decade 2D-IR spectroscopy has developed rapidly. Currently, a vibrational echo spectrometer must be assembled in house. An instrument can be built from commercially available laser equipment, a great deal of optics, and commercially available but very specialized detection equipment. In addition, a great deal of in-house software is required to take and process the data. The early days of NMR experiments were very similar. Now multidimensional NMR instruments can be purchased as commercial packages. But it is important to recall that any sophisticated NMR instrument has a dedicated Ph.D. level operator who makes its use by a non-specialist community possible. One day soon, like NMR, a vibrational echo spectrometer will come as a package. However, right now, a Ph.D. level 2D-IR vibrational echo spectroscopist could put together and operate a vibrational echo spectrometer for non-experts at a fraction of the cost of a top of the line NMR instrument. The availability of such instruments to the wider community would advance the fields of 2D-IR and biological research.

5. Acknowledgment

We thank Prof. R. Kopito (Stanford University) for the use of protein expression and purification equipment; Dr. K. Wakasugi (The University of Tokyo) for kindly providing the neuroglobin plasmid; and Professor John S. Olson (Rice University) for providing the myoglobin mutant proteins. This work was supported by the NIH (2 R01 GM-061137-05). H. I. was supported by the Human Frontier Science Program. S. K. was supported by a fellowship from the Korea Research Foundation Grant funded by the Korean Government (KRF-2006-214-C00038).

References
[1] D. Beece, L. Eisenstein, H. Frauenfelder, D. Good, M.C. Marden, L. Reinisch, A.H. Reynolds, L.B. Sorensen, K.T. Yue, Solvent viscosity and protein dynamics, Biochemistry 19 (1980), 5147-5157.
[2] P.J. Steinbach, A. Ansari, J. Berendzen, D. Braunstein, K. Chu, B.R. Cowen, D. Ehrenstein, H. Frauenfelder, J.B. Johnson, D.C. Lamb, Ligand binding to heme proteins: Connection between dynamics and function, Biochemistry 30 (1991), 3988-4001.
[3] H. Frauenfelder, F. Parak, R.D. Young, Conformational substates in proteins, Ann Rev Biophys Biophys Chem 17 (1988), 451-479.
[4] H. Frauenfelder, S.G. Sligar, P.G. Wolynes, The energy landscapes and motions of proteins, Science 254 (1991), 1598-1603.
[5] R.H. Austin, K. Beeson, L. Eisenstein, H. Frauenfelder, I.C. Gunsalus, V.P. Marshall, Activation-energy spectrum of a biomolecule: Photodissociation of carbonmonoxy myoglobin at low-temperatures, Phys Rev Lett 32 (1974), 403-405.
[6] R.H. Austin, K.W. Beeson, L. Eisenstein, H. Frauenfelder, I.C. Gunsalus, Dynamics of ligand binding to myoglobin, Biochemistry 14 (1975), 5355-5373.
[7] G.N. Phillips, Jr., M.L. Teodoro, T. Li, B. Smith, J.S. Olson, Bound CO is a molecular probe of electrostatic potential in the distal pocket of myoglobin, J Phys Chem B 103 (1999), 8817-8829.
[8] A. Ansari, J. Berendzen, D. Braunstein, B.R. Cowen, H. Frauenfelder, M.K. Hong, I.E.T. Iben, J.B. Johnson, P. Ormos, T.B. Sauke, et al., Rebinding and relaxation in the myoglobin pocket, Biophys Chem 26 (1987), 337-355.
[9] T.S. Li, M.L. Quillin, G.N. Phillips, Jr., J.S. Olson, Structural determinants of the stretching frequency of CO bound to myoglobin, Biochemistry 33 (1994), 1433-1446.
[10] D. Morikis, P.M. Champion, B.A. Springer, S.G. Sligar, Resonance Raman investigations of site-directed mutants of myoglobin - effects of distal histidine replacement, Biochemistry 28 (1989), 4791-4800.
[11] K.A. Merchant, W.G. Noid, R. Akiyama, I. Finkelstein, A. Goun, B.L. McClain, R.F. Loring, M.D. Fayer, Myoglobin-CO substate structures and dynamics: Multidimensional vibrational echoes and molecular dynamics simulations, J Am Chem Soc 125 (2003), 13804-13818.
[12] K.A. Merchant, W.G. Noid, D.E. Thompson, R. Akiyama, R.F. Loring, M.D. Fayer, Structural assignments and dynamics of the A substates of MbCO: Spectrally resolved vibrational echo experiments and molecular dynamics simulations, J Phys Chem B 107 (2003), 4-7.
[13] D. Zimdars, A. Tokmakoff, S. Chen, S.R. Greenfield, M.D. Fayer, T.I. Smith, H.A. Schwettman, Picosecond infrared vibrational echoes in a liquid and glass using a free electron laser, Phys Rev Lett 70 (1993), 2718-2721.
[14] C.W. Rella, A. Kwok, K.D. Rector, J.R. Hill, H.A. Schwettmann, D.D. Dlott, M.D. Fayer, Vibrational echo studies of protein dynamics, Phys Rev Lett 77 (1996), 1648-1651.
[15] C.W. Rella, K.D. Rector, A.S. Kwok, J.R. Hill, H.A. Schwettman, D.D. Dlott, M.D. Fayer, Vibrational echo studies of myoglobin-CO, J Phys Chem 100 (1996), 15620-15629.
[16] M.D. Fayer, Fast protein dynamics probed with infrared vibrational echo experiments, Ann Rev Phys Chem 52 (2001), 315-356.
[17] A. Tokmakoff, M.D. Fayer, Infrared photon echo experiments: Exploring vibrational dynamics in liquids and glasses, Acc Chem Res 28 (1995), 437-445.
[18] M.D. Fayer (Ed), Ultrafast infrared and Raman spectroscopy, Marcel Dekker, Inc, New York, Basel, 2001.
[19] M.A. Brown, R.C. Semelka, MRI: Basic principles and applications, Wiley-Liss, 1999.
[20] I.J. Finkelstein, J. Zheng, H. Ishikawa, S. Kim, K. Kwak, M.D. Fayer, Probing dynamics of complex molecular systems with ultrafast 2D IR vibrational echo spectroscopy, Phys Chem Chem Phys 9 (2007), 1533-1549.
[21] S. Park, K. Kwak, M.D. Fayer, Ultrafast 2D-IR vibrational echo spectroscopy: A probe of molecular dynamics, Laser Phys Lett 4 (2007), 704-718.
[22] J. Zheng, K. Kwak, J.B. Asbury, X. Chen, I. Piletic, M.D. Fayer, Ultrafast dynamics of solute-solvent complexation observed at thermal equilibrium in real time, Science 309 (2005), 1338-1343.
[23] Y.S. Kim, R.M. Hochstrasser, Chemical exchange 2D IR of hydrogen-bond making and breaking, Proc Natl Acad Sci USA 102 (2005), 11185-11190.
[24] J. Zheng, K. Kwak, X. Chen, J.B. Asbury, M.D. Fayer, Formation and dissociation of intra-intermolecular hydrogen bonded solute-solvent complexes: Chemical exchange 2D IR vibrational echo spectroscopy, J Am Chem Soc 128 (2006), 2977-2987.
[25] J.B. Asbury, T. Steinel, C. Stromberg, S.A. Corcelli, C.P. Lawrence, J.L. Skinner, M.D. Fayer, Water dynamics: Vibrational echo correlation spectroscopy and comparison to molecular dynamics simulations, J Phys Chem A 108 (2004), 1107-1119.
[26] J.B. Asbury, T. Steinel, K. Kwak, S.A. Corcelli, C.P. Lawrence, J.L. Skinner, M.D. Fayer, Dynamics of water probed with vibrational echo correlation spectroscopy, J Chem Phys 121 (2004), 12431-12446.
[27] C.J. Fecko, J.D. Eaves, J.J. Loparo, A. Tokmakoff, P.L. Geissler, Local and collective hydrogen bond dynamics in the ultrafast vibrational spectroscopy of liquid water, Science 301 (2003), 1698-1702.
[28] S. Park, M.D. Fayer, Hydrogen bond dynamics in aqueous NaBr solutions, Proc Natl Acad Sci USA 104 (2007), 16731-16738.
[29] S.T. Roberts, J.J. Loparo, A. Tokmakoff, Characterization of spectral diffusion from two-dimensional line shapes, J Chem Phys 125 (2006), 084502.
[30] C.J. Fecko, J.J. Loparo, S.T. Roberts, A. Tokmakoff, Local hydrogen bonding dynamics and collective reorganization in water: Ultrafast infrared spectroscopy of HOD/D2O, J Chem Phys 122 (2005), 054506.
[31] J.B. Asbury, T. Steinel, M.D. Fayer, Hydrogen bond networks: Structure and evolution after hydrogen bond breaking, J Phys Chem B 108 (2004), 6544-6554.
[32] M. Khalil, N. Demirdoven, A. Tokmakoff, Vibrational coherence transfer characterized with Fourier-transform 2D IR spectroscopy, J Chem Phys 121 (2004), 362-373.


[33] K.D. Rector, C.W. Rella, A.S. Kwok, J.R. Hill, S.G. Sligar, E.Y.P. Chien, D.D. Dlott, M.D. Fayer, Mutant and wild type myoglobin-CO protein dynamics: Vibrational echo experiments, J Phys Chem B 101 (1997), 1468-1475.
[34] M.T. Zanni, R.M. Hochstrasser, Two-dimensional infrared spectroscopy: A promising new method for the time resolution of structures, Curr Opin Struct Biol 11 (2001), 516-522.
[35] P. Mukherjee, I. Kass, I.T. Arkin, M.T. Zanni, Picosecond dynamics of a membrane protein revealed by 2D IR, Proc Natl Acad Sci USA 103 (2006), 3528-3533.
[36] I.J. Finkelstein, H. Ishikawa, S. Kim, A.M. Massari, M.D. Fayer, Substrate binding and protein conformational dynamics measured via 2D-IR vibrational echo spectroscopy, Proc Natl Acad Sci USA 104 (2007), 2637-2642.
[37] J. Wang, J. Chen, R.M. Hochstrasser, Local structure of beta-hairpin isotopomers by FTIR, 2D IR, and ab initio theory, J Phys Chem B 110 (2006), 7545-7555.
[38] H.S. Chung, M. Khalil, A.W. Smith, Z. Ganim, A. Tokmakoff, Conformational changes during the nanosecond-to-millisecond unfolding of ubiquitin, Proc Natl Acad Sci USA 102 (2005), 612-617.
[39] S. Woutersen, R. Pfister, P. Hamm, Y. Mu, D.S. Kosov, G. Stock, Peptide conformational heterogeneity revealed from nonlinear vibrational spectroscopy and molecular-dynamics simulations, J Chem Phys 117 (2002), 6833-6840.
[40] Y. Kim, R.M. Hochstrasser, Dynamics of amide-I modes of the alanine dipeptide in D2O, J Phys Chem B 109 (2005), 6884-6891.
[41] C. Fang, R.M. Hochstrasser, Two-dimensional infrared spectra of the 13C=18O isotopomers of alanine residues in an alpha-helix, J Phys Chem B 109 (2005), 18652-18663.
[42] P. Mukherjee, A.T. Krummel, E.C. Fulmer, I. Kass, I.T. Arkin, M.T. Zanni, Site-specific vibrational dynamics of the CD3zeta membrane peptide using heterodyned two-dimensional infrared photon echo spectroscopy, J Chem Phys 120 (2004), 10215-10224.
[43] M.F. DeCamp, L. DeFlores, J.M. McCracken, A. Tokmakoff, K. Kwac, M. Cho, Amide I vibrational dynamics of N-methylacetamide in polar solvents: The role of electrostatic interactions, J Phys Chem B 109 (2005), 11016-11026.
[44] N. Demirdoven, C.M. Cheatum, H.S. Chung, M. Khalil, J. Knoester, A. Tokmakoff, Two-dimensional infrared spectroscopy of antiparallel beta-sheet secondary structure, J Am Chem Soc 126 (2004), 7981-7990.
[45] S. Mukamel, D. Abramavicius, Many-body approaches for simulating coherent nonlinear spectroscopies of electronic and vibrational excitons, Chem Rev 104 (2004), 2073-2098.
[46] J.H. Choi, H. Lee, K.K. Lee, S. Hahn, M. Cho, Computational spectroscopy of ubiquitin: Comparison between theory and experiments, J Chem Phys 126 (2007), 045102.
[47] A.M. Massari, I.J. Finkelstein, M.D. Fayer, Dynamics of proteins encapsulated in silica sol-gel glasses studied with IR vibrational echo spectroscopy, J Am Chem Soc 128 (2006), 3990-3997.
[48] I.J. Finkelstein, A.M. Massari, M.D. Fayer, Viscosity dependent protein dynamics, Biophys J 92 (2006), 3652-3662.
[49] A.M. Massari, B.L. McClain, I.J. Finkelstein, A.P. Lee, H.L. Reynolds, K.L. Bren, M.D. Fayer, Cytochrome c mutants: Structure and dynamics at the active site probed by multidimensional NMR and vibration echo spectroscopy, J Phys Chem B 110 (2006), 18803-18810.
[50] C. Fang, A. Senes, L. Cristian, W.F. DeGrado, R.M. Hochstrasser, Amide vibrations are delocalized across the hydrophobic interface of a transmembrane helix dimer, Proc Natl Acad Sci USA 103 (2006), 16740-16745.
[51] H. Ishikawa, I.J. Finkelstein, S. Kim, K. Kwak, J.K. Chung, K. Wakasugi, A.M. Massari, M.D. Fayer, Neuroglobin dynamics observed with ultrafast 2D-IR vibrational echo spectroscopy, Proc Natl Acad Sci USA 104 (2007), 16116-16121.
[52] C. Kolano, J. Helbing, M. Kozinski, W. Sander, P. Hamm, Watching hydrogen-bond dynamics in a beta-turn by transient two-dimensional infrared spectroscopy, Nature 444 (2006), 469-472.
[53] H. Ishikawa, S. Kim, K. Kwak, K. Wakasugi, M.D. Fayer, Disulfide bond influence on protein structural dynamics probed with 2D-IR vibrational echo spectroscopy, Proc Natl Acad Sci USA 104 (2007), 19309-19314.
[54] I.J. Finkelstein, B.L. McClain, M.D. Fayer, Fifth-order contributions to ultrafast spectrally resolved vibrational echoes: Heme-CO proteins, J Chem Phys 121 (2004), 877-885.
[55] I.J. Finkelstein, A. Goj, B.L. McClain, A.M. Massari, K.A. Merchant, R.F. Loring, M.D. Fayer, Ultrafast dynamics of myoglobin without the distal histidine: Stimulated vibrational echo experiments and molecular dynamics simulations, J Phys Chem B 109 (2005), 16959-16966.
[56] A.M. Massari, I.J. Finkelstein, B.L. McClain, A. Goj, X. Wen, K.L. Bren, R.F. Loring, M.D. Fayer, The influence of aqueous vs. glassy solvents on protein dynamics: Vibrational echo experiments and molecular dynamics simulations, J Am Chem Soc 127 (2005), 14279-14289.
[57] K.A. Merchant, D.E. Thompson, Q.-H. Xu, R.B. Williams, R.F. Loring, M.D. Fayer, Myoglobin-CO conformational substate dynamics: 2D vibrational echoes and MD simulations, Biophys J 82 (2002), 3277-3288.
[58] H. Hartmann, F. Parak, W. Steigemann, G.A. Petsko, D.R. Ponzi, H. Frauenfelder, Conformational substates in a protein: Structure and dynamics of metmyoglobin at 80 K, Proc Natl Acad Sci USA 79 (1982), 4967-4971.
[59] T. Burmester, B. Weich, S. Reinhardt, T. Hankeln, A vertebrate globin expressed in the brain, Nature 407 (2000), 520-523.
[60] M. Khalil, N. Demirdoven, A. Tokmakoff, Coherent 2D IR spectroscopy: Molecular structure and dynamics in solution, J Phys Chem A 107 (2003), 5258-5279.
[61] S. Mukamel, Principles of nonlinear optical spectroscopy, Oxford University Press, New York, 1995.
[62] D.M. Jonas, Two-dimensional femtosecond spectroscopy, Annu Rev Phys Chem 54 (2003), 425-463.
[63] J.B. Asbury, T. Steinel, M.D. Fayer, Vibrational echo correlation spectroscopy probes hydrogen bond dynamics in water and methanol, J Lumin 107 (2004), 271-286.
[64] M. Khalil, N. Demirdoven, A. Tokmakoff, Obtaining absorptive line shapes in two-dimensional infrared vibrational correlation spectra, Phys Rev Lett 90 (2003), 047401.
[65] K. Kwak, S. Park, I.J. Finkelstein, M.D. Fayer, Frequency-frequency correlation functions and apodization in two-dimensional infrared vibrational echo spectroscopy: A new approach, J Chem Phys 127 (2007), 124503.
[66] J.D. Eaves, J.J. Loparo, C.J. Fecko, S.T. Roberts, A. Tokmakoff, P.L. Geissler, Hydrogen bonds in liquid water are broken only fleetingly, Proc Natl Acad Sci USA 102 (2005), 13019-13022.
[67] P. Hamm, R.M. Hochstrasser, Structure and dynamics of proteins and peptides: Femtosecond two-dimensional infrared spectroscopy. In Ultrafast infrared and Raman spectroscopy. Edited by M.D. Fayer, Marcel Dekker, Inc., 2001, 273-347.
[68] R.B. Williams, R.F. Loring, M.D. Fayer, Vibrational dephasing of carbonmonoxy myoglobin, J Phys Chem B 105 (2001), 4068-4071.
[69] Y.S. Bai, M.D. Fayer, Time scales and optical dephasing measurements: Investigation of dynamics in complex systems, Phys Rev B 39 (1989), 11066-11084.
[70] J. Schmidt, N. Sundlass, J. Skinner, Line shapes and photon echoes within a generalized Kubo model, Chem Phys Lett 378 (2003), 559-566.
[71] J.S. Olson, G.N. Phillips, Jr., Kinetic pathways and barriers for ligand binding to myoglobin, J Biol Chem 271 (1996), 17593-17596.
[72] H.J. Dyson, P.E. Wright, Insights into protein folding from NMR, Annu Rev Phys Chem 47 (1996), 369-395.
[73] Y. Mizutani, T. Kitagawa, Ultrafast dynamics of myoglobin probed by time-resolved resonance Raman spectroscopy, Chem Rec 1 (2001), 258-275.
[74] R.M. Hochstrasser, D.K. Negus, Picosecond fluorescence decay of tryptophans in myoglobin, Proc Natl Acad Sci USA 81 (1984), 4399-4403.
[75] V. Srajer, P.M. Champion, Investigations of optical line shapes and kinetic hole burning in myoglobin, Biochemistry 30 (1991), 7390-7402.
[76] E. Oldfield, K. Guo, J.D. Augspurger, C.E. Dykstra, A molecular model for the major conformational substates in heme proteins, J Am Chem Soc 113 (1991), 7537-7541.
[77] T.G. Spiro, I.H. Wasbotten, CO as a vibrational probe of heme protein active sites, J Inorg Biochem 99 (2005), 34-44.
[78] E.S. Park, S.G. Boxer, Origins of the sensitivity of molecular vibrations to electric fields: Carbonyl and nitrosyl stretches in model compounds and proteins, J Phys Chem B 106 (2002), 5800-5806.
[79] E. Antonini, M. Brunori, Hemoglobin and myoglobin in their reactions with ligands, North-Holland, Amsterdam, 1971.
[80] J.B. Johnson, D.C. Lamb, H. Frauenfelder, J.D. Muller, B. McMahon, G.U. Nienhaus, R.D. Young, Ligand binding to heme proteins. 6. Interconversion of taxonomic substates in carbonmonoxymyoglobin, Biophys J 71 (1996), 1563-1573.
[81] S.E. Phillips, B.P. Schoenborn, Neutron diffraction reveals oxygen-histidine hydrogen bond in oxymyoglobin, Nature 292 (1981), 81-82.
[82] Y. Sun, K. Jin, X.O. Mao, Y. Zhu, D.A. Greenberg, Neuroglobin is up-regulated by and protects neurons from hypoxic-ischemic injury, Proc Natl Acad Sci USA 98 (2001), 15306-15311.
[83] Y. Sun, K. Jin, A. Peel, X.O. Mao, L. Xie, D.A. Greenberg, Neuroglobin protects the brain from experimental stroke in vivo, Proc Natl Acad Sci USA 100 (2003), 3497-3500.
[84] K. Wakasugi, T. Nakano, I. Morishima, Oxidized human neuroglobin acts as a heterotrimeric Galpha protein guanine nucleotide dissociation inhibitor, J Biol Chem 278 (2003), 36505-36512.
[85] K. Wakasugi, I. Morishima, Identification of residues in human neuroglobin crucial for guanine nucleotide dissociation inhibitor activity, Biochemistry 44 (2005), 2943-2948.
[86] S. Herold, A. Fago, R.E. Weber, S. Dewilde, L. Moens, Reactivity studies of the Fe(III) and Fe(II)NO forms of human neuroglobin reveal a potential role against oxidative stress, J Biol Chem 279 (2004), 22841-22847.
[87] S. Dewilde, L. Kiger, T. Burmester, T. Hankeln, V. Baudin-Creuza, T. Aerts, M.C. Marden, R. Caubergs, L. Moens, Biochemical characterization and ligand binding properties of neuroglobin, a novel member of the globin family, J Biol Chem 276 (2001), 38949-38955.
[88] S. Kundu, J.T. Trent, III, M.S. Hargrove, Plants, humans and hemoglobins, Trends Plant Sci 8 (2003), 387-393.
[89] D. Hamdane, L. Kiger, S. Dewilde, B.N. Green, A. Pesce, J. Uzan, T. Burmester, T. Hankeln, M. Bolognesi, L. Moens, et al., The redox state of the cell regulates the ligand binding affinity of human neuroglobin and cytoglobin, J Biol Chem 278 (2003), 51713-51721.
[90] H. Sawai, M. Makino, Y. Mizutani, T. Ohta, H. Sugimoto, T. Uno, N. Kawada, K. Yoshizato, T. Kitagawa, Y. Shiro, Structural characterization of the proximal and distal histidine environment of cytoglobin and neuroglobin, Biochemistry 44 (2005), 13257-13265.
[91] T. Uno, D. Ryu, H. Tsutsumi, Y. Tomisugi, Y. Ishikawa, A.J. Wilkinson, H. Sato, T. Hayashi, Residues in the distal heme pocket of neuroglobin. Implications for the multiple ligand binding steps, J Biol Chem 279 (2004), 5886-5893.
[92] T. Burmester, B. Ebner, B. Weich, T. Hankeln, Cytoglobin: A novel globin type ubiquitously expressed in vertebrate tissues, Mol Biol Evol 19 (2002), 416-421.
[93] J.T. Trent, III, M.S. Hargrove, A ubiquitously expressed human hexacoordinate hemoglobin, J Biol Chem 277 (2002), 19538-19545.
[94] D. Barford, The role of cysteine residues as redox-sensitive regulatory switches, Curr Opin Struct Biol 14 (2004), 679-686.
[95] H. Liu, R. Colavitti, I.I. Rovira, T. Finkel, Redox-dependent transcriptional regulation, Circ Res 97 (2005), 967-974.
[96] P.J. Hogg, Disulfide bonds as switches for protein function, Trends Biochem Sci 28 (2003), 210-214.
[97] S.F. Betz, J.L. Marmorino, A.J. Saunders, D.F. Doyle, G.B. Young, G.J. Pielak, Unusual effects of an engineered disulfide on global and local protein stability, Biochemistry 35 (1996), 7422-7428.
[98] S.A. Beeser, T.G. Oas, D.P. Goldenberg, Determinants of backbone dynamics in native BPTI: Cooperative influence of the 14-38 disulfide and the Tyr35 side-chain, J Mol Biol 284 (1998), 1581-1596.
[99] J.J. Kelley, III, T.M. Caputo, S.F. Eaton, T.M. Laue, J.H. Bushweller, Comparison of backbone dynamics of reduced and oxidized Escherichia coli glutaredoxin-1 using 15N NMR relaxation measurements, Biochemistry 36 (1997), 5029-5044.
[100] B. Tidor, M. Karplus, The contribution of cross-links to protein stability: A normal mode analysis of the configurational entropy of the native state, Proteins 15 (1993), 71-79.
[101] M.E. Moghaddam, H. Naderi-Manesh, Role of disulfide bonds in modulating internal motions of proteins to tune their function: Molecular dynamics simulation of scorpion toxin Lqh III, Proteins 63 (2006), 188-196.
[102] A. Pesce, S. Dewilde, M. Nardini, L. Moens, P. Ascenzi, T. Hankeln, T. Burmester, M. Bolognesi, Human brain neuroglobin structure reveals a distinct mode of controlling oxygen affinity, Structure 11 (2003), 1087-1095.
[103] B. Vallone, K. Nienhaus, M. Brunori, G.U. Nienhaus, The structure of murine neuroglobin: Novel pathways for ligand migration and binding, Proteins 56 (2004), 85-92.
[104] B. Vallone, K. Nienhaus, A. Matthes, M. Brunori, G.U. Nienhaus, The structure of carbonmonoxy neuroglobin reveals a heme-sliding mechanism for control of ligand affinity, Proc Natl Acad Sci USA 101 (2004), 17351-17356.
[105] R. Jimenez, G. Salazar, J. Yin, T. Joo, F.E. Romesberg, Protein dynamics and the immunological evolution of molecular recognition, Proc Natl Acad Sci USA 101 (2004), 3803-3808.
[106] B. Ma, M. Shatsky, H.J. Wolfson, R. Nussinov, Multiple diverse ligands binding at a single protein site: A matter of pre-existing populations, Protein Sci 11 (2002), 184-197.
[107] N.C. Veitch, Horseradish peroxidase: A modern view of a classic enzyme, Phytochemistry 65 (2004), 249-259.
[108] N.C. Veitch, A.T. Smith, Horseradish peroxidase, Adv Inorg Chem 51 (2000), 107-162.
[109] S.M. Aitken, J.L. Turnbull, M.D. Percival, A.M. English, Thermodynamic analysis of the binding of aromatic hydroxamic acid analogues to ferric horseradish peroxidase, Biochemistry 40 (2001), 13980-13989.
[110] A.T. Smith, N.C. Veitch, Substrate binding and catalysis in heme peroxidases, Curr Opin Chem Biol 2 (1998), 269-278.
[111] W.J. Ingledew, P.R. Rich, A study of the horseradish peroxidase catalytic site by FTIR spectroscopy, Biochem Soc Trans 33 (2005), 886-889.
[112] S. Hashimoto, H. Takeuchi, Protonation and hydrogen-bonding state of the distal histidine in the CO complex of horseradish peroxidase as studied by ultraviolet resonance Raman spectroscopy, Biochemistry 45 (2006), 9660-9667.
[113] I.E. Holzbaur, A.M. English, A.A. Ismail, Infrared spectra of carbonyl horseradish peroxidase and its substrate complexes: Characterization of pH-dependent conformers, J Am Chem Soc 118 (1996), 3354-3359.
[114] H.S. Chung, Z. Ganim, K.C. Jones, A. Tokmakoff, Transient 2D IR spectroscopy of ubiquitin unfolding dynamics, Proc Natl Acad Sci USA 104 (2007), 14237-14242.
[115] P. Mukherjee, I. Kass, I.T. Arkin, M.T. Zanni, Structural disorder of the CD3zeta transmembrane domain studied with 2D IR spectroscopy and molecular dynamics simulations, J Phys Chem B 110 (2006), 24740-24749.
[116] H. Maekawa, C. Toniolo, Q.B. Broxterman, N. Ge, Two-dimensional infrared spectral signatures of 3(10)- and alpha-helical peptides, J Phys Chem B 111 (2007), 3222-3235.
[117] J. Xie, P.G. Schultz, Adding amino acids to the genetic repertoire, Curr Opin Chem Biol 9 (2005), 548-554.
[118] K.C. Schultz, L. Supekova, Y. Ryu, J. Xie, R. Perera, P.G. Schultz, A genetically encoded infrared probe, J Am Chem Soc 128 (2006), 13984-13985.
[119] S. Ohno, M. Matsui, T. Yokogawa, M. Nakamura, T. Hosoya, T. Hiramatsu, M. Suzuki, N. Hayashi, K. Nishikawa, Site-selective post-translational modification of proteins using an unnatural amino acid, 3-azidotyrosine, J Biochem 141 (2007), 335-343.
[120] J.W. Chin, T.A. Cropp, J.C. Anderson, M. Mukherji, Z. Zhang, P.G. Schultz, An expanded eukaryotic genetic code, Science 301 (2003), 964-967.
[121] K.L. Kiick, E. Saxon, D.A. Tirrell, C.R. Bertozzi, Incorporation of azides into recombinant proteins for chemoselective modification by the Staudinger ligation, Proc Natl Acad Sci USA 99 (2002), 19-24.
[122] M. Lim, P. Hamm, R.M. Hochstrasser, Protein fluctuations are sensed by stimulated infrared echoes of the vibrations of carbon monoxide and azide probes, Proc Natl Acad Sci USA 95 (1998), 15315-15320.

104

Biological and Biomedical Infrared Spectroscopy, A. Barth and P.I. Haris (Eds.), IOS Press, 2009. © 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-045-2-104

FTIR Data Processing and Analysis Tools


Erik GOORMAGHTIGH
Laboratory for the Structure and Function of Biological Membranes, Center for Structural Biology and Bioinformatics, Université Libre de Bruxelles, Belgium

Abstract: The information that can be retrieved from FTIR spectra depends largely on the quality of the original spectra and on the correction and processing methods applied to them. This contribution reviews the entire process leading to a sound and reliable interpretation of the data.

Keywords: Fourier transform infrared, attenuated total reflection, water vapor, side chain, noise, correlations, principal component analysis, deuterium exchange, secondary structure, orientation

1. Introduction

Over the last 20 years we have used FTIR spectroscopy to study membranes and membrane proteins. We have relied mainly on attenuated total reflection (ATR) rather than transmission spectroscopy, because ATR allows the orientation of specific chemical groups to be determined in oriented membranes. The potential differences between transmission and ATR spectra, as well as the specific advantages of ATR, have been described thoroughly in a previous review [1]. The conclusion of that review most relevant to the present chapter is that, in some instances, ATR spectra present particularities not encountered in transmission spectra. Significant distortions occur when spectra are recorded close to the critical angle, which happens to be the case at 45° incidence with common materials such as KRS-5 or ZnSe. All the spectra presented in this chapter were recorded on germanium crystals at 45° incidence. Because of the high refractive index of germanium, the critical angle is far from 45° and such distortions do not occur. The reader is referred to our previous review [1] for a detailed discussion. The purpose of this chapter is to present practical information about the recording and processing of FTIR spectra.
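The remark about the critical angle can be checked with a little arithmetic. The sketch below applies Snell's law, critical angle = arcsin(n_sample / n_crystal). The refractive indices are assumptions chosen for illustration (approximate mid-infrared values of 4.0 for germanium, 2.4 for ZnSe, 2.37 for KRS-5, and 1.5 for the sample film); they are not taken from this chapter, and the function name is hypothetical.

```python
import math


def critical_angle_deg(n_crystal, n_sample):
    """Critical angle (degrees) for total internal reflection at the crystal/sample interface."""
    return math.degrees(math.asin(n_sample / n_crystal))


# Assumed, approximate refractive indices (illustrative values, not from the chapter).
N_SAMPLE = 1.5
CRYSTALS = {"Ge": 4.0, "ZnSe": 2.4, "KRS-5": 2.37}

for name, n_crystal in CRYSTALS.items():
    theta_c = critical_angle_deg(n_crystal, N_SAMPLE)
    print(f"{name}: critical angle {theta_c:.1f} deg, margin to 45 deg: {45.0 - theta_c:.1f} deg")
```

With these assumed values the critical angle comes out near 22° for germanium and near 39° for ZnSe and KRS-5, so a 45° incidence leaves a wide margin only for germanium.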

____________________________
1 Corresponding Author: Laboratory for the Structure and Function of Biological Membranes, Center for Structural Biology and Bioinformatics, Université Libre de Bruxelles, CP 206/2, Boulevard du Triomphe, B-1050 Brussels, Belgium; E-mail: egoor@ulb.ac.be.

E. Goormaghtigh / FTIR Data Processing and Analysis Tools


2. Methodology: Recording of Spectra

The recording of spectra is the most important step prior to analysis. If the data do not contain the information sought, or contain too much noise, no processing can overcome the problem. We present below the typical procedure we follow for recording spectra.

Attenuated total reflection infrared (ATR-FTIR) spectra presented in this chapter were obtained on one of our five Bruker FTIR spectrophotometers of the IFS55/Equinox family (Ettlingen, Germany), all equipped with an MCT detector (broad band 12000-420 cm⁻¹, liquid N₂ cooled, 24 h hold time). Spectra were recorded at a resolution of 2 cm⁻¹ with an aperture of 3.5 mm and acquired in the double-sided, forward-backward mode. The spectrometer was placed on vibration-absorbing Sorbothane mounts (Edmund Industrial Optics, Barrington, NJ, USA). Two levels of zero filling of the interferogram were applied prior to Fourier transform. Spectra were finally saved in the ASCII JCAMP format and then conveniently encoded with one data point every wavenumber for subsequent manipulations. The spectrometer was continuously purged with dry air (Whatman 75-62, Haverhill, MA, USA or K-MT8 air dryer from Zander, Essen, Germany). For better stability, the purging of the spectrometer optics compartment (5 l/min) and of the sample compartment (10-20 l/min) was controlled independently by flowmeters (Fisher Bioblock Scientific, Illkirch, France). Room temperature was kept constant at 22 °C with an air conditioning system. The germanium crystals were washed in Superdecontamine (Intersciences, AS, Brussels, Belgium), a lab detergent solution at pH 13, rinsed with distilled water, washed with methanol, then with chloroform and finally placed for 2 min in a PDC23G plasma cleaner (Harrick, Ossining, NY, USA) working under reduced air pressure.
Thin films were obtained by slowly evaporating a sample containing a total of 10-100 µg of protein or lipid on one side of the ATR plate under a stream of nitrogen. This holds for a large internal reflection element, a 52 x 20 x 2 mm trapezoidal germanium ATR plate (ACM, Villiers St Frédéric, France) with an aperture angle of 45° yielding 25 internal reflections. In some experiments a Golden Gate diamond ATR unit (Specac, Orpington, UK) was used with 0.1 to 1 µg of sample. It must be stressed that, as a rule, the dry weight of buffer, salt and other molecules from the solution must be kept smaller than the dry weight of membranes (lipids + proteins). If this condition is not respected, the spectrum intensity is weak because of the exponential decay of the evanescent wave. In fact, diluting the molecules of interest in others, even non-absorbing ones, results in a parallel decrease of the spectral intensity. We found that an excellent way to improve the quality of the spectra is to overlay the membrane film with ca. 300 µl of a buffer, then remove as much of the liquid as possible by tilting the crystal and soaking up the liquid with filter paper, and finally dry the film again under N₂. This procedure also allows the composition of the buffer, or its pH, to be modified at will [2]. The germanium crystal was placed in an ATR holder for liquid samples with an inlet and an outlet (Specac, Orpington, UK). The liquid cell was placed at 45° incidence on a Specac vertical ATR setup. Two such setups, the second mounted as the mirror image of the first, were fitted on the sample shuttle provided by Bruker, allowing the recording of two samples or two H/D exchange kinetic experiments almost simultaneously. This is an important feature since the two samples can be compared under identical conditions (temperature, gas flow rate). Furthermore, an elevator under computer control made it possible to move the whole setup along a vertical axis (built by WOW Company SA, Nannine, Belgium). This allows the crystal to be divided into different lanes, one of them being used for the background. This elevator is absolutely required when the temperature of the sample is to be changed in the course of the experiment, for instance when monitoring phase transitions in lipid membranes.

For hydrogen/deuterium exchange experiments, nitrogen gas was saturated with ²H₂O by bubbling through a series of four vials containing ²H₂O. The flow rate of 50 ml/min was controlled by a flowtube (Fisher Bioblock Scientific, Illkirch, France). Bubbling was started at least one hour before starting the experiments. At zero time, the tubing was connected to the cavity of the liquid cell chamber surrounding the film. For each time point, 20 scans were recorded and averaged. The time interval was increased exponentially (see Figure 6). After 27 minutes, the interval between the scans was large enough to allow a second kinetics measurement to be interleaved. The second sample was then analyzed with the same time sampling but with a 27 minute offset. Its deuteration was started by connecting it to the ²H₂O-saturated N₂ flow from the output of the first sample chamber. Sample shuttle movements and spectrum recording were under the control of a macro program written for OPUS (Bruker, Ettlingen, Germany).

For the analysis of ¹H/²H experiments, the areas of the lipid ν(C=O), amide I, I' and II bands were obtained by automatic integration. For each spectrum, the area of amide II was divided by the area of amide I in order to take into account the swelling of the sample layer due to the presence of ²H₂O. All kinetic curves were analyzed as multiexponential decays of populations αi of amide protons with the same time constants Tj, using a nonlinear least-squares procedure. It is usual to fit the curve H(t), the proportion of unexchanged amide protons, with a small number M of exponentials (typically 3), each representing a class Aj of amide groups:

$$H(t) = \sum_{j=1}^{M} A_j \exp\left(-\frac{t}{T_j}\right) \qquad (1)$$
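As an illustration of equation 1, the following minimal sketch evaluates H(t) for three classes of amide protons. The amplitudes and time constants are hypothetical round numbers chosen only to show the shape of the curve, and the function name is not part of the software described in this chapter.

```python
import math


def unexchanged_fraction(t, amplitudes, time_constants):
    """Equation 1: H(t) = sum over j of A_j * exp(-t / T_j)."""
    return sum(a * math.exp(-t / tau) for a, tau in zip(amplitudes, time_constants))


# Hypothetical three-class example: amplitudes in % of amide protons, time constants in min.
A = [30.0, 10.0, 60.0]
T = [1.0, 10.0, 2000.0]

for t in (0.0, 1.0, 10.0, 100.0, 1000.0):
    print(f"t = {t:7.1f} min   H(t) = {unexchanged_fraction(t, A, T):6.2f} %")
```

At t = 0 the curve equals the sum of the amplitudes. Each class then disappears on its own time scale, which is why sampling on an exponentially increasing time grid is convenient.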

2D correlation spectra were calculated according to Noda [3-5] and as described by Nabet and Pézolet [6]. Computation was carried out using the Hilbert transform as recently reported [7]. The 2D correlation spectra can be interpreted using the rules described by Noda [8] and more recently by Ekgasit et al. [9].

Fourier self-deconvolution, when required, was performed according to [10-12]. Deconvolution was performed with a Lorentzian line (FWHH = 30 cm⁻¹) and apodization with a Gaussian line (FWHH = 15 cm⁻¹), resulting in a so-called line-narrowing factor (K) of 2.0. It must be noted that, although the narrowing effect of Fourier self-deconvolution has been widely used, the shape and width of the line to be deconvoluted are usually unknown, which makes band narrowing less efficient, as clearly illustrated in the past [13, 14] and more recently by Lorenz-Fonfria et al. [15]. Smoothing was obtained simply by apodization of the Fourier transform of the spectrum, typically by the Fourier transform of a 4 cm⁻¹ wide Gaussian. All the software used for data processing was written in MATLAB (MathWorks Inc., Natick, MA, USA).
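The smoothing step can be sketched as follows. By the convolution theorem, multiplying the Fourier transform of a spectrum by the Fourier transform of a Gaussian is equivalent to convolving the spectrum with that Gaussian, so this sketch uses a direct convolution rather than an FFT. That substitution is ours, made to keep the example free of dependencies. The edge handling (renormalising the weights near the ends) and the function names are also assumptions, and the data spacing is taken to be one point per wavenumber as stated above.

```python
import math


def gaussian_kernel(fwhh, step=1.0, n_sigma=4.0):
    """Unit-area Gaussian sampled every `step` cm^-1, truncated at +/- n_sigma sigma."""
    sigma = fwhh / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = int(math.ceil(n_sigma * sigma / step))
    weights = [math.exp(-0.5 * (i * step / sigma) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(spectrum, fwhh=4.0, step=1.0):
    """Convolve a spectrum with a Gaussian of the given FWHH (cm^-1)."""
    kernel = gaussian_kernel(fwhh, step)
    half = len(kernel) // 2
    out = []
    for i in range(len(spectrum)):
        acc = norm = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(spectrum):  # ignore kernel points falling outside the data
                acc += w * spectrum[j]
                norm += w
        out.append(acc / norm)
    return out


# A single sharp spike is spread into a Gaussian of unit area.
spike = [0.0] * 41
spike[20] = 1.0
smoothed = smooth(spike)
print(f"peak height after smoothing: {max(smoothed):.3f}")
```

An FFT-based implementation treats the spectrum as periodic, so the two approaches differ slightly near the ends of the spectral range.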


3. Primary Spectral Processing

We present below the processing that is applied almost automatically to recorded data.

3.1. Spectral Corrections for Atmospheric Water

The presence of sharp atmospheric water absorption lines superimposed on the sample spectra cannot be avoided in long-term experiments or when recording very small absorbances. Atmospheric water absorbance is corrected more effectively when spectra are recorded at relatively high resolution, which takes advantage of the difference in linewidth between the atmospheric water vapor lines and the solid sample bands [16]. This is illustrated in Figure 1.

[Figure 1: two panels showing absorbance versus wavenumber (cm⁻¹); curves labelled by resolution: 0.5, 1.0, 2.0, 4.0 and 8.0 cm⁻¹.]

Figure 1. Water vapor spectra recorded at different nominal resolutions. The same amount of water vapor is present in all the spectra, which are presented on the same scale with an offset. In the left panel, water vapor spectra were recorded after decreasing the purging of the sample compartment. Spectra were recorded at 0.5, 1, 2, 4 and 8 cm⁻¹ as indicated. In the right panel, the spectra of a tumor cell line are overlaid with the spectra from the left panel. It can be observed that the same amount of water is practically undetectable at the lower resolutions.

Obviously no smoothing should be carried out before this step. Simple subtraction of a reference water vapor spectrum is sufficient in most cases, but derivative-shaped lines are usually left over. They arise from a difference between the reference water vapor spectrum used for the correction and the one present in the sample spectrum. The problem is further complicated by the fact that not one but up to four water vapor spectra are involved. These contributions come from 1) the water vapor present at the time of the recording of the background, 2) the water vapor present at the time of the recording of the sample, 3) the water vapor present at the time of the recording of the reference water vapor spectrum and 4) the water vapor present at the time of the recording of the background used for the latter reference spectrum. The physical reasons for the differences between these spectra are not completely clear. Temperature and atmospheric pressure play an important role. In addition, we have noticed that the precise positioning of the germanium crystal contributes to the problem. The problem of water vapor subtraction has attracted the attention of several groups in the past [16, 17]. We present below the different approaches that, in our hands, best help solve the problem.

1) Purging the sample compartment thoroughly until "equilibrium". This appears obvious but in practice equilibrium is never reached. Our software can plot the subtraction coefficients used for the correction in overnight experiments. Fluctuations keep appearing even after 12 hours, demonstrating that the stability of the purge system is not sufficient.

2) Collecting and subtracting a water reference spectrum. It can be hypothesized that the best reference spectrum is the one collected on top of the sample. A spectrum (SpA) is collected after, say, 10 min of purging and another one (SpB) after 12 min of purging. Simply computing SpA-SpB yields a water vapor spectrum collected in exactly the same conditions as the sample, and both backgrounds (for the sample and for the reference water vapor) are identical. The 4th contribution mentioned above is therefore no longer relevant. The program then computes the subtraction coefficient as the ratio of the atmospheric water band area between 1562 and 1555 cm⁻¹ (a straight line is drawn between the spectrum points at these two wavenumbers) in the sample spectrum to that in the reference atmospheric water spectrum. This level of processing is generally sufficient for experiments such as the monitoring of H/D kinetics. If an acceptable result is not achieved at this point, a further step must be undertaken.
3) In order to account for the small frequency shifts of the water vapor bands, we wrote software that accurately evaluates the position of the 1573-1579 cm⁻¹ band by fitting a Gaussian lineshape to the reference and sample spectra. Once the shift is determined, the reference water vapor spectrum is shifted (typically by 0.1-0.2 cm⁻¹) by linear interpolation before subtraction as above. This approach improves the results in most cases but may also fail to yield a correct result because of the multiplicity of the water contributions described above. The most difficult cases are processed with either of the last two methods. Figure 2 presents the reference water vapor spectrum (curve A), the original sample spectrum (curve B), the corrected spectrum obtained without shifting the reference spectrum (curve C) and the corrected spectrum obtained after shifting the reference spectrum (curve D). The smoothed curve D appears as curve E. This Figure clearly demonstrates the need for the wavenumber adjustment of the reference water vapor spectrum. This approach also shows that the internal shift to be applied fluctuates in an apparently random manner during the course of an overnight experiment (not shown).

4) In order to account for several contributions with several band shifts, we use a combination of a reference spectrum with itself shifted by several values (typically the starting values are -0.1 and +0.1 cm⁻¹). Three reference spectra are thus generated and 5 parameters must be determined to optimize the subtraction: the three subtraction coefficients and the two shifts. We found it convenient to use a least-squares procedure to minimize the "length" of the resulting spectrum obtained after water vapor subtraction in a defined spectral range. This idea comes from the observation that, at rather "high" resolution, the sample spectrum overlaid by the spiky water vapor spectrum is much "longer" than the intrinsically smooth spectrum of the sample. The "length" of the spectrum is obtained as the sum of the distances between all successive data points in the selected spectral range. In practice it is obtained by subtracting the resulting spectrum from itself after shifting the data points by one point. The sum of the differences is minimized in order to adjust the 5 coefficients. Unexpectedly, we found that the recording of the reference spectrum is not critical. Good results are obtained when using a standard vapor spectrum recorded several years before.

[Figure 2: absorbance (x10³) versus wavenumber (cm⁻¹); curves labelled A to E.]

Figure 2. Illustration of water vapor band removal on a spectrum of triose phosphate isomerase. Reference water vapor spectrum (curve A), original sample spectrum (curve B), corrected spectrum obtained without shifting the reference spectrum (curve C), corrected spectrum obtained after shifting the reference spectrum (curve D) and smoothing of curve D by apodization of its Fourier transform by a Gaussian lineshape in order to obtain a resolution of 4 cm⁻¹ (curve E). The subtraction coefficient is computed as the ratio of the area of the band present between 1562 and 1555 cm⁻¹ as indicated by the arrows.

5) The alternative procedure does not use shifts but many spectra that represent the variety of the water vapor contributions. Principal components were used to reconstruct the water vapor contribution. The principal components were obtained from an initial set of 1360 water vapor spectra collected in recent years by different researchers in the lab. After rescaling, analysis of the spectra for noise and rejection of outliers, the series was subjected to principal component decomposition (Figure 3). As before, the "length" of the curve between 1700 and 1600 cm⁻¹ is minimized in order to determine the subtraction coefficients for the different principal components. The best results are usually obtained with 2-5 principal components. Using too many principal components can degrade the result.

6) Finally, whatever the procedure followed, the corrected spectrum is smoothed by apodization of its Fourier transform by the Fourier transform of a 4 cm⁻¹ wide Gaussian lineshape.
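Two of the approaches listed above lend themselves to a compact sketch: the band-area ratio of approach 2 and the "length" minimization of approach 4. The code below is a simplified illustration and not the software used in this chapter. It handles a single reference spectrum with no wavenumber shift, and it assumes that the "length" is measured as the sum of squared differences between successive points, in which case the optimal coefficient has a closed-form least-squares solution. All function names are hypothetical.

```python
def first_differences(y):
    """Differences between successive data points."""
    return [b - a for a, b in zip(y, y[1:])]


def band_area(wn, y, lo, hi):
    """Trapezoidal area between lo and hi above the straight line joining the end points."""
    pts = sorted((x, v) for x, v in zip(wn, y) if lo <= x <= hi)
    (x0, y0), (x1, y1) = pts[0], pts[-1]
    slope = (y1 - y0) / (x1 - x0)
    flat = [(x, v - (y0 + slope * (x - x0))) for x, v in pts]
    return sum(0.5 * (va + vb) * (xb - xa)
               for (xa, va), (xb, vb) in zip(flat, flat[1:]))


def coefficient_from_band(wn, sample, vapor, lo=1555.0, hi=1562.0):
    """Approach 2: ratio of the water band areas in the sample and reference spectra."""
    return band_area(wn, sample, lo, hi) / band_area(wn, vapor, lo, hi)


def coefficient_from_length(sample, vapor):
    """Coefficient c minimising the sum of squared first differences of sample - c * vapor."""
    ds, dr = first_differences(sample), first_differences(vapor)
    return sum(a * b for a, b in zip(ds, dr)) / sum(b * b for b in dr)


def subtract(sample, vapor, c):
    """Subtract c times the reference vapor spectrum from the sample spectrum."""
    return [s - c * v for s, v in zip(sample, vapor)]


# Synthetic demonstration: a sloping baseline plus 0.3 times a spiky "vapor" pattern.
vapor = [1.0 if i % 4 == 2 else 0.0 for i in range(41)]
sample = [0.001 * i + 0.3 * v for i, v in enumerate(vapor)]
c = coefficient_from_length(sample, vapor)
corrected = subtract(sample, vapor, c)
print(f"subtraction coefficient: {c:.3f}")
```

Extending the second function to several shifted references or to several principal components turns the closed-form expression into a small linear least-squares problem for the coefficients. The shifts themselves enter nonlinearly and require an iterative search.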

[Figure 3: absorbance (x10³) versus wavenumber (cm⁻¹).]

Figure 3. The first 4 principal components describing the water vapor contribution in the 1800-1400 cm⁻¹ range.

3.2. Spectral Corrections for Amino Acid Side Chain Contribution

In the course of the ¹H/²H exchange experiments, once the atmospheric water bands have been removed, the spectra display a distinct feature located near 1580 cm⁻¹ whose intensity increases rapidly as a function of the deuteration time (Figure 4, left arrow). Another shoulder is present near 1515 cm⁻¹ throughout the experiment (Figure 4, right arrow) but becomes more and more visible as the exchange proceeds. These features are due to absorbance by the amino acid side chains of the protein. The 1515 cm⁻¹ band can easily be assigned to a tyrosine ring vibration, and the broad shoulder at 1580 cm⁻¹ is an overlap of νas(COO⁻) from Asp and Glu and of νs(CN₃²H₅⁺) from Arg. Figure 4 demonstrates that the area under amide II, as delimited by the baselines drawn on the Figure, is underestimated because of the side chain contributions. In the undeuterated spectrum, the sum of the side chain contributions represents about 10% of the amide I intensity (Figure 5). Because of its overall broad shape, it is usually not taken into account when the amide I shape is analyzed for protein secondary structure determination. In the deuterated spectrum, a distinct maximum appears near 1580 cm⁻¹ (Figure 5) and makes it impossible to establish a reliable baseline for evaluating the amide II area. We therefore developed software that computes the contribution of the protein side chains as a function of the extent of deuteration. The deuterated and undeuterated contributions of the side chains (computed from the amino acid composition of the protein and from the data reported by Venyaminov and Kalnin [18] and Chirgadze and Brazhnikov [19], summarized in [20-22]) appear in Figure 5. The contributions of individual side chains reported in the literature had to be adjusted slightly in order to fit the amino acid side chain contributions observed here, according to [23]. More recent reports on side chain contributions could help improve the quality of the correction [21, 24-26], but precise modeling of the complex environment found in real proteins remains out of reach.

Subtracting the side chain contribution from every spectrum of the series recorded in the time course of the kinetics presupposes knowledge of 1) the pH, which governs the ionization state of the carboxylic amino acids, 2) the fraction of deuterated and undeuterated amino acid side chains for every spectrum of the kinetics and 3) the subtraction coefficient. In the present study:

1) The pH was assumed to be the pH of the solution from which the film was prepared, since we showed previously that the ionization of carboxylic acids is identical in films and solutions [27] and that the pH dependency of the exchange rate [23] and of peptide orientation [2] is maintained in film samples.

2) For every spectrum of the kinetics, a side chain deuteration index was computed from the intensity decay at 1673 cm⁻¹ during the first 10 minutes of the experiment, which monitors mainly arginine and asparagine deuteration, i.e. the main side chain contributions sensitive to deuteration in the present case. In agreement with previous observations [28], the exchange of the amino acid side chains was fast.

3) The scaling factor for the subtraction of the total amino acid side chain contribution was based on the integrated intensity of the side chain contributions (see above) on the one hand and on the amide I extinction coefficients computed according to the protein secondary structure from data reported in the literature ([29-31], summarized in [20]) on the other hand. In the course of this work, it appeared that the subtraction coefficients so determined are not completely adequate (accuracy of the reported extinction coefficients, difference in experimental conditions) and should be multiplied by 1.5 to obtain a satisfactory subtraction of the Tyr contributions.
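The three ingredients listed above can be combined as in the following minimal sketch. It assumes that the side chain contribution of a partially exchanged sample is a linear mixture of the undeuterated and deuterated model spectra weighted by the deuteration index, which is our reading of point 2 rather than a documented formula. The function names are hypothetical, and the factor of 1.5 is the empirical correction mentioned in the text.

```python
def side_chain_spectrum(undeuterated, deuterated, deuteration_index):
    """Mix the two model side chain spectra according to a deuteration index in [0, 1]."""
    d = min(max(deuteration_index, 0.0), 1.0)
    return [(1.0 - d) * h + d * x for h, x in zip(undeuterated, deuterated)]


def subtract_side_chains(spectrum, undeuterated, deuterated,
                         deuteration_index, coefficient, empirical_factor=1.5):
    """Subtract the scaled side chain contribution from one spectrum of the kinetics."""
    side = side_chain_spectrum(undeuterated, deuterated, deuteration_index)
    k = coefficient * empirical_factor
    return [s - k * c for s, c in zip(spectrum, side)]


# Two-point toy spectra, halfway through the side chain exchange.
corrected = subtract_side_chains([1.0, 1.0], [0.2, 0.0], [0.0, 0.4],
                                 deuteration_index=0.5, coefficient=1.0)
print(corrected)
```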

[Figure 4: absorbance (x10³) versus wavenumber (1750-1450 cm⁻¹).]

Figure 4. Spectra of the gastric ATPase obtained before (bottom) and after 80 minutes of exposure to ²H₂O. The baselines drawn are used to delimit the areas of the ester ν(C=O), amide I and amide II bands. Absorbance readings refer to scaled spectra.


[Figure 5: panels A and B, absorbance versus wavenumber (1750-1450 cm⁻¹); curves in panel B labelled Asn, Gln, Tyr, Arg, Asp, Glu, Tyr and Phe.]

Figure 5. Illustration of side chain contribution removal from an undeuterated spectrum (bottom) and from a spectrum deuterated for 80 min (top). The original (thin line) and corrected (thick line) spectra obtained after subtraction of the side chain contribution (dotted line) appear in panel A. The sum of the side chain contributions as well as their individual contributions appear in panel B for the spectra presented in panel A. Absorbance scales refer to scaled spectra.


3.2.1. Critical Appraisal of Side Chain Contribution Subtraction

While the method is useful when the amide II area has to be evaluated, its limitations need to be stressed. First, the reconstituted side chain contributions presented here are based on model compounds whose characteristics were measured a long time ago. The type of spectrometer, resolution and hydration levels are not considered. Second, the large variety of environments present in real proteins cannot be represented by the bands used. Third, the scaling for the subtraction is at best an approximation. Indeed, when we tried to improve secondary structure prediction, as explained in [32], by removing the side chain contributions in the amide I and amide II range, we observed instead a slight degradation of the predictions.

3.3. Spectral Rescaling

In ATR experiments the intensity of the spectra depends both on the amount of material and on the way the material is spread on the ATR crystal. The latter parameter is never fully under control, so scaling of the spectra is often required. Scaling is necessary to compare spectra, to average spectra, or to run PCA or correlation analyses. Typically, the area under a band or several bands is set to a given value. An alternative is to use the entire spectrum for scaling. In ¹H/²H exchange experiments, the additional hydration of the film induced upon contact with ²H₂O gas results in a swelling of the membrane film. As a result, the intensity of the whole spectrum decreases because of the exponential decay of the evanescent wave outside the germanium crystal. In order to account for this intensity decrease, which is not related to the deuteration process, the amide II areas (Figure 4) are rescaled with respect to either the lipid ν(C=O) band or the amide I band. Results are comparable for both bands (not shown). These two bands are equally affected by the swelling effect and are conveniently located close to the amide II band. This is important since the effect of the swelling on the absorbance is expected to depend on the wavelength, because the penetration depth of the evanescent wave is wavelength dependent (see equation 12 and Figure 5 in [1]).

3.4. Noise Level Check

Noise is defined here as the standard deviation in a segment of the spectrum after subtraction of a linear baseline. For noise estimation, a spectral region (preferably without absorbance) is defined, typically 2200-2100 cm⁻¹. A segment of defined length is moved along the spectrum. For every position, 1) a linear baseline is fitted to this segment and subtracted to account for any general tilt of the baseline and 2) the standard deviation around the mean is computed. This can be repeated in steps of short segments (e.g. 12 cm⁻¹) to minimize the contribution of broad bands when present. A rejection threshold should be set to eliminate spectra of bad quality automatically when studying large series of spectra. Scaling of the spectra, if used, should be applied before this step so that the standard deviations can be compared. Decent spectra should have a standard deviation below 1/1000 of the amide I intensity.
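The noise check described above can be sketched as follows. The text does not say how the values obtained for the individual segments are combined, so averaging them is an assumption made here. The function names and the default segment length of 12 points (12 cm⁻¹ at one point per wavenumber) are likewise illustrative.

```python
import math


def detrended_std(x, y):
    """Standard deviation of y around its least-squares straight line in x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    resid = [b - (my + slope * (a - mx)) for a, b in zip(x, y)]
    return math.sqrt(sum(r * r for r in resid) / (n - 1))


def noise_level(wn, y, lo=2100.0, hi=2200.0, segment=12):
    """Mean detrended standard deviation over consecutive segments of `segment` points."""
    pts = sorted((a, b) for a, b in zip(wn, y) if lo <= a <= hi)
    stds = []
    for start in range(0, len(pts) - segment + 1, segment):
        xs, ys = zip(*pts[start:start + segment])
        stds.append(detrended_std(xs, ys))
    return sum(stds) / len(stds)


def is_acceptable(noise, amide_i_intensity, threshold=1e-3):
    """Rejection rule: noise must stay below 1/1000 of the amide I intensity."""
    return noise < threshold * amide_i_intensity


# A tilted baseline with an alternating +/- 0.001 ripple in the 2200-2100 cm^-1 window.
wn = [2100.0 + i for i in range(101)]
y = [0.0001 * x + (0.001 if i % 2 == 0 else -0.001) for i, x in enumerate(wn)]
print(f"estimated noise: {noise_level(wn, y):.5f}")
```

Because a straight line is removed from every segment, a general tilt of the baseline does not inflate the estimate.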


4. Tools for the Analysis of FTIR Spectra

Numerous approaches are currently used for extracting the relevant information from FTIR spectra. Most often, series of spectra are produced, requiring some kind of analysis able to take into account the small variations that describe the dynamics of the system. Among the methods used, difference spectroscopy is one of the most powerful. As it is described in more detail elsewhere in the present book, the reader is referred to that chapter or to previous publications on this topic [22, 24, 33-35].

4.1. ¹H/²H Exchange Kinetics

Hydrogen isotope exchange has long been used for the analysis of protein structure and dynamics [36-38] (for a review see [39, 40]). When compared to mass spectrometry or ¹H/³H exchange, the great advantage of monitoring the exchange by FTIR is that the measurement is focused on the amide protons only, yielding data proportional to the number of residues in the protein. The interest of a correct and accurate analysis of the ¹H/²H exchange kinetics is twofold. First, the exchange rate contains important information on the structure and structural stability of a protein at a submolecular level. Second, the exchange acts as a perturbation that can help reveal otherwise hidden components in the spectra, as clearly illustrated in the past [13, 14] and more recently by Lorenz-Fonfria et al. [41]. Simultaneous use of polarized and ¹H/²H exchange experiments adds a further dimension to the analysis [42, 43]. Series of spectra recorded in the course of ¹H/²H exchange experiments are shown in Figure 6 before and after correction for the contributions of the side chains. Figure 7 reports H(t), the evolution of the number of ¹H amide protons remaining as a function of time. It is usual to fit the curve H(t) with a small number M of exponentials (typically 3), each representing a class Aj of amide groups (equation 1). The result of a curve fitting with three exponentials also appears in Figure 7.
The curve fitting indicates that 384 residues belong to the fast exchanging class (T = 1.2 min) and 804 to the slowly exchanging amide protons. Further experiments described in the literature report that hundreds of residues change their accessibility to exchange [44, 45] in the presence of ligands inducing an E1 or E2 conformation. Simultaneously, attenuated total reflection Fourier transform infrared experiments under a flowing buffer were carried out to modulate the environment of the protein inside the measurement cell. The high accuracy of the results made it possible to demonstrate that the E1 to E2 transition induces a net change in secondary structure that concerns 10-15 amino acid residues out of a total of 1324 in the protein [44].

Decomposition of the exchange curves as presented in Figure 7 supposes 3 distinct classes of amide protons. Alternatively, the inverse Laplace transform L⁻¹ immediately yields the shape of the distribution without any hypothesis on the number of classes:

$$f(k) = L^{-1}\{H(t)\} \qquad (2)$$

Knox and Rosenberg [46] suggested a dimensionless presentation of the distribution function, obtained after rewriting the integral expression of H(t):

$$H(t) = \int_{-\infty}^{+\infty} k\, f(k) \exp(-kt)\, d\ln(k) \qquad (3)$$

Solving the inverse Laplace transform is subject to several artifacts if not carefully treated [47]. We used here the CONTIN program kindly provided by Dr. Provencher [38, 48]. Figure 8 shows the time constant distribution for the experiment described in Figure 7.

[Figure 6: panels A and B, absorbance (x10³) versus wavenumber (cm⁻¹).]

Figure 6. Evolution with time of exposure to ²H₂O of the spectrum of the gastric H⁺,K⁺-ATPase in native tubulovesicle membranes. The first 10 spectra were recorded before the beginning of the deuteration. The spectra have been recorded after 0; 0.25; 0.50; 0.75; 1.00; 1.25; 1.50; 1.75; 2.00; 2.50; 3.00; 3.50; 4.00; 5; 6; 7; 8; 10; 12; 14; 16; 20; 24; 28; 32; 40; 48; 56; 64; 80; 96; 112; 128; 160; 192; 224; 256; 320; 384; 448 and 512 min. Spectra have been corrected for water vapor contribution. They are shown before (A) and after correction for side chain contribution (B). The direction of the main time evolutions is indicated by arrows.
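The recording times listed in the caption follow a simple rule: steps of 0.25 min up to 2 min, then four points per step size, with the step doubling each time (0.5, 1, 2, ... 64 min). The sketch below reproduces the list. The rule is inferred from the caption, and the function name is hypothetical.

```python
def exchange_time_points(t_end=512.0):
    """Recording times (min): 0.25 min steps up to 2 min, then 4 points per doubling step."""
    times = [0.25 * i for i in range(9)]  # 0, 0.25, ... 2.00
    step = 0.5
    while times[-1] < t_end:
        for _ in range(4):
            times.append(times[-1] + step)
        step *= 2.0
    return times


schedule = exchange_time_points()
print(len(schedule), "time points, last at", schedule[-1], "min")
```

Such a grid gives roughly the same number of points per decade of time, which suits the analysis of decays whose time constants span several orders of magnitude.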


[Figure 7: H(t) (%) versus time (min). Values tabulated on the figure:]

Class          T (min)   Proportion   Residue number
Fast           1.2       28.9%        384
Intermediate   8.3       12.9%        171
Slow           2275      60.5%        804

Figure 7. Evolution of the amide II/amide I area ratio (in %) for the gastric ATPase exposed to ²H₂O vapor. A curve fitting with 3 exponentials (equation 1) resulted in the line drawn through the experimental points (circles). The characteristics obtained for the 3 exponentials are tabulated on the Figure. Residue numbers come from the conversion of the percentages into amino acid residues, taking into account the 1324 residues present in the entire protein.

Figure 8. Distribution of the ¹H/²H exchange time constants k = 1/T for the gastric ATPase obtained after inverse Laplace transform (equation 3) of the curve presented in Figure 7.


So far, ¹H/²H exchange has demonstrated a unique sensitivity to structural changes. The question of which secondary structures are involved in the different classes of amide protons can be addressed in different ways. Amide I can be analyzed in detail to obtain the exchange rates of the different secondary structure types, as demonstrated earlier [28, 49]. Alternatively, since amide II is as sensitive as amide I to secondary structure ([32] and see below), the amide II decay at different wavenumbers yields information on different secondary structures. An inverse Laplace transform was performed on the decay observed at every other wavenumber between 1580 and 1520 cm⁻¹ for the data displayed in Figure 6. Results are reported in Figure 9. It can be observed in Figure 9 that the intermediate component exchanging in ca. 10 min is largely due to helices, whose maximum appears near 1545-1550 cm⁻¹ in the amide II, while the slower component near 1000 min has a maximum at lower wavenumbers, i.e. closer to the β-sheet contribution (near 1540 cm⁻¹ and below). Asynchronous correlations computed in the same spectral range confirm the presence of different secondary structures exchanging at different rates. Figure 10 reports the asynchronous map for the series of spectra plotted in Figure 6B. A clear maximum (indicated by a circle on the Figure) at 1540/1555 cm⁻¹ confirms the results of the inverse Laplace analysis. It must be noted that the synchronous map does not reveal this feature. Once the time constants present in the exchange process are known, it is relatively easy to express the entire series of spectra as a linear combination of three spectra, each weighted by an exponential decay as described in equation 1. A simple matrix inversion allows the extraction of the three spectra representing the exchange for the three time constants [50] (Figure 11).

[Figure 9 plot: contour map; axes: wavenumber (cm-1), 1580 to 1520, versus -log(k), 0 to 6.]

Figure 9. Contour plot of the inverse Laplace transform computed every other wavenumber between 1580 and 1520 cm-1 for the spectra reported in Figure 6B.


[Figure 10 plot: asynchronous map; both axes: wavenumber (cm-1), 1520 to 1580.]
Figure 10. Asynchronous map for the series of spectra plotted on Figure 6B. The circle indicates the 1540/1555 cm-1 peak.

[Figure 11 plot: absorbance (x 10-3) versus wavenumber (cm-1); three component spectra labelled T = 2275 min, T = 8.3 min and T = 1.2 min.]

Figure 11. Decomposition of the spectra from Figure 6B into 3 spectra representing the spectral variations occurring for each time constant revealed in Figure 7. The decomposition is obtained by linear regression as explained in [50].


The continuous shift of the negative and positive peaks in the amide I towards smaller wavenumbers is in line with helical structure exchanging first, then sheet structure exchanging more slowly. The similar shift in the amide II region confirms this interpretation. It must be kept in mind that the position of the original bands is different from the extrema found in the difference-shaped spectra. In conclusion, evaluation of 1H/2H exchange kinetics requires a thorough preparation of the data. We have presented here some of the key issues that should be taken into account and the main approaches used for the analysis of the data.

4.2. Secondary Structure Determination

Since the eighties a large number of methods to estimate protein secondary structure content via the analysis of FTIR spectra have been reported. Curve fitting was originally used, and an example of such an analysis is that published by Byler and Susi [51], in which protein amide I bands were analyzed by fitting with a series of Gaussian curves. The success reported in the original paper was spectacular: the RMS errors for α-helix and β-sheet were on the order of ca 2.5%. The curve fitting method compensates for band position variation within a given secondary structure assignment by assigning all component bands found in a given region of the spectrum to a particular structure. Used with Fourier self-deconvolution, this method can be highly effective when applied by one experienced in its use [13, 51-54]. Yet, curve fitting requires a series of subjective decisions that can dramatically affect both the results and the interpretation [13, 55, 56]. Furthermore, curve fitting has a tendency to overestimate the β-sheet content of primarily helical proteins, and routinely finds 15-20% β-sheet for proteins that actually have none [51, 54, 57-59].

Multivariate statistical analysis methods have proven to be a powerful alternative tool for the analysis of protein spectra, e.g. factor analysis [60-64], singular value decomposition [65], more sophisticated approaches such as the holistic approach developed in [66], multiple neural networks [67-70], or the enhanced prediction of secondary structure obtained by combining curve analysis and hydrogen/deuterium exchange [71], curve analysis and isotope editing [72] or curve analysis and temperature [73]. Both transmission [74, 75] and ATR-FTIR spectroscopy [1] have been reviewed extensively. A critical parameter that has not been taken into account systematically, except for genetic algorithms [76, 77] and the local regression method interval partial least-squares (iPLS) [78], is the selection of the wavenumbers used for building models. Including large wavenumber ranges involves wavenumbers that are not correlated with the particular secondary structure to be estimated and in turn results in a degradation of the prediction accuracy. The value of the various regions of the spectrum for the determination of the secondary structure content is still under discussion [66], and computation of spectra might shed some new light on this problem in the near future [79].

Today, the high quality of the FTIR spectrometers makes the absorbance at every single wavenumber almost noiseless. We address here the question of wavenumber information content and of the redundancy of the information present at different wavenumbers for secondary structure prediction. It appears that the absorbances at no more than 3 distinct wavenumbers contain all the non-redundant information that can be related to the content of one secondary structure. Addition of more spectral data points is useless or even degrades the prediction quality. Interestingly, wavenumber-by-wavenumber analyses identify the relevance of every wavenumber in the IR spectrum for the prediction of a given secondary structure and yield a particularly simple method for computing the secondary structure content, since a linear equation that contains the absorbance at only a few wavenumbers yields the best predictions.

We have previously built a protein database that covers as well as possible the α/β secondary structure space, the fold space as described by CATH (Class, Architecture, Topology and Homology classification of proteins) [80], as well as other structural features such as helix length and number of chains in a sheet. We identified 50 commercially available proteins that can be obtained with sufficient purity and for which we assessed the quality of the crystal-derived structure [81]. From this database we can address questions about the secondary structure information contained in the spectra. Figure 12 reports the error on the secondary structure prediction for the α-helix and β-sheet when building a simple linear model such as

Struct(%) = c1 + c2 · (absorbance at wavenumber i)

[Figure 12 plot: error on structure prediction (standard deviation) versus wavenumber (cm-1), overlaid on an absorbance spectrum; panels: α-helix with 1 wavenumber and with 2 wavenumbers, β-sheet with 1 wavenumber and with 2 wavenumbers.]

Figure 12. Evolution of the standard deviation on the predicted α-helix and β-sheet content (%) in the protein database when building a simple linear model based on the absorbance at one wavenumber or including two wavenumbers. Such models have been built for every wavenumber of the spectrum and reported in the Figure. A baseline has been subtracted at 1720, 1485, 1426, 1355, 1211 and 1010 cm-1 and the spectra were scaled between 1720 and 1485 cm-1.


The constants c1 and c2 were determined by linear regression for all the wavenumbers i and the models were evaluated by the error on the structure prediction. The profiles for the errors reported in Figure 12 present a view of the information contained in the spectra for every secondary structure. Surprisingly, the best wavenumber identified for α-helix prediction is 1545 cm-1 in the amide II. Amide III also contains very valuable information on the helix content. Yet, because amide III most often overlaps with buffer or lipid contributions, it remains less interesting in practice. For the β-sheet structure, the best prediction is obtained using the absorbance at 1628 cm-1. The model can be extended and improved by adding a second wavenumber. In the example of Figure 12 we have retained the best wavenumber identified as described above and screened for the additional gain in prediction accuracy when a second one is added. The curves presented for both structures indicate that most of the information was contained in the first wavenumber; the residual information is significant but rather small. These profiles emphasize the redundancy of the information content: for the α-helix structure, for instance, once the absorbance at 1545 cm-1 is included in the model the large predicting power of the other bands largely disappears. It must be emphasized that the profiles presented in Figure 12 do not allow conclusions to be drawn on the wavenumbers assigned to each structure. This is due to the fact that α-helix and β-sheet structures are complementary in any database. Predicting one already yields most of the information on the other. Obviously the profile reported for the α-helix structure has minima at wavenumbers associated with helix bands (1655 and 1545 cm-1) but also with bands associated with β-sheet (1694 and 1624 cm-1).
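The wavenumber-by-wavenumber screening described above can be sketched as follows. This is a minimal illustration, not the procedure of the original study: the function name `scan_one_wavenumber` is hypothetical, the error is taken here as the in-sample RMS residual (whether the published error was cross-validated is not stated in this section), and the "database" is synthetic.

```python
import numpy as np


def scan_one_wavenumber(spectra, structure, fixed=()):
    """Error profile of linear models using one extra wavenumber.

    spectra   : (n_proteins, n_wavenumbers) absorbances
    structure : (n_proteins,) secondary structure content in %
    fixed     : column indices already included in the model
    Returns the RMS residual obtained when each column i is added in turn.
    """
    spectra = np.asarray(spectra, dtype=float)
    structure = np.asarray(structure, dtype=float)
    n_proteins, n_wavenumbers = spectra.shape
    # Intercept (c1) plus any wavenumbers retained from a previous scan.
    base = [np.ones(n_proteins)] + [spectra[:, j] for j in fixed]
    errors = np.empty(n_wavenumbers)
    for i in range(n_wavenumbers):
        design = np.column_stack(base + [spectra[:, i]])
        coeffs, *_ = np.linalg.lstsq(design, structure, rcond=None)
        residual = structure - design @ coeffs
        errors[i] = np.sqrt(np.mean(residual ** 2))
    return errors


# Synthetic database: 50 "proteins", 20 "wavenumbers" (made-up numbers).
rng = np.random.default_rng(1)
demo_spectra = rng.normal(size=(50, 20))
demo_helix = 10.0 + 5.0 * demo_spectra[:, 7] + 2.0 * demo_spectra[:, 3]

# First scan: Struct(%) = c1 + c2 * A_i for every column i.
first_profile = scan_one_wavenumber(demo_spectra, demo_helix)
best_first = int(np.argmin(first_profile))
# Second scan: keep the best column and screen for a second one.
second_profile = scan_one_wavenumber(demo_spectra, demo_helix, fixed=(best_first,))
best_second = int(np.argmin(second_profile))
```

Plotting `first_profile` and `second_profile` against wavenumber would give curves of the kind shown in Figure 12. With so few adjustable constants per model, the in-sample error is a reasonable first look, but a leave-one-out estimate would be the safer choice for real data.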
The first principal component (Figure 13) obtained by principal component analysis (PCA) describes 63% of the variance between 1800 and 1400 cm-1 and is highly correlated with the α-helix and β-sheet content. It displays the α-helix feature, positive at 1655 and 1545 cm-1, on the one hand and the β-sheet feature, negative at 1694, 1629 and 1515 cm-1, on the other. The value of 1515 cm-1 is lower than expected, probably because of mixing with side chain contributions.
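A first principal component and the fraction of variance it describes can be obtained from a singular value decomposition of the mean-centered spectra. The sketch below assumes mean-centering without further scaling, which is not specified in the text; the function name and the synthetic data are hypothetical. Note that the sign of a principal component is arbitrary, so "positive" and "negative" features may appear inverted.

```python
import numpy as np


def first_principal_component(spectra):
    """Return (loading, scores, explained fraction) of the first PC.

    spectra : (n_proteins, n_wavenumbers) absorbances
    """
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: spectra are mean-centered but not variance-scaled.
    centered = spectra - spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return vt[0], u[:, 0] * s[0], explained[0]


# Synthetic spectra: one spectral pattern scaled by a made-up helix content.
rng = np.random.default_rng(2)
demo_helix = rng.uniform(0.0, 80.0, size=50)
demo_pattern = np.sin(np.linspace(0.0, 3.0 * np.pi, 60))
demo_spectra = (np.outer(demo_helix, demo_pattern) / 100.0
                + rng.normal(scale=0.01, size=(50, 60)))

loading, scores, fraction = first_principal_component(demo_spectra)
correlation = np.corrcoef(scores, demo_helix)[0, 1]
```

Correlating `scores` with the known structure content, as done in the last line, is one way to check a statement such as "highly correlated with the α-helix and β-sheet content".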

[Figure 13 plot: first principal component, absorbance versus wavenumber (cm-1); labelled features at 1694, 1655, 1629, 1586, 1567, 1545, 1515 and 1466 cm-1.]

Figure 13. First principal component obtained from the protein database used for building Figure 12.


[Figure 14 plot: three panels (synchronous correlation, asynchronous correlation, correlation coefficient); both axes: wavenumber (cm-1), 1700 to 1500; cross-peaks labelled α and β.]

Figure 14. Synchronous correlation, asynchronous correlation and correlation coefficient in the 1700-1500 cm-1 region computed on the spectra of the 50-protein database described above. α and β refer to a potential assignment to α-helix and β-sheet respectively. Negative values are plotted as thin lines.
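The three maps of Figure 14 can be sketched with the generalized two-dimensional correlation formalism of Noda [3-5]: the synchronous map is the covariance of the mean-centered spectra, the asynchronous map uses the Hilbert-Noda transformation matrix, and the correlation coefficient is the synchronous map rescaled by the standard deviation at each wavenumber. The function name is hypothetical and the test signals are synthetic. The asynchronous map depends on the order in which the spectra are arranged; how the database spectra were ordered is not stated here, so that choice is left to the caller.

```python
import numpy as np


def correlation_maps(spectra):
    """Synchronous, asynchronous and correlation-coefficient maps.

    spectra : (n_spectra, n_wavenumbers), rows in the order of the
              perturbation variable (time, composition, ...).
    """
    y = np.asarray(spectra, dtype=float)
    m = y.shape[0]
    dynamic = y - y.mean(axis=0)          # mean-centered "dynamic" spectra
    sync = dynamic.T @ dynamic / (m - 1)
    # Hilbert-Noda matrix: 0 on the diagonal, 1 / (pi * (k - j)) elsewhere.
    j, k = np.indices((m, m))
    diff = k - j
    safe = np.where(diff == 0, 1, diff)   # avoid dividing by zero
    hilbert_noda = np.where(diff == 0, 0.0, 1.0 / (np.pi * safe))
    asyn = dynamic.T @ hilbert_noda @ dynamic / (m - 1)
    # Rescale by the standard deviation at each wavenumber
    # (a column with zero variance would give a division by zero).
    std = np.sqrt(np.diag(sync))
    coef = sync / np.outer(std, std)
    return sync, asyn, coef


# Synthetic signals: two in-phase columns and one out-of-phase column.
t = np.linspace(0.0, 6.0 * np.pi, 300)
demo = np.column_stack([np.sin(t), 2.0 * np.sin(t), np.cos(t)])
sync, asyn, coef = correlation_maps(demo)
```

In this example the two in-phase columns give a correlation coefficient of 1 and no asynchronous cross-peak, while the out-of-phase column gives a clear asynchronous cross-peak. The rescaling in `coef` is what makes small-amplitude variations as visible as large ones.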


Synchronous and asynchronous maps shed more light on the correlation features present in the 50 protein spectra. Figure 14 displays some features of the correlation coefficient, synchronous and asynchronous maps. Following the dotted line at 1545 cm-1 from the bottom, we can read that this band is negatively correlated with 1695, 1625 and 1520 cm-1, all characteristic of β-sheet structures, and positively correlated with 1655 cm-1, a wavenumber characteristic of the helix structure. The synchronous map gives the same reading but misses the 1695 cm-1 band. The asynchronous map reveals additional cross-peaks at 1640 and 1580 cm-1, belonging more likely to another structure (turn, random) and/or to side chain contributions. The 6 bands crossed when moving vertically along the dotted line drawn at 1545 cm-1 are the six major features that can be distinguished on the maps. A vertical dotted line has been drawn at these wavenumbers in Figure 14. This short discussion indicates that the correlation coefficients and the asynchronous maps bring complementary information. They allowed the identification of 6 major distinct contributions in the amide I/amide II region of the spectrum. The synchronous and the correlation coefficient maps are very similar in nature, but the sensitivity of the correlation coefficient is much better as variations at every wavenumber are rescaled by the variance at the same wavenumber. In turn, small-amplitude variations are as significant as the largest ones. The 1695 cm-1 β-sheet band, for instance, appears strongly on the correlation coefficient map but not on the synchronous map. The unassigned band at 1585 cm-1 only appears on the correlation coefficient map. The latter is likely related to side chain contributions.

4.3. Orientation

Determination of molecular orientations from polarized ATR-FTIR has been reviewed previously [1, 14, 54] and a deep insight into the dipole orientation and geometry of the secondary structures has been provided [82-86]. The polarized ATR-FTIR approach is extremely powerful as it allows the determination of the orientation of membrane molecules including the lipids, proteins [43, 87-101], peptides [91, 102-110] or drugs [111, 112] simultaneously, on the same sample, without labeling. The only question we will address here is that of the quality of the orientation of the membranes. The measured order parameter S, denoted Sexperimental, obtained from RATR as explained elsewhere [1, 14, 113, 114], can generally be expressed as the product of several order parameters related to a set of nested, uniaxially symmetric distributions [115]. Under this condition

$$S_{\mathrm{experimental}} = S_{\mathrm{membrane}} \cdot S_{\mathrm{helix}} \cdot S_{\mathrm{dipole}} \qquad (4)$$

where Smembrane describes the distribution of the lipid membrane patches (smallest planar membrane unit) with respect to the internal reflection element, Shelix describes the orientation of the helices within the membrane plane and Sdipole describes the dipole orientation of either amide I or amide II with respect to the helix axis. A schematic representation of these nested distributions appears in Figure 15. Each contribution to the product can be seen itself as the product of the mean tilt contribution by the contribution of the disorder characterized by the distribution of the angular values about their mean.
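As a worked illustration of equation 4, the sketch below isolates Shelix from a measured order parameter and converts it into a tilt angle. It assumes that each distribution is sharply peaked at a single angle, so that its order parameter is (3 cos^2(angle) - 1)/2, i.e. the disorder contribution is ignored. All numerical values and function names are hypothetical and serve only to show the arithmetic.

```python
import math


def order_parameter(angle_deg):
    """S = (3 cos^2(angle) - 1) / 2 for one sharply defined angle."""
    c = math.cos(math.radians(angle_deg))
    return 0.5 * (3.0 * c * c - 1.0)


def helix_order_parameter(s_experimental, s_membrane, s_dipole):
    """Solve equation 4 for S_helix."""
    return s_experimental / (s_membrane * s_dipole)


def tilt_from_order_parameter(s):
    """Angle (degrees) of a disorder-free distribution with order parameter s.

    Valid for -0.5 <= s <= 1.
    """
    return math.degrees(math.acos(math.sqrt((2.0 * s + 1.0) / 3.0)))


# Hypothetical numbers, for illustration only.
s_dipole = order_parameter(30.0)   # assumed angle between dipole and helix axis
s_helix = helix_order_parameter(0.40, 0.95, s_dipole)
helix_tilt = tilt_from_order_parameter(s_helix)
```

Because the three factors multiply, any overestimate of Smembrane or Sdipole translates directly into an underestimate of Shelix, which is why the quality of the membrane orientation matters for the final tilt angle.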


Figure 15. Set of nested axially symmetric distributions. The membrane normal is distributed about the germanium crystal normal (angle γ), the secondary structure axis about the membrane normal (angle β) and the transition dipole moment about the secondary structure axis (angle α).

Remarkably, because of the symmetry of the experiment, it can be demonstrated that only two Legendre polynomials, P0 and P2, contribute to the IR dichroism [14, 115]. The coefficient <Pn> can be evaluated as

$$\langle P_n \rangle = \frac{\int_0^{\pi/2} D(\gamma)\, P_n(\cos\gamma)\, \sin\gamma \, d\gamma}{\int_0^{\pi/2} D(\gamma)\, \sin\gamma \, d\gamma} \qquad (5)$$

It means that whatever the shape D(γ) of the distribution of the tilts about the mean value γ0, the <P0> and <P2> coefficients fully describe the IR dichroism even if they may poorly describe the angular distribution. The value of the coefficient <P2> is usually called the order parameter, S, as it describes the disordering around the mean value for P-ATR experiments. In a recent paper [116], D(γ) for the membranes was directly measured from AFM images and its projection on P2, i.e. <P2>=Smembrane, evaluated according to equation 5. The membranes used were the intracytoplasmic tubulovesicles bearing the H+,K+-ATPase directly extracted from pig stomachs. These membranes represent an example of native membranes, as opposed to better ordered systems built with synthetic lipids. Details about the preparation and characterization of the tubulovesicles can be found elsewhere [93, 95, 117-120]. The spectral intensity and linear dichroism were measured for average thicknesses ranging between 0 and 100 bilayers. Height profiles were obtained by atomic force microscopy (AFM) along lines drawn randomly through the image (Figure 16). Orientation distribution functions were obtained from the slopes and decomposed into Legendre polynomials. It was found that the second Legendre polynomial coefficient characterizing the membrane orientation was always larger than 0.9 [116]. Remarkably, addition of tubulovesicle membranes in small amounts (about just enough to cover the area) smears out the roughness of the germanium surface resulting from the polishing. Further addition of membrane material fully hides the polishing grooves. In conclusion, it appears that even for natural membranes, the disordering of the membrane on a clean germanium crystal is quite small and can be ignored. Thermally induced bending fluctuations as described by Marsh, Shanmugavadivu and Kleinschmidt [121] will have little impact in stacks of membranes prepared in the absence of an excess of water.

Figure 16. Slope distributions obtained on a 5x5 µm2 image for a tubulovesicle multilayer stack.
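Equation 5 is straightforward to evaluate numerically once a tilt distribution D(γ) is available on a grid of angles. The sketch below uses a simple trapezoidal rule; the function name is hypothetical, and the two distributions are made up (an isotropic one and a Gaussian with a 5 degree width), not the measured AFM data of [116]. Converting measured AFM slopes into tilt angles, for instance as γ = arctan|slope|, would be a further assumption not covered here.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_average(gamma, density, n):
    """<P_n> of equation 5, integrated numerically over gamma in [0, pi/2].

    gamma   : increasing angles in radians covering 0 .. pi/2
    density : D(gamma) sampled at those angles (any normalization)
    """
    gamma = np.asarray(gamma, dtype=float)
    density = np.asarray(density, dtype=float)
    # Coefficient vector selecting the n-th Legendre polynomial.
    p_n = legendre.legval(np.cos(gamma), [0.0] * n + [1.0])
    weight = density * np.sin(gamma)

    def trapezoid(y):
        return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(gamma))

    return trapezoid(p_n * weight) / trapezoid(weight)


gamma = np.linspace(0.0, np.pi / 2.0, 2001)
isotropic = np.ones_like(gamma)
sigma = np.radians(5.0)  # hypothetical 5 degree spread about the crystal normal
narrow = np.exp(-0.5 * (gamma / sigma) ** 2)

s_isotropic = legendre_average(gamma, isotropic, 2)   # close to 0
s_narrow = legendre_average(gamma, narrow, 2)          # close to 1
```

An isotropic distribution gives <P2> near zero, whereas a distribution confined to a few degrees about the normal gives a value close to 1, which is the regime reported above for the tubulovesicle stacks (Smembrane larger than 0.9).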

5. Conclusions

Modern recording techniques allow the recording of hundreds or thousands of spectra every day. The strength of FTIR relies precisely on its unique capability to detect small differences in spectra, either recorded on different objects (imaging produces thousands of spectra in a matter of minutes) or in the course of a reaction. The present challenge is to handle these spectra. Corrections for water vapor, smoothing and detection of outliers must be automated to be of practical interest. Similarly, spectral investigations rely upon correlation analyses, decomposition into components that have a special meaning, or advanced statistics. The questions to be answered vary from one problem to the next. In turn, great flexibility is required. Hypotheses must be put forward and tested. In this review we have presented a few examples of specific analyses related to specific problems. Other problems will raise other questions, other hypotheses and other testing. We have found it most useful in recent years to build the ability to handle the spectra in a flexible programming environment.


6. References
[1] Goormaghtigh,E., V.Raussens, and J.M.Ruysschaert. Biochim.Biophys.Acta 1422 (1999), 105-185.
[2] Bechinger,B., J.M.Ruysschaert, and E.Goormaghtigh. Biophys.J. 76 (1999), 552-563.
[3] Noda,I. Appl.Spectrosc. 44 (1990), 550-561.
[4] Noda,I. Appl.Spectrosc. 47 (1993), 1329-1336.
[5] Noda,I., A.E.Dowrey, C.Marcott, G.M.Story, and Y.Ozaki. Appl.Spectrosc. 54 (2000), 236A-248A.
[6] Nabet,A. and M.Pezolet. Appl.Spectrosc. 51 (1997), 466-469.
[7] Sasic,S., A.Muszynski, and Y.Ozaki. Appl.Spectrosc. 55 (2001), 343-349.
[8] Noda,I., A.E.Dowrey, and C.Marcott. Appl.Spectrosc. 47 (1993), 1317-1323.
[9] Ekgasit,S. and H.Ishida. Appl.Spectrosc. 49 (1995), 1243-1253.
[10] Kauppinen,J.K., D.J.Moffat, H.H.Mantsch, and D.G.Cameron. Anal.Chem. 53 (1981), 1454-1457.
[11] Kauppinen,J.K., D.J.Moffat, H.H.Mantsch, and D.G.Cameron. Appl.Spectrosc. 35 (1981), 271-276.
[12] Kauppinen,J.K., D.J.Moffat, and H.H.Mantsch. Appl.Opt. 20 (1981), 1866-1880.
[13] Goormaghtigh,E., V.Cabiaux, and J.M.Ruysschaert. Subcell.Biochem. 23 (1994), 405-450.
[14] Goormaghtigh,E. and J.M.Ruysschaert. 1990. In Molecular description of biological membrane components by computer-aided conformational analysis. R.Brasseur, editor. CRC Press, Boca Raton FL. 285-329.
[15] Lorenz-Fonfria,V.A., J.Villaverde, and E.Padros. Appl.Spectrosc. 56 (2002), 232-242.
[16] Goormaghtigh,E. and J.M.Ruysschaert. Spectrochim.Acta 50A (1994), 2137-2144.
[17] Bruun,S.W., A.Kohler, I.Adt, G.D.Sockalingum, M.Manfait, and H.Martens. Appl.Spectrosc. 60 (2006), 1029-1039.
[18] Venyaminov,S.Y. and N.N.Kalnin. Biopolymers 30 (1991), 1243-1257.
[19] Chirgadze,Y.N., O.V.Fedorov, and N.P.Trushina. Biopolymers 14 (1975), 679-694.
[20] Goormaghtigh,E., V.Cabiaux, and J.M.Ruysschaert. Subcell.Biochem. 23 (1994), 329-362.
[21] Barth,A. Progress in Biophysics & Molecular Biology 74 (2000), 141-173.
[22] Barth,A. and C.Zscherp. Quarterly Rev.Biophys. 35 (2002), 369-430.
[23] Goormaghtigh,E., H.H.de-Jongh, and J.M.Ruysschaert. Appl.Spectrosc. 50 (1996), 1519-1527.
[24] Barth,A. Biochim.Biophys.Acta 1767 (2007), 1073-1101.
[25] Bush,M.F., M.W.Forbes, R.A.Jockusch, J.Oomens, N.C.Polfer, R.J.Saykally, and E.R.Williams. J.Phys.Chem. A 111 (2007), 7753-7760.
[26] Forbes,M.W., M.F.Bush, N.C.Polfer, J.Oomens, R.C.Dunbar, E.R.Williams, and R.A.Jockusch. J.Phys.Chem. A 111 (2007), 11759-11770.
[27] de-Jongh,H.H., E.Goormaghtigh, and J.M.Ruysschaert. Biochemistry 34 (1995), 172-179.
[28] de-Jongh,H.H., E.Goormaghtigh, and J.M.Ruysschaert. Biochemistry 36 (1997), 13593-13602.
[29] Chirgadze,Y.N., B.V.Shestopalov, and S.Y.Venyaminov. Biopolymers 12 (1973), 1337-1351.
[30] Chirgadze,Y.N. and E.V.Brazhnikov. Biopolymers 13 (1974), 1701-1712.
[31] Venyaminov,S.Y. and N.N.Kalnin. Biopolymers 30 (1990), 1259-1271.
[32] Goormaghtigh,E., J.M.Ruysschaert, and V.Raussens. Biophys.J. 90 (2006), 2946-2957.
[33] Liu,M., M.Krasteva, and A.Barth. Biophys.J. 89 (2005), 4352-4363.
[34] Stolz,M., E.Lewitzki, D.Thoenges, W.Mantele, A.Barth, and E.Grell. J.Gen.Physiol. 126 (2005), 32A.
[35] Ritter,M., O.Anderka, B.Ludwig, W.Mantele, and P.Hellwig. Biochemistry 42 (2003), 12391-12399.
[36] Gregory,R.B. and R.Lumry. Biopolymers 24 (1985), 301-326.
[37] Knox,D.G. and A.Rosenberg. Biopolymers 19 (1980), 1049-1068.
[38] Provencher,S.W. and V.G.Dovi. Can J.Biochem. 1 (1979), 313-318.
[39] Englander,S.W. and N.R.Kallenbach. Q.Rev.Biophys. 16 (1984), 521-655.
[40] Kim,P.S. Methods in Enzymology 131 (1986), 136-156.
[41] Lorenz-Fonfria,V.A., J.Villaverde, and E.Padros. Appl.Spectrosc. 56 (2002), 232-242.
[42] Garczarek,F. and K.Gerwert. Journal of the American Chemical Society 128 (2006), 28-29.
[43] Grimard,V., C.Vigano, A.Margolles, R.Wattiez, H.W.van-Veen, W.N.Konings, J.M.Ruysschaert, and E.Goormaghtigh. Biochemistry 40 (2001), 11876-11886.
[44] Scheirlinckx,F., R.Buchet, J.M.Ruysschaert, and E.Goormaghtigh. Eur.J.Biochem. 268 (2001), 3644-3653.
[45] Vigano,C., M.Smeyers, V.Raussens, F.Scheirlinckx, J.M.Ruysschaert, and E.Goormaghtigh. Biopolymers 74 (2004), 19-26.
[46] Knox,D.G. and A.Rosenberg. Biopolymers 19 (1980), 1049-1068.
[47] Provencher,S.W. and V.G.Dovi. J.Biochem.Biophys.Methods 1 (1979), 313-318.
[48] Provencher,S.W. Computer Physics Communications 27 (1982), 213-227.
[49] de-Jongh,H.H., E.Goormaghtigh, and J.M.Ruysschaert. Biochemistry 36 (1997), 13603-13610.
[50] Raussens,V., J.M.Ruysschaert, and E.Goormaghtigh. Appl.Spectrosc. 58 (2004), 68-82.
[51] Byler,D.M. and H.Susi. Biopolymers 25 (1986), 469-487.
[52] Cabiaux,V., R.Brasseur, R.Wattiez, P.Falmagne, J.M.Ruysschaert, and E.Goormaghtigh. J.Biol.Chem. 264 (1989), 4928-4938.
[53] Prestrelski,S.J., K.A.Pikal, and T.Arakawa. Pharm.Res. 12 (1995), 1250-1259.
[54] Goormaghtigh,E., V.Cabiaux, and J.M.Ruysschaert. Eur.J.Biochem. 193 (1990), 409-420.
[55] Jackson,M. and H.H.Mantsch. Crit.Rev.Biochem.Mol.Biol. 30 (1995), 95-120.
[56] Surewicz,W.K. and H.H.Mantsch. Biochim.Biophys.Acta 952 (1988), 115-130.
[57] Haris,P.I., D.Chapman, and G.Benga. Eur.J.Biochem. 233 (1995), 659-664.
[58] Jap,B.K., M.F.Maestre, S.B.Hayward, and R.M.Glaeser. Biophys.J. 43 (1983), 81-89.
[59] Van Hoek,A.N., M.Wiener, S.Bicknese, L.Miercke, J.Biwersi, and A.S.Verkman. Biochemistry 32 (1993), 11847-11856.
[60] Baumruk,V., P.Pancoska, and T.A.Keiderling. J.Mol.Biol. 259 (1996), 774-791.
[61] Lee,D.C., P.I.Haris, D.Chapman, and R.C.Mitchell. Biochemistry 29 (1990), 9185-9193.
[62] Pribic,R., I.H.van Stokkum, D.Chapman, P.I.Haris, and M.Bloemendal. Anal.Biochem. 214 (1993), 366-378.
[63] Sarver,R.W., Jr. and W.C.Krueger. Anal.Biochem. 199 (1991), 61-67.
[64] Dousseau,F. and M.Pezolet. Biochemistry 29 (1990), 8771-8779.
[65] Rahmelow,K. and W.Hubner. Anal.Biochem. 241 (1996), 5-13.
[66] Vedantham,G., H.G.Sparks, S.U.Sane, S.Tzannis, and T.M.Przybycien. Anal.Biochem. 285 (2000), 33-49.
[67] Hering,J.A., P.R.Innocent, and P.I.Haris. Proteomics 2 (2002), 839-849.
[68] Hering,J.A., P.R.Innocent, and P.I.Haris. Spectroscopy-An International Journal 16 (2002), 53-69.
[69] Severcan,M., P.I.Haris, and F.Severcan. Anal.Biochem. 332 (2004), 238-244.
[70] Hering,J.A., P.R.Innocent, and P.I.Haris. Proteomics 4 (2004), 2310-2319.
[71] Baello,B.I., P.Pancoska, and T.A.Keiderling. Anal.Biochem. 280 (2000), 46-57.
[72] Venyaminov,S.Y., J.F.Hedstrom, and F.G.Prendergast. Proteins 45 (2001), 81-89.
[73] Arrondo,J.L., J.Castresana, J.M.Valpuesta, and F.M.Goni. Biochemistry 33 (1994), 11650-11655.
[74] Arrondo,J.L.R., I.Etxabe, U.Dornberger, and F.M.Goni. Biochem.Soc.Trans. 22 (1994), S380.
[75] Arrondo,J.L.R. and F.M.Goni. Prog.Biophys.Mol.Biol. 72 (1999), 367-405.
[76] Smith,B.M. and S.Franzen. Anal.Chem. 74 (2002), 4076-4080.
[77] Smith,B.M., L.Oswald, and S.Franzen. Anal.Chem. 74 (2002), 3386-3391.
[78] Navea,S., R.Tauler, and A.de Juan. Anal.Biochem. 336 (2005), 231-242.
[79] Brauner,J.W., C.R.Flach, and R.Mendelsohn. J.Am.Chem.Soc. 127 (2005), 100-109.
[80] Orengo,C.A., A.D.Michie, S.Jones, D.T.Jones, M.B.Swindells, and J.M.Thornton. Structure 5 (1997), 1093-1108.
[81] Oberg,K.A., J.M.Ruysschaert, and E.Goormaghtigh. Protein Science 12 (2003), 2015-2031.
[82] Marsh,D. Biophys.J. 72 (1997), 2710-2718.
[83] Marsh,D., M.Muller, and F.J.Schmitt. Biophys.J. 78 (2000), 2499-2510.
[84] Marsh,D., M.Muller, and F.J.Schmitt. Biophys.J. 78 (2000), 2499-2510.
[85] Marsh,D. and T.Pali. Biophys.J. 80 (2001), 305-312.
[86] Marsh,D. J.Mol.Biol. 338 (2004), 353-367.
[87] Alegre-Cebollada,J., A.M.del Pozo, J.G.Gavilanes, and E.Goormaghtigh. Biophys.J. 93 (2007), 3191-3201.
[88] Challou,N., E.Goormaghtigh, V.Cabiaux, K.Conrath, and J.M.Ruysschaert. Biochemistry 33 (1994), 6902-6910.
[89] Goormaghtigh,E., L.Vigneron, M.Knibiehler, C.Lazdunski, and J.M.Ruysschaert. Eur.J.Biochem. 202 (1991), 1299-1305.
[90] Le-Saux,A., J.M.Ruysschaert, and E.Goormaghtigh. Biophys.J. 80 (2001), 324-330.
[91] Lopes,S.C.D.N., E.Goormaghtigh, B.J.C.Cabral, and M.A.R.B.Castanho. J.Am.Chem.Soc. 126 (2004), 5396-5402.
[92] Raussens,V., V.Narayanaswami, E.Goormaghtigh, R.O.Ryan, and J.M.Ruysschaert. J.Biol.Chem. 270 (1995), 12542-12547.
[93] Raussens,V., J.M.Ruysschaert, and E.Goormaghtigh. J.Biol.Chem. 272 (1997), 262-270.
[94] Raussens,V., C.A.Fisher, E.Goormaghtigh, R.O.Ryan, and J.M.Ruysschaert. J.Biol.Chem. 273 (1998), 25825-25830.
[95] Raussens,V., H.de-Jongh, M.Pezolet, J.M.Ruysschaert, and E.Goormaghtigh. Eur.J.Biochem. 252 (1998), 261-267.
[96] Raussens,V., J.Drury, T.M.Forte, N.Choy, E.Goormaghtigh, J.M.Ruysschaert, and V.Narayanaswami. Biochem.J. 387 (2005), 747-754.
[97] Sonveaux,N., A.B.Shapiro, E.Goormaghtigh, V.Ling, and J.M.Ruysschaert. J.Biol.Chem. 271 (1996), 24617-24624.
[98] Sturgis,J., B.Robert, and E.Goormaghtigh. Biophys.J. 74 (1998), 988-994.
[99] Vigneron,L., J.M.Ruysschaert, and E.Goormaghtigh. J.Biol.Chem. 270 (1995), 17685-17696.
[100] Wald,J.H., E.Goormaghtigh, J.De-Meutter, J.M.Ruysschaert, and A.Jonas. J.Biol.Chem. 265 (1990), 20044-20050.
[101] Abrecht,H., E.Goormaghtigh, J.M.Ruysschaert, and F.Homble. J.Biol.Chem. 275 (2000), 40992-40999.
[102] Aisenbrey,C., E.Goormaghtigh, J.M.Ruysschaert, and B.Bechinger. Molecular Membrane Biology 23 (2006), 363-374.
[103] Aisenbrey,C., R.Kinder, E.Goormaghtigh, J.M.Ruysschaert, and B.Bechinger. J.Biol.Chem. 281 (2006), 7708-7716.
[104] Demel,R.A., E.Goormaghtigh, and B.de-Kruijff. Biochim.Biophys.Acta 1027 (1990), 155-162.
[105] Haro,A., M.Velez, E.Goormaghtigh, S.Lago, J.Vazquez, D.Andreu, and M.Gasset. J.Biol.Chem. 278 (2003), 3929-3936.
[106] Houbiers,M.C., C.J.Wolfs, R.B.Spruijt, Y.J.Bollen, M.A.Hemminga, and E.Goormaghtigh. Biochim.Biophys.Acta 1511 (2001), 224-235.
[107] Houbrechts,A., B.Moreau, R.Abagyan, V.Mainfroid, G.Preaux, A.Lamproye, A.Poncin, E.Goormaghtigh, J.M.Ruysschaert, and J.A.Martial. Protein Eng. 8 (1995), 249-259.
[108] Leenhouts,J.M., Z.Torok, V.Mandieau, E.Goormaghtigh, and B.de-Kruijff. FEBS Lett. 388 (1996), 34-38.
[109] Lopes,S.C.D.N., C.M.Soares, A.M.Baptista, E.Goormaghtigh, B.J.C.Cabral, and M.A.R.B.Castanho. J.Phys.Chem. B 110 (2006), 3385-3394.
[110] Martin,I., E.Goormaghtigh, and J.M.Ruysschaert. Biochim.Biophys.Acta (Biomembranes) 1614 (2003), 97-103.
[111] Fa,N., S.Ronkart, A.Schanck, M.Deleu, A.Gaigneaux, E.Goormaghtigh, and M.P.Mingeot-Leclercq. Chem.Phys.Lipids 144 (2006), 108-116.
[112] Goormaghtigh,E., R.Brasseur, P.Huart, and J.M.Ruysschaert. Biochemistry 26 (1987), 1789-1794.
[113] Fringeli,U.P. and H.H.Günthard. Mol.Biol.Biochem.Biophys. 31 (1981), 270-332.
[114] Harrick,N.J. 1967. Interscience Publishers, New York.
[115] Rothschild,K.J. and N.A.Clark. Biophys.J. 25 (1979), 473-487.
[116] Ivanov,D., N.Dubreuil, V.Raussens, J.M.Ruysschaert, and E.Goormaghtigh. Biophys.J. 87 (2004), 1307-1315.
[117] Scheirlinckx,F., V.Raussens, J.M.Ruysschaert, and E.Goormaghtigh. Biochem.J. 382 (2004), 121-129.
[118] Raussens,V., M.Pezolet, J.M.Ruysschaert, and E.Goormaghtigh. Eur.J.Biochem. 262 (1999), 176-183.
[119] Raussens,V., M.le Maire, J.M.Ruysschaert, and E.Goormaghtigh. FEBS Lett. 437 (1998), 187-192.
[120] Raussens,V., V.Narayanaswami, E.Goormaghtigh, R.O.Ryan, and J.M.Ruysschaert. J.Biol.Chem. 271 (1996), 23089-23095.
[121] Marsh,D., B.Shanmugavadivu, and J.H.Kleinschmidt. Biophys.J. 91 (2006), 227-232.

Biological and Biomedical Infrared Spectroscopy
A. Barth and P.I. Haris (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-045-2-129


FTIR Spectroscopy for Analysis of Protein Secondary Structure


Joachim A. HERING b and Parvez I. HARIS a,1
a Faculty of Health and Life Sciences, De Montfort University, The Gateway, Leicester, LE1 9BH, UK
b Department of Computer Science, University of Applied Sciences Ulm, Prittwitzstraße 10, 89075 Ulm, Germany

Abstract. One of the major challenges of the post-genomic era is the rapid characterisation of protein structure. High-throughput structural genomic projects involving X-ray crystallography and NMR spectroscopy are in progress to solve the three-dimensional structures of a large number of proteins. These techniques have their advantages and disadvantages and cannot be applied to study all proteins, giving sufficient opportunity for other techniques to also play a significant role in proteomics research. Fourier transform infrared (FTIR) spectroscopy is one of the techniques that has gained popularity in this area since measurements on small quantities of proteins can be carried out very rapidly in various environments. However, there is a need for improvements in the interpretation of protein FTIR spectra and for the development of methods for accurately quantifying protein secondary structure from infrared spectra of proteins. Over the years, much progress has been made in this area, and here we provide an overview of the major methods developed so far, along with their strengths and weaknesses. The particular focus of the Chapter is on methods used for quantitative prediction of secondary structure from infrared spectra.

Keywords. Secondary structure, FTIR spectroscopy, Neural networks, Multivariate analysis, Curve-fitting

1 Correspondence should be addressed to Parvez I. Haris, School of Molecular Sciences, De Montfort University, The Gateway, Leicester, LE1 9BH, United Kingdom, E-mail: pharis@dmu.ac.uk.

1. Introduction

FTIR spectroscopy has been applied to study the secondary structure of proteins in aqueous solution for about 60 years [1-3]. Elliot and Ambrose were the first to demonstrate that infrared spectral data may be used to obtain information on protein secondary structure [1,4,5]. They showed that an empirical correlation exists between the amide I and amide II absorption bands of a protein and its secondary structure (α-helix and β-sheet) content as determined by X-ray crystallography. Since then, FTIR spectroscopy as a technique to determine the secondary structure of proteins has become increasingly popular, especially in situations where methods such as X-ray diffraction [6] and nuclear magnetic resonance (NMR) spectroscopy cannot be readily applied [7]. Although X-ray crystallography and NMR are very precise and capable of determining the protein secondary and tertiary structure at atomic resolution, they are not free of limitations. Before X-ray crystallography studies can be performed to determine the three-dimensional structure of a protein at high resolution, a well-ordered crystal of the molecule is required. However, this is not possible for all proteins (e.g., the vast majority of membrane proteins). Even if it is possible, obtaining suitable protein crystals is still a difficult, time-consuming task and there is a potential risk that the conditions used for crystallisation of a protein may change its structure. Although multidimensional NMR spectroscopy, as an alternative to X-ray crystallography, allows protein structure determination in solution, it is currently limited to the examination of small proteins with approximately 200 amino acid residues. Additionally, high concentrations of proteins are required for NMR analysis. These limitations have led to the development of alternative methods that do not work at atomic resolution but are still capable of providing protein secondary structural information. These methods include vibrational (FTIR, Raman) and circular dichroism (CD) spectroscopy. CD spectroscopy, as an alternative to FTIR spectroscopy, has been widely applied for determination of protein secondary structure [8-85]. In contrast to FTIR spectroscopy, CD spectroscopy cannot be applied over a wide range of concentrations and is limited to optically clear solutions. Additionally, β-sheet and random coil structures have relatively small CD signals, making their interpretation more prone to errors. In contrast, FTIR spectroscopy is not limited by protein size or by the physical state of the sample, allowing the examination of protein secondary structure in a variety of environments. This makes FTIR spectroscopy one of the few techniques that can be used to study the role of the surrounding environment on protein conformation.
Additionally, high-quality FTIR spectra can be obtained relatively easily without problems of background fluorescence and light scattering. FTIR spectroscopy may also be used as a tool to distinguish between native and aggregated (unfolded) proteins [86]. All these advantages of FTIR spectroscopy, and the fact that the cost of the required equipment is relatively low, have led to the popularity of FTIR spectroscopy for protein secondary structure quantification [2,86–91]. This chapter focuses on the various methods that are currently used for the prediction of secondary structure from protein infrared spectra. An excellent review by Barth (2007) covers theoretical and practical aspects of protein infrared spectroscopy [92]. See Chapter 1 for a discussion of the historical development of infrared spectroscopy. Chapter 4 also discusses in some detail the methods used for the analysis of protein infrared spectra, including data processing techniques.

1.1. Amide Vibrations

The great majority of FTIR studies of protein secondary structure use the amide bands. Altogether, up to nine characteristic bands, named amide A, B, I, II, ..., VII, have been identified [93]. However, due to technical and theoretical limitations, only the amide I, II, and III bands are used to investigate the secondary structure of proteins. A description of the amide modes has been given by Miyazawa et al. as well as by Krimm et al. [94–103]. It is assumed that the exact frequencies of the amide I and II absorptions are influenced by the strength of any hydrogen bonds involving amide C=O and N-H groups. Since each secondary structural conformation is associated with a characteristic hydrogen bonding pattern between these groups, each type of secondary structure gives rise to different frequencies at which the amide bond vibrations occur, resulting in characteristic amide I and II absorptions. It is this separation of the amide absorptions that forms the basis of protein structure quantification from FTIR spectra of proteins. Correlation between amide frequency and protein secondary structure has been demonstrated using both normal mode calculations [103,104] and experimental studies on peptides and proteins [2,88,105,106]. However, because of the complexity of naturally occurring proteins, most of these data have been obtained from studies on model compounds with only a single secondary structure. These model compounds ranged from simple amino acid derivatives to large synthetic polypeptides.

Table 1. Empirically determined, structure-sensitive regions within the amide I region. Note that the ranges given here are the union of ranges taken from a number of references.

Structure   H₂O (cm⁻¹)              D₂O (cm⁻¹)              References
α-helix     1648–1660               1642–1660               [1,2,4,5,86–88,106,111,113,140–142,227,229,230]
β-sheet     1620–1640, 1670–1695    1615–1640, 1670–1694    [1,2,4,5,86–89,106,111,113,140–142,227,229,230]
turns       1620–1640, 1650–1695    1653–1694               [28,86–88,103,106,111,113,227,229–231]
unordered   1640–1657, 1660–1670    1639–1654               [1,4,5,86–88,106,111,227,229,230]

1.1.1. Amide I Band Absorption

The most widely used of these amide bands is the amide I band [2,12,87,88,106–111]. Amide I absorption is directly related to the backbone conformation, with a major contribution from the C=O stretching vibration and a minor contribution from the C-N stretching vibration. Absorption for this band occurs in the region 1600–1700 cm⁻¹. Empirical studies have identified regions within the amide I band that are sensitive to particular secondary structural conformations (see Table 1). In most cases, a good relationship exists between protein secondary structure and the respective amide I frequencies. However, this correlation does not apply to all proteins and peptides. For example, deviations may occur with proteins containing unusual or less common structures. Examples for α-helix structure in H₂O include poly-L-lysine, for which absorption around 1638 cm⁻¹ has been reported [112,113], and bacteriorhodopsin, which absorbs at 1662 cm⁻¹ [113]. It has also been pointed out that the band frequency for α-helix structure may differ significantly between short, solvent-exposed peptides and helices buried within a solvent-inaccessible region of a highly folded globular protein [113].
For example, absorption for small peptides has been shown to occur at 1632 cm⁻¹ [114]. Additionally, absorption from amino acid side chains, steric effects, and the dielectric properties of the solvent are known to influence the frequency of amide vibrations [115–118]. Table 1 shows how a variety of amide I band frequencies has been attributed to helical structure in the literature. Figures 1–3 show the absorbance and second-derivative spectra of three proteins with differing secondary structures. Spectra were obtained for proteins dissolved in H₂O. The absorbance and second-derivative spectra of a predominantly α-helical protein (cytochrome c), a predominantly β-sheet protein (prealbumin), and a protein with a mixture of α-helical and β-sheet structure (papain) are presented.
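The overlap between the empirical regions in Table 1 can be made concrete with a small lookup. The following is only an illustrative sketch: the H₂O ranges are transcribed from Table 1, while the names `AMIDE_I_RANGES_H2O` and `candidate_structures` are hypothetical and not part of any published method.

```python
# Amide I ranges (cm^-1) for proteins in H2O, transcribed from Table 1.
AMIDE_I_RANGES_H2O = {
    "alpha-helix": [(1648, 1660)],
    "beta-sheet": [(1620, 1640), (1670, 1695)],
    "turns": [(1620, 1640), (1650, 1695)],
    "unordered": [(1640, 1657), (1660, 1670)],
}


def candidate_structures(wavenumber, ranges=AMIDE_I_RANGES_H2O):
    """Return every structure class whose Table 1 range contains the band position."""
    return [
        structure
        for structure, intervals in ranges.items()
        if any(low <= wavenumber <= high for low, high in intervals)
    ]


# Band positions taken from the text: 1656 (cytochrome c), 1631 (prealbumin),
# and 1638 (reported for alpha-helical poly-L-lysine).
for band in (1656, 1631, 1638):
    print(band, candidate_structures(band))
```

A single band position usually matches more than one class, and the poly-L-lysine band at 1638 cm⁻¹ is not matched to α-helix at all, which illustrates why frequency alone does not always give an unambiguous assignment.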

[Figure 1: panel A, absorbance versus wavenumber (1700–1600 cm⁻¹); panel B, second derivative of absorbance versus wavenumber (1700–1600 cm⁻¹).]

Figure 1. FTIR spectrum of cytochrome c from horse heart (A) and its second derivative spectrum (B). Cytochrome c is a predominantly helical protein, as is evident from the main amide I band maximum at 1656 cm⁻¹. Parameters used for derivation: Savitzky-Golay, second derivative, 13 points.

[Figure 2: panel A, absorbance versus wavenumber (1700–1600 cm⁻¹); panel B, second derivative of absorbance versus wavenumber (1700–1600 cm⁻¹).]

Figure 2. FTIR spectrum of prealbumin from human plasma (A) and its second derivative spectrum (B). Prealbumin is a predominantly β-sheet protein, which is consistent with the amide I maximum at 1631 cm⁻¹. Parameters used for derivation: Savitzky-Golay, second derivative, 13 points.

[Figure 3: panel A, absorbance versus wavenumber (1700–1600 cm⁻¹); panel B, second derivative of absorbance versus wavenumber (1700–1600 cm⁻¹).]

Figure 3. FTIR spectrum of papain from papaya latex (A) and its second derivative spectrum (B). Papain is a protein with a mixture of α-helical and β-sheet structure, evident from the strong bands at 1650 cm⁻¹ and 1633 cm⁻¹, respectively. Parameters used for derivation: Savitzky-Golay, second derivative, 13 points.
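The second-derivative spectra in Figures 1–3 were obtained with a 13-point Savitzky-Golay filter. As a rough illustration of what such a filter does, the sketch below fits a polynomial to each 13-point window by least squares and takes the second derivative of that polynomial at the window centre. The polynomial order, the synthetic two-band spectrum, and the function name `savgol_second_derivative` are assumptions made for the example; they are not the settings of the instrument software used for the figures.

```python
import numpy as np


def savgol_second_derivative(absorbance, spacing, window=13, polyorder=3):
    """Second derivative from a local least-squares polynomial fit (Savitzky-Golay)."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Columns are offsets**0, offsets**1, ...: the polynomial fitted in each window.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 2 of the pseudo-inverse gives the fitted quadratic coefficient c2;
    # the second derivative at the window centre is 2 * c2 / spacing**2.
    weights = 2.0 * np.linalg.pinv(design)[2] / spacing**2
    result = np.full(len(absorbance), np.nan)
    # The first and last `half` points have no complete window and stay NaN.
    result[half:-half] = np.convolve(absorbance, weights[::-1], mode="valid")
    return result


# Synthetic amide I profile (an assumption): two overlapping Gaussian bands.
wavenumber = np.arange(1600.0, 1701.0, 1.0)


def band(centre, height, sigma=8.0):
    return height * np.exp(-0.5 * ((wavenumber - centre) / sigma) ** 2)


spectrum = band(1656.0, 1.0) + band(1632.0, 0.5)
second_derivative = savgol_second_derivative(spectrum, spacing=1.0)
strongest = wavenumber[np.nanargmin(second_derivative)]
print(f"Deepest second-derivative minimum near {strongest:.0f} cm^-1")
```

Band positions appear as minima of the second derivative, which is why the figures label the minima of panel B. SciPy's `scipy.signal.savgol_filter` with `deriv=2` offers a ready-made version of the same operation.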

The relatively high intensity of the amide I band, compared to other amide vibrations, has been an important factor behind its use for protein secondary structure analysis. However, it occurs in a region that is particularly prone to errors due to overlap with the H-O-H bending band of water at 1640 cm⁻¹. Furthermore, interference from absorption of water and water vapour, mainly within the amide I region, used to be a significant problem, since H₂O absorbs strongly in that region due to OH vibrations. As a result, in the past most infrared studies were restricted to measurements in the solid state or in ²H₂O. However, since it is now possible to digitally subtract the overlapping H₂O absorption from the spectrum of the protein solution, this is no longer a problem, provided that a relatively high protein concentration of about 10 mg/ml and short path length cells of 6 µm are used for recording spectra of proteins in H₂O. Additionally, the sample compartment is generally purged with dry air or nitrogen to eliminate water vapour, allowing good spectra with a high signal-to-noise ratio to be recorded. Over- or under-subtraction of liquid water and water vapour absorbance has been a very common problem encountered by protein infrared spectroscopists, especially when working with low peptide and protein concentrations. Complications arising from absorbance by non-protein moieties have also led to erroneous assignments of peaks in the amide I region. This is mainly due to the presence of molecules that are used in various steps of the isolation and purification of proteins (for example, detergents and buffers) and peptides (e.g., acids). For example, several studies in the literature attributed a band at 1674 cm⁻¹ to secondary structural elements when in reality it arises from the carboxyl group of trifluoroacetic acid (TFA), which remains strongly bound to peptides after their synthesis and purification. Most experienced protein infrared spectroscopists are now aware of this problem and remove the bound TFA by acid treatment prior to infrared analysis.

1.1.2. Amide II Band Absorption

Amide II absorption results both from the N-H bending vibration and from the C-N stretching vibration. The absorption for this band occurs in the region 1500–1600 cm⁻¹. Empirically, the amide I absorption has been found to be more useful for protein secondary structure determination than the amide II absorption [2,12,87,88,106–111]. However, some workers using multivariate data analysis techniques have reported that including the amide II band together with the amide I band improves prediction accuracy [117,119–121].

1.1.3. Amide III Band Absorption

Determination of protein secondary structure has also been demonstrated based on the amide III region [122–126]. Absorption for this band occurs in the region 1220–1330 cm⁻¹. Amide III absorption mainly arises from C-N stretching vibrations as well as N-H in-plane bending vibrations, with weak contributions from C-C stretching and C=O in-plane bending vibrations [93]. The amide III mode has been characterised as a less well defined vibrational mode, with contributions from different vibrations varying between proteins [103]. Additionally, the signal arising from the amide vibration within the amide III region has been found to be very weak and extensively mixed with CH vibrations of amino acid side chains [21]. However, despite the relatively weak intensity in the amide III region, there are no interfering OH vibrations from water. Generally, α-helix structure absorbs in the region 1293–1328 cm⁻¹, β-sheet in the region 1225–1250 cm⁻¹, and unordered structures in the region 1257–1288 cm⁻¹ [127].


1.1.4. Interference from Amino Acid Side Chain Absorption

Absorption from a number of amino acid residues has been found to occur in the amide I and amide II regions, which may influence protein secondary structure prediction from FTIR spectra of proteins [115,116,128]. The effects of amino acid side chain absorption on secondary structure quantification have been investigated for proteins in ²H₂O and for proteins in H₂O [115,116,128]. Chirgadze et al. report amino acid side chain absorption of asparagine, glutamine, aspartic acid, glutamic acid, arginine, and tyrosine in the region 1500–1800 cm⁻¹ in ²H₂O [115]. Venyaminov and Kalnin report amino acid side chain absorption of asparagine, glutamine, aspartic acid, glutamic acid, arginine, lysine, tyrosine, histidine, and phenylalanine in the region 1400–1800 cm⁻¹ in H₂O [116]. In both studies, band assignments and intensities were established by curve fitting procedures assuming Gaussian/Lorentzian lineshapes. Rahmelow et al. have investigated amino acid side chain absorption in the region 1440–1800 cm⁻¹ based on infrared spectra of 9 amino acids, 23 dipeptides, 7 tripeptides, 7 tetrapeptides, and three polypeptides in aqueous solution [128]. These samples were chosen such that at least three different compounds were measured for each absorbing amino acid residue. They used an inverse matrix method to show that protein secondary structure prediction accuracy may be improved by subtracting the side chain absorption of the residues asparagine, glutamine, aspartic acid, glutamic acid, arginine, tyrosine, and lysine from the amide I and amide II regions. The side chain contribution was subtracted based on spectra of model compounds in aqueous solution. However, as Barth and Zscherp state, this may be problematic for side chains not exposed to the surrounding aqueous environment, since the influence of the protein on the spectral characteristics of these side chains is unknown [86].
Although the spectral characteristics of amino acid side chain absorption were obtained by a linear matrix model, as opposed to the curve fitting procedure used in Venyaminov and Kalnin's study [116], good agreement between the results has been reported. Simonetti and Di Bello propose a method based on a traditional curve fitting approach, utilising FTIR isotopic exchange techniques in the presence of organic solvents incapable of donating hydrogens [129,130]. Based on a synthetic fragment of proocytocin [129] as well as a series of synthetic fragments corresponding to the processing site of the proocytocin-neurophysin precursor [130], they demonstrated that interference from amino acid side chain absorption could be reduced after H-D exchange in the presence of dimethylsulfoxide. Additionally, quantification of β-turn structure is facilitated.

1.1.5. Developments in Amide Band Assignments

Apart from empirical studies for identifying secondary structure sensitive regions, computational approaches have been suggested. Pancoska, Kubelka et al. suggest a modification of Noda's algorithm [131] for calculating two-dimensional correlation maps to identify spectral regions associated with a specific secondary structure [23,26]. In their approach, the two-dimensional maps are generated by fitting the intensity variance at each frequency with a polynomial. Their method has been applied to protein spectral data from FTIR, CD, and Raman spectroscopy. Recently, we introduced an automatic amide I frequency selection procedure based on a hybrid between genetic algorithms and neural networks [132]. Based on a reference set of 18 spectra from proteins in H₂O, this procedure identified frequencies within the amide I band of protein FTIR spectra that could be best related to secondary structure contents by subsequent neural network analysis. Similar approaches combining genetic algorithms with multivariate data analysis techniques have been suggested for a variety of prediction problems in chemistry [133–139].

2. Quantitative Estimation of FTIR Spectroscopy

For recent articles on quantitative estimation of protein secondary structure based on FTIR spectra, see [29,86,92,113,140–146]. To date, existing methods for protein secondary structure quantification from FTIR spectral data fall into two main categories: those based on band narrowing and decomposition of mainly the amide I band shape into its underlying components, often referred to as frequency based or curve fitting approaches, and those based on the principle of pattern recognition. Of the pattern recognition based approaches, most work has been done using multivariate data analysis techniques [12,16,109,110,117,120,121,146]. Alternative pattern recognition approaches are based on the principle of artificial neural networks. All of the techniques suggested for protein secondary structure quantification from FTIR spectra of proteins have their advantages and disadvantages [113]. There is clearly a great need for further work to improve current methods or to develop additional approaches that achieve better quality of secondary structure prediction from infrared spectral data. Here, the most widely used methods for protein secondary structure quantification are critically reviewed and compared. Additionally, a summary is given of the prediction accuracy achieved by different methods in predicting protein secondary structure contents from FTIR spectra, expressed as the standard error of prediction (SEP) for each secondary structure under investigation (see Table 2). The same definition as that given in Lee et al.'s paper [109] is used:

\[ \mathrm{SEP} = \sqrt{\frac{1}{n} \sum_{j=1}^{n} \left( p_{cj} - p_{xj} \right)^{2}} \]

Equation 1.

where p_cj is the proportion of structure predicted for the left-out protein j by the respective method, p_xj is the proportion of structure calculated from the original X-ray data for protein j, and n is the number of proteins.

2.1. Curve Fitting

Curve fitting, the most widely used method for protein secondary structure quantification, mainly involves curve fitting of the amide I band, e.g., [2,87,88,147–154], and occasionally of the amide III band, e.g., [124,126,152]. This procedure has been reported to provide good estimates of protein secondary structure (see Table 2). The basic principle of the curve fitting procedure is to resolve the original protein spectrum into individual bands that fit the spectrum. However, before the curve fitting procedure can be performed, band narrowing techniques need to be applied to obtain estimates of the number and positions of the discrete absorptions that make up the complex amide I band profile. Since the outcome of the curve fitting procedure is highly dependent on these estimates, the most widely used band narrowing techniques, namely Fourier self-deconvolution (FSD) and derivation, are briefly discussed first.

2.1.1. Band Narrowing Techniques

Since most proteins contain more than one type of secondary structure, they give rise to several amide absorption bands. Considerable overlap of these absorption bands, in terms of their widths and separations, often results in featureless absorption profiles. This makes it difficult to analyse the frequency of the composite amide maximum and any visible shoulders, which may lead to misinterpretation of spectral shifts [155]. Jackson & Mantsch therefore argue that deduction of structural parameters from the relatively featureless amide I band alone is of limited use with regard to the curve fitting method [142]. To tackle this problem, a number of mathematical data processing techniques have been developed to visualise the overlapping bands, allowing detailed information to be extracted from infrared spectra of proteins [156–161]. These techniques are often referred to as resolution enhancement techniques. However, since resolution is an instrumental parameter that cannot be increased after a spectrum is recorded, they are more correctly referred to as band narrowing techniques: they mainly involve narrowing the widths of infrared bands, which increases the separation of the overlapping components. The two most popular of these techniques are second-derivative analysis (see Figs 1–3) and Fourier self-deconvolution (see Figs 4, 5).

Table 2. Comparison of various secondary structure prediction methods from FTIR spectra in terms of the SEP. Columns: reference; year; data set size; whether FTIR spectra were combined with CD data; spectral region used for best results (a); method used for calculating target secondary structure fractions (b); prediction method used (c); SEP (%) for helix (d), sheet (d), turns, bends, and other (f); average of SEPs (%).

Ref.     Year  Size  CD   Region (a)          Target (b)  Method (c)  Helix (d)  Sheet (d)  Turns  Bends  Other (f)  Average
[2]      1986  11    No   I                   LG          C           2.17       2.76       NP (e) NP     NP         2.47
[87]     1986  6     No   I                   LG          C           2.24       2.55       NP     NP     NP         2.4
[168]    1990  14    No   I                   LG          C           10.31      6.87       NP     NP     NP         8.59
[108]    1990  12    No   I                   LG          C           5.76       6.82       8.07   NP     6.02       6.67
[119]    1994  14    No   I + II              R           C           5.95       2.56       3.99   NP     3.27       3.94
[169]    1996  14    No   I + II              R           C           5.57       2.4        3.51   NP     3.73       3.80
[109]    1990  18    No   I                   LG          M           7.8        9.7        4.3    NP     NP         7.27
[117]    1990  13    No   I + II              LG/KS       M           5.11       3.71       NP     NP     5.14       4.65
[110]    1991  17    No   I                   KS          M           9.8        11.22      6.61   NP     9.18       9.20
[12]     1993  21    Yes  I                   KS          M           7          9.5        7      NP     10         8.38
[120]    1996  39    No   I + II              KS          M           12.14      9.08       NP     NP     NP         10.61
[121]    1997  23    No   I + II              KS          M           8.6        7.34       1.39   3.55   3.79       4.93
[180]    1998  13    No   I (1800–1600 cm⁻¹)  LG/KS       M           8.32       8.79       6.48   NP     8.77       8.09
[184]    2000  8     No   I                   ST          M           2.1        2.9        3.7    NP     2.3        2.75
[16]     2000  23    No   I + II + I′ + II′   KS          M           5.34       6.33       3.39   4.19   5.37       4.92
[194]    2001  18    No   I                   LG          N           7.7        6.4        4.8    NP     NP         6.3
[132]    2001  18    No   I                   KS          N           4.58       5.72       4.42   3.95   6.12       4.96
[196]    2002  18    No   I                   LG          N           4.47       6.16       4.61   NP     NP         5.08
[146]    2006  50    No   Selected            KS          M           5.5        6.6        3.4    NP     8          5.88
Average                                                               6.34       6.18       4.69   3.9    5.97       5.84

a I: amide I; II: amide II; I + II: both amide I and amide II regions; I + II + I′ + II′: amide I, amide II, amide I′, and amide II′ regions were used.
b LG: Levitt and Greer [170]; KS: Kabsch and Sander's DSSP [177]; R: Ramachandran plots; ST: STRIDE [228].
c C: curve fitting; M: multivariate data analysis; N: neural network analysis.
d If the secondary structure has been further divided (e.g., parallel and anti-parallel β-sheet), the average is taken.
e NP: this type of secondary structure class has not been predicted.
f This structural class is also often referred to as unordered, random coil, random, irregular, or undefined.
Fourier self-deconvolution (FSD) [156,157,162,163] is based on the assumption that, in the liquid or solid state, the absorption bands are broadened (or convoluted) by a function such that bands overlap and cannot be distinguished in the amide envelope. The self-deconvolution procedure uses this function to narrow (or deconvolute) the spectrum. Although the exact shape of the convolution function has not been determined, most authors have assumed Lorentzian or Gaussian functions [2,87,108,119]. The basic underlying principle of FSD is described elsewhere [88,156,157,162,163]. For our discussion, however, it is important to note that FSD is controlled by two parameters: the full-width at half-height (FWHH) and the resolution enhancement factor (K). Both have to be determined manually, mainly by trial and error. The exact number and frequency of the resulting components are highly dependent on the choice of deconvolution parameters. Different combinations of values for the FWHH and for K yield different shapes of the resulting deconvoluted spectrum. For example, if the FWHH is too small, little narrowing is obtained; if it is too large, negative side-lobes appear. Since the choice of the deconvolution parameters is subjective, varying results of subsequent curve fitting may be expected for the same protein FTIR spectrum.

Spectral derivation, an operation similar to deconvolution, was originally realised by the method of Savitzky & Golay [164]. In spectral analysis, the second-order derivative is most widely applied. The Savitzky-Golay method uses information from a localised segment of the spectrum to calculate the derivative at a particular wavelength, rather than the difference between adjacent data points. In most cases, this reduces the problem of noise enhancement and may actually apply some smoothing to the data. However, large side lobes are produced on either side of intense absorption bands, making it difficult to determine the true limits of these bands. A significant problem of spectral derivation is that it does not preserve the relative intensities of absorption bands. Relative band intensities in derivative spectra depend strongly on the width of the absorption in the original spectrum: narrow absorption bands are enhanced at the expense of broader bands. For example, a broad band in the original spectrum may be reduced to hardly any feature at all after derivation. Due to these problems, derivative spectra are often merely used to confirm the initial identification of the band positions by deconvolution [143]. However, successful application of curve fitting directly based on derivative spectra has been demonstrated recently, e.g., [22,108,165,166]. Since both spectral derivation and FSD are very sensitive to changes in the spectra, noise may be amplified by these methods [167]. Therefore, Singh suggests that curve fitting should not be performed directly on a second derivative and/or deconvolved spectrum [143]; these should merely be used to determine the parameters of the curve fitting procedure (i.e., the number of component bands and their positions).

[Figure 4: stacked spectra of ovalbumin labelled Original, FSD, and SD; absorbance (AU) versus wavenumber (1800–1300 cm⁻¹).]

Figure 4. The original FTIR spectrum of ovalbumin in H₂O solution (top), the Fourier self-deconvolution spectrum (middle), and the inverted second derivative spectrum (bottom). Fourier self-deconvolution was performed with a full-width at half-height of 24.0 cm⁻¹ and a resolution enhancement factor of 2.6. The second derivative spectrum was obtained by applying the Savitzky-Golay method with 7 convolution points.

2.1.2. Curve Fitting Procedure

A complete description of the curve fitting procedure can be found elsewhere, e.g., [143]. Basically, it involves an iterative process in which a set of Gaussian, Lorentzian, or mixed Gaussian/Lorentzian-shaped curves is determined that best fits the original protein spectrum. The best fit is obtained by a root mean square analysis determining the optimal set of curve fitting parameters (band height, band width, band position, and baseline). The band area or intensity of each individual band is used to calculate its relative contribution to a particular protein secondary structure in relation to the overall area of the original spectrum. It should be noted that variations of the curve fitting process just described have also been applied [2,87,108,119,124,143,168,169]. An example of curve fitting applied to the inverted second derivative spectrum of ovalbumin in H₂O, along with the assignments to protein secondary structure, is provided in Fig. 5 and Table 3, respectively.

[Figure 5: inverted second derivative of absorbance versus wavenumber (1700–1600 cm⁻¹), with the fitted component bands.]

Figure 5. Curve-fitting of the inverted second derivative spectrum from ovalbumin in H₂O and assignment to protein secondary structure.

Table 3. Frequencies, relative areas and assignments of second-derivative infrared amide I components of ovalbumin in H₂O solution (a)

Frequency (cm⁻¹)   Band area (%)   Assignment
1697.2             1.5             β-sheet
1686.0             12.2            turns
1675.5             6.6             turns
1657.4             36.6            α-helix
1638.2             30.8            β-sheet
1624.7             12.3            β-sheet

a The amide I band assignments were made on the basis of previous FT-IR spectroscopic studies of other globular proteins in H₂O solutions [232].

Susi & Byler [2,87] have investigated the quantitative estimation of helix and sheet structure from second derivative and deconvolved FTIR spectra, using curve fitting applied directly to the deconvolved FTIR spectra. They reported good agreement with the secondary structure calculations from X-ray crystallography by Levitt & Greer [170]. For six proteins, their reported predictions resulted in an SEP of 2.24% for helix and an SEP of 2.55% for extended chain structure [87]. For 11 of the proteins covered by Levitt and Greer [170], their reported estimations resulted in an SEP of 2.17% for helix structure and an SEP of 2.76% for β structure [2]. The resulting averages of SEPs were 2.4% and 2.47%, respectively.

In contrast to Susi et al.'s procedure, where the curve fitting parameters were chosen manually, Goormaghtigh et al. [168] employed an automatic procedure for choosing the curve fitting parameters in an attempt to make the curve fitting procedure more objective and more accessible to other investigators. Curve fitting was performed on spectra obtained with very little deconvolution. Goormaghtigh et al. were interested in secondary structure prediction based on thin hydrated films of 17 proteins measured by attenuated total reflection (ATR) spectroscopy. The structural properties of interest were helix, β-sheet, β-turn, and random structure. The target fractions of secondary structure were calculated based on data from Levitt & Greer [170]. However, since these target fractions have only been given for 14 proteins, and only for α-helix and β-sheet structure, the SEPs are reported for these only. The application of their more objective automatic procedure resulted in worse SEPs than Susi & Byler's work: an SEP of 10.31% for α-helix and an SEP of 6.87% for β-sheet were achieved. The average of SEPs is 8.59%.

Dong et al. [108] have investigated the quantitative estimation of α-helix, β-sheet, turn, and unordered structure for twelve globular proteins in aqueous solution. They performed the estimation directly from the band areas (integrated intensities) of the second derivative spectra and subsequently compared them with the amounts obtained by Levitt & Greer from X-ray crystallography [170].
For the twelve proteins investigated, their reported predictions resulted in an SEP of 5.76% for α-helix, 6.82% for β-sheet, 8.07% for turn, and 6.02% for unordered structure. The average of SEPs is 6.67%. Note that for hemoglobin and myoglobin they reported that the band due to unordered structure appeared as a shoulder on the α-helix band and was therefore too small (less than 5%) to be separated from the helix structure. Hence, it was included in the α-helix value. For the calculation of the SEP values, we have therefore set the fraction of unordered structure to 0. Also, for major histocompatibility complex antigen A2 and β2-microglobulin, no X-ray data were given for turn and unordered structure. Therefore, these values have been excluded from the calculation of the SEPs in Table 2.

In another approach, Singh et al. used the deconvolved and second derivative spectra for peak assignment, followed by curve fitting on the original spectrum [124,143]. Unfortunately, we were unable to calculate the SEPs from the data provided.

Kumosinski & Unruh [119] estimated the secondary structure proportions of 14 globular proteins by curve fitting both the amide I and amide II bands. They used second derivative spectra to identify the individual peak positions. The original spectra were then subjected to Fourier deconvolution with subsequent curve fitting. The secondary structural properties of interest were helix, sheet, turn, and irregular structure. The target secondary structure proportions were based on traditional Ramachandran plots calculated from the X-ray crystallographic structures of their proteins in conjunction with data from the Protein Data Bank (PDB). Note that their analysis involved manual assignment of up to 28 component peaks to secondary structure, which is a good example of the large amount of subjectivity inherent in the curve fitting method.
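The workflow described in this section, in which band positions are first taken from band-narrowed spectra and component bands are then fitted and assigned, can be sketched in a deliberately simplified form. In the example below, the band positions, assignments, and percent areas come from Table 3, but everything else is an assumption: all bands are Gaussian with one common, arbitrarily chosen width, positions and width are held fixed so that only the heights are fitted by linear least squares, and the "measured" spectrum is synthetic and noise-free. The iterative procedures used in the literature also optimise band widths, positions, and the baseline.

```python
import numpy as np

# Band positions (cm^-1) and assignments from Table 3; the common width is an assumption.
POSITIONS = np.array([1697.2, 1686.0, 1675.5, 1657.4, 1638.2, 1624.7])
ASSIGNMENTS = ["beta-sheet", "turns", "turns", "alpha-helix", "beta-sheet", "beta-sheet"]
FWHH = 12.0
SIGMA = FWHH / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def gaussian_bands(wavenumber):
    """One unit-height Gaussian per column, centred on the fixed band positions."""
    return np.exp(-0.5 * ((wavenumber[:, None] - POSITIONS[None, :]) / SIGMA) ** 2)


def fit_band_areas(wavenumber, spectrum):
    """Least-squares band heights for fixed positions and width, returned as percent areas."""
    heights, *_ = np.linalg.lstsq(gaussian_bands(wavenumber), spectrum, rcond=None)
    areas = heights * SIGMA * np.sqrt(2.0 * np.pi)  # area under a Gaussian band
    return 100.0 * areas / areas.sum()


def fractions_by_structure(percent_areas):
    """Sum the percent areas of all bands assigned to the same structure class."""
    totals = {}
    for name, area in zip(ASSIGNMENTS, percent_areas):
        totals[name] = totals.get(name, 0.0) + float(area)
    return totals


# Synthetic, noise-free spectrum built from the Table 3 band areas (an assumption).
wavenumber = np.arange(1600.0, 1720.5, 0.5)
table3_areas = np.array([1.5, 12.2, 6.6, 36.6, 30.8, 12.3])
spectrum = gaussian_bands(wavenumber) @ table3_areas

fractions = fractions_by_structure(fit_band_areas(wavenumber, spectrum))
print({name: round(value, 1) for name, value in fractions.items()})
```

Summing the band areas by assignment turns the six components of Table 3 into three structure fractions. The manual assignment step encoded in `ASSIGNMENTS` is where the subjectivity noted in the text enters.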


Applying their method to the 14 proteins under investigation resulted in an SEP of 5.95% for -helix, an SEP of 2.56% for -sheet, an SEP of 3.99% for turn structure, and an SEP of 3.27% for irregular. The average of SEPs is 3.94%. In a similar study, Kumosinski and Unruh have performed a Gauss-Newton nonlinear iterative curve-fitting analysis based on 14 globular proteins and two synthetic polypeptides with known secondary structure from Ramachandran analysis of the X-ray crystallographic structure [169]. They have fitted both the amide I and amide II bands. Again, FSD spectra, second-derivative spectra, and the original spectra were used for their analysis resulting in an SEP of 5.57% for helix, an SEP of 2.4% for sheet structure, an SEP of 3.51% for turn structure, and an SEP of 3.73% for irregular. The average of SEPs is 3.8%. Summarising, it can be said that there are a number of assumptions implicit in the curve fitting approach that are not necessarily justified [88]. The real number and position of individual amide I band contours present in a protein is not guaranteed to be accurately reflected in the number obtained from deconvolution or derivation. Hence, there is a potential source of error in the assignment of each absorption. Additionally, it is often assumed that unique infrared frequencies can be uniquely correlated to specific structural types [10,167]. However, the assignments of particular amide I absorption to specific secondary structures is not always clear and often subjective, i.e., all the bands within the amide I region need to be assigned to secondary structure types manually. Based on a survey of the current literature on protein infrared spectroscopy, it has been shown that there are in fact conflicting amide I assignments for the different types of protein secondary structure [113]. The curve fitting method usually assumes Lorentzian or Gaussian band shapes (or a mixture of those). 
This assumption, however, may not be true for complex molecules such as proteins. It is also not clear to what extent environmental effects such as the hydrogen bond strength are important in determining band shape.

2.2. Pattern Recognition Based Methods

Alternative methods for quantitative analysis of protein secondary structure have been introduced that require fewer assumptions and remove much of the subjectivity inherent in the curve fitting method described above. The basic underlying principle is to extract common features or patterns associated with fractions of secondary structures from a reference set of spectra with known secondary structure, mostly from X-ray crystallography studies. Hence, these techniques are often referred to as pattern recognition based techniques, a category which encompasses both multivariate data analysis and neural network methods.

2.2.1. Multivariate Data Analysis Techniques

The most popular of the pattern recognition based approaches are the multivariate data analysis methods. These methods basically involve relating two sets of data, commonly expressed as two matrices X and Y, by regression. Typically, one set of data consists of the target secondary structure fractions for the proteins in the reference set (Y) and the other set consists of the absorption values over a range of wavenumbers characterising these proteins (X). The rows in X are the n proteins of the reference set and each column in X represents one variable characterising the respective protein. Generally, when dealing with data from FTIR spectroscopy, X contains p variables, one for each wavenumber, whose values are the absorption values of the reference protein spectra at that wavenumber. Following multivariate data analysis terminology, one protein infrared spectrum in X is specified by a point in p-dimensional space, where p is the number of wavenumbers recorded. X and the corresponding Y together are often referred to as the calibration set or training set. Based on this calibration set, a multivariate regression model is derived, which can then be used to predict the secondary structure fractions of a new protein from its infrared spectral data, i.e., the built model is applied to a new X to predict the new Y. These methods were pioneered by applications based on spectral data from other techniques such as CD, e.g., [21,31,33–46], and Raman spectroscopy, e.g., [171–176].

The classical regression method, multiple linear regression (MLR), may lead to serious misinterpretations, which may even remain undiscovered, when the X-variables are intercorrelated (linearly dependent) and when there is noise, i.e., errors in X. This led to the development of more advanced projection based regression methods such as Principal Component Regression (PCR) and the Partial Least Squares (PLS) methods. These methods avoid the problem of intercorrelation and are more capable of dealing with significant errors in the spectral data (X). As already mentioned, the protein infrared spectra of the reference set can be thought of as a swarm of points in p-dimensional space, where p is the number of variables in the original spectral data matrix X. The projection based methods transform these original p dimensions into another coordinate system with fewer dimensions that still describes the largest variation in the original data. This is achieved through projection. The basic assumption is that this new object space still corresponds to the most useful part of the data, i.e., the main multivariate trends.
The dimensions of the original data set that contribute the least to the variation in the data set are eliminated during the process. They are assumed to represent the noise part of the data. Hence, this process is often referred to as decomposition of the original data matrix into a structure part and a noise part. Probably the most important property of the variables comprising the new object space describing the protein spectra is that they are orthogonal, i.e., linearly independent. The most widely used methods for achieving this transformation are Principal Component Analysis (PCA), Factor Analysis (FA), and Singular Value Decomposition (SVD), which are closely related. As a result, in projection based regression methods, intercorrelated data sets may be modelled without difficulty. This makes spectroscopic data, which are usually highly intercorrelated, an ideal application of these projection methods.

Lee et al. [109] applied a regression method based on factor analysis to a set of 18 FTIR spectra from proteins in H₂O. The target fractions of secondary structure were calculated based on data from Levitt & Greer [170]. The structural properties of interest were α-helix, β-sheet, and turn structure. The quality of prediction was evaluated using the leave-one-out method. Using the normalised region of the amide I band for the 18 proteins, their method achieved an SEP of 7.8% for α-helix, an SEP of 9.7% for β-sheet, and an SEP of 4.3% for turn. The average of SEPs is 7.27%.

Dousseau & Pezolet investigated the application of both a classical least squares method (MLR) and a partial least squares method (PLS) for the estimation of protein secondary structure for α-helix, β-sheet, and undefined structure based on FTIR spectra from 13 proteins in H₂O and ²H₂O [117]. They were interested in the performance of the employed methods based merely on the amide I band as well as the
combination of the amide I and amide II band. The target fractions of secondary structure were calculated based on data from Levitt & Greer [170], except for myoglobin, for which the results of Kabsch and Sander [177] were used. Based on the results published in their paper (Table II) [117], we have calculated the performance of their methods in terms of the SEP. PLS proved to be more appropriate than the classical linear regression method (MLR). The best results were achieved by employing PLS, breaking down the α-helix structure into ordered and disordered α-helix structure, and including the data points from both the amide I and amide II regions. This resulted in an overall SEP of 5.11% for α-helix, an SEP of 3.71% for β-sheet, and an SEP of 5.14% for unordered structure. The average of SEPs is 4.65%.

Sarver and Krueger [110] employed a method based on singular value decomposition to predict the secondary structure fractions for helix, β-sheet, β-turn, and other from the amide I region of FTIR spectra from 17 proteins. The target fractions of secondary structure were calculated using Kabsch & Sander's DSSP program [177]. For the 17 proteins, their reported secondary structure estimations using the leave-one-out method resulted in an SEP of 9.80% for helix, an SEP of 11.22% for β-sheet, an SEP of 6.61% for β-turn, and an SEP of 9.18% for other. The average of SEPs is 9.2%.

Pribic et al. suggested an approach in which both FTIR and CD spectra of 21 reference proteins of known structure were used to generate spectra characteristic of helix, anti-parallel β-sheets, parallel β-sheets, β-turns, and other [12]. Quantification was performed by applying a multivariate linear model (Gauss-Markoff model) in combination with singular value decomposition. The underlying assumption was that a protein spectrum to be analysed is a linear superposition of the characteristic spectra. This approach allowed for the combination of different spectral regions in a single analysis.
The target fractions of secondary structure were calculated using Kabsch & Sander's DSSP program [177]. The secondary structure fractions were predicted with varying spectral regions from separate spectra as well as from combined FTIR and CD spectra. Their predictions were evaluated by employing the leave-one-out method. Pribic et al. [12] demonstrated a slightly improved quantification by combining the 21 FTIR spectra (amide I region) with the CD spectra of the same proteins and using this extended spectrum for the analysis. This resulted in an SEP of 7% for helix, an SEP of 9.5% for sheet structure, an SEP of 7% for turn structure, and an SEP of 10% for other. The average of SEPs is 8.38%.

Rahmelow and Hübner investigated the accuracy of secondary structure prediction based on 39 FTIR spectra from proteins with known structure from X-ray crystallography by applying different multivariate data analysis techniques, namely, classical and inverse least squares, singular value decomposition, partial least squares, and ridge regression [120]. The structural properties of interest were helix and β-sheet. The structural proportions were calculated from X-ray crystallography data using the DSSP program based on the work of Kabsch & Sander [177]. The quality of the investigated methods was expressed in terms of a slightly modified version of the standard error of prediction as used here (see Eq. 1).

$$\mathrm{SEP}_{\text{Rahmelow/H\"ubner}} = \sqrt{\frac{\sum_{j=1}^{n} \left( p_{cj} - p_{xj} \right)^{2}}{n - 1}}$$

Equation 2. SEP as defined by Rahmelow and Hübner.


where $p_{cj}$ = the proportion of structure predicted for left-out protein j by the respective method, $p_{xj}$ = the proportion of structure calculated from the original X-ray data for protein j, and n = the number of proteins. However, knowing that n in their investigation was 39, the SEP using our definition could be calculated using the following equation:
$$\mathrm{SEP}_{\text{Our Definition}} = \sqrt{\frac{38 \left( \mathrm{SEP}_{\text{Rahmelow/H\"ubner}} \right)^{2}}{39}}$$

Equation 3. Calculation of the SEP as defined by us.
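The rescaling in Equation 3 can be checked numerically. The following is a minimal sketch, assuming that the SEP used in this chapter (Eq. 1) differs from Equation 2 only in dividing by n instead of n - 1, which is what Equation 3 implies. The function names and the example values are hypothetical and are not taken from any of the cited studies.

```python
import math


def sep_n_minus_1(predicted, reference):
    """SEP with an (n - 1) denominator, in the form of Equation 2."""
    n = len(predicted)
    squared_errors = sum((p - x) ** 2 for p, x in zip(predicted, reference))
    return math.sqrt(squared_errors / (n - 1))


def sep_n(predicted, reference):
    """SEP with an n denominator (assumed form of Eq. 1, implied by Eq. 3)."""
    n = len(predicted)
    squared_errors = sum((p - x) ** 2 for p, x in zip(predicted, reference))
    return math.sqrt(squared_errors / n)


def convert_sep(sep_value, n):
    """Rescale an (n - 1)-denominator SEP to the n-denominator form (Eq. 3)."""
    return math.sqrt((n - 1) * sep_value ** 2 / n)


# Hypothetical leave-one-out predictions and X-ray reference values (percent).
predicted = [42.0, 18.5, 60.0, 33.0, 25.5]
reference = [45.0, 15.0, 55.0, 36.0, 30.0]

direct = sep_n(predicted, reference)
converted = convert_sep(sep_n_minus_1(predicted, reference), len(predicted))
print(f"direct: {direct:.4f}  converted: {converted:.4f}")
```

Both routes give the same value, so a published SEP based on n - 1 can be rescaled without access to the individual predictions, provided n is known (n = 39 in the study above).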

This enabled us to express their results in terms of the SEP as defined here. Applying singular value decomposition with seven primary factors, the best prediction of secondary structures was obtained when both the amide I and amide II bands were included in the calculations. An SEP of 12.14% for helix and an SEP of 9.08% for β-sheet were achieved. The average of SEPs is 10.61%. Based on the results presented in their paper, they concluded that singular value decomposition, partial least squares, and ridge regression are very similar in terms of their prediction capabilities and should be given priority over the less robust classical and inverse least squares methods. They also acknowledged that the prediction accuracy is limited not by the applied procedure, but by the quality of the data set. Finally, a decomposition of the data set into subgroups with similar spectra, as performed by a cluster analysis, did not further improve prediction accuracy.

Wi et al. [121] studied the effect of an increased band shape variation of the amide I and amide II regions of 23 FTIR protein spectra in H₂O on the prediction accuracy of their factor analysis based restricted multiple regression method. The increased band shape variation was achieved by applying Fourier self-deconvolution (FSD) to the spectral data. The structural properties of interest were helix, sheet, turn, bend, and other. The structural proportions were calculated using the DSSP program based on the work of Kabsch & Sander [177]. Their prediction method was based on a principal component method of factor analysis (PC/FA), in which the spectral data of a training set are decomposed into linear combinations of a set of orthogonal component spectra. They claim that the loadings of these component spectra, for each protein, provide a compact numerical description of spectral bandshape variability and form a pool of statistically significant spectral descriptors.
From these descriptors, the optimal subset that can be correlated to the fraction of each type of secondary structure was selected using a restricted multiple regression (RMR) analysis, details of which can be found in [13]. They recognised that the choice of optimal deconvolution parameters is a key and quite subjective step in the type of spectral analysis they used. They therefore varied both deconvolution parameters in a series of steps, with subsequent application of their FA/RMR analysis. Since the prediction quality varied depending on the deconvolution parameters used, they defined the optimally deconvolved set as the one that gives the best improvement in the prediction of protein secondary structure. They acknowledged, however, that in practice this optimally deconvolved set may be different for different secondary structure fractions (helix, sheet, etc.) as well as for different spectral data. The quality of prediction was evaluated using the leave-one-out method. In their article, they report that after optimisation, FSD has only little impact on the prediction
of helix and sheet fractions. However, the best overall results were obtained using a moderately deconvolved set in combination with 16 component spectra retained in the pool from which the loadings used in the restricted multiple regression (RMR) optimisation were selected. This resulted in an SEP of 8.6% for helix, an SEP of 7.34% for sheet, an SEP of 1.39% for turn, an SEP of 3.55% for bend, and an SEP of 3.79% for other. The average of SEPs is 4.93%. Note that since detailed results were given for only 22 of the 23 proteins in their paper, the SEPs calculated here are based on these 22 proteins.

Baello et al. [16] presented an equilibrium hydrogen exchange FTIR spectroscopy method for the prediction of secondary structure proportions. The structural properties of interest were helix, sheet, turns, bends, and other. The structural proportions were calculated from X-ray crystallography data using the DSSP program based on the work of Kabsch & Sander [177]. In their studies, the amide I, I′, II, and II′ regions were used, resulting in a range of 1800–1350 cm⁻¹. In carrying out their method, four basic steps were performed. First, for each protein, six spectra were measured with a systematic variation of the solvent H-D ratio. Second, this set of spectra was subjected to factor analysis to determine the most significant component spectra for each protein. Basically, this step aimed at extracting independent aspects of the spectral response to deuteration. Third, these component spectra were subjected to a second factor analysis over the entire training set to determine components of the bandshape based on their commonality over the training set of component spectra provided by the second step. The loadings of each resulting component spectrum are normally used to fit to the secondary structural proportions as determined by X-ray crystallography.
The resulting fitting relationship can subsequently be used to predict the secondary structural fractions of an unknown protein. Hence, in a last step, Baello et al. used restricted multiple regression analysis to selectively choose those loadings of the component spectra from step three that gave the most reliable predictions for each structural type independently. The quality of prediction was evaluated using the leave-one-out method, in which one protein was systematically removed before developing a regression relation between the loadings and the fractions of the secondary structural components under investigation (i.e., helix, sheet, turns, bends, other). The secondary structure fractions of the protein left out were then predicted. The leave-one-out method was restricted to only 19 proteins, since a good X-ray crystal structure analysis was available only for those proteins. The optimal predictions were then determined for the set of loadings that resulted in the lowest prediction error for the entire set. The differences between the predicted secondary structure and the secondary structure calculated from X-ray crystal structure analysis were given in their paper for each of the 19 proteins, enabling us to calculate the corresponding SEPs. Their method achieved an SEP of 5.34% for helix, an SEP of 6.33% for sheet, an SEP of 3.39% for turns, an SEP of 4.19% for bends, and an SEP of 5.37% for other. The average of SEPs is 4.92%.

Recently, Wu et al. used two-dimensional IR correlation spectroscopy (using time-dependent spectral variations) in combination with PCA to investigate the secondary structure and the kinetics of H-D exchange of human serum albumin in D₂O [178]. In their study, they made use of the fact that the amide protons of the different secondary structural conformations are not exchanged at the same time. Hence, contributions of secondary structures to amide bands can be separated using H-D exchange studies.
They report that for human serum albumin, H-D exchange took place in the following order: extended chain and β-turns first (after a few minutes), then the water-accessible parts of α-helices (after about 6 minutes). H-D exchange for β-turns continued until 13.4 minutes after initiation, and for 25% of the α-helix structure, no H-D exchange took place even 4 hours after initiation at 25 °C. PCA was used both to separate the acquired spectra into different groups and to deconvolute the amide I and amide II bands (using loadings plots).

Filosa et al. used resolution-enhanced two-dimensional infrared correlation spectroscopy to study structural changes in response to a change in temperature of structurally homologous proteins, namely horse, cow, and tuna ferricytochromes c [179]. Despite the high similarity of the respective sequences, Filosa et al. found that ferricytochrome c from horse and cow had different thermal unfolding pathways.

Forato et al. applied singular value decomposition (SVD) to FTIR spectra of 13 globular proteins in KBr pellets [180]. Based on their results, they claim that protein secondary structure is preserved in the solid state. Target secondary structure was calculated based on both Levitt and Greer's method [170] and Kabsch and Sander's algorithm [177]. Unfortunately, the authors did not specify which target secondary structure had been calculated with which method. Since the amide I band in the solid state reaches nearly 1800 cm⁻¹, the authors included a range from 1800 cm⁻¹ to 1600 cm⁻¹ in their analysis. Using their approach, they report an SEP of 8.32% for helix, an SEP of 8.79% for sheet, an SEP of 6.48% for turns, and an SEP of 8.77% for other. The average of SEPs is 8.09%.

Vedantham et al.
claim that the impact of solutes (i.e., proteins) on the O-H bending and stretching vibrations of the solvent (i.e., water) spectrum, as shown by Raman spectroscopy [176,181,182], as well as possibly varying molar extinction coefficients for different absorbing secondary structures [112,142,183], have not been accounted for sufficiently by current methods for secondary structure prediction from infrared spectra [184]. They also claim that normalisation of the amide region should be performed before background subtractions in order to account correctly for overlapping regions between peaks that correlate with protein structure and those that do not. To address these problems, Vedantham et al. employ a method for generating idealised reference spectra in the amide I and amide III regions. In this method, which was originally suggested by Sane et al. for data based on Raman spectroscopy [176], all subtractions, normalisation, and amide band deconvolution steps are performed simultaneously in a single mathematical function for protein spectra. Additionally, varying molar extinction coefficients are allowed for peaks in the infrared spectra correlating with protein secondary structure. The underlying assumption is that all components of protein spectra are additive, so that the overall spectral intensity may be described by a single function. Reference spectra were generated using singular value decomposition on the isolated amide I and III bands based on eight proteins in the reference set. Subsequently, their procedure permits the estimation of the secondary structure of proteins outside the reference set. They claim that, based on a reference set of eight protein infrared spectra in H₂O, their method provides good secondary structure estimates for proteins, comparing well with other established methods for protein secondary structure prediction. The best results were reported based on amide I data.
They report an SEP of 2.1% for helix, an SEP of 2.9% for sheet, an SEP of 3.7% for turns, and an SEP of 2.3% for random. The average of SEPs is 2.75%. Note that results reported for predictions made for proteins within the reference set are not given here. It should also be noted that no details are given about the method for prediction accuracy
validation (i.e., it is not clear whether the leave-one-out method or some other method was employed) or about which proteins were used for the evaluation of prediction accuracy. Hence, the results reported should be viewed with caution and may not be directly comparable with the results obtained by the other methods reported here. Vedantham et al. calculated secondary structure assignments based on the STRIDE algorithm developed by Frishman and Argos [185]. They claim that the use of the STRIDE algorithm results in significantly improved prediction accuracies compared to results obtained based on calculations from Kabsch and Sander's DSSP method [177] and Levitt and Greer's algorithm [170]. In a previous study, Sparks [186] used the same approach for secondary structure prediction based on a set of five protein spectra. Here, target secondary structure fractions were calculated based on Levitt and Greer's method [170]. The results reported were significantly worse, with an SEP of 4.1% for helix, an SEP of 10.1% for sheet, an SEP of 4.4% for turns, and an SEP of 5.6% for random. The average of SEPs is 6.05%. Note that results reported for predictions made for proteins within the reference set are not given here.

Some recent advances include the attempt to select specific wavenumbers for the prediction of protein secondary structure [132,146]. This approach is based on the fact that the advent of FT instruments makes it possible to obtain absorbance values for individual wavenumbers at a high signal-to-noise ratio. These studies have shown for the first time that a few distinct frequencies are sufficient to provide information on protein secondary structure content. In their study, Goormaghtigh et al. used an ascending stepwise method that identifies the relevance of every wavenumber in the FTIR spectrum for the prediction of a particular secondary structure [146]. For the analysis they developed a 50-protein database (with minimal fold redundancy).
The standard error of prediction in cross-validation was found to be 3.4%, 5.5% and 6.6% for β-turn, α-helix and β-sheet, respectively.

2.2.2. Neural Network Analysis

Neither multivariate data analysis methods nor curve fitting methods are free of problems, and further improvements in this field are necessary [167]. An alternative pattern recognition approach has emerged, which is based on the principle of artificial neural networks. Neural networks have their roots in artificial intelligence. Their basic underlying principles are described in many textbooks, e.g., [187]. Figure 6 shows an example of a typical neural network architecture for protein secondary structure prediction from FTIR spectral data of proteins, with one input layer, one hidden layer, and one output layer. The nodes (often referred to as neurons) of each layer are fully connected to the nodes of the next layer by weighted connections. Input values (FTIR spectral data) are simply fed into the input layer nodes. The net input of each following node is calculated as the weighted sum of the values of the preceding nodes. This value is then fed into an activation function (here a sigmoidal function) to give the output value of the node. In this way, input values are fed through the neural network to produce outputs (fractions of protein secondary structure content).

In the past, neural networks have been most widely used for secondary structure prediction from amino acid sequences, e.g., [188–191], as well as from CD spectral data, e.g., [14,17,48,192]. In the present review, our main focus is on protein secondary structure prediction techniques based on FTIR spectra. Hence, we will focus our discussion on work from that domain. Only very few neural network approaches have

Figure 6. Example of a neural network architecture for protein secondary structure prediction from FTIR spectral data of proteins.
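The forward pass through an architecture of the kind shown in Figure 6 can be sketched as follows. This is a minimal illustration and not the implementation used in any of the cited studies: the layer sizes, weights, and input values are hypothetical, and the bias terms are an added assumption, since the text mentions only weighted connections.

```python
import math


def sigmoid(z):
    """Sigmoidal activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Fully connected layer: weighted sum per node, then sigmoid.

    weights[j][i] connects input node i to node j of this layer.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(spectrum, hidden_w, hidden_b, out_w, out_b):
    """Feed absorbance values through one hidden layer to the output layer."""
    hidden = layer(spectrum, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)


# Hypothetical network: 4 absorbance inputs, 2 hidden nodes, 3 outputs
# (for example helix, sheet, and turn fractions).
spectrum = [0.12, 0.45, 0.80, 0.33]
hidden_w = [[0.5, -0.2, 0.1, 0.4], [-0.3, 0.8, -0.5, 0.2]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, -1.0], [0.5, 0.5], [-0.7, 0.3]]
out_b = [0.0, -0.2, 0.1]

fractions = forward(spectrum, hidden_w, hidden_b, out_w, out_b)
print([round(f, 3) for f in fractions])
```

Each sigmoid output lies between 0 and 1, so it can be read as a fraction, but nothing in this sketch forces the outputs to sum to 1. In practice the weights would be obtained by training on the reference set rather than set by hand.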

been suggested particularly based on FTIR spectra of proteins [17,24,193–198]. The results in Table 2 show that neural networks bear great potential for making good predictions about the secondary structure of proteins from their FTIR spectra. However, as with all other methods discussed here, the success of neural networks depends very much on the right configuration. It has been shown that the choice of training scheme, neural network topology, and data pre-processing methods are important design choices in achieving good prediction accuracy [17,24,193–196]. For example, techniques have been successfully employed to reduce the number of weight connections in the neural networks to a number appropriate for the limited number of training patterns available, which in fact had a favourable effect on their generalisation capabilities and hence their prediction accuracy [194–196]. The reduction of neural network weight connections has the additional side-effect of faster neural network training, which might become an issue as the size of the training set increases.

Most of the work in the field of applying neural networks to problems in analytical biochemistry has been done using feed-forward multi-layer perceptrons trained with the conventional backpropagation algorithm [199]. However, there are a number of potential problems and pitfalls with the basic backpropagation algorithm, which have led to the development of improved neural network training algorithms such as the locally adaptive learning scheme resilient backpropagation [200]. Although the basic backpropagation learning rule described in [199] is relatively simple, it is often difficult to choose the learning rate appropriately, since it is strongly dependent on the shape of the error function. The shape of the error function, however, is usually not known and changes with the learning task. A small learning rate results in a long convergence time on a flat error function. A large learning rate, on the other hand, may lead to oscillations, preventing the error from falling below a certain value. Additionally, there is no guarantee that the algorithm will find a global minimum of the error function at all. Another problem with the standard backpropagation algorithm employing gradient descent is the counterintuitive influence of the partial derivative on the size of the weight step. If the error function is shallow, the derivative is relatively small, resulting in small weight steps. In the presence of steep ravines in the error surface, where cautious steps should be taken, large derivatives lead to large weight steps, which could possibly take the algorithm to a completely different region in weight space.

Since the introduction of the backpropagation algorithm [199], there have been several suggestions for improving weight training in feed-forward neural networks based on the concept of supervised learning in multi-layer perceptrons using gradient descent. Recent studies [132,194–198] have successfully employed the resilient backpropagation learning algorithm [200], which makes use of a local adaptation strategy to improve the conventional backpropagation learning technique. We believe that this neural network learning technique is particularly suitable for protein secondary structure prediction from spectral data, where usually only a limited amount of training data is available. With so few training patterns, possible overfitting of the neural networks is not easily detected. Overfitting refers to the case where the neural network has begun to memorise each individual training pattern rather than settling for weights that generally describe the mapping for all cases.
Resilient backpropagation has a number of properties that explain its superiority over the conventional backpropagation algorithm in the domain of secondary structure prediction from FTIR spectral data. The harmful influence of the size of the partial derivative on the weight step in the standard backpropagation algorithm is eliminated in resilient backpropagation by considering only the sign of the derivative to indicate the direction of the weight update. The size of the weight step is modified directly through the concept of resilient update values, resulting in an adaptation effort that is not prone to unpredictable gradient behaviour. Another advantage of the resilient backpropagation algorithm over the conventional backpropagation algorithm is its speed of convergence. In a study comparing backpropagation with resilient backpropagation and two other adaptive learning methods, it was demonstrated on several representative benchmark problems that local adaptive algorithms, in particular resilient backpropagation, converge considerably faster than the ordinary backpropagation (gradient descent) algorithm [201]. Additionally, resilient backpropagation is considerably more robust with respect to the choice of the initial parameters. Consequently, the choice of initial parameters does not have such an important impact on the outcome of the neural network training as with the conventional backpropagation algorithm. This adds to the reliability of the neural network learning algorithm employed.
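The sign-based update described above can be sketched as follows. This is a simplified variant without weight backtracking, written for illustration and not the implementation used in the cited studies. The parameter values (increase factor 1.2, decrease factor 0.5, initial step 0.1) are commonly quoted defaults and are assumptions here, as is the hypothetical ravine-shaped test function, which has one steep and one shallow direction.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_minimise(grad_fn, weights, n_iter=200, eta_plus=1.2, eta_minus=0.5,
                   step_init=0.1, step_min=1e-6, step_max=50.0):
    """Minimise a function using only the signs of its partial derivatives."""
    w = list(weights)
    steps = [step_init] * len(w)   # one adaptive update value per weight
    prev_grad = [0.0] * len(w)
    for _ in range(n_iter):
        grad = grad_fn(w)
        for i, g in enumerate(grad):
            change = g * prev_grad[i]
            if change > 0:
                # Same sign as before: accelerate along this weight.
                steps[i] = min(steps[i] * eta_plus, step_max)
            elif change < 0:
                # Sign flipped: a minimum was jumped over, so shrink the
                # step and skip the update for this weight this iteration.
                steps[i] = max(steps[i] * eta_minus, step_min)
                g = 0.0
            # The step size never depends on the magnitude of g.
            w[i] -= sign(g) * steps[i]
            prev_grad[i] = g
    return w


def ravine_grad(w):
    """Gradient of the hypothetical function 50*w0**2 + 0.005*w1**2."""
    return [100.0 * w[0], 0.01 * w[1]]


result = rprop_minimise(ravine_grad, [1.0, 1.0])
print(result)
```

Because only the sign of each partial derivative is used, the steep and the shallow direction are treated alike, whereas plain gradient descent with a single learning rate would either oscillate along the steep direction or crawl along the shallow one.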
One of the most notable advantages of resilient backpropagation over standard backpropagation is its improved generalisation capability. Riedmiller showed in one of his studies [202] that the introduction of a weight-decay parameter in combination with a relatively low maximum step size did in fact lead to improved generalisation: he demonstrated that overfitting did not occur even with long training times (large number of epochs). Improved generalisation capability is an important feature in protein secondary structure prediction from FTIR spectra of proteins, where there is generally only a limited number of reference spectra with known X-ray structure available. In that case, the usual splitting of the original pattern set into training, validation, and test sets would not be sensible. Consequently, the leave-one-out method is often employed, e.g., [12,16,18,109,110,121], since there is no easy detection of possible overfitting of the neural network.

Severcan et al. used a neural network approach employing the standard resilient backpropagation learning algorithm [194]. They recognised the importance of good generalisation capabilities of neural networks, particularly with respect to the limited set of spectral data available. Hence, they employed a discrete cosine transform retaining 13 coefficients to reduce the number of neural network inputs to 13 for each spectrum. Instead of one neural network with three outputs, one for each secondary structural property, they employed specialised neural networks, with one neural network for each secondary structural feature to be predicted. Their analysis was based on the same set of 18 FTIR spectra of proteins as in Lee et al.'s factor analysis approach [109]. The quality of prediction was evaluated using the leave-one-out method. By generating a set of 10 networks for each prediction and averaging the predicted values, they achieved an SEP of 7.7% for α-helix, an SEP of 6.4% for β-sheet, and an SEP of 4.8% for turn. The average of SEPs is 6.3%.

Based on the same set of 18 FTIR spectra of proteins as in Lee et al.'s factor analysis approach [109], we have recently suggested neural network approaches with improved prediction accuracy [132,196,197]. Those studies are based on multi-layer feed-forward neural networks using an enhanced version of the resilient backpropagation training algorithm, in which a weight decay parameter has been added to the error function for improved generalisation [200].
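The input reduction reported for Severcan et al. [194] above, a discrete cosine transform truncated to 13 coefficients, can be illustrated with a short sketch. The unnormalised DCT-II used here is an assumption, since the variant and scaling used in the original work are not stated in this review, and the synthetic single-band spectrum is purely hypothetical.

```python
import math


def dct_ii(signal):
    """Unnormalised type-II discrete cosine transform of a sequence."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(signal))
        for k in range(n)
    ]


def compress_spectrum(absorbances, n_coeffs=13):
    """Keep only the first n_coeffs (lowest-frequency) DCT coefficients."""
    return dct_ii(absorbances)[:n_coeffs]


# Hypothetical amide I band: one Gaussian centred at 1655 cm^-1,
# sampled at 1 cm^-1 intervals from 1700 down to 1600 cm^-1.
wavenumbers = [1700 - i for i in range(101)]
spectrum = [math.exp(-((v - 1655) / 15.0) ** 2) for v in wavenumbers]

network_inputs = compress_spectrum(spectrum)
print(len(spectrum), "absorbance values ->", len(network_inputs), "inputs")
```

Truncation keeps the slowly varying shape of the band and discards high-frequency detail, which reduces the number of weights that have to be trained from a small reference set.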
A first approach showed that providing the neural network analysis with only part of the amide I region, selected from empirically determined structure-sensitive regions, in combination with appropriate pre-processing of the spectral data, produced better results than providing all data of the amide I region. This led to a standard error of prediction (SEP) of 4.47% for α-helix, an SEP of 6.16% for β-sheet, and an SEP of 4.61% for turns. The average of SEPs is 5.08%. In a further study we employed a genetic algorithm to automatically identify an optimal set of amide I frequencies most suited for our neural network analysis. This resulted in an SEP of 4.58% for helix, an SEP of 5.72% for sheet, an SEP of 4.42% for turn, an SEP of 3.95% for bend, and an SEP of 6.12% for other. The average of SEPs is 4.96%.

Pancoska et al. claim that the traditional description of protein secondary structure in terms of overall fractions of helix, sheet and other components is only part of the protein structural information that can be derived from spectroscopic data [17,24,25]. They argued that, in addition to overall fractional secondary structure composition, secondary structure segment length and perturbations of regular secondary structures can lead to observable spectral effects. Hence, they introduced a matrix descriptor with integer entries representing the number of secondary structure segments of each type on the diagonal and the number of their interconnections as the off-diagonal elements. Subsequently, neural networks were employed to study the correlation of spectral data with this matrix descriptor. A three-layer backpropagation network with a hyperbolic tangent transfer function, trained by a normalised cumulative delta rule, was employed. The network topology had one hidden layer, in which the number of neurons was determined by a network topology optimisation scheme implementing principal component analysis of the synaptic weights [18].
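The matrix descriptor described above can be made concrete with a small sketch. This is one plausible reading of the description given here (segment counts on the diagonal, counts of connections between adjacent segments off the diagonal), not the exact definition from [17]. The one-letter codes (`H` for helix, `E` for sheet, anything else treated as other) and the function names are assumptions made for illustration.

```python
TYPES = ("helix", "sheet", "other")


def classify(code):
    """Map a one-letter per-residue code to an index into TYPES (assumed coding)."""
    if code == "H":
        return 0
    if code == "E":
        return 1
    return 2


def segments(ss):
    """Collapse a per-residue string into the sequence of segment types."""
    runs = []
    for ch in ss:
        t = classify(ch)
        if not runs or runs[-1] != t:
            runs.append(t)
    return runs


def matrix_descriptor(ss):
    """Return a symmetric 3x3 integer matrix for helix, sheet and other.

    Diagonal entries count the segments of each type; off-diagonal entries
    count how often segments of the two types follow each other in the chain.
    """
    runs = segments(ss)
    m = [[0] * 3 for _ in range(3)]
    for t in runs:
        m[t][t] += 1
    for a, b in zip(runs, runs[1:]):
        # Adjacent runs always differ in type, so this fills off-diagonal cells.
        m[a][b] += 1
        m[b][a] += 1
    return m


example = matrix_descriptor("CCHHHHCCEEEECCHHHCC")
```

For the made-up string in the example, the chain contains two helices, one sheet strand and four other segments, and every helix and strand is flanked by other segments on both sides, which is what the resulting matrix records.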
The neural network had eight output values reflecting their matrix descriptor for helix, sheet, and other segments. The number of inputs to the neural network was determined by the number of spectral points used, i.e., 100 equidistantly digitised intensities of ECD spectra (180–260 nm), 103 equidistantly recorded intensities of VCD (²H₂O) spectra (1570–1720 cm⁻¹), and 126 points of FTIR (H₂O) spectra (1478–1720 cm⁻¹), all used after normalisation. The mapping between the spectral data and the matrix descriptor was evaluated using the leave-one-out method with 23 protein spectra from electronic circular dichroism (ECD) and vibrational circular dichroism (VCD). Pancoska et al. demonstrated that their matrix descriptor could be predicted to an accuracy comparable to that of conventionally predicted average fractional secondary structures. Furthermore, they report that the ECD predictions were more accurate than the VCD ones, which may have resulted from the longer-range length dependence of the ECD bandshape and intensity. Results for a parallel analysis using FTIR spectra have indicated a lower reliability than that for VCD.

A recent study employed both neural networks trained with resilient backpropagation and adaptive neuro-fuzzy inference systems (ANFIS) to predict helix/sheet segment information based on an extended reference set of 41 FTIR spectra of proteins [203]. Overall, better predictions were achieved using ANFIS. Although information on the number of helix/sheet segments and information on the average helix/sheet segment length are closely related, more accurate predictions were made for the latter. Finally, it was observed that predictions for average helix/sheet length based merely on the amide I band maximum position and the full width at half height were comparable to those obtained when individual absorbance values were provided, highlighting the importance of that information.

2.2.3. Secondary Structure Prediction of Unknown Proteins (Generalisation)

Most pattern recognition based methods are supervised learning techniques, i.e. they base their predictions on a calibration or training set of reference spectra.
Based on this set of spectra, a model is built, which allows secondary structure predictions to be made for proteins not seen during the analysis. Probably the most important feature the resulting model must have is the ability to generalise. Generalisation in pattern recognition based approaches refers to the capability of producing reasonable outputs for patterns that have not been seen during the analysis. Here, we want the pattern recognition based methods to extract common features from our database of FTIR spectra with known secondary structure from X-ray crystallography studies, building a model that correctly maps the spectral data presented to it to the corresponding secondary structural fractions. If the pattern recognition based methods are capable of generalising, they are able to make good predictions for spectral data from new proteins, which have not been seen during the analysis.

2.2.4. Finding a Representative Training Set

Clearly, for the pattern recognition based approaches to be able to come up with a model with good generalisation, the reference set of infrared spectra used for training needs to be representative. The more representative those spectral samples, the more accurate and reliable predictions about the secondary structure of unknown proteins will be. However, finding an optimal composition of representative spectra will be a difficult task, since merely increasing the size of the spectral training set may not improve prediction accuracy. In a recent study, Pancoska et al. [13] were interested in the effects of increasing the number of proteins with known crystal structures included in the training set. Based on their results, they reported that inclusion of more spectral data in the analysis led to a deterioration of the quality of prediction. In order to find an optimal set of representative spectra, a joint goal of all researchers in that area should be to build and constantly increase a common database of FTIR spectra for all proteins with known secondary structure. It would certainly be ideal if access to FTIR spectral data could be added to the existing PDB, a repository for the processing and distribution of 3-D macromolecular structure data primarily determined experimentally [204]. Obviously, standards would need to be defined and additional functionality would need to be implemented, which is outside the scope of the present paper. However, work in that direction is a vital step towards building systems capable of making good and reliable predictions for new proteins based on their FTIR spectra.

Once sufficiently large sets of protein infrared spectra with known secondary structure are available, we will also be in a position to separate protein spectra according to structural properties taken from classification databases such as the Structural Classification of Proteins (SCOP) database [205,206]. Individual pattern recognition based systems could then be trained, one for each structural class, and predictions could be made based on those structure specialised systems. Recently, we introduced a SCOP class specialised neural networks architecture combining an adaptive neuro-fuzzy inference system (ANFIS) with SCOP class specialised backpropagation neural networks [198]. Here, proteins were accurately classified into two main classes, all-alpha proteins and all-beta proteins, merely based on the amide I band maximum position of their FTIR spectra. This allowed structure specialised predictions to be made.
Our study showed improved predictions using structure specialised neural networks compared to a conventional neural network approach, where one neural network was trained with spectra of both structural classes.

2.2.5. Input Data Reduction for Improved Generalisation

Substantial variation exists in the literature regarding the optimal neural network topology for achieving good generalisation [207–209]. However, one factor critical for achieving good generalisation, on which all authors seem to agree, is to keep the number of weight connections in the neural network relatively low. This has been achieved by employing various input reduction techniques [132,194–196,210,211]. These studies showed that a reduction in input data, and hence in the number of neural network weight connections, did in fact have a favourable effect on the neural network's generalisation capabilities. Since multivariate data analysis techniques include techniques such as principal component analysis, factor analysis, or singular value decomposition, the input data, i.e., the dimensions of the X data matrix, are generally reduced significantly prior to regression. In other words, multivariate data analysis techniques have an efficient data reduction technique already embedded in the overall procedure. Hence, Esbensen et al. claim that if the X-variables (i.e., the variables describing the absorbances for each wavenumber) are correlated (which is generally the case for spectroscopic data), 20 to 60 samples and thousands of spectral wavelengths (i.e., wavenumbers) may not be a problem at all [212]. However, the results in Table 2 suggest that other data compression techniques, such as the boxcar averaging method, may lead to improved results. Since neural networks generally do not rely on the input data being linearly independent, they are more flexible in the choice of input data reduction technique to be employed.
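Two of the input reduction techniques mentioned in this chapter, boxcar averaging and a truncated discrete cosine transform (as used with 13 retained coefficients in [194]), can be sketched as follows. This is an illustrative sketch only: the window handling, the unnormalised DCT-II convention and the synthetic Gaussian band standing in for an amide I spectrum are assumptions, and the cited studies may differ in these details.

```python
import math


def boxcar_average(spectrum, width):
    """Average consecutive, non-overlapping windows of `width` points.

    A shorter final window is averaged over the points it contains.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    reduced = []
    for start in range(0, len(spectrum), width):
        window = spectrum[start:start + width]
        reduced.append(sum(window) / len(window))
    return reduced


def dct_coefficients(spectrum, n_keep=13):
    """Return the first n_keep coefficients of an unnormalised DCT-II."""
    n = len(spectrum)
    return [
        sum(x * math.cos(math.pi / n * (j + 0.5) * k)
            for j, x in enumerate(spectrum))
        for k in range(n_keep)
    ]


# Synthetic band of 126 points (hypothetical data, not a measured spectrum).
spectrum = [math.exp(-((i - 63) / 20.0) ** 2) for i in range(126)]

inputs_boxcar = boxcar_average(spectrum, 6)   # 126 points -> 21 inputs
inputs_dct = dct_coefficients(spectrum, 13)   # 126 points -> 13 inputs
```

Boxcar averaging keeps the reduced data in the wavenumber domain, so the overall bandshape remains recognisable, whereas the DCT coefficients describe the band in terms of slowly varying cosine components. Either way, the number of network inputs, and with it the number of weight connections, is reduced.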


2.3. Pattern Recognition Approaches Versus Curve Fitting Approaches

Although multivariate data analysis approaches and neural network approaches differ in their actual implementation, at a higher level they both belong to the same class, i.e. they are both pattern recognition based methods. Hence, most of the strengths and weaknesses can generally be attributed to both approaches. Here, we will therefore focus on the strengths and weaknesses of pattern recognition based approaches in general and compare them to those of the curve fitting methods.

For the pattern recognition based methods to be performed, deconvolution and/or derivation of the spectral band envelope, as well as manual assignment of the resulting bands to different structures, are not required. This removes a significant amount of subjectivity compared to curve fitting. However, the reduction of subjectivity comes at the expense of the need for a representative reference set of spectra with known secondary structure. That is, pattern recognition based techniques are dependent on the composition of the training set used to establish a good correlation with structure. Therefore, for these approaches to produce good results, the database of reference proteins must be representative, i.e., it must contain sufficient proteins composed of the types of secondary structure likely to be encountered. For example, if the reference set is used to build a model for quantifying the secondary structure of proteins with only little α-helix content, it would not be appropriate to include only data from proteins with high α-helix content in the reference set. If the composition of the spectra in the reference set is not carefully selected, we may not expect good predictions to be made for a new protein that has not been used during the analysis.
Additionally, Simonetti and Di Bello claim that, since the pattern recognition methods applied thus far are mainly based on a set of reference spectra composed of proteins, prediction of short peptides is likely to be imprecise [129]. They argue that proteins do not represent suitable models for short peptides.

The sizes of the data sets used for the pattern recognition based approaches presented here are summarised in Table 2. They range from 8 to 50 spectra, which can be said to be representative of the majority of work done in that area. However, even with a set of 50 spectra, it should not be concluded too easily that these reference spectra represent all features common to the thousands of existing proteins. Clearly, the larger and more representative the set of protein spectra used as reference for the analysis, the more reliably the resulting system will predict the secondary structure of proteins from their spectra. For such enlarged reference sets to be built, however, we have to allow reference sets to be composed of spectral data recorded in different laboratories under varying conditions. The possible effects on the prediction accuracy of a neural network analysis when using reference sets composed of FTIR spectra from different laboratories were investigated as a first step towards developing a common protein infrared spectral database [213]. However, even with the limited size of the training sets used to date, the potential of protein secondary structure prediction techniques can still be demonstrated.

At the end of the curve fitting method, the area of each fitted band is expressed as a percentage of the total area of the amide envelope under investigation. However, this procedure assumes that the molar absorptivities of the C=O groups responsible for each of the fitted bands are equal; commonly, no weighting function is applied to each band.
However, studies on poly-L-lysine have shown that the molar absorptivities of different secondary structures can in fact vary slightly [112,116]. De Jongh et al. suggest that sheet structure displays the greatest molar absorptivity [183]. In contrast, others suggest that molar absorptivities are essentially independent of secondary structure [116,214,215]. Such problems are of less significance for pattern recognition based approaches, since predictions are mainly based on band shape variation. Hence, pattern recognition approaches are much less dependent on intensity measures. As a result, they should be at least partially immune to problems of different molar absorptivities, water vapour and solvent absorbance interference, provided these are similar for all protein spectra of the reference set.

Another potential problem for curve fitting approaches is the contribution from amino acid side chains, which may also be less problematic with pattern recognition based approaches. Rahmelow et al. suggested that protein secondary structure prediction based on multivariate data analysis methods may not be disturbed by amino acid side chain absorbances, provided that only little variation of absorbing amino acid residues exists amongst the proteins of the reference set [128]. Hence, it is important to calculate the respective standard deviation for each absorbing amino acid across the reference set of proteins. If high variation occurs, amino acid side chain absorbance will have to be taken into account to avoid deterioration in prediction accuracy. We believe that this is generally true for all pattern recognition based approaches, including neural network analysis. In their multivariate data analysis study, Rahmelow et al. reported that subtraction of amino acid side chain absorbance from the protein infrared spectra of concanavalin A and erabutoxin led to an improvement in prediction accuracy.

2.4. Multivariate Data Analysis Approaches Versus Neural Network Methods

Multivariate data analysis methods generally involve compressing the input data using techniques like principal component analysis (PCA). Because of the transformation of the original data set into a linearly independent object space with a generally significant reduction in dimensions, full spectra instead of only a few selected wavelengths may be used for the analysis. Hence, data compression techniques or selection of particular regions within the protein spectra prior to analysis are usually not required, as opposed to neural network analysis. However, one critical parameter in these projection methods is the choice of the correct number of dimensions of the new object space, i.e. the decision about what is considered structure and what is considered noise in the original data. If there are too few dimensions, some of the important information of the original data may be lost. This may become an issue particularly with non-linearity in the data: since PCA, as used for data compression, is a linear projection method, it imposes a linear structure on the spectral data set, which may not be appropriate if non-linear features are present. As a result, this non-linearity may not be described by the first PCs, as it would be in the case of linear data. If there are too many dimensions, however, significant parts of the noise may be included in the resulting model for predicting protein secondary structure, possibly leading to misinterpretations. Additionally, when plotting the reduced orthogonal data set resulting from PCA against the data of the original spectrum, hardly any resemblance is apparent [210]. What is often shown are reconstructed spectra. Compression techniques like boxcar averaging, on the other hand, compress the spectral data in such a way that the overall bandshape is well preserved.


With respect to prediction accuracy, PLS may achieve better results than PCR, since the Y-variables to be predicted guide the decomposition of X. A comparison of various types of these methods can be found in [9,120]. A good comparison between neural networks and multivariate data analysis techniques has been given by Despagne and Massart [216].

One of the useful features of neural networks in protein secondary structure prediction is that they are universal approximators, i.e., they are capable of fitting any continuous function (including linear and non-linear functions) to a predefined degree of accuracy [217]. As with multivariate data analysis methods, neural networks may also be used to build models of the form Y = F(X) + ε. However, they are best used on non-linear data sets, e.g., where random noise is present in the data. Neural networks have been shown to be particularly well suited for dealing with noisy data, where good results could be obtained despite the presence of noise in the data [218–221]. This robustness may be explained by the highly distributed way in which information is stored across the neural network weights, so that the result does not depend on any single unit of the network [222,223]. In contrast, multivariate data analysis techniques like PLS or PCR require the inclusion of higher order components to model non-linear data. These higher order components, however, are more likely to be distorted by noise [224].

One of the main advantages of multivariate data analysis methods like MLR, PCR, and PLS is their ease of model interpretation. PLS and PCR, for example, use linear combinations of the original variables (absorbances at wavenumbers) for modelling. Through the projection of the samples into a space with significantly reduced dimensionality, outliers or possible clusters in the data may be easily visualised.
Neural networks, on the other hand, are often criticised as being black box models, where a model is built merely on the basis of input-output data pairs and where interpretation of the resulting model is far more complex than for techniques like PLS and PCR. This is mainly due to the highly distributed way in which information is stored across weight connections. However, we have recently introduced a SCOP class specialised neural networks architecture, where an adaptive neuro-fuzzy inference system (ANFIS) is used to classify proteins into structural classes [198]. ANFIS is an adaptive network that is functionally equivalent to a fuzzy inference system consisting of a set of rules, and hence allows inspection and fine-tuning of the rules generated, provided that there are not too many inputs.

What multivariate data analysis techniques and neural networks have in common is that the underlying training methods are driven by minimising a least squares criterion. Despagne and Massart claim that if only linear transfer functions are used for hidden and output units in a neural network, similarity with multivariate data analysis techniques exists [216]. In this case, the two linear combinations performed between neural network layers are equivalent to a single MLR regression. The only difference lies in the way the model parameters are optimised. Neural networks make use of an iterative optimisation process (i.e., neural network training), whereas MLR employs matrix inversion. Additionally, if linear transfer functions are used for the neural network, similarity also exists with PCR and PLS [225]. Here, the weights between input and hidden layers may be compared to the X-data loadings, and the activations of the hidden layer units may be compared with scores. However, whereas in neural networks adjustable parameters are fitted to minimise a least squares criterion regardless of any restrictions, additional constraints are taken into account for PCR and PLS.
Examples are the orthogonality of score values, maximum variance in the X-data (PCR), and maximum covariance in the X-Y-data (PLS). Hence, parameters obtained by PCR and PLS can be expected to be different from those obtained by neural networks [216].
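The statement above that a neural network with only linear transfer functions amounts to a single linear regression model can be checked numerically. The sketch below composes two linear layers and shows that they collapse into one weight matrix and one bias vector; the weights, biases and function names are arbitrary illustrative values, not taken from any of the cited studies.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply matrix a by matrix b (both given as lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add(u, v):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(u, v)]


def linear_network(x, w1, b1, w2, b2):
    """Forward pass through a network with linear hidden and output units."""
    hidden = add(matvec(w1, x), b1)
    return add(matvec(w2, hidden), b2)


def collapse(w1, b1, w2, b2):
    """Combine both layers into a single linear model y = W x + b."""
    return matmul(w2, w1), add(matvec(w2, b1), b2)


# Three inputs, two hidden units, one output (arbitrary example values).
w1 = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]]
b1 = [0.5, -1.0]
w2 = [[2.0, 1.0]]
b2 = [0.25]

x = [1.0, 2.0, 3.0]
w, b = collapse(w1, b1, w2, b2)
network_output = linear_network(x, w1, b1, w2, b2)
single_model_output = add(matvec(w, x), b)
```

Both outputs agree for any input, since the composition of two linear maps is itself linear. The difference from MLR therefore lies only in how the parameters are found, and non-linear transfer functions are what give a neural network additional modelling capacity.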

3. Applications of Protein Secondary Structure Prediction

Table 2 summarises representative applications of protein secondary structure prediction methods along with the reported prediction accuracy. Here, prediction accuracy is expressed in terms of the standard error of prediction (SEP). Only those studies are listed for which SEPs could be calculated from the information provided. The best results achieved are shown only when the entire infrared spectral data set available was included in the analysis.

Although all of these methods have proved to be very useful techniques for relating infrared spectral data to protein secondary structure (see Table 2), there is still a high degree of freedom in the application of these methods. With respect to curve fitting, we believe that it would be beneficial to standardise its application by developing automated procedures that determine the secondary structure without manual determination of FSD or derivation parameters, manual peak assignment to secondary structure, manual choice of the shape of the convolution function, manual choice of the exact implementation of the curve fitting procedure, or a choice of whether to apply the curve fitting to the original spectrum, the FSD spectrum or the second derivative spectrum. An attempt in that direction has been made by Rahmelow and Hübner [226]. They have suggested a procedure that allows the fully automated determination of the parameters required for deconvolution, the subsequent execution of deconvolution using these parameters, and the determination of favourable starting values for a subsequent curve fitting procedure. However, the resulting bands still need to be assigned to the respective types of secondary structure. In their study, they performed this assignment both on the basis of empirical results taken from Susi and Byler [87] and by applying an algorithmic procedure suggested by Goormaghtigh et al. [168].
Unfortunately, they had to conclude that the results obtained using their automated procedure (SEPs of around 14% to 16%) were not satisfactory. The inferiority of an automated procedure is probably due to the fact that, as pointed out by Goormaghtigh et al. [227], there is a gap between the explanations most often given for the success of the curve fitting method and its true basis. If a better understanding of the relationship between the curve fitting procedure as a mathematical technique and its underlying physical principles could be established, it would be easier to identify a set of general rules for building a more reliable and accurate automatic procedure. We would then be in a position to give a better and more consistent explanation of how to apply the curve fitting procedure.

In contrast, pattern recognition based techniques, as an alternative approach to curve fitting, may be applied with less subjectivity. However, to date, these techniques suffer from the problem that they are based on only a very limited number of protein spectra in the calibration/training set. Hence, the reliability of these approaches for making good predictions about all existing proteins not seen during the analysis may be questioned. Clearly, further work needs to be done, always with the goal in mind of constantly increasing and enhancing the set of reference protein spectra with known secondary structure as a basis for further analysis. Additionally, pattern recognition based approaches also require certain parameters (e.g., training set size, data pre-processing method, number of inputs used, number of principal components) to be determined. As with curve fitting, the choice of those parameters is mostly rather subjective. However, with respect to neural networks, the impact of the choice of initial training parameters could be considerably reduced by using the resilient backpropagation algorithm [132,194–198].
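The evaluation scheme used throughout this section, leave-one-out prediction summarised by a standard error of prediction, can be sketched as follows. The SEP is taken here as the root-mean-square difference between predicted and reference values, which is an assumption; individual studies may define it slightly differently. The straight-line model and the data are hypothetical stand-ins for a real calibration method and a real reference set.

```python
import math


def sep(predicted, reference):
    """Root-mean-square difference between predictions and reference values."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


def leave_one_out(xs, ys, fit, predict):
    """Predict each sample from a model trained on all other samples."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, xs[i]))
    return predictions


def fit_line(xs, ys):
    """Least-squares straight line y = a + b * x (stand-in for a real model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return (mean_y - slope * mean_x, slope)


def predict_line(model, x):
    """Evaluate the fitted line at x."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical one-feature data set: a spectral feature vs. helix content (%).
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
helix = [12.0, 19.0, 33.0, 41.0, 48.0]

loo_predictions = leave_one_out(feature, helix, fit_line, predict_line)
helix_sep = sep(loo_predictions, helix)

# Average of per-structure SEPs, using the values reported for [194] above.
average_sep = sum([7.7, 6.4, 4.8]) / 3
```

Because every sample is predicted by a model that never saw it, the resulting SEP is an estimate of performance on unseen proteins, which is why the scheme is attractive when the reference set is too small to set aside separate validation and test sets.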

4. Conclusion

About 60 years ago, it was established that information on protein secondary structure can be derived from FTIR spectra [1,4,5]. Our interpretation of protein infrared spectra has progressed significantly since then, but unfortunately not sufficiently to eliminate doubts about the assignment of certain peaks without reliance on additional data. Nevertheless, FTIR spectroscopy is firmly established as a tool for protein structure analysis and in many cases is the only technique that can be used, since other techniques are simply incapable of analysing the samples concerned. One of the major advances since the days of Elliott and Ambrose has been the utilisation of statistical and computational methods for quantifying secondary structure from FTIR spectra of proteins. Not having to rely on subjective interpretation of individual bands for quantitative analysis has been an important advance. The prediction accuracy of the different methods in use is as good as, if not better than, that of other techniques such as circular dichroism and Raman spectroscopy. However, much progress still needs to be made in this area.

Although both the curve fitting and the pattern recognition based approaches report good prediction accuracy, at present it cannot be claimed for either of them that the secondary structure proportions of any new protein with unknown structure can be predicted reliably based on its FTIR spectrum. Due to the complexity and multitude of existing proteins, as well as the inherent experimental variation in recording FTIR spectra, a complete, all-encompassing set of rules describing the relationship between FTIR spectral bandshapes and secondary structural fractions for all proteins will be very hard to establish. There is still a great need for further work to develop new methods or improve the existing methods in order to arrive at the best possible approximation of this mapping.
Even if we are unable to express such a set of rules adequately in terms of underlying physical phenomena, we can still employ techniques with the potential to automatically discover a mapping reflecting the overall bandshape-structure relationship. A first step in that direction has been made using pattern recognition based approaches.

Acknowledgements

We would like to thank Dr. Aichun Dong (University of Northern Colorado) for providing us with Figs 4, 5, and Table 3.

References
[1] A. Elliot, E.J. Ambrose, Structure of synthetic polypeptides, Nature 165 (1950), 921-922.
[2] D.M. Byler, H. Susi, Examination of the Secondary Structure of Proteins by Deconvolved FTIR Spectra, Biopolymers 25(3) (1986), 469-487.
[3] U. Goerne-Tschelnokow, D. Naumann, C. Weise, and F. Hucho, Secondary structure and temperature behaviour of acetylcholinesterase. Studies by Fourier-transform infrared spectroscopy, 213(3) (1993), 1235.


[4] E.J. Ambrose, A. Elliot, Infrared spectroscopic studies of globular protein structure, Proc. R. Soc. London Ser. A 208 (1951), 75-90.
[5] A. Elliot, Infrared spectra of polypeptides with small side chains, Proc. R. Soc. London Ser. A 226 (1954), 408-421.
[6] A. Wlodawer, R. Bott, and L. Sjolin, The refined crystal structure of ribonuclease A at 2.0 Å resolution, J. Biol. Chem. 257 (1982), 1325-1332.
[7] W. Braun, Distance geometry and related methods for protein structure determination from NMR data, Quarterly Reviews of Biophysics 19 (1987), 115-157.
[8] B. Dalmas, W.H. Bannister, Prediction of protein secondary structure from circular dichroism spectra: An attempt to solve the problem of the best-fitting reference protein subsets, Analytical Biochemistry 225(1) (1995), 39-48.
[9] V. Baumruk, P. Pancoska, and T.A. Keiderling, Predictions of secondary structure using statistical analyses of electronic and vibrational circular dichroism and Fourier transform infrared spectra of proteins in H2O, Journal of Molecular Biology 259(4) (1996), 774-791.
[10] P. Pancoska, L. Wang, and T.A. Keiderling, Comparison of protein FT-IR absorption and vibrational circular dichroism frequency analysis in terms of secondary structure, Protein Science 2 (1993), 411-419.
[11] R. Pribic, Principal Component Analysis of Fourier-Transform Infrared and/or Circular-Dichroism Spectra of Proteins Applied in a Calibration of Protein Secondary Structure, Analytical Biochemistry 223(1) (1994), 26-34.
[12] R. Pribic, I.H.M. Van Stokkum, D. Chapman, P.I. Haris, and M. Bloemendal, Protein secondary structure from Fourier transform infrared and/or circular dichroism spectra, Analytical Biochemistry 214(2) (1993), 366-378.
[13] P. Pancoska, E. Bitto, V. Janota, M. Urbanova, V.P. Gupta, and T.A. Keiderling, Comparison of and limits of accuracy for statistical analyses of vibrational and electronic circular dichroism spectra in terms of correlations to and predictions of protein secondary structure, Protein Science 4(7) (1995), 1384-1401.
[14] M.A. Andrade, P. Chacon, J.J. Merelo, and F. Moran, Evaluation of Secondary Structure of Proteins From UV Circular-Dichroism Spectra Using an Unsupervised Learning Neural-Network, Protein Engineering 6(4) (1993), 383-390.
[15] N.J. Greenfield, Methods to Estimate the Conformation of Proteins and Polypeptides from Circular Dichroism Data, Analytical Biochemistry 253(1) (1996), 1-10.
[16] B.I. Baello, P. Pancoska, and T.A. Keiderling, Enhanced prediction accuracy of protein secondary structure using hydrogen exchange Fourier transform infrared spectroscopy, Analytical Biochemistry 280(1) (2000), 46-57.
[17] P. Pancoska, V. Janota, and T.A. Keiderling, Novel Matrix Descriptor for Secondary Structure Segments in Proteins: Demonstration of Predictability From Circular Dichroism Spectra, Analytical Biochemistry 267(1) (1999), 72-83.
[18] P. Pancoska, V. Janota, and T.A. Keiderling, Interconvertibility of Electronic and Vibrational Circular Dichroism Spectra of Proteins: a Test of Principle Using Neural Network Mapping, Applied Spectroscopy 50(5) (1996), 658-668.
[19] N. Sreerama, R.W. Woody, Protein Secondary Structure From Circular-Dichroism Spectroscopy Combining Variable Selection Principle and Cluster-Analysis With Neural-Network, Ridge Regression and Self-Consistent Methods, Journal of Molecular Biology 242(4) (1994), 497-507.
[20] P. Pancoska, V. Janota, and T.A. Keiderling, Novel approaches to protein structural analyses using combinations of optical spectroscopic methods (electronic and vibrational circular dichroism and FTIR studies), Spectroscopy of biological molecules European conference; 7th, (P. Carmona, R. Navarro, and A. Hernanz, eds.), Kluwer Academic Publishers, 1997, pp. 13-14.
[21] G. Böhm, R. Muhr, and R. Jaenicke, Quantitative analysis of protein far UV circular dichroism spectra by neural networks, Protein Engineering 5(3) (1992), 191-5.
[22] A. Dong, J.D. Meyer, J.L. Brown, M.C. Manning, and J.F. Carpenter, Comparative Fourier Transform Infrared and Circular Dichroism Spectroscopic Analysis of alpha1-Proteinase Inhibitor and Ovalbumin in Aqueous Solution, Archives of Biochemistry and Biophysics 383(1) (2000), 148-155.
[23] P. Pancoska, J. Kubelka, and T.A. Keiderling, Novel Use of a Static Modification of Two-Dimensional Correlation Analysis. Part I: Comparison of the Secondary Structure Sensitivity of Electronic Circular Dichroism, FT-IR, and Raman Spectra of Proteins, 53(6) (1999), 655-665.
[24] P. Pancoska, H. Fabian, G. Yoder, V. Baumruk, and T.A. Keiderling, Protein structural segments and their interconnections derived from optical spectra. Thermal unfolding of ribonuclease T-1 as an example, Biochemistry 35(40) (1996), 13094-13106.

J.A. Hering and P.I. Haris / FTIR Spectroscopy for Analysis of Protein Secondary Structure


[25] P. Pancoska, V. Janota, and J. Nesetril, Novel matrix descriptor for determination of the connectivity of secondary structure segments in proteins. Analysis of general properties using graph theory, Discrete Mathematics 235(1-3) (2001), 399-423.
[26] J. Kubelka, P. Pancoska, and T.A. Keiderling, Novel Use of a Static Modification of Two-Dimensional Correlation Analysis. Part II: Hetero-Spectral Correlations of Protein Raman, FT-IR, and Circular Dichroism Spectra, Applied Spectroscopy 53(6) (1999), 666-671.
[27] N. Sreerama, S.Y. Venyaminov, and R.W. Woody, Estimation of the number of alpha-helical and beta-strand segments in proteins using circular dichroism spectroscopy, Protein Science 8(2) (1999), 370-380.
[28] E. Vass, M. Kurz, R.K. Konat, and M. Hollosi, FTIR and CD Spectroscopic Studies on Cyclic Penta- and Hexa-Peptides. Detailed Examination of Hydrogen Bonding in Beta- and Gamma-Turns Determined by NMR, Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 54(5) (1998), 773-786.
[29] J.T. Pelton, L.R. McLean, Spectroscopic Methods for Analysis of Protein Secondary Structure, Analytical Biochemistry 277(2) (2000), 167-176.
[30] R.W. Sarver, W.C. Krueger, An Infrared and Circular-Dichroism Combined Approach to the Analysis of Protein Secondary Structure, Analytical Biochemistry 199(1) (1991), 61-67.
[31] M. Bloemendal, W.C. Johnson, Physical Methods to Characterise Pharmaceutical Proteins, (J.N. Herron, W. Jiskoot, and D.J.A. Crommelin, eds.), Plenum, New York, 1995, pp. 65-100.
[32] S.Y. Venyaminov, J.T. Yang, Determination of protein secondary structure, Circular dichroism and the conformational analysis of biomolecules, (G.D. Fasman, ed.), Plenum, 1996, pp. 69-108.
[33] N. Sreerama, R.W. Woody, Poly(Pro)II Helices in Globular Proteins: Identification and Circular Dichroic Analysis, Biochemistry 33(33) (1994), 10022-10025.
[34] N. Sreerama, R.W. Woody, A Self-Consistent Method for the Analysis of Protein Secondary Structure From Circular-Dichroism, Analytical Biochemistry 209(1) (1993), 32-44.
[35] A. Perczel, M. Hollosi, G. Tusnady, and G.D. Fasman, Convex Constraint Analysis: a Natural Deconvolution of Circular-Dichroism Curves of Proteins, Protein Engineering 4(6) (1991), 669-679.
[36] P. Pancoska, S.C. Yasui, and T.A. Keiderling, Statistical-Analyses of the Vibrational Circular-Dichroism of Selected Proteins and Relationship to Secondary Structures, Biochemistry 30(20) (1991), 5089-5103.
[37] V.V. Shubin, M.L. Khazin, and T.B. Efimovskaya, Prediction of Secondary Structure of Globular-Proteins Using Circular-Dichroism Spectra, Molecular Biology 24(1) (1990), 165-176.
[38] P. Manavalan, W.C. Johnson, Variable Selection Method Improves the Prediction of Protein Secondary Structure From Circular-Dichroism Spectra, Analytical Biochemistry 167(1) (1987), 76-85.
[39] I.A. Bolotina, V.O. Chekhov, V.Y. Lugauskas, and O.B. Ptitsyn, Determination of the Secondary Structure of Proteins From Circular-Dichroism Spectra. 3. Protein-Derived Reference Spectra for Antiparallel and Parallel Beta-Structures, Molecular Biology 15(1) (1981), 130-137.
[40] I.A. Bolotina, V.O. Chekhov, V.Y. Lugauskas, A.V. Finkelshtein, and O.B. Ptitsyn, Determination of the Secondary Structure of Proteins From the Circular-Dichroism Spectra. 1. Protein Reference Spectra for Alpha Structure, Beta Structure, and Irregular Structure, Molecular Biology 14(4) (1980), 701-709.
[41] J.P. Hennessey, W.C. Johnson, Information-Content in the Circular-Dichroism of Proteins, Biochemistry 20(5) (1981), 1085-1094.
[42] S.W. Provencher, J. Glockner, Estimation of Globular Protein Secondary Structure From Circular-Dichroism, Biochemistry 20(1) (1981), 33-37.
[43] S. Brahms, J. Brahms, Determination of protein secondary structure in solution by vacuum ultraviolet circular dichroism, Journal of Molecular Biology 138 (1980), 149-178.
[44] Y.H. Chen, J.T. Yang, A new approach to the calculation of secondary structures of globular proteins by optical rotatory dispersion and circular dichroism, Biochem Biophys Res Commun 44 (1971), 1285-1291.
[45] N.J. Greenfield, G.D. Fasman, Computed circular dichroism spectra for the evaluation of protein conformation, Biochemistry 8(10) (1969), 4108-4116.
[46] I.H.M. van Stokkum, H.J.W. Spoelder, M. Bloemendal, R. van Grondelle, and F.C.A. Groen, Estimation of protein secondary structure and error analysis from circular dichroism spectra, Analytical Biochemistry 191 (1990), 110-118.
[47] K.A. Oberg, V.N. Uversky, Secondary structure of the homologous proteins, alpha-fetoprotein and serum albumin, from their circular dichroism and infrared spectra, Protein and Peptide Letters 8(4) (2001), 297-302.

[48] P. Unneberg, J.J. Merelo, P. Chacon, and F. Moran, SOMCD: Method for Evaluating Protein Secondary Structure From UV Circular Dichroism Spectra, Proteins: Structure, Function, and Genetics 42(4) (2001), 460-470.
[49] B.A. Wallace, J. Lees, and R.W. Janes, Fold Recognition by Synchrotron Radiation Circular Dichroism (SRCD) Spectroscopy: a New Tool for Structural Genomics, Biophysical Journal 82(1) (2002), 1752.
[50] B.A. Wallace, R.W. Janes, Synchrotron Radiation Circular Dichroism Spectroscopy of Proteins: Secondary Structure, Fold Recognition and Structural Genomics, Current Opinion in Chemical Biology 5(5) (2001), 567-571.
[51] B.A. Wallace, Synchrotron Radiation Circular-Dichroism Spectroscopy as a Tool for Investigating Protein Structures, Journal of Synchrotron Radiation 7 (2000), 289-295.
[52] J.G. Lees, B.A. Wallace, Synchrotron radiation circular dichroism and conventional circular dichroism spectroscopy: A comparison, Spectroscopy 16(3/4) (2002), 121-126.
[53] J. Reed, T.A. Reed, A Set of Constructed Type Spectra for the Practical Estimation of Peptide Secondary Structure From Circular Dichroism, Analytical Biochemistry 254(1) (1997), 36-40.
[54] P. Pancoska, E. Bitto, V. Janota, and T.A. Keiderling, Quantitative-Analysis of Vibrational Circular-Dichroism Spectra of Proteins: Problems and Perspectives, Faraday Discussions (99) (1994), 287-310.
[55] G. Deleage, C. Geourjon, An Interactive Graphic Program for Calculating the Secondary Structure-Content of Proteins From Circular-Dichroism Spectrum, Computer Applications in the Biosciences 9(2) (1993), 197-199.
[56] A. Perczel, K. Park, and G.D. Fasman, Analysis of the Circular-Dichroism Spectrum of Proteins Using the Convex Constraint Algorithm: a Practical Guide, Analytical Biochemistry 203(1) (1992), 83-93.
[57] A. Perczel, K. Park, and G.D. Fasman, Deconvolution of the Circular-Dichroism Spectra of Proteins: the Circular-Dichroism Spectra of the Antiparallel Beta-Sheet in Proteins, Proteins 13(1) (1992), 57-69.
[58] E.A. Carrara, C. Gavotti, P. Catasti, F. Nozza, L.L.B. Bergotto, and C.A. Nicolini, Improvement of Protein Secondary Structure Prediction by Combination of Statistical Algorithms and Circular-Dichroism, Archives of Biochemistry and Biophysics 294(1) (1992), 107-114.
[59] W.C. Johnson, Protein Secondary Structure and Circular-Dichroism: a Practical Guide, Proteins 7(3) (1990), 205-214.
[60] M.C. Manning, Underlying Assumptions in the Estimation of Secondary Structure-Content in Proteins by Circular-Dichroism Spectroscopy: a Critical-Review, Journal of Pharmaceutical and Biomedical Analysis 7(10) (1989), 1103-1119.
[61] L. Menendez-Arias, J. Gomez-Gutierrez, M. Garcia-Ferrandez, A. Garcia-Tejedor, and F. Moran, A Basic Microcomputer Program to Calculate the Secondary Structure of Proteins From Their Circular-Dichroism Spectrum, Computer Applications in the Biosciences 4(4) (1988), 479-482.
[62] L.A. Compton, W.C. Johnson, Analysis of Protein Circular-Dichroism Spectra for Secondary Structure Using a Simple Matrix Multiplication, Analytical Biochemistry 155(1) (1986), 155-167.
[63] P. Manavalan, W.C. Johnson, and P.D. Johnston, Prediction Structure Type for Human-Leukocyte Interferon Subtype-a From Circular-Dichroism, FEBS Letters 175(2) (1984), 227-230.
[64] A. Lobley, L. Whitmore, and B.A. Wallace, Dichroweb: an Interactive Website for the Analysis of Protein Secondary Structure From Circular Dichroism Spectra, Bioinformatics 18(1) (2002), 211-212.
[65] N. Sreerama, S.Y. Venyaminov, and R.W. Woody, Analysis of Protein CD Spectra With a Reference Protein Set Based on Tertiary Structure Class, Biophysical Journal 80(1) (2001), 1342.
[66] A. Lobley, B.A. Wallace, Dichroweb: a Website for the Analysis of Protein Secondary Structure From Circular Dichroism Spectra, Biophysical Journal 80(1) (2001), 1570.
[67] N. Sreerama, S.Y. Venyaminov, and R.W. Woody, Estimation of Protein Secondary Structure From Circular Dichroism Spectra: Inclusion of Denatured Proteins With Native Proteins in the Analysis, Analytical Biochemistry 287(2) (2000), 243-251.
[68] N. Sreerama, R.W. Woody, Estimation of Protein Secondary Structure From Circular Dichroism Spectra: Comparison of CONTIN, SELCON, and CDSSTR Methods With an Expanded Reference Set, Analytical Biochemistry 287(2) (2000), 252-260.
[69] W.C. Johnson, Analyzing Protein Circular Dichroism Spectra for Accurate Secondary Structures, Proteins 35(3) (1999), 307-312.
[70] V. Sieber, F. Jurnak, and G.R. Moe, Circular-Dichroism of the Parallel Beta-Helical Proteins Pectate Lyase-C and Lyase-E, Proteins 23(1) (1995), 32-37.
[71] J.C. Sutherland, A. Emrick, L.L. France, D.C. Monteleone, and J. Trunk, Circular-Dichroism User Facility at the National Synchrotron Light-Source: Estimation of Protein Secondary Structure, Biotechniques 13(4) (1992), 588-590.

[72] A. Toumadje, S.W. Alcorn, and W.C. Johnson, Extending CD Spectra of Proteins to 168 nm Improves the Analysis for Secondary Structures, Analytical Biochemistry 200(2) (1992), 321-331.
[73] S.Y. Venyaminov, I.A. Baikalov, C.S.C. Wu, and J.T. Yang, Some Problems of CD Analyses of Protein Conformation, Analytical Biochemistry 198(2) (1991), 250-255.
[74] P. Pancoska, T.A. Keiderling, Systematic Comparison of Statistical-Analyses of Electronic and Vibrational Circular-Dichroism for Secondary Structure Prediction of Selected Proteins, Biochemistry 30(28) (1991), 6885-6895.
[75] W.C. Johnson, Secondary Structure of Proteins Through Circular-Dichroism Spectroscopy, Annual Review of Biophysics and Biophysical Chemistry 17 (1988), 145-166.
[76] J.T. Yang, C.S.C. Wu, and H.M. Martinez, Calculation of Protein Conformation From Circular-Dichroism, Methods in Enzymology 130 (1986), 208-269.
[77] M. Schnarr, J.C. Maurizot, Secondary Structure of the Lac Repressor Headpiece: Possibilities and Limitations of a Joint Infrared and Circular-Dichroism Study, European Journal of Biochemistry 128(2-3) (1982), 515-520.
[78] A. Perczel, Deconvolution of the circular dichroism spectra of proteins: the circular dichroism spectra of the antiparallel beta-sheet in proteins, Proteins 13(1) (1992), 57-69.
[79] C.C. Baker, On the analysis of circular dichroic spectra of proteins, Biochemistry 15(3) (1976), 629-634.
[80] A. Bobba, Estimation of protein secondary structure from circular dichroism spectra: a critical examination of the CONTIN program, Protein Seq Data Anal 3(1) (1990), 7-10.
[81] G. Willick, Equivalency of linear least squares curve fitting and reciprocal functions in protein circular dichroic spectra analysis, Biophys Chem 7(3) (1977), 223-227.
[82] R.G. Hammonds, Least-squares analysis of circular dichroic spectra of proteins, Eur J Biochem 74(2) (1977), 421-424.
[83] M. Bloemendal, Structural information on proteins from circular dichroism spectroscopy: possibilities and limitations, Pharm Biotechnol 7 (1995), 65-100.
[84] Chen et al., Determination of the helix and β-form of proteins in aqueous solution by circular dichroism, Biochemistry 13(16) (1974), 3350-3359.
[85] T.A. Keiderling, Protein and Peptide Secondary Structure and Conformational Determination with Vibrational Circular Dichroism, Current Opinion in Chemical Biology 6(5) (2002), 682-688.
[86] A. Barth, C. Zscherp, What Vibrations Tell Us About Proteins, Quarterly Reviews of Biophysics 35(4) (2002), 369-430.
[87] H. Susi, D.M. Byler, Resolution-Enhanced Fourier-Transform Infrared-Spectroscopy of Enzymes, Methods in Enzymology 130 (1986), 290-311.
[88] W.K. Surewicz, H.H. Mantsch, New insight into protein secondary structure from resolution-enhanced infrared spectra, Biochimica et Biophysica Acta 952 (1988), 115-130.
[89] M. Jackson, P.I. Haris, and D. Chapman, Fourier transform infrared spectroscopic studies of lipids polypeptides and proteins, Journal of Molecular Structure 214 (1989), 329-355.
[90] J.L.R. Arrondo, A. Muga, J. Castresana, and F.M. Goni, Quantitative Studies of the Structure of Proteins in Solution by Fourier-Transform Infrared-Spectroscopy, Progress in Biophysics & Molecular Biology 59(1) (1993), 23-56.
[91] M. Jackson, H.H. Mantsch, Biomembrane structure from FTIR spectroscopy, Spectrochim. Acta Rev. 15 (1993), 53-69.
[92] A. Barth, Infrared spectroscopy of proteins, Biochimica et Biophysica Acta 1767 (2007), 1073-1101.
[93] J. Bandekar, Amide modes and protein conformation, Biochimica et Biophysica Acta 1120 (1992), 123-143.
[94] T. Miyazawa, T. Shimanouchi, and S. Mizushima, Characteristic infrared bands of mono-substituted amides, J. Chem. Phys. 24 (1956), 408-418.
[95] T. Miyazawa, T. Shimanouchi, and S. Mizushima, Normal vibrations of N-methylacetamide, J. Chem. Phys. 29 (1958), 611-616.
[96] T. Miyazawa, The characteristic band of secondary amides at 3100 cm⁻¹, J. Mol. Spectrosc. 4 (1960), 168-172.
[97] T. Miyazawa, Internal rotation and low frequency spectra of ester, mono-substituted amides and polyglycine, Bull. Chem. Soc. Jpn. 34 (1961), 691-696.
[98] T. Miyazawa, Characteristic amide bands and conformations of polypeptides, Polyamino Acids, Polypeptides and Proteins, (M.A. Stahmann, ed.), University of Wisconsin Press, 1962, pp. 201-217.
[99] J. Jakes, S. Krimm, Valence force field for the amide group, Spectrochim. Acta Part A 27 (1971), 19-34.
[100] J. Jakes, S. Krimm, Normal coordinate analysis of molecules with the amide group, Spectrochim. Acta Part A 27 (1971), 35-63.


[101] Y. Abe, S. Krimm, Normal vibrations of crystalline polyglycine I, Biopolymers 11 (1972), 1817-1840.
[102] Y. Abe, S. Krimm, Normal vibrations of polyglycine II, Biopolymers 11 (1972), 1841-1853.
[103] S. Krimm, J. Bandekar, Vibrational spectroscopy and conformation of peptides, polypeptides and proteins, Adv. Protein Chem. 38 (1986), 181-364.
[104] T. Miyazawa, E.R. Blout, The infrared spectra of various polypeptides in various conformations: Amide II bands, J. Am. Chem. Soc. 83 (1961), 712-719.
[105] H. Susi, S.N. Timasheff, and L. Stevens, Infrared spectra and protein conformations in aqueous solutions: I The amide I band in H2O and D2O solution, J. Biol. Chem. 242 (1967), 5460-5466.
[106] P.I. Haris, D. Chapman, Does Fourier-Transform Infrared-Spectroscopy Provide Useful Information on Protein Structures, Trends in Biochemical Sciences 17(9) (1992), 328-333.
[107] M.S. Braiman, K.J. Rothschild, Fourier-Transform Infrared Techniques for Probing Membrane-Protein Structure, Annual Review of Biophysics and Biophysical Chemistry 17 (1988), 541-570.
[108] A. Dong, P. Huang, and W.S. Caughey, Protein secondary structure from second derivative amide I infrared spectra, Biochemistry 29 (1990), 3303-3308.
[109] D.C. Lee, P.I. Haris, D. Chapman, and R.C. Mitchell, Determination of Protein Secondary Structure Using Factor-Analysis of Infrared-Spectra, Biochemistry 29(39) (1990), 9185-9193.
[110] R.W. Sarver, W.C. Krueger, Protein secondary structure from Fourier transform infrared spectroscopy: A database analysis, Analytical Biochemistry 194 (1991), 89-100.
[111] L.K. Tamm, S.A. Tatulian, Infrared Spectroscopy of Proteins and Peptides in Lipid Bilayers, Quarterly Reviews of Biophysics 30(4) (1997), 365-429.
[112] M. Jackson, P.I. Haris, and D. Chapman, Conformational Transitions in Poly(L-Lysine) Studies Using Fourier-Transform Infrared-Spectroscopy, Biochimica et Biophysica Acta 998(1) (1989), 75-79.
[113] P.I. Haris, Fourier Transform Infrared Spectroscopic Studies of Peptides: Potentials and Pitfalls, ACS Symposium series, (B.R. Singh, ed.), American Chemical Society, 2000, pp. 54-95.
[114] R. Gilmanshin, S. Williams, R.H. Callender, W.H. Woodruff, and R.B. Dyer, Fast events in protein folding: Relaxation dynamics of secondary and tertiary structure in native apomyoglobin, Proceedings National Academy of Sciences USA, (Anonymous), USA: National Academy of Sciences, 1997, pp. 3709-3713.
[115] Y.N. Chirgadze, O.V. Fedorov, and N.P. Trushina, Estimation of amino acid residue side chain absorptions in infrared spectra of protein solutions in heavy water, Biopolymers 14 (1975), 679-694.
[116] S.Y. Venyaminov, N.N. Kalnin, Quantitative IR spectrophotometry of peptide compounds in water (H2O) solutions. I. Spectral parameters of amino acid residue absorption bands, Biopolymers 30 (1990), 1243-1257.
[117] F. Dousseau, M. Pezolet, Determination of the secondary structure content of proteins in aqueous solutions from their amide I and amide II infrared bands. Comparison between classical and partial least-squares methods, Biochemistry 29 (1990), 8771-8779.
[118] H. Torii, T. Tatsumi, T. Kanazawa, and M. Tasumi, Effects of Intermolecular Hydrogen-Bonding Interactions on the Amide I Mode of N-Methylacetamide: Matrix-Isolation Infrared Studies and ab Initio Molecular Orbital Calculations, 102(1) (1998), 309-314.
[119] T.F. Kumosinski, J.J. Unruh, Global-secondary-structure analysis of proteins in solution: Resolution-enhanced deconvolution Fourier transform infrared spectroscopy in water, Molecular Modeling: From virtual tools to real problems, in ACS Symposium Series, (T.F. Kumosinski, M.N. Liebman, eds.), American Chemical Society, 1994, pp. 71-98.
[120] K. Rahmelow, W. Huebner, Secondary structure determination of proteins in aqueous solution by infrared spectroscopy: A comparison of multivariate data analysis methods, Analytical Biochemistry 241(1) (1996), 5-13.
[121] S. Wi, P. Pancoska, and T.A. Keiderling, Predictions of protein secondary structures using factor analysis on Fourier transform infrared spectra: Effect of Fourier self-deconvolution of the amide I and amide II bands, Biospectroscopy 4(2) (1997), 93-106.
[122] K. Kaiden, T. Matsui, and S. Tanaka, A study of amide III band by FT-IR spectrometry of the secondary structure of albumin, myoglobin and γ-globulin, Applied Spectroscopy 41 (1987), 180-184.
[123] G. Anderle, R. Mendelsohn, Thermal denaturation of globular proteins: Fourier transform infrared studies of the amide III spectral region, Biophysical Journal 52 (1987), 69-74.
[124] B.R. Singh, D.B. DeOliveira, F. Fu, and M.P. Fuller, Fourier transform infrared analysis of amide III bands of proteins for the secondary-structure estimation [1890-11], Biomolecular spectroscopy III, in Proceedings SPIE the international society for optical engineering, (L.A. Nafie, H.H. Mantsch, eds.), SPIE, 1993, pp. 47-55.
[125] F.-N. Fu, D.B. DeOliveira, W.R. Trumble, and H.K. Sarkar, Secondary Structure Estimation of Proteins Using the Amide III Region of Fourier Transform Infrared Spectroscopy: Application to Analyze Calcium-Binding-Induced Structural Changes in Calsequestrin, 48(11) (1994), 1432.


[126] B.R. Singh, M.P. Fuller, and B.R. DasGupta, Botulinum neurotoxin type A: structure and interaction with the micellar concentration of SDS determined by FT-IR spectroscopy, J. Protein Chem. 10 (1991), 637-649.
[127] K. Griebenow, A.M. Klibanov, Lyophilization-induced reversible changes in the secondary structure of proteins, Proceedings National Academy of Sciences USA, (Anonymous), USA: National Academy of Sciences, 1995, pp. 10969-10976.
[128] K. Rahmelow, W. Huebner, and T. Ackermann, Infrared Absorbances of Protein Side Chains, Analytical Biochemistry 257(1) (1998), 1-11.
[129] M. Simonetti, C. Di Bello, New Fourier transform infrared based computational method for peptide secondary structure determination. I. Description of method, Biopolymers 62(2) (2001), 95-108.
[130] M. Simonetti, C. Di Bello, New Fourier transform infrared based computational method for peptide secondary structure determination. II. Application to study of peptide fragments reproducing processing site of ocytocin-neurophysin precursor, Biopolymers 62(2) (2001), 109-121.
[131] I. Noda, Generalized Two-Dimensional Correlation Method Applicable to Infrared, Raman, and Other Types of Spectroscopy, 47(9) (1993), 1329.
[132] J.A. Hering, P.R. Innocent, and P.I. Haris, Automatic Amide I frequency selection for rapid quantification of protein secondary structure from FTIR spectra of proteins, Proteomics 2(7) (2002), 839-849.
[133] C.B. Lucasius, M.L.M. Beckers, and G. Kateman, Genetic Algorithms in Wavelength Selection: a Comparative Study, Analytica Chimica Acta 286(2) (1994), 135-153.
[134] R. Leardi, Application of a Genetic Algorithm to Feature-Selection Under Full Validation Conditions and to Outlier Detection, Journal of Chemometrics 8(1) (1994), 65-79.
[135] D. Jouan-Rimbaud, D.L. Massart, R. Leardi, and O.E. de Noord, Genetic Algorithms as a Tool for Wavelength Selection in Multivariate Calibration, Analytical Chemistry 67(23) (1995), 4295-4301.
[136] J.M. Roger, V. Bellon-Maurel, Using Genetic Algorithms to Select Wavelengths in Near-Infrared Spectra: Application to Sugar Content Prediction in Cherries, Applied Spectroscopy 54(9) (2000), 1313-1320.
[137] A.S. Bangalore, R.E. Shaffer, G.W. Small, and M.A. Arnold, Genetic Algorithm-Based Method for Selecting Wavelengths and Model Size for Use With Partial Least-Squares Regression: Application to Near-Infrared Spectroscopy, Analytical Chemistry 68(23) (1996), 4200-4212.
[138] M.J. Arcos, M.C. Ortiz, B. Villahoz, and L.A. Sarabia, Genetic-Algorithm-Based Wavelength Selection in Multicomponent Spectrometric Determinations by PLS: Application on Indomethacin and Acemethacin Mixture, Analytica Chimica Acta 339(1-2) (1997), 63-77.
[139] B.M. Smith, L. Oswald, and S. Franzen, Single-Pass Attenuated Total Reflection Fourier Transform Infrared Spectroscopy for the Prediction of Protein Secondary Structure, Analytical Chemistry 74(14) (2002), 3386-3391.
[140] P.I. Haris, Characterization of protein structure and stability using Fourier transform infrared spectroscopy, Pharmacy and Pharmacology Communications 5(1) (1999), 15-25.
[141] P.I. Haris, D. Chapman, Analysis of polypeptide and protein structures using Fourier transform infrared spectroscopy, Microscopy, optical spectroscopy, and macroscopic techniques, in Methods in Molecular Biology, (C. Jones, B. Mulloy, and A.H. Thomas, eds.), Humana Press Inc., 1994, pp. 183-202.
[142] M. Jackson, H.H. Mantsch, The Use and Misuse of FTIR Spectroscopy in the Determination of Protein-Structure, Critical Reviews in Biochemistry and Molecular Biology 30(2) (1995), 95-120.
[143] B.R. Singh, Basic Aspects of the Technique and Applications of Infrared Spectroscopy of Peptides and Proteins, ACS Symposium series, (B.R. Singh, ed.), American Chemical Society, 2000, pp. 2-37.
[144] S. Krimm, Interpreting Infrared Spectra of Peptides and Proteins, in ACS Symposium series, (B.R. Singh, ed.), USA: Washington, DC; American Chemical Society, 2000, pp. 38-53.
[145] C. Jung, Insight into protein structure and protein-ligand recognition by Fourier transform infrared spectroscopy, Journal of Molecular Recognition 13(6) (2000), 325-351.
[146] E. Goormaghtigh, J.-M. Ruysschaert, and V. Raussens, Evaluation of the Information Content in Infrared Spectra for Protein Secondary Structure Determination, Biophysical Journal 90 (2006), 2946-2957.
[147] M. Ruegg, V. Metzger, and H. Susi, Computer analysis of characteristic infrared bands of globular proteins, Biopolymers 14 (1975), 1465-1471.
[148] J. Villalain, J.C. Gomez-Fernandez, M. Jackson, and D. Chapman, Fourier transform infrared spectroscopic studies on the secondary structure of the Ca2+-ATPase of sarcoplasmic reticulum, Biochimica et Biophysica Acta 978 (1989), 305-312.
[149] A. Dong, B. Kendrick, L. Kreilgard, J. Matsuura, M.C. Manning, and J.F. Carpenter, Spectroscopic Study of Secondary Structure and Thermal Denaturation of Recombinant Human Factor XIII in Aqueous Solution, Archives of Biochemistry and Biophysics 347(2) (1997), 213-220.


[150] A. Dong, J. Matsuura, M.C. Manning, and J.F. Carpenter, Intermolecular Beta-Sheet Results from Trifluoroethanol-Induced Nonnative Alpha-Helical Structure in Beta-Sheet Predominant Proteins: Infrared and Circular Dichroism Spectroscopic Study, Archives of Biochemistry and Biophysics 355(2) (1998), 275-281.
[151] J.L.R. Arrondo, F.M. Goni, Structure and dynamics of membrane proteins as studied by infrared spectroscopy, Progress in Biophysics & Molecular Biology 72 (1999), 367-405.
[152] K.G. Carrasquillo, C. Sanchez, and K. Griebenow, Relationship between conformational stability and lyophilization-induced structural changes in chymotrypsin, Biotechnol. App. Biochem. 31 (2000), 41-53.
[153] A. Troullier, D. Reinstädler, Y. Dupont, D. Naumann, and V. Forge, Transient non-native secondary structures during the refolding of alpha-lactalbumin detected by infrared spectroscopy, Nature Structural Biology 7(1) (2000), 78-86.
[154] M. Balsera, J.B. Arellano, J.R. Gutierrez, P. Heredia, J.L. Revuelta, and J. De Las Rivas, Structural Analysis of the PSBQ Protein of Photosystem II by Fourier Transform Infrared and Circular Dichroic Spectroscopy and by Bioinformatic Methods, Biochemistry 42(4) (2003), 1000-1007.
[155] W. Hübner, H.H. Mantsch, and H.L. Casal, Beware of frequency shifts, Applied Spectroscopy 44 (1990), 732-734.
[156] J.K. Kauppinen, D.J. Moffatt, H.H. Mantsch, and D.G. Cameron, Fourier transforms in the computation of self-deconvoluted and first-order derivative spectra of overlapped band contours, Analytical Chemistry 53 (1981), 1454-1457.
[157] J.K. Kauppinen, D.J. Moffatt, H.H. Mantsch, and D.G. Cameron, Fourier self-deconvolution: a method for resolving intrinsically overlapped bands, Applied Spectroscopy 35 (1981), 271-276.
[158] D.G. Cameron, D.J. Moffatt, Deconvolution, derivation and smoothing of spectra using Fourier transforms, Journal of Testing and Evaluation 12 (1984), 78-85.
[159] D.G. Cameron, D.J. Moffatt, A generalized approach to derivative spectroscopy, Applied Spectroscopy 41 (1987), 539-544.
[160] H.H. Mantsch, D.J. Moffatt, Computer-aided methods for the resolution enhancement of spectral data with special emphasis on infrared spectra, NATO ASI Series, Mathematical and Physical Sciences, (R. Fausto, ed.), USA: Kluwer Academic Publishers, 1993, pp. 113-124.
[161] P.R. Griffiths, G.L. Pariente, Introduction to spectral deconvolution, Trends Anal. Chem. 5 (1986), 209-215.
[162] H. Stone, Mathematical Resolution of Overlapping Spectral Lines, Journal of the Optical Society of America 52(9), 998-1003.
[163] J.K. Kauppinen, D.J. Moffatt, H.H. Mantsch, and D.G. Cameron, Smoothing of Spectral Data in the Fourier Domain, Applied Optics 21(10) (1982), 1866-1872.
[164] A. Savitzky, M.J.E. Golay, Smoothing and differentiation of data by simplified least squares procedures, Analytical Chemistry 36 (1964), 1627-1639.
[165] B.W. Caughey, A. Dong, K.S. Bhat, D. Ernst, S.F. Hayes, and W.S. Caughey, Secondary structure analysis of the scrapie-associated protein PrP 27-30 in water by infrared spectroscopy, Biochemistry 30 (1991), 7672-7678.
[166] A. Dong, W.S. Caughey, and T.W. Du Closs, Effects of calcium, magnesium and phosphorylcholine on secondary structures of human C-reactive protein and serum amyloid P component observed by infrared spectroscopy, Journal of Biological Chemistry 269 (1994), 6424-6430.
[167] W.K. Surewicz, H.H. Mantsch, and D. Chapman, Determination of Protein Secondary Structure by Fourier Transform Infrared Spectroscopy: A Critical Assessment, 32(2) (1993), 389-394.
[168] E. Goormaghtigh, V. Cabiaux, and J.M. Ruysschaert, Secondary structure and dosage of soluble and membrane proteins by attenuated total reflection Fourier-transform infrared spectroscopy on hydrated films, Eur. J. Biochem. 193 (1990), 409-420.
[169] T.F. Kumosinski, J.J. Unruh, Quantitation of the global secondary structure of globular proteins by FTIR spectroscopy: comparison with X-ray crystallographic structure, 43(2) (1996), 199-219.
[170] M. Levitt, J. Greer, Automatic Identification of Secondary Structure in Globular Proteins, Journal of Molecular Biology 114 (1977), 181-293.
[171] J.L. Lippert, D. Tyminski, and P.J. Desmeules, J. Am. Chem. Soc. 101 (1976), 5111-5121.
[172] B.M. Bussian, C. Sander, How to Determine Protein Secondary Structure in Solution by Raman-Spectroscopy: Practical Guide and Test Case DNase-I, Biochemistry 28(10) (1989), 4271-4277.
[173] R.W. Williams, A.K. Dunker, Determination of the Secondary Structure of Proteins From the Amide-I Band of the Laser Raman-Spectrum, Journal of Molecular Biology 152(4) (1981), 783-813.
[174] R.W. Williams, Estimation of Protein Secondary Structure From the Laser Raman Amide-I Spectrum, Journal of Molecular Biology 166(4) (1983), 581-603.


[175] M. Berjot, J. Marx, and A.J.P. Alix, Determination of the Secondary Structure of Proteins From the Raman Amide-I Band: the Reference Intensity Profiles Method, Journal of Raman Spectroscopy 18(4) (1987), 289-300.
[176] S.U. Sane, S.M. Cramer, and T.M. Przybycien, A Holistic Approach to Protein Secondary Structure Characterization Using Amide I Band Raman Spectroscopy, Analytical Biochemistry 269(2) (1999), 255-272.
[177] W. Kabsch, C. Sander, Dictionary of Protein Secondary Structure: Pattern-Recognition of Hydrogen-Bonded and Geometrical Features, Biopolymers 22(12) (1983), 2577-2637.
[178] Y.Q. Wu, K. Murayama, and Y. Ozaki, Two-dimensional infrared spectroscopy and principle component analysis studies of the secondary structure and kinetics of hydrogen-deuterium exchange of human serum albumin, Journal of Physical Chemistry B 105(26) (2001), 6251-6259.
[179] A. Filosa, Y. Wang, A. Ismail, and A.M. English, Two-dimensional infrared correlation spectroscopy as a probe of sequential events in the thermal unfolding of cytochromes c, Biochemistry 40(28) (2001), 8256-8263.
[180] L.A. Forato, R. Bernardes-Filho, and L.A. Colnago, Protein Structure in KBr Pellets by Infrared Spectroscopy, Analytical Biochemistry 259(1) (1998), 136-141.
[181] N. Colloch, C. Etchebest, E. Thoreau, and B. Henrissat, Comparison of three algorithms for the assignment of secondary structure in proteins: the advantages of a consensus assignment, Protein Engineering 6(4) (1993), 377-382.
[182] S. Leikin, V.A. Parsegian, W.-H. Yang, and G.E. Walrafen, Raman spectral evidence for hydration forces between collagen triple helices, in Proceedings National Academy of Sciences USA, (B. Mazur, K. Rubin, eds.), USA: National Academy of Sciences, 1997, pp. 11312-11317.
[183] H.H.J. De Jongh, E. Goormaghtigh, and J.-M. Ruysschaert, The Different Molar Absorptivities of the Secondary Structure Types in the Amide I Region: An Attenuated Total Reflection Infrared Study on Globular Proteins, Analytical Biochemistry 242(1) (1996), 95-103.
[184] G. Vedantham, H.G. Sparks, S.U. Sane, S. Tzannis, and T.M. Przybycien, A Holistic Approach for Protein Secondary Structure Estimation from Infrared Spectra in H2O Solutions, Analytical Biochemistry 285(1) (2000), 33-49.
[185] D. Frishman, P. Argos, Seventy-five percent accuracy in protein secondary structure prediction, Proteins: Structure, Function, and Genetics 27(3) (1997), 329-335.
[186] H.G. Sparks, PhD Thesis, Rensselaer Polytechnic Institute, (1999).
[187] K. Swingler, Applying Neural Networks: A practical guide, Academic Press, 1996.
[188] J. Moult, The Current State of the Art in Protein Structure Prediction, Current Opinion in Biotechnology 7(4) (1996), 422-427.
[189] B. Rost, S. O'Donoghue, Sisyphus and prediction of protein structure, CABIOS 13(4) (1997), 345-356.
[190] P. Baldi, S. Brunak, P. Frasconi, G. Soda, and G. Pollastri, Exploiting the past and the future in protein secondary structure prediction, Bioinformatics (Oxford) 15(11) (1999), 937-946.
[191] P. Baldi, S. Brunak, Y. Chauvin, C.A.F. Andersen, and H. Nielsen, Assessing the accuracy of prediction algorithms for classification: An overview, Bioinformatics (Oxford) 16(5) (2000), 412-424.
[192] K.A. Oberg, J.-M. Ruysschaert, and E. Goormaghtigh, The optimization of protein secondary structure determination with infrared and circular dichroism spectra, Eur. J. Biochem. 271 (2004), 2937-2948.
[193] V. Cabiaux, K.A. Oberg, P. Pancoska, T. Walz, P. Agre, and A. Engel, Secondary structures comparison of aquaporin-1 and bacteriorhodopsin: A Fourier transform infrared spectroscopy study of two-dimensional membrane crystals, Biophysical Journal 73(1) (1997), 406-417.
[194] M. Severcan, F. Severcan, and P.I. Haris, Estimation of Protein Secondary Structure From FTIR Spectra Using Neural Networks, Journal of Molecular Structure 565 (2001), 383-387.
[195] J.A. Hering, P.R. Innocent, and P.I. Haris, An improved method for rapid quantification of protein secondary structure from FTIR spectra of proteins, 2001 Congress Functional Proteomics, in Proceedings of the Swiss Proteomics Society, (P.M. Palagi, J.-C. Sanchez, and R. Stöcklin, eds.), Fontis Media, 2001, pp. 128-132.
[196] J.A. Hering, P.R. Innocent, and P.I. Haris, An alternative method for rapid quantification of protein secondary structure from FTIR spectra using neural networks, Spectroscopy 16(2) (2002), 53-69.
[197] J.A. Hering, P.R. Innocent, and P.I. Haris, New approaches for quantification of protein secondary structure from FTIR spectra of proteins, 2002 Congress Applied Proteomics, in Proceedings of the Swiss Proteomics Society, (P.M. Palagi, M. Quadroni, J.S. Rossier, J.-C. Sanchez, and R. Stöcklin, eds.), Fontis Media, 2002, pp. 163-165.
[198] J.A. Hering, P.R. Innocent, and P.I. Haris, Neuro-Fuzzy SCOP classification for improved protein secondary structure prediction, Proteomics 3(8) (2003), 1464-1475.

166

J.A. Hering and P.I. Haris / FTIR Spectroscopy for Analysis of Protein Secondary Structure

[199] J.L. McClelland, and D.E. Rumelhart, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, in Volume 2: Psychological and Biological Models, MIT Press, 87. [200] M. Riedmiller, H. Braun, A direct adaptive method for faster backpropagation learning: The RPROP algorithm, IEEE International Conference on Neural Networks (ICNN-93), (H. Ruspini, ed.), IEEE, San Francisco, USA, 1993, pp. 586-591. [201] M. Riedmiller, Advanced Supervised Learning in Multi-layer Perceptrons From Backpropagation to Adaptive Learning Algorithms, International Journal of Computer Standards and Interfaces, Special Issue on Neural Networks 5 (1994), 8. [202] M. Riedmiller, Untersuchungen zur Konvergenz und Generalisierungsfhigkeit berwachter Lernverfahren mit dem SNNS, Workshop SNNS-93: Simulation Neuronaler Netze mit SNNS, (Anonymous), Universitt Stuttgart, Fakultt Informatik, 1993, pp. 107-116. [203] J.A. Hering, P.R. Innocent, and P.I. Haris, Beyond average protein secondary structure content prediction using FTIR spectroscopy, Accepted for publication in Applied Bioinformatics . [204] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, and P.E. Bourne, The Protein Data Bank, Nucleic Acids Research 28(1) (2000), 235-242. [205] A.G. Murzin, S.E. Brenner, T. Hubbard, and C. Chothia, SCOP: A Structural Classification of Proteins Database for the Investigation of Sequences and Structures, Journal of Molecular Biology 247(4) (1995), 536-540. [206] L. Lo Conte, B. Ailey, T.J. Hubbard, S.E. Brenner, A.G. Murzin, and C. Chothia, SCOP: A Structural Classification of Proteins database, Nucleic Acids Research 28(1) (2000), 257-259. [207] E.B. Baum, D. Haussler, What size net gives valid generalisation?, Neural computing 1 (1989), 151-160. [208] S. Haykin, Neural Networks, A comprehensive Foundation, Macmillan College Publishing Company, Inc., 94, 176-177. [209] R. Lange, R. 
Mnner, Quantifying a critical training set size for generalisation and overfitting using teacher neural networks, Proceedings of the International Conference on Artificial Neural Networks (ICANN), (P. Morasso, M. Marinaro, eds.), 1994, pp. 497-500. [210] C. Klawun, and C.L. Wilkins, Optimization of functional group prediction from infrared spectra using neural networks, Journal of Chemical Information and Computer Sciences 36(1) (1996), 69-81. [211] V. Tchistiakov, C. Ruckebusch, L. Duponchel, J.P. Huvenne, and P. Legrand, Neural network modelling for very small spectral data sets: reduction of the spectra and hierarchical approach, Chemometrics and intelligent laboratory systems 54(2) (2000), 93-106. [212] K. Esbensen, S. Schnkopf, and T. Midtgaard, Multivariate Analysis in Practice, Computer-Aided Modelling AS, 94. [213] J.A. Hering, P.R. Innocent, and P.I. Haris, Towards developing a protein infrared spectra databank (PISD) for Proteomics research, Accepted for publication in Proteomics . [214] Y.N. Chirgadze, B.V. Shestopalov, and S.Y. Venyaminov, Biopolymers 12 (1973), 1337-1351. [215] Y.N. Chirgadze, E.V. Brazhnikov, Biopolymers 13 (1974), 1701-1712. [216] F. Despagne, D.L. Massart, Neural Networks in Multivariate Calibration, Analyst 123(11) (1998), 157-178. [217] K. Hornik, M. Stinchcombe, and H. White, Multilayer Feedforward Networks Are Universal Approximators, Neural Networks 2(5) (1989), 359-366. [218] S.R. Amendolia, A. Doppiu, M.L. Ganadu, and G. Lubinu, Classification and Quantitation of H-1 Nmr Spectra of Alditols Binary Mixtures Using Artificial Neural Networks, Analytical Chemistry 70(7) (1998), 1249-1254. [219] R. Goodacre, Use of Pyrolysis Mass Spectrometry With Supervised Learning for the Assessment of the Adulteration of Milk of Different Species, Applied Spectroscopy 51(8) (1997), 1144-1153. [220] R. Goodacre, M.J. Neal, and D.B. 
Kell, Rapid and Quantitative-Analysis of the Pyrolysis MassSpectra of Complex Binary and Tertiary Mixtures Using Multivariate Calibration and Artificial Neural Networks, Analytical Chemistry 66(7) ( 1994), 1070-1085. [221] J.R. Long, V.G. Gregoriou, and P.J. Gemperline, Spectroscopic Calibration and Quantitation Using Artificial Neural Networks, Analytical Chemistry 62(17) (1990), 1791-1797. [222] S. Biswas, S. Venkatesh, The devil and the network: what sparsity implies to robustness and memory, Advances in Neural Information Processing Systems, (R.P. Lippman, J.E. Moody, and D.S. Touretzky, eds.), 1991, pp. 883-889. [223] B. Hitzmann, A. Ritzka, R. Ulber, T. Scheper, and K. Schugerl, Computational Neural Networks for the Evaluation of Biosensor Fia Measurements , Analytica Chimica Acta 348(1-3) (1997), 135-141. [224] C. Borggaard, H.H. Thodberg, Optimal Minimal Neural Interpretation of Spectra, Analytical Chemistry 64(5) (1992), 545-551.

J.A. Hering and P.I. Haris / FTIR Spectroscopy for Analysis of Protein Secondary Structure

167

[225] J. Naes, K. Kvaal, T. Isaksson, and C. Miller, Artificial neural networks in multivariate calibration, Journal of Near Infrared Spectroscopy 1 (1993), 1-12 . [226] K. Rahmelow, W. Huebner, Fourier Self-Deconvolution: Parameter Determination and Analytical Band Shapes, 50(6) (1996), 795-804. [227] E. Goormaghtigh, V. Cabiaux, and J.-M. Ruysschaert, Determination of Soluble and Membrane Protein Structure by Fourier Transform Infrared Spectroscopy: III. Secondary Structures, in Subcellular Biochemistry, (H.J. Hilderson, G.B. Ralston, eds.), USA: Plenum Publishing Corporation, 1994, pp. 405-450. [228] D. Frishman, P. Argos, Knowledge-Based Protein Secondary Structure Assignment, Proteins 23(4) (1995), 566-579. [229] E. Goormaghtigh, J.M. Ruysschaert, Molecular Description of Biological Components by Computer Aided Conformational Analysis, (R. Brasseur, ed.), CRC Press, 1998, pp. 285-329. [230] J.L.R. Arrondo, F.M. Goni, Protein-Lipid Interactions, (A. Watts, ed.), Elsevier, 1993, pp. 321-349. [231] H.H. Mantsch, A. Perczel, M. Hollosi, and G.D. Fasman, Characterization of a-Turns in Cyclic Hexapeptides in Solution by Fourier Transform IR Spectroscopy, Biopolymers 33(2) (1993), 201-207. [232] A. Dong, W.S. Caughey, Infrared Methods for Study of Hemoglobin Reactions and Structures, in Methods in Enzymology, (J. Everse, K.D. Vandegriff, and R.M. Winslow, eds.), USA: Academic Press Inc Ltd, 1994, pp. 139-175.

168

Biological and Biomedical Infrared Spectroscopy
A. Barth and P.I. Haris (Eds.)
IOS Press, 2009
© 2009 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-60750-045-2-168

Infrared Spectroscopy of Protein Pharmaceuticals


Marco VAN DE WEERT¹ and Lene JORGENSEN
Biomacromolecular Drug Delivery, Department of Pharmaceutics and Analytical Chemistry, Faculty of Pharmaceutical Sciences, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen, Denmark

1. Introduction

In the last few decades, the pharmaceutical industry has seen a rapid rise in the number of drug products containing a protein as the active compound. At present, about one third of new products contain a protein as the active compound [1]. These pharmaceutical proteins are mainly used for life-threatening and/or chronic diseases, such as hepatitis, diabetes, cancer, growth disorders, and anemia.

Protein pharmaceuticals introduce several challenges for the pharmaceutical scientist [2]. First of all, they are expensive to manufacture at the required high purity. Second, they are very poorly absorbed through biological membranes, making injection or infusion the most practical method of administration. This is not only inconvenient for the patient, but also puts severe restrictions on the additives in the formulation. Finally, proteins are highly complex molecules, and prone to a variety of physicochemical degradation processes. The regulatory agencies do not accept the presence of a significant amount of degradation products, and require that the protein itself, as well as the most important degradation products, be thoroughly characterised.

The characterisation of a protein requires the use of several techniques that give complementary information. The analytical toolbox for protein characterisation is still expanding, and consists of a wide variety of techniques. The interested reader is referred to the book Methods for structural analysis of protein pharmaceuticals for a description of several methods used for protein pharmaceuticals [3].

One of the methods in this set of techniques is Fourier transform infrared spectroscopy (FTIR). Until relatively recently, FTIR was only sparsely used to characterise protein pharmaceuticals. Most of the early products were formulated as aqueous solutions with low protein concentrations (<1 mg/ml), which precludes the use of FTIR. However, recent years have seen an increasing focus on developing high-concentration protein pharmaceuticals, especially monoclonal antibodies, where concentrations well above 100 mg/ml may be the intended target [4]. Many analytical techniques are not capable of handling such high concentrations, while the quality of FTIR spectra actually improves at high concentrations. In addition, in the 1990s a correlation between protein FTIR spectra in the solid state and their storage stability was found [5], as will be discussed further below.
¹ Corresponding Author: Marco Van De Weert, Biomacromolecular Drug Delivery, Department of Pharmaceutics and Analytical Chemistry, Faculty of Pharmaceutical Sciences, University of Copenhagen, Universitetsparken 2, 2100 Copenhagen, Denmark.

M. van de Weert and L. Jorgensen / Infrared Spectroscopy of Protein Pharmaceuticals


[Figure 1 plot: x-axis Wavenumber (cm⁻¹), 1700 to 1600]


Figure 1. Example of an area overlap calculation using normalised and inverted second derivative spectra of native lysozyme (solid line) and heat-denatured lysozyme (dashed line). The grey area indicates the area overlap between the two spectra, which amounts to 55% of the normalised area of the two spectra.

Finally, there is an increasing focus on developing sustained release systems for protein pharmaceuticals. Many of these systems scatter radiation to such an extent that most spectroscopic techniques are not able to characterise the protein inside these systems. In this case, too, FTIR has shown its value [6–9]. As a result, FTIR is now a well-accepted analytical technique in the pharmaceutical industry for analysis of protein pharmaceuticals. In this chapter, we will only discuss the analysis of pharmaceutical proteins in the solid state, in sustained release systems, and in a more complex mixture of additives. Analysis of protein pharmaceuticals in solution does not differ from analysis of any protein in solution, and has been discussed in detail in Chapter 5.

2. Methodology

For an in-depth description of the methodology used to analyse protein structure, the interested reader is referred to Chapters 4, 5 and 6. These methods are highly useful to obtain global structural information about the protein. Often, however, the pharmaceutical scientist is less interested in the exact secondary structure of the protein than in monitoring deviations from the native state. It is these deviations from the native state that often indicate physicochemical degradation of the protein, which usually is undesirable when developing a protein pharmaceutical. Thus, the pharmaceutical scientist is mostly interested in spectral similarity between the native protein and the protein in its formulation.

Various methods have been developed to determine spectral similarity. Perhaps the most useful and practical method is the area overlap method [10]. In this method, the extent of area overlap between two normalised spectra is calculated (Fig. 1). Both raw and resolution-enhanced spectra can be used for this purpose. However, a main advantage of using second derivative spectra is the elimination of spectral distortions, such as baseline slope and band broadening due to scattering. Theoretically, the area overlap may range from 0 to 1 (or 0% to 100%), but in practice the lower limit is around 0.5 (50%) due to band overlap of the various secondary structural elements.
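The area overlap calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of reference [10]: it assumes both spectra are sampled on the same wavenumber grid, clips negative intensities to zero before normalising, and uses the hypothetical function names `area_normalise` and `area_overlap`. The Gaussian bands at the end are synthetic stand-ins for measured spectra.

```python
import numpy as np


def _trapezoid_area(y, x):
    """Trapezoidal area under y(x); works for ascending or descending x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return abs(float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))))


def area_normalise(y, x):
    """Scale a spectrum to unit area (assumption: negative lobes are clipped to zero)."""
    y = np.clip(np.asarray(y, dtype=float), 0.0, None)
    area = _trapezoid_area(y, x)
    if area == 0.0:
        raise ValueError("spectrum has no positive area in this window")
    return y / area


def area_overlap(y_ref, y_sample, x):
    """Fraction (0 to 1) of area shared by two area-normalised spectra."""
    a = area_normalise(y_ref, x)
    b = area_normalise(y_sample, x)
    # The shared area is the area under the pointwise minimum of the two curves.
    return _trapezoid_area(np.minimum(a, b), x)


def gaussian_band(x, centre, width):
    """Synthetic band used only for the demonstration below."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)


# Hypothetical amide I window, 1700 to 1600 cm^-1 (descending, as spectra are usually plotted).
wavenumber = np.linspace(1700.0, 1600.0, 401)
native = gaussian_band(wavenumber, 1654.0, 8.0)
stressed = (0.5 * gaussian_band(wavenumber, 1654.0, 8.0)
            + 0.5 * gaussian_band(wavenumber, 1625.0, 6.0))

similarity = area_overlap(native, stressed, wavenumber)
print(f"similarity: {100 * similarity:.1f}%")
```

Because each spectrum is scaled to unit area first, the result in this sketch does not depend on the overall intensity of either spectrum; only band shape and position contribute.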


The example in Fig. 1 shows the area overlap between the area-normalised (inverted) second derivative spectra of native lysozyme and a heat-denatured sample. The area overlap, or similarity, between the two spectra is 55%. Visual inspection of the spectrum shows a major increase in intermolecular β-sheets mainly at the expense of α-helical structure, as is common upon heat denaturation [11]. It should be noted that the changes in the spectra upon heat denaturation will be smaller for β-sheet proteins. This is due to the smaller spectral change from intramolecular to intermolecular β-sheet, as compared to the change from α-helix to intermolecular β-sheet. Thus, the values obtained from the area overlap method should only be compared for a single protein, and not between proteins.
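Figure 1 compares inverted second derivative spectra. As a rough sketch of why that form suppresses a sloping baseline, the snippet below differentiates a synthetic band twice with NumPy finite differences. The function name `inverted_second_derivative` and the band parameters are illustrative assumptions; measured spectra are noisy and are normally smoothed before or during differentiation, a step this sketch omits.

```python
import numpy as np


def inverted_second_derivative(absorbance, wavenumber):
    """Return -d2A/dv2, estimated by applying finite differences twice."""
    first = np.gradient(np.asarray(absorbance, dtype=float), wavenumber)
    second = np.gradient(first, wavenumber)
    # Inverting the sign turns absorption maxima into positive peaks.
    return -second


wavenumber = np.linspace(1700.0, 1600.0, 201)              # cm^-1, 0.5 cm^-1 steps
band = np.exp(-0.5 * ((wavenumber - 1654.0) / 8.0) ** 2)   # synthetic band
baseline = 0.002 * wavenumber + 0.1                        # sloping linear baseline

clean = inverted_second_derivative(band, wavenumber)
tilted = inverted_second_derivative(band + baseline, wavenumber)

# A linear baseline has zero second derivative, so both results should agree
# to within floating-point rounding.
print("largest difference:", float(np.max(np.abs(clean - tilted))))
print("peak position (cm^-1):", float(wavenumber[np.argmax(clean)]))
```

The inverted second derivative still peaks at the centre of the synthetic band, so band positions are preserved while the linear offset drops out.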

3. Applications

3.1. Solid Protein Products: Freeze-Drying

Many protein pharmaceuticals are insufficiently stable in solution to be stored for prolonged periods of time. Generally, a protein pharmaceutical should be stable for at least 2 years, and only very limited physicochemical degradation is allowed. Since many of the degradation processes involve or are catalysed by water, the pharmaceutical industry is required to dry these formulations to slow these processes. For example, aggregation is significantly inhibited in the solid state due to the limited mobility, while deamidation requires the presence of water molecules, which are abundant in solution [12]. An important aspect of this drying is the removal of water molecules, including those that are tightly bound to the protein.

The most common drying procedure is freeze-drying, but alternative methods such as spray-freeze-drying and spray-drying are receiving increased attention. These drying procedures do impose stress forces on the protein, such as potential cold and heat denaturation, introduction of various interfaces (ice-water, water-air), increased concentration of the protein and other components in the formulation, and the removal of hydration water [13]. Thus, the drying process must be designed carefully to assure long-term storage stability of the dried formulation.

A significant body of research is available on the freeze-drying of protein pharmaceuticals. It has been shown that a careful design of the process is an absolute requirement [13,14]. That is, a simple freeze-and-vacuum-dry process often results in significant protein degradation, a solid material that is difficult to redisperse, and/or an aesthetically unacceptable solid material.
A proper freeze-drying process usually involves a freezing phase, sometimes including an annealing step to increase the size of the ice crystals; a primary drying phase under vacuum to remove most of the ice crystals; and a secondary drying phase, in which the temperature of the sample is increased to rapidly remove the remaining ice as well as most of the tightly bound water molecules. This type of freeze-drying process may take days, and much time can be saved by prior knowledge of fundamental physicochemical parameters, in particular the glass transition (Tg) or eutectic temperature of the liquid formulation [14].

Apart from a carefully designed process, most proteins also require the presence of so-called lyoprotectants, such as a non-reducing carbohydrate, to obtain a stable product. Initially it was thought that these lyoprotectants protected the protein by forming a glassy state with high glass transition temperature (Tg) upon drying, thus reducing potential mobility of the protein. However, highly dried proteins themselves can have very high Tg values (> 100 °C) [15], and several compounds with a high Tg do not stabilise proteins as efficiently as those with lower Tg [13].

A potential answer to the function of lyoprotectants has come from FTIR analysis. When comparing the protein structure before and after freeze-drying, many have found a negative correlation between the extent of the changes and the (storage) stability of the protein in the freeze-dried formulation. That is, a lyoprotectant that is capable of reducing the extent of the spectral changes usually increases the stability of the protein in the formulation. Figure 2 contains a few literature examples, where we have plotted the level of spectral similarity of native and freeze-dried protein against the degradation rate of the freeze-dried protein. Unfortunately, such literature examples are very limited, and many articles merely mention that stability is better for freeze-dried proteins whose FTIR spectrum more closely resembles that of the native protein.

[Figure 2 plots. Panel A: x-axis Similarity (%), 55 to 80; y-axis K. Panel B: x-axis Similarity (%), 85 to 95; y-axis K.]

Figure 2. Correlation between spectral similarity of proteins in the native state and in a freeze-dried formulation, and the observed degradation rate upon storage in the solid state. (A) Rate constant of loss of monomeric growth hormone versus spectral similarity [16]. (B) Aggregation rate constant of a monoclonal antibody versus spectral similarity [17].

In Fig. 2A, we have plotted the spectral similarity versus the observed rate constant of loss of monomeric human growth hormone, as reported by Costantino et al. [16]. The authors reported only the percentage α-helix, β-sheet and unordered structure, and we calculated the average percentage similarity to the native state of these three structures. Generally, there is a trend that a higher similarity to the native state yields a better stability. However, these data were obtained using different stabilisers, each with their own glass transition and tendency to crystallise. Thus, these data do not rule out other explanations for the increased stability at higher spectral similarity. For example, some of the samples with the lowest spectral similarity contained polyalcohols with very low glass transition temperatures, or polyalcohols that tend to crystallise.
Figure 2B shows a plot of spectral similarity versus the observed degradation (aggregation) rate of a freeze-dried IgG1 antibody [17]. In this case, all samples contain the same stabiliser (sucrose). It is clearly visible that the degradation rate is smaller if the structure more closely resembles that of the native protein. However, the increased retention of native structure also correlates with the relative amount of sucrose in the sample. That is, the reduced aggregation rate may also be caused by a more pronounced dilution of the protein within the freeze-dried sample.

A correlation between spectral change and (storage) stability can be explained by assuming that the spectral change is mainly due to a structural change. Generally, any structural change of a protein may expose otherwise buried hydrophobic patches. Even within a solid with high Tg, molecular motions are still possible, and partially unfolded proteins in the freeze-dried formulation may thus interact and slowly aggregate. Moreover, the protein may be more extended and expose more sites susceptible to chemical degradation. Over time, the formulation containing partly unfolded protein is then more prone to physicochemical degradation. Conversely, a freeze-dried formulation containing the protein in its (near-)native state is less likely to degrade.

The apparent correlation means that FTIR can be used to decrease the development time of a stable freeze-dried product. Those formulations in which the protein spectrum in the solid state differs significantly from that in the native state are more likely to have a poor long-term stability, compared to those where the spectra are more similar. Thus, those formulations and freeze-drying protocols that are least likely to give a stable product can be rapidly identified and eliminated, rather than waiting long periods of time, sometimes months, until degradation may be apparent after rehydrating the freeze-dried product. Ultimately, however, long-term stability studies will be required to show that the chosen formulations are good enough to yield a two-year stable product.

[Figure 3 plot: x-axis Wavenumber (cm⁻¹), 1700 to 1600]

Figure 3. Normalised inverted second derivative spectra of glucagon and PEGylated glucagon. Glucagon in solution (solid line), glucagon freeze-dried from a water/acetonitrile mixture (dashed line), and freeze-dried PEGylated glucagon (5 kDa PEG chain) from a water/acetonitrile mixture (dotted line) [18].

FTIR has also been used by Stigsnaes et al. [18] to study the effect of PEGylation on the processing stability of the model peptide glucagon. PEGylation is a common process for many peptide and protein pharmaceuticals, and involves covalent attachment of polyethyleneglycol to selected amino acids in the protein chain. This chemical modification increases circulation times by factors of up to 100, increases solubility, and decreases potential immune responses to the protein [19]. The study by Stigsnaes et al.
[18] showed that PEGylation of glucagon also results in a peptide that is less sensitive to the stresses incurred by the freeze-drying process, as evidenced by the reduction in spectral changes (Fig. 3).

3.2. Solid Protein Pharmaceuticals: Spray-Drying

FTIR is not only used in the analysis of freeze-dried proteins, but also of dried formulations obtained through other means, such as spray-drying. Although research within this area is still limited, the available literature indicates a similar correlation between spectral changes and (storage) stability [20–27]. In studies on spray-dried calcitonin formulations containing chitosan, increasing protein spectral changes were observed upon increasing the amount of chitosan in the formulation (Fig. 4) [28]. Subsequent rehydration of these chitosan-containing formulations indicated a reduced recovery of the protein, correlating with the reduction in area overlap compared to the sample spray-dried in the absence of chitosan (Fig. 5). The recovery was higher in the acetate buffer, since calcitonin fibrils will dissolve in this buffer, but not in phosphate buffer [28].

[Figure 4 plot: x-axis Wavenumber (cm⁻¹), 1700 to 1600]

Figure 4. FTIR spectra of spray-dried calcitonin formulations with varying weight ratios of calcitonin:chitosan:mannitol. 1:0:18 (solid squares), 1:1:18 (open squares), 1:3:16 (solid circles), 1:4:15 (open circles). The arrow shows the decrease in α-helical structure upon increasing the amount of chitosan [28].

[Figure 5 plot: x-axis Similarity (%), 85 to 100; y-axis Recovery (%), 80 to 100]

Figure 5. Correlation between spectral similarity of the spectra in Fig. 4 with the sample without chitosan and recovery upon rehydration in phosphate buffer pH 7.4 (squares) or acetate buffer pH 4.4 (triangles) [28].

The observed negative influence of chitosan is surprising, since no interaction between calcitonin and chitosan was observed in solution. Currently, no satisfactory explanation is available for the reduced recovery [28]. These studies thus call for a more critical approach towards the use of chitosan in formulations containing proteins, as many such formulations will ultimately be dried. Similar unwanted interactions may occur in formulations containing other proteins.

3.3. Solid Protein Pharmaceuticals: Sustained Release Systems

Almost all protein formulations are administered through injection or by infusion, which introduces a significant problem in terms of patient compliance. Over the last few decades an enormous effort has been aimed at developing so-called sustained, or controlled, release systems for proteins. Slow, sustained release of the protein from such a system significantly prolongs therapeutic concentrations in the bloodstream, minimising the number of injections required to sustain these concentrations. Moreover, a properly designed system would decrease the fluctuations in blood concentration, which may increase the therapeutic efficacy of the protein drug.

The major challenges in designing a proper sustained release system are to achieve appropriate release kinetics, and to assure protein stability within the system. The latter imposes a significant analytical challenge, since these systems are designed to entrap the protein. Any method to break down the system may have a pronounced effect on the protein structure, making it impossible to identify whether the protein was already altered in the system itself, or affected by the extraction procedure. Once again, FTIR allows analysis of the protein structure while it is still entrapped in the system, and thus (partly) resolves this question [6–9].

[Figure 6 plot: x-axis Wavenumber (cm⁻¹), 1710 to 1590]

Figure 6. Normalised inverted second derivative spectra of insulin entrapped in alginate-chitosan nanoparticles. Different weight ratios of alginate-chitosan were used, resulting in very similar spectra (black lines). A control spectrum of insulin in solution is also shown (dashed line) [29].

Figure 6 shows an example of spectral analysis of an entrapped protein [29], and contains FTIR spectra of insulin entrapped in nanoparticles prepared from mixtures of alginate and chitosan. These oppositely charged polymers interact through electrostatic interactions, forming complex coacervates of about 800 to 1700 nm in diameter.
Only limited structural changes of the protein, insulin, are observed upon entrapment, suggesting that the interaction and entrapment procedure do not have a major detrimental effect on protein structure. Moreover, the released protein also has the same structure as the native protein (not shown). Whether this also means that the protein is stable for prolonged periods of time within this matrix needs to be investigated further. Especially when these particles need to be (freeze-)dried, there is some concern as to the possible negative effects of chitosan, as described above for calcitonin.

A number of studies have focused on entrapment into poly(lactide-co-glycolide) (PLGA) microspheres. Here, various spectral changes have been observed, including the formation of intermolecular β-sheets [8]. The latter structure typically results in (partially) non-reversible protein aggregates. Thus, this reduces the content of active protein, and may ultimately result in the release of aggregated protein into the circulation. Aggregates are thought to be one of the main causes of the unwanted immune response to protein pharmaceuticals [30], and their level must be kept as low as possible.

Interestingly, the presence of a large amount of structurally altered lysozyme inside a polymeric matrix composed of a poly(ether ester) has also been observed using FTIR [31]. The spectra indicated that the protein had formed non-covalent aggregates inside the matrix, as evidenced by large absorption bands around 1625 and 1695 cm⁻¹. However, the protein was released in its native and fully active form. A possible explanation for this surprising observation is the formation of rather loose aggregates, which easily dissociate upon hydration. Moreover, the matrix may act as a template for refolding. However, in the pharmaceutical industry, and for the regulatory agencies in particular, the presence of these aggregates inside the matrix would raise significant concerns. It is likely that these aggregates become less reversible upon long-term storage, and they may be more prone to other types of degradation.

3.4. Controlled Release Systems: Emulsions

Proteins may also be entrapped in emulsions. Such emulsions scatter light, and are, by design, difficult to break into the original two phases. Thus, analysis of protein structure in such emulsions is very difficult, although not impossible [32].
Here also, FTIR plays an important role. The structure of proteins entrapped inside these emulsions is often different from that in solution. For some proteins a decrease in the α-helical content is observed, for others formation of (intermolecular) β-sheet is apparent, as is the case for growth hormone (Fig. 7A) [33]. For yet other proteins only slight alterations may be observed, exemplified using glucagon in Fig. 7B. Formation of intermolecular β-sheets, at the expense of α-helix, has also been observed for β-lactoglobulin adsorbed to the oil-water interface [34–36]. Thus, this structural rearrangement appears to be common for several proteins, and is a potential concern when using emulsions for drug delivery purposes. The appearance of intermolecular β-sheets suggests aggregate formation, which is highly undesirable for protein pharmaceuticals.

Establishment of the structure of the released protein, its activity upon release, and whether all protein is released will have great influence on the choice of delivery system. Protein release from the water-in-oil emulsions is possible, as has been shown by in vitro release of aprotinin (up to 2% released) and insulin aspart (up to 30% released) [37–39]. The amount of insulin aspart released can be modified by altering the osmotic gradient from the internal aqueous phase to the release medium. Nevertheless, only ~30% insulin aspart is released from the emulsion, and retention of activity is not certain [39]. The activity of aprotinin after extraction from the water-in-oil emulsion was about 87%, indicating that the structure of aprotinin is not irreversibly altered by the incorporation into the emulsion [37]. It would thus appear that for some proteins an emulsion may be a suitable drug delivery system.

M. van de Weert and L. Jorgensen / Infrared Spectroscopy of Protein Pharmaceuticals

[Figure 7: two panels, (A) and (B). Horizontal axis of each panel: Wavenumber (cm⁻¹), shown from 1700 to 1600.]

Figure 7. Normalised inverted second derivative spectra of native protein (solid line) and protein entrapped in emulsions (dotted line). (A) Growth hormone; (B) Glucagon.
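
The inverted second-derivative representation used in Figure 7 can be illustrated with a short sketch. This is not the authors' processing procedure: the example builds a synthetic two-band amide I envelope (the band positions of 1654 and 1625 cm⁻¹, the widths and the heights are all assumed values), takes a plain central-difference second derivative, inverts it and normalises it. Experimental spectra are usually smoothed in the same step, which is omitted here.

```python
import math

def gaussian(x, center, height, fwhm):
    """Gaussian band with the given centre, height and full width at half maximum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return height * math.exp(-0.5 * ((x - center) / sigma) ** 2)

def inverted_second_derivative(y, step):
    """Central-difference second derivative, sign-inverted and scaled to a maximum of 1."""
    d2 = [-(y[i + 1] - 2.0 * y[i] + y[i - 1]) / step ** 2 for i in range(1, len(y) - 1)]
    peak = max(d2)
    return [value / peak for value in d2]

# Synthetic amide I envelope (all band parameters are assumed, for illustration only)
step = 1.0                                               # cm^-1
wavenumbers = [1600.0 + step * i for i in range(101)]    # 1600 to 1700 cm^-1
absorbance = [gaussian(x, 1654.0, 1.0, 30.0) + gaussian(x, 1625.0, 0.6, 20.0)
              for x in wavenumbers]

inv_d2 = inverted_second_derivative(absorbance, step)
inner_axis = wavenumbers[1:-1]                           # the derivative loses both end points

# Local maxima of the inverted second derivative mark the component bands
band_positions = [inner_axis[i] for i in range(1, len(inv_d2) - 1)
                  if inv_d2[i] > inv_d2[i - 1]
                  and inv_d2[i] > inv_d2[i + 1]
                  and inv_d2[i] > 0.1]
print(band_positions)
```

The two overlapping components, which appear only as a shoulder in the absorbance envelope, show up as separate maxima close to their assumed positions. Overlap shifts the apparent positions slightly, which is one reason why such spectra are compared qualitatively.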

4. Concluding Remarks and Future Directions

FTIR is slowly becoming an accepted technique in the pharmaceutical industry for the analysis of protein pharmaceuticals. Its application in detecting drying-induced instability of protein formulations is a standard method in many companies, although the number of publications is still limited. The same applies to structural analysis of proteins entrapped in sustained release systems. Proof of concept is available, but well-established procedures and methods are still missing. A more concerted effort is required to firmly establish FTIR as an important and invaluable technique for analysis of protein pharmaceuticals. For example, structural analysis of a wide variety of proteins in a wide variety of sustained release systems is required to thoroughly establish any correlation between FTIR spectra and quality of the release system.

Other vibrational techniques are also slowly being implemented in the pharmaceutical industry. Raman and NIR spectroscopy are suitable for on-line analysis, and may further shape the rapid development of protein pharmaceuticals. Although application of these techniques to protein pharmaceuticals is still limited, we are convinced that the next decade will see an exponential growth within this area. Process analytical technology (PAT), aimed at on-line analysis, is one of the key areas that regulatory agencies would like to see developed further. Proper use of these techniques would, however, require a better insight into the correlation between protein structure and protein Raman or NIR spectra. Moreover, improved algorithms will be required to handle the large amount of data coming from such on-line analysis. This is likely to be the main challenge for the next decade.

Acknowledgements

The authors would like to acknowledge Drs Pernille Stigsnaes, Mingshi Yang, and Bruno Sarmento for providing the data shown in Figs 3, 4+5, and 6, respectively.

References
[1] S. Lawrence, Nature Biotech. 24 (2007), 1466.
[2] M. van de Weert, L. Jorgensen, E.H. Moeller, and S. Frokjaer, Expert Opin. Drug Deliv. 2 (2005), 1029-1037.

[3] Methods for structural analysis of protein pharmaceuticals, W. Jiskoot and D.J.A. Crommelin, eds., AAPS Press, Arlington, 2005.
[4] S. Matheus, W. Friess, and H.-C. Mahler, Pharm. Res. 23 (2006), 1350-1363.
[5] J.F. Carpenter, S.J. Prestrelski, and A. Dong, Eur. J. Pharm. Biopharm. 45 (1998), 231-238.
[6] K. Fu, K. Griebenow, L. Hsieh, A.M. Klibanov, and R. Langer, J. Control. Release 58 (1999), 357-366.
[7] K. Griebenow, I.J. Castellanos, and K.G. Carrasquillo, Int. J. Vib. Spec. 3, 5 (1999), 2.
[8] M. van de Weert, R. van 't Hof, J. van der Weerd, R.M.A. Heeren, G. Posthuma, W.E. Hennink, and D.J.A. Crommelin, J. Control. Release 68 (2000), 31-40.
[9] T.-H. Yang, A. Dong, J. Meyer, O.L. Johnson, J.L. Cleland, and J.F. Carpenter, J. Pharm. Sci. 88 (1999), 161-165.
[10] B.S. Kendrick, A. Dong, S.D. Allison, M.C. Manning, and J.F. Carpenter, J. Pharm. Sci. 85 (1996), 155-158.
[11] M. van de Weert, P.I. Haris, W.E. Hennink, and D.J.A. Crommelin, Anal. Biochem. 297 (2001), 160-169.
[12] Lyophilization of biopharmaceuticals, H.R. Costantino and M.J. Pikal, eds., AAPS Press, Arlington, 2005.
[13] J.F. Carpenter, M.J. Pikal, B.S. Chang, and T.W. Randolph, Pharm. Res. 14 (1997), 969-975.
[14] X. Tang and M.J. Pikal, Pharm. Res. 21 (2004), 191-200.
[15] D.S. Katayama, J.F. Carpenter, M.C. Manning, T.W. Randolph, P. Setlow, and K.P. Menard, J. Pharm. Sci. 97 (2008), 1011-1022.
[16] H.R. Costantino, K.G. Carrasquillo, R.A. Cordero, M. Mumenthaler, C.C. Hsu, and K. Griebenow, J. Pharm. Sci. 87 (1998), 1412-1420.
[17] L. Chang, D. Shepherd, J. Sun, D. Ouellette, K.L. Grant, X. Tang, and M.J. Pikal, J. Pharm. Sci. 94 (2005), 1427-1444.
[18] P. Stigsnaes, S. Frokjaer, S. Bjerregaard, M. van de Weert, P. Kingshott, and E.H. Moeller, Int. J. Pharm. 330 (2007), 89-98.
[19] M. Morpurgo and F.M. Veronese, Methods Mol. Biol. 283 (2004), 45-70.
[20] A.M. Abdul-Fattah, V. Truong-Le, L. Yee, L. Nguyen, D.S. Kalonia, M.T. Cicerone, and M.J. Pikal, J. Pharm. Sci. 96 (2007), 1983-2008.
[21] H.-K. Chan, A.R. Clark, J.C. Feeley, M.-C. Kuo, S.R. Lehrman, K. Pikal-Cleland, D.P. Miller, R. Vehring, and D. Lechuga-Ballesteros, J. Pharm. Sci. 93 (2004), 792-804.
[22] D.L. French, T. Arakawa, and T. Li, Biopolymers 73 (2004), 524-531.
[23] Y.-H. Liao, M.C. Brown, T. Nazir, A. Quader, and G.P. Martin, Pharm. Res. 19 (2002), 1847-1853.
[24] A. Mauerer and G. Lee, Eur. J. Pharm. Biopharm. 62 (2006), 131-142.
[25] M. Maury, K. Murphy, S. Kumar, A. Mauerer, and G. Lee, Eur. J. Pharm. Biopharm. 59 (2005), 251-261.
[26] S.U. Sane, R. Wong, and C.C. Hsu, J. Pharm. Sci. 93 (2004), 1005-1018.
[27] S. Schüle, W. Friess, K. Bechtold-Peters, and P. Garidel, Eur. J. Pharm. Biopharm. 65 (2007), 1-9.
[28] M. Yang, S. Velaga, H. Yamamoto, H. Takeuchi, Y. Kawashima, L. Hovgaard, M. van de Weert, and S. Frokjaer, Int. J. Pharm. 331 (2007), 176-181.
[29] B. Sarmento, D.C. Ferreira, L. Jorgensen, and M. van de Weert, Eur. J. Pharm. Biopharm. 65 (2007), 10-17.
[30] A.S. Rosenberg, AAPS J. 8 (3) (2006), 59.
[31] M. van de Weert, R. van Dijkhuizen-Radersma, J.M. Bezemer, W.E. Hennink, and D.J.A. Crommelin, Eur. J. Pharm. Biopharm. 54 (2002), 89-93.
[32] L. Jorgensen, M. van de Weert, C. Vermehren, S. Bjerregaard, and S. Frokjaer, J. Pharm. Sci. 93 (2004), 1847-1859.
[33] L. Jørgensen, C. Vermehren, S. Bjerregaard, and S. Froekjaer, Int. J. Pharm. 254 (2003), 7-10.
[34] Y. Fang and D.G. Dalgleish, J. Colloid Interface Sci. 196 (1997), 292-298.
[35] F.A. Husband, M.J. Garrood, A.R. Mackie, G.R. Burnett, and P.J. Wilde, J. Agric. Food Chem. 49 (2001), 859-866.
[36] T. Lefèvre and M. Subirade, J. Colloid Interface Sci. 263 (2003), 59-67.
[37] S. Bjerregaard, L. Wulf-Andersen, R.W. Stephens, L. Røge Lund, C. Vermehren, I. Söderberg, and S. Frokjaer, J. Control. Release 71 (2001), 87-98.
[38] S. Bjerregaard, H. Pedersen, H. Vedstesen, C. Vermehren, I. Söderberg, and S. Frokjaer, Int. J. Pharm. 215 (2001), 13-27.
[39] L. Jorgensen, C. Vermehren, S. Bjerregaard, and S. Frokjaer, J. Drug Del. Sci. Tech. 14 (2004), 455-459.

Biological and Biomedical Infrared Spectroscopy A. Barth and P.I. Haris (Eds.) IOS Press, 2009 2009 The authors and IOS Press. All rights reserved. doi:10.3233/978-1-60750-045-2-178

Quantum Mechanical Calculations of Peptide Vibrational Force Fields and Spectral Intensities
Jan KUBELKA a,*, Petr BOUŘ b, Timothy A. KEIDERLING c
a Department of Chemistry, University of Wyoming, USA
b Institute of Organic Chemistry and Biochemistry, Academy of Sciences, Czech Republic
c Department of Chemistry, University of Illinois at Chicago, USA

Abstract. Vibrational spectra are frequently used for studies of the structure and dynamics of peptides and proteins. Structural interpretation of the experimental data, however, requires theoretical simulation of the spectra for model peptide geometries. Quantum mechanical, in particular density functional theory (DFT), methods have proven exceptionally valuable for calculations of the vibrational force fields and both IR and Raman intensities. A brief review of some recent trends in computation of molecular force fields and spectral intensities is presented. Particular attention is paid to experiments involving circularly polarized light, as these provide enhanced structural information for chiral molecules. Following a historical overview of common approaches, the fundamental theoretical aspects of the calculations of the molecular vibrational spectra are summarized. Special emphasis is given to the problem of simulating spectra for biological molecules (oligo-peptides and nucleotides, proteins and nucleic acids) using DFT methods. The methodology for simulations of large biopolymers with DFT level force fields and intensity parameters abstracted from smaller molecules is reviewed. Several examples with the discussion of successes and difficulties of the vibrational spectra simulations for model peptides are presented. Finally, methods for incorporating the solvent in the spectral simulations are reviewed and discussed.

Keywords. Vibrational spectra, Infrared, Raman, vibrational circular dichroism, peptide force fields, density functional theory, secondary structure, solvent effects.

* Corresponding Author: Department of Chemistry, University of Wyoming, Laramie, WY 82071 USA; E-mail: jkubelka@uwyo.edu.
Corresponding Author: Institute of Organic Chemistry and Biochemistry, Academy of Sciences, Flemingovo nám. 2, 16610 Prague 6, Czech Republic; E-mail: bour@uochb.cas.cz.
Corresponding Author: Department of Chemistry, University of Illinois at Chicago, Chicago IL 60607-7061 USA; E-mail: tak@uic.edu.

J. Kubelka et al. / QM Calculations of Peptide Vibrational Force Fields and Spectral Intensities

1. Introduction

Vibrational spectroscopic methods are important experimental tools for studies of biological molecules: peptides, proteins and nucleic acids. In particular, various Fourier Transform Infrared (FTIR) and high sensitivity Raman techniques are among the most frequently used methods in protein structure, folding and function analyses [1-4]. The vibrational modes of the amide group (the most useful ones are illustrated in Figure 1) are sensitive probes for the secondary structure, which can be utilized in both infrared (IR) and Raman studies. Historically, such analyses were carried out by empirical correlations of IR or Raman amide band frequencies with secondary structures for model polypeptides. Theoretical approaches to understanding the characteristic polypeptide spectral features were based on simplified vibrational calculations through coupled oscillator models or parameterized force fields (FF) obtained by fitting empirical force constants for internal coordinates to observed spectra. Within the last decade, quantum mechanical (QM) computations of complete FFs have offered a powerful means for determination of peptide vibrational properties independent of problem-specific empirical parameters [5].

[Figure 1 panel labels: Amide I, 1600-1700 cm⁻¹; Amide II, 1480-1580 cm⁻¹; Amide III, 1210-1350 cm⁻¹]

Figure 1. Schematic representation of three normal modes of vibration of the amide group which are most important for peptide and protein secondary structure analyses: (a) amide I, (b) amide II, (c) amide III.

While most early IR and Raman studies used frequencies of component bands as structural markers, the intensities and polarizations of individual modes provide additional, valuable structural insights. In particular, they are necessary for modeling band shapes of overlapping transitions, as found in biopolymer systems. Vibrational circular dichroism (VCD) and Raman optical activity (ROA) are polarization methods whose conformationally determined sign patterns add to the vibrational frequency resolution and enhance structural sensitivity through characteristic bandshapes. VCD and ROA intensities arise from the differential response (absorption or scattering, respectively) to left- and right-circularly polarized light by chiral structures, and therefore are very sensitive to the conformation of peptides (or nucleic acids) [6-8]. A major aspect of the renewed interest in vibrational spectroscopy studies of biopolymeric molecules arises from the ability of IR and Raman to sense fast time scale motions. Various rapid mixing schemes have been proposed for millisecond resolution [9], while laser phototriggering [10-12] and temperature-jump [13, 14] designs for specific protein systems have been utilized to follow nanosecond scale processes. In addition, recent 2-D IR studies have probed bio-molecules with femtosecond resolution [15, 16]. Finally, utilization of selective isotopic labeling has introduced site-specific
conformational sensitivity into IR, VCD and Raman techniques, which normally provide only average structural information [17]. Interpretation of the additional spectral details and associated extended applications for vibrational spectra is dependent on their accurate theoretical description [18, 19]. With the development of fast, inexpensive computers and of approximations allowing solution of the Schrödinger equation for larger systems, quantum mechanical (QM) force fields (FF) as well as intensities for IR, Raman, VCD [20-23] and ROA [22-25] can be determined. Initial biomolecule-oriented studies focused on small model systems, but QM vibrational spectral calculations have become feasible for moderate oligopeptides [26, 27], and by use of transfer methods [28], QM level FFs are now realistic for large peptides [29-31] and even proteins [32]. Parallel applications for nucleic acids have also developed [33]. The development of such ab initio FFs and intensity parameters for peptides and their application to vibrational spectra are the topic of this review. As for all theoretical models, the ab initio simulations of vibrational spectra must be critically evaluated by comparison with experiment, for which we supply several examples, as do other chapters in this book. After a brief survey of empirical methods, a more detailed description of ab initio spectral calculations with an emphasis on density functional theory (DFT) approaches will be presented. Extension to large biomolecules utilizing the transfer of DFT vibrational parameters will be discussed along with specific examples focused on peptide systems. Finally, approaches to correction of the simulated spectra for solvation and other structural perturbations will be discussed. This review focuses primarily on equilibrium applications and linear vibrational spectra analyses; in a separate chapter Choi and Cho discuss dynamics and non-linear, 2-D IR simulations [34].

2. Observed Peptide and Protein Vibrational Spectra

Infrared (IR) absorption frequencies have long been used to characterize peptide and protein secondary structure [1, 35-39]. The most studied band in the IR of peptides is the amide I, which is primarily amide C=O stretch (Figure 1) and generally appears between 1600 and 1700 cm⁻¹, but typically at 1650-1660 cm⁻¹ for α-helices and 1620-1640 cm⁻¹ with a weaker component at ~1680-1690 cm⁻¹ for β-sheets (see Figure 2 for example peptide IR and VCD spectra). Due to the C=O group orientation, the amide I is polarized with respect to the helix axis in α-helices and β-sheets, which provides an added diagnostic for oriented samples [40]. In VCD, the α-helix and coil amide I bands have oppositely signed couplet patterns (Figure 2), while for β-sheets they are weak and predominantly negative in polypeptides (aggregates), but stronger in globular proteins where such sheets are limited in extent and quite twisted.

Figure 2. Examples of IR (left) and VCD (right) spectra (amide I and II regions) for model peptides with characteristic secondary structures: (top) α-helix, (middle) β-sheet, (bottom) unordered, often termed "random coil" but locally left-handed 3₁-helical (which is sometimes denoted PPII).

Figure 3. Examples of experimental Raman spectra in the amide I-III region for an alanine-rich peptide, Ac-(AAAAK)3-AAAAY-NH2 (left), in an α-helical conformation (A) at 5 °C and in an unordered conformation (B) at 55 °C, plus (right) a highly α-helical protein, bovine serum albumin (C), and a mostly β-sheet protein, concanavalin A (D).

The amide II (primarily in-plane NH deformation mixed with C-N stretch, ~1500-1580 cm⁻¹) and the amide A (N-H stretch, ~3300 cm⁻¹ but quite broad) bands have less pronounced frequency shifts with change in secondary structure (Figure 2), although they are polarized and highly sensitive to deuteration effects, which can act as a measure of solvent exposure [41-43]. The amide III (opposite-phase combination of NH bend plus C-N stretch) is mixed with other local modes, particularly the Cα-H deformation, and is very weak in the IR [36, 44-48], but is much more important in Raman analyses, where both the amide I and III are used for peptide structural studies [3, 4, 49]. Some examples of Raman spectra in the amide I-III regions are shown in Figure 3. Raman is also sensitive to some side-chain modes, particularly of aromatics and disulfide (-S-S-) linkages [50]. Resonance Raman studies of aromatic residues and ligands (such as heme groups) have proven very useful for detailed protein studies of
local changes [51-53] while UV Resonance Raman studies of the amide modes have been applied to folding analyses [54-57].

3. Computations of Peptide Vibrational Spectra

3.1. Empirical Calculations of Peptide FF and Vibrational Frequencies

Miyazawa proposed a scheme for calculation of vibrational frequencies of the amide I mode for regular helical chains, using a coupled oscillator treatment, whereby identical, local amide oscillators were weakly coupled to nearest neighbors through covalent and hydrogen bonds [58, 59]. Assuming an infinitely long chain in an exciton-like model with helical or sheet-like symmetry reduces the problem to a few allowed distributed modes [60]. This theory provided a basic explanation for the IR spectral differences observed for different conformations. Chirgadze and Nevskaya carried out coupled oscillator calculations of amide I IR spectra for antiparallel and parallel β-sheet structures [61, 62], and of the amide I and II for α-helices [63]. Their perturbation model was based on coupling of the single amide oscillators via a transition dipole coupling (TDC) mechanism, which was able to account for the experimentally observed bandshapes for different structures and to establish segment size dependence, predicting the effect of the length of an α-helix or a β-strand and of the number of strands in a β-sheet. Torii and Tasumi used essentially the same methodology to simulate the amide I IR spectral bandshapes of complete globular proteins and their segments [64-66]. This simplification of the problem to a single oscillator for each amide group made it possible to diagonalize a full protein interaction matrix, which is required because proteins lack the translational symmetry of regular polypeptides. The coupled oscillator approach has been extended by combining various vibrational coupling mechanisms (through-bond, hydrogen bond and TDC), and is often used for simulations of spectra of various structures, including isotopically labeled peptides and locally distorted structures with resolved component bands [67-70].
An alternate formulation of the coupled oscillator scheme, termed exciton coupling, considers vibrational coupling between excited states of local, otherwise independent, quantum mechanical oscillators. In practice, this is equivalent to the coupled oscillator approach with alternate interactions [15, 31, 71-73]. All atom empirical force fields (FF) are derived by adjusting internal coordinate force constants (bond stretches and bends) based on the detailed structure of the molecule to best fit observed frequencies. Systematic development of the force fields for polypeptides was undertaken by Krimm and coworkers by refinement of the force field parameters to fit the experimental data [74]. Normal mode calculations and vibrational frequency assignments were developed for a number of polypeptide conformations. TDC was also incorporated into their atomic FF [74, 75], which provided better agreement of the calculated amide I vibrational frequencies with experiment. Others have taken a similar approach by transferring ab initio FF parameters from N-methylacetamide (NMA) or small peptides [76-79].
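
The nearest-neighbour coupled oscillator picture described in this section can be sketched for a finite chain. The example assumes N identical local amide I oscillators with an unperturbed wavenumber and a single nearest-neighbour coupling constant, both expressed directly in cm⁻¹ (a weak-coupling simplification). The numerical values and the function names are illustrative assumptions, not parameters taken from the cited work. For such a chain the coupled modes are known in closed form, and the code checks them against the coupling matrix.

```python
import math

def chain_modes(n, nu0, beta):
    """Analytic modes of n identical oscillators with nearest-neighbour coupling.

    Returns a list of (wavenumber, coefficients) pairs, one per coupled mode.
    """
    modes = []
    for k in range(1, n + 1):
        phase = k * math.pi / (n + 1)
        nu = nu0 + 2.0 * beta * math.cos(phase)
        coeffs = [math.sqrt(2.0 / (n + 1)) * math.sin(i * phase) for i in range(1, n + 1)]
        modes.append((nu, coeffs))
    return modes

def coupling_matrix(n, nu0, beta):
    """Tridiagonal matrix: nu0 on the diagonal, beta between nearest neighbours."""
    return [[nu0 if i == j else (beta if abs(i - j) == 1 else 0.0) for j in range(n)]
            for i in range(n)]

def residual(matrix, nu, coeffs):
    """Largest component of (M c - nu c); close to zero for a true eigenpair."""
    n = len(coeffs)
    return max(abs(sum(matrix[i][j] * coeffs[j] for j in range(n)) - nu * coeffs[i])
               for i in range(n))

NU0 = 1650.0    # cm^-1, assumed local amide I wavenumber
BETA = 8.0      # cm^-1, assumed nearest-neighbour coupling constant

for n in (2, 5, 20):
    modes = chain_modes(n, NU0, BETA)
    matrix = coupling_matrix(n, NU0, BETA)
    worst = max(residual(matrix, nu, c) for nu, c in modes)
    lowest = min(nu for nu, _ in modes)
    highest = max(nu for nu, _ in modes)
    print(f"N={n:2d}: band from {lowest:.1f} to {highest:.1f} cm^-1, max residual {worst:.1e}")
```

The spread of the coupled modes grows with chain length and approaches four times the coupling constant, a simple illustration of the segment-size dependence mentioned above.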

3.2. Empirical Calculations of Spectral Intensities

Approximate calculations of the IR spectral intensities are straightforward within the empirical approaches discussed above, and can be derived from the unperturbed amide transition dipoles (coupled oscillator approaches) [58, 61, 62, 64-66, 69, 70, 80-82], or from bond dipoles or atomic partial charges (atomic FF methods) [18, 75, 83, 84]. These calculations are sufficient to account for the overall appearance of the IR spectra, such as locating the strongest and weakest bands, and can even approximate the exciton distribution in the bands. While extensive formulations of the Raman polarizability tensors have been available for many years and have been used for qualitative analyses of Raman intensities, until recently less work was done on systematic simulation of spectra, particularly for peptides. Some models, like the bond and atom polarizability approaches, did appear [22, 85-87] and variants are used in materials applications [88]. However, for polarization spectra, such as linear dichroism (LD) and especially VCD or ROA, the fundamental spectral bandshapes are critically dependent on the relative intensities and the sign patterns of the differential spectral bands. These tend to arise from the detailed impact of molecular distortions on the electronic distributions and cannot be accurately modeled with fixed dipole or charge models. The first theoretical VCD was an empirical coupled oscillator model, applied to a cyclic dipeptide, where the TDC causes the frequency splitting but also gives rise to the circular dichroism signal [89, 90]. The first polypeptide applications used parameters from the Miyazawa and Krimm perturbation theories for coupling of amide oscillators and experimental transition electric dipole values [91, 92]. While this model provided correct qualitative predictions in some cases, especially for the α-helical amide I, it was unsuccessful for other modes and conformations.
Diem extended the model to non-degenerate oscillators and used it to calculate IR absorption and VCD of polypeptides and nucleic acids [68, 93-96]. The classical DeVoe polarizability theory approach [97, 98] has been applied to IR and VCD of nucleic acids [99, 100]. A similar model, formulated in terms of the linear response polarizability tensors has been applied to polypeptides [101] and small cyclic peptides [102]. In general, while these models often give reasonable predictions for the VCD of nucleic acids, where through-space coupling as modeled by TDC dominates, and are sometimes adequate for peptide IR, particularly for sheet or extended conformations, they are typically inaccurate when applied to peptide VCD. Accurate simulation of peptide VCD requires quantum mechanical treatment to properly model through-bond effects (coupling) on the wave function.
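
The transition dipole coupling (TDC) mechanism, and the IR intensities obtained from unperturbed transition dipoles, can be illustrated with the smallest case of two degenerate oscillators. The sketch below uses the standard point-dipole interaction energy. The transition dipole magnitude (0.35 D), the 4.8 Å separation and the 1650 cm⁻¹ local wavenumber are assumed, illustrative values, and treating the coupling as a direct wavenumber shift is a weak-coupling simplification. It covers frequencies and dipole strengths only, not VCD.

```python
import math

H_PLANCK = 6.62607015e-27   # erg s
C_LIGHT = 2.99792458e10     # cm / s
DEBYE = 1.0e-18             # esu cm
ANGSTROM = 1.0e-8           # cm

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def tdc_coupling(mu1, mu2, r1, r2):
    """Point-dipole coupling in cm^-1 (dipoles in debye, positions in angstrom)."""
    rvec = [b - a for a, b in zip(r1, r2)]
    dist = math.sqrt(dot(rvec, rvec))
    unit = [x / dist for x in rvec]
    geometric = dot(mu1, mu2) - 3.0 * dot(mu1, unit) * dot(mu2, unit)   # D^2
    energy_erg = geometric * DEBYE ** 2 / (dist * ANGSTROM) ** 3
    return energy_erg / (H_PLANCK * C_LIGHT)

def dimer_bands(nu0, mu1, mu2, r1, r2):
    """In-phase and out-of-phase modes of two degenerate coupled oscillators.

    Returns [(wavenumber, dipole strength in D^2)] for the (+) and (-) combinations.
    """
    v = tdc_coupling(mu1, mu2, r1, r2)
    plus = [(a + b) / math.sqrt(2.0) for a, b in zip(mu1, mu2)]
    minus = [(a - b) / math.sqrt(2.0) for a, b in zip(mu1, mu2)]
    return [(nu0 + v, dot(plus, plus)), (nu0 - v, dot(minus, minus))]

MU = 0.35   # debye, assumed transition dipole magnitude
# Two parallel dipoles placed side by side, and the same pair arranged head to tail
side_by_side = dimer_bands(1650.0, [0, 0, MU], [0, 0, MU], [0, 0, 0], [4.8, 0, 0])
head_to_tail = dimer_bands(1650.0, [0, 0, MU], [0, 0, MU], [0, 0, 0], [0, 0, 4.8])
for label, bands in (("side by side", side_by_side), ("head to tail", head_to_tail)):
    for nu, strength in bands:
        print(f"{label}: {nu:8.2f} cm^-1, D = {strength:.4f} D^2")
```

For parallel dipoles all intensity sits in the in-phase combination, which moves to higher wavenumber in the side-by-side arrangement and to lower wavenumber in the head-to-tail arrangement. This is the kind of geometry dependence that lets such models distinguish sheet-like from helix-like arrangements in the IR.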

4. Quantum Mechanical Calculations of Vibrational Spectra

Quantum mechanical calculations provide complete molecular FFs (including all the vibrational interactions) without need for empirical parameterization. In addition, QM models, now developed mostly at the Density Functional Theory (DFT) level, are critical for obtaining accurate IR, Raman, ROA and VCD spectral intensities and are implemented in most quantum mechanical computational packages. Although VCD computation has some fundamental issues requiring a certain degree of care, various theoretical schemes for VCD have been implemented, as has been extensively reviewed [20-23, 103-105]. Finally, methods for determining the magnetic and quadrupolar derivatives needed for ROA have become increasingly available [25, 106].

4.1. Vibrational Spectral Frequencies

In the Born-Oppenheimer (BO) approximation [107], central to most molecular QM calculations, the total molecular wavefunction is separated into electronic and nuclear components, represented as $\Psi(r,R) = \phi_{el}(r,R)\,\chi_{\upsilon}(R)$, where the electronic function depends on the nuclear positions, R, but not velocities. Solving the electronic Schrödinger equation yields an electronic energy for the ground state, $\varepsilon(R_i)$. Combining $\varepsilon(R_i)$ with the nuclear repulsion, $V_{nn}(R_i)$, forms an effective (average) potential energy term for the motion of the nuclei. Most molecular vibrational analyses are done in the harmonic approximation, where the potential is expanded around the equilibrium position with respect to the nuclear displacements, $\Delta R_i$, and only the leading (quadratic) term is retained. The nuclear Hamiltonian then becomes:

$$H = \frac{1}{2}\sum_{i=1}^{3N}\frac{P_i^2}{m_i} + \frac{1}{2}\sum_{i=1}^{3N}\sum_{j=1}^{3N} F_{ij}\,\Delta R_i\,\Delta R_j = \frac{1}{2}\left(\mathbf{p}^t\cdot\mathbf{p} + \mathbf{q}^t\cdot\mathbf{f}\cdot\mathbf{q}\right) \tag{1}$$

where $q_i = \sqrt{m_i}\,\Delta R_i$ and $f_{ij} = F_{ij}/\sqrt{m_i m_j}$ are the mass-weighted coordinates and force field (q and f are vector and matrix representations), respectively, and $P_i$ are the nuclear momenta (correspondingly, $p_i = P_i/\sqrt{m_i}$ are mass-weighted). The Cartesian force constant matrix,

$$F_{ij} = \frac{\partial^2 \varepsilon}{\partial R_i\,\partial R_j}, \tag{2}$$

is referred to as the (harmonic) force field (FF, also referred to as the Hessian). Anharmonic effects (higher order energy derivatives) are small for the most important peptide vibrations (amide I-III), but can be included [22] at much greater computational cost [108, 109]. In the harmonic approximation, the multidimensional Hamiltonian can be elegantly reduced to a sum of independent one-dimensional operators by a unitary transformation of coordinates from mass-weighted Cartesian displacements and momenta ($q_i$ and $p_i$) to normal mode coordinates $Q_k$ and corresponding momenta $\Pi_k$,

$$Q_k = \sum_{j=1}^{3N} s^{-1}_{kj}\,q_j, \qquad \Pi_k = \sum_{j=1}^{3N} s^{-1}_{kj}\,p_j \tag{3}$$
By substitution of (3) into (1)

1 t t (4) 3 .s .s.3  Q t .s t .f.s.Q . 2 When the transformation is unitary (st.s = E, where E is the identity matrix) and the transformed force field, st.f.s = /, is diagonal (i.e. /ij=0 for izj and /ii=Zi2, which can be always done for a quadratic potential form), the Hamiltonian becomes a sum of onedimensional harmonic oscillator Hamiltonians hi H
3N 1 3N 2 2 2 (5) 3  Z Q i i i hi (Qi ) . 2i1 i 1 Consequently, the equation of motion for the nuclei (substituting the QM operator for momentum, 3 = -i w/wQ) reduces to a set of 3N uncoupled 1-D Schrdinger equations

Note: in molecular mechanics, "force field" denotes a more general, empirical dependence of the energy on internal coordinates than is represented by these derivatives.

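$$\frac{1}{2}\left(-\hbar^2\frac{\partial^2}{\partial Q_i^2} + \omega_i^2 Q_i^2\right)\chi_i(Q_i) = E_i\,\chi_i(Q_i), \qquad i = 1, 2, \ldots, 3N \tag{6}$$

The harmonic oscillator energies, $E_{i,\upsilon}$, and wavefunctions, $\chi_i$, can be obtained analytically, as

$$E_{i,\upsilon} = \hbar\omega_i\left(\upsilon_i + \tfrac{1}{2}\right) \tag{7a}$$

$$\chi_{\upsilon_i}(\xi) = A_{\upsilon_i} H_{\upsilon_i}(\xi)\,e^{-\xi^2/2} \tag{7b}$$

where $\upsilon_i$ is the quantum number, $\xi = \sqrt{\omega_i/\hbar}\,Q_i$ and $A_{\upsilon_i}H_{\upsilon_i}$ are the normalized Hermite polynomials. Instead of s, a direct Cartesian-normal mode transformation matrix, S, can be defined as

$$S_{jk} = \frac{\partial \Delta R_j}{\partial Q_k} = \frac{s_{jk}}{\sqrt{m_j}}. \tag{8}$$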

Once the molecular FF is known, calculation of the vibrational frequencies and normal modes of vibration is straightforward: the Cartesian force constant matrix is mass weighted and diagonalized; the resulting diagonal constants are the squares of the angular frequencies, $\omega_i^2$, from which the quantum energies $\hbar\omega_i$ of the fundamental transitions follow ($\upsilon_i = 0$ for all i → $\upsilon_i = 1$ for a normal mode i). The accuracy of vibrational frequencies and normal modes is determined by the accuracy of the FF. Obtaining an accurate FF (equation 2) is therefore the main and most difficult task of vibrational frequency calculations.

Coordinates and geometries. QM vibrational calculations are generally done using force constants ($F_{ij}$) directly computed in Cartesian coordinates, for which the solution contains 3N coordinates $Q_k$. The rotational and translational degrees of freedom, corresponding to external motion of the molecule as a whole, correspond to six (five if the molecule is linear) coordinates whose eigenvalues are zero and can be separated from the internal nuclear (vibrational) coordinates [22, 110, 111]. This is in contrast to more traditional approaches to molecular FF determination, which are based on more chemically intuitive and transferable internal coordinate representations (i.e. bond stretches and bends). While the force field could be transformed to internal, nonredundant symmetry-adapted coordinates, where the zero-energy modes do not appear, this has the disadvantage that the kinetic energy in (1) becomes more complicated in internal coordinates and is no longer separable. Since harmonic vibrational frequency calculations are applicable only to molecules whose geometry corresponds to the minimum of the potential energy surface, geometry optimizations should be carried out prior to calculation of vibrational frequencies.
However, to explore the vibrational characteristics of specific biopolymer conformations, it is typically necessary to constrain the geometry optimizations to represent realistic, large biomolecular structures. Smaller model conformers used for computation are not stable under full, unconstrained energy minimization due to the limited sizes and solvation approximations required for QM FF calculations. However, in our work, only backbone torsional motions are constrained, since they do not significantly affect the high frequency, stretching and bending modes of interest for spectral analysis, whose coordinates were fully optimized. Alternately, low-frequency normal mode coordinates can be fixed directly, as they approximately correspond to the torsional and other large-amplitude motions [112-114]. Other groups have carried out full minimizations of larger structures, by limiting basis sets or coupling with semiempirical models [27, 233, 234].
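
The recipe summarised in this section (mass-weight the Cartesian force constant matrix, diagonalize it, then read frequencies and displacements from the result) can be sketched for a toy system. The example below assumes a linear O=C=O-like chain restricted to motion along its axis, with a single harmonic bond force constant of 1600 N/m (an illustrative value, not a computed FF). It uses a small hand-written Jacobi diagonalizer only to stay free of external dependencies; a production code would call a linear algebra library.

```python
import math

AMU = 1.66053907e-27    # kg
C_CM = 2.99792458e10    # speed of light in cm/s

def jacobi_eigh(matrix, sweeps=50):
    """Eigenvalues and eigenvectors (as columns) of a real symmetric matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    scale = sum(x * x for row in a for x in row)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= 1e-24 * scale:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4.0, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # rotate columns p, q of A and of V
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):      # rotate rows p, q of A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [a[i][i] for i in range(n)], v

def normal_modes(force_constants, masses):
    """Angular frequencies (ascending) and Cartesian displacements per unit normal coordinate."""
    n = len(masses)
    f = [[force_constants[i][j] / math.sqrt(masses[i] * masses[j]) for j in range(n)]
         for i in range(n)]
    eigenvalues, s = jacobi_eigh(f)
    order = sorted(range(n), key=lambda i: eigenvalues[i])
    omegas = [math.sqrt(max(eigenvalues[i], 0.0)) for i in order]
    # rows: Cartesian coordinates, columns: modes
    displacements = [[s[j][i] / math.sqrt(masses[j]) for i in order] for j in range(n)]
    return omegas, displacements

k = 1600.0                                            # N/m, assumed bond force constant
masses = [15.995 * AMU, 12.000 * AMU, 15.995 * AMU]   # O, C, O
F = [[k, -k, 0.0], [-k, 2.0 * k, -k], [0.0, -k, k]]   # second derivatives of the bond energy

omegas, S = normal_modes(F, masses)
wavenumbers = [w / (2.0 * math.pi * C_CM) for w in omegas]
labels = ("translation", "symmetric stretch", "antisymmetric stretch")
for label, wn in zip(labels, wavenumbers):
    print(f"{label:22s} {wn:8.1f} cm^-1")
```

The zero eigenvalue is the one-dimensional analogue of the six (or five) external coordinates discussed above: its displacement vector moves all atoms by the same amount.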

4.2. Spectral Intensities

The spectral intensities of molecular vibrational transitions can be formulated using quantum mechanical time-dependent perturbation theory [22, 87]. IR and Raman intensities can be obtained at the first level of approximation, where the electromagnetic field interacts only with the molecular dipole moment. For VCD and ROA the magnetic dipolar and electric quadrupolar interactions must be included. An important difference between the absorption (IR, VCD) and scattering (Raman, ROA) processes is that the former are one-photon and the latter two-photon processes.

IR intensity: atomic polar tensor. In the electric dipole approximation, the IR intensities are commonly expressed as the dipolar strength, which is the square of the transition electric dipole moment:

$$D_{\upsilon 0} = \mu_{0\upsilon}\cdot\mu_{\upsilon 0}, \qquad \mu_{\upsilon 0} = \langle \chi_\upsilon | \mu_\alpha | \chi_0 \rangle \tag{9}$$

where $\mu_\alpha$ denotes the electric dipole moment operator and the index $\upsilon$ refers to the vibrational quantum number. Due to the selection rules for the harmonic oscillator, $\Delta\upsilon = \pm 1$, only $\upsilon = 1$ needs to be considered in (9). The dipole moment can be expanded around the equilibrium nuclear positions as a function of the normal modes, $Q_j$. Keeping only the linear term and using linear harmonic oscillator wavefunctions, the α-th component of the transition dipole moment for the normal mode j becomes:

$$\langle \chi_1 | \mu_\alpha | \chi_0 \rangle_j = \sqrt{\frac{\hbar}{2\omega_j}}\,\frac{\partial \mu_\alpha}{\partial Q_j} = \sqrt{\frac{\hbar}{2\omega_j}} \sum_{A,\beta} P^A_{\beta\alpha}\, S^A_{\beta,j}, \tag{10}$$

where S is defined in (8) and the atomic polar tensor (APT, "dipole derivatives") is defined as:

$$P^A_{\beta\alpha} = \left(\frac{\partial \mu_\alpha}{\partial R_{A,\beta}}\right)_0 = E^A_{\beta\alpha} + Z_A e\,\delta_{\beta\alpha} \tag{11}$$

A where E ED is the electronic contribution, and the last term in (11) is the nuclear contribution to the APT. Evaluation of the nuclear contribution is simple and can be related to empirical modeling of intensities based on effective charges fixed to the nuclei [91, 115]. In that model, initially partial charges were guessed and later taken from quantum chemical calculations of charge distributions (Mulliken populations). Later extensions allowed for charge flow and for localized molecular orbital motion [23, 83, 105, 116]. These methods are not very accurate since electron charge responds virtually instantaneously to nuclear motion, the basis of the BO approximation. However, they may be still useful in simplified QM/MM models to obtain a first approximation of the spectral intensities [117]. A less trivial task is to obtain the electronic contribution, which is done using perturbation theory [22, 118, 119]: wM , w w 2H wH (12) A G

EED

wR A, E

MG PD MG

 wR wF A, E D 0 , 0

2 MG wFD 0 wRA, E 0

0 where M G is the electronic ground state wavefunction and F the electric field, the

electric field derivative of the Hamiltonian is the dipole operator: (H/FD)0 = -Pel,D The derivatives can be obtained either numerically or, more accurately and efficiently, by use of analytical derivative techniques, which will be discussed later.
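The route from atomic polar tensors to a dipole strength in (9)-(11) can be illustrated with a minimal sketch. The data layout (`apt[A][beta][alpha]`, one column of the S matrix per mode), the function names and the two-atom fixed-charge model are assumptions made here for illustration only; they do not reproduce any particular program, and all quantities are taken to be in consistent SI units.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s


def dipole_derivative(apt, s_column):
    """Return d(mu_alpha)/dQ_j = sum over A, beta of P^A[beta][alpha] * S[A][beta].

    apt[A][beta][alpha] : atomic polar tensor of atom A (assumed layout)
    s_column[A][beta]   : column j of the S matrix, i.e. the Cartesian
                          displacement of atom A along beta per unit Q_j
    """
    dmu = [0.0, 0.0, 0.0]
    for p_a, s_a in zip(apt, s_column):
        for beta in range(3):
            for alpha in range(3):
                dmu[alpha] += p_a[beta][alpha] * s_a[beta]
    return dmu


def dipole_strength(apt, s_column, omega_j):
    """Dipole strength of the 0 -> 1 transition of mode j: (hbar / 2 omega_j) |dmu/dQ_j|^2."""
    dmu = dipole_derivative(apt, s_column)
    return HBAR / (2.0 * omega_j) * sum(c * c for c in dmu)


# Hypothetical two-atom fixed-charge model: APT = +q and -q times the unit matrix.
q = 1.0
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
apt = [[[q * x for x in row] for row in identity],
       [[-q * x for x in row] for row in identity]]
# The two atoms move in opposite directions along z (a stretching motion).
stretch = [[0.0, 0.0, 1.0], [0.0, 0.0, -0.5]]

print(dipole_derivative(apt, stretch))
```

In this toy model only the z component of the dipole derivative is non-zero, as expected for opposite charges moving against each other along z.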

J. Kubelka et al. / QM Calculations of Peptide Vibrational Force Fields and Spectral Intensities


Raman intensity: polarizability derivatives. Raman intensities have similar dependencies on the properties of the BO wavefunction and the molecular polarizability tensor, $\boldsymbol{\alpha}$, once expanded in the normal coordinates. This approach is known as the Placzek approximation [22, 87, 120]. The polarizability depends on the light frequency $\omega$ as:

$$\boldsymbol{\alpha} = \frac{2}{\hbar}\sum_{n}\frac{\omega_{ng}}{\omega_{ng}^2-\omega^2}\,\mathrm{Re}\left(\boldsymbol{\mu}_{gn}\boldsymbol{\mu}_{ng}\right) \tag{13}$$

where $g$ and $n$ denote the ground and excited electronic states and $\omega_{ng}$ is the energy difference between the ground and excited states expressed as a frequency, $\Delta E/\hbar$. Within the BO (Placzek) and harmonic approximations, each vibrational normal mode contributes independently to the spectral intensity, via isotropic invariants of the polarizability. These polarizability derivatives can be computed ab initio, as the second derivative of the energy $\varepsilon$ with respect to the electric field, $F$, and the normal mode derivatives can be obtained by further differentiation of $\boldsymbol{\alpha}$ by $Q$. In practice, Cartesian derivatives are computed and transformed into normal modes utilizing the S-matrix, much as for the APT. Cartesian derivatives of $\boldsymbol{\alpha}$ related to individual atomic coordinates are analogs of the APT for the scattering process. Taking the linear term in the expansion of $\boldsymbol{\alpha}$, the allowed transitions in the harmonic approximation are from $\nu_i = 0$ to $\nu_i = 1$. Observed Raman transition intensities are not just proportional to the square of the polarizability change, $[\alpha^{(i)}]^2 = [\partial\alpha/\partial Q_i]^2$, but additional tensor components contribute depending on the experimental setup. For example, the back-scattering Stokes Raman intensity for an isotropic sample within the harmonic approximation would be given by

$$S_i = 6K\sum_{\alpha=x,y,z}\;\sum_{\beta=x,y,z}\left(7\,\alpha^{(i)}_{\alpha\beta}\alpha^{(i)}_{\alpha\beta} + \alpha^{(i)}_{\alpha\alpha}\alpha^{(i)}_{\beta\beta}\right)\left[1-\exp\left(-\frac{\hbar\omega_i}{k_B T}\right)\right]^{-1}\frac{1}{\omega_i}, \tag{14}$$

where $K$ is a constant (since absolute scattered intensities are rarely measured).

VCD intensity: atomic axial tensor. To evaluate VCD intensity, it is necessary to calculate the rotational strength, which can be expressed as:

$$R_{0\nu} = \mathrm{Im}\left(\boldsymbol{\mu}_{0\nu}\cdot\mathbf{m}_{\nu 0}\right) = \mathrm{Im}\left(\langle \chi_0 | \boldsymbol{\mu} | \chi_\nu \rangle\cdot\langle \chi_\nu | \mathbf{m} | \chi_0 \rangle\right) \tag{15}$$

where $\boldsymbol{\mu}$ and $\mathbf{m}$ are, respectively, electric and magnetic transition dipole moments. The electric moment is evaluated as before (eqns. (10)-(12)). For the magnetic moment it is necessary to consider the dependence of the electronic wavefunction on the nuclear velocities (i.e. momenta) [20-23, 103, 104]. Expansion of the magnetic moment in the nuclear momenta, $P$, and retaining only the linear term leads to the expression:

$$\langle \chi_0 | m_\alpha | \chi_1 \rangle_j = -i\sqrt{\frac{\hbar\omega_j}{2}}\sum_{A,\beta} M_A\left(\frac{\partial m_\alpha}{\partial P_{A\beta}}\right)_0 S_{A\beta,j} = -\left(2\hbar^3\omega_j\right)^{1/2}\sum_{A,\beta} M^{A}_{\beta\alpha} S_{A\beta,j} \tag{16}$$

where $S_{A\beta,j}$ is defined in (8) and

$$M^{A}_{\beta\alpha} = \frac{iM_A}{2\hbar}\left(\frac{\partial m_\alpha}{\partial P_{A\beta}}\right)_0 = I^{A}_{\beta\alpha} + \frac{ieZ_A}{4\hbar}\sum_{\gamma}\varepsilon_{\alpha\gamma\beta}R^{0}_{A\gamma} \tag{17}$$

is the atomic axial tensor (AAT). As for the APT (eqn. 11), $M^{A}_{\beta\alpha}$ is separated into the electronic ($I^{A}_{\beta\alpha}$) and nuclear part, the last term in (17), where $Z_A$, $M_A$ and $R^{0}_{A}$ are the nuclear charge, mass, and equilibrium position, respectively, and $\varepsilon_{\alpha\gamma\beta}$ is the antisymmetric (Levi-Civita) tensor. The nuclear contribution is again straightforward to evaluate, and a similar procedure as for the APT can be used to obtain the AAT


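electronic contribution, except a magnetic field is now the perturbation. Formal derivation of the electronic part requires the dependence of the wavefunction on nuclear momenta, which is equivalent to inclusion of non-Born-Oppenheimer corrections. The final expression for the electronic part of the AAT contains only the nuclear position and magnetic field derivatives of the wave function [20, 103, 121, 122]:

$$I^{A}_{\beta\alpha} = \left\langle \left(\frac{\partial \varphi_G}{\partial R_{A\beta}}\right)_0 \middle| \left(\frac{\partial \varphi_G}{\partial B_\alpha}\right)_0 \right\rangle \tag{18}$$

where

$$\left(\frac{\partial \varphi_G}{\partial B_\alpha}\right)_0 = -\sum_{K\neq G}\frac{\varphi_K\,\langle \varphi_K | m_\alpha | \varphi_G \rangle}{E_G - E_K} \tag{19}$$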

The expression (18) corresponds to the most commonly used magnetic field perturbation (MFP) formalism for VCD, first derived by Stephens [20, 103] and independently by Buckingham and coworkers [123]. Stephens showed that a sum over states formula describing the response of $\varphi_G$ to the nuclear motion (needed in (18)) can be avoided, without any approximations, by introducing the magnetic field perturbation [124]. The magnetic term in (18), (19) containing the gradient operator suggests that accurate wave functions might be needed to obtain useful AAT, a condition that has led to much testing of this theory for its basis set sensitivities. There are several other theoretical models of VCD [23, 105, 125-131] but these are not commonly used.

ROA intensity: optical activity tensors. For ROA, terms beyond the (electric) dipolar approximation must also be considered, in particular $A$, the electric dipole-electric quadrupole ($\Theta$) polarizability, and $G'$, the electric dipole-magnetic dipole polarizability (the optical rotation tensor):

$$A_{\alpha,\beta\gamma} = \frac{2}{\hbar}\sum_{n}\frac{\omega_{ng}}{\omega_{ng}^2-\omega^2}\,\mathrm{Re}\left(\mu_{\alpha,gn}\Theta_{\beta\gamma,ng}\right), \tag{20}$$

$$G'_{\alpha\beta} = -\frac{2}{\hbar}\sum_{n}\frac{\omega}{\omega_{ng}^2-\omega^2}\,\mathrm{Im}\left(\mu_{\alpha,gn}m_{\beta,ng}\right). \tag{21}$$

As for ordinary Raman, the expressions for the experimentally observed ROA intensities are complex, depending on the experimental design. For example, the simple Stokes backscattering ROA intensity is given by

$$S^{\mathrm{ROA}}_i = \frac{48K}{c}\sum_{\alpha,\beta=x,y,z}\left(3\,\alpha^{(i)}_{\alpha\beta}G^{\prime(i)}_{\alpha\beta} - \alpha^{(i)}_{\alpha\alpha}G^{\prime(i)}_{\beta\beta} + \frac{\omega}{3}\sum_{\gamma,\delta}\varepsilon_{\alpha\beta\gamma}\,\alpha^{(i)}_{\alpha\delta}A^{(i)}_{\beta,\gamma\delta}\right)\left[1-\exp\left(-\frac{\hbar\omega_i}{kT}\right)\right]^{-1}\frac{1}{\omega_i} \tag{22}$$

where $c$ is the velocity of light. The derivatives of $A$ and $G'$ can be thought of as scattering analogues of the AAT. These are normally computed using Cartesian derivatives [22, 24, 25], and a fully analytical implementation has recently appeared [25, 106].

4.3. Computational Aspects

The first non-empirical computations of molecular FF and intensities used semiempirical QM methods (e.g. CNDO or MNDO) [132, 133], and later ab initio Hartree-Fock (HF) approaches were developed [134-136]. Initially these used finite difference approaches, where the energy would be recalculated for a set of small deviations from the equilibrium geometry, but now most use more efficient and accurate analytical methods, as originated by Pulay [5, 118, 119, 137, 138]. While HF methods are more accurate than semi-empirical FF, they require greater computer resources, limiting the


accessible molecular size. HF-level calculations are also systematically in error (frequencies too high). Higher level correlated calculations (e.g. MP2) can improve the results, but at a much greater computational cost. Fortunately, more efficient density functional theory (DFT) methods [139-143] were adapted to IR, Raman and VCD calculations, enabling more accurate modeling even for bigger molecules [5, 104, 122, 130, 144-147]. Accuracy approaching that of the correlated calculations could be obtained with some DFT functionals using CPU times comparable to those needed for HF calculations. However, DFT incorporates the functional as an unknown, since there is no systematic method like the variational principle for its choice. Experience now provides guidance whereby the results obtained with some standard functionals and basis sets have been shown to be reliable, although systematic testing is still desirable [144, 145, 147-150]. DFT is an ideal method for the simulations of spectra for large biological molecules [151], where it is necessary to find a compromise between accuracy and computer resources.

The following sections provide a brief overview of DFT and analytical derivative DFT theory. DFT has been reviewed extensively [140-142, 152-154], including reviews focused on biological systems [151], comparisons of practical DFT and HF methods [143], as well as various density functional methods [144] and analytical derivative methods [5, 137, 144, 155]. Several different QM codes have been developed that incorporate DFT calculations. Perhaps the most commonly used is the Gaussian suite of programs [156], which has a long history in chemistry labs and is fairly accessible. CADPAC [157] and the Dalton suite of programs [158] are more specialized for calculations of molecular electromagnetic properties, the latter being freely available.
The DFT-only Amsterdam density functional package (ADF), based on Slater atomic orbitals, computes IR and VCD intensities as well [159].

DFT Energy Computations. QM methods that solve the electronic Schrödinger equation with no empirical parameters are referred to as ab initio. Until recently these mostly used the HF approximation, where the wavefunction is replaced by a Slater determinant (antisymmetric product) of molecular orbitals. The coupled one-electron HF equations

$$-\frac{\hbar^2}{2m_e}\nabla^2\psi_i + V\psi_i = \varepsilon_i\psi_i \tag{23}$$

were solved to self-consistency (SCF). DFT encompasses the various attempts to replace the wavefunction by the electronic density, which would also allow molecular properties to be determined "ab initio". Molecular DFT methods are based on the Kohn and Sham (KS) approximation, where a HF-type wavefunction is introduced and the resultant equations resemble the HF one-particle equations (23). The only difference is in the electron-electron potential term, $V$, which can be a parameterized function ("functional") of the density in DFT. In Kohn-Sham (KS) DFT [139] the molecule is approximated as a system of noninteracting electrons, with independent orbitals $\psi_i$ for which a one-body potential adjusts the independent particle density

$$\rho = \sum_i |\psi_i|^2 \tag{24}$$

to be the same as for the real, fully interacting system. The energy of the real system is


$$E = T_0 + \int V_{ext}\,\rho\,d\mathbf{r} + J(\rho) + E_{XC}(\rho) \tag{25}$$

where $T_0$ is the kinetic energy of the non-interacting system,

$$T_0 = -\frac{\hbar^2}{2m_e}\sum_i\int\psi_i^{*}\nabla^2\psi_i\,d\mathbf{r}, \tag{26}$$

$V_{ext}$ represents the external field (e.g. electron-nuclear attraction), $J(\rho)$ is the Coulombic electron repulsion

$$J(\rho) = \frac{1}{8\pi\varepsilon_0}\iint\frac{\rho(\mathbf{r}_1)\rho(\mathbf{r}_2)}{r_{12}}\,d\mathbf{r}_1 d\mathbf{r}_2 \tag{27}$$

and $E_{XC}(\rho)$ is called the exchange-correlation (XC) energy, which includes everything not contained in the first three terms. The KS equations are obtained by variation of the energy (25) using relation (24) between the KS orbitals $\psi_i$ and $\rho$, yielding

$$-\frac{\hbar^2}{2m_e}\nabla^2\psi_i + \left(V_{ext} + V_J + V_{XC}\right)\psi_i = -\frac{\hbar^2}{2m_e}\nabla^2\psi_i + V\psi_i = \varepsilon_i\psi_i \tag{28}$$

where $V_J$ is the Coulomb potential and $V_{XC}$ is the exchange-correlation potential, a functional derivative of the XC energy. This yields essentially the same equation as (23), but the potential is different and contains, at least in principle, full electron correlation. Thus the KS method retains the HF simplicity, but is more powerful. Since the quality of DFT calculations depends on $E_{XC}$, which is unknown, development of approximations of exchange and correlation density functionals is a key to the success of DFT methods, as reviewed by Becke [160].

There are three main classes of density functionals. The first, termed the local density approximation (LDA, also known as LSDA for Local Spin-Density Approximation), has the form

$$E^{LDA}_{XC}(\rho) = \int f(\rho(\mathbf{r}))\,d\mathbf{r} \tag{29}$$

where $f$ depends on the "local" electron density. LDA calculations often improve on HF [140, 151], but overestimate correlation and binding energies. Generalized gradient approximation (GGA) functionals correct the overbinding by addition of terms dependent on the gradient of the density, yielding functionals of the form

$$E^{GGA}_{XC}(\rho) = \int f(\rho(\mathbf{r}), \nabla\rho(\mathbf{r}))\,d\mathbf{r}, \tag{30}$$

which are also termed gradient-corrected or non-local functionals. Combined GGA functionals have been proposed including the exchange of Becke (B) [161] and the correlation of Lee, Yang and Parr (LYP) [162] or Perdew and Wang (PW91) [163, 164], which combine to yield the BLYP and BPW91 functionals.
PW91 [165, 166] uses no empirical parameters and is therefore rigorously ab initio, while the Becke exchange functional contains a parameter fit to the exact exchange energies of noble gas atoms. Aside from these two "pure" DFT methods, a third class, the hybrid functionals, incorporates the Fock exchange integral

$$E^{F}_{X} = -\frac{1}{8\pi\varepsilon_0}\sum_{i,j}\iint\frac{\psi_i^{*}(\mathbf{r}_1)\,\psi_j(\mathbf{r}_1)\,\psi_j^{*}(\mathbf{r}_2)\,\psi_i(\mathbf{r}_2)}{r_{12}}\,d\mathbf{r}_1 d\mathbf{r}_2, \tag{31}$$

where the sum runs over all spin orbitals. The most popular of these is the Becke three-parameter hybrid functional B3LYP [167]
$$E^{B3LYP}_{XC} = (1-c_1)E^{S}_{X} + c_1E^{F}_{X} + c_2\left(E^{B}_{X} - E^{S}_{X}\right) + (1-c_3)E^{VWN}_{C} + c_3E^{LYP}_{C} \tag{32}$$


in which $c_1$, $c_2$ and $c_3$ are constants determined by fitting experimental data, giving these methods an empirical flavor. In addition, the Fock exchange (31) is a truly nonlocal term, which significantly increases the computational complexity and worsens the scaling of the hybrid methods. Nevertheless, B3LYP is widely used for vibrational analysis, owing to the very good agreement it gives between calculated and experimental data.

Basis sets. Both the HF and KS (DFT) methodologies rely on linear combinations of atomic orbitals (LCAO) to form molecular orbitals (MO)

$$\psi_i(\mathrm{MO}) = \sum_k c^{i}_{k}\,\varphi_k(\mathrm{AO}). \tag{33}$$

Most quantum chemical programs use Gaussian type AOs (GTOs). Slater type AOs (STOs) allow a decrease of the number of AOs needed to attain a given precision, but are computationally less efficient. Less commonly used basis sets are formed from purely numerical representations or plane waves [168]. In general, large basis sets provide better results by imposing fewer constraints on the electron distribution, at the cost of longer computational times. Incomplete AO sets may cause a significant error in computed molecular properties, therefore selection of an efficient basis set is important. Basis sets are divided into several types: Minimal basis sets contain the minimum number of basis functions for each atom, an example of which is STO-3G [169], where three Gaussian functions are used to approximate a Slater-type atomic orbital. Split valence basis sets have two or more basis functions for each valence (not core) orbital, examples of which are double zeta basis sets, such as 6-31G [170], in which each heavy-atom core orbital is a single contraction of six Gaussians, while each valence orbital on second row atoms (and the 1s orbital of H) is described by a contraction of three Gaussians plus one additional uncontracted function. Triple zeta basis sets, like 6-311G [171], use three separate functions for valence electrons. Polarized basis sets contain added functions with higher angular momentum beyond that required for the simple ground configuration of each atom. For example, 6-31G(d) (also denoted 6-31G*) adds a d-type function for heavy atoms, and 6-31G(d,p) (6-31G**) additionally has a p-type function for hydrogen. Diffuse basis sets [172] add radially larger s- and p-type functions, such as 6-31++G, which adds a diffuse s function on hydrogen and diffuse s and p functions on second row atoms. Diffuse functions allow electrons more overlap with other atoms and are important for anions, atoms with lone pairs, excited states and some properties, in particular the electronic polarizability.
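How quickly the number of basis functions grows with these choices can be seen from a small counting sketch for N-methylacetamide (NMA, C3H7NO). The per-atom counts below assume the usual convention of six Cartesian d functions for 6-31G(d); the dictionary and function names are introduced here for illustration only.

```python
# Basis functions per atom for second row heavy atoms (C, N, O) and hydrogen.
# Assumption: 6-31G(d) adds six Cartesian d functions per heavy atom.
BASIS_FUNCTIONS = {
    "STO-3G":     {"heavy": 5,  "H": 1},   # 1s, 2s, 2p(x,y,z) | 1s
    "6-31G":      {"heavy": 9,  "H": 2},   # 1s + 2 x (2s, 2p) | 2 x 1s
    "6-31G(d)":   {"heavy": 15, "H": 2},   # + 6 d functions on heavy atoms
    "6-31G(d,p)": {"heavy": 15, "H": 5},   # + 3 p functions on hydrogen
}


def count_basis_functions(formula, basis):
    """Total number of basis functions for a molecule given as {element: count}."""
    per_atom = BASIS_FUNCTIONS[basis]
    total = 0
    for element, n_atoms in formula.items():
        kind = "H" if element == "H" else "heavy"
        total += n_atoms * per_atom[kind]
    return total


nma = {"C": 3, "H": 7, "N": 1, "O": 1}
for basis in BASIS_FUNCTIONS:
    print(basis, count_basis_functions(nma, basis))
```

Since the cost of the calculations rises steeply with the number of basis functions, even the step from 6-31G to 6-31G(d,p) is significant for larger peptides.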
In principle, basis sets can be expanded infinitely, eventually yielding exact results, but in practice this is severely limited by available computational power. Generally, molecular properties computed with DFT methods show fairly rapid convergence with basis set size. Typically, valence double-zeta (or split-valence) polarized quality basis sets, for example 6-31G(d), produce acceptable results for geometry optimization and vibrational frequencies, with some notable exceptions [151]. Nevertheless, evaluation of the influence of the basis set size on calculated properties is important.

Computing analytical harmonic FF, APT and AAT. The most computationally demanding task is the DFT computation of the harmonic FF, i.e. the second derivatives of the total energy with respect to the nuclear displacements. Analytical derivative calculations require solving coupled perturbed Kohn-Sham (CPKS) equations, in analogy to the coupled perturbed HF (CPHF) theory [118, 119, 137, 173, 174]. The XC integrals cannot be computed analytically and in general must be evaluated by numerical quadrature on a grid. However, the use of cutoffs, efficient weighting schemes and grid compression can significantly reduce the computational cost [175]. In addition, more efficient evaluation of Coulombic term derivatives (27) in CPKS can be


obtained by use of the fast multipole method (FMM) [155, 176, 177]. As a result, the DFT analytical second derivatives scale much better with the size of the system than do the HF level analytical derivatives (CPHF equations), since the latter require evaluation of the Fock exchange terms, which cannot be improved by FMM [155]. The more favorable scaling makes DFT spectral calculations more attractive, especially for large peptides. By contrast, hybrid DFT methods, such as B3LYP (32), are computationally more expensive, since they require evaluations of both the HF exchange as well as the DFT XC quadrature. Advanced implementations of DFT promise near linear scaling for large molecules [178-180]. Improvements of the efficiency of DFT geometry optimizations and frequency calculations have been described in detail [175]. The APTs are normally calculated together with vibrational frequencies using (12) and do not add a significant overhead to the computational cost [181]. Additional considerations, however, are necessary for computations of the AAT (18) to obtain VCD intensities, since with conventional basis sets they are dependent on the choice of the coordinate origin [121]. The gauge origin problem is a direct consequence of incomplete basis sets, but can be eliminated by using magnetic field dependent atomic orbitals, known as gauge including atomic orbitals (GIAO) [182, 183]. Bak and coworkers [184-186] first used GIAO for VCD calculations, which reduces the need for large basis sets; in fact, GIAOs accelerate basis set convergence of magnetic properties [184]. Similarly, ROA intensities become origin-independent with GIAOs.

5. QM Simulations of Peptide Vibrational Spectra

5.1. Model IR and VCD Simulations

Due to computational limits, ab initio calculations of IR spectra for peptides initially focused on small model amides, such as NMA [77, 187-192], and then expanded to peptides containing two and three amide groups [193-196]. Some recent studies still focus on this level of molecular complexity while pursuing models of ever more complex interactions. HF force fields normally overestimate the vibrational frequencies, which seems to correlate with inaccurate optimized geometries (the bond lengths are typically too short) [5]. As a remedy, scaling of the ab initio FF, termed the scaled quantum mechanical (SQM) FF, was proposed to best match experiment, but this in turn makes it partially empirical [5, 135]. Development and implementation of DFT methods with GGA or hybrid functionals allowed for efficient computation of much more accurate force fields [147, 197-199] as well as infrared intensities [148, 149, 200]. NMA, having a single peptide bond, has become a benchmark for amide vibrational frequency calculations [168, 187-191, 201-216]. Figure 4 shows example results from our laboratories, comparing experimental frequencies for NMA in vapor with values computed using HF, correlated and various DFT methods (Fig. 4A, left) and various sized basis sets with one DFT functional (Fig. 4B, right). As can be seen, the pattern of improvement with regard to fitting these two experimental frequencies is not smooth, but the HF and higher correlated level results (with the same basis set) are all >100 cm⁻¹ too high, while the DFT results with various basis sets agree better, although varying non-monotonically. From comparison of various DFT methods, we and others [217] have found the BPW91 functional to perform best for simulations of the amide I and II spectra, but B3LYP and other hybrid functionals may be better for other modes. Generally, our DFT vibrational spectra simulations use the BPW91 density functional.


Figure 4. Quantum chemical calculations of amide I (dots), amide II (squares) and amide III (triangles) vibrational frequencies contrasted with experimentally determined gas phase values (dashed lines). Left (A): Comparison of HF, higher correlated methods (CASSCF: complete active space SCF; MP2: second order Møller-Plesset perturbation theory; QCISD: quadratic configuration interaction singles, doubles; CCD: coupled clusters doubles) and several common DFT methods (LSDA: local spin-density approximation; GGA: generalized gradient approximation; and hybrid HF+GGA density functionals). The calculations all use a 6-31G(d) basis set. Right (B): Basis set dependence of density functional calculations using BPW91. Note: 6-31G(d0.3) is a 6-31G(d) basis set with a stretched d-basis function using a lower Gaussian exponent of 0.3 (default is 0.8).

Comparison of calculated spectra for small peptides with experimental data for larger peptides with defined secondary structure requires constraining the peptide backbone geometries (φ, ψ torsions) to specific values. Determining an experimental system whose spectra can sensibly be compared with fully optimized small peptide structures is a major impediment, since real peptides fluctuate and short sequences do not usually have stable, well-defined secondary structures, computationally or experimentally, unless stabilized by specific interactions. Although small peptides have been shown to have a significant component of polyproline II (PPII or 3₁-helical) type structure [218-222], which stems from the local left-handed turn or extended helix nature of the random coil [219, 223], this has little impact on understanding of spectra for well-defined conformations. Recent interest in unstructured proteins and peptides [224-226] as well as development of structural tools based on coupling of inequivalent modes have increased studies of small peptide structures, but realistic modeling of solvent effects remains a challenge (vide infra). Already at the dipeptide level, the computed spectra for constrained helical or sheet conformations have characteristic features of the specific secondary structures due to the dominance of near-neighbor coupling [227]. Tripeptides give similar results, and show a clear discrimination between helix and sheet, as well as helix and coil, see


Figure 5 [228]; however, small peptide models lack the build-up of intensity in single delocalized modes characteristic of exciton coupling in polymers. Similar calculations for many small peptides have been carried out by other groups, several using full sidechains to relate to experimental conditions (see below), and/or full minimization to find a QM structure [26, 27, 228-234].

Figure 5: Simulated IR (top), VCD (middle) and Raman (bottom) spectra for triamides Ac-Ala2-NHCH3 in (a) α-helical, (b) 3₁₀-helical and (c) left-handed 3₁-helical (PolyProII-like) conformations, calculated at the DFT BPW91/6-31G* level. Vertical lines indicate the positions and relative intensities of the normal modes. Envelopes represent Lorentzian broadened bandshapes centered on those positions which were summed to give a representative spectrum in molar units (for IR and VCD).
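The construction of such envelopes from stick spectra can be sketched as follows. This is a minimal illustration only: the 20 cm⁻¹ full width, the function names and the example sticks are assumptions made here, not values taken from the simulations in Figure 5.

```python
import math


def lorentzian(x, x0, fwhm):
    """Area-normalized Lorentzian centered at x0 with full width at half maximum fwhm."""
    hwhm = fwhm / 2.0
    return (hwhm / math.pi) / ((x - x0) ** 2 + hwhm ** 2)


def broaden(sticks, grid, fwhm=20.0):
    """Sum one Lorentzian per (position, intensity) stick on the given frequency grid.

    Intensities may be negative, as for VCD rotational strengths.
    """
    return [sum(inten * lorentzian(x, x0, fwhm) for x0, inten in sticks)
            for x in grid]


# Hypothetical stick spectrum (positions in cm^-1, arbitrary signed intensities).
sticks = [(1650.0, 1.0), (1680.0, -0.4)]
grid = [1600.0 + k for k in range(0, 121)]
envelope = broaden(sticks, grid)
print(max(envelope), min(envelope))
```

Because each Lorentzian is area-normalized, the integrated band area of the envelope is governed by the stick intensities, while the chosen width only controls how strongly neighboring modes overlap.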

Development of improved computer systems, codes and DFT methods made it possible to move beyond these small oligopeptides and directly address computations of moderately large peptide structures. For modeling α-helices, we first computed 7-amide oligo-Ala peptides so that the center amide group had H-bonds forward and back, then these were extended to 10- and 11-amide structures [26]. In addition to α-helices, we have also modeled 3₁₀- and 3₁-helical as well as β-sheet geometries [26, 29, 112, 227, 228, 235-241]. A set of 10-mer structures were allowed to fully minimize from initial α-helix, 3₁₀-helix and ProII-helix geometries. Without solvent (vacuum) the Ac-Ala9-NH-Me α-helical structures reverted to a 3₁₀-helical geometry, but the Ac-Pro9-NH-Me did minimize to both Pro I and ProII conformations (cis and trans amide structures with right- (10₃) and left- (3₁) handed helical structures) [26]. With a solvent correction (see below), all three structures were stable. This minimization study enabled comparison of spectra calculated for different oligopeptide lengths, and also for idealized α-helices with those for fully minimized structures. As shown in Figure 6, with increasing length of the peptide, the IR intensity builds in one component and the amide I becomes dominant, which is characteristic for extended repeating structures.


The differences between the fully minimized and an ideal oligomer (Fig. 6c,d) are very minor. Subsequently Cheeseman and Stephens [234] have done similar full minimizations on peptides as long as 17, 20 and 25 residues, and Dannenberg up to 18 residues [233], using DFT/semiempirical ONIOM methodology [27, 242]. Raman simulations have been done for similar constrained structures (Roy et al. unpublished).

Figure 6. Simulated IR and VCD spectra for various length α-helical model oligopeptides at the DFT BPW91/6-31G(d) level: (a) idealized α-helical triamide Ac-(Ala)2-NH-CH3, (b) idealized α-helical heptaamide Ac-(Ala)6-NH-CH3, (c) idealized α-helical undecaamide Ac-(Ala)10-NH-CH3, and (d) fully optimized decaamide Ac-(Ala)9-NH-CH3. Intensity units as in Figure 5.

Important insights into the characteristics of the vibrational spectra of the β-sheet structures were obtained from the DFT simulations of the model peptide β-sheets. In Figure 7 the simulated IR and VCD are compared for a single tripeptide β-strand, and two- and three-stranded antiparallel peptide β-sheets. While even the single strand bears the characteristic qualitative features of the β-sheet IR and VCD spectra [228] (Figure 2), the cross-strand interactions and H-bonding enhance the amide I band dispersion, make the amide II region complex and normalize the relative amide I-II intensity ratio [30, 239]. The frequency shifts reflect the increase in H-bonding, since the edge residues point into vacuum but the interior ones (multistrand models) are H-bonded; this difference, however, would be less dramatic if the edges were solvated with water [239]. The VCD decreases with added strands, reflecting the delocalization of the underlying modes over the effectively more planar (and therefore less chiral), larger sheet structure. These results are even more evident in larger β-sheets, whose spectra were simulated using the parameter transfer method (next section). Assuming that the polypeptide can be treated as having identical residues is computationally useful and reflects the empirical observation that all peptides and proteins have similar frequency patterns. However, when trying to match the more detailed experimental features, particularly for heterogeneous structures (see below [235, 237, 238, 241, 243, 244]), the sequence becomes more important. For a series of tripeptides, AlaXxxAla, all constrained to the same geometry, we varied Xxx to see the effect on the computed frequencies (diagonal FF). Site specific spectral shifts due to the side chain can best be recognized by isotopic labeling of the Xxx amide C=O group with ¹³C=¹⁸O to decouple it from the other C=O groups (Table 1).
Beyond the variations seen here, a Pro in the sequence makes the Ala-Pro tertiary amide chemically different, shifting the frequency significantly down (while losing the amide II contribution), which can have a significant impact on the interpretation, especially of β-sheet-like modes [235]. Others have also investigated the effects of different sidechains on the spectra for small peptides [245, 246].

Figure 7. Simulated IR and VCD spectra for β-sheet model oligopeptides at the DFT BPW91/6-31G(d) level: (a) a triamide Ac-(Ala)2-NH-CH3 single β-strand, (b) antiparallel two-stranded β-sheet (triamide strands) and (c) antiparallel three-stranded β-sheet (triamide strands). Intensity units as in Figure 5.

Table 1. Uncorrected amide I frequencies (cm⁻¹) simulated at the BPW91/6-31G** level for tripeptides labeled on the center (*) amide with ¹³C=¹⁸O.

AXA (a)     PPII                    3₁₀                     α-helix
AG*A        1667                    1658                    1658
AA*A        1662                    1655                    1651
AV*A        1655                    1657                    1641
AL*A        1650                    1655                    1637
AF*A (b)    1663 (F), 1616 (ring)   1647 (F), 1612 (ring)   1649 (F), 1616 (ring)

(a) Constraints for (φ, ψ) are: α-helix (-57°, -47°), PPII (-78°, +149°), 3₁₀ (-60°, -30°) [247].
(b) Phe ring modes were separated enough to avoid coupling to the amide.
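Why the ¹³C=¹⁸O label moves the amide I of the labeled group away from its unlabeled neighbors can be illustrated with a rough harmonic estimate. The sketch below treats the C=O stretch as an isolated diatomic oscillator, which is an assumption made here purely for illustration: amide I is not a pure C=O stretch, so the shift in a real peptide is smaller than this limit, and the numbers are not a substitute for the DFT values in Table 1.

```python
import math

# Isotope masses in atomic mass units.
MASS = {"12C": 12.0, "13C": 13.00335, "16O": 15.99491, "18O": 17.99916}


def reduced_mass(a, b):
    """Reduced mass of a diatomic oscillator built from isotopes a and b."""
    return MASS[a] * MASS[b] / (MASS[a] + MASS[b])


def isotope_frequency_ratio(light=("12C", "16O"), heavy=("13C", "18O")):
    """Harmonic frequency ratio nu_heavy / nu_light = sqrt(mu_light / mu_heavy)."""
    return math.sqrt(reduced_mass(*light) / reduced_mass(*heavy))


ratio = isotope_frequency_ratio()
print(round(ratio, 4))
```

In this diatomic limit the frequency drops by roughly 5%, which is large compared with the inter-residue coupling constants and therefore sufficient to isolate the labeled oscillator.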

5.2. Model Raman and ROA Simulations

Similar to VCD and IR, accurate force field modeling is vital for interpreting the structural information from Raman and ROA spectra, which has been exemplified by published simulations of these spectra for the polyproline II conformation [248]. Due to the variation in sensitivity to modes other than the large electric dipole transitions seen in IR and VCD, Raman and ROA provide a complementary probe of structure. In general, this implies that sidechain modes should also be simulated. The precision of ROA simulations is rather limited for polypeptides. Two reasons for such difficulties were found in small molecule studies. First, for the alanine zwitterion the experimental signal can be simulated faithfully only by accounting for internal rotation of the NH3+, CO2- and CH3 groups [249]. Sensitivity to rotamer populations (effectively coupling low- and high-frequency modes, primarily an issue for side chains) requires Boltzmann averaging of an ensemble of conformations. Second, for the lower-frequency modes, the spectrum of the solvent becomes inseparable from that of the solute [250].
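The Boltzmann averaging over conformers mentioned above can be sketched in a few lines. The function names, the use of relative energies in kJ/mol and the assumption that all spectra are tabulated on a common frequency grid are choices made here for illustration; in practice free energies rather than electronic energies would normally be the appropriate input.

```python
import math

R_GAS = 8.314462618e-3  # gas constant, kJ / (mol K)


def boltzmann_weights(rel_energies_kj_mol, temperature=298.15):
    """Normalized Boltzmann populations from relative conformer energies (kJ/mol)."""
    e_min = min(rel_energies_kj_mol)
    factors = [math.exp(-(e - e_min) / (R_GAS * temperature))
               for e in rel_energies_kj_mol]
    total = sum(factors)
    return [f / total for f in factors]


def average_spectrum(spectra, weights):
    """Population-weighted sum of spectra tabulated on the same frequency grid."""
    n_points = len(spectra[0])
    return [sum(w * s[k] for w, s in zip(weights, spectra))
            for k in range(n_points)]


# Hypothetical example: two rotamers of equal energy with different spectra.
weights = boltzmann_weights([0.0, 0.0])
print(average_spectrum([[1.0, 0.0], [0.0, 1.0]], weights))
```

For ROA and VCD the conformer spectra are signed, so contributions from different rotamers can cancel; this is one reason why the averaged spectrum can differ markedly from that of any single conformer.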


Simulations of Raman for amide modes (using Ala oligomer models) yield frequency patterns shifted from the IR bands as expected from empirical results.
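For reference, the backscattering Raman expression (14) combines the normal mode polarizability derivatives as 7 α_αβ α_αβ + α_αα α_ββ, summed over the Cartesian indices, and multiplies the result by a Boltzmann population factor and 1/ω. A minimal sketch of that evaluation is given below; the function names are introduced here, the tensor is assumed to be a plain 3×3 nested list for one mode, and ω is assumed to be an angular frequency in rad/s.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
KB = 1.380649e-23       # Boltzmann constant, J / K


def backscattering_invariant(a):
    """Sum over alpha, beta of 7 a_ab a_ab + a_aa a_bb for a 3x3 derivative tensor a."""
    total = 0.0
    for i in range(3):
        for j in range(3):
            total += 7.0 * a[i][j] * a[i][j] + a[i][i] * a[j][j]
    return total


def raman_backscattering(a, omega, temperature=298.15, k_const=1.0):
    """Relative Stokes backscattering intensity of one mode, following eq. (14)."""
    population = 1.0 / (1.0 - math.exp(-HBAR * omega / (KB * temperature)))
    return 6.0 * k_const * backscattering_invariant(a) * population / omega


isotropic = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
traceless = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
print(backscattering_invariant(isotropic), backscattering_invariant(traceless))
```

The trace term vanishes for a purely anisotropic (traceless) tensor, so such modes contribute only through the first term of the sum.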

5.3. Cartesian Coordinate Transfer (CCT) Method

In the examples above, we have shown selected computational results using DFT methods to simulate spectra for relatively small conformationally constrained peptides as well as fully minimized structures. For very large molecules this approach is not practical with available computational resources. Consequently, we developed a method of transfer of force constants, APT, AAT and the Raman tensor derivatives from computations on smaller molecules to much larger ones [28]. This approach is based on the basic assumption of transferability of the force constants, which harks back to empirical methods, in which force constants expressed in terms of internal coordinates were transferred from various molecules onto more complex systems and optimized by fitting experimental spectra. The same approach is also widely used in developing the molecular mechanics (MM) FFs in wide use today. The differences are first that we calculate not only local force constants but complete inter-residue interactions, albeit limited in range, and second that our FFs are not empirical but represent DFT results transferred to larger structures. The CCT is based on the property of regular conformations that amide units will have the same local environment in a large oligopeptide, L, with N amides, as in a smaller oligopeptide, s, with n amides, constrained to have the same conformation. FF and intensity parameters from s (e.g. those calculated at the DFT level) can be successively assigned, unit by unit, to the corresponding parts of L [28]. This is illustrated in Figure 8 where L is an extended α-helix, N = 20, and s is a blocked hexapeptide, n = 7. In this case, the nitrogen-terminal atom parameters of s can be transferred to the corresponding N-terminus of L and the C-terminal parameters of s to the L C-terminal atoms. The center residue in s is H-bonded forward and back and provides a good model for the repeating residues in the longer helix.
A single calculation on s then suffices to provide data to create a nearly ab initio (DFT) quality FF for L, with the only loss being interaction constants for residues more than n positions apart in the chain, assuming the longer peptide is fully regular. The applicability of our original transfer method for oligopeptide FF, APT and AAT stems from the approximation that oligopeptides with defined secondary structure are composed of repeating connected amide units, if the differences in the side-chains are ignored. The approximation of identical side-chains is a minor limitation for most systems, but for some sequences, especially those containing Pro, it may not be sufficient, as described above [235]. For a coupled helical sequence (or coupled strands to form a sheet), the inter-residue coupling is larger than or similar to typical diagonal FF variations between residues, and these generally do not greatly disturb the overall spectral appearance and dispersion. A more restrictive assumption in this CCT model is that of repeating geometry, or translational symmetry. For systems with an irregular structure we have developed an analogous, although less efficient procedure utilizing DFT computations for a greater number of fragments [235, 238, 239, 251, 252].
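The unit-by-unit assignment described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the authors' implementation: each amide unit is reduced to a d x d block, the function name `transfer_blocks` and the window-centring rule are hypothetical, and the rotation of the Cartesian blocks into the frame of L is ignored here.

```python
import numpy as np

def transfer_blocks(f_s, n, N, d=3):
    """Fill an (N*d x N*d) matrix for L from the (n*d x n*d) matrix of s.

    Units I, J of L that are fewer than n positions apart take their block
    from the pair of units in s that sits at the same place in a window of
    n units centred on the pair (clamped at the termini, so terminal units
    of L receive terminal parameters of s). Pairs n or more units apart
    are left at zero, which is the approximation discussed in the text.
    """
    F = np.zeros((N * d, N * d))
    for I in range(N):
        for J in range(N):
            if abs(I - J) >= n:
                continue  # beyond the span of the small fragment
            # window start that centres the pair, kept inside the chain
            w = int(round((I + J) / 2 - (n - 1) / 2))
            w = min(max(w, 0), N - n)
            i, j = I - w, J - w
            F[I*d:(I+1)*d, J*d:(J+1)*d] = f_s[i*d:(i+1)*d, j*d:(j+1)*d]
    return F
```

With n = 7 and N = 20, for instance, every pair of units up to six positions apart receives a block from s, while all longer-range blocks stay zero.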

J. Kubelka et al. / QM Calculations of Peptide Vibrational Force Fields and Spectral Intensities

Figure 8. Schematic of the CCT transfer idea from a helical 7-amide small peptide, Ac-(Ala)6-NHMe, to a helical 20-amide large oligopeptide, Ac-(Ala)19-NHMe. The N-terminal residue parameters transfer to the N-terminus, and C-terminal to the C-terminus, while the parameters from the center residue, H-bonded both forward and back, transfer to all the central residues that have the same properties in the large peptide.

The CCT algorithm. In practice, the CCT method uses a molecular graphics routine [253] for fitting s onto the corresponding region of L to specify the atomic overlaps for transfer of the appropriate FF, APT and AAT parameters. The FF, APT and AAT all transform as second-rank Cartesian tensors under rotations, while derivatives of α and G have three and A has four indices to transform. To assign the proper parameters to the target atoms (A, B, ...), the corresponding atoms from the best fit on the small fragment, (a, b, ...), must be transformed to the same orientation, by minimizing

$\delta(U) = \left| R - U r \right|^{2}$ (34)

where R is a column vector of coordinates of (A, B, ..., M) on L, r is a column vector of coordinates of (a, b, ..., m) on s, and U is a unitary ($U^{t}U = E$), 3×3 rotation matrix dependent on the Euler angles (θ, φ, χ). The matrix U is then used to transform the FF:

$f_{AB} = U^{t} f_{ab} U$ (35)

The tensors dependent on one atom only, such as APT, AAT and α, are transformed in the same manner [28]:

$P'_{A} = U^{t} P_{a} U, \qquad M'_{A} = U^{t} M_{a} U, \qquad \alpha'_{A} = U^{t} \alpha_{a} U U$ (36)

In transfers from smaller segments the transferred FF inevitably lacks the interactions between some distant atoms in the target molecule, but for sufficiently large fragments, the impact is negligible (as shown below). Other approaches that just use NMA or simple dipeptides to get diagonal and possibly near-neighbor interaction constants (all normally identical) make a much bigger assumption about transferability,
but of course gain flexibility to apply their parameters to a variety of structures [31, 254, 255]. Torii [77] and others [73, 215, 254, 256] have developed maps of interaction terms by doing FF calculations for small peptides whose conformations were varied over a range of (φ,ψ) angles. The appropriate conformational correction values for each local element can be abstracted from the maps and added to the diagonal terms (typically degenerate) to develop a more conformationally sensitive FF for larger hetero-structured molecules. In our model, the missing long-range interaction constants can be approximated, for example, by a semiempirical calculation of the FF of the whole peptide, onto which the more accurate ab initio FF can be subsequently transferred, or by TDC calculations of the missing interaction terms [112, 257]. In practice, we have found these corrections to add little, provided sufficiently large fragments, e.g. about n ≥ 5, were used for s. The CCT also makes it possible to combine FFs, APT and other parameters obtained from different sources. This is useful where implementation of some parameters was not available with DFT or with solvent (PCM) correction, and for transfer of Raman tensors, i.e., the polarizability derivatives, as well as the G and A tensors for ROA. For example, even the local and non-local parts of the G tensor, computed in different ways, can be combined using the same algorithm [24, 25, 250].
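The fitting and tensor rotation steps of the CCT algorithm can be illustrated with a short numerical sketch. It is not the molecular graphics routine cited above: the least-squares rotation is found here with the standard Kabsch (SVD) procedure, and the function names are hypothetical. The code assumes U is defined so that R ≈ U r for column vectors, in which case a 3×3 block maps as U f Uᵗ; the placement of the transpose depends on the direction in which U is defined, so that convention should be treated as an assumption of the sketch.

```python
import numpy as np

def fit_rotation(R, r):
    """Proper rotation U minimising sum_k |R_k - U r_k|^2 (Kabsch method).

    R, r: (M, 3) arrays of matched atom positions on L and on s.
    Both sets are centred first, so a relative translation is ignored.
    """
    Rc = R - R.mean(axis=0)
    rc = r - r.mean(axis=0)
    H = rc.T @ Rc                            # 3x3 covariance matrix
    V, _, Wt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Wt.T @ V.T))   # guard against a reflection
    return Wt.T @ np.diag([1.0, 1.0, d]) @ V.T

def rotate_rank2(f, U):
    """Rotate a 3x3 Cartesian block from the frame of s to the frame of L.

    Assumes the convention R ~ U r stated in the lead-in.
    """
    return U @ f @ U.T
```

A block of the fragment force field between atoms a and b would then be assigned to atoms A and B of the large peptide as `rotate_rank2(f_ab, U)`.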

Figure 9. Raman (top), IR (middle) and VCD (bottom) spectra of an alanine 21-amide peptide, Ac-Ala20-NH-CH3, calculated by CCT transfer of FF, APT and AAT parameters from shorter fragments having (φ,ψ) appropriate for an (a) α-helix, (b) 3₁₀-helix, (c) 3₁-helix. Comparison to Figures 5 and 6 shows the effect of lengthening the strand and developing extended exciton coupled transitions in these repeating structures. The large intensity in the Raman at ~1500 cm⁻¹ is due to the CH3 groups, which are computed too high and overlap the amide II position.

Example peptide IR and VCD spectra calculated by CCT. The main strength of the CCT methodology is allowing non-empirical simulations of vibrational spectra for oligopeptides, which correspond to experimental peptide models with stable secondary structures. Such peptide sequences, typically around twenty or more amino acids, are generally much larger than can be efficiently calculated with DFT methods alone.

Figure 9 shows an example of the IR and VCD spectral simulations by CCT for 21-mers in three different helical conformations [257]. In general, transfer onto longer structures preserves the basic characteristics of the DFT-simulated spectra of the smaller oligomers. However, longer peptides tend to have more intense and narrower bands, due to diminished end effects and to exciton coupling of repeating modes into polymer-like coupled modes. The amide I also gains intensity, both absolute and relative to the amide II, due to the H-bonds in the α- and 3₁₀-helices, but not in the 3₁-helix (no H-bonds). The 3₁-helix results have the amide I IR weaker than the amide II, the opposite of what is typically observed for coil conformations, which are normally hydrated in solution and expected to be locally 3₁-helical [219, 223, 225, 258]. Proline peptides, which form regular 3₁-helices (poly-Pro II structure), do not have an amide II for comparison. However, when solvent effects are added to the 3₁ calculation, the amide I/II relative intensities are computed correctly. VCD patterns for the α-helix and 3₁-helix amide I closely resemble the experimental results, as do α- and 3₁₀-helical amide II predictions, but for β-sheets, while the IR is very good, the VCD agreement is more qualitative. Transfers of Raman tensors have also been carried out to model multiple transitions with some success (A. Roy et al., unpublished) [248].

The simulated 3₁₀-helix spectra in Figure 9 differ in detail from the experimental amide I VCD, which is less intense and has a conservative couplet bandshape. However, since Aib (α-aminoisobutyric acid) is often used to promote formation of the 3₁₀-helix, the effects of Aib on the spectral signatures of 3₁₀-helices were tested by comparing CCT simulations for Aib2n and (Aib-Ala)n 3₁₀-helical peptide models [259]. The results showed that Aib vs. Aib-Ala peptide sequences differ in the amide I VCD bandshapes in a systematic manner that highlighted the potential accuracy of CCT DFT vibrational spectra simulation. The positive bias in the 3₁₀-helical amide I VCD for Aib2n and its amide II being sharper than for (Aib-Ala)n are both predicted in detailed agreement with the experimental spectra for 3₁₀-helices.

Figure 10. Comparison of simulated IR and VCD spectra for three-stranded β-sheets of different lengths. The spectra were simulated by transfer of parameters from DFT calculations on three-stranded fragments with triamide strands and corresponding geometries. An anti-parallel planar β-sheet composed of three octa-amide strands, Ac-Ala7-NH-CH3 (3×8), (a) exhibits the greatest amide I mode dispersion, resulting in a characteristic IR with an intense low-frequency maximum and a secondary high-frequency maximum, and a very weak VCD signal. A protein-like (b) 3×8 anti-parallel β-sheet (model from fatty acid binding protein, 1IFC) and parallel β-sheets, both (c) a planar 3×8 model and (d) a twisted 3×5 variant (Ac-Ala4-NH-CH3 strands) from a protein (pectate lyase, 1PEC), have qualitatively similar, but less split and more broadened amide I IR, and more intense, predominantly negative VCD.

The CCT technique has also been used to simulate vibrational spectra for very large, regular β-sheet structures, which can model peptide aggregates or even the multiple-stranded β-sheets found in proteins [30, 257]. While the basic qualitative features of the β-sheet spectra are apparent even from the DFT calculations for small peptides, the dispersion of the amide I IR bands, which is characteristic of large, extended β-sheets, can be simulated only by CCT. Our calculations [30], shown in Figure 10, indicate that planar anti-parallel sheets exhibit the most distinct amide I IR bandshapes with extremely weak VCD (ΔA/A ~ 5×10⁻⁶), since the sheets are almost planar and therefore the C=O interactions are nearly achiral. By contrast, twisted β-sheets, such as those found in globular proteins, have less pronounced amide I IR splitting and stronger, but still weak, VCD (ΔA/A ~ 2-3×10⁻⁵). The amide I IR spectra of parallel β-sheets, both planar and protein-like, are similar, due to more pleating and greater deviations from planarity.

An enhancement of the structural resolving power of vibrational spectroscopy in biomolecular applications is enabled by site-specific isotopic labeling [260-262]. The coupling between isotopically labeled sites, which is dependent on local structure, can be identified by vibrational spectra, as has been explored by several groups [17, 69, 263-265]. We have done a succession of studies with this technique on both helix and sheet models [29, 112, 228, 236-239, 241, 251, 266-268]. An example of direct experimental measurement of the vibrational coupling between specific sites was demonstrated using a 25-residue α-helical peptide, with two central residues labeled by 13C substitution on the amide C=O [236]. These were compared by varying their relative separations in the sequence. These data show a reversal in the sign of the coupling constant (which shifts the IR component frequencies and flips the sign of the 13C VCD) between the case of sequential (neighboring) labels and those separated by one residue, as shown in Figure 11. This trend is correctly predicted by the theoretical model. Additional studies showed that the coupling drops off with separation, and that larger signals are detectable with added numbers of labels, as would be expected.
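Why a sign change in the coupling constant moves the intense IR component from one side of the labeled band to the other can be seen in a minimal two-oscillator model. This is a toy illustration, not the CCT/DFT calculation: two degenerate local oscillators are coupled by a constant β, and the local frequency, coupling and dipole angle used below are arbitrary illustrative values.

```python
import numpy as np

def two_site_spectrum(omega0, beta, angle_deg):
    """Frequencies and relative IR intensities of two coupled, degenerate oscillators.

    omega0: local frequency (cm^-1); beta: coupling constant (cm^-1);
    angle_deg: angle between the two unit transition dipoles.
    """
    H = np.array([[omega0, beta], [beta, omega0]], dtype=float)
    freqs, modes = np.linalg.eigh(H)       # ascending frequencies, modes in columns
    th = np.radians(angle_deg)
    mu = np.array([[1.0, 0.0, 0.0],
                   [np.cos(th), np.sin(th), 0.0]])
    mode_dipoles = modes.T @ mu            # one summed dipole per normal mode
    return freqs, (mode_dipoles ** 2).sum(axis=1)

# Same splitting, opposite coupling sign: the in-phase (intense) mode swaps sides.
print(two_site_spectrum(1600.0, +5.0, 30.0))
print(two_site_spectrum(1600.0, -5.0, 30.0))
```

For dipoles less than 90 degrees apart, the in-phase combination carries most of the intensity; it lies at ω0 + β, so it is the higher-frequency component for positive β and the lower-frequency one for negative β.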

Figure 11. Experimental amide I (a) IR and (b) VCD spectra for a helical (low temperature), Ala-rich 25-residue peptide unlabeled (thin line) and with two 13C=O labels placed either adjacent (solid line) or separated (by one residue, dashed line) in the center of the sequence, and compared with the result for calculated amide I (c) IR and (d) VCD obtained using CCT to transfer parameters calculated at DFT:BPW91/6-31G** level for helical Ac-Ala10-NHMe to a 25 residue helical Ala peptide [236].

Theoretical modeling of the effects of isotopic labeling on β-sheet systems revealed two interesting enhancements of the amide I IR intensity of the 13C=O band. First, if two (or more) labels are introduced into a strand either in sequence (near-neighbor) or separated by one residue (alternate), the intensities of the 13C=O band are drastically different [69]. With CCT DFT methods, this 13C absorbance enhancement can be seen to arise from formation of multiple-stranded antiparallel β-sheet aggregates [266]. The anomalously large intensity for the alternately labeled case arises from the labels being in-phase dipole oscillators in the lowest frequency, highest intensity mode. If, on the other hand, single labels are on each strand, they can couple across strands to give unique patterns, provided they are located relatively near to each other [239, 266]. This provides a potential method of determining strand alignment and distinguishing parallel from anti-parallel structures, much as done with solid state NMR methods [269, 270]. We have compared the coupling determined with the TDC model to that obtained from a full DFT calculation on 2 strands of 6 amides each. Coupling drops off with distance, so that DFT results are required for close interactions, while TDC works relatively well for larger separations [239, 240]. A similar conclusion was reached when TDC and DFT results were compared for helices, suggesting that a hybrid approach can be useful to encompass long-range interactions [236].
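For reference, the TDC term discussed above is the classical dipole-dipole interaction between two transition dipoles, which falls off as the inverse cube of their separation. The sketch below evaluates that standard expression; the function name, the choice of units (debye, ångström, cm⁻¹) and the optional dielectric screening factor `eps` are assumptions made for this illustration rather than the parameterization used in the cited work.

```python
import numpy as np

# 1 D^2/A^3 = 1e-12 erg; dividing by hc (erg cm) converts to cm^-1
H_PLANCK = 6.62607015e-27   # erg s
C_LIGHT = 2.99792458e10     # cm / s
D2_PER_A3_TO_WAVENUMBER = 1e-12 / (H_PLANCK * C_LIGHT)   # about 5034 cm^-1

def tdc_coupling(mu1, mu2, r1, r2, eps=1.0):
    """Dipole-dipole coupling (cm^-1) between two transition dipoles.

    mu1, mu2: transition dipole vectors in debye; r1, r2: positions in
    angstrom; eps: relative dielectric constant (1.0 means no screening).
    """
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    r = np.asarray(r2, float) - np.asarray(r1, float)
    dist = np.linalg.norm(r)
    v = mu1 @ mu2 / dist**3 - 3.0 * (mu1 @ r) * (mu2 @ r) / dist**5
    return D2_PER_A3_TO_WAVENUMBER * v / eps
```

Parallel dipoles placed side by side give a positive coupling, head-to-tail dipoles a negative one, and doubling the separation reduces the magnitude eightfold, which is why the point-dipole picture is most reasonable at larger separations.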

[Figure 12 plots not reproduced. Panel labels: (rA)8*(rU)8 (Calc.) and poly(rA)*poly(rU) (Exp.). Axis labels: Δε (vertical), Wavenumber/cm⁻¹ (horizontal).]

Figure 12. Calculated and experimental RNA spectra: (Left) VCD and IR for the (rA)8* (rU)8 duplex [271]. For the calculations, the octanucleotide duplex was simulated by use of parameters transferred from fragments including a base pair (A-U) and a sugar-phosphate dimer at the DFT(BPW91/6-31G**) level and longer range interactions represented in a duplexed pair of dinucleotides computed at the P3 level.

Nucleic acids. The methods described in this chapter are not limited to peptides and are in principle applicable to other biopolymers, such as nucleic acids (NA). Computationally, NAs differ from peptides mainly by the size and nature of the chromophores (e.g. the basic chiral unit contains two base pairs, sugars and phosphate residues, which includes many more atoms than a dipeptide). Nevertheless, such systems are accessible to computational methods, as illustrated in Figure 12, where experimental spectra for an RNA double strand (duplex) octa-nucleotide are compared to simulations using DFT-level parameters from smaller NA segments [271]. Clearly, the simulations predict most of the characteristic features of the spectra, including
relative intensities and VCD signs. As for peptides, the C=O stretching mode in the bases dominates the IR intensities, which suggests that the TDC model would work relatively well for C=O coupling, and it does even better for PO₂⁻ (sym) coupling [33, 272-274]. Similar studies enabled interpretation of spectra for various RNA structures [271, 275, 276], non-periodic NA structures [274, 275, 277] and DNA-platinum complexes [278].

Limitations and extensions. Provided that the structures of the small fragments, whose spectra are calculated at the DFT level, and the target large peptide are identical, the CCT effectively provides DFT-level FF and intensity parameters for the target large molecule. The only approximation is that the effects of interactions between atoms separated beyond the span of the small fragment are neglected. We have explored the size of the small fragment needed to produce converged results, and the effect of the neglected long-range interactions on the simulated spectra. In α-helical studies, transfer from a tripeptide gives qualitatively correct amide I and II band shapes, but misses the internal H-bond effects [228]. Use of a longer peptide fragment (or more strands for β-sheets) makes it possible to model internal H-bonds for all relevant residues and improves the amide I-II frequency separation and intensity distribution [26, 29, 112, 235]. Approximating the long-range effects by a semiempirical (AM1 or PM3) QM FF or by TDC [112, 257] leads to discernible effects for calculations based on transfer of a triamide FF, but virtually no change in the α-helical spectral bandshapes computed by transfer from a 7-mer. For the α-helical conformation, the 7-amide fragment can thus be considered to produce qualitatively converged results for use in the CCT method [257]. Furthermore, there is very little effect of increasing the size of the small fragment from a 7-mer to an 11-mer for computing the spectra of a 25-mer α-helix [236, 257]. Similar behavior was found for β-sheets by comparing transfer with 2 strands of 3 residues vs. 2 of 6 residues, except that the shorter strands retain end effects on the cross-strand H-bonded rings [239].

Additional extensions are necessary for peptides of non-uniform structure. For example, in order to relate the structures of β-sheet models used in computations to experimental peptide sample conditions, a means of experimentally controlling the degree of aggregation was necessary. Consequently, we have been preparing and studying various hairpin models [235, 237, 239, 241, 243, 267], which are monomers and have a well-defined number of strands but are also non-uniform, forcing us to use an alternative to the CCT model, since the turn cannot be modeled with the strand segments. The simplest method is to assume an ideal hairpin and use two strand segments to model the strand part and a turn model for the turn, allowing enough overlap to eliminate the end (truncation) effects arising from the small oligomer [235]. If such hairpins are 13C labeled, they generate cross-strand coupling, which gives rise to the same sorts of spectral patterns noted above for simple 2-strand anti-parallel sheet models. If the labels are part of a 10-member H-bonded ring, the cross-strand coupling is strong, but the lower frequency component is the more intense, giving the spectra a large 13C-12C splitting in the amide I. By contrast, formation of a labeled 14-member ring gives rise to coupling of opposite sign, which results in a more intense but higher frequency 13C=O band, less separated from 12C=O, as shown in Figure 13 [239, 241, 251]. For these ideal hairpins, the theoretical predictions fit the observed 13C=O effects well, but they did not reproduce the experimental 12C=O spectra, since the termini are disordered in real, solvated molecules and the turn is not well described.

Figure 13. Comparison of (a) computed and (b) experimental amide I IR for the Gellman A hairpin, a 12-residue hairpin peptide stabilized by an Aib-Gly turn, with a RYVEVBGKKILN sequence. 13C shifted bands are labeled in both figures with L for cross-strand 13C=O H-bonds forming a 14-atom ring, labeled on positions V3-K8 (dashed line), and S for those forming a 10-atom ring, positions I10-V3 (thick solid line), compared to the unlabeled result (thin solid line) [241].

Figure 14. Amide I IR spectra of 12-amide β-hairpin models. Snapshot schematics of the backbone conformation for highly ordered, partially unfolded and unfolded structures are shown in parts a, b, and c, respectively, together with the corresponding simulated spectra. Increase in disorder is predicted to cause a shift to higher wavenumber and a broadening of features, much as seen experimentally with increase in temperature [251].

To overcome the limitations of ideal structures, we developed a fragmentation method which allows us to compute spectra for each segment of the target peptide, based for example on structures derived from NMR analyses or MD calculations, and to transfer their parameters onto the full peptide [238, 239, 251]. This approach is limited by the need for the fragments to overlap so that effects of the truncation can be eliminated, but makes it possible to explore the variations of the spectra that accompany dynamic fluctuations of the peptide, such as in the ensemble of structures that can be represented by an MD trajectory. Figure 14 shows three example structures (sampled from a 450 K, ~10 ns MD trajectory) for a 12-residue hairpin: fully folded (a), partially unfolded (b, frayed ends) and fully unfolded (c) [241, 251]. The spectra show a gradual broadening and shift of the main amide I absorbance. This is consistent with what happens when the molecule is heated and undergoes a phase transition to the unfolded state, but it is also representative of the low temperature ensemble. The differences suggest that following a phase transition with IR will not give two-state behavior, since the various intermediates will have temperature-dependent populations and thus the spectrum will shift continuously, a behavior characteristic of a number of hairpins [235, 237, 241, 267, 268, 279-282].

5.4. Incorporating Solvent into Spectral Simulations

For biological relevance, peptide and protein spectra are normally measured in aqueous solutions. Solvent has a profound effect on the amide frequencies, in particular on the amide I, as illustrated in Figure 15 (left) for experimental NMA IR spectra in the gas phase and in acetonitrile and aqueous solutions [207, 209, 215, 216, 283-285]. Accurate theoretical modeling of biomolecular systems thus cannot ignore the effect of solvent. The most rigorous, but also computationally expensive, correction is to include added solvent molecules explicitly with the peptide in the computational model. Continuum solvent models, including the Onsager model and the Polarizable Continuum Model (PCM), are less expensive. Both are implemented in DFT-based methods in QM programs and are referred to as Self-Consistent Reaction Field (SCRF) methods. An alternate approach combines MD simulations of the fluctuating solvent geometry with an empirical electrostatic correction of the spectra [210, 212, 213, 215, 246, 285, 286].

Figure 15. Comparison of (left) experimental IR spectra for NMA in (a) gas phase and in (b) acetonitrile and (c) aqueous solution (dashed line is D2O) with (right) simulated amide I and II vibrational frequencies in water as obtained with explicit and implicit solvent corrections. The amide I frequencies shift lower in going from (a) the gas phase spectrum to (c) water by ~100 cm⁻¹, while the amide II (only in H2O) shifts ~85 cm⁻¹ higher. As seen in the comparisons (right) of theoretical models, use of explicit hydrogen-bonded water has the greatest effect on the amide I and II frequencies; however, additionally including a continuum solvent model in the NMA-water cluster significantly improves the predictions.

Examples using explicit solvent. To achieve the best correction of the amide I and II frequencies, solvent molecules should be explicitly included at some level in the calculation. This has been used for modeling spectra of NMA [188, 206, 212, 287-290], amino acids and dipeptides [229, 230, 291-295]. Results of DFT frequency calculations for NMA [209] with explicit as well as implicit solvent models (Figure 15, right) demonstrate that much better amide I-II frequencies can be calculated for aqueous solutions if hydrogen-bonded molecules are represented explicitly (using two waters on C=O and one on N-H) [204, 296]. Of course, such models must assume a structure for the solvent molecules, which actually have a high degree of dynamic variation. Thus, while frequencies improve, other aspects, such as chiral properties, can be distorted unless averaging over the ensemble or other corrections are used. While added improvement can be obtained by inclusion of added layers of water [213], their structures are even more fluxional, and a simpler approximation is obtained by implicitly including their solvent effect by means of PCM models [209].

A similar level of explicit solvent correction was applied to an alanine tripeptide to investigate the effects of deuteration on the amide I VCD bandshape [112], as well as for calculations of fully solvated α- (7-mer), 3₁₀- and 3₁- (5-mers) helices [297]. The resulting FF parameters were subsequently transferred to 21-mer structures with the same helical conformation using the CCT approach [297]. The smaller oligomers show increased dispersion in the amide I and II bands due to interaction with the water molecules and have dramatic amide I frequency shifts, due to H-bonding, as expected. For the longer peptides, exciton coupling dominates the local dispersion, and patterns similar to those seen for the vacuum calculations result at the shifted frequencies, as can be seen in the selected examples in Figure 16.

Figure 16. Simulations of IR (top) and VCD (bottom) spectra for helical peptides with explicit solvent. DFT calculations were made for an α-helical hepta-amide, Ac-Ala6-NH-CH3, and 3₁₀-helical and 3₁-helical penta-amides, Ac-Ala4-NH-CH3, with solvent modeled by hydrogen-bonded water molecules on each intrapeptide hydrogen-bonded amide C=O and a non-hydrogen-bonded N-H group, and two water molecules forming hydrogen bonds to the terminal, non-intrapeptide hydrogen-bonded C=O groups. These parameters were transferred onto helical 21-amide models, Ac-Ala20-NH-CH3, with the same (φ,ψ) angles, and the results directly compare to the vacuum calculations in Figure 9. The predominant effect of the solvent is a shift in the amide I and II vibrational frequencies. An increase in the IR intensity is also predicted, especially for the amide I, whose relative intensity with respect to amide II now more closely mimics experiment. The VCD bandshapes remain unchanged with solvent as compared to vacuum, with the exception of the amide II for the 3₁-helix and small variations in the amide I of the 3₁₀-helix.

Figure 17. Comparison of simulated α-helical 15-amide, Ac-Ala14-NH-CH3, IR and VCD amide I for unsolvated (gas phase), partially solvated (mimicking an α-helix in a protein) and fully solvated (mimicking a solvated model α-helix) peptide models. The gas phase (top) and fully solvated (bottom) peptides exhibit the same spectral bandshapes, both for IR and VCD, with the solvated amide I shifted to lower frequency. The partially solvated peptide (middle) has an intermediate amide I frequency, but also exhibits additional broadening, with the higher-frequency components representing an extreme of what might correspond to the interior (desolvated) residues in a protein. The amide I VCD preserves the α-helical shape and is dominated by the signal from the higher frequency modes, with a second α-helix-like couplet predicted, corresponding to the solvated groups. The IR simulations are in qualitative agreement with the solvated and buried amide components often observed in protein IR. However, the simulated results overestimate the effects of hydrogen-bonded water and correspond better to the cryogenic experiments [298] than to room temperature spectra, where the dynamic nature of the solvent must be taken into account.

While the overall spectral bandshape patterns do not change significantly, inclusion of solvent in the calculations leads to better quantitative agreement with experiment in the calculated spectral intensities, and even in some qualitative details of the spectral dispersion. In the IR, the amide I for the solvated 3₁-helix is predicted to be more intense than amide II, in agreement with condensed phase results, while in the gas phase (Figure 9) the relative amide I and II intensities are reversed, which has not been seen experimentally. In VCD, the amide I band shape change upon N-deuteration from a (-,+) couplet to a (-,+,-) shape for an α-helix is correctly reproduced only when solvent is included [213, 297]. Much better quantitative agreement of the calculated 13C IR and VCD intensities with experiments for isotopically labeled alanine-rich peptides [29] is obtained by recalculating the spectra using CCT to transfer hydrated representations of the local helical sequence [297]. The solvent correction is therefore useful not only for predicting correct frequencies, but also for more subtle features of the normal mode distributions and spectral intensities. On the other hand, the overall qualitative pattern seen in the vacuum calculations persists, so the need to correct for solvent depends on the detail one wishes to derive from the spectral data. The corrections are often fairly predictable, so empirical methods might provide useful substitutes for computationally intense DFT level calculations of these solvent effects. The effect of partial solvation on α-helices has also been studied [252] to mimic the environment in proteins. Partially solvated α-helices were constructed using
a fragmentation method, similar to that discussed above, employing parameter transfer from several short helical fragments solvated by explicit water at different positions. A comparison of simulated gas phase, partially- and fully-solvated D-helix IR amide I spectra is shown in Figure 17. These computations predicted significant broadening and splitting of the amide I into the solvated and unsolvated parts, in agreement with cryogenic experiments [298], but showed that explicit hydrogen-bonded water overcorrects for the solvent effects, when compared to vacuum, but this does not fully account for the effect of the protein environment on the non-solvated side of the helix. In order to reproduce the spectral bandshapes at room temperature, the dynamical nature of the solvent might be taken into account, for example by the electrostatic solvent correction [210, 211, 213, 246]. Continuum solvent models. In continuum models, the electrostatic effects of bulk solvent are represented by a continuous dielectric medium outside a cavity occupied by a solute molecule [299]. This ignores much molecular detail, while assuming a linear response, but it provides a computationally tractable approximation of the electrostatic effects of the solvent, as has been extensively reviewed [300, 301]. The Onsager SCRF model [302] places the solute molecule at the center of a spherical cavity of an appropriate radius, and the solute charge distribution is reduced to just the dipole moment. This allows the solute-solvent interaction to be expressed in a simple closed form, which is robust and computationally efficient [303]. However, the shape restriction of the Onsager cavity to a sphere introduces artifacts, and only the resulting dipole moment contributes to the reaction field, reflecting, for example, the helix macrodipole but not those of individual polar groups. Polarized continuum models (PCM), developed following Tomasi and coworkers [304], provide more realistic cavity shapes. 
Another, very efficient approach called "COSMO" was pioneered by Klamt [305], in which the polarized continuum outside the solute cavity is described using electrostatic field boundary conditions of a conductor. Later, this model was adapted so that it allowed for an arbitrary solvent permittivity [306-308] and has been implemented within the PCM framework [309]. In the Gaussian programs this model is referred to as CPCM. The PCM cavity is realized as a sum of interlocking spheres centered on each nucleus with the appropriate atomic radii [300]. For practical calculations the cavity surface is partitioned into small, planar, triangular domains, called tesserae, whose contributions summed give the reaction field energy. CPCM adds only a small computational cost to both energies and analytical gradients [309] and turns out to be somewhat faster than dielectric-based PCM models. CPCM is thus in principle more efficient for geometry optimizations, particularly for large molecules. Implementation of analytical second derivatives within (C)PCM opened new possibilities for obtaining more realistic peptide force fields corrected for the environment [310]. However, practical calculations with currently available PCM and CPCM methods do have some stability problems [300]. In particular, geometry optimizations can converge very slowly and sometimes do not converge at all within the default convergence criteria. Electrostatic solvent correction. A computationally cheaper alternative for explicit solvent correction was first proposed by Cho and coworkers [211] where an empirical frequency correction, due to the solvent induced electric field, is applied to the ab initio amide I frequencies (Z0) obtained from a vacuum calculation. In the original

J. Kubelka et al. / QM Calculations of Peptide Vibrational Force Fields and Spectral Intensities

formulation [211, 311] the amide I stretching frequency $\omega$ (in solvent) is linearly related to the electrostatic solvent potentials $\varphi_i$, as

$$\omega = \omega_0 + \sum_{i} b_i \varphi_i \qquad (37)$$

measured at the amide group atoms (CNHCOC). (In another chapter, Choi and Cho provide more detail [34].) The correction can be generalized to any chromophore (any number of atoms) and applied to the intensity tensors [210, 287]. The coefficients $b_i$ can be obtained by a fit to ab initio computations involving explicit solvent molecules. For intensity parameters, it is more convenient to express the fitting coefficients in a local coordinate system, yielding the corrected atomic polar tensor, for example, as

$$P^{\lambda}_{\alpha\beta} = P^{\lambda(0)}_{\alpha\beta} + \sum_{i} b_{i,\lambda\alpha,\beta}\,\varphi_i , \qquad (38)$$

where the indices follow those used in (11). We have developed a set of $b_i$ parameters for NMA in water clusters by referencing to DFT-computed FF results [210, 287]. The approach is especially suitable for combined QM/MM modeling, as the correction is fast and allows for averaging over large ensembles of geometries. The electrostatic correction can reproduce ab initio results; for example, the changes observed in α-helical peptide spectra under deuteration are correctly predicted [213]. Furthermore, the complicated IR and VCD spectral patterns of β-hairpins can be better understood in terms of the electrostatic interactions between the solvent and the amide groups, or in terms of shielding of some groups by the side chains [238].
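The linear correction of Eq. (37) is simple enough to sketch in a few lines. The example below is only an illustration under stated assumptions: the solvent is represented by fixed point charges (as in a typical MM water model), the potential at each amide site is the plain Coulomb sum in atomic units, and the coefficients `b`, the site coordinates and the charges are hypothetical placeholders rather than the fitted parameters of [210, 211, 287]. The function names are likewise invented for this sketch.

```python
import numpy as np

def solvent_potentials(sites, charge_xyz, charge_q) -> np.ndarray:
    """Coulomb potential (a.u.) at each amide site from solvent point charges.

    sites:      (n_sites, 3) coordinates of the amide atoms, bohr
    charge_xyz: (n_charges, 3) coordinates of solvent charges, bohr
    charge_q:   (n_charges,) partial charges, units of e
    """
    sites = np.asarray(sites, dtype=float)
    xyz = np.asarray(charge_xyz, dtype=float)
    q = np.asarray(charge_q, dtype=float)
    # Pairwise site-charge distances, shape (n_sites, n_charges)
    dist = np.linalg.norm(sites[:, None, :] - xyz[None, :, :], axis=-1)
    return (q[None, :] / dist).sum(axis=1)

def corrected_frequency(omega0: float, b, phi) -> float:
    """Eq. (37): omega = omega0 + sum_i b_i * phi_i."""
    return omega0 + float(np.dot(b, phi))

def ensemble_average(omega0: float, b, sites, snapshots) -> float:
    """Average the corrected frequency over (charge_xyz, charge_q) snapshots."""
    values = [corrected_frequency(omega0, b, solvent_potentials(sites, xyz, q))
              for xyz, q in snapshots]
    return float(np.mean(values))
```

Because each snapshot costs only one distance matrix and one dot product, averaging over many MD geometries is inexpensive, which is the property the text highlights for QM/MM use.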

6. Concluding Remarks

The examples given above demonstrate that it is now possible to compute vibrational spectra for moderately sized peptides to an experimentally useful degree of accuracy using a method free of empirical parameterization, except for the fairly universal DFT functionals. These developments reflect continuing advances in quantum chemical computational methodology, particularly with DFT-based methods, and in widely available computer capabilities. The analyses have gone beyond frequency correlation to address apparent band shapes due to the dispersion of intensity in the exciton-coupled modes characteristic of repeating molecular structural units in such polymeric species. Modern developments are permitting computation to address the effects of solvation and structural fluctuation in ways previously viewed as unachievable. These developments have made computational simulation an integral part of experimental vibrational spectroscopy.

The ab initio approach to simulations of peptide vibrational spectra has been criticized [19, 70] for not providing physical insight into the nature of amide vibrational coupling. However, reproducing the experimental spectra using an assumed coupling mechanism (e.g. through-bond and through-space, such as TDC) with adjustable parameters can provide a rather misleading picture. The through-bond and through-space schemes in coupled oscillator approaches are inseparable. The amide vibrations are coupled through the electronic structure, which is only approximately separated into such conceptually attractive components. In contrast, by virtue of not relying on empirically (spectrally) derived parameters, quantum mechanical simulations of the spectra provide important physical insight. For example, consider the

vibrational frequencies of peptides in solution: while in coupled oscillator schemes the unperturbed frequency of the amide modes is adjusted to fit the data, the DFT calculations clearly show that the solvent must be included to reproduce the vibrational frequencies measured in solution, while vacuum, isolated-molecule calculations, not surprisingly, reproduce gas phase frequency values [209, 216].

Quantum mechanics in fact provides the means to test rigorously the validity of approximations for inter-amide coupling. Since the DFT calculations yield transition dipole moments, it is always possible to test, for example, to what extent transition dipole coupling contributes to the observed spectra. As we have demonstrated, TDC alone is not sufficient to account for observed bandshapes in helical or β-sheet peptides [240]. TDC becomes a reasonable approximation only for long-range coupling between residues that are far apart in sequence. Coupled oscillator models must make up for this deficiency by using much larger transition dipoles than the more accurate values calculated by DFT (which actually replicate the overall vibrational intensity). The advantages of DFT calculations as compared to TDC were also verified experimentally, via the doubly ¹³C-labeled α-helical peptide results, which allow direct measurement of the vibrational coupling between two amides. Varying the distance in the sequence between the two labels demonstrates the variation of the magnitude and also the sign of the vibrational coupling. DFT calculations provide quantitative explanations of both IR and VCD patterns, while TDC is a good approximation only for distant oscillators, where the coupling essentially approaches zero [236].

These summary comments are meant to illuminate the importance of accurate, non-empirical calculations of the polypeptide force fields for correct physical understanding of the interactions that govern the peptide and protein spectral features. Certainly there is a role for theory at multiple levels, but now that the DFT approaches have become accessible and our CCT and related transfer or localized methods permit high-level modeling of larger structures, it is important to make systematic comparisons with the simpler methods before relying on them to determine the finer details of spectra and structure that are now being derived from careful peptide and protein vibrational spectroscopic studies.
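To make the TDC approximation discussed above concrete, the sketch below evaluates the standard point-dipole interaction, V = [μ₁·μ₂ - 3(μ₁·e)(μ₂·e)] / (ε r³), for two amide transition dipoles and then diagonalizes the resulting two-oscillator Hamiltonian, as one would for a doubly labeled peptide. It is a generic textbook form, not the authors' implementation: the dipole magnitudes, geometries, site frequencies and the dielectric scaling `eps` are hypothetical inputs. Transition dipoles are assumed to be in debye and distances in ångström, so the energy in D²/Å³ (10⁻¹² erg) is converted to cm⁻¹ by dividing by hc.

```python
import numpy as np

# 1 D^2 / A^3 = 1e-12 erg; dividing by hc gives about 5034.12 cm^-1
D2_PER_A3_TO_CM1 = 5034.12

def tdc_coupling(mu1, mu2, r12, eps: float = 1.0) -> float:
    """Point-dipole (TDC) coupling in cm^-1.

    mu1, mu2: transition dipoles (debye); r12: vector between them (angstrom);
    eps: effective dielectric scaling (assumed empirical parameter).
    """
    mu1, mu2, r12 = (np.asarray(v, dtype=float) for v in (mu1, mu2, r12))
    r = np.linalg.norm(r12)
    e = r12 / r  # unit vector along the inter-dipole axis
    v = (mu1 @ mu2 - 3.0 * (mu1 @ e) * (mu2 @ e)) / (eps * r**3)
    return D2_PER_A3_TO_CM1 * float(v)

def two_oscillator_modes(omega1: float, omega2: float, v: float) -> np.ndarray:
    """Eigenfrequencies (cm^-1, ascending) of two coupled local amide modes."""
    h = np.array([[omega1, v], [v, omega2]], dtype=float)
    return np.linalg.eigvalsh(h)

if __name__ == "__main__":
    mu = [0.0, 0.0, 0.3]                          # hypothetical 0.3 D dipole
    v = tdc_coupling(mu, mu, [5.0, 0.0, 0.0])     # side-by-side, 5 A apart
    print(v, two_oscillator_modes(1600.0, 1600.0, v))
```

The sign of V flips between side-by-side and head-to-tail arrangements and its magnitude falls off as 1/r³; comparing such values with couplings extracted from a full force field is one way to run the kind of systematic test the text calls for.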
Acknowledgement. The work at UIC is currently sponsored by the National Science Foundation (CHE03-16014 and 07-18543 to TAK), that at the CAS by the Czech Science Foundation (grants Nos. 203/06/0420, 202/07/0732 to PB) and the Grant Agency (A4005507020) and at UW by the Faculty Grant-in-Aid and Basic Research Grant programs of the University of Wyoming (to JK). We thank Ahmed Lakahani, George Papadantonakis, Anjan Roy, Rong Huang, Heng Chi and Ling Wu for unpublished results and help with assembling references and figures.

References
[1] P.I. Haris, Fourier Transform Infrared Spectroscopic Studies of Peptides: Potentials and Pitfalls. In: Infrared Analysis of Peptides and Proteins: Principles and Applications. ACS Symposium Series. B. Ram Singh (Ed.), ACS, Washington DC, 2000, 54-95.
[2] A. Barth, Infrared spectroscopy of proteins, Biochim. Biophys. Acta. 1767 (2007) 1073-1101.
[3] A.T. Tu, Raman Spectroscopy in Biology, Wiley, New York, 1982.
[4] R.W. Williams, Protein secondary structure analysis using Raman amide I and amide III spectra, Methods Enzymol. 130 (1986) 311-331.

[5] P. Pulay, Analytical derivative techniques and the calculation of vibrational spectra. In: Modern electronic structure theory, D.R. Yarkony (Ed.), Vol. 2, World Scientific, Singapore, 1995, 1191-1240.
[6] T.A. Keiderling, Peptide and protein conformational studies with vibrational circular dichroism and related spectroscopies. In: Circular Dichroism: Principles and Applications, 2nd Ed., N. Berova, K. Nakanishi and R.W. Woody (Eds.), Wiley-VCH, New York, 2000, 621-666.
[7] T.A. Keiderling, Protein and peptide secondary structure and conformational determination with vibrational circular dichroism, Curr. Opin. Chem. Biol. 6 (2002) 682-688.
[8] L.D. Barron and L. Hecht, Vibrational Raman optical activity: From fundamentals to biochemical applications. In: Circular dichroism, principles and applications, K. Nakanishi, N. Berova and R.W. Woody (Eds.), Wiley-VCH, New York, 2000, 667-701.
[9] H. Fabian and D. Naumann, Methods to study protein folding by stopped-flow FT-IR, Methods. 34 (2004) 28-40.
[10] R. Vogel and F. Siebert, Vibrational spectroscopy as a tool for probing protein function, Curr. Opin. Chem. Biol. 4 (2000) 518-523.
[11] M.S. Braiman and Y.W. Xiao, Step-scan time-resolved FT-IR spectroscopy of biopolymers. In: Vibrational Spectroscopy of Biological and Polymeric Materials, V.G. Gregoriou and M.S. Braiman (Eds.), CRC Press, 2005, 353-419.
[12] C. Kötting and K. Gerwert, Proteins in action monitored by time-resolved FTIR spectroscopy, ChemPhysChem. 6 (2005) 881-888.
[13] R.B. Dyer, F. Gai and W.H. Woodruff, Infrared studies of fast events in protein folding, Acc. Chem. Res. 31 (1998) 709-716.
[14] R. Callender and R.B. Dyer, Advances in time-resolved approaches to characterize the dynamical nature of enzymatic catalysis, Chem. Rev. 106 (2006) 3031-3042.
[15] S. Woutersen and P. Hamm, Nonlinear two-dimensional vibrational spectroscopy of peptides, J. Phys. Condens. Matter. 14 (2002) R1035-1062.
[16] H.S. Chung, Z. Ganim, K.C. Jones and A. Tokmakoff, Transient 2D IR spectroscopy of ubiquitin unfolding dynamics, Proc. Natl. Acad. Sci. U. S. A. 104 (2007) 14237-14242.
[17] S.M. Decatur, Elucidation of Residue-Level Structure and Dynamics of Polypeptides via Isotope-Edited Infrared Spectroscopy, Acc. Chem. Res. 39 (2006) 169-175.
[18] S. Krimm, Interpreting Infrared Spectra of Peptides and Proteins. In: Infrared Analysis of Peptides and Proteins: Principles and Applications. ACS Symposium Series. B.R. Singh (Ed.), ACS, Washington DC, 2000, 38-53.
[19] R. Schweitzer-Stenner, Advances in vibrational spectroscopy as a sensitive probe of peptide and protein structure - A critical review, Vibr. Spectrosc. 42 (2006) 98-117.
[20] P.J. Stephens, Theory of vibrational circular dichroism, J. Phys. Chem. 89 (1985) 748-752.
[21] D. Yang and A.J. Rauk, The a priori calculation of vibrational circular dichroism intensities. In: Reviews in Computational Chemistry, K.B. Lipkowitz and D.B. Boyd (Eds.), Vol. 7, VCH Publishers, Inc., New York, 1996, 261-301.
[22] P.L. Polavarapu, Vibrational spectra: principles and applications with emphasis on optical activity, Vol. 85, Elsevier, Amsterdam, 1998.
[23] L.A. Nafie and T.B. Freedman, Vibrational optical activity theory. In: Circular Dichroism. Principles and Applications. 2nd edition. N. Berova, K. Nakanishi and R.W. Woody (Eds.), Wiley-VCH, New York, 2000.
[24] P. Bouř, Computations of the Raman optical activity via the sum-over-states expansions, J. Comp. Chem. 22 (2001) 426-435.
[25] K. Ruud, T. Helgaker and P. Bouř, Gauge-origin independent density-functional theory calculations of vibrational Raman optical activity, J. Phys. Chem. A. 106 (2002) 7448-7455.
[26] P. Bouř, J. Kubelka and T.A. Keiderling, Quantum Mechanical Models of Peptide Helices and Their Vibrational Spectra, Biopolymers. 65 (2002) 45-69.
[27] R. Wieczorek and J.J. Dannenberg, Amide I vibrational frequencies of alpha-helical peptides based upon ONIOM and density functional theory (DFT) studies, J. Phys. Chem. B. 112 (2008) 1320-1328.
[28] P. Bouř, J. Sopková, L. Bednárová, P. Maloň and T.A. Keiderling, Transfer of molecular property tensors in Cartesian coordinates: A new algorithm for simulation of vibrational spectra, J. Comput. Chem. 18 (1997) 646-659.

[29] R.A.G.D. Silva, J. Kubelka, S.M. Decatur, P. Bouř and T.A. Keiderling, Site-specific conformational determination in thermal unfolding studies of helical peptides using vibrational circular dichroism with isotopic substitution, Proc. Natl. Acad. Sci. U. S. A. 97 (2000) 8318-8323.
[30] J. Kubelka and T.A. Keiderling, Differentiation of β-sheet forming structures: ab initio based simulations of IR absorption and vibrational CD for model peptide and protein β-sheets, J. Am. Chem. Soc. 123 (2001) 12048-12058.
[31] J.-H. Choi, J.-S. Kim and M. Cho, Amide I vibrational circular dichroism of polypeptides: Generalized fragmentation approximation method, J. Chem. Phys. 122 (2005) 174903-174913.
[32] J.H. Choi, H. Lee, K.K. Lee, S. Hahn and M. Cho, Computational spectroscopy of ubiquitin: Comparison between theory and experiments, J. Chem. Phys. 126 (2007).
[33] V. Andrushchenko, H. Wieser and P. Bouř, B-Z conformational transition of nucleic acids monitored by vibrational circular dichroism. Ab Initio Interpretation of the Experiment, J. Phys. Chem. B. 106 (2002) 12623-12634.
[34] J.H. Choi and M. Cho. In: FTIR Spectroscopy in Biomedical Applications, A. Barth and P.I. Haris (Eds.), 2008.
[35] A. Elliott and E.J. Ambrose, Structure of synthetic polypeptides, Nature. 165 (1950) 921-922.
[36] T. Miyazawa, Infrared Spectra and Helical Conformations. In: Poly-α-Amino Acids: Proteins Models for Conformational Analysis, G.D. Fasman (Ed.), Dekker, New York, 1967, 69-103.
[37] R.D.B. Fraser and T.P. MacRae, Infrared Spectroscopy. In: Conformation in Fibrous Proteins and related synthetic polypeptides, B. Horecker, N.O. Kaplan, J. Marmur and H.A. Scheraga (Eds.), Academic Press, New York, 1973, 94-123.
[38] D.M. Byler and H. Susi, Examination of the secondary structure of proteins by deconvolved FTIR spectra, Biopolymers. 25 (1986) 469-487.
[39] P.I. Haris and D. Chapman, The conformational analysis of peptides using Fourier Transform IR spectroscopy, Biopolymers. 37 (1995) 251-263.
[40] E.J. Ambrose and A. Elliott, The structure of synthetic polypeptides. II. Investigation with polarized infra-red spectroscopy, Proc. Royal Soc. A205 (1951) 47-60.
[41] E.R. Blout, C. De Loze and A. Asadourian, The deuterium exchange of water-soluble polypeptides and proteins as measured by infrared spectroscopy, J. Am. Chem. Soc. 83 (1961) 1895-1900.
[42] A.C. Sen and T.A. Keiderling, Vibrational circular dichroism of polypeptides. III. Film studies of several alpha-helical and beta-sheet polypeptides, Biopolymers. 23 (1984) 1533-1545.
[43] V.P. Gupta and T.A. Keiderling, Vibrational CD of the amide II band in some model polypeptides and proteins, Biopolymers. 32 (1992) 239-248.
[44] K. Kaiden, T. Matsui and S. Tanaka, A Study of the amide III band by FT-IR spectrometry of the secondary structure of albumin, myoglobin and gamma-globulin, Appl. Spectrosc. 41 (1987) 180-184.
[45] B.R. Singh, M.P. Fuller and G. Schiavo, Molecular structure of tetanus neurotoxin as revealed by FT-IR and CD spectroscopy, Biophys. Chem. 46 (1990) 155-166.
[46] F. Fu, D.B. DeOliveira, W.R. Trumble, H.K. Sarkar and B.R. Singh, Secondary structure estimation of proteins using the amide III region of Fourier transform infrared spectroscopy: Application to analyze calcium-binding-induced structural changes in calsequestrin, Appl. Spectrosc. 48 (1994) 1432-1440.
[47] B.I. Baello, P. Pančoška and T.A. Keiderling, Vibrational circular dichroism spectra of proteins in the amide III region. Measurement and correlation of bandshape to secondary structure, Anal. Biochem. 250 (1997) 212-221.
[48] S. Cai and B.R. Singh, Determination of the secondary structure of proteins from amide I and amide III infrared bands using partial least square method. In: Infrared Analysis of Peptides and Proteins: Principles and Applications. ACS Symposium Series. B.R. Singh (Ed.), ACS, Washington DC, 2000, 117-129.
[49] R. Tuma, Raman spectroscopy of proteins: from peptides to large assemblies, J. Raman Spectrosc. 36 (2005) 307-319.
[50] W. Qian and S. Krimm, Vibrational studies of the disulfide group in proteins. VI. General correlations of SS and CS stretch frequencies with disulfide bridge geometry, Biopolymers. 32 (1992) 1025-1033.
[51] R.P. Rava and T.G. Spiro, Resonance enhancement in the ultraviolet Raman spectra of aromatic amino acids, J. Phys. Chem. 89 (1985) 1856-1861.

[52] T.G. Spiro (Ed.), Resonance Raman Spectra of Polyenes and Aromatics, Vol. 2, Wiley, New York, 1987.
[53] X.J. Zhao and T.G. Spiro, Ultraviolet resonance Raman spectroscopy of hemoglobin with 200 and 212 nm excitation: H-bonds of tyrosines and prolines, J. Raman Spectrosc. 29 (1998) 49-55.
[54] S. Song, S.A. Asher and S. Krimm, Assignment of a new conformation-sensitive UV resonance Raman band in peptides and proteins, J. Am. Chem. Soc. 110 (1988) 8547-8548.
[55] Z.H. Chi and S.A. Asher, UV Resonance Raman determination of protein acid denaturation - selective unfolding of helical segments of horse myoglobin, Biochemistry. 37 (1998) 2865-2872.
[56] I.K. Lednev, A.S. Karnoup, M.C. Sparrow and S.A. Asher, Alpha-helix peptide folding and unfolding activation barriers: A nanosecond UV resonance Raman study, J. Am. Chem. Soc. 121 (1999) 8074-8086.
[57] S.A. Asher, A. Ianoul, G. Mix, M.N. Boyden, A. Karnoup, M. Diem and R. Schweitzer-Stenner, Dihedral psi angle dependence of the amide III vibration: a uniquely sensitive UV resonance Raman secondary structural probe, J. Am. Chem. Soc. 123 (2001) 11775-11781.
[58] T. Miyazawa, Perturbation treatment of the characteristic vibrations of polypeptide chains in various conformations, J. Chem. Phys. 32 (1960) 1647-1652.
[59] T. Miyazawa and E.R. Blout, The infrared spectra of polypeptides in various conformations: amide I and II bands, J. Am. Chem. Soc. 83 (1961) 712-719.
[60] P.W. Higgs, The vibration spectra of helical molecules: infra-red and Raman selection rules, intensities and approximate frequencies, Proc. R. Soc. London. A220 (1953) 472-485.
[61] Y.N. Chirgadze and N.A. Nevskaya, Infrared Spectra and Resonance Interaction of Amide I Vibration of the Antiparallel-Chain Pleated Sheet, Biopolymers. 15 (1976) 607-625.
[62] Y.N. Chirgadze and N.A. Nevskaya, Infrared spectra and resonance interaction of amide-I vibration of the parallel-chain pleated sheet, Biopolymers. 15 (1976) 627-636.
[63] N.A. Nevskaya and Y.N. Chirgadze, Infrared spectra and resonance interactions of Amide-I and II vibrations of α-helix, Biopolymers. 15 (1976) 637-648.
[64] H. Torii and M. Tasumi, Application of the three-dimensional doorway-state theory to analyses of the amide I infrared bands of globular proteins, J. Chem. Phys. 97 (1992) 92-98.
[65] H. Torii and M. Tasumi, Model calculations on the amide-I infrared bands of globular proteins, J. Chem. Phys. 96 (1992) 3379-3387.
[66] H. Torii and M. Tasumi, Theoretical analyses of the amide I infrared bands of globular proteins. In: Infrared spectroscopy of biomolecules, H.H. Mantsch and D. Chapman (Eds.), Wiley-Liss, Chichester UK, 1996, 1-17.
[67] S.S. Birke, I. Agbaje and M. Diem, Experimental and Computational Infrared CD Studies of Prototypical Peptide Conformations, Biochemistry. 31 (1992) 450-455.
[68] T. Xiang, D.J. Goss and M. Diem, Strategies for the computation of infrared CD and absorption spectra of biological molecules: ribonucleic acids, Biophys. J. 65 (1993) 1255-1261.
[69] J.W. Brauner, C. Dugan and R. Mendelsohn, ¹³C Isotope labeling of hydrophobic peptides. Origin of the anomalous intensity distribution in the infrared amide I spectral region of beta-sheet, J. Am. Chem. Soc. 122 (2000) 677-683.
[70] J.W. Brauner, C.R. Flach and R. Mendelsohn, Quantitative reconstruction of the amide I contour in the IR spectra of globular proteins: From structure to spectrum, J. Am. Chem. Soc. 127 (2005) 100-109.
[71] S. Woutersen and P. Hamm, Structure determination of trialanine in water using polarization sensitive two-dimensional vibrational spectroscopy, J. Phys. Chem. B. 104 (2000) 11316-11320.
[72] R. Schweitzer-Stenner, Secondary structure analysis of polypeptides based on an excitonic coupling model to describe the band profile of amide I′ of IR, Raman, and vibrational circular dichroism spectra, J. Phys. Chem. B. 108 (2004) 16965-16975.
[73] J. Wang and R.M. Hochstrasser, Characteristics of the two-dimensional infrared spectroscopy of helices from approximate simulations and analytic models, Chem. Phys. 297 (2004) 195-219.
[74] S. Krimm and J. Bandekar, Vibrational spectroscopy and conformation of peptides, polypeptides and proteins, Adv. Protein Chem. 38 (1986) 181-364.
[75] W.H. Moore and S. Krimm, Transition dipole coupling in amide I modes of β polypeptides, Proc. Natl. Acad. Sci. U. S. A. 72 (1975) 4933-4935.
[76] S.H. Lee and S. Krimm, Ab initio-based vibrational analysis of alpha-poly(L-alanine), Biopolymers. 46 (1998) 283-317.

[77] H. Torii and M. Tasumi, Ab initio molecular orbital study of the amide I vibrational interactions between the peptide groups in di- and tripeptides and considerations on the conformation of the extended helix, J. Raman Spectrosc. 29 (1998) 81-86.
[78] C. Fang, J. Wang, A.K. Charnley, W. Barber-Armstrong, A.B. Smith III, S.M. Decatur and R.M. Hochstrasser, Two-dimensional infrared measurements of the coupling between amide modes of an alpha-helix, Chem. Phys. Lett. 382 (2003) 586-592.
[79] R. Schweitzer-Stenner, F. Eker, K. Griebenow, C. Xiaolin and L.A. Nafie, The conformation of tetraalanine in water determined by polarized Raman, FT-IR, and VCD spectroscopy, J. Am. Chem. Soc. 126 (2004) 2768-2776.
[80] T. Miyazawa, Characteristic amide bands and conformations of polypeptides. In: M.A. Stahmann (Ed.), Polyamino Acids, Polypeptides and Proteins: International Symposium, University of Wisconsin, Madison, WI, 1962, pp. 201-217.
[81] Y.N. Chirgadze, B.V. Shetopalov and S.Y. Venyaminov, Intensities and other spectral parameters of infrared amide bands of polypeptides in the beta and random forms, Biopolymers. 12 (1973) 1337-1351.
[82] H. Torii and M. Tasumi, Three-dimensional doorway-state theory for analyses of absorption bands of many-oscillator systems, J. Chem. Phys. 97 (1992) 86-91.
[83] H. Torii and M. Tasumi, Infrared intensities of vibrational modes of an alpha-helical polypeptide. Calculations based on the equilibrium charge-charge flux (ECCF) model, J. Mol. Struct. 300 (1993) 171-179.
[84] K. Palmo and S. Krimm, Electrostatic model for infrared intensities in a spectroscopically determined molecular mechanics force field, J. Comp. Chem. 19 (1998) 754-768.
[85] D.A. Long, Intensities in Raman spectra. 1. A bond polarizability theory, Proceedings of the Royal Society of London Series A: Mathematical and Physical Sciences. 217 (1953) 203-221.
[86] L.D. Barron, J.R. Escribano and J.F. Torrance, Polarized Raman optical-activity and the bond polarizability model, Mol. Phys. 57 (1986) 653-660.
[87] L.D. Barron, Molecular Light Scattering and Optical Activity, Cambridge University Press, Cambridge UK, 2004.
[88] L. Wirtz, M. Lazzeri, F. Mauri and A. Rubio, Raman spectra of BN nanotubes: Ab initio and bond-polarizability model calculations, Physical Review B. 71 (2005).
[89] G. Holzwarth and I. Chabay, Optical activity of vibrational transitions: A coupled oscillator model, J. Chem. Phys. 57 (1972) 1632.
[90] I. Tinoco, Radiation Res. 20 (1963) 133.
[91] J.A. Schellman, Vibrational optical activity, J. Chem. Phys. 58 (1973) 2882-2886.
[92] J. Snir, R.A. Frankel and J.A. Schellman, Optical activity of polypeptides in the infrared. Predicted CD of the amide I and amide II bands, Biopolymers. 14 (1975) 173-196.
[93] M. Gulotta, D.J. Goss and M. Diem, IR vibrational CD in model deoxyoligonucleotides: Observation of the B→Z phase transition and extended coupled oscillator intensity calculations, Biopolymers. 28 (1989) 2047-2058.
[94] W. Zhong, M. Gulotta, D.J. Goss and M. Diem, DNA solution conformation via infrared circular dichroism: Experimental and Theoretical Results for B-Family Polymers, Biochemistry. 29 (1990) 7485-7491.
[95] M. Diem, O. Lee and G.M. Roberts, Vibrational studies, normal-coordinate analysis, and infrared VCD of alanylalanine in the amide-III spectral region, J. Phys. Chem. 96 (1992) 548-554.
[96] S.S. Birke, I. Agbaje and M. Diem, Experimental and computational infrared CD studies of prototypical peptide conformations, Biochemistry. 31 (1992) 450.
[97] H. DeVoe, Optical properties of molecular aggregates. I. Classical model of electronic absorption and refraction, J. Chem. Phys. 41 (1964) 393-400.
[98] H. DeVoe, Optical properties of molecular aggregates. II. Classical theory of refraction, absorption and optical activity in solutions and crystals, J. Chem. Phys. 43 (1965) 3199-3208.
[99] B.D. Self and D.S. Moore, Nucleic acid vibrational circular dichroism, absorption and linear dichroism spectra. 1. A DeVoe theory approach, Biophys. J. 73 (1997) 339-347.
[100] B.D. Self and D.S. Moore, Nucleic acid vibrational circular dichroism, absorption and linear dichroism spectra. 2. A DeVoe theory approach, Biophys. J. 74 (1998) 2249-2258.

[101] H. Ito and Y.J. I'Haya, Linear response polarizability theory for vibrational circular dichroism: VCD and IR bandshape calculations of α-helical and β-pleated polypeptides, Bull. Chem. Soc. Jpn. 67 (1994) 1238-1245.
[102] H. Ito, Linear response polarizability bandshape calculations of vibrational circular dichroism, vibrational absorption, and electronic circular dichroism of cyclo(Gly-Pro-Gly-D-Ala-Pro): a small cyclic peptide having β- and γ-turns, Biospectroscopy. 2 (1996) 17-37.
[103] P.J. Stephens and M.A. Lowe, Vibrational circular dichroism, Ann. Rev. Phys. Chem. 36 (1985) 213-241.
[104] P.J. Stephens, F.J. Devlin, C.S. Ashvar, C.F. Cha