
articles

The complete genome sequence of the Gram-positive bacterium Bacillus subtilis
F. Kunst1, N. Ogasawara2, I. Moszer3, A. M. Albertini4, G. Alloni4, V. Azevedo5, M. G. Bertero3,4, P. Bessières5, A. Bolotin5, S. Borchert6,
R. Borriss7, L. Boursier3, A. Brans8, M. Braun9, S. C. Brignell10, S. Bron11, S. Brouillet3,12, C. V. Bruschi13, B. Caldwell14, V. Capuano5,
N. M. Carter10, S.-K. Choi15, J.-J. Codani16, I. F. Connerton17, N. J. Cummings17, R. A. Daniel18, F. Denizot19, K. M. Devine20, A. Düsterhöft9,
S. D. Ehrlich5, P. T. Emmerson21, K. D. Entian6, J. Errington18, C. Fabret19, E. Ferrari14, D. Foulger18, C. Fritz9, M. Fujita22, Y. Fujita23, S. Fuma24,
A. Galizzi4, N. Galleron5, S.-Y. Ghim15, P. Glaser3, A. Goffeau25, E. J. Golightly26, G. Grandi27, G. Guiseppi19, B. J. Guy10, K. Haga28, J. Haiech19,
C. R. Harwood10, A. Hénaut29, H. Hilbert9, S. Holsappel11, S. Hosono30, M.-F. Hullo3, M. Itaya31, L. Jones32, B. Joris8, D. Karamata33,
Y. Kasahara2, M. Klaerr-Blanchard3, C. Klein6, Y. Kobayashi30, P. Koetter6, G. Koningstein34, S. Krogh20, M. Kumano24, K. Kurita24, A. Lapidus5,
S. Lardinois8, J. Lauber9, V. Lazarevic33, S.-M. Lee35, A. Levine36, H. Liu28, S. Masuda30, C. Mauël33, C. Médigue3,12, N. Medina36,
R. P. Mellado37, M. Mizuno30, D. Moestl9, S. Nakai2, M. Noback11, D. Noone20, M. O’Reilly20, K. Ogawa24, A. Ogiwara38, B. Oudega34,
S.-H. Park15, V. Parro37, T. M. Pohl39, D. Portetelle40, S. Porwollik7, A. M. Prescott18, E. Presecan3, P. Pujic5, B. Purnelle25, G. Rapoport1,
M. Rey26, S. Reynolds33, M. Rieger41, C. Rivolta33, E. Rocha3,12, B. Roche36, M. Rose6, Y. Sadaie22, T. Sato30, E. Scanlan20, S. Schleich3,
R. Schroeter7, F. Scoffone4, J. Sekiguchi42, A. Sekowska3, S. J. Seror36, P. Serror5, B.-S. Shin15, B. Soldo33, A. Sorokin5, E. Tacconi4,
T. Takagi43, H. Takahashi28, K. Takemaru30, M. Takeuchi30, A. Tamakoshi24, T. Tanaka44, P. Terpstra11, A. Tognoni27, V. Tosato13, S. Uchiyama42,
M. Vandenbol40, F. Vannier36, A. Vassarotti45, A. Viari12, R. Wambutt46, E. Wedler46, H. Wedler46, T. Weitzenegger39, P. Winters14, A. Wipat10,
H. Yamamoto42, K. Yamane24, K. Yasumoto28, K. Yata22, K. Yoshida23, H.-F. Yoshikawa28, E. Zumstein5, H. Yoshikawa2 & A. Danchin3
1 Institut Pasteur, Unité de Biochimie Microbienne, 25 rue du Docteur Roux, 75724 Paris Cedex 15, France
2 Nara Institute of Science and Technology, Graduate School of Biological Sciences, Ikoma, Nara 630-01, Japan
3 Institut Pasteur, Unité de Régulation de l’Expression Génétique, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
4 Dipartimento di Genetica e Microbiologia, Universita di Pavia, Via Abbiategrasso 207, 27100 Pavia, Italy
5 INRA, Génétique Microbienne, Domaine de Vilvert, 78352 Jouy-en-Josas Cedex, France
6 Institut für Mikrobiologie, J. W. Goethe-Universität, Marie Curie Strasse 9, 60439 Frankfurt/Main, Germany
7 Institut für Genetik und Mikrobiologie, Humboldt Universität, Chausseestrasse 17, D-10115 Berlin, Germany
8 Centre d’Ingénierie des Protéines, Université de Liège, Institut de Chimie B6, Sart Tilman, B-4000 Liège, Belgium
9 QIAGEN GmbH, Max-Volmer-Strasse 4, D-40724 Hilden, Germany
10 Department of Microbiological, Immunological and Virological Sciences, The Medical School, University of Newcastle, Framlington Place, Newcastle upon Tyne NE2 4HH, UK
11 Department of Genetics, University of Groningen, Kerklaan 30, 9751 NN Haren, The Netherlands
12 Atelier de BioInformatique, Université Paris VI, 12 rue Cuvier, 75005 Paris, France
13 ICGEB, AREA Science Park, Padriciano 99, I-34012 Trieste, Italy
14 Genencor International, 925 Page Mill Road, Palo Alto, California 94304-1013, USA
15 Bacterial Molecular Genetics Research Unit, Applied Microbiology Research Division, KRIBB, PO Box 115, Yusong, Taejon 305-600, Korea
16 INRIA, Domaine de Voluceau, PB 105, 78153 Le Chesnay Cedex, France
17 Institute of Food Research, Department of Food Macromolecular Science, Reading Laboratory, Earley Gate, Whiteknights Road, Reading RG6 6BZ, UK
18 Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford, OX1 3RE, UK
19 Laboratoire de Chimie Bactérienne, CNRS BP 71, 31 Chemin Joseph Aiguier, 13402 Marseille Cedex 09, France
20 Department of Genetics, Trinity College, Lincoln Place Gate, Dublin 2, Republic of Ireland
21 Department of Biochemistry and Genetics, The Medical School, University of Newcastle, Framlington Place, Newcastle upon Tyne, NE2 4HH, UK
22 Radioisotope Center, National Institute of Genetics, Mishima, Shizuoka-ken 411, Japan
23 Department of Biotechnology, Faculty of Engineering, Fukuyama University, Higashimura-cho, Fukuyama-shi, Hiroshima 729-02, Japan
24 Institute of Biological Sciences, Tsukuba University, Tsukuba-shi, Ibaraki 305, Japan
25 Faculté des Sciences Agronomiques, Unité de Biochimie Physiologique, Université Catholique de Louvain, Place Croix du Sud, 2-20 B-1348 Louvain-la-Neuve, Belgium
26 Novo Nordisk Biotech, 1445 Drew Avenue, Davis, California 95616-4880, USA
27 Eniricerche, Via Maritano 26, San Donato Milanese, 20097 Milan, Italy
28 Institute of Molecular and Cellular Biology, The University of Tokyo, Bunkyo-ku, Tokyo 113, Japan
29 Laboratoire Génome et Informatique, Université de Versailles, Bâtiment Buffon, 45 Avenue des États-Unis, 78035 Versailles Cedex, France
30 Faculty of Agriculture, Tokyo University of Agriculture and Technology, Fuchu, Tokyo 183, Japan
31 Mitsubishi Kasei Institute of Life Sciences, 11 Minamiooya, Machida-shi, Tokyo 194, Japan
32 Institut Pasteur, Service d’Informatique Scientifique, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France
33 Institut de Génétique et Biologie Microbiennes, Université de Lausanne, 19 rue César Roux, 1005 Lausanne, Switzerland
34 Department of Molecular Microbiology, MBW/BCA, Faculty of Biology, Vrije Universiteit Amsterdam, De Boelelaan 1087, 1081 HV Amsterdam, The Netherlands
35 Chongju University College of Science and Engineering, Chongju City, Korea
36 Institut de Génétique et Microbiologie, Université Paris Sud, URA CNRS 2225, Université Paris XI–Bâtiment 409, 91405 Orsay Cedex, France
37 Centro Nacional de Biotecnologia (CSIC), Campus Universidad Autonoma, Cantoblanco, 28049 Madrid, Spain
38 National Institute of Basic Biology, 38 Nishigounaka, Myoudaiji-chou, Okazaki 444, Japan
39 Gesellschaft für Analyse-Technik und Consulting mbH, Fritz-Arnold Straße 23, D-78467 Konstanz, Germany
40 Department of Microbiology, Faculty of Agronomy, 6 Avenue du Maréchal Juin, B-5030 Gembloux, Belgium
41 Biotech Research, BMF, Wilhelmsfeld, Klingelstrasse 35, D-69434 Hirschhorn, Germany
42 Department of Applied Biology, Faculty of Textile Science and Technology, Shinshu University 3-15-1, Tokida, Ueda-shi, Nagano 386, Japan
43 Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108, Japan
44 Department of Marine Science, School of Marine Science and Technology, Tokai University, 3-20-1 Orido Shimizu, Shizuoka 424, Japan
45 European Commission, DG XII-E-1, SDME 8/78, Rue de la Loi 200, B-1049 Brussels, Belgium
46 AGOWA GmbH, Glienicker Weg 185, 12489 Berlin, Germany

Bacillus subtilis is the best-characterized member of the Gram-positive bacteria. Its genome of 4,214,810 base pairs
comprises 4,100 protein-coding genes. Of these protein-coding genes, 53% are represented once, while a quarter of
the genome corresponds to several gene families that have been greatly expanded by gene duplication, the largest
family containing 77 putative ATP-binding transport proteins. In addition, a large proportion of the genetic capacity is
devoted to the utilization of a variety of carbon sources, including many plant-derived molecules. The identification of
five signal peptidase genes, as well as several genes for components of the secretion apparatus, is important given the
capacity of Bacillus strains to secrete large amounts of industrially important enzymes. Many of the genes are involved
in the synthesis of secondary metabolites, including antibiotics, that are more typically associated with Streptomyces
species. The genome contains at least ten prophages or remnants of prophages, indicating that bacteriophage
infection has played an important evolutionary role in horizontal gene transfer, in particular in the propagation of
bacterial pathogenesis.

Nature © Macmillan Publishers Ltd 1997


NATURE | VOL 390 | 20 NOVEMBER 1997 249
Techniques for large-scale DNA sequencing have brought about a revolution in our perception of genomes. Together with our understanding of intermediary metabolism, it is now realistic to envisage a time when it should be possible to provide an extensive chemical definition of many living organisms. During the past couple of years, the genome sequences of Haemophilus influenzae, Mycoplasma genitalium, Synechocystis PCC6803, Methanococcus jannaschii, M. pneumoniae, Escherichia coli, Helicobacter pylori, Archaeoglobus fulgidus and the yeast Saccharomyces cerevisiae have been published in their entirety1–8, and at least 40 prokaryotic genomes are currently being sequenced. Regularly updated lists of genome sequencing projects are available at http://www.mcs.anl.gov/home/gaasterl/genomes.html (Argonne National Laboratory, Illinois, USA) and http://www.tigr.org (TIGR, Rockville, Maryland, USA).

The list of sequenced microorganisms does not currently include a paradigm for Gram-positive bacteria, which are known to be important for the environment, medicine and industry. Bacillus subtilis has been chosen to fill this gap9,10 as its biochemistry, physiology and genetics have been studied intensely for more than 40 years. B. subtilis is an aerobic, endospore-forming, rod-shaped bacterium commonly found in soil, water sources and in association with plants. B. subtilis and its close relatives are an important source of industrial enzymes (such as amylases and proteases), and much of the commercial interest in these bacteria arises from their capacity to secrete these enzymes at gram per litre concentrations. It has therefore been used for the study of protein secretion and for development as a host for the production of heterologous proteins11. B. subtilis (natto) is also used in the production of Natto, a traditional Japanese dish of fermented soya beans.

Under conditions of nutritional starvation, B. subtilis stops growing and initiates responses to restore growth by increasing metabolic diversity. These responses include the induction of motility and chemotaxis, and the production of macromolecular hydrolases (proteases and carbohydrases) and antibiotics. If these responses fail to re-establish growth, the cells are induced to form chemically, irradiation- and desiccation-resistant endospores. Sporulation involves a perturbation of the normal cell cycle and the differentiation of a binucleate cell into two cell types. The division of the cell into a smaller forespore and a larger mother cell, each with an entire copy of the chromosome, is the first morphological indication of sporulation. The former is engulfed by the latter and differential expression of their respective genomes, coupled to a complex network of interconnected regulatory pathways and developmental checkpoints, culminates in the programmed death and lysis of the mother cell and release of the mature spore12. In an alternative developmental process, B. subtilis is also able to differentiate into a physiological state, the competent state, that allows it to undergo genetic transformation13.

General features of the DNA sequence

Analysis at the replicon level. The B. subtilis chromosome has 4,214,810 base pairs (bp), with the origin of replication coinciding with the base numbering start point14, and the terminus at about 2,017 kilobases (kb)15. The average G + C ratio is 43.5%, but it varies considerably throughout the chromosome. This average is also different if one considers the nucleotide content of coding sequences, for which G and A (24% and 30%) are relatively more abundant than their counterparts C and T (20% and 26%). A significant inversion of the relative (G − C)/(G + C) ratio is visible at the origin of replication, indicating asymmetry of the nucleotide composition between the replication leading strand and the lagging strand16. Several A + T-rich islands are likely to reveal the signature of bacteriophage lysogens or other inserted elements (Fig. 1, see below).

We have analysed the abundance of oligonucleotides (‘words’) in the genome in various ways: absolute number of words in the genomic text, or comparison with the expected count derived from several models of the chromosome (for example, Markov models, or simulated sequences in which previously known features of the genome were conserved17). Comparing the experimental data with various models allowed us to define under- and overrepresentation of words in the experimental data set by reference to the model chosen. In general, the dinucleotide bias follows closely what has been described for other prokaryotes18,19, in that the dinucleotides most overrepresented are AA, TT and GC, whereas those less represented are TA, AC and GT. Plots of the frequencies of AG, GA, CT and TC in sliding windows along the chromosome show dramatic decreases or increases around the origin and terminus of replication (data not shown). Trinucleotide frequency, directly related to the coding frame, will be discussed below. The distribution of words of four, five and six nucleotides shows significant correlations between the usage of some words and replication (several such oligonucleotides are very significantly overrepresented in one of the strands and underrepresented in the other one).

Setting a statistical cut-off for the significance of duplications at 10⁻³, we expected duplication by chance of words longer than 24 nucleotides to be rare20. In fact, the genome of B. subtilis contains a plethora of such duplications, some of them appearing more than

Figure 1 Distribution of A + T-rich islands along the chromosome of B. subtilis, in sliding windows of 10,000 nucleotides, with a step of 5,000 nucleotides. Location of genes from class 3 according to codon usage analysis (see Fig. 4) is indicated by dots at the bottom of the graph. Known prophages (PBSX, SPβ and skin) are indicated by their names, and prophage-like elements are numbered from 1 to 7. [Plot: G + C content (vertical axis, 0.30 to 0.50) against position in base pairs (horizontal axis, 0 to 4,000,000).]
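
The sliding-window scan behind Figure 1 can be sketched as follows. This is a minimal illustration, not the authors' program: the function names are hypothetical, the chromosome is treated as a linear string rather than a circle, and the 0.38 cut-off used to flag A + T-rich windows is an arbitrary assumption chosen only for the example. The window of 10,000 nucleotides and the step of 5,000 nucleotides are those given in the legend.

```python
def gc_windows(seq, window=10_000, step=5_000):
    """Yield (start, gc_fraction) for every full window along a linear sequence."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        yield start, gc / window


def at_rich_islands(seq, window=10_000, step=5_000, threshold=0.38):
    """Return start positions of windows whose G + C fraction is below threshold.

    The threshold is a hypothetical value for illustration only.
    """
    return [start for start, gc in gc_windows(seq, window, step) if gc < threshold]


if __name__ == "__main__":
    # Toy sequence: a G + C-rich half followed by an A + T-rich half.
    toy = "GC" * 10 + "AT" * 10
    for start, gc in gc_windows(toy, window=10, step=5):
        print(start, gc)
    print(at_rich_islands(toy, window=10, step=5))
```

On a real chromosome the windows spanning the numbering origin would need to wrap around, which this sketch does not attempt.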

twice. Among the duplications, we identified, as expected, the ribosomal RNA genes and their flanking regions, but also regions known to correspond to genes comprising long sequence repeats (such as pks and srf). We also found several regions that were not expected: a 182-bp repetition within the yyaL and yyaO genes; a 410-bp repetition between the yxaK and yxaL genes; an internal duplication of 174 bp inside ydcI; and significant duplications in the regions involved in the transcriptional control of several genes (such as 118 bp repeated three times between yxbB and yxbC). Finally, we found several repetitions at the borders of regions that might be involved in bacteriophage integration.

The most prominent duplication was a 190-bp element that was repeated 10 times in the chromosome. Multiple alignment of the ten repeats showed that they could be classified into two subfamilies with six and three copies each, plus a copy of what appears to be a chimaera. Similar sequences have also been described in the closely related species Bacillus licheniformis21,22. A striking feature of these repeats is that they are only found in half of the chromosome, at either side of the origin of replication, with five repeats on each side. Furthermore, with the exception of the most distal repeat at position 737,062, they lie in the same orientation with respect to the movement of the replication fork (Figs 2 and 3). Putative secondary structures conserved by compensatory mutations, as well as an insert in three of the copies, suggest that this element could indicate a structural RNA molecule.

Analysis at the transcription and translation level. Over 4,000 putative protein coding sequences (CDSs) have been identified, with an average size of 890 bp, covering 87% of the genome sequence (Fig. 2). We found that 78% of the genes started with ATG, 13% with TTG and 9% with GTG, which compares with 85%, 3% and 14%, respectively, in E. coli8. Fifteen genes (eight in the predicted CDSs in bacteriophage SPβ) exhibiting unusual start codons (namely ATT and CTG) were also identified through their similarities to known genes in other organisms or because they had a good GeneMark prediction (see Methods). This has not yet been substantiated experimentally. However, in the case of the gene coding for translation initiation factor 3, the similarity with its E. coli counterpart strongly suggests that the initiation codon is ATT, as is the case in E. coli.

We have not annotated CDSs that largely or entirely overlap existing genes, although such genes (for example, comS inside srfAA) certainly exist. It is also likely that some of the short CDSs present in the B. subtilis genome have been overlooked. For these reasons and possible sequencing errors, the estimated number of B. subtilis CDSs will fluctuate around the present figure of 4,100.

In several cases, in-frame termination codons or frameshifts were confirmed to be present on the chromosome (for example, an internal termination codon in ywtF, or the known programmed translational frameshift in prfB), indicating that the genes are either non-functional (pseudogenes) or subject to regulatory processes. It will therefore be of interest to determine whether these gene features are conserved in related Bacillus species, especially as strain 168 is derived from the Marburg strain that was subjected to X-ray irradiation23.

A few regions do not have any identifiable feature indicating that they are transcribed: they could be ‘grey holes’ of the type described in E. coli24. Preliminary studies involving all regions of more than 400 bp without annotated CDSs indicated that, of ∼300 such regions, only 15% were likely to be really devoid of protein-coding sequences. One of the longest such regions, located between yfjO and yfjN, is 1,628 bp long. Grey holes seem generally to be clustered near the terminus of replication. However, a grey-hole cluster located at ∼600 kb might be related to the temporary chromosome partition observed during the first stages of sporulation, when a segment of about one-third of the chromosome enters the prespore, and remains the sole part of the chromosome in the prespore for a significant transition period25.
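
As an illustration of the first step of the grey-hole survey, candidate regions of more than 400 bp without annotated CDSs can be listed from a table of CDS coordinates. The sketch below is an assumption about how such a scan might be done, not a description of the procedure actually used: the function name and the (start, end) input format are hypothetical, coordinates are taken as 1-based and inclusive, and the chromosome is treated as linear. It only lists candidates; deciding which of them are really devoid of protein-coding sequences requires the further analysis described above.

```python
def unannotated_regions(cds_intervals, genome_length, min_length=400):
    """Return (start, end) gaps longer than min_length bp that no CDS covers.

    cds_intervals: iterable of (start, end) pairs, 1-based inclusive,
    in any order and possibly overlapping (hypothetical input format).
    """
    gaps = []
    covered_to = 0  # rightmost position covered by any CDS seen so far
    for start, end in sorted(cds_intervals):
        if start - covered_to - 1 > min_length:
            gaps.append((covered_to + 1, start - 1))
        covered_to = max(covered_to, end)
    # Region between the last CDS and the end of the sequence.
    if genome_length - covered_to > min_length:
        gaps.append((covered_to + 1, genome_length))
    return gaps


if __name__ == "__main__":
    toy_cds = [(1, 100), (601, 900), (950, 1000)]
    print(unannotated_regions(toy_cds, genome_length=2000))
```

Sorting the intervals and tracking the rightmost covered position keeps the scan correct when CDSs overlap or are nested.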
Table 1 Functional classification of the Bacillus subtilis protein-coding genes
The genes of known function or encoding products similar to known proteins in B. subtilis or in other organisms have been classified into functional categories (2,379 genes). The total number of genes in each category is indicated after the category title. Genes are listed in alphabetical order within each category, and their positions (in kilobases) on the B. subtilis chromosome are indicated after the gene names. A brief description is given for each gene. In some cases, interacting proteins have been indicated between brackets (for example, histidine kinases and response regulator, phosphatases and their substrates). More detailed and constantly updated information is available in the SubtiList database (see Methods). A preliminary assessment of the significance of sequence similarities was obtained through an automated procedure involving a combination between the BLAST2P probability and the percentage of amino-acid identity. Matches considered significant were re-examined manually. It should be emphasized that functions assigned to ‘y’ genes are based only on sequence similarity information with the best counterparts in protein databanks. Genes whose products are only similar to other unknown proteins, or not significantly similar to any other proteins in databanks (categories V and VI), were omitted.

Figure 2 General view of the B. subtilis chromosome. Arrows indicate the orientation of transcription. Genes are coloured according to their classification into six broad functional categories (blue, category I; green, category II; red, category III; orange, category IV; purple, category V; pink, category VI; see Table 1). Class 2 CDSs according to codon usage analysis are indicated by oblique hatches, and class 3 CDSs are indicated by vertical hatches. Ribosomal RNA genes are coloured in yellow. Transfer RNA genes are marked by triangles. Other RNA genes are represented as white arrows. Known genes (non-‘y’ genes) are printed in bold type. Putative transcription termination sites are represented as loops. Known prophages and prophage-like elements are indicated by brown hatches on the chromosome line. The 190-bp element repeated ten times is represented by hatched boxes.

The codon usage of B. subtilis CDSs was analysed using factorial correspondence analysis17. We found that the CDSs of B. subtilis could be separated into three well-defined classes (Fig. 4). Class 1 comprises the majority of the B. subtilis genes (3,375 CDSs), including most of the genes involved in sporulation. Class 2 (188 CDSs) includes genes that are highly expressed under exponential growth conditions, such as genes encoding the transcription and translation machineries, core intermediary metabolism, stress proteins, and one-third of genes of unknown function. Class 3 (537 CDSs) contains a very high proportion of genes of unidentified function (84%), and the members of this class have codons enriched in A + T residues. These genes are usually clustered into groups between 15 and 160 genes (for example, bacteriophage SPβ) and correspond to the A + T-rich islands described above (Fig. 1). When they are of known function, or when their products display similarity to proteins of known function, they usually correspond to functions found in, or associated with, bacteriophages or transposons, as well as functions related to the cell envelope. This includes the region ydc/ydd/yde (40 genes that are missing in some B. subtilis strains26), where gene products showing similarities to bacteriophage and transposon proteins are intertwined. Many of these genes are associated with virulence genes identified in pathogenic Gram-positive bacteria, suggesting that such virulence factors are transmitted horizontally among bacteria at a much higher frequency than previously thought. If we include these A + T-rich regions as possible cryptic phages, together with known bacteriophages or bacteriophage-like elements (SPβ, PBSX and the skin element), we find that the genome of B. subtilis 168 contains at least 10 such elements (Figs 2 and 3). Annotation of the corresponding regions often reveals the presence of genes that are similar to bacteriophage lytic enzymes, perhaps accounting for the observation that B. subtilis cultures are extremely prone to lysis.

The ribosomal RNA genes have been previously identified and

shown to be organized into ten rRNA operons, mainly clustered around the origin of replication of the chromosome (Figs 2 and 3). In addition to the 84 previously identified tRNA genes, by using the Palingol27 and tRNAscan28 programs, we propose four putative new tRNA loci (at 1,262 kb, 1,945 kb, 2,003 kb and 2,899 kb), specific for lysine, proline and arginine (UUU, GGG, CCU and UCU anticodons, respectively). The 10S RNA involved in degradation of proteins made from truncated mRNA has been identified (ssrA), as well as the RNA component of RNase P (rnpB) and the 4.5S RNA involved in the secretion apparatus (scr).

There is a strong transcription orientation bias with respect to the movement of the replication fork: 75% of the predicted genes are transcribed in the direction of replication. Plotting the density of coding nucleotides in each strand along the chromosome readily identifies the replication origin and terminus (Fig. 3). To identify putative operons, we followed ref. 29 for describing Rho-independent transcription termination sites. This yielded ∼1,630 putative terminators (340 of which were bidirectional). We retained only those that were located less than 100 bp downstream of a gene, or that were considered by the program to be ‘very strong’ (in order to account for possible erroneous CDSs). This yielded a total of ∼1,250 terminators, with a mean operon size of three genes. A similar approach to the identification of promoters is problematical, especially because at least 14 sigma factors, recognizing different promoter sequences, have been identified in B. subtilis. Nevertheless, the consensus of the main vegetative sigma factor (σA) appears to be identical to its counterpart in E. coli (σ70): 5′-TTGACA-n17-TATAAT-3′. Relaxing the constraints of the similarity to sigma-specific consensus sequences led to an extremely high number of false-positive results, suggesting that the consensus-oriented approach to the identification of promoters should be replaced by another approach17.

Classification of gene products

Genes were classified according to ref. 14, based on the representation of cells as Turing machines in which one distinguishes between the machine and the program (Table 1). Using the BLAST2P software running against a composite protein databank compound of SWISS-PROT (release 34), TREMBL (release 3, update 1) and B. subtilis proteins, we assigned at least one significant counterpart with a known function to 58% of the B. subtilis proteins. Thus for up to 42% of the gene products, the function cannot be predicted by similarity to proteins of known function: 4% of the proteins are similar only to other unknown proteins of B. subtilis; 12% are similar to unknown proteins from some other organism; and 26% of the proteins are not significantly similar to any other proteins in databanks. This preliminary analysis should be interpreted with caution, because only ∼1,200 gene functions (30%) have been experimentally identified in B. subtilis. We used the ‘y’ prefix in gene names to emphasize that the function has not been ascertained (2,853 ‘y’ genes, representing 70%).

Regulatory systems. Transcription regulatory proteins. Helix–turn–helix proteins form a large family of regulatory proteins found in both prokaryotes and eukaryotes. There are several classes, including repressors, activators and sigma factors. Using BLAST searches, we constructed consensus matrices for helix–turn–helix proteins to analyse the B. subtilis protein library. We identified 18 sigma or sigma-like factors, of which nine (including a new one) are of the SigA type. We also putatively identified 20 regulators (among which 18 were products of ‘y’ genes) of the GntR family, 19 regulators (15 ‘y’ genes) of the LysR family, and 12 regulators (5 ‘y’ genes) of the LacI family. Other transcription regulatory proteins were of the AraC family (11 members, 10 ‘y’), the Lrp family (7 members, 3 ‘y’), the DeoR family (6 members, 3 ‘y’), or additional families (such as the MarR, ArsR or TetR families). A puzzling observation is that several regulatory proteins display significant similarity to aminotransferases (seven such enzymes have been identified as showing similarity to repressors).

Two-component signal-transduction pathways. Two-component regulatory systems, consisting of a sensor protein kinase and a response regulator, are widespread among prokaryotes. We have identified 34 genes encoding response regulators in B. subtilis, most of which have adjacent genes encoding histidine kinases. Response regulators possess a well-conserved N-terminal phospho-acceptor domain30, whereas their C-terminal DNA-binding domains share similarities with previously identified response regulators in E. coli, Rhizobium meliloti, Klebsiella pneumoniae or Staphylococcus aureus. Representatives of the four subfamilies recently identified in E. coli31
Figure 3 Density of coding nucleotides along the B. subtilis chromosome. Yellow stands for the density of coding nucleotides in both strands of the sequence; red indicates the density of coding nucleotides in the clockwise strand (nucleotides involved in genes transcribed in the clockwise orientation). The movement of the replication forks is represented by arrows. Ribosomal RNA operons are indicated by brown boxes. Known prophages and prophage-like elements are represented as blue lines. The 190-bp element repeated ten times is represented by green lines. [Circular map with a radial scale from 0% to 100%; labelled features are oriC, terC, the rrn operons, the prophages PBSX, SPβ and skin, and the numbered prophage-like elements.]
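
The orientation bias that Figure 3 makes visible can be quantified by asking, for each gene, whether it is transcribed in the direction of fork movement. The sketch below is only an illustration under stated assumptions: the gene record format (start, end, strand) and the function names are hypothetical, "+" is taken to mean transcription in the clockwise direction (increasing coordinates), the origin is placed at position 0 as in the text, and the terminus is rounded to 2,017,000 bp from the "about 2,017 kb" quoted in the text.

```python
TERMINUS = 2_017_000  # bp; "about 2,017 kb" in the text, rounded here


def co_oriented(start, end, strand, terminus=TERMINUS):
    """True if a gene is transcribed in the direction of replication.

    Genes located between the origin (position 0) and the terminus are
    replicated clockwise, so "+" genes are co-oriented there; on the other
    half of the chromosome the fork moves the opposite way, so "-" genes are.
    """
    midpoint = (start + end) / 2
    clockwise_replichore = midpoint < terminus
    return (strand == "+") == clockwise_replichore


def co_orientation_fraction(genes, terminus=TERMINUS):
    """Fraction of genes transcribed in the direction of fork movement."""
    genes = list(genes)
    if not genes:
        raise ValueError("no genes supplied")
    hits = sum(co_oriented(s, e, strand, terminus) for s, e, strand in genes)
    return hits / len(genes)


if __name__ == "__main__":
    toy_genes = [
        (100, 1_000, "+"),             # first half, clockwise: co-oriented
        (5_000, 6_000, "-"),           # first half, anticlockwise: head-on
        (3_000_000, 3_001_000, "-"),   # second half, anticlockwise: co-oriented
        (4_000_000, 4_000_900, "+"),   # second half, clockwise: head-on
    ]
    print(co_orientation_fraction(toy_genes))
```

Applied to a complete annotation, a value well above 0.5 would reflect the kind of bias reported in the text.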

(OmpR, FixJ, CitB and LytR) have been identified in B. subtilis. In a fifth subfamily, CheY, the DNA-binding domain is absent. The DNA-binding domain of a single B. subtilis response regulator, YesN, shares similarity with regulatory proteins of the AraC family.

Quorum sensing. The B. subtilis genome contains 11 aspartate phosphatase genes, whose products are involved in dephosphorylation of response regulators, that do not seem to have counterparts in Gram-negative bacteria such as E. coli. Downstream from the corresponding genes are some small genes, called phr, encoding regulatory peptides that may serve as quorum sensors32. Seven phr genes have been identified so far, including three new genes (phrG, phrI and phrK).

Figure 4 Factorial correspondence analysis of codon usage in the B. subtilis CDSs. Red dots, genes from class 1; green triangles, genes from class 2; blue crosses, genes from class 3. Class 2 contains genes coding for the translation and transcription machineries, and genes of the core intermediary metabolism. Class 3 genes correspond to codons strongly enriched in A or T in the wobble position; they generally belong to prophage-like inserts in the genome.

Protein secretion. It is known that B. subtilis and related Bacillus species, in particular B. licheniformis and B. amyloliquefaciens, have a high capacity to secrete proteins into the culture medium. Several genes encoding proteins of the major secretion pathway have been identified: secA, secD, secE, secF, secY, ffh and ftsY. Surprisingly, there is no gene for the SecB chaperone. It is thought that other chaperone(s) and targeting factor(s), such as Ffh and FtsY, may take over the SecB function. Further, although there is only one such gene in E. coli, five type I signal peptidase genes (sipS, sipT, sipU, sipV and sipW) have been found33. The lsp gene, encoding a type II signal peptidase required for processing of lipo-modified precursors, was also identified. PrsA, located at the outer side of the membrane, is important for the refolding of several mature proteins after their translocation through the membrane.

Other families of proteins. ABC transporters were the most frequent class of proteins found in B. subtilis. They must be extremely important in Gram-positive bacteria, because they have an envelope comprising a single membrane. ABC transporters will therefore allow such bacteria to escape the toxic action of many compounds. We propose that 77 such transporters are encoded in the genome. In general they involve the interaction of at least three gene products, specified by genes organized into an operon. Other families comprised 47 transport proteins similar to facilitators (and perhaps sometimes part of the ABC transport systems), 18 amino-acid permeases (probably antiporters), and at least 16 sugar transporters belonging to the PEP-dependent phosphotransferase system.

General stress proteins are important for the survival of bacteria under a variety of environmental conditions. We identified 43 temperature-shock and general stress proteins displaying strong similarity to E. coli counterparts.

Missing genes. Histone-like proteins such as HU and H-NS have been identified in E. coli. We found that B. subtilis encodes two putative histone-like proteins that show similarity to E. coli HU, namely HBsu and YonN, but found no homologue to H-NS. It is known that the hbs gene encoding HBsu is essential, but we do not expect the yonN gene to be essential because it is present in the SPβ prophage. IHF is similar to HU, and it is not known whether HBsu plays a similar role to that of IHF in E. coli. Similarly, no protein similar to FIS could be found.

Genes encoding products that interact with methylated DNA, such as seqA in E. coli, involved in the regulation of replication initiation timing, or mutH, the endonuclease recognizing the newly synthesized strand during mismatch repair at hemi-methylated
[Figure 5: pie chart of B. subtilis genes grouped by paralogue family size. Singlets, 2,126 genes (53%); doublets, 568 (14%); triplets, 273 (7%); quadruplets, 168 (4%); quintuplets, 100 (3%); the sextuplet, heptuplet and octuplet classes account for 84 (2%), 112 (3%) and 72 (2%) genes (order of these three labels uncertain); families of 9 to 19 genes, 210 genes; single large families of 38, 47, 57, 64 and 77 genes.]

Figure 5 Gene paralogue distribution in the genome of B. subtilis. Each B. subtilis protein has been compared with all other proteins in the genome, using a Smith and Waterman algorithm. The baseline is established by making a similar comparison using 100 independent random shuffles of the protein sequence (Z-score > 13).
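
The shuffled-sequence baseline mentioned in the Figure 5 legend can be illustrated with a small sketch. It is not the authors' implementation (the article cites refs 37 and 38 for the actual method); the scoring scheme here (match +2, mismatch -1, linear gap -2) and the function names are assumptions chosen only to keep the example self-contained. The idea is to compute a Smith and Waterman local alignment score for a pair of proteins, repeat the comparison against 100 shuffled copies of one sequence, and express the real score in standard deviations above the shuffled mean.

```python
import random
import statistics


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (toy scoring, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            if score > best:
                best = score
        prev = curr
    return best


def z_score(query, target, n_shuffles=100, seed=0):
    """Z-score of the real alignment against shuffled copies of target."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = smith_waterman_score(query, target)
    residues = list(target)
    baseline = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # keeps composition, destroys order
        baseline.append(smith_waterman_score(query, "".join(residues)))
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    if sd == 0:
        # Degenerate baseline (all shuffles score the same).
        return float("inf") if observed > mean else 0.0
    return (observed - mean) / sd


if __name__ == "__main__":
    # Hypothetical 40-residue sequence, used only for illustration.
    protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRV"
    print(z_score(protein, protein) > 13)
```

Shuffling preserves amino-acid composition, so the Z-score measures similarity beyond what composition alone would produce; pairs above the chosen cut-off would then be grouped into paralogue families.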

Nature © Macmillan Publishers Ltd 1997


NATURE | VOL 390 | 20 NOVEMBER 1997 253
GATC sites, are also missing. This is in line with the absence of known methylation in B. subtilis, equivalent to Dam methylation in E. coli. Similarly, E. coli sfiA, encoding an inhibitor of FtsZ action in the SOS response, has no counterpart in B. subtilis. In contrast, B. subtilis replication initiation-specific genes, such as dnaB and dnaD, are missing in E. coli. The exact counterpart of the E. coli mukB gene, involved in chromosome partitioning, does not exist in B. subtilis, but genes spo0J and smc (Smc is weakly similar to MukB), which are suggested to be involved in partitioning of the B. subtilis chromosome, are missing in E. coli.
Turnover of mRNA is controlled in E. coli by a ‘degradosome’ comprising RNase E. It has a counterpart in B. subtilis, but we failed to find a clear homologue of RNase E in this organism. Whether this is related to the role of ribosomal protein S1 as an RNA helicase involved in mRNA turnover in E. coli requires further investigation. In particular, a homologue of rpsA (S1 structural gene), ypfD, might be involved in a structure homologous to the degradosome34.
Structurally unrelated genes of similar function. Several genes encode products that have similar functions in E. coli and B. subtilis, but have no evident common structure. This is the case for the helicase loader genes, E. coli dnaC and B. subtilis dnaI; the genes coding for the replication termination protein, E. coli tus and B. subtilis rtp; and the division topology specifier genes, E. coli minE and B. subtilis divIVA. The situation may even be more complex in multisubunit enzymes: B. subtilis synthesizes two DNA polymerase III α chains, one having 3′–5′ proofreading exonuclease activity (PolC) and the other without the exonuclease activity (DnaE); in E. coli, only the latter exists. E. coli DNA polymerase II is structurally related to DNA polymerase α of eukaryotes, whereas B. subtilis YshC is related to DNA polymerase β.

Metabolism of small molecules
The type and range of metabolism used for the interconversion of low-molecular-weight compounds provide important clues to an organism’s natural environment(s) and its biological activity. Here we briefly outline the main metabolic pathways of B. subtilis before the reconstruction of these pathways in silico, the correlation of genes with specific steps in the pathway, and ultimately the prediction of patterns of gene expression.
Intermediary metabolism. It has long been known that B. subtilis can use a variety of carbohydrates. As expected, it encodes an Embden–Meyerhof–Parnas glycolytic pathway, coupled to a functional tricarboxylic acid cycle. Further, B. subtilis is also able to grow anaerobically in the presence of nitrate as an electron acceptor. This metabolism is, at least in part, regulated by the FNR protein, binding to sites upstream of at least eight genes (four sites experimentally confirmed and four putative sites). A noteworthy feature of B. subtilis metabolism is an apparent requirement of branched short-chain carboxylic acids for lipid biosynthesis35. Branched-chain 2-keto acid decarboxylase activity exists and may be linked to a variety of genes, suggesting that B. subtilis can synthesize and utilize linear branched short-chain carboxylic acids and alcohols.
Amino-acid and nucleotide metabolism. Pyrimidine metabolism of B. subtilis seems to be regulated in a way fundamentally different from that of E. coli, as it has two carbamylphosphate synthetases (one specific for arginine synthesis, the other for pyrimidine). Additionally, the aspartate transcarbamylase of B. subtilis does not act as an allosteric regulator as it does in E. coli. As in other microorganisms, pyrimidine deoxyribonucleotides are synthesized from ribonucleoside diphosphates, not triphosphates. The cytidine diphosphate required for DNA synthesis is derived from either the salvage pathway of mRNA turnover or from the synthesis of phospholipids and components of the cell wall. This means that polynucleotide phosphorylase is of fundamental importance in nucleic acid metabolism, and may account for its important role in competence36. Two ribonucleoside reductases, both of class I, NrdEF type, are encoded by the B. subtilis chromosome, in one case from within the SPβ genome. In this latter case, the gene corresponding to the large subunit both contains an intron and codes for an intein (V.L., unpublished data). The gene of the small subunit of this enzyme also contains an intron, encoding an endonuclease, as was found for the homologue in bacteriophage T4.
By similarity with genes from other organisms, there appears to be, in addition to genes involved in amino-acid degradation (such as the roc operon, which degrades arginine and related amino acids), a large number of genes involved in the degradation of molecules such as opines and related molecules, derived from plants. This is also in line with the fact that B. subtilis degrades polygalacturonate, and suggests that, in its biotope, it forms specific relations with plants.
Secondary metabolism. In addition to many genes coding for degradative enzymes, almost 4% of the B. subtilis genome codes for large multifunctional enzymes (for example, the srf, pps and pks loci), similar to those involved in the synthesis of antibiotics in other genera of Gram-positive bacteria such as Streptomyces. Natural isolates of B. subtilis produce compounds with antibiotic activity, such as surfactin, fengycin and difficidin, that can be related to the above-mentioned loci. This bacterium therefore provides a simple and genetically amenable model in which to study the synthesis of antibiotics and its regulation. These pathways are often organized in very long operons (for example, the pks region spans 78.5 kb, about 2% of the genome). The corresponding sequences are mostly located near the terminus of replication, together with prophages and prophage-like sequences.

Paralogues and orthologues
It is important to relate intermediary metabolism to genome structure, function and evolution. We therefore compared the B. subtilis proteins with themselves, as well as with proteins from known complete genomes, using a consistent statistical method that allows the evaluation of unbiased probabilities of similarities between proteins37,38. For Z-scores higher than 13, the number of proteins similar to each given protein does not vary, indicating that this cut-off value identifies sets of proteins that are significantly similar.
Families of paralogues. Many of the paralogues constitute large families of functionally related proteins, involved in the transport of compounds into and out of the cell, or involved in transcription regulation. Another part of the genome consists of gene doublets (568 genes), triplets (273 genes), quadruplets (168 genes) and quintuplets (100 genes). Finally, about half of the genome is made of genes coding for proteins with no apparent paralogues (Fig. 5). No large family comprises only proteins without any similarity to proteins of known function.
The process by which paralogues are generated is not well understood, but we might find clues by studying some of the duplications in the genome. Several approximate DNA repetitions, associated with very high levels of protein identity, were found, mainly within regions putatively or previously identified as prophages. This is in line with previous observations about PBSX and the skin element39,40, and suggests that these prophage-like elements share a common ancestor and have diverged relatively recently. In addition, several protein duplications are in genes that are located very close to each other, such as yukL and dhbF (the corresponding proteins are 65% identical in an overlap of 580 amino acids), yugJ and yugK (proteins 73% identical), yxjG and yxjH (proteins 70% identical), and the entire opuB operon, which is duplicated 3 kb away (opuC operon, yielding ~80% of amino-acid identity in the corresponding proteins).
The study of paralogues showed that, as in other genomes, a few classes of genes have been highly expanded. This argues against the idea of the genome evolving through a series of duplications of ancestral genomes, but rather for the idea of genes as living organisms, subject to evolutionary constraints, some being submitted to expansion and natural selection, and others to local duplications of DNA regions.
Among paralogue doublets, some were unexpected, such as the three aminoacyl tRNA synthetases doublets (hisS (2,817 kb) and hisZ (3,588 kb); thrS (2,960 kb) and thrZ (3,855 kb); tyrS (3,036 kb) and tyrZ (3,945 kb)) or the two mutS paralogues (mutS and yshD). This latter situation is similar to that found in Synechocystis. In the case of B. subtilis, the presence of two MutS proteins could indicate that there are two different pathways for long-patch mismatch repair, possibly a consequence of the active genetic transformation mechanism of B. subtilis.
Families of orthologues. Because Mycoplasma spp. are thought to be derived from Gram-positive bacteria similar to B. subtilis, we compared the B. subtilis genome with that of M. genitalium. Among the 450 genes encoded by M. genitalium, the products of 300 are similar to proteins of B. subtilis. Among the 146 remaining gene products, a further 3 are similar to proteins of other Bacillus species, and 9 to proteins of other Gram-positive bacteria; 25 are similar to proteins of Gram-negative bacteria; and 19 are similar to proteins of other Mycoplasma spp. This leaves only 90 genes that would be specific to M. genitalium and might be involved in the interaction of this organism with its host.
The B. subtilis genome is similar in size to that of E. coli. Because these bacteria probably diverged more than one billion years ago, it is of evolutionary value to investigate their relative similarity. About 1,000 B. subtilis genes have clear orthologous counterparts in E. coli (one-quarter of the genome). These genes did not belong either to the prophage-like regions or to regions coding for secondary metabolism (~15% of the B. subtilis genome). This indicates that a large fraction of these genomes shared similar functions. At first sight, however, it seems that little of the operon structure has been conserved. We nevertheless found that ~100 putative operons or parts of operons were conserved between E. coli and B. subtilis. Among these, ~12 exhibited a reshuffled gene order (typically, the arabinose operon is araABD in B. subtilis and araBAD in E. coli). In addition to the core of the translation and transcription machinery, we identified other classes of operons that were well conserved between the two organisms, including major integrated functions such as ATP synthesis (atp operon) and electron transfer (cta and qox operons). As well as being well preserved, the murein biosynthetic region was partly duplicated, allowing creation of part of the genes required for the sporulation division machinery41. The amino-acid biosynthesis genes differ more in their organization: the E. coli genes for arginine biosynthesis are spread throughout the chromosome, whereas the arginine biosynthesis genes of B. subtilis form an operon. The same is true for purine biosynthetic genes. Genes responsible for the biosynthesis of coenzymes and prosthetic groups in B. subtilis are often clustered in operons that differ from those found in E. coli. Finally, several operons conserved in E. coli and B. subtilis correspond to unknown functions, and should therefore be priority targets for the functional analysis of these model genomes.
Comparison with Synechocystis PCC6803 revealed about 800 orthologues. However, in this case the putative operon structure is extremely poorly conserved, apart from four of the ribosomal protein operons, the groES–groEL operon, yfnHG (respectively in Synechocystis rfbFG), rpsB-tsf, ylxS-nusA-infB, asd-dapGA-ymfA, spmAB, efp-accB, grpE-dnaK, yurXW. The nine-gene atp operon of B. subtilis is split into two parts in Synechocystis: atpBE and atpIHGFDAC.

Conclusion
The biochemistry, physiology and molecular biology of B. subtilis have been extensively studied over the past 40 years. In particular, B. subtilis has been used to study postexponential phase phenomena such as sporulation and competence for DNA uptake. The genome sequences of E. coli and B. subtilis provide a means of studying the evolutionary divergence, one billion years ago, of eubacteria into the Gram-positive and Gram-negative groups. The availability of powerful genetic tools will allow the B. subtilis genome sequence data to be exploited fully within the framework of a systematic functional analysis program, undertaken by a consortium of 19 European and 7 Japanese laboratories coordinated by S. D. Ehrlich (INRA, Jouy-en-Josas, France) and by N. Ogasawara and H. Yoshikawa (Nara Institute of Science and Technology, Nara, Japan).

Methods
Genome cloning and sequencing. An international consortium was established to sequence the genome of B. subtilis strain 168 (refs 9, 10, 42). At its peak, 25 European, seven Japanese and one Korean laboratory participated in the program, together with two biotechnology companies. Five contiguous DNA regions totalling 0.94 Mb, and two additional regions of 0.28 and 0.14 Mb, were sequenced by the Japanese partners, while the European partners sequenced a total of 2.68 Mb. A few sequences from strain 168 published previously were not resequenced when long overlaps did not indicate differences.
A major technical difficulty was the inability to construct in E. coli gene banks representative of the entire B. subtilis chromosome using vectors that have proved efficient for other sources of bacterial DNA (such as bacteriophage or cosmid vectors). This was due to the generally very high level of expression of B. subtilis genes in E. coli, leading to toxic effects. This limitation was overcome by: cloning into a variety of vectors9,43,44; using an E. coli strain maintaining low-copy number plasmids44; using an integrative plasmid/marker rescue genome-walking strategy44; and in vitro amplification using polymerase chain reaction (PCR) techniques45,46.
Although cloning vectors were used in the early stages as templates for sequencing reactions, they were largely superseded in the later stages by long-range and inverse PCR techniques. To reduce sequencing errors resulting from PCR amplification artefacts, at least eight amplification reactions were performed independently and subsequently pooled. The various sequencing groups were free to choose their own strategy, except that all DNA sequences had to be determined entirely on both strands.
Sequence annotation and verification. The sequences were annotated by the groups, and sent to a central depository at the Institut Pasteur14. The Japanese sequences were also sent there through the Japanese depository at the Nara Institute of Science and Technology. The same procedures were used to identify CDSs and to detect frameshifts. They were embedded within a cooperative computer environment dedicated to automatic sequence annotation and analysis39. In a first step, we identified in all six possible frames the open reading frames (ORFs) that were at least 100 codons in length. In a second step, three independent methods were used: the first method used the GeneMark coding-sequence prediction method47 together with the search for CDSs preceded by typical translation initiation signals (5′-AAGGAGGTG-3′), located 4–13 bases upstream of the putative start codons (ATG, TTG or GTG); the second method used the results of a BLAST2X analysis performed on the entire B. subtilis genome against the non-redundant protein databank at the NCBI; and the third method was based on the distribution of non-overlapping trinucleotides or hexanucleotides in the three frames of an ORF48.
In general, frameshifts and missense mutations generating termination codons or eliminating start codons are relatively easy to detect. We shall devise a procedure for detecting another type of error, GC instead of CG or vice versa, which are much more difficult to identify. It should be noted that putative frameshift errors should not be corrected automatically. The sequences of the flanking regions of a 500-bp fragment centred around a putative error were sent to an independent verification group, which performed PCR amplifications using chromosomal DNA as template, and sequenced the corresponding DNA products.
Organization and accessibility of data. The B. subtilis sequence data have been combined with data from other sources (biochemical, physiological and genetic) in a specialized database, SubtiList49, available as a Macintosh or Windows stand-alone application (4th Dimension runtime) by anonymous FTP at ftp://ftp.pasteur.fr/pub/GenomeDB/SubtiList. SubtiList is also accessible through a World-Wide Web server at http://www.pasteur.fr/Bio/SubtiList.html,

where it has been implemented on a UNIX system using the Sybase relational database management system. A completely rewritten version of SubtiList is in preparation to facilitate browsing of the information of the whole chromosome. Flat files of the whole DNA and protein sequences in EMBL and FASTA format will be made available at the above ftp address. Another B. subtilis genome database is also under development at the Human Genome Center of Tokyo University (http://www.genome.ad.jp), and SubtiList will also be available there.

Received 16 July; accepted 29 September 1997.

1. Fleischmann, R. D. et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496–512 (1995).
2. Fraser, C. M. et al. The minimal gene complement of Mycoplasma genitalium. Science 270, 397–403 (1995).
3. Kaneko, T. et al. Sequence analysis of the genome of the unicellular Cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions. DNA Res. 3, 109–136 (1996).
4. Bult, C. J. et al. Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273, 1058–1073 (1996).
5. Himmelreich, R. et al. Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 24, 4420–4449 (1996).
6. Goffeau, A. et al. The yeast genome directory. Nature 387, 5–105 (1997).
7. Tomb, J.-F. et al. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388, 539–547 (1997).
8. Blattner, F. R. et al. The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1462 (1997).
9. Kunst, F., Vassarotti, A. & Danchin, A. Organization of the European Bacillus subtilis genome sequencing project. Microbiology 389, 84–87 (1995).
10. Ogasawara, N. & Yoshikawa, H. The systematic sequencing of the Bacillus subtilis genome in Japan. Microbiology 142, 2993–2994 (1996).
11. Harwood, C. R. Bacillus subtilis and its relatives: molecular biological and industrial workhorses. Trends Biotechnol. 10, 247–256 (1992).
12. Stragier, P. & Losick, R. Molecular genetics of sporulation in Bacillus subtilis. Annu. Rev. Genet. 30, 297–341 (1996).
13. Solomon, J. M. & Grossman, A. D. Who’s competent and when: regulation of natural genetic competence in bacteria. Trends Genet. 12, 150–155 (1996).
14. Moszer, I., Kunst, F. & Danchin, A. The European Bacillus subtilis genome sequencing project: current status and accessibility of the data from a new World Wide Web site. Microbiology 142, 2987–2991 (1996).
15. Franks, A. H., Griffiths, A. A. & Wake, R. G. Identification and characterization of new DNA replication terminators in Bacillus subtilis. Mol. Microbiol. 17, 13–23 (1995).
16. Lobry, J. R. Asymmetric substitution patterns in the two DNA strands of bacteria. Mol. Biol. Evol. 13, 660–665 (1996).
17. Hénaut, A. & Danchin, A. in Escherichia coli and Salmonella: Cellular and Molecular Biology (eds Neidhardt, F. et al.) 2047–2066 (ASM, Washington DC, 1996).
18. Nussinov, R. The universal dinucleotide asymmetry rules in DNA and amino acid codon choice. Nucleic Acids Res. 17, 237–244 (1981).
19. Karlin, S., Burge, C. & Campbell, A. M. Statistical analyses of counts and distributions of restriction sites in DNA sequences. Nucleic Acids Res. 20, 1363–1370 (1992).
20. Burge, C., Campbell, A. M. & Karlin, S. Over- and under-representation of short oligonucleotides in DNA sequences. Proc. Natl Acad. Sci. USA 89, 1358–1362 (1992).
21. Kasahara, Y., Nakai, S. & Ogasawara, H. Sequence analysis of the 36-kb region between gntZ and trnY genes of Bacillus subtilis genome. DNA Res. 4, 155–159 (1997).
22. Presecan, E. et al. The Bacillus subtilis genome from gerBC (3118) to licR (3348). Microbiology 143, 3313–3328 (1997).
23. Burkholder, P. R. & Giles, N. H. Induced biochemical mutations in Bacillus subtilis. Am. J. Bot. 33, 345–348 (1947).
24. Daniels, D. L., Plunkett, G. III, Burland, V. & Blattner, F. R. Analysis of the Escherichia coli genome: DNA sequence of the region from 84.5 to 86.5 minutes. Science 257, 771–778 (1992).
25. Wu, L. J. & Errington, J. Bacillus subtilis SpoIIIE protein required for DNA segregation during asymmetric cell division. Science 264, 572–575 (1994).
26. Itaya, M. Stability and asymmetric replication of the Bacillus subtilis 168 chromosome structure. J. Bacteriol. 175, 741–749 (1993).
27. Billoud, B., Kontic, M. & Viari, A. Palingol: a declarative programming language to describe nucleic acids’ secondary structures and to scan sequence database. Nucleic Acids Res. 24, 1395–1403 (1996).
28. Fichant, G. A. & Burks, C. Identifying potential tRNA genes in genomic DNA sequences. J. Mol. Biol. 220, 659–671 (1991).
29. d’Aubenton Carafa, Y., Brody, E. & Thermes, C. Prediction of rho-independent Escherichia coli transcription terminators. A statistical analysis of their RNA stem-loop structures. J. Mol. Biol. 216, 835–858 (1990).
30. Stock, J. B., Surette, M. G., Levitt, M. & Park, P. in Two-Component Signal Transduction (eds Hoch, J. A. & Silhavy, T. J.) 25–51 (ASM, Washington DC, 1995).
31. Mizuno, T. Compilation of all genes encoding two-component phosphotransfer signal transducers in the genome of Escherichia coli. DNA Res. 4, 161–168 (1997).
32. Perego, M., Glaser, P. & Hoch, J. A. Aspartyl-phosphate phosphatases deactivate the response regulator components of the sporulation signal transduction system in Bacillus subtilis. Mol. Microbiol. 19, 1151–1157 (1996).
33. Tjalsma, H. et al. Bacillus subtilis contains four closely related type I signal peptidases with overlapping substrate specificities: constitutive and temporally controlled expression of different sip genes. J. Biol. Chem. 272, 25983–25992 (1997).
34. Danchin, A. Comparison between the Escherichia coli and Bacillus subtilis genomes suggests that a major function of polynucleotide phosphorylase is to synthesize CDP. DNA Res. 4, 9–18 (1997).
35. Suutari, M. & Laakso, S. Unsaturated and branched chain-fatty acids in temperature adaptation of Bacillus subtilis and Bacillus megaterium. Biochim. Biophys. Acta 1126, 119–124 (1992).
36. Luttinger, A., Hahn, J. & Dubnau, D. Polynucleotide phosphorylase is necessary for competence development in Bacillus subtilis. Mol. Microbiol. 19, 343–356 (1996).
37. Landès, C., Hénaut, A. & Risler, J.-L. A comparison of several similarity indices used in the classification of protein sequences: a multivariate analysis. Nucleic Acids Res. 20, 3631–3637 (1992).
38. Glémet, E. & Codani, J.-J. LASSAP, a LArge Scale Sequence compArison Package. Comput. Appl. Biosci. 13, 137–143 (1997).
39. Médigue, C., Moszer, I., Viari, A. & Danchin, A. Analysis of a Bacillus subtilis genome fragment using a co-operative computer system prototype. Gene 165, GC37–GC51 (1995).
40. Krogh, S., O’Reilly, M., Nolan, N. & Devine, K. M. The phage-like element PBSX and part of the skin element, which are resident at different locations on the Bacillus subtilis chromosome, are highly homologous. Microbiology 142, 2031–2040 (1996).
41. Daniel, R. A., Drake, S., Buchanan, C. E., Scholle, R. & Errington, J. The Bacillus subtilis spoVD gene encodes a mother-cell-specific penicillin-binding protein required for spore morphogenesis. J. Mol. Biol. 235, 209–220 (1994).
42. Anagnostopoulos, C. & Spizizen, J. Requirements for transformation in Bacillus subtilis. J. Bacteriol. 81, 741–746 (1961).
43. Azevedo, V. et al. An ordered collection of Bacillus subtilis DNA segments cloned in yeast artificial chromosomes. Proc. Natl Acad. Sci. USA 90, 6047–6051 (1993).
44. Glaser, P. et al. Bacillus subtilis genome project: cloning and sequencing of the 97 kb region from 3258 to 3338. Mol. Microbiol. 10, 371–384 (1993).
45. Ogasawara, N., Nakai, S. & Yoshikawa, H. Systematic sequencing of the 180 kilobase region of the Bacillus subtilis chromosome containing the replication origin. DNA Res. 1, 1–14 (1994).
46. Sorokin, A. et al. A new approach using multiplex long accurate PCR and yeast artificial chromosomes for bacterial chromosome mapping and sequencing. Genome Res. 6, 448–453 (1996).
47. Borodovsky, M. & McIninch, J. GENMARK: parallel gene recognition for both DNA strands. Comput. Chem. 17, 123–133 (1993).
48. Fichant, G. A. & Quentin, Y. A frameshift error detection algorithm for DNA sequencing projects. Nucleic Acids Res. 23, 2900–2908 (1995).
49. Moszer, I., Glaser, P. & Danchin, A. SubtiList: a relational database for the Bacillus subtilis genome. Microbiology 141, 261–268 (1995).

Acknowledgements. We thank C. Anagnostopoulos, R. Dedonder and J. Hoch for their pioneering efforts, and A. Bairoch for advice in annotating B. subtilis protein data. The main funding of the European network was provided by the European Commission under the Biotechnology program. The Japanese project was included in the Human Genome Program, and supported by a research grant from the Ministry of Education, Science and Culture, and the Proposal-Based Advanced Industrial Technology R&D Program from New Energy and Industrial Technology Development Organization. The Swiss and Korean projects were funded by the Swiss National Fund and the Korean government, respectively. An industrial platform was set up to facilitate contacts between participants of the European consortium and some European biotechnology companies: DuPont de Nemours (France, USA), Frimond (Belgium), Genencor (Finland, USA), Gist Brocades (The Netherlands), Glaxo-Wellcome (UK, Italy), Hoechst Marion Roussel (France, Germany), F. Hoffmann-La Roche AG (Switzerland), Novo Nordisk (Denmark), SmithKline Beecham (UK).

Correspondence and requests for materials should be addressed to F.K. (e-mail: fkunst@pasteur.fr), N.O. (nogasawa@bs.aist-nara.ac.jp), H.Y. (hyoshika@bs.aist-nara.ac.jp) or A.D. (adanchin@pasteur.fr). The sequence has been deposited in EMBL/GenBank/DDBJ with accession numbers from Z99104 to Z99124.
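
The first annotation step described in Methods (ORFs of at least 100 codons in all six frames) and the search for a translation initiation signal 4–13 bases upstream of a start codon can be sketched as follows. This is an illustrative reading of the text, not the consortium's software: the function names are hypothetical, an ORF is assumed here to be a stop-free run of codons closed by a stop codon, the spacing is assumed to be counted from the 3′ end of the signal to the start codon, and the mismatch tolerance is an arbitrary choice. GeneMark, the BLAST comparison and the oligonucleotide-distribution method are not reproduced.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """List stop-terminated, stop-free codon runs in all six frames.

    Returns tuples (strand, frame, start, end, n_codons). Coordinates are
    0-based and end-exclusive; for strand "-" they refer to the
    reverse-complemented sequence, to keep the sketch short.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            for pos in range(frame, len(s) - 2, 3):
                if s[pos:pos + 3] in STOPS:
                    n_codons = (pos - run_start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, run_start, pos, n_codons))
                    run_start = pos + 3  # next run begins after the stop
    return orfs


def rbs_upstream(seq, start, motif="AAGGAGGTG",
                 min_gap=4, max_gap=13, max_mismatch=2):
    """True if a motif-like site ends min_gap..max_gap bases before `start`.

    `start` is the index of the first base of a putative start codon.
    max_mismatch is an assumed tolerance, not a value from the article.
    """
    for gap in range(min_gap, max_gap + 1):
        site_end = start - gap
        site_start = site_end - len(motif)
        if site_start < 0:
            break  # larger gaps would run off the 5' end as well
        site = seq[site_start:site_end]
        mismatches = sum(1 for x, y in zip(site, motif) if x != y)
        if mismatches <= max_mismatch:
            return True
    return False


if __name__ == "__main__":
    # Synthetic example: one 121-codon ORF on the forward strand.
    toy = "ATG" + "GCT" * 120 + "TAA"
    for orf in find_orfs(toy):
        print(orf)
```

In a fuller pipeline each ORF would then be scanned for ATG, TTG or GTG codons, and `rbs_upstream` would help rank the candidate starts.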

Table 1. Functional classification of the Bacillus subtilis protein-coding genes.
I CELL ENVELOPE AND CELLULAR prophage-mediated lysis) specific enzyme IIC component
PROCESSES 866 xlyB 1317 N-acetylmuramoyl-L-alanine amidase (PBSX lmrB 290 lincomycin-resistance protein
prophage-mediated lysis) lplA 779 lipoprotein
I.1 CELL WALL ........................................................................ 93 yfnG 799 CDP-glucose 4,6-dehydratase lplB 781 transmembrane lipoprotein
cwlA 2665 N-acetylmuramoyl-L-alanine amidase (minor yhdD 1013 cell wall-binding protein lplC 782 transmembrane lipoprotein
autolysin) ykuA 1467 penicillin-binding protein mdr 334 multidrug-efflux transporter (puromycin, ner-
cwlC 1873 N-acetylmuramoyl-L-alanine amidase (sporula- ylbI 1569 lipopolysaccharide core biosynthesis floxacin, tosufloxacin)
tion mother cell wall) ymaG 1865 cell wall protein msmE 3097 multiple sugar-binding protein
cwlD 157 N-acetylmuramoyl-L-alanine amidase (germina- yngB 1946 UTP-glucose-1-phosphate uridylyltransferase msmX 3984 multiple sugar-binding transport ATP-binding
tion) yocH 2093 cell wall-binding protein protein
cwlJ 282 cell wall hydrolase (sporulation) yodJ 2135 D-alanyl-D-alanine carboxypeptidase mtlA 449 phosphotransferase system (PTS) mannitol-
dacA 18 penicillin-binding protein 5 (D-alanyl-D-alanine yojL 2116 cell wall-binding protein specific enzyme IIABC component
carboxypeptidase) (peptidoglycan biosynthe- yomC 2263 N-acetylmuramoyl-L-alanine amidase narK 3833 nitrite extrusion protein
sis) ypdQ 2310 cell wall enzyme nasA 363 nitrate transporter
dacB 2424 penicillin-binding protein 5* (D-alanyl-D-alanine ypfP 2306 cell wall synthesis natA 296 Na+ ABC transporter (extrusion) (ATP-binding
carboxypeptidase) (peptidoglycan biosynthe- ypjH 2357 lipopolysaccharide biosynthesis-related protein protein)
sis) (spore cortex) yqeE 2649 N-acetylmuramoyl-L-alanine amidase natB 297 Na+ ABC transporter (extrusion) (membrane
dacF 2445 penicillin-binding protein (D-alanyl-D-alanine car- yqfY 2588 peptidoglycan acetylation protein)
boxypeptidase) (peptidoglycan biosynthesis) yqiI 2515 N-acetylmuramoyl-L-alanine amidase nrgA 3756 ammonium transporter
ddlA 508 D-alanyl-D-alanine ligase A (peptidoglycan yrhL 2771 acyltransferase nupC 4050 pyrimidine-nucleoside transport protein
biosynthesis) yrrR 2791 penicillin-binding protein oppA 1219 oligopeptide ABC transporter (binding protein)
dltA 3951 D-alanyl-D-alanine carrier protein ligase (lipotei- yrvJ 2818 N-acetylmuramoyl-L-alanine amidase (initiation of sporulation, competence develop-
choic acid biosynthesis) ytcC 3157 lipopolysaccharide N-acetylglucosaminyltrans- ment)
dltB 3953 D-alanine transfer from Dcp to undecaprenol- ferase oppB 1221 oligopeptide ABC transporter (permease) (initia-
phosphate (lipoteichoic acid biosynthesis) ytkC 3135 autolytic amidase tion of sporulation, competence development)
dltC 3954 D-alanine carrier protein (lipoteichoic acid ytxN 3161 lipopolysaccharide N-acetylglucosaminyltrans- oppC 1222 oligopeptide ABC transporter (permease) (initia-
biosynthesis) ferase tion of sporulation, competence development)
dltD 3954 D-alanine transfer from undecaprenol-phos- yubE 3191 N-acetylmuramoyl-L-alanine amidase oppD 1223 oligopeptide ABC transporter (ATP-binding pro-
phate to the poly(glycerophosphate) chain yvcE 3575 cell wall-binding protein tein) (initiation of sporulation, competence
(lipoteichoic acid biosynthesis) ywhE 3849 penicillin-binding protein development)
dltE 3955 involved in lipoteichoic acid biosynthesis ywtD 3697 murein hydrolase oppF 1224 oligopeptide ABC transporter (ATP-binding pro-
gcaD 56 UDP-N-acetylglucosamine pyrophosphorylase tein) (initiation of sporulation, competence
(peptidoglycan and lipopolysaccharide biosyn- I.2 TRANSPORT/BINDING PROTEINS AND development)
thesis) LIPOPROTEINS ............................................................... 381 opuAA 321 glycine betaine ABC transporter (ATP-binding
ggaA 3670 galactosamine-containing minor teichoic acid aapA 2766 amino acid permease protein) (osmoprotection)
biosynthesis alsT 1938 amino acid carrier protein opuAB 322 glycine betaine ABC transporter (permease)
ggaB 3669 galactosamine-containing minor teichoic acid amyC 3099 maltose transport protein (osmoprotection)
biosynthesis amyD 3098 sugar transport opuAC 323 glycine betaine ABC transporter (glycine
gtaB 3665 UTP-glucose-1-phosphate uridylyltransferase appA 1213 oligopeptide ABC transporter (oligopeptide- betaine-binding protein) (osmoprotection)
lytB 3662 modifier protein of major autolysin LytC binding protein) opuBA 3462 choline ABC transporter (ATP-binding protein)
(CWBP76) appB 1215 oligopeptide ABC transporter (permease) (osmoprotection)
lytC 3660 N-acetylmuramoyl-L-alanine amidase (major appC 1216 oligopeptide ABC transporter (permease) opuBB 3461 choline ABC transporter (membrane protein)
autolysin) (CWBP49) appD 1211 oligopeptide ABC transporter (ATP-binding pro- (osmoprotection)
lytD 3687 N-acetylglucosaminidase (major autolysin) tein) opuBC 3460 choline ABC transporter (choline-binding pro-
(CWBP90) appF 1212 oligopeptide ABC transporter (ATP-binding pro- tein) (osmoprotection)
lytE 1018 cell wall lytic activity (CWBP33) tein) opuBD 3460 choline ABC transporter (membrane protein)
mbl 3747 MreB-like protein araE 3485 L-arabinose transport (permease) (osmoprotection)
mraY 1587 phospho-N-acetylmuramoyl-pentapeptide araN 2942 L-arabinose transport (sugar-binding protein) opuCA 3470 glycine betaine/carnitine/choline ABC trans-
transferase (peptidoglycan biosynthesis) araP 2941 L-arabinose transport (integral membrane pro- porter (ATP-binding protein) (osmoprotection)
mreB 2861 cell-shape determining protein tein) opuCB 3469 glycine betaine/carnitine/choline ABC trans-
mreBH 1517 cell-shape determining protein araQ 2940 L-arabinose transport (integral membrane pro- porter (membrane protein) (osmoprotection)
mreC 2860 cell-shape determining protein tein) opuCC 3468 glycine betaine/carnitine/choline ABC trans-
mreD 2859 cell-shape determining protein azlC 2729 branched-chain amino acid transport porter (osmoprotectant-binding protein) (osmo-
murA 3778 UDP-N-acetylglucosamine 1-carboxyvinyltrans- azlD 2728 branched-chain amino acid transport protection)
ferase (peptidoglycan biosynthesis) bglP 4034 phosphotransferase system (PTS) β-glucoside- opuCD 3467 glycine betaine/carnitine/choline ABC trans-
murB 1592 UDP-N-acetylenolpyruvoylglucosamine reduc- specific enzyme IIABC component porter (membrane protein) (osmoprotection)
tase (peptidoglycan biosynthesis) blt 2716 multidrug-efflux transporter opuD 3076 glycine betaine transporter (osmoprotection)
murC 3049 UDP-N-acetylmuramate-alanine ligase (peptido- bmr 2494 multidrug-efflux transporter opuE 728 proline transporter (osmoprotection)
glycan biosynthesis) braB 3027 branched-chain amino acid transporter pbuX 2319 xanthine permease
murD 1588 UDP-N-acetylmuramoylalanine-D-glutamate lig- brnQ 2728 branched-chain amino acid transporter ptsG 1457 phosphotransferase system (PTS) glucose-
ase (peptidoglycan biosynthesis) citM 834 secondary transporter of the Mg2+/citrate com- specific enzyme IIABC component
murE 1586 UDP-N-acetylmuramoylalanine-D-gluta- plex ptsI 1459 phosphotransferase system (PTS) enzyme I
mate-2,6-diaminopimelate ligase (peptidoglycan csbX 2838 α-ketoglutarate permease (general energy coupling protein of the PTS)
biosynthesis) cydC 3976 ABC transporter required for expression of pyrP 1618 uracil permease (pyrimidine biosynthesis)
murF 509 UDP-N-acetylmuramoylalanyl- cytochrome bd (ATP-binding protein) rbsA 3703 ribose ABC transporter (ATP-binding protein)
D-glutamyl-2,6-diaminopimelate-D-alanyl- cydD 3974 ABC transporter required for expression of rbsB 3705 ribose ABC transporter (ribose-binding protein)
D-alanyl ligase (peptidoglycan biosynthesis) cytochrome bd (ATP-binding protein) rbsC 3704 ribose ABC transporter (permease)
murG 1591 UDP-N-acetylglucosamine-N-acetylmuramyl- czcD 2724 cation-efflux system membrane protein rbsD 3702 ribose ABC transporter (membrane protein)
(pentapeptide)pyrophosphoryl-undecaprenol dppA 1360 dipeptide ABC transporter (sporulation) rocC 3876 amino acid permease (arginine and ornithine
N-acetylglucosamine transferase (peptidogly- dppB 1361 dipeptide ABC transporter (permease) (sporula- utilization)
can biosynthesis) tion) rocE 4143 amino acid permease (arginine and ornithine
murZ 3806 UDP-N-acetylglucosamine 1-carboxyvinyltrans- dppC 1362 dipeptide ABC transporter (permease) (sporula- utilization)
ferase (peptidoglycan biosynthesis) tion) sacP 3904 phosphotransferase system (PTS) sucrose-
pbp 1999 penicillin-binding protein (peptidoglycan biosyn- dppD 1363 dipeptide ABC transporter (ATP-binding protein) specific enzyme IIBC component
thesis) (sporulation) slp 1533 small peptidoglycan-associated lipoprotein
pbpA 2583 penicillin-binding protein 2A (peptidoglycan dppE 1364 dipeptide ABC transporter (dipeptide-binding sunT 2269 sublancin 168 lantibiotic transporter
biosynthesis) (spore outgrowth) protein) (sporulation) tetB 4188 tetracycline resistance protein
pbpB 1581 penicillin-binding protein 2B (peptidoglycan ebrA 1865 multidrug resistance protein treP 850 phosphotransferase system (PTS) trehalose-
biosynthesis) (cell-division septum) ebrB 1864 multidrug resistance protein specific enzyme IIBC component
pbpC 463 penicillin-binding protein 3 (peptidoglycan ecsA 1077 ABC transporter (ATP-binding protein) trkA 2723 potassium uptake
biosynthesis) ecsB 1078 ABC transporter (membrane protein) yabM 65 amino acid transporter
pbpD 3233 penicillin-binding protein 4 (peptidoglycan expZ 606 ATP-binding transport protein ybaE 151 ABC transporter (ATP-binding protein)
biosynthesis) feuA 183 iron-uptake system (binding protein) ybbF 191 sucrose phosphotransferase enzyme II
pbpE 3535 penicillin-binding protein 4* (peptidoglycan feuB 182 iron-uptake system (integral membrane protein) ybcL 212 chloramphenicol resistance protein
biosynthesis) (spore cortex) feuC 181 iron-uptake system (integral membrane protein) ybdA 217 ABC transporter (binding protein)
pbpF 1083 penicillin-binding protein 1A (peptidoglycan fhuB 3417 ferrichrome ABC transporter (permease) ybdB 218 ABC transporter (permease)
biosynthesis) (germination) fhuC 3415 ferrichrome ABC transporter (ATP-binding pro- ybeC 231 amino acid transporter
pbpX 1765 penicillin-binding protein (peptidoglycan biosyn- tein) ybfS 257 phosphotransferase system enzyme II
thesis) fhuD 3418 ferrichrome ABC transporter (ferrichrome-bind- ybgF 262 histidine permease
ponA 2341 penicillin-binding proteins 1A/1B (peptidogly- ing protein) ybgH 264 sodium/proton-dependent alanine transporter
can biosynthesis) fhuG 3416 ferrichrome ABC transporter (permease) ybxA 150 ABC transporter (ATP-binding protein)
racE 2903 glutamate racemase (peptidoglycan biosynthe- fruA 1509 phosphotransferase system (PTS) fructose- ybxG 227 amino acid permease
sis) specific enzyme IIBC component ycbE 270 glucarate transporter
spoVD 1584 penicillin-binding protein (peptidoglycan biosyn- gabP 686 γ-aminobutyrate permease ycbK 277 efflux system
thesis) (spore cortex) glnH 2802 glutamine ABC transporter (glutamine-binding) ycbN 280 ABC transporter (ATP-binding protein)
tagA 3680 involved in polyglycerol phosphate teichoic acid glnM 2803 glutamine ABC transporter (membrane protein) yccK 298 ion channel
biosynthesis glnP 2804 glutamine ABC transporter (membrane protein) ycdI 309 ABC transporter (ATP-binding protein)
tagB 3681 involved in polyglycerol phosphate teichoic acid glnQ 2802 glutamine ABC transporter (ATP-binding pro- yceI 317 transporter
biosynthesis tein) yceJ 320 multidrug-efflux transporter
tagC 3682 involved in polyglycerol phosphate teichoic acid glpF 1002 glycerol uptake facilitator ycgH 337 amino acid transporter
biosynthesis glpT 235 glycerol-3-phosphate permease ycgO 347 proline permease
tagD 3680 glycerol-3-phosphate cytidylyltransferase (tei- gltP 255 H+/glutamate symport protein yckA 368 amino acid ABC transporter (permease)
choic acid biosynthesis) gltT 1097 H+/Na+-glutamate symport protein yckB 368 amino acid ABC transporter (binding protein)
tagE 3679 UDP-glucose:polyglycerol phosphate glucosyl- glvC 892 phosphotransferase system (PTS) arbutin-like yckI 410 glutamine ABC transporter (ATP-binding pro-
transferase (teichoic acid biosynthesis) enzyme IIBC component tein)
tagF 3677 CDP-glycerol:polyglycerol phosphate glycero- gntP 4115 gluconate permease (gluconate utilization) yckJ 410 glutamine ABC transporter (permease)
phosphotransferase (teichoic acid biosynthe- hisP 3004 histidine transport protein (ATP-binding protein) yckK 411 glutamine ABC transporter (glutamine-binding
sis) hutM 4046 histidine permease protein)
tagG 3675 teichoic acid translocation (permease) iolF 4077 inositol transport protein yclF 417 di-tripeptide ABC transporter (membrane pro-
tagH 3674 teichoic acid translocation (ATP-binding protein) kdgT 2322 2-keto-3-deoxygluconate permease (pectin uti- tein)
tagO 3649 teichoic acid linkage unit synthesis lization) yclH 424 ABC transporter (permease)
tuaA 3658 biosynthesis of teichuronic acid lctP 330 L-lactate permease yclI 426 transporter
tuaB 3657 biosynthesis of teichuronic acid levD 2762 phosphotransferase system (PTS) fructose- yclN 432 ferrichrome ABC transporter (permease)
tuaC 3656 biosynthesis of teichuronic acid specific enzyme IIA component yclO 433 ferrichrome ABC transporter (permease)
tuaD 3655 biosynthesis of teichuronic acid (UDP-glucose levE 2762 phosphotransferase system (PTS) fructose- yclP 434 ferrichrome ABC transporter (ATP-binding pro-
6-dehydrogenase) specific enzyme IIB component tein)
tuaE 3653 biosynthesis of teichuronic acid levF 2761 phosphotransferase system (PTS) fructose- yclQ 435 ferrichrome ABC transporter (binding protein)
tuaF 3652 biosynthesis of teichuronic acid specific enzyme IIC component ycnB 437 multidrug resistance protein
tuaG 3651 biosynthesis of teichuronic acid levG 2760 phosphotransferase system (PTS) fructose- ycnJ 448 copper export protein
tuaH 3650 biosynthesis of teichuronic acid specific enzyme IID component ycsG 457 branched chain amino acids transporter
wapA 4029 cell wall-associated protein precursor licA 3959 phosphotransferase system (PTS) lichenan- ydbA 493 ABC transporter (binding protein)
(CWBP200, 105, 62) specific enzyme IIA component ydbE 497 C4-dicarboxylate binding protein
wprA 1153 cell wall-associated protein precursor (CWBP23 licB 3961 phosphotransferase system (PTS) lichenan- ydbH 500 C4-dicarboxylate transport protein
and serine protease CWBP52) specific enzyme IIB component ydbJ 502 ABC transporter (ATP-binding protein)
xlyA 1347 N-acetylmuramoyl-L-alanine amidase (PBSX licC 3960 phosphotransferase system (PTS) lichenan- ydeG 566 metabolite transport protein

Nature © Macmillan Publishers Ltd 1997


ydeR 578 antibiotic resistance protein ytmJ 3007 amino acid ABC transporter (binding protein) yclK 427 two-component sensor histidine kinase [YclJ]
ydfA 580 arsenical pump membrane protein ytmK 3006 amino acid ABC transporter (binding protein) ydbF 497 two-component sensor histidine kinase [YdbG]
ydfJ 589 antibiotic transport-associated protein ytmL 3006 amino acid ABC transporter (permease) ydfH 587 two-component sensor histidine kinase [YdfI]
ydfL 595 multidrug-efflux transporter regulator ytmM 3005 amino acid ABC transporter (permease) yesM 758 two-component sensor histidine kinase [YesN]
ydfM 596 cation efflux system ytnA 3125 proline permease yfiJ 903 two-component sensor histidine kinase [YfiK]
ydfO 597 ABC transporter (binding protein) ytrB 3118 ABC transporter (ATP-binding protein) yhcY 1008 two-component sensor histidine kinase [YhcZ]
ydgF 608 amino acid ABC transporter (permease) ytrE 3115 ABC transporter (ATP-binding protein) yhjL 1129 sensory transduction pleiotropic regulatory pro-
ydgH 609 transporter ytsC 3111 ABC transporter (ATP-binding protein) tein
ydgK 613 bicyclomycin resistance protein ytsD 3110 ABC transporter (permease) ykoH 1392 two-component sensor histidine kinase [YkoG]
ydhL 626 chloramphenicol resistance protein yttB 3108 multidrug resistance protein ykrQ 1419 two-component sensor histidine kinase
ydhM 626 cellobiose phosphotransferase system enzyme II yubD 3192 multidrug resistance protein ykvD 1432 two-component sensor histidine kinase
ydhN 627 cellobiose phosphotransferase system enzyme II yubG 3188 Na+-transporting ATP synthase yocF 2090 two-component sensor histidine kinase [YocG]
ydhO 627 cellobiose phosphotransferase system enzyme II yufN 3239 ABC transporter (lipoprotein) yrkQ 2704 two-component sensor histidine kinase [YrkP]
ydiF 646 ABC transporter (ATP-binding protein) yufO 3240 ABC transporter (ATP-binding protein) ytrP 3035 two-component sensor histidine kinase
ydjD 668 H+-symporter yufR 3244 organic acid transport protein ytsB 3112 two-component sensor histidine kinase [YtsA]
ydjK 676 sugar transporter yufU 3248 Na+/H+ antiporter yufL 3236 two-component sensor histidine kinase [YufM]
yeaB 687 cation efflux system membrane protein yufV 3249 Na+/H+ antiporter yvcQ 3566 two-component sensor histidine kinase [YvcP]
yecA 712 amino acid permease yugO 3218 potassium channel protein yvfT 3497 two-component sensor histidine kinase [YvfU]
yesO 761 sugar-binding protein yunJ 3330 purine permease yvqB 3385 two-component sensor histidine kinase [YvqA]
yesP 762 lactose permease yunK 3331 purine permease yvqE 3395 two-component sensor histidine kinase [YvqC]
yesQ 763 lactose permease yurJ 3345 multiple sugar ABC transporter (ATP-binding pro- yvrG 3407 two-component sensor histidine kinase [YvrH]
yfhA 921 iron(III) dicitrate transport permease tein) ywpD 3741 two-component sensor histidine kinase
yfhI 926 antibiotic resistance protein yurM 3348 sugar permease yxdK 4071 two-component sensor histidine kinase [YxdJ]
yfiB 893 ABC transporter (ATP-binding protein) yurN 3349 sugar permease yxjM 3992 two-component sensor histidine kinase [YxjL]
yfiC 895 ABC transporter (ATP-binding protein) yurO 3350 multiple sugar-binding protein yycG 4153 two-component sensor histidine kinase [YycF]
yfiG 900 metabolite transport protein yurY 3360 ABC transporter (ATP-binding protein)
yfiL 905 ABC transporter (ATP-binding protein) yusC 3363 ABC transporter (ATP-binding protein) I.4 MEMBRANE BIOENERGETICS (ELECTRON
yfiM 906 ABC transporter (ATP-binding protein) yusP 3374 multidrug-efflux transporter TRANSPORT CHAIN AND ATP
yfiN 907 ABC transporter (ATP-binding protein) yusV 3379 iron(III) dicitrate transport permease SYNTHASE) ............................................................................78
yfiS 913 multidrug resistance protein yutK 3307 Na+/nucleoside cotransporter atpA 3784 ATP synthase (subunit α)
yfiU 916 multidrug-efflux transporter yuxJ 3232 multidrug-efflux transporter atpB 3787 ATP synthase (subunit a)
yfiY 920 iron(III) dicitrate transport permease yvaE 3448 multidrug-efflux transporter atpC 3781 ATP synthase (subunit ε)
yfiZ 920 iron(III) dicitrate transport permease yvbW 3490 amino acid permease atpD 3782 ATP synthase (subunit β)
yfjQ 872 divalent cation transport protein yvcC 3579 ABC transporter (ATP-binding protein) atpE 3786 ATP synthase (subunit c)
yfkE 865 H+/Ca2+ exchanger yvcR 3565 ABC transporter (ATP-binding protein) atpF 3786 ATP synthase (subunit b)
yfkF 865 multidrug-efflux transporter yvcS 3565 ABC transporter (permease) atpG 3783 ATP synthase (subunit γ)
yfkH 862 transporter yvdB 3561 transporter atpH 3785 ATP synthase (subunit δ)
yfkL 861 multidrug resistance protein yvdG 3555 maltose/maltodextrin-binding protein atpI 3787 ATP synthase (subunit i)
yflA 844 amino acid carrier protein yvdG 3555 maltose/maltodextrin-binding protein atpI 3787 ATP synthase (subunit i)
yflE 844 anion-binding protein yvdI 3552 maltodextrin transport system permease cccB 3625 cytochrome c551
yflF 840 phosphotransferase system enzyme II yveA 3538 permease ccdA 1922 required for a late step of cytochrome c synthesis
yflS 829 2-oxoglutarate/malate translocator yvfH 3510 L-lactate permease ctaA 1558 cytochrome caa3 oxidase (required for biosynthe-
yfmC 826 ferrichrome ABC transporter (binding protein) yvfK 3508 maltose/maltodextrin-binding protein sis)
yfmD 825 ferrichrome ABC transporter (permease) yvfL 3506 maltodextrin transport system permease ctaB 1559 cytochrome caa3 oxidase (assembly factor)
yfmE 824 ferrichrome ABC transporter (permease) yvfM 3505 maltodextrin transport system permease ctaC 1560 cytochrome caa3 oxidase (subunit II)
yfmF 823 ferrichrome ABC transporter (ATP-binding protein) yvfR 3498 ABC transporter (ATP-binding protein) ctaD 1561 cytochrome caa3 oxidase (subunit I)
yfmM 815 ABC transporter (ATP-binding protein) yvgK 3424 molybdenum-binding protein ctaE 1563 cytochrome caa3 oxidase (subunit III)
yfmO 812 multidrug-efflux transporter yvgL 3424 molybdate-binding protein ctaF 1563 cytochrome caa3 oxidase (subunit IV)
yfmR 809 ABC transporter (ATP-binding protein) yvgM 3425 molybdenum transport permease cydA 3978 cytochrome bd ubiquinol oxidase (subunit I)
yfnA 806 metabolite transporter yvgW 3440 heavy metal-transporting ATPase cydB 3977 cytochrome bd ubiquinol oxidase (subunit II)
ygaD 939 ABC transporter (ATP-binding protein) yvgX 3443 heavy metal-transporting ATPase etfA 2915 electron transfer flavoprotein (α subunit)
ygaL 961 nitrate ABC transporter (binding protein) yvgY 3443 mercuric transport protein etfB 2916 electron transfer flavoprotein (β subunit)
ygaM 963 ABC transporter (permease) yvkA 3618 multidrug-efflux transporter fer 2409 ferredoxin
ygbA 962 ABC transporter (binding lipoprotein) yvmA 3605 transporter hmp 1372 flavohemoglobin
yhaQ 1062 ABC transporter (ATP-binding protein) yvqJ 3399 macrolide-efflux protein narG 3829 nitrate reductase (α subunit)
yhaU 1060 Na+/H+ antiporter yvrA 3402 iron transport system narH 3825 nitrate reductase (β subunit)
yhcA 977 multidrug resistance protein yvrB 3403 iron permease narI 3823 nitrate reductase (γ subunit)
yhcG 981 glycine betaine/L-proline transport yvrC 3403 iron-binding protein narJ 3824 nitrate reductase (protein J)
yhcH 982 ABC transporter (ATP-binding protein) yvrO 3413 amino acid ABC transporter (ATP-binding protein) ndhF 205 NADH dehydrogenase (subunit 5)
yhcJ 984 ABC transporter (binding lipoprotein) yvsH 3420 ABC transporter (amino acid permease) qcrA 2364 menaquinol:cytochrome c oxidoreductase (iron-
yhcL 986 sodium-glutamate symporter ywbA 3938 phosphotransferase system enzyme II sulphur subunit)
yhdG 1023 amino acid transporter ywbF 3933 sugar permease qcrB 2364 menaquinol:cytochrome c oxidoreductase
yhdH 1024 sodium-dependent transporter ywcA 3923 Na+-dependent symport (cytochrome b subunit)
yheH 1047 ABC transporter (ATP-binding protein) ywcJ 3904 nitrite transporter qcrC 2363 menaquinol:cytochrome c oxidoreductase
yheI 1045 ABC transporter (ATP-binding protein) ywfA 3874 chloramphenicol resistance (cytochrome b/c subunit)
yheL 1044 Na+/H+ antiporter ywfF 3869 efflux protein qoxA 3917 cytochrome aa3 quinol oxidase (subunit II)
yhfQ 1107 iron(III) dicitrate-binding protein ywhQ 3837 ABC transporter (ATP-binding protein) qoxB 3916 cytochrome aa3 quinol oxidase (subunit I)
yhjB 1120 metabolite permease ywjA 3821 ABC transporter (ATP-binding protein) qoxC 3914 cytochrome aa3 quinol oxidase (subunit III)
yhjO 1133 multidrug-efflux transporter ywoA 3758 bacteriocin transport permease qoxD 3913 cytochrome aa3 quinol oxidase (subunit IV)
yhjP 1133 transporter binding protein ywoD 3754 transporter resA 2421 essential protein similar to cytochrome c biogene-
yitG 1177 multidrug resistance protein ywoE 3753 permease sis protein
yitZ 1194 multidrug resistance protein ywoG 3749 antibiotic resistance protein resB 2420 essential protein similar to cytochrome c biogene-
yjbQ 1240 Na+/H+ antiporter ywpC 3743 large conductance mechanosensitive channel sis protein
yjdD 1272 fructose phosphotransferase system enzyme II protein resC 2418 essential protein similar to cytochrome c biogene-
yjkB 1296 amino acid ABC transporter (ATP-binding protein) ywrA 3721 chromate transport protein sis protein
yjmB 1301 Na+:galactoside symporter ywrB 3720 chromate transport protein tlp 1930 thioredoxin-like protein
yjmG 1307 hexuronate transporter ywrK 3712 arsenical pump membrane protein trxA 2912 thioredoxin
ykaB 1350 low-affinity inorganic phosphate transporter ywtG 3693 metabolite transport protein trxB 3573 thioredoxin reductase
ykbA 1352 amino acid permease yxaM 4100 antibiotic resistance protein ycgT 352 thioredoxin reductase
ykcA 1353 ABC transporter (binding protein) yxcC 4087 metabolite transport protein ycnD 439 NADPH-flavin oxidoreductase
ykfD 1368 oligopeptide ABC transporter (permease) yxdL 4070 ABC transporter (ATP-binding protein) ydbP 508 thioredoxin
yknU 1499 ABC transporter (ATP-binding protein) yxdM 4069 ABC transporter (permease) ydeQ 576 NAD(P)H oxidoreductase
yknV 1501 ABC transporter (ATP-binding protein) yxeB 4066 ABC transporter (binding protein) ydfQ 598 thioredoxin
yknY 1505 ABC transporter (ATP-binding protein) yxeM 4059 amino acid ABC transporter (binding protein) ydgI 613 NADH dehydrogenase
ykoD 1390 cation ABC transporter (ATP-binding protein) yxeN 4058 amino acid ABC transporter (permease) yfkO 854 NAD(P)H-flavin oxidoreductase
ykoK 1395 Mg2+ transporter yxeO 4058 amino acid ABC transporter (ATP-binding protein) yfmJ 818 quinone oxidoreductase
ykpA 1512 ABC transporter (ATP-binding protein) yxeR 4054 ethanolamine transporter yjdK 1280 cytochrome c oxidase assembly factor
ykrM 1416 Na+-transporting ATP synthase yxiQ 4009 Mg2+/citrate complex transporter yjlD 1299 NADH dehydrogenase
ykuC 1476 macrolide-efflux protein yxjA 4005 pyrimidine nucleoside transport ykuN 1486 flavodoxin
ykvW 1451 heavy metal-transporting ATPase yxkJ 3979 metabolite-sodium symport ykuP 1488 sulfite reductase
ylmA 1606 ABC transporter (ATP-binding protein) yxlA 3970 purine-cytosine permease ykuU 1492 2-cys peroxiredoxin
ylnA 1630 anion permease yxlF 3968 ABC transporter (ATP-binding protein) ykvV 1450 thioredoxin
yloB 1637 calcium-transporting ATPase yxlH 3966 multidrug-efflux transporter yneN 1929 thiol:disulfide interchange protein
ynaJ 1887 H+-symporter yyaJ 4194 transporter yojN 2114 nitric-oxide reductase
yncC 1896 metabolite transport protein yybF 4180 antibiotic resistance protein yolI 2267 thioredoxin
yocN 2098 permease yybJ 4175 ABC transporter (ATP-binding protein) yosR 2159 thioredoxin
yocR 2106 sodium-dependent transporter yybL 4174 ABC transporter (permease) ypdA 2401 thioredoxin reductase
yocS 2106 sodium-dependent transporter yybO 4169 ABC transporter (permease) yqiG 2516 NADH-dependent flavin oxidoreductase
yodE 2129 aromatic metabolite transporter yycB 4159 ABC transporter (permease) yqjM 2475 NADH-dependent flavin oxidoreductase
yodF 2130 proline permease yydI 4125 ABC transporter (ATP-binding protein) yrkL 2708 NAD(P)H oxidoreductase
yojA 2125 gluconate permease yyzE 4122 phosphotransferase system enzyme II ythA 3139 cytochrome d oxidase subunit
ypqE 2337 phosphotransferase system enzyme II ytpP 3054 thioredoxin H1
yqeW 2620 Na+/Pi cotransporter I.3 SENSORS (SIGNAL TRANSDUCTION) .......................... 38 ytrC 3117 cytochrome c oxidase subunit
yqgG 2581 phosphate ABC transporter (binding protein) cheA 1712 two-component sensor histidine kinase ytrD 3116 cytochrome c oxidase subunit
yqgH 2580 phosphate ABC transporter (permease) [CheB/CheY] chemotactic signal modulator yufD 3249 NADH dehydrogenase (ubiquinone)
yqgI 2579 phosphate ABC transporter (permease) citS 830 two-component sensor histidine kinase [CitT] yufT 3246 NADH dehydrogenase
yqgJ 2578 phosphate ABC transporter (ATP-binding protein) comP 3255 two-component sensor histidine kinase [ComA] yumB 3300 NADH dehydrogenase
yqgK 2577 phosphate ABC transporter (ATP-binding protein) involved in early competence yumC 3301 thioredoxin reductase
yqiH 2515 lipoprotein degS 3646 two-component sensor histidine kinase [DegU] yusE 3364 thioredoxin
yqiX 2492 amino acid ABC transporter (binding protein) involved in degradative enzyme and competence yutJ 3308 NADH dehydrogenase
yqiY 2491 amino acid ABC transporter (permease) regulation yvaB 3445 NAD(P)H dehydrogenase (quinone)
yqiZ 2491 amino acid ABC transporter (ATP-binding protein) kinA 1469 two-component sensor histidine kinase [Spo0F] ywcG 3911 NADPH-flavin oxidoreductase
yqjV 2466 multidrug resistance protein involved in the initiation of sporulation ywhN 3840 ubiquinol-cytochrome c reductase
yqkI 2453 Na+/H+ antiporter kinB 3229 two-component sensor histidine kinase [Spo0F] ywrO 3708 NAD(P)H oxidoreductase
yraO 2745 citrate transporter involved in the initiation of sporulation
yrbD 2841 sodium/proton-dependent alanine carrier protein kinC 1518 two-component sensor histidine kinase [Spo0A] I.5 MOBILITY AND CHEMOTAXIS ......................................... 55
ytbD 2968 antibiotic resistance protein involved in the initiation of sporulation (phospho- cheC 1715 inhibition of CheR-mediated methylation of
ytcP 3087 ABC transporter (permease) relay-independent) methyl-accepting chemotaxis proteins
ytcQ 3086 lipoprotein lytS 2957 two-component sensor histidine kinase [LytT] cheD 1715 required for methylation of methyl-accepting
yteQ 3082 sugar transport protein involved in the rate of autolysis chemotaxis proteins by CheR
ytgA 3145 ABC transporter (membrane protein) phoR 2977 two-component sensor histidine kinase [PhoP] cheR 2380 methyl-accepting chemotaxis proteins methyl-
ytgB 3144 ABC transporter (ATP-binding protein) involved in phosphate regulation transferase
ytgC 3143 ABC transporter (membrane protein) resE 2416 two-component sensor histidine kinase [ResD] cheV 1473 modulation of CheA activity in response to attrac-
ythP 3071 ABC transporter (ATP-binding protein) involved in aerobic and anaerobic respiration tants (CheW and CheY similar domains)
ytlC 3132 anion transport ABC transporter (ATP-binding pro- ybdK 222 two-component sensor histidine kinase [YbdJ] cheW 1714 modulation of CheA activity in response to attrac-
tein) ycbA 266 two-component sensor histidine kinase [YcbB] tants
ytlD 3133 ABC transporter (permease) ycbM 279 two-component sensor histidine kinase [YcbL] flgB 1691 flagellar basal-body rod protein
ytlP 3065 ABC transporter (permease) yccG 295 two-component sensor histidine kinase [YccH] flgC 1691 flagellar basal-body rod protein
flgE 1700 flagellar hook protein cotX 1251 spore coat protein (insoluble fraction) SASP)
flgK 3639 flagellar hook-associated protein 1 (HAP1) cotY 1250 spore coat protein (insoluble fraction) sspE 937 small acid-soluble spore protein (major γ-type
flgL 3637 flagellar hook-associated protein 3 (HAP3) cotZ 1249 spore coat protein (insoluble fraction) SASP)
flgM 3640 flagellin synthesis regulatory protein (anti-sigma csgA 228 sporulation-specific SASP protein sspF 53 small acid-soluble spore protein (minor α/β-type
factor [σD]) jag 4213 SpoIIIJ-associated protein SASP)
flhA 1707 flagella-associated protein kapB 3230 activator of KinB in the initiation of sporulation usd 3748 required for translation of spoIIID
flhB 1706 flagella-associated protein kapD 3232 inhibitor of the KinA pathway to sporulation yknT 1495 sporulation protein σE-controlled
flhF 1709 flagella-associated protein kbaA 159 activation of the KinB signaling pathway to sporu- ykvU 1449 spore cortex membrane protein
flhO 3746 flagellar basal-body rod protein lation ynzH 1901 spore coat protein
flhP 3745 flagellar hook-basal body protein obg 2853 GTP-binding protein involved in initiation of sporu- yobW 2083 membrane protein σK-controlled
fliD 3633 flagellar hook-associated protein 2 (HAP2) lation (Spo0A activation) yqgT 2568 γ-D-glutamyl-L-diamino acid endopeptidase I
fliE 1692 flagellar hook-basal body protein phrA 1316 phosphatase (RapA) inhibitor (imported by Opp) yqjG 2483 lipoprotein SpoIIIJ-like
fliF 1692 flagellar basal-body M-ring protein phrC 430 phosphatase (RapC) regulator / competence and yraD 2754 spore coat protein
fliG 1694 flagellar motor switch protein sporulation stimulating factor (CSF) yraE 2754 spore coat protein
fliH 1695 flagellar assembly protein phrE 2660 phosphatase (RapE) regulator yraF 2752 spore coat protein
fliI 1695 flagellar-specific ATP synthase phrF 3846 phosphatase (RapF) regulator yraG 2752 spore coat protein
fliJ 1697 flagellar protein required for formation of basal phrG 4141 phosphatase (RapG) regulator yrbA 2845 spore coat protein
body phrI 548 phosphatase (RapI) regulator yrbB 2844 spore coat protein
fliK 1698 flagellar hook-length control phrK 2063 phosphatase (RapK) regulator yrbC 2843 spore coat protein
fliL 1701 flagellar protein required for flagellar formation rapA 1315 response regulator aspartate phosphatase ytaA 3161 spore coat protein
fliM 1701 flagellar motor switch protein [Spo0F~P] ytgP 3074 spore cortex protein
fliP 1704 flagellar protein required for flagellar formation
fliQ 1705 flagellar protein required for flagellar formation
fliR 1705 flagellar protein required for flagellar formation
fliS 3632 flagellar protein
fliT 3632 flagellar protein
fliY 1702 flagellar motor switch protein
fliZ 1704 flagellar protein required for flagellar formation
hag 3635 flagellin protein
mcpA 3207 methyl-accepting chemotaxis protein (glucose and α-methyl-glucoside)
mcpB 3212 methyl-accepting chemotaxis protein (asparagine, glutamine and histidine)
mcpC 1463 methyl-accepting chemotaxis protein (cysteine, proline, threonine, glycine, serine, lysine, valine and arginine)
motA 1435 motility protein (flagellar motor rotation)
motB 1434 motility protein (flagellar motor rotation)
tlpA 3209 methyl-accepting chemotaxis protein
tlpB 3205 methyl-accepting chemotaxis protein
tlpC 374 methyl-accepting chemotaxis protein
yfmS 808 methyl-accepting chemotaxis protein
yhfV 1113 methyl-accepting chemotaxis protein
ylqH 1679 flagellar biosynthetic protein
ylxG 1699 flagellar hook assembly protein
ylxH 1710 flagellar biosynthesis switch protein
yoaH 2030 methyl-accepting chemotaxis protein
ytxD 3043 flagellar motor apparatus
ytxE 3042 motility protein
yvaQ 3457 transmembrane receptor taxis protein
yvyC 3634 flagellar protein
yvyF 3640 flagellar protein
yvyG 3639 flagellar protein
yvzB 3609 flagellin

I.6 PROTEIN SECRETION 18
csaA 2079 chaperonin involved in protein secretion
ffh 1672 signal recognition particle
ftsY 1670 signal recognition particle
lsp 1616 signal peptidase II
lytA 3662 secretion of major autolysin LytC
prsA 1071 protein secretion (post-translocation chaperonin)
secA 3630 preprotein translocase subunit
secE 118 preprotein translocase subunit
secF 2828 protein-export membrane protein (product also similar to SecD of E. coli)
secY 145 preprotein translocase subunit
sipS 2432 signal peptidase I
sipT 1511 signal peptidase I
sipU 454 signal peptidase I
sipV 1122 signal peptidase I
sipW 2554 signal peptidase I
yaaT 42 signal peptidase II
yacD 81 protein secretion PrsA homologue
yobE 2057 general secretion pathway protein

I.7 CELL DIVISION 21
divIB 1593 cell-division initiation protein (septum formation)
divIC 69 cell-division initiation protein (septum formation)
divIVA 1612 cell-division initiation protein (septum placement)
ftsA 1596 cell-division protein (septum formation)
ftsE 3625 cell-division ATP-binding protein
ftsH 77 cell-division protein / general stress protein (class III heat-shock)
ftsL 1581 cell-division protein (septum formation)
ftsX 3624 cell-division protein
ftsZ 1597 cell-division initiation protein (septum formation)
gid 1685 glucose-inhibited division protein
gidA 4211 glucose-inhibited division protein
gidB 4209 glucose-inhibited division protein
maf 2862 septum formation
minC 2859 cell-division inhibitor (septum placement)
minD 2858 cell-division inhibitor (septum placement) (ATPase activator of MinC)
yacA 75 cell-cycle protein
yfhF 925 cell-division inhibitor
yjoB 1314 cell-division protein FtsH homologue
ylaO 1552 cell-division protein
ylmH 1611 cell-division protein
ywcF 3912 cell-division protein

I.8 SPORULATION 139
bofA 30 inhibitor of the pro-σK processing machinery
bofC 2837 forespore regulator of the σK checkpoint
cgeA 2148 maturation of the outermost layer of the spore
cgeB 2148 maturation of the outermost layer of the spore
cgeC 2148 maturation of the outermost layer of the spore
cgeD 2147 maturation of the outermost layer of the spore
cgeE 2146 maturation of the outermost layer of the spore
cotA 685 spore coat protein (outer)
cotB 3715 spore coat protein (outer)
cotC 1905 spore coat protein (outer)
cotD 2332 spore coat protein (inner)
cotE 1774 spore coat protein (outer)
cotF 4166 spore coat protein
cotG 3716 spore coat protein
cotH 3716 spore coat protein (inner)
cotJA 755 polypeptide composition of the spore coat
cotJB 756 polypeptide composition of the spore coat
cotJC 756 polypeptide composition of the spore coat
cotK 1926 spore coat protein
cotL 1926 spore coat protein
cotM 1925 spore coat protein (outer)
cotN 2553 spore coat-associated protein
cotS 3160 spore coat protein
cotT 1280 spore coat protein (inner)
cotV 1251 spore coat protein (insoluble fraction)
cotW 1251 spore coat protein (insoluble fraction)

rapB 3771 response regulator aspartate phosphatase [Spo0F~P]
rapC 428 response regulator aspartate phosphatase
rapD 3743 response regulator aspartate phosphatase
rapE 2658 response regulator aspartate phosphatase
rapF 3845 response regulator aspartate phosphatase
rapG 4139 response regulator aspartate phosphatase
rapH 750 response regulator aspartate phosphatase
rapI 547 response regulator aspartate phosphatase
rapJ 304 response regulator aspartate phosphatase
rapK 2061 response regulator aspartate phosphatase
sinI 2552 antagonist of SinR
soj 4206 centromere-like function involved in forespore chromosome partitioning / inhibition of Spo0A activation
splB 1461 spore photoproduct lyase
spmA 2423 spore maturation protein (spore core dehydration)
spmB 2422 spore maturation protein (spore core dehydration)
spo0B 2854 sporulation initiation phosphoprotein (part of phosphorelay: Spo0F~P->Spo0B~P->Spo0A~P)
spo0E 1430 negative sporulation regulatory phosphatase [Spo0A~P]
spo0J 4206 chromosome positioning near the pole and transport through the polar septum / antagonist of Soj
spoIIAA 2444 anti-anti-sigma factor [SpoIIAB]
spoIIAB 2444 anti-sigma factor [σF(SpoIIAC)] and serine kinase [SpoIIAA]
spoIIB 2864 endospore development (oligosporogenous mutation)
spoIID 3777 required for complete dissolution of the asymmetric septum
spoIIE 71 serine phosphatase [SpoIIAA~P] (σF activation) / asymmetric septum formation
spoIIGA 1603 protease (processing of pro-σE to active σE)
spoIIIAA 2537 mutants block sporulation after engulfment
spoIIIAB 2536 mutants block sporulation after engulfment
spoIIIAC 2535 mutants block sporulation after engulfment
spoIIIAD 2535 mutants block sporulation after engulfment
spoIIIAE 2535 mutants block sporulation after engulfment
spoIIIAF 2534 mutants block sporulation after engulfment
spoIIIAG 2533 mutants block sporulation after engulfment
spoIIIAH 2532 mutants block sporulation after engulfment
spoIIIE 1752 DNA translocase required for chromosome partitioning through the septum into the forespore
spoIIIJ 4214 essential for σG activity at stage III
spoIIM 2450 required for dissolution of the septal cell wall
spoIIP 2634 required for dissolution of the septal cell wall
spoIIQ 3760 required for completion of engulfment
spoIIR 3794 required for processing of pro-σE
spoIISA 1349 lethal when synthesized during vegetative growth in the absence of SpoIISB
spoIISB 1348 disruption blocks sporulation after septum formation
spoIVA 2387 required for proper spore cortex formation and coat assembly
spoIVB 2520 intercompartmental signalling of pro-σK processing/activation in the mother-cell
spoIVCA 2654 site-specific DNA recombinase required for creating the sigK gene (excision of the skin element)
spoIVFA 2857 inhibitor of SpoIVFB
spoIVFB 2856 protease (processing of pro-σK to active σK)
spoVAA 2443 mutants lead to the production of immature spores
spoVAB 2442 mutants lead to the production of immature spores
spoVAC 2441 mutants lead to the production of immature spores
spoVAD 2441 mutants lead to the production of immature spores
spoVAE 2440 mutants lead to the production of immature spores
spoVAF 2439 mutants lead to the production of immature spores
spoVB 2829 involved in spore cortex synthesis
spoVC 60 thermosensitive mutant blocks spore coat formation
spoVE 1590 required for spore cortex synthesis
spoVFA 1744 dipicolinate synthase subunit A
spoVFB 1745 dipicolinate synthase subunit B
spoVG 56 required for spore cortex synthesis
spoVID 2872 required for assembly of the spore coat
spoVK 1873 disruption leads to the production of immature spores
spoVM 1655 required for normal spore cortex and coat synthesis
spoVR 1015 involved in spore cortex synthesis
spoVS 1769 required for dehydration of the spore core and assembly of the coat
spsA 3892 spore coat polysaccharide synthesis
spsB 3891 spore coat polysaccharide synthesis
spsC 3890 spore coat polysaccharide synthesis
spsD 3889 spore coat polysaccharide synthesis
spsE 3888 spore coat polysaccharide synthesis
spsF 3887 spore coat polysaccharide synthesis
spsG 3886 spore coat polysaccharide synthesis
spsI 3885 spore coat polysaccharide synthesis
spsJ 3884 spore coat polysaccharide synthesis
spsK 3883 spore coat polysaccharide synthesis
sspA 3025 small acid-soluble spore protein (major α-type SASP)
sspB 1050 small acid-soluble spore protein (major β-type SASP)
sspC 2155 small acid-soluble spore protein (minor α/β-type SASP)
sspD 1413 small acid-soluble spore protein (minor α/β-type SASP)

ytpT 3051 DNA translocase stage III sporulation protein
yyaA 4208 DNA-binding protein Spo0J-like

I.9 GERMINATION 23
gerAA 3390 germination response to L-alanine
gerAB 3391 germination response to L-alanine
gerAC 3392 germination response to L-alanine
gerBA 3688 germination response to the combination of glucose, fructose, L-asparagine, and KCl
gerBB 3689 germination response to the combination of glucose, fructose, L-asparagine, and KCl
gerBC 3690 germination response to the combination of glucose, fructose, L-asparagine, and KCl
gerCA 2384 heptaprenyl diphosphate synthase component I (menaquinone biosynthesis)
gerCB 2383 menaquinone biosynthesis methyltransferase (menaquinone biosynthesis)
gerCC 2382 heptaprenyl diphosphate synthase component II (menaquinone biosynthesis)
gerD 159 germination response to L-alanine and to the combination of glucose, fructose, L-asparagine, and KCl
gerKA 420 germination response to the combination of glucose, fructose, L-asparagine, and KCl
gerKB 423 germination response to the combination of glucose, fructose, L-asparagine, and KCl
gerKC 421 germination response to the combination of glucose, fructose, L-asparagine, and KCl
gerM 2902 germination (cortex hydrolysis) and sporulation (stage II, multiple polar septa)
gpr 2635 spore protease (degradation of SASPs)
sleB 2399 spore cortex-lytic enzyme
yfkQ 850 spore germination response
yfkR 848 spore germination protein
yfkT 847 spore germination protein
ykvT 1448 spore cortex-lytic enzyme
yndD 1907 spore germination protein
yndE 1908 spore germination protein
yndF 1909 spore germination protein

I.10 TRANSFORMATION/COMPETENCE 20
cinA 1763 competence-damage inducible protein
comC 2864 late competence protein required for processing and translocation of ComGC
comEA 2640 late competence operon required for DNA binding and uptake
comEB 2640 late competence operon required for DNA binding and uptake
comEC 2639 late competence operon required for DNA binding and uptake
comER 2640 non-essential gene for competence
comFA 3643 late competence protein required for DNA uptake
comFB 3641 late competence gene
comFC 3641 late competence gene
comGA 2559 late competence gene
comGB 2558 DNA transport machinery
comGC 2557 exogenous DNA-binding
comGD 2557 DNA transport machinery
comGE 2557 DNA transport machinery
comGF 2556 DNA transport machinery
comGG 2556 DNA transport machinery
comS 390 assembly link between regulatory components of the competence signal transduction pathway
comX 3255 competence pheromone precursor (activation of ComA)
mecA 1229 negative regulator of competence
ypbH 2403 negative regulation of competence MecA homologue

II INTERMEDIARY METABOLISM 742

II.1 METABOLISM OF CARBOHYDRATES AND RELATED MOLECULES 261
II.1.1 SPECIFIC PATHWAYS 214
abfA 2939 α-L-arabinofuranosidase
abnA 2949 arabinan-endo 1,5-α-L-arabinase (degradation of plant cell wall polysaccharide)
ackA 3015 acetate kinase
acoA 879 acetoin dehydrogenase E1 component (TPP-dependent α subunit)
acoB 880 acetoin dehydrogenase E1 component (TPP-dependent β subunit)
acoC 881 acetoin dehydrogenase E2 component (dihydrolipoamide acetyltransferase)
acoL 882 acetoin dehydrogenase E3 component (dihydrolipoamide dehydrogenase)
acsA 3039 acetyl-CoA synthetase
acuA 3039 acetoin utilization
acuB 3040 acetoin utilization
acuC 3040 acetoin utilization
adhA 2756 NADP-dependent alcohol dehydrogenase
adhB 2753 alcohol dehydrogenase
aldX 4093 aldehyde dehydrogenase
aldY 3985 aldehyde dehydrogenase
alsD 3709 α-acetolactate decarboxylase (acetoin biosynthesis)
alsS 3710 α-acetolactate synthase (acetoin biosynthesis)
amyE 327 α-amylase
amyX 3063 pullulanase
araA 2948 L-arabinose isomerase (L-arabinose utilization)
araB 2946 L-ribulokinase (L-arabinose utilization)
araD 2945 L-ribulose-5-phosphate 4-epimerase (L-arabinose utilization)
araL 2944 L-arabinose operon
araM 2943 L-arabinose operon
bglA 4122 6-phospho-β-glucosidase
bglC 1940 endo-1,4-β-glucanase (cellulose degradation)
Nature © Macmillan Publishers Ltd 1997
bglH 4033 β-glucosidase (cellulose degradation)
bglS 4011 endo-β-1,3-1,4 glucanase (lichenan degradation)
crh 3569 catabolite repression HPr-like protein
csn 2748 chitosanase
csrA 3635 carbon storage regulator
fruB 1508 fructose 1-phosphate kinase
galE 3990 UDP-glucose 4-epimerase (galactose metabolism)
galK 3921 galactokinase (galactose metabolism)
galT 3919 galactose-1-phosphate uridyltransferase (galactose metabolism)
gdh 445 glucose 1-dehydrogenase
glcK 2571 glucose kinase
glgA 3167 starch (bacterial glycogen) synthase (glycogen biosynthesis)
glgB 3171 1,4-α-glucan branching enzyme (glycogen biosynthesis)
glgC 3169 glucose-1-phosphate adenylyltransferase (glycogen biosynthesis)
glgD 3168 required for glycogen biosynthesis
glgP 3165 glycogen phosphorylase (glycogen metabolism)
glpD 1004 glycerol-3-phosphate dehydrogenase (glycerol utilization)
glpK 1003 glycerol kinase (glycerol utilization)
glvA 890 6-phospho-β-glucosidase (arbutin fermentation)
gntK 4113 gluconate kinase (gluconate utilization)
gntZ 4116 6-phosphogluconate dehydrogenase (gluconate utilization)
gpsA 2389 NAD(P)H-dependent glycerol-3-phosphate dehydrogenase
gutB 667 sorbitol dehydrogenase
iolB 4082 myo-inositol catabolism
iolC 4081 myo-inositol catabolism
iolD 4080 myo-inositol catabolism
iolE 4078 myo-inositol catabolism
iolG 4076 myo-inositol 2-dehydrogenase (inositol catabolism)
iolH 4075 myo-inositol catabolism
iolI 4074 myo-inositol catabolism
iolS 4084 myo-inositol catabolism
kdgA 2323 deoxyphosphogluconate aldolase (pectin utilization)
kdgK 2324 2-keto-3-deoxygluconate kinase (pectin utilization)
kduD 2326 2-keto-3-deoxygluconate oxidoreductase (pectin utilization)
kduI 2325 5-keto-4-deoxyuronate isomerase (pectin utilization)
lacA 3504 β-galactosidase
lctE 329 L-lactate dehydrogenase
licH 3959 6-phospho-β-glucosidase
lplD 782 hydrolytic enzyme
melA 3100 α-D-galactoside galactohydrolase
mtlD 451 mannitol-1-phosphate dehydrogenase
nagA 3594 N-acetylglucosamine-6-phosphate deacetylase (N-acetyl glucosamine utilization)
nagB 3596 N-acetylglucosamine-6-phosphate isomerase (N-acetyl glucosamine utilization)
narQ 3773 required for formate dehydrogenase activity
pel 828 pectate lyase
pelB 2034 pectate lyase
pmi 3688 mannose-6-phosphate isomerase
pps 2053 phosphoenolpyruvate synthase
pta 3865 phosphotransacetylase
ptsH 1459 histidine-containing phosphocarrier protein of the phosphotransferase system (PTS) (HPr protein)
rbsK 3701 ribokinase (ribose metabolism)
sacA 3902 sucrase-6-phosphate hydrolase
sacB 3535 levansucrase
sacC 2759 levanase
sacX 3941 negative regulatory protein of SacY
treA 851 trehalose-6-phosphate hydrolase
xsa 2914 β-xylosidase / α-L-arabinosidase (xylan degradation)
xylA 1891 xylose isomerase (xylose metabolism)
xylB 1893 xylulose kinase (xylose metabolism)
xynA 2054 endo-1,4-β-xylanase (xylan degradation)
xynB 1888 xylan β-1,4-xylosidase (xylan degradation)
xynD 1945 endo-1,4-β-xylanase (xylan degradation)
ybaN 161 polysaccharide deacetylase
ybbD 188 β-hexosaminidase
ybcM 213 glucosamine-fructose-6-phosphate aminotransferase
ybfT 258 glucosamine-6-phosphate isomerase
ycbC 268 5-dehydro-4-deoxyglucarate dehydratase
ycbD 269 aldehyde dehydrogenase
ycbF 272 glucarate dehydratase
ycdF 305 glucose 1-dehydrogenase
ycdG 306 oligo-1,6-glucosidase
ycgS 352 aromatic hydrocarbon catabolism
yckE 370 β-glucosidase
yckG 375 D-arabino 3-hexulose 6-phosphate formaldehyde lyase
ycsN 466 aryl-alcohol dehydrogenase
ydaD 471 alcohol dehydrogenase
ydaF 473 acetyltransferase
ydaM 482 cellulose synthase
ydaP 488 pyruvate oxidase
ydhP 628 β-glucosidase
ydhR 631 fructokinase
ydhS 632 mannose-6-phosphate isomerase
ydhT 632 mannan endo-1,4-β-mannosidase
ydjE 670 fructokinase
ydjL 679 L-iditol 2-dehydrogenase
ydjP 682 arylesterase
yeaC 688 methanol dehydrogenase regulation
yesY 774 rhamnogalacturonan acetylesterase
yesZ 774 β-galactosidase
yfhM 929 epoxide hydrolase
yfhR 937 glucose 1-dehydrogenase
yfjS 869 polysaccharide deacetylase
yfmT 807 benzaldehyde dehydrogenase
yfnH 798 glucose-1-phosphate cytidylyltransferase
ygaK 958 reticuline oxidase
yhcW 997 phosphoglycolate phosphatase
yhdF 1022 glucose 1-dehydrogenase
yhdN 1030 aldo/keto reductase
yheN 1041 endo-1,4-β-xylanase
yhfE 1095 glucanase
yhxB 1006 phosphomannomutase
yhxC 1115 alcohol dehydrogenase
yhxD 1118 ribitol dehydrogenase
yisS 1164 myo-inositol 2-dehydrogenase
yitF 1175 mandelate racemase
yitY 1192 L-gulonolactone oxidase
yjdE 1274 mannose-6-phosphate isomerase
yjeA 1281 endo-1,4-β-xylanase
yjgC 1285 formate dehydrogenase
yjmA 1300 glucuronate isomerase
yjmD 1304 sorbitol dehydrogenase
yjmE 1305 D-mannonate hydrolase
yjmF 1306 2-deoxy-D-gluconate 3-dehydrogenase
yjmI 1309 tagaturonate reductase
yjmJ 1311 altronate hydrolase
ykcC 1356 dolichol phosphate mannose synthase
ykfB 1366 chloromuconate cycloisomerase
ykfC 1367 polysugar degrading enzyme
ykoT 1403 dolichol phosphate mannose synthase
ykrW 1427 ribulose-bisphosphate carboxylase
yktC 1537 myo-inositol-1(or 4)-monophosphatase
ykuF 1477 glucose 1-dehydrogenase
ykvO 1442 glucose 1-dehydrogenase
ykvQ 1445 chitinase
yloR 1653 ribulose-5-phosphate 3-epimerase
ylxY 1741 deacetylase
ynfF 1943 endo-xylanase
yngE 1951 propionyl-CoA carboxylase
yoaC 2023 xylulokinase
yoaD 2024 phosphoglycerate dehydrogenase
yoaE 2025 formate dehydrogenase
yoaI 2031 4-hydroxyphenylacetate-3-hydroxylase
yogA 2007 alcohol dehydrogenase
yqiQ 2507 phosphoenolpyruvate mutase
yqjD 2488 propionyl-CoA carboxylase
yrhE 2780 formate dehydrogenase
yrhG 2780 formate dehydrogenase
yrhH 2778 methyltransferase
yrhO 2768 cyclodextrin metabolism
yrpG 2742 sugar-phosphate dehydrogenase
ysdC 2950 endo-1,4-β-glucanase
ysfC 2932 glycolate oxidase subunit
ysfD 2934 glycolate oxidase subunit
ytbE 2969 plant metabolite dehydrogenase
ytcA 3155 NDP-sugar dehydrogenase
ytcB 3156 NDP-sugar epimerase
ytcI 3024 acetate-CoA ligase
ytdA 3155 UTP-glucose-1-phosphate uridylyltransferase
ytiB 3138 carbonic anhydrase
ytoP 3055 endo-1,4-β-glucanase
yttI 2989 acetyl-CoA carboxylase
yugF 3227 dihydrolipoamide S-acetyltransferase
yugJ 3224 NADH-dependent butanol dehydrogenase
yugK 3222 NADH-dependent butanol dehydrogenase
yugT 3215 exo-α-1,4-glucosidase
yulC 3200 rhamnulokinase
yulE 3198 L-rhamnose isomerase
yusZ 3382 retinol dehydrogenase
yutF 3318 N-acetyl-glucosamine catabolism
yuxG 3203 sorbitol-6-phosphate 2-dehydrogenase
yvaM 3455 hydrolase
yvcN 3568 N-hydroxyarylamine O-acetyltransferase
yvcT 3562 glycerate dehydrogenase
yvdA 3561 carbonic anhydrase
yvdF 3557 glucan 1,4-α-maltohydrolase
yvdL 3548 oligo-1,6-glucosidase
yvdM 3547 β-phosphoglucomutase
yveB 3537 levanase
yvfO 3502 arabinogalactan endo-1,4-β-galactosidase
yvfQ 3499 hydrolase
yvfV 3495 glycolate oxidase
yvgN 3427 plant-metabolite dehydrogenase
yvkC 3615 pyruvate,water dikinase
yvoE 3592 phosphoglycolate phosphatase
yvoF 3591 O-acetyltransferase
yvpA 3590 pectate lyase
yvyH 3664 UDP-N-acetylglucosamine 2-epimerase
ywdH 3895 aldehyde dehydrogenase
ywfD 3872 glucose 1-dehydrogenase
ywjI 3805 glycerol-inducible protein
ywqF 3730 NDP-sugar dehydrogenase
yxbG 4091 glucose 1-dehydrogenase
yxiA 4040 arabinan endo-1,5-α-L-arabinosidase
yxjF 4000 gluconate 5-dehydrogenase
yxnA 4107 glucose 1-dehydrogenase
yyaE 4202 formate dehydrogenase
yyaI 4196 galactoside acetyltransferase
yycR 4136 formaldehyde dehydrogenase

II.1.2 MAIN GLYCOLYTIC PATHWAYS 28
eno 3477 enolase (glycolysis)
fbaA 3808 fructose-1,6-bisphosphate aldolase (glycolysis)
fbp 4127 fructose-1,6-bisphosphatase (gluconeogenesis)
gap 3482 glyceraldehyde 3-phosphate dehydrogenase (glycolysis)
gapB 2967 glyceraldehyde 3-phosphate dehydrogenase (glycolysis)
iolJ 4073 fructose-1,6-bisphosphate aldolase (glycolysis)
pckA 3129 phosphoenolpyruvate carboxykinase
pdhA 1528 pyruvate dehydrogenase (E1 α subunit)
pdhB 1529 pyruvate dehydrogenase (E1 β subunit)
pdhC 1530 pyruvate dehydrogenase (dihydrolipoamide acetyltransferase E2 subunit)
pdhD 1531 pyruvate dehydrogenase / 2-oxoglutarate dehydrogenase (dihydrolipoamide dehydrogenase E3 subunit)
pfk 2987 6-phosphofructokinase (glycolysis)
pgi 3221 glucose-6-phosphate isomerase (glycolysis)
pgk 3480 phosphoglycerate kinase (glycolysis)
pgm 3478 phosphoglycerate mutase (glycolysis)
pycA 1554 pyruvate carboxylase
pykA 2986 pyruvate kinase (glycolysis)
tkt 1919 transketolase (pentose phosphate)
tpi 3479 triose phosphate isomerase (glycolysis)
ybbT 198 phosphoglucomutase (glycolysis)
ydeA 558 glyceraldehyde 3-phosphate dehydrogenase (glycolysis)
yhfR 1109 phosphoglycerate mutase (glycolysis)
yqeC 2651 6-phosphogluconate dehydrogenase (pentose phosphate)
yqiV 2501 dihydrolipoamide dehydrogenase
yqjI 2481 6-phosphogluconate dehydrogenase (pentose phosphate)
yqjJ 2478 glucose-6-phosphate 1-dehydrogenase (pentose phosphate)
ywjH 3807 transaldolase (pentose phosphate)
ywlF 3791 ribose 5-phosphate epimerase (pentose phosphate)

II.1.3 TCA CYCLE 19
citA 1021 citrate synthase I
citB 1926 aconitate hydratase
citC 2980 isocitrate dehydrogenase
citG 3389 fumarate hydratase
citH 2979 malate dehydrogenase
citZ 2981 citrate synthase II
malS 3058 malate dehydrogenase
mmgD 2510 citrate synthase III
odhA 2111 2-oxoglutarate dehydrogenase (E1 subunit)
odhB 2108 2-oxoglutarate dehydrogenase (dihydrolipoamide transsuccinylase, E2 subunit)
sdhA 2907 succinate dehydrogenase (flavoprotein subunit)
sdhB 2905 succinate dehydrogenase (iron-sulphur protein)
sdhC 2908 succinate dehydrogenase (cytochrome b558 subunit)
sucC 1680 succinyl-CoA synthetase (β subunit)
sucD 1681 succinyl-CoA synthetase (α subunit)
yjmC 1303 malate dehydrogenase
yqkJ 2452 malate dehydrogenase
ytsJ 2990 malate dehydrogenase
ywkA 3801 malate dehydrogenase

II.2 METABOLISM OF AMINO ACIDS AND RELATED MOLECULES 205
ald 3277 L-alanine dehydrogenase
ampS 1516 aminopeptidase
ansA 2456 L-asparaginase
ansB 2455 L-aspartase
aprE 1105 extracellular alkaline serine protease (subtilisin E)
aprX 1862 intracellular alkaline serine protease
argB 1197 N-acetylglutamate 5-phosphotransferase (arginine biosynthesis)
argC 1195 N-acetylglutamate γ-semialdehyde dehydrogenase (arginine biosynthesis)
argD 1198 N-acetylornithine aminotransferase (arginine biosynthesis)
argE 2142 acetylornithine deacetylase (arginine biosynthesis)
argF 1203 ornithine carbamoyltransferase (arginine biosynthesis)
argG 3013 argininosuccinate synthase (arginine biosynthesis)
argH 3012 argininosuccinate lyase (arginine biosynthesis)
argJ 1196 ornithine acetyltransferase / amino-acid acetyltransferase (arginine biosynthesis)
aroA 3046 3-deoxy-D-arabino-heptulosonate 7-phosphate synthase / chorismate mutase-isozyme 3 (shikimate pathway)
aroB 2378 3-dehydroquinate synthase (shikimate pathway)
aroC 2413 3-dehydroquinate dehydratase (shikimate pathway)
aroD 2645 shikimate 5-dehydrogenase (shikimate pathway)
aroE 2368 5-enolpyruvoylshikimate-3-phosphate synthase (shikimate pathway)
aroF 2380 chorismate synthase (shikimate pathway)
aroH 2377 chorismate mutase (isozymes 1 and 2) (aromatic amino acids biosynthesis)
aroI 340 shikimate kinase (shikimate pathway)
asd 1745 aspartate-semialdehyde dehydrogenase
ask 2910 aspartokinase II attenuator
asnB 3127 asparagine synthetase
asnH 4098 asparagine synthetase
aspB 2348 aspartate aminotransferase
bcsA 2317 naringenin-chalcone synthase (phenylalanine metabolism)
bfmBAA 2499 branched-chain α-keto acid dehydrogenase E1 (2-oxoisovalerate dehydrogenase α subunit)
bfmBAB 2498 branched-chain α-keto acid dehydrogenase E1 (2-oxoisovalerate dehydrogenase β subunit)
bfmBB 2497 branched-chain α-keto acid dehydrogenase E2 subunit (lipoamide acyltransferase)
bltD 2718 spermine/spermidine acetyltransferase
bpr 1599 bacillopeptidase F
cad 1535 lysine decarboxylase
carA 1199 carbamoyl-phosphate transferase-arginine (subunit A) (arginine biosynthesis)
carB 1200 carbamoyl-phosphate transferase-arginine (subunit B) (arginine biosynthesis)
ctpA 2133 carboxy-terminal processing protease
cysE 113 serine acetyltransferase (cysteine biosynthesis)
cysH 1630 phosphoadenosine phosphosulfate reductase (cysteine biosynthesis)
cysK 82 cysteine synthetase A (cysteine biosynthesis)
dal 517 D-alanine racemase
dapA 1748 dihydrodipicolinate synthase (diaminopimelate/lysine biosynthesis)
dapB 2359 dihydrodipicolinate reductase (diaminopimelate/lysine biosynthesis)
dapG 1747 aspartokinase I (α and β subunits)
def 1646 polypeptide deformylase
epr 3939 minor extracellular serine protease
glmS 200 L-glutamine-D-fructose-6-phosphate amidotransferase
glnA 1878 glutamine synthetase
gltA 2014 glutamate synthase (large subunit) (glutamate biosynthesis)
gltB 2009 glutamate synthase (small subunit) (glutamate biosynthesis)
glyA 3789 serine hydroxymethyltransferase (glycine/serine/threonine metabolism)
hisA 3584 phosphoribosylformimino-5-aminoimidazole carboxamide ribotide isomerase (histidine biosynthesis)
hisB 3585 imidazoleglycerol-phosphate dehydratase (histidine biosynthesis)
hisC 2371 histidinol-phosphate aminotransferase (histidine biosynthesis) / tyrosine and phenylalanine aminotransferase
hisD 3587 histidinol dehydrogenase (histidine biosynthesis)
hisF 3583 HisF cyclase-like protein (synthesis of D-erythro-imidazole glycerol phosphate)
hisG 3587 ATP phosphoribosyltransferase (histidine biosynthesis)
hisH 3585 amidotransferase (histidine biosynthesis)
hisI 3583 phosphoribosyl-AMP cyclohydrolase / phosphoribosyl-ATP pyrophosphohydrolase (histidine biosynthesis)
hom 3315 homoserine dehydrogenase (threonine/methionine biosynthesis)
hutG 4045 formiminoglutamate hydrolase (histidine utilization)
hutH 4041 histidase (histidine utilization)
hutI 4044 imidazolone-5-propionate hydrolase (histidine utilization)
hutU 4042 urocanase (histidine utilization)
ilvA 2293 threonine dehydratase (isoleucine biosynthesis)
ilvB 2896 acetolactate synthase (large subunit) (valine/isoleucine biosynthesis)
ilvC 2894 ketol-acid reductoisomerase (valine/isoleucine biosynthesis)
ilvD 2302 dihydroxy-acid dehydratase (valine/isoleucine biosynthesis)
ilvN 2894 acetolactate synthase (small subunit)
[Figure: gene map of the B. subtilis chromosome. Scale marks run from 0° to 118°, with rows beginning at base positions 1, 140,001, 280,001, 420,001, 560,001, 700,001, 840,001, 980,001, 1,120,001 and 1,260,001. Labelled features include oriC, rrn and trn loci, 'prophage' 1 to 4, and PBSX.]
yjc
G jdH
yjd yjf yjf yji yjl yjp dA xr SB SA aB aA bA gB gA zH m nA etC pA oC oD oE oF rA
yjc yjc yjd yjd y yjd co yjg yjg yjk yjk yjn yjo yjq xk oII oII yk yk yk htr yk yk yk yk yk m is yk yk yk yk tn
sp sp
E C A B A B E D
oS oT oX oY oZ rI rL rM zE rQ t rV rW rX krY krZ o0 g vE vI vJ vK vL vM vO vP vQ vR vT vU vV vW vY vZ T G H I lA lB cp wC uA A eV uF uG uH uI uJ uK zF uL uM uN uO uP uQ uR uS uU uV uW ob oe oe ob oa oa nU nV nW knX knY nZ R uB A T pA pB h C qA qB eC rA yA hA hB hC hD tA zI tC
yk yk yk yk yk yk yk yk yk yk da yk yk yk y y sp ea yk yk yk yk yk yk yk yk yk yk yk yk yk yk yk yk glc pts pts pts sp sp m yk yk kin ch yk yk yk yk yk yk yk yk yk yk yk yk yk yk yk yk yk yk m m m m m m yk yk yk y y yk fru fr fru sip yk yk ab kin yk yk ad yk yk pd pd pd pd yk yk yk
120° 126°
1,400,001
122° 124° 128° 130°
Q oU oV W rP rS rT rU vA vD E vN vS B D tA yB uC uT nT oA rB d tB
o o pD krK yk yk yk yk otB otA w w uD kuE pS kpC eBH qC kzG yk slp ca yk
yk yk yk yk ss y yk yk m m clp yk yk yk yk pa yk yk yk y yk yk yk y r am yk y
m
E A
VD E Y D G B G A A B II M A
A B C D G H K M N O cA B C D E F G B C D E F G H I K L N mF O Q A B A L pB o ur ra ur oV ur ur IB W X p A Z r oII E G A B C D E F G H IV S A p B rR rP rB rC rA rA rD rD yrF yrE sH A B C D E F B C D H I iA f t M N O P Q R S oV U V W A B C X bD bG p cS c Y M sP C D E mD lS F h
yla yla yla yla yla yla yla yla yla yla py cta cta cta cta cta cta ylb ylb ylb ylb ylb ylb ylb ylb ylb ylb ylb rp ylb ylb yll yll ylx fts pb sp m m m sp m m div ylx ylx sb fts fts bp sp sig sig ylm ylm ylm ylm ylm ylm ylm ylm div ile yly ls yly py py py py py py py py p p cy yln yln yln yln yln yln ylo ylo ylo ylo ylo pr de fm ylo ylo ylo ylo ylo ylo ylo sp ylo ylo ylo ylp ylp ylp pls fa fa ac rn sm fts ylx ffh rp ylq ylq ylq tr rp ylq rn
132°
1,540,001
134° 136° 138° 140° 142°
rE E F I J L A A J M P A B B
np yla yla yla yla yla cta ylb ylb ylb ylb ylo m ylq
rp
FA FB IE S
cC cD f dV Q Y dY B C F G E eY B A F H eB eA eW eC eD D L A sA B C lC S sA R Q C sO pA Y xG xH pG pA fA fB oII G
fC fD fE fF fG fH fI fJ fK fL fM sA A pX dA dB oV h l cB cA tE sA sB sC s D s E s F s G sH sI s J s K sL
pA sB f bA r oS fB P fA uB oV poV sd cA utS utL
su su sm to gid co clp clp co flg flg fliE fliF fliG fliH fliI fliJ ylx fliK ylx flg fliL fliM fliY ch fliZ fliP fliQ fliR flh flh flh ylx ch ch ch ch ch sig ylx rp ts sm fr ylu cd ylu ylu pr po ylx nu ylx ylx in ylx rb tr rib rp pn ylx ym ym sp s a da da ym ym sp ym ym ym ym ym ym ym ym ym ym ym pg cin re pb ym ym sp td kb ym ym co m m pk pk pk pk pk pk pk pk pk pk pk pk
1,680,001
144° 146° 148° 150° 152° 154°
cC
ym
K
sM sN sP sR aC aD dE dF aB oV bA bB R A xB zF zG aB aC aD aJ nB lA lB cC cE cF yA dA zB dD dE d F dG dH d J dK dL d N eA eB zC B e N eP eQ e T lB lA T lC fE SL gA gB gC
aF iaA aH zC zA aA aE aF aG naI t eE eF cdA neI neJ
pk pk pk pk ym ym ym m ym ym ym ym nr nr ym sp yn yn gln gln yn yn yn yn yn yn yn yn yn y yn xy xy xy yn yn yn th yn yn yn yn yn yn yn yn yn yn yn yn yn yn tk yn yn c y y cit yn tlp yn yn yn gr gr als bg yn trn yn yn yn
166°
1,820,001
156° 158° 160° 162° 164°
X 'prophage' 5 M S fF D gI
sS zB aE r rB rA aG lC lR cB cD zH cM tC zA dB d xA zD eK tM tL tK eR e fC nD g gE gF gG gH gJ zE
pk ym ym ap eb eb ym cw xy yn yn yn yn co yn yn yn le yn yn co co co yn yn yn yn xy yn yn yn yn yn yn yn yn

L B D N O W
eB S t gA C xC xB aA aE aF aI lB aM aQ aS zG aT aV nP b b zI bE pK rK zM b b b cA cC cD cE cF ocG
yo trn gg yo glt terC yo yo yo yo yo yo pe yo yo yo yo yo yo pe yo yo yo yo ra ph yo yo yo yo yo yo yo yo yo y
168° 170° 172° 174°
1,960,001
176° 178°
p fA B A Z A s F 'prophage' 6 b I J K L M cI
sE sD sC sB sA xA eA eC eD oJ oH rtp xD aB aC aD aG aH aJ aK aN aO aP zF aR aU aW oa b nA zH b zJ bH zK zL b b b b aA bQ bR bS bT bU bV zA zB cB cH cJ cK cL
pp pp pp pp pp pb yo yo yo yo yo glt glt pr pr yo yo yo yo yo yo yo yo yo yo yo yo yo yo y yo pp xy yo yo yo yo yo yo yo yo yo yo yo cs yo yo yo yo yo yo yo yo yo yo yo yo yo yo yo
yo

Nature © Macmillan Publishers Ltd 1997


[Figure: gene map of the Bacillus subtilis chromosome, rows covering approximately 180° to 358°. Rows start at base positions 2,100,001; 2,240,001; 2,380,001; 2,520,001; 2,660,001; 2,800,001; 2,940,001; 3,080,001; 3,220,001; 3,360,001; 3,500,001; 3,640,001; 3,780,001; 3,920,001 and 4,060,001. Labelled regions in this span: SPβ, skin and 'prophage' 7. Individual gene labels are not legible in the extracted text.]