Nucleic Acids Research, 2001, Vol. 29, No. 1 22-28
© 2001 Oxford University Press

The COG database: new developments in phylogenetic classification of proteins from complete genomes

Roman L. Tatusov, Darren A. Natale, Igor V. Garkavtsev, Tatiana A. Tatusova, Uma T. Shankavaram, Bachoti S. Rao, Boris Kiryutin, Michael Y. Galperin, Natalie D. Fedorova and Eugene V. Koonin*

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

Received October 2, 2000; Accepted October 11, 2000.


    ABSTRACT
The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt at a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG). In addition, a supplement to the COGs is available that includes proteins encoded in the genomes of two multicellular eukaryotes, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, that are shared with bacteria and/or archaea. The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis.


    INTRODUCTION
The database of Clusters of Orthologous Groups of proteins (COGs) was conceived as a phylogenetic classification of proteins from complete genomes (1). Each COG includes proteins that are thought to be orthologous, i.e. connected through vertical evolutionary descent (2). Orthology may involve not only one-to-one, but also, in cases of lineage-specific gene duplications, one-to-many and many-to-many relationships (hence Orthologous Groups of proteins). The purpose of the COG database is to serve as a platform for functional annotation of newly sequenced genomes and for studies on genome evolution. To facilitate functional studies, the COGs have been classified into 17 broad functional categories, including a class for which only a general functional prediction, usually that of biochemical activity, was feasible and a class of uncharacterized COGs. Additionally, some of the COGs with known functions are organized to represent specific cellular systems and biochemical pathways. The database is accompanied by the COGNITOR program, which assigns new proteins, typically from newly sequenced genomes, to pre-existing COGs. Here we describe the new developments in the COG database in the year 2000, which included both a quantitative update through the addition of new genomes and the development of new functionalities associated with the database.


    THE CURRENT STATUS OF THE COGS—NEW GENOMES
Since the second release of the COG database in January 2000 (3), nine new genomes have been added to the database using the COGNITOR program with subsequent manual validation to identify new members of pre-existing COGs and previously described procedures for the construction of new COGs. The additions included the first sequenced genome of a crenarchaeon (representative of the second major division of the archaea), Aeropyrum pernix; a fifth representative of the Euryarchaea, Pyrococcus abyssi; and seven bacterial genomes, including those from unusual organisms such as the extremely radioresistant Deinococcus radiodurans (Table 1). The previously described trend held with the new genomes in that 60–80% of the proteins from each of the prokaryotic genomes could be included in COGs (Table 1).


Table 1. Representation of genomes in the COGs
 
The genome of the crenarchaeon A.pernix (4), which was of particular interest because this major evolutionary lineage had not been previously represented among completely sequenced genomes, was investigated in detail as a benchmark for annotation of newly sequenced genomes using the COG system (5). The COG analysis resulted in an ~50% increase in confident functional predictions for A.pernix genes compared to the original annotations. On the other hand, a significant fraction of open reading frames (ORFs) originally annotated as genes did not show detectable similarity to any proteins in current databases, but overlapped with the coding regions of proteins included in the COGs, strongly suggesting that these ORFs were not real genes (Table 2). Thus the analysis of the genome of an organism that had no close relatives among other organisms with sequenced genomes appears to corroborate the effectiveness of the COG system as a genome annotation tool.


Table 2. Analysis of the predicted A.pernix proteins using the COG system
 
Given the accumulation of multiple, complete genome sequences, we were interested in the growth dynamics of the COG set with the increased number of included genomes. The growth curve was constructed by simulating the COG formation for each of 10^6 random orders of genome inclusion (Fig. 1). For each number of species, the maximum, the minimum and the average number of COGs were determined. The minimal and the maximal curves define the area containing all possible growth curves (Fig. 1). The average curve approximates the expected dynamics of the COG growth. Given that the number of completely sequenced genomes is still relatively small and that some of them are closely related, it remains uncertain whether or not the number of COGs is starting to approach saturation and, if it is, what the asymptotic value is.
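The permutation procedure described above can be illustrated with a short Python sketch. This is only an approximation under stated assumptions, not the authors' code: it treats the final COG membership as fixed and counts a COG as formed once a hypothetical threshold of `min_genomes` member genomes has been included, whereas the actual COG construction re-derives clusters from best hits. The function name, the threshold and the toy data are all illustrative.

```python
import random


def cog_growth_curves(cog_members, genomes, n_orders=1000, min_genomes=3, seed=0):
    """Return (minimum, mean, maximum) numbers of COGs after 1..n genomes.

    cog_members maps a COG id to the set of genomes represented in it.
    Assumption: a COG counts as "formed" once `min_genomes` of its member
    genomes have been included; this threshold is a parameter of the sketch.
    """
    rng = random.Random(seed)
    n = len(genomes)
    lowest = [float("inf")] * n
    highest = [0] * n
    totals = [0] * n
    for _ in range(n_orders):
        order = list(genomes)
        rng.shuffle(order)  # one random order of genome inclusion
        seen = {cog: 0 for cog in cog_members}
        formed = 0
        for step, genome in enumerate(order):
            for cog, members in cog_members.items():
                if genome in members:
                    seen[cog] += 1
                    if seen[cog] == min_genomes:
                        formed += 1  # this COG has just reached the threshold
            lowest[step] = min(lowest[step], formed)
            highest[step] = max(highest[step], formed)
            totals[step] += formed
    mean = [t / n_orders for t in totals]
    return lowest, mean, highest


# Toy data with made-up genome labels and COG ids.
toy_cogs = {
    "COG1": {"A", "B", "C", "D"},
    "COG2": {"A", "B", "C"},
    "COG3": {"B", "C", "D"},
}
lowest, mean, highest = cog_growth_curves(toy_cogs, ["A", "B", "C", "D"], n_orders=200)
```

The minimum and maximum lists bound the envelope of growth curves seen in the sampled orders, and the mean list plays the role of the smooth average curve.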



Figure 1. Growth dynamics of the COG set with increasing number of included genomes. The circles show the sequence of genome inclusion according to the actual order of sequencing, and the smooth line shows the mean of 10^6 random permutations of the genome order. The colored area indicates the range between the maximal and minimal value for each point (number of genomes) in 10^6 random permutations.

 

    ADDING PROTEINS FROM MULTICELLULAR EUKARYOTES TO PROKARYOTIC COGs
The current COG collection includes multiple bacterial and archaeal genomes and only one eukaryotic species, the yeast Saccharomyces cerevisiae. Incorporating the larger genomes of multicellular eukaryotes into the COG system is a challenging task due to the preponderance of multidomain proteins in these organisms. As a first step toward this goal, we sought to identify eukaryotic proteins that fit into already existing COGs, in other words, those eukaryotic proteins that have orthologs in at least two prokaryotic species. To this end, 19 895 protein sequences from the (nearly) complete genome of the nematode Caenorhabditis elegans (6) and 14 100 sequences from the genome of the fruit fly Drosophila melanogaster (7) were analyzed using the COGNITOR program, which assigns proteins to COGs on the basis of multiple genome-specific best hits and splits multidomain proteins into individual domains if these show affinity with different COGs. After manual validation of the results, 20% of the D.melanogaster proteins and 14% of the C.elegans proteins were assigned to COGs; a significant number of proteins from each of the multicellular eukaryotes were included in COGs of each functional category, with the notable exceptions of ‘Cell division and chromosome partitioning’ and ‘Cell motility and secretion’, which consist primarily of prokaryote-specific proteins (Table 3). The COG analysis of the worm and fly proteins yielded numerous functional predictions that have not been described previously (I.V.Garkavtsev and E.V.Koonin, unpublished observations). Eukaryotic proteins that have orthologs in prokaryotes belong to two major categories: (i) ancient proteins inherited from the last common ancestor of all extant life forms or at least the common ancestor of archaea and eukaryotes; (ii) proteins encoded by genes that have been horizontally transferred from organelles to the eukaryotic nucleus or otherwise acquired by eukaryotes from bacteria (8). Analysis of the phylogenetic patterns in the COGs may help distinguish between these two categories.


Table 3. Eukaryotic proteins in the COGs
 
After three distant eukaryotic genomes were included in the prokaryotic COGs, it was of interest to analyze their co-occurrence. As expected, the majority of COGs with eukaryotic members include all three genomes; at the same time, a considerable number of COGs include only one of the possible pairs of eukaryotic genomes or only a single eukaryotic species (Table 4). These observations, which will be analyzed in detail elsewhere, support the major role of lineage-specific gene loss and horizontal gene transfer in eukaryotic evolution.
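A co-occurrence tally of this kind can be sketched in a few lines of Python. The sketch assumes COG membership is available as a mapping from COG id to the set of genomes represented; the function name, genome labels and COG ids are hypothetical and do not come from the COG distribution files.

```python
from collections import Counter


def cooccurrence_counts(cog_members, genomes_of_interest):
    """Count COGs by which subset of the genomes of interest they contain.

    COGs that contain none of the genomes of interest are skipped.
    """
    wanted = frozenset(genomes_of_interest)
    counts = Counter()
    for members in cog_members.values():
        present = wanted & frozenset(members)
        if present:
            counts[present] += 1
    return counts


# Toy data: made-up COG ids and genome labels.
toy_cogs = {
    "COG_a": {"yeast", "worm", "fly", "Eco"},
    "COG_b": {"yeast", "fly", "Bsu"},
    "COG_c": {"Eco", "Bsu"},
}
counts = cooccurrence_counts(toy_cogs, ["yeast", "worm", "fly"])
for subset, n in counts.items():
    print(sorted(subset), n)
```

Each key of the result is one combination of eukaryotic genomes (all three, a pair, or a single species), which corresponds to one cell of a co-occurrence table.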


Table 4. Co-occurrence of the eukaryotic genomes in the COGs
 

    DETECTING MISSED GENES
One of the features associated with the COG database is the analysis of phylogenetic patterns, i.e. the patterns of species that are represented or not represented in each of the COGs. Unexpected phylogenetic patterns, for example, those that contain all but one bacterial species or those that include only one of a pair of closely related species, may be due to omission of genes in genome annotations submitted to GenBank or to unusual evolutionary phenomena such as non-orthologous displacement of a nearly ubiquitous gene. Before considering the second hypothesis, the first one should be tested, and we undertook a systematic analysis of COGs with unexpected phylogenetic patterns in search of missing members (9). The nucleotide sequence of the genome in question was searched using the TBLASTN program (10) with the sequences of members of the respective COGs as queries. As a result, missing genes coding for members of 48 COGs were identified (Table 5); most of the predicted new proteins are small, which explains why they escaped the original genome annotations. Thus the COG system is instrumental in improving genome annotation not only with respect to functional predictions, but also for gene identification per se.
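The first screening step, finding COGs whose phylogenetic pattern lacks just one member of a group of genomes, can be sketched as follows. This is an illustration only: the function name, the `max_missing` parameter and the genome labels are assumptions, and the flagged genome/COG pairs would still need to be checked by a translated nucleotide search as described above.

```python
def nearly_complete_patterns(cog_members, group, max_missing=1):
    """Return {COG id: sorted missing genomes} for COGs that lack between
    1 and `max_missing` genomes of `group` but contain all the others."""
    group = set(group)
    flagged = {}
    for cog, members in cog_members.items():
        missing = group - set(members)
        if 0 < len(missing) <= max_missing:
            flagged[cog] = sorted(missing)
    return flagged


# Toy data: made-up COG ids and bacterial genome labels.
toy_cogs = {
    "COG_x": {"Eco", "Hin", "Bsu"},   # complete pattern, not flagged
    "COG_y": {"Eco", "Bsu"},          # lacks Hin only, flagged
    "COG_z": {"Eco"},                 # lacks two genomes, not flagged
}
candidates = nearly_complete_patterns(toy_cogs, ["Eco", "Hin", "Bsu"])
```

Each flagged pair names a genome whose nucleotide sequence is worth searching with the protein sequences of the corresponding COG as queries.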


Table 5. Detection of missed proteins using phylogenetic pattern analysis
 

    NEW FEATURES ASSOCIATED WITH THE COGS
Improvement of the COGNITOR program—statistical evaluation of the fit
The original COGNITOR program uses multiple genome-specific best hits (BeTs) as the only criterion for assigning new proteins to COGs. In the new version, we introduced an estimate of the probability that the query protein is assigned to the given COG by chance. Under the assumption of uniform distribution of hits to each genome in the COG database, the probability of one BeT into a particular COG is, simply, the fraction of proteins from the specified genome that belong to the COG:

$$f_{ij} = n_{ij}/N_i$$

where $n_{ij}$ is the number of proteins from species $i$ in COG $j$ and $N_i$ is the total number of proteins in species $i$. Then, treating the BeTs from different genomes as independent, the probability of exactly two BeTs into COG $j$ is given by:

$$p_{2j} = \sum_{i<k} f_{ij} f_{kj} \prod_{l \neq i,k} (1 - f_{lj})$$

Similar expressions can be easily obtained for a different number of BeTs. For each COG, we can compute $p_{2j}$ and find the ‘average’ value $F_j$ that satisfies the equation:

$$\binom{m}{2} F_j^2 (1 - F_j)^{m-2} = p_{2j}$$

where $m$ is the number of species in COG $j$. Using $F_j$ simplifies the calculation of the probability when the specified number of BeTs is large.
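The calculation can be sketched in Python as follows. This is a minimal illustration rather than the COGNITOR implementation, and it rests on two assumptions that are not spelled out in the text: BeTs from different genomes are treated as independent, so the probability of exactly two BeTs is the sum over species pairs of f_ij * f_kj times the product of (1 - f_lj) over the remaining species; and the equation for F_j is solved by bisection on the interval [0, 2/m], where its left-hand side is increasing. All function names are hypothetical.

```python
from itertools import combinations
from math import comb, prod


def prob_exactly_two(f):
    """Probability of exactly two BeTs, given the list f of f_ij values for
    the m species of one COG (independence between genomes is assumed)."""
    total = 0.0
    for i, k in combinations(range(len(f)), 2):
        others = prod(1.0 - f[l] for l in range(len(f)) if l not in (i, k))
        total += f[i] * f[k] * others
    return total


def two_bet_binomial(F, m):
    """Left-hand side of the equation for F_j: C(m,2) F^2 (1 - F)^(m - 2)."""
    return comb(m, 2) * F ** 2 * (1.0 - F) ** (m - 2)


def average_fraction(p2, m, tol=1e-12):
    """Solve C(m,2) F^2 (1 - F)^(m - 2) = p2 for F by bisection.

    The left-hand side peaks at F = 2/m, so the search is restricted to the
    increasing branch [0, 2/m]; choosing this root is an assumption.
    """
    lo, hi = 0.0, 2.0 / m
    if p2 > two_bet_binomial(hi, m):
        raise ValueError("p2 is too large for a solution to exist")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if two_bet_binomial(mid, m) < p2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Toy example: hypothetical counts n_ij and genome sizes N_i for one COG.
n_in_cog = [2, 1, 3, 1]
genome_sizes = [4000, 1700, 6000, 1500]
f = [n / N for n, N in zip(n_in_cog, genome_sizes)]
p2 = prob_exactly_two(f)
F = average_fraction(p2, m=len(f))
```

When all f_ij are equal, the pairwise sum reduces to the binomial expression and the solver returns that common value, which is a convenient sanity check.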

COG-Info pages
In order to increase the utility of the COG system for genome annotation, a web page that contains additional structural and functional information on the COG as a whole and on individual members is now associated with each COG. These hyperlinked Info pages include: systematic classification of the COG members under the current classification systems for enzymes or transporters (if applicable); indications of which COG members (if any) have been characterized genetically and biochemically; information on the domain architecture of the proteins comprising the COG and the three-dimensional structure of the domains if known or predictable; a succinct summary of the common structural and functional features of the COG members and peculiarities of individual members; key references (Fig. 2). The COG-Info pages are currently at different stages of construction.



Figure 2. An example of a COG-Info page.

 
Classification of genomes on the basis of co-occurrence in COGs using principal component analysis
The data on the co-occurrence of genomes in COGs were used as the input for classification by principal component analysis (PCA). Briefly, the presence or absence of a given species in each COG is converted into a 1/0 coordinate value in a multidimensional space where each dimension corresponds to a COG, which results in a geometric representation of all included species in the >2000-dimensional space. PCA is then used to choose a subspace of lower dimensionality for visual examination. The eigenvector decomposition yields orthogonal directions in the space, and the corresponding eigenvalues measure the spread of the objects along each direction. The WWW interface provides tools for selection of the subspace, the species to view and the COGs to use for classification (Fig. 3A). Significantly different results were obtained when different functional categories of COGs were analyzed. Specifically, the combined categories of translation, transcription and replication showed a sharp separation between bacteria, archaea and eukaryotes, with representatives of each of these primary domains of life forming a tight cluster (Fig. 3B); the metabolic functions produced a more complex picture, with a separation of free-living and parasitic bacteria and grouping of yeast with the former (Fig. 3C).
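The following Python sketch shows one way to carry out such a projection on a 1/0 presence matrix. It is not the code behind the web tool: it uses only the standard library and, as an implementation choice of this sketch, finds the leading components by power iteration on the species-by-species Gram matrix of the centered data, which is practical because there are far fewer species than COGs. The function name and the toy matrix are illustrative.

```python
from math import sqrt


def pca_scores(matrix, n_components=2, n_iter=500):
    """Project species (rows of a 0/1 matrix, one column per COG) onto the
    leading principal components; returns scores[component][species]."""
    n, d = len(matrix), len(matrix[0])
    # Center each COG column so that the species cloud has zero mean.
    means = [sum(row[c] for row in matrix) / n for c in range(d)]
    X = [[row[c] - means[c] for c in range(d)] for row in matrix]
    # Gram matrix of the centered rows (species by species).
    G = [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(n)] for i in range(n)]
    scores = []
    for _ in range(n_components):
        v = [1.0 + 0.1 * i for i in range(n)]  # fixed, non-constant start vector
        for _ in range(n_iter):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient numerator; v has unit length unless G is zero.
        lam = sum(v[i] * sum(G[i][j] * v[j] for j in range(n)) for i in range(n))
        scores.append([sqrt(max(lam, 0.0)) * x for x in v])
        # Deflate: remove the component just found before the next pass.
        G = [[G[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return scores


# Toy presence/absence matrix: 4 species (rows) by 6 COGs (columns).
presence = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
pc = pca_scores(presence, n_components=1)
```

In this toy matrix the first two species share one set of COGs and the last two share another, so the first component places the two pairs on opposite sides of the origin. The sign of each component is arbitrary.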



Figure 3. Classification of genomes by co-occurrence in COGs using PCA. (A) All COGs. (B) Translation, transcription and replication (functional categories J, K and L). (C) Metabolism (functional categories C, E, F, G, H and I).

 
Integration of COGs with the Genome Division of Entrez
The COGs are now integrated with the Genomes division of the Entrez system. From the COG pages, the proteins are linked to the Entrez genome view (the ‘Genome’ button) and to the protein neighbor view (the Blink button). Conversely, the Genomes division of Entrez (11) incorporates COG information in several displays. The COG information including the breakdown by the functional categories is presented for each genome, for example: http://www.ncbi.nlm.nih.gov:80/cgi-bin/Entrez/coxik?gi=131. The main page for each genome includes a (usually) circular genome map, with radial lines corresponding to genes color-coded according to the functional categories adopted in the COG system. Additionally, for all proteins that belong to COGs, the protein view is linked to the respective COG.


    THE COG WORLDWIDE WEB SITE
The COG database is accessible at http://www.ncbi.nlm.nih.gov/COG. The site includes the following main features: a complete list of all COGs hyperlinked to individual COG pages; COGs organized by functional category; COGs organized by functional complexes and pathways; an interactive matrix of co-occurrence of genomes in COGs; a phylogenetic pattern search tool; a principal component classification tool; COGNITOR; a COG Help page. Each of the individual COG pages is hyperlinked to: (i) pictorial representations of BLAST search outputs for each member of the COG, which also include links to the respective GenBank and Entrez-Genomes entries, (ii) a multiple alignment of the COG members produced automatically by using the ClustalW program, (iii) a COG-Info page (reached by clicking on the COG number). The supplement to the COGs, which shows proteins from C.elegans and D.melanogaster assigned to each COG, is accessible at http://www.ncbi.nlm.nih.gov/COG/euk. The COG data set is also available by anonymous ftp at ftp://ncbi.nlm.nih.gov/pub/COG.


    ACKNOWLEDGEMENTS
 
The authors are grateful to David Lipman for his critical contribution at the initial stage of the COG project and constant support and inspiration and to Vivek Anantharaman, L. Aravind, Kira Makarova, Igor Rogozin and Yuri Wolf for helpful suggestions.


    FOOTNOTES
 
* To whom correspondence should be addressed. Tel: +1 301 435 5913; Fax: +1 301 480 9241; Email: koonin{at}ncbi.nlm.nih.gov


    REFERENCES

    1 Tatusov,R.L., Koonin,E.V. and Lipman,D.J. (1997) A genomic perspective on protein families. Science, 278, 631–637.

    2 Fitch,W.M. (1970) Distinguishing homologous from analogous proteins. Syst. Zool., 19, 99–106.

    3 Tatusov,R.L., Galperin,M.Y., Natale,D.A. and Koonin,E.V. (2000) The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res., 28, 33–36.

    4 Kawarabayasi,Y., Hino,Y., Horikawa,H., Yamazaki,S., Haikawa,Y., Jin-no,K., Takahashi,M., Sekine,M., Baba,S., Ankai,A. et al. (1999) Complete genome sequence of an aerobic hyper-thermophilic crenarchaeon, Aeropyrum pernix K1. DNA Res., 6, 83–101.

    5 Natale,D.A., Shankavaram,U.T., Galperin,M.Y., Wolf,Y.I., Aravind,L. and Koonin,E.V. (2000) Genome annotation using clusters of orthologous groups of proteins (COGs) – towards understanding the first genome of a Crenarchaeon. Genome Biol., 1, 0009.1–0009.19.

    6 The C.elegans Sequencing Consortium. (1998) Genome sequence of the nematode C. elegans: a platform for investigating biology. Science, 282, 2012–2018.

    7 Adams,M.D., Celniker,S.E., Holt,R.A., Evans,C.A., Gocayne,J.D., Amanatides,P.G., Scherer,S.E., Li,P.W., Hoskins,R.A., Galle,R.F. et al. (2000) The genome sequence of Drosophila melanogaster. Science, 287, 2185–2195.

    8 Doolittle,W.F. (1998) You are what you eat: a gene transfer ratchet could account for bacterial genes in eukaryotic nuclear genomes. Trends Genet., 14, 307–311.

    9 Natale,D.A., Galperin,M.Y., Tatusov,R.L. and Koonin,E.V. (2000) Using the COG database to improve gene recognition in complete genomes. Genetica, 108, 9–17.

    10 Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

    11 Tatusova,T.A., Karsch-Mizrachi,I. and Ostell,J.A. (1999) Complete genomes in WWW Entrez: data representation and analysis. Bioinformatics, 15, 536–543.

