DBD: a transcription factor prediction database
MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
*To whom correspondence should be addressed. Tel: +44 1223 402479; Fax: +44 1223 213556; Email: skk{at}mrc-lmb.cam.ac.uk
Received August 12, 2005. Revised October 24, 2005. Accepted October 24, 2005.
Regulation of gene expression influences almost all biological processes in an organism; sequence-specific DNA-binding transcription factors are critical to this control. For most genomes, the repertoire of transcription factors is only partially known. Hitherto, transcription factor identification has been largely based on genome annotation pipelines that use pairwise sequence comparisons, which detect only those factors similar to known genes, or on functional classification schemes that amalgamate many types of proteins into the category of transcription factor. Using a novel transcription factor identification method, the DBD transcription factor database fills this void, providing genome-wide transcription factor predictions for organisms from across the tree of life. The prediction method behind DBD identifies sequence-specific DNA-binding transcription factors through homology using profile hidden Markov models (HMMs) of domains. Thus, it is limited to factors that are homologous to those HMMs. The collection of HMMs is taken from two existing databases (Pfam and SUPERFAMILY), and is limited to models that exclusively detect transcription factors that specifically recognize DNA sequences. It does not include basal transcription factors or chromatin-associated proteins, for instance. Based on comparison with experimentally verified annotation, the prediction procedure is between 95% and 99% accurate. Between one-quarter and one-half of our genome-wide predicted transcription factors represent previously uncharacterized proteins. The DBD (www.transcriptionfactor.org) consists of predicted transcription factor repertoires for 150 completely sequenced genomes, their domain assignments and the hand-curated list of DNA-binding domain HMMs. Users can browse, search or download the predictions by genome, domain family or sequence identifier, view families of transcription factors based on domain architecture and receive predictions for a protein sequence.
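The core idea of the prediction step, keeping only proteins with a significant match to a curated DNA-binding domain model, can be illustrated with a minimal sketch. This is not the DBD implementation: the model identifiers, the E-value cutoff and the function name below are all hypothetical, and the domain hits are assumed to have been produced beforehand by an HMM search tool.

```python
from collections import defaultdict

# Hypothetical curated list of DNA-binding domain model identifiers.
# The real DBD list is hand curated from Pfam and SUPERFAMILY models.
CURATED_DBD_MODELS = {"model_homeobox", "model_zinc_finger"}

# Hypothetical significance cutoff; the abstract does not state one.
EVALUE_CUTOFF = 1e-3


def predict_transcription_factors(hits, curated=CURATED_DBD_MODELS,
                                  cutoff=EVALUE_CUTOFF):
    """Return {protein_id: sorted list of curated models it matches}.

    `hits` is an iterable of (protein_id, model_id, evalue) tuples,
    assumed to come from a prior HMM search over a proteome.
    """
    predicted = defaultdict(set)
    for protein_id, model_id, evalue in hits:
        # Keep a hit only if the model is on the curated list
        # and the match is significant under the assumed cutoff.
        if model_id in curated and evalue <= cutoff:
            predicted[protein_id].add(model_id)
    return {protein: sorted(models) for protein, models in predicted.items()}


# Made-up example hits, for illustration only.
hits = [
    ("protA", "model_homeobox", 1e-20),    # curated model, significant
    ("protA", "model_kinase", 1e-30),      # significant, but not curated
    ("protB", "model_kinase", 1e-15),      # not a DNA-binding model
    ("protC", "model_zinc_finger", 0.5),   # curated model, not significant
]
predictions = predict_transcription_factors(hits)
```

In this toy input only `protA` is retained, which mirrors the limitation stated above: a protein is predicted only if it is homologous to one of the curated models.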



