Nucleic Acids Research
Protein sequence similarity searches using patterns as seeds
Introduction
The PHI-BLAST Algorithm
Statistical Analysis
Implementation And Examples
CED4-like cell death regulators
HS90-type ATPase domains
Archaeal tRNA nucleotidyltransferases
Archaeal homologs of DnaG-type DNA primases
Performance Evaluation
Conclusion
Note
Acknowledgements
References
ABSTRACT
INTRODUCTION
In the analysis of a protein or DNA sequence, particular interest often focuses upon a small region, domain or sequence pattern. A natural question is whether there are other related sequences that share the same pattern. The most widely used tools for sequence similarity search allow matching between arbitrary regions of the query and database sequences (1-5). In contrast, many motif-based search methods seek database sequences that match a pre-specified pattern (6-12). If this pattern is too weak, or not specified with sufficient precision, the number of matches may be very large, most being of no biological relevance. On the other hand, an overly specific pattern may exclude many sequences of interest.
We describe here the pattern-hit initiated BLAST (PHI-BLAST) program, whose hybrid strategy addresses a type of question frequently asked by researchers: namely, is a particular pattern seen in a protein of interest likely to be functionally relevant, or does it occur simply by chance? To address this question, we combine a pattern search with a search for statistically significant sequence similarity. These two approaches were combined previously in a program that explored the output of a BLAST search for conserved patterns (10). PHI-BLAST implements a reverse strategy that is computationally more efficient, and that we believe will be of greater utility. Specifically, the similarity search is restricted to the subset of the sequence database consisting of the sequences that contain the given pattern.
The input to PHI-BLAST consists of a protein or DNA sequence, along with a specific pattern occurring at least once within the sequence. The pattern is currently required to be a sequence of residues or sets of residues, with 'wild cards' and variable spacing allowed; all PROSITE patterns (12), for example, have this form. For each match between an instance of the pattern in the query sequence and an instance in a database sequence, PHI-BLAST constructs a high-scoring local alignment that includes the match. All resulting alignments are sorted by score and evaluated statistically.
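As a rough illustration of the pattern form described above, the sketch below converts a PROSITE-style pattern into a Python regular expression and lists every (possibly overlapping) instance in a sequence. The function names and the example sequence are hypothetical, and the PROSITE P-loop pattern [AG]-x(4)-G-K-[ST] is used only as a familiar example; PHI-BLAST itself does not rely on a regular-expression engine.

```python
import re

# One PROSITE element: optional '<' anchor, a residue / set / excluded set / x,
# an optional repeat count "(n)" or "(n,m)", and an optional '>' anchor.
_ELEMENT = re.compile(
    r"(<?)(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)"
)

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into an equivalent regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError("unsupported pattern element: %r" % element)
        start, core, lo, hi, end = m.groups()
        if core == "x":                      # wild card: any residue
            core = "."
        elif core.startswith("{"):           # {ABC}: any residue except A, B, C
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if lo is not None:                   # fixed or variable spacing
            repeat = "{%s}" % lo if hi is None else "{%s,%s}" % (lo, hi)
        parts.append(("^" if start else "") + core + repeat + ("$" if end else ""))
    return "".join(parts)

def find_instances(pattern, sequence):
    """Return the 0-based start of every (possibly overlapping) pattern instance."""
    lookahead = "(?=" + prosite_to_regex(pattern) + ")"
    return [m.start() for m in re.finditer(lookahead, sequence)]

# Hypothetical toy sequence containing one P-loop-like instance.
hits = find_instances("[AG]-x(4)-G-K-[ST]", "MAGESGSGKSTLL")
```

A zero-width lookahead is used so that overlapping instances are all reported, since each instance may seed its own alignment.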
This approach has greatest utility when it is suspected that a few residues comprising a small motif may be crucial for the biological function of interest. Showing that this pattern occurs within an extended and statistically significant alignment of the query sequence with one or more database sequences greatly reduces the likelihood that the pattern is spurious. Conversely, insisting on the presence of the pattern and hence searching a reduced sequence space may aid the detection of subtle similarities that blend into the background noise in a regular BLAST search.
THE PHI-BLAST ALGORITHM
To search for matches to a given pattern, we adapted a method of Baeza-Yates and Gonnet (13) and Wu and Manber (14). This method permits simple patterns to be represented in a single computer word and matches to be found very efficiently. When the pattern is relatively complex, for example consisting of many rigid parts and/or having wide ranges of spacer lengths, our program first searches for the rigid part that is least likely to match by chance alone, and then performs local searches for the remaining pattern elements.
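The single-computer-word idea can be sketched with the classic shift-and technique, in the spirit of the methods cited above. This is a minimal illustration restricted to fixed-length patterns (no variable spacing); the names, the pattern encoding and the toy sequence are assumptions, and the code is not taken from the PHI-BLAST source.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_masks(positions, alphabet=AMINO_ACIDS):
    """For each residue, set bit i if that residue is allowed at pattern position i.

    `positions` is a list whose entries are either a set of allowed residues
    or None for a wild card.
    """
    masks = {residue: 0 for residue in alphabet}
    for i, allowed in enumerate(positions):
        for residue in alphabet:
            if allowed is None or residue in allowed:
                masks[residue] |= 1 << i
    return masks

def shift_and_search(sequence, positions, alphabet=AMINO_ACIDS):
    """Return 0-based start positions of all matches using bit-parallel shift-and."""
    masks = build_masks(positions, alphabet)
    m = len(positions)
    accept = 1 << (m - 1)       # bit set when the whole pattern has matched
    state = 0                   # bit i set: pattern[0..i] matches ending here
    hits = []
    for j, residue in enumerate(sequence):
        # Extend every partial match by one residue and start a new one.
        state = ((state << 1) | 1) & masks.get(residue, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits

# P-loop-like pattern [AG]-x(4)-G-K-[ST] written out position by position.
P_LOOP = [{"A", "G"}, None, None, None, None, {"G"}, {"K"}, {"S", "T"}]
matches = shift_and_search("MAGESGSGKSTLL", P_LOOP)
```

Each sequence residue costs one shift, one OR and one AND, which is why short patterns that fit in a machine word can be scanned so quickly. Python integers are unbounded, so the word-size limit does not show up in this sketch.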
For each instance of the input pattern in a database sequence, paired with an instance in the query, PHI-BLAST attempts to find the optimal local alignment (1,15) containing the aligned patterns. This can be done rigorously by applying dynamic programming (16,17) to the parts of the two sequences preceding and the parts following the pattern. The alignment returned is required to begin at the corner of the path graph, but is permitted to end anywhere within the graph. The difficulty with this approach is that, to guarantee optimality, a very large portion of the path graph needs to be searched, and this requires inordinate time in a database search (18). Accordingly, we have used the gapped extension heuristic described in Altschul et al. (5) and Zhang et al. (18). Basically, path graph cells are considered only if the score of the best alignment leading into them falls no more than X below the best score yet found. For sufficiently large values of the X parameter, this approach almost always returns the optimal local alignment.
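The X-dropoff rule can be illustrated with a deliberately simplified extension routine. The sketch assumes a plain match/mismatch score and a linear gap cost instead of a substitution matrix with affine gaps, and all names and parameter values are hypothetical. It finds the best alignment that starts at the corner of the path graph and ends anywhere, discarding cells whose score falls more than X below the best score found so far.

```python
NEG_INF = float("-inf")

def xdrop_extend(a, b, x_drop, match=2, mismatch=-1, gap=-2):
    """Best score of an alignment of prefixes of a and b that starts at (0, 0).

    Cells scoring more than x_drop below the best score seen so far are pruned.
    """
    best = 0
    prev = [NEG_INF] * (len(b) + 1)
    prev[0] = 0
    for j in range(1, len(b) + 1):          # first row: gaps only
        s = prev[j - 1] + gap
        prev[j] = s if s >= best - x_drop else NEG_INF
    for i in range(1, len(a) + 1):
        cur = [NEG_INF] * (len(b) + 1)
        s = prev[0] + gap                   # first column: gaps only
        cur[0] = s if s >= best - x_drop else NEG_INF
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            s = max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
            if s >= best - x_drop:          # keep the cell only if within X of best
                cur[j] = s
                best = max(best, s)
        if all(v == NEG_INF for v in cur):  # nothing left to extend
            break
        prev = cur
    return best

# A small X stops at the mismatch run; a larger X crosses it.
short_reach = xdrop_extend("AATTTTTTAAAA", "AACCCCCCAAAA", x_drop=3)
long_reach = xdrop_extend("AATTTTTTAAAA", "AACCCCCCAAAA", x_drop=10)
```

In this toy case the smaller X returns the score of the initial two matches only, while the larger X recovers the higher-scoring alignment that continues through the mismatched stretch, mirroring the statement that large X almost always returns the optimal alignment.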
Because PHI-BLAST performs a gapped extension whenever an instance of the input pattern is encountered in the database, reasonable execution times depend upon such instances being relatively rare. Therefore, we allow only patterns that are expected to occur less frequently than once per 5000 database residues. Any pattern that contains four completely specified residues, or three specified residues whose average background frequency is < 5.8%, passes this test. Of course, the more specific the input pattern, the faster PHI-BLAST will run. The frequency with which a pattern will occur within the database can be estimated easily (19) from background amino acid frequencies (20).
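The frequency test can be approximated as follows. This sketch treats pattern positions as independent, handles only fixed-length patterns (variable spacing would increase the estimate), and uses a uniform 5% background purely as a placeholder; it does not reproduce the background frequencies of reference (20). All names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Placeholder background: every residue equally likely (an assumption).
UNIFORM_BACKGROUND = {residue: 0.05 for residue in AMINO_ACIDS}

# Threshold from the text: fewer than one expected hit per 5000 residues.
MAX_PATTERN_FREQUENCY = 1.0 / 5000

def pattern_probability(positions, background):
    """Probability that a fixed-length pattern matches at a given position.

    `positions` holds a set of allowed residues per position, or None for a
    wild card, which matches with probability 1.
    """
    prob = 1.0
    for allowed in positions:
        if allowed is not None:
            prob *= sum(background[residue] for residue in allowed)
    return prob

def is_pattern_allowed(positions, background=UNIFORM_BACKGROUND):
    """True if the pattern is expected less often than once per 5000 residues."""
    return pattern_probability(positions, background) < MAX_PATTERN_FREQUENCY

# P-loop-like pattern [AG]-x(4)-G-K-[ST].
P_LOOP = [{"A", "G"}, None, None, None, None, {"G"}, {"K"}, {"S", "T"}]
p_loop_ok = is_pattern_allowed(P_LOOP)
```

The three-residue rule in the text follows from the same arithmetic: 0.058 cubed is about 1.95 × 10^-4, just under 1/5000, and for a fixed average frequency the product of three frequencies is largest when they are equal.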
STATISTICAL ANALYSIS
An alignment $A$ produced by PHI-BLAST may be divided into three parts: the region $A_0$ spanned by the input pattern, and the local alignments $A_1$ and $A_2$ produced to either side of $A_0$ by the gapped extension routine. Either or both of $A_1$ and $A_2$ may be empty. Correspondingly, the score $S$ of the alignment may be divided into the scores $S_0$, $S_1$ and $S_2$. For the purpose of statistical analysis, it is easiest to assume that all alignment regions $A_0$ that satisfy the input pattern are of equal biological plausibility, and therefore to ignore their scores. Accordingly, each alignment produced by PHI-BLAST is ranked by its reduced score $S' = S_1 + S_2$. For a given value $x$, we wish to estimate how many alignments are expected to have a reduced score $S' > x$ purely by chance.
In general, the input pattern is chosen because it is known to correspond to some feature of biological interest. Therefore, we make no statistical inference from the number of times the pattern is observed to occur within the query sequence ($n_q$) and the database as a whole ($n_d$). We simply record $N = n_q n_d$, the number of distinct pattern pairs that may seed a PHI-BLAST local alignment.
The simplest model of protein sequences is as random strings of amino acids, chosen independently with specific background probabilities for the various possible residues. To estimate the random distribution of $S'$, we start by considering the distribution of the scores $S_1$ and $S_2$ of which it is the sum. Each of these scores can be thought of as the result of the gapped extension routine applied to a pair of random sequences. In the limit of large values for the X-dropoff parameter (5,18), $S_1$ is the score of the optimal local alignment required to start at a particular point $P$. The much-studied Smith-Waterman alignment score (1) is just this constrained local alignment score, maximized over all path graph points $P$. The distribution of Smith-Waterman scores has been established empirically to follow an extreme value distribution, whose scale or decay parameter $\lambda$ does not change with increasing search space sizes (4,21-24). This implies (25) that the distribution of $S_1$ should have an exponential tail, with decay parameter $\lambda$ equal to that of the extreme value distribution for Smith-Waterman scores. Some simple calculus then yields that for sufficiently large scores $x$, the distribution of $S' = S_1 + S_2$ has the form $\mathrm{Prob}(S' > x) \approx C(\lambda x + 1)e^{-\lambda x}$ for some constant $C$. The scores of optimal local alignments constrained to contain distinct pattern pairs may be correlated, but the expected number of alignments attaining a given score is independent of such correlation. Therefore, the expected number of chance alignments produced by PHI-BLAST with reduced score at least $x$ is
$$E(S' > x) \approx CN(\lambda x + 1)e^{-\lambda x} \qquad (1)$$
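The calculus step can be made explicit under an idealized assumption that is stronger than what the text claims: suppose the two flanking scores are independent and each is exactly exponential with rate $\lambda$, rather than merely having an exponential tail. Conditioning on the value $s$ of the first score then gives

$$
\begin{aligned}
\mathrm{Prob}(S' > x) &= \mathrm{Prob}(S_1 > x) + \int_0^x \lambda e^{-\lambda s}\,\mathrm{Prob}(S_2 > x - s)\,ds \\
&= e^{-\lambda x} + \int_0^x \lambda e^{-\lambda s}\, e^{-\lambda (x - s)}\,ds \\
&= (\lambda x + 1)\,e^{-\lambda x}.
\end{aligned}
$$

This is only a sketch of where the factor $(\lambda x + 1)$ comes from. In the form used in equation 1, the constant $C$ absorbs the departure of the true score distributions from this idealization, and the factor $N$ counts the pattern pairs that can each seed an alignment.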
Tables of $\lambda$ for a variety of amino acid substitution matrices and gap costs have been reported (4), and their validity tested on a large number of protein families (26). The values for $\lambda$ employed here differ slightly from those published previously (4), because we have re-estimated $\lambda$ using larger and therefore more accurate simulations. The parameter $C$ of equation 1 is new and requires its own estimation. Random simulation (data not shown) using the background amino acid frequencies of Robinson and Robinson (20) yields $C \approx 0.6$ for the BLOSUM-62 matrix (27) in conjunction with the complete range of affine gap costs useful for standard protein sequence comparison (4). We will consider the validity of equation 1 after discussing several biological examples.
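Equation 1 is straightforward to evaluate numerically. The sketch below is a direct transcription with C defaulting to the value of about 0.6 quoted above; the function name is hypothetical, and the value of lambda in the usage line is a placeholder chosen for illustration, since the re-estimated values are not listed in the text.

```python
import math

def phi_blast_evalue(reduced_score, n_query_hits, n_db_hits, lam, c=0.6):
    """Expected number of chance alignments with reduced score above x (equation 1).

    reduced_score : x, the score outside the pattern region (S1 + S2)
    n_query_hits  : number of pattern instances in the query (n_q)
    n_db_hits     : number of pattern instances in the database (n_d)
    lam           : decay parameter lambda for the scoring system in use
    c             : constant C of equation 1 (about 0.6 per the text)
    """
    n_pairs = n_query_hits * n_db_hits          # N = n_q * n_d
    return c * n_pairs * (lam * reduced_score + 1.0) * math.exp(-lam * reduced_score)

# Placeholder inputs: one query instance, 2000 database instances, lambda = 0.3.
example = phi_blast_evalue(reduced_score=40.0, n_query_hits=1, n_db_hits=2000, lam=0.3)
```

Because N counts pattern pairs rather than database residues, a more specific pattern lowers the E-value of a given reduced score, which is the statistical side of searching a reduced sequence space.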
IMPLEMENTATION AND EXAMPLES
To enhance the utility and functionality of a WWW-based version of PHI-BLAST, we have nested it between two other programs. While one may define a pattern based upon specific knowledge concerning the query sequence, a researcher often wishes to search a pattern-database for any well-characterized motifs the query may contain. To streamline this latter approach, we have implemented a program that first searches the PROSITE database (12) with the query; any patterns found may then be used to launch a PHI-BLAST database search. To facilitate more detailed analysis of PHI-BLAST output, we allow it automatically to serve as the basis for constructing a position-specific score matrix for further database searching via the position-specific iterated BLAST (PSI-BLAST) program (5). Like other BLAST family programs, PHI-BLAST incorporates a pre-filter for protein regions of biased amino acid composition (low complexity) that often corrupt database searches (28,29).
PHI-BLAST may detect subtle relationships that escape standard database similarity searches, but this potential depends upon the specification of an amino acid pattern likely to be conserved within the protein family of interest. We discuss four examples involving protein families whose original description depended critically upon detecting relatively weak sequence similarities. In each case, PHI-BLAST reports a subtle but structurally and functionally relevant relationship. The alignments suggesting these relationships are not all statistically significant but, in each database search output ranked by E-value, they appear immediately after the alignments involving clear family members, thereby prompting further analysis. In contrast, any of these similarities reported by gapped BLAST (5) are preceded by a number of alignments with smaller E-values involving unrelated sequences. The four examples discussed below are summarized in Table 1. All searches were performed on the non-redundant (NR) protein sequence database maintained by the NCBI (30).
Table 1. Detection of subtle protein sequence relationships using PHI-BLAST
CED4-like cell death regulators
The Caenorhabditis elegans protein CED4 is a regulator of programmed cell death (apoptosis). CED4 contains the classical P-loop motif involved in phosphate binding and found in a great variety of ATPases and GTPases. ATP binding by CED4, and the role of ATP in its function, have been demonstrated (31,32). In a gapped BLAST search of the NR database, CED4 shows statistically significant sequence similarity to only one protein, the human apoptosis regulator Apaf-1, in which the P-loop is conserved (33,34). However, when PHI-BLAST is used, requiring conservation of the P-loop (Table 1), the best hit after Apaf-1, with E-value 0.038, is to a plant disease resistance protein, Arabidopsis thaliana T7N9.18 (35). Further sequence comparison shows that animal apoptosis regulators and putative plant ATPases involved in disease resistance share several conserved motifs, suggesting that they have a common origin and may have similar roles in programmed cell death (L.Aravind, V.M.Dixit and E.V.Koonin, unpublished observations). Before the Apaf-1 sequence became available, this conclusion had been reached through a laborious comparison of CED4 to a large number of different ATPases (32). Because the Apaf-1 sequence is highly similar to homologous plant proteins, the connection between CED4 and the plant proteins can be easily demonstrated by iterative database search (5). Even without Apaf-1, however, PHI-BLAST is able to establish this link immediately.
HS90-type ATPase domains
We used PHI-BLAST to investigate the subtle but structurally validated relationship between the ATPase domains in the MutL DNA repair proteins, type II topoisomerases, histidine kinases and HS90 family proteins (36,37). The output identified a new family of eukaryotic proteins that contain the same type of predicted ATPase domain, but that in standard database searches do not show significant similarity to any known member of the superfamily. A PHI-BLAST search with the Escherichia coli MutL protein (38) as query showed moderate similarity (E-value 0.017) to the C.elegans protein ZC155.3 (39) that was originally described as having 'weak similarity to Bovine synaptocanalin I'. Subsequent database searches with this worm protein sequence as query revealed homologs in humans (KIAA0136) (40) and plants (41,42), whereas a PHI-BLAST search also showed convincing similarity to MutL family members (best E-value $6 \times 10^{-5}$). Elucidation of the function of this new family of eukaryotic ATP-utilizing enzymes will be of considerable interest; the synaptocanalin domain apparently was fused to the worm protein by exon misassembly.
Archaeal tRNA nucleotidyltransferases
The archaeal tRNA nucleotidyltransferases (Cca) are a distinct family of nucleic acid polymerases (43) that in standard database searches do not have detectable similarity to any proteins other than orthologs from other archaeal species. However, they do contain a conserved motif, with two aspartate residues, that resembles the catalytic sites of many other polymerases (44). When this pattern (Table 1) is specified in a PHI-BLAST search with Methanococcus jannaschii Cca (45) as query, the top hit outside the archaeal Cca family itself, with E-value 0.061, is to hypothetical protein AF0299 from Archaeoglobus fulgidus (46), which belongs to a previously described archaeal family of predicted nucleotidyltransferases (47); the third hit (E-value 0.13) is to an experimentally characterized streptomycin 3″-adenylyltransferase from Enterococcus faecalis (48).
Table 2. Accuracy of PHI-BLAST statistics
Archaeal homologs of DnaG-type DNA primases
Archaeal homologs of bacterial DNA primases, e.g. M.jannaschii protein MJ1206 (45), contain a motif typical of helicases (47), but do not show significant similarity to these proteins in standard BLAST searches. Using M.jannaschii MJ1206 and the helicase motif as query, the first non-trivial PHI-BLAST hit, with E-value 0.54, is to the well known helicase Neisseria gonorrhoeae UvrB (49). The relevance of the helicase motif in the archaeal primase homologs is supported by an extended alignment with the UvrB helicase (L.Aravind, D.D.Leipe and E.V.Koonin, unpublished observations). The similarities uncovered in this example are undetectable with standard database search techniques.
PERFORMANCE EVALUATION
To test the accuracy of the PHI-BLAST statistics given by equation 1, we used each of the examples above to search 'random databases' constructed from NR by shuffling or reversing each sequence. For each query, the lowest recorded E-value, and the number of alignments found with E-value < 10, are given in Table 2. For the shuffled database, the geometric mean of the observed numbers of sequences with E-value < 10 is 10.0, and no single case diverges from this value by more than a factor of 2.5. This might be expected, as the values of $\lambda$ and $C$ used in equation 1 were calculated employing a random protein model in which all amino acids occur independently. Perhaps surprisingly, Table 2 suggests that under an alternative random protein model, based upon reversed real sequences, these statistics are slightly conservative.
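The two kinds of random database, and the geometric mean used to summarize the counts, can be sketched in a few lines. The function names and the toy sequences are hypothetical; the sketch only shows how such controls might be built and does not reproduce the data of Table 2.

```python
import math
import random

def shuffle_sequence(sequence, rng):
    """Return the residues of one sequence in random order (composition preserved)."""
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)

def randomize_database(sequences, mode, seed=0):
    """Build a control database by shuffling or reversing each sequence."""
    if mode == "reverse":
        return [s[::-1] for s in sequences]
    if mode == "shuffle":
        rng = random.Random(seed)   # fixed seed so the control is reproducible
        return [shuffle_sequence(s, rng) for s in sequences]
    raise ValueError("mode must be 'shuffle' or 'reverse'")

def geometric_mean(counts):
    """Geometric mean of positive counts, e.g. hits with E-value < 10 per query."""
    return math.exp(sum(math.log(c) for c in counts) / len(counts))

toy_db = ["MKVLATGKS", "GGSTKDDLE"]
reversed_db = randomize_database(toy_db, "reverse")
shuffled_db = randomize_database(toy_db, "shuffle", seed=1)
```

Shuffling destroys all positional correlation while keeping each sequence's composition, which matches the independence model behind equation 1. Reversal keeps local composition and residue-to-residue correlations, so it probes how well the statistics hold for a more realistic null model.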
To compare the speed of PHI-BLAST to that of a standard gapped BLAST program (5) we timed both for searches of each of the four examples above against the NR database. Analysis of the results (Table 3) suggests that on the computer system used, ~8 s of each PHI-BLAST run were required to scan the database for pattern hits and for system overhead; the remainder was spent on constructing gapped extensions for all pattern hits found. Clearly, the number of hits generated by the input pattern is a key determinant of PHI-BLAST's speed. For relatively informative patterns PHI-BLAST is very fast, requiring not much more time than that needed to search for pattern hits. For relatively weak patterns, PHI-BLAST expends most of its effort extending hits, and can require time comparable to that for gapped BLAST.
Table 3. Execution speed of PHI-BLAST
CONCLUSION
As illustrated by the biological examples discussed above, PHI-BLAST helps both to ascertain the biological relevance of patterns detected within protein sequences, and in some cases to detect subtle similarities that escape a regular BLAST search. We note, however, that PHI-BLAST was specifically designed to combine pattern search with the search for statistically significant sequence similarity, rather than to maximize search sensitivity. Thus in general one should not expect PHI-BLAST, which by its nature is a single-pass search method, to be more sensitive than PSI-BLAST (5). Furthermore, within proteins, residues that are absolutely conserved during evolution constitute a small minority, and even specifying a restricted set of possibilities for a given residue position often excludes many members of a protein family. PHI-BLAST therefore is not the ideal tool for completely delineating a class of related proteins. However, by greatly restricting the size of the search space, PHI-BLAST can allow the similarities of some distant homologs to rise above the background noise that would otherwise obscure them. Such findings can be used subsequently for more extensive family analysis using PSI-BLAST (5) or other tools.
We have developed PHI-BLAST for protein-protein comparison, but plan to extend its applicability. A version that translates a DNA database in all six reading frames for comparison to a protein query would be particularly valuable, and a DNA-DNA comparison version should also find use. We also plan to extend PHI-BLAST so that it may use generalized affine gap costs (50) in place of the traditional affine gap costs (51-54) currently permitted.
Note
Source code for PHI-BLAST is available by anonymous ftp from the machine ncbi.nlm.nih.gov, within the directory 'blast', and the program may be run from NCBI's web site at http://www.ncbi.nlm.nih.gov/
ACKNOWLEDGEMENTS
Z.Z. and W.M. are supported by grant LM05110 from the National Library of Medicine. We thank Dr L. Aravind for helpful discussions.
REFERENCES
This article has been cited by other articles:
Last modification: 14 Aug 1998
Copyright©Oxford University Press, 1998.
A. Andreeva and H. Tidow. A novel CHHC Zn-finger domain found in spliceosomal proteins and tRNA modifying enzymes. Bioinformatics, October 15, 2008; 24(20): 2277-2280.
M. E. Ortiz-Soto, M. Rivera, E. Rudino-Pinera, C. Olvera, and A. Lopez-Munguia. Selected mutations in Bacillus subtilis levansucrase semi-conserved regions affecting its biochemical properties. Protein Eng. Des. Sel., October 1, 2008; 21(10): 589-595.
N. Shu, T. Zhou, and S. Hovmoller. Prediction of zinc-binding sites in proteins from sequence. Bioinformatics, March 15, 2008; 24(6): 775-782.
D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler. GenBank. Nucleic Acids Res., January 11, 2008; 36(suppl_1): D25-D30.
A. Heger, S. Mallick, C. Wilton, and L. Holm. The global trace graph, a novel paradigm for searching protein sequence databases. Bioinformatics, September 15, 2007; 23(18): 2361-2367.
D. W. Brown, R. A. E. Butchko, M. Busman, and R. H. Proctor. The Fusarium verticillioides FUM Gene Cluster Encodes a Zn(II)2Cys6 Protein That Affects FUM Gene Expression and Fumonisin Production. Eukaryot. Cell, July 1, 2007; 6(7): 1210-1218.
J. S. Papadopoulos and R. Agarwala. COBALT: constraint-based alignment tool for multiple protein sequences. Bioinformatics, May 1, 2007; 23(9): 1073-1079.
D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler. GenBank. Nucleic Acids Res., January 12, 2007; 35(suppl_1): D21-D25.
L. Hao, J. Klein, and M. Nei. Heterogeneous but conserved natural killer receptor gene complexes in four major orders of mammals. PNAS, February 28, 2006; 103(9): 3192-3197.
C. B. Thomas and R. I. Gumport. Dimerization of the bacterial RsrI N6-adenine DNA methyltransferase. Nucleic Acids Res., February 6, 2006; 34(3): 806-815.
D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler. GenBank. Nucleic Acids Res., January 1, 2006; 34(suppl_1): D16-D20.
G. Pugalenthi, A. Bhaduri, and R. Sowdhamini. iMOTdb--a comprehensive collection of spatially interacting motifs in proteins. Nucleic Acids Res., January 1, 2006; 34(suppl_1): D285-D286.
T. Bickel, L. Lehle, M. Schwarz, M. Aebi, and C. A. Jakob. Biosynthesis of Lipid-linked Oligosaccharides in Saccharomyces cerevisiae: Alg13p AND Alg14p FORM A COMPLEX REQUIRED FOR THE FORMATION OF GlcNAc2-PP-DOLICHOL. J. Biol. Chem., October 14, 2005; 280(41): 34500-34506.
S. Chakrabarti, A. P. Anand, N. Bhardwaj, G. Pugalenthi, and R. Sowdhamini. SCANMOT: searching for similar sequences using a simultaneous scan of multiple sequence motifs. Nucleic Acids Res., July 1, 2005; 33(suppl_2): W274-W276.
M. G. Bowden, W. Chen, J. Singvall, Y. Xu, S. J. Peacock, V. Valtulina, P. Speziale, and M. Hook. Identification and preliminary characterization of cell-wall-anchored proteins of Staphylococcus epidermidis. Microbiology, May 1, 2005; 151(5): 1453-1464.
S. G. Conticello, C. J. F. Thomas, S. K. Petersen-Mahrt, and M. S. Neuberger. Evolution of the AID/APOBEC Family of Polynucleotide (Deoxy)cytidine Deaminases. Mol. Biol. Evol., February 1, 2005; 22(2): 367-377.
D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler. GenBank. Nucleic Acids Res., January 1, 2005; 33(suppl_1): D34-D38.
G. Pugalenthi, A. Bhaduri, and R. Sowdhamini. GenDiS: Genomic Distribution of protein structural domain Superfamilies. Nucleic Acids Res., January 1, 2005; 33(suppl_1): D252-D255.
M. B. Lobocka, D. J. Rose, G. Plunkett III, M. Rusin, A. Samojedny, H. Lehnherr, M. B. Yarmolinsky, and F. R. Blattner. Genome of Bacteriophage P1. J. Bacteriol., November 1, 2004; 186(21): 7032-7068.
I. Alam, A. Dress, M. Rehmsmeier, and G. Fuellen. Comparative homology agreement search: An effective combination of homology-search methods. PNAS, September 21, 2004; 101(38): 13814-13819.
S. McGinnis and T. L. Madden. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res., July 1, 2004; 32(suppl_2): W20-W25.
Q.-l. Wang, S. Chen, N. Esumi, P. K. Swain, H. S. Haines, G. Peng, B. M. Melia, I. McIntosh, J. R. Heckenlively, S. G. Jacobson, et al. QRX, a novel homeobox gene, modulates photoreceptor gene expression. Hum. Mol. Genet., May 15, 2004; 13(10): 1025-1040.
D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler. GenBank: update. Nucleic Acids Res., January 1, 2004; 32(90001): D23-26.
A. Bhaduri and R. Sowdhamini. A genome-wide survey of human tyrosine phosphatases. Protein Eng. Des. Sel., December 1, 2003; 16(12): 881-888.
L. Papazisi, T. S. Gorton, G. Kutish, P. F. Markham, G. F. Browning, D. K. Nguyen, S. Swartzell, A. Madan, G. Mahairas, and S. J. Geary. The complete genome sequence of the avian pathogen Mycoplasma gallisepticum strain Rlow. Microbiology, September 1, 2003; 149(9): 2307-2316.
K. M. McGinnis, S. G. Thomas, J. D. Soule, L. C. Strader, J. M. Zale, T.-p. Sun, and C. M. Steber. The Arabidopsis SLEEPY1 Gene Encodes a Putative F-Box Subunit of an SCF E3 Ubiquitin Ligase. PLANT CELL, May 1, 2003; 15(5): 1120-1130.
G. K-W. Kong, G. Polekhina, W. J. McKinstry, M. W. Parker, B. Dragani, A. Aceto, D. Paludi, D. R. Principe, B. Mannervik, and G. Stenberg. Contribution of Glycine 146 to a Conserved Folding Module Affecting Stability and Refolding of Human Glutathione Transferase P1-1. J. Biol. Chem., January 3, 2003; 278(2): 1291-1302.
D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, and D. L. Wheeler. GenBank. Nucleic Acids Res., January 1, 2003; 31(1): 23-27.
H. van der Wel, H. R. Morris, M. Panico, T. Paxton, A. Dell, L. Kaplan, and C. M. West. Molecular Cloning and Expression of a UDP-N-acetylglucosamine (GlcNAc):Hydroxyproline Polypeptide GlcNAc-transferase That Modifies Skp1 in the Cytoplasm of Dictyostelium. J. Biol. Chem., November 22, 2002; 277(48): 46328-46337.
Y.-H. Feng, Y. Sun, and J. G. Douglas. Gbeta gamma -independent constitutive association of Galpha s with SHP-1 and angiotensin II receptor AT2 is essential in AT2-mediated ITIM-independent activation of SHP-1. PNAS, September 17, 2002; 99(19): 12049-12054.
H.-S. Lee, M.-S. Kim, H.-S. Cho, J.-I. Kim, T.-J. Kim, J.-H. Choi, C. Park, H.-S. Lee, B.-H. Oh, and K.-H. Park. Cyclomaltodextrinase, Neopullulanase, and Maltogenic Amylase Are Nearly Indistinguishable from Each Other. J. Biol. Chem., June 7, 2002; 277(24): 21891-21897.
D. A. Benson, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, B. A. Rapp, and D. L. Wheeler. GenBank. Nucleic Acids Res., January 1, 2002; 30(1): 17-20.
I. Tatsuno, M. Horie, H. Abe, T. Miki, K. Makino, H. Shinagawa, H. Taguchi, S. Kamiya, T. Hayashi, and C. Sasakawa
toxB Gene on pO157 of Enterohemorrhagic Escherichia coli O157:H7 Is Required for Full Epithelial Cell Adherence Phenotype<