ABSTRACT
The polypyrimidine tract is one of the important cis-acting sequence elements directing intron removal in pre-mRNA splicing. Progressive deletions of the polypyrimidine tract
have been found to abolish correct lariat formation, spliceosome assembly and
splicing. In addition, the polypyrimidine tract can alter 3'-splice site selection by promoting alternative branch site selection.
However, there appears to be great flexibility in the specific sequence of a
given tract. Both the optimal composition of the polypyrimidine tract and
the role of the tract in introns that have no apparent polypyrimidine tract, or in which changes in the tract are apparently
harmless, remain uncertain. Accordingly, we have designed a series of cis-competition splicing constructs to test the functional competitive
efficiency of a variety of systematically mutated polypyrimidine tracts. An
RT/PCR assay was used to detect spliced product formation as a result of
differential branch point selection dependent on direct competition between two opposing polypyrimidine tracts. We found that pyrimidine tracts containing
11 continuous uridines are the strongest pyrimidine tracts. In such cases, the
position of the uridine stretch between the branch point and 3'-splice site AG is unimportant. In contrast, decreasing the continuous
uridine stretch to five or six residues requires that the tract be located
immediately adjacent to the AG for optimal competitive efficiency. The block to
splicing with decreasing polypyrimidine tract strength is primarily prior to
the first step of splicing. While lengthy continuous uridine tracts are the
most competitive, tracts with decreased numbers of consecutive uridines and
even tracts with alternating purine/pyrimidine residues can still function to promote branch point selection, but are far less effective competitors in 3'-splice site selection assays.
INTRODUCTION
Eukaryotic pre-mRNA splicing involves the excision of introns from nascent transcripts
and the ligation of exons forming mature mRNA (reviewed in 1-4). Accurate removal of introns takes place in a two step reaction that requires several cis-acting sequences and trans-acting factors. The trans-acting factors that recognize the conserved cis sequences form a large protein-RNA complex termed the spliceosome. Five small nuclear RNAs and their
associated proteins (U1, U2, U4, U5 and U6 snRNPs) constitute the major
components of the spliceosome (5,6), in addition to a large number of non-snRNP proteins (7,8). Spliceosomes are dynamic structures that assemble in a step-wise fashion (for reviews see 9-11).
The cis sequences required for splicing of the major class of mammalian introns include
a consensus 5'-splice site, a branch point with an adjacent polypyrimidine tract
and the consensus 3'-splice site. The two dinucleotides that define the 5'- and 3'-boundaries of introns are invariant and,
when they are mutated, splicing is greatly reduced or completely abolished. However, the requirement for some of the other conserved sequences appears to be quite flexible, at
least in metazoans. Studies have shown that branch sites with perfect
complementarity to a region of U2 snRNA are optimal for utilization (12-14). Indeed, in yeast (Saccharomyces cerevisiae) a strictly conserved consensus sequence surrounds the branch point (UACUAAC, in which the 3'-most adenosine is the branch point) that is optimal for pairing with U2 snRNA (15-17). However, in most metazoan introns, perfect complementarity does not exist and
the sequence context surrounding the branch point becomes important in its utilization (12,13,18,19). In addition, metazoan recognition of branch point sequences can be affected
by the adjacent polypyrimidine tract (18,20-25). Progressive deletions of the polypyrimidine tract have been found to abolish
lariat formation, spliceosome assembly and splicing (18,20,26,27). Not only does the polypyrimidine tract increase the efficiency of branch
point utilization (18,22,28,29), it can also function in the selection of alternative branch sites and thus 3'-splice site recognition (21,22,30-38).
In contrast to mammalian introns, yeast introns have a highly conserved branch point sequence but generally lack a clear polypyrimidine tract.
Nevertheless, it has been found that many yeast introns are enriched for
uridines adjacent to the 3'-splice site AG, particularly at the -9 position preceding the AG (39). Increasing the number of uridines in this region greatly enhances 3'-splice site utilization in yeast (40).
Despite the important role of the polypyrimidine tract in splicing, there
appears to be great flexibility in the specific sequence of a given tract. For
certain substrates, the introduction of purines into the polypyrimidine tract
is detrimental to splicing only if the length of the tract is shortened and if
there is a reduction in the number of consecutive uridine residues (18,23,24). Also, the introduction of purines immediately downstream of the branch point is apparently more detrimental than similar substitutions close to the 3'-splice site (18). In addition, uridine and cytidine do not appear to function equivalently
within a polypyrimidine tract (21,23,25). Consequently, both the optimal composition of the polypyrimidine tract
and the role of the tract in introns that have no recognizable
tract, or in which changes in the polypyrimidine tract are apparently harmless, remain
in question. We designed a series of cis-competition splicing constructs to test the functional competitive
efficiency of a variety of systematically mutated polypyrimidine tracts. An
RT/PCR assay that detects spliced product formation as a result of direct
competition between two opposing polypyrimidine tracts has been employed. Using
constructs that contain variable numbers of continuous uridines, we have found
that if the continuous stretch of uridines is sufficiently long, the position
of the uridine stretch relative to the branch point or 3'-splice site AG is not important. However, if the continuous uridine
stretch is decreased to five or six residues, such tracts are less competitive
and the splicing of such substrates requires that the uridines be directly
adjacent to the 3'-splice site AG. Pyrimidine tracts consisting of alternating
uridines and guanosines can also support splicing, but tracts containing long
continuous stretches of uridine are functionally the strongest competitors.
MATERIALS AND METHODS
Transcripts for in vitro splicing were produced using SP6 polymerase on templates linearized at the BamHI site (44,45). Transcription reactions were performed in 20 µl reactions containing 1× transcription buffer (Promega), 10 mM DTT, 0.5 mM each ATP, CTP and UTP, 0.125 mM GTP, 5 µCi [α-32P]CTP (400 Ci/mmol; Amersham), 0.5 mM cap analog, 20 U SP6 RNA polymerase (Promega), 20 U rRNasin (Promega) and 1.0 µg linearized template DNA. After incubation for 1 h at 37°C, reactions were phenol/chloroform extracted and pre-mRNAs were ethanol precipitated.
In vitro splicing reactions were carried out using HeLa cell nuclear extract as described (44,45). Following splicing, reactions were digested with proteinase K, phenol/chloroform extracted and RNAs
precipitated with ethanol (44,45). Splicing reactions were analyzed on 8% denaturing polyacrylamide gels or subjected to reverse transcription as described below.
Reverse transcription reactions were carried out in 10 µl reactions containing 1× AMV reverse transcription buffer (Promega), 1 mM dNTPs, 1 mM DTT, 250 ng Pycomp 3 and 20 U rRNasin (Promega), using ~40% of the RNA from a given splicing reaction. After a 10 min
pre-incubation at 65°C, 10 U AMV reverse transcriptase (Promega) were added, followed by incubation at 42°C for 2 h. Samples were then diluted 1:100 and frozen.
PCR amplification of the resultant cDNAs was performed with a labeled oligonucleotide to facilitate quantitation. Labeling of 50 ng TMX3 was carried out using 1× kinase buffer (Promega), 5 µCi [γ-32P]ATP (3000 Ci/mmol, DuPont) and 5 U T4 polynucleotide kinase (Promega) for 30
min at 37°C. Labeling reactions were then heated for 2 min at 90°C and used directly in PCR reactions. PCR was performed in 50 µl reactions containing 10 mM Tris, pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.001% gelatin, 0.2 mM dNTPs, 350 ng each Pycomp 2 and TMX3, 50 ng [32P]TMX3, 2 µl diluted reverse transcription reaction and 5 U Taq DNA polymerase. After 20 cycles, products were separated on 3% agarose gels, which were then dried down
under vacuum at 50°C for 3 h. Products were then visualized using a PhosphorImager 445SI
(Molecular Dynamics) and quantitative analysis was performed using IPLabGel
(Signal Analytics).
Pycomp 3 and TMX3 are complementary to α-tropomyosin (α-TM) exon 3 and Pycomp 2 is complementary to α-TM exon 2 (Fig. 3) with the following sequences:
Pycomp 3: 5'-CCTGGGCATCTTTGAGAGCC-3';
Pycomp 2: 5'-GGGCGTCGGAGGACGAGC-3';
TMX3: 5'-CGAAGCTTGTATTTGTCCAGTTCATCTTC-3'.
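As a quick sanity check on the oligonucleotides listed above, the following Python sketch reports the length, GC content and reverse complement of each primer. The sequences are copied from the list; the helper names (`reverse_complement`, `gc_fraction`) are illustrative and are not part of the original protocol. The reverse complement is simply the sequence that a perfectly matched partner strand would carry, and no claim is made here about which strand of the α-TM template each primer anneals to.

```python
# Primer sequences copied from the list above (written 5'->3', DNA alphabet).
PRIMERS = {
    "Pycomp 3": "CCTGGGCATCTTTGAGAGCC",
    "Pycomp 2": "GGGCGTCGGAGGACGAGC",
    "TMX3": "CGAAGCTTGTATTTGTCCAGTTCATCTTC",
}

# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C residues in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(
            f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}, "
            f"partner strand 5'-{reverse_complement(seq)}-3'"
        )
```

Length and GC content are only rough descriptors of a primer; they are shown here because they can be read directly from the sequences without any further assumptions.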
RESULTS
As a first test of the competitive behavior of different polypyrimidine tracts, we prepared a construct containing 23 random pyrimidines in the test
position (cis-parent; Fig. 1B). This construct was subjected to in vitro splicing to determine whether the test tract could compete against the P3
tract. For comparison and determination of spliced products and intermediates,
the splicing of pGC+DX was performed in parallel. As shown in Figure 2, not only is the test tract of the cis-parent efficiently utilized, as shown by downstream branch point selection, but it actually out-competes the P3 tract. Quantitative analysis of the levels of each spliced
product showed that the test tract was used ~8- to 9-fold more than the P3 tract. This was unexpected given the
strength of P3 (22), but may be due to the relative positions of the two competing tracts. The
distance between the upstream branch point and the first AG dinucleotide is 69
nt, while it is only 23 nt between the downstream branch point and the wild-type AG. Perhaps this distance difference allows the test tract to compete
more effectively. Alternatively, since the test tract is adjacent to exon 3 of α-TM, exon sequences may increase utilization of the branch point associated with the test tract (reviewed in 10,11,47; see Discussion).
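The fold preference reported above can be restated as the fraction of splicing events that use each tract. The following is a minimal sketch, assuming that the two spliced products are quantified as background-corrected band signals from the same lane; the function names and the example signals are illustrative and are not data from this study.

```python
def usage_fractions(p3_signal: float, test_signal: float) -> tuple[float, float]:
    """Return (P3 fraction, test tract fraction) from two band signals."""
    total = p3_signal + test_signal
    if p3_signal < 0 or test_signal < 0 or total <= 0:
        raise ValueError("signals must be non-negative with a positive sum")
    return p3_signal / total, test_signal / total


def fold_preference(p3_signal: float, test_signal: float) -> float:
    """Return how many fold more the test tract product is than the P3 product."""
    if p3_signal <= 0:
        raise ValueError("P3 signal must be positive to compute a fold ratio")
    return test_signal / p3_signal


if __name__ == "__main__":
    # Hypothetical signals chosen to give 8-fold and 9-fold preferences.
    for p3, test in [(1.0, 8.0), (1.0, 9.0)]:
        _, test_fraction = usage_fractions(p3, test)
        print(f"{fold_preference(p3, test):.0f}-fold -> {test_fraction:.1%} test tract usage")
```

Under this assumption, an 8- to 9-fold preference corresponds to roughly 89-90% use of the test tract, which puts the cis-parent result on the same scale as the percentage values quoted for the other constructs.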
Uridine and cytidine residues are not functionally equivalent. Test polypyrimidine tracts consisting entirely of uridine, cytidine or
guanosine residues were made to determine the competitive efficiency of tracts composed of individual nucleotides (see Fig. 1). As expected, when a construct consisting of 17 guanosines (G17) was placed in the test position, the upstream branch point was chosen 100% of the time, indicating that the wild-type polypyrimidine tract (P3) was preferred (Fig. 4). In contrast, constructs containing 17 and 23 uridines in the test position
utilized the downstream AG ~95% of the time, whereas 16 consecutive cytidines in the test position led
to use of the downstream AG only 8% of the time. Therefore, it seems clear that
an effective polypyrimidine tract must contain a minimum number of pyrimidines but that uridine and cytidine do not function equivalently, as has been
previously demonstrated (23,25). Due to the lack of competition with polypyrimidine tracts containing only
cytidine residues (Fig. 4; 23), the remaining test polypyrimidine constructs were made using uridine as the variable pyrimidine residue in the test tracts
interspersed with varying levels of guanosine. Since branch points are
typically adenosines (48), we avoided the use of adenosines to prevent cryptic branch point activation (49).
Two classes of introns have been described based on the dependence of the first step of splicing on the 3'-splice site AG (18,39,53). Pre-mRNAs with a branch point followed by a short pyrimidine tract are
typically dependent on the presence of the AG, whereas those containing a long
pyrimidine tract do not require the AG to undergo the first step. In cases
where the branch point is distant from the AG, sequences between the branch
point and the AG, particularly a pyrimidine tract adjacent to the AG, can also
affect the second step of splicing (18). To assess the effect of polypyrimidine tract composition on the first step of
splicing and, subsequently, the rate of conversion of the first step to the
second step, we subjected selected constructs to in vitro splicing and directly analyzed the lariat intermediates and spliced products.
The specific constructs chosen contain two different lengths and positions of
continuous uridine tracts. When six uridines were positioned adjacent to the
branch point (U6G17), there was an inherently lower efficiency of the first step of splicing as
compared with the amount of lariat intermediate detected when 11 uridines were
adjacent to the branch point (U11G12; Fig. 5). Also, it appears that the U6G17 tract is blocked after the first step of splicing, yet Figure 4
shows that roughly equivalent amounts of both spliced products are formed. When
quantitated, the ratio of intermediate accumulation for the U6G17 construct is equal to the ratio of product formation for the two competing
events, suggesting that once the first step block is overcome, the second step
of splicing proceeds rapidly and is not biased by branch point choice. When the
number of consecutive uridines was increased from six to 11, the first step of
splicing was much more efficient and there was no apparent block to the second
step. This is consistent with results from the RT/PCR assay that showed that a
continuous stretch of 11 uridines serves as a strong polypyrimidine tract,
regardless of position. Decreasing numbers of continuous uridines decrease the
efficiency of the first step of splicing.
DISCUSSION
To address what features of a mammalian polypyrimidine tract determine its functional strength, we have utilized a cis-competition assay designed to measure the ability of competing polypyrimidine tracts to promote branch point selection. All of the test tracts are placed in
the exact same position relative to one another and compete against an
identical polypyrimidine tract, so that the test tracts can be ordered with
regard to their ability to promote branch point selection and subsequent 3'-splice site selection. The use of competition assays and systematic
pyrimidine changes allowed us to confirm and extend previous experiments
designed to address similar issues (18,23,25,29). Our data suggest that first, uridines are the preferred pyrimidine. Second,
the proximity of the polypyrimidine tract to the 3'-splice site is most important when pyrimidines are limiting,
whereas sufficiently strong polypyrimidine tracts are relatively position
independent. Third, the polypyrimidine tract composition can affect the
efficiency of the first step of splicing. Fourth, polypyrimidine tract strength
is not determined solely by length. Lastly, a tract consisting of alternating
pyrimidines and purines is functional.
There are at least two important caveats to our results. The first is the
assumption that selection of an upstream branch point does not lead to
selection of the downstream (second) 3'-splice site AG. If this were the case, there should be no
correlation between the strength of a pyrimidine tract and branch point
selection, as either AG could be selected, regardless of branch point
selection. This does not appear to be the case, as there is a strong
correlation between the strength of the pyrimidine tract and branch point
selection. In addition, the amount of spliced product formed equals the amount
of the corresponding lariat when splicing gels are directly analyzed. Finally,
the first AG downstream of the branch point is almost always selected using α-TM-derived substrates (28,46).
The second possible complication with our cis-competition substrates concerns the possibility that secondary structures could
affect splicing, especially since our test tracts could contain potential G:U
base pairing arrangements. Such secondary structures placed between the branch point and the AG dinucleotide can block splicing after the first step and alter 3'-splice site selection (28,54,55). However, direct analysis of splicing gels (Figs 2 and 5 and data not shown) does not suggest such blocks to splicing with our substrates.
Quantitative analysis focusing on lariat intermediate and lariat product
formation agrees completely with similar quantitative analysis using the RT/PCR
spliced product assay, which would not be expected if there were a block to
splicing at an intermediate stage due to unusual secondary structures. Thus, it
appears that our cis-competition substrates derived from α-TM allow competitive splicing analysis free from at least these two
potential difficulties.
The data in Figure 4
clearly show that polypyrimidine tracts with 11 continuous uridines are highly
competitive pyrimidine tracts regardless of distance between the branch point
and polypyrimidine tract. Limiting the number of continuous uridines to six
demands that these uridines be placed immediately adjacent to the 3'-splice site AG to optimally function as a competitive pyrimidine
tract. Tracts containing five or six continuous uridines can compete moderately
well as long as they are positioned closer to the 3' AG than the branch point, but are ineffective competitors if located
adjacent to the branch point. By comparison of known polypyrimidine tracts, it
seems that the threshold level of continuous uridines needed to allow optimal function is eight. A commonly used, efficiently spliced pre-mRNA substrate derived from the adenovirus 2 major late promoter contains a
stretch of eight continuous uridines located 4 nt from the 3'-splice site AG (56). Insertion of a single adenosine within the continuous uridine stretch leads
to a near total loss of splicing and spliceosome assembly (23). Similarly, the Drosophila sex-lethal gene contains a regulated intron with eight continuous uridines and breaking
the string leads to reduced splicing efficiency (23,25,31,57). Increasing the number of continuous uridines has also been used to increase
the splicing efficiency of a variety of pre-mRNA substrates (18,23,25,40). Combining the current data with these previous results, it appears that a pyrimidine tract with eight or more continuous uridines constitutes a strong, competitive pyrimidine tract. However, introns containing fewer than eight continuous uridines can still maintain functional pyrimidine
tracts (Fig. 4), as demonstrated by the substrate containing alternating uridines and
guanosines, as well as many other substrates (48). Functional pyrimidine tracts do not absolutely require continuous uridines,
but comparison with other substrates shows that longer continuous stretches
of uridines increase splicing efficiency and competitiveness.
Rather than the number of consecutive uridines determining functional strength,
an alternative hypothesis could be that the total number of uridines present in
a given pyrimidine tract is also important in determining strength. For
example, a pre-mRNA substrate derived from β-globin intron 1 is relatively efficiently spliced, yet the
polypyrimidine tract does not contain any more than four continuous uridines (58). Similarly, model substrates derived from β-globin intron 1 are efficiently spliced with continuous uridine
stretches of no more than three (18). However, in both cases, additional uridines (and cytidines) are found as part
of the pyrimidine tract. As with the (GU)11 substrate, it is possible that the total uridine content may relate to splicing
efficiency. Thus, a continuous stretch of uridines may be optimal, but the
total percentage of uridines could also determine functional competitiveness.
Since strictly cytidine-containing pyrimidine tracts are apparently non-functional (23, this study), examination of the uridine content appears to be of greatest
importance when designing or locating a functional pyrimidine tract. However,
it should be stressed that as far as functional pyrimidine tracts are
concerned, continuous uridine tracts versus total uridine content are not necessarily mutually exclusive arrangements. In a two exon, one intron substrate devoid of competition, a weak pyrimidine tract could promote branch point selection but appear non-functional under competition conditions.
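The two descriptors contrasted above, the longest continuous uridine run and the total uridine content, are simple to compute for any candidate tract. The sketch below is illustrative only. The tract strings are reconstructed from the construct names used in the text (for example, U6G17 is taken to be six uridines followed by 17 guanosines), which is an assumption about their exact sequences, and the function names are hypothetical.

```python
import re


def longest_uridine_run(tract: str) -> int:
    """Return the length of the longest uninterrupted run of U in an RNA tract."""
    runs = re.findall(r"U+", tract.upper())
    return max((len(run) for run in runs), default=0)


def uridine_fraction(tract: str) -> float:
    """Return the fraction of residues in the tract that are U."""
    tract = tract.upper()
    if not tract:
        raise ValueError("empty tract")
    return tract.count("U") / len(tract)


# ASSUMPTION: sequences inferred from construct names, not taken from the figures.
TRACTS = {
    "U6G17": "U" * 6 + "G" * 17,
    "U11G12": "U" * 11 + "G" * 12,
    "(GU)11": "GU" * 11,
}

if __name__ == "__main__":
    for name, seq in TRACTS.items():
        print(
            f"{name}: longest U run = {longest_uridine_run(seq)}, "
            f"U fraction = {uridine_fraction(seq):.2f}"
        )
```

On these reconstructed sequences, the (GU)11 tract has a longest run of one uridine but a uridine fraction of 0.5, higher than that of U6G17, which illustrates why the two descriptors can rank tracts differently. Neither number alone is presented here as a predictor of competitive strength.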
Several RNA binding proteins have been found to preferentially bind the
polypyrimidine tracts of metazoan introns, including heterogeneous
ribonucleoprotein C (hnRNP C; 59), intron binding protein (60,61), polypyrimidine tract binding protein (PTB; 44,56), PTB-associated splicing factor (PSF; 45), U2 snRNP auxiliary factor (U2AF; 62,63) and the Drosophila splicing regulator sex-lethal (31,64-66). Genetic selection experiments designed to identify the optimal RNA binding
sequence have been performed for four of these proteins (67-69). Each protein displays unique but partially overlapping pyrimidine-rich binding sites. The selected sequences for all of these proteins agree
with the hypothesis that uridine content is a major determinant in pyrimidine
tract strength. Since it appears that 3'-splice site selection is partly determined by competitive binding
of these and perhaps other proteins, it may be that pyrimidine tracts with
exceptionally strong or long U-rich tracts might bind such proteins too avidly, disallowing competition.
For mammalian splicing, U2AF binds to the polypyrimidine tract with high
affinity (63) and regulation of splicing is apparently allowed by having multiple proteins
compete for binding to the pyrimidine tract with variable concentrations among
the different U2AF competitors, particularly sex-lethal and PTB (68-71). Consistent with the competition model, it appears there is a dynamic
rearrangement of polypyrimidine tract binding proteins during spliceosome
assembly and during both steps of splicing (72). The dynamic rearrangement of proteins bound to the polypyrimidine tract is
consistent with an important role for the polypyrimidine tract in splicing
and spliceosome assembly, and with our cis-competition results.
The ability of the (GU)11 pyrimidine tract to compete for splicing is puzzling given the continuous
uridine preference of the above-mentioned factors, particularly the essential splicing factor U2AF. All
genetic selection and binding assays suggest that U2AF would not bind the (GU)11 pyrimidine tract to enable U2 entry into the spliceosome (63,68,69,73). A possible explanation for the ability of the (GU)11 tract to compete for splicing derives from the combinatorial nature of the cis-acting elements that direct 3'-splice site selection. While our experiments have attempted
to isolate the pyrimidine tract and dissect its individual role in splicing, it
is clear that the strength of the adjacent branch point also plays an important
role in splice site selection (12,22,74). Indeed, the branch point/polypyrimidine tract is perhaps most correctly
viewed as a single functional unit, with contributions from both elements
determining strength (22). Thus, a weak pyrimidine tract can be offset by a strong branch point and vice
versa. However, functional definition of strength does not end with
combinatorial action between just these two elements. Exon enhancers must also
be considered, as the presence of such enhancers can clearly rescue splicing
from otherwise weak 3'-splice site signals (75-80). The presence of multiple exon enhancer sequences allows a normally weak female-specific 3'-splice site in the Drosophila doublesex pre-mRNA to compete for splicing against the male-specific 3'-splice site. Exon 3 of α-TM contains purine-rich sequences that may function as
an enhancer (S.Mayer, personal communication), thereby promoting splicing using
the downstream pyrimidine tract. This may well account for the seemingly
surprising finding that many of our constructs were able to out-compete the upstream P3 tract. Proximity of the downstream tract to a
possible exon enhancer could allow weaker tracts to compete more efficiently.
Such combinatorial action is a possible explanation for the ability of introns
with seemingly no pyrimidine tract to undergo splicing and may explain why it
has been somewhat difficult to accurately define the specific sequence
requirements for a strong pyrimidine tract. Consistent with such a hypothesis,
equilibrium binding assays have recently shown that U2AF and PTB bind
polypyrimidine tracts with very similar affinities (81). Since the nuclear concentration of PTB is much higher than that of U2AF, it seems
likely that other factors, including U1 snRNP and SR proteins, contribute to
U2AF binding (79,80,82). Thus, combinatorial action between cis-acting elements and the trans-acting factors that interact with these sequences likely enables great
diversity in the functional strength of various 3'-splice sites and could account for the difficulty in assigning
pyrimidine tract strength by direct sequence analysis. Nevertheless, when such
variables are held constant, it is possible to derive certain rules and
preferences, as has been done here.
This work was supported by a grant from the National Institutes of Health, R01
GM50418. Phosphorimager analysis was made possible by funds provided by the
National Science Foundation, BIR-9419667. R.J.S. was supported by training grant HL07751.
*To whom correspondence should be addressed. Tel: +1 615 322 4738; Fax: +1 615
343 6707; Email: pattonjg@ctrvax.vanderbilt.edu
REFERENCES