Genome projects are approaching completion and are saturating sequence databases. This paper discusses the role of the two-hybrid
system as a generator of hypotheses. Apart from this rather exhaustive, financially demanding and labour-intensive procedure, more refined
functional studies can be undertaken. Indeed, by making hybrids of two-hybrid systems, customised approaches can be developed
in order to attack specific function-related problems. For example, one could set up a "differential" screen by combining
a forward and a reverse approach in a three-hybrid set-up. Another very interesting project is the use of peptide libraries
in two-hybrid approaches. This could enable the identification of peptides with very high specificity comparable to "real"
antibodies. With the technology available, the only limitation is imagination.
Protein-protein interactions
Protein-protein interactions are intrinsic to virtually every cellular process ranging from DNA replication, transcription,
splicing and translation, to secretion, cell cycle control, intermediary metabolism, formation of cellular macrostructures
and enzymatic complexes. The formation of large cellular structures such as the cytoskeleton, the nuclear scaffold, and the
mitotic spindle results from complex interactions between proteins. Relatively smaller structures such as nuclear pores, centrosomes
and kinetochores are beginning to be characterized and, in each case, protein-protein interactions seem to play a crucial
role.
Apart from the evident structural requirements met by a plethora of protein-protein interactions, many transient
protein-protein interactions control and regulate a large number of cellular processes. All modifications
of proteins involve such transient protein-protein interactions. Indeed, kinases, phosphatases, glycosyl transferases, acyl
transferases and proteases interact only transiently, i.e. for a limited period of time, with their protein substrates. Such
protein-modifying enzymes participate in a large number of fundamental processes such as cell growth, the cell cycle, metabolic
pathways and signal transduction. Surprisingly, very large protein complexes also mediate many of these enzymatic activities.
Transmission of regulatory signals from the external environment to relevant locations in the cell was originally thought
to consist of successive catalytic activities that could amplify a weak signal into a significant cellular response. However,
more recent experiments suggest that in many signal transduction pathways, the catalytic activities involved, such as protein
kinases, may bind strongly to their protein substrates. In addition, structural proteins required for signal transmission
have been suggested to act as scaffolds, bridging several proteins involved at consecutive steps in a signal transduction
pathway (1). Thus, signal transduction pathways might be considered as large protein structures through which a signal is being transmitted.
A striking example is the formation of the Death-Inducing Signaling Complex (DISC) at the Fas receptor. Only a few seconds
after receptor triggering, a highly complex mixture of signal transducing molecules is recruited to the intracellular part
of the receptor. This newly formed complex is capable of transmitting multiple specific signals provoking a highly regulated
cellular response (2).
Alteration of protein-protein interactions is known to contribute to many diseases. Hence, the manipulation of protein-protein
interactions that contribute to disease is a potential therapeutic strategy. In summary, protein-protein interactions make
up biological machines that are like intricate three-dimensional jigsaw puzzles, forming arrays of interlocking protein components
that assemble and disassemble over time and in response to complex signals.
Tools for the study of protein-protein interactions
The study of protein-protein interactions can be conceptually divided into three major domains: identification, characterization
and manipulation. In general, assemblies of proteins have been analyzed using two complementary approaches: the biochemical
and the genetic. In the well-known analogy to understanding how a car runs, biochemists disassemble the engine, transmission
and body, characterize all the pieces and attempt to rebuild a working vehicle. Geneticists, by contrast, break single components,
turn the key and try to determine what effect the single missing part has on the car's operation. This implies that genetic
methods often require a specific phenotype before they can be carried out.
Traditionally, the tools available to analyze protein-protein interactions in multicellular organisms have been restricted
to biochemical approaches. However, despite obvious advantages, biochemical approaches can be time-consuming. Biochemical
methods that detect proteins that bind to other proteins generally result in the appearance of a band on a polyacrylamide
gel. These methods are sometimes referred to as physical methods and include protein affinity chromatography, affinity blotting,
immunoprecipitation and cross-linking (3).
Protein probing uses a labeled protein as a probe to screen an expression library in order to identify genes encoding interacting
proteins. Since all combinations of protein-protein interactions are assayed, including those that might never occur in vivo, the possibility of identifying artifactual partners exists and is a typical disadvantage of most exhaustive screening procedures.
A second drawback derives from the use of a bacterial host, where not all posttranslational modifications needed for the interaction
might occur. A third disadvantage is that screening rather than selection is used as the means of detection, which inherently
limits the number of plaques that can be assayed. The "Phage Display" method is noteworthy in this matter. Here the methodology
is based on the fact that an Escherichia coli filamentous phage can express a fusion protein on its surface.
A major advantage of this method is the affinity purification step through "panning" cycles, which enables a
1000-fold enrichment per cycle for specific phages that contain an interacting protein. Disadvantages of phage display include
the size limitation of protein sequence for polyvalent display and the requirement that proteins be secreted from E. coli. Moreover, all phage-encoded proteins are fusion proteins, which may limit the activity or accessibility for the binding
of some proteins (3).
What is it?
Apart from the above-mentioned methods, a new technology has been developed during the past decade. This technique, entitled
"two-hybrid" or "interaction trap", enables not only the identification of interacting partners but also the characterization
of known interaction couples and even embodies the technological means to manipulate protein-protein interactions.
The modular properties of GAL4 and other transcription factors in general fostered this strategy. Indeed many eukaryotic transcription
activators have at least two distinct functional domains, one that directs binding to a promoter DNA sequence and one that
activates transcription (4, 5). This modularity was illustrated by exchanging DNA-binding domains and activation domains between transcription factors
while retaining function. It was shown that the activation domain of yeast GAL4 could be fused to the DNA-binding
domain of E. coli LexA to create a functional transcription activator in yeast (6).
The "two-hybrid" technique exploits the fact that the DNA-binding domain of GAL4 is incapable of activating transcription
unless it is physically, but not necessarily covalently, associated with an activating domain.
Ma and Ptashne (7) demonstrated this principle for the first time. They showed that the GAL80 protein, normally a negative regulatory protein
that interacts with GAL4, could be converted into a transcriptional activator by fusing it to an activation domain (AD). The
activation by this fusion protein, GAL80-AD, was dependent on the presence of a GAL4 derivative bearing the GAL80 binding
domain (C-terminal 30 amino acids) but lacking its own activation domain (7).
The actual use of those different functional modules of a transcription factor to study protein-protein interactions was first
proposed by Fields and Song (8). They demonstrated the "proof-of-concept" by using SNF1 fused to the DNA-binding domain (DB) and SNF4 to an activation domain
(AD). Only after expression of these two chimeras, and subsequent interaction of SNF1 and SNF4, did they reconstitute a functional
transcription factor that is able to induce reporter gene expression. These initial experiments confirmed that a transcriptional
read-out could be used as a tool to study interactions between proteins not involved in the transcription process.
Why not use the two-hybrid system?
It should be noted, however, that the two-hybrid system does not provide a solution for all protein-protein problems. For
different experimental reasons some proteins are not suited for this approach. The sceptics about the use of the two-hybrid
system are furnished with a summary of the extensive list of disadvantages and drawbacks in the next paragraphs.
Fig. 1: Principle of the two-hybrid system. (A), (B) Two chimeras, one containing the DNA-binding domain (DB: blue circle) and one containing an activation domain
(AD: half blue circle), are co-transfected into an appropriate host strain. (C) If the fusion partners (yellow and red) interact,
the DB and AD are brought into proximity and can activate transcription of reporter genes (here LacZ).
In general, in any two-hybrid experiment a protein of interest is fused to a DNA-binding domain and introduced into a yeast
host cell bearing a reporter gene under the control of the binding site for this DNA-binding domain. When this fusion protein cannot activate transcription
on its own, it can be used as "bait" or as a "target" to screen a library of cDNA clones that are fused to an activation domain.
The cDNA clones within the library that encode proteins capable of interacting with the bait are
identified by virtue of their ability to cause activation of the reporter gene. The yeast two-hybrid system is thus designed
to identify genes encoding proteins that are physically associated with a given protein in vivo. Since the emergence of the two-hybrid approach in 1989, a number of improvements have been incorporated that have increased
its applicability (see New Developments).
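The logic of this read-out can be summarised as a small truth-table sketch. It is only an illustrative Boolean model, not part of the original method description: the function name `reporter_active` and its parameters are hypothetical, and real reporter output is graded rather than simply on or off.

```python
def reporter_active(bait_has_db: bool,
                    prey_has_ad: bool,
                    partners_interact: bool,
                    bait_auto_activates: bool = False) -> bool:
    """Toy Boolean model of two-hybrid reporter activation (illustrative only)."""
    if not bait_has_db:
        # Without a DNA-binding domain nothing is tethered to the promoter.
        return False
    if bait_auto_activates:
        # The bait alone switches the reporter on: a false positive.
        return True
    # Otherwise an activation domain must be recruited via the interaction.
    return prey_has_ad and partners_interact
```

In this simplified view a signal arises either from a genuine bait-prey interaction or from a bait that activates transcription by itself, which is why a bait-only control is needed before screening.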
Since the read-out of the two-hybrid system makes use of a transcription event, one of the most crucial initial experiments
is to check whether your favorite protein (YFP) is able to initiate transcription. If this is the case, it might seriously
handicap the successful use of this protein in any two-hybrid approach (see Auto-activation).
An obvious critique concerns the extensive use of chimeras. The use of artificially made fusion proteins always embodies a
potential risk. The fusion might change the actual conformation of the bait and/or prey and consequently alter functionalities.
This misconformation might result in a limited activity or in the inaccessibility of binding sites. However, the use of tagged
proteins in general has been very successful in many biotechnological approaches. This success might rely on the fact that
protein domains can fold rather independently, enabling the co-existence of different, even artificially introduced, modules
in the same protein.
The best control to assay the correct conformation of your favorite protein is to clone a known positive interactor in the
appropriate vector and "two-hybrid-assay" this interaction. This will only work if both proteins are folded correctly. This
kind of experiment is only conclusive for the domains involved in the "positive" interaction, which might differ from the
ones involved in the new interactions that might be of interest. In summary, this category of drawbacks can be labeled "(mys)steric",
related to folding and the three-dimensional structure of the protein assayed. In this respect it is noteworthy that the reciprocal
transfer of proteins, i.e. switching proteins from DNA-binding fusions to activating domain fusions, is not trivial, supporting
the idea that (mys)steric constraints on folding are involved. This switching or "swapping" might provide an empirical way
to escape the problem.
One of the most ambiguous disadvantages is that the two-hybrid system makes use of yeast, S. cerevisiae, as a host. This implies, as mentioned above, that YFP must be able to fold correctly and exist as a stable protein inside
the yeast cells. The use of yeast can also be seen as an advantage, since yeast is closer to higher eukaryotes than in vitro experiments or those systems based on bacterial hosts.
A major disadvantage of assaying protein-protein interaction in any heterologous system is that some interactions depend upon
posttranslational modifications that do not occur, or occur inappropriately, in yeast. Such modifications are frequent and include
the formation of disulfide bridges, glycosylation and most commonly phosphorylation. Some new two-hybrid systems, however,
try to circumvent this inconvenience by co-expressing the enzyme responsible for the posttranslational modification.
Since the two-hybrid system needs the fusion proteins to be targeted to the yeast nucleus, it might be a disadvantage for
extracellular proteins or proteins that contain strong(er) targeting signals.
When screening libraries, a good representation is crucial. In classical two-hybrid library preparations only one out of six
fused cDNAs is in the correct frame, pushing the total number of independent clones to be screened to over a million, at the
border of practical feasibility. Making directional libraries of a relevant tissue or cell type might be a solution. Another
solution might be to go for less complex organisms like C. elegans.
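The effect of the reading frame on library size can be estimated with a short calculation. The sketch below uses the Clarke-Carbon relation N = ln(1 - P) / ln(1 - f), which is not taken from the text above; it also assumes that every cDNA is equally abundant, that the "one out of six" figure reflects three frames times two orientations, and that a directional library therefore raises the usable fraction to one in three. The library complexity of 50,000 distinct cDNAs is a hypothetical example value.

```python
import math


def clones_to_screen(distinct_cdnas: int,
                     in_frame_fraction: float = 1 / 6,
                     probability: float = 0.95) -> int:
    """Clones needed so that a given cDNA is present in frame at least once.

    Uses N = ln(1 - P) / ln(1 - f), where f is the chance that one random
    clone is a usable (in-frame) copy of the cDNA of interest.
    Assumes all cDNAs are equally abundant, which is a simplification.
    """
    f = in_frame_fraction / distinct_cdnas
    return math.ceil(math.log(1 - probability) / math.log1p(-f))


# Hypothetical library of 50,000 distinct cDNAs.
random_library = clones_to_screen(50_000)                                # 1 in 6 usable
directional_library = clones_to_screen(50_000, in_frame_fraction=1 / 3)  # 1 in 3 usable
```

Under these assumptions the non-directional library requires on the order of a million clones, and a directional library roughly halves that number. Rare transcripts would need considerably more.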
Screening of libraries selects for optimized interactions. Many isolates may not represent full-length cDNA. Indeed, it has
been shown that subdomains may interact better than full-length clones, probably reflecting domain function during folding
of the protein. The best way to circumvent this problem is probably to clone only full-length cDNAs in the correct open reading
frame. Although extremely labor intensive, this approach was taken to establish the complete yeast protein linkage map (see
Whole genome approaches using the two-hybrid system).
Since only reporter gene activity is measured, it is impossible to exclude the possibility that a third protein Z is bridging
the two interacting partners. Although this possibility is rather unlikely and might even be considered "specific", the same caveat
holds for many of the conventional biochemical techniques.
Some proteins might become toxic upon expression in yeast. A number of proteins, such as cyclins or homeobox gene products
are indeed toxic when expressed and targeted into the yeast nucleus. Such genes might be counterselected during growth and
result in problems. The use of an inducible promoter might circumvent the problem. Other proteins might proteolyse essential
yeast proteins or proteins essential for the system like the DNA binding domain or the activation domain.
Since all combinations of protein-protein interactions are assayed, the possibility of identifying artifactual partners exists
and is a typical disadvantage of all exhaustive screening procedures. Due to the so-called time/space constraints it is potentially
possible that both proteins, although able to interact, are never in close proximity to each other within the cell. The two
proteins could be expressed in different cell types, or even when found in the same cell they could be localized in distinct
subcellular compartments. Moreover, interacting proteins can be expressed at different points during embryogenesis or during
homeostasis (e.g. at different time points in the cell cycle). So once two interacting partners are identified, the biological
relevance of this interaction remains to be determined.
Why use the two-hybrid system?
Apart from the above-mentioned drawbacks, the two-hybrid system has some clear advantages over classical biochemical and genetic
approaches. First of all, it is an in vivo technique that uses the yeast host cell as a live test tube. This yeast system brings the higher eukaryotic reality closer than
most in vitro approaches or techniques based on bacterial expression. An appealing feature of this system is the minimal requirement to
initiate a screen. Only the cDNA of the gene of interest, full-length or even partial, is needed, in contrast to the sometimes large
quantities of purified proteins or good-quality antibodies needed in classical biochemical approaches.
Weak and transient interactions, often the most interesting in signaling cascades, are more readily detected in two-hybrid
since the genetic reporter gene strategy results in a significant amplification. It is useful to keep in mind that there is
a trade-off between the identification of weak interactions and the number of false positives encountered when performing
a screening procedure. Apart from the ability to screen libraries, the two-hybrid system also allows for the analysis of known
interactions. This can be achieved by pinpointing crucial residues for interaction or by a functional characterization of
the entire subdomain. By doing semi-quantitative experiments one can even interpret affinities from two-hybrid experiments.
It was demonstrated that the strength of interaction as predicted by the two-hybrid approach generally correlates with that
determined in vitro, permitting discrimination of high-, intermediate- and low-affinity interactions (34). In addition, binding affinities of peptides to retinoblastoma (Rb), as determined by surface plasmon resonance, correlated
with results from the two-hybrid assay.
The two-hybrid system was predicted to be limited to the analysis of cytoplasmic proteins. Indeed extracellular proteins or
protein domains are often N-glycosylated and contain disulfide bonds, both of which are not expected to occur in the yeast
nucleus (9). However, several successes were reported with transmembrane receptors. Appropriate extracellular receptor-ligand interactions
were demonstrated for the growth hormone, prolactin and growth hormone releasing receptors (10, 11). Thus, receptors whose critical ligand-binding determinants are entirely extracellular can sometimes be evaluated by the two-hybrid
system. However, it may be inappropriate for receptors with determinants in transmembrane domains that form intramembranous ligand
binding pockets (11).
One of the most appealing features of the yeast two-hybrid system is that the identification of an interacting protein implies
that at the same time the corresponding gene is cloned. Two-hybrid screens are sometimes referred to as functional screens,
since interacting proteins might give a functional hint if at least one of the partners has a known functional commitment
in a well understood signaling pathway. Trying to attribute function to an unknown target is often more difficult. Here, the
identified partners need to be known or the problem will propagate. Although the outcome of a screening often results in many
new hypotheses, they still need to be validated by other techniques. In conclusion, there is enough reason to remain sceptical
about two-hybrid screenings, but the most convincing argument in favor of the two-hybrid system is the number of signaling cascades
that have been resolved in molecular detail and the speed with which this was achieved.
Fig. 2: General flowchart of a hypothetical two-hybrid screen. Dotted lines should be followed in case of negative results.
Introduction
Actual screening involves many choices, since several comparable two-hybrid systems are available. Many of the most commonly encountered
problems and potential pitfalls are reviewed in this section. In most cases a typical two-hybrid experiment goes through a
sequence as outlined in Fig. 2.
Construction of the "target" or "bait"
When performing a two-hybrid screen, the first decision to be made is the choice of the most appropriate vector system. A
large number of different DNA-binding domain (DB) and transcription activation domain (AD) containing vectors have been used
successfully. The most extensively used vectors are GAL4 based, probably because they were the first commercially available.
Alternative systems make use of the DB of the bacterial LexA protein and the AD of VP16 or the so-called B42AD. Both systems, GAL4 and LexA, have advantages and drawbacks which make the choice more difficult. In the following paragraphs the major differences between
the two systems will be pointed out together with the major pitfalls in a typical screening procedure. Since the LexA and the GAL4-based two-hybrid systems have different properties, it is not unreasonable to imagine that some interactions
might be detected differently in both systems. Trying both will increase the chance of success.
The promoter regulates the expression level of the target protein
The 1500-bp full length ADH1 promoter, that normally drives the expression of the metabolic enzyme alcohol dehydrogenase 1,
leads to high-level expression of sequences under its control. This promoter is used in the pAS2(-1) and pLexA plasmids that are used to clone the target-fusion. It is also present in pGAD-GH that can be used to clone the cDNA library.
Expression from this promoter is maximal during logarithmic growth of the yeast cells and becomes repressed in late log phase
by ethanol accumulation in the medium. This explains why Western blot analysis of the expressed fusion protein gives the best
results when logarithmically growing yeast is used. Although the ADH1 promoter is generally considered to be a strong constitutive promoter,
expression is actually repressed as much as 10-fold on non-fermentable carbon sources (12). In contrast to this full-length promoter, several cloning vectors, including pGBT9 (DB) and pGAD424 (AD) contain a truncated
410-bp ADH1 promoter. Expression from this promoter leads to low or very low levels of fusion protein expression that are
hardly detectable on a Western blot (13-15). The choice of expression plasmid might be influenced by the nature of the target.
If the target is expected to interfere with the endogenous yeast metabolism, a lower expression level might be beneficial. However,
if the expression of the fusion protein needs to be assayed, the higher expression level is more convenient. Legrain et al. (15) used known interaction partners to study the influence of expression levels of the fusion proteins on their detection sensitivity
in two-hybrid assays. Combinations of weakly expressing plasmids revealed that detection of these interactions was roughly 50-100
times less sensitive. Therefore, an interaction assayed by use of these plasmids may sometimes escape detection. It is important
to note, however, that this observation was strongly dependent upon the protein couples used. Some protein combinations
were unaffected in sensitivity whereas others dropped only 10-fold. Another important but less documented parameter is the
relative position of the various "cassettes" in the different vectors. A simple correlation between sensitivity and length
of the promoter fragment used is excluded since the use of other plasmids containing the same proteins under the same truncated
promoter did not result in a loss of sensitivity (15).
Total expression level does not only depend on the promoter strength and the carbon source used, but also on the copy number
of the plasmid. In most commonly used two-hybrid plasmids, the origin of replication is the so-called 2-μm origin. The 2-μm
circle plasmids are stably maintained at high copy numbers (50-100 copies per cell) in yeast, and function solely for
their own replication (16). It was shown that 2-μm DNA replication is similar to chromosomal DNA replication. 2-μm DNA is the only known example of
a multiple-copy extrachromosomal DNA in which every molecule replicates in each cell cycle (17). It should be noted that in the context of a reverse hybrid system (see Reverse Hybrid System), the expression levels of
the hybrid proteins should be maintained as low as possible, since every "background" interaction, which might occur more
often at high protein expression levels, will kill the yeast. Therefore, the vectors used in these reverse hybrid systems
make use of low-copy, so-called centromeric, expression plasmids. The difference in expression levels between high- and low-copy
plasmids can be as much as twenty- or thirty-fold. When toxic proteins are used, these centromeric vectors can also be
considered for forward screens.
Features of target/bait vectors
The pAS vectors also encode a hemagglutinin (HA) epitope tag, YPYDVPDYA, in frame with the GAL4DB (18). This allows the protein to be visualized with commercially available anti-HA antibodies. There has been some controversy
about the introduction of the HA tag since it results in weak auto-activation when the empty vector is transformed and
reporter gene activity is measured. However, when a protein is fused in frame, the fusion protein generally loses its auto-activation
properties. In principle, this could be used as an "in-frame control" but it has never been advertised in this way. It might,
however, explain why the HA tag was removed in the new version of pAS2, named pAS2-1. The same HA tag is now present in
pACT2, which is the successor of pGAD424. Other characteristics that might influence the choice of a vector are of a more
practical nature. In all vector systems care must be taken to maintain the proper reading frame when creating the two-hybrid
proteins. Compatibility of the multiple cloning site (MCS) in the DB and AD containing vector can be beneficial. Cloning in
both vectors permits both verification of the correct length on Western blot and enables the establishment of a panel to score
new targets in simple co-transformation or mating experiments.
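The reading-frame requirement can be checked in silico before cloning. The following is a minimal sketch rather than a validated tool: the function name `fusion_in_frame` is hypothetical, and it assumes that the domain coding sequence is supplied from its start codon without its own stop codon, followed by the linker or multiple cloning site sequence exactly as it will appear in the final construct.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_in_frame(domain_cds: str, linker: str, insert_cds: str) -> bool:
    """Return True if the insert continues the upstream reading frame.

    The junction (domain + linker) must be a multiple of three nucleotides,
    and no stop codon may occur before the final codon of the fusion.
    """
    upstream = (domain_cds + linker).upper()
    if len(upstream) % 3 != 0:
        return False  # insert would be read in a shifted frame
    fused = upstream + insert_cds.upper()
    codons = [fused[i:i + 3] for i in range(0, len(fused) - len(fused) % 3, 3)]
    # A stop codon is tolerated only as the last codon of the fusion.
    return not any(codon in STOP_CODONS for codon in codons[:-1])
```

For example, with a toy 9-nt domain, `fusion_in_frame("ATGAAGCTA", "GAATTC", "ATGGCCTGA")` accepts the construct, whereas a 7-nt linker such as `"GAATTCC"` shifts the frame and is rejected. Sequencing and Western blot analysis remain the decisive checks.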
Another "add-on" found in some of the commercially available vectors is the CYH2 gene. CYH2 encodes the L29 protein of the yeast ribosome. Cycloheximide (Chx), a drug which blocks polypeptide elongation during translation,
prevents the growth of cells that contain the wild-type CYH2 gene. Chx resistance results from a single amino acid change in the CYH2 protein (19). When a wild type CYH2 containing plasmid is present in a mutant yeast strain resistant to Chx, it confers Chx sensitivity. Adding Chx to the medium
results in a selection for yeast that has lost the CYH2 containing plasmid. This technique might be useful to select for only the library insert containing plasmid after screening
(see Separating the two-hybrids). A potential danger is that leaky expression from the promoter of this CYH2 gene retards yeast growth.
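The counter-selection logic can be written down as a simple rule. This is a toy model under the assumption, taken from the description above, that wild-type CYH2 is dominant over the resistant allele; the function and parameter names are hypothetical.

```python
from typing import Iterable


def grows_on_cycloheximide(host_is_chx_resistant: bool,
                           plasmids_carry_wt_cyh2: Iterable[bool]) -> bool:
    """Toy model: wild-type CYH2 is dominant and confers Chx sensitivity."""
    if not host_is_chx_resistant:
        # A wild-type host is sensitive regardless of its plasmids.
        return False
    # A resistant host grows only after losing every CYH2-bearing plasmid.
    return not any(plasmids_carry_wt_cyh2)
```

A resistant host that still carries the CYH2-containing target plasmid together with a library plasmid, i.e. `[True, False]`, does not grow on Chx, whereas a segregant that retains only the library plasmid, `[False]`, does.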
Most two-hybrid strains have a lesion in URA3, LEU2, HIS3, TRP1 and/or ADE2, which allows selection for yeast cells that were transformed with plasmids that carry the corresponding gene by growth in
the absence of the appropriate nutrient. The most widely used markers in yeast are genes encoding amino acid biosynthetic
enzymes. However, in a recently introduced target vector (pHybLex/Zeo) zeocin resistance is used as selection marker. Parent et al. (17) have compiled an extensive list of yeast cloning vectors. In addition, the most commonly used two-hybrid plasmids
and their distinguishing features are listed in Table 1.
Table 1: Overview of the most commonly used two-hybrid vectors. The last column describes the origin of the promoter and the accession number in EMBL (AC).

| Name | Selection marker | Functional domain | Promoter, AC |
|---|---|---|---|
| GAL4-based | | | |
| pMA424 | HIS3 | GAL4DB | Original vector, 12 kb |
| pGBT9 | TRP1 | GAL4DB | ADH1 (truncated); AC: U07646 |
| pAS1 | TRP1 | GAL4DB+HA | ADH1 (full length), CYH2 |
| pAS2 | TRP1 | GAL4DB+HA | ADH1 (full length), CYH2; AC: U30496 |
| pAS2-1 | TRP1 | GAL4DB | ADH1 (full length), CYH2; AC: U30497 |
| pGAD2F | LEU2 | GAL4AD | Original vector, 13 kb |
| pGAD424 | LEU2 | GAL4AD | ADH1 (truncated); AC: U07647 |
| pGAD10 | LEU2 | GAL4AD | ADH1 (truncated); AC: U13188 |
| pGAD-GL | LEU2 | GAL4AD | ADH1 (truncated) |
| pGAD-GH | LEU2 | GAL4AD | ADH1 (full length) |
| pGAD1318 | LEU2 | GAL4AD | ADH1 (full length) |
| pSE1107 | LEU2 | GAL4AD | |
| pSD-10 | URA3 | VP16AD | |
| pACT1 | LEU2 | GAL4AD | |
| pACT2 | LEU2 | GAL4AD+HA | ADH1 (truncated), medium expression; AC: U29899 |
| LexA-based | | | |
| pBTM116 | TRP1 | LexA | ADH1 (truncated) |
| pLexA | HIS3 | LexA | ADH1 (full length); = pEG202 |
| pB42AD | TRP1 | B42 + SV40 NLS + HA | GAL1 (full length), inducible promoter; = pJG4-5 |
| pHybLex/Zeo | Zeocin | LexA | ADH1 (truncated) |
| pYESTrp | TRP1 | V5 epitope + SV40 NLS + B42 | GAL1 (full length), inducible promoter |
| pGilda | HIS3 | LexA | GAL1 (full length), inducible promoter; centromeric vector |
Sequencing and Western Blot analysis
Before proceeding with actual screening it is advisable to check the chimeric cDNA by sequencing and to verify the actual
expression of the fusion protein inside yeast. The latter can be achieved by "classical" SDS-PAGE and subsequent Western blot
analysis as outlined in Protocol 1. Both anti-GAL4DB and anti-LexA antibodies are commercially available (Clontech #5399-1). This is preferably performed on the yeast strain used for screening
the library to be sure that the target is properly expressed (20).
Protocol 1 SDS-PAGE and Western blot analysis to check for full-length expression of the target fusion protein in the GAL4
system
After transformation of the plasmids encoding GAL4DB fusion proteins into Saccharomyces cerevisiae strain HF7c and incubation on plates lacking the appropriate amino acids (e.g. Trp), single colonies are inoculated into 15
ml synthetic medium lacking the same amino acid(s). At an optical density (A600) around 0.7, the culture is centrifuged at 2500 rpm for 5 min, the pellet is washed in distilled water and boiled for 3 min
in 200 μl of 2 x Laemmli loading buffer. 50 μl of this sample is separated by 10% SDS-polyacrylamide gel electrophoresis and
blotted onto a nitrocellulose membrane (Schleicher & Schuell, Dassel, F.R.G.). Detection of the expressed fusion proteins
can be performed with polyclonal anti-yeast GAL4DB antibody and peroxidase-conjugated anti-rabbit antibody using ECL (Amersham
Life Science, Amersham, U.K.).
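Because Protocol 1 gives the centrifugation speed in rpm, reproducing it on another centrifuge requires conversion to relative centrifugal force. The sketch below uses the standard relation RCF = 1.118 × 10^-5 × r(cm) × rpm². The rotor radius of 10 cm is an assumed example value, not stated in the protocol, and the lane-loading estimate assumes that the whole 15 ml culture is harvested.

```python
def rcf_from_rpm(rpm: float, rotor_radius_cm: float) -> float:
    """Relative centrifugal force (x g) from rotor speed and radius."""
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


def od_units_loaded(culture_ml: float, a600: float,
                    lysate_ul: float, loaded_ul: float) -> float:
    """A600 units of cells (ml x A600) represented in one gel lane."""
    return culture_ml * a600 * (loaded_ul / lysate_ul)


# Protocol 1 values; the 10 cm rotor radius is an assumed example value.
force = rcf_from_rpm(2500, rotor_radius_cm=10.0)
per_lane = od_units_loaded(culture_ml=15, a600=0.7, lysate_ul=200, loaded_ul=50)
```

With these assumptions the spin corresponds to roughly 700 x g, and each lane carries about a quarter of the harvested cells. The actual rotor radius should be substituted.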
Auto-activation
Since the two-hybrid system is based on reconstitution of a functional transcription factor, checking the auto-activation
capacity of the target is crucial for the overall feasibility. Latent transcription-activating
activity is present in approximately 5% of all proteins and even more often in randomly generated fragments (as in libraries).
The use of a library fused to the DNA-binding domain would therefore result in a large number of false positives, illustrating that auto-activation
can cause problems. Since DNA binding to the upstream activating sequence (UASG) is more stringent, this provides a rationale for making the libraries in the vector containing the AD.
A single transformation and reporter gene assay tells you whether your favorite protein (YFP) is able to induce reporter gene
activity. What if YFP is auto-activating? If the activity is only weak, e.g. only a minor background on HIS3 and not on LacZ, one could try to increase the amount of AT in the plates (see Pilot transformation in the GAL4 system). In the LexA system, sensitivity can be controlled genetically by using less sensitive reporter hosts or LacZ-expressing plasmids containing variable DNA-binding regions. This ability to tune
the sensitivity of the reporter genetically might be one of the greatest advantages of the LexA system (see Table 4). If this tuning does not help, more radical approaches need to be applied. It is often possible to
delete a small region of a protein that activates transcription. Removal of this activation function while retaining other
properties of the protein might enable the use of the target to screen libraries. Alternatively, instead of using a C-terminal
fusion to the DB, an N-terminal fusion can be tried. Indeed normal two-hybrid methods detect interactions between two proteins
fused at the C-termini of the DB and AD, respectively. This implicates that the N-terminus of none of these proteins is available
for interaction. It was demonstrated, using constructs with reverted polarity, that such constructs give a specific interaction
signal that is dramatically increased. Such constructs might lead to the identification of partners missed during classical
two-hybrid screens and might at least in some cases provide a solution to the auto-activation problem (21).
In theory, another solution could be considered. If auto-activation is strong, as in the case of transcription factors, it
might be possible to apply a reverse approach. By using a toxic reporter gene and an auto-activating target under an inducible
promoter, all yeast cells in which activation is not repressed by an interaction will die upon induction of the promoter. All yeast cells
surviving such a screening procedure should contain a plasmid encoding an interaction partner of the auto-activating bait.
Nuclear localization: the repression assay
Another requirement, apart from correct folding of the fusion protein and its inability to initiate transcription autonomously,
is the ability of the fusion to be localized to the yeast nucleus, since transcription is a nuclear process.
Specific binding of GAL4 to the UASG is conferred by a zinc-cluster motif located within the N-terminal 64 residues (5, 22, 23). GAL4 binds to DNA as a dimer; dimerization is mediated at least in part by an α-helical region within residues 65-94, which
forms a coiled-coil interaction (23, 24). GAL4DB also has its own nuclear localization signal (NLS) within its N-terminal 74
residues (25-27). A positive control can be used to check the nuclear localization in the GAL4 system. The exception is an
auto-activating fusion: apart from the problems described above, its nuclear localization is not in doubt, since it must reach the nucleus to activate the reporter.
Since LexA is a bacterial protein, it contains no NLS. Therefore, the NLS of SV40 large T is fused in frame to ensure nuclear localization.
The nuclear localization of LexA fusions can be assayed by a so-called "repression" or "blocking" assay. The repression assay is based on the observation
that LexA and non-activating LexA fusions can repress transcription of a yeast reporter gene that has LexA operators positioned between the TATA box and the upstream activating sequence (UAS) (28). LacZ expression is induced by galactose and is detectable in the presence of glucose. A transcriptionally inert LexA fusion that binds to the operator located between the UASG and the TATA box is able to repress or block LacZ expression, a clear indication that the LexA fusion is properly localized to the yeast nucleus. The reporter plasmid pJK101 is used to perform this repression analysis.
Since repression is never complete, positive and negative controls are needed to calibrate the repression capability of the
protein assayed.
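One way to make this calibration explicit is to place the β-galactosidase activity of the tested fusion on a scale defined by the two controls. The sketch below is only an illustration: the function name, the linear scale and the activity values in the usage note are assumptions and are not part of the published protocol.

```python
def relative_repression(sample, unrepressed, repressed):
    """Return the repression of a LexA fusion on a scale set by two controls.

    sample      -- LacZ activity measured with the fusion under test
    unrepressed -- activity of the reporter without a repressing LexA fusion
    repressed   -- activity with a LexA fusion known to repress the reporter
    A value near 0 means no repression; a value near 1 means repression as
    strong as that of the known repressing control.
    All three values must be in the same (arbitrary) activity units.
    """
    window = unrepressed - repressed
    if window <= 0:
        # The controls must bracket a measurable window, otherwise the
        # assay cannot be calibrated at all.
        raise ValueError("controls do not define a usable repression window")
    return (unrepressed - sample) / window
```

With hypothetical activities of 1000 units (reporter alone), 250 units (known repressing fusion) and 400 units (tested fusion), `relative_repression(400.0, 1000.0, 250.0)` gives roughly 0.8, i.e. the tested fusion represses about 80% as well as the known control.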
One could argue that nuclear localization of the target or prey as such is not strictly required. Indeed, if interaction occurs
in the cytoplasm, both interacting proteins, of which only one contains an NLS, might be targeted to the nucleus. However,
this might only be true for strong interactions that are not always the most interesting.
Positive control
The best way to avoid most problems concerning bait construction is to clone a known interactor in the opposite vector and
to assay the interaction in a two-hybrid system. If this results in specific reporter gene activation, your fusion protein
is correctly folded and properly targeted to the nucleus. However, if this test fails, finding out why might be cumbersome.
Library choice
It is always a good idea to start with a library prepared from a tissue in which the target protein is known to be biologically
relevant. To screen a mammalian cDNA library to saturation, more than 5-10 x 10^6 yeast transformants need to be screened. Both oligo-dT and random primed libraries are used. Important quality parameters
are the number of independent clones before and after amplification, the number of clones that contain an insert and the mean
insert length. Having multiple cloning sites available in the library plasmid makes it easier to subclone the insert at later
stages for additional interaction controls outside the two-hybrid system. The most commonly used plasmids for library construction
are listed in Table 1.
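The order of magnitude of this saturation figure can be illustrated with the standard sampling estimate N = ln(1 - P) / ln(1 - f), where f is the fraction of library clones that carry the sequence of interest and P is the desired probability of recovering it at least once. The sketch below is a rough estimate under assumed values: the function name, the frequency of one clone in a million and the 99% probability are illustrative choices, and the formula ignores reading frame, orientation and amplification bias.

```python
import math


def transformants_needed(clone_frequency, probability=0.99):
    """Estimate how many independent transformants must be screened.

    clone_frequency -- assumed fraction of library clones carrying the
                       sequence of interest (0 < f < 1)
    probability     -- desired chance of sampling it at least once
    """
    if not 0.0 < clone_frequency < 1.0:
        raise ValueError("clone_frequency must lie between 0 and 1")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie between 0 and 1")
    # log1p(-f) computes ln(1 - f) accurately for very small f.
    n = math.log(1.0 - probability) / math.log1p(-clone_frequency)
    return math.ceil(n)


# Hypothetical rare clone: one in a million, 99% chance of recovery.
print(transformants_needed(1e-6, 0.99))
```

For these assumed values the estimate is a little under 5 x 10^6 transformants, which is in the same range as the number quoted above; a rarer clone or a correction for out-of-frame fusions would raise it further.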
Also important is the relative strength of the activation domains, i.e. their ability to initiate transcription. Both
VP16 and the AD of GAL4 are known to be strong activators, making the system more sensitive. This might be needed for the detection
of weak interactions but inevitably results in higher backgrounds. Therefore the use of B42AD, a random fragment that was
isolated for its intermediate transactivation capability, might be beneficial in setting up a screening.
For the library plasmid there is a major difference in promoter between the LexA and GAL4 systems. While in the most commonly used plasmids of the GAL4 system fusion proteins are weakly and constitutively
expressed, they are cloned behind a stronger but inducible promoter in the LexA system. In the latest version of the library plasmid used in the GAL4 system, pACT2, a truncated, weakly expressing version
of the ADH1 promoter is used. However, since this promoter is adjacent to a section of pBR322 that acts as a transcriptional enhancer in
yeast, a medium expression level is obtained.
Inducible expression has the advantage that there is less opportunity for AD fusion proteins to have a toxic effect on the
yeast host, which could otherwise result in the elimination of such a protein from the pool of potentially interacting proteins.
However, the experimental protocol is longer in the case of induction, since transformation efficiency drops dramatically if
the transformants are selected directly on all auxotrophic markers and on a carbon source that induces expression of the
library fusion protein. Therefore, one typically selects for all plasmids before induction. Although this procedure is longer,
it might be beneficial since only part of the original transformation mix needs to be induced, which makes it possible to screen
in shifts and to keep back-ups in case of contamination. It is possible to combine vectors as long as they are compatible with
each other with regard to the selectable markers and the UAS before the reporters (see Table 2) (29).
| Table 2: Comparison of the LexA and GAL4 Yeast Two-Hybrid Systems |
| System | DB vector selection marker | AD vector selection marker | Chromosomal reporter gene(s) | Plasmid reporter gene |
| GAL4 Systems | TRP1 | LEU2 | HIS3, LacZ | (none) |
| LexA Systems | HIS3 | TRP1 | LEU2 | LacZ |
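The compatibility rule for combining vectors can be sketched as a small check over the markers listed in Table 2. This is an illustrative sketch only: the dictionary layout and the function name are assumptions, and the check covers just the two conditions named in the text (no selectable marker used twice, and a DB that matches the UAS or operator in front of the host's reporters).

```python
# Markers and reporters as listed in Table 2.
SYSTEMS = {
    "GAL4": {"db_marker": "TRP1", "ad_marker": "LEU2",
             "reporters": {"HIS3", "LacZ"}},
    "LexA": {"db_marker": "HIS3", "ad_marker": "TRP1",
             "reporters": {"LEU2", "LacZ"}},
}


def compatibility_problems(db_system, ad_system, host_system):
    """List the reasons why a DB vector, AD vector and host cannot be combined.

    Each argument names the system ("GAL4" or "LexA") the component comes
    from. An empty list means no conflict was found by this simple check.
    """
    db_marker = SYSTEMS[db_system]["db_marker"]
    ad_marker = SYSTEMS[ad_system]["ad_marker"]
    reporters = SYSTEMS[host_system]["reporters"]
    problems = []
    if db_system != host_system:
        # The DB must bind the UAS/operator placed before the reporters.
        problems.append("DB does not match the reporter UAS")
    if db_marker == ad_marker:
        problems.append("both vectors are selected with " + db_marker)
    for marker in (db_marker, ad_marker):
        if marker in reporters:
            # A marker that is also a reporter cannot be used for selection.
            problems.append(marker + " is also a reporter in the host")
    return problems
```

For example, `compatibility_problems("LexA", "GAL4", "LexA")` reports that LEU2 would serve both as the AD vector marker and as a reporter, whereas the two unmixed systems return an empty list.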
Choose a yeast strain as a host
Reporter gene(s) and resulting sensitivity
The upstream activating regions and TATA regions are the basic building blocks of yeast promoters. The initiation of gene
transcription in yeast, as in other organisms, is achieved by several molecular mechanisms working in concert. All yeast genes
are preceded by a region containing a TATA box. Many genes are also associated with cis-acting transcription elements and
sequences to which transcription factors and other trans-acting regulatory proteins bind and affect transcription levels.
The term promoter usually refers to both the TATA box and these associated cis-regulatory elements. Gene regulation in yeast
involves cis regulatory elements that are relatively closely associated with the TATA box. The most common type of cis-acting
transcription elements in yeast are upstream activating sequences (UAS). UAS sequences are recognized by specific transcriptional
activators that enhance transcription. The enhancing function of yeast UASs is generally independent of orientation but
is sensitive to distance if the UAS is moved more than a few hundred base pairs from the TATA region. There may be multiple
copies of a UAS upstream of a yeast coding region. In yeast, the genes required for galactose metabolism are controlled by
two regulatory proteins, GAL4 and GAL80, as well as by the carbon source in the medium (30). When galactose is present, the GAL4 protein binds to the GAL-responsive elements within the UAS of at least 20 known galactose-responsive
genes (including GAL1). In the absence of galactose, GAL80 binds to GAL4 and this interaction blocks transcriptional activation. Furthermore, in
the presence of glucose, transcription of galactose genes is immediately repressed. The 17-mer consensus sequence, referred
to as UASG, functions in an additive fashion. Indeed, multiple sites lead to higher transcription levels than a single site (31). This explains why the number of UASG sites located upstream of the reporter determines the sensitivity with which interactions can be assayed.
To avoid interference by endogenous GAL4 and GAL80 proteins, the yeast host strains used in the GAL4-based two-hybrid system
must carry deletions of the GAL4 and GAL80 genes. Due to the deletion of these two genes, the yeast cells grow more slowly
than yeast containing the wild-type versions of these genes. The use of bacterial LexA circumvents this disadvantage.
Reporter genes can be integrated into the genome or reside on a plasmid. The inconvenience of having another plasmid containing
the reporter gene, and the need for an additional auxotrophic marker, is compensated by several advantages. Indeed, one of
the major advantages of the LexA system, where the LacZ reporter gene is present on a high copy-number plasmid, is that weak signals are amplified more efficiently than in the GAL4
system, which makes it possible to assay β-galactosidase activity directly on the selection plate by including X-Gal in the
medium. This avoids tedious replica and/or filter lift assays.
Reporter strains in LexA and GAL4
In the GAL4-based MATCHMAKER two-hybrid system, either the intact GAL1 UAS, which contains four GAL4-binding sites, or an artificially constructed UAS consisting of three copies of the 17-mer
consensus binding sequence is used.
| Table 3: Survey of the most commonly used yeast strains, their reporter genes and the constitution (origin) of their promoter in the GAL4 based two-hybrid system. |
| Strain | Reporter gene | UAS regulated by + origin of UAS | Expression, uninduced | Expression, induced |
| HF7c | LacZ | GAL4, 3 x UASG 17-mer | - | low |
| HF7c | HIS3 | GAL4, GAL1 (= 4 x UASG 17-mer) | - | high |
| YRG-2 | LacZ | GAL4, 3 x UASG 17-mer | - | low |
| YRG-2 | HIS3 | GAL4, GAL1 (= 4 x UASG 17-mer) | - | high |
| SFY526 | LacZ | GAL4, GAL1 (= 4 x UASG 17-mer) | - | high |
| Y187 | LacZ | GAL4, GAL1 (= 4 x UASG 17-mer) | - | high |
| Y190 | LacZ | GAL4, GAL1 (= 4 x UASG 17-mer) | - | high |
| Y190 | HIS3 | GAL4, GAL1 (= 4 x UASG 17-mer) | low | high |
| CG-1945 | LacZ | GAL4, 3 x UASG 17-mer | - | low |
| CG-1945 | HIS3 | GAL4, GAL1 (= 4 x UASG 17-mer) | very low | high |
| L40 | HIS3 | 4 x LexA op | | |
| L40 | LacZ | 8 x LexA op | | |
The relative sensitivity of the two reporter genes used in the GAL4-based approach, LacZ and HIS3, depends entirely on the constitution of the promoter. In the GAL4 system the LacZ reporter is more stringent, meaning less likely to give false positives, than the HIS3 reporter. This was elegantly demonstrated in a recently described reverse three-hybrid approach. Here the expression of Raf from
a third promoter could titrate out the Ras-DB fusion, keeping it from interacting with the Raf-AD fusion. This reverse experiment
resulted in the expected LacZ-negative phenotype but not in a HIS3-negative phenotype, reflecting the higher sensitivity of the histidine reporter gene as compared to the β-galactosidase reporter
gene in HF7c (32).
In LexA-based two-hybrid systems, the DB is provided by the entire prokaryotic LexA protein. LexA normally functions as a repressor of SOS genes in E. coli by binding LexA operator sequences that are an integral part of the promoter (33). When used in the yeast two-hybrid system, the LexA protein does not act as a repressor because the LexA operators are integrated upstream of the minimal promoter and coding region of the LEU2 reporter gene. Expression of the latter in EGY48 (Erica Golemis Yeast) is under the control of six copies of the LexA operator (op) sequence and a minimal LEU2 promoter. In the LacZ reporter plasmids, the LacZ reporter expression is under the control of 1-8 copies of the LexA operator and the minimal GAL1 promoter (34).
As all of the GAL1 UAS sequences have been removed from the LacZ reporter plasmids, this promoter is not regulated by glucose or galactose (35).
|