Recognition of “Self” and “Non-Self” Glycans by Carbohydrate-Binding Proteins…

Hello, I’m Gerardo R. Vasta from the University
of Maryland in the United States, and my research program is focused on the molecular structural
and functional aspects of protein-carbohydrate interactions that mediate self and non-self
recognition, for example during developmental processes and immune responses. My goal today
is to illustrate these aspects with some of the proteins that are of interest to my laboratory
and use them as models to examine basic aspects of these interactions. As you can guess from the title of my talk,
I will focus on carbohydrate-binding proteins, and the first question to ask is:
why are these proteins important? They’re important because oligosaccharides present
on the cell surface and soluble glycans encode complex information, which is decoded by the
carbohydrate-binding proteins, and in doing so, they modulate the interactions between
cells, and between cells and the extracellular matrix, in various processes, such as early development
and immune responses. Innate immunity is the first barrier of defense
because for the most part, these responses do not require antigen processing and are
very fast relative to the adaptive immune responses that may take weeks or months to
develop. Among the hallmarks of innate immunity, perhaps the most striking is the diversity
of the innate recognition receptors, which are distributed among various cell types. Innate immune receptors include both soluble
and cell-associated lectins, natural killer cell receptors, scavenger and complement receptors,
(?) and toll-like receptors (TLRs) and the intracellular NODs that signal via NF-κB to activate immune
genes and up-regulate the expression of antimicrobial peptides in insects or cytokines in mammals.
Now, each of these groups is quite diverse in itself. A good example is the mammalian
toll-like receptors, with about a dozen or more members depending on the species, some
at the cell surface and some with intracellular localization. So what is a lectin? Lectins are proteins
characterized by a carbohydrate recognition domain, abbreviated “CRD”, and many examples
are chimeric molecules consisting of additional structurally and functionally diverse domains. Lectins have been classified based on their
calcium requirements, carbohydrate specificity and the presence of unique sequence motifs
in the CRD. And here we can see a partial list of the lectin families identified so
far, and the list keeps expanding. More recently, lectin families have been identified
and characterized by their structural folds, as shown here for the pentraxin, I-type,
galectin, P-type, and C-type lectins. As I mentioned previously, the lectin domain
can be joined by other structurally and functionally different domains, resulting in mosaic or chimeric
proteins. This cartoon of the C-type lectin family shows the C-type lectin domain as a
red hexagon in combination with a wide variety of distinct domains. This has enabled the
classification of C-type lectins into several groups and types, while underscoring their
diversity in both recognition and effector functions. Some C-type lectins, like the selectins,
recognize self ligands, whereas others, like mannose-binding lectins, conglutinins, and
lung surfactants, bind to microbial glycans and are described as typical pattern recognition
receptors, based on the immune recognition model by Janeway and Medzhitov. So how does a CRD recognize a sugar, and how
are the effector functions launched? This is best illustrated with a prototypical pattern
recognition receptor, the mannose-binding lectin, which displays exquisite specificity
(?) consisting of two equatorial hydroxyls on neighboring carbons. Since the lectin subunits
form trimers, the binding clefts are displayed in such a way that they can recognize sugars on
the microbial surface that are 45 angstroms apart. Recognition of the microbial surface by the
mannose-binding lectin functions as a tag that is in turn recognized by phagocytic cells
such as macrophages, leading to enhanced phagocytosis which is known as the opsonic effect. The recognition of the microbial surface by
mannose-binding lectin can also lead to association with a specific serine protease, or MASP,
which in turn triggers activation of the complement cascade. This leads to increased opsonization
or direct killing of the microbe via the membrane attack complex. Now, it is well established that invertebrate
organisms, which lack adaptive immunity mediated by immunoglobulins and B and T cells, are also
successful in fighting infectious disease. However, many questions remain open concerning
the capacity for immune recognition of the lectin repertoires which these organisms as
well as vertebrates rely upon. Among these, how is diversity in “self” and “non-self”
recognition achieved? How diverse are the lectin repertoires in a given species or individual?
And how have these repertoires evolved in their functional diversity? I’d like to illustrate these aspects with
a new lectin family that a student in my lab, Eric Odom, identified a few years ago, and
which we designated as the F-lectin family because most members bind fucose. The initial
observation was a novel 32-kilodalton fucose-binding lectin in the striped bass, Morone saxatilis,
which we called MsFBP-32. This lectin is expressed in the liver and released to the bloodstream,
and it is upregulated by immune challenge. Cloning and sequencing of the bass lectin
transcripts and gene revealed a novel lectin family with two unique CRDs in the lectin
subunit. Based on this novel sequence motif, several family members were identified which
display single or multiple F-type CRDs alone or in combination with other recognition or
effector domains. As illustrated in this cartoon, the green
rectangle represents the F-type CRD, present as a single domain in the European eel and
horseshoe crab tachylectin-4, with three CRDs in Xenopus tropicalis, four in the trout and
Xenopus laevis, and five in the oyster bindin, a gamete recognition protein. There is one CRD in Drosophila
furrowed, a mosaic protein with a C-type lectin domain and multiple complement control domains.
There are also F-type CRDs in the fucose utilization operon in Streptococcus pneumoniae, identified
as a virulence factor in this species, and later in Streptococcus mitis lectinolysin, a
lectin that promotes pore formation in the host cells. As we recognized this as a novel lectin
family, we became interested in its structural aspects. To start with, we selected the simplest
one, the eel lectin, also known as Anguilla anguilla agglutinin, or AAA, which displays
a single F-type CRD. In a collaboration with Drs. Amzel and Bianchet
from the Department of Biophysics at the Johns Hopkins University, the protein was crystallized
and the structure resolved. The structure revealed that this is not only a new lectin
family but also a novel structural fold for animal lectins. The bulk of the fold consists
of eight major antiparallel β-strands arranged in two β-sheets of five and three strands
each, with five loops, which were named CRD 1 to 5, connecting opposite sheets. These
loops surround a positively charged carbohydrate-binding cleft. At the side of the barrel, a loop-rich
substructure tightly binds a cation, most likely calcium. The native 17-kilodalton protein
forms a 51-kilodalton non-covalently associated trimer that represents the native oligomeric
state of the eel lectin, and can recognize with high affinity carbohydrates that are 26 angstroms apart. Here we can see how the eel lectin recognizes
the fucose axial hydroxyl on C4 using three polar interactions established by histidine 52,
arginine 79 and arginine 86, with perfect tetrahedral geometry. In addition, the hydrogen
of arginine 79 bonds to the oxygen on C5, and the hydrogen of arginine 86 bonds to the
equatorial hydroxyl on C3. The disulfide bridge formed by cysteines 82 and 83 in CRD4 establishes
a van der Waals interaction with the fucose ring atoms C1 and C2. Finally, the C6
methyl group docks in the hydrophobic pocket formed by the aromatic rings of histidine
27 and phenylalanine 45 on top, together with leucine 23 and tyrosine 46 on the side.
On the right panel we can see the pentagonal bipyramidal geometry of the
metal atom, suggesting that it is calcium. Now, in nature, the eel lectin mostly binds
oligosaccharides from glycans that may be soluble or at the cell surface. Here we can
see how recognition of Lewis A and H type-1 oligosaccharides takes place by what we call
“the extended recognition site”, where additional interactions are established between
the protein and the subterminal carbohydrate units. For example, in the left panel, the
glutamine 26 and histidine 27 in CRD 1 recognize equatorial hydroxyls on carbons 3 and 2 of
galactose and the oxygen of the N-acetyl group in the subterminal N-acetylglucosamine in
Lewis A. On the right panel, these residues interact with the 6 and 4 hydroxyl groups
of N-acetylglucosamine in the H type-1 oligosaccharide. In both oligosaccharides, the hydroxyl group
of tyrosine 46, in CRD 2, coordinates the glycosidic bond oxygen between galactose and N-acetylglucosamine. We then moved on to elucidate the structure of
the family member with two CRDs, the striped bass F-lectin. The structure of the F-lectin subunit reveals
that the two globular domains containing the CRDs at the N- and C-termini, which I will
refer to from now on as the N- and C-CRDs, are connected by a linker peptide. The subunits
form a cylindrical trimer that is 81 angstroms long and 60 angstroms in diameter, divided into
two halves, one containing the three N-CRDs and the other containing the three C-CRDs.
The top and bottom ends of the cylindrical trimer show a three-fold arrangement of
fucose-binding sites. All six CRDs contain a bound calcium, and chloride ions are coordinated
by lysines and arginines in each lectin subunit. If we compare the N- and C-CRDs of the bass
lectin with that of the eel, we can see that there are significant sequence and structural
differences. It is evident that the N-CRD binding site resembles the eel lectin site
more closely than the C-CRD does. The C-CRD pocket is less open than its counterpart due to the
replacement of phenylalanine 37 in the N-CRD by the bulky tryptophan 183. In addition,
in the C-CRD, phenylalanine 220 makes an apolar contact with the fucose that in the N-CRD
is provided by the disulfide bridge formed by the consecutive cysteines 74 and 75. These features can be clearly seen in the
landscapes of the N- and C-CRD recognition sites, which show notable differences. In the
N-CRD, tyrosine 18 and histidine 20 form a polar ridge and a valley, which are quite
different from the (?), (?) formed by phenylalanine 166 and glutamate 167 in the C-CRD. The indole
ring of tryptophan 183 establishes a protruding ridge by connecting with glutamate 184. This
feature is not present in the N-CRD. In the N-CRD, the disulfide bridge forms a flat apolar
surface, while in the C-CRD, the bulky phenylalanine 220 narrows the recognition site. So in summary,
the N- and C-CRDs display differences both in the polar pocket and the extended recognition
site, which explain the differences in the carbohydrate specificity observed. A lateral view of the eel and bass lectins
shows that, while in the eel lectin the binding sites are located on one face of the molecule,
the bass lectin looks more like a cylinder, with opposing faces of
different binding properties. In contrast, a non-covalent dimer formed by the eel lectin, shown
in the left panel, has a similar shape, but displays identical binding faces at both ends
of the cylinder. The bass lectin structure suggests that it can link different ligands,
perhaps self and non-self, as seen in the cartoon below. Blocking experiments revealed that
the N-CRD can recognize oligosaccharides containing Lewis and H trisaccharides, whereas the C-CRD
prefers carbohydrates with fewer than two terminal fucose moieties. This arrangement is able
to crosslink fucosylated glycans on the host cells with those on bacteria, viruses, or parasites.
And listed on the right are natural ligands of the bass lectin and other F-lectins, such as
the horseshoe crab tachylectins, that include O antigens of microbial pathogens containing
α-linked L-fucose, 2-acetamido-L-fucose, 3-deoxy-L-fucose (commonly known as colitose), and L-rhamnose
(a deoxy-L-mannose frequently observed in E. coli glycans). The crosslinking of host
and microbial glycans can be experimentally demonstrated by pre-exposing E. coli to the
bass F-lectin and testing for any increase in phagocytosis rates by peritoneal macrophages
relative to the unexposed controls. The opsonic effect can be inhibited by fucose, as seen
on the left, and shows the dose-response pattern shown in the right panel. A cartoon showing
the opsonic effect mediated by the eel F-lectin is shown below. In addition to the single- and double-CRD F-lectins just described, which are shown on the left side, here we can see the hypothetical
organization of trimers of F-lectin subunits with three and four CRDs. They all
display binding faces with different specificity, and this highlights their potential diversity
in recognition properties. A still higher level of diversity can be attained
by the presence of lectin isoforms, or isolectins. The eel F-lectin has seven isolectins, and
the highest sequence variability occurs in the loops that encircle the binding site. The
nature of the amino acid substitutions in the CRDs indicates differences in the fine
specificity of these isolectins. The seven isolectins, shown in the alignment
as eFL-1 to eFL-7, conserve the triad of amino acids associated with polar interactions with
fucose, but they show sequence variation in those associated with the hydrophobic pocket
for C6 and with binding to the subterminal oligosaccharide, particularly in CRD 1 and CRD 2. For example,
in eFL-1 and eFL-5, the residues in the apex of CRD 1 are replaced with smaller residues,
making this loop thinner and more flexible, and opening the C6 hydrophobic pocket to the
solvent. On the right side of the panel, we can see the sequence variability represented
in green with the lighter color indicating higher variability, which localizes to the
areas that interact with the fucose and subterminal sugars. The family member I will now discuss is bindin,
an F-lectin with five CRDs from the Asian oyster, Crassostrea gigas, because it is quite
unique in its sequence diversity. Bindin is a gamete recognition protein of
variable mass present in the sperm acrosomal rings. During fertilization, these are exocytosed
and bind to the microvilli on the egg envelope. The protein is encoded by a single gene that
through positive selection, recombination, and alternative splicing, produces a striking
number of variants of the protein, potentially several thousands. Positive selection occurs
at eight sites clustered on the F-lectin fucose-binding cleft, in the same region as in the eel
F-lectin, as shown in the overlay of the bindin and the eel lectin on the top right. Additional
diversity results from recombination in the intron in each F-lectin repeat. Finally,
alternative splicing yields bindin transcripts with one to five repeats as shown from a single
individual in the bottom left panel. However, only one or two variants are translated into
protein, as seen from several individual males in the bottom right figure, and it has been
proposed that specific recognition by bindin isoforms prevents polyspermy during fertilization. Another mechanism that leads to diversity in recognition
becomes evident in the domain structure of discoidin, the two-CRD cell adhesion lectin from the
slime mold, Dictyostelium discoideum, shown here in the center. This lectin combines one
F-type lectin domain, shown as the green rectangle, in common with the bass lectin, and an H-type
lectin domain, which recognizes N-acetylated sugars, that it shares with the lectin from the
garden snail, Helix pomatia, seen on the right. This slide summarizes what I have just described:
both single- and multiple-CRD F-lectins can crosslink glycans on the surfaces of cells,
for example, in the case of the eel and other fish, between host cells and microbes for the
purpose of immune recognition; with oocyte glycans for avoiding polyspermy during fertilization;
and finally, between host cells and their secretions, as in the slime mold, for cell
adhesion functions. In summary, the structural diversity of F-lectins
supports a substantial functional diversification, although many questions remain open about
the mechanistic aspects of these interactions. Now I would like to move to the galectins, which
are a lectin family with a very different structure, to illustrate self and non-self recognition
leading to other biological roles. Galectins are lectins characterized by their preference
for β-galactoside moieties and a canonical sequence motif in their CRD. They’re soluble,
ubiquitous, and highly conserved. And here you can see the structure of mammalian galectin-1
co-crystallized with the sugar ligand, N-acetyllactosamine. This galectin belongs to the proto type, consisting
of two subunits bound by non-covalent interactions, each carrying a single CRD. Chimera type galectins
have a carboxyl terminal CRD that is similar to the prototype, joined to an amino terminal
peptide that is rich in glycine, tyrosine, and proline that can oligomerize mostly as
trimers and pentamers. In the tandem-repeat galectins, two CRDs are joined by a functional
linker peptide. The proto and tandem-repeat types comprise several distinct subtypes, which
have been numbered according to the order of their discovery. So far, about fifteen
have been described in mammals. At the bottom, the key interactions between the protein and
the disaccharide are indicated, and in a schematic manner the cartoon shows the interactions
of histidine 44, asparagine 46, arginine 48, histidine 52, asparagine 61, tryptophan 68,
glutamic acid 71, and arginine 73. Additionally, interactions involving a water molecule that
bridges the nitrogen of the N-acetyl group of the sugar with additional amino acid residues
further strengthen the protein-carbohydrate binding. Galectins are synthesized in the cytoplasm
and are sometimes translocated into the nucleus and associated with ribonucleoproteins.
Most galectins can be exported via a non-conventional mechanism to the cell periphery and secreted
to the extracellular space. There they may bind cell surface carbohydrate receptors and
form lattices, or link them with the extracellular matrix. Because soon after their discovery in
the early eighties they were found to be developmentally regulated, galectins were
proposed to be involved in embryogenesis and early developmental processes. We looked at the roles of galectins in development
using the zebrafish model because of its multiple advantages over mammalian models, such as a genetically
tractable system, a smaller galectin repertoire, and transparent embryos. The prototype galectins Drgal1-L2 and -L4
are expressed in the notochord, and this is shown here for Drgal1-L2 by in situ hybridization
on the top, and by whole-mount immunostaining, with the specificity of the antibody validated
in the left panel. Knockdown experiments in zebrafish embryos
using morpholino-modified antisense oligonucleotides targeted to the 5’-UTR sequence of Drgal1-L2 or
-L4 result in a phenotype with disorganized myofibers that is not observed with the mismatch oligo
and that can be rescued by co-injection of the appropriate mRNA. The zebrafish Drgal1-L2 is also involved in
tissue repair and regeneration. Upon experimental photo-induced damage of the retina and photoreceptor
cell death, expression of the galectin is upregulated, and it is secreted by stem cells and neuronal progenitors
and regulates regeneration of rod photoreceptors in the retina. As in the previous experiments,
knockdown of the galectin expression by the specific morpholino oligonucleotide shows
reduced fluorescence staining of the neuroglia in retinas exposed to light for 72 hours and
7 days; the images are on the right side of the left panel. By the use of an anti-rhodopsin
specific antibody, we can see reduced regeneration of the rod photoreceptors in the bottom
right panel, while regeneration of the cones, stained by an anti-PDE6C antibody, a marker
specific for the cones, is not affected in the knockdown, as seen in the top right panel. In summary, the two examples illustrate how
galectins participate in developmental and tissue regeneration processes. These functions
are based on the recognition of self galactose ligands at the cell surface or extracellular
matrix, followed by activation of specific signaling pathways. In recent years, the critical roles of galectins
in regulation of adaptive immunity have been discovered. One of the earliest and perhaps
most striking findings is their participation in Th1/Th2 polarization. This is based on
the activity of a sialyltransferase that masks galectin ligands on the surface of T cells
and prevents galectin binding and formation of microdomains that would otherwise activate
pro-apoptotic cascades. Clearly, these immune regulatory functions also result from interactions
of galectins with self glycans, in this case, on the surface of immune cells. Even more
recently, however, the binding of galectins to non-self glycans on the surface of microbial
pathogens, parasites, and fungi has been documented. It would seem that the galectins fit well among
the pattern recognition receptors that I previously described, which include TLRs and NOD receptors,
complement, scavenger receptors, and both membrane-associated and soluble lectins, all
of which are key components of innate immunity. There is another level of complexity in the
galectin-mediated recognition of microbial pathogens, however, because some pathogens and parasites
take advantage of the recognition properties of galectins and subvert their roles as pattern
recognition receptors. I will illustrate this observation with interactions
between the eastern oyster, Crassostrea virginica, and the protozoan parasite, Perkinsus marinus.
Oysters are bivalve mollusks that feed by filtering planktonic microalgae that are suspended
in the water column. These images show how oysters can clear the water of suspended
phytoplankton in a matter of minutes. Once internalized, microalgae are engulfed by phagocytic
cells and digested. The problem is, in certain areas, together with the microalgae, the oyster
ingests the parasite Perkinsus, which infects the mollusk and eventually kills it. The ultrastructure of two life stages of
the parasite is shown by transmission electron microscopy: the trophozoite on the left and
the motile zoospore on the right. Scanning electron microscopy shows oyster
phagocytic cells, called hemocytes, in the process of engulfing Perkinsus trophozoites
in vitro. Once internalized, the parasite inhibits the respiratory burst, avoiding intracellular
oxidative killing, and survives and proliferates within the phagocytic cells. One of the unresolved
key questions about this process, however, is how the parasite is recognized and internalized
by the hemocytes. We discovered that the oyster possesses a
unique galectin with four tandemly arrayed CRDs which is shown here in comparison with
the mammalian galectin types, the proto, chimera, and tandem-repeat. The oyster galectin CRDs
are well-conserved in structure with respect to the mammalian galectins, such as bovine
galectins shown here in gray. They’re also unique in that they lack two amino acid residues
shown in the sequence above that interact with the N-acetyl group of the disaccharide.
The lack of these two residues confers broader specificity to this galectin. This broader specificity is reflected by the
variable binding of the oyster galectin to the microalgal food and the environmental
bacteria, with the strongest binding towards all species and strains of the parasite tested.
The binding to the parasites is specific, as it can be inhibited by a galactose derivative,
thiodigalactoside (TDG), but not by glucose, as shown in the upper right panel. Interestingly,
RT-PCR revealed that, as compared to other tissues or cell types, the highest number of transcripts
can be found in the phagocytic cells, the hemocytes. Furthermore, when the hemocytes
are in circulation, the galectin protein can only be detected by western blot in the cell
extracts, but not in the plasma. When the hemocytes attach to a foreign surface, such
as plastic in a petri dish, the galectin can be detected not only in the cells but also
in the supernatant, indicating that it is released to the environment. If the galectin is immunolocalized with fluorescent
antibodies in the circulating cells, it can only be detected if the hemocytes are permeabilized,
as seen on the left, indicating that the galectin is intracellular and homogeneously distributed
in the cytoplasm. When the hemocytes attach to a foreign surface, however, the labeled
galectin can be seen in the permeabilized cells as translocated to the periphery. In the untreated
cells, the stain is clearly on the external surface, indicating that the galectin was
secreted upon attachment and spreading of the hemocytes and bound to the cell surface. If we pretreat the hemocytes with an antibody
specific to the oyster galectin, the phagocytosis of Perkinsus is decreased in a dose-response
manner. This suggests that the oyster galectin can not only bind the parasites, but that the
galectin bound to the surface contributes to parasite uptake by the hemocytes. This
led us to hypothesize that, while multiple surface receptors may contribute to
the phagocytosis of microalgal food, in this case Isochrysis and Tetraselmis species,
the parasite Perkinsus can outcompete the microalgae by strongly binding to the surface galectin,
which the parasite uses to enter the cell. This is illustrated in the left panel of the
cartoon: by mimicry of microalgal surface glycans, the parasite Perkinsus enters the
hemocytes, inhibits intracellular killing mechanisms, and proliferates, systemically
invading the host. The use of host galectins for entry is conceptually similar to the colonization
of the sandfly by Leishmania parasites ingested with a blood meal, which happens via a galectin
expressed in the midgut of the insect vector. This is shown on the right panel where, after
the procyclic parasites mature into the flagellated stage, the binding to the sandfly galectin
is disabled by capping the surface glycans with arabinose. This enables the (?) parasite
to be released from the midgut epithelium and migrate to the vector salivary glands,
from where it will be introduced into the mammalian host. Based on the Janeway and Medzhitov model for
innate immune recognition, pattern recognition receptors recognize microbes through pathogen-associated
molecular patterns that are absent from the host. Galectins, however, apparently bind
similar galactoside moieties on the surface of host cells and microbial pathogens.
Therefore, glycan recognition by galectins is a paradox, in that the proteins apparently
bind similar self and non-self molecular patterns. From an immune recognition standpoint, this
is counterintuitive, and is most likely due to our limited knowledge of how galectins are
secreted and compartmentalized, and of various aspects of the binding interactions. These
include the oligomerization of galectins, differences in presentation and scaffolding
of the carbohydrate ligands on the surfaces of host and microbial cells, and thermodynamic
aspects of the binding interaction. Finally, because galectins play key roles in early
development and immune regulation, as we have seen in this presentation, they are subject
to evolutionary constraints and are substantially conserved. So over evolutionary time, some
pathogens and parasites may have evolved their glycomes to mimic their hosts, and take advantage
of these conserved binding properties of galectins to attach and invade them. I would like to acknowledge the contributions
to the studies I described by current and former members of my lab, my collaborators,
and the financial support from the National Institutes of Health, National Science Foundation,
Maryland Sea Grant, National Oceanic and Atmospheric Administration, and the US Department of Agriculture.
I hope you have enjoyed this talk and I encourage you to read more about this exciting field.
