
REVIEWS

RNA-based recognition and targeting: sowing the seeds of specificity
Stanislaw A. Gorski1, Jörg Vogel1,2 and Jennifer A. Doudna3–8
Abstract | RNA is involved in the regulation of multiple cellular processes, often by forming
sequence-specific base pairs with cellular RNA or DNA targets that must be identified among
the large number of nucleic acids in a cell. Several RNA-based regulatory systems in eukaryotes,
bacteria and archaea, including microRNAs (miRNAs), small interfering RNAs (siRNAs), CRISPR
RNAs (crRNAs) and small RNAs (sRNAs) that are dependent on the RNA chaperone protein Hfq,
achieve specificity using similar strategies. Central to their function is the presentation of short
seed sequences within a ribonucleoprotein complex to facilitate the search for and recognition
of targets.

1 Institute of Molecular Infection Biology, University of Würzburg, Josef-Schneider-Strasse 2/D15, D-97080 Würzburg, Germany.
2 Helmholtz Institute for RNA-based Infection Research (HIRI), University of Würzburg, D-97080 Würzburg, Germany.
3 Department of Molecular and Cell Biology, University of California, Berkeley.
4 Howard Hughes Medical Institute, University of California, Berkeley.
5 California Institute for Quantitative Biosciences, University of California, Berkeley.
6 Department of Chemistry, University of California, Berkeley.
7 Lawrence Berkeley National Laboratory, Berkeley.
8 Innovative Genomics Initiative, University of California, Berkeley, California 94720, USA.
Correspondence to J.V. and J.A.D.
joerg.vogel@uni-wuerzburg.de; doudna@berkeley.edu
doi:10.1038/nrm.2016.174
Published online 15 Feb 2017

RNA has a central role in the cell because it directs multiple mechanistically distinct processes1. For example, tRNAs play a fundamental part in protein synthesis, small nuclear RNAs (snRNAs) have key roles in mRNA splicing, and small nucleolar RNAs (snoRNAs) and small Cajal body-specific RNAs (scaRNAs) guide RNA modifications. Moreover, there are numerous and diverse classes of non-coding RNA that direct gene regulation or defend the genome to ensure its integrity. Several properties make RNA particularly well suited for controlling these processes1,2. Most importantly, RNA can regulate nucleic acids in a sequence-specific manner by base-pairing with defined RNA or DNA targets. It can also be rapidly synthesized or processed in response to a signal and it is functional without the need to be translated. Moreover, regulatory RNA molecules can be co-degraded with their targets on binding to each other, enabling them to impart rapid responses to a change in cellular status with quantitative characteristics that are distinct from protein-based cellular regulation2,3.

Despite these advantages, RNA molecules must overcome several limitations to ensure that they specifically regulate cellular targets through direct base-pairing interactions. Although regulatory RNAs range in length from twenty to hundreds of nucleotides, target binding usually involves partial complementarity with only a short stretch of nucleotides4. This increases the potential for unintended interactions between RNA species in a cell. Therefore, solving the problem of target specificity requires that regulatory RNAs contain functional sequences that are used to search for, and interact with, regulated transcripts. However, as RNA sequences are not chemically diverse (RNA consists of four nucleotides that come in two sizes and consist of planar bases containing hydrogen bond donors and acceptors)5, their secondary and tertiary structures are important in defining the sequences involved in target recognition. Functional secondary and tertiary structures compete with energetically similar alternative conformations, which in some instances sequester the sequences that define target specificity; this favours non-target interactions and limits the rate of regulatory RNA–target associations. In addition, interactions involving naked RNA and DNA oligonucleotides must overcome thermodynamic and kinetic barriers to efficiently regulate biological processes5–8. For example, duplexes containing 8 nucleotides are unstable, whereas a 20-nucleotide RNA–RNA duplex is nearly irreversible under physiological conditions8.

The ubiquitous nature of RNA as a regulatory molecule suggests that RNA-based systems have evolved important mechanisms to ensure target specificity despite the thousands (if not millions) of expressed non-target RNAs that share a degree of complementarity. One way to achieve specificity is for RNAs to be subdivided into biochemically distinct regions, with specific subregions that interrogate potential RNA or DNA targets. Indeed, this feature has been identified in several RNA-based systems using molecular, genomic and computational approaches. These subregions are often referred to as seed sequences: short stretches of nucleotides that are involved in the initial search by RNAs for, and binding of RNAs to, their targets and that disproportionately contribute to regulation compared with other regions of the RNA4,9.

Early studies of RNA-based regulatory systems suggested that the folding of RNA into secondary and tertiary structures to expose specific sequences used for binding to their targets was a general principle of RNA-based recognition. Examples of these structural

NATURE REVIEWS | MOLECULAR CELL BIOLOGY ADVANCE ONLINE PUBLICATION | 1



© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.

motifs include the anticodon loop in tRNA10 and loop–loop kissing interactions in antisense RNAs from plasmids and phages11. However, almost all RNAs associate with RNA-binding proteins, which have important structural, regulatory or catalytic roles in defining the behaviour of RNA. As the mechanisms of more RNA-based regulatory systems are elucidated, it is becoming clear that regulatory RNAs associate with specific proteins, enabling them to function as guides to bring each protein to its target site8,12–16. Such dedicated RNA-binding proteins alter the properties of nucleic acid hybridization, including how complementary nucleic acid sequences find each other, bind and dissociate. Importantly, several recent structural studies have revealed how the seed sequence is presented within the context of the respective ribonucleoprotein (RNP) particle to ensure its precise targeting and regulation17–27, and single-molecule studies have begun to reveal the dynamics of the targeting reaction and to confirm the role of the seed sequence in specific RNA–target interactions8,28–30.

This Review focuses on seed pairing mechanisms in three distinct systems that use RNA as a guide, namely small interfering RNA (siRNA) and microRNA (miRNA)-mediated gene silencing, bacterial CRISPR–Cas adaptive immunity and Hfq-dependent small RNA (sRNA)-guided gene regulation (FIG. 1). We briefly introduce each system before describing how they handle the issue of specificity and highlighting common principles and major differences between them.

Loop–loop kissing interactions: Watson–Crick base pairing between the loop nucleotides of two RNA stem loops.
Adaptive immunity: A specific response to an infection by a pathogen based on prior exposure to that pathogen.
RNA-induced silencing complex (RISC): A ribonucleoprotein complex of an Argonaute protein and an RNA.
Scissile bond: A covalent bond that can be broken by an enzymatic reaction.
PAZ domain (PIWI–Argonaute–Zwille domain): A domain present in Argonaute proteins that is involved in binding to the 3′ end of the guide.
PIWI domain (P-element-induced wimpy testis domain): A domain present in Argonaute proteins that contains an RNase H-like active site.
A-form helix: A secondary structure motif found in RNA in which bases are tilted with respect to the helix axis.

Targeting by miRNAs and siRNAs
The discovery of miRNAs and siRNAs revolutionized the idea that RNA is a regulatory molecule in eukaryotic organisms. Computational and experimental approaches suggest that metazoan genomes encode hundreds of miRNAs31,32, and it is estimated that the most conserved miRNAs control the expression of approximately two-thirds of human genes33. Although the origin and precursors of miRNAs and siRNAs differ, they share in part common biogenesis pathways and they are both loaded onto Argonaute (AGO) proteins, which they guide to mRNA targets for RNA silencing16,34,35.

Biogenesis. miRNAs are sRNAs of ~21–23 nucleotides in length that are encoded by the genome and some viruses. The majority of metazoan miRNAs originate from independent loci, although some are processed from within introns. Their primary transcripts contain stem–loop structures, and consecutive endonuclease cleavages by two enzymes, Drosha in the nucleus and Dicer in the cytoplasm, generate a miRNA duplex containing a 5′ monophosphate and a 3′ hydroxyl. The mature miRNA strand is loaded into the AGO protein within the RNA-induced silencing complex (RISC)34,36 (FIG. 1a). The miRNA within RISC associates with target mRNAs through base-pairing, leading to translational repression and/or mRNA destabilization37 or, in some cases when the miRNA is highly complementary to its target, mRNA cleavage (FIG. 1a,b). siRNAs differ from miRNAs in that they are excised from long, fully complementary, double-stranded RNAs; these can originate from transgenes, viral infections and repetitive elements34. After the siRNA precursors are cleaved by Dicer into duplexes of ~21 nucleotides they are loaded into the siRISC (FIG. 1b), from which they recognize and associate with target transcripts through base-pairing, with high levels of complementarity; this enables the endonucleolytic cleavage of the scissile bond opposite the tenth and eleventh guide nucleotide by AGO within siRISC34.

To identify physiological roles of miRNA and siRNA, early studies focused on predicting and identifying their target transcripts in the cell. Several fundamental observations provided molecular insight into how they select their targets. Sequence analysis revealed that the highly conserved 5′ regions of miRNAs are complementary to sequences in 3′ untranslated regions (3′ UTRs) of mRNA that had previously been linked to post-transcriptional regulation38–42. Distinct contributions of siRNA and miRNA regions to target regulation support the presence of a seed region; the 5′ nucleotides contribute more to binding and target regulation than the nucleotides at the 3′ end43–46. Consequently, target transcripts can be identified by evaluating the conservation of direct Watson–Crick base-pairing between miRNA seed nucleotides (specifically, nucleotides 2–7) (FIG. 1e) and sequences in the 3′ UTRs of mRNA. Targeting can also be facilitated by additional sequence elements, such as an unpaired adenosine in the target sequence just upstream of the seed sequence41 and, in rare cases, supplementary pairing between the target and the 3′ end of the miRNA, involving miRNA nucleotides 13–16 (REF. 47).

siRNA seed sequence presentation. Initial insights into the molecular basis behind the importance of the seed sequence were obtained from crystal structures of the bacterial Ago protein from Thermus thermophilus. Even though these structures used guide DNAs, they revealed how guide strands mediate RNA recognition26,27,48 (FIG. 2a). T. thermophilus Ago containing a 5′-phosphorylated 21-nucleotide DNA guide strand maintained the same overall bilobal structure as the apoprotein48 (FIG. 2a,b). The DNA guide is nestled within a basic channel spanning the lobes that harbour the PAZ domain and the PIWI domain, and there are extensive contacts between its backbone phosphates and basic residues in the domains and linkers present in the protein. The trajectory of the guide is defined by the anchoring of its 5′ monophosphate into a pocket within the Mid domain and of its 3′ end in the PAZ domain. The conformation of the seed sequence at nucleotides 2–6 reveals how its presentation within the T. thermophilus Ago RNP is important for RNA recognition. Specifically, in agreement with earlier biochemical and biophysical studies45,49, these nucleotides are stacked forming an A-form helix with their Watson–Crick edges exposed to the solvent, a favourable, pre-organized conformation for interacting with a target mRNA (FIG. 2a,b). Subsequent nucleotides are threaded through a narrow channel in the T. thermophilus Ago protein

2 | ADVANCE ONLINE PUBLICATION www.nature.com/nrm




and are therefore inaccessible to mRNA. Introducing a kink into the guide strand by positioning two Arg residues between nucleotides 10 and 11 renders the binary T. thermophilus Ago protein–guide complex catalytically inactive and prevents the formation of the undistorted helical conformation that is required for mRNA cleavage26,27. However, on the addition of a 20-nucleotide target RNA to the T. thermophilus Ago protein–guide complex, the seed region forms an A-form DNA–RNA helix with the target and undergoes a conformational change that relieves the non-helical geometry, enabling cleavage of the target

[Figure 1 image: panels a (miRNA), b (siRNA), c (crRNA) and d (Hfq-dependent sRNA), each showing numbered steps of biogenesis, RNP loading and targeting; panel e compares miRNA (~22 nt, seed ~7 nt), crRNA (~41 nt, seed ~10 nt) and sRNA (~50–200 nt, seed ~6–10 nt).]

Figure 1 | The biogenesis, loading and targeting of RNA in distinct RNA-based regulatory systems. a | MicroRNAs (miRNAs) are generally transcribed by RNA polymerase II (Pol II) (step 1) to produce primary miRNAs (pri-miRNAs) (step 2); these are cleaved by Drosha in the nucleus to produce a stem–loop structure, the pre-miRNA (step 3), which is exported to the cytoplasm and processed by Dicer to form the mature miRNA duplex (step 4), a single strand of which is loaded onto an Argonaute (AGO) protein as part of the miRNA-induced silencing complex (miRISC) (step 5). miRISC binds to target mRNAs through partial complementary base-pairing (step 6), which leads to changes in mRNA stability and/or translational repression or, in cases of high complementarity (indicated by the arrows), mRNA cleavage (not shown). b | Small interfering RNAs (siRNAs) are generated from double-stranded RNA (dsRNA) precursors (step 1), which are cleaved by Dicer into siRNA duplexes (step 2) and loaded onto AGO proteins as part of the siRISC (step 3). The siRISC binds to target mRNAs by highly complementary base-pairing (step 4), which leads to transcript cleavage (not shown). c | The type II CRISPR RNA (crRNA) locus contains repeat sequences that are interspersed with sequences from foreign DNA. It is transcribed by RNA polymerase (RNAP) (step 1) to produce a pre-crRNA; a small transactivating crRNA (tracrRNA) interacts with the repeat sequence of the pre-crRNA to form a duplex (step 2). The duplex is cleaved by RNase III to produce the mature crRNA (step 3). The mature crRNA is loaded into Cas9 nuclease (step 4), from which it binds to a complementary foreign DNA (green) containing a protospacer-adjacent motif (PAM; yellow) (step 5), resulting in DNA cleavage and destruction (not shown). d | Hfq-dependent small RNAs (sRNAs) are transcribed from independent genes or processed from longer RNA transcripts (step 1). They associate with the RNA chaperone Hfq (step 2), which facilitates base-pairing to their target mRNAs (step 3). On targeting an mRNA, the sRNA can positively or negatively affect RNA stability or translation (not shown). e | Properties of miRNAs, crRNAs and Hfq-dependent sRNAs. miRNAs are ~22 nucleotides (nt) in length and contain an ~7 nt seed sequence towards their 5′ end. Mature crRNAs are ~41 nt in length and contain a seed sequence of ~10 nt towards the 3′ end of the crRNA (which is part of the crRNA–tracrRNA complex). Bacterial Hfq-dependent sRNAs vary in length from 50 to 200 nt and they often contain seed sequences of ~6–10 nt, which are generally found at the 5′ end but can be found throughout the sRNA. The position of each seed sequence is indicated in red.
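
The idea that a miRNA target can be identified by Watson–Crick pairing to seed nucleotides 2–7 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a target-prediction method described in this Review: the function names are hypothetical, the let-7a sequence is supplied as an example input, the UTR is an invented toy string, and the scan ignores conservation, G:U wobble pairs, bulges and 3′ supplementary pairing.

```python
from typing import List

# Watson-Crick complements for RNA (no G:U wobble; an assumption of this sketch).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_of(guide: str, start: int = 2, end: int = 7) -> str:
    """Return guide nucleotides start..end (1-based, inclusive), 5'->3'."""
    return guide[start - 1:end]


def find_seed_sites(guide: str, utr: str) -> List[int]:
    """Return 0-based offsets in the UTR (5'->3') that pair perfectly with the seed."""
    site = reverse_complement(seed_of(guide))
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # allow overlapping matches
    return hits


# Example inputs (illustrative, not data from the Review).
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"    # let-7a guide, 5'->3'
TOY_UTR = "AAGCUACCUCAGGGAUACCUCUU"  # hypothetical 3' UTR fragment

if __name__ == "__main__":
    print(seed_of(LET7A))                   # seed nucleotides 2-7
    print(find_seed_sites(LET7A, TOY_UTR))  # offsets of seed-matched sites
```

Because the guide and target pair in antiparallel orientation, the sketch searches the UTR for the reverse complement of the seed rather than for the seed itself.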


mRNA26,27 (FIG. 2b). The importance of mRNA–seed sequence complementarity is highlighted by the fact that cleavage is impaired on the introduction of mismatches or insertions within the seed region in the guide DNA27. Therefore, the pre-organization of the seed in an A-form helix within T. thermophilus Ago facilitates its interaction with the target mRNA and offsets the entropy penalty of binding target mRNA, enhancing the interaction affinity up to 300-fold50.

Entropy penalty: The thermodynamic cost associated with a loss of conformational entropy on the immobilization of a molecule in a fixed configuration.

miRNA seed sequence presentation. Insight into the presentation of miRNA seed sequences and their role in target recognition awaited the structural elucidation of eukaryotic AGO proteins. Crystal structures of human and yeast AGO proteins bound to cellular guide RNAs23,24 (FIG. 2c) or the human miRNA miR-20a25 revealed that these AGO proteins form a similar bilobal domain protein architecture to, and bind RNA in the same way as, that of T. thermophilus Ago (FIG. 2c). Despite the

[Figure 2 image: panels a–d. Panel a guide DNA: 5′ p-TGAGGTAGTAGGTTGTATAGT 3′; panel c guide RNA: 5′ p-UUCACAUUGCCCAAGUCUCUU 3′. Labelled elements: PAZ, Mid, PIWI and N domains; seed (A-form helix); in panel d, AGO2 helix 7, Ile, L1 and L2.]


heterogeneity in RNA sequences bound by the yeast and human AGO proteins in the binary complex, nucleotides 2–6 of the guide strand in human AGO2 unambiguously adopted a similar splayed helical conformation to that of T. thermophilus Ago, with Watson–Crick edges exposed to the solvent (FIG. 2c). After nucleotide 6 in the seed sequence, the trajectory of the guide differs because human AGO inserts an Ile side-chain from helix 7, between the two faces of bases 6 and 7, inducing a kink that breaks the A-form helix structure of the guide (FIG. 2d). Formation of the guide RNA–target RNA duplex coincides with helix 7 moving away from the guide to avoid a steric clash with the target RNA region that pairs to nucleotides 6–7 in the guide RNA; this relieves the kink at this guide position and facilitates the formation of an A-form helix that can encompass the entire seed region of the miRNA18 (FIG. 2d). The target-bound conformation of the miRNA is stabilized when helix 7 interacts with the minor groove of the RNA duplex. The degree of stability probably depends on the extent of complementarity between the seed and the target, which dictates the structure of the duplex minor groove and the creation of a new binding surface for helix 7, thus functioning as a checkpoint for correctly identifying targets18. The repositioning of helix 7 also widens the channel between the N and PAZ domains, enabling nucleotides 14–18 of the miRNA to adopt a stacked A-form state with Watson–Crick edges that are exposed to the solvent and opening the 3′ region of the guide for potential supplementary base-pairing18 (FIG. 2d). Thus, the eukaryotic structures suggest that there is a stepwise model for miRNA targeting and correct target validation.

Figure 2 | Presentation of seed sequences in miRNA and siRNA. a | Crystal structure of Thermus thermophilus Argonaute (Ago) containing a 21-base DNA guide (shown in orange and red, where red denotes the seed sequence; the sequence of the guide is shown above the structure), the 5′ phosphate (p) of which is bound within a pocket in the Mid domain and the 3′ end of which is bound in the PAZ domain (left)27 (Protein Data Bank identifier (PDB ID): 3DLH; see Further information). A close-up of the 5′ region of the DNA guide with nucleotides 2–6 of the seed sequence presented in a near A-form helical conformation is shown on the right. b | Schematic showing the presentation of the small interfering RNA (siRNA) seed sequence and the subsequent targeting of mRNA. Ago binds the guide so that its 5′ end is bound in a pocket in the Ago Mid domain and its 3′ end is bound by the PAZ domain of Ago (step 1). The seed sequence (nucleotides 2–7, coloured red) is presented in an A-form helix conformation and this is subsequently bound by an mRNA target (light blue) (step 2). Duplex formation is accompanied by a conformational change in Ago that relieves the non-helical geometry of the guide and produces a catalytically active state (step 3). c | Crystal structure of human AGO2 (left) containing a 21-nucleotide superoxide dismutase 1 (SOD1) guide RNA (shown in orange and red, where red depicts the seed sequence; the sequence of the guide is shown above the structure)18 (PDB ID: 4W5N). A close-up of nucleotides 2–5 of the seed region of the guide RNA, which are presented in a near A-form helix conformation and exposed to solvent, is shown on the right. d | Schematic of microRNA (miRNA)-mediated targeting. AGO2 binds to miRNA (orange and red) with the 5′ end bound in a pocket in the Mid domain and the 3′ end bound by the PAZ domain; nucleotides 2–6 of the seed sequence (coloured red) are presented as an A-form helix. Helix 7 of AGO2 (shown as a cylinder) inserts an Ile side chain (green bar) between nucleotides 6 and 7, breaking the helical configuration of the RNA and sterically hindering guide–target duplex formation (step 1). Base-pairing between the seed region and the complementary mRNA target coincides with movement of helix 7, which allows guide–target duplex formation. Helix 7 interacts with the minor groove of the duplex to monitor shape complementarity (step 2). The conformational change in AGO2 allows base-pairing between the seed region and the target, as well as potential 3′ supplementary base-pairing with the 3′ region of miRNA (step 3).

High-throughput sequencing of RNAs isolated by crosslinking immunoprecipitation (HITS-CLIP): A sequencing method based on the ultraviolet crosslinking of RNA–protein complexes that is used to identify RNA ligands of RNA-binding proteins.
Photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP): A sequencing method that uses photoactivatable ribonucleosides to crosslink RNA and protein to identify RNAs associated with a particular RNA-binding protein.

Insight into the dynamics of the target-search process, and support for the stepwise model of miRNA targeting, has come from single-molecule experiments; these studies monitored eukaryotic AGO loaded with a fluorescently labelled guide RNA that was bound to an immobilized, labelled RNA target8,28,51. Incorporation of the guide RNA (let-7a or miR-21) into an RNP greatly increases the rate constant with which it binds to the target, as compared with that of a naked guide oligonucleotide, to the extent that finding the target becomes limited by diffusion8,28,51. Consistent with the presentation of the seed region in an A-form helix that favours target binding, acceleration of target binding by AGO2 is dependent on complementarity between the seed sequence and the target8. The observation from the human AGO2 structures that only guide nucleotides 2–6 are initially presented in a helical configuration, for the primary interrogation of targets, is supported by the fact that different mismatches within the seed have different effects on guide–target binding; mismatches at nucleotides 2–5 of the guide reduce the association rate more than mismatches at positions 6–8 (REFS 8,18,28). This further indicates that the arrangement of 5′ seed nucleotides in a helical form facilitates the search for the target and that, when a potential sequence match has been found, the conformational change involving helix 7 enables the rest of the seed to be interrogated. Complementarity between the seed sequence and target also determines the rate at which AGO dissociates from the target, enabling the complex to discriminate between seed-matched and mismatched targets8. In the absence of a matched seed sequence, the RNP finds targets more slowly and binds to them less stably8. The interrogation of mRNAs for target binding sites is facilitated by a one-dimensional scanning mechanism that uses sub-seed interactions before stably binding to the mRNA28.

Genome-wide studies describing the miRNAs that are bound to AGO proteins on a global scale also support an important role for seed sequences in recognizing mRNA targets. For example, experiments in which AGO and miRNA were crosslinked by high-throughput sequencing of RNAs isolated by crosslinking immunoprecipitation (HITS-CLIP)52 and by photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP)53 revealed hundreds of AGO-binding sites in transcripts from the mouse neocortex and human embryonic kidney cells, respectively. These binding sites were enriched for sequences that were complementary to the seeds of the most abundant miRNAs in the relevant cell types. Transcripts containing direct binding sites were more likely to be regulated on overexpression of the complementary miRNA. However, in a minority of cases (27% for HITS-CLIP and 7% for PAR-CLIP), AGO-binding sites did not contain exact seed matches, suggesting that they may contain bulges or mismatches in the seed region. Based on shape complementarity between helix 7 in human AGO2 and the miRNA–mRNA duplex, it is expected that RISC would have a lower affinity for these miRNA–target RNA interactions18. However, modification of the


crosslinking approach to ligate miRNA–target RNA duplexes in several related approaches called crosslinking, ligation and sequencing of hybrids (CLASH), as well as the in vivo PAR-CLIP with ligation54 and covalent ligation of endogenous AGO-bound RNAs (CLEAR)-CLIP, suggests that non-canonical seed interactions involving G:U wobble pairs, mismatches and bulges are more frequent than canonical exact complementary seed pairings (which make up only 37% of seed interactions55) and hint at a greater involvement of the miRNA 3′ region in determining interactions than previously thought55,56. Seedless interactions were determined in 16% of cases55, although these were poorly conserved and weakly regulated by miRNAs. In general, although some non-seed interactions modestly contribute to target regulation54–56, high-throughput analyses indicate that many of the non-canonical sites bound by miRNAs do not result in the regulation of the target57.

Taken together, incorporation of siRNA and miRNA into an RNP complex with AGO enables the presentation of the seed sequence in a pre-organized A-form helix. This facilitates the RNA-mediated target-search process by favouring interactions with the target and dramatically alters the thermodynamic and kinetic properties of guide–target hybridization. Conformational changes within the protein and guide RNA on guide–target pairing provide a basis for specificity in gene regulation.

CRISPR–Cas-mediated defence
Many bacteria and most archaea contain an RNA-mediated silencing mechanism that provides adaptive immunity against bacteriophages and plasmids58. This mechanism uses CRISPR RNAs (crRNAs) generated from genomic CRISPR loci for the sequence-specific detection and elimination of invading viruses and plasmids. crRNA sequences originate from previous invading nucleic acids and act as guides for various CRISPR-associated Cas proteins, thus providing an inherited adaptive immunity against reinfections. CRISPR loci can be classified into five main systems, depending on their Cas protein composition59, but all contain a nuclease. Type I and type III systems use crRNA that is incorporated into a multisubunit complex to execute their interference activity. Type II CRISPR–Cas systems are one of the simplest, requiring only two RNAs, crRNA and transactivating crRNA (tracrRNA), and a single Cas protein, Cas9, to function as RNA-guided endonucleases (FIG. 3). A simplified chimeric single-guide RNA (sgRNA) (FIG. 3a) containing functionally important regions of crRNA and tracrRNA can be used to target any sequence to generate double-strand breaks, and it is used in various genome-editing approaches60,61. Below, we focus on the type II system owing to its simplicity and the amount of structural information available on its mechanism of RNA recognition.

Identifying foreign DNA using crRNA. CRISPR–Cas adaptive immunity consists of three phases: immunization, where new spacer sequences from invading nucleic acids are incorporated into the CRISPR array to function as a molecular memory of prior infections; expression, which involves the transcription and processing of crRNAs; and interference, whereby the crRNAs are used as guides to identify foreign nucleic acids and facilitate their cleavage and clearance by associated nucleases58. RNA plays a fundamental part in the expression and interference phases. Type II CRISPR–Cas systems transcribe their loci, which contain repeats and spacers, into a full-length pre-crRNA. The subsequent processing of this transcript by endogenous RNase III requires duplex formation between the anti-repeat sequence in the tracrRNA and a repeat sequence within the pre-crRNA62–64 (FIG. 1c). The resulting mature crRNA contains a 20-nucleotide spacer-derived guide sequence (with complementarity to foreign DNA) at the 5′ end and a 21-nucleotide repeat sequence that base-pairs to the 5′ end of the tracrRNA (FIG. 1c), which probably has a structural role and is required for target DNA binding and site-specific cleavage by Cas9 (REF. 62).

The Cas9–crRNA–tracrRNA ternary complex must identify invading DNA molecules to cause interference. Identifying targets is a stepwise process involving not only base-pairing between DNA and the crRNA sequence (the complementary site in the DNA is called the protospacer) but also recognition of a short sequence motif called the protospacer-adjacent motif (PAM) on the non-complementary strand of the DNA by Cas9. The PAM is required for Cas9 to recognize and stably bind to DNA (FIGS 1c,3c); its sequence varies between CRISPR–Cas systems but in Streptococcus pyogenes, for example, it is composed of a 5′-NGG sequence (where N is any nucleobase) immediately flanking the protospacer62.

Biochemical experiments revealed that mismatches introduced in the 3′ region of the guide sequence of crRNA prevent Cas9-mediated cleavage, leading to the proposal that there is a seed region within the guide that is essential for target recognition65 (FIG. 3a). The existence of a seed sequence is further supported by the requirement of a contiguous stretch of 13 complementary base-pairs between the 3′ region of the RNA guide and the complementary DNA target strand proximal to the PAM; by contrast, 6 contiguous mismatches are tolerated between complementary DNA and crRNA at the 5′ end of the guide. When the target DNA has been identified, it is cleaved by Cas9 using two domains that are homologous to the HNH endonuclease and RuvC endonuclease: the HNH domain cleaves the DNA strand complementary to the crRNA guide and the RuvC domain cleaves the non-complementary strand65.

Crosslinking, ligation and sequencing of hybrids (CLASH): A sequencing method used to identify RNAs and their targets by ligating them together when they are bound to a specific RNA-binding protein.
HNH domain: A nuclease domain within Cas9 that is related to McrA-like restriction endonucleases.
RuvC domain: A nuclease domain within Cas9 that is related to the RuvC endonuclease that cuts Holliday junctions during homologous recombination.

crRNA seed presentation. Insight into how the seed region of the guide RNA contributes to target DNA recognition and binding was first obtained from the crystal structure of Cas9 bound to sgRNA without a target17. Apo-Cas9 forms a bilobal structure with one lobe containing the HNH, RuvC and carboxy-terminal domains and the other comprising a large helical domain. Binding of the sgRNA induces conformational changes in the protein20; the C-terminal domain (which interacts with the PAM) forms a nucleic acid-binding groove that can accommodate the DNA duplex in which PAM is located (FIG. 3b). In this conformation, the sgRNA is bound in


[Figure 3 image: panels a–c. Labels recovered from the extraction: seed and guide RNA sequence (a); HNH, RuvC and CTD (b); protospacer, seed (A-form helix), PAM, target DNA, helical lobe, nuclease lobe, Cas9, sgRNA and steps 1–4 (c).]

Figure 3 | crRNA seed presentation. a | Secondary structure of the single-guide RNA (sgRNA) that is present in the Cas9 crystal structures in part b, which is composed of a CRISPR RNA (crRNA) and a transactivating crRNA (tracrRNA) that are fused together. The seed sequence is highlighted in red. Canonical and non-canonical base-pairing is depicted as bars and dots, respectively. b | Ribbon diagram of the crystal structure of the Streptococcus pyogenes Cas9–sgRNA binary complex (left) (REF. 17) (Protein Data Bank identifier: 4ZT0; see Further information). The guide RNA is shown in orange and red, where red depicts the seed sequence; for Cas9, the nuclease lobe contains the HNH endonuclease (green), the RuvC endonuclease (blue) and the carboxy-terminal domain (CTD; grey), and the helical recognition lobe is shown in light purple, yellow and cyan. A close-up of the presentation of the guide RNA by Cas9, in the positively charged channel between the two lobes of Cas9, is shown in the surface representation (right); nucleotides 19–20 of the guide are exposed to solvent. c | Schematic of DNA targeting by the Cas9–crRNA complex. Cas9 binds to the crRNA–tracrRNA complex; the crRNA seed sequence (nucleotides 11–20, shown in red) is presented in an A-form helix with nucleotides 19–20 exposed to the solvent (step 1). Target binding is initiated when Cas9 recognizes a protospacer-adjacent motif (PAM) sequence (yellow) on the non-complementary DNA strand (step 2). Strand separation is induced by interactions between Cas9 and the DNA duplex; the PAM proximal seed nucleotides (nucleotides 19–20) nucleate target binding and crRNA–DNA duplex formation (step 3); the DNA protospacer sequence (which is complementary to the crRNA) is shown in green. The propagation of helix formation enables Cas9 to acquire an active catalytic state (step 4). Parts a and c are modified from Jiang, F., Zhou, K., Ma, L., Gressel, S. & Doudna, J. A. A Cas9–guide RNA complex preorganized for target DNA recognition. Science 348, 1477–1481 (2015) (REF. 17). Reproduced with permission from AAAS.

a positively charged channel between the two lobes of Cas9 (REF. 17) (FIG. 3b). Of the 20 nucleotides within the guide, only the nucleotides located in the seed region towards the 3′ end of the crRNA (nucleotides 11–20) are visible within the narrow channel. This region forms an A-form helix, similar to that seen in AGO–guide structures, with nucleotides 11–13 and the PAM proximal nucleotides (that is, nucleotides 19–20) exposed to the solvent (REF. 17) (FIG. 3b,c). Similarly to the AGO structures, a kink is introduced into the guide RNA by the insertion of an amino acid (this time a Tyr) between nucleotides 15 and 16 of the seed; this is relieved on target binding, which causes the Tyr to rotate by 120° (REF. 17). This pre-organized helical conformation places the seed sequence in a configuration that is thermodynamically favourable for guide–target duplex formation.

Recognition of a functional target requires the initial detection of the PAM sequence by Cas9 and DNA strand separation to enable the formation of an R-loop between the crRNA guide and target DNA (REF. 30). The crystal structure of S. pyogenes Cas9 bound to sgRNA in the presence of target DNA that contains a canonical PAM implies that a multistep process exists for target recognition (REF. 22) (FIG. 3c). On PAM recognition, strand separation is induced by interactions between Cas9 and the minor groove of the PAM-containing DNA duplex, whereby a Glu and a Ser residue of Cas9 interact with the DNA phosphodiester group linking the PAM sequence and the protospacer. This interaction causes a rotation of the phosphate group and induces a kink that orients the target strand directly towards the guide RNA, which is already pre-ordered in a conformation favourable for duplex formation (REF. 22). Therefore, recognition of the PAM sequence and the A-helical arrangement of the seed sequence are coupled to conformational changes that enable Cas9 to recognize functional targets (REFS 19,20,66,67).
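The ordered checks described above (PAM first, then the two PAM-proximal seed nucleotides, then the rest of the seed, then the PAM-distal region) can be illustrated with a minimal Python sketch. It is a toy model, not a description of any published algorithm. Several things in it are assumptions that do not come from the text: the function name `interrogate`, the example guide sequence, the use of an NGG PAM (the motif generally associated with S. pyogenes Cas9), the treatment of the seed as requiring an exact match, and the use of a simple mismatch count in the PAM-distal region as a stand-in for the tolerance of 6 contiguous mismatches mentioned earlier.

```python
def interrogate(dna, guide_rna, seed_len=10, nucleation_len=2,
                max_distal_mismatches=6):
    """Toy stepwise target search; returns (protospacer_start, outcome) pairs.

    `dna` is the non-complementary strand read 5'->3'. A protospacer on this
    strand has the same sequence as the guide (T in place of U), so sequence
    identity here stands in for guide/target-strand base-pairing.
    All thresholds are illustrative assumptions, not measured values.
    """
    guide = guide_rna.upper().replace("U", "T")
    dna = dna.upper()
    n = len(guide)
    outcomes = []
    for pam_start in range(n, len(dna) - 2):
        # Check 1: PAM recognition (assumed NGG; any base, then GG).
        if dna[pam_start + 1:pam_start + 3] != "GG":
            continue
        proto = dna[pam_start - n:pam_start]
        # Check 2: the PAM-proximal nucleotides (19-20 of a 20-nt guide)
        # nucleate pairing, so a mismatch here rejects the site at once.
        if proto[-nucleation_len:] != guide[-nucleation_len:]:
            outcomes.append((pam_start - n, "rejected at PAM-proximal nucleotides"))
            continue
        # Check 3: the remainder of the seed (nucleotides 11-20) must pair.
        if proto[-seed_len:] != guide[-seed_len:]:
            outcomes.append((pam_start - n, "rejected in seed"))
            continue
        # Check 4: the PAM-distal region tolerates a limited number of mismatches.
        distal = sum(a != b for a, b in zip(proto[:-seed_len], guide[:-seed_len]))
        if distal > max_distal_mismatches:
            outcomes.append((pam_start - n, "rejected in PAM-distal region"))
        else:
            outcomes.append((pam_start - n, "bound"))
    return outcomes


if __name__ == "__main__":
    guide = "GACGCAUAAAGAUGAGACGC"  # hypothetical 20-nt guide
    site = "TT" + "GACGCATAAAGATGAGACGC" + "TGG" + "AA"
    print(interrogate(site, guide))
```

Ordering the checks this way means that most candidate sites are discarded after inspecting only a few nucleotides, which is the intuition behind using a short, pre-organized seed subregion for the initial interrogation.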


A stepwise model involving a seed sequence is supported by single-molecule studies using purified Cas9 in assays with tethered DNA coupled with total internal reflection microscopy (TIRFM) (REF. 30). In this model, Cas9 scans the target DNA via a three-dimensional collision mechanism and, on encountering a PAM sequence, interrogates the surrounding sequence. Sequence complementarity at the 3′ end of the guide sequence adjacent to the PAM is required for DNA binding, highlighting the importance of the seed sequence. Even a two-base-pair mismatch between crRNA and the DNA flanking the PAM abrogates binding, which is in accordance with structural data showing that these two seed nucleotides (nucleotides 19–20) are exposed to solvent and most likely form a nucleation site for base-pairing when strand separation has occurred during PAM recognition (REFS 17,30). The strict requirement for complementarity immediately adjacent to the PAM suggests that unwinding of the target duplex is initiated in its vicinity and that it progresses in a stepwise manner until a single turn of an A-form helix has formed (REF. 30). This is facilitated by recognition of the PAM, strand separation and presentation of a pre-organized A-form helical seed sequence (REF. 30). The role of the PAM in forming the R-loop is also supported by single-molecule DNA supercoiling experiments in which PAM mutations decrease the rate of R-loop formation (REF. 29).

Total internal reflection microscopy (TIRFM). A microscopy technique that uses an evanescent wave to specifically excite fluorophore-labelled molecules close to a surface.

The evolving picture of the CRISPR mechanism is that the presentation of the crRNA seed sequence in a pre-organized A-form helix within Cas9 places it in an optimal configuration for interaction with the target on identification of a PAM sequence. The initial interrogation involves two of the seed sequence nucleotides, which are exposed to the solvent, before conformational changes within the protein facilitate interactions with the remainder of the guide.

Hfq-dependent sRNA
Almost all bacteria contain sRNAs; these participate in post-transcriptional gene regulation to regulate the majority of cellular pathways, including stress-response pathways, and to control important behavioural phenotypes, such as virulence in many pathogens (REFS 68–70). Escherichia coli and Salmonella enterica, two enterobacterial model organisms in which bacterial RNA-mediated regulation has been extensively described, contain 200–300 sRNAs (REFS 71–74).

Bacterial sRNAs are more heterogeneous in size and structure than miRNAs and crRNAs, ranging from 50 to 200 nucleotides and containing complex secondary and tertiary structures (REF. 68). They can be directly transcribed from independent transcription units that are located in intergenic regions and function as primary transcripts containing a 5′ triphosphate group and a 3′ stem loop (FIG. 1d). In addition, they can be processed from longer transcripts or the 3′ UTRs of mRNAs, which results in sRNAs with a 5′ monophosphate group (REFS 68,75,95). The major class of bacterial sRNAs functions through base-pairing with multiple mRNA targets via short seed sequences that are present in unstructured regions of the RNA (FIG. 1e). The majority of these trans-encoded sRNAs in S. enterica and E. coli bind to the Sm/Lsm superfamily RNA chaperone protein Hfq, which is required for their intracellular stability and their ability to anneal to targets, and they are thus called Hfq-dependent sRNAs (REFS 12,76,77). Hfq-dependent sRNAs target mRNAs to affect their stability or translation. The binding of sRNAs to mRNAs near the Shine–Dalgarno sequence in their 5′ UTR regulates their access to ribosomes (REFS 68,69,78), and sRNAs negatively and positively influence mRNA stability by recruiting the major constitutive mRNA decay ribonuclease RNase E or inhibiting its cleavage activity, respectively (REFS 79–82).

Sm/Lsm superfamily. A large family of proteins present in all three domains of life and involved in RNA processing and degradation.

Shine–Dalgarno sequence. A 5–10-nucleotide sequence upstream of the initiation codon involved in defining where bacterial translation initiates.

Seed sequences in sRNA. Interactions between Hfq-dependent sRNA and mRNA usually involve partial complementarity between single-stranded regions (REF. 68). Evidence suggests that many sRNAs contain short seed sequences that disproportionately contribute to the search for, and interaction with, targets compared with other regions of the sRNA (REFS 83–87). For example, the presence of a seed region in a sRNA was first suggested for sugar transport-related sRNA (SgrS), an ~220-nucleotide-long sRNA that is induced during glucose-phosphate stress and binds to the glucose-specific phosphotransferase ptsG mRNA, which encodes a membrane component of the phosphoenolpyruvate phosphotransferase system (REF. 88). Of a 32-nucleotide region in SgrS that displays partial complementarity with the translation-initiation region of ptsG mRNA, only 6 nucleotides in its 3′ region are essential for its regulatory activity (REF. 83). The importance of this six-nucleotide region is further highlighted by the ability of the SgrS seed sequence to distinguish between two horizontally acquired virulence factor mRNAs, Salmonella outer protein D (sopD) and sopD2, at the level of a single hydrogen bond (REF. 89). Similarly, the ~80-nucleotide-long RybB sRNA, which is induced when the bacterial envelope is subject to stress, contains a highly conserved 5′ seed region of seven nucleotides that is responsible for the targeting of multiple outer membrane protein (omp) mRNAs in S. enterica and E. coli. The seed region was identified using independent genetic, molecular and biochemical approaches (REFS 84,85). The 5′ region of RybB that contains the seed sequence is sufficient to repress omp mRNAs during the envelope stress response even when grafted onto an unrelated RNA scaffold, demonstrating that the seed sequence is an important element in RNA-mediated silencing (REFS 84,90). Finally, the sRNA quorum regulatory RNA 4 (Qrr4), which is involved in quorum sensing in Vibrio bacteria, harbours a subset of nucleotides within the region predicted to base-pair with mRNA that are important for target silencing. Intriguingly, the identity of the crucial residues differs according to the identity of the target, providing a basis for discrimination (REF. 86).

The sizes of sRNAs and the relative location of the seed sequences within them are more heterogeneous than described for miRNAs and crRNAs (FIG. 1e). In addition, some Hfq-dependent sRNAs, such as Spot 42 or GcvB, which are associated with catabolite repression and are involved in amino acid transport and metabolism, respectively, use multiple or extended seed regions to recognize target mRNA (REFS 91–93). As in miRNAs,


the seed sequences in sRNAs are often highly conserved, further highlighting their functional importance (REF. 94). An analysis of seed-sequence location in 18 well-studied E. coli and S. enterica sRNAs suggests that, in around one-third of cases, seed sequences are found in the 5′ region of the sRNA (REF. 84). Furthermore, the seed region of a sRNA can be internally located within a primary transcript (either a longer sRNA or an mRNA) and become exposed at the 5′ end of a transcript on endonucleolytic cleavage. Examples of such processing include the ~120-nucleotide sRNA ArcZ (REF. 95), which is involved in the general stress response and is processed to an ~50-nucleotide-long sRNA (REF. 94), and CpxQ, which is liberated from the 3′ UTR of the cpxP mRNA (which encodes a chaperone protein) and is associated with the cellular response to inner membrane stress (REF. 96).

sRNA seed presentation. Hexameric Hfq contains at least three different binding surfaces for RNA. It typically binds sRNAs via U-rich sequences, which are often associated with RHO-independent transcription terminators, on its proximal face (REFS 21,97–99) and binds to mRNAs and some sRNAs via A-rich sequences on its distal face (REF. 100). A third binding surface is the rim of the Hfq hexamer, which also binds U-rich sequences (REFS 21,97,101–103). To fully understand the importance and presentation of the seed sequence by Hfq, high-resolution structural information is required. However, obtaining binary and ternary complexes containing a natural, full-length sRNA has proven challenging, probably owing to the large size and heterogeneous organization of sRNA. Nevertheless, a binary complex between Hfq and the sRNA RydC has been described, and provides the first glimpse of how a sRNA seed region is presented (REF. 21). RydC is a 65-nucleotide sRNA that folds into a complex pseudoknot tertiary structure (FIG. 4a). RydC binds with Hfq to the 5′ UTR of the cyclopropane fatty acid (CFA) synthase mRNA, stabilizing the transcript, increasing the level of CFA synthase and thus changing the fatty acid composition of membranes (REF. 82). The U-rich RHO-independent terminator that is located at the 3′ end of RydC binds in a recessed channel on the proximal face of the Hfq hexamer (REF. 21) (FIG. 4b). The sRNA exits the channel at the rim, where it forms extensive RNA–protein interactions. The 5′ seed region of sRNA, which spans nucleotides 2–12, binds to the rim of the Hfq hexamer, with the 3′ region of the seed (nucleotides 8–12) in an extended conformation as a result of its interaction with residues 8–17 in Hfq (FIG. 4b,c). Notably, this interaction involves Arg16 and Arg17, which were previously shown to interact with RNA and which are essential for duplex formation and for the strand exchange activity of Hfq that accelerates dynamic base-pair formation (REF. 103). This is also consistent with chemical footprinting and small-angle X-ray scattering (SAXS) data, which show that Hfq interacts with several sites in an mRNA to fold it into a compact tertiary structure, meanwhile placing the sRNA target site at the Hfq rim in the vicinity of Arg16 and Arg17 (REF. 102). Unfortunately, the remaining 5′ portion of the RydC seed sequence is not visible in the crystal structure, potentially owing to structural heterogeneity and dynamics (REF. 21). However, this portion of the seed could form a duplex with the target-recognition site in the mRNA while the sRNA is bound to the rim of Hfq. Indeed, it has been proposed that this interaction can be mimicked by the packing of a duplex region on the rim in the RydC–Hfq crystal lattice. As the sRNA interacts predominately with the proximal face of Hfq and mRNAs interact predominately with the distal face of Hfq, a model can be proposed in which the sRNA and mRNA are held in proximity and the sRNA–Hfq complex can scan the mRNA until the target site base-pairs with the seed and engages in the formation of a short duplex on the rim of Hfq (REFS 21,103) (FIG. 4c).

Chemical footprinting. Methods used to map RNA secondary and tertiary structures based on the accessibility of nucleotides to specific chemicals.

Small-angle X-ray scattering (SAXS). An X-ray scattering method that provides information on the size and shape of biological molecules in solution.

Recent global in vivo crosslinking datasets describing the RNA ligands that associate with Hfq provide some insight into how sRNA and mRNA bind to the chaperone (REFS 87,104,105). In S. enterica, Hfq binds to more than 550 mRNAs, mainly in the 5′ UTR and 3′ UTR, and although Hfq binds throughout different regions of ~90 sRNAs that have been assessed, it preferentially associates with the 3′ end of sRNA (REF. 104). Analysis of sRNA–mRNA pairs suggests that Hfq binds to mRNA upstream of its sRNA interaction site and that it binds on the 3′ side of the seed sequence in sRNA, consistent with the model that mRNA and sRNA bind to opposite faces of Hfq before duplex formation occurs. Although this model assumes that seed regions located at the 5′ end of sRNA are presented on the rim of Hfq (FIG. 4c), it is unclear whether sRNAs with seed regions located internally or towards the 3′ end would present them in the same manner. However, in some cases, cleavage of full-length sRNAs places the seed sequence at the 5′ end of the processed RNA, which may facilitate seed presentation (REFS 94,106).

In short, sRNAs seem to bind to the proximal face of Hfq with their seed sequence presented in an extended conformation at the rim, close to the site that has been implicated in the strand-annealing activity of the chaperone. Hfq probably stabilizes the structure of the sRNA, ensuring that the single-stranded seed sequence is available to interrogate mRNA targets that are recruited to the distal face of the chaperone.

Common principles of seed presentation
The incorporation of RNA guides into RNPs to ensure the recognition of functional targets has evolved in eukaryotic, bacterial and archaeal systems for post-transcriptional regulation and adaptive immunity (REFS 17,18,21,23–26,48). Structural and mechanistic studies are beginning to reveal common features that are used by the different systems to impose specificity. These can be grouped together into two general principles: first, the presentation of the seed sequence in a conformation that facilitates the search for, and interaction with, target nucleic acids, and second, the coupling of target recognition and conformational changes within the RNP to ensure that the correct RNA or DNA is regulated (FIG. 5).

The guide RNA forms intimate interactions with the protein, exposing only a small subregion of the sequence to interrogate putative targets and placing the target near to the functionally active region of the protein. In miRNA, siRNA, crRNA and sRNA, the seed sequence is most important in determining target binding,


[Figure 4 image: panels a–c. Labels recovered from the extraction: seed on the RydC secondary structure (a); Arg16, Arg17, seed nucleotides 9–12, Hfq proximal face and Hfq distal face (b); sRNA, mRNA, seed, Hfq hexamer and steps 1–3 (c).]

Figure 4 | Hfq-dependent sRNA seed presentation. a | Secondary structure of the RydC small RNA (sRNA) from Salmonella enterica. The seed region (highlighted in red) is located at nucleotides 2–12. b | Crystal structure of RydC (orange cartoon with bases depicted as sticks and the seed sequence shown in red) bound to the proximal faces of two neighbouring Hfq hexamers (protomers in cyan and blue) (left; Protein Data Bank identifier: 4v2s; see Further information) (REF. 21). In the crystal lattice, the RNA bridges two adjacent Hfq hexamers; the structure probably encompasses the full set of interactions that would form in a 1:1 RydC–Hfq complex, which is the biologically relevant species in the cell. A close-up of the presentation of the 3′ region of the RydC seed sequence (stick representation, coloured red) in an extended conformation on the rim of the Hfq hexamer is shown on the right. c | Schematic depicting the targeting of mRNA by Hfq–sRNA. sRNA binds to the proximal face of the Hfq hexamer, with the 3′ region of sRNA bound in a recessed channel and nucleotides 8–12 of the 5′ seed sequence presented in an extended conformation at the rim of the hexamer, proximal to amino acids that are involved in the annealing activity of Hfq (step 1). mRNA binds to the distal face of the Hfq hexamer and is held by Hfq in proximity to the sRNA to enable it to scan for complementary seed sequences. A sRNA–mRNA duplex probably forms at the rim of the Hfq hexamer (step 2). On duplex formation, the sRNA and mRNA probably cycle off Hfq (step 3), which makes it available to bind to another sRNA. Part a is modified with permission from REF. 21.

even though its size and location within the RNA differs (FIG. 1e). For 21-nucleotide miRNA and siRNAs, nucleotides 2–8 make the largest contribution to the energy of target binding, with mismatches at the centre of the seed decreasing binding the most (REFS 8,28,43–45). In the 20-nucleotide-long guide sequence of crRNA, 10 nucleotides in the seed region towards the 3′ end are required for target binding and cleavage; a 2-nucleotide seed subregion within this is essential for annealing to DNA next to the PAM sequence and for target recognition (REFS 30,65). In SgrS, from a predicted 30-nucleotide region of partial complementarity, 6 nucleotides towards the 3′ end are essential for silencing (REFS 83–85).

A common principle emerging from studying miRNA-, siRNA-, crRNA- and sRNA-mediated target recognition is that the protein defines the trajectory and conformation of the guide RNA and exposes the seed sequence in a thermodynamically favourable configuration for interaction with putative targets. Presenting predominantly the seed sequence decreases the formation of non-productive intramolecular and intermolecular interactions that could arise owing to the inherently low information content of RNA bases, thereby increasing specificity.

Importantly, association of the guide RNA with a cognate protein pre-organizes the seed sequence in a conformation that favours duplex formation (FIG. 5). This is particularly evident for guide RNAs bound by AGO and Cas9, for which a subregion of the seed sequence is pre-organized in a near A-helix conformation (REFS 17,18,21,23–26,48) (FIG. 5a,b). Pre-immobilization of the guide strand by the protein decreases the loss in conformational flexibility on duplex formation (REF. 50). This pre-payment of the entropic cost associated with duplex formation enhances the affinity of the guide strand for the target when compared with that of a naked oligonucleotide (REFS 8,50). Presentation of the seed in an A-form helix also probably decreases the rate-limiting barrier to duplex formation, which is the formation of an initial nucleus of base pairs before rapid helix propagation. In agreement with this,


AGO proteins increase the rate of target binding up to 250-fold compared with that of naked oligonucleotides (REFS 8,28). This is dependent on sequence complementarity between seed nucleotides of the guide (which are presented in an A-helix conformation) and the target (REFS 8,28).

The seed sequence also plays a direct part in the mode by which protein complexes search for target sites on a nucleic acid. AGO (REF. 28) and Cas9 (REF. 30) use a lateral diffusion process and a three-dimensional collision mechanism to identify their targets, respectively. AGO uses the seed region to scan for matches to nucleotides 2–4 of the seed before converting transient interactions into more stable ones encompassing the entire seed sequence (that is, nucleotides 2–8) (REF. 28). Cas9 scans for targets by searching for a short PAM sequence, which initiates strand separation and interrogation of the seed sequence that is also presented in an A-helix form (REF. 30).

[Figure 5 image: panels a–c, each with a "seed presentation" column and a "conformational change checkpoints" column. a, AGO with miRNA and mRNA target; pre-organized seed nt 2–7 as an A-form helix. b, Cas9 with crRNA, PAM and foreign DNA target; pre-organized seed nt 11–20 as an A-form helix. c, Hfq with sRNA and mRNA target; pre-organized seed nt 2–12, with nt 8–12 in an extended conformation; the conformational change is marked with a question mark.]

Figure 5 | Common principles of the target-search process. Target-search processes commonly involve presentation of the seed and conformational changes en route to target identification. Using a subregion of an RNA to initially interrogate targets produces a stepwise RNA–protein binding mechanism in which RNA–target interactions are coupled to a series of conformational changes in the ribonucleoprotein to ensure the identification of the correct target and commitment to the desired biochemical outcome. a | The microRNA (miRNA) seed sequence is presented in a near A-form helix conformation within the miRNA-induced silencing complex (miRISC) and is used to interrogate potential target mRNAs. On mRNA binding, conformational changes in Argonaute (AGO) expose further regions of the miRNA for interaction with the target and also check the extent of complementarity between miRNA and mRNA in a duplex. A similar principle applies to small interfering RNAs (siRNAs), which are not depicted here. b | The CRISPR RNA (crRNA) seed sequence is presented in a near A-form helix conformation within Cas9. Cas9 scans DNA for the protospacer-adjacent motif (PAM) sequence before undergoing conformational changes that induce strand separation and allow it to interrogate the flanking DNA sequence for complementarity to seed nucleotides (nt) 19–20; this is followed by the unwinding and binding of the remaining seed sequence. c | Small RNA (sRNA) bound to the proximal face of Hfq presents the seed sequence at the rim, which places it in an optimal position to interrogate a target mRNA that is bound to the distal face of Hfq and to stimulate sRNA–target mRNA duplex formation.

The presentation of a sRNA seed sequence follows similar principles; binding to Hfq constrains the 3′ end of the seed sequence in an extended single-stranded conformation that is required for target interrogation (REF. 21) (FIG. 5c). Although the conformation of the remaining seed sequence has not been resolved, it is constrained to the region of Hfq that has been implicated in stimulating target annealing (REF. 103). In the absence of structural information on a ternary complex, the proposed model suggests that binding of the sRNA to the proximal face of Hfq and presentation of the seed at the rim places it in an optimal position to interrogate a target mRNA that is bound to the distal face of Hfq and to stimulate sRNA–target mRNA duplex formation. Similarly, RISC exhibits annealing activity that increases the efficiency of nucleic acid hybridization, and this also involves the seed sequence (REF. 44). Single-molecule studies of SgrS bound to Hfq suggest that the target-search process is the rate-limiting step during sRNA-mediated regulation, but the role of the seed sequence remains to be investigated (REF. 107).

Presentation of a seed sequence in a conformation that facilitates binding seems to be a general principle and is not restricted to the RNA-based systems described in this Review. For example, the type I CRISPR–Cas system in E. coli, which involves the multiprotein complex CRISPR-associated complex for antiviral defence (Cascade), organizes a 61-nucleotide crRNA into several consecutive 5-nucleotide-long segments. These segments are presented in an A-form conformation, which includes nucleotides 1–5 and nucleotides 7–8; both of these nucleotide sets have been experimentally shown to function as seeds during DNA targeting (REFS 108–111).

A common feature of seed sequences in these different systems is that they are in the range of 6–10 nucleotides in length. This means that they can reside within a single turn of an A-form helix, which contains ~10–11 bases; making the seed sequences longer would not be beneficial as the additional nucleotides would not be suitably orientated for base-pairing with the target. In addition, the discriminatory effect of single mismatches would be diminished in longer seed sequences, risking off-target effects, and having fewer nucleotides in a seed would compromise affinity and specificity. A second common principle of using a subregion of an RNA to interrogate targets is the linking of RNA–target interactions with a series of conformational changes in the RNP to ensure the identification of the correct target and commitment to the desired biochemical outcome (FIG. 5). For example, for miRNA, siRNA and crRNA, a match between the seed subregion and a potential target is required before subsequent interactions in the guide and target are possible. The use of a small number of nucleotides to initially scan for a functional target gives rise to a multistep binding mechanism, which provides various opportunities to check the validity of the target and thus minimise off-target effects. This suggests that RNPs may use several conformational checkpoints to further interrogate targets before placing the RNP in an active conformation. For example, the binding of T. thermophilus Ago to the target RNA involves nucleotides 2–6 of the seed in the guide strand; this binding induces a conformational


change in T. thermophilus Ago that relieves the kink in the guide RNA to allow base-pairing with the target RNA beyond the seed and the formation of catalytically active T. thermophilus Ago. Likewise, in human AGO2, nucleotides 2–5 in the seed region of miRNA initially scan putative targets; a correct match with the seed induces a conformational change in the protein that allows the rest of the seed sequence to interrogate targets. This opens up the remainder of the guide towards the 3′ end for supplementary base-pairing. Importantly, the movement of helix 7 enables the protein to sense the shape of the minor groove to ensure that a complementary duplex has been formed (FIG. 5a). In the CRISPR–Cas system, Cas9 scans DNA for the PAM sequence before inducing strand separation and thus the interrogation of the DNA sequence by seed nucleotides 19–20; this is followed by a conformational change that relieves a kink in the guide and allows binding of the remaining seed sequence to target DNA (FIG. 5b). By contrast, we currently do not know whether Hfq undergoes a conformational change to gain its RNA chaperone activity, and thus whether it is subject to the same conformational control steps as AGO proteins and Cas9 (FIG. 5c). However, a difference between the Hfq system and the siRNA and crRNA-containing RNPs is that its interaction with its RNA ligand is more dynamic, as sRNAs bind to Hfq through an active kinetic cycling process to ensure a rapid cellular response (REF. 112).

In short, it seems that using a small subregion of the seed for the interrogation of targets provides a multistep binding mechanism in which the RNP introduces multiple checkpoints to ensure the correct target is recognized.

RNA-binding protein (REF. 8). Interestingly, these new insights into seed presentation by RNPs highlight common principles that are involved in some RNA–RNA interactions. Of note, earlier work on codon–anticodon interactions and on antisense RNAs involved in the control of plasmid replication revealed that these RNAs interact through loop–loop interactions in the absence of proteins. These loops contain U-turn motifs, which also present flanking nucleotides in a pre-formed A-helix conformation to enhance the search for complementary sequences in their target RNAs (REFS 10,11,113,114).

A combination of structural, biochemical, biophysical and genome-wide approaches has provided extensive insights into how short RNA sequences can identify targets in the cell. This has revealed shared principles, which are especially clear for the miRNA, siRNA and crRNA systems. However, many open questions remain, including how are non-canonical RNA–target interactions specified, how do RNPs find their targets in the complex cellular environment and how does Hfq facilitate base-pairing between sRNAs and mRNAs? The use of short RNA guides has been described for several other systems, including the C-rich apical loop sequences in Staphylococcus aureus sRNAs (REF. 115) and also external guide sequences in RNase P-mediated regulation of gene expression (REF. 116). It will be interesting to see whether the principles discussed in this article are relevant to these modes of regulation as well as to new RNA-based regulatory systems that are emerging from the relatively unexplored bacterial and archaeal world, for example new CRISPR–Cas proteins and global sRNA-binding proteins, such as ProQ (REFS 117,118). Understanding how systems have evolved to identify specific functional targets
Conclusions and future perspectives has great implications for the application of RNA-based
In RNA-based silencing systems that mostly identify systems in biotechnology and therapeutics, such as the
functional targets by seed pairing, incorporation of the generation of specific RNA-based circuits for synthetic
guide within an RNP modifies its properties to avoid biology, the use of miRNAs or antisense inhibitors to
the kinetic and thermodynamic limitations of nucleic treat various non-communicable and infectious diseases,
acid hybridization, ensuring it behaves more like an and the improvement of tools for genome editing.
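
The stepwise interrogation described above for human AGO2 (a sub-seed scan with nucleotides 2–5, followed by the rest of the seed) can be illustrated with a toy model. The following Python sketch is only a conceptual illustration, not a description of any published algorithm: the function names, the example guide sequence and the choice of nucleotides 2–8 as the full seed are assumptions made for this example, and only strict Watson–Crick pairs are counted (no G–U wobble, no energetics, no protein conformational states).

```python
# Toy model of checkpointed seed interrogation (illustrative assumptions only).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

SUB_SEED = (2, 5)   # from the text: nucleotides 2-5 scan putative targets first
FULL_SEED = (2, 8)  # assumption for this sketch: the seed spans nucleotides 2-8


def perfect_site(guide):
    """Hypothetical helper: target written so that position i faces guide position i."""
    return "".join(COMPLEMENT[nt] for nt in guide)


def region_pairs(guide, target_aligned, first, last):
    """True if guide positions first..last (1-based, inclusive) all form
    Watson-Crick pairs with the aligned target."""
    return all(
        COMPLEMENT.get(guide[i - 1]) == target_aligned[i - 1]
        for i in range(first, last + 1)
    )


def interrogate(guide, target_aligned):
    """Return the outcome of two sequential checkpoints: sub-seed, then full seed."""
    if len(target_aligned) < len(guide):
        raise ValueError("aligned target must cover the whole guide")
    if not region_pairs(guide, target_aligned, *SUB_SEED):
        return "rejected at sub-seed"
    if not region_pairs(guide, target_aligned, *FULL_SEED):
        return "rejected at full seed"
    return "seed matched"


guide = "UACGGAUCAGCU"                       # made-up 12-nt guide, not a real miRNA
site = perfect_site(guide)
print(interrogate(guide, site))                          # seed matched
print(interrogate(guide, site[:2] + "A" + site[3:]))     # rejected at sub-seed
print(interrogate(guide, site[:6] + "C" + site[7:]))     # rejected at full seed
```

In this sketch, a mismatch at guide position 3 is rejected at the first checkpoint without the rest of the seed being examined, whereas a mismatch at position 7 is detected only at the second checkpoint; this mirrors, in a very simplified way, the idea that a small subregion of the seed screens candidate targets before further pairing is attempted.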

1. Cech, T. R. & Steitz, J. A. The noncoding RNA revolution – trashing old rules to forge new ones. Cell 157, 77–94 (2014).
2. Levine, E. & Hwa, T. Small RNAs establish gene expression thresholds. Curr. Opin. Microbiol. 11, 574–579 (2008).
3. Massé, E., Escorcia, F. E. & Gottesman, S. Coupled degradation of a small regulatory RNA and its mRNA targets in Escherichia coli. Genes Dev. 17, 2374–2383 (2003).
4. Bartel, D. P. MicroRNAs: Target recognition and regulatory functions. Cell 136, 215–233 (2009).
5. Herschlag, D. RNA chaperones and the RNA folding problem. J. Biol. Chem. 270, 20871–20874 (1995).
6. Eguchi, Y., Itoh, T. & Tomizawa, J. Antisense RNA. Annu. Rev. Biochem. 60, 631–652 (1991).
7. Zeiler, B. N. & Simons, R. W. in RNA Structure and Function Vol. 35, 437–464 (Cold Spring Harbor Laboratory Press, 1998).
8. Salomon, W. E., Jolly, S. M., Moore, M. J., Zamore, P. D. & Serebrov, V. Single-molecule imaging reveals that argonaute reshapes the binding properties of its nucleic acid guides. Cell 162, 84–95 (2015).
9. Kunne, T., Swarts, D. C. & Brouns, S. J. Planting the seed: Target recognition of short guide RNAs. Trends Microbiol. 22, 74–83 (2014).
10. Quigley, G. J. & Rich, A. Structural domains of transfer RNA molecules. Science 194, 796–806 (1976).
11. Wagner, E. G., Altuvia, S. & Romby, P. Antisense RNAs in bacteria and their genetic elements. Adv. Genet. 46, 361–398 (2002).
12. Updegrove, T. B., Zhang, A. & Storz, G. Hfq: The flexible RNA matchmaker. Curr. Opin. Microbiol. 30, 133–138 (2016).
13. Jiang, F. & Doudna, J. A. The structural biology of CRISPR–Cas systems. Curr. Opin. Struct. Biol. 30, 100–111 (2015).
14. van der Oost, J., Westra, E. R., Jackson, R. N. & Wiedenheft, B. Unravelling the structural and mechanistic basis of CRISPR–Cas systems. Nat. Rev. Microbiol. 12, 479–492 (2014).
15. Swarts, D. C. et al. The evolutionary journey of Argonaute proteins. Nat. Struct. Mol. Biol. 21, 743–753 (2014).
16. Meister, G. Argonaute proteins: Functional insights and emerging roles. Nat. Rev. Genet. 14, 447–459 (2013).
17. Jiang, F., Zhou, K., Ma, L., Gressel, S. & Doudna, J. A. A Cas9–guide RNA complex preorganized for target DNA recognition. Science 348, 1477–1481 (2015).
Describes the crystal structure of Cas9 and reveals that the crRNA seed sequence is presented in an A-form helix configuration.
18. Schirle, N. T., Sheu-Gruttadauria, J. & MacRae, I. J. Structural basis for microRNA targeting. Science 346, 608–613 (2014).
19. Nishimasu, H. et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156, 935–949 (2014).
20. Jinek, M. et al. Structures of Cas9 endonucleases reveal RNA-mediated conformational activation. Science 343, 1247997 (2014).
21. Dimastrogiovanni, D. et al. Recognition of the small regulatory RNA RydC by the bacterial Hfq protein. eLife 3, e05375 (2014).
Provides the crystal structure of Hfq bound to a full-length sRNA, with the seed sequence presented in an extended conformation.
22. Anders, C., Niewoehner, O., Duerst, A. & Jinek, M. Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature 513, 569–573 (2014).
23. Schirle, N. T. & MacRae, I. J. The crystal structure of human Argonaute2. Science 336, 1037–1040 (2012).
24. Nakanishi, K., Weinberg, D. E., Bartel, D. P. & Patel, D. J. Structure of yeast Argonaute with guide RNA. Nature 486, 368–374 (2012).
25. Elkayam, E. et al. The structure of human argonaute-2 in complex with miR-20a. Cell 150, 100–110 (2012).
References 18, 24 and 25 report the first crystal structures of eukaryotic AGO proteins bound to RNA guides, with the seed sequence presented as an A-form helix.

26. Wang, Y. et al. Nucleation, propagation and cleavage of target RNAs in Ago silencing complexes. Nature 461, 754–761 (2009).
27. Wang, Y. et al. Structure of an argonaute silencing complex with a seed-containing guide DNA and target RNA duplex. Nature 456, 921–926 (2008).
28. Chandradoss, S. D., Schirle, N. T., Szczepaniak, M., MacRae, I. J. & Joo, C. A dynamic search process underlies microRNA targeting. Cell 162, 96–107 (2015).
References 8 and 28 describe elegant single-molecule studies that provide evidence for the importance of the miRNA seed sequence in the search for targets.
29. Szczelkun, M. D. et al. Direct observation of R-loop formation by single RNA-guided Cas9 and Cascade effector complexes. Proc. Natl Acad. Sci. USA 111, 9798–9803 (2014).
30. Sternberg, S. H., Redding, S., Jinek, M., Greene, E. C. & Doudna, J. A. DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62–67 (2014).
Reports a single-molecule study describing the stepwise interrogation of DNA targets by Cas9 and evidence for the importance of the seed sequence.
31. Fromm, B. et al. A uniform system for the annotation of vertebrate microRNA genes and the evolution of the human microRNAome. Annu. Rev. Genet. 49, 213–242 (2015).
32. Chiang, H. R. et al. Mammalian microRNAs: Experimental evaluation of novel and previously annotated genes. Genes Dev. 24, 992–1009 (2010).
33. Friedman, R. C., Farh, K. K., Burge, C. B. & Bartel, D. P. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 19, 92–105 (2009).
34. Czech, B. & Hannon, G. J. Small RNA sorting: Matchmaking for Argonautes. Nat. Rev. Genet. 12, 19–31 (2011).
35. Kuhn, C. D. & Joshua-Tor, L. Eukaryotic Argonautes come into focus. Trends Biochem. Sci. 38, 263–271 (2013).
36. Kim, V. N., Han, J. & Siomi, M. C. Biogenesis of small RNAs in animals. Nat. Rev. Mol. Cell Biol. 10, 126–139 (2009).
37. Jonas, S. & Izaurralde, E. Towards a molecular understanding of microRNA-mediated gene silencing. Nat. Rev. Genet. 16, 421–433 (2015).
38. Lai, E. C. Micro RNAs are complementary to 3′ UTR sequence motifs that mediate negative post-transcriptional regulation. Nat. Genet. 30, 363–364 (2002).
39. Lewis, B. P., Shih, I. H., Jones-Rhoades, M. W., Bartel, D. P. & Burge, C. B. Prediction of mammalian microRNA targets. Cell 115, 787–798 (2003).
40. Brennecke, J., Stark, A., Russell, R. B. & Cohen, S. M. Principles of microRNA-target recognition. PLoS Biol. 3, e85 (2005).
41. Lewis, B. P., Burge, C. B. & Bartel, D. P. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15–20 (2005).
42. Lim, L. P. et al. Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433, 769–773 (2005).
References 39–42 provide the first evidence for the importance of the miRNA seed sequence in mRNA targeting.
43. Wee, L. M., Flores-Jasso, C. F., Salomon, W. E. & Zamore, P. D. Argonaute divides its RNA guide into domains with distinct functions and RNA-binding properties. Cell 151, 1055–1067 (2012).
44. Ameres, S. L., Martinez, J. & Schroeder, R. Molecular basis for target RNA recognition and cleavage by human RISC. Cell 130, 101–112 (2007).
45. Haley, B. & Zamore, P. D. Kinetic analysis of the RNAi enzyme complex. Nat. Struct. Mol. Biol. 11, 599–606 (2004).
46. Doench, J. G. & Sharp, P. A. Specificity of microRNA target selection in translational repression. Genes Dev. 18, 504–511 (2004).
47. Grimson, A. et al. MicroRNA targeting specificity in mammals: Determinants beyond seed pairing. Mol. Cell 27, 91–105 (2007).
48. Wang, Y., Sheng, G., Juranek, S., Tuschl, T. & Patel, D. J. Structure of the guide-strand-containing argonaute silencing complex. Nature 456, 209–213 (2008).
The first structural report describing the presentation of a guide seed sequence in an A-form helix conformation by a bacterial Ago protein.
49. Mallory, A. C. et al. MicroRNA control of PHABULOSA in leaf development: Importance of pairing to the microRNA 5′ region. EMBO J. 23, 3356–3364 (2004).
50. Parker, J. S., Parizotto, E. A., Wang, M., Roe, S. M. & Barford, D. Enhancement of the seed-target recognition step in RNA silencing by a PIWI/MID domain protein. Mol. Cell 33, 204–214 (2009).
51. Jo, M. H. et al. Human Argonaute 2 has diverse reaction pathways on target RNAs. Mol. Cell 59, 117–124 (2015).
52. Chi, S. W., Zang, J. B., Mele, A. & Darnell, R. B. Argonaute HITS-CLIP decodes microRNA–mRNA interaction maps. Nature 460, 479–486 (2009).
53. Hafner, M. et al. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell 141, 129–141 (2010).
54. Grosswendt, S. et al. Unambiguous identification of miRNA:target site interactions by different types of ligation reactions. Mol. Cell 54, 1042–1054 (2014).
55. Helwak, A., Kudla, G., Dudnakova, T. & Tollervey, D. Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding. Cell 153, 654–665 (2013).
56. Moore, M. J. et al. miRNA-target chimeras reveal miRNA 3′-end pairing as a major determinant of Argonaute target specificity. Nat. Commun. 6, 8864 (2015).
57. Agarwal, V., Bell, G. W., Nam, J. W. & Bartel, D. P. Predicting effective microRNA target sites in mammalian mRNAs. eLife 4, e05005 (2015).
58. Marraffini, L. A. CRISPR–Cas immunity in prokaryotes. Nature 526, 55–61 (2015).
59. Makarova, K. S. et al. An updated evolutionary classification of CRISPR–Cas systems. Nat. Rev. Microbiol. 13, 722–736 (2015).
60. Hsu, P. D., Lander, E. S. & Zhang, F. Development and applications of CRISPR–Cas9 for genome engineering. Cell 157, 1262–1278 (2014).
61. Doudna, J. A. & Charpentier, E. Genome editing. The new frontier of genome engineering with CRISPR–Cas9. Science 346, 1258096 (2014).
62. Deltcheva, E. et al. CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III. Nature 471, 602–607 (2011).
63. Dugar, G. et al. High-resolution transcriptome maps reveal strain-specific regulatory features of multiple Campylobacter jejuni isolates. PLoS Genet. 9, e1003495 (2013).
64. Zhang, Y. et al. Processing-independent CRISPR RNAs limit natural transformation in Neisseria meningitidis. Mol. Cell 50, 488–503 (2013).
65. Jinek, M. et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821 (2012).
Provides key biochemical evidence for the presence of a seed sequence in a Cas9-containing CRISPR–Cas system.
66. Jiang, F. et al. Structures of a CRISPR–Cas9 R-loop complex primed for DNA cleavage. Science 351, 867–871 (2016).
67. Nishimasu, H. et al. Crystal structure of Staphylococcus aureus Cas9. Cell 162, 1113–1126 (2015).
68. Storz, G., Vogel, J. & Wassarman, K. M. Regulation by small RNAs in bacteria: expanding frontiers. Mol. Cell 43, 880–891 (2011).
69. Gottesman, S. & Storz, G. Bacterial small RNA regulators: Versatile roles and rapidly evolving variations. Cold Spring Harb. Perspect. Biol. 3, a003798 (2011).
70. Wagner, E. G. & Romby, P. Small RNAs in bacteria and archaea: Who they are, what they do, and how they do it. Adv. Genet. 90, 133–208 (2015).
71. Westermann, A. J. et al. Dual RNA-seq unveils noncoding RNA functions in host–pathogen interactions. Nature 529, 496–501 (2016).
72. Thomason, M. K. et al. Global transcriptional start site mapping using differential RNA sequencing reveals novel antisense RNAs in Escherichia coli. J. Bacteriol. 197, 18–28 (2015).
73. Peer, A. & Margalit, H. Evolutionary patterns of Escherichia coli small RNAs and their regulatory interactions. RNA 20, 994–1003 (2014).
74. Kroger, C. et al. An infection-relevant transcriptomic compendium for Salmonella enterica Serovar Typhimurium. Cell Host Microbe 14, 683–695 (2013).
75. Miyakoshi, M., Chao, Y. & Vogel, J. Regulatory small RNAs from the 3′ regions of bacterial mRNAs. Curr. Opin. Microbiol. 24, 132–139 (2015).
76. Vogel, J. & Luisi, B. F. Hfq and its constellation of RNA. Nat. Rev. Microbiol. 9, 578–589 (2011).
77. De Lay, N., Schu, D. J. & Gottesman, S. Bacterial small RNA-based negative regulation: Hfq and its accomplices. J. Biol. Chem. 288, 7996–8003 (2013).
78. Papenfort, K. & Vanderpool, C. K. Target activation by regulatory RNAs in bacteria. FEMS Microbiol. Rev. 39, 362–378 (2015).
79. Hui, M. P., Foley, P. L. & Belasco, J. G. Messenger RNA degradation in bacterial cells. Annu. Rev. Genet. 48, 537–559 (2014).
80. Papenfort, K., Sun, Y., Miyakoshi, M., Vanderpool, C. K. & Vogel, J. Small RNA-mediated activation of sugar phosphatase mRNA regulates glucose homeostasis. Cell 153, 426–437 (2013).
81. Lalaouna, D., Simoneau-Roy, M., Lafontaine, D. & Massé, E. Regulatory RNAs and target mRNA decay in prokaryotes. Biochim. Biophys. Acta 1829, 742–747 (2013).
82. Fröhlich, K. S., Papenfort, K., Fekete, A. & Vogel, J. A small RNA activates CFA synthase by isoform-specific mRNA stabilization. EMBO J. 32, 2963–2979 (2013).
83. Kawamoto, H., Koide, Y., Morita, T. & Aiba, H. Base-pairing requirement for RNA silencing by a bacterial small RNA and acceleration of duplex formation by Hfq. Mol. Microbiol. 61, 1013–1022 (2006).
84. Papenfort, K., Bouvier, M., Mika, F., Sharma, C. M. & Vogel, J. Evidence for an autonomous 5′ target recognition domain in an Hfq-associated small RNA. Proc. Natl Acad. Sci. USA 107, 20435–20440 (2010).
85. Balbontin, R., Fiorini, F., Figueroa-Bossi, N., Casadesus, J. & Bossi, L. Recognition of heptameric seed sequence underlies multi-target regulation by RybB small RNA in Salmonella enterica. Mol. Microbiol. 78, 380–394 (2010).
References 83–85 describe the identification of seed sequences in bacterial sRNAs.
86. Rutherford, S. T., Valastyan, J. S., Taillefumier, T., Wingreen, N. S. & Bassler, B. L. Comprehensive analysis reveals how single nucleotides contribute to noncoding RNA function in bacterial quorum sensing. Proc. Natl Acad. Sci. USA 112, E6038–E6047 (2015).
87. Melamed, S. et al. Global mapping of small RNA-target interactions in bacteria. Mol. Cell 63, 884–897 (2016).
88. Vanderpool, C. K. & Gottesman, S. Involvement of a novel transcriptional activator and small RNA in post-transcriptional regulation of the glucose phosphoenolpyruvate phosphotransferase system. Mol. Microbiol. 54, 1076–1089 (2004).
89. Papenfort, K., Podkaminski, D., Hinton, J. C. & Vogel, J. The ancestral SgrS RNA discriminates horizontally acquired Salmonella mRNAs through a single G–U wobble pair. Proc. Natl Acad. Sci. USA 109, E757–E764 (2012).
90. Bouvier, M., Sharma, C. M., Mika, F., Nierhaus, K. H. & Vogel, J. Small RNA binding to 5′ mRNA coding region inhibits translational initiation. Mol. Cell 32, 827–837 (2008).
91. Coornaert, A., Chiaruttini, C., Springer, M. & Guillier, M. Post-transcriptional control of the Escherichia coli PhoQ–PhoP two-component system by multiple sRNAs involves a novel pairing region of GcvB. PLoS Genet. 9, e1003156 (2013).
92. Sharma, C. M. et al. Pervasive post-transcriptional control of genes involved in amino acid metabolism by the Hfq-dependent GcvB small RNA. Mol. Microbiol. 81, 1144–1165 (2011).
93. Beisel, C. L. & Storz, G. The base-pairing RNA spot 42 participates in a multioutput feedforward loop to help enact catabolite repression in Escherichia coli. Mol. Cell 41, 286–297 (2011).
94. Papenfort, K. et al. Specific and pleiotropic patterns of mRNA regulation by ArcZ, a conserved, Hfq-dependent small RNA. Mol. Microbiol. 74, 139–158 (2009).
95. Chao, Y. et al. In vivo cleavage map illuminates the central role of RNase E in coding and noncoding RNA pathways. Mol. Cell 65, 39–51 (2017).
96. Chao, Y. & Vogel, J. A 3′ UTR-derived small RNA provides the regulatory noncoding arm of the inner membrane stress response. Mol. Cell 61, 352–363 (2016).
97. Sauer, E., Schmidt, S. & Weichenrieder, O. Small RNA binding to the lateral surface of Hfq hexamers and structural rearrangements upon mRNA target recognition. Proc. Natl Acad. Sci. USA 109, 9396–9401 (2012).

98. Horstmann, N. et al. Structural mechanism of Staphylococcus aureus Hfq binding to an RNA A-tract. Nucleic Acids Res. 40, 11023–11035 (2012).
99. Schumacher, M. A., Pearson, R. F., Moller, T., Valentin-Hansen, P. & Brennan, R. G. Structures of the pleiotropic translational regulator Hfq and an Hfq–RNA complex: A bacterial Sm-like protein. EMBO J. 21, 3546–3556 (2002).
100. Link, T. M., Valentin-Hansen, P. & Brennan, R. G. Structure of Escherichia coli Hfq bound to polyriboadenylate RNA. Proc. Natl Acad. Sci. USA 106, 19292–19297 (2009).
101. Schu, D. J., Zhang, A., Gottesman, S. & Storz, G. Alternative Hfq–sRNA interaction modes dictate alternative mRNA recognition. EMBO J. 34, 2557–2573 (2015).
102. Peng, Y., Curtis, J. E., Fang, X. & Woodson, S. A. Structural model of an mRNA in complex with the bacterial chaperone Hfq. Proc. Natl Acad. Sci. USA 111, 17134–17139 (2014).
103. Panja, S., Schu, D. J. & Woodson, S. A. Conserved arginines on the rim of Hfq catalyze base pair formation and exchange. Nucleic Acids Res. 41, 7536–7546 (2013).
104. Holmqvist, E. et al. Global RNA recognition patterns of post-transcriptional regulators Hfq and CsrA revealed by UV crosslinking in vivo. EMBO J. 35, 991–1011 (2016).
105. Tree, J. J., Granneman, S., McAteer, S. P., Tollervey, D. & Gally, D. L. Identification of bacteriophage-encoded anti-sRNAs in pathogenic Escherichia coli. Mol. Cell 55, 199–213 (2014).
106. Papenfort, K., Espinosa, E., Casadesus, J. & Vogel, J. Small RNA-based feedforward loop with AND-gate logic regulates extrachromosomal DNA transfer in Salmonella. Proc. Natl Acad. Sci. USA 112, E4772–E4781 (2015).
107. Fei, J. et al. RNA biochemistry. Determination of in vivo target search kinetics of regulatory noncoding RNA. Science 347, 1371–1374 (2015).
108. Zhao, H. et al. Crystal structure of the RNA-guided immune surveillance Cascade complex in Escherichia coli. Nature 515, 147–150 (2014).
109. Mulepati, S., Heroux, A. & Bailey, S. Structural biology. Crystal structure of a CRISPR RNA-guided surveillance complex bound to a ssDNA target. Science 345, 1479–1484 (2014).
110. Jackson, R. N. et al. Structural biology. Crystal structure of the CRISPR RNA-guided surveillance complex from Escherichia coli. Science 345, 1473–1479 (2014).
111. Wiedenheft, B. et al. Structures of the RNA-guided surveillance complex from a bacterial immune system. Nature 477, 486–489 (2011).
112. Fender, A., Elf, J., Hampel, K., Zimmermann, B. & Wagner, E. G. RNAs actively cycle on the Sm-like protein Hfq. Genes Dev. 24, 2621–2626 (2010).
113. Franch, T., Petersen, M., Wagner, E. G., Jacobsen, J. P. & Gerdes, K. Antisense RNA regulation in prokaryotes: Rapid RNA/RNA interaction facilitated by a general U-turn loop structure. J. Mol. Biol. 294, 1115–1125 (1999).
114. Brunel, C., Marquet, R., Romby, P. & Ehresmann, C. RNA loop–loop interactions as dynamic functional motifs. Biochimie 84, 925–944 (2002).
115. Geissmann, T. et al. A search for small noncoding RNAs in Staphylococcus aureus reveals a conserved sequence motif for regulation. Nucleic Acids Res. 37, 7239–7257 (2009).
116. Forster, A. C. & Altman, S. External guide sequences for an RNA enzyme. Science 249, 783–786 (1990).
117. Attaiech, L. et al. Silencing of natural transformation by an RNA chaperone and a multitarget small RNA. Proc. Natl Acad. Sci. USA 113, 8813–8818 (2016).
118. Smirnov, A. et al. Grad-seq guides the discovery of ProQ as a major small RNA-binding protein. Proc. Natl Acad. Sci. USA 113, 11591–11596 (2016).

Acknowledgements
The authors are grateful to B. Luisi, A. Eulalio, G. Wagner and members of the authors' laboratories for discussions and comments on the manuscript. The authors also thank S. Geibel for help with the figures.

Competing interests statement
The authors declare competing interests: see Web version for details.

DATABASES
RCSB Protein Data Bank: http://www.rcsb.org/pdb/home/home.do

14 | ADVANCE ONLINE PUBLICATION www.nature.com/nrm



© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.
