
Names: __________________________________ Section: ___________

Bio400

BIOINFORMATICS WORKSHEET

Each pair of students will turn in ONE Bioinformatics worksheet. Please write both
of your names at the top of this sheet.
BLASTing (BLAST = Basic Local Alignment Search Tool)
Imagine that you have cloned and sequenced a portion of an Arabidopsis gene.
gtgaacccgt caacccttga acctcggctg gcaagtctaa tcaaaggcag gcagttaaat
The questions you ask are:
- Has anyone else ever studied this gene?
- Is this gene unique to Arabidopsis or are there homologs in other species?
To answer these questions you need to search online genome databases:
1. Select "NCBI Blast" from the Bio 400 home page.
2. Select Nucleotide BLAST.
3. Paste your sequence into the "Enter Query Sequence" box. (The sequence can be
copied and pasted from the Word document BioinformaticsSequence in the
Assignments folder on the Bio 400 web page.)
4. Select a database from the dropdown menu to search. For the broadest search, use
the nucleotide collection (nr/nt).
5. Click the ? button next to "Choose a BLAST algorithm" and read the information. To
answer your initial questions (at top of page) you will need to choose the least stringent
search, "blastn".
6. Choose your algorithm and then click the blue "BLAST" button.
7. Your request is now being processed. The sequence you entered is being compared
to all sequences in the database you selected, so it may take a few minutes.
8. Your results will be presented in graphic format. Scroll down to see the pair-wise
alignments below the graph (or click on a bar inside the graph and you will be taken to
that sequence alignment).
a. What does the length of each line indicate? ______________________________
b. What does the color of each line indicate?________________________________
c. Can you find any homologs of this sequence in organisms other than
Arabidopsis? _____________________________
(Note: if you didn't identify any non-Arabidopsis sequences, you can go back to the
BLAST search page and broaden your search.)
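
Before pasting the query in step 3, it can help to confirm that the sequence copied cleanly. The following is a minimal Python sketch, not part of the official course materials: it strips the spacing from the sequence shown above, checks that only nucleotide letters remain, and formats it as FASTA. The function names and the record name `worksheet_query` are illustrative assumptions.

```python
import re

# The 60-base query printed on the worksheet, including its spacing.
RAW_QUERY = "gtgaacccgt caacccttga acctcggctg gcaagtctaa tcaaaggcag gcagttaaat"


def clean_sequence(raw: str) -> str:
    """Remove whitespace and digits, uppercase, and reject non-nucleotide letters."""
    seq = re.sub(r"[\s\d]", "", raw).upper()
    invalid = set(seq) - set("ACGTN")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return seq


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record with fixed-width sequence lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines)


query = clean_sequence(RAW_QUERY)
print(len(query))  # number of bases that will be submitted
print(to_fasta("worksheet_query", query))  # "worksheet_query" is a made-up label
```

NCBI BLAST accepts either the bare sequence or the FASTA record, so this check is optional.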

9. From the NCBI website: "E Value (Expect Value) describes the likelihood that a
sequence with a similar score will occur in the database by chance. The smaller the E
Value, the more significant the alignment. For example, [an] alignment [with] a very
low E value of e-117 [means] that a sequence with a similar score is very unlikely to
occur simply by chance."
Do alignments of your sequence with those in other species have
higher or lower E-values than alignments within Arabidopsis species?
_____________________
10. Click on the blue accession number for the entry that represents the Arabidopsis gene
with an exact match over these 60 base pairs. This will take you to the GenBank Record
for this gene.
11. Arabidopsis has 5 chromosomes. Each gene in the Arabidopsis genome has a unique
identifier (locus tag) in the format AtNgNNNNN. The At refers to Arabidopsis
thaliana, the "N" in Ng refers to the chromosome number, and the final 5 digits refer to
the location on the chromosome. The genes are numbered sequentially along the
chromosome. Look through the GenBank record to identify the Arabidopsis locus tag for
your sequence and write it below.
Arabidopsis locus tag: ________________________
12. What is the gene's three-letter acronym (usually followed by a
number if the gene belongs to a family of genes)? ____________________
This is the Arabidopsis gene that you will be studying ALL QUARTER in this class.
13. Looking at the GenBank record, answer these questions:
a. What chromosome is this gene located on? ___________
b. What is the function of this gene product? _______________________________
c. Does this gene have introns and exons? _______________________
(Hint: Does mRNA contain introns? Where did you find the information about
genomic DNA in the GenBank Record you looked at for the last worksheet?)
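
To make the E value from step 9 more concrete, here is a small Python sketch of the standard Karlin-Altschul estimate, E = m × n × 2^(−S'), where S' is the bit score, m the query length, and n the total database length. The database size and bit scores below are invented for illustration, and BLAST itself uses length-corrected ("effective") values, so the numbers will not match the results page exactly.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E value from a bit score: E = m * n * 2 ** (-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


QUERY_LEN = 60            # length of the worksheet query
DB_LEN = 1_000_000_000    # hypothetical database size in bases (assumption)

# Hypothetical bit scores: a near-perfect match and a short, weak match.
for label, bits in [("strong hit", 100.0), ("weak hit", 30.0)]:
    print(f"{label}: bit score {bits}, E ~ {expect_value(bits, QUERY_LEN, DB_LEN):.3g}")
```

Each extra bit of score halves the E value, which is why a slightly better alignment can look far more significant.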

Protein information databases


- Plant Chromatin DataBase or ChromDB: Great for showing gene expression!
1. Select the "ChromDB" link from the Bio 400 webpage.
2. In the Search Box at the top of the page, select Locus and enter the Arabidopsis locus
tag.
- What protein group does your gene product belong to? _____________________
3. Select expression from the menu at the top of the page
- In what plant tissues is your gene expressed
as determined by Northern analysis? ______________________________
Note: Slq = silique or seed pod; Sdlg = 2 week old seedling (entire plant)
- TAIR database: Great for getting a summary of all information known about your
Arabidopsis gene!
4. Go back to the Summary page for your gene.
5. Click on the "Locus Link": it will take you to the TAIR information page for your
gene.
6. Scroll down to the bottom of this page and notice there are several published papers
that reference your gene. You will be reading one of these (Plant orthologs of
p300/CBP: conservation of a core domain in metazoan p300/CBP acetyltransferase-related proteins) for class.
There is a lot more information on this page about your gene!
***You should come back to this page throughout the quarter to learn more about
your gene as you develop your research projects.***

SALK T-DNA database:


One way to study a protein's function is to look at the knock-out phenotype. Several
years ago, an incredible resource of Arabidopsis knock-out mutants was compiled at
the Salk Institute in San Diego. Random insertion of the Agrobacterium transfer DNA (T-DNA)
into the Arabidopsis genome created a library of seeds, each of which contains one
or two insertions in its genome. The Salk Institute has set up a searchable database to
allow you to find a mutant in your gene of interest.
1. Select the "Salk T-DNA" webpage from the Bio 400 homepage.
2. To make this page easier to read, use the arrows in the upper right corner of the
screen to zoom in to 10 bases/pixel. The dark blue bar running horizontally represents
the chromosome, with the top of the chromosome to the left. The first gene you see at
the top of Chromosome 1 is called At1g01010, the next one is named At1g01020,
and so on. The positions of the genes are shown as broken green arrows above the
chromosome (the introns are shown as gaps in the arrow). You can click on the arrows
on either end of the dark blue line to move along to the next region of the chromosome.
- Name a gene that does not have any introns: ______________________
3. Underneath the diagrammed chromosome, there are indications of various
insertions. For example, insert mutants generated at the Salk Institute lab are indicated
in green or pink, and are called SALK_NNNNNN. The numbering of these mutant
plants is unrelated to the gene in which they are inserted.
4. To search the database for a specific gene, scroll down to SEARCH and type the
Arabidopsis locus name into Query. Click on Search. The display will now be
centered on the gene you searched for. You should see SALK insert lines in your gene.
5. Write down the Salk numbers of two T-DNA insertion mutants that you believe will
not express your gene. For each one, describe where in the gene the insertion lies (e.g.
exon 3) and what effect you predict it will have on gene expression and/or function.
The position of the insert determines in part how it affects gene function.
1)

2)

***Check with a TA/instructor to find out which Salk T-DNA line your group will be
using. Then proceed to identifying PCR primers on the next page.***
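
The locus names typed into the search in step 4 follow the AtNgNNNNN pattern described earlier (for example At1g01010). Here is a minimal Python sketch for checking a name before searching. It assumes only the five nuclear chromosomes mentioned in this worksheet; the names `LocusTag` and `parse_locus_tag` are illustrative.

```python
import re
from typing import NamedTuple


class LocusTag(NamedTuple):
    chromosome: int   # the N in "AtNg"
    gene_index: int   # the final five digits, read as a number


# "At", chromosome 1-5, "g", then exactly five digits; case is ignored.
LOCUS_PATTERN = re.compile(r"At([1-5])g(\d{5})", re.IGNORECASE)


def parse_locus_tag(tag: str) -> LocusTag:
    """Split a locus tag into chromosome number and position index."""
    match = LOCUS_PATTERN.fullmatch(tag.strip())
    if match is None:
        raise ValueError(f"not an AtNgNNNNN locus tag: {tag!r}")
    return LocusTag(int(match.group(1)), int(match.group(2)))


print(parse_locus_tag("At1g01010"))
print(parse_locus_tag("At1g01020"))
```

Because genes are numbered sequentially along each chromosome, two tags with the same chromosome number and nearby gene indices lie near each other.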

Identifying PCR Primers

We will be using PCR to confirm that the plants we obtained from the ABRC
(Arabidopsis Biological Resource Center) do indeed have a T-DNA insertion. We want
you to see how to design the primers needed for PCR.
1. Return to the Salk T-DNA website.
2. On the right side of the page, under Tools, select "iSect Primers". (You may have to
scroll down.)
3. Scroll down to 2. Salk-TDNA verification Primer design
4. In the white box, type in the name of your SALK line (using the indicated format) and
click on submit.
5. In the space below, write the sequences of your LP and RP primers (include 5' and 3'
labels). By convention, DNA sequences are written in the 5' to 3' orientation.
LP: ______________________________________________________
RP: ______________________________________________________
6. Copy the information you obtain into a Word document and e-mail it to yourself so
that you can print it and SAVE it. You will need this information to design your PCR
protocol in next week's laboratory.
(Note: LP = left primer; RP = right primer; BP = T-DNA border primer; TM = melting
temperature; GC = % GC content. The output also gives the predicted PCR product size
if an insert is present (BP+RP product size).)
Note: Since you are joining an ongoing research project, students from previous quarters have
already ordered the LP and RP primers for the PCR.
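
The GC and TM values reported by the primer tool can be approximated by hand. The sketch below computes percent GC and a rough melting temperature using the Wallace rule, Tm = 2(A+T) + 4(G+C), which is a common rule of thumb for short primers. The Salk tool may use a different formula, so treat the result as an estimate only. The example primer is simply the first 20 bases of the worksheet query sequence, used as a stand-in, not a real LP or RP primer.

```python
def gc_percent(primer: str) -> float:
    """Percentage of bases in the primer that are G or C."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Stand-in primer (assumption): first 20 bases of the worksheet query, 5' to 3'.
EXAMPLE_PRIMER = "GTGAACCCGTCAACCCTTGA"

print(f"GC content: {gc_percent(EXAMPLE_PRIMER):.1f}%")
print(f"Approximate Tm: {wallace_tm(EXAMPLE_PRIMER)} C")
```

Substituting your own LP and RP sequences lets you compare the two primers; a pair with similar Tm values generally works better in one reaction.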

Double checking primers

Before you go to the expense of ordering primers, carrying out PCR, and
analyzing results, it's always a good idea to double-check the primer sequences to see
that they are specific for the sequence you're interested in, and that they will amplify
the correct sequence.
1. Go to the TAIR homepage from the Bio 400 homepage.
2. Under TOOLS select SeqViewer.
3. In the text box, paste (or type in) the LP primer sequence, select search by sequence
and hit Submit.
4. Look carefully at the screen. The green lines represent the 5 Arabidopsis chromosomes.
The position that matches your primer sequence will be indicated by a red #1.
5. Click on select (the button in the middle of the top of the page) then scroll down to
1.
6. A new window will open with a view of the chromosome sequence indicating what
region your primer is complementary to.
7. Look along the right side of the screen to see if the gene name written vertically
matches your gene of interest.
8. Look at the sequence, and record the sequence position of the primer. Is the sequence
italicized or not? (If the sequence is italicized, it is identical to the bottom strand
of the DNA; if it is not, it is identical to the top strand of the DNA.)
9. Now repeat this process for the RP primer and answer the following questions.
a. Which strand will the LP primer anneal to? _________________________
b. Which strand will the RP primer anneal to? _________________________
c. Do both primers fall within the correct gene? _________________________
d. What would be the size of a PCR product created by amplifying Arabidopsis
genomic DNA using these two primers? _________________________
Make sure that the answers make sense with respect to PCR. We will be going over
PCR in lecture and lab next week.
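
The strand and product-size questions above can be checked on a toy example. In this Python sketch, a primer that is identical to the top strand anneals to the bottom strand, and vice versa; the product runs from the start of the top-strand primer to the end of the bottom-strand primer's site. The 50-base template and both primers are invented for illustration and are not Arabidopsis sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate(primer: str, top_strand: str):
    """Return (strand the primer is IDENTICAL to, start, end), or None if absent.

    Coordinates are 0-based positions on the top strand, end exclusive.
    """
    primer = primer.upper()
    pos = top_strand.find(primer)
    if pos != -1:
        return ("top", pos, pos + len(primer))
    pos = top_strand.find(reverse_complement(primer))
    if pos != -1:
        return ("bottom", pos, pos + len(primer))
    return None


def product_size(lp: str, rp: str, top_strand: str) -> int:
    """Length of the fragment amplified by a primer pair on this template."""
    hits = [locate(lp, top_strand), locate(rp, top_strand)]
    if None in hits:
        raise ValueError("a primer was not found on the template")
    if hits[0][0] == hits[1][0]:
        raise ValueError("both primers match the same strand, so nothing amplifies")
    forward, reverse = sorted(hits, key=lambda h: h[0] != "top")
    if forward[1] >= reverse[2]:
        raise ValueError("primers point away from each other")
    return reverse[2] - forward[1]


# Made-up 50-base top strand (assumption), written 5' to 3'.
TEMPLATE = "ATGGCGTACGTTAGC" + "A" * 20 + "GGATCCTTGACGCAT"
LP = "ATGGCGTACG"   # identical to the top strand, so it anneals to the bottom strand
RP = "ATGCGTCAAG"   # identical to the bottom strand, so it anneals to the top strand

print(locate(LP, TEMPLATE))
print(locate(RP, TEMPLATE))
print(product_size(LP, RP, TEMPLATE))
```

In SeqViewer terms, a primer reported as identical to the bottom strand corresponds to the italicized case in step 8.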
YOU WILL NEED TO UNDERSTAND HOW TO DESIGN PRIMERS FOR THE LAB
PRACTICAL
Turn in your worksheet to a TA.
