
Exercises for phylogeny

The exercises use programs in the PHYLIP package (Felsenstein, 1995; Phylogeny Inference Package, ver. 3.5c. Seattle, Department of Genetics, University of Washington). For further information see http://evolution.genetics.washington.edu/phylip.html. During the course, the programs included with PHYLIP can be accessed in two ways: from the web interface at Institut Pasteur:
http://bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html

or from the web interface at KVL implemented by Peter Sestoft: http://www.dina.kvl.dk/~sestoft/bsa/dinaws/phylogeny.html. The exercises are presented in relation to the Pasteur server. For larger jobs, output will be returned by email. Random number seeds should be of the form 4n + 1, e.g. 137. For each job submitted on the Pasteur server, follow the link that comes up, otherwise the data will be lost. Example: "From now, this files will remain accessible for 10 days at: http://bioweb.pasteur.fr/seqanal/tmp/neighbor/A41284110643970/"

You can also install PHYLIP on your PC, where it runs from the DOS command prompt. This is not used in the exercises and is mentioned only for your information. To use this possibility, download the package from the Felsenstein homepage (address above) and proceed in DOS.

Exercise 1. Parsimony and rooted versus unrooted trees.


The dataset is very simple, to illustrate the principle. The input is in PHYLIP format: the first line contains the number of sequences and the number of positions in the sequences, and the following lines contain labels of exactly 10 characters followed by the letters of the sequences (nucleotides or amino acids). The sequences must be aligned. Use the program DNAPARS in advanced form with the output options turned on. Study the output (outfile). Have the data been read correctly? What is the meaning of "." in the display of the sequences? Draw by hand the corresponding unrooted tree based on the program output. Draw by hand all possible (3) unrooted trees with 4 leaves.
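To check the hand-drawn trees, the three unrooted topologies can be scored with a short script. This is a minimal sketch, assuming Fitch's counting rule for unordered characters; it is not the DNAPARS implementation, and the function names are made up for illustration. The sequences are those of the Exercise 1 data set.

```python
def fitch_combine(left, right):
    """Fitch rule: intersect the child state sets, or unite them at a cost of one step."""
    common = left & right
    if common:
        return common, 0
    return left | right, 1


def quartet_steps(seqs, pair):
    """Steps needed on the unrooted 4-leaf tree that groups `pair` against the other two leaves."""
    a, b = pair
    c, d = [name for name in seqs if name not in pair]
    total = 0
    for i in range(len(seqs[a])):
        set_ab, n_ab = fitch_combine({seqs[a][i]}, {seqs[b][i]})
        set_cd, n_cd = fitch_combine({seqs[c][i]}, {seqs[d][i]})
        _, n_mid = fitch_combine(set_ab, set_cd)  # step on the internal branch, if any
        total += n_ab + n_cd + n_mid
    return total


seqs = {"seq1": "AAG", "seq2": "AAA", "seq3": "GGA", "seq4": "AGA"}
# With 4 leaves, a topology is fixed by the leaf that is paired with seq1.
topologies = [("seq1", "seq2"), ("seq1", "seq3"), ("seq1", "seq4")]
scores = {pair: quartet_steps(seqs, pair) for pair in topologies}
for pair, steps in scores.items():
    print(pair, steps)
```

The topology with the lowest score is the most parsimonious one and can be compared with the tree reported in the DNAPARS outfile.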

Exercise 2. Evolution of primates.


In this exercise we will use a small real data set with 5 sequences of 846 sites for 5 primates including man. The region is mitochondrial DNA covering three tRNAs and parts of two coding genes. Mitochondrial DNA is used because evolution within this organelle has been almost 10 times faster than in chromosomal DNA. The alignment was first used by Brown et al. 1982. J. Mol. Evol. 18, 225-39 and has been one of the most frequently used phylogenetic cases. The sequences have accession numbers V00658, V00659, V00672, V00675 and D381123 in GenBank.

The purpose of the exercise is to explore the programs implementing the parsimony, neighbour joining and maximum likelihood algorithms. The expected phylogeny of primates on the basis of morphological data is that man and chimpanzee are most closely related, and that gorillas, orangutans and gibbons are more distantly related, in that order (Benton, 1997. Vertebrate Palaeontology 2nd ed. Chapman & Hall). The expected phylogeny can be described by the tree formula: gibbon(orang-utan(gorilla(chimpanzee, man))).

Parsimony. Proceed as in exercise 1 with the DNAPARS program. Write down the resulting tree, rooting it with gibbon. You will need the tree for the following comparison. Save the treefile on your local PC, e.g. on the desktop; you will need it later.

Neighbour joining. First generate a distance matrix with the DNADIST program using standard settings (Jukes and Cantor matrix). Study the output file and use it as input to the neighbour joining program, NEIGHBOR. Write down the tree with gibbon as root. Save the treefile on your local PC, e.g. on the desktop; you will need it later. Try also the substitution model F84 with DNADIST and run NEIGHBOR again. Did it change anything, and why? Hint: transition/transversion bias.

Maximum likelihood. Use the fastDNAml program with standard settings. Repeat the analysis and observe whether the final log likelihood value is constant and, if not, why. Write down the branching order of the tree with the highest ln likelihood and compare it with those from the other two methods.

Bootstrap. Any of the analyses above can be supplemented by bootstrap analysis. In PHYLIP this is done as a three-step procedure. First, e.g. 100 bootstrap resamples are generated from the original sequence data set with the SEQBOOT program. Then all the resamples are analysed by one of the phylogenetic methods with the M (multiple) option set to 100. Finally the consensus tree of the 100 trees is obtained with CONSENSE.
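The first step of the bootstrap procedure, resampling alignment columns with replacement, can be sketched as follows. This is only an illustration of the principle, not the SEQBOOT code; the function name and the toy alignment are assumptions, and the seed 137 is simply the example seed from the introduction (Python's generator does not require the 4n + 1 form).

```python
import random


def bootstrap_alignment(seqs, rng):
    """Draw as many columns as the alignment has, with replacement, and rebuild each sequence."""
    n_sites = len(next(iter(seqs.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in seqs.items()}


rng = random.Random(137)
alignment = {"seq1": "AAG", "seq2": "AAA", "seq3": "GGA", "seq4": "AGA"}  # toy data
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
print(replicates[0])
```

Each replicate keeps the columns intact (the same site is drawn for every sequence), so a tree can be built from it exactly as from the original alignment.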
Some of these steps are handled automatically when running PHYLIP on the Pasteur server. It is important to use the file Treefile with all the tree formulas as input for CONSENSE. Try to bootstrap the parsimony and the neighbour joining analyses. Start with the advanced forms of the two analyses and you will be guided through the analysis just by clicking. In the last step you need to choose CONSENSE. Write the bootstrap proportions on the trees you made under Parsimony and Neighbour joining (the maximum likelihood analysis can be bootstrapped but will be too slow for the exercise). You are only allowed to write down a given bootstrap value if the monophyletic group is recognized in the original tree. Compare the two trees made from parsimony and neighbour joining with the bootstrap values included. Where are the bootstrap proportions highest? Do high bootstrap values indicate high consistency of the method, or just that the bootstrap approach is reproducing the original data better?
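For the neighbour joining part of this exercise, the Jukes and Cantor correction behind the distance matrix can be written out in a few lines. This is a sketch of the textbook formula d = -3/4 ln(1 - 4p/3), where p is the proportion of differing sites; DNADIST may treat ambiguous characters differently, and the function names are made up for illustration. The two strings are the first 10 sites of the chimpanzee and gibbon sequences in the Exercise 2 data set.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites, ignoring positions where either base is not A, C, G or T."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(a, b):
    """Jukes-Cantor distance: expected substitutions per site, corrected for multiple hits."""
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


chimp = "AAGCTTCACC"
gibbon = "AAGCTTTACA"
p = p_distance(chimp, gibbon)
d = jukes_cantor(chimp, gibbon)
print(p, round(d, 4))
```

The corrected distance is always at least as large as p, and the gap between the two grows as the sequences diverge.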

Exercise 3. Primates compared on protein level.


The coding regions of the DNA used in exercise 2 have been translated into protein, giving 149 amino acids in each of the five sequences. Repeat the analysis with the PROTPARS and PROTDIST/NEIGHBOR programs, equivalent to exercise 2 (maximum likelihood analysis can be done with MOLPHY on the Pasteur server, but we do not do it). Compare the trees with those obtained for DNA in exercise 2. Try to compare all five trees obtained. Which trees fit the palaeontological tree? How can disagreements be explained? Think about saturation of mutations and long branch attraction.

Exercise 4. Training in making multiple alignment and phylogenetic analysis.


Make a multiple alignment in PHYLIP format of the primate data set. The sequences have accession numbers V00658, V00659, V00672, V00675 and D381123 in GenBank. Open each sequence from www.ncbi.nlm.nih.gov Entrez. Save each sequence in Fasta format on your PC. Combine the sequences into one file with a text editor and use it as input for ClustalW or ClustalX. Choose PHYLIP as output format. Check that PHYLIP accepts the alignment. Do a phylogenetic analysis by neighbour joining and compare the tree with the one in exercise 2. If differences occur, what could be the reason? Hint: gaps were not removed, so how do the analyses treat gaps?
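Two of the manual steps in this exercise, joining the Fasta files and removing gapped columns, can also be scripted. The sketch below is an assumption-laden illustration: the function names are hypothetical, the inputs are plain strings rather than files, and it says nothing about how the PHYLIP programs themselves treat gaps.

```python
def merge_fasta(texts):
    """Join several Fasta records into one text, one record after the other."""
    cleaned = []
    for text in texts:
        text = text.strip()
        if not text.startswith(">"):
            raise ValueError("each record must start with a '>' header line")
        cleaned.append(text)
    return "\n".join(cleaned) + "\n"


def drop_gap_columns(aligned):
    """Remove every alignment column in which at least one sequence has a gap ('-')."""
    names = list(aligned)
    length = len(aligned[names[0]])
    keep = [i for i in range(length) if all(aligned[n][i] != "-" for n in names)]
    return {n: "".join(aligned[n][i] for i in keep) for n in names}


merged = merge_fasta([">a\nACGT\n", ">b\nAC-T"])      # toy records
stripped = drop_gap_columns({"a": "ACGT", "b": "AC-T"})  # toy alignment
print(merged)
print(stripped)
```

Running the neighbour joining analysis on the alignment with and without gapped columns is one way to test whether gaps explain a difference from the tree in exercise 2.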

Exercise 5. Phylogenetic analysis for sorting orthologous and paralogous genes.


The F0F1 ATP synthase operon is used as a model. The F1 unit is composed of five subunits, and two of these are encoded by homologous genes. At the protein level, these paralogs are compared in members of Enterobacteriaceae (Escherichia coli and Yersinia enterocolitica) and Pasteurellaceae (Haemophilus influenzae and Pasteurella multocida). Use one of the methods you used in exercise 2 with the data supplied below. Are the orthologous genes more closely related than the paralogous ones?

Exercise 6. Make nice output for presentations and publications.


A. PHYLIP. Drawtree makes a radial type of tree, Drawgram a dendrogram-like tree. It is easiest to select the program right after the phylogenetic analysis is finished, so that the treefile is used as input automatically. In this case we use the saved tree formulas from exercise 2. You cannot view the plotfile.ps directly, but it can either be plotted on your local printer (c:\>copy plotfile.ps lpt2) or imported into a local graphics program. It normally requires several tries to get a nice tree. Make a nice tree with Drawtree or Drawgram (PHYLIP) and plot the PostScript file (c:\>copy filename lpt1) or edit it further with a graphics program. After saving as a Windows metafile, the file can be imported into Word or PowerPoint.

B. Treeview. This program is used to draw and manipulate trees. It might be your only option for branch swapping, which is rotation around a node; in this way the tree can be manipulated to illustrate results better. If the outgroup is located in the middle of the tree, the tree cannot be shown in a publication, and the outgroup needs to be rotated to either the top or the bottom of the tree. You have to install this program on the local PC. Locate the download site by searching for "treeview" with Google.com, then download, unzip and install the program. The input is given as a file with the tree formula, e.g. (((a,b),c),d);, opened with File > Open. Take the raw tree formula from exercise 2 and work on it further. The most important function is swap. Save the tree (File > Save as graphic > Windows metafile). Insert the tree into Word or PowerPoint and edit the text further.
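Why rotation around a node changes the drawing but not the phylogeny can be seen with a small sketch. It represents a tree as nested tuples and is not related to how Treeview works internally; the function names are hypothetical, and the tree is the expected primate phylogeny from exercise 2.

```python
def to_newick(tree):
    """Write a nested-tuple tree as a parenthesised tree formula (without the final ';')."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


def rotate(tree, path=()):
    """Reverse the child order at the node reached by following the child indices in `path`."""
    if isinstance(tree, str):
        return tree  # a leaf has nothing to rotate
    if not path:
        return tuple(reversed(tree))
    head, rest = path[0], path[1:]
    return tuple(rotate(child, rest) if i == head else child
                 for i, child in enumerate(tree))


tree = ("gibbon", ("orangutan", ("gorilla", ("chimpanzee", "man"))))
newick = to_newick(tree) + ";"
rotated_root = to_newick(rotate(tree)) + ";"         # outgroup moves from top to bottom
rotated_inner = to_newick(rotate(tree, (1,))) + ";"  # rotate the node above orangutan
print(newick)
print(rotated_root)
print(rotated_inner)
```

All three formulas describe the same set of groups; only the order in which the leaves are drawn differs.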

Data sets for Phylogeny exercises


Exercise 1
4 3
seq1      AAG
seq2      AAA
seq3      GGA
seq4      AGA
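The PHYLIP layout described in Exercise 1 (a line with the two counts, then labels of exactly 10 characters followed directly by the sequence) can be generated from named sequences. This is a minimal sketch for the sequential form of the format; the function name is hypothetical.

```python
def to_phylip(seqs):
    """Format aligned sequences as sequential PHYLIP text with 10-character labels."""
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    lines = [f"{len(seqs)} {lengths.pop()}"]
    for name, seq in seqs.items():
        # Truncate long names and pad short ones so the label is exactly 10 characters.
        lines.append(name[:10].ljust(10) + seq)
    return "\n".join(lines) + "\n"


phylip_text = to_phylip({"seq1": "AAG", "seq2": "AAA", "seq3": "GGA", "seq4": "AGA"})
print(phylip_text)
```

Padding with spaces is one option; the Exercise 2 and 3 data sets instead fill short names with "x" so that every label is visibly 10 characters long.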

Exercise 2
5 846 chimpanzeeAAGCTTCACC gibbonxxxxAAGCTTTACA gorillaexxAAGCTTCACC homosapienAAGCTTCACC orangutangAAGCTTCACC CATTATTATT CCCTGCTATT CATTATTATT CATTACTATT CCCTACTGTT ATCATAATTC ATCATAATCC ATCATAATTC ATCATAATCC CTGCCTAGCA CTGCCTTGCA CTGCCTAGCA CTGCCTAGCA CTGCCTAGCA TCTCCCAAGG TATCTCGAGG TCTCTCAAGG TCTCTCAAGG GGCGCAATTA GGTGCAACCG GGCGCAGTTG GGCGCAGTCA GGCGCAACCA AACTCAAATT AACTCAAACT AACTCAAACT AACTCAAACT AACTCAAACT ACTTCAAACT GCTCCAAGCC ACTCCAAACC ACTTCAAACT TCCTCATAAT TCCTCATAAT TTCTTATAAT TTCTCATAAT CCCTCATGAT ATGAACGCAC ACGAACGAAC ACGAACGAAC ACGAACGCAC ACGAACGAAC CTACTCCCAC TTACTCCCAC CTACTCCCAC CTACTCCCAC CGCCCACGGA CGCCCACGGA TGCCCACGGA CGCCCACGGA TGCCCATGGA CCACAGTCGC TCACAGCCGC CCACAGCCGC TCACAGTCGC CCACAGCCGC TAATAGCCTT TGATAGCCTT TAATAGCCCT TAATAGCTTT CTTACATCCT CTAACCTCTT CTTACATCAT CTTACATCCT CTCACATCCT

ATCATAATCC TCTCTCAAGG CCTTCAAACT CTACTCCCCC TAATAGCCCT TTGATGACTC CTGATGACTC TTGATGACTT TTGATGACTT CTGATGACTT ATCTCCTAGG ACCTCCTAGG ACCTACTAGG ACCTACTGGG ACCTTCTAGG ACCACTCTCC ACTACTATTA ACCACCCTTT ATCACTCTCC ATCACCATCC CCTCTACATG CCTTTACATA CCTTTATATA CCTCTACATA TCTCTATATA ATAACATAAA AAAACATAAA CCAACATAAA ACAACATAAA ACAACATAAA CTATCCCCCA CTCTTCCCCC CTATCCCCCA CTATCCCCCA CTATCCCCCA CACCTCCTGT TACTCCCTGT CACCTCCTGT TTCCTCTTGT CGCCTACTGT ACAGAGGCTC ATAGAGGCTC ACAGAGGCTC ACAGAGGCTT ATAGGGCCCC ATTCATATCC ACTCACTATC ACTCATACCC ACTCATGCCC ACTCNTCACT ACAGCCATCC ACAGCTATCC ACAGCTATCC ACAGCTATCC ACAGCTATCC CTAGCAAGCC GCAGCAAGCC CTGGCAAGCC CTAGCAAGCC CTAGCAAGCC GGAACTCTCC TGAACTCTTC AGAGCTCTCC AGAACTCTCT AGAACTCTCC TACTCACAGG CACTCACCGG TACTTACAGG TACTTACAGG TACTAACAGG TTTACCACAA TTTATCATAA TTTACCACAA TTTACCACAA TTCACCACAA GCCCTCATTC ACCCTCACTC ACCCTCATTT ACCCTCATTC ACCTTCTTTC TCCTCCTTCT TCCTCCTCCT TCCTCCTCCT TTCTCCTCCT TCCTCCTCTT AAATATAGTT AAACATAGTT AAATATAGTT AAATATAGTT AAATATAGTT ACGACCCCTT GAAACCTCTT ACAACCCCTT ACGACCCCTT ACAACCCCTT CCATGCCTGA CCATGTATGA CCGTGCTTGA CCATGTCTAA CCATGTGTGA GTTGGTCTTA ATTGGTCTTA ATTGGTCTTA ATTGGTCTTA CTTGGTCTTA TCGCTAACCT TCGCTAACCT TCGCCAACCT TCGCTAACCT TCACTAACCT GTGCTAGTAA GTACTAATGG GTACTAGTAA GTGCTAGTAA GTACTAATAG ATTCAACATA GCTCAACGTA ATCTAACATA ACTCAACATA ACTCAACATA CACAATGAGG CACAACGAGG CACAATGAGG CACAATGAGG CACAACGAGG ACACGAGAAA ACACGAGAAA ACACGAGAAA ACACGAGAAA ACACGCGAAA ATCCCTCAAT AACCCTCAAC ATCCCTCAAC ATCCCTCAAC ATCCCTCAAC TAACCAAAAC TAATCAAAAC TAACCAAAAC TAACCAAAAC TAACCAAAAC ATTTACCGAG GCTTACCGAG ATTTACCGAG ATTTACCGAG ATTTACCGAG CAACATGGCT CAACATGGCT CAACATGGCT CAACATGGCT CAACATGGCT GGCCCCAAAA GGACCCAAAA GGACCCAAAA GGCCCCAAAA GGATCCAAAA CGCCCTACCC CGCCCTACCC CGCCTTACCC CGCCTTACCC TGCCCTACCA CCTCATTCTC CCTCCTTCTC CCACATTCTC CCACATTCTC CCATATTCTC CTAATCACAG CTAATCACGG CTAATCACAG CTAGTCACAG CTAATCACAA CTCACTCACC CACACTTACA CCCACTCACA CTCACTCACC TACACCCACA ATACTCTCAT ACATATTAAT ACATCCTCAT ACACCCTCAT ATACCCTCAT CCTGATATCA 
CCTAACATCA CCCGATATTA CCCGACATCA CCCAGCATCA ATCAGATTGT ATTAGATTGT ATCAGATTGT ATCAGATTGT ATTAGATTGT AAAGCTTATA AAAGCCCACA AAAGCTCGTA AAAGCTCACA AAAGCTCACA TTCTCAACTT TTCTCAACTT TTCTCAACTT TTCTCAACTT TTCTCAGCTT ATTTTGGTGC ATTTTGGTGC ATTTTGGTGC ATTTTGGTGC ATTTTGGTGC CCTACCATTA CCCACTATTA CCCACCATTA CCCACTATTA CCCACCATCA CTGATCAAAT CTGGGCAAAC CTGATCAAAT CTGATCAAAT TTGATCTAAC CCCTGTACTC CCCTATACTC CCCTGTACTC CCCTATACTC CCCTATACTC CACCACATTA CACCACATTA CACCACATCA CACCACATTA CACCACATCA ATTTTTACAC ACTTATGCAC ATTCATGCAC GTTCATACAC GCTCATACAC TCACTGGATT TTACTGGCTT TCACCGGGTT TTACCGGGTT TCGCTGGGTT GAATCTGACA GAATCTAACA GAATCTGATA GAATCTGACA GAATCTAATA AGAACTGCTA AGAACTGCTA AGAGCTGCTA AGAACTGCTA AGAACTGCTA TTAAAGGATA TTAAAGGATA TTAAAGGATA TTAAAGGATA TTAAAGGATA AACTCCAAAT AACTCCAAAT AACTCCAAAT AACTCCAAAT AACTCCAAAT

AAAAGTAATA AAAAGTAATA AAAAGTAATA AAAAGTAATA AAAAGTAACA TAATTCTCCC TAATTCCCCC TAATTCCCCC TAATTCCCCC TAATCCCCCC TATCCCCATT TACCCGCACT TACCCCCATT TACCCCCATT TACCCCCACT TTTCCCCACA ATTTCCCACA CTTCCCCACA CTTCCCCACA TATCCCAACA ACTGGCACTG ACTGACACTG GCTGACACTG ACTGACACTG ACTGATGCTG

ACCATGTATA GCAATGTACA ACTATGTACG ACCATGCACA GCCATGTTTA CATCCTCACC CATTACAGCC TATCCTTACC CATCCTTACC CATTACCGCT ATGTGAAATC ACGTAAAAAT ACGTAAAATC ATGTAAAATC ATGTAAAAAC ACAATATTCA ATAATATTCA ACAATATTTC ACAATATTCA ACAATATTTA AGCAACAACC AACTGCAACC AGCAACAACC AGCCACAACC AACAACCACC

CTACCATAAC CCACCATAGC CTACCATAAC CTACTATAAC CCACCATAAC ACCCTCATTA ACCCTTATTA ACCTTCATCA ACCCTCGTTA ACCCTCATTA CATTATCGCG GACCATTGCC TATCGTCGCA CATTGTCGCA GGCCATCGCA TATGCCTAGA TGTGCACAGA TATGCCTAGA TGTGCCTAGA TCTGCCTAGG CAAACAACCC CAAACGCTAG CAAACAATTC CAAACAACCC CAGACACTAC

CACCTTAACC CATTCTAACG CACCTTAGCC CACCCTAACC TGCCCTCACC ACCCTAACAA ACCCCAATAA ATCCTAACAA ACCCTAACAA ACCCCAACAA TCCACCTTTA TCTACCTTTA TCCACCTTTA TCCACCTTTA TCCGCCTTTA CCAAGAAGCT CCAAGAAACC CCAAGAAGCT CCAAGAAGTT ACAAGAAACC AGCTCTCCCT AACTCTCCCT AACTCTCCCT AGCTCTCCCT AACTCTCACT

CTAACTCCCT CTAACCTCCC CTAACTTCCT CTGACTTCCC TTAACTTCCC AAAAAACTCA AAAGAACTTA AAAAAGCTCA AAAAAACTCA AAAAAACCCA TCATTAGCCT TAATCAGCCT TCATCAGCCT TTATCAGTCT CTATCAGCCT ATTATCTCAA ATTATTTCAA ATTATCTCAA ATTATCTCGA ATCGTCACAA AAGCTT AAGCTT AAGCTT AAGCTT AAGCTT

Exercise 3
5 149
chimpanzeeSFTGAIILIIAHGLTSSLLFCLANSNYERTHSRIIILSQGLQTLLPLIAF
gibbonxxxxSFTGATVLIIAHGLTSSLLFCLANSNYERTHSRIIILSRGLQALLPLIAF
gorillaexxSFTGAVVLIIAHGLTSSLLFCLANSNYERTHSRIIILSQGLQTLLPLIAL
homosapienSFTGAVILIIAHGLTSSLLFCLANSNYERTHSRIIILSQGLQTLLPLIAF
orangutangSFTGATTLMIAHGLTSSLLFCLANSNYERTHSRIIILSQGLQTLLPLIAL

LLASLANLALPPTINLLGELSVLVTSFSSNTTLLLTGFNILITALYS
LAASLANLALPPTINLLGELFVLMASFSANTTITLTGLNVLITALYS
LLASLANLALPPTINLLGELSVLVTTFSSNTTLLLTGSNILITALYS
LLASLANLALPPTINLLGELSVLVTTFSSNITLLLTGLNILVTALYS
LLASLTNLALPPTINLLGELSVLIAIFSSNITILLTGLNILITTLYS

LYMFTTTQGSLTHHINNIKPSFTRENTLIFLHLSPILLLSLNPDIITGF
LYIFIITQGTLTHHIKNIKPSLTRENILILMHLFPLLLLTLNPNIITGF
LYIFTTTQGPLTHHITNIKPSFTRENILIFMHLSPILLLSLNPDIITGF
LYIFTTTQGSLTHHINNIKPSFTRENTLMFIHLSPILLLSLNPDIITGF
LYIFTTTQGTPTHHINNIKPSFTRENTLMLIHLSPILLLSLNPSIIAGF

TSC
TPC
TSC
SSC
AYC

Exercise 5.
>hinfatpD
AVIDVEFPQD AVPKVYDALK VESGLTLEVQ QQLGGGVVRC IALGTSDGLK RGLKVENTNN
PIQVPVGTKT LGRIMNVLGE PIDEQGAIGE EERWAIHRSA PSYEEQSNST ELLETGIKVI
DLICPFAKGG KVGLFGGAGV GKTVNMMELI RNIAIEHSGY SVFAGVGERT REGNDFYHEM
KDSNVLDKVS LVYGQMNEPP GNRLRVALTG LTMAEKFRDE GRDVLFFVDN IYRYTLAGTE
VSALLGRMPS AVGYQPTLAE EMGVLQERIT STKTGSITSV QAVYVPADDL TDPSPATTFA
HLDSTVVLSR QIASLGIYPA VDPLDSTSRQ LDPLVVGQEH YDVARGVQGI LQRYKELKDI
IAILGMDELS EEDKLVVARA RKIERFLSQP FFVAEVFTGS PGKYVTLKDT IRGFKGILDG
EYDHIPEQAF Y
>pmatpD
AVIDVEFPQD AVPKVYDALN VETGLVLEVQ QQLGGGVVRC IAMGSSDGLK RGLSVTNTNN
PISVPVGTKT LGRIMNVLGE PIDEQGEIGA EENWSIHRAP PSYEEQSNST ELLETGIKVI
DLVCPFAKGG KVGLFGGAGV GKTVNMMELI RNIAIEHSGY SVFAGVGERT REGNDFYHEM
KDSNVLDKVS LVYGQMNEPP GNRLRVALTG LTMAEKFRDE GRDVLFFVDN IYRYTLAGTE
VSALLGRMPS AVGYQPTLAE EMGVLQERIT STKTGSITSV QAVYVPADDL TDPSPATTFA
HLDSTVVLSR QIASLGIYPA VDPLESTSRQ LDPLVVGEEH YNVARGVQTT LQRYKELKDI
IAILGMDELS EEDKLVVARA RKIERFLSQP FFVAEVFNGT PGKYVPLKET IRGFKGILDG
EYDHIPEQAF Y
>ecoliATPD
MATGKIVQVIGAVVDVEFPQDAVPRVYDALEVQNGNERLVLEVQQQLGGGIVRTIAMGSSDGLRRGLDVK
DLEHPIEVPVGKATLGRIMNVLGEPVDMKGEIGEEERWAIHRAAPSYEELSNSQELLETGIKVIDLMCPF
AKGGKVGLFGGAGVGKTVNMMELIRNIAIEHSGYSVFAGVGERTREGNDFYHEMTDSNVIDKVSLVYGQM
NEPPGNRLRVALTGLTMAEKFRDEGRDVLLFVDNIYRYTLAGTEVSALLGRMPSAVGYQPTLAEEMGVLQ
ERITSTKTGSITSVQAVYVPADDLTDPSPATTFAHLDATVVLSRQIASLGIYPAVDPLDSTSRQLDPLVV
GQEHYDTARGVQSILQRYQELKDIIAILGMDELSEEDKLVVARARKIQRFLSQPFFVAEVFTGSPGKYVS
LKDTIRGFKGIMEGEYDHLPEQAFYMVGSIEEAVEKAKKL
>ecoliATPB
MASENMTPQDYIGHHLNNLQLDLRTFSLVDPQNPPATFWTINIDSMFFSVVLGLLFLVLFRSVAKKATSG
VPGKFQTAIELVIGFVNGSVKDMYHGKSKLIAPLALTIFVWVFLMNLMDLLPIDLLPYIAEHVLGLPALR
VVPSADVNVTLSMALGVFILILFYSIKMKGIGGFTKELTLQPFNHWAFIPVNLILEGVSLLSKPVSLGLR
LFGNMYAGELIFILIAGLLPWWSQWILNVPWAIFHILIITLQAFIFMVLTIVYLSMASEEH
>yersatpB
MSASGEISTPRDYIGHHLNHLQLDLRTFELVNPHSTGPATFWTLNIDSLFFSVVLGLAFLLVFRKVAASA
TSGVPGKLQTAVELIIGFVDNSVRDMYHGKSKVIAPLALTVFVWVLLMNMMDLLPIDLLPYIGEHVFGLP
ALRVVPTADVSITLSMALGVFILIIFYSIKMKGVGGFTKELTMQPFNHPIFIPVNLILEGVSLLSKPLSL
GLRLFGNMYAGELIFILIAGLLPWWSQWMLSVPWAIFHILIITLQAFIFMVLTIVYLSMASEEH
>hinfatpB
MSGQTTSEYISHHLSFLKTGDGFWNVHIDTLFFSILAAVIFLFVFSRVGKKATTGVPGKMQCLVEIVVEW
VNGIVKENFHGPRNVVAPLALTIFCWVFIMNAIDLIPVDFLPQFAGLFGIHYLRAVPTADISATLGMSIC
VFFLILFYTIKSKGFKGLVKEYTLHPFNHWAFIPVNFILETVTLLAKPISLAFRLFGNMYAGELIFILIA
VMYSANMAIAALGIPLHLAWAIFHILVITLQAFIFMMLTVVYLSIAYNKADH

Output files for phylogenetic exercises

Exercise 1
DNA parsimony algorithm, version 3.573c

Name         Sequences
----         ---------
seq1         AAG
seq2         ..A
seq3         GGA
seq4         .GA

One most parsimonious tree found:

           +--seq4
        +--3
     +--2  +--seq3
     !  !
   --1  +-----seq2
     !
     +--------seq1

remember: this is an unrooted tree!

requires a total of 3.000

steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       1   1   1

From    To      Any Steps?   State at upper node
                             ( . means same as in the node below it on tree)
        1                    AAR
1       2       maybe        ..A
2       3       yes          .G.
3       seq4    no           ...
3       seq3    yes          G..
2       seq2    no           ...
1       seq1    maybe        ..G

Exercise 2a
DNA parsimony algorithm, version 3.573c

One most parsimonious tree found:

        +--------gorillaexx
     +--2
     !  !  +-----homosapien
     !  +--3
   --1     !  +--orangutang
     !     +--4
     !        +--gibbonxxxx
     !
     +-----------chimpanzee

remember: this is an unrooted tree!

requires a total of 330.000

Exercise 2b
Neighbor-Joining/UPGMA method version 3.573c

           +-------gibbonxxxx
        +--1
     +--2  +-----orangutang
     !  !
     !  +---gorillaexx
     !
   --3-homosapien
     !
     +--chimpanzee

remember: this is an unrooted tree!

Between        And            Length
-------        ---            ------
   3             2            0.00318
   2             1            0.03598
   1          gibbonxxxx      0.12602
   1          orangutang      0.09198
   2          gorillaexx      0.05777
   3          homosapien      0.04015
   3          chimpanzee      0.05195

Exercise 2c (dnaml program)


Nucleic acid sequence Maximum Likelihood method, version 3.573c

Empirical Base Frequencies:

   A       0.30929
   C       0.32750
   G       0.10570
  T(U)     0.25751

Transition/transversion ratio = 2.000000
(Transition/transversion parameter = 1.653039)

        +-homosapien
     +--2
     !  !  +-----orangutang
     !  +--3
     !     +-------gibbonxxxx
     !
   --1---gorillaexx
     !
     +--chimpanzee

remember: this is an unrooted tree!

Ln Likelihood = -2514.48557

Examined 17 trees

Between        And            Length      Approx. Confidence Limits
-------        ---            ------      ------- ---------- ------
   1             2            0.01720     ( 0.00594,  0.02847) **
   2          homosapien      0.02875     ( 0.01514,  0.04262) **
   2             3            0.05455     ( 0.03466,  0.07469) **
   3          orangutang      0.09121     ( 0.06758,  0.11573) **
   3          gibbonxxxx      0.13271     ( 0.10424,  0.16187) **
   1          gorillaexx      0.06191     ( 0.04344,  0.08064) **
   1          chimpanzee      0.05097     ( 0.03402,  0.06806) **

 * = significantly positive, P < 0.05
** = significantly positive, P < 0.01

Exercise 2d
(Bootstrapping of parsimony)

Majority-rule and strict consensus tree program, version 3.573c

Species in order:

  gorillaexx
  homosapien
  orangutang
  gibbonxxxx
  chimpanzee

Sets included in the consensus tree

Set (species in order)     How many times out of 100.00

..**.                      100.00
.***.                       73.67

Sets NOT included in consensus tree:

Set (species in order)     How many times out of 100.00

..***                       20.17
.*..*                        6.17

CONSENSUS TREE:
the numbers at the forks indicate the number of times the group
consisting of the species which are to the right of that fork
occurred among the trees, out of 100.00 trees

                      +----gibbonxxxx
               +-100.0
          +-73.7      +----orangutang
          !    !
    +-100.0    +---------homosapien
    !     !
    !     +--------------chimpanzee
    !
    +-------------------gorillaexx

remember: this is an unrooted tree!

Exercise 3 (parsimony and neighbour joining methods)


Protein parsimony algorithm, version 3.573c

One most parsimonious tree found:

              +--homosapien
        +-----3
        !     +--gorillaexx
     +--2
     !  !     +--orangutang
   --1  +-----4
     !        +--gibbonxxxx
     !
     +-----------chimpanzee

remember: this is an unrooted tree!

requires a total of 57.000

Neighbor-Joining/UPGMA method version 3.573c

Neighbor-joining method

Negative branch lengths allowed

        +--------gibbonxxxx
     +--1
     !  +-----orangutang
     !
   --3-gorillaexx
     !
     !  +-chimpanzee
     +--2
        +-homosapien

remember: this is an unrooted tree!

Between        And            Length
-------        ---            ------
   3             1            0.02121
   1          gibbonxxxx      0.14023
   1          orangutang      0.08913
   3          gorillaexx      0.03381
   3             2            0.00696
   2          chimpanzee      0.02877
   2          homosapien      0.03076
