
Nile University, Bioinformatics Group.

Cluster Computer For Bioinformatics Applications


Hisham Adel

2008

Prepared by:
Hisham Adel Hassan

Supervised by:
Dr. Mohamed Aboualhouda

Points

- Introduction.
- Cluster and Supercomputers.
- Cluster Types and Advantages.
- Our Cluster.
- Cluster Performance.
- Cluster Computer for Basic Problems.
- General Idea about Sequence Alignment.
- BLAST and Parallel BLAST Algorithm.
- Sequence Alignment and Parallel Sequence Alignment.
- Learned Skills.

Introduction


Cluster Definition
A cluster is a group of computers and servers, connected together, that act like a single system.

Each computer in the cluster is called a node.

A node contains one or more processors, RAM, a hard disk, and a LAN card. Nodes work in parallel. Performance can be increased by adding more nodes.


Cluster types
- Load-balancing cluster (used for Parallel BLAST).
- Computing cluster (used for parallel sequence alignment).
- High-availability (HA) cluster.

Cluster types: Load-balancing cluster
[Figure: diagram labeled "Task"]

Cluster types: Computing cluster
[Figure: diagram labeled "Task"]

Cluster types: High-availability cluster

Cluster advantages
Performance. Scalability. Maintenance. Cost.


Our Cluster
[Figure: network diagram of the cluster. Node 1, Node 2, Node 3 and Node 4 are arranged around a central switch; each node carries an "Internet" label.]

Our Cluster specification


Communication: 5-port 10/100 Mbps switch.
Processor and RAM:
- Master node: dual-core processor, 1.86 GHz, 1 GB RAM.
- Node 1: Pentium 4, 1 GB RAM.
- Node 2: Pentium 4, 1 GB RAM.
- Node 3: Pentium 4, 512 MB RAM.

Our Cluster specification (cont)


Operating system: openSUSE 10.3
http://software.opensuse.org/

Message-passing library: MPICH2
http://www.mcs.anl.gov/research/projects/mpich2/


Cluster performance is affected by:

1- Node speed.

2- The running program (sequential or parallel).

Running Program (sequential)
[Figure sequence: the program is shown as "Working" step by step until it finishes.]

Running Program (parallel)
[Figure sequence, in three steps:]

1- Data sent: the data is sent to the nodes.

2- Working: the nodes work at the same time.

3- Finished: each node finishes and returns its results, and the results are collected ("Get results").
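The parallel run above (data sent, nodes working, results collected) can be sketched in a few lines. This is only an illustration in Python, not the cluster's actual C/MPICH2 code: threads stand in for nodes, and `split_evenly`, `worker` and `run_parallel` are hypothetical names with a toy workload (summing squares).

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def worker(chunk):
    """Toy per-node job (an assumption for illustration): sum of squares."""
    return sum(x * x for x in chunk)


def run_parallel(items, nodes):
    """Mimic the three steps: send data, work in parallel, collect results."""
    chunks = split_evenly(items, nodes)               # step 1: data sent
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial = list(pool.map(worker, chunks))      # step 2: working
    return sum(partial)                               # step 3: get results


if __name__ == "__main__":
    print(run_parallel(list(range(1, 101)), 4))
```

On a real cluster the chunks would travel over the network, so the time spent sending data and collecting results adds to the pure working time.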


Sequence Alignment
Used to:

1- Compare sequences.

2- Search databases.

How to Align Two Sequences

Suppose we have two sequences, A A A C G A and A A T G A.
Let match = 1, mismatch = 0, gap = -1. They can be aligned as:

1-
A A A C G A
| | | | | |
A A T _ G A

Score = 3

2-
A A A C _ G A
| | | | | | |
A A _ _ T G A

Score = 1
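The two scores above can be checked column by column. The following is a minimal sketch in Python (the function name `alignment_score` is hypothetical), using the slide's values: match = 1, mismatch = 0, gap = -1, with `_` as the gap symbol.

```python
MATCH, MISMATCH, GAP = 1, 0, -1


def alignment_score(top, bottom, gap_char="_"):
    """Score an already-aligned pair of strings, one column at a time."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for a, b in zip(top, bottom):
        if a == gap_char or b == gap_char:
            score += GAP        # a gap in either sequence
        elif a == b:
            score += MATCH      # identical letters
        else:
            score += MISMATCH   # different letters
    return score


if __name__ == "__main__":
    print(alignment_score("AAACGA", "AAT_GA"))    # alignment 1
    print(alignment_score("AAAC_GA", "AA__TGA"))  # alignment 2
```

Alignment 1 has four matches, one mismatch and one gap (4 + 0 - 1 = 3); alignment 2 has four matches and three gaps (4 - 3 = 1).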

BLAST
(Basic Local Alignment Search Tool)
Used for searching sequence databases.

BLAST Algorithm
[Figure: high-scoring pairs (HSPs)]

BLAST Search Types

- BLASTN: compares a nucleotide query sequence against a nucleotide sequence database.
- BLASTP: compares an amino acid query sequence against a protein sequence database.
- TBLASTN: compares a protein query sequence against a nucleotide sequence database, translated in all six reading frames.
- BLASTX: compares a nucleotide query sequence, translated in all six reading frames, against a protein sequence database.

Why Do We Need to Parallelize BLAST?

Our Program: Parallel BLAST

Parallel BLAST (cont.)
Formatdb.c

Nucleotide sequence database:

formatdb -i DATABASE -p F

Protein sequence database:

formatdb -i DATABASE -p T

Parallel BLAST (cont.)
Linux_Cluster_BLASTALL.c

blastall -p SEARCH_TYPE -d DATABASE -i QUERY_FILE -o out.txt
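As an illustration of how blastall calls like the one above might be prepared for several nodes, here is a small Python sketch. The slides do not say how the work is divided, so the round-robin assignment of query files is an assumption, and `assign_queries` and `blastall_command` are hypothetical helpers. Only the blastall options shown on the slide (-p, -d, -i, -o) are used, and nothing is executed.

```python
def assign_queries(query_files, n_nodes):
    """Deal query files out to nodes in round-robin order (an assumed policy)."""
    buckets = [[] for _ in range(n_nodes)]
    for index, name in enumerate(query_files):
        buckets[index % n_nodes].append(name)
    return buckets


def blastall_command(program, database, query_file, out_file):
    """Build the argument list for one blastall run; it is not executed here."""
    return ["blastall", "-p", program, "-d", database,
            "-i", query_file, "-o", out_file]


if __name__ == "__main__":
    # File and database names below are placeholders, not taken from the slides.
    queries = ["q1.fa", "q2.fa", "q3.fa", "q4.fa"]
    for node, files in enumerate(assign_queries(queries, 3)):
        for name in files:
            cmd = blastall_command("blastn", "DATABASE", name, name + ".out")
            print("node", node, ":", " ".join(cmd))
```

Each node would then run its own list of commands against a local or shared copy of the formatted database.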


Results

Average of running 1000 queries, 1000 times.

[Figure: bar chart, nucleotide-nucleotide search. Y axis: Time (s), 0 to 1.8. X axis: Database (size): drosoph.nt (118.6 MB), Yeastnt (3.2 MB), month.htgs (573 MB), igseqnt (67.5 MB), Pdbnt (1.7 MB), mito.nt (3.2 MB). Series: 1 node; 3 nodes, query time; 3 nodes, query and communication time.]

Results (cont.)

Average of running 1000 queries, 1000 times.

[Figure: bar chart, amino acid-amino acid search. Y axis: Time (s), 0 to 90. X axis: Database (size): env_nr (1.6 GB), nr (573 MB), SwissProt (160 MB), Pdbaa (20 MB), Yeast.aa (3.2 MB). Series: 1 node, query time; 3 nodes, query time; 3 nodes, query and communication time.]

Results (cont.)

Average of running 1000 queries, 1000 times.

[Figure: bar chart, amino acid-nucleotide search. Y axis: Time (s), 0 to 90. X axis: Database (size): env_nr (1.6 GB), Swissprot (160 MB), nr (84.7 MB), Pdbaa (20.4 MB), yeast.aa (3.2 MB). Series: 1 node, query time; 3 nodes, query time only; 3 nodes, query and communication time.]

Conclusion about Parallel BLAST

Performance: better when using the cluster.

Scalability: run time decreases as more nodes are added.


Sequence Alignment
Comparing sequences.

Sequence Alignment
Introduction.

Sequence alignment benefits.

Sequence alignment types.

Needleman-Wunsch Algorithm

Why Do We Need to Parallelize Sequence Alignment?

Parallel Sequence Alignment Algorithm

Our Sequence Alignment Program

Pairwise alignment, built using the Needleman-Wunsch algorithm.
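For readers unfamiliar with Needleman-Wunsch, the following is a minimal sequential sketch in Python. It is not the program described in these slides (which was written in C and parallelized); the function name `needleman_wunsch` and the default scores (match = 1, mismatch = 0, gap = -1, the values used in the earlier alignment example) are assumptions made for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=0, gap=-1):
    """Global pairwise alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trace back from the bottom-right cell to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("_"); i -= 1
        else:
            top.append("_"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    best, top, bottom = needleman_wunsch("AAACGA", "AATGA")
    print(best)
    print(top)
    print(bottom)
```

The table has (n + 1) x (m + 1) cells, and each cell depends only on its upper, left and upper-left neighbours, which is what makes the computation a candidate for parallelization.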


Learned Skills
- Using the Linux operating system (openSUSE 10.3).
- Programming in C.
- Cluster computers and how to build one.
- MPICH2 for message passing between nodes.
- LaTeX.
- Teamwork and helping each other.
- Presentation skills.

Thank you for your time.

Hisham Adel
