Introduction to Bioinformatics
Bioinformatics
Medical informatics
Applying computational techniques to medical data
Chemo-informatics
Applying computational techniques to chemical data
What is bioinformatics?
Bioinformatics Data
DNA Data
Structural Genomics
Predictive toxicology
Drug trial Data
Medical diagnosis
Patient Record Data
TDQAAFDTNIVTLTRFVM
EQGRKARGTGEMTQLLNS
LCTAVKAISTAVRKAGIA
HLYGIAGSTNVTGDQVKK
LDVLSNDLVINVLKSSFA
TCVLVTEEDKNAIIVEPE
KRGKYVVCFDPLDGSSNI
DCLVSIGTIFGIYRKNST
DEPSEKDALQPGRNLVAA
GYALYGSATMLV
sequence
protein functions
Basic concepts
> 500,000 genes sequenced to date
Expected number of unique protein structures: ~700-1,000
nucleic acids
proteins
Genome
regulatory
transcripts
sites
Protein population: proteomics
Genome activation patterns: transcriptomics
coding regions
One-to-many mappings!
Organisation:
tissue imaging
EM
Context-dependence!
X-ray, NMR
cells
molecular complexes
Perturbation
Living cell
Dynamic response
External environment
Internal environment
Biological knowledge (computerised)
Sequence information
Basic principles
Virtual cell
Practical applications
Metabolic net
Genetic networks
Structural information
Bioinformatics
DNA hRNA
mRNAs
proteins
Mathematical modelling
Simulation
Transcription
The Central Dogma
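The flow from DNA to mRNA to protein named by the Central Dogma can be illustrated with a minimal sketch. It assumes the standard genetic code and a coding-strand DNA input, and it ignores intermediate RNA processing. The function names `transcribe` and `translate` are illustrative and do not come from the slides.

```python
from itertools import product

# Assumption: the standard genetic code, with codons enumerated in
# T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy input, not taken from the slides.
mrna = transcribe("ATGGCCTGGTAA")
protein = translate(mrna)
print(mrna, protein)
```

Reading starts at the first base here; a real pipeline would first locate the open reading frame.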
Bioinformatics in context
Mathematics/computer science
Genomics
DNA
transcription
RNA
Molecular biology
Bioinformatics
Biophysics
translation
Proteins
Ethical, legal, and social implications
Molecular evolution
Biological databases
The challenge
Searching Databases
We have ways to score how well two sequences match
Now we want to use this scoring to search databases
Given a known gene sequence:
Which genes in the database are closely related?
Multiple sequence alignment algorithms
Profiles
PSI-BLAST
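The database search idea above (score a query against every stored sequence, then rank the results) can be sketched as follows. This is a simplified illustration, not the method used by PSI-BLAST or any other real tool: it assumes a plain global alignment score with made-up parameters (match +1, mismatch -1, gap -1), and `toy_db` is a hypothetical in-memory database.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score by dynamic programming, one row at a time."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def search(query, database, top=3):
    """Score the query against every entry and return the best hits first."""
    scored = [(alignment_score(query, seq), name) for name, seq in database.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top]]


# Hypothetical toy database; names and sequences are for illustration only.
toy_db = {
    "geneA": "AATGCGTATAGGC",
    "geneB": "AATGCGTTTAGGC",
    "geneC": "CCCCCCCC",
}
hits = search("AATGCGTATAGGC", toy_db)
for name, score in hits:
    print(name, score)
```

Scoring every entry exhaustively is only practical for small collections, which is one reason real search tools rely on heuristics.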
Data types
primary data
sequence
AATGCGTATAGGC
DNA
DMPVERILEALAVE
amino acid
secondary data
motifs: regular expressions, blocks, profiles, fingerprints
tertiary data
atomic co-ordinates
secondary
protein structure
primary database
secondary db
tertiary protein
structure
tertiary db
EMBL
Europe
Protein
Nucleic acid
EMBL
GenBank
DDBJ (DNA Data Bank of Japan)
PIR
MIPS
SWISS-PROT
TrEMBL
NRL-3D
Swiss-Prot
EMBL
EBI
International Advisory Meeting
USA
NLM
NCBI
Collaborative Meeting
TrEMBL
DDBJ
NRDB
Japan
NIG
CIB
By text
By sequence
By title
By author
By query language
By regular expression
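As a small illustration of searching by regular expression, the sketch below scans a set of protein sequences for a motif written as a Python regular expression. The motif `L[EA]AL`, the record names, and the helper `find_motif` are hypothetical choices made for this example; real motif databases use their own pattern syntax.

```python
import re


def find_motif(pattern, records):
    """Return (name, start position, matched text) for every motif hit."""
    regex = re.compile(pattern)
    hits = []
    for name, seq in records.items():
        for m in regex.finditer(seq):  # non-overlapping matches only
            hits.append((name, m.start(), m.group()))
    return hits


# Hypothetical records; the sequences are short peptide examples.
records = {
    "example_peptide": "DMPVERILEALAVE",
    "other_peptide": "GYALYGSATMLV",
}
motif_hits = find_motif(r"L[EA]AL", records)
for name, start, text in motif_hits:
    print(name, start, text)
```

Positions are 0-based, following Python string indexing.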
Machine Learning
Machine learning (inductive reasoning)
Automatic proposing of hypotheses based on data
Has many applications in bioinformatics
Including protein structure prediction
Define errors
Use statistics to define confidence intervals
Show that one learning algorithm