
Indian Ocean Rim 2017 Laboratory Haematology Congress

Clinical Implementation of Next-Generation Sequencing: Experience in Hong Kong

Ho-Wan Ip
Division of Haematology
Department of Pathology and Clinical Biochemistry
Queen Mary Hospital
Hong Kong
18th June 2017
Outline

Myeloid panel

Targeted RNA-seq for haematological malignancies

NGS panel for inherited haematological diseases

Bioinformatics consideration
Myeloid Panel
Panel Design

Panel of 67 genes of diagnostic, prognostic and therapeutic significance in myeloid neoplasms
Acute myeloid leukaemia (AML)
Myelodysplastic syndrome (MDS)
Myeloproliferative neoplasms (MPN)
Myelodysplastic/myeloproliferative neoplasms (MDS/MPN)
Clonal haemopoiesis of indeterminate potential (CHIP)

Hybridisation-based capture of all exons

Paired-end sequencing on an Illumina MiniSeq sequencer (7.5 Gb throughput per run)

Tumour sample: Peripheral blood or bone marrow aspirate


Matched normal sample: Buccal swab
Myeloid Panel
Genes in Myeloid Panel

ABL1, ANKRD26, ASXL1, ATM, ATRX, BCOR, BCORL1, BRAF, CALR, CBL, CBLB, CBLC, CDKN2A, CEBPA,
CREBBP, CSF3R, CUX1, CXCR4, DNMT3A, DDX41, ETNK1, ETV6, EZH2, FBXW7, FLT3, GATA1, GATA2, GNAS,
GNB1, HRAS, IDH1, IDH2, IKZF1, JAK2, JAK3, KDM6A, KIT, KMT2A, KRAS, KMT2B, KMT2D, MPL, MYD88,
NF1, NOTCH1, NPM1, NRAS, PDGFRA, PHF6, PPM1D, PTEN, PTPN11, RAD21, RUNX1, SETBP1, SETD2,
SETDB1, SF3B1, SMC1A, SMC3, SRSF2, STAG2, TET2, TP53, U2AF1, WT1, ZRSR2
Human genome is huge

[Figure: the genome]
Focused interrogation of the human genome: Targeted sequencing

[Figure: the genome, with the workflow steps listed below]
Fragmentation
Adaptor ligation
Hybridization capture
Sequencing
Paired-end sequencing

[Figure: a 400 bp fragment with a 150 bp sequence read taken from each end]
Massively parallel sequencing
Sequence alignment to reference genome

[Figure: sequence reads aligned to the reference genome]
Variant calling
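
The paired-end layout on the slides (a 150 bp read from each end of a 400 bp fragment) implies that the middle of each fragment is not read directly. A minimal Python sketch of that arithmetic follows. The function name is hypothetical; the 400 bp and 150 bp figures are taken from the slide.

```python
def unsequenced_gap(fragment_bp: int, read_bp: int) -> int:
    """Bases in the middle of a fragment not covered by either paired-end read.

    A negative result means the two reads overlap by that many bases.
    """
    return fragment_bp - 2 * read_bp


# Figures from the slide: 400 bp fragment, 150 bp read from each end.
gap = unsequenced_gap(400, 150)
print(f"Unsequenced middle section: {gap} bp")
```

With shorter fragments the same calculation goes negative, which indicates that the two reads of a pair overlap.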
Myeloid Panel
Type of Variants

Single nucleotide variants

Indels

Copy number variants


Bioinformatics Pipeline for Myeloid Panel

Read processing:
FASTQ -> fastQC, Trimmomatic -> FASTQ.trim -> BWA -> BAM -> Picard MarkDuplicates -> BAM.uniq -> GATK Indel Realignment -> BAM.indel -> GATK Base Recalibration -> BAM.final

Variant calling on BAM.final:
SNV & Indel: MuTect2, VarScan2, HaplotypeCaller
FLT3-ITD: Pindel
CNV (gene): VisCap
CNV (genome): CNVkit

Annotation: SnpEff, ANNOVAR
Population: ESP, 1000G, ExAC_nonTCGA, gnomAD
Significance: ClinVar, InterVar, COSMIC
Prediction: dbnsfp33a
In-house database: Variant classification, PMID, frequency

Output: Variant List
Performance of Myeloid Panel
Quality Metrics
Performance of Myeloid Panel
Quality Metrics: Depth of Coverage
Myeloid Panel: Variant Calling
Single Nucleotide Variants
Performance of Myeloid Panel
Analytical Sensitivity: Validation Using Reference Standard
Gene     Variant    Expected VAF   Measured VAF   No. of Alt Alleles   Total Depth
ABL1     p.T315I    2.5%           2.2%           29                   1289
NRAS     p.Q61H     2.5%           3.7%           27                   739
NRAS     p.Q61R     2.5%           1.5%           11                   739
IDH1     p.R132H    2.5%           2.1%           27                   1311
IDH2     p.R172K    2.5%           1.4%           24                   1693
FLT3     p.D835Y    2.5%           2.5%           37                   1477
KIT      p.D816V    2.5%           2.0%           29                   1444
KRAS     p.A146T    2.5%           2.8%           29                   1081
KRAS     p.G12C     2.5%           2.5%           29                   1165
KRAS     p.G12D     2.5%           2.7%           31                   1164
KRAS     p.G12S     2.5%           3.4%           40                   1165
KRAS     p.G13D     25.0%          21.2%          251                  1185
KRAS     p.Q61H     2.5%           4.3%           50                   1169
BRAF     p.V600E    8.0%           6.7%           73                   1088
BRAF     p.V600M    2.0%           1.9%           34                   1059
BRAF     p.V600R    2.0%           (1.2%)         13                   1088
PDGFRA   p.D842V    2.5%           2.9%           41                   1393
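
The measured VAF column can be related to the read counts beside it. The sketch below assumes that measured VAF is simply the number of alternate-allele reads divided by total depth; this is an assumption, not a statement of how the pipeline computed it, and a few rows in the table do not reproduce exactly under it. The function name is hypothetical; the three example rows are copied from the table.

```python
def measured_vaf(alt_reads: int, total_depth: int) -> float:
    """Variant allele fraction as alternate-allele reads over total depth."""
    if total_depth <= 0:
        raise ValueError("total depth must be positive")
    return alt_reads / total_depth


# (gene, variant, alt reads, total depth) copied from the validation table.
rows = [
    ("ABL1", "p.T315I", 29, 1289),
    ("KRAS", "p.G13D", 251, 1185),
    ("PDGFRA", "p.D842V", 41, 1393),
]

for gene, variant, alt, depth in rows:
    print(f"{gene} {variant}: {measured_vaf(alt, depth):.1%}")
```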
Myeloid Panel: Variant Calling
Indels
Spencer DH et al. Detection of FLT3 internal tandem duplication in targeted, short-read-length, next-generation sequencing data. J Mol Diagn 2013;15:81-93.
Myeloid Panel: Variant Calling
Copy Number Variant (CNV)
Myeloid Panel: Variant Calling
Copy Number Variant (CNV): Major Challenges in Cancer
Myeloid Panel
Classification of Sequence Variants

Class 1: Established as clinically actionable in the disease primary histology


Class 2: Established as clinically actionable in a different primary histology
Class 3
Variants in the gene are established as actionable in this primary histology
The current variant is not one of the recurrently reported ones
Functional prediction algorithms indicate the identified variant
Likely does modify protein function: Class 3A
May or may not modify protein function: Class 3B
Likely does not modify protein function: Class 3C
Class 4
Variants in the gene are established as actionable in a different primary histology
The current variant is not one of the recurrently reported ones
Functional prediction: Classify into Class 4A, Class 4B, Class 4C (Same scheme as Class 3)
Class 5: Unknown significance

Sukhai MA et al. A classification system for clinical relevance of somatic variants identified in molecular profiling of cancer. Genet Med 2016;18:128-136.
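
The five-class scheme above is essentially a decision sequence, which can be sketched in Python as follows. This is a minimal illustration of the slide's wording only, not the published implementation: the function name, parameter names and the three prediction labels ("likely", "uncertain", "unlikely") are hypothetical.

```python
from typing import Optional

# Maps a functional prediction to the A/B/C suffix used for Class 3 and 4.
_SUFFIX = {"likely": "A", "uncertain": "B", "unlikely": "C"}


def classify_variant(
    variant_actionable_here: bool,
    variant_actionable_elsewhere: bool,
    gene_actionable_here: bool,
    gene_actionable_elsewhere: bool,
    predicted_effect: Optional[str] = None,
) -> str:
    """Assign Class 1 to 5 following the order of the scheme on the slide."""
    if variant_actionable_here:
        return "Class 1"
    if variant_actionable_elsewhere:
        return "Class 2"
    if gene_actionable_here or gene_actionable_elsewhere:
        # Non-recurrent variant in an actionable gene: sub-class by prediction.
        if predicted_effect not in _SUFFIX:
            raise ValueError("predicted_effect must be likely, uncertain or unlikely")
        base = "Class 3" if gene_actionable_here else "Class 4"
        return base + _SUFFIX[predicted_effect]
    return "Class 5"


print(classify_variant(False, False, True, False, "likely"))
```

Checking the histology of the disease before any other histology mirrors the ordering of Class 1 before Class 2 and Class 3 before Class 4.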
Myeloid Panel
Interpretation of Sequence Variants

Population frequency: A variant with >1% frequency in the normal population can be assumed to have no clinical significance (benign)
1000 Genomes Project, Exome Sequencing Project, Exome Aggregation Consortium, gnomAD
Literature and database review (Class 1 and 2)
ClinVar: Database of clinical significance
COSMIC: Cancer database
PubMed, Google
Determine mechanism of implicated gene in causing disease phenotype
Gain of function
Loss of function
Function prediction software (Class 3 and 4)
Somatic variant
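
The population-frequency rule above (>1% in the normal population) lends itself to a simple filter. The sketch below is one possible reading: it flags a variant when any consulted database reports a frequency above the threshold, which is an assumption, as are the function name and the example frequencies.

```python
from typing import Dict, Optional

BENIGN_THRESHOLD = 0.01  # the >1% population frequency cut-off from the slide


def exceeds_population_threshold(
    frequencies: Dict[str, Optional[float]],
    threshold: float = BENIGN_THRESHOLD,
) -> bool:
    """True if any database reports an allele frequency above the threshold.

    Databases that do not list the variant are passed as None and ignored.
    """
    observed = [f for f in frequencies.values() if f is not None]
    return any(f > threshold for f in observed)


# Hypothetical frequencies for illustration only.
example = {"1000G": 0.034, "ESP": None, "ExAC": 0.029, "gnomAD": 0.031}
print(exceeds_population_threshold(example))
```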
Myeloid Panel
Result Reporting
Myeloid Panel: NGS Study in Myelofibrosis in Hong Kong
Patient Demographics

101 patients with myelofibrosis


PMF: 70
Post-PV MF: 14
Post-ET MF: 17
Median age: 60 (Range: 26-89)
Median duration of follow-up: 46 months (Range: 1-256 months)
International prognostic scoring system (IPSS)
Low: 11
Intermediate-1: 25
Intermediate-2: 23
High: 42
Targeted RNA-seq for Haematological Malignancies
Panel Design and Application

Panel of 271 genes that have been implicated in haematological malignancies

Hybridisation capture

Fusion detection: Cytogenetically cryptic or novel rearrangements


More information on potential drivers
Diagnosis of BCR-ABL1-like B lymphoblastic leukaemia (a provisional entity in WHO 2016 classification)

(Expression profiles of targeted genes)


Targeted RNA-seq for Haemic Malignancies
Bioinformatics Pipeline for Fusion Detection

FASTQ -> Trimmomatic -> FASTQ.trim

FASTQ.trim -> TopHat-Fusion -> BAM -> TopHat-Fusion-Post -> Shortlisted fusion genes
FASTQ.trim -> STAR-Fusion -> Shortlisted fusion genes


Targeted RNA-seq for Haemic Malignancies
Fusion Detection

FNDC3B RARA

Cheng CK et al. FNDC3B is another novel partner fused to RARA in the t(3;17)(q26;q21) variant of acute promyelocytic leukemia. Blood 2017;129:2705-2709.
NGS Panel for Inherited Haematological Diseases
Panel Design

367 genes implicated in inherited haematological diseases

Hybridization capture of all exons, with some genes including promoter regions

Curation process
Literature review
OMIM: Search of disease phenotype and associated genes
NGS Panel for Inherited Haematological Diseases
Disease Phenotypes

Inherited bone marrow failure syndromes
Thrombocytopenia, including macrothrombocytopenia
Platelet dysfunction
Microangiopathy
Red cell membranopathies, enzymopathies, haemoglobinopathies
Sideroblastic anaemia
Erythrocytosis
Megaloblastic anaemia
Paroxysmal nocturnal haemoglobinuria
Familial haemophagocytic lymphohistiocytosis
Neuroacanthocytosis
Coagulopathies
Thrombotic disorders
Warfarin sensitivity
Hyperfibrinolysis
Thrombocytosis
Bioinformatics Pipeline for Inherited Disease Panel

Read processing:
FASTQ -> fastQC, Trimmomatic -> FASTQ.trim -> BWA -> BAM -> Picard MarkDuplicates -> BAM.uniq -> GATK Indel Realignment -> BAM.indel -> GATK Base Recalibration -> BAM.final

Variant calling on BAM.final:
SNV & Indel: HaplotypeCaller
CNV (gene): VisCap

Annotation: SnpEff, ANNOVAR
Population: ESP, 1000G, ExAC, gnomAD
Significance: ClinVar, InterVar
Prediction: dbnsfp33a

Output: Variant List
Bioinformatics Considerations
Personnel

IT
Hardware & data storage
Software maintenance

Bioinformatics pipeline
Target-oriented interrogation of genomic data to answer clinical questions
Interpretation of sequence variants
Development of in-house variant database

Biologists
Development of sequencing platform and wet-bench procedures
Literature review
Result reporting
Bioinformatics Considerations
Possible Options for Analysis Platform

In-house bioinformatics platform


Open-source operating system and software
Flexible

Proprietary local platform


Easy implementation
Cost implication, less flexible

Online/cloud-based: Galaxy, commercial


Easy implementation
Flexible
Data transfer
Confidentiality issue?
Bioinformatics Considerations
Hardware

For targeted sequencing, personal computers can handle most, if not all, the data analysis
i5 or i7 CPU, or equivalent
8 GB memory

Our current system


Linux-based workstation
Dual Intel Xeon E5 CPUs
128 GB memory
8 TB on-board storage
Bioinformatics Considerations
Operating Systems and Use of Bioinformatics Software

Linux is the preferred platform for bioinformatics analysis


Numerous open-source software packages are developed for Linux
Drawback: Need to be acquainted with command-line operation

Command-line operation: Instructions for use are well documented for most established software


Example
bwa mem -M hg19.fasta sample_1.fastq.gz sample_2.fastq.gz > sample.sam

Installation and first use of software may need some troubleshooting
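
When commands like the bwa example above are run for many samples, it can help to build them from a script. The following Python sketch wraps the same command (same subcommand, same -M flag, same argument order) and redirects standard output to a SAM file, as the shell ">" does. The function names are hypothetical, and bwa is assumed to be installed and on the PATH.

```python
import subprocess
from typing import List


def bwa_mem_command(reference: str, read1: str, read2: str) -> List[str]:
    """Argument list equivalent to: bwa mem -M <reference> <read1> <read2>."""
    return ["bwa", "mem", "-M", reference, read1, read2]


def run_alignment(reference: str, read1: str, read2: str, out_sam: str) -> None:
    """Run bwa and write its standard output to out_sam (requires bwa on PATH)."""
    with open(out_sam, "w") as sam:
        subprocess.run(bwa_mem_command(reference, read1, read2), stdout=sam, check=True)


# Show the command for the file names used in the slide's example.
print(" ".join(bwa_mem_command("hg19.fasta", "sample_1.fastq.gz", "sample_2.fastq.gz")))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names, and check=True raises an error if bwa exits with a failure code.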
Clinical Implementation of NGS: Experience in Hong Kong
Summary

Development of genomic platforms for clinical application


Myeloid panel
Targeted RNA-seq for haemic malignancies
Panel for inherited haematological diseases

Strategies of variant detection in NGS: SNVs, indels, CNVs, fusion transcripts

Interpretation of sequence variants

Bioinformatics considerations when setting up genomic testing platform


Acknowledgement

Dept. of Pathology, Queen Mary Hospital
Dr. Clarence Lam
Dr. Jason So
Dr. Rock Leung
Mr. Tommy Tang

Dept. of Medicine, The University of Hong Kong
Dr. Harinder Gill

Funding
Hong Kong Blood Cancer Foundation
SK Yee Medical Foundation
Myeloid Panel: Variant Calling
FLT3 Internal Tandem Duplication

Present in 30% of AML

Insertion of 15 to 300 bp

Poses a significant challenge to the usual mapping and calling software

Spencer DH et al. Detection of FLT3 internal tandem duplication in targeted, short-read-length, next-generation sequencing data. J Mol Diagn 2013;15:81-93.
Myeloid Panel: Variant Calling
FLT3 Internal Tandem Duplication

[Figure: reads spanning an internal tandem duplication, aligned to the reference genome as soft-clipped reads]
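
Soft-clipped reads are recorded by aligners in the CIGAR string of each alignment, where the S operation marks bases at a read end that were not aligned. The Python sketch below shows how such reads could be recognised from a CIGAR string. It is an illustration only and not the method used by Pindel or by the pipeline described here; the function name and example strings are hypothetical.

```python
import re
from typing import Tuple

# One CIGAR operation: a length followed by an operation code from the SAM format.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar: str) -> Tuple[int, int]:
    """Return (left, right) soft-clipped lengths for a CIGAR string."""
    # Hard clips (H) can sit outside soft clips, so drop them first.
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return (0, 0)
    left = ops[0][0] if ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (left, right)


for cigar in ["150M", "100M50S", "30S120M"]:
    print(cigar, soft_clip_lengths(cigar))
```

Reads with a long clip on one side, clustered at the same position in FLT3, would be the kind of evidence the figure illustrates.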
Myeloid Panel: Variant Calling
Surveying the Copy Number of the Whole Genome Using Targeted Data
Myeloid Panel: NGS Study in Myelofibrosis in Hong Kong
Survival Analyses
Case Study: Patient TCA
History and Physical Examination

M/86d
Born in private hospital. 1st child. Parents not consanguineous.
Admitted from OPD for haemolysis after birth
Anaemia, severe jaundice
Required red cell transfusions and exchange transfusion
Case Study: Patient TCA
Investigation: Complete blood count
Case Study: Patient TCA
Investigation: Peripheral blood smear
Case Study: Patient TCA
Investigation: PK study & Sanger sequencing in family trio

Father
Normal CBC
PK-standard: Normal
PK-Low substrate: Normal
Heterozygous PKLR:p.R510Q

Mother
Normal CBC
PK-standard: Low
PK-Low substrate: Low
No apparent pathogenic mutation

Patient
Anaemia
Phenotypically PK deficiency
Appears to be homozygous PKLR:p.R510Q

Is this a hemizygous mutation resulting from inheritance of a deletion from the mother?
Is this due to uniparental disomy of a paternally inherited mutation?
Case Study: Patient TCA
Investigation by NGS: PKLR(NM_000298.5):c.1529G>A p.R510Q
Case Study: Patient TCA
Investigation by NGS: PKLR deletion
Case Study: Patient TCA
Summary of genetic findings

PKLR large deletion involving exon 4 to exon 11

Previously reported large deletion in PKLR involves exon 4 to exon 10


PKLR:c.283+1914_1434del5006

Proposed further studies


NGS testing of the mother: Further acquisition of experience
MLPA confirmation of the presence and extent of deletion: For validation
Gap-PCR for exon 4 to exon 10 deletion: For exclusion of the previous variant

Costa C et al. Severe hemolytic anemia in a Vietnamese family, associated with novel mutations in the gene encoding for pyruvate kinase. Haematologica 2005;90:25-30.
So CC et al. First reported case of prenatal diagnosis for pyruvate kinase deficiency in a Chinese family. Hematology 2011;16:377-379.
Case Study: Patient TCA
Summary of genetic findings

PKLR(NM_000298.5):c.1529G>A p.R510Q
Frequently occurring mutation, found primarily in north Europeans and Caucasian Americans

Functional protein with decreased stability toward heat


May explain the normal functional study in the heterozygous father

Baronciani L et al. Analysis of pyruvate kinase-deficiency mutations that produce nonspherocytic hemolytic anemia. Proc Natl Acad Sci U S A 1993;90:4324-4327.
Wang C et al. Human erythrocyte pyruvate kinase: characterization of the recombinant enzyme and a mutant form (R510Q) causing nonspherocytic hemolytic anemia. Blood 2001;98:3113-3120.
Bioinformatics Considerations
Possible Options for Analysis Platform

In-house bioinformatics platform


Free operating system, open-source software (most established software is free to use)
Highly flexible and up-to-date, tailor-made
Command-line operation, manage own software and databases

Proprietary local platform


Easy implementation
Cost implication, less flexible

Online/cloud-based: Galaxy, commercial


Easy implementation
Flexible options
Consideration on confidentiality of patient information
Bioinformatics Considerations
Operating Systems

Linux is the preferred platform for bioinformatics analysis


Numerous open-source software packages are developed for Linux
Drawback: Command-line operation

Standalone Linux: Free distributions available

Windows
Windows 10: Bash on Ubuntu on Windows
Dual boot
Virtual machine (not preferred)
Note: Text files created on Windows use different line endings (CRLF) from Linux (LF) and may need conversion before use with Linux tools

MacOS is Unix-based
Chang J. Rewarding bioinformaticians. Nature 2015;520:151-152.
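
One way to handle the Windows text-file issue noted above is to normalise line endings before analysis. The sketch below assumes the only difference is the Windows CRLF (\r\n) versus Linux LF (\n) line ending; the function name and the example content are hypothetical.

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Convert Windows CRLF line endings to Linux LF line endings."""
    return data.replace(b"\r\n", b"\n")


# Example: a two-line sample sheet saved on Windows.
windows_text = b"sample_1\r\nsample_2\r\n"
print(to_unix_line_endings(windows_text))
```

Working on bytes rather than decoded text avoids any automatic newline translation by Python while the file is read or written.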