
Batch Effects Correction for Metabolomics
Andrés G. Camacho-Bonet (1) and Wandaliz Torres-García (2), Ph.D.
(1) Department of Industrial Engineering, University of Puerto Rico, Mayagüez, PR, andres.camacho@upr.edu
(2) Department of Industrial Engineering, University of Puerto Rico, Mayagüez, PR, wandaliz.torres@upr.edu
Justification
CAR T-Cell Therapies
• What are T-cells?
  - White blood cells that circulate around our bodies, scanning for infections [1].
  - Kill infected cells and naturally eradicate cancer cells [1].
• What are CAR T-Cell therapies?

Challenge: What characteristics of these cells are important for the potency of the therapies?

Approach: Metabolomics characterization
[2]
Metabolites
• What are Metabolites?
Small molecules which are the reactants, intermediates, or products of
enzyme-mediated biochemical reactions [3]

• Metabolomic Characterization
Knowing which metabolites are present lets us understand the T-cell therapies at the micro scale:
• What has happened?
• What is happening?
[4]
Metabolomic Characterization
Methods
1. Liquid chromatography (LC)
2. Mass spectrometry (MS)
Justification for analysis
Data Type
1. Develop safer and more effective cancer therapies.
2. Understand the critical-to-quality metabolites for manufacturing, to make reliable medicine.
3. Reduce manufacturing costs by focusing on what matters.
[5]
Problem &
Objectives
Batch effects problem
• Data acquisition is highly sensitive to variation.
• If acquisition is performed in batches, samples cluster by batch, e.g., because of different operators, machines, or run times.
• Extracting insightful information is a challenge because the data are biased.
[Figure: two PCA plots [7]. Labels: "Sample of Batch 1", "Clustered by batches"]
Objectives
• Select a batch effect correction algorithm.
• Determine an analysis to detect the presence of batch effects before and after correction.
Methodology
Batch Correction Algorithm
LIMMA – Linear models for microarray data
• Variant of analysis of variance (ANOVA) [6].
• Removes any measurable technical variation not associated with the treatment condition or biological signal of interest [6].
• Fits a linear model with the known batch and treatment effects. The procedure essentially performs an ANOVA decomposition on the data and removes the variability associated with the batches while retaining the variability associated with the experimental design [6].
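
The linear-model idea described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the limma package itself: the function name `remove_batch_effect`, the samples-by-metabolites layout of `X`, and the choice of sum-to-zero coding for the batch columns are all assumptions made for the example.

```python
import numpy as np

def remove_batch_effect(X, batch, treatment):
    """Subtract fitted batch terms from X (rows = samples, columns = metabolites)."""
    X = np.asarray(X, dtype=float)
    batch = np.asarray(batch)
    treatment = np.asarray(treatment)
    n = X.shape[0]

    # Treatment columns: intercept plus one indicator per non-reference level.
    t_levels = np.unique(treatment)
    T = np.column_stack(
        [np.ones(n)] + [(treatment == lev).astype(float) for lev in t_levels[1:]]
    )

    # Batch columns in sum-to-zero coding (assumption): batch effects then
    # sum to zero across batches, so removing them does not shift the intercept.
    b_levels = np.unique(batch)
    B = np.zeros((n, len(b_levels) - 1))
    for j, lev in enumerate(b_levels[:-1]):
        B[batch == lev, j] = 1.0
    B[batch == b_levels[-1], :] = -1.0

    # Least-squares fit of every metabolite on [treatment | batch] at once.
    design = np.hstack([T, B])
    coef, *_ = np.linalg.lstsq(design, X, rcond=None)

    # Remove only the batch part of the fit; treatment terms are retained.
    return X - B @ coef[T.shape[1]:]
```

Because the treatment columns stay in the fitted model, variability explained by the experimental design is kept, while the part attributed to the batch columns is subtracted.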
Combat
• Fits a linear model, like LIMMA.
• Uses empirical Bayes to estimate the linear model parameters.
• Removes the components associated with the batch effect in the linear model.
Principal Component Analysis (PCA)
• Reduces the dimensionality of the data set, allowing most of the variability to be explained using fewer variables.
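
As a rough sketch of how PCA scores can be obtained, the example below uses a singular value decomposition of the mean-centered data. The function name `pca_scores` is hypothetical, and centering without unit-variance scaling is an assumption; the slides do not state which preprocessing was used.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PCA scores and explained-variance ratios for X (samples x metabolites)."""
    X = np.asarray(X, dtype=float)

    # Center each metabolite (column) at zero mean.
    Xc = X - X.mean(axis=0)

    # Rows of Vt are the principal directions; U * s gives the scores.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]

    # Fraction of total variance carried by each retained component.
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]
```

Plotting the first two columns of `scores`, colored by batch, gives the kind of scores plot used here to inspect batch clustering.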
PCA – Scores Plot
Correction Metric: Bhattacharyya distance
• Average distance between batches based on PCA scores

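$$D_{1,2} = \frac{1}{8}(\mu_1 - \mu_2)^{T}\,\Sigma^{-1}(\mu_1 - \mu_2) + \frac{1}{2}\ln\left(\frac{\det \Sigma}{\sqrt{\det \Sigma_1 \, \det \Sigma_2}}\right) \quad [8]$$

Where:
D1,2: distance between batch 1 and batch 2
µ1, µ2: means of batch 1 and batch 2
Σ1, Σ2: covariance matrices of batch 1 and batch 2
Σ = (Σ1 + Σ2) / 2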

• Interpretation: lower interbatch distance = lower batch effect
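
A possible way to compute this metric from PCA scores is sketched below, assuming the standard Bhattacharyya distance between two Gaussian distributions with the pooled covariance taken as the average of the two batch covariances. The function names `bhattacharyya_distance` and `mean_interbatch_distance` are illustrative, and averaging over all batch pairs is an assumption based on the "average distance between batches" wording.

```python
import numpy as np
from itertools import combinations

def bhattacharyya_distance(scores_1, scores_2):
    """Bhattacharyya distance between two batches given their PCA scores."""
    mu1, mu2 = scores_1.mean(axis=0), scores_2.mean(axis=0)
    S1 = np.atleast_2d(np.cov(scores_1, rowvar=False))
    S2 = np.atleast_2d(np.cov(scores_2, rowvar=False))
    S = (S1 + S2) / 2.0

    # Mahalanobis-like term on the difference of batch means.
    diff = mu1 - mu2
    mean_term = diff @ np.linalg.solve(S, diff) / 8.0

    # Log-determinant term comparing pooled and per-batch covariances.
    _, logdet = np.linalg.slogdet(S)
    _, logdet1 = np.linalg.slogdet(S1)
    _, logdet2 = np.linalg.slogdet(S2)
    cov_term = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return mean_term + cov_term

def mean_interbatch_distance(scores, batch):
    """Average the pairwise distance over all pairs of batches."""
    batch = np.asarray(batch)
    pairs = combinations(np.unique(batch), 2)
    dists = [bhattacharyya_distance(scores[batch == a], scores[batch == b])
             for a, b in pairs]
    return float(np.mean(dists))
```

Each batch needs more samples than score dimensions for its covariance matrix to be invertible, which is one reason to compute the distance on a small number of principal components.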


Correction Metric: Repeatability

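$$R = \frac{\sigma^2_{\text{between}}}{\sigma^2_{\text{between}} + \sigma^2_{\text{within}}} \quad [8]$$

Where:
$\sigma^2_{\text{between}}$: variance between batches
$\sigma^2_{\text{within}}$: variance within batches

Expected after correction:
• Lower value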
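
One way to estimate such a ratio is sketched below, assuming repeatability is the between-batch variance divided by the sum of between-batch and within-batch variance, computed per metabolite and then averaged. The specific estimators (variance of batch means, pooled within-batch variance) and the name `repeatability` are assumptions for illustration; the estimators used in [8] may differ.

```python
import numpy as np

def repeatability(X, batch):
    """Mean over metabolites of var_between / (var_between + var_within)."""
    X = np.asarray(X, dtype=float)
    batch = np.asarray(batch)
    levels = np.unique(batch)

    # Per-batch means, variances, and sizes for every metabolite.
    means = np.array([X[batch == g].mean(axis=0) for g in levels])
    variances = np.array([X[batch == g].var(axis=0, ddof=1) for g in levels])
    sizes = np.array([np.sum(batch == g) for g in levels])

    # Pooled within-batch variance, weighted by degrees of freedom.
    dof = (sizes - 1)[:, None]
    var_within = (dof * variances).sum(axis=0) / (sizes.sum() - len(levels))

    # Between-batch variance taken as the variance of the batch means.
    var_between = means.var(axis=0, ddof=1)

    return float(np.mean(var_between / (var_between + var_within)))
```

With this definition the value lies between 0 (all variation is within batches) and 1 (all variation is between batches).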
Results
Databases
• Two metabolomic datasets of Arabidopsis samples.
• These differ in batch effect degree and sample size.

Dataset  Stage          Metabolites (Columns)  Samples (Rows)  Batches
SET2     After Removal  165                    753             15
SET3     After Removal  32                     240             4
Set 2 Batch Effect Correction
[Figure panels: No Correction, LIMMA, Combat. Annotation: Repeatability (R): 0.36]
Set 3 Batch Effect Correction
[Figure panels: No Correction, LIMMA, Combat]
Evaluation metrics

Dataset  Correction Method  Interbatch Distance  % Reduction (Interbatch Distance)  Repeatability  % Change (Repeatability)
Set 2    Uncorrected        104.19               -                                  0.283          -
Set 2    Limma              0.186                100%                               0.36           27%
Set 2    Combat             20.05                81%                                0.275          -3%
Set 3    Uncorrected        0.387                -                                  0.346          -
Set 3    Limma              0.076                80%                                0.361          4%
Set 3    Combat             0.022                100%                               0.363          5%
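
The percentage columns appear to be relative changes against the uncorrected value. The sketch below shows that arithmetic for the Set 2 rows; the helper names `percent_reduction` and `percent_change` are hypothetical, and the rounding to whole percentages is an assumption about how the table was prepared.

```python
def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

def percent_change(before, after):
    """Relative change from `before` to `after`, in percent (negative = decrease)."""
    return 100.0 * (after - before) / before

# Set 2 values taken from the table above.
print(round(percent_reduction(104.19, 20.05)))  # Combat, interbatch distance
print(round(percent_change(0.283, 0.36)))       # Limma, repeatability
print(round(percent_change(0.283, 0.275)))      # Combat, repeatability
```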

Recommended correction method: Limma


• Reduced the interbatch distance significantly in both datasets, which differ in their degree of batch effect
• Did not induce significant within-batch variability
References
• [1] Designs, J. (n.d.). Beginners Guide to T cells. Retrieved from http://www.tcells.org/beginners/tcells/
• [2] NCI Dictionary of Cancer Terms. (n.d.). Retrieved from
https://www.cancer.gov/publications/dictionaries/cancer-terms/def/car-t-cell-therapy
• [3] Metabolite. (n.d.). Retrieved from https://www.sciencedirect.com/topics/medicine-and-dentistry/metabolite
• [4] Designs, J. (n.d.). Beginners Guide to T cells. Retrieved from http://www.tcells.org/beginners/tcells/
• [5] Liquid chromatography–mass spectrometry. 2019, April 09. Retrieved from
https://en.wikipedia.org/wiki/Liquid_chromatography–mass_spectrometry
• [6] Salerno, S., Mehrmohamadi, M., Liberti, M., Wan, M., Wells, M., Booth, J., Locasale, J. RRmix: A method for simultaneous batch effect correction and analysis of metabolomics data in the absence of internal standards.
• [7] Fernandez, Facundo M. Batch effects_v2.2. 10 Dec. 2018
• [8] Wehrens, Ron. “Improved Batch Correction in Untargeted MS-Based Metabolomics.” DOI
10.1007/s11306-016-1015-8
