Brain Mri

Accurate white matter lesion segmentation by k nearest neighbor
classification with tissue type priors (kNN-TTPs)

Martijn D. Steenwijk (a),
Petra J.W. Pouwels (b),
Marita Daams (a, c),
Jan Willem van Dalen (d),
Matthan W.A. Caan (e),
Edo Richard (d),
Frederik Barkhof (a),
Hugo Vrenken (a, b)
a Department of Radiology and Nuclear Medicine, Neuroscience Campus Amsterdam, VU University
Medical Center, Amsterdam, The Netherlands
b Department of Physics and Medical Technology, Neuroscience Campus Amsterdam, VU University
Medical Center, Amsterdam, The Netherlands
c Department of Anatomy and Neurosciences, Neuroscience Campus Amsterdam, VU University Medical
Center, Amsterdam, The Netherlands
d Department of Neurology, Academic Medical Centre Amsterdam, The Netherlands
e Department of Radiology, Academic Medical Centre Amsterdam, The Netherlands
Highlights

Intensity normalization has a large influence on lesion segmentation performance.
Inclusion of tissue type priors as features increases segmentation performance.
Best performance was achieved using variance scaling and tissue type priors.
Abstract
Introduction
The segmentation and volumetric quantification of white matter (WM) lesions play an
important role in monitoring and studying neurological diseases such as multiple
sclerosis (MS) or cerebrovascular disease. This is often interactively done using 2D
magnetic resonance images. Recent developments in acquisition techniques allow for
3D imaging with much thinner sections, but the large number of images per subject
makes manual lesion outlining infeasible. This creates a need for a reliable
automated approach. Here we aimed to improve k nearest neighbor (kNN)
classification of WM lesions by optimizing intensity normalization and using spatial
tissue type priors (TTPs).
Methods
The kNN-TTP method used kNN classification with 3.0 T 3DFLAIR and 3DT1 intensities
as well as MNI-normalized spatial coordinates as features. Additionally, TTPs were
computed by nonlinear registration of data from healthy controls. Intensity features
were normalized using variance scaling, robust range normalization or histogram
matching. The algorithm was then trained and evaluated using a leave-one-out
experiment among 20 patients with MS against a reference segmentation that was
created completely manually. The performance of each normalization method was
evaluated both with and without TTPs in the feature set. Volumetric agreement was
evaluated using the intraclass correlation coefficient (ICC), and voxelwise spatial
agreement was evaluated using the Dice similarity index (SI). Finally, the robustness of the method across
different scanners and patient populations was evaluated using an independent
sample of elderly subjects with hypertension.
Results
The intensity normalization method had a large influence on the segmentation
performance, with average SI values ranging from 0.66 to 0.72 when no TTPs were
used. Independent of the normalization method, the inclusion of TTPs as features
increased performance, particularly by reducing the lesion detection error. Best
performance was achieved using variance scaled intensity features and including TTPs
in the feature set: this yielded ICC = 0.93 and average SI = 0.75 ± 0.08. Validation of
the method in an independent sample of elderly subjects with hypertension yielded
even higher ICC = 0.96 and SI = 0.84 ± 0.14.
Conclusion
Adding TTPs increases the performance of kNN based MS lesion segmentation
methods. Best performance was achieved using variance scaling for intensity
normalization and including TTPs in the feature set, showing excellent agreement with
the reference segmentations across a wide range of lesion severity, irrespective of the
scanner used or the pathological substrate of the lesions.
Keywords
Segmentation;
White matter lesions;
Multiple sclerosis;
Cerebrovascular disease;
MRI
1. Introduction
Focal white matter (WM) pathology in the brain has been associated with various
disorders, including multiple sclerosis (MS), cerebrovascular disease and dementia.
Magnetic resonance imaging (MRI) plays a key role in diagnosing, monitoring and
studying these diseases (Polman et al., 2011 and Provenzano et al., 2013). Perhaps one
of the most important contributions of MRI is that it can be used to visualize lesions in
the WM. Treatment effects are studied in clinical trials by counting these lesions and
quantifying their volumes through lesion segmentation, and epidemiological studies
are performed to understand how the lesions affect the brain (Kappos et al.,
2007 and Mortamais et al., 2013).
Quantification of white matter lesions (WMLs) is traditionally performed by visual
rating or manual outlining on 2D proton density (PD) weighted, T2-weighted, or fluid
attenuated inversion recovery (FLAIR) images with slice thicknesses of 3 mm or more
(Fazekas et al., 1987, Olsson et al., 2013 and Schoonheim et al., 2012). Recent advances
in acquisition techniques enable 3D imaging with much better spatial resolution,
typically around 1 mm isotropic. The much larger number of images per subject makes
manual outlining of lesions infeasible and creates a need for reliable automated
lesion segmentation techniques.
A number of automated WML segmentation techniques have been described
(Mortazavi et al., 2012). Based on the performance reported in literature and the
explicit use of a priori information, we selected the k-nearest neighbor (kNN) method
described by Anbeek et al. (2004) as a starting point for our method. kNN classification
is a supervised pattern recognition technique, which performs segmentation by
comparing new data to a collection of labeled examples in a training set. For each new
voxel to be classified, the algorithm computes the probability of the voxel being a
lesion, by determining the fraction of k nearest neighbors that were labeled as a lesion
in the feature space of the training set. Previous studies showed that kNN classification
provides good WML segmentation results when both signal intensities and spatial
coordinates are used as features ( Anbeek et al., 2004 and Anbeek et al., 2005).
Here, we sought to improve on the method by Anbeek et al., first, by adding GM, WM
and CSF tissue type priors (TTPs) derived from healthy controls to allow the inclusion
of anatomical information and reduce the number of false positive voxels. The use of
such tissue type information has been shown to improve WML segmentation in
previous studies (Schmidt et al., 2012). Second, we optimize the method of signal
intensity normalization by comparing different normalization strategies. We trained
and evaluated the method in patients with MS and elderly subjects with hypertension
using manually developed reference segmentations, constructed by expert raters who
perform these segmentations routinely.
The aim of the present study was to quantify the effect of adding TTPs and optimizing
intensity normalization on the performance of kNN WML classification. This was done
by measuring the segmentation performance (i.e. spatial correspondence with the
manual reference segmentation) of kNN-TTP with various intensity normalization
methods, using a leave-one-out approach in a sample of MS patients. Finally, the
robustness of the method across different scanners and patient populations was
studied by applying it in an independent sample of elderly subjects with hypertension.
2. Materials and methods
2.1. Subjects
We primarily investigated MR images of patients with clinically definite MS and healthy
controls who were part of a larger cohort. The validation sample consisting of elderly
subjects with hypertension will be described in the section Validation in an
independent cohort of elderly subjects with hypertension below.
The institutional ethics review board approved the study and all subjects gave written
informed consent prior to participation. From a larger study cohort, we selected a
subset of 20 patients with MS showing a wide variety of pathology in terms of lesion
burden. Their ages varied between 29 and 67 years (mean age: 52.5 ± 7.7 years), and
13 of them were women. Disease severity was measured on the day of scanning using
the expanded disability status scale (EDSS) (Kurtzke, 1983). The median EDSS score was
4, ranging between 2.5 and 8.0. From the same cohort we randomly selected the MR
images of 16 healthy controls (mean age: 51.7 ± 5.8 years, 8 of them were women) for use as
an atlas in the TTP creation step of the segmentation method (see details below).
2.2. MR imaging
MR imaging was performed on a 3.0 T whole body scanner (GE Signa HDxt,
Milwaukee, WI, USA) using an eight-channel phased-array head coil. The protocol
contained, among others, two 3D sequences: a fat-saturated 3DFLAIR (TR: 8000 ms, TE:
125 ms, TI: 2350 ms, 250 × 250 mm² field of view (FOV), 132 sagittal slices of 1.2 mm
thickness, 0.98 × 0.98 mm² in-plane resolution) for lesion detection, and a 3DT1
weighted fast spoiled gradient echo (FSPGR) sequence (TR: 7.8 ms, TE: 3 ms, FA: 12°,
240 × 240 mm² FOV, 176 sagittal slices of 1 mm thickness, 0.94 × 0.94 mm² in-plane
resolution) for anatomical information.
2.3. Manual reference segmentation

A reference WML segmentation was constructed manually using the 3DFLAIR and
3DT1 images. Before constructing the reference segmentation, the 3DT1 image of each
subject was rigidly registered to its respective 3DFLAIR image using FLIRT, which is part
of the FMRIB Software Library (FSL 5.0.2) (Jenkinson and Smith, 2001). Subsequently,
both 3DT1 and 3DFLAIR images were orthogonally reformatted to the axial plane,
which resulted in 256 slices with a thickness of 0.94 mm for each dataset.
The axially reformatted images were then used to identify and outline the WMLs.
Lesion identification was performed by three raters in consensus (two PhD students
with two years of experience each and an experienced neuroradiologist) using the
3DFLAIR images, while the raters were allowed to view the corresponding
co-registered 3DT1 image. Lesions were only identified if they were larger than 3 voxels
in-plane and visible on at least two consecutive slices. In the next step, two trained
technicians manually outlined the identified lesions on the 3DFLAIR using MIPAV
(http://mipav.cit.nih.gov). Each technician was randomly assigned to 10 of the 20
patients, and outlined the identified WMLs on each slice. The 20 reference
segmentations thus produced were used to train and evaluate the automatic lesion
segmentation algorithm.
To assess interobserver reliability of the manual segmentations, each technician also
outlined six randomly selected consecutive slices of each subject assigned to the other
technician. Furthermore, both technicians outlined twenty consecutive slices of one of
the subjects for a second time during the project, to obtain information about
intraobserver reliability.
2.4. Automatic white matter lesion segmentation
kNN classification compares new data with a collection of examples (i.e. the training
set) in a feature space. In this feature space, each voxel is characterized by 3DFLAIR
intensity, 3DT1 intensity, MNI-normalized spatial coordinates and tissue type
probability. Based on the manual reference segmentations, the voxels in the training
set are labeled as being lesion or not. The algorithm classifies a new voxel based on
the labels of its neighbors in feature space. The full algorithm consists of five stages,
namely image preprocessing, feature extraction, feature normalization, classification
and post-processing. These different stages are discussed in the following sections.
2.5. Image preprocessing
First, non-brain tissue was removed from the co-registered 3DT1 image using the FSL
brain extraction tool (BET) (Smith, 2002), using standardized parameters for brain
extraction, including bias field correction and robust brain center estimation as
recommended by Popescu et al. (2012). The resulting brain mask was also applied to
the 3DFLAIR image. Finally, radio frequency (RF) field inhomogeneity correction was
performed on both images using the N3 algorithm (Sled, 1997).
2.6. Feature extraction
The features used for kNN classification in the current study were: 3DFLAIR and 3DT1
signal intensity, MNI-normalized spatial coordinates x, y and z, and tissue type
probabilities pCSF, pGM, and pWM (see Fig. 1).
Fig. 1.
Features used for the kNN classification: 3DFLAIR intensity (A), MNI-normalized spatial
coordinate x (B), spatial coordinate y (C), spatial coordinate z (D), 3DT1 intensity (E), pCSF (F), pGM
(G), and pWM (H).
The normalized spatial coordinates x, y, and z were derived by linear registration of
the 3DT1 image to MNI space using FLIRT. By applying the inverse transformation, the
voxelwise corresponding MNI coordinates were subsequently warped back to
subject-space. This resulted in x, y, and z features comparable between subjects.
The TTPs were obtained using a procedure commonly referred to as multi-atlas
segmentation (Aljabar et al., 2009), as follows. For the 3DT1 images of the 16 healthy
control subjects, voxelwise hard segmentations of CSF, GM and WM were generated
using FSL-FAST (Zhang et al., 2001). Then the 3DT1 image of each healthy control was
non-linearly registered to the 3DT1 image of the subject of interest using Elastix, which
involved an affine and a B-spline transformation, both using mutual information as
cost function, gradient descent optimizers, a four-stage pyramidal approach and a final
control point resolution of 2.5 mm (Klein et al., 2010). The resulting transformations
were applied to the voxelwise CSF, GM and WM segmentations using nearest
neighbor interpolation. Then for each voxel the probability of being CSF, WM or GM
was estimated by computing the frequency of the respective tissue class in the
registered segmentations ( Aljabar et al., 2009 and De Boer et al., 2009).
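The frequency-based estimate described above can be illustrated with a minimal sketch. It assumes that the registered hard segmentations are already available as one list of tissue labels per healthy control, with the same voxel ordering for every control; the function name `tissue_type_priors` and the toy data are illustrative and not part of the original method description.

```python
from collections import Counter

TISSUES = ("CSF", "GM", "WM")

def tissue_type_priors(registered_labels):
    """Estimate per-voxel tissue type priors from registered hard segmentations.

    registered_labels: list with one entry per healthy control; each entry is a
    list of tissue labels ("CSF", "GM" or "WM"), one per voxel, already warped
    to the subject of interest (assumed input format).
    Returns one dict per voxel mapping tissue name to its relative frequency.
    """
    n_atlases = len(registered_labels)
    n_voxels = len(registered_labels[0])
    priors = []
    for v in range(n_voxels):
        # Count how often each tissue class occurs at this voxel across controls.
        counts = Counter(atlas[v] for atlas in registered_labels)
        priors.append({t: counts[t] / n_atlases for t in TISSUES})
    return priors

# Toy example: 4 registered controls, 3 voxels.
atlases = [
    ["WM", "GM", "CSF"],
    ["WM", "GM", "GM"],
    ["WM", "WM", "CSF"],
    ["WM", "GM", "CSF"],
]
priors = tissue_type_priors(atlases)
print(priors[1])  # pCSF, pGM and pWM for the second voxel
```

With 16 controls, as in the current study, each prior can take 17 distinct values (0/16 to 16/16).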
2.7. Feature normalization
As different features have different ranges, the features should be normalized to
obtain meaningful distances in feature space for selecting the k nearest neighbors. A
common way of feature normalization is variance scaling, which subtracts the
within-subject mean feature value from each voxel's feature value and divides the result by
the within-subject standard deviation, resulting in zero mean and unit variance in the
normalized feature set. This approach, however, may be sensitive to differences in
feature distribution, such as signal intensity distribution differences between patients
with different lesion loads. We therefore also investigated the effect of two other
feature normalization strategies which might be less sensitive to differences in feature
distribution between subjects, namely robust range normalization (De Boer et al.,
2009) and histogram matching ( Lao et al., 2008 and Younis et al., 2008). Robust range
normalization linearly scales a feature such that the 4th percentile of the histogram is
matched to value 0 and the 96th percentile is matched to value 1. Histogram matching
finds, for each new patient, the linear transformation that maximizes the overlap
between the normalized histogram of the transformed feature and the normalized
histogram of a reference histogram. This reference histogram is selected by finding the
most typical histogram among the subjects, and scaling this between zero and one
using robust range normalization. The histogram overlap was maximized using
Genetic Algorithms, as described in Younis et al. (2008).
Since we expected non-intensity feature distributions to be relatively constant, all
non-intensity features were always scaled using variance scaling.
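Variance scaling and robust range normalization, as described above, can be sketched in a few lines. This is an illustration under stated assumptions rather than the implementation used in the study: the population standard deviation and a linearly interpolated percentile are choices made here, and the function names are hypothetical. Histogram matching is omitted because it depends on the genetic-algorithm optimization of Younis et al. (2008).

```python
import statistics

def variance_scale(values):
    """Subtract the within-subject mean and divide by the within-subject SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (an assumption of this sketch)
    return [(v - mean) / sd for v in values]

def percentile(sorted_values, q):
    """Linearly interpolated percentile q (0-100) of an already sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def robust_range_scale(values, low_q=4.0, high_q=96.0):
    """Map the 4th percentile to 0 and the 96th percentile to 1 (linear scaling)."""
    ordered = sorted(values)
    low = percentile(ordered, low_q)
    high = percentile(ordered, high_q)
    return [(v - low) / (high - low) for v in values]

# Toy intensity feature for one subject.
flair = [float(v) for v in range(101)]
flair_vs = variance_scale(flair)
flair_rr = robust_range_scale(flair)
print(flair_rr[4], flair_rr[50], flair_rr[96])
```

Values outside the 4th to 96th percentile range map below 0 or above 1; the sketch does not clip them.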
2.8. Classification
The probability that a new voxel is a lesion was defined as the fraction of the k nearest
examples that were labeled as being a lesion in the training set. This can be converted
to a binary segmentation by applying a threshold p to the probability map. Based on
values used in the literature (Anbeek et al., 2008), k was set to 40 in the current study.
Using a leave-one-out procedure, a probability map was computed for each patient.
Subsequently, the optimal threshold was determined by applying different
thresholds p = 0.05, 0.10, ..., 0.95 to each probability map, and calculating the SI of the
resulting binary segmentations with the manual reference segmentation. The
threshold p resulting in the highest average SI across the 20 datasets was selected as
the optimal threshold.
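The probability definition above can be made concrete with a brute-force sketch. It assumes Euclidean distance in the normalized feature space and a "greater than or equal" comparison for the threshold p; neither detail is specified in the text, and the function name and toy data are illustrative only. A practical implementation would use an approximate or tree-based neighbor search rather than sorting all training examples.

```python
import math

def knn_lesion_probability(query, train_features, train_labels, k):
    """Fraction of the k nearest training examples that are labeled as lesion.

    query: feature vector of the voxel to classify.
    train_features: list of feature vectors from the training set.
    train_labels: list of 0 (non-lesion) or 1 (lesion), same order as features.
    """
    # Brute-force search: sort all training examples by distance to the query.
    by_distance = sorted(
        (math.dist(query, feat), label)
        for feat, label in zip(train_features, train_labels)
    )
    nearest = by_distance[:k]
    return sum(label for _, label in nearest) / k

# Toy two-feature training set: three non-lesion and three lesion examples.
train_features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
                  (2.0, 2.0), (2.1, 2.0), (2.0, 2.1)]
train_labels = [0, 0, 0, 1, 1, 1]

prob = knn_lesion_probability((1.9, 1.9), train_features, train_labels, k=4)
is_lesion = prob >= 0.35  # binary decision at threshold p (comparison assumed)
print(prob, is_lesion)
```

With k = 40, as used in the current study, the probability map takes values in steps of 1/40 = 0.025.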
2.9. Post-processing
The binary segmentation sometimes contained small false positive regions, which are
often too small to be considered as a true lesion. To remove these small false positive
regions, we applied a simple post-processing step which removes all lesions with a
volume smaller than a threshold C. From the binary probability maps obtained using
the optimal threshold p in the leave-one-out procedure, the optimal C was selected by
applying different minimum lesion volumes, and selecting the threshold C which
results on average in the highest overlap with the manual reference segmentation.
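The size-based post-processing step can be sketched as a connected-component filter. The sketch assumes that a lesion is a 6-connected group of voxels in 3D and that regions with exactly C voxels are kept; the connectivity definition is not given in the text, so both choices are assumptions, as is the function name.

```python
from collections import deque

# Face-sharing (6-connected) neighbors in 3D; the connectivity is an assumption.
NEIGHBORS_6 = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def remove_small_lesions(lesion_voxels, min_voxels):
    """Drop connected lesion regions that contain fewer than min_voxels voxels.

    lesion_voxels: iterable of (x, y, z) coordinates labeled as lesion.
    Returns the set of coordinates belonging to regions that are kept.
    """
    remaining = set(lesion_voxels)
    kept = set()
    while remaining:
        # Grow one connected component with a breadth-first search.
        seed = remaining.pop()
        component = {seed}
        queue = deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBORS_6:
                neighbor = (x + dx, y + dy, z + dz)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        if len(component) >= min_voxels:
            kept |= component
    return kept

# Toy example: one 6-voxel lesion and one isolated 2-voxel region.
large = {(x, 0, 0) for x in range(6)}
small = {(10, 10, 10), (10, 11, 10)}
cleaned = remove_small_lesions(large | small, min_voxels=5)
print(len(cleaned))
```

The optimal C would then be chosen by repeating this step for several minimum sizes and comparing the resulting overlap with the reference segmentation.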
2.10. Evaluation metrics
We tested the performance of six different configurations by altering the normalization
procedure for the intensity features (i.e., variance scaling, robust range normalization
and histogram matching), and either including TTPs in the feature set or omitting them
(see Table 1).
Table 1.
The different configurations.

Configuration                                    Description
Variance scaling                                 Variance scaling of 3DFLAIR, 3DT1, x, y, z
Robust range normalization                       Robust range normalization of 3DFLAIR and 3DT1;
                                                 variance scaling of x, y and z
Histogram matching                               Histogram matching of 3DFLAIR and 3DT1;
                                                 variance scaling of x, y and z
Variance scaling + tissue type priors            Variance scaling of 3DFLAIR, 3DT1, x, y, z, pCSF,
                                                 pGM, and pWM
Robust range normalization + tissue type priors  Robust range normalization of 3DFLAIR and 3DT1;
                                                 variance scaling of x, y, z, pCSF, pGM, and pWM
Histogram matching + tissue type priors          Histogram matching of 3DFLAIR and 3DT1;
                                                 variance scaling of x, y, z, pCSF, pGM, and pWM
Each configuration was evaluated using both volumetric and spatial correspondence
measures. Volumetric correspondence between the automatic segmentation and the
manual reference segmentations was measured using the intraclass correlation
coefficient (ICC; two-way mixed model with absolute agreement definition) for the
total lesion volume (Koch, 1982). Spatial correspondence at voxel level was evaluated
using Dice's similarity index (SI) (Dice, 1945) and sensitivity, respectively defined as
SI = 2 × TP / (2 × TP + FP + FN) and sensitivity = TP / (TP + FN), where TP is the
number of true positives, FP is the number of false positives, TN is the number of true
negatives and FN is the number of false negatives. Since SI is affected by lesion burden
(Admiraal-Behloul et al., 2005), we also computed the lesion volume independent
similarity index SIestimate and the outline error rate (OER) (Wack et al., 2012). As a logical
extension to OER, we also computed the detection error rate (DER), defined as
DER = DE / MTA, where DE is the detection error and MTA is the mean total area, as
described in Wack et al. (2012). In the leave-one-out approach, SI was regarded as the
primary outcome measure.
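The two voxelwise measures defined above can be computed directly from a pair of binary segmentations. The sketch below assumes that both segmentations are given as flat sequences of 0/1 values in the same voxel order; the function names are illustrative. SIestimate, OER and DER are not reproduced here because their definitions are given in the cited references rather than in the text.

```python
def confusion_counts(automatic, reference):
    """Count true positives, false positives and false negatives voxel by voxel."""
    tp = sum(1 for a, r in zip(automatic, reference) if a and r)
    fp = sum(1 for a, r in zip(automatic, reference) if a and not r)
    fn = sum(1 for a, r in zip(automatic, reference) if not a and r)
    return tp, fp, fn

def dice_similarity_index(automatic, reference):
    """SI = 2 TP / (2 TP + FP + FN)."""
    tp, fp, fn = confusion_counts(automatic, reference)
    return 2 * tp / (2 * tp + fp + fn)

def sensitivity(automatic, reference):
    """Sensitivity = TP / (TP + FN)."""
    tp, _, fn = confusion_counts(automatic, reference)
    return tp / (tp + fn)

# Toy example with six voxels: TP = 2, FP = 1, FN = 1.
automatic = [1, 1, 1, 0, 0, 0]
reference = [1, 1, 0, 1, 0, 0]
si = dice_similarity_index(automatic, reference)
sens = sensitivity(automatic, reference)
print(round(si, 3), round(sens, 3))
```

Neither measure uses the true negatives, so both are insensitive to the large number of non-lesion voxels in a brain volume.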
2.11. Validation in an independent cohort of elderly subjects with hypertension

In order to evaluate the robustness of the optimal configuration across different
scanners and patient populations, we finally applied the previously described training
procedure, parameter selection and cross-validation to an independent dataset
consisting of 20 high resolution MR images, selected from a larger cohort of elderly
subjects with hypertension. In order to include a wide variety of vascular WML severity,
subjects were selected based on the severity of WMLs. Age varied from 74 to 81 years
(mean ± SD: 77.1 ± 7.0), 11 were women, and mean blood pressure varied from 113 to
188 mm Hg systolic (mean ± SD: 142.0 ± 17.0) and 66 to 90 mm Hg diastolic
(mean ± SD: 79.0 ± 7.0).
MR imaging of this dataset was performed on a 3.0 T Intera whole body scanner
(Philips Medical Systems, Best, The Netherlands) using a phased-array SENSE
eight-channel head coil. The protocol contained, among others, a 3DFLAIR sequence (TR:
4800 ms, TE: 355 ms, TI: 1650 ms, 250 × 250 mm² field of view (FOV), 160 sagittal
slices of 1.12 mm thickness, interpolated to 0.56 mm thick (overcontiguous) slices
during reconstruction, 1.1 × 1.1 mm² in-plane resolution) for lesion detection, and a
sagittal MPRAGE (magnetization prepared rapid acquisition gradient echo) sequence
(TR: 6.6 ms, TE: 3.1 ms, FA: 9°, 270 × 270 mm² FOV, 170 sagittal slices of 1.2 mm
thickness, 1.1 × 1.1 mm² in-plane resolution) for anatomical information.
Reference segmentations of the vascular WMLs were constructed
using the 3DFLAIR images as follows. First, RF field inhomogeneity correction was
performed using the N3 algorithm (Sled, 1997) implemented in 3D Slicer software
(version 4.0, www.slicer.org). Subsequently, the images were orthogonally reformatted
to the axial plane and WMLs were labeled by a single, trained rater. Afterwards,
voxelwise thresholding was applied to the labeled areas to only include voxels with an
intensity higher than the cortex at the level of the insula. Such a thresholding approach
is well known in aging studies, as it allows for a much more consistent definition of
lesion boundaries, which are often not clear in vascular WM lesions (Olsson et al.,
2013).
3. Results
3.1. Reliability of manual reference segmentation

The manual reference segmentation showed a very good intra-observer agreement at
the voxel level, with SI between the first and the second segmentation of 0.93 for the
first technician, and 0.92 for the second technician. Inter-observer agreement was also
very good, both concerning volumes, with ICC = 0.96, as well as at the voxel level, with
an average SI of 0.84 ± 0.04 across all 120 slices on which lesions were outlined by
both technicians. Mean and SD lesion volume in the final manual reference
segmentation was 16.33 ± 11.49 mL, with a median of 13.92 mL and volumes per
patient ranging from 1.88 to 50.95 mL, quite typical for the range of lesion volumes in
established MS patients.
3.2. Quantitative analysis of WML segmentation configurations
Table 2 lists the SI, sensitivity, SIestimate, DER, OER and ICC for the configurations that
were tested. Fig. 2 displays the average similarity index as a function of the binary
threshold p for the different configurations.
Table 2.
Evaluation of different configurations in MS patients.

Method                                           p     SI           Sensitivity  SIestimate   DER          OER          ICC
Variance scaling                                 0.40  0.66 ± 0.12  0.63 ± 0.12  0.64 ± 0.11  0.21 ± 0.18  0.47 ± 0.12  0.84
Robust range normalization                       0.40  0.66 ± 0.12  0.62 ± 0.13  0.65 ± 0.09  0.19 ± 0.16  0.50 ± 0.15  0.80
Histogram matching                               0.35  0.72 ± 0.09  0.72 ± 0.14  0.70 ± 0.07  0.11 ± 0.08  0.47 ± 0.13  0.90
Variance scaling + tissue type priors            0.40  0.74 ± 0.09  0.72 ± 0.11  0.73 ± 0.05  0.09 ± 0.08  0.44 ± 0.11  0.92
Robust range normalization + tissue type priors  0.35  0.72 ± 0.09  0.71 ± 0.11  0.72 ± 0.05  0.09 ± 0.08  0.46 ± 0.11  0.91
Histogram matching + tissue type priors          0.35  0.72 ± 0.09  0.73 ± 0.13  0.72 ± 0.05  0.09 ± 0.07  0.46 ± 0.13  0.91

p: optimal threshold for configuration; SI: Dice's similarity index; DER: detection error rate; OER:
outline error rate; ICC: intraclass correlation coefficient. All spatial correspondence metrics are
listed as mean ± SD.
Fig. 2.
Segmentation performance for different configurations in the MS patients. Boxplots showing for
different configurations the distribution of the similarity indices across the 20 MS datasets as a
function of threshold p. VS: variance scaling; RR: robust range normalization; HM: histogram
matching; TTP: tissue type priors.
In terms of volumetric correspondence, the configurations including TTPs within the
feature set resulted in higher ICCs compared to the configurations without TTPs. The
highest ICC was achieved using variance scaling with TTPs (ICC = 0.92). Robust range
normalization without TTPs resulted in the lowest ICC (ICC = 0.80), indicating that
intensity normalization and TTPs have a strong effect on volumetric correspondence.
The combination of variance scaling and inclusion of TTPs also led to maximum
performance in terms of spatial correspondence (SI = 0.74 ± 0.09). In general,
better spatial performance was again measured using the configurations where TTPs were
added as features, although the addition of TTPs only had a marginal effect in the case
of histogram matching, and a large effect in the case of variance scaling. Similar to SI,
SIestimate showed the best performance when variance scaling + TTPs was used
(SIestimate = 0.73 ± 0.05) and a lower performance when no TTPs were used.
Sensitivity was overall reasonable, with histogram matching + TTPs giving the best
results (sensitivity = 0.73 ± 0.13). DER and OER showed that particularly a reduced
detection error is responsible for the increased SI when including TTPs in the feature
set. While outline error is relatively constant throughout the different configurations,
the average detection error reduces from 0.21 in the worst case of variance scaling
without TTPs to 0.09 in the configuration of variance scaling with TTPs.
Based on these results we selected variance scaling with TTPs as the optimal
configuration.
3.3. Post-processing and detailed analysis of the optimal configuration: variance
scaling with TTPs
Post-processing was applied to the binary segmentation of the optimal configuration
to reduce the number of small false positive regions. Variation in the size
threshold C (integer values between 1 and 10 voxels) only caused small variations in
performance. The highest mean SI was obtained after removing lesions smaller than
5 voxels, increasing the average SI from 0.74 ± 0.09 at p = 0.40 (no post-processing)
to 0.75 ± 0.08 at p = 0.35. Volumetric correspondence in terms of ICC also increased,
from 0.92 before to 0.93 after post-processing. Post-processing reduced
both outline and detection error.
An example segmentation of a patient with average lesion load is shown in Fig. 3. To
obtain more insight into the performance characteristics of the optimal configuration,
the spatial correspondence metrics are listed in Table 3 for patients with low,
intermediate, and high lesion loads. This shows that SI increases with lesion burden:
datasets with lesion volume < 5 mL have an average SI of 0.65, while datasets with
lesion volume > 15 mL have an average SI of 0.81. Similar behavior was seen for mean
SIestimate, which increases from 0.64 (< 5 mL) to 0.77 (> 15 mL). Furthermore, DER
increases strongly when lesion burden is lower (0.19 for < 5 mL versus 0.04 for
> 15 mL). Although less pronounced, a similar relationship was seen for OER.
Fig. 3.
Two slices showing the result of the automatic segmentation in a 39-year-old relapsing-remitting
MS patient (EDSS 2.5). 3DFLAIR (A, E), 3DT1 (B, F), manual reference segmentation (C, G), and
thresholded probability map (red-yellow: p = [0.35–1.0]; D, H).
Table 3.
Detailed evaluation of variance scaling + tissue type priors configuration including
post-processing in MS patients.

Lesion volume  N   SI                       Sensitivity              SIestimate               DER                      OER
< 5 mL         3   0.65 ± 0.04 (0.60–0.68)  0.65 ± 0.08 (0.57–0.73)  0.64 ± 0.08 (0.56–0.70)  0.19 ± 0.06 (0.10–0.27)  0.50 ± 0.06 (0.43–0.56)
5–10 mL        4   0.72 ± 0.08 (0.61–0.78)  0.71 ± 0.13 (0.54–0.82)  0.73 ± 0.02 (0.71–0.75)  0.08 ± 0.06 (0.04–0.16)  0.47 ± 0.11 (0.39–0.63)
10–15 mL       5   0.73 ± 0.07 (0.63–0.80)  0.72 ± 0.10 (0.57–0.83)  0.76 ± 0.01 (0.75–0.76)  0.07 ± 0.03 (0.03–0.10)  0.48 ± 0.11 (0.37–0.63)
> 15 mL        8   0.81 ± 0.05 (0.69–0.86)  0.79 ± 0.09 (0.68–0.94)  0.77 ± 0.01 (0.76–0.78)  0.04 ± 0.02 (0.01–0.08)  0.34 ± 0.09 (0.25–0.53)
Total          20  0.75 ± 0.08 (0.60–0.86)  0.74 ± 0.10 (0.54–0.94)  0.74 ± 0.05 (0.56–0.78)  0.08 ± 0.07 (0.01–0.27)  0.43 ± 0.11 (0.25–0.63)

N: number of subjects per group; SI: Dice's similarity index; DER: detection error rate; OER: outline
error rate. Values are mean ± SD (minimum–maximum).
3.4. Validation in an independent cohort of elderly subjects with hypertension
Mean and SD lesion volume in the dataset of elderly subjects with hypertension was
8.21 ± 8.02 mL, with a median of 5.36 mL and volumes per patient ranging from 0.57
to 31.20 mL. We first performed segmentation of the elderly subjects by using the MS
reference segmentations (based on data acquired using a different scanner) as training
set, the variance scaling + TTPs configuration, and the previously derived
optimal p = 0.35 and C = 5. As expected, this yielded suboptimal results: volumetric
ICC = 0.60, average SI = 0.50 ± 0.24, sensitivity = 0.87 ± 0.06, SIestimate = 0.49 ± 0.17,
DER = 0.53 ± 0.43 and OER = 0.45 ± 0.12. Retraining was then performed using the
elderly reference segmentations and the variance scaling + TTPs configuration.
Cross-validation measured maximal segmentation performance at p = 0.5 and C = 2, with
volumetric ICC = 0.96, average SI = 0.84 ± 0.10, sensitivity = 0.86 ± 0.14,
SIestimate = 0.83 ± 0.05, DER = 0.07 ± 0.06 and OER = 0.25 ± 0.16, thus showing
substantial improvement in all measures except sensitivity, which on average stayed
the same. The retrained method in the elderly tended to show, as the MS patients did,
a lower segmentation performance when subjects had a low lesion volume, compared
to elderly subjects with a high lesion volume (see Table 4). Here it should be noted
that 9 of the elderly subjects had a lesion volume of < 5 mL, 4 subjects had a lesion
volume of 5–10 mL, 2 subjects had a lesion volume of 10–15 mL and only 3 subjects
had a lesion volume of > 15 mL.
a Different definition of lesion load: small (LV < 4 mL), moderate (4 mL < LV < 18 mL), large (LV > 18 mL).
b Definition of lesion load based on diameter of largest diffuse white matter lesion and location of
periventricular white matter lesions.
4. Discussion
An automated WML segmentation algorithm on 3DFLAIR and 3DT1 images was
presented and validated using manual reference segmentations. The optimal method
used variance scaling, tissue type priors, 3DFLAIR intensities, 3DT1 intensities, and
MNI-normalized spatial coordinates as features, and achieved very good voxelwise
agreement with the reference segmentation. The results were further improved by
applying a post-processing step which removed regions too small to be classified as a
lesion from the segmentation.
The results of our study show that adding TTPs improves the results of kNN lesion
segmentation considerably. This is in line with results of other studies showing
increased performance when using tissue type information in the segmentation
procedure (Schmidt et al., 2012). Adding TTPs improved lesion segmentation
particularly by reducing the average detection error, while average outline error was
fairly constant. Furthermore the results confirm the large influence of the choice of
feature normalization on segmentation performance, emphasizing that feature
normalization is an important aspect to consider in the design of a supervised lesion
segmentation algorithm. Additional post-processing (i.e. removal of regions too small
to be considered as a lesion) showed only a minor improvement in segmentation
performance. Visually, however, segmentation results were considerably smoother
after post-processing than without it.
The final algorithm is fully automatic, and segmented a single dataset on a standard
eight-core machine on average in about 23 min, of which 19 min was needed for
nonlinear registration of the healthy controls to the dataset of interest and 4 min for
the actual segmentation and post-processing.
For applicability of automated WML segmentation procedures in clinical studies, both
good spatial and volumetric correspondence are critical. First, it is important to find
the correct regions, but second, it is also important to outline them as accurately as
possible, since lesion volumes are often used as outcome parameters or explanatory
variables (Kappos et al., 2007, Mikol et al., 2008, Polman et al., 2006 and Schoonheim
et al., 2012) and lesion masks are increasingly used to perform lesion filling for
obtaining accurate brain atrophy measurements (Battaglini et al., 2012 and Chard et
al., 2010). Using our final method, the volumetric correspondence reached ICC values
up to 0.93, which we regard as excellent agreement. Furthermore, using TTPs, spatial
correspondence measured by SI was higher than 0.7, which is regarded as excellent as
well (Anbeek et al., 2004 and Bartko, 1991). The final method also showed the lowest SI
variance, indicating that kNN segmentation with TTPs delivers robust performance
across our 20 datasets chosen to reflect the heterogeneity typically observed in MS
populations, which is important for its applicability in clinical trials.
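As a rough illustration of the spatial agreement measure, the sketch below computes a similarity index between two binary masks. It assumes that SI is the Dice-style overlap 2|A∩B| / (|A| + |B|), which this section does not restate; the function name and the toy masks are hypothetical.

```python
def similarity_index(reference, segmentation):
    """Dice-style overlap between two equally sized binary masks (flat lists of 0/1)."""
    if len(reference) != len(segmentation):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled lesion in both masks.
    true_pos = sum(1 for r, s in zip(reference, segmentation) if r and s)
    total = sum(reference) + sum(segmentation)
    # Convention chosen here (an assumption): two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * true_pos / total


reference = [0, 1, 1, 1, 0, 0, 1, 0]
segmentation = [0, 1, 1, 0, 0, 1, 1, 0]
si = similarity_index(reference, segmentation)
print(si)
```

With a small reference mask, a single misclassified voxel changes this ratio considerably, which is one way to see why SI tends to be lower at low lesion loads.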
Validation using an independent dataset, obtained on a different scanner, involving
vascular WMLs in elderly hypertensive subjects, again yielded very good voxelwise
performance, demonstrating the robustness of the kNN-TTP segmentation method
irrespective of the scanner used or the pathological substrate of the WMLs.
Many methods for WML segmentation have been published (Admiraal-Behloul et al.,
2005, Akselrod-Ballin et al., 2009, Anbeek et al., 2005, Damangir et al., 2012, De Boer et
al., 2009, Geremia et al., 2011, Khayati et al., 2008, Schmidt et al., 2012, Shiee et al.,
2010 and Van Leemput et al., 2001). Comparison of different methods, however, should
be done with care, since the measured performance is highly dependent on the
dataset and the reference segmentation being used for evaluation. Factors known to
influence segmentation performance include the pulse-sequence being used (i.e.,
sequence type, 2D versus 3D) (Anbeek et al., 2005), the way the reference
segmentation was constructed (i.e., manual or semi-automatic), the heterogeneity of
pathology in the sample (i.e., easier to achieve high performance in a homogeneous
dataset), and overall lesion burden (i.e., higher lesion load generally leads to better
spatial segmentation performance) (Wack et al., 2012). This is illustrated by the better
performance in the validation dataset compared to the dataset consisting of patients
with MS: the vascular pathology in the validation dataset is more homogeneous and
the construction of the reference segmentation involved a semi-automatic
segmentation step, which might partially explain the higher segmentation
performance in this sample. Taking these considerations into account, and given that
the use of such a semi-automatic procedure is defendable since the described
approach is common in aging studies (Olsson et al., 2013), our method performs very
well.
Comparing our method to others, some studies reported poorer performance in terms
of SI (Akselrod-Ballin et al., 2009, De Boer et al., 2009, Shiee et al., 2010 and Van
Leemput et al., 2001), whereas others reported comparable or higher performance
(Admiraal-Behloul et al., 2005, Anbeek et al., 2004, Khayati et al., 2008 and Schmidt et
al., 2012). One study reporting high performance is the original study by Anbeek et al.
which was the first to use kNN to classify WMLs (average SI = 0.80). In that study,
WMLs of 20 patients with vascular disease were segmented using spatial coordinates,
and 2D T1, IR, PD, T2, and FLAIR-intensities as features. The method used in that study
was very similar to our variance + no TTPs configuration which resulted in much
lower performance in our MS sample (SI = 0.66). This difference can possibly be
explained by the different sequences used and different pathologies addressed in both
studies, and it illustrates the difficulty of comparing performance using different
reference datasets.
As expected, SI was lower in subjects with lower lesion burden. It has also been
reported by others that small errors have a relatively larger effect on a smaller
reference (Admiraal-Behloul et al., 2005, Anbeek et al., 2004, Khayati et al., 2008, Sajja
et al., 2006 and Schmidt et al., 2012). Table 4 compares the SI for different lesion loads
of our study with other studies and shows that our method performs equally well,
despite the use of 3D sequences and manual reference segmentation, across the full
range of lesion loads.
A limitation of our method is that the algorithm requires new training when applied to
data originating from other scanners or other acquisition protocols. This is necessary
since 3DFLAIR and 3DT1 signal characteristics may differ among MR scanners and
pulse sequences, and is illustrated by the much better performance after retraining in
the sample with elderly subjects. Secondly, the outlining of the manual MS reference
segmentations was performed by two technicians who, while highly trained and
performing MS lesion outlining on 2D images on a daily basis, were not used to
working with the high-resolution 3D images used in the current study. Therefore, to
optimize their performance with these new images, we provided limited additional
training prior to the study. The resulting manual segmentation was of high quality, as
evidenced by the high reproducibility, both between sessions of the same technician
and between the two technicians (inter-observer SI = 0.84). Furthermore, our manual
MS reference segmentations were based on a single consensus scoring to determine
which regions were MS WMLs. This could have led to artificially higher inter- and
intra-observer agreements since detection errors could not occur when outlining the
lesions. Finally, we did not optimize the value of k in the present work, but selected a
value of 40 based on the literature. To rule out that other values of k would have
resulted in large performance differences, we performed a post-hoc analysis in which
the training and evaluation of the optimal configuration for the dataset with MS
patients was repeated for different values of k, namely k = 20, 80 and 160. Here, it
should be noted that classification takes longer when larger values of k are used, since
more nearest neighbors have to be found. The results of this post-hoc analysis
(k = 20: p = 0.35, C = 6, SI = 0.74 ± 0.08; k = 40: p = 0.35, C = 5, SI = 0.75 ± 0.08
(previously reported); k = 80: p = 0.30, C = 6, SI = 0.75 ± 0.08;
and k = 160: p = 0.30, C = 8, SI = 0.75 ± 0.08) confirmed that k in the current range is
suitable for this type of segmentation problem.
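To make the role of k concrete, here is a minimal brute-force sketch of a kNN lesion probability: the fraction of the k nearest training voxels that are labelled lesion, followed by a threshold. It assumes Euclidean distance in feature space and that p denotes the probability threshold; the function name and toy features are hypothetical, and a practical implementation would use a faster neighbor search.

```python
import heapq
import math


def knn_lesion_probability(train_features, train_labels, query, k):
    """Fraction of the k nearest training voxels (Euclidean distance) labelled lesion."""
    distances = (
        (math.dist(feat, query), label)
        for feat, label in zip(train_features, train_labels)
    )
    # Keep only the k smallest distances; larger k means more neighbors to find.
    nearest = heapq.nsmallest(k, distances, key=lambda pair: pair[0])
    return sum(label for _, label in nearest) / len(nearest)


# Toy two-feature training set: label 1 = lesion, 0 = non-lesion.
train_features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.9)]
train_labels = [0, 0, 0, 1, 1, 1]

prob = knn_lesion_probability(train_features, train_labels, (0.8, 0.9), k=4)
is_lesion = prob >= 0.35  # threshold p, assumed to act on this probability
print(prob, is_lesion)
```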
In conclusion, we improved kNN classification for the segmentation of WMLs by
adding TTPs and showed that intensity normalization has a strong impact on
segmentation performance. The optimal configuration showed excellent agreement in
terms of volumetric and spatial measures with fully manual 3D reference
segmentations across a wide range of WML severity, irrespective of the scanner used
or the pathological substrate of the WML.
Acknowledgments
This work was supported by the Dutch MS Research Foundation through a program
grant to the VUmc MS Center Amsterdam (grant number 09-358d). The MRI scans of
the validation cohort were obtained with support from Internationale Stichting
Alzheimer Onderzoek (grant number 10507). The authors would like to thank the
Image Analysis Center (IAC) Amsterdam for contributing to the development of the
manual reference segmentations of the patients with MS.
Random Forests
Leo Breiman and Adele Cutler
Random Forests(tm) is a trademark of Leo Breiman and Adele Cutler and is
licensed exclusively to Salford Systems for the commercial release of the
software.
Our trademarks also include RF(tm), RandomForests(tm), RandomForest(tm)
and Random Forest(tm).
Contents
Introduction
Overview
Features of random forests
Remarks
How Random Forests work
The oob error estimate
Variable importance
Gini importance
Interactions
Proximities
Scaling
Prototypes
Missing values for the training set
Missing values for the test set
Mislabeled cases
Outliers
Unsupervised learning
Balancing prediction error
Detecting novelties
A case study - microarray data
Classification mode
Variable importance
Using important variables
Variable interactions
Scaling the data
Prototypes
Outliers
A case study - dna data
Missing values in the training set
Missing values in the test set
Mislabeled cases
Case Studies for unsupervised learning
Clustering microarray data
Clustering dna data
Clustering glass data
Clustering spectral data
References
Introduction
This section gives a brief overview of random forests and some comments
about the features of the method.
Overview
We assume that the user knows about the construction of single
classification trees. Random Forests grows many classification trees. To
classify a new object from an input vector, put the input vector down each
of the trees in the forest. Each tree gives a classification, and we say the
tree "votes" for that class. The forest chooses the classification having the
most votes (over all the trees in the forest).
Each tree is grown as follows:
1. If the number of cases in the training set is N, sample N cases at
random, but with replacement, from the original data. This sample
will be the training set for growing the tree.
2. If there are M input variables, a number m<<M is specified such that
at each node, m variables are selected at random out of the M and
the best split on these m is used to split the node. The value of m is
held constant during the forest growing.
3. Each tree is grown to the largest extent possible. There is no pruning.
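The two sources of randomness in the steps above can be sketched as follows. This is only an illustration of the sampling, not of tree growing itself; the function names and the sizes used are hypothetical.

```python
import random


def bootstrap_indices(n_cases, rng):
    """Draw n_cases indices with replacement; cases never drawn are out-of-bag."""
    in_bag = [rng.randrange(n_cases) for _ in range(n_cases)]
    out_of_bag = sorted(set(range(n_cases)) - set(in_bag))
    return in_bag, out_of_bag


def candidate_variables(n_variables, m, rng):
    """Pick m distinct variables to consider for the split at one node."""
    return rng.sample(range(n_variables), m)


rng = random.Random(0)
in_bag, out_of_bag = bootstrap_indices(1000, rng)
split_candidates = candidate_variables(n_variables=100, m=10, rng=rng)
print(len(out_of_bag), split_candidates)
```

A fresh call to `candidate_variables` would be made at every node, while `bootstrap_indices` is called once per tree.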
In the original paper on random forests, it was shown that the forest error
rate depends on two things:
The correlation between any two trees in the forest. Increasing the
correlation increases the forest error rate.
The strength of each individual tree in the forest. A tree with a low
error rate is a strong classifier. Increasing the strength of the
individual trees decreases the forest error rate.
Reducing m reduces both the correlation and the strength. Increasing it
increases both. Somewhere in between is an "optimal" range of m - usually
quite wide. Using the oob error rate (see below) a value of m in the range
can quickly be found. This is the only adjustable parameter to which
random forests is somewhat sensitive.
Features of Random Forests

It is unexcelled in accuracy among current algorithms.
It runs efficiently on large data bases.
It can handle thousands of input variables without variable deletion.
It gives estimates of what variables are important in the classification.
It generates an internal unbiased estimate of the generalization error
as the forest building progresses.
It has an effective method for estimating missing data and maintains
accuracy when a large proportion of the data are missing.
It has methods for balancing error in class population unbalanced
data sets.
Generated forests can be saved for future use on other data.
Prototypes are computed that give information about the relation
between the variables and the classification.
It computes proximities between pairs of cases that can be used in
clustering, locating outliers, or (by scaling) give interesting views of
the data.
The capabilities of the above can be extended to unlabeled data,
leading to unsupervised clustering, data views and outlier detection.
It offers an experimental method for detecting variable interactions.
Remarks
Random forests does not overfit. You can run as many trees as you want. It
is fast. Running on a data set with 50,000 cases and 100 variables, it
produced 100 trees in 11 minutes on an 800 MHz machine. For large data
sets the major memory requirement is the storage of the data itself, and
three integer arrays with the same dimensions as the data. If proximities
are calculated, storage requirements grow as the number of cases times
the number of trees.
How random forests work

To understand and use the various options, further information about how
they are computed is useful. Most of the options depend on two data
objects generated by random forests.
When the training set for the current tree is drawn by sampling with
replacement, about one-third of the cases are left out of the sample.
This oob (out-of-bag) data is used to get a running unbiased estimate of
the classification error as trees are added to the forest. It is also used to get
estimates of variable importance.
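The one-third figure follows from a short calculation, sketched here under the assumption that each of the N draws picks every case with equal probability 1/N:

```latex
P(\text{case } n \text{ is never drawn})
  = \left(1 - \frac{1}{N}\right)^{N},
\qquad
\lim_{N \to \infty} \left(1 - \frac{1}{N}\right)^{N} = e^{-1} \approx 0.368
```

So roughly 37% of the cases are out-of-bag for any given tree, which the text rounds to about one-third.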
After each tree is built, all of the data are run down the tree,
and proximities are computed for each pair of cases. If two cases occupy
the same terminal node, their proximity is increased by one. At the end of
the run, the proximities are normalized by dividing by the number of trees.
Proximities are used in replacing missing data, locating outliers, and
producing illuminating low-dimensional views of the data.
The out-of-bag (oob) error estimate

In random forests, there is no need for cross-validation or a separate test
set to get an unbiased estimate of the test set error. It is estimated
internally, during the run, as follows:
Each tree is constructed using a different bootstrap sample from the
original data. About one-third of the cases are left out of the bootstrap
sample and not used in the construction of the kth tree.
Put each case left out in the construction of the kth tree down the kth tree
to get a classification. In this way, a test set classification is obtained for
each case in about one-third of the trees. At the end of the run, take j to be
the class that got most of the votes every time case n was oob. The
proportion of times that j is not equal to the true class of n averaged over all
cases is the oob error estimate. This has proven to be unbiased in many
tests.
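The bookkeeping behind this estimate can be sketched as below. The input format (one list of oob votes per case) is a hypothetical simplification; cases that were never out-of-bag are skipped, and ties are broken by whichever class was seen first, which is a choice made here rather than something the text specifies.

```python
from collections import Counter


def oob_error(true_labels, oob_votes):
    """oob_votes[n] lists the classes predicted for case n by trees where n was oob."""
    wrong = counted = 0
    for truth, votes in zip(true_labels, oob_votes):
        if not votes:  # case was never out-of-bag: nothing to score
            continue
        winner = Counter(votes).most_common(1)[0][0]
        wrong += winner != truth
        counted += 1
    return wrong / counted


true_labels = [1, 1, 2, 2]
oob_votes = [[1, 1, 2], [2, 2, 1], [2, 2], [2, 1, 2, 2]]
error = oob_error(true_labels, oob_votes)
print(error)
```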
Variable importance
In every tree grown in the forest, put down the oob cases and count the
number of votes cast for the correct class. Now randomly permute the
values of variable m in the oob cases and put these cases down the tree.
Subtract the number of votes for the correct class in the variable-m-
permuted oob data from the number of votes for the correct class in the
untouched oob data. The average of this number over all trees in the forest
is the raw importance score for variable m.
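A minimal sketch of the raw importance score, assuming each tree can be called as a function that maps a row of values to a class. The toy "trees" below are hypothetical stand-ins that only look at variable 0, so permuting variable 1 cannot change any vote.

```python
import random


def raw_importance(trees, oob_sets, X, y, variable, rng):
    """Mean over trees of (correct oob votes) minus (correct votes after permuting `variable`)."""
    diffs = []
    for tree, oob in zip(trees, oob_sets):
        correct = sum(tree(X[i]) == y[i] for i in oob)
        # Permute the values of the chosen variable among this tree's oob cases.
        shuffled = [X[i][variable] for i in oob]
        rng.shuffle(shuffled)
        permuted_correct = 0
        for i, value in zip(oob, shuffled):
            row = list(X[i])
            row[variable] = value
            permuted_correct += tree(row) == y[i]
        diffs.append(correct - permuted_correct)
    return sum(diffs) / len(diffs)


# Three identical toy trees that split on variable 0 only.
trees = [lambda row: int(row[0] > 0.5)] * 3
X = [[0.1, 5.0], [0.9, 3.0], [0.2, 7.0], [0.8, 1.0], [0.3, 2.0], [0.7, 9.0]]
y = [0, 1, 0, 1, 0, 1]
oob_sets = [[0, 1, 2, 3], [2, 3, 4, 5], [0, 1, 4, 5]]

rng = random.Random(1)
imp_used = raw_importance(trees, oob_sets, X, y, 0, rng)
imp_unused = raw_importance(trees, oob_sets, X, y, 1, rng)
print(imp_used, imp_unused)
```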
If the values of this score from tree to tree are independent, then the
standard error can be computed by a standard computation. The
correlations of these scores between trees have been computed for a
number of data sets and proved to be quite low, therefore we compute
standard errors in the classical way, divide the raw score by its standard
error to get a z-score, and assign a significance level to the z-score
assuming normality.
If the number of variables is very large, forests can be run once with all the
variables, then run again using only the most important variables from the
first run.
For each case, consider all the trees for which it is oob. Subtract the
percentage of votes for the correct class in the variable-m-permuted oob
data from the percentage of votes for the correct class in the untouched
oob data. This is the local importance score for variable m for this case,
and is used in the graphics program RAFT.
Gini importance
Every time a split of a node is made on variable m the gini impurity criterion
for the two descendent nodes is less than the parent node. Adding up the
gini decreases for each individual variable over all trees in the forest gives
a fast variable importance that is often very consistent with the permutation
importance measure.
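For a single split, the gini decrease can be sketched as follows. The sketch assumes the usual convention of weighting each descendant node by its share of the parent's cases; the text does not spell out the weighting, and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity 1 - sum_c p_c^2 of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def gini_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two descendant nodes."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


parent = [1, 1, 1, 2, 2, 2]
left, right = [1, 1, 1, 2], [2, 2]
decrease = gini_decrease(parent, left, right)
print(decrease)
```

Summing such decreases over every node that splits on a given variable, across all trees, gives that variable's gini importance.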
Interactions
The operating definition of interaction used is that variables m and k
interact if a split on one variable, say m, in a tree makes a split on k either
systematically less possible or more possible. The implementation used is
based on the gini values g(m) for each tree in the forest. These are ranked
for each tree and for each two variables, the absolute difference of their
ranks are averaged over all trees.
This number is also computed under the hypothesis that the two variables
are independent of each other and the latter subtracted from the former. A
large positive number implies that a split on one variable inhibits a split on
the other and conversely. This is an experimental procedure whose
conclusions need to be regarded with caution. It has been tested on only a
few data sets.
Proximities
These are one of the most useful tools in random forests. The proximities
originally formed an NxN matrix. After a tree is grown, put all of the data,
both training and oob, down the tree. If cases k and n are in the same
terminal node increase their proximity by one. At the end, normalize the
proximities by dividing by the number of trees.
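The counting rule above can be sketched directly, assuming the terminal node reached by every case in every tree has already been recorded. The input layout and names are hypothetical, and this is the full NxN version rather than the memory-saving variant.

```python
def proximity_matrix(leaf_ids_per_tree):
    """leaf_ids_per_tree[t][n] is the terminal node reached by case n in tree t."""
    n_trees = len(leaf_ids_per_tree)
    n_cases = len(leaf_ids_per_tree[0])
    prox = [[0.0] * n_cases for _ in range(n_cases)]
    for leaves in leaf_ids_per_tree:
        for n in range(n_cases):
            for k in range(n_cases):
                if leaves[n] == leaves[k]:
                    prox[n][k] += 1.0  # same terminal node in this tree
    # Normalize by the number of trees so that every entry lies in [0, 1].
    return [[value / n_trees for value in row] for row in prox]


leaf_ids_per_tree = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [2, 2, 2, 3],
    [0, 0, 1, 1],
]
prox = proximity_matrix(leaf_ids_per_tree)
print(prox[0])
```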
Users noted that with large data sets, they could not fit an NxN matrix into
fast memory. A modification reduced the required memory size to NxT
where T is the number of trees in the forest. To speed up the computation-
intensive scaling and iterative missing value replacement, the user is given
the option of retaining only the nrnn largest proximities to each case.
When a test set is present, the proximities of each case in the test set with
each case in the training set can also be computed. The amount of
additional computing is moderate.
Scaling
The proximities between cases n and k form a matrix {prox(n,k)}. From their
definition, it is easy to show that this matrix is symmetric, positive definite
and bounded above by 1, with the diagonal elements equal to 1. It follows
that the values 1-prox(n,k) are squared distances in a Euclidean space of
dimension not greater than the number of cases. For more background on
scaling see "Multidimensional Scaling" by T.F. Cox and M.A. Cox.
Let prox(-,k) be the average of prox(n,k) over the 1st coordinate, prox(n,-)
be the average of prox(n,k) over the 2nd coordinate, and prox(-,-) the
average over both coordinates. Then the matrix
cv(n,k)=.5*(prox(n,k)-prox(n,-)-prox(-,k)+prox(-,-))
is the matrix of inner products of the distances and is also positive definite
symmetric. Let the eigenvalues of cv be λ(j) and the
eigenvectors ν_j(n). Then the vectors
x(n) = (√λ(1) ν_1(n), √λ(2) ν_2(n), ...)
have squared distances between them equal to 1-prox(n,k). The values
of √λ(j) ν_j(n) are referred to as the jth scaling coordinate.
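The statement about squared distances can be checked from the definition of cv, using only the symmetry of prox and prox(n,n) = 1. The first step assumes orthonormal eigenvectors, so that the inner product of x(n) and x(k) equals cv(n,k). The bar notation is shorthand introduced here for the averages prox(n,-) and prox(-,-), with p_{nk} standing for prox(n,k):

```latex
\begin{aligned}
\lVert x(n) - x(k) \rVert^{2}
  &= cv(n,n) + cv(k,k) - 2\,cv(n,k) \\
  &= \tfrac{1}{2}\Bigl[
       \bigl(1 - 2\bar{p}_{n\cdot} + \bar{p}_{\cdot\cdot}\bigr)
     + \bigl(1 - 2\bar{p}_{k\cdot} + \bar{p}_{\cdot\cdot}\bigr)
     - 2\bigl(p_{nk} - \bar{p}_{n\cdot} - \bar{p}_{k\cdot} + \bar{p}_{\cdot\cdot}\bigr)
     \Bigr] \\
  &= 1 - p_{nk}
\end{aligned}
```

All the row averages cancel, leaving only the diagonal terms and the proximity itself.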
In metric scaling, the idea is to approximate the vectors x(n) by the first few
scaling coordinates. This is done in random forests by extracting the largest
few eigenvalues of the cv matrix, and their corresponding eigenvectors.
The two dimensional plot of the ith scaling coordinate vs. the jth often gives
useful information about the data. The most useful is usually the graph of
the 2nd vs. the 1st.
Since the eigenfunctions are the top few of an NxN matrix, the
computational burden may be time consuming. We advise taking nrnn
considerably smaller than the sample size to make this computation faster.
There are more accurate ways of projecting distances down into low
dimensions, for instance the Roweis and Saul algorithm. But the nice
performance, so far, of metric scaling has kept us from implementing more
accurate projection algorithms. Another consideration is speed. Metric
scaling is the fastest current algorithm for projecting down.
Generally three or four scaling coordinates are sufficient to give good
pictures of the data. Plotting the second scaling coordinate versus the first
usually gives the most illuminating view.
Prototypes
Prototypes are a way of getting a picture of how the variables relate to the
classification. For the jth class, we find the case that has the largest
number of class j cases among its k nearest neighbors, determined using
the proximities. Among these k cases we find the median, 25th percentile,
and 75th percentile for each variable. The medians are the prototype for
class j and the quartiles give an estimate of its stability. For the second
prototype, we repeat the procedure but only consider cases that are not
among the original k, and so on. When we ask for prototypes to be output
to the screen or saved to a file, prototypes for continuous variables are
standardized by subtracting the 5th percentile and dividing by the difference
between the 95th and 5th percentiles. For categorical variables, the
prototype is the most frequent value. When we ask for prototypes to be
output to the screen or saved to a file, all frequencies are given for
categorical variables.
Missing value replacement for the training set
Random forests has two ways of replacing missing values. The first way is
fast. If the mth variable is not categorical, the method computes the median
of all values of this variable in class j, then it uses this value to replace all
missing values of the mth variable in class j. If the mth variable is
categorical, the replacement is the most frequent non-missing value in
class j. These replacement values are called fills.
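The fast fill can be sketched for a single variable as follows. Missing values are represented by None here purely for illustration (the parameter lists later in this document use a numeric missing-value code instead), and the function names are hypothetical.

```python
from collections import Counter
from statistics import median


def class_fills(values, labels, categorical=False):
    """Per-class fill: median for continuous variables, most frequent value for categorical."""
    fills = {}
    for cls in set(labels):
        present = [v for v, c in zip(values, labels) if c == cls and v is not None]
        if categorical:
            fills[cls] = Counter(present).most_common(1)[0][0]
        else:
            fills[cls] = median(present)
    return fills


def apply_fills(values, labels, fills):
    """Replace each missing value by the fill of that case's class."""
    return [fills[c] if v is None else v for v, c in zip(values, labels)]


values = [1.0, None, 3.0, 10.0, 14.0, None, 12.0]
labels = [1, 1, 1, 2, 2, 2, 2]
filled = apply_fills(values, labels, class_fills(values, labels))
print(filled)
```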
The second way of replacing missing values is computationally more
expensive but has given better performance than the first, even with large
amounts of missing data. It replaces missing values only in the training set.
It begins by doing a rough and inaccurate filling in of the missing values.
Then it does a forest run and computes proximities.
If x(m,n) is a missing continuous value, estimate its fill as an average over
the non-missing values of the mth variable weighted by the proximities
between the nth case and the non-missing value case. If it is a missing
categorical variable, replace it by the most frequent non-missing value
where frequency is weighted by proximity.
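For a continuous variable, one update of the proximity-weighted fill might look like the sketch below. The function name is hypothetical, None again marks a missing value, and the proximities are taken as given from a previous forest run.

```python
def proximity_weighted_fill(values, prox_row):
    """Weighted mean of the non-missing values; weights are proximities to the case being filled."""
    pairs = [(v, w) for v, w in zip(values, prox_row) if v is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no non-missing case has positive proximity")
    return sum(v * w for v, w in pairs) / total_weight


values = [None, 2.0, 4.0, 10.0]  # case 0 is missing this variable
prox_row = [1.0, 0.5, 0.5, 0.0]  # proximities of case 0 to every case
fill = proximity_weighted_fill(values, prox_row)
print(fill)
```

Cases that never share a terminal node with the case being filled (proximity 0) do not influence the fill at all.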
Now iterate: construct a forest again using these newly filled in values, find
new fills and iterate again. Our experience is that 4-6 iterations are enough.
Missing value replacement for the test set

When there is a test set, there are two different methods of replacement
depending on whether labels exist for the test set.
If they do, then the fills derived from the training set are used as
replacements. If labels do not exist, then each case in the test set is
replicated nclass times (nclass= number of classes). The first replicate of a
case is assumed to be class 1 and the class one fills used to replace
missing values. The 2nd replicate is assumed class 2 and the class 2 fills
used on it.
This augmented test set is run down the tree. In each set of replicates, the
one receiving the most votes determines the class of the original case.
Mislabeled cases
The training sets are often formed by using human judgment to assign
labels. In some areas this leads to a high frequency of mislabeling. Many of
the mislabeled cases can be detected using the outlier measure. An
example is given in the DNA case study.
Outliers
Outliers are generally defined as cases that are removed from the main
body of the data. Translate this as: outliers are cases whose proximities to
all other cases in the data are generally small. A useful revision is to define
outliers relative to their class. Thus, an outlier in class j is a case whose
proximities to all other class j cases are small.
Define the average proximity from case n in class j to the rest of the training
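data class j as:
P(n) = sum of prox(n,k)^2 over all cases k in class j
The raw outlier measure for case n is defined as
nsample / P(n), where nsample is the number of cases.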
This will be large if the average proximity is small. Within each class find
the median of these raw measures, and their absolute deviation from the
median. Subtract the median from each raw measure, and divide by the
absolute deviation to arrive at the final outlier measure.
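The outlier computation can be sketched as below. Several details are assumptions made for illustration: the raw measure is taken as the number of cases divided by the sum of squared proximities to the other cases of the same class, the "absolute deviation" is read as the median absolute deviation, and a zero deviation yields a score of 0. The function name and the toy proximity matrix are hypothetical.

```python
from statistics import median


def outlier_measures(prox, labels):
    """Class-wise outlier measure from a proximity matrix (list of lists)."""
    n_cases = len(labels)
    raw = []
    for n in range(n_cases):
        # Sum of squared proximities to the other cases of the same class.
        p_bar = sum(
            prox[n][k] ** 2
            for k in range(n_cases)
            if k != n and labels[k] == labels[n]
        )
        raw.append(n_cases / p_bar if p_bar > 0 else float(n_cases))
    final = [0.0] * n_cases
    for cls in set(labels):
        members = [n for n in range(n_cases) if labels[n] == cls]
        med = median([raw[n] for n in members])
        dev = median([abs(raw[n] - med) for n in members])
        for n in members:
            final[n] = (raw[n] - med) / dev if dev > 0 else 0.0
    return final


# One class of five cases; case 4 shares few terminal nodes with the others.
prox = [
    [1.0, 0.9, 0.7, 0.6, 0.1],
    [0.9, 1.0, 0.8, 0.5, 0.1],
    [0.7, 0.8, 1.0, 0.6, 0.2],
    [0.6, 0.5, 0.6, 1.0, 0.1],
    [0.1, 0.1, 0.2, 0.1, 1.0],
]
labels = [1, 1, 1, 1, 1]
scores = outlier_measures(prox, labels)
print(scores)
```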
Unsupervised learning
In unsupervised learning the data consist of a set of x -vectors of the same
dimension with no class labels or response variables. There is no figure of
merit to optimize, leaving the field open to ambiguous conclusions. The
usual goal is to cluster the data - to see if it falls into different piles, each of
which can be assigned some meaning.
The approach in random forests is to consider the original data as class 1
and to create a synthetic second class of the same size that will be labeled
as class 2. The synthetic second class is created by sampling at random
from the univariate distributions of the original data. Here is how a single
member of class two is created - the first coordinate is sampled from the N
values {x(1,n)}. The second coordinate is sampled independently from the
N values {x(2,n)}, and so forth.
Thus, class two has the distribution of independent random variables, each
one having the same univariate distribution as the corresponding variable
in the original data. Class 2 thus destroys the dependency structure in the
original data. But now, there are two classes and this artificial two-class
problem can be run through random forests. This allows all of the random
forests options to be applied to the original unlabeled data set.
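The construction of the synthetic class can be sketched as follows; the function name and the toy data are hypothetical. Each coordinate is drawn on its own from the observed values of that variable, so the marginal distributions are preserved while the dependence between variables is lost.

```python
import random


def synthetic_second_class(data, rng):
    """Each coordinate is drawn independently from that variable's observed values."""
    n_cases, n_vars = len(data), len(data[0])
    columns = [[row[m] for row in data] for m in range(n_vars)]
    return [[rng.choice(columns[m]) for m in range(n_vars)] for _ in range(n_cases)]


rng = random.Random(0)
original = [[x, 2 * x] for x in range(10)]  # two strongly dependent variables
synthetic = synthetic_second_class(original, rng)

# Artificial two-class problem: class 1 = original data, class 2 = synthetic data.
features = original + synthetic
labels = [1] * len(original) + [2] * len(synthetic)
print(synthetic[:3])
```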
If the oob misclassification rate in the two-class problem is, say, 40% or
more, it implies that the x -variables look too much like independent
variables to random forests. The dependencies do not have a large role
and not much discrimination is taking place. If the misclassification rate is
lower, then the dependencies are playing an important role.
Formulating it as a two class problem has a number of payoffs. Missing
values can be replaced effectively. Outliers can be found. Variable
importance can be measured. Scaling can be performed (in this case, if the
original data had labels, the unsupervised scaling often retains the
structure of the original scaling). But the most important payoff is the
possibility of clustering.
Balancing prediction error

In some data sets, the prediction error between classes is highly
unbalanced. Some classes have a low prediction error, others a high. This
occurs usually when one class is much larger than another. Then random
forests, trying to minimize overall error rate, will keep the error rate low on
the large class while letting the smaller classes have a larger error rate. For
instance, in drug discovery, where a given molecule is classified as active
or not, it is common to have the actives outnumbered by 10 to 1, up to 100
to 1. In these situations the error rate on the interesting class (actives) will
be very high.
The user can detect the imbalance by outputting the error rates for the
individual classes. To illustrate, 20-dimensional synthetic data is used.
Class 1 occurs in one spherical Gaussian, class 2 on another. A training
set of 1000 class 1's and 50 class 2's is generated, together with a test set
of 5000 class 1's and 250 class 2's.
The final output of a forest of 500 trees on this data is:
500 3.7 0.0 78.4

There is a low overall test set error (3.73%) but class 2 has over 3/4 of its
cases misclassified.
The error balancing can be done by setting different weights for the
classes.
The higher the weight a class is given, the more its error rate is decreased.
A guide as to what weights to give is to make them inversely proportional to
the class populations. So set weights to 1 on class 1, and 20 on class 2,
and run again. The output is:
500 12.1 12.7 0.0
The weight of 20 on class 2 is too high. Set it to 10 and try again, getting:
500 4.3 4.2 5.2
This is pretty close to balance. If exact balance is wanted, the weight on
class 2 could be jiggled around a bit more.
Note that in getting this balance, the overall error rate went up. This is the
usual result - to get better balance, the overall error rate will be increased.
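The arithmetic in this section can be reproduced with a short sketch. It assumes that the output rows list the number of trees, the overall error, and then the class 1 and class 2 errors in percent, which is consistent with the figures quoted in the text; the function names are hypothetical.

```python
def inverse_frequency_weights(class_counts):
    """Weights proportional to 1/count, scaled so the largest class has weight 1."""
    largest = max(class_counts.values())
    return {cls: largest / count for cls, count in class_counts.items()}


def overall_error(class_sizes, class_error_pct):
    """Overall error (percent) as the size-weighted mean of the per-class errors."""
    total = sum(class_sizes.values())
    return sum(class_sizes[c] * class_error_pct[c] for c in class_sizes) / total


# Training set of 1000 class 1's and 50 class 2's: starting weights 1 and 20.
weights = inverse_frequency_weights({1: 1000, 2: 50})

# Test set of 5000 class 1's and 250 class 2's, unweighted run.
test_sizes = {1: 5000, 2: 250}
unweighted = overall_error(test_sizes, {1: 0.0, 2: 78.4})
print(weights, round(unweighted, 2))
```

Because class 2 makes up under 5% of the test set, even a 78.4% error on it contributes only a few percent to the overall rate.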
Detecting novelties
The outlier measure for the test set can be used to find novel cases not
fitting well into any previously established classes.
The satimage data is used to illustrate. There are 4435 training cases,
2000 test cases, 36 variables and 6 classes.
In the experiment five cases were selected at equal intervals in the test set.
Each of these cases was made a "novelty" by replacing each variable in
the case by the value of the same variable in a randomly selected training
case. The run is done using noutlier =2, nprox =1. The output of the run is
graphed below:
This shows that using an established training set, test sets can be run
down and checked for novel cases, rather than running the training set
repeatedly. The training set results can be stored so that test sets can be
run through the forest without reconstructing it.
This method of checking for novelty is experimental. It may not distinguish
novel cases on other data. For instance, it does not distinguish novel cases
in the dna test data.
A case study - microarray data

To give an idea of the capabilities of random forests, we illustrate them on
an early microarray lymphoma data set with 81 cases, 3 classes, and 4682
variables corresponding to gene expressions.
Classification mode
To do a straight classification run, use the settings:
parameter(
c DESCRIBE DATA
1 mdim=4682, nsample0=81, nclass=3, maxcat=1,
1 ntest=0, labelts=0, labeltr=1,
c
c SET RUN PARAMETERS
2 mtry0=150, ndsize=1, jbt=1000, look=100, lookcls=1,
2 jclasswt=0, mdim2nd=0, mselect=0, iseed=4351,
c
c SET IMPORTANCE OPTIONS
3 imp=0, interact=0, impn=0, impfast=0,
c
c SET PROXIMITY COMPUTATIONS
4 nprox=0, nrnn=5,
c
c SET OPTIONS BASED ON PROXIMITIES
5 noutlier=0, nscale=0, nprot=0,
c
c REPLACE MISSING VALUES
6 code=-999, missfill=0, mfixrep=0,
c
c GRAPHICS
7 iviz=1,
c
c SAVING A FOREST
8 isaverf=0, isavepar=0, isavefill=0, isaveprox=0,
c
c RUNNING A SAVED FOREST
9 irunrf=0, ireadpar=0, ireadfill=0, ireadprox=0)
Note: since the sample size is small, for reliability 1000 trees are grown
using mtry0=150. The results are not sensitive to mtry0 over the range
50-200. Since look=100, the oob results are output every 100 trees in terms of
percentage misclassified:
100 2.47
200 2.47
300 2.47
400 2.47
500 1.23
600 1.23
700 1.23
800 1.23
900 1.23
1000 1.23
(Note: an error rate of 1.23% implies that 1 of the 81 cases was misclassified.)
Variable importance
The variable importances are critical. The run computing importances is
done by switching imp=0 to imp=1 in the above parameter list. The output
has four columns:
gene number
the raw importance score
the z-score obtained by dividing the raw score by its standard error
the significance level.
The highest 25 gene importances are listed sorted by their z-scores. To get
the output on a disk file, put impout=1, and give a name to the
corresponding output file. If impout is put equal to 2 the results are written
to screen and you will see a display similar to that immediately below:
gene number   raw score   z-score   significance
667 1.414 1.069 0.143
689 1.259 0.961 0.168
666 1.112 0.903 0.183
668 1.031 0.849 0.198
682 0.820 0.803 0.211
878 0.649 0.736 0.231
1080 0.514 0.729 0.233
1104 0.514 0.718 0.237
879 0.591 0.713 0.238
895 0.519 0.685 0.247
3621 0.552 0.684 0.247
3529 0.650 0.683 0.247
3404 0.453 0.661 0.254
623 0.286 0.655 0.256
3617 0.498 0.654 0.257
650 0.505 0.650 0.258
645 0.380 0.644 0.260
3616 0.497 0.636 0.262
938 0.421 0.635 0.263
915 0.426 0.631 0.264
669 0.484 0.626 0.266
663 0.550 0.625 0.266
723 0.334 0.610 0.271
685 0.405 0.605 0.272
3631 0.402 0.603 0.273
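The significance column is consistent with a one-sided upper-tail probability of a standard normal distribution evaluated at the z-score. The sketch below assumes that reading, which the text does not state explicitly, and checks it against two rows of the table; the function name is hypothetical.

```python
import math


def one_sided_significance(z):
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


sig_first = one_sided_significance(1.069)  # gene 667, listed as 0.143
sig_last = one_sided_significance(0.603)   # gene 3631, listed as 0.273
print(round(sig_first, 3), round(sig_last, 3))
```

Under this reading, none of the 25 listed genes reaches a conventional 5% level in the first run, since the largest z-score is only 1.069.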
Using important variables
Another useful option is to do an automatic rerun using only those variables
that were most important in the original run. Say we want to use only the 15
most important variables found in the first run in the second run. Then in
the options change mdim2nd=0 to mdim2nd=15, keep imp=1 and
compile. Directing output to screen, you will see the same output as above
for the first run plus the following output for the second run. Then the
importances are output for the 15 variables used in the 2nd run.
gene number   raw score   z-score   significance
3621 6.235 2.753 0.003
1104 6.059 2.709 0.003
3529 5.671 2.568 0.005
666 7.837 2.389 0.008
3631 4.657 2.363 0.009
667 7.005 2.275 0.011
668 6.828 2.255 0.012
689 6.637 2.182 0.015
878 4.733 2.169 0.015
682 4.305 1.817 0.035
644 2.710 1.563 0.059
879 1.750 1.283 0.100
686 1.937 1.261 0.104
1080 0.927 0.906 0.183
623 0.564 0.847 0.199
Variable interactions
Another option is looking at interactions between variables. If variable m1 is
correlated with variable m2 then a split on m1 will decrease the probability
of a nearby split on m2. The distance between splits on any two variables
is compared with their theoretical difference if the variables were
independent. The latter is subtracted from the former; a large resulting
value is an indication of a repulsive interaction. To get this output,
change interact=0 to interact=1, leaving imp=1 and mdim2nd=10.
The output consists of a code list: telling us the numbers of the genes
corresponding to id. 1-10. The interactions are rounded to the closest
integer and given in the matrix following two column list that tells which
gene number is number 1 in the table, etc.
1 2 3 4 5 6 7 8 9 10
1 0 13 2 4 8 -7 3 -1 -7 -2
2 13 0 11 14 11 6 3 -1 6 1
3 2 11 0 6 7 -4 3 1 1 -2
4 4 14 6 0 11 -2 1 -2 2 -4
5 8 11 7 11 0 -1 3 1 -8 1
6 -7 6 -4 -2 -1 0 7 6 -6 -1
7 3 3 3 1 3 7 0 24 -1 -1
8 -1 -1 1 -2 1 6 24 0 -2 -3
9 -7 6 1 2 -8 -6 -1 -2 0 -5
10 -2 1 -2 -4 1 -1 -1 -3 -5 0
There are large interactions between gene 2 and genes 1,3,4,5 and
between 7 and 8.
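Reading the large entries off the matrix can be automated. The Python sketch below copies the matrix shown above and lists every pair of ids whose interaction is at least a chosen size in absolute value. The cutoff of 11 and the function name large_interactions are hypothetical choices for illustration; the program does not define a threshold.

```python
# Interaction matrix copied from the table above (ids 1-10).
INTERACTIONS = [
    [0, 13, 2, 4, 8, -7, 3, -1, -7, -2],
    [13, 0, 11, 14, 11, 6, 3, -1, 6, 1],
    [2, 11, 0, 6, 7, -4, 3, 1, 1, -2],
    [4, 14, 6, 0, 11, -2, 1, -2, 2, -4],
    [8, 11, 7, 11, 0, -1, 3, 1, -8, 1],
    [-7, 6, -4, -2, -1, 0, 7, 6, -6, -1],
    [3, 3, 3, 1, 3, 7, 0, 24, -1, -1],
    [-1, -1, 1, -2, 1, 6, 24, 0, -2, -3],
    [-7, 6, 1, 2, -8, -6, -1, -2, 0, -5],
    [-2, 1, -2, -4, 1, -1, -1, -3, -5, 0],
]

def large_interactions(matrix, threshold):
    """Return (id_a, id_b, value) for each pair with |value| >= threshold."""
    pairs = []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only; the matrix is symmetric
            if abs(matrix[i][j]) >= threshold:
                pairs.append((i + 1, j + 1, matrix[i][j]))
    return pairs

for a, b, value in large_interactions(INTERACTIONS, threshold=11):
    print(f"ids {a} and {b}: {value}")
```

With this cutoff the listing contains the pairs named in the text and, in addition, ids 4 and 5, whose entry equals 11.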
Scaling the data

The wish of every data analyst is to get an idea of what the data looks like.
There is an excellent way to do this in random forests.
Using metric scaling, the proximities can be projected down onto a low-
dimensional Euclidean space using "canonical coordinates". D canonical
coordinates will project onto a D-dimensional space. To get 3 canonical
coordinates, the options are as follows:
parameter(
c DESCRIBE DATA
1 mdim=4682, nsample0=81, nclass=3, maxcat=1,
1 ntest=0, labelts=0, labeltr=1,
c
c SET RUN PARAMETERS
2 mtry0=150, ndsize=1, jbt=1000, look=100, lookcls=1,
2 jclasswt=0, mdim2nd=0, mselect=0, iseed=4351,
c
c SET IMPORTANCE OPTIONS
3 imp=0, interact=0, impn=0, impfast=0,
c
c SET PROXIMITY COMPUTATIONS
4 nprox=1, nrnn=50,
c
c SET OPTIONS BASED ON PROXIMITIES
5 noutlier=0, nscale=3, nprot=0,
c
c REPLACE MISSING VALUES
6 code=-999, missfill=0, mfixrep=0,
c
c GRAPHICS
7 iviz=1,
c
c SAVING A FOREST
8 isaverf=0, isavepar=0, isavefill=0, isaveprox=0,
c
c RUNNING A SAVED FOREST
9 irunrf=0, ireadpar=0, ireadfill=0, ireadprox=0)
Note that imp and mdim2nd have been set back to zero and nscale set
equal to 3. nrnn is set to 50, which instructs the program to compute the 50
largest proximities for each case. Set iscaleout=1. Compiling gives an
output with nsample rows and these columns giving case id, true class,
predicted class and 3 columns giving the values of the three scaling
coordinates. Plotting the 2nd canonical coordinate vs. the first gives:
The three classes are very distinguishable. Note: if one tries to get this
result by any of the present clustering algorithms, one is faced with the job
of constructing a distance measure between pairs of points in 4682-
dimensional space - a low payoff venture. The plot above, based on
proximities, illustrates their intrinsic connection to the data.
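The projection step can be illustrated with classical metric scaling. The Python sketch below double-centers a proximity matrix and extracts the leading eigenvectors by power iteration, scaling each by the square root of its eigenvalue. It assumes that 1 minus the proximity behaves like a squared distance and uses a tiny hand-made proximity matrix; the function names and the toy data are hypothetical, and the program's own computation may differ in detail.

```python
import math

def double_center(prox):
    """Double-center a symmetric proximity matrix."""
    n = len(prox)
    row_mean = [sum(row) / n for row in prox]
    grand_mean = sum(row_mean) / n
    return [[0.5 * (prox[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]

def scaling_coordinates(prox, n_coords, iters=500):
    """Return one row per case holding n_coords scaling coordinates."""
    n = len(prox)
    mat = double_center(prox)
    coords = [[0.0] * n_coords for _ in range(n)]
    for c in range(n_coords):
        # Power iteration from a fixed, non-constant unit starting vector.
        v = [math.sin(i + 1 + c) for i in range(n)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iters):
            w = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        mv = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * mv[i] for i in range(n))
        scale = math.sqrt(eigenvalue) if eigenvalue > 0 else 0.0
        for i in range(n):
            coords[i][c] = scale * v[i]
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                mat[i][j] -= eigenvalue * v[i] * v[j]
    return coords

# Toy proximities: cases 0 and 1 are close, as are cases 2 and 3.
PROX = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
for case, row in enumerate(scaling_coordinates(PROX, 2)):
    print(case, [round(x, 3) for x in row])
```

In the toy example the first coordinate separates the two groups of cases, which is the behavior the plots in this section rely on. Power iteration is used only to keep the sketch free of dependencies; a linear algebra library would normally do the eigen-decomposition.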
Prototypes
Two prototypes are computed for each class in the microarray data.
The settings are mdim2nd=15, nprot=2, imp=1, nprox=1, nrnn=20. The
values of the variables are normalized to be between 0 and 1. Here is the
graph:
Outliers
An outlier is a case whose proximities to all other cases are small. Using
this idea, a measure of outlyingness is computed for each case in the
training sample. This measure is different for the different classes.
Generally, if the measure is greater than 10, the case should be carefully
inspected. Other users have found a lower threshold more useful. To
compute the measure, set nout=1 and all other options to zero. Here is a
plot of the measure:
There are two possible outliers: one is the first case in class 1, the second
is the first case in class 2.
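One way to turn "small proximities to all other cases" into a number is sketched below in Python. For each case it sums the squared proximities to the other cases of the same class, inverts that sum, and then standardizes within the class using the median and the mean absolute deviation from the median. This particular formula, the function name outlier_measure and the toy data are assumptions made for illustration; the text above does not spell out the program's exact computation.

```python
from statistics import median

def outlier_measure(prox, classes):
    """Per-case outlyingness, standardized within each class."""
    n = len(prox)
    raw = []
    for i in range(n):
        # Sum of squared proximities to the other cases of the same class.
        total = sum(prox[i][k] ** 2 for k in range(n)
                    if k != i and classes[k] == classes[i])
        raw.append(n / max(total, 1e-12))
    scores = [0.0] * n
    for label in set(classes):
        idx = [i for i in range(n) if classes[i] == label]
        med = median(raw[i] for i in idx)
        dev = sum(abs(raw[i] - med) for i in idx) / len(idx)
        for i in idx:
            scores[i] = (raw[i] - med) / dev if dev > 0 else 0.0
    return scores

# Toy example: case 3 has low proximity to every other case in its class.
TOY_PROX = [
    [1.0, 0.8, 0.8, 0.1],
    [0.8, 1.0, 0.8, 0.1],
    [0.8, 0.8, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
print(outlier_measure(TOY_PROX, classes=[1, 1, 1, 1]))
```

Because the standardization is done class by class, the resulting values are comparable within a class, which matches the remark that the measure differs between classes.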
A case study: dna data

There are other options in random forests that we illustrate using the dna
data set. There are 60 variables, all four-valued categorical, three classes,
2000 cases in the training set and 1186 in the test set. This is a classic
machine learning data set and is described more fully in the 1994 book
"Machine Learning, Neural and Statistical Classification", edited by D. Michie,
D.J. Spiegelhalter and C.C. Taylor.
This data set is interesting as a case study because the categorical nature
of the prediction variables makes many other methods, such as nearest
neighbors, difficult to apply.
Missing values in the training set

To illustrate the options for missing value fill-in, runs were done on the dna
data after deleting 10%, 20%, 30%, 40%, and 50% of the data at
random. Both methods, missfill=1 and mfixrep=5, were used. The results
are given in the graph below.
It is remarkable how effective the mfixrep process is. Similarly effective
results have been obtained on other data sets. Here nrnn=5 is used.
Larger values of nrnn do not give such good results.
At the end of the replacement process, it is advisable that the completed
training set be downloaded by setting idataout=1.
Missing values in the test set

In v5, the only way to replace missing values in the test set is to
set missfill=2 with nothing else on. Depending on whether the test set has
labels or not, missfill uses different strategies. In both cases it uses the fill
values obtained by the run on the training set.
We measure how good the fill of the test set is by seeing what error rate it
assigns to the training set (which has no missing values). If the test set is drawn
from the same distribution as the training set, it gives an error rate of 3.7%.
As the proportion of missing values increases, using a fill drifts the distribution of
the test set away from the training set and the test set error rate will
increase.
We can check the accuracy of the fill for no labels by using the dna data,
setting labelts=0, but then checking the error rate between the classes filled
in and the true labels.
missing %   labelts=1   labelts=0
10          4.9         5.0
20          8.1         8.4
30          13.4        13.8
40          21.4        22.4
50          30.4        31.4
There is only a small loss in not having the labels to assist the fill.
Mislabeled Cases
The dna data set has 2000 cases in the training set, 1186 in the test set,
and 60 variables, all of which are four-valued categorical variables. In the
training set, one hundred cases are chosen at random and their class
labels randomly switched. The outlier measure is computed and is graphed
below, with the black squares representing the class-switched cases.
Select the threshold as 2.73. Then 90 of the 100 cases with altered classes
have an outlier measure exceeding this threshold. Of the 1900 unaltered
cases, 62 exceed the threshold.
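The counts quoted above can be turned into rates with a few lines of Python. The variable names are illustrative; the four counts are the ones given in the text, and the "share of flagged cases that were altered" is a derived quantity, not one reported by the program.

```python
altered_total, altered_flagged = 100, 90
unaltered_total, unaltered_flagged = 1900, 62

# Fraction of class-switched cases that exceed the threshold.
detection_rate = altered_flagged / altered_total
# Fraction of untouched cases that exceed the threshold anyway.
false_alarm_rate = unaltered_flagged / unaltered_total
# Among all flagged cases, the share that really were switched.
flagged_share_altered = altered_flagged / (altered_flagged + unaltered_flagged)

print(f"detected altered cases: {detection_rate:.1%}")
print(f"flagged unaltered cases: {false_alarm_rate:.1%}")
print(f"flagged cases that were altered: {flagged_share_altered:.1%}")
```

So the threshold catches most of the switched labels while flagging only a few percent of the untouched cases, although those few percent still make up a sizeable part of everything flagged because unaltered cases are far more numerous.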
Case studies for unsupervised learning
Clustering microarray data
We give some examples of the effectiveness of unsupervised clustering in
retaining the structure of the unlabeled data. The scaling for the microarray
data has this picture:
Suppose that in the 81 cases the class labels are erased, but we want to
cluster the data to see if there is any natural conglomeration. Again, with
a standard approach the problem is trying to get a distance measure
between 4681 variables. Random forests uses a different tack.
Set labeltr=0. A synthetic data set is constructed that also has 81 cases
and 4681 variables but has no dependence between variables. The original
data set is labeled class 1, the synthetic class 2.
If there is good separation between the two classes, i.e. if the error rate is
low, then we can get some information about the original data.
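A simple way to build a synthetic data set with the same size but no dependence between variables is to draw each variable independently from its own observed values. The Python sketch below does this and stacks the result under the original data with class labels 1 and 2. The sampling scheme, the function names and the toy data are assumptions for illustration; the text above states only that the synthetic set has no dependence between variables.

```python
import random

def make_synthetic(data, rng):
    """Sample each variable independently from its own observed values."""
    n_cases = len(data)
    n_vars = len(data[0])
    columns = [[row[m] for row in data] for m in range(n_vars)]
    return [[rng.choice(columns[m]) for m in range(n_vars)]
            for _ in range(n_cases)]

def two_class_problem(data, seed=0):
    """Label the original data class 1 and the synthetic data class 2."""
    rng = random.Random(seed)
    synthetic = make_synthetic(data, rng)
    rows = data + synthetic
    labels = [1] * len(data) + [2] * len(synthetic)
    return rows, labels

# Toy data: 3 cases, 2 perfectly dependent variables.
rows, labels = two_class_problem([[1, 10], [2, 20], [3, 30]])
for row, label in zip(rows, labels):
    print(label, row)
```

Each synthetic variable keeps its original marginal distribution, so a classifier can tell the two classes apart only by exploiting dependence between variables in the original data.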
Set nprox=1 and iscale=D-1. Then the proximities in the original data set
are computed and projected down via scaling coordinates onto a low-
dimensional space. Here is the plot of the 2nd versus the first.
The three clusters gotten using class labels are still recognizable in the
unsupervised mode. The oob error between the two classes is 16.0%. If a
two-stage run is done with mdim2nd=15, the error rate drops to 2.5% and the
unsupervised clusters are tighter.
Clustering dna data

The scaling pictures of the dna data, both supervised and unsupervised,
are interesting and appear below:
The structure of the supervised scaling is retained, although with a different
rotation and axis scaling. The error between the two classes is 33%,
indicating a lack of strong dependency.
Clustering glass data

A more dramatic example of structure retention is given by using the glass
data set, another classic machine learning test bed. There are 214 cases, 9
variables and 6 classes. The labeled scaling gives this picture:
Erasing the labels results in this projection:

Clustering spectral data
Another example uses data graciously supplied by Merck that consists of
the first 468 spectral intensities in the spectra of 764 compounds. The
challenge presented by Merck was to find small cohesive groups of outlying
cases in this data. Using forests with labeltr=0, there was excellent
separation between the two classes, with an error rate of 0.5%, indicating
strong dependencies in the original data.
We looked at outliers and generated this plot.

This plot gives no indication of outliers. But outliers must be fairly isolated
to show up in the outlier display. To search for outlying groups, scaling
coordinates were computed. The plot of the 2nd vs. the 1st is below:
This shows, first, that the spectra fall into two main clusters. There is a
possibility of a small outlying group in the upper left hand corner. To get
another picture, the 3rd scaling coordinate is plotted vs. the 1st.
The group in question is now in the lower left hand corner and its
separation from the main body of the spectra has become more apparent.
References
The theoretical underpinnings of this program are laid out in the paper
"Random Forests". It's available on the same web page as this manual. It
was recently published in the Machine Learning Journal.
Multiple sclerosis
From Wikipedia, the free encyclopedia
Classification and external resources
[Figure: Demyelination by MS. The CD68 colored tissue shows several macrophages in the area of the lesion. Original scale 1:100]
ICD-10 G35
ICD-9 340
OMIM 126200
DiseasesDB 8412
MedlinePlus 000737
eMedicine neuro/228 oph/179 emerg/321 pmr/82 radio/461
MeSH D009103
Multiple sclerosis (MS), also known as disseminated sclerosis or encephalomyelitis
disseminata, is an inflammatory disease in which the insulating covers of nerve cells in
the brain and spinal cord are damaged. This damage disrupts the ability of parts of the nervous
system to communicate, resulting in a wide range of signs and symptoms,[1][2] including
physical, mental,[2] and sometimes psychiatric problems.[3] MS takes several forms, with new
symptoms either occurring in isolated attacks (relapsing forms) or building up over time
(progressive forms).[4] Between attacks, symptoms may go away completely; however,
permanent neurological problems often occur, especially as the disease advances.[4]
While the cause is not clear, the underlying mechanism is thought to be either destruction by the
immune system or failure of the myelin-producing cells.[5] Proposed causes for this
include genetics and environmental factors such as infections.[2][6] MS is usually diagnosed
based on the presenting signs and symptoms and the results of supporting medical tests.
There is no known cure for multiple sclerosis. Treatments attempt to improve function after an
attack and prevent new attacks.[2] Medications used to treat MS while modestly effective can
have adverse effects and be poorly tolerated. Many people pursue alternative treatments,
despite a lack of evidence. The long-term outcome is difficult to predict, with good outcomes
more often seen in women; those who develop the disease early in life; those with a relapsing
course; and those who initially experienced few attacks.[7] Life expectancy is 5 to 10 years lower
than that of an unaffected population.[1]
As of 2008, between 2 and 2.5 million people are affected globally with rates varying widely in
different regions of the world and among different populations.[8] The disease usually begins
between the ages of 20 and 50 and is twice as common in women as in men.[9] The
name multiple sclerosis refers to scars (sclerae, better known as plaques or lesions)
particularly in the white matter of the brain and spinal cord.[10] MS was first described in 1868
by Jean-Martin Charcot.[10] A number of new treatments and diagnostic methods are under
development.
Signs and symptoms
Main symptoms of multiple sclerosis
A person with MS can have almost any neurological symptom or sign; with autonomic, visual,
motor, and sensory problems being the most common.[1] The specific symptoms are determined
by the locations of the lesions within the nervous system, and may include loss of
sensitivity or changes in sensation such as tingling, pins and needles or numbness, muscle
weakness, very pronounced reflexes, muscle spasms, or difficulty in moving; difficulties with
coordination and balance (ataxia); problems with speech or swallowing, visual problems
(nystagmus, optic neuritis or double vision), feeling tired, acute or chronic pain, and bladder and
bowel difficulties, among others.[1] Difficulties thinking and emotional problems such
as depression or unstable mood are also common.[1] Uhthoff's phenomenon, a worsening of
symptoms due to exposure to higher than usual temperatures, and Lhermitte's sign, an electrical
sensation that runs down the back when bending the neck, are particularly characteristic of
MS.[1] The main measure of disability and severity is the expanded disability status
scale (EDSS), with other measures such as the multiple sclerosis functional composite being
increasingly used in research.[11][12][13]
Animation created from an 1887 photographic study of locomotion of a male MS patient with walking
difficulties by Muybridge
The condition begins in 85% of cases as a clinically isolated syndrome over a number of days
with 45% having motor or sensory problems, 20% having optic neuritis, and 10% having
symptoms related to brainstem dysfunction, while the remaining 25% have more than one of the
previous difficulties.[14] The course of symptoms occurs in two main patterns initially; either as
episodes of sudden worsening that last a few days to months (called relapses, exacerbations,
bouts, attacks, or flare-ups) followed by improvement (85% of cases) or as a gradual worsening
over time without periods of recovery (10-15% of cases).[9] A combination of these two patterns
may also occur[4] or people may start in a relapsing and remitting course which then becomes
progressive later on.[9] Relapses are usually not predictable, occurring without
warning.[1] Exacerbations rarely occur more frequently than twice per year.[1] Some relapses,
however, are preceded by common triggers and they occur more frequently during spring and
summer.[15] Similarly, viral infections such as the common cold, influenza,
or gastroenteritis increase their risk.[1] Stress may also trigger an attack.[16] Being pregnant
decreases the risk of relapse; however, during the first months after delivery the risk
increases.[1] Overall, pregnancy does not seem to influence long-term disability.[1] Many events
have not been found to affect relapse rates including vaccination, breast feeding,[1] physical
trauma,[17] and Uhthoff's phenomenon.[15]
Causes
The cause of MS is unknown; however, it is believed to occur as a result of some combination
of environmental factors such as infectious agents and genetics.[1] Theories try to combine the
data into likely explanations, but none has proved definitive. While there are a number of
environmental risk factors and although some are partly modifiable, further research is needed
to determine whether their elimination can prevent MS.[18]
Geography
MS is more common in people who live farther from the equator, although exceptions
exist.[1][19] These exceptions include ethnic groups that are at low risk far from the equator such
as the Samis, Amerindians, Canadian Hutterites, New Zealand Māori,[20] and
Canada's Inuit,[9] as well as groups that have a relatively high risk close to the equator such
as Sardinians,[9] Palestinians and Parsis.[20] The cause of this geographical pattern is not
clear.[9] While the north-south gradient of incidence is decreasing,[19] as of 2010 it is still
present.[9]
MS is more common in regions with northern European populations[1] and the geographic
variation may simply reflect the global distribution of these high-risk populations.[9] Decreased
sunlight exposure resulting in decreased vitamin D production has also been put forward as an
explanation.[21][22] A relationship between season of birth and MS lends support to this idea, with
fewer people born in the northern hemisphere in November as compared to May being affected
later in life.[23] Environmental factors may play a role during childhood, with several studies
finding that people who move to a different region of the world before the age of 15 acquire the
new region's risk to MS. If migration takes place after age 15, however, the person retains the
risk of his home country.[1][18] There is some evidence that the effect of moving may still apply to
people older than 15.[1]
Genetics
HLA region of Chromosome 6. Changes in this area increase the probability of getting MS.
MS is not considered a hereditary disease; however, a number of genetic variations have been
shown to increase the risk.[24] The probability is higher in relatives of an affected person, with a
greater risk among those who are more closely related.[2] In identical twins both are affected
about 30% of the time, while around 5% of non-identical twins and 2.5% of siblings are affected,
with a lower percentage of half-siblings.[1][2][25] If both parents are affected the risk in their
children is 10 times that of the general population.[9] MS is also more common in some ethnic
groups than others.[26]
Specific genes that have been linked with MS include differences in the human leukocyte
antigen (HLA) system, a group of genes on chromosome 6 that serves as the major
histocompatibility complex (MHC).[1] That changes in the HLA region are related to susceptibility
has been known for over thirty years,[27] and additionally this same region has been implicated in
the development of other autoimmune diseases such as diabetes type I and systemic lupus
erythematosus.[27] The most consistent finding is the association between multiple sclerosis
and alleles of the MHC defined as DR15 and DQ6.[1] Other loci have shown a protective effect,
such as HLA-C554 and HLA-DRB1*11.[1] Overall, it has been estimated that HLA changes
account for between 20 and 60% of the genetic predisposition.[27] Modern genetic methods
(genome-wide association studies) have discovered at least twelve other genes outside the
HLA locus that modestly increase the probability of MS.[27]
Infectious agents
Many microbes have been proposed as triggers of MS, but none have been
confirmed.[2] Moving at an early age from one location in the world to another alters a person's
subsequent risk of MS.[6] An explanation for this could be that some kind of infection, produced
by a widespread microbe rather than a rare one, is related to the disease.[6] Proposed
mechanisms include the hygiene hypothesis and the prevalence hypothesis. The hygiene
hypothesis proposes that exposure to certain infectious agents early in life is protective, the
disease being a response to a late encounter with such agents.[1] The prevalence hypothesis
proposes that the disease is due to an infectious agent more common in regions where MS is
common and where in most individuals it causes an ongoing infection without symptoms. Only
in a few cases and after many years does it cause demyelination.[6][28] The hygiene hypothesis
has received more support than the prevalence hypothesis.[6]
Evidence for a virus as a cause includes: the presence of oligoclonal bands in the brain and
cerebrospinal fluid of most people with MS, the association of several viruses with human
demyelination encephalomyelitis, and the occurrence of demyelination in animals caused by
some viral infection.[29] Human herpes viruses are a candidate group of viruses. Individuals who
have never been infected by the Epstein-Barr virus are at a reduced risk of getting MS while
those infected as young adults are at a greater risk than those who had it at a younger
age.[1][6] Although some consider that this goes against the hygiene hypothesis, since the non-
infected have probably experienced a more hygienic upbringing,[6] others believe that there is no
contradiction since it is a first encounter with the causative virus relatively late in life that is the
trigger for the disease.[1] Other diseases that may be related
include measles, mumps and rubella.[1]
Other
Smoking has been shown to be an independent risk factor for MS.[21] Stress may be a risk factor
although the evidence to support this is weak.[18] Association with occupational exposures
and toxins, mainly solvents, has been evaluated, but no clear conclusions have been
reached.[18] Vaccinations were studied as causal factors; however, most studies show no
association.[18] Several other possible risk factors, such as diet and hormone intake, have been
looked at; however, evidence on their relation with the disease is "sparse and
unpersuasive".[21] Gout occurs less than would be expected and lower levels of uric acid have
been found in people with MS. This has led to the theory that uric acid is protective, although its
exact importance remains unknown.[30]
Pathophysiology
The three main characteristics of MS are the formation of lesions in the central nervous
system (also called plaques), inflammation, and the destruction of myelin sheaths of neurons.
These features interact in a complex and not yet fully understood manner to produce the
breakdown of nerve tissue and in turn the signs and symptoms of the disease.[1] Additionally MS
is believed to be an immune-mediated disorder that develops from an interaction of the
individual's genetics and as yet unidentified environmental causes.[2] Damage is believed to be
caused, at least in part, by the person's own immune system attacking the nervous system.[1]
Lesions
Demyelination in MS. On Klüver-Barrera myelin staining, decoloration in the area of the lesion can be
appreciated (Original scale 1:100)
The name multiple sclerosis refers to the scars (sclerae, better known as plaques or lesions)
that form in the nervous system. These lesions most commonly affect the white matter in
the optic nerve, brain stem, basal ganglia and spinal cord, or white matter tracts close to the
lateral ventricles.[1] The function of white matter cells is to carry signals between grey
matter areas, where the processing is done, and the rest of the body. The peripheral nervous
system is rarely involved.[2]
More specifically, MS involves the loss of oligodendrocytes, the cells responsible for creating
and maintaining a fatty layer, known as the myelin sheath, which helps the neurons
carry electrical signals (action potentials).[1] This results in a thinning or complete loss of myelin
and, as the disease advances, the breakdown of the axons of neurons. When the myelin is lost,
a neuron can no longer effectively conduct electrical signals.[2] A repair process,
called remyelination, takes place in early phases of the disease, but the oligodendrocytes are
unable to completely rebuild the cell's myelin sheath.[31] Repeated attacks lead to successively
less effective remyelinations, until a scar-like plaque is built up around the damaged
axons.[31] These scars are the origin of the symptoms and during an attack magnetic resonance
imaging (MRI) often shows more than ten new plaques.[1] This could indicate that there is a
number of lesions below which the brain is capable of repairing itself without producing
noticeable consequences.[1] Another process involved in the creation of lesions is an
abnormal increase in the number of astrocytes due to the destruction of nearby neurons.[1] A
number of lesion patterns have been described.[32]
Inflammation
Apart from demyelination, the other sign of the disease is inflammation. Fitting with
an immunological explanation, the inflammatory process is caused by T cells, a kind
of lymphocyte that plays an important role in the body's defenses.[2] T cells gain entry into the
brain via disruptions in the blood-brain barrier. The T cells recognize myelin as foreign and
attack it, explaining why these cells are also called "autoreactive lymphocytes".[1]
The attack of myelin starts inflammatory processes, which trigger other immune cells and the
release of soluble factors like cytokines and antibodies. Further breakdown of the blood-brain
barrier in turn causes a number of other damaging effects such as swelling, activation
of macrophages, and more activation of cytokines and other destructive proteins.[2] Inflammation
can potentially reduce transmission of information between neurons in at least three
ways.[1] The soluble factors released might stop neurotransmission by intact neurons. These
factors could lead to or enhance the loss of myelin, or they may cause the axon to break down
completely.[1]
Blood-brain barrier
The blood-brain barrier is a part of the capillary system that prevents the entry of T cells into the
central nervous system. It may become permeable to these types of cells secondary to an
infection by a virus or bacteria. After it repairs itself, typically once the infection has cleared, T
cells may remain trapped inside the brain.[2] Gadolinium cannot cross a normal BBB and
therefore Gadolinium-enhanced MRI is used to show BBB breakdowns.[33]
Diagnosis
Animation showing dissemination of brain lesions in time and space as demonstrated by monthly MRI
studies along a year
Multiple sclerosis is typically diagnosed based on the presenting signs and symptoms, in
combination with supporting medical imaging and laboratory testing.[14] It can be difficult to
confirm, especially early on, since the signs and symptoms may be similar to other medical
problems.[1][34] The McDonald criteria, which focus on clinical, laboratory and radiologic evidence
of lesions at different times and in different areas, are the most commonly used method of
diagnosis,[8] with the Schumacher and Poser criteria being of mostly historical
significance.[35] While the above criteria allow for a non-invasive diagnosis, some state that the
only definitive proof is an autopsy or biopsy where lesions typical of MS are detected.[1][36][37]
Clinical data alone may be sufficient for a diagnosis of MS if an individual has had separate
episodes of neurologic symptoms characteristic of the disease.[36] In those who seek medical
attention after only one attack, other testing is needed for the diagnosis. The most commonly
used diagnostic tools are neuroimaging, analysis of cerebrospinal fluid and evoked
potentials. Magnetic resonance imaging of the brain and spine may show areas of
demyelination (lesions or plaques). Gadolinium can be administered intravenously as a contrast
agent to highlight active plaques and, by elimination, demonstrate the existence of historical
lesions not associated with symptoms at the moment of the evaluation.[36][38] Testing
of cerebrospinal fluid obtained from a lumbar puncture can provide evidence of
chronic inflammation in the central nervous system. The cerebrospinal fluid is tested
for oligoclonal bands of IgG on electrophoresis, which are inflammation markers found in
75-85% of people with MS.[36][39] The nervous system in MS may respond less actively to
stimulation of the optic nerve and sensory nerves due to demyelination of such pathways.
These brain responses can be examined using visual and sensory evoked potentials.[40]
Clinical courses
Progression of MS subtypes
Several subtypes, or patterns of progression, have been described. Subtypes use the past
course of the disease in an attempt to predict the future course. They are important not only for
prognosis but also for treatment decisions. In 1996, the United States National Multiple
Sclerosis Society described four clinical courses:[4]
1. relapsing remitting,
2. secondary progressive,
3. primary progressive, and
4. progressive relapsing.
The relapsing-remitting subtype is characterized by unpredictable relapses followed by periods
of months to years of relative quiet (remission) with no new signs of disease activity. Deficits
that occur during attacks may either resolve or leave problems, the latter in about 40% of
attacks and being more common the longer a person has had the disease.[1][14] This describes
the initial course of 80% of individuals with MS.[1] When deficits always resolve between attacks,
this is sometimes referred to as benign MS,[41] although people will still build up some degree of
disability in the long term.[1] On the other hand, the term malignant multiple sclerosis is used to
describe people with MS who reach significant level of disability in a short period of time.[42] The
relapsing-remitting subtype usually begins with a clinically isolated syndrome (CIS). In CIS, a
person has an attack suggestive of demyelination, but does not fulfill the criteria for multiple
sclerosis.[1][43] 30 to 70% of persons experiencing CIS later develop MS.[43]
Nerve axon with myelin sheath
Secondary progressive MS occurs in around 65% of those with initial relapsing-remitting MS,
who eventually have progressive neurologic decline between acute attacks without any definite
periods of remission.[1][4] Occasional relapses and minor remissions may appear.[4] The most
common length of time between disease onset and conversion from relapsing-remitting to
secondary progressive MS is 19 years.[44] The primary progressive subtype occurs in
approximately 10-20% of individuals, with no remission after the initial symptoms.[45][14] It is
characterized by progression of disability from onset, with no, or only occasional and minor,
remissions and improvements.[4] The usual age of onset for the primary progressive subtype is
later than of the relapsing-remitting subtype. It is similar to the age that secondary progressive
usually begins in relapsing-remitting MS, around 40 years of age.[1]
Progressive relapsing MS describes those individuals who, from onset, have a steady
neurologic decline but also have clear superimposed attacks. This is the least common of all
subtypes.[4]
Unusual types of MS have been described; these include Devic's disease, Balo concentric
sclerosis, Schilder's diffuse sclerosis and Marburg multiple sclerosis. There is debate on
whether they are MS variants or different diseases.[46] Multiple sclerosis behaves differently in
children, taking more time to reach the progressive stage.[1] Nevertheless they still reach it at a
lower average age than adults usually do.[1]
Management
Although there is no known cure for multiple sclerosis, several therapies have proven helpful.
The primary aims of therapy are returning function after an attack, preventing new attacks, and
preventing disability. As with any medical treatment, medications used in the management of
MS have several adverse effects. Alternative treatments are pursued by some people, despite
the shortage of supporting evidence.
Acute attacks
During symptomatic attacks, administration of high doses of intravenous corticosteroids, such
as methylprednisolone, is the usual therapy,[1] with oral corticosteroids seeming to have a
similar efficacy and safety profile.[47] Although generally effective in the short term for relieving
symptoms, corticosteroid treatments do not appear to have a significant impact on long-term
recovery.[48] The consequences of severe attacks which do not respond to corticosteroids might
be treatable by plasmapheresis.[1]
Disease-modifying treatments
Relapsing remitting multiple sclerosis
Eight disease-modifying treatments have been approved by regulatory agencies for relapsing-
remitting multiple sclerosis (RRMS) including: interferon beta-1a, interferon beta-1b, glatiramer
acetate, mitoxantrone, natalizumab, fingolimod,[49] teriflunomide[50] and dimethyl
fumarate.[51] Their cost effectiveness as of 2012 is unclear.[52]
In RRMS they are modestly effective at decreasing the number of attacks.[49] The interferons
and glatiramer acetate are first-line treatments[14] and are roughly equivalent, reducing relapses
by approximately 30%.[53] Early-initiated long-term therapy is safe and improves
outcomes.[54][55] Natalizumab reduces the relapse rate more than first-line agents; however,
because of its adverse effects it is a second-line agent reserved for those who do not respond to
other treatments[14] or who have severe disease.[53] Mitoxantrone, whose use is limited by severe
adverse effects, is a third-line option for those who do not respond to other
medications.[14] Treatment of clinically isolated syndrome (CIS) with interferons decreases the
chance of progressing to clinical MS.[1][56] Efficacy of interferons and glatiramer acetate in
children has been estimated to be roughly equivalent to that of adults.[57] The role of some of the
newer agents such as fingolimod, teriflunomide, and dimethyl fumarate, as of 2011, is not yet
entirely clear.[58]
Progressive multiple sclerosis
No treatment has been shown to change the course of primary progressive MS[14] and as of
2011 only one medication, mitoxantrone, has been approved for secondary progressive
MS.[59] In this population tentative evidence supports mitoxantrone moderately slowing the
progression of the disease and decreasing rates of relapses over two years.[60][61]
Adverse effects
Irritation zone after injection of glatiramer acetate.
The disease-modifying treatments have several adverse effects. One of the most common is
irritation at the injection site for glatiramer acetate and the interferons (up to 90% with
subcutaneous injections and 33% with intramuscular injections).[62] Over time, a visible dent at
the injection site, due to the local destruction of fat tissue, known as lipoatrophy, may
develop.[62] Interferons may produce flu-like symptoms;[63] some people taking glatiramer
experience a post-injection reaction with flushing, chest tightness, heart palpitations,
breathlessness, and anxiety, which usually lasts less than thirty minutes.[64] More dangerous but
much less common are liver damage from interferons,[65] systolic dysfunction (12%), infertility,
and acute myeloid leukemia (0.8%) from mitoxantrone,[60][66] and progressive multifocal
leukoencephalopathy with natalizumab (occurring in 1 in 600 people treated).[14][67]
Fingolimod may give rise to hypertension and bradycardia, macular edema, elevated liver
enzymes or a reduction in lymphocyte levels.[68] Tentative evidence supports the short term
safety of teriflunomide, with common side effects including: headaches, fatigue, nausea, hair
loss, and limb pain.[49] There have also been reports of liver failure and PML with its use and it
is dangerous for fetal development.[68] Most common side effects of dimethyl fumarate are
flushing and gastrointestinal problems.[51][68] While dimethyl fumarate may lead to a reduction in
the white blood cell count there were no reported cases of opportunistic infections during
trials.[69][70]
Associated symptoms
Both medications and neurorehabilitation have been shown to improve some symptoms, though
neither changes the course of the disease.[71] Some symptoms have a good response to
medication, such as an unstable bladder and spasticity, while others are little changed.[1] For
neurologic problems, a multidisciplinary approach is important for improving quality of life;
however, it is difficult to specify a 'core team' as many different health services may be needed
at different points in time.[1] Multidisciplinary rehabilitation programs increase activity and
participation of people with MS but do not influence impairment level.[72] There is limited
evidence for the overall efficacy of individual therapeutic disciplines,[73][74] though there is good
evidence that specific approaches, such as exercise[75][76] and psychological therapies,
particularly cognitive behavioral approaches, are effective.[77]
Alternative treatments
Over 50% of people with MS may use complementary and alternative medicine, although
percentages vary depending on how alternative medicine is defined.[78] The evidence for the
effectiveness of such treatments in most cases is weak or absent.[78][79] While there is tentative
evidence that vitamin D may be useful, evidence is insufficient for a definitive
conclusion.[80] Treatments of unproven benefit used by people with MS include: dietary
supplementation and regimens,[78][81][82] relaxation techniques such as yoga,[78] herbal
medicine (including medical cannabis),[78][83] hyperbaric oxygen therapy,[84] self-infection with
hookworms, reflexology and acupuncture.[78][85] Regarding the characteristics of users, they are
more frequently women, have had MS for a longer time, tend to be more disabled and have
lower levels of satisfaction with conventional healthcare.[78]
Prognosis
Disability-adjusted life years for multiple sclerosis per 100,000 inhabitants in 2004
(map legend: no data; <13; 13–16; 16–19; 19–22; 22–25; 25–28; 28–31; 31–34; 34–37; 37–40; 40–43; >43)
The expected future course of the disease depends on the subtype of the disease; the
individual's sex, age, and initial symptoms; and the degree of disability the person has.[7] Female
sex, relapsing-remitting subtype, optic neuritis or sensory symptoms at onset, few attacks in the
initial years and especially early age at onset, are associated with a better course.[7][86]
The average life expectancy is 30 years from onset, which is 5 to 10 years less than that of
unaffected people.[1] Almost 40% of people with MS reach the seventh decade of
life.[86] Nevertheless, two-thirds of the deaths are directly related to the consequences of the
disease.[1] Suicide is more common, while infections and other complications are especially
dangerous for the more disabled.[1] Although most people lose the ability to walk before death,
90% are capable of independent walking at 10 years from onset, and 75% at 15 years.[86][87]
Epidemiology
The number of people with MS, as of 2010, is 2–2.5 million (approximately 30 per 100,000)
globally, with rates varying widely in different regions.[8][9] It is estimated to have resulted in
18,000 deaths that year.[88] In Africa rates are less than 0.5 per 100,000, while they are 2.8 per
100,000 in South East Asia, 8.3 per 100,000 in the Americas, and 80 per 100,000 in
Europe.[8] Rates surpass 200 per 100,000 in certain populations of Northern European
descent.[9] The number of new cases which develop per year is about 2.5 per 100,000.[8]
Rates of MS appear to be increasing; this, however, may be explained simply by better
diagnosis.[9] Studies on populational and geographical patterns have been common[28] and have
led to a number of theories about the cause.[6][18][21]
MS usually appears in adults in their late twenties or early thirties, but it can rarely start in
childhood or after 50 years of age.[8][9] The primary progressive subtype is more common in
people in their fifties.[45] Similar to many autoimmune disorders, the disease is more common in
women, and the trend may be increasing.[1][19] As of 2008, globally it is about two times more
common in women than in men.[8] In children, it is even more common in females than
males,[1] while in people over fifty, it affects males and females almost equally.[45]
History
Medical discovery
Detail of Carswell's drawing of MS lesions in the brain stem and spinal cord (1838)
The French neurologist Jean-Martin Charcot (1825–1893) was the first person to recognize
multiple sclerosis as a distinct disease in 1868.[89] Summarizing previous reports and adding his
own clinical and pathological observations, Charcot called the disease sclérose en plaques. The
three signs of MS now known as Charcot's triad 1 are nystagmus, intention tremor,
and telegraphic speech (scanning speech), though these are not unique to MS. Charcot also
observed cognition changes, describing his patients as having a "marked enfeeblement of the
memory" and "conceptions that formed slowly".[10]
Before Charcot, Robert Carswell (1793–1857), a British professor of pathology, and Jean
Cruveilhier (1791–1873), a French professor of pathologic anatomy, had described and
illustrated many of the disease's clinical details, but did not identify it as a separate
disease.[89] Specifically, Carswell described the injuries he found as "a remarkable lesion of the
spinal cord accompanied with atrophy".[1] Under the microscope, Swiss pathologist Georg
Eduard Rindfleisch (1836–1908) noted in 1863 that the inflammation-associated lesions were
distributed around blood vessels.[90][91] During the 20th century theories about the cause and
pathogenesis were developed, and effective treatments began to appear in the 1990s.[1]
Historical cases
Photographic study of locomotion of a female patient with MS and walking difficulties, created in 1887
by Muybridge
There are several historical accounts of people who lived before or shortly after the disease was
described by Charcot and probably had MS.
A young woman called Halldora who lived in Iceland around 1200 suddenly lost her vision and
mobility but, after praying to the saints, recovered them seven days after. Saint
Lidwina of Schiedam (1380–1433), a Dutch nun, may be one of the first clearly identifiable
people with MS. From the age of 16 until her death at 53, she had intermittent pain, weakness
of the legs, and vision loss, symptoms typical of MS.[92] Both cases have led to the proposal of
a "Viking gene" hypothesis for the dissemination of the disease.[93]
Augustus Frederick d'Este (1794–1848), son of Prince Augustus Frederick, Duke of
Sussex and Lady Augusta Murray and the grandson of George III of the United Kingdom,
almost certainly had MS. D'Este left a detailed diary describing his 22 years living with the
disease. His diary began in 1822 and ended in 1846, although it remained unknown until 1948.
His symptoms began at age 28 with a sudden transient visual loss (amaurosis fugax) after the
funeral of a friend. During the course of his disease, he developed weakness of the legs,
clumsiness of the hands, numbness, dizziness, bladder disturbances, and erectile dysfunction.
In 1844, he began to use a wheelchair. Despite his illness, he kept an optimistic view of
life.[94][95] Another early account of MS was kept by the British diarist W. N. P. Barbellion,
nom-de-plume of Bruce Frederick Cummings (1889–1919), who maintained a detailed log of his
diagnosis and struggle.[95] His diary was published in 1919 as The Journal of a Disappointed
Man.[96]
Research
Main article: Multiple sclerosis research
Medications
Chemical structure of alemtuzumab
There is ongoing research looking for more effective, convenient, and tolerable treatments for
relapsing-remitting MS; creation of therapies for the progressive
subtypes; neuroprotection strategies; and effective symptomatic treatments.[97]
During the 2000s and 2010s there has been approval of several oral drugs which are expected
to gain in popularity and frequency of use.[98] Further oral drugs are under investigation, one
being laquinimod, which was announced in August 2012 and is in a third phase III trial after
mixed results in the previous ones.[99] Similarly, studies aimed to improve the efficacy and ease
of use of already existing therapies are occurring. This includes the use of new preparations
such as the PEGylated version of interferon beta-1a, which it is hoped may be given less
frequently with similar effects.[100][101] A request for approval of peginterferon beta-1a is
expected during 2013.[101]
Monoclonal antibodies have also raised high levels of
interest. Alemtuzumab, daclizumab and CD20 monoclonal antibodies such
as rituximab, ocrelizumab and ofatumumab have all shown some benefit and are under study as
potential treatments.[70] Their use has also been accompanied by the appearance of potentially
dangerous adverse effects, most importantly opportunistic infections.[98] Related to these
investigations is the development of a test for JC virus antibodies which might help to determine
who is at greater risk of developing progressive multifocal leukoencephalopathy when taking
natalizumab.[98] While monoclonal antibodies will probably have some role in the treatment of
the disease in the future, it is believed that it will be small due to the risks associated with
them.[98]
Another research strategy is to evaluate the combined effectiveness of two or more
drugs.[102] The main rationale for using a number of medications in MS is that the involved
treatments target different mechanisms and therefore their use is not necessarily
exclusive.[102] Synergies, in which one drug improves the effect of another, are also possible, but
there can also be drawbacks such as the blocking of the action of the other or worsened side
effects.[102] There have been several trials of combined therapy, yet none have shown positive
enough results to be considered as a useful treatment for MS.[102]
Research on neuroprotection and regenerative treatments, such as stem cell therapy, while of
high importance, is in the early stages.[103] Likewise, there are not any effective treatments for
the progressive variants of the disease. Many of the newest drugs as well as those under
development are probably going to be evaluated as therapies for PPMS or SPMS.[98]
Disease biomarkers
MRI brain scan produced using a gradient-echo phase sequence showing an iron deposit in a white
matter lesion (inside green box in the middle of the image; enhanced and marked by red arrow top-left
corner)[104]
While diagnostic criteria are not expected to change in the near future, work to
develop biomarkers that help with diagnosis and prediction of disease progression is
ongoing.[98] New diagnostic methods that are being investigated include work with
anti-myelin antibodies, and studies with serum and cerebrospinal fluid, but none of them has yielded
reliably positive results.[105]
Currently there are no laboratory investigations that can predict prognosis. Several promising
approaches have been proposed including: interleukin-6, nitric oxide and nitric oxide
synthase, osteopontin, and fetuin-A.[105] Since disease progression is the result of degeneration
of neurons, the roles of proteins showing loss of nerve tissue such
as neurofilaments, tau and N-acetylaspartate are under investigation.[105] Other efforts include
looking for biomarkers that distinguish between those who will and will not respond to
medications.[105]
Improvements in neuroimaging techniques such as positron emission tomography (PET)
or magnetic resonance imaging (MRI) carry a promise for better diagnosis and prognosis
predictions, although the effect of such improvements in daily medical practice may take several
decades.[98] Regarding MRI, there are several techniques that have already shown some
usefulness in research settings and could be introduced into clinical practice, such as double-
inversion recovery sequences, magnetization transfer, diffusion tensor, and functional magnetic
resonance imaging.[106] These techniques are more specific for the disease than existing ones,
but still lack some standardization of acquisition protocols and the creation of normative
values.[106] There are other techniques under development that include contrast agents capable
of measuring levels of peripheral macrophages, inflammation, or neuronal dysfunction,[106] and
techniques that measure iron deposition that could serve to determine the role of this feature in
MS, or that of cerebral perfusion.[106] Similarly, new PET radiotracers might serve as markers of
altered processes such as brain inflammation, cortical pathology, apoptosis, or
remyelination.[107]
Chronic cerebrospinal venous insufficiency

Main article: Chronic cerebrospinal venous insufficiency
In 2008, vascular surgeon Paolo Zamboni suggested that MS involves narrowing of the veins
draining the brain which he referred to as chronic cerebrospinal venous insufficiency (CCSVI).
He found CCSVI in all patients with MS in his study, performed a surgical procedure, later called
in the media the "liberation procedure", to correct it, and claimed that 73% of participants
improved.[108] This theory received significant attention in the media and among those with MS,
especially in Canada.[109] Concerns have been raised with Zamboni's research as it was neither
blinded nor controlled, and its assumptions about the underlying cause of the disease are not
backed by known data.[110] Further studies have either not found a similar relationship or
found one that is much weaker,[111] raising serious objections to the
hypothesis.[112] The "liberation procedure" has been criticized for resulting in serious
complications and deaths with unproven benefits.[110] As of 2013, it is thus not recommended for
the treatment of MS.[113] Additional research investigating the CCSVI hypothesis is
under way.[114]
See also
List of multiple sclerosis organizations

List of people with multiple sclerosis
References
1. Compston A, Coles A (October 2008). "Multiple sclerosis". Lancet 372 (9648):
1502–17. doi:10.1016/S0140-6736(08)61620-7. PMID 18970977.
2. Compston A, Coles A (April 2002). "Multiple sclerosis". Lancet 359 (9313):
1221–31. doi:10.1016/S0140-6736(02)08220-X. PMID 11955556.
3. Murray ED, Buttner EA, Price BH (2012).
"Depression and Psychosis in Neurological Practice". In Daroff
R, Fenichel G, Jankovic J, Mazziotta J. Bradley's neurology in
clinical practice (6th ed.). Philadelphia, PA:
Elsevier/Saunders. ISBN 1-4377-0434-4.
4. Lublin FD, Reingold SC; National
Multiple Sclerosis Society (USA) Advisory Committee on
Clinical Trials of New Agents in Multiple Sclerosis (April 1996).
"Defining the clinical course of multiple sclerosis: results of an
international survey". Neurology 46 (4): 907–11.
doi:10.1212/WNL.46.4.907. PMID 8780061.
5. Nakahara, J; Maeda, M; Aiso, S; Suzuki, N (2012
Feb). "Current concepts in multiple sclerosis: autoimmunity
versus oligodendrogliopathy". Clinical reviews in allergy &
immunology 42 (1): 26–34. doi:10.1007/s12016-011-8287-6.
PMID 22189514.
6. Ascherio A, Munger KL (April 2007).
"Environmental risk factors for multiple sclerosis. Part I: the role
of infection". Annals of Neurology 61 (4): 288–99.
doi:10.1002/ana.21117. PMID 17444504.
7. Weinshenker BG (1994). "Natural history of
multiple sclerosis". Annals of Neurology 36 (Suppl): S6–11.
doi:10.1002/ana.410360704. PMID 8017890.
8. World Health Organization
(2008). Atlas: Multiple Sclerosis Resources in the World 2008.
Geneva: World Health Organization. pp. 15–16.
ISBN 92-4-156375-3.
9. Milo R, Kahana E (March
2010). "Multiple sclerosis: geoepidemiology, genetics and the
environment". Autoimmun Rev 9 (5): A387–94.
doi:10.1016/j.autrev.2009.11.010. PMID 19932200.
10. Clanet M (June 2008). "Jean-Martin Charcot.
1825 to 1893" (PDF). Int MS J 15 (2): 59–61. PMID 18782501.
* Charcot, J. (1868). "Histologie de la sclerose en
plaques". Gazette des hopitaux, Paris 41: 554–5.
11. Kurtzke JF (1983). "Rating neurologic impairment in
multiple sclerosis: an expanded disability status scale
(EDSS)". Neurology 33 (11): 1444–52.
doi:10.1212/WNL.33.11.1444. PMID 6685237.
12. Amato MP, Ponziani G (August 1999).
"Quantification of impairment in MS: discussion of the scales in
use". Mult. Scler. 5 (4): 216–9. PMID 10467378.
13. Rudick RA, Cutter G, Reingold S (October 2002).
"The multiple sclerosis functional composite: a new clinical
outcome measure for multiple sclerosis trials". Mult. Scler. 8 (5):
359–65. PMID 12356200.
14. Tsang, BK; Macdonell, R (2011 Dec).
"Multiple sclerosis- diagnosis, management and
prognosis". Australian family physician 40 (12): 948–55.
PMID 22146321.
15. Tataru N, Vidal C, Decavel P, Berger E,
Rumbach L (2006). "Limited impact of the summer heat wave in
France (2003) on hospital admissions and relapses for multiple
sclerosis". Neuroepidemiology 27 (1): 28–32.
doi:10.1159/000094233. PMID 16804331.
16. Heesen C, Mohr DC, Huitinga I, et al. (March 2007).
"Stress regulation in multiple sclerosis: current issues and
concepts". Mult. Scler. 13 (2): 143–8.
doi:10.1177/1352458506070772. PMID 17439878.
17. Martinelli V (2000). "Trauma, stress and multiple
sclerosis". Neurol. Sci. 21 (4 Suppl 2): S849–52.
doi:10.1007/s100720070024. PMID 11205361.
18. Marrie RA (December 2004).
"Environmental risk factors in multiple sclerosis
aetiology". Lancet Neurol 3 (12): 709–18.
doi:10.1016/S1474-4422(04)00933-0. PMID 15556803.
19. Alonso A, Hernán MA (July 2008). "Temporal
trends in the incidence of multiple sclerosis: a systematic
review". Neurology 71 (2): 129–35.
doi:10.1212/01.wnl.0000316802.35974.34. PMID 18606967.
20. Pugliatti M, Sotgiu S, Rosati G (July 2002).
"The worldwide prevalence of multiple sclerosis". Clin Neurol
Neurosurg 104 (3): 182–91.
doi:10.1016/S0303-8467(02)00036-7. PMID 12127652.
21. Ascherio A, Munger KL (June 2007).
"Environmental risk factors for multiple sclerosis. Part II:
Noninfectious factors". Annals of Neurology 61 (6): 504–13.
doi:10.1002/ana.21141. PMID 17492755.
22. Ascherio A, Munger KL, Simon KC (June 2010).
"Vitamin D and multiple sclerosis". Lancet Neurol 9 (6): 599–612.
doi:10.1016/S1474-4422(10)70086-7. PMID 20494325.
23. Kulie T, Groff A, Redmer J, Hounshell J, Schrager S
(2009). "Vitamin D: an evidence-based review". J Am Board
Fam Med 22 (6): 698–706.
doi:10.3122/jabfm.2009.06.090037. PMID 19897699.
24. Dyment DA, Ebers GC, Sadovnick AD (February
2004). "Genetics of multiple sclerosis". Lancet Neurol 3 (92):
104–10. doi:10.1016/S1474-4422(03)00663-X. PMID 14747002.
25. Hassan-Smith, G; Douglas, MR (2011 Oct).
"Epidemiology and diagnosis of multiple sclerosis". British
journal of hospital medicine (London, England : 2005) 72 (10):
M146–51. PMID 22041658.
26. Rosati G (April 2001). "The prevalence of multiple
sclerosis in the world: an update". Neurol. Sci. 22 (2): 117–39.
doi:10.1007/s100720170011. PMID 11603614.
27. Baranzini SE (June 2011). "Revealing the
genetic basis of multiple sclerosis: are we there yet?". Current
Opinion in Genetics & Development 21 (3): 317–24.
doi:10.1016/j.gde.2010.12.006. PMC 3105160. PMID 21247752.
28. Kurtzke JF (October 1993). "Epidemiologic
evidence for multiple sclerosis as an infection". Clin. Microbiol.
Rev. 6 (4): 382–427.
doi:10.1128/CMR.6.4.382. PMC 358295. PMID 8269393.
29. Gilden DH (March 2005). "Infectious causes of
multiple sclerosis". The Lancet Neurology 4 (3): 195–202.
doi:10.1016/S1474-4422(05)01017-3. PMID 15721830.
30. Spitsin S, Koprowski H (2008). "Role of uric acid in
multiple sclerosis". Curr. Top. Microbiol. Immunol. 318: 325–42.
PMID 18219824.
31. Chari DM (2007). "Remyelination in multiple
sclerosis". Int. Rev. Neurobiol. 79: 589–620.
doi:10.1016/S0074-7742(07)79026-8. PMID 17531860.
32. Pittock SJ, Lucchinetti CF (March 2007). "The
pathology of MS: new insights and potential clinical
applications". Neurologist 13 (2): 45–56.
doi:10.1097/01.nrl.0000253065.31662.37. PMID 17351524.
33. Ferré JC, Shiroishi MS, Law M (November 2012).
"Advanced techniques using contrast media in
neuroimaging". Magn Reson Imaging Clin N Am 20 (4): 699–713.
doi:10.1016/j.mric.2012.07.007. PMID 23088946.
34. Trojano M, Paolicelli D (November 2001). "The
differential diagnosis of multiple sclerosis: classification and
clinical features of relapsing and progressive neurological
syndromes". Neurol. Sci. 22 (Suppl 2): S98–102.
doi:10.1007/s100720100044. PMID 11794488.
35. Poser CM, Brinar VV (June 2004). "Diagnostic
criteria for multiple sclerosis: an historical review". Clin Neurol
Neurosurg 106 (3): 147–58.
doi:10.1016/j.clineuro.2004.02.004. PMID 15177763.
36. McDonald WI, Compston A, Edan G, et
al. (July 2001). "Recommended diagnostic criteria for multiple
sclerosis: guidelines from the International Panel on the
diagnosis of multiple sclerosis". Annals of Neurology 50 (1):
121–7. doi:10.1002/ana.1032. PMID 11456302.
37. Polman CH, Reingold SC, Edan G, et al. (December
2005). "Diagnostic criteria for multiple sclerosis: 2005 revisions
to the "McDonald Criteria"". Annals of Neurology 58 (6): 840–6.
doi:10.1002/ana.20703. PMID 16283615.
38. Rashid W, Miller DH (February 2008). "Recent
advances in neuroimaging of multiple sclerosis". Semin
Neurol 28 (1): 46–55. doi:10.1055/s-2007-1019127.
PMID 18256986.
39. Link H, Huang YM (November 2006). "Oligoclonal
bands in multiple sclerosis cerebrospinal fluid: an update on
methodology and clinical usefulness". J. Neuroimmunol. 180 (1–2):
17–28. doi:10.1016/j.jneuroim.2006.07.006. PMID 16945427.
40. Gronseth GS, Ashman EJ (May 2000). "Practice
parameter: the usefulness of evoked potentials in identifying
clinically silent lesions in patients with suspected multiple
sclerosis (an evidence-based review): Report of the Quality
Standards Subcommittee of the American Academy of
Neurology". Neurology 54 (9): 1720–5.
doi:10.1212/WNL.54.9.1720. PMID 10802774.
41. Pittock SJ, Rodriguez M (2008). "Benign multiple
sclerosis: a distinct clinical entity with therapeutic
implications". Curr. Top. Microbiol. Immunol. 318: 1–17.
doi:10.1007/978-3-540-73677-6_1. PMID 18219812.
42. Feinstein, A (2007). The clinical neuropsychiatry of
multiple sclerosis (2nd ed.). Cambridge: Cambridge
University Press. p. 20. ISBN 052185234X.
43. Miller D, Barkhof F, Montalban X, Thompson A,
Filippi M (May 2005). "Clinically isolated syndromes suggestive
of multiple sclerosis, part I: natural history, pathogenesis,
diagnosis, and prognosis". Lancet Neurol 4 (5): 281–8.
doi:10.1016/S1474-4422(05)70071-5. PMID 15847841.
44. Rovaris M, Confavreux C, Furlan R, Kappos L, Comi
G, Filippi M (April 2006). "Secondary progressive multiple
sclerosis: current knowledge and future challenges". Lancet
Neurol 5 (4): 343–54. doi:10.1016/S1474-4422(06)70410-0.
PMID 16545751.
45. Miller DH, Leary SM (October 2007).
"Primary-progressive multiple sclerosis". Lancet Neurol 6 (10):
903–12. doi:10.1016/S1474-4422(07)70243-0. PMID 17884680.
46. Stadelmann C, Brück W (November 2004).
"Lessons from the neuropathology of atypical forms of multiple
sclerosis". Neurol. Sci. 25 (Suppl 4): S319–22.
doi:10.1007/s10072-004-0333-1. PMID 15727225.
47. Burton, JM (2012 Dec 12). "Oral versus intravenous
steroids for treatment of relapses in multiple
sclerosis". Cochrane Database of Systematic
Reviews (12): CD006921 (Orig. rev.).
doi:10.1002/14651858.CD006921. PMID 23235634.
48. Multiple sclerosis : national clinical guideline for
diagnosis and management in primary and secondary
care (pdf). London: Royal College of Physicians. 2004. pp. 54–57.
ISBN 1-86016-182-0. PMID 21290636. Retrieved 6 February
2013.
49. He, D; Xu, Z; Dong, S; Zhang, H; Zhou, H;
Wang, L; Zhang, S (2012 Dec 12). "Teriflunomide for multiple
sclerosis". In Zhou, Hongyu. Cochrane database of systematic
reviews (Online) 12: CD009882.
doi:10.1002/14651858.CD009882.pub2. PMID 23235682.
50. "FDA approves new multiple sclerosis treatment
Aubagio" (Press release). US FDA. 2012-09-12. Retrieved
2013-01-21.
51. "Biogen Idec's TECFIDERA (Dimethyl
Fumarate) Approved in US as a First-Line Oral Treatment for
Multiple Sclerosis" (Press release). Biogen Idec. 2013-03-27.
Retrieved 2013-06-04.
52. Manouchehrinia, A; Constantinescu, CS (2012 Oct).
"Cost-effectiveness of disease-modifying therapies in multiple
sclerosis". Current neurology and neuroscience reports 12 (5):
592–600. doi:10.1007/s11910-012-0291-6. PMID 22782520.
53. Hassan-Smith, G; Douglas, MR (2011 Nov).
"Management and prognosis of multiple sclerosis". British
journal of hospital medicine (London, England : 2005) 72 (11):
M174–6. PMID 22082979.
54. Freedman MS (January 2011). "Long-term follow-up
of clinical trials of multiple sclerosis therapies". Neurology 76 (1
Suppl 1): S26–34.
doi:10.1212/WNL.0b013e318205051d. PMID 21205679.
55. Qizilbash N, Mendez I, Sanchez-de la Rosa R
(January 2012). "Benefit-risk analysis of glatiramer acetate for
relapsing-remitting and clinically isolated syndrome multiple
sclerosis". Clin Ther 34 (1): 159–176.e5.
doi:10.1016/j.clinthera.2011.12.006. PMID 22284996.
56. Bates D (January 2011). "Treatment effects of
immunomodulatory therapies at different stages of multiple
sclerosis in short-term trials". Neurology 76 (1 Suppl 1): S14–25.
doi:10.1212/WNL.0b013e3182050388. PMID 21205678.
57. Johnston J, So TY (June 2012). "First-line
disease-modifying therapies in paediatric multiple sclerosis: a
comprehensive overview". Drugs 72 (9): 1195–211.
doi:10.2165/11634010-000000000-00000. PMID 22642799.
58. Killestein J, Rudick RA, Polman CH (November
2011). "Oral treatment for multiple sclerosis". Lancet
Neurol 10 (11): 1026–34. doi:10.1016/S1474-4422(11)70228-9.
PMID 22014437.
59. Kellerman, Rick D.; Edward N. Hanley Jr MD
(2011). Conn's Current Therapy 2012: Expert Consult - Online
and Print. Philadelphia: Saunders. p. 627. ISBN 1-4557-0738-4.
60. Martinelli Boneschi, F; Vacchi, L; Rovaris, M;
Capra, R; Comi, G (2013 May 31). "Mitoxantrone for multiple
sclerosis". Cochrane database of systematic reviews
(Online) 5: CD002127.
doi:10.1002/14651858.CD002127.pub3. PMID 23728638.
61. Marriott, JJ; Miyasaki, JM; Gronseth, G; O'Connor,
PW; Therapeutics and Technology Assessment Subcommittee
of the American Academy of, Neurology (2010 May 4).
"Evidence Report: The efficacy and safety of mitoxantrone
(Novantrone) in the treatment of multiple sclerosis: Report of
the Therapeutics and Technology Assessment Subcommittee
of the American Academy of Neurology". Neurology 74 (18):
1463–70. doi:10.1212/WNL.0b013e3181dc1ae0. PMC 2871006.
PMID 20439849.
62. Balak, DM; Hengstman, GJ; Çakmak, A; Thio,
HB (2012 Dec). "Cutaneous adverse events associated with
disease-modifying treatment in multiple sclerosis: a systematic
review". Multiple sclerosis (Houndmills, Basingstoke,
England) 18 (12): 1705–17.
doi:10.1177/1352458512438239. PMID 22371220.
63. Sládková T, Kostolanský F (2006). "The role of
cytokines in the immune response to influenza A virus
infection". Acta Virol. 50 (3): 151–62. PMID 17131933.
64. Munari L, Lovati R, Boiko A (2004). "Therapy with
glatiramer acetate for multiple sclerosis". In Munari, Luca
M. Cochrane database of systematic reviews (Online) (1):
CD004678. doi:10.1002/14651858.CD004678. PMID 14974077.
65. Tremlett H, Oger J (November 2004). "Hepatic
injury, liver monitoring and the beta-interferons for multiple
sclerosis". J. Neurol. 251 (11): 1297–303.
doi:10.1007/s00415-004-0619-5. PMID 15592724.
66. Comi G (October 2009). "Treatment of multiple
sclerosis: role of natalizumab". Neurol. Sci. 30. Suppl 2 (S2):
S155–8. doi:10.1007/s10072-009-0147-2. PMID 19882365.
67. Hunt, D; Giovannoni, G (2012 Feb).
"Natalizumab-associated progressive multifocal leucoencephalopathy: a
practical approach to risk profiling and monitoring". Practical
neurology 12 (1): 25–35. doi:10.1136/practneurol-2011-000092.
PMID 22258169.
68. Killestein J, Rudick RA, Polman CH
(November 2011). "Oral treatment for multiple sclerosis". Lancet
Neurol 10 (11): 1026–34. doi:10.1016/S1474-4422(11)70228-9.
PMID 22014437.
69. "NDA 204063 - FDA Approved Labeling Text". US
Food and Drug Agency. 27 March 2013. Retrieved 5 April 2013.
"NDA Approval". US Food and Drug Agency. 27 March 2013.
Retrieved 5 April 2013.
70. Saidha S, Eckstein C, Calabresi PA (January
2012). "New and emerging disease modifying therapies for
multiple sclerosis". Annals of the New York Academy of
Sciences 1247: 117–37. doi:10.1111/j.1749-6632.2011.06272.x.
PMID 22224673.
71. Kesselring J, Beer S (October 2005). "Symptomatic
therapy and neurorehabilitation in multiple sclerosis". Lancet
Neurol 4 (10): 643–52. doi:10.1016/S1474-4422(05)70193-9.
PMID 16168933.
72. Khan F, Turner-Stokes L, Ng L, Kilpatrick T (2007).
"Multidisciplinary rehabilitation for adults with multiple sclerosis".
In Khan, Fary. Cochrane Database Syst Rev (2):
CD006036. doi:10.1002/14651858.CD006036.pub2.
PMID 17443610.
73. Steultjens EM, Dekker J, Bouter LM, Leemrijse
CJ, van den Ende CH (2005). "Evidence of the efficacy of
occupational therapy in different conditions: an overview of
systematic reviews". Clinical rehabilitation 19 (3): 247–54.
doi:10.1191/0269215505cr870oa. PMID 15859525.
74. Steultjens EM, Dekker J, Bouter LM, Cardol M, Van
de Nes JC, Van den Ende CH (2003). "Occupational therapy for
multiple sclerosis". In Steultjens, Esther EMJ. Cochrane
database of systematic reviews (Online) (3):
CD003608. doi:10.1002/14651858.CD003608. PMID 12917976.
75. Jump up^ Gallien P, Nicolas B, Robineau S, Ptrilli S,
Houedakor J, Durufle A (2007). "Physical training and multiple
sclerosis".Ann Readapt Med Phys 50 (6): 3736, 369
72.doi:10.1016/j.annrmp.2007.04.004. PMID 17482708.
76. Jump up^ Rietberg MB, Brooks D, Uitdehaag BMJ, Kwakkel G
(2005). "Exercise therapy for multiple sclerosis". In Kwakkel,
Gert.Cochrane Database of Systematic Reviews (1):
CD003980.doi:10.1002/14651858.CD003980.pub2.PMID 1567
4920.
77. Jump up^ Thomas PW, Thomas S, Hillier C, Galvin K, Baker R
(2006). "Psychological interventions for multiple sclerosis". In
Thomas, Peter W. Cochrane Database of Systematic
Reviews (1):
CD004431.doi:10.1002/14651858.CD004431.pub2.PMID 1643
7487.
78. ^ Jump up to:a b c d e f g Huntley A (January 2006). "A review of
the evidence for efficacy of complementary and alternative
medicines in MS". Int MS J 13 (1): 512, 4.PMID 16420779.
79. Jump up^ Olsen SA (2009). "A review of complementary and
alternative medicine (CAM) by people with multiple
sclerosis". Occup Ther Int 16 (1): 57
70.doi:10.1002/oti.266. PMID 19222053.
80. Jump up^ Jagannath, VA; Fedorowicz, Z; Asokan, GV; Robak,
EW; Whamond, L (2010 Dec 8). "Vitamin D for the
management of multiple sclerosis.". Cochrane database of
systematic reviews (Online) (12):
CD008422.doi:10.1002/14651858.CD008422.pub2.PMID 2115
4396.
81. Jump up^ Farinotti M, Simi S, Di Pietrantonj C, et al. (2007).
"Dietary interventions for multiple sclerosis". In Farinotti,
Mariangela.Cochrane database of systematic reviews
(Online) (1):
CD004192. doi:10.1002/14651858.CD004192.pub2.PMID 1725
3500.
82. Jump up^ Grigorian A, Araujo L, Naidu NN, Place DJ,
Choudhury B, Demetriou M. (September 2011). "N-
acetylglucosamine inhibits T-helper 1 (Th1)/T-helper 17 (Th17)
cell responses and treats experimental autoimmune
encephalomyelitis.". J Biol Chem 286 (46): 40133
41.doi:10.1074/jbc.M111.277814. PMID 21965673.
83. Jump up^ Chong MS, Wolff K, Wise K, Tanton C, Winstock A,
Silber E (2006). "Cannabis use in patients with multiple
sclerosis".Mult. Scler. 12 (5): 646
51.doi:10.1177/1352458506070947. PMID 17086912.
84. Jump up^ Bennett M, Heard R (2004). "Hyperbaric oxygen
therapy for multiple sclerosis". In Bennett, Michael H. Cochrane
database of systematic reviews (Online) (1):
CD003057.doi:10.1002/14651858.CD003057.pub2.PMID 1497
4004.
85. Jump up^ Adams, Tim (23 May 2010). "Gut instinct: the
miracle of the parasitic hookworm". The Observer.
86. ^ Jump up to:a b c Phadke JG (May 1987). "Survival pattern and
cause of death in patients with multiple sclerosis: results from
an epidemiological survey in north east Scotland". J. Neurol.
Neurosurg. Psychiatr. 50 (5): 523
31.doi:10.1136/jnnp.50.5.523. PMC 1031962.PMID 3495637.
87. Jump up^ Myhr KM, Riise T, Vedeler C, et al (February 2001).
"Disability and prognosis in multiple sclerosis: demographic and
clinical variables important for the ability to walk and awarding
of disability pension". Mult. Scler. 7 (1): 59
65. PMID 11321195.
88. Jump up^ Lozano, R (2012 Dec 15). "Global and regional
mortality from 235 causes of death for 20 age groups in 1990
and 2010: a systematic analysis for the Global Burden of
Disease Study 2010.". Lancet 380 (9859): 2095
128.doi:10.1016/S0140-6736(12)61728-0. PMID 23245604.
89. ^ Jump up to:a b Compston A (October 1988). "The 150th
anniversary of the first depiction of the lesions of multiple
sclerosis". J. Neurol. Neurosurg. Psychiatr. 51 (10): 1249
52.doi:10.1136/jnnp.51.10.1249. PMC 1032909.PMID 3066846.
90. Jump up^ Lassmann H (October 1999). "The pathology of
multiple sclerosis and its evolution". Philosophical Transactions
of the Royal Society B 354 (1390): 1635
40.doi:10.1098/rstb.1999.0508. PMC 1692680.PMID 10603616.
91. Jump up^ Lassmann H (July 2005). "Multiple sclerosis
pathology: evolution of pathogenetic concepts". Brain
Pathology 15 (3): 21722. doi:10.1111/j.1750-
3639.2005.tb00523.x.PMID 16196388.
92. Jump up^ Medaer R (September 1979). "Does the history of
multiple sclerosis go back as far as the 14th century?". Acta
Neurol. Scand. 60 (3): 18992. doi:10.1111/j.1600-
0447.1979.tb08970.x. PMID 390966.
93. Jump up^ Holmy T (2006). "A Norse contribution to the
history of neurological diseases". Eur. Neurol. 55 (1): 57
8.doi:10.1159/000091431. PMID 16479124.
94. Jump up^ Firth, D (1948). The Case of August D`Est.
Cambridge: Cambridge University Press.
95. ^ Jump up to:a b Pearce JM (2005). "Historical descriptions of
multiple sclerosis". Eur. Neurol. 54 (1): 49
53.doi:10.1159/000087387. PMID 16103678.
96. Jump up^ Barbellion, Wilhelm Nero Pilate (1919). The Journal
of a Disappointed Man. New York: George H. Doran. ISBN 0-
7012-1906-8.
97. Jump up^ Cohen JA (July 2009). "Emerging therapies for
relapsing multiple sclerosis". Arch. Neurol. 66 (7): 821
8.doi:10.1001/archneurol.2009.104. PMID 19597083.
98. ^ Jump up to:a b c d e f g Miller AE (2011). "Multiple sclerosis:
where will we be in 2020?". Mt. Sinai J. Med. 78 (2): 268
79.doi:10.1002/msj.20242. PMID 21425270.
99. Jump up^ Jeffrey, susan (09 Aug 2012). "CONCERTO: A Third
Phase 3 Trial for Laquinimod in MS". Medscape Medical News.
Retrieved 21 May 2013.
100. Jump up^ Kieseier BC, Calabresi PA (March 2012).
"PEGylation of interferon--1a: a promising strategy in multiple
sclerosis".CNS Drugs 26 (3): 20514. doi:10.2165/11596970-
000000000-00000. PMID 22201341.
101. ^ Jump up to:a b "Biogen Idec Announces Positive Top-Line
Results from Phase 3 Study of Peginterferon Beta-1a in
Multiple Sclerosis" (Press release). Biogen Idec. 2013-01-24.
Retrieved 2013-05-21.
102. ^ Jump up to:a b c d Milo R, Panitch H (February 2011).
"Combination therapy in multiple sclerosis". J.
Neuroimmunol. 231 (1-2): 23
31. doi:10.1016/j.jneuroim.2010.10.021.PMID 21111490.
103. Jump up^ Luessi F, Siffrin V, Zipp F (September
2012)."Neurodegeneration in multiple sclerosis: novel treatment
strategies". Expert Rev Neurother 12 (9): 106176; quiz
1077. doi:10.1586/ern.12.59. PMID 23039386.
104. Jump up^ Mehta V, Pei W, Yang G, et al. (2013). "Iron is a
sensitive biomarker for inflammation in multiple sclerosis
lesions".PLoS ONE 8 (3):
e57573.doi:10.1371/journal.pone.0057573. PMC 3597727.PMI
D 23516409.
105. ^ Jump up to:a b c d Harris VK, Sadiq SA (2009). "Disease
biomarkers in multiple sclerosis: potential for use in therapeutic
decision making". Mol Diagn Ther 13 (4): 225
44.doi:10.2165/11313470-000000000-00000.PMID 19712003.
106. ^ Jump up to:a b c d Filippi M, Rocca MA, De Stefano N, et
al. (December 2011). "Magnetic resonance techniques in
multiple sclerosis: the present and the future". Arch.
Neurol. 68 (12): 1514
20. doi:10.1001/archneurol.2011.914.PMID 22159052.
107. Jump up^ Kiferle L, Politis M, Muraro PA, Piccini P (February
2011). "Positron emission tomography imaging in multiple
sclerosis-current status and future applications". Eur. J.
Neurol. 18 (2): 22631. doi:10.1111/j.1468-
1331.2010.03154.x. PMID 20636368.
108. Jump up^ Zamboni P, Galeotti R, Menegatti E, et al. (April
2009)."Chronic cerebrospinal venous insufficiency in patients
with multiple sclerosis". J. Neurol. Neurosurg. Psychiatr. 80(4):
392
9. doi:10.1136/jnnp.2008.157164.PMC 2647682. PMID 190600
24.
109. Jump up^ Pullman D, Zarzeczny A, Picard A (2013). "Media,
politics and science policy: MS and evidence from the CCSVI
Trenches". BMC Med Ethics 14: 6. doi:10.1186/1472-6939-14-
6. PMC 3575396. PMID 23402260.
110. ^ Jump up to:a b Qiu J (May 2010). "Venous abnormalities
and multiple sclerosis: another breakthrough claim?". Lancet
Neurol 9(5): 4645. doi:10.1016/S1474-4422(10)70098-
3.PMID 20398855.
111. Jump up^ Ghezzi A, Comi G, Federico A (February 2011).
"Chronic cerebro-spinal venous insufficiency (CCSVI) and
multiple sclerosis". Neurol. Sci. 32 (1): 17
21. doi:10.1007/s10072-010-0458-3. PMID 21161309.
112. Jump up^ Dorne H, Zaidat OO, Fiorella D, Hirsch J,
Prestigiacomo C, Albuquerque F, Tarr RW. (October 2010).
"Chronic cerebrospinal venous insufficiency and the doubtful
promise of an endovascular treatment for multiple sclerosis". J
NeuroIntervent Surg 2 (4): 309
311.doi:10.1136/jnis.2010.003947. PMID 21990639.
113. Jump up^ Baracchini C, Atzori M, Gallo P (March 2013).
"CCSVI and MS: no meaning, no fact". Neurol. Sci. 34 (3): 269
79.doi:10.1007/s10072-012-1101-2. PMID 22569567.
114. Jump up^ van Zuuren, EJ; Fedorowicz, Z; Pucci, E;
Jagannath, VA; Robak, EW (2012 Dec 12). "Percutaneous
transluminal angioplasty for treatment of chronic cerebrospinal
venous insufficiency (CCSVI) in multiple sclerosis
patients.".Cochrane database of systematic reviews
(Online) 12:
CD009903. doi:10.1002/14651858.CD009903.pub2.PMID 2323
5683.
Further reading
Langgartner M, Langgartner I, Drlicek M (April 2005). "The

patient's journey: multiple sclerosis". BMJ 330 (7496): 885
8. doi:10.1136/bmj.330.7496.885. PMC 556161.PMID 15831874.
This page was last modified on 13 November 2013 at 05:14.

Text is available under the Creative Commons Attribution-ShareAlike
License; additional terms may apply. By using this site, you agree to
the Terms of Use and Privacy Policy.
Wikipedia is a registered trademark of the Wikimedia Foundation, Inc., a
non-profit organization.
Iterative reconstruction refers to iterative algorithms used to reconstruct 2D and 3D
images in certain imaging techniques. For example, in computed tomography an image
must be reconstructed from projections of an object. Here, iterative reconstruction
techniques are a better, but computationally more expensive, alternative to the
common filtered back projection (FBP) method, which directly calculates the image in a
single reconstruction step.[1]
Basic concepts
The reconstruction of an image from the acquired data is an inverse problem. Often, it is not
possible to solve the inverse problem exactly in a single direct step. In this case, a direct
algorithm has to approximate the solution, which might cause visible reconstruction artifacts in
the image. Iterative algorithms approach the correct solution over multiple iteration steps, which
yields a better reconstruction at the cost of a longer computation time.
In computed tomography, this approach was the one first used by Hounsfield. There is a
large variety of algorithms, but each starts with an assumed image, computes projections
from the image, compares them with the original projection data, and updates the image based on
the difference between the calculated and the actual projections.
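The loop described above (assume an image, compute its projections, compare them with the measured projections, update) can be sketched in a few lines. The following is a minimal illustration only, assuming a linear projector represented by a small random matrix A and using the Landweber update as one simple choice of update rule; the names landweber, A, y, and x_true are hypothetical and do not come from the article, and a real CT projector would replace the random matrix.

```python
import numpy as np

def landweber(A, y, n_iter=500):
    """Landweber iteration: x <- x + step * A^T (y - A x)."""
    # Start from an assumed image (here: all zeros).
    x = np.zeros(A.shape[1])
    # A step size below 2 / ||A||_2^2 keeps the iteration stable.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(n_iter):
        projections = A @ x                # compute projections from the current image
        difference = y - projections       # compare with the measured projections
        x = x + step * (A.T @ difference)  # update the image from the difference
    return x

# Toy problem: a random matrix stands in for the projector (assumption).
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 16))
x_true = rng.random(16)
y = A @ x_true                             # noise-free "measured" projections

x_rec = landweber(A, y)
mismatch_start = np.linalg.norm(y)              # mismatch of the all-zero starting image
mismatch_end = np.linalg.norm(y - A @ x_rec)    # mismatch after iterating
```

With this step size the mismatch between calculated and measured projections does not increase from one iteration to the next, which is the behaviour the description relies on.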
There are typically five components to iterative image reconstruction algorithms [2]:
1. An object model that expresses the unknown continuous-space function that is
to be reconstructed in terms of a finite series with unknown coefficients that must be
estimated from the data.
2. A system model that relates the unknown object to the "ideal" measurements that
would be recorded in the absence of measurement noise. Often this is a linear
model.
3. A statistical model that describes how the noisy measurements vary around their
ideal values. Often Gaussian noise or Poisson statistics are assumed.
4. A cost function that is to be minimized to estimate the image coefficient vector. Often
this cost function includes some form of regularization. Sometimes the regularization
is based on Markov random fields.
5. An algorithm, usually iterative, for minimizing the cost function, including some initial
estimate of the image and some stopping criterion for terminating the iterations.
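The five components can be made concrete with a small sketch. Everything specific here is an assumption chosen for illustration, not the method of reference [2]: the object model is a plain coefficient vector x, the system model is a linear map y = A x with a random matrix A, the statistical model is Gaussian noise, the cost function is least squares plus a simple quadratic (Tikhonov) penalty with weight beta in place of a Markov random field prior, and the algorithm is gradient descent with a zero initial estimate and a step-size-based stopping criterion.

```python
import numpy as np

def cost(A, y, x, beta):
    """Least-squares data term (Gaussian noise model) plus quadratic penalty."""
    r = A @ x - y
    return 0.5 * (r @ r) + 0.5 * beta * (x @ x)

def reconstruct(A, y, beta=0.1, tol=1e-8, max_iter=10000):
    """Minimize the cost by gradient descent with a fixed step 1/L."""
    x = np.zeros(A.shape[1])                   # initial estimate of the image
    L = np.linalg.norm(A, 2) ** 2 + beta       # Lipschitz constant of the gradient
    for it in range(1, max_iter + 1):
        grad = A.T @ (A @ x - y) + beta * x    # gradient of the cost function
        x_new = x - grad / L
        # Stopping criterion: relative change between iterates is tiny.
        if np.linalg.norm(x_new - x) <= tol * max(1.0, np.linalg.norm(x)):
            return x_new, it
        x = x_new
    return x, max_iter

rng = np.random.default_rng(1)
A = rng.standard_normal((60, 20))                 # system model (stand-in matrix)
x_true = rng.random(20)                           # coefficients of the object model
y = A @ x_true + 0.01 * rng.standard_normal(60)   # Gaussian measurement noise

beta = 0.1
x_hat, n_steps = reconstruct(A, y, beta=beta)
# For this quadratic cost a closed-form minimizer exists and serves as a check.
x_closed = np.linalg.solve(A.T @ A + beta * np.eye(20), A.T @ y)
```

A quadratic penalty is used only because it keeps the example short and gives a closed-form solution to compare against; edge-preserving penalties such as total variation need a different minimization algorithm.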
Advantages
The advantages of the iterative approach include reduced sensitivity to noise and the
ability to reconstruct an optimal image in the case of incomplete data. The method
has been applied in emission tomography modalities like SPECT and PET, where there is
significant attenuation along ray paths and noise statistics are relatively poor.
As another example, it is considered superior when one does not have a large set of
projections available, when the projections are not distributed uniformly in angle, or when
the projections are sparse or missing at certain orientations. These scenarios may occur
in intraoperative CT, in cardiac CT, or when metal artifacts[3] require the exclusion of some
portions of the projection data.
In magnetic resonance imaging (MRI), iterative reconstruction can be used to reconstruct images
from data acquired with multiple receive coils and with sampling patterns other than the
conventional Cartesian grid[4]. It also allows the use of improved regularization techniques
(e.g. total variation)[5] or an extended modeling of physical processes[6] to improve the
reconstruction. For example, with iterative algorithms it is possible to reconstruct images from
data acquired in a very short time, as required for real-time MRI.[7]
An example that illustrates the benefits of iterative image reconstruction for cardiac
MRI is given in [8].
Figure: A single frame from a real-time MRI movie of a human heart. a) direct reconstruction; b) iterative
(nonlinear inverse) reconstruction.[7]
See also
Tomographic reconstruction
Tomogram
Computed tomography
Magnetic resonance imaging
Inverse problem
OSEM
Deconvolution
References
1. Herman, G. T. Fundamentals of Computerized Tomography: Image Reconstruction from Projections, 2nd edition. Springer, 2009.
2. Fessler, J. A. "Penalized weighted least-squares image reconstruction for positron emission tomography." IEEE Trans. on Medical Imaging, 13(2): 290-300, June 1994.
3. Boas, F. E. and Fleischmann, D. "Evaluation of two iterative techniques for reducing metal artifacts in computed tomography." Radiology, 2011. doi:10.1148/radiol.11101782
4. Pruessmann, K. P., Weiger, M., Börnert, P. and Boesiger, P. (2001). "Advances in sensitivity encoding with arbitrary k-space trajectories." Magnetic Resonance in Medicine, 46: 638-651. doi:10.1002/mrm.1241
5. Block, K. T., Uecker, M. and Frahm, J. (2007). "Undersampled radial MRI with multiple coils. Iterative image reconstruction using a total variation constraint." Magnetic Resonance in Medicine, 57: 1086-1098. doi:10.1002/mrm.21236
6. Fessler, J. (2010). "Model-based image reconstruction for MRI." IEEE Signal Processing Magazine, 27: 81-89.
7. Uecker, M., Zhang, S., Voit, D., Karaus, A., Merboldt, K. D. and Frahm, J. (2010). "Real-time MRI at a resolution of 20 ms." NMR Biomed, 23: 986-994. doi:10.1002/nbm.1585
8. Uyanik, I., Lindner, P., Shah, D., Tsekos, N. and Pavlidis, I. (2013). "Applying a level set method for resolving physiologic motions in free-breathing and non-gated cardiac MRI." FIMH, 2013.
