2 views

Uploaded by chinnikama5342

- A study on the speech acoustic-to-articulatory mapping using morphological constraints
- Art16-3_3.pdf
- A Review on Time Series Data Mining
- A System Identification Approach for Video-Based Face Recognition
- SME Credit Risk Analysis Using Bank Lending Data: An Analysis of Thai SMEs
- Kuiper Ch10
- bio aging
- Interacción genotipo por ambiente para el daño causado por taladradores (Diatraea spp.) en caña de azúcar
- Face Detection using ANN.pdf
- 3 - 2 - Getting Help (12_24)
- IMECS2008_pp462-467
- Glynn - A Logistic Regression Model for the Enhancement of Student Retention
- en_Tanagra_hac_pca.pdf
- Enzymes SBB 2006
- A Novel Approach for Clustering Categorical Time Series Using Dissimilarity Based Measure
- Gharamani pca
- AMMI and GGE by Gauch
- cabai
- Iris Dataset Clustering and Spam Email Separation
- Sr 3

You are on page 1of 10

Version 1.0 Date 2010-23-02 Title Clustering through Correlation Clustering Model (CCM) and cluster analysis tools. Author Mathieu Vrac <mathieu.vrac@lsce.ipsl.fr> Maintainer Mathieu Vrac <mathieu.vrac@lsce.ipsl.fr> Depends R (>= 1.8.0), mclust, class, tree,mvtnorm Description This package proposes a clustering method called Correlation Clustering Model (CCM) based on mixture of canonical correlation analysis (CCA). It also provides some tools for cluster analysis. License GPL (>= 2) Repository CRAN Date/Publication 2010-02-26 14:28:39

R topics documented:

CCM . . . . . . . . . . . . . . . . . . CWGLI . . . . . . . . . . . . . . . . . DI . . . . . . . . . . . . . . . . . . . . Info.Criterion . . . . . . . . . . . . . . learn.and.project.clusters . . . . . . . . Percent.bad.and.false.classif.per.cluster WGP . . . . . . . . . . . . . . . . . . Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 3 4 5 6 7 8 10

CCM

CCM

Description This function performs a simultaneous clustering of two matched datasets (e.g., daily local- and large-scale atmospheric data), such that each cluster maximises the correlation between the two datasets. This clustering is based on a mixture of canonical correlation analyses (CCAs). Usage CCM(Nc, NS, DataA.tbc,DataS.tbc,NN, DataStation, init="block", ITmax=15, rq=0) Arguments Nc NS DataA.tbc Number of clusters required. Number of locations (i.e., weather stations) for the local-scale time series. Dataset corresponding to the large-scale data to be clustered. This is a matrix M*NN, where M corresponds to the number of large-scale locations (e.g., GCM or RCM grid cells), and NN to the length of the time series (e.g., number of days). Note that this matrix is used in CCM without any transformation. For example, if a principal component analysis (PCA) has to be performed, this must be done BEFORE entering CCM. Dataset corresponding to the local-scale (station) data to be clustered. This is a matrix NS*NN. Similarly as for DataA.tbc, note that this matrix is used in CCM without any transformation, and that if a PCA has to be performed, this must be done BEFORE entering CCM. Length of the time series (e.g, number od days for daily time series). Local-scale (stations) dataset on which the information criterion will be calculated. It usually is the same as DataS.tbc but can be different according to the application (e.g., DataS.tbc is the result of a PCA) or to the goal to reach. Initializing method for the clusters. Six methods are available: - "block": Blocks initialization (the default) - "12345": Each day is alternatively allocated to a cluster (For example, if 3 clusters required, day1 goes to C1, day2 to C2, day3 to C3, day4 to C1, day5 to C2, etc.) - "Kmeans": Initialization by the k-means algorithm - "Mixtn": Same a "12345" but for length 12 (instead of length 1) - "EMw": Initialization by the EM clustering algorithm applied onto the w (i.e., large-scale) canonical variates resulting from a CCA performed between DataA.tbc and DataS.tbc - "EMvw": Same as "EMw" but EM is applied onto both v (local) and w (largescale) canonical variates. Maximum number of iterations (default is 15) is the algorithm di not converge.

DataS.tbc

NN DataStation

init

ITmax

CWGLI rq

3 Value (of the local-scale variable of interest) on which the information criterion (IC) is calculated (default is rq=0). rq can be the 90th percentile of the data. In that case, CCM will try to nd clusters for which the extremes are well discriminated. A high IC means a good discrimination (in terms of local-scale variable) between the clusters.

Details For details about the CCM method, see the reference below. M. Vrac, P. Yiou. "Weather regimes designed for local precipitation modelling: Application to the Mediterranean basin". JGR-Atmospheres, doi:10.1029/2009JD012871, 2010 Author(s) M. Vrac (mathieu.vrac@lsce.ipsl.fr)) See Also Info.Criterion Examples

## Example

CWGLI

Description This function calculates the Conditional Weighted Global Log-Intensity (CWGLI) cost function for precipitation. This error is supposed to be due to a misclassication (i.e., the elements should be classied according to sequence cl.check, while they are in practice allocated to the clusters according to the sequence cl.proj). Usage CWGLI(cl.check, cl.proj, DataS.check) Arguments cl.check cl.proj DataS.check Reference sequence (i.e., numerical vector) of clusters (i.e., this is the sequence we should have). Sequence of clusters in practice. Precipitation data time series

4 Details For details about this cost function, see the reference below.

DI

M. Vrac, P. Yiou. "Weather regimes designed for local precipitation modelling: Application to the Mediterranean bassin". Submitted, JGR-A, 2009 Value Returns a numerical vector corresponding to the CWGLI cost function Author(s) M. Vrac (mathieu.vrac@lsce.ipsl.fr)) See Also DI, WGP Examples

## Example

DI

Description This function calculates the Daily Intensity (DI) cost function for precipitation. This error is supposed to be due to a misclassication (i.e., the elements should be classied according to sequence cl.check, while they are in practice allocated to the clusters according to the sequence cl.proj). Usage DI(cl.calibration, cl.check, cl.proj, DataS.Calibration, DataS.check) Arguments cl.calibration Sequence of clusters for time period 1 (TP1) cl.check Reference sequence (i.e., numerical vector) of clusters (i.e., this is the sequence we should have) for time period 2 (TP2).

cl.proj Sequence of clusters in practice for TP2. DataS.Calibration Precipitation data time series for TP1. DataS.check Precipitation data time series for TP2.

Info.Criterion Details For details about these cost functions, see the reference below.

M. Vrac, P. Yiou. "Weather regimes designed for local precipitation modelling: Application to the Mediterranean bassin". Submitted, JGR-A, 2009 Value Returns a list with arguments: DI MDI RMDI Daily Intensity (DI) error costs, depending on station s and day d. Means DI costs, depending on station s. Regional MDI providing a global (spatial and temporal) view of the DI error cost.

## Example

Info.Criterion

Computes the Information Criterion (IC) of a clustering result, for values higher than r.

Description This function computes the Information Criterion (IC) of a clustering result. Usage Info.Criterion(NS, DataS, r, totCL, Nc, cl) Arguments NS DataS Number of locations (i.e., weather stations) for the local-scale time series on which IC is calculated. Dataset corresponding to local-scale (station) data on which IC is calculated. This is a matrix NS*NN, where NN is the number of days (i.e., length of the time series). Value for which the IC is calculated (see details).

6 totCL Nc cl Details The IC is computed as $IC = Sum_i=1^K |n_i,r - (p_r * n_i)|$, where $n_i,r$ = \# of days in cluster $i$ that receive a rainfall amount > r $p_r$ = proba of such rainy days in the whole population $n_i$ = \# of days in cluster i Author(s) M. Vrac (mathieu.vrac@lsce.ipsl.fr)) See Also CCM Examples

## Example

learn.and.project.clusters Vector of numbers of elements (e.g., days) in each cluster. Number of clusters. Vector containing the sequence of clusters (length(cl) is NN).

Description This function (1) learns how to attribute days to clusters based on the sequence of predictors and associated sequence of clusters. Usage

learn.and.project.clusters(DataCalibration, DataToBeProjected, cl.calibration, allo Arguments DataCalibration Values of the predictor variable for the calibration set (can be a matrix). DataToBeProjected Values of the predictor variable for the projection set. cl.calibration Numerical vector corresponding to the sequence of clusters (i.e., calibration set).

Percent.bad.and.false.classif.per.cluster allocmet

Name of the attribution method.(The 12 possibilities are: "Euclid.dist.A", "Euclid.dist.w1", "Euclid.dist.w2", "CART.A", "CART.w", "CART.A.and.w", "knnA", "knnA10", "Gaussian.A", "Gaussian.w", "MM", "MMw")

DataS.Calibration Values of other predictor variables for the calibration set. This is sometimes needed, according to the attribution method (allocmet) to be used (needed for "Euclid.dist.w1", "Euclid.dist.w2", "CART.w", "CART.A.and.w", "Gaussian.w", "MMw"). Value Returns a list with two objects: cl tot Author(s) M. Vrac (mathieu.vrac@lsce.ipsl.fr)) Examples

## Example

The sequence of clusters dened from the predictors for the projection set. Number of elements per cluster for projection set.

Description This function computes the percentage of bad and false classication of a sequence of clusters (new.cl) according to a reference sequence (cl). Usage Percent.bad.and.false.classif.per.cluster(cl, new.cl) Arguments cl new.cl Reference sequence of clusters. Sequence of clusters to be compared to the reference sequence.

8 Value Returns a list containing the following elements: tot Global percentage of bad classication BadPerCluster Percentage of bad classication per cluster FalsePerCluster Percentage of false classication per cluster mat.att Global matrix of attribution (row = cl, colomn = new.cl) Author(s) M. Vrac (mathieu.vrac@lsce.ipsl.fr)) See Also learn.and.project.clusters Examples

## Example

WGP

WGP

Description This function calculates the Weighted Global Probability (WGP) cost function for precipitation occurrence. This error is supposed to be due to a misclassication (i.e., the elements should be classied according to sequence cl.check, while they are in practice allocated to the clusters according to the sequence cl.proj). Usage WGP(cl.check, cl.proj, DataS.check) Arguments cl.check cl.proj DataS.check Details For details about this cost function, see the reference below. M. Vrac, P. Yiou. "Weather regimes designed for local precipitation modelling: Application to the Mediterranean bassin". Submitted, JGR-A, 2009 Reference sequence (i.e., numerical vector) of clusters. Sequence of clusters in practice. Precipitation data time series

WGP Value Returns a numerical vector corresponding to the WGP cost function Author(s) M. Vrac (mathieu.vrac@lsce.ipsl.fr)) See Also DI, CWGLI Examples

## Example

Index

Topic cluster CCM, 1 CWGLI, 3 DI, 4 Info.Criterion, 5 learn.and.project.clusters, 6 Percent.bad.and.false.classif.per.cluster, 7 WGP, 8 Topic models CCM, 1 CWGLI, 3 DI, 4 WGP, 8 CCM, 1, 5 CWGLI, 3, 8 DI, 3, 4, 5, 8 Info.Criterion, 3, 5 learn.and.project.clusters, 6, 7 Percent.bad.and.false.classif.per.cluster, 7 WGP, 3, 5, 8

10

- A study on the speech acoustic-to-articulatory mapping using morphological constraintsUploaded byHani C Yehia
- Art16-3_3.pdfUploaded byAhmed Ghilane
- A Review on Time Series Data MiningUploaded byTri Kurniawan Wijaya
- A System Identification Approach for Video-Based Face RecognitionUploaded byAdrian Seijas
- SME Credit Risk Analysis Using Bank Lending Data: An Analysis of Thai SMEsUploaded byADBI Publications
- Kuiper Ch10Uploaded byAyushYadav
- bio agingUploaded byNaveen Antony Machado
- Interacción genotipo por ambiente para el daño causado por taladradores (Diatraea spp.) en caña de azúcarUploaded byjrcorreac
- Face Detection using ANN.pdfUploaded byBudi Purnomo
- 3 - 2 - Getting Help (12_24)Uploaded bynoestoypanadie2000
- IMECS2008_pp462-467Uploaded byRajiv Coonghe
- Glynn - A Logistic Regression Model for the Enhancement of Student RetentionUploaded bySebastianus Carus Maximus
- en_Tanagra_hac_pca.pdfUploaded byCherry Ratisoontorn
- Enzymes SBB 2006Uploaded byPedro Campos
- A Novel Approach for Clustering Categorical Time Series Using Dissimilarity Based MeasureUploaded byInternational Journal for Scientific Research and Development - IJSRD
- Gharamani pcaUploaded byLeonardo L. Impett
- AMMI and GGE by GauchUploaded byramankant_pau
- cabaiUploaded byDiki Rizzal Fauzi
- Iris Dataset Clustering and Spam Email SeparationUploaded byAkash M Shahzad
- Sr 3Uploaded byDireshan Pillay
- (2004, Bailey Dkk) Prediction of Anti-plasmodial Activity of a Annua_application of 1H NMR Spectroscopy and ChemometricsUploaded byArinta Purwi Suharti
- Tourist Experiences and AttractionsUploaded byKhoa Xuan Nguyen
- Technique to Hybridize Principle Component and Independent Component Algorithms using Score Based Fusion ProcessUploaded byAnonymous kw8Yrp0R5r
- A Deter Minis Tic Algorithm for the MCDUploaded byHamidouche Hamada
- PAPER Unsupervised Learning in Neural ComputationUploaded byAntonio Marcegaglia
- GFI - "Illicit Financial Flows fromDeveloping Countries Overthe Decade Ending 2009"Uploaded bychequeado
- Demo Cluster 3Uploaded byRicardo Herrera
- Banse_et_al.2013.PCA_AggViolBehav-libre (1).pdfUploaded byFelipe Reyes Silva
- Emotional Feature AnalysisUploaded byjeffconnors
- 207.full.pdfUploaded byRandu Pinandito Nugroho

- Alex University CourseUploaded byNguyễnTrường
- WS 08.3 Volumes EUploaded byPeter Aguirre Kluge
- SET for Cluster-based Cloud Computing: A SurveyUploaded byGRENZE Scientific Society
- UntitledUploaded byVishal Bahadoor
- QuantEconlectures-python3Uploaded byCristian F. Sanabria
- 307Uploaded bysandeepjangra15
- 00091820Uploaded bymsmsoft90
- m150a Mock MtaUploaded byAdnan Ak
- Lehmann IA SSM Ch4Uploaded byAnonymous 7gpjR0W
- 03 Section 2 Example Bridge(E)Uploaded byGuru
- The Structure of TopoisomaraseUploaded byyashbhardwaj
- An Optimization Model on Construction and Demolition Waste.pdfUploaded byjcaldanab
- ADAPT(FM1)Uploaded bySumaiyyahRoshidi
- What’s the Difference between Line, Instrument and Microphone LevelsUploaded bydollagio
- List Week5Uploaded byRoberta Barbour
- Reflexw ManualUploaded byDiego Mota Jara
- Pauwels_SLIM®_transformers_lead_the_way_to_wind-power_in_FranceUploaded byLuminita Codreanu
- Vacon Product Catalogue 2014 DPD01367B EnglishUploaded byDeepak Jain
- Research Papers Marketing Conf 2010Uploaded byShagufta Sarwar
- Single Multi Shaft TurbinesUploaded bymsshahenter
- ISTQB Question Paper15Uploaded byKapildev
- EYLES MACHIN 2015 the Introduction of Academy Schools to England s EducationUploaded byclfrites
- Greenhouse EffectUploaded byHina Farooq
- AVX RF Microwave ProductsUploaded byLuke Gomez
- Dynamics NotesUploaded byBas Ramu
- Constant-Pressure Turbocharger Matching by Michael M. W. de SilvaUploaded byMichael M. W. de Silva
- hermeneutics sportUploaded byJavi LoFrias
- An Experimental Analysis of IC Engine by Using Hydrogen BlendUploaded byIJSTE
- BPCS Reference ManualUploaded bykriprasad
- MysticUploaded byfreshrich