Clustering and MATLAB
{Visits (95) Posted by Amro Garrith Graham 3.2k}
Hi, I'm trying to cluster some data I have from the KDD Cup 1999 dataset.
The output from the file looks like this:
0,tcp,http,SF,239,486,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,19,19,1.00,0.00,0.05,0.00,0.00,0.00,0.00,0.00,normal.

with 48 thousand different records in that format. I have cleaned the data up and removed the text, keeping only the numbers. The output looks like this now:

I created a comma-delimited file in Excel and saved it as a CSV file, then created a data source from the CSV file in MATLAB. I've tried running it through the FCM toolbox
in MATLAB (findcluster outputs 38 data types, which is expected with 38 columns).
The clusters, however, don't look like clusters, or it's not accepting the data and working the way I need it to.
Could anyone help with finding the clusters? I'm new to MATLAB, so I don't have any experience, and I'm also new to clustering.
The method:
1. Choose the number of clusters (K)
2. Initialize centroids (K patterns randomly chosen from the dataset)
3. Assign each pattern to the cluster with the closest centroid
4. Calculate the mean of each cluster to be its new centroid
5. Repeat steps 3 and 4 until a stopping criterion is met (no pattern moves to another cluster)
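
For reference, the steps listed above can be sketched in plain MATLAB without any toolbox. This is only a minimal illustration, not the code used in the question: the 8-by-2 matrix X is made-up toy data standing in for the real features, and the cap of 100 iterations is an arbitrary assumption.

```matlab
% Toy data (hypothetical): two small groups of 2-D points
X = [0.1 0.2; 0.0 0.1; 0.2 0.0; 0.1 0.1; ...
     5.0 5.1; 5.2 4.9; 4.9 5.0; 5.1 5.2];
K = 2;                        % step 1: number of clusters
n = size(X, 1);

% Step 2: K distinct rows of X, chosen at random, are the initial centroids
centroids = X(randperm(n, K), :);
idx = zeros(n, 1);            % current cluster label of each row

for iter = 1:100              % arbitrary safety cap on the number of passes
    % Step 3: squared Euclidean distance from every row to every centroid
    D = zeros(n, K);
    for k = 1:K
        diffs = X - repmat(centroids(k, :), n, 1);
        D(:, k) = sum(diffs .^ 2, 2);
    end
    [~, newIdx] = min(D, [], 2);

    % Step 5: stop as soon as no pattern moves to another cluster
    if isequal(newIdx, idx)
        break
    end
    idx = newIdx;

    % Step 4: each centroid becomes the mean of its current members
    for k = 1:K
        members = X(idx == k, :);
        if ~isempty(members)  % keep the old centroid if a cluster empties
            centroids(k, :) = mean(members, 1);
        end
    end
end
```

Because the initial centroids are random, different runs can end in different partitions. The Statistics Toolbox function kmeans implements the same idea with more options, if that toolbox is available.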
This is what I'm trying to achieve:

This is what I'm getting:

load kddcup1.dat
plot(kddcup1(:,1), kddcup1(:,2), 'o')
[center, U, objFcn] = fcm(kddcup1, 2);
Iteration count = 1, obj. fcn = 253224062681230720.000000
Iteration count = 2, obj. fcn = 241493132059137410.000000
Iteration count = 3, obj. fcn = 241484544542298110.000000
Iteration count = 4, obj. fcn = 241439204971005280.000000
Iteration count = 5, obj. fcn = 241090628742523840.000000
Iteration count = 6, obj. fcn = 239363408546874750.000000
Iteration count = 7, obj. fcn = 238580863900727680.000000
Iteration count = 8, obj. fcn = 238346826370420990.000000
Iteration count = 9, obj. fcn = 237617756429912510.000000
Iteration count = 10, obj. fcn = 226364785036628320.000000
Iteration count = 11, obj. fcn = 94590774984961184.000000
Iteration count = 12, obj. fcn = 2220521449216102.500000
Iteration count = 13, obj. fcn = 2220521273191876.200000
Iteration count = 14, obj. fcn = 2220521273191876.700000
Iteration count = 15, obj. fcn = 2220521273191876.700000
figure
plot(objFcn)
title('Objective Function Values')
xlabel('Iteration Count')
ylabel('Objective Function Value')
maxU = max(U);
index1 = find(U(1,:) == maxU);
index2 = find(U(2,:) == maxU);
figure
line(kddcup1(index1,1), kddcup1(index1,2), 'linestyle',...
'none', 'marker', 'o', 'color', 'g');
line(kddcup1(index2,1), kddcup1(index2,2), 'linestyle',...
'none', 'marker', 'x', 'color', 'r');
hold on
plot(center(1,1), center(1,2), 'ko', 'markersize', 15, 'LineWidth', 2)
plot(center(2,1), center(2,2), 'kx', 'markersize', 15, 'LineWidth', 2)

Answer (1): matlab, machine learning, cluster analysis, data mining, fuzzy

#1
{Answered by Amro}

Since you are new to machine learning/data mining, you shouldn't tackle such advanced problems. After all, the data you are working with was used in a
competition (KDD Cup '99), so don't expect it to be easy!
Besides, the data was intended for a classification task (supervised learning), where the goal is to predict the correct class (bad/good connection). You seem to be
interested in clustering (unsupervised learning), which is generally more difficult.
This sort of dataset requires a lot of preprocessing and clever feature extraction. People usually employ domain knowledge (network intrusion detection) to obtain
better features from the raw data. Directly applying simple algorithms like K-means will generally yield poor results.
For starters, you need to normalize the attributes to be of the same scale: when computing the Euclidean distance as part of step 3 in your method, the features with
values such as 239 and 486 will dominate over the other features with small values such as 0.05, thus disrupting the result.
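
As a rough illustration of the scaling step, here is one common choice, z-score standardization (zero mean and unit standard deviation per column), written with base MATLAB functions only. The 4-by-2 matrix is invented for the example: column 1 mimics a byte count, column 2 a rate between 0 and 1. Other schemes, such as min-max scaling, would serve the same purpose.

```matlab
% Hypothetical sample: column 1 is byte-count-like, column 2 is a rate in [0, 1]
X = [239 0.05; 486 0.00; 120 1.00; 300 0.50];
n = size(X, 1);

mu = mean(X, 1);              % per-column mean
sigma = std(X, 0, 1);         % per-column sample standard deviation
sigma(sigma == 0) = 1;        % constant columns: avoid dividing by zero

% Standardize: every column ends up with mean 0 and standard deviation 1
Z = (X - repmat(mu, n, 1)) ./ repmat(sigma, n, 1);

% Squared distance between rows 1 and 3, before and after scaling
rawDist = sum((X(1, :) - X(3, :)) .^ 2);   % driven almost entirely by column 1
stdDist = sum((Z(1, :) - Z(3, :)) .^ 2);   % both columns now contribute
```

The guard for constant columns matters for data like this, where many attributes are zero in every record. If the Statistics Toolbox is installed, its zscore function performs the same standardization.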
Another point to remember is that too many attributes can be a bad thing (curse of dimensionality). Thus you should look into feature selection or dimensionality
reduction techniques.
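
One possible dimensionality reduction technique is principal component analysis (PCA). The sketch below computes it through the singular value decomposition, using base MATLAB only. The 6-by-3 matrix is invented, and the 95% variance threshold is an arbitrary assumption for the example, not a recommendation for this dataset.

```matlab
% Hypothetical data: 6 observations, 3 features.
% Feature 3 is exactly twice feature 1, so it carries no extra information.
Z = [1 2 2; 2 1 4; 3 4 6; 4 3 8; 5 6 10; 6 5 12];
n = size(Z, 1);

Zc = Z - repmat(mean(Z, 1), n, 1);       % center each column
[~, S, V] = svd(Zc, 'econ');             % columns of V are the principal directions

variances = diag(S) .^ 2 / (n - 1);      % variance captured by each component
explained = variances / sum(variances);  % as a fraction of the total

% Smallest number of components that covers at least 95% of the variance
d = find(cumsum(explained) >= 0.95, 1);

scores = Zc * V(:, 1:d);                 % n-by-d reduced representation
```

Clustering would then run on scores instead of the full feature matrix. In practice the columns are usually standardized first, so that no single attribute dominates the components.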
Finally, I suggest you familiarize yourself with a simpler dataset...

Copyright (c) 2015 questioninbox.com. All rights reserved.
