Computer Science Literature Translation: Research on Machine Learning
Machine-Learning Research
Four Current Directions
Thomas G. Dietterich
■ Machine-learning research has been making great progress in many directions. This article summarizes four of these directions and discusses some current open problems. The four directions are
(1) the improvement of classification accuracy by learning ensembles of classifiers,
(2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.
The last five years have seen an explosion in machine-learning research. This explosion has many causes:
First, separate research communities in symbolic machine learning, computational learning theory, neural networks, statistics, and pattern recognition have discovered one another and begun to work together. Second, machine-learning techniques are being applied to new kinds of problems, including knowledge discovery in databases, language processing, robot control, and combinatorial optimization, as well as to more traditional problems such as speech recognition, face recognition, handwriting recognition, medical data analysis, and game playing.
In this article, I selected four topics within machine learning where there has been a lot of recent activity. The purpose of the article is to describe the results in these areas to a broader AI audience and to sketch some of the open research problems. The topic areas are
(1) ensembles of classifiers,
(2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.
The reader should be cautioned that this article is not a comprehensive review of each of these topics. Rather, my goal is to provide a representative sample of the research in each of these four areas. In each of the areas, there are many other papers that describe relevant work. I apologize to those authors whose work I was unable to include in the article.
Ensembles of Classifiers
The first topic concerns methods for improving accuracy in supervised learning. I begin by introducing some notation. In supervised learning, a learning program is given training examples of the form {(x_1, y_1), …, (x_m, y_m)} for some unknown function y = f(x). The x_i values are typically vectors of the form ⟨x_{i,1}, x_{i,2}, …, x_{i,n}⟩ whose components are discrete or real valued.
The y values are typically drawn from a discrete set of classes {1, …, k} in the case of classification or from the real line in the case of regression. In this article, I focus primarily on classification. The training examples might be corrupted by some random noise.
Given a set S of training examples, a learning algorithm outputs a classifier. The classifier is a hypothesis about the true function f. Given new x values, it predicts the corresponding y values. I denote classifiers by h_1, …, h_L.
An ensemble of classifiers is a set of classifiers whose individual decisions are combined in some way (typically by weighted or unweighted voting) to classify new examples. One of the most active areas of research in supervised learning has been the study of methods for constructing good ensembles of classifiers. The main discovery is that ensembles are often much more accurate than the individual classifiers that make them up.
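To make these definitions concrete, here is a minimal Python sketch (my own illustration; the toy classifiers and the name majority_vote are hypothetical, not from the article) of an ensemble whose individual decisions are combined by unweighted voting:

    from collections import Counter

    def majority_vote(classifiers, x):
        """Classify x by an unweighted vote over the ensemble."""
        votes = Counter(h(x) for h in classifiers)
        return votes.most_common(1)[0][0]

    # Toy component classifiers over two-dimensional feature vectors.
    h1 = lambda x: 1 if x[0] > 0.5 else 0
    h2 = lambda x: 1 if x[1] > 0.5 else 0
    h3 = lambda x: 1 if x[0] + x[1] > 1.0 else 0

    print(majority_vote([h1, h2, h3], (0.9, 0.2)))  # h1 and h3 vote 1 -> prints 1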
An ensemble can be more accurate than its component classifiers only if the individual classifiers disagree with one another (Hansen and Salamon 1990). To see why, imagine that we have an ensemble of three classifiers:
{h1, h2, h3}, and consider a new case x. If the three classifiers are identical, then when h1(x) is wrong, h2(x) and h3(x) are also wrong. However, if the errors made by the classifiers are uncorrelated, then when h1(x) is wrong, h2(x) and h3(x) might be correct, so that a majority vote correctly classifies x. More precisely, if the error rates of L hypotheses h_l are all equal to p < 1/2 and if the errors are independent, then the probability that the majority vote is wrong is the area under the binomial distribution where more than L/2 hypotheses are wrong:

P(majority vote is wrong) = Σ_{k > L/2} C(L, k) p^k (1 − p)^(L − k)

(A numeric check of this formula appears in the first sketch following this section.) Of course, if the individual hypotheses make uncorrelated errors at rates exceeding 0.5, then the error rate of the voted ensemble increases as a result of the voting. Hence, the key to successful ensemble methods is to construct individual classifiers with error rates below 0.5 whose errors are at least somewhat uncorrelated.

Methods for Constructing Ensembles

Many methods for constructing ensembles have been developed. Some methods are general, and they can be applied to any learning algorithm. Other methods are specific to particular algorithms. I begin by reviewing the general techniques.

Subsampling the Training Examples

The first method manipulates the training examples to generate multiple hypotheses. The learning algorithm is run several times, each time with a different subset of the training examples. This technique works especially well for unstable learning algorithms, that is, algorithms whose output classifier undergoes major changes in response to small changes in the training data. Decision-tree, neural-network, and rule-learning algorithms are all unstable. Linear-regression, nearest-neighbor, and linear-threshold algorithms are generally stable.

The most straightforward way of manipulating the training set is called bagging (a short sketch follows this section). On each run, bagging presents the learning algorithm with a training set that consists of a sample of m training examples drawn randomly with replacement from the original training set of m items. Such a training set is called a bootstrap replicate of the original training set, and the technique is called bootstrap aggregation (Breiman 1996a). Each bootstrap replicate contains, on the average, 63.2 percent of the original training set, with several training examples appearing multiple times.

Another training-set sampling method is to construct the training sets by leaving out disjoint subsets. For example, the training set can be divided randomly into 10 disjoint subsets. Then, 10 overlapping training sets can be constructed by dropping out a different one of these 10 subsets. This same procedure is used to construct training sets for tenfold cross-validation; so, ensembles constructed in this way are sometimes called cross-validated committees (Parmanto, Munro, and Doyle 1996).

The third method for manipulating the training set is illustrated by the ADABOOST algorithm, developed by Freund and Schapire (1996, 1995) and shown in figure 2. Like bagging, ADABOOST manipulates the training examples to generate multiple hypotheses. ADABOOST maintains a probability distribution p_i(x) over the training examples. In each iteration i, it draws a training set of size m by sampling with replacement according to the probability distribution p_i(x). The learning algorithm is then applied to produce a classifier h_i. The error rate ε_i of this classifier on the training examples (weighted according to p_i(x)) is computed and used to adjust the probability distribution on the training examples. (In figure 2, note that the probability distribution is obtained by normalizing a set of weights w_i over the training examples.)

The effect of the change in weights is to place more weight on examples that were misclassified by h_i and less weight on examples that were correctly classified. In subsequent iterations, therefore, ADABOOST constructs progressively more difficult learning problems.

The final classifier, h_f, is constructed by a weighted vote of the individual classifiers. Each classifier is weighted according to its accuracy for the distribution p_i that it was trained on. (A compact rendering of the whole loop appears in the last sketch following this section.)
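As a numeric check of the binomial formula above, the following short Python sketch (my own, not from the article) computes the majority-vote error for an ensemble of L = 21 independent hypotheses, each with individual error rate p = 0.3:

    from math import comb

    L, p = 21, 0.3

    # Probability that more than L/2 hypotheses are simultaneously wrong,
    # i.e., the area under the binomial distribution beyond L/2.
    ensemble_error = sum(comb(L, k) * p**k * (1 - p)**(L - k)
                         for k in range(L // 2 + 1, L + 1))
    print(round(ensemble_error, 3))  # about 0.026, far below the individual 0.3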
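Bagging itself takes only a few lines. The sketch below is a minimal rendering under stated assumptions: learn is a placeholder for any base learning algorithm that maps a list of (x, y) training pairs to a classifier, and the returned ensemble can be combined with majority_vote from the earlier sketch:

    import random

    def bagging(learn, examples, num_classifiers=10):
        """Train each classifier on a bootstrap replicate: m examples drawn
        randomly with replacement from the original m training examples.
        On average, a replicate contains about 63.2 percent of the distinct
        original examples, with some appearing multiple times."""
        m = len(examples)
        return [learn([random.choice(examples) for _ in range(m)])
                for _ in range(num_classifiers)]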
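The ADABOOST.M1 loop described above can likewise be sketched compactly. This is my own rendering of the published pseudocode (figure 2 itself is not reproduced here), using the resampling variant discussed in the next passage; learn is again a placeholder base learner, and examples is a list of (x, y) pairs:

    import math
    import random

    def adaboost_m1(learn, examples, rounds=10):
        """Sketch of ADABOOST.M1 (Freund and Schapire 1995, 1996)."""
        m = len(examples)
        weights = [1.0 / m] * m               # w_i, normalized below into p_i
        hypotheses, alphas = [], []
        for _ in range(rounds):
            total = sum(weights)
            p = [w / total for w in weights]  # the distribution p_i(x)
            sample = random.choices(examples, weights=p, k=m)
            h = learn(sample)
            # Error rate eps_i of h on the training examples, weighted by p_i.
            eps = sum(pi for pi, (x, y) in zip(p, examples) if h(x) != y)
            if eps >= 0.5:                    # no better than chance: stop
                break
            eps = max(eps, 1e-12)             # guard for a perfect hypothesis
            beta = eps / (1 - eps)
            # Down-weight correctly classified examples; misclassified ones
            # keep their weight, making later rounds progressively harder.
            weights = [w * beta if h(x) == y else w
                       for w, (x, y) in zip(weights, examples)]
            hypotheses.append(h)
            alphas.append(math.log(1 / beta))  # voting weight: accuracy counts

        def final_classifier(x):              # weighted vote of the h_i
            votes = {}
            for h, a in zip(hypotheses, alphas):
                votes[h(x)] = votes.get(h(x), 0.0) + a
            return max(votes, key=votes.get)
        return final_classifier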
In line 4 of the ADABOOST algorithm (figure 2), the base learning algorithm Learn is called with the probability distribution p_i. If the learning algorithm Learn can use this probability distribution directly, then this procedure generally gives better results. For example, Quinlan (1996) developed a version of the decision tree-learning program C4.5 that works with a weighted training sample. His experiments showed that it worked extremely well. One can also imagine versions of backpropagation that scaled the computed output error for training example (x_i, y_i) by the weight p_i. Errors for important training examples would cause larger gradient-descent steps than errors for unimportant (low-weight) examples (a toy version of this idea is sketched at the end of this section).

However, if the algorithm cannot use the probability distribution p_i directly, then a training sample can be constructed by drawing a random sample with replacement in proportion to the probabilities p_i. This procedure makes ADABOOST more stochastic, but experiments have shown that it is still effective.

Figure 3 compares the performance of C4.5 to C4.5 with ADABOOST.M1 (using random sampling). One point is plotted for each of 27 test domains taken from the Irvine repository of machine-learning databases (Merz and Murphy 1996). We can see that most points lie above the line y = x, which indicates that the error rate of ADABOOST is less than the error rate of C4.5. Figure 4 compares the performance of bagging (with C4.5) to C4.5 alone. Again, we see that bagging produces sizable reductions in the error rate of C4.5 for many problems. Finally, figure 5 compares bagging with boosting (both using C4.5 as the underlying algorithm). The results show that the two techniques are comparable, although boosting appears to still have an advantage over bagging.
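To illustrate the weighted-error idea from the first paragraph of this passage, here is a toy Python sketch (my own illustration; not Quinlan's weighted C4.5, and far simpler than backpropagation): a linear threshold unit trained by gradient-style updates in which each example's error signal is scaled by its probability p_i, so high-weight examples cause larger steps:

    def learn_weighted_linear(examples, p, epochs=100, lr=0.1):
        """Toy learner that uses the distribution p directly: the update
        for example i is scaled by p[i], the analog of scaling the computed
        output error in backpropagation."""
        n = len(examples[0][0])
        w, b = [0.0] * n, 0.0
        for _ in range(epochs):
            for (x, y), pi in zip(examples, p):
                pred = 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0
                err = (y - pred) * pi        # error signal weighted by p_i
                w = [wj + lr * err * xj for wj, xj in zip(w, x)]
                b += lr * err
        return lambda x: 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

A learner of this form could be passed directly as the learn argument of the adaboost_m1 sketch above, skipping the resampling step.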