sasadvancedbook1
Topic: Advanced Programming Techniques I
1. Optimizing System Performance
2. SAS Index
3. Using an Index for Efficient WHERE Processing
4. Using BY-Group Processing with an Index
5. FILENAME
6. Sampling
7. PROC TRANSPOSE
8. SET Statement and Beyond
9. PROC APPEND
10. MODIFY Statement
Review of SAS Processing
SAS processing is the way that the SAS language reads and transforms input data and generates the kind of output that you request.
Processing a DATA Step: A Walkthrough
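/* Sample: each record holds a team name, a participant name, and three event scores */
data pro;
   input TeamName $ Name $ Event1 Event2 Event3;
   datalines;
Yellow Sue 6 8 8
Blue Jane 9 7 8
Red John 7 7 7
Yellow Lisa 8 9 9
Red Fran 7 6 6
Blue Walter 9 8 10
;
run;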
The Compilation Phase
When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. In this phase, SAS identifies the type and length of each new variable, and determines whether a type conversion is necessary for each subsequent reference to a variable. During the compile phase, SAS creates the following three items:
input buffer
is a logical area in memory into which SAS reads each record of raw data when SAS executes an INPUT statement. Note that this buffer is created only when the DATA step reads raw data. This buffer holds the data before moving the data to the program data vector (PDV). (When the DATA step reads a SAS data set, SAS reads the data directly into the program data vector.)
program data vector (PDV)
is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
Along with data set variables and computed variables, the PDV contains two automatic variables, _N_ and _ERROR_, that are automatically generated for every DATA step. The _N_ variable counts the number of times the DATA step begins to iterate. The _ERROR_ variable signals the occurrence of an error caused by the data during execution. The value of _ERROR_ is either 0 (indicating that no errors exist) or 1 (indicating that one or more errors have occurred). SAS does not write these variables to the output data set.
descriptor information
is information that SAS creates and maintains about each SAS data set, including data set attributes and variable attributes. It contains, for example, the name of the data set and its member type, the date and time that the data set was created, and the number, names, and data types (character or numeric) of the variables.
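To make the three items above concrete, the following minimal sketch writes the contents of the PDV to the log on every iteration and then prints the descriptor information. The data set name WORK.PDVDEMO and the data values are assumptions made up for illustration; the second record deliberately contains an invalid numeric value so that _ERROR_ is set.

```sas
/* Hypothetical example: WORK.PDVDEMO and its values are illustrative only */
data work.pdvdemo;
   input Name $ Score;
   Doubled = Score * 2;   /* computed variable, also held in the PDV */
   put _all_;             /* writes every PDV variable, including _N_ and _ERROR_, to the log */
   datalines;
Sue 6
Jane x
;
run;

/* Descriptor information: data set attributes and variable attributes */
proc contents data=work.pdvdemo;
run;
```

For the second record, SAS cannot read "x" as a number, so Score is set to missing and _ERROR_ becomes 1 for that iteration. The PROC CONTENTS output lists the creation date, the number of variables, and each variable's name, type, and length.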
The Execution Phase
By default, a simple DATA step iterates once for each observation that is being created. The flow of action in the execution phase of a simple DATA step is as follows:
1. The DATA step begins with a DATA statement. Each time the DATA statement executes, a new iteration of the DATA step begins, and the _N_ automatic variable is incremented by 1.
2. SAS sets the newly created program variables to missing in the program data vector (PDV).
3. SAS reads a data record from a raw data file into the input buffer, or it reads an observation from a SAS data set directly into the program data vector. You can use an INPUT, MERGE, SET, MODIFY, or UPDATE statement to read a record.
4. SAS executes any subsequent programming statements for the current record.
5. At the end of the statements, an output, return, and reset occur automatically. SAS writes an observation to the SAS data set, the system automatically returns to the top of the DATA step, and the values of variables created by INPUT and assignment statements are reset to missing in the program data vector. Note that variables that you read with a SET, MERGE, MODIFY, or UPDATE statement are not reset to missing here.
6. SAS counts another iteration, reads the next record or observation, and executes the subsequent programming statements for the current observation.
7. The DATA step terminates when SAS encounters the end-of-file in a SAS data set or a raw data file.
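The reset behavior described in step 5 can be observed in the log. The sketch below is an illustration only: the data set names WORK.SCORES and WORK.DEMO, the variable Bonus, and the data values are all assumptions. The LENGTH statement is there only so that Name and Score can be referenced in the PUT statement before the SET statement is compiled.

```sas
/* Hypothetical input data for the demonstration */
data work.scores;
   input Name $ Score;
   datalines;
Sue 6
Jane 9
;
run;

data work.demo;
   length Name $ 8 Score 8;   /* declare attributes before the first reference */
   /* Runs at the top of every iteration, before SET reads the next observation */
   put 'Start of iteration: ' _n_= Name= Score= Bonus=;
   set work.scores;           /* variables read by SET are not reset to missing */
   Bonus = Score * 2;         /* created by assignment: reset to missing each iteration */
run;
```

In the log, Bonus should appear as missing at the start of every iteration, while Name and Score should still hold the values from the previous observation. The PUT statement also runs one more time than there are observations, because the DATA step begins a final iteration before SET encounters the end of file.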
Access Patterns
SAS procedures and statements can read observations in SAS data sets in one of the following patterns:
sequential access
processes observations one after the other, starting at the beginning of the file and continuing in sequence to the end of the file.
random access
processes observations according to the value of some indicator variable without processing previous observations.
BY-group access
groups and processes observations in order of the values of the variables specified in a BY statement.
multiple-pass
performs two or more passes on data when required by SAS statements or procedures.
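The first three access patterns can be sketched with a small data set. Everything named here (WORK.TEAMS, WORK.SEQ, WORK.THIRD, WORK.SORTED, WORK.TEAM_TOTALS, the variables obsnum and Total) is hypothetical and chosen only for illustration; the POINT= option is one way to get random access by observation number.

```sas
/* Hypothetical data for the demonstration */
data work.teams;
   input TeamName $ Name $ Event1;
   datalines;
Yellow Sue 6
Blue Jane 9
Red John 7
Yellow Lisa 8
;
run;

/* Sequential access: every observation is read in order */
data work.seq;
   set work.teams;
run;

/* Random access: read only observation 3, skipping the earlier ones */
data work.third;
   obsnum = 3;                    /* POINT= variable; not written to the output */
   set work.teams point=obsnum;
   output;
   stop;                          /* needed because POINT= never reaches end of file */
run;

/* BY-group access: the data must be sorted (or indexed) by the BY variable */
proc sort data=work.teams out=work.sorted;
   by TeamName;
run;

data work.team_totals (keep=TeamName Total);
   set work.sorted;
   by TeamName;
   if first.TeamName then Total = 0;
   Total + Event1;                /* sum statement: Total is retained across iterations */
   if last.TeamName then output;  /* one observation per team */
run;
```

The PROC SORT step is itself an example of a procedure that needs more than one pass over the data.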
1. Optimizing System Performance
1.1 Definitions: Performance Statistics
All tasks require time and space. Time and space for a computer program are composed of CPU time, I/O time, and memory.
• I/O time: the time your computer takes to read data into memory and write data from memory to your hard drive
• Memory: the size of the work area that the CPU must devote to the operations in the program
• CPU time: the time your computer takes to perform calculations (CPU: Central Processing Unit)
• Data storage: how much space on disk or tape your data use
• Programming time: the amount of time required for the programmer to write and maintain the program
You can obtain these statistics by using SAS system options, which help you measure your job's initial performance and determine how to improve it.
1.2 System Performance
System performance is measured by the overall amount of I/O, memory, data storage, and CPU time that your system uses to process SAS programs.
1.3 Interpreting FULLSTIMER
Several types of resource usage statistics are reported by the FULLSTIMER option, including real time (elapsed time) and CPU time. Real time represents the clock time it took to execute a job or step. CPU time represents the actual processing time required by the CPU to execute the job.
The statistics reported by FULLSTIMER relate to the three critical computer resources: I/O, memory, and CPU time, including system and user CPU time. In many circumstances, reducing the use of any of these three resources results in better throughput for a particular job and a reduction in the real time used.
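As a minimal sketch of how these statistics are requested, the following program turns on FULLSTIMER, runs two steps, and turns the option off again. The data set WORK.BIG, its size, and the random-number seed are assumptions chosen only to give the steps some measurable work.

```sas
options fullstimer;            /* request the full set of resource statistics */

/* Hypothetical workload: one million observations with a random value */
data work.big;
   do i = 1 to 1000000;
      x = ranuni(12345);
      output;
   end;
run;

proc means data=work.big mean;
   var x;
run;

options nofullstimer;          /* restore the shorter default statistics */
```

After each step, the log reports real time and CPU time, split into user and system CPU time, together with memory figures. The exact set of statistics depends on the operating environment.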
1.4 Overview of Techniques for Optimizing I/O
I/O is one of the most important factors for optimizing performance. Most SAS jobs consist of repeated cycles of reading a particular set of data to perform various data analysis and data manipulation tasks. To improve the performance of a SAS job, you must reduce the number of times SAS accesses disk or tape devices and reduce the number of times it processes the data internally.
Improvement in I/O can come at the cost of increased memory consumption. In order to understand the relationship between I/O and memory, it is helpful to know when data is copied to a buffer and where I/O is measured. When you create a SAS data set using a DATA step:
1. SAS copies the data from the input data set to a buffer in memory
2. one observation at a time is loaded into the program data vector
3. each observation is written to an output buffer when processing is complete
Page Size
Think of a buffer as a container in memory that is big enough for only one page of data.
The buffer size, or page size, determines the size of a single input/output buffer that SAS uses to transfer data during processing. A page is the minimum number of bytes of data that SAS moves between external storage and memory in one logical input/output operation.
A page
∙ is the unit of data transfer between the storage device and memory
∙ includes the number of bytes used by the descriptor portion and the data values
∙ is fixed in size when the data set is created, either to a default value or to a user-specified value
The amount of data that can be transferred to one buffer in a single I/O operation is referred to as the page size. Each buffer can hold one page of data.
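One way to see the page size that SAS chose for a data set is to run PROC CONTENTS on it. The sketch below assumes a hypothetical data set WORK.PAGEDEMO created only for this purpose.

```sas
/* Hypothetical data set used only to inspect its page attributes */
data work.pagedemo;
   do i = 1 to 100000;
      x = i * 2;
      output;
   end;
run;

/* The engine/host dependent section of the output reports page information */
proc contents data=work.pagedemo;
run;
```

For data sets created with the default engine, the engine/host dependent information normally includes the data set page size and the number of data set pages. The labels and values vary by SAS release and operating environment.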
Setting the BUFNO= and BUFSIZE= System Options
The following SAS system options can help you reduce the number of disk accesses that are needed for SAS files, though they might increase memory usage.
1) BUFNO=
You can use the BUFNO= system option or data set option to control the number of page buffers that SAS allocates for reading or writing a SAS data set. By increasing the number of buffers, you can control how many pages of data are loaded into memory with each I/O transfer.
You can specify the number of buffers; the default is 1. The value can range from 1 to the maximum number of buffers available in your operating environment. SAS uses the BUFNO= option to adjust the number of open page buffers when it processes a SAS data set. Increasing this option's value can improve your application's performance by allowing SAS to read more data with fewer passes; however, the greater the number of page buffers, the more memory is required. The buffer number is not a permanent attribute of the data set and is valid only for the current step or SAS session.
To minimize I/O consumption when you work with a small data set, allocate as many buffers as there are pages in the data set so that the entire data set can be loaded into memory.
Note:
Using BUFNO= can speed up execution time by limiting the number of input/output operations that are required for a particular SAS data set. The improvement in execution time, however, comes at the expense of increased memory consumption.
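The following sketch shows both forms of BUFNO=: as a system option for the session and as a data set option for a single step. The data set names WORK.SOURCE and WORK.COPY and the buffer counts 10 and 20 are arbitrary assumptions, not recommended settings.

```sas
/* System option: applies to data sets processed for the rest of the session */
options bufno=10;

/* Hypothetical input data */
data work.source;
   do i = 1 to 500000;
      x = i;
      output;
   end;
run;

/* Data set option: overrides the system option for this data set in this step */
data work.copy;
   set work.source (bufno=20);
run;
```

Running the copy step with FULLSTIMER turned on and different BUFNO= values is one way to see the trade-off between I/O and memory on a particular system.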
2)BUFSIZE=
You can use the BUFSIZE= system option or data set option to control the page size of an output SAS data set.
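As a minimal sketch, the BUFSIZE= data set option below requests a page size when the output data set is created. The data set name WORK.WIDE and the 64K value are assumptions for illustration; the operating environment may adjust the value that is actually used.

```sas
/* Request a page size for the new data set; the page size is fixed at creation */
data work.wide (bufsize=64k);
   do i = 1 to 100000;
      x = i;
      output;
   end;
run;

/* Check the page size that SAS actually used */
proc contents data=work.wide;
run;
```

Because the page size is a permanent attribute of the data set, changing it later means re-creating the data set.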