
Multilevel Models (original English text), chapter 7



Chapter 7

Discrete response data

7.1 Models for discrete response data

All the models of previous chapters have assumed that the response variable is continuously distributed. We now look at data where the response is essentially a count of events. This count may be the number of times an event occurs out of a fixed number of ‘trials’, in which case we usually deal with the resulting proportion as response: an example is the proportion of deaths in a population, classified by age. We may have a vector of counts representing the numbers of events of different kinds which occur out of a total number of events: an example is given in chapter 3 where we studied the number of responses to each, ordered, category of a question on abortion attitudes.

Statistical models for such data are referred to as ‘generalised linear models’ (McCullagh and Nelder, 1989). A 2-level model can be written in the general form

$\pi_{ij} = f(X_{ij}\beta_j)$  (7.1)

where $\pi_{ij}$ is the expected value of the response for the ij-th level 1 unit and f is a nonlinear function of the ‘linear predictor’ $X_{ij}\beta_j$. Note that we allow random coefficients at level 2. The model is completed by specifying a distribution for the observed response $y_{ij} \mid \pi_{ij}$. Where the response is a proportion this is typically taken to be binomial, and where the response is a count it is taken to be Poisson. Equation (7.1) is a special case of the nonlinear model studied in chapter 5 and we shall be using the results given there. It remains for us to specify the nonlinear ‘link’ function f. Table 7.1 lists some of the standard choices, with logarithms chosen to base e.

In addition to these we can also have the ‘identity’ function $f^{-1}(\pi) = \pi$, but this can create difficulties since it allows, in principle, predicted counts or proportions which are respectively less than zero or outside the range (0,1). Nevertheless, in many cases, using the identity function produces acceptable results which may differ little from those obtained with the nonlinear functions. In the following sections we consider each common type of model in turn with examples.

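Table 7.1 Some nonlinear link functions.

Response              | $f^{-1}(\pi)$                             | Name
Proportion            | $\log\{\pi/(1-\pi)\}$                     | logit
Proportion            | $\log\{-\log(1-\pi)\}$                    | complementary log log
Vector of proportions | $\log(\pi_s/\pi_t)$, $s = 1, \dots, t-1$  | multivariate logit
Count                 | $\log(\pi)$                               | log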
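The link functions named in Table 7.1 can be sketched in a few lines of Python, using their standard textbook definitions. This is an illustration only: the function names are hypothetical, and the choice of the last category as the reference for the multivariate logit is an assumption.

```python
import math

def logit(p):
    """Logit link: log of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse logit: maps a linear predictor to a proportion."""
    return 1.0 / (1.0 + math.exp(-eta))

def cloglog(p):
    """Complementary log-log link: log(-log(1 - p)), for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))

def inv_cloglog(eta):
    """Inverse complementary log-log."""
    return 1.0 - math.exp(-math.exp(eta))

def multivariate_logit(props):
    """Log ratio of each category proportion to the last one.

    Assumption: the last category is the reference category.
    """
    ref = props[-1]
    return [math.log(p / ref) for p in props[:-1]]

def inv_multivariate_logit(etas):
    """Recover the full vector of proportions (they sum to 1)."""
    expd = [math.exp(e) for e in etas] + [1.0]
    total = sum(expd)
    return [e / total for e in expd]

def log_link(mu):
    """Log link for an expected count mu > 0."""
    return math.log(mu)

def inv_log_link(eta):
    """Inverse log link: always gives a positive expected count."""
    return math.exp(eta)
```

Each inverse function returns a value in the valid range for its response type whatever the value of the linear predictor, which is the property the identity function lacks.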

7.2 Proportions as responses

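Consider the 2-level variance components model with a single explanatory variable where the expected proportion is modelled using a logit link function

$\pi_{ij} = \{1 + \exp(-[\beta_0 + \beta_1 x_{ij} + u_j])\}^{-1}$  (7.2)

The observed responses $y_{ij}$ are proportions with the standard assumption that they are binomially distributed

$y_{ij} \sim \mathrm{Bin}(\pi_{ij}, n_{ij})$  (7.3)

where $n_{ij}$ is the denominator for the proportion. We also have

$\operatorname{var}(y_{ij} \mid \pi_{ij}) = \pi_{ij}(1-\pi_{ij})/n_{ij}$  (7.4)

We now write the model in the standard way including the level 1 variation as

$y_{ij} = \pi_{ij} + e_{ij} z_{ij}, \quad z_{ij} = \sqrt{\pi_{ij}(1-\pi_{ij})/n_{ij}}, \quad \sigma_e^2 = 1$  (7.5)

Using this explanatory variable $z_{ij}$ and constraining the level 1 variance associated with this to be one we obtain the required binomial variance in equation (7.4). When fitting a model we can also allow the level 1 variance to be estimated and, by comparing the estimated variance with the value 1.0, obtain a test for ‘extra binomial’ variation. Such variation may arise in a number of ways.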
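The role of the level 1 explanatory variable can be checked numerically. The sketch below assumes the usual binomial variance of a proportion, π(1 − π)/n, and takes the level 1 explanatory variable to be its square root; the function names are hypothetical.

```python
import math

def binomial_z(pi, n):
    """Level 1 explanatory variable: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi * (1.0 - pi) / n)

def level1_variance(pi, n, sigma2_e=1.0):
    """Variance of e * z when var(e) = sigma2_e.

    With sigma2_e constrained to 1 this is the binomial variance;
    an estimate different from 1 indicates extra binomial variation.
    """
    z = binomial_z(pi, n)
    return sigma2_e * z * z
```

For example, `level1_variance(0.2, 10)` reproduces the binomial variance 0.2 × 0.8 / 10 up to rounding error, while `level1_variance(0.2, 10, sigma2_e=1.3)` is 30% larger.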

If we have omitted a level in the model, for example ignored household clustering in a survey with one or more individuals sampled from a household, we would expect a greater than binomial variation at the individual level. Likewise, suppose the individuals and households were nested within areas and we chose to classify individuals, say by gender and 3 social class groups, giving 6 cells in each area. If we treat these as the level 1 units so that the response is a proportion, then we no longer have a binomial variance since these proportions are based upon the sum of separate binomial variables with differing probabilities. Here the variance for cell j within an area would have the form

$\{\pi_j(1-\pi_j) - \sigma_j^2\}/n_j$

where $n_j$ is the cell size and $\sigma_j^2$ is the variance of the individual probabilities within the cell. To fit such a model we would specify an extra level 1 explanatory variable equal to $1/\sqrt{n_j}$ for the j-th cell, with a variance parameter at level 1 which was allowed to be negative (see chapter 3). More generally, we can fit a model with an extra binomial parameter together with a further term such as above to give the following level 1 variance structure (omitting subscripts)

$\sigma_{e1}^2\,\pi(1-\pi)/n + \sigma_{e2}^2/n$

We do not, of course, know the true values of the quantities in these expressions, so that at each iteration we use estimates based upon the current values of the parameters. Because we are using only the mean and variance of the binomial distribution to carry out the estimation, the estimation is known as ‘quasilikelihood’ (see appendix 5.1).
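The claim that pooling responses with differing probabilities gives less than binomial variation can be verified exactly. The sketch below assumes independent 0/1 responses within a cell and compares the exact variance of the cell proportion with the binomial variance reduced by the spread of the individual probabilities. The function names and the example probabilities are hypothetical.

```python
def proportion_variance(probs):
    """Exact variance of the observed proportion when response i is an
    independent 0/1 variable with probability probs[i]."""
    n = len(probs)
    return sum(p * (1.0 - p) for p in probs) / n ** 2

def reduced_binomial_variance(probs):
    """Binomial variance at the mean probability, reduced by the
    variance of the individual probabilities, divided by cell size."""
    n = len(probs)
    pi = sum(probs) / n
    sigma2 = sum((p - pi) ** 2 for p in probs) / n
    return (pi * (1.0 - pi) - sigma2) / n

probs = [0.1, 0.2, 0.6, 0.7]           # illustrative values only
exact = proportion_variance(probs)
reduced = reduced_binomial_variance(probs)
pi_bar = sum(probs) / len(probs)
binomial = pi_bar * (1.0 - pi_bar) / len(probs)
```

The two functions agree for any set of probabilities, and both fall below the plain binomial variance whenever the probabilities differ, which is why the extra level 1 variance parameter has to be allowed to be negative.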

Another way of modelling such extra binomial variation, which has certain advantages, is to insert a ‘pseudo level’ above level 1. Thus, for individuals sampled within households, level 1 would be that of the individual and we would specify level 2 as that of the individuals also, to give exactly 1 level 1 unit per level 2 unit. We specify binomial variation at level 1, and at level 2 we can now fit further random coefficients. For example, if we fit a random coefficient for the explanatory variable with a variance which can be allowed to be negative, this is equivalent to specifying an extra level 1 variable $1/\sqrt{n_j}$ as above. In the above example where individuals are classified by gender and social class we can create a level 2 unit coinciding with each level 1 unit, fit binomial variation at level 1 and add level 2 variation which is a function of gender and social class, for example an additive function with 4 parameters (see chapter 3). We may wish to model the between-area variation of the cell proportions in terms of a simple variance term, rather than as inversely proportional to $n_j$. In this case we would choose a simple dummy variable structure rather than explanatory variables proportional to $1/\sqrt{n_j}$. This ‘pseudo level’ procedure is rather similar to the way in which a meta analysis with known level 1 variation is modelled (chapter 3).

In chapter 5 we made the distinction between models where the current level 2 residual estimates were added to the linear component of the nonlinear function when forming the Taylor expansion in order to work with a linearised model, and those cases where they were not. The former is referred to as predictive quasilikelihood (PQL) and the latter as marginal quasilikelihood (MQL). In many applications the MQL procedure will tend to underestimate the values of both the fixed and random parameters, especially where $n_{ij}$ is small. In addition we pointed out that greater accuracy is to be expected if the second order approximation is used rather than the first order based upon the first term in the Taylor expansion. Also, when the sample size is small the unbiased (RIGLS, REML) procedure should be used. Appendix 7.1 gives expressions for the second differentials required for the second order procedure.

To illustrate the difference, Table 7.2 presents the results of simulating the following model where the response is binary (0,1). The example assumes one moderate and one large level 2 variance.

$\operatorname{logit}(\pi_{ij}) = \beta_0 + \beta_1 x_{ij} + u_j, \quad \operatorname{var}(u_j) = \sigma_u^2$, with $\beta_0 = 0.5$, $\beta_1 = 1.0$ and $\sigma_u^2 = 0.5$ or $1.0$

There are 50 level 2 units with 20 level 1 units in each level 2 unit. The following results are based upon 400 simulations of the above model for each variance value.
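One replication of such a simulation design might be generated as in the sketch below. It assumes a random-intercept logit model with a single explanatory variable and Normal level 2 residuals. The default parameter values are read off from the estimates in Table 7.2, and the standard Normal distribution for the explanatory variable is purely an assumption, since the text does not say how it was generated. The function name is hypothetical.

```python
import math
import random

def simulate_dataset(beta0=0.5, beta1=1.0, sigma2_u=0.5,
                     n_level2=50, n_level1=20, seed=0):
    """Generate one balanced 2-level data set with a binary response.

    Returns a list of (level2_id, x, y) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_level2):
        # Level 2 residual, shared by all level 1 units in unit j.
        u_j = rng.gauss(0.0, math.sqrt(sigma2_u))
        for _ in range(n_level1):
            x_ij = rng.gauss(0.0, 1.0)  # assumed distribution of x
            eta = beta0 + beta1 * x_ij + u_j
            pi_ij = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
            y_ij = 1 if rng.random() < pi_ij else 0
            rows.append((j, x_ij, y_ij))
    return rows

data = simulate_dataset(sigma2_u=1.0, seed=1)
```

Repeating this with 400 different seeds for each variance value, and fitting the model to each data set with the chosen estimation procedure, would give the raw material for a table of the kind described. The fitting step itself is not shown here.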

Table 7.2 Mean values of 400 simulations. Empirical standard error in first bracket; mean of estimated standard errors in second bracket (IGLS).

             | True $\sigma_u^2 = 0.5$                       | True $\sigma_u^2 = 1.0$
Parameter    | MQL first order       | PQL second order      | MQL first order       | PQL second order
$\sigma_u^2$ | 0.386 (0.115) (0.130) | 0.480 (0.157) (0.152) | 0.672 (0.157) (0.188) | 0.964 (0.278) (0.255)
$\beta_0$    | 0.448 (0.126) (0.129) | 0.499 (0.139) (0.138) | 0.420 (0.145) (0.149) | 0.500 (0.171) (0.172)
$\beta_1$    | 0.934 (0.154) (0.147) | 1.018 (0.168) (0.154) | 0.875 (0.147) (0.145) | 1.017 (0.171) (0.158)

Here, the denominator is 1.0 in all cases. It is clear that the MQL first order model underestimates all the parameter values, whereas the second order PQL model produces estimates close to the true values. The estimates given are based upon IGLS. In every case convergence was achieved in fewer than 10 iterations. Very similar estimates for the fixed coefficients are obtained using RIGLS, and for the level 2 variances the PQL estimates become 0.498 and 0.996 respectively, which are even closer to the true values. In addition, the averages of the standard errors given by both models are reasonably close to those calculated empirically from the replications. If we calculate 95% confidence intervals for the parameters in the second order PQL model using the estimated standard errors and assuming Normality, then for the variance we find that about 91% of the intervals include the true value and for $\beta_0$ and $\beta_1$ about 95% do so. Hence, inferences about the true values would not be too misleading. The results of Table 7.2 are based upon a balanced data set with equal numbers of level 1 units within each level 2 unit. Further, limited, simulations suggest that even where the data are very unbalanced, for example with some level 2 units containing only a single level 1 unit, the PQL second order estimates remain close to the true values. These estimates appear to have good properties even with average observed probabilities as small as 0.1 or as large as 0.9 and a level 2 variance of 1.0 for the sample structure of this example.
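The summary quantities discussed here (mean estimate, empirical standard error, mean of the estimated standard errors, and interval coverage) can be computed from a set of replications as in the following sketch. It assumes Normal-theory 95% intervals of the form estimate ± 1.96 × standard error; the function name and the toy inputs are hypothetical and are not taken from the simulations reported in the text.

```python
import statistics

def summarise(estimates, std_errors, true_value, z=1.96):
    """Summarise one parameter over a set of simulation replications."""
    covered = sum(
        1 for est, se in zip(estimates, std_errors)
        if est - z * se <= true_value <= est + z * se
    )
    return {
        "mean": statistics.mean(estimates),
        # Spread of the estimates across replications.
        "empirical_se": statistics.stdev(estimates),
        # Average of the model-based standard errors.
        "mean_estimated_se": statistics.mean(std_errors),
        # Share of intervals that contain the true value.
        "coverage": covered / len(estimates),
    }

# Toy inputs for illustration only.
result = summarise([0.4, 0.5, 0.6, 0.9], [0.1, 0.1, 0.1, 0.1], 0.5)
```

Close agreement between `empirical_se` and `mean_estimated_se`, together with coverage near 0.95, is the pattern the text describes for the second order PQL estimates of the fixed coefficients.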

More generally, when the average observed probability is very small (or very large), if many of the level 2 units have few level 1 units and there are very few level 2 units with large numbers of level 1 units, we will often find that where the response is binary, there will be many level 2 units where the responses are all zero. In such a case convergence often may not be possible and even where estimates are obtained, in general they will not be unbiased. This problem can be avoided by having a sufficient number of large level 2 units where there is adequate response heterogeneity, and in such cases we can obtain satisfactory estimates even where the average probabilities are very small or large. Further work on this issue is reported by Gol