Contents

Preface to the first edition
Preface

1 Introduction
  1.1 Background
    1.1.1 The problem of looking at data
    1.1.2 Theory as pattern
    1.1.3 Model fitting
    1.1.4 What is a good model?
  1.2 The origins of generalized linear models
    1.2.1 Terminology
    1.2.2 Classical linear models
    1.2.3 R.A. Fisher and the design of experiments
    1.2.4 Dilution assay
    1.2.5 Probit analysis
    1.2.6 Logit models for proportions
    1.2.7 Log-linear models for counts
    1.2.8 Inverse polynomials
    1.2.9 Survival data
  1.3 Scope of the rest of the book
  1.4 Bibliographic notes
  1.5 Further results and exercises 1

2 An outline of generalized linear models
  2.1 Processes in model fitting
    2.1.1 Model selection
    2.1.2 Estimation
    2.1.3 Prediction
  2.2 The components of a generalized linear model
    2.2.1 The generalization
    2.2.2 Likelihood functions
    2.2.3 Link functions
    2.2.4 Sufficient statistics and canonical links
  2.3 Measuring the goodness of fit
    2.3.1 The discrepancy of a fit
    2.3.2 The analysis of deviance
  2.4 Residuals
    2.4.1 Pearson residual
    2.4.2 Anscombe residual
    2.4.3 Deviance residual
  2.5 An algorithm for fitting generalized linear models
    2.5.1 Justification of the fitting procedure
  2.6 Bibliographic notes
  2.7 Further results and exercises 2

3 Models for continuous data with constant variance
  3.1 Introduction
  3.2 Error structure
  3.3 Systematic component (linear predictor)
    3.3.1 Continuous covariates
    3.3.2 Qualitative covariates
    3.3.3 Dummy variates
    3.3.4 Mixed terms
  3.4 Model formulae for linear predictor
    3.4.1 Individual terms
    3.4.2 The dot operator
    3.4.3 The + operator
    3.4.4 The crossing (*) and nesting (/) operators
    3.4.5 Operators for the removal of terms
    3.4.6 Exponential operator
  3.5 Aliasing
    3.5.1 Intrinsic aliasing with factors
    3.5.2 Aliasing in a two-way cross-classification
    3.5.3 Extrinsic aliasing
    3.5.4 Functional relations among covariates
  3.6 Estimation
    3.6.1 The maximum-likelihood equations
    3.6.2 Geometrical interpretation
    3.6.3 Information
    3.6.4 A model with two covariates
    3.6.5 The information surface
    3.6.6 Stability
  3.7 Tables as data
    3.7.1 Empty cells
    3.7.2 Fused cells
  3.8 Algorithms for least squares
    3.8.1 Methods based on the information matrix
    3.8.2 Direct decomposition methods
    3.8.3 Extension to generalized linear models
  3.9 Selection of covariates
  3.10 Bibliographic notes
  3.11 Further results and exercises 3

4 Binary data
  4.1 Introduction
    4.1.1 Binary responses
    4.1.2 Covariate classes
    4.1.3 Contingency tables
  4.2 Binomial distribution
    4.2.1 Genesis
    4.2.2 Moments and cumulants
    4.2.3 Normal limit
    4.2.4 Poisson limit
    4.2.5 Transformations
  4.3 Models for binary responses
    4.3.1 Link functions
    4.3.2 Parameter interpretation
    4.3.3 Retrospective sampling
  4.4 Likelihood functions for binary data
    4.4.1 Log likelihood for binomial data
    4.4.2 Parameter estimation
    4.4.3 Deviance function
    4.4.4 Bias and precision of estimates
    4.4.5 Sparseness
    4.4.6 Extrapolation
  4.5 Over-dispersion
    4.5.1 Genesis
    4.5.2 Parameter estimation
  4.6 Example
    4.6.1 Habitat preferences of lizards
  4.7 Bibliographic notes
  4.8 Further results and exercises 4

5 Models for polytomous data
  5.1 Introduction
  5.2 Measurement scales
    5.2.1 General points
    5.2.2 Models for ordinal scales
    5.2.3 Models for interval scales
    5.2.4 Models for nominal scales
    5.2.5 Nested or hierarchical response scales
  5.3 The multinomial distribution
    5.3.1 Genesis
    5.3.2 Moments and cumulants
    5.3.3 Generalized inverse matrices
    5.3.4 Quadratic forms
    5.3.5 Marginal and conditional distributions
  5.4 Likelihood functions
    5.4.1 Log likelihood for multinomial responses
    5.4.2 Parameter estimation
    5.4.3 Deviance function
  5.5 Over-dispersion
  5.6 Examples
    5.6.1 A cheese-tasting experiment
    5.6.2 Pneumoconiosis among coalminers
  5.7 Bibliographic notes
  5.8 Further results and exercises 5

6 Log-linear models
  6.1 Introduction
  6.2 Likelihood functions
    6.2.1 Poisson distribution
    6.2.2 The Poisson log-likelihood function
    6.2.3 Over-dispersion
    6.2.4 Asymptotic theory
  6.3 Examples
    6.3.1 A biological assay of tuberculins
    6.3.2 A study of wave damage to cargo ships
  6.4 Log-linear models and multinomial response models
    6.4.1 Comparison of two or more Poisson means
    6.4.2 Multinomial response models
    6.4.3 Summary
  6.5 Multiple responses
    6.5.1 Introduction
    6.5.2 Independence and conditional independence
    6.5.3 Canonical correlation models
    6.5.4 Multivariate regression models
    6.5.5 Multivariate model formulae
    6.5.6 Log-linear regression models
    6.5.7 Likelihood equations
  6.6 Example
    6.6.1 Respiratory ailments of coalminers
    6.6.2 Parameter interpretation
  6.7 Bibliographic notes
  6.8 Further results and exercises 6

7 Conditional likelihoods
  7.1 Introduction
  7.2 Marginal and conditional likelihoods
    7.2.1 Marginal likelihood
    7.2.2 Conditional likelihood
    7.2.3 Exponential-family models
    7.2.4 Profile likelihood
  7.3 Hypergeometric distributions
    7.3.1 Central hypergeometric distribution
    7.3.2 Non-central hypergeometric distribution
    7.3.3 Multivariate hypergeometric distribution
    7.3.4 Multivariate non-central distribution
  7.4 Some applications involving binary data
    7.4.1 Comparison of two binomial probabilities
    7.4.2 Combination of information from 2×2 tables
    7.4.3 Ille-et-Vilaine study of oesophageal cancer
  7.5 Some applications involving polytomous data
    7.5.1 Matched pairs: nominal response
    7.5.2 Ordinal responses
    7.5.3 Example
  7.6 Bibliographic notes
  7.7 Further results and exercises 7

8 Models with constant coefficient of variation
  8.1 Introduction
  8.2 The gamma distribution
  8.3 Models with gamma-distributed observations
    8.3.1 The variance function
    8.3.2 The deviance
    8.3.3 The canonical link
    8.3.4 Multiplicative models: log link
    8.3.5 Linear models: identity link
    8.3.6 Estimation of the dispersion parameter
  8.4 Examples
    8.4.1 Car insurance claims
    8.4.2 Clotting times of blood
    8.4.3 Modelling rainfall data using two generalized linear models
    8.4.4 Developmental rate of Drosophila melanogaster
  8.5 Bibliographic notes
  8.6 Further results and exercises 8

9 Quasi-likelihood functions
  9.1 Introduction
  9.2 Independent observations
    9.2.1 Covariance functions
    9.2.2 Construction of the quasi-likelihood function
    9.2.3 Parameter estimation
    9.2.4 Example: incidence of leaf-blotch on barley
  9.3 Dependent observations
    9.3.1 Quasi-likelihood estimating equations
    9.3.2 Quasi-likelihood function
    9.3.3 Example: estimation of probabilities from marginal frequencies
  9.4 Optimal estimating functions
    9.4.1 Introduction
    9.4.2 Combination of estimating functions
    9.4.3 Example: estimation for megalithic stone rings
  9.5 Optimality criteria
  9.6 Extended quasi-likelihood
  9.7 Bibliographic notes
  9.8 Further results and exercises 9

10 Joint modelling of mean and dispersion
  10.1 Introduction
  10.2 Model specification
  10.3 Interaction between mean and dispersion effects
  10.4 Extended quasi-likelihood as a criterion
  10.5 Adjustments of the estimating equations
    10.5.1 Adjustment for kurtosis
    10.5.2 Adjustment for degrees of freedom
    10.5.3 Summary of estimating equations for the dispersion model
  10.6 Joint optimum estimating equations
  10.7 Example: the production of leaf-springs for trucks
  10.8 Bibliographic notes
  10.9 Further results and exercises 10

11 Models with additional non-linear parameters
  11.1 Introduction
  11.2 Parameters in the variance function
  11.3 Parameters in the link function
    11.3.1 One link parameter
    11.3.2 More than one link parameter
    11.3.3 Transformation of data vs transformation of fitted values
  11.4 Non-linear parameters in the covariates
  11.5 Examples
    11.5.1 The effects of fertilizers on coastal Bermuda grass
    11.5.2 Assay of an insecticide with a synergist
    11.5.3 Mixtures of drugs
  11.6 Bibliographic notes
  11.7 Further results and exercises 11

12 Model checking
  12.1 Introduction
  12.2 Techniques in model checking
  12.3 Score tests for extra parameters
  12.4 Smoothing as an aid to informal checks
  12.5 The raw materials of model checking
  12.6 Checks for systematic departure from model
    12.6.1 Informal checks using residuals
    12.6.2 Checking the variance function
    12.6.3 Checking the link function
    12.6.4 Checking the scales of covariates
    12.6.5 Checks for compound discrepancies
  12.7 Checks for isolated departures from the model
    12.7.1 Measure of leverage
    12.7.2 Measure of consistency
    12.7.3 Measure of influence
    12.7.4 Informal assessment of extreme values
    12.7.5 Extreme points and checks for systematic discrepancies
  12.8 Examples
    12.8.1 Carrot damage in an insecticide experiment
    12.8.2 Minitab tree data
    12.8.3 Insurance claims (continued)
  12.9 A strategy for model checking
  12.10 Bibliographic notes
  12.11 Further results and exercises 12

13 Models for survival data
  13.1 Introduction
    13.1.1 Survival functions and hazard functions
  13.2 Proportional-hazards models
  13.3 Estimation with a specified survival distribution
    13.3.1 The exponential distribution
    13.3.2 The Weibull distribution
    13.3.3 The extreme-value distribution
  13.4 Example: remission times for leukaemia
  13.5 Cox's proportional-hazards model
    13.5.1 Partial likelihood
    13.5.2 The treatment of ties
    13.5.3 Numerical methods
  13.6 Bibliographic notes
  13.7 Further results and exercises 13

14 Components of dispersion
  14.1 Introduction
  14.2 Linear models