Contents of Generalized Linear Models and Extensions, Second Edition

by James W. Hardin and Joseph M. Hilbe

Generalized linear models (GLMs) extend standard linear (Gaussian) regression techniques to models with a non-Gaussian, or even discrete, response. GLM theory is predicated on the exponential family of distributions, a class so rich that it includes the commonly used logit, probit, and Poisson models. Although one can fit these models in Stata by using specialized commands (e.g., logit for logit models), fitting them under the GLM paradigm with Stata’s glm command offers the advantage of having many models under the same roof. For example, model diagnostics may be calculated and interpreted similarly regardless of the assumed distribution.

This text thoroughly covers GLMs, both theoretically and computationally. The theory covers how the various GLMs arise as special cases of the exponential family, the general properties of this family of distributions, and the derivation of maximum likelihood (ML) estimators and standard errors. The book shows how iteratively reweighted least squares, another method of parameter estimation, is a consequence of ML estimation via Fisher scoring. The authors also discuss different methods of estimating standard errors, including robust methods, robust methods with clustering, Newey–West, outer product of the gradient, bootstrap, and jackknife. The thorough coverage of model diagnostics includes measures of influence such as Cook’s distance, nine forms of residuals, the Akaike and Bayesian information criteria, and various R²-type measures of explained variability.
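The link between IRLS and Fisher scoring can be illustrated with a minimal sketch for a Poisson model with a log link. It is written in Python with NumPy rather than Stata and is not taken from the book; the function name `irls_poisson`, the starting values, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def irls_poisson(X, y, tol=1e-10, max_iter=50):
    """Fit a Poisson GLM with log link by IRLS (equivalent to Fisher scoring)."""
    n, p = X.shape
    # Assumed starting values: every fitted mean equals the sample mean of y.
    mu = np.full(n, y.mean())
    eta = np.log(mu)
    beta = np.zeros(p)
    for _ in range(max_iter):
        # Working response: z = eta + (y - mu) * d(eta)/d(mu), with d(eta)/d(mu) = 1/mu.
        z = eta + (y - mu) / mu
        # Working weights: 1 / (V(mu) * (d eta/d mu)^2) = mu for Poisson with log link.
        w = mu
        # Weighted least-squares step: solve (X' W X) beta = X' W z.
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ beta_new
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # Inverse of the expected information gives a model-based variance estimate.
    cov = np.linalg.inv((X.T * mu) @ X)
    return beta, cov

# Synthetic data (illustrative only): an intercept plus one covariate.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
X = np.column_stack([np.ones_like(x), x])
y = rng.poisson(np.exp(0.5 + 0.3 * x))

beta_hat, cov_hat = irls_poisson(X, y)
# The ML score X'(y - mu) should be numerically zero at the IRLS solution.
score = X.T @ (y - np.exp(X @ beta_hat))
```

Each pass is an ordinary weighted least-squares fit on a working response, which is why the same loop can serve other families once the weights and working response are changed to match the variance and link functions.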

After presenting general theory, the text then breaks down each distribution. Each distribution has its own chapter that discusses the computational details of applying the general theory to that particular distribution. Pseudocode plays a valuable role here, since it lets the authors describe computational algorithms relatively simply. Devoting an entire chapter to each distribution (or family in GLM terms) also allows including real-data examples showing how Stata fits such models, as well as presenting certain diagnostics and analytical strategies that are unique to that family. The chapters on binary data and on count (Poisson) data are excellent in this regard. Hardin and Hilbe give ample attention to the problems of overdispersion and zero inflation in count-data models.
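As a rough illustration of what overdispersion means for count data, the sketch below computes the Pearson dispersion statistic (Pearson chi-square divided by residual degrees of freedom) for an intercept-only Poisson fit. It is a Python example under stated assumptions, not code from the book: the helper name `pearson_dispersion` and the simulated Poisson and Poisson-gamma (negative binomial) counts are hypothetical.

```python
import numpy as np

def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square over residual df, using the Poisson variance V(mu) = mu."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    chi2 = np.sum((y - mu) ** 2 / mu)
    return chi2 / (y.size - n_params)

# Simulated counts (illustrative only), both with mean 4.
rng = np.random.default_rng(1)
n = 2000
y_pois = rng.poisson(4.0, size=n)
# Poisson-gamma mixture: gamma-distributed means with shape 2 and scale 2,
# giving counts with mean 4 and variance 4 + 4**2 / 2 = 12.
y_nb = rng.poisson(rng.gamma(shape=2.0, scale=2.0, size=n))

# For an intercept-only Poisson model the fitted mean is the sample mean.
d_pois = pearson_dispersion(y_pois, np.full(n, y_pois.mean()), n_params=1)
d_nb = pearson_dispersion(y_nb, np.full(n, y_nb.mean()), n_params=1)
```

A statistic near 1 is consistent with the Poisson variance assumption, while a value well above 1 (as for the mixture counts here) suggests overdispersion.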

The final part of the text concerns extensions of GLMs, which come in three forms. First, some chapters cover multinomial responses, both ordered and unordered. Although these models are not strictly part of GLM theory, the theory is similar in that one can think of a multinomial response as an extension of a binary response. The examples presented in these chapters often use the authors’ own Stata programs, augmenting official Stata’s capabilities. Second, GLMs may be extended to clustered data through generalized estimating equations (GEEs), and one chapter covers GEE theory and examples. Finally, GLMs may be extended by programming one’s own family and link functions for use with Stata’s official glm command, and the book covers this process.

Table of Contents

    List of Tables

    List of Figures

    List of Listings

    Preface

  1. Introduction
    1. Origins and motivation
    2. Notational conventions
    3. Applied or theoretical?
    4. Road map
    5. Installing the support materials

I Foundations of Generalized Linear Models

  2. Generalized Linear Models
    1. Components
    2. Assumptions
    3. Exponential family
    4. Example: Using an offset in a GLM
    5. Summary
  3. GLM estimation algorithms
    1. Newton–Raphson (using the observed Hessian)
    2. Starting values for Newton–Raphson
    3. IRLS (using the expected Hessian)
    4. Starting values for IRLS
    5. Goodness of fit
    6. Estimated variance matrices
      1. Hessian
      2. Outer product of the gradient (OPG)
      3. Sandwich
      4. Modified sandwich
      5. Unbiased sandwich
      6. Modified unbiased sandwich
      7. Weighted sandwich: Newey–West
      8. Jackknife
        1. Usual jackknife
        2. One-step jackknife
        3. Weighted jackknife
        4. Variable jackknife
      9. Bootstrap
        1. Usual bootstrap
        2. Grouped bootstrap
    7. Estimation algorithms
    8. Summary
  4. Analysis of fit
    1. Deviance
    2. Diagnostics
      1. Cook's distance
      2. Overdispersion
    3. Assessing the link function
    4. Checks for systematic departure from the model
    5. Residual analysis
      1. Response residuals
      2. Working residuals
      3. Pearson residuals
      4. Partial residuals
      5. Anscombe residuals
      6. Deviance residuals
      7. Adjusted deviance residuals
      8. Likelihood residuals
      9. Score residuals
    6. Model statistics
      1. Criterion measures
        1. AIC
        2. BIC
      2. The interpretation of R² in linear regression
        1. Percent variance explained
        2. The ratio of variances
        3. A transformation of the likelihood ratio
        4. A transformation of the F test
        5. Squared correlation
      3. Generalizations of linear regression R² interpretations
        1. Efron's pseudo-R²
        2. McFadden's likelihood-ratio index
        3. Ben-Akiva and Lerman adjusted likelihood-ratio index
        4. McKelvey and Zavoina ratio of variances
        5. Cragg and Uhler normed measure
      4. More R² measures
        1. The count R²
        2. The adjusted count R²
        3. Veall and Zimmermann R²
        4. Cameron–Windmeijer R²

II Continuous Response Models

  5. The Gaussian family
    1. Derivation of the GLM Gaussian family
    2. Derivation in terms of the mean
    3. IRLS GLM algorithm (nonbinomial)
    4. Maximum likelihood estimation
    5. GLM log-normal models
    6. Expected versus observed information matrix
    7. Other Gaussian links
    8. Example: Relation to OLS
    9. Example: Beta-carotene
  6. The gamma family
    1. Derivation of the gamma model
    2. Example: Reciprocal link
    3. Maximum likelihood estimation
    4. Log-gamma models
    5. Identity-gamma models
    6. Using the gamma model for survival analysis
  7. The inverse Gaussian family
    1. Derivation of the inverse Gaussian model
    2. The inverse Gaussian algorithm
    3. Maximum likelihood algorithm
    4. Example: The canonical inverse Gaussian
    5. Non-canonical links
  8. The power family and link
    1. Power links
    2. Example: Power link
    3. The power family

III Binomial Response Models

  9. The binomial-logit family
    1. Derivation of the binomial model
    2. Derivation of the Bernoulli model
    3. The binomial regression algorithm
    4. Example: Logistic regression
      1. Model producing logistic coefficients: The heart data
      2. Model producing logistic odds ratios
    5. GOF statistics
    6. Interpretation of parameter estimates
  10. The general binomial family
    1. Non-canonical binomial models
    2. Non-canonical binomial links (binary form)
    3. The probit model
    4. The clog-log and log-log models
    5. Other links
    6. Interpretation of coefficients
      1. Identity link
      2. Logit link
      3. Log link
      4. Log complement link
      5. Summary
    7. Generalized binomial regression
  11. The problem of overdispersion
    1. Overdispersion
    2. Scaling of standard errors
    3. Williams' procedure
    4. Robust standard errors

IV Count Response Models

  12. The Poisson family
    1. Count response regression models
    2. Derivation of the Poisson algorithm
    3. Poisson regression: Examples
    4. Example: Testing overdispersion in the Poisson model
    5. Using the Poisson model for survival analysis
    6. Using offsets to compare models
    7. Interpretation of coefficients
  13. The negative binomial family
    1. Constant overdispersion
    2. Variable overdispersion
      1. Derivation in terms of a Poisson–gamma mixture
      2. Derivation in terms of the negative binomial probability function
      3. The canonical link negative binomial parameterization
    3. The log-negative binomial parameterization
    4. Negative binomial examples
    5. The geometric family
    6. Interpretation of coefficients
  14. Other count data models
    1. Count response regression models
    2. Zero-truncated models
    3. Zero-inflated models
    4. Hurdle models
    5. Heterogeneous negative binomial models
    6. Generalized Poisson regression models
    7. Censored count response models

V Multinomial Response Models

  15. The ordered response family
    1. Ordered outcomes for general link
    2. Ordered outcomes for specific links
      1. Ordered logit
      2. Ordered probit
      3. Ordered clog-log
      4. Ordered log-log
      5. Ordered cauchit
    3. Generalized ordered outcome models
    4. Example: Synthetic data
    5. Example: Automobile data
    6. Partial proportional-odds models
    7. Continuation ratio models
  16. Unordered response family
    1. The multinomial logit model
      1. Example: Relation to logistic regression
      2. Example: Relation to conditional logistic regression
      3. Example: Extensions with conditional logistic regression
      4. The independence of irrelevant alternatives
      5. Example: Assessing the IIA
      6. Interpreting coefficients
      7. Example: Medical admissions—introduction
      8. Example: Medical admissions—summary
    2. The multinomial probit model
      1. Example: A comparison of the models
      2. Example: Comparing probit and multinomial probit
      3. Example: Concluding remarks

VI Extensions to the GLM

  17. Extending the likelihood
    1. The quasi-likelihood
    2. Example: Wedderburn's leaf blotch data
    3. Generalized additive models
  18. Clustered data
    1. Generalization from individual to clustered data
    2. Pooled estimators
      1. Fixed effects
        1. Unconditional fixed-effects estimators
        2. Conditional fixed-effects estimators
      2. Random effects
        1. Maximum likelihood estimation
        2. Gibbs sampling
      3. GEEs
      4. Other models

VII Stata Software

  19. Programs for Stata
    1. The glm command
      1. Syntax
      2. Description
      3. Options
    2. The predict command after glm
      1. Syntax
      2. Options
    3. User-written programs
      1. Global macros available for user-written programs
      2. User-written variance functions
      3. User-written programs for link functions
      4. User-written programs for Newey–West weights
    4. Remarks
      1. Equivalent commands
      2. Special comments on family(Gaussian) models
      3. Special comments on family(binomial) models
      4. Special comments on family(nbinomial) models
      5. Special comment on family(gamma) link(log) models

A Tables

References

Author index

Subject index


 
Copyright © 2008 TStat. All rights reserved. Via Rettangolo, 12/14 - 67039 - Sulmona (AQ) - Italia