A Gentle Introduction to Stata

Alan C. Acock’s A Gentle Introduction to Stata, Sixth Edition is aimed at new users who want to become proficient in Stata. After reading this introductory text, they will be able not only to use Stata well but also to learn new aspects of it.

 

Acock assumes that the user is not familiar with any statistical software. This assumption of a blank slate is central to the structure and contents of the book. Acock starts with the basics; for example, the part of the book that deals with data management begins with a careful and detailed example of turning survey data on paper into a Stata-ready dataset on the computer. When explaining basic exploratory statistical procedures, Acock includes notes that help the reader develop good work habits. This mixture of good Stata habits and good statistical habits continues throughout the book.

 

Acock is quite careful to teach the reader all aspects of using Stata. He covers data management, good work habits (including the use of basic do-files), basic exploratory statistics (including graphical displays), and analyses using the standard array of basic statistical tools (correlation, linear and logistic regression, and parametric and nonparametric tests of location and dispersion). He also successfully introduces some more advanced topics such as multiple imputation and multilevel modeling in a very approachable manner. Acock teaches Stata commands by using the menus and dialog boxes while still stressing the value of do-files. In this way, he ensures that all types of users can build good work habits. Each chapter has exercises that the motivated reader can use to reinforce the material.

 

The tone of the book is friendly and conversational without ever being glib or condescending. Important asides and notes about terminology are set off in boxes, which makes the text easy to read without any convoluted twists or forward referencing. Rather than splitting topics by their Stata implementation, Acock arranges the topics as they would appear in a basic statistics textbook; graphics and postestimation are woven into the material in a natural fashion. Real datasets, such as the General Social Surveys from 2002, 2006, and 2016, are used throughout the book.

 

The focus of the book is especially helpful for those in the behavioral and social sciences because the presentation of basic statistical modeling is supplemented with discussions of effect sizes and standardized coefficients. Various criteria for model selection, such as semipartial correlations, are discussed. Acock also covers a variety of commands available for evaluating the reliability and validity of measurements.

 

The sixth edition incorporates new features of Stata 15. All menus, dialog boxes, and instructions for using the point-and-click interface have been updated. Power-and-sample-size calculations for linear regression are demonstrated using Stata 15’s new power rsquared command. This edition also includes new sections that describe how to evaluate convergent and discriminant validity, how to compute effect sizes for t tests and ANOVA models, how to use margins and marginsplot to interpret results of linear and logistic regression models, and how to use full-information maximum-likelihood (FIML) estimation with SEM to address problems with missing data.

List of figures
List of tables
List of boxed tips
Preface
Support materials for the book
Glossary of acronyms
Glossary of mathematical and statistical symbols

 

1. GETTING STARTED

Conventions
Introduction
The Stata screen
Using an existing dataset
An example of a short Stata session
Video aids to learning Stata
Summary
Exercises

 

2. ENTERING DATA

Creating a dataset
An example questionnaire
Developing a coding system
Entering data using the Data Editor

Value labels

The Variables Manager
The Data Editor (Browse) view
Saving your dataset
Checking the data
Summary
Exercises

 

3. PREPARING DATA FOR ANALYSIS

Introduction
Planning your work
Creating value labels
Reverse-code variables
Creating and modifying variables
Creating scales
Save some of your data
Summary
Exercises

 

4. WORKING WITH COMMANDS, DO-FILES, AND RESULTS

Introduction
How Stata commands are constructed
Creating a do-file
Copying your results to a word processor
Logging your command file
Summary
Exercises

 

5. DESCRIPTIVE STATISTICS AND GRAPHS FOR ONE VARIABLE

Descriptive statistics and graphs
Where is the center of a distribution?
How dispersed is the distribution?
Statistics and graphs—unordered categories
Statistics and graphs—ordered categories and variables
Statistics and graphs—quantitative variables
Summary
Exercises

 

6. STATISTICS AND GRAPHS FOR TWO CATEGORICAL VARIABLES

Relationship between categorical variables
Cross-tabulation
Chi-squared test

Degrees of freedom
Probability tables

Percentages and measures of association
Odds ratios when dependent variable has two categories
Ordered categorical variables
Interactive tables
Tables—linking categorical and quantitative variables
Power analysis when using a chi-squared test of significance
Summary
Exercises

 

7. TESTS FOR ONE OR TWO MEANS

Introduction to tests for one or two means
Randomization
Random sampling
Hypotheses
One-sample test of a proportion
Two-sample test of a proportion
One-sample test of means
Two-sample test of group means

Testing for unequal variances

Repeated-measures t test
Power analysis
Nonparametric alternatives

Mann–Whitney two-sample rank-sum test
Nonparametric alternative: Median test

Video tutorial related to this chapter
Summary
Exercises

 

8. BIVARIATE CORRELATION AND REGRESSION

Introduction to bivariate correlation and regression
Scattergrams
Plotting the regression line
An alternative to producing a scattergram, binscatter
Correlation
Regression
Spearman’s rho: Rank-order correlation for ordinal data
Power analysis with correlation
Summary
Exercises

 

9. ANALYSIS OF VARIANCE

The logic of one-way analysis of variance
ANOVA example
ANOVA example with nonexperimental data
Power analysis for one-way ANOVA
A nonparametric alternative to ANOVA
Analysis of covariance
Two-way ANOVA
Repeated-measures design
Intraclass correlation—measuring agreement
Power analysis with ANOVA

Power analysis for one-way ANOVA
Power analysis for two-way ANOVA
Power analysis for repeated-measures ANOVA
Summary of power analysis for ANOVA

Summary
Exercises

 

10. MULTIPLE REGRESSION

Introduction to multiple regression
What is multiple regression?
The basic multiple regression command
Increment in R-squared: Semipartial correlations
Is the dependent variable normally distributed?
Are the residuals normally distributed?
Regression diagnostic statistics

Outliers and influential cases
Influential observations: DFbeta
Combinations of variables may cause problems

Weighted data
Categorical predictors and hierarchical regression
A shortcut for working with a categorical variable
Fundamentals of interaction
Nonlinear relations

Fitting a quadratic model
Centering when using a quadratic term
Do we need to add a quadratic component?

Power analysis in multiple regression
Summary
Exercises

 

11. LOGISTIC REGRESSION

Introduction to logistic regression
An example
What is an odds ratio and a logit?

The odds ratio
The logit transformation

Data used in the rest of the chapter
Logistic regression
Hypothesis testing

Testing individual coefficients
Testing sets of coefficients

Margins: More on interpreting results from logistic regression
Nested logistic regressions
Power analysis when doing logistic regression
Next steps for using logistic regression and its extensions
Summary
Exercises

 

12. MEASUREMENT, RELIABILITY, AND VALIDITY

Overview of reliability and validity
Constructing a scale

Generating a mean score for each person

Reliability

Stability and test–retest reliability
Equivalence
Split-half and alpha reliability—internal consistency
Kuder–Richardson reliability for dichotomous items
Rater agreement—kappa (κ)

Validity

Expert judgment
Criterion-related validity
Construct validity

Factor analysis
PCF analysis

Orthogonal rotation: Varimax
Oblique rotation: Promax

But we wanted one scale, not four scales

Scoring our variable

Summary
Exercises

 

13. STRUCTURAL EQUATION AND GENERALIZED STRUCTURAL EQUATION MODELING

Linear regression using sem

Using the sem command directly
SEM and working with missing values
Exploring missing values and auxiliary variables
Getting auxiliary variables into your SEM command

A quick way to draw a regression model
The gsem command for logistic regression

Fitting the model using the logit command
Fitting the model using the gsem command

Path analysis and mediation
Conclusions and what is next for the sem command
Exercises

 

14. WORKING WITH MISSING VALUES – MULTIPLE IMPUTATION

Working with missing values—multiple imputation
What variables do we include when doing imputations?
The nature of the problem
Multiple imputation and its assumptions about the mechanism for missingness
Multiple imputation

A detailed example

Preliminary analysis
Setup and multiple-imputation stage
The analysis stage
For those who want an R² and standardized βs
When impossible values are imputed

Summary
Exercises

 

15. AN INTRODUCTION TO MULTILEVEL ANALYSIS 

Questions and data for groups of individuals
Questions and data for a longitudinal multilevel application
Fixed-effects regression models
Random-effects regression models
An applied example

Research questions
Reshaping data to do multilevel analysis

A quick visualization of our data
Random-intercept model
Random intercept—linear model
Random-intercept model—quadratic term
Treating time as a categorical variable
Random-coefficients model
Including a time-invariant covariate
Summary
Exercises

 

16. ITEM RESPONSE THEORY (IRT)

How are IRT measures of variables different from summated scales?
Overview of three IRT models for dichotomous items

The one-parameter logistic (1PL) model
The two-parameter logistic (2PL) model
The three-parameter logistic (3PL) model

Fitting the 1PL model using Stata

The estimation
How important is each of the items?
An overall evaluation of our scale
Estimating the latent score

Fitting a 2PL IRT model

Fitting the 2PL model

The graded response model—IRT for Likert-type items

The data
Fitting our graded response model
Estimating a person’s score

Reliability of the fitted IRT model
Using the Stata menu system
Extensions of IRT
Exercises

 

A. WHAT’S NEXT?

Introduction to the appendix
Resources

Web resources
Books about Stata
Short courses
Acquiring data
Learning from the postestimation methods

Summary

Author: Alan C. Acock
Edition: Sixth Edition
ISBN-13: 978-1-59718-269-0
Copyright: © 2018
E-book version available


What’s new in this edition:

  • New section discussing how to account for missing values using full-information maximum-likelihood (FIML) estimation in SEM
  • Introduction to using Stata 15’s new power rsquared command for power-and-sample-size calculations for linear regression
  • New section demonstrating how to evaluate convergent and discriminant validity of measures
  • More examples and explanation of effect-size calculations for t tests and ANOVA
  • Additional examples of interpreting model results using margins and marginsplot
  • Screenshots of menus, dialogs, and interface updated for Stata 15

 

This book is of particular interest for:

  • Those wanting to learn Stata, from data entry and data management to analysis and graphics
  • Users who are unfamiliar with statistical packages
  • Teachers of introductory statistics courses, especially in the social or behavioral sciences
  • Anyone wanting to become proficient in Stata