A Gentle Introduction to Stata - Second Edition
by Alan C. Acock
Alan Acock’s A Gentle Introduction to Stata, Second Edition, is aimed at new Stata users who want to become proficient in Stata. After reading this introductory text, new users will be able not only to use Stata well but also to learn new aspects of Stata easily.
Acock assumes that the user is not familiar with any statistical software. This assumption of a blank slate is central to the structure and contents of the book. Acock starts with the basics; for example, the portion of the book that deals with data management begins with a careful and detailed example of turning survey data on paper into a Stata-ready dataset on the computer. When explaining how to go about basic exploratory statistical procedures, Acock includes notes that should help the reader develop good work habits. This mixture of explaining good Stata habits and good statistical habits continues throughout the book.
Acock is quite careful to teach the reader all aspects of using Stata. He covers data management, good work habits (including the use of basic do-files), basic exploratory statistics (including graphical displays), and analyses using the standard array of basic statistical tools (correlation, linear and logistic regression, and parametric and nonparametric tests of location and dispersion). Acock teaches Stata commands by using the menus and dialog boxes while still stressing the value of do-files. In this way, he ensures that all types of users can build good work habits. Each chapter has exercises that the motivated reader can use to reinforce the material.
The tone of the book is friendly and conversational without ever being glib or condescending. Important asides and notes about terminology are set off in boxes, which makes the text easy to read without any convoluted twists or forward-referencing. Rather than splitting topics by their Stata implementation, Acock chose to arrange the topics as they would be in a basic statistics textbook; graphics and postestimation are woven into the material in a natural fashion. Real datasets, such as the General Social Surveys from 2002 and 2006, are used throughout the book.
The focus of the book is especially helpful for those in psychology and the social sciences, because the presentation of basic statistical modeling is supplemented with discussions of effect sizes and standardized coefficients. Various criteria, such as semipartial correlations, are discussed for model selection.
The second edition of the book has been updated to reflect new features in Stata 10 and includes a new chapter on the use of factor analysis to develop valid, reliable scale measures.
Table of contents
List of figures
Support materials for the book
1 Getting started
- 1.1 Conventions
- Introduction
- 1.2 The Stata screen
- 1.3 Using an existing dataset
- 1.4 An example of a short Stata session
- 1.5 Summary
- 1.6 Exercises
2 Entering data
- 2.1 Creating a dataset
- 2.2 An example questionnaire
- 2.3 Develop a coding system
- 2.4 Entering data
- 2.4.1 Labeling values
- 2.5 Saving your dataset
- 2.6 Checking the data
- 2.7 Summary
- 2.8 Exercises
3 Preparing data for analysis
- 3.1 Introduction
- 3.2 Plan your work
- 3.3 Create value labels
- 3.4 Reverse-code variables
- 3.5 Create and modify variables
- 3.6 Create scales
- 3.7 Save some of your data
- 3.8 Summary
- 3.9 Exercises
4 Working with commands, do-files, and results
- 4.1 Introduction
- 4.2 How Stata commands are constructed
- 4.3 Getting the command from the menu system
- 4.4 Saving your results
- 4.5 Logging your command file
- 4.6 Summary
- 4.7 Exercises
5 Descriptive statistics and graphs for a single variable
- 5.1 Descriptive statistics and graphs
- 5.2 Where is the center of a distribution?
- 5.3 How dispersed is the distribution?
- 5.4 Statistics and graphs—unordered categories
- 5.5 Statistics and graphs—ordered categories and variables
- 5.6 Statistics and graphs—quantitative variables
- 5.7 Summary
- 5.8 Exercises
6 Statistics and graphs for two categorical variables
- 6.1 Relationship between categorical variables
- 6.2 Cross-tabulation
- 6.3 Chi-squared
- 6.3.1 Degrees of freedom
- 6.3.2 Probability tables
- 6.4 Percentages and measures of association
- 6.5 Ordered categorical variables
- 6.6 Interactive tables
- 6.7 Tables—linking categorical and quantitative variables
- 6.8 Summary
- 6.9 Exercises
7 Tests for one or two means
- 7.1 Introduction to tests for one or two means
- 7.2 Randomization
- 7.3 Random sampling
- 7.4 Hypotheses
- 7.5 One-sample test of a proportion
- 7.6 Two-sample test of a proportion
- 7.7 One-sample test of means
- 7.8 Two-sample test of group means
- 7.8.1 Testing for unequal variances
- 7.9 Repeated-measures t test
- 7.10 Power analysis
- 7.11 Nonparametric alternatives
- 7.11.1 Mann–Whitney two-sample rank-sum test
- 7.11.2 Nonparametric alternative: median test
- 7.12 Summary
- 7.13 Exercises
8 Bivariate correlation and regression
- 8.1 Introduction to bivariate correlation and regression
- 8.2 Scattergrams
- 8.3 Plotting the regression line
- 8.4 Correlation
- 8.5 Regression
- 8.6 Spearman's rho: rank-order correlation for ordinal
- 8.7 Summary
- 8.8 Exercises
9 Analysis of variance (ANOVA)
- 9.1 The logic of one-way analysis of variance
- 9.2 ANOVA example
- 9.3 ANOVA example using survey data
- 9.4 A nonparametric alternative to ANOVA
- 9.5 Analysis of covariance
- 9.6 Two-way ANOVA
- 9.7 Repeated-measures design
- 9.8 Intraclass correlation—measuring agreement
- 9.9 Summary
- 9.10 Exercises
10 Multiple regression
- 10.1 Introduction to multiple regression
- 10.2 What is multiple regression?
- 10.3 The basic multiple regression command
- 10.4 Increment in R-squared: semipartial correlations
- 10.5 Is the dependent variable normally distributed?
- 10.6 Are the residuals normally distributed?
- 10.7 Regression diagnostic statistics
- 10.7.1 Outliers and influential cases
- 10.7.2 Influential observations: DFbeta
- 10.7.3 Combinations of variables may cause problems
- 10.8 Weighted data
- 10.9 Categorical predictors and hierarchical regression
- 10.10 Fundamentals of interaction
- 10.11 Summary
- 10.12 Exercises
11 Logistic regression
- 11.1 Introduction
- 11.2 An example
- 11.3 What are an odds ratio and a logit?
- 11.3.1 The odds ratio
- 11.3.2 The logit transformation
- 11.4 Data used in rest of chapter
- 11.5 Logistic regression
- 11.6 Hypothesis testing
- 11.6.1 Testing individual coefficients
- 11.6.2 Testing sets of coefficients
- 11.7 Nested logistic regressions
- 11.8 Summary
- 11.9 Exercises
12 Measurement, reliability, and validity
- 12.1 Overview of reliability and validity
- 12.2 Constructing a scale
- 12.2.1 Generating a mean score for each person
- 12.3 Reliability
- 12.3.1 Stability and test-retest reliability
- 12.3.2 Equivalence
- 12.3.3 Split-half and alpha reliability—internal consistency
- 12.3.4 Kuder–Richardson reliability for dichotomous items
- 12.3.5 Rater agreement—kappa (K)
- 12.4 Validity
- 12.4.1 Expert judgement
- 12.4.2 Criterion-related validity
- 12.4.3 Construct validity
- 12.5 Factor analysis
- 12.6 PCF analysis
- 12.6.1 Orthogonal rotation: varimax
- 12.6.2 Oblique rotation: promax
- 12.7 But we wanted one scale, not four scales
- 12.7.1 Scoring our variable
- 12.8 Summary
- 12.9 Exercises
13 Appendix What's next?
- 13.1 Introduction to the appendix
- 13.2 Resources
- 13.2.1 Web resources
- 13.2.2 Books on Stata
- 13.2.3 Short courses
- 13.2.4 Acquiring data
- 13.3 Summary
References
Author index
Subject index

