Data Analysis Using Stata, Second Edition - By Ulrich Kohler and Frauke Kreuter

Comment from the Stata Technical group

Updated to include changes to Stata over the past several years, Data
Analysis Using Stata, Second Edition comprehensively introduces Stata
and will be useful to those who are just learning statistics and Stata,
as well as to users of other statistical packages who are making the
switch to Stata. Throughout the book, Kohler and Kreuter show examples
using data from the German Socioeconomic Panel, a large survey of
households containing demographic, income, employment, and other key
information. This new edition also describes the Graph Editor and
time-of-day variables, two features added in Stata 10.

Kohler and Kreuter’s book is a valuable
introduction to Stata. The authors take a hands-on approach, leading
you step by step through actual Stata sessions to answer practical
questions commonly asked by social scientists.

They begin with an introduction to the Stata
interface and then proceed with a description of Stata syntax and
simple programming tools like foreach loops. The core of the book
includes chapters on producing tables and graphs, performing linear
regression, and using logistic regression. Kohler and Kreuter use
multiple examples to illustrate all key concepts.

The rest of the book includes chapters on reading
text files, writing programs and ado-files, and using Internet
resources, such as the search command and the SSC archive.

Table of Contents
List of Tables
List of Figures
- "The first time"
- 1.1 Starting Stata
- 1.2 Setting up your screen
- 1.3 Your first analysis
- 1.3.1 Inputting commands
- 1.3.2 Files and the working memory
- 1.3.3 Loading data
- 1.3.4 Variables and observations
- 1.3.5 Looking at data
- 1.3.6 Interrupting a command and repeating a command
- 1.3.7 The variable list
- 1.3.8 The in qualifier
- 1.3.9 Summary statistics
- 1.3.10 The if qualifier
- 1.3.11 Define missing values
- 1.3.12 The by prefix
- 1.3.13 Command options
- 1.3.14 Frequency tables
- 1.3.15 Variable labels and value labels
- 1.3.16 Graphs
- 1.3.17 Getting help
- 1.3.18 Recoding of variables
- 1.3.19 Linear regression
- 1.4 Do-files
- 1.5 Exiting Stata
- 1.6 Exercises
- Working with do-files
- 2.1 From interactive work to working with a do-file
- 2.1.1 Alternative 1
- 2.1.2 Alternative 2
- 2.2 Designing do-files
- 2.2.1 Comments
- 2.2.2 Line breaks
- 2.2.3 Some crucial commands
- 2.3 Organizing your work
- 2.4 Exercises Summary
- The grammar of Stata
- 3.1 The elements of Stata commands
- 3.1.1 Stata commands
- 3.1.2 The variable list
- List of variables: required or optional
- Abbreviation rules
- Special listings
- 3.1.3 Options
- 3.1.4 The in qualifier
- 3.1.5 The if qualifier
- 3.1.6 Expressions
- Operators
- Functions
- 3.1.7 Lists of numbers
- 3.1.8 Using filenames
- 3.2 Repeating similar commands
- 3.2.1 The by prefix
- 3.2.2 The foreach loop
- The types of foreach lists
- Several commands within a foreach loop
- 3.2.3 The forvalues loop
- 3.3 Weights
- Frequency weights
- Analytic weights
- Probability weights
- 3.4 Exercises
- General comments on the statistical commands
- 4.1 Exercises
- Creating and changing variables
- 5.1 The commands generate and replace
- 5.1.1 Variable names
- 5.1.2 Some examples
- 5.1.3 Changing codes with by, _n, and _N
- 5.1.4 Subscripts
- 5.2 Specialized recoding commands
- 5.2.1 The recode command
- 5.2.2 The egen command
- 5.3 More tools for recoding data
- 5.3.1 String functions
- 5.3.2 Date and time functions
- Dates
- Times
- 5.4 Commands for dealing with missing values
- 5.5 Labels
- 5.6 Storage types, or, the ghost in the machine
- 5.7 Exercises
- Creating and changing graphs
- 6.1 A primer on graph syntax
- 6.2 Graph types
- 6.2.1 Examples
- 6.2.2 Specialized graphs
- 6.3 Graph elements
- 6.3.1 Appearance of data
- Choice of marker
- Marker colors
- Marker size
- Lines
- 6.3.2 Graphs and plot regions
- Graph size
- Plot region
- Scaling the axes
- 6.3.3 Information inside the plot region
- Reference lines
- Labeling inside the plot region
- 6.3.4 Information outside the plot region
- Labeling the axes
- Tick lines
- Axis titles
- The legend
- Graph titles
- 6.4 Multiple graphs
- 6.4.1 Overlaying numerous twoway graphs
- 6.4.2 Option by()
- 6.4.3 Combining graphs
- 6.5 Saving and printing graphs
- 6.6 Exercises
- Describing and comparing distributions
- 7.1 Categories: Few or many?
- 7.2 Variables with few categories
- 7.2.1 Tables
- Frequency tables
- More than one frequency table
- Comparing distributions
- Summary statistics
- More than one contingency table
- 7.2.2 Graphs
- Histograms
- Bar charts
- Dot chart
- 7.3 Variables with many categories
- 7.3.1 Frequencies of grouped data
- Some remarks on grouping data
- Special techniques for grouping data
- 7.3.2 Describing data using statistics
- Important summary statistics
- The summarize command
- The tabstat command
- Comparing distributions using statistics
- 7.3.3 Graphs
- Box plots
- Histograms
- Kernel density estimation
- Quantile plot
- Comparing distributions with Q–Q plots
- 7.4 Exercises
- Introduction to linear regression
- 8.1 Simple linear regression
- 8.1.1 The basic principle
- 8.1.2 Linear regression using Stata
- The table of coefficients
- Standard errors
- The table of ANOVA results
- The model fit table
- 8.2 Multiple regression
- 8.2.1 Multiple regression using Stata
- 8.2.2 Additional components
- Adjusted R²
- Standardized regression coefficients
- 8.2.3 What does "under control" mean?
- 8.3 Regression diagnostics
- 8.3.1 Violation of E(εᵢ) = 0
- Linearity
- Influential cases
- Omitted variables
- 8.3.2 Violation of Var(εᵢ) = σ²
- 8.3.3 Violation of Cov(εᵢ, εⱼ) = 0, i ≠ j
- 8.4 Model extensions
- 8.4.1 Categorical independent variables
- 8.4.2 Interaction terms
- 8.4.3 Regression models using transformed variables
- Nonlinear relations
- Eliminating heteroskedasticity
- 8.5 More on standard errors
- 8.5.1 Bootstrap techniques
- 8.5.2 Confidence intervals on cluster samples
- 8.6 Advanced techniques
- 8.6.1 Median regression
- 8.6.2 Regression models for panel data
- From wide to long format
- Fixed-effects models
- 8.6.3 Error-component models
- 8.7 Exercises
- Regression models for categorical dependent variables
- 9.1 The linear probability model
- 9.2 Basic concepts
- 9.2.1 Odds, log odds, and odds ratios
- 9.2.2 Excursion: The maximum likelihood principle
- 9.3 Logistic regression with Stata
- 9.3.1 The coefficients block
- Sign interpretation
- Interpretation with odds ratios
- Probability interpretation
- 9.3.2 The iteration block
- 9.3.3 The model fit block
- Classification tables
- Pearson chi-squared
- 9.4 Logistic regression diagnostics
- 9.4.1 Linearity
- 9.4.2 Influential cases
- 9.5 Likelihood-ratio test
- 9.6 Refined models
- 9.6.1 Nonlinear relationships
- 9.6.2 Categorical independent variables
- 9.6.3 Interaction effects
- 9.7 Advanced techniques
- 9.7.1 Probit models
- 9.7.2 Multinomial logistic regression
- 9.7.3 Models for ordinal data
- 9.8 Exercises
- Reading and writing data
- 10.1 The goal: The data matrix
- 10.2 Importing machine-readable data
- 10.2.1 Reading system files from other packages
- 10.2.2 Reading ASCII text files
- Reading data in spreadsheet format
- Reading data in free format
- Reading data in fixed format
- 10.3 Inputting data
- 10.3.1 Input data using the editor
- 10.3.2 The input command
- 10.4 Combining data
- 10.4.1 The GSOEP database
- 10.4.2 The merge command
- The merge procedure
- Keeping track of observations
- Merging more than two files
- Merging data on different levels
- 10.4.3 The append command
- 10.5 Saving and exporting data
- 10.6 Handling big datasets
- 10.6.1 Rules for handling the working memory
- 10.6.2 Using oversized datasets
- 10.7 Exercises
- Do-files for advanced users and user-written programs
- 11.1 Two examples of usage
- 11.2 Four programming tools
- 11.2.1 Local macros
- Calculating with local macros
- Combining local macros
- Changing local macros
- 11.2.2 Do-files
- 11.2.3 Programs
- The problem of redefinition
- The problem of naming
- The problem of error checking
- 11.2.4 Programs in do-files and ado-files
- 11.3 User-written Stata commands
- 11.3.1 Parsing variable lists
- 11.3.2 Parsing options
- 11.3.3 Parsing if and in qualifiers
- 11.3.4 Generating an unknown number of variables
- 11.3.5 Default values
- 11.3.6 Extended macro functions
- 11.3.7 Avoiding changes in the dataset
- 11.3.8 Help files
- 11.4 Exercises
- Around Stata
- 12.1 Resources and information
- 12.2 Taking care of Stata
- 12.3 Additional procedures
- 12.3.1 SJ and STB ado-files
- 12.3.2 SSC ado-files
- 12.3.3 Other ado-files
- 12.4 Exercises
References
Authors Index
Subject Index

