TStat’s Analysing Micro Data in Stata Summer School offers participants a comprehensive introduction to the principal methodologies used in the analysis of micro data. Micro data contains information at the level of a specific unit (such as individuals, firms or entities). By its very nature, micro data has become an increasingly important source of information, offering researchers and policy makers an effective tool with which to obtain a more in-depth understanding of an array of political, socio-economic and public health phenomena. The collection and subsequent analysis of micro data has therefore proved, over recent years, to be key to policy formulation, the targeting of interventions, and the subsequent monitoring and measurement of the impact of such interventions and policies.
Although micro data analysis techniques were originally developed and applied in the field of economics, the increasing availability of micro data has resulted in a steady increase in the analysis of such data by researchers working in Political and Social Sciences, Biostatistics, Education, Epidemiology and Public Health.
Throughout the course of the Summer School, the course leaders will focus, from both a theoretical and applied point of view, on the principal methodologies implemented for the analysis of both cross-section and panel data: linear models, count models, binary dependent variable models, multinomial models, Tobit and Interval Regression models, models with Sample Selection, and estimation of Extended Regression Models (ERM), which implement Maximum Likelihood estimators capable of simultaneously treating issues of Sample Selection and the presence of both endogenous regressors and treatment variables.
The school opens with an optional one-day introductory course on the statistical package Stata (Module A), during which participants will be provided with the tools they need to use Stata independently and to participate actively in the applied empirical Lab sessions held during the week.
In keeping with TStat’s training philosophy, the summer school is composed of both a theoretical component (in which the techniques and the principles underlying them are explained) and an applied (hands-on) segment, during which participants have the opportunity to implement the techniques using real data under the watchful eye of the course tutor. Theoretical sessions are reinforced by case study examples, in which the course tutor discusses and highlights the potential pitfalls and advantages of individual techniques. The intuition behind the choice and implementation of a specific technique is of the utmost importance. In this manner, the course leader is able to bridge the often difficult gap between abstract theoretical methodologies and the practical issues one encounters when dealing with real data. Special attention is also given to the interpretation and presentation of results.
At the end of the course, participants are expected to be able to implement the methodologies and techniques acquired during the course independently, by adapting the Stata routines used during the sessions to their own particular research needs.
The Summer School program has been developed specifically for doctoral students and young researchers working in biostatistics, business, economics, epidemiology, finance, public health, psychology, and the social and political sciences who need to acquire the toolset required to conduct empirical analysis using micro data independently, but who may not have access to a specific micro data analysis course at their home institution. It is, however, also particularly useful to professionals working in one of these fields who need either to refresh their existing micro data skills or to acquire new ones.
It is assumed that course participants have at some point followed a basic university course in econometrics or statistics and are thus comfortable with the topics covered in chapters 1-9 of J.M. Wooldridge, Introductory Econometrics: A Modern Approach, 5th edition, South-Western College Pub, 2013.
Previous exposure to Stata would also be an advantage. Participants with no previous knowledge of Stata are, however, strongly encouraged to follow the Introduction to Stata course (Module A) offered at the beginning of the School.
MODULE A | DAY 1 | STATA IN JUST ONE DAY!
SESSION I: INTRODUCTION – GETTING STARTED
File types in Stata
Working interactively in Stata
Saving output: the log file
Loading Stata databases
The Log Output File
Saving databases in Stata
Exiting the software
SESSION II: PRELIMINARY DATA ANALYSIS
A preliminary look at the data: describe, summarize commands
Abbreviations in Stata
Statistical Tables: table, tabstat and tabulate commands
SESSION III: DATA MANAGEMENT
Selecting or eliminating variables
The count command
Creating sub-groups: the prefix by
Creating new variables: generate
Operators in Stata
The command assert
Missing values in Stata
Modifying variables: replace, recode
Creating Labels: variable labels and value labels
Creating dummy variables
SESSION IV: IMPORTING DATA FROM SPREADSHEETS
The import excel and export excel commands
The insheet and outsheet commands
Reading in Text Data Files
Issues to watch out for when importing data
Redefining missing values
Dealing with “messy” strings
SESSION V: GRAPHICS – A BRIEF INTRODUCTION
Stata’s syntax for two-way graphs
Saving and exporting graphs
Useful graph commands
Personalizing a graph
Stata’s Graph Editor
APPENDIX B: MORE ADVANCED ISSUES (time permitting)
Merging data bases
e-class and r-class variables
MODULE B | DAY 2 | LINEAR REGRESSION MODELS
SESSION I: THE LINEAR MODEL WITH EXOGENOUS REGRESSORS
The Ordinary Least Squares (OLS) Estimator: regress
Specification tests and tests for robust inference: estat imtest, estat hettest, estat bgodfrey, actest
SESSION II: THE LINEAR MODEL WITH ENDOGENOUS REGRESSORS
IV and GMM Estimators: ivregress, gmm
Specification tests and tests for robust inference: ivhettest, actest, estat overid, estat endogenous, estat firststage, weakivtest
MODULE B | DAYS 3-4 | LINEAR PANEL DATA REGRESSION MODELS
SESSION I: PANEL DATA IN STATA – SOME BASIC CONCEPTS
Panel Data structures in Stata
Time Series Operators in Stata
The advantages of Panel Data for applied micro data analysis
SESSION II: LINEAR PANEL DATA MODELS WITH EXOGENOUS VARIABLES
One-way and two-way fixed effect estimators: xtreg, fe
Random Effects Estimators: xtreg, re; xtmixed
SESSION III: LINEAR PANEL DATA MODELS WITH EXOGENOUS VARIABLES: ROBUST INFERENCE
Robust covariance estimators
The first-difference estimator
Testing for non i.i.d. errors
Testing Random Effects against Fixed Effects:
non-robust approach using Hausman
robust approach using the Mundlak auxiliary regression (Wooldridge, 2010)
SESSION IV: LINEAR PANEL DATA MODELS WITH ENDOGENOUS VARIABLES
Fixed and Random Effect IV Estimators: xtivreg
Hausman and Taylor’s estimator: xthtaylor
MODULE B | DAYS 5-7 | NON-LINEAR REGRESSION MODELS
SESSION I: COUNT MODEL ESTIMATORS
The Poisson Model: poisson, nl, gmm
The Poisson Model with endogenous regressors: ivpoisson, gmm
Estimation and tests in the presence of overdispersion (the negative binomial regression model): nbreg
Estimation and interpretation of marginal effects using Stata’s post estimation command margins
Fixed and Random Panel Data Estimators: xtpoisson, xtnbreg
SESSION II: DISCRETE DEPENDENT VARIABLE MODELS
Estimating models with binary dependent variables – Logit, Probit and the Linear Probability Model: probit, logit, regress
The Heteroskedastic Probit Model and tests of heteroskedasticity: hetprobit
Measures of Goodness of Fit and Specification Tests: estat classification, estat gof
Estimating and interpreting marginal effects: margins
Fixed and Random Panel Data Estimators: xtprobit, xtlogit, clogit
SESSION III: PROBIT MODELS WITH ENDOGENOUS REGRESSORS
Maximum likelihood estimation in the presence of continuous endogenous regressors: ivprobit
Measures of Goodness of Fit: tabulate, estat classification, estat correlation
Estimation and interpretation of marginal effects: margins
SESSION IV: MULTINOMIAL MODELS
Ordered categorical variable models (the Ordered Probit and Ordered Logit Estimators): oprobit and ologit
The Heteroskedastic Ordered Probit Model and tests of heteroskedasticity: hetoprobit
Random Effect Ordered Panel Data Probit Models: xtoprobit
Models with unordered categorical variables – Multinomial Logit and Multinomial Probit estimators: mlogit, mprobit
McFadden’s Choice Model – categorical variable models with alternative-specific regressors: cmclogit, cmmprobit
Measures of Goodness of Fit and Specification Tests
Estimation and interpretation of marginal effects using the Stata post estimation command margins
SESSION V: THE TOBIT MODEL, INTERVAL REGRESSION AND SAMPLE SELECTION
The Tobit Model: tobit
Estimating the Tobit model with endogenous regressors: ivtobit
Interval Regression: a generalization of the Tobit Model: intreg
Fixed and Random Effects Panel Data Estimators: xttobit, xtintreg
Estimators for Sample Selection Models: heckman
Estimation and interpretation of marginal effects using the Stata post estimation command margins
Random Effect Panel Data Estimators: xtheckman
SESSION VI: EXTENDED REGRESSION MODELS WITH BOTH ENDOGENOUS REGRESSORS AND TREATMENT EFFECTS IN THE PRESENCE OF SAMPLE SELECTION
Extended Regression Models: eregress
Extended Regression Probit Models: eprobit
Ordered Extended Regression Probit Models: eoprobit
Extended Interval Regression Models: eintreg
Extended Regression Random Effect Panel Data models: xteregress, xteprobit, xteoprobit, xteintreg
A Gentle Introduction to Stata, 6th Ed., Alan Acock (2018) Stata Press
Data Analysis Using Stata, 3rd Ed., Ulrich Kohler, Frauke Kreuter (2012) Stata Press
Data Management Using Stata: A Practical Handbook, Michael N. Mitchell (2010) Stata Press
The Workflow of Data Analysis Using Stata, J. Scott Long (2009) Stata Press
Mostly Harmless Econometrics: An Empiricist’s Companion, Joshua D. Angrist and Jörn-Steffen Pischke (2008) Princeton University Press
Microeconometrics Using Stata, Colin Cameron and Pravin K. Trivedi (2010) Stata Press
The Summer School will take place from the 28th of August to the 3rd of September 2022, from 9:00 am to 5:00 pm Central European Summer Time (CEST), at the CISL Studium Center, Via Della Piazzola, 71 | I-50123 Florence | www.centrostudi.cisl.it.
Dr. Una-Louise BELL, TStat Training | TStat S.r.l.
Dr. Giovanni BRUNO, Bocconi University, Milan
ENTIRE WEEK (MODULES A plus B, 7 days)
Full-Time Students*: € 1890.00
Ph.D. Students: € 2770.00
Academic: € 3080.00
Commercial: € 4550.00
MODULE B (6 days)
Full-Time Students*: € 1620.00
Ph.D. Students: € 2380.00
Academic: € 2640.00
Commercial: € 3900.00
*To be eligible for full-time student prices, participants must provide proof of their full-time student status for the current academic year. Our standard policy is to give all full-time students, whether at undergraduate or master's level, access to our student registration rates. Part-time master's and doctoral students who are also currently employed will, however, be charged the standard academic registration fee. Residential costs for full-time students are completely covered by TStat Training through our Investing in Young Researchers Programme. Participation at the full-time student rate is, however, restricted to a maximum of 3 students.
Fees are subject to VAT (applied at the current Italian rate of 22%). Under current EU fiscal regulations, VAT will not, however, be applied to companies, institutions or universities providing a valid tax registration number.
Please note that a non-refundable deposit of €100.00 for full-time students and €250.00 for Academic and Commercial participants is required to secure a place and is payable upon registration. The number of participants is limited to 15. Places will be allocated on a first-come, first-served basis.
Course fees cover: i) teaching materials (copies of lecture slides, databases and Stata routines used during the summer school); ii) a temporary licence of Stata valid for 30 days from the day before the beginning of the school; iii) half board accommodation (breakfast, lunch and coffee breaks) in a single room at the CISL Studium Centre or equivalent (7 nights for the entire school, 6 nights for Module B). Participants requiring accommodation on the night of the final day of the school are requested to contact us as soon as possible.
To maximize the usefulness of this summer school, we strongly recommend that participants bring their own laptops with them, to enable them to actively participate in the empirical sessions.
Individuals interested in attending this summer school must return their completed registration forms to TStat by email (email@example.com) by the 18th of August 2022.