Panel Data Analysis

Professor Li is such a friendly and helpful instructor, providing a great introduction to panel data methods and making them easy to understand. — participant from Singapore

This course provides a general survey of the various methods used to analyze panel data. The course begins with a quick overview of causal inference and a review of the standard ordinary least squares (OLS) assumptions. It then moves on to simple panel data methods, namely the fixed effects and random effects estimators. The remainder of the course focuses on more advanced methods that deal with violations of the OLS assumptions, such as methods for clustered samples, panel instrumental variable methods, and dynamic panel regression. Theoretical lectures are complemented by applied lab sessions that put these methods into practice.

Dates

This one-week, 17.5-hour course runs Monday to Friday, 30 June to 4 July 2025. Sessions are scheduled for 1:30 pm to 5:00 pm each day.

Classroom Location

Faculty of Arts and Social Science

Instructor

Andrew X. Li, Central European University

Detailed Description

This course provides a general survey of the various econometric models and techniques that are commonly used to analyze panel data. In addition, the course places a special emphasis on the connections between these methods and causal inference, which is the primary goal of all social science research. Panel data are particularly well suited for causal inference as they allow researchers to control for unit-specific factors/individual heterogeneity, such as geography and culture, which may be unobserved and can be difficult to measure. Furthermore, panel data usually provide more degrees of freedom and more sample variability than purely cross-sectional or time-series data, and hence improve the efficiency of parameter estimates.
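The point about unit-specific heterogeneity can be made concrete with a small sketch. The example below is an illustration only and is not taken from the course materials, which use Stata and R. It assumes a hypothetical, noise-free panel of four units observed over five periods, in which an unobserved unit effect (alpha_i) is correlated with the regressor. Pooled OLS ignores the unit effect and overstates the slope, while the within (fixed effects) transformation, which subtracts each unit's own time average, recovers the true slope. All names and numbers are invented for this illustration.

```python
# Illustrative sketch only: a tiny, noise-free, made-up panel.
from statistics import mean


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den


def demean_by_unit(values, units):
    """Within transformation: subtract each unit's own time average."""
    unit_means = {
        u: mean(v for v, w in zip(values, units) if w == u)
        for u in set(units)
    }
    return [v - unit_means[u] for v, u in zip(values, units)]


TRUE_BETA = 2.0  # assumed true effect of x on y

units, x, y = [], [], []
for i in range(4):                 # four hypothetical units
    alpha_i = 10.0 * i             # unobserved unit effect
    for t in range(5):             # five time periods
        x_it = 2.0 * i + t         # regressor is correlated with alpha_i
        units.append(i)
        x.append(x_it)
        y.append(TRUE_BETA * x_it + alpha_i)

# Pooled OLS ignores alpha_i, so its slope absorbs part of the unit effect.
pooled_slope = ols_slope(x, y)

# The within estimator removes alpha_i by demeaning within each unit.
within_slope = ols_slope(demean_by_unit(x, units), demean_by_unit(y, units))

print("pooled OLS slope:", pooled_slope)
print("within (fixed effects) slope:", within_slope)
```

In this constructed example the within slope equals the assumed true value of 2, whereas the pooled slope is far larger because it attributes the unit effect to the regressor. Real data contain noise and time-varying confounders, so the contrast is rarely this clean.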

The course is divided into three parts. The first part involves a thorough discussion of the logic and assumptions underlying panel data methods. Participants learn how the development of more advanced methods is driven by the need to address potential violations of these assumptions. The second part focuses on the various statistical approaches and 'tricks' available to social scientists to deal with such violations and with problems hidden in their data, allowing them to estimate effects that are as close as possible to the true causal effects. The final part of the course focuses on applying the wide range of panel data methods discussed in the previous parts to substantive research questions of interest. Participants learn how these methods can be used to provide answers to their own research questions. In this context, participants are encouraged to bring and work with their own panel data during the course's applied lab sessions.

This course aims to strike a balance between statistical theory and practical application. Participants have the opportunity to learn and practice panel data methods using the popular statistical software packages Stata and R, and to develop an understanding of and appreciation for the science behind these methods.

Prerequisites

This course presumes a working knowledge of statistics. As it builds on OLS regression and extends it to data with a time-series cross-sectional structure, participants unfamiliar with regression should consider taking Regression Analysis or Applied Data Analysis instead. A background in the use of Stata and/or R would be helpful, but is not required.

Requirements

Participants are expected to have access to an internet-connected computer. Access to data, temporary licenses for the course software, and installation support will be provided by the Methods School.

Core Readings

Wooldridge, Jeffrey M. 2016. Introductory Econometrics: A Modern Approach. Boston, MA: Cengage Learning.

Suggested Readings

Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics. An Empiricist's Companion. Princeton, NJ: Princeton University Press.

Arellano, Manuel, and Olympia Bover. 1995. Another Look at the Instrumental Variable Estimation of Error-components Models. Journal of Econometrics 68: 29-51.

Beck, Nathaniel, and Jonathan N. Katz. 1995. What to Do (and not to Do) with Time-series Cross-section Data. American Political Science Review 89: 634-647.

Beck, Nathaniel. 2001. Time-series Cross-section Data. What Have we Learned in the Past Few Years? Annual Review of Political Science 4: 271-293.

Blundell, Richard, Stephen Bond, and Frank Windmeijer. 2001. Estimation in Dynamic Panel Data Models: Improving on the Performance of the Standard GMM Estimator. In: Baltagi, Badi H. (ed.). Nonstationary Panels, Panel Cointegration, and Dynamic Panels. Bingley: Emerald.

Bond, Stephen R. 2002. Dynamic Panel Data Models: A Guide to Micro Data Methods and Practice. Portuguese Economic Journal 1: 141-162.

Holland, Paul W. 1986. Statistics and Causal Inference. Journal of the American Statistical Association 81: 945-960.

Imbens, Guido W., and Jeffrey M. Wooldridge. 2009. Recent Developments in the Econometrics of Program Evaluation. Journal of Economic Literature 47: 5-86.

Newey, Whitney K., and Kenneth D. West. 1987. A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix. Econometrica 55: 703-708.

Nickell, Stephen. 1981. Biases in Dynamic Models with Fixed Effects. Econometrica 49: 1417-1426.

Rubin, Donald B. 1974. Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. Journal of Educational Psychology 66: 688-701.

Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.