Data Visualization
Many of the major debates in the social sciences revolve around questions that can be addressed with data. The sources of voting behavior, the correlates of war, the determinants of development, and open questions in political economy, psychology, institutions, and conflict are all amenable to data-based analysis.
At the same time, the amount of available data and the number of publicly available open-source tools for cleaning, transforming, analyzing, and visualizing it have grown enormously since the turn of the millennium. With a few clicks, students can compare word frequencies in books over time or construct elaborate size-weighted word clouds, tasks that would once have taken scholars weeks, if not months, of effort.
Dates
This two-week, 35-hour course runs Monday-Friday, July 1-12, 2025, from 9:00 am to 12:30 pm.
Instructor
Cho Soohyun, Princeton University
Detailed Description
This course introduces students to open-source tools for cleaning, transforming, analyzing, and visualizing data, and to the principles behind their use, through applications in the social sciences. It neither requires nor imparts any statistical background; it is designed to serve either as a standalone course or as a gateway to a more advanced data-analytics class.
Prerequisites
There are no formal prerequisites.
Requirements
In this course, students are expected to use the statistical software R (see https://www.r-project.org) and RStudio (see https://www.rstudio.com).
Core Readings
Healy, Kieran. 2018. Data Visualization: A Practical Introduction. Princeton: Princeton University Press. (Most of the book and lecture notes are available free online)
Wickham, Hadley and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol, California: O’Reilly Media. (Most of the book and lecture notes are available free online)