2025-2026 Bulletin: Program Requirements 
    
Aug 17, 2025

PP 487M - Applied Data Analysis: Machine Learning and Data Mining for Social Scientists


In recent decades, machine learning has revolutionized the way scientists across disciplines and industries analyze data, unearth hidden patterns, and apply statistical tools to solve social problems. This course introduces students to the machine learning, data mining, and statistical pattern recognition techniques most commonly applied in academic and public/private research settings. It takes a practical, hands-on approach while covering the techniques, the methods, and the statistics that support them. Topics include supervised machine learning (e.g., parametric and non-parametric algorithms), unsupervised machine learning (e.g., clustering), performance evaluation, cross-validation, and regularization theory. This is not a coding/programming course; instead, students will learn how to apply algorithms to social science data, create and evaluate data clusters, and perform predictive analytics using a variety of statistical software packages (Stata, R, and Python) and already-written code specifications in various languages. The course offers social scientists an opportunity to acquire advanced data science skills and applied tools for solving real-world problems.
Units: 4
Course Type: Seminar