Resources & Reading List
Handouts
Foundational ML for Social Scientists
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning (2nd ed.). Springer. [Free PDF: statlearning.com]
- McKinney, W. (2022). Python for Data Analysis (3rd ed.). O'Reilly.
- Athey, S. (2017). Beyond prediction: Using big data for policy problems. Science, 355(6324), 483–485.
ML in Innovation Research
- Arts, S., Hou, J., & Gomez, J.C. (2021). Natural language processing to identify the creation and impact of new technologies in patent text: Code, data, and new measures. Research Policy, 50(2), 104144.
- Behrens, J., Ernst, H., & Shepherd, D.A. (2020). The role of machine learning in innovation research. Research Policy, 49(1), 103889.
- Fernández-Delgado, M., Cernadas, E., Barro, S., & Amorim, D. (2014). Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15, 3133–3181.
NLP and Text Analysis
- Gentzkow, M., Kelly, B., & Taddy, M. (2019). Text as data. Journal of Economic Literature, 57(3), 535–574.
- Hannigan, T.R. et al. (2019). Topic modeling in management research. Academy of Management Annals, 13(2), 586–632.
Clustering and Dimensionality Reduction
- McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv:1802.03426.
- Steinley, D. (2006). K-means clustering: A half-century synthesis. British Journal of Mathematical and Statistical Psychology, 59(1), 1–34.
Generative AI in Research
- Bail, C.A. (2024). Can generative AI improve social science? PNAS, 121(21), e2314021121.
- Ziems, C. et al. (2024). Can large language models transform computational social science? Computational Linguistics, 50(1), 237–291.
Reporting ML Results
- Collins, G.S. et al. (2015). Transparent reporting of a multivariable prediction model (TRIPOD). BMJ, 350, g7594.
Online Resources
- Kaggle Learn: Free short courses on Python, machine learning, and data science.
- Google Colab: Free hosted Jupyter notebooks that run in the browser.
- scikit-learn: Official user guide, tutorials, and worked examples.
- Papers With Code: ML papers linked to their code implementations.