Preface
These pages are my study notes following Tidy Finance, an open-source book that teaches empirical and quantitative finance through transparent, reproducible code in both R and Python. Each chapter pairs a financial idea with the code that implements it, end to end — from downloading data to forming portfolios to running regressions.
What the book is for
The premise is that financial research should be reproducible: anyone should be able to take the code, run it, and obtain the same results. To that end the book favors a small, consistent toolset (the tidyverse in R, pandas/numpy in Python), a tidy-data discipline (one row per observation, consistent keys), and a companion tidyfinance package that wraps common data downloads behind a single call.
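To make the tidy-data discipline concrete, here is a minimal sketch in pandas. The tickers `AAA` and `BBB`, the return values, and the column names `symbol` and `ret` are made up for illustration and are not taken from the book. The sketch reshapes a wide table (one column per ticker) into a long table with one row per observation, keyed by `symbol` and `date`.

```python
import pandas as pd

# Hypothetical wide table: one row per month, one column per ticker.
# Tickers and return values are invented for illustration only.
wide = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-31", "2024-02-29"]),
        "AAA": [0.02, -0.01],
        "BBB": [0.01, 0.03],
    }
)

# Reshape to tidy (long) form: one row per (symbol, date) observation.
tidy = (
    wide.melt(id_vars="date", var_name="symbol", value_name="ret")
    .sort_values(["symbol", "date"])
    .reset_index(drop=True)
)

# Consistent keys: (symbol, date) should identify each row uniquely.
assert not tidy.duplicated(subset=["symbol", "date"]).any()

# With tidy data, a per-asset summary needs no ticker-specific code.
mean_ret = tidy.groupby("symbol")["ret"].mean()

print(tidy)
print(mean_ret)
```

The point of the long layout is that adding a third ticker adds rows rather than columns, so the grouping and merging code stays unchanged.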
How these notes are organized
The chapters follow the book's structure, grouped into parts:
- Prerequisites — setting up the environment and the tidyfinance package.
- Getting Started — returns, modern portfolio theory, the CAPM, financial statements, and discounted cash flow analysis.
- Financial Data — accessing open-source data, WRDS/CRSP/Compustat, TRACE/FISD, and other providers.
- Asset Pricing — beta estimation, portfolio sorts, the size and value effects, factor replication, and Fama–MacBeth regressions.
- Modeling and Machine Learning — fixed effects and clustered standard errors, difference-in-differences, factor selection, and option pricing.
- Portfolio Optimization — parametric portfolio policies and constrained optimization with backtesting.
- Appendix — WRDS pseudo data (so the code runs without WRDS access), cleaning Enhanced TRACE, and proofs.
Each page explains the idea in my own words and reproduces the book's R and Python code, with attribution to the source. On code-bearing chapters, a toggle at the top switches between the two languages.
Study notes following the Tidy Finance curriculum by Scheuch, Voigt, Weiss, and Frey. Prose is my own; reproduced code is licensed CC BY-NC-SA 4.0.