Financial Statement Analysis
This chapter pulls firm fundamentals — balance sheet, income statement, and cash flow statement — and turns raw statement items into comparable figures. The code uses the fmpapi package (Financial Modeling Prep) to fetch statements. Use the R | Python toggle to switch.
library(tidyverse)
library(tidyfinance)
library(scales)
library(fmpapi)
import pandas as pd
import numpy as np
from fmpapi import fmp_get
from plotnine import *
from mizani.formatters import percent_format
The three statements
Each statement is one API call. The balance sheet is a snapshot of assets, liabilities, and equity; we request the last five annual filings for a single firm.
fmp_get(
resource = "balance-sheet-statement",
symbol = "MSFT",
params = list(period = "annual", limit = 5)
)
fmp_get(
resource="balance-sheet-statement",
symbol="MSFT",
params={"period": "annual", "limit": 5},
to_pandas=True
)
The income statement reports revenues and expenses over the period.
fmp_get(
resource = "income-statement",
symbol = "MSFT",
params = list(period = "annual", limit = 5)
)
fmp_get(
resource="income-statement",
symbol="MSFT",
params={"period": "annual", "limit": 5},
to_pandas=True
)
The cash flow statement records where cash actually moved.
fmp_get(
resource = "cash-flow-statement",
symbol = "MSFT",
params = list(period = "annual", limit = 5)
)
fmp_get(
resource="cash-flow-statement",
symbol="MSFT",
params={"period": "annual", "limit": 5},
to_pandas=True
)
Downloading a cross-section
To work with more than one firm, we loop the same call over a sample of symbols and stack the results — map_df in R, a list comprehension fed to pd.concat in Python. The same params (annual, last five years) apply to all three statement types.
sample <- c(
"AAPL", "MSFT", "GOOGL", "AMZN", "TSLA", "NVDA", "META", "NFLX", "DIS", "NKE",
"WMT", "KO", "JPM", "BAC", "V", "XOM", "CVX", "JNJ", "PFE", "INTC",
"AMD", "SBUX", "BABA", "UBER", "CSCO"
)
params <- list(period = "annual", limit = 5)
balance_sheet_statements <- sample |>
map_df(\(x) fmp_get(resource = "balance-sheet-statement", symbol = x, params = params))
income_statements <- sample |>
map_df(\(x) fmp_get(resource = "income-statement", symbol = x, params = params))
cash_flow_statements <- sample |>
map_df(\(x) fmp_get(resource = "cash-flow-statement", symbol = x, params = params))
sample = [
"AAPL", "MSFT", "GOOGL", "AMZN", "TSLA", "NVDA", "META", "NFLX", "DIS", "NKE",
"WMT", "KO", "JPM", "BAC", "V", "XOM", "CVX", "JNJ", "PFE", "INTC",
"AMD", "SBUX", "BABA", "UBER", "CSCO"
]
params = {"period": "annual", "limit": 5}
balance_sheet_statements = pd.concat(
[fmp_get(resource="balance-sheet-statement", symbol=x, params=params, to_pandas=True) for x in sample],
ignore_index=True
)
income_statements = pd.concat(
[fmp_get(resource="income-statement", symbol=x, params=params, to_pandas=True) for x in sample],
ignore_index=True
)
cash_flow_statements = pd.concat(
[fmp_get(resource="cash-flow-statement", symbol=x, params=params, to_pandas=True) for x in sample],
ignore_index=True
)
Working with the statements
Because the data is tidy, selecting and filtering is routine: here we pull one firm's total assets and liabilities by year. The same selection logic extends to building the ratios (profitability, leverage, valuation) that the asset-pricing chapters use as firm characteristics.
balance_sheet_statements |>
filter(symbol == "MSFT") |>
select(symbol, calendar_year, total_assets, total_liabilities)
(balance_sheet_statements
.query("symbol == 'MSFT'")
.get(["symbol", "calendar_year", "total_assets", "total_liabilities"])
)
Study notes following the Tidy Finance curriculum by Scheuch, Voigt, Weiss, and Frey. Prose is my own; the R/Python code is reproduced from the book's open-source source, licensed CC BY-NC-SA 4.0.