Financial Statement Analysis

Jun He May 31, 2026

This chapter pulls firm fundamentals — balance sheet, income statement, and cash flow statement — and turns raw statement items into comparable figures. The code uses the fmpapi package (Financial Modeling Prep) to fetch statements. Use the R | Python toggle to switch.

library(tidyverse)
library(tidyfinance)
library(scales)
library(fmpapi)

import pandas as pd
import numpy as np
from fmpapi import fmp_get
from plotnine import *
from mizani.formatters import percent_format

The three statements

Each statement is one API call. The balance sheet is a snapshot of assets, liabilities, and equity; we request the last five annual filings for a single firm.

fmp_get(
  resource = "balance-sheet-statement",
  symbol = "MSFT",
  params = list(period = "annual", limit = 5)
)

fmp_get(
  resource="balance-sheet-statement",
  symbol="MSFT",
  params={"period": "annual", "limit": 5},
  to_pandas=True
)

The income statement reports revenues and expenses over the period.

fmp_get(
  resource = "income-statement",
  symbol = "MSFT",
  params = list(period = "annual", limit = 5)
)

fmp_get(
  resource="income-statement",
  symbol="MSFT",
  params={"period": "annual", "limit": 5},
  to_pandas=True
)

The cash flow statement records where cash actually moved.

fmp_get(
  resource = "cash-flow-statement",
  symbol = "MSFT",
  params = list(period = "annual", limit = 5)
)

fmp_get(
  resource="cash-flow-statement",
  symbol="MSFT",
  params={"period": "annual", "limit": 5},
  to_pandas=True
)

Downloading a cross-section

To work with more than one firm, we loop the same call over a sample of symbols and stack the results — map_df in R, a list comprehension fed to pd.concat in Python. The same params (annual, last five years) apply to all three statement types.

sample <- c(
  "AAPL", "MSFT", "GOOGL", "AMZN", "TSLA", "NVDA", "META", "NFLX", "DIS", "NKE",
  "WMT", "KO", "JPM", "BAC", "V", "XOM", "CVX", "JNJ", "PFE", "INTC",
  "AMD", "SBUX", "BABA", "UBER", "CSCO"
)

params <- list(period = "annual", limit = 5)

balance_sheet_statements <- sample |>
  map_df(\(x) fmp_get(resource = "balance-sheet-statement", symbol = x, params = params))

income_statements <- sample |>
  map_df(\(x) fmp_get(resource = "income-statement", symbol = x, params = params))

cash_flow_statements <- sample |>
  map_df(\(x) fmp_get(resource = "cash-flow-statement", symbol = x, params = params))

sample = [
  "AAPL", "MSFT", "GOOGL", "AMZN", "TSLA", "NVDA", "META", "NFLX", "DIS", "NKE",
  "WMT", "KO", "JPM", "BAC", "V", "XOM", "CVX", "JNJ", "PFE", "INTC",
  "AMD", "SBUX", "BABA", "UBER", "CSCO"
]

params = {"period": "annual", "limit": 5}

balance_sheet_statements = pd.concat(
  [fmp_get(resource="balance-sheet-statement", symbol=x, params=params, to_pandas=True) for x in sample],
  ignore_index=True
)

income_statements = pd.concat(
  [fmp_get(resource="income-statement", symbol=x, params=params, to_pandas=True) for x in sample],
  ignore_index=True
)

cash_flow_statements = pd.concat(
  [fmp_get(resource="cash-flow-statement", symbol=x, params=params, to_pandas=True) for x in sample],
  ignore_index=True
)

Working with the statements

Because the data is tidy, selecting and filtering is routine: here we pull one firm's total assets and liabilities by year. The same selection logic extends to building the ratios (profitability, leverage, valuation) that the asset-pricing chapters use as firm characteristics.

balance_sheet_statements |>
  filter(symbol == "MSFT") |>
  select(symbol, calendar_year, total_assets, total_liabilities)

(balance_sheet_statements
  .query("symbol == 'MSFT'")
  .get(["symbol", "calendar_year", "total_assets", "total_liabilities"])
)

Study notes following the Tidy Finance curriculum by Scheuch, Voigt, Weiss, and Frey. Prose is my own; the R/Python code is reproduced from the book's open-source source, licensed CC BY-NC-SA 4.0.