Option Pricing via Machine Learning
Machine learning can approximate an option-pricing function: given inputs like moneyness, maturity, and volatility, a model learns to output a price. It is a clean teaching setting because the Black–Scholes formula provides a known ground truth to judge the learned surrogate against. This chapter generates a Black–Scholes dataset and fits neural networks and a random forest to it. Use the R | Python toggle to switch.
library(tidyverse)
library(tidyfinance)
library(keras3)
library(hardhat)
library(tidymodels)
import pandas as pd
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
The Black–Scholes benchmark
The ground truth is the closed-form Black–Scholes price. We define it as a function so we can label any combination of inputs with its true price.
black_scholes_price <- function(S, K, r, vola, T) {
d1 <- (log(S / K) + (r + vola^2 / 2) * T) / (vola * sqrt(T))
d2 <- d1 - vola * sqrt(T)
S * pnorm(d1) - K * exp(-r * T) * pnorm(d2)
}
from scipy.stats import norm
def black_scholes_price(S, K, r, vola, T):
d1 = (np.log(S / K) + (r + vola**2 / 2) * T) / (vola * np.sqrt(T))
d2 = d1 - vola * np.sqrt(T)
return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
Generating the training data
We build a grid over the five inputs (spot, strike, rate, volatility, maturity), price each point with Black–Scholes, then add a little noise to mimic "observed" market prices the model must learn from.
set.seed(2023)
option_prices <- expand_grid(
S = seq(40, 60, 5),
K = seq(20, 90, 5),
r = seq(0, 0.05, 0.01),
vola = seq(0.01, 0.3, 0.01),
T = seq(3 / 12, 2, 1 / 12)
) |>
mutate(
black_scholes = map_dbl(
seq_len(n()),
\(i) black_scholes_price(S[i], K[i], r[i], vola[i], T[i])
),
observed_price = map_dbl(black_scholes, \(x) x + rnorm(1, sd = 0.15))
)
rng = np.random.default_rng(2023)
grid = np.array(np.meshgrid(
np.arange(40, 61, 5), np.arange(20, 91, 5),
np.arange(0, 0.051, 0.01), np.arange(0.01, 0.31, 0.01),
np.arange(3/12, 2.01, 1/12)
)).T.reshape(-1, 5)
option_prices = pd.DataFrame(grid, columns=["S", "K", "r", "vola", "T"])
option_prices["black_scholes"] = black_scholes_price(
option_prices["S"], option_prices["K"], option_prices["r"],
option_prices["vola"], option_prices["T"])
option_prices["observed_price"] = (
option_prices["black_scholes"] + rng.normal(0, 0.15, len(option_prices)))
Train/test split and preprocessing
We hold out a test set, and standardize the inputs so the neural network trains well. The recipe/pipeline keeps the preprocessing tied to the model.
split <- initial_split(option_prices, prop = 0.8)
rec <- recipe(observed_price ~ S + K + r + vola + T, data = training(split)) |>
step_normalize(all_predictors())
X = option_prices[["S", "K", "r", "vola", "T"]]
y = option_prices["observed_price"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2023)
Models: neural networks and a random forest
A single-hidden-layer network captures the broad pricing surface; a deeper network learns its nonlinearities more flexibly; a random forest is a non-parametric alternative. Comparing them illustrates the flexibility-vs-interpretability trade-off — and, since the truth is known, exactly how close each gets.
nnet_model <- mlp(epochs = 250, hidden_units = 10) |>
set_mode("regression") |>
set_engine("keras", verbose = 0)
deeplearning_model <- mlp(epochs = 250, hidden_units = c(10, 10, 10)) |>
set_mode("regression") |>
set_engine("keras", verbose = 0)
rf_model <- rand_forest(trees = 50, min_n = 2000) |>
set_mode("regression") |>
set_engine("ranger")
nnet_fit <- workflow() |>
add_recipe(rec) |>
add_model(nnet_model) |>
fit(training(split))
nnet_model = Pipeline([
("scale", StandardScaler()),
("mlp", MLPRegressor(hidden_layer_sizes=(10,), max_iter=250, random_state=2023))
])
deeplearning_model = Pipeline([
("scale", StandardScaler()),
("mlp", MLPRegressor(hidden_layer_sizes=(10, 10, 10), max_iter=250, random_state=2023))
])
rf_model = RandomForestRegressor(n_estimators=50, min_samples_leaf=2000, random_state=2023)
nnet_model.fit(X_train, y_train)
deeplearning_model.fit(X_train, y_train)
rf_model.fit(X_train, y_train)
Evaluation against the benchmark
Predicting on the held-out set and comparing to the true Black–Scholes prices measures how well each surrogate learned the function. The practical payoff: a trained surrogate evaluates far faster than re-solving the pricing model, which matters when prices are needed millions of times for risk or calibration.
predictions <- bind_cols(
testing(split),
nnet = predict(nnet_fit, testing(split))$.pred
) |>
mutate(error = nnet - black_scholes)
predictions |>
summarize(rmse = sqrt(mean(error^2)))
predictions = X_test.copy()
predictions["black_scholes"] = black_scholes_price(
X_test["S"], X_test["K"], X_test["r"], X_test["vola"], X_test["T"])
predictions["nnet"] = nnet_model.predict(X_test)
predictions["error"] = predictions["nnet"] - predictions["black_scholes"]
np.sqrt((predictions["error"]**2).mean())
Study notes following the Tidy Finance curriculum by Scheuch, Voigt, Weiss, and Frey. Prose is my own; the R/Python code is reproduced from the book's open-source source, licensed CC BY-NC-SA 4.0.