Certified equilibrium via asymptotic-initialised deep BSDE

Connection

The theoretical epsilon-Nash guarantee for deep fictitious play (Han-Hu-Long 2021, Corollary 1) requires that the deep BSDE sub-problems at each stage are solved accurately — specifically, that the BSDE loss E|g(X_T) - Y_T|^2 is small. The guarantee bounds the equilibrium gap by the sub-problem loss, so the BSDE loss IS the equilibrium certificate.
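As an illustration of the quantity being bounded, the loss E|g(X_T) - Y_T|^2 can be estimated by Monte Carlo. The sketch below uses a toy problem, not the equilibrium FBSDE: X is a Brownian motion, the driver is zero, and g(x) = x^2, so that Y_t = X_t^2 + (T - t) and Z_t = 2 X_t are known exactly. The function name bsde_terminal_loss and all parameter values are assumptions made for the example.

```python
import numpy as np

def bsde_terminal_loss(y0, z_fn, g, T=1.0, n_steps=50, n_paths=20000, seed=0):
    """Monte Carlo estimate of E|g(X_T) - Y_T|^2 for a zero-driver toy BSDE."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.zeros(n_paths)               # forward process: Brownian motion from 0
    y = np.full(n_paths, float(y0))     # candidate initial value Y_0
    for n in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        y = y + z_fn(n * dt, x) * dw    # Euler step of dY = Z dW (zero driver)
        x = x + dw
    return float(np.mean((g(x) - y) ** 2))

g = lambda x: x ** 2
z_exact = lambda t, x: 2.0 * x          # exact Z for this toy problem

loss_exact = bsde_terminal_loss(1.0, z_exact, g)   # correct Y_0 = T = 1
loss_biased = bsde_terminal_loss(1.5, z_exact, g)  # deliberately wrong Y_0
print(loss_exact, loss_biased)
```

With the correct Y_0 and Z the remaining loss is pure time-discretisation error, while a wrong Y_0 shows up directly as a larger loss. That is the sense in which the loss measures sub-problem accuracy.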

However, direct application of the coupled BSDE solver (Ji et al. Algorithm 2) to the transaction cost equilibrium FBSDE fails in practice: the zero-diffusion forward component (d(phi) = phi_dot dt) and the Z^2 nonlinearity in the minimised Hamiltonian cause poor convergence. The direct cost minimisation approach (Carmona-Lauriere Method 1) works numerically but provides no such certificate — it minimises each agent’s cost but doesn’t bound the epsilon-Nash gap.

The connection: the sqrt(lambda) asymptotic expansion from Shelley (2023) and Herdegen-Muhle-Karbe (2018) provides explicit analytical formulas that approximate the equilibrium in the small-cost regime. These can serve as control variates in the sense of Naito et al. (2025). The neural network then only needs to learn the small residual between the true equilibrium and the asymptotic approximation. This is precisely the regime where the coupled BSDE solver should work well: the residual is smooth and small, and it avoids the initialization instability that plagues the raw solver (Naito et al. report the failure rate dropping from 90% to 0% with control variates).

Bridged Concepts

From equilibrium theory (Shelley 2023, Herdegen-Muhle-Karbe 2018)

quadratic transaction costs: the sqrt(lambda) scaling of the equilibrium correction gives an explicit leading-order expansion. The Riccati solution P(t) = lambda * kappa * tanh(kappa*(T-t)) gives exact Y and Z for the leading order.
FBSDE equilibrium characterisation: the coupled linear FBSDE system has explicit solutions via matrix exponentials. These serve as Y^{AE} and Z^{AE} in the Naito et al. framework.
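The Riccati formula above can be sanity-checked numerically. The sketch below assumes a scalar LQ tracking problem with running cost (gamma/2) x^2 + (lambda/2) u^2 and dynamics dx = u dt, for which the Riccati equation is -P'(t) = gamma - P(t)^2 / lambda with P(T) = 0, and kappa = sqrt(gamma / lambda). The coefficient gamma, this definition of kappa, and the cost normalisation are assumptions for the example; the source does not define them.

```python
import numpy as np

def riccati_closed_form(t, lam, gamma, T):
    """P(t) = lam * kappa * tanh(kappa * (T - t)), with kappa = sqrt(gamma / lam) (assumed)."""
    kappa = np.sqrt(gamma / lam)
    return lam * kappa * np.tanh(kappa * (T - t))

def riccati_residual(t, lam, gamma, T, h=1e-5):
    """Residual of P' + gamma - P^2 / lam = 0, using a central difference for P'."""
    dP = (riccati_closed_form(t + h, lam, gamma, T)
          - riccati_closed_form(t - h, lam, gamma, T)) / (2.0 * h)
    P = riccati_closed_form(t, lam, gamma, T)
    return dP + gamma - P ** 2 / lam

lam, gamma, T = 0.1, 1.0, 1.0           # illustrative values only
t_grid = np.linspace(0.0, T, 101)
max_residual = float(np.max(np.abs(riccati_residual(t_grid, lam, gamma, T))))
terminal_value = float(riccati_closed_form(T, lam, gamma, T))
print(max_residual, terminal_value)
```

The residual should vanish up to finite-difference error, and the terminal value should be zero. A different cost normalisation would change the constants but not the tanh structure.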

From deep learning methods (Naito et al. 2025, Han-Hu-Long 2021)

control variate for deep BSDE: replace phi^1 with chi * Y^{AE} + phi^1 and phi^2 with chi * Z^{AE} + phi^2. The network learns Y - Y^{AE} and Z - Z^{AE} rather than the full solution.
deep fictitious play: at each DFP stage, solve each agent’s BSDE sub-problem using the asymptotic-initialised solver. The BSDE loss provides the epsilon-Nash certificate via Corollary 1 of Han-Hu-Long.
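The control-variate substitution can be sketched as a thin wrapper around the network output. In the example below the "network" is an untrained stand-in that returns zero, chi is treated as a scalar weight on the asymptotic term, and z_true and z_ae are toy functions chosen so that their difference is small. All of these are assumptions made for illustration and are not taken from Naito et al.

```python
import numpy as np

def with_control_variate(residual_fn, ae_fn, chi=1.0):
    """Return (t, x) -> chi * ae_fn(t, x) + residual_fn(t, x)."""
    return lambda t, x: chi * ae_fn(t, x) + residual_fn(t, x)

# Toy stand-ins (hypothetical): a "true" Z and an asymptotic approximation to it.
T, kappa = 1.0, 2.0
z_true = lambda t, x: np.tanh(kappa * (T - t)) * x + 0.05 * np.sin(x)
z_ae = lambda t, x: np.tanh(kappa * (T - t)) * x
zero_net = lambda t, x: np.zeros_like(x)    # output of an untrained network

rng = np.random.default_rng(0)
t = rng.uniform(0.0, T, size=10_000)
x = rng.normal(size=10_000)

z_plain = with_control_variate(zero_net, z_ae, chi=0.0)  # no control variate
z_cv = with_control_variate(zero_net, z_ae, chi=1.0)     # with control variate

# Mean-square error at initialisation: what the network still has to learn.
err_plain = float(np.mean((z_true(t, x) - z_plain(t, x)) ** 2))
err_cv = float(np.mean((z_true(t, x) - z_cv(t, x)) ** 2))
print(err_plain, err_cv)
```

With the control variate switched on, the error at initialisation is only the size of the residual, which is the warm-start effect described above.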

From general costs (Gonon-Muhle-Karbe-Shi 2020)

Asset Pricing with General Transaction Costs: for non-quadratic costs (power costs G(x) = lambda|x|^q/q), the equilibrium FBSDE is nonlinear and has no closed-form solution. The quadratic-cost analytical solution could still serve as a control variate, because the leading-order structure is similar when costs are calibrated to match trading volume. This makes the approach useful beyond the LQ setting.
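To see where the nonlinearity enters, it helps to write out the power cost, its marginal cost G'(x) = lambda * sign(x) * |x|^(q-1), and the inverse marginal cost (G')^{-1}(y) = sign(y) * (|y|/lambda)^(1/(q-1)). Only at q = 2 is the inverse linear in y. The sketch below checks both facts numerically for q > 1. The function names are illustrative, and the reading that the trading rate is obtained through the inverse marginal cost is an assumption about the model, not a statement from the source.

```python
import numpy as np

def power_cost(x, lam, q):
    """G_q(x) = lam * |x|^q / q."""
    return lam * np.abs(x) ** q / q

def marginal_cost(x, lam, q):
    """G_q'(x) = lam * sign(x) * |x|^(q-1), valid for q > 1."""
    return lam * np.sign(x) * np.abs(x) ** (q - 1.0)

def inverse_marginal_cost(y, lam, q):
    """(G_q')^{-1}(y) = sign(y) * (|y| / lam)^(1 / (q-1)), valid for q > 1."""
    return np.sign(y) * (np.abs(y) / lam) ** (1.0 / (q - 1.0))

lam = 0.05                                  # illustrative value only
x = np.linspace(-2.0, 2.0, 9)

# The inverse undoes the marginal cost for several exponents q.
roundtrip_err = max(
    float(np.max(np.abs(inverse_marginal_cost(marginal_cost(x, lam, q), lam, q) - x)))
    for q in (1.5, 2.0, 3.0)
)

# At q = 2 the inverse marginal cost is the linear map y -> y / lam.
y = np.linspace(-0.1, 0.1, 5)
linear_at_q2 = bool(np.allclose(inverse_marginal_cost(y, lam, 2.0), y / lam))
print(roundtrip_err, linear_at_q2)
```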

Why It Matters

This connection simultaneously solves three problems:

Numerical: the coupled BSDE solver fails on the raw equilibrium FBSDE due to zero-diffusion and Z^2 nonlinearity. The control variate provides a warm start that eliminates initialization failure and reduces the learning burden to a small residual.
Theoretical: the direct cost minimisation approach works numerically but sacrifices the epsilon-Nash guarantee. Recovering the BSDE formulation (with control variates to make it converge) restores the certificate: epsilon is bounded by the BSDE loss at each DFP stage.
Extensibility: for the N > 2 case with non-quadratic costs (the open problem in Shelley’s thesis), the quadratic-cost equilibrium could serve as a general-purpose control variate. The perturbation parameter (not to be confused with the epsilon of the epsilon-Nash gap) is the deviation from quadratic costs (or from the known N=2 solution). The network learns the correction, and the BSDE loss certifies the result.

Potential Directions

Implement for LQ case first: use the exact Riccati solution P(t) as the control variate (not an approximation). The residual the network must learn is then zero, so the coupled BSDE solver should converge for N=2 quadratic costs, verifying the pipeline before moving to harder cases.
Perturbation in lambda: for the full equilibrium with small costs, expand around the frictionless equilibrium (lambda=0). The zeroth-order term is the Merton solution; the first-order correction is the sqrt(lambda) term from Shelley. Use these as (Y^{AE,0}, Z^{AE,0}) and (Y^{AE,1}, Z^{AE,1}).
Perturbation in cost specification: for non-quadratic costs (Gonon et al.), expand around the quadratic-cost equilibrium. The perturbation parameter is the deviation from q=2 in the power cost family G_q(x) = lambda|x|^q/q. The zeroth-order is the known quadratic equilibrium; the network learns the nonlinear correction.
Perturbation in N: for N > 2 agents, expand around the known N=2 solution. The third agent introduces a perturbation to the 2-agent equilibrium. This could provide control variates for the deep BSDE even when no closed-form exists for N=3.
Epsilon-Nash monitoring in DFP: at each DFP stage, log the BSDE loss as a proxy for the epsilon-Nash gap. Plot this across stages to verify convergence and bound the equilibrium quality.
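The monitoring step in the last item could look like the following sketch. It turns per-stage, per-agent BSDE losses into a proxy for the bound epsilon <= C * (loss + mesh size). The constant C is not known in practice and is set to 1 here. Taking the worst loss across agents at each stage is an assumption about how to aggregate, and the loss values are made-up placeholders to exercise the function, not results.

```python
import numpy as np

def certificate_proxy(stage_losses, mesh_size, C=1.0):
    """Per-stage proxy C * (worst agent loss + mesh size); C is a hypothetical constant."""
    losses = np.asarray(stage_losses, dtype=float)   # shape: (stages, agents)
    worst = losses.max(axis=1)                       # worst sub-problem per stage
    return C * (worst + mesh_size)

# Placeholder losses for 3 DFP stages and 2 agents (illustrative only).
stage_losses = [[4e-2, 5e-2], [9e-3, 1.2e-2], [2e-3, 3e-3]]
proxy = certificate_proxy(stage_losses, mesh_size=1.0 / 50)

for k, eps in enumerate(proxy, start=1):
    print(f"DFP stage {k}: epsilon proxy = {eps:.4f}")
```

Because the mesh size enters additively, the proxy cannot fall below C * mesh_size however small the losses become. A plateau in the plot therefore points to the time grid rather than the networks.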

Evidence

Naito et al. (2025): demonstrates order-of-magnitude improvement in coupled FBSDE solvers using asymptotic control variates, and eliminates initialization instability (failure rate 90% → 0%). Example 3 is a financial portfolio optimization FBSDE with the same structural challenges (fully coupled through Z).
Shelley (2023): provides the explicit sqrt(lambda) asymptotic expansion for the equilibrium return, portfolio, and trading rate. The expansion coefficients are known in closed form for N=2 with quadratic costs.
Han-Hu-Long (2021): Corollary 1 gives the epsilon-Nash guarantee. Theorem 3 bounds the total error by the sum of decoupling error (geometric convergence) and BSDE approximation error (bounded by the loss). The key equation: epsilon <= C * (loss at each stage + mesh size).
Gonon et al. (2020): demonstrates that different cost specifications yield similar equilibria when calibrated to trading volume (Section 5), providing empirical justification for using the quadratic-cost solution as a control variate for non-quadratic costs.
Numerical evidence from Phase 2 of the implementation project: the coupled BSDE solver (Ji et al. Algorithm 2) converges to 0.12% error on a coupled LQ test problem, but fails (40% error) on the transaction cost FBSDE without control variates.
