Multi-period portfolio optimisation: beyond the single-period MPT framework

Mean-variance optimisation, as formulated by Markowitz (1952), solves a single-period problem: given today's expected returns and covariances, find the optimal weights for the next period. Real investing is a sequence of these decisions made over a long horizon, where the optimal action today depends on what is expected to happen later. Multi-period portfolio optimisation extends the framework to handle the sequence directly.

What multi-period optimisation is

Multi-period optimisation treats the portfolio decision as a sequence of allocations indexed by time, with the objective being the investor's wealth or utility at the end of a finite horizon (or in some formulations, an infinite horizon with a discount factor). The decision rule at each point in time can depend on the current state of the portfolio, expectations about future returns, and any relevant predictive variables.

The framework was developed by Samuelson (1969) in discrete time and by Merton (1969, 1971) in continuous time. Their result, under specific assumptions (constant relative risk aversion, independent and identically distributed log-normal returns, no transaction costs), is that the optimal portfolio is the same in every period: the multi-period problem reduces to a sequence of identical single-period problems. This is the so-called myopic property, and it is the theoretical justification for using single-period MPT as an approximation to the multi-period problem in practice.
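As a worked illustration of the myopic property, the well-known closed-form solution to Merton's continuous-time problem can be written out. The symbols are introduced here for illustration and do not appear elsewhere in this article: μ is the expected return of the risky asset, r the risk-free rate, σ its volatility, γ the coefficient of relative risk aversion, and Σ the covariance matrix in the multi-asset case.

```latex
w^{*} = \frac{\mu - r}{\gamma\,\sigma^{2}}
\quad\text{(one risky asset)},
\qquad
\mathbf{w}^{*} = \frac{1}{\gamma}\,\Sigma^{-1}\left(\boldsymbol{\mu} - r\,\mathbf{1}\right)
\quad\text{(several risky assets)}.
```

Neither expression contains the date, the remaining horizon, or current wealth, which is why the same weights are optimal in every period under these assumptions.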

The myopic property breaks down when the assumptions break down. With transaction costs, the optimal portfolio depends on the current portfolio (because rebalancing is costly) and on expectations about future trades. With predictable returns, the optimal portfolio depends on the current state of the predictive variables. With horizon-dependent risk aversion or non-log-normal return distributions, the multi-period and single-period solutions diverge.
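To see why trading costs tie the target portfolio to the current holdings, consider a minimal single-period sketch with a quadratic trading penalty. The penalty form, the parameter names (gamma, kappa) and all numbers are illustrative assumptions; real cost functions are often proportional rather than quadratic, and this is not a full multi-period solution.

```python
import numpy as np

def mv_weights_with_quadratic_cost(mu, sigma, w_prev, gamma=5.0, kappa=0.0):
    """Unconstrained maximiser of
        mu'w - (gamma/2) w' sigma w - (kappa/2) ||w - w_prev||^2.
    The first-order condition gives
        (gamma*sigma + kappa*I) w = mu + kappa*w_prev.
    kappa is a hypothetical cost parameter; kappa=0 means costless trading.
    """
    n = len(mu)
    lhs = gamma * sigma + kappa * np.eye(n)
    rhs = mu + kappa * w_prev
    return np.linalg.solve(lhs, rhs)

# Illustrative inputs (assumed, not estimated from data).
mu = np.array([0.05, 0.03])
sigma = np.array([[0.04, 0.006],
                  [0.006, 0.01]])
w_prev = np.array([0.10, 0.80])   # current holdings

w_free = mv_weights_with_quadratic_cost(mu, sigma, w_prev, kappa=0.0)
w_costly = mv_weights_with_quadratic_cost(mu, sigma, w_prev, kappa=1.0)

print("costless target:", w_free)
print("with trading cost:", w_costly)
```

With kappa set to zero the answer ignores w_prev entirely; with a positive kappa the solution stays closer to the current holdings, so two investors with identical forecasts but different starting portfolios would trade to different weights.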

How it works

The standard multi-period framework casts the decision as a stochastic dynamic programming problem. At each time t, the investor chooses weights wₜ to maximise expected utility at the terminal horizon, subject to the wealth dynamics, the transaction cost function, and any constraints. The solution is a sequence of decision rules wₜ(state), where the state captures whatever information is relevant to the decision: portfolio composition, levels of predictive variables, and time remaining to the horizon.
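The backward-induction idea can be sketched on a deliberately small problem. The setup below is an assumption made for illustration: one risky and one risk-free asset, power (CRRA) utility of terminal wealth, no transaction costs, and a hypothetical two-regime predictor that shifts the risky asset's return distribution. Because CRRA utility scales with wealth, the value function can be written as c_t[s] * W**(1-GAMMA) / (1-GAMMA), so only the regime s needs to be tracked as state.

```python
import numpy as np

GAMMA = 4.0   # relative risk aversion (assumed)
RF = 1.01     # gross risk-free return per period (assumed)
T = 5         # number of decision periods (assumed)

# Hypothetical regime transition matrix: row s is P(next regime | regime s).
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
# Gross risky returns (up, down), equally likely, in each regime (assumed).
R = np.array([[1.25, 0.90],    # regime 0: higher expected return
              [1.15, 0.88]])   # regime 1: lower expected return
W_GRID = np.linspace(0.0, 1.0, 101)   # candidate risky weights, long-only

def solve(n_periods=T):
    """Backward induction; returns policy[t, s] = optimal risky weight."""
    c_next = np.ones(2)                  # terminal condition: c_T[s] = 1
    policy = np.zeros((n_periods, 2))
    for t in reversed(range(n_periods)):
        c_now = np.zeros(2)
        for s in range(2):
            # Portfolio gross return for every (weight, outcome) pair.
            rp = RF + W_GRID[:, None] * (R[s][None, :] - RF)
            # Return shocks are drawn independently of the regime transition,
            # so the continuation term factors out of the expectation.
            cont = P[s] @ c_next
            scaled = cont * np.mean(rp ** (1.0 - GAMMA), axis=1)
            value = scaled / (1.0 - GAMMA)   # value per unit of W**(1-GAMMA)
            best = int(np.argmax(value))
            policy[t, s] = W_GRID[best]
            c_now[s] = scaled[best]
        c_next = c_now
    return policy

policy = solve()
print(policy)   # one row per period, one column per regime
```

In this toy setup the rule depends on the regime but not on the time remaining, because the return shocks are independent of the regime transitions. Correlating the two, or adding transaction costs, would generally make the rule horizon-dependent and would require tracking more state.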

For realistic specifications, the dynamic programming problem cannot be solved analytically and must be approximated numerically. Approximate dynamic programming, scenario-tree methods, and reinforcement learning have all been applied to multi-period portfolio problems, with each approach making different trade-offs between computational tractability and modelling fidelity.

A practical alternative to full multi-period optimisation is rolling-window single-period optimisation: solve the single-period problem at each rebalancing date using current inputs, and let the sequence of single-period solutions approximate the multi-period optimum. This is what most institutional and retail portfolio construction does in practice. It is theoretically suboptimal in the presence of meaningful transaction costs or return predictability but is materially simpler and is the standard approach.
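The rolling approach can be sketched in a few lines. This is a minimal illustration, not a description of any particular platform's method: it uses global minimum-variance weights as a stand-in for whatever single-period objective is preferred, a plain sample covariance, and hypothetical parameter names (window, step). The returns are synthetic.

```python
import numpy as np

def rolling_min_variance(returns, window=60, step=1):
    """returns: (T, N) array of periodic asset returns.
    At each rebalancing index t, estimate the covariance from the trailing
    `window` observations (data available before t only) and compute the
    fully invested minimum-variance weights  w = S^-1 1 / (1' S^-1 1).
    Returns a list of (t, weights) pairs.
    """
    n_obs, n_assets = returns.shape
    ones = np.ones(n_assets)
    results = []
    for t in range(window, n_obs + 1, step):
        sample = returns[t - window:t]
        cov = np.cov(sample, rowvar=False)   # columns are assets
        raw = np.linalg.solve(cov, ones)
        results.append((t, raw / raw.sum()))
    return results

# Synthetic example data (assumed): 120 periods, 3 assets.
rng = np.random.default_rng(0)
rets = rng.normal(0.005, 0.04, size=(120, 3))

schedule = rolling_min_variance(rets, window=60, step=12)
for t, w in schedule:
    print(t, np.round(w, 3))
```

Each solve uses only the inputs available at that date and ignores both the previous weights and any future rebalancing, which is precisely what makes the approach simple and, in the presence of costs or predictability, theoretically suboptimal.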

What the evidence shows

Empirically, the gap between full multi-period optimisation and rolling single-period optimisation is small in most settings. Detemple, Garcia, and Rindisbacher (2003), Brandt et al. (2005), and others have shown that for typical investor horizons (10–30 years) and typical asset universes, the welfare gain from solving the full multi-period problem versus the rolling single-period problem is on the order of basis points per year. The gain is largest in settings with substantial transaction costs (where the single-period solution rebalances too aggressively) or strong return predictability (where the single-period solution ignores the information in predictive variables).

The strongest case for genuine multi-period optimisation is in institutional contexts where transaction costs are large, the asset universe is large, and there are well-identified return-predictive variables. For retail multi-asset portfolios held in liquid ETFs and futures, the gain from multi-period optimisation over rolling single-period optimisation is typically too small to justify the additional complexity.

Robustness considerations also work against full multi-period optimisation. The single-period problem requires inputs that are difficult to estimate; the multi-period problem requires those inputs plus additional dynamic structure. Each additional input introduces estimation error, which compounds across periods. Robust single-period optimisation often outperforms theoretically optimal multi-period optimisation out-of-sample, because the single-period problem has fewer inputs to mis-specify.

Limitations and trade-offs

Multi-period optimisation requires modelling assumptions about the dynamics of expected returns, the covariance structure, and any predictive variables. Each assumption can be wrong, and errors in dynamic models compound over periods in ways that single-period errors do not. The framework is most useful when the modelling assumptions are well-supported by data and when the cost of mis-specification is bounded.

The framework also requires specifying the horizon explicitly, which is a non-trivial choice. An investor saving for retirement at 65 has one horizon; an investor managing a multi-generational family balance sheet has a much longer one. The optimal sequence of decisions depends on the choice, and different horizons can produce materially different recommendations.

Computational tractability is a binding constraint for realistic specifications. The dimensionality of the state space grows with the number of assets and the number of predictive variables, and brute-force dynamic programming becomes intractable beyond a small problem size. Approximate methods exist but introduce their own modelling choices.
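A rough count shows how quickly a discretised state space grows. The counting rule below is an assumption made for illustration: one grid dimension per free portfolio weight (the weights sum to one, so one fewer than the number of assets), one per predictive variable, and a hypothetical 20 grid points per dimension.

```python
def grid_states(n_assets, n_predictors, points_per_dim=20):
    """Number of grid nodes per time step under the counting rule above."""
    dims = (n_assets - 1) + n_predictors
    return points_per_dim ** dims

for n_assets, n_predictors in [(2, 1), (5, 2), (10, 2)]:
    print(n_assets, n_predictors, grid_states(n_assets, n_predictors))
```

Every node requires its own optimisation at every time step, so the work is this count multiplied by the number of periods; this is the curse of dimensionality that motivates approximate methods.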

Multi-period optimisation in pfolio

pfolio's portfolio optimiser is single-period: at each rebalancing it computes weights based on the current covariance estimate without explicitly modelling future paths. The platform's monthly rebalancing creates a sequence of single-period optimisations rather than a true multi-period solution. The construction methodology is documented in "How we build portfolios".

Disclaimer
This article constitutes advertising within the meaning of Art. 68 FinSA and is for informational purposes only. It does not constitute investment advice. Investments involve risks, including the potential loss of capital.
