Details

Parametric Approximation of Forecasts

Mosqlimate forecasts may provide quantiles corresponding to the median and to the bounds of the 50%, 80%, 90%, and 95% prediction intervals. To obtain a full predictive distribution from these quantiles, we fit a log-normal distribution using quantile-matching estimation.

Let

$$p = (p_1, \dots, p_n)$$

be the cumulative probability levels associated with the available quantiles, and let

$$q = (q_1, \dots, q_n)$$

be the corresponding forecasted values. Our goal is to estimate the parameters $(\mu, \sigma)$ of a log-normal distribution whose theoretical quantiles best match the forecasted quantiles.
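
As an illustration, the cumulative probability levels can be derived from the interval coverages. The sketch below assumes the prediction intervals are central (equal-tailed), so that an interval with coverage c spans the (1 - c)/2 and (1 + c)/2 quantiles; this is an assumption, and the function name `interval_levels` is hypothetical rather than part of any Mosqlimate API.

```python
def interval_levels(coverages=(0.50, 0.80, 0.90, 0.95)):
    """Return sorted cumulative probability levels for the median and
    the bounds of central prediction intervals (assumed equal-tailed)."""
    levels = {0.5}  # the median
    for c in coverages:
        # Rounding removes floating-point noise such as 0.09999999999999998
        levels.add(round((1 - c) / 2, 10))  # level of the lower bound
        levels.add(round((1 + c) / 2, 10))  # level of the upper bound
    return sorted(levels)


p = interval_levels()
print(p)
```

With the four coverages listed above plus the median, this yields nine probability levels.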

Assume that

$$X \sim \text{LogNormal}(\mu, \sigma^2).$$

Then

$$\log X \sim \mathcal{N}(\mu, \sigma^2).$$

For a given probability level $p_i$, the corresponding theoretical quantile satisfies

$$q_i = \exp(\mu + \sigma z_i),$$

where

$$z_i = \Phi^{-1}(p_i)$$

and $\Phi^{-1}$ denotes the inverse cumulative distribution function (quantile function) of the standard normal distribution.
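
The quantile relation can be checked numerically. The minimal sketch below computes a log-normal quantile as exp(mu + sigma * z), with z the standard normal quantile at level p, and then confirms that the log-normal cumulative probability at that quantile returns p. The parameter values and the function name `lognormal_quantile` are illustrative assumptions, not values or functions from the platform.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def lognormal_quantile(p, mu, sigma):
    """Quantile of a log-normal distribution: exp(mu + sigma * z)."""
    z = STD_NORMAL.inv_cdf(p)  # z = Phi^{-1}(p)
    return exp(mu + sigma * z)


mu, sigma = 3.0, 0.5  # illustrative values only
for level in (0.05, 0.5, 0.95):
    q = lognormal_quantile(level, mu, sigma)
    # P(X <= q) = Phi((log q - mu) / sigma) should recover the level
    recovered = STD_NORMAL.cdf((log(q) - mu) / sigma)
    print(f"p={level:.2f}  q={q:.3f}  recovered p={recovered:.4f}")
```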

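Taking logarithms yields

$$\log q_i = \mu + \sigma z_i.$$

Therefore, estimating a log-normal distribution from a set of forecasted quantiles can be formulated as a simple linear regression problem:

$$y_i = \beta_0 + \beta_1 z_i + \varepsilon_i,$$

where:

  • $y_i = \log q_i$ is the response variable;
  • $z_i = \Phi^{-1}(p_i)$ is the predictor;
  • the intercept $\beta_0$ estimates $\mu$;
  • the slope $\beta_1$ estimates $\sigma$.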

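The parameters can then be estimated using ordinary least squares. Let

$$y_i = \log q_i, \qquad z_i = \Phi^{-1}(p_i),$$

and define the sample means

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i,$$

where $n$ denotes the number of available quantiles. The least-squares estimates are

$$\hat{\sigma} = \frac{\sum_{i=1}^{n}(z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n}(z_i - \bar{z})^2}$$

and

$$\hat{\mu} = \bar{y} - \hat{\sigma}\,\bar{z}.$$

Substituting $y_i = \log q_i$ gives

$$\hat{\sigma} = \frac{\sum_{i=1}^{n}(z_i - \bar{z})\left(\log q_i - \frac{1}{n}\sum_{j=1}^{n}\log q_j\right)}{\sum_{i=1}^{n}(z_i - \bar{z})^2}$$

and

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n}\log q_i - \hat{\sigma}\,\bar{z}.$$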
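
The least-squares fit can be sketched in a few lines of standard-library Python. The example regresses the log-quantiles on the standard normal quantiles of their probability levels and returns the intercept and slope as estimates of the log-normal parameters. The function name `fit_lognormal`, the probability levels (which assume central intervals), and the synthetic parameter values are all illustrative assumptions, not part of the Mosqlimate code base.

```python
from math import exp, log
from statistics import NormalDist, fmean


def fit_lognormal(p, q):
    """Estimate (mu, sigma) by regressing log(q_i) on z_i = Phi^{-1}(p_i)."""
    if len(p) != len(q) or len(p) < 2:
        raise ValueError("need at least two matching (p, q) pairs")
    if any(v <= 0 for v in q):
        raise ValueError("all quantiles must be positive to take logarithms")
    std_normal = NormalDist()
    z = [std_normal.inv_cdf(pi) for pi in p]  # predictor
    y = [log(qi) for qi in q]                 # response
    z_bar, y_bar = fmean(z), fmean(y)
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    sigma_hat = s_zy / s_zz                   # slope estimates sigma
    mu_hat = y_bar - sigma_hat * z_bar        # intercept estimates mu
    return mu_hat, sigma_hat


# Synthetic check: quantiles generated from a known log-normal are recovered.
p = [0.025, 0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.975]
true_mu, true_sigma = 4.0, 0.6
q = [exp(true_mu + true_sigma * NormalDist().inv_cdf(pi)) for pi in p]
mu_hat, sigma_hat = fit_lognormal(p, q)
print(mu_hat, sigma_hat)
```

Because the fit works on logarithms, the sketch rejects quantiles that are zero or negative; how such forecasts should be handled is left open here.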

Before IMDC 2025

Prior to IMDC 2025, forecasts provided only three quantities: the median prediction (pred) and the lower and upper bounds of the 90% prediction interval. The log-normal distribution was therefore reconstructed using the procedure described below.

A numerical optimization method is applied to determine the mean ($\mu$) and variance ($\sigma^2$) parameters of a log-normal distribution, both defined on the log scale, from the predictions recorded on the platform, so that scoring metrics can be computed. The method uses the median (pred) and the lower and upper bounds of the 90% prediction interval.

For cases where , the optimization problem is formulated as:

In this formulation, the median and the 90% upper bound of a log-normal distribution with parameters $\mu$ and $\sigma$ are matched against the forecast median recorded on the platform and the 90% upper bound of the submitted forecast, respectively.
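
When the forecast median is strictly positive and the upper bound exceeds it, both targets can be matched exactly, so any loss that vanishes only at an exact match is minimized by a closed-form solution. The sketch below computes that solution as a reference point for checking a numerical optimizer. It is not the platform's optimization routine: the function name `lognormal_from_median_upper` is hypothetical, and the code assumes a central 90% interval, so that the upper bound is the 0.95 quantile.

```python
from math import log
from statistics import NormalDist


def lognormal_from_median_upper(median, upper_90):
    """Closed-form (mu, sigma) matching a positive median and the upper
    bound of a central 90% interval (taken as the 0.95 quantile)."""
    if not 0 < median < upper_90:
        raise ValueError("requires 0 < median < upper_90")
    z_95 = NormalDist().inv_cdf(0.95)
    mu = log(median)                     # the log-normal median is exp(mu)
    sigma = (log(upper_90) - mu) / z_95  # solves log(upper) = mu + sigma * z
    return mu, sigma


# Illustrative values only, not taken from any submitted forecast.
mu, sigma = lognormal_from_median_upper(median=100.0, upper_90=200.0)
print(mu, sigma)
```

A median of zero cannot be matched this way, since the median of a log-normal distribution, exp(mu), is always positive.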

For the specific case where , the optimization problem is defined as: