Parametric Approximation of Forecasts
Mosqlimate forecasts may provide quantiles corresponding to the median and to the bounds of the 50%, 80%, 90%, and 95% prediction intervals. To obtain a full predictive distribution from these quantiles, we fit a log-normal distribution using quantile-matching estimation.
Let

$$p_1, p_2, \dots, p_n$$

be the cumulative probability levels associated with the available quantiles, and let

$$q_1, q_2, \dots, q_n$$

be the corresponding forecasted values. Our goal is to estimate the parameters ($\mu$, $\sigma$) of a log-normal distribution whose theoretical quantiles best match the forecasted quantiles.
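To make the cumulative probability levels concrete, the sketch below converts prediction-interval coverages into lower and upper probability levels. It assumes the intervals are central (equal-tailed), which the text does not state explicitly; the helper name `interval_to_probs` is hypothetical.

```python
def interval_to_probs(coverage):
    """Return (lower, upper) cumulative probabilities of a central interval.

    Assumption: the interval is equal-tailed, so a 90% interval spans
    the 5th to the 95th percentile.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    # Rounding removes floating-point noise such as 0.09999999999999998.
    tail = round((1.0 - coverage) / 2.0, 10)
    return tail, round(1.0 - tail, 10)


# Coverages listed in the text; the median corresponds to p = 0.5.
coverages = [0.50, 0.80, 0.90, 0.95]
probs = sorted({0.5} | {p for c in coverages for p in interval_to_probs(c)})
```

Under this assumption, the median plus four intervals give nine probability levels in total.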
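Assume that

$$X \sim \text{LogNormal}(\mu, \sigma^2).$$

Then

$$\log X \sim \mathcal{N}(\mu, \sigma^2).$$

For a given probability level $p_i$, the corresponding theoretical quantile satisfies

$$q_i = \exp(\mu + \sigma z_i),$$

where

$$z_i = \Phi^{-1}(p_i)$$

and $\Phi^{-1}$ denotes the inverse cumulative distribution function (quantile function) of the standard normal distribution.
Taking logarithms yields

$$\log q_i = \mu + \sigma z_i.$$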
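The relationship between log-normal quantiles and standard normal quantiles can be checked numerically. The sketch below uses illustrative parameter values (not taken from any forecast) and compares the closed-form expression with SciPy's log-normal quantile function, which takes the log-scale standard deviation as `s` and the exponential of the log-scale mean as `scale`.

```python
import numpy as np
from scipy.stats import lognorm, norm

# Illustrative parameters of the log-normal distribution (assumed values).
mu, sigma = 3.0, 0.5

p = np.array([0.05, 0.5, 0.95])   # cumulative probability levels
z = norm.ppf(p)                    # standard normal quantiles

# Closed form: the p-quantile of the log-normal is exp(mu + sigma * z).
q_formula = np.exp(mu + sigma * z)

# Same quantiles from SciPy's parameterization (s = sigma, scale = exp(mu)).
q_scipy = lognorm.ppf(p, s=sigma, scale=np.exp(mu))

assert np.allclose(q_formula, q_scipy)
```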
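Therefore, estimating a log-normal distribution from a set of forecasted quantiles can be formulated as a simple linear regression problem:

$$y_i = \mu + \sigma z_i + \varepsilon_i,$$

where $\varepsilon_i$ is the residual and:
- $y_i = \log q_i$ is the response variable;
- $z_i = \Phi^{-1}(p_i)$ is the predictor;
- the intercept estimates $\mu$;
- the slope estimates $\sigma$.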
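The parameters can then be estimated using ordinary least squares. Let

$$y_i = \log q_i, \qquad z_i = \Phi^{-1}(p_i),$$

and define

$$\bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i,$$

where $n$ denotes the number of available quantiles. The least-squares estimates are

$$\hat{\sigma} = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})^2}$$

and

$$\hat{\mu} = \bar{y} - \hat{\sigma}\,\bar{z}.$$

Substituting $y_i = \log q_i$ and $z_i = \Phi^{-1}(p_i)$ gives

$$\hat{\sigma} = \frac{\sum_{i=1}^{n} \left(\Phi^{-1}(p_i) - \bar{z}\right)\left(\log q_i - \bar{y}\right)}{\sum_{i=1}^{n} \left(\Phi^{-1}(p_i) - \bar{z}\right)^2}$$

and

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} \log q_i - \hat{\sigma}\,\frac{1}{n}\sum_{i=1}^{n} \Phi^{-1}(p_i).$$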
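The quantile-matching fit can be sketched in a few lines of Python. This is a minimal illustration of the least-squares regression of log-quantiles on standard normal quantiles, not the platform's actual implementation; the function name `fit_lognormal_from_quantiles` and the synthetic input values are assumptions made for the example.

```python
import numpy as np
from scipy.stats import norm


def fit_lognormal_from_quantiles(probs, quantiles):
    """Estimate (mu, sigma) of a log-normal by regressing log-quantiles
    on standard normal quantiles with ordinary least squares."""
    p = np.asarray(probs, dtype=float)
    q = np.asarray(quantiles, dtype=float)
    if p.shape != q.shape or p.size < 2:
        raise ValueError("need at least two (probability, quantile) pairs")
    if np.any((p <= 0) | (p >= 1)):
        raise ValueError("probabilities must lie strictly between 0 and 1")
    if np.any(q <= 0):
        raise ValueError("quantiles must be positive to take logarithms")

    z = norm.ppf(p)   # predictor: standard normal quantiles
    y = np.log(q)     # response: log of the forecasted quantiles

    z_bar, y_bar = z.mean(), y.mean()
    # Slope estimates sigma, intercept estimates mu.
    sigma_hat = np.sum((z - z_bar) * (y - y_bar)) / np.sum((z - z_bar) ** 2)
    mu_hat = y_bar - sigma_hat * z_bar
    return mu_hat, sigma_hat


# Synthetic check: quantiles generated from a known log-normal
# (mu = 3.0, sigma = 0.5 are illustrative values) should be recovered.
probs = [0.025, 0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95, 0.975]
true_mu, true_sigma = 3.0, 0.5
quantiles = np.exp(true_mu + true_sigma * norm.ppf(probs))

mu_hat, sigma_hat = fit_lognormal_from_quantiles(probs, quantiles)
```

With real forecasts the quantiles will not lie exactly on a log-normal curve, so the fit is a least-squares compromise across all available probability levels rather than an exact match.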
Before IMDC 2025
Prior to IMDC 2025, forecasts provided only three quantities: the median prediction (pred) and the lower and upper bounds of the 90% prediction interval. The log-normal distribution was therefore reconstructed using the procedure described below.
A numerical optimization method is applied to determine the log-scale mean ($\mu$) and standard deviation ($\sigma$) of a log-normal distribution based on the predictions recorded on the platform, for the purpose of computing scoring metrics. The method uses the median together with the lower and upper bounds of the 90% prediction interval.
For cases where , the optimization problem is formulated as:
Here the objective compares the median and the 90% upper bound of a log-normal distribution with parameters $\mu$ and $\sigma$ against their forecast counterparts: the forecast median recorded on the platform and the 90% upper bound of the submitted forecast.
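As a rough illustration of this kind of two-quantile fit, the sketch below searches numerically for $\mu$ and $\sigma$ so that the log-normal median and 90% upper bound match the forecast values. The loss (a sum of squared relative errors), the Nelder-Mead optimizer, the starting point, and the function name `fit_lognormal_median_upper` are all assumptions chosen for the example, not the platform's actual objective. The sketch also assumes a strictly positive median and an upper bound above the median.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def fit_lognormal_median_upper(med, upper, alpha=0.90):
    """Numerically match the median and the upper bound of a central
    alpha-level interval of a log-normal distribution.

    Assumed loss: sum of squared relative errors on the two quantiles.
    """
    if med <= 0 or upper <= med:
        raise ValueError("this sketch requires 0 < med < upper")
    z_upper = norm.ppf((1 + alpha) / 2)   # 0.95 level for a 90% interval

    def loss(theta):
        mu, log_sigma = theta
        sigma = np.exp(log_sigma)          # keeps sigma positive
        q_med = np.exp(mu)                 # log-normal median
        q_up = np.exp(mu + sigma * z_upper)
        return ((q_med - med) / med) ** 2 + ((q_up - upper) / upper) ** 2

    result = minimize(
        loss,
        x0=[np.log(med), np.log(0.5)],     # assumed starting point
        method="Nelder-Mead",
        options={"xatol": 1e-8, "fatol": 1e-12, "maxiter": 2000},
    )
    mu_opt, log_sigma_opt = result.x
    return mu_opt, np.exp(log_sigma_opt)


# Illustrative forecast values (not real data).
mu_fit, sigma_fit = fit_lognormal_median_upper(med=100.0, upper=200.0)
```

With only two quantiles and two parameters, the match is exact in principle, so the optimizer should land close to $\mu = \log(\text{median})$ and $\sigma = (\log(\text{upper}) - \log(\text{median})) / \Phi^{-1}(0.95)$.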
For the specific case where , the optimization problem is defined as: