Consider two forecasters, each making a single prediction for a sequence of events over time. We ask a relatively basic question: how might we compare these forecasters, either online or post-hoc, while avoiding unverifiable assumptions on how the forecasts and outcomes were generated? In this paper, we present a rigorous answer to this question by designing novel sequential inference procedures for estimating the time-varying difference in forecast scores. To do this, we employ confidence sequences (CS), which are sequences of confidence intervals that can be continuously monitored and are valid at arbitrary data-dependent stopping times ("anytime-valid"). The widths of our CSs are adaptive to the underlying variance of the score differenc...
Scoring rules measure the deviation between a probabilistic forecast and reality. Strictly proper sc...
introduced a new, easy-to-calculate economic skill score for use in yes/no forecast decisions, of w...
Forecasting and forecast evaluation are inherently sequential tasks. Predictions are often issued on...
Probability forecasts for binary events play a central role in many applications. Their quality is c...
Can we forecast the probability of an arbitrary sequence of events happening so that the stated prob...
Building on the game theoretic framework for probability, we show that it is possible, using randomi...
This note gives an easily verified necessary and sufficient condition for one probability forecaster...
Predictions about the future are commonly evaluated through statistical tests. As shown by recent li...
We study the problem of making calibrated probabilistic forecasts for a binary sequence generated by...
In any dataset with individual forecasts of economic variables, some forecasters will perform bette...
Consider a forecaster who observes a sequence of data on-line and after each new observation makes a...
We propose simple randomized strategies for sequential decision (or prediction) under imperfect moni...
Scoring rules measure the deviation between a forecast, which assigns degrees of confidence to vario...
We study the problem of designing consistent sequential two-sample tests in a nonparametric setting....