Like most endeavours in life, the act of verifying forecasts must have a purpose. Allan Murphy, who made his career
in the mathematics and science of weather forecast verification, put it this way: "Verification activities are useful
only if they lead to some decision regarding the product being verified". This means that someone must actually look at
the results of the verification and make a decision based on these results. The decision may be to "do nothing for now" (the
product is good enough as it stands until more verification information is available), but that is still a decision.
It also implies there must be an interested "user" of verification results who will make the decision. The user may or may
not be the same person who carries out the verification.
There are many different kinds of users of verification and therefore many different purposes. It is, or should be, the user
who defines the purpose of the verification. The purpose of the verification should be clearly stated in advance so that
appropriate verification methods can be chosen. The purposes of verification can be classified into two general types:
Administrative verification: To support decisions about the administration or budgeting of weather forecast services,
for example to justify a new computer. Administrative verification usually means calculating verification statistics
over large data samples.
Scientific verification: To direct research into new or improved forecast products. This may involve large or small data
samples depending on the exact purpose, and usually involves more exploratory statistical analysis of verification datasets.
Below are some examples of verification tasks. Whether you are potentially a user of verification results, or someone who
is interested in doing your own verification, or both, try to put yourself in the position of someone who is asked to do
the verification tasks listed on the left, and think about what you would want to know about the reasons for doing these tasks.
Are the following verification tasks likely intended for administrative or
scientific purposes or both?
Please choose the best answer
Not likely, since this is only one case, which is too little data to support a decision.
Yes. The forecaster may be checking how well he (or someone else) performed on yesterday’s forecast,
as input to today’s forecast.
Yes. Here the user is trying to track the long-term trend in temperature forecast errors, for example
to convince his superiors that improvements are being made.
Not likely. The verification is summarized into one value per year, which is too much averaging to be of use in
determining how to further improve the forecast.
Yes, possibly. The user may want to know whether the forecasters can still "beat" the model to decide
staffing levels in weather offices. For precipitation, one would be wise to use more than 3 months of data.
Yes, possibly. The user may wish to identify those situations or periods where forecasters have the best
chance of improving on the model forecasts.
Yes. There could well be both administrative and scientific reasons for comparing forecasts from two sources.
Yes, definitely. This is a request that has been made by high-level management in more than one national
meteorological service, usually for feedback to the general public or political leaders.
No. It is difficult to think of any defensible scientific reason for verification scores which try to summarize
all aspects of accuracy into one number.
Yes, possibly. This could be at the request of either weather service managers or boating groups to determine the
reliability of forecasts of extreme conditions.
Yes, possibly. Stratification of forecast datasets into extreme and non-extreme values using a threshold is a way
of determining the quality of a product for important situations to help direct research efforts.
Yes. There could be administrative or scientific reasons for defining threshold values of the predicted variable
for the purpose of verification. The threshold should be specified by the user. Check the other answers for examples.
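
As a rough illustration of the threshold stratification mentioned in the answers above, the sketch below splits forecast/observation pairs into "extreme" and "non-extreme" subsets and scores each one separately. Everything in it is an assumption made for the example: the wind speed values are invented, the 34-knot threshold is only a stand-in for whatever value the user specifies, mean absolute error is just one possible score, and the function names are hypothetical.

```python
from statistics import mean


def mean_absolute_error(pairs):
    """Mean absolute error over (forecast, observed) pairs; None if empty."""
    if not pairs:
        return None
    return mean(abs(forecast - observed) for forecast, observed in pairs)


def stratify_by_threshold(pairs, threshold):
    """Split pairs by whether the OBSERVED value reaches the threshold."""
    extreme = [(f, o) for f, o in pairs if o >= threshold]
    non_extreme = [(f, o) for f, o in pairs if o < threshold]
    return extreme, non_extreme


# Invented (forecast, observed) wind speeds in knots, for illustration only.
pairs = [(10, 12), (15, 14), (30, 38), (20, 18), (35, 45), (8, 8)]

# Illustrative threshold; in practice the user of the verification sets it.
THRESHOLD_KT = 34

extreme, non_extreme = stratify_by_threshold(pairs, THRESHOLD_KT)
mae_extreme = mean_absolute_error(extreme)
mae_non_extreme = mean_absolute_error(non_extreme)

print(f"extreme cases: n={len(extreme)}, MAE={mae_extreme} kt")
print(f"non-extreme cases: n={len(non_extreme)}, MAE={mae_non_extreme} kt")
```

Stratifying on the observed value is one choice among several; stratifying on the forecast value answers a different question (how good are the forecasts when extremes are predicted), so the choice should follow from the stated purpose of the verification.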