## Heteroscedastic Extended Logistic Regression for Postprocessing of Ensemble Guidance [Monthly Weather Review](Monthly Weather Review Via Acquire Media NewsEdge) ABSTRACT To achieve well-calibrated probabilistic forecasts, ensemble forecasts are often statistically postprocessed. One recent ensemble-calibration method is extended logistic regression, which extends the popular logistic regression to yield full probability distribution forecasts. Although the purpose of this method is to postprocess ensemble forecasts, usually only the ensemble mean is used as the predictor variable, whereas the ensemble spread is neglected because it does not improve the forecasts. In this study it is shown that when simply used as an ordinary predictor variable in extended logistic regression, the ensemble spread affects the location but not the variance of the predictive distribution. Uncertainty information contained in the ensemble spread is therefore not utilized appropriately. To solve this drawback a new approach is proposed where the ensemble spread is directly used to predict the dispersion of the predictive distribution. With wind speed data and ensemble forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) it is shown that by using this approach, the ensemble spread can be used effectively to improve forecasts from extended logistic regression.
(ProQuest: ... denotes formulae omitted.) 1. Introduction Weather forecasts are very important for many parts of social and economic life. For example, they are used for severe weather warnings, for decision making in ag- riculture and industry, or for planning of leisure activ- ities. Generally these forecasts are based on numerical weather prediction (NWP) models. Unfortunately, be- cause of uncertainties in the initial conditions and un- known or unresolved atmospheric processes these models are always subject to error. Luckily some of these errors are systematic and can be corrected with statistical post- processing, often also referred to as model output statistics (MOS; Glahn and Lowry 1972). However, not all errors can be corrected and for many customers it is important to get additional information about the re- maining forecast uncertainty. For this purpose many forecasting centers provide ensemble forecasts. These are multiple NWP forecasts with slightly perturbed initial conditions and sometimes also different model formu- lations. The idea is that these different forecasts should represent the range of possible outcomes (Lorenz 1996). Large ensemble spreads are then presumably associated with high forecast uncertainties and small spreads that signify low uncertainties. However, in practice the initial ensemble members do not represent initial-condition uncertainty (Hamill et al. 2003; Wang and Bishop 2003). Furthermore ensemble forecasts exhibit the same model errors as single integration forecasts. Thus, to achieve unbiased and calibrated uncertainty forecasts, statistical postprocessing is needed. In the past decade much research has gone into finding appropriate methods to postprocess ensemble forecasts. For example, Roulston and Smith (2003) proposed dressing the ensemble members with historical model errors and Raftery et al. (2005) suggested Bayesian model averaging for this purpose. Gneiting et al. (2005) proposed to use linear regression with error variances depending on the ensemble spread, and for binary pre- dictands Hamill et al. (2004) proposed to use logistic regression. Comparisons of these and other methods (Wilks 2006a; Wilks and Hamill 2007) showed that lo- gistic regression is one of the better approaches. A very promising extension of logistic regression has been pro- posed recently (Wilks 2009). By including the predictand threshold in the regression equations this extended logistic regression allows derivation of full predictive distributions. The extended logistic regression method has been used in several studies for probabilistic pre- cipitation forecasts (Schmeits and Kok 2010; Ruiz and Saulo 2012; Roulin and Vannitsem 2012; Hamill 2012; Ben Bouall^egue 2013; Scheuerer 2013) and was shown to perform very well compared to standard logistic regression (Wilks 2009; Ruiz and Saulo 2012) and other ensemble postprocessing methods (Schmeits and Kok 2010; Ruiz and Saulo 2012; Scheuerer 2013). In all of these studies, extended logistic regression is used to post- process ensemble forecasts, but usually the ensemble mean was used as the only predictor variable. There were also several attempts to additionally include the ensemble spread, but with the exception of Hamill (2012) it was always disregarded because it did not improve the forecasts. In this study we show that the predictive distribution of the transformed predictand is logistic and that the predictor variables only affect the location (mean) but not the dispersion (variance) of this logistic distribution. So far the ensemble spread was always included as or- dinary predictor variable in extended logistic regression so that its information was only used to predict the lo- cation but not the dispersion of the forecast distribution. However, the ensemble spread is generally expected to mainly contain information about the forecast uncer- tainty, which in turn should be directly related to the dispersion of the predictive distribution. Hence, the un- certainty information contained in the ensemble spread cannot be utilized properly by extended logistic re- gression, so that it is not surprising that no improve- ments could be found. To solve this drawback of extended logistic regres- sion, we therefore propose a simple new approach in which the ensemble spread can be directly used as predictor for the dispersion of the forecast probability distribution. To illustrate our findings and test if im- provements can be achieved with this new approach, we compare different approaches to include the ensemble spread in extended logistic regression on wind speed data from 11 European locations and ensemble forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF). The remainder of the paper is organized as follows: in section 2 we describe the extended logistic regression model and show the problems when including the en- semble spread as ordinary predictor variable. Our new approach is introduced in section 3. Results from the case study are shown in section 4 and a summary and conclusions can be found in section 5. 2. Extended logistic regression Originally, logistic regression is a regression model from the generalized linear model framework (Nelder and Wedderburn 1972) to model the conditional prob- ability of binary events. As such it is also a well-suited MOS method for binary predictands (Hamill et al. 2004). For example, the probability of a continuous variable y to fall below a certain threshold q can be predicted with ... (1) where x is a vector of predictor variables [e.g., NWP forecasts; x 5 (1, x1, x2, ...)T]andb is a vector of re- gression coefficients [b 5 (b0, b1, b2, ...)T]thatisgen- erally estimated with maximum likelihood estimation (see appendix A). The regression function has the same mathematical form as the cumulative distribution func- tion of the standard logistic distribution L, which is in- dicated by the final equality in Eq. (1). Often, more than one threshold is of interest and separate logistic regressions are fitted for each of these thresholds. This approach has the disadvantage that the predicted probabilities are not constrained to be mu- tually consistent. In other words, for two thresholds qa and qb with qa , qb it can occur that P(y , qa j x) . , qb j x ), which would imply nonsense negative probabilities for P(qa # y , qb j x). To avoid these inconsistencies, Wilks (2009) extended logistic regression by including (a transformation of) the thresholds qj as additional predictor variable: P(y,qj jx)5L[ag(qj)1xTb]. (2) Here g(qj) is a nondecreasing function of qj and a is an additional coefficient that has to be estimated. In addi- tion to avoiding negative probabilities, this extended logistic regression has the advantage that fewer co- efficients have to be estimated (instead of different vectors b for each threshold, a and b are the same for all thresholds), which is especially advantageous for small training datasets (Wilks 2009). Furthermore, the probability to fall below any arbitrary value Q can be easily computed by replacing qj with Q: P(y , Qj x) 5L[ag(Q) 1xTb]. (3) Equation (3) can also be interpreted as continuous cu- mulative distribution function, which implies that full continuous probability distributions can be provided. Since g( ) has to be a nondecreasing function, the equation P(y,Q jx) 5P[g(y) ,g(Q) jx] (4) is always fulfilled. With Eq. (4) and some rearrange- ments, Eq. (3) can also be written as ... (5) anduponsettingm 52xTb/a and s 5 1/a we obtain the following: ... (6) This notation allows one to easily see that the condi- tional probability distribution of the transformed pre- dictand g(y) given the predictor variables x is a logistic distribution with location parameter m and scale pa- rameter s. Cumulative distribution functions and proba- bility density functions of this distribution with different scale parameters s are shown in Fig. 1. The shape of the logistic distribution is very similar to that of the normal distribution but with somewhat heavier tails. The mean of this distribution is m and in terms of the scale parameter the variance is s2p2/3 (Johnson et al. 1995). Note that the scale parameter s 5 1/a is constant so that the predictor variables in x only affect the mean, not the variance of the logistic predictive distribution. Hence, when included as additional predictor variable in x, the ensemble spread has no effect on the disper- sion of the predictive distribution. However, usually largeensemblespreadsareassociated with high fore- cast uncertainties, which in turn should be related to wider predictive distributions. In contrast the level of uncertainty should generally have no effect on the lo- cation of the forecast probability distribution. 3. Heteroscedastic extended logistic regression In the previous section we showed that when using the ensemble spread as an ordinary predictor variable in extended logistic regression, uncertainty information is not utilized appropriately. As a more effective ap- proach we therefore propose to use the ensemble spread directly as predictor for the dispersion of the predictive distribution. Therefore, we simply replace m and s in Eq. (6) with m 5 xT g (7) and s 5 exp(zTd) , (8) respectively. Here z is an additional vector of input variables (i.e., the ensemble spread) and g and d are coefficient vectors that have to be estimated. The ex- ponential function is used here as a simple method to ensure positive values for s. Note that with z 51 this model is completely equiva- lent to the original extended logistic regression [Eq. (2)] with a 5 1/exp(d) and b 52g/exp(d). The idea of using the ensemble spread as predictor for the dispersion is not completely new. For Gaussian linear regression models, Gneiting et al. (2005) proposed a similar approach, which has been proven to perform well in several studies (e.g., Wilks 2006a; Wilks and Hamill 2007). 4. Case study In this section, we apply the findings from the pre- vious sections on real data. We use 10-m wind speed observations (mean over last 10 min) from the follow- ing 11 European weather stations: Amsterdam Airport Schiphol in Amsterdam, Netherlands (52.38N, 4.7838E); Berlin Tegel Airport in Berlin, Germany (52.558N, 13.38E); National Airport in Brussels, Belgium (50.98N, 4.5338E); Copenhagen Airport in Copenhagen, Denmark (55.68N, 12.6338E); Frankfurt Main in Frankfurt, Germany (50.0338N, 8.5838E); Heathrow in London, United King- dom (51.4678N, 20.458E); Geof in Lisbon, Portugal (38.7678N,29.1338E);BarajasinMadrid,Spain(40.4678N, 23.558E); Orly in Paris, France (48.7178N, 2.3838E); Fiumicino in Rome, Italy (41.88N, 12.2338E); and Wien- Hohe-Warte in Vienna, Austria (48.2498N, 16.3568E), from April 2010 to December 2012. As NWP forecasts we use ensemble wind speed forecasts bilinearly inter- polated to the instrument location from the ECMWF (Molteni et al. 1996), initialized at 0000 UTC for the lead times 24, 36, 48, and 60 h. Figure 2 shows a clear positive correlation between ensemble spread and forecast error for Wien-Hohe- Warte (similar for most other locations). This positive spread-skill relationship suggests that the ensemble spread contains potentially useful uncertainty informa- tion. To investigate how this information might be used most effectively, we compare different extended logistic regression models. For all models we use the square root function for g( ^ ) 5 ^. This function gave good results for precipi- tation forecasts in several studies (Wilks 2009; Schmeits and Kok 2010; Roulin and Vannitsem 2012; Ruiz and Saulo 2012; Ben Bouall^egue 2013; Scheuerer 2013) and also improves our wind speed forecasts compared to the identity function. As potential regressors we use the ensemble mean M and standard deviation S of the square root-transformed ensemble wind speed forecasts. Furthermore, we selected J 5 9 climatological quantiles with probabilities 1/10, 2/10, ..., 9/10 as thresholds qj for each location separately. Table 1 lists the models that are used in the following. In addition to the extended logistic regression model with the ensemble mean as single predictor variable (XLR) there are four models that use the ensemble standard deviation. The models XLR:S and XLR:SM are standard extended logistic regression models with the ensemble standard deviation as additional predictor variable, either alone (XLR:S) or multiplied with the ensemble mean (XLR:SM). In the heteroscedastic extended logistic re- gression model (HXLR) the ensemble standard deviation is only included as predictor variable for the scale and in HXLR:S it is additionally also used as predictor variable for the location of the predictive distribution. Before reporting the forecast quality of these different models it is interesting to investigate the effect of the ensemble spread on the predicted probability distri- butions. Figure 3 shows predicted probability density functions of the XLR:S and HXLR models for different ensemble standard deviations. For the XLR:S model it can be seen that contrary to the desired effect, larger ensemble standard deviations are related to slightly sharper distributions. In contrast, the HXLR model uses the ensemble standard deviation more appropriately and larger ensemble standard deviations are clearly re- lated to wider distributions. Next we compare the performance of the different models. Since extended logistic regression can provide multicategorical probabilistic forecasts, the ranked prob- ability score (Epstein 1969; Wilks 2006b) is a well-suited measure of forecast quality: ... (9) Here J 5 9 is the number of thresholds and I(^) 5 1 if the argument in brackets is true and 0 if it is not. To get independent training and test datasets we estimate and verify the models with tenfold cross validation. With this cross validation we get one RPS value for each event in the dataset. From these individual RPS values, 250 es- timates of the mean (RPS) are computed on 250 boot- strap samples. This is all done separately for each model, location, and lead time. Since we are mainly interested in improvements that can be achieved with the ensem- ble standard deviation we finally compute skill scores (RPSS) with the standard extended logistic regression model (XLR) as a reference: ... (10) Note that here positive values indicate improvements over the standard extended logistic regression model. Figure 4 shows the RPSS of the different models and lead times aggregated over the 11 locations. It can be seen that including the ensemble standard deviation simply as ordinary predictor variable (XLR:S, XLR:SM) does not improve forecast quality of extended logistic regression. However, the reason is not the absence of predictive information in the ensemble standard deviation since using it with our new approach (HXLR) clearly improves the forecast quality, especially for day time forecasts (36- and 60-h lead time). Since the ensemble standard deviation seems not to contain any predictive information on the location it is also not advantageous to include it additionally as predictor variable for the loca- tion (HXLR:S). The effect of the lead time on the RPSS is only weak but for daytime forecasts (12 and 36 h) the superiority of HXLR is more pronounced. Note that we also tested longer lead times (up to 96 h) and shorter training data lengths (down to 6 months), but results were similar and are therefore not shown. Figure 5 shows the RPSS for selected locations aggre- gated over lead times 24-60 h. While most of the loca- tions show similar patterns as in Fig. 4 (e.g., Amsterdam, Wien) there are also some locations (e.g., Berlin) where including the ensemble spread as ordinary predictor variable (XLR:S, XLR:SM) is superior to heteroscedastic extended logistic regression (HXLR). This suggests that for these locations the ensemble spread also contains predictive information on the location. For nonnegative predictands like wind speed, large observed values are generally related to large ensemble spreads. Therefore, it is indeed conceivable that the ensemble spread con- tains some predictive information on the location that is not yet covered by the ensemble mean. However, addi- tional improvements can be achieved when including the ensemble spread as a predictor for both location and scale of the predictive distribution (HXLR:S). Finally, Fig. 6 shows reliability diagrams for 36-h forecasts of the first climatological decile [P(y , q1 j x)] and the climatological median P(y , q5 j x)forthemodels XLR:S and HXLR. Both models are fairly reliable with only little differences between each other. For the lower decile both models are slightly overforecasting (points below diagonal). The logistic predictive distribution of extended logistic regression involves a point mass at zero (i.e., positive predictive density for negative wind speeds; Schefzik et al. 2013). Because zero wind speeds occur relatively rarely, this might be the reason for the over- estimated probabilities to fall below the lower decile. 5. Summary and conclusions The inclusion of the ensemble spread in extended lo- gistic regression has been shown in several studies not to improve the forecast skill. As we have shown in this paper this is not surprising because when the ensemble spread is included as an ordinary predictor variable it modifies only the location but not the dispersion of the forecast distribution. Uncertainty information contained in the ensemble spread is therefore not used appropri- ately. To solve this problem we proposed a new approach called heteroscedastic extended logistic regression where the ensemble spread is directly used as predictor for the scale of the predictive distribution. To illustrate the advantages of this new approach we used wind speed observations from 11 European locations and ensemble forecasts from ECMWF. Con- sistent with our findings and with results from previous studies, the inclusion of the ensemble standard de- viation as an ordinary predictor variable has no clear positive effects on forecast quality. In contrast, with our new approach the uncertainty information in the en- semble standard deviation is used effectively to achieve clear improvements. An additional single case study with precipitation data showed similar results. We therefore expect that our results can be transferred to other variables and/or lo- cations. However, this still has to be tested. Hamill (2012) got better forecasts when using the en- semble variance multiplied with the ensemble mean as an additional predictor variable. This suggests that in his data, the ensemble spread also contained predictive information on the location of the predictive distribution. Consistent with these findings, we also found individual weather stations where including the ensemble spread as ordinary predictor variable is even superior to het- eroscedastic extended logistic regression. However, further improvements could be achieved when in- cluding the ensemble spread as predictor variable for both location and spread of the predictive distribution. To enhance the flexibility of extended logistic re- gression, Ben Bouall^egue (2013) proposed the use of interaction terms between the threshold and the pre- dictor variables. An interaction term between threshold and ensemble spread could also be used to control the dispersion of the predictive distribution. Contrary to het- eroscedastic extended logistic regression such a model can be easily implemented with standard binary logistic regression software. However, with interaction terms the ensemble spread also has some undesired effects on the distribution location. Extended logistic regression has been shown in sev- eral studies to perform well compared to other en- semble postprocessing algorithms (e.g., Schmeits and Kok 2010; Ruiz and Saulo 2012; Scheuerer 2013). How- ever, a major drawback of this method was that un- certainty information contained in the ensemble spread could not be utilized effectively. Heteroscedastic ex- tended logistic regression is therefore a very attractive extension of extended logistic regression to further en- hance its competitiveness. Acknowledgments. We thank Tom Hamill, Tilmann Gneiting, Constantin Junk, and an anonymous reviewer for their valuable comments that helped to improve this manuscript. This study was supported by the Austrian Science Fund (FWF): L615-N10. The first author was also supported by a Ph.D. scholarship from the University of Innsbruck, Vizerektorat fueuror Forschung. Data from the ECMWF forecasting system were obtained from the ECMWF Data Server. Denotes Open Access content. REFERENCES Ben Bouall^egue, Z., 2013: Calibrated short-range ensemble pre- cipitation forecasts using extended logistic regression with interaction terms. Wea. Forecasting, 28, 515-524. Broeurocker, J., and L. A. Smith, 2007: Increasing the reliability of reliability diagrams. Wea. Forecasting, 22, 651-661. Epstein, E. S., 1969: A scoring system for probability forecasts of ranked categories. J. Appl. Meteor., 8, 985-987. Glahn, H., and D. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11, 1203-1211. Gneiting, T., A. E. Raftery, A. H. Westveld, and T. Goldman, 2005: Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon. Wea. Rev., 133, 1098-1118. Hamill, T. M., 2012: Verification of TIGGE multimodel and ECMWF reforecast-calibrated probabilistic precipitation forecasts over the contiguous United States. Mon. Wea. Rev., 140, 2232- 2252. _____, C. Snyder, and J. S. Whitaker, 2003: Ensemble forecasts and the properties of flow-dependent analysis-error covariance singular vectors. Mon. Wea. Rev., 131, 1741-1758. _____, J. S. Whitaker, and X. Wei, 2004: Ensemble reforecasting: Improving medium-range forecast skill using retrospective forecasts. Mon. Wea. Rev., 132, 1434-1447. Johnson, N. L., S. Kotz, and N. Balakrishnan, 1995: Continuous Univariate Distributions. Vol. 2. Wiley, 752 pp. Lorenz, E., 1996: Predictability: A problem partly solved. Proc. ECMWF Seminar on Predictability, Reading, United Kingdom, ECMWF, 1-18. Messner, J. W., and A. Zeileis, cited 2013: crch: Censored Re- gression with Conditional Heteroscedasticity. R package version 0.1-0. [Available online at http://CRAN.R-project. org/package=crch.] Molteni, F., R. Buizza, T. N. Palmer, and T. Petroliagis, 1996: The ECMWF ensemble prediction system: Methodology and val- idation. Quart. J. Roy. Meteor. Soc., 122, 73-119. Nelder, J., and R. Wedderburn, 1972: Generalized linear models. J. Roy. Stat. Soc., 135A, 370-384. Raftery, A. E., T. Gneiting, F. Balabdaoui, and M. Polakowski, 2005: Using Bayesian model averaging to calibrate forecast ensembles. Mon. Wea. Rev., 133, 1155-1174. R Core Team, cited 2012: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. [Available online at http://www.R-project.org/.] Roulin, E., and S. Vannitsem, 2012: Postprocessing of ensemble precipitation predictions with extended logistic regression based on hindcasts. Mon. Wea. Rev., 140, 874-888. Roulston, M. S., and L. A. Smith, 2003: Combining dynamical and statistical ensembles. Tellus, 55A, 16-30. Ruiz, J. J., and C. Saulo, 2012: How sensitive are probabilistic precipitation forecasts to the choice of calibration algorithms and the ensemble generation method? Part I: Sensitivity to calibration methods. Meteor. Appl., 19, 302-313. Schefzik, R., T. Thorarinsdottir, and T. Gneiting,2013: Uncertainty quantification in complex simulation models using ensemble copula coupling. Stat. Sci., in press. Scheuerer, M., 2013: Probabilistic quantitative precipitation fore- casting using ensemble model output statistics. Quart. J. Roy. Meteor. Soc., in press. Schmeits, M. J., and K. J. Kok, 2010: A comparison between raw ensemble output, (modified) Bayesian model averaging, and extended logistic regression using ECMWF ensemble precipi- tation reforecasts. Mon. Wea. Rev., 138, 4199-4211. Wang, X., and C. H. Bishop, 2003: A comparison of breeding and ensemble transform Kalman filter ensemble forecast schemes. J. Atmos. Sci., 60, 1140-1158. Wilks, D. S., 2006a: Comparison of ensemble-MOS methods in the Lorenz '96 setting. Meteor. Appl., 13, 243-256. _____, 2006b: Statistical Methods in the Atmospheric Sciences. 2nd ed. Academic Press, 627 pp. _____, 2009: Extending logistic regression to provide full-probability- distribution MOS forecasts. Meteor. Appl., 368, 361-368. _____, and T. M. Hamill, 2007: Comparison of ensemble-MOS methods using GFS reforecasts. Mon. Wea. Rev., 135, 2379- 2390. JAKOB W. MESSNER AND GEORG J. MAYR Institute of Meteorology and Geophysics, University of Innsbruck, Innsbruck, Austria ACHIM ZEILEIS Department of Statistics, University of Innsbruck, Innsbruck, Austria DANIEL S. WILKS Department of Earth and Atmospheric Sciences, Cornell University, Ithaca, New York (Manuscript received 21 August 2013, in final form 4 October 2013) Corresponding author address: Jakob W. Messner, Institute of Meteorology and Geophysics, University of Innsbruck, Innrain 52, 6020 Innsbruck, Austria. E-mail: jakob.messner@uibk.ac.at DOI: 10.1175/MWR-D-13-00271.1 APPENDIX A Likelihood Function To estimate the coefficients a and b (extended logistic regression) or g and d (heteroscedastic extended lo- gistic regression) maximum likelihood estimation is used. The general log-likelihood function for logistic regression models is ... (A1) where N is the length of the training dataset and pi is the predicted probability for the ith observed outcome. For binary logistic regression there are two possible out- comes, so that ... (A2) In previous studies the sum of this binary log-likelihood over all thresholds is used as objective function that is maximized to estimate the regression coefficients. However, the predicted probability of the ith outcome actually is ... (A3) so that the correct maximum likelihood estimator is given by the maximization of Eqs. (A1) and (A3). In this study we employ this maximum likelihood estimator to take advantage of all standard asymptotic inference in the maximum likelihood framework. However, the concepts presented in this paper do not depend on the objective function and results should also not differ significantly when using the sum of binary log-likelihoods [Eq. (A2)] to estimate the coefficients. APPENDIX B Computational Details Our results were obtained on Ubuntu using R 2.15.2 (R Core Team 2012). A function to fit (heteroscedastic) extended logistic regression models is included in the package crch 0.1-0 (Messner and Zeileis 2013). (c) 2014 American Meteorological Society |