Advances in Water Resources 79 (2015) 121–126

Journal homepage: www.elsevier.com/locate/advwatres

A metastatistical approach to rainfall extremes

Marco Marani a,b,⇑, Massimiliano Ignaccolo a,c

a Division of Earth & Ocean Sciences, Nicholas School of the Environment, Department of Civil and Environmental Engineering, Pratt School of Engineering, Duke University, Durham, NC, USA
b Department of Civil, Environmental, and Architectural Engineering, University of Padova, Padova, Italy
c DIIAR, Politecnico di Milano, Milano, Italy

Article history: Received 28 August 2013; received in revised form 1 March 2015; accepted 3 March 2015; available online 13 March 2015

Keywords: Extreme events; Generalized Extreme Value distribution; Metastatistics

Abstract

The traditional statistical theory of extreme events assumes an asymptotic regime in which the number of events per year is large enough for a limiting Generalized Extreme Value distribution to apply. This has been shown not to be applicable to many practical cases. We introduce here a Metastatistical Extreme Value (MEV) approach which is defined in terms of the distribution of the statistical parameters describing “ordinary” daily rainfall occurrence and intensity. The method does not require an asymptotic assumption, and naturally accounts for the influence of the bulk of the distribution of ordinary events on the distribution of annual maximum daily rainfall. Building on existing observations showing the distribution of daily rainfall to be Weibull right-tail equivalent, the MEV approach is then specialized to yield a compact and easily applicable formulation. We apply this formulation to Monte Carlo experiments based on Weibull statistics derived from the 3-century long rainfall time series observed in Padova (Italy).
We find an excellent agreement between MEV estimates and the ‘observed’ frequency of occurrence of extreme events in the synthetic time series generated. GEV and Gumbel estimates, on the contrary, exhibit systematic errors. Tests with different rates of occurrence of rainfall events show slight improvements of the GEV and Gumbel estimation bias when the number of events/year is increased. However, a constant bias in GEV and Gumbel estimates is seen for (synthetic) climates where the number of events and the distribution of intensities is varied stochastically. The estimation root mean square error is also larger for the GEV and Gumbel distributions than for the MEV approach. Hence, GEV and Gumbel quantile estimates are more likely to be further away from the actual value than MEV estimates. Finally, the application of the new MEV approach to subsets of the long Padova time series identifies marked variabilities in rainfall extremes at the centennial time scale.

© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

The definition and estimation of extreme rainfall events is of central importance in the analysis of past and projected rainfall regimes, as well as in the design of any water resources management and flood control infrastructure. For a given event duration of interest (here we will focus on the important case of daily duration), extreme value analysis usually studies the distribution of yearly maxima, y, either directly or by considering the distribution of rainfall values over a high threshold [1]. Under the assumptions that (i) rainfall intensity may be assumed independent and identically distributed (i.i.d.)
and (ii) the number of events per year tends to infinity, the classical Extreme Value Theory (EVT) identifies a Generalized Extreme Value (GEV) distribution of yearly maxima [2–6], which has been and still is widely applied (e.g. [7–12]). It is important to emphasize here that the GEV is not an exact distribution of yearly maxima, and that the actual extreme value distribution may converge to a GEV distribution only as the number of events/year is “large enough”, a potentially problematic concept as the number of events/year (wet days in the present case) is necessarily limited. However, little work has addressed the conditions under which the actual distribution may be considered to be close to the limiting GEV form (e.g. [13,14]) or how the possible variability of the rainfall depth distribution (i.e. a violation of the i.i.d. hypothesis, e.g. due to seasonality) can affect the resulting extreme value distribution [15]. These analyses show that indeed the actual extreme value distribution of rainfall may in practice be quite far from the asymptotic GEV form.

⇑ Corresponding author at: Division of Earth & Ocean Sciences, Nicholas School of the Environment, Department of Civil and Environmental Engineering, Pratt School of Engineering, Duke University, Durham, NC, USA. http://dx.doi.org/10.1016/j.advwatres.2015.03.001. 0309-1708/© 2015 Elsevier Ltd. All rights reserved.

We propose here a non-asymptotic approach to the definition and evaluation of an extreme value distribution based on a metastatistical approach (also referred to as superstatistics [16], compound distributions [17], or doubly stochastic processes [18] in different contexts). Our approach avoids the need of assuming an infinite collection of events, i.e. it avoids the asymptotic assumption, and allows for interannual variability to be accounted for. The manuscript is organized as follows.
In Section 2 we describe the data used in our analyses. In Section 3 we briefly summarize the classical extreme event theory and introduce the new Metastatistical Extreme Value formulation (MEV). A Results section compares GEV and MEV performances, and a Discussion and Conclusion section closes the paper.

2. Data

We analyze extremes in the daily rainfall time series observed in Padova (Italy) over a span of almost three centuries, as well as in synthetic data derived from its statistical properties. The Padova dataset, comprising 275 complete years of daily observations, is described in detail elsewhere [19,20], and provides an exceptionally long record, particularly suitable to test estimates of high return period extremes.

3. Methods

We first briefly summarize the EVT, as typically used in hydrology, and then introduce a metastatistical approach to the definition of extreme value distributions.

3.1. Extreme value theorem

We use the random variable $X > 0$ to indicate daily rainfall depth, $f(x)$ being its probability density function, $F(x) = P(X \le x)$ its cumulative distribution function, and $\Psi(x) = 1 - F(x)$ the exceedance probability. Notice that having considered $X > 0$, no probability atom at $X = 0$ need be considered to represent the finite probability of zero rainfall. The maximum of n realizations of the stochastic variable X, $Y_n = \max(x_1, x_2, \ldots, x_n)$, is also a random variable, often termed an n-maximum or a maximum with cardinality n of the “parent” stochastic variable. In hydrologic practice n will be the number of wet days in a given year, itself a discrete random variable. If the events generating the n realizations of X are independent, the cumulative distribution, $H_n(y)$, of $Y_n$ may be expressed as

$$H_n(y) = [F(y)]^n \qquad (1)$$

Upon definition of a renormalized variable $S_n = (Y_n - b_n)/a_n$ ($a_n > 0$ and $b_n$ being constants), the EVT establishes that [2–4]

$$\lim_{n \to \infty} P(S_n < s) = \lim_{n \to \infty} H_n(s) = \lim_{n \to \infty} [F(a_n s + b_n)]^n = H(s) \qquad (2)$$

The limiting distribution $H(s)$ in Eq. (2), depending on the tail behaviour of $\Psi(y)$, can only be one of three distributions: (i) the Gumbel distribution (Extreme Value 1 – EV1, or double exponential), when the tail of $\Psi(y)$ decreases faster than a power law; (ii) the Fréchet distribution (EV2), when the tail of $\Psi(y)$ behaves as a power law for large values of x; and (iii) the Weibull distribution (EV3), when x has a finite upper limit [2–4]. In terms of the non-renormalized variable y, the three asymptotic types, EV1–EV3, can be thought of as special cases of a single Generalized Extreme Value distribution [6]:

$$H_{GEV}(y) = \exp\left\{ -\left[ 1 + k \, \frac{y - \mu}{\sigma} \right]_+^{-1/k} \right\} \qquad (3)$$

where $(\cdot)_+ = \max(\cdot, 0)$; $\mu$ is the location parameter, $\sigma > 0$ is the scale parameter, and k is a shape parameter. The limit $k = 0$ corresponds to the EV1 distribution, $k > 0$ to the EV2 distribution, and $k < 0$ to the EV3 distribution.

As noted, the distribution describing the n-sample maximum will strictly be a GEV distribution, independent of the specific value n, only for ‘large enough’ values of the number of wet days. When the number of wet days n is not large enough for the asymptotic regime of the EVT to apply (e.g. this has been shown to be the case in practice for Weibull variates [13,21]) one must use Eq. (1). However, a useful approximation of $H_n(y)$ that does not require $n \to \infty$ can be obtained by considering $U_n$, the expected largest value of the variable X in n realizations. Because $U_n$ is on average exceeded once every n realizations of X [22,23]:

$$\Psi(U_n) = \frac{1}{n} \qquad (4)$$

(note that a Weibull plotting position estimate, $\Psi(U_n) = 1/(n + 1)$, could also be used with no consequence of substance). Using this result we can rewrite the cumulative probability for the n-sample maximum $Y_n$ as

$$H_n(y) = [F(y)]^n = [1 - \Psi(y)]^n = \left[ 1 - \frac{\Psi(y)}{n \, \Psi(U_n)} \right]^n \qquad (5)$$

For $y > U_n$ (i.e. for an extreme value larger than the average maximum value in the observations) the term $\Psi(y)/\Psi(U_n) < 1$. Therefore, for large values of y, i.e. for extremes, we can use the Cauchy approximation: $(1 - z)^n \cong 1 - n z \cong \exp(-n z)$, valid for $z \ll 1$. Hence, Eq. (5) can be approximated as:

$$H_n(y) = \exp\left[ -\frac{\Psi(y)}{\Psi(U_n)} \right] \qquad (6)$$

Eq. (6) is sometimes referred to as the “penultimate” approximation [24,22], the “ultimate” approximation being Eq. (2), only valid when n is very large. The error associated with the penultimate approximation can be quantified through the relative error

$$e(y) = \frac{\exp\left[ -\Psi(y)/\Psi(U_n) \right] - \left[ 1 - \Psi(y)/(n \Psi(U_n)) \right]^n}{\left[ 1 - \Psi(y)/(n \Psi(U_n)) \right]^n}$$

For $y = U_n$ [24]: $e(U_n) = \left( \exp(-1) - [1 - 1/n]^n \right) / [1 - 1/n]^n$. For example, for $n = 50$ the relative error is $e(U_{50}) = 0.01$. Note that for values $y > U_n$, of greatest applicative interest, the relative error is smaller than $e(U_n)$, as $\Psi(y) < \Psi(U_n)$. The penultimate approximation has been used in the evaluation of extreme values in some geophysical contexts, such as in modelling wind power [24,23] or drought severity [25], but very rarely has it been applied to rainfall extremes [26,15].

3.2. The case of Weibull variates

Daily rainfall has been shown to be accurately modelled as a Weibull variate [27]. Hence we consider here the important case of $\Psi(x) = \exp(-(x/C)^w)$. Under these assumptions, the yearly maximum daily rainfall depth, i.e. the maximum depth over the n wet days occurring in a generic year, is distributed as:

$$H_n(y) = \left[ 1 - \exp\left( -\left( \frac{y}{C} \right)^w \right) \right]^n \qquad (7)$$

Hence, the penultimate approximation takes the following forms:

$$H_n(y) \cong 1 - n \exp\left( -\frac{y^w}{C^w} \right) \cong \exp\left\{ -\exp\left( -\frac{y^w}{C^w} + \ln n \right) \right\} \qquad (8)$$

Note that these expressions (and later results in this Section) are valid also for distributions that are only right-tail equivalent to a Weibull distribution [24,27] (two distributions $F_1$ and $F_2$ are right-tail equivalent if $(1 - F_1(x))/(1 - F_2(x)) \to 1$ when $x \to +\infty$).

3.3. A metastatistical approach

The cumulative probability, $H_n(y)$, of the n-maximum $Y_n$ depends on the number of wet days, n, and on the parameters, $\vec{\theta} = (\theta_1, \ldots, \theta_k)$, of the distribution of the parent variable x. To make this dependence explicit we now adopt the notation $H_n(y; n, \vec{\theta})$. In the realistic context in which both n and $\vec{\theta}$ are random variables, we suggest that a meaningful definition of the probability, here denoted as $\zeta(y)$, that a yearly maximum Y be smaller than y, is the expected value of $H_n(y; n, \vec{\theta})$, computed over all possible realizations of n and $\vec{\theta}$:

$$\zeta(y) = \sum_n \int_{\vec{\theta}} d\vec{\theta} \; f(n, \vec{\theta}) \, H_n(y; n, \vec{\theta}) \qquad (9)$$

where $f(n, \vec{\theta})$ is the joint probability distribution (discrete in n) of the number of wet days in a year and of the parameter vector. The symbol $d\vec{\theta}$ denotes the differential $d\theta_1 \, d\theta_2 \cdots d\theta_k$. We refer to $\zeta(y)$ in Eq. (9) as the Metastatistical Extreme Value (MEV) distribution. This formulation is practically appealing. The number of days with non-zero rainfall clearly shows a wide interannual variability (see below for an analysis of such variability in the test case of the Padova time series). Moreover, nonstationarities in the rainfall-generating processes could be accounted for through suitably defined distributions of $\vec{\theta}$ in Eq. (9). However we restrict our analysis here to the stationary (or periodic) case, in which we can assume past observations to be representative of future realizations. In this case, and without the need to make assumptions on the specific form of $f(n, \vec{\theta})$, $\zeta(y)$ can be estimated based on T years of rainfall observations as:

$$\zeta(y) \cong \frac{1}{T} \sum_{j=1}^{T} H_{n_j}(y; n_j, \vec{\theta}_j) \qquad (10)$$

which may be formally obtained from Eq. (9) with $f(n, \vec{\theta}) = \frac{1}{T} \sum_{j=1}^{T} \delta(n - n_j, \vec{\theta} - \vec{\theta}_j)$ ($\delta(\cdot, \cdot)$ = a multi-dimensional Dirac's Delta), accounting for the empirical frequency distribution of the observed maxima. If one assumes daily precipitation to be Weibull-distributed, the MEV expression is:

$$\zeta_{MEV}(y) \cong \frac{1}{T} \sum_{j=1}^{T} \exp\left\{ -\exp\left( -\frac{y^{w_j}}{C_j^{w_j}} + \ln n_j \right) \right\} \qquad (11)$$

In other cases one may assume the probability distribution of daily rainfall to be time-invariant (the physical rainfall-generating mechanisms remain unchanged) and to be right-tail equivalent to a Weibull distribution [27], such that only the number of rainy days changes from year to year according to a discrete distribution $g(n)$. By substituting the first expression in Eq. (8) into Eq. (9) we have:

$$\zeta(y) = \sum_n g(n) \left[ 1 - n \exp\left( -\frac{y^w}{C^w} \right) \right] = 1 - \langle n \rangle \exp\left( -\frac{y^w}{C^w} \right) \simeq \exp\left\{ -\exp\left[ \frac{1}{C^w} \left( -y^w + C^w \ln \langle n \rangle \right) \right] \right\} \qquad (12)$$

where $\langle n \rangle$ indicates the expected value of the number of wet days. The case in which also w and C are random variables of course requires the full specification of the joint distribution function $f(n, \vec{\theta})$.

Fig. 1. Daily rainfall survival probability, $\Psi(x)$, for the 1841–2006 period (squares) and for each single year within the same period (solid lines). The vertical dashed line indicates the threshold used to left-censor the data as in [27]. (Axes: x (mm) and $\Psi(x)$.)

Fig. 2. The annual maxima (Y, panel (a)), the number of wet days per year (n, panel (b)), the Weibull scale parameter (C, panel (c)) and shape parameter (w, panel (d)) obtained for the Padova time series. We indicate: the shape and scale parameter values obtained from the entire dataset (black line), the values obtained using 1-year (red line) and 10-year (blue line) sliding windows. Panels on the right portray the empirical frequency distributions from the time series in the left panels. (Horizontal axes: year, with sub-intervals labelled I to IV, and frequency. For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

4. Results

We first characterize the statistics of daily precipitation in the Padova dataset. We later use this information to generate synthetic rainfall time series with known, and realistic, statistical properties.
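As a concrete illustration of the Weibull-based expressions of Section 3, the following minimal Python sketch draws synthetic wet-day depths and evaluates Eq. (7), the second form of Eq. (8), the relative error of Eq. (6) at $U_n$, and the MEV estimate of Eq. (11). It is only a sketch under stated assumptions: the function names, the inverse-transform sampler, the bisection used to invert $\zeta_{MEV}(y)$ and its bracketing interval are illustrative choices and are not taken from the original method description; the values n = 100, C = 7.3 mm and w = 0.82 are those quoted for the homogeneous setup in Section 4.2, and in practice the yearly triplets would be estimated from data rather than prescribed.

```python
import math
import random


def weibull_exceedance(x, C, w):
    """Psi(x) = exp(-(x/C)**w): Weibull exceedance probability of a daily depth x >= 0."""
    return math.exp(-((x / C) ** w))


def sample_wet_days(n, C, w, rng):
    """Draw n Weibull daily depths by inverting Psi: x = C * (ln(1/u))**(1/w)."""
    # u = 1 - rng.random() lies in (0, 1], so 1/u and its logarithm are always defined.
    return [C * math.log(1.0 / (1.0 - rng.random())) ** (1.0 / w) for _ in range(n)]


def exact_max_cdf(y, n, C, w):
    """Eq. (7): H_n(y) = [1 - Psi(y)]**n for n independent Weibull wet days."""
    return (1.0 - weibull_exceedance(y, C, w)) ** n


def penultimate_max_cdf(y, n, C, w):
    """Second form of Eq. (8): exp(-exp(-(y/C)**w + ln n)) = exp(-n * Psi(y))."""
    return math.exp(-n * weibull_exceedance(y, C, w))


def penultimate_error_at_un(n):
    """Relative error of Eq. (6) evaluated at y = U_n, where Psi(U_n) = 1/n."""
    exact = (1.0 - 1.0 / n) ** n
    return (math.exp(-1.0) - exact) / exact


def mev_weibull_cdf(y, yearly_params):
    """Eq. (11): mean of the yearly penultimate CDFs.

    yearly_params is a sequence of (n_j, C_j, w_j) triplets, one per year.
    """
    total = sum(penultimate_max_cdf(y, n, C, w) for n, C, w in yearly_params)
    return total / len(yearly_params)


def mev_return_level(return_period, yearly_params, lo=0.0, hi=1.0e4, tol=1e-9):
    """Solve zeta_MEV(y) = 1 - 1/return_period by bisection (illustrative helper).

    The bracket [lo, hi] in mm is an assumption and must contain the solution.
    """
    target = 1.0 - 1.0 / return_period
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mev_weibull_cdf(mid, yearly_params) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(0)            # fixed seed, for reproducibility only
    years = [(100, 7.3, 0.82)] * 50   # 50 identical years (homogeneous case)
    annual_maxima = [max(sample_wet_days(n, C, w, rng)) for n, C, w in years]
    print("largest synthetic annual maximum (mm):", max(annual_maxima))
    print("MEV 100-yr daily depth (mm):", mev_return_level(100.0, years))
    print("relative error of Eq. (6) at U_50:", penultimate_error_at_un(50))
```

When all years share the same parameters, Eq. (11) collapses to a single penultimate term, so the bisection result can be checked against the closed-form inverse $y = C \, [\ln(n / (-\ln(1 - 1/T_r)))]^{1/w}$, with $T_r$ the return period in years. With year-to-year variation in $(n_j, C_j, w_j)$ no such closed form is available, which is why a numerical inversion is used in this sketch.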
Using such synthetic time series we compare rainfall extremes as estimated with the MEV approach and via the GEV formulation. Finally, we apply the MEV approach to the Padova data set.

4.1. The Padova time series and the Weibull approximation

The Padova time series includes four long sub-intervals of uninterrupted observations: 1725–1764, 1768–1807, 1841–1880 and 1887–2006 [20]. For each interval we calculate the daily rainfall exceedance probability $\Psi(x)$ and find a wide interannual variability, particularly in the tail of the distribution (Fig. 1). Next, we analyze the yearly maximum, Y, and the yearly number of wet days, n. The yearly maxima (Fig. 2(a)) display a large variability, with periods in which the values of Y are significantly smaller than for the rest of the time series (e.g. see the 1800s). Overall, the mode of Y is in the range 40–65 mm and the pdf has a positive skewness. The number of wet days also seems to exhibit a marked inter-annual variability and, qualitatively, a temporal correlation structure (Fig. 2(b)). The number of events ranges between about 40 days yr$^{-1}$ and 160 days yr$^{-1}$, with a mode between 100 days yr$^{-1}$ and 120 days yr$^{-1}$. We also analyze the distribution of the scale (C) and shape (w) parameters obtained from fitting a Weibull distribution (with a least-square method) to non-overlapping moving windows of length 1 yr and 10 yrs. The fit starts by left-censoring rainfall observations with a 10 mm threshold [27]. This procedure filters out small ordinary rainfall values, affected by higher relative errors, and focuses only on significant events, of chief interest here. We also note that all that is needed for Eq. (12) or Eq. (10) to be applicable is that the distribution of daily values be right-tail equivalent to a Weibull distribution. Hence the analysis of the upper range of values of X suffices for our purposes. C and w show significant changes over the observation period (e.g. compare the 1841–1880 and the 1887–2006 periods in particular), suggestive of different rainfall regimes taking place across centuries (see Fig. 2(c) and (d)). Visual inspection of the values of C and w suggests precipitation processes to be relatively homogeneous, particularly within sub-intervals 1768–1807 and 1887–2006. Sub-interval 1841–1880 exhibits more widely different fluctuations of the values of n, C, and w.

4.2. MEV, GEV, and Gumbel

We compare here the MEV approach with the GEV and the traditional Gumbel approaches by means of synthetic data sets. The synthetic data sets are constructed by drawing daily rainfall values from a Weibull distribution, as justified by previous work analyzing rainfall data at the global scale [27]. The parameters of the Weibull distribution used are obtained from the Padova time series to ensure the synthetic values generated are representative of real daily rainfall. The goal of these simulations is to produce realistic time series for which extreme value properties are known, such that the effectiveness of different extreme value estimation approaches can be evaluated.

We consider three different statistical setups, for each of which we generate a large number (1000) of synthetic time series. The A setup is a “homogeneous” case, in which the number of wet days/year is $n = 100$ days/year, and C and w are equal to the values obtained from the 1957–2006 time interval of the Padova time series ($C = 7.3$ mm, $w = 0.82$). The B setup is homogeneous in n and heterogeneous, or time-variant, in C and w: $n = 100$ days/year while C and w are drawn, with uniform probability, from the pairs of values observed in the Padova time series. Setup C is time-variant in n and homogeneous in C and w: $C = 7.3$ mm, $w = 0.82$, while n is drawn from a uniform distribution between $n = 21$ days/year and $n = 50$ days/year. This latter experiment also explores the effects of a small number of rainfall events per year on the convergence properties of the actual extreme value distribution to the GEV distribution.

For each setup we generate 1000 realizations of $N_{yr} = 50$ years of synthetic rainfall, to reproduce typical (though optimistic) sample sizes encountered in practice, and we estimate the distribution of extremes using the MEV, GEV, and Gumbel approaches. For each realization: (1) we estimate C and w from the synthetic time series via least square fit; (2) we extract the sequence of the $N_{yr}$ annual maxima $Y_j^R$; (3) we use Eq. (11) to obtain 1000 estimated MEV probability distributions; (4) we estimate, via Maximum Likelihood, the corresponding 1000 GEV and Gumbel distributions; (5) we compute the ensemble average distributions, $\zeta_{MEV}(y)$, $\zeta_{GEV}(y)$, and $\zeta_{GUM}(y)$, as the means of the distributions of Y computed over the 1000 synthetic time series.

To test the predictive performance of the estimated distributions $\zeta_{MEV}(y)$, $\zeta_{GEV}(y)$, and $\zeta_{GUM}(y)$, we compare them against the observational frequency, $\zeta_{obs}(y_i)$, computed for the annual maxima from sequences of $N = 10^6$ years of synthetic rainfall generated according to the above definitions for each setup A, B, and C. The choice of using $10^6$ values ensures that extremes with return period of 100 and 1000 years are appropriately represented by the usual plotting-position estimate $\zeta_{obs}(y_i) = n_i/(N + 1)$ ($n_i$ being the number of annual maxima in the sample smaller than or equal to $y_i$).

The results (Fig. 3) show that in all cases the distribution obtained with the MEV approach coincides with the observational frequencies. This indicates that use of Eq. (11) provides an accurate representation of the extreme value distribution even when the Weibull distribution generating daily values changes over time (setups B and C). The GEV distribution, on the other hand, consistently overestimates the precipitation value associated with a given return period, while the Gumbel approach consistently underestimates it. It is interesting to note that these discrepancies between GEV and Gumbel distributions and the observed one are very marked also for the time-invariant case, when the sample used for fitting the extreme value distributions more accurately represents the underlying population and convergence to the limiting GEV distribution should be more rapid.

Fig. 3. Doubly logarithmic (Gumbel) plots of different estimates of the extreme value distribution $\zeta(y)$. Different panels refer to different synthetic data sets: A (panel a, constant n, C, and w), B (panel b, constant n, random C and w), C (panel c, random n, constant C and w). The solid lines indicate the MEV estimate $\zeta_{MEV}(y)$ (blue), the GEV estimate $\zeta_{GEV}(y)$ (red), and the Gumbel estimate $\zeta_{GUM}(y)$ (green). Open circles indicate the empirical frequency distribution averaged across different realizations, $\zeta_{obs}(y)$. (Axes: y (mm) and $-\ln(-\ln(\zeta(y)))$, with the 100 yr and 1,000 yr return periods marked. For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

In order to further characterize the error associated with the GEV and Gumbel estimates, we consider two more test cases in which we generate 1000 replicates of synthetic time series of daily rainfall. The first synthetic test case (A2) is generated, as for test A above, by assuming a fixed number of wet days, n, and a Weibull distribution of daily rainfall with constant parameters. This is the homogeneous test case. The second test case (B2) is obtained, similarly to test B above, with a fixed number of wet days n, by changing the parameters of the Weibull distribution every 2 years (chosen with uniform probability from the set of Weibull parameters obtained from the Padova time series). For both test experiments we progressively increase the fixed number of wet days from $n = 10$ events/year to $n = 200$ events/year, to address the accuracy of the asymptotic assumption embedded in the GEV and Gumbel approaches. From each synthetic time series realization we estimate $\zeta_{GEV}(y)$ and $\zeta_{GUM}(y)$ (using ML) and derive the daily rainfall intensity corresponding to a return period of 100 years, $y_{GEV,100}$ and $y_{GUM,100}$ (1000 years, $y_{GEV,1000}$ and $y_{GUM,1000}$). These values are compared with the values obtained from the observed frequency distribution, by computing relative errors such as $((y_{GEV,100} - y_{th,100})/y_{th,100}) \cdot 100$ ($y_{th,100}$ being the 100th yearly maximum value in a list of “observations” sorted in ascending order). Finally, the mean relative error is computed over all the 1000 replicates of the synthetic time series for both test cases.

Results show (Fig. 4) that both in the homogeneous A2 case (panel (a)) and in the inhomogeneous case B2 (panel (b)), the relative errors decrease as the cardinality increases, but that a constant bias always remains. In the homogeneous case the GEV approach systematically overestimates the 100-yr extreme rainfall intensity by 5% even for large numbers of wet days. The Gumbel approach systematically underestimates the 100-yr extreme rainfall intensity by about 5%. For the 1000-year return period intensities, the GEV approach severely overestimates actual extreme events (minimum relative error is 30% for $n = 200$ events/year) whereas the Gumbel approach yields underestimation errors of about 10%. It is interesting to see that, while a noticeable dependence on the number of wet days exists in the homogeneous case, estimation errors are larger in the non-homogeneous case, with relative errors almost independent of the number of wet days for $n > 20$ events/year. In the non-homogeneous case the GEV and Gumbel approaches give similar absolute values of the relative errors (about 10%, but with opposite signs) for the 100-year return period. Estimation errors for the 1000-year return period are about +50% for the GEV estimate and about −20% for the Gumbel approach. These findings have significant implications for the estimation of high-return period extreme events in actual applications, as wide interannual variability, such as that evidenced in Fig. 1, and possibly systematic changes, are likely the norm.

Fig. 4. The relative errors for daily rainfall intensity corresponding to a return period of 100 years (solid red line for the GEV approach and solid green line for the Gumbel approach) and 1000 years (dotted red line for the GEV approach and dotted green line for the Gumbel approach). Panel (a) shows results for the homogeneous test case, A: the number of events/year and the parameters of the Weibull distribution are kept fixed for the whole synthetic time series. Panel (b) shows results for the heterogeneous case B: the Weibull parameters used to generate the daily rainfall sequences are extracted randomly every two years. (Axes: number of wet days/year; mean of % estimation error in panel (a) and root mean square % error in panel (b). For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

4.3. MEV estimates of high return period events

It is interesting, also in view of practical applications, to compare extreme value distributions obtained through the MEV approach for different periods from the Padova time series (see [20] for similar considerations using the traditional GEV approach). The observed differences (Fig. 5) are significant when compared with the uncertainty associated with MEV estimates, represented by a confidence interval equal to two times the standard deviation of the estimation error (hatched area in Fig. 5). The MEV distribution fitted to the 1841–1880 period gives the smallest extreme events: the estimated 1000 yr rainfall depth for 1841–1880 is $h_{1000} \approx 110$ mm. The estimated 1000 yr rainfall based on data from 1725 to 1764 is $h_{1000} \approx 130$ mm, while the estimate from 1887 to 2006 is $h_{1000} \approx 160$ mm, very similar to the estimate obtained for 1768–1807 (distribution not shown here because almost superimposed on the one from 1887 to 2006). In this case, the estimate of a 1000-yr return period design rainfall based on data from the 1841–1880 period would seriously underestimate the extreme events which occurred in the subsequent century. Similar conclusions can be reached if one considers the 100-year return period, for which estimates vary from about 90 mm for 1841–1880 to about 120 mm for 1887–2006. Overall, the interannual differences in extremes estimated using the MEV distribution are coherent with those obtained by a conventional GEV analysis [20], even though the range covered by such estimates is not quite as large. These results show that, even in times in which anthropogenic climatic changes may safely be assumed not to affect rainfall regimes (e.g. consider the 1725–1764 vs. the 1841–1880 periods), extreme events exhibit a wide long-term variability, such that a simplistic interpretation of the past may lead to grossly erroneous inferences on the future [20].

Fig. 5. MEV extreme rainfall distributions obtained for different sub-intervals of the Padova time series. The hatched areas represent confidence intervals of (vertical) width equal to two standard deviations. The standard deviation is estimated using the asymptotic value of 5.1% in Fig. 4(b).

5. Summary and conclusion

We have introduced here a metastatistical approach to daily rainfall extreme event evaluation. In its general formulation (Eqs. (9) and (10)) the approach is valid for an arbitrary (and, in particular, arbitrarily small) value of the yearly number of rainfall events, as it does not rely on an asymptotic approximation, unlike the classical EVT. Both the number of wet days per year and the parameters controlling rainfall intensity are also allowed to vary stochastically, as observed in practice for a long rainfall time series (see Figs. 1 and 2). Global observations and inferences show that daily rainfall can be considered, to a good approximation, to be Weibull right-tail equivalent. The specialization of the general MEV expression to the Weibull case yields a compact expression (Eq. (11)), which can be easily applied by estimating the Weibull parameters ($C_j$, $w_j$) and the number of wet days ($n_j$) on a yearly basis. Monte Carlo experiments based on Weibull-distributed synthetic daily rainfall show an excellent agreement between MEV-estimated extremes and “observed” occurrence of extreme events. GEV and Gumbel approaches, on the contrary, are shown to be affected by systematic errors and a large root mean square error. Experiments using a varying number of wet days/year show a mild dependence of GEV and Gumbel bias on n and show a significant tendency of the GEV approach to overestimate actual extremes and of the Gumbel method to underestimate them. When n, C, and w are allowed to vary stochastically, the GEV and Gumbel bias seems to be insensitive to n, and no improvement in the estimation accuracy is seen for (synthetic) climates with a larger number of rainy days/year.
We hypothesize this to be a symptom of the sensitivity of the traditional methods to violations of the asymptotic hypothesis: when n cannot be considered to be “large enough”, the specific value of n matters, particularly if it is time-varying. The MEV method, on the contrary, does not exhibit a detectable bias for any value of the number of wet days, and is here proposed as a general approach to extreme value estimation.

MEV estimates for the long rainfall time series recorded in Padova (Italy) illustrate the wide variability of extremes and the magnitude of the estimation errors that can arise when the past is assumed to be a faithful representation of the future. Very significant deviations of actual extreme events from estimated ones occur across centuries (see also [20]). The MEV approach explicitly identifies and separates the roles of conceptually and physically meaningful sources of variability: the stochastic occurrence of rainfall events and the probability distribution of daily rainfall intensities. We also suggest that the use of a time-varying metastatistical characterization of these processes, i.e. a time-varying distribution $f(n, C, w, \vec{\theta}(t))$, though not yet pursued here, could provide a natural way to treat changing extremes. In particular, the metastatistical parameters $\vec{\theta}$ could be linked to dominant large-scale climatic patterns, such as ENSO and NAO, to provide a clear connection between changing extremes, endogenous climatic oscillations, and anthropogenic climate change.

Acknowledgements

This research was funded by NSF-EAR-13-44703 “The Direct and Indirect Effects of Plantation Forestry Expansion on Usable Water in the Southeastern US”. We also acknowledge support by the Nicholas School of the Environment and the Pratt School of Engineering at Duke University.

References

[1] Davison A, Smith R. Models for exceedances over high thresholds. J Roy Stat Soc 1993;B52:393–442. http://dx.doi.org/10.2307/2345667.
[2] Frechet M.
Sur la loi de probabilité de l’ecart maximum. Ann Soc Pol Math 1927;6:93–117. [3] Fisher RA, Tippett LHC. Limiting forms of the frequency distribution of the largest or smallest member of a sample. Math Proc Camb Philos Soc 1928;24(02):180–90. [4] Gnedenko B. Sur la distribution limite du terme maximum d’une serie aleatoire. Ann Math 1943;44(3):423–53. [5] Gumbel EJ. Statistical theory of extreme values and some practical applications: a series of lectures. In: Applied mathematics series. U.S. Govt. Print. Office; 1954. [6] Jenkinson AF. The frequency distribution of the annual maximum (or minimum) values of meteorological elements. Quart J Roy Meteorol Soc 1955;81(348):158–71. http://dx.doi.org/10.1002/qj.49708134804. [7] Katz R. Extreme value theory for precipitation: sensitivity analysis for climate change. Adv Water Resour 1999;23:133–9. http://dx.doi.org/10.1016/S03091708(99)00017-2. [8] Chavez-Demoulin V, Davison Anthony C. Generalized additive modelling of sample extremes. Appl Stat 2005;54:207–22. http://dx.doi.org/10.1111/ j.1467-9876.2005.00479.x. [9] Martins ES, Stedinger J. Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data. Water Resour Res 2000;36(3):737–44. http://dx.doi.org/10.1029/1999WR900330. [10] Villarini G, Smith J. Flood peak distributions for the eastern united states. Water Resour Res 2010;46. http://dx.doi.org/10.1029/2009WR008395. [11] Smith J, Villarini G, Baeck M. Mixture distributions and the hydroclimatology of extreme rainfall and flooding in the eastern united states. J Hydrometeorol 2011;12:294–309. http://dx.doi.org/10.1175/2010JHM1242.1. [12] Papalexiou SM, Koutsoyiannis D. Battle of extreme value distributions: a global survey on extreme daily rainfall. Water Resour Res 2013;49:187–201. http://dx.doi.org/10.1029/2012WR012557. [13] Koutsoyiannis D. Statistics of extremes and estimation of extreme rainfall: I. 
theoretical investigation/statistiques de valeurs extremes et estimation de precipitations extremes: I. Recherche theorique. Hydrol Sci J 2004;49(4):381–404. http://dx.doi.org/10.1623/hysj.49.4.575.54430. [14] Veneziano D, Langousis A, Lepore C. New asymptotic and preasymptotic results on rainfall maxima from multifractal theory. Water Resour Res 2009;45. http://dx.doi.org/10.1029/2009WR008257. [15] Serinaldi F, Kilsby CG. Rainfall extremes: toward reconciliation after the battle of distributions. Water Resour Res 2014;50. http://dx.doi.org/10.1002/ 2013WR014211. [16] Beck C, Cohen E. Superstatistics. Physica A 2003;322:267–75. http:// dx.doi.org/10.1016/S0378-4371(03)00019-0. [17] Dubey S. Compound gamma, beta and F distributions. Metrika 1970;16:27–31. http://dx.doi.org/10.1007/BF02613934. [18] Cox D, Isham V. Point processes. London: Chapman and Hall/CRC Press; 1980. [19] Camuffo D. Analysis of the series of precipitation at Padova, Italy. Clim Change 1984;6:57–77. http://dx.doi.org/10.1007/BF00141668. [20] Marani M, Zanetti S. Long-term oscillations in rainfall extremes in a 268-year daily time series. Water Resour Res 2015;51:639–47. http://dx.doi.org/ 10.1002/2014WR015885. [21] Koutsoyiannis D. Uncertainty, entropy, scaling and hydrological stochastics. 2. time dependence of hydrological processes and time scaling/ incertitude, entropie, effet d’échelle et propriétés stochastiques hydrologiques. 2. dépendance temporelle des processus hydrologiques et échelle temporelle. Hydrol Sci J 2005;50(3):405–26. http://dx.doi.org/10.1623/ hysj.50.3.405.65028. [22] Cramer H. Mathematical methods of statistics. Princeton, NJ: Princeton University Press; 1946. [23] Harris R. Extreme value analysis of epoch maxima convergence, and choice of asymptote. J Wind Eng Ind Aerodyn 2004;92(11):897–918. http://dx.doi.org/ 10.1016/j.jweia.2004.05.003. [24] Cook NJ, Harris R. 
Exact and general FT1 penultimate distributions of extreme wind speeds drawn from tail-equivalent Weibull parents. Struct Saf 2004;26(4):391–420. http://dx.doi.org/10.1016/j.strusafe.2004.01.002. [25] Abaurrea J, Cebriàn AC. Drought analysis based on a cluster poisson model: distribution of the most severe drought. Clim Res 2002;22:227–35. http:// dx.doi.org/10.1175/JHM494.1. [26] Furrer EM, Katz RW. Improving the simulation of extreme precipitation events by stochastic weather generators. Water Resour Res 2008;44. http:// dx.doi.org/10.1029/2008WR007316. [27] Wilson PS, Tuomi R. A fundamental probability distribution for heavy rainfall. Geophys Res Lett 2005;32(14):1–4. http://dx.doi.org/10.1029/2005GL022465.