Optimal Rain Gauge Network Design and Spatial Precipitation Mapping Based on Geostatistical Analysis from Colocated Elevation and Humidity Data

The accurate estimation of the spatial rainfall distribution requires a dense network of instruments, which entails large installation and operational costs. It is thus necessary to optimize the number of rainfall stations and estimate point precipitation at unrecorded locations from existing data. This paper serves 2 objectives: i) to establish a spatial representative rainfall stations from the entire existing network in the study area (i.e., rainfall-data optimization); and ii) to use of multivariate geostatistical algorithm for incorporating relatively cheaper hydrological data into the spatial prediction of rainfall. The technique was illustrated using annual and monthly rainfall observations measured at 326 rainfall stations covering Yom river basin and its vicinity in Thailand. Optimal rain gauge network was designed based on the station redundancy and the homogeneity of the rainfall distribution. Digital elevation, humidity, and temperature models were incorporated into the spatial rainfall prediction using multivariate geostatistical algorithms. The results revealed that the multivariate geostatistical algorithm outperform the linear regression, stressing the importance of accounting for spatially dependent rainfall observations in addition to the collocated elevation. The digital elevation data were highly correlated to monthly monsoon-induced precipitation. Humidity and temperature data exhibited a higher degree of correlation to the monthly precipitation data.


INTRODUCTION
Hydrologic analysis and designs rely significantly on measured rainfall data, including intense storms and flash floods identification based on high resolution estimates of spatial variability in precipitation map.The accurate estimation of the spatial distribution of rainfall usually requires a dense network of instruments, which entails large installation and operational costs.Also, failure of the observers to make necessary visits to the gauge may result in even lower sampling density.It is thus necessary to estimate point rainfall at unrecorded locations from values at surrounding sites.
A number of methods were proposed for the interpolation of rainfall data.The simplest approach, Thiessen polygons, consists of assigning to the unsampled location the record of the closest gauge [1].Thiessen polygon method was used for estimation of areal rainfall [2] as well as applied to the interpolation of point measurements [3].An inverse square distance method developed by the US National Weather Service in 1972 estimates the unknown rainfall depth as a weighted average of surrounding values (i.e., the weights being reciprocal to the square distances from the unsampled location).It is obvious that both Thiessen polygons and inverse distance methods do not take for other factors such as topography that can affect the rainfall measurement into consideration.The isohyetal method [2] was later developed to use the gauge location and measurement as well as knowledge of the factor affecting these measurements to create isohyets (i.e., lines of equal rainfall).Rainfall at the unsampled location is then estimated by interpolation within the isohyets.This technique also possesses some limitations because an extensive gauge network is usually required to draw isohyets quite accurately.
Geostatistics is based on the theory of regionalized variables [4][5][6] and allows the spatial correlation between neighboring observations to predict attribute values at unsampled locations.Better geostatistical rainfall estimates (i.e., Kriging) in comparison to the conventional techniques may be achieved [7,8], particularly in sparsely sampled observations.Borga and Vizzaccaro (1997) and Dirks et al. (1998) discovered that for high-resolution networks (e.g., 13 rain gauges over a 35 km 2 area), the kriging method does not show significantly greater predictive skill than the inverse square distance method.
Besides providing a measure of prediction error (i.e., kriging variance), another advantage of kriging is that the sparsely sampled observations of the primary attribute can be complemented by secondary attributes that are more densely samples [10].In the case of rainfall, secondary information can be in the form of weather-radar observations [11,12] and digital elevation [10].The rationale behind considering digital elevation data, a valuable and cheaper source of secondary information, is that precipitation tends to increase with increasing elevation, mainly because of the orographic uplift effects of mountainous terrain, which causes the air to be lifted vertically, and the condensation occurs due to adiabatic cooling.Multivariate extension of kriging (i.e., cokriging) as well as krigging with an external drift [13] were successfully demonstrated to combine both types of information.A more straightforward approach consists of estimating rainfall at a DEM grid cell through a regression of rainfall versus elevation data [14].
This study aimed to address the issue of incorporating a secondary source of information that tends to be cheaper and more densely sampled into the mapping of precipitation in the Yom river basin area of Thailand.Although estimation for precipitation was demonstrated by accounting for the exhaustive secondary elevation information, the correlation between rainfall and elevation had not yet been applied in Thailand, especially in the study area.Moreover, other secondary information such as humidity and temperature seemed worth accounting into the rainfall mapping since Yom river basin usually experienced both orographic and monsoon rainfall.In this paper, annual and monthly rainfall data from Yom river basin (Thailand) were interpolated using two types of techniques: (1) univariate methods that use only rainfall data recorded at 326 stations (the inverse square distance and ordinary kriging); and (2) algorithms that combine rainfall data with secondary sources of information including elevation, humidity, and temperature (linear regression and simple kriging with varying local means (SKlm)).Prediction performances of various algorithms were achieved and compared with the traditional inverse distance method, in terms of the pattern of spatial dependence of rainfall.

Case Study
The upper central plain, Thailand, covers about 38,000 km 2 (180 km × 300 km) of eight provinces with a population of four million people.The main landuse is 63% agricultural, out of which 21% is irrigated, and 24% forest.The basin is surrounded in the East and West by volcanic rocks.The average elevation of the basin is 40-60 m above mean sea level.The basin drains into the lower basin in the South where some free discharge is partially obstructed by crystalline rocks.The 900-1,450 mm annual rainfall within the study region is apportioned to 81% in the wet (April-September) and 19% in the dry season (October-March).Figure 1 shows the location of 326 daily recorded rain gauge stations (in Yom river basin and its vicinity) employed in this study.The monthly and annual rainfall measurements have been averaged from 1970-2003.
Another source of information is the digital elevation map as shown in figure 2. Each grid cell represents 1 km 2 and its elevation was computed as the average of the elevations at 4 discrete points within the cell.The rationale of incorporating the secondary digital elevation information into the mapping of rainfall in Thailand has been motivated by Goovaerts 2000 [10].The particular study showed that the precipitation of Algarve region in Portugal could be estimated accounting for the exhaustive secondary elevation information.However, the correlation between precipitation and elevation data has not yet been applied in Thailand, especially in the study area.Additionally, other secondary information such as humidity and temperature seem worth accounting into the mapping of monsoon rainfall in Thailand as the strong correlations between precipitation and elevation may be suitable for orographic

Mapping of Precipitation
The problem is in fact involving the estimation of rainfall depth z at an unsampled location u using rainfall data.Let be the set {z(u α ), α = 1,...,n} of precipitation values measured at n = 326 locations.Elevation is denoted by y and is available at all primary data locations u a and at all locations u being estimated.Using this information, the precipitation z at any grid node u may be estimated.

Inverse Square Distance Method
To avoid unrealistic patchy precipitation maps that could result from Thiessen polygon technique, the rainfall depth z can be estimated as a linear combination of several surrounding observations, with the weights being inversely proportional to the square distance between observations and u.The basic idea behind the weighting scheme is that observations that are close to each other on the ground tend to be more alike than those further apart, hence observations closer to u should receive a larger weight.

Linear Regression
A straightforward approach consists of modeling the relationship between elevation and rainfall, e.g., using a linear function of the type: This relation was then employed to convert each elevation (from DEM) into a rainfall value.A major shortcoming of this linear regression is that the precipitation at a particular grid node u is derived only from the elevation at u regardless of the precipitation value at the surrounding climatic stations u a Such an approach only assumes that precipitation values are independent from one another, which is rarely the case in practice.

Semivariogram
Spatial dependence between observations can be detected using the semivariogram which is a measure of average dissimilarity between observations as a function of the separation vector h.The experimental semivariogram γ ^(h) is computed as half the average squared difference between the components of every data pair as follows: (3) where N(h) is the number of pairs of data locations a vector h apart. (1)

Geostatistical Interpolation Approach
Geostatistical interpolation allows us to account for the spatial dependence between observations in the prediction of attribute values.Most of geostatistics is based on the concept of a random function, whereby the set of unknown values is regarded as a set of spatially dependent random variables.Each measurement z(u a ) is thus interpreted as a particular realization of a random variable z(u a ) [5].
Kriging is a generic name adopted by geostatisticians for a family of generalized least-squares regression algorithms.The basic idea to be employed in this paper is to estimate the unknown precipitation value at the unsampled location u as a linear combination of neighboring observations.If only precipitation values are available, the simple kriging (SK) estimate z(u) is: (4) where m is the stationary mean of Z(u), and λ α SK (u) is the weight assigned to datum z(u a ) within the search neighborhood W(u).The kriging weights are determined such as to minimize the estimation variance, while ensuring unbiasedness of the estimator.

Simple Kriging with Varying Local Means (SKlm)
The three algorithms introduced herein this work are the variants of simple kriging that allow to incorporate secondary information.Simple kriging with varying local means (SKlm) amounts at replacing the known stationary mean in the SK estimate (equation 4) by known varying means m * SK (u) derived from the secondary information [5]: (5)

Precipitation Map from Univariate Methods
Figures 3B and 3C show the map of annual rainfall interpolated at the nodes of 1×1 km 2 grid corresponding to the available resolution of the elevation model using the inverse square distance technique and geostatistical analysis via ordinary kriging differed significantly.Ordinary kriging seemed to provide a smoother precipitation map, while inverse square distance yielded a relatively crude precipitation data.This observation perhaps contributed to nonuniform distribution of the rain gauge network in the study area, stressing the importance of accounting for more densely sampled information.
Another attractive advantage of Geostatistics is that it can provide a measure of dissimilarity between observations using the semivariogram.The experimental semivariogram γ ^( h), computed from equation 3, is a function of both distance and direction, so it can account for directiondependent variability (i.e., anisotropy spatial pattern).The spatial variability was assumed to be identical in all direction, hence only omidirectional semivariograpm was computed in this study.Figure 4 shows the semivariogram of annual rainfall computed from 326 observations of figure 3.
The three models above were fitted using regression and were such that the weighted sum of squares (WSS) of differences between experimental γ ^(h) and model y(h) semivariogram values was minimum based on the following equation:   Semivariogram values increased with the separation distance, reflecting that two rainfall data close to each other are more similar, and thus their squared difference should be even smaller, than those that are further apart.The semivariogram seemed to reach a maximum at 45 km before fluctuating around a still value, supporting the "hole effect" characteristics typically reflected pseudo-periodic or cyclic phenomena [4].

Accounting for Elevation
Now, the point of rainfall data (z(u a )) at 326 stations were supplemented by elevation data (y(u)) available at all estimation grid nodes.In this study, a straightforward approach (i.e, linear regression) consists of predicting the precipitation as a function of co-located elevation according to equation 2 was employed.Another geostatistical approach, namely simple kriging with varying local means (SKlm) was employed to create representative rainfall map of Yom river basin in the study area.This replaces the known stationary mean in the simple kriging estimate by known varying means.Since this study consisted of estimating and modeling the semivariogram YR(h), then SKlm system could be expressed in terms of only covariances as CR(0) -YR(h).This was quite convenient as the best semivariogram model was only necessary for precipitation map generation.The precipitation map generated from point estimation of rainfall data at 326 stations supplemented by elevation data using linear regression and SKlm techniques are presenting in figure 5.
The results from figure 5 revealed that the precipitation map generated using SKlm was more similar to the one produced from ordinary kriging approach, indicating that the impact of elevation on rainfall map was less pronounced that for the map generated using linear regression.
The performances of the interpolators described herein this work were assessed and compared using cross validation [15].Although the various kriging algorithms provided an estimate of the error variance, it could not be retained as a performance criterion because in practice it usually provided

Figure 1 .
Figure 1.(A) Study area in Yom river basin; (B) distribution of rain gauges in the study area and vicinity; (C) monthly rainfall at a station in the study area showing the highest rainfall in August and September of Thailand.

Figure 2 .
Figure 2. Digital elevation model of the study area in Yom river basin of Thailand.type of rainfall causing by the vertically lifted air and the condensation from adiabatic cooling[10].

( 6 )
Calculation of WSS (data not shown) indicated that the spherical model yielded the smallest WSS value.The results from this study agree with others that the spherical model is the most widely used semivariogram model and is characterized by a linear behavior at the origin.Figure3Cshows the precipitation map generated by ordinary kriging using the spherical semivariogram model.

Figure 3 .
Figure 3. (A) Distribution of rain gauge in the study area and vicinity; (B) annual rainfall map (mm) obtained by interpolation of 326 observations using inverse square distance method; and (C) annual rainfall map (mm) obtained by interpolation of 326 observations using ordinary kriging technique.

Figure 4 .
Figure 4. Experimental semivariogram of the annual rainfall with three different permissible models fitted: (A) spherical model; (B) exponential model; and (C) hole effect model.

Figure 5 .
Figure 5. Annual precipitation maps (mm) obtained by interpolation of 326 observations accounting for the digital elevation model (figure 2).Two algorithms were considered: (A) linear regression; (B) simple kriging with varying local means; (C) ordinary kriging.littleinformation on the reliability of the kriging estimate[16].The idea consists of temporarily removing one rainfall observation at a time from the data set and

Figure 7 .
Figure 7. Relative MSE of prediction produced by each geostatistical algorithm using point precipitation data supplemented by humidity data for the monthly and annual rainfall.