# How to use this Self-guided Tour


This self-guided tour is designed for you to work through at your own pace. It also allows you to review or skip a section. The following features are included to help you navigate through this exercise:

- Underlined words take you directly to the respective subject.
- At any point throughout the tour, you may jump to the beginning of any one of the six sections: Aim, Study Area, Procedure, Application, Data, and Appendices.
- The arrow buttons allow you to move forward or backwards in the exercise.

## Aim: Background and Objectives

The low-flow regime of a river controls industrial, agricultural and domestic water resources. In this context, low flows are critical for maintaining surface-water abstraction, dilution of effluents, and hydropower, and for providing an adequate freshwater habitat for a wide range of flora and fauna. Integrated catchment management requires low-flow indices not only at gauged sites but also at ungauged sites. At an ungauged site, low-flow indices must be estimated with appropriate methods (BECKER 1992). A widely used method for this purpose applies multiple regression analysis to estimate low-flow indices at the ungauged site, taking catchment descriptors as independent variables.

This self-guided tour demonstrates a practical procedure to estimate low-flow indices at the ungauged site through the learning-by-screening method. You will learn:

- to develop a conceptual model,
- to translate the conceptual model into a mathematical model,
- to calibrate the necessary mathematical transfer function by means of linear multiple regression analysis between catchment descriptors and low-flow indices,
- to evaluate the model based on prior assumptions and the goodness of the fit,
- to validate the model with the help of a separate data set, and
- to apply the model to a real case.

## Aim: Problem Statement (1)

Fig. 1.1 shows the State of Baden-Württemberg in Southwest Germany. You will be introduced to the study area in the next section. Let us consider the following problem statement:

Suppose a hydro-power station is planned at the outlet of the Wiese catchment. For design purposes we are asked to characterise the low-flow behaviour of the Wiese by determining the Q90 value. Unfortunately, there are no runoff data available for this catchment. Is there a way we could still give an estimate of the Q90?

Fig. 1.1 Map of the study area

## Aim: Problem Statement (2)

Even though runoff data are unavailable, we do have access to information on the catchment itself. Some of this information can be deduced from maps (e.g. catchment area and mean elevation), while other information must be acquired in the field (e.g. precipitation). The available catchment descriptors are listed below.

From historical research we can be quite certain that runoff processes are related to a certain set of catchment attributes, which we will call catchment descriptors. However, we do not know how they are related. How can we gain information on the expected relationship between the Q90 and the catchment descriptors?

Fig. 1.2 Wiese catchment (No. 532) with the site of the proposed hydro-power plant; Q90 = ?

## Aim: Available Catchment Descriptors

Catchment descriptors of the Wiese catchment (Q90 = ?):

| Category | Descriptor | Value |
|---|---|---|
| Climate | Annual precipitation | 1891 mm |
| Land Use | Percentage of urbanisation | 2% |
| Land Use | Percentage of forested area | 63% |
| Soil | Percentage of soils with high infiltration capacity | 0.19% |
| Soil | Percentage of soils with medium infiltration capacity | 0% |
| Soil | Percentage of soils with low infiltration capacity | 99.81% |
| Soil | Percentage of soils with very low infiltration capacity | 0% |
| Soil | Mean hydraulic conductivity of the soils | 201.69 cm/d |
| Soil | Percentage of soils with low hydraulic conductivity | 0% |
| Soil | Percentage of soils with high water-holding capacity in the effective root zone | 0% |
| Soil | Mean water-holding capacity in the effective root zone | 109 mm |
| Hydrogeology | Percentage of rock formations with a very low hydraulic permeability | 0% |
| Hydrogeology | Weighted mean of hydraulic conductivity | 9.62 × 10⁻⁴ m/s |
| Morphometry | Catchment area | 206.28 km² |
| Morphometry | Drainage density | 1.31 km/km² |
| Morphometry | Highest elevation | 1485.5 m a.m.s.l. |
| Morphometry | Lowest elevation | 423.6 m a.m.s.l. |
| Morphometry | Average elevation | 898.74 m a.m.s.l. |
| Morphometry | Maximum slope | 45.54% |
| Morphometry | Minimal slope | 0% |
| Morphometry | Average slope | 18.08% |

## Aim: Problem Statement (3)

Information on the relationship between the Q90 and a certain set of catchment descriptors can be gained by looking at other catchments in the same region for which both flow data and catchment descriptors are available.

By means of multiple regression analysis among the other qualifying catchments in this region, we may be able to find a common regional pattern which describes this relationship between catchment descriptors and the Q90. The resulting equation is called the regional transfer function.

Assuming that the same relationship holds for the Wiese catchment, we can use the regional transfer function to estimate the desired Q90 value at our ungauged site from the Wiese catchment descriptors.

## Aim: Problem Statement (4)

After a more theoretical section on the basics of the regression analysis (Procedure), we will come back to the Wiese catchment in the Application part of this self-guided tour in order to solve the stated problem. We encourage you to use the Data provided to reproduce the regression analysis on your own. In the Appendices we have provided additional information on the catchment descriptors, data sources, and references.

Let us now move on to get acquainted with the Study Area we will work with.

## Study Area: Overview

The general study area is the State of Baden-Württemberg. Baden-Württemberg is located in the southwest of the Federal Republic of Germany and shares a border with France in the west and Switzerland in the south. The region encompasses several landscapes, which exhibit a wide range of morphometry, hydrogeology, soil, land use, and climate. The following slides describe the individual landscapes as well as the climate and hydrology of the study area.

Fig. 1.1 Map of the study area

## Study Area: Rhine Rift Valley

The Oberrheinische Tiefebene (Rhine Rift Valley) is a 300 km long and 20-30 km wide tectonic rift, which is filled with fluvio-glacial deposits. The river Rhine flows through the valley from south to north. It interacts with the sediments to form terraces, alluvial fans, gravel bars, etc. It is here that the lowest elevations of the study area, 85 m to 250 m a.m.s.l., are found. The region is among the warmest in Central Europe, with mean air temperatures around 10 °C, and it receives 600 to 900 mm rainfall per year (BORCHERDT 1991). The favourable climate and fertile soil on extensive loess deposits are the basis for the high agricultural productivity of this region, where wine and fruit are grown (MOHR 1992).

## Study Area: Black Forest

The Schwarzwald (Black Forest) is a mountain range characterized by steep valleys on the western side towards the river Rhine and more gentle slopes on the eastern side towards the Danube.

The northern and eastern Schwarzwald has an average elevation of 600 to 800 m a.m.s.l. (highest peak: Hornisgrinde, 1164 m a.m.s.l.) and is dominated by New Red Sandstone. Due to the relatively permeable bedrock, the drainage network is not particularly dense.

The southern and western part of the Schwarzwald is the most elevated part of the study area, with mean elevations of 1000 m a.m.s.l. Feldberg is the highest point in the study area at 1493 m a.m.s.l., with an average air temperature of 3.2 °C (BORCHERDT 1991). This part also receives the most precipitation in the study area, up to 2100 mm/year. Since the top bedrock is composed of granite and gneiss with relatively low permeability, a significant amount of water is drained on the surface, and a dense drainage network with a mean drainage density of 1.94 km/km² and a maximum of 5.0 km/km² (WUNDT 1953) has developed.

## Study Area: Südwestdeutsches Schichtstufenland

The Südwestdeutsches Schichtstufenland (literally: “Southwest German step-layered land”) is characterized by a relatively level to rolling topography, which is slightly tilted towards the southeast. Its elevation ranges from 700 to 1000 m a.m.s.l. Mean annual air temperatures range from 6 to 9 °C. The region receives between 650 and 900 mm rainfall per year.

The bedrock is composed of layers of sedimentary rocks, such as New Red Sandstone, Coquina, Keuper, and Jurassic, which exhibit karstic phenomena, such as dolines and sinkholes. Dry valleys are relics from periods of colder climate when the ground was frozen so that more water drained on the surface. Drainage density today may be as low as 0.03 km/km² (WUNDT 1953). In some areas, limestone is covered with loess, which causes an increase in drainage density. New Red Sandstone is also found in this region in alternating layers with marl. Due to differential erosion and the inclination of these layers, a sequence of steps has formed in the landscape.

## Study Area: Pre-Alps and Lake Constance

The Alpenvorland (Pre-Alps) is an area where unconsolidated sediments have been rearranged by glaciers. The area around the Bodensee (Lake Constance) was affected by the most recent (Würm) ice age and has a quite pronounced relief with drumlins, lakes, and bogs of glacial origin. For the most part the area drains to Lake Constance, which is part of the river Rhine system. The lake is the result of glacial scouring. With a surface area of 538 km² and a maximum depth of 254 m, it is the largest German lake, and it supplies Stuttgart and several other cities with drinking water.

The northern part of this landscape is a relic of the preceding (Riss) ice age and is therefore more levelled. Along the Danube, gravel with loess deposits can be found. The region lies at a mean elevation of 600 m a.m.s.l. It receives 750 to 1400 mm precipitation per year, and its mean annual air temperature is between 6 and 7 °C.

## Study Area: Climate (1)

The climate in the study area is the result of the interaction of oceanic and continental influences. While the latter is responsible for seasonality (with cold winters and hot summers), the dominating impact of the former leads to a more temperate climate. July is usually the warmest and January the coldest month. The mean annual air temperature ranges from 3.2 °C at the Feldberg (highest elevation of the study area) to above 10 °C in the Rhine Rift Valley (HUTTENLOCHER 1972).

Fig. 2.2 Mean annual precipitation (1961-90) [mm]

## Study Area: Climate (2)

Precipitation in this area is predominantly caused by frontal (cyclonic) storms. This pattern is modified by orographic lifting. Therefore, the amount of precipitation is mostly a function of elevation and exposure. It ranges from 2100 mm/year (in the western part of the Black Forest, such as the Feldberg) to 600 mm/year (in the sheltered areas of the Rhine Rift Valley). During the summer, convective lifting may induce short-duration, high-intensity precipitation. The study area receives precipitation throughout the year, with a maximum in the summer (June to August) and a minimum in the late winter (February and March). There is snow on the ground for up to 150 days per year in the Black Forest (HUTTENLOCHER 1972).

## Study Area: Hydrology

Three quarters of the study area is drained by the river Rhine (the only alpine river flowing to the North Sea) and one quarter by the Danube. Since the area draining into the river Rhine falls more steeply, backwards erosion allows its headwaters to tap into the Danube catchments. It is difficult to map out the exact position of the European groundwater divide in this area, since a significant amount of water drains in the karst system, part of which is diverted from the Danube into the Rhine (VILLINGER 1982).

During low-flow periods, all the water from the Danube leaves the river bed between Immendingen and Fridingen through sinkholes and continues to flow underground. Two thirds of this water ends up in the river Rhine system (BORCHERDT 1991). With the lack of substantial tributaries and the loss of water to the river Rhine, the Danube remains a relatively small river until its alpine tributaries add to its flow in Bavaria, east of the study area.

Fig. 2.3 Catchments in Southwest Germany (Rhine and Danube)

## Study Area: Drainage Network and Human Impact

The previously discussed variety of landscapes in Southwest Germany is reflected by the regional distribution of drainage density (Fig. 2.4). It is easy to spot the low-lying Rhine Rift Valley as well as the Black Forest and the Pre-Alps, which receive the highest precipitation amounts in the study area. The high drainage density in these regions is indicated by the blue colours. In contrast, the Swabian Alb, part of the Südwestdeutsches Schichtstufenland, is easily distinguishable by the white shading. It has a very low drainage density, due to widespread karstic phenomena.

Water management measures, such as water diversions and exports, stormwater ponds, reservoirs for the augmentation of low flows, and groundwater extraction, are examples of how the hydrological cycle is being quantitatively impacted by human activity in this area. For our example, only catchments with little human impact on the flow regime have been selected.

Fig. 2.4 Drainage density in Southwest Germany

## Study Area: Runoff Regimes

The runoff regime in this region is dominated by the effects of rainfall and modified to some degree by snow melt. The highest flows usually occur between February and April and the lowest in August or September, due to a summer maximum of evapotranspiration.

Figure 2.5 shows the Pardé coefficients (long-term mean monthly flow divided by the mean annual flow) for two catchments in our region, ranging between 0.5 in late summer and 2.0 in the spring.

Fig. 2.5 Pardé coefficients for (1) Breg (at Hammereisenbach) and (2) Elz (at Mosbach)
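The Pardé coefficient can be illustrated with a few lines of Python. This is a minimal sketch, not the procedure used for Fig. 2.5: the function name `parde_coefficients` and the example flows are hypothetical, and the mean annual flow is approximated as the plain average of the twelve monthly means (ignoring differing month lengths).

```python
import numpy as np

def parde_coefficients(monthly_mean_flows):
    """Return 12 Pardé coefficients: mean monthly flow / mean annual flow.

    Assumption: the mean annual flow is taken as the unweighted average
    of the twelve long-term monthly means.
    """
    q = np.asarray(monthly_mean_flows, dtype=float)
    if q.shape != (12,):
        raise ValueError("expected twelve long-term monthly mean flows")
    return q / q.mean()

# Hypothetical long-term monthly mean flows [m3/s], January to December
example_flows = [2.0, 2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
coefficients = parde_coefficients(example_flows)
```

By construction the coefficients average to 1, so values above 1 mark months with above-average flow and values below 1 mark the low-flow season.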

## Procedure: Outline

The procedure relates catchment descriptors (independent variables) to low-flow indices (dependent variables) through a multiple linear regression model:

$$Y_i = b_0 + \sum_j b_j X_{ij} + e_i$$

Here $Y_i$ is the low-flow index of catchment $i$, $X_{ij}$ is the $j$-th descriptor of catchment $i$, $b_0$ and $b_j$ are the regression coefficients, and $e_i$ is the residual. One regional transfer function is computed for each low-flow index:

$$\mathrm{BASE} = b_0 + \sum_j b_j X_{ij} + e_i$$

$$\mathrm{MAM}(10) = b_0 + \sum_j b_j X_{ij} + e_i$$

$$\mathrm{Q90} = b_0 + \sum_j b_j X_{ij} + e_i$$

The steps of the procedure are:

- Data Acquisition (Step 1): catchment selection; selection of catchment descriptors; deduction of catchment descriptors; selection of low-flow indices; calculation of low-flow indices; data splitting into a calibration data set (56 stations) and a validation data set (27 stations)
- Model Design (Step 2): model selection; assumptions and requirements
- Model Calibration (Step 3): selection of algorithms to depict the low-flow indices; computation of regional transfer functions
- Model Evaluation (Step 4): check for sensibleness; model requirements
- Model Validation (Step 5): check for agreement between observed and estimated values
- Model Application

You may choose a specific step of the procedure or proceed in sequence.
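As an illustration of the regression model above, the following sketch fits the coefficients by ordinary least squares with NumPy and then applies them to a new catchment. It is not the tour's own implementation: the function names `fit_transfer_function` and `predict` are hypothetical, and the data are synthetic and noise-free, generated only so that the recovered coefficients can be checked.

```python
import numpy as np

def fit_transfer_function(X, y):
    """Least-squares estimate of [b0, b1, ..., bp] in
    y_i = b0 + sum_j b_j * X_ij + e_i."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that b0 is estimated as the intercept
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply the fitted transfer function to one or more catchments."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    return coef[0] + X @ coef[1:]

# Synthetic calibration set: 56 catchments, 2 made-up descriptors
rng = np.random.default_rng(0)
X_cal = rng.uniform(size=(56, 2))
y_cal = 1.0 + 2.0 * X_cal[:, 0] - 0.5 * X_cal[:, 1]   # no noise term

coef = fit_transfer_function(X_cal, y_cal)
estimate = predict(coef, [0.5, 0.5])   # estimate for an "ungauged" catchment
```

With real data the residuals are not zero, which is why the procedure continues with separate evaluation and validation steps.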


## Data Acquisition: Overview

This preparatory step is foundational to the success of the whole analysis and estimation process. Our results can only be as good as the data on which we base our calculations. Therefore, adequate resources and attention should be given to this crucial step.

The data used in the self-guided tour have been provided by different project groups and institutions, e.g. WaBoA, RIPS-Pool, LfU, LGRB (which are all part of the European Water Archive EWA), and the KLIWA project group. The applicable data associated with each of the respective catchments were entered into a two-dimensional spreadsheet, which can be accessed through the Data section. The administrative acronyms are explained on the following slide.

Data acquisition comprises the following steps:

- Catchment selection
- Selection of catchment descriptors (independent variables)
- Deduction of catchment descriptors
- Selection of low-flow indices (dependent variables)
- Calculation of low-flow indices
- Data splitting into calibration data sets (56 stations) and validation data sets (27 stations)

AppendicesAimDataStudy AreaProcedureApplication  KLIWA – Projekt Klimaänderung und Konsequenzen für die Wasserwirtschaft (Climatic Change and Impact on Water Resources Management) Organisations  LfU – Landesanstalt für Umweltschutz (Environmental Agency, Regional Office, State of Baden-Württemberg)  LGRB – Landesanstalt für Geologie, Rohstoffe und Bergbau Baden-Württemberg (Regional Office for Geology, Commodities, and Mining, State of Baden-Württemberg) The data used in the self-guided tour were provided by the following data pools, projects, and organisations: Data Pools  RIPS-Pool – Räumliches Informations- und Planungssystem (Spatial Information and Planing System, State of Baden- Württemberg)  EWA - European Water Archive of the Northern European FRIEND project (Flow Regimes from International and Experimental Data) Projects  WaBoA – Wasser und Boden Atlas von Baden-Württemberg (Water and Soil Atlas of the State of Baden-Württemberg) back Data Acquisition Acronyms 1 1 Procedure

## Data Acquisition: Catchments

In a first step, the catchments to be considered for the analysis must be selected. The catchments for this study were selected based on the following FRIEND EWA criteria:

- Availability of continuous runoff data
- Precision in gauging low-water runoff. Accurate low-flow measurements are difficult to attain. According to MORGENSCHWEIS (1990), gauging errors of 10% are common and may, in case of heavy vegetation in the river bed, even reach 30% (GLOS & LAUTERBACH 1972)
- Negligible influence of human activity on low-water runoff
- Negligible influence of glacial runoff on total streamflow
- Availability of catchment descriptors

Based on these criteria, 83 medium-scale catchments were selected for this study (Figure 3.1).

Fig. 3.1 Spatial distribution of selected catchments

## Data Acquisition: Catchment Descriptors (Overview)

The catchment descriptors are the independent variables in the model to be established. They were selected based on the following criteria (HAAS 2000):

- agreement with hydrological principles
- spatial representation with respect to climate, land use, morphometry, soil, and hydrogeology
- experience in using these independent variables in other studies
- availability for the study area
- relatively easy calculations
- possible interpretation as areal means

Tab. 1 Selected catchment descriptors

| Category | Descriptor | Unit |
|---|---|---|
| Climate | Average annual precipitation | mm |
| Land Use | Percentage of urbanisation | % |
| Land Use | Percentage of forest | % |
| Morphometry | Catchment area | km² |
| Morphometry | Drainage density | km/km² |
| Morphometry | Highest elevation | m a.m.s.l. |
| Morphometry | Average elevation | m a.m.s.l. |
| Morphometry | Lowest elevation | m a.m.s.l. |
| Morphometry | Maximum slope | % |
| Morphometry | Average slope | % |
| Morphometry | Minimal slope | % |
| Soil | Percentage of soils with high infiltration capacity | % |
| Soil | Percentage of soils with medium infiltration capacity | % |
| Soil | Percentage of soils with low infiltration capacity | % |
| Soil | Percentage of soils with very low infiltration capacity | % |
| Soil | Mean hydraulic conductivity of the soils | cm/d |
| Soil | Percentage of soils with low hydraulic conductivity | % |
| Soil | Percentage of soils with high water-holding capacity in the effective root zone | % |
| Soil | Mean water-holding capacity in the effective root zone | mm |
| Hydrogeology | Percentage of rock formations with a very low hydraulic permeability | % |
| Hydrogeology | Weighted mean of hydraulic conductivity | m/s |

Fig. 3.2 Catchment descriptors (PLATE 1992), grouped into five categories: morphology and morphometry, climate, hydrogeology, soil, and land use.

## Data Acquisition: Morphology and Morphometry

- AREA: Catchment area [km²]. The catchment area is defined as the “area having a common outlet for its surface runoff” (IHP/OHP 1998). The descriptor was deduced from a 1:50,000 scale map of catchment boundaries provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool.
- DD: Drainage density [km/km²]. Drainage density is the “total channel-segment length, accumulated for all [stream] orders within a drainage area, divided by the area” (IHP/OHP 1998). For the deduction procedure, 1:50,000 scale maps of catchment boundaries and drainage network (WaBoA and RIPS-Pool) were combined.
- HMIN: Lowest elevation [m a.m.s.l.]; HMAX: Highest elevation [m a.m.s.l.]; HMEAN: Average elevation [m a.m.s.l.]. The elevation data are based on a digital elevation model (50 m by 50 m cells), provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool.
- SLOPEMIN: Minimal slope [%]; SLOPEMAX: Maximum slope [%]; SLOPEMEAN: Mean slope [%]. Minimum, maximum and mean slopes were deduced using a digital elevation model.

## Data Acquisition: Land Use and Hydrogeology

Remote sensing was used to derive land use for the area (Landsat TM, 30 x 30 m grid, 1993). It was classified into 16 classes, which were aggregated to four groups: forest, farmland, grassland, and settlements/urban areas. Only the relative proportions of forest and urban areas were included in this self-guided tour.

- URBAN: Percentage of urbanisation [%]. URBAN is an aggregation of settlement areas and areas with large-scale surface sealing due to industry. The latter covers 0.8% of the study area. Settlements comprise loose (1.9%) and dense (4.6%) settlements.
- FOREST: Percentage of forest [%]. FOREST is a combination of deciduous (7.8%) and coniferous (21.4%) forest and other forested areas (10.0%).

The hydrogeological descriptors are:

- GEOHCMEAN: Weighted mean of hydraulic conductivity [m/s]
- GEOVLHP: Percentage of rock formations with a very low hydraulic permeability [%]

From a 1:350,000 scale map produced by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB), 98 geological classes were reduced to 54 hydrogeological classes and aggregated to eight groups. Each group was associated with a mean hydraulic conductivity of the upper hydrogeological unit. From these values, a weighted mean was produced for each catchment. From the same data, the proportion of rock formations with a mean hydraulic conductivity of less than 10⁻⁵ m/s was derived.
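The two hydrogeological descriptors can be sketched as follows. The source does not state which weights were used for the weighted mean, so this example assumes area fractions as weights and an arithmetic mean; the function name `geo_descriptors` and the example values are hypothetical.

```python
import numpy as np

def geo_descriptors(group_areas, group_conductivities, threshold=1e-5):
    """Return (GEOHCMEAN, GEOVLHP) for one catchment.

    group_areas: area of each hydrogeological group inside the catchment
    group_conductivities: mean hydraulic conductivity of each group [m/s]
    Assumption: area-weighted arithmetic mean; the tour only says "weighted mean".
    """
    area = np.asarray(group_areas, dtype=float)
    k = np.asarray(group_conductivities, dtype=float)
    weights = area / area.sum()
    geohcmean = float(np.sum(weights * k))
    # Share of the catchment whose group conductivity is below the threshold
    geovlhp = float(100.0 * weights[k < threshold].sum())
    return geohcmean, geovlhp

# Hypothetical catchment with three groups (areas in km2, conductivities in m/s)
mean_k, pct_very_low = geo_descriptors([50.0, 30.0, 20.0], [1e-3, 1e-4, 1e-6])
```

Because conductivities span several orders of magnitude, an arithmetic mean is dominated by the most permeable group; a geometric mean would be a different design choice.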

## Data Acquisition: Soil (1)

The classification of the soil water regime was based on a study by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB). They produced a 1:350,000 scale map of 29 soil water regime classes based on soil type, humus content, packing, slope, and geology. These classes were aggregated to four groups of soil types based predominantly on their infiltration capacity, which is defined as the “maximum rate at which water can be absorbed by a given soil per unit area under given conditions” (IHP/OHP 1998).

- SOILH: Percentage of soils with high infiltration capacity [%]. These soils, such as sand and gravel soils, exhibit a high infiltration capacity even under conditions of high antecedent soil water content.
- SOILM: Percentage of soils with medium infiltration capacity [%]. Examples of soils which feature a medium infiltration capacity are loamy soils and loess of medium depth.
- SOILL: Percentage of soils with low infiltration capacity [%]. The low infiltration capacity of these soils is due to their fine texture and/or the impermeability of one or more layers, as found in shallow sandy and loamy soils.
- SOILVL: Percentage of soils with very low infiltration capacity [%]. The infiltration capacity in these soils is very low because they are shallow, composed of hardly permeable material (such as clay), or have a high groundwater level.

## Data Acquisition: Soil (2)

- SOILHCMEAN: Mean hydraulic conductivity of the soils [cm/d]
- SOILLHC: Percentage of soils with low hydraulic conductivity [%]

Hydraulic conductivity is a “property of a saturated porous medium which determines the relationship, called Darcy’s law, between the specific discharge and the hydraulic gradient causing it” (IHP/OHP 1998). From a 1:200,000 scale map with 9 classes, areal means were deduced. The lowest two classes (with a mean hydraulic conductivity of less than 2.3 × 10⁻⁶ m/s) were combined for the calculation of the percentage of soils with low hydraulic conductivity.

- ROOTSMEAN: Mean water-holding capacity in the effective root zone [mm]
- ROOTSHIGH: Percentage of soils with high water-holding capacity in the effective root zone [%]

The data for these descriptors are based on a map produced by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB), which shows the distribution of water-holding capacity for a theoretical soil depth of 100 cm. Water-holding capacity is defined as “water in the soil available to plants. It is normally taken as the water in the soil between wilting point and field capacity. In this context water-holding capacity is used and is identical to the available water” (IHP/OHP 1998). Based on information on soil type, land use, root depth, and waterlogging conditions, the water-holding capacity values were adjusted to the estimated effective root zone. These values were then used to compute the areal mean. A threshold mean water-holding capacity was set at 200 mm. All classes above this threshold were aggregated to “soils with high water-holding capacity in the effective root zone”, and their proportion was calculated.

## Data Acquisition: Climate

- AAR: Average annual precipitation [mm]

The data for the average annual precipitation were derived from a digital map provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool. It shows average annual precipitation for the period 1961-1990 on a 500 m grid. For this map, average annual precipitation had been calculated from the relationship between precipitation depth and altitude, combined with distance-weighting from the points of measurement. The raw data for the production of this map were provided by the German Weather Service (DWD).

Fig. 2.2 Mean annual precipitation (1961-90) [mm]

## Data Acquisition: Low-Flow Indices

Several low-flow indices have been developed to describe the statistical distribution of flow. The low-flow indices are the dependent variables in our model. The estimation procedure was performed for the following three low-flow indices:

- the mean base flow, BASE,
- the mean annual 10-day minimum flow, MAM(10), and
- the 90th-percentile runoff, i.e. the runoff equalled or exceeded 90% of the time, Q90.

The low-flow indices are calculated from daily flow data for the entire data set. Our regression analysis will be performed separately for each of the three dependent variables.

## Data Acquisition: BASE (1)

The method of base flow estimation from daily flow data was developed by WUNDT (1958) and KILLE (1970) and modified by DEMUTH (1993). It serves as an example of how more complex indices can be obtained in an automated and objective way. The approach is based on the analysis of monthly minimum flows. It assumes that for the most part monthly minimum flows are equivalent to the mean base flows of the respective months.

Monthly minimum runoff values are extracted from a time series of at least ten years, and the individual values are ranked in ascending order and plotted (Figure 3.3). The ranked flow data resemble a flow duration curve, with the lower values arranged approximately along a straight line. At the critical point the slope of the curve sharply increases. It is assumed that flows beyond the critical point are not “pure” base flow.

Fig. 3.3 Monthly minimum runoff values, ranked in ascending order (Elsenz at Meckesheim, No. 460, 1966-90); axes: rank, streamflow [m³/s]

Data Acquisition: BASE

A stepped linear regression is computed to find the line that separates the flow values influenced by surface and subsurface flow from the "pure" base flow values (Figure 3.4). The stepped regression starts with the values between the 5% and the 50% mark. Values beyond the 50% mark are then included successively, and the correlation coefficient is recomputed each time until it reaches a maximum. The point at which this maximum is reached is called the critical point. Between the 5% value and the critical point a straight line is fitted and extended in both directions to correct the higher flows to "true" base flow. Finally, all flow values are adjusted to the straight line and the mean base flow is calculated (yellow arrows in Figure 3.4).

Fig. 3.4: Monthly minimum runoff values, ranked in ascending order, with the 5% value, the 50% value and the critical value marked (Elsenz at Meckesheim, No. 460, 1966-90); axes: rank, streamflow [m³/s]
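The stepped regression described above can be sketched in a few lines of Python. This is only one possible reading of the slide, not DEMUTH's implementation: the rounding of the 5% and 50% marks, the use of the overall (rather than the first) maximum of the correlation coefficient, and the interpretation of "adjusted to the straight line" as replacing every ranked value by the value of the extended line are all assumptions. The function names are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_base_flow(monthly_minima):
    """Return (mean base flow, critical rank) for a list of monthly minimum flows."""
    flows = sorted(monthly_minima)           # rank in ascending order
    n = len(flows)
    ranks = list(range(1, n + 1))
    lo = n * 5 // 100                        # start at the 5% mark (assumed rounding)
    hi = n * 50 // 100                       # first regression ends at the 50% mark
    best_r, critical = -1.0, hi
    for end in range(hi, n + 1):             # include values beyond the 50% mark step by step
        r = pearson_r(ranks[lo:end], flows[lo:end])
        if r > best_r:                       # keep the end point with the highest correlation
            best_r, critical = r, end
    slope, intercept = fit_line(ranks[lo:critical], flows[lo:critical])
    line = [intercept + slope * k for k in ranks]   # line extended over all ranks
    return sum(line) / n, critical
```

With flows that rise linearly up to some rank and sharply afterwards, the returned mean lies below the plain arithmetic mean of the monthly minima, because the values beyond the critical point are reduced to the line.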

Data Acquisition: BASE

The DEMUTH procedure can only be applied to the S-shaped curve (Type I). The parabolic curve (Type II) does not allow a linear reduction procedure. In our data sets, all flow data belonged to Type I and could be used to derive BASE for the respective catchment.

Fig. 3.5: Type I and Type II curves; axes: rank, streamflow [m³/s]

Data Acquisition: MAM(10)

The MAM(10) value is calculated by selecting the annual ten-day minimum value of discharge, AM(10), from each year of the observation period and computing the arithmetic mean of this set of values (Figure 3.6).

Fig. 3.6: Annual 10-day minimum values of discharge and their arithmetic mean (Elsenz at Meckesheim, No. 460, 1966-90); axes: year, streamflow [m³/s]
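As a minimal sketch of this calculation, the Python snippet below assumes that the annual ten-day minimum is the lowest mean flow over any ten consecutive days within a year, and that the daily flows are already grouped by year (the slide does not say whether calendar or hydrological years are used). The function names and the input layout are hypothetical.

```python
def annual_minimum(daily_flows, window=10):
    """Lowest mean flow over any run of `window` consecutive days."""
    running = sum(daily_flows[:window])      # sum of the first window
    lowest = running
    for i in range(window, len(daily_flows)):
        running += daily_flows[i] - daily_flows[i - window]   # slide the window by one day
        lowest = min(lowest, running)
    return lowest / window


def mam(flows_by_year, window=10):
    """Arithmetic mean of the annual `window`-day minima."""
    minima = [annual_minimum(q, window) for q in flows_by_year.values()]
    return sum(minima) / len(minima)


# Made-up example: one year with a ten-day dry spell, one year with constant flow.
example = {
    1966: [5] * 20 + [1] * 10 + [5] * 20,
    1967: [3] * 50,
}
print(mam(example))
```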

Data Acquisition: Q90

In a Flow Duration Curve (FDC) the observed flow data are ranked in descending order. The curve displays the relationship between a discharge value and the percentage of time during which it is equalled or exceeded. The Q90 is the value that is equalled or exceeded 90% of the time, in this case 0.9 m³/s (Figure 3.7).

Fig. 3.7: Flow duration curve and deduction of the 90 percentile (Elsenz at Meckesheim, No. 460, 1966-90); axes: percentiles, streamflow [m³/s]
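The deduction of Q90 from ranked daily flows can be illustrated as follows. The sketch assumes a simple empirical rule: the flow at rank k (counted from the largest value) is equalled or exceeded during 100·k/n percent of the time. Other plotting-position conventions exist and give slightly different values; the function name is hypothetical.

```python
def flow_exceeded(daily_flows, percent):
    """Flow that is equalled or exceeded during `percent` (an integer) percent of the time."""
    ranked = sorted(daily_flows, reverse=True)   # descending order, as in a flow duration curve
    n = len(ranked)
    k = (percent * n + 99) // 100                # smallest rank whose share k/n reaches percent
    return ranked[k - 1]


flows = [4, 9, 1, 7, 10, 3, 6, 2, 8, 5]          # made-up daily flows
q90 = flow_exceeded(flows, 90)                   # nine of the ten values are >= q90
```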

Data Acquisition: Data Splitting

The final step in preparing the data is to split the acquired data set arbitrarily into two sets, one for model calibration and one for model validation. This is called data splitting. It is advisable to split the data with a ratio of about 2 to 1, ensuring that both data sets reflect the physiographic properties of the region under study. The Baden-Württemberg data set was split into a calibration data set of 56 stations and a validation data set of 27 stations. Each set contains the catchment descriptors (independent variables) and the low-flow indices (dependent variables) of its stations.
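A split of 83 stations into 56 and 27 could be produced along the following lines. This is a sketch only: the station names are invented, and a purely random shuffle does not by itself guarantee that both sets reflect the physiography of the region, so the result would still need to be checked by hand.

```python
import random


def split_stations(station_ids, n_calibration, seed=0):
    """Randomly split station identifiers into a calibration and a validation set."""
    ids = list(station_ids)
    random.Random(seed).shuffle(ids)         # fixed seed keeps the split reproducible
    return sorted(ids[:n_calibration]), sorted(ids[n_calibration:])


stations = [f"station_{k:02d}" for k in range(1, 84)]   # 83 hypothetical station names
calibration, validation = split_stations(stations, 56)
print(len(calibration), len(validation))
```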

Procedure Outline

Multiple linear regression model: $Y_i = b_0 + \sum_{j=1}^{P} b_j X_{ij} + e_i$

- Data Acquisition (Step 1): Catchment Selection; Selection of Catchment Descriptors; Deduction of Catchment Descriptors; Selection of Low-Flow Indices; Calculation of Low-Flow Indices; Data Splitting into a calibration data set (56 stations) and a validation data set (27 stations), each with catchment descriptors (independent variables) and low-flow indices (dependent variables)
- Model Design (Step 2): Model Selection; Assumptions and Requirements
- Model Calibration (Step 3): Selection of Algorithms to depict the low-flow indices; Computation of Regional Transfer Functions for BASE, MAM(10) and Q90, each of the form $b_0 + \sum_{j=1}^{P} b_j X_{ij} + e_i$
- Model Evaluation (Step 4): Check for Sensibleness; Model Requirements
- Model Validation (Step 5): Check for agreement between observed and estimated values
- Model Application

You may choose a specific step of the procedure or proceed in sequence.

Model Design: Model Selection

In this self-guided tour, the multiple regression approach is chosen because it is easy to handle, produces fast results, and is available as a standard procedure in most statistics programs. The purpose of multiple regression analysis, as defined by HOLDER (1985), is to "assess the combined effect of several variables on a single variable." The regression analysis thereby allows statistical relationships to be recognised and interpreted. The understanding gained from this analysis can be used to estimate a dependent variable from several independent variables. In our model, the independent variables are the catchment descriptors, and the dependent variables are the low-flow indices.

By applying the regression approach we assume that the relationship between a low-flow index Y and its catchment descriptors X can be expressed as follows:

$$Y_i = b_0 + \sum_{j=1}^{P} b_j X_{ij} + e_i, \qquad i = 1, \dots, N, \quad j = 1, \dots, P$$

where $Y_i$ is the dependent variable, $b_0$ is a constant and the $b_j$ are coefficients. $X_{ij}$ is catchment descriptor j of catchment i, N is the total number of data sets (samples), P is the total number of independent variables, and $e_i$ is the error term (DEMUTH 1993).
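To make the model equation concrete, the sketch below fits $b_0$ and the $b_j$ by ordinary least squares with NumPy. The tour itself uses SPSS; NumPy is swapped in here purely for illustration, and the data are synthetic, generated from known coefficients so that the fit can be checked.

```python
import numpy as np


def fit_multiple_regression(X, y):
    """Least-squares fit of y = b0 + sum_j b_j * X[:, j] + e; returns (b0, b, residuals)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])       # leading column of ones carries b0
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coeffs                      # e_i = observed minus predicted
    return coeffs[0], coeffs[1:], residuals


# Synthetic example: N = 30 samples, P = 2 descriptors, no noise.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(30, 2))
y = 1.5 + 0.8 * X[:, 0] - 0.3 * X[:, 1]
b0, b, e = fit_multiple_regression(X, y)
print(b0, b)
```

Because the synthetic data contain no noise, the fit recovers the generating coefficients up to rounding and the residuals are essentially zero.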

Model Design: Assumptions and Requirements

For the multiple regression model to yield the "best linear unbiased estimates" (LEWIS-BECK 1986), six assumptions have to be made. They become requirements when we want to make predictions based on this analysis:

1. The model is free of specification error.
2. The data set is free of measurement error.
3. Homoscedasticity: the variance of the error term is constant for all values of the independent variables.
4. The error term is neither autocorrelated nor correlated with the independent variables.
5. The error term follows a normal distribution.
6. The model is free of multicollinearity.

Model Design: Assumptions and Requirements

1. The model is free of specification error

We must assume that

- the independent variables $X_j$ (e.g. catchment size, areal precipitation) are linearly related to the dependent variable Y (e.g. Q90), and their effect on Y is truly additive or multiplicative (depending on the model chosen);
- all relevant independent variables have been included in the model, while all irrelevant independent variables have been excluded (LEWIS-BECK 1986).

It is the responsibility of the modeller to use all available statistical and physical knowledge to minimize specification error. Tests for statistical significance help to identify variables that should not be in the model.

2. The data set is free of measurement error

The model relies on the quality of the data. We must be confident that the variables $X_{ij}$ and $Y_i$ have been measured accurately. This condition is difficult to fulfil, particularly since low-flow measurements usually carry an error on the order of 10 to 30% (GLOS & LAUTERBACH 1972).

Model Design: Assumptions and Requirements

3. Homoscedasticity: The variance of the error term is constant for all values of the independent variables

The assumption of homoscedasticity holds when a plot of the residuals versus the predicted values of Y produces a horizontal band of uniform width (Figure 3.8). If this condition is not met, the estimates will not have minimum variance. Consequently, the general procedures related to the t-test, the F-test, and confidence intervals are no longer valid. The evaluation of a regression model therefore requires a proper investigation of the residuals, i.e. the estimation error.

Fig. 3.8: Check for homoscedasticity; axes: predicted values of Y, standardised residuals

Model Design: Assumptions and Requirements

3. Homoscedasticity: The variance of the error term is constant for all values of the independent variables (continued)

If the plot is a tilted band of equal width (Figure 3.9), either an error has occurred in the calculations or the model fails to capture changes in Y accurately. In such a case

- a transformation of Y or
- the inclusion of polynomial terms of X in the model

may prove a remedy (HOLDER 1985).

Fig. 3.9: Check for homoscedasticity; axes: predicted values of Y, standardised residuals

Model Design: Assumptions and Requirements

3. Homoscedasticity: The variance of the error term is constant for all values of the independent variables (continued)

If the band does not have equal width (Figure 3.10), the variance of the error term is not constant. Reasons for this deviation may include an increase of variability with increasing Y or an increase of errors for greater Y. This can be corrected through

- the application of a weighted least squares procedure (LEWIS-BECK 1986),
- a transformation of the variables,
- elimination of part of the values, or
- fitting several models to different ranges of values (HOLDER 1985).

Fig. 3.10: Check for homoscedasticity; axes: predicted values of Y, standardised residuals
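The slides rely on visual inspection of the residual plot. As a rough numerical companion (an addition for illustration, not part of the tour's procedure), the sketch below compares the spread of the residuals in the lower and upper half of the predicted values. A ratio far from 1 hints at a band of unequal width; the function name is hypothetical and no formal test is implied.

```python
import statistics


def spread_ratio(predicted, residuals):
    """Standard deviation of residuals in the upper half of predictions divided by the lower half."""
    pairs = sorted(zip(predicted, residuals))    # order residuals by predicted value
    half = len(pairs) // 2
    lower = [r for _, r in pairs[:half]]
    upper = [r for _, r in pairs[half:]]
    return statistics.pstdev(upper) / statistics.pstdev(lower)


# Made-up residuals whose scatter grows with the predicted value.
predicted = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0]
print(spread_ratio(predicted, residuals))
```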

Model Design: Assumptions and Requirements

4. The error term is neither autocorrelated nor correlated with the independent variables

If this condition is not met, significance tests and confidence intervals will be invalid (HOLDER 1985). Where measurements were collected in a sequence or as part of a time series, time may have an effect on the error term even though it is not specified as a separate independent variable. This can be checked by plotting the error against time or against the sequence number of the measurements. If an (auto-)correlation is detected, the acquired data need to be corrected with respect to time.

Fig. 3.11: Assessment of the effect of time on the error term, panels a and b; axes: time, standardised residuals

Model Design: Assumptions and Requirements

4. The error term is neither autocorrelated nor correlated with the independent variables (continued)

If a trend is visible (Figure 3.11-a), time has explanatory value and should be included in the model. Possibly the measurement procedure has induced a systematic error over time, or the property being measured is undergoing a change. It is also possible that the plot changes in width over time (Figure 3.11-b). Such a change in the variance of the error can result from increased precision of the measuring technique over time (HOLDER 1985).

Model Design: Assumptions and Requirements

4. The error term is neither autocorrelated nor correlated with the independent variables (continued)

A correlation between the error term and an independent variable (Figure 3.12) may occur when a significant variable has been left out of the model and is now accounted for partially by the error term and partially by the other independent variables (LEWIS-BECK 1986). If an independent variable and the error term are correlated, "the least squares parameter estimates will be biased" (LEWIS-BECK 1986).

Fig. 3.12: Assessment of correlation between the error term and an independent variable; axes: values of X, standardised residuals
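Both checks for assumption 4 can be complemented by a simple number. The sketch below computes the lag-1 autocorrelation of residuals taken in time order and reuses the same correlation function for residuals versus an independent variable. This is an illustrative addition under the assumption that the residuals are already sorted by time; the slides themselves only describe the plots.

```python
def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def lag1_autocorrelation(residuals):
    """Correlation between each residual and its successor in the sequence."""
    return correlation(residuals[:-1], residuals[1:])


trend = [-3, -2, -1, 0, 1, 2, 3]          # residuals drifting upwards over time
print(lag1_autocorrelation(trend))        # close to +1: strong positive autocorrelation
```

A value near zero is what the assumption requires; values near +1 or -1 indicate that successive errors are related.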

Model Design: Assumptions and Requirements

5. The error term follows a normal distribution

The fulfilment of this requirement can be assessed visually by comparing a histogram of the residuals (Figure 3.13) or a cumulative distribution of the error term (Figure 3.14) with the expected normal distribution, or mathematically by computing the skewness. Since the X values are fixed, a normal distribution of the error term corresponds to a normal distribution of Y (LEWIS-BECK 1986). This means that, for the assumption to be fulfilled, the data used in the regression analysis also need to follow a near-normal distribution.

Fig. 3.13: Frequency distribution of residuals; axes: residuals, frequency

Fig. 3.14: Probability plot of residuals; axes: predicted cumulative probability of residuals, observed cumulative probability of residuals

Model Design: Assumptions and Requirements

5. The error term follows a normal distribution (continued)

If the error term is not normally distributed, tests of significance and confidence interval statements become questionable. However, "the tests of significance appear to be insensitive to non-normality in Y whenever the Xs themselves come from a near-normal distribution. On the other hand, if the Xs themselves do not come from a near-normal distribution and if some X values are very different in magnitude from the remainder, then the tests of significance are very sensitive to non-normality in Y" (LEWIS-BECK 1986).
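The skewness mentioned as a mathematical check can be computed directly from the residuals. The sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5th power of the second); which skewness estimator the tour's software reports is not stated, so this choice is an assumption.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n    # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n    # third central moment
    return m3 / m2 ** 1.5


print(skewness([-2, -1, 0, 1, 2]))    # symmetric residuals
print(skewness([0, 0, 0, 0, 10]))     # one large positive residual: right-skewed
```

Values close to zero are compatible with a normal distribution of the error term; clearly positive or negative values point to an asymmetric distribution.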

Model Design: Assumptions and Requirements

6. The model is free of multicollinearity

Multicollinearity means that one independent variable can be expressed as a linear combination of the remaining independent variables in the model. This is problematic because it produces large variances for the slope estimates, resulting in large standard errors, so that parameter estimates become unreliable (LEWIS-BECK 1986). Furthermore, multicollinearity complicates the interpretation of the regression equation (SCHREIBER 1996).

To detect multicollinearity, each independent variable is regressed on all other independent variables of the model (LEWIS-BECK 1986). DEMUTH (1993) uses 0.8 as the upper limit for the coefficient of determination of these regressions. Variable combinations that are more strongly inter-correlated should be avoided.

The problem of multicollinearity can be addressed by enlarging the sample size or by combining the problematic variables into a single indicator (e.g. through principal component analysis). A third option, excluding the problematic variable, introduces specification error into the model (see assumption 1). Comparing the new (reduced) model with the original model can help to assess the significance of this error (LEWIS-BECK 1986).

The seriousness of violations of the above assumptions is debated in the scientific literature. What can be said is that the conditions differ in their degree of robustness. For example, while the model is relatively robust to violations of the normality assumption (5) for large samples, specification errors (1) generally cause grave problems (LEWIS-BECK 1986).
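The detection step, regressing each independent variable on all the others and comparing the resulting coefficient of determination with 0.8, might look as follows in NumPy. The data are synthetic and the function name is hypothetical; the third column is deliberately built as a near-copy of the first so that the check has something to find.

```python
import numpy as np


def auxiliary_r2(X):
    """R^2 of each column of X regressed on all other columns (with a constant)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    result = []
    for j in range(p):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coeffs, *_ = np.linalg.lstsq(others, target, rcond=None)
        unexplained = np.sum((target - others @ coeffs) ** 2)
        total = np.sum((target - target.mean()) ** 2)
        result.append(1.0 - unexplained / total)
    return result


rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)                        # unrelated to x1
x3 = 2.0 * x1 + 0.01 * rng.normal(size=50)      # almost a multiple of x1
r2 = auxiliary_r2(np.column_stack([x1, x2, x3]))
flagged = [j for j, value in enumerate(r2) if value > 0.8]   # limit used by DEMUTH (1993)
print(flagged)
```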

Model Calibration: Selection of independent variables

When applying an objective procedure to select those variables that correlate highly with the target value, but not with each other, several considerations must be made. First of all, the goal of the modeller should be to explain the observed phenomena as accurately as possible. However, adding further variables to the model increases the risk of introducing variables whose correlation with the target value is coincidental. Furthermore, the specific explanatory value of the model may decrease as the number of fitted parameters increases. In the extreme case, if the number of variables equals or exceeds the number of samples, the coefficient of determination is always 1 and the amount of information gained is virtually zero. This is called overfitting.

Diagram: pool of independent variables, multiple linear regression model $Y_i = b_0 + \sum_j b_j X_{ij} + e_i$, regional transfer function
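The extreme case of overfitting is easy to reproduce. In the NumPy sketch below, a target made of pure random noise is regressed once on two random variables and once on as many random variables as there are samples. The variables carry no information about the target, yet the second fit reaches a coefficient of determination of 1. All data are synthetic and the function name is hypothetical.

```python
import numpy as np


def r_squared_of_fit(X, y):
    """R^2 of a least-squares fit of y on the columns of X plus a constant."""
    design = np.column_stack([np.ones(len(y)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    unexplained = np.sum((y - design @ coeffs) ** 2)
    total = np.sum((y - y.mean()) ** 2)
    return 1.0 - unexplained / total


rng = np.random.default_rng(0)
n = 8                                                     # number of samples
y = rng.normal(size=n)                                    # target: pure noise
r2_few = r_squared_of_fit(rng.normal(size=(n, 2)), y)     # 2 meaningless variables
r2_many = r_squared_of_fit(rng.normal(size=(n, n)), y)    # as many variables as samples
print(r2_few, r2_many)
```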

Model Calibration: Selection of independent variables (continued)

Including fewer variables in the model can help in understanding the dominant processes. As a rule of thumb, the number of independent variables should not exceed a third of the sample size (BACKHAUS et al. 1996).

Computations are performed with SPSS, a statistical computing package. Automated selection procedures are based on statistical indices, such as the coefficient of determination and significance. While these procedures help to identify potential variables to be included in the model, statistical significance must always be balanced with scientific knowledge. It is the responsibility of the modeller to deduce equations that have both statistical and physical significance.

Model Calibration: Selection of independent variables (continued)

The program selects variables from a given pool based on their contribution towards explaining the variance of the target value. Applying the least-squares method ensures that the sum of the squared residuals (the part of the variation of the target value that cannot be explained by the function) is minimized.

The stepwise selection technique (with F-values of 0.1 and 0.2 for accepting variables into and rejecting variables from the model, respectively) is chosen for the selection of independent variables. This procedure helps the modeller understand how the coefficient of determination increases as more variables are accepted into the model.

Model Calibration: Selection of independent variables (continued)

Test model runs produced negative values for the constant $b_0$ in the model equations. As a result, the predicted values for low-flow indices in the lower range were mostly too low and often negative. To reduce the occurrence of negative estimates of the low-flow indices, the regression was forced through the origin, which means that the constant $b_0$ was set to zero.

The following tables (2 to 4) show the results of the selection procedures for each of the three low-flow indices BASE, MAM(10), and Q90. All three procedures show a similar pattern: as more variables are included in the model, the goodness of the model, represented by the coefficient of determination R², increases, but with diminishing returns.
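The effect of forcing the regression through the origin can be shown with a single independent variable. The numbers below are invented for illustration and are not taken from the Baden-Württemberg data set. With a constant, the fitted $b_0$ is negative and the estimate for a small catchment falls below zero; without a constant, the slope is simply $\sum xy / \sum x^2$ and the estimate stays positive.

```python
def fit_with_constant(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1


def fit_through_origin(x, y):
    """Least-squares fit of y = b1 * x with the constant set to zero; returns b1."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


area = [10, 50, 100, 200]          # hypothetical catchment areas [km^2]
flow = [0.02, 0.3, 0.7, 1.5]       # hypothetical low-flow index values

b0, b1 = fit_with_constant(area, flow)
b1_origin = fit_through_origin(area, flow)
print(b0 + b1 * 5)                 # estimate for a 5 km^2 catchment, with constant
print(b1_origin * 5)               # same catchment, regression through the origin
```

With several descriptors, some of which enter with negative coefficients, setting $b_0$ to zero reduces but does not rule out negative estimates.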

Model Calibration: Regional Transfer Function for BASE

A simple linear regression for BASE with only one independent variable, AREA, explains 81% of the variation in BASE. Adding the next three adequate independent variables (SOILVL, SLOPEMEAN, and ROOTSMEAN) raises the coefficient of determination to 0.87. After the fourth step, the procedure terminated because the requirements of significance were not met for any additional variable. Our preliminary model after the fourth step of the selection procedure is the model with the highest coefficient of determination that includes only statistically significant variables.

Table 2: Results of the regression analysis for BASE

| Independent variables | Corrected R² |
|---|---|
| AREA | 0.81 |
| AREA, SOILVL | 0.82 |
| AREA, SOILVL, SLOPEMEAN | 0.84 |
| AREA, SOILVL, SLOPEMEAN, ROOTSMEAN | 0.87 |

$$\text{BASE} = 7.3 \times 10^{-3}\,\text{AREA} - 0.416\,\text{SOILVL} + 3.9 \times 10^{-2}\,\text{SLOPEMEAN} - 2.5 \times 10^{-3}\,\text{ROOTSMEAN}$$

R² = 0.87, s.e. = 0.33

- AREA: Catchment area [km²]
- SOILVL: Percentage of soils with very low infiltration capacity [%]
- SLOPEMEAN: Mean slope [%]
- ROOTSMEAN: Mean water-holding capacity in the effective root zone [mm]

Model Calibration: Coefficient of Determination

The coefficient of determination (R²) is an indicator of the strength of the relationships represented in the regression model. With the regression model we seek to provide a better prediction of the observed values than the arithmetic mean. The total deviation of each value from the mean can be split into one portion that is explained by our regression line and another portion that remains unexplained, also called error. Figure 3.15 provides a simplified two-dimensional illustration of the components of the variation in Y.

The coefficient of determination is defined as the sum of squared explained deviations divided by the sum of squared total deviations:

$$R^2 = \frac{\text{sum of squared explained deviations}}{\text{sum of squared total (explained and unexplained) deviations}}$$

It can range from 0 to 1, with 1 being a perfect fit where all deviations from the mean can be explained through the regression line. For each set of independent variables, the statistics program computes the regression equation for which the sum of squared unexplained deviations reaches a minimum. Then R² is calculated.

Fig. 3.15: Components of the deviation from the mean of Y (explained and unexplained deviation); axes: X, Y
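The definition translates directly into code. The sketch below computes R² as the sum of squared explained deviations (predicted value minus mean) divided by the sum of squared total deviations (observed value minus mean). The four data points are made up; the predicted values are those of the least-squares line through them, for which this ratio equals one minus the unexplained share.

```python
def r_squared(observed, predicted):
    """Sum of squared explained deviations divided by sum of squared total deviations."""
    mean = sum(observed) / len(observed)
    explained = sum((p - mean) ** 2 for p in predicted)
    total = sum((o - mean) ** 2 for o in observed)
    return explained / total


observed = [1, 3, 2, 4]                 # made-up values at x = 1, 2, 3, 4
predicted = [1.3, 2.1, 2.9, 3.7]        # least-squares line y = 0.5 + 0.8 * x
print(r_squared(observed, predicted))
```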

Model Calibration: Standard Error

The standard error of estimate of Y is defined as follows (LEWIS-BECK 1986):

$$\text{s.e.} = \sqrt{\frac{\sum_{i} \left(Y_{i,\text{obs}} - Y_{i,\text{pred}}\right)^2}{n-2}}$$

where $Y_{i,\text{obs}}$ is the observed value of the dependent variable Y and $Y_{i,\text{pred}}$ is the predicted value of the dependent variable Y. The difference between $Y_{i,\text{obs}}$ and $Y_{i,\text{pred}}$ is also called the prediction error.

Besides the coefficient of determination, the standard error is the second characteristic number that describes the quality of the regression equation. It is an estimate of the standard deviation of the actual Y from the predicted Y and gives an idea of the average error that goes along with predicting Y on the basis of the given regression equation.
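A small sketch of the standard error of estimate follows, assuming the usual form: the square root of the summed squared prediction errors divided by n - 2, the denominator shown on the slide. The example values are made up and the function name is hypothetical.

```python
def standard_error(observed, predicted):
    """Square root of the sum of squared prediction errors divided by n - 2."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return (squared_errors / (n - 2)) ** 0.5


observed = [1, 3, 2, 4]
predicted = [1.3, 2.1, 2.9, 3.7]
print(standard_error(observed, predicted))
```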

Model Calibration: Regional Transfer Function for MAM(10)

The regression analysis for MAM(10) produces a pattern similar to that for BASE. For this index, again four variables were included before the requirements for further acceptance could no longer be met.

Table 3: Results of the regression analysis for MAM(10)

| Independent variables | Corrected R² |
|---|---|
| AREA | 0.84 |
| AREA, SOILVL | 0.87 |
| AREA, SOILVL, SLOPEMEAN | 0.88 |
| AREA, SOILVL, SLOPEMEAN, DD | 0.89 |

$$\text{MAM(10)} = 4.5 \times 10^{-3}\,\text{AREA} - 0.4\,\text{SOILVL} + 2.1 \times 10^{-2}\,\text{SLOPEMEAN} - 0.1\,\text{DD}$$

R² = 0.89, s.e. = 0.19

- AREA: Catchment area [km²]
- SOILVL: Percentage of soils with very low infiltration capacity [%]
- SLOPEMEAN: Mean slope [%]
- DD: Drainage density [km/km²]

Model Calibration: Regional Transfer Function for Q90

As with the previous low-flow indices, the regression analysis produced a model with four independent variables.

Table 4: Results of the regression analysis for Q90

| Independent variables | Corrected R² |
|---|---|
| AREA | 0.81 |
| AREA, SOILVL | 0.83 |
| AREA, SOILVL, SLOPEMEAN | 0.86 |
| AREA, SOILVL, SLOPEMEAN, ROOTSMEAN | 0.88 |

$$\text{Q90} = 4.9 \times 10^{-3}\,\text{AREA} - 0.5\,\text{SOILVL} + 2.5 \times 10^{-2}\,\text{SLOPEMEAN} - 1.5 \times 10^{-3}\,\text{ROOTSMEAN}$$

R² = 0.88, s.e. = 0.21

- AREA: Catchment area [km²]
- SOILVL: Percentage of soils with very low infiltration capacity [%]
- SLOPEMEAN: Mean slope [%]
- ROOTSMEAN: Mean water-holding capacity in the effective root zone [mm]
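Applying the three regional transfer functions to an ungauged catchment amounts to a weighted sum of its descriptors. In the sketch below the coefficients are those given on the slides for BASE, MAM(10) and Q90; the catchment descriptor values are invented for illustration, and the scale on which SOILVL enters the equations is assumed to be the one used in the tables.

```python
# Coefficients as given in the regional transfer functions (constant b0 set to zero).
TRANSFER_FUNCTIONS = {
    "BASE": {"AREA": 7.3e-3, "SOILVL": -0.416, "SLOPEMEAN": 3.9e-2, "ROOTSMEAN": -2.5e-3},
    "MAM(10)": {"AREA": 4.5e-3, "SOILVL": -0.4, "SLOPEMEAN": 2.1e-2, "DD": -0.1},
    "Q90": {"AREA": 4.9e-3, "SOILVL": -0.5, "SLOPEMEAN": 2.5e-2, "ROOTSMEAN": -1.5e-3},
}


def estimate(index, descriptors):
    """Evaluate the transfer function of one low-flow index for one catchment."""
    coefficients = TRANSFER_FUNCTIONS[index]
    return sum(c * descriptors[name] for name, c in coefficients.items())


# Hypothetical catchment; the values are made up and belong to no real station.
catchment = {"AREA": 250.0, "SOILVL": 1.0, "SLOPEMEAN": 10.0, "ROOTSMEAN": 150.0, "DD": 1.5}
for index in TRANSFER_FUNCTIONS:
    print(index, round(estimate(index, catchment), 3))
```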

AppendicesAimDataStudy AreaProcedureApplication Model Calibration Coefficient of Determination 3 Procedure X explained deviation unexplained deviation Y Sum of squared explained deviations R 2 = Sum of squared total (explained and unexplained) deviations Y more The coefficient of determination (R 2 ) is an indicator for the strength of the relationships represented in the regression model. With the regression model we seek to provide a better prediction for the scatter plot than the arithmetic mean. The total deviation of each value from the mean can be split into one portion, which is explained by our regression line, and the other portion, which remains unexplained, also called error. Figure 3.15 provides a simplified two- dimensional illustration of the components of the variation in Y. The coefficient of determination is defined as the sum of squared explained deviations divided by the sum of squared total deviations. It can range from 0 to 1; 1 being a perfect fit where all deviations from the mean can be explained through the regression line. For each set of independent variables, the statistic program computes the regression equation for which the sum of squared unexplained deviations reaches a minimum. Then, R 2 is calculated. Fig. 3.15 Components of deviation from Y more

AppendicesAimDataStudy AreaProcedureApplication The standard error of estimate of Y is defined as follows (LEWIS-BECK 1986): where Y i, obs is the observed value of the dependent variable Y and Y i, pred is the predicted value of the dependent variable Y. The difference between Y i, obs and Y i, pred is also called prediction error. Beside the coefficient of determination, the standard error is the second characteristic number that describes the quality of the regression equation. It is an estimate of the standard deviation of the actual Y from the predicted Y and gives an idea of the average error that goes along with predicting Y on the basis of the given regression equation. Model Calibration Standard Error 3 n-2 s.e. = Procedure back  (Y i, obs – Y i, pred ) 2

Model Calibration: Review of Q90

In a flow duration curve (FDC) the observed flow data are ranked in descending order. The curve displays the relationship between a discharge value and the percentage of time during which it is equalled or exceeded. The Q90 is the discharge that is equalled or exceeded 90% of the time, in this case 0.9 m³/s (Figure 3.7).

Fig. 3.7 Flow duration curve and deduction of the 90th percentile (Elsenz at Meckesheim, No. 460, 1966-90); axes: percentiles, streamflow [m³/s]
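The ranking procedure behind the flow duration curve can be sketched as follows. The function name `flow_exceeded` and the ten flow values are hypothetical, and a simple rank-based rule is assumed; other plotting-position formulas give slightly different results for short series.

```python
def flow_exceeded(flows, percent):
    """Return the flow that is equalled or exceeded in `percent` % of the record.

    Simple rank-based rule: sort in descending order and take the value whose
    rank covers at least `percent` % of all observations.
    """
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    rank = -(-percent * n // 100)  # integer ceiling of percent * n / 100
    return ranked[rank - 1]


# Hypothetical flow observations in m^3/s (not the Elsenz record).
flows = [2.1, 0.9, 1.5, 3.2, 1.1, 0.8, 4.0, 1.3, 2.7, 1.0]

q90 = flow_exceeded(flows, 90)
share = sum(1 for q in flows if q >= q90) / len(flows)
print(f"Q90 = {q90} m^3/s, equalled or exceeded in {share:.0%} of the record")
```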

Procedure Outline

Multiple Linear Regression Model: $Y_i = b_0 + \sum_j b_j X_{ij} + e_i$

Data Acquisition (Step 1)
• Catchment Selection
• Selection of Catchment Descriptors
• Deduction of Catchment Descriptors
• Selection of Low-Flow Indices
• Calculation of Low-Flow Indices
• Data Splitting into a Calibration Data Set (56 Stations) and a Validation Data Set (27 Stations), each with Catchment Descriptors (independent variables) and Low-Flow Indices (dependent variables)

Model Design (Step 2)
• Model Selection
• Assumptions and Requirements

Model Calibration (Step 3)
• Selection of Algorithms to depict the low-flow indices
• Computation of Regional Transfer Functions for BASE, MAM(10) and Q90, each of the form $b_0 + \sum_j b_j X_{ij} + e_i$

Model Evaluation (Step 4)
• Check for Sensibleness
• Model Requirements

Model Validation (Step 5)
• Check for agreement between observed and estimated values

Model Application
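The multiple linear regression model named in the outline, Y_i = b_0 + Σ b_j X_ij + e_i, is fitted by least squares. The sketch below solves the normal equations in plain Python to show the idea; it is not the statistics package used in the tour, the helper names `solve_linear_system` and `fit_linear_model` are hypothetical, and the five data rows are invented so that the true coefficients are known.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_linear_model(descriptors, y):
    """Least-squares estimates [b0, b1, ..., bk] from the normal equations."""
    rows = [[1.0] + list(r) for r in descriptors]  # leading 1 carries b0
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented data generated from y = 1 + 2*x1 - 3*x2 with no error term.
descriptors = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1.0, 3.0, -2.0, 0.0, 2.0]

coefficients = fit_linear_model(descriptors, y)
print([round(b, 6) for b in coefficients])
```

Dropping the leading 1.0 from each row would force the fitted plane through the origin, which removes the intercept b_0 from the model.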

Model Evaluation: Consideration of Sensibleness

After having established the model, it is of utmost importance to critically review and evaluate its quality and applicability. We will do so in this self-guided tour by first looking at the established model itself (which we call evaluation) and then performing a validation procedure on the basis of the validation data set, which has been set aside for this purpose. Both the evaluation and the validation procedures will be shown for the Q90 model. The same procedures would have to be applied to the BASE and the MAM(10) models to check their validity.

First, we can make a general statement about the model by looking at the coefficient of determination, R². With an R² of 0.88 the model accounts well for the variation of the low-flow parameter Q90.

$$Q90 = 4.9 \cdot 10^{-3}\,\mathrm{AREA} - 0.5\,\mathrm{SOILVL} + 2.5 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 1.5 \cdot 10^{-3}\,\mathrm{ROOTSMEAN}$$

R² = 0.88, s.e. = 0.21

Model Evaluation: Consideration of Sensibleness (continued)

A second check of the model is to evaluate its sensibleness. This means checking whether the effect of each independent variable on the target value agrees with our current hydrological knowledge and understanding.

AREA: The regression analysis has shown that catchment area has a positive effect on the Q90. This result agrees with our current hydrological understanding, since the amount of precipitation received and the storage volume are generally proportional to catchment size.

Model Evaluation: Consideration of Sensibleness (continued)

SOILVL: Where a high percentage of the soils has a very low infiltration capacity, only a small portion of the precipitation received can infiltrate and percolate through the soil to recharge groundwater, which predominantly feeds the base flow in our regions. With an increase in the percentage of soils with very low infiltration capacity, low flows will decrease. Therefore the negative effect of SOILVL on the Q90 seems sensible.

Model Evaluation: Consideration of Sensibleness (continued)

SLOPEMEAN: From a process-oriented angle, one would argue that steeper slopes are characterized by shallower soils and faster flows. As a result, such catchments would have a lower retention potential and lower base flows, which would suggest a negative effect of slope. In the model, however, the effect of SLOPEMEAN on the Q90 is positive. At this point, knowledge of our study area becomes important. It may be that in our study area slope is closely correlated with climatic characteristics. The dominant mountain range in the region is the Black Forest, which is steepest on its western edge. Westerly winds and steep slopes give rise to a significant amount of orographic precipitation in the western parts of the Black Forest. Therefore, it makes sense that slope is an indirect descriptor of precipitation. It is interesting to note, however, that the average annual precipitation AAR was not included in the model.

Model Evaluation: Consideration of Sensibleness (continued)

ROOTSMEAN: The mean water-holding capacity of the effective root zone can be regarded as a storage compartment from which evapotranspiration losses occur. Hence, the more water is stored in this compartment, the less water will be available as base flow. The negative effect of ROOTSMEAN on Q90 therefore reflects a known hydrological principle, and this statistical finding of the multiple regression analysis does not contradict our hydrological experience.

In conclusion, we can accept the model as being statistically and physically sound.

Model Evaluation: Check for Model Requirements

Another way of evaluating the model is to compare the observed Q90 values with the values predicted by the established model. Since both sets of values stem from the same data set, this comparison is not sufficient as a validation of the model. However, it can give us an idea of how well the observed data have been incorporated in the calibration process.

Fig. 3.16 Predicted vs. observed values of Q90 (axes: observed values of Q90, predicted values of Q90)

Model Evaluation: Check for Model Requirements (continued)

A first brief look at the plot of predicted vs. observed values (Fig. 3.16) indicates two things:
• First, even though the regression was forced to go through the origin, the predicted Q90 values for part of the data set are still negative. These results are, of course, nonsensical.
• Secondly, the linear trend deviates from the perfect fit (1:1 line). It seems that our model will tend to underestimate the Q90 for higher flows.

The same observations hold for the other two models and will have to be kept in mind during further analysis.

Model Evaluation: Check for Model Requirements (continued)

Next, we will evaluate the validity of the model by examining whether the previously established requirements of a valid multiple regression model also hold for our model:
1. The model is free of specification error
2. The data set is free of measurement error
3. Homoscedasticity: The variance of the error term is constant for all values of the independent variables
4. The error term is neither auto-correlated nor correlated with the independent variables
5. The error term follows a normal distribution
6. The model is free of multicollinearity

Model Evaluation: Check for Model Requirements (continued)

(1) The model is free of specification error
Statistical significance tests were already performed when the independent variables were selected. In addition, we considered the model with regard to current knowledge of hydrological processes and approved of the equation given by the regression analysis. We can be confident that our model does not contain a variable that should not be in it. It is, however, impossible to gather and process data on every conceivable catchment characteristic. Nevertheless, the pool of available descriptors included a variety of properties, and we trust that the model does not lack any variable that should have been included. Finally, we trust that an additive model gives us the best approximation of the natural relationships. A multiplicative model was also tested but gave less significant results.

(2) The data set is free of measurement error
In Step 1 we checked the data for inconsistencies and excluded those catchments for which the data did not comply with our requirements. However, we must be aware that even consistent data are subject to error, since our low-flow measurements could only be obtained with limited precision. Furthermore, the independent variables include error as well, all of which is, by the nature of regression analysis, attributed to the error associated with Y.

Model Evaluation: Check for Model Requirements (continued)

(3) Homoscedasticity: The variance of the error term is constant for all values of the independent variables
In Figure 3.17 the residuals are plotted against the values of Y. The points more or less cluster around the x-axis, except for a few outliers that all lie on one side: for the four largest Q90 values the model predicts significantly lower flows. This is due to the fact that our procedure tends to underestimate the Q90 at higher values. It should be noted, however, that this figure illustrates the absolute deviations. In relative terms, the deviation amounts to only 15 to 30% of the observed values.

Fig. 3.17 Residuals of Y as a function of values of Y (for the Q90 regression)

Model Evaluation: Check for Model Requirements (continued)

(3) Homoscedasticity: The variance of the error term is constant for all values of the independent variables (continued)
The catchments which produced the largest residuals in the calibration data set for Q90 are distributed across several regions of the study area. This pattern for Q90 is similar to those of the other two low-flow indices. However, all four of the respective catchments are situated in areas where karst phenomena are quite frequent. In such areas, it is often very difficult to determine the actual size of the catchment. AREA is the most significant independent variable in all three models, which makes them very sensitive to inaccuracies in this descriptor.

Fig. 3.18 Catchments that produced "outliers" (map with north arrow, scale bar 0 to 160 km)

Model Evaluation: Check for Model Requirements (continued)

(3) Homoscedasticity: The variance of the error term is constant for all values of the independent variables (continued)
To improve the model with regard to homoscedasticity we could eliminate the seemingly problematic data sets, or we could accept this weakness of the model and regard those four points as valid and valuable outliers. In this case we refrain from further reducing our data set and accept all remaining data. The outliers can be regarded as a reflection of the natural variance in our study area. Omitting those values would narrow the range of values used for the calibration of our model and would therefore limit its applicability.

Model Evaluation: Check for Model Requirements (continued)

(4) The error term is neither auto-correlated nor correlated with the independent variables
In the process of deducing low-flow indices from the discharge time series, those data sets that exhibited inconsistencies (which are an indicator that auto-correlation exists) were already excluded from further analysis. Time is not among the variables in our pool of independent variables. Since the other measurements are not based on a sequence over time that could be traced back, we concluded that auto-correlation is not a problem in this analysis.

Model Evaluation: Check for Model Requirements (continued)

(4) The error term is neither auto-correlated nor correlated with the independent variables (continued)
In our test of homoscedasticity we observed four outliers. The fact that three of these outliers correspond to catchments which are among the largest in the study area suggests that there may be a correlation between catchment size and the error term. In Figure 3.19, residuals are plotted against catchment area. We can see that the data points more or less crowd around the x-axis, forming a horizontal band. Catchment area therefore does not seem to be significantly correlated with the error term.

Fig. 3.19 Residuals as a function of catchment area [km²] (for the Q90 regression)
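The tour relies on a visual check of the residual plot. As a numerical complement (not a step taken in the tour), the Pearson correlation coefficient between a descriptor and the residuals can be computed as sketched below. The function name `pearson_r` and the four data pairs are hypothetical.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical catchment areas [km^2] and residuals [m^3/s].
area = [10.0, 20.0, 30.0, 40.0]
residuals = [0.1, -0.1, -0.1, 0.1]

r_area = pearson_r(area, residuals)
print(f"r(AREA, residuals) = {r_area:.3f}")
```

A coefficient close to zero corresponds to the horizontal band of points described for Figure 3.19; a few large catchments with one-sided residuals would pull it away from zero.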

Model Evaluation: Check for Model Requirements (continued)

(5) The error term follows a normal distribution
Two graphical procedures were employed for a visual check of normal distribution: the frequency distribution of residuals (Figure 3.20) and the probability plot (Figure 3.21).

We can infer that the distribution of the error term, though far from perfect, is reasonably symmetrical around zero, unimodal, and not significantly skewed. It comes acceptably close to a normal distribution, so that the above assumption is justified.

Fig. 3.20 Frequency distribution of residuals (Q90 regression); axes: residuals, frequency
Fig. 3.21 Probability plot of standardized residuals (Q90 regression); axes: observed cumulative probability, predicted cumulative probability

Model Evaluation: Check for Model Requirements (continued)

(6) The model is free of multicollinearity
When each of the four independent variables in our model is expressed as a linear combination of the other three, the following results are produced for the calibration data set (Table 5). The selected catchment descriptors exhibit different degrees of inter-dependence, with the coefficient of determination ranging from 0.04 to 0.29, which does not even come close to the previously set upper limit of 0.8. We will accept the variables as being practically independent of each other.

Table 5 Multicollinearity of catchment descriptors

| Variable combination | Corrected R² |
|---|---|
| AREA = f(SOILVL, SLOPEMEAN, ROOTSMEAN) | 0.04 |
| SOILVL = f(AREA, SLOPEMEAN, ROOTSMEAN) | 0.08 |
| SLOPEMEAN = f(AREA, SOILVL, ROOTSMEAN) | 0.29 |
| ROOTSMEAN = f(AREA, SOILVL, SLOPEMEAN) | 0.25 |
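The screening against the upper limit of 0.8 can be written down in a few lines. The values are those of Table 5; the names `TABLE_5`, `UPPER_LIMIT` and `adjusted_r2` are hypothetical. The helper assumes that "corrected R²" refers to the adjusted R², which the tour does not state explicitly.

```python
# Corrected R^2 of each descriptor regressed on the other three (Table 5).
TABLE_5 = {
    "AREA": 0.04,
    "SOILVL": 0.08,
    "SLOPEMEAN": 0.29,
    "ROOTSMEAN": 0.25,
}
UPPER_LIMIT = 0.8  # limit quoted in the tour


def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictors (assumed meaning of
    'corrected R^2'; this is an assumption, not confirmed by the tour)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Descriptors whose auxiliary regression reaches or exceeds the limit.
flagged = [name for name, r2 in TABLE_5.items() if r2 >= UPPER_LIMIT]
print("Descriptors above the limit:", flagged or "none")
```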

Model Evaluation: Regional Transfer Functions

After having undergone the same considerations and tests as the model for Q90, the regression equations for BASE and MAM(10) were also accepted as regional transfer functions for the estimation of low flows at the ungauged site.

This concludes the model evaluation (Step 4). However, the hardest test of all is still ahead: the validation of the model (Step 5) by means of the previously isolated validation data.


Model Validation: Observed vs. Predicted Q90

The second data set, which had so far been left out of consideration, is now used to validate our transfer function. For the validation data set the respective three low-flow indices are predicted on the basis of the catchment descriptors using the established regional transfer functions. Figure 3.23 is a visual comparison between the observed and the predicted Q90. Two effects can be noticed here:
1. The points deviate more or less from the perfect fit line, which means there is some individual degree of overestimation and underestimation, respectively.
2. The blue line exhibits a slightly steeper slope than the perfect fit line. This is probably due to the outlier in the top right-hand corner.

Fig. 3.23 Validation: predicted Q90 vs. observed Q90

Model Validation: Model Bias

Figure 3.24 shows the relative deviation of the predicted Q90 values from the observed ones, calculated as the model bias:

$$\text{bias} = \frac{Q90_{\text{predicted}} - Q90_{\text{observed}}}{Q90_{\text{observed}}} \cdot 100\%$$

Fig. 3.24 Validation: Relative deviation of the predicted from the observed Q90 (axes: observed Q90 [m³/s], deviation of the predicted from the observed Q90 [%])
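The bias formula translates directly into code. The function name `model_bias` and the two example flows are hypothetical.

```python
def model_bias(q_predicted, q_observed):
    """Relative deviation of the predicted from the observed value in percent."""
    return (q_predicted - q_observed) / q_observed * 100.0


# Hypothetical example: observed Q90 of 2.0 m^3/s, predicted 1.5 m^3/s.
bias = model_bias(1.5, 2.0)
print(f"bias = {bias:.1f} %")
```

A negative bias means the model underestimates the observed flow. Because the observed value is in the denominator, small observed flows can produce very large percentages even for modest absolute errors.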

Model Validation: Relative Deviation

It is apparent from Figure 3.24 that the model fails for a few values, giving relative deviations of several hundred percent. Based on the previous evaluation this is something that we should almost expect, given the limitations we have already discovered. Going back to the raw data confirms that the deviation is not due to descriptors lying significantly outside the calibration range. Rather, it is a reflection of the natural variability of hydrological processes in the catchment, which cannot be characterized completely with those independent variables. Another effect to be taken into consideration is the accumulation of inaccuracies, both in the data set and in the model.

Model Validation: Summary

For a more specific validation, we have now zoomed into the plot, looking only at those values which are within the 50% range (Figure 3.25). Table 6 gives the statistics of this plot. Close to half the values were predicted with a deviation of less than 50%.

Fig. 3.25 Validation: Relative deviation of the predicted from the observed Q90 (+/- 50% range shown); axes: observed Q90 [m³/s], deviation of the predicted from the observed Q90 [%]

Table 6 Summary of the validation results (% of data sets per class)

| Evaluation | Deviation in % | Q90 | MAM(10) | BASE |
|---|---|---|---|---|
| Very good | <10 | 15 | 11 | 7 |
| Good | 10-30 | 15 | 26 | 15 |
| Satisfactory | 30-50 | 15 | 11 | 11 |
| Unsatisfactory | >50 | 56 | 52 | 67 |
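A summary like Table 6 can be produced by sorting the absolute relative deviations into the four classes. In the sketch below the deviations are hypothetical, as are the names `classify` and `shares`. The treatment of values falling exactly on a class boundary (10, 30 or 50%) is an assumption, since the tour does not specify it.

```python
from collections import Counter

CLASSES = ("very good", "good", "satisfactory", "unsatisfactory")


def classify(deviation_pct):
    """Assign a Table 6 class to a relative deviation given in percent.

    Assumption: boundary values of exactly 30 or 50 fall into the lower class.
    """
    d = abs(deviation_pct)
    if d < 10:
        return "very good"
    if d <= 30:
        return "good"
    if d <= 50:
        return "satisfactory"
    return "unsatisfactory"


# Hypothetical relative deviations in percent (not the validation results).
deviations = [5, -12, 40, -75, 250, 8, 29, -55]

counts = Counter(classify(d) for d in deviations)
shares = {label: 100 * counts[label] / len(deviations) for label in CLASSES}
for label in CLASSES:
    print(f"{label:15s} {shares[label]:5.1f} % of data sets")
```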

Model Validation: Conclusion

This concludes the validation process (Step 5: check for agreement between observed and estimated values). In light of the limitations we discovered, we can say that the established regional transfer functions can produce good results but should be applied with caution. If the validation had failed, we would have had to conclude that our multiple regression approach is altogether inadequate. The next step is the model application.

Application: Original Problem Statement

We now come back to our original problem statement: we wanted to determine the Q90 for the Wiese catchment, the site of the proposed hydro-power plant, without relying on local flow data. By means of multiple regression analysis among other catchments in the region we were able to find a common regional pattern which describes the relationship between catchment descriptors and the Q90.

Fig. 1.2 Wiese catchment (No. 532)

Application: Original Problem Statement (continued)

Assuming that the same relationship is true for the Wiese catchment, we can use the previously established regional transfer function and estimate the desired Q90 value at our ungauged site based on the respective Wiese catchment descriptors. Before applying the model we made sure that the Wiese catchment descriptors were within the range of the catchment descriptors of the calibration data set. The following two slides bring our self-guided tour to a conclusion by showing the final calculation process.

Application: Available Data on the Wiese Catchment

Climate
• Annual precipitation: 1891 mm

Land Use
• Percentage of urbanisation: 2%
• Percentage of forested area: 63%

Soil
• Percentage of soils with high infiltration capacity: 0.19%
• Percentage of soils with medium infiltration capacity: 0%
• Percentage of soils with low infiltration capacity: 99.81%
• Percentage of soils with very low infiltration capacity: 0%
• Mean hydraulic conductivity of the soils: 201.69 cm/d
• Percentage of soils with low hydraulic conductivity: 0%
• Percentage of soils with high water-holding capacity in the effective root zone: 0%
• Mean water-holding capacity in the effective root zone: 109 mm

Hydrogeology
• Percentage of rock formations with a very low hydraulic permeability: 0%
• Weighted mean of hydraulic conductivity: 0.000962 m/s

Morphometry
• Catchment area: 206.28 km²
• Drainage density: 1.31 km/km²
• Highest elevation: 1485.5 m a.m.s.l.
• Lowest elevation: 423.6 m a.m.s.l.
• Average elevation: 898.74 m a.m.s.l.
• Maximum slope: 45.54%
• Minimal slope: 0%
• Average slope: 18.08%

Application: Estimation of the Q90

The regional transfer function for the Q90 is:

$$Q90 = 4.879 \cdot 10^{-3}\,\mathrm{AREA} - 0.457\,\mathrm{SOILVL} + 2.506 \cdot 10^{-2}\,\mathrm{SLOPEMEAN} - 1.540 \cdot 10^{-3}\,\mathrm{ROOTSMEAN}$$

The Wiese catchment descriptors entering the equation are the catchment area (AREA = 206.28 km²), the percentage of soils with very low infiltration capacity (SOILVL = 0%), the average slope (SLOPEMEAN = 18.08%) and the mean water-holding capacity in the effective root zone (ROOTSMEAN = 109 mm).

Q90 = 1.29 m³/s
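The final calculation can be retraced with the coefficients and Wiese descriptors given in the tour. Only the variable names (`COEFFICIENTS`, `WIESE`, `q90_wiese`) are introduced here for illustration.

```python
# Coefficients of the regional transfer function for Q90 (from the tour).
COEFFICIENTS = {
    "AREA": 4.879e-3,
    "SOILVL": -0.457,
    "SLOPEMEAN": 2.506e-2,
    "ROOTSMEAN": -1.540e-3,
}

# Wiese catchment descriptors (from the tour).
WIESE = {
    "AREA": 206.28,      # catchment area [km^2]
    "SOILVL": 0.0,       # soils with very low infiltration capacity [%]
    "SLOPEMEAN": 18.08,  # average slope [%]
    "ROOTSMEAN": 109.0,  # mean water-holding capacity, effective root zone [mm]
}

# Contribution of each descriptor, then the sum (the model has no intercept).
terms = {name: COEFFICIENTS[name] * WIESE[name] for name in COEFFICIENTS}
q90_wiese = sum(terms.values())

for name, value in terms.items():
    print(f"{name:10s} {value:+.4f}")
print(f"Q90 = {q90_wiese:.2f} m^3/s")
```

The catchment area contributes about 1.01 m³/s and the slope about 0.45 m³/s, while the root-zone storage term subtracts about 0.17 m³/s, which reproduces the 1.29 m³/s stated above.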

Application: Conclusion

In this self-guided tour you have learned in which hydrological context you may use a multiple linear regression procedure to estimate low-flow indices at the ungauged site. The learning-by-screening method (step by step) gave you the opportunity not only to learn at your own pace but also to simultaneously apply the method to your own data. You have seen an example of how to develop a conceptual model, translate a conceptual model into a statistical model, and calibrate, evaluate, and validate the model.

You should note that the self-guided tour is a practical introduction to the design and application of multiple regression models. A detailed discussion of the theoretical background can be found in the appropriate literature. For your own exercise or review we have included the data sets used in this self-guided tour, both low-flow indices and catchment descriptors.

We welcome your comments and suggestions, which should be submitted to the following address:
Prof. Dr. Siegfried Demuth
IHP/OHP-Sekretariat
International Hydrological and Operational Programme of UNESCO and WMO
Mainzer Tor 1
59068 Koblenz, Germany
Demuth@bafg.de

Data: Overview

These EXCEL spreadsheets contain the data used for calibration and validation respectively, as well as summary statistics and the plots shown in this self-guided tour:
• Spreadsheet 1: Calibration
• Spreadsheet 2: Validation Data

The documents below are in SPSS format. The Calibration Data sheet can be used directly for regression analysis of the data, while the Results document is an output summary:
• Regression: Calibration Data
• Regression: Results

The original flow data can be found on the CD-ROM under Data/Regional Data Set.

Appendices: Overview

The appendices cover the following categories:
• Catchment Descriptors: acronyms, means of deduction, units
• Data Sources: data pools, projects, and organisations
• References: background and previous research
• Acknowledgements: thanks to the numerous contributors
• Contact Information: we appreciate your feedback

Appendices - Catchment Descriptors: Overview

The catchment descriptors are grouped into the following categories: Morphology and Morphometry, Climate, Hydrogeology, Soil, and Land Use.

Fig. 3.2 Catchment Descriptors (PLATE 1992)

Appendices - Catchment Descriptors: Morphology and Morphometry

AREA - Catchment area [km²]
The catchment area is defined as the "area having a common outlet for its surface runoff" (IHP/OHP 1998). The descriptor was deduced from a 1:50,000 scale map of catchment boundaries provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool.

DD - Drainage density [km/km²]
Drainage density is the "total channel-segment length, accumulated for all [stream] orders within a drainage area, divided by the area" (IHP/OHP 1998). For the deduction procedure, 1:50,000 scale maps of catchment boundaries and drainage network (WaBoA and RIPS-Pool) were combined.

HMIN - Lowest elevation [m a.m.s.l.]
HMAX - Highest elevation [m a.m.s.l.]
HMEAN - Average elevation [m a.m.s.l.]
The elevation data are based on a digital elevation model (50 m by 50 m cells), provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool.

SLOPEMIN - Minimal slope [%]
SLOPEMAX - Maximum slope [%]
SLOPEMEAN - Mean slope [%]
Minimum, maximum and mean slopes were deduced using a digital elevation model.

Appendices - Catchment Descriptors: Land Use and Hydrogeology

URBAN - Percentage of urbanisation [%]
FOREST - Percentage of forest [%]
Remote sensing was used to derive land use for the area (Landsat TM, 30 x 30 m grid, 1993). Land use was classified into 16 classes, which were aggregated to four groups: forest, farmland, grassland and settlements/urban areas. Only the relative proportions of forest and urban areas were included in this self-guided tour. URBAN is an aggregation of settlement areas and areas with large-scale surface sealing due to industry. The latter covers 0.8% of the study area. Settlements comprise loose (1.9%) and dense (4.6%) settlements. FOREST is a combination of deciduous (7.8%) and coniferous (21.4%) forest and other forested areas (10.0%).

GEOHCMEAN - Weighted mean of hydraulic conductivity [m/s]
GEOVLHP - Percentage of rock formations with a very low hydraulic permeability [%]
From a 1:350,000 scale map produced by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB), 98 geological classes were reduced to 54 hydro-geological classes and aggregated to eight groups. Each group was associated with a mean hydraulic conductivity of the upper hydro-geological unit. From these values, a weighted mean was produced for each catchment. From the same data, the proportion of rock formations with a mean hydraulic conductivity of less than 10^-5 m/s was derived.

Appendices - Catchment Descriptors: Soil (1)

The classification of the soil water regime was based on a study by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB). They produced a 1:350,000 scale map of 29 soil water regime classes based on soil type, humus content, packing, slope, and geology. These classes were aggregated to four groups of soil types based predominantly on their infiltration capacity, which is defined as the "maximum rate at which water can be absorbed by a given soil per unit area under given conditions" (IHP/OHP 1998).

SOILH - Percentage of soils with high infiltration capacity [%]
These soils exhibit a high infiltration capacity even under conditions of high antecedent soil water content, such as sand and gravel soils.

SOILM - Percentage of soils with medium infiltration capacity [%]
Examples of soils which feature a medium infiltration capacity are loamy soils and loess of medium depth.

SOILL - Percentage of soils with low infiltration capacity [%]
The low infiltration capacity of these soils is due to their fine texture and/or the impermeability of one or more layers, as found in shallow sandy and loamy soils.

SOILVL - Percentage of soils with very low infiltration capacity [%]
The infiltration capacity in these soils is very low because they are shallow, are composed of hardly permeable material (such as clay), or have a high groundwater level.

Appendices - Catchment Descriptors: Soil (2)

SOILHCMEAN - Mean hydraulic conductivity of the soils [cm/d]
SOILLHC - Percentage of soils with low hydraulic conductivity [%]
Hydraulic conductivity is a "property of a saturated porous medium which determines the relationship, called Darcy's law, between the specific discharge and the hydraulic gradient causing it" (IHP/OHP 1998). From a 1:200,000 scale map with 9 classes, areal means were deduced. The lowest two classes (with a mean hydraulic conductivity of less than 2.3*10^-6 m/s) were combined for the calculation of the percentage of soils with low hydraulic conductivity.

ROOTSMEAN - Mean water-holding capacity in the effective root zone [mm]
ROOTSHIGH - Percentage of soils with high water-holding capacity in the effective root zone [%]
The data for these descriptors are based on a map produced by the Regional Authority for Geology, Commodities, and Mining of Baden-Württemberg (LGRB), which shows the distribution of water-holding capacity for a theoretical soil depth of 100 cm. Water-holding capacity is defined as "water in the soil available to plants. It is normally taken as the water in the soil between wilting point and field capacity. In this context water-holding capacity is used and is identical to the available water" (IHP/OHP 1998). Based on information on soil type, land use, root depth, and water-logging conditions, the water-holding capacity values were adjusted to the estimated effective root zone. These values were then used to compute the areal mean. A threshold mean water-holding capacity was set at 200 mm. Above this threshold, all classes were aggregated to "soils with high water-storage capacity in the effective root zone" and their proportion was calculated.

Appendices - Catchment Descriptors: Climate
AAR – Average annual precipitation [mm]
The data for the average annual precipitation were derived from a digital map provided by the Water and Soil Atlas of the State of Baden-Württemberg (WaBoA) and the RIPS-Pool. The map shows average annual precipitation for the period 1961-1990 on a 500 m grid. For this map, average annual precipitation had been calculated from the relationship between precipitation depth and altitude, combined with distance-weighting from the points of measurement. The raw data for the production of this map were provided by the German Weather Service (DWD).
Fig. 2.2 Mean annual precipitation (1961-90) [mm]
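As a rough sketch, a catchment value of AAR could be obtained from such a gridded map by averaging the cells that fall inside the catchment. The tour does not describe this step, so both the procedure and the numbers below are assumptions for illustration. Equal cell areas are assumed, as on a regular 500 m grid.

```python
def catchment_aar(precip_grid, inside_mask):
    """Mean precipitation [mm] over grid cells flagged as inside the catchment.

    Assumes all cells have equal area (regular grid), so a plain
    arithmetic mean is equivalent to an area-weighted mean.
    """
    values = [
        p
        for precip_row, mask_row in zip(precip_grid, inside_mask)
        for p, inside in zip(precip_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("no grid cells inside the catchment")
    return sum(values) / len(values)


# Made-up 2 x 3 grid of annual precipitation [mm] and a catchment mask.
grid = [[900.0, 1000.0, 1100.0],
        [1200.0, 1300.0, 1400.0]]
mask = [[False, True, True],
        [True, True, False]]

aar = catchment_aar(grid, mask)
```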

Appendices: Data Sources
The data used in the self-guided tour were provided by the following data pools, projects, and organisations:
Data Pools
• RIPS-Pool – Räumliches Informations- und Planungssystem (Spatial Information and Planning System, State of Baden-Württemberg)
• EWA – European Water Archive of the Northern European FRIEND project (Flow Regimes from International and Experimental Data)
Projects
• WaBoA – Wasser und Boden Atlas von Baden-Württemberg (Water and Soil Atlas of the State of Baden-Württemberg)
• KLIWA – Projekt Klimaänderung und Konsequenzen für die Wasserwirtschaft (Climatic Change and Impact on Water Resources Management)
Organisations
• LfU – Landesanstalt für Umweltschutz (Environmental Agency, Regional Office, State of Baden-Württemberg)
• LGRB – Landesanstalt für Geologie, Rohstoffe und Bergbau Baden-Württemberg (Regional Office for Geology, Commodities, and Mining, State of Baden-Württemberg)

Appendices: References (1)
BACKHAUS, K., ERICHSON, B., PLINKE, W. & WEIBER, R. (1996): Multivariate Analysemethoden. Eine anwendungsorientierte Einführung. Springer, Berlin, Heidelberg, New York.
BORCHERDT, C. (1985): Baden-Württemberg. Eine geographische Landeskunde. Wissenschaftliche Länderkunde, Bd. 12. Stuttgart.
BECKER, A. (1992): Methodische Aspekte der Regionalisierung. In: Regionalisierung in der Hydrologie, ed. by H.-B. Kleeberg, DFG-Mitt. XI, VCH-Verl. Ges., Weinheim, pp. 16-33 (in German).
DEMUTH, S. (1993): Untersuchungen zum Niedrigwasser in West-Europa. Freiburger Schriften zur Hydrologie, Band 1. Freiburg, Germany.
IHP/OHP (1998): WMO Technical Regulations, Volume III - Hydrology.
GLOS, E. & LAUTERBACH, D. (1972): Regionale Verallgemeinerung von Niedrigwasserdurchflüssen mit Wahrscheinlichkeitsaussage. Mitteilungen des Institutes für Wasserwirtschaft, H. 37. VEB Verlag für Bauwesen, Berlin.
HAAS, M. (2000): Regionalisierung des Quotienten Basisabfluss/Gesamtabfluss (Q_bas/Q_ges) für Einzugsgebiete Baden-Württemberg. Freiburg, Germany.
HOLDER, R. L. (1985): Multiple Regression in Hydrology. Institute of Hydrology, Wallingford, Great Britain.
HUTTENLOCHER, F. (1972): Baden-Württemberg. Kleine geographische Landeskunde. Schriftenreihe der Kommission für geschichtliche Landeskunde, H. 2. Karlsruhe.

Appendices: References (2)
KILLE, K. (1970): Das Verfahren MoMNQ, ein Beitrag zur Berechnung der mittleren langjährigen Grundwasserneubildung mit Hilfe der monatlichen Niedrigwasserabflüsse. Zeitschrift der deutschen Geologischen Gesellschaft, Sonderheft Hydrogeologie Hydrogeochemie, 89-95.
LEWIS-BECK, M. S. (1986): Applied Regression – An Introduction. Series: Quantitative Applications in the Social Sciences. Sage University Paper 22.
MOHR, B. (1992): Die natürliche Raumausstattung. Südbaden. Schriften zur politischen Landeskunde Baden-Württembergs, 25-35.
MORGENSCHWEIS, G. (1990): Zur Ungenauigkeit von Durchflussmessungen mit hydrometrischen Flügeln. DGM 34, H. 1/2, 16-21.
PLATE, E. J. (1992): Regionalisierung in der Hydrologie. Deutsche Forschungsgemeinschaft. Mitteilung XI der Senatskommission für Wasserforschung. Hrsg. KLEEBERG, H.-B. Cambridge, NY.
SCHREIBER, P. (1996): Regionalisierung des Niedrigwassers mit statistischen Verfahren. Freiburger Schriften zur Hydrologie, Band 4. Freiburg, Germany.
VILLINGER, E. (1982): Hydrogeologische Aspekte zur geothermischen Anomalie im Gebiet Urach-Boll am Nordrand der Schwäbischen Alb (Südwestdeutschland). Geologisches Jahrbuch, H. 32, 3-42.
WUNDT, W. (1953): Gewässerkunde. Berlin, Göttingen, Heidelberg.

Appendices: Acknowledgements
First of all, I would like to thank Falk Scissek, my co-author, for transferring the ideas to a PowerPoint presentation and for his continuous engagement during the process of writing. The work and co-operation of my students Christian Birkel, Uli Nädelin, Anne Thormählen, and others from the Institute of Hydrology, University of Freiburg, are gratefully acknowledged. Thanks are also due to the following individuals for their support in producing the self-guided tour:
Volker Abraham, for providing the cartographic skills, the maps, and the layout of the front page of the self-guided tour.
Kerstin Stahl, for calculating the flow regimes.
Helmut Straub, Environmental Agency, Regional Office, State of Baden-Württemberg, Germany, for providing the flow data of Baden-Württemberg and the permission to use the data on this CD-ROM.

Appendices: Contact Information
Prof. Dr. Siegfried Demuth
formerly: Institute of Hydrology, University of Freiburg, Fahnenbergplatz, 79098 Freiburg, Germany
currently: IHP/OHP-Sekretariat, International Hydrological and Operational Programme of UNESCO and WMO, Mainzer Tor 1, 59068 Koblenz, Germany
Demuth@bafg.de
