Differences

This shows you the differences between two versions of the page.

--- start:hype_tutorials:automatic_calibration [2018/06/05 09:30]
cpers [Steepest descent method (task SD)]
+++ start:hype_tutorials:automatic_calibration [2024/01/25 11:37] (current)
@@ Line 7: / Line 7: @@
 There are in total **//nine methods of optimization//** to choose from in HYPE. The sampling methods are a basic Monte-Carlo simulation with random parameter values chosen within a user-specified parameter interval, and two progressive Monte-Carlo simulations where the Monte-Carlo simulations are made in stages with a reduced parameter space in between the stages. In addition it is possible to run an organized sampling of two parameters. The Differential Evolution Markov Chain method combines a genetic optimization algorithm with random sampling. The directional methods are the Brent method, two versions of quasi-Newton methods with different ways to calculate the gradient, and the method of steepest descent.
-Given enough sampling points, even the simple **//sampling method//** can give a rough estimate of the optimum. An advantage of the sampling methods is that the number of function evaluations, and thus the computation time, is determined by the user. The sampling methods are useful to provide a starting point for the directional optimization methods.
+Given enough sampling points, the simple **//sampling method//** can give an estimate of the optimum. An advantage of the sampling methods is that the number of function evaluations, and thus the computation time, is determined by the user. The sampling methods are useful to provide a starting point for the directional optimization methods.
 The **//Differential Evolution Markov Chain//** (DEMC) provides an uncertainty estimate of the optimum. The genetic algorithm (i.e. DE) works by proposing new members (parameter values) and then accepting or rejecting them. In addition to the random element of the creation of a proposal (by inheriting traits from other members and keeping some traits unchanged), in the DEMC method a random number is added to the proposed parameters and the proposal may be accepted by a certain probability even if the objective criterion is worse than for the replaced member. The advantage of DEMC versus plain DE is both the possibility to get a probability based uncertainty estimate of the global optimum and a better convergence towards it.
-The **//directional methods//** progress iteratively from one set of model parameters to a new set that have a better objective criterion. This is achieved by determining a direction of improvement, and then the optimal step length in that direction. The determination of the direction is what separates the different optimization methods. It is given by one parameter and the direction between the last two best parameter sets (for Brent method), or by a function of the gradient of the objective function. The methods using the gradient are more powerful, but require more evaluations. The directional methods depend on a starting point for their iterations. This choice of the starting point is important for the performance of the methods. It influences the calculation time and possibly which (local) optimum that is reached.
+The **//directional methods//** progress iteratively from one set of model parameters to a new set that has a better objective criterion. This is achieved by determining a direction of improvement, and then the optimal step length in that direction. The directional methods assume that a minimum exists within the parameter space. The determination of the direction is what separates the different optimization methods. It is given by one parameter and the direction between the last two best parameter sets (for Brent method), or by a function of the gradient of the objective function. The methods using the gradient are more powerful, but require more evaluations. The directional methods depend on a starting point for their iterations. This choice of the starting point is important for the performance of the methods. It influences the calculation time and possibly which (local) optimum is reached.
 The automatic calibration algorithm is controlled by means of two or three **//files//**: [[start:HYPE_file_reference:info.txt|info.txt]] and [[start:HYPE_file_reference:optpar.txt|optpar.txt]], and for some methods [[start:HYPE_file_reference: qnstartpar.txt|qNstartpar.txt]]. The following sections present and discuss the entries and numerical parameters of those two files, necessary and/or optional to use the automatic calibration.
@@ Line 23: / Line 23: @@
 Generally speaking, the purpose of the [[start:HYPE_file_reference:info.txt|info.txt]] is to govern the simulation. Most of the content of the file is the same as for an ordinary simulation. The following file content is relevant for automatic calibration:
   * The flag ''Y'' must be passed to the model by the code ''calibration'' to turn on automatic calibration (red arrow in Fig 1).
-  * An objective function must be specified by means of the performance criteria it is composed of. Such a composite criterion, are constructed by linear combination of the already implemented, performance criteria, like: <m> c=w_1*c_1+ w_2*c_2+⋯+ w_N*c_N </m>
+  * An objective function must be specified by means of the performance criteria it is composed of. Such a composite criterion is constructed as a linear combination of the already implemented performance criteria, like: <m> c=w_1*c_1+ w_2*c_2+ ... + w_N*c_N </m>
-where <m> c_1,c_2,…,c_N </m> are predefined performance criteria, and <m> w_1,w_2,…,w_N </m> are relative weighting factors. The available performance criteria and their id are [[start:hype_file_reference:info.txt:criteria|listed here]]. The criterion id, the [[start:hype_file_reference:info.txt:variables|HYPE variable ID]] of the computed and recorded variables to compare, as well as the period over which the variables are averaged before calculating the criterion, is specified for each performance criterion to be included in the objective function (see the block of data marked in red and green in Fig 1). For details on format see the description of [[start:hype_file_reference:info.txt#performance_criteria_options|the info-file]].
+where <m> c_1,c_2, ... ,c_N </m> are predefined performance criteria, and <m> w_1,w_2, ... ,w_N </m> are relative weighting factors. The available performance criteria and their id are [[start:hype_file_reference:info.txt:criteria|listed here]]. The criterion id, the [[start:hype_file_reference:info.txt:variables|HYPE variable ID]] of the computed and recorded variables to compare, as well as the period over which the variables are averaged before calculating the criterion, are specified for each performance criterion to be included in the objective function (see the block of data marked in red and green in Fig 1). For details on format see the description of [[start:hype_file_reference:info.txt#performance_criteria_options|the info-file]].
 In the example of Fig 1 the Nash-Sutcliffe efficiency (''MR2'') and relative error (''MRE'') are calculated for daily discharge (''cout'' and ''rout'' are compared on ''meanperiod 1''). The two criteria are weighted together. Most weight is put on MR2 and a little on the volume error. A small weight on relative error is usually enough to minimize the volume error but still get a good NSE. In the example all observations found in Qobs.txt are used to calculate the objective function. If more than one station is found, the MR2 criterion will use the average of each station’s NSE.
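The weighted combination described in this hunk can be illustrated with a minimal Python sketch. It is not HYPE code: the function names and the weights 1.0 and 0.1 are hypothetical, and the sketch only evaluates the linear combination <m> c=w_1*c_1+ ... + w_N*c_N </m>. How HYPE treats the sign of each criterion internally is not covered here.

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency for one station."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((s - o) ** 2 for o, s in zip(obs, sim))
    variance_sum = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / variance_sum


def relative_error(obs, sim):
    """Relative volume error: (sum of simulated - sum of observed) / sum of observed."""
    return (sum(sim) - sum(obs)) / sum(obs)


def mean_station_nse(stations):
    """Average the NSE over several stations, given as (obs, sim) pairs."""
    values = [nash_sutcliffe(obs, sim) for obs, sim in stations]
    return sum(values) / len(values)


def composite_criterion(criteria, weights):
    """Linear combination c = w_1*c_1 + ... + w_N*c_N."""
    return sum(w * c for w, c in zip(weights, criteria))


# Two hypothetical stations with daily discharge series.
stations = [
    ([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]),
    ([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]),
]
nse = mean_station_nse(stations)
re = relative_error(*stations[1])
# Illustrative weights only: most weight on NSE, a little on the volume error.
c = composite_criterion([nse, re], [1.0, 0.1])
```

The station averaging mirrors the statement above that the MR2 criterion uses the average of each station's NSE when several stations are found.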
@@ Line 145: / Line 145: @@
   * The derivation of a new parameter set to be tested (the mutation) is also governed by the code ''DEMC_sigma''. "Sigma" and the parameter precision determines how much a random perturbation will influence the new parameter set. "Sigma" is the base of the perturbation. The value is the standard deviation of the sample error. Set to 0 if you don’t want to use it. Default is 0.1. The sigma value is multiplied with 3rd-row value for each parameter (the precision) to determine the size of the random perturbation.
   * The probability to use the new parameter set is governed by the code ''DEMC_crossover'', which is the probability to not use the mutated parameter values. Use 1 to always test the mutation (default), or < 1 to cross over some parameter values from parent generation (recommended if you have large number of parameters). In the example of [[start:hype_file_reference:optpar.txt|optpar.txt]] file shown in Fig 7, the 9th line indicates that the user wants 40% probability to keep the previous/parent parameter set and not to use the mutation.
-  * As default, only new parameter sets with better optimization criterion is accepted for the next generation. The code ''DEMC_accprob'' is used to switch on the possibility to accept also less good parameter sets. It defines a reduction in the probability to accept a proposed parameter set with a worse performance. Set to 1 then you maximize the probability to accept a proposal, set to 0 to only accept proposals with better performance than the parent generation (default). In the example of [[start:hype_file_reference:optpar.txt|optpar.txt]] file shown in Fig 7, the 10th line indicates that the user wants new generations to be better.
+  * By default, only new parameter sets with a better optimization criterion are accepted for the next generation. The code ''DEMC_accprob'' is used to switch on the possibility to also accept less good parameter sets. If used, a proposed parameter set is accepted with increasing probability the better its performance is compared to the best parameter set so far. Set to 0 to only accept proposals with better performance than the parent generation (default), set to 1 to turn on the probability to accept worse proposals. In the example of [[start:hype_file_reference:optpar.txt|optpar.txt]] file shown in Fig 7, the 10th line indicates that the user wants new generations to be better.
 The specification of calibration parameters starts on line 22 in [[start:hype_file_reference:optpar.txt|optpar.txt]]. Listing of the model parameters subject to optimization is achieved as described [[start:hype_tutorials:automatic_calibration#specification_of_calibration_parameters_-_optpartxt|above]].  The model parameters are listed in no particular order, but with three rows for each parameter. The precisions specified in the parameter listing part are used by the DEMC method to scale the random perturbation of the generation of a new proposed parameter set.
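As a rough illustration of how ''DEMC_sigma'' and the per-parameter precision scale the random perturbation, consider the following Python sketch. It is an assumption-laden simplification, not the HYPE implementation: the function name ''perturb'' is hypothetical, the perturbation is assumed to be drawn from a normal distribution with standard deviation sigma times precision, and the mutation, crossover and acceptance steps are left out.

```python
import random


def perturb(proposal, precisions, sigma=0.1, rng=random):
    """Add a random perturbation to each proposed parameter value.

    Assumption: the perturbation is normally distributed with standard
    deviation sigma * precision, where precision is the 3rd-row value
    listed for the parameter in optpar.txt.
    """
    return [
        value + rng.gauss(0.0, sigma * precision)
        for value, precision in zip(proposal, precisions)
    ]


# Hypothetical proposal for two parameters with different precisions.
proposal = [1.5, 0.02]
precisions = [0.01, 0.0001]

perturbed = perturb(proposal, precisions)                 # default sigma 0.1
unchanged = perturb(proposal, precisions, sigma=0.0)      # sigma 0 disables it
```

With sigma set to 0 the proposal is returned unchanged, which matches the instruction above to set the value to 0 if the perturbation is not wanted.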
@@ Line 175: / Line 175: @@
 |Figure 10: Example of optpar.txt file for the quasi-Newton method|
-The quasi-Newton methods optimise all parameters at the same time. The parameter set is optimized with the line search routine starting from the point of the current best parameters. The direction of the search is determined by the gradient of the criteria surface at this point. The gradient can be estimated in three different ways in HYPE, the two quasi-Newton methods described in this section and the one called steepest descent in the next section. The optimization continues until one of several interruption criteria is fulfilled.
+The quasi-Newton methods optimise all parameters at the same time. The direction of the search is determined by the gradient of the criteria surface at the point of the current best parameters. The parameter set is optimized with the line search routine along the line determined by the gradient. The gradient can be estimated in three different ways in HYPE, the two quasi-Newton methods described in this section and the one called steepest descent in the next section. The optimization continues until one of several interruption criteria is fulfilled.
 Calculating the gradient for the quasi-Newton method involves updating the inverse Hessian matrix. This can be done by two methods, both described in Nocedal and Wright (2006). Task Q1 uses the DFP (Davidon-Fletcher-Powell) method and task Q2 uses the BFGS (Broyden-Fletcher-Goldfarb-Shanno) method.
@@ Line 213: / Line 213: @@
   * Similarly, a maximum amount of computation hours for the method can be specified by the code ''num_maxTim''. In the example given in Fig 12, the user wants the method to stop after 2 hours; see 7th line of the file.
   * If the objective function does not change more than a given precision during a given amount of consecutive iterations, the method will exit. The amount of iterations is passed by the code ''num_criItr'', while the criterion precision is passed by the code ''num_criTol''. In the example of Fig 12, the user requires method exit if the criterion has not changed by more than 0.0001 over the last 2 consecutive iterations.
-  * If none of model parameter values that are optimized exceed the prescribed determination precision (3rd line in the model parameter listing for each parameter) for a certain amount of iterations, the method exits. This number of iteration is passed to the method by means of the flag ''num_parItr''. In the example of Fig 12, the user requires the method to stop if no model parameter values change by more than the specified precision for the last 8 consecutive iterations.
+  * If no model parameter values change, according to the prescribed precision (3rd line in the model parameter listing for each parameter), for a certain number of iterations, the method exits. This number of iterations is passed to the method by means of the code ''num_parItr''. In the example of Fig 12 line 10, the user requires the method to stop if no model parameter values change by more than the specified precision for the last 8 consecutive iterations.
-  * A tolerance for the gradient to be considered zero, i.e. an optimum reached, can be specified by the flag ''QN_nrmTol''. In the example, the user defines it to 0.00001; see 11th line of the file in Fig 12.
+  * A tolerance for the gradient to be considered zero, i.e. an optimum reached, can be specified by the code ''QN_nrmTol''. In the example, the user defines it to 0.00001; see 11th line of the file in Fig 12.
-  * When the gradient is estimated for a parameter set, simulations for a few points close to the parameter set are made and used to calculate the numerical derivative. These points are offset from the midpoint parameter values based on each parameter calibration interval, and a factor (default is 0.02, i.e. 2% offset). The offset for calculation of numerical derivative can be specified by the flag ''QN_pctDerv''. In the example, the default is used.
+  * When the gradient is estimated for a parameter set, simulations for a few points close to the parameter set are made and used to calculate the numerical derivative. These points are offset from the midpoint parameter values based on each parameter calibration interval, and a factor (default is 0.02, i.e. 2% offset). The offset for calculation of numerical derivative can be specified by the code ''QN_pctDerv''. In the example, the default is used.
-  * The number of points used to calculate the numerical derivative is called the numerical derivative stencil type (2, 4, 6 and 8 are allowed). The stencil type determines the order of accuracy of the numerical derivative, e.g. 2-stencil is 1st order accurate, 4-stencil is 2nd order accurate, etc. The stencil type can be specified by the flag ''QN_stencil''. In the example the default value 2 is used.
+  * The number of points used to calculate the numerical derivative is called the numerical derivative stencil type (2, 4, 6 and 8 are allowed). The stencil type determines the order of accuracy of the numerical derivative, e.g. 2-stencil is 1st order accurate, 4-stencil is 2nd order accurate, etc. The stencil type can be specified by the code ''QN_stencil''. In the example the default value 2 is used.
-  * The line search in the optimal direction may be limited within the given parameter intervals by a factor multiplied by the parameter interval. Lambda is the name of the current direction and step length to be searched (according to the QN algorithm), hence the flag’s name is ''QN_lambMax''. If the step length and factor are both one, the whole suggested parameter calibration interval is searched, and the offset points used to calculate the numerical derivative may end up outside the chosen parameter interval. To avoid this, a factor less than one may be set. In the example, the user increases it to 1 to use the fill interval; see 12th line of the file in Fig 12.
+  * The line search in the optimal direction may be limited within the given parameter intervals by a factor multiplied by the interval. Lambda is the name of the current direction and step length to be searched (according to the QN algorithm). If the step length and factor are both one, the whole suggested parameter interval is searched, and the offset points used to calculate the numerical derivative may end up outside the chosen parameter calibration interval. Therefore a factor less than one may be set. The factor can be specified by the code ''QN_lambMax''. In the example, the user increases it to 1 (from the default value 0.9) to use the suggested interval; see 12th line of the file in Fig 12.
-  * The actual maximum step length that the line search are allowed to take in the optimal direction, i.e. maximum value of lambda, is determined by the parameter limits, the direction vector and its length (i.e. the maximum step length according to the QN algorithm), and an in [[start:hype_file_reference:optpar.txt|optpar.txt]] set factor for increasing the step length. The default value of the factor is 1.618, consistent with golden ratio line search algorithm. The iteration process may thus be accelerated. The factor for increasing the step length can be specified by the flag ''QN_lambAcc''. In the example, the user defines it to 1.0; see 13th line of the file in Fig 12.
+  * The actual maximum step length that the line search is allowed to take in the optimal direction, i.e. maximum value of lambda, is determined by the parameter limits, the direction vector and its length (i.e. the maximum step length according to the QN algorithm), and a factor for increasing the step length, set in [[start:hype_file_reference:optpar.txt|optpar.txt]]. The default value of the factor is 1.618, consistent with golden ratio line search algorithm. The iteration process may thus be accelerated. The factor for increasing the step length can be specified by the code ''QN_lambAcc''. In the example, the user defines it to 1.0; see 13th line of the file in Fig 12.
-  * For each line search operation, a maximum amount of iterations can be specified by the flag ''lnS_maxItr''. In the example at hand, maximum 50 line searches are allowed; see the 14th line of the file in Fig 12.
+  * For each line search operation, a maximum amount of iterations can be specified by the code ''lnS_maxItr''. In the example at hand, maximum 50 line searches are allowed; see the 14th line of the file in Fig 12.
-  * For each line search operation, a relative tolerance for search interval contraction can be specified by means of the flag ''lnS_tol''. In the example of Fig. 12, the line search interval is never allowed to be shorter than 0.00001.
+  * For each line search operation, a relative tolerance for search interval contraction can be specified by means of the code ''lnS_tol''. In the example of Fig. 12, the line search interval is never allowed to be shorter than 0.00001.
 The specification of calibration parameters starts on line 22 of the [[start:hype_file_reference:optpar.txt|optpar.txt]] file. Listing of the model parameters subject to optimization is achieved as described [[start:hype_tutorials:automatic_calibration#specification_of_calibration_parameters_-_optpartxt|above]].  The order in which the model parameters are listed in [[start:hype_file_reference:optpar.txt|optpar.txt]] is relevant for the [[start:hype_file_reference:qnstartpar.txt|qNstartpar.txt]] file; they must be listed in the same order. This file gives the starting parameter values for the line search of the steepest descent method.
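The numerical derivative described for ''QN_pctDerv'' can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the HYPE routine: it assumes a two-point stencil with one point on each side of the current parameter value, an offset equal to the factor times the width of the calibration interval, and a hypothetical function name ''numerical_gradient''. The objective function used here is a simple stand-in for a full model run.

```python
def numerical_gradient(objective, params, lower, upper, pct_derv=0.02):
    """Estimate the gradient of objective at params with a two-point stencil.

    The offset for parameter i is pct_derv times the width of its
    calibration interval (upper[i] - lower[i]); 0.02 is the default factor
    mentioned in the text.
    """
    gradient = []
    for i in range(len(params)):
        offset = pct_derv * (upper[i] - lower[i])
        above = list(params)
        below = list(params)
        above[i] += offset
        below[i] -= offset
        # Note: the offset points may fall outside [lower[i], upper[i]]
        # when params[i] lies close to an interval limit.
        gradient.append((objective(above) - objective(below)) / (2.0 * offset))
    return gradient


def toy_objective(p):
    """Stand-in for a model run returning a criterion value."""
    return p[0] ** 2 + 3.0 * p[1]


grad = numerical_gradient(toy_objective, [1.0, 2.0], [0.0, 0.0], [10.0, 10.0])
```

Each gradient estimate costs two objective evaluations per parameter with this stencil, which is why the gradient-based methods need more model runs than the sampling or Brent methods for the same number of iterations.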
