Soil Data Fitting

By H. D. Scott and D. L. Baker  Ó 2000

Topics  

 

Introduction  (top)

The first step in calibrating a simulation model of saturated and unsaturated water flow in soil must be to take field and laboratory measurements of the soil physical and hydraulic properties. In this case, Thiesse (1984) performed standard constant-head saturated conductivity and pressure plate saturation measurements on core samples in the lab. In the field, he used the zero-flux method (Arya, et al., 1975) to generate data for the unsaturated conductivity relation. One can either accept his results or refit the data to relations that seem more convenient. It takes less time and effort to fit a model to relations that use the least number of parameters. Depending on the calibration method chosen, such as simulated annealing, adding just one more parameter can increase the effort by as much as the multiple of the number of trial values for that parameter. The soil hydraulic relations chosen here may not be the best from other standpoints, but are chosen with that in mind.

Also, conventional symbols for soil hydraulic parameters are not used here. Instead, the Fortran variables from the computer program POULIT are used, in an effort to reduce confusion for users wanting to become familiar with the model.

 

 

The Pressure-Saturation Relation  (top)

Table 1 shows the soil saturation data, with depth (cm), from Table 8 in Thiesse (1984), with data added by H.D. Scott (Table 2). Table 1 converts matric suction, -p, from kbar in Thiesse to cm of suction head, at the rate of (1020.4 cm)/(1 bar). These data were first addressed by the current authors in 1997. For reasons lost in that time, the 100 kPa data were not used.

Table 1: Volumetric water content, th (cm3/cm3), versus depth, z (cm), and negative hydraulic soil water pressure, -p (cm).

z (cm):

0

15

30

45

61

76

91

106

122

137

-p (cm)

Volumetric water content, th (cm3/cm3)

0

0.537

0.479

0.461

0.462

0.449

0.447

0.458

0.434

0.422

0.465

25.5

0.458

0.401

0.372

0.375

0.379

0.378

0.385

0.374

0.367

0.396

51.0

0.450

0.381

0.348

0.352

0.362

0.362

0.372

0.366

0.363

0.383

75.5

0.417

0.359

0.329

0.331

0.340

0.345

0.359

0.358

0.356

0.381

102.0

0.406

0.350

0.320

0.325

0.340

0.344

0.358

0.355

0.350

0.372

153.1

0.393

0.336

0.308

0.316

0.331

0.338

0.352

0.349

0.346

0.363

204.0

0.363

0.314

0.292

0.301

0.320

0.330

0.344

0.344

0.340

0.356

255.1

0.357

0.307

0.286

0.295

0.316

0.326

0.340

0.341

0.339

0.348

306.1

0.333

0.291

0.276

0.285

0.307

0.318

0.333

0.335

0.331

0.344

510.0

0.306

0.263

0.254

0.267

0.294

0.305

0.321

0.326

0.322

0.329

765.3

0.286

0.239

0.233

0.248

0.281

0.295

0.311

0.316

0.312

0.318

1020.4

0.266

0.225

0.218

0.236

0.270

0.287

0.304

0.308

0.304

0.310

Table 2: Separate bulk density, rob (g/cm3), total porosity and water content, th (cm3/cm3) data, versus soil depth, z (cm).
   

water retained at 1000 kPa

water retained at 1500 kPa

z

rob

(g/g)

th

(g/g)

th

5

1.390

0.150

0.2085

0.049

0.0681

15

1.444

0.126

0.1819

0.051

0.0736

25

1.417

0.147

0.2083

0.072

0.1020

35

1.417

0.160

0.2267

0.092

0.1304

45

1.454

0.163

0.2370

0.116

0.1687

55

1.485

0.170

0.2524

0.135

0.2005

65

1.495

0.181

0.2706

0.142

0.2123

75

1.470

0.191

0.2808

0.134

0.1970

85

1.462

0.198

0.2895

0.182

0.2661

95

1.462

0.201

0.2939

0.116

0.1696

 

Notice that the 1500kPa (or 15.306 m or 15 bar) data in Table 2 were not developed either at the same depths in the soil column, or to the deepest depth in Table 1. The data were fitted (Figure 1) with a Gompertz function [1], and extended to 137 cm (Table 3). Unfortunately, it is apparent in later discussion that the 15-bar data do not always seem to be in the same data set as the pressure plate data.

[1]

 

Figure 1: Plot of 15 bar water content (cm3/cm3) data (points, Table 2) and fitted equation (line, Table 3) versus depth (cm)

Table 3: 15-bar Water contents, th (cm3/cm3), fitted to depth, z (cm), by [1]

z

th

z

th

z

th

0

0.0670

61

0.2083

122

0.2297

15

0.0719

76

0.2220

137

0.2299

30

0.1144

91

0.2271
   

45

0.1725

106

0.2290
   

 

It should also be noted that the water content data for z = 0 cm in Table 1 was not developed on a pressure plate apparatus. It was calculated from the bulk and particle densities, because of the difficulty of getting the samples fully saturated. This means that there is an unknown systematic error between this saturated water content, ths, and the remaining unsaturated water contents, th(z) in the table. The unsaturated water contents involve some unknown residual air content in the finest pores that did not exist for the saturated water content, ths. As the water is withdrawn from the soil in the pressure plate apparatus, these fine pores may or may not be eventually exposed to the contiguous air phase. There is no way to know where in the data this may happen. It also seems likely the that 10-bar and 15-bar measurements may have exposed these fine pores, where the lower suction measurements did not.

Figure 2a-j show the measured saturation relation data from Table 1 in log-log and semi-log plots, along with a line demonstrating a fitting equation [2], using the parameters in Table 4. Notice that for this equation, th can go to zero. The last column in Table 4 shows the matric suctions for this case. One could also choose to use the Brooks-Corey relation [3] or the van Genuchten-style relation [4]. But the approach for modeling the hydraulic properties of the soil column in this experiment and calibrating the model requires that the saturation relation be of the same form down the entire column. Only one of these can be chosen.

[2] ,

where th is water content (cm3/cm3) for p < 0, ths is saturated water content, vn is a dimensionless parameter, p is hydraulic pressure head (cm) and at is the "displacement pressure" (cm).

[3] , where thr is the residual water content (cm3/cm3), and vn is a dimensionless parameter

[4] , p < 0, vm and vn are dimensionless parameters.

Figure 2a

Figure 2b

Figure 2c

Figure 2d

Figure 2e

Figure 2f

Figure 2g

Figure 2h

Figure 2i

Figure 2j

Figure 3: Soil water content, fitted by [2] with the Table 4 parameters, versus the measured water content for all the points in Table 1, plus the 15-bar points in Table 3.

 

Table 4: Fitted parameters for equation [2] used in Figures 2a-j. The matric suction, y = -p, at which th = 0 is listed in the last column

Depth set, z (cm)

ths (cm3/cm3)

at (cm)

vn

rms error (cm3/cm3)

y(th=0) (m)

0

0.537

20.563

0.071067

8.127e-3

393

15

0.479

11.4081

0.056512

4.490e-3

547

30

0.461

3.4968

0.041388

3.085e-3

2404

45

0.462

2.7316

0.037250

9.450e-3

6650

61

0.449

2.4079

0.028916

4.404e-3

0.133e+6

76

0.447

1.6668

0.024722

2.421e-3

1.19e+6

91

0.458

1.4599

0.023469

4.581e-3

4.36e+6

106

0.434

1.8074

0.019503

8.382e-3

83.5e+6

122

0.422

2.6180

0.019363

7.464e-3

76.4e+6

137

0.465

3.9693

0.027913

4.558e-3

0.682e+6

 

Since some of the log-log plots of measured data in Figures 2a-j can be well fitted with a straight line, the Brooks-Corey could be the best in those cases. But others have an obvious curvature in the log-log plots that would seem to demand a van Genuchten-style relation. Table 5 shows the fitted parameters for the standard van Genuchten relation [5]. Figure 4 shows the comparison of the fitted values to the measured values. Notice that the standard van Genuchten fit does very well except for saturated porosity, ths. In many cases it produces a lower value. There are duplicate values in Table 5 to demonstrate the pseudo-random nature of the simulated annealing method to make the fits (explained later).

[5] ,

m and n are non-dimensional fitting parameters.

Table 5: Fitting parameters for the van Genuchten saturation relation [5] and the measured data in Table 1. Fitted by simulated annealing with duplicate fits for some depths. Last column is the mean absolute error (MAE) of the fit to the measured data.

depth set

z (cm)

ths (cm3/cm3)

thr (cm3/cm3)

at

(cm)

 

m

 

n

MAE (cm3/cm3)

0

0.4707

0.0000

122.83

0.21847

1.27954

0.0094

0

0.4711

0.0000

119.72

0.21623

1.27589

0.0094

15

0.4168

0.0000

77.57

0.19237

1.23819

0.0054

15

0.4107

0.0000

97.01

0.20320

1.25502

0.0054

30

0.3745

0.0000

89.29

0.17928

1.21845

0.0032

30

0.3745

0.0000

89.09

0.17919

1.21831

0.0032

45

0.4029

0.0731

33.56

0.16321

1.19504

0.0025

45

0.3891

0.0887

48.15

0.18067

1.22050

0.0025

61

0.4093

0.0261

20.83

0.10146

1.11291

0.0017

61

0.4041

0.0651

27.62

0.12032

1.13678

0.0018

76

0.3867

0.0000

45.40

0.08704

1.09534

0.0017

91

0.3864

0.0000

96.80

0.09503

1.10501

0.0021

91

0.3861

0.0000

99.32

0.09534

1.10539

0.0021

106

0.3670

0.0000

242.73

0.10202

1.11360

0.0019

106

0.3670

0.0001

243.08

0.10207

1.11367

0.0019

122

0.3662

0.0000

203.70

0.09707

1.10750

0.0015

122

0.3662

0.0019

204.30

0.09769

1.10826

0.0015

137

0.3982

0.0000

102.94

0.09884

1.10968

0.0012

137

0.3982

0.0005

103.25

0.09903

1.10991

0.0012

Figure 4: Van Genuchten fits [5] generated for data in Table 1 by the parameters in Table 5, versus the measured data in Table 1.

 

All of the semi-log plots in Figures 2a-j demonstrate that the simpler log relation [2] can fit the data in Table 1 as well as the van Genuchten relation [5], with all of the same values of ths. It has the added advantage that it does not require a parameter for the residual saturation, thr, making it the least number of parameters to fit. It is also reasonable to assume that if the saturation in [2] goes to zero, then so does the hydraulic conductivity, sc (cm/hr). This can produce some difficulties in the simulation model with stability and convergence. If the displacement pressure parameter, at, is not changed in the model calibration, then Table 4 shows that equation [2] fits the entire range of the measured tensiometer readings for this experiment, at each depth (not shown here). Thus, any adverse effects from using equation [2] may not appear, or if they do, not in all the ranges of the possible parameter values (the parameter space).

Calibrating a model to a field experiment (Scott, et al., 1995) over a nominal growing season of more than 200 days is computationally intensive. Consider simulated annealing (Goffe, et al., 1994), with eight fitting parameters in a model that takes about 2 minutes to simulate 1-D infiltration and evapotransipration in a 225-day growing season. It takes about 131 steps to go from an annealing temperature of 10 to 10-5, with a reduction factor of 0.9. Suppose that the model is run 20 times with random parameter settings, before the step size is change, and that 10 such sets are run before the temperature is reduced. This means that it will take 132*8*10*20*(2(min))/(60*24) = 293.3 days of computer time to reach a resolution, 68 days longer than the growing season being simulated.

This makes choosing a saturation relation with the least number of fitted parameters, along with some other optimization method, a reasonable first approach. In addition, the curvature in the log-log plots of measured saturation data is slight, and there are not enough data points in the low and high matric suction ranges. That makes it difficult to clearly and independently identify the at, vm and vn parameters in the van Genuchten-style relation [4], which in the fits tried, behaved as an equation with too many variables.

 

 

Fitting Saturation Relations with Simulated Annealing  (top)

While the computational overhead of simulated annealing (Goffe, et al., 1994) is burdensome for model calibration, it works quite well for fitting saturation relations. For those wishing a better understanding of the process, additional readings may include Corana, et al. (1987), and Kirkpatrick, et al. (1983). A public-domain Fortran program, simann (http://netlib.bell-labs.com/netlib/opt/simann.f.gz), associated with Goffe, et al., was modified to perform the fits presented in Tables 4 and 5.

Simulated annealing is a numerical optimization process that imitates the cooling of metal from high to low temperature. As metal cools, its atoms vibrate less and less, eventually "freezing" into their final positions. If the metal is quickly quenched, its atoms may be caught in positions that are not optimal in terms of the lowest possible energy. It may thus contain stresses and coarse grain structures, literally frozen into position. But if the metal is cooled very slowly, its atoms may naturally find the positions of lowest energy and stress.

Numerical simulated annealing is a pseudo-random process. Say that the "energy" of a data system is represented by a measure called an objective function. In this case, it is the sum of the absolute errors of the fitted saturation relation relative to the measured volumetric water content. The fitting parameters are always given a starting point, a first estimate. In this case, the program allows the user to make the first estimate, or will make the estimation itself.

The program is allowed a certain range of movement in the parameter space, a step length, within fixed limits. From the first point, it will take the each parameter and make a step away from it a random distance that is less than one-half the step length. If the objective function is lower, this will be the new best point and new starting point for the next step. If the objective function is higher, it may or may not accept the new point as the starting point for the next step, depending on the annealing temperature, which starts high.

The annealing temperature is used in an second pseudo-random process that accepts larger increases in the objective function for larger annealing temperatures, with an exponential probability. The higher the annealing temperature, the more likely a parameter point with a "worse" objective function value will be accepted as the start of the next step. Between random step lengths, and the random acceptance of worse values, the process has a good but finite chance of stepping out of local minima of the objective function to find the universal best minimum, the global minimum.

There are other details of the method that are left to the references. The step length is adjusted after 10 steps of each parameter to eventually become very small, and thus specify the resolution of the method. After each 200 steps of each parameter, the annealing temperature is reduced, decreasing the probability that a worse point will be accepted. The overall best point in each temperature cycle is retained as the starting point for the next cycle. Eventually, when both the temperature and the step sizes are very small (10-6), the best overall point is taken as the final result.

As the data in Table 5 demonstrate, the pseudo-random nature of the process does not guarantee the same answer every time. This is almost guaranteed in the program discussed here, satfitsa.exe, by using the computer clock time to seed the random number generator. Thus, the answers will most likely not be exactly the same in every run, even with the same input data and first estimates. The advantage of this is that a slightly better estimate might be found with more runs.

The computer program, satfitsa, will fit three types of saturation relations, the five-parameter van Genuchten-style relation [4], the three-parameter log relation [2] and the four-parameter standard van Genuchten relation [5]. It will allow the user to select the initial starting estimates for the parameters, or make its own estimates from the data. If the program is allowed to make its own estimates, it is wise to have a full range of data, and ten points or more. Please note that the program runs only under MS- or other compatible DOS, or in a DOS window in Windows. The input and output files are DOS ascii files.

The program will begin by asking for three filenames, one input and two output files. They must all be in DOS 8+3 filename format, with 8 or less characters, optionally followed by a dot and three or less characters. Table 6 shows the format of the input file. Except for the "series" identifier, all of the data is unformated, and must be set up on each line as shown. The x1 to x5 variables are the input estimates for the fitting parameters, designated under usage, and must all contain values, even if they are unused dummy variables.

Table 6: Format for the input file to satfitsa.

line
variable(s) usage

1
ny the number of points (y, th) following

2
series a 15-character alphanumeric identifier for the data set, starts with the first character of the line

2, ... ny+2
y, th y = matric suction (m), th = volumetric water content (cm3/cm3), one set per line, separated by a comma.

ny+3
me method = 1 for van Genuchten-style equation [4] with user initial estimates, = 10 for [4] with program estimates, = 2 for ln-equation [2] with user estimates, =20 for [2] with program estimates, = 3 for standard van Genuchten equation [5] with user estimates, = 30 for [5] with program estimates

1 or 10

2 or 20

3 or 30

ny+4
x1

ths

ths

ths

ny+5
x2

thr

dummy value

thr

ny+6
x3

at

at

at

ny+7
x4

vm

vn

n

ny+8
x5

vn

dummy value

dummy value

ny+9
f dummy value, placeholder for output file

ny+10
ny ny < 0, terminates the program
ny > 0, begins the next data set, with all the above lines repeated.

 

Table 7 shows the output format for the long output file, of the second name requested by the program at start. It adds to the basic input data. The names of the input and output files are added at the beginning to help in spreadsheets. The values the water content calculated by the fitting equation, th-fit, are added to the input (y, th) points on the same line, to use in spreadsheet plots. The input estimates of the parameter values are replaced by the program optimal values, followed by the final step sizes, res1 to res5, used in searching for the optimum. The magnitudes of these step sizes are a reliable indicator of the resolution of the optimal parameter values. Finally, the objective function, f, the summed-absolute-error of the fitted points, shows the overall error of the fit.

Table 7: Format for the long out file from satfitsa.

line
variable(s) usage

-2
the filename of the input file in quotes

-1
the filename of this file in quotes

0
blank line

1
ny the number of points (y, th) following, repeated from input file

2
series a 15-character alphanumeric identifier for the data set, repeated from input file

2, ... ny+2
y, th, th-fit y = matric suction (m), th = volumetric water content (cm3/cm3), th-fit = the fitted water content

ny+3
me, ier method, repeated from input file, and the error number, ier = 0 for no error, = 1 if the program exceeds 500,000 evaluations of the objective function

1 or 10

2 or 20

3 or 30

ny+4
x1, res1

ths, res

ths, res

ths, res

ny+5
x2, res2

thr, res

0, 0

thr, res

ny+6
x3, res3

at, res

at, res

at, res

ny+7
x4, res4

vm, res

vn, res

n, res

ny+8
x5, res5

vn, res

1, 0

m, 0

ny+9
fopt objective function, sum of the absolute error between th and th-fit

ny+10
ny ny < 0, terminates the program repeated from the input file
ny > 0, begins the next data set

 

The short output file, of the third filename requested by the program at start, tabulates the results without the data points. It contains one line for each input data set. If the fitting method is 1 or 10 (equation [4]), the line is:

series, ny, ier, me, th(1), th(2), ths, thr, at, vm, vn, fopt, res(ths), res(thr), res(at), res(vm), res(vn)

If the method is 2 or 20 (equation [2]), the line is:

series, ny, ier, me, th(1), th(2), ths, 0, at, vn, 0, fopt, res(ths), res(at), res(vn), 0, 0

If the method is 3 or 30 (equation [5]), the line is:

series, ny, ier, me, th(1), th(2), ths, thr, at, m, n, fopt, res(ths), res(thr), res(at), res(n), 0

The variables th(1) and th(2) are the first and second water contents of the input (y, th) points, for comparison to the ths value.

Table 8 shows the upper and lower limits allowed for the parameters by the program. Any user estimates outside these limits will be adjusted inside. Please note that the matric suction, y, is in the dimensions of meters. Please make any changes in scale necessary before and after the program runs.

Table 8: Upper and lower limits for fitted parameters

Method 1 or 10

Method 2 or 20

Method 3 or 30

, y(1) = 0 is taken to be a saturation measurement

0 £ thr £ 0.999 th(ny)

---

0 £ thr £ 0.999 th(ny)

10-4 £ at £ 5

10-4 £ at £ 5

10-4 £ at £ 10

1 £ vm £ 15

10-3 £ vn £ 10

---

0.1 £ vn £ 15

---

0.1 £ n £ 20

 

Associated files available for downloading:  (top)

  1. satfitsa.exe - MS-DOS executable program to fit three saturation relations (73k).
  2. se030001.dat - example DOS ascii input file for satfitsa; contains the data from Table 1 twice, set up for program estimates and fitting to the standard van Genuchten function [5] (the parameter estimates in the file are not used, and a good thing, too) (7k).
  3. se030002.prn - example long output file generated by satfitsa from se030001.dat (14k).
  4. se030003.prn - example short output file (4k).
  5. se030002.123 - example Lotus 1-2-3 97 spreadsheet file with a graph (Figure 4) demonstrating the data (30k).
  6. se010001.123 - example Lotus 1-2-3 spreadsheet file with the data and graphs used to generate Figures 2 and 3 (98k).
  7. satfitsa.zip - a zipped file with all of the files above (60k).

 

 

Conductivity Relation Fits  (top)

It is not entirely clear from Theisse's thesis (1984) and notes how he used the experimental data from the plane of zero flux method (Arya, et al., 1975) to generate his unsaturated conductivity relationships. He gives the unsaturated conductivity relation [6] and the tabular values of A and B (Table 9) from an unspecified averaging method over 12 plots at the given depths. Equation [6] can also be expressed as [7], where the saturated conductivity, sc (cm/h), and the saturated water content, ths (cm3/cm3), are explicitly parameters. If one substitutes the log saturation relation [2] into [7], one obtains [8], which makes the unsaturated conductivity, ck, an explicit function of the hydraulic pressure, p (cm). It produces two fitting parameters, sc and bc (dimensionless) that are independent of the saturation relation.

[6] , where k = unsaturated hydraulic conductivity after Thiesse (cm/h), q = volumetric water content (cm3/cm3).

[7] , where ck is the model variable for unsaturated conductivity (cm/h), sc = saturated conductivity (cm/h), and ths = saturated water content (cm3/cm3).

[8]

Table 9: Unsaturated conductivity fitting parameters from Thiesse (1984). Saturated porosity, ths, averaged from depths in Table 1.

Depth, z (cm)

ln A

A

B

avg ths

sc

7.5

-11.35

1.17e-5

25.0

0.5080

3.81904

22.5

-12.82

2.71e-6

34.0

0.4700

23.5706

37.5

-14.90

3.38e-7

40.3

0.4615

40.3847

53.0

-15.00

3.06e-7

39.0

0.4555

15.8711

68.5

-15.50

1.86e-7

38.1

0.4480

4.80088

83.5

-13.00

2.26e-6

30.3

0.4525

2.03552

98.5

-13.30

1.67e-6

30.6

0.4460

1.41567

114

-14.00

8.32e-7

33.3

0.428

1.28711

 

Notice that in order to develop equation [7] for each of the soil depths, one must have values of ths, A and B at the same depths. Unfortunately, the depths in Table 9 do not match the depths in Table 1 or Table 4. But the two sets of depths are positioned midway between each other, making simple averaging a reasonable approximation. For values outside the available ranges, such as z = 0 for A and B, and z > 114 for A and B, and z > 137 for ths and vn, let us simply extend the nearest value that is available, and produce a set from z = 0 to z = 201 cm.

Figure 5a shows the parameters B and vn cross-interpolated to the same z-points. Figure 5b shows the points generated from B× vn, and a smoothed line of bc through the points. The smoothed line was generated by TableCurve 2-D (SPSS, Inc), using a 21.1% Loess smoothing spline method. Table 10 shows the points taken from the curve to use in the model.

Figure 5a: Cross-interpolated B and vn

Figure 5b: bc = B*vn, and and spline-smoothed

Table 10: Spline-smoothed bc(z)

z (cm)

bc

z (cm)

bc

z (cm)

bc

0

1.746

60

1.156

120

0.652

10

1.662

70

0.971

130

0.789

20

1.634

80

0.807

140

0.893

30

1.587

90

0.711

160

0.924

40

1.512

100

0.660

201

0.930

50

1.355

110

0.640

 

 

References  (top)

Arya, L.M., D.A. Farrell and G.R. Blake. 1975. A field study of soil water depletion patterns in the presence of growing soybean roots: 1. Determination of hydraulic properties of the soil. Soil Science Society of America Proceedings 39:424-430.

Corana, A., M. Marchesi, C. Martin and S. Ridella. 1987. Minimizing multimodal functions of continuous variables with the "simulated annealing" algorithm, ACM Transactions on Mathematical Software, 13(3):262-280.

Goffe, W.L., G.D. Ferrier and J. Rogers. 1994. Global optimization of statistical functions with simulated annealing, Journal of Econometrics 60:65-99.

Kirkpatrick, S., C.D. Gelatt, Jr. and M.P. Vecchi. 1983. Optimization by simulated annealing, Science, 220(4598):671-680.

Scott, H.D., A. Mauromoustakos and J.T. Gilmour. 1995. Fate of inorganic nitrogen and phosphorous in broiler litter applied to tall fescue, Bulletin 947, Arkansas Agricultural Experiment Station, University of Arkansas, Fayetteville, AR 72701, 55p.

Thiesse, B.R., 1984, Variability of the physical properties of the Captina soils, M.S. Thesis, Department of Agronomy, University of Arkansas, Fayetteville, AR USA, 91 p.


Questions? mailto:dscott@cleora.uark.edu (no spam please)