Constrained Smoothing of Experimental Data in The Identification of Kinetic Models

Zdravko Kravanja (Editor)
Proceedings of the 26th European Symposium on Computer Aided Process Engineering

June 12th - 15th, 2016, Portorož, Slovenia. © 2016 Elsevier B.V. All rights reserved.
Carolina S. Vertis¹, Nuno M.C. Oliveira¹* and Fernando P. Bernardo¹
¹ CIEPQPF, Department of Chemical Engineering, University of Coimbra, Rua Sílvio Lima,
Polo II, 3030–790 Coimbra, Portugal
*nuno@eq.uc.pt
Abstract
In the context of the application of systematic methodologies for the incremental development
of mechanistic kinetic models, various regularization techniques have been explored to allow the
estimation of rate phenomena as the derivatives of sparse data sets. Through a combination of
a numerical approach with aspects of qualitative analysis, this work proposes a novel method to
obtain smoothed concentration profiles that allow their reliable differentiation. Two different case
studies are considered to illustrate the performance of the method.
Keywords: Kinetic modelling, reaction network analysis, data smoothing, orthogonal collocation,
systematic model identification
1. Introduction
Model-based analysis of experimental data plays a fundamental role in the development of kinetic
models for chemical and biochemical systems. These data, typically concentrations of chemical
species registered over transient experiments, are used to identify reliable kinetic models, which
tend to assume increasingly preeminent roles in the design, optimization and control of chemical
processes. Systematic methodologies have been recently proposed to support and streamline the
development of mechanistic models for these processes (Marquardt, 2005; Brendel et al., 2006).
A distinctive characteristic of these approaches is the partition of the identification task into a
sequence of consecutive phases. In each of these phases a particular structural component of the
model is addressed individually, excluding or substantially limiting the effects of the remaining
parts of the overall model, which are identified sequentially (Vertis et al., 2015).
For the application of these incremental modelling strategies both methods of classical analysis
of kinetic data, the integral and the differential method, have been considered, and each approach
brings specific advantages. The differential method allows the reaction rates to be calculated
directly once the concentration (or number of moles) time profiles are differentiated. However,
this operation usually introduces errors in the estimates of
the reaction rates, which can be substantial if adequate procedures are not used (Varah, 1982). In
contrast, the integral method can become less sensitive to errors in specific concentration measure-
ments, especially when the conversions become significant. In this case, and since the individual
reaction rates are not available (as this would require data differentiation), the identification of the
kinetic models is typically performed by comparing the predicted and computed stoichiometric
extents of candidate kinetic models, retaining the models that are able to achieve the best fit (Bhatt
et al., 2011). As in the application of the pure differential method, this might be problematic if
more chemical reactions than independent stoichiometric extents occur in the system; this
behaviour is often observed in biological networks (Jia et al., 2012), but also happens in many other
catalytic systems (e.g., Deshpande et al., 2002). Above all, when the kinetic models that need to be
considered are nonlinear in the kinetic parameters, the integral method hinders the separate estimation
of the rate phenomena without postulating a priori a model structure for their description, and
compromises the complete application of the incremental approach (Marquardt, 2005; Vertis et al., 2015).
Hence further interest subsists in the continued development of accurate and practical approxi-
mation techniques to estimate the derivatives of the concentration profiles, for incremental kinetic
model development.
2. Rate estimation using regularization methods

Consider the mass balances for species s in a homogeneous batch reactor, assumed to be of con-
stant volume. These balances can be represented by
$$\frac{dc_s}{dt} = \sum_{m \in rx} \nu_{s,m}\, r_m(t) \quad \text{and} \quad \frac{dc}{dt} = N^T \cdot r(t) \tag{1}$$
where $c_s$ represents the concentration of species $s \in sp$, $\nu_{s,m}$ is the stoichiometric coefficient of
species $s$ in reaction $m \in rx$, $c \in \mathbb{R}^{n_{sp}}$ is the overall concentration vector and $N \in \mathbb{R}^{n_{rx} \times n_{sp}}$ the
stoichiometry matrix, while $r(t) \in \mathbb{R}^{n_{rx}}$ is the vector of reaction rates. Overall mass conservation
as well as additional elemental, electric charge and reactivity constraints need also to be consid-
ered. These are assumed to be represented by the set of linear equations constraining the possible
variations of the concentration vector c
$$A \cdot c = b_I \quad \text{or} \quad A \cdot dc = 0 \tag{2}$$
where $A \in \mathbb{R}^{n_I \times n_{sp}}$, $n_I$ represents the total number of invariants considered, and $b_I \in \mathbb{R}^{n_I}$ represents
the total amount of the conserved quantities. Here $A$ is assumed to be of full row rank, with
$n_I < n_{sp}$. Equation (2) restricts the possible concentration changes in the system to remain in
the null space of $A$, with dimension $n_{sp} - n_I$. Hence (1) can be replaced by a smaller number of
independent mass balances sufficient to fully describe the evolution of the system, together with
(2). These can be written as
$$\frac{dc_v}{dt} = N_v^T \cdot r(t) \tag{3}$$
where $c_v \in \mathbb{R}^{n_{sp}-n_I}$ and $N_v \in \mathbb{R}^{n_{rx} \times (n_{sp}-n_I)}$ represent the reduced concentration vector and
stoichiometry matrix, respectively. We also denote by $sv$ the set of variant species considered in (3),
with $n_{sv} = n_{sp} - n_I$.
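As an illustration of this reduction, consider a hypothetical series network A → B → C (an assumed example, not one of the systems studied in this work), with the species ordered as (A, B, C) and total mass as the only invariant. Under these assumptions the matrices in (1)-(3) could take the following form:

```latex
\[
N = \begin{pmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}, \qquad
A = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}, \qquad
A \cdot N^T = \begin{pmatrix} 0 & 0 \end{pmatrix},
\]
\[
n_{sp} = 3, \quad n_I = 1, \quad n_{sv} = n_{sp} - n_I = 2, \qquad
c_v = \begin{pmatrix} c_A \\ c_B \end{pmatrix}, \qquad
N_v = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}, \qquad
c_C = b_I - c_A - c_B .
\]
```

Here the choice of A and B as variant species is arbitrary; any two of the three species would serve, since the third is always recovered from the invariant relation.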
For simplicity, all of the species in $c_v$ are assumed to be measured, at sampling times $t_k$, with
$k \in me$, the set of measurement instants. If their respective time profiles are differentiated to obtain
the vector $\dot{c}_v(t) = dc_v(t)/dt$, then the isolated solution of (3) is possible and determined at each
instant $t_k$, to provide the instantaneous reaction rates $r(t_k)$, on condition that $N_v$ is of full row
rank (which requires that the number of reactions does not exceed $n_{sp} - n_I$, i.e., $n_{rx} \le n_{sv}$). These values can be later
correlated with the concentrations of the species at the corresponding $t_k$, in a later phase of the
application of the incremental identification method, allowing the selection of particular kinetic
models and the determination of the numerical values of the corresponding parameters, through
the solution of separate algebraic problems for each rate expression. On the other hand, if the
number of assumed reactions exceeds the rank of $N_v$ ($n_{rx} > n_{sv}$), infinitely many solutions of (3) are in
general possible, and therefore the instantaneous solution of (3) is underdetermined in this case.
Instead, a correlation between measurements obtained at more than one sampling instant needs to be
established, possibly with the simultaneous analysis of more than one set of experimental data at
once. Additional constraints of different natures, such as thermodynamic constraints and those arising
from the type of reaction network considered (linear / nonlinear), can also be included in this problem, to further delimit the
set of feasible solutions. However, the detailed analysis of this case is outside the scope of this
work.
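The determined case discussed above can be sketched numerically. The following is a minimal illustration, assuming a hypothetical two-reaction network A → B → C with variant species (A, B) and made-up derivative values; the function name `rates_from_derivatives` is introduced here only for illustration and is not part of the original work.

```python
import numpy as np

# Hypothetical network A -> B -> C (not from the paper), variant species (A, B).
# Rows of N_v are reactions, columns are variant species, as in equation (3).
N_v = np.array([[-1.0, 1.0],    # r1: A -> B
                [0.0, -1.0]])   # r2: B -> C

def rates_from_derivatives(N_v, dcv_dt):
    """Solve N_v^T r = dc_v/dt for the rate vector r at one sampling instant."""
    n_rx = N_v.shape[0]
    if np.linalg.matrix_rank(N_v) < n_rx:
        # More reactions than independent balances: infinitely many solutions.
        raise ValueError("N_v is not of full row rank: rates are underdetermined")
    r, *_ = np.linalg.lstsq(N_v.T, dcv_dt, rcond=None)
    return r

# Assumed derivative estimates of (c_A, c_B) at some instant t_k.
dcv_dt = np.array([-2.0, 1.5])
r_k = rates_from_derivatives(N_v, dcv_dt)
print(r_k)
```

A least-squares solve is used so that the same call also covers the case with fewer reactions than variant species, where (3) is overdetermined at each instant.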
The estimation of the derivatives of the concentration profiles from experimental data, perhaps the
major limitation in the application of the full incremental approach, is known to be an ill-posed
problem, due to the sparsity and noise contamination of the measurements. Consequently, diverse
regularization techniques have been tried, which obtain solutions to closely related problems that
enforce additional smoothness assumptions on the solutions obtained. A common goal of these
techniques is to deal only with well-posed problems, which provide solutions as close as possible
to the solution of the original problem, but are less sensitive to small changes in the input data.
Methods such as the use of filter based approaches, Tikhonov regularization, and the approxi-
mation of the discrete points by smoothing splines, followed by differentiation of the resulting
numerical solution, have been reported (Michalik et al., 2009).
A common trade-off in these approaches is the amount of bias introduced in the estimates of the
derivatives produced (which is dependent on the particular technique used), versus the amount of
regularization achieved. For instance, with smoothing splines, an objective function of the
following general form is usually considered:
$$\min\; \phi = \frac{p}{n} \sum_{k=1}^{n} \left( \frac{c_{v,k} - \hat{c}_{v,k}}{\delta_k} \right)^2 + (1-p) \int_0^{t_F} \hat{c}''_v(t)^2 \, dt \tag{4}$$
Here ĉv,k represents the model predicted (smoothed) value of cv (tk ) and δk corresponds to a mea-
sure of the standard deviation of cv (tk ), while p ∈ [0, 1] is a penalty term that controls the amount
of smoothing introduced. An important feature of this method is that if natural splines are con-
sidered, for p < 1 an amount of bias is known to be present in the derivative estimates produced
(Brendel et al., 2006). This error tends to be larger at the extremes of the interpolating interval,
since the second derivatives of the approximating functions are assumed to vanish there; conse-
quently, the use of boundary corrected smoothing splines is preferred (Huang, 2001).
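The trade-off controlled by p in (4) can be illustrated with a small numerical sketch. The code below is not a smoothing spline: as a simplifying assumption, the integral of the squared second derivative is replaced by a sum of squared second differences on a uniform grid, and the data values are invented for the example. The names `discrete_smoother` and `roughness` are hypothetical.

```python
import numpy as np

def discrete_smoother(c, delta, p, h=1.0):
    """Minimise (p/n)*sum(((c - c_hat)/delta)**2) + (1 - p)*h*sum((D2 c_hat)**2).

    Finite-difference stand-in for objective (4) on a uniform grid of spacing h.
    Requires 0 < p <= 1 (for p = 0 the linear system is singular).
    """
    c = np.asarray(c, dtype=float)
    n = c.size
    W = np.diag(1.0 / np.asarray(delta, dtype=float) ** 2)
    D2 = np.diff(np.eye(n), n=2, axis=0) / h**2   # second-difference operator
    lhs = (p / n) * W + (1.0 - p) * h * D2.T @ D2
    rhs = (p / n) * W @ c
    return np.linalg.solve(lhs, rhs)

def roughness(values):
    """Sum of squared second differences, a discrete measure of curvature."""
    return float(np.sum(np.diff(values, n=2) ** 2))

# Hypothetical decaying concentration data with scatter (not from the paper).
c_meas = np.array([1.00, 0.62, 0.45, 0.24, 0.20, 0.09, 0.08])
delta = np.full(c_meas.size, 0.05)

for p in (1.0, 0.9, 0.1):
    c_hat = discrete_smoother(c_meas, delta, p)
    print(f"p = {p:.1f}: roughness = {roughness(c_hat):.5f}")
```

With p = 1 the data are reproduced exactly, while smaller values of p give flatter profiles at the cost of a larger misfit, which is the source of the bias in the derivative estimates mentioned above.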
Besides the use of smoothing splines, alternative approximation techniques have also been applied
in the solution and optimization of differential-algebraic process models. In particular, orthogonal
collocation has been widely used in the solution of reaction models (Finlayson, 1972). Due to
its efficiency and versatility, orthogonal collocation on finite elements has also been proposed
for the parametrization of general differential algebraic optimization problems. An important
advantage of this approximation method is the existence of strategies for the adaptation of the
lengths (and placement) of the elements, resulting in a high-order approximation with relatively
few elements (Biegler, 2010). Seemingly unrelated applications, for the qualitative modelling of
physical systems and the description of the trends and main features of the observed process
responses, have also been developed in the literature (Cheung and Stephanopoulos, 1990; Schaich
et al., 2001). These techniques seek to extract a high-level description of the observed behaviour,
translatable into a group of constraints that can be used to help the subsequent identification of
quantitative kinetic models (Madár et al., 2003; Villez et al., 2013). The selected combination
of features from both contributions provides an alternative regularization methodology for the
estimation of the concentration derivatives, described in the next section.
3. Continuous approximation of the concentration data

Orthogonal collocation can be described as a variant of the weighted residual method, where
polynomials are used to approximate a continuous function, at particularly chosen (collocation)
data points. Besides the use of B-splines, other equivalent forms of representing the approximating
polynomials are possible; for convenience, the Lagrange interpolation form is used here.
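The Lagrange form can be made concrete with a short sketch. The node positions below are arbitrary illustrative values on a normalized interval, and the function names `lagrange_basis` and `interpolate` are introduced here for the example only.

```python
def lagrange_basis(nodes, j, x):
    """Value at x of the j-th Lagrange polynomial built on the given nodes."""
    value = 1.0
    for k, xk in enumerate(nodes):
        if k != j:
            value *= (x - xk) / (nodes[j] - xk)
    return value

def interpolate(nodes, coeffs, x):
    """P(x) = sum_j a_j * l_j(x); the coefficients a_j are the nodal values."""
    return sum(a * lagrange_basis(nodes, j, x) for j, a in enumerate(coeffs))

# Four illustrative nodes on a normalised element (assumed values).
nodes = [0.0, 0.25, 0.75, 1.0]
# Sample a cubic at the nodes; four nodes reproduce any cubic exactly.
coeffs = [x**3 - x for x in nodes]

print(interpolate(nodes, coeffs, 0.6), 0.6**3 - 0.6)
```

A convenient property of this representation is that each coefficient equals the value of the polynomial at the corresponding node, so bounds on the approximated concentrations at the nodes translate directly into bounds on the coefficients.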
We start by defining the sets fe of finite elements, and cp of collocation points in each element,
including the boundaries. Cubic polynomials with two interior collocation points inside each
element are assumed. A normalized coordinate 0 ≤ x ≤ 1 is defined for each element, to locate the
collocation points at specific values, corresponding to the normalized roots of orthogonal Legendre
polynomials, independently of the lengths of each element. With these definitions, we note that in
the phase of the incremental identification where the concentration derivatives are estimated, the
model (1) is still not known, and only constraints (2) can be used. Therefore, an initial formulation
to obtain a continuous approximation of the concentration profiles using orthogonal collocation on
finite elements can be written as:
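$$
\begin{aligned}
\min_{a_{s,i,j}} \ & \phi(a) = \sum_{k \in me} \sum_{s \in sp} w_s \left( c_{s,k} - \hat{c}_{s,k} \right)^2 && (5) \\
\text{s.t.} \ & P_{s,i}(x) = \sum_{j \in cp} a_{s,i,j}\, l_j(x), \quad \forall s \in sp,\ \forall i \in fe && (6) \\
& l_j(x) = \frac{\prod_{k=0,\, k \neq j}^{n_k} (x - x_k)}{\prod_{k=0,\, k \neq j}^{n_k} (x_j - x_k)}, \quad \forall j \in cp && (7) \\
& \hat{c}_{s,k} = f_{int}\left( t_k; \{P_{s,i}(x)\} \right), \quad \forall s \in sp,\ \forall k \in me && (8) \\
& P_{s,i}(1) = P_{s,i+1}(0), \quad P'_{s,i}(1) = P'_{s,i+1}(0), \quad P''_{s,i}(1) = P''_{s,i+1}(0), \\
& \qquad \forall s \in sp,\ \forall i = 1, \dots, n_{elem} - 1 && (9) \\
& A \cdot \hat{c} = b_I && (10) \\
& \hat{c}_s \geq 0, \quad \forall s \in sp && (11)
\end{aligned}
$$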
Here equation (7) describes the Lagrange interpolating polynomials, which define the element-wise
polynomials in (6); these are in turn evaluated at the sampling times by the interpolating function in (8).
Equations (9) express the continuity of the solution and of its first two derivatives between consecutive elements. The
formulation (5–11) corresponds to a quadratic program (QP), usually of low dimension, which can
be readily solved. Note that only one approximation problem is constructed to simultaneously
approximate the concentrations of all species in the system. Also, and contrary to the regularization
approaches previously described, the presence of the system (10) in the QP forces the estimates
to be automatically reconciled with the conservation constraints identified a priori for the system,
such as the overall or the elemental mass balances. These constitute significant advantages over
the use of other independent and purely numeric smoothing techniques. Finally, and since only
the concentrations in set sv are required for the deduction of the kinetic model, a reduced version
of problem (5–11) can be solved, by considering just the null-space projection of (10).
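The effect of the conservation constraints (10) on the estimates can be illustrated in isolation. The sketch below is a deliberate simplification: it reconciles a single vector of measurements with one linear invariant by weighted least squares, ignoring the polynomial parametrization (6)-(9) and the bounds (11). The data, weights and the function name `reconcile` are assumptions made for the example.

```python
import numpy as np

def reconcile(c_meas, A, b, w):
    """Minimise sum_s w_s*(c_s - c_hat_s)**2 subject to A @ c_hat = b.

    Closed-form solution of the equality-constrained least-squares problem:
    c_hat = c - W^-1 A^T (A W^-1 A^T)^-1 (A c - b).
    """
    c_meas = np.asarray(c_meas, dtype=float)
    W_inv = np.diag(1.0 / np.asarray(w, dtype=float))
    residual = A @ c_meas - b
    correction = W_inv @ A.T @ np.linalg.solve(A @ W_inv @ A.T, residual)
    return c_meas - correction

# Hypothetical normalised concentrations of three species whose sum should be 1.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c_meas = np.array([0.52, 0.33, 0.18])   # measured values sum to 1.03

c_hat = reconcile(c_meas, A, b, w=[1.0, 1.0, 1.0])
print(c_hat, A @ c_hat)
```

With equal weights the balance error is shared equally among the species; unequal weights would shift the correction towards the species with the smaller weight.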
As formulated above, the QP is not yet able to provide good estimates of the concentration
derivatives, since it contains a large number of degrees of freedom. Consequently, the continuous
profiles will tend to exhibit various oscillations, and the solution will try to interpolate the available
data. Instead of a regularization approach similar to the smoothing spline (4),
additional constraints can be added to the formulation to eliminate these oscillations, and provide
smoother approximations, as desired. These constraints concern the necessary monotonicity
of the solutions and their derivatives in particular finite elements. They can be inferred from a
simple inspection of the approximated profiles, and added interactively to the problem (5–11),
which then needs to be solved again. Alternatively, the qualitative modelling approaches
described earlier can also be used to derive the corresponding constraints (Schaich et al., 2001; Villez
et al., 2013). In a general form, denoting by the superscript ∗ the particular components (c∗) and
finite elements (fe∗ ) where they are applicable, these conditions can be represented as:
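$$
\begin{aligned}
& P'_{s,i}(x_r) \leq 0 \ \lor\ P'_{s,i}(x_r) \geq 0, && \forall i \in fe^*;\ r = 0, 1, 2;\ s \in c^* && (12) \\
& P'_{s,i}(x_r) \leq P'_{s,i}(x_{r+1}) \ \lor\ P'_{s,i}(x_r) \geq P'_{s,i}(x_{r+1}), && \forall i \in fe^*;\ r = 0, 1, 2;\ s \in c^* \\
& P''_{s,i}(x_r) \leq P''_{s,i}(x_{r+1}) \ \lor\ P''_{s,i}(x_r) \geq P''_{s,i}(x_{r+1}), && \forall i \in fe^*;\ r = 0, 1, 2;\ s \in c^* && (13) \\
& P'''_{s,i} \leq P'''_{s,i+1} \ \lor\ P'''_{s,i} \geq P'''_{s,i+1}, && i \in \{1, \dots, n_{elem} - 1\} \cap fe^*;\ s \in c^*
\end{aligned}
$$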
These constraints are also linear in the decision variables as,i, j , due to the form of the interpolat-
ing polynomial (6), and can be enforced at all collocation points. Hence the application of this
methodology does not require the choice of a particular value of a smoothing parameter and, be-
cause it is entirely based on the incorporation of additional observed features in the solution, has
the potential to provide better estimates of its derivatives.
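The linearity of these shape constraints in the coefficients can be checked with a small sketch. It assumes a single cubic element with nodes at the element boundaries and at the two roots of the degree-2 Legendre polynomial shifted to [0, 1], and works in the normalized coordinate x (the derivative in time differs only by the positive factor 1/element length, so signs are unaffected). All function names are hypothetical.

```python
import math

def collocation_nodes():
    """Element boundaries plus the two shifted roots of the degree-2 Legendre polynomial."""
    g = math.sqrt(3.0) / 6.0
    return [0.0, 0.5 - g, 0.5 + g, 1.0]

def basis_derivative(nodes, j, x):
    """First derivative at x of the j-th Lagrange polynomial on the given nodes."""
    total = 0.0
    for m, xm in enumerate(nodes):
        if m == j:
            continue
        term = 1.0 / (nodes[j] - xm)
        for k, xk in enumerate(nodes):
            if k != j and k != m:
                term *= (x - xk) / (nodes[j] - xk)
        total += term
    return total

def derivative_matrix(nodes):
    """D[r][j] = l_j'(x_r), so that P'(x_r) = sum_j D[r][j] * a_j (linear in a)."""
    return [[basis_derivative(nodes, j, xr) for j in range(len(nodes))]
            for xr in nodes]

def is_nonincreasing(nodes, coeffs, tol=1e-12):
    """Check the linear inequalities P'(x_r) <= 0 at every node of the element."""
    D = derivative_matrix(nodes)
    return all(sum(d * a for d, a in zip(row, coeffs)) <= tol for row in D)

nodes = collocation_nodes()
decay = [(1.0 - x) ** 3 for x in nodes]   # nodal values of a decreasing cubic
print(is_nonincreasing(nodes, decay), is_nonincreasing(nodes, nodes))
```

Each row of the matrix returned by `derivative_matrix` gives one linear inequality in the coefficients, which is why constraints of the type (12) keep the problem a QP; the higher-derivative conditions in (13) could be assembled in the same way from the corresponding derivative matrices.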
4. Case studies
Two case studies selected from the literature are used to illustrate the application of this
methodology: the thermal isomerization of α-pinene (Stewart and Sørensen, 1981) and the catalytic
hydrogenation of succinic acid (Deshpande et al., 2002).
In the first case, 5 chemical species are present: α-pinene (AP), limonene (LIM), allo-ocimene (AO),
β-pironene (BP) and a dimer (D). Due to the similarity of all species, and since all are measured,
a global mass balance should be enforced during the estimation of the continuous profiles.
[Figure 1: Concentration profiles for the α-pinene system (solid curves), and finite elements placement. Axes: C/C0_AP [molar ratio] versus t × 10^-3 [min]; legend: AP, LIM, AO, BP, D.]
Figure 1 shows the results obtained, using 4 finite elements. Several of the qualitative constraints
on the solutions are immediately obvious; for instance, the first AP derivative is always negative
and increases monotonically, while the second AP derivative increases monotonically between
consecutive elements. For AO, the first derivative is positive until the end of the first element
and negative between the second and third elements. Additional constraints of the same type were
used for the remaining components.
The second example, relative to the catalytic hydrogenation of succinic acid, involves nine
components (Deshpande et al., 2002). The main compounds are succinic acid (AS), γ-butyrolactone (GBL),
1,4-butanediol (BDO), tetrahydrofuran (THF), butanol (BuOH) and n-propanol (PrOH), while the
remaining ones are methane (CH4), water (H2O) and hydrogen (H2). Prior assumptions relative to the
nature of the reactions in this system indicate that the sum of the concentrations of the C4 and C3
species should be conserved.
[Figure 2: Concentration profiles for the succinic acid system (solid curves). Axes: C/C0_AS [molar ratio] versus t [h]; legend: AS, GBL, BDO, THF, BuOH, PrOH.]
This example is more challenging than the first case, since there is appreciable error in the overall
mass balance closure in these data; the difference is represented by the shadowed curve in the
figure. Despite this difficulty, the smoothing procedure produces very reasonable results. Here the
first derivative of AS is always negative and increases monotonically; the first derivative
of GBL is positive until the end of the third element and negative from the fifth element; and
the first derivative of BDO is positive until the end of the third element and increases monotoni-
cally until the end of the fourth element. Due to the equal weighting factors used, in the regions
where there is more error, the larger errors observed tend to be equidistributed between the species
present, except when the qualitative constraints used in the respective or contiguous elements dis-
allow this behaviour.
5. Final remarks
An alternative procedure for the continuous approximation of sparse kinetic data is described in
this work, relying on orthogonal collocation on finite elements. Qualitative information relative
to the experimental data allows a good approximation of the concentration profiles, without the
need to introduce artificial regularization constraints or parameters. Therefore its use in the
incremental development of kinetic models will be further explored.
Acknowledgments: Financial support from CNPq — Conselho Nacional de Desenvolvimento
Científico e Tecnológico, Brasil, through the scholarship 203592/2014-0, is acknowledged.
References
N. Bhatt, M. Amrhein, D. Bonvin, 2011. Incremental identification of reaction and mass–transfer kinetics using the
concept of extents. Industrial & Engineering Chemistry Research 50 (23), 12960–12974.
L. Biegler, 2010. Nonlinear Programming: Concepts, Algorithms, and Applications to Chemical Processes. Society for
Industrial and Applied Mathematics, Philadelphia, PA.
M. Brendel, D. Bonvin, W. Marquardt, 2006. Incremental identification of kinetic models for homogeneous reaction
systems. Chemical Engineering Science 61 (16), 5404–5420.
J.-Y. Cheung, G. Stephanopoulos, 1990. Representation of process trends — Part I. A formal representation framework.
Computers & Chemical Engineering 14 (4–5), 495–510.
R. Deshpande, V. Buwa, C. Rode, R. Chaudhari, P. Mills, 2002. Tailoring of activity and selectivity using bimetallic
catalyst in hydrogenation of succinic acid. Catalysis Communications 3 (7), 269–274.
B. Finlayson, 1972. The Method of Weighted Residuals and Variational Principles. Academic Press, New York, NY.
C. Huang, 2001. Boundary corrected cubic smoothing splines. Journal of Statistical Computation and Simulation 70 (2),
107–121.
G. Jia, G. Stephanopoulos, R. Gunawan, 2012. Incremental parameter estimation of kinetic metabolic network models.
BMC Systems Biology 6 (1), 142.
J. Madár, J. Abonyi, H. Roubos, F. Szeifert, 2003. Incorporating prior knowledge in a cubic spline approximation:
application to the identification of reaction kinetic models. Industrial & Engineering Chemistry Research 42 (17),
4043–4049.
W. Marquardt, 2005. Model-based experimental analysis of kinetic phenomena in multi-phase reactive systems. Chemi-
cal Engineering Research and Design 83 (6), 561–573.
C. Michalik, M. Brendel, W. Marquardt, 2009. Incremental identification of fluid multi-phase reaction systems. AIChE
Journal 55 (4), 1009–1022.
D. Schaich, R. Becker, R. King, 2001. Qualitative modelling for automatic identification of mathematical models of
chemical reaction systems. Control Engineering Practice 9 (12), 1373–1381.
W. E. Stewart, J. P. Sørensen, 1981. Bayesian estimation of common parameters from multiresponse data with missing
observations. Technometrics 23 (2), 131–141.
J. M. Varah, 1982. A spline least squares method for numerical parameter estimation in differential equations. SIAM
Journal on Scientific and Statistical Computing 3 (1), 28–46.
C. S. Vertis, N. M. Oliveira, F. P. Bernardo, 2015. Systematic development of kinetic models for systems described
by linear reaction schemes. In: K. V. Gernaey, J. K. Huusom, R. Gani (Eds.), 12th International Symposium on
Process Systems Engineering and 25th European Symposium on Computer Aided Process Engineering. Vol. 37 of
Computer Aided Chemical Engineering. Elsevier, pp. 647–652.
K. Villez, V. Venkatasubramanian, R. Rengaswamy, 2013. Generalized shape constrained spline fitting for qualitative
analysis of trends. Computers & Chemical Engineering 58, 116–134.
