Exploring Digital Twins of Nonlinear Systems through Meta-Modeling with Echo State Networks

Laisa Cristina Juffo Campos; Ana Carolina Spindola Rangel Dias; Wellington Betencurte da Silva; Julio Cesar Sampaio Dutra

Latin-American Journal of Computing

Escuela Politécnica Nacional, Ecuador

ISSN: 1390-9266

ISSN-e: 1390-9134

Periodicity: Semestral

vol. 11, no. 2, 2024

lajc@epn.edu.ec

Received: 08 March 2024

Accepted: 08 May 2024

URL: https://portal.amelica.org/ameli/journal/602/6025436007/

This work is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

Abstract: Effective process monitoring and control rely on precise dynamic models that can capture the inherent nonlinearities of chemical systems. However, rigorous modeling of complex industrial processes can be computationally demanding. Meta-modeling using machine learning methodologies offers a viable approach to generate computationally efficient surrogate representations. Specifically, Echo State Networks (ESNs) are a promising neural network approach for meta-modeling nonlinear dynamical systems. ESNs simplify training by keeping the input weights fixed and focusing learning on the output weights. This study explores the development of ESN-based digital twins for a nonlinear dynamic process. An ESN is employed to construct a meta-model of a simulated continuously stirred tank reactor with biochemical kinetics. The network was trained on input-output data obtained from the simulation of an ordinary differential equation system, and the performance was evaluated both in-sample and out-of-sample. The results indicate that the ESN meta-model can successfully approximate the underlying dynamics, accurately capturing temporal evolution. A closed-loop digital twin deployment using the ESN surrogate also showed reliable behavior. This work presents initial steps toward developing digital twins of chemical processes using ESN-driven meta-modeling. The findings suggest ESNs can effectively generate computationally efficient surrogate representations of nonlinear dynamical systems. Such digital twins hold promise for online process monitoring and optimized control of industrial plants.

Keywords: Echo State Networks, Dynamic systems, Digital twins.

I. Introduction

In recent years, rapid technological progress has brought substantial improvements across diverse sectors, notably in the quality and safety of chemical processes. The ubiquitous incorporation of computers into process management has enabled control over variables such as temperature, pressure, and chemical composition, thereby generating extensive and diverse data archives [1]. Design challenges that demand intensive computational resources are increasingly prevalent in manufacturing industries [2]. Moreover, creating tools capable of analyzing data and constructing predictive mathematical models has become imperative for real-time process monitoring and control.

Creating rigorous models that accurately capture the dynamics and nonlinearity of real systems may be impractical at plant sites, where rapid responses are crucial. One practical approach is to utilize metamodeling strategies [2][3] to tackle the challenges inherent in process systems. Widely utilized across engineering, computer science, and optimization, these strategies involve developing simplified models that approximate the behavior of complex systems or processes [4]. These simplified representations, named meta-models or surrogate models, aim to balance accuracy and computational efficiency.

In this context, digital twins emerge as virtual representations capable of reflecting the behavior of physical systems in real time, which shows potential for online monitoring and process optimization [5]. By generating simplified yet computationally efficient models, digital twins enable dynamic data analytics and rapid decision-making to optimize industrial plant control and performance.

Expanding on recent data science research, metamodeling can draw upon various machine learning techniques [2][6]. Artificial Neural Networks (ANNs) are widely recognized for their ability to approximate complex functions [7]. Modeled after the functioning mechanism of biological neurons, ANNs comprise an input layer, a hidden layer housing artificial neurons in quantities necessary to represent the data, and an output layer. Additionally, ANNs possess memory storage and learning capabilities, making them particularly suitable for dynamic and nonlinear systems. This work investigates precisely this characteristic by applying neural meta-models to generate digital twins of complex chemical processes [8][9]. The aim is to develop computationally efficient representations that approximately capture the underlying dynamics of these systems.

Depending on the network architecture, various types of neural networks exist, including Feedforward Neural Networks (FNNs) and Recurrent Neural Networks (RNNs). RNNs offer computational advantages for dynamic process systems owing to their inherent feedback loops. However, training traditional RNNs can be complicated due to issues like the "vanishing gradient" problem [10]. To address this, [11] introduced the Echo State Network (ESN). Unlike traditional RNNs that adjust all synaptic weights, ESNs maintain fixed input and recurrent connections, focusing solely on training output connections through a relatively simple linear regression process. This approach circumvents the complexities of training recurrent connections and mitigates gradient-related challenges. Consequently, ESNs present an effective solution for harnessing the power of RNNs while mitigating training complexities, particularly in scenarios where efficient learning is essential.

This article proposes using an Echo State Network as a meta-model to approximate dynamic nonlinear models and evaluates its performance in a closed-loop application. The potential of this approach is assessed by analyzing its performance in modeling a CSTR through the construction of a digital twin. Section II presents a brief background on the metamodeling problem, elaborates a case study based on a simulated bioreactor, and details the data acquisition procedure. The theory, rationale, and construction of the Echo State Network are described in Section III. Section IV covers controller tuning and the closed-loop test, followed by the discussion of simulation results in Section V.

The contribution of this article lies in presenting initial steps towards developing digital twins of chemical processes using ESN-driven meta-modeling. By demonstrating the efficacy of ESNs in generating computationally efficient surrogate representations of a classical nonlinear dynamical system, this work opens the way for online process monitoring and optimized control of industrial plants.

II. The Metamodeling Problem

A meta-model (or surrogate model) can be conceived as a "model of a model" [6], functioning as a simplified representation of a high-fidelity simulation model [12]. It emulates the response by delineating the relationship between inputs and outputs based on data acquired with known precision or uncertainty [13]. The importance of metamodeling lies in its ability to balance accuracy and computational efficiency. Hence, metamodeling emerges as an essential approach to navigating real-world system intricacies, especially those characterized by nonlinear relationships, numerous variables, and complex behaviors.

In industrial settings, meta-models are employed for tasks which necessitate the establishment of a (complex) relationship between the inputs and outputs of a process system. This relationship can be encapsulated by an extended meta-model equation that incorporates the feedback signal (1):

$y(k) = f\big(u(k),\, y(k-1)\big) + \varepsilon(k)$ (1)

where $y(k)$ represents the current output, $u(k)$ denotes the current inputs, $y(k-1)$ is the previous output (feedback signal), $f(\cdot)$ is the relationship that incorporates inputs and feedback, and $\varepsilon(k)$ represents the error or uncertainty in the meta-model prediction.

By offering a simplified representation of burdensome simulations, meta-models facilitate quicker evaluations and decision-making, which are crucial aspects in industries that demand real-time solutions. This approach makes it possible to study complex systems without the need for resource-intensive full-scale simulations, which can be computationally demanding and time-consuming. Some commonly used metamodeling techniques encompass polynomial response surface models, Kriging, Radial Basis Functions, Support Vector Regression, and Artificial Neural Networks [13][14]. These techniques generate approximated mappings from inputs to outputs. The choice depends on problem characteristics, available data, and required predictions.

Metamodeling using neural networks adopts a data-driven approach that harnesses the principles of ANNs to construct efficient approximations of complex systems. This methodology entails training the neural network on a dataset that reflects the behavior of the system under scrutiny. This dataset consists of input variables paired with corresponding output values, which allows the network to identify underlying patterns and correlations. Following training, the neural network can provide predictions for new input data, which substantially alleviates the computational burden compared to resource-intensive full-scale simulations.

The increased processing speed has dramatically expanded the applicability of neural network-based metamodeling. For example, [15] employed a neural network as a meta-model to approximate a copper porphyry mine comminution circuit, which led to a significant acceleration of simulations compared to traditional phenomenological models. Additionally, [16] utilized neural networks in the metamodeling of reactive transport, which reduced the computational time for scenarios requiring multiple realizations. These studies highlight the versatility of neural network-based metamodeling in improving efficiency, accuracy, and computational performance across various domains.

Modeling and Data Generation

The mathematical model employed to generate the data was adapted from [17] and describes the dynamic behavior of a bioreactor. The substrate balance, 𝑆, and the cell balance, 𝑋, are expressed by (2) and (3), respectively, while the reaction rate, 𝜇(𝑆), is defined by (4). Here, 𝐷 is the dilution rate, which represents the ratio between the volumetric feed flow rate and the reactor volume, and S_f stands for the substrate feed concentration.

(2)

(3)

(4)

All code implementations were developed in Python, with the free Spyder development environment (version 3.9.16). The code was executed on a computer system featuring 128 GB of DDR4 RAM and an Intel® Core i7-12700K processor operating at 5.00 GHz.

This specific case study adopted a supervised training strategy to construct the neural model. This approach required the generation of input and output data. The input data was synthesized by a Random Gaussian Signal (RGS) algorithm [18]. The RGS technique is widely utilized for dynamic systems identification, which enables a thorough exploration of the input space. Consequently, it effectively stimulates the process response across diverse conditions.

The input variables were the dilution rate and the substrate feed concentration, with mean values of 0.1 h⁻¹ and 10.0 g L⁻¹ and variations of ± 0.1 h⁻¹ and ± 2.5 g L⁻¹, respectively. A total of 2500 samples were generated and collected at intervals of 0.25 h. To generate the second dataset, the sampling interval was changed to 8 h while the other parameters were kept constant. The output data, represented by 𝑆 and 𝑋, were obtained by solving the system of ordinary differential equations outlined in (2) and (3) with the solve_ivp function from the scipy.integrate library. Gaussian random noise with a standard deviation of 5% was added to the simulated results to make the output data more realistic. The noise also pushes the meta-model to discover the underlying patterns, which enhances its robustness against noise and variability when it is transferred to actual operation. Subsequently, all datasets were organized and stored in a spreadsheet.
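The data-generation procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code. Since the kinetic expression (4) and its parameters are not reproduced here, the sketch assumes Monod kinetics with hypothetical values for MU_MAX, K_M, and YIELD. The helper piecewise_gaussian is a simple stand-in for the RGS algorithm of [18], the initial state is illustrative, and the 5% noise is interpreted as multiplicative relative noise.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical kinetic parameters (NOT from the paper); Monod kinetics is assumed.
MU_MAX, K_M, YIELD = 0.2, 1.0, 0.5


def mu(S):
    """Assumed Monod specific growth rate (1/h)."""
    return MU_MAX * S / (K_M + S)


def bioreactor_rhs(t, state, D, S_f):
    """Substrate (S) and cell (X) balances of a continuous bioreactor."""
    S, X = state
    dS = D * (S_f - S) - mu(S) * X / YIELD
    dX = (mu(S) - D) * X
    return [dS, dX]


def piecewise_gaussian(n, mean, spread, rng, hold=10):
    """Stand-in for an RGS input: Gaussian levels held for `hold` samples
    and clipped to mean +/- spread (an assumption of this sketch)."""
    levels = rng.normal(mean, spread / 3.0, size=int(np.ceil(n / hold)))
    levels = np.clip(levels, mean - spread, mean + spread)
    return np.repeat(levels, hold)[:n]


def simulate(n_samples=200, dt=0.25, noise_std=0.05, seed=0):
    """Integrate the model sample by sample with piecewise-constant inputs."""
    rng = np.random.default_rng(seed)
    D = piecewise_gaussian(n_samples, 0.1, 0.1, rng)      # dilution rate, 1/h
    S_f = piecewise_gaussian(n_samples, 10.0, 2.5, rng)   # feed concentration, g/L
    state = np.array([1.0, 4.5])                          # illustrative initial [S, X]
    clean = np.empty((n_samples, 2))
    for k in range(n_samples):
        sol = solve_ivp(bioreactor_rhs, (0.0, dt), state, args=(D[k], S_f[k]))
        state = sol.y[:, -1]
        clean[k] = state
    # Relative Gaussian noise (one possible reading of "5% standard deviation").
    noisy = clean * (1.0 + noise_std * rng.standard_normal(clean.shape))
    return D, S_f, clean, noisy
```

Calling simulate(n_samples=2500, dt=0.25) would mimic the size of the first dataset, and dt=8.0 the sampling interval of the second.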

The generated data are shown in Figs. 1-4, which illustrate the data obtained with higher (Figs. 1-2) and lower frequency (Figs. 3-4). The red data points indicate outputs with added measurement noise, which was introduced to better approximate reality and attenuate potential overfitting.

Fig. 1.
Input data for the first dataset

Fig. 2.
Output data for the first dataset

Fig. 3.
Input data for the second dataset

Fig. 4.
Output data for the second dataset

III. Echo State Network

Acknowledging the potential of RNNs, [8] introduced a neural network architecture called the Echo State Network (ESN). The primary aim of this architecture is to retain the ability of RNNs to address complex problems while simplifying the learning process. In the conventional training of ANNs, the adjustment of synaptic weights across input, output, and feedback layers can impose substantial computational demands. However, Jaeger's network design focuses solely on training output weights, accomplished through a relatively straightforward linear regression process. This approach offers significant advantages in computational efficiency and avoids the intricate task of fine-tuning complex feedback loops.

The ESN remarkably simplifies the training process by compartmentalizing the learning process into distinct stages - initially training output weights while keeping other weights fixed. This streamlined approach enhances computational efficiency and facilitates faster convergence during the training phase. Furthermore, the methodology unlocks potential applications in scenarios where efficient learning is paramount. The innovative design of the ESN offers a promising pathway to address challenges related to training complexity, which makes it well-suited for scenarios demanding both computational efficiency and enhanced learning performance.

In this implementation, the ESN network algorithm was coded following the equations outlined by [8], with specific hyperparameters maintained at fixed values (Table I). These predetermined values were determined empirically. An optimization method was utilized and implemented through Python programming to identify the optimal hyperparameters - neuron count, sparsity, and leaking rate. Following this, the resulting network was validated using the fine-tuned hyperparameters.
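To make the training scheme concrete, the sketch below shows a generic leaky-integrator ESN with a ridge-regression readout in NumPy. It is an illustration under assumptions rather than the authors' implementation: the uniform weight ranges, the bias input, the washout length, and the reading of "sparsity" as the fraction of zeroed reservoir connections are all choices made for this example. The default spectral radius and ridge value echo Table I, while the reservoir is kept small for speed. Output feedback can be emulated by appending the previous output as an extra input column.

```python
import numpy as np


class EchoStateNetwork:
    """Minimal leaky-integrator ESN; only the readout W_out is trained."""

    def __init__(self, n_inputs, n_reservoir=100, spectral_radius=0.7,
                 sparsity=0.35, leaking_rate=0.7, ridge=4e-4, seed=0):
        rng = np.random.default_rng(seed)
        self.leak = leaking_rate
        self.ridge = ridge
        # Fixed random input weights (first column multiplies a bias of 1).
        self.W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_inputs + 1))
        # Fixed random reservoir with a fraction `sparsity` of zeroed weights.
        W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
        W[rng.random(W.shape) < sparsity] = 0.0
        # Rescale so the largest eigenvalue magnitude equals spectral_radius.
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W = W
        self.W_out = None

    def _states(self, U):
        """Run the reservoir over inputs U (n_steps x n_inputs)."""
        x = np.zeros(self.W.shape[0])
        rows = []
        for u in U:
            u1 = np.concatenate(([1.0], u))
            pre = self.W_in @ u1 + self.W @ x
            x = (1.0 - self.leak) * x + self.leak * np.tanh(pre)
            rows.append(np.concatenate((u1, x)))
        return np.array(rows)

    def fit(self, U, Y, washout=20):
        """Ridge regression of targets Y on the collected states."""
        H = self._states(U)[washout:]
        T = np.asarray(Y)[washout:]
        A = H.T @ H + self.ridge * np.eye(H.shape[1])
        self.W_out = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, U):
        """Return readout predictions for inputs U, starting from a zero state."""
        return self._states(U) @ self.W_out
```

Because the input and reservoir matrices are never updated, training reduces to a single linear solve, which is the source of the computational savings discussed above.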

TABLE I.
Network Hyperparameters

IV. Controller tuning and closed-loop

Another test was applied to evaluate the performance in a closed-loop simulation, which allows the feasibility of applying the trained network as a meta-model (that is, the digital twin) to be assessed. The control objective was to maintain the cell concentration (X) around desired values, with the substrate concentration in the feed (Sf) as the disturbance and the dilution rate (D) as the manipulated variable. For this purpose, we used a PI controller with the velocity algorithm.
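For reference, the velocity (incremental) form of a discrete PI controller computes the change in the manipulated variable, Δu(k) = K_C [e(k) − e(k−1) + (T_s/τ_I) e(k)], and adds it to the previous control action. The sketch below is a generic version of this algorithm. The class name, the optional saturation limits, and the sampling time T_s are assumptions of this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class VelocityPI:
    """Discrete PI controller in velocity form (illustrative sketch)."""
    Kc: float                      # controller gain
    tau_I: float                   # integral time, same time unit as Ts
    Ts: float                      # sampling time
    u: float = 0.0                 # current control action (initial value)
    e_prev: float = 0.0            # error at the previous sample
    u_min: float = float("-inf")   # optional actuator limits
    u_max: float = float("inf")

    def update(self, setpoint, measurement):
        """Return the new control action for the current sample."""
        e = setpoint - measurement
        du = self.Kc * ((e - self.e_prev) + (self.Ts / self.tau_I) * e)
        # Clamping the accumulated action keeps the controller inside its limits.
        self.u = min(max(self.u + du, self.u_min), self.u_max)
        self.e_prev = e
        return self.u
```

With the manual tuning of Table III, one would instantiate, for example, VelocityPI(Kc=-0.06123, tau_I=6.7, Ts=0.25, u=0.1, u_min=0.0), where the initial action and the lower limit are illustrative values for the dilution rate.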

A transfer function of the reactor dynamics was obtained to tune the controller, with a step test of -5% on D, performed on the differential model from its initial conditions. The steady-state response obtained was Xs = 4.5 g L⁻¹ and Ss = 1.0 g L⁻¹. With the approach of [19], it was possible to approximate the process with a first-order plus dead time (FOPDT) system. Fig. 5 comparatively illustrates the original process (differential model), represented by red points, and the approximated process. The parameters obtained through such an approach are shown in Table II.
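The FOPDT approximation has the closed-form step response Δy(t) = K_P Δu [1 − exp(−(t − θ)/τ)] for t ≥ θ and zero before the dead time, expressed in deviation from the initial steady state. The short sketch below evaluates this response. The function name is hypothetical, and the step size used in the usage note (5% of a nominal dilution rate of 0.1 h⁻¹) is an assumption about how the step test was scaled.

```python
import numpy as np


def fopdt_step(t, Kp, tau, theta, du):
    """Deviation response of a first-order plus dead time model to a step du.

    t may be a scalar or an array of times; the output is zero until the
    dead time theta has elapsed and then rises exponentially to Kp * du.
    """
    t = np.asarray(t, dtype=float)
    elapsed = np.clip(t - theta, 0.0, None)   # time since the dead time ended
    return Kp * du * (1.0 - np.exp(-elapsed / tau))
```

For instance, fopdt_step(np.linspace(0, 8, 200), Kp=-6.6642, tau=1.005, theta=0.07, du=-0.005) uses the parameters of Table II and gives a curve that could be overlaid on the simulated response for comparison.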

Fig. 5.
Process simulation and obtained model

TABLE II.
Process parameters

After conducting tests on different controllers, three tuning techniques were applied: Internal Model Control (IMC), Integral of Time multiplied by Absolute Error for servo test (ITAE), and manual fine-tuning [17]. The parameters for each tuning technique are described in Table III. It was concluded that the manually tuned controller was the best choice for this study, even though it was a more conservative option. The manually tuned controller yielded a favorable result of less oscillation in the manipulated variable during closed-loop tests. Additionally, it demonstrated a slight difference in response time compared to the other controllers examined. The gain margin of the manually fine-tuned controller was 56.8437, which is significantly higher than the gain margins of the IMC (22.9541) and ITAE-servo test (3.0869) methods. This result suggests that the manually fine-tuned controller is more robust than the other methods. As a result, the manually fine-tuned controller was chosen due to its quick, highly stable, and oscillation-free response.

The results of the closed-loop simulation using the selected controller are presented in Figs. 6-7. Fig. 6 illustrates the behavior of the manipulated and disturbance variables, while Fig. 7 depicts the controlled variable with its setpoint, along with the other output.

TABLE III.
Tuning Methods and Controller Parameters

Fig. 6.
Inputs of the closed-loop

To assess the neural network efficacy in accurately representing the behavior of the simulated system, as required for a digital twin, its response was evaluated within a closed-loop control framework. Within this framework, the control actions computed for the original process (based on the differential model) with the tuned proportional-integral (PI) controller were integrated as one of the network's inputs. Moreover, these inputs encompassed process disturbance information and a feedback signal generated by the network predictions rather than simulated measurements from the differential model simulation. Consequently, the neural network can autonomously adapt over time, dynamically responding to the evolving process inputs.

Fig. 7.
Outputs of the closed-loop

V. Results

After fine-tuning the hyperparameters, the network performance was evaluated on both datasets. The higher-frequency dataset was used to assess the network's predictive capacity. The neural network fitted the training data well and accurately predicted the test data, capturing the underlying patterns and relationships in the dataset (Fig. 8). This result indicates that the model can generalize from the training examples to unseen data and has captured the system dynamics.

An autocorrelation analysis of the training modeling errors (residual) indicated significant autocorrelation only at lag = 0, resembling a Dirac delta function (Fig. 9), which confirms that the residual distribution follows a white noise correlogram pattern. We can see this result as an indication of the absence of systematic errors or patterns in the model predictions. Additionally, a white noise correlogram pattern suggests that the model has effectively captured all relevant information from the data, and the predictions are based on genuine signals rather than noise.
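A residual whiteness check of this kind can be sketched as follows. The code computes the sample autocorrelation of the residuals and compares the nonzero lags with the usual approximate 95% band of ±1.96/√N. The function names, the number of lags, and the tolerated fraction of lags outside the band are assumptions of this example and are not reported in the paper.

```python
import numpy as np


def autocorrelation(residuals, max_lag=40):
    """Sample autocorrelation of a 1-D residual series for lags 0..max_lag."""
    r = np.asarray(residuals, dtype=float)
    r = r - r.mean()
    denom = np.dot(r, r)
    return np.array([np.dot(r[:r.size - k], r[k:]) / denom
                     for k in range(max_lag + 1)])


def looks_like_white_noise(residuals, max_lag=40, tolerance=0.10):
    """True if at most `tolerance` of the nonzero lags leave the 95% band."""
    acf = autocorrelation(residuals, max_lag)
    bound = 1.96 / np.sqrt(len(residuals))
    outside = np.mean(np.abs(acf[1:]) > bound)
    return bool(outside <= tolerance)
```

A correlogram that equals one at lag 0 and stays inside the band elsewhere corresponds to the Dirac-like pattern described above, whereas slowly decaying values would point to dynamics the model has missed.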

The following run evaluates the adaptability of the pre-trained network to a distinct scenario (second dataset), as illustrated in Figs. 10-11. As can be seen, the successful prediction of the second test dataset resulted in a residual distribution that also adheres to a white noise correlogram pattern. Remarkably, despite being trained with higher-frequency data, the model's ability to accurately represent lower-frequency data underscores its robustness and versatility in capturing the system dynamics across different temporal scales.

In the closed-loop control scenario, the neural network functioned autonomously, providing its feedback signal based on the predicted outputs. However, Fig. 12 reveals a systematic deviation between the predicted and actual responses, likely stemming from the absence of feedback control dynamical effects in the training data. This discrepancy highlights the challenge of accurately capturing real-time system behavior under closed-loop control conditions.

Fig. 8.
Network performance for the first dataset

Fig. 9.
Network residual analysis for the training of the first run

A bias term $b(k)$ was introduced to mitigate this issue. It represents the disparity between the simulated process measurements, $y_m(k)$, and the predicted outputs, $\hat{y}(k)$, that is, $b(k) = y_m(k) - \hat{y}(k)$. Correcting the predicted outputs as $\hat{y}(k) + b(k-1)$, with $b(0) = 0$, yielded a maximum relative error of just 1.1%, compared to the 2.7% observed without bias. The predictions without and with bias correction are presented in Figs. 12 and 13, respectively.
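The bias correction can be illustrated with a few lines of NumPy. The sketch assumes that the bias is the difference between the measurement and the raw prediction at the previous sample, as described above, and that the first sample is left uncorrected. The function names are hypothetical.

```python
import numpy as np


def bias_corrected(y_hat, y_meas):
    """Add the previous-sample bias b(k-1) = y_m(k-1) - y_hat(k-1) to y_hat(k)."""
    y_hat = np.asarray(y_hat, dtype=float)
    y_meas = np.asarray(y_meas, dtype=float)
    b = y_meas - y_hat              # bias available after each measurement
    corrected = y_hat.copy()
    corrected[1:] += b[:-1]         # first sample uses a zero initial bias
    return corrected


def max_relative_error(y_meas, y_pred):
    """Largest absolute error relative to the measured value, in percent."""
    y_meas = np.asarray(y_meas, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.max(np.abs(y_meas - y_pred) / np.abs(y_meas))
```

A constant offset between prediction and measurement is removed entirely after the first sample, while a slowly drifting offset is reduced to its one-step change.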

Detailed performance metrics for the training, testing, and closed-loop application phases are provided in Table IV. The findings demonstrate the strong predictive capabilities of the network, which forecasts the output data well despite being trained on a comparatively small fraction of the dataset (a train fraction of 0.35, Table I), in contrast with the higher training percentages commonly used in the literature. Notably, the network captured the output dynamics in the first dataset with high precision. Furthermore, the successful modeling of a scenario with lower variability in the second dataset suggests its versatility and robustness. Thus, it is reasonable to infer that the acquired meta-model fits both scenarios. Moreover, the closed-loop results showcase the neural network's potential as a virtual representation that reflects real-time process responses, thereby mimicking real-world scenarios with fidelity.
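The three metrics reported in Table IV have standard definitions, sketched below for a single output. This is a generic illustration, and the paper does not state how the metrics were aggregated over the two outputs. One point worth noting is that the explained variance ignores a constant offset in the errors, whereas R² penalizes it, so the two values separate when a systematic bias is present.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, R2 and explained variance for one output variable."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    exp_var = 1.0 - np.var(err) / np.var(y_true)
    return {"MSE": mse, "R2": r2, "ExpVar": exp_var}
```

For a prediction that is shifted by a constant, ExpVar stays at 1 while R² drops, which is qualitatively the pattern seen in the closed-loop columns of Table IV before and after bias correction.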

TABLE IV.
Network performance metrics

Fig. 10.
Network performance for the second dataset

Fig. 11.
Network residual analysis for the training of the second run

Fig. 12.
Network performance for the closed-loop, without bias

Fig. 13.
Network performance for the closed-loop, with bias

VI. Conclusion

This study employed an Echo State Network (ESN) as a meta-model to tackle the complexities of a classical nonlinear bioreactor. Unlike traditional Recurrent Neural Networks, ESNs simplify learning by maintaining fixed input and recurrent connections, while training only output connections through linear regression. This approach mitigates the challenges associated with training recurrent connections.

The outcomes of our study showcase the robust predictive capabilities of the ESN, which handled noisy data and limited samples across a broad spectrum of oscillations. These results underscore the ESN's adaptability to the diverse scenarios commonly encountered in industrial contexts. The results of the closed-loop test support the efficacy of ESNs, with maximum errors below 3%. This underscores the potential for further exploration of ESN applications in constructing digital twins, which represents a shift from traditional models towards real-time control and monitoring contexts.

Moreover, the findings support the practical and effective utility of the ESN for metamodeling in industrial processes. The versatility and potential integration of ESNs into process control and monitoring practices facilitate precise simulations and streamline optimization procedures, thereby enhancing the efficiency and effectiveness of industrial processes. However, it is essential to acknowledge the ongoing need to evaluate and discuss alternative strategies for enhancing the network's predictive accuracy, given the inherent complexity of industrial process control. Continued research in this area promises further advancements in ESN applications, driving innovation and optimization within industrial processes.

Acknowledgments

This study was funded in part by the Fundação de Amparo à Pesquisa e Inovação do Espírito Santo – FAPES.

The authors also acknowledge the financial support from the CNPq and FAPERJ funding agencies.

References

[1] A. J. Silva Neto and J. C. Becceneri, “Técnicas de inteligência computacional inspiradas na natureza: Aplicação em problemas inversos em transferência radiativa,” 2009.

[2] C. P. Naveira-Cotta et al., “Eigenfunction expansions for transient diffusion in heterogeneous media,” International Journal of Heat and Mass Transfer, vol. 52, no. 21-22, pp. 5029–5039, 2009.

[3] D. C. Knupp, “Integral transform technique for the direct identification of thermal conductivity and thermal capacity in heterogeneous media,” International Journal of Heat and Mass Transfer, 2021.

[4] F. P. Incropera et al., Fundamentals of Heat and Mass Transfer, vol. 6, New York, Wiley, 1996.

[5] F. S. Mascouto et al., “Detection of contact failures employing combination of integral transforms with single-domain formulation, finite differences, and Bayesian inference,” Numerical Heat Transfer, Part A: Applications, 2020.

[6] J. Beck and S.-K. Au, “Bayesian updating of structural models and reliability using Markov chain Monte Carlo simulation,” Journal of Engineering Mechanics, vol. 128, no. 4, pp. 380–391, 2002.

[7] J. Ching and J. S. Wang, “Application of the transitional Markov chain Monte Carlo algorithm to probabilistic site characterization,” Engineering Geology, vol. 203, pp. 151–167, 2016.

[8] J. Ching and Y. C. Chen, “Transitional Markov chain Monte Carlo method for Bayesian model updating, model class selection, and model averaging,” Journal of Engineering Mechanics, vol. 133, no. 7, pp. 816–832, 2007.

[9] J. P. Kaipio and C. Fox, “The Bayesian framework for inverse problems in heat transfer,” Heat Transfer Engineering, vol. 32, no. 9, pp. 718–753, 2011.

[10] L. A. Da Silva Abreu et al., “Estimativa do perfil de temperatura na entrada de dutos via Método de Monte Carlo com Cadeias de Markov,” Revista Cereus, vol. 14, no. 4, pp. 129–143, 2022.

[11] M. N. Özışık and H. R. Orlande, Inverse Heat Transfer: Fundamentals and Applications, 2021.

[12] P. Gardner, C. Lord, and R. J. Barthorpe, “A unifying framework for probabilistic validation metrics,” Journal of Verification, Validation and Uncertainty Quantification, vol. 4, no. 3, 031005, 2019.

[13] W. Betz, I. Papaioannou, and D. Straub, “Transitional Markov chain Monte Carlo: observations and improvements,” Journal of Engineering Mechanics, vol. 142, no. 5, 04016016, 2016.

Hyperparameter	Value
Reservoir size	1222
Leaking rate	0.6964
Sparsity	0.3536
Spectral radius	0.70
Train fraction	0.35
Ridge	4E-4
Noise level	1E-5
Random seed	13042023

Parameter	Value
K_P (g L⁻¹ h)	-6.6642
θ (h)	0.0700
𝜏 (h)	1.0050

Parameter	Tuning method: IMC	Tuning method: ITAE (servo test)	Tuning method: Manual
K_C (L g⁻¹ h⁻¹)	-0.15164	-1.12761	-0.06123
𝜏_I (h)	1.00500	1.00752	6.70000

Metric	Dataset 1 (training)	Dataset 1 (test)	Dataset 2 (test)	Closed loop (without bias)	Closed loop (with bias)
R²	0.9790	0.9490	0.9812	0.9930	0.9996
MSE	2.6676E-02	3.14347E-02	4.6535E-02	0.0007	0.0001
ExpVar	0.9790	0.9491	0.9812	0.9979	0.9998