Determination of relaxation modulus of time-dependent materials using neural networks

Health monitoring systems for plastic-based structures require real-time tracking of changes in the time-dependent behavior of polymer-based structures. This paper proposes artificial neural networks as a tool for solving the inverse problem arising in time-dependent material characterization, since conventional methods are computationally demanding and cannot operate in real time. The abilities of a Multilayer Perceptron (MLP) and a Radial Basis Function Neural Network (RBFN) to solve ill-posed inverse problems are investigated on the example of determining a segment of the time-dependent relaxation modulus curve from constant strain rate tensile test data. The required modeling data, composed of strain rate, tensile stress, and the related relaxation modulus, were generated using an existing closed-form solution. Several neural network topologies were tested with respect to the structure of the input data, and their performance was compared to an exponential fitting technique. Selected optimal topologies of the MLP and RBFN were tested for generalization and robustness on noisy data; the performance of all modeling methods with respect to the number of data points in the input vector was analyzed as well. It is shown that MLP and RBFN are capable of solving inverse problems related to the determination of a time-dependent relaxation modulus curve segment. Particular topologies demonstrate good generalization and robustness capabilities, with the RBFN topology fed with data in parallel proving superior to the other methods.


Introduction
Plastics and plastic-based composites are slowly replacing metals in the automotive and aeronautical industries, mainly due to their more favorable strength-to-weight ratio. However, despite all the advantages of plastics, their use in highly demanding engineering applications on which human lives depend requires exact predictions of the durability and lifespan of polymeric structures. Unfortunately, standardized procedures for this do not yet exist.
The durability of structures made of elastic materials, such as metals, can be controlled by commercially available health monitoring systems. In the case of viscoelastic materials, however, including plastics and polymers, their time-dependent properties and the related effects on durability should be taken into account. In order to detect changes in material behavior, material transfer functions should be tracked and calculated based on the response of a structure to external excitations. This means that a method used for health monitoring of plastic structures should be able to comprehend the time-dependent material transfer functions that affect structural responses.
One of the most important material transfer functions requiring monitoring is the time-dependent relaxation modulus. This transfer function describes the process of relaxation, which appears as a decrease of stress under constant deformation and can be detected as a softening of the material. The relaxation modulus, jointly with the geometry of a construction, determines its stiffness and strength and therefore should be known for construction purposes. Typically, the relaxation modulus is determined by tensile tests (ISO 527-1 2012); however, the standardized tests do not provide information on the time-dependency of the material behavior. Measurements of the time-dependent relaxation modulus are not standardized and require either a very long time or, according to the principle of time-temperature superposition (Ferry 1980), measurements at different temperatures. The second approach is the most widely used; however, its drawback is that for each measurement (segment) at a certain temperature an inverse problem of obtaining the relaxation modulus from measured stress and applied strain data has to be solved. This problem has an analytical solution for standard excitations (step and sine), while for non-standard excitations, or in the presence of noise in the measured signal, it turns into an ill-posed inverse problem. Such problems can be solved only numerically by time-demanding mathematical techniques, e.g., exponential fitting (Saprunov et al. 2014) or regularization (Tikhonov and Arsenin 1977). The above-mentioned methods require explicit information on the geometry of the tested element, together with information on its excitation and response, to determine the underlying material properties. Regularization methods compute the solution pointwise, and their application to complex geometries requires implementation in finite element codes; therefore, they are not appropriate for real-time monitoring of complex structures and systems, whereas neural networks do not require explicit information on the structure geometry. Additionally, artificial neural networks (NNs) have proven to be a suitable tool for solving inverse problems and for real-time applications (Xiao et al. 2006). Once trained, NNs deliver results fast, they are capable of parallel calculation by their nature, and they are able to generalize and to process noisy data. It should be mentioned that the advantages brought by neural networks are accompanied by certain challenges, such as the choice of the neural network type and topology, the determination of training parameters to avoid local minima, and the choice of training data. Therefore, this paper proposes artificial neural networks for obtaining a segment of the relaxation modulus curve from the tensile data of a constant strain rate experiment. This initial step is essential for the further application of NNs as a tool for health monitoring of polymeric structures in automotive, railway, or aeronautical applications. A neural network used for this purpose should be capable of solving the inverse problem of obtaining time-dependent material properties in order to qualitatively track changes caused by the viscoelastic nature of the materials from which the structure is built. Of course, obtaining time-dependent material functions as a primary task is not the main purpose of NNs, because many closed-form solutions are available; see Saprunov et al. (2014) and references therein.
As an initial step we propose to apply the Multilayer Perceptron (MLP) neural network with a sigmoidal activation function and the Radial Basis Function Network (RBFN) with a Gaussian activation function for solving the inverse problem arising in the characterization of the time-dependent properties of viscoelastic materials. As a reference method, nonlinear exponential parametric regression is used. Static neural networks are considered an optimal tool for the approximation of time- and space-dependent functions. They can also be used for the approximation of non-stationary data, provided the training data contain the non-stationarity, which in the current work is introduced by different materials.
Two different topology types with respect to the structure of the input data were tested for the MLP and RBFN. The optimal number of neurons in the hidden layers was chosen according to an optimization criterion. For investigating the NNs' performance we used an example for which the closed-form solution is known (Saprunov et al. 2014), so that the non-stationary NN training data could be generated numerically. In order to check the generalization capabilities and robustness of the networks, the validation of NN performance was done on a noisy set of data that were not used for training.
With the aim of presenting the capabilities of the NNs to estimate the relaxation modulus from measured stress and applied strain data, the following sections first present the problem statement in detail. Then the process of NN implementation, starting with the training data generation using a closed-form solution, is followed by a description of the methodology of the optimal topology choice and its validation. The results and discussion section presents the generalization and robustness capabilities of the obtained networks in comparison to the exponential fitting numerical technique.

Stress, strain, and relaxation modulus
The following section presents the constitutive relation between strain and stress within the relaxation process, the problem related to the determination of the relaxation modulus, and the process of data generation.

Theory and problem
The constitutive relation between the strain excitation ε(t) and the stress response σ(t) in the relaxation process of a time-dependent material under a uniaxial stress state is given as

σ(t) = ∫_0^t E(t − ξ) dε(ξ)/dξ dξ.   (1)

Here E(t) is the time-dependent relaxation modulus, which is the material function of interest. Equation (1) represents a convolution integral equation that has an analytical solution only for standard types of excitation, namely a step function and a harmonic excitation. In the case of other excitation functions, determining the relaxation modulus E(t) from a given strain excitation ε(t) and stress response σ(t) requires numerical techniques for the solution of the related ill-posed inverse problem.
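The forward direction of Eq. (1), computing σ(t) from a known E(t) and ε(t), can be sketched numerically; the function names and the simple trapezoidal discretization below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def stress_response(E, eps_rate, t):
    """Evaluate the forward problem of Eq. (1),
    sigma(t) = int_0^t E(t - xi) * deps(xi)/dxi dxi,
    by the trapezoidal rule. E and eps_rate are callables (illustrative only)."""
    sigma = np.zeros_like(t)
    for k in range(1, len(t)):
        xi = t[:k + 1]
        f = E(t[k] - xi) * eps_rate(xi)  # integrand E(t - xi) * deps/dxi
        sigma[k] = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(xi))
    return sigma
```

For a constant modulus E(t) = E0 and a constant strain rate, this reduces to σ(t) = E0·k·t; the inverse direction, recovering E(t) from σ(t) and ε(t), has no such direct formula and is the ill-posed problem addressed in the paper.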
Existing numerical methods for solving ill-posed inverse problems are based on adding a disturbance (an additional restriction) to the initial problem, turning it into a problem that is close to the original but not ill-posed. In mathematics this approach is called regularization (Samarskii et al. 2009). The best-known and most widely used regularization technique is the one introduced by the Russian mathematician Tikhonov in 1955 (Tikhonov and Arsenin 1977). This approach is currently prevalent in the linear theory of viscoelasticity. It is important to mention that even though the Tikhonov regularization technique and its derivatives are widely and successfully used for solving ill-posed inverse problems, this group of techniques is computationally and time-demanding, as well as mathematically challenging, and consequently not suitable for health monitoring of plastics-based structures. For this reason, the paper introduces an empirical modeling approach based on artificial neural networks, which are known for their high computational and generalization capabilities and their robustness.

Modeling data generation
Data for training a neural network should be chosen thoughtfully, since an NN can function properly only within the range of data covered during its training. The training data for this work were generated artificially using a closed-form solution, which does not represent an inverse problem (Saprunov et al. 2014). The data consist of vectors of strain ε, stress σ, time t, and the corresponding vector of relaxation moduli E.
The data generation procedure is schematically shown in Fig. 1 and consists of two steps: determination of the relaxation modulus E(t), and of the related stress response σ(t) for a given strain excitation ε(t), respectively. To determine the relaxation modulus E(t), the relaxation mechanical spectrum H_i(τ_i), representing different engineering materials, is generated first. The relaxation mechanical spectrum H_i(τ_i) is a characteristic of a polymeric material and constitutes its transfer function. It describes the contributions of groups of molecules of different size/length to the overall response of the material to an external excitation.
Magnitudes of the relaxation spectrum lines H_i(τ_i) were determined according to the Gaussian distribution

H_i(τ_i) = (1 / (σ_G √(2π))) exp(−(log(τ_i) − μ)² / (2σ_G²)),   (2)

where τ_i is the response time of a particular material molecular group, μ is the mean value of the distribution, taken to be 0, i is the index of the molecular group, and σ_G is the standard deviation of the Gaussian distribution, which was varied from 0.4 to 1.6 with a step of 0.1. N = 49 spectrum lines were equally distributed in the logarithmic time scale log(τ_i) with the step 0.5 from −12 to 12, see Table 1. Afterwards, the obtained values were normalized according to

h_i = H_i / Σ_{j=1}^{N} H_j.   (3)

As a result, l = 13 different relaxation spectra for σ_G ∈ [0.4, 1.6] with step 0.1 were generated, using the parameters displayed in Table 1. The first diagram on the left in Fig. 1 shows examples of three selected relaxation spectra, for σ_G = 0.4, 1.0, and 1.6.
Knowing the equilibrium values of the relaxation modulus and the normalized spectrum values h_i allows one to calculate the whole relaxation modulus curve E(t) according to the formula

E(t) = E_0 + (E_g − E_0) Σ_{i=1}^{N} h_i exp(−t/τ_i),   (4)

where E_0 and E_g are the fixed equilibrium and glassy relaxation moduli, respectively. In our case the relaxation modulus E(t) was calculated as a segment corresponding to times from 0 to 50 seconds. The number of data points representing E(t) in this time interval was varied as n = 10, 50, and 100. In the next step the stress response σ(t) of the material was calculated from the constitutive equation (1), taking into account the defined E(t) and the excitation strain ε(t). For a constant strain rate, the strain changes according to

ε(t) = k t,   (5)

where k is the constant strain rate, taken as 0.1 (ISO 527-1 2012). In this case Eq. (1), after incorporation of Eq. (4) and integration, turns into

σ(t) = k [ E_0 t + (E_g − E_0) Σ_{i=1}^{N} h_i τ_i (1 − exp(−t/τ_i)) ].   (6)

Equation (6) allows calculation of the stress response σ(t) to the constant strain rate input (Eq. (5)) using the analytical closed-form solution of Eq. (1). The values of the constants and calculation parameters used are given in Table 1.
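The two-step generation above can be sketched as follows; the values of E_0, E_g, and the helper names are illustrative assumptions (the paper's actual constants are listed in its Table 1), while k = 0.1, the 49-line log-spaced spectrum, and the normalized Gaussian magnitudes follow the text:

```python
import numpy as np

def generate_dataset(sigma_g, n=50, t_max=50.0, e0=1.0, eg=10.0, k=0.1):
    """Generate one (eps, sigma, t, E) sample for a spectrum width sigma_g.
    e0 (equilibrium) and eg (glassy) modulus values are illustrative."""
    log_tau = np.arange(-12.0, 12.0 + 0.5, 0.5)       # N = 49 lines, step 0.5
    tau = 10.0 ** log_tau                             # relaxation times tau_i
    h = np.exp(-log_tau ** 2 / (2.0 * sigma_g ** 2))  # Gaussian magnitudes, mu = 0
    h /= h.sum()                                      # normalized spectrum h_i

    t = np.linspace(t_max / n, t_max, n)              # n time instants in (0, 50] s
    decay = np.exp(-t[:, None] / tau)                 # shape (n, N)
    E = e0 + (eg - e0) * (h * decay).sum(axis=1)      # Prony-series modulus E(t)
    # closed-form stress for eps(t) = k*t (k times the integral of E):
    sigma = k * (e0 * t + (eg - e0) * (h * tau * (1.0 - decay)).sum(axis=1))
    return k * t, sigma, t, E
```

Calling this for each σ_G ∈ {0.4, 0.5, ..., 1.6} would reproduce the structure of the l = 13 modeling datasets (ε, σ, t; E) described above.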
The calculated l = 13 relaxation moduli for σ_G ∈ [0.4, 1.6] with step 0.1 are shown in Fig. 2. Among the l = 13 generated datasets (ε, σ, t; E), l_tr = 7 curves E(t), marked with circular markers, and the corresponding stress-strain data were taken as training data, while the remaining l_val = 6 curves, denoted by solid thin lines, were taken as testing data to validate the modeling performance of the NNs.
In order to investigate how the performance of the MLP changes with respect to the number n = 10, 50, and 100 of data points used to represent the relaxation modulus curve, training data sets with different numbers of data points were generated.
The robustness of the MLP was checked on noisy data; therefore, relative noise of 1, 5, and 10 % was added to the stress values to simulate a real experiment.
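A minimal sketch of such noise injection, assuming a zero-mean uniform relative noise model (the paper does not specify the noise distribution):

```python
import numpy as np

def add_relative_noise(sigma, level, rng=None):
    """Add zero-mean relative noise to the stress vector, e.g. level=0.05
    for 5 % noise; the uniform noise model is an assumption."""
    rng = np.random.default_rng(rng)
    return sigma * (1.0 + level * rng.uniform(-1.0, 1.0, size=sigma.shape))
```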
All of these problems are ill-posed, since they do not satisfy the definition of a well-posed problem given by Jacques Hadamard and interpreted by Tikhonov and Arsenin (1977) as follows: a well-posed problem satisfies three conditions: (i) its solution exists, (ii) the solution is unique, and (iii) the solution is stable (small deviations in input parameters cause small changes in results). The problems listed above are related to the characterization of a particular system based on measurements that are accompanied by inevitable experimental errors. The solutions of these problems are not stable, since they are very sensitive to experimental error, and therefore do not satisfy the third condition for a well-posed problem.
Researchers have turned to NNs for solving inverse problems in different scientific fields due to their robustness (Elshafiey et al. 1995; Adler and Guardo 1994; Li et al. 2008), generalization capabilities (Jung and Ghaboussi 2006), and ability to deal with severely ill-posed problems (Baddari et al. 2010). Although NNs are used in many different applications, there are practically no papers addressing NN modeling of the behavior of viscoelastic materials and the determination of their time-dependent material functions, such as the relaxation modulus.
Therefore, in this paper Multilayer Perceptron and Radial Basis Function neural networks, known as universal function approximators, are used for the determination of a relaxation modulus segment from constant strain rate tensile experiment data. The determination of the relaxation modulus from constant strain rate experiments has already been addressed by many other researchers using generalization or parametric regression methods (Tscharnuter et al. 2011; Knauss and Zhao 2007). Therefore, the comparison of neural network performance was made with respect to the exponential parametric regression algorithm (nonlinear parametric regression), which showed good performance for this problem (Saprunov et al. 2014).

Neural networks models
Two of the most widely used NN types were tested on the stated problem, namely the Multilayer Perceptron (MLP) and the Radial Basis Function Neural Network (RBFN).
The multilayer perceptron (MLP) is a feedforward artificial neural network consisting of fully interconnected neurons arranged in several layers with nonlinear activation functions (Haykin 1999), as shown in Fig. 3(a). The number of hidden layers, as well as the number of neurons, is arbitrary.
The MLP can be trained in a supervised manner with the very popular backpropagation algorithm (Werbos 1994).
The output of the network with the two hidden layers presented in Fig. 3(a) is

y_r = f( Σ_{p=1}^{m} ω_{rp} f( Σ_{j=1}^{q} ω_{pj} f( Σ_{i=1}^{α} ω_{ji} x_i + c_j ) + c_p ) ),

where α and β are the numbers of inputs and outputs, indexed by i and r, respectively; q and m are the numbers of neurons in the hidden layers, indexed by j and p, respectively. Synaptic weights are represented by ω with two indices, where the first represents the neuron accepting the signal and the second the neuron sending the signal. c_j and c_p are weights called biases, which may be absent in a particular structure.
The Radial Basis Function Neural Network (RBFN) is a feedforward neural network that utilizes radial basis activation functions and typically consists of 3 layers with different roles, as shown in Fig. 3(b). The input layer is made of source nodes (inputs) that connect the network to its environment. The second layer, the only hidden layer in the RBFN, applies a nonlinear transformation from the input space to the hidden space, which is of high dimensionality in most applications. The output layer is linear, supplying the response of the network to the input signal.
Within this work, the Gaussian radial basis activation function

f(x) = exp(−x² / (2s²))   (9)

was used, where s is the standard deviation of the Gaussian distribution. The output of the RBFN is a scalar function of a real input vector X consisting of α components,

y(X) = Σ_{j=1}^{q} a_j f(‖X − C_j‖),

where q is the number of neurons in the hidden layer, C_j is the center vector of neuron j, f is the radial basis function of a neuron (Eq. (9)), and a_j is the weight of neuron j in the linear output layer. The norm is typically the Euclidean norm.
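A minimal sketch of this forward mapping, with hypothetical array names, could look like:

```python
import numpy as np

def rbfn_output(x, centers, weights, s):
    """RBFN forward pass sketch: Gaussian activation f(r) = exp(-r^2 / (2 s^2))
    of the Euclidean distances to the centers, then a linear output layer."""
    r = np.linalg.norm(centers - x, axis=1)  # ||X - C_j|| for each hidden neuron j
    return weights @ np.exp(-r ** 2 / (2.0 * s ** 2))
```

When the input coincides with a center and all other centers are far away, the output approaches that center's linear weight, illustrating the local character of the Gaussian units.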
The following sections introduce the topologies used for each NN type, the two possible ways of representing the input data, and the design details for each NN type.

Topologies of neural networks
Topology of a neural network defines the way the neurons of a particular layer are connected, and it is an important factor in network functioning and learning (Sammut and Webb 2011).
In supervised learning, the most common topology is the fully connected, at least three-layer, feedforward network. In such a network, all input values are connected to all neurons in the first hidden layer, and the outputs of the last hidden layer's neurons are connected to all neurons in the output layer, whose activation functions define the output of the network.
The number of hidden layers and the related number of neurons determine the computational capabilities of a network. According to the universal approximation theorem, a single hidden layer is sufficient for an MLP to compute a uniform approximation to a given training set (Haykin 1999). Nevertheless, due to the complexity of the problem and the related poor results of a single-layer MLP, the authors selected an MLP topology with two hidden layers and a maximum of q = m = 30 neurons in each (see Fig. 3(a)). The number of adjustable parameters provided by such a structure is more than sufficient for proper training, while the use of a training algorithm with a regularization term provides a network without a tendency to overfit (Sjoberg and Ljung 1992). For the RBFN, the number of hidden neurons was determined by the algorithm used for neural network creation.

Input data representation
Once the number of hidden layers and the related number of neurons are defined, the neural network topology is further determined by the way the input and related output data are represented. In our case, two different ways of presenting the input and output data were considered.
The first possibility is to provide the data points of one training sample "in series", i.e., one after another (Fig. 4(a)), and hence provide a mapping of the ith data point of a sample, (ε_i, σ_i, t_i) → (E_i), where i runs from 1 to the number n of data points comprising the input sample. The other option is to provide the data of the whole sample at once, "in parallel" (Fig. 4(b)), and hence a mapping (ε, σ) → (E) to the whole curve E(t_i), i = 1, . . ., n.
When the input data are provided in series, the input vector (ε_i, σ_i, t_i) is mapped to the related scalar value (E_i) representing E at time t_i. The training time is significantly shorter, and this mode mimics an application for real-time monitoring; however, the transition between two different training samples represents an irregularity in the data. The time value t_i is in this case required to introduce time-dependency into the system, which is needed for the solution of the stated inverse problem described by Eq. (1).
For parallel feeding, the strain and stress vectors (ε and σ, respectively) were given as inputs and mapped to the vector E, representing the complete relaxation modulus function E(t). In this case, since the whole strain and stress curves are provided as input, each data point has its own input neuron (see Fig. 4(b)). Presenting the input data in parallel, on the one hand, slows down the training procedure and results in a complex topology (an input neuron for each data point); on the other hand, it provides the network with all the information on the curve that is necessary for solving the inverse problem (the whole history of the material behavior). In this case, the vector of time instances is not needed, since the fully presented strain and stress vectors ε and σ already contain all the required information on time-dependency. The relaxation modulus values of the vector E were the outputs. The number of output neurons corresponds to the number n of data points representing the relaxation modulus curve E(t).
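The two input representations can be sketched as simple data-shaping helpers (hypothetical names; the paper's networks were built in MATLAB):

```python
import numpy as np

def make_serial(eps, sigma, t, E):
    """Serial topology: one (eps_i, sigma_i, t_i) -> E_i pattern per data point,
    i.e., 3 input neurons and 1 output neuron."""
    X = np.column_stack([eps, sigma, t])  # shape (n, 3)
    y = E.reshape(-1, 1)                  # shape (n, 1)
    return X, y

def make_parallel(eps, sigma, E):
    """Parallel topology: whole curves at once, (eps, sigma) -> E,
    i.e., 2n input neurons and n output neurons; time is omitted because
    the full strain/stress histories already encode it."""
    X = np.concatenate([eps, sigma]).reshape(1, -1)  # shape (1, 2n)
    y = E.reshape(1, -1)                             # shape (1, n)
    return X, y
```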

Multilayer perceptron design
For solving the stated inverse problem (ε, σ, t) → E, an MLP with the classical logistic sigmoid activation function

f(x) = 1 / (1 + e^(−x))

was utilized, where f(x) is the neuron activation function and its argument x is the weighted and biased sum of the neuron's inputs, formed from the vectors of strain ε, stress σ, and time t. The output of the MLP is defined by the vector of relaxation moduli E.
Training data were rescaled into the range between 0 and 1. The Nguyen-Widrow initialization algorithm (Nguyen and Widrow 1990) was used for initialization. It chooses initial values of the synaptic weights so as to distribute the active region of each neuron in the layer approximately evenly across the layer's input space. The selection of values contains a degree of randomness, so they are not the same each time this function is called (Demuth et al. 2009). In order to avoid this randomness during the MLP topology optimization process, the MATLAB random number generator was reset to its default state before the initialization function was called.
There is a variety of training functions available for the MLP; however, in order to avoid overfitting, the backpropagation algorithm with Bayesian regularization was chosen (Demuth et al. 2009). More information on Bayesian regularization can be found elsewhere; see, e.g., MacKay (1992).
Training was done in batch mode using the mse performance function, with the maximal number of training epochs set to 500 and the maximal number of validation checks set to 25.

Radial basis function neural network design
The RBFN was formed by consecutive addition of neurons with the Gaussian radial basis activation function until an error goal of 0.001 was satisfied (MATLAB function newrb); therefore, the only topological parameter to be determined was the spread of the network (in MATLAB notation, the spread equals s·√(ln 4) ≈ 1.177 s). It was varied over a wide range in order to determine the optimal values of the spread resulting in good RBFN performance. The same initialization procedure as for the MLP was applied.

Exponential parametric regression
As a reference modeling method, an exponential parametric regression, referred to as exponential fitting (Saprunov et al. 2014), was used. The exponential fitting represents the stress response of Eq. (1) in a general exponential form,

σ(t) = c t + Σ_{i=1}^{N} a_i (1 − e^(−b_i t)),   (12)

where i is the index of the N spectrum lines. By using Eq. (12) in Eq. (1), taking into account Eq. (5) for the constant change of strain ε(t), and differentiating the resulting equation with respect to time, the relaxation modulus is obtained in closed form (Knauss and Zhao 2007) as

E(t) = (1/k) [ c + Σ_{i=1}^{N} a_i b_i e^(−b_i t) ].   (13)

The unknown coefficients c, a_i, and b_i in Eq. (13) were determined by fitting the stress and strain data with Eq. (12), using the minimum least-squares error as the optimization criterion and taking into account the constraints imposed by the physical meaning of the coefficients, where b_i was varied in the window corresponding to changes of the relaxation times λ_i = 1/b_i within the experimental window t_1 ≤ λ_i ≤ t_n, corresponding to the number of measurement points, for every i = 1, 2, . . ., N (Knauss and Zhao 2007). The optimization was done using the Trust Region method as implemented in "lsqcurvefit" of the MATLAB Optimization Toolbox (Saprunov et al. 2014).
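A sketch of this reference method using SciPy's trust-region least squares in place of MATLAB's "lsqcurvefit"; the exponential parameterization of the stress below is an assumption consistent with E(t) = (1/k) dσ/dt, and N, the starting values, and the helper names are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit

def fit_relaxation_modulus(t, sigma, k=0.1, N=5):
    """Exponential-fitting sketch (assumed parameterization):
    fit sigma(t) ~ c*t + sum_i a_i*(1 - exp(-b_i*t)), then recover
    E(t) = (1/k) * dsigma/dt = (1/k) * (c + sum_i a_i*b_i*exp(-b_i*t))."""
    def model(tt, *p):
        c, a, b = p[0], np.asarray(p[1:N + 1]), np.asarray(p[N + 1:])
        return c * tt + (a * (1.0 - np.exp(-np.outer(tt, b)))).sum(axis=1)

    # relaxation times lambda_i = 1/b_i constrained to the experimental window
    lam = np.logspace(np.log10(t[0]), np.log10(t[-1]), N)
    p0 = np.concatenate([[sigma[-1] / t[-1]], np.full(N, 0.1), 1.0 / lam])
    lower = np.concatenate([[0.0], np.zeros(N), np.full(N, 1.0 / t[-1])])
    upper = np.concatenate([[np.inf], np.full(N, np.inf), np.full(N, 1.0 / t[0])])
    p, _ = curve_fit(model, t, sigma, p0=p0, bounds=(lower, upper), method="trf")

    c, a, b = p[0], p[1:N + 1], p[N + 1:]
    return (c + (a * b * np.exp(-np.outer(t, b))).sum(axis=1)) / k
```

Note that t must be strictly positive here so that the log-spaced relaxation times are well defined; the non-negativity bounds on c and a_i mirror the physical constraints discussed above.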

Measures of modeling performance
To evaluate and compare the performance of a particular NN and of the exponential fitting modeling, the following measures were used:

1. The mean squared error, MSE [MPa²], was calculated as

MSE_j = (1/n) Σ_{i=1}^{n} (E_ji − E_ji^target)²,

where j is the index of the jth testing curve E_j(t), E_ji is the value of the relaxation modulus predicted at the ith point, E_ji^target is the true value of the relaxation modulus at the ith point, and n is the number of points representing the curve E_j(t). The lower the MSE_j value, the better the performance. MSE_j was calculated for each curve from j = 1 to l, corresponding to the number of curves in the testing set, and was averaged over the l testing curves analyzed.

2. R_{j,0.95} [%] is defined as the percentage of the number n_{0.95} of data points of the curve E_j(t) that are estimated with a relative error of 5 % or more. The performance parameter R_{0.95}, incorporated into the optimization criterion, allows detecting and smoothing (removing) the effects of large MSE values caused by possible outliers. The value of 5 % relative error was chosen as the maximal error acceptable for engineering purposes. The data points of a curve E_j(t) which are predicted with a relative error of 5 % or more satisfy the condition

|E_ji − E_ji^target| / E_ji^target ≥ 0.05.

The performance measure R_{j,0.95} for modeling a particular curve E_j(t) is then determined as

R_{j,0.95} = (n_{0.95} / n) · 100 %,

where n is the number of data points representing the E_j(t) curve. The modeling performance of a particular NN is then characterized by the average value R_{0.95} of R_{j,0.95} over all analyzed testing curves E_j(t).
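The two measures can be computed per curve as follows (hypothetical function name):

```python
import numpy as np

def performance_measures(E_pred, E_target):
    """Per-curve measures: MSE_j [MPa^2] and R_{j,0.95} [%], the percentage
    of points predicted with 5 % or more relative error."""
    E_pred, E_target = np.asarray(E_pred), np.asarray(E_target)
    mse = np.mean((E_pred - E_target) ** 2)
    rel_err = np.abs(E_pred - E_target) / E_target
    r095 = 100.0 * np.count_nonzero(rel_err >= 0.05) / E_target.size
    return mse, r095
```

Averaging these per-curve values over the l testing curves yields the aggregate MSE and R_0.95 used in the optimization criterion.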
With the aim of validating the modeling performance of the NNs and of choosing an optimal topology in a two-dimensional space, an optimization criterion J was defined such that the minimization of J leads to an optimal performance of an NN.
The function J utilizes the Euclidean distance in the (MSE, R_0.95) plane between the point of optimal zero values of both parameters and the point of the current topology, as shown in Fig. 5. Minimization of this Euclidean distance corresponds to the minimal value of the function J, which represents the best approximation of the relaxation modulus function E(t).

Validation procedure
The validation of the NN modeling performance and the determination of the optimal topologies were done in three respects: (i) generalization, (ii) robustness, and (iii) performance of the NNs with respect to the number of data points in the signal. The first two are important for the application of the NNs; the third is related to the convergence of the optimization algorithm during training. A comparison with exponential fitting was made for robustness and for the effect of the number of data points. Data sets representing the E(t) curves with n = 10, 50, and 100 data points were considered.
1. The generalization of a particular NN was checked on l_val = 6 datasets representing the 6 curves E(t) that were not used for training.
2. The robustness test of an NN was done on data sets comprising both training and validation data (ε, σ, t; E), in which 1, 5, and 10 % relative noise was added to the stress component. Noise levels higher than 10 % were not considered, since measurement errors caused by sensors typically do not exceed this value.
3. The mathematical convergence of the training algorithm and the performance of the trained NN were analyzed with respect to the number n of data points in the input signal.

Results and discussion
Within this section the optimization results and related NN topologies are presented and their performance is evaluated with respect to their generalization ability and robustness.
In addition, the influence of the number of data points n used to represent the E(t) curves is analyzed. The results are then compared with the modeling results obtained by the exponential fitting technique (Saprunov et al. 2014).

Choice and validation of NN topologies
All possible variations of up to 30 neurons in each of the 2 hidden layers (900 combinations) were tested and compared based on the NN performance measures and the introduced optimization criterion J presented in Sect. 3.2. For the RBFN, the number of hidden neurons was determined by the algorithm used for neural network creation.
Figure 6 shows the results of the NN performance in the plane of MSE and R_0.95 for 6 different topologies of each NN type, obtained by taking into account the defined criterion function J (Eq. (14)). Figure 6(a) shows the related results for the MLP for various numbers of neurons [q, m] in the two hidden layers, and Fig. 6(b) illustrates the results for the RBFN with respect to the number of data points (n = 10, 50, and 100) and the way of presenting the input data to the NN (serial and parallel).
From the points in the (MSE, R_0.95) plane that lie closer to the origin, we observe that the neural networks mostly performed better when the data were provided in series (empty markers in Fig. 6) than when the data were provided in parallel (line markers in Fig. 6). In the following, the selected topologies are validated with respect to their generalization abilities, robustness, and the influence of the number of data points n used to represent E(t). For the purpose of NN validation, the data set representing the modeling data of all 13 E(t) curves was divided into training and testing data sets. The training data set consisted of the data representing the selected l_tr = 7 E(t) curves, while the other l_val = 6 curves were used as testing data.

Generalization
To demonstrate the generalization ability of the NNs, Tables 2 and 3 present the values of the NN performance parameters averaged over the training and validation data sets for the MLP and RBFN, respectively. From Table 2 it is visible that for the MLP the MSEs for the training and validation data are of the same order and are comparable. NNs with both ways of input data representation demonstrated a decrease in generalization, reflected in the fraction n_0.95 of data points reconstructed with more than 5 % relative error (R_0.95), for the highest number of data points (n = 100) in the data set. Additionally, a drop in performance and a related decrease of generalization ability not detected by R_0.95 is evident in the increase of the MSE for the validation data set in the case of the MLP with parallel input and n = 50 data points.
Table 3 presents the results of the generalization tests for the RBFN. Here, in contrast to the MLP, better results are obtained for the networks with data presented in parallel, while the RBFNs with data provided in series show poor performance with respect to both the MSE and R_0.95 parameters. Among the RBFNs with data provided in parallel, ideal values of the parameter, R_0.95 = 0 %, were observed for the validation data. The related ratio MSE_val/MSE_tr between validation and training increased up to 32. For the RBFN this ratio is higher than the maximal ratio for the MLP networks, which in general indicates that the MLP NNs showed a better generalization capability than the RBFNs.
Considering the split-sample validation of generalization of MLP and RBFN with two different representations of the input data and three different numbers of data points n in the set, MLP showed better generalization abilities with respect to both MSE and R 0.95.
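For a predicted and a target modulus curve, the two performance measures can be computed roughly as below. The reading of R 0.95 as the percentage of points reconstructed with more than 5 % relative error is an assumption based on the description above, not the paper's exact definition:

```python
import numpy as np

def mse(pred, target):
    # Mean squared error between predicted and target E(t) samples.
    return float(np.mean((pred - target) ** 2))

def r_095(pred, target):
    # Assumed reading of R 0.95: percentage of points reconstructed
    # with more than 5 % relative error.
    rel_err = np.abs(pred - target) / np.abs(target)
    return float(100.0 * np.mean(rel_err > 0.05))

target = np.array([2.0, 1.5, 1.2, 1.0])
perfect = target.copy()
off = np.array([2.2, 1.5, 1.2, 1.0])  # one point 10 % off
```

A perfect reconstruction gives MSE = 0 and R 0.95 = 0 %, while one point missed by 10 % out of four yields R 0.95 = 25 %.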

Robustness
Robustness in computer science is defined as the ability of a system to cope with errors during execution. A similar definition is given by Haykin (1999) in relation to neural network performance: "disturbances with small energy can only give rise to small excitation errors". In the current work, robustness is considered as the property of a network to resist noise in the input signals.
Since there are two conventional ways to solve the stated inverse problem, namely regularization methods and fitting techniques, the latter was chosen as the reference method against which to evaluate the robustness of the NN predictions. The fitting procedure was chosen because it is well known that Tikhonov regularization does not perform well on noisy data. In addition, the fitting procedure has been shown to be effective in obtaining the time-dependent relaxation modulus from constant strain-rate excitations (Saprunov et al. 2014).
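The reference fitting technique can be illustrated with a one-term Prony series fitted by nonlinear least squares. The functional form and the number of terms here are illustrative sketches, not the exact procedure of Saprunov et al. (2014):

```python
import numpy as np
from scipy.optimize import curve_fit

def prony(t, e_inf, e_1, tau_1):
    # One-term Prony series for the relaxation modulus E(t);
    # the fitting technique generalizes to more exponential terms.
    return e_inf + e_1 * np.exp(-t / tau_1)

t = np.linspace(0.0, 5.0, 100)
e_target = prony(t, 1.0, 2.0, 0.8)  # synthetic noiseless "data"

# Nonlinear least-squares regression of the Prony parameters.
popt, _ = curve_fit(prony, t, e_target, p0=[0.5, 1.0, 1.0])
```

On noiseless synthetic data the regression recovers the generating parameters; with noise and more exponential terms, the fit becomes sensitive to the number of data points, as discussed below.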
To analyze the robustness of the NN performance, data sets with relative additive noise of 1, 5, and 10 % were considered in modeling the relation (ε, σ, t) → E, and the robustness of the NNs was compared to that of the exponential fitting technique. Since the reference method of exponential fitting is sensitive to the width of the spectrum, the whole generated data set, including the data for training and validation, was used in the analysis. Graphs in Fig. 7 show the values of MSE and R 0.95, averaged over the complete data set of l = 13 curves (training and validation), for all 6 topologies of each type of NN and for the exponential fitting.
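A minimal sketch of the noise injection used in the robustness test is given below. The noise distribution (zero-mean Gaussian here) is an assumption; only the relative levels of 1, 5, and 10 % come from the text:

```python
import numpy as np

def add_relative_noise(signal, level, rng):
    # Corrupt the input signal with relative additive noise, e.g.
    # level = 0.05 for the 5 % case used in the robustness analysis.
    return signal * (1.0 + level * rng.standard_normal(signal.shape))

rng = np.random.default_rng(1)
clean = np.linspace(1.0, 2.0, 10)
noisy = {lvl: add_relative_noise(clean, lvl, rng) for lvl in (0.01, 0.05, 0.10)}
```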
The best results for noiseless data in terms of the average MSE and R 0.95 were demonstrated by the RBFN with data provided in parallel and the MLP with data provided in series.
Further, the MLP with data provided in parallel showed the highest robustness for noise levels below 10 % and a small number of data points (n = 10). With an increasing number of data points, the RBFN with data provided in parallel became competitive and surpassed the performance of the MLP. For n = 100 data points in the data set, the RBFN with data presented in parallel performed best in terms of both performance measures, followed by exponential fitting.
Among the NNs tested for robustness, the RBFN with data provided in parallel showed, for n = 50 and 100 data points, better results than the numerical method of nonlinear exponential parametric regression.

Effect of number of data points
Figure 8 shows the dependence of the MSE on the level of additive noise in the input data for each of the modeling methods and for different numbers of data points n in the data set. The MSE was calculated in the same way as for the robustness evaluation.
For noiseless data, an increasing trend of MSE with an increasing number of data points is observed for all the methods except MLP, which in this case performs best with 50 data points representing the relaxation modulus curve. This might be explained by an optimal relation between the number of free parameters of the network and the number of data points in the training data.
We can observe that for the RBFN with parallel input data (Fig. 8(d)) and for exponential fitting on noisy data (Fig. 8(e)) the MSE decreases with an increasing number n of data points, while the opposite trend is detected for all other methods applied to noisy data. As presented in Fig. 8, the lowest values of the average MSE are obtained with 10 data points for both data types when using MLP and RBFN with data presented in series.
Furthermore, we can see that the NNs with input data provided in series demonstrated the highest MSE (i.e., the lowest performance) among the compared methods, independently of the number of data points.

Conclusions
With the increasing use of polymers in demanding applications requiring structural health control, systems are needed that are capable of detecting not only geometrical changes, such as cracks, but also changes related to the time dependency of polymers. Such a system should be able to determine time-dependent material properties from external excitations and, therefore, solve an inverse problem. Existing numerical techniques cannot be used for real-time applications; therefore, neural networks are suggested.
The paper proposes artificial neural networks as a tool for solving the inverse problem appearing within the characterization of the time-dependent relaxation modulus of viscoelastic materials. The simplest case with a known closed-form solution, obtaining a segment of the relaxation modulus from constant strain rate tensile test data, was considered.
The investigation showed that MLP and RBFN of different topologies are capable of solving the stated problem and have good generalization capabilities. MLP with data provided in series showed better generalization than with parallel data feeding with respect to both measures of performance, MSE and R 0.95; the opposite behavior was observed for RBFN.
Considering robustness, the RBFN with data provided in parallel showed better performance compared to the other NNs and exponential fitting for high numbers of data points (n = 50 and 100).
Neural networks demonstrated better performance than exponential fitting for data sets with smaller numbers of data points, while the latter worked better with a larger number of data points in a set. This can be attributed to the use of the nonlinear least-squares regression algorithm, since a small number of data points limits the number of parameters of the method, which leads to a loss of performance.
The observed decrease in performance of MLPs and RBFNs using serial input data with an increasing number of data points can be related to the ratio of the number of free parameters of the network to the number of data points in the full training set.
The results showed that further development of artificial neural networks, particularly RBFN, is promising for application to real-time health monitoring of polymer structures. The generalization and robustness properties of the networks, exceeding the performance of nonlinear parametric exponential regression, as well as the possibility to operate in real time, are the advantages of NNs compared to conventional methods used to determine time-dependent mechanical properties from non-standard experiments. Further investigation should address dynamic loadings, real-time data prediction, the precision of detection of time-dependent changes, and upgrades of the training procedure.

Fig. 1
Fig. 1 Schematic representation of training data generation

Fig. 2
Fig. 2 Target relaxation moduli segments for n = 50 data points

Fig. 4
Fig. 4 Schematic representation of topology with data provided (a) in series and (b) in parallel

Fig. 5
Fig. 5 Schematic representation of topology optimization principle

Fig. 6
Fig. 6 Performance parameters in the (MSE, R 0.95)-plane for the selected optimal topologies of (a) MLP and (b) RBFN

As measures of performance, the defined MSE for each selected topology on the training and validation data, their absolute difference (ΔMSE = |MSE val − MSE tr|), and their ratio (MSE val /MSE tr) were used. In addition, the efficiency R 0.95 for the training and validation data, as well as its absolute difference (ΔR 0.95 = |R 0.95val − R 0.95tr|), is presented in the last three columns.

Fig. 8
Fig. 8 MSE dependence on the number of data points representing the relaxation modulus curve obtained by applying (a) MLP with input data in series, (b) MLP with parallel input data, (c) RBFN with input data in series, (d) RBFN with parallel input data, and (e) exponential fitting for data with 0, 1, 5, and 10 % added noise

Table 2
Comparison of MLP performance on training and validation data