Title: | Validation Tools for Artificial Neural Networks |
---|---|
Description: | Methods and tools for analysing and validating the outputs and modelled functions of artificial neural networks (ANNs) in terms of predictive, replicative and structural validity. Also provides a method for fitting feed-forward ANNs with a single hidden layer. |
Authors: | Greer B. Humphrey [aut, cre] |
Maintainer: | Greer B. Humphrey <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.2.1 |
Built: | 2024-10-25 06:02:57 UTC |
Source: | https://github.com/gbhumphrey1/validann |
Fits a single hidden layer ANN model to input data x and output data y.
ann(x, y, size, act_hid = c("tanh", "sigmoid", "linear", "exp"), act_out = c("linear", "sigmoid", "tanh", "exp"), Wts = NULL, rang = 0.5, objfn = NULL, method = "BFGS", maxit = 1000, abstol = 1e-04, reltol = 1e-08, trace = TRUE, ...)
x |
matrix, data frame or vector of numeric input values, with
|
y |
matrix, data frame or vector of target values for examples. |
size |
number of hidden layer nodes. Can be zero. |
act_hid |
activation function to be used at the hidden layer. See ‘Details’. |
act_out |
activation function to be used at the output layer. See ‘Details’. |
Wts |
initial weight vector. If |
rang |
initial random weights on [-rang,rang]. Default value is 0.5. |
objfn |
objective function to be minimised when fitting
weights. This function may be user-defined with the first two arguments
corresponding to |
method |
the method to be used by |
maxit |
maximum number of iterations used by |
abstol |
absolute convergence tolerance (stopping criterion)
used by |
reltol |
relative convergence tolerance (stopping criterion)
used by |
trace |
logical. Should optimization be traced? Default = TRUE. |
... |
arguments to be passed to user-defined |
The “linear” activation, or transfer, function is the identity function, where the output of a node is equal to its input z: f(z) = z.
The “sigmoid” function is the standard logistic sigmoid function, given by f(z) = 1 / (1 + exp(-z)).
The “tanh” function is the hyperbolic tangent function, given by f(z) = (exp(z) - exp(-z)) / (exp(z) + exp(-z)).
The “exp” function is the exponential function, given by f(z) = exp(z).
The default configuration of activation functions is act_hid = "tanh" and act_out = "linear".
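As an illustration only, the four activation functions named above can be written in base R as follows. The function names (act_linear and so on) are hypothetical and are not part of the package; the “exp” form assumes the plain exponential exp(z).

```r
# Hypothetical helper names; these are not validann functions.
act_linear  <- function(z) z                    # identity
act_sigmoid <- function(z) 1 / (1 + exp(-z))    # standard logistic sigmoid
act_tanh    <- function(z) tanh(z)              # hyperbolic tangent
act_exp     <- function(z) exp(z)               # assumed form of "exp"

# Evaluate each activation on a few node inputs
z <- c(-2, 0, 2)
act_table <- sapply(
  list(linear = act_linear, sigmoid = act_sigmoid,
       tanh = act_tanh, exp = act_exp),
  function(f) f(z)
)
print(act_table)
```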
Optimization (minimization) of the objective function (objfn) is performed by optim using the method specified.
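The following is a minimal sketch of the general idea, not the package's internal code: a sum-of-squares objective for a tiny tanh/linear network is minimised with base R optim using method "BFGS". The toy data, the net and sse helpers, and the weight layout are all assumptions made for this example.

```r
set.seed(1)

# Toy data (hypothetical): one input, one target
x <- matrix(seq(-2, 2, length.out = 50), ncol = 1)
y <- sin(x[, 1])

# Forward pass: 1 input, 2 tanh hidden nodes, linear output.
# The weight layout is arbitrary and chosen for this sketch only.
net <- function(w, x) {
  h1 <- tanh(w[1] * x + w[2])
  h2 <- tanh(w[3] * x + w[4])
  w[5] * h1 + w[6] * h2 + w[7]
}

# Objective: sum of squared errors between targets and network output
sse <- function(w, x, y) sum((y - net(w, x[, 1]))^2)

# Initial random weights on [-rang, rang] with rang = 0.5
w0 <- runif(7, -0.5, 0.5)

opt <- optim(w0, sse, x = x, y = y, method = "BFGS",
             control = list(maxit = 1000, abstol = 1e-4, reltol = 1e-8))

opt$value        # objective value at the best weights found
opt$convergence  # integer code returned by optim
```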
Derivatives returned are first-order partial derivatives of the hidden and output nodes with respect to their inputs. These may be useful for sensitivity analyses.
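As a sketch of what such derivatives look like, the first-order derivative of each activation function with respect to the node input z can be written in closed form and checked against a finite difference. This assumes “inputs” means the net input to each node; the helper names are hypothetical.

```r
# Closed-form derivatives with respect to the node input z
# (hypothetical helper names, not package functions)
d_linear  <- function(z) rep(1, length(z))
d_sigmoid <- function(z) { s <- 1 / (1 + exp(-z)); s * (1 - s) }
d_tanh    <- function(z) 1 - tanh(z)^2
d_exp     <- function(z) exp(z)

# Central finite-difference approximation used as a check
fd <- function(f, z, h = 1e-6) (f(z + h) - f(z - h)) / (2 * h)

z <- c(-1.5, 0, 0.7)
max_err <- max(
  abs(d_tanh(z)    - fd(tanh, z)),
  abs(d_sigmoid(z) - fd(plogis, z)),  # plogis is the logistic function
  abs(d_exp(z)     - fd(exp, z))
)
max_err
```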
object of class ‘ann’ with components describing the ANN structure and the following output components:
wts |
best set of weights found. |
par_of |
best values of additional |
value |
value of objective function. |
fitted.values |
fitted values for the training data. |
residuals |
residuals for the training data. |
convergence |
integer code returned by |
derivs |
matrix of derivatives of hidden (columns |
## fit 1-hidden node ann model with tanh activation at the hidden layer and
## linear activation at the output layer.
## Use 200 random samples from ar9 dataset.
## ---
data("ar9")
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
fit <- ann(x, y, size = 1, act_hid = "tanh", act_out = "linear", rang = 0.1)

## fit 3-hidden node ann model to ar9 data with user-defined AR(1) objective
## function
## ---
ar1_sse <- function(y, y_hat, par_of) {
  err <- y - y_hat
  err[-1] <- err[-1] - par_of * err[-length(y)]
  sum(err ^ 2)
}
fit <- ann(x, y, size = 3, act_hid = "tanh", act_out = "linear", rang = 0.1,
           objfn = ar1_sse, par_of = 0.7)
Synthetically generated dataset containing values of dependent variable x_t given values of x_t-1, x_t-2, ..., x_t-15.
ar9
A data frame with 1000 rows and 16 variables:
lagged values of x_t in columns 1:15
dependent variable in column 16
This dataset was generated using the AR9 model first described in Sharma (2000) and given by:
x_t = 0.3 x_t-1 - 0.6 x_t-4 - 0.5 x_t-9 + e_t
where e_t is a standard normal noise term, e_t ~ N(0, 1).
Sharma, A. (2000), Seasonal to interannual rainfall probabilistic forecasts for improved water supply management: Part 1 - a strategy for system predictor identification, Journal of Hydrology, 239(1-4), 232-239, http://dx.doi.org/10.1016/S0022-1694(00)00346-2.
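A dataset with the same layout can be simulated in base R. The sketch below assumes the commonly cited form of the AR9 model, x_t = 0.3 x_t-1 - 0.6 x_t-4 - 0.5 x_t-9 + e_t with standard normal noise e_t; the burn-in length, the seed and the object name ar9_like are choices made for this example, and the result will not reproduce the packaged ar9 values.

```r
set.seed(42)

n_keep <- 1000   # rows wanted in the final data frame
n_lag  <- 15     # number of lagged columns
burn   <- 100    # burn-in steps discarded (arbitrary choice for this sketch)
n      <- n_keep + n_lag + burn

# Simulate the series; coefficients are the assumed AR9 form described above
e <- rnorm(n)
x <- numeric(n)
x[1:9] <- e[1:9]
for (t in 10:n) {
  x[t] <- 0.3 * x[t - 1] - 0.6 * x[t - 4] - 0.5 * x[t - 9] + e[t]
}
x <- x[-seq_len(burn)]

# embed() puts x_t in column 1 and x_t-1, ..., x_t-15 in columns 2:16
emb <- embed(x, n_lag + 1)

# Lagged values in columns 1:15, dependent variable in column 16
ar9_like <- as.data.frame(cbind(emb[, 2:(n_lag + 1)], emb[, 1]))
names(ar9_like) <- c(paste0("x_t-", 1:n_lag), "x_t")
dim(ar9_like)
```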
Return observed target values used for fitting ‘ann’ or ‘nnet’ ANN models.
observed(object)
object |
an object of class ‘ann’ as returned by |
This function can be invoked by calling observed(x) for an object x of class ‘ann’ or ‘nnet’.
a 1-column matrix of observed target values.
# Get observed values of y used to train ann object `fit'.
# ---
data("ar9")
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
fit <- ann(x, y, size = 1, act_hid = "tanh", act_out = "linear", rang = 0.1)
y_obs <- observed(fit)
Plot method for objects of class ‘validann’. Produces a series of plots used for validating and assessing ANN models based on results returned by validann.
## S3 method for class 'validann'
plot(x, obs, sim, gof = TRUE, resid = TRUE, sa = TRUE,
     display = c("multi", "single"), profile = c("all", "median"), ...)
x |
object of class ‘validann’ as returned
by |
obs , sim
|
vectors comprising observed ( |
gof |
logical; should goodness-of-fit plots be produced? Default = TRUE. |
resid |
logical; should residual analysis plots be produced? Default = TRUE. |
sa |
logical; should input sensitivity analysis plots be produced? Default = TRUE. |
display |
character string defining how plots should be
displayed. The default is “multi” where multiple plots are displayed
together according to whether they are goodness-of-fit, residual analysis
or sensitivity analysis plots. For “single”, each plot is displayed on
its own. If the session is interactive, the user will be asked to confirm
a new page whether |
profile |
character string defining which structural validity Profile method outputs should be plotted. The default is “all” where outputs corresponding to 5 summary statistics are plotted together with the median predicted response for each input value. For “median”, only the median response is plotted. |
... |
Arguments to be passed to plot (not currently used). |
This function can be invoked by calling plot(x, obs, sim) for an object x of class ‘validann’.
To produce plots for all types of validation metrics and statistics, gof, resid and sa must be TRUE and corresponding results must have been successfully computed by validann and returned in object x.
If gof is TRUE, a scatter plot, Q-Q plot and time/sample plot of observed (obs) versus predicted (sim) data are produced.
If resid is TRUE and x$residuals is not NULL, plots of the model residuals are produced including histogram, Q-Q plot (standardized residuals compared to standard normal), autocorrelation (acf), partial autocorrelation (pacf), standardized residual versus predicted output (i.e. sim) and standardized residual versus time/order of the data.
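The quantities behind these residual plots can be computed directly in base R. The sketch below is illustrative only: the toy obs and sim vectors are made up, the residual is taken as obs - sim, and standardisation by the sample mean and standard deviation is an assumption rather than the package's documented convention.

```r
set.seed(7)

# Toy observed and simulated series (hypothetical data)
sim <- rnorm(100)
obs <- sim + rnorm(100, sd = 0.3)

# Residuals and standardised residuals (assumed conventions)
res     <- obs - sim
std_res <- (res - mean(res)) / sd(res)

# Numbers underlying the histogram, Q-Q, acf and pacf plots
h  <- hist(res, plot = FALSE)
qq <- qqnorm(std_res, plot.it = FALSE)  # theoretical vs sample quantiles
a  <- acf(res, plot = FALSE)            # autocorrelation function
p  <- pacf(res, plot = FALSE)           # partial autocorrelation function

head(cbind(lag = a$lag[-1], acf = a$acf[-1], pacf = p$acf[seq_along(a$acf[-1])]))
```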
If sa is TRUE and x$y_hat is not NULL, model response values resulting from the Profile sensitivity analysis are plotted against percentiles of each input. If x$rs is not NULL, the relative sensitivities of each input, as computed by the partial derivative (PaD) sensitivity analysis, are plotted against predicted output.
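To give a sense of what a Profile-style sensitivity analysis involves, the sketch below varies one input across its percentiles while the other inputs are held in turn at five summary statistics, then takes the median response. It is a loose illustration under stated assumptions: the choice of minimum, quartiles and maximum as the five statistics, the stand-in model function and the helper profile_input are all hypothetical and may differ from what validann computes.

```r
set.seed(3)

# Toy inputs and a stand-in for a fitted ANN (both hypothetical)
X <- matrix(rnorm(200 * 3), ncol = 3,
            dimnames = list(NULL, c("x1", "x2", "x3")))
model <- function(X) tanh(0.8 * X[, 1] - 0.5 * X[, 2]) + 0.1 * X[, 3]

probs <- seq(0, 1, by = 0.01)   # percentiles 0, 1, ..., 100

# Five summary statistics per input: min, Q1, median, Q3, max (assumed choice)
fixed <- apply(X, 2, quantile, probs = c(0, 0.25, 0.5, 0.75, 1))

profile_input <- function(k) {
  grid <- quantile(X[, k], probs)   # percentiles of input k
  resp <- sapply(1:5, function(s) {
    # Hold every input at summary statistic s, then vary input k only
    Xs <- matrix(fixed[s, ], nrow = length(grid), ncol = ncol(X), byrow = TRUE)
    Xs[, k] <- grid
    model(Xs)
  })
  # One response column per summary statistic, plus the median response
  cbind(resp, median = apply(resp, 1, median))
}

prof_x1 <- profile_input(1)
dim(prof_x1)
```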
Setting gof, resid and/or sa to FALSE will ‘turn off’ the respective validation plots.
## Build ANN model and compute replicative and structural validation results
data("ar9")
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
fit <- ann(x, y, size = 1, act_hid = "tanh", act_out = "linear", rang = 0.1)
results <- validann(fit, x = x)
obs <- observed(fit)
sim <- fitted(fit)

## Plot replicative and structural validation results to the current device
## - a single page for each type of validation
plot(results, obs, sim)

## Plot results to the current device - a single page for each plot
plot(results, obs, sim, display = "single")

## Plot replicative and structural validation results to single file
pdf("RepStructValidationPlots.pdf")
plot(results, obs, sim)
dev.off()

## Get predictive validation results for above model based on a new sample
## of ar9 data.
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
obs <- y
sim <- predict(fit, newdata = x)
results <- validann(fit, obs = obs, sim = sim, x = x)

## Plot predictive results only to file
pdf("PredValidationPlots.pdf")
plot(results, obs, sim, resid = FALSE, sa = FALSE)
dev.off()
Predict new examples using a trained neural network.
## S3 method for class 'ann'
predict(object, newdata = NULL, derivs = FALSE, ...)
object |
an object of class ‘ann’ as returned by function |
newdata |
matrix, data frame or vector of input data.
A vector is considered to comprise examples of a single input or
predictor variable. If |
derivs |
logical; should derivatives of hidden and output nodes be
returned? Default is |
... |
additional arguments affecting the predictions produced (not currently used). |
This function is a method for the generic function predict() for class ‘ann’. It can be invoked by calling predict(x) for an object x of class ‘ann’.
predict.ann produces predicted values, obtained by evaluating the ‘ann’ model given newdata, which contains the inputs to be used for prediction. If newdata is omitted, the predictions are based on the data used for the fit.
Derivatives may be returned for sensitivity analyses, for example.
If derivs = FALSE, a vector of predictions is returned. Otherwise, a list with the following components is returned:
values |
matrix of values returned by the trained ANN. |
derivs |
matrix of derivatives of hidden (columns |
## fit 1-hidden node `ann' model to ar9 data
data("ar9")
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
fit <- ann(x, y, size = 1, act_hid = "tanh", act_out = "linear", rang = 0.1)

## get model predictions based on a new sample of ar9 data.
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
sim <- predict(fit, newdata = x)

## if derivatives are required...
tmp <- predict(fit, newdata = x, derivs = TRUE)
sim <- tmp$values
derivs <- tmp$derivs
Compute metrics and statistics for predictive, replicative and/or structural validation of artificial neural networks (ANNs).
validann(...)

## S3 method for class 'ann'
validann(net, obs = NULL, sim = NULL, x = NULL, na.rm = TRUE, ...)

## S3 method for class 'nnet'
validann(net, obs = NULL, sim = NULL, x = NULL, na.rm = TRUE, ...)

## Default S3 method:
validann(obs, sim, wts = NULL, nodes = NULL, na.rm = TRUE, ...)
net |
an object of class ‘ann’ (as returned by function
|
obs , sim
|
vectors comprising observed ( |
x |
matrix, data frame or vector of input data used for
fitting |
na.rm |
logical; should missing values (including NaN) be removed from calculations? Default = TRUE. |
wts |
vector of ANN weights used to compute input
‘relative importance’ measures if |
nodes |
vector indicating the number of nodes in each layer
of the ANN model. This vector should have 3 elements: nodes in input
layer, nodes in hidden layer (can be 0), and nodes in output layer.
If |
... |
arguments to be passed to different validann methods, see specific formulations for details. |
To compute all possible validation metrics and statistics, net must be supplied and must be of class ‘ann’ (as returned by ann) or ‘nnet’ (as returned by nnet). However, a partial derivative (PaD) sensitivity analysis (useful for structural validation) will only be carried out if net is of class ‘ann’.
If obs and sim data are supplied in addition to net, validation metrics are computed based on these. Otherwise, metrics and statistics are computed based on obs and sim datasets derived from the net object (i.e. the data used to fit net and the fitted values). As such, both obs and sim must be supplied if validation is to be based either on data not used for training or on unprocessed training data (if training data were preprocessed). If either obs or sim is specified but the other isn't, both obs and sim will be derived from net if supplied (and a warning will be given). Similarly, this will occur if obs and sim are of different lengths.
If net is not supplied, both obs and sim are required. This may be necessary if validating an ANN model not built using either the nnet or ann functions. In this case, both wts and nodes are also required if any structural validation metrics are to be returned. If an ANN model has K input nodes, J hidden nodes and a single output O, with a bias node for both the hidden and output layers, the wts vector must be ordered as follows:
c(Wi1h1,Wi1h2,...Wi1hJ,Wi2h1,...Wi2hJ,...,WiKh1,...,WiKhJ,Wi0h1,...,Wi0hJ,
Wh1O,...,WhJO,Wh0O)
where Wikhj is the weight between the kth input and the jth hidden node and WhjO is the weight between the jth hidden node and the output. The bias weight on the jth hidden layer node is labelled Wi0hj while the bias weight on the output is labelled Wh0O. The wts vector assumes the network is fully connected; however, missing connections may be substituted by zero weights. Skip-layer connections are not allowed.
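The ordering described above can be made concrete with a short base R sketch that unpacks a wts vector into weight matrices and runs a forward pass. The helper names unpack_wts and forward are hypothetical, and the tanh hidden and linear output activations are assumptions for the example; the wts and nodes values are those used in the examples for this function.

```r
# Split a wts vector, ordered as described above, into its parts.
# Hypothetical helper; requires at least one hidden node and a single output.
unpack_wts <- function(wts, nodes) {
  K <- nodes[1]; J <- nodes[2]
  stopifnot(J > 0, nodes[3] == 1, length(wts) == K * J + 2 * J + 1)
  list(
    # Rows = inputs, columns = hidden nodes; weights for input 1 come first,
    # so the matrix is filled by row
    W_ih = matrix(wts[seq_len(K * J)], nrow = K, ncol = J, byrow = TRUE),
    b_h  = wts[K * J + seq_len(J)],       # Wi0h1, ..., Wi0hJ
    W_ho = wts[K * J + J + seq_len(J)],   # Wh1O, ..., WhJO
    b_o  = wts[K * J + 2 * J + 1]         # Wh0O
  )
}

# Forward pass assuming tanh hidden nodes and a linear output
forward <- function(x, p) {
  H <- tanh(sweep(x %*% p$W_ih, 2, p$b_h, "+"))
  drop(H %*% p$W_ho + p$b_o)
}

wts   <- c(-0.05217, 0.08363, 0.07840, -0.00753, -7.35675, -0.00066)
nodes <- c(3, 1, 1)
p <- unpack_wts(wts, nodes)
str(p)

# Output for a single example with all three inputs equal to zero
y0 <- forward(matrix(0, nrow = 1, ncol = 3), p)
```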
list object of class ‘validann’ with components dependent on arguments passed to validann function:
metrics |
a data frame consisting of metrics: AME, PDIFF, MAE, ME, RMSE, R4MS4E, AIC, BIC, NSC, RAE, PEP, MARE, MdAPE, MRE, MSRE, RVE, RSqr, IoAd, CE, PI, MSLE, MSDE, IRMSE, VE, KGE, SSE and R. See Dawson et al. (2007) for definitions. |
obs_stats |
a data frame consisting of summary statistics about the
|
sim_stats |
a data frame consisting of summary statistics about the
|
residuals |
a 1-column matrix of model residuals ( |
resid_stats |
a data frame consisting of summary statistics about the
model |
ri |
a data frame consisting of ‘relative importance’ values for each
input. Only returned if If Garson's (Garson); connection weight (CW); Profile sensitivity analysis (Profile); and partial derivative sensitivity analysis (PaD). In addition, if If See Gevrey et al. (2003), Olden et al. (2004) and Kingston et al. (2006) for details of the relative importance methods. |
y_hat |
a matrix of dimension The response values returned in |
as |
a matrix of dimension The values in |
rs |
a matrix of dimension To compute the values in |
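For orientation, a handful of the goodness-of-fit metrics named under metrics can be computed by hand in base R. The formulas below are the commonly used definitions and are an assumption here; sign and scaling conventions in the package (which follows Dawson et al., 2007) may differ, so this is a sketch rather than a replacement for validann. The obs and sim values are those used in the examples for this function.

```r
obs <- c(0.257, -0.891, -1.710, -0.575, -1.668, 0.851, -0.350, -1.313,
         -2.469, 0.486)
sim <- c(-1.463, 0.027, -2.053, -1.091, -1.602, 2.018, 0.723, -0.776,
         -2.351, 1.054)

err <- obs - sim   # error sign convention assumed for this sketch

gof <- c(
  ME   = mean(err),             # mean error
  MAE  = mean(abs(err)),        # mean absolute error
  RMSE = sqrt(mean(err^2)),     # root mean squared error
  SSE  = sum(err^2),            # sum of squared errors
  R    = cor(obs, sim),         # Pearson correlation
  CE   = 1 - sum(err^2) / sum((obs - mean(obs))^2)  # coefficient of efficiency
)
round(gof, 4)
```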
ann: Compute validation metrics when net is of class ‘ann’.
nnet: Compute validation metrics when net is of class ‘nnet’.
default: Useful for predictive validation only or when ANN model has not been developed using either ann or nnet. Limited structural validation metrics may be computed and only if wts and nodes are supplied.
Dawson, C.W., Abrahart, R.J., See, L.M., 2007. HydroTest: A web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts. Environmental Modelling & Software, 22(7), 1034-1052. http://dx.doi.org/10.1016/j.envsoft.2006.06.008.
Olden, J.D., Joy, M.K., Death, R.G., 2004. An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data. Ecological Modelling 178, 389-397. http://dx.doi.org/10.1016/j.ecolmodel.2004.03.013.
Gevrey, M., Dimopoulos, I., Lek, S., 2003. Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecological Modelling 160, 249-264. http://dx.doi.org/10.1016/S0304-3800(02)00257-0.
Kingston, G.B., Maier, H.R., Lambert, M.F., 2006. Forecasting cyanobacteria with Bayesian and deterministic artificial neural networks, in: IJCNN '06. International Joint Conference on Neural Networks, 2006., IEEE. pp. 4870-4877. http://dx.doi.org/10.1109/ijcnn.2006.247166.
Mount, N.J., Dawson, C.W., Abrahart, R.J., 2013. Legitimising data-driven models: exemplification of a new data-driven mechanistic modelling framework. Hydrology and Earth System Sciences 17, 2827-2843. http://dx.doi.org/10.5194/hess-17-2827-2013.
ann, plot.validann, predict.ann
# get validation results for 1-hidden node `ann' model fitted to ar9 data
# based on training data.
# ---
data("ar9")
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
fit <- ann(x, y, size = 1, act_hid = "tanh", act_out = "linear", rang = 0.1)
results <- validann(fit, x = x)

# get validation results for above model based on a new sample of ar9 data.
# ---
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
obs <- y
sim <- predict(fit, newdata = x)
results <- validann(fit, obs = obs, sim = sim, x = x)

# get validation results for `obs' and `sim' data without ANN model.
# In this example `sim' is generated using a linear model. No structural
# validation of the model is possible, but `wts' are provided to compute the
# number of model parameters needed for the calculation of certain
# goodness-of-fit metrics.
# ---
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- as.matrix(x[, c(1,4,9)])
lmfit <- lm.fit(x, y)
sim <- lmfit$fitted.values
obs <- y
results <- validann(obs = obs, sim = sim, wts = lmfit$coefficients)

# validann would be called in the same way if the ANN model used to generate
# `sim' was not available or was not of class `ann' or `nnet'. Ideally in
# this case, however, both `wts' and `nodes' should be supplied such that
# some structural validation metrics may be computed.
# ---
obs <- c(0.257, -0.891, -1.710, -0.575, -1.668, 0.851, -0.350, -1.313,
         -2.469, 0.486)
sim <- c(-1.463, 0.027, -2.053, -1.091, -1.602, 2.018, 0.723, -0.776,
         -2.351, 1.054)
wts <- c(-0.05217, 0.08363, 0.07840, -0.00753, -7.35675, -0.00066)
nodes <- c(3, 1, 1)
results <- validann(obs = obs, sim = sim, wts = wts, nodes = nodes)