An example of predicting the burned area of forest fires in the northeast region of Portugal, using meteorological and other data
Introduction
Neural networks have seen an explosion of
interest over the last few years, and are being successfully applied
across an extraordinary range of problem domains, in areas as diverse
as finance, medicine, engineering, geology and physics. Indeed,
anywhere that there are problems of prediction, classification or
control, neural networks are being introduced. This sweeping success
can be attributed to a few key factors:
- Neural
networks are very sophisticated modeling techniques capable of modeling
extremely complex functions.
- Neural networks
learn by example. The neural network user gathers representative data,
and then invokes training algorithms to automatically learn the
structure of the data. Although the user does need to have some
heuristic knowledge of how to select and prepare data, how to select an
appropriate neural network, and how to interpret the results, the level
of user knowledge needed to successfully apply neural networks is much
lower than would be the case using (for example) some more traditional
nonlinear statistical methods.
- Neural networks
are also intuitively appealing, based as they are on a crude low-level
model of biological neural systems. In the future, the development of
this neurobiological modeling may lead to genuinely intelligent
computers.
In this experiment we will show how neural networks and Neuroph Studio can be applied to a problem of this kind. Several architectures will be tried out, and we will determine which of them represent a good solution to the problem and which do not.
Introduction to the problem
Our task is to use the twelve inputs (in the original data set) to predict the burned area of forest fires. The output "area" was first transformed with a ln(x+1) function. Then, several data mining methods were applied. After fitting the models, the outputs were post-processed with the inverse of the ln(x+1) transform. Four different input setups were used. The experiments were conducted using 10-fold cross-validation with 30 runs. Two regression metrics were measured: MAD and RMSE. A Gaussian support vector machine (SVM) fed with only 4 direct weather conditions (temp, RH, wind and rain) obtained the best MAD value: 12.71 ± 0.01 (mean and 95% confidence interval using a Student's t-distribution). The best RMSE was attained by the naive mean predictor. An analysis of the regression error characteristic (REC) curve shows that the SVM model predicts more examples within a lower admitted error. In effect, the SVM model predicts small fires, which are the majority, better.
The data set contains 517 instances and 12 inputs, plus one output attribute. They are:
1. X - x-axis spatial coordinate within the Montesinho park map: 1 to 9
2. Y - y-axis spatial coordinate within the Montesinho park map: 2 to 9
3. month - month of the year: "jan" to "dec"
4. day - day of the week: "mon" to "sun"
5. FFMC - FFMC index from the FWI system: 18.7 to 96.20
6. DMC - DMC index from the FWI system: 1.1 to 291.3
7. DC - DC index from the FWI system: 7.9 to 860.6
8. ISI - ISI index from the FWI system: 0.0 to 56.10
9. temp - temperature in Celsius degrees: 2.2 to 33.30
10. RH - relative humidity in %: 15.0 to 100
11. wind - wind speed in km/h: 0.40 to 9.40
12. rain - outside rain in mm/m2: 0.0 to 6.4
13. area - the burned area of the forest (in ha): 0.00 to 1090.84 (the output)
The data set can be downloaded here; however, it cannot be used in Neuroph in its original form. For it to be able to help us with this problem, we need to normalize the data first. The type of neural network that will be used in this experiment is a multilayer perceptron with backpropagation.
Procedure of training a neural network
In order to train a neural network, there are six steps to be taken:
- Normalize the data
- Create a Neuroph project
- Create a training set
- Create a neural network
- Train the network
- Test the network to make sure that it is trained properly
In this experiment we will demonstrate the use of some standard and advanced training techniques. Several architectures will be tried out, on the basis of which we will determine what brings the best results for our problem.
Step 1. Data Normalization
In order to train the neural network, this data set has to be normalized. Normalization implies that all values from the data set should take values in the range from 0 to 1. For that purpose we use the following min-max formula:
Xn = (X - Xmin) / (Xmax - Xmin)
Where:
X – value that should be normalized
Xn – normalized value
Xmin – minimum value of X
Xmax – maximum value of X
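As an illustration, here is a minimal Java sketch of this min-max normalization (the class and method names are our own, not part of Neuroph):

public class Normalizer {
    // scale a value into [0, 1] using the column's minimum and maximum
    public static double normalize(double x, double xMin, double xMax) {
        return (x - xMin) / (xMax - xMin);
    }

    public static void main(String[] args) {
        // example: a temperature of 20.0 with the observed range 2.2 to 33.3
        System.out.println(normalize(20.0, 2.2, 33.3)); // prints ~0.5723
    }
}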
In our case, all attribute values are real numbers, except the months and days. Therefore, we replaced the input month with twelve new inputs, one for each month, where the value 1 indicates the active month and all the other values are 0. We have done the same with the days, using seven new inputs. This raises the number of inputs from 12 to 29 (10 numeric attributes, 12 month inputs and 7 day inputs). In addition, we dropped the rows with exceptionally large output values, to allow a more precise analysis.
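As an illustration, a sketch of the month encoding described above (our own helper, not part of Neuroph; the day of the week is handled the same way with seven inputs):

public class MonthEncoder {
    private static final String[] MONTHS = {
        "jan", "feb", "mar", "apr", "may", "jun",
        "jul", "aug", "sep", "oct", "nov", "dec"
    };

    // returns twelve inputs: 1 for the active month, 0 for all the others
    public static double[] encodeMonth(String month) {
        double[] encoded = new double[12];
        for (int i = 0; i < MONTHS.length; i++) {
            if (MONTHS[i].equals(month)) {
                encoded[i] = 1.0;
            }
        }
        return encoded;
    }
}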
The final data set, preprocessed in this way, can be downloaded here. Finally, all data are transferred into a .txt file.
Step 2. Create a new Neuroph project
We create a new project in Neuroph Studio by clicking File > New Project; then we choose Neuroph Project and click the 'Next' button. In the new window we define the project name and location. After that we click 'Finish', and the new project is created and will appear in the Projects window, on the left side of Neuroph Studio.
Step 3. Create a Training Set
To create a training set, in the main menu we choose Training > New Training Set to open the training set wizard. Then we enter the name of the training set and the number of inputs and outputs. In this case there will be 29 inputs and 1 output, and we will set the type of training to supervised, as the most common way of neural network training.
As supervised training proceeds, the neural network is taken through a number of iterations, until the output of the neural network matches the anticipated output with a reasonably small error.
After clicking 'Next' we need to insert data into the training set table. All data could be inserted manually, but we have a large number of data instances and it is much easier to load all data directly from our .txt file. We click on 'Choose File' and select the file in which we saved our normalized data set. Values in that file are separated by tabs.
Then we click 'Load' and all data will be loaded into the table. We can see that this table has 30 columns; the first 29 of them represent inputs, and the last one is the output from our data set.
After clicking 'Finish', the new training set will appear in our project.
To be able to decide which is the best solution for our problem we will
create several neural networks, with different sets of parameters, and
most of them will be based on this training set.
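For readers who prefer to work with the Neuroph library directly, the same step can be done in code. A minimal sketch, assuming the Neuroph 2.x API and a hypothetical file name:

import org.neuroph.core.data.DataSet;

public class LoadTrainingSet {
    public static void main(String[] args) {
        // 29 inputs, 1 output, tab-separated values, as in our normalized .txt file
        DataSet trainingSet = DataSet.createFromFile(
                "forest_fires_normalized.txt", 29, 1, "\t");
        System.out.println("Loaded rows: " + trainingSet.getRows().size());
    }
}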
Standard training techniques
Standard
approaches to validation of neural networks are mostly based on
empirical evaluation through simulation
and/or experimental testing. There are several methods for supervised
training of neural networks. The backpropagation algorithm is the most
commonly used training method for artificial neural networks.
Backpropagation is a supervised learning
method. It requires
a data set of the desired output for many inputs, making up the
training set. It is most useful for feed-forward networks (networks
that have no feedback, or simply, that have no connections that loop).
The main idea is to distribute the error across the hidden layers, in proportion to their effect on the output.
Training attempt 1.
Step 4.1. Create a neural network
We create a new neural network by right-clicking on the project and then choosing New > Neural Network. Then we define the neural network name and type. We will choose the 'Multi Layer Perceptron' type. We called this network MultiLayer2.
A multilayer perceptron is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. It consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron with a nonlinear activation function. A multilayer perceptron utilizes a supervised learning technique called backpropagation for training the network. It is a modification of the standard linear perceptron, and it can distinguish data that are not linearly separable.
When we have
selected the type of the perceptron, we can click 'Next'.
A new window will appear, where we will set some more parameters that are characteristic of the multilayer perceptron. The number of input and output neurons is the same as the number of inputs and outputs in the training set. However, now we have to select the number of hidden layers and the number of neurons in each layer. Guided by the rule that problems requiring two hidden layers are rarely encountered (and that there is currently no theoretical reason to use neural networks with more than two hidden layers), we will decide on only one layer.
Using too few neurons in the hidden
layers will result in something
called underfitting. Underfitting occurs when there are too few neurons
in the hidden layers to adequately detect the signals in a complicated
data set.
Using too many neurons in the hidden layers can result in
several
problems. First, too many neurons in the hidden layers may result in
overfitting. Overfitting occurs when the neural network has so much
information processing capacity that the limited amount of information
contained in the training set is not enough to train all of the neurons
in the hidden layers. A second problem can occur even when the training
data is sufficient. An inordinately large number of neurons in the
hidden layers can increase the time it takes to train the network. The
amount of training time can increase to the point that it is impossible
to adequately train the neural network.
For the first attempt, we have opted for 2 hidden neurons. We have checked 'Use Bias Neurons' and chosen the sigmoid transfer function (because the range of our data is 0-1; had it been -1 to 1, we would have checked 'Tanh'). As the learning rule we have chosen 'Backpropagation with Momentum'. This learning rule will be used in all the networks we create, because backpropagation is the most commonly used technique and is well suited to this type of problem. In this method, the objects in the training set are given to the network one by one, in random order, and the weights are updated each time in order to make the current prediction error as small as possible. This process continues until the weights converge. Also, we have chosen to add an extra term, momentum, to the standard backpropagation formula in order to improve the efficiency of the algorithm.
The bias neuron is very important: an error-backpropagation neural network without bias neurons in the hidden layer does not learn. The bias weights control the shape, orientation and steepness of all types of sigmoidal functions through the data mapping space. A bias input always has the value of 1. Without a bias, if all inputs are 0, the only output ever possible would be zero.
Next,
we click 'Finish' and the first neural network is created.
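The same network can also be created in code. A minimal sketch, assuming the Neuroph 2.x API (the file name is our own):

import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.MomentumBackpropagation;
import org.neuroph.util.TransferFunctionType;

public class CreateMultiLayer2 {
    public static void main(String[] args) {
        // 29 inputs, one hidden layer with 2 neurons, 1 output;
        // sigmoid transfer function because our data lies in [0, 1]
        MultiLayerPerceptron network = new MultiLayerPerceptron(
                TransferFunctionType.SIGMOID, 29, 2, 1);

        // Backpropagation with Momentum as the learning rule
        network.setLearningRule(new MomentumBackpropagation());

        network.save("MultiLayer2.nnet");
    }
}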
Step 5.1. Train the neural network
If we choose 'Block View' and look at the top left corner of the view screen, we will see that the training set is empty. To train the neural network we need to put training data in that corner. To do that, we just click on the training set that we created and click 'Train'. A new window will open, where we need to set the learning parameters: learning rate and momentum.
The next thing we should do is determine the values of the learning parameters: the learning rate and the momentum.
The learning rate is a control parameter of training algorithms which controls the step size when weights are iteratively adjusted. To help avoid settling into a local minimum, a momentum rate allows the network to potentially skip through local minima. A momentum rate set at the maximum of 1.0 may result in training which is highly unstable, and thus may not achieve even a local minimum, or the network may take an inordinate amount of training time. If set at a low of 0.0, momentum is not considered and the network is more likely to settle into a local minimum.
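For reference, the standard weight update with momentum, which is what the 'Backpropagation with Momentum' rule implements, is Δw(t) = -η · ∂E/∂w + μ · Δw(t-1), where η is the learning rate and μ is the momentum rate. The second term reuses part of the previous weight change, which is what allows the search to roll through shallow local minima.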
When the Total Net Error value drops below the max error, the training is complete. The smaller the error, the better the approximation.
We will set the maximum error to 0.01, the learning rate to 0.2, and the momentum to 0.7. Then we click on the 'Train' button and the training process starts.
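In code, the same training run looks roughly like this (a sketch under the same Neuroph 2.x assumptions as above):

import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.learning.MomentumBackpropagation;

public class TrainMultiLayer2 {
    public static void main(String[] args) {
        NeuralNetwork network = NeuralNetwork.createFromFile("MultiLayer2.nnet");
        DataSet trainingSet = DataSet.createFromFile(
                "forest_fires_normalized.txt", 29, 1, "\t");

        MomentumBackpropagation rule =
                (MomentumBackpropagation) network.getLearningRule();
        rule.setLearningRate(0.2);
        rule.setMomentum(0.7);
        rule.setMaxError(0.01);

        network.learn(trainingSet); // blocks until the total net error drops below 0.01
    }
}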
After 91,137 iterations, we stopped the training process because we noticed that the error does not decrease over time, and that the neural network has a problem finishing the training. A test is not necessary, because the network did not finish the training process.
Training attempt 2.
Step 5.2. Train the neural network
Before each new training it is necessary to click 'Randomize'. Now we will change the value of the learning rate to 0.4. The other parameters will stay the same. Again we click the 'Train' button.
Similar to a moment ago, after 46,847 iterations the total net error was still not decreasing, so we stopped the training again. For now we can say that this architecture has major problems learning this data, but we will continue.
Training attempt 3.
Step 5.3. Train the neural network
Again we click 'Randomize'. Now the learning rate is 0.6, the momentum is 0.6 and the max error stays the same. The result is: this time the network finished training in 105 iterations. The total net error is satisfactory, and we can now proceed to the test. The data set used for testing will be the already created data set.
Step 6.1. Test the neural network
After the network is trained, we click 'Test' in order to see the total error and all the individual errors. The results show that the total error is 0.05928214112840989. For us, this error is too large. Also, if we look at the individual errors, we will see that there are a lot of large ones. The conclusion is that this architecture is not capable of learning this data, which means that we have to try another architecture.
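The 'Test' button essentially iterates over the data set and accumulates the error. A hand-rolled sketch of the same idea (Neuroph 2.x API assumed, file names our own):

import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class TestNetwork {
    public static void main(String[] args) {
        NeuralNetwork network = NeuralNetwork.createFromFile("MultiLayer2.nnet");
        DataSet testSet = DataSet.createFromFile(
                "forest_fires_normalized.txt", 29, 1, "\t");

        double sumSquaredError = 0;
        for (DataSetRow row : testSet.getRows()) {
            network.setInput(row.getInput());
            network.calculate();
            double predicted = network.getOutput()[0];
            double desired = row.getDesiredOutput()[0];
            sumSquaredError += (desired - predicted) * (desired - predicted);
        }
        System.out.println("Total mean square error: "
                + sumSquaredError / testSet.getRows().size());
    }
}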
Training attempt 4.
Step 4.2. Create a neural network
We now decide on a neural network that contains 5 hidden neurons in one hidden layer. Again, we type in the standard number of inputs and outputs, check 'Use Bias Neurons', choose the sigmoid transfer function, and select 'Backpropagation with Momentum' as the learning rule. We called this network MultiLayer5.
Step 5.4. Train the neural network
Now the neural network that will be used as our second solution to the problem has been created. Just like the previous neural network, we will train this one with the training set we created before. We select 'TSFull', click 'Train', and a new window appears, asking us to fill in the parameters. We set the maximum error to 0.01 and do not limit the maximum number of iterations. As for the learning parameters, the learning rate will be 0.2 and the momentum 0.7. After we click 'Train', the iteration process starts. The network completed training after 6993 iterations. This result already indicates that we are slowly moving in the right direction.
The total net error that we will not tolerate is anything over 5 percent; all values of total net error below this will be acceptable for our case. Let's now test the network.
Step 6.2. Test the neural network
After testing the neural network, we see that the total mean square error is 0.025212139724338424, which is much better than in the previous attempts. We think that this value of total mean square error is a very good result for our problem. Still, we will try to find an even better solution.
The final part of testing the network is testing it with several individual input values. To do that, we will select 5 random instances from our data set. In this file you can find the full set of data arranged randomly (ignore the first column; it serves only as a help for deployment), and in this file you can find the 5 random instances. Now we will carry out testing of the network on these five instances. We will make a new training set in which we enter these five instances. Then we go back to the tab that shows the active neural network and click on the new training set. Finally, we click on the 'Test' button. We got a very high total mean square error. Also, you can see 3 very bad outputs. Because of that, we will continue our training process.
Training attempt 5.
Step 5.5. Train the neural network
We click 'Randomize'. The parameters that we use in this attempt are: max error = 0.01, learning rate = 0.4, momentum = 0.7. The result is: this time the network completed training in 13 iterations. Let's now test the network.
Step 6.3. Test the neural network
After testing, we found that in this attempt the total error is even higher than in the previous case, but still good for our problem. These are the results of testing the 5 random instances: this is a very good result for our case. We do not have large individual errors like in the previous attempt.
Training attempt 6.
Step 5.6. Train the neural network
We click 'Randomize'. The parameters that we use in this attempt are: max error = 0.01, learning rate = 0.7, momentum = 0.7. The result is: in only 10 iterations the training process is finished. Hoping to get even better results, we will do the test with the whole data set.
Step 6.4. Test the neural network
All we need to do is click on the 'Test' button and see the results. The total mean square error is 0.03440339693707508. This is higher than in the previous attempt. Let's try a test with the five random instances. Again we get satisfactory results, which means that this architecture works pretty well. Let's try, in the end, training with a lower value of momentum.
Training attempt 7.
Step 5.7. Train the neural network
Let's try to train the network with a smaller value of momentum. All the parameters remain the same as in the previous attempt, except the momentum, which will now be 0.3. Of course, as before each new training, we click on 'Randomize'. We click on 'Train' and wait for the results. The process stopped after 13 iterations. That is worse than the previous attempt, but let's see the test results.
Step 6.5. Test the neural network
After testing the neural network, we see that in this attempt the total mean square error is 0.032681833816451226, which is better than the error in the previous attempt. Now we will test our 5 random instances. Here are the results: this architecture is much better than the previous one, but we will continue to seek the best solution. We are still going to try to find out whether it is possible to get an even better testing result, with a lower total error, using another network architecture.
Training attempt 8
Step 4.3. Create the neural network
Now we try to train a neural network that has more neurons than in the previous case. We will try 9 hidden neurons in one layer. Like the previous two times, we decided to use the backpropagation with momentum algorithm. The name of this network is MultiLayer9. The training set that we will use in this case is 'TSFull'.
Step 5.8. Train the neural network
First we will try the following parameters: max error = 0.01, learning rate = 0.2, momentum = 0.7. The result is: we can see that the training process finished in 837 iterations. Let's see the test results.
Step 6.6. Test the neural network
We click 'Test' and then we can see the testing results for this type of neural network architecture. In this case we did not get a good result. The total mean square error is not below 5 percent, so we cannot accept this result. Now we will test the 5 random instances. This is a very bad result. We can see 3 big errors (one of them is 1!), so we conclude that this set of parameters does not work. In the following table you can see our next few attempts for this network.
Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Maximum error | Learning rate | Momentum | Number of iterations | Total mean square error | 5 random inputs test - number of correct guesses | Network trained |
9. | 9 | 1 | full | 0.01 | 0.4 | 0.7 | 28 | 0.0333 | 5/5 | yes |
10. | 9 | 1 | full | 0.01 | 0.7 | 0.7 | 19 | 0.03495 | 5/5 | yes |
11. | 9 | 1 | full | 0.01 | 0.5 | 0.3 | 23 | 0.03288 | 4/5 | yes |
12. | 9 | 1 | full | 0.01 | 0.7 | 0.1 | 18 | 0.0331 | 5/5 | yes |
13. | 9 | 1 | full | 0.01 | 0.7 | 0 | 14 | 0.03282 | 5/5 | yes |
From this table we see that this architecture works well. It gives equally good results for different data, which leads us to conclude that 9 neurons are enough to learn this data. It is interesting that we got the best result when the momentum was 0. Now we will try to train a network which contains more than one hidden layer.
Training attempt 14.
Step 4.4. Create the neural network
In this attempt, we will create a different type of neural network. We want to see what happens if we create a neural network with two hidden layers. First we create a new neural network; the type will be Multi Layer Perceptron, as it was in the previous attempts. We called the network MultiLayer4 3. Now we have to set the network parameters. We will set 4 neurons on the first layer and 3 on the second. The learning rule will be Backpropagation with Momentum.
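In code, the only change compared to the single-hidden-layer networks is the extra entry in the layer list. A sketch under the same Neuroph 2.x assumptions as before:

import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.util.TransferFunctionType;

public class CreateMultiLayer4_3 {
    public static void main(String[] args) {
        // 29 inputs, two hidden layers with 4 and 3 neurons, 1 output
        MultiLayerPerceptron network = new MultiLayerPerceptron(
                TransferFunctionType.SIGMOID, 29, 4, 3, 1);
        network.save("MultiLayer4_3.nnet");
    }
}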
Step 5.14. Train the neural network
As we have already learned how to train and test neural networks, in the following table we present the results of training this network. The training set that we use is 'TSFull'.
Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Maximum error | Learning rate | Momentum | Number of iterations | Total mean square error | 5 random inputs test - number of correct guesses | Network trained |
14. | 4 3 | 2 | full | 0.01 | 0.2 | 0.7 | 690 | 0.0556 | 2/5 | no |
15. | 4 3 | 2 | full | 0.01 | 0.4 | 0.7 | 3408 | 0.04141 | 1/5 | no |
16. | 4 3 | 2 | full | 0.01 | 0.7 | 0.7 | did not finish (30000) | / | / | no |
17. | 4 3 | 2 | full | 0.01 | 0.3 | 0.4 | 961 | 0.0601 | 1/5 | no |
After these four attempts we can see that an architecture with more hidden layers does not suit our problem. In this case the network failed to score more than 2 correct outputs in any attempt. So we do not recommend using networks with multiple hidden layers to solve this problem.
Advanced training techniques
Neural networks represent a
class of systems that do not fit into the current paradigms of software
development and
certification. Instead of being programmed, a learning algorithm
“teaches” a neural network using a set of data. Often,
because of the non-deterministic result of the adaptation, the neural
network is considered a “black box” and its
response may not be predictable. Testing the neural network with
similar data as that used in the training set is one of
the few methods used to verify that the network has adequately learned
the input domain.
In most instances, such
traditional testing techniques prove adequate for the acceptance of a
neural network system.
However, in more complex, safety- and mission-critical systems, the
standard neural network training-testing approach
is not able to provide a reliable method for their certification.
One
of the major advantages of neural networks is their ability to
generalize. This means that a trained network could classify data from
the same class as the learning data that it has never seen before. In
real world applications developers normally have only a small part of
all possible patterns for the generation of a neural network. To reach
the best generalization, the data set should be split into three parts:
validation, training and testing set.
The validation set contains a smaller percentage of instances from the initial data set, and is used to determine whether the selected network architecture is good enough. Only if validation is successful do we proceed to the training. The training set is applied to the neural network for learning and adaptation. The testing set is then used to determine the performance of the neural network by computing an error metric. This validating-training-testing approach is the first, and often the only, option system developers consider for the assessment of a neural network. The assessment is accomplished by the repeated application of neural network training data, followed by an application of neural network testing data to determine whether the neural network is acceptable.
Training attempt 18.
Step 3.3. Create a Training Set
The idea of this attempt is to use only a part of the data set when training the network, and then test the network with inputs from the other, unused part of the data set. That way we can determine whether the neural network has the power of generalization.
In the initial training set we have 493 instances. In this attempt we will create a new training set that contains only 10% of the initial data set instances, picked randomly. First we have to create a new file that contains the new data set instances. The new data set has 49 instances. Then, in Neuroph Studio, we create a new training set, which we called 'TS10', with the same parameters that we used for the first one, and load the data from the new file. We will also create a training set, which we called 'TS90', that contains the remaining 90% of the instances, which we will use for testing the network later in this attempt. This training set contains 444 instances.
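A sketch of how such a random 10%/90% split could be produced in code (plain Java shuffling; the file and set names are our own):

import java.util.Collections;
import java.util.List;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class SplitDataSet {
    public static void main(String[] args) {
        DataSet full = DataSet.createFromFile(
                "forest_fires_normalized.txt", 29, 1, "\t");
        List<DataSetRow> rows = full.getRows();
        Collections.shuffle(rows); // random order, like our 'Random' file

        int cut = (int) Math.round(rows.size() * 0.10); // first 10% for training
        DataSet ts10 = new DataSet(29, 1);
        DataSet ts90 = new DataSet(29, 1);
        for (int i = 0; i < rows.size(); i++) {
            (i < cut ? ts10 : ts90).addRow(rows.get(i));
        }
        ts10.save("TS10.tset");
        ts90.save("TS90.tset");
    }
}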
The
final results of this training attempt are shown in Table 2.
Step 5.18. Train the neural network
Unlike the previous attempts, now we will train a neural network which has already been created, but this time with the newly created training set which contains 10% of the instances of the initial training set. For this training we will use the neural network which has 9 hidden neurons (MultiLayer9). The learning rate will be 0.2 and the momentum 0.7 in this case. We click on the 'Train' button and wait for the training process to finish. As we can see in the image above, the network was successfully trained. It took only 2 iterations for the training process to finish. Let's see the test results.
Step 6.15. Test the neural network
After the successful training, we can now test the neural network. First, we will test the network with the training set that contains only 10% of the initial training set instances. We got that in this case the total error is 0.010952389892228815, which is a very good result for our problem. But the idea was to test the neural network with the other 90% of the data set that wasn't used for training. So now we will try to do that kind of test. This time, for testing, we will use the training set that contains the remaining 90% of instances that weren't used for training ('TS90').
When the testing process has completed, we can see that the total error is 0.03775515556170865, which is not so bad considering the fact that we have tested the network with data that was not used during the training. Now we will analyze the individual errors by selecting some random inputs to see whether the network predicted the output well in all cases. We randomly chose 5 observations which will be subjected to individual testing (the 'Random5' data set). Those observations and their testing results are in the following picture: as we can see in the table, the network guessed correctly in all 5 cases, with only small deviations. This shows that this type of network has a good ability to generalize.
Training attempt 19.
Step 3.4. Create a Training Set
In this training attempt we will create three different data sets from the initial data set. The first data set will be used for the validation of the neural network, the second for training and the third for testing the network.
- Validation set: 10% of instances - 49 randomly selected observations (the first 49 instances in our 'Random' file) - 'TS10' training set
- Training set: 70% of instances - 345 randomly selected observations (which includes the already created 10%) - 'TS70' training set
- Testing set: 30% of instances that do not appear in the previous two data sets (148 randomly selected observations) - 'TS30' training set
The final results of this training attempt are shown in Table 2.
Step 5.19. Validate and Train the neural network
First we need to do a validation of the network by using a smaller set of data, so we can check whether such a network architecture is suitable for our problem; if so, then we can train the network with a larger data set. The neural network that we use in this attempt will be our 'MultiLayer9'.
We will train the network with the validation data set that contains 10% of the observations. We will set the maximum error to 0.01, the learning rate to 0.2 and the momentum to 0.7. Then we click on 'Train' and the training starts. The process ends after only 2 iterations.
Based on the validation, we can conclude that this type of neural network architecture is appropriate, but it is also necessary to train the network with a larger set of data so we can be sure.
We will again train this network, but this time with the training set that contains 70% of the instances. The learning parameters will remain the same as they were during the validation.
We can see that the training process finished after 540 iterations, which is good. Now we want to see the results of testing on the other 30% of the data, which does not appear in this training data.
Step 6.16. Test the neural network
As we already said, for this test we use our 'TS30' training set. All we need to do is click on this training set and go to 'Test'. We did not get the results we wanted; this level of total mean square error is too high for us. In the following table we present the next few attempts, trying to get better results (below 5%).
Training attempt | Number of hidden neurons | Number of hidden layers | Validation set | Training set | Testing set | Maximum error | Learning rate | Momentum | Success validation | Number of iterations during training | Total mean square error | 5 random inputs test | Network trained |
20. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.4 | 0.7 | yes | 270 | 0.0653 | / | no |
21. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.7 | 0.7 | yes | 146 | 0.0629 | / | no |
22. | 4 3 | 2 | 10% | 70% | 30% | 0.01 | 0.2 | 0.7 | yes | 9207 | 0.06796 | / | no |
23. | 4 3 | 2 | 10% | 70% | 30% | 0.01 | 0.4 | 0.3 | yes | 1256 | 0.0700 | / | no |
In this table you can see that this way of training is very difficult for our case. We tried several architectures and parameter sets, but we did not get the total net error below 5%. We are left to try a network that is different from all the previous types.
Training attempt 24.
Step 4.5. Create a neural network
In this attempt, we will create a different type of neural network. First we create a new neural network; the type will be Multi Layer Perceptron, as it was in the previous attempts. We called the network MultiLayer8 4. Now we have to set the network parameters. We will set 8 neurons on the first layer and 4 on the second. The learning rule will be Backpropagation. Please note that we will try training without momentum.
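A sketch of this network with plain backpropagation (no momentum term), again assuming the Neuroph 2.x API:

import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.BackPropagation;
import org.neuroph.util.TransferFunctionType;

public class CreateMultiLayer8_4 {
    public static void main(String[] args) {
        // 29 inputs, two hidden layers with 8 and 4 neurons, 1 output
        MultiLayerPerceptron network = new MultiLayerPerceptron(
                TransferFunctionType.SIGMOID, 29, 8, 4, 1);

        // plain backpropagation: no momentum term in the weight update
        network.setLearningRule(new BackPropagation());

        network.save("MultiLayer8_4.nnet");
    }
}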
Step 5.24. Validate and Train the neural network
As we did before, first we must validate this network. For this we will use our 'TS10'. The max error will be 0.01 and the learning rate 0.2. We go to 'Train'. As you can see, the validation process is a success. That means that we can go on to the main training. All we need to do is choose 'TS70', set the learning rate to 0.7 and click 'Train'. After 1806 iterations the training process is finished. Let's see the test with the remaining 30%.
Step 6.21. Test the neural network
Finally, a result whose total mean square error is below 5%. Based on this, we think that for this type of testing we do not need the Backpropagation with Momentum algorithm. Also, when it comes to advanced training techniques, it is better to use a two-layer architecture.
Training attempt 25.
Step 3.5. Create a training set
In this training attempt we will also show the use of advanced training techniques. Three training sets will be created: a validation, a training, and a testing set.
- Validation set: 30% of instances - 148 randomly selected observations (the first 148 instances from the 'Random' file) - 'TS30Val' training set
- Training set: 80% of instances - 394 randomly selected observations (which includes the already created 30%) - 'TS80' training set
- Testing set: 20% of instances that do not appear in the previous data sets (99 randomly selected observations) - 'TS20' training set
Now we use 30% of the instances for validation, because in the previous attempt the validation process was very short (only 2 iterations). This is good, but we want a safer validation.
The
final results of this training attempt are shown in Table 2.
Step 5.25. Validate and Train the neural network
Again, first we need to do a validation of the network by using a smaller set of data, so we can check whether such a network architecture is suitable for our problem; if so, then we can train the network with a larger data set. In this attempt we will use the MultiLayer8 4 architecture, like in the previous attempt.
We will train the neural network with the validation data set that contains 30% of the instances. The maximum error will be 0.01 and the learning rate 0.2. The training process ends after 13 iterations.
Then we need to train the neural network with the training set that contains 80% of the instances, with a learning rate of 0.7. After 1246 iterations the process ends, and we can see that the training was successful.
Step 6.22. Test the neural network
Now we need to test this neural network in order to see the results. The training set that we use in this case is 'TS20'. The total mean square error in this case is significantly lower than in the previous attempt: 0.03599245182360657. We think that this is the best solution that can be obtained for this case, but we will try the other architectures. The results are shown in the table below.
Training attempt | Number of hidden neurons | Number of hidden layers | Validation set | Training set | Testing set | Maximum error | Learning rate | Momentum | Success validation | Number of iterations during training | Total mean square error | 5 random inputs test | Network trained |
26. | 9 | 1 | 30% | 80% | 20% | 0.01 | 0.2 | 0.6 | yes | 944 | 0.0972 | / | no |
27. | 4 3 | 2 | 30% | 80% | 20% | 0.01 | 0.2 | 0.3 | yes | 540 | 0.0754 | / | no |
28. | 4 3 | 2 | 30% | 80% | 20% | 0.01 | 0.5 | 0.6 | yes | 1485 | 0.0591 | / | no |
After this we can state that for advanced training techniques it is better, in our case, to use networks with 2 hidden layers and the plain Backpropagation algorithm.
Training attempt 29.
Step 3.6. Create a training set
- Validation set: 30% of instances - 148 randomly selected observations (the first 148 instances from the 'Random' file) - 'TS30Val' training set
- Training set: 60% of instances - 296 randomly selected observations (which includes the already created 30%) - 'TS60' training set
- Testing set: 40% of instances that do not appear in the previous data sets (197 randomly selected observations) - 'TS40' training set
Finally, we will try training with a 60-40 combination. For validation we will use 30 percent of the data, as in the previous attempt. For training we will use 60 percent, and for the test we will use the remaining 40 percent of the data. The results are shown in the table below.
Training attempt | Number of hidden neurons | Number of hidden layers | Validation set | Training set | Testing set | Maximum error | Learning rate | Momentum | Success validation | Number of iterations during training | Total mean square error | 5 random inputs test | Network trained |
29. | 9 | 1 | 30% | 60% | 40% | 0.01 | 0.7 | 0.7 | yes | 142 | 0.0508 | / | no |
30. | 4 3 | 2 | 30% | 60% | 40% | 0.01 | 0.4 | 0.7 | yes | 80 | 0.0537 | / | no |
31. | 8 4 | 2 | 30% | 60% | 40% | 0.01 | 0.7 | / | yes | 1326 | 0.0410 | / | yes |
As in the previous 2 attempts, we can see that the 8 4 architecture (the algorithm without momentum) shows the best results. This confirms our expectations.
Conclusion
During this experiment, we have created several different architectures of neural networks. We wanted to find out what is the most important thing to do during neural network training in order to get the best results.
What proved to be crucial to the success of the training is the selection of an appropriate number of hidden neurons when creating a new neural network. One hidden layer proved to be sufficient for training success in most cases. As it turned out, in our experiment it was better to use more neurons. In the standard techniques we also tried a double-layer architecture, but it did not show good results.
Also, through the various tests we have demonstrated the sensitivity of neural networks to high and low values of the learning parameters. We have shown the difference between standard and advanced training techniques. In this tutorial we have also given guidance on how to test the network. If you have patience, knowledge and a little luck, you may get better results than we did. We wish you luck.
The final results of our experiment are given in the two tables below. The first table (Table 1) contains the results obtained using standard training techniques, and the second table (Table 2) the results obtained using advanced training techniques. The best solutions are indicated by a green background.
Table 1. Standard training techniques
Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Maximum error | Learning rate | Momentum | Number of iterations | Total mean square error | 5 random inputs test - number of correct guesses | Network trained |
1. | 2 | 1 | full | 0.01 | 0.2 | 0.7 | did not finish (9117) | / | / | no |
2. | 2 | 1 | full | 0.01 | 0.4 | 0.7 | did not finish (46847) | / | / | no |
3. | 2 | 1 | full | 0.01 | 0.6 | 0.6 | 105 | 0.0593 | / | no |
4. | 5 | 1 | full | 0.01 | 0.2 | 0.7 | 6993 | 0.0252 | 2/5 | no |
5. | 5 | 1 | full | 0.01 | 0.4 | 0.7 | 13 | 0.0332 | 5/5 | yes |
6. | 5 | 1 | full | 0.01 | 0.7 | 0.7 | 10 | 0.0344 | 5/5 | yes |
7. | 5 | 1 | full | 0.01 | 0.7 | 0.3 | 13 | 0.0327 | 5/5 | yes |
8. | 9 | 1 | full | 0.01 | 0.2 | 0.7 | 837 | 0.0596 | 2/5 | no |
9. | 9 | 1 | full | 0.01 | 0.4 | 0.7 | 28 | 0.0333 | 5/5 | yes |
10. | 9 | 1 | full | 0.01 | 0.7 | 0.7 | 19 | 0.03495 | 5/5 | yes |
11. | 9 | 1 | full | 0.01 | 0.5 | 0.3 | 23 | 0.0329 | 4/5 | yes |
12. | 9 | 1 | full | 0.01 | 0.7 | 0.1 | 18 | 0.0331 | 5/5 | yes |
13. | 9 | 1 | full | 0.01 | 0.7 | 0 | 14 | 0.0328 | 5/5 | yes |
14. | 4 3 | 2 | full | 0.01 | 0.2 | 0.7 | 690 | 0.0556 | 2/5 | no |
15. | 4 3 | 2 | full | 0.01 | 0.4 | 0.7 | 3408 | 0.0414 | 1/5 | no |
16. | 4 3 | 2 | full | 0.01 | 0.7 | 0.7 | did not finish (30000) | / | / | no |
17. | 4 3 | 2 | full | 0.01 | 0.3 | 0.4 | 961 | 0.0601 | 1/5 | no |
Table 2. Advanced training techniques
Training attempt | Number of hidden neurons | Number of hidden layers | Validation set | Training set | Testing set | Maximum error | Learning rate | Momentum | Success validation | Number of iterations during training | Total mean square error | 5 random inputs test | Network trained |
18. | 9 | 1 | / | 10% | 90% | 0.01 | 0.2 | 0.7 | / | 2 | 0.0377 | 5/5 | yes |
19. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.2 | 0.7 | yes | 540 | 0.0522 | / | no |
20. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.4 | 0.7 | yes | 270 | 0.0653 | / | no |
21. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.7 | 0.7 | yes | 146 | 0.0629 | / | no |
22. | 4 3 | 2 | 10% | 70% | 30% | 0.01 | 0.2 | 0.7 | yes | 9207 | 0.06796 | / | no |
23. | 4 3 | 2 | 10% | 70% | 30% | 0.01 | 0.4 | 0.3 | yes | 1256 | 0.0700 | / | no |
24. | 8 4 | 2 | 10% | 70% | 30% | 0.01 | 0.7 | / | yes | 1806 | 0.0471 | / | yes |
25. | 8 4 | 2 | 30% | 80% | 20% | 0.01 | 0.7 | / | yes | 1246 | 0.03599 | / | yes |
26. | 9 | 1 | 30% | 80% | 20% | 0.01 | 0.2 | 0.6 | yes | 944 | 0.0972 | / | no |
27. | 4 3 | 2 | 30% | 80% | 20% | 0.01 | 0.2 | 0.3 | yes | 540 | 0.0754 | / | no |
28. | 4 3 | 2 | 30% | 80% | 20% | 0.01 | 0.5 | 0.6 | yes | 1485 | 0.0591 | / | no |
29. | 9 | 1 | 30% | 60% | 40% | 0.01 | 0.7 | 0.7 | yes | 142 | 0.0508 | / | no |
30. | 4 3 | 2 | 30% | 60% | 40% | 0.01 | 0.4 | 0.7 | yes | 80 | 0.0537 | / | no |
31. | 8 4 | 2 | 30% | 60% | 40% | 0.01 | 0.7 | / | yes | 1326 | 0.0410 | / | yes |
Download
See also:
Multi Layer Perceptron Tutorial