An example of predicting the burned area of forest fires in the northeast region of Portugal, using meteorological and other data
Introduction
Neural networks have seen an explosion of
interest over the last few years, and are being successfully applied
across an extraordinary range of problem domains, in areas as diverse
as finance, medicine, engineering, geology and physics. Indeed,
anywhere that there are problems of prediction, classification or
control, neural networks are being introduced. This sweeping success
can be attributed to a few key factors:
- Neural
networks are very sophisticated modeling techniques capable of modeling
extremely complex functions.
- Neural networks
learn by example. The neural network user gathers representative data,
and then invokes training algorithms to automatically learn the
structure of the data. Although the user does need to have some
heuristic knowledge of how to select and prepare data, how to select an
appropriate neural network, and how to interpret the results, the level
of user knowledge needed to successfully apply neural networks is much
lower than would be the case using (for example) some more traditional
nonlinear statistical methods.
- Neural networks
are also intuitively appealing, based as they are on a crude low-level
model of biological neural systems. In the future, the development of
this neurobiological modeling may lead to genuinely intelligent
computers.
In this experiment we will show how neural networks and Neuroph Studio can be applied to a problem of this kind. Several architectures will be tried out, and we will determine which of them represent a good solution to the problem and which do not.
Introduction to the problem
Our task is to use the twelve inputs (in the original data set) to predict the burned area of forest fires. The output "area" was first transformed with a ln(x+1) function. Then, several data mining methods were applied. After fitting the models, the outputs were post-processed with the inverse of the ln(x+1) transform. Four different input setups were used. The experiments were conducted using 10-fold cross-validation with 30 runs. Two regression metrics were measured: MAD and RMSE. A Gaussian support vector machine (SVM) fed with only 4 direct weather conditions (temp, RH, wind and rain) obtained the best MAD value: 12.71 ± 0.01 (mean and 95% confidence interval using a Student's t-distribution). The best RMSE was attained by the naive mean predictor. An analysis of the regression error characteristic (REC) curve shows that the SVM model predicts more examples within a lower admitted error. In effect, the SVM model predicts small fires, which are the majority, better.
The data set contains 517 instances and 12 inputs, plus one output attribute. They are:
1. X - x-axis spatial coordinate within the Montesinho park map: 1 to 9
2. Y - y-axis spatial coordinate within the Montesinho park map: 2 to 9
3. month - month of the year: "jan" to "dec"
4. day - day of the week: "mon" to "sun"
5. FFMC - FFMC index from the FWI system: 18.7 to 96.20
6. DMC - DMC index from the FWI system: 1.1 to 291.3
7. DC - DC index from the FWI system: 7.9 to 860.6
8. ISI - ISI index from the FWI system: 0.0 to 56.10
9. temp - temperature in Celsius degrees: 2.2 to 33.30
10. RH - relative humidity in %: 15.0 to 100
11. wind - wind speed in km/h: 0.40 to 9.40
12. rain - outside rain in mm/m2: 0.0 to 6.4
13. area - the burned area of the forest (in ha): 0.00 to 1090.84 (the output)
The data set can be downloaded here; however, it cannot be used in Neuroph in its original form. For it to be able to help us with this problem, we need to normalize the data first. The type of neural network that will be used in this experiment is a multilayer perceptron with backpropagation.
Procedure of training a neural network
In order to train a neural network, there are six steps to be taken:
- Normalize the data
- Create a Neuroph project
- Create a training set
- Create a neural network
- Train the network
- Test the network to make sure that it is trained properly
In this experiment we will demonstrate the use of some standard and advanced training techniques. Several architectures will be tried out, on the basis of which we will determine what brings the best results for our problem.
Step 1. Data Normalization
In order to train the neural network, this data set has to be normalized. Normalization implies that all values from the data set should take values in the range from 0 to 1. For that purpose we use the following min-max formula:
Xn = (X - Xmin) / (Xmax - Xmin)
Where:
X – value that should be normalized
Xn – normalized value
Xmin – minimum value of X
Xmax – maximum value of X
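As an illustration, here is a minimal Java sketch of this min-max normalization (the class and method names are our own, not part of Neuroph):

public class Normalizer {
    // scale a value into [0, 1] using the column's minimum and maximum
    public static double normalize(double x, double xMin, double xMax) {
        return (x - xMin) / (xMax - xMin);
    }

    public static void main(String[] args) {
        // example: a temperature of 20.0 with the observed range 2.2 to 33.3
        System.out.println(normalize(20.0, 2.2, 33.3)); // prints ~0.5723
    }
}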
In our case, all attribute values are real numbers, except the months and days. Therefore, we replaced the input month with twelve new inputs, one for each month, where the value 1 indicates the active month and all the other values are 0. We have done the same with the days, using seven new inputs. This raises the number of inputs from 12 to 29 (10 numeric attributes, 12 month inputs and 7 day inputs). In addition, we dropped the rows with exceptionally large output values, to allow a more precise analysis.
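As an illustration, a sketch of the month encoding described above (our own helper, not part of Neuroph; the day of the week is handled the same way with seven inputs):

public class MonthEncoder {
    private static final String[] MONTHS = {
        "jan", "feb", "mar", "apr", "may", "jun",
        "jul", "aug", "sep", "oct", "nov", "dec"
    };

    // returns twelve inputs: 1 for the active month, 0 for all the others
    public static double[] encodeMonth(String month) {
        double[] encoded = new double[12];
        for (int i = 0; i < MONTHS.length; i++) {
            if (MONTHS[i].equals(month)) {
                encoded[i] = 1.0;
            }
        }
        return encoded;
    }
}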
The final data set, preprocessed in this way, can be downloaded here. Finally, all data are transferred into a .txt file.
Step 2. Create a new Neuroph project
We create a new project in Neuroph Studio by clicking File > New Project; then we choose Neuroph Project and click the 'Next' button. In the new window we define the project name and location. After that we click 'Finish', and the new project is created and will appear in the Projects window, on the left side of Neuroph Studio.
Step 3. Create a Training Set
To create a training set, in the main menu we choose Training > New Training Set to open the training set wizard. Then we enter the name of the training set and the number of inputs and outputs. In this case there will be 29 inputs and 1 output, and we will set the type of training to supervised, as the most common way of neural network training.
As supervised training proceeds, the neural network is taken through a number of iterations, until the output of the neural network matches the anticipated output with a reasonably small error.
After clicking 'Next' we need to insert data into the training set table. All data could be inserted manually, but we have a large number of data instances and it is much easier to load all data directly from our .txt file. We click on 'Choose File' and select the file in which we saved our normalized data set. Values in that file are separated by tabs.
Then we click 'Load' and all data will be loaded into the table. We can see that this table has 30 columns; the first 29 of them represent inputs, and the last one is the output from our data set.
After clicking 'Finish', the new training set will appear in our project.
To be able to decide which is the best solution for our problem we will
create several neural networks, with different sets of parameters, and
most of them will be based on this training set.
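For readers who prefer to work with the Neuroph library directly, the same step can be done in code. A minimal sketch, assuming the Neuroph 2.x API and a hypothetical file name:

import org.neuroph.core.data.DataSet;

public class LoadTrainingSet {
    public static void main(String[] args) {
        // 29 inputs, 1 output, tab-separated values, as in our normalized .txt file
        DataSet trainingSet = DataSet.createFromFile(
                "forest_fires_normalized.txt", 29, 1, "\t");
        System.out.println("Loaded rows: " + trainingSet.getRows().size());
    }
}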
Standard training techniques
Standard
approaches to validation of neural networks are mostly based on
empirical evaluation through simulation
and/or experimental testing. There are several methods for supervised
training of neural networks. The backpropagation algorithm is the most
commonly used training method for artificial neural networks.
Backpropagation is a supervised learning
method. It requires
a data set of the desired output for many inputs, making up the
training set. It is most useful for feed-forward networks (networks
that have no feedback, or simply, that have no connections that loop).
The main idea is to distribute the error across the hidden layers, in proportion to their effect on the output.
Training attempt 1.
Step 4.1. Create a neural network
We create a new neural network by right-clicking on the project and then choosing New > Neural Network. Then we define the neural network name and type. We will choose the 'Multi Layer Perceptron' type. We called this network MultiLayer2.
A multilayer perceptron is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. It consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron with a nonlinear activation function. A multilayer perceptron utilizes a supervised learning technique called backpropagation for training the network. It is a modification of the standard linear perceptron, and it can distinguish data that are not linearly separable.
When we have
selected the type of the perceptron, we can click 'Next'.
A new window will appear, where we will set some more parameters that are characteristic of the multilayer perceptron. The number of input and output neurons is the same as the number of inputs and outputs in the training set. However, now we have to select the number of hidden layers and the number of neurons in each layer. Guided by the rule that problems requiring two hidden layers are rarely encountered (and that there is currently no theoretical reason to use neural networks with more than two hidden layers), we will decide on only one layer.
Using too few neurons in the hidden
layers will result in something
called underfitting. Underfitting occurs when there are too few neurons
in the hidden layers to adequately detect the signals in a complicated
data set.
Using too many neurons in the hidden layers can result in
several
problems. First, too many neurons in the hidden layers may result in
overfitting. Overfitting occurs when the neural network has so much
information processing capacity that the limited amount of information
contained in the training set is not enough to train all of the neurons
in the hidden layers. A second problem can occur even when the training
data is sufficient. An inordinately large number of neurons in the
hidden layers can increase the time it takes to train the network. The
amount of training time can increase to the point that it is impossible
to adequately train the neural network.
For the first attempt, we have opted for 2 hidden neurons. We have checked 'Use Bias Neurons' and chosen the sigmoid transfer function (because the range of our data is 0-1; had it been -1 to 1, we would have checked 'Tanh'). As the learning rule we have chosen 'Backpropagation with Momentum'. This learning rule will be used in all the networks we create, because backpropagation is the most commonly used technique and is well suited to this type of problem. In this method, the objects in the training set are given to the network one by one, in random order, and the weights are updated each time in order to make the current prediction error as small as possible. This process continues until the weights converge. Also, we have chosen to add an extra term, momentum, to the standard backpropagation formula in order to improve the efficiency of the algorithm.
The bias neuron is very important: an error-backpropagation neural network without bias neurons in the hidden layer does not learn. The bias weights control the shape, orientation and steepness of all types of sigmoidal functions through the data mapping space. A bias input always has the value of 1. Without a bias, if all inputs are 0, the only output ever possible would be zero.
Next,
we click 'Finish' and the first neural network is created.
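The same network can also be created in code. A minimal sketch, assuming the Neuroph 2.x API (the file name is our own):

import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.MomentumBackpropagation;
import org.neuroph.util.TransferFunctionType;

public class CreateMultiLayer2 {
    public static void main(String[] args) {
        // 29 inputs, one hidden layer with 2 neurons, 1 output;
        // sigmoid transfer function because our data lies in [0, 1]
        MultiLayerPerceptron network = new MultiLayerPerceptron(
                TransferFunctionType.SIGMOID, 29, 2, 1);

        // Backpropagation with Momentum as the learning rule
        network.setLearningRule(new MomentumBackpropagation());

        network.save("MultiLayer2.nnet");
    }
}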
Step 5.1. Train the neural network
If we choose 'Block View' and look at the top left corner of the view screen, we will see that the training set is empty. To train the neural network we need to put training data in that corner. To do that, we just click on the training set that we created and click 'Train'. A new window will open, where we need to set the learning parameters: learning rate and momentum.
The next thing we should do is determine the values of the learning parameters: the learning rate and the momentum.
The learning rate is a control parameter of training algorithms which controls the step size when weights are iteratively adjusted. To help avoid settling into a local minimum, a momentum rate allows the network to potentially skip through local minima. A momentum rate set at the maximum of 1.0 may result in training which is highly unstable, and thus may not achieve even a local minimum, or the network may take an inordinate amount of training time. If set at a low of 0.0, momentum is not considered and the network is more likely to settle into a local minimum.
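For reference, the standard weight update with momentum, which is what the 'Backpropagation with Momentum' rule implements, is Δw(t) = -η · ∂E/∂w + μ · Δw(t-1), where η is the learning rate and μ is the momentum rate. The second term reuses part of the previous weight change, which is what allows the search to roll through shallow local minima.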
When the Total Net Error value drops below the max error, the training is complete. The smaller the error, the better the approximation.
We will set the maximum error to 0.01, the learning rate to 0.2, and the momentum to 0.7. Then we click on the 'Train' button and the training process starts.
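In code, the same training run looks roughly like this (a sketch under the same Neuroph 2.x assumptions as above):

import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.learning.MomentumBackpropagation;

public class TrainMultiLayer2 {
    public static void main(String[] args) {
        NeuralNetwork network = NeuralNetwork.createFromFile("MultiLayer2.nnet");
        DataSet trainingSet = DataSet.createFromFile(
                "forest_fires_normalized.txt", 29, 1, "\t");

        MomentumBackpropagation rule =
                (MomentumBackpropagation) network.getLearningRule();
        rule.setLearningRate(0.2);
        rule.setMomentum(0.7);
        rule.setMaxError(0.01);

        network.learn(trainingSet); // blocks until the total net error drops below 0.01
    }
}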
After 91,137 iterations, we stopped the training process because we noticed that the error does not decrease over time, and that the neural network has a problem finishing the training. A test is not necessary, because the network did not finish the training process.
Training attempt 2.
Step 5.2. Train the neural network
Before each new training it is necessary to click 'Randomize'. Now we will change the value of the learning rate to 0.4. The other parameters will stay the same. Again we click the 'Train' button.
Similar to a moment ago, after 46,847 iterations the total net error was still not decreasing, so we stopped the training again. For now we can say that this architecture has major problems learning this data, but we will continue.
Training attempt 3.
Step 5.3. Train the neural network
Again we click 'Randomize'. Now the learning rate is 0.6, the momentum is 0.6 and the max error stays the same. The result is: this time the network finished training in 105 iterations. The total net error is satisfactory, and we can now proceed to the test. The data set used for testing will be the already created data set.
Step 6.1. Test the neural network
After the network is trained, we click 'Test' in order to see the total error and all the individual errors. The results show that the total error is 0.05928214112840989. For us, this error is too large. Also, if we look at the individual errors, we will see that there are a lot of large ones. The conclusion is that this architecture is not capable of learning this data, which means that we have to try another architecture.
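The 'Test' button essentially iterates over the data set and accumulates the error. A hand-rolled sketch of the same idea (Neuroph 2.x API assumed, file names our own):

import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class TestNetwork {
    public static void main(String[] args) {
        NeuralNetwork network = NeuralNetwork.createFromFile("MultiLayer2.nnet");
        DataSet testSet = DataSet.createFromFile(
                "forest_fires_normalized.txt", 29, 1, "\t");

        double sumSquaredError = 0;
        for (DataSetRow row : testSet.getRows()) {
            network.setInput(row.getInput());
            network.calculate();
            double predicted = network.getOutput()[0];
            double desired = row.getDesiredOutput()[0];
            sumSquaredError += (desired - predicted) * (desired - predicted);
        }
        System.out.println("Total mean square error: "
                + sumSquaredError / testSet.getRows().size());
    }
}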
Training attempt 4.
Step 4.2. Create a neural network
We now decide on a neural network that contains 5 hidden neurons in one hidden layer. Again, we type in the standard number of inputs and outputs, check 'Use Bias Neurons', choose the sigmoid transfer function, and select 'Backpropagation with Momentum' as the learning rule. We called this network MultiLayer5.
Step 5.4. Train the neural network
Now the neural network that will be used as our second solution to the problem has been created. Just like the previous neural network, we will train this one with the training set we created before. We select 'TSFull', click 'Train', and a new window appears, asking us to fill in the parameters. We set the maximum error to 0.01 and do not limit the maximum number of iterations. As for the learning parameters, the learning rate will be 0.2 and the momentum 0.7. After we click 'Train', the iteration process starts. The network completed training after 6993 iterations. This result already indicates that we are slowly moving in the right direction.
The total net error that we will not tolerate is anything over 5 percent; all values of total net error below this will be acceptable for our case. Let's now test the network.
Step 6.2. Test the neural network
After testing the neural network, we see that the total mean square error is 0.025212139724338424, which is much better than in the previous attempts. We think that this value of total mean square error is a very good result for our problem. Still, we will try to find an even better solution.
The final part of testing the network is testing it with several individual input values. To do that, we will select 5 random instances from our data set. In this file you can find the full set of data arranged randomly (ignore the first column; it serves only as a help for deployment), and in this file you can find the 5 random instances. Now we will carry out testing of the network on these five instances. We will make a new training set in which we enter these five instances. Then we go back to the tab that shows the active neural network and click on the new training set. Finally, we click on the 'Test' button. We got a very high total mean square error. Also, you can see 3 very bad outputs. Because of that, we will continue our training process.
Training attempt 5.
Step 5.5. Train the neural network
We click 'Randomize'. The parameters that we use in this attempt are: max error = 0.01, learning rate = 0.4, momentum = 0.7. The result is: this time the network completed training in 13 iterations. Let's now test the network.
Step 6.3. Test the neural network
After testing, we found that in this attempt the total error is even higher than in the previous case, but still good for our problem. These are the results of testing the 5 random instances: this is a very good result for our case. We do not have large individual errors like in the previous attempt.
Training attempt 6.
Step 5.6. Train the neural network
We click 'Randomize'. The parameters that we use in this attempt are: max error = 0.01, learning rate = 0.7, momentum = 0.7. The result is: in only 10 iterations the training process is finished. Hoping to get even better results, we will do the test with the whole data set.
Step 6.4. Test the neural network
All we need to do is click on the 'Test' button and see the results. The total mean square error is 0.03440339693707508. This is higher than in the previous attempt. Let's try a test with the five random instances. Again we get satisfactory results, which means that this architecture works pretty well. Let's try, in the end, training with a lower value of momentum.
Training attempt 7.
Step 5.7. Train the neural network
Let's try to train the network with a smaller value of momentum. All the parameters remain the same as in the previous attempt, except the momentum, which will now be 0.3. Of course, as before each new training, we click on 'Randomize'. We click on 'Train' and wait for the results. The process stopped after 13 iterations. That is worse than the previous attempt, but let's see the test results.
Step 6.5. Test the neural network
After testing the neural network, we see that in this attempt the total mean square error is 0.032681833816451226, which is better than the error in the previous attempt. Now we will test our 5 random instances. Here are the results: this architecture is much better than the previous one, but we will continue to seek the best solution. We are still going to try to find out whether it is possible to get an even better testing result, with a lower total error, using another network architecture.
Training attempt 8
Step 4.3. Create the neural network
Now we try to train a neural network that has more neurons than in the previous case. We will try 9 hidden neurons in one layer. Like the previous two times, we decided to use the backpropagation with momentum algorithm. The name of this network is MultiLayer9. The training set that we will use in this case is 'TSFull'.
Step 5.8. Train the neural network
First we will try the following parameters: max error = 0.01, learning rate = 0.2, momentum = 0.7. The result is: we can see that the training process finished in 837 iterations. Let's see the test results.
Step 6.6. Test the neural network
We click 'Test' and then we can see the testing results for this type of neural network architecture. In this case we did not get a good result. The total mean square error is not below 5 percent, so we cannot accept this result. Now we will test the 5 random instances. This is a very bad result. We can see 3 big errors (one of them is 1!), so we conclude that this set of parameters does not work. In the following table you can see our next few attempts for this network.
Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Maximum error | Learning rate | Momentum | Number of iterations | Total mean square error | 5 random inputs test - number of correct guesses | Network trained |
9. | 9 | 1 | full | 0.01 | 0.4 | 0.7 | 28 | 0.0333 | 5/5 | yes |
10. | 9 | 1 | full | 0.01 | 0.7 | 0.7 | 19 | 0.03495 | 5/5 | yes |
11. | 9 | 1 | full | 0.01 | 0.5 | 0.3 | 23 | 0.03288 | 4/5 | yes |
12. | 9 | 1 | full | 0.01 | 0.7 | 0.1 | 18 | 0.0331 | 5/5 | yes |
13. | 9 | 1 | full | 0.01 | 0.7 | 0 | 14 | 0.03282 | 5/5 | yes |
From this table we see that this architecture works well. It gives equally good results for different data, which leads us to conclude that 9 neurons are enough to learn this data. It is interesting that we got the best result when the momentum was 0. Now we will try to train a network which contains more than one hidden layer.
Training attempt 14.
Step 4.4. Create the neural network
In this attempt, we will create a different type of neural network. We want to see what happens if we create a neural network with two hidden layers. First we create a new neural network; the type will be Multi Layer Perceptron, as it was in the previous attempts. We called the network MultiLayer4 3. Now we have to set the network parameters. We will set 4 neurons on the first layer and 3 on the second. The learning rule will be Backpropagation with Momentum.
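In code, the only change compared to the single-hidden-layer networks is the extra entry in the layer list. A sketch under the same Neuroph 2.x assumptions as before:

import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.util.TransferFunctionType;

public class CreateMultiLayer4_3 {
    public static void main(String[] args) {
        // 29 inputs, two hidden layers with 4 and 3 neurons, 1 output
        MultiLayerPerceptron network = new MultiLayerPerceptron(
                TransferFunctionType.SIGMOID, 29, 4, 3, 1);
        network.save("MultiLayer4_3.nnet");
    }
}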
Step 5.14. Train the neural network
As we have already learned how to train and test neural networks, in the following table we present the results of training this network. The training set that we use is 'TSFull'.
Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Maximum error | Learning rate | Momentum | Number of iterations | Total mean square error | 5 random inputs test - number of correct guesses | Network trained |
14. | 4 3 | 2 | full | 0.01 | 0.2 | 0.7 | 690 | 0.0556 | 2/5 | no |
15. | 4 3 | 2 | full | 0.01 | 0.4 | 0.7 | 3408 | 0.04141 | 1/5 | no |
16. | 4 3 | 2 | full | 0.01 | 0.7 | 0.7 | did not finish (30000) | / | / | no |
17. | 4 3 | 2 | full | 0.01 | 0.3 | 0.4 | 961 | 0.0601 | 1/5 | no |
After these four attempts we can see that an architecture with more hidden layers does not suit our problem. In this case the network failed to score more than 2 correct outputs in any attempt. So we do not recommend using networks with multiple hidden layers to solve this problem.
Advanced training techniques
Neural networks represent a
class of systems that do not fit into the current paradigms of software
development and
certification. Instead of being programmed, a learning algorithm
“teaches” a neural network using a set of data. Often,
because of the non-deterministic result of the adaptation, the neural
network is considered a “black box” and its
response may not be predictable. Testing the neural network with
similar data as that used in the training set is one of
the few methods used to verify that the network has adequately learned
the input domain.
In most instances, such
traditional testing techniques prove adequate for the acceptance of a
neural network system.
However, in more complex, safety- and mission-critical systems, the
standard neural network training-testing approach
is not able to provide a reliable method for their certification.
One
of the major advantages of neural networks is their ability to
generalize. This means that a trained network could classify data from
the same class as the learning data that it has never seen before. In
real world applications developers normally have only a small part of
all possible patterns for the generation of a neural network. To reach
the best generalization, the data set should be split into three parts:
validation, training and testing set.
The validation set contains a smaller percentage of instances from the initial data set, and is used to determine whether the selected network architecture is good enough. Only if validation is successful do we proceed to the training. The training set is applied to the neural network for learning and adaptation. The testing set is then used to determine the performance of the neural network by computing an error metric. This validating-training-testing approach is the first, and often the only, option system developers consider for the assessment of a neural network. The assessment is accomplished by the repeated application of neural network training data, followed by an application of neural network testing data to determine whether the neural network is acceptable.
Training attempt 18.
Step 3.3. Create a Training Set
The idea of this attempt is to use only a part of the data set when training the network, and then test the network with inputs from the other, unused part of the data set. That way we can determine whether the neural network has the power of generalization.
In the initial training set we have 493 instances. In this attempt we will create a new training set that contains only 10% of the initial data set instances, picked randomly. First we have to create a new file that contains the new data set instances. The new data set has 49 instances. Then, in Neuroph Studio, we create a new training set, which we called 'TS10', with the same parameters that we used for the first one, and load the data from the new file. We will also create a training set, which we called 'TS90', that contains the remaining 90% of the instances, which we will use for testing the network later in this attempt. This training set contains 444 instances.
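A sketch of how such a random 10%/90% split could be produced in code (plain Java shuffling; the file and set names are our own):

import java.util.Collections;
import java.util.List;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class SplitDataSet {
    public static void main(String[] args) {
        DataSet full = DataSet.createFromFile(
                "forest_fires_normalized.txt", 29, 1, "\t");
        List<DataSetRow> rows = full.getRows();
        Collections.shuffle(rows); // random order, like our 'Random' file

        int cut = (int) Math.round(rows.size() * 0.10); // first 10% for training
        DataSet ts10 = new DataSet(29, 1);
        DataSet ts90 = new DataSet(29, 1);
        for (int i = 0; i < rows.size(); i++) {
            (i < cut ? ts10 : ts90).addRow(rows.get(i));
        }
        ts10.save("TS10.tset");
        ts90.save("TS90.tset");
    }
}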
The
final results of this training attempt are shown in Table 2.
Step 5.18. Train the neural network
Unlike the previous attempts, now we will train a neural network which has already been created, but this time with the newly created training set which contains 10% of the instances of the initial training set. For this training we will use the neural network which has 9 hidden neurons (MultiLayer9). The learning rate will be 0.2 and the momentum 0.7 in this case. We click on the 'Train' button and wait for the training process to finish. As we can see in the image above, the network was successfully trained. It took only 2 iterations for the training process to finish. Let's see the test results.
Step 6.15. Test the neural network
After the successful training, we can now test the neural network. First, we will test the network with the training set that contains only 10% of the initial training set instances. We got that in this case the total error is 0.010952389892228815, which is a very good result for our problem. But the idea was to test the neural network with the other 90% of the data set that wasn't used for training. So now we will try to do that kind of test. This time, for testing, we will use the training set that contains the remaining 90% of instances that weren't used for training ('TS90').
When the testing process has completed, we can see that the total error is 0.03775515556170865, which is not so bad considering the fact that we have tested the network with data that was not used during the training. Now we will analyze the individual errors by selecting some random inputs to see whether the network predicted the output well in all cases. We randomly chose 5 observations which will be subjected to individual testing (the 'Random5' data set). Those observations and their testing results are in the following picture: as we can see in the table, the network guessed correctly in all 5 cases, with only small deviations. This shows that this type of network has a good ability to generalize.
Training attempt 19.
Step 3.4. Create a Training Set
In this training attempt we will create three different data sets from the initial data set. The first data set will be used for the validation of the neural network, the second for training and the third for testing the network.
- Validation set: 10% of instances - 49 randomly selected observations (the first 49 instances in our 'Random' file) - 'TS10' training set
- Training set: 70% of instances - 345 randomly selected observations (which includes the already created 10%) - 'TS70' training set
- Testing set: 30% of instances that do not appear in the previous two data sets (148 randomly selected observations) - 'TS30' training set
The final results of this training attempt are shown in Table 2.
Step 5.19. Validate and Train the neural network
First we need to do a validation of the network by using a smaller set of data, so we can check whether such a network architecture is suitable for our problem; if so, then we can train the network with a larger data set. The neural network that we use in this attempt will be our 'MultiLayer9'.
We will train the network with the validation data set that contains 10% of the observations. We will set the maximum error to 0.01, the learning rate to 0.2 and the momentum to 0.7. Then we click on 'Train' and the training starts. The process ends after only 2 iterations.
Based on the validation, we can conclude that this type of neural network architecture is appropriate, but it is also necessary to train the network with a larger set of data so we can be sure.
We will again train this network, but this time with the training set that contains 70% of the instances. The learning parameters will remain the same as they were during the validation.
We can see that the training process finished after 540 iterations, which is good. Now we want to see the results of testing on the other 30% of the data, which does not appear in this training data.
Step 6.16. Test the neural network
As we already said, for this test we use our 'TS30' training set. All we need to do is click on this training set and go to 'Test'. We did not get the results we wanted; this level of total mean square error is too high for us. In the following table we present the next few attempts, trying to get better results (below 5%).
Training attempt | Number of hidden neurons | Number of hidden layers | Validation set | Training set | Testing set | Maximum error | Learning rate | Momentum | Success validation | Number of iterations during training | Total mean square error | 5 random inputs test | Network trained |
20. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.4 | 0.7 | yes | 270 | 0.0653 | / | no |
21. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.7 | 0.7 | yes | 146 | 0.0629 | / | no |
22. | 4 3 | 2 | 10% | 70% | 30% | 0.01 | 0.2 | 0.7 | yes | 9207 | 0.06796 | / | no |
23. | 4 3 | 2 | 10% | 70% | 30% | 0.01 | 0.4 | 0.3 | yes | 1256 | 0.0700 | / | no |
In this table you can see that this way of training is very difficult for our case. We tried several architectures and parameter sets, but we did not get the total net error below 5%. We are left to try a network that is different from all the previous types.
Training attempt 24.
Step 4.5. Create a neural network
In this attempt, we will create a different type of neural network. First we create a new neural network; the type will be Multi Layer Perceptron, as it was in the previous attempts. We called the network MultiLayer8 4. Now we have to set the network parameters. We will set 8 neurons on the first layer and 4 on the second. The learning rule will be Backpropagation. Please note that we will try training without momentum.
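A sketch of this network with plain backpropagation (no momentum term), again assuming the Neuroph 2.x API:

import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.BackPropagation;
import org.neuroph.util.TransferFunctionType;

public class CreateMultiLayer8_4 {
    public static void main(String[] args) {
        // 29 inputs, two hidden layers with 8 and 4 neurons, 1 output
        MultiLayerPerceptron network = new MultiLayerPerceptron(
                TransferFunctionType.SIGMOID, 29, 8, 4, 1);

        // plain backpropagation: no momentum term in the weight update
        network.setLearningRule(new BackPropagation());

        network.save("MultiLayer8_4.nnet");
    }
}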
Step 5.24. Validate and Train the neural network
As we did before, first we must validate this network. For this we will use our 'TS10'. The max error will be 0.01 and the learning rate 0.2. We go to 'Train'. As you can see, the validation process is a success. That means that we can go on to the main training. All we need to do is choose 'TS70', set the learning rate to 0.7 and click 'Train'. After 1806 iterations the training process is finished. Let's see the test with the remaining 30%.
Step 6.21. Test the neural network
Finally, a result whose total mean square error is below 5%. Based on this, we think that for this type of testing we do not need the Backpropagation with Momentum algorithm. Also, when it comes to advanced training techniques, it is better to use a two-layer architecture.
Training attempt 25.
Step 3.5. Create a training set
In this training attempt we will also show the use of advanced training techniques. Three training sets will be created: a validation, a training, and a testing set.
- Validation set: 30% of instances - 148 randomly selected observations (the first 148 instances from the 'Random' file) - 'TS30Val' training set
- Training set: 80% of instances - 394 randomly selected observations (which includes the already created 30%) - 'TS80' training set
- Testing set: 20% of instances that do not appear in the previous data sets (99 randomly selected observations) - 'TS20' training set
Now we use 30% of the instances for validation, because in the previous attempt the validation process was very short (only 2 iterations). This is good, but we want a safer validation.
The
final results of this training attempt are shown in Table 2.
Step 5.25. Validate and Train the neural network
Again, first we need to do a validation of the network by using a smaller set of data, so we can check whether such a network architecture is suitable for our problem; if so, then we can train the network with a larger data set. In this attempt we will use the MultiLayer8 4 architecture, like in the previous attempt.
We will train the neural network with the validation data set that contains 30% of the instances. The maximum error will be 0.01 and the learning rate 0.2. The training process ends after 13 iterations.
Then we need to train the neural network with the training set that contains 80% of the instances, with a learning rate of 0.7. After 1246 iterations the process ends, and we can see that the training was successful.
Step 6.22. Test the neural network
Now we need to test this neural network in order to see the results. The training set that we use in this case is 'TS20'. The total mean square error in this case is significantly lower than in the previous attempt: 0.03599245182360657. We think that this is the best solution that can be obtained for this case, but we will try the other architectures. The results are shown in the table below.
Training attempt | Number of hidden neurons | Number of hidden layers | Validation set | Training set | Testing set | Maximum error | Learning rate | Momentum | Success validation | Number of iterations during training | Total mean square error | 5 random inputs test | Network trained |
26. | 9 | 1 | 30% | 80% | 20% | 0.01 | 0.2 | 0.6 | yes | 944 | 0.0972 | / | no |
27. | 4 3 | 2 | 30% | 80% | 20% | 0.01 | 0.2 | 0.3 | yes | 540 | 0.0754 | / | no |
28. | 4 3 | 2 | 30% | 80% | 20% | 0.01 | 0.5 | 0.6 | yes | 1485 | 0.0591 | / | no |
After this we can state that for advanced training techniques it is better, in our case, to use networks with 2 hidden layers and the plain Backpropagation algorithm.
Training attempt 29.
Step 3.6. Create a training set
- Validation set: 30% of instances - 148 randomly selected observations (the first 148 instances from the 'Random' file) - 'TS30Val' training set
- Training set: 60% of instances - 296 randomly selected observations (which includes the already created 30%) - 'TS60' training set
- Testing set: 40% of instances that do not appear in the previous data sets (197 randomly selected observations) - 'TS40' training set
Finally, we will try training with a 60-40 combination. For validation we will use 30 percent of the data, as in the previous attempt. For training we will use 60 percent, and for the test we will use the remaining 40 percent of the data. The results are shown in the table below.
Training attempt | Number of hidden neurons | Number of hidden layers | Validation set | Training set | Testing set | Maximum error | Learning rate | Momentum | Success validation | Number of iterations during training | Total mean square error | 5 random inputs test | Network trained |
29. | 9 | 1 | 30% | 60% | 40% | 0.01 | 0.7 | 0.7 | yes | 142 | 0.0508 | / | no |
30. | 4 3 | 2 | 30% | 60% | 40% | 0.01 | 0.4 | 0.7 | yes | 80 | 0.0537 | / | no |
31. | 8 4 | 2 | 30% | 60% | 40% | 0.01 | 0.7 | / | yes | 1326 | 0.0410 | / | yes |
As in the previous 2 attempts, we can see that the 8 4 architecture (the algorithm without momentum) shows the best results. This confirms our expectations.
Conclusion
During this experiment, we have created several different architectures of neural networks. We wanted to find out what is the most important thing to do during neural network training in order to get the best results.
What proved to be crucial to the success of the training is the selection of an appropriate number of hidden neurons when creating a new neural network. One hidden layer proved to be sufficient for training success in most cases. As it turned out, in our experiment it was better to use more neurons. In the standard techniques we also tried a double-layer architecture, but it did not show good results.
Also, through the various tests we have demonstrated the sensitivity of neural networks to high and low values of the learning parameters. We have shown the difference between standard and advanced training techniques. In this tutorial we have also given guidance on how to test the network. If you have patience, knowledge and a little luck, you may get better results than we did. We wish you luck.
The final results of our experiment are given in the two tables below. The first table (Table 1) contains the results obtained using standard training techniques, and the second table (Table 2) the results obtained using advanced training techniques. The best solutions are indicated by a green background.
Table 1. Standard training techniques
Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Maximum error | Learning rate | Momentum | Number of iterations | Total mean square error | 5 random inputs test - number of correct guesses | Network trained |
1. | 2 | 1 | full | 0.01 | 0.2 | 0.7 | did not finish (9117) | / | / | no |
2. | 2 | 1 | full | 0.01 | 0.4 | 0.7 | did not finish (46847) | / | / | no |
3. | 2 | 1 | full | 0.01 | 0.6 | 0.6 | 105 | 0.0593 | / | no |
4. | 5 | 1 | full | 0.01 | 0.2 | 0.7 | 6993 | 0.0252 | 2/5 | no |
5. | 5 | 1 | full | 0.01 | 0.4 | 0.7 | 13 | 0.0332 | 5/5 | yes |
6. | 5 | 1 | full | 0.01 | 0.7 | 0.7 | 10 | 0.0344 | 5/5 | yes |
7. | 5 | 1 | full | 0.01 | 0.7 | 0.3 | 13 | 0.0327 | 5/5 | yes |
8. | 9 | 1 | full | 0.01 | 0.2 | 0.7 | 837 | 0.0596 | 2/5 | no |
9. | 9 | 1 | full | 0.01 | 0.4 | 0.7 | 28 | 0.0333 | 5/5 | yes |
10. | 9 | 1 | full | 0.01 | 0.7 | 0.7 | 19 | 0.03495 | 5/5 | yes |
11. | 9 | 1 | full | 0.01 | 0.5 | 0.3 | 23 | 0.0329 | 4/5 | yes |
12. | 9 | 1 | full | 0.01 | 0.7 | 0.1 | 18 | 0.0331 | 5/5 | yes |
13. | 9 | 1 | full | 0.01 | 0.7 | 0 | 14 | 0.0328 | 5/5 | yes |
14. | 4 3 | 2 | full | 0.01 | 0.2 | 0.7 | 690 | 0.0556 | 2/5 | no |
15. | 4 3 | 2 | full | 0.01 | 0.4 | 0.7 | 3408 | 0.0414 | 1/5 | no |
16. | 4 3 | 2 | full | 0.01 | 0.7 | 0.7 | did not finish (30000) | / | / | no |
17. | 4 3 | 2 | full | 0.01 | 0.3 | 0.4 | 961 | 0.0601 | 1/5 | no |
Table 2. Advanced training techniques
Training attempt | Number of hidden neurons | Number of hidden layers | Validation set | Training set | Testing set | Maximum error | Learning rate | Momentum | Success validation | Number of iterations during training | Total mean square error | 5 random inputs test | Network trained |
18. | 9 | 1 | / | 10% | 90% | 0.01 | 0.2 | 0.7 | / | 2 | 0.0377 | 5/5 | yes |
19. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.2 | 0.7 | yes | 540 | 0.0522 | / | no |
20. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.4 | 0.7 | yes | 270 | 0.0653 | / | no |
21. | 9 | 1 | 10% | 70% | 30% | 0.01 | 0.7 | 0.7 | yes | 146 | 0.0629 | / | no |
22. | 4 3 | 2 | 10% | 70% | 30% | 0.01 | 0.2 | 0.7 | yes | 9207 | 0.06796 | / | no |
23. | 4 3 | 2 | 10% | 70% | 30% | 0.01 | 0.4 | 0.3 | yes | 1256 | 0.0700 | / | no |
24. | 8 4 | 2 | 10% | 70% | 30% | 0.01 | 0.7 | / | yes | 1806 | 0.0471 | / | yes |
25. | 8 4 | 2 | 30% | 80% | 20% | 0.01 | 0.7 | / | yes | 1246 | 0.03599 | / | yes |
26. | 9 | 1 | 30% | 80% | 20% | 0.01 | 0.2 | 0.6 | yes | 944 | 0.0972 | / | no |
27. | 4 3 | 2 | 30% | 80% | 20% | 0.01 | 0.2 | 0.3 | yes | 540 | 0.0754 | / | no |
28. | 4 3 | 2 | 30% | 80% | 20% | 0.01 | 0.5 | 0.6 | yes | 1485 | 0.0591 | / | no |
29. | 9 | 1 | 30% | 60% | 40% | 0.01 | 0.7 | 0.7 | yes | 142 | 0.0508 | / | no |
30. | 4 3 | 2 | 30% | 60% | 40% | 0.01 | 0.4 | 0.7 | yes | 80 | 0.0537 | / | no |
31. | 8 4 | 2 | 30% | 60% | 40% | 0.01 | 0.7 | / | yes | 1326 | 0.0410 | / | yes |
Download
See also:
Multi Layer Perceptron Tutorial