
Zoo database

An example of a multivariate data type classification problem using Neuroph

by Nevena Jovanovic, Faculty of Organisation Sciences, University of Belgrade

An experiment for the Intelligent Systems course

 

Introduction

In this example we will test Neuroph 2.4 on the Zoo data set, which can be found here. Several architectures will be tried out, and we will determine which ones represent a good solution to the problem and which ones do not.

First, here is some useful information about our Zoo data set:
Data Set Characteristics: Multivariate
Number of Instances: 101
Attribute Characteristics: Categorical, Integer
Number of Attributes: 17
Associated Tasks: Classification

 

Introducing the problem

In this project we will classify animals according to some of their characteristics. The goal is to train a neural network that learns to recognize the class of each animal from its input attributes.
It was found that each of these animals belonged to one of seven classes:

Class# -- Set of animals:

  1. (41) aardvark, antelope, bear, boar, buffalo, calf, cavy, cheetah, deer, dolphin, elephant, fruitbat, giraffe, girl, goat, gorilla, hamster, hare, leopard, lion, lynx, mink, mole, mongoose, opossum, oryx, platypus, polecat, pony, porpoise, puma, pussycat, raccoon, reindeer, seal, sealion, squirrel, vampire, vole, wallaby, wolf
  2. (20) chicken, crow, dove, duck, flamingo, gull, hawk, kiwi, lark, ostrich, parakeet, penguin, pheasant, rhea, skimmer, skua, sparrow, swan, vulture, wren
  3. (5) pitviper, seasnake, slowworm, tortoise, tuatara
  4. (13) bass, carp, catfish, chub, dogfish, haddock, herring, pike, piranha, seahorse, sole, stingray, tuna
  5. (4) frog, frog, newt, toad
  6. (8) flea, gnat, honeybee, housefly, ladybird, moth, termite, wasp
  7. (10) clam, crab, crayfish, lobster, octopus, scorpion, seawasp, slug, starfish, worm

This variable, named type, is the output variable. In addition to the output variable, there are 17 input variables for each animal species:

  1. animal name
  2. hair
  3. feathers
  4. eggs
  5. milk
  6. airborne
  7. aquatic
  8. predator
  9. toothed
  10. backbone
  11. breathes
  12. venomous
  13. fins
  14. legs
  15. tail
  16. domestic
  17. catsize

Each variable is Boolean, except animal name, which is a nominal variable, and legs, which is a numeric variable (set of values: {0, 2, 4, 6, 8}).

Handling non-numeric data, such as Boolean = {true, false}, is more difficult. However, nominal-valued variables can be represented numerically: the value true is replaced with 1, and the value false with 0. We will not use the animal name variable in the experiment, because it is unique for each instance.

In this example we will use 70%, 85% and 90% of the data for training the network and, correspondingly, 30%, 15% and 10% of the data for testing it.

To be able to use this data set in Neuroph, it is necessary to normalize the data. The Boolean attributes already take only the values 0 and 1, but the legs attribute takes six different values, so instead of one column we encode it as six binary columns, exactly one of which is 1 for each animal. The output variable type is encoded the same way, as seven binary columns, one per class, with a 1 in the column of the class the animal belongs to. This gives 21 inputs and 7 outputs, and the whole data set now contains only 0 and 1 values.
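To make the encoding concrete, here is a minimal Java sketch of the one-of-N scheme described above. The class and method names are our own illustration, not part of Neuroph, and we assume six legs columns (besides the values listed above, the UCI data also contains the value 5), which together with the 15 remaining Boolean attributes gives the 21 inputs used later.

import java.util.Arrays;

// Illustrative helpers for the one-of-N encoding described above (names are our own).
public class ZooEncoding {

    // Possible values of the legs attribute, one binary column per value (assumed set).
    private static final int[] LEG_VALUES = {0, 2, 4, 5, 6, 8};

    // Encode the numeric legs attribute into six 0/1 columns.
    static double[] encodeLegs(int legs) {
        double[] columns = new double[LEG_VALUES.length];
        for (int i = 0; i < LEG_VALUES.length; i++) {
            columns[i] = (LEG_VALUES[i] == legs) ? 1.0 : 0.0;
        }
        return columns;
    }

    // Encode the output variable type (class 1..7) into seven 0/1 columns.
    static double[] encodeType(int type) {
        double[] columns = new double[7];
        columns[type - 1] = 1.0;
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encodeLegs(4)));  // [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
        System.out.println(Arrays.toString(encodeType(1)));  // [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    }
}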

Before you start reading about our experiment, we suggest that you first get more familiar with Neuroph Studio and the Multi Layer Perceptron. You can do that by clicking on the links below:

Neuroph Studio Getting Started

Multi Layer Perceptron

Network design

Here you can see the structure of our network, with its inputs, outputs and hidden neurons in the middle layer.


Training attempt 1

Network Type: Multi Layer Perceptron
Training Algorithm: Backpropagation with Momentum
Number of inputs: 21
Number of outputs: 7
Hidden neurons: 15

Training Parameters:
Learning Rate: 0.2
Momentum: 0.7
Max. Error: 0.01

Training Results:

For this training, we used the Sigmoid transfer function.
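If you want to reproduce this training attempt from code rather than from Neuroph Studio, a minimal sketch could look like the one below. It uses the newer Neuroph class names (in Neuroph 2.4 the data set class was still called TrainingSet), and the file name zoo-normalized.txt is only a placeholder for the normalized data described earlier.

import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.MomentumBackpropagation;
import org.neuroph.util.TransferFunctionType;

public class TrainingAttempt1 {
    public static void main(String[] args) {
        // Load the normalized data: 21 inputs, 7 outputs, comma-separated values (placeholder file name).
        DataSet trainingSet = DataSet.createFromFile("zoo-normalized.txt", 21, 7, ",");

        // Multi layer perceptron: 21 inputs, 15 hidden neurons, 7 outputs, sigmoid transfer function.
        MultiLayerPerceptron network =
                new MultiLayerPerceptron(TransferFunctionType.SIGMOID, 21, 15, 7);

        // Backpropagation with momentum, using the parameters from training attempt 1.
        MomentumBackpropagation learningRule = new MomentumBackpropagation();
        learningRule.setLearningRate(0.2);
        learningRule.setMomentum(0.7);
        learningRule.setMaxError(0.01);
        network.setLearningRule(learningRule);

        // Train until the total network error drops below the maximum error.
        network.learn(trainingSet);
        network.save("zoo-mlp-attempt1.nnet");
    }
}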



As you can see, the neural network took 33 iterations to train. The Total Net Error is an acceptable 0.0095.

The Total Net Error graph looks like this:
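The iteration count and Total Net Error shown above are read from Neuroph Studio; when training from code, the same values can be logged with a learning event listener. This is a sketch assuming a newer Neuroph release (2.7 or later), where the event classes live in org.neuroph.core.events.

import org.neuroph.core.events.LearningEvent;
import org.neuroph.core.events.LearningEventListener;
import org.neuroph.nnet.learning.BackPropagation;

// Logs the total network error after every training iteration.
public class ErrorLogger implements LearningEventListener {

    @Override
    public void handleLearningEvent(LearningEvent event) {
        BackPropagation rule = (BackPropagation) event.getSource();
        System.out.println("Iteration " + rule.getCurrentIteration()
                + ", Total Net Error = " + rule.getTotalNetworkError());
    }
}

// Usage: learningRule.addListener(new ErrorLogger()); before calling network.learn(trainingSet).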



Practical Testing:

The final part of testing this network is testing it with several input values. To do that, we will select 4 random input values from our data set. Those are:

No. | hair feathers eggs milk airborne aquatic predator toothed backbone venomous fins | legs (encoded) | tail domestic catsize | Desired output | Actual output
1. | 1 0 0 1 0 0 0 1 1 1 0 | 0,0,0,1,0,0,0 | 1 1 1 | 1,0,0,0,0,0,0 | 1, 0, 0, 0, 0, 0, 0
2. | 0 0 0 0 0 1 1 1 1 0 1 | 0,1,0,0,0,0,0 | 1 0 0 | 0,0,1,0,0,0,0 | 0, 0, 0.78, 0.0019, 0, 0, 0.0329
3. | 0 0 1 0 0 0 1 1 1 1 0 | 0,1,0,0,0,0,0 | 1 0 0 | 0,0,1,0,0,0,0 | 0, 0, 0.997, 0, 0, 0, 0.008
4. | 0 0 1 0 0 0 0 0 0 1 0 | 0,0,0,0,0,1,0 | 0 0 0 | 0,0,0,0,0,1,0 | 0, 0, 0, 0, 0, 0.9994, 0.0223

The network guessed correctly in all four instances. After this test, we can conclude that this solution does not need to be rejected; it can be expected to give good results in most cases.
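If you prefer to run such a spot check from code instead of the Neuroph Studio test window, a sketch could look as follows. The saved network file name matches the training sketch above (in older Neuroph releases the network is loaded with NeuralNetwork.load(...) instead of createFromFile(...)), and the 21-element input vector is only partially filled in here for illustration; in a real test it would contain the full normalized row from the data set.

import java.util.Arrays;
import org.neuroph.core.NeuralNetwork;

public class SpotCheck {
    public static void main(String[] args) {
        // Load the network saved after training attempt 1 (see the training sketch above).
        NeuralNetwork network = NeuralNetwork.createFromFile("zoo-mlp-attempt1.nnet");

        // One normalized 21-element input vector (only a few attributes shown; purely illustrative).
        double[] input = new double[21];
        input[0] = 1.0;   // hair
        input[3] = 1.0;   // milk
        // ... the remaining Boolean attributes and the six legs columns are filled in the same way.

        network.setInput(input);
        network.calculate();
        double[] output = network.getOutput();

        // The predicted class is the output neuron with the largest activation.
        System.out.println("Network output: " + Arrays.toString(output));
    }
}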

In our next experiment we will use the same network, but some of the parameters will be different, and we will see how the results change.

Training attempt 2

Network Type: Multi Layer Perceptron
Training Algorithm: Backpropagation with Momentum
Number of inputs: 21
Number of outputs: 7
Hidden neurons: 14

Training Parameters:
Learning Rate: 0.5
Momentum: 0.01
Max. Error: 0.01

Training Results:

For this training, we used the Sigmoid transfer function.



As you can see, the neural network took 15 iterations to train. The Total Net Error is an acceptable 0.009.

The Total Net Error graph looks like this:



Practical Testing:

The only thing left is to put the random inputs stated above into the neural network. The results of the test are shown in the table below. The network guessed correctly in all of the cases shown.

No. | Category 1 | Category 2 | Category 3 | Category 4 | Category 5 | Category 6 | Category 7
1. | 0.9705 | 0 | 0 | 0 | 0 | 0 | 0
2. | 0 | 0 | 0.9985 | 0 | 0 | 0 | 0
3. | 0 | 0 | 0 | 0 | 0 | 0.8136 | 0

The table shows the output values obtained when testing the randomly chosen instances. Based on the results, we can conclude that the deviations are smaller than in the previous case.

In the next two attempts we will build a new neural network. The main difference will be the number of hidden neurons in its structure, and the other parameters will also be changed.

Training attempt 3

Network Type: Multi Layer Perceptron
Training Algorithm: Backpropagation with Momentum
Number of inputs: 21
Number of outputs: 7
Hidden neurons: 10

Training Parameters:


Learning Rate: 0.3
Momentum: 0.9
Max. Error: 0.01

Training Results:


For this training, we used the Sigmoid transfer function.



As you can see, the neural network took 7 iterations to train. The Total Net Error is an acceptable 0.008169.

The Total Net Error graph looks like this:



Practical Testing:


The final part of testing this network is testing it with several input values. To do that, we select four random instances from our data set. Those are:

No. | hair feathers eggs milk airborne aquatic predator toothed backbone venomous fins | legs (encoded) | tail domestic catsize | Desired output | Actual output
1. | 1 0 0 1 0 0 0 1 1 1 0 | 0,0,0,1,0,0,0 | 1 1 1 | 1,0,0,0,0,0,0 | 0.9964, 0.001, 0.0019, 0.0007, 0.0029, 0.003, 0.0002
2. | 0 1 1 0 1 0 0 0 1 1 0 | 0,0,1,0,0,0,0 | 1 0 0 | 0,1,0,0,0,0,0 | 0.0089, 0.9824, 0.0089, 0.0031, 0.0009, 0.0053, 0.0052
3. | 1 0 0 1 1 0 0 1 1 1 0 | 0,0,0,1,0,0,0 | 1 0 0 | 1,0,0,0,0,0,0 | 0.9965, 0.0002, 0.0016, 0.0003, 0.0045, 0.0084, 0.0003
4. | 1 0 0 1 0 0 0 0 0 0 1 | 0,0,1,0,0,0,0 | 0 0 0 | 0,0,0,0,0,1,0 | 0.0025, 0.069, 0.0276, 0.0163, 0.0056, 0.7659, 0.0403

Based on this test, we conclude that the results are worse compared to previous tests and that the error is greater than the earlier ones.

In our next experiment we will use the same network, but some of the parameters will be different, and we will see how the results change.

Training attempt 4

Network Type: Multi Layer Perceptron
Training Algorithm: Backpropagation with Momentum
Number of inputs: 21
Number of outputs: 7
Hidden neurons: 10

Training Parameters:


Learning Rate: 0.5
Momentum: 0.4
Max. Error: 0.01

Training Results:


For this training, we used the Sigmoid transfer function.

As you can see, the neural network took 24 iterations to train. The Total Net Error is an acceptable 0.0099.

The Total Net Error graph looks like this:



Practical Testing:


The only thing left is to put the random inputs stated above into the neural network. The results of the test are shown in the table below. The network guessed correctly in all four cases.

No. | Category 1 | Category 2 | Category 3 | Category 4 | Category 5 | Category 6 | Category 7
1. | 0.006 | 0.975 | 0.0123 | 0.0006 | 0.0018 | 0.0166 | 0.0014
2. | 0.0013 | 0.0129 | 0.0231 | 0.0396 | 0.0385 | 0.0429 | 0.8808
3. | 0.9598 | 0.0408 | 0.0099 | 0.0003 | 0.0028 | 0.0072 | 0
4. | 0.0079 | 0.0616 | 0.021 | 0.0023 | 0.0104 | 0.6888 | 0.0767

Training attempt 5

This time we will make some more significant changes in the structure of our network. Now we will try to train a network with 5 neurons in its hidden layer.

Network Type: Multi Layer Perceptron
Training Algorithm: Backpropagation with Momentum
Number of inputs: 21
Number of outputs: 7
Hidden neurons: 5

Training Parameters:


Learning Rate: 0.2
Momentum: 0.7
Max. Error: 0.01

We use 85% of the data set for training here.

Training Results:


For this training, we used the Sigmoid transfer function.



Training ran for 826 iterations, but the error remained large, so this combination of network and training set is not a good choice.

The Total Net Error graph looks like this:



So the conclusion of this experiment is that the choice of the number of hidden neurons is crucial to the effectiveness of a neural network.

One of the "rules" for determining a reasonable number of neurons in the hidden layer is that it should be between the size of the input layer and the size of the output layer. The formula we used looks like this: ((number of inputs + number of outputs) / 2) + 1. With that value we built a good network that showed great results. Then we built a network with fewer neurons in its hidden layer, and the results were not as good as before. So, in the next example we will see how the network reacts with a greater number of hidden neurons.
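Applied to this problem, the rule gives ((21 + 7) / 2) + 1 = ((28) / 2) + 1 = 15 hidden neurons, which is exactly the number used in training attempt 1.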

Training attempt 6

Network Type: Multi Layer Perceptron
Training Algorithm: Backpropagation with Momentum
Number of inputs: 21
Number of outputs: 7
Hidden neurons: 10

Here we use 85% of the data set for training and 15% for testing.

Training Parameters:


Learning Rate: 0.02
Momentum: 0.7
Max. Error: 0.01

Training Results:


For this training, we used the Sigmoid transfer function.



As you can see, the neural network took 23 iterations to train. The Total Net Error is an acceptable 0.0099.

The Total Net Error graph looks like this:



Practical Testing:

The final part of testing this network is testing it with several input values. To do that, we will select 4 random input values from our data set. Those are:

No. | hair feathers eggs milk airborne aquatic predator toothed backbone breathes venomous fins | legs (encoded) | tail domestic catsize | Desired output | Actual output
1. | 1 0 0 1 0 0 1 1 1 1 0 0 | 0,0,1,0,0,0 | 0 0 1 | 1,0,0,0,0,0,0 | 1, 0, 0, 0.0009, 0, 0, 0
2. | 0 0 1 0 0 1 1 1 1 0 0 1 | 1,0,0,0,0,0 | 1 0 0 | 0,0,0,1,0,0,0 | 0, 0, 0.0291, 1, 0, 0, 0
3. | 0 0 1 0 0 1 0 1 1 0 0 1 | 1,0,0,0,0,0 | 1 1 0 | 0,0,0,1,0,0,0 | 0, 0, 0.0455, 1, 0, 0, 0
4. | 0 0 1 0 0 0 1 0 0 0 0 0 | 1,0,0,0,0,0 | 0 0 0 | 0,0,0,0,0,0,1 | 0, 0, 0.0003, 0, 0, 0, 1

We can conclude that the decrease in the number of hidden neurons reduced the total error in testing, and that the test outputs deviate less from the desired values than anticipated.

As you can see, this number of hidden neurons, with an appropriate combination of parameters, also gave good results, except for the third instance, where the error is 0.0000455.

Training attempt 7

Now we will see how the same network works with a different set of parameters.

Network Type: Multi Layer Perceptron
Training Algorithm: Backpropagation with Momentum
Number of inputs: 21
Number of outputs: 7
Hidden neurons: 10

Training Parameters:


Learning Rate: 0.01
Momentum: 0.7
Max. Error: 0.01

Training Results:


For this training, we used the Sigmoid transfer function.



As you can see, the neural network took 463 iterations to train. The Total Net Error is an acceptable 0.0099.

The Total Net Error graph looks like this:



Practical Testing:

The only thing left is to put the random inputs stated above into the neural network. The results of the test are shown in the table below. The network guessed correctly in all four cases, although in this test the outputs deviate more from the desired values, which can be seen both in the individual errors and in the total test error.

Case | Outputs | Individual errors
1 | 0.9925, 0.0032, 0.0085, 0.0011, 0.0133, 0.004, 0.0001 | -0.0075, 0.0032, 0.0085, 0.0011, 0.0133, 0.004, 0.0001
2 | 0.0097, 0.9667, 0.0091, 0.0012, 0.003, 0.0117, 0.0098 | 0.0097, -0.0333, 0.0091, 0.0012, 0.003, 0.0117, 0.0098
3 | 0.3141, 0.0941, 0.0133, 0.0004, 0.0215, 0.027, 0.0184 | 0.3141, 0.0941, 0.0133, 0.0004, 0.0215, 0.027, -0.9816
In this table you can see a few cases from the test results, with their outputs and individual errors.
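For reference, each individual error above is simply the actual output of an output neuron minus its desired output. Here is a sketch of how these values could be computed over a test set in code (file names are placeholders; the API is the same one assumed in the earlier sketches):

import java.util.Arrays;
import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class IndividualErrors {
    public static void main(String[] args) {
        NeuralNetwork network = NeuralNetwork.createFromFile("zoo-mlp-attempt7.nnet"); // placeholder
        DataSet testSet = DataSet.createFromFile("zoo-test.txt", 21, 7, ",");          // placeholder

        for (DataSetRow row : testSet.getRows()) {
            network.setInput(row.getInput());
            network.calculate();
            double[] output = network.getOutput();
            double[] desired = row.getDesiredOutput();

            // Individual error of each output neuron: actual output minus desired output.
            double[] errors = new double[output.length];
            for (int i = 0; i < output.length; i++) {
                errors[i] = output[i] - desired[i];
            }
            System.out.println(Arrays.toString(errors));
        }
    }
}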

Although we used a considerably different set of parameters in this example, the network gave good results in the test.

Some statistics

Four different solutions tested in this experiment have shown that the choice of the number of hidden neurons is very important for the effectiveness of a neural network. We have concluded that one layer of hidden neurons is enough in this case. Also, the experiment showed that the success of a neural network is very sensitive to parameters chosen in the training process. The learning rate must not be too high, and the maximum error must not be too low.

Below is a table that summarizes this experiment. The best solution for the problem is marked in the table.

Attempt | Hidden neurons | Hidden layers | Training set | Max. error | Learning rate | Momentum | Total mean square error | Iterations | Test set | Network trained
1 | 15 | 1 | 70% of full data set | 0.01 | 0.2 | 0.7 | 0.00084 | 23 | 30% of full data set | yes
2 | 15 | 1 | 70% of full data set | 0.01 | 0.5 | 0.01 | 0.042 | 15 | 30% of full data set | yes
3 | 10 | 1 | 70% of full data set | 0.01 | 0.3 | 0.9 | 0.0411 | 7 | 30% of full data set | yes
4 | 10 | 1 | 70% of full data set | 0.01 | 0.5 | 0.4 | 0.0419 | 24 | 30% of full data set | yes
5 | 5 | 1 | 85% of full data set | 0.01 | 0.2 | 0.7 | / | 826 | 15% of full data set | yes
6 | 10 | 1 | 85% of full data set | 0.01 | 0.02 | 0.7 | 0.000045 | 94 | 15% of full data set | yes
7 | 10 | 1 | 70% of full data set | 0.01 | 0.01 | 0.7 | 0.0099 | 463 | 30% of full data set | yes

Advanced Training Techniques

When the training is complete, you will want to check the network performance. A learning neural network is expected to extract rules from a finite set of examples. It is often the case that the neural network memorizes the training data well, but fails to generate correct output for some of the new test data. Therefore, it is desirable to come up with some form of regularization.

One form of regularization is to split the training set into a new training set and a validation set. After each pass through the new training set, the neural network is evaluated on the validation set, and the network with the best performance on the validation set is then used for actual testing. The new training set would consist of, say, 80% to 90% of the original training set, and the remaining 10% to 20% would form the validation set. You would then compute the validation error rate periodically during training and stop training when the validation error rate starts to go up. However, the validation error is not a good estimate of the generalization error if the initial set contains a relatively small number of instances. Our initial set, which we named TS1, consists of only 101 instances (animal species). In this case 10% or 20% of the original training set would consist of only 10 or 20 instances, which is an insufficient number of instances to perform validation. Instead of validation, we will therefore use generalization testing as a form of regularization.
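Although we do not apply it here because the data set is too small, a sketch of how such a training/validation split could be made in code looks like this. The helper is our own hand-rolled version; newer Neuroph releases also provide built-in sampling utilities.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class ValidationSplit {

    // Randomly split a data set into a new training set and a validation set (e.g. ratio 0.9).
    static DataSet[] split(DataSet dataSet, double trainingRatio) {
        List<DataSetRow> rows = new ArrayList<>(dataSet.getRows());
        Collections.shuffle(rows);

        int trainingCount = (int) Math.round(rows.size() * trainingRatio);
        DataSet trainingSet = new DataSet(21, 7);    // 21 inputs, 7 outputs, as in this experiment
        DataSet validationSet = new DataSet(21, 7);

        for (int i = 0; i < rows.size(); i++) {
            (i < trainingCount ? trainingSet : validationSet).addRow(rows.get(i));
        }
        return new DataSet[] { trainingSet, validationSet };
    }
}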

In the following examples we will check the generalization error: from example to example we will increase the number of instances in the training set used for training, and decrease the number of instances in the sets used for testing.

Training attempt 8

In this case we will set aside a 10% sample of the data that is not contained in the previously used training data. That part of the data you can get from this link: TS1.

First we train the network on 90% of the data and then test it on the remaining 10%. This shows how well the network handles data it has not been trained on. We will reuse one of the networks created earlier, with the parameters: learning rate 0.21, momentum 0.7 and maximum error 0.01. Of course, it was first necessary to isolate 10 instances from the full data set and place them in TS1.

The test results are:

The Total Net Error graph looks like this:

Based on the error, which is small, we can conclude that our network performs well even on data it has not been trained on.

Here we have an error of 0.002, which is acceptable, and the individual errors are also negligible. We can also conclude that decreasing the number of hidden neurons mainly reduces the total error.
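A sketch of how this generalization error on the held-out 10% could be computed by hand is shown below (file names are placeholders; the mean squared error is averaged over all output neurons of all held-out instances):

import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class GeneralizationCheck {
    public static void main(String[] args) {
        NeuralNetwork network = NeuralNetwork.createFromFile("zoo-mlp.nnet");   // trained on the 90% part (placeholder)
        DataSet heldOut = DataSet.createFromFile("zoo-ts1.txt", 21, 7, ",");    // the 10% sample, TS1 (placeholder)

        double sumSquaredError = 0;
        int count = 0;
        for (DataSetRow row : heldOut.getRows()) {
            network.setInput(row.getInput());
            network.calculate();
            double[] output = network.getOutput();
            double[] desired = row.getDesiredOutput();
            for (int i = 0; i < output.length; i++) {
                double e = output[i] - desired[i];
                sumSquaredError += e * e;
                count++;
            }
        }
        System.out.println("Generalization MSE: " + (sumSquaredError / count));
    }
}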

Dynamic Backpropagation

These are the results of the Dynamic Backpropagation algorithm used on the best example from our experiment.
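To switch to this learning rule in code, only the learning rule on the network needs to be replaced. The sketch below assumes the DynamicBackPropagation class shipped with Neuroph 2.x (it extends MomentumBackpropagation, so the basic parameters are set the same way); check the exact class name against your Neuroph version.

import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.DynamicBackPropagation;
import org.neuroph.util.TransferFunctionType;

public class DynamicTraining {
    public static void main(String[] args) {
        DataSet trainingSet = DataSet.createFromFile("zoo-normalized.txt", 21, 7, ","); // placeholder

        MultiLayerPerceptron network =
                new MultiLayerPerceptron(TransferFunctionType.SIGMOID, 21, 15, 7);

        // Dynamic backpropagation adjusts the learning rate (and momentum) during training.
        DynamicBackPropagation learningRule = new DynamicBackPropagation();
        learningRule.setLearningRate(0.2);   // starting learning rate
        learningRule.setMomentum(0.7);
        learningRule.setMaxError(0.01);
        network.setLearningRule(learningRule);

        network.learn(trainingSet);
    }
}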


Training Results:
For this training, we used the Sigmoid transfer function.


The Total Net Error graph looks like this:

Practical Testing:

Graph

In this graph we can see the relationship between the learning rate and the number of iterations. We notice that the number of iterations generally increases as the learning rate decreases.


Conclusion

During this experiment, we created three different architectures, one basic training set and six training sets derived from the basic training set. We normalized the original data set using a linear scaling method. Through six basic steps we explained in detail the creation, training and testing of neural networks. If the network architecture uses a small number of hidden neurons, training takes excessively long and the network may overfit no matter what the values of the training parameters are. We pointed out the major differences between the Perceptron and the MultiLayerPerceptron as network types. Through the various tests we demonstrated the sensitivity of neural networks to high and low values of the learning parameters. We showed that the best solution to the problem of classifying animal species into seven different groups is an architecture with one hidden layer and six hidden neurons. Finally, we explained the importance of generalization and pointed to the importance of validation as a form of regularization. The overall results of this experiment can be seen in the summary table above, where the best solution is marked.



DOWNLOAD


See also:
Multi Layer Perceptron Tutorial

 
