
CLASSIFICATION OF ANIMAL SPECIES USING NEURAL NETWORK - PART 2

An example of a multivariate data type classification problem using Neuroph

by Boris Ruzic, Faculty of Organizational Sciences, University of Belgrade

An experiment for the Intelligent Systems course

 

Introduction

This work represents a continuation of the experiment CLASSIFICATION OF ANIMAL SPECIES USING NEURAL NETWORK.

Classification is one of the most frequently encountered decision-making tasks in human activity. A classification problem occurs when an object needs to be assigned to a predefined group or class based on a number of observed attributes of that object. Closely related, the aim of cluster analysis is to group objects into clusters in such a way that two objects of the same cluster are more similar to each other than to objects of other clusters. Neural networks are a machine learning technology suitable for ill-defined problems such as recognition, prediction, classification, and control. The advantages of neural networks lie in the following aspects. First, they can adjust themselves to the data without any explicit specification of a functional or distributional form for the underlying model, because they are data-driven, self-adaptive methods. Second, neural networks are nonlinear models, which makes them flexible in modeling complex real-world relationships. Finally, neural networks can approximate any function with arbitrary accuracy.

 

Introduction to the problem

The purpose of this experiment is to present the results from the previous work in graphical form, and to study the feasibility of classifying animal species using neural networks. An animal class is made up of animals that are all alike in important ways, so we need to train a neural network so that it can predict which group a particular species belongs to. Once we have decided on a problem to solve using neural networks, we need to gather data for training purposes. The training data set includes a number of cases, each containing values for a range of input and output variables. The data set used in this experiment can be found at http://archive.ics.uci.edu/ml/datasets.html under the category classification. There are many data sets in this category, but for the purposes of this experiment we will use the data set named Zoo. This data set was donated by Richard Forsyth (date donated: 1990-05-15).

This database includes 101 cases, each identified by the name of an animal. Each of these animals belongs to one of seven classes.

Each attribute is Boolean, except the animal name, which is a nominal variable, and the legs attribute, which is a numeric variable (set of values: {0, 2, 4, 6, 8}).
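As an illustration of how such a case can be represented for training (the values below are made up for the example, not copied from the Zoo file), one animal could be encoded as a Neuroph data set row roughly like this, assuming the Neuroph 2.x API, the 16 attributes as inputs (with legs scaled linearly into the 0-1 range), and one output neuron per class:

import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class ZooRowExample {
    public static void main(String[] args) {
        // 16 input attributes and 7 output neurons (one per animal class)
        DataSet trainingSet = new DataSet(16, 7);

        // Illustrative encoding of one animal (values are made up, not taken from zoo.data):
        // the 15 Boolean attributes become 0/1 and 'legs' is scaled linearly into 0..1,
        // e.g. 4 legs -> 4 / 8 = 0.5
        double[] inputs  = {1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0.5, 1, 0, 1};
        // one-of-seven coding of the class, here class 1
        double[] outputs = {1, 0, 0, 0, 0, 0, 0};

        trainingSet.addRow(new DataSetRow(inputs, outputs));
    }
}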

Some training attempts will show the relations between certain variables, and at the end of the experiment the relations between three different parameters - learning rate, momentum, and number of iterations - will be presented.

 

Training attempt 2

Table 1. Training results for the first architecture

Training attempt | Hidden neurons | Learning rate | Momentum | Max error | Number of iterations | Total net error
1. | 2 | 0.2 | 0.7 | 0.01 | 19540 | 0.0201
2. | 2 | 0.3 | 0.7 | 0.01 | 19798 | 0.1977
3. | 2 | 0.5 | 0.4 | 0.01 | 25630 | 0.1289
4. | 2 | 0.7 | 0.7 | 0.01 | 20342 | 0.1995
5. | 2 | 0.9 | 0.8 | 0.01 | 20907 | 0.3007

Based on the data from Table 1, it can be seen that, regardless of the training parameters, the error does not fall below the specified level, even if we train the network for a different number of iterations.

This may be due to the small number of hidden neurons, so in the following solution we will increase the number of hidden neurons.
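Each of these training attempts can be reproduced in code. The sketch below is a minimal example, assuming the Neuroph 2.x API, a training set prepared with 16 inputs and 7 outputs as described above, and a single hidden layer; the number of hidden neurons and the training parameters are passed in so the same method can be reused for every attempt in this article, including the later comparisons of hidden-neuron counts.

import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.MomentumBackpropagation;
import org.neuroph.util.TransferFunctionType;

public class TrainingAttempt {

    // trains one network configuration and prints how the attempt ended
    public static void trainAttempt(DataSet trainingSet, int hiddenNeurons,
                                    double learningRate, double momentum, double maxError) {
        // one hidden layer: 16 inputs -> hiddenNeurons -> 7 outputs, Sigmoid transfer function
        MultiLayerPerceptron network =
                new MultiLayerPerceptron(TransferFunctionType.SIGMOID, 16, hiddenNeurons, 7);

        MomentumBackpropagation learningRule = new MomentumBackpropagation();
        learningRule.setLearningRate(learningRate);
        learningRule.setMomentum(momentum);
        learningRule.setMaxError(maxError);
        // optional cap, as used in the later attempts that stop after 2000 iterations
        // learningRule.setMaxIterations(2000);
        network.setLearningRule(learningRule);

        network.learn(trainingSet);   // blocks until maxError (or the iteration cap) is reached

        System.out.println("Iterations: " + learningRule.getCurrentIteration());
        System.out.println("Total net error: " + learningRule.getTotalNetworkError());
    }
}

For example, training attempt 2 from Table 1 would correspond to trainAttempt(trainingSet, 2, 0.3, 0.7, 0.01).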

Here you can see the relation between learning rate and number of iterations.

As you can see, the smallest number of iterations occurs when the learning rate is also the smallest, 0.2, while the biggest number of iterations occurs when the learning rate has the middle value, 0.5. On the second graph, the total net error tends to grow as the learning rate increases.

And below, the relation between momentum and number of iterations can be seen.

So the number of iterations has its biggest value when the momentum has its smallest value.

Recommendation: if you do not get the desired results, continue to gradually increase the training parameters. The neural network will learn the new samples without forgetting the samples it has learnt previously.

 

Conclusion

During this experiment, we created six different architectures, one basic training set, and six training sets derived from the basic training set. We normalized the original data set using a linear scaling method. Through graphs we have shown the relations between the major parameters. We have concluded that one layer of hidden neurons is enough in this case. The experiment also showed that the success of a neural network is very sensitive to the parameters chosen in the training process. If the network architecture uses a small number of hidden neurons, training becomes excessively long and the network may fail to fit the data, no matter what values the training parameters take. Through the various tests we have demonstrated the sensitivity of neural networks to high and low values of the learning parameters. We have shown that the best solution to the problem of classifying animal species into seven different groups is an architecture with one hidden layer and six hidden neurons. Finally, the table below shows the overall results of this experiment; the best solution is indicated in green, and a short sketch of how the number of correct guesses is computed follows the table.

Table 2. Overall results of the experiment

Training attempt | Hidden neurons | Hidden layers | Training set | Max error | Learning rate | Momentum | Total mean square error | Iterations | Correct guesses | Network trained
1  | 2  | 1 | full             | 0.01 | 0.2   | 0.7  | -        | 19540 | -     | no
2  | 2  | 1 | full             | 0.01 | 0.3   | 0.7  | -        | 19798 | -     | no
3  | 2  | 1 | full             | 0.01 | 0.5   | 0.4  | -        | 25630 | -     | no
4  | 2  | 1 | full             | 0.01 | 0.7   | 0.7  | -        | 20342 | -     | no
5  | 2  | 1 | full             | 0.01 | 0.9   | 0.8  | -        | 20907 | -     | no
6  | 4  | 1 | full             | 0.01 | 0.001 | 0.05 | -        | 2000  | -     | no
7  | 4  | 1 | full             | 0.01 | 0.9   | 0.9  | -        | 2000  | -     | no
8  | 4  | 1 | full             | 0.01 | 0.5   | 0.5  | -        | 2000  | -     | no
9  | 6  | 1 | full             | 0.01 | 0.6   | 0.4  | 0.00267  | 71    | 3/5   | yes
10 | 6  | 1 | full             | 0.01 | 0.7   | 0.4  | 0.002557 | 1     | 5/5   | yes
11 | 6  | 1 | 70% of instances | 0.01 | 0.7   | 0.4  | 0.01526  | 53    | 16/31 | yes
12 | 6  | 1 | 85% of instances | 0.01 | 0.7   | 0.4  | 0.02003  | 250   | 15/16 | yes
13 | 6  | 1 | 90% of instances | 0.01 | 0.7   | 0.4  | 0.01005  | 119   | 11/11 | yes
14 | 10 | 1 | full             | 0.01 | 0.7   | 0.4  | 0.00256  | 62    | 5/5   | yes
15 | 10 | 1 | 70% of instances | 0.01 | 0.7   | 0.4  | 0.00251  | 80    | 5/7   | yes
16 | 10 | 1 | 85% of instances | 0.01 | 0.7   | 0.4  | 0.00203  | 2000  | 4/6   | yes
17 | 10 | 1 | 90% of instances | 0.01 | 0.7   | 0.4  | 0.00191  | 2000  | 11/11 | yes
18 | 18 | 1 | full             | 0.01 | 0.7   | 0.4  | 0.00252  | 60    | 4/6   | yes
19 | 18 | 1 | 70% of instances | 0.01 | 0.7   | 0.4  | 0.01364  | 37    | 11/12 | yes
20 | 18 | 1 | 85% of instances | 0.01 | 0.7   | 0.4  | 0.00993  | 2000  | 12/15 | yes
21 | 18 | 1 | 90% of instances | 0.01 | 0.7   | 0.4  | 0.00205  | 2000  | 11/11 | yes
22 | 30 | 1 | full             | 0.01 | 0.7   | 0.4  | 0.00252  | 2000  | 6/8   | yes
23 | 30 | 1 | 70% of instances | 0.01 | 0.7   | 0.4  | 0.00269  | 2000  | 7/11  | yes
24 | 30 | 1 | 85% of instances | 0.01 | 0.7   | 0.4  | 0.00896  | 2000  | 16/18 | yes
25 | 30 | 1 | 90% of instances | 0.01 | 0.7   | 0.4  | 0.00401  | 2000  | 11/11 | yes
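The "Correct guesses" column in the table above comes from testing each trained network on a set of test cases. A rough sketch of such a test (Neuroph 2.x API assumed) could look like this, counting a guess as correct when the output neuron with the highest activation corresponds to the expected class:

import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class CorrectGuessCounter {

    // counts how many rows of the test set the network classifies correctly,
    // taking the index of the strongest output neuron as the predicted class
    public static int countCorrectGuesses(NeuralNetwork<?> network, DataSet testSet) {
        int correct = 0;
        for (DataSetRow row : testSet.getRows()) {
            network.setInput(row.getInput());
            network.calculate();
            if (indexOfMax(network.getOutput()) == indexOfMax(row.getDesiredOutput())) {
                correct++;
            }
        }
        return correct;
    }

    private static int indexOfMax(double[] values) {
        int maxIndex = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[maxIndex]) {
                maxIndex = i;
            }
        }
        return maxIndex;
    }
}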

Dynamic Backpropagation

These are the results of the Dynamic Backpropagation algorithm used on the best example in our experiment.


Training Results:
For this training, we used the Sigmoid transfer function.
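As a rough sketch of how this configuration could be set up in code, the example below builds the best architecture (one hidden layer with six hidden neurons) with the Sigmoid transfer function and attaches a dynamic backpropagation learning rule. The class name DynamicBackPropagation and the 16/7 input/output counts are assumptions based on the Neuroph 2.x releases and the Zoo data, so check them against the Neuroph version you are using.

import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.DynamicBackPropagation;
import org.neuroph.util.TransferFunctionType;

public class DynamicBackpropExample {

    public static void train(DataSet trainingSet) {
        // best architecture from the experiment: 16 inputs, 6 hidden neurons, 7 outputs,
        // all neurons using the Sigmoid transfer function
        MultiLayerPerceptron network =
                new MultiLayerPerceptron(TransferFunctionType.SIGMOID, 16, 6, 7);

        // dynamic backpropagation adjusts the learning parameters during training;
        // this class shipped with Neuroph 2.x (assumption - verify for your version)
        DynamicBackPropagation learningRule = new DynamicBackPropagation();
        learningRule.setMaxError(0.01);   // same stopping criterion as in the tables above
        network.setLearningRule(learningRule);

        network.learn(trainingSet);
    }
}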



The Total Net Error graph looks like this:

Practical Testing:


Impact of Learning rate on Number of iterations

Here we will show the relation between learning rate and number of iterations, and then the relation between momentum and number of iterations.

This graph shows the order in which the values for the learning rate were chosen and how this affected the number of iterations. We can conclude that, with a well-chosen learning rate, fewer iterations are needed to train the network well. The learning rate is a value ranging from zero to one. Choosing a value very close to zero requires a large number of training cycles, which makes the training process extremely slow. On the other hand, if the learning rate is very large, the weights diverge, the objective error function oscillates heavily, and the network reaches a state where no useful training takes place.
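A graph like this can be reproduced by sweeping the learning rate and recording how many iterations each run needs. The sketch below is an illustration only (Neuroph 2.x API assumed; six hidden neurons, momentum 0.4, and max error 0.01 held fixed as in the later attempts), and the particular learning-rate values are just examples; the same loop, varied over the hidden-neuron count instead of the learning rate, reproduces the hidden-neuron graphs further down.

import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.MomentumBackpropagation;
import org.neuroph.util.TransferFunctionType;

public class LearningRateSweep {

    // trains the same architecture with several learning rates and prints
    // how many iterations each run needed to reach the target error
    public static void sweep(DataSet trainingSet) {
        double[] learningRates = {0.2, 0.3, 0.5, 0.7, 0.9};   // example values
        for (double learningRate : learningRates) {
            MultiLayerPerceptron network =
                    new MultiLayerPerceptron(TransferFunctionType.SIGMOID, 16, 6, 7);

            MomentumBackpropagation rule = new MomentumBackpropagation();
            rule.setLearningRate(learningRate);
            rule.setMomentum(0.4);          // kept fixed while the learning rate varies
            rule.setMaxError(0.01);
            rule.setMaxIterations(20000);   // safety cap so a diverging run still stops
            network.setLearningRule(rule);

            network.learn(trainingSet);

            System.out.println("learning rate " + learningRate
                    + " -> iterations: " + rule.getCurrentIteration()
                    + ", total net error: " + rule.getTotalNetworkError());
        }
    }
}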

Impact of Momentum on Number of iterations

On this graph we can see that the number of iterations has its highest value when the momentum is 0.4 and its lowest value when the momentum is 0.9. Once we found the right value for the momentum, the number of iterations was reduced more and more. The momentum parameter is used to prevent the system from converging to a local minimum or saddle point. A high momentum parameter can also help to increase the speed of convergence of the system. However, setting the momentum parameter too high creates a risk of overshooting the minimum, which can cause the system to become unstable. A momentum coefficient that is too low cannot reliably avoid local minima and can also slow down the training of the system.
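For reference, the standard weight update of backpropagation with momentum (this is the general textbook formulation, not something specific to this experiment) is:

    Δw(t) = −η · ∂E/∂w + α · Δw(t−1)

where η is the learning rate and α is the momentum. Each weight change keeps a fraction of the previous change, which smooths the updates, helps the search move past shallow local minima, and speeds up convergence along directions where the gradient is consistent.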

Impact of Hidden neurons on Number of iterations

Below is a graph that shows the relation between the number of hidden neurons and the number of iterations.

On this graph we can see that by increasing the number of hidden neurons the network can be successfully trained with a smaller number of iterations.

Impact of Total Net Error on Number of hidden neurons

The next graph shows the relation between the number of hidden neurons and the total net error.

This graph shows the impact of the number of hidden neurons on the total net error: the more we increase the number of hidden neurons, the more the total net error decreases.

DOWNLOAD
Data set used in this tutorial
Training sets
Neuroph project

See also:
Multi Layer Perceptron Tutorial

 
