Zoo Dataset Test
An example of a multivariate data type classification problem using Neuroph
by Alma Murankovic, Faculty of Organisation Sciences, University of Belgrade
an experiment for Intelligent Systems course
Introduction
A neural network is an artificial system made of artificial neuron cells, modeled after the way the human brain works. In this example we will test Neuroph with the Zoo Data Set, which can be found at http://archive.ics.uci.edu/ml/. Several architectures will be tried out, and we will determine which ones represent a good solution to the problem.
First, here is some useful information about the Zoo Data Set:
Data Set Characteristics: Multivariate
Number of Instances: 101
Attribute Characteristics: Categorical, Integer
Number of Attributes: 17
Associated Tasks: Classification
Introducing the problem
The objective is to create and train a neural network in order to study the feasibility of classifying animal species. The data set is called the Zoo Data Set and was created by Richard Forsyth. The data set that we use in this experiment can be found at the UCI Machine Learning Repository.
This data set includes 101 instances. We have 17 input attributes and one output.
Input attributes are:
- animal name
- hair
- feathers
- eggs
- milk
- airborne
- aquatic
- predator
- toothed
- backbone
- breathes
- venomous
- fins
- legs
- tail
- domestic
- catsize
Data Set Information:
A simple database containing 17 Boolean-valued attributes. The "type" attribute appears to be the class attribute. Here is a breakdown of which animals are in which type:
Class - Set of animals:
1 - aardvark, antelope, bear, boar, buffalo, calf, cavy, cheetah, deer, dolphin, elephant, fruitbat, giraffe, girl, goat, gorilla, hamster, hare, leopard, lion, lynx, mink, mole, mongoose, opossum, oryx, platypus, polecat, pony, porpoise, puma, pussycat, raccoon, reindeer, seal, sealion, squirrel, vampire, vole, wallaby, wolf
2 - chicken, crow, dove, duck, flamingo, gull, hawk, kiwi, lark, ostrich, parakeet, penguin, pheasant, rhea, skimmer, skua, sparrow, swan, vulture, wren
3 - pitviper, seasnake, slowworm, tortoise, tuatara
4 - bass, carp, catfish, chub, dogfish, haddock, herring, pike, piranha, seahorse, sole, stingray, tuna
5 - frog, frog, newt, toad
6 - flea, gnat, honeybee, housefly, ladybird, moth, termite, wasp
7 - clam, crab, crayfish, lobster, octopus, scorpion, seawasp, slug, starfish, worm
To train a neural network, there are six steps to follow:
1. Normalize the data
2. Create a Neuroph project
3. Create a training set
4. Create a neural network
5. Train the network
6. Test the network to make sure that it is trained properly
Step 1. Normalizing the data
First the dataset must be normalized. This means that every value in the dataset must be a number between 0 and 1.
In this case, the thirteenth and seventeenth columns must be normalized. After doing that we had 28 columns: 21 inputs and 7 outputs.
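The following Java sketch shows one way the normalization could produce 21 inputs and 7 outputs. It rests on assumptions that are not stated in the text: that the animal name is dropped, that the thirteenth and seventeenth remaining columns are legs and type, and that each is expanded into one binary column per possible value (legs takes the values 0, 2, 4, 5, 6 and 8 in the UCI description, type takes 1 to 7). The class name, method names and column order are illustrative only.

```java
/** Sketch: one-of-N encoding so that every value in a row is 0 or 1. */
final class ZooNormalizer {
    // Leg counts listed for the Zoo data set in the UCI attribute description.
    static final int[] LEG_VALUES = {0, 2, 4, 5, 6, 8};
    static final int CLASS_COUNT = 7;
    // Assumed layout of a raw row without the animal name:
    // 16 attributes (hair .. catsize) followed by the type.
    static final int LEGS_INDEX = 12;   // thirteenth column
    static final int TYPE_INDEX = 16;   // seventeenth column

    /** Returns a vector of the given size with a single 1.0 at the given index. */
    static double[] oneHot(int index, int size) {
        double[] v = new double[size];
        v[index] = 1.0;
        return v;
    }

    /** Encodes a leg count as 6 binary columns. */
    static double[] encodeLegs(int legs) {
        for (int i = 0; i < LEG_VALUES.length; i++) {
            if (LEG_VALUES[i] == legs) {
                return oneHot(i, LEG_VALUES.length);
            }
        }
        throw new IllegalArgumentException("Unexpected leg count: " + legs);
    }

    /** Encodes a class label 1..7 as 7 binary columns. */
    static double[] encodeType(int type) {
        if (type < 1 || type > CLASS_COUNT) {
            throw new IllegalArgumentException("Unexpected type: " + type);
        }
        return oneHot(type - 1, CLASS_COUNT);
    }

    /** Turns 17 raw integers into 28 values: 21 inputs followed by 7 outputs. */
    static double[] normalizeRow(int[] raw) {
        if (raw.length != 17) {
            throw new IllegalArgumentException("Expected 17 values, got " + raw.length);
        }
        double[] out = new double[28];
        int pos = 0;
        for (int i = 0; i < TYPE_INDEX; i++) {
            if (i == LEGS_INDEX) {
                for (double bit : encodeLegs(raw[i])) {
                    out[pos++] = bit;
                }
            } else {
                out[pos++] = raw[i]; // Boolean attribute, already 0 or 1
            }
        }
        for (double bit : encodeType(raw[TYPE_INDEX])) {
            out[pos++] = bit;
        }
        return out;
    }
}
```

Under these assumptions the 15 Boolean attributes stay as they are, legs becomes 6 columns and type becomes 7 columns, which gives 15 + 6 = 21 inputs and 7 outputs.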
Step 2. Creating a new Neuroph project.
Now we need to create our project. First, start Neuroph Studio and create a new project by selecting File --> New Project.
After that, we click the "Next" button and enter the project name. The name of our project is "ZOOProjekt". Then press the "Finish" button to create the project.
Step 3. Creating a Training set
To create a training set, we choose Training --> New Training Set. We need a training set to teach our neural network to classify animal species.
Select the training set file type and click "Next". After that, enter the training set name. Select the supervised type, because our dataset has both input and output attributes, then enter the number of inputs and the number of outputs.
Number of inputs: 21
Number of outputs: 7
In our example the name of the training set is "TrainingAllData".
After pressing "Next", we must insert data into the training set table. Click the "Load From File" button.
Now we choose our dataset file, select the values separator (in our case, the comma) and press "Load" to load the chosen dataset.
After loading, click "Finish", and then we can start training the network on our dataset.
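To make the file layout concrete, here is a minimal Java sketch of how one comma-separated line of the normalized file splits into 21 inputs and 7 outputs. It is not Neuroph Studio's loader; the class and method names are hypothetical, and it assumes the inputs come first on each line, followed by the outputs.

```java
/** Sketch: split one comma-separated line into input and output values. */
final class TrainingRowParser {
    static final int INPUTS = 21;
    static final int OUTPUTS = 7;

    /** Returns a two-element array: element 0 holds the inputs, element 1 the outputs. */
    static double[][] parse(String line) {
        String[] cells = line.trim().split(",");
        if (cells.length != INPUTS + OUTPUTS) {
            throw new IllegalArgumentException(
                "Expected " + (INPUTS + OUTPUTS) + " values, got " + cells.length);
        }
        double[] in = new double[INPUTS];
        double[] out = new double[OUTPUTS];
        for (int i = 0; i < cells.length; i++) {
            double value = Double.parseDouble(cells[i].trim());
            if (i < INPUTS) {
                in[i] = value;            // first 21 values are network inputs
            } else {
                out[i - INPUTS] = value;  // last 7 values are desired outputs
            }
        }
        return new double[][] {in, out};
    }
}
```

A line with any other number of values is rejected, which is a cheap way to catch a wrong separator or a row that was not normalized.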
After this, everything is ready for the creation of neural networks. We will create several neural networks with different parameters, and determine which is the best solution for our problem by testing them.
Training attempt 1
Step 4.1 Creating a neural network
Now we are going to create our neural network. Click on our project in the "Project" window, then click "New", then "Neural Network". Now we must enter the neural network name and choose a neural network type.
Network Type: Multi Layer Perceptron
Our network will be called "zooMreza1", and we choose the "Multi Layer Perceptron" type. A multilayer perceptron (MLP) is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs.
By pressing the "Next" button, we get the next window.
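As a rough illustration of what "feedforward" means, the Java sketch below pushes an input vector through one hidden layer and one output layer. It is a generic textbook forward pass, not Neuroph's implementation; the class name is hypothetical and the sigmoid activation is assumed here for illustration.

```java
/** Sketch: forward pass of a multilayer perceptron with one hidden layer. */
final class TinyMlp {
    /** Logistic sigmoid, which squashes any real number into the range (0, 1). */
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** One fully connected layer: out[j] = sigmoid(bias[j] + sum over i of weights[j][i] * in[i]). */
    static double[] layer(double[] in, double[][] weights, double[] bias) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = bias[j];
            for (int i = 0; i < in.length; i++) {
                sum += weights[j][i] * in[i];
            }
            out[j] = sigmoid(sum);
        }
        return out;
    }

    /** Input -> hidden layer -> output layer; data only ever flows forward. */
    static double[] forward(double[] input,
                            double[][] hiddenWeights, double[] hiddenBias,
                            double[][] outputWeights, double[] outputBias) {
        double[] hidden = layer(input, hiddenWeights, hiddenBias);
        return layer(hidden, outputWeights, outputBias);
    }
}
```

For a network with 21 inputs, 15 hidden neurons and 7 outputs, hiddenWeights would be a 15 x 21 array and outputWeights a 7 x 15 array; training consists of adjusting those numbers.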
Training Algorithm: Backpropagation with Momentum
Number of inputs: 21
Hidden neurons: 15
Number of outputs: 7
First, we enter the number of input and output neurons. The number of input and output neurons is the same as in the training set.
Now we have to choose the number of hidden layers and the number of neurons in each. Deciding the number of neurons in the hidden layers is a very important part of deciding the overall neural network architecture. Though these layers do not directly interact with the external environment, they have a tremendous influence on the final output. Both the number of hidden layers and the number of neurons in each of these hidden layers must be carefully considered.
There are many rule-of-thumb methods for determining the correct number of neurons to use in the hidden layers, such as the following:
- The number of hidden neurons should be between the size of the input layer and the size of the output layer.
- The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
- The number of hidden neurons should be less than twice the size of the input layer.
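The three rules above can be checked for our network with a few lines of Java. This is only a sketch with hypothetical names; the rules are heuristics, so the results are starting points rather than prescriptions.

```java
/** Sketch: the three rules of thumb for the hidden layer size. */
final class HiddenNeuronRules {
    /** Rule 1: hidden size lies between the output size and the input size. */
    static boolean betweenInputAndOutput(int hidden, int inputs, int outputs) {
        return hidden >= Math.min(inputs, outputs) && hidden <= Math.max(inputs, outputs);
    }

    /** Rule 2: two thirds of the input size, plus the output size. */
    static int twoThirdsRule(int inputs, int outputs) {
        return (int) Math.round(2.0 * inputs / 3.0) + outputs;
    }

    /** Rule 3: hidden size is less than twice the input size. */
    static boolean lessThanTwiceInput(int hidden, int inputs) {
        return hidden < 2 * inputs;
    }
}
```

With 21 inputs and 7 outputs, rule 1 allows 7 to 21 hidden neurons, rule 2 suggests 2/3 * 21 + 7 = 21, and rule 3 allows fewer than 42. The 15 hidden neurons used in this attempt satisfy rules 1 and 3.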
Next, we check "Use Bias Neurons", because bias neurons are added to neural networks to help them learn patterns: a bias lets each neuron shift its activation threshold.
Then, for the transfer function select "Sigmoid", because it is a good fit for our kind of problem (its outputs lie between 0 and 1, like our normalized targets), and finally select "Backpropagation with Momentum" as the learning rule, because the Backpropagation with Momentum algorithm typically shows a much higher rate of convergence than the plain Backpropagation algorithm.
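The idea behind momentum can be sketched in Java as follows. This is the standard textbook update rule, not necessarily the exact code Neuroph runs; the class name is hypothetical, and the learning rate and momentum values in the usage note are illustrative only.

```java
/** Sketch: gradient descent weight update with a momentum term. */
final class MomentumUpdate {
    /**
     * Applies delta[i] = -learningRate * gradient[i] + momentum * previousDelta[i]
     * to every weight in place and returns the deltas for use in the next step.
     */
    static double[] step(double[] weights, double[] gradient, double[] previousDelta,
                         double learningRate, double momentum) {
        double[] delta = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            delta[i] = -learningRate * gradient[i] + momentum * previousDelta[i];
            weights[i] += delta[i];
        }
        return delta;
    }
}
```

For example, with a learning rate of 0.2, a momentum of 0.7 and a constant gradient of 0.5, the first step changes a weight by -0.1 and the second by -0.17. Because part of the previous change is carried over, steps in a consistent direction grow larger, which is why momentum tends to speed up convergence.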