BLOOD TRANSFUSION SERVICE CENTER
An example of a multivariate data type classification problem using Neuroph
by Ana Jovanovic, Faculty of Organizational Sciences, University of Belgrade
An experiment for the Intelligent Systems course
Introduction
This experiment shows how neural networks and Neuroph Studio can be used to solve classification
problems. We will work with several architectures and determine which ones solve the problem well
and which do not.
Classification is a task that is often encountered in everyday life. A classification process involves
assigning objects to predefined groups or classes based on a number of observed attributes of those
objects. Although there are more traditional tools for classification, such as certain statistical
procedures, neural networks have proven to be an effective solution for this type of problem. There are
a number of advantages to using neural networks: they are data driven, they are self-adaptive, and they
can approximate any function, linear as well as non-linear (which is quite important here, because
classes often cannot be separated by linear functions). Neural networks classify objects rather simply:
they take data as input, derive rules based on those data, and make decisions.
For a better understanding of this experiment, we suggest that you first look at the links below:
- Neuroph Studio - Getting started
- Multi Layer Perceptron
Introducing the problem
The goal is to teach the neural network to predict whether a blood donor gave blood in March 2007,
based on characteristics that are given as input parameters. The first thing we need is a data set.
The data set is available here.
It is called the Blood Transfusion Service Center data set; the data were taken from the Blood
Transfusion Service Center in Hsin-Chu City, Taiwan.
The data set contains 748 instances and 5 attributes, where the first four attributes are the inputs
and the fifth is the output. Each instance belongs to one of 2 possible classes (the person donated /
did not donate blood in March 2007).
Input attributes are:
- R (Recency - months since the last donation, numerical)
- F (Frequency - total number of donations, numerical)
- M (Monetary - total blood donated in c.c., numerical)
- T (Time - months since the first donation, numerical)
The output attribute is a binary variable representing whether he/she donated blood in March 2007:
1 - stands for donating blood;
0 - stands for not donating blood.
Once the data set is downloaded, it cannot be inserted into Neuroph in its original form. To make it
usable for this classification problem, we need to normalize the data first. The type of neural network
that will be used in this experiment is a multi layer perceptron with backpropagation.
Procedure of training a neural network
In order to train a neural network, there are six steps to be made:
- Prepare the data set
- Create a Neuroph project
- Create a training set
- Create a neural network
- Train the network
- Test the network to make sure that it is trained properly
1. Step: Preparation of the data set
In order to train the neural network, this data set has to be normalized. Normalization means that all values in the data set are rescaled to the range from 0 to 1.
For that purpose we use the following formula:

Xn = (X - Xmin) / (Xmax - Xmin)

Where:
X - value that should be normalized
Xn - normalized value
Xmin - minimum value of X
Xmax - maximum value of X
The last attribute already takes the values 0 or 1, so there is no need to apply the formula to it.
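As a concrete illustration, here is a minimal Java sketch of this min-max formula. The attribute range used in main is only an assumed example, not the actual extremes of the data set:

```java
public class Normalizer {

    // Min-max normalization: Xn = (X - Xmin) / (Xmax - Xmin)
    public static double normalize(double x, double xMin, double xMax) {
        return (x - xMin) / (xMax - xMin);
    }

    public static void main(String[] args) {
        // Example only: a Recency value of 4 months, assuming the attribute
        // spans 0 to 74 months in the data set (an assumed range).
        System.out.println(normalize(4, 0, 74)); // prints 0.05405...
    }
}
```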
2. Step: Creating a new Neuroph project
When all the data are normalized, we just need to put them in a .csv file and everything is set for the
creation of a new training set. First, a new Neuroph project needs to be created by clicking on the 'File'
menu, and then 'New project'. The project will be called 'Blood Transfusion Service Center'.
When we click 'Finish', the new project is created and it will appear in the 'Projects' window, in the top
left corner of Neuroph Studio.
3. Step: Creating a training set
Now we need to create a new training set by right-clicking on our project and selecting 'New', then
'Training set'. We give it a name and then set the parameters. The type is chosen to be 'Supervised',
because we want to minimize the prediction error through an iterative procedure. Supervised training is
accomplished by giving the neural network a set of sample data along with the anticipated outputs for each
of these samples; that sample data will be our data set. Supervised training is the most common way of
training a neural network. As it proceeds, the neural network is taken through a number of iterations,
until its output matches the anticipated output with a reasonably small rate of error. The error rate we
consider appropriate for a well-trained network is set just before the training starts; usually, that
number will be around 0.01.
Next, we set the number of inputs, which is 4, because there are 4 input attributes, and the number of
outputs, which is 1, as explained above.
After clicking 'Next', we need to edit the training set table. In this case we click 'Load from file' to
select a file from which the table will be loaded. We click on 'Choose file', find the file with the data
we need, and then select a value separator. In this case it is a comma, but it can also be a space, tab,
or semicolon.
Then we click 'Next', and a window that represents our table of data will appear. We can see that
everything is in order: there are 4 input and 1 output column, all the data are in the range 0-1, and we
can now click on 'Finish'.
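If you prefer to prepare the training set in code rather than through the Studio wizard, the Neuroph API can, to the best of our knowledge, load the same comma-separated file directly. A minimal sketch, with a placeholder file name:

```java
import org.neuroph.core.data.DataSet;

public class LoadTrainingSet {
    public static void main(String[] args) {
        // 4 input columns, 1 output column, comma as the value separator.
        // "transfusion-normalized.csv" is a placeholder file name.
        DataSet trainingSet =
                DataSet.createFromFile("transfusion-normalized.csv", 4, 1, ",");
        System.out.println("Loaded " + trainingSet.getRows().size() + " rows");
    }
}
```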
After we have done this, everything is ready for the creation of the neural networks. We will create
several neural networks, all with different sets of parameters, and determine which is the best solution
for our problem by testing them. This is why there will be several variants of steps 4, 5 and 6.
Training attempt 1
4.1 Step: Creating a neural network
To create the optimal neural network architecture for the problem, we will pay special attention to the
number of neurons in the hidden layer. There is no formula that would calculate that number for every
possible problem; there are only some rules and guidelines. For example, more neurons make the network
more flexible, but also more sensitive to noise. So the answer is to have just enough neurons to solve the
problem appropriately, but no more. Some of the rule-of-thumb methods for determining an appropriate
number of neurons in the hidden layers are:
- The number of hidden neurons should be between the size of the input layer and the size of the output
layer - in this case, between 1 and 4.
- The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output
layer - applied to this problem: 2/3 * 4 + 1 = 3.66, so we round up to 4.
- The number of hidden neurons should be less than twice the size of the input layer - here, less than 8,
although this is not a strict limit.
The first neural network that we will test is called BTSC_4_4_1. We create it by right-clicking our
project in the 'Projects' window, and then clicking 'New' and 'Neural Network'. We set the name and the
type of the network; 'Multi Layer Perceptron' will be selected. The multi layer perceptron is the most
widely studied and used neural network classifier. It is capable of modeling complex functions, it is
robust (good at ignoring irrelevant inputs and noise), and it can adapt its weights and/or topology in
response to changes in the environment. Another reason we use this type of perceptron is simply that it is
very easy to use: it implements a black-box point of view, and can be used with little knowledge about the
function to be modeled.
When we have selected the type of the perceptron, we can click 'Next'. A new window will appear, where we
set some more parameters that are characteristic of the multi layer perceptron. The number of input and
output neurons is the same as the number of inputs and outputs in the training set. However, now we have
to select the number of hidden layers, and the number of neurons in each layer. Guided by the rule that
problems requiring two hidden layers are rarely encountered (and that there is currently no theoretical
reason to use neural networks with more than two hidden layers), we decide on only one hidden layer. As
for the number of units in that layer, since one should only use as many neurons as are needed to solve
the problem, we start with four, the value suggested by the second rule of thumb above (hence the name
BTSC_4_4_1). Networks with many hidden neurons can represent functions with any kind of shape, but this
flexibility can cause the network to learn the noise in the data. This is called 'overtraining'.
We have checked 'Use Bias Neurons', and chosen the sigmoid transfer function (because the range of our
data is 0-1; had it been -1 to 1, we would have checked 'Tanh'). As the learning rule we have chosen
'Backpropagation with Momentum'. This learning rule will be used in all the networks we create, because
backpropagation is the most commonly used technique and is well suited to this type of problem. In this
method, the objects in the training set are given to the network one by one in random order, and the
weights are updated each time in order to make the current prediction error as small as possible. This
process continues until the weights converge. We have also chosen to add an extra term, momentum, to the
standard backpropagation formula in order to improve the efficiency of the algorithm.
Next, we click 'Finish', and the first neural network that we will test is completed.
If you want to see the neural network as a graph, just select 'Graph View'. The rightmost nodes in the
first and second layers are the bias neurons explained above.
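For readers who would rather build the network in code than through the wizard, a minimal sketch of the same BTSC_4_4_1 configuration using the standard Neuroph API might look like this (to our knowledge, MultiLayerPerceptron adds bias neurons by default):

```java
import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.MomentumBackpropagation;
import org.neuroph.util.TransferFunctionType;

public class CreateNetwork {
    public static void main(String[] args) {
        // 4 input neurons, one hidden layer with 4 neurons, 1 output neuron,
        // sigmoid transfer function (our data lie in the 0-1 range).
        MultiLayerPerceptron network =
                new MultiLayerPerceptron(TransferFunctionType.SIGMOID, 4, 4, 1);

        // Backpropagation with Momentum as the learning rule.
        network.setLearningRule(new MomentumBackpropagation());

        // Save under the same name as in the Studio project.
        network.save("BTSC_4_4_1.nnet");
    }
}
```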
5.1 Step: Training the neural network
Now we need to train the network using the training set we have created. We select the training set and
click 'Train'. A new window will open, where we need to set the learning parameters. The maximum error
will be 0.01, the learning rate 0.2, and the momentum 0.2. The learning rate is basically the size of the
'steps' the algorithm takes when minimizing the error function in an iterative process. We click 'Train'
and see what happens.
The error graph dropped to a horizontal asymptote after only 20 iterations. This means that the neural
network quickly learned the training set (85 percent of the full data set).
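Expressed in code, the same training run could look roughly like this; the parameter values are the ones from the text, and the training-set file name is again the placeholder used earlier:

```java
import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.learning.MomentumBackpropagation;

public class TrainNetwork {
    public static void main(String[] args) {
        // Load the training set and the network created earlier.
        DataSet trainingSet =
                DataSet.createFromFile("transfusion-normalized.csv", 4, 1, ",");
        NeuralNetwork network = NeuralNetwork.createFromFile("BTSC_4_4_1.nnet");

        // Same parameters as in the Studio dialog:
        // maximum error 0.01, learning rate 0.2, momentum 0.2.
        MomentumBackpropagation rule = new MomentumBackpropagation();
        rule.setMaxError(0.01);
        rule.setLearningRate(0.2);
        rule.setMomentum(0.2);
        network.setLearningRule(rule);

        network.learn(trainingSet); // blocks until the max error is reached
        network.save("BTSC_4_4_1.nnet");
    }
}
```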
6.1 Step: Testing the neural network
After the network is trained, we click 'Test' in order to see the total error and all the individual
errors. The results show that the total error is 0.083382.
The final part of testing this network is testing it with several input values. To do that, we select 5
random instances from our data set. The output the neural network produced for each input is shown in the
last column of the table below.
| Observation | Months since last donation | Total number of donations | Total blood donated (c.c.) | Months since first donation | Donated blood in March 2007 (1 = yes, 0 = no) | Result (obtained output) |
|---|---|---|---|---|---|---|
| 1. | 0.03 | 1 | 1 | 1 | 1 | 0.363940 |
| 2. | 0.18 | 0.06 | 0.06 | 0.20 | 0 | 0.378470 |
| 3. | 0.05 | 0 | 0 | 0.2 | 0 | 0.379896 |
| 4. | 0.03 | 0 | 0 | 0 | 0 | 0.379988 |
| 5. | 0.03 | 0 | 0 | 0 | 1 | 0.379988 |
The network has given accurate results in 2 of the first 3 cases, while for the last two cases it
concluded that the person had not donated blood (correct for case 4, incorrect for case 5, giving 3 of 5
overall). These two cases have the same inputs but different outputs; because many more people with the
same characteristics did not give blood than did, the network opted for the majority answer.
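For completeness, querying the trained network with a single normalized observation could look like the sketch below; it assumes the network was saved under the file name used earlier, and uses row 1 of the table above as input:

```java
import org.neuroph.core.NeuralNetwork;

public class TestNetwork {
    public static void main(String[] args) {
        // Load the network saved earlier.
        NeuralNetwork network = NeuralNetwork.createFromFile("BTSC_4_4_1.nnet");

        // Row 1 of the table above: R=0.03, F=1, M=1, T=1.
        network.setInput(0.03, 1.0, 1.0, 1.0);
        network.calculate();
        System.out.println("Output: " + network.getOutput()[0]); // ~0.36 here
    }
}
```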
Training attempt 2
4.2 Step: Creating a neural network
Following these rules, we now decide on a neural network that contains 3 hidden neurons in one hidden
layer. Again, we type in the standard number of inputs and outputs, check 'Use Bias Neurons', choose the
sigmoid transfer function, and select 'Backpropagation with Momentum' as the learning rule.
Graphical representation of the neural network
5.2 Step: Training the network
The neural network that will be used as our second solution to the problem has been created. Like the
previous one, we will train it with a training set created from 75% of the sample. We select 'BTSC75',
click 'Train', and a new window appears, asking us to fill in the parameters. This time the maximum error
is again 0.01, and we do not limit the maximum number of iterations. As for the learning parameters, the
learning rate will be 0.1 and the momentum 0.3. After we click 'Train', the iteration process starts. The
total net error drops very fast, and training stops after 69 iterations with an error of 0.002456.
6.2 Step: Testing the network
The total mean square error measures the average of the squares of the 'errors', where an error is the
amount by which the value predicted by the estimator differs from the quantity to be estimated. A mean
square error of zero, meaning that the estimator predicts the observations with perfect accuracy, is the
ideal, but it is practically never attainable. The unbiased model with the smallest mean square error is
generally interpreted as the one that best explains the variability in the observations. The test showed
that the total mean square error is 0.08331974. The goal of experimental design is to construct
experiments in such a way that, when the observations are analyzed, the mean square error is close to zero
relative to the magnitude of at least one of the estimated treatment effects.
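In symbols, for desired outputs di and network outputs yi over n test rows, MSE = (1/n) * sum of (di - yi)^2. A minimal sketch of that computation with the Neuroph classes used above:

```java
import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class MeanSquareError {

    /** Mean square error of a network's single output over a data set. */
    public static double compute(NeuralNetwork<?> network, DataSet testSet) {
        double sumSquaredError = 0;
        for (DataSetRow row : testSet.getRows()) {
            network.setInput(row.getInput());
            network.calculate();
            double error = row.getDesiredOutput()[0] - network.getOutput()[0];
            sumSquaredError += error * error;
        }
        return sumSquaredError / testSet.getRows().size();
    }
}
```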
The only thing left is to test the network with several input values. To do that, we select 5 random
instances from our data set and feed them to the network. The output the neural network produced for each
input is shown in the last column of the table below.
| Observation | Months since last donation | Total number of donations | Total blood donated (c.c.) | Months since first donation | Donated blood in March 2007 (1 = yes, 0 = no) | Result (obtained output) |
|---|---|---|---|---|---|---|
| 1. | 0.05 | 0.20 | 0.20 | 0.27 | 0 | 0.493543 |
| 2. | 0.03 | 0.02 | 0.02 | 0.02 | 0 | 0.243438 |
| 3. | 0.12 | 0 | 0 | 0.07 | 0 | 0.141070 |
| 4. | 0.31 | 0 | 0 | 0.22 | 0 | 0.054891 |
| 5. | 0.03 | 0.04 | 0.04 | 0.09 | 0 | 0.277875 |
The network guessed correctly in all five instances. After this test, we can conclude that this solution
does not need to be rejected.
Training attempt 4
4.4 Step: Creating a new neural network
The next neural network will have the same number of input and output neurons, but a different number of
hidden layers and neurons in them: we will use 4 neurons in the first hidden layer and 3 in the second.
The network is named BloodTransfusionSet2.
The neural network looks like this:
5.4 Step: Training the network
We will train the network with a learning rate of 0.2, a momentum of 0.2, and a maximum error of 0.01.
6.4 Step: Testing the network
We clicked 'Test' after training, to see whether more neurons contribute to better training. As we can
see, the error is not reduced; it is higher than when we tested with 75 and 85 percent of the data set.
The final part of testing this network is testing it with several input values. To do that, we select 3
random instances from our data set. The output the neural network produced for each input is shown in the
last column of the table below.
| Observation | Months since last donation | Total number of donations | Total blood donated (c.c.) | Months since first donation | Donated blood in March 2007 (1 = yes, 0 = no) | Result (obtained output) |
|---|---|---|---|---|---|---|
| 1. | 0.22 | 0.04 | 0.04 | 0.88 | 0 | 0.10968 |
| 2. | 0.28 | 0 | 0 | 0.20 | 0 | 0.10968 |
| 3. | 0.05 | 0.12 | 0.12 | 0.63 | 0 | 0.196674 |
The network guessed correctly in all 3 instances. After this test, we can conclude that this solution does
not need to be rejected.
Training attempt 15
5.15 Step: Training the network
In this attempt, we will use the network with 3 hidden neurons (BloodTransfusionNet15). For the learning
parameters, we set a maximum error of 0.01, a learning rate of 0.2 and a momentum of 0.2. Then we click
'Train' and wait.
The training stopped after 25 iterations, with a total net error of 0.00123.
6.15 Step: Testing the neural network
As explained in attempt 2, the total mean square error measures the average of the squares of the
individual errors, and the model with the smallest mean square error is generally interpreted as the one
that best explains the variability in the observations. The test showed that the total mean square error
is 0.2029742515.
Now we need to examine the individual errors for every single instance and check whether there are any
extreme values. With a large data set, individual testing takes a lot of time, so instead of testing all
748 observations we randomly choose 5 observations to subject to individual testing.
In the introduction we mentioned that the output has two values, 0 or 1: if a person gave blood in March
2007, the output has a value of 1, and if not, a value of 0.
The values of the inputs, outputs and individual errors for the 5 randomly selected observations are in
the table below:
| Observation | Months since last donation | Total number of donations | Total blood donated (c.c.) | Months since first donation | Donated blood in March 2007 (1 for yes, 0 for no) | Result |
|---|---|---|---|---|---|---|
| 1. | 0.31 | 0.02 | 0.02 | 0.38 | 0 | 0.0872 |
| 2. | 0.31 | 0 | 0 | 0.22 | 0 | 0 |
| 3. | 0.31 | 0.12 | 0.12 | 0.9 | 0.0872 | 0.0872 |
| 4. | 0.03 | 0.12 | 0.12 | 0.78 | 0.1398 | 0.1398 |
| 5. | 0.28 | 0 | 0 | 0.2 | 0.0872 | 0.0872 |
The network guessed all of them right. We can conclude that this network has a good ability to generalize,
and the training of this network has been validated.
Below is a table that summarizes the whole experiment. The two best solutions to the problem, attempts 2
and 15, stand out with the lowest total errors and 5/5 scores on the random-input test.
| Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Maximum error | Learning rate | Momentum | Total mean square error | Number of iterations | 5 random inputs test (correct guesses) | Network trained |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 1 | 85% of full data set | 0.01 | 0.2 | 0.2 | 0.00395 | 20 | 3/5 | yes |
| 2 | 3 | 1 | 75% of full data set | 0.01 | 0.1 | 0.3 | 0.00245 | 69 | 5/5 | yes |
| 3 | 4, 3 | 2 | full | 0.01 | 0.2 | 0.7 | 0.08195 | 1617 | - | no |
| 4 | 4, 3 | 2 | full | 0.01 | 0.2 | 0.2 | 0.00563 | 105 | 3/3 | yes |
| 5 | 4, 4 | 2 | full | 0.01 | 0.2 | 0.2 | 0.02585 | 21617 | - | no |
| 6 | 4, 4 | 2 | full | 0.01 | 0.1 | 0.1 | 0.371029 | 8445 | - | no |
| 7 | 6, 7 | 2 | full | 0.01 | 0.2 | 0.2 | 0.06261 | 1158 | - | no |
| 8 | 6, 7 | 2 | full | 0.01 | 0.2 | 0.7 | 0.06106 | 4011 | - | no |
| 9 | 10, 5 | 2 | full | 0.01 | 0.2 | 0.2 | 0.07119 | 2516 | - | no |
| 10 | 3 | 1 | 15% of full data set | 0.01 | 0.2 | 0.7 | 0.03336 | 20955 | - | no |
| 11 | 3 | 1 | 25% of full data set | 0.01 | 0.2 | 0.7 | 0.04852 | 23795 | - | no |
| 12 | 2 | 1 | 15% of full data set | 0.01 | 0.2 | 0.7 | 0.02241 | 20712 | - | no |
| 13 | 10 | 1 | 15% of full data set | 0.01 | 0.2 | 0.7 | 0.03241 | 3808 | - | no |
| 14 | 10 | 1 | 15% of full data set | 0.01 | 0.2 | 0.2 | 0.00791 | 6030 | - | no |
| 15 | 3 | 1 | full | 0.01 | 0.2 | 0.2 | 0.00123 | 25 | 5/5 | yes |
Advanced Training Techniques
We want to check the network's performance once the training is complete. A learning neural network is
expected to extract rules from a finite set of examples. It is often the case that the neural network
memorizes the training data well, but fails to generate correct outputs for some of the new test data.
Therefore, it is desirable to apply some form of regularization.
One form of regularization is to split the training set into a new training set and a validation set.
After each pass through the new training set, the neural network is evaluated on the validation set, and
the network with the best performance on the validation set is then used for actual testing. The new
training set would consist of, say, 80-90% of the original training set, and the remaining 10-20% would
form the validation set. The validation error rate is then computed periodically during training, and
training stops when the validation error rate starts to go up. Validation error is not a good estimate of
the generalization error if the initial set consists of a relatively small number of instances; however,
that is not the case here.
One way to get an appropriate estimate of the generalization error is to run the neural network on a test
set of data that is not used at all during the training process. The generalization error is usually
defined as the expected value of the square of the difference between the learned function and the exact
target.
In the following examples we will examine the generalization error: from example to example we will
increase the number of instances in the training set used for training, and decrease the number of
instances in the set used for testing.
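A random split of a Neuroph DataSet could be sketched as follows; the 4/1 input/output sizes match our data set, and this manual shuffle-and-split avoids relying on version-specific built-in sampling methods:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class SplitDataSet {

    /** Randomly splits a data set; e.g. ratio = 0.9 gives a 90%/10% split. */
    public static DataSet[] split(DataSet dataSet, double ratio) {
        List<DataSetRow> rows = new ArrayList<>(dataSet.getRows());
        Collections.shuffle(rows); // random selection of instances

        int cut = (int) (rows.size() * ratio);
        DataSet trainingSet = new DataSet(4, 1); // e.g. BTSC90
        DataSet testSet = new DataSet(4, 1);     // e.g. BTSC10
        for (int i = 0; i < rows.size(); i++) {
            (i < cut ? trainingSet : testSet).addRow(rows.get(i));
        }
        return new DataSet[] { trainingSet, testSet };
    }
}
```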
Training attempt 16
3.16 Step: Creating a training set
We will randomly choose 90% of the instances of the training set for training and the remaining 10% for
testing. The first group will be called BTSC90, and the second BTSC10.
5.16 Step: Training the network
Unlike the previous training attempts, there is no need to create a new neural network now. The advanced
training techniques consist of examining the performance of existing architectures using new training and
test sets of data. We found satisfactory results using the BloodTransfusionNet15 architecture, so for the
rest of this article we will use not only this architecture, but also the training parameters that
previously brought us the desired results with it.
Now we open the neural network BloodTransfusionNet15, select the training set BTSC90, and in the new
network window press the 'Train' button. The parameters we now need to set will be the same as in the
previous training attempt: the maximum error will be 0.01, the learning rate 0.2, and the momentum 0.2. We
will not limit the maximum number of iterations, and we will check 'Display error graph', as we want to
see how the error changes throughout the iteration sequence. Then we press the 'Train' button again and
see what happens.
The error function does not fluctuate much; it moves almost horizontally, and training stops at iteration
11239 without finding an optimal solution.
We train on 70, 80, and 90 percent of the data set and test on the remaining 30, 20, and 10 percent,
randomly selected. We obtain the following table of results:
| Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Test set | Maximum error | Learning rate | Momentum | Iterations | Total net error (during training) | Total mean square error (during testing) | Network trained |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 16 | 3 | 1 | 90% | 10% | 0.01 | 0.2 | 0.2 | 11239 | 0.01185 | 0.06259 | no |
| 17 | 3 | 1 | 80% | 20% | 0.01 | 0.2 | 0.2 | 3674 | 0.04250 | 0.17187 | no |
| 18 | 3 | 1 | 70% | 30% | 0.01 | 0.2 | 0.2 | 2065 | 0.00932 | 0.14516 | no |
After the 18th training attempt, we concluded that there are some cases that make a big impact on the
total mean square error, with individual errors as large as 0.8 and 0.7. By a big error we mean that the
network classified an instance completely wrong (for example, it output 1 when it should have been 0), and
such errors make a huge impact on the total mean square error.
Because all of these networks failed to reach an error of less than 0.01, we can say that this network
failed to generalize the problem.
Conclusion
The solutions tested in this experiment have shown that the choice of the number of hidden neurons is
crucial to the effectiveness of a neural network. The experiment also showed that the success of a neural
network is very sensitive to the parameters chosen in the training process: the learning rate must not be
too high, and the maximum error must not be too low. The results have also shown that the total mean
square error alone does not directly reflect the success of a network.
DOWNLOAD
Data set used in this tutorial
The prepared data set
Neuroph projects
The samples used for advanced techniques
See also:
Multi Layer Perceptron Tutorial