BLOOD TRANSFUSION SERVICE CENTER
An example of a multivariate data type classification problem using Neuroph
by Ana Jovanovic, Faculty of Organizational Sciences, University of Belgrade
An experiment for the Intelligent Systems course
Introduction
This experiment shows how neural networks and Neuroph Studio can be used to solve classification
problems. We will work with several architectures and determine which ones solve the problem well
and which do not.
Classification is a task that is often encountered in everyday life. A classification process involves
assigning objects to predefined groups or classes based on a number of observed attributes of those
objects. Although there are more traditional tools for classification, such as certain statistical
procedures, neural networks have proven to be an effective solution for this type of problem. There are
a number of advantages to using neural networks: they are data driven, they are self-adaptive, and they
can approximate any function, linear as well as non-linear (which is quite important here, because
classes often cannot be separated by linear functions). Neural networks classify objects rather simply:
they take data as input, derive rules based on those data, and make decisions.
For a better understanding of this experiment, we suggest that you first look at the links below:
- Neuroph Studio - Getting started
- Multi Layer Perceptron
Introducing the problem
The goal is to teach the neural network to predict whether a blood donor gave blood in March 2007,
based on characteristics that are given as input parameters. The first thing we need is a data set.
The data set is available here.
It is called the Blood Transfusion Service Center data set; the data were taken from the Blood
Transfusion Service Center in Hsin-Chu City, Taiwan.
The data set contains 748 instances and 5 attributes, where the first four attributes are the inputs
and the fifth is the output. Each instance belongs to one of 2 possible classes (the person donated /
did not donate blood in March 2007).
Input attributes are:
- R (Recency - months since the last donation, numerical)
- F (Frequency - total number of donations, numerical)
- M (Monetary - total blood donated in c.c., numerical)
- T (Time - months since the first donation, numerical)
The output attribute is a binary variable representing whether he/she donated blood in March 2007:
1 - stands for donating blood;
0 - stands for not donating blood.
Once the data set is downloaded, it cannot be inserted into Neuroph in its original form. To make it
usable for this classification problem, we need to normalize the data first. The type of neural network
that will be used in this experiment is a multi layer perceptron with backpropagation.
Procedure of training a neural network
In order to train a neural network, there are six steps to be made:
- Prepare the data set
- Create a Neuroph project
- Create a training set
- Create a neural network
- Train the network
- Test the network to make sure that it is trained properly
1. Step: Preparation of the data set
In order to train the neural network, this data set has to be normalized. Normalization means that all values in the data set are rescaled to the range from 0 to 1.
For that purpose we use the following formula:

Xn = (X - Xmin) / (Xmax - Xmin)

Where:
X - value that should be normalized
Xn - normalized value
Xmin - minimum value of X
Xmax - maximum value of X
The last attribute already takes the values 0 or 1, so there is no need to apply the formula to it.
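As a concrete illustration, here is a minimal Java sketch of this min-max formula. The attribute range used in main is only an assumed example, not the actual extremes of the data set:

```java
public class Normalizer {

    // Min-max normalization: Xn = (X - Xmin) / (Xmax - Xmin)
    public static double normalize(double x, double xMin, double xMax) {
        return (x - xMin) / (xMax - xMin);
    }

    public static void main(String[] args) {
        // Example only: a Recency value of 4 months, assuming the attribute
        // spans 0 to 74 months in the data set (an assumed range).
        System.out.println(normalize(4, 0, 74)); // prints 0.05405...
    }
}
```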
2. Step: Creating a new Neuroph project
When all the data are normalized, we just need to put them in a .csv file and everything is set for the
creation of a new training set. First, a new Neuroph project needs to be created by clicking on the 'File'
menu, and then 'New project'. The project will be called 'Blood Transfusion Service Center'.
When we click 'Finish', the new project is created and it will appear in the 'Projects' window, in the top
left corner of Neuroph Studio.
3. Step: Creating a training set
Now we need to create a new training set by right-clicking on our project and selecting 'New', then
'Training set'. We give it a name and then set the parameters. The type is chosen to be 'Supervised',
because we want to minimize the prediction error through an iterative procedure. Supervised training is
accomplished by giving the neural network a set of sample data along with the anticipated outputs for each
of these samples; that sample data will be our data set. Supervised training is the most common way of
training a neural network. As it proceeds, the neural network is taken through a number of iterations,
until its output matches the anticipated output with a reasonably small rate of error. The error rate we
consider appropriate for a well-trained network is set just before the training starts; usually, that
number will be around 0.01.
Next, we set the number of inputs, which is 4, because there are 4 input attributes, and the number of
outputs, which is 1, as explained above.
After clicking 'Next', we need to edit the training set table. In this case we click 'Load from file' to
select a file from which the table will be loaded. We click on 'Choose file', find the file with the data
we need, and then select a value separator. In this case it is a comma, but it can also be a space, tab,
or semicolon.
Then we click 'Next', and a window that represents our table of data will appear. We can see that
everything is in order: there are 4 input and 1 output column, all the data are in the range 0-1, and we
can now click on 'Finish'.
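If you prefer to prepare the training set in code rather than through the Studio wizard, the Neuroph API can, to the best of our knowledge, load the same comma-separated file directly. A minimal sketch, with a placeholder file name:

```java
import org.neuroph.core.data.DataSet;

public class LoadTrainingSet {
    public static void main(String[] args) {
        // 4 input columns, 1 output column, comma as the value separator.
        // "transfusion-normalized.csv" is a placeholder file name.
        DataSet trainingSet =
                DataSet.createFromFile("transfusion-normalized.csv", 4, 1, ",");
        System.out.println("Loaded " + trainingSet.getRows().size() + " rows");
    }
}
```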
After we have done this, everything is ready for the creation of the neural networks. We will create
several neural networks, all with different sets of parameters, and determine which is the best solution
for our problem by testing them. This is why there will be several variants of steps 4, 5 and 6.
Training attempt 1
4.1 Step: Creating a neural network
To create the optimal neural network architecture for the problem, we will pay special attention to the
number of neurons in the hidden layer. There is no formula that would calculate that number for every
possible problem; there are only some rules and guidelines. For example, more neurons make the network
more flexible, but also more sensitive to noise. So the answer is to have just enough neurons to solve the
problem appropriately, but no more. Some of the rule-of-thumb methods for determining an appropriate
number of neurons in the hidden layers are:
- The number of hidden neurons should be between the size of the input layer and the size of the output
layer - in this case, between 1 and 4.
- The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output
layer - applied to this problem: 2/3 * 4 + 1 = 3.66, so we round up to 4.
- The number of hidden neurons should be less than twice the size of the input layer - here, less than 8,
although this is not a strict limit.
The first neural network that we will test is called BTSC_4_4_1. We create it by right-clicking our
project in the 'Projects' window, and then clicking 'New' and 'Neural Network'. We set the name and the
type of the network; 'Multi Layer Perceptron' will be selected. The multi layer perceptron is the most
widely studied and used neural network classifier. It is capable of modeling complex functions, it is
robust (good at ignoring irrelevant inputs and noise), and it can adapt its weights and/or topology in
response to changes in the environment. Another reason we use this type of perceptron is simply that it is
very easy to use: it implements a black-box point of view, and can be used with little knowledge about the
function to be modeled.
When we have selected the type of the perceptron, we can click 'Next'. A new window will appear, where we
set some more parameters that are characteristic of the multi layer perceptron. The number of input and
output neurons is the same as the number of inputs and outputs in the training set. However, now we have
to select the number of hidden layers, and the number of neurons in each layer. Guided by the rule that
problems requiring two hidden layers are rarely encountered (and that there is currently no theoretical
reason to use neural networks with more than two hidden layers), we decide on only one hidden layer. As
for the number of units in that layer, since one should only use as many neurons as are needed to solve
the problem, we start with four, the value suggested by the second rule of thumb above (hence the name
BTSC_4_4_1). Networks with many hidden neurons can represent functions with any kind of shape, but this
flexibility can cause the network to learn the noise in the data. This is called 'overtraining'.
We have checked 'Use Bias Neurons', and chosen the sigmoid transfer function (because the range of our
data is 0-1; had it been -1 to 1, we would have checked 'Tanh'). As the learning rule we have chosen
'Backpropagation with Momentum'. This learning rule will be used in all the networks we create, because
backpropagation is the most commonly used technique and is well suited to this type of problem. In this
method, the objects in the training set are given to the network one by one in random order, and the
weights are updated each time in order to make the current prediction error as small as possible. This
process continues until the weights converge. We have also chosen to add an extra term, momentum, to the
standard backpropagation formula in order to improve the efficiency of the algorithm.
Next, we click 'Finish', and the first neural network that we will test is completed.
If you want to see the neural network as a graph, just select 'Graph View'. The rightmost nodes in the
first and second layers are the bias neurons explained above.
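For readers who would rather build the network in code than through the wizard, a minimal sketch of the same BTSC_4_4_1 configuration using the standard Neuroph API might look like this (to our knowledge, MultiLayerPerceptron adds bias neurons by default):

```java
import org.neuroph.nnet.MultiLayerPerceptron;
import org.neuroph.nnet.learning.MomentumBackpropagation;
import org.neuroph.util.TransferFunctionType;

public class CreateNetwork {
    public static void main(String[] args) {
        // 4 input neurons, one hidden layer with 4 neurons, 1 output neuron,
        // sigmoid transfer function (our data lie in the 0-1 range).
        MultiLayerPerceptron network =
                new MultiLayerPerceptron(TransferFunctionType.SIGMOID, 4, 4, 1);

        // Backpropagation with Momentum as the learning rule.
        network.setLearningRule(new MomentumBackpropagation());

        // Save under the same name as in the Studio project.
        network.save("BTSC_4_4_1.nnet");
    }
}
```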
5.1 Step: Training the neural network
Now we need to train the network using the training set we have created. We select the training set and
click 'Train'. A new window will open, where we need to set the learning parameters. The maximum error
will be 0.01, the learning rate 0.2, and the momentum 0.2. The learning rate is basically the size of the
'steps' the algorithm takes when minimizing the error function in an iterative process. We click 'Train'
and see what happens.
The error graph dropped to a horizontal asymptote after only 20 iterations. This means that the neural
network quickly learned the training set (85 percent of the full data set).
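Expressed in code, the same training run could look roughly like this; the parameter values are the ones from the text, and the training-set file name is again the placeholder used earlier:

```java
import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.nnet.learning.MomentumBackpropagation;

public class TrainNetwork {
    public static void main(String[] args) {
        // Load the training set and the network created earlier.
        DataSet trainingSet =
                DataSet.createFromFile("transfusion-normalized.csv", 4, 1, ",");
        NeuralNetwork network = NeuralNetwork.createFromFile("BTSC_4_4_1.nnet");

        // Same parameters as in the Studio dialog:
        // maximum error 0.01, learning rate 0.2, momentum 0.2.
        MomentumBackpropagation rule = new MomentumBackpropagation();
        rule.setMaxError(0.01);
        rule.setLearningRate(0.2);
        rule.setMomentum(0.2);
        network.setLearningRule(rule);

        network.learn(trainingSet); // blocks until the max error is reached
        network.save("BTSC_4_4_1.nnet");
    }
}
```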
6.1 Step: Testing the neural network
After the network is trained, we click 'Test' in order to see the total error and all the individual
errors. The results show that the total error is 0.083382.
The final part of testing this network is testing it with several input values. To do that, we select 5
random instances from our data set. The output the neural network produced for each input is shown in the
last column of the table below.
| Observation | Months since last donation | Total number of donations | Total blood donated (c.c.) | Months since first donation | Donated blood in March 2007 (1 = yes, 0 = no) | Result (obtained output) |
|---|---|---|---|---|---|---|
| 1. | 0.03 | 1 | 1 | 1 | 1 | 0.363940 |
| 2. | 0.18 | 0.06 | 0.06 | 0.20 | 0 | 0.378470 |
| 3. | 0.05 | 0 | 0 | 0.2 | 0 | 0.379896 |
| 4. | 0.03 | 0 | 0 | 0 | 0 | 0.379988 |
| 5. | 0.03 | 0 | 0 | 0 | 1 | 0.379988 |
The network has given accurate results in 2 of the first 3 cases, while for the last two cases it
concluded that the person had not donated blood (correct for case 4, incorrect for case 5, giving 3 of 5
overall). These two cases have the same inputs but different outputs; because many more people with the
same characteristics did not give blood than did, the network opted for the majority answer.
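For completeness, querying the trained network with a single normalized observation could look like the sketch below; it assumes the network was saved under the file name used earlier, and uses row 1 of the table above as input:

```java
import org.neuroph.core.NeuralNetwork;

public class TestNetwork {
    public static void main(String[] args) {
        // Load the network saved earlier.
        NeuralNetwork network = NeuralNetwork.createFromFile("BTSC_4_4_1.nnet");

        // Row 1 of the table above: R=0.03, F=1, M=1, T=1.
        network.setInput(0.03, 1.0, 1.0, 1.0);
        network.calculate();
        System.out.println("Output: " + network.getOutput()[0]); // ~0.36 here
    }
}
```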
Training attempt 2
4.2 Step: Creating a neural network
Following these rules, we now decide on a neural network that contains 3 hidden neurons in one hidden
layer. Again, we type in the standard number of inputs and outputs, check 'Use Bias Neurons', choose the
sigmoid transfer function, and select 'Backpropagation with Momentum' as the learning rule.
Graphical representation of the neural network
5.2 Step: Training the network
The neural network that will be used as our second solution to the problem has been created. Like the
previous one, we will train it with a training set created from 75% of the sample. We select 'BTSC75',
click 'Train', and a new window appears, asking us to fill in the parameters. This time the maximum error
is again 0.01, and we do not limit the maximum number of iterations. As for the learning parameters, the
learning rate will be 0.1 and the momentum 0.3. After we click 'Train', the iteration process starts. The
total net error drops very fast, and training stops after 69 iterations with an error of 0.002456.
6.2 Step: Testing the network
The total mean square error measures the average of the squares of the 'errors', where an error is the
amount by which the value predicted by the estimator differs from the quantity to be estimated. A mean
square error of zero, meaning that the estimator predicts the observations with perfect accuracy, is the
ideal, but it is practically never attainable. The unbiased model with the smallest mean square error is
generally interpreted as the one that best explains the variability in the observations. The test showed
that the total mean square error is 0.08331974. The goal of experimental design is to construct
experiments in such a way that, when the observations are analyzed, the mean square error is close to zero
relative to the magnitude of at least one of the estimated treatment effects.
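In symbols, for desired outputs di and network outputs yi over n test rows, MSE = (1/n) * sum of (di - yi)^2. A minimal sketch of that computation with the Neuroph classes used above:

```java
import org.neuroph.core.NeuralNetwork;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class MeanSquareError {

    /** Mean square error of a network's single output over a data set. */
    public static double compute(NeuralNetwork<?> network, DataSet testSet) {
        double sumSquaredError = 0;
        for (DataSetRow row : testSet.getRows()) {
            network.setInput(row.getInput());
            network.calculate();
            double error = row.getDesiredOutput()[0] - network.getOutput()[0];
            sumSquaredError += error * error;
        }
        return sumSquaredError / testSet.getRows().size();
    }
}
```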
The only thing left is to test the network with several input values. To do that, we select 5 random
instances from our data set and feed them to the network. The output the neural network produced for each
input is shown in the last column of the table below.
| Observation | Months since last donation | Total number of donations | Total blood donated (c.c.) | Months since first donation | Donated blood in March 2007 (1 = yes, 0 = no) | Result (obtained output) |
|---|---|---|---|---|---|---|
| 1. | 0.05 | 0.20 | 0.20 | 0.27 | 0 | 0.493543 |
| 2. | 0.03 | 0.02 | 0.02 | 0.02 | 0 | 0.243438 |
| 3. | 0.12 | 0 | 0 | 0.07 | 0 | 0.141070 |
| 4. | 0.31 | 0 | 0 | 0.22 | 0 | 0.054891 |
| 5. | 0.03 | 0.04 | 0.04 | 0.09 | 0 | 0.277875 |
The network guessed correctly in all five instances. After this test, we can conclude that this solution
does not need to be rejected.
Training attempt 4
4.4 Step: Creating a new neural network
The next neural network will have the same number of input and output neurons, but a different number of
hidden layers and neurons in them: we will use 4 neurons in the first hidden layer and 3 in the second.
The network is named BloodTransfusionSet2.
The neural network looks like this:
5.4 Step: Training the network
We will train the network with a learning rate of 0.2, a momentum of 0.2, and a maximum error of 0.01.
6.4 Step: Testing the network
We clicked 'Test' after training, to see whether more neurons contribute to better training. As we can
see, the error is not reduced; it is higher than when we tested with 75 and 85 percent of the data set.
The final part of testing this network is testing it with several input values. To do that, we select 3
random instances from our data set. The output the neural network produced for each input is shown in the
last column of the table below.
| Observation | Months since last donation | Total number of donations | Total blood donated (c.c.) | Months since first donation | Donated blood in March 2007 (1 = yes, 0 = no) | Result (obtained output) |
|---|---|---|---|---|---|---|
| 1. | 0.22 | 0.04 | 0.04 | 0.88 | 0 | 0.10968 |
| 2. | 0.28 | 0 | 0 | 0.20 | 0 | 0.10968 |
| 3. | 0.05 | 0.12 | 0.12 | 0.63 | 0 | 0.196674 |
The network guessed correctly in all 3 instances. After this test, we can conclude that this solution does
not need to be rejected.
Training attempt 15
5.15 Step: Training the network
In this attempt, we will use the network with 3 hidden neurons (BloodTransfusionNet15). For the learning
parameters, we set a maximum error of 0.01, a learning rate of 0.2 and a momentum of 0.2. Then we click
'Train' and wait.
The training stopped after 25 iterations, with a total net error of 0.00123.
6.15 Step: Testing the neural network
As explained in attempt 2, the total mean square error measures the average of the squares of the
individual errors, and the model with the smallest mean square error is generally interpreted as the one
that best explains the variability in the observations. The test showed that the total mean square error
is 0.2029742515.
Now we need to examine the individual errors for every single instance and check whether there are any
extreme values. With a large data set, individual testing takes a lot of time, so instead of testing all
748 observations we randomly choose 5 observations to subject to individual testing.
In the introduction we mentioned that the output has two values, 0 or 1: if a person gave blood in March
2007, the output has a value of 1, and if not, a value of 0.
The values of the inputs, outputs and individual errors for the 5 randomly selected observations are in
the table below:
| Observation | Months since last donation | Total number of donations | Total blood donated (c.c.) | Months since first donation | Donated blood in March 2007 (1 for yes, 0 for no) | Result |
|---|---|---|---|---|---|---|
| 1. | 0.31 | 0.02 | 0.02 | 0.38 | 0 | 0.0872 |
| 2. | 0.31 | 0 | 0 | 0.22 | 0 | 0 |
| 3. | 0.31 | 0.12 | 0.12 | 0.9 | 0.0872 | 0.0872 |
| 4. | 0.03 | 0.12 | 0.12 | 0.78 | 0.1398 | 0.1398 |
| 5. | 0.28 | 0 | 0 | 0.2 | 0.0872 | 0.0872 |
The network guessed all of them right. We can conclude that this network has a good ability to generalize,
and the training of this network has been validated.
Below is a table that summarizes the whole experiment. The two best solutions to the problem, attempts 2
and 15, stand out with the lowest total errors and 5/5 scores on the random-input test.
| Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Maximum error | Learning rate | Momentum | Total mean square error | Number of iterations | 5 random inputs test (correct guesses) | Network trained |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 1 | 85% of full data set | 0.01 | 0.2 | 0.2 | 0.00395 | 20 | 3/5 | yes |
| 2 | 3 | 1 | 75% of full data set | 0.01 | 0.1 | 0.3 | 0.00245 | 69 | 5/5 | yes |
| 3 | 4, 3 | 2 | full | 0.01 | 0.2 | 0.7 | 0.08195 | 1617 | - | no |
| 4 | 4, 3 | 2 | full | 0.01 | 0.2 | 0.2 | 0.00563 | 105 | 3/3 | yes |
| 5 | 4, 4 | 2 | full | 0.01 | 0.2 | 0.2 | 0.02585 | 21617 | - | no |
| 6 | 4, 4 | 2 | full | 0.01 | 0.1 | 0.1 | 0.371029 | 8445 | - | no |
| 7 | 6, 7 | 2 | full | 0.01 | 0.2 | 0.2 | 0.06261 | 1158 | - | no |
| 8 | 6, 7 | 2 | full | 0.01 | 0.2 | 0.7 | 0.06106 | 4011 | - | no |
| 9 | 10, 5 | 2 | full | 0.01 | 0.2 | 0.2 | 0.07119 | 2516 | - | no |
| 10 | 3 | 1 | 15% of full data set | 0.01 | 0.2 | 0.7 | 0.03336 | 20955 | - | no |
| 11 | 3 | 1 | 25% of full data set | 0.01 | 0.2 | 0.7 | 0.04852 | 23795 | - | no |
| 12 | 2 | 1 | 15% of full data set | 0.01 | 0.2 | 0.7 | 0.02241 | 20712 | - | no |
| 13 | 10 | 1 | 15% of full data set | 0.01 | 0.2 | 0.7 | 0.03241 | 3808 | - | no |
| 14 | 10 | 1 | 15% of full data set | 0.01 | 0.2 | 0.2 | 0.00791 | 6030 | - | no |
| 15 | 3 | 1 | full | 0.01 | 0.2 | 0.2 | 0.00123 | 25 | 5/5 | yes |
Advanced Training Techniques
We want to check the network's performance once the training is complete. A learning neural network is
expected to extract rules from a finite set of examples. It is often the case that the neural network
memorizes the training data well, but fails to generate correct outputs for some of the new test data.
Therefore, it is desirable to apply some form of regularization.
One form of regularization is to split the training set into a new training set and a validation set.
After each pass through the new training set, the neural network is evaluated on the validation set, and
the network with the best performance on the validation set is then used for actual testing. The new
training set would consist of, say, 80-90% of the original training set, and the remaining 10-20% would
form the validation set. The validation error rate is then computed periodically during training, and
training stops when the validation error rate starts to go up. Validation error is not a good estimate of
the generalization error if the initial set consists of a relatively small number of instances; however,
that is not the case here.
One way to get an appropriate estimate of the generalization error is to run the neural network on a test
set of data that is not used at all during the training process. The generalization error is usually
defined as the expected value of the square of the difference between the learned function and the exact
target.
In the following examples we will examine the generalization error: from example to example we will
increase the number of instances in the training set used for training, and decrease the number of
instances in the set used for testing.
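A random split of a Neuroph DataSet could be sketched as follows; the 4/1 input/output sizes match our data set, and this manual shuffle-and-split avoids relying on version-specific built-in sampling methods:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.neuroph.core.data.DataSet;
import org.neuroph.core.data.DataSetRow;

public class SplitDataSet {

    /** Randomly splits a data set; e.g. ratio = 0.9 gives a 90%/10% split. */
    public static DataSet[] split(DataSet dataSet, double ratio) {
        List<DataSetRow> rows = new ArrayList<>(dataSet.getRows());
        Collections.shuffle(rows); // random selection of instances

        int cut = (int) (rows.size() * ratio);
        DataSet trainingSet = new DataSet(4, 1); // e.g. BTSC90
        DataSet testSet = new DataSet(4, 1);     // e.g. BTSC10
        for (int i = 0; i < rows.size(); i++) {
            (i < cut ? trainingSet : testSet).addRow(rows.get(i));
        }
        return new DataSet[] { trainingSet, testSet };
    }
}
```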
Training attempt 16
3.16 Step: Creating a training set
We will randomly choose 90% of the instances of the training set for training and the remaining 10% for
testing. The first group will be called BTSC90, and the second BTSC10.
5.16 Step: Training the network
Unlike the previous training attempts, there is no need to create a new neural network now. The advanced
training techniques consist of examining the performance of existing architectures using new training and
test sets of data. We found satisfactory results using the BloodTransfusionNet15 architecture, so for the
rest of this article we will use not only this architecture, but also the training parameters that
previously brought us the desired results with it.
Now we open the neural network BloodTransfusionNet15, select the training set BTSC90, and in the new
network window press the 'Train' button. The parameters we now need to set will be the same as in the
previous training attempt: the maximum error will be 0.01, the learning rate 0.2, and the momentum 0.2. We
will not limit the maximum number of iterations, and we will check 'Display error graph', as we want to
see how the error changes throughout the iteration sequence. Then we press the 'Train' button again and
see what happens.
The error function does not fluctuate much; it moves almost horizontally, and training stops at iteration
11239 without finding an optimal solution.
We train on 70, 80, and 90 percent of the data set and test on the remaining 30, 20, and 10 percent,
randomly selected. We obtain the following table of results:
| Training attempt | Number of hidden neurons | Number of hidden layers | Training set | Test set | Maximum error | Learning rate | Momentum | Iterations | Total net error (during training) | Total mean square error (during testing) | Network trained |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 16 | 3 | 1 | 90% | 10% | 0.01 | 0.2 | 0.2 | 11239 | 0.01185 | 0.06259 | no |
| 17 | 3 | 1 | 80% | 20% | 0.01 | 0.2 | 0.2 | 3674 | 0.04250 | 0.17187 | no |
| 18 | 3 | 1 | 70% | 30% | 0.01 | 0.2 | 0.2 | 2065 | 0.00932 | 0.14516 | no |
After the 18th training attempt, we concluded that there are some cases that make a big impact on the
total mean square error, with individual errors as large as 0.8 and 0.7. By a big error we mean that the
network classified an instance completely wrong (for example, it output 1 when it should have been 0), and
such errors make a huge impact on the total mean square error.
Because all of these networks failed to reach an error of less than 0.01, we can say that this network
failed to generalize the problem.
Conclusion
The solutions tested in this experiment have shown that the choice of the number of hidden neurons is
crucial to the effectiveness of a neural network. The experiment also showed that the success of a neural
network is very sensitive to the parameters chosen in the training process: the learning rate must not be
too high, and the maximum error must not be too low. The results have also shown that the total mean
square error alone does not directly reflect the success of a network.
DOWNLOAD
Data set used in this tutorial
The prepared data set
Neuroph projects
The samples used for advanced techniques
See also:
Multi Layer Perceptron Tutorial