Text Recognition

TEXT RECOGNITION WITH NEURAL NETWORKS HOWTO

Neural networks are one technique that can be used for text recognition. A text recognition application uses an artificial neural network, building on image recognition, to recognize text characters in an image (a scanned document, a photograph, etc.) and converts them into an editable document (such as an MS Word .doc file or a .txt file for Notepad or WordPad). Because it is based on a neural network, it can be trained to recognize additional characters.

This tutorial will explain the following:

1. How to train neural networks for text recognition with Neuroph Studio
2. How to use neural networks trained for text recognition in your applications

1. Training Neural Network for Text Recognition with Neuroph Studio

Neuroph Studio provides an environment for creating and training neural networks, which can be saved as ready-to-use Java components. It also provides a specialised text recognition tool for training neural networks for text recognition. Creating and training a neural network for text recognition consists of the following steps:

1. Draw the letters which should be recognized and create a training set
2. Create neural network
3. Train neural network
4. Test neural network
5. Save & deploy neural network

Create Neuroph Project

Click File > New Project.

Select Neuroph Project, click Next.

Enter project name and location, click Finish.

The Neuroph project is now created. Next, create the text recognition network.

Click File > New File.

Select Text recognition file type, click Next.

Step 1. Create Training Set
In this step we need to create an image of the characters that should be recognized. The image of the characters to learn will appear in the lower area. Choose a font style, for example Arial, leave the default font size, and click Next.

For the training set label, enter 'Arial'. Leave the default Image Sampling Resolution and click the Next button.
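
To make the sampling step more concrete, here is a minimal sketch of how a character image could be reduced to a fixed-size input vector for a neural network. It assumes that sampling means reading one pixel per cell of a grid and thresholding its brightness. The class CharacterSampler and its sample method are hypothetical names used for illustration only; they are not part of Neuroph, and the way Neuroph Studio samples images internally may differ.

```java
import java.awt.image.BufferedImage;

class CharacterSampler {
    /**
     * Samples the image on a cols x rows grid (hypothetical helper, not a Neuroph API).
     * Each value is 1.0 for a dark pixel and 0.0 for a light pixel.
     */
    static double[] sample(BufferedImage image, int cols, int rows) {
        double[] input = new double[cols * rows];
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < cols; col++) {
                // Read the pixel at the centre of the current grid cell.
                int x = (int) ((col + 0.5) * image.getWidth() / cols);
                int y = (int) ((row + 0.5) * image.getHeight() / rows);
                int rgb = image.getRGB(x, y);
                int red = (rgb >> 16) & 0xFF;
                int green = (rgb >> 8) & 0xFF;
                int blue = rgb & 0xFF;
                // Average the colour channels into a brightness between 0 and 1.
                double brightness = (red + green + blue) / (3.0 * 255.0);
                input[row * cols + col] = brightness < 0.5 ? 1.0 : 0.0;
            }
        }
        return input;
    }
}
```

In this sketch the length of the input vector is cols * rows, so a 20 x 20 grid would give 400 input values. A higher sampling resolution therefore means more input neurons.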

Step 2. Create neural network

The next thing to do is to create the neural network. To do so, enter the following settings:

Network label - The label for the neural network, which is useful when you create several neural networks for the same problem and want to compare them.
Transfer function - This setting determines which transfer function will be used by the neurons. In most cases you can leave the default setting 'Sigmoid', but sometimes 'Tanh' can give better results.
Hidden Layers Neuron Counts - This is the most important setting. It determines the number of hidden layers in the network and the number of neurons in each hidden layer. Hidden layers are the layers between the input and output layers. The trick is to have the smallest possible number of layers and neurons that can successfully learn the training set. The fewer neurons there are, the faster the network learns and the better it tends to generalize. A suitable number of hidden neurons also depends on the number of input and output neurons, and the best value can be found by experimenting. To start, try one hidden layer with 12 neurons.
Click the Finish button to create the neural network. After you click the button, a new window with the created neural network will open.
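
The settings above can be illustrated with a small plain-Java sketch. It shows the two transfer functions, how the layer sizes determine the number of weights that have to be learned, and how one fully connected layer computes its output. The class PerceptronSketch and all of its members are hypothetical names for illustration; this is not the Neuroph implementation, and the layer sizes used in the note below are assumed values, not Neuroph defaults.

```java
import java.util.function.DoubleUnaryOperator;

class PerceptronSketch {
    // The two transfer functions offered in the dialog.
    static final DoubleUnaryOperator SIGMOID = x -> 1.0 / (1.0 + Math.exp(-x)); // output in (0, 1)
    static final DoubleUnaryOperator TANH = Math::tanh;                          // output in (-1, 1)

    /** Number of adjustable weights, counting one bias per neuron, for the given layer sizes. */
    static int weightCount(int... layerSizes) {
        int count = 0;
        for (int i = 1; i < layerSizes.length; i++) {
            count += (layerSizes[i - 1] + 1) * layerSizes[i];
        }
        return count;
    }

    /**
     * Output of one fully connected layer.
     * weights[j] holds the input weights of neuron j followed by its bias.
     */
    static double[] layer(double[] input, double[][] weights, DoubleUnaryOperator transfer) {
        double[] output = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = weights[j][input.length]; // start from the bias
            for (int i = 0; i < input.length; i++) {
                sum += weights[j][i] * input[i];
            }
            output[j] = transfer.applyAsDouble(sum);
        }
        return output;
    }
}
```

With assumed sizes of 400 inputs and 26 outputs, weightCount(400, 12, 26) is 5150, while doubling the hidden layer to 24 neurons gives weightCount(400, 24, 26) = 10274. This is one reason why a smaller hidden layer trains faster.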

Step 3. Train network

To train the network select the training set from the list and click the Train button.

This will open the dialog for setting the learning parameters. Enter 0.2 as the Learning Rate value and click the Train button.

This will start the training and open the network learning graph and iteration counter, so you can observe the learning process. If the learning gets stuck (the total network error does not go down), you can try a different number of neurons or layers, or different learning parameters. For the learning rate and momentum, use values in the range [0, 1]; for the maximum error, a small value below 0.1 is recommended.
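
To show how the learning rate, momentum and maximum error interact, here is a minimal sketch of a training loop for a single sigmoid neuron. It is a simplified illustration, assuming online gradient descent with a momentum term and a mean squared error stop criterion; it is not the training code used by Neuroph Studio, whose learning rule and error measure may differ. The class MomentumTrainingSketch, its methods, and the momentum value 0.7 and iteration limit used in main are all illustrative assumptions.

```java
class MomentumTrainingSketch {
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /** Output of a single sigmoid neuron; the last weight is the bias. */
    static double output(double[] weights, double[] input) {
        double sum = weights[input.length];
        for (int i = 0; i < input.length; i++) {
            sum += weights[i] * input[i];
        }
        return sigmoid(sum);
    }

    /** Mean squared error over the whole training set. */
    static double totalError(double[] weights, double[][] inputs, double[] targets) {
        double sum = 0.0;
        for (int p = 0; p < inputs.length; p++) {
            double error = targets[p] - output(weights, inputs[p]);
            sum += error * error;
        }
        return sum / inputs.length;
    }

    /** Trains the neuron in place and returns the number of iterations used. */
    static int train(double[] weights, double[][] inputs, double[] targets,
                     double learningRate, double momentum, double maxError, int maxIterations) {
        double[] previousChange = new double[weights.length];
        int iteration = 0;
        // Stop when the error is small enough or the iteration limit is reached.
        while (iteration < maxIterations && totalError(weights, inputs, targets) > maxError) {
            for (int p = 0; p < inputs.length; p++) {
                double out = output(weights, inputs[p]);
                // Error signal: (target - output) times the derivative of the sigmoid.
                double delta = (targets[p] - out) * out * (1.0 - out);
                for (int i = 0; i < weights.length; i++) {
                    double x = i < inputs[p].length ? inputs[p][i] : 1.0; // bias input is 1
                    // The momentum term reuses part of the previous weight change.
                    double change = learningRate * delta * x + momentum * previousChange[i];
                    weights[i] += change;
                    previousChange[i] = change;
                }
            }
            iteration++;
        }
        return iteration;
    }

    public static void main(String[] args) {
        double[][] inputs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        double[] targets = {0, 1, 1, 1}; // logical OR as a toy training set
        double[] weights = new double[3];
        int used = train(weights, inputs, targets, 0.2, 0.7, 0.01, 10000);
        System.out.println("iterations: " + used + ", error: " + totalError(weights, inputs, targets));
    }
}
```

A larger learning rate or momentum makes each weight change bigger, which can speed up learning but can also make the error oscillate. A smaller maximum error makes the loop run longer before it stops.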

Step 4. Test Network

After you have trained the network, you can try it out in the test panel. Click the 'Load Text Image' button to set the input image for the network. Select an image file and click the 'Select image with text' button.

The image of the characters will appear in the Loaded Image area. Click the 'Recognize>>' button.

The recognized text will appear in the 'Recognized Text' area.

To save the recognized text, click the Save button above the recognized text area.

Enter the file name and file type, for example .txt, and click Save.

The file will be saved as a .txt file.
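
The recognition step can be pictured as turning each network output into a character. The sketch below assumes that the network has one output neuron per character and that the recognized character is the one whose neuron has the highest output. This is a common arrangement for classification networks, but it is an assumption here, not a description of Neuroph's internals. The class OutputDecoder and its methods are hypothetical names.

```java
class OutputDecoder {
    /** Returns the label whose output neuron has the highest activation. */
    static char decode(double[] output, char[] labels) {
        int best = 0;
        for (int i = 1; i < output.length; i++) {
            if (output[i] > output[best]) {
                best = i;
            }
        }
        return labels[best];
    }

    /** Decodes one network output per character image and joins the results into a string. */
    static String decodeAll(double[][] outputs, char[] labels) {
        StringBuilder text = new StringBuilder();
        for (double[] output : outputs) {
            text.append(decode(output, labels));
        }
        return text.toString();
    }
}
```

For example, with the labels A, B and C, an output of {0.1, 0.9, 0.3} would be decoded as 'B', because the second output neuron has the highest value.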

Step 5. Save neural network

To save the neural network as a Java component, click [Main menu > File > Save] and use the .nnet extension. The network will be saved as a serialized MultiLayerPerceptron object.
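
As background on what "serialized" means here, the following sketch writes a Java object to a file and reads it back using standard Java serialization. The TrainedWeights class is a hypothetical stand-in for a trained network and is not a Neuroph class. Whether a .nnet file uses exactly this mechanism is an assumption, so treat the sketch as an illustration of the idea rather than of Neuroph's file format.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializedNetworkSketch {
    /** Hypothetical stand-in for a trained network; not a Neuroph class. */
    static class TrainedWeights implements Serializable {
        private static final long serialVersionUID = 1L;
        final double[][] weights;

        TrainedWeights(double[][] weights) {
            this.weights = weights;
        }
    }

    /** Writes the object to the given file using Java serialization. */
    static void save(TrainedWeights network, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(network);
        }
    }

    /** Reads the object back from the given file. */
    static TrainedWeights load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (TrainedWeights) in.readObject();
        }
    }
}
```

Because the whole object is stored, including its learned weights, an application that loads the file gets back a network that is ready to use without retraining.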

2. Using Neuroph Text Recognition in Your Applications

TODO