Neural networks are one technique that can be used for text recognition. A text recognition application uses an artificial neural network to recognize text characters in an image (a scanned document, a photograph, etc.) by means of image recognition, and turns them into an editable document (such as an MS Word .doc file or a plain .txt file for Notepad or WordPad). Because it is based on a neural network, it can be trained to recognize additional characters.
This tutorial will explain the following:
1. How to train neural networks for text recognition with Neuroph Studio
2. How to use neural networks trained for text recognition in your applications
Neuroph Studio provides an environment for creating and training neural networks, which can be saved as ready-to-use Java components. It also provides a specialised text recognition tool for training neural networks for text recognition. Creating and training a neural network for text recognition consists of the following steps:
1. Draw the letters that should be recognized and create the training set
2. Create neural network
3. Train neural network
4. Test neural network
5. Save & deploy neural network
Create Neuroph Project
Click File > New Project
The Neuroph project is now created. Next, create the text recognition network.
Click File > New File.
Step 1. Create Training Set
In this step we create an image of the characters that should be recognized. The image of the characters to learn will appear in the lower area. Choose a font style, for example Arial, leave the default font size, and click Next.
For the training set label, enter 'Arial'. Leave the default Image Sampling Resolution and click the Next button.
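To make the sampling resolution more concrete: the character image is reduced to a fixed grid, and every cell of that grid becomes one input value for the network. The following is a minimal sketch of that idea using only the standard `java.awt.image.BufferedImage` class. It is not Neuroph's own sampling code. The class `CharacterSampler`, the nearest-neighbour sampling, and the 20x20 resolution are assumptions made for illustration.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of Neuroph: turns a character image into a network input vector.
final class CharacterSampler {

    // Samples the image on a width x height grid (nearest neighbour) and returns one
    // brightness value per grid cell, row by row: 0.0 is black, 1.0 is white.
    static double[] sample(BufferedImage source, int width, int height) {
        double[] input = new double[width * height];
        for (int y = 0; y < height; y++) {
            int srcY = y * source.getHeight() / height;
            for (int x = 0; x < width; x++) {
                int srcX = x * source.getWidth() / width;
                int rgb = source.getRGB(srcX, srcY);
                int red = (rgb >> 16) & 0xFF;
                int green = (rgb >> 8) & 0xFF;
                int blue = rgb & 0xFF;
                input[y * width + x] = (red + green + blue) / (3.0 * 255.0);
            }
        }
        return input;
    }

    public static void main(String[] args) {
        // A 40x40 test image: black background with a white right half.
        BufferedImage image = new BufferedImage(40, 40, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 40; y++) {
            for (int x = 20; x < 40; x++) {
                image.setRGB(x, y, 0xFFFFFF);
            }
        }
        double[] input = sample(image, 20, 20); // assumed resolution: 20 x 20 = 400 inputs
        System.out.println("Input neurons: " + input.length);
    }
}
```

A higher sampling resolution keeps more detail of each character but also increases the number of input neurons, which usually makes training slower.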
Step 2. Create neural network
Next, create the neural network. To do so, enter the following settings:
Network label - The label for the neural network, which is useful when you create several neural networks for the same problem and want to compare them.
Transfer function - This setting determines which transfer function the neurons will use. In most cases you can leave the default setting, 'Sigmoid', but sometimes 'Tanh' gives better results.
Hidden Layers Neuron Counts - This is the most important setting. It determines the number of hidden layers in the network and the number of neurons in each hidden layer. Hidden layers are the layers between the input and output layers. The trick is to use the smallest number of layers and neurons that can still successfully learn the training set. The fewer neurons there are, the faster the network learns and the better it tends to generalize. A suitable number of hidden neurons also depends on the number of input and output neurons, and the best value is found by experimenting. To start, try one hidden layer with 12 neurons.
Click the Finish button to create the neural network. A new window showing the created neural network will then open.
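The effect of these settings can be illustrated with a few lines of plain Java. The sketch below is not Neuroph code. It shows the two transfer functions, counts the trainable weights of a fully connected network for a given list of layer sizes, and computes the output of a single layer. The class name `NetworkShapeSketch`, the 400 inputs (a 20x20 sampled image) and the 26 outputs (one per letter) are assumptions chosen for illustration, and counting one bias weight per neuron is a simplification.

```java
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch only; names and layer sizes are assumptions, not Neuroph API.
final class NetworkShapeSketch {

    // Sigmoid squashes any input into (0, 1); tanh squashes it into (-1, 1).
    static final DoubleUnaryOperator SIGMOID = x -> 1.0 / (1.0 + Math.exp(-x));
    static final DoubleUnaryOperator TANH = Math::tanh;

    // Number of trainable weights in a fully connected network,
    // counting one bias weight per neuron in every non-input layer.
    static int weightCount(int... layerSizes) {
        int total = 0;
        for (int i = 1; i < layerSizes.length; i++) {
            total += (layerSizes[i - 1] + 1) * layerSizes[i];
        }
        return total;
    }

    // Output of one layer. weights[j] holds the input weights of neuron j,
    // followed by its bias weight in the last position.
    static double[] layerOutput(double[] input, double[][] weights, DoubleUnaryOperator transfer) {
        double[] output = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double sum = weights[j][input.length]; // bias
            for (int i = 0; i < input.length; i++) {
                sum += weights[j][i] * input[i];
            }
            output[j] = transfer.applyAsDouble(sum);
        }
        return output;
    }

    public static void main(String[] args) {
        // Assumed sizes: 400 inputs, one hidden layer, 26 outputs.
        System.out.println("12 hidden neurons: " + weightCount(400, 12, 26) + " weights");
        System.out.println("40 hidden neurons: " + weightCount(400, 40, 26) + " weights");
    }
}
```

With these assumed sizes, 12 hidden neurons give 5150 weights and 40 hidden neurons give 17106, which is why a smaller hidden layer trains faster.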
Step 3. Train network
To train the network, select the training set from the list and click the Train button.
This starts the training and opens the network learning graph and the iteration counter, so you can observe the learning process. If learning gets stuck (the total network error does not go down), try a different number of neurons or layers, or different learning parameters. For the learning rate and momentum, use values in the range [0, 1]; for the error, a small value below 0.1 is recommended.
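To show how the three learning parameters interact, here is a minimal sketch of gradient descent with momentum on a single sigmoid neuron. It is a simplified stand-in for what a backpropagation trainer does on a full network, not Neuroph's implementation. The class `MomentumTrainingSketch`, the logical-OR training data and the parameter values are assumptions for illustration.

```java
// Illustrative sketch: one sigmoid neuron trained with a momentum update rule.
final class MomentumTrainingSketch {

    private final double[] weights = new double[3];       // two inputs plus a bias weight
    private final double[] previousDelta = new double[3]; // last weight change, used by momentum

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    double output(double[] input) {
        return sigmoid(weights[0] * input[0] + weights[1] * input[1] + weights[2]);
    }

    // Mean squared error over the whole training set.
    double totalError(double[][] inputs, double[] targets) {
        double sum = 0.0;
        for (int p = 0; p < inputs.length; p++) {
            double error = targets[p] - output(inputs[p]);
            sum += error * error;
        }
        return sum / inputs.length;
    }

    // Trains until the total error drops to maxError or maxIterations is reached.
    // Returns the number of iterations (passes over the training set) performed.
    int train(double[][] inputs, double[] targets, double learningRate,
              double momentum, double maxError, int maxIterations) {
        int iteration = 0;
        while (iteration < maxIterations && totalError(inputs, targets) > maxError) {
            for (int p = 0; p < inputs.length; p++) {
                double out = output(inputs[p]);
                double gradient = (targets[p] - out) * out * (1.0 - out);
                for (int i = 0; i < weights.length; i++) {
                    double in = i < 2 ? inputs[p][i] : 1.0; // the bias input is always 1
                    // New change = learning-rate step plus a fraction of the previous change.
                    double delta = learningRate * gradient * in + momentum * previousDelta[i];
                    weights[i] += delta;
                    previousDelta[i] = delta;
                }
            }
            iteration++;
        }
        return iteration;
    }

    public static void main(String[] args) {
        double[][] inputs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        double[] targets = {0, 1, 1, 1}; // logical OR
        MomentumTrainingSketch neuron = new MomentumTrainingSketch();
        int iterations = neuron.train(inputs, targets, 0.2, 0.7, 0.01, 10000);
        System.out.println("Iterations: " + iterations
                + ", total error: " + neuron.totalError(inputs, targets));
    }
}
```

A lower error threshold means more iterations before training stops. A larger learning rate or momentum speeds learning up, but can make the error oscillate instead of settling.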
Step 4. Test Network
After you have trained the network, you can try it out in the test panel. Click the 'Load Text Image' button to set the input image for the network. Select the image file and click the 'Select image with text' button.
The image of the characters will appear in the Loaded Image area. Click the 'Recognize>>' button.
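Conceptually, recognition ends with turning the network's output values back into characters. The sketch below assumes one output neuron per trained character and picks the character whose neuron has the highest activation. This layout is an assumption for illustration, and the class `OutputDecoder` is hypothetical. Splitting the loaded image into individual characters is not shown.

```java
// Illustrative sketch: maps network outputs to characters, assuming one output neuron per character.
final class OutputDecoder {

    // Returns the label whose output neuron has the highest activation.
    static char decode(double[] outputs, char[] labels) {
        if (outputs.length == 0 || outputs.length != labels.length) {
            throw new IllegalArgumentException("Need exactly one output value per label");
        }
        int best = 0;
        for (int i = 1; i < outputs.length; i++) {
            if (outputs[i] > outputs[best]) {
                best = i;
            }
        }
        return labels[best];
    }

    // Decodes a sequence of output vectors, one per character image, into text.
    static String decodeAll(double[][] outputsPerCharacter, char[] labels) {
        StringBuilder text = new StringBuilder();
        for (double[] outputs : outputsPerCharacter) {
            text.append(decode(outputs, labels));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        char[] labels = {'A', 'B', 'C'};
        double[][] outputs = {
            {0.1, 0.8, 0.3}, // highest value at index 1
            {0.9, 0.2, 0.1}  // highest value at index 0
        };
        System.out.println(decodeAll(outputs, labels));
    }
}
```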
To save the recognized text, click the Save button above the recognized text area.
Enter the name and file type, for example .txt, and click Save.
The file will be saved as a .txt file.
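If you later do the same from your own code rather than from Neuroph Studio, writing recognized text to a .txt file needs only the standard `java.nio.file` API. The following is a minimal sketch. The class `RecognizedTextWriter`, the sample text and the choice of UTF-8 are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch: saves recognized text as a plain .txt file.
final class RecognizedTextWriter {

    // Writes the text to the target file as UTF-8, replacing any existing content.
    static Path save(String text, Path target) throws IOException {
        return Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("recognized", ".txt");
        save("HELLO", target); // "HELLO" stands in for the recognized text
        System.out.println("Saved to " + target);
    }
}
```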
Step 5. Save neural network
To save the neural network as a Java component, click [Main menu > File > Save] and use the .nnet extension. The network will be saved as a serialized instance of the MultiLayerPerceptron class.
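"Serialized" here refers to standard Java object serialization: the object's state is written to a file and can later be read back into an equivalent object. The sketch below demonstrates that round trip with the standard `java.io` streams and a small stand-in class. `TinyNetwork` and `SerializationSketch` are hypothetical and are not Neuroph classes. In a real application you would use the save and load methods of Neuroph's own network classes, so check the API documentation of your Neuroph version for the exact signatures.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative sketch of Java serialization; TinyNetwork is a hypothetical stand-in for a trained network.
final class SerializationSketch {

    static final class TinyNetwork implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        final double[] weights;

        TinyNetwork(String label, double[] weights) {
            this.label = label;
            this.weights = weights;
        }
    }

    // Writes the object to the given file.
    static void save(Serializable network, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(network);
        }
    }

    // Reads back an object previously written with save().
    static Object load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("arial", ".nnet");
        save(new TinyNetwork("Arial", new double[] {0.25, -0.5}), file);
        TinyNetwork restored = (TinyNetwork) load(file);
        System.out.println("Restored network: " + restored.label);
    }
}
```

Because deserialization needs the same classes on the classpath, an application that loads a .nnet file must include the Neuroph library it was saved with, or a compatible version.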
TODO