Fun with Science: December 2015

In this activity, the task is to apply a neural network to a classification problem. In a previous activity, pattern recognition was used to classify three different cereals using the r and g normalized chromaticity coordinate (ncc) values as features. You can view that activity here.
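
As a reminder of what the r and g features are, here is a minimal Python sketch of the normalized chromaticity computation for a single RGB pixel. The function name and the handling of a pure black pixel are assumptions made for illustration; the original activity computed these values in Matlab.

```python
def normalized_chromaticity(red, green, blue):
    """Return the (r, g) normalized chromaticity coordinates of one RGB pixel."""
    total = red + green + blue
    if total == 0:
        # Assumption: a pure black pixel has no defined chromaticity,
        # so we return the neutral point (1/3, 1/3) by convention.
        return (1.0 / 3.0, 1.0 / 3.0)
    # b = 1 - r - g is redundant, which is why only r and g are used as features.
    return (red / total, green / total)


# Hypothetical pixel value, for illustration only.
r, g = normalized_chromaticity(200, 100, 100)
```

Dividing by the sum of the channels removes most of the dependence on brightness, so two cereals with similar color but different illumination end up with similar (r, g) values.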

The figure below shows some of the samples used in our classification. The complete set is composed of 24 samples, 8 for each class: honey gold flakes, koko krunch and honey stars.

To classify objects easily using machine learning, the selected combination of features should discriminate well between the classes. Using only color as a feature makes for a weak classifier. In the previous activity, it was difficult to distinguish corn flakes from honey stars because their colors are similar. To strengthen the classification, we can add features that describe the shape of the cereals. The edges of the samples are obtained using Matlab. From the edge, or boundary, two more features are obtained.

Roundness is given by[1]

$$Roundness = \frac{4\pi A}{P^2}$$

where $A$ is the area enclosed by the edge and $P$ is the perimeter of the contour.
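
To get a feel for the values this formula produces, here is a small Python sketch that evaluates it for two ideal shapes. The shapes and the function name are chosen only for illustration and are not measurements from the cereal samples.

```python
import math


def roundness(area, perimeter):
    """Roundness = 4*pi*A / P^2; equals 1 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2


# Circle of radius 1: A = pi, P = 2*pi, so roundness is 1.
circle_roundness = roundness(math.pi, 2.0 * math.pi)

# Unit square: A = 1, P = 4, so roundness is pi/4 (about 0.785).
square_roundness = roundness(1.0, 4.0)
```

A circle has the largest possible area for a given perimeter, so any shape with a long, wiggly boundary, such as a star, scores well below 1.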

Solidity is defined as the ratio of the internal area to the external area[1].

$$Solidity = \frac{S_1}{S_2}$$

where $S_1$ and $S_2$ are the internal and external areas of the shape, respectively, as illustrated in the figure below[1].
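
The ratio can be sketched in Python for a shape given as a polygon. This sketch assumes that the internal area $S_1$ is the area of the shape itself and that the external area $S_2$ is the area of its convex hull, which is the usual definition of solidity in image-processing toolboxes; reference [1] may define the two areas differently. The helper names and the L-shaped test polygon are hypothetical.

```python
def polygon_area(points):
    """Area of a simple polygon from its ordered (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(points):
    """Convex hull by Andrew's monotone chain, vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def solidity(points):
    """Assumed definition: shape area (S1) divided by convex hull area (S2)."""
    return polygon_area(points) / polygon_area(convex_hull(points))


# Hypothetical shapes: a convex square and a concave L-shape.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
square_solidity = solidity(square)
l_shape_solidity = solidity(l_shape)
```

A convex shape fills its hull completely and has solidity 1, while any concavity, like the notches between the points of a star, lowers the value.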

With these additional features, we can easily differentiate corn flakes from honey stars: the stars have low roundness and solidity.

Now that we have five features to classify our cereals, we use an existing neural network package in Matlab. Here the network was trained using Levenberg-Marquardt backpropagation. We observe how the result of the fit changes with the number of hidden neurons.

The error histogram using 10 hidden neurons is shown. In the histogram, one can check whether there are any outliers[2]. In this case, the errors are close to each other, with most falling between -0.06 and 0.04.
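
For readers who want to see what goes into such a histogram, here is a minimal Python sketch. It assumes each error is the target minus the network output and groups the errors into equal-width bins. The function name, the default bin count and the sample numbers are made up for illustration; they are not the results from the cereal data.

```python
def error_histogram(targets, outputs, num_bins=20):
    """Return (errors, bin_edges, counts) for errors = target - output."""
    errors = [t - y for t, y in zip(targets, outputs)]
    low, high = min(errors), max(errors)
    # Fall back to a width of 1.0 if all errors are identical.
    width = (high - low) / num_bins or 1.0
    counts = [0] * num_bins
    for e in errors:
        # The largest error sits on the last edge, so clamp it into the last bin.
        index = min(int((e - low) / width), num_bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(num_bins + 1)]
    return errors, edges, counts


# Hypothetical targets and outputs: four good predictions and one outlier.
targets = [0, 0, 1, 1, 1]
outputs = [0.02, -0.03, 0.98, 1.01, 0.4]
errors, edges, counts = error_histogram(targets, outputs, num_bins=4)
```

Here the four small errors share the first bin and the single large error sits alone in the last bin, which is the kind of isolated bar one looks for when checking for outliers.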

Using 50 and 100 hidden neurons, the resulting error histograms are as follows.

In these cases, it is more apparent that the data has outliers. This may indicate that some of the data points are not valid; if they are valid, the guide suggests collecting additional data similar to the outlier points[2]. The error histogram thus indicates whether our data or selected features are suitable for neural network classification.

The figure above shows the regression plots. These show the relation between the network outputs and the targets for the training, validation and test sets[2]. For a good fit, the network outputs should be linearly related to the targets. The best linearity was observed using 100 hidden neurons.
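
The linearity in a regression plot is usually summarized by a fitted line and a correlation coefficient R. Here is a minimal Python sketch of that computation, assuming an ordinary least-squares line of output against target and the Pearson correlation. The function name and the sample numbers are hypothetical, not values read off the plots.

```python
import math


def regression_fit(targets, outputs):
    """Least-squares line output = slope * target + intercept, plus Pearson R."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_y = sum(outputs) / n
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_yy = sum((y - mean_y) ** 2 for y in outputs)
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(targets, outputs))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    r = s_ty / math.sqrt(s_tt * s_yy)
    return slope, intercept, r


# Hypothetical data: a perfect fit and a fit with one poor prediction.
targets = [0.0, 1.0, 2.0, 3.0]
perfect = regression_fit(targets, [0.0, 1.0, 2.0, 3.0])
noisy = regression_fit(targets, [0.1, 0.9, 2.6, 2.8])
```

A perfect network gives a slope of 1, an intercept of 0 and R = 1; the further R falls below 1, the weaker the linear relation between outputs and targets.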

Retraining on the data set doesn't really improve the result. In fact, the result of each training run seems to be very different from the others. But the errors seem to be relatively smaller with more hidden neurons. From our results so far, we can say that most of the data points have low error, with some possible outliers.

References

[1] Q. Wu, C. Zhou, and C. Wang, "Feature extraction and automatic recognition of plant leaf using artificial neural network," Advances in Artificial Intelligence, 3, 2006.

[2] N.A., "Fit Data with a Neural Network," MathWorks, Inc., 2016. <http://www.mathworks.com/help/nnet/gs/fit-data-with-a-neural-network.html>

Apply NN in any classification or fitting task. You need not program from scratch; use existing packages. Play around with free parameters such as the number of hidden neurons, number of hidden layers, activation functions and learning algorithms.

Saturday, December 19, 2015

Neural Network