In Computer Vision

In computer vision, food image recognition is considered one of the most significant and promising applications of visual object recognition. A Convolutional Neural Network (CNN) is a Deep Neural Network (DNN) built from repeated convolutional and pooling layers, which automatically learns features from the given input image.

A linear classifier works as a template matcher: each row of the learned weight matrix corresponds to a template for a particular class. With the learning rate set to 0.001 (1e-03), the validation accuracy obtained was 0.21 with a regularization strength of 1e+00.

A dropout of 0.75 was applied to each layer. The last layer had a softmax classifier with cross-entropy loss. The validation accuracy and the test accuracy obtained were both 0.4, using the Adam optimizer over 25 epochs with a learning rate of 1e-04.

In the experiment, a Bag-of-Features (BoF) model with SIFT (Scale Invariant Feature Transform) descriptors extracted features from the images, and these features were used as the input to a linear SVM that classified the images. The VLFeat library [9] was used to implement this procedure.

A decaying learning rate was used, continuously updated by an exponential function of the cost value, $\eta = \eta_0 \times \exp(C)$, where $\eta_0$ is set to 0.001 and $C$ is the training loss.

The pooling operation is also called downsampling or sub-sampling. It reduces the number of trainable parameters and the number of computations performed in the network, which in turn reduces overfitting, one of the common problems in CNNs. Pooling also makes the network robust to small deformations, alterations and translations of the input image during training. The Fully Connected layer is a conventional Multi-Layer Perceptron in which each neuron is connected to every neuron in the previous layer. The softmax activation function is used in the final output layer.
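As a minimal sketch of the softmax output layer mentioned above, the following pure-Python snippet computes one linear score per class, turns the scores into probabilities, and picks the most probable class. The function names `softmax` and `predict`, as well as the toy weights, biases and input vector, are hypothetical and chosen only for illustration; they do not come from the experiments described here.

```python
import math


def softmax(scores):
    """Map real-valued scores to probabilities that sum to one."""
    # Subtracting the largest score avoids overflow in exp() and leaves
    # the resulting probabilities unchanged.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, biases, x):
    """Return (predicted class index, class probabilities) for one sample."""
    # One linear score per class: dot(w_class, x) + b_class.
    scores = [
        sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    # The predicted class is the one with the highest probability.
    return probs.index(max(probs)), probs


# Hypothetical toy problem: 3 classes, 2 input features.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]
label, probs = predict(weights, biases, [2.0, 1.0])
print(label, probs)
```

Here the class scores are 2, 1 and -3, so the first class (index 0) receives the highest probability and is returned as the prediction.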
The input to the softmax activation function is a vector of arbitrary real-valued scores. Its output is a vector of values between zero and one that sum to one, where each value is the probability that the input image belongs to the corresponding class [13]. The softmax layer output is $\sigma\left(\sum_i w_i x_i + b\right)$. This gives, for each class $i$, $P(y_i = 1; x_i; w)$. For each sample $x$, the class $i$ with the maximum softmax output is the predicted class. For an image classification problem, each unit of the final layer indicates the probability of a particular class. Dropout was applied to the last convolutional layer and the Fully Connected layer.

Anaconda is a freely available, enterprise-ready Python distribution that can be used for data processing, scientific computing and data analytics. Anaconda comes with Python 2.7 or Python 3.4 built in, along with more than 100 Python packages that are optimized and tested across platforms. All Python-based tools can be used with Anaconda. It makes it possible to create isolated custom environments that combine different Python versions and to switch between them using the command "conda", a multi-platform package manager for Python and other languages [14]. Anaconda can be used on Linux, Mac OS and Windows.
Keras is an open-source neural network library written in Python. Keras can run on top of TensorFlow, Microsoft Cognitive Toolkit, Theano, or MXNet; a backend such as TensorFlow or Theano must be installed separately to use Keras.

Being user-friendly, modular and extensible, Keras was designed to enable fast experimentation with Deep Neural Networks. Keras was developed as part of the research effort of the project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System) by a Google engineer, François Chollet, who is also responsible for its maintenance.

Keras was intended to be used as an interface rather than as an independent Machine Learning framework. It offers a higher-level, more intuitive set of abstractions that make it easy to build deep learning models irrespective of the computational backend used.

Keras provides numerous implementations of commonly used Neural Network building blocks such as optimizers (Adam, Adagrad, RMSprop), activation functions (ReLU, softmax), cost (loss) functions such as MSE, layers, and an assortment of tools for easily handling text and images.
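To illustrate how these building blocks fit together, here is a minimal sketch of a small CNN defined through the Keras API bundled with TensorFlow. It assumes TensorFlow 2.x is installed. The number of classes, the input size, the filter counts and the dropout rate are hypothetical placeholders rather than the configuration used in the experiments; only the softmax output, the cross-entropy loss and the Adam optimizer with a learning rate of 1e-04 are taken from the text above.

```python
from tensorflow import keras

# Hypothetical values, chosen only for illustration.
NUM_CLASSES = 10
INPUT_SHAPE = (64, 64, 3)  # height, width, RGB channels

model = keras.Sequential([
    keras.Input(shape=INPUT_SHAPE),
    # Repeated convolution + pooling blocks learn the image features.
    keras.layers.Conv2D(32, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    # Assumed dropout rate (fraction of units dropped), not the source's value.
    keras.layers.Dropout(0.5),
    # Fully connected output layer: one softmax probability per class.
    keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

model.summary()
```

Training would then be a call to `model.fit` on image arrays with one-hot encoded labels; that step is omitted here because it depends on the dataset.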

Keras allows users to develop deep learning models for smartphones (iOS and Android), the web, or the Java Virtual Machine (JVM).