Loading...

A Deep Recurrent Neural Network Model for Supervised Labeling of Variable Length Sequences

Taheri, Ali | 2017

609 Viewed
  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 50076 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Rabiee, Hamid Reza
  7. Abstract:
  8. In recent years, machine learning has been widely used to solve various computational problems. One of the most fundamental and integral type of these issues is the problem of classification, in which the data samples of different types such as image, voice, or video are classified into different categories. Since these data types are inherently sequential, the problem of classifying sequential data, is of great importance. This problem has a wide range of applications, including but not restricted to, handwriting recognition, voice recognition, malware detection, and fraud detection in financial transactions. Regarding the fact that in many cases, the length of the sequence (which could be text, voice, video, etc) is varying, in this dissertation, we try to solve the problem labeling and classification of variable length sequenced data. This problem is a challenging issue, due to the varying length of the sequences, variant training and test data prior distributions, and the change in the number of classes in the training phase. Furthermore, the most basic and important challenge is the detection of intrinsic dependencies of sequences, since these dependencies can help to determine the class of sequences. In this dissertation, a novel method based on Recurrent Neural Networks (RNNs) for classification of variable length sequenced data is proposed. This method utilizes multiple generative models with shared parameters to classify sequenced data, instead of a single discriminative model. We introduce a method in order to solve the problem of variable length sequence classification by exploiting the capabilities of an RNN-based algorithm. Finally, we evaluate the proposed method in various sequence classification problems, including the poem-poet detection, handwritten digit recognition (using MNIST dataset), and Android malware detection. The experimental results clearly demonstrate that our proposed method outperforms the most competitive baselines in classification accuracy as increases the classification accuracy about 5% in poem classification and handwritten digit recognition. Our method also seems to work much better in facing with unbalanced classes. Experimental results shoe that it decreases the classification error for smaller class of binary classification from about 26% to about 1% while the bigger class has 10 times more training samples
  9. Keywords:
  10. Recurrent Neural Networks ; Classification ; Labeling Algorithm ; Sequential Data ; Variable Length Sequences

 Digital Object List

 Bookmark

...see more