Combining fast text embeddings with neural networks for short text classification

dc.contributor.author: Jayakody, J.R.K.C.
dc.contributor.author: Vidanagama, V.G.T.N.
dc.date.accessioned: 2025-11-18T03:04:20Z
dc.date.available: 2025-11-18T03:04:20Z
dc.date.issued: 2023-11-03
dc.description.abstract: Embedding representations are a critical step in improving text classification accuracy. Although earlier research relied on Bag-of-Words (BoW) models, embedding techniques such as word2vec, GloVe, and FastText represent the features of text documents in a distributed manner and thereby improve model accuracy. Recent work combines embedding techniques with enhanced neural network models to improve the classification accuracy of text documents; FastText as an unsupervised embedding model and CNN, LSTM, and RNN as neural models have been used extensively. However, no comprehensive analysis of FastText combined with neural models on text documents has been undertaken so far. As a result, existing studies are hard to compare, and it is unclear which combination of neural model and FastText performs best. It is therefore necessary to investigate the impact of different neural networks when features are represented with the FastText embedding model. A well-known movie review dataset was used for the experiment. CNN, LSTM, RNN, NN, and variations of these networks were used as classifiers. A stratified hold-out split assigned 70% of the data to training and 30% to testing; the training portion was further split into 80% for training and 20% for validation. We compare classification accuracy across this range of models. Among the basic architectures, the RNN performs best with FastText embeddings, at 86% accuracy. Among the variations, the CNN-LSTM combination outperforms all other models, at 88% accuracy. The outcomes of this study can serve as a baseline for future research.
dc.identifier.citation: Proceedings of the Postgraduate Institute of Science Research Congress (RESCON) -2023, University of Peradeniya, P41
dc.identifier.isbn: 978-955-8787-09-0
dc.identifier.uri: https://ir.lib.pdn.ac.lk/handle/20.500.14444/6742
dc.language.iso: en_US
dc.publisher: Postgraduate Institute of Science (PGIS), University of Peradeniya, Sri Lanka
dc.subject: Classification
dc.subject: CNN
dc.subject: FastText
dc.subject: LSTM
dc.subject: RNN
dc.title: Combining fast text embeddings with neural networks for short text classification
dc.title.alternative: ICT, mathematics, and statistics
dc.type: Article
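
The abstract describes a stratified hold-out split (70% training, 30% testing) followed by an 80%/20% training/validation split of the training portion. The record does not say which tooling the authors used, so the following is only a minimal sketch of that splitting scheme using the Python standard library. The function name `stratified_split`, the seed, and the balanced toy labels are assumptions made for illustration and do not come from the paper.

```python
import random
from collections import defaultdict

def stratified_split(labels, second_fraction, seed=0):
    """Split indices into two groups while preserving class proportions.

    Hypothetical helper: returns (first_idx, second_idx), where roughly
    `second_fraction` of each class lands in the second group.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    first, second = [], []
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        n_second = round(len(idx) * second_fraction)
        second.extend(idx[:n_second])
        first.extend(idx[n_second:])
    return first, second

# Toy labels (assumption): 500 positive and 500 negative reviews.
labels = [1] * 500 + [0] * 500

# Step 1: 70% training / 30% testing, stratified by label.
train_val_idx, test_idx = stratified_split(labels, second_fraction=0.30, seed=42)

# Step 2: split the training portion 80% training / 20% validation.
train_val_labels = [labels[i] for i in train_val_idx]
rel_train, rel_val = stratified_split(train_val_labels, second_fraction=0.20, seed=42)
train_idx = [train_val_idx[i] for i in rel_train]
val_idx = [train_val_idx[i] for i in rel_val]

print(len(train_idx), len(val_idx), len(test_idx))  # 560 140 300
```

With these proportions the final training, validation, and test sets hold 56%, 14%, and 30% of the full dataset, and each keeps the original class balance.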

Files

Original bundle

Name: Jayakody.pdf
Size: 93.87 KB
Format: Adobe Portable Document Format
