Sentiment analysis (SA) is a branch of opinion mining that focuses on extracting people's thoughts and feelings about a specific subject from structured, semi-structured, or unstructured text data. This paper analyzes the efficacy of four machine learning (ML) classifiers, namely Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and Multinomial Naive Bayes (Multinomial NB), on the IMDB dataset. The main objective is to determine which classifier performs best on this dataset. To this end, movie review data is taken from the IMDB dataset available on Kaggle. However, this dataset is not balanced and contains a lot of unnecessary and redundant data that must be eliminated, so pre-processing is required. During the pre-processing phase, techniques such as tokenization, stemming, stop-word removal, and segregation are applied to balance and normalize the data. The dataset is then divided into subsets using k-fold cross-validation, so that the ML classifiers are trained on various combinations of the data and their accuracy can be improved. Finally, classification is performed by the trained DT, RF, LR, and Multinomial NB classifiers. The efficacy of the system is analyzed in MATLAB on the IMDB dataset for every fold. Simulation results show that the LR classifier outperforms DT, RF, and Multinomial NB in terms of accuracy, precision, recall, and F1-score.
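The evaluation protocol described above (pre-processing, k-fold splitting, and scoring by accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch. It is written in plain Python, not the MATLAB setup used in this work, and it leaves out the classifiers themselves. The stop-word list, the suffix-stripping rule (a crude stand-in for a real stemmer), and the function names `preprocess`, `k_fold_indices`, and `evaluate` are all assumptions made for illustration; the value of k is left as a parameter because it is not stated here.

```python
import re

# Tiny illustrative stop-word list (an assumption, not the list used in the paper).
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "of", "it", "this", "to"}


def preprocess(text):
    """Lowercase, tokenize, drop stop words, and strip a few common suffixes.

    The suffix stripping is a crude stand-in for a real stemmer.
    """
    stems = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in STOP_WORDS:
            continue
        for suffix in ("ing", "ed", "s"):
            # Strip only if at least three characters of the word remain.
            if token.endswith(suffix) and len(token) > len(suffix) + 2:
                token = token[: -len(suffix)]
                break
        stems.append(token)
    return stems


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumes the samples were shuffled beforehand; each sample is used
    for testing exactly once across the k folds.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(preprocess("The movie was amazing and thrilling"))
    for train_idx, test_idx in k_fold_indices(5, 2):
        print("train:", train_idx, "test:", test_idx)
    print(evaluate([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a full pipeline, each classifier would be trained on the `train` indices of a fold and scored with `evaluate` on the `test` indices, and the per-fold scores would then be compared across DT, RF, LR, and Multinomial NB.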