Optimizing Sentiment Classification for Arabic Opinion Texts

Saeed, Radwa M.K.; Rady, Sherine; Gharib, Tarek F.; Moustafa Kamal Saeed, Radwa

Abstract

Product and service reviews guide potential customers, giving them real knowledge about those products and services when making decisions. Sentiment classification is the task of automatically analyzing opinions expressed in textual reviews. The efficiency of this task depends on the set of representative features extracted from the reviews. The value of the extracted features, however, lies mainly in those that contribute strongly to the classification process. This is where dimensionality reduction comes in: it eliminates noise and shrinks the high-dimensional feature space while preserving the required accuracy. The Arabic language and its datasets pose inherent challenges. Moreover, most sentiment classification studies that integrate dimensionality reduction have focused on English texts, with only a few studies conducted for other languages, including Arabic. Massive amounts of Arabic data have been generated by the large population of the Arab world, yet these technical gaps remain for the language. This paper proposes a supervised learning approach for sentiment classification of Arabic reviews. The approach uses optimized compact features that depend on a well-representative feature set coupled with feature reduction techniques, which manages to achieve high accuracy and time/space savings simultaneously. The feature set includes a triple combination of N-gram features and positive/negative N-gram count features obtained after negation handling. The approach examines two linear transformation methods: principal component analysis (PCA) as an unsupervised transformation method and latent Dirichlet allocation (LDA) as a supervised transformation method. A spam detection process is executed before learning to increase classifier robustness.
The proposed approach was evaluated on five Arabic opinion text datasets from different domains and of varying sizes (1.6 K up to 94 K reviews). Experiments were conducted for two-class (positive/negative) and three-class (positive/negative/neutral) classification problems. Accuracy values ranged from 95.5% to 99.8% for the two-class problem and from 92% to 97.3% for the three-class problem. LDA feature reduction outperformed PCA by an average of 4.34% in accuracy and 3.52% in F1 score. The overall approach outperformed existing related works in the literature by margins of 23% in accuracy and 34% in F1 score. The experimental results show the efficiency of the proposed solution, which employs optimized features that rely on integrating a feature reduction module with a well-representative feature set based on a negation-handled triple combination of N-gram features and positive/negative N-gram count features. Overall, the results demonstrate substantial improvement: a 24% increase in accuracy, 93% savings in the feature space, and a 97% decrease in classification execution time.
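
The feature set described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the "triple combination" is read as unigrams plus bigrams plus trigrams, negation is handled by tagging the token after a negation particle with a `NOT_` prefix, polarity counts are taken over single tokens only, and the tiny word lists are English stand-ins for Arabic lexicons. The names `NEGATORS`, `POSITIVE`, `NEGATIVE`, and `extract_features` are hypothetical.

```python
from collections import Counter

# Hypothetical toy resources: English stand-ins for Arabic negation
# particles and sentiment lexicons (assumptions, not from the paper).
NEGATORS = {"not", "never"}
POSITIVE = {"good", "great"}
NEGATIVE = {"bad", "poor"}


def handle_negation(tokens):
    """Drop each negator and prefix the token that follows it with NOT_."""
    out, negate = [], False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        out.append("NOT_" + tok if negate else tok)
        negate = False
    return out


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def polarity(token):
    """Return +1, -1, or 0; a NOT_ prefix flips the lexicon polarity."""
    negated = token.startswith("NOT_")
    base = token[4:] if negated else token
    if base in POSITIVE:
        return -1 if negated else 1
    if base in NEGATIVE:
        return 1 if negated else -1
    return 0


def extract_features(text):
    """Build n-gram counts (n = 1, 2, 3) plus positive/negative counts."""
    tokens = handle_negation(text.split())
    features = Counter()
    for n in (1, 2, 3):
        features.update(ngrams(tokens, n))
    features["__POS_COUNT__"] = sum(1 for t in tokens if polarity(t) > 0)
    features["__NEG_COUNT__"] = sum(1 for t in tokens if polarity(t) < 0)
    return features


if __name__ == "__main__":
    print(extract_features("the hotel is not good"))
```

In a pipeline like the one the abstract outlines, sparse feature vectors of this kind would then pass through a feature reduction step before classification; that step is omitted here.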

Other data

Title	Optimizing Sentiment Classification for Arabic Opinion Texts
Authors	Saeed, Radwa M.K.; Rady, Sherine; Gharib, Tarek F.; Moustafa Kamal Saeed, Radwa
Keywords	Arabic sentiment classification;Feature reduction;Latent Dirichlet allocation (LDA);Optimized features;Supervised learning;Principal component analysis (PCA)
Issue Date	1-Jan-2021
Publisher	SPRINGER
Journal	Cognitive Computation
Volume	13
Start page	164
End page	178
ISSN	18669956
DOI	10.1007/s12559-020-09771-z
Scopus ID	2-s2.0-85098584862
Web of science ID	WOS:000604479500001
