Twitter Benchmark Dataset for Arabic Sentiment Analysis

Gamal, Donia; Alfonse, Marco; El-Sayed M. El-Horbaty; Salem A.

Abstract

Sentiment classification is one of the fastest-growing research areas in sentiment analysis and text mining, especially given the massive volume of opinions available on social media. Recent results have shown that no single strategy achieves the best prediction performance across different datasets. Research on Arabic sentiment analysis is scarce compared with English sentiment analysis, because the unique nature and difficulty of the Arabic language have led to a shortage of Arabic datasets for sentiment analysis. This paper proposes an Arabic benchmark dataset for sentiment analysis and describes the methodology used to gather recent tweets in different Arabic dialects. The dataset contains more than 151,000 opinions in various Arabic dialects, labeled into two balanced classes: positive and negative. Several machine learning algorithms are applied to this dataset, including ridge regression, which gives the highest accuracy of 99.90%.

Other data

Title	Twitter Benchmark Dataset for Arabic Sentiment Analysis
Authors	Gamal, Donia ; Alfonse, Marco ; El-Sayed M. El-Horbaty ; Salem A.
Keywords	Arabic Benchmark Dataset | Arabic Dialects | Arabic Opinion Mining | Arabic Sentiment Analysis | Machine Learning | Twitter
Issue Date	1-Jan-2019
Journal	International Journal of Modern Education and Computer Science
ISSN	2075-0161
DOI	10.5815/ijmecs.2019.01.04
Scopus ID	2-s2.0-85064925523

Citations: 37 in Scopus
