AN EXPANDABLE AND UP-TO-DATE LEXICON FOR SENTIMENT ANALYSIS OF ARABIC TWEETS
Sentiment analysis is the process of identifying the subjective opinion expressed in a text. It has attracted considerable interest because of its applications in economics, politics, and sociology. Because Twitter is a rich source of people's thoughts and opinions, it is a natural resource for exploring public opinion. Much research has been conducted for English, while Arabic has received only a limited number of sentiment analysis studies, especially for Arabic dialects in social media. This paper adopts a lexicon-based approach to sentiment analysis of Arabic tweets, which relies on detecting sentiment words. These words are stored in a sentiment lexicon in which each word is annotated with its sentiment polarity. One of the main difficulties in handling Arabic tweets is the changing nature of Twitter, where new words carrying sentiment emerge and many slang words evolve. In this paper, an expandable and up-to-date lexicon for Arabic (EULA) is developed to address the continual invention of new words and phrases in social media. EULA relies on a pre-built lexicon of Modern Standard Arabic (MSA) sentiment words and a set of rules that expand and enrich it with dialectal polarity words drawn from a small number of labeled tweets and a large number of unlabeled tweets. For evaluation, eight different corpora of Arabic tweets were selected. A pre-processing phase that includes normalization and stemming is applied to reduce the number of unique words to be analyzed. Experiments show that EULA improved the lexicon-based approach's accuracy and F1 score by more than 20% on average.
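To make the lexicon-based approach described above concrete, the following is a minimal sketch in Python, not EULA itself. The toy lexicon, the normalization rules (removing diacritics and tatweel, unifying alef variants, collapsing elongated letters), and the function names `normalize` and `classify` are illustrative assumptions. Stemming and the lexicon-expansion rules are omitted.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration, not EULA's):
# each word maps to a polarity, +1 for positive and -1 for negative.
TOY_LEXICON = {
    "جميل": 1,   # beautiful
    "رائع": 1,   # wonderful
    "سيء": -1,   # bad
    "حزين": -1,  # sad
}

# Arabic diacritics (U+064B to U+0652) and the tatweel character (U+0640).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")


def normalize(text):
    """Apply a few common normalization steps (assumed, not the paper's exact rules)."""
    text = _DIACRITICS.sub("", text)          # drop diacritics and tatweel
    text = re.sub("[إأآ]", "ا", text)          # unify alef variants
    text = re.sub(r"(.)\1{2,}", r"\1", text)  # collapse runs of 3+ repeated characters
    return text


def classify(tweet, lexicon=TOY_LEXICON):
    """Sum the polarities of known words and map the total to a label."""
    tokens = normalize(tweet).split()
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In this sketch a word that is missing from the lexicon contributes nothing to the score, which is why new slang lowers coverage and why an expandable lexicon matters for tweets.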
B. Ihnaini and M. Mahmuddin, “An expandable and up-to-date lexicon for sentiment analysis of Arabic tweets,” J. Eng. Appl. Sci., vol. 13, no. 17, pp. 7313–7322, 2018.