Negation Detection In Arabic Opinion Reviews: A Comprehensive Annotated Dataset For Sentiment Analysis

Authors

  • Ahmed S. Abuhammad, Department of Computer Science and Information Technology, University College of Science and Technology, Khan Younis, Palestine & University of the Holy Quran and Taseel of Science, Wad Madani, Sudan

Keywords:

Annotated dataset; Arabic opinion reviews; Dialectal Arabic; Modern Standard Arabic; Natural Language Processing; Negation detection; Sentiment analysis.

Abstract

Negation detection plays a vital role in Natural Language Processing (NLP), especially in sentiment analysis. In this paper, we introduce a comprehensive dataset of Arabic opinion reviews annotated specifically for negation detection. The dataset consists of 84,000 reviews collected from TripAdvisor, Booking.com, and Agoda, spanning the period from June 2013 to June 2023. It is evenly divided between 42,000 'negated positive' reviews and 42,000 positive reviews. The reviews focus on hotels and travel accommodations across the Middle East and North Africa and are written in various Arabic dialects. The data collection process involved web scraping, language filtering, and both automatic and manual annotation of negation cues, such as 'لا' (no) and 'ليس' (not). Annotation quality was verified through expert review and inter-annotator agreement to ensure high consistency. This dataset offers valuable insights into negation structures in both Modern Standard Arabic (MSA) and Dialectal Arabic (DA), providing a foundation for developing and evaluating negation detection methods. It will be made available to the Arabic research community to help address these key linguistic challenges.
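
The automatic cue-annotation step mentioned in the abstract could look roughly like the sketch below. It is only an illustration, not the author's pipeline: the function names, the normalization step, and the whole-token matching rule are assumptions. Only the cues 'لا' and 'ليس' come from the abstract; the other cues in the list are hypothetical additions for MSA and dialectal coverage.

```python
import re

# Cues named in the abstract.
SOURCE_CUES = ["لا", "ليس"]
# Illustrative extra cues (assumption, not taken from the paper).
ASSUMED_CUES = ["لم", "لن", "مش"]

NEGATION_CUES = frozenset(SOURCE_CUES + ASSUMED_CUES)

# Arabic short-vowel diacritics (U+064B..U+0652) and tatweel (U+0640).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# A token is a run of Arabic letters (U+0621..U+064A).
_TOKEN = re.compile(r"[\u0621-\u064A]+")


def find_negation_cues(review: str) -> list[tuple[int, str]]:
    """Return (token_index, cue) pairs for tokens that exactly match a cue.

    Matching whole tokens avoids flagging words that merely start with
    a cue, such as 'لازم' (which begins with 'لا').
    """
    normalized = _DIACRITICS.sub("", review)
    tokens = _TOKEN.findall(normalized)
    return [(i, tok) for i, tok in enumerate(tokens) if tok in NEGATION_CUES]


def has_negation(review: str) -> bool:
    """True if the review contains at least one negation cue."""
    return bool(find_negation_cues(review))
```

A lexicon lookup like this only proposes candidate cues. It cannot resolve scope or ambiguous particles, which is presumably why the abstract pairs automatic annotation with manual annotation and expert review.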

Published

2024-10-06

How to Cite

Abuhammad, A. S. (2024). Negation Detection In Arabic Opinion Reviews: A Comprehensive Annotated Dataset For Sentiment Analysis. Journal of Information Systems Research and Practice, 2(4), 2–19. Retrieved from https://vmis.um.edu.my/index.php/JISRP/article/view/55506