Developing Semantic-based System for Arabic Information Retrieval

Wasim Ahmed Abdul-Aziz Alromima

Abstract

In the era of information overload, information retrieval systems are vital applications. The World Wide Web and social media have become a vast library of unstructured data that is laborious to comprehend and process without intelligent techniques. Many researchers are working to improve the precision and recall of search results by developing new methods, especially semantic ones. The amount of available Arabic content is increasing, but its usefulness is limited by the complexity of Arabic morphology and the lack of resources such as ontologies and machine-readable dictionaries.
The main objective of this thesis is to introduce a new Semantic-based Arabic Information Retrieval System (SAIRS) to improve Arabic text retrieval. To address the complexity and limited resources of the Arabic language, the proposed approach makes three main contributions. First, the query is expanded using n-gram term collocations that are mined automatically from the Arabic corpus, so no external semantic resource is needed. Second, the query is expanded using an Arabic domain ontology, which was designed manually and represented in the Web Ontology Language (OWL). Third, the system index is constructed from the corpus words themselves, which saves the cost and effort of the stemming process. The Vector Space Model (VSM) is employed to represent both documents and user queries. The experimental evaluation was conducted on the text of the Arabic Holy Quran.
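The first contribution and the VSM representation described above can be illustrated with a minimal sketch. It is not the SAIRS implementation: the function names, the bigram-only mining, the `min_count` threshold, the TF-IDF weighting, and the toy English corpus are all assumptions made for illustration. The index is built from surface words with no stemming, as the abstract describes.

```python
import math
from collections import Counter

# Toy stand-in corpus (hypothetical); the thesis uses the Arabic Quran text.
DOCS = ["holy book verse", "holy book chapter", "river water flows"]


def tokenize(text):
    # Whitespace tokenization only: surface words are indexed, no stemming.
    return text.split()


def mine_bigrams(docs, min_count=2):
    """Return adjacent word pairs seen at least min_count times (assumed threshold)."""
    counts = Counter()
    for doc in docs:
        toks = tokenize(doc)
        counts.update(zip(toks, toks[1:]))
    return {pair for pair, c in counts.items() if c >= min_count}


def expand_query(query, collocations):
    """Append the collocation partner of every query term."""
    terms = tokenize(query)
    expanded = list(terms)
    for a, b in sorted(collocations):
        if a in terms and b not in expanded:
            expanded.append(b)
        elif b in terms and a not in expanded:
            expanded.append(a)
    return expanded


def tfidf_vectors(docs):
    """Represent each document as a sparse TF-IDF vector (dict: term -> weight)."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in df}
    vecs = [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]
    return vecs, idf


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def search(query, docs, collocations):
    """Rank documents by cosine similarity to the expanded query vector."""
    vecs, idf = tfidf_vectors(docs)
    q_terms = expand_query(query, collocations)
    q_vec = {t: tf * idf.get(t, 0.0) for t, tf in Counter(q_terms).items()}
    return sorted(((cosine(q_vec, v), i) for i, v in enumerate(vecs)), reverse=True)
```

With this toy corpus, the query "holy" is expanded with "book" because the pair occurs twice, and the third document, which shares no term with the expanded query, receives a score of zero.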
The thesis has two main sub-objectives. The first is a system that extracts tagged n-gram collocations (2-grams to 6-grams) from the Arabic corpus by matching an input structural pattern of the Arabic language against the Part of Speech Tagging (POST) of the corpus. The system is useful for extracting different kinds of word sequences and phrases. The prototype is beneficial for linguistic research, as shown in the different scenarios of the experiments conducted.
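The pattern-matching idea behind the collocation extractor can be sketched as follows. This is an illustrative assumption, not the thesis prototype: the tag names (`NOUN`, `ADJ`, `PREP`), the `"*"` wildcard, the placeholder words, and the function name `extract_collocations` are all hypothetical.

```python
# Hypothetical POS-tagged token stream: (word, tag) pairs with placeholder words.
TAGGED = [
    ("w1", "NOUN"),
    ("w2", "ADJ"),
    ("w3", "PREP"),
    ("w4", "NOUN"),
    ("w5", "ADJ"),
]


def extract_collocations(tagged, pattern):
    """Return every word n-gram whose tag sequence matches the pattern.

    pattern is a tuple of 2 to 6 tags; "*" (an assumed convention) matches any tag.
    """
    n = len(pattern)
    if not 2 <= n <= 6:
        raise ValueError("pattern length must be between 2 and 6")
    matches = []
    # Slide a window of length n over the tagged tokens.
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        if all(p == "*" or p == tag for p, (_, tag) in zip(pattern, window)):
            matches.append(tuple(word for word, _ in window))
    return matches
```

For example, `extract_collocations(TAGGED, ("NOUN", "ADJ"))` returns the two noun-adjective pairs in the stream, `("w1", "w2")` and `("w4", "w5")`.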

Other data

Title	Developing Semantic-based System for Arabic Information Retrieval
Other Titles	تطوير نظام دلالي لاسترجاع المعلومات باللغة العربية
Authors	Wasim Ahmed Abdul-Aziz Alromima
Issue Date	2016

Attached Files

File	Size	Format
G14107.pdf	589.66 kB	Adobe PDF

