Implementation of Enhanced Ontological Techniques for Information Retrieval Purposes

Amr Aly Amin Aly ElSehemy

Abstract


Keyword-based search engines were used to solve this problem; however, in many cases they either miss some relevant documents or retrieve non-relevant ones. New technologies have been introduced to improve search results, one of which is based on the Semantic Web. Many semantic search techniques and models have been developed to enhance traditional keyword-based search. Including conceptual knowledge such as ontologies in the information retrieval process helps solve major problems found in keyword-based search. Advances have been made for languages such as English, German, French and Spanish. Although Arabic is spoken by as many as 422 million native speakers, the literature does not yet fully cover it.
In this work, an architecture for ontology-based information retrieval for the Arabic language is presented. The architecture consists of four main modules: the query parser, the indexer, the search module and the ranking module. The work included building a semantic index that provides weighted links between the ontology concepts and the documents. Furthermore, a document categorizer was built and added to the architecture as an additional process to enhance the overall document ranking.
To test our work, three Arabic domain ontologies were built: Sports, Economics and Politics. A knowledge base consisting of 79 classes and 1,456 instances was built. The document categorizer was evaluated using the WATAN-2004 corpus, with some modifications that merge categories to make them relevant to this work. The corpus contains around 20,000 articles in five categories (Culture, Religion, Economy, News and Sports). Twenty retrieval operations were manually chosen to assess different cases. The operations were applied to a sample of 40,316 documents, amounting to 320 megabytes of pure text. Each operation was applied three times: once using keyword-based search, and twice using the presented ontology-based search, with and without the classification module. The documents were downloaded from the www.aljazeera.net news website.
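As a rough illustration of how such modules might fit together, the sketch below builds a toy semantic index of weighted concept-to-document links and ranks documents for a query, with an optional boost when a document's category matches the concept's domain. It is not the implementation described in this work: the ontology contents, the TF-IDF-style weighting, the boost factor, and all function names are assumptions made for illustration, and the whitespace tokenizer stands in for the Arabic normalization and stemming a real system would need.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy ontology: concept -> surface labels
# (English stand-ins for Arabic labels).
ONTOLOGY = {
    "Football": ["football", "goal"],
    "Inflation": ["inflation", "prices"],
}
# Hypothetical mapping from each concept to its domain ontology.
CONCEPT_DOMAIN = {"Football": "Sports", "Inflation": "Economics"}


def tokenize(text):
    # Placeholder tokenizer; real Arabic text needs normalization and stemming.
    return text.lower().split()


def build_semantic_index(docs):
    """Indexer: map each concept to {doc_id: weight}.

    The weight is an assumed TF-IDF over concept mentions.
    """
    counts = {}
    for doc_id, text in docs.items():
        tokens = Counter(tokenize(text))
        counts[doc_id] = {
            concept: sum(tokens[label] for label in labels)
            for concept, labels in ONTOLOGY.items()
        }
    index = defaultdict(dict)
    n_docs = len(docs)
    for concept in ONTOLOGY:
        df = sum(1 for d in counts if counts[d][concept] > 0)
        if df == 0:
            continue
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF
        for doc_id in counts:
            tf = counts[doc_id][concept]
            if tf:
                index[concept][doc_id] = tf * idf
    return index


def parse_query(query):
    """Query parser: return the concepts whose labels occur in the query."""
    tokens = set(tokenize(query))
    return [c for c, labels in ONTOLOGY.items() if tokens & set(labels)]


def search_and_rank(query, index, doc_categories=None, boost=1.5):
    """Search and ranking: sum link weights per document.

    If doc_categories (the output of a document categorizer) is given,
    links whose concept domain matches the document category are boosted.
    """
    scores = defaultdict(float)
    for concept in parse_query(query):
        for doc_id, weight in index.get(concept, {}).items():
            if (doc_categories is not None
                    and doc_categories.get(doc_id) == CONCEPT_DOMAIN[concept]):
                weight *= boost
            scores[doc_id] += weight
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `search_and_rank` without `doc_categories` corresponds loosely to ontology-based search without the classification module, and passing a category map corresponds to search with it.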


Other data

Title Implementation of Enhanced Ontological Techniques for Information Retrieval Purposes
Other Titles تنفيذ إستخدام تقنية مفاهيم اللغة المحسنة لأغراض إسترجاع المعلومات
Authors Amr Aly Amin Aly ElSehemy
Issue Date 2015


Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.