Enhancing Information Retrieval through Dependency Modeling

Doaa Mabrouk Abd El-Fatah Mabrouk

Abstract

Problems arise in every field, and computing is no exception; they have multiplied with the rapid spread of the internet. One of the most important fields today is Information Retrieval “IR”, the search for material that meets a user’s information need. As internet use and the volume of information available on the web have grown, IR has become a fact of daily life for users. The internet provides users with vast knowledge and information in many domains. The major research areas include biology, chemistry, commerce, tourism, earth science, education, mathematics, physics, economics, agriculture, and information and computer sciences.

This thesis addresses two problems. The first is term dependency: some mathematical retrieval models, such as the Vector Space Model “VSM”, assume that terms are independent, while others, such as the Markov Random Field “MRF”, Unigram and Bigram models, assume that terms are dependent. The second is term weighting, which lies at the core of mathematical retrieval modeling and is important for document ranking. Weighting methods include Term Frequency Inverse Document Frequency “TF*IDF”, Information Gain Ratio “IGR”, Confidence Weight “Conf.Weight” and weighted clustering.
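
To make the term-weighting idea concrete, the following is a minimal sketch of one common TF*IDF variant (relative term frequency multiplied by the natural log of inverse document frequency). The abstract does not state which formulation the thesis uses, so the function name `tf_idf`, the toy documents, and the exact formula are assumptions chosen for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / doc length) * ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical toy corpus, already tokenized.
docs = [
    ["information", "retrieval", "model"],
    ["term", "dependency", "model"],
    ["information", "retrieval", "retrieval"],
]
weights = tf_idf(docs)
```

Under this variant, a term that occurs in every document receives a weight of zero, and a term confined to a single document is weighted most heavily, which is what makes the weights useful for ranking.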

The proposed algorithm uses power set theory to discover all combinations of words in documents. The results are judged by accuracy, measured with Subsumptions Rule-Based Classifiers “SRBC” in two ways: Maximum-Number Term Dependency Identification “Max-No-TDI” and Maximum-Feature Count “Max-FC”.
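
The power set idea can be illustrated with a short sketch that enumerates every subset of a document's distinct terms. This is not the thesis's algorithm, whose details the abstract does not give; the function name `term_combinations`, the `min_size` parameter, and the sample document are hypothetical.

```python
from itertools import chain, combinations

def term_combinations(terms, min_size=2):
    """Return an iterator over every subset of the distinct terms
    that has at least min_size members (min_size=0 gives the full power set)."""
    vocab = sorted(set(terms))
    return chain.from_iterable(
        combinations(vocab, k) for k in range(min_size, len(vocab) + 1)
    )

# Hypothetical tokenized document.
doc = ["term", "dependency", "model"]
candidates = list(term_combinations(doc))
```

A document with n distinct terms has 2^n subsets, so this exhaustive enumeration is only practical for small vocabularies or when the subset size is capped.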

This thesis presents a survey of mathematical information retrieval systems that use dependency modeling and term weighting. Dependency modeling is enhanced in terms of performance, effectiveness and efficiency, with term weighting considered as a further factor that affects the results. The thesis also applies power set theory to discover Term Dependency Identification “TDI” between words in Text Classification “TC” and measures the accuracy of all generated random experiments. The results show that Max-No-TDI performs better than Max-FC, with a 96% accuracy level.

Other data

Title	Enhancing Information Retrieval through Dependency Modeling
Other Titles	تطوير نظم استرجاع المعلومات من خلال نماذج التبعية
Authors	Doaa Mabrouk Abd El-Fatah Mabrouk
Issue Date	2019

Attached Files

File	Size	Format
cc1031.pdf	373.07 kB	Adobe PDF
