Mining the Publication Papers via Text Mining
Ahmed Saeed Ibrahim El-dosouky
Abstract
The amount of data produced every day is massive. Because the byte is the unit of storage on electronic devices, this amounts to an enormous number of bytes created each day. Data may be numeric or textual, so mining it is an important step for discovering patterns in apparently random data and for using that information to better understand trends and correlations.
Text is a form of data, and there are two types of text: structured and unstructured. Mining unstructured text is difficult because the text is not organized in rows and columns; it takes the form of a document made up of paragraphs. Text mining can be a valuable process because it analyzes data to capture key concepts and themes and to uncover hidden relationships and trends without prior knowledge of the precise words or terms that authors have used to express those concepts [1]. Text mining has five main approaches. The first is information retrieval, the activity of obtaining information resources (mostly documents) that satisfy an information need from a collection of unstructured data [2].
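As an illustration of the information retrieval approach, the following is a minimal sketch that ranks documents against a query with a simple TF-IDF score. The weighting scheme, the function names `tokenize` and `rank`, and the sample documents are assumptions chosen for illustration; they are not taken from the thesis.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, documents):
    """Return indices of matching documents, best match first.

    Each document is scored by summing tf * idf over the query terms,
    where idf = log(N / df). This is an assumed, simplified weighting.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n = len(documents)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scored = []
    for index, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = sum(
            tf[term] * math.log(n / df[term])
            for term in tokenize(query)
            if term in tf
        )
        scored.append((score, index))
    # Highest score first; ties are broken by original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for score, index in scored if score > 0]


# Hypothetical sample collection.
documents = [
    "Text mining uncovers hidden trends in documents",
    "Bytes are the unit of storage",
    "Information retrieval finds documents for a query",
]
result = rank("information retrieval", documents)
```

With this weighting, a term that occurs in every document has an idf of zero and therefore does not contribute to the ranking.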
The second approach is Natural Language Processing (NLP), which is mainly concerned with how to generate text and how to understand it. Natural Language Generation (NLG) is the field responsible for generating text, whereas Natural Language Understanding (NLU) is the field responsible for analyzing text; it consists of at least one of the following components: tokenization, morphological or lexical analysis, syntactic analysis, and semantic analysis.
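Tokenization, the first NLU component listed above, can be sketched with regular expressions. This is a simplified illustration under assumed rules (sentences end at `.`, `!`, or `?`, and punctuation marks become separate tokens); the function names are hypothetical, and the other components (morphological, syntactic, and semantic analysis) are not covered here.

```python
import re


def split_sentences(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tokenize(sentence):
    """Split a sentence into word tokens and punctuation tokens.

    Words may contain internal hyphens or apostrophes (e.g. "isn't").
    """
    return re.findall(r"\w+(?:[-']\w+)*|[^\w\s]", sentence)


sentences = split_sentences("Text is data. Mining it is hard!")
tokens = tokenize("Text mining isn't easy.")
```

A production tokenizer would also need to handle abbreviations, numbers, and language-specific rules, which this sketch deliberately ignores.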
Other data
| Field | Value |
|---|---|
| Title | Mining the Publication Papers via Text Mining |
| Other Titles | تنقيب اوراق النشر عن طريق التنقيب النصي |
| Authors | Ahmed Saeed Ibrahim El-dosouky |
| Issue Date | 2022 |
Attached Files
| File | Size | Format | |
|---|---|---|---|
| BB12557.pdf | 539.88 kB | Adobe PDF | View/Open |