EdgeSumm: Graph-based framework for automatic text summarization

El-Kassas, WS; Salama, Cherif; Rafea, AA; Mohamed, HK

Abstract


Searching the Internet for a certain topic can become a daunting task because users cannot read and comprehend all the resulting texts. Automatic text summarization (ATS) is clearly beneficial in this case because manual summarization is expensive and time-consuming. To enhance ATS for single documents, this paper proposes a novel extractive graph-based framework, EdgeSumm, that relies on four proposed algorithms. The first algorithm constructs a new text graph model representation from the input document. The second and third algorithms search the constructed text graph for sentences to be included in the candidate summary. When the resulting candidate summary still exceeds a user-required limit, the fourth algorithm is used to select the most important sentences. EdgeSumm combines a set of extractive ATS methods (namely graph-based, statistical-based, semantic-based, and centrality-based methods) to benefit from their advantages and overcome their individual drawbacks. EdgeSumm is general for any document genre (not limited to a specific domain) and unsupervised, so it does not require any training data. The standard datasets DUC2001 and DUC2002 are used to evaluate EdgeSumm with the widely used automatic evaluation tool Recall-Oriented Understudy for Gisting Evaluation (ROUGE). EdgeSumm achieves the highest ROUGE scores on DUC2001. For DUC2002, the evaluation results show that the proposed framework outperforms the state-of-the-art ATS systems, achieving improvements of 1.2% and 4.7% over the highest scores in the literature for ROUGE-1 and ROUGE-L, respectively. In addition, EdgeSumm achieves very competitive results for ROUGE-2 and ROUGE-SU4.
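For readers unfamiliar with the ROUGE-1 metric named in the abstract, the following is a minimal sketch of how a unigram-overlap score can be computed between a candidate summary and a reference summary. It is an illustration only, not the evaluation setup used in the paper: the function name `rouge_1`, the lowercase whitespace tokenization, and the absence of stemming or stopword removal are all assumptions made here for brevity.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap between two texts.

    Assumption: tokens are lowercase, whitespace-separated words, with no
    stemming or stopword removal (hypothetical simplification).
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count of each shared token,
    # so a repeated word is never credited more often than it appears
    # in the other text.
    overlap = sum((cand & ref).values())

    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


# Hypothetical example sentences, not taken from DUC2001 or DUC2002.
scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
```

In this example five of the six reference unigrams also appear in the candidate, so recall, precision, and F1 are each 5/6. Scores reported in the literature typically come from the official ROUGE toolkit, whose preprocessing options can yield different values from this simplified version.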


Other data

Title EdgeSumm: Graph-based framework for automatic text summarization
Authors El-Kassas, WS; Salama, Cherif; Rafea, AA; Mohamed, HK
Keywords Automatic text summarization; Extractive text summarization; Graph representation model; Single-document summarization; EdgeSumm
Issue Date Nov-2020
Publisher ELSEVIER SCI LTD
Journal INFORMATION PROCESSING & MANAGEMENT 
Volume 57
Issue 6
ISSN 0306-4573
DOI 10.1016/j.ipm.2020.102264
Scopus ID 2-s2.0-85087201719
Web of science ID WOS:000582206800024

Citations 47 in Scopus


Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.