Volume 18, No. 6, 2021

Encoder And Decoder Techniques For Cross-Language Multi-Document Abstractive And Extractive Summarization


Dr. Shivaprakash, Nityanand D M, Sangamesh

Abstract

The spread of social media and digital documents has led to rapid growth in the multilingual data accessible on the web. This volume of data, however, cannot be assessed manually. The present work addresses Cross-Language Text Summarization (CLTS), which creates a summary in a language different from that of the source documents. The CLTS task is to produce a summary in a target language (TL) (e.g., Japanese) for a given set of documents in a different source language (e.g., English). The encoder-decoder paradigm is widely used in CLTS research, with soft attention employed to obtain the necessary contextual semantic information during decoding. However, because the model lacks access to the primary features, the generated summary drifts away from the main content. The present work proposes a novel architecture that addresses this task by extracting several summaries in the TL using Double Attention Mechanism and Bi-directional Long Short-Term Memory (DAM_Bi-LSTM) networks, which extract relevant cross-language keywords better and reduce the problem of unfamiliar words during summary generation, in order to improve CLTS. In the Attention Pointer Network, the self-attention mechanism gathers the principal information from the encoder, and the soft attention and the pointer network produce clearer summaries. Additionally, an optimized coverage mechanism is used to deal with the repetition problem and improve the quality of the generated summaries. The proposed DAM_Bi-LSTM attains 24% in ROUGE-1, 20% in ROUGE-2, 40% in ROUGE-L, 92.6% accuracy, 80.6% precision, 74.6% recall, and an F1-score of 86.8%.
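
The interplay of soft attention, the pointer network, and the coverage mechanism named in the abstract can be illustrated with a minimal sketch. It assumes the standard pointer-generator formulation (dot-product attention, a copy/generate mixture weighted by a scalar p_gen, and a coverage loss summing the overlap between current and accumulated attention), not the authors' optimized variant, whose details the abstract does not give. All function names, vectors, and token values below are hypothetical.

```python
import math
from collections import defaultdict


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(decoder_state, encoder_states):
    """Dot-product score of one decoder state against every encoder state."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    return softmax(scores)


def pointer_mix(p_gen, vocab_dist, attention, source_tokens):
    """Blend generating from the vocabulary with copying source tokens.

    Copying lets the decoder emit words absent from the vocabulary,
    which is how pointer networks ease the unfamiliar-word problem.
    """
    final = defaultdict(float)
    for word, p in vocab_dist.items():
        final[word] += p_gen * p
    for a, tok in zip(attention, source_tokens):
        final[tok] += (1.0 - p_gen) * a
    return dict(final)


def coverage_loss(attention, coverage):
    """Penalize attending again to positions already covered."""
    return sum(min(a, c) for a, c in zip(attention, coverage))


# Hypothetical toy inputs (assumed for illustration only).
source_tokens = ["tokyo", "hosts", "summit", "tokyo"]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
decoder_state = [1.0, 0.0]
vocab_dist = {"hosts": 0.6, "summit": 0.4}  # "tokyo" is out of vocabulary

attention = soft_attention(decoder_state, encoder_states)
final_dist = pointer_mix(0.5, vocab_dist, attention, source_tokens)

# Coverage starts at zero and accumulates attention after each step.
coverage = [0.0] * len(source_tokens)
first_step_loss = coverage_loss(attention, coverage)
coverage = [c + a for c, a in zip(coverage, attention)]
repeat_step_loss = coverage_loss(attention, coverage)
```

In this toy setting the out-of-vocabulary token receives probability only through the copy path, and repeating the same attention pattern on a second decoding step yields a larger coverage loss than the first step, which is the signal that discourages repetition.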


Pages: 3913-3927

Keywords: Text Summarization (TS) is the task of filtering the important information out of an original document to provide a compressed version for a specific purpose.
