Journal of Internet Computing and Services
ISSN 2287-1136 (Online) / ISSN 1598-0170 (Print)
https://jics.or.kr/

Performance Improvement of Topic Modeling using BART based Document Summarization


Eun Su Kim, Hyun Yoo, Kyungyong Chung, Journal of Internet Computing and Services, Vol. 25, No. 3, pp. 27-33, Jun. 2024
DOI: 10.7472/jksii.2024.25.3.27
Keywords: Document Summarization, BART, Topic Modeling, LDA, Perplexity, ROUGE

Abstract

The academic research environment is changing continuously as the volume of information grows, which raises the need for an effective way to analyze and organize large collections of documents. In this paper, we propose a method for improving topic modeling performance using document summarization based on BART (Bidirectional and Auto-Regressive Transformers). The proposed method uses a BART-based document summarization model to extract the core content of each document and improves topic modeling performance with the LDA (Latent Dirichlet Allocation) algorithm. We present an approach that improves the performance and efficiency of LDA topic modeling through document summarization and validate it through experiments. The experimental results show that the BART-based model for summarizing article data captures the important information of the original articles, with F1-scores of 0.5819, 0.4384, and 0.5038 on ROUGE-1, ROUGE-2, and ROUGE-L, respectively. In addition, topic modeling on the summarized documents performs about 8.08% better than topic modeling on the full text when the two are compared using the perplexity metric. This reduces the amount of data to be processed and improves the efficiency of the topic modeling process.
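To make the reported ROUGE figures concrete, the following is a minimal sketch of how ROUGE-N and ROUGE-L F1-scores can be computed for a candidate summary against a reference. It is an illustration only, not the evaluation code used in the paper: whitespace tokenization, lowercasing, and the function names (rouge_n_f1, rouge_l_f1, and the helpers) are assumptions made here, and published ROUGE implementations may add stemming or other preprocessing.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """F1 from an overlap count and the candidate/reference totals."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 with clipped n-gram overlap (assumed whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    # Toy sentences, not data from the paper.
    summary = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n_f1(summary, reference, n=1))
    print(rouge_n_f1(summary, reference, n=2))
    print(rouge_l_f1(summary, reference))
```

ROUGE-1 and ROUGE-2 measure unigram and bigram overlap, while ROUGE-L rewards in-order matches through the longest common subsequence. The F1-score is the harmonic mean of precision and recall.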


Cite this article
[APA Style]
Kim, E., Yoo, H., & Chung, K. (2024). Performance Improvement of Topic Modeling using BART based Document Summarization. Journal of Internet Computing and Services, 25(3), 27-33. DOI: 10.7472/jksii.2024.25.3.27.

[IEEE Style]
E. S. Kim, H. Yoo, K. Chung, "Performance Improvement of Topic Modeling using BART based Document Summarization," Journal of Internet Computing and Services, vol. 25, no. 3, pp. 27-33, 2024. DOI: 10.7472/jksii.2024.25.3.27.

[ACM Style]
Eun Su Kim, Hyun Yoo, and Kyungyong Chung. 2024. Performance Improvement of Topic Modeling using BART based Document Summarization. Journal of Internet Computing and Services, 25, 3, (2024), 27-33. DOI: 10.7472/jksii.2024.25.3.27.