Journal of Internet Computing and Services
ISSN 2287-1136 (Online) / ISSN 1598-0170 (Print)
https://jics.or.kr/

Prediction of infectious diseases using multiple web data and LSTM


Yeongha Kim, Inhwan Kim, Beakcheol Jang, Journal of Internet Computing and Services, Vol. 21, No. 5, pp. 139-148, Oct. 2020
DOI: 10.7472/jksii.2020.21.5.139
Keywords: Machine Learning, Predict infectious diseases, Web data, LSTM

Abstract

Infectious diseases have long plagued mankind, and predicting and preventing them remains a major challenge. For this reason, many studies have attempted to predict infectious diseases. Most early studies relied on epidemiological data from the Centers for Disease Control and Prevention (CDC); because the CDC updated its data only once a week, it was difficult to predict disease outbreaks in real time. With the emergence of various Internet media driven by recent advances in information technology, studies have begun to predict the occurrence of infectious diseases from web data, but most of the studies we surveyed used a single source of web data. Forecasting from a single web data source makes it difficult to collect large amounts of training data and to make accurate predictions for recent outbreaks such as "COVID-19". We therefore aim to demonstrate experimentally that LSTM models using multiple web data sources predict the occurrence of infectious diseases more accurately than models using a single web data source, and to suggest a model suitable for predicting infectious diseases. In our experiment, we predicted the occurrence of "Malaria" and "Epidemic-parotitis" using single web data models and our proposed model. A total of 104 weeks of news, SNS, and search query data were collected, of which 75 weeks were used as training data and 29 weeks as verification data. When predicting the verification data, our proposed model achieved the highest Pearson correlation coefficients, 0.94 and 0.86, and the lowest RMSE values, 0.19 and 0.07.
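
As an illustration of the evaluation setup described in the abstract, the following minimal Python sketch shows a chronological 75/29 split of 104 weekly observations and the two reported metrics, the Pearson correlation coefficient and RMSE. The function names and the toy values are assumptions made for this example; the sketch does not reproduce the authors' data, preprocessing, or LSTM model.

```python
import math


def chronological_split(series, n_train):
    """Split a time-ordered sequence into train and verification parts without shuffling."""
    return series[:n_train], series[n_train:]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical week indices standing in for the 104 weeks of collected data.
weeks = list(range(104))
train_weeks, test_weeks = chronological_split(weeks, 75)

# Toy values (not from the paper): predictions offset from the observations by 0.5.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

print(len(train_weeks), len(test_weeks))  # 75 29
print(pearson_r(actual, predicted))       # 1.0
print(rmse(actual, predicted))            # 0.5
```

The toy case also shows why the two metrics are reported together: a constant offset leaves the correlation at 1.0 while RMSE still registers the error.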


Cite this article
[APA Style]
Kim, Y., Kim, I., & Jang, B. (2020). Prediction of infectious diseases using multiple web data and LSTM. Journal of Internet Computing and Services, 21(5), 139-148. DOI: 10.7472/jksii.2020.21.5.139.

[IEEE Style]
Y. Kim, I. Kim, B. Jang, "Prediction of infectious diseases using multiple web data and LSTM," Journal of Internet Computing and Services, vol. 21, no. 5, pp. 139-148, 2020. DOI: 10.7472/jksii.2020.21.5.139.

[ACM Style]
Yeongha Kim, Inhwan Kim, and Beakcheol Jang. 2020. Prediction of infectious diseases using multiple web data and LSTM. Journal of Internet Computing and Services, 21, 5, (2020), 139-148. DOI: 10.7472/jksii.2020.21.5.139.