Review and Implementation of Topic Modeling in Hindi

Ray, Santosh Kumar and Ahmad, Amir and Kumar, Ch. Aswani (2019) Review and Implementation of Topic Modeling in Hindi. Applied Artificial Intelligence, 33 (11). pp. 979-1007. ISSN 0883-9514

Abstract

Due to the widespread use of electronic devices and the growing popularity of social media, text data is being generated at a rate never seen before. It is not possible for humans to read all the data generated and find what is being discussed in their field of interest. Topic modeling is a technique for identifying the topics present in a large set of text documents. In this paper, we discuss the widely used techniques and tools for topic modeling. There has been a lot of research on topic modeling in English, but there has been little progress in resource-scarce languages like Hindi, despite Hindi being spoken by millions of people across the world. We discuss the challenges faced in developing topic models for Hindi. We apply the Latent Semantic Indexing (LSI), Non-negative Matrix Factorization (NMF), and Latent Dirichlet Allocation (LDA) algorithms to topic modeling in Hindi. The outcomes of topic modeling algorithms are usually difficult for the common user to interpret, so we use various visualization techniques to represent them in a meaningful way. We then use metrics such as perplexity and coherence to evaluate the topic models. The results of topic modeling in Hindi seem promising and comparable to some results reported in the literature on English datasets.
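
The abstract names coherence as an evaluation metric but does not say which variant the paper uses. As a rough illustration only, the sketch below computes the UMass coherence score, one common variant, for two hand-picked word lists over a toy Hindi corpus. The corpus, the word lists, and the helper names `doc_freq` and `umass_coherence` are all hypothetical and are not taken from the paper. Whitespace tokenization is assumed for simplicity.

```python
import math

# Toy Hindi corpus (hypothetical, not from the paper), tokenized on whitespace.
docs = [
    "क्रिकेट मैच में भारत की जीत",
    "भारत ने क्रिकेट मैच जीता",
    "चुनाव में सरकार की जीत",
    "सरकार ने चुनाव की घोषणा की",
]

# Each document as a set of tokens, used for document-frequency counts.
doc_sets = [set(d.split()) for d in docs]


def doc_freq(*words):
    """Count the documents that contain every one of the given words."""
    return sum(1 for s in doc_sets if all(w in s for w in words))


def umass_coherence(topic_words):
    """UMass coherence: sum of log((D(w_m, w_l) + 1) / D(w_l)) for l < m.

    topic_words is assumed to be ordered from most to least probable,
    and every word must occur in at least one document.
    """
    score = 0.0
    for m in range(1, len(topic_words)):
        for l in range(m):
            co_occurrences = doc_freq(topic_words[m], topic_words[l])
            score += math.log((co_occurrences + 1) / doc_freq(topic_words[l]))
    return score


# Words that tend to appear in the same documents.
coherent = umass_coherence(["क्रिकेट", "मैच", "भारत"])
# Words drawn from two different themes.
mixed = umass_coherence(["क्रिकेट", "चुनाव", "सरकार"])

print(f"coherent topic: {coherent:.3f}")
print(f"mixed topic:    {mixed:.3f}")
```

On this toy corpus the list of words that co-occur in the same documents scores higher than the list mixing two themes, which is the behavior a coherence metric is meant to capture. A real evaluation would use the top words produced by the fitted LSI, NMF, or LDA model and a proper Hindi tokenizer and stop-word list.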

Item Type: Article
Subjects: Pustaka Library > Computer Science
Depositing User: Unnamed user with email support@pustakalibrary.com
Date Deposited: 27 Jun 2023 06:54
Last Modified: 30 Oct 2023 05:23
URI: http://archive.bionaturalists.in/id/eprint/1210
