Hi, I'm Priyanka Kargupta!

Let's make both models and humans think critically.



I am a third-year PhD candidate working in natural language processing at the University of Illinois at Urbana-Champaign, advised by Prof. Jiawei Han. I am supported by the NSF Graduate Research Fellowship. Prior to this, I worked on 3D scene representations with Prof. Ren Ng at the University of California, Berkeley, funded by the Intel SRC Fellowship.

My research aims to integrate structured, critical reasoning and adaptive creativity so that both models and their human users think critically.

My work explores this in both the scientific (AI for Research) and educational (LLMs + Education) domains. Specifically, I am interested in exploring how to structure and unstructure the reasoning of both models and humans, as well as structure the data critical to guiding such reasoning.

Email / CV / Scholar / Twitter / GitHub / LinkedIn


Highlights

Research

My research begins by exploring how explicit structure can empower both models and researchers to reason more effectively about scientific literature. By embedding structured frameworks into the reasoning process, we can leverage LLMs to augment critical tasks, such as analyzing nuanced scientific findings, tracing the evolution of multidimensional research contributions, and distinguishing novel insights from existing work. However, novel scientific research is inherently open-ended and creative, which has led me to investigate the interplay between structure, flexibility, and creativity. I hypothesize that (1) structure enables the systematic navigation of complex knowledge and problems, (2) flexibility enables breaking free from unhelpful structures, and (3) creativity guides the exploration of new, promising approaches that can then be tackled again through structure. Ultimately, my work seeks to uncover how structured, critical reasoning and adaptive creativity can be integrated to drive human–LLM scientific discovery.

Tree-of-Debate: Multi-Persona Debate Trees Elicit Critical Thinking for Scientific Comparative Analysis
Priyanka Kargupta, Ishika Agarwal, Tal August, Jiawei Han
ACL 2025 Oral
Paper / Code

Determining significant novelties, incremental findings, and equivalent approaches between works is challenging, especially when the papers are not explicitly connected through citations. In order to elicit the critical reasoning required for comprehending the contribution degree of a paper, we propose converting the papers to LLM personas which debate one another. In other words, we propose a tree-of-debate (ToD), where we focus more on the personas' comparative reasoning induced by the debate, as opposed to its final outcome. ToD can dynamically construct a debate tree to reason about fine-grained arguments discussed in scholarly articles.

Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events
Priyanka Kargupta, Yunyi Zhang, Yizhu Jiao, Siru Ouyang, Jiawei Han
ACL 2025 Oral
Paper / Code

We introduce a novel task, episode detection, which identifies episodes within a news corpus of key event articles. Detecting episodes poses unique challenges, as they lack explicit temporal or locational markers and cannot be merged using semantic similarity alone. While large language models (LLMs) can aid with these reasoning difficulties, they struggle with the long contexts typical of news corpora. To address these challenges, we introduce EpiMine, an unsupervised framework that identifies a key event's candidate episodes by leveraging natural episodic partitions in articles, estimated through shifts in discriminative term combinations. These candidate episodes are more cohesive and representative of true episodes, synergizing with LLMs to better interpret and refine them into final episodes.

Beyond True or False: Retrieval-Augmented Hierarchical Analysis of Nuanced Claims
Priyanka Kargupta*, Runchu Tian*, Jiawei Han
ACL 2025 Main Conference
Paper / Code

Scientific and political claims are often nuanced and not strictly "true" or "false" (e.g., "vaccine A is better than vaccine B"). However, such a claim can be dissected into its integral aspects and sub-aspects (e.g., efficacy, safety, distribution), which are individually easier to validate. Thus, we propose ClaimSpect, a retrieval-augmented generation-based framework for automatically constructing a hierarchy of aspects typically considered when addressing a claim and enriching them with corpus-specific perspectives. This structure hierarchically partitions an input corpus to retrieve relevant segments, which assist in discovering new sub-aspects. Moreover, these segments enable the discovery of varying perspectives towards an aspect of the claim (e.g., support, neutral, or oppose) and their respective prevalence (e.g., "how many biomedical papers believe vaccine A is more transportable than B?").

TaxoAdapt: Aligning LLM-Based Multidimensional Taxonomy Construction to Evolving Research Corpora
Priyanka Kargupta*, Nan Zhang, Yunyi Zhang, Rui Zhang, Prasenjit Mitra, Jiawei Han
ACL 2025 Main Conference
Paper / Code

TaxoAdapt is a framework that dynamically adapts an LLM-generated taxonomy to a given corpus across multiple dimensions. TaxoAdapt performs iterative hierarchical classification, expanding both the taxonomy's width and depth based on the corpus's topical distribution. We demonstrate its state-of-the-art performance across a diverse set of computer science conferences over the years to showcase its ability to structure and capture the evolution of scientific fields. As a multidimensional method, TaxoAdapt generates taxonomies that are 26.51% more granularity-preserving and 50.41% more coherent than the most competitive baselines, as judged by LLMs.

Instruct, Not Assist: LLM-based Multi-Turn Planning and Hierarchical Questioning for Socratic Code Debugging
Priyanka Kargupta*, Ishika Agarwal*, Dilek Hakkani-Tur, Jiawei Han
EMNLP'24 Findings
Paper / Code

TreeInstruct is an Instructor agent guided by a novel state space-based planning algorithm. It asks probing questions to help students independently identify and resolve errors. It estimates a student's conceptual and syntactical knowledge to dynamically construct a question tree based on their responses and current knowledge state, effectively addressing both independent and dependent mistakes concurrently in a multi-turn interaction setting.

MEGClass: Extremely Weakly Supervised Text Classification via Mutually-Enhancing Text Granularities
Priyanka Kargupta, Tanay Komarlu, Susik Yoon, Xuan Wang, Jiawei Han
EMNLP'23 Findings
Paper / Code

MEGClass is an extremely weakly supervised text classification method that leverages Mutually-Enhancing Text Granularities. It utilizes coarse- and fine-grained context signals obtained by jointly considering a document's most class-indicative words and sentences. This approach enables the learning of a contextualized document representation that captures the most discriminative class indicators.

Reaction Miner: An Integrated System for Chemical Reaction Extraction from Textual Data
Ming Zhong, Siru Ouyang, Yizhu Jiao, Priyanka Kargupta, Leo Luo, Yanzhen Shen, Bobby Zhou, Xianrui Zhong, Xuan Liu, Hongxiang Li, Jinfeng Xiao, Minhao Jiang, Vivian Hu, Xuan Wang, Heng Ji, Martin Burke, Huimin Zhao, Jiawei Han
EMNLP'23 Demo
Paper / Code

Reaction Miner is a system that works directly with raw scientific literature to deliver precise and more informative chemical reactions. Going beyond mere extraction, it integrates a holistic workflow: it accepts PDF files as input, bypassing the need for pre-processing and improving user accessibility. A text segmentation module then ensures that the refined text encapsulates complete chemical reactions, improving extraction accuracy. Moreover, Reaction Miner broadens the scope of existing pre-defined reaction roles to include vital attributes previously neglected, thereby offering a more comprehensive depiction of chemical reactions. Evaluations conducted by chemistry domain users highlight the efficacy of each module in our system, demonstrating Reaction Miner as a powerful tool in this field.