
UROP Proceeding 2024-25

School of Engineering
Department of Computer Science and Engineering

Human-Centric Trustworthy AI/NLP for Real-World Applications
Supervisor: May FUNG / CSE
Student: LYU Zongwei / COMP
Course: UROP 1100, Summer

The author–reviewer rebuttal is a pivotal yet understudied step in scholarly publishing: a dynamic game of incomplete information in which authors must strategically persuade reviewers whose private beliefs and biases are only partially observable. Existing language-model approaches, trained through straightforward supervised imitation, excel at fluent text generation but falter at the deeper requirements of strategic planning and Theory of Mind (ToM) reasoning that effective rebuttals demand. We introduce the first ToM-driven Rebuttal Agent that explicitly “thinks before it writes.” Our framework (1) constructs fine-grained reviewer profiles, capturing macro-level stance, attitude, and expertise as well as micro-level comment types and severities; (2) employs a critique-and-refine pipeline in which a powerful teacher model synthesizes multi-step rebuttal strategies and superior responses, yielding high-quality (analysis, strategy, response) triples and an interpretable Reviewer Model for reward estimation; and (3) trains a Rebuttal Agent via a two-stage process, supervised fine-tuning on the synthetic corpus followed by reinforcement learning with the Reviewer Model as reward, to optimize the complete Analyze-Plan-Retrieve-Generate chain end-to-end. Comprehensive experiments across multiple model sizes and baselines show that our agent produces rebuttals that are measurably more persuasive, strategy-aware, and evidence-grounded than those generated by prior methods. By embedding ToM reasoning into the rebuttal workflow, we pave the way for AI systems that can participate in scholarly discourse as sophisticated, socially aware collaborators rather than mere text generators.

Open Topic in Algorithms and Complexity
Supervisor: Amir GOHARSHADY / CSE
Student: ABDUL REHMAN / COMP; HASAN Dewan Saadman / DSCT; TSE Yik Long / COSC
Course: UROP 1100, Fall; UROP 1100, Fall; UROP 2100, Fall

Profile-guided optimization (PGO) improves a program's performance by using information gathered from its execution. It allows compilers to exploit the dynamic behavior of programs and thus produce faster binaries. In this paper, we introduce a novel approach to PGO that considers the grammatical decomposition of the program's structured control flow graph and uses dynamic programming to find the optimal path through the control flow graph by solving a minimum-cost maximum connected flow problem. We show that our approach runs in polynomial time and can thus be used to optimize large programs.
