UROP Proceeding 2024-25

School of Engineering
Department of Computer Science and Engineering

Using Large Language Models (LLMs) for Software Development
Supervisor: CHEUNG Shing Chi / CSE
Student: CHEUNG Yiu Wa / COMP
Course: UROP 1100, Spring

This progress report outlines my research in the UROP 1100O project "An Ecosystem for Pseudocode-Driven Software Development in the GenAI Era", conducted under "Using Large Language Models (LLMs) for Software Development" and supervised by Professor CHEUNG Shing Chi. I am collaborating with a senior student, CHAN Wai Pang, on various tasks. In this report, I first explain the tasks I am required to complete, then describe my objectives, highlight key achievements, and discuss the challenges I faced during my research, along with how I overcame them. Finally, I share valuable insights gained throughout this research.

Using Large Language Models (LLMs) for Software Development
Supervisor: CHEUNG Shing Chi / CSE
Student: GHAZNAVI Muhammad Shaheer / COMP
Course: UROP 1000, Summer

LLM agents, or AI agents, are systems in which the LLM controls the process flow of a task: they can "dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks." These agents access tools via Model Context Protocol (MCP) servers, which extend their utility beyond a simple natural language processing (NLP) application. The market for such agents is projected to grow from $3.7 billion in 2023 to $103.6 billion by 2032. It is therefore important to test agentic systems thoroughly.

Using Large Language Models (LLMs) for Software Development
Supervisor: CHEUNG Shing Chi / CSE
Student: NOH Sang Hyun / DSCT
Course: UROP 1000, Summer

Scientific reproducibility remains a fundamental challenge in software engineering research, particularly when validating results presented through complex figures, tables, or experimental outputs.
We introduce PaperRepro, a vision-language-based agent framework that automates the reproduction of scientific results from published research artifacts. PaperRepro orchestrates a multi-stage pipeline: extracting visual elements, matching them to referenced content, acquiring contextual information from both the paper and its associated artifact, and executing a Plan–Act–Evaluate loop using large language models. Applied to a real-world artifact, our system reproduced multiple figures and tables, with some outputs closely matching the originals and others exhibiting notable semantic or structural discrepancies. This report details the system's design, implementation, and evaluation, and outlines future directions for scaling the framework to support more diverse and complex research outputs.
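The Plan–Act–Evaluate loop at the heart of such a pipeline can be sketched as a simple iterative control structure. The sketch below is purely illustrative: every name (`ReproState`, `plan`, `act`, `evaluate`, `plan_act_evaluate`) is hypothetical and does not reflect the actual PaperRepro implementation; the LLM calls are replaced by stub functions.

```python
from dataclasses import dataclass, field

@dataclass
class ReproState:
    """Hypothetical state carried across loop iterations."""
    goal: str
    history: list = field(default_factory=list)
    done: bool = False

def plan(state: ReproState) -> str:
    # Stub standing in for an LLM call that proposes the next action
    # (e.g. "run the plotting script for Figure 3").
    return f"step {len(state.history) + 1} toward: {state.goal}"

def act(action: str) -> str:
    # Stub standing in for tool execution (running code, reading files).
    return f"result of ({action})"

def evaluate(state: ReproState, result: str) -> bool:
    # Stub standing in for an LLM comparison of the produced output
    # against the original figure/table; here we stop after 3 steps.
    state.history.append(result)
    state.done = len(state.history) >= 3
    return state.done

def plan_act_evaluate(goal: str, max_iters: int = 10) -> ReproState:
    state = ReproState(goal=goal)
    for _ in range(max_iters):
        action = plan(state)          # Plan
        result = act(action)          # Act
        if evaluate(state, result):   # Evaluate
            break
    return state

state = plan_act_evaluate("reproduce Figure 3")
```

In a real system the three stubs would be LLM- and tool-backed, and the `max_iters` bound guards against loops that never converge to a satisfactory reproduction.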
