UROP Proceeding 2024-25

School of Engineering
Department of Computer Science and Engineering

Automated Transformation of Computer Programs
Supervisor: SHEN Jiasi / CSE
Student: LIU Chenqi / CPEG
Course: UROP 1100, Fall

Program comprehension is a critical aspect of software engineering, particularly when dealing with complex systems like Android applications. Last semester, we proposed a dynamic approach to program comprehension. The method is inspired by the intuition that users can often grasp an app’s functionality through direct interaction, without needing to read the source code. For example, users naturally understand app navigation by clicking buttons and observing the resulting transitions between screens. This semester, we continued development and introduced several new features and optimizations to enhance the approach.

Compiler Optimization Guided by Machine Learning
Supervisor: SHEN Jiasi / CSE
Student: LOK Yin Fung / COGBM
         ZHANG Li / COMP
         ZHANG Puyang / COMP
Course: UROP 1000, Summer
        UROP 1100, Summer
        UROP 2100, Summer

One of the major tasks of this UROP project is to implement an LLVM pass that transforms LLVM intermediate representation (IR) code into static single-assignment (SSA) form. This is a transitional task that builds upon the foundational knowledge of compiler principles and serves as a basis for subsequent work on automatic compiler optimisation.

Evaluating and Improving LLMs’ Code Generation Capabilities
Supervisor: SHEN Jiasi / CSE
Student: QIAN Jiayi / DSCT
Course: UROP 1000, Summer

To make Large Language Models (LLMs) better at writing code, we need a smarter way to train them than just showing them examples. Reinforcement Learning (RL) offers a promising path by enabling models to learn from feedback, but it requires a scalable, low-latency execution environment to provide this feedback in real time. This project addresses this infrastructural gap by designing and implementing a high-concurrency code evaluation system.
I began by studying state-of-the-art approaches such as SWE-RL to understand what RL-based reward mechanisms require. Subsequently, we engineered a new server backend using Python’s FastAPI framework, chosen for its asynchronous request handling. This system was then used to process and validate a large-scale dataset of code-test pairs, establishing a workflow for filtering high-quality samples suitable for future model training.
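
The filtering workflow described above can be illustrated with a minimal sketch. It is not the project's implementation: the project's backend is built on FastAPI, whereas this example uses only the Python standard library (asyncio and subprocesses) so that it stays self-contained. The names `run_pair` and `filter_pairs`, the timeout, and the concurrency limit are all hypothetical, and the sketch assumes a pair counts as high quality when its test exits with status 0. It provides no sandboxing, which a real evaluator of untrusted code would need.

```python
import asyncio
import sys
import tempfile
from pathlib import Path


async def run_pair(code: str, test: str, sem: asyncio.Semaphore,
                   timeout: float = 10.0) -> bool:
    """Return True if `test` passes against `code` in a fresh interpreter.

    Hypothetical helper: the pass criterion (exit status 0) is an assumption.
    """
    async with sem:  # cap the number of subprocesses running at once
        with tempfile.TemporaryDirectory() as tmp:
            script = Path(tmp) / "candidate.py"
            script.write_text(code + "\n\n" + test + "\n")
            proc = await asyncio.create_subprocess_exec(
                sys.executable, str(script),
                stdout=asyncio.subprocess.DEVNULL,
                stderr=asyncio.subprocess.DEVNULL,
            )
            try:
                returncode = await asyncio.wait_for(proc.wait(), timeout)
            except asyncio.TimeoutError:
                # Treat a hung sample as a failure and reap the process.
                proc.kill()
                await proc.wait()
                return False
            return returncode == 0


async def filter_pairs(pairs, max_concurrency: int = 8):
    """Evaluate all (code, test) pairs concurrently; keep those that pass."""
    sem = asyncio.Semaphore(max_concurrency)
    results = await asyncio.gather(*(run_pair(c, t, sem) for c, t in pairs))
    return [pair for pair, ok in zip(pairs, results) if ok]


if __name__ == "__main__":
    samples = [
        ("def add(a, b):\n    return a + b", "assert add(1, 2) == 3"),
        ("def add(a, b):\n    return a - b", "assert add(1, 2) == 3"),
    ]
    kept = asyncio.run(filter_pairs(samples))
    print(f"kept {len(kept)} of {len(samples)} pairs")
```

The semaphore bounds how many interpreters run at once, so a large dataset can be submitted in one call without exhausting the machine. The same boolean result could also serve as a simple pass/fail reward signal in an RL setting.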
