UROP Proceeding 2024-25

School of Engineering
Department of Computer Science and Engineering

Evaluating and Improving LLMs' Code Generation Capabilities
Supervisor: SHEN Jiasi / CSE
Student: WANG Junping / COMP
Course: UROP 1100, Summer

In Coq proof development, both specifications and proofs are typically constructed by hand. Our research aims to automatically generate proofs for given specifications using trained AI models. This semester, we focus on creating a dataset in which each proof can be executed independently, for use in model training and verification. This requires identifying dependencies between specifications so that the relevant components can be isolated. Since the legacy Coq code we are working with cannot be processed by Coq's built-in tools, we developed a dependency extraction method based on keyword analysis. However, this approach struggles with complex dependency structures and with statements that contain multiple specification names. Our current implementation works only for small Coq projects (under 1,000 lines of code). Future work will focus on scaling it to projects with hundreds of dependencies and thousands of lines of code.

IDE Extensions
Supervisor: SHEN Jiasi / CSE
Student: ZHANG Puyang / COMP
         ZHAO Yuhua / COMP
Course: UROP 1100, Fall

This UROP 1100 project delivers an IntelliJ IDE extension for symbolic execution that analyzes the Abstract Syntax Tree (AST) and uses Z3 for variable substitution during constraint solving. It builds on prior work on automating test-case refactoring, which was the focus of our earlier UROP 1000 project.

Commonsense Entailment Graph Construction and Reasoning
Supervisor: SONG Yangqiu / CSE
Student: HONG Lanxuan / COSC
Course: UROP 1100, Spring

This report summarizes my learning and practical experience in deep learning and natural language processing (NLP) during my UROP project, which began with implementing classic deep learning models for NLP tasks and setting up the appropriate computing environments.
After building these foundational skills, my main focus shifted to understanding and attempting to replicate the instruction tuning process using open-source resources, specifically the Alpaca model released by Stanford CRFM. The report describes the steps I followed, the challenges encountered, including environment setup and learning to use the PyTorch toolkit, and the key takeaways from my attempts to reproduce the Alpaca instruction tuning workflow.
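The keyword-based dependency extraction described in the first abstract above can be illustrated with a small sketch. This is not the project's actual implementation: the set of declaration keywords, the regex patterns, and the function name `extract_dependencies` are assumptions made for illustration. The purely lexical scan also exhibits exactly the limitation the abstract reports, since statements mentioning several specification names can produce spurious edges.

```python
import re

# Illustrative subset of Coq commands that introduce a named specification.
DECL_KEYWORDS = r"(?:Definition|Fixpoint|Lemma|Theorem|Inductive)"


def extract_dependencies(coq_source: str) -> dict[str, set[str]]:
    """Map each declared name to the other declared names its statement mentions.

    A hypothetical sketch: legacy code that Coq's own tools reject can still
    be scanned lexically, at the cost of false matches inside comments,
    strings, or statements naming several specifications at once.
    """
    # Split the source into sentences terminated by '.' followed by
    # whitespace (a simplification that breaks on dots in module paths).
    sentences = re.split(r"\.\s+", coq_source)

    # First pass: collect every declared name with its full statement.
    decl_pattern = re.compile(rf"^\s*{DECL_KEYWORDS}\s+(\w+)")
    decls = {}
    for sentence in sentences:
        match = decl_pattern.match(sentence)
        if match:
            decls[match.group(1)] = sentence

    # Second pass: a declaration depends on any other name it mentions.
    deps = {}
    for name, body in decls.items():
        deps[name] = {
            other for other in decls
            if other != name and re.search(rf"\b{other}\b", body)
        }
    return deps
```

On a toy input such as `Definition a := 1. Definition b := a + 1.`, the sketch reports that `b` depends on `a` and `a` depends on nothing, which is the kind of isolation information needed to execute each proof independently.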
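The AST analysis with variable substitution mentioned in the IDE Extensions abstract can be sketched as follows. The actual extension targets IntelliJ and hands the resulting constraints to Z3; to keep this sketch self-contained and dependency-free, it uses Python's `ast` module and stops at the substituted branch condition rather than invoking a solver. The function name `path_condition` and the single-branch scope are illustrative assumptions, not the project's code.

```python
import ast


class Substitute(ast.NodeTransformer):
    """Replace variable reads with the expressions that defined them."""

    def __init__(self, env: dict[str, ast.expr]):
        self.env = env

    def visit_Name(self, node: ast.Name) -> ast.expr:
        # Recurse into the defining expression so chained assignments
        # (y = x + 1; z = y * 2) resolve down to the function's inputs.
        if isinstance(node.ctx, ast.Load) and node.id in self.env:
            return self.visit(self.env[node.id])
        return node


def path_condition(source: str) -> str:
    """Return the first if-condition with assignments substituted in,
    i.e. the branch constraint expressed over the function's inputs.
    In a real symbolic executor this constraint would be passed to Z3
    to decide branch feasibility; here we simply unparse it."""
    tree = ast.parse(source)
    env: dict[str, ast.expr] = {}
    for node in ast.walk(tree):
        # Record simple single-target assignments as symbolic definitions.
        if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
            env[node.targets[0].id] = node.value
        # At the first branch, substitute definitions into a fresh copy
        # of the condition so the original tree is left untouched.
        if isinstance(node, ast.If):
            copy = ast.parse(ast.unparse(node.test), mode="eval").body
            return ast.unparse(Substitute(env).visit(copy))
    return ""
```

For a function containing `y = x + 1` followed by `if y > 10:`, the sketch yields the constraint `x + 1 > 10`, which is the form a solver needs to generate an input exercising that branch.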

RkJQdWJsaXNoZXIy NDk5Njg=