
UROP Proceeding 2023-24

School of Engineering
Department of Computer Science and Engineering

Trustworthy Machine Learning
Supervisor: CHENG Minhao / CSE
Student: WONG Hiu Tung / COMP
Course: UROP 1100, Fall

AI is being used in a growing number of applications across many fields. However, there are currently too few ways to ensure the reliability and trustworthiness of generated content, such as the output of text-to-image generation. The UROP project I joined this year is on trustworthy machine learning and targets text-to-image models; the team investigated methods to embed a watermark that identifies the prompt used to generate an image. This paper summarizes what I have learnt and the background of the project, from the diffusion model to the project's current methodology, and reflects on my performance and work.

Using Large Language Models (LLMs) for Software Development
Supervisor: CHEUNG Shing Chi / CSE
Student: CHAN Wai Pang / COMP
Course: UROP 1100, Summer

This summer, my research centered on the application of advanced techniques in software development, focusing on the integration of code search and code completion. I investigated various tools and methodologies for performing code search. I then used LLMs, specifically GPT-4, to leverage the search results to improve code completion. The findings indicate that combining traditional code search methods with LLMs can improve the performance of code completion tasks. This report provides an overview of the research process, detailing the selection and evaluation of code search tools, the methodologies employed, and the insights gained from combining these approaches to advance code completion strategies.

Using Large Language Models (LLMs) for Software Development
Supervisor: CHEUNG Shing Chi / CSE
Student: LIU Chenqi / CPEG
Course: UROP 1100, Spring

In collaboration with Li Tsz-On, we present two tools: a Code Comparator and a Code Collector. The Code Comparator efficiently identifies syntactic and semantic differences between program versions, using techniques such as AST-based normalization. The Code Collector automates the extraction of bug-fixing pairs from Codeforces contests. Using multi-threaded fetching and algorithmic subject identification, it streamlines the collection process. These tools facilitate comprehensive program analysis and the study of error-correction methodologies, and should be helpful in future research.
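
As an illustration of what AST-based normalization can look like, the following is a minimal sketch in Python using only the standard `ast` module. It is not the Code Comparator described in the abstract; the names `normalize` and `same_modulo_renaming` and the renaming scheme are assumptions made for this example. The idea is to parse both versions, replace user-chosen identifiers with positional placeholders, and compare the resulting tree dumps, so that comments, whitespace, and consistent renaming are ignored while real changes still show up.

```python
import ast
import builtins

# Names such as `print` or `len` carry meaning, so they are left untouched.
_BUILTIN_NAMES = set(dir(builtins))


class _Normalizer(ast.NodeTransformer):
    """Rename variables and arguments to placeholders in order of first use."""

    def __init__(self):
        self.names = {}

    def _placeholder(self, name):
        if name in _BUILTIN_NAMES:
            return name
        # The same original name always maps to the same placeholder.
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_Name(self, node):
        node.id = self._placeholder(node.id)
        return node

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        return node


def normalize(source: str) -> str:
    """Return a textual dump of the normalized AST (hypothetical helper)."""
    tree = _Normalizer().visit(ast.parse(source))
    # ast.dump omits line and column information by default,
    # so layout and comments do not affect the result.
    return ast.dump(tree)


def same_modulo_renaming(a: str, b: str) -> bool:
    """True if two snippets differ only in naming, layout, or comments."""
    return normalize(a) == normalize(b)
```

For example, `same_modulo_renaming("total = 0\nfor item in data:\n    total += item\n", "s = 0\nfor x in xs:\n    s += x\n")` treats the two loops as equivalent, whereas changing the initial value from `0` to `1` in one of them makes the comparison fail. This sketch only captures renaming; detecting deeper semantic differences would need further analysis.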
