UROP Proceedings 2024-25

School of Engineering
Department of Computer Science and Engineering

Large Language Models as Your Machine Learning Experts

Supervisor: DI Shimin / CSE
Student: CHAN Chun Man / DA GAMAGE NANAYAKKARA Dinusara Sasindu / MATH-AM
Course: UROP 2100, Fall / UROP 1100, Fall

Neural Architecture Search (NAS) is transforming automated machine learning (AutoML) by automating the design of neural network architectures. This report explores the impact of search spaces and transfer learning on NAS performance, synthesizing insights from Taskonomy, TransNAS-Bench-101, and tabular benchmarks. Search spaces, categorized as macro, chain-structured, cell-based, and hierarchical, define the scope of architectural exploration and directly influence model quality. Taskonomy reveals similarities between tasks through transfer learning, enabling better generalization across tasks by leveraging shared representations. TransNAS-Bench-101 highlights the role of search spaces in cross-task generalizability, while tabular benchmarks ensure consistent and reproducible evaluations. By integrating task-affinity insights with NAS, this report identifies strategies to optimize transfer learning and to improve the efficiency, flexibility, and generalizability of NAS frameworks.

Large Language Models as Your Machine Learning Experts

Supervisor: DI Shimin / CSE
Student: HAN Xinyi / COMP
Course: UROP 1100, Fall

This semester's UROP project comprises two parts of research. The first part concerns Neural Architecture Search (NAS) algorithms in the AutoML field, particularly for Graph Neural Network (GNN) design. The main task at this stage is to reproduce the core algorithm of the paper under study. Its primary method decomposes and reconstructs the structure of graph neural networks using adjacency matrices, and it embeds both local graphs and the global graph into the network through graph convolutional networks (GCNs). The second part of the work is to develop a demo system for an AutoML algorithm. This algorithm leverages Large Language Models (LLMs) to quickly accumulate knowledge about the relationships among GNN structures, dataset features, and GNN performance. Through this method, the system can produce better results for unseen datasets at reduced computational cost. The demonstration system will be implemented as a web-based interface: users input the graph data required for their task and specify model requirements (e.g., whether statistical characteristics of the graph dataset should be considered), and the system returns the GNN structure inferred by the LLM together with experimental data (e.g., accuracy and other performance metrics).
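As a concrete illustration of the cell-based search spaces surveyed in the first abstract, the sketch below encodes a candidate cell the way NAS-Bench-101 and TransNAS-Bench-101 do: a strictly upper-triangular adjacency matrix over a small DAG plus one operation label per node. The operation vocabulary matches NAS-Bench-101's published format, but the connectivity check is a simplified assumption rather than the benchmark's exact validity rule.

```python
import numpy as np

# Operation vocabulary for intermediate nodes (NAS-Bench-101).
OPS = ["conv3x3-bn-relu", "conv1x1-bn-relu", "maxpool3x3"]

def random_cell(num_nodes=7, rng=None):
    """Sample a candidate cell: a DAG over num_nodes nodes.

    Node 0 is the cell input and node num_nodes - 1 the output; keeping
    the adjacency matrix strictly upper-triangular guarantees acyclicity.
    """
    rng = rng or np.random.default_rng()
    adj = np.triu(rng.integers(0, 2, size=(num_nodes, num_nodes)), k=1)
    ops = ["input"] + list(rng.choice(OPS, num_nodes - 2)) + ["output"]
    return adj, ops

def output_reachable(adj):
    """Simplified validity check (an assumption, not the benchmark's
    exact rule): the output node must be reachable from the input."""
    n = len(adj)
    seen, frontier = {0}, [0]
    while frontier:
        u = frontier.pop()
        for v in range(u + 1, n):
            if adj[u, v] and v not in seen:
                seen.add(v)
                frontier.append(v)
    return (n - 1) in seen

adj, ops = random_cell()
print(adj)
print(ops)
print("valid:", output_reachable(adj))
```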
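The second abstract describes embedding local graphs and the global graph through GCNs. Below is a minimal NumPy sketch of the standard GCN propagation rule of Kipf and Welling, H' = sigma(D^-1/2 (A + I) D^-1/2 H W); how the project actually composes the local and global embeddings is not reproduced here.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One GCN propagation step: H' = relu(D^-1/2 (A+I) D^-1/2 H W).

    adj:    (n, n) binary adjacency matrix of the (sub)graph
    feats:  (n, d_in) node feature matrix H
    weight: (d_in, d_out) learnable weight matrix W
    """
    a_hat = adj + np.eye(len(adj))               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ feats @ weight, 0)  # ReLU activation

# Toy example: a 4-node path graph with random features.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
h = gcn_layer(adj, rng.standard_normal((4, 8)), rng.standard_normal((8, 4)))
print(h.shape)  # (4, 4)
```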
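The planned web demo could be served by a small HTTP endpoint along the following lines. This is a hypothetical sketch rather than the project's implementation: suggest_gnn_via_llm stands in for the LLM-backed recommendation step, and every request and response field shown here is illustrative.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def suggest_gnn_via_llm(edges, num_features, requirements):
    """Hypothetical stand-in for the LLM-backed recommender, which would
    map dataset statistics to a promising GNN configuration."""
    num_nodes = 1 + max((max(e) for e in edges), default=0)
    avg_degree = 2 * len(edges) / num_nodes
    return {
        "architecture": "GCN",  # placeholder choice
        "layers": 2,
        "hidden_dim": 64,
        "dataset_stats": {
            "num_nodes": num_nodes,
            "num_features": num_features,
            "avg_degree": round(avg_degree, 2),
            "stats_considered": bool(requirements.get("use_statistics", True)),
        },
    }

@app.route("/recommend", methods=["POST"])
def recommend():
    # Illustrative payload: an edge list, the node-feature count,
    # and user-specified model requirements.
    data = request.get_json()
    config = suggest_gnn_via_llm(
        data["edges"], data["num_features"], data.get("requirements", {})
    )
    return jsonify(config)

if __name__ == "__main__":
    app.run(port=5000)
```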
