School of Engineering
Department of Computer Science and Engineering

Generative AI
Supervisor: CHEN Qifeng / CSE
Student: LI Haoming / COMP
Course: UROP 1100, Fall

In recent years, image editing has become commonplace; objects can be removed from a picture in a few steps. Video editing, however, raises an additional problem. Most video inpainting projects, such as ProPainter: Improving Propagation and Transformer for Video Inpainting, process only the visual stream and ignore the audio; if an object is removed from the frames while its sound remains, the edit is easy to detect. The model therefore needs to learn the relationship between the removed visual content and the original audio. This project trains a model that understands an object in a video and produces an audio track without the sound of the removed object, so that deep learning can erase the deleted object's audio along with its appearance.

Generative AI
Supervisor: CHEN Qifeng / CSE
Student: LI Jiajun / ELEC
Course: UROP 1100, Fall

In the early stage of this UROP 1100 project, I was mainly responsible for reading and studying the code and related papers, including the principles of core technologies such as LoRA and the Transformer. As the project progressed, I took on the task of integrating CogVideoX into the VideoTuna project, which included seamlessly integrating CogVideoX's inference scripts, runtime environment, and model weights into the VideoTuna framework. In the process, I not only gained a deeper understanding of generative-model code structure and its conventions, but also successfully refactored the CogVideoX 1.5 SAT inference code to better fit VideoTuna's overall architecture. This work deepened my grasp of code modularity and standardized design, while giving me substantial practical experience and contributing to the overall development of the project.
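To make the LoRA technique mentioned above concrete: LoRA freezes a pretrained weight matrix and learns a low-rank additive update, factorized as two small matrices. The sketch below is a minimal NumPy illustration of that idea, not code from the project; all names, shapes, and the scaling convention (alpha / r) are illustrative assumptions.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Linear layer with a LoRA adapter.

    W: frozen pretrained weight, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r), zero-initialized
       so the adapter starts as a no-op.
    The effective weight is W + (alpha / r) * B @ A.
    """
    r = A.shape[0]
    scale = alpha / r
    return x @ W.T + scale * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 4
W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))                 # zero init: adapter initially inert
x = rng.standard_normal((2, d_in))

y_base = x @ W.T
y_lora = lora_forward(x, W, A, B, alpha=8)
assert np.allclose(y_base, y_lora)       # output matches the base model at init
```

Only A and B (r * (d_in + d_out) parameters) are trained, which is why LoRA makes fine-tuning large models like CogVideoX tractable on limited hardware.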
Generative AI
Supervisor: CHEN Qifeng / CSE
Student: LIN Yi / COMP
Course: UROP 1100, Summer

Video super-resolution is a critical technology for modern media, enabling the enhancement of low-resolution, compressed, or bandwidth-constrained video streams into high-quality formats suitable for large displays, restoration, and post-production. As streaming platforms and mobile devices increasingly demand high-resolution content from limited sources, robust and perceptually convincing video super-resolution and rescaling methods have become essential. In this research project, we studied various deep learning models for image/video rescaling and super-resolution, as well as other related works, aiming to develop a more efficient and effective approach for video super-resolution and rescaling tasks.
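The rescaling setting described above can be illustrated with a toy baseline: downscale a frame, upscale it back naively, and measure reconstruction quality with PSNR, the standard metric in super-resolution work. This NumPy sketch is purely illustrative and assumes a simple average-pooling degradation and nearest-neighbour upscaling; it is not the project's method.

```python
import numpy as np

def downscale(img, s):
    """Average-pool downscaling by integer factor s
    (a common simple stand-in for the degradation model)."""
    h, w = img.shape
    return img.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upscale_nearest(img, s):
    """Nearest-neighbour upscaling by factor s (naive baseline)."""
    return img.repeat(s, axis=0).repeat(s, axis=1)

def psnr(ref, rec, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - rec) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
hr = rng.random((64, 64))      # hypothetical high-resolution frame in [0, 1]
lr = downscale(hr, 4)          # 4x low-resolution version, shape (16, 16)
sr = upscale_nearest(lr, 4)    # naive reconstruction, shape (64, 64)
print(f"PSNR of nearest-neighbour 4x upscaling: {psnr(hr, sr):.2f} dB")
```

A learned super-resolution or invertible-rescaling model aims to beat such naive baselines by recovering detail lost in the downscaling step, raising PSNR and perceptual quality.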