ECE Seminar Lecture Series

Augmenting Human Creativity with Generative AI

Hao-Wen Dong, Assistant Professor in the Department of Performing Arts Technology at the University of Michigan

Wednesday, March 5, 2025
Noon–1 p.m.

601 Computer Studies Building

Generative AI has been transforming the way we interact with technology and consume content. Despite the successes of generative AI in certain fields, it has been challenging to integrate generative AI into the professional creative workflow for content creation for several reasons: 1) new application domains may require new generative models; 2) professionals need assistive tools that augment their creativity and productivity in addition to fully automated tools; 3) certain media require handling multimodal data streams at the same time. My research aims to address these challenges and augment human creativity with machine learning. I develop generative AI technology that can be integrated into the professional creative workflow, with a focus on music, audio and video content creation.

In this talk, I will introduce representative work in the three main directions of my research: 1) novel generative models for new domains, 2) AI-assisted tools for content creation, and 3) multimodal generative models for content creation. In particular, I will focus on two recent projects. First, I will discuss our work on text-to-audio synthesis, which combines the naturally occurring audio-visual correspondence in videos with the multimodal representation learned by contrastive language-vision models. Our proposed method can learn to synthesize audio given a text prompt without using any paired text-audio data. Second, I will discuss our work on generating teasers for long documentaries. We approach this new task by first generating the teaser narration from the transcribed narration of the documentary using a large language model, and then selecting the most relevant visual content to accompany the generated narration using language-vision models.

Bio: Hao-Wen (Herman) Dong is an Assistant Professor in the Department of Performing Arts Technology at the University of Michigan. Herman’s research aims to augment human creativity with machine learning. He develops human-centered generative AI technology that can be integrated into the professional creative workflow, with a focus on music, audio and video content creation. His long-term goal is to lower the barrier to entry for content creation and democratize professional content creation for everyone. Herman received his PhD in Computer Science from the University of California San Diego, where he worked with Julian McAuley and Taylor Berg-Kirkpatrick. His research has been recognized by the UCSD CSE Doctoral Award for Excellence in Research, KAUST Rising Stars in AI, UChicago and UCSD Rising Stars in Data Science, ICASSP Rising Stars in Signal Processing, and the UCSD GPSA Interdisciplinary Research Award.