👋 About me

I am currently a second-year master’s student at Tsinghua University, based in Shenzhen.

I am currently working on text-to-audio and video-to-audio generation. If you would like to discuss research or collaborate, please feel free to email me at liaoh22@mails.tsinghua.edu.cn.

My research interests include:

  • Applications: Audio Generation
  • Technologies: Generative Models, Multimodal Understanding and Learning, RLHF

🔥 News

  • 2024.05: I joined Tencent AI Lab as a research intern.
  • 2024.04: One paper on a text-to-audio system finetuned from human preference feedback was accepted by IJCAI 2024.
  • 2024.03: One paper on controllable text-to-audio generation was accepted by ICME 2024.
  • 2023.05: I joined Huawei 2012 Lab as a research intern.

📝 Publications

🎙 Audio Generation

IJCAI 2024

BATON: Aligning Text-to-Audio Model with Human Preference Feedback
Huan Liao, Haonan Han, Kai Yang, Tianjiao Du, Rui Yang, Zunnan Xu, Qinmei Xu, Jingquan Liu, Jiasheng Lu, Xiu Li

[Project] [Paper] [Dataset&Code]

  • The first text-to-audio (TTA) system finetuned from human preference feedback.
  • Curated a dataset containing both prompts and the corresponding generated audio, annotated based on human feedback.
  • Addressed audio-event semantic omission and temporal disarray with a weighted preference strategy.

ICME 2024

Controllable Text-to-Audio Generation with Training-Free Temporal Guidance Diffusion
Tianjiao Du, Jun Chen, Jiasheng Lu, Qinmei Xu, Huan Liao, Yupeng Chen, Zhiyong Wu

[Paper]

  • Training-free approach for controllable TTA generation based on the location and duration of corresponding sound events.

ARXIV 2024

Rhythmic Foley: A Framework for Seamless Audio-Visual Alignment in Video-to-Audio Synthesis
Zhiqi Huang, Dan Luo, Jun Wang, Huan Liao, Zhiheng Li, Zhiyong Wu

[Project] [Paper]

  • An innovative framework for video-to-audio synthesis, characterized by semantic integrity and precise beat point synchronization.

🧙 3D Generation

ARXIV 2024

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment
Haonan Han, Rui Yang, Huan Liao, Jiankai Xing, Zunnan Xu, Xiaoming Yu, Junwei Zha, Xiu Li, Wanhua Li

[Project] [Paper] [Code]

  • A novel approach for compositional 3D asset generation from single images.

🎖 Honors and Awards

  • 2023.10 Second Class Scholarship at Tsinghua Shenzhen International Graduate School
  • 2023.08 Tsinghua & Huawei - Information and Media Technology Outstanding Practice Project
  • 2022.06 Outstanding Graduate and Outstanding College Student Party Member of Hunan Province
  • 2021.10 National Scholarship (Top 1%)

📖 Education

  • 2022.09 - 2025.06, Master, Tsinghua University, Beijing.

💻 Internships

  • 2024.05 - 2024.07, Tencent AI Lab, Shenzhen.
  • 2023.05 - 2024.03, Huawei 2012 Lab, Shenzhen.

🎵 Music Background

  • Guzheng (Chinese instrument grading exam, Grade 9)
  • Vice president of the 100-member music club

📚 Courses

  • Digital Processing of Speech Signals (A)
  • Introduction to Statistical Learning Theory (A-)

Many thanks to RayeRen for the open-source AcadHomepage template.