About me
Hello! I am Yujia Xiao, a third-year PhD student in the DSP & Speech Technology Laboratory (DSP-STL) at The Chinese University of Hong Kong (CUHK), under the supervision of Prof. Tan Lee. Prior to this, I worked as an applied scientist at Microsoft from 2018 to 2022. I earned both my M.S. and B.S. degrees from South China University of Technology. My current research focuses on long-form audio and speech generation as well as multimodal agents. If you are interested in my work, feel free to contact me!
News
- Mar 4, 2025: PodAgent is released. Given the topic to be discussed, PodAgent will simulate human behavior to create podcast-like audio presented as a talk show, featuring one host and several guests. The show will include diverse and insightful viewpoints, delivered in appropriate voices, along with structured sound effects and background music to enrich the listening experience.
Experience
- 2023.07 - 2024.03: Research Intern at Microsoft (TTS Algorithm Team)
- 2018.05 - 2022.07: Applied Scientist at Microsoft (TTS Algorithm Team)
- 2016.08 - 2018.04: Research Intern at Microsoft Research Asia (Speech Group & IEG)
- 2014.07 - 2015.08: Research Intern at Microsoft Research Asia (Speech Group & IEG)
Selected Publications
- PodAgent: A Comprehensive Framework for Podcast Generation. Yujia Xiao, Lei He, Haohan Guo, Fenglong Xie, Tan Lee. 2025.
- Contrastive context-speech pretraining for expressive text-to-speech synthesis. Yujia Xiao, Xi Wang, Xu Tan, Lei He, Xinfa Zhu, Sheng Zhao, Tan Lee. ACM Multimedia 2024.
- ContextSpeech: Expressive and efficient text-to-speech for paragraph reading. Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee. INTERSPEECH 2023.
- Improving FastSpeech TTS with efficient self-attention and compact feed-forward network. Yujia Xiao, Xi Wang, Lei He, Frank K. Soong. ICASSP 2022.
- Improving prosody with linguistic and BERT derived features in multi-speaker based Mandarin Chinese neural TTS. Yujia Xiao, Lei He, Huaiping Ming, Frank K. Soong. ICASSP 2020.
- Paired phone-posteriors approach to ESL pronunciation quality assessment. Yujia Xiao, Frank K. Soong, Wenping Hu. INTERSPEECH 2018.
- Proficiency Assessment of ESL Learner's Sentence Prosody with TTS Synthesized Voice as Reference. Yujia Xiao, Frank K. Soong. INTERSPEECH 2017.
Awards
- 2021.12 [Microsoft Hackathon] Executive Challenge - Hack for Consumer Business Growth - 2nd Place
- 2020.09 [Microsoft Hackathon] Honorable Mention
- 2019.09 [Microsoft Hackathon] Hackathon Challenge - Hack for Big Ideas - 2nd Place
- 2016 National Scholarship for Postgraduates
- 2013 National Scholarship
- 2012 National Scholarship
Teaching & Services
- Teaching Assistant for UGEB1408-ENGG1920 Artificial Intelligence in Action at CUHK
- Invited Reviewer for ICASSP 2025