Minghao Wu - Monash University
Minghao Wu is currently a Ph.D. student at Monash University under the supervision of Prof. Gholamreza (Reza) Haffari and Dr. George Foster. His research interests include large language models, language agents, multilinguality, and machine translation.
My CV can be downloaded in the English version and the Chinese version.
Education
- Ph.D. Computer Science, Monash University, 2022, supervised by Prof. Gholamreza (Reza) Haffari and Dr. George Foster.
- M.Eng. Information Technology, The University of Melbourne, 2018, supervised by Prof. Trevor Cohn.
- B.Sc. Statistics and Information Systems, The University of Sydney, 2016.
Work Experience
- Jul. 2023 - Oct. 2023: Research Intern at Tencent AI Lab, supervised by Dr. Longyue Wang.
- Apr. 2023 - Jul. 2023: Visiting Researcher at Mohamed bin Zayed University of Artificial Intelligence, supervised by Dr. Alham Fikri Aji.
- Jul. 2020 - Jul. 2021: Research Intern at Huawei Noah’s Ark Lab, supervised by Dr. Meng Zhang.
- Aug. 2018 - Jul. 2019: Research Engineer at JD AI Research, JD.com, Inc., supervised by Dr. Xiaodong He.
Selected Publications
You can find the complete list of my publications on my Google Scholar profile.
Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models
Minghao Wu, Thuy-Trang Vu, Lizhen Qu, and Gholamreza Haffari. 2024. Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models. In CoRR, abs/2406.08811.
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
Minghao Wu, Yulin Yuan, Gholamreza Haffari, and Longyue Wang. 2024. (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts. In CoRR, abs/2405.11804.
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, and Alham Fikri Aji. 2024. LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 944–964, St. Julian’s, Malta. Association for Computational Linguistics.
Importance-Aware Data Augmentation for Document-Level Neural Machine Translation
Minghao Wu, Yufei Wang, George Foster, Lizhen Qu, and Gholamreza Haffari. 2024. Importance-Aware Data Augmentation for Document-Level Neural Machine Translation. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 740–752, St. Julian’s, Malta. Association for Computational Linguistics.
Adapting Large Language Models for Document-Level Machine Translation
Minghao Wu, Thuy-Trang Vu, Lizhen Qu, George Foster, and Gholamreza Haffari. 2024. Adapting Large Language Models for Document-Level Machine Translation. In CoRR, abs/2401.06468.
Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation
Haonan Li, Fajri Koto, Minghao Wu, Alham Fikri Aji, and Timothy Baldwin. 2023. Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation. In CoRR, abs/2305.15011.
Document Flattening: Beyond Concatenating Context for Document-Level Neural Machine Translation
Minghao Wu, George Foster, Lizhen Qu, and Gholamreza Haffari. 2023. Document Flattening: Beyond Concatenating Context for Document-Level Neural Machine Translation. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 448–462, Dubrovnik, Croatia. Association for Computational Linguistics.
Universal Conditional Masked Language Pre-training for Neural Machine Translation
Pengfei Li, Liangyou Li, Meng Zhang, Minghao Wu, and Qun Liu. 2022. Universal Conditional Masked Language Pre-training for Neural Machine Translation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6379–6391, Dublin, Ireland. Association for Computational Linguistics.
Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural Machine Translation Training
Minghao Wu, Yitong Li, Meng Zhang, Liangyou Li, Gholamreza Haffari, and Qun Liu. 2021. Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural Machine Translation Training. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7291–7305, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Evaluating the Utility of Hand-crafted Features in Sequence Labelling
Minghao Wu, Fei Liu, and Trevor Cohn. 2018. Evaluating the Utility of Hand-crafted Features in Sequence Labelling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2850–2856, Brussels, Belgium. Association for Computational Linguistics.