Paper
.Chenkai Zhang, Yiming Lei, Zeming Liu^, Haitao Leng, ShaoGuo Liu, Tingting Gao, Qingjie Liu, Yunhong Wang, GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art, In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), 2025.07, Vienna, Austria. (CCF A). (Corresponding author).
.Jiayi Zeng, Yizhe Feng, Mengliang He, Wenhui Lei, Wei Zhang, Zeming Liu^, Xiaoming Shi, Aimin Zhou, Mis-prompt: Benchmarking Large Language Models for Proactive Error Handling, In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), 2025.07, Vienna, Austria. (CCF A). (Corresponding author).
.Chenkai Zhang, Yiming Lei, Zeming Liu^, Haitao Leng, Kai Li, Tingting Gao, Qingjie Liu, Yunhong Wang, SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2025), 2025.06, Nashville, Tennessee, USA. (First submission to CVPR, Corresponding author, CCF A).
.Bin Deng, Yizhe Feng, Zeming Liu^, Qing Wei, Xiangrong Zhu, Shuai Chen, Yuanfang Guo, Yunhong Wang, RETAIL: Towards Real-world Travel Planning for Large Language Models, In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), 2025, Suzhou, China. (CCF B, CAAI A, Tsinghua A, Corresponding author).
.Yanzhi Tian, Zeming Liu, Zhengyang Liu, Chong Feng, Xin Li, Heyan Huang, Yuhang Guo, PRIM: Towards Practical In-Image Multilingual Machine Translation, In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), 2025, Suzhou, China. (CCF B, CAAI A, Tsinghua A).
.Jingjing Liu, Zeming Liu, Zihao Cheng, Mengliang He, Xiaoming Shi, Yuhang Guo, Xiangrong Zhu, Yuanfang Guo, Yunhong Wang, Haifeng Wang, RepoDebug: Repository-Level Multi-Task and Multi-Language Debugging Evaluation of Large Language Models, In Findings of the Association for Computational Linguistics: EMNLP 2025, 2025, Suzhou, China. (Corresponding author).
.Hongfei Xia*, Hongru Wang*, Zeming Liu*, Qian Yu, Yuhang Guo, Haifeng Wang, SafeToolBench: Pioneering a Prospective Benchmark to Evaluating Tool Utilization Safety in LLMs, In Findings of the Association for Computational Linguistics: EMNLP 2025, 2025, Suzhou, China. (Co-first author).
.Jianing Lin, Yuanfang Guo, Shunning Liu, Zeming Liu, Yunhong Wang, Weak2Wise: An Automated, Lightweight Framework for Weak-LLM-Friendly Reasoning Synthesis, In Findings of the Association for Computational Linguistics: EMNLP 2025, 2025, Suzhou, China.
.Hongru Wang, Rui Wang, Boyang Xue, Heming Xia, Jingtao Cao, Zeming Liu, Jeff Z. Pan, Kam-Fai Wong, AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction, In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), 2024, Miami, Florida, USA. (CCF B, CAAI A, Tsinghua A).
.Zihao Cheng, Hongru Wang, Zeming Liu^, Yuhang Guo, Yuanfang Guo, Yunhong Wang, Haifeng Wang, ToolSpectrum: Towards Personalized Tool Utilization for Large Language Models, In Findings of the Association for Computational Linguistics: ACL 2025, Vienna, Austria. (Corresponding author).