Yaping Zhang (张亚萍)

I am an associate researcher at the State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA). I received my Ph.D. degree from CASIA in 2020. My research centers on multimodal learning, document analysis, and natural language processing. Recently, we have been working to establish a new pathway toward translating document images with complex layouts.

News

We are organizing the ICDAR 2025 Competition on End-to-End DIMT.
One paper has been accepted by IEEE Transactions on PAMI (TPAMI, CCF A), 2025.
One paper has been accepted by IEEE Transactions on Multimedia (TMM, CCF B), 2025.
Three papers have been accepted by ACL 2025 (CCF A).
Two papers have been accepted by COLING 2025 (CCF B).

Selected Publications

Zhiyang Zhang, Yaping Zhang, Yupu Liang, Cong Ma, Lu Xiang, Yang Zhao, Yu Zhou, and Chengqing Zong. Understand Layout and Translate Text: Unified Feature-Conductive End-to-End Document Image Translation. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Accepted. https://doi.org/10.1109/TPAMI.2025.3530998 (CCF A)
Yaping Zhang, Shuai Nie, Wenju Liu, Xing Xu, Dongxiang Zhang, Heng Tao Shen. Sequence-to-sequence domain adaptation network for robust text image recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019. (CCF A)
Yaping Zhang, Shuai Nie, Shan Liang, Wenju Liu. Robust text image recognition via adversarial sequence-to-sequence domain adaptation. IEEE Transactions on Image Processing. 2021. (CCF A)
Zhiyang Zhang, Yaping Zhang, Yupu Liang, Lu Xiang, Yang Zhao, Yu Zhou, Chengqing Zong. From Chaotic OCR Words to Coherent Document: A Fine-to-Coarse Zoom-Out Network for Complex-Layout Document Image Translation. Proceedings of the 31st International Conference on Computational Linguistics (COLING 2025). Abu Dhabi, UAE. pp. 10877–10890. (CCF B)
Yupu Liang, Yaping Zhang, Cong Ma, Zhiyang Zhang, Yang Zhao, Lu Xiang, Chengqing Zong, Yu Zhou. Document Image Machine Translation with Dynamic Multi-pre-trained Models Assembling. In The 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024). Mexico City, Mexico. June 16-21, 2024. (CCF B)
Cong Ma, Yaping Zhang, Zhiyang Zhang, Yupu Liang, Yang Zhao, Yu Zhou, Chengqing Zong. Born a BabyNet with Hierarchical Parental Supervision for End-to-End Text Image Machine Translation. In The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Torino, Italia. May 20-25, 2024. (CCF B)
Zhiyang Zhang, Yaping Zhang, Yupu Liang, Lu Xiang, Yang Zhao, Yu Zhou, Chengqing Zong. LayoutDIT: Layout-Aware End-to-End Document Image Translation with Multi-Step Conductive Decoder. In Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore. December 6-10, 2023. pp. 4959–4965. (CCF B)

Projects

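National Natural Science Foundation of China (NSFC), General Program, 62476275, Research on Document Image Machine Translation Methods for Complex Layouts, 2025-01-01 to 2028-12-31, 300,000 CNY, ongoing, principal investigator
National Natural Science Foundation of China (NSFC), Key Program, 62336008, Research on Key Technologies for Large-Scale Multilingual Multimodal Neural Machine Translation, 2024-01-01 to 2028-12-31, 2.33 million CNY, ongoing, participant
National Natural Science Foundation of China (NSFC), Young Scientists Fund (Category C), 62106265, Research on Image Text Machine Translation Methods Integrating Visual and Textual Information, 2022-01-01 to 2024-12-31, 300,000 CNY, completed, principal investigator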

Awards

2023, Best Paper Award at the 19th China Conference on Machine Translation (CCMT 2023)
2018, Best Student Paper Award at the International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Academic Activities

Committee Member, Chinese Society of Image and Graphics, Document Image Analysis and Recognition (2023-)
Committee Member, Chinese Society of Image and Graphics, Big Vision Data Committee (2023-)
Committee Member, Chinese Information Processing Society of China, Youth Working Committee (2023-)
Reviewer for ICLR, AAAI, TMM, NeuNet, ACL, EMNLP, NAACL, ICME, ICASSP, CVMJ, TALLIP, etc.