Selected Publications

 ∇  CodeLLM & AI4SE:

  • [ICASSP'26]  TIMEDIFF: Leveraging Differential Domain Representations for Long Time Series Forecasting.
    Yongding Tao, Xiaohang Zeng, Qinxu Ding, Yihong Dong, Da Peng, Peng Di, Puzhuo Liu, Wei Ke.
    The IEEE International Conference on Acoustics, Speech, and Signal Processing, 2026.
  • [ICSE-SEIP'26]  OpenDerisk: An Industrial Framework for AI-Driven SRE, with Design, Implementation, and Case Studies.
    Peng Di, Faqiang Chen, Xiao Bai, Hongjun Yang, Qingfeng Li, Ganglin Wei, Jian Mou, Feng Shi, Keting Chen, Peng Tang, Zhitao Shen, Zheng Li, Wenhui Shi, Junwei Guo, Hang Yu.
    The IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2026.
  • [AAAI'26]  LAMDAS: LLM as an Implicit Classifier for Domain-specific Data Selection.
    Jian Wu, Hang Yu, Bingchang Liu, Yang Wenjie, Peng Di, Jianguo Li, Yue Zhang.
    The AAAI Conference on Artificial Intelligence, 2026.
  • [NeurIPS'25]  Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks.
    Hongyuan Tao, Ying Zhang, Zhenhao Tang, Hongen Peng, Xukun Zhu, Bingchang Liu, Yingguang Yang, Ziyin Zhang, Zhaogui Xu, Haipeng Zhang, Linchao Zhu, Rui Wang, Hang Yu, Jianguo Li, Peng Di.
    The Annual Conference on Neural Information Processing Systems, 2025.
  • [ACL'25]  GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding.
    Ziyin Zhang, Hang Yu, Sage Lee, Peng Di, Jianguo Li, Rui Wang.
    The Annual Meeting of the Association for Computational Linguistics, 2025.
  • [Technical report]  Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM.
    CodeFuse & Ling Team.
    arXiv preprint arXiv:2503.17793 [cs.LG], 2025.
  • [ASE'24]  Understanding Code Changes Practically with Small-Scale Language Models.
    Cong Li, Zhaogui Xu, Peng Di, Dongxia Wang, Zheng Li, Qian Zheng.
    The IEEE/ACM International Conference on Automated Software Engineering, 2024.
  • [ICSE-SEIP'24]  CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model.
    CodeFuse Team.
    The IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2024.
  • [CSUR]  Prompting Frameworks for Large Language Models: A Survey.
    Xiaoxia Liu, Jingyi Wang, Jun Sun, Xiaohan Yuan, Guoliang Dong, Peng Di, Wenhai Wang, Dongxia Wang.
    arXiv preprint arXiv:2311.12785, 2023, and ACM Computing Surveys, 2026.

 ∇  Program Analysis:

  • [ICSE-SEIP'26]  Principles and Practices of Large-Scale Code Analysis at Ant Group: A Data- and Logic-Oriented Approach.
    Xiaoheng Xie, Gang Fan, Xiaojun Lin, Ang Zhou, Shijie Li, Xunjin Zheng, Yinan Liang, Yu Zhang, Na Yu, Haokun Li, Xinyu Chen, Yingzhuang Chen, Yi Zhen, Dejun Dong, Xianjin Fu, Jinzhou Su, Fuxiong Pan, Pengshuai Luo, Youzheng Feng, Ruoxiang Hu, Hanyang Guo, Jing Fan, Xiao Xiao, Peng Di.
    The IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2026.
  • [CGO'26]  BIT: Empowering Binary Analysis Through the LLVM Toolchain.
    Puzhuo Liu, Peng Di, Jingling Xue, Yu Jiang.
    The IEEE/ACM International Symposium on Code Generation and Optimization, 2026.
  • [ICSE'26]  Evolving Trends, Patterns, and Hidden Pitfalls: Unveiling JavaScript Feature Usage in the Wild.
    Dawei Chen, Wuxia Jin, Hui Guo, Guanlin Qiao, Peng Di, Ting Liu.
    The IEEE/ACM International Conference on Software Engineering, 2026.
  • [TOSEM'25 & Journal-First of ASE'25]  LLM-Powered Static Binary Taint Analysis.
    Puzhuo Liu, Chengnian Sun, Yaowen Zheng, Xuan Feng, Chuan Qin, Yuncheng Wang, Zhenyang Xu, Zhi Li, Peng Di, Yu Jiang, Limin Sun.
    The ACM Transactions on Software Engineering and Methodology (TOSEM), and
    Journal-First Track of the IEEE/ACM International Conference on Automated Software Engineering, 2025.
  • [ICSE'25]  Datalog-Based Language-Agnostic Change Impact Analysis for Microservices.
    Qingkai Shi, Xiaoheng Xie, Xianjin Fu, Peng Di, Huawei Li, Ang Zhou, Gang Fan.
    The IEEE/ACM International Conference on Software Engineering, 2025.
  • [ICSE'25, Distinguished Paper Award]  Tumbling Down the Rabbit Hole: How do Assisting Exploration Strategies Facilitate Grey-box Fuzzing?
    Mingyuan Wu, Jiahong Xiang, Kunqiu Chen, Peng Di, Shin Hwei Tan, Heming Cui, Yuqun Zhang.
    The IEEE/ACM International Conference on Software Engineering, 2025.
  • [OOPSLA'24]  Scaling Abstraction Refinement for Program Analyses in Datalog Using Graph Neural Networks.
    Zhenyu Yan, Xin Zhang, Peng Di.
    The ACM SIGPLAN International Conference on Object-Oriented Programming Systems, Languages, and Applications, 2024.
  • [FSE'24]  Finding and Understanding Defects in Static Analyzers by Constructing Automated Oracles.
    Weigang He, Peng Di, Mengli Ming, Chengyu Zhang, Ting Su, Shijie Li, Yulei Sui.
    The ACM International Conference on the Foundations of Software Engineering, 2024.
  • [ICSE-SEIP'24]  MicroFuzz: An Efficient Fuzzing Framework for Microservices.
    Peng Di, Bingchang Liu, Yiyi Gao.
    The IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2024.
  • [TSE'24]  Generic Sensitivity: Generics-Guided Context Sensitivity for Pointer Analysis.
    Haofeng Li, Tian Tan, Yue Li, Jie Lu, Haining Meng, Liqing Cao, Yongheng Huang, Lian Li, Lin Gao, Peng Di, Liang Lin, ChenXi Cui.
    The IEEE Transactions on Software Engineering, 2024.
  • [ISSTA'23]  Hybrid Inlining: A Framework for Compositional and Context-Sensitive Static Analysis.
    Jiangchao Liu, Jierui Liu, Peng Di, Diyu Wu, Hengjie Zheng, Alex X. Liu, Jingling Xue.
    The ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023.
  • [ICSE-SEIP'23]  Incremental Call Graph Construction in Industrial Practice.
    Zelin Zhao, Xizao Wang, Zhaogui Xu, Zhenhao Tang, Yongchao Li, Peng Di.
    The IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2023.
  • [ICSE-SEIP'23]  Scalable Compositional Static Taint Analysis for Sensitive Data Tracing on Industrial Micro-Services.
    Zexin Zhong, Jiangchao Liu, Diyu Wu, Peng Di, Yulei Sui, Alex X. Liu, John C.S. Lui.
    The IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2023.
  • [ICSE-SEIP'22]  Record and Replay of Online Traffic for Microservices with Automatic Mocking Point Identification.
    Jiangchao Liu, Jierui Liu, Peng Di, Alex X. Liu, Zexin Zhong.
    The IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2022.
  • [ICSE-SEIP'22]  Field-based Static Taint Analysis for Industrial Microservice.
    Zexin Zhong, Jiangchao Liu, Diyu Wu, Peng Di, Yulei Sui, Alex X. Liu.
    The IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice, 2022.
  • [EuroLLVM'16]  SVF: Static Value-Flow Analysis in LLVM.
    Yulei Sui, Peng Di, Ding Ye, Hua Yan, Jingling Xue.
    The European LLVM Conference, 2016.

 ∇  AI Infra & Parallel Programming:

  • [PPoPP'26]  TAC: Cache-based System for Accelerating Billion-Scale GNN Training on Multi-GPU Platform.
    Zhiqiang Liang, Hongyu Gao, Fang Liu, Jue Wang, Xingguo Shi, Juyu Gu, Peng Di, San Li, Lei Tang, Chunbao Zhou, Lian Zhao, Yangang Wang, Xuebin Chi.
    The ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2026.
  • [TSE'25]  Efficient Function Orchestration for Large Language Models.
    Xiaoxia Liu, Peng Di, Cong Li, Jun Sun, Jingyi Wang.
    The IEEE Transactions on Software Engineering, 2025.
  • [TOCS'23]  Modeling the Interplay between Loop Tiling and Fusion in Optimizing Compilers using Affine Relations.
    Jie Zhao, Jinchen Xu, Peng Di, Wang Nie, Jiahui Hu, Yanzhi Yi, Sijia Yang, Zhen Geng, Renwei Zhang, Bojie Li, Zhiliang Gan, Xuefeng Jin.
    The ACM Transactions on Computer Systems, 2023.
  • [PLDI'21]  AKG: Automatic Kernel Generation for Neural Processing Units using Polyhedral Transformations.
    Jie Zhao, Bojie Li, Wang Nie, Zhen Geng, Renwei Zhang, Xiong Gao, Bin Cheng, Chen Wu, Yun Cheng, Zheng Li, Peng Di, Kun Zhang, Xuefeng Jin.
    The ACM SIGPLAN Conference on Programming Language Design and Implementation, 2021.
  • [MICRO'20, Best Paper Nomination]  Optimizing the Memory Hierarchy by Compositing Automatic Transformations on Computations and Data.
    Jie Zhao, Peng Di.
    The IEEE/ACM International Symposium on Microarchitecture, 2020.