
Da Chang

Pengcheng Laboratory, Shenzhen, China. 2025.7-Present

Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China. 2024.9-Present

Ph.D. in Pattern Recognition and Intelligent Systems

Central South University, School of Automation, China. 2020.9-2024.6

B.Eng. in Intelligent Science and Technology

DL Research — Optimization • PEFT • Pre & Post Training


I graduated in 2024 from the Department of Intelligent Science and Technology, School of Automation, Central South University. I am currently a jointly trained Ph.D. candidate in a collaborative program between the Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences (SIAT) and Pengcheng Laboratory (PCL).
My major is Pattern Recognition and Intelligent Systems. My research interests focus on the optimization and generalization of deep learning, and on applying deep models to a range of areas.
I am fortunate to be advised by Associate Professor Ganzhao Yuan and to work closely with Researchers Yongxiang Liu and Huihui Zhou at Pengcheng Laboratory. I am very interested in both the theory and the applications of deep learning, and I am happy to discuss neural network training techniques, application scenarios, and optimization theory.


News

  • 2026-01 — KG-SAM: Injecting Anatomical Knowledge into Segment Anything Models via Conditional Random Fields accepted at ICASSP 2026.
  • 2025-11 — Calibrating and Rotating: A Unified Framework for Weight Conditioning in PEFT accepted at AAAI 2026.
  • 2025-09 — Preprint On the Convergence of Muon and Beyond posted.
  • 2025-09 — MGUP: A Momentum-Gradient Alignment Update Policy for Stochastic Optimization accepted at NeurIPS 2025 as a Spotlight.
  • 2025-07 — I started my second year of doctoral study at Pengcheng Laboratory.
  • 2024-09 — I started my first year of doctoral study at the Chinese Academy of Sciences.
  • 2024-06 — I graduated from the School of Automation of Central South University.
  • 2024-06 — My thesis won the outstanding undergraduate thesis award of Central South University.
  • 2024-04 — Undergraduate graduation project Mixed Text Recognition with Efficient Parameter Fine-Tuning and Transformer accepted at ICONIP 2024.

Projects

Distributed Muon-based MoE Training on 64 NPUs
Da Chang
Project · Systems for LLM Training
Megatron & MindSpeed · MoE / Distributed Training
Built Muon and Distributed Muon (ZeRO-style) on top of a Megatron- and MindSpeed-based 64-NPU MoE training framework, with support for TP, CP, EP, and PP parallelism. Muon consistently outperformed AdamW during training, while Distributed Muon reduced memory usage at the cost of additional communication overhead. I also identified a loss-spike issue caused by the interaction between MoE experts and Muon redundancy. The project improved training throughput, stability, and resource utilization, and enabled training at larger model scales.
MoE · Muon · ZeRO · Megatron · MindSpeed · 64 NPUs

Selected Publications

KG-SAM: Injecting Anatomical Knowledge into Segment Anything Models via Conditional Random Fields
Yu Li*, Da Chang*, Xi Xiao
(* equal contribution)
ICASSP 2026 · CCF B · PDF
A knowledge-guided framework for medical image segmentation that integrates a medical knowledge graph for anatomical priors, an energy-based CRF for boundary refinement, and an uncertainty-aware fusion module, achieving 82.69% Dice on multi-site prostate segmentation.
SAM · Medical Segmentation · Knowledge Graph · CRF
On the Convergence of Muon and Beyond
Da Chang, Yongxiang Liu, Ganzhao Yuan
Preprint 2025.9 · PDF · Code
Analyzes the convergence of Muon-type optimizers and extends the framework; also discusses the optimal complexity attainable through variance reduction.
Muon · Convergence · Stochastic Optimization
Calibrating and Rotating: A Unified Framework for Weight Conditioning in PEFT
Da Chang, Peng Xue, Yu Li, Yongxiang Liu, Pengxiang Xu, Shixun Zhang
AAAI 2026 · CCF A · PDF · Code
Analyzes the properties of DoRA and LoRA and unifies them under a “calibration + rotation” weight-conditioning strategy, improving both the performance and the training/inference efficiency of PEFT.
LLMs · PEFT · Weight Conditioning
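As background for the “calibration + rotation” idea, the kind of weight conditioning that DoRA popularized, a per-column magnitude calibration of a low-rank directional update, can be sketched as below. The names and the exact operators here are illustrative assumptions based on DoRA, not the paper's own formulation.

```python
import numpy as np

def dora_style_update(W0, A, B, magnitude):
    """DoRA-style reparameterization: a low-rank delta B @ A updates the
    direction of the frozen base weight W0, and a learnable per-column
    magnitude recalibrates the result. Names are illustrative."""
    V = W0 + B @ A                                   # direction: base + low-rank delta
    col_norms = np.linalg.norm(V, axis=0, keepdims=True)
    return magnitude * (V / col_norms)               # calibrate each column's scale
```

A common design choice is to initialize `A` to zeros and `magnitude` to the column norms of `W0`, so the merged weight equals `W0` exactly at the start of fine-tuning.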
MGUP: A Momentum-Gradient Alignment Update Policy for Stochastic Optimization
Da Chang, Ganzhao Yuan
NeurIPS 2025 Spotlight (Top 3%) · CCF A · PDF · Code
Provides ergodic convergence guarantees for stochastic nonconvex optimization and accelerates training via a momentum-gradient alignment strategy.
Stochastic Optimization · Momentum · Alignment
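As a rough illustration of the general momentum-gradient alignment idea (not the actual MGUP rule, which is defined in the paper), one could gate the step by the cosine alignment between the current gradient and the momentum buffer, taking full steps when the two agree and suppressing steps when they conflict:

```python
import numpy as np

def aligned_update(grad, momentum, beta=0.9, lr=0.1):
    """Hypothetical alignment-gated step: scale the momentum step by the
    cosine similarity between gradient and momentum, clipped at zero.
    This illustrates the general idea only, not the MGUP policy."""
    momentum = beta * momentum + (1 - beta) * grad
    denom = np.linalg.norm(grad) * np.linalg.norm(momentum) + 1e-12
    align = float(grad.ravel() @ momentum.ravel()) / denom  # cosine in [-1, 1]
    scale = max(align, 0.0)             # skip steps that oppose the momentum
    return -lr * scale * momentum, momentum
```

When gradient and momentum point the same way the gate is 1 and the update reduces to heavy-ball momentum; when they are anti-aligned the step is suppressed entirely.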
Mixed Text Recognition with Efficient Parameter Fine-Tuning and Transformer
Da Chang*, Yu Li*
(* equal contribution)
Undergraduate Graduation Project / ICONIP 2024 · CCF C · PDF · Code
A TrOCR-based OCR system for mixed text using parameter-efficient fine-tuning; includes a practical pipeline and evaluation.
OCR · TrOCR · PEFT

Academic Service

  • 2026 — Reviewer for International Conference on Machine Learning (ICML).
  • 2025 — Reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).

Honors and Awards

  • Second prize, 8th National Biomedical Engineering Innovation Design Competition for College Students, China, 2023.
  • Third prize, 9th National Statistical Modeling Competition for College Students, China, 2023.
  • Second Class Scholarship, CSU (Top 15%), 2023.
  • "ShanHe Excellent Student" Second Class Scholarship, CSU (Top 5%), 2022.
  • First-Class Scholarship, CSU (Top 5%), 2022.

Skills

  • Languages: Python, C/C++, MATLAB, LaTeX
  • Frameworks: PyTorch, scikit-learn, Megatron, MindSpeed
  • Distributed LLM Training: TP, PP, EP, CP; MoE training; NPU-based large-scale training
  • AI-assisted Development: Claude Code, Codex
