👋 About Me
Hello! I’m Xutian Chen, a passionate student and researcher in the field of Artificial Intelligence and Machine Learning. I am currently pursuing my Master’s degree in Artificial Intelligence at Beihang University, having graduated with a Bachelor’s degree in Artificial Intelligence from Jinan University.
My research interests lie at the intersection of AI for Bio&Chem, Diffusion Models, LLM Inference Acceleration, and Autonomous Vehicle Path Planning. I am always open to suitable opportunities in research and the tech industry, and I am excited about how GenAI can transform the way specific problems are solved in any domain.
📧 Contact: chenxutian@buaa.edu.cn
🔗 GitHub: github.com/Blossom0913
🔗 Gitee: gitee.com/chenxutian
Remember brick walls let us show our dedication. They are there to separate us from the people who don't really want to achieve their childhood dreams.
📖 Education
- 2025.09 - Present: Master of Engineering in Artificial Intelligence, Beihang University, Beijing, China
- Major Courses: Machine Learning, Deep Learning, Computer Vision and 15 others
- 2021.09 - 2025.07: Bachelor of Engineering in Artificial Intelligence, Jinan University, Zhuhai, China
- A+ Courses: Computer Networks, Data Structures, Advanced Programming Language, and 15 others
💼 Internship
Decision & Planning Intern | Meituan Autonomous Vehicle Department
2026.03 - present | Beijing, China
- Improved rule-based planning and refined trajectory feasibility checks for intersection exit decisions
- Built data pipelines and Jenkins workflows to establish a daily quantitative expert data loop for model and rule evaluation
- Collaborated on sim-to-real alignment and debugging, producing reusable evaluation and traceability processes
🚀 Competition Experience
ASC2022 Student Supercomputer Challenge
Team Member | 2021.11 - 2022.06 | Zhuhai, China
- Project goal: Under limited compute (8× Tesla V100 16GB) and power constraints, pre-train a 4.7B-parameter Yuan-1.0 language model and cut training time from 45 h to 28 h (about 38% less wall-clock time, roughly a 1.6× speedup).
- Memory optimization: Led application of ZeRO-Offload and ZeRO Stage 1 to offload model states (parameters, gradients, optimizer states) to CPU memory, enabling training of a 4.7B model on 8×16GB GPUs and resolving CUDA OOM issues.
- Parallel & acceleration strategy: Deployed Megatron-DeepSpeed and designed a hybrid parallel scheme with 4-way tensor parallelism + 2-way pipeline parallelism, improving throughput from 4.08 to 4.66 samples/s compared with pure 8-way tensor parallelism.
- Engineering optimizations: Adopted mixed precision training (AMP), built PyTorch with Intel MKL, and used DeepSpeed’s CPU Adam to optimize CPU offload computation and communication.
- Convergence tuning: Tuned learning rate scaling and warmup strategies alongside data pipelines and micro-batching, achieving a final training loss of 5.826 in reproduction runs.
- Deliverables: Responsible for parallel strategy evaluation, memory/performance analysis, and engineering implementation; project notes and partial experiment logs at: ASC Student Supercomputer Challenge Proposal
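The memory and parallelism reasoning behind the bullets above can be sketched in a few lines. This is a back-of-the-envelope estimate, not the team's actual configuration: it assumes the commonly cited 16 bytes per parameter for mixed-precision Adam (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer states), assumes model states are split evenly across the 8 GPUs, and ignores activations and temporary buffers, which come on top. The function name `model_state_gb` and the values in `ds_config` are illustrative.

```python
# Rough per-GPU model-state memory for mixed-precision Adam training.
BYTES_FP16_PARAMS = 2
BYTES_FP16_GRADS = 2
BYTES_OPTIMIZER = 12  # fp32 master weights + Adam momentum + Adam variance


def model_state_gb(n_params, n_gpus, offload_optimizer=False):
    """Estimate GB of model states per GPU (hypothetical helper).

    Assumes an even split across GPUs; activations are not counted.
    """
    per_param = BYTES_FP16_PARAMS + BYTES_FP16_GRADS
    if not offload_optimizer:
        per_param += BYTES_OPTIMIZER
    return n_params * per_param / n_gpus / 1e9


# Hybrid layout from the text: 4-way tensor x 2-way pipeline on 8 GPUs.
N_GPUS = 8
TENSOR_PARALLEL = 4
PIPELINE_PARALLEL = 2
DATA_PARALLEL = N_GPUS // (TENSOR_PARALLEL * PIPELINE_PARALLEL)  # 1 replica

on_gpu = model_state_gb(4.7e9, N_GPUS)                        # all states on GPU
offloaded = model_state_gb(4.7e9, N_GPUS, offload_optimizer=True)

# Illustrative DeepSpeed config fragment (values are assumptions):
# ZeRO Stage 1 with optimizer states offloaded to CPU, plus fp16 training.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 1,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
}

print(f"model states per GPU: {on_gpu:.2f} GB, with offload: {offloaded:.2f} GB")
```

Under these assumptions, moving the optimizer states to CPU memory removes 12 of the 16 bytes per parameter from the GPU, which is where most of the headroom for activations comes from.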
🔬 Research Experience
Research on AI4S and AI+Chem | Guangdong Institute of Intelligent Science and Technology
Research Intern | 2025.02 - 2025.09 | Zhuhai, China
- Large-Scale Computational Framework: Engineered a parallel computing framework deployed on a 4-GPU (RTX 2080Ti) cluster, successfully processing 43.8 million molecular docking tasks to accelerate drug discovery pipelines.
- Algorithm Research & Benchmarking: Designed a mouse social behavior classification framework using DeepLabCut for keypoint labeling; conducted comparative experiments on LightGBM, LSTM, CNN, and GMM models to evaluate performance.
- System Development: Contributed to the development of a multi-robot path planning system and a lightweight task management platform, resulting in 2 software copyright registrations (Top-3 Author).
- Technical Stack: Python, Deep Learning, Parallel Computing, CUDA, Robotics Simulation.
- Source code available at: Dock Repository
Multi-agent Path Planning | Jinan University
Research Assistant | 2024.03 - 2024.07 | Zhuhai, China
- Performance Analysis: Designed controlled experiments scaling agents from 10 to 50, revealing that runtime increased 135x (2.3s → 312s) and success rate dropped to 67%. Profiling identified conflict detection (70% of runtime) and deep copy operations as primary bottlenecks.
- Algorithm Optimization (C++/MAPF): Reduced conflict detection complexity from O(n²) to O(n) by implementing incremental checks that compare only the replanned agent against the others instead of rescanning every pair. Reduced memory overhead by replacing deep copies with Copy-on-Write, enabling multiple agents to share conflict tree nodes until a write is required.
- Systems Thinking: Applied knowledge of stack/heap allocation and shallow/deep copy semantics to guide optimization decisions. Derived the complexity reduction (O(n²) → O(n)) and presented the findings as a clear problem-solution-impact narrative, so that each engineering decision was easy to follow.
- Platform Integration: Built messaging architecture between AGV fleet and local server for state synchronization under real-world constraints. Reproduced CL-CBS baseline (paper: CL-MAPF), fixed implementation bugs, and delivered faster planning than Hybrid A* baseline within a 3-month cycle.
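The incremental conflict check and the sharing idea described above can be illustrated with a minimal sketch. The original work was in C++; this Python version is a simplified stand-in rather than the project code. It assumes grid paths as lists of cells, checks vertex conflicts only (edge/swap conflicts are omitted), and assumes an agent waits at its goal after arriving. All function names are hypothetical. The shallow list copy at the end is a simple approximation of copy-on-write: the child node shares every unchanged path object with its parent.

```python
from itertools import combinations


def first_conflict(path_a, path_b):
    """Return the first timestep where both paths occupy the same cell, else None."""
    horizon = max(len(path_a), len(path_b))
    for t in range(horizon):
        # Assumption: an agent that reached its goal stays there.
        a = path_a[min(t, len(path_a) - 1)]
        b = path_b[min(t, len(path_b) - 1)]
        if a == b:
            return t
    return None


def all_conflicts(paths):
    """Full scan over every pair of agents: O(n^2) pair checks."""
    found = {}
    for i, j in combinations(range(len(paths)), 2):
        t = first_conflict(paths[i], paths[j])
        if t is not None:
            found[(i, j)] = t
    return found


def update_conflicts(conflicts, paths, changed):
    """Incremental update after only agent `changed` was replanned: O(n) pair checks."""
    # Conflicts between two unchanged agents are still valid and are kept.
    updated = {pair: t for pair, t in conflicts.items() if changed not in pair}
    for other in range(len(paths)):
        if other == changed:
            continue
        t = first_conflict(paths[changed], paths[other])
        if t is not None:
            updated[tuple(sorted((changed, other)))] = t
    return updated


# Parent node: agents 0 and 1 collide at cell (0, 1) at timestep 1.
paths = [
    [(0, 0), (0, 1), (0, 2)],
    [(1, 1), (0, 1), (0, 0)],
    [(2, 2), (2, 1)],
]
conflicts = all_conflicts(paths)

# Child node: share unchanged paths with the parent, replace only agent 1's path.
child = list(paths)  # shallow copy: path objects are shared, not duplicated
child[1] = [(1, 1), (1, 1), (0, 1), (0, 0)]  # agent 1 waits one step
child_conflicts = update_conflicts(conflicts, child, changed=1)
```

The design point is that replanning one agent can only change conflicts involving that agent, so the child node needs n - 1 pair checks instead of n(n - 1)/2.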
🎖 Honors and Awards
- 2022.06: 🥈 National Second Prize in ASC2022 (Student Supercomputer Challenge), ranked 22nd among all participants
- 2022.07: Attended the ASC finals at USTC as a visitor
- 2024: 🥇 Provincial First Prize (preliminary) in Chinese Mathematics Competition (CMC), Guangdong Division
- 2023: 🥈 Provincial Second Prize (preliminary) in Chinese Mathematics Competition (CMC), Guangdong Division
🌐 Open-Source Contributions
Tiny-llm
2026.04 – present
- Focused on Attention implementation and lightweight LLM training/inference pipelines
- Tech stack: vLLM, SGLang, Tool Calling
- Reference: https://github.com/skyzh/tiny-llm
PlayTask - Time Management App
2023.09 – 2023.12 | Personal Project
- 📱 Designed and built a time management app from scratch
- 🛠️ Learned version control (VCS), event handling, and basic debugging tools in Android Studio
- 🎨 Independently designed the UI/UX with ViewPager2, TabLayout, and Fragment
- 📝 Source code: Blossom0913/PlayTask
💻 Technical Skills
Programming Languages
- Proficient: Python, C/C++, Java
- Experienced: Rust, Kotlin
Technical Skills
- Development Tools: Git, SSH, Android Studio
- High Performance Computing: CUDA, MPI, OpenMP
- Distributed Training & Optimization: DeepSpeed, Megatron, ZeRO, ZeRO-Offload
- Precision & Acceleration: Mixed Precision (AMP), Intel MKL optimizations
- Machine Learning: TensorFlow, PyTorch, Fine-Tuning
- Systems: Linux, Shell
- Other: Algorithm Design, Multi-Agent Systems, Computer Vision, Model Inference & Deployment
📝 Publications
Currently preparing manuscripts for submission. Feel free to reach out for collaboration opportunities!