Internships

ByteDance, Shanghai · AI Chip, AI Software Stack Intern · Jan. 2026 – Present
- Explore LLM-oriented accelerator systems through simulation and profiling, analyzing compute/communication trade-offs and GPU/NPU workload characteristics for LLM training and inference.
- Investigate accelerator architecture and software–hardware co-design for AI systems, including NoC interconnects, chiplet-based integration, and 3D-stacked architectures.
AMD (Advanced Micro Devices), Shanghai · GPU Post-Silicon Intern, Semi-Custom Business Unit (SCBU) · Aug. 2021 – Jan. 2022
- Developed Python-based tools and infrastructure for post-silicon validation and internal server platforms.
- Built a runtime power monitor for key on-board nodes using an STM32 microcontroller and ADC chips.