Internship

  • ByteDance, Shanghai · AI Chip – AI Software Stack Intern · Jan. 2026 – Present
    • Explore LLM-oriented accelerator systems via simulation and profiling, analyzing compute/communication trade-offs and GPU/NPU workload characteristics for LLM training and inference.
    • Explore accelerator architecture and software–hardware co-design for AI systems, including NoC interconnects, chiplet-based integration, and 3D-stacked architectures.
  • AMD (Advanced Micro Devices), Shanghai · GPU Post-Silicon Intern – Semi-Custom Business Unit (SCBU) · Aug. 2021 – Jan. 2022
    • Developed Python-based tools and infrastructure for post-silicon validation and internal server platforms.
    • Built a runtime power monitor for key on-board nodes using STM32 and ADC chips.