Jiahang Lou 楼佳杭

About

I am a Ph.D. student in Electronics Engineering (EE) at the State Key Laboratory of Integrated Circuits and System, Fudan University, supervised by Prof. Lingli Wang. My research focuses on reconfigurable architectures (e.g. CGRAs) and AI compiler design, with a particular emphasis on MLIR-based compilation frameworks, tensor dataflow optimization, and heterogeneous CGRA–CPU systems for AI workloads.

I have published papers at venues such as DATE 2024 and DAC 2025, and have participated in national and international competitions in electronic design and GPU programming. I also have internship experience at leading technology companies.

  • Research Interests:
    • Coarse-Grained Reconfigurable Architectures (CGRAs)
    • Compiler design and MLIR-based compilation frameworks
    • LLM inference/training optimization on multi-GPU/NPU systems (parallelism, communication, scheduling)
    • AI deployment and dataflow-aware tensor optimization for DNNs
    • Polyhedral model and loop transformations
    • Heterogeneous accelerator–CPU systems and communication optimizations

Education

  • Ph.D. in Electronics Engineering, Fudan University, Shanghai, China · Sept. 2022 – Jun. 2027 (expected)
  • B.Sc. in Microelectronics, Fudan University, Shanghai, China · Sept. 2018 – Jun. 2022

Recent News

  • Jan 2026 — Joined ByteDance’s AI chip team as an intern, contributing to the software stack for LLM acceleration.
  • Dec 2025 — Co-hosting the “Fusion SoC tutorial-workshop” at FPT 2025 (ShanghaiTech University) on Dec 2. Come play with our CGRA toolchain! See https://fpt-2025.lin.pub
  • Nov 2025 — Shared a Chinese-language interpretation of the FPGA 2025 best paper, FlightVGM: Video-Generation Model Deployed on FPGA, on Zhihu.

Selected Publications

  • Adora Compiler: End-to-End Optimization for High-Efficiency Dataflow Acceleration and Task Pipelining on CGRAs — Jiahang Lou, et al.; Design Automation Conference (DAC), San Francisco, USA, June 21–25, 2025. GitHub: adora-compiler. A unified framework bridging user-friendly programming and high-performance acceleration for CGRAs through automated loop transformations, task/data-flow optimization, and systematic algorithms.
  • An Agile Deploying Approach for Large-Scale Workloads on CGRA-CPU Architecture — Jiahang Lou, et al.; Design, Automation & Test in Europe Conference & Exhibition (DATE), Valencia, Spain, March 22–25, 2024. A user-friendly MLIR-based multi-level compiler framework that bridges CGRA and RISC-V CPU architectures by automating optimizations and hardware–software partitioning for large-scale workloads.

  • View all publications

Competition Experiences - view details

Internship - view details

Awards - view details

Contact

Last updated: Jan. 19, 2026.