CV

The cutoff date for this data is Jun 6, 2024.

Basics

Name Zonghang Li
Label Ph.D.
Email lizhuestc@gmail.com
WeChat lizh_uestc
Homepage https://lizonghang.github.io
GitHub https://github.com/Lizonghang
Google Scholar https://scholar.google.com/citations?hl=en&user=1IA-XokAAAAJ
Summary A young geek and scholar who loves coding and exploring new technologies to turn ambitious ideas into reality.

Work

  • 2024 - Present

    Abu Dhabi, UAE

  • 2020 - 2023

    Chengdu, CN

    Academic Instructor
    Yingcai Honors College of UESTC
    Guiding undergraduate students at the Yingcai Honors College of UESTC in conducting academic research and publishing high-quality papers.
    • My student Shenglai Zeng was selected as an outstanding student of UESTC and is pursuing his Ph.D. at Michigan State University.
  • 2019 - 2020

    Shenzhen, CN

    Invited Technical Instructor
    Peng Cheng Laboratory (PCL)
    Guiding PCL researchers to develop a communication-efficient geo-distributed machine learning system.
    • The developed system was adopted by PCL.

Education

  • 2021 - 2022

    Singapore

    Visiting Scholar
    Nanyang Technological University
    School of Computer Science and Engineering
  • 2018

    Oxford, UK

    Visiting Scholar
    University of Oxford
    Lady Margaret Hall
  • 2014 - 2024

    Chengdu, CN

    Bachelor's and Ph.D.
    University of Electronic Science and Technology of China
    School of Information and Communication Engineering

Awards

Skills

Coding
MXNet
PyTorch
Llama-Factory
Ollama
Megatron-LM
DeepSpeed
Accelerate
Transformers
Stable Diffusion
Gradio
Kohya-ss
Research Interests
Distributed AI Systems
Efficient LM Training, Fine-Tuning, and Inference
Advanced Networks for Distributed AI
Federated Learning
LM Agent
Reinforcement Learning
Generative AI Applications
Future AI Technologies

Talks

Projects

  • 2018 - Now
    GeoMX - Accepted and adopted by ZTE Co., Ltd.
    GeoMX is a fast and unified distributed system for training ML models across geographically distributed data centers, which offers a 20x speedup under identical network conditions.
  • 2022 - Now
    NetStorm - To appear in IEEE/ACM TON (CCF A)
    NetStorm is a topology-adaptive and communication-efficient system designed for geo-distributed machine learning training, which achieves a speedup of 7.5x to 9.2x over the standard GeoMX system.
  • 2023 - Now
    KlonetAI - An intelligent agent adopted in a work accepted at NSDI 24 (CCF A)
    Klonet is designed to support the deployment and testing of new network protocols and applications, such as distributed machine learning and federated learning, in a realistic environment. KlonetAI provides an AI agent for intelligent interaction with the Klonet platform.
  • 2022 - 2023
    AGOD - AI-generated optimization decision accepted by IEEE TMC (CCF A)
    This project implements the system design and the deep diffusion soft actor-critic (D2SAC) algorithm.
  • 2022 - 2023
    PerSF-SemCom - Personalized saliency-based semantic communication accepted by IEEE JSAC (CCF A)
    This project implements an energy-efficient, task-oriented semantic communication framework that represents image information at the semantic level with a triple-based scene graph. It also designs a personalized semantic encoder based on user interests to meet the requirements of personalized saliency.
  • 2019 - 2021
    NBSync - An asynchronous pipelining scheduler accepted by IEEE TSC (CCF A)
    NBSync is a novel training algorithm for distributed ML over WANs, which greatly speeds up model training by running local computing and global synchronization in parallel. NBSync employs a well-designed pipelining scheme that relaxes the sequential dependency between local computing and global synchronization and processes them in parallel, so that their overheads overlap in time. NBSync also enables flexible, differentiated, and dynamic local computing for workers to maximize the overlap ratio in dynamically heterogeneous training environments.
  • 2018 - 2019
    ESync - An efficient DML synchronization algorithm accepted by IEEE TSC (CCF A)
    ESync is an efficient synchronization algorithm designed for distributed ML tasks in heterogeneous clusters (clusters whose computing devices differ in computing capability).
  • 2018 - Now
    Other Programs
    These programs are closed-source due to IP and confidentiality agreements.
    • 2018-2020: Advanced Distributed Machine Learning Techniques. Provincial and Ministerial Key Program. Approved.
    • 2018-2019: Advanced Data Center Network Architectures. Huawei Technologies Co., Ltd. Approved.
    • 2019-2020: Communication Optimizations for Distributed Machine Learning over WANs. Peng Cheng Laboratory. Approved.
    • 2021-Now: Computing Power Network and New Communication Primitives. ZTE Communication Co., Ltd. In progress.
    • 2022-2023: Accelerating Data Transmission for Geographically Distributed Machine Learning. Zhejiang Lab. Approved.
    • 2022-2023: Advanced Network Technologies for Giant Connections, Large Traffic, and Low Latency in the Rapid Evolution of 5G/B5G. National Key Research and Development Program. Approved.