Publications
Preprint
- FedML: A Research Library and Benchmark for Federated Machine Learning
Chaoyang He*, Songze Li, Jinhyun So, Mi Zhang, Hongyi Wang, Xiaoyang Wang, Praneeth Vepakomma, Abhishek Singh, Hang Qiu, Li Shen, Peilin Zhao, Yan Kang, Yang Liu, Ramesh Raskar, Qiang Yang, Murali Annavaram*, Salman Avestimehr*
Preprint, 2020
Spotlight paper with Contributed Talk at SpicyFL@NeurIPS 2020
* means corresponding authors.
[BibTex] [Homepage] [Arxiv] [Slack] [Documentation] [Video] [Slides] [Code]
Highlights: We are building an open source project for federated learning! https://fedml.ai
- Central Server Free Federated Learning over Single-sided Trust Social Networks
Chaoyang He, Conghui Tan, Hanlin Tang, Shuang Qiu, Ji Liu
Preprint, 2020
A short version has been accepted to SpicyFL@NeurIPS 2020 (Lightning Talk)
[BibTex] [Arxiv] [Lightning Talk] [Poster] [Code]
- Cascade-BGNN: Toward Efficient Self-supervised Representation Learning on Large-scale Bipartite Graphs
Chaoyang He*, Tian Xie*, Yu Rong, Wenbing Huang, Junzhou Huang, Xiang Ren, Cyrus Shahabi
Preprint, 2020
[BibTex] [Arxiv] [Code]
Peer-Reviewed Publications
- Advances and Open Problems in Federated Learning
Peter Kairouz, H Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Keith Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael GL d’Oliveira, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adria Gascón, Badih Ghazi, Phillip B Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, Justin Hsu, Martin Jaggi, Tara Javidi, Gauri Joshi, Mikhail Khodak, Jakub Konečný, Aleksandra Korolova, Farinaz Koushanfar, Sanmi Koyejo, Tancrede Lepoint, Yang Liu, Prateek Mittal, Mehryar Mohri, Richard Nock, Ayfer Özgür, Rasmus Pagh, Mariana Raykova, Hang Qi, Daniel Ramage, Ramesh Raskar, Dawn Song, Weikang Song, Sebastian U Stich, Ziteng Sun, Ananda Theertha Suresh, Florian Tramèr, Praneeth Vepakomma, Jianyu Wang, Li Xiong, Zheng Xu, Qiang Yang, Felix X Yu, Han Yu, Sen Zhao
[BibTex] [Arxiv] [Jeff Dean’s Tweet]
Accepted to FnTML 2020 (Foundations and Trends in Machine Learning; editor-in-chief: Michael Jordan)
Highlight: 105 pages, 485 references, 22 Googlers, and 36 academics at 24 institutions!
- Group Knowledge Transfer: Federated Learning of Large CNNs at the Edge
Chaoyang He, Murali Annavaram, Salman Avestimehr
(To appear) NeurIPS 2020 (2020 Conference on Neural Information Processing Systems)
[BibTex] [Arxiv] [NeurIPS Official Site Video Presentation] [Gather Town] [Project Page] [Poster] [Slides] [Code]
- Towards non-I.I.D. and invisible data with FedNAS: Federated Deep Learning via Neural Architecture Search
Chaoyang He, Murali Annavaram, Salman Avestimehr
Accepted to CVPR 2020 Workshop on Neural Architecture Search and Beyond for Representation Learning, 2020
[BibTex] [Arxiv] [video] [Code]
- MiLeNAS: Efficient Neural Architecture Search via Mixed-Level Reformulation
Chaoyang He*, Haishan Ye*, Li Shen, Tong Zhang
Accepted to CVPR 2020 (IEEE/CVF Conference on Computer Vision and Pattern Recognition), 2020
[BibTex] [arxiv] [video] [Code]
- Collecting Indicators of Compromise from Unstructured Text of Cybersecurity Articles using Neural-Based Sequence Labelling
Zi Long, Lianzhi Tan, Shengping Zhou, Chaoyang He, Xin Liu
Accepted to IJCNN 2019 (International Joint Conference on Neural Networks), 2019
[BibTex] [Arxiv]
Keywords: Natural Language Processing
Production
I was a full-stack software engineer and team manager, with extensive experience shipping successful Internet products. My experience spans nearly all areas of Internet-related software systems development, including cloud computing, distributed systems, machine learning systems, applied machine learning, operating systems, and mobile applications.
- Tencent Cloud. 2017-2018
Worked on business-oriented cloud computing solutions; led a team that developed the 1st generation cloud computing solution for the automotive industry.

- Tencent Venus Distributed Machine Learning System. 2016-2017
Worked with Shengping Zhou (now CTO@AlphaCloud) to develop the large-scale distributed machine learning system for MIG, Tencent. The platform supports various machine learning algorithms and models, including LR, SVM, GBDT, and DNN. One more thing worth highlighting: one module of Tencent Venus contains the first federated learning system put into practice in China, and perhaps in the world, earlier than Google FL. At Tencent, we supported vertical federated learning infrastructure for WeBank so that Tencent social network features could be incorporated into their financial models. One to two years later, the core techniques of this project were documented in a KDD 2019 publication, “FDML: A Collaborative Machine Learning Framework for Distributed Features”.

- Tencent Location Data Computing Platform. 2016-2017
Worked on streaming data mining systems and algorithms. The platform serves daily users at the billion scale; I learned the entire system architecture and the related real-time data analytics algorithms.

- Speech Recognition and Natural Language Processing for Tencent Automotive Operating System. 2015-2016
This project inspired me to explore ML models, algorithms, and systems. Deep learning was very popular that year, and we developed a deep learning model compression engine for our product (at that time, TensorFlow Lite had not been released yet).

- Tencent Games: a Pokemon Go-like Mobile Game Engine 2016
Worked on a Unity3D- and C++-based game engine. This was a cross-group collaboration and a lot of fun; I learned a great deal about the engineering and production culture at Tencent Games.

- Tencent AI in Car (also called Tencent Automotive Service). 2014-2016
I led an operating system team and worked on the embedded operating system (Android/QNX) and related back-end cloud services, including more than 10 applications and core system services such as WeChat MyCar, Tencent Maps, Tencent Music, and Tencent Video. Please watch the video below to get a taste of how Chinese Internet companies advertise their products :-)

- As a Team Leader, I Developed the 1st generation of Smartphone Navigation SDK, Application, and Service in China (Baidu Navigation SDK). 2012-2013
Baidu Navigation SDK, Application, and Service is a cross-platform (C++/Java/Objective-C) engine. It involves computer graphics (OpenGL ES), routing algorithms (A*, CH), offline and online data compilers and networking services (Hadoop; C++/PHP backend), complex navigation UI message state control and dispatch, and a cross-platform, layered, modular system design. I am familiar with all of these core modules.

- As a Software Engineer, I Contributed to the First Generation of Huawei Smartphone. 2011-2012
The picture below shows Huawei Ascend P6 (Emotion UI)

Patents
- Method for recognizing semantics, device storage medium and electronic device
Patent number: CN109933774A
- A kind of voice recognition processing method and device
Patent number: CN107293294A
- The data-pushing of vehicle cloud platform, treating method and apparatus
Patent number: CN108833489A
- Method and apparatus for playing instant message sound on vehicle-mounted terminal
Patent number: CN105530171A
- Resource obtaining method, terminal and vehicle terminal
Patent number: CN105610978A
- Information interaction method and vehicle terminal
Patent number: CN105791395A
- A kind of application development method, device, equipment and development frame
Patent number: CN107179916A
- The rendering method and device of map, storage medium, electronic device
Patent number: CN109544658A
- Selective recording method after realizing break-in of dispatching console in digital cluster system
Patent number: CN101969444A
Technical Report
My industrial experience spans cloud computing, distributed systems, applied machine learning, data computing platforms, mobile computing (Android/iOS/IoT devices), and their applications in AI, Speech Recognition, Games, Maps, and Navigation.
- 2018 – Data-Driven Cloud Computing Platform and Machine Learning System for the Internet of Vehicles
- 2016 – The Large-Scale Distributed System and Real-Time Location Data Mining
- 2017 – TAI: An Intelligent Operating System and Open Platform
- 2016 – Speech Recognition System for Connected and Autonomous Driving Car
- 2016 – WeGameMap: Real World Map Rendering Engine for Location-Based Mobile Game
- 2015 – WeLink: a High-Performance Vehicle-Mobile Networking Library
- 2014 – TMAP: A System Framework for Mobile Maps and Navigation
- 2016 – Efficient Spatial Anti-Aliasing Rendering for Line Joins on Vector Maps
- 2014 – WeCross: a Mobile and Vehicle C/C++ Cross-Platform Library
- 2017 – A High Precision Private Car Trajectory Dataset and an Open Source Location Data Computing Platform
- 2017 – An Open Source Highly Reliable Location SDK for the Automotive Industry
- 2013 – The First Generation Mobile Navigation SDK Design in China
- 2012 – Map State Switching under Multi-trigger Condition Based on Finite State Machine Design Pattern
- 2012 – A General Downloader Engine for iOS/Android Platform
- 2012 – The Best Practice for Java Native Interface (JNI) Development on Android Operating Platform
- 2014 – Taxi-Calling Platform and the Open Source Code
- 2015 – A UX design of Bluetooth Steering Wheel Wireless Controller and the communication system design
- 2016 – Sliding Window Algorithm Template to Solve All the Leetcode Substring Search Problems
Leetcode
An Algorithmic Source Code Template for Solving Many Substring Search Problems, 2016
(1.8K upvotes, 2000+ stars, 167.8K views)
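For readers unfamiliar with the sliding-window idea behind such a template, here is a minimal sketch in Python. It is not the original post's code: the function name `min_window` and the choice of the "minimum window substring" problem are assumptions made for illustration. The pattern is to expand the right edge until the window is valid, then shrink from the left while it stays valid.

```python
from collections import Counter


def min_window(s: str, t: str) -> str:
    """Return the shortest substring of s that contains every character of t
    (counting duplicates), or "" if no such substring exists.

    Illustrative sketch only; names and problem choice are assumptions.
    """
    if not t or len(t) > len(s):
        return ""

    need = Counter(t)   # per-character deficit; negative means surplus in window
    missing = len(t)    # total required characters not yet covered by the window
    best_start, best_len = 0, len(s) + 1
    left = 0

    for right, ch in enumerate(s):
        # Expand: take s[right] into the window.
        if need[ch] > 0:
            missing -= 1
        need[ch] -= 1

        # Shrink: while the window is valid, record it and drop s[left].
        while missing == 0:
            if right - left + 1 < best_len:
                best_start, best_len = left, right - left + 1
            need[s[left]] += 1
            if need[s[left]] > 0:
                missing += 1
            left += 1

    return s[best_start:best_start + best_len] if best_len <= len(s) else ""
```

Each index enters and leaves the window at most once, so the scan is linear in the length of `s`. Other substring problems can usually reuse the same two-pointer skeleton by changing only the validity condition and what is recorded.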