Academic Sponsorship

FedML Inc. sponsored the following academic events:

MLSys 2022 – Workshop on Cross-Community Federated Learning: Algorithms, Systems and Co-designs

https://crossfl2022.github.io/

Federated Learning (FL) has recently emerged as the overarching framework for distributed machine learning (ML) beyond data centers. FL, in both cross-device and cross-silo settings, enables collaborative ML model training on originally isolated data without sacrificing data privacy. This potential has attracted explosive attention from the ML, computer systems, optimization, signal processing, wireless networking, data mining, computer architecture, and privacy and security communities.

FL-related research is penetrating almost every science and engineering discipline. However, as FL comes closer to being deployable in real-world systems, many of today's open problems in FL cannot be solved solely by researchers in one community. For example, designing the most efficient and reliable FL algorithms requires leveraging expertise from the systems, security, signal processing, and networking communities. Conversely, designing the most efficient and scalable computing and networking systems requires leveraging collaborative advances from the ML, data mining, and optimization communities.

In light of the differences in educational backgrounds, toolboxes, viewpoints, and design principles across communities, this workshop aims to break community barriers and bring researchers from the pertinent communities together to address open problems in FL. More importantly, it aims to stimulate discussion among experts from different fields, in both industry and academia, and identify new problems that remain underexplored from an interdisciplinary perspective.

IJCAI 2022 – International Workshop on Trustworthy Federated Learning

https://federated-learning.org/fl-ijcai-2022/

Federated Learning (FL) is a learning paradigm that enables collaborative training of machine learning models in which the data reside and remain in distributed data silos throughout the training process. FL is a necessary framework for ensuring that AI thrives in the privacy-focused regulatory environment. Because FL allows self-interested data owners to collaboratively train machine learning models, end-users can become co-creators of AI solutions.

To enable open collaboration among FL co-creators and enhance adoption of the federated learning paradigm, we envision that communities of data owners must self-organize during FL model training based on diverse notions of trustworthy federated learning, including, but not limited to, security and robustness, privacy preservation, interpretability, fairness, verifiability, transparency, auditability, incremental aggregation of shared learned models, and healthy market mechanisms that enable open, dynamic collaboration among data owners under the FL paradigm. This workshop aims to bring together academic researchers and industry practitioners to address open issues in this interdisciplinary research area. For industry participants, we intend to create a forum for communicating problems that are practically relevant. For academic participants, we hope to make it easier to become productive in this area. The workshop will focus on the theme of building trustworthiness into federated learning to enable open, dynamic collaboration among data owners under the FL paradigm, and on making FL solutions readily applicable to real-world problems.

CIKM 2022 – Workshop on Federated Learning with Graph Data

https://sites.google.com/view/fedgraph2022

The field of graph data mining, one of the most important AI research areas, has been revolutionized by graph neural networks (GNNs), which benefit from training on real-world graph data with millions to billions of nodes and links. Unfortunately, training GNNs on graphs beyond millions of nodes is extremely costly on a centralized server, if not impossible. Moreover, due to increasing concerns about data privacy, emerging data from realistic applications are naturally fragmented, forming distributed private graphs across multiple “data silos”, among which direct transfer of data is forbidden. The nascent field of federated learning (FL), which aims to enable individual clients to jointly train their models while keeping their local data decentralized and completely private, is a promising paradigm for large-scale distributed and private training of GNNs.

The FedGraph workshop aims to bring together researchers from different backgrounds with a common interest in extending current FL algorithms to operate with graph data models such as GNNs. FL is an extremely hot topic of large commercial interest and has been intensively explored for machine learning with visual and textual data, but exploration by graph mining researchers and industrial practitioners has only recently begun to catch up. There are many unexplored challenges and opportunities, which urge the establishment of an organized and open community to collaboratively advance the science behind them. The prospective participants of this workshop will include researchers and practitioners from both the graph mining and federated learning communities, whose interests include, but are not limited to: graph analysis and mining, heterogeneous network modeling, complex data mining, large-scale machine learning, distributed systems, optimization, meta-learning, reinforcement learning, privacy, robustness, explainability, fairness, ethics, and trustworthiness.

ACL 2022 – Workshop on Federated Learning for Natural Language Processing (FL4NLP)

https://fl4nlp.github.io/

Due to increasing concerns and regulations about data privacy (e.g., the General Data Protection Regulation), coupled with the growing computational power of edge devices, data from real-world users have become much more fragmented, forming distributed private datasets across different clients (i.e., organizations or personal devices). To respect users’ privacy and comply with these regulations, we have to assume that a client’s data cannot be transferred to a centralized server or to other clients. For example, a hospital does not want to share its private data (e.g., conversations or questions asked on its website/app) with other hospitals. This is despite the fact that models trained on a centralized dataset (i.e., one combining data from all clients) usually enjoy better performance on downstream tasks (e.g., dialogue, question answering). Therefore, it is of vital importance to study NLP problems in such a scenario, where data are distributed across different isolated organizations or remote devices and cannot be shared due to privacy concerns.

The field of federated learning (FL) aims to enable many individual clients to jointly train their models while keeping their local data decentralized and completely private from other users and from a centralized server. A common training scheme in FL methods is that, in each round, each client sends its model parameters to the server, which updates the global model and sends it back to all clients. Since one client’s raw data are never exposed to others, FL promises to be an effective way to address the above challenges, particularly in the NLP domain, where much user-generated text contains sensitive, personal information.
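To make this round-based scheme concrete, below is a minimal, self-contained sketch of FedAvg-style training in Python with NumPy. The model (a simple linear regressor), the synthetic per-client datasets, and the helper names client_update and server_aggregate are illustrative assumptions for this sketch, not FedML's API.

```python
# Minimal FedAvg-style sketch (illustrative only, not FedML's API).
# Each round: clients train locally on their private data, send parameters
# to the server, which averages them and broadcasts the global model.
import numpy as np

rng = np.random.default_rng(0)

def client_update(global_w, X, y, lr=0.1, local_steps=5):
    """Run a few local gradient steps on the client's private data."""
    w = global_w.copy()
    for _ in range(local_steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # squared-error gradient
        w -= lr * grad
    return w

def server_aggregate(client_weights, client_sizes):
    """Weighted average of the client models (FedAvg)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Synthetic "private" datasets: 3 clients sharing a common ground truth.
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + 0.1 * rng.normal(size=50)
    clients.append((X, y))

global_w = np.zeros(2)
for round_idx in range(20):
    local_ws = [client_update(global_w, X, y) for X, y in clients]
    global_w = server_aggregate(local_ws, [len(y) for _, y in clients])

print("learned:", global_w, "true:", true_w)
```

Only model parameters cross the client-server boundary in this sketch; the per-client arrays X and y never leave the loop in which they were created, which is the property the paragraph above describes.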