Disentangled Representation Learning:
Approaches and Applications

IJCAI-ECAI 2022, Vienna, Austria

Speakers

Xin Wang Tsinghua University, China

Xin Wang is currently an Assistant Professor at the Department of Computer Science and Technology, Tsinghua University. He received both his Ph.D. and B.E. degrees in Computer Science and Technology from Zhejiang University, China, and also holds a Ph.D. degree in Computing Science from Simon Fraser University, Canada. His research interests include multimedia intelligence, machine learning, and their applications to multimedia big data. He has published high-quality research papers in top journals and conferences including IEEE TPAMI, IEEE TKDE, IEEE TMM, ICML, NeurIPS, ACM Multimedia, KDD, WWW, and SIGIR. He is a recipient of the 2017 China Postdoctoral Innovative Talents Supporting Program and received the ACM China Rising Star Award in 2020.

Hong Chen Tsinghua University, China

Hong Chen is a Ph.D. candidate in machine learning at Tsinghua University, China. His research interests include disentangled representation learning, meta learning, and multi-modal learning. He has published several papers in top-tier conferences and journals, including NeurIPS, TPAMI, and ICME.

Wenwu Zhu Tsinghua University, China

Wenwu Zhu is currently a Professor in the Department of Computer Science and Technology at Tsinghua University and the Vice Dean of the National Research Center for Information Science and Technology. Prior to his current post, he was a Senior Researcher and Research Manager at Microsoft Research Asia. He was the Chief Scientist and Director at Intel Research China from 2004 to 2008, and worked at Bell Labs New Jersey as a Member of Technical Staff from 1996 to 1999. He received his Ph.D. degree from New York University in 1996.

His current research interests are in the areas of data-driven multimedia networking and cross-media big data computing. He has published over 350 refereed papers and is an inventor or co-inventor of over 50 patents. He has received eight Best Paper Awards, including ACM Multimedia 2012 and the IEEE Transactions on Circuits and Systems for Video Technology awards in 2001 and 2019.

He served as Editor-in-Chief of IEEE Transactions on Multimedia (2017-2019), and served on the steering committees of IEEE Transactions on Multimedia (2015-2016) and IEEE Transactions on Mobile Computing (2007-2010). He served as General Co-Chair of ACM Multimedia 2018 and ACM CIKM 2019. He is an AAAS Fellow, IEEE Fellow, SPIE Fellow, and a member of the Academy of Europe (Academia Europaea).

Tutorial Description

Discovering and recognizing the hidden factors behind observable data is a crucial step toward machine learning algorithms that better understand the world. However, this remains a challenging problem for current deep learning models, which rely heavily on data representations. To address this challenge, disentangled representation learning, a cutting-edge topic in both academia and industry, aims to learn a representation for each object in which different parts express different (disentangled) semantics, thereby improving the explainability and controllability of machine learning models. Notably, it has achieved great success in diverse fields such as image/video generation, recommender systems, and graph neural networks, covering areas ranging from computer vision to data mining.

In this tutorial, we will disseminate and promote recent research achievements on disentangled representation learning as well as its applications, an exciting and fast-growing research direction in the general field of machine learning. We will also advocate novel, high-quality research findings and innovative solutions to the challenging problems in disentangled representation learning. The tutorial consists of five parts. We first give a brief introduction to the research and industrial motivation, followed by discussions on the basics, fundamentals, and applications of disentangled representation learning. We then discuss recent advances, covering disentangled graph representation learning and disentangled representation learning for recommendation, and finally share our insights on future trends in disentangled representation learning.


Tutorial Outline

The tutorial can be scheduled for either a quarter-slot or a half-slot, depending on the actual needs of the conference, and is organized into the following five sections.

  • The research and industrial motivation
  • Basics and fundamentals of disentangled representation learning
  • Applications of disentangled representation learning
  • Recent advances of disentangled representation learning
  • Discussions and future directions

Target Audience and Prerequisites

This tutorial will be highly accessible to the whole AI community, including researchers, students, and practitioners interested in disentangled representation learning and its applications in AI-related tasks. The tutorial will be self-contained and designed for introductory and intermediate audiences. No special prerequisite knowledge is required, but attendees are expected to have basic knowledge of machine learning, linear algebra, and calculus. In particular, audiences who have engaged with related topics (e.g., deep learning, reinforcement learning, information theory, causal inference) are welcome to take part in Q&A interaction during the tutorial.


Motivation, Relevance and Rationale

This tutorial aims to disseminate and promote recent research achievements on disentangled representation learning as well as its applications, an exciting and fast-growing research direction in the general field of machine learning, and to advocate novel, high-quality research findings and innovative solutions to its challenging problems. This topic is at the core of the scope of IJCAI and is attractive to IJCAI audiences from both academia and industry. The objective of "Motivate and explain a topic of emerging importance for AI" will be well served by this tutorial.


Tutorial Overview

We introduce the most recent updates and advances in disentangled representation learning over the past years. The discussion is organized around three aspects: i) basics and fundamentals of disentangled representation learning, ii) suitable application scenarios of disentangled representation learning, and iii) recent advances, including disentangled graph representation learning and disentangled representation learning for recommendation.

Basics and fundamentals of disentangled representation learning

Current popular methods for learning disentangled representations can be roughly divided into four categories: VAE-based, GAN-based, clustering-based, and knowledge-guided methods.

  • VAE-based methods obtain disentangled representations of an object from the perspective of probabilistic generative models. By modeling the object generation process with disentangled factors, this line of approaches obtains dimension-wise disentangled representations.
  • Different from VAE-based methods, GAN-based methods focus on disentangling the expected factors of the object rather than disentangling each dimension of its representation.
  • Clustering-based methods utilize the similarities and differences among data points to learn the disentangled concepts behind the observable data.
  • Knowledge-guided methods usually provide supervision to the model so that designated parts of the learned representation possess the expected semantics.
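
    The dimension-wise pressure behind the VAE-based category can be made concrete with a β-VAE-style objective: a reconstruction term plus a β-weighted KL term pushing a diagonal Gaussian posterior toward an isotropic prior. A minimal NumPy sketch (function and variable names are illustrative, not from the tutorial materials):

    ```python
    import numpy as np

    def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
        """Illustrative beta-VAE objective: reconstruction error plus a
        beta-weighted KL term that presses each latent dimension toward
        an isotropic Gaussian prior, encouraging dimension-wise
        disentanglement."""
        # Reconstruction term: squared error summed over features,
        # averaged over the batch.
        recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
        # KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior,
        # in closed form, summed over latent dimensions.
        kl = -0.5 * np.mean(np.sum(1 + log_var - mu ** 2 - np.exp(log_var),
                                   axis=1))
        return recon + beta * kl

    # Sanity check: a posterior that already matches the prior has zero
    # KL, so with a perfect reconstruction the loss vanishes.
    loss = beta_vae_loss(np.ones((2, 3)), np.ones((2, 3)),
                         np.zeros((2, 4)), np.zeros((2, 4)))
    ```

    Setting beta above 1 trades reconstruction fidelity for stronger independence among latent dimensions, which is the knob this family of methods turns to encourage disentanglement.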
Applications of disentangled representation learning

Disentangled representation learning finds wide application in areas related to deep learning thanks to its explainability and controllability. For example, in image generation, when we have a disentangled representation for an image, it becomes easy to generate images with specific semantic attributes. Similarly, in recommendation, when we disentangle users' click behavior within the latent representations, recommender systems can provide users not only with items but also with the potential reasons why the target users may like those items, thus improving explainability in representation learning.
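
    The controllability described above can be illustrated with a latent traversal: given a disentangled code, editing a single coordinate changes only the attribute that dimension encodes. A small hypothetical sketch (the decoder that would render each edited code is omitted):

    ```python
    import numpy as np

    def traverse_latent(z, dim, values):
        """Build a batch of latent codes that vary one disentangled
        dimension while holding all others fixed; decoding each code
        (not shown) would sweep a single semantic attribute."""
        codes = []
        for v in values:
            z_edit = z.copy()
            z_edit[dim] = v  # edit only the targeted semantic factor
            codes.append(z_edit)
        return np.stack(codes)

    # Sweep dimension 2 of an 8-dimensional code across three values.
    grid = traverse_latent(np.zeros(8), dim=2, values=[-2.0, 0.0, 2.0])
    ```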

Recent advances of disentangled representation learning

Given that disentangled representation learning for visual data has been largely studied in the past few years, we focus on more recent advances, including disentangled representation learning for relational data structured as graphs and disentangled representation learning for user behaviors in recommendation. For graph representation learning, the underlying reasons for edges connecting different nodes can differ, so one advantage of learning disentangled representations is discovering the semantic meaning carried by these edges. For user behavior data, which can be more complex and highly entangled, it is more challenging and interesting to learn representations capable of disentangling the hidden patterns in the observed data.
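
As a concrete illustration of the graph case, the idea of routing each neighbor to a latent edge factor can be sketched as channel-wise aggregation, loosely in the spirit of neighborhood routing; the shapes, names, and single-pass simplification here are illustrative assumptions, not an exact published algorithm:

```python
import numpy as np

def disentangled_aggregate(x_self, x_neighbors, K):
    """Sketch of channel-wise disentangled neighborhood aggregation:
    split each node embedding into K channels (latent edge factors),
    softly assign every neighbor to the channel it matches best, then
    aggregate neighbors per channel."""
    d = x_self.shape[0] // K
    self_ch = x_self.reshape(K, d)           # (K, d): target node channels
    nbr_ch = x_neighbors.reshape(-1, K, d)   # (n, K, d): neighbor channels
    # Similarity of each neighbor channel to the target node's channel.
    scores = np.einsum('kd,nkd->nk', self_ch, nbr_ch)
    # Soft assignment of each neighbor to one latent factor (channel).
    p = np.exp(scores - scores.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)        # (n, K)
    # Channel-wise aggregation weighted by the assignment.
    agg = np.einsum('nk,nkd->kd', p, nbr_ch)
    return (self_ch + agg).reshape(-1)       # updated disentangled embedding

# Toy call: one node with 3 neighbors, 6-dim embeddings, 2 channels.
out = disentangled_aggregate(np.ones(6), np.ones((3, 6)), K=2)
```

Each channel of the output then captures the neighbors connected for one latent reason, which is what lets the model expose the semantics carried by individual edges.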