ACML 2021 Workshop on Machine Learning for Mobile Robot Vision and Control (MRVC)


MRVC Workshop

In recent years, advances in machine learning for vision and control applications have supported the increasing demand for mobile robots. This demand has surged especially in the past several months, arguably due to the spread of COVID-19. While mobile robots are typically expected to work in controlled environments (e.g., supply chain automation in factories), more challenging unconstrained situations (e.g., cleaning and sanitizing) have also begun attracting attention, which in turn makes the automation and safety of mobile robots a serious concern.

To enable mobile robots to meet such demands, equipping them with satisfactory vision and control capabilities is necessary and has become the key focus of relevant robotics research. Many sophisticated computer vision, machine learning, and robotics approaches have been developed toward this aim, including but not limited to semantic segmentation, optical flow estimation, depth estimation, object detection and tracking, domain adaptation, sim-to-real transfer, and reinforcement learning and imitation learning for robot navigation. However, these advances have not yet translated into significant progress in practical mobile robot applications because effective real-world data samples remain insufficient, leading to unsatisfactory performance and safety concerns during deployment.

Moreover, effectively collecting data and efficiently utilizing it for training vision and control models, especially in unconstrained outdoor environments, has raised a number of fundamental challenges for mobile robot applications. These challenges include several open but crucial issues, such as multimodal sensing, privacy, and human activity recognition and prediction, as well as constraints on batteries, computing capabilities, and field of view.

To better understand the aforementioned issues and improve current solutions, this MRVC workshop presents a timely opportunity to bring together researchers from the computer vision, machine learning, and robotics communities to discuss the unique challenges and opportunities for mobile robots.


* All times shown on this website are in UTC+08:00.

17 November 2021

Workshop Events

Opening remarks

14:00 - 14:05


Keynote (I): Dr. Simon See - Senior Director, NVIDIA AI Technology Center
AI and Simulation

14:05 - 14:30

Keynote (II): Dr. Yoko Sasaki - Senior Researcher at National Institute of Advanced Industrial Science and Technology, Japan
Autonomous mobile robot in human living space

14:35 - 15:00

Keynote (III): Dr. Shang-Hong Lai - Professor at NTHU and Principal Researcher at Microsoft
Deep Multimodal Learning for Face Recognition

15:05 - 15:30

Invited paper presentation

Time-Constrained Multi-Agent Path Finding in Non-Lattice Graphs with Deep Reinforcement Learning

15:30 - 15:45

Accepted paper presentations

Sim-to-Real: Virtual Guidance for Robot Navigation

15:48 - 16:00

Less Reward is More: Improved Reinforcement Learning Control of a Mobile Manipulator using Clamped Joints

16:03 - 16:15

A Fine-grained Dynamic Inference Architecture for Semantic Image Segmentation

16:18 - 16:30

Closing remarks

16:30 - 16:35


Unofficial post-workshop discussion session.

16:35 - 17:30

Call for submissions

Call for Short Papers and Extended Abstracts

Please submit papers via CMT.

MRVC-21 solicits contributions at the intersection of mobile robotics, machine learning, and computer vision. Specific topics of interest include:

  • Machine learning for mobile robot control
    • Reinforcement learning and imitation learning for robot navigation and exploration
    • Cooperative multi-agent control, learning to cooperate and communicate
    • Probabilistic learning and representation of uncertainty in robotics
  • Machine learning for unconstrained and real environments
    • Domain adaptation, meta learning, and sim-to-real transfer for mobile robots
    • Federated learning
    • Privacy-preserving ML
  • Computer vision for mobile robots
    • Vision-based localization and mapping (Visual Odometry, SLAM)
    • Low-level vision (optical flow, depth estimation)
    • Semantic segmentation
    • Trajectory forecasting
    • Human activity recognition and prediction
    • Multimodal perception, sensor fusion, and computer vision
    • Embodied vision
  • Other applications of learning in robot manipulation, navigation, driving, flight, and other areas of robotics

Author guidelines
All submissions should be in the ACML-21 format, with a maximum of eight pages for short papers and four pages for extended abstracts, including references and appendices. Both unpublished and already-published works are welcome. Reviews will be single-blind. Accepted papers will be published on the workshop website and presented as either short talks or posters. Please note that we will NOT publish any official proceedings, so that participants can submit their work to future conferences based on the feedback from the workshop.


Important Dates

* All times shown on this website are in UTC+08:00.

Event                                                      Date
Submission opens for short papers and extended abstracts   12 September 2021
Submission closes for short papers and extended abstracts  15 October 2021
Review period begins                                       15 October 2021
Review period ends                                         25 October 2021
Discussion of paper acceptance by committee members        To be announced
Notification of paper acceptance                           To be announced
Deadline for camera-ready manuscripts and posters          To be announced
Workshop date                                              17 November 2021

Brief Biography of the Organizers

The organizers of this workshop include professionals and experts from both academia and industry (National Tsing Hua University, Tokyo Institute of Technology, Keio University, OMRON SINIC X, NVIDIA Corporation, Microsoft, and DeepMind).

Organization Committee

  • Chun-Yi Lee (National Tsing Hua University, Taiwan)
  • Ryo Yonetani (OMRON SINIC X / Keio University, Japan)
  • Asako Kanezaki (Tokyo Institute of Technology, Japan)
  • Mohammadamin Barekatain (DeepMind, UK)
  • Simon See (NVIDIA AI Technology Center, Singapore)
  • Shang-Hong Lai (Microsoft AI R&D Center, Taiwan)

Program committee



Chun-Yi Lee

Chun-Yi Lee is an Associate Professor of Computer Science at National Tsing Hua University (NTHU), Hsinchu, Taiwan, and is the supervisor of Elsa Lab. Prof. Lee’s research focuses on deep reinforcement learning (DRL), intelligent robotics, computer vision (CV), and parallel computing systems.


Ryo Yonetani

Ryo Yonetani is a principal investigator at OMRON SINIC X, Japan and a project assistant professor at Keio University. His research interests include human activity recognition, visual forecasting, federated learning, and transfer learning.


Asako Kanezaki

Asako Kanezaki is an associate professor at Tokyo Institute of Technology. Her research interests include object detection, 3D shape recognition, and robot applications.


Mohammadamin Barekatain

Mohammadamin Barekatain is a Research Engineer at DeepMind, UK. His research interests include reinforcement learning, computer vision, and algorithmic reasoning.


Simon See

Simon See is currently the Solution Architecture and Engineering Director and Chief Solution Architect for the NVIDIA AI Technology Center. His research interests are in the areas of high-performance computing, big data, artificial intelligence, machine learning, computational science, applied mathematics, and simulation methodology.


Shang-Hong Lai

Shang-Hong Lai is a professor at National Tsing Hua University (NTHU), Taiwan, and is a principal researcher at Microsoft AI R&D Center. Dr. Lai’s research interests are mainly focused on computer vision, image processing, and machine learning.

Plans on the Virtual Workshop

Virtual platforms
We will select among Zoom (for keynote and oral presentations), Gather.Town (oral and/or poster presentations), and Teams, Slack, or Discord (asynchronous text-based discussions) as our platforms. These services offer good interfaces for desktops, smartphones, and tablets, and are commonly used for virtual conferences and everyday online discussions. We believe that using them in combination will make the workshop easily accessible to diverse attendees.

Public accessibility
Accepted papers and posters will also be made available on our workshop website. The workshop presentations will be video recorded and made public during and/or after the workshop.