ARI humanoid robot imitates human gaze behaviour using reinforcement learning in real-world environments

About the Paper

This paper presents a novel approach to enhancing the social interaction capabilities of the ARI humanoid robot using reinforcement learning. We focus on enabling ARI to imitate human attention/gaze behaviour by identifying salient points in dynamic environments, using the Zero-Shot Transfer technique combined with domain randomisation and generalisation. Our methodology employs the Proximal Policy Optimisation algorithm, training the reinforcement learning agent in a simulated environment to maximise robustness in real-world scenarios. We demonstrated the efficacy of our approach by deploying the trained agent on the ARI humanoid and validating its performance in human-robot interaction scenarios. The results indicated that ARI can successfully identify and respond to salient points, exhibiting human-like attention/gaze behaviour, an important step towards acceptability and efficiency in human-robot interaction. This research contributes to advancing the capabilities of social robots in dynamic and unpredictable environments, highlighting the potential of combining Zero-Shot Transfer with domain randomisation and generalisation for robust real-world applications.

Demonstration Videos

This research presents an innovative method to enhance the ARI humanoid robot's ability to engage in dynamic social interactions by imitating human gaze behaviours using reinforcement learning. The study tackles key challenges in deploying reinforcement learning agents in real-world environments, particularly the gap between simulation and real-world performance, known as the sim-to-real transfer problem. The method leverages Zero-Shot Transfer (ZST), domain randomisation, and generalisation techniques, enabling ARI to seamlessly adapt from a simulated training environment to unpredictable, real-world scenarios without additional training.
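As a rough illustration of what domain randomisation can look like in this setting, the sketch below resamples simulated scene parameters at the start of every training episode so the policy never overfits to a single fixed scene. The parameter names and ranges are illustrative assumptions, not the values used in the paper.

```python
# Minimal domain-randomisation sketch (illustrative parameters, not the paper's).
import random
from dataclasses import dataclass


@dataclass
class SceneConfig:
    """Randomised properties of one simulated training episode."""
    num_people: int          # how many participants appear in the scene
    lighting: float          # relative light intensity
    camera_noise_std: float  # noise added to rendered observations
    speaking_prob: float     # chance that any participant is speaking
    gesture_prob: float      # chance that any participant is gesturing


def randomise_scene(rng: random.Random) -> SceneConfig:
    """Sample a fresh scene configuration; called once per episode reset."""
    return SceneConfig(
        num_people=rng.randint(1, 5),
        lighting=rng.uniform(0.4, 1.2),
        camera_noise_std=rng.uniform(0.0, 0.05),
        speaking_prob=rng.uniform(0.1, 0.9),
        gesture_prob=rng.uniform(0.1, 0.9),
    )


if __name__ == "__main__":
    rng = random.Random(0)
    for episode in range(3):
        print(f"episode {episode}: {randomise_scene(rng)}")
```

Because the policy only ever sees these randomised variations during training, zero-shot transfer then amounts to deploying the same frozen policy on the real robot, relying on real scenes falling within the distribution covered in simulation.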

The reinforcement learning agent is trained using the Proximal Policy Optimisation (PPO) algorithm, integrated with an attention model that allows ARI to focus on salient points in its environment. By simulating various human activities, such as speaking and hand gestures, and considering proximity, the robot learns to prioritise individuals in a scene based on their social relevance. A Gaze Control System (GCS) is designed to guide the robot's gaze behaviour, rewarding it for identifying and attending to more socially important participants.
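A minimal sketch of how such a salience-based reward might be wired into PPO training is shown below, using Gymnasium and stable-baselines3. The environment, feature encoding, and salience weights are simplified assumptions for illustration; the paper's Gaze Control System and simulator are more elaborate.

```python
# Sketch of a salience-rewarded gaze environment trained with PPO.
# Feature encoding and reward weights are illustrative assumptions.
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO

MAX_PEOPLE = 4  # fixed upper bound so the observation has a constant shape


class GazeEnv(gym.Env):
    """Observe [speaking, gesturing, proximity] per person; choose whom to attend to."""

    def __init__(self):
        super().__init__()
        self.observation_space = spaces.Box(
            low=0.0, high=1.0, shape=(MAX_PEOPLE * 3,), dtype=np.float32)
        self.action_space = spaces.Discrete(MAX_PEOPLE)

    def _salience(self, features: np.ndarray) -> np.ndarray:
        # Heavier weight on speaking, then gesturing, then being close.
        return 1.0 * features[:, 0] + 0.5 * features[:, 1] + 0.3 * features[:, 2]

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._obs = self.np_random.random(MAX_PEOPLE * 3).astype(np.float32)
        return self._obs, {}

    def step(self, action):
        salience = self._salience(self._obs.reshape(MAX_PEOPLE, 3))
        # Reward attending to the most socially salient person, small penalty otherwise.
        reward = 1.0 if action == int(np.argmax(salience)) else -0.1
        self._obs = self.np_random.random(MAX_PEOPLE * 3).astype(np.float32)
        return self._obs, reward, False, False, {}


if __name__ == "__main__":
    model = PPO("MlpPolicy", GazeEnv(), verbose=0)
    model.learn(total_timesteps=20_000)
    model.save("gaze_policy")
```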

Challenges Addressed

After training, the model was successfully deployed onto the ARI robot and tested in human-robot interaction scenarios. The results demonstrated that ARI can accurately mimic human-like gaze behaviours by identifying and responding to salient points in real time. The robot’s ability to interact dynamically with human participants shows the potential for reinforcement learning to enhance social robots' capabilities in unpredictable, real-world environments.
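To make the deployment step concrete, the sketch below shows one way a frozen policy could be queried at runtime: features from the robot's perception stack are packed into the observation layout used in simulation, and the policy's action selects the gaze target, with no further learning on the robot. The perception and head-control functions here (`detect_people`, `look_at`) are hypothetical placeholders, not ARI's actual interfaces.

```python
# Hypothetical zero-shot deployment loop: the trained policy is only queried,
# never updated, on the real robot. detect_people() and look_at() are
# placeholder names, not ARI's real perception or control interfaces.
import numpy as np
from stable_baselines3 import PPO

MAX_PEOPLE = 4  # must match the observation layout used during training


def detect_people() -> np.ndarray:
    """Placeholder: would return per-person [speaking, gesturing, proximity] features."""
    return np.random.random((2, 3)).astype(np.float32)  # e.g. two people detected


def look_at(person_index: int) -> None:
    """Placeholder: would command the robot's head to orient towards this person."""
    print(f"Gazing at person {person_index}")


def build_observation(features: np.ndarray) -> np.ndarray:
    """Zero-pad detections up to MAX_PEOPLE so the shape matches training."""
    obs = np.zeros((MAX_PEOPLE, 3), dtype=np.float32)
    obs[: len(features)] = features[:MAX_PEOPLE]
    return obs.flatten()


if __name__ == "__main__":
    policy = PPO.load("gaze_policy")  # trained in simulation, reused unchanged
    for _ in range(10):  # a real system would loop at the camera frame rate
        action, _ = policy.predict(build_observation(detect_people()), deterministic=True)
        look_at(int(action))
```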

This research marks a significant advancement in human-robot interaction, addressing key challenges of transitioning reinforcement learning models from simulation to real-world deployment and providing insights into developing robust, adaptable social robots.

Download Full Paper | Code