HOMIE :

Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit

Abstract: Current humanoid teleoperation systems either lack reliable low-level control policies or struggle to acquire accurate whole-body control commands, making it difficult to teleoperate humanoids for loco-manipulation tasks. To solve these issues, we propose HOMIE, a novel humanoid teleoperation cockpit that integrates a humanoid loco-manipulation policy and a low-cost exoskeleton-based hardware system. The policy enables humanoid robots to walk and squat to specific heights while accommodating arbitrary upper-body poses. This is achieved through our novel reinforcement learning-based training framework, which incorporates an upper-body pose curriculum, a height-tracking reward, and symmetry utilization, without relying on any motion priors. Complementing the policy, the hardware system integrates isomorphic exoskeleton arms, a pair of motion-sensing gloves, and a pedal, allowing a single operator to achieve full control of the humanoid robot. Our experiments show that our cockpit facilitates more stable, rapid, and precise humanoid loco-manipulation teleoperation, accelerating task completion and eliminating retargeting errors compared to inverse kinematics-based methods. We also validate the effectiveness of the data collected by our cockpit for imitation learning.


Overview

All videos below are at normal speed. 😊

Reinforcement Learning

We introduce three core techniques into our RL-based training framework to significantly expand the operational workspace of humanoid robots while ensuring robust locomotion:

  1. Upper-body pose curriculum: Enables balance under continuously changing upper-body poses.
  2. Height-tracking reward: Enables the robot to squat to any required height robustly and quickly.
  3. Symmetry utilization: Makes the robot act more symmetrically and improves data efficiency.
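The first two techniques can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the framework's actual implementation: the function names, the blending rule that widens the sampled pose range with a `ratio` in [0, 1], the `threshold` and `step` values, and the exponential form of the reward with its `sigma` scale.

```python
import math
import random


def sample_upper_body_targets(default_pose, lower, upper, ratio, rng):
    """Sample upper-body joint targets whose spread grows with `ratio` in [0, 1].

    Hypothetical curriculum rule: ratio = 0 keeps the default pose,
    ratio = 1 samples uniformly over the full joint range.
    """
    targets = []
    for q0, lo, hi in zip(default_pose, lower, upper):
        q_rand = rng.uniform(lo, hi)
        # Blend from the default pose toward the random pose by `ratio`.
        targets.append(q0 + ratio * (q_rand - q0))
    return targets


def update_ratio(ratio, tracking_score, threshold=0.8, step=0.05):
    """Widen the sampling range only once the policy tracks commands well.

    `threshold` and `step` are illustrative values, not tuned settings.
    """
    if tracking_score > threshold:
        ratio = min(1.0, ratio + step)
    return ratio


def height_tracking_reward(base_height, target_height, sigma=0.1):
    """Assumed exponential kernel: 1.0 at the commanded height, decaying with error."""
    error = base_height - target_height
    return math.exp(-(error * error) / (sigma * sigma))
```

In a training loop of this shape, `update_ratio` would be called once per evaluation interval, so the policy only meets harder upper-body poses after it has stabilized on easier ones.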

Our framework is totally MoCap-free, resulting in a more efficient pipeline.

Our framework can be used to train different kinds of robots, such as the Unitree G1 and Fourier GR-1.

Unitree G1 trained in Isaac Gym.

Fourier GR-1 trained in Isaac Gym.


After training with our framework on an Nvidia RTX 4090 for only about 3 hours, we obtain policies that can be deployed directly in the real world to make robots walk and squat robustly.

We conduct several ablation experiments to verify the effectiveness of our framework, and we find:

  1. Our upper-body pose curriculum helps robots gradually learn to balance under dynamic upper-body movements better than training without a curriculum or with other curriculum styles.
  2. The novel height-tracking reward accelerates training of robot squatting.
  3. Symmetry utilization both accelerates the training process by over 10 times and guarantees the symmetry of the trained policy.
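One common way to exploit left-right symmetry, sketched below under stated assumptions, is to mirror each observation-action pair across the sagittal plane and to penalize the gap between the policy's output on a mirrored observation and the mirror of its output on the original. The joint layout, the `swap_pairs` and `flip_indices` arguments, and all function names are hypothetical; the sketch also assumes observations and actions share one joint-space layout, which a real robot would not.

```python
def mirror(values, swap_pairs, flip_indices):
    """Reflect a joint-space vector across the sagittal plane.

    swap_pairs: (left_index, right_index) pairs to exchange.
    flip_indices: joints whose sign reverses under reflection (e.g. roll).
    """
    out = list(values)
    for left, right in swap_pairs:
        out[left], out[right] = values[right], values[left]
    for i in flip_indices:
        out[i] = -out[i]
    return out


def augment_with_mirror(transitions, swap_pairs, flip_indices):
    """Return the batch plus its mirrored copy, doubling the usable data."""
    mirrored = [
        (mirror(obs, swap_pairs, flip_indices), mirror(act, swap_pairs, flip_indices))
        for obs, act in transitions
    ]
    return list(transitions) + mirrored


def symmetry_loss(policy, obs, swap_pairs, flip_indices):
    """Mean squared gap between policy(mirror(obs)) and mirror(policy(obs))."""
    expected = mirror(policy(obs), swap_pairs, flip_indices)
    actual = policy(mirror(obs, swap_pairs, flip_indices))
    return sum((a - e) ** 2 for a, e in zip(actual, expected)) / len(expected)
```

For a toy layout `[left_pitch, left_roll, right_pitch, right_roll]`, the arguments would be `swap_pairs=[(0, 2), (1, 3)]` and `flip_indices=[1, 3]`. Applying `mirror` twice returns the original vector, which is a quick sanity check for any mirroring table.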

Hardware System

Our hardware system features isomorphic exoskeleton arms, a pair of motion-sensing gloves, and a pedal. The pedal design for locomotion command acquisition liberates the operator's upper body, enabling simultaneous acquisition of upper-body poses. Since the exoskeleton arms are isomorphic to the controlled robot and each glove has 15 degrees of freedom (DoF), which is more than most existing dexterous hands, we can directly set upper-body joint positions from the exoskeleton readings, dispensing with IK and achieving faster and more accurate teleoperation.
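Because the exoskeleton is isomorphic to the robot, the mapping from operator to robot reduces to a per-joint copy rather than an IK solve. The sketch below shows that idea only; the joint names, the per-joint calibration `offsets`, and the clamping to `joint_limits` are assumptions for illustration and are not taken from the actual hardware interface.

```python
def exo_to_robot_targets(exo_angles, joint_limits, offsets=None):
    """Copy exoskeleton joint readings to robot joint targets, one to one.

    exo_angles: dict of joint name -> measured angle in radians.
    joint_limits: dict of joint name -> (lower, upper) robot limits in radians.
    offsets: optional dict of joint name -> calibration offset (hypothetical).
    """
    targets = {}
    for name, angle in exo_angles.items():
        lower, upper = joint_limits[name]
        offset = offsets.get(name, 0.0) if offsets else 0.0
        # Clamp so that a reading outside the robot's range is never commanded.
        targets[name] = min(upper, max(lower, angle + offset))
    return targets
```

No optimization or retargeting step appears anywhere in this path, so each control cycle costs only one pass over the joints.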

Arms and Hands.

Pedal.


We design hardware systems for both Unitree G1 and Fourier GR-1. Notably, our gloves can be detached from the arms, allowing them to be reused in systems isomorphic to different robots.

1. Isomorphic Exoskeleton for Unitree G1.

2. Isomorphic Exoskeleton for Fourier GR-1.

Using our hardware system, one single operator can choose to control:

1. Diverse dexterous hands.

2. Upper body of humanoids.

3. Whole body of humanoids.


The total cost of the hardware system is only $0.5k, significantly lower than that of MoCap devices. We list the detailed costs for all parts here.

Deployment

We deploy the trained policy on the Unitree G1 in the real world and teleoperate it to perform various loco-manipulation tasks using our isomorphic exoskeleton hardware system.

1. Walking under changing upper-body poses.

2. Squatting under changing upper-body poses.



3. Squat to hold flower and transfer.

4. Squat to grasp a bottle.



5. Hand over and pick & place.

6. Step back and open oven.



7. Transfer a grasp from lower to higher.

8. Transfer a box from one shelf to another.



9. Push the man on a chair.

10. Hand over between two robots.



We further conduct some experiments to show the robustness of our policy.

1. Strong hitting.

2. Hit with a heavy ball.



To demonstrate the effectiveness of the isomorphic exoskeleton, we compare the task completion times across four different tasks between our hardware system and OpenTelevision.

1. Pick & Place.

2. Scan Barcode.



3. Hand Over.

4. Open Oven.



The completion times for these tasks are computed from three different operators, each performing every task three times. Our hardware system speeds up teleoperation by approximately 2 times, particularly in tasks that require radial movement.
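The aggregation described above (three operators, three trials each, per task and per system) can be sketched as follows. The function names and the choice of a plain mean over all nine trials are assumptions, and the numbers in the usage lines are placeholders chosen for easy arithmetic, not measured results.

```python
from statistics import mean


def mean_completion_time(trials_by_operator):
    """Average completion time over every trial of every operator."""
    return mean(t for trials in trials_by_operator.values() for t in trials)


def speedup(baseline_trials, our_trials):
    """Ratio of mean baseline time to mean time with the compared system."""
    return mean_completion_time(baseline_trials) / mean_completion_time(our_trials)


# Placeholder times in seconds (NOT measurements): 3 operators x 3 trials.
baseline = {"op1": [20.0, 20.0, 20.0], "op2": [20.0, 20.0, 20.0], "op3": [20.0, 20.0, 20.0]}
ours = {"op1": [10.0, 10.0, 10.0], "op2": [10.0, 10.0, 10.0], "op3": [10.0, 10.0, 10.0]}
ratio = speedup(baseline, ours)  # 2.0 for these placeholder values
```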

Extensions

1. Simulation

We transfer the trained policies for Unitree G1 and Fourier GR-1 from Isaac Gym to scenes developed with GRUtopia, so that the robots can perform diverse loco-manipulation tasks more cost-effectively and in a wider range of scenarios than would be feasible in the real world.

1. Unitree G1 in GRUtopia.

2. Fourier GR-1 in GRUtopia.



3. Loco-Manipulation Task Completion in GRUtopia.



2. Imitation Learning

To validate the effectiveness of the demonstrations collected by HOMIE for IL algorithms, we design two distinct tasks, collect data by teleoperation, train a policy with an IL algorithm, and deploy it in the real world. We achieve a success rate of over 70%, showing the feasibility of training IL policies with the collected data.

1. Squat Pick.

2. Pick & Place.


Authors

¹Shanghai Artificial Intelligence Laboratory, ²The Chinese University of Hong Kong, *Equal contribution

@article{ben2024homie,
  title={HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit},
  author={Qingwei Ben and Feiyu Jia and Jia Zeng and Junting Dong and Dahua Lin and Jiangmiao Pang},
  journal={arXiv preprint arXiv:2502.13013},
  year={2025}
}

If you have any questions, please contact Qingwei Ben. 🎉