Bidirectional Decoding Collaborating with Hierarchical Imitation for Long-Horizon Task Learning

Authors: Faquan Zhang, Bo Jin, Kun Zhang, and Ziqi Wei
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 564-578
Keywords: Imitation Learning, Hierarchical Reinforcement Learning, Bidirectional Decoding.

Abstract

Imitation Learning (IL) struggles with long-horizon tasks due to insufficient policy generalization and adaptability in dynamic environments. To address this, we propose a hierarchical framework that integrates Hierarchical Reinforcement Learning (HRL) with a Bidirectional Decoding mechanism. The framework decomposes complex tasks into subtasks, leveraging human demonstrations to rapidly capture behavioral patterns through IL, while employing HRL to refine policies via reward-driven optimization. A novel Bidirectional Decoding mechanism leverages temporal consistency (backward coherence) and enhances robustness (forward contrast) by dynamically reassessing action sequences against strong and weak policy predictions. Evaluations in the Franka Kitchen environment demonstrate superior performance in task success rates and cumulative rewards, outperforming existing approaches. Ablation studies confirm the critical role of Bidirectional Decoding in resolving the rigidity of traditional action chunking, while the discovery of novel strategies that diverge from human demonstrations highlights autonomous policy improvement. Our framework efficiently handles dynamic and diverse long-horizon tasks, even with limited demonstration data, offering a robust solution for real-world applications such as robotic manipulation.
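The abstract describes Bidirectional Decoding as reassessing candidate action sequences with a backward-coherence term (agreement with the previously committed chunk) and a forward-contrast term (proximity to strong-policy predictions, distance from weak-policy predictions). The sketch below illustrates one plausible scoring rule of that kind; it is not the paper's implementation, and the function names, distance choices, and weighting parameter beta are assumptions for illustration only.

```python
# Illustrative sketch of a backward-coherence + forward-contrast chunk selector,
# assuming action chunks are NumPy arrays of shape (horizon, action_dim).
import numpy as np

def backward_coherence(candidate, prev_chunk_tail):
    """Penalize candidates whose opening actions diverge from the overlapping
    tail of the previously committed chunk (temporal consistency)."""
    k = min(len(prev_chunk_tail), len(candidate))
    if k == 0:
        return 0.0
    return -np.linalg.norm(candidate[:k] - prev_chunk_tail[:k])

def forward_contrast(candidate, strong_samples, weak_samples):
    """Reward candidates that lie close to strong-policy predictions and far
    from weak-policy predictions (robustness against suboptimal modes)."""
    d_strong = min(np.linalg.norm(candidate - s) for s in strong_samples)
    d_weak = min(np.linalg.norm(candidate - w) for w in weak_samples)
    return d_weak - d_strong

def select_chunk(candidates, prev_chunk_tail, strong_samples, weak_samples,
                 beta=1.0):
    """Pick the candidate chunk maximizing coherence + beta * contrast."""
    scores = [
        backward_coherence(c, prev_chunk_tail)
        + beta * forward_contrast(c, strong_samples, weak_samples)
        for c in candidates
    ]
    return candidates[int(np.argmax(scores))]

# Toy usage with random chunks standing in for policy samples.
rng = np.random.default_rng(0)
candidates = [rng.normal(size=(8, 7)) for _ in range(16)]
chunk = select_chunk(candidates,
                     prev_chunk_tail=candidates[0][-2:],
                     strong_samples=candidates[:4],
                     weak_samples=candidates[4:8])
```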
đź“„ View Full Paper (PDF) đź“‹ Show Citation