T-Attention: Optimizing Attention Computation Using Temporal Parameter Time

Authors: Yichen Yang, Hongxu Hou, and Wei Chen
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 1308-1322
Keywords: Attention, Parametric Equations, Fourier Series, Function Projection.

Abstract

The attention mechanism exhibits remarkable capability in processing sequential data; however, its computational complexity scales quadratically with sequence length, resulting in significant resource demands. Numerous studies have achieved substantial success in leveraging sparse matrices to reduce the computational burden of dot-product operations, thereby improving both the computational efficiency and accuracy of models. Nevertheless, the question remains: can these computations be optimized further? In this paper, we introduce a novel approach based on function projection, integrating a restructured word embedding technique with the attention mechanism to alleviate computational overhead. We first validate the theoretical efficacy of designing word embeddings using parametric equations and demonstrate the effectiveness of our proposed embedding method. Subsequently, we conduct experiments across a variety of basis functions, illustrating that our approach affords greater flexibility in parameter selection while effectively reducing computational costs. Compared with state-of-the-art attention-based models, our method reduces inference time, underscoring its practical advantages.
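The abstract does not spell out the mechanics, but one way to see why parametric, Fourier-series embeddings can cut the cost of attention's score computation is the sum-to-product identity: if each token is assigned a scalar temporal parameter t and embedded as the basis vector [cos(t), sin(t), cos(2t), sin(2t), ...], then the dot product of two such embeddings collapses to a sum of cosines of the scalar difference between their parameters. The sketch below illustrates this under that assumption only; it is not the authors' implementation, and the names fourier_embedding, score_dot, and score_closed_form are hypothetical.

```python
import numpy as np

def fourier_embedding(t: float, num_terms: int) -> np.ndarray:
    """Embed a scalar temporal parameter t as a vector of Fourier
    basis functions [cos(t), sin(t), cos(2t), sin(2t), ...]."""
    k = np.arange(1, num_terms + 1)
    return np.concatenate([np.cos(k * t), np.sin(k * t)])

def score_dot(t_q: float, t_k: float, num_terms: int) -> float:
    """Attention score as an ordinary O(d) dot product of embeddings."""
    return fourier_embedding(t_q, num_terms) @ fourier_embedding(t_k, num_terms)

def score_closed_form(t_q: float, t_k: float, num_terms: int) -> float:
    """Same score via cos(k*t_q)cos(k*t_k) + sin(k*t_q)sin(k*t_k)
    = cos(k*(t_q - t_k)): the score depends only on the scalar
    difference t_q - t_k, so the full embedding vectors never need
    to be materialized."""
    k = np.arange(1, num_terms + 1)
    return float(np.cos(k * (t_q - t_k)).sum())

if __name__ == "__main__":
    t_q, t_k, m = 0.7, 2.1, 32
    assert np.isclose(score_dot(t_q, t_k, m), score_closed_form(t_q, t_k, m))
    print(score_dot(t_q, t_k, m), score_closed_form(t_q, t_k, m))
```

Both routes give the same score, but the closed form replaces a 2m-dimensional dot product with m cosine evaluations of one scalar, which hints at how projecting onto a fixed basis can shrink the arithmetic behind each attention score.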