Implementation of secure multi-party computation for transformer inference, enhanced with quantum-inspired task planning. The first practical system achieving BERT inference in tens of ...
Epsilon is a novel Transformer architecture designed for high efficiency, training stability, and interpretability. It is built for sequence-classification tasks and is demonstrated on the IMDb ...
Abstract: With the great success of the Transformer model in Natural Language Processing (NLP), the Vision Transformer (ViT) was proposed, achieving performance comparable to traditional Convolutional ...
Abstract: Transformer-based models have achieved notable success across various fields, thanks to the Multi-Head Attention (MHA) mechanism. However, their high computational and memory demands pose ...
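For context on the Multi-Head Attention (MHA) mechanism this abstract refers to, and on where its computational and memory cost arises, here is a minimal sketch in PyTorch. All names and dimensions are illustrative and not taken from the paper itself.

```python
# Minimal sketch of multi-head attention (MHA), assuming standard
# scaled dot-product attention; illustrative only, not the paper's code.
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must divide evenly across heads"
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        # One linear projection each for queries, keys, values, and the output.
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b, t, _ = x.shape
        # Project and split into heads: (batch, heads, seq_len, d_head).
        q = self.q_proj(x).view(b, t, self.num_heads, self.d_head).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.num_heads, self.d_head).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.num_heads, self.d_head).transpose(1, 2)
        # Attention scores are (batch, heads, seq_len, seq_len): this
        # quadratic-in-sequence-length tensor is the main source of the
        # compute and memory demands the abstract mentions.
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        weights = scores.softmax(dim=-1)
        out = weights @ v
        # Merge heads back into a single (batch, seq_len, d_model) tensor.
        out = out.transpose(1, 2).contiguous().view(b, t, -1)
        return self.out_proj(out)

# Example usage: 8 heads over a model width of 512.
mha = MultiHeadAttention(d_model=512, num_heads=8)
y = mha(torch.randn(2, 16, 512))
print(y.shape)  # torch.Size([2, 16, 512])
```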