Fused Multiply-Add Algorithm

A Configurable Floating-Point Fused Multiply-Add Design With Mixed Precision for AI Accelerators

Abstract: Hardware accelerators for deep learning in artificial intelligence applications must often meet stringent constraints for accuracy and throughput. In addition to architecture/algorithm ...

GitHub

BennyE/mfma-cdna-amd

A progressive, hands-on learning path for AMD GPU kernel programming, focusing on Matrix Fused Multiply-Add (MFMA) instructions on the CDNA3 architecture. This guide takes you from your first HIP kernel ...

TechRepublic

Implementation of Low Power, Low Delay Fused Add-Multiply Operator Using Prefix Adders

In many Digital Signal Processing (DSP) applications, complex arithmetic operations are used. To increase the performance and to reduce the complexity of arithmetic operations, the authors designed a ...
