FMA (Fused multiply-add) is a technical term representing the floating-point operation performed multiplication and addition in one step. This operation executes the following arithmetic calculation, which frequently shows up in various programs. Unlike separating multiplication and additional operation, it achieves higher performance and numerical precisions.

$ r = (a * b) + c $

For instance, x86 provides FMA instruction set as streaming SIMD extensions for 128 and 256 bits. This instruction is helpful to achieve better performance for machine learning applications requiring many intensive numerical calculations.

This unit is counted to estimate the number of flowing point instructions necessary to compute the state-of-the-art machine learning models.

albanie/convnet-burden

Reference