The proposed FP8 standard has the potential to accelerate AI development and works for both AI training and inference.
AI developers from NVIDIA, Arm, and Intel have published a collaborative paper proposing a new common interchange format designed to accelerate deep learning training and inference. The paper, titled "FP8 Formats for Deep Learning," describes an 8-bit floating point (FP8) standard that provides a common format, accelerating AI development by optimizing memory usage, and works for both AI training and inference.
According to NVIDIA's Director of Product Marketing Shar Narasimhan, testing the proposed format demonstrated accuracy comparable to 16-bit precision across various use cases. Test results on transformers, computer vision models, and GANs showed that FP8 training accuracy is similar to that of 16-bit precision while delivering significant speedups.
Moreover, the trio of companies stated that the proposed FP8 specification has been published in an open, license-free format to encourage broad industry adoption. The proposal will also be submitted to the Institute of Electrical and Electronics Engineers (IEEE), a technical professional organization dedicated to advancing technology for the benefit of humanity.
"AI processing requires full-stack innovation across hardware and software platforms to address the growing computational demands of neural networks. A key area to drive efficiency is using lower precision number formats to improve computational efficiency, reduce memory usage, and optimize for interconnect bandwidth," commented NVIDIA in a recent blog post. "Transformer networks, which are one of the most important innovations in AI, benefit from an 8-bit floating point precision in particular. We believe that having a common interchange format will enable rapid advancements and the interoperability of both hardware and software platforms to advance computing."
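The article does not spell out the bit layouts, but the paper itself defines two FP8 encodings: E4M3 (4 exponent bits, 3 mantissa bits) and E5M2 (5 exponent bits, 2 mantissa bits). As a rough illustration of what "lower precision" means in practice, the sketch below rounds a regular Python float to E4M3-like precision; it is a simplified model that ignores subnormals and the format's special-value encodings, not a reference implementation:

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to a simplified E4M3-style value: 4 significant
    binary digits (1 implicit + 3 stored mantissa bits), with
    magnitudes clamped to 448, the largest E4M3 normal value.
    Illustrative only -- skips NaN/Inf encodings and subnormals."""
    if x == 0.0 or math.isnan(x) or math.isinf(x):
        return x
    sign = math.copysign(1.0, x)
    m, e = math.frexp(abs(x))      # abs(x) == m * 2**e, with 0.5 <= m < 1
    m = round(m * 16) / 16         # keep 4 significant bits of the mantissa
    y = math.ldexp(m, e)           # reassemble m * 2**e
    return sign * min(y, 448.0)    # saturate at the E4M3 maximum
```

For example, 0.1 rounds to roughly 0.1016 in this model, while anything above 448 saturates to 448, which hints at why the paper pairs the wider-range E5M2 variant with training workloads that need more dynamic range.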