The team has set six new records for how fast an AI model can be trained using a predetermined group of datasets.
Another huge milestone for Nvidia. The records were set under MLPerf, a benchmark suite created by prominent companies in the space to standardize how AI training and inference speed are measured.
Companies that contributed to the creation of MLPerf include Google, Nvidia, Baidu, and supercomputer maker Cray.
Nvidia is said to have set records in image classification (ResNet-50 version 1.5 on the ImageNet dataset), object instance segmentation, object detection, non-recurrent translation, recurrent translation, and recommendation systems.
“For all of these benchmarks we outperformed the competition by up to 4.7x faster,” Nvidia VP and general manager of accelerated computing Ian Buck stated. “There are certainly faster DGX-2 ResNet-50 renditions out there, but none under MLPerf benchmark guidelines.”
The results were achieved on Nvidia DGX systems, using NVSwitch interconnects to link up to 16 fully connected V100 Tensor Core GPUs, a GPU first presented in spring 2017. The team submitted results in the single-node category with 16 GPUs, as well as in distributed training at scales from 16 GPUs up to 80 nodes with 640 GPUs.
In the single-node category, the team trained ResNet-50 in 70 minutes. With distributed training, Nvidia trained ResNet-50 in 6.3 minutes.
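To put the two ResNet-50 times in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration built on the figures quoted in this article. The function names are made up for the example, and the scaling-efficiency line rests on an assumption the article does not confirm: that the 70-minute run used 16 GPUs and the 6.3-minute run used all 640.

```python
def speedup(baseline_minutes: float, faster_minutes: float) -> float:
    """How many times faster the second run is than the first."""
    return baseline_minutes / faster_minutes


def scaling_efficiency(speedup_factor: float, baseline_gpus: int, scaled_gpus: int) -> float:
    """Achieved speedup divided by the ideal (linear) speedup from adding GPUs."""
    ideal_speedup = scaled_gpus / baseline_gpus
    return speedup_factor / ideal_speedup


# Figures quoted in the article.
single_node_minutes = 70.0
distributed_minutes = 6.3

resnet_speedup = speedup(single_node_minutes, distributed_minutes)
print(f"Distributed run is about {resnet_speedup:.1f}x faster")

# 640 GPUs spread over 80 nodes works out to 8 GPUs per node.
gpus_per_node = 640 // 80
print(f"GPUs per node at the largest scale: {gpus_per_node}")

# ASSUMPTION (not stated in the article): 70 min was measured on 16 GPUs
# and 6.3 min on 640 GPUs. The result is meaningless if that is wrong.
efficiency = scaling_efficiency(resnet_speedup, baseline_gpus=16, scaled_gpus=640)
print(f"Scaling efficiency under that assumption: {efficiency:.0%}")
```

The speedup depends only on the two quoted times. The efficiency figure should be read as a rough guide at best, since the article does not say which GPU count produced the 6.3-minute result.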