The company aims to deliver high-quality, uninterrupted experiences for everyone.
Meta has presented a real-time, high-fidelity audio codec that uses AI to compress audio files without losing quality.
High-quality sound requires a fast internet connection and a lot of storage space, so the research is meant to overcome these limitations with EnCodec: "Imagine listening to a friend’s audio message in an area with low connectivity and not having it stall or glitch."
Meta built a three-part system and trained it to compress audio data to a targeted size; the compressed data can then be decoded using a neural network. The researchers claim they achieved an approximate 10x compression rate compared with MP3 at 64 kbps without a loss of quality, and made the approach work for CD-quality audio.
EnCodec consists of three parts:
- The encoder, which takes the uncompressed data and transforms it into a higher-dimensional, lower-frame-rate representation.
- The quantizer, which compresses this representation to the targeted size.
- The decoder, which turns the compressed signal back into a waveform that is as similar as possible to the original.
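To make the three stages concrete, here is a deliberately simplified sketch in plain Python. It is not Meta's implementation: EnCodec uses neural networks for the encoder and decoder and a learned quantizer, while this toy version only groups samples into frames and rounds each value to a few bits. The names `FRAME`, `encode`, `quantize`, and `decode` are hypothetical and chosen for illustration only.

```python
import math

FRAME = 4  # hypothetical frame size: 4 samples become one 4-dimensional frame


def encode(samples):
    """Toy 'encoder': group samples into frames (fewer frames, more values each)."""
    usable = len(samples) - len(samples) % FRAME
    return [samples[i:i + FRAME] for i in range(0, usable, FRAME)]


def quantize(frames, bits=4):
    """Toy 'quantizer': map each value in [-1, 1] to one of 2**bits integer codes."""
    levels = 2 ** bits - 1
    return [
        [round((min(1.0, max(-1.0, v)) + 1.0) / 2.0 * levels) for v in frame]
        for frame in frames
    ]


def decode(codes, bits=4):
    """Toy 'decoder': turn integer codes back into a flat waveform in [-1, 1]."""
    levels = 2 ** bits - 1
    return [c / levels * 2.0 - 1.0 for frame in codes for c in frame]


# A short sine wave stands in for uncompressed audio.
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]

codes = quantize(encode(signal), bits=4)
restored = decode(codes, bits=4)

# The reconstruction is close to the original, but not identical.
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(len(codes), "frames,", "max reconstruction error:", max_error)
```

Lowering `bits` shrinks the compressed size and raises the reconstruction error. In a learned codec, the decoder's job is to keep that error perceptually small.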
The approach relies on identifying changes to the audio that humans cannot perceive. To make this possible, the researchers use discriminators to improve the perceptual quality of the generated samples. The discriminators try to differentiate between real and reconstructed samples, while the compression model tries to fool them by pushing its reconstructions to be more perceptually similar to the originals.
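The tug-of-war between the discriminators and the compression model can be illustrated with a pair of loss functions. The sketch below uses the hinge loss, a common choice for adversarial training; the article does not say which loss Meta uses, so treat this as an assumption. The scores are made-up numbers standing in for discriminator outputs, where a higher score means "looks real".

```python
def discriminator_loss(real_scores, fake_scores):
    """Hinge loss: low when real samples score high and reconstructions score low."""
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """Hinge loss for the compression model: low when reconstructions score high."""
    return sum(max(0.0, 1.0 - s) for s in fake_scores) / len(fake_scores)


# Hypothetical discriminator scores, for illustration only.
real_scores = [2.0, 1.5]     # original audio, confidently judged real
caught_scores = [-2.0, -1.5]  # reconstructions the discriminator rejects
fooled_scores = [1.2, 1.0]    # reconstructions the discriminator accepts

# When reconstructions are caught, the compression model is penalized.
print(generator_loss(caught_scores), discriminator_loss(real_scores, caught_scores))

# When reconstructions fool the discriminator, the penalty moves to the discriminator.
print(generator_loss(fooled_scores), discriminator_loss(real_scores, fooled_scores))
```

Training alternates between lowering each loss, which pushes the reconstructions toward samples the discriminators can no longer tell apart from the originals.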
Meta plans to make the encoder cover video to improve experiences such as videoconferencing, streaming movies, and playing games with friends in VR. The code is available, and the research is described on Meta's blog.