10-minute Whisper model processing test: Nvidia RTX 4090 is 86% slower than Apple M3 Max

2024-07-21


December 14 report: developer Oliver Wehrens recently tested how well three Apple Silicon chips (M1 Pro, M2 Ultra and M3 Max) handle AI workloads after an update to the MLX framework, and compared them with Nvidia's RTX 4090 graphics card.

Wehrens ran the test with Whisper, OpenAI's speech recognition model, measuring how long each device takes to transcribe a 10-minute audio file.
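A measurement of this kind amounts to a wall-clock timer around the transcription call. The snippet below is a minimal sketch, not Wehrens's actual benchmark script: `time_transcription` and `fake_transcribe` are hypothetical names, and the stand-in function would have to be replaced by a real Whisper call.

```python
import time
from typing import Callable, Tuple


def time_transcription(transcribe: Callable[[str], str], audio_path: str) -> Tuple[str, float]:
    """Run `transcribe` on `audio_path` and return (text, elapsed seconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    text = transcribe(audio_path)
    elapsed = time.perf_counter() - start
    return text, elapsed


def fake_transcribe(audio_path: str) -> str:
    """Hypothetical stand-in for a real Whisper call; it only sleeps briefly."""
    time.sleep(0.01)
    return f"transcript of {audio_path}"


text, seconds = time_transcription(fake_transcribe, "sample.wav")
print(f"{text!r} took {seconds:.3f} s")
```

Passing the transcription function in as an argument keeps the timer identical across backends, so the same harness could wrap an MLX-based run on a Mac and a CUDA-based run on the RTX 4090.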

The results show that the M1 Pro needs 216 seconds to process the audio, while the Nvidia RTX 4090 graphics card needs 186 seconds.

The M2 Ultra (76 GPU cores) and the M3 Max (40 GPU cores) perform better, at 95 seconds and 100 seconds respectively.
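The 86% in the headline follows directly from these timings. As a quick check (a sketch that uses only the numbers reported above; the helper name `percent_slower` is made up for illustration), the extra time each chip needs relative to the M3 Max can be computed as:

```python
# Transcription times in seconds, as reported in the test.
times = {"M1 Pro": 216, "RTX 4090": 186, "M3 Max": 100, "M2 Ultra": 95}


def percent_slower(t: float, baseline: float) -> float:
    """Return how much longer `t` is than `baseline`, as a percentage."""
    return (t - baseline) / baseline * 100


baseline = times["M3 Max"]
for chip, t in times.items():
    # A positive value means slower than the M3 Max, a negative value faster.
    print(f"{chip}: {t} s, {percent_slower(t, baseline):+.0f}% vs M3 Max")
```

With the M3 Max as the baseline, the RTX 4090 comes out at roughly 86% more time, the M1 Pro at roughly 116% more, and the M2 Ultra at about 5% less.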

In addition, Apple Silicon chips consume less power. The Nvidia RTX 4090 draws 242 W more under load than at idle.

A machine with the M1 Pro chip, however, draws only 38 W more than at idle.

As previously reported, the features of the MLX framework are as follows:

Familiar APIs: the Python and C++ APIs closely follow familiar frameworks such as NumPy and PyTorch, making them easy for experienced researchers to learn.

Easy and efficient: MLX uses composable function transformations to optimize performance on Apple Silicon.

Lazy computation: results are computed only when they are needed, which avoids unnecessary work and improves resource efficiency.

Dynamic design: MLX adapts to changes in input shape, which simplifies debugging and testing.

Combination of software and hardware: MLX seamlessly uses the CPU and GPU of Apple devices to ensure that users can make full use of the hardware.

Unified memory advantage: MLX uses Apple's unified memory to reduce data movement between the CPU and GPU.

Researcher-friendly: MLX is designed for researchers.



