This is because Tesla’s self-developed chip improves in performance by 486% per year, so it would need only 17 years to reach the level of a mature human brain, whereas the human brain takes 25 years to mature naturally.
Tesla’s latest self-developed chip, the Dojo D1, delivers 362 TFLOPS (362 trillion operations per second), roughly 30 times the 12 TFLOPS of the NVIDIA chips the company has been using for the past 6 years.
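As a quick sanity check on those figures, the ratio of the two throughput numbers and the compound effect of the quoted annual improvement can be worked out directly. The short Python sketch below uses only the numbers quoted above and makes no assumption about what “human-brain level” corresponds to in absolute terms.

```python
# Back-of-the-envelope check on the growth figures quoted above (illustrative only;
# the baseline for "human-brain-level" performance is not stated in the article).

d1_tflops = 362      # Dojo D1, BF16/CFP8 throughput quoted by Tesla
old_tflops = 12      # the older NVIDIA part referenced in the article
print(d1_tflops / old_tflops)      # ~30x, matching the article's claim

annual_gain = 4.86   # the quoted 486% annual performance improvement
years = 17
multiplier = (1 + annual_gain) ** years
print(f"x{multiplier:.2e} total improvement over {years} years")  # on the order of 1e13
```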
At the recent AI Day event, Tesla officially unveiled the D1 chip. Manufactured on TSMC’s 7nm process, it has a die area of 645 square millimeters, second only to NVIDIA’s Ampere-architecture supercomputing core, the A100 (826 square millimeters), and AMD’s next-generation CDNA2-architecture compute core, Arcturus (about 750 square millimeters). It integrates up to 50 billion transistors, equivalent to half of Intel’s Ponte Vecchio compute chip.
Internally it integrates four 64-bit superscalar CPU cores and up to 354 training nodes, tuned especially for 8×8 matrix multiplication, and it supports a range of data and instruction formats relevant to AI training, including FP32, BFP16, CFP8, INT16, and INT8.
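Those low-precision formats matter because deep-learning training typically multiplies small matrix tiles in reduced precision while accumulating results in FP32. The NumPy sketch below is a generic illustration of that pattern on an 8×8 tile, using float16 as a stand-in (NumPy has no native BF16 or CFP8 types); it is not a description of Tesla’s actual hardware pipeline.

```python
import numpy as np

# Generic mixed-precision pattern: operands are quantized to a narrow format,
# the matrix product is accumulated in FP32. float16 stands in for BF16/CFP8.
rng = np.random.default_rng(0)
a16 = rng.standard_normal((8, 8)).astype(np.float16)   # quantized 8x8 tile
b16 = rng.standard_normal((8, 8)).astype(np.float16)

acc = np.zeros((8, 8), dtype=np.float32)
for k in range(8):   # FP32 accumulation, one rank-1 update per step
    acc += np.outer(a16[:, k].astype(np.float32), b16[k, :].astype(np.float32))

ref = a16.astype(np.float64) @ b16.astype(np.float64)  # double-precision reference
print(np.max(np.abs(acc - ref)))  # small error, caused only by the fp16 inputs
```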
According to Tesla, the D1 chip’s FP32 single-precision floating-point performance reaches 22.6 TFLOPS (22.6 trillion operations per second), while its BF16/CFP8 performance reaches 362 TFLOPS (362 trillion operations per second), roughly 16 times the FP32 rate.
To support scalable AI training, the chip’s interconnect bandwidth is remarkable: up to 10 TB/s, carried over as many as 576 lanes, each running at 112 Gbps.
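For a sense of scale, the per-lane figures can be aggregated directly. The arithmetic below is a rough, decimal-unit calculation; the article does not spell out how this raw lane aggregate relates to the 10 TB/s headline figure.

```python
lanes = 576
gbps_per_lane = 112
total_gbps = lanes * gbps_per_lane    # 64,512 Gb/s of raw lane throughput
total_tb_per_s = total_gbps / 8 / 1000
print(total_tb_per_s)                 # ~8.06 TB/s in decimal units
```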
And all of this is achieved within a thermal design power of just 400W.
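Combined with the throughput figures above, that power budget implies a rough efficiency number; the division below uses only the quoted specifications and is not an independently measured value.

```python
bf16_tflops = 362     # quoted BF16/CFP8 throughput
fp32_tflops = 22.6    # quoted FP32 throughput
tdp_watts = 400       # quoted thermal design power

print(bf16_tflops / tdp_watts)   # ~0.9 TFLOPS per watt at BF16/CFP8
print(fp32_tflops / tdp_watts)   # ~0.057 TFLOPS per watt at FP32
```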