Attorney Docket No. TESLPO0S / P0920-2NUS

APPLICATION FOR UNITED STATES PATENT

A COMPUTATIONAL ARRAY MICROPROCESSOR SYSTEM WITH VARIABLE LATENCY MEMORY ACCESS

By Inventor(s):
Emil Talpes
San Mateo, CA
Peter Bannon
Woodside, CA
Kevin Hurd
Redwood City, CA

Assignee: Tesla, Inc.

VAN PELT, YI & JAMES LLP
10050 N. Foothill Blvd., Suite 200
Cupertino, CA 95014
Telephone (408) 973-2585

A COMPUTATIONAL ARRAY MICROPROCESSOR SYSTEM WITH VARIABLE LATENCY MEMORY ACCESS

CROSS REFERENCE TO OTHER APPLICATIONS

[0001] This application claims priority to U.S. Provisional Patent Application No. 62/635,399 entitled A COMPUTATIONAL ARRAY MICROPROCESSOR SYSTEM WITH VARIABLE LATENCY MEMORY ACCESS filed February 26, 2018, and this application claims priority to U.S. Provisional Patent Application No. 62/625,251 entitled VECTOR COMPUTATIONAL UNIT filed February 1, 2018, and this application claims priority to U.S. Provisional Patent Application No. 62/536,399 entitled ACCELERATED MATHEMATICAL ENGINE filed July 24, 2017, and this application is a continuation-in-part of co-pending U.S. Patent Application No. 15/710,433 entitled ACCELERATED MATHEMATICAL ENGINE filed September 20, 2017, which claims priority to U.S. Provisional Patent Application No. 62/536,399 entitled ACCELERATED MATHEMATICAL ENGINE filed July 24, 2017, all of which are incorporated herein by reference for all purposes.

BACKGROUND OF THE INVENTION

[0002] Performing inference on a machine learning model typically requires retrieving data from memory and applying one or more computational array operations on the data. Applications of machine learning, such as those targeting self-driving and driver-assisted automobiles, often utilize computational array operations to calculate matrix and vector results. These operations require loading data, such as captured sensor data, and performing image processing to identify key features, such as lane markers and other objects in a scene.
Traditionally, these operations may be implemented using a generic microprocessor system that loads the computation data from memory before performing a computational array instruction. While the data is loading, the microprocessor system often sits idle. The software platform running these applications will initiate the computational array instruction once the data has completed loading. The length of stalls and the time required to synchronize the computational operation with the retrieved data can be particularly long when accessing variable latency memory. Stalls and synchronization efforts by the software platform reduce the efficiency of the microprocessor system and result in higher power consumption and lower throughput. Therefore, there exists a need for a microprocessor system with increased throughput that performs array computational operations using variable latency memory access.