Data Extraction and Preparation for AI Inference in a Distributed Computation Load Based FPGA Design
C.S. Ibala (Intel, Oregon, USA), J.G. de la Mora (Intel, Jalisco, Mexico), A. Muralidharan (Intel, San Jose CA, USA), C.Y. Tan Lee (Intel, Penang, Malaysia)
The amount of data (weights, biases, images) required for high-precision AI (Artificial Intelligence) inference makes implementation in a single FPGA very challenging. The purpose of this work is to explore a flow that encapsulates data extraction and transformation of a trained AI model, and to explain how to distribute the computation load across multiple FPGAs. Using a CNN U-Net model, we describe how to reduce the number of parameters (weights and biases) from more than 3 million to slightly more than 100 k, and how to share the data across multiple FPGAs over a high-speed Ethernet connection running at rates from 10 Gb/s to 400 Gb/s.
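A minimal sketch of the kind of extraction-and-sharding step the abstract describes is shown below. It is not the authors' tool flow: the model file name, the fixed-point word width, and the four-FPGA cluster size are illustrative assumptions, and the parameter-reduction (pruning) step itself is not shown. The sketch only extracts the weights and biases of a trained Keras U-Net, converts them to fixed point, and splits them into per-FPGA shards that could later be streamed over Ethernet.

```python
# Sketch only (assumed flow, not the paper's implementation): extract weights
# and biases from a trained Keras U-Net, quantize to 16-bit fixed point, and
# split the parameter stream into per-FPGA shards.
import numpy as np
import tensorflow as tf

NUM_FPGAS = 4        # assumed cluster size
FRAC_BITS = 8        # assumed fractional bits for fixed-point conversion

# Hypothetical file name for the trained model.
model = tf.keras.models.load_model("unet_trained.h5")

def to_fixed_point(arr, frac_bits=FRAC_BITS):
    """Quantize floating-point parameters to int16 fixed point."""
    scaled = np.round(arr * (1 << frac_bits))
    return np.clip(scaled, -32768, 32767).astype(np.int16)

# Flatten every layer's weights and biases into a single parameter stream.
params = []
for layer in model.layers:
    for tensor in layer.get_weights():   # weights and biases per layer
        params.append(to_fixed_point(tensor).ravel())
stream = np.concatenate(params) if params else np.array([], dtype=np.int16)
print(f"total parameters extracted: {stream.size}")

# Shard the stream so each FPGA receives roughly an equal share of the load;
# each binary file would then be transmitted to its FPGA over Ethernet.
for i, shard in enumerate(np.array_split(stream, NUM_FPGAS)):
    shard.tofile(f"fpga_{i}_params.bin")
```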