This work evaluates the Vitis Unified Software Platform development environment for accelerating tasks such as matrix multiplication. To that end, a matrix multiplication system is implemented with these tools on an FPGA device on a PCIe expansion card installed in a host computer. The system's performance is compared with that of two existing solutions, Vitis BLAS and Intel MKL. The comparison is based on measured execution speed.
The system is implemented for the Alveo U250 Data Center accelerator card. Its design is based on a systolic array for matrix multiplication, uses 16-bit fixed-point arithmetic, and supports matrices of up to 1024 x 1024 elements. Because of the amount of resources left unused after implementation, a second system with two identical pipelines is also implemented. Its architecture is otherwise the same, but it operates at a slightly lower frequency and can execute two operations in parallel.
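To illustrate the principle behind such a design, the following is a minimal software sketch of a systolic array multiplying two small matrices in fixed-point arithmetic. It is not the implemented hardware: the Q8.8 number format, the output-stationary dataflow, the wide accumulator, and all function names are assumptions made for illustration, since the text above states only that a systolic array and 16-bit fixed-point arithmetic are used.

```python
FRAC_BITS = 8  # assumed Q8.8 split; the text only says "16-bit fixed point"


def to_fixed(x, frac_bits=FRAC_BITS):
    """Quantise a float to a signed 16-bit fixed-point integer, saturating."""
    v = int(round(x * (1 << frac_bits)))
    return max(-32768, min(32767, v))


def to_float(v, frac_bits=FRAC_BITS):
    """Convert a fixed-point integer with frac_bits fractional bits to float."""
    return v / (1 << frac_bits)


def systolic_matmul(a, b):
    """Simulate an n x n output-stationary systolic array cycle by cycle.

    a and b are square lists of fixed-point integers. Processing element
    (i, j) accumulates c[i][j]. Row i of `a` enters from the left, delayed
    by i cycles; column j of `b` enters from the top, delayed by j cycles.
    """
    n = len(a)
    acc = [[0] * n for _ in range(n)]    # accumulators (assumed wide enough)
    a_reg = [[0] * n for _ in range(n)]  # operands travelling left to right
    b_reg = [[0] * n for _ in range(n)]  # operands travelling top to bottom
    for t in range(3 * n - 2):           # cycles until the last operand pair meets
        for i in range(n):
            for j in range(n - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            k = t - i
            a_reg[i][0] = a[i][k] if 0 <= k < n else 0
        for j in range(n):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            k = t - j
            b_reg[0][j] = b[k][j] if 0 <= k < n else 0
        # Every processing element performs one multiply-accumulate per cycle.
        for i in range(n):
            for j in range(n):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


def matmul_fixed(a_float, b_float):
    """Quantise two float matrices, multiply them, and convert back to float."""
    a = [[to_fixed(x) for x in row] for row in a_float]
    b = [[to_fixed(x) for x in row] for row in b_float]
    c = systolic_matmul(a, b)
    # A product of two Q8.8 values carries 2 * FRAC_BITS fractional bits.
    return [[to_float(v, 2 * FRAC_BITS) for v in row] for row in c]
```

In this sketch every processing element performs one multiply-accumulate per cycle, so an n x n array finishes in 3n - 2 cycles instead of the n^3 sequential multiply-accumulates of a naive loop. Inputs that are not exactly representable in the assumed format are rounded on entry, which is one reason a fixed-point result can differ from a floating-point reference.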
Due to complications with the Vitis BLAS library, only the Intel MKL library is used for the comparison. Its 32-bit floating-point matrix multiplication performance is measured on four Intel processors: Core i7-4700HQ, Xeon Gold 6144, Xeon Gold 6154, and Xeon Platinum 8180. The results favor the implemented systems, although the data types differ (16-bit fixed point versus 32-bit floating point), so the comparison is not like for like. The key takeaway is that the performance is comparable and that the Vitis development platform can deliver useful results.