We present a configurable implementation of a convolution processing unit suitable for computing mixed-precision quantized neural networks. The design is implemented as a hardware generator written in Chisel, a hardware construction language embedded in Scala for writing circuit generators. Our generator is designed to use few hardware resources and is flexible
with respect to many parameters of the convolution operation, including
image size, kernel size, image bitwidth, kernel bitwidth, activation
function, and more. The processing unit is configurable only at
generation time, so we do not pay the cost of more general
hardware; instead, the generated circuit is tailored to the
problem at hand.
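
As an illustration of what generation-time configuration means, the following minimal sketch in plain Scala (no Chisel dependency) fixes the convolution parameters as constructor arguments and derives dependent quantities from them before any hardware would be emitted. All names here (`ConvConfig`, `outputSize`, `accumulatorBits`) are hypothetical and do not come from the generator itself. The sketch also assumes a stride-1 convolution without padding and an accumulator sized so that it cannot overflow.

```scala
// Hypothetical generation-time configuration. The names and the sizing
// rules below are illustrative assumptions, not the generator's actual API.
final case class ConvConfig(
    imageSize: Int,  // input image is imageSize x imageSize
    kernelSize: Int, // kernel is kernelSize x kernelSize
    imageBits: Int,  // bitwidth of each input value
    kernelBits: Int  // bitwidth of each kernel weight
) {
  require(kernelSize >= 1 && kernelSize <= imageSize, "kernel must fit in the image")
  require(imageBits >= 1 && kernelBits >= 1, "bitwidths must be positive")

  // Output side length, assuming stride 1 and no padding.
  val outputSize: Int = imageSize - kernelSize + 1

  // Number of products accumulated per output value.
  val products: Int = kernelSize * kernelSize

  // A sufficient accumulator width: one product fits in
  // imageBits + kernelBits bits, and summing n products needs at most
  // ceil(log2(n)) additional bits.
  val accumulatorBits: Int = imageBits + kernelBits + ceilLog2(products)

  // Smallest k such that 2^k >= n, for n >= 1.
  private def ceilLog2(n: Int): Int = 32 - Integer.numberOfLeadingZeros(n - 1)
}

object ConvConfigDemo {
  def main(args: Array[String]): Unit = {
    // A 28x28 image with 4-bit values and a 3x3 kernel with 2-bit weights.
    val cfg = ConvConfig(imageSize = 28, kernelSize = 3, imageBits = 4, kernelBits = 2)
    println(s"output size: ${cfg.outputSize}")           // prints "output size: 26"
    println(s"accumulator bits: ${cfg.accumulatorBits}") // prints "accumulator bits: 10"
  }
}
```

Because every value in such a configuration is known before the circuit is generated, widths and loop bounds can be baked into the hardware as constants instead of being handled by runtime control logic.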