Empirical Evaluation of Fixed-Point Arithmetic for Deep Belief Networks
Deep Belief Networks (DBNs) are state-of-the-art machine learning models and among the most important unsupervised learning algorithms. Training DBNs is computationally intensive, which naturally motivates FPGA acceleration. Fixed-point arithmetic can reduce execution time when DBNs are implemented on FPGAs, but its implications for accuracy are unclear. Previous studies have examined accelerators only at a few fixed bit-widths. A contribution of this paper is a comprehensive experimental evaluation of the effect of bit-width on various DBN configurations. Our work builds on the original DBN, which stacks a class of neural networks known as Restricted Boltzmann Machines (RBMs), and on the idea of the Stacked Denoising Auto-Encoder (SDAE). We converted the floating-point versions of the original DBN and the denoising DBN (dDN) into fixed-point versions and compared their performance. Across a range of bit-widths, we find explicit performance changing points, and these points differ between DBN configurations. The performance variations of three-layer DBNs are slightly larger than those of one-layer DBNs because deeper DBNs are more sensitive to quantization. Implementing DBNs on FPGAs also requires approximating the sigmoid function; we quantitatively evaluate the impact of Piecewise Linear Approximation (PLA) of the nonlinearity at two different precisions. Modern FPGAs provide built-in primitives for the matrix operations (multiplications, accumulations, and additions) that dominate DBN computation. We therefore propose a mixed bit-width DBN in which a narrower bit-width is used for neural units and a wider one for weights; this fits the bit-widths of FPGA primitives while achieving accuracy similar to a software implementation.
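To make the two core ingredients concrete, the sketch below shows a minimal fixed-point quantizer (round-to-nearest with saturation, a common FPGA convention) and a piecewise linear approximation of the sigmoid built from evenly spaced knots. This is an illustration under our own assumptions, not the paper's exact hardware scheme; the function names, the 16-bit default, and the [-8, 8] approximation range are all choices made for this example.

```python
import numpy as np

def quantize(x, frac_bits, total_bits=16):
    """Round x to a signed fixed-point grid with `frac_bits` fractional
    bits, saturating at the edges of the `total_bits`-wide range."""
    scale = 2 ** frac_bits
    lo = -(2 ** (total_bits - 1)) / scale
    hi = (2 ** (total_bits - 1) - 1) / scale
    return np.clip(np.round(np.asarray(x, dtype=float) * scale) / scale, lo, hi)

def pla_sigmoid(x, segments=8):
    """Piecewise linear approximation (PLA) of the logistic sigmoid on
    [-8, 8]; outside that range the output is clamped to the endpoint
    values (close to 0 and 1). More segments give higher precision."""
    knots = np.linspace(-8.0, 8.0, segments + 1)
    vals = 1.0 / (1.0 + np.exp(-knots))
    return np.interp(x, knots, vals)  # linear between knots, clamped outside
```

A mixed bit-width configuration in this style would, for example, call `quantize(w, frac_bits=12, total_bits=16)` for weights but `quantize(a, frac_bits=6, total_bits=8)` for unit activations, matching the asymmetric operand widths of FPGA multiplier primitives.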
Our results provide a guide to bit-width design choices when implementing DBNs on FPGAs, clearly documenting the trade-offs in accuracy.