Charisios Loukas, "Large Scale Design and Implementation of Convolutional Neural Networks based on Large FPGA Arrays", Diploma Thesis, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2020
https://doi.org/10.26233/heallink.tuc.84315
The international race of the past few years to build the first exascale supercomputer has created a demand for applications that require comparable amounts of computational work. The higher performance and lower power consumption that hardware accelerators can achieve make their use in such applications inevitable. Convolutional Neural Networks (CNNs) are a prime example of a computationally intensive and highly parallelizable workload whose performance can increase significantly when implemented as such an accelerator, as recent work has shown. This study inherits the CNN hardware accelerator design developed by G. Pitsis and attempts to scale it both horizontally, by incorporating it into the ExaNeSt designs and enabling its use on the QFDB multi-FPGA prototype board, and vertically, by fixing issues in a version of the design that increases the batch size. It also applies a recently published dropout technique on the hardware accelerator at prediction time, demonstrating that additional computation can be traded for increased confidence in the network's results. The boards used in this thesis are the Xilinx ZCU102 and the QFDB, a 4-FPGA prototype board developed at FORTH.
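The abstract does not name the dropout technique. As a rough illustration only, the sketch below assumes it resembles Monte Carlo-style dropout, in which dropout stays active at prediction time and several stochastic forward passes are averaged. The toy single-layer classifier, the function names, and all parameters are hypothetical and are not taken from the thesis or its hardware design. The sketch shows the trade-off in software: more passes cost more computation but give a per-class mean and variance that can serve as a confidence estimate.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_with_dropout(features, weights, drop_prob, rng):
    """One stochastic pass of a toy linear classifier (hypothetical model).

    Inverted dropout is applied to the input features; drop_prob must be < 1.
    """
    keep = 1.0 - drop_prob
    dropped = [f / keep if rng.random() < keep else 0.0 for f in features]
    logits = [sum(w * x for w, x in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(features, weights, drop_prob=0.5, passes=50, seed=0):
    """Average several dropout-enabled passes.

    Returns (mean, variance) per class; a low variance suggests the
    prediction is stable under dropout noise.
    """
    rng = random.Random(seed)
    samples = [
        forward_with_dropout(features, weights, drop_prob, rng)
        for _ in range(passes)
    ]
    n_classes = len(weights)
    mean = [sum(s[c] for s in samples) / passes for c in range(n_classes)]
    var = [
        sum((s[c] - mean[c]) ** 2 for s in samples) / passes
        for c in range(n_classes)
    ]
    return mean, var


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5]                       # made-up input features
    w = [[0.2, -0.1, 0.4], [0.3, 0.1, -0.2]]  # made-up weights, 2 classes
    mean, var = mc_dropout_predict(x, w, drop_prob=0.5, passes=100)
    print("mean class probabilities:", mean)
    print("per-class variance:", var)
```

Raising `passes` multiplies the inference cost roughly linearly, which is the kind of computation-for-confidence exchange the abstract describes; on an accelerator the passes would presumably be batched, though that detail is an assumption here.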