Institutional Repository [SANDBOX]
Technical University of Crete
Reconfigurable logic based acceleration of convolutional neural network training

Flengas Georgios

URI: http://purl.tuc.gr/dl/dias/05A0B1B4-517D-4C54-B42C-E1E04D52AB15
Year: 2024
Type of Item: Diploma Work
Bibliographic Citation: Georgios Flengas, "Reconfigurable logic based acceleration of convolutional neural network training", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2024. https://doi.org/10.26233/heallink.tuc.99113

Summary

Neural network architectures are growing more intricate and datasets are growing exponentially, which has intensified the computational demands of training. Traditional CPUs and GPUs struggle to meet these demands, prompting exploration of FPGA-based acceleration. This research introduces an FPGA-tailored hardware architecture for training Convolutional Neural Networks (CNNs) that prioritizes accuracy, energy efficiency, and speedup over conventional CPU and GPU systems. Building on prior research, we employ General Matrix Multiply (GEMM) and Image to Column (im2col) implementations, coupled with batch-level parallelism. The workload is balanced between the CPU and the FPGA so that the two collaborate efficiently, and multiple operations are combined to reduce computation time and complexity. Machine learning algorithms are integrated with FPGA design tools, including Vitis High-Level Synthesis (HLS), to produce tailored IP blocks for each stage of the neural network training process. The proposed platform achieves a throughput of 374.32 images per second, surpassing the CPU rate of 258.7 images per second but falling behind the GPU rate of 1333.3 images per second, while consuming only 4.16 Watts (0.011 Joules per image). This corresponds to a 16.55x energy efficiency gain over the CPU and a 7.75x gain over the GPU, positioning the proposed platform as a leading candidate for energy-efficient neural network training.
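
To illustrate the im2col and GEMM combination mentioned in the summary, the following is a minimal pure-Python sketch, not the thesis's HLS implementation. It assumes a single-channel image, stride 1, no padding, and the forward pass only; the function names `im2col`, `gemm`, and `conv2d_gemm` are hypothetical and chosen for this example. Each kernel-sized patch is unrolled into a column, so that a convolution over all positions becomes one matrix multiply.

```python
from typing import List

Matrix = List[List[float]]


def im2col(image: Matrix, k: int) -> Matrix:
    """Unroll every k x k patch (stride 1, no padding) into one column."""
    h, w = len(image), len(image[0])
    out_h, out_w = h - k + 1, w - k + 1
    # One row per kernel element, one column per output position.
    cols = [[0.0] * (out_h * out_w) for _ in range(k * k)]
    for i in range(out_h):
        for j in range(out_w):
            col = i * out_w + j
            for di in range(k):
                for dj in range(k):
                    cols[di * k + dj][col] = image[i + di][j + dj]
    return cols


def gemm(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix multiply: (m x n) times (n x p) gives (m x p)."""
    n, p = len(b), len(b[0])
    return [[sum(row[t] * b[t][c] for t in range(n)) for c in range(p)]
            for row in a]


def conv2d_gemm(image: Matrix, kernels: List[Matrix]) -> List[Matrix]:
    """Convolve one image with several k x k kernels via im2col + GEMM."""
    k = len(kernels[0])
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    # Flatten each kernel into one row: (num_kernels x k*k).
    weights = [[v for krow in kern for v in krow] for kern in kernels]
    flat = gemm(weights, im2col(image, k))
    # Reshape each flat output row back into an out_h x out_w feature map.
    return [[row[i * out_w:(i + 1) * out_w] for i in range(out_h)]
            for row in flat]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernels = [[[1, 0], [0, 1]],   # sums the main diagonal of each patch
           [[1, 1], [1, 1]]]   # sums all four elements of each patch
feature_maps = conv2d_gemm(image, kernels)
```

Under the same assumptions, a batch of images could be handled by placing each image's columns side by side in one larger matrix, so that a single GEMM covers the whole batch; whether the thesis realizes batch-level parallelism this way is not stated in the summary.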
