The work titled "Modelling error gradients in deep learning methods using reconfigurable hardware" by Fragiadakis-Theodorouleas Michail is licensed under Creative Commons Attribution 4.0 International
Bibliographic Citation
Michail Fragiadakis-Theodorouleas, "Modelling error gradients in deep learning methods using reconfigurable hardware", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2018
https://doi.org/10.26233/heallink.tuc.80191
During the training of a neural network, a significant amount of resources remains unused because of data dependencies: each layer must wait for the forward pass and the error backpropagation to complete. Decoupled neural interfaces using gradient error modelling were introduced to overcome this limitation, allowing each layer to be updated before backpropagation has completed and provided it with the error gradient.
In this diploma thesis we examine the parallelisation of decoupled neural interface operations when implemented on a Field Programmable Gate Array (FPGA), configurations that can decrease training time, and the effects of decoupled neural interfaces on the training ability and the accuracy of the network.
In this thesis we conclude that:
Integrating the synthetic gradient error has no negative effect on accuracy or representational strength.
Adding the synthetic gradient error introduces noise into the training error, which results in regularisation of the training error. This benefits the training process, as it broadens the exploration of the error space, decreases the generalisation error of the neural network and prevents overfitting to the training dataset.
Synthetic gradient error modelling decreases training time only in particular cases.
Combining the synthetic and true gradient error increases the number of error-correction updates applied to the neural layers and accelerates the convergence of the training error.
Although combining the synthetic and true gradient error increases the latency of a training pass, the overall training time can decrease because of the higher error convergence rate.
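The decoupling idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's FPGA implementation. It assumes a toy two-layer linear network with a squared-error loss and a linear gradient model, and all names (`predict_gradient`, `update_layer`, `update_module`, `train_step`) are hypothetical. The first layer is updated immediately from a predicted (synthetic) gradient, and the gradient model is corrected later, once the true gradient is available.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def predict_gradient(M, b, h):
    """Synthetic gradient: a linear guess of dLoss/dh from the activation h.
    (Assumption: the gradient model is linear, g_hat = M h + b.)"""
    return [p + bi for p, bi in zip(matvec(M, h), b)]

def update_layer(W1, x, grad_h, lr):
    """Gradient step on W1 for h = W1 x, given a true or synthetic dLoss/dh."""
    return [[w - lr * g * xj for w, xj in zip(row, x)]
            for row, g in zip(W1, grad_h)]

def update_module(M, b, h, g_true, lr):
    """One gradient step on 0.5 * ||g_hat - g_true||^2 with respect to M and b."""
    err = [p - t for p, t in zip(predict_gradient(M, b, h), g_true)]
    M_new = [[m - lr * e * hj for m, hj in zip(row, h)]
             for row, e in zip(M, err)]
    b_new = [bi - lr * e for bi, e in zip(b, err)]
    return M_new, b_new

def train_step(W1, w2, M, b, x, target, lr=0.05):
    h = matvec(W1, x)                         # forward pass through layer 1
    g_hat = predict_gradient(M, b, h)         # available immediately
    W1_new = update_layer(W1, x, g_hat, lr)   # layer 1 updates without waiting
    y = sum(w * hi for w, hi in zip(w2, h))   # remainder of the forward pass
    delta = y - target                        # dLoss/dy for 0.5 * (y - target)^2
    g_true = [delta * w for w in w2]          # true dLoss/dh, arrives later
    w2_new = [w - lr * delta * hi for w, hi in zip(w2, h)]
    M_new, b_new = update_module(M, b, h, g_true, lr)  # fit model to true gradient
    return W1_new, w2_new, M_new, b_new

# Usage: one training step on a single (hypothetical) sample.
W1, w2 = [[0.1, 0.2], [0.3, 0.4]], [0.5, -0.5]
M, b = [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]
W1, w2, M, b = train_step(W1, w2, M, b, x=[1.0, 2.0], target=1.0)
```

In this sequential sketch the layer-1 update and the rest of the pass run one after the other. On parallel hardware the point of the decoupling is that they would not have to, since the layer-1 update no longer depends on the true gradient.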