Emmanouil Limnaios, "Study of gradient and stochastic gradient algorithms for Logistic Regression", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2023
https://doi.org/10.26233/heallink.tuc.94861
In this Diploma thesis, we study Logistic Regression (LR), a widely used method for classification. We start by presenting the regularized LR cost function and computing its gradient and Hessian. It is well known that the LR cost function is convex. Our main aim is to study the performance (convergence speed and solution accuracy) of deterministic versus stochastic algorithms for the minimization of the regularized LR cost function. First, we present two variants of the deterministic (full) gradient algorithm, one with a “naive” step-size and one with backtracking line search. Next, we move to the (Nesterov-type) accelerated full gradient algorithm. Then, we present variants of stochastic gradient descent with step-sizes computed by various methods, including (1) exploiting the strong convexity of the regularized LR cost function, (2) an Armijo line search that uses only the subset of the data determined by the batch size, and (3) an ad-hoc line search based on the angle between two successive stochastic gradients. We test the performance of the various algorithms on synthetic data (linearly separable and linearly non-separable). We observe that some stochastic variants (especially the variant that exploits the strong convexity of the regularized LR cost function) perform quite well during the first epochs, while the accelerated gradient algorithms become more accurate after the first epochs. In general, accelerated stochastic gradient-type algorithms are fast during the first epochs but not very accurate. Thus, more sophisticated accelerated stochastic algorithms must be pursued.
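
To make the comparison above concrete, the following is a minimal Python/NumPy sketch of the regularized LR cost and its gradient, a full gradient method with backtracking (Armijo) line search, and a mini-batch stochastic gradient method. It is an illustration under stated assumptions, not the thesis's implementation: labels are assumed to lie in {-1, +1}, the cost is assumed to be averaged over the n samples with an (lam/2)||w||^2 penalty, and the stochastic step-size 1/(lam (k + k0)) is one common rule for lam-strongly convex costs, which may differ from the rule used in the thesis. All function names and parameter values (alpha, beta, k0, batch, epochs) are hypothetical.

```python
import numpy as np

def lr_cost(w, X, y, lam):
    """Assumed cost: mean_i log(1 + exp(-y_i x_i^T w)) + (lam/2) ||w||^2."""
    z = y * (X @ w)
    return np.mean(np.logaddexp(0.0, -z)) + 0.5 * lam * (w @ w)

def lr_grad(w, X, y, lam):
    """Gradient of lr_cost with respect to w."""
    z = y * (X @ w)
    s = np.exp(-np.logaddexp(0.0, z))  # sigmoid(-z), computed stably
    return -(X.T @ (y * s)) / X.shape[0] + lam * w

def gd_backtracking(X, y, lam, alpha=0.25, beta=0.5, tol=1e-6, max_iter=500):
    """Full gradient descent; step-size chosen by backtracking (Armijo)."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        g = lr_grad(w, X, y, lam)
        if np.linalg.norm(g) < tol:
            break
        t, f = 1.0, lr_cost(w, X, y, lam)
        # Shrink t until the sufficient-decrease condition holds.
        while lr_cost(w - t * g, X, y, lam) > f - alpha * t * (g @ g):
            t *= beta
        w = w - t * g
    return w

def sgd_strongly_convex(X, y, lam, batch=10, epochs=20, k0=10, seed=0):
    """Mini-batch SGD with the assumed step-size 1 / (lam * (k + k0))."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    w, k = np.zeros(X.shape[1]), 0
    for _ in range(epochs):
        perm = rng.permutation(n)  # reshuffle the data once per epoch
        for start in range(0, n, batch):
            idx = perm[start:start + batch]
            eta = 1.0 / (lam * (k + k0))
            w = w - eta * lr_grad(w, X[idx], y[idx], lam)
            k += 1
    return w

# Synthetic data: labels from a random hyperplane, then 10% of them
# flipped so the set is (most likely) not linearly separable.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.where(X @ rng.standard_normal(5) >= 0, 1.0, -1.0)
flip = rng.random(200) < 0.1
y[flip] = -y[flip]

lam = 0.1
print("full gradient cost:", lr_cost(gd_backtracking(X, y, lam), X, y, lam))
print("SGD cost:          ", lr_cost(sgd_strongly_convex(X, y, lam), X, y, lam))
```

Because the penalty makes the cost lam-strongly convex, the minimizer is unique, so the two final cost values can be compared directly. Tracking the cost per epoch, rather than only at the end, would be the natural way to examine the early-epoch behavior the abstract describes.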