Nikolaos Tzimos, "Distributed and Online maintenance of graphical models in Apache Flink", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2023
https://doi.org/10.26233/heallink.tuc.95918
With the growing need for large-scale data analysis, distributed machine learning has gained importance in recent years. Raw data are often described by a large number of interrelated variables, and an important task is to describe the joint probability distribution over these variables, which in turn allows inferences and predictions to be made. Directly modeling the joint probability distribution of all these variables may be infeasible, since the complexity of such a model grows exponentially with the number of variables. We focus on Bayesian Networks, a foundational class of graphical models, and present an alternative communication-efficient approach, based on the well-known method of Functional Geometric Monitoring, for continuously learning and maintaining Bayesian Networks in a distributed streaming environment. Finally, the experimental results confirm the functionality of the proposed method.
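
The exponential growth noted in the abstract can be illustrated with a small parameter count. The sketch below is not taken from the thesis: the function names, the binary-variable assumption, and the 20-node chain structure are hypothetical choices made only for illustration. It compares the number of free parameters in an unrestricted joint table with the number needed when the distribution factorizes as a Bayesian Network, where each node stores one conditional probability table given its parents.

```python
def full_joint_params(n, k=2):
    """Free parameters of an unrestricted joint table over n variables,
    each with k states (one entry is fixed by normalization)."""
    return k ** n - 1


def bayes_net_params(parents, k=2):
    """Free parameters of a Bayesian Network given as {node: [parent nodes]}.
    Each node needs (k - 1) free values per configuration of its parents."""
    return sum((k - 1) * k ** len(ps) for ps in parents.values())


# Hypothetical structure (an assumption): a chain X0 -> X1 -> ... -> X19.
n = 20
chain = {i: ([] if i == 0 else [i - 1]) for i in range(n)}

joint_count = full_joint_params(n)    # 2**20 - 1
chain_count = bayes_net_params(chain) # 1 + 19 * 2

print(joint_count, chain_count)
```

Under these assumptions the full table needs over a million parameters while the chain needs 39, which is the kind of saving that makes factorized models practical; real networks with larger parent sets fall somewhere in between.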