The work titled "Valid support for autoscaling of Apache Flink on Kubernetes infrastructure in the cloud" by Zafeirakopoulos Alexandros-Nikolaos is licensed under the Creative Commons Attribution 4.0 International license.
Bibliographic Citation
Alexandros-Nikolaos Zafeirakopoulos, "Valid support for autoscaling of Apache Flink on Kubernetes infrastructure in the cloud" [original title: "Έγκυρη υποστήριξη αυτόματης κλιμάκωσης του Apache Flink σε υποδομή Kubernetes στο υπολογιστικό νέφος"], Diploma Thesis, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2022
https://doi.org/10.26233/heallink.tuc.95102
Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. It executes arbitrary dataflow programs in a data-parallel and pipelined manner, supporting event-driven applications such as fraud detection (detecting suspicious transactions), anomaly detection (detecting rare or suspicious events), rule-based alerting (identifying data that satisfy one or more rules), and many more. Despite its versatility, Apache Flink cannot automatically and optimally adjust its use of the underlying computing resources when streaming sources produce data at varying rates. To address this issue, we describe an autonomous agent that supports dynamic autoscaling for Apache Flink on Kubernetes. The agent monitors, models, and adjusts Flink's behaviour by modifying its allocated resources to match the incoming workload at minimum cost. The decision-making process is based on operator idleness and on changes in the record lag of the input. We show that our model not only maintains the performance of the application while minimizing infrastructure costs, but can also provide a better performance-to-cost ratio than existing work on Flink autoscaling. The effectiveness of our model is supported by an extensive set of synthetic and real-life workloads designed to simulate a wide range of possible scenarios.
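To make the decision inputs concrete, the following is a minimal sketch of a scaling rule driven by operator idleness and the change in input record lag. It is an illustration only and not the thesis's actual model: the names `Metrics` and `decide_parallelism`, the one-step scaling policy, the idleness threshold of 0.5, and the parallelism bounds are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    idleness: float  # assumed: fraction of time operators are idle, 0.0 to 1.0
    lag_delta: int   # assumed: change in input record lag since the last check


def decide_parallelism(current: int, m: Metrics,
                       idle_high: float = 0.5,
                       min_p: int = 1, max_p: int = 16) -> int:
    """Return a new parallelism (hypothetical rule, thresholds are assumptions)."""
    if m.lag_delta > 0:
        # Lag is growing, so the job is falling behind: add one instance.
        return min(current + 1, max_p)
    if m.idleness > idle_high:
        # Lag is not growing and operators are mostly idle: release one instance.
        return max(current - 1, min_p)
    # Lag is stable and operators are busy enough: keep the current allocation.
    return current
```

In this sketch, growing lag always takes priority over idleness, so the rule favours application performance first and reduces cost only when the job is keeping up with its input.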