Institutional Repository [SANDBOX]
Technical University of Crete

Outlier detection using Spark Streaming

Psarakis Kyriakos

URI: http://purl.tuc.gr/dl/dias/02E820B1-D8E6-45DF-ACFA-7C37D76F29DA
Identifier: https://doi.org/10.26233/heallink.tuc.70468
Language: en
Extent: 50 pages
Title: Outlier detection using Spark Streaming
Title (Greek): Ανίχνευση δεδομένων άτυπης συμπεριφοράς με το σύστημα Spark Streaming
Creator: Psarakis Kyriakos (Ψαρακης Κυριακος)
Contributor [Thesis Supervisor]: Deligiannakis Antonios (Δεληγιαννακης Αντωνιος)
Contributor [Committee Member]: Garofalakis Minos (Γαροφαλακης Μινως)
Contributor [Committee Member]: Lagoudakis Michael (Λαγουδακης Μιχαηλ)
Publisher: Technical University of Crete (Πολυτεχνείο Κρήτης)
Academic Unit: Technical University of Crete::School of Electrical and Computer Engineering
Content Summary: Data is continuously generated by sources such as machines, network traffic, and sensor networks. Timely and accurate detection of outliers in massive data streams has important applications, including machine-failure prevention, intrusion detection, and financial fraud detection. In this thesis, we implement an outlier detection algorithm inside the Spark Streaming environment that makes only one pass over the data while using limited storage. We chose Spark Streaming because it offers scalable, high-throughput, fault-tolerant processing of live data streams. The algorithm adapts ideas from matrix sketching to maintain a small set of orthogonal vectors that form a good approximate basis for all the observed data. Using this orthogonal basis, outliers in new incoming data are detected with a simple reconstruction error test. Additionally, we have implemented two methods for updating the orthogonal vectors, one deterministic and one randomized, to further speed up the algorithm at a small cost in accuracy.
Type of Item: Diploma Work
License: http://creativecommons.org/licenses/by/4.0/
Date of Item: 2017-12-19
Date of Publication: 2017
Subject: Data streams
Subject: Outlier detection
Bibliographic Citation: Kyriakos Psarakis, "Outlier detection using Spark Streaming", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2017
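
The reconstruction error test mentioned in the Content Summary can be illustrated with a minimal sketch. It is not the thesis implementation: it uses plain Python rather than Spark Streaming, the basis is built once with Gram-Schmidt instead of being updated by matrix sketching, and the function names and the threshold value of 0.1 are hypothetical choices made only for this example. A point is flagged when the fraction of its squared norm that lies outside the span of the basis exceeds the threshold.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors, tol=1e-12):
    """Modified Gram-Schmidt: orthonormal basis for the span of `vectors`.

    Stand-in for the sketch-maintained basis described in the summary
    (assumption: the real basis comes from a matrix sketching update).
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors already in the span
            basis.append([wi / norm for wi in w])
    return basis


def reconstruction_error(x, basis):
    """Relative squared residual of x after projection onto span(basis)."""
    norm_sq = dot(x, x)
    if norm_sq == 0.0:
        return 0.0
    proj_sq = sum(dot(x, b) ** 2 for b in basis)
    return max(0.0, norm_sq - proj_sq) / norm_sq


def is_outlier(x, basis, threshold=0.1):
    """Flag x when its relative reconstruction error exceeds `threshold`.

    The default threshold is a hypothetical value, not taken from the thesis.
    """
    return reconstruction_error(x, basis) > threshold


# Basis spanning the x-y plane of a 3-dimensional space.
basis = orthonormalize([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
print(is_outlier([2.0, 3.0, 0.0], basis))  # False: lies in the plane
print(is_outlier([0.0, 0.0, 5.0], basis))  # True: orthogonal to the plane
```

In a streaming setting, the same test would be applied to each incoming point against the current basis, and the basis would then be refreshed by the deterministic or randomized update that the summary describes.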
