<efrbr:recordSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:efrbr="http://vfrbr.info/efrbr/1.1" xmlns:efrbr-work="http://vfrbr.info/efrbr/1.1/work" xmlns:efrbr-expression="http://vfrbr.info/efrbr/1.1/expression" xmlns:efrbr-manifestation="http://vfrbr.info/efrbr/1.1/manifestation" xmlns:efrbr-person="http://vfrbr.info/efrbr/1.1/person" xmlns:efrbr-corporateBody="http://vfrbr.info/efrbr/1.1/corporateBody" xmlns:efrbr-concept="http://vfrbr.info/efrbr/1.1/concept" xmlns:efrbr-structure="http://vfrbr.info/efrbr/1.1/structure" xmlns:efrbr-responsible="http://vfrbr.info/efrbr/1.1/responsible" xmlns:efrbr-subject="http://vfrbr.info/efrbr/1.1/subject" xmlns:efrbr-other="http://vfrbr.info/efrbr/1.1/other" xsi:schemaLocation="http://vfrbr.info/efrbr/1.1 http://vfrbr.info/schemas/1.1/efrbr.xsd"><efrbr:entities><efrbr-work:work identifier="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3"><efrbr-work:titleOfTheWork>Clustering Big Data Streams in Apache Flink</efrbr-work:titleOfTheWork></efrbr-work:work><efrbr-expression:expression identifier="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3"><efrbr-expression:titleOfTheExpression>Clustering Big Data Streams in Apache Flink</efrbr-expression:titleOfTheExpression><efrbr-expression:titleOfTheExpression>Συσταδοποίηση Μεγάλων Ροών Δεδομένων στο Apache Flink</efrbr-expression:titleOfTheExpression><efrbr-expression:formOfExpression vocabulary="DIAS:TYPES">
            Διπλωματική Εργασία
            Diploma Work
         </efrbr-expression:formOfExpression><efrbr-expression:dateOfExpression type="issued">2018-10-11</efrbr-expression:dateOfExpression><efrbr-expression:dateOfExpression type="published">2018</efrbr-expression:dateOfExpression><efrbr-expression:languageOfExpression vocabulary="iso639-1">en</efrbr-expression:languageOfExpression><efrbr-expression:summarizationOfContent>We live in the era of Big Data, where massive amounts of information are generated continuously from numerous types of sources. The goal today is to apply techniques that account for the volume, variety and velocity of the data, in order to gain insight that traditional data processing software cannot reveal. Cluster analysis is a technique that groups a set of objects such that objects in the same group have similar properties. It is commonly used in the fields of machine learning, data mining, statistical data analysis, pattern recognition and bioinformatics. In this thesis, we propose a parallel implementation of StreamKM++, a well-known unsupervised learning algorithm for clustering data streams in an online fashion. For development, the Apache Flink framework is chosen: a distributed streaming engine offering high-throughput, low-latency and fault-tolerant computation over unbounded and bounded data streams. First, we introduce the theoretical background of the implemented algorithm and the distributed framework. We then propose a parallel implementation that computes the set of cluster centers after the input dataset has been consumed. In addition, we propose an alternative implementation that periodically issues requests for the re-evaluation of the cluster centers. Finally, we develop a program that exploits the Queryable State feature of Flink to allow the user to query the most up-to-date values of the cluster centers.
Experimental evaluation shows that increasing the level of parallelism significantly reduces the running time, while the quality of the clustering improves slightly.</efrbr-expression:summarizationOfContent><efrbr-expression:useRestrictionsOnTheExpression type="creative-commons">http://creativecommons.org/licenses/by/4.0/</efrbr-expression:useRestrictionsOnTheExpression><efrbr-expression:note type="academic unit">Πολυτεχνείο Κρήτης::Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών</efrbr-expression:note></efrbr-expression:expression><efrbr-manifestation:manifestation identifier="http://purl.tuc.gr/dl/dias/69D5BA67-1DDB-4851-9937-380C1F38BF7C"><efrbr-manifestation:titleOfTheManifestation>Bitsakis_Theodoros_Dip_2018.pdf</efrbr-manifestation:titleOfTheManifestation><efrbr-manifestation:publicationDistribution><efrbr-manifestation:placeOfPublicationDistribution type="distribution">Chania [Greece]</efrbr-manifestation:placeOfPublicationDistribution><efrbr-manifestation:publisherDistributor type="distributor">Library of TUC</efrbr-manifestation:publisherDistributor><efrbr-manifestation:dateOfPublicationDistribution>2018-10-11</efrbr-manifestation:dateOfPublicationDistribution></efrbr-manifestation:publicationDistribution><efrbr-manifestation:formOfCarrier>application/pdf</efrbr-manifestation:formOfCarrier><efrbr-manifestation:extentOfTheCarrier>2.7 MB</efrbr-manifestation:extentOfTheCarrier><efrbr-manifestation:accessRestrictionsOnTheManifestation>free</efrbr-manifestation:accessRestrictionsOnTheManifestation></efrbr-manifestation:manifestation><efrbr-person:person identifier="http://users.isc.tuc.gr/~tbitsakis"><efrbr-person:nameOfPerson vocabulary="TUC:LDAP">
            Bitsakis Theodoros
            Μπιτσακης Θεοδωρος
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="http://users.isc.tuc.gr/~adeligiannakis"><efrbr-person:nameOfPerson vocabulary="TUC:LDAP">
            Deligiannakis Antonios
            Δεληγιαννακης Αντωνιος
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="http://users.isc.tuc.gr/~mgarofalakis"><efrbr-person:nameOfPerson vocabulary="TUC:LDAP">
            Garofalakis Minos
            Γαροφαλακης Μινως
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-person:person identifier="http://users.isc.tuc.gr/~vsamoladas"><efrbr-person:nameOfPerson vocabulary="TUC:LDAP">
            Samoladas Vasilis
            Σαμολαδας Βασιλης
         </efrbr-person:nameOfPerson></efrbr-person:person><efrbr-corporateBody:corporateBody identifier="06901DA2-CE20-40D6-8D70-FBAAFCE63C2D"><efrbr-corporateBody:nameOfTheCorporateBody vocabulary="">
            Πολυτεχνείο Κρήτης
            Technical University of Crete
         </efrbr-corporateBody:nameOfTheCorporateBody></efrbr-corporateBody:corporateBody><efrbr-concept:concept identifier="003460B6-259B-4E1C-9782-5CB856059484"><efrbr-concept:termForTheConcept>
            Apache Flink
         </efrbr-concept:termForTheConcept></efrbr-concept:concept><efrbr-concept:concept identifier="EAEC318A-5B4A-4F8F-8690-277B5D5EDEA1"><efrbr-concept:termForTheConcept>
            Data Stream Clustering
         </efrbr-concept:termForTheConcept></efrbr-concept:concept></efrbr:entities><efrbr:relationships><efrbr-structure:structureRelations><efrbr-structure:realizedThrough sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="expression" targetURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3"/><efrbr-structure:embodiedIn sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="manifestation" targetURI="http://purl.tuc.gr/dl/dias/69D5BA67-1DDB-4851-9937-380C1F38BF7C"/></efrbr-structure:structureRelations><efrbr-responsible:responsibleRelations><efrbr-responsible:createdBy sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="person" targetURI="http://users.isc.tuc.gr/~tbitsakis"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="person" targetURI="http://users.isc.tuc.gr/~tbitsakis" role="author"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="person" targetURI="http://users.isc.tuc.gr/~adeligiannakis" role="http://purl.tuc.gr/dl/dias/vocabs/contributor-roles/1"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="person" targetURI="http://users.isc.tuc.gr/~mgarofalakis" role="http://purl.tuc.gr/dl/dias/vocabs/contributor-roles/2"/><efrbr-responsible:realizedBy sourceEntity="expression" sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="person" targetURI="http://users.isc.tuc.gr/~vsamoladas" role="http://purl.tuc.gr/dl/dias/vocabs/contributor-roles/2"/><efrbr-responsible:realizedBy sourceEntity="expression" 
sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="person" targetURI="06901DA2-CE20-40D6-8D70-FBAAFCE63C2D" role="publisher"/></efrbr-responsible:responsibleRelations><efrbr-subject:subjectRelations><efrbr-subject:hasSubject sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="concept" targetURI="003460B6-259B-4E1C-9782-5CB856059484"/><efrbr-subject:hasSubject sourceEntity="work" sourceURI="http://purl.tuc.gr/dl/dias/BEEF5DFC-1996-42F3-A0E6-3F3D10EB5CF3" targetEntity="concept" targetURI="EAEC318A-5B4A-4F8F-8690-277B5D5EDEA1"/></efrbr-subject:subjectRelations><efrbr-other:otherRelations/></efrbr:relationships></efrbr:recordSet>