Institutional Repository [SANDBOX]
Technical University of Crete

FPGA-Based system design for applications of de Bruijn Graphs

Rompogiannakis Emmanouil-Eleftherios


URI: http://purl.tuc.gr/dl/dias/EAB4F00F-749F-4529-96C2-7E4E9EB0057D
Year: 2022
Type of Item: Diploma Work
Bibliographic Citation: Emmanouil-Eleftherios Rompogiannakis, "FPGA-Based system design for applications of de Bruijn Graphs", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2022. https://doi.org/10.26233/heallink.tuc.91724

Summary

The mathematical properties of the de Bruijn graph were originally introduced in 1951 by the Dutch mathematicians Tanja van Aardenne-Ehrenfest and Nicolaas Govert de Bruijn. The de Bruijn graph is a directed graph representing overlaps between sequences of symbols; it has several uses in telecommunications, in protocols and networks, and in bioinformatics, specifically in de novo genome assembly. The properties of the de Bruijn graph and its promising uses in de novo genome assembly have been presented in several scientific articles. In this thesis we have implemented an FPGA-based prototype hardware system for de Bruijn graph applications in de novo genome assembly. We used the Russian genome assembler SPAdes 13.0 as a case study for the use of de Bruijn graphs. SPAdes 13.0 is a current-generation tool that is widely used in the field, and it is also used to verify our experimental results. The data sets used in this thesis come from the European Nucleotide Archive (ENA). The Alveo U50 FPGA was the target technology for the experimental results. The resulting speedup is modest (up to 1.14x-1.35x) for small data sets, and the system performs worse than SPAdes for large data sets, the bottleneck being the available resources and the memory subsystem. Different accelerator cards with more storage capacity and resources could better exploit parallelism with more compute units. Thus, this thesis is more of a first-generation feasibility study and can form the baseline for future accelerator architectures.
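
To make the graph structure described in the summary concrete, the following is a minimal software sketch of how a de Bruijn graph can be built from sequencing reads. It is an illustration only, not the FPGA design from the thesis and not the SPAdes implementation. The function name `build_de_bruijn`, the parameter `k`, and the example reads are assumptions chosen for this example. Each distinct k-mer contributes one directed edge from its (k-1)-symbol prefix to its (k-1)-symbol suffix.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph as adjacency lists (hypothetical helper).

    Nodes are (k-1)-mers; every distinct k-mer found in the reads adds
    an edge from its prefix (first k-1 symbols) to its suffix (last k-1).
    """
    # Collect the distinct k-mers across all reads.
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])

    # Turn each k-mer into a prefix -> suffix edge.
    graph = defaultdict(set)
    for kmer in kmers:
        graph[kmer[:-1]].add(kmer[1:])

    # Sorted lists give a deterministic, easy-to-read result.
    return {node: sorted(successors) for node, successors in graph.items()}


# Two overlapping toy reads (made up for illustration).
graph = build_de_bruijn(["ACGTC", "GTCA"], k=3)
for node in sorted(graph):
    print(node, "->", graph[node])
```

In this toy case the two reads overlap in the k-mer GTC, so they merge into a single path through the graph. Following such paths is the basic idea behind reconstructing a longer sequence in de novo assembly.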
