Institutional Repository [SANDBOX]
Technical University of Crete

Scalable phylogeny reconstruction with disaggregated near-memory processing

Alachiotis Nikolaos, Skrimponis Panagiotis, Pissadakis Emmanouil, Pnevmatikatos Dionysios

URI: http://purl.tuc.gr/dl/dias/55F291BA-07D1-4EDD-80CE-14BA2CC2F080
Year: 2022
Type of Item: Peer-Reviewed Journal Publication
Bibliographic Citation: N. Alachiotis, P. Skrimponis, M. Pissadakis and D. Pnevmatikatos, “Scalable phylogeny reconstruction with disaggregated near-memory processing,” ACM Trans. Reconfigurable Technol. Syst., vol. 15, no. 3, 2022, doi: 10.1145/3484983. https://doi.org/10.1145/3484983

Summary

Disaggregated computer architectures eliminate resource fragmentation in next-generation datacenters by enabling virtual machines to employ resources such as CPUs, memory, and accelerators that are physically located on different servers. While this paves the way for highly compute- and/or memory-intensive applications to potentially deploy all CPUs and/or memory resources in a datacenter, it poses a major challenge to the efficient deployment of hardware accelerators: input/output data can reside on different servers than the ones hosting accelerator resources, thereby requiring time- and energy-consuming remote data transfers that diminish the gains of hardware acceleration. Targeting a disaggregated datacenter architecture similar to the IBM dReDBox disaggregated datacenter prototype, the present work explores the potential of deploying custom acceleration units adjacent to the disaggregated-memory controller, which is implemented on FPGA technology and resides on memory bricks (in dReDBox terminology), to reduce data movement and improve performance and energy efficiency when reconstructing large phylogenies (evolutionary relationships among organisms). A fundamental computational kernel is the Phylogenetic Likelihood Function (PLF), which dominates the total execution time (up to 95%) of widely used maximum-likelihood methods. Numerous efforts to boost PLF performance over the years have focused on accelerating computation; however, since the PLF is a data-intensive, memory-bound operation, performance remains limited by data movement, and memory disaggregation only exacerbates the problem.
We describe two near-memory processing models. The first addresses the problem of workload distribution to memory bricks and is particularly tailored toward larger genomes (e.g., plants and mammals). The second reduces overall memory requirements through memory-side data interpolation that is transparent to the application, thereby allowing the phylogeny size to scale to a larger number of organisms without requiring additional memory.