The best of many worlds: scheduling machine learning inference on CPU-GPU integrated architectures

Vasiliadis Giorgos, Tsirbas Rafail, Ioannidis Sotirios

Full Record

URI:

http://purl.tuc.gr/dl/dias/30E0BFB7-D2EE-4943-B7A2-0F2D1C5DA2EC

Year

2022

Type

Conference Full Paper


Bibliographic Citation

G. Vasiliadis, R. Tsirbas and S. Ioannidis, "The best of many worlds: scheduling machine learning inference on CPU-GPU integrated architectures," in Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2022), Lyon, France, 2022, pp. 55-64, doi: 10.1109/IPDPSW55747.2022.00017. https://doi.org/10.1109/IPDPSW55747.2022.00017

Appears in Collections

Conference Publications in the School of Electrical and Computer Engineering community

Conference Publications in the Microprocessor and Hardware Laboratory community

Abstract

A plethora of applications use machine learning, whose operations are becoming more complex and require additional computing power. At the same time, typical commodity system setups (including desktops, servers, and embedded devices) now offer different processing devices, the most common of which are multi-core CPUs, integrated GPUs, and discrete GPUs. In this paper, we follow a data-driven approach, where we first show the performance of different processing devices when executing a diversified set of inference engines; different processing devices perform better on different performance metrics (e.g., throughput, latency, and power consumption), while at the same time, these metrics may also deviate significantly among different applications. Based on these findings, we propose an adaptive scheduling approach, tailored for machine learning inference operations, that enables the use of the most efficient processing device available. Our scheduler is device-agnostic and can respond quickly to dynamic fluctuations that occur in real time, such as data bursts, application overloads, and system changes. The experimental results show that it is able to match the peak throughput by correctly predicting the optimal processing device with an accuracy of 92.5%, with energy savings of up to 10%.
