The work with title "Explainable machine learning pipeline for Twitter bot detection during the 2020 US Presidential Elections" by Shevtsov Alexander, Tzagkarakis Christos, Antonakaki Despoina, Ioannidis Sotirios is licensed under Creative Commons Attribution 4.0 International.
Bibliographic Citation
A. Shevtsov, C. Tzagkarakis, D. Antonakaki, and S. Ioannidis, “Explainable machine learning pipeline for Twitter bot detection during the 2020 US Presidential Elections,” Software Impacts, vol. 13, Aug. 2022, doi: 10.1016/j.simpa.2022.100333.
https://doi.org/10.1016/j.simpa.2022.100333
This study introduces a novel, reproducible and reusable Twitter bot identification system. The system uses a machine learning (ML) pipeline fed with hundreds of features extracted from a Twitter corpus. The pipeline trains and validates several state-of-the-art ML models, and the eXtreme Gradient Boosting (XGBoost) model is selected because it achieves the highest detection performance. The Twitter dataset was collected during the 2020 US Presidential Elections, and additional experimental evaluation on distinct Twitter datasets demonstrates the superiority of our approach in terms of bot detection accuracy.
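The "train and validate several models, then keep the best one" step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic data, the single `tweets_per_day` feature, and the two toy classifiers (`MajorityBaseline`, `ThresholdStump`) are hypothetical stand-ins, not the authors' features or models, and XGBoost itself is not used here. The sketch only shows the selection logic: score each candidate with k-fold cross-validation and pick the highest mean accuracy.

```python
import random
from statistics import mean


def make_toy_accounts(n=200, seed=0):
    """Build synthetic (features, label) pairs; label 1 means bot. Illustrative only."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        is_bot = rng.random() < 0.5
        # Hypothetical feature: tweets per day, drawn higher for bots in this toy data.
        tweets_per_day = rng.gauss(80 if is_bot else 10, 15)
        data.append(([tweets_per_day], int(is_bot)))
    return data


class MajorityBaseline:
    """Always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = int(sum(y) * 2 >= len(y))
        return self

    def predict(self, X):
        return [self.label for _ in X]


class ThresholdStump:
    """Predicts bot when feature 0 exceeds the midpoint of the two class means."""

    def fit(self, X, y):
        pos = [x[0] for x, t in zip(X, y) if t == 1]
        neg = [x[0] for x, t in zip(X, y) if t == 0]
        self.threshold = (mean(pos) + mean(neg)) / 2
        return self

    def predict(self, X):
        return [int(x[0] > self.threshold) for x in X]


def cross_val_accuracy(model_factory, data, k=5):
    """Mean accuracy over k folds; each fold is held out once for validation."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = model_factory().fit([x for x, _ in train], [t for _, t in train])
        preds = model.predict([x for x, _ in test])
        scores.append(mean(int(p == t) for p, (_, t) in zip(preds, test)))
    return mean(scores)


def select_best(candidates, data, k=5):
    """Score every candidate and return the name of the best one plus all scores."""
    scores = {name: cross_val_accuracy(factory, data, k)
              for name, factory in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


data = make_toy_accounts()
best_name, scores = select_best(
    {"majority": MajorityBaseline, "stump": ThresholdStump}, data
)
print(best_name, scores)
```

In a real pipeline the candidate dictionary would hold actual learners and the score would likely be a metric better suited to imbalanced classes than plain accuracy; the selection loop itself would stay the same.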