30 Mar 2020

[Dissertation] Ukhetho: A Text Mining Study Of The South African General Elections

Masters dissertation by Avashlin Moodley, Faculty of Engineering, Built Environment and Information Technology, University of Pretoria, Pretoria


Avashlin Moodley, Masters Computer Science.


Dr. Vukosi Marivate


The elections in South Africa are contested by multiple political parties appealing to a diverse population that comes from a variety of socioeconomic backgrounds. As a result, a rich source of discourse is created to inform voters about election-related content. Two common sources of information that help voters with their decision are news articles and tweets. This study aims to understand the discourse in these two sources using natural language processing. Two topic modelling techniques, Latent Dirichlet Allocation and Non-negative Matrix Factorization, are applied to digest the breadth of information collected about the elections into topics. The topics produced are subjected to further analysis that uncovers similarities between topics, links topics to dates and events, and provides a summary of the discourse that existed prior to the South African general elections. The primary focus is on the 2019 elections; however, election-related articles from 2014 and 2019 were also compared to understand how the discourse has changed.
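To make the idea of factorising a corpus into topics concrete, the following is a minimal sketch of Non-negative Matrix Factorization on a bag-of-words matrix, using the standard multiplicative update rules for the squared-error objective. It is not the dissertation's implementation: the toy corpus, the function names (`term_document_counts`, `nmf`, `top_terms`) and all parameter values are assumptions made purely for illustration, and Latent Dirichlet Allocation is not shown.

```python
import random

# Toy corpus invented for illustration; not data from the dissertation.
docs = [
    "party manifesto jobs economy",
    "economy jobs growth party",
    "voters queue polling station",
    "polling station voters ballot",
]


def term_document_counts(documents):
    """Build a bag-of-words count matrix (documents x vocabulary)."""
    vocab = sorted({w for d in documents for w in d.split()})
    index = {w: j for j, w in enumerate(vocab)}
    rows = []
    for d in documents:
        row = [0.0] * len(vocab)
        for w in d.split():
            row[index[w]] += 1.0
        rows.append(row)
    return rows, vocab


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factorise V (docs x terms) into W (docs x topics) and H (topics x terms)
    with multiplicative updates; both factors stay non-negative."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h


def top_terms(h, vocab, n=3):
    """Return the n highest-weighted vocabulary terms for each topic row of H."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]


V, VOCAB = term_document_counts(docs)
W, H = nmf(V, n_topics=2)
TOPICS = top_terms(H, VOCAB)
```

Each row of `H` weights the vocabulary for one topic, and each row of `W` gives a document's mixture over topics. In practice a library implementation with TF-IDF weighting would usually be preferred over this hand-rolled version; the sketch only shows the mechanics.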



  • Moodley, A. and Marivate, V., 2019, November. Topic modelling of news articles for two consecutive elections in South Africa. In 2019 6th International Conference on Soft Computing & Machine Intelligence (ISCMI) (pp. 131-136). IEEE. [Link].

  • Marivate, V., Moodley, A. and Saba, A., 2021. Extracting and categorising the reactions to COVID-19 by the South African public – A social media study. In Proceedings of IEEE AFRICON 2021 (to appear). [Preprint].