#DS4SocietySeminar 2023 <> Lost in Translation: Large Language Models and Non-English Content Analysis

Aliya Bhatia and Gabriel Nicholas
#DS4SocietySeminar 2023 <> Data Science Needs Law

Chijioke Okorie
[Publication] Exploring COVID-19 public perceptions in South Africa through sentiment analysis and topic modelling of Twitter posts

Paper by Temitope Kekere and Vukosi Marivate
#DS4SocietySeminar 2023 <> Towards Evaluating and Understanding the Multilingual Performance of Generative Language Models

Maxamed Axmed/Mohamed Ahmed
DSFSI @ Deep Learning IndabaX 2023 - Cape Town, South Africa: 12-14 July 2023

We are thrilled to have the opportunity to participate in and contribute to Deep Learning IndabaX 2023 in Cape Town, South Africa
The DSFSI Large Language Models Primer <> AI and Language: A Mirror to Ourselves - Understanding how we got to ChatGPT and what it actually means

Primer/Tutorial by Prof. Vukosi Marivate
Welcome DSFSI Postdoc Dr Kayode Olaleye

Welcome Dr Kayode Olaleye to the Computer Science department at University of Pretoria. He is joining DSFSI as a Postdoctoral Fellow sponsored through the JP Morgan Research Faculty award received by Prof Vukosi Marivate.
[Publication] Emotionally driven fake news in South Africa

Paper by Marc Gagiano and Vukosi Marivate
[Publication] Unsupervised Cross-lingual Word Embedding Representation for English-isiZulu

Paper by Derwin Ngomane and Vukosi Marivate
[Publication] MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages

Paper by Thapelo Sindane and Vukosi Marivate
[Publication] Combating Hate: How Multilingual Transformers Can Help Detect Topical Hate Speech

Paper by Trishanta Srikissoon and Vukosi Marivate
NSTF-South32 Awards 2022/2023 Finalist Announcement

We are pleased to announce that the Coronavirus COVID-19 (2019-nCoV) Data Repository for South Africa project, led by the Data Science for Social Impact (DSFSI) Research Group at the University of Pretoria, is a finalist for the NSTF-South32 Awards 2022/2023
DSFSI @ ICLR 2023 - Kigali, Rwanda: 1-5 May 2023

We are thrilled to have the opportunity to participate in and contribute to ICLR 2023 in Kigali
Vukosi Marivate - NLP and TDM in Africa [The Right to Research in International Copyright Law Seminar 2023]

From Data Engineering to Natural Language Corpora - The DSFSI 2022/2023 Summer Internship

Richard Lastrucci and Isheanesu Dzingirai - The main focus was the collection, processing and creation of parallel corpora in the 11 official languages of South Africa. We collected multilingual data by building web scrapers, then processed and cleaned this and other extracted data before performing sentence alignment to produce the parallel corpora. We also improved and documented Masakhane Web.
[Dissertation] Interpretable Machine Learning in Natural Language Processing for Misinformation data

Masters dissertation by Yolanda Nkalashe, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] BantuBERTa: Using Language Family Grouping in Multilingual Language Modeling for Bantu Languages

Masters dissertation by Jesse Parvess, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] South African isiZulu and siSwati News Corpus Creation, Annotation and Categorisation

Masters dissertation by Andani Madodonga, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
DSFSI Summer Research Experience: Fine-tuning Multilingual Pre-trained African Language Models

Rozina Myoya and Fiskani Banda - The aim of the research was to explore the application of Pre-trained Language Models (PLMs) to a multilingual classification task. The study concluded that PLMs perform best on downstream tasks in the languages they were pre-trained on, which emphasises the importance of including low-resource languages in the pre-training process of LMs.
Vukosi Marivate - AfricanML: Where to next? [IndabaX South Africa 2022]

Hundzula Retreat 2023

By Vukosi Marivate
Augmenting the Data: Vukosi Marivate on African Natural Language Processing (NLP) – Critical AI
By Eleni Coundouriotis. The event was organized and co-sponsored by Critical AI @ Rutgers, DIMACS, the Rutgers Department of Computer Science, and the Institute for the Study of Global Racial Justice
Call for Low Resource Natural Language Processing Postdoctoral fellowship applications
Data Science for Social Impact (DSFSI) Research group calls for interested parties
DSFSI @ 2022 UP School of IT Research Colloquium

Our contributions to the Colloquium on 11 Nov 2022
[Publication] Conversational Pattern Mining using Motif Detection
Paper by Nicolle Garber, Vukosi Marivate
[Publication] Estimating the COVID-19 Help Offering Retention Training (COHORT19) Group Transitioning into Higher Education from South African Basic Education
Paper by Herkulaas MvE Combrink, Vukosi Marivate, Benjamin Rosman
#DS4SocietySeminar 2022 <> Localising the Mozilla Common Voice platform for South Africa’s official languages
Febe de Wet
[Publication] Reinforcement Learning in Education: A Multi-Armed Bandit Approach
Paper by Herkulaas MvE Combrink, Vukosi Marivate, Benjamin Rosman
DSFSI @ 18th African Investigative Journalism Conference #AIJC2022
Our contributions at AIJC 2022
#DS4SocietySeminar 2022 <> From Human to Social-Centric Tech & AI: Interdisciplinarity in Massively Multilingual Machine Translation.
Skyler Wang
[Publication] A Framework for Undergraduate Data Collection Strategies for Student Support Recommendation Systems in Higher Education
Paper by Herkulaas Combrink, Vukosi Marivate, Benjamin Rosman
[Publication] Comparing Synthetic Tabular Data Generation Between a Probabilistic Model and a Deep Learning Model for Education Use Cases
Paper by Herkulaas MvE Combrink, Vukosi Marivate, Benjamin Rosman
Mintirho ya Vulavula: Deep Learning Indaba #4 - An Indaba deferred, but one that will shape our collective destiny

Reflecting on the 2022 Deep Learning Indaba held in Tunis, Tunisia - Prof Vukosi Marivate
DSFSI is looking for part-time research assistant(s) for the Summer break.

DSFSI Paid Positions
#DS4SocietySeminar 2022 <> Why language technologies for African languages are important.
Mukondleteri Dumela
#DS4SocietySeminar 2022 <> Overcoming Digital Gravity when using AI in Public Health Decisions
Dr. Sekou L. Remy
DSFSI @ Deep Learning Indaba 2022 #DLI2022

Our contributions at DLIndaba 2022
#DS4SocietySeminar 2022 <> Disinformation and COVID: what can we learn from the disinformers and doorknobs?
William Bird and Nomshado Lubisi
UP Expert Lecture Series: ‘Riendzo ri lehile: Tackling Natural Language Processing for African languages to make better sense of our world’

presented by Professor Vukosi Marivate
#DS4SocietySeminar 2022 <> Algorithms on Trial: Interrogating Evidentiary Statistical Software
Rediet Abebe
[Publication] Improving the Predictive Power of Historical Consistent Neural Networks
Paper by Rockefeller Rockefeller, Bubacarr Bah, Vukosi Marivate, and Hans-Georg Zimmermann; African Institute for Mathematical Sciences, Cape Town, South Africa; Department of Mathematical Sciences, Stellenbosch University, Cape Town, South Africa; Department of Computer Sciences, University of Pretoria, Pretoria, South Africa; Fraunhofer Society, 200703 Munich, Germany
[Publication] LiSTra Automatic Speech Translation: English to Lingala Case Study
Paper by Salomon Kabongo Kabenamualu, Vukosi Marivate, and Herman Kamper, African Masters of Machine Intelligence, University of Pretoria, Stellenbosch University
[Publication] Discriminatory Gleason grade group signatures of prostate cancer: An application of machine learning methods
Paper by Mpho Mokoatle, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Publication] Semi-Supervised Learning Approaches for Predicting South African Political Sentiment for Local Government Elections
Paper by Mashadi Ledwaba and Vukosi Marivate, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
SAAIR Learner Analytics Institute Day 2 Workshop 5

Herkulaas MvE Combrink, Prof Vukosi Marivate & Prof Benjamin Rosman
[Dissertation] Using Machine Learning to Detect Solar Panels in Aerial Images
Masters dissertation by Palesa Rachael Lepamo, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] Inventory Stock Prediction and Deep Anomaly Detection
Masters dissertation by Khutso Kgabo Sepuru, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] Using NER and Doc2Vec to cluster South African criminal cases
Masters dissertation by Carel Kagiso Nchachi, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Publication] Training Cross-Lingual embeddings for Setswana and Sepedi
Paper by Mack Makgatho, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Policy Brief] More Than Just a Policy - Day to Day Effects of Data Governance on the Data Scientist
Assoc. Prof Vukosi Marivate
INL (text) Processing for Classification of Misinformation data
Yolanda Nkalashe
A Journey Through the Opportunity of Low Resourced Natural Language Processing — An African Lens [NeurIPS Tutorial]
Vukosi Marivate, David Adelani
nbgrader and streamlit for Data Science Teaching and Learning [Workshop]
Vukosi Marivate
#DS4SocietySeminar 2021 <> Anti-Asian COVID-related Hate Speech and Stigma on Twitter: Dataset Creation to Algorithmic Detection
Syed Ishtiaque Ahmed
4IR IN AFRICA CANNOT TAKE PLACE IN ENGLISH ONLY [Article]
Vukosi Marivate
AI To the Future <> Why African Leaders Matter [Panel]

Vukosi Marivate, Uzodinma Iweala, Mutale Nkonde, & Jackie Mwaniki
#DS4SocietySeminar 2021 <> On the dangers of stochastic parrots

Timnit Gebru
Imagining Innovations 2021 <> The Covid Dashboard: AI & 'Information' in Pandemic times [Panel]

Vukosi Marivate & Barry Dwolatzky
Call for SDGs and Data Science Postdoctoral fellowship applications

Albert Luthuli Leadership Institute (ALLI), South African SDG Hub, Data Science for Social Impact (DSFSI) Research group
#DS4SocietySeminar 2021 <> Measuring the urban environment using street view imagery

Emily Muller
Business Day TV - Talk to the Bot
Dr Vukosi Marivate talking about NLP on Business Day TV
#DS4SocietySeminar 2021 <> Environmental urban noise prediction for Kampala city

Ernest Mwebaze
#DS4SocietySeminar 2021 <> Pattern Extraction in Marketing

Felipe Melo
NRF Call [Honours and MSc] for 2022 Students @ DSFSI
Introducing Masakhane WEB - A Machine Translation Web Platform for African Languages

A new milestone for Masakhane NLP and DSFSI
#MSG 2021 <> Coming to grips with the reality of Data Science - It's people all the way down

Vukosi Marivate
[Dissertation] THE DESIGN AND IMPLEMENTATION OF A DOMAIN-ADAPTIVE DEEP REINFORCEMENT LEARNING TEXT CLASSIFICATION ARCHITECTURE

Masters dissertation by Andreas Bayer, Faculty of Engineering, Built Environment and Information Technology (Big Data Science) University of Pretoria, Pretoria
[Publication] Training Cross-Lingual embeddings for Setswana and Sepedi
Mack Makgatho, Vukosi Marivate, Tshephisho Sefara, Valencia Wagner
[Publication] Investigating Statistical and Machine Learning Techniques to Improve the Credit Approval Process in Developing Countries

Publication by Moses, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Publication] Call Centre Shift Schedule Optimisation using Local Search Heuristics

Publication by Liketso, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] Thuto: Depth Analysis of South African and Sierra Leone School Outcomes using Machine Learning

Masters dissertation by Henry Wandera, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] Identifying Financial Risk through Natural Language Processing of Company Annual Reports

Masters dissertation by Jacques Lamont Theron, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] Financial Sentiment Analysis: an NLP approach towards reputation management

Masters dissertation by Michelle Terblanche, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] Unsupervised Anomaly Detection of Healthcare Providers using Generative Adversarial Network

Masters thesis by Krishnan Naidoo, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] Conversational Pattern Mining using Motif Detection

Masters dissertation by Nicolle Garber, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] Finding Suitable Locations of Wind Farms in South Africa

Masters dissertation by Shané, African Institute for Mathematical Sciences
[Deadline Extended to 31 January 2021] Call for PhD Student at DSFSI
Data Science for Society Seminar <> Lacuna Fund - Mobilizing funding for labeled datasets that solve urgent problems in low- and middle-income contexts globally
Lacuna Fund - NLP RFP
Data Science for Society Seminar <> Approximating the convolutional filters in the feature map of a radio interferometer
Marcel Atemkeng - The sky that a radio telescope such as the SKA observes (the feature map) is the true sky convolved with a convolutional filter (the weighted sampling function). This convolution filter varies across the feature map and is different at each pixel.
Data Science for Society Seminar <> Massive vs. Curated Embeddings for Low-Resourced Languages: the Case of Yorùbá and Twi
Jesujoba Alabi
Data Science for Society Seminar: Scientific Writing Workshop
Dr. Elaine Nsoesie - Participants will get a high-level introduction to writing and publishing scientific papers
Data Science for Society Seminar: AI and Data in South Africa's Health Sector

Vedantha Singh, HSRC, UP
Data Science for Society Seminar: Using Machine Learning to Extract Key Meta-Data from Legal Text in Malawi

Amelia Taylor
Data Science for Society Seminar: The process of creating satellite image ground-truth datasets for machine learning

Raesetje Sefala
Mapping the South African health facility landscape in response to COVID-19

N Mtsweni, H Combrink, A Van der Walt, V Marivate
A few weeks in, Data Science thoughts on COVID-19 in SA
Dr Vukosi Marivate, ABSA UP Chair of Data Science, Data Science for Social Impact Research Group.
Sharing our work on COVID-19 Data and Analysis
Dr. Vukosi Marivate
[Dissertation] USE OF MACHINE LEARNING TECHNIQUES TO BETTER UTILISE MULTIPLE UNDERWRITING FACTORS IN MORTALITY PRICING AND RESERVING OF LIFE INSURANCE RISK PRODUCTS

Masters dissertation by Collin, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
[Dissertation] Anomaly detection in the open financial market: A Case for the Bitcoin Market

Masters dissertation by Monamo, Faculty of Engineering and Electronics, University of Johannesburg
[Dissertation] Why is this an Anomaly? Explaining Anomalies using Sequential Explanations

Masters dissertation by Mokoena, Faculty of Science, University of the Witwatersrand, Johannesburg.
[Dissertation] Road Traffic Accident Analysis Using Machine Learning Techniques for Soshanguve, Pretoria

Masters dissertation by Mokoatle, North West University
[Dissertation] Ukhetho: A Text Mining Study Of The South African General Elections

Masters dissertation by Avashlin Moodley, Faculty of Engineering, Built Environment and Information Technology University of Pretoria, Pretoria
An Effort to Collate COVID-19 Case Data Across Africa
Elaine Nsoesie, Vukosi Marivate
A repo and dashboard for COVID-19 in South Africa
How and why we built the dashboard. Dr. Vukosi Marivate - Dept. Computer Science
The Fourth Industrial Revolution and the provision of service delivery in South Africa
Dr. OS Madumo - School of Public Management and Administration, University of Pretoria
Data Science & Salone - Visiting Sierra Leone
Through an NRF Knowledge Interchange and Collaboration grant, I was able to visit some collaborators in Sierra Leone
Go Tsamaya Ke Go Bona: Deep Learning Indaba #3 - A journey just beginning.
By Vukosi Marivate
We should all be Data Scientists: Why SOCIAL and SOCIETY are the missing links.
By Vukosi Marivate
Nawa oh! A week on the Atlantic Ocean in Sierra Leone.
I suppose I should have met President Bio...🤦 By Henry Wandera
[Dissertation] Improving forecast accuracy of wind power output using multi-input LSTM model.

Masters dissertation by Phillemon, African Institute for Mathematical Sciences
[Dissertation] Power Output Prediction For Wind Farms With Machine Learning

Masters dissertation by Nkosinathi, African Institute for Mathematical Sciences