20 Jan 2025

AI4D Workshop #2 Report: Advancing NLP for African Languages

The AI4D African Languages Workshop #2, held on Monday 9 December 2024, brought together an inspiring cross-disciplinary group of researchers, educators, and professionals from fields including Computer Science, Mathematics, Law, and Linguistics. It followed the success of the first workshop, which focused on bridging gaps between computational and linguistic understandings, and was centered on the theme “Improving NLP Evaluation for African Languages.” Because African languages remain underrepresented in many Natural Language Processing (NLP) systems, the discussions and outcomes were critical in shaping the future of inclusive AI and language technologies.

Opening the Dialogue: Collaboration Across Disciplines

The workshop kicked off with opening remarks and introductions, setting the stage for a rich discussion on the future of African language processing and AI. The participating researchers, who represented varied disciplines, shared their interests in African languages and AI, fostering an environment of collaboration from the outset.

Engagement with African Language Experts

A key session of the day was the tutorial presented by Dr. Nomadlozi Bokaba and Tebogo Macucwa from the Department of African Languages. They provided a detailed examination of the linguistic complexities and challenges in processing African languages. Their presentation, “Improving NLP Evaluation for African Languages,” emphasized the importance of understanding the specific characteristics of African languages when developing NLP tools.

These tools have the potential to improve the teaching of African languages, particularly by making grammar more accessible and easier to understand. They could also address challenges faced by grammarians in conveying complex grammatical structures to students. A specific example discussed was the issue of concordial agreement in isiZulu, which can be difficult for learners. The session also addressed the lack of support for African languages in educational systems and institutions.

The use of AI tools offers a potential solution to these challenges, supporting the preservation and development of African languages in both educational and technological contexts. The audience, which included participants from Mathematics, Applied Mathematics, Computer Science, Informatics, and Law, gained a clearer understanding of the linguistic diversity of African languages and the need for AI models that account for these complexities.

Cross-Disciplinary Perspectives on NLP Evaluation

After the linguistics session, the workshop took a deeper dive into the computational and ethical dimensions of African language processing. A series of brief presentations from various disciplines offered distinct perspectives on NLP evaluation. For example:

  • Computer Science highlighted the need for improved computational models and algorithms tailored to African languages.
  • Mathematics brought forward rigorous statistical approaches for evaluating NLP models.
  • Law addressed the ethical, legal, and societal implications of AI deployment, particularly in the African context. The presentation focused on ensuring that law is not only a compliance or box-ticking exercise that could stifle innovation, but that its role in facilitating social justice is emphasized. In this regard, law will take forward-looking and solution-oriented approaches to innovation processes and outcomes.
  • Informatics focused on the development of a multilingual dictionary labeled with part-of-speech tags and sentiment polarity. The presentation demonstrated its application in machine translation and in sentiment classification using machine learning models and BERT, and explored contextual shifts through sentiment analysis. Future plans are to expand the dictionary, improve sentiment prediction, and address challenges in contextual shifts using advanced AI techniques.

The group then engaged in a dynamic discussion to explore how these different perspectives could converge and enhance the development of African language tools.

Looking Ahead: Priorities for 2025

The discussion turned to the future during the facilitated session on 2025 priorities. Key research priorities for the upcoming year were identified, including:

  • Development of low-resource NLP evaluation methods.
  • Creation of datasets to enhance the training of AI models for African languages.

The session also mapped out the roles and contributions of various stakeholders involved in this initiative, emphasizing the need for cross-disciplinary collaboration to address the linguistic and technical challenges ahead.

Showcasing NLP Evaluation Frameworks

In the afternoon, the workshop featured a demonstration of the NLP evaluation framework currently under development. The interactive session provided an opportunity for participants to give real-time feedback and brainstorm ways to enhance the framework. This collaborative effort is vital for ensuring that evaluation standards are comprehensive and applicable to African languages.

Future Directions and Actionable Goals

The workshop concluded with discussions on the likely topics and activities for the coming year. The groups explored the following:

  1. Expanding Datasets and Evaluation Benchmarks for African languages.
  2. Cross-lingual NLP Applications and their potential use cases.
  3. Responsible and Inclusive NLP Research frameworks to ensure ethical AI development.
  4. Policy, Legal, and Ethical Considerations for advancing AI in the African context.

Each group presented actionable recommendations, paving the way for tangible outcomes in 2025.

Closing Remarks: A Path Forward

The closing remarks highlighted the key takeaways from the workshop, with a strong emphasis on the need for continued collaboration across disciplines to address the multifaceted challenges of African language processing. Plans for the next phase of the project, including upcoming workshops, dataset curation, and policy engagements, were also outlined.

Looking to 2025: Collaborative Impact on African Languages

The AI4D Workshop #2 reaffirmed the importance of creating an inclusive AI ecosystem for African languages. With two PhD students and two postdoctoral fellows joining the project in February 2025, the team is poised to further advance research and development in this vital field. The collective efforts of researchers from diverse backgrounds will help foster the growth of AI tools that are linguistically and culturally attuned to the African context.

As the workshop concluded, it was clear that the road ahead for AI4D and African languages is one of collaboration, innovation, and a shared commitment to empowering local languages through AI.

Acknowledgements

We would like to thank the DSFSI admin assistants, Angel and Happy, for putting together the logistics for this workshop. We are also very thankful to the Javett Art Centre and Puleng, who hosted us and provided the venue at no cost.