Prior Posts

You can also look at our earlier posts, which use the same dataset but take different methodological approaches to gain insight from the mined data:

Building the mention graph

To construct the graph mentioned in the introduction, we need to unpack who mentions whom across our full dataset. We go through each Twitter post, extract the users it mentions (on Twitter, the “@user” handles), and then build a network of connections between the associated users. It is like constructing a mind map of a very large conversation to see who spoke about whom, so that insights can be gained from the associations. If we refer back to the IEC example from our previous blog post on NMF topic modelling, "Automated extraction of discussed topics using topic modelling", we will observe that there are associations in the data.
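The extraction step described above can be sketched in a few lines. This is a minimal illustration, not the code used for this post: the regular expression, the `extract_mention_edges` helper and the sample posts are all assumptions made up for the example, and handles are lower-cased to match the lower-case usernames used in the graphs in this post.

```python
import re

# Assumed pattern: a handle is "@" followed by ASCII letters, digits or underscores.
MENTION_RE = re.compile(r"@(\w+)", re.ASCII)

def extract_mention_edges(posts):
    """Return one (author, mentioned_user) pair per mention found in the posts.

    `posts` is an iterable of (author, text) tuples (a hypothetical input shape).
    """
    edges = []
    for author, text in posts:
        for handle in MENTION_RE.findall(text):
            edges.append((author.lower(), handle.lower()))
    return edges

# Made-up posts with made-up handles, for illustration only.
sample_posts = [
    ("Alice", "Thanks @Bob and @carol_sa for the update"),
    ("Bob", "Agreed, @Alice"),
    ("carol_sa", "No mentions in this one"),
]

edges = extract_mention_edges(sample_posts)
print(edges)
```

Each pair can then be added to a directed graph with the author as the source and the mentioned user as the target.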

The IEC example from the NMF topic modelling post gives us the context we need. Consider the following example of a conversation:

In this conversation we see that DawieScholtz mentions the IEC. For the purpose of this example, we reveal the names of both users because we classify them as "organisations" or "public persons". Dawie, in this case, analyses elections and shares his insights with the public. We will expand on this later in the blog.

From this conversation, we can construct a very simple graph based on the association between the two users.

To create this graph (a directed one), we added a connection between DawieScholtz and IECSouthAfrica because of the aforementioned Twitter post. The illustration below is a representation of this directed graph, to show the association between the two users.

import networkx as nx
# Directed graph: an edge points from the author to the mentioned user
G_example = nx.DiGraph()
G_example.add_edge('dawiescholtz','iecsouthafrica')

Note that the connection between the two has an arrow from Dawie to the IEC, showing that it is directed. If we take a similar approach, but with more data and at scale, we can apply this technique to our large Twitter dataset from previous posts. The only difference is that we need to respect the privacy of our users as we construct these graphs. We therefore work to preserve the privacy of people whom we do not deem public persons. We discuss who is on our public person list later in this post.

Let's say the ANC account (myanc) mentions IECSouthAfrica and CyrilRamaphosa in the same tweet. Here is how we would add the new edges.

G_example.add_edge('myanc','iecsouthafrica')
G_example.add_edge('myanc','cyrilramaphosa')

Here is how the graph changes now with added context.

Our public person list

Earlier in this work, our team spent some time extracting the top users, the top mentioned users and the top replied-to users. We then listed these top users and classified each as a public person or not. Below are the users we deem public persons:

print("List of public persons, total so far :", len(public_list))
public_list

List of public persons, total so far : 127
['1912onlineradio',
 '_africanunion',
 'a_c_d_p',
 'abiyahmedali',
 'action4sa',
 'advdali_mpofu',
 'afriforum',
 'alanwinde',
 'alettaha',
 'ancdsgduarte',
 'ancjhb',
 'ancncape',
 'ancparliament',
 'ancylhq',
 'athigeleba',
 'atmovement_sa',
 'bantuholomisa',
 'biancavanwyk16',
 'billynyaku',
 'brettherron',
 'cainmchunu',
 'capricornfmnews',
 'chelseafc',
 'cityofct',
 'cityofjoburgza',
 'citytshwane',
 'concourtsa',
 'cwakeni',
 'cyrilramaphosa',
 'dailymaverick',
 'david_makhura',
 'ddmabuza',
 'dirco',
 'dirco_za',
 'dlaminimarshall',
 'dlaminizuma',
 'easterncape',
 'eff',
 'effgroundforces',
 'effsc_up',
 'effsouthafrica',
 'efftshwane_56',
 'enca',
 'eskom_sa',
 'ewnupdates',
 'fcbarcelona',
 'floydshivambu',
 'forgoodza',
 'garethcliff',
 'gautenganc',
 'geordinhl',
 'governmentza',
 'gwedemantashe1',
 'healthza',
 'helenzille',
 'hermanmashaba',
 'homeaffairssa',
 'iecsouthafrica',
 'ifpinparliament',
 'insightfactor',
 'iol',
 'jhbcbdanc',
 'jsteenhuisen',
 'julius_s_malema',
 'kagutamuseveni',
 'kaizerchiefs',
 'kathradafound',
 'khu_ntshavheni',
 'ltouchinglives',
 'mancity',
 'manutd',
 'masandawana',
 'mbalulafikile',
 'mbuhari',
 'mbuyisenindlozi',
 'mmusimaimane',
 'morninglivesabc',
 'mphoformayor',
 'mphophalatse1',
 'mtlekota',
 'myanc',
 'mzwandilemasina',
 'mzwanelemanyi',
 'news24',
 'newzroom405',
 'niehaus_carl',
 'npa_prosecutes',
 'omphilemaotwe',
 'orlandopirates',
 'oscarmabuyane',
 'our_da',
 'parliamentofrsa',
 'partyofaction',
 'patriciadelille',
 'paulkagame',
 'philoso20133522',
 'potus',
 'powerfm987',
 'presidencyza',
 'presjgzuma',
 'pulemabe',
 'radio702',
 'reditlhabi',
 'rhodes_uni',
 'ronaldlamola',
 'sabcnews',
 'sahrcommission',
 'sandf_za',
 'sapoliceservice',
 'sbu_fo',
 'statecapturecom',
 'sundaytimesza',
 'teamnews24',
 'thembisiweya',
 'thulimadonsela3',
 'timeslive',
 'tito_mboweni',
 'tshwaneeff',
 'uct_news',
 'udmrevolution',
 'vfplus',
 'victoriaafrica9',
 'vngalwana',
 'vusumuzikhoza',
 'zilevandamme',
 'zsaul1',
 'zungulavuyo']

Hashing usernames

For users who are not on the public list, we hash their usernames. A hash function converts an input string of any size (e.g. a username) into another string of fixed size. We do this to hide the original usernames of users whom we treat as private persons.

This lets us look at the graph together with you (the reader) and lets you navigate it, without exposing individuals who still have some expectation of privacy.
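As a rough sketch of this step: the hashed usernames shown later in this post are 128 hexadecimal characters long, which is consistent with a 512-bit digest, so the example below assumes SHA-512 from Python's standard `hashlib`. The actual hash function used is not stated in this post, and the `anonymise` helper and the small `public_subset` are hypothetical names introduced for illustration.

```python
import hashlib

def anonymise(username, public_list):
    """Return public usernames unchanged and a SHA-512 hex digest for everyone else.

    SHA-512 is an assumption; any fixed-size hash would follow the same pattern.
    """
    name = username.lower()
    if name in public_list:
        return name
    return hashlib.sha512(name.encode("utf-8")).hexdigest()

# A small subset of the public list above, used only for this example.
public_subset = {"iecsouthafrica", "our_da", "myanc"}

kept = anonymise("IECSouthAfrica", public_subset)
hashed = anonymise("some_private_user", public_subset)  # made-up handle

print(kept)         # public handle, left readable
print(len(hashed))  # SHA-512 hex digests are always 128 characters
```

Because the same input always yields the same digest, a private user maps to a single node in the graph. One design caveat: an unsalted hash of a known handle can be reproduced by anyone who guesses the handle, so a keyed hash (for example `hmac` with a secret key) offers stronger protection.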

Constructing the full graph

In this section we now construct the full graph of mentions.

Number of nodes in the full graph.

47664

Number of edges (mentions)

126079

A snapshot of the graph is shown below. You can see many users (dots), which are the nodes. If you click on the interactive graph (lower), you will also be able to see the edges (created when one user mentions another). Outside the public persons, usernames are not revealed (you will just see a hashed username).

[Figure: snapshot of the full mention graph]

In the graph, we weight each edge (a connection between two nodes) by how many times one user has mentioned the other. A weight of 5 on an edge means that the user has mentioned the other user 5 times (most likely in 5 different Twitter posts).
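The weighting described above amounts to counting repeated (source, target) pairs. A minimal sketch, assuming the mentions are already available as a list of pairs (the `mention_pairs` data below is made up for illustration):

```python
from collections import Counter

# Made-up mention pairs: (user who posted, user who was mentioned).
mention_pairs = [
    ("user_a", "user_b"),
    ("user_a", "user_b"),
    ("user_c", "user_b"),
    ("user_a", "user_b"),
]

# Each distinct directed pair becomes one edge; its count becomes the weight.
edge_weights = Counter(mention_pairs)
weighted_edges = [(source, target, weight)
                  for (source, target), weight in edge_weights.items()]

print(weighted_edges)
```

A list in this (source, target, weight) shape can be loaded into a networkx directed graph with `add_weighted_edges_from`.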

Reduced graph

We now limit the graph to users who have sent or received at least 2 messages.

Reduced number of nodes

15252

Reduced number of edges

89754
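The reduction in this section could be sketched as follows. This is one possible reading, not necessarily the rule used here: it assumes that "sent or received at least 2 messages" means the total weight of a user's outgoing and incoming mentions is at least 2, and that the filter is applied once. The `reduce_graph` helper and the toy edges are hypothetical.

```python
from collections import defaultdict

def reduce_graph(weighted_edges, min_messages=2):
    """Keep only edges whose two endpoints each sent or received enough mentions."""
    activity = defaultdict(int)
    for source, target, weight in weighted_edges:
        activity[source] += weight  # mentions sent
        activity[target] += weight  # mentions received
    keep = {user for user, count in activity.items() if count >= min_messages}
    return [(s, t, w) for s, t, w in weighted_edges if s in keep and t in keep]

# Toy edges in (source, target, weight) form.
toy_edges = [("a", "b", 2), ("c", "b", 1), ("d", "e", 1)]

reduced = reduce_graph(toy_edges)
print(reduced)
```

In the toy data only "a" and "b" reach the threshold, so only the edge between them survives.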

Extracting relationships within the graph #1 - Node2Vec

What is Node2Vec?
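In brief, Node2Vec (Grover and Leskovec) learns a vector for every node by sampling many biased random walks over the graph and training a word2vec-style skip-gram model on those walks, so that nodes which appear in similar walk contexts end up with similar vectors. Two parameters steer the walks: the return parameter p (how readily a walk steps back to the node it just left) and the in-out parameter q (how readily it moves further away). The sketch below illustrates only the walk step, under simplifying assumptions: edge weights are ignored, and the `biased_walk` helper and the toy adjacency dictionary are hypothetical, not taken from the code behind this post.

```python
import random

def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Sample one node2vec-style walk over `adj` (dict: node -> set of neighbours)."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbours = sorted(adj.get(current, ()))
        if not neighbours:
            break  # dead end: no outgoing edges
        if len(walk) == 1:
            # First step has no previous node, so choose uniformly.
            walk.append(rng.choice(neighbours))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbours:
            if candidate == previous:
                weights.append(1.0 / p)  # step back to where we came from
            elif candidate in adj.get(previous, ()):
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk

# Toy graph with made-up node names.
toy_adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
walk = biased_walk(toy_adj, "a", 6, p=0.5, q=2.0, rng=random.Random(0))
print(walk)
```

Many such walks, started from every node, play the role of "sentences" for the embedding model.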

Let us check who is similar to our_da according to the node2vec algorithm.

sample_user = 'our_da'
print("Top 10 similar to: ", sample_user)
print("===================================")
get_most_similar(sample_user, node2vec_model)
Top 10 similar to:  our_da
===================================
aaa3842ae58134907fc7cf27b396dcdff360c15448250ea421509f53b76c02e575bea0279e1759034d358d527f6ac716067a220d39fe40a2ffd75eafe5357110
11298f82671232276cee846be67a2be0783d7908c42358b6fc9417400710fc956eb93cec5a34d7c32dd599d34f5f894d22397564f527fd66e3cd266db96f128c
jsteenhuisen
bf0d149f72dee5a51d9955eccfb1a3d3a16a9d8243464bfc481b507543e6bcffe8bf1a454923cd25bad5ef5a64fa56b7a6f6dac36a85f4a2ac365f85d0c9b014
96f61e226d0642f6d626fa0609601d03841039b51ec2b8f2a1c268eb9bb10d4dc9459c496df684db392836a3ea31dfb097bba4a99ba575c264091c53e52a1f25
e35626edbf7fcc2f877ca227fa3f9e4106eb0343cd985b87b1a10fc5dd0f9c9c65129f23c7cbe4accbb65b8aa637f5bed6521cd76025775354e263feca93904e
f96a8d8268e41b8c87537efcc9f5696d536cab12b949c849696adefe999cf3b5b0c4e2aeda1d89212914627424cf258e0a5eaa3411331be577d10e6e2f10deab
a00d5b565f754979542521516275f0fa523be3d1942118e67cc6ae5a59cd72c2862e66aa5d76b12e4322cefbf00b5cb4127100ff6018ad81153db52b6a06c940
d19618c882fe748ae97ab97356fd892d8ac3dbfbe878912c3b053827c9c9972e94ef0468fa6ebae695bde756dd6a78493221d0a3e6a0d6ca25ab590c6199f569
c8d6dce3efb14636a712b56f1515abdf7603d573735e6143b56b553bb9387fcfcbcd46a1cb80053416ab07c0e47ab9af90cb03d06f246f0c0eed111c9fbf7b9d

We see above that the only person we currently know is jsteenhuisen (who is the leader of the DA).
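The `get_most_similar` helper used above is not shown in this post. A lookup of this kind typically ranks all other nodes by the cosine similarity of their embedding vectors to the chosen user's vector. The sketch below illustrates that idea in plain Python; the `most_similar` function, the `toy_embeddings` dictionary and its two-dimensional vectors are assumptions made up for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar(user, embeddings, topn=10):
    """Return the `topn` (name, similarity) pairs closest to `user`."""
    target = embeddings[user]
    scored = [(other, cosine(target, vector))
              for other, vector in embeddings.items() if other != user]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:topn]

# Made-up vectors; real node2vec embeddings have many more dimensions.
toy_embeddings = {
    "party": [1.0, 0.0],
    "leader": [0.9, 0.1],
    "supporter": [0.7, 0.7],
    "unrelated": [0.0, 1.0],
}

top_names = [name for name, _ in most_similar("party", toy_embeddings, topn=2)]
print(top_names)
```

If the trained model is a gensim word2vec model, which is a common setup for node2vec but an assumption here, the equivalent call would be `model.wv.most_similar(user, topn=10)`.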

Unhashing more public persons

As we have access to the unhashed usernames, we can reveal who is in the top 10 similar nodes.

Top 10 similar to:  our_da
===================================
cilliersb
gwenngwenya
jsteenhuisen
bf0d149f72dee5a51d9955eccfb1a3d3a16a9d8243464bfc481b507543e6bcffe8bf1a454923cd25bad5ef5a64fa56b7a6f6dac36a85f4a2ac365f85d0c9b014
nicholasnyati
brinkcilliers
leon_schreib
siviwe_g
da_youth
baxnodada

We can see above that the revealed similar users are all DA members who interact with the public in that capacity. Let us now do the same for the other top parties.

ANC Similar Users

Top 10 similar to:  myanc
===================================
2c90deb1c41545eab7b989c27c42d74acdb1fde016b0531ff2b506403bac8212131ef2229b0889245330781c77973346073b0d1a1fdc3edca6655f5f1ed085ba
7e34d22bf4c5a6f3840d27409b52d93847d9b9dc20bc189003652dfcb1c9f83b80d9f6b74133cfc5a54ec7b8bbad607dfaf771ea6bc16fc673c8605aae76c4c5
dca23498c997974a6a8b57604156edc0f65df275566e29b7d2ef4aa0de693119651cca642af9cdae2ce23c913f10a9f3a1b2a882f2f3c83d485eb99d6e1e7040
0a6dfba3d5bf354c38393412ac2382af55765c61eb899d9d00790e0b255980a3eac0849bbd7bb28812d1d26890a13027e40f375bed7416659c69e18c82d4b3ed
14a8bcdccbc5170b78d69beeccaf38649970dd02627fe1231e6fe282502d79ad755044c85801a722f8f8060e0cf1c35fb6774e581314e7f32820f748a6255949
c00a9a4fe2bbe5b2e9142816dc64a085614aea1368c09922428c872f50a78219f99016cf9ec8817f09161e26898448e80ed87d25d08b182f7ece0617f0d77252
5dcb52d08c405cb459c31fba48833c49c8641e7033c9cb087a2602b2cc85db27257251f4e3ed28bd08570d1a500e410b3cc559a51fc24b9823376aedaf722947
f6bd11e719d570fc8cac378dd466153a4b70249d7a9b4f5f5080395c7a027ce26bf8bb82d2b67dacc1253f5180f246c1041143a31bebc5fd397b8c6d47cab42f
d7a7b0d5345b6942cd940acd681b67b136f5b74cab32916da2ac5bc6bfffe35a7c801be1329acc486af5e7329a5806d8e0b931f79dc13da007cdfaeb147fbe92
a4598dbfa4786fdb2e1eb8878c03dd77bcee49576c444ceeae23b20423e2da076894cf6b77118933bb9e2fa3b7537ae8ef47bed7c4d510e1d2e446779fca101e

EFF Similar Users

Top 10 similar to:  effsouthafrica
===================================
dlaminimarshall
advdali_mpofu
piabamadokwe
julius_s_malema
veronica_mente
effstudents
mailola_poppy
sharon_letlape
mathiberebecca
floydshivambu

What we note immediately is that the parties clearly manage their social media campaigns differently. The DA and EFF have their leadership, in different roles, among their top similar users. For the ANC, we did not find even one. Let us instead look at CyrilRamaphosa.

CyrilRamaphosa Similar Users

Top 10 similar to:  cyrilramaphosa
===================================
hagegeingob
officialmasisi
11b9e5c650a6f91dcc87f6d5bf8c69b2031267ab497404e016c7c4a9c5c367200db3c4c668c5ddd5d5dd15c87610c982e0528913552256b2f13f64b650fb930a
mzwandilemasina
mbalulafikile
5dcb52d08c405cb459c31fba48833c49c8641e7033c9cb087a2602b2cc85db27257251f4e3ed28bd08570d1a500e410b3cc559a51fc24b9823376aedaf722947
david_makhura
f6bd11e719d570fc8cac378dd466153a4b70249d7a9b4f5f5080395c7a027ce26bf8bb82d2b67dacc1253f5180f246c1041143a31bebc5fd397b8c6d47cab42f
0a6dfba3d5bf354c38393412ac2382af55765c61eb899d9d00790e0b255980a3eac0849bbd7bb28812d1d26890a13027e40f375bed7416659c69e18c82d4b3ed
14a8bcdccbc5170b78d69beeccaf38649970dd02627fe1231e6fe282502d79ad755044c85801a722f8f8060e0cf1c35fb6774e581314e7f32820f748a6255949

We see above a mix of other African leaders as well as ANC leaders. This is fascinating and requires further scrutiny.

Enter PartyOfAction

PartyOfAction is anti-vaccination and has been spreading vaccine misinformation during South Africa's battle with COVID-19. See information about the Infodemic. See also our references below, which will give you more insight into the party.

PartyOfAction has spread so much misinformation that Twitter has suspended it a number of times. A recent example (highlighted by the party leader):

10 November 2021

We delved into their top similar users.

Top 10 similar to:  partyofaction
===================================
billynyaku
mariskaschalek1
bennabch
giddywids
b1fd6fc13b6479aadd1fdbb60df11b8f17727a1ca2b23a7e07e75cea511ed48e55c6b766e27f06246d44016946aa9a710aa4a3316cb431921ebdf53c3c6d21e3
cathy_laar
7fbc4eee6687dabdf9c051bdc5fe355473b662d63e4ca53d728ff592ae5d5ec4f0f9041493503739cd7cd85623d3afe71f75ec217dab6a3f150ffa87a8ead0f3
de8dc5c41e66ec9664ca26172e21d28bfa7de52736da51b361d7c8fee114c5241ae735c5c4c9c9a133502350a756c1eb87836e6f249c3dd59b443451100e1aaa
wtf_justasking
spacecadet9661

The users above are leaders or anonymous accounts that also spread anti-vaccination messages.

Resources and References

  • William Bird and Thandi Smith, "Disinformation in a time of Covid-19: While lying delights in darkness, transparency shines a light"
  • William Bird and Nomshado Lubisi, "Disinformation in the time of Covid-19: Vaccines and political parties"