Checking our top users

We compare several types of users. The comparison gives us insight into how rapidly posts spread, whether particular people are implicated by a post, whether content is being pushed by specific users, and whether a category is trending. To do so, we track the following variables:

Top users who send Twitter posts,
Top users who mention others,
Top users who are mentioned by others,
Top users who reply to others,
Top users who are replied to.

These variables help us identify anomalies and outliers in the data for further exploration. Each variable goes through a series of data processing steps so that trends associated with those anomalies become apparent in the analysis.
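As a rough illustration of how the five variables could be counted, here is a minimal sketch using only the Python standard library. The record fields (`user`, `text`, `in_reply_to`) and the sample tweets are hypothetical and are not taken from our dataset; the actual pipeline may differ.

```python
import re
from collections import Counter

# Assumption: a mention is any "@handle" token in the tweet text.
MENTION_RE = re.compile(r"@(\w+)")

CATEGORIES = ("senders", "mentioners", "mentioned", "repliers", "replied_to")


def top_user_counters(tweets):
    """Count activity per lowercase handle for each of the five categories."""
    counters = {name: Counter() for name in CATEGORIES}
    for tweet in tweets:
        author = tweet["user"].lower()
        counters["senders"][author] += 1

        # Replies: credit the author as a replier and the target as replied to.
        target = tweet.get("in_reply_to")
        if target:
            counters["repliers"][author] += 1
            counters["replied_to"][target.lower()] += 1

        # Mentions: credit each mentioned handle, and the author per mention made.
        mentions = [m.lower() for m in MENTION_RE.findall(tweet["text"])]
        for handle in mentions:
            counters["mentioned"][handle] += 1
        counters["mentioners"][author] += len(mentions)
    return counters


# Hypothetical sample records, for illustration only.
sample = [
    {"user": "alice", "text": "Voting today @IECSouthAfrica", "in_reply_to": None},
    {"user": "bob", "text": "@alice where? @IECSouthAfrica", "in_reply_to": "alice"},
    {"user": "alice", "text": "At the town hall", "in_reply_to": "bob"},
]

counts = top_user_counters(sample)
for name in CATEGORIES:
    print(name, counts[name].most_common(2))
```

Handles are lowercased before counting so that different capitalisations of the same account are merged.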

Top mentioned and mentioners

The top mentioned users are the accounts that are mentioned most frequently across all posts. The table below lists the top mentioned users. As expected, they are directly connected to the election itself: political parties and their members, politicians, and state institutions.

These are the top mentioner patterns:

Note: To preserve the privacy of users who are not known in the public domain and are not part of public organisations, we aggregated their information in a graph. For people who are well known in the public domain, we will later share some of the users that were classified as likely to spread misinformation.
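One way to aggregate per-user counts so that no individual handle is exposed is to report only how many accounts fall into each activity range. The sketch below assumes this bucketing approach; the bucket edges and the sample counts are hypothetical, and the aggregation behind our graph may have been done differently.

```python
def bucket_activity(counts, edges=(1, 10, 100, 1000)):
    """Return the number of accounts per activity bucket, without any handles.

    `counts` maps a handle to its number of mentions (or replies).
    `edges` are the lower bounds of the buckets, in increasing order.
    """
    labels = [f"{lo}-{hi - 1}" for lo, hi in zip(edges, edges[1:])]
    labels.append(f"{edges[-1]}+")
    histogram = {label: 0 for label in labels}
    for n in counts.values():
        if n < edges[0]:
            continue  # below the first bucket, not counted
        # Index of the last edge that n reaches.
        idx = sum(1 for edge in edges if n >= edge) - 1
        histogram[labels[idx]] += 1
    return histogram


# Hypothetical per-user mention counts, for illustration only.
mention_counts = {"a": 3, "b": 12, "c": 250, "d": 7, "e": 1500}
buckets = bucket_activity(mention_counts)
print(buckets)
```

Only the bucket sizes leave the function, so the plotted data carries no account names.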

[Figure: aggregated top mentioner patterns. Source: https://dsfsi.github.io/zaelection2021/]

Top replied to and top repliers

In addition to the mention counts above, tracking the most replied-to accounts is also important because replies indicate engagement. Again, as expected, the accounts that received the most replies belong to political figureheads, organisations, and people well known in the public domain. The table below lists the top replied-to accounts.

These are the top replied-to accounts:

These are the top replier patterns:

Note: To preserve the privacy of people who are not part of the public domain, their data were aggregated and shown in the graph to protect their identity. For people who are well known in the public domain, we will later share some of the users that were classified as likely to spread misinformation.

[Figure: aggregated top replier patterns. Source: https://dsfsi.github.io/zaelection2021/]

BOT analytics

To distinguish human from BOT interaction in the data, we used Botometer to extract analytics on the users. We checked 1000 accounts for each type of top user explored above, so that anyone following our methodological approach can reproduce and understand the results from the data we collected. As mentioned before, it is important to separate human posts from BOT posts: we want to understand the narrative associated with people and to identify trends in the data produced by humans rather than by BOTs, since BOTs can post at a much higher frequency than humans.
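The selection of 1000 accounts per type of top user can be sketched as follows. This is an illustration under assumptions: the function name, the category names, and the sample counts are hypothetical, and accounts that appear in more than one category are kept only once so they are not checked twice.

```python
from collections import Counter


def accounts_to_check(counters, per_category=1000):
    """Take the top `per_category` handles from each category, without duplicates."""
    selected = []
    seen = set()
    for counter in counters.values():
        for handle, _count in counter.most_common(per_category):
            if handle not in seen:
                seen.add(handle)
                selected.append(handle)
    return selected


# Hypothetical counts, with per_category reduced to 2 to keep the example small.
example = {
    "mentioned": Counter({"myanc": 5, "our_da": 3, "x": 1}),
    "replied_to": Counter({"myanc": 9, "y": 4, "our_da": 2}),
}
to_check = accounts_to_check(example, per_category=2)
print(to_check)
```

Because the categories overlap, the number of distinct accounts selected this way can be smaller than the number of categories multiplied by `per_category`.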

Check a single account by screen name

Here we show the BOT scores returned for individual accounts. We include this output so that the reader can contextualise what a BOT score means and how it differs from a human-like score.

User already in dict:  effsouthafrica

{'cap': {'english': 0.7967206940193189, 'universal': 0.8474634546636374},
 'display_scores': {'english': {'astroturf': 1.2,
   'fake_follower': 2.0,
   'financial': 0.0,
   'other': 3.3,
   'overall': 3.3,
   'self_declared': 1.4,
   'spammer': 0.4,
   'username': 'effsouthafrica'},
  'universal': {'astroturf': 1.2,
   'fake_follower': 1.8,
   'financial': 0.0,
   'other': 4.4,
   'overall': 4.4,
   'self_declared': 2.2,
   'spammer': 0.4}},
 'raw_scores': {'english': {'astroturf': 0.23,
   'fake_follower': 0.4,
   'financial': 0.01,
   'other': 0.66,
   'overall': 0.66,
   'self_declared': 0.27,
   'spammer': 0.07},
  'universal': {'astroturf': 0.24,
   'fake_follower': 0.35,
   'financial': 0.0,
   'other': 0.87,
   'overall': 0.87,
   'self_declared': 0.45,
   'spammer': 0.08}},
 'user': {'majority_lang': 'en',
  'user_data': {'id_str': '932163222', 'screen_name': 'EFFSouthAfrica'}}}

User already in dict:  myanc

{'cap': {'english': 0.7717813288270262, 'universal': 0.7334998320027682},
 'display_scores': {'english': {'astroturf': 1.4,
   'fake_follower': 0.4,
   'financial': 0.0,
   'other': 2.2,
   'overall': 1.4,
   'self_declared': 0.0,
   'spammer': 0.0,
   'username': 'myanc'},
  'universal': {'astroturf': 1.3,
   'fake_follower': 0.8,
   'financial': 0.0,
   'other': 1.5,
   'overall': 1.1,
   'self_declared': 0.0,
   'spammer': 0.0}},
 'raw_scores': {'english': {'astroturf': 0.29,
   'fake_follower': 0.08,
   'financial': 0.0,
   'other': 0.43,
   'overall': 0.29,
   'self_declared': 0.0,
   'spammer': 0.0},
  'universal': {'astroturf': 0.26,
   'fake_follower': 0.17,
   'financial': 0.0,
   'other': 0.3,
   'overall': 0.22,
   'self_declared': 0.0,
   'spammer': 0.0}},
 'user': {'majority_lang': 'en',
  'user_data': {'id_str': '18759465', 'screen_name': 'MYANC'}}}

User already in dict:  our_da

{'cap': {'english': 0.7971037475964349, 'universal': 0.7982287282125168},
 'display_scores': {'english': {'astroturf': 2.6,
   'fake_follower': 1.6,
   'financial': 0.0,
   'other': 2.7,
   'overall': 2.7,
   'self_declared': 0.0,
   'spammer': 0.0,
   'username': 'our_da'},
  'universal': {'astroturf': 1.6,
   'fake_follower': 1.0,
   'financial': 0.0,
   'other': 2.7,
   'overall': 1.8,
   'self_declared': 0.0,
   'spammer': 0.0}},
 'raw_scores': {'english': {'astroturf': 0.52,
   'fake_follower': 0.32,
   'financial': 0.01,
   'other': 0.54,
   'overall': 0.54,
   'self_declared': 0.01,
   'spammer': 0.0},
  'universal': {'astroturf': 0.32,
   'fake_follower': 0.2,
   'financial': 0.01,
   'other': 0.53,
   'overall': 0.35,
   'self_declared': 0.0,
   'spammer': 0.0}},
 'user': {'majority_lang': 'en',
  'user_data': {'id_str': '23594033', 'screen_name': 'Our_DA'}}}
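To make outputs like the ones above easier to compare, the nested result can be reduced to a few headline fields. In the Botometer v4 output, `raw_scores` lie between 0 and 1, `display_scores` are the same scores on a 0 to 5 scale, and `cap` is the complete automation probability. The sketch below is illustrative only: the helper name is hypothetical, and picking the `english` scores when `majority_lang` is `en` (and `universal` otherwise) is our assumption about how to read the result.

```python
def summarise(result):
    """Reduce one Botometer result dict to its headline fields."""
    # Assumption: use the English model for English-majority accounts,
    # and the language-independent (universal) model otherwise.
    lang = "english" if result["user"]["majority_lang"] == "en" else "universal"
    return {
        "screen_name": result["user"]["user_data"]["screen_name"],
        "score_type": lang,
        "overall_raw": result["raw_scores"][lang]["overall"],
        "overall_display": result["display_scores"][lang]["overall"],
        "cap": result["cap"][lang],
    }


# Trimmed copy of the our_da output shown above (only the fields used here).
our_da = {
    "cap": {"english": 0.7971037475964349, "universal": 0.7982287282125168},
    "display_scores": {"english": {"overall": 2.7}, "universal": {"overall": 1.8}},
    "raw_scores": {"english": {"overall": 0.54}, "universal": {"overall": 0.35}},
    "user": {
        "majority_lang": "en",
        "user_data": {"id_str": "23594033", "screen_name": "Our_DA"},
    },
}

summary = summarise(our_da)
print(summary)
```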

How we got the Botometer scores

To collect all the BOT scores, we used the Botometer v4 API. Given the 1000 accounts we checked for each category, the total number of scores we saved is:

Number in botometer cache:  2651
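The "User already in dict" and "Number in botometer cache" messages suggest that scores were stored in a dictionary keyed by screen name, so that no account is requested twice. A minimal sketch of such a cache follows. The function names are hypothetical, and `fake_fetch` is a stand-in so the example runs offline; in a real run the `fetch` argument would be whatever function performs the Botometer request.

```python
def cached_lookup(screen_name, cache, fetch):
    """Return the score for `screen_name`, calling `fetch` only on a cache miss."""
    key = screen_name.lower()
    if key in cache:
        print("User already in dict: ", key)
        return cache[key]
    cache[key] = fetch(screen_name)
    return cache[key]


def fake_fetch(screen_name):
    """Hypothetical stand-in for the real API call; returns a dummy result."""
    return {"user": {"user_data": {"screen_name": screen_name}}}


cache = {}
first = cached_lookup("MYANC", cache, fake_fetch)   # cache miss, calls fake_fetch
second = cached_lookup("myanc", cache, fake_fetch)  # served from the cache
print("Number in botometer cache: ", len(cache))
```

Keying the cache on the lowercase screen name means that `MYANC` and `myanc` count as the same account.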


Top mentioned users

	user	number_of_mentions
0	myanc	39799
1	cyrilramaphosa	25033
2	effsouthafrica	16352
3	our_da	16339
4	action4sa	9730
5	julius_s_malema	9354
6	presidencyza	7509
7	hermanmashaba	7476
8	iecsouthafrica	5263
9	jsteenhuisen	4712
10	mbalulafikile	3350
11	governmentza	2292
12	enca	1914
13	helenzille	1882
14	sabcnews	1783
15	ancparliament	1606
16	a_c_d_p	1561
17	forgoodza	1367
18	sapoliceservice	1324

Top replied-to accounts

	user	number_of_replies
0	cyrilramaphosa	74793
1	myanc	71759
2	effsouthafrica	64435
3	hermanmashaba	63329
4	julius_s_malema	56567
5	our_da	42929
6	action4sa	30225
7	jsteenhuisen	18387
8	iecsouthafrica	14265
9	mbalulafikile	12506
10	niehaus_carl	11023
11	presidencyza	10617
12	zungulavuyo	8926
13	helenzille	8214
14	mzwanelemanyi	7842
15	governmentza	5436
16	sapoliceservice	5297
17	bantuholomisa	5277
18	a_c_d_p	4949