September 22, 2019
Last Updated 09/25/2019
In the spring of this year, protests began in Hong Kong over the Fugitive Offenders and Mutual Legal Assistance in Criminal Matters Legislation (Amendment) Bill, which would allow fugitives to be extradited from Hong Kong to China.
One large online community that has gotten involved with the protests is Reddit. Its /r/HongKong community is largely in favor of the protestors' cause. /r/China is another subreddit that has some (but significantly less) involvement in Hong Kong's affairs, and it also leans towards the protestors' cause.
I collected a sample of 32,491 users who have commented and/or posted in one of those subreddits, and looked at their comment and submission histories since March 31st of this year. While one could simply look at which subreddits appear the most in these users' submission and comment histories, a lot of the big, default subreddits would take up most of the top spots. By scaling these values to the subreddits' respective sizes, we can get a better idea of what makes these communities distinctive.
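As a rough illustration of the scaling idea, the sketch below divides the number of sampled users active in each subreddit by that subreddit's size. The function name, the use of subscriber counts as the size measure, the `min_users` cutoff, and all of the example numbers are assumptions made for illustration; the exact scaling used in the analysis may differ.

```python
from collections import Counter

def scaled_activity(histories, subreddit_sizes, min_users=2):
    """Rank subreddits by (sampled users active there) / (subreddit size).

    histories: dict of username -> iterable of subreddit names (hypothetical input)
    subreddit_sizes: dict of subreddit name -> subscriber count (assumed size measure)
    """
    users_per_subreddit = Counter()
    for subreddits in histories.values():
        # Count each user at most once per subreddit.
        users_per_subreddit.update(set(subreddits))

    scores = {}
    for name, n_users in users_per_subreddit.items():
        size = subreddit_sizes.get(name)
        if not size or n_users < min_users:
            continue  # skip unknown sizes and very rare subreddits
        scores[name] = n_users / size
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Made-up example data, not taken from the real sample.
histories = {
    "user_a": ["AskReddit", "Cantonese"],
    "user_b": ["AskReddit", "Cantonese", "AskReddit"],
    "user_c": ["AskReddit"],
}
sizes = {"AskReddit": 24_000_000, "Cantonese": 10_000}
ranked = scaled_activity(histories, sizes)
print(ranked[0][0])
```

With raw counts the large default subreddit comes first; after dividing by size, the small niche subreddit rises to the top, which is the effect the scaling is meant to achieve.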
I processed different words and phrases (including Chinese character sequences of up to 3 characters) in each of the subreddits and looked at which ones were proportionally more frequent in one subreddit than in any of the others. "laowai" is Chinese for "foreigner", which makes sense for /r/China, considering that other common words appear to be related to teaching English there.
"lennon" for /r/HongKong refers to the "Lennon Wall", which was a mosaic made of encouraging post-it notes during the 2014 Umbrella Protests in Hong Kong. "tvb" refers to a television broadcasting company in Hong Kong. Most of the words and phrases representing /r/HongKong here are related to the recent protests.
Other than the reference to the 1989 Tiananmen Square Massacre, I am not sure why the below phrases are appearing for /r/Hong_Kong.
For /r/Sino, there appears to be a strong focus on race specifically.
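The word-and-phrase comparison described above can be sketched roughly as follows: compute each term's share of all terms in a subreddit, then compare it with the highest share that term has in any other subreddit. This is only a minimal sketch under assumptions. It uses whitespace-separated single words (no phrases or Chinese character sequences), add-one smoothing, and toy text, and the function name is hypothetical.

```python
from collections import Counter

def distinctive_terms(docs_by_subreddit, target, top_n=5, smoothing=1.0):
    """Return terms whose proportion in `target` most exceeds any other subreddit."""
    counts = {
        name: Counter(tok for doc in docs for tok in doc.lower().split())
        for name, docs in docs_by_subreddit.items()
    }
    totals = {name: sum(c.values()) for name, c in counts.items()}
    vocab_size = len(set().union(*counts.values()))

    def proportion(name, term):
        # Smoothed share of `term` among all terms in subreddit `name`.
        return (counts[name][term] + smoothing) / (totals[name] + smoothing * vocab_size)

    others = [name for name in counts if name != target]
    scores = {
        term: proportion(target, term) / max(proportion(o, term) for o in others)
        for term in counts[target]
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy comments written for illustration only.
docs = {
    "China": ["laowai teaching english in china", "english teaching jobs for laowai"],
    "HongKong": ["lennon wall photos", "tvb coverage of the lennon wall protest"],
}
top_china = distinctive_terms(docs, "China", top_n=3)
top_hongkong = distinctive_terms(docs, "HongKong", top_n=2)
```

Taking the maximum over the other subreddits means a term only scores highly if it is comparatively rare everywhere else, which matches the "one subreddit vs. any of the others" framing.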
I am currently testing out a "Creddit" risk score, which can also be accessed through my /u/CredditReportingBot. The score ranges from 0 (highest risk) to 1,000 (lowest risk), with the relative risk doubling for every 100-point decrease. Below is a set of box plots comparing the risk scores of the four subreddits' userbases:
Note: I consider scores of under 200 to be "high risk", under 300 "moderately high risk", and under 400 "slightly elevated risk".
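To make the scale concrete: if relative risk doubles for every 100 points the score drops, the risk of one score relative to another is 2 raised to the point difference divided by 100. The short sketch below encodes that rule and the thresholds from the note. The function names are hypothetical, and this only illustrates the stated scale, not how the score itself is computed.

```python
def relative_risk(score_a, score_b, points_per_doubling=100):
    """How many times riskier `score_a` is than `score_b` (lower score = higher risk)."""
    return 2 ** ((score_b - score_a) / points_per_doubling)

def risk_label(score):
    """Map a score to the labels given in the note; None if no label applies."""
    if score < 200:
        return "high risk"
    if score < 300:
        return "moderately high risk"
    if score < 400:
        return "slightly elevated risk"
    return None

print(relative_risk(400, 500))  # a score 100 points lower: twice the risk
print(relative_risk(500, 300))  # a score 200 points higher: one quarter of the risk
print(risk_label(250))
```

Under this rule, a group with about half the risk of another sits roughly 100 points higher on the scale, and a group with about a quarter of the risk sits roughly 200 points higher.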
/r/China and /r/Hong_Kong have fairly similar risk profiles, while /r/HongKong's risk is noticeably lower: its commenters have about half the risk of the other subreddits' commenters, and its submitters have about one quarter the risk of the other groups' submitters. /r/Sino members are at the highest risk of being banned.
Looking at the above, there are a few things to note:
431 out of 539 /r/Hong_Kong commenters also commented in /r/HongKong. Although this table alone doesn't tell us whether it was /r/Hong_Kong members going to /r/HongKong or the other way around, the fact that the commenting profile of /r/Hong_Kong is so similar to that of /r/HongKong implies that the larger subreddit has a significant presence in the smaller one. This does not exclude the possibility of /r/Hong_Kong members going over to /r/HongKong, but their impact would be significantly smaller.
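The asymmetry in this argument is easier to see when the overlap is computed in both directions: the same set of shared users can be a large share of a small subreddit and a small share of a large one. The sketch below uses made-up usernames and a hypothetical helper name; it does not reproduce the real commenter lists.

```python
def overlap_share(commenters_a, commenters_b):
    """Fraction of A's commenters who also commented in B."""
    a, b = set(commenters_a), set(commenters_b)
    return len(a & b) / len(a) if a else 0.0

# Made-up usernames: a small subreddit and a much larger one sharing 4 users.
small = {"u1", "u2", "u3", "u4", "u5"}
large = {"u1", "u2", "u3", "u4"} | {f"v{i}" for i in range(36)}

share_of_small = overlap_share(small, large)  # 4 of 5 users
share_of_large = overlap_share(large, small)  # 4 of 40 users
print(share_of_small, share_of_large)
```

For the figure quoted above, 431 out of 539 works out to roughly 80% of those commenters.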
Looking at submission-based statistics, the results are similar to the comments-based ones, but because moderators scrutinize submissions more closely than comments, these are often more representative of the general view of the community, even in spite of brigaders. The relationship of /r/Hong_Kong to /r/Sino becomes a bit clearer, and the submissions of /r/China submitters seem to be slanted more towards regional subreddits (e.g., Africa, Guangzhou) instead of only China-related ones.
One last type of graphic I like to use to visualize subreddits is a network graph of the moderators, the individual accounts that create and enforce the rules of their respective subreddits. Below is a graph that demonstrates how /r/Sino and /r/Hong_Kong are part of a tightly knit group that is run mostly by the same moderators. Many of the subreddits besides /r/Sino and /r/Hong_Kong in that clique are not very active, as indicated by their node size, which is proportional to the number of subscribers.
/r/China and /r/HongKong are completely separate as far as these subreddits go, but the network only looks at the subreddits moderated by the individuals of the four subreddits analyzed above, so there is likely an actual connection further down the line.
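One simple way to build a moderator network of this kind is to link two subreddits whenever they share a moderator, then look at which subreddits end up connected. The sketch below does this with plain dictionaries and sets. The moderator names are placeholders, the function names are hypothetical, and the subreddit-to-subreddit projection is an assumption about the layout rather than a description of how the original graph was drawn.

```python
from collections import defaultdict
from itertools import combinations

def shared_moderator_edges(moderators_by_subreddit):
    """Return {(sub_a, sub_b): shared moderator count} for pairs sharing at least one."""
    edges = {}
    for a, b in combinations(sorted(moderators_by_subreddit), 2):
        shared = set(moderators_by_subreddit[a]) & set(moderators_by_subreddit[b])
        if shared:
            edges[(a, b)] = len(shared)
    return edges

def connected_groups(nodes, edges):
    """Group subreddits into connected components of the shared-moderator graph."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk from `start`
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups

# Placeholder moderator names, for illustration only.
mods = {
    "Sino": {"mod_a", "mod_b"},
    "Hong_Kong": {"mod_a", "mod_b", "mod_c"},
    "China": {"mod_x"},
    "HongKong": {"mod_y"},
}
edges = shared_moderator_edges(mods)
groups = connected_groups(mods, edges)
print(edges)
print(groups)
```

In this toy input, the two subreddits with overlapping moderators form one group while the other two stay isolated. Adding the other subreddits those moderators run would extend the graph "further down the line".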
The code used for the main analysis (but not data collection) is available on GitHub: https://github.com/mcandocia/hk_reddit