# What Time Should You Post to Reddit?


July 29, 2017

Last Updated 10/24/2017

UPDATE: A more recent and thorough analysis can be found here.

When posting anything on social media, whether a news article, a picture of yourself, or a funny image (or a combination thereof), you usually want to reach the largest audience. On Reddit, I have noticed that the success of a post is largely determined by the time of day and day of week at which it is submitted. A few other factors matter as well, such as whether the post is an image, an article, or a text-only submission.

I used a Python scraper I built to collect data on posts from the particular subreddits I wanted to analyze. The data collected includes:

• The subreddit it was posted in
• The time at which a post was made
• The domain of the post's link

Using this information, I can formulate a model that describes what attributes affect the score. Specifically, I am looking for a percent change in the score with respect to values such as time of day, day of week, whether a post is an image post, etc. In my case, this can be approximated with this formula:

` sign(score) * log(abs(score) + 1) = time_of_day_and_day_of_week + is_image_post + is_text_post + length_of_submission_title`

I log-transform the score on the left side. Doing so ensures that the terms on the right side have a multiplicative effect on the score, as opposed to additive. The right side treats the time of day + day of week, the post being an image post, and its other attributes as independent factors that each scale the score by some value; i.e., I am controlling for other effects.
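To spell out why the log transform makes the effects multiplicative, here is a short derivation. The notation is mine rather than anything produced by the model: $s$ is a non-negative score, $\beta_0$ is the intercept, and $\beta_{\text{time}}$ is the coefficient for a given day and time slot (zero for the reference slot).

$$
\log(s + 1) = \beta_0 + \beta_{\text{time}} + \dots
\quad\Longrightarrow\quad
s + 1 = e^{\beta_0} \cdot e^{\beta_{\text{time}}} \cdots
$$

Holding the other terms fixed, moving from the reference slot to another slot multiplies $s + 1$ by $e^{\beta_{\text{time}}}$, so the percent change is

$$
\frac{(s_{\text{slot}} + 1) - (s_{\text{ref}} + 1)}{s_{\text{ref}} + 1} = e^{\beta_{\text{time}}} - 1 .
$$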

Below is a graph that estimates the effect of the time of day and day of week across six different subreddits I sampled, modeled together. I use Monday from 8 to 10 am US Central Time as the reference, so each percentage is the increase in score you can expect from posting at the given time rather than in the reference slot. Monday morning is a relatively good time to post in these subreddits, especially from 6 to 8 am. Sunday is even better during that time frame, with an expected score that is 74% higher than the reference. Saturday, however, seems fairly strong most of the day.

Because the graph above is based on a relatively small amount of data, it helps to compare it against a different data set. Below, I sampled default subreddits as well as the comment histories of thread commenters, so that this model generalizes better to Reddit as a whole. It tells a similar story, except that the tiles change much more smoothly. You could repeat the process, but the general takeaway is that the best time to post on Reddit is on Sunday, Monday, or Saturday from 6 to 8 am US Central Time. The next best times are within 2 hours of that range on those same days, or during that same range on other days.
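The two-hour windows used in these graphs can be built from a raw timestamp roughly as follows. This is a sketch and not the code I actually ran: the helper `weekday_hour_label` and its arguments are hypothetical, and it assumes the post time is available as a `POSIXct` value.

```r
# Labels for the twelve two-hour windows and the seven days of the week
target_hour_ = c('12-2 am','2-4 am','4-6 am','6-8 am','8-10 am','10 am - 12 pm',
                 '12- 2 pm','2- 4 pm','4-6 pm','6-8 pm','8-10 pm','10 pm - 12 am')
daysofweek = c('Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday')

# Hypothetical helper: convert a POSIXct time to a "<day> <window>" label
# in US Central Time.
weekday_hour_label <- function(posted_at, tz = 'America/Chicago') {
  lt = as.POSIXlt(posted_at, tz = tz)  # convert to the target time zone
  # lt$wday is 0 for Sunday; lt$hour %/% 2 picks the two-hour window
  paste(daysofweek[lt$wday + 1], target_hour_[lt$hour %/% 2 + 1])
}

example_time = as.POSIXct('2017-07-31 09:15:00', tz = 'America/Chicago')
print(weekday_hour_label(example_time))
```

Grouping by two-hour windows instead of single hours gives each tile more data points, at the cost of some resolution.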

Technically, the transformation adds 1 to the score before the percent change is calculated, and a negative score is treated as having ` 1/(1+abs(score)) ` points, a fraction between 0 and 1 that shrinks as the score becomes more negative.
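Here is a minimal sketch of that transformation in R. The function names `signed_log`, `effective_points`, and `pct_change` are my own for illustration and do not appear in the code below, and the coefficient `log(1.74)` is a hypothetical value back-calculated from the 74% figure rather than taken from the model output.

```r
# Signed log transform applied to the score before modeling
signed_log <- function(score) sign(score) * log(abs(score) + 1)

# "Points" implied on the multiplicative scale:
# score + 1 for non-negative scores, 1 / (1 + abs(score)) for negative ones
effective_points <- function(score) exp(signed_log(score))

# Percent change implied by a model coefficient, relative to the reference
pct_change <- function(beta) exp(beta) - 1

scores = c(-9, -1, 0, 1, 99)
print(effective_points(scores))  # approximately 0.1, 0.5, 1, 2, 100

# A hypothetical coefficient of log(1.74) corresponds to a 74% increase
print(pct_change(log(1.74)))
```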

## Code and Data

Below I have the R code I used to generate the images. You can download the data for the file here: constrasts_threadmode.csv.

```
library(plyr)
library(dplyr)
library(htmlTable)
library(ggplot2)
library(scales)

setwd('/mydirectory/reddit_posting')

#makes filenames possible/better
subslash <- function(x){
  x = gsub(' ', '-', x)
  return(gsub('/', '-', x))
}

#group times to increase significance of data
target_hour_ = c('12-2 am','2-4 am', '4-6 am', '6-8 am','8-10 am', '10 am - 12 pm','12- 2 pm', '2- 4 pm', '4-6 pm','6-8 pm',
'8-10 pm','10 pm - 12 am')

daysofweek = c('Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday')

weekday_hour_grid = expand.grid(target_hour_, daysofweek)
#make sure order is right
weekday_hour_levels = paste(weekday_hour_grid[,2], weekday_hour_grid[,1])
#for a better reference, put Monday 8-10 am (element 17) first so lm() uses it as the reference level
weekday_hour_levels_ = c(weekday_hour_levels[17], weekday_hour_levels[-17])

#domain vars
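#NOTE: the start of this statement was lost; factor() and the first label are reconstructed guesses
threads$image_submission = factor(c('Image Submission',
'Non-Image Submission')[2 - threads$domain %in% c('imgur.com','i.imgur.com','i.reddit.com')])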

#remove moderator posts, which will most likely be very high
#run linear model and extract coefficients
model = lm(logscore ~ weekday_hour + titlelen + is_self + image_submission + subreddit, data=threads)
model_summary = summary(model)
coefs = model_summary$coefficients

#round sig figs
for (i in 1:4)
  coefs[,i] = signif(coefs[,i], 4)

#used to produce HTML output of the model summary for display on web
print(htmlTable(coefs))
sink()

#now format matrix to show results
coefmat = as.data.frame(cbind(varname = rownames(coefs), coefs))[,1:2]
coefmat = coefmat %>% filter(grepl('weekday_hour.*',varname))
coefmat = rbind(data.frame(varname='weekday_hourMonday 8-10 am',Estimate=0), coefmat)
coefmat$dow = factor(gsub( '.*hour','', gsub(' .*','',coefmat$varname) ), levels=daysofweek)
coefmat$hour = factor(gsub('^[^0-9-]*? ','', coefmat$varname), levels=rev(target_hour_) )
coefmat$`Percent Change`= (exp(as.numeric(coefmat$Estimate)) - 1)

#save plot to png
png(subslash(paste0('expected_reddit_score_',tname, '.png')), height=720, width=920)
print(
ggplot(coefmat, aes(x=hour, y=dow, fill=`Percent Change`)) +
geom_tile() + xlab('') + ylab('') + #axes are self-explanatory with title
ggtitle('Percent Change in Expected Reddit Submission Score Based on Time Posted',
subtitle=paste('compared to Monday from 8 - 10 am & using',comma(n_data_points), tname,'submissions')) +
theme_bw() + theme(plot.title = element_text(hjust=0.5, size=24), plot.subtitle=element_text(hjust=0.5, size=subtitle_size),
axis.text.x=element_text(size=18,angle=0, vjust=0.8), axis.text.y = element_text(size=18)) +