Christmas Survey Results Part 1

By Max Candocia


December 19, 2017

This December, I surveyed 312 users from Reddit, Facebook, and, to a lesser extent, other social media sources on how they celebrated (or didn't celebrate) Christmas. You can find an active version of the survey here. I automatically generate a good portion of this article, so I may update it in the future.

This is the first of a few articles using this data that I will be posting over the next few days. The primary goal of this article is to visualize the overall responses to some of the main questions, break them down by group, and provide some commentary on them. The following articles will dive deeper into different aspects, including when kids learn that Santa isn't real, what words people use to describe Christmas, and the various foods, desserts, and drinks that people consume.

Note that, with the exception of the logistic model at the very bottom, these graphs describe the survey responses more so than the general population.


In the survey I asked about gender, age, religion, and region of the US (or outside the US). I did not include race, as I will be making the survey responses public, and that may provide too much identifying information for some people's comfort. The most notable demographic biases, relative to the US as a whole, are the overrepresentation of the Midwest, a large number of atheist/agnostic respondents, and a notable share of respondents from outside the US.

plot of chunk demographics-gender
plot of chunk demographics-age
plot of chunk demographics-religion
plot of chunk demographics-region

Who Celebrates Christmas?

I also looked at who celebrates Christmas. Unsurprisingly, most of those who don't are not Christian.

plot of chunk celebration-overall
plot of chunk celebration-gender
plot of chunk celebration-age
plot of chunk celebration-religion
plot of chunk celebration-region

When do they celebrate Christmas?

Another question I asked was whether an individual had a larger celebration on the 24th or the 25th of December (or if the two were about the same). One oversight when asking this question was not accounting for Eastern Orthodox Christmas, which churches following the Julian calendar celebrate on January 7th; that date was not an option.

plot of chunk date-overall
plot of chunk date-religion

Activities on Christmas

I also looked at some of the activities that respondents did on Christmas.

plot of chunk activities-overall
plot of chunk activities-bycelebration
plot of chunk activities-gender
plot of chunk activities-age
plot of chunk activities-religion
plot of chunk activities-region

Which people hang out with their friends the most?

Some of the correlations in the graphs above may be artifacts of oversampling from certain regions. For example, a large number of the Evangelical Christian responses came from the Midwest, so an apparent regional effect could really be a religious one, or vice versa. Logistic regression, a statistical technique that models the probability of a yes/no outcome as a function of several factors at once, can help separate these effects. In this case, I am testing whether region, religion, gender, age group, and/or celebration of Christmas affect whether individuals spend time with friends on Christmas, since that is one of the activities that seems to have interesting correlations in the graphs above.
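To make the idea concrete, here is a minimal sketch of a logistic regression on simulated data. Nothing in it comes from the survey: the data frame `toy`, its columns, and the probabilities used to generate it are all made up for illustration. The outcome is generated so that only `celebrates` matters, and the model is then asked to estimate the effect of both `celebrates` and `region`.

```r
# Simulated data only; none of these values come from the survey
set.seed(1)
n <- 500
toy <- data.frame(
  celebrates = factor(sample(c("Yes", "No"), n, replace = TRUE, prob = c(0.85, 0.15)),
                      levels = c("Yes", "No")),
  region = factor(sample(c("Midwest", "South", "West"), n, replace = TRUE))
)

# Generate the outcome so that only `celebrates` changes the log-odds
log_odds <- ifelse(toy$celebrates == "Yes", -0.5, -1.5)
toy$with_friends <- rbinom(n, size = 1, prob = plogis(log_odds))

# Fit a logistic regression (binomial GLM with the default logit link)
toy_model <- glm(with_friends ~ celebrates + region, family = binomial, data = toy)
coef(summary(toy_model))
```

Each coefficient is a shift in log-odds relative to the reference level of its factor (here "Yes" for `celebrates` and "Midwest" for `region`), holding the other factor fixed.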

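Below is the code I used to fit the model, followed by the summary of the result.

# Stepwise selection on a binomial GLM; family and data match the Call in the summary below
friends_model = step(glm(activities_Spend.time.with.friends ~ religion + region + celebrates_christmas + gender + age_group,
                         family = binomial, data = survey_categories),
                     direction='both', trace=0)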

## Call:
## glm(formula = activities_Spend.time.with.friends ~ celebrates_christmas, 
##     family = binomial, data = survey_categories)
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.9405  -0.9405  -0.9405   1.4345   1.9728  
## Coefficients:
##                        Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             -0.5867     0.1254  -4.679 2.88e-06 ***
## celebrates_christmasNo  -1.2051     0.4991  -2.415   0.0157 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Dispersion parameter for binomial family taken to be 1)
##     Null deviance: 397.18  on 311  degrees of freedom
## Residual deviance: 389.87  on 310  degrees of freedom
## AIC: 393.87
## Number of Fisher Scoring iterations: 4

It turns out that the only factor the stepwise selection kept, and the only significant one in determining whether someone is likely to spend time with friends on Christmas, is whether or not they celebrate Christmas to begin with. My guess is that those who don't celebrate either don't know many other people who aren't celebrating that day or are otherwise apathetic about it.
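The coefficients in the summary are on the log-odds scale, which is hard to read directly. As a rough sketch, the two estimates can be converted into predicted probabilities with the inverse logit (`plogis` in base R). The numbers below are copied by hand from the output above; the variable names are my own.

```r
# Coefficients copied from the model summary above (log-odds scale)
intercept <- -0.5867   # baseline: respondents who celebrate Christmas
coef_no   <- -1.2051   # shift for celebrates_christmasNo

# plogis() is the inverse logit: 1 / (1 + exp(-x))
p_celebrates     <- plogis(intercept)
p_not_celebrates <- plogis(intercept + coef_no)

# Odds of spending time with friends for non-celebrators relative to celebrators
odds_ratio <- exp(coef_no)

round(c(celebrates = p_celebrates,
        does_not_celebrate = p_not_celebrates,
        odds_ratio = odds_ratio), 3)
```

Under this model, roughly 36% of those who celebrate Christmas are predicted to spend time with friends, compared to roughly 14% of those who don't, which corresponds to an odds ratio of about 0.3.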

Source Code

I have the source code for my analysis on GitHub here. All the responses (after removing timestamp/order info) will be released once I finish my article series.

