July 02, 2018
What kinds of fireworks does everyone like to use on the Fourth of July? I asked that question (among many others) in a recent survey of Facebook, Reddit, and LinkedIn users, which received a total of 137 responses.
Respondents were shown 11 different types of fireworks and indicated which ones they typically use. I also collected demographic information, which I used to weight the responses in order to get what is hopefully a more accurate estimate for the general US population. I will describe that method in more detail later in this article.
About 55%-60% of Americans seem to use at least one of these types of fireworks. The true share of firework users may be somewhat higher, since my list is not exhaustive, although if someone uses fireworks, at least one of these types is usually in the picture.
Each column represents the same kind of statistic: an estimate of the percentage of Americans that use the corresponding type of firework on the Fourth of July. The error bars, in either white (in the left and right columns) or red (in the middle column), represent 95% confidence intervals, which give you an idea of the range the true value is most likely in.
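To make the error bars concrete, here is a minimal sketch of a 95% confidence interval for an unweighted proportion, using the normal (Wald) approximation. The post does not say which interval method was used, so this is only one common choice, and the count of 60 "yes" answers is a made-up number for illustration; only the 137 total comes from the survey.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    z=1.96 corresponds to a 95% interval. The bounds are clipped to [0, 1].
    """
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)
    low = max(0.0, p - z * standard_error)
    high = min(1.0, p + z * standard_error)
    return p, low, high

# Hypothetical: 60 of the 137 respondents say they use a given firework.
p, low, high = proportion_ci(60, 137)
print(f"estimate {p:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

With a sample of this size, the interval spans roughly eight percentage points on either side of the estimate, which is why the error bars in the plot are fairly wide even before weighting.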
Because my sample is not a representative sample of the US, the raw estimates (middle column) are most likely biased. In the right column (blue bars), I weighted responses by gender, race, Hispanic/Latino(a) ethnicity, region of the US, and age group. In the left column (red bars), I also weighted by support for Trump.
The weighted estimates are not usually too far off from the unweighted version, although the error bars are much wider due to the distribution of demographics among the respondents.
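One way to see why uneven weights widen the error bars is Kish's effective sample size, which shrinks as weights become more unequal. This is a general rule of thumb, not necessarily how the error bars in this post were computed, and the weights below are made up.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

# Made-up weights for 100 respondents.
equal_weights = [1.0] * 100
uneven_weights = [0.5] * 50 + [1.5] * 50

print(effective_sample_size(equal_weights))   # 100.0
print(effective_sample_size(uneven_weights))  # 80.0
```

In this toy case, the uneven weights make 100 responses behave like about 80 equally weighted ones, so the confidence intervals get wider even though the number of responses has not changed.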
I used a raking algorithm to determine the weights of respondents, taking into account gender, race, age group, and the region of the US in which they live. For a second set of weights, I added a political variable: whether or not a respondent has a positive view of President Trump. In retrospect, I should have also added an education variable, but the relatively consistent results don't worry me too much.
The weighting increases the influence of under-represented groups in these statistics. For example, about 76% of the US population is white. If only 50% of the responses were from white people, then weighting would increase the influence of those responses. In reality, the actual percentage was pretty close to that value. However, 10% of the responses had "multiple races" listed, while the US Census only lists that group as making up 2.7% of the population, so the weights for those responses were relatively lower.
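As a rough illustration of the arithmetic, when only a single variable is considered, a response's weight is the group's population share divided by its share of the sample. The function name is hypothetical, and raking over several variables at once will not give exactly these values.

```python
def single_variable_weight(population_share, sample_share):
    """Weight for a group when adjusting on one variable only."""
    return population_share / sample_share

# Hypothetical case from the text: 76% of the population, 50% of responses.
white_weight = single_variable_weight(0.76, 0.50)

# Figures from the text: 2.7% of the population, 10% of responses.
multiracial_weight = single_variable_weight(0.027, 0.10)

print(round(white_weight, 2), round(multiracial_weight, 2))
```

A weight above 1 means the group's responses count for more than one response each, and a weight below 1 means they count for less.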
For more details on the algorithm, see my post on weight raking.
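For readers who want the general idea without leaving this page, here is a minimal sketch of raking (iterative proportional fitting). It is not the code used for this project: the function names, the two variables, the target shares, and the ten toy respondents are all assumptions made for illustration.

```python
def weighted_shares(rows, weights, var):
    """Weighted share of each category of `var` among the respondents."""
    total = sum(weights)
    shares = {}
    for row, w in zip(rows, weights):
        shares[row[var]] = shares.get(row[var], 0.0) + w / total
    return shares

def rake(rows, targets, max_iter=200, tol=1e-10):
    """Adjust weights one variable at a time until all margins match."""
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        largest_adjustment = 0.0
        for var, target in targets.items():
            shares = weighted_shares(rows, weights, var)
            for i, row in enumerate(rows):
                # Scale each response by target share / current share.
                factor = target[row[var]] / shares[row[var]]
                weights[i] *= factor
                largest_adjustment = max(largest_adjustment, abs(factor - 1.0))
        if largest_adjustment < tol:
            break
    return weights

# Toy sample of 10 respondents (made up).
rows = (
    [{"gender": "f", "region": "a"}] * 3
    + [{"gender": "f", "region": "b"}] * 1
    + [{"gender": "m", "region": "a"}] * 2
    + [{"gender": "m", "region": "b"}] * 4
)
# Made-up population targets for each variable.
targets = {
    "gender": {"f": 0.5, "m": 0.5},
    "region": {"a": 0.4, "b": 0.6},
}
weights = rake(rows, targets)
print(weighted_shares(rows, weights, "gender"))
print(weighted_shares(rows, weights, "region"))
```

Each pass fixes one variable's margins and slightly disturbs the others, so the loop repeats until the adjustments become negligible. Adding another variable, such as a political one, only means adding another entry to `targets`.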
Because there are a bunch of different names for each of these types of fireworks, here are some descriptions of them:
Unfortunately, in hindsight, I forgot to include mortar-based fireworks on the list, which are the ones that generally produce the most exciting show, although they are illegal to set off in many parts of the US, especially without a permit.
Some types of fireworks tend to be used by the same individuals. Using hierarchical clustering, we can see that the more dangerous fireworks (Roman candles, bottle rockets, firecrackers, and other rockets) are in one group, while the others are in their own group. Sparklers and bang snaps are often given to smaller children (the former under close adult supervision), while the remaining fireworks are all lit on the ground and go off for several seconds to a minute (or so).
The underlying numbers connecting these can be seen below in this tile plot of the "similarities" between the different types of fireworks.
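As a sketch of how similarities like these can feed a hierarchical clustering, the example below uses Jaccard similarity and average linkage. The post does not state which similarity measure or linkage method was used, so both are assumptions, and the eight respondents and four firework types are made up.

```python
from itertools import combinations

def jaccard(a, b):
    """Respondents using both types divided by those using either."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

def cluster(usage, n_clusters):
    """Average-linkage agglomerative clustering on 1 - Jaccard distances."""
    names = list(usage)
    dist = {
        frozenset((a, b)): 1.0 - jaccard(usage[a], usage[b])
        for a, b in combinations(names, 2)
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean distance over all cross-cluster pairs.
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]

# Made-up answers: one 0/1 entry per respondent for each firework type.
usage = {
    "roman_candle":  [1, 1, 1, 0, 0, 0, 0, 1],
    "bottle_rocket": [1, 1, 0, 0, 0, 0, 0, 1],
    "sparkler":      [0, 0, 1, 1, 1, 1, 0, 0],
    "fountain":      [0, 0, 0, 1, 1, 1, 0, 0],
}
groups = cluster(usage, n_clusters=2)
print(groups)
```

In this toy data, the two rocket-style fireworks end up in one group and the other two in another, because each pair is mostly used by the same respondents.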
The source code of this project is hosted here: https://github.com/mcandocia/FourthOfJulySurvey