Upgrading maxcandocia.com


By Max Candocia


December 04, 2024

maxcandocia.com is finally upgraded with modern software. The circumstances that prompted the update were less than ideal, but the site is in a much better state now.

Reconstructing a Sampled Surface with Bicubic Splines


By Max Candocia


October 14, 2024

Reconstructing a partially sampled surface can be tricky, since you have only limited data and prior assumptions about what a 2D surface looks like. Bicubic splines can be used alongside Markov chain models to optimize a fit.

Modeling with time-dependent error and burgers


By Max Candocia


September 21, 2024

Time-dependent error can make estimating effects and other values difficult. One way to overcome this is to model a time series alongside a regular linear model.

What is a p-value, and when should it be used?


By Max Candocia


September 06, 2024

What is a p-value, and when should it be used? This is a basic introduction to p-values, their usage, and some caveats, rather than a deep dive into how to calculate them.

Effectively Hiding Data in Images


By Max Candocia


December 28, 2021

How to effectively hide messages and other data inside images with steganography.

Subreddits That Get You the Most Awards


By Max Candocia


March 27, 2021

Which subreddits are most likely to generate awards for their users?

"Error Bars" on Tiled Heatmaps


By Max Candocia


February 23, 2021

Heatmaps are a useful way of plotting 2-dimensional data, such as cross-tabulations. Adding "error bars" can seem non-intuitive, but expressing them in your visualization is possible with a small trick.

How Dream (or Anyone) Could Cheat in Minecraft Speedruns Without Anyone Noticing


By Max Candocia


December 19, 2020

Recently, famous YouTuber Dream had his Minecraft speedrun records removed as a result of cheating. If he had cheated differently, would he have been able to evade detection?

Converting fields of lists into wide and tall formats in R


By Max Candocia


December 03, 2020

If you have ever downloaded survey data, or any other kind of data, with a field that is itself comma-separated, you may have found it annoying or difficult to reshape the data into a more useful form.

Pooled Testing for Viruses: How many tests can it save?


By Max Candocia


November 26, 2020


Candy Combinations for Bundling


By Max Candocia


November 13, 2020

What candies would work best in a bundle? Using rankings and correlations, popular candies can be grouped together for optimal combinations.

How to Get Survey Responses from Reddit


By Max Candocia


November 12, 2020

If you need more data for a survey, you can use Reddit as a source of responses. In this article, we look at a few factors that affect the success of a survey posted to Reddit.

How Would the US Vote for a Candy?


By Max Candocia


October 28, 2020

What would it look like if people across the US voted for a candy? Explore different results using different voting methods and different types of representation, such as a national vote versus the Electoral College.

Hashing Data to Memorable Phrases


By Max Candocia


October 02, 2020

Do you have trouble memorizing long strings, but want to keep things easy to remember? Look no further than the new keyToEnglish package in R, now available on CRAN.

Calculating Similarity of Running Routes


By Max Candocia


September 13, 2020

When working with path-like data, such as a run recorded by GPS, you may want to group near-identical routes together. Using a small amount of data, I demonstrate how similarities can be calculated to find duplicate runs, as well as to make general comparisons between runs.

Dealing with Zeros and Negative Values with a Log Scale


By Max Candocia


August 30, 2020

When plotting data, you may want to use a log scale for most of your data, but zeros, near-zero values, and negative values make this impossible. With piecewise linear and logarithmic functions, however, this effect can still be achieved.

Visualizing Direction in Running Routes


By Max Candocia


May 17, 2020

A relatively straightforward method of visualizing the direction of a running path using R and ggmap. This also works for path data in general.

How Likely Are You to be Banned From Reddit?


By Max Candocia


April 07, 2020

How Likely Are You to be Banned From Reddit? I got a bot for that.

Outliers in a Triathlon


By Max Candocia


February 26, 2020

How do you identify an "outlier" in a triathlon?

Hiding Data in Images


By Max Candocia


February 23, 2020

Images are one of the most common types of data that people view on the internet, but could they be hiding more than the eye can see?