Modeling Heart Rate Recovery with Nonlinear Regression

By Max Candocia


January 09, 2019

What kind of a tool do you use when you want to describe a complex phenomenon?

Linear regression is useful when your model is described by coefficients multiplying things that are added together, but it often cannot be used when coefficients are inside functions such as an exponential or log function. An example of a valid linear model could be $$Y = \beta_0 + \beta_1 \times X_1 + \beta_2 \times \log(X_2) + \epsilon$$ where \(\beta_i\) are the different coefficients, \(Y\) is the response variable, \(X_i\) are the different predictors, and \(\epsilon\) is the normal error/noise (i.e., randomness) of each response.

On the other hand, models such as random forests or neural networks are good at fitting highly complex shapes, but the underlying structure cannot be interpreted very easily. Random forests are an ensemble of decision trees that ask a bunch of yes/no questions about the different variables, and neural networks can sometimes be described as a very complicated flowchart that involves adding up points. See this article of mine for a very basic introduction to neural networks.

Another method, nonlinear regression, can be very useful for fitting a model. Very generally, this is $$Y = f(X_1, X_2,...,X_N,\beta_1, \beta_2, ..., \beta_K)$$

This can be very useful if you know what form the equation should take, as well as a (wide) ballpark idea of what the parameter values should be. Fitting it, however, is much more complicated than fitting a linear model, a random forest, or even many neural networks.
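To make this concrete, here is a minimal sketch of nonlinear regression in R using the built-in nls() function; the data, parameter names, and starting values are made up purely for illustration.

```r
# Illustrative only: fit y = a + b * exp(-c * x) to synthetic data
set.seed(1)
x = seq(0, 100, by = 2)
y = 80 + 70 * exp(-0.05 * x) + rnorm(length(x), sd = 3)

# nls() requires rough starting values for every parameter
fit = nls(y ~ a + b * exp(-c * x),
          start = list(a = 70, b = 60, c = 0.1))
coef(fit)  # estimates should land near a = 80, b = 70, c = 0.05
```

Unlike lm(), nls() can fail to converge outright if the starting values are far off, which is exactly the "(wide) ballpark" requirement mentioned above.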

Heart Rate Data

Over the past 16 months, I have been collecting heart rate data from my Garmin GPS watch for my runs, along with temperature data. One question I want to answer with that data is "How quickly does my heart rate decrease after I stop running, and how does temperature affect that?"

I had discussed certain factors affecting heart rate after exercise with my friend Allen, who has worked in various human biological research environments. The main factor keeping heart rate elevated is the breakdown of lactic acid and the regulation of \(CO_2\). Another significant factor he described was temperature, where the body tries to regulate its temperature by circulating blood faster.

Of course, I do not have temperature of the body or \(CO_2\) levels directly measured by equipment, but the ambient temperature and the heart rate measurements should be sufficient, provided these are the descriptive variables we want to use.

The general shape of the data itself can be seen below. Note that as the temperature increases (from top to bottom), the asymptote of the heart rate as rest time grows large becomes higher. Any changes in rate are not as obvious. Higher initial heart rates are yellow, and lower ones are more purple/blue.

[Figure hr_graph_0: heart rate vs. rest time, arranged by temperature, with initial heart rate mapped to color]

For more technical information on heart rate during exercise, see Cardiac Autonomic Responses during Exercise and Post-exercise Recovery Using Heart Rate Variability and Systolic Time Intervals—A Review. Figure 2 is a good example of a theoretical curve.

Heart Rate Model

My first model was a bit more complicated, and looked like this: $$HR_{start} = HR_{stop} + (HR_a - HR_{stop}) \times \left(1-e^{-(c_1\min(t, lag) + c_2\max(t-lag,\, 0))}\right)$$

Where \(HR_a\) is the asymptotic heart rate, roughly what my heart rate is when walking, which I usually did after stopping, \(lag\) is the amount of time between the stop and the change of the heart rate recovery mechanism, and \(c_1\) and \(c_2\) describe the rates of change of the heart rate before and after the mechanism change.

Later, I used a simpler model, $$HR_{start} = HR_{stop} + (HR_a - HR_{stop}) \times (1-e^{-c_1t})$$ since my row count was not too high (\(N=646\)) and the \(c_2\) and \(lag\) parameters were not fitting well or consistently. Additionally, the fit for the simpler model had an \(R^2\) value very close to the random forest model (0.73 vs 0.75), suggesting that any additional parameterization with the given variables would not improve performance.

Variable Parameterization

Each of the above parameters itself is a function of temperature, so, for example $$HR_a = \beta_{HR_a,1} + \beta_{HR_a, T}T$$ where \(T\) is the average temperature.
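Substituting these linear parameterizations into the simpler model makes every fitted constant explicit: $$HR_{start} = HR_{stop} + \left(\beta_{HR_a,1} + \beta_{HR_a,T}T - HR_{stop}\right)\times\left(1-e^{-(\beta_{c_1,1} + \beta_{c_1,T}T)t}\right) + \epsilon$$ so four coefficients, plus the noise parameter, are estimated in total.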

Implementation in R

Optimization and Package Choice

When I first tried modeling, I used R's optim() function to find the best solution. However, that method is not particularly robust, and had fairly poor performance and results.

My next approach was to use the dfoptim package's hjk() function, which implements the Hooke-Jeeves algorithm, a pattern search that does not require the objective function to be continuously differentiable.

While this was a huge improvement over optim, it was quite slow, was very sensitive to initialization parameters, and did not produce very robust results.
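Both of these approaches amount to minimizing the sum of squared errors of the model directly. A minimal sketch of that setup, with made-up data and the simpler (no-temperature) model, looks like this:

```r
# Illustrative only: least-squares objective for the simple recovery model
set.seed(2)
t       = runif(200, 0, 120)                # rest time, seconds
hr_stop = runif(200, 120, 180)              # heart rate at stop
hr_start = hr_stop + (80 - hr_stop) * (1 - exp(-0.025 * t)) + rnorm(200, sd = 5)

# sum of squared errors as a function of (HR_a, c1)
sse = function(par) {
  hr_a = par[1]; c1 = par[2]
  pred = hr_stop + (hr_a - hr_stop) * (1 - exp(-c1 * t))
  sum((hr_start - pred)^2)
}

res = optim(c(90, 0.05), sse)  # Nelder-Mead by default
res$par                        # should land near c(80, 0.025)
# dfoptim::hjk(c(90, 0.05), sse) is a drop-in alternative using Hooke-Jeeves
```

The objective is smooth here, but with the lag and \(c_2\) terms of the first model it becomes much harder for these general-purpose optimizers to navigate.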

The final tool I used was a package called rstan, an R interface to STAN, a C++ platform for high-performance statistical modeling used in many domains.

Like the hjk() function, STAN benefits greatly from sensible initial parameters. The change that made the biggest difference was setting the constant term for \(HR_a\) to 80, which is a reasonable value for a heart rate when walking (for myself, at least).

Model Definition

I run the model in R using the following block:


library(dplyr)
library(rstan)

# note: the filtered columns and the init values other than hr_a_const
# were lost from the original post and are reconstructed guesses
heart = read.csv('s1_heart_data.csv') %>%
  filter(! & ! & !

heart_data = list(N=nrow(heart))
for (name in names(heart))
  heart_data[[name]] = heart[,name]

init_01 = list(hr_a_const=80, c1_const=0.02, hr_a_temp=0, c1_temp=0, sigma2_1=100)

fit <- stan(file='model_01.stan', data=heart_data,
            init = function(chain_id) init_01)


Below is the model specification written for STAN in model_01.stan:

data {
  int<lower=0> N;
  vector[N] heart_rate_start;
  vector[N] heart_rate_stop;
  vector[N] avg_temp;
  vector[N] rest_time;
}
parameters {
  real hr_a_const;
  real c1_const;
  real hr_a_temp;
  real c1_temp;
  real<lower=0> sigma2_1;
}
model {
  vector[N] hr_a;
  vector[N] c1;
  vector[N] yhat;
  vector[N] sigma2;
  for (n in 1:N){
    hr_a[n] = hr_a_const + avg_temp[n]*hr_a_temp;
    c1[n] = c1_const + avg_temp[n] * c1_temp;
  }
  for (n in 1:N){
    yhat[n] = heart_rate_stop[n] + (hr_a[n] - heart_rate_stop[n]) * (1-exp(-rest_time[n]*c1[n]));
    sigma2[n] = sigma2_1;
  }
  // note: STAN's normal() is parameterized by standard deviation,
  // so sigma2_1 is fit on the standard-deviation scale despite its name
  heart_rate_start ~ normal(yhat, sigma2);
}


The STAN model, because it uses MCMC iterations, reports mean and quantile statistics for each parameter (e.g., median, 75th percentile). Using the mean, the model can be described as: $$ HR_{start} = HR_{stop} + (75.9 + 0.882T - HR_{stop})\times(1- 2^{-(0.0263 + 0.000277T)t }) + N(0, 145) $$ where \(t\) is rest time, \(T\) is temperature in Celsius, and \(N(0,145)\) describes normal error with variance 145, or a standard error of about 12. In the model, though, the temperature term does not significantly affect the rate, as the 2.5th and 97.5th percentile estimates have different signs.
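Note that the STAN fit estimates the rate on a base-e scale (c1_const \(\approx\) 0.0183 in the summary at the bottom of this post), while the equation above uses base 2, via the identity \(e^{-kt} = 2^{-(k/\ln 2)t}\), so the half-life can be read off directly. A quick check of the conversion:

```r
# Convert the base-e rate from the STAN fit to the base-2 rate shown above
c1_e = 0.018283          # posterior mean of c1_const (base e)
c1_2 = c1_e / log(2)     # equivalent base-2 rate, ~0.0264
half_life = 1 / c1_2     # seconds for half the heart-rate gap to close, ~38
```

This matches the roughly 38-second half-life discussed in the interpretation below.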

However, because there is some nonlinear correlation among these parameters when they are all estimated jointly, a more precise (but overfitting-prone) estimate can be made using the original hjk() function, with the above parameters as initial values. The results are close, but the model becomes: $$HR_{start} = HR_{stop} + (76.9 + 0.814T - HR_{stop})\times (1- 2^{-(0.0232 + 0.00101T)t}) + N(0, 2659) $$

The overall fit is technically better, but the \(\sigma^2\) estimate is very poor, probably because it is very sensitive to small changes that greatly increase inaccuracy in the estimates.

Visualizing the first model with the graph of the data from above:

[Figure hr_graph_1: fitted recovery curves from the first model overlaid on the heart rate data above]

Interpretation of Results

According to the model, my normal "walking" heart rate is 75.9 bpm, plus an extra 0.88 bpm per degree Celsius outside, and after pausing, the average "half-life" of my heart rate recovery is \(\frac{1}{0.0263}\), or about 38 seconds. For example, if my heart rate was 160 bpm when it is about 5 degrees Celsius outside, my walking heart rate would eventually be 80 bpm, and after 38 seconds, that initial difference would be halved, so it would be at 120 bpm (since 160 bpm = 80 bpm + 80 bpm, and 80 bpm + 80/2 bpm = 120 bpm).
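This worked example can be checked with a small helper that evaluates the posterior-mean model; the function name is mine, not from the original code.

```r
# Predicted heart rate t seconds after stopping, from the posterior-mean model
predict_hr = function(hr_stop, t, temp) {
  hr_a = 75.9 + 0.882 * temp        # asymptotic (walking) heart rate
  c1   = 0.0263 + 0.000277 * temp   # base-2 recovery rate
  hr_stop + (hr_a - hr_stop) * (1 - 2^(-c1 * t))
}

predict_hr(160, 38, 5)  # ~119 bpm: about halfway from 160 toward the ~80 bpm asymptote
```

The exact value comes out slightly under 120 bpm because the temperature term nudges the rate a bit above its half-life-defining constant.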

Of course, the standard error of these estimates is about 12 bpm, and the 95% confidence interval for the walking heart rate is about 10 bpm wide (71-81 bpm), leaving a 95% prediction interval roughly 30 bpm below and 30 bpm above that estimate for any single event. Some of this error may be hidden in data I have collected but not yet transformed, but much of it likely comes from differences in behavior during the stop time, as well as other factors that were not measured.

Future Work

I am constantly collecting more data, and a friend of mine, using the same models of equipment, is doing the same. I am hoping to add some complexity to these models as more data is collected and extract more useful insights from these models.


The code and data for this model can be found at

Extra Model Notes

You can also see the summary of the rstan model fit below:

##                     mean      se_mean           sd          2.5%
## hr_a_const  7.592154e+01 1.432622e-01 2.4914026031  7.136686e+01
## c1_const    1.828310e-02 5.558837e-05 0.0014289690  1.566018e-02
## hr_a_temp   8.821563e-01 7.072157e-03 0.1303897701  6.341916e-01
## c1_temp     1.922401e-04 4.077463e-06 0.0001121361 -3.493921e-05
## sigma2_1    1.453253e+01 1.953821e-02 0.3920238240  1.376281e+01
## lp__       -2.047064e+03 6.985942e-02 1.4589710881 -2.050706e+03
##                      25%           50%           75%         97.5%
## hr_a_const  7.406508e+01  7.589436e+01  7.773107e+01  8.092963e+01
## c1_const    1.724736e-02  1.819029e-02  1.922995e-02  2.129886e-02
## hr_a_temp   7.880944e-01  8.813928e-01  9.784810e-01  1.121392e+00
## c1_temp     1.226810e-04  1.933863e-04  2.692400e-04  4.072611e-04
## sigma2_1    1.427220e+01  1.453229e+01  1.478619e+01  1.529961e+01
## lp__       -2.047807e+03 -2.046837e+03 -2.046006e+03 -2.045062e+03
##               n_eff      Rhat
## hr_a_const 302.4297 1.0000841
## c1_const   660.8117 1.0032265
## hr_a_temp  339.9251 1.0008696
## c1_temp    756.3285 1.0064864
## sigma2_1   402.5830 0.9998339
## lp__       436.1576 1.0044044


Recommended Articles

A Simple Explanation of how Computers Recognize Images

How do computers recognize images? The process is fundamentally a flowchart.

Evaluate My Lottery Ticket!

What is the expected value of a lottery ticket, and is it actually worth it just for the jackpot? With this tool, you can look at what any number of tickets are worth, with a highly customizable input.