Converting Garmin FIT Files to CSV

By Max Candocia


September 22, 2017


Last Updated 08/06/2018

UPDATE: I have a more streamlined script available at https://github.com/mcandocia/fit_processing that performs FIT->CSV, GPX->CSV (needs testing), and CSV->censored CSV (blocks out some fields if they are within a certain radius of user-defined coordinates) conversions, as well as archiving the processed results.

I recently purchased a refurbished Garmin Forerunner 230, along with a heart rate monitor, in the hope of acquiring more accurate GPS data (and heart rate info) on a smaller device. With the Strava iPhone app, you can download data in the .gpx format, which is a type of XML. Garmin, however, uses the .fit file format. This is a binary format that allows a wide variety of data to be included in workouts, such as profile information, heart rate, and traditional time/coordinate information. Converting these files to a useful format, however, is a bit tricky.
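To make "binary file format" a little more concrete, here is a minimal sketch that decodes the fixed header at the start of a FIT file using only the standard library. The function name parse_fit_header and the example file name are my own illustrative choices, and the sketch deliberately ignores the optional header CRC and everything after the header.

```python
import struct


def parse_fit_header(raw: bytes) -> dict:
    """Decode the first 12 bytes of a FIT file (hypothetical helper)."""
    if len(raw) < 12:
        raise ValueError("a FIT header is at least 12 bytes long")
    # Layout (multi-byte values are little-endian):
    #   byte 0      header size (12, or 14 when a header CRC follows)
    #   byte 1      protocol version
    #   bytes 2-3   profile version
    #   bytes 4-7   size of the data section in bytes
    #   bytes 8-11  the ASCII tag ".FIT"
    header_size, protocol, profile, data_size, tag = struct.unpack(
        "<BBHI4s", raw[:12]
    )
    if tag != b".FIT":
        raise ValueError("missing .FIT signature; probably not a FIT file")
    return {
        "header_size": header_size,
        "protocol_version": protocol,
        "profile_version": profile,
        "data_size": data_size,
    }


# Usage (the file name is only an example):
#   with open("2017-09-21.fit", "rb") as f:
#       print(parse_fit_header(f.read(14)))
```

This is only a sanity check that a file really is a FIT file; decoding the actual workout records is much more involved, which is why a dedicated library is the practical route.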

GPSBabel

Initially I tried to use GPSBabel. After installing the latest version from the GitHub source, I ran

gpsbabel -i garmin_fit -f 2017-09-21.fit -o csv -F 2017-09-21_test.csv

Unfortunately, when I used it, GPSBabel did not look for anything other than coordinate information, so the times/heart rate info were lost. The same was true when I tried converting it to GPX.

Fitparse

I discovered a Python library called fitparse. The library reads all of the data in a .fit file into a Python object whose messages attribute holds most of the useful information in the file, including the GPS coordinates and the timestamps, heart rates, step cadence, etc. that go along with them.
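As a rough illustration of what working with those messages looks like, the sketch below pulls a few fields out of each "record" message. It assumes the get_messages and get_values methods of python-fitparse; the helper name record_rows and the default field list are hypothetical, chosen only for this example.

```python
def record_rows(fitfile, wanted=("timestamp", "heart_rate", "cadence")):
    """Yield one dict per 'record' message, keeping only the wanted fields.

    `fitfile` is expected to behave like fitparse.FitFile (assumption).
    """
    for message in fitfile.get_messages("record"):
        values = message.get_values()  # maps field name -> decoded value
        # Fields missing from a message come back as None
        yield {name: values.get(name) for name in wanted}


# Usage with fitparse installed (the file name is only an example):
#   import fitparse
#   fitfile = fitparse.FitFile("2017-09-21.fit")
#   for row in record_rows(fitfile):
#       print(row)
```

Filtering on the "record" message type skips session, lap, and device messages, which carry different fields.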

The current version that you would install with apt-get is not up-to-date, so you should install it from GitHub using the following code (on Ubuntu):

sudo pip3 install -e git+https://github.com/dtcooper/python-fitparse#egg=python-fitparse

I strongly recommend Python 3 for this library, as I had issues when trying to use Python 2.7.

There is a tool called "fitdump" that comes with the library, but it is an incomplete script. Below I have my own file that I use for converting .fit files to CSV.

import csv
import os
#to install fitparse, run 
#sudo pip3 install -e git+https://github.com/dtcooper/python-fitparse#egg=python-fitparse
import fitparse
import pytz

allowed_fields = ['timestamp', 'position_lat', 'position_long', 'distance',
                  'enhanced_altitude', 'altitude', 'enhanced_speed',
                  'speed', 'heart_rate', 'cadence', 'fractional_cadence']
required_fields = ['timestamp', 'position_lat', 'position_long', 'altitude']

UTC = pytz.UTC
CST = pytz.timezone('US/Central')


def main():
    files = os.listdir()
    fit_files = [file for file in files if file[-4:].lower()=='.fit']
    for file in fit_files:
        new_filename = file[:-4] + '.csv'
        if os.path.exists(new_filename):
            #print('%s already exists. skipping.' % new_filename)
            continue
        fitfile = fitparse.FitFile(file,  
            data_processor=fitparse.StandardUnitsDataProcessor())
        
        print('converting %s' % file)
        write_fitfile_to_csv(fitfile, new_filename)
    print('finished conversions')


def write_fitfile_to_csv(fitfile, output_file='test_output.csv'):
    messages = fitfile.messages
    data = []
    for m in messages:
        skip=False
        if not hasattr(m, 'fields'):
            continue
        fields = m.fields
        #check for important data types
        mdata = {}
        for field in fields:
            if field.name in allowed_fields:
                if field.name=='timestamp':
                    mdata[field.name] = UTC.localize(field.value).astimezone(CST)
                else:
                    mdata[field.name] = field.value
        for rf in required_fields:
            if rf not in mdata:
                skip=True
        if not skip:
            data.append(mdata)
    #write to csv (newline='' is what the csv module expects; it avoids blank rows on Windows)
    with open(output_file, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(allowed_fields)
        for entry in data:
            writer.writerow([ str(entry.get(k, '')) for k in allowed_fields])
    print('wrote %s' % output_file)

if __name__=='__main__':
    main()
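The script passes StandardUnitsDataProcessor to FitFile so that values come out in familiar units. As far as I can tell, the conversions involved amount to the arithmetic below: FIT stores coordinates as 32-bit "semicircles", and the processor reports speed in km/h and distance in kilometers. The function names are hypothetical, and this is a sketch of the arithmetic rather than the library's own code.

```python
def semicircles_to_degrees(semicircles):
    """Convert a FIT coordinate in semicircles to decimal degrees."""
    # 2**31 semicircles correspond to 180 degrees
    return semicircles * (180.0 / 2**31)


def mps_to_kmh(meters_per_second):
    """Convert meters per second to kilometers per hour."""
    return meters_per_second * 3.6


def meters_to_km(meters):
    """Convert meters to kilometers."""
    return meters / 1000.0
```

For example, a raw speed of 3.639 m/s works out to about 13.1 km/h.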

You can find an updated version of the code that extracts lap/start/stop time information on my GitHub.

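An example of a few rows from a run (made using the htmlTable package in R), excluding the "enhanced" columns. Because the script uses StandardUnitsDataProcessor, distance is in kilometers and speed is in km/h; altitude is in meters: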

     timestamp                  position_lat    position_long    distance  altitude  speed    heart_rate  cadence  fractional_cadence
280  2017-09-21 16:23:00-05:00  41.97074582800  -87.64729457907  1.02973   175.8     13.1004  142         82       0
281  2017-09-21 16:23:01-05:00  41.97071447968  -87.64731427654  1.03358   175.6     13.1328  142         82       0
282  2017-09-21 16:23:02-05:00  41.97068095207  -87.64733263291  1.03759   175.6     13.1328  142         82       0
283  2017-09-21 16:23:03-05:00  41.97064826264  -87.64734562486  1.04138   175.4     13.1688  142         80       0
284  2017-09-21 16:23:04-05:00  41.97061406448  -87.64735224656  1.04521   175.6     13.1688  143         80       0
285  2017-09-21 16:23:05-05:00  41.9705776032   -87.64735300093  1.04926   175.8     13.1688  143         80       0
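Once the data is in CSV form, it can be read back with nothing but the standard library. The sketch below is one way to do that, assuming the column names written by the script above and Python 3.7+ for datetime.fromisoformat; the function name summarize and the keys of its result are hypothetical.

```python
import csv
from datetime import datetime


def summarize(csv_lines):
    """Summarize a converted workout.

    `csv_lines` is any iterable of CSV text lines, e.g. an open file.
    """
    rows = list(csv.DictReader(csv_lines))
    if not rows:
        raise ValueError("no data rows found")
    # Timestamps look like '2017-09-21 16:23:00-05:00'
    first = datetime.fromisoformat(rows[0]["timestamp"])
    last = datetime.fromisoformat(rows[-1]["timestamp"])
    # The conversion script writes str(None) for missing values, so skip those
    heart_rates = [
        float(r["heart_rate"])
        for r in rows
        if r.get("heart_rate") not in (None, "", "None")
    ]
    return {
        "elapsed_seconds": (last - first).total_seconds(),
        # Assumes the first and last rows both have a distance value
        "distance_covered": float(rows[-1]["distance"]) - float(rows[0]["distance"]),
        "mean_heart_rate": (
            sum(heart_rates) / len(heart_rates) if heart_rates else None
        ),
    }


# Usage (the file name is only an example):
#   with open("2017-09-21.csv", newline="") as f:
#       print(summarize(f))
```

The distance difference is in whatever unit the CSV uses, which is kilometers when the file was written with StandardUnitsDataProcessor.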

Here is a segment from a run with the path highlighted by heart rate (cetcolor package in R used for visually discernible gradient):

Code for above visualization:

library(ggplot2)
library(cetcolor)

setwd('/ntfsl/somedirectory')
df = read.csv('workout_gpx/garmin_fit/2017-09-21.csv')

ggplot(df[somevalue:someothervalue,], 
       aes(y=position_lat, x=position_long, color=heart_rate)) + 
  geom_path() + theme_dark() +
  xlab('Longitude') + ylab('Latitude') + 
  ggtitle('Forerunner 230 GPS and Heart Rate Data') + 
  scale_color_gradientn('Heart Rate (bpm)', colours = cet_pal(5, name="inferno")) + 
  theme(plot.title = element_text(hjust=0.5, size=rel(2)),
        axis.title = element_text(size=rel(2)),
        legend.title = element_text(size=rel(1.5))) 

Future Work

I plan on comparing the iPhone app recording to the Garmin Forerunner's, and this conversion helps dramatically. I'm still thinking of ideas that involve heart rate data. If you have any cool ideas, you can reach me via my email, which you can discover by running

python -c "import base64; print(base64.standard_b64decode('bWF4Y2FuZG9jaWFAZ21haWwuY29t').decode())"

in the command line. Alternatively, you can run

intToUtf8(c(109L, 97L, 120L, 99L, 97L, 110L, 100L, 111L, 99L, 105L, 97L, 64L, 103L, 109L, 97L, 105L, 108L, 46L, 99L, 111L, 109L) )

in R (thanks to my colleague & friend James Balamuta for that idea).

