Reliability and validity of ten consumer activity trackers depend on walking speed
Purpose: To examine the test–retest reliability and validity of ten activity trackers for step counting at three different walking speeds. Methods: Thirty-one healthy participants walked twice on a treadmill for 30 min while wearing 10 activity trackers (Polar Loop, Garmin Vivosmart, Fitbit Charge HR, Apple Watch Sport, Pebble Smartwatch, Samsung Gear S, Misfit Flash, Jawbone Up Move, Flyfit, and Moves). Participants walked at three speeds for 10 min each: slow (3.2 km·h⁻¹), average (4.8 km·h⁻¹), and vigorous (6.4 km·h⁻¹). To measure test–retest reliability, intraclass correlations (ICC) were determined between the first and second treadmill test. Validity was determined by comparing the trackers with the gold standard (hand counting), using mean differences, mean absolute percentage errors, and ICC. Statistical differences were assessed with paired-sample t tests, Wilcoxon signed-rank tests, and Bland–Altman plots. Results: Test–retest reliability varied, with ICC ranging from −0.02 to 0.97. Validity varied between trackers and between walking speeds, with mean differences between the gold standard and activity trackers ranging from 0.0 to 26.4%. Most trackers showed relatively low ICC and broad limits of agreement in the Bland–Altman plots at the different speeds. For the slow walking speed, the Garmin Vivosmart and Fitbit Charge HR showed the most accurate results. The Garmin Vivosmart and Apple Watch Sport demonstrated the best accuracy at an average walking speed. For vigorous walking, the Apple Watch Sport, Pebble Smartwatch, and Samsung Gear S exhibited the most accurate results. Conclusion: Test–retest reliability and validity of activity trackers depend on walking speed. In general, consumer activity trackers perform better at an average and vigorous walking speed than at a slower walking speed.
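As an illustration of two of the validity measures named in the Methods, the following minimal sketch computes a mean absolute percentage error and Bland–Altman limits of agreement for one tracker against hand counts. It is not the authors' analysis code: the function names are hypothetical, the step counts are made-up illustrative values rather than study data, and the limits of agreement are assumed to follow the conventional bias ± 1.96 × SD of the differences.

```python
from statistics import mean, stdev


def mean_absolute_percentage_error(reference, measured):
    """Mean of |measured - reference| / reference, expressed in percent."""
    return 100.0 * mean(abs(m - r) / r for r, m in zip(reference, measured))


def bland_altman_limits(reference, measured):
    """Return (bias, lower limit, upper limit) of agreement.

    Assumes the conventional definition: bias is the mean of the paired
    differences and the limits are bias +/- 1.96 times their sample SD.
    """
    diffs = [m - r for r, m in zip(reference, measured)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Illustrative values only (NOT data from the study): steps per 10-min bout
# for four hypothetical participants.
hand_counts = [1000, 1100, 1200, 1050]      # gold standard (hand counting)
tracker_counts = [990, 1100, 1260, 1029]    # one hypothetical tracker

mape = mean_absolute_percentage_error(hand_counts, tracker_counts)
bias, lower, upper = bland_altman_limits(hand_counts, tracker_counts)

print(f"MAPE: {mape:.1f}%")
print(f"Bias: {bias:.2f} steps, limits of agreement: {lower:.1f} to {upper:.1f}")
```

A tracker can show a small bias and still have broad limits of agreement, which is why the abstract reports both kinds of measure.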