How Smart Are Smart Watches?

Over the past decade, smart watches and fitness trackers have become common wearable gadgets among athletes, health professionals, and the general public alike. With promises to monitor everything from heart rate to sleep quality, step counts to calorie burn, these devices have gained immense popularity for their convenience and real-time insights. But behind the sleek interface and colourful graphs lies a crucial question: just how accurate and precise are smart watches when it comes to tracking the metrics that matter?

In this article, we review the last decade of scientific literature to understand what smart watches do well, where they fall short, and why using their data without considering technological error can lead to misguided decisions, especially when it comes to fat loss, recovery, and exercise programming. We also explore how a more integrated approach, combining step tracking, nutritional monitoring, and behavioural consistency, can help bridge this gap.


Step Count: Among the Most Accurate Metrics

Step count remains one of the most reliable metrics across almost all modern wearables. Controlled studies show that devices like the Apple Watch, Fitbit, and Samsung Galaxy series typically keep step-count errors to roughly 5–10% or less during structured walking or running trials. This performance tends to remain stable across a variety of free-living conditions, although factors such as slow gait, non-arm-swinging movement (e.g., pushing a shopping cart), or irregular cadence can introduce more variability & error.
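To make the error figures above concrete, here is a minimal sketch of how a percentage error can be worked out from a simple validation walk. The function name and the numbers in the example are hypothetical, chosen only for illustration; they are not taken from any of the studies reviewed.

```python
def step_count_error_pct(device_steps: int, reference_steps: int) -> float:
    """Absolute percentage error of a device step count against a reference count.

    reference_steps is assumed to come from a trusted method, e.g. hand counting.
    """
    if reference_steps <= 0:
        raise ValueError("reference_steps must be positive")
    return abs(device_steps - reference_steps) / reference_steps * 100


# Hypothetical trial: 1,000 hand-counted steps, while the watch reports 960.
error = step_count_error_pct(device_steps=960, reference_steps=1000)
print(f"Step-count error: {error:.1f}%")
```

Anyone can run a rough version of this check by counting a few hundred steps by hand and comparing the result to what the watch logged.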

Most studies agree that smartwatch step tracking performs better than other metrics and can serve as a meaningful anchor for general activity monitoring. For individuals tracking energy expenditure through non-exercise activity thermogenesis (NEAT), step count is a valid and useful proxy.

As a coach, I find these findings reaffirm the importance of monitoring step count for athletes in contest preparation, as it’s an incredibly practical & accurate tool for better understanding an individual’s daily movement. Although it is not an exact measure of calorie expenditure, it’s a great metric for standardising daily movement, on top of which a deficit through nutrition can be layered to ensure fat loss over time.


Heart Rate: Reliable at Rest, Less at High Intensities

Wrist-based photoplethysmography (PPG) allows smart watches to estimate heart rate with surprising accuracy at rest and during light to moderate aerobic activity. Devices like the Apple Watch, Garmin Forerunner, and WHOOP Strap have demonstrated average heart rate errors within 5-10% of ECG gold standards in several trials. For bodybuilding athletes who prefer to keep cardio low to moderate in intensity, this means heart rate readings may be relatively accurate.

However, this accuracy diminishes at higher intensities, during resistance training, or during activities with limited wrist motion like cycling. Heart rate readings during interval training can lag or become erratic due to sensor motion, skin perfusion changes, and device fit. Although the relative error may be tolerable for casual users, those relying on heart rate zones for precision training should consider using chest strap monitors for more reliable data.


Calories Burned: The Achilles’ Heel of Wearables

Despite advances in sensor technology, energy expenditure remains one of the most consistently inaccurate metrics across all major smartwatch brands. Multiple peer-reviewed studies, including large-scale validations by Stanford University, have shown that no commercially available smartwatch consistently estimates calories burned with less than 20% error. In many cases, the margin of error ranges between 30–50%, particularly for strength training, high-intensity intervals, or varied free-living activities. That’s huge!

These discrepancies arise from several limitations. Smart watches typically estimate calorie burn using proprietary algorithms based on motion sensors and heart rate data, occasionally incorporating user demographics. However, they do not account for individual variations in fitness level, muscle mass, resting metabolic rate, or exercise efficiency. This often results in significant over- or underestimation of true caloric expenditure.

Using calorie burn from a smartwatch to gauge energy balance or justify food intake is therefore a red flag. For example, a device that overestimates workout calories by 200-300 kcal may cause an individual to eat back those calories, undermining a deficit. Conversely, underestimation for an athlete in a build phase may lead to over-restriction and thus poorer performance & recovery.
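The eat-back problem is easy to put numbers on. The sketch below is illustrative arithmetic only: the function name, the 500 kcal planned deficit, and the 50% overestimate are hypothetical values, and it assumes the athlete eats back everything the watch reports.

```python
def effective_deficit(planned_deficit_kcal: float,
                      reported_burn_kcal: float,
                      overestimate_fraction: float) -> float:
    """Deficit that remains after eating back a device-reported calorie burn.

    overestimate_fraction is the assumed share by which the device inflates
    the true burn, e.g. 0.5 means reported = true * 1.5.
    """
    if overestimate_fraction <= -1:
        raise ValueError("overestimate_fraction must be greater than -1")
    true_burn = reported_burn_kcal / (1 + overestimate_fraction)
    phantom_kcal = reported_burn_kcal - true_burn  # calories that were never burned
    return planned_deficit_kcal - phantom_kcal


# Hypothetical day: 500 kcal planned deficit, watch reports 900 kcal of
# exercise, and the device is assumed to overestimate by 50%.
remaining = effective_deficit(500, 900, 0.5)
print(f"Deficit actually achieved: {remaining:.0f} kcal")
```

In this made-up case the watch credits 300 kcal that were never burned, so more than half of the planned deficit disappears on that day alone.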

These cumulative errors can derail fat loss or performance goals if relied upon too heavily. The psychological impact is also worth noting: some users may develop an unhealthy relationship with exercise and food, striving to “burn off” meals or adhere rigidly to caloric targets dictated by flawed data.


Resting Metabolic Rate (RMR): An Educated Guess at Best

Most wearables do not measure resting metabolic rate directly. Instead, they estimate RMR using predictive equations based on user input (age, weight, height, sex). While these formulas offer ballpark estimates, individual RMR can vary significantly due to genetics, muscle mass, thyroid status, and training history.
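For a sense of what such a predictive equation looks like, here is a short sketch of the Mifflin-St Jeor equation, one widely used formula built from exactly these inputs. Manufacturers do not generally disclose which equation a given watch uses, so treating this as any device's actual method would be an assumption; the example athlete is hypothetical.

```python
def mifflin_st_jeor_rmr(weight_kg: float, height_cm: float,
                        age_years: float, sex: str) -> float:
    """Estimate resting metabolic rate (kcal/day) with the Mifflin-St Jeor equation."""
    base = 10 * weight_kg + 6.25 * height_cm - 5 * age_years
    if sex == "male":
        return base + 5
    if sex == "female":
        return base - 161
    raise ValueError("sex must be 'male' or 'female'")


# Hypothetical athlete: 80 kg, 180 cm, 30 years old, male.
print(mifflin_st_jeor_rmr(80, 180, 30, "male"), "kcal/day")
```

Note that nothing in the formula knows about muscle mass, thyroid status, or training history, which is why two athletes with identical inputs can get the same estimate while having quite different true RMRs.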

As such, RMR values from smart watches should not be treated as definitive. For individuals in contest prep, performance sport, or clinical weight management, a more accurate RMR assessment may be warranted (via indirect calorimetry or validated prediction models).


Sleep Tracking: Strong for Duration, Weak for Stages

Most modern smart watches are reasonably accurate (85-90%) at detecting total sleep time, sleep onset, and general sleep/wake cycles when compared to polysomnography (the clinical gold standard), although accuracy drops for detecting brief awakenings.
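One common way to arrive at an accuracy figure like this is epoch-by-epoch agreement: the night is split into short epochs, and each epoch's device label is compared with the polysomnography label. The sketch below assumes that approach; the function name and the ten-epoch example are hypothetical, and the studies behind the figure above may have used other methods.

```python
def epoch_agreement(device_labels: list[str], reference_labels: list[str]) -> float:
    """Fraction of epochs where the device label matches the reference label."""
    if not reference_labels or len(device_labels) != len(reference_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(d == r for d, r in zip(device_labels, reference_labels))
    return matches / len(reference_labels)


# Hypothetical ten-epoch excerpt; the watch misses one brief awakening.
reference = ["sleep", "sleep", "wake", "sleep", "sleep",
             "sleep", "sleep", "sleep", "sleep", "sleep"]
device = ["sleep"] * 10
print(f"Agreement: {epoch_agreement(device, reference):.0%}")
```

The example also shows why a high overall figure can hide weak wake detection: a watch that labels every epoch as sleep still agrees with the reference most of the time, because most of the night is sleep.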

However, staging sleep into light, deep, and REM phases is far less reliable. Consumer devices often rely on movement and heart rate variability to infer stages, leading to inconsistent classification. Studies report only moderate agreement with clinical staging, and users should interpret sleep architecture data with caution.


Recovery and Readiness Scores: Promising but Under-Researched

Features like WHOOP’s Recovery Score or Garmin’s Body Battery synthesise HRV, sleep quality, and resting HR to provide a daily readiness estimate. These tools are appealing for guiding training intensity and recovery, but peer-reviewed validation is limited.

While underlying components like HRV and sleep time are measurable, the algorithms translating them into a single recovery score remain proprietary and inconsistently validated. These scores may provide useful trends over time but should not replace subjective readiness or objective performance in training.

As a coach, I see HRV or readiness scores as one tool in the tool belt for reviewing an athlete’s recovery status; however, monitoring training load or implementing autoregulatory strategies should be based on a combination of the subjective & objective information gathered.


Blood Pressure and Oxygen Saturation: Not Yet Clinically Reliable

A few smart watches now offer cuffless blood pressure or SpO2 measurements. While these features are technologically impressive, they lack the precision of clinical devices. Blood pressure readings often show regression to the mean and poor specificity in identifying hypertension. SpO2 measurements are more accurate at rest in healthy adults but lose precision under hypoxic conditions or during motion.

In short, these metrics may provide useful screening data or trend indicators but should not be used for medical decision-making.


The Take Home

For most users, smart watches offer tremendous value in creating awareness, building habits, and tracking general progress. But leveraging their full potential requires an understanding of what the data actually represents & where it falls short.

When it comes to managing body composition or performance, a more robust approach integrates:

  • Nutrition tracking: Using validated apps or food diaries to monitor energy intake.

  • Step count: As a more reliable measure of daily energy output than calorie estimates.

  • Resistance training logs: To assess progressive overload and training quality.

  • Subjective markers: Including mood, sleep satisfaction, hunger, and fatigue.

  • Additional measurements: Such as body composition assessments, scale weight trends, or RMR testing (if warranted).

Rather than fixating on precise calorie outputs from each session, athletes should aim to create a consistent calorie deficit or surplus through repeatable behaviours (a daily step goal, control of dietary intake & a planned strength program).
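A simple way to check whether those behaviours are producing the intended result is to follow the scale-weight trend rather than any single reading. Here is a minimal sketch using a 7-day rolling average; the window length and the weigh-in values are hypothetical choices for illustration.

```python
def rolling_mean(values: list[float], window: int = 7) -> list[float]:
    """Rolling average over `window` consecutive readings.

    Returns one value per full window, so the output is shorter than the input.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    trend = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        trend.append(sum(chunk) / window)
    return trend


# Hypothetical daily weigh-ins in kg over ten days.
weights = [80.4, 80.1, 80.6, 80.0, 79.9, 80.2, 79.8, 79.7, 79.9, 79.5]
for day, avg in enumerate(rolling_mean(weights), start=7):
    print(f"Day {day}: 7-day average {avg:.2f} kg")
```

Day-to-day readings bounce around, but if the averaged trend moves in the intended direction over a few weeks, the deficit or surplus is real regardless of what the watch reports for each session.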

For athletes, coaches, and physique-focused individuals, these devices are best viewed as informative tools, not infallible trackers. When paired with sound nutrition practices, consistent physical activity, and mindful data interpretation, smart watches can play a valuable, additive role in shaping your competitive success.


References:

Chowdhury, E. A., Western, M. J., Nightingale, T. E., & Thompson, D. (2021). Validity of energy expenditure prediction of wearable activity monitors: A systematic review and meta-analysis. Sports Medicine, 51(5), 893-919.

de Zambotti, M., Cellini, N., Goldstone, A., Colrain, I. M., & Baker, F. C. (2019). Wearable sleep technology in clinical and research settings. Medicine and Science in Sports and Exercise, 51(7), 1538-1557.

Fuller, D., Colwell, E., Low, J., Orychock, K., Tobin, M. A., Simango, B., … & Tsiros, A. (2020). Reliability and validity of commercially available wearable devices for measuring steps, energy expenditure, and heart rate: Systematic review. JMIR mHealth and uHealth, 8(9), e18694.

Halson, S. L. (2014). Monitoring training load to understand fatigue in athletes. Sports Medicine, 44(S2), 139-147.

Heymsfield, S. B., Peterson, C. M., Thomas, D. M., Heo, M., & Schuna Jr, J. M. (2014). Why are there race/ethnic differences in adult body mass index-adiposity relationships? A quantitative critical review. Obesity Reviews, 15(3), 219–229.

Shcherbina, A., Mattsson, C. M., Waggott, D., Salisbury, H., Christle, J. W., Hastie, T., & Ashley, E. A. (2017). Accuracy in wrist-worn, sensor-based measurements of heart rate and energy expenditure in a diverse cohort. Journal of Personalized Medicine, 7(2), 3.

Wallen, M. P., Gomersall, S. R., Keating, S. E., Wisløff, U., Coombes, J. S., & Clark, B. (2016). Accuracy of heart rate watches: Implications for weight management. PLOS ONE, 11(5), e0154420.

Wang, R., Blackburn, G., Desai, M., Phelan, D., Gillinov, L., Houghtaling, P., & Gillinov, M. (2017). Accuracy of wrist-worn heart rate monitors. JAMA Cardiology, 2(1), 104-106.
