Why your IoT temperature averages lie

May 24, 2026 Andrei Gosman 4 min read

Most consumer Zigbee temperature sensors report on change, not on a fixed schedule. A naive mean() over a month weights every sample equally, regardless of how much real time it represents. When a sensor spikes briefly (direct sunlight, a draft, any rapid transient), it can produce hundreds of samples within an hour, all clustered around anomalous values. The monthly average gets pulled toward those over-represented readings. The bias is invisible day-to-day and can reach 2-3°C.

Setup

  • Sonoff SNZB-02D outdoor sensor, mounted in a location with morning sun exposure
  • Zigbee2MQTT → Home Assistant → InfluxDB add-on
  • Grafana dashboard with monthly mean stat panels, one per month, simple SELECT mean("value") queries

The dashboard ran for months. Numbers looked plausible.

Cross-check

I compared the sensor’s monthly mean against București-Filaret, an official weather station reporting at meteoromania.ro.

Month      OAT sensor (°C)   Filaret (°C)   Δ (°C)
Nov 2025   9.1               8.8            +0.3
Dec 2025   3.9               3.3            +0.6
Jan 2026   −0.2              −0.5           +0.3
Feb 2026   5.2               1.5            +3.7
Mar 2026   11.2              8.0            +3.2
Apr 2026   14.7              11.9           +2.8

November through January agreed within a degree. February through April were off by roughly 3°C or more (2.8 to 3.7°C). Urban heat island can produce 1-2°C of consistent offset between two points in the same city. It does not produce 3-4°C, and it would not be concentrated in the months with strong sun.

Wrong hypothesis: solar anomalies on the mean

The sensor catches direct morning sun. I cataloged 23 anomalous hours between November and April where the reported temperature spiked far above the actual outdoor air temperature, confirmed by a co-located Shelly Plus2PM with DS18B20 in a sheltered position.

Example, 11 March 2026, local time:

Hour    OAT (°C)   Roof DS18B20 (°C)
07:00   2.68       2.30
08:00   13.28      2.40
09:00   20.43      4.87
10:00   12.87      8.52
12:00   11.31      15.24

At 09:00 the OAT sensor reads 20.4°C while the sheltered DS18B20 puts the real air temperature at around 5°C. The cause is sensor body heating from direct sunlight.

I recomputed the monthly mean while excluding all 23 anomalous hourly samples. The correction was −0.07°C for February and −0.16°C for March, more than an order of magnitude smaller than the gap to Filaret. The solar anomalies were real, but their direct effect on the mean was negligible.

Actual cause: sample density bias

I counted samples per day for February:

Date     Sample count
14 Feb   52
18 Feb   65
19 Feb   317
20 Feb   178
24 Feb   86
25 Feb   274
27 Feb   255

The high-count days were the same days with solar spikes. Raw timestamps from 19 February showed more than 200 samples logged between 10:24 and 11:33 local time, about one every 20 seconds on average, walking from −1.3°C to 19.9°C and back to 1.2°C.
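The per-day count is easy to script on exported timestamps. The following is a minimal Python sketch, not the query used to build the table: `samples_per_day` and `dense_days` are hypothetical helper names, and the cut-off of 150 reports per day is an arbitrary assumption chosen for this example.

```python
from collections import Counter
from datetime import date, datetime


def samples_per_day(timestamps):
    """Count how many reports fall on each calendar date."""
    return Counter(ts.date() for ts in timestamps)


def dense_days(counts, threshold=150):
    """Return the dates whose report count exceeds the threshold.

    The default of 150 is an arbitrary cut-off chosen for this example.
    """
    return sorted(day for day, n in counts.items() if n > threshold)


# Daily counts copied from the February table in this post.
feb_counts = {
    date(2026, 2, 14): 52,
    date(2026, 2, 18): 65,
    date(2026, 2, 19): 317,
    date(2026, 2, 20): 178,
    date(2026, 2, 24): 86,
    date(2026, 2, 25): 274,
    date(2026, 2, 27): 255,
}

for day in dense_days(feb_counts):
    print(day.isoformat(), feb_counts[day])
```

With this cut-off, 19, 20, 25 and 27 February are flagged; the other three days stay below it.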

This is documented SNZB-02D behavior. The sensor reports on each 0.2°C change, with a 30-minute fallback when stable. Battery-powered Zigbee firmware avoids transmitting redundant data — radio time dominates power consumption.

In a stable environment: a handful of samples per hour. During a thermal transient: 200+ samples per hour, every one at an anomalous temperature.
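The sample count during a transient follows from the reporting threshold. As a rough check, assume the sensor emits one report per 0.2°C of temperature travel and that the path between the three readings quoted for 19 February was monotonic. Both are simplifications, and `expected_reports` is a hypothetical helper.

```python
def expected_reports(path, threshold=0.2):
    """Approximate report count for a temperature path.

    `path` lists the turning points in °C; the sensor is assumed to
    emit one report for every `threshold` degrees of travel.
    """
    travel = sum(abs(b - a) for a, b in zip(path, path[1:]))
    return travel / threshold


# Turning points quoted for 19 February: -1.3 -> 19.9 -> 1.2 °C.
estimate = expected_reports([-1.3, 19.9, 1.2])
print(f"{estimate:.1f} reports expected")
```

For the 19 February swing this comes out at roughly 200 reports, in line with the count seen in the raw timestamps.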

InfluxDB stores each report as one row. SELECT mean(value) computes the arithmetic mean over rows with no awareness of how much real time each row represents. February had ~1,800 samples; ~850 came from the seven anomalous days. Those days are 25% of the month by time but 47% of the sample population, and their values skew high. The naive mean drifts toward them.
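The mechanism can be reproduced on synthetic data. The sketch below simulates a report-on-change sensor over one made-up day: a constant 2.0°C with a single one-hour spike to 20°C. The trace, the `report_on_change` helper and its parameters are illustrative assumptions, not the sensor's firmware logic and not the recorded data.

```python
from statistics import fmean


def report_on_change(trace, threshold=0.2, heartbeat=30):
    """Simulate report-on-change over a per-minute temperature trace.

    A report is emitted when the value has moved by at least `threshold`
    since the last report, or when `heartbeat` minutes have passed.
    """
    reports = [(0, trace[0])]
    for minute, value in enumerate(trace[1:], start=1):
        last_minute, last_value = reports[-1]
        moved = abs(value - last_value) >= threshold
        stale = minute - last_minute >= heartbeat
        if moved or stale:
            reports.append((minute, value))
    return reports


# Synthetic day: constant 2.0 °C with a one-hour triangular spike to 20 °C.
trace = [2.0] * 1440
for i in range(60):
    trace[600 + i] = 2.0 + 18.0 * (1 - abs(i - 30) / 30)

reports = report_on_change(trace)
true_mean = fmean(trace)                       # every minute counts once
naive_mean = fmean([v for _, v in reports])    # every report counts once

print(f"reports: {len(reports)}")
print(f"true time average: {true_mean:.2f}")
print(f"naive sample mean: {naive_mean:.2f}")
```

On this synthetic trace the spike covers about 4% of the day but produces more than half of the reports, and the naive mean lands near 7.1°C against a true time average near 2.4°C.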

Fix

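InfluxQL in InfluxDB 1.x supports subqueries in the FROM clause. In the query below, 'start' and 'end' are placeholders for the boundary timestamps of the month: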

SELECT mean("mean_value") FROM (
  SELECT mean("value") AS "mean_value"
  FROM "°C"
  WHERE "entity_id" = 'oat_sensor'
    AND time >= 'start' AND time < 'end'
  GROUP BY time(1h)
)

The inner query collapses each hour into a single value. The outer query averages those values equally. A quiet 03:00 and a spiking 09:00 both contribute one number, so the number of samples in an hour no longer determines that hour's weight in the monthly mean.
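For data exported out of InfluxDB, the same two-stage average can be sketched in Python. This is an illustration of the idea rather than a replacement for the query: `mean_of_hourly_means` is a hypothetical name, and the sample values are invented.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def mean_of_hourly_means(samples):
    """Average (timestamp, value) samples per hour, then average the hours."""
    buckets = defaultdict(list)
    for ts, value in samples:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    hourly = [fmean(values) for values in buckets.values()]
    return fmean(hourly)


# One quiet reading at 03:xx and four spike readings at 09:xx (invented values).
samples = [
    (datetime(2026, 2, 19, 3, 10), 1.0),
    (datetime(2026, 2, 19, 9, 5), 20.0),
    (datetime(2026, 2, 19, 9, 6), 20.0),
    (datetime(2026, 2, 19, 9, 7), 20.0),
    (datetime(2026, 2, 19, 9, 8), 20.0),
]

naive = fmean([value for _, value in samples])
print(f"naive mean: {naive:.1f}")
print(f"mean of hourly means: {mean_of_hourly_means(samples):.1f}")
```

The naive mean of the five samples is 16.2, while the mean of the two hourly means is 10.5. Within each hour the samples are still weighted equally, so a shorter bin narrows the remaining bias at the cost of more empty bins.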

Results after applying the fix:

Month      Raw mean (°C)   Hourly mean (°C)   Filaret (°C)
Nov 2025   9.1             8.6                8.8
Dec 2025   3.9             3.9                3.3
Jan 2026   −0.2            0.1                −0.5
Feb 2026   5.2             2.5                1.5
Mar 2026   11.2            8.8                8.0
Apr 2026   14.7            13.7               11.9

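Residual offsets now range from −0.2°C (November) to +1.8°C (April), with February and March within about 1°C of Filaret. The positive offsets are consistent with urban heat island and residual radiative heating of the sensor body. The 3-4°C gaps are gone.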

Generalization

The pattern applies to any sensor reporting on change rather than on schedule, and any aggregation that treats samples as equally weighted. Candidates:

  • Smart plugs reporting power on every change — over-sampled during load cycling
  • CO2 and VOC sensors with adaptive reporting
  • Any delta-based telemetry, which covers most consumer IoT

The fix is the same in all cases: aggregate to a uniform time grid before averaging. Time-weighted means via integral() in InfluxQL or equivalent in other TSDBs are the technically correct approach, but fixed-interval pre-aggregation works for most cases and is easier to reason about.
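A time-weighted mean can also be computed directly on exported samples. The sketch below assumes step interpolation: each reported value is held until the next report arrives, which matches how a report-on-change sensor behaves but is a modelling choice. `time_weighted_mean` is a hypothetical helper and the sample values are invented.

```python
def time_weighted_mean(samples, end):
    """Time-weighted mean of (timestamp_seconds, value) pairs sorted by time.

    Each value is held until the next report; the last one is held until `end`.
    """
    if not samples:
        raise ValueError("need at least one sample")
    start = samples[0][0]
    if end <= start:
        raise ValueError("end must be after the first sample")
    # The interval for sample i ends where sample i + 1 begins.
    boundaries = [t for t, _ in samples[1:]] + [end]
    weighted = sum(v * (t_next - t) for (t, v), t_next in zip(samples, boundaries))
    return weighted / (end - start)


# One hour: 50 quiet minutes at 2 °C, a 2-minute spike at 20 °C, then quiet again.
samples = [(0, 2.0), (3000, 20.0), (3060, 20.0), (3120, 2.0)]

print(time_weighted_mean(samples, end=3600))
print(sum(v for _, v in samples) / len(samples))
```

Here the time-weighted mean is 2.6°C while the plain sample mean is 11.0°C, because the two spike samples cover only two of the sixty minutes.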

Detection

The original monthly numbers were internally consistent. There was no signal in the data itself that anything was wrong. Cross-checking against an independent source, in this case the ANM station at București-Filaret, was the only way to detect the bias. If sensor output drives decisions (heating optimization, efficiency calculations, degree-day analysis), comparison against trusted external data isn’t optional.
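The cross-check itself can be automated once reference values are at hand. The sketch below uses the raw monthly means and the Filaret values from the first table in this post. `flag_suspect_months` is a hypothetical helper, and the 2.0°C tolerance is an assumption based on the upper end of the urban heat island offset mentioned earlier.

```python
def flag_suspect_months(sensor, reference, tolerance=2.0):
    """Return {month: delta} where |sensor - reference| exceeds the tolerance."""
    suspects = {}
    for month, value in sensor.items():
        if month not in reference:
            continue  # no reference value, nothing to compare against
        delta = round(value - reference[month], 1)
        if abs(delta) > tolerance:
            suspects[month] = delta
    return suspects


# Monthly means in °C, copied from the first comparison table.
sensor_raw = {
    "Nov 2025": 9.1, "Dec 2025": 3.9, "Jan 2026": -0.2,
    "Feb 2026": 5.2, "Mar 2026": 11.2, "Apr 2026": 14.7,
}
filaret = {
    "Nov 2025": 8.8, "Dec 2025": 3.3, "Jan 2026": -0.5,
    "Feb 2026": 1.5, "Mar 2026": 8.0, "Apr 2026": 11.9,
}

for month, delta in flag_suspect_months(sensor_raw, filaret).items():
    print(f"{month}: {delta:+.1f} °C")
```

With these inputs, February, March and April are flagged, and November through January pass.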


Stack: Home Assistant 2026.5, Zigbee2MQTT 2.10, InfluxDB add-on 1.8, Grafana 12.3. Reference: meteoromania.ro București-Filaret monthly characterizations.