Edge Computing in IoT: When to Process On-Device vs In the Cloud
"Just send everything to the cloud and process it there" sounds simple, but it creates systems that are slow, expensive, and fragile. Every byte you transmit costs money and power. Every cloud round-trip adds latency. Every connectivity hiccup risks a system failure.
The right question isn't "cloud or edge?" — it's "what belongs where?" The answer is different for every data type, use case, and business constraint.
The Decision Framework
Ask these four questions for each processing task:
1. Latency: How fast must action be taken?
< 10ms required → On-device only
10ms – 100ms → Edge gateway
100ms – 2 seconds → Cloud feasible (good connectivity)
> 2 seconds → Cloud fine, batch processing also viable
A conveyor belt safety system that must stop the motor if a hand is detected has a latency budget of ~10ms — no cloud round-trip can meet this. The inference must run on-device.
A temperature alert that sends a notification when a server room overheats has a budget of several seconds. Cloud processing is fine.
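The thresholds above can be encoded as a small lookup. This is a minimal sketch: the function name and tier labels are illustrative, and treating a budget of exactly 10ms as on-device (ties go to the tier closest to the sensor) is an assumption made here to match the conveyor example.

```javascript
// Map a latency budget in milliseconds to a processing tier.
// Tier labels are illustrative, not a standard taxonomy.
function placementForLatency(budgetMs) {
  if (!Number.isFinite(budgetMs) || budgetMs <= 0) {
    throw new RangeError('budgetMs must be a positive number')
  }
  if (budgetMs <= 10) return 'on-device'      // boundary assigned to the stricter tier
  if (budgetMs <= 100) return 'edge-gateway'
  if (budgetMs <= 2000) return 'cloud-if-connectivity-is-good'
  return 'cloud-or-batch'
}

console.log(placementForLatency(10))   // on-device (conveyor safety stop)
console.log(placementForLatency(5000)) // cloud-or-batch (server room alert)
```

Measured round-trip times on the real link should replace these fixed cut-offs once they are available.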
2. Bandwidth: How much data does the raw stream generate?
| Sensor type | Raw data rate | Cloud-viable? |
|-------------|---------------|---------------|
| Temperature (1 Hz) | 8 bytes/s | Yes |
| GPS (1 Hz) | 20 bytes/s | Yes |
| Audio (16 kHz, 16-bit) | 32 KB/s | Marginal |
| Vibration (10 kHz, 3-axis, 4-byte samples) | 120 KB/s | No, process at edge |
| HD video (720p) | 2 MB/s | No, run inference at edge |
High-frequency sensors almost always require edge processing to reduce data to meaningful events or aggregates before cloud upload.
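To estimate where a sensor falls, multiply sample rate, channel count, and sample size. A minimal sketch follows. The function names are illustrative, and the sample sizes in the examples (8 bytes for a temperature reading, 4 bytes per vibration sample) are assumptions.

```javascript
// Raw data rate in bytes per second for a uniformly sampled sensor.
function rawBytesPerSecond(sampleRateHz, channels, bytesPerSample) {
  return sampleRateHz * channels * bytesPerSample
}

// Daily volume if every raw sample were uploaded.
function rawBytesPerDay(bytesPerSecond) {
  return bytesPerSecond * 86400
}

console.log(rawBytesPerSecond(1, 1, 8))      // 8      (1 Hz temperature, 8-byte reading)
console.log(rawBytesPerSecond(16000, 1, 2))  // 32000  (16 kHz, 16-bit mono audio)
console.log(rawBytesPerSecond(10000, 3, 4))  // 120000 (10 kHz, 3-axis, 4-byte samples)
console.log(rawBytesPerDay(120000))          // 10368000000 bytes per device per day
```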
3. Privacy: Can this data leave the device/facility?
Healthcare patient monitoring, industrial process data with trade secret implications, and consumer home monitoring all have privacy requirements that may prohibit transmitting raw data to the cloud. Edge processing extracts only the necessary inference result (event detected: yes/no) without exposing raw sensor streams.
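One way to enforce this in code is an allowlist applied just before transmission, so that raw samples cannot leak by accident. The sketch below is illustrative only: the field names and the `toCloudEvent` helper are hypothetical, not part of any library.

```javascript
// Fields permitted to leave the device or facility (hypothetical allowlist).
const CLOUD_SAFE_FIELDS = ['deviceId', 'type', 'detected', 'ts']

// Copy only allowlisted fields; anything else (raw samples, identifiers) is dropped.
function toCloudEvent(localResult) {
  const event = {}
  for (const field of CLOUD_SAFE_FIELDS) {
    if (field in localResult) event[field] = localResult[field]
  }
  return event
}

const local = {
  deviceId: 'bed-12',
  type: 'fall_detected',
  detected: true,
  ts: 1700000000000,
  rawSamples: [0.12, 0.98, 1.4], // stays on the device
}
console.log(Object.keys(toCloudEvent(local))) // [ 'deviceId', 'type', 'detected', 'ts' ]
```

An allowlist fails closed: a field added to the local result later stays local until someone deliberately permits it.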
4. Cost: What's the marginal cost of processing each unit?
Cloud compute and data transfer costs add up quickly at scale. Run the math:
1,000 devices × 10 kHz vibration sensor × 3 axes × 4 bytes × 86,400 seconds
= 10,368 GB/day raw data
AWS data transfer: $0.09/GB × 10,368 = $933/day ≈ $340,000/year
vs.
Edge gateway + edge processing: ~$50/device one-time + $5/month connectivity
= $50,000 capex + $60,000/year opex (at $5/month/device)
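On bandwidth-heavy sensors the edge wins clearly: about $110,000 in the first year (capex plus opex) versus roughly $340,000 for raw cloud transfer, a saving of about 3x, rising to more than 5x in each later year once the hardware is paid for.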
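The arithmetic above is easy to parameterize so you can plug in your own fleet size and prices. A minimal sketch, using the figures from this section as inputs; the function names are illustrative, and gigabytes are decimal (10^9 bytes) to match the numbers above.

```javascript
const SECONDS_PER_DAY = 86400

// Raw upload volume for the whole fleet, in decimal gigabytes per day.
function rawGbPerDay({ devices, sampleRateHz, axes, bytesPerSample }) {
  return (devices * sampleRateHz * axes * bytesPerSample * SECONDS_PER_DAY) / 1e9
}

// Yearly data transfer cost if all raw data goes to the cloud.
function cloudTransferCostPerYear(gbPerDay, usdPerGb) {
  return gbPerDay * usdPerGb * 365
}

// One-time hardware cost plus yearly connectivity for the edge option.
function edgeCost({ devices, hardwareUsdPerDevice, connectivityUsdPerMonth }) {
  return {
    capex: devices * hardwareUsdPerDevice,
    opexPerYear: devices * connectivityUsdPerMonth * 12,
  }
}

const gbPerDay = rawGbPerDay({ devices: 1000, sampleRateHz: 10000, axes: 3, bytesPerSample: 4 })
console.log(gbPerDay)                                             // 10368
console.log(Math.round(cloudTransferCostPerYear(gbPerDay, 0.09))) // 340589
console.log(edgeCost({ devices: 1000, hardwareUsdPerDevice: 50, connectivityUsdPerMonth: 5 }))
// { capex: 50000, opexPerYear: 60000 }
```

The comparison covers data transfer only. Cloud compute and storage for the raw stream would widen the gap further.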
On-Device ML: TensorFlow Lite on ESP32
TensorFlow Lite for Microcontrollers (TFLM) runs quantized neural networks on devices with as little as 256KB of RAM. The ESP32 has 520KB SRAM and optionally 8MB PSRAM — enough for keyword detection, gesture recognition, and simple anomaly detection.
// ESP32: run TFLM inference for vibration anomaly detection
#include <math.h>
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "model_vibration_anomaly.h" // quantized INT8 model, generated offline

namespace {
const int kTensorArenaSize = 60 * 1024; // 60KB arena
alignas(16) uint8_t tensor_arena[kTensorArenaSize];
tflite::MicroMutableOpResolver<4> resolver; // sized for the four ops registered below
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
TfLiteTensor* output = nullptr;
}

void setupInference() {
    // Register only the ops the model uses to keep the binary small
    resolver.AddConv2D();
    resolver.AddMaxPool2D();
    resolver.AddFullyConnected();
    resolver.AddSoftmax();
    static tflite::MicroInterpreter static_interpreter(
        tflite::GetModel(g_vibration_model_data), resolver,
        tensor_arena, kTensorArenaSize
    );
    interpreter = &static_interpreter;
    if (interpreter->AllocateTensors() != kTfLiteOk) {
        interpreter = nullptr; // arena too small or an op is not registered
        return;
    }
    input = interpreter->input(0);
    output = interpreter->output(0);
}

bool detectAnomaly(const float* fftFeatures, int featureCount) {
    if (interpreter == nullptr) return false; // setup failed
    // Quantize features into the INT8 input tensor, clamping to the int8 range
    for (int i = 0; i < featureCount; i++) {
        long q = lroundf(fftFeatures[i] / input->params.scale) + input->params.zero_point;
        if (q < -128) q = -128;
        if (q > 127) q = 127;
        input->data.int8[i] = static_cast<int8_t>(q);
    }
    if (interpreter->Invoke() != kTfLiteOk) return false;
    // Output: [normal_score, anomaly_score]
    float anomalyScore = (output->data.int8[1] - output->params.zero_point)
        * output->params.scale;
    return anomalyScore > 0.85f; // 85% confidence threshold
}

void loop() {
    float fftFeatures[64];
    collectVibrationFFT(fftFeatures, 64); // sample + FFT on device
    if (detectAnomaly(fftFeatures, 64)) {
        // Only publish event to cloud, NOT raw FFT data
        publishEvent("vibration_anomaly", millis());
    }
    // Raw data never leaves the device
}
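The quantize and dequantize steps in `detectAnomaly` are plain affine arithmetic, which makes them easy to check off-device before flashing. The sketch below mirrors that math in JavaScript. The scale and zero point are example values only; real ones come from the converted model's tensor parameters. Note that `Math.round` and C's `roundf` differ on negative half-way values, which does not matter for this illustration.

```javascript
// Affine INT8 quantization: q = round(x / scale) + zeroPoint, clamped to [-128, 127].
function quantizeInt8(x, scale, zeroPoint) {
  const q = Math.round(x / scale) + zeroPoint
  return Math.max(-128, Math.min(127, q))
}

// Inverse mapping back to a real value: x ≈ (q - zeroPoint) * scale.
function dequantizeInt8(q, scale, zeroPoint) {
  return (q - zeroPoint) * scale
}

// Example parameters (assumed): scale 1/256 and zero point -128 map int8 onto [0, 1).
const scale = 1 / 256
const zeroPoint = -128

console.log(dequantizeInt8(90, scale, zeroPoint))       // 0.8515625, just above the 0.85 threshold
console.log(quantizeInt8(0.8515625, scale, zeroPoint))  // 90
console.log(quantizeInt8(2.0, scale, zeroPoint))        // 127 (out-of-range input is clamped)
```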
TFLM model constraints on ESP32:
Edge Gateway Processing
The Raspberry Pi or industrial gateway PC is the natural home for more complex processing that doesn't fit on a microcontroller.
// Node.js gateway: sliding window anomaly detection
const { InferenceSession, Tensor } = require('onnxruntime-node')

let session = null

async function loadModel() {
  session = await InferenceSession.create('/opt/models/vibration_lstm.onnx')
}

// Sliding window buffer per device
const windows = new Map()

async function processReading(deviceId, readings) {
  if (!windows.has(deviceId)) windows.set(deviceId, [])
  const window = windows.get(deviceId)
  window.push(...readings)

  // LSTM needs 256 time steps of 3-axis data
  if (window.length < 256) return null
  if (window.length > 256) window.splice(0, window.length - 256)

  const inputData = new Float32Array(window.flatMap(r => [r.x, r.y, r.z]))
  const tensor = new Tensor('float32', inputData, [1, 256, 3])
  const results = await session.run({ input: tensor })
  const anomalyScore = results.output.data[0]

  if (anomalyScore > 0.85) {
    return {
      deviceId,
      type: 'vibration_anomaly',
      score: anomalyScore,
      ts: Date.now(),
    }
  }
  return null
}
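The buffering logic in `processReading` can be exercised without a model file by pulling it into two pure helpers. This is a sketch under assumptions: `pushAndTrim` and `toModelInput` are hypothetical names, and the interleaved x, y, z layout matches the `[1, 256, 3]` tensor shape used above.

```javascript
const WINDOW_SIZE = 256 // time steps expected by the model in the gateway code

// Append new readings, keep only the newest `size`, and report whether the window is full.
function pushAndTrim(buffer, readings, size = WINDOW_SIZE) {
  buffer.push(...readings)
  if (buffer.length > size) buffer.splice(0, buffer.length - size)
  return buffer.length === size
}

// Flatten readings into row-major [timeSteps, 3] order: x0, y0, z0, x1, y1, z1, ...
function toModelInput(buffer) {
  const data = new Float32Array(buffer.length * 3)
  buffer.forEach((r, i) => {
    data[i * 3] = r.x
    data[i * 3 + 1] = r.y
    data[i * 3 + 2] = r.z
  })
  return data
}

// Small window of 2 steps for demonstration
const buf = []
console.log(pushAndTrim(buf, [{ x: 1, y: 2, z: 3 }], 2))                        // false
console.log(pushAndTrim(buf, [{ x: 4, y: 5, z: 6 }, { x: 7, y: 8, z: 9 }], 2))  // true
console.log(Array.from(toModelInput(buf)))                                      // [ 4, 5, 6, 7, 8, 9 ]
```

Writing into a preallocated `Float32Array` avoids the intermediate arrays that `flatMap` creates on every inference call.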
The gateway's advantages over on-device:
Cloud Processing: What Belongs There
Not everything should move to the edge. Cloud processing is the right choice for:
Fleet-level analytics: Detecting that 5% of a 10,000-device fleet has a subtle drift in calibration requires comparing data across all devices simultaneously — a task that requires a database and compute that no edge node can provide.
Model retraining: The cloud trains the next version of your edge model on accumulated labeled data, then pushes the updated model back to edge gateways and devices. The edge runs inference; the cloud does training.
Historical analysis: "What was the average efficiency of production line 3 for the last 6 months?" requires historical data storage and query — that's cloud territory.
Regulatory archiving: Long-term data retention for compliance (SCADA records, energy metering) belongs in cloud storage with access controls and audit logs.
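As an illustration of the fleet-level case, drift detection can be as simple as comparing each device against the fleet median. The sketch below is one possible approach, not a prescribed method: it assumes each device reports a single calibration offset, and it uses the median absolute deviation (MAD) with an arbitrary threshold of 5.

```javascript
function median(values) {
  const sorted = [...values].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

// Flag devices whose offset is far from the fleet median, measured in MAD units.
function findDriftingDevices(offsetsByDevice, threshold = 5) {
  const offsets = Object.values(offsetsByDevice)
  const center = median(offsets)
  const mad = median(offsets.map(v => Math.abs(v - center)))
  if (mad === 0) return [] // no spread in the fleet, nothing to compare against
  return Object.entries(offsetsByDevice)
    .filter(([, v]) => Math.abs(v - center) / mad > threshold)
    .map(([id]) => id)
}

console.log(findDriftingDevices({ a: 10, b: 12, c: 11, d: 13, e: 40 })) // [ 'e' ]
```

The median and MAD only mean something once you can see the whole fleet, which is why this runs in the cloud and not on a gateway.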
The Fog Computing Pattern
Fog computing is the formal name for a hierarchy of processing layers — not just cloud and device, but cloud → regional server → site gateway → device. Each layer processes what it can locally and only escalates what it must.
Device (ESP32)
→ filter: only non-zero readings pass
→ inference: anomaly detection (TFLM)
→ transmit: events only
Site Gateway (Raspberry Pi)
→ aggregate: 1-second summaries of raw sensor values
→ inference: multi-sensor correlation (ONNX)
→ transmit: summaries + anomaly events
Regional Server (EC2 / on-prem)
→ batch analytics: shift-level production reports
→ model serving: high-accuracy models too large for Pi
→ transmit: reports + aggregated events
Cloud (AWS)
→ fleet analytics: cross-site comparisons
→ model training: update TFLM and ONNX models
→ dashboards + APIs: enterprise reporting
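The gateway's "1-second summaries" step might look like the following. This is a minimal sketch: the reading shape (`ts` in milliseconds plus a numeric `value`) and the choice of min, max, and mean as summary statistics are assumptions.

```javascript
// Group timestamped readings into 1-second buckets and emit count/min/max/mean per bucket.
function summarizePerSecond(readings) {
  const buckets = new Map()
  for (const { ts, value } of readings) {
    const second = Math.floor(ts / 1000) // ts is assumed to be in milliseconds
    const b = buckets.get(second) ?? { second, count: 0, sum: 0, min: Infinity, max: -Infinity }
    b.count += 1
    b.sum += value
    b.min = Math.min(b.min, value)
    b.max = Math.max(b.max, value)
    buckets.set(second, b)
  }
  // Map preserves insertion order, so summaries come out in arrival order.
  return [...buckets.values()].map(({ second, count, sum, min, max }) => ({
    second, count, min, max, mean: sum / count,
  }))
}

console.log(summarizePerSecond([
  { ts: 0, value: 1 },
  { ts: 500, value: 3 },
  { ts: 1200, value: 10 },
]))
// [ { second: 0, count: 2, min: 1, max: 3, mean: 2 },
//   { second: 1, count: 1, min: 10, max: 10, mean: 10 } ]
```

Each layer in the hierarchy applies the same idea: reduce what arrives from below, then forward only the reduced form upward.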
For the architecture patterns that define how these layers connect, see [IoT Architecture Patterns: Hub-and-Spoke, Mesh, and Edge-Cloud Hybrid](/blog/iot-architecture-patterns-2024).
For implementing the gateway layer, see [Building a Production IoT Gateway with Raspberry Pi and Node.js](/blog/raspberry-pi-iot-gateway-nodejs).
For the security implications of processing data at the edge, see [IoT Security: Zero Trust for Embedded Systems](/blog/iot-security-zero-trust-embedded-systems).
Need help designing your edge computing strategy? [Contact Code Caracal](/contact) — we've shipped these systems for clients across 15+ countries.