
Edge Computing in IoT: When to Process On-Device vs In the Cloud

Sending every sensor reading to the cloud for processing is expensive, slow, and often unnecessary. This decision framework helps IoT engineers identify exactly where computation belongs — on the microcontroller, at the gateway, or in the cloud.

May 20, 2024
12 min read
Edge Computing, TensorFlow Lite, ESP32, IoT Architecture

"Just send everything to the cloud and process it there" sounds simple, but it creates systems that are slow, expensive, and fragile. Every byte you transmit costs money and power. Every cloud round-trip adds latency. Every connectivity hiccup risks a system failure.

The right question isn't "cloud or edge?" — it's "what belongs where?" The answer is different for every data type, use case, and business constraint.

The Decision Framework

Ask these four questions for each processing task:

1. Latency: How fast must action be taken?

< 10ms required   → On-device only
10ms – 100ms      → Edge gateway
100ms – 2 seconds → Cloud feasible (good connectivity)
> 2 seconds       → Cloud fine, batch processing also viable

A conveyor belt safety system that must stop the motor if a hand is detected has a latency budget of ~10ms — no cloud round-trip can meet this. The inference must run on-device.

A temperature alert that sends a notification when a server room overheats has a budget of several seconds. Cloud processing is fine.
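As a rough illustration, the latency thresholds above can be written as a small lookup. This is a sketch, not a rule engine: the function name `placementForLatency` and the tier labels are made up for this example, and a budget of exactly 10ms is treated as on-device to match the conveyor belt case.

```javascript
// Hypothetical helper: maps a latency budget (ms) to a processing tier.
// Thresholds mirror the latency table in this article.
function placementForLatency(budgetMs) {
  if (!Number.isFinite(budgetMs) || budgetMs <= 0) {
    throw new RangeError('budgetMs must be a positive number')
  }
  if (budgetMs <= 10) return 'on-device'       // no network hop fits in this budget
  if (budgetMs <= 100) return 'edge-gateway'   // one local network hop
  if (budgetMs <= 2000) return 'cloud'         // feasible with good connectivity
  return 'cloud-or-batch'                      // batch processing also viable
}

console.log(placementForLatency(10))    // conveyor belt safety stop
console.log(placementForLatency(5000))  // server room temperature alert
```

Run it once per processing task rather than once per device: a single device often has one task that must stay local and several that can go to the cloud.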

2. Bandwidth: How much data does the raw stream generate?

| Sensor type | Raw data rate | Cloud-viable? |
|-------------|---------------|---------------|
| Temperature (1 Hz) | 8 bytes/s | Yes |
| GPS (1 Hz) | 20 bytes/s | Yes |
| Audio (16 kHz, 16-bit) | 32 KB/s | Marginal |
| Vibration (10 kHz, 3-axis, 4-byte samples) | 120 KB/s | No, process at edge |
| HD video (720p) | 2 MB/s | No, run inference at edge |

High-frequency sensors almost always require edge processing to reduce data to meaningful events or aggregates before cloud upload.
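The raw rate of any sampled sensor is sample rate × channels × bytes per sample. The sketch below shows that arithmetic; the function names are illustrative, and the vibration example assumes 4-byte samples (the same assumption as the cost calculation in the next section; 8-byte samples would double the figure).

```javascript
// Hypothetical helpers for estimating a sensor's raw data rate.
function rawBytesPerSecond({ sampleRateHz, channels = 1, bytesPerSample }) {
  return sampleRateHz * channels * bytesPerSample
}

function rawGBPerDay(bytesPerSecond) {
  return (bytesPerSecond * 86400) / 1e9 // decimal gigabytes
}

// 16 kHz mono audio at 16 bits (2 bytes) per sample
const audio = rawBytesPerSecond({ sampleRateHz: 16000, bytesPerSample: 2 })

// 10 kHz, 3-axis vibration, assuming 4-byte samples
const vibration = rawBytesPerSecond({ sampleRateHz: 10000, channels: 3, bytesPerSample: 4 })

console.log(audio, vibration, rawGBPerDay(vibration))
```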

3. Privacy: Can this data leave the device/facility?

Healthcare patient monitoring, industrial process data with trade secret implications, and consumer home monitoring all have privacy requirements that may prohibit transmitting raw data to the cloud. Edge processing extracts only the necessary inference result (event detected: yes/no) without exposing raw sensor streams.

4. Cost: What's the marginal cost of processing each unit?

Cloud compute and data transfer costs add up at scale. Run the math:

1,000 devices × 10 kHz vibration sensor × 3 axes × 4 bytes × 86,400 seconds
= 10,368 GB/day raw data

Cloud: AWS data transfer at $0.09/GB × 10,368 GB/day = $933/day ≈ $340,000/year

Edge: gateway hardware and edge processing at ~$50/device one-time, plus $5/month/device connectivity = $50,000 capex + $60,000/year opex

On bandwidth-heavy sensors, the edge costs about a third as much in the first year and less than a fifth as much in every year after.
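The same arithmetic as a small script, so you can substitute your own fleet size and prices. The function names are made up for this sketch, and the prices are the illustrative figures used above, not a quote.

```javascript
const SECONDS_PER_DAY = 86400

// Raw fleet output in decimal gigabytes per day.
function dailyRawGB({ devices, sampleRateHz, axes, bytesPerSample }) {
  return (devices * sampleRateHz * axes * bytesPerSample * SECONDS_PER_DAY) / 1e9
}

// Yearly cost of shipping every raw byte at a flat per-GB price.
function cloudTransferCostPerYear(gbPerDay, pricePerGB) {
  return gbPerDay * pricePerGB * 365
}

// One-time hardware cost plus yearly connectivity for edge processing.
function edgeCost({ devices, hardwarePerDevice, connectivityPerMonth }) {
  return {
    capex: devices * hardwarePerDevice,
    opexPerYear: devices * connectivityPerMonth * 12,
  }
}

const gbPerDay = dailyRawGB({ devices: 1000, sampleRateHz: 10000, axes: 3, bytesPerSample: 4 })
const cloudPerYear = cloudTransferCostPerYear(gbPerDay, 0.09)
const edge = edgeCost({ devices: 1000, hardwarePerDevice: 50, connectivityPerMonth: 5 })

console.log({ gbPerDay, cloudPerYear: Math.round(cloudPerYear), edge })
```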

On-Device ML: TensorFlow Lite on ESP32

TensorFlow Lite for Microcontrollers (TFLM) runs quantized neural networks on devices with as little as 256KB of RAM. The ESP32 has 520KB SRAM and optionally 8MB PSRAM — enough for keyword detection, gesture recognition, and simple anomaly detection.

// ESP32: run TFLM inference for vibration anomaly detection
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "model_vibration_anomaly.h" // quantized INT8 model, generated offline

namespace {
const int kTensorArenaSize = 60 * 1024;  // 60KB arena
uint8_t tensor_arena[kTensorArenaSize];

tflite::MicroMutableOpResolver<4> resolver;
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
TfLiteTensor* output = nullptr;
}  // namespace

void setupInference() {
  // Register only the ops the model uses to keep the binary small
  resolver.AddConv2D();
  resolver.AddMaxPool2D();
  resolver.AddFullyConnected();
  resolver.AddSoftmax();

  static tflite::MicroInterpreter static_interpreter(
      tflite::GetModel(g_vibration_model_data), resolver,
      tensor_arena, kTensorArenaSize);
  interpreter = &static_interpreter;
  interpreter->AllocateTensors();

  input = interpreter->input(0);
  output = interpreter->output(0);
}

bool detectAnomaly(float* fftFeatures, int featureCount) {
  // Quantize the float features to INT8 and copy them into the input tensor
  for (int i = 0; i < featureCount; i++) {
    input->data.int8[i] = static_cast<int8_t>(
        (fftFeatures[i] / input->params.scale) + input->params.zero_point);
  }

  interpreter->Invoke();

  // Output: [normal_score, anomaly_score], dequantized back to float
  float anomalyScore =
      (output->data.int8[1] - output->params.zero_point) * output->params.scale;

  return anomalyScore > 0.85f;  // 85% confidence threshold
}

// collectVibrationFFT() and publishEvent() are application helpers defined elsewhere
void loop() {
  float fftFeatures[64];
  collectVibrationFFT(fftFeatures, 64);  // sample + FFT on device

  if (detectAnomaly(fftFeatures, 64)) {
    // Only publish the event to the cloud, NOT the raw FFT data
    publishEvent("vibration_anomaly", millis());
  }
  // Raw data never leaves the device
}

TFLM model constraints on ESP32:

  • Model size: < 1MB (fits in flash)
  • Arena (RAM for activations): < 300KB (< PSRAM capacity)
  • Inference time: 10–200ms depending on model complexity
  • INT8 quantization strongly preferred (float32 models run, but their weights are 4x larger and inference is slower on most MCUs)
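The quantize and dequantize steps in `detectAnomaly` are plain affine transforms, so they can be checked off-device before flashing. The sketch below uses an illustrative scale of 1/256 and zero point of -128; real values come from the tensor's `params` in your model. It also rounds and clamps, which is a safer variant than a bare cast: a cast truncates toward zero and is not safe for out-of-range inputs.

```javascript
// Affine INT8 quantization: q = round(x / scale) + zeroPoint, clamped to [-128, 127].
function quantizeInt8(value, scale, zeroPoint) {
  const q = Math.round(value / scale) + zeroPoint
  return Math.max(-128, Math.min(127, q))
}

// Inverse transform: x ≈ (q - zeroPoint) * scale
function dequantizeInt8(q, scale, zeroPoint) {
  return (q - zeroPoint) * scale
}

// Illustrative parameters only; read the real ones from the model's tensors.
const scale = 1 / 256
const zeroPoint = -128

const q = quantizeInt8(0.85, scale, zeroPoint)
const roundTrip = dequantizeInt8(q, scale, zeroPoint)
console.log(q, roundTrip)
```

The round trip does not return exactly 0.85, which is worth remembering when a threshold sits close to a quantization step.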

Edge Gateway Processing

    The Raspberry Pi or industrial gateway PC is the natural home for more complex processing that doesn't fit on a microcontroller.

// Node.js gateway: sliding window anomaly detection
const { InferenceSession, Tensor } = require('onnxruntime-node')

let session = null

async function loadModel() {
  session = await InferenceSession.create('/opt/models/vibration_lstm.onnx')
}

// Sliding window buffer per device
const windows = new Map()

async function processReading(deviceId, readings) {
  if (!windows.has(deviceId)) windows.set(deviceId, [])
  const window = windows.get(deviceId)
  window.push(...readings)

  // LSTM needs 256 time steps of 3-axis data
  if (window.length < 256) return null
  if (window.length > 256) window.splice(0, window.length - 256)

  const inputData = new Float32Array(window.flatMap(r => [r.x, r.y, r.z]))
  const tensor = new Tensor('float32', inputData, [1, 256, 3])

  const results = await session.run({ input: tensor })
  const anomalyScore = results.output.data[0]

  if (anomalyScore > 0.85) {
    return {
      deviceId,
      type: 'vibration_anomaly',
      score: anomalyScore,
      ts: Date.now(),
    }
  }
  return null
}

    The gateway's advantages over on-device:

  • Full Node.js or Python environment — use any ONNX or TensorFlow SavedModel
  • 4–8 GB RAM — no model size constraints
  • Can process data from 10–100 sensors in parallel
  • Model updates via file copy, no firmware flash required

Cloud Processing: What Belongs There

    Not everything should move to the edge. Cloud processing is the right choice for:

    Fleet-level analytics: Detecting that 5% of a 10,000-device fleet has a subtle drift in calibration requires comparing data across all devices simultaneously — a task that requires a database and compute that no edge node can provide.

    Model retraining: The cloud trains the next version of your edge model on accumulated labeled data, then pushes the updated model back to edge gateways and devices. The edge runs inference; the cloud does training.

    Historical analysis: "What was the average efficiency of production line 3 for the last 6 months?" requires historical data storage and query — that's cloud territory.

    Regulatory archiving: Long-term data retention for compliance (SCADA records, energy metering) belongs in cloud storage with access controls and audit logs.
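To make the fleet-level analytics case concrete: spotting calibration drift means comparing each device against the rest of the fleet, which only works where all the data is in one place. The sketch below is a deliberately simple stand-in for that kind of query. The function names, the median baseline, and the fixed tolerance are all assumptions for illustration; a production system would more likely run this as a database query over time windows.

```javascript
// Median of a numeric array (does not mutate the input).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2
}

// Returns the IDs of devices whose latest reading differs from the
// fleet median by more than `tolerance`.
function findDriftingDevices(latestByDevice, tolerance) {
  const fleetMedian = median(Object.values(latestByDevice))
  return Object.entries(latestByDevice)
    .filter(([, value]) => Math.abs(value - fleetMedian) > tolerance)
    .map(([deviceId]) => deviceId)
}

// Hypothetical latest temperature readings from four co-located devices
const latest = { 'dev-a': 20.0, 'dev-b': 20.1, 'dev-c': 19.9, 'dev-d': 22.5 }
console.log(findDriftingDevices(latest, 1.0))
```

No single device or gateway can compute the fleet median on its own, which is the point of keeping this task in the cloud.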

    The Fog Computing Pattern

    Fog computing is the formal name for a hierarchy of processing layers — not just cloud and device, but cloud → regional server → site gateway → device. Each layer processes what it can locally and only escalates what it must.

Device (ESP32)
  → filter: only non-zero readings pass
  → inference: anomaly detection (TFLM)
  → transmit: events only

Site Gateway (Raspberry Pi)
  → aggregate: 1-second summaries of raw sensor values
  → inference: multi-sensor correlation (ONNX)
  → transmit: summaries + anomaly events

Regional Server (EC2 / on-prem)
  → batch analytics: shift-level production reports
  → model serving: high-accuracy models too large for Pi
  → transmit: reports + aggregated events

Cloud (AWS)
  → fleet analytics: cross-site comparisons
  → model training: update TFLM and ONNX models
  → dashboards + APIs: enterprise reporting
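The "process locally, escalate only what you must" rule can be sketched as a single function that takes one second of readings and returns only what moves up a layer. Everything here is illustrative: `escalate`, the summary shape, and the `isAnomalous` callback (a stand-in for the TFLM or ONNX inference step) are names invented for this example.

```javascript
// Reduce a batch of readings to count/min/max/mean, or null if empty.
function summarize(readings) {
  if (readings.length === 0) return null
  let min = Infinity
  let max = -Infinity
  let sum = 0
  for (const value of readings) {
    if (value < min) min = value
    if (value > max) max = value
    sum += value
  }
  return { count: readings.length, min, max, mean: sum / readings.length }
}

// One second of readings in; a summary plus anomaly events out.
// `isAnomalous` stands in for the real inference step.
function escalate(readings, isAnomalous) {
  const nonZero = readings.filter(value => value !== 0) // filter step
  const events = nonZero
    .filter(isAnomalous)
    .map(value => ({ type: 'anomaly', value }))
  return { summary: summarize(nonZero), events }
}

const upstream = escalate([0, 1, 2, 9, 0, 3], value => value > 5)
console.log(JSON.stringify(upstream))
```

Six raw readings go in and one summary plus one event come out; the same shape repeats at each layer with coarser windows.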

    For the architecture patterns that define how these layers connect, see [IoT Architecture Patterns: Hub-and-Spoke, Mesh, and Edge-Cloud Hybrid](/blog/iot-architecture-patterns-2024).

    For implementing the gateway layer, see [Building a Production IoT Gateway with Raspberry Pi and Node.js](/blog/raspberry-pi-iot-gateway-nodejs).

    For the security implications of processing data at the edge, see [IoT Security: Zero Trust for Embedded Systems](/blog/iot-security-zero-trust-embedded-systems).

    Need help designing your edge computing strategy? [Contact Code Caracal](/contact) — we've shipped these systems for clients across 15+ countries.

    Written by CodeCaracal Engineering

    We write from production experience — every technique in our articles has been deployed to real clients. No academic theory.
