IoT Engineering

Digital Twins for IoT: Building a Virtual Mirror of Your Physical Devices

A digital twin is more than a cached device state. Done right, it enables predictive maintenance, hardware-free testing, and real-time simulation. Here is how to build one.

May 18, 2024
13 min read
Digital Twin · IoT Device Shadow · AWS IoT · Predictive Maintenance

A digital twin is a virtual representation of a physical device — not just its current state, but its history, expected behavior, and predicted future. When done well, a digital twin lets you debug a device without touching it, simulate a failure before it happens, and test new firmware against a faithful replica of a deployed fleet.

The term gets overloaded. A cached JSON blob of the last sensor reading is not a digital twin. A model that reflects the device's current state, responds to the same commands, tracks historical state transitions, and can run independently from the physical device — that is a digital twin.

This guide goes from the AWS IoT Device Shadow (the simplest starting point) to a production-grade twin with state history, synchronization guarantees, and real use cases.

AWS IoT Device Shadow: The Foundation

AWS IoT Core provides a built-in digital twin primitive: the Device Shadow. It is a JSON document with three sections:

  • reported — what the device says its current state is
  • desired — what you (or your application) want the state to be
  • delta — the difference between reported and desired (computed by AWS)

This gap-tracking model is elegant. When a device is offline, you write to desired. When it reconnects, it receives the delta and acts on it. No message queuing required on your side.
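
To make the delta semantics concrete, here is a minimal sketch of how a delta could be derived from the desired and reported sections. AWS computes this server-side, so you never write this yourself; `computeDelta` and the `JsonObject` type are hypothetical names used only for illustration, and the real service may treat edge cases such as arrays and nulls differently.

```typescript
// Illustrative only: AWS IoT computes the delta for you. This sketch mimics the
// idea (desired fields whose values differ from reported) for plain JSON objects.
type JsonValue = string | number | boolean | null | JsonValue[] | JsonObject
interface JsonObject { [key: string]: JsonValue }

function isObject(v: JsonValue | undefined): v is JsonObject {
  return typeof v === 'object' && v !== null && !Array.isArray(v)
}

function computeDelta(desired: JsonObject, reported: JsonObject): JsonObject {
  const delta: JsonObject = {}
  for (const [key, want] of Object.entries(desired)) {
    const have = reported[key]
    if (isObject(want) && isObject(have)) {
      // Recurse into nested objects and keep only the differing leaves
      const nested = computeDelta(want, have)
      if (Object.keys(nested).length > 0) delta[key] = nested
    } else if (JSON.stringify(want) !== JSON.stringify(have)) {
      // Missing or different in reported: the device still has work to do
      delta[key] = want
    }
  }
  return delta
}

const exampleDelta = computeDelta(
  { tempSetpoint: 21.5, fan: { mode: 'auto', speed: 2 } },
  { tempSetpoint: 19.0, fan: { mode: 'auto', speed: 1 } }
)
console.log(JSON.stringify(exampleDelta)) // {"tempSetpoint":21.5,"fan":{"speed":2}}
```

Once the device reports values that match desired, the computed delta is empty and there is nothing left to apply.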

    // Backend: update desired state for a device
    import { IoTDataPlaneClient, UpdateThingShadowCommand } from '@aws-sdk/client-iot-data-plane'

    const iotData = new IoTDataPlaneClient({ region: 'us-east-1' })

    async function setDeviceSetpoint(deviceId: string, tempSetpoint: number) {
      const payload = {
        state: {
          desired: {
            tempSetpoint,
            updatedBy: 'dashboard',
            updatedAt: new Date().toISOString(),
          },
        },
      }

      await iotData.send(
        new UpdateThingShadowCommand({
          thingName: deviceId,
          payload: Buffer.from(JSON.stringify(payload)),
        })
      )
    }

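    // Firmware side: receive desired state and report back
    #include <ArduinoJson.h>
    #include <stdio.h>

    void onShadowDeltaReceived(const char* payload) {
      StaticJsonDocument<512> doc;
      if (deserializeJson(doc, payload)) {
        return;  // Malformed payload: ignore it rather than apply garbage
      }

      // Delta messages carry the changed fields directly under "state"
      float newSetpoint = doc["state"]["tempSetpoint"];
      applySetpoint(newSetpoint);

      // Report the new state back to close the delta
      StaticJsonDocument<256> report;
      report["state"]["reported"]["tempSetpoint"] = newSetpoint;
      report["state"]["reported"]["status"] = "applied";

      char buf[256];
      serializeJson(report, buf);

      // C++ string literals do not interpolate ${...}, so build the topic explicitly.
      // DEVICE_ID is assumed to be a null-terminated C string.
      char topic[128];
      snprintf(topic, sizeof(topic), "$aws/things/%s/shadow/update", DEVICE_ID);
      mqttClient.publish(topic, buf);
    }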

The Device Shadow handles the current state well. But a production digital twin needs more.

Building a Richer Digital Twin

Beyond the Device Shadow, a full digital twin stores:

  1. State history — every reported state with timestamps
  2. Event log — commands sent, errors received, OTA updates applied
  3. Computed attributes — derived values your firmware doesn't calculate (efficiency scores, anomaly flags)
  4. Behavioral model — expected state transitions for simulation and anomaly detection
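
One way to picture these four parts together is as a single record type. The sketch below is an assumption for illustration, not an AWS schema: `DigitalTwin`, `StateSnapshot`, `TwinEvent`, `createTwin`, and `recordState` are hypothetical names, and the behavioral model is reduced to a table of allowed status transitions.

```typescript
// Hypothetical shape for a twin record; field names are illustrative.
interface StateSnapshot {
  timestamp: number // epoch milliseconds
  reported: Record<string, unknown>
}

interface TwinEvent {
  timestamp: number
  kind: 'command' | 'error' | 'ota'
  detail: string
}

interface DigitalTwin {
  deviceId: string
  stateHistory: StateSnapshot[]                // 1. every reported state
  eventLog: TwinEvent[]                        // 2. commands, errors, OTA updates
  computed: Record<string, number | boolean>   // 3. derived values
  allowedTransitions: Record<string, string[]> // 4. behavioral model: status -> next statuses
}

function createTwin(deviceId: string, allowedTransitions: Record<string, string[]>): DigitalTwin {
  return { deviceId, stateHistory: [], eventLog: [], computed: {}, allowedTransitions }
}

// Records a snapshot and flags a status change the behavioral model does not expect.
function recordState(twin: DigitalTwin, snapshot: StateSnapshot): void {
  const previous = twin.stateHistory[twin.stateHistory.length - 1]
  const from = previous?.reported['status']
  const to = snapshot.reported['status']
  if (typeof from === 'string' && typeof to === 'string' && from !== to) {
    const expected = twin.allowedTransitions[from] ?? []
    twin.computed['unexpectedTransition'] = !expected.includes(to)
  }
  twin.stateHistory.push(snapshot)
}

const pumpTwin = createTwin('pump-01', { idle: ['running'], running: ['idle', 'fault'] })
recordState(pumpTwin, { timestamp: 1, reported: { status: 'idle' } })
recordState(pumpTwin, { timestamp: 2, reported: { status: 'fault' } })
console.log(pumpTwin.computed['unexpectedTransition']) // true: idle -> fault is not in the model
```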

State History with DynamoDB Time-Series

    // Lambda: IoT Rule triggers this on every shadow update
    import { DynamoDBClient, PutItemCommand } from '@aws-sdk/client-dynamodb'
    import { marshall } from '@aws-sdk/util-dynamodb'

    const db = new DynamoDBClient({ region: 'us-east-1' })

    interface ShadowUpdateEvent {
      deviceId: string
      state: { reported: Record<string, unknown> }
      timestamp: number
      version: number
    }

    export const handler = async (event: ShadowUpdateEvent) => {
      // Store full state snapshot with TTL for retention policy
      const ttl = Math.floor(Date.now() / 1000) + 90 * 24 * 60 * 60 // 90 days

      await db.send(
        new PutItemCommand({
          TableName: 'DeviceTwinHistory',
          Item: marshall({
            pk: `DEVICE#${event.deviceId}`,
            sk: `STATE#${event.timestamp}`,
            deviceId: event.deviceId,
            state: event.state.reported,
            shadowVersion: event.version,
            ttl,
          }),
        })
      )
    }

DynamoDB table design for the twin:

  • Partition key: DEVICE#<deviceId>
  • Sort key: STATE#<timestamp> for time-ordered state history
  • GSI on deviceType + timestamp for fleet-wide queries
  • TTL attribute for automatic data expiry (avoid unbounded growth)
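
The key scheme above can be sketched as a pair of key builders plus the input for a time-range query. This is a sketch under assumptions: the helper names are hypothetical, and zero-padding the timestamp to 13 digits is an addition made here so that string sort order matches chronological order (the handler above writes the raw number, which behaves the same for current epoch-millisecond values).

```typescript
// Sketch of key construction for the table design above.
const TABLE_NAME = 'DeviceTwinHistory'

function partitionKey(deviceId: string): string {
  return `DEVICE#${deviceId}`
}

// Assumption: timestamps are epoch milliseconds, padded so they sort as strings.
function sortKey(timestampMs: number): string {
  return `STATE#${String(timestampMs).padStart(13, '0')}`
}

// Builds the input for a DynamoDB Query over one device's history in a time range.
// Values use the low-level attribute-value format ({ S: ... }).
function buildHistoryQuery(deviceId: string, fromMs: number, toMs: number, limit = 1000) {
  return {
    TableName: TABLE_NAME,
    KeyConditionExpression: 'pk = :pk AND sk BETWEEN :from AND :to',
    ExpressionAttributeValues: {
      ':pk': { S: partitionKey(deviceId) },
      ':from': { S: sortKey(fromMs) },
      ':to': { S: sortKey(toMs) },
    },
    ScanIndexForward: false, // newest first
    Limit: limit,
  }
}

const lastDay = buildHistoryQuery('thermo-17', Date.now() - 24 * 60 * 60 * 1000, Date.now())
console.log(lastDay.KeyConditionExpression)
```

The returned object can be passed to `QueryCommand` from `@aws-sdk/client-dynamodb`. Because the whole lookup is a key condition on one partition, the query reads only that device's items rather than scanning the table.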

The Twin API

Expose the digital twin as a REST API that your dashboard, mobile app, and other services consume:

    // GET /twins/:deviceId — returns current + recent history
    async function getTwin(deviceId: string) {
      // Current shadow from AWS IoT
      const shadow = await getShadow(deviceId)

      // Recent state history from DynamoDB
      const history = await queryStateHistory(deviceId, {
        from: Date.now() - 24 * 60 * 60 * 1000, // last 24h
        limit: 1000,
      })

      // Computed attributes
      const computed = {
        uptimeHours: calculateUptime(history),
        avgTemperature: average(history.map((h) => h.state.temperature)),
        anomalyScore: runAnomalyModel(history),
        predictedFailureDate: runRULModel(history), // Remaining Useful Life
      }

      return {
        deviceId,
        currentState: shadow.state.reported,
        desiredState: shadow.state.desired,
        lastSeen: shadow.metadata.reported.temperature?.timestamp,
        history,
        computed,
      }
    }
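
The handler above calls `average` and `calculateUptime` without showing them. The versions below are hypothetical stand-ins, not the original implementations: they assume each history entry has an epoch-millisecond `timestamp`, and they treat any gap longer than `maxGapMs` between snapshots as downtime.

```typescript
// Hypothetical helpers; names mirror the handler above, bodies are assumptions.
interface HistoryEntry {
  timestamp: number // epoch milliseconds
  state: { temperature?: number }
}

// Mean of the defined values; returns null when there is nothing to average.
function average(values: Array<number | undefined>): number | null {
  const defined = values.filter((v): v is number => typeof v === 'number' && !Number.isNaN(v))
  if (defined.length === 0) return null
  return defined.reduce((sum, v) => sum + v, 0) / defined.length
}

// Treats the device as "up" between two consecutive snapshots when the gap is
// at most maxGapMs; longer gaps are counted as downtime.
function calculateUptime(history: HistoryEntry[], maxGapMs = 5 * 60 * 1000): number {
  const sorted = [...history].sort((a, b) => a.timestamp - b.timestamp)
  let upMs = 0
  for (let i = 1; i < sorted.length; i++) {
    const gap = sorted[i].timestamp - sorted[i - 1].timestamp
    if (gap <= maxGapMs) upMs += gap
  }
  return upMs / (60 * 60 * 1000) // hours
}

const sampleHistory: HistoryEntry[] = [
  { timestamp: 0, state: { temperature: 20 } },
  { timestamp: 60_000, state: {} }, // reading without a temperature
  { timestamp: 120_000, state: { temperature: 22 } },
]
console.log(average(sampleHistory.map((h) => h.state.temperature))) // 21
```

Returning null for an empty window keeps "no data" distinct from a real reading of zero, which matters once dashboards start plotting these values.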

Synchronization Patterns

The trickiest part of a digital twin is keeping it in sync with the physical device across unreliable networks.

Pattern 1: Shadow-driven (simplest)
Device publishes state → Shadow updates → Lambda records history. Every change flows through one path, so the twin stays consistent with the shadow, but the history lags the device by however long the rule and Lambda take to run. Acceptable for monitoring use cases.

Pattern 2: Direct publish + shadow
The device publishes high-frequency telemetry (QoS 0) to a telemetry topic and sends low-frequency shadow updates (QoS 1). The twin stores telemetry in a time-series DB and uses the shadow for current state. Best of both: rich history and a reliably delivered current state.

Pattern 3: Event sourcing
Every state change is an immutable event. The twin is rebuilt by replaying events. Maximum fidelity, highest complexity. Worth it for regulated industries where audit trails are mandatory.

For most projects, Pattern 2 hits the right balance.
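
For teams that do need Pattern 3, the core mechanism is small: the twin's state is a pure function of an ordered event log. The sketch below is illustrative only; the event types, the `seq` field, and the `replay` and `applyEvent` names are assumptions, and a real system would also persist the log and snapshot it periodically.

```typescript
// Sketch of event sourcing: state is rebuilt by folding over immutable events.
type DeviceEvent =
  | { seq: number; type: 'StateReported'; changes: Record<string, unknown> }
  | { seq: number; type: 'CommandSent'; command: string }
  | { seq: number; type: 'FirmwareUpdated'; version: string }

interface TwinState {
  reported: Record<string, unknown>
  firmwareVersion: string | null
  pendingCommands: string[]
  lastSeq: number
}

const initialState: TwinState = { reported: {}, firmwareVersion: null, pendingCommands: [], lastSeq: 0 }

// Applies one event without mutating the previous state.
function applyEvent(state: TwinState, event: DeviceEvent): TwinState {
  if (event.seq <= state.lastSeq) return state // duplicate or stale delivery: ignore
  const next = { ...state, lastSeq: event.seq }
  switch (event.type) {
    case 'StateReported':
      return { ...next, reported: { ...state.reported, ...event.changes } }
    case 'CommandSent':
      return { ...next, pendingCommands: [...state.pendingCommands, event.command] }
    case 'FirmwareUpdated':
      return { ...next, firmwareVersion: event.version }
  }
}

// Rebuilds the twin by replaying events in sequence order.
function replay(events: DeviceEvent[]): TwinState {
  return [...events].sort((a, b) => a.seq - b.seq).reduce(applyEvent, initialState)
}

const rebuilt = replay([
  { seq: 2, type: 'StateReported', changes: { temperature: 22 } },
  { seq: 1, type: 'StateReported', changes: { temperature: 20, status: 'ok' } },
  { seq: 3, type: 'FirmwareUpdated', version: '1.2.0' },
])
console.log(JSON.stringify(rebuilt.reported)) // {"temperature":22,"status":"ok"}
```

Because events are sorted by sequence number and stale ones are skipped, out-of-order or repeated deliveries over a flaky network produce the same rebuilt state.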

Use Case: Simulation and Testing

A digital twin that accurately models device behavior lets you test new firmware without hardware:

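    // Twin simulator: generate realistic sensor data for integration tests
    interface DeviceState {
      temperature: number
      timestamp: number // epoch milliseconds
      uptime: number // seconds
    }

    class DeviceTwinSimulator {
      private state: DeviceState
      private history: DeviceState[]

      constructor(readonly deviceId: string, seedHistory: DeviceState[]) {
        if (seedHistory.length === 0) throw new Error('seedHistory must not be empty')
        this.history = [...seedHistory]
        this.state = seedHistory[seedHistory.length - 1]
      }

      // Generate next state based on learned patterns
      tick(elapsedSeconds: number): DeviceState {
        // Advance simulated time from the last state rather than reading the wall clock
        const timestamp = this.state.timestamp + elapsedSeconds * 1000
        const hourOfDay = new Date(timestamp).getHours()

        // Temperature follows daily cycle learned from history
        const baseTemp = this.learnedBaseline('temperature', hourOfDay)
        const noise = (Math.random() - 0.5) * 0.5

        this.state = {
          ...this.state,
          temperature: baseTemp + noise,
          timestamp,
          uptime: this.state.uptime + elapsedSeconds,
        }

        this.history.push(this.state)
        return this.state
      }

      // Assumed baseline: mean of past values recorded in this hour, else the last value
      private learnedBaseline(field: 'temperature', hourOfDay: number): number {
        const sameHour = this.history.filter((s) => new Date(s.timestamp).getHours() === hourOfDay)
        if (sameHour.length === 0) return this.state[field]
        return sameHour.reduce((sum, s) => sum + s[field], 0) / sameHour.length
      }
    }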

In CI/CD, spin up twin simulators for 50 virtual devices, run your entire backend pipeline against them, and verify data flows correctly — without a single physical device in the loop.
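
A fleet run like that can be sketched as a small harness. Everything here is hypothetical: `SimulatedDevice` is a simplified stand-in for the simulator class, `ingest` stands in for a call into your real pipeline, and the seeded random generator is an addition so that CI runs are reproducible.

```typescript
// Sketch of a CI harness: N simulated devices feed a stand-in ingestion function.
interface Reading { deviceId: string; temperature: number; timestamp: number }

// Small seeded PRNG (mulberry32) so test runs are reproducible.
function seededRandom(seed: number): () => number {
  let a = seed >>> 0
  return () => {
    a = (a + 0x6d2b79f5) >>> 0
    let t = a
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

class SimulatedDevice {
  private timestamp: number

  constructor(
    readonly deviceId: string,
    private baseTemp: number,
    startMs: number,
    private rand: () => number
  ) {
    this.timestamp = startMs
  }

  tick(elapsedSeconds: number): Reading {
    this.timestamp += elapsedSeconds * 1000
    const noise = (this.rand() - 0.5) * 0.5 // within ±0.25 of the base temperature
    return { deviceId: this.deviceId, temperature: this.baseTemp + noise, timestamp: this.timestamp }
  }
}

function runFleet(deviceCount: number, ticks: number, ingest: (r: Reading) => void): void {
  const rand = seededRandom(42)
  const fleet = Array.from(
    { length: deviceCount },
    (_, i) => new SimulatedDevice(`sim-${i}`, 20 + (i % 5), 0, rand)
  )
  for (let t = 0; t < ticks; t++) {
    for (const device of fleet) ingest(device.tick(60))
  }
}

// Example: 50 devices, 10 ticks each, collected in memory.
const received: Reading[] = []
runFleet(50, 10, (r) => received.push(r))
console.log(received.length) // 500
```

In a real pipeline test, the assertions would check what the backend stored or emitted rather than the in-memory array.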

Use Case: Predictive Maintenance

The twin accumulates enough state history to run predictive models. A vibration sensor twin that has tracked motor vibration for six months can flag when vibration patterns deviate from the learned baseline — before the motor fails.
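
As a rough illustration of "deviating from the learned baseline", the sketch below learns a mean and standard deviation from historical readings and flags values far outside them. The z-score rule, the threshold of 3, and all function names are assumptions chosen for brevity; a production model would likely use richer features than a single statistic.

```typescript
// Sketch: flag readings that deviate from a baseline learned from history.
interface Baseline { mean: number; stdDev: number }

function learnBaseline(samples: number[]): Baseline {
  if (samples.length < 2) throw new Error('need at least two samples')
  const mean = samples.reduce((s, v) => s + v, 0) / samples.length
  // Sample variance (n - 1 in the denominator)
  const variance = samples.reduce((s, v) => s + (v - mean) ** 2, 0) / (samples.length - 1)
  return { mean, stdDev: Math.sqrt(variance) }
}

// Distance from the baseline mean, measured in standard deviations.
function zScore(value: number, baseline: Baseline): number {
  if (baseline.stdDev === 0) return value === baseline.mean ? 0 : Infinity
  return Math.abs(value - baseline.mean) / baseline.stdDev
}

function isAnomalous(value: number, baseline: Baseline, threshold = 3): boolean {
  return zScore(value, baseline) > threshold
}

// Hypothetical vibration readings (mm/s) from the twin's state history.
const vibrationBaseline = learnBaseline([2.0, 2.1, 1.9, 2.0, 2.2, 1.8])
console.log(isAnomalous(2.1, vibrationBaseline)) // false
console.log(isAnomalous(3.0, vibrationBaseline)) // true
```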

This is covered in depth in our [predictive maintenance guide](/blog/predictive-maintenance-iot-ml), but the key architectural point is that the twin is the data source. The ML model queries the twin API, not the raw telemetry stream. This decoupling means you can improve the model without touching the ingestion pipeline.

Implementation Checklist

  • AWS IoT Device Shadow for current state (every device, from day one)
  • Lambda + DynamoDB for state history (time-series table design)
  • Twin REST API serving current state + history + computed attributes
  • TTL-based data retention (prevent unbounded storage growth)
  • Twin simulator for integration and end-to-end testing
  • Dashboard consuming twin API, not raw MQTT

The digital twin pattern pays dividends throughout the product lifecycle — faster debugging, safer testing, and eventually the foundation for ML-driven predictive features.

Need help? [Contact Code Caracal](/contact) — we've shipped these systems for clients across 15+ countries.

Written by CodeCaracal Engineering

We write from production experience — every technique in our articles has been deployed to real clients. No academic theory.
