ESP32 → MQTT → AWS IoT Core: The Production-Grade Architecture Guide
Most IoT tutorials teach you to blink an LED or send a single sensor reading to a free MQTT broker. That's fine for learning. But when you're deploying 100+ devices for a real client, you need a different mindset entirely.
In this guide, I'll walk through the exact architecture we use at CodeCaracal for production IoT systems — the kind that runs with 99.9% uptime SLAs.
The Production Stack
Our proven stack for end-to-end IoT:
ESP32 Firmware (C/C++)
↓ MQTT over TLS (port 8883)
AWS IoT Core (MQTT Broker + Rules Engine)
↓
Node.js Backend (WebSocket + REST)
↓
InfluxDB (Time-series storage)
↓
React/Next.js Dashboard (Real-time)
↓
Flutter App (Mobile)
Step 1: ESP32 Firmware — Security from Day One
Never ship firmware without TLS. Period.
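#include <WiFiClientSecure.h>
#include <PubSubClient.h>

const char* AWS_IOT_ENDPOINT = "your-endpoint.iot.us-east-1.amazonaws.com";
const int AWS_IOT_PORT = 8883;  // MQTT over TLS

// Certificate store, embedded at compile time (the build system generates these symbols from the .pem files)
extern const char AWS_ROOT_CA[] asm("_binary_AmazonRootCA1_pem_start");
extern const char DEVICE_CERT[] asm("_binary_certificate_pem_crt_start");
extern const char DEVICE_KEY[] asm("_binary_private_pem_key_start");

WiFiClientSecure tlsClient;
PubSubClient mqttClient(tlsClient);

// MQTT callback, defined elsewhere in the firmware
void messageHandler(char* topic, byte* payload, unsigned int length);

void setupMQTT() {
  // Mutual TLS: verify the broker against the Amazon root CA and present the device certificate
  tlsClient.setCACert(AWS_ROOT_CA);
  tlsClient.setCertificate(DEVICE_CERT);
  tlsClient.setPrivateKey(DEVICE_KEY);
  mqttClient.setServer(AWS_IOT_ENDPOINT, AWS_IOT_PORT);
  mqttClient.setCallback(messageHandler);
  mqttClient.setBufferSize(1024);  // the 256-byte default is too small for larger JSON payloads
}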
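The setBufferSize(1024) call matters because PubSubClient refuses to publish any packet that does not fit in its buffer. A minimal sketch of a pre-publish size check follows. It assumes the per-packet overhead matches the bound PubSubClient applies internally (up to 5 header bytes plus a 2-byte topic length); the function name fitsInMqttBuffer and the constants are illustrative and not part of the library's API.

```cpp
#include <cstddef>
#include <cstring>

// Assumed overhead, mirroring PubSubClient's internal size check:
// up to 5 bytes of fixed header plus a 2-byte topic length prefix.
constexpr std::size_t kMqttMaxHeaderSize = 5;
constexpr std::size_t kTopicLengthPrefix = 2;

// Returns true if a PUBLISH with this topic and payload fits in a buffer
// of bufferSize bytes (the value passed to setBufferSize()).
bool fitsInMqttBuffer(const char* topic, std::size_t payloadLen, std::size_t bufferSize) {
    const std::size_t needed =
        kMqttMaxHeaderSize + kTopicLengthPrefix + std::strlen(topic) + payloadLen;
    return needed <= bufferSize;
}
```

With a 27-character topic such as devices/esp32-001/telemetry, a 300-byte payload would be rejected at the 256-byte default but fits comfortably in 1024 bytes. Running a check like this before publishing turns a silent publish failure into something the firmware can log.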
Key firmware design principles
Step 2: AWS IoT Core Configuration
AWS IoT Core gives you a fully managed MQTT broker with fleet management, a rules engine, and device shadows.
Device Shadow for State Sync
Device shadows solve a critical problem: what happens when the device is offline when you send a command?
{
  "state": {
    "desired": {
      "relay1": true,
      "brightness": 80
    },
    "reported": {
      "relay1": false,
      "brightness": 0,
      "temperature": 24.5,
      "firmware": "v2.3.1"
    }
  }
}
When the device comes online, it reads the delta and applies the desired state. Clean, reliable, offline-safe.
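AWS IoT computes the delta on the service side and publishes it on the $aws/things/<thingName>/shadow/update/delta topic, so the device does not have to diff anything itself. To make clear what that delta contains, here is a small sketch of the same comparison done locally. The ShadowState alias and computeDelta are hypothetical names, and values are kept as JSON-encoded strings to avoid pulling in a JSON library.

```cpp
#include <map>
#include <string>

// Shadow state as key -> JSON-encoded value (e.g. "true", "80", "\"v2.3.1\"").
using ShadowState = std::map<std::string, std::string>;

// Returns the desired entries that are missing from, or differ in, reported.
ShadowState computeDelta(const ShadowState& desired, const ShadowState& reported) {
    ShadowState delta;
    for (const auto& entry : desired) {
        const auto it = reported.find(entry.first);
        if (it == reported.end() || it->second != entry.second) {
            delta.insert(entry);
        }
    }
    return delta;
}
```

For the shadow document above, the delta holds relay1 = true and brightness = 80. The temperature and firmware keys appear only under reported, so they never show up in the delta. Once the device applies both values and reports them back, the delta is empty.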
IoT Rules Engine
Route telemetry to multiple targets simultaneously:
SELECT *, topic(2) AS deviceId, timestamp() AS ts
FROM 'devices/+/telemetry'
WHERE temperature > 0 AND humidity >= 0 AND humidity <= 100
Route to: Kinesis (stream), DynamoDB (device state), SNS (alerts), Lambda (processing).
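Topic segments in the rules engine's topic(n) function are 1-indexed, so for a topic shaped like devices/<id>/telemetry the device ID is segment 2. The sketch below mirrors that lookup and the rule's filter in plain C++, which can be handy for unit-testing payloads before they reach AWS. Both topicSegment and passesFilter are hypothetical helper names, not AWS APIs.

```cpp
#include <cstddef>
#include <string>

// Returns the n-th '/'-separated segment of an MQTT topic, 1-indexed like
// the rules engine's topic(n); returns "" when n is out of range.
std::string topicSegment(const std::string& topic, int n) {
    if (n < 1) return "";
    std::size_t start = 0;
    for (int i = 1; i < n; ++i) {
        const std::size_t slash = topic.find('/', start);
        if (slash == std::string::npos) return "";
        start = slash + 1;
    }
    const std::size_t end = topic.find('/', start);
    return topic.substr(start, end == std::string::npos ? std::string::npos : end - start);
}

// Local equivalent of the rule's WHERE clause.
bool passesFilter(double temperature, double humidity) {
    return temperature > 0 && humidity >= 0 && humidity <= 100;
}
```

For devices/esp32-001/telemetry, segment 1 is devices, segment 2 is esp32-001, and segment 3 is telemetry. Note that the filter as written discards readings at or below 0 degrees, which is only appropriate if sub-zero temperatures indicate a sensor fault in your deployment.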
Step 3: OTA Updates at Scale
OTA is where many IoT projects fall apart. Here's the architecture:
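#include <HTTPClient.h>
#include <ArduinoJson.h>

// Maps "v2.3.1" ("v" optional, components under 1000) to a number, -1 if malformed; strcmp() would rank "v2.10.0" below "v2.9.0"
static long versionKey(const char* v) {
  int major, minor, patch;
  if (v == nullptr) return -1;
  if (*v == 'v') ++v;
  if (sscanf(v, "%d.%d.%d", &major, &minor, &patch) != 3) return -1;
  return major * 1000000L + minor * 1000L + patch;
}

void checkOTA() {
  HTTPClient http;
  http.begin("https://cdn.codecaracal.dev/firmware/manifest.json");
  if (http.GET() == HTTP_CODE_OK) {
    StaticJsonDocument<512> manifest;
    deserializeJson(manifest, http.getString());
    const char* latestVersion = manifest["version"];  // null if parsing failed
    if (versionKey(latestVersion) > versionKey(CURRENT_VERSION)) {
      Serial.printf("OTA: %s → %s\n", CURRENT_VERSION, latestVersion);
      performOTA(manifest["url"]);
    }
  }
  http.end();  // release the connection whether or not an update was found
}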
Step 4: Scalable Backend
At 10,000 devices publishing every 5 seconds, you're handling 2,000 messages/second. Your backend needs to be ready.
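The sizing arithmetic is worth writing down so it can be rerun as the fleet grows. A minimal sketch, with hypothetical function names, that computes the average message rate and the resulting daily volume:

```cpp
#include <cstdint>

// Average fleet-wide publish rate in messages per second.
double messagesPerSecond(std::uint64_t deviceCount, double publishIntervalSeconds) {
    return static_cast<double>(deviceCount) / publishIntervalSeconds;
}

// Messages accumulated over one day at a steady rate.
std::uint64_t messagesPerDay(double ratePerSecond) {
    return static_cast<std::uint64_t>(ratePerSecond * 86400.0);
}
```

For 10,000 devices and a 5-second interval this gives 2,000 messages per second, or 172,800,000 messages per day. These are averages: if many devices reconnect at the same moment, the instantaneous rate can be considerably higher, so size for headroom rather than for the mean.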
Time-Series Data with InfluxDB
InfluxDB is purpose-built for time-series data, which makes it a good fit for IoT telemetry:
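// Write telemetry to InfluxDB using the @influxdata/influxdb-client package
import { InfluxDB, Point } from '@influxdata/influxdb-client'

// Connection settings come from the environment; the variable names are placeholders
const writeApi = new InfluxDB({ url: process.env.INFLUX_URL, token: process.env.INFLUX_TOKEN })
  .getWriteApi(process.env.INFLUX_ORG, process.env.INFLUX_BUCKET)

const point = new Point('sensor_reading')
  .tag('device_id', deviceId)
  .tag('location', device.location)
  .floatField('temperature', data.temperature)
  .floatField('humidity', data.humidity)
  .timestamp(new Date())

writeApi.writePoint(point) // buffers the point; the client sends batches in the background
await writeApi.flush()     // forces the buffered batch out; call sparingly, e.g. on shutdown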
InfluxDB handles billions of data points efficiently, with retention policies that expire old data automatically and scheduled downsampling (continuous queries in InfluxDB 1.x, tasks in 2.x).
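On the wire, each point is sent to InfluxDB as one line of line protocol text: the measurement, comma-separated tags, a space, comma-separated fields, a space, and a timestamp. The sketch below builds such a line by hand to show what the client library produces for a sensor reading. It is an illustration only: escapeKey and toLineProtocol are hypothetical names, and the code assumes a measurement name that needs no escaping, at least one field, and float-only fields.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Escapes commas, equals signs, and spaces, as line protocol requires
// for tag keys, tag values, and field keys.
std::string escapeKey(const std::string& raw) {
    std::string out;
    for (char c : raw) {
        if (c == ',' || c == '=' || c == ' ') out += '\\';
        out += c;
    }
    return out;
}

// Builds: measurement,tag=value,... field=value,... timestamp
// Field values use the stream's default precision (6 significant digits).
std::string toLineProtocol(const std::string& measurement,
                           const std::map<std::string, std::string>& tags,
                           const std::map<std::string, double>& fields,
                           std::int64_t timestampNs) {
    std::ostringstream line;
    line << measurement;
    for (const auto& tag : tags) {
        line << ',' << escapeKey(tag.first) << '=' << escapeKey(tag.second);
    }
    char separator = ' ';  // a space before the first field, commas after
    for (const auto& field : fields) {
        line << separator << escapeKey(field.first) << '=' << field.second;
        separator = ',';
    }
    line << ' ' << timestampNs;
    return line.str();
}
```

A reading tagged device_id = esp32-001 and location = "lab 1", with temperature 24.5 and humidity 40, becomes `sensor_reading,device_id=esp32-001,location=lab\ 1 humidity=40,temperature=24.5 1700000000000000000` for a nanosecond timestamp of 1700000000000000000. Tags are indexed and fields are not, which is why device_id and location are tags while the measured values are fields.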
Production Checklist
Before going live with any IoT system, verify:
Conclusion
Production IoT engineering is 20% clever firmware and 80% boring reliability engineering. The teams who ship reliable IoT systems are the ones who've thought deeply about failure modes, security from day one, and observability at every layer.
If you're building a system like this, [reach out to us](/contact) — we've shipped this stack dozens of times and can help you avoid the landmines.