IoT Fleet Management at Scale: AWS IoT Core Device Registry and Provisioning
When you have 10 devices, a spreadsheet works. When you have 10,000 devices, you need automated provisioning, group-based policy management, remote job execution, and real-time fleet health visibility. AWS IoT Core's fleet management features handle all of this — if you configure them correctly from day one.
The Device Registry: Your Source of Truth
The AWS IoT Core Device Registry is a managed database for every device in your fleet. Each entry (called a *Thing*) stores a unique Thing name, an optional Thing Type, and a set of key-value attributes.
Attributes are queryable via Fleet Indexing, making them critical for operational visibility.
// Node.js: register a new device at the factory
const { IoTClient, CreateThingCommand, AttachThingPrincipalCommand } = require('@aws-sdk/client-iot')
const iot = new IoTClient({ region: 'us-east-1' })

async function registerDevice(serialNumber, hwRevision, location) {
  const thingName = `device-${serialNumber}`
  await iot.send(new CreateThingCommand({
    thingName,
    thingTypeName: 'EnvironmentalSensor', // the Thing Type must already exist
    attributePayload: {
      attributes: {
        hwRevision,
        location,
        firmwareVersion: '1.0.0',
        provisionedAt: new Date().toISOString(),
      },
    },
  }))
  console.log(`Registered: ${thingName}`)
  return thingName
}
Best practice: derive the Thing name from the physical serial number (here, device-<serial>). Thing names cannot be changed after creation, so this gives you a durable 1:1 mapping between hardware and cloud identity that survives firmware reflashes and certificate rotations.
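A malformed serial number is easier to catch before the CreateThing call than after. The sketch below assumes the documented Thing name constraints (letters, digits, colon, underscore and hyphen, up to 128 characters); the helper name thingNameFromSerial is hypothetical and not part of the AWS SDK.

```javascript
// Sketch: derive a Thing name from a serial number and validate it locally.
// Assumption: Thing names must match [a-zA-Z0-9:_-] and be 1-128 characters.
const THING_NAME_PATTERN = /^[a-zA-Z0-9:_-]{1,128}$/

function thingNameFromSerial(serialNumber) {
  const name = `device-${String(serialNumber).trim()}`
  if (!THING_NAME_PATTERN.test(name)) {
    throw new Error(`Serial number produces an invalid Thing name: ${name}`)
  }
  return name
}
```

Calling thingNameFromSerial inside registerDevice would reject serial numbers containing spaces or other unsupported characters before any AWS request is made.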
Fleet Provisioning Templates: Zero-Touch at Scale
Manually creating certificates for each device doesn't scale past a few hundred units. Fleet Provisioning Templates let devices generate their own certificates and register automatically at first boot.
The flow: the factory programs a shared claim certificate onto every device. At first boot, the device connects with that claim certificate, calls the fleet provisioning MQTT APIs (CreateKeysAndCertificate and RegisterThing) to obtain its own certificate and register itself, and from then on connects with its own identity. RegisterThing is driven by a template such as:
{
  "templateBody": {
    "Parameters": {
      "SerialNumber": { "Type": "String" },
      "HardwareRevision": { "Type": "String" },
      "AWS::IoT::Certificate::Id": { "Type": "String" }
    },
    "Resources": {
      "thing": {
        "Type": "AWS::IoT::Thing",
        "Properties": {
          "ThingName": { "Fn::Join": ["-", ["device", { "Ref": "SerialNumber" }]] },
          "ThingTypeName": "EnvironmentalSensor",
          "AttributePayload": {
            "hwRevision": { "Ref": "HardwareRevision" },
            "firmwareVersion": "1.0.0"
          }
        }
      },
      "certificate": {
        "Type": "AWS::IoT::Certificate",
        "Properties": {
          "CertificateId": { "Ref": "AWS::IoT::Certificate::Id" },
          "Status": "ACTIVE"
        }
      },
      "policy": {
        "Type": "AWS::IoT::Policy",
        "Properties": {
          "PolicyName": "SensorDevicePolicy"
        }
      }
    }
  }
}
This template runs at device first-boot and wires everything together automatically — no human intervention after the factory programs the claim certificate.
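On the device side, the two provisioning calls are plain MQTT publishes on reserved topics. The sketch below only builds the topic names and the RegisterThing request body; the helper names and the template name SensorFleetTemplate are hypothetical, and the MQTT client itself is out of scope. It assumes the service fills in AWS::IoT::Certificate::Id on its own, so the device sends only SerialNumber and HardwareRevision.

```javascript
// Sketch: topics and payload a device would use for fleet provisioning by claim.
// Assumption: the provisioning template is named by the caller (hypothetical here).
function provisioningTopics(templateName) {
  const register = `$aws/provisioning-templates/${templateName}/provision/json`
  return {
    createKeys: '$aws/certificates/create/json',
    createKeysAccepted: '$aws/certificates/create/json/accepted',
    registerThing: register,
    registerThingAccepted: `${register}/accepted`,
  }
}

// certificateOwnershipToken comes from the CreateKeysAndCertificate response.
function buildRegisterThingRequest(certificateOwnershipToken, serialNumber, hardwareRevision) {
  return {
    certificateOwnershipToken,
    parameters: {
      SerialNumber: serialNumber,
      HardwareRevision: hardwareRevision,
    },
  }
}

// Example: what a device with serial SN12345 would publish to the registerThing topic.
const topics = provisioningTopics('SensorFleetTemplate')
const request = buildRegisterThingRequest('token-from-create-keys', 'SN12345', 'rev-b')
console.log(topics.registerThing, JSON.stringify(request))
```

The parameter keys must match the names declared in the template's Parameters section exactly, including case.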
Device Groups: Organizing Your Fleet
Thing Groups let you apply policies, jobs, and logging rules to logical subsets of your fleet. Groups are hierarchical, which mirrors real-world deployments:
FleetRoot
├── Building-A
│   ├── Floor-1
│   └── Floor-2
├── Building-B
└── Staging
    └── QA-Devices
A device inherits the policies of all groups in its ancestry. This means you can push a firmware update to Building-A without touching Building-B or Staging.
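A hierarchy like this has to be created parents first, because each child names its parent at creation time. The sketch below flattens a nested object into CreateThingGroup requests in that order; the function names are hypothetical, and createGroups assumes an IoTClient from @aws-sdk/client-iot is passed in.

```javascript
// Sketch: turn a nested group tree into parent-first CreateThingGroup requests.
const hierarchy = {
  FleetRoot: {
    'Building-A': { 'Floor-1': {}, 'Floor-2': {} },
    'Building-B': {},
    Staging: { 'QA-Devices': {} },
  },
}

function groupCreationRequests(tree, parentGroupName) {
  const requests = []
  for (const [thingGroupName, children] of Object.entries(tree)) {
    // Root groups have no parentGroupName; every other group names its parent.
    requests.push(parentGroupName ? { thingGroupName, parentGroupName } : { thingGroupName })
    requests.push(...groupCreationRequests(children, thingGroupName))
  }
  return requests
}

// Sends the requests one at a time so each parent exists before its children.
async function createGroups(iotClient, tree) {
  const { CreateThingGroupCommand } = require('@aws-sdk/client-iot')
  for (const request of groupCreationRequests(tree)) {
    await iotClient.send(new CreateThingGroupCommand(request))
  }
}
```

Keeping the tree as data makes it easy to review the structure in a pull request before it is applied to the account.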
Dynamic groups use Fleet Indexing queries instead of static membership — devices automatically join or leave based on their attributes:
// Automatically group all devices running firmware < 2.0.0
const { CreateDynamicThingGroupCommand } = require('@aws-sdk/client-iot')

await iot.send(new CreateDynamicThingGroupCommand({
  thingGroupName: 'LegacyFirmware',
  queryString: 'attributes.firmwareVersion < "2.0.0"',
}))
AWS IoT Jobs: Coordinated OTA Updates
The Jobs API orchestrates any operation across a group of devices — firmware updates, configuration changes, certificate rotations. Each job tracks per-device status: queued → in-progress → succeeded/failed.
// Create a firmware OTA job for a device group
const { CreateJobCommand } = require('@aws-sdk/client-iot')

async function createOtaJob(targetGroup, firmwareVersion, s3Url) {
  // Job IDs cannot contain dots, so 2.1.0 becomes 2-1-0
  const jobId = `ota-${firmwareVersion.replace(/\./g, '-')}-${Date.now()}`
  await iot.send(new CreateJobCommand({
    jobId,
    // 123456789012 is a placeholder account ID
    targets: [`arn:aws:iot:us-east-1:123456789012:thinggroup/${targetGroup}`],
    document: JSON.stringify({
      operation: 'firmware_update',
      firmwareVersion,
      url: s3Url,
      checksum: 'sha256:abc123...',
    }),
    jobExecutionsRolloutConfig: {
      maximumPerMinute: 50, // rate-limit rollout
      exponentialRate: {
        baseRatePerMinute: 5,
        incrementFactor: 2,
        rateIncreaseCriteria: { numberOfSucceededThings: 20 },
      },
    },
    abortConfig: {
      criteriaList: [{
        action: 'CANCEL',
        failureType: 'FAILED',
        minNumberOfExecutedThings: 10,
        thresholdPercentage: 20, // abort if >20% fail
      }],
    },
    timeoutConfig: { inProgressTimeoutInMinutes: 30 },
    description: `Firmware upgrade to ${firmwareVersion}`,
  }))
  return jobId
}
The exponential rollout and automatic abort are critical in production. A bad firmware build that bricks 20% of the first 10 devices should not reach the remaining 9,990.
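To get a feel for how fast an exponential rollout ramps, it can help to list the rate steps a configuration produces. The sketch below is a simplified model, not the service's scheduler: it assumes the rate is multiplied by incrementFactor each time the increase criterion is met and is capped at maximumPerMinute. The function name rolloutRateSteps is hypothetical.

```javascript
// Sketch: list the per-minute rates an exponential rollout would step through.
// Simplified model: rate *= incrementFactor per step, capped at maximumPerMinute.
function rolloutRateSteps({ baseRatePerMinute, incrementFactor, maximumPerMinute }) {
  if (incrementFactor <= 1) {
    throw new Error('incrementFactor must be greater than 1')
  }
  const steps = []
  let rate = baseRatePerMinute
  while (rate < maximumPerMinute) {
    steps.push(rate)
    rate = rate * incrementFactor
  }
  steps.push(maximumPerMinute) // final step is the configured ceiling
  return steps
}

// Same values as the job configuration above.
console.log(rolloutRateSteps({ baseRatePerMinute: 5, incrementFactor: 2, maximumPerMinute: 50 }))
```

With a base rate of 5 and a factor of 2, the ceiling of 50 per minute is reached after four increases, which leaves several checkpoints at which the abort criteria can stop a bad build.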
Fleet Indexing and Search
Fleet Indexing indexes Thing attributes, connectivity status, and shadow state in near-real-time. This turns operational questions into simple queries. Job execution status is not part of the index, so per-job failures are read from the Jobs API instead:
const { SearchIndexCommand, ListJobExecutionsForJobCommand } = require('@aws-sdk/client-iot')

// Find all offline devices in Building-A
const offline = await iot.send(new SearchIndexCommand({
  queryString: 'connectivity.connected:false AND thingGroupNames:Building-A',
  maxResults: 250,
}))

// Find devices that failed a given OTA job (use the jobId returned by createOtaJob)
const failedOta = await iot.send(new ListJobExecutionsForJobCommand({
  jobId: 'ota-2-1-0',
  status: 'FAILED',
}))

console.log(`Offline devices: ${offline.things.length}`)
console.log(`OTA failures: ${failedOta.executionSummaries.length}`)
Enable indexing in your IoT Core settings by setting the Thing indexing mode to REGISTRY_AND_SHADOW (which includes registry data) and the connectivity indexing mode to STATUS. The cost is minimal ($0.25 per million indexed updates) and the operational value is enormous.
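The same setting can be applied from code with UpdateIndexingConfiguration, where registry/shadow indexing and connectivity indexing are two separate fields. This is a minimal sketch: the helper names are hypothetical, and enableFleetIndexing assumes an IoTClient from @aws-sdk/client-iot with permission to change the account's indexing configuration.

```javascript
// Sketch: build the indexing configuration and apply it to the account.
function indexingConfiguration() {
  return {
    thingIndexingConfiguration: {
      thingIndexingMode: 'REGISTRY_AND_SHADOW', // registry attributes plus shadow state
      thingConnectivityIndexingMode: 'STATUS',  // enables connectivity.connected queries
    },
  }
}

async function enableFleetIndexing(iotClient) {
  const { UpdateIndexingConfigurationCommand } = require('@aws-sdk/client-iot')
  await iotClient.send(new UpdateIndexingConfigurationCommand(indexingConfiguration()))
}
```

Indexing is an account-level, per-region setting, so this typically runs once during environment setup rather than in application code.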
Monitoring Fleet Health
Connect IoT Core metrics to CloudWatch for alerting:
- iot.NumConnectedDevices: current online count
- iot.PublishIn.Success / iot.PublishIn.ClientError: message rate and error ratio
- iot.NumSubscriptions: active subscriptions

Set a CloudWatch alarm when NumConnectedDevices drops more than 10% in 5 minutes. That is your early warning for a regional network issue or a broken firmware build.
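The alarm condition itself is a relative drop between two samples taken five minutes apart. As a rough sketch of that logic in plain code (the function name connectedDropExceeds is hypothetical, and in practice CloudWatch would evaluate the condition rather than your own process):

```javascript
// Sketch: true when the connected-device count fell by more than thresholdFraction
// between two samples (for example, five minutes apart).
function connectedDropExceeds(previousCount, currentCount, thresholdFraction = 0.10) {
  if (previousCount <= 0) return false // nothing to compare against
  const drop = (previousCount - currentCount) / previousCount
  return drop > thresholdFraction
}

console.log(connectedDropExceeds(10000, 8900)) // 11% drop
console.log(connectedDropExceeds(10000, 9500)) // 5% drop
```

A relative threshold scales with fleet size, which an absolute device count would not.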
Cost Considerations
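AWS IoT Core pricing has three components that matter for this workload: connectivity (billed per minute a device stays connected), messaging (billed per message), and Device Shadow and Registry operations (billed per operation).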
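At 10,000 devices sending one reading per minute, a 30-day month comes to 432 million connection-minutes and 432 million messages. At us-east-1 list prices of about $0.08 per million connection-minutes and $1.00 per million messages, that is roughly $35/month for connectivity and $430/month for messaging, so messaging dominates the broker bill. Batching several readings into one message reduces it directly, since messages are metered in 5 KB increments. Check the current AWS IoT Core pricing page before budgeting, and compare the total against the engineering time needed to run and operate your own highly available MQTT cluster.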
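The estimate is easy to redo for your own fleet size and reporting interval. The sketch below computes monthly volumes and leaves the rates as inputs, since they vary by region and change over time; the function and parameter names are hypothetical.

```javascript
// Sketch: monthly connection-minutes and message counts for an always-connected fleet.
function monthlyUsage({ devices, messagesPerDevicePerMinute, daysPerMonth = 30 }) {
  const minutesPerMonth = daysPerMonth * 24 * 60
  return {
    connectionMinutes: devices * minutesPerMonth,
    messages: devices * messagesPerDevicePerMinute * minutesPerMonth,
  }
}

// Rates are per million units; supply the current values from the pricing page.
function monthlyCost(usage, { perMillionConnectionMinutes, perMillionMessages }) {
  return (usage.connectionMinutes / 1e6) * perMillionConnectionMinutes +
         (usage.messages / 1e6) * perMillionMessages
}

const usage = monthlyUsage({ devices: 10000, messagesPerDevicePerMinute: 1 })
console.log(usage) // volumes for the 10,000-device example
```

Halving the reporting frequency halves the message count but leaves connection-minutes unchanged, which is why the reporting interval is usually the first lever to look at.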
Device Shadow costs add up if you update shadow state on every telemetry publish. Only update the shadow when device *state* changes (firmware version updated, configuration changed), not on every sensor reading.
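One way to enforce this on the device or gateway is to compare the new state with the last reported state and publish only the difference. The sketch below assumes a flat state object with primitive values; the function names are hypothetical, and the payload follows the standard shadow update document shape with a state.reported section.

```javascript
// Sketch: return only the keys whose values changed, or null if nothing changed.
// Assumption: state is a flat object of primitive values (no nested objects).
function changedReportedState(lastReported, current) {
  const delta = {}
  for (const [key, value] of Object.entries(current)) {
    if (lastReported[key] !== value) delta[key] = value
  }
  return Object.keys(delta).length > 0 ? delta : null
}

// Build the shadow update document for the changed keys only.
function shadowUpdatePayload(delta) {
  return JSON.stringify({ state: { reported: delta } })
}

const last = { firmwareVersion: '1.0.0', reportingIntervalSec: 60 }
const now = { firmwareVersion: '2.0.0', reportingIntervalSec: 60 }
const delta = changedReportedState(last, now)
if (delta) console.log(shadowUpdatePayload(delta))
```

Sensor readings stay on the regular telemetry topic; only the fields in this state object ever reach the shadow.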
For the full end-to-end architecture connecting firmware to this fleet management layer, see the [ESP32 MQTT AWS IoT Core Production Guide](/blog/esp32-mqtt-aws-iot-core-production-guide).
Also pair this with [IoT Architecture Patterns](/blog/iot-architecture-patterns-2024) for guidance on how fleet management fits into your overall system topology.
Need help with IoT fleet management at scale? [Contact Code Caracal](/contact) — we've shipped these systems for clients across 15+ countries.