IoT Engineering

Multi-Tenant IoT Platform Architecture: Isolation, Scaling, and Data Partitioning

Building an IoT platform for multiple customers requires careful decisions about tenant isolation, MQTT namespacing, and data partitioning. Get these wrong and tenants leak data or starve each other.

June 15, 2024
14 min read
Multi-Tenant · IoT Platform · Architecture · SaaS

Building an IoT platform that serves a single customer is engineering. Building one that serves 50 customers simultaneously — each with their own devices, users, data, and SLAs — is a fundamentally different problem.

Multi-tenant IoT platforms fail in predictable ways: Tenant A's rogue device floods the message broker and degrades Tenant B's latency. A misconfigured IoT policy lets Tenant A subscribe to Tenant B's device topics. A poorly designed database schema lets an application bug expose cross-tenant data. A flat pricing model means your largest tenant costs ten times what they pay.

This guide covers the architectural decisions that address each of these failure modes.

Isolation Strategy: Silo, Pool, or Bridge

The first decision is how much infrastructure to share between tenants:

| Model | Isolation | Cost efficiency | Complexity |
|---|---|---|---|
| Silo | Full (separate AWS account per tenant) | Lowest | Highest |
| Pool | Shared infrastructure, logical separation | Highest | Medium |
| Bridge | Shared control plane, isolated data plane | High | High |

Silo model (separate AWS account per tenant): Maximum isolation. A tenant's devices cannot interfere with another tenant even at the infrastructure level. Required for enterprise customers with strict compliance requirements (HIPAA, FedRAMP). Expensive to operate — 50 tenants means 50 AWS accounts to manage, monitor, and update.

Pool model (shared everything, logical separation): Most cost-efficient. All tenants share the same IoT Core endpoint, the same Lambda functions, the same database clusters. Isolation is enforced entirely by software: MQTT topic ACLs, API authentication, database row-level security. Suitable for SMB SaaS products.

Bridge model (our recommendation for most platforms): Shared control plane (API gateway, tenant management, billing), isolated data plane per tenant or per tenant tier. Enterprise tenants get dedicated IoT Core endpoints; SMB tenants share. Scale the data plane independently from the control plane.
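One way to make the bridge model concrete is a small resolver that maps a tenant's tier to the data plane it should use. The sketch below is illustrative only: the `Tenant` shape, the `resolveDataPlane` name, the `example.com` endpoint hostnames, and the `Devices-{tenantId}` table naming are all assumptions, not part of any AWS API.

```typescript
// Hypothetical tenant record for illustration.
type TenantTier = 'starter' | 'professional' | 'enterprise';

interface Tenant {
  id: string;
  tier: TenantTier;
  // Assumed field: only set once a dedicated data plane has been provisioned.
  dedicatedIotEndpoint?: string;
}

interface DataPlane {
  iotEndpoint: string;
  tableName: string;
  shared: boolean;
}

// Placeholder values for the pooled data plane.
const SHARED_IOT_ENDPOINT = 'shared-iot.example.com';
const SHARED_TABLE = 'MultiTenantDevices';

function resolveDataPlane(tenant: Tenant): DataPlane {
  // Enterprise tenants with a provisioned endpoint get isolated resources.
  if (tenant.tier === 'enterprise' && tenant.dedicatedIotEndpoint) {
    return {
      iotEndpoint: tenant.dedicatedIotEndpoint,
      tableName: `Devices-${tenant.id}`,
      shared: false,
    };
  }
  // Everyone else shares the pooled endpoint and table.
  return { iotEndpoint: SHARED_IOT_ENDPOINT, tableName: SHARED_TABLE, shared: true };
}
```

Keeping this decision in one function means the control plane stays identical for every tenant, and only the resolved resources differ.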

MQTT Topic Namespacing

In a pooled or bridge deployment, every MQTT topic must be namespaced by tenant. A flat topic like devices/sensor01/telemetry becomes t/{tenantId}/devices/{deviceId}/telemetry.

// Topic conventions for multi-tenant
const Topics = {
  telemetry: (tenantId: string, deviceId: string) =>
    `t/${tenantId}/devices/${deviceId}/telemetry`,

  heartbeat: (tenantId: string, deviceId: string) =>
    `t/${tenantId}/devices/${deviceId}/heartbeat`,

  command: (tenantId: string, deviceId: string) =>
    `t/${tenantId}/devices/${deviceId}/cmd`,

  // Wildcard for subscribing to all devices in a tenant (backend only)
  tenantWildcard: (tenantId: string) =>
    `t/${tenantId}/devices/+/telemetry`,
}

IoT policies enforce this at the broker level — not at the application level. Application-level enforcement alone is not sufficient; a compromised device or a code bug could bypass it.
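Broker policy is the primary control, but the backend can still validate topics as a second layer. The following is a minimal sketch, assuming the `t/{tenantId}/devices/{deviceId}/{channel}` convention above; `buildTopic`, `parseTopic`, and the allowed character set are hypothetical choices, not something the platform prescribes. The key idea is that tenant and device IDs must never contain `/`, `+`, or `#`, otherwise an ID could change the topic's shape.

```typescript
interface ParsedTopic {
  tenantId: string;
  deviceId: string;
  channel: string;
}

// Assumed ID alphabet: excludes the MQTT level separator and wildcards.
const SAFE_SEGMENT = /^[A-Za-z0-9_-]+$/;

function buildTopic(tenantId: string, deviceId: string, channel: string): string {
  for (const segment of [tenantId, deviceId, channel]) {
    if (!SAFE_SEGMENT.test(segment)) {
      throw new Error(`Unsafe topic segment: ${segment}`);
    }
  }
  return `t/${tenantId}/devices/${deviceId}/${channel}`;
}

// Returns null for anything that does not match the tenant-scoped layout.
function parseTopic(topic: string): ParsedTopic | null {
  const parts = topic.split('/');
  const [prefix, tenantId, devices, deviceId, channel] = parts;
  if (parts.length !== 5 || prefix !== 't' || devices !== 'devices') return null;
  if (!tenantId || !deviceId || !channel) return null;
  if (![tenantId, deviceId, channel].every((s) => SAFE_SEGMENT.test(s))) return null;
  return { tenantId, deviceId, channel };
}
```

A message router could call `parseTopic` on every incoming message and drop anything that returns `null` or whose `tenantId` disagrees with the tenant it expected.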

AWS IoT Policy Templates with Tenant Variables

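{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iot:Connect",
      "Resource": "arn:aws:iot:us-east-1:123456789012:client/${iot:Connection.Thing.ThingName}"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Publish",
      "Resource": [
        "arn:aws:iot:us-east-1:123456789012:topic/t/${iot:Connection.Thing.Attributes[tenantId]}/devices/${iot:Connection.Thing.ThingName}/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "iot:Subscribe",
      "Resource": [
        "arn:aws:iot:us-east-1:123456789012:topicfilter/t/${iot:Connection.Thing.Attributes[tenantId]}/devices/${iot:Connection.Thing.ThingName}/cmd"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "iot:Receive",
      "Resource": [
        "arn:aws:iot:us-east-1:123456789012:topic/t/${iot:Connection.Thing.Attributes[tenantId]}/devices/${iot:Connection.Thing.ThingName}/cmd"
      ]
    }
  ]
}

The ${iot:Connection.Thing.Attributes[tenantId]} policy variable is resolved by the broker from the registry attributes of the connecting Thing. For that to work, the MQTT client ID must match the Thing name, the device certificate must be attached to that Thing, and tenantId must be set as a Thing attribute during provisioning. The account ID 123456789012 is a placeholder. The iot:Receive statement is required alongside iot:Subscribe, otherwise the device can subscribe to its command topic but never receives messages on it. No explicit Deny statement is needed: IoT policies deny by default, so any topic outside the device's own tenant prefix is refused. A device registered under Tenant A cannot publish to Tenant B's topics; the broker rejects the attempt before your application code ever sees it.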
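To see why variable substitution yields tenant isolation, it can help to model the check locally. The sketch below is a simplified illustration, not AWS's actual policy evaluation: `substituteVariables`, `matchesResource`, and `canPublish` are hypothetical helpers, the variable names assume the Connection-scoped thing policy variables, and treating an unresolved attribute as a denial is an assumption of this sketch.

```typescript
interface ThingContext {
  thingName: string;
  attributes: Record<string, string>;
}

// Topic part of the publish resource, written with thing policy variables.
const PUBLISH_RESOURCE =
  't/${iot:Connection.Thing.Attributes[tenantId]}/devices/${iot:Connection.Thing.ThingName}/*';

// Replace policy variables with values from the connecting Thing.
// Returns null if a referenced attribute is missing (assumed to mean "deny").
function substituteVariables(resource: string, thing: ThingContext): string | null {
  const missing: string[] = [];
  const resolved = resource
    .replace(/\$\{iot:Connection\.Thing\.ThingName\}/g, () => thing.thingName)
    .replace(/\$\{iot:Connection\.Thing\.Attributes\[(\w+)\]\}/g, (_match, key: string) => {
      const value = thing.attributes[key];
      if (value === undefined) {
        missing.push(key);
        return '';
      }
      return value;
    });
  return missing.length > 0 ? null : resolved;
}

// Policy-style wildcard match: "*" matches any run of characters.
function matchesResource(pattern: string, topic: string): boolean {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
  return new RegExp(`^${escaped}$`).test(topic);
}

function canPublish(thing: ThingContext, topic: string): boolean {
  const resolved = substituteVariables(PUBLISH_RESOURCE, thing);
  return resolved !== null && matchesResource(resolved, topic);
}
```

For a Thing named `sensor01` with `tenantId` set to `acme`, the resource resolves to `t/acme/devices/sensor01/*`, so a publish to any other tenant's prefix has no matching Allow statement and is refused.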

Database Partitioning for Multi-Tenancy

DynamoDB: Tenant Prefix Pattern

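// All tenant data lives under the tenant's partition key
import { DynamoDBClient, PutItemCommand, QueryCommand } from '@aws-sdk/client-dynamodb'
import { marshall } from '@aws-sdk/util-dynamodb'

const db = new DynamoDBClient({})

// Device is the application's device type (defined elsewhere); it must have an id.
const DeviceTable = {
  put: async (tenantId: string, device: Device) => {
    await db.send(new PutItemCommand({
      TableName: 'MultiTenantDevices',
      Item: marshall({
        ...device, // spread first so device fields cannot overwrite the keys below
        pk: `TENANT#${tenantId}`,
        sk: `DEVICE#${device.id}`,
        tenantId, // always store tenantId on every item
      }),
    }))
  },

  // Query needs an equality match on the partition key;
  // begins_with is only valid on the sort key.
  query: async (tenantId: string) => {
    return db.send(new QueryCommand({
      TableName: 'MultiTenantDevices',
      KeyConditionExpression: 'pk = :pk AND begins_with(sk, :prefix)',
      ExpressionAttributeValues: marshall({
        ':pk': `TENANT#${tenantId}`,
        ':prefix': 'DEVICE#',
      }),
    }))
  },
}

Every query is pinned to the tenant's partition key, so a correctly scoped call cannot return another tenant's items. The tenant ID goes in the partition key and the device ID in the sort key because DynamoDB's Query operation only accepts an equality condition on the partition key. Key design protects against query bugs, not against code that passes the wrong tenantId; for defence in depth, an IAM condition on dynamodb:LeadingKeys can restrict a role to a single tenant's partition.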

For large tenants (>100k devices), consider dedicated DynamoDB tables per tenant using the bridge model. DynamoDB table-level isolation also simplifies GDPR right-to-deletion compliance — drop the table to purge all tenant data.
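For tenants that stay in the pooled table, dropping a table is not an option: deletion means querying the tenant's items and deleting them in batches. DynamoDB's BatchWriteItem accepts at most 25 put or delete requests per call, so the keys have to be chunked. The sketch below covers only the chunking step; `toDeleteBatches` and the `ItemKey` shape are hypothetical, and the keys would still need to be marshalled and sent (with retries for unprocessed items) by the caller.

```typescript
interface ItemKey {
  pk: string;
  sk: string;
}

interface DeleteRequestEntry {
  DeleteRequest: { Key: ItemKey };
}

// BatchWriteItem limit on requests per call.
const BATCH_WRITE_LIMIT = 25;

// Split a tenant's item keys into BatchWriteItem-sized groups of delete requests.
function toDeleteBatches(keys: ItemKey[]): DeleteRequestEntry[][] {
  const batches: DeleteRequestEntry[][] = [];
  for (let i = 0; i < keys.length; i += BATCH_WRITE_LIMIT) {
    const chunk = keys.slice(i, i + BATCH_WRITE_LIMIT);
    batches.push(chunk.map((Key) => ({ DeleteRequest: { Key } })));
  }
  return batches;
}
```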

API Gateway Multi-Tenancy

Every API call must carry a tenant context. Resolve tenant identity at the gateway, not in each microservice.

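// Lambda authorizer: resolve tenant from JWT, attach to context
import type { APIGatewayTokenAuthorizerEvent, APIGatewayAuthorizerResult } from 'aws-lambda'

// verifyJWT, getTenantById, and allowPolicy are application helpers defined elsewhere.
export const authorizer = async (
  event: APIGatewayTokenAuthorizerEvent,
): Promise<APIGatewayAuthorizerResult> => {
  // Strip the scheme only when it is a prefix, regardless of case
  const token = event.authorizationToken.replace(/^Bearer\s+/i, '')

  const payload = verifyJWT(token, process.env.JWT_SECRET!)
  const tenant = await getTenantById(payload.tenantId)

  // API Gateway maps the exact message 'Unauthorized' to a 401 response
  if (!tenant || tenant.status !== 'active') {
    throw new Error('Unauthorized')
  }

  return {
    principalId: payload.sub,
    policyDocument: allowPolicy(event.methodArn),
    context: {
      tenantId: tenant.id,
      tenantTier: tenant.tier, // 'starter' | 'professional' | 'enterprise'
      deviceLimit: tenant.deviceLimit,
      userId: payload.sub,
    },
  }
}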

Every downstream Lambda receives tenantId via event.requestContext.authorizer.tenantId — it never trusts tenant identity from the request body or query params.
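A downstream handler can make this rule hard to violate by reading the tenant through a single helper. The following is a minimal sketch, assuming a REST API with a Lambda authorizer that populates `requestContext.authorizer`; the `AuthorizedEvent` shape is trimmed to the fields used here, and `requireTenantId` is a hypothetical name.

```typescript
// Trimmed-down view of an API Gateway proxy event.
interface AuthorizedEvent {
  requestContext: { authorizer?: Record<string, unknown> | null };
  body?: string | null;
  queryStringParameters?: Record<string, string | undefined> | null;
}

// Reads the tenant ID from the authorizer context only.
// The body and query string are deliberately never consulted.
function requireTenantId(event: AuthorizedEvent): string {
  const tenantId = event.requestContext.authorizer?.['tenantId'];
  if (typeof tenantId !== 'string' || tenantId.length === 0) {
    throw new Error('Missing tenant context');
  }
  return tenantId;
}
```

Even if a caller sends `{"tenantId": "globex"}` in the request body, the handler keeps operating on the tenant the authorizer resolved from the JWT.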

Per-Tenant Billing and Rate Limiting

Track resource usage per tenant for billing and to prevent noisy-neighbor problems:

// Lambda: record usage metrics per tenant
async function recordUsage(tenantId: string, metric: string, value: number) {
  const month = new Date().toISOString().slice(0, 7) // e.g. "2024-06"

  await db.send(new UpdateItemCommand({
    TableName: 'TenantUsage',
    Key: marshall({ pk: `TENANT#${tenantId}`, sk: `USAGE#${month}` }),
    UpdateExpression: 'ADD #metric :val',
    ExpressionAttributeNames: { '#metric': metric },
    ExpressionAttributeValues: marshall({ ':val': value }),
  }))
}

// In your telemetry processor:
await recordUsage(tenantId, 'messagesReceived', batch.readings.length)
await recordUsage(tenantId, 'bytesIngested', payloadBytes)

Per-tenant rate limiting at the IoT Rule level is not natively supported, so implement it in your processing Lambda: check a Redis/DynamoDB counter before processing, return early if the tenant has exceeded their plan's message rate.
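As a rough illustration of that check, here is a fixed-window counter keyed by tenant. It is a sketch under stated assumptions: the counter lives in an in-memory `Map`, which only works within a single process, so a real deployment would swap it for the Redis or DynamoDB counter mentioned above. The `createRateLimiter` name, the `RateLimit` shape, and the example limits are hypothetical.

```typescript
interface RateLimit {
  maxMessages: number; // messages allowed per window
  windowMs: number;    // window length in milliseconds
}

interface WindowState {
  windowStart: number;
  count: number;
}

// Returns an allow() function that tracks one fixed window per tenant.
function createRateLimiter(limitsByTier: Record<string, RateLimit>) {
  const windows = new Map<string, WindowState>();

  return function allow(
    tenantId: string,
    tier: string,
    messageCount: number,
    now: number = Date.now(),
  ): boolean {
    const limit = limitsByTier[tier];
    if (!limit) return false; // unknown plan: reject rather than guess

    // Align the window to a fixed boundary so all messages in it share a counter.
    const windowStart = Math.floor(now / limit.windowMs) * limit.windowMs;
    const state = windows.get(tenantId);
    const used = state && state.windowStart === windowStart ? state.count : 0;

    if (used + messageCount > limit.maxMessages) return false;
    windows.set(tenantId, { windowStart, count: used + messageCount });
    return true;
  };
}
```

In the processing Lambda, a call such as `allow(tenantId, tenantTier, batch.readings.length)` would run before any writes, and the handler would return early when it yields `false`. A fixed window is the simplest option; it permits short bursts at window boundaries, which a sliding window or token bucket would smooth out.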

Tenant Onboarding Automation

Manual tenant onboarding at scale becomes a bottleneck. Automate it with a service that provisions all required resources:

async function onboardTenant(tenantId: string, plan: TenantPlan) {
  // 1. Create IoT Thing Group for the tenant
  await iot.createThingGroup({ thingGroupName: `tenant-${tenantId}` })

  // 2. Create tenant-scoped IoT policy from template
  await iot.createPolicy({
    policyName: `TenantPolicy-${tenantId}`,
    policyDocument: renderPolicyTemplate(tenantId),
  })

  // 3. Create DynamoDB table (enterprise) or record tenant prefix (starter)
  if (plan === 'enterprise') {
    await createDedicatedTable(tenantId)
  }

  // 4. Create Cognito user pool (or user pool group for pooled model)
  await cognito.createGroup({
    UserPoolId: USER_POOL_ID,
    GroupName: `tenant-${tenantId}`,
  })

  // 5. Record tenant metadata
  await recordTenant({ tenantId, plan, createdAt: Date.now() })
}

With this automation, a new enterprise customer is fully provisioned in under 30 seconds.

The Architecture Diagram

[Tenant A Devices] ──MQTT──┐
[Tenant B Devices] ──MQTT──┤──► AWS IoT Core (shared endpoint)
[Tenant C Devices] ──MQTT──┘         │
                                  IoT Rules
                                      │
                             Lambda (tenant router)
                            /         │          \
                    DynamoDB      S3 Archive    CloudWatch
                  (prefix isolated) (prefix isolated) (per-tenant namespace)
                            \         │          /
                             API Gateway + Authorizer
                                      │
                         ┌────────────┼────────────┐
                    Tenant A UI  Tenant B UI  Tenant C UI

All tenants connect to the shared IoT Core endpoint, where isolation is enforced by IoT policies. The Lambda router validates tenant context on every message. The database uses tenant-prefixed keys for logical isolation. Each tenant's UI authenticates via Cognito and receives a JWT scoped to its own tenantId.

Multi-tenant IoT is complex, but the complexity is manageable when isolation is enforced at multiple layers: broker policy, API gateway authorizer, and database key design.

Need help? [Contact Code Caracal](/contact) — we've shipped these systems for clients across 15+ countries.

Written by CodeCaracal Engineering

We write from production experience — every technique in our articles has been deployed to real clients. No academic theory.
