Ensuring Data Integrity and Handling Failures in OTA Firmware Updates
In previous episodes, we explored OTA architecture, partitioning, and security. But even with perfect encryption and boot validation, there’s another crucial layer: ensuring data integrity and safely handling failures — especially in constrained BLE environments.
If a firmware update is interrupted or corrupted, and your system doesn’t catch it in time, you risk bricking the device.
This episode will break down:
- How to detect data corruption
- How to fail gracefully and recover
- How real-world products and platforms like Fitbit and STM32WB do it
- Techniques like CRC, SHA, status flags, watchdog resets, and rollback
What Can Go Wrong in OTA?
| Risk | Example Case |
|---|---|
| Packet loss or corruption | BLE dropout mid-transfer |
| Power loss during write/swap | Battery dies mid-flash |
| Partial update | Transfer aborted but metadata was updated |
| Flash write errors | Misaligned writes or ECC failure |
| Wrong or tampered firmware | Firmware modified post-download |
Verifying Data Integrity
Data integrity checks ensure what was received is what was expected — before flashing or booting.
1. CRC (Cyclic Redundancy Check)
- Lightweight and fast
- Usually CRC16 or CRC32
- Verified during:
- End of OTA transfer
- Bootloader check
bool crc_check_passed = calculate_crc32(image) == expected_crc32;
Example:
- Fitbit and Nordic DFU protocols attach CRC32 to each chunk and to the entire image.
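To make the CRC check concrete, here is a minimal sketch of a bitwise CRC-32 using the common reflected polynomial 0xEDB88320 (the zlib/Ethernet variant). The function name `crc32_update` is an assumption for illustration and is not taken from any vendor SDK. It accepts a running value so the same routine can cover a single chunk or be accumulated across the whole image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320).
 * Pass 0 as `crc` for the first chunk, then feed the previous return
 * value back in to extend the checksum over later chunks. */
uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    crc = ~crc;  /* undo the final inversion of the previous call */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, otherwise zero */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}
```

A table-driven version is faster but costs 1 KB of flash for the lookup table; on a small MCU with a hardware CRC unit, that peripheral would normally replace this loop.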
2. SHA-256 Hash Check
- Stronger, cryptographic hash
- Slower, but more robust
- Used for:
- Final image validation before reboot
- Signature verification
sha256(image, length, hash_out);
if (memcmp(hash_out, expected_hash, 32) != 0) {
    return IMAGE_CORRUPTED;
}
Example:
- STM32WB + SBSFU, ESP32 Secure Boot, and MCUBoot all use SHA-256 for validating update images.
3. Signature Check = Integrity + Authenticity
If your update is signed (ECC/RSA), the signature validates both the integrity and authenticity of the image.
- The signature is typically computed over the image's SHA-256 hash, so if the hash does not match, the signature check fails too.
- If someone tampers with the firmware post-signing, bootloader blocks execution.
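The order of checks can be sketched as follows: header sanity first, then the hash over the payload, then the signature over that hash. Everything here is hypothetical (the header layout, `IMG_MAGIC`, `image_verify`, and the callback types). In particular, `toy_hash` and `toy_verify` are NOT cryptography; they are stand-ins so the control flow can be exercised on a host, and a real bootloader would plug in SHA-256 and ECDSA/RSA from its crypto library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMG_MAGIC 0x4F544131u  /* hypothetical header magic */

typedef struct {
    uint32_t magic;
    uint32_t length;         /* payload length in bytes */
    uint8_t  digest[32];     /* hash of the payload */
    uint8_t  signature[64];  /* signature over the digest */
} image_header_t;

typedef void (*hash_fn)(const uint8_t *data, size_t len, uint8_t out[32]);
typedef int  (*verify_fn)(const uint8_t digest[32], const uint8_t sig[64]);

typedef enum { IMG_OK, IMG_BAD_HEADER, IMG_CORRUPTED, IMG_UNTRUSTED } img_result_t;

/* Compare digests without an early exit, so timing does not leak
 * how many leading bytes matched. */
int digest_equal(const uint8_t a[32], const uint8_t b[32])
{
    uint8_t diff = 0;
    for (size_t i = 0; i < 32; i++) {
        diff |= (uint8_t)(a[i] ^ b[i]);
    }
    return diff == 0;
}

img_result_t image_verify(const image_header_t *hdr, const uint8_t *payload,
                          size_t max_len, hash_fn hash, verify_fn verify)
{
    assert(hdr != NULL && payload != NULL && hash != NULL && verify != NULL);

    if (hdr->magic != IMG_MAGIC || hdr->length == 0 || hdr->length > max_len) {
        return IMG_BAD_HEADER;
    }
    uint8_t computed[32];
    hash(payload, hdr->length, computed);
    if (!digest_equal(computed, hdr->digest)) {
        return IMG_CORRUPTED;   /* integrity failure */
    }
    if (verify(hdr->digest, hdr->signature) != 1) {
        return IMG_UNTRUSTED;   /* authenticity failure */
    }
    return IMG_OK;
}

/* Placeholder "hash": XOR-folds the data into 32 bytes. Not secure. */
void toy_hash(const uint8_t *data, size_t len, uint8_t out[32])
{
    for (size_t i = 0; i < 32; i++) out[i] = 0;
    for (size_t i = 0; i < len; i++) out[i % 32] ^= data[i];
}

/* Placeholder "signature check": expects digest bytes XOR 0xA5. Not secure. */
int toy_verify(const uint8_t digest[32], const uint8_t sig[64])
{
    for (size_t i = 0; i < 64; i++) {
        if (sig[i] != (uint8_t)(digest[i % 32] ^ 0xA5u)) return 0;
    }
    return 1;
}
```

Separating `IMG_CORRUPTED` from `IMG_UNTRUSTED` is a design choice: a corrupted transfer is worth retrying, while a bad signature usually is not.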
Handling OTA Failures
OTA should never leave your device in a broken state. That’s where rollback and retry mechanisms come in.
1. Watchdog Timers for First Boot
- Start a watchdog timer right after the OTA reboot
- If the app crashes or fails to feed the watchdog in time, the device resets and the bootloader flags the update as failed
start_watchdog(5000); // timeout in milliseconds
app_code(); // if this hangs, the watchdog triggers a reboot
Example:
- MCUboot boots a new image in a “pending” (test) state. It becomes “permanent” only once the running application confirms it; otherwise the bootloader reverts on the next reset.
- Fitbit uses a boot signal sent from app to bootloader via shared flash flag.
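A generic way to combine the watchdog with a confirmation signal is a trial-boot counter: the bootloader counts unconfirmed boots of the new image and gives up after a limit. The sketch below is an illustration only, not MCUboot's or Fitbit's actual mechanism; the type names and `MAX_TRIAL_BOOTS` are assumptions, and on real hardware the `trial_boot_t` record would live in flash or retained RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRIAL_BOOTS 3u  /* hypothetical limit on unconfirmed boots */

typedef enum { BOOT_CONFIRMED, BOOT_PENDING, BOOT_FAILED } boot_state_t;
typedef enum { SLOT_OLD, SLOT_NEW } slot_t;

typedef struct {
    boot_state_t state;     /* state of the newly installed image */
    uint8_t      attempts;  /* boots of the new image so far */
} trial_boot_t;

/* Called by the bootloader on every reset, including watchdog resets. */
slot_t bootloader_pick_slot(trial_boot_t *tb)
{
    assert(tb != NULL);

    switch (tb->state) {
    case BOOT_CONFIRMED:
        return SLOT_NEW;
    case BOOT_PENDING:
        if (tb->attempts >= MAX_TRIAL_BOOTS) {
            tb->state = BOOT_FAILED;  /* never confirmed: give up */
            return SLOT_OLD;
        }
        tb->attempts++;
        return SLOT_NEW;
    case BOOT_FAILED:
    default:
        return SLOT_OLD;
    }
}

/* Called by the application once it considers itself healthy,
 * for example after its self-test passes and BLE is advertising. */
void app_confirm_boot(trial_boot_t *tb)
{
    assert(tb != NULL);

    if (tb->state == BOOT_PENDING) {
        tb->state = BOOT_CONFIRMED;
    }
}
```

If the new firmware hangs, the watchdog resets the device before `app_confirm_boot` runs, so each hang consumes one attempt until the bootloader falls back to the old slot.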
2. Boot Status Flags in Flash
- Use a reserved flash page or option byte to store OTA status:
- OTA_PENDING
- OTA_SUCCESS
- OTA_FAILED
Bootloader Logic:
if (ota_status == OTA_PENDING) {
    if (firmware_valid()) {
        set_ota_status(OTA_SUCCESS);
    } else {
        rollback_to_previous_image();
    }
}
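One way to make such a status flag survive power loss is to pick encodings where every state transition only clears bits, since erased NOR flash reads as all ones and programming can only turn ones into zeros. The values and the `ota_status_write` helper below are assumptions for illustration, with a RAM byte standing in for the flash cell. The sketch also assumes the flash allows a location to be programmed more than once between erases; some ECC-protected flash does not, in which case each state would need its own word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Each later state is reachable from the earlier ones by clearing bits only,
 * so no erase cycle is needed to move forward. */
#define OTA_STATUS_ERASED  0xFFu  /* no update in progress */
#define OTA_STATUS_PENDING 0xFEu
#define OTA_STATUS_SUCCESS 0xFCu
#define OTA_STATUS_FAILED  0xF8u

/* True if `next` can be programmed over `current` without an erase,
 * i.e. `next` sets no bit that is already cleared in `current`. */
bool flash_can_program(uint8_t current, uint8_t next)
{
    return (uint8_t)(current & next) == next;
}

/* Simulated flash write: programming behaves like a bitwise AND. */
bool ota_status_write(uint8_t *cell, uint8_t next)
{
    assert(cell != NULL);

    if (!flash_can_program(*cell, next)) {
        return false;  /* would require an erase: refuse */
    }
    *cell &= next;
    return true;
}
```

Because moving backwards (for example from success to pending) is refused, a torn write can leave the flag in an earlier or later valid state but cannot silently resurrect an old one.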
3. Rollback to Previous Image
Devices with dual-slot partitioning (Episode 3) can revert if the new image fails validation or boot.
Steps:
- Keep last known good firmware in App Slot A
- OTA installs new image to Slot B
- If Slot B fails, bootloader rolls back to Slot A
Example:
- Fitbit, Oura Ring, and Amazon Echo Buds all support rollback using similar logic.
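The rollback steps above can be sketched as a slot-selection function that the bootloader runs on every reset. The `slot_info_t` fields and `select_boot_slot` are hypothetical; a real bootloader would derive them from image headers and status flags rather than a plain struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool     present;   /* slot holds an image */
    bool     verified;  /* passed the CRC/hash/signature checks */
    bool     failed;    /* marked bad after a failed trial boot */
    uint32_t version;   /* monotonically increasing firmware version */
} slot_info_t;

bool slot_bootable(const slot_info_t *s)
{
    return s->present && s->verified && !s->failed;
}

/* Returns 0 for Slot A, 1 for Slot B, or -1 if neither can be booted. */
int select_boot_slot(const slot_info_t *a, const slot_info_t *b)
{
    assert(a != NULL && b != NULL);

    bool a_ok = slot_bootable(a);
    bool b_ok = slot_bootable(b);

    if (a_ok && b_ok) {
        return (b->version > a->version) ? 1 : 0;  /* prefer the newer image */
    }
    if (b_ok) return 1;
    if (a_ok) return 0;
    return -1;  /* nothing bootable: stay in the bootloader / recovery mode */
}
```

The fallback only works because Slot A is left untouched until Slot B has proven itself, which is why the old image must not be erased early.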
4. Resume Interrupted Updates
BLE transfers are fragile. Your OTA logic should:
- Allow resume from last good chunk
- Validate each chunk’s CRC/hash
- Avoid rewriting already validated blocks
Nordic DFU Example:
// Example: the link drops after chunk 62 has been received and validated.
// On reconnect, the device responds with the last received offset and the app resumes from chunk 63.
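On the device side, resume support can be as small as tracking the next expected offset and refusing anything else. The following is a sketch under assumptions, not the Nordic DFU protocol itself: `ota_rx_t`, `ota_rx_chunk`, and `IMAGE_MAX` are hypothetical names, a RAM buffer stands in for the flash slot, and the per-chunk CRC-32 is the common reflected 0xEDB88320 variant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IMAGE_MAX 256u  /* hypothetical staging area size */

typedef struct {
    uint8_t  staging[IMAGE_MAX];  /* stands in for the flash slot */
    uint32_t next_offset;         /* first byte not yet validated and stored */
} ota_rx_t;

/* One-shot bitwise CRC-32 over a single chunk. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Stores a chunk only if it is the next one expected and its CRC matches.
 * On false, the sender should ask for the resume offset and retry from there. */
bool ota_rx_chunk(ota_rx_t *rx, uint32_t offset,
                  const uint8_t *data, uint32_t len, uint32_t crc)
{
    assert(rx != NULL && data != NULL);

    if (offset != rx->next_offset) return false;             /* gap or duplicate */
    if (len == 0 || len > IMAGE_MAX - offset) return false;  /* would overflow */
    if (crc32_calc(data, len) != crc) return false;          /* corrupted chunk */

    memcpy(&rx->staging[offset], data, len);
    rx->next_offset += len;  /* advance only after the chunk is safely stored */
    return true;
}

/* Reported to the phone app after a reconnect. */
uint32_t ota_rx_resume_offset(const ota_rx_t *rx)
{
    assert(rx != NULL);
    return rx->next_offset;
}
```

To survive a power loss as well as a BLE dropout, `next_offset` would have to be persisted or reconstructed from what is already in flash; this sketch keeps it in RAM only.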
Practical Design Tips
| Tip | Why It Helps |
|---|---|
| Always verify hash or CRC before boot | Prevents corrupted firmware from being executed |
| Use boot status flags | Tracks update status across reboots |
| Don’t erase old firmware until success | Enables rollback |
| Use watchdog timer after OTA | Catches faulty first boots |
| Use power-fail-safe flash writing logic | Avoids half-burned sectors |
| Keep metadata separate from OTA partitions | Prevents accidental overwrite |
Real-World Snapshot: Fitbit OTA
- OTA update via BLE
- Each packet validated with CRC
- Final image validated with SHA-256
- Bootloader only swaps if update status is OK
- Rollback on boot error or hash mismatch
- Watchdog used for first-boot crash detection
Failure Handling in STM32WB (with SBSFU)
- SBSFU marks OTA status in a reserved flash area
- Verifies firmware header (hash + signature)
- Boots new image only if verified
- Resets to old image if first boot fails or watchdog triggers
- Optional: tamper-detection logic or hardware reset control
Conclusion
A secure OTA update is not just about encryption — it’s about ensuring the firmware was delivered, validated, and installed safely. If something fails, your device must recover gracefully, not silently fail or brick.
Handling failures is a mark of a mature, production-grade OTA system.
