Skip to content

Runbook: Database Backup and Restore

Audience: DevOps engineers, on-call engineers
Prerequisites: Azure CLI, SQL Server Management Studio or Azure Data Studio, Owner access on the SQL resource group


Overview

Microtec ERP uses Azure SQL with the database-per-tenant model. Each tenant has an independently named database on the shared SQL server (mic-backend-shared-sql). Backups are managed at two levels:

  1. Automated Azure SQL backups — full, differential, and log backups managed by Azure (default: 7–35 day retention)
  2. Manual exports.bacpac files for tenant migration or long-term archival

Automated Azure SQL Backups

Azure SQL automatically performs:

Backup TypeFrequencyRetention
Full backupWeeklyConfigurable (7–35 days)
Differential backupEvery 12–24 hoursIncluded in retention window
Transaction log backupEvery 5–12 minutesIncluded in retention window

These backups enable point-in-time restore (PITR) to any second within the retention window.

Current Retention Configuration

EnvironmentRetention Days
dev7
stage14
preprod14
uat14
production35

Verify Backup Configuration

bash
ENV=production
SQL_SERVER="mic-backend-shared-sql"
RG="mic-backend-shared-sql-rg"

# List all databases and their backup retention
az sql db list \
  --server $SQL_SERVER \
  --resource-group $RG \
  --query "[].{name:name, pitrDays:backupRetentionDays}" \
  -o table

Adjust Retention (production change)

bash
# Example: extend retention for a specific tenant database
az sql db update \
  --server $SQL_SERVER \
  --resource-group $RG \
  --name "tenant_acme_production" \
  --backup-storage-redundancy Geo \
  --retention-days 35

Point-in-Time Restore

Use PITR when a tenant's data was accidentally corrupted or deleted.

Step 1 — Identify the restore point

Determine the timestamp (UTC) before the data corruption occurred. Get this from:

  • The incident report / tenant complaint
  • Azure Monitor audit logs
  • Application Seq structured logs

Step 2 — Restore to a new database

Never restore over the production database

Always restore to a new database name (e.g., tenant_acme_production_restore_20260530). Validate the restored data before switching the tenant's connection string.

bash
SQL_SERVER="mic-backend-shared-sql"
RG="mic-backend-shared-sql-rg"
SOURCE_DB="tenant_acme_production"
TARGET_DB="tenant_acme_production_restore_$(date +%Y%m%d%H%M)"
RESTORE_POINT="2026-05-30T08:00:00Z"   # UTC timestamp before corruption

az sql db restore \
  --server $SQL_SERVER \
  --resource-group $RG \
  --dest-name $TARGET_DB \
  --name $SOURCE_DB \
  --time $RESTORE_POINT \
  --edition GeneralPurpose \
  --family Gen5 \
  --capacity 2

Expected duration: 5–30 minutes depending on database size.

Step 3 — Validate the restored database

Connect to the restored database and verify the expected data state:

bash
# Using sqlcmd (on any server with sqlcmd installed)
sqlcmd \
  -S "mic-backend-shared-sql.database.windows.net" \
  -d "$TARGET_DB" \
  -U sqladmin \
  -P '<password>' \
  -Q "SELECT COUNT(*) AS invoices FROM Invoices; SELECT MAX(CreatedAt) AS latestInvoice FROM Invoices;"

Step 4 — Switch the tenant's connection string

Once the restored database is validated:

  1. Update the tenant's connection string in the admin database:
sql
-- Run on the admin_db database
UPDATE Tenants
SET ConnectionString = 'Server=mic-backend-shared-sql.database.windows.net;Database=tenant_acme_production_restore_20260530;...'
WHERE Code = 'acme';
  1. Clear the Redis cache for the tenant's connection string resolution:
bash
# Redis CLI (use the correct Redis instance)
redis-cli DEL "tenant:connstr:acme"
  1. Verify the tenant can log in and see correct data.

Step 5 — Clean up the corrupted database

After confirming the restore is good and the tenant is operational:

bash
# Archive the corrupted database first (rename)
az sql db rename \
  --server $SQL_SERVER \
  --resource-group $RG \
  --name "tenant_acme_production" \
  --new-name "tenant_acme_production_corrupted_$(date +%Y%m%d)"

# Delete it after 14 days if no further investigation is needed
az sql db delete \
  --server $SQL_SERVER \
  --resource-group $RG \
  --name "tenant_acme_production_corrupted_20260530" \
  --yes

Manual Export (.bacpac)

Use .bacpac exports for:

  • Tenant offboarding (hand over their data)
  • Long-term cold archival beyond the 35-day retention window
  • Database migration to a different SQL server

Export to Azure Blob Storage

bash
STORAGE_ACCOUNT="micerpbeproductionsa"
CONTAINER="db-exports"
EXPORT_FILE="tenant_acme_$(date +%Y%m%d).bacpac"

az sql db export \
  --server $SQL_SERVER \
  --resource-group $RG \
  --name "tenant_acme_production" \
  --admin-user "sqladmin" \
  --admin-password "<password>" \
  --storage-key-type StorageAccessKey \
  --storage-key "$(az storage account keys list --account-name $STORAGE_ACCOUNT --query '[0].value' -o tsv)" \
  --storage-uri "https://${STORAGE_ACCOUNT}.blob.core.windows.net/${CONTAINER}/${EXPORT_FILE}"

Expected duration: 5–60 minutes depending on database size. The command returns an operation ID; track progress:

bash
# The export command prints an operation URL — poll it:
az sql db operation list \
  --server $SQL_SERVER \
  --resource-group $RG \
  --database "tenant_acme_production" \
  --query "[?operationType=='Export'].{state:state, percentComplete:percentComplete}" \
  -o table

Import from .bacpac

bash
TARGET_DB="tenant_acme_restored_from_bacpac"
BACPAC_URL="https://${STORAGE_ACCOUNT}.blob.core.windows.net/${CONTAINER}/tenant_acme_20260530.bacpac"

az sql db import \
  --server $SQL_SERVER \
  --resource-group $RG \
  --name $TARGET_DB \
  --admin-user "sqladmin" \
  --admin-password "<password>" \
  --storage-key-type StorageAccessKey \
  --storage-key "$(az storage account keys list --account-name $STORAGE_ACCOUNT --query '[0].value' -o tsv)" \
  --storage-uri $BACPAC_URL

Admin Database Backup

The admin database (admin_db on the shared SQL server) contains the tenant registry and connection strings. It must be restored before any tenant can log in after a disaster recovery scenario.

Priority restore target

In a disaster recovery event, restore the admin database first. Without it, no tenant connection string can be resolved and the entire platform is non-functional.

Admin database restore follows the same PITR process above. Restore target: admin_db_restore_$(date).

After restoring:

  1. Run all pending EF Core migrations on the restored database
  2. Update the BusinessOwners service connection string in Key Vault to point to the restored database
  3. Restart the BusinessOwners container app
  4. Flush Redis cache: FLUSHDB on the tenant cache Redis instance

Long-Term Retention (LTR)

For production databases, configure Long-Term Retention beyond the 35-day PITR window:

bash
# Enable weekly LTR backup for up to 1 year
az sql db ltr-policy set \
  --server $SQL_SERVER \
  --resource-group $RG \
  --database "tenant_acme_production" \
  --weekly-retention P52W    # 52 weeks = 1 year

List LTR backups:

bash
az sql db ltr-backup list \
  --location uksouth \
  --server $SQL_SERVER \
  --database "tenant_acme_production" \
  --query "[].{name:name, backupTime:backupTime}" \
  -o table

Backup Monitoring and Alerting

Azure Monitor alerts are configured to fire when:

  • PITR backup health degrades for any production database
  • A database enters an error state

Check backup health via Azure Portal → SQL Server → Manage Backups → PITR tab. All databases should show a green status and a backup time within the last 6 hours.


Checklist (Restore Operation)

  • [ ] Incident timeline documented; corruption timestamp identified
  • [ ] Restore target name chosen (with timestamp suffix, NOT overwriting production)
  • [ ] PITR restore initiated with correct timestamp
  • [ ] Restored database validated by querying key tables
  • [ ] Tenant connection string updated in admin_db
  • [ ] Redis cache cleared for affected tenant(s)
  • [ ] Tenant logged in and confirmed correct data state
  • [ ] Corrupted database renamed (not deleted) for 14-day hold
  • [ ] Incident report filed with root cause

Internal Documentation — Microtec Platform Team