Production Environment
The production environment serves live customers at onlinemicrotec.com.sa. All deployments to production require a manual approval gate and must occur within defined maintenance windows.
Overview
| Property | Value |
|---|---|
| Environment name | production |
| VNet CIDR | 10.2.0.0/16 |
| Domain | onlinemicrotec.com.sa |
| Branch trigger | main, master, production |
| Auto-deploy | No — manual approval required |
| Approval gate | Azure DevOps Environment approval |
| Deployment window | Monday–Thursday, 08:00–10:00 UTC (Fridays, weekends, and public holidays excluded) |
Production Safeguards
1. Manual Approval Gate
The pipeline stops before deploying to production and waits for approval from a designated approver group:
```yaml
# In deploy-services.yml (production stage)
- stage: DeployProduction
  condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/production'))
  jobs:
    - deployment: DeployToProduction
      environment: 'production' # This environment has approval policies
      strategy:
        runOnce:
          deploy:
            steps:
              - template: deploy-steps.yml
```
Approver group: Microtec-Prod-Approvers (minimum 1 approval required; a request that is not approved within 24 hours is rejected automatically).
2. Deployment Windows
Production deployments are blocked outside the maintenance window via an Azure DevOps environment deployment gate:
- Allowed: Monday–Thursday, 08:00–10:00 UTC
- Blocked: Fridays, weekends, and public holidays
- Emergency deployments: require approval from Engineering Lead + documented incident
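The window rule above can be expressed as a small shell check. This is a minimal sketch rather than the actual Azure DevOps gate: the function name `in_deploy_window` is hypothetical, the 10:00 boundary is assumed to be exclusive, and public holidays are not modelled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the given UTC time falls inside
# the production deployment window (Monday-Thursday, 08:00-10:00 UTC).
# Arguments: ISO weekday (1=Monday .. 7=Sunday) and time of day as HHMM.
in_deploy_window() {
  local dow=$1
  local hhmm=$((10#$2))   # force base 10 so "0830" is not read as octal

  # Fridays (5) and weekends (6, 7) are always blocked.
  (( dow >= 1 && dow <= 4 )) || return 1

  # Assumption: the window is half-open, 08:00 inclusive to 10:00 exclusive.
  (( hhmm >= 800 && hhmm < 1000 ))
}
```

A pipeline step could call it as `in_deploy_window "$(date -u +%u)" "$(date -u +%H%M)" || exit 1`. Holiday handling would need a separate calendar lookup.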
3. Canary / Progressive Rollout
New container revisions are deployed using Azure Container Apps' traffic splitting:
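```bash
# Step 1: Record the revision currently serving all traffic, then deploy the new revision.
# Assumes multiple-revision mode; $(buildId) is expanded by the pipeline before the script runs.
PREVIOUS=$(az containerapp revision list \
  --name mic-erp-gateway \
  --resource-group mic-erp-be-production-apps-public-rg \
  --query "[?properties.trafficWeight==\`100\`].name | [0]" \
  --output tsv)

az containerapp update \
  --name mic-erp-gateway \
  --resource-group mic-erp-be-production-apps-public-rg \
  --revision-suffix "canary-$(buildId)"

# Send 10% of traffic to the new revision and keep 90% on the previous one
az containerapp ingress traffic set \
  --name mic-erp-gateway \
  --resource-group mic-erp-be-production-apps-public-rg \
  --revision-weight latest=10 "$PREVIOUS=90"

# Step 2: Monitor for 30 minutes (automated health check)

# Step 3: If healthy, shift 100% traffic
az containerapp ingress traffic set \
  --name mic-erp-gateway \
  --resource-group mic-erp-be-production-apps-public-rg \
  --revision-weight latest=100
```
4. Pre-Deploy Health Verification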
Before each production deployment, the pipeline verifies that the current production state is healthy:
```bash
# Check all service health endpoints
for service in gateway keycloak accounting notification workflow; do
  STATUS=$(curl -sf "https://onlinemicrotec.com.sa/api/$service/health" | jq -r '.status')
  if [ "$STATUS" != "Healthy" ]; then
    echo "Service $service is not healthy; aborting deployment"
    exit 1
  fi
done
```
Infrastructure Specifications
Compute
```
Public CAE (production-cae-public):
  - Gateway.API
  - Keycloak
  Min replicas: 2 (HA)
  Max replicas: 10

Private CAE (production-cae-private):
  - All 13 backend microservices
  - mTLS enforced
  Min replicas: 1 per service
  Max replicas: 5 per service
```
Data Tier
| Resource | SKU | Redundancy |
|---|---|---|
| Redis Cache | Balanced_B1 (Azure Managed Redis) | Zone-redundant |
| SQL Server | SQL Managed Instance (GP_Gen5, 4 vCores, 32GB) | HA with zone-redundancy |
| Service Bus | Premium | Geo-redundant |
| Blob Storage | ZRS (Zone Redundant) | N/A |
| MongoDB | Azure Cosmos DB | Multi-region |
| Key Vault | Standard | Soft-delete + purge protection |
Networking
| Subnet | CIDR | Usage |
|---|---|---|
| public-apps | 10.2.1.0/24 | Internet-facing CAE |
| private-apps | 10.2.2.0/23 | Internal services CAE |
| appService | 10.2.4.0/24 | App Service integration |
| functionApps | 10.2.5.0/24 | Function App integration |
| private-endpoints | 10.2.6.0/24 | PaaS private endpoints |
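The address plan above can be sanity-checked in the shell: every subnet should sit inside the VNet CIDR (10.2.0.0/16) and no two subnets should overlap. The following is a minimal sketch using only Bash integer arithmetic; the function names (`ip_to_int`, `cidr_range`, `check_subnets`) are hypothetical and not part of the project's tooling.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (10#$a << 24) | (10#$b << 16) | (10#$c << 8) | 10#$d ))
}

# Print the first and last address (as integers) covered by a CIDR block.
cidr_range() {
  local base=${1%/*} prefix=${1#*/}
  local size=$(( 1 << (32 - prefix) ))
  local start=$(( $(ip_to_int "$base") & ~(size - 1) ))
  echo "$start $(( start + size - 1 ))"
}

# Succeed if every subnet lies inside the VNet and no two subnets overlap.
# Usage: check_subnets <vnet-cidr> <subnet-cidr>...
check_subnets() {
  local vnet=$1; shift
  local subnets=("$@") i j vs ve s1 e1 s2 e2
  read -r vs ve <<< "$(cidr_range "$vnet")"
  for (( i = 0; i < ${#subnets[@]}; i++ )); do
    read -r s1 e1 <<< "$(cidr_range "${subnets[i]}")"
    if (( s1 < vs || e1 > ve )); then
      echo "${subnets[i]} is outside $vnet"
      return 1
    fi
    for (( j = i + 1; j < ${#subnets[@]}; j++ )); do
      read -r s2 e2 <<< "$(cidr_range "${subnets[j]}")"
      # Two ranges overlap when each one starts before the other ends.
      if (( s1 <= e2 && s2 <= e1 )); then
        echo "${subnets[i]} overlaps ${subnets[j]}"
        return 1
      fi
    done
  done
  return 0
}
```

For the table above, the call would be `check_subnets 10.2.0.0/16 10.2.1.0/24 10.2.2.0/23 10.2.4.0/24 10.2.5.0/24 10.2.6.0/24`. Note that private-apps is a /23, so it spans 10.2.2.0 through 10.2.3.255, which is why the next subnet starts at 10.2.4.0.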
Monitoring and Alerting
Application Insights
- Resource: mic-erp-be-production-ai
- Log Analytics: mic-erp-be-production-law
- Retention: 90 days (vs 30 days in non-prod)
- Sampling: 10% (to manage volume and cost)
Alert Rules
| Alert | Condition | Severity | Action |
|---|---|---|---|
| Service unavailability | Health check fails 3× in 5 min | Critical | PagerDuty + Teams |
| Response time > 2s | P95 latency > 2000ms | High | Teams |
| Error rate > 1% | 5xx rate > 1% over 5 min | High | Teams |
| Container restart | Restart count > 3 in 10 min | Medium | Teams |
| CPU > 80% | CPU > 80% sustained 5 min | Medium | Teams |
Dashboard
Production health dashboard is available in Azure Portal under:
Resource Group: mic-erp-be-production-monitoring-rg
Dashboard: Microtec ERP Production Health
Incident Response
Runbook Reference
For production incidents, follow the runbook in Devops/runbooks/incident-response.md.
Quick Rollback
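```bash
# Emergency rollback: revert to previous stable revision
# --all is needed because inactive revisions are omitted from the listing by default
PREVIOUS=$(az containerapp revision list \
  --name mic-erp-gateway \
  --resource-group mic-erp-be-production-apps-public-rg \
  --all \
  --query "sort_by([?properties.active==\`false\`], &properties.createdTime)[-1].name" \
  --output tsv)

# An inactive revision must be reactivated before it can receive traffic
az containerapp revision activate \
  --name mic-erp-gateway \
  --resource-group mic-erp-be-production-apps-public-rg \
  --revision "$PREVIOUS"

az containerapp ingress traffic set \
  --name mic-erp-gateway \
  --resource-group mic-erp-be-production-apps-public-rg \
  --revision-weight "$PREVIOUS=100"
```
Escalation Path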
```
On-call developer
└── Engineering Lead (30 min SLA)
    └── VP Engineering (60 min SLA)
        └── CEO notification (P0 only)
```
Production-Specific Configuration
The following production overrides in services-config.json differ from all other environments:
```json
{
  "environment": "production",
  "minReplicas": 2,
  "maxReplicas": 10,
  "enableMtls": true,
  "serviceBusTier": "premium",
  "redisTier": "premium",
  "mongoBackend": "cosmos",
  "requireApprovalGate": true,
  "deploymentWindowCheck": true
}
```
Security Controls (Production-Only)
| Control | Implementation |
|---|---|
| WAF | Azure Front Door Premium with OWASP rule set |
| DDoS | Azure DDoS Protection Standard |
| TLS | TLS 1.2 minimum; TLS 1.3 preferred |
| mTLS | Enforced between all private CAE services |
| Managed Identity | All services use user-assigned MI (no SAS tokens or API keys) |
| Network ACLs | All PaaS services reachable only via private endpoints |
| Key Vault | Soft-delete 90 days, purge protection enabled |
| SQL Auditing | Enabled; logs written to a storage account with 90-day retention |
Production Access Policy
- No developer has direct access to the production database; all changes go through migration pipelines
- No direct reads from the production Key Vault; secrets are injected via Container Apps environment variables
- SSH to SQL VM: Only permitted for the DBA role via Just-In-Time (JIT) access in Defender for Cloud
- Production ACR: Images pulled only by the production managed identity; no human push access