Backups that have been restored at least once.

Microsoft 365, Azure, and on-prem workloads tiered by recovery objective. Backups configured, restores rehearsed on a calendar, decision rights agreed before the incident. Recovery is the thing that matters, so we test it.

Request a recovery review · Customer stories

Recovery · tiered SLA

Tier 0: RPO ≤ 15 min · RTO ≤ 1 h

Tier 1: RPO ≤ 1 h · RTO ≤ 4 h

Tier 2: RPO ≤ 4 h · RTO ≤ 24 h

Tier 3: RPO ≤ 24 h · RTO ≤ 72 h

4 tiers · rehearsed on a calendar

recovery.tiers.ts

Tiered recovery posture

export const recoveryTiers = {
  tier0: { rpo: '15m', rto: '1h', rehearsal: 'monthly' },
  tier1: { rpo: '1h', rto: '4h', rehearsal: 'quarterly' },
  tier2: { rpo: '4h', rto: '24h', rehearsal: 'quarterly' },
  tier3: { rpo: '24h', rto: '72h', rehearsal: 'annual' },
}
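A tier map like this becomes useful once a timed restore can be checked against it. The following is a minimal sketch, assuming the duration strings only ever use the `m` and `h` suffixes shown above and that targets are inclusive (≤). The helper names `toMinutes`, `metRto`, and `metRpo` are hypothetical and not part of any published tooling.

```typescript
// Tier values copied from recovery.tiers.ts; everything below them is illustrative.
const recoveryTiers = {
  tier0: { rpo: '15m', rto: '1h', rehearsal: 'monthly' },
  tier1: { rpo: '1h', rto: '4h', rehearsal: 'quarterly' },
  tier2: { rpo: '4h', rto: '24h', rehearsal: 'quarterly' },
  tier3: { rpo: '24h', rto: '72h', rehearsal: 'annual' },
} as const;

type TierName = keyof typeof recoveryTiers;

// Convert '15m' or '4h' to minutes. Assumes only the m/h suffixes used above.
function toMinutes(duration: string): number {
  const match = /^(\d+)(m|h)$/.exec(duration);
  if (!match) {
    throw new Error(`Unrecognised duration: ${duration}`);
  }
  const value = Number(match[1]);
  return match[2] === 'h' ? value * 60 : value;
}

// True when a timed restore finished inside the tier's RTO.
function metRto(tier: TierName, measuredMinutes: number): boolean {
  return measuredMinutes <= toMinutes(recoveryTiers[tier].rto);
}

// True when the newest restorable backup is no older than the tier's RPO.
function metRpo(tier: TierName, backupAgeMinutes: number): boolean {
  return backupAgeMinutes <= toMinutes(recoveryTiers[tier].rpo);
}
```

With these helpers, `metRto('tier0', 47)` evaluates to `true` and `metRto('tier0', 61)` to `false`, so a rehearsal timing report can be reduced to a pass or fail per tier.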

What we keep seeing

The backup is fine. The restore is the problem.

Four patterns we see in tenants that have backups but have never actually exercised recovery.

01
Backup jobs green, restore unverified
Every workload shows a healthy backup, but no one in the org has ever timed a full restore. The first restore is the incident.
02
M365 'Microsoft has backup' assumption
Native retention is not backup. Deleted mailboxes age out, SharePoint version history is finite, Teams chat retention defaults are short. Surprises come at audit time.
03
RPO / RTO never agreed with the business
IT picked a backup product. The business never said what an hour of downtime costs for which workload. Targets are unmeasured because they're unspoken.
04
Decision owner unknown at 2am
When something goes wrong, the org spends the first hour deciding who decides — fail over, restore in place, declare incident, or wait. The clock runs the whole time.

Tiered recovery

Four tiers, agreed once, exercised on a calendar.

Every workload lands on one of four tiers. The tier defines the RPO, the RTO, the mechanism, and how often we prove it works. This table is the working document — not a marketing artefact.

| Tier | Workloads | RPO | RTO | Mechanism | Rehearsal cadence |
|---|---|---|---|---|---|
| Tier 0 | Identity (Entra ID), financial systems of record, customer-facing transactional apps. | ≤ 15 min | ≤ 1 h | Azure Site Recovery + geo-redundant backup + Entra ID hardening | Monthly full restore + quarterly region failover |
| Tier 1 | Email, collaboration (Teams, SharePoint), ERP, line-of-business apps with daily transactions. | ≤ 1 h | ≤ 4 h | Veeam for M365 + Azure Backup + immutable retention | Monthly workload restore + quarterly failover drill |
| Tier 2 | Internal tooling, reporting databases, secondary file shares, knowledge bases. | ≤ 4 h | ≤ 24 h | Azure Backup with weekly synthetic full + offsite copy | Quarterly file-level restore |
| Tier 3 | Dev/test environments, archival, low-change reference data. | ≤ 24 h | ≤ 72 h | Azure Backup standard tier + long-term retention | Annual restore verification |

Tier assignment is a business conversation, not an IT one. We facilitate it; the customer signs off.
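The tier register that comes out of that conversation can be kept as plain data and checked mechanically. This is a sketch under assumptions: the `TierRegisterEntry` shape, the `signedOffBy` field, and the `findRegisterGaps` function are hypothetical names chosen for illustration, not an existing schema.

```typescript
type Tier = 0 | 1 | 2 | 3;

interface TierRegisterEntry {
  workload: string;
  tier: Tier;
  // Business owner who approved the tier; null until signed off.
  signedOffBy: string | null;
}

interface RegisterGaps {
  unassigned: string[]; // inventory workloads with no register entry
  duplicated: string[]; // workloads that appear in more than one entry
  unsigned: string[];   // entries still waiting for business sign-off
}

// Compare a workload inventory against the register and list what is missing.
function findRegisterGaps(
  inventory: string[],
  register: TierRegisterEntry[],
): RegisterGaps {
  const counts = new Map<string, number>();
  for (const entry of register) {
    counts.set(entry.workload, (counts.get(entry.workload) ?? 0) + 1);
  }
  return {
    unassigned: inventory.filter((workload) => !counts.has(workload)),
    duplicated: Array.from(counts.keys()).filter(
      (workload) => (counts.get(workload) ?? 0) > 1,
    ),
    unsigned: register
      .filter((entry) => entry.signedOffBy === null)
      .map((entry) => entry.workload),
  };
}
```

An empty result in all three lists is the condition the text describes: every workload lands on exactly one tier, and the business has signed off on each assignment.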

Rehearsal cadence

What gets exercised, when, and what it produces.

A backup that has never been restored is a hypothesis. We turn it into a fact on a published schedule, and we keep the artefacts so the next auditor doesn't have to take our word for it.

Monthly

Muscle memory

File-level restore from Microsoft 365 backup (random sample)
  Lead: Foetron Ops
  Artefact: Restore log + checksum diff vs source

Tier 0 workload point-in-time restore to isolated subscription
  Lead: Foetron Ops + Customer IT lead
  Artefact: Timing report + restored-app smoke test result

Backup job health audit: coverage, success rate, retention drift
  Lead: Foetron Ops
  Artefact: Monthly coverage report shared with customer

Quarterly

Failover muscle

Tier 0/1 region failover drill into secondary Azure region
  Lead: Foetron + Customer exec sponsor
  Artefact: Failover runbook with measured RTO + lessons captured

Identity recovery rehearsal: Entra ID + Conditional Access restore
  Lead: Foetron IR + Customer IT lead
  Artefact: Tested runbook for break-glass identity recovery

Cross-team tabletop on a chosen failure scenario
  Lead: Foetron facilitator
  Artefact: Tabletop notes + decision-gate updates

Annual

End-to-end proof

Full DR scenario: declared incident, decision gates, failover, cutback
  Lead: Foetron + Customer exec sponsor
  Artefact: End-to-end DR report; tier RTO/RPO targets re-validated

Tier 3 archival restore verification (samples)
  Lead: Foetron Ops
  Artefact: Restore log; retention policy reaffirmed

Tier assignment review with business owners
  Lead: Customer exec sponsor; Foetron facilitates
  Artefact: Updated tier register signed off by business

Cadence is published; rehearsals are calendared 12 months out. Skipped rehearsals are reported, not hidden.
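A 12-month rehearsal calendar with skipped-slot reporting can be generated from the cadence alone. The sketch below assumes the first slot falls in the starting month and that months are written as `YYYY-MM` strings. `buildCalendar` and `skippedRehearsals` are hypothetical names used for illustration.

```typescript
type Cadence = 'monthly' | 'quarterly' | 'annual';

interface RehearsalSlot {
  name: string;
  cadence: Cadence;
  dueMonth: string; // 'YYYY-MM'
  completed: boolean;
}

const intervalMonths: Record<Cadence, number> = {
  monthly: 1,
  quarterly: 3,
  annual: 12,
};

// Build the slots for the 12 months starting at startYear/startMonth (1-12).
function buildCalendar(
  name: string,
  cadence: Cadence,
  startYear: number,
  startMonth: number,
): RehearsalSlot[] {
  const slots: RehearsalSlot[] = [];
  for (let offset = 0; offset < 12; offset += intervalMonths[cadence]) {
    const index = startMonth - 1 + offset;
    const year = startYear + Math.floor(index / 12);
    const month = (index % 12) + 1;
    const dueMonth = `${year}-${month < 10 ? '0' : ''}${month}`;
    slots.push({ name, cadence, dueMonth, completed: false });
  }
  return slots;
}

// Slots whose month has already passed without a completed rehearsal.
// 'YYYY-MM' strings sort chronologically, so plain string comparison works.
function skippedRehearsals(
  calendar: RehearsalSlot[],
  currentMonth: string,
): RehearsalSlot[] {
  return calendar.filter(
    (slot) => !slot.completed && slot.dueMonth < currentMonth,
  );
}
```

Because skipped slots are derived from the calendar rather than entered by hand, a missed rehearsal shows up in the report by default.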

What we don't do

We don't promise nines we can't prove. We tier workloads, rehearse the restore, and tell you when the calendar slipped.

Recovery posture is operational, not aspirational. The proof is in last month's restore log.

Restore path

What actually happens between incident and cutback.

Five steps. Each one has an owner, a decision, an expected duration, and an artefact it produces. The path is the same whether it's a deleted folder or a region failure — only the scale changes.

01 Incident
  Duration: 0–15 min
  Owner: Whoever notices · Foetron NOC paged
  Decision / artefact: Confirm scope; classify by tier

02 Decision
  Duration: 10–30 min
  Owner: Customer exec sponsor
  Decision / artefact: Failover vs in-place restore vs wait-and-monitor

03 Failover / Restore
  Duration: 30 min – 4 h
  Owner: Foetron Ops + Customer IT lead
  Decision / artefact: Execute runbook; pause at validation gate

04 Validate
  Duration: 30–90 min
  Owner: Customer business owner
  Decision / artefact: Confirm workload usable; sign off before announcing recovery

05 Cutback
  Duration: Scheduled
  Owner: Foetron + Customer exec sponsor
  Decision / artefact: Return to primary on a planned window; postmortem within 7 days

The decision step is the one most orgs skip. We rehearse it explicitly so the exec sponsor isn't the bottleneck on the night.
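The five-step path can be modelled as a small ordered log that refuses to skip a step and measures the time between any two of them. This is an illustrative sketch only. The `RestorePath` class, its method names, and the minute-based timestamps are assumptions, not a description of the actual runbook tooling.

```typescript
const steps = [
  'incident',
  'decision',
  'failover-restore',
  'validate',
  'cutback',
] as const;

type Step = (typeof steps)[number];

interface StepRecord {
  step: Step;
  owner: string;
  startedAtMinute: number; // minutes since the incident was first noticed
}

class RestorePath {
  private log: StepRecord[] = [];

  // Record entering a step. Steps must be entered in order, none skipped.
  enter(step: Step, owner: string, minute: number): void {
    const expected = steps[this.log.length];
    if (step !== expected) {
      throw new Error(`Expected step "${expected}", got "${step}"`);
    }
    this.log.push({ step, owner, startedAtMinute: minute });
  }

  // Minutes between entering one step and entering another.
  elapsed(from: Step, to: Step): number {
    return this.minuteOf(to) - this.minuteOf(from);
  }

  private minuteOf(step: Step): number {
    for (const record of this.log) {
      if (record.step === step) {
        return record.startedAtMinute;
      }
    }
    throw new Error(`Step "${step}" has not been entered yet`);
  }
}
```

Calling `elapsed('incident', 'failover-restore')` on a recorded path gives the time from first notice to the failover call, which is the figure a rehearsal of the decision step is meant to drive down.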

Recent recovery work

Verified restores, on the record.

One representative engagement. Customer name held back; outcomes signed off by their CIO.

Mid-market financial services · India

Ransomware tabletop turned into a real restore — and the runbook held.

The customer ran nightly backups but had never timed a restore. We tiered their workloads, set up Veeam for M365 + Azure Backup with immutability, and rehearsed Tier 0 monthly. Six months in, a contained ransomware event hit a file server. The Tier 1 restore ran inside its RTO; the exec sponsor made the failover call inside 20 min because the decision rights were already mapped.

Tier 0 RTO achieved in production: 47 min (target ≤ 1 h)
M365 file restores verified monthly for 8 consecutive months
Decision gate from incident → failover call: 18 min on the live event
Postmortem produced 3 runbook updates, all merged within a week
Zero data loss on the affected file server (last backup 22 min before)

Mechanisms we operate

Tooling that's been picked because it has restored, not because it's been demoed.

Microsoft-native first; Veeam where M365 backup is the right answer; immutability everywhere it's available.


Next step

Request a recovery review.

We'll spend a session mapping your current workloads to tiers, identifying the gap between current and target RPO/RTO, and proposing a 90-day rehearsal calendar. No deck, no fear-selling.

Book the review · Read customer stories