Document ID: ISMS-POL-005 | Version: 1.0 | Date: March 2026 | Classification: Internal | Owner: Chief Information Security Officer
Purpose and Scope
This policy establishes [Organisation Name]'s approach to ensuring the continuity of critical business services following a disruptive event, and the recovery of information technology systems and data in accordance with ISO/IEC 27001:2022 Annex A controls 5.29–5.30.
It applies to all critical business processes, supporting IT systems, and personnel within the ISMS scope. It must be read in conjunction with the Incident Response Policy (ISMS-POL-004) and the Asset Management Policy (ISMS-POL-002).
1. Business Impact Analysis
The Business Impact Analysis (BIA) identifies critical business processes, their dependencies, and the impact of their unavailability. It informs the Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) that drive the organisation's recovery strategies.
Definitions:
- MTD (Maximum Tolerable Downtime): The maximum time a process can be unavailable before the disruption causes unacceptable harm to the organisation.
- RTO (Recovery Time Objective): The target time within which a process must be restored after disruption. Must be less than MTD.
- RPO (Recovery Point Objective): The maximum acceptable data loss, expressed as a period of time before the disruption (e.g., an RPO of 15 minutes means the most recent recoverable backup must be no more than 15 minutes old).
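The RPO definition above reduces to a simple age comparison. The following is a minimal sketch, not part of the policy; the function name `meets_rpo` and the timestamps are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

def meets_rpo(last_backup: datetime, failure_time: datetime, rpo: timedelta) -> bool:
    """Return True if the data written since the last backup falls within the RPO."""
    return failure_time - last_backup <= rpo

# Hypothetical timestamps, for illustration only.
failure = datetime(2026, 3, 1, 12, 0)
print(meets_rpo(datetime(2026, 3, 1, 11, 50), failure, timedelta(minutes=15)))  # True
print(meets_rpo(datetime(2026, 3, 1, 11, 30), failure, timedelta(minutes=15)))  # False
```

A 10-minute-old backup satisfies a 15-minute RPO; a 30-minute-old backup does not.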
| Process / Service | Owner | Dependencies | MTD | RTO | RPO | Impact if Unavailable |
|---|---|---|---|---|---|---|
| SOC monitoring service | SOC Manager | SIEM platform (AWS), analyst team (minimum 2), client VPN tunnels, Microsoft Teams | 4 hours | 2 hours | 15 minutes | High – client SLA breach; contractual penalties up to $50K/day for Tier 1 clients; reputational damage |
| Penetration testing delivery | Technical Director | Kali Linux VMs (lab), testing lab hardware, VPN connectivity to client environment, report authoring tools | 48 hours | 24 hours | 4 hours | Medium – project schedule delay; rescheduling cost; client relationship impact; no regulatory breach |
| Corporate email (Microsoft 365) | IT Manager | Microsoft 365 tenancy, DNS (MX records), internet connectivity, Azure AD | 4 hours | 1 hour | 1 hour | High – all internal and external communication blocked; SOC escalation chain disrupted; client communication impaired |
| Client portal (SharePoint – report delivery) | ISM | SharePoint Online, Azure AD authentication, internet connectivity | 8 hours | 4 hours | 1 hour | Medium – clients unable to access security assessment reports; manual email delivery as fallback acceptable short-term |
| Payroll processing | HR Manager | Payroll SaaS provider, bank API integration, HR records in SaaS | 5 business days | 2 business days | 1 business day | Medium – staff pay delay; morale impact; potential constructive dismissal risk if extended |
| Internet connectivity | IT Manager | ISP (primary), ISP (failover/4G), router, firewall | 2 hours | 30 minutes | N/A (stateless) | Critical – all remote staff unable to access cloud systems; SOC monitoring disrupted; client VPN tunnels drop |
| Identity and access management (Azure AD / Entra ID) | IT Manager | Azure AD, MFA service, conditional access policies | 1 hour | 30 minutes | 15 minutes | Critical – staff unable to authenticate to any system; SOC operations halt; all cloud access blocked |
| Finance and invoicing system | CFO | Accounting SaaS, bank API integration, Xero/QuickBooks subscription | 3 business days | 1 business day | 4 hours | High – invoicing and payments blocked; payroll may be affected if linked; compliance reporting delayed |
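The rule that every RTO must be less than its MTD can be checked mechanically whenever the BIA is updated. The sketch below is one possible way to do this and is not part of the policy. The `BIA` dictionary and the `rto_violations` helper are hypothetical names, and the rows stated in business days are left out because converting them to clock time would need a working-calendar assumption.

```python
from datetime import timedelta

# (MTD, RTO) pairs copied from the BIA table above.
# Business-day rows (payroll, finance) are deliberately omitted.
BIA = {
    "SOC monitoring service": (timedelta(hours=4), timedelta(hours=2)),
    "Penetration testing delivery": (timedelta(hours=48), timedelta(hours=24)),
    "Corporate email": (timedelta(hours=4), timedelta(hours=1)),
    "Client portal": (timedelta(hours=8), timedelta(hours=4)),
    "Internet connectivity": (timedelta(hours=2), timedelta(minutes=30)),
    "Identity and access management": (timedelta(hours=1), timedelta(minutes=30)),
}

def rto_violations(bia):
    """Return the names of processes whose RTO is not strictly less than the MTD."""
    return [name for name, (mtd, rto) in bia.items() if rto >= mtd]

print(rto_violations(BIA))  # []
```

An empty list means every listed process satisfies the constraint.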
2. RTO / RPO Summary and Service Tiers
| Service | RTO | RPO | Tier | Justification |
|---|---|---|---|---|
| Identity / Azure AD | 30 minutes | 15 minutes | Tier 1 – Mission Critical | Gateway to all systems; staff cannot authenticate to any system without it |
| Internet connectivity | 30 minutes | N/A | Tier 1 – Mission Critical | All remote staff and cloud-based services depend on it |
| SOC monitoring | 2 hours | 15 minutes | Tier 1 – Mission Critical | Core revenue-generating service; contractual SLA obligations |
| Corporate email | 1 hour | 1 hour | Tier 1 – Mission Critical | Communication backbone; incident response depends on it |
| Client portal | 4 hours | 1 hour | Tier 2 – Business Critical | Client-facing; alternatives exist (email delivery) |
| Finance system | 1 business day | 4 hours | Tier 2 – Business Critical | Revenue and payments; a short outage does not immediately threaten operations |
| Penetration testing lab | 24 hours | 4 hours | Tier 3 – Important | Revenue-generating but can be rescheduled |
| Payroll | 2 business days | 1 business day | Tier 3 – Important | Staff impact significant but manageable short-term |
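When several services are down at once, the tier and RTO columns suggest a default recovery sequence. The sketch below is illustrative only: the policy does not prescribe this ordering, and the Incident Commander decides the actual priority. The `SERVICES` list and `recovery_order` function are hypothetical, and treating one business day as 24 hours is a simplifying assumption.

```python
# Tier and RTO values copied from the summary table above. RTOs are in minutes;
# treating one business day as 24 hours is a simplifying assumption.
SERVICES = [
    ("Identity / Azure AD", 1, 30),
    ("Internet connectivity", 1, 30),
    ("SOC monitoring", 1, 120),
    ("Corporate email", 1, 60),
    ("Client portal", 2, 240),
    ("Finance system", 2, 1440),
    ("Penetration testing lab", 3, 1440),
    ("Payroll", 3, 2880),
]

def recovery_order(services):
    """Sort by tier first, then by RTO; ties keep their table order."""
    return [name for name, tier, rto in sorted(services, key=lambda s: (s[1], s[2]))]

for position, name in enumerate(recovery_order(SERVICES), start=1):
    print(position, name)
```

Under these assumptions, corporate email (1-hour RTO) is placed ahead of SOC monitoring (2-hour RTO) within Tier 1. A real sequence would also need to respect the dependencies recorded in the BIA.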
3. Recovery Strategies
Tier 1 – Mission Critical
- Strategy: Hot standby or automatic failover with zero or near-zero manual intervention required.
- Cloud-native services (Azure AD, Microsoft 365): Microsoft manages geo-redundancy; the organisation ensures at least 2 Global Administrators in separate regions; emergency access ("break glass") accounts are maintained and tested.
- Internet connectivity: Primary ISP (fibre) supplemented by 4G/LTE failover router (pre-configured, auto-failover); test monthly.
- SOC SIEM platform (AWS): Deploy across two AWS Availability Zones; Auto Scaling for compute; Aurora Multi-AZ for database; Elastic Load Balancer; failover tested quarterly.
Tier 2 – Business Critical
- Strategy: Warm standby; restore to service within RTO using documented runbook; brief manual intervention required.
- Client portal (SharePoint Online): Microsoft manages availability; fallback: deliver critical reports by email if the portal is unavailable; the runbook documents the emergency delivery procedure.
- Finance system (SaaS): Vendor manages availability; runbook documents how to process urgent payments manually via bank portal; reconciliation to be performed once system restored.
Tier 3 – Important
- Strategy: Cold standby or rebuild; restore from last backup within RTO; service may be temporarily suspended.
- Penetration testing lab: Hardware rebuilt from documented configuration baseline; VMs restored from snapshots; client projects rescheduled with advance notice.
- Payroll: The vendor restores the SaaS platform; manual payroll calculation is possible for up to 1 pay run using a spreadsheet template; the payroll team is trained on the manual procedure.
4. DR Runbook Structure
Every Tier 1 and Tier 2 service must have a documented DR Runbook maintained by the System Owner and reviewed annually. Each runbook must contain the following sections at minimum:
- Service Description – what the service does and who depends on it
- Trigger Conditions – what constitutes a DR trigger for this service (threshold, confirmed failure, etc.)
- Escalation Contacts – primary and secondary contacts with mobile numbers; vendor support contact; escalation path
- Dependencies – list of systems this service depends on; order of recovery if multiple systems are down
- Step-by-Step Recovery Procedure – numbered, actionable steps; command-level detail for technical procedures; no assumed knowledge
- Estimated Recovery Time – how long the procedure should take; flag if approaching RTO
- Success Criteria – how to confirm the service has been successfully recovered (e.g., "SOC dashboard shows green; sample log ingestion test passes; clients confirm connectivity")
- Rollback Procedure – if recovery fails or worsens the situation, how to safely roll back to the previous state
- Post-Recovery Actions – monitoring steps, notifications to send, documentation to complete
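The minimum section list above lends itself to an automated completeness check during the annual runbook review. The following is a minimal sketch under the assumption that each runbook is held as a mapping of section title to text; `REQUIRED_SECTIONS`, `missing_sections`, and the `draft` content are hypothetical and not defined by the policy.

```python
# Section titles copied from the list above, in policy order.
REQUIRED_SECTIONS = [
    "Service Description",
    "Trigger Conditions",
    "Escalation Contacts",
    "Dependencies",
    "Step-by-Step Recovery Procedure",
    "Estimated Recovery Time",
    "Success Criteria",
    "Rollback Procedure",
    "Post-Recovery Actions",
]

def missing_sections(runbook: dict[str, str]) -> list[str]:
    """Return required sections that are absent or blank, in policy order."""
    return [s for s in REQUIRED_SECTIONS if not runbook.get(s, "").strip()]

# Hypothetical partially written runbook, for illustration only.
draft = {
    "Service Description": "SOC monitoring for clients",
    "Trigger Conditions": "",
    "Dependencies": "SIEM platform, client VPN tunnels",
}
print(missing_sections(draft))
```

A blank section is treated the same as a missing one, so a heading with no content does not pass the check.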
5. Testing Schedule
BCP/DR plans that are never tested are not plans; they are only documents. All testing must be documented with results, issues found, and actions taken.
| Test Type | Frequency | Scope | Owner | Documentation Required |
|---|---|---|---|---|
| Tabletop exercise | Every 6 months | Simulated disruption scenario (e.g., ransomware attack, data centre fire); discuss response and recovery without activating failover | CISO | Exercise scenario document; attendance register; findings and action list |
| Backup restore test | Monthly | Restore a random selection of files from backup to isolated test environment; verify data integrity | IT Manager | Restore test log: date, files restored, integrity check result, time taken, technician signature |
| Internet failover test | Monthly | Temporarily disconnect primary ISP; confirm 4G failover activates and critical services remain available | IT Manager | Failover test log: failover time, services tested, any issues |
| Full DR failover test | Annually | Activate DR environment; restore Tier 1 services to DR; validate all critical services; document RTO achieved vs target | IT Manager + CISO | Full DR test report including: actual recovery times vs RTO targets, issues found, actions raised |
| Communication tree test | Annually | Call/message all emergency contacts from the contact list; verify all numbers are current and reachable | ISM | Communication test log: contact name, method, result (reached/unreachable), date |
| BCP awareness | Annually (with DR test) | Confirm all staff in critical roles have read the relevant runbooks and know their role in a DR event | ISM | Acknowledgement sign-off sheet |
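The full DR failover test requires actual recovery times to be reported against RTO targets. The sketch below shows one way that comparison could be tabulated. It is not part of the policy: `compare_to_rto` is a hypothetical helper, and the `measured` values are invented placeholders rather than real test results.

```python
from datetime import timedelta

# RTO targets for Tier 1 services, copied from the RTO / RPO summary.
RTO_TARGETS = {
    "Identity / Azure AD": timedelta(minutes=30),
    "Internet connectivity": timedelta(minutes=30),
    "SOC monitoring": timedelta(hours=2),
    "Corporate email": timedelta(hours=1),
}

def compare_to_rto(targets, actuals):
    """Return {service: (actual, target, met)} for each service recovered in the test."""
    return {
        svc: (actuals[svc], targets[svc], actuals[svc] <= targets[svc])
        for svc in actuals
    }

# Placeholder measurements, invented for illustration; not real test data.
measured = {
    "Identity / Azure AD": timedelta(minutes=22),
    "SOC monitoring": timedelta(hours=2, minutes=40),
}

for svc, (actual, target, met) in compare_to_rto(RTO_TARGETS, measured).items():
    print(f"{svc}: actual {actual}, target {target}, {'met' if met else 'MISSED'}")
```

Any service flagged as missed would be recorded under issues found and actions raised in the DR test report.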
6. Roles and Responsibilities
| Role | BCP/DR Responsibility |
|---|---|
| Incident Commander (CISO or delegate) | Declare a business continuity event; activate the BCP; make real-time decisions on recovery priority and resource allocation; communicate with clients and board |
| IT Recovery Lead (IT Manager) | Execute technical recovery procedures per DR runbooks; coordinate with cloud providers and vendors; report recovery progress to Incident Commander every 30 minutes |
| Communications Lead (CISO or ISM) | Manage all external communications: clients, regulators, media, insurers; maintain accurate status updates; ensure regulatory notifications are submitted on time |
| Business Continuity Coordinator (ISM) | Coordinate non-IT recovery activities; liaise with HR, Finance, and other functions; maintain incident log; track action items; arrange alternative workspace if required |
| Vendor Liaisons (System Owners) | Engage vendor support for SaaS/cloud services under their ownership; escalate to vendor account managers; report support ticket status to IT Recovery Lead |
7. Activation Criteria
The BCP is activated when any of the following conditions apply:
- A Tier 1 service has been unavailable for more than 30 minutes without a clear resolution timeline
- A P1 security incident causes system unavailability affecting client services
- A physical event (fire, flood, loss of building access) prevents staff from working from the office
- The organisation receives confirmation of a vendor outage affecting Tier 1 services with no confirmed resolution time
Activation is declared by the CISO (or their delegate). Once declared, all relevant System Owners activate their DR runbooks and report status to the IT Recovery Lead every 30 minutes.
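The four activation conditions can be expressed as a simple checklist evaluation, for example in a monitoring or on-call tool. This is a minimal sketch only: the `Situation` fields and the `activation_reasons` function are hypothetical, and the output is advisory, since activation is still declared by the CISO or their delegate.

```python
from dataclasses import dataclass
from datetime import timedelta

# Threshold taken from the first activation condition above.
TIER1_OUTAGE_THRESHOLD = timedelta(minutes=30)

@dataclass
class Situation:
    """Hypothetical snapshot of the current incident state."""
    tier1_outage: timedelta = timedelta(0)
    resolution_timeline_known: bool = True
    p1_incident_affecting_client_services: bool = False
    office_inaccessible: bool = False
    vendor_tier1_outage_without_eta: bool = False

def activation_reasons(s: Situation) -> list[str]:
    """Return the activation conditions that currently apply (empty if none)."""
    reasons = []
    if s.tier1_outage > TIER1_OUTAGE_THRESHOLD and not s.resolution_timeline_known:
        reasons.append("Tier 1 service down for more than 30 minutes with no resolution timeline")
    if s.p1_incident_affecting_client_services:
        reasons.append("P1 security incident affecting client services")
    if s.office_inaccessible:
        reasons.append("Physical event preventing work from the office")
    if s.vendor_tier1_outage_without_eta:
        reasons.append("Vendor outage affecting Tier 1 services with no confirmed resolution time")
    return reasons

print(activation_reasons(Situation(tier1_outage=timedelta(minutes=45),
                                   resolution_timeline_known=False)))
```

Returning the list of matched conditions, rather than a bare yes or no, gives the Incident Commander the reason to record in the incident log.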
8. Review History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | March 2026 | [CISO Name] | Initial issue |