Disaster recovery
NEXT maintains a structured disaster recovery (DR) plan for restoring service after disruptive events, whether natural, political, human-made, or malicious. Systems are tiered as critical or non-critical, the service-level recovery objective is to restore operations within 24 hours, and the plan is tested at least annually through tabletop exercises and technical recovery validation. Recovery design relies on AWS-managed durability and restore capabilities for core data stores.
Disaster recovery objectives
| Objective | What it means for NEXT | Target |
| --- | --- | --- |
| RTO | Time to restore NEXT during an outage | Within 24 hours |
| RPO | Max tolerable data loss in a recovery | Workload-dependent; core stores use AWS-managed continuous durability and point-in-time recovery where supported |
These definitions and the recovery process are consistent with NIST SP 800-34 guidance for IT contingency planning.
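As an illustration of how the two objectives are measured, the sketch below computes the achieved recovery time and the data-loss window from three timestamps and compares the recovery time with the 24-hour target. The function names and the example timestamps are hypothetical and are not taken from NEXT's tooling; no RPO threshold is defined because the RPO target is workload-dependent.

```python
from datetime import datetime, timedelta

# Target taken from the objectives above. RPO is workload-dependent,
# so no fixed RPO threshold is defined here.
RTO_TARGET = timedelta(hours=24)


def recovery_time(outage_start: datetime, service_restored: datetime) -> timedelta:
    """Elapsed time between the start of the outage and restored service."""
    return service_restored - outage_start


def data_loss_window(last_recoverable_point: datetime, outage_start: datetime) -> timedelta:
    """Span of writes that cannot be recovered (zero if nothing was lost)."""
    return max(outage_start - last_recoverable_point, timedelta(0))


def meets_target(actual: timedelta, target: timedelta) -> bool:
    """True when the measured value is within the target."""
    return actual <= target


# Hypothetical timestamps, for illustration only.
outage = datetime(2024, 3, 1, 8, 0)
restored = datetime(2024, 3, 1, 21, 30)
last_point = datetime(2024, 3, 1, 7, 58)

rto_actual = recovery_time(outage, restored)       # 13 h 30 min
rpo_actual = data_loss_window(last_point, outage)  # 2 min
print(meets_target(rto_actual, RTO_TARGET))
```

In this example the service is restored 13.5 hours after the outage began, which is inside the 24-hour target, and two minutes of writes fall outside the last recoverable point.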
System tiers (prioritization)
- Critical systems: application and database services (or services required for them). If unavailable, restoration begins immediately.
- Non-critical systems: do not block critical operations; restored after critical services.
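The tiering rule above can be sketched as a simple sort: critical systems come first, non-critical systems follow. This is a minimal illustration, not NEXT's actual tooling; the `Tier` and `System` types and the inventory entries are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    CRITICAL = 0      # restoration begins immediately
    NON_CRITICAL = 1  # restored after critical services


@dataclass(frozen=True)
class System:
    name: str
    tier: Tier


def restoration_order(systems: list[System]) -> list[System]:
    """Critical systems first; sorted() is stable, so inventory order is kept within a tier."""
    return sorted(systems, key=lambda s: s.tier)


# Hypothetical inventory, for illustration only.
inventory = [
    System("analytics-dashboard", Tier.NON_CRITICAL),
    System("database", Tier.CRITICAL),
    System("internal-wiki", Tier.NON_CRITICAL),
    System("application", Tier.CRITICAL),
]

ordered = [s.name for s in restoration_order(inventory)]
print(ordered)
```

Using an integer-valued tier keeps the ordering explicit and leaves room for finer-grained tiers if the inventory ever needs them.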
Phases & sequencing
- Notification / Activation – detect & assess; activate the DR plan when criteria are met; coordinate comms and roles.
- Recovery – rebuild production capability (e.g., re-deploy serverless workloads with tested scripts; restore data using AWS-managed recovery features; validate with pre-written tests).
- Reconstitution – return to steady state; goal: full operations within 24 hours of a disaster or outage.
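The three phases run strictly in sequence. A minimal sketch of that sequencing, assuming a hypothetical `DrRun` tracker that refuses to skip or repeat a phase, could look like this:

```python
from enum import Enum


class Phase(Enum):
    NOTIFICATION_ACTIVATION = 1
    RECOVERY = 2
    RECONSTITUTION = 3


class DrRun:
    """Tracks the current phase of a DR run and rejects out-of-order transitions."""

    def __init__(self) -> None:
        self.phase = Phase.NOTIFICATION_ACTIVATION
        self.log: list[str] = [self.phase.name]

    def advance(self, next_phase: Phase) -> None:
        # Only the immediately following phase is a valid next step.
        if next_phase.value != self.phase.value + 1:
            raise ValueError(f"cannot move from {self.phase.name} to {next_phase.name}")
        self.phase = next_phase
        self.log.append(next_phase.name)


run = DrRun()
run.advance(Phase.RECOVERY)
run.advance(Phase.RECONSTITUTION)
print(run.log)
```

Recording each transition in `log` gives the post-exercise retrospective a simple trail of what happened and in which order.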
Testing & rehearsal
NEXT tests the DR plan at least annually. Tests include tabletop (people/process) exercises and technical exercises (e.g., restore-from-backup and alternate-site capability checks), followed by a retrospective to improve playbooks.
Standards alignment
The program follows NIST SP 800-34 Rev.1 principles for contingency planning (objectives, roles, testing, and recovery strategies).
FAQ
Q: What RTO and RPO does NEXT AI target?
RTO: within 24 hours for full service restoration. RPO depends on the workload: core stores use AWS-managed continuous durability and point-in-time recovery features where supported.
Q: How often is the Disaster Recovery plan tested? What kinds of tests?
At least annually, using tabletop exercises and technical tests (restore-from-backup, alternate-site capability), plus a retrospective to improve procedures.
Q: What triggers Disaster Recovery plan activation?
Examples include prolonged unavailability (e.g., systems down for more than 48 hours) or hosting-facility damage (more than 24 hours). The Security Officer/CTO activates the plan after assessment.
Q: How are systems prioritized during recovery?
Critical systems (app/db and dependencies) are restored first; non-critical systems follow.
Q: Which standard does your approach align to?
The plan aligns with NIST SP 800-34 Rev.1; broader business-continuity practices align with ISO 22301.