Recovery time objectives (RTO) and recovery point objectives (RPO) define how long a company can tolerate a software outage and how much data, measured as time since the last backup, it can afford to lose.
- RTO: The maximum length of an application outage before business operations experience considerable damage.
- RPO: The amount of data loss that can occur before a company experiences substantial operational or financial repercussions.
Collectively, RTO and RPO are useful tools in disaster recovery and business continuity procedures and indispensable for organizations that must closely track backup requirements. This article explores both measures, their differences, and their use cases in detail.
RTO vs. RPO Comparison Chart
The chart below shows the differences between recovery time objectives and recovery point objectives at a glance.
|Characteristic||Measured In||Varies Due To||Relevant Considerations|
|RTO||Seconds, minutes, hours, or days, plus steps required to restore business operations||Criticality of the compromised application||Expected average downtime cost; how fast the recovery must occur to limit complications|
|RPO||Quantity of lost data||Frequency of backup schedule||The amount of data an organization can afford to lose; the frequency and comprehensiveness of the existing backup schedule|
How Do RPO and RTO Work?
IT leaders within an organization conduct a business impact analysis (BIA) to identify their RPO and RTO values. The analysis determines the likely effects of disruptions to critical applications or processes due to disasters, significant errors, or other emergencies.
BIA results vary based on such factors as the type of business applications the company has, the nature of the data it gathers and keeps, and the kind of emergency it might experience. Relevant disasters include:
- Storms or floods affecting data storage facilities
- Computer viruses infecting critical applications
- Ransomware attacks restricting file access
- Malicious insiders committing data theft
- Third-party application outages
Companies can express RPO and RTO as a continuum and view both as goals. Beyond that similarity, fundamental differences exist when making the necessary calculations.
How Recovery Time Objective Works
An organization’s recovery time objective measures how long a company can afford for an application to be down without significantly damaging the business. Some applications might be down for days without substantial consequences, while high-priority applications can only be down for a few seconds before they cause customer frustration and result in lost business.
When calculating RTO, a business should categorize applications by priority and potential loss according to available resources or options. For example, standard plans for near-zero RTOs require failover services, while four-hour RTOs permit on-premises recovery beginning with bare-metal restores and ending with full application and data availability.
If IT has invested in failover services for high-priority applications, it can safely express RTO in seconds. Failover services automatically switch to another server or platform when one goes down. IT must still restore an on-premises environment, but since the application is processing in the cloud, the department has more time to bring it back.
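The tiering described above can be sketched as a small lookup. This is only an illustration: the function name `recovery_approach` and the 60-second cutoff for "near-zero" are assumptions made for the example. The text pairs near-zero RTOs with failover and four-hour RTOs with on-premises recovery, but says nothing about the range in between, so the sketch leaves that range undecided.

```python
from datetime import timedelta

# Hypothetical cutoff: the article does not define "near-zero" numerically.
NEAR_ZERO = timedelta(seconds=60)


def recovery_approach(rto: timedelta) -> str:
    """Suggest a recovery approach for an application's RTO (illustrative only)."""
    if rto <= NEAR_ZERO:
        # Near-zero RTOs need automatic switching to a standby platform.
        return "failover service"
    if rto >= timedelta(hours=4):
        # Four hours or more leaves time to rebuild on-premises.
        return "on-premises recovery starting from a bare-metal restore"
    # The text gives no rule for this range, so flag it for review.
    return "evaluate case by case"


for name, rto in [("checkout", timedelta(seconds=10)),
                  ("reporting", timedelta(hours=4))]:
    print(f"{name}: {recovery_approach(rto)}")
```

Keeping the thresholds as named constants makes it easy to adjust them once a business impact analysis produces real figures.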
Questions to Ask When Calculating RTO
The RTO typically differs depending on the application and its function. Many decision-makers find it easier to make critical RTO calculations by asking the following questions:
- Which user-facing applications require constant availability for customers?
- Which applications must be available for the company’s revenue operations to function?
- Does the business have applications that provide front-line security?
- Which data stores hold the company’s critical applications?
- How long can an application be down before it harms the user experience?
- What’s the typical time frame for restoring a necessary application’s functionality after an outage?
- Can the company restore an application’s lost data from existing backups?
- How does this application help a business meet its objectives?
Best Practices for Achieving RTO
While RTO depends upon a wide range of variables, the following principles can help organizations get closer to RTO goals.
Be Realistic and Recognize Room for Improvement
Start by setting realistic expectations—but even so, you may find the RTO out of reach. In such cases, a practical course of action is to find the likely points of weakness.
For example, a company may be short-staffed, with teams consistently struggling to restore an application’s functionality even with all employees working on the task. Hiring more people may be the most appropriate option.
Examine Existing Backup Technologies
Review your company’s current backup solution—if it doesn’t meet your goals, consider investing in a better alternative. That’s especially true if the business uses a legacy backup solution that consistently falls short of RTO goals. Investing in an updated backup or disaster recovery platform can help companies reach the desired RTO.
Update Application Code When Necessary
Investigate whether issues with an application are due to bugs or outdated code—if they are, addressing those problems could lead to fewer outages and less downtime.
Use Real-Time Notifications
Configure real-time alerts that give immediate feedback when an application shows performance problems, and configure tools so notifications reach personnel on the appropriate platforms and devices.
Determine Which Applications Require Low RTOs
Limit low RTOs to a select few applications. Most organizations can’t maintain very short RTOs for many systems because providing failover services and frequent backups for every company application is expensive.
Plan for Disaster Recovery and Business Continuity
Consider implementing a high-performance disaster recovery or business continuity plan. For some large enterprises that can afford a more advanced recovery platform, this will make a huge difference in achieving RTO expectations. Although these solutions take time and stakeholder approval to select and deploy, they’re valuable resources, especially for organizations with many mission-critical systems.
How Recovery Point Objective Works
An organization’s recovery point objective refers to its loss tolerance, or the amount of data it can lose before experiencing significant harm. RPO is a time measurement between the loss event and the most recent backup.
If a company backs up all or most of its data in regularly scheduled 24-hour increments, the worst-case scenario would be 24 hours of lost data. For some applications, this is acceptable; for others, it is not.
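The worst case above is simple arithmetic: with periodic backups, the most data a failure can cost is one full backup interval, so a schedule satisfies an RPO only if the interval is no longer than the RPO. A minimal sketch follows, assuming backups complete instantly; the helper name `meets_rpo` is hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case loss (one full interval) stays within the RPO."""
    worst_case_loss = backup_interval
    return worst_case_loss <= rpo


daily = timedelta(hours=24)
print(meets_rpo(daily, rpo=timedelta(hours=24)))  # True
print(meets_rpo(daily, rpo=timedelta(hours=4)))   # False
```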
RPO can be calculated through a step-by-step process. Follow these guidelines and choose time ranges for all important systems to determine how much data your company can lose without significant damage:
- Run tests to determine how quickly data must be available for each enterprise application, including cloud storage platforms, CRM solutions, and e-commerce applications.
- Categorize all main enterprise applications based on their backup restoration requirements—for example, does the data need restoration within a few minutes or can it wait a day?
- Calculate the company’s financial position regarding backups. For how many applications can it afford to maintain failover and replication services? Does the business need to store some backups offline, such as on physical hard drives?
- Choose top-priority applications for immediate restoration. This can happen quickly if the servers that support the main cloud platform have continuous replication, but a database of previous sales contacts backed up with hard drives could take hours to recover. Most businesses can’t afford rapid backups for all software, so prioritize thoughtfully.
Setting RPO Goals
Depending on application priority, individual RPOs typically range from near-zero (measured in seconds) to 24 hours. Companies setting RPOs of more than eight hours may be able to achieve them with existing backup solutions, provided doing so only minimally impacts production systems.
Four-hour RPOs need scheduled snapshot replication. Near-zero RPOs require continuous replication. When both the RPO and RTO are near zero, people should combine continuous replication with failover services to get near-100 percent application and data availability.
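The rules of thumb in this section can be collected into one function. The eight-hour and near-zero tiers mirror the figures in the text. The 60-second definition of "near-zero" and the function name `protection_plan` are assumptions for illustration. The text also does not say what applies to RPOs other than four hours within the zero-to-eight-hour range, so the sketch assumes scheduled snapshot replication covers all of it.

```python
from datetime import timedelta

NEAR_ZERO = timedelta(seconds=60)  # assumed cutoff, not defined in the text


def protection_plan(rpo: timedelta, rto: timedelta) -> list[str]:
    """Map an application's RPO and RTO to the techniques named in the text."""
    plan = []
    if rpo <= NEAR_ZERO:
        plan.append("continuous replication")
    elif rpo <= timedelta(hours=8):
        # Assumption: snapshots are scheduled often enough to satisfy the RPO.
        plan.append("scheduled snapshot replication")
    else:
        # The text says existing backups *may* suffice here; verify in practice.
        plan.append("existing backup solution")
    if rto <= NEAR_ZERO:
        plan.append("failover service")
    return plan


print(protection_plan(rpo=timedelta(seconds=5), rto=timedelta(seconds=5)))
print(protection_plan(rpo=timedelta(hours=4), rto=timedelta(hours=4)))
print(protection_plan(rpo=timedelta(hours=24), rto=timedelta(days=2)))
```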
Making Application-Related RPO Goals
Create RPO goals based on the kind and urgency of the data stored by the affected applications. For example, a four-hour RPO for one application means backups must run at least every four hours, so that a failure never costs more than four hours of data.
A four-hour RPO does not necessarily indicate the company will lose four hours of data. A word processing application that fails at midnight and comes up by 1:15 a.m. might not lose anything. However, if a busy application goes down at 10 a.m. and restoration occurs at 2 p.m., the company could lose hours of valuable and perhaps irreplaceable information. In such cases, arrange more frequent backups to reach an application-specific RPO.
Best Practices for Achieving RPO
As with RTO, begin by understanding what RPO the company could feasibly meet with current resources. Before setting those goals, test backup rates for different failover and replication services, hard drives, and flash arrays. Examine historical data to see how long past recoveries took. An RPO far tighter than historical performance supports is likely to cause frustration, and it may indicate that the company needs to expand its resources.
Staff training is also essential. All personnel involved in the backup process should know what to do when a system goes down so they can respond promptly and act confidently during an urgent situation.
Technology upgrades may also be necessary. Ensure replication and failover services are reliable and modern—those hosted on an old, unreliable server may not work as quickly. Similarly, optimize network performance so the business network can support data backup rates. Heavy traffic could keep the organization from meeting its backup times.
Similarities and Differences Between RPO and RTO
RPO and RTO are important considerations for disaster recovery or business continuity plans. Business impact analysis can be used in the early stages of calculating both RPO and RTO.
RTO relates to downtime length. RPO is associated with data loss and backup frequency. Some downtime causes are outside a company’s control, but business leaders can decide how often backups occur, limiting the potential damage.
RTO is a forward-looking measurement to determine how long it will take for an affected organization to resume operations. In contrast, RPO involves looking backward to see when the last data backup occurred and how much information the company will lose due to the problem.
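The forward- and backward-looking views can be made concrete with three timestamps from an incident. The sketch below is illustrative only: the `Incident` class, its field names, and the sample times and objectives are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    last_backup: datetime  # most recent good backup before the failure
    failure: datetime      # when the application went down
    restored: datetime     # when operations resumed

    @property
    def data_loss_window(self) -> timedelta:
        # Looking backward from the failure: compare against the RPO.
        return self.failure - self.last_backup

    @property
    def downtime(self) -> timedelta:
        # Looking forward from the failure: compare against the RTO.
        return self.restored - self.failure


incident = Incident(
    last_backup=datetime(2024, 1, 15, 6, 0),
    failure=datetime(2024, 1, 15, 10, 0),
    restored=datetime(2024, 1, 15, 14, 0),
)
rpo, rto = timedelta(hours=4), timedelta(hours=2)
print("RPO met:", incident.data_loss_window <= rpo)  # True: 4 h of data at risk
print("RTO met:", incident.downtime <= rto)          # False: 4 h of downtime
```

The same incident can satisfy one objective and miss the other, which is why both are tracked.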
When to Use RPO and RTO
Use the recovery point objective when handling data critical to business operations the company cannot afford to lose. RPO is also the right choice when a company must meet specific data requirements to comply with regulators.
There’s also an emerging trend of people setting recovery point objectives when storing information in multiple places, such as in the cloud and a physical location. Such arrangements are more complex, requiring extra precautions.
Increasing ransomware attacks have also pushed company leaders to take RPO more seriously. Doing so can reduce an incident’s effects and minimize pressure to pay the ransom. Companies that perform frequent backups and have robust recovery plans are much less likely to suffer significant financial repercussions and operational disruptions.
The most appropriate use cases for recovery time objectives are ones where even brief outages could be catastrophic. Emergency dispatch systems and monitoring tools for critical industrial systems are good examples.
Consider the customer-related ramifications associated with outages. For example, in 2022, Ticketmaster crashed while fans tried to get Taylor Swift tickets. The incident was widely covered and criticized in news articles and on social media, giving other companies an example of how not to operate.
Use Cases for Recovery Time Objective
The following examples show how RTO planning might help businesses avoid disastrous consequences.
Recovery of Important Email
A company attorney accidentally deletes a time-sensitive email and then empties the contents of the trash folder. But IT continuously backs up delta-level changes in Microsoft Exchange, a business-critical application for this busy company. The backup application is capable of granular backup and recovery, so the firm can recover the individual message within an RTO of five minutes instead of restoring an entire virtual machine for a single email.
Ensuring Availability for an e-Commerce Site
A store’s self-hosted e-commerce site uses three databases: a relational database storing the product catalog, a document database that holds historical order data for reporting, and an API database connecting to its payment processor’s gateway.
The document database can reconstruct data from others, so its RTO and RPO are within 24 hours. The business only adds products to the relational database once a week, so RPO is not critical. However, RTO is—customer transactions stop if the database goes down.
The company invests in a failover service so that the relational database immediately spins up on virtual servers if it goes down. The business replicates the few changes made during the week to its provider’s disaster recovery platform. The API database holds ordering information and needs both RPO and RTO in seconds. IT continuously replicates data to the failover site, which immediately takes over processing should the API database go down.
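The three databases in this example can be written down as a small inventory, which makes the priorities easy to sort and review. In this sketch the class name, field names, and specific numbers are assumptions: the text gives only "within 24 hours", "in seconds", and a weekly update cadence, so 10 seconds and 7 days stand in for those.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Objective:
    system: str
    rto: timedelta
    rpo: timedelta


# Placeholder values chosen to match the qualitative description in the text.
inventory = [
    Objective("document db (order history)", timedelta(hours=24), timedelta(hours=24)),
    Objective("relational db (catalog)", timedelta(seconds=10), timedelta(days=7)),
    Objective("API db (payment gateway)", timedelta(seconds=10), timedelta(seconds=10)),
]

# Review order: tightest RTO first, ties broken by tightest RPO.
prioritized = sorted(inventory, key=lambda o: (o.rto, o.rpo))
for item in prioritized:
    print(f"{item.system}: RTO {item.rto}, RPO {item.rpo}")
```

Sorting on the pair puts the API database first, then the catalog, then order history, which matches the investment order the example describes.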
Use Cases for Recovery Point Objective
The following examples show how RPO planning helps maintain business continuity after an event.
Restoring a CRM Platform
CRM software hosted at a company’s main Florida office goes down when a bad storm hits. The server room is damaged, but all CRM data backs up to a data center in Missouri. Because of the CRM platform’s importance, teams had prioritized replication and failover services, and the most recent replication to the data center completed just minutes before the storm hit.
The team members responsible for restoration—many out of state and not working directly in Florida—follow the escalation process in their disaster recovery plan immediately and can meet the RPO of 15 minutes.
Setting Timed Hard Drive Backups
Backups are most convenient for everyone involved when they happen automatically or with limited oversight. Apple’s Mac computers have a Time Machine application to handle them—once someone attaches an external drive to a Mac with Time Machine and activates the application, it performs the following automatic backups:
- Hourly backups for the past day
- Daily backups for the past month
- Weekly backups for any periods longer than a month
In that case, the latest backup restore point is used when people realize the need to restore data. Most businesses use technology more advanced than what Time Machine offers—however, the application’s concept clearly illustrates why people must make data-critical decisions before setting the RPO.
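The retention tiers above imply that restore points get coarser as data ages. A rough sketch of that relationship follows. The function name is hypothetical, a "month" is approximated as 30 days, and the code models only the schedule described above, not Time Machine's actual implementation.

```python
from datetime import timedelta


def restore_point_spacing(age: timedelta) -> timedelta:
    """Approximate gap between available restore points for data of a given age."""
    if age <= timedelta(days=1):
        return timedelta(hours=1)   # hourly backups for the past day
    if age <= timedelta(days=30):
        return timedelta(days=1)    # daily backups for the past month
    return timedelta(weeks=1)       # weekly backups beyond a month


print(restore_point_spacing(timedelta(hours=3)))   # 1:00:00
print(restore_point_spacing(timedelta(days=10)))   # 1 day, 0:00:00
print(restore_point_spacing(timedelta(days=90)))   # 7 days, 0:00:00
```

Under this schedule, recent data is recoverable to within about an hour, while older versions of a file may only be recoverable to within a day or a week.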
Bottom Line: Prioritizing Enterprise Backup and Recovery Goals
Managing RTO and RPO is critical for businesses to strategically meet data backup goals. These measurements give organizations specific metrics to consider when developing backup and recovery strategies. Additionally, determining these key performance indicators makes overwhelming tasks more manageable.
The challenge for all organizations is prioritizing applications and deciding which ones need more financial investments. How many instant recovery solutions can the business afford? Knowing each system’s RTO and RPO can help them decide.
Company leaders should treat RTO and RPO as equally essential rather than measuring one and not the other. Both measurements reveal the business’s most critical applications and data and what’s required to keep them functioning for successful operations.