Every small business has a number it cannot afford to admit out loud. It is the number of hours, in a real outage, that the doors might as well be locked. Sales cannot quote, technicians cannot dispatch, accounting cannot post, and customers start calling the competitor while the team waits for files to come back. The point of recovery planning is to decide that number on a calm Tuesday instead of a panicked Friday.
Two targets sit at the center of every credible recovery plan. The first is how long the business can stay down. The second is how much recent work the business can afford to lose if it has to roll a system back to its last clean copy. Set those two numbers honestly and most other decisions about backup, replication, and incident response fall into place. Skip them, and the recovery plan ends up being whatever the backup tool happened to be doing the day the storm hit.
What Do Recovery Targets Actually Measure?
Recovery time objective and recovery point objective are the two numbers that translate “we have backups” into “we know exactly what happens after an outage.” They look similar on a slide and they get confused all the time, but they answer different questions and they cost different amounts of money to hit.
How long can the business be down?
Recovery time objective, often shortened to RTO, is the maximum amount of time a system can stay offline before the business takes real damage. It is measured in minutes, hours, or days, and it is supposed to be set by leadership, not by the IT team. The point-of-sale at a busy retail counter might have an RTO of fifteen minutes because every minute it is down means lost sales. A back-office reporting database might have an RTO of three days because nobody touches it on weekends and a monthly close can still finish on time.
The honest version of this conversation is uncomfortable. Owners often want every system back in under an hour, but the spend required to deliver fifteen-minute recovery across a fifteen-server environment is dramatically higher than the spend required for four-hour recovery. The job of the IT partner is to price each tier and let the business decide where the line goes.
How much recent work can be lost?
Recovery point objective, or RPO, is the maximum amount of recent work the business is willing to lose if a system has to be restored from backup. It is also measured in time, but it is measured backward from the moment of failure. An RPO of one hour means the team is willing to lose up to sixty minutes of work that landed between the last successful backup and the crash. An RPO of twenty-four hours means a full day of recent invoices, tickets, or emails could be gone.
RPO usually drives the backup schedule. If accounting cannot afford to re-key a day of transactions, hourly snapshots are non-negotiable. If a project file share only changes a few times a week, nightly backups are plenty. The contrast with RTO is important: a system can have a very short RPO (small data loss tolerance) but a long RTO (slow to bring back up), or the other way around. They are independent levers, and pretending they are the same number is one of the most common planning mistakes in this kind of work. Both metrics sit inside the broader business continuity planning conversation, which separates keeping operations running through an event from restoring systems after one.
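The link between backup frequency and RPO comes down to simple arithmetic: if a system fails just before the next backup runs, everything written since the previous backup is gone. The Python sketch below shows that check under stated assumptions. The system names, the intervals, and the helper functions are hypothetical, and the model assumes every backup completes instantly and successfully, which real jobs do not.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Worst case: the failure hits just before the next backup would have run.

    Simplifying assumption: backups complete instantly and never fail.
    """
    return backup_interval


def schedule_meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True when the worst-case loss stays within the agreed RPO."""
    return worst_case_data_loss(backup_interval) <= rpo


# Hypothetical systems: (name, current backup interval, agreed RPO)
systems = [
    ("accounting database", timedelta(hours=24), timedelta(hours=1)),
    ("project file share", timedelta(hours=24), timedelta(hours=24)),
]

for name, interval, rpo in systems:
    status = "ok" if schedule_meets_rpo(interval, rpo) else "schedule too sparse"
    print(f"{name}: backs up every {interval}, RPO {rpo} -> {status}")
```

In this made-up inventory the nightly job is fine for the file share but leaves the accounting database exposed to a full day of loss against a one-hour target.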
How Do You Set Realistic Recovery Targets?
The exercise that actually works is a half-day conversation with the people who run the business, not a checklist filled in by the IT team in isolation. Owners, the head of operations, the bookkeeper, and the lead salesperson all bring different answers to the same questions, and that is the point. The goal is to leave the meeting with a target written next to each major system, signed off by someone with budget authority.
What is the cost of an hour of downtime?
Start with revenue per hour during business hours, then add the soft costs that are easy to forget. Wages paid to a team that cannot work. Customer commitments that slip and trigger refunds, credits, or escalation calls. Reputation cost when a recurring outage hits the same client twice. The number does not need to be perfect, just defensible. A landscaping company doing four hundred thousand dollars a year that runs lean might land at one hundred and eighty dollars per hour of business-hours downtime once wages are included. A specialty manufacturer with line workers and shipping deadlines can land north of three thousand dollars per hour. Whichever number falls out, write it down. It will anchor every other recovery conversation.
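The estimate described above is a short sum, and writing it out makes the inputs easy to challenge. This is a rough Python sketch, not a costing model: the function name, its parameters, and the example company are all hypothetical, and the figures are not the ones quoted in the paragraph above.

```python
def downtime_cost_per_hour(
    annual_revenue: float,
    business_hours_per_year: float,
    idle_staff: int,
    avg_hourly_wage: float,
    other_costs_per_hour: float = 0.0,
) -> float:
    """Revenue per business hour plus wages paid to staff who cannot work.

    other_costs_per_hour is a placeholder for refunds, credits, and similar
    soft costs that the business estimates separately.
    """
    revenue_per_hour = annual_revenue / business_hours_per_year
    idle_wages_per_hour = idle_staff * avg_hourly_wage
    return revenue_per_hour + idle_wages_per_hour + other_costs_per_hour


# Hypothetical company: $1,000,000 a year over 2,000 business hours,
# with six people idled at an average of $30 an hour.
cost = downtime_cost_per_hour(1_000_000, 2_000, idle_staff=6, avg_hourly_wage=30.0)
print(f"Estimated cost of one hour of downtime: ${cost:,.2f}")
```

Keeping the soft costs as a separate parameter lets the team start with the two numbers they already know and refine the rest later.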
Should every system have the same target?
No, and this is the part most plans get wrong. A reasonable approach is to sort systems into three tiers and assign a target band to each tier. The top tier holds the small handful of systems the business literally cannot operate without for more than a couple of hours: phones, the core line-of-business application, the platform that takes orders, and the identity system that lets everyone log in. The middle tier holds systems that matter daily but can wait half a day: email, shared file storage, payroll. The bottom tier holds systems that can wait a few days: archived records, reporting tools, secondary file shares. Most small businesses end up with four to six top-tier systems and a long tail of lower-tier ones, and that distribution is healthy. The mistake is treating everything as top-tier, because the budget required to deliver that target across the entire environment is rarely in the cards.
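A tier list can live in a spreadsheet, but a small data structure shows the shape of it. The Python sketch below groups the example systems from the paragraph above into three tiers. The `System` class, the `group_by_tier` helper, and the per-tier RTO ceilings are assumptions for illustration; in particular, 72 hours is only one reading of "a few days."

```python
from collections import defaultdict
from dataclasses import dataclass

# Upper bound on RTO per tier, in hours. Illustrative values only.
TIER_MAX_RTO_HOURS = {1: 4, 2: 24, 3: 72}


@dataclass(frozen=True)
class System:
    name: str
    tier: int  # 1 = cannot operate without it, 3 = can wait a few days


# Hypothetical inventory based on the examples in the text.
inventory = [
    System("phones", 1),
    System("line-of-business application", 1),
    System("order platform", 1),
    System("identity system", 1),
    System("email", 2),
    System("shared file storage", 2),
    System("payroll", 2),
    System("archived records", 3),
    System("reporting tools", 3),
    System("secondary file shares", 3),
]


def group_by_tier(systems):
    """Return {tier: [system names]} so each tier can be reviewed as a block."""
    grouped = defaultdict(list)
    for system in systems:
        grouped[system.tier].append(system.name)
    return dict(grouped)


tiers = group_by_tier(inventory)
for tier in sorted(tiers):
    names = ", ".join(tiers[tier])
    print(f"Tier {tier} (RTO up to {TIER_MAX_RTO_HOURS[tier]}h): {names}")
```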
What Backup And Recovery Setup Matches Each Target?
Once the targets are written down, the backup and recovery architecture stops being abstract and starts being a budget decision. Different target bands require different technology, and pretending otherwise leads to a recovery plan that quietly cannot deliver what the spreadsheet promises.
What works for long recovery windows?
If a system has an RTO measured in days and an RPO of twelve to twenty-four hours, traditional nightly backups to cloud storage are usually sufficient. The data is captured once a day, encrypted, and copied off-site. Recovery means provisioning a fresh server, restoring the latest backup, validating the restore, and pointing users at the new copy. That work takes hours, sometimes a full business day, but for systems that do not run after five o’clock it is honest and affordable. A cloud backup that runs in the background every night is the workhorse of this tier and the foundation that everything tighter builds on.
What about systems that need to come back fast?
For tighter targets, the architecture changes. An RTO of four hours generally requires image-based backups that can boot a virtual replica of the original server inside the backup appliance or in the cloud, so the team is not waiting for a full bare-metal restore. An RTO of one hour usually means warm-standby replication, where a second copy of the workload is continuously updated and can take over with a short cutover. An RTO below fifteen minutes typically requires active-active replication or live failover, which is a different cost category entirely. The RPO side scales similarly: hourly snapshots keep loss to under sixty minutes, continuous data protection keeps it to seconds, and journaled replication keeps it near zero.
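The RTO bands above can be read as a simple lookup from target to architecture. The Python sketch below encodes them under stated assumptions: the text names the anchor points (below fifteen minutes, one hour, four hours, days), while the exact cut-offs between bands and the `suggest_recovery_approach` function are hypothetical choices made for illustration.

```python
from datetime import timedelta


def suggest_recovery_approach(rto: timedelta) -> str:
    """Map an RTO target to the architecture band described in the text.

    The boundaries between bands are illustrative assumptions, not vendor rules.
    """
    if rto < timedelta(minutes=15):
        return "active-active replication or live failover"
    if rto <= timedelta(hours=1):
        return "warm-standby replication"
    if rto <= timedelta(hours=4):
        return "image-based backup with a bootable virtual replica"
    # Assumption: anything slower than four hours falls into the nightly tier.
    return "nightly backup to cloud storage"


for target in (timedelta(minutes=10), timedelta(hours=1), timedelta(hours=4), timedelta(days=3)):
    print(f"RTO {target}: {suggest_recovery_approach(target)}")
```

Targets that fall between four hours and a full day are a judgment call in practice; the sketch places them in the nightly tier only to keep the lookup short.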
Microsoft 365 sits inside this conversation in a way most owners do not realize. Mail, OneDrive, SharePoint, and Teams are hosted by Microsoft but the data inside them is the business’s responsibility, and the platform’s native retention is not a backup. A dedicated third-party backup of the Microsoft 365 tenant typically delivers an RPO of a few hours and an RTO measured in minutes for individual mailbox or file restores, which is a different posture than restoring a full on-premises server.
How Often Should You Actually Test Recovery?
A recovery target that has never been tested is a number on paper. The point of a test is not to prove the backup software runs. It is to prove that, under the real procedures and with the real people on call, the business can be back inside the time the plan promised. Two kinds of tests cover most of what a small business needs, and they each answer a different question.
What is a tabletop test?
A tabletop is a guided walkthrough where the team sits in a room and talks through a scenario step by step. Someone reads out the situation: ransomware encrypts the main file server at ten in the morning on a Wednesday. The IT lead narrates what they would do first. The operations head describes how the team would keep working without that system. The owner decides when the call goes out to clients. Tabletops surface assumptions that nobody knew were assumptions, like “the offsite backup is at the office,” or “the only person with the failover password retired six months ago.” They are cheap, they expose process gaps, and they should happen at least twice a year. The response-time clauses inside a properly written agreement with the managed IT provider should match the targets the tabletop is testing against; otherwise, the SLA is decorative.
What is a live failover test?
A live failover test actually brings up the recovered system in a sandboxed environment and runs it. The backup is restored, the replica is booted, applications are launched, and someone logs in to verify that the data is consistent and the system behaves the way it should. Live tests are how teams discover that the database recovers but the application that depends on it cannot reach it because the IP address is hard-coded in a config file no one has touched in four years. Tier-one systems should be live-tested at least once a year. Tier-two systems can be live-tested every eighteen to twenty-four months. Lower tiers can rely on tabletop coverage. Every test should produce a written record of what was tested, how long it took, what failed, and what changed afterward.
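The written record described above has four parts (what was tested, how long it took, what failed, what changed), plus one comparison that matters: measured recovery time against the RTO. A minimal Python sketch of such a record follows. The `RecoveryTestRecord` class, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class RecoveryTestRecord:
    system: str
    test_date: date
    rto_target: timedelta
    measured_recovery: timedelta  # from start of restore to a verified login
    failures: list[str] = field(default_factory=list)
    changes_made: list[str] = field(default_factory=list)

    @property
    def met_target(self) -> bool:
        """The test passes only if recovery finished inside the RTO."""
        return self.measured_recovery <= self.rto_target


# Hypothetical result from a live failover test of a tier-one system.
record = RecoveryTestRecord(
    system="line-of-business application",
    test_date=date(2024, 3, 6),
    rto_target=timedelta(hours=4),
    measured_recovery=timedelta(hours=5, minutes=30),
    failures=["application could not reach the database: hard-coded IP address"],
    changes_made=["replaced the hard-coded IP address with a DNS name"],
)

verdict = "met" if record.met_target else "missed"
print(f"{record.system}: {verdict} the {record.rto_target} target "
      f"(took {record.measured_recovery})")
```

Storing the measured time next to the target turns each test into evidence for the next budget conversation instead of a pass/fail checkbox.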
Frequently Asked Questions
Is a shorter recovery time always better?
Only up to the point where the cost of getting there exceeds the cost of the downtime it prevents. Cutting an RTO from eight hours to fifteen minutes can cost five to ten times as much per year as cutting it from twenty-four hours to eight. The right answer is the shortest target that the business can both afford and operationally support, not the shortest target that is technically possible. For most small businesses, tier-one systems land between one and four hours and tier-two systems land between eight and twenty-four.
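One way to make that trade-off concrete is to add the yearly price of each recovery option to the downtime loss it still leaves, then compare totals. The Python sketch below does that under loud assumptions: the option names and prices are invented, outages are assumed to occur once a year on average, and every outage is assumed to last the full RTO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryOption:
    label: str
    rto_hours: float
    annual_cost: float  # yearly spend needed to deliver this RTO


def total_annual_cost(option, cost_per_hour, outages_per_year):
    """Yearly spend on the option plus the expected downtime loss it still leaves.

    Pessimistic assumption: every outage lasts the full RTO.
    """
    expected_downtime_loss = outages_per_year * option.rto_hours * cost_per_hour
    return option.annual_cost + expected_downtime_loss


def cheapest_option(options, cost_per_hour, outages_per_year=1.0):
    """Pick the option with the lowest combined cost."""
    return min(
        options,
        key=lambda o: total_annual_cost(o, cost_per_hour, outages_per_year),
    )


# Hypothetical price list; real quotes will differ.
options = [
    RecoveryOption("nightly backup", 24, 2_000),
    RecoveryOption("same-day restore", 8, 5_000),
    RecoveryOption("bootable image", 4, 9_000),
    RecoveryOption("live failover", 0.25, 40_000),
]

for hourly_cost in (500, 3_000):
    best = cheapest_option(options, hourly_cost)
    print(f"At ${hourly_cost}/hour of downtime: {best.label} ({best.rto_hours}h RTO)")
```

With these invented numbers, the business losing $500 an hour lands on the eight-hour option and the one losing $3,000 an hour lands on the four-hour option, which is the pattern the answer above describes: the right target moves with the cost of downtime.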
Do cloud applications need their own recovery targets?
Yes, and treating cloud apps as if they are immune is a common mistake. The vendor handles uptime of their platform, but accidental deletion, ransomware that hits a synced folder, malicious insider activity, and account takeover can all destroy data that the vendor will not restore in the timeframe the business needs. Each cloud application should have its own RTO and RPO and a backup or export plan that the business controls.
Who should sign off on recovery targets?
The owner or the most senior operational leader, in writing. Recovery targets are budget decisions disguised as technical ones. The IT team can estimate cost and feasibility for each tier, but the decision about whether the company will spend the money to deliver a one-hour RTO instead of a four-hour RTO belongs to the person who controls the budget. The signed document also matters after an incident, when the question of whether the response met expectations gets revisited.
How are recovery targets different from uptime SLAs?
Uptime SLAs describe how often a system is expected to be available during normal operation, usually as a percentage like ninety-nine point nine percent. Recovery targets describe what happens after a failure has already occurred. Uptime is a steady-state metric. RTO and RPO are crisis metrics. Both belong in a managed-services agreement, and they should match the business tier assignments rather than being copied from a generic template.
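A quick calculation shows why an uptime percentage cannot stand in for a recovery target. The Python sketch below converts an uptime figure into the downtime it permits over a year; the function name is hypothetical and the year is taken as 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime an uptime SLA permits over the period."""
    return (100.0 - uptime_percent) / 100.0 * period_hours


budget = allowed_downtime_hours(99.9)
print(f"99.9% uptime allows about {budget:.2f} hours of downtime per year")
```

That works out to roughly 8.76 hours a year, and the percentage says nothing about how those hours are distributed. A single eight-hour outage would still satisfy a 99.9 percent SLA while missing a four-hour RTO by a wide margin.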
What is a realistic recovery point for accounting data?
For most small businesses with active accounting workflows, an RPO between fifteen minutes and one hour is reasonable for the accounting database itself, with daily exports of finalized records to an off-system archive. Losing a full day of posted transactions is almost always more painful than the cost of more frequent backups, but the exact target should be checked against how much manual re-keying is possible in a worst case and how the bookkeeping team is staffed.
Should remote workers be included in the recovery plan?
Yes. Endpoints used by remote workers carry business data, sit outside the office network, and often have weaker local backup coverage than office desktops. A recovery plan that only restores servers leaves a real gap if a remote employee’s laptop is the only place a deliverable lives. Endpoint backup, OneDrive or equivalent file sync, and a documented remote-worker recovery procedure should be part of the same plan.
Where Should You Start?
The first step is a one-page list. Every meaningful system the business depends on, sorted into three tiers, with a recovery time target and a recovery point target written next to each one. That single page is what turns “we have backups” into a recovery plan a leadership team can actually defend. Most owners discover during that exercise that two or three systems have never been classified at all, and that one or two have targets the current backup setup cannot meet.
If walking that list with a partner who has already done it across dozens of small businesses sounds useful, a working backup and recovery setup tuned to your real targets is what we build at O&O Systems. We will document the tier assignments, price each option clearly, deliver the architecture that matches what leadership signs off on, and run the first tabletop test together so the plan is not a number on paper.