The Best Disaster Recovery Approaches for IoT
IoT disaster recovery (DR) keeps me awake at night because I remember a major earthquake that struck our region a number of years ago. It knocked out systems and communications. I was a CIO at a financial institution, and for 48 hours we didn’t have a connection with our failover data center in Oregon. In fact, we could hardly communicate with each other! All of the landlines were out, and no one could communicate by cellphone or via satellite. If we wanted to know if our families were safe, or check on our branch facilities, we literally had to get into a car and drive there.
A scenario like this could happen again because few companies have IoT written into their disaster recovery plans. Yet as more companies develop their own IoT applications and as these applications become mission-critical, DR for IoT will become mission-critical too. What are the best DR approaches for IoT?
Update your disaster recovery plan and test it
Disaster recovery plans are often outdated because IT gets busy on projects and DR plans get placed on the back burner. Consequently, new technologies like IoT do not feature in DR plans when they’re introduced.
One solution is to include DR planning as a line item in every IoT implementation project. There should be a stipulation that the project doesn’t go live until there is a DR plan that covers what happens if the IoT goes offline.
The other component of IoT DR is executing a DR and failover test to ensure that the IoT DR plan as written really works. At a minimum, this test should be performed annually.
Use alternate ISPs—and understand your ISPs’ DR capabilities
Most IoT depends on Internet access, so part of any IoT DR strategy should be using more than one Internet service provider (ISP) to support IoT. This gives companies a fallback position if one ISP has a service outage.
The other element of ISP selection is vetting each ISP for its own disaster recovery strategy. Does it have redundant communications and failover that will enable it to keep running if its primary communications lines are unavailable? Will it agree to coordinate with you for annual testing of IoT failover?
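As a rough illustration of the multi-ISP fallback idea, here is a minimal Python sketch. The `Uplink` class, the ISP names, the priority numbers and the health check are all hypothetical; in practice the health check would be whatever probe your network team already trusts, and the switch itself is usually handled by routers or an SD-WAN rather than by application code.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Uplink:
    """One ISP connection. Names and priorities here are illustrative only."""
    name: str
    priority: int  # lower number = preferred


def choose_uplink(
    uplinks: Sequence[Uplink],
    is_healthy: Callable[[Uplink], bool],
) -> Optional[Uplink]:
    """Return the most-preferred uplink that passes the health check.

    Returns None when every uplink is down, which is the point at which
    the manual procedures in the DR plan take over.
    """
    for uplink in sorted(uplinks, key=lambda u: u.priority):
        if is_healthy(uplink):
            return uplink
    return None


if __name__ == "__main__":
    links = [Uplink("isp-primary", 0), Uplink("isp-secondary", 1)]
    down = {"isp-primary"}  # simulate an outage at the primary ISP
    active = choose_uplink(links, lambda u: u.name not in down)
    print(active.name if active else "no uplink available")
```

Passing the health check in as a function keeps the selection logic testable during an annual failover drill: the drill can simulate an outage without touching the real lines.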
Implement regional cloud backup
Since many IoT applications will be at the edge of the enterprise, it makes sense to orchestrate IoT out of the cloud, or at least to supplement what the central data center can do with some cloud hosting. Many cloud service providers also have a constellation of data centers in different geographic areas. This makes it easy for one data center to fail over to another data center that’s in a different region of the country or the world if a disaster strikes. By using a cloud provider with facilities and data center failover capability in different geographical areas, you can protect yourself if one or even several geographical regions where IoT is installed are incapacitated by a natural disaster. One example of a cloud service provider with multi-zone DR and failover capabilities is AWS.
Keep an on-hand spares inventory
Worldwide, there are some 30 billion connected IoT devices, with the six leading IoT technology assets being low-power wide-area networks (LPWANs), cellular technology (3G/4G/5G), mesh protocols, short-range relays like Bluetooth, Wi-Fi and radio frequency identification (RFID). Within each category, there are scores of vendors and choices.
An enterprise may use all or some of these devices in IoT deployment, as well as other IoT such as robots that work in healthcare or on manufacturing production lines.
Given this assortment of IoT appliances and devices, when companies develop a DR plan that incorporates IoT, a useful first step is to identify all the different IoT devices and vendors that are deployed throughout the enterprise. The next step is to determine the mission-critical path for a DR and failover that will provide service to a “barebones” subset of corporate IoT. This IoT subset should be just enough to support a continuation of business operations.
Once the mission-critical IoT path has been determined, IT can list all of the IoT technology in the DR critical path and work with vendors to acquire spares (or arrange for rapid replacement parts) that can be swapped in for any critical-path IoT that fails. A backup parts inventory is a budget item that should be presented to management and approved as part of the DR support plan for IoT.
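The inventory, critical-path and spares steps can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the `Device` fields, the model names, and especially the rule of thumb of one spare per ten deployed critical units (`units_per_spare`), which each organization would set for itself with its vendors.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Mapping


@dataclass(frozen=True)
class Device:
    """One deployed IoT device. Fields are illustrative, not a standard schema."""
    model: str
    vendor: str
    critical: bool  # True if the device is on the DR critical path


def spares_shortfall(
    devices: Iterable[Device],
    spares_on_hand: Mapping[str, int],
    units_per_spare: int = 10,  # assumed policy: one spare per 10 critical units
) -> Dict[str, int]:
    """Return, per model, how many spares still need to be acquired.

    Only critical-path devices are counted. Every critical model gets at
    least one spare because the ceiling division rounds up.
    """
    critical_counts = Counter(d.model for d in devices if d.critical)
    shortfall: Dict[str, int] = {}
    for model, count in critical_counts.items():
        target = -(-count // units_per_spare)  # integer ceiling division
        missing = target - spares_on_hand.get(model, 0)
        if missing > 0:
            shortfall[model] = missing
    return shortfall


if __name__ == "__main__":
    fleet = (
        [Device("gateway-a", "VendorX", True)] * 12
        + [Device("sensor-b", "VendorY", True)] * 3
        + [Device("badge-c", "VendorZ", False)] * 5
    )
    print(spares_shortfall(fleet, {"gateway-a": 1}))
```

The resulting shortfall list is the kind of input that can go straight into the budget request for the backup parts inventory.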
Manual processes and employee training
It is equally important to arrange formal backup procedures for the business operations themselves.
Recalling my experience with the major earthquake DR, we had employees in all our branches who remembered manual ways of doing things that would normally be done with computers. Because they remembered how to perform business operations by hand, we were able to keep business operations going. If such a situation occurred today, I’m not so sure that this approach would work, because many of the employees who remembered how to operate the business manually are now retired.
Finding employees who can operate in a manual mode if they have to is a major issue for companies.
Is there a way to fail over a production line to a distant facility that is still functioning? If the company can’t do that, are there ways to keep most operations running manually?
These are questions that management, IT, HR and business leaders need to work together on, because even if your IoT technology and failover plan are flawless, there are still those “in between” moments from failure to recovery when individuals in the business need to know what to do.