A Whitepaper by
Dolce Vita IT Solutions LLC
Business Continuity and Ransomware Prevention in
Manufacturing environments vary tremendously in the information technology they use. The technology ranges from physical and virtual servers and workstations to workflow management systems, including enterprise resource planning (ERP) and other exceptionally complex database systems, inventory management, and similar systems. The vast majority of manufacturing businesses use hybrid environments, with some systems on-premise and others cloud-based. By definition, virtualized environments still contain a physical component. Email systems also vary, from on-premise email servers to cloud-hosted systems such as Hosted Exchange and Office 365.
Yet with all of this variety the following aspects generally apply:
Drawing on experience assisting a wide variety of manufacturing facilities, from heavy industry to medical surgical camera repair and manufacturing, this whitepaper offers guidelines to help manufacturing management teams improve their awareness of ransomware, understand the risks it poses to information systems and cash flow, and take steps to reduce that risk.
Sources of critical data
The manufacturing information environment has several sources of sensitive data that need to be protected, some of which are not obvious. Cash flow, workflow, customer service, and proprietary information may all be affected if the environment is compromised:
Ransomware – what is it, and why all the fuss?
Businesses have been dealing with malware (malicious software) ever since the first criminal miscreants understood that they could steal someone else’s work and make a profit. Ransomware is an exceptionally malicious subset of malware, designed to block access to information, typically by encrypting it, and to charge a ransom to restore that access.
How do ransomware infections occur?
How serious a problem is this?
In the vast majority of cases, a ransomware infection starts at a user workstation. Once it has begun, it instructs the infected workstation to inventory all data shared across the network to which the user has read/write access. The inventory effectively prioritizes the shared data by its perceived value (for example, financial data would probably be attacked and encrypted before a repository of photos). Left unchecked, every folder and file to which the infected user has read/write access becomes encrypted and therefore inaccessible to all users on the network. In the majority of cases, the only way to recover the data is to restore it from unaffected backups.
In short, ransomware is a serious problem. The photo below shows what manufacturing users may see once ransomware has ravaged their shared network data.
Requirements for protecting data
Protection of data in a manufacturing environment does not differ greatly from that in other business environments; however, the risk factors can vary greatly. To understand some of the risks, consider the most typical ways for data to be lost.
Although the physical environment does not contribute to ransomware issues, it can contribute to loss of data due to a wide variety of factors:
Firewall and Content Filtering
Antivirus and Anti-Spam
Concerning ransomware, there are a number of important conditions which owners and business managers should be aware of:
Backups and Business Continuity
It is generally acknowledged in the technology industry that ordinary file/folder backups, such as those offered by many legacy backup vendors, are no longer adequate for business continuity. It is essential to have a business continuity plan that covers recovery not only of the files and folders encompassing all critical data, but also of operating systems, proprietary data, documents, correspondence, accounting and business planning data, and email. In addition, any critical servers or workstations should be protected by imaging software, preferably both on-site and off-site. This allows a current backup image to be restored to new hardware if a server or workstation fails, or to completely replace a server that ransomware has encrypted.
Unfortunately, it is common to see situations where the only recoverable data was that in the Enterprise Resource Planning (ERP) system or on particular devices. This may be only a portion of the data that needs to be recoverable. It is the business management team’s responsibility to know where all critical data resides, verify that it is backed up, and verify that it is recoverable. As noted above, the only recourse after a successful ransomware attack is typically to restore any important data from unaffected backups.
Planning for recovery from ransomware attacks involves a detailed process to account for the systems which contain the most valuable and sensitive information and the time window required for those systems to be recovered. This prioritization is not primarily a technical decision, but a business decision. As the per-hour costs of downtime increase to include personnel labor costs, contract labor, non-delivery penalties, and reputation loss, the cost effectiveness of higher quality business continuity systems and prevention measures increases.
It is useful at this stage to review the distinction between backups and business continuity, because this is exceptionally important from a cash flow perspective.
The term ‘backup’ describes a copy or image of critical data kept on separate media so that the data can be recovered if the original media or device is damaged or otherwise lost. The term ‘backup’ does not account for the time required to recover data.
The term ‘business continuity’ describes the practice of backing up data as an image in such a way that it is recoverable within an acceptable timeframe (the recovery time objective, RTO) and at a sufficiently granular level (the recovery point objective, RPO) to minimize the cash flow impact of a negative event on the business. Since ransomware is often a very fast-acting event, this can be critically important and is illustrated by two recent real-world examples.
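As a minimal sketch of how these two targets can be checked, the Python below compares a backup schedule and a measured restore time against an RTO (recovery time objective) and an RPO (recovery point objective). The class name, function name, system name, and all figures are hypothetical and chosen only for illustration. They do not describe any particular backup product.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class ContinuityTarget:
    """Hypothetical recovery targets for one protected system."""
    name: str
    rto: timedelta  # longest acceptable time to restore service
    rpo: timedelta  # largest acceptable window of lost data


def meets_targets(target, backup_interval, measured_restore_time):
    """Return (rto_ok, rpo_ok) for a given backup schedule.

    Worst-case data loss is the time between two backups, so the RPO is
    met only if backups run at least that often. The RTO is met only if
    a tested restore completes within the target.
    """
    rto_ok = measured_restore_time <= target.rto
    rpo_ok = backup_interval <= target.rpo
    return rto_ok, rpo_ok


# Illustrative figures only: a one-hour RTO and RPO for a critical database.
erp = ContinuityTarget("ERP database", rto=timedelta(hours=1), rpo=timedelta(hours=1))

# Nightly backup with a 20-hour restore versus hourly images with a 30-minute restore.
nightly = meets_targets(erp, timedelta(hours=24), timedelta(hours=20))
hourly = meets_targets(erp, timedelta(hours=1), timedelta(minutes=30))

print(nightly)  # (False, False)
print(hourly)   # (True, True)
```

The restore time passed in should come from an actual test recovery rather than a vendor estimate, since an untested figure says little about whether the RTO can be met.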
The importance of user training cannot be overstated. While technology will handle a portion of the ransomware threat, remember the comment at the beginning of this whitepaper: “Manufacturers generally invest minimal time and effort in IT training for management and users”. It is crucial to bear in mind the following “rules” which impact IT with regard to ransomware:
Rule #1: No anti-spam, antivirus, or content filtering technology is 100% effective
Rule #2: No systems administrator or consultant can change Rule #1
An organization’s users are the last line of defense against malware attacks. They must be properly trained to recognize ransomware threats and to know what to do when an attack appears to be under way. The training can be conducted by the internal IT group or by the business’s IT consultants, if applicable. A user needs to know how to respond if they see a ransomware-related warning similar to the following:
“A malfunction has been detected with Windows 7 / Server 2008 R2 and your IE 11.0. Please call the number below to speak with a technician to assist you in resolving this matter. DO NOT SHUT DOWN OR RESTART THE COMPUTER OR YOUR INFORMATION MAY BE LOST…”
In business continuity planning, one of the first steps is to review the business’s highest-risk, highest-probability downtime scenarios. The next is to take reasonable steps to protect data and to ensure that the management team understands the timeframe for recovering data with the current infrastructure (that is, the RTO achievable with the existing infrastructure). A risk matrix is one of the most useful and dynamic methods used for business continuity planning.
The client is an Oklahoma educational institution whose primary file server failed late on a weekend due to a failure of multiple hard drives. The school was protected by an imaging system, but not by a business continuity appliance. The failure required that the drives be replaced and a “bare-metal” recovery be performed to different hardware.
Recovering the data for this 1.5 TB server took the backup system approximately 20 hours. The business impact extended from about 8 AM until about 4 PM on Monday.
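The figures in this example imply an effective restore rate, which can be used for rough planning of other servers. The sketch below is a back-of-the-envelope calculation only. It assumes decimal units (1.5 TB = 1,500 GB) and that restore time scales linearly with data size, which real restores may not do. The function names and the 4 TB server are hypothetical.

```python
def restore_rate_gb_per_hour(size_gb: float, hours: float) -> float:
    """Effective restore rate observed during a recovery."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return size_gb / hours


def estimated_restore_hours(size_gb: float, rate_gb_per_hour: float) -> float:
    """Rough restore time for another server, assuming the same rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gb_per_hour


# From the example above: about 1.5 TB restored in about 20 hours.
observed_rate = restore_rate_gb_per_hour(1500, 20)
print(observed_rate)  # 75.0 GB per hour

# Hypothetical 4 TB server restored at the same rate.
print(round(estimated_restore_hours(4000, observed_rate), 1))  # 53.3 hours
```

An estimate like this is only a starting point for discussion. A scheduled test recovery on the actual hardware gives the figure that should go into the continuity plan.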
The client is a high-tech manufacturer with approximately 50 employees, serving the oilfield and pipeline industries. A critical database server failed because an application developer was making modifications to the live system. In addition to the client's personnel, five 18-wheel transports were idled by this data incident, at an estimated downtime cost of $2,500 per hour. The business was up and running on a server recovery image in under 30 minutes thanks to the business continuity system in place. The updated server data from the remainder of that day was restored outside of regular business hours to minimize disruption to the client.
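The cash flow difference between a fast and a slow recovery can be worked through with simple arithmetic. The sketch below uses the $2,500 per hour estimate from this example and treats the 30-minute recovery as an upper bound. The eight-hour outage is a hypothetical comparison case, roughly one business day, and is not a figure reported for this client.

```python
def downtime_cost(hours_down: float, cost_per_hour: float) -> float:
    """Direct cost of an outage: duration multiplied by hourly cost."""
    if hours_down < 0 or cost_per_hour < 0:
        raise ValueError("inputs must be non-negative")
    return hours_down * cost_per_hour


COST_PER_HOUR = 2500.0  # estimated downtime cost from the example above

# Recovery in under 30 minutes with a business continuity system.
with_continuity = downtime_cost(0.5, COST_PER_HOUR)

# Hypothetical: the same incident lasting a full eight-hour business day.
full_day_outage = downtime_cost(8.0, COST_PER_HOUR)

print(with_continuity)                    # 1250.0
print(full_day_outage)                    # 20000.0
print(full_day_outage - with_continuity)  # 18750.0
```

This counts only the hourly estimate. Non-delivery penalties and reputation loss would add to the figure and are harder to quantify.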
This illustration of downtime helps visualize recovery timeframes and their impact on cash flow. When considering cash flow impact, it is useful to plan around the potential absence of key data for the duration of various recovery scenarios, and at least to plan for work process adjustments that minimize business impact. Visualization, however, must go beyond the obvious cases in which the ERP system or a critical file server is down. What about email? If it is on-premise and is a critical contact method for clients, then the recovery plan must give it priority. If the facility uses a voice over IP (VoIP) phone system that is converged with the data network, the continuity plan must be adjusted so that the affected data systems can be disconnected from the network without affecting the voice system. How do infected servers and workstations need to be disconnected from the network? If this is not considered ahead of time, a ransomware attack can have an even greater financial impact.
Planning for DR priorities with a ransomware scenario
The following is a greatly simplified chart which illustrates a means to assess risk factors. The most critical systems are listed in the left column, initially in no particular order. The business impact of losing each system is estimated in the next column (from 1 to 10), and the likelihood of the system being affected by ransomware in the column after that (from 1 to 10). The product of those two scores gives a risk estimate from 1 to 100 in the last column. The spreadsheet is then sorted by the calculated risk.
Example Risk Matrix
As noted in this simplified example, the business evaluates its key risk factors based upon experience. The impact estimate assumes that data is recoverable; of course, this can only be confirmed by performing scheduled test recoveries and documenting the results. The likelihood of a particular type of data damage or loss is quite subjective: in most cases it is a judgement call based upon the experience of each business. The risk factors shown here are a very minimal listing of those present in a manufacturing setting. These will vary tremendously between businesses, so it is critical for each management team to do its own evaluation based upon its unique situation.
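The scoring method described for the risk matrix (impact from 1 to 10, multiplied by likelihood from 1 to 10, then sorted) can be sketched in a few lines of Python. The system names and scores below are hypothetical placeholders and are not taken from the example chart. Sorting from highest to lowest risk is also an assumption about how the sorted spreadsheet is read.

```python
from dataclasses import dataclass


@dataclass
class RiskItem:
    """One row of a simplified risk matrix."""
    system: str
    impact: int      # business impact of losing the system, 1 to 10
    likelihood: int  # likelihood of ransomware affecting it, 1 to 10

    def __post_init__(self):
        for label, score in (("impact", self.impact), ("likelihood", self.likelihood)):
            if not 1 <= score <= 10:
                raise ValueError(f"{label} must be between 1 and 10, got {score}")

    @property
    def risk(self) -> int:
        """Risk estimate from 1 to 100: impact multiplied by likelihood."""
        return self.impact * self.likelihood


def rank_by_risk(items):
    """Return the rows sorted from highest to lowest calculated risk."""
    return sorted(items, key=lambda item: item.risk, reverse=True)


# Hypothetical rows, entered in no particular order.
matrix = [
    RiskItem("ERP database", impact=9, likelihood=6),
    RiskItem("File server", impact=7, likelihood=8),
    RiskItem("On-premise email", impact=5, likelihood=7),
]

ranked = rank_by_risk(matrix)
for item in ranked:
    print(f"{item.system:<20}{item.impact:>4}{item.likelihood:>4}{item.risk:>5}")
# Order printed: File server (56), ERP database (54), On-premise email (35)
```

Because both scores are judgement calls, the ranking is best treated as a prompt for management discussion rather than a precise measurement.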
Businesses should re-evaluate risk factors on at least an annual basis, and should test data recovery on a scheduled basis. Those businesses with predicted major cash flow impacts should typically use a business continuity system which self-tests every protected server on at least a daily basis. Changes to information systems, infrastructure, etc. should prompt the business to update their business continuity plans as needed. These evaluations can conclude with disaster recovery tabletop drills to move the organization through a notional scenario without risking data.
It is important for manufacturing business management teams to be involved in the appropriate protection of their data. Involvement in identification of all critical data sources and decision making regarding continued availability of that data will serve to reduce risk to the business and can ensure better quality of service for clients. Making assumptions about the ability to recover quickly from a successful ransomware attack is irresponsible and potentially dangerous for the business and for quality of service to customers. It is easy for business owners to feel intimidated about this process, but their IT staff or service provider should be able to provide assistance to make this a reasonably painless process.
About the author: Dolce Vita IT Solutions is an Edmond, Oklahoma-based IT consulting firm specializing in providing IT support to small and mid-sized businesses in medical, insurance, manufacturing, banking, and other business verticals. In business since 2002, Dolce Vita works with businesses from 2 to 500 users. Lane can be reached at firstname.lastname@example.org.