A Whitepaper by
Dolce Vita IT Solutions LLC
Business Continuity and Ransomware Prevention in
Manufacturing environments vary tremendously in the information technology they use. The technology ranges from physical and virtual servers and workstations to workflow management systems, including enterprise resource planning (ERP) and other exceptionally complex database systems, inventory management, and similar systems. The vast majority of manufacturing businesses use hybrid environments, with some systems on-premise and others cloud-based. Even virtualized environments still contain a physical component. Email systems likewise range from on-premise email servers to cloud-hosted systems such as Hosted Exchange and Office 365.
Yet despite all of this variety, the following generally apply:
- Manufacturers generally invest minimal time and effort in IT training for management and users
- With some exceptions, manufacturers are generally slow to adopt changes to their IT systems
- Manufacturers, like most businesses, do a very uneven job of identifying their most critical data and ensuring adequate resources are brought to bear to protect that data
- Many manufacturers have a relatively short time window to get ERP, email, and other critical information resources back into production before cash flow is seriously impacted
Based upon experience in assisting a wide variety of manufacturing facilities from heavy industry to medical surgical camera repair and manufacturing, some guidelines are offered for manufacturing management teams to improve awareness of ransomware, the risks it poses to information systems and cash flow, and some steps to reduce risk.
Sources of critical data
The manufacturing information environment has several sources of sensitive data which need to be protected, some of which are not necessarily obvious. Cash flow, workflow, customer service, and proprietary information may all be affected if the environment is compromised:
- ERP, inventory management, and shipping systems immediately affect cash flow when they are down
- Email systems affect workflow and customer service
- Financial and accounting information
- Proprietary data such as engineering, CAD, and other intellectual property
- ISO documentation unique to the business and proprietary processes
- Shipment information, bills of lading, and other “proof” documents
- General business data and documents
Ransomware – what is it, and why all the fuss?
Businesses have been dealing with malware (malicious software) ever since the first criminal miscreants understood that they could steal someone else’s work and make a profit. Ransomware is an exceptionally malicious subset of malware, designed to steal access to information and charge a ransom to get access back.
How do ransomware infections occur?
- Email – in manufacturing environments, phishing emails (emails designed to look authentic which contain attachments or links to infect or to direct the user to an infected website) are the single most common infection vector
- Compromised websites – either valid websites which have been infected, or “phishing” websites which are designed to look authentic but in reality carry infectious code. Users are generally directed to these sites by a malicious email or by compromised search results
- USB keys (thumb drives) – used to transfer internal data from the office to the shop floor or vice versa
How serious a problem is this?
In the vast majority of cases, a ransomware infection starts at a user workstation. Once begun, it instructs the infected workstation to inventory all data shared across the network to which the user has read/write access. The inventory effectively prioritizes the shared data based upon its perceived value (for example, financial data would probably be attacked and encrypted before a repository of photos). Left unchecked, every folder and file to which the infected user has read/write access becomes encrypted and therefore inaccessible to all users on the network. In the majority of cases the only way to recover data is to restore it from unaffected backups.
- During 2016, 80 new ransomware families emerged, and by the end of the year recognized variants had grown from 2,900 to over 30,000
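Because the damage is bounded by what the infected account can write to, it can be useful to see that exposure for a given user. The following is a minimal sketch only, not a complete audit tool: the function name `writable_dirs` and the sample folder names are hypothetical, and on Windows file servers `os.access` does not fully evaluate share or NTFS permissions, so the result should be treated as a rough first pass.

```python
import os
import tempfile


def writable_dirs(root):
    """Return every directory under root that the current account can write to."""
    found = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        # Ask the operating system whether this account may write here.
        if os.access(dirpath, os.W_OK):
            found.append(dirpath)
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical stand-in for a mapped network share.
    with tempfile.TemporaryDirectory() as share:
        os.makedirs(os.path.join(share, "accounting"))
        os.makedirs(os.path.join(share, "engineering", "cad"))
        for path in writable_dirs(share):
            print(path)
```

Run under an ordinary user account against a real share, a long list is a sign that a single infected workstation could encrypt a large portion of the shared data.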
- The antivirus vendor Kaspersky estimates that roughly 40% of businesses have been impacted by ransomware
- Successful ransomware attacks result in some corporate data loss in over 50% of cases
- According to the Trustwave Global Security Report, the ROI for ransomware perpetrators is 1,425%
In short, ransomware is a serious problem. The photo below shows what manufacturing users may see once ransomware has ravaged their shared network data.
Requirements for protecting data
Protection of data in a manufacturing environment does not differ greatly from that in other business environments; however, the risk factors can vary greatly. To understand some of the risks, consider the most typical ways for data to be lost.
- Ransomware – one of the most prevalent risks today; it can result in the loss of nearly all shared data for a business
- Corruption – if power issues occur and battery backups are older than 3-4 years, data corruption is common, because servers and computer equipment can shut down spontaneously or be damaged by brown-outs (the supply voltage in the facility drops slightly, and if the battery backup does not correct it, damage occurs)
- Fire or water damage – in manufacturing facilities power is carried at substantially higher amperages and, depending upon the manufacturing processes in use, water or gases can be carried at high pressures, posing a higher risk than in office settings
- Environmental damage due to heat, dust, poorly controlled humidity
- Theft – improperly secured wireless can result in unintended access to systems
- Theft – weak, non-existent, or old passwords on workstations, desktops which do not automatically lock after a configured inactivity period, or shared passwords used by a malicious user
- Loss – data not being backed up and then being accidentally deleted or destroyed
- Loss of access – poor documentation of credentials for every aspect of the information operation
Although the physical environment does not contribute to ransomware issues, it can contribute to loss of data due to a wide variety of factors:
- Servers should operate in an environmentally controlled space. Lack of dust control and temperature/humidity controls can shorten equipment lifespan. Airflow is critical to the longevity of servers, workstations, and network equipment
- Power systems – particular attention must be paid to battery backups for servers, network equipment and workstations. Manufacturing facilities are extraordinarily unfriendly to power systems, so server battery backups must be more robust and must be designed to automate the process of shutting down all servers in the event of serious power issues
- On shop floors, good practice is to use thin clients instead of conventional workstations as these are far more resistant to environmental issues
- The building should be locked and alarmed, with an alarm service
- Use of security cameras is recommended with specific attention to the location where servers and network equipment are located.
Firewall and Content Filtering
- Typically, it is recommended to use a hardware firewall which is a capable Unified Threat Management (UTM) appliance, with relevant content filtering and security licensing
- The firewall vendor will have published guidance regarding the configurations and settings required to minimize ransomware risk. This guidance is updated periodically, so settings require adjustment as the threats change
- The firewall should normally be configured to block all outbound ports not required for routine business operations
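The outbound-port recommendation amounts to a default-deny rule: traffic is permitted only if it is explicitly listed. The sketch below illustrates that logic in Python rather than in any firewall vendor's configuration syntax. The allow list and function names are hypothetical, and each business would derive its own list from its actual traffic requirements.

```python
# Hypothetical allow list; each business derives its own from real traffic needs.
ALLOWED_OUTBOUND_PORTS = {53, 443, 587}  # DNS, HTTPS, mail submission


def outbound_allowed(port, allowed=ALLOWED_OUTBOUND_PORTS):
    """Default deny: a port is permitted only if it is explicitly listed."""
    return port in allowed


def unexpected_ports(observed_ports, allowed=ALLOWED_OUTBOUND_PORTS):
    """Return the observed outbound ports that the allow list would block."""
    return sorted(set(observed_ports) - set(allowed))


if __name__ == "__main__":
    # Hypothetical ports seen in a firewall log export.
    print(unexpected_ports([443, 53, 6667, 443, 8080]))
```

Comparing ports actually observed in firewall logs against the intended allow list is one way to spot traffic that should be investigated or blocked.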
Antivirus and Anti-Spam
Concerning ransomware, there are a number of important conditions which owners and business managers should be aware of:
- Anti-SPAM is not necessarily easy to configure properly, and SPAM is typically the most prevalent source of ransomware risk. It is generally recommended that anti-SPAM be cloud-based, so that infections are dealt with before they ever land on the client's on-premise systems and so that email system load is significantly reduced. It is common for reputable systems to filter out over 75% of inbound email as originating from spammers and other blacklisted sources.
- Use manufacturer best practices to configure anti-SPAM systems
- Each reputable antivirus software vendor has available technical best practices with recommended settings most likely to protect the client environment. The management team should be confident that their protection is using these best practices
- Most antivirus products are capable of content filtering. It is recommended that this be configured consistent with current best practices, in addition to the content filtering running on the firewall. These settings can be easily tested
- Best practices should be reviewed on at least a quarterly basis because manufacturers are continually adding features, such as machine learning, to their products
Backups and Business Continuity
It is generally acknowledged in the technology industry that ordinary file/folder backups, such as those offered by many legacy backup vendors, are no longer adequate for business continuity. It is essential to have a business continuity plan which includes the ability to recover not only the files and folders encompassing all critical data, but also operating systems, proprietary data, documents, correspondence, accounting and business planning data, and email. In addition, any critical servers or workstations should be protected by imaging software, preferably both on-site and off-site. This allows a current backup image to be restored to new hardware if a server or workstation fails, or to completely replace the server in the event ransomware encrypts it.
Unfortunately, it is common to see situations where the only data recoverable was that in the Enterprise Resource Planning (ERP) system or on particular devices. This may be only a portion of the data required to be recoverable. It is the business management team’s responsibility to know where all critical data resides, verify that it is backed up, and verify that it is recoverable. As noted above, the only recourse when a ransomware attack is successful is typically to restore any important data from unaffected backups.
Planning for recovery from ransomware attacks involves a detailed process to account for the systems which contain the most valuable and sensitive information and the time window required for those systems to be recovered. This prioritization is not primarily a technical decision, but a business decision. As the per-hour costs of downtime increase to include personnel labor costs, contract labor, non-delivery penalties, and reputation loss, the cost effectiveness of higher quality business continuity systems and prevention measures increases.
It is useful at this stage to review the distinction between backups and business continuity, because this is exceptionally important from a cash flow perspective.
The term ‘backup’ is used to describe the fact that a copy or image of critical data is kept on separate media allowing data to be recovered in the event the original media or device is damaged, etc. The term ‘backup’ does not account for the time required to recover data.
The term ‘business continuity’ is used to describe the practice of backing up data as an image in such a way that it is recoverable in an acceptable timeframe (the recovery time objective, or RTO) and in a sufficiently granular fashion (the recovery point objective, or RPO) to minimize the cash flow impact of a negative event on a business. Since ransomware is often a very fast-acting event, this can be critically important and is illustrated by two recent real-world examples.
The importance of user training cannot be overstated. While technology will handle a portion of the ransomware threat, remember the comment at the beginning of this whitepaper: “Manufacturers generally invest minimal time and effort in IT training for management and users”. It is crucial to bear in mind the following “rules” which impact IT with regard to ransomware:
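The two objectives can be expressed as simple checks: the RPO limits how much recent data may be lost, and the RTO limits how long a restore may take. The following is a minimal sketch under assumed values. The function names, timestamps, and objectives are all hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup, incident_time, rpo):
    """True when the data created since the last good backup is within the RPO."""
    return incident_time - last_backup <= rpo


def meets_rto(restore_duration, rto):
    """True when the restore completes within the RTO."""
    return restore_duration <= rto


if __name__ == "__main__":
    # All values below are hypothetical.
    rpo = timedelta(hours=1)
    rto = timedelta(hours=4)
    last_backup = datetime(2024, 1, 8, 9, 0)
    incident = datetime(2024, 1, 8, 9, 45)

    print(meets_rpo(last_backup, incident, rpo))  # True: 45 minutes of data at risk
    print(meets_rto(timedelta(hours=20), rto))    # False: 20 hours exceeds 4 hours
```

A conventional backup can satisfy the first check and still fail the second, which is the distinction between backup and business continuity drawn above.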
Rule #1: No anti-SPAM, antivirus, or content filtering technology is 100% effective
Rule #2: No systems administrator or consultant can change Rule #1
An organization’s users are the last line of defense against malware attacks. They must be properly trained on how to recognize ransomware threats and what to do when an attack seems to occur. The training can be conducted by the internal IT group or the business’ IT consultants if applicable. A user needs to know how to respond if they see a ransomware-related warning similar to the following:
“A malfunction has been detected with Windows 7 / Server 2008 R2 and your IE 11.0. Please call the number below to speak with a technician to assist you in resolving this matter. DO NOT SHUT DOWN OR RESTART THE COMPUTER OR YOUR INFORMATION MAY BE LOST…”
In business continuity planning, one of the first steps is to review the business’ highest-risk, highest-probability downtime scenarios. The next is to take reasonable steps to protect data and to ensure that the management team understands the timeframe for recovery of data based upon the current infrastructure (the available RTO with the existing infrastructure). A risk matrix is one of the most useful and dynamic methods used for business continuity planning.
In the first example, the client is an Oklahoma educational institution whose primary file server failed late on a weekend due to a failure of multiple hard drives. The school was protected by an imaging system, but not by a business continuity appliance. The failure required that the drives be replaced and a “bare-metal” recovery be performed to different hardware.
Recovering the data for this 1.5TB server took the backup system approximately 20 hours. The business impact extended from about 8 AM Monday until about 4 PM Monday.
In the second example, the client is a high-tech manufacturer with approximately 50 employees, serving the oilfield and pipeline industries. A critical database server failed due to live system modifications being made by an application developer. In addition to their personnel, they had five 18-wheel transports idled by this data incident, at an estimated downtime cost of $2,500 per hour. The business was up and running with a server recovery image in under 30 minutes due to the business continuity system in place. The updated server data from the remainder of that day was restored outside of regular business hours to minimize disruption to the client.
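The cash flow difference between the two recovery approaches can be estimated with simple arithmetic: hourly downtime cost multiplied by hours of downtime. The sketch below uses the manufacturer's stated hourly cost. Pairing that cost with a 20-hour restore is a hypothetical comparison and not something that happened to this client, and the function name `downtime_cost` is illustrative.

```python
def downtime_cost(hourly_cost, hours_down):
    """Estimated cash impact of an outage: hourly cost times hours of downtime."""
    return hourly_cost * hours_down


if __name__ == "__main__":
    HOURLY_COST = 2500  # dollars per hour, from the manufacturer example

    # Roughly 30 minutes of downtime, the upper bound in the manufacturer example.
    print(downtime_cost(HOURLY_COST, 0.5))  # 1250.0

    # Hypothetical: the same hourly cost with a 20-hour restore, the duration
    # seen in the file-server example. This pairing is an assumption.
    print(downtime_cost(HOURLY_COST, 20))   # 50000
```

Even as a rough estimate, the gap between the two figures gives a management team a basis for deciding how much to invest in faster recovery.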
This illustration of downtime helps visualize recovery timeframes and their impact on cash flow. When considering cash flow impact it is useful to plan around the potential absence of key data for the duration of various recovery scenarios, and to at least plan for work process adjustments to minimize business impact. Visualization, however, must go beyond the obvious cases in which the ERP system or a critical file server is down. How about email? If it is on-premise, and it is a critical contact method for clients, then the recovery planning must give it priority. If the facility uses a voice over IP (VoIP) phone system which is converged with the data network, the continuity plan must account for disconnecting the affected data systems from the network without affecting the voice system. How do infected servers and workstations need to be disconnected from the network? If this is not given consideration ahead of time, then a ransomware attack can have an even greater financial impact.
Planning for DR priorities with a ransomware scenario
The following is a greatly simplified chart which illustrates a means to assess risk factors. The most critical systems are listed in the left column, initially in no particular order. The business impact of the loss of each system is estimated in the next column (from 1 to 10), and the likelihood of the system being impacted by ransomware in the column after that (from 1 to 10). The product of those two values gives a risk estimate from 1 to 100 in the last column. The spreadsheet is then sorted based upon the calculated risk.
Example Risk Matrix
As noted in this simplified example, the business evaluates its key risk factors based upon experience. The impact is based upon the assumption that data is recoverable. Of course, this can only be evaluated by performing scheduled test recoveries and documenting the results. The likelihood of a particular type of data damage or loss is quite subjective. It can be based upon experience but, in most cases, it is a judgement call made by each business. The risk factors shown here are a very minimal listing of those present in a manufacturing setting. These will vary tremendously between businesses, so it is critical for each management team to do its own evaluation based upon its unique situation.
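The risk matrix calculation described above (impact multiplied by likelihood, then sorted by the result) can be sketched in a few lines. The system names and scores below are hypothetical placeholders rather than values from the example chart, and the function name `rank_risks` is illustrative. Each business would substitute its own entries.

```python
def rank_risks(systems):
    """Score each system as impact x likelihood and sort highest risk first."""
    scored = [
        (name, impact, likelihood, impact * likelihood)
        for name, impact, likelihood in systems
    ]
    return sorted(scored, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    # Hypothetical systems with impact and likelihood scores from 1 to 10.
    systems = [
        ("Email server", 6, 7),
        ("ERP database", 10, 5),
        ("CAD file share", 8, 8),
    ]
    for name, impact, likelihood, risk in rank_risks(systems):
        print(f"{name:<16} impact={impact:>2} likelihood={likelihood:>2} risk={risk:>3}")
```

The sorted output puts the systems that deserve the earliest attention at the top, which is the purpose of sorting the spreadsheet by calculated risk.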
Businesses should re-evaluate risk factors on at least an annual basis, and should test data recovery on a scheduled basis. Those businesses with predicted major cash flow impacts should typically use a business continuity system which self-tests every protected server on at least a daily basis. Changes to information systems, infrastructure, etc. should prompt the business to update their business continuity plans as needed. These evaluations can conclude with disaster recovery tabletop drills to move the organization through a notional scenario without risking data.
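A scheduled test recovery is only meaningful if the restored data is actually compared against a known-good copy. The following is a minimal sketch of one way to do that, using SHA-256 checksums to compare a restored folder with a reference folder. The function names are hypothetical, and the sketch assumes the reference copy has not changed since the backup was taken (for example, a snapshot). Live production data would show differences that are not backup failures.

```python
import hashlib
import os


def file_digest(path):
    """SHA-256 of a file, read in chunks so large files do not exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_root, restored_root):
    """Return (relative path, problem) pairs for files missing or altered in the restore."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            source_path = os.path.join(dirpath, name)
            relative = os.path.relpath(source_path, source_root)
            restored_path = os.path.join(restored_root, relative)
            if not os.path.isfile(restored_path):
                problems.append((relative, "missing"))
            elif file_digest(source_path) != file_digest(restored_path):
                problems.append((relative, "content differs"))
    return sorted(problems)
```

An empty result from `verify_restore` means every reference file was found in the restored copy with identical content. Documenting that result after each test recovery gives the management team evidence that the data is recoverable.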
It is important for manufacturing business management teams to be involved in the appropriate protection of their data. Involvement in identification of all critical data sources and decision making regarding continued availability of that data will serve to reduce risk to the business and can ensure better quality of service for clients. Making assumptions about the ability to recover quickly from a successful ransomware attack is irresponsible and potentially dangerous for the business and for quality of service to customers. It is easy for business owners to feel intimidated about this process, but their IT staff or service provider should be able to provide assistance to make this a reasonably painless process.
About the author: Dolce Vita IT Solutions is an Edmond, Oklahoma-based IT consulting firm specializing in providing IT support to small and mid-sized businesses in medical, insurance, manufacturing, banking, and other business verticals. In business since 2002, Dolce Vita works with businesses from 2 to 500 users. Lane can be reached at firstname.lastname@example.org.