Cyber Security - what IT needs to understand about OT
In considering "real time 24/7 critical infrastructure", someone recently asked:
"what is important for IT (Information Technology - corporate LAN) people to see if they visited an OT (Operational Technology - field equipment running infrastructure) site that could not be seen or understood by sitting around a table in the office in a meeting about cyber security threats to the IT system originating from OT ?".
Apparently the premise for this questions started as "Our IT resilience audit believe that the unknown OT areas are the biggest weaknesses in our portfolio"
Complex but great question!
So here are some thoughts about what they would see, and the implications of what they see, on a field tour of a typical electricity utility.
The first thing to ask is "who distrusts whom?"
In many (dare I say most?) instances, the OT people would consider precisely the OPPOSITE and want to keep IT connectivity out of the operational systems!!
Their fear is that hackers and bad stuff getting in via the IT network is generally the biggest threat to OT (apart from the privileged authorised user, as per the Maroochydore sewage treatment incident in the year 2000, for which I'll let your imagination consider the outcome of the intrusion - refer http://csrc.nist.gov/groups/SMA/fisma/ics/documents/Maroochy-Water-Services-Case-Study_report.pdf - and which, sadly, was not the last of such events around the world. When will we ALL learn from the past?).
Yes, I can see an argument that says all these remote field systems are an access point to the corporate IT network. In the extreme (which may or may not be realistic), consider hackers sitting comfortably in their lounge room for hours on end trying to hack into the in-home display unit which is connected to their household electricity "smart meter", which is linked to the utility metering system (supposedly to empower the consumer) .. or driving around in their electric vehicle connecting to all sorts of recharging stations and hooking into the billing system which is connected to the corporate LAN which is connected to .... Possible? Maybe! Can I categorically rule that threat out? No. Not without spending huge effort to make sure of things, and even then it is probably not absolute.
Unfortunately the "good old" air gap (arguably specifically to keep IT out of the OT area) doesn't exist any more - we now have Ethernet connectivity from the IT to the SCADA control room, to the remote site, to the SCADA interfaces, to the protection equipment, to condition monitoring devices, to control devices, to metering devices and even to the physical interfaces to the primary plant and equipment.
So it is good to engage IT in understanding the OT environment, and vice versa.
But, in contrast to the original premise that led to the question about IT checking out the threats from the OT sites, the IT people also need to understand that THEY may in fact be the threat vector to the organisation's OT, which looks after the real workings of the organisation's core business ... in this case, keeping the lights on!!
The second question is: what is the purpose of all this, and why are IT talking with OT and vice versa?
IT need to understand they are a service provider to the organisation just as much as OT is a service provider to the organisation's operations, i.e. both IT and OT must securely facilitate new technology being used to improve efficiency and reduce the cost of doing the core business.
- If they (either/both) prevent core business operations, efficiencies and cost reductions, they have failed to do their job.
- If their (either/both) systems fail and cause a block to the core business operations, they have failed to do their job.
The third thing is that each facility is a 24/7 system supporting 24/7 critical infrastructure, with rarely a human in sight (not just when people turn up at the office at 8:30 am).
So the secondary systems must be reliable i.e.:
- They MUST operate correctly (do the intended something) when it is required to
- They MUST NOT operate when it is not required to (do nothing)
- They MUST keep satisfying both of those first two requirements come humans or wind or fire or rain (even flood) or cars or trucks or planes ... and, as much as possible, come failure or explosion of anything!
If I took them to a 40+ year old substation (say 1974 - and there is the first epiphany: some very old installations are still in service as vital parts of the critical infrastructure), there would be:
- A bunch of CTs and VTs and primary plant like circuit breakers, transformers, reactors, cap banks ... all connected up to the secondary system using wires
- Thousands of individual isolating/test links to allow techs to totally disconnect any device from the rest of the system to allow safe testing without risk of impairing the in-service operation of the rest of the substation
- a bunch of electromechanical devices (possibly from a dozen or more different vendors),
- a bunch of panel ammeters, voltmeters, indicating lights, a bunch of hard wired buttons
- and the closest thing to "IT" or "OT" would be a telephone!
No security risk there
If I took them to a 30+ year old substation (say 1984), or older with some upgrades, there would be
- Possibly a mix of the above still in-service plus
- some electronic, even a few microprocessor relays (possibly from a dozen or more different vendors), but non-communicating devices
- an RTU of one form or another with physical I/O and a connection to the private comms network back to the master station/control center but still a proprietary system with an air gap to the corporate IT world
- ... and the telephone, and maybe a fax machine
If I took them to a 20+ year old substation (say 1994), or older with some upgrades, there would be
- Possibly a mix of all of the above still in-service plus
- some now electronic, even microprocessor relays and control devices (possibly from a dozen or more different vendors) with serial comms RS232/422/485 (most likely NOT Ethernet) using a plethora of proprietary (incompatible or non-interoperable) comms protocols, or proprietary (incompatible or non-interoperable) variants of standard comms protocols, with proprietary (incompatible or non-interoperable) configuration files/semantics/tools
- the RTU might now have some comms to the relays instead of physical I/O
- the master station computers may have some link into the corporate IT network
- ... and the telephone/fax machine ...
- plus now maybe a modem for remote dial in access to the IEDs at a whopping 27 kbps! (yes, things used to be able to be done at that bandwidth .. and reasonably quickly!)
(Beware the phone having been unplugged so the technicians could plug in their modem - in an emergency, the control center can't call the substation, or vice versa! Yes, I have seen just that!)
The risks have certainly started - just ask to see the Idaho National Laboratory demo - no, not the orchestrated "publicity stunt" type Aurora test of blowing up a generator, which is probably not possible in reality because of other physical mechanisms, or is equally possible in manually controlled systems.
I am referring to a test in 2005 of an "uninformed" hacker (as distinct from the informed user at Maroochydore) being asked to see what he could do. With no additional information he very quickly took control of the remote circuit breakers (open-close-open-close-open ...), totally unseen by the master station screens - all started by sending an email to a control center operator.
If I took them to a 10+ year old substation (say 2004), or a 20/30/40 year old one with upgrades, there would be:
- Possibly a mix of all of the above still in-service (yes including perhaps some 40+ year old relays) plus
- Perhaps they have started their Ethernet connectivity journey in some way so the telephone is now an IP telephone
- Perhaps they have gone one step further with some IEDs having wifi connectivity, possibly even direct to the internet!
But if I took them to a more recent substation, built since 2004, things have potentially changed dramatically! (And this is what the IT people are concerned about: how that impacts their IT systems.)
- But still potentially a mix of all the older things plus now
- Some substations will have been implemented based on IEC 61850 (Edition 1 released 2004), and hence have Ethernet connectivity to all manner of Intelligent Electronic Devices for protection, metering, control, SCADA, Condition Monitoring, Distribution Feeder Automation ... These will likely be a mix of old wire-based technologies plus replacement of lots of certain wires with LAN-based technology, in various degrees of so-called Station Bus and Process Bus using a mix of GOOSE and MMS. Some utilities are now moving to 100% LAN technology by adding Sampled Values into the mix on the Process Bus.
- Some serial servers for legacy devices
- A LAN with switches and routers and firewalls
- Gateways for external communication
Some sites are even now replacing the physical wiring with Ethernet to/from the CTs, VTs (IEC 61850-9-2 Sampled Value messages), circuit breakers, isolators, transformers, reactors, cap banks ... with a corresponding absence of all the isolating links.
They might then see a bunch of 3rd party contractors, even vendor technicians, turn up with their own laptops and test sets and connect to the substation LAN ... that is when you call for the defibrillator/ambulance to restart the IT professional's heart (despite popular belief, they do have one) when they realise there are all sorts of "uncontrolled" devices (potential threat vectors, potentially with no anti-virus, or at least not verified against the IT policies) being connected to the OT LAN!
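To give a feel for what that LAN-based measurement traffic looks like, here is a back-of-envelope sketch in Python. The 80 samples-per-cycle figure comes from the widely used 9-2LE profile; the frame size and stream count are purely illustrative assumptions, and real installations vary.

```python
# Back-of-envelope estimate of Sampled Value traffic on a Process Bus.
# Assumptions (illustrative only): the common IEC 61850-9-2LE "protection"
# profile of 80 samples per nominal power-system cycle, and a typical
# Ethernet frame of roughly 130 bytes for one merging unit stream.

NOMINAL_FREQUENCY_HZ = 50     # power system frequency
SAMPLES_PER_CYCLE = 80        # 9-2LE protection/metering profile
FRAME_SIZE_BYTES = 130        # assumed typical frame size; varies by vendor

frames_per_second = NOMINAL_FREQUENCY_HZ * SAMPLES_PER_CYCLE   # 4000 frames/s
bits_per_second = frames_per_second * FRAME_SIZE_BYTES * 8

print(f"One merging unit stream: {frames_per_second} frames/s, "
      f"about {bits_per_second / 1e6:.1f} Mbit/s")

# A substation with many bays multiplies this quickly, which is why the
# Process Bus is engineered (VLANs, priorities, dedicated switches)
# rather than just "plugged together".
streams = 20                  # hypothetical number of merging unit streams
print(f"{streams} streams: about {streams * bits_per_second / 1e6:.0f} Mbit/s sustained")
```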
But don't take my word for it!
Looking at those different substation scenarios, the first thing is to note that all scenarios are likely to co-exist in the one utility … and they remain critical to the core purpose of the utility!
IT people are usually working with equipment that is all reasonably of the same generation (say the last five years), on the same operating platform, and reasonably coherent in how it is managed.
Laptops more than six years old would be a rarity, and certainly DOS and Windows 3 won't exist anywhere in their IT systems - but they may be a necessity in OT.
The technology platform is not likely to be coherent across any one, let alone all sites.
And equally, what may have been no-risk old substations may have pockets, or whole areas, of new equipment that are communications-connected, even Ethernet-connected.
Moreover, replacement of these old mixed technologies for the sake of modern, coherent platforms is just not possible.
In all cases they are looking at an engineered system: specific equipment chosen and configured for specific functionality, with specific performance requirements and specific exchanges of signals between devices.
In a 24/7 real-time operating system, that means changing anything in-service is a real nightmare to be avoided at all costs.
Which then highlights duplication - we sometimes use the term “redundant systems” but we don’t mean it is not required. The idea is that there are two totally independent systems in what IT would call hot-hot operation – both are operating simultaneously and either is capable of doing what we need regardless of what is happening or not happening in the other system.
This “hot-hot” duplication is to ENSURE (a word often found in the “national electricity rules”, implying a guarantee under all circumstances) that a failure in either system doesn't affect the other system's ability to protect the power system.
The “national electricity rules” will generally allow perhaps eight hours to repair any failure before the power system needs to be reconfigured. These “NER”, or their corporate equivalent, usually only allow a limited time to run the power system “as is” without duplicated protection. This limited time is because of the increasing risk over time of having only one protection system in service: that one may fail as well, which leaves the system totally unprotected and at risk of grid-wide meltdown/explosion. Hence circuit breakers are switched to re-route power through systems which do have both duplicated protections in service.
Note that this duplication may also extend to policies requiring the devices in each system to be different - different vendors and/or different operating principles. This is to minimise the risk of one failure appearing in both systems simultaneously, or of one operating principle in both systems not being able to detect the conditions and act appropriately.
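To illustrate why that time limit exists, here is a minimal sketch in Python of the risk that the one remaining protection system also fails during the repair window, assuming a constant failure rate; the rate used is purely illustrative, not data from any utility or from the rules.

```python
import math

# Why the "limited time without duplicated protection" exists: a sketch of
# the probability that the one remaining protection system also fails while
# its duplicate is being repaired, assuming a constant (exponential) failure
# rate. The failure rate below is purely illustrative, not utility data.

failures_per_year = 0.02                    # assumed: one failure per 50 years
rate_per_hour = failures_per_year / 8760.0  # convert to failures per hour

def prob_unprotected(window_hours: float) -> float:
    """P(remaining system fails within the window) = 1 - exp(-lambda * t)."""
    return 1.0 - math.exp(-rate_per_hour * window_hours)

for window_hours in (8, 24, 168):           # 8 hours, 1 day, 1 week
    print(f"{window_hours:>4} h running unduplicated: "
          f"risk of total loss of protection ~ {prob_unprotected(window_hours):.1e}")

# The absolute numbers look small, but the risk grows with every hour, and the
# consequence of an uncleared fault is severe - hence the time limit and the
# switching to re-route power through fully duplicated protection.
```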
There are rare exceptions to that and only where those devices have proven to be immensely reliable over decades of service (which usually means they are electromechanical devices).
It is getting ever harder to segregate devices that use common software or hardware components - quite apart from the difficulty of obtaining such detailed and potentially confidential proprietary information - simply because there is a very limited number of different suppliers of those components or software code; the classic example is a comms port driver chip and the protocol stack.
This is the outworking of the reliability requirements mentioned above.
What it creates is complexity in engineering those devices into the system - different size boxes, terminations, front panel HMI menus, setting tools, nomenclature ...
And hence the move to adopt IEC 61850 which, as per its stated purpose and if applied accordingly, is a common vendor-agnostic engineering process to configure the devices to communicate with each other.
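As a toy illustration of that vendor-agnostic engineering process, the sketch below parses a deliberately cut-down, hypothetical fragment of an SCL file (the XML Substation Configuration Language defined in IEC 61850-6) and lists the declared IEDs and their network addresses. The element names follow the standard schema, but the IED names, vendors and addresses are invented, and a real SCD file is vastly richer than this.

```python
import xml.etree.ElementTree as ET

# A deliberately tiny, hypothetical fragment of an SCL (IEC 61850-6) file.
# Real SCD files describe the whole substation: IEDs, logical nodes,
# datasets, GOOSE/SV control blocks and the communication sections.
SCL_FRAGMENT = """<SCL xmlns="http://www.iec.ch/61850/2003/SCL">
  <Communication>
    <SubNetwork name="StationBus">
      <ConnectedAP iedName="ProtRelay_F1" apName="AP1">
        <Address><P type="IP">10.10.1.11</P></Address>
      </ConnectedAP>
      <ConnectedAP iedName="BayController_F1" apName="AP1">
        <Address><P type="IP">10.10.1.12</P></Address>
      </ConnectedAP>
    </SubNetwork>
  </Communication>
  <IED name="ProtRelay_F1" manufacturer="VendorA"/>
  <IED name="BayController_F1" manufacturer="VendorB"/>
</SCL>"""

NS = {"scl": "http://www.iec.ch/61850/2003/SCL"}
root = ET.fromstring(SCL_FRAGMENT)

# List every IED declared in the file, then find its IP address in the
# Communication section - the same file format regardless of vendor.
for ied in root.findall("scl:IED", NS):
    name = ied.get("name")
    ip = "n/a"
    for ap in root.findall("scl:Communication/scl:SubNetwork/scl:ConnectedAP", NS):
        if ap.get("iedName") == name:
            p = ap.find("scl:Address/scl:P[@type='IP']", NS)
            if p is not None:
                ip = p.text
    print(f"{name:20s} {ied.get('manufacturer'):10s} {ip}")
```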
So let's now look at what this LAN-based technology does for maintenance and operation.
Technicians used to be able to open/close isolating links to enable testing of any particular piece of equipment independent of the rest of the system, which has been made safe by the appropriate link positions. Things are physically disconnected.
Because of the interactions with the rest of the system, in LAN technology we cannot do that, as the devices may now hold/integrate dozens of individual functions within one IED, of which only one function needs to/can be "isolated" for the purposes of the test. The technician cannot just unplug the fibre to create a physical isolation as they used to do, as it risks the whole system collapsing in any manner of failure modes!
To get the system ready for testing of the target Device(s)/Function(s) Under Test, the technician needs to initiate a process to set all the OTHER devices in a mode that understands which messages are to be responded to as normal versus which messages are to be responded to as a result of testing of the source Device-Under-Test.
Equally the DUT may need to be reconfigured to “listen” (subscribe) to a selection of different messages it receives and responds to from the LAN, i.e. messages from the test equipment.
In all of that, the test mechanisms, just as with wire-based links, need to be engineered in at the start of the design process.
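A very simplified sketch of that principle in Python: whether a receiving function acts on a message depends on both its own behaviour mode and whether the message is flagged as a test message. This mirrors the spirit of the IEC 61850 behaviour/test-flag concepts, but it is an illustration only, not a faithful implementation of the standard.

```python
from dataclasses import dataclass
from enum import Enum

# A deliberately simplified illustration of the "test mode" principle
# described above - NOT a faithful implementation of IEC 61850.

class Behaviour(Enum):
    ON = "on"        # normal in-service operation
    TEST = "test"    # function is part of the test
    OFF = "off"      # function out of service

@dataclass
class GooseMessage:
    source: str
    signal: str      # e.g. "trip"
    value: bool
    test: bool       # message flagged as originating from a test

def should_act(receiver: Behaviour, msg: GooseMessage) -> bool:
    """Should the receiving function act operationally on this message?"""
    if receiver == Behaviour.OFF:
        return False
    if receiver == Behaviour.ON:
        # In-service function: must NOT act operationally on test-flagged data.
        return not msg.test
    # Function under test: processes test-flagged messages so the Device Under
    # Test can be exercised; in a real device its own outputs would in turn
    # be flagged or blocked as appropriate.
    return True

# Example: a test set injects a trip; the in-service neighbour ignores it,
# while the device that has been placed in test mode responds.
injected = GooseMessage(source="TestSet", signal="trip", value=True, test=True)
print(should_act(Behaviour.ON, injected))    # False
print(should_act(Behaviour.TEST, injected))  # True
```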
Which then brings into question Remote Engineering Access.
The first thing is to define what is "remote"?
IEDs often have integrated HMI front panels with a screen, indicators and a keyboard/navigation buttons. These would be termed "local", as the operator must be physically standing at the device.
In a cyber sense, "remote" means any human interaction with the device via the comms system (accessing the IEDs via RS232/422/485 or RJ45 or fibre or wifi/bluetooth or an optical probe link ...) - the person may be a few metres away or hundreds of kilometres away, they are both "remote" in terms of a comms system connection which has no indication of how far away they are, and hence they all represent a cyber risk to the IED and consequently to the real-time in-service operation of the system.
The second aspect of security for an IED is to recognise that, whether access is local or remote, some sort of password mechanism should be available in the IED.
IEEE 1686-2013 (which replaces the previous 2007 edition) is titled "IEEE Standard for Intelligent Electronic Devices Cyber Security Capabilities".
This defines the password regimes that could/should be available in the IED .. of course that means it should become a mandatory requirement in the IED procurement specifications, i.e. not provided = automatically, and regardless of any other consideration/benefit of using the IED, it will be excluded from potential consideration! Are your purchasing and engineering specifications and decisions centred around ensuring cyber security?
However whilst IEEE 1686 is a great start, it does imply a very significant requirement on the organisation to manage the passwords!
Managing IED passwords and how a person is authorised to access a particular device is VITAL to cyber security.
Leaving IEDs with factory-default passwords is not cyber security!
On the other hand, technicians may argue that "at 3am in the morning" when they are trying to get the lights back on, they don't have time for trying to remember/find an IED password!
So how big is this password management problem?
I know of one utility who forecast their total number of IEDs on their system by the year 2025.
Go on, have a guess before you see the answer : 50, 500, 5000, more?? ....
Start getting your head around managing multi-level passwords independently on each and every IED (including when IEDs are replaced)! ..... and managing who knows what those passwords are at 3 am (including managing what happens when someone leaves the organisation or ceases to have responsibility for that asset)!
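Whatever your guess, a back-of-envelope calculation shows how quickly this stops being a job for a spreadsheet and a notebook. The fleet size, access levels and rotation policy in the sketch below are purely hypothetical.

```python
# Purely hypothetical figures to illustrate the scale of IED password
# management - substitute your own fleet size and policy.
ied_count = 10_000        # assumed number of IEDs across the system
levels_per_ied = 3        # e.g. view / operate / engineer access levels
rotation_days = 90        # assumed password-change policy

total_passwords = ied_count * levels_per_ied
changes_per_day = total_passwords / rotation_days

print(f"Passwords to manage: {total_passwords}")
print(f"Password changes per day to keep a {rotation_days}-day rotation: "
      f"{changes_per_day:.0f}")
# Roughly 330 changes every single day of the year - plus recording who is
# allowed to know each one - clearly not something to do by hand, one IED
# front panel at a time.
```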
I know of one utility asset owner who (as recently as 2014) got rather cranky with a circuit breaker vendor who was touting the latest wifi access from a smart phone to the circuit breaker's integrated condition monitoring device.
Why did he get upset?
Because the vendor was so proud that only their "app" on the smart phone knew how to communicate with the circuit breaker, so it was inherently secure and didn't need any password!!!
The vendor salesman was totally oblivious to the risk of a hacker's van with a wifi scanner parked 50 metres away outside the substation fence! I am not sure if the vendor's "app" was free-ware available for download off their web site, but you have to wonder how it gets onto a smart phone in the first place, and what security is around the phone being lost/stolen!
Needless to say, the vendor left without even the prospect of an order, and with a few things to talk to his technical department about ...
(refer CIGRE Technical Brochure 318 (2007) Wi-Fi Protected Access for protection and automation)
So now let's consider what the U.S. NERC CIP rules require for the management of all these passwords (if not applicable in your country, it is at least a court-room reference for industry-wide "good practice"):
- You must maintain a list of nominated personnel who are authorised to grant permissions / provide passwords to other personnel to access the devices
- You must have a documented process, and it must be followed, for change control and configuration management of the IEDs. This applies in the engineering phase(s) over the lifetime of the system, for both the asset owner and the various contractors.
- You must identify, control and document all changes to hardware and software components. That means software, firmware and configuration/parameterisation files.
- You must maintain a list of personnel who have access, and what level of physical and/or electronic access they have, to the IEDs
- You must effectively revoke a person's access within 7 days if they no longer have need for operational access to the IEDs
OK, you have some time for some paperwork to float through the process ...
but here is the one with serious IT/OT system requirement implications ...
- You must effectively revoke a person's access within 24 hours if they have been terminated for cause
i.e. if they have been sacked and therefore may hold a grudge - and I would extend that to any retrenchment etc. that may yield similar emotions.
This implies some serious coordination and automation between HR and IT and OT!
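As a sketch of the kind of record-keeping and deadline tracking those obligations imply, here is a hypothetical (and heavily simplified) access register in Python. It is not NERC's nor any utility's actual tooling, just an illustration of the data that has to be kept and acted on.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List

# Hypothetical access register illustrating the NERC CIP style obligations
# described above: who can access which IEDs, and how quickly access must
# be revoked in each circumstance.

REVOCATION_DEADLINES = {
    "no_longer_required": timedelta(days=7),
    "terminated_for_cause": timedelta(hours=24),
}

@dataclass
class AccessGrant:
    person: str
    ied: str
    level: str                  # e.g. "view", "operate", "engineer"
    granted_by: str             # must be on the authorised-granter list
    granted_at: datetime

@dataclass
class AccessRegister:
    grants: List[AccessGrant] = field(default_factory=list)
    revocation_log: List[str] = field(default_factory=list)

    def revoke(self, person: str, reason: str, event_time: datetime) -> datetime:
        """Remove all of a person's grants and return the compliance deadline."""
        deadline = event_time + REVOCATION_DEADLINES[reason]
        self.grants = [g for g in self.grants if g.person != person]
        self.revocation_log.append(
            f"{person}: revoked ({reason}), deadline {deadline:%Y-%m-%d %H:%M}")
        return deadline

# Usage: HR notifies OT/IT that someone has been terminated for cause.
register = AccessRegister(grants=[
    AccessGrant("j.smith", "ProtRelay_F1", "engineer", "ops.manager",
                datetime(2015, 3, 1)),
])
deadline = register.revoke("j.smith", "terminated_for_cause", datetime.now())
print("All access must be gone by:", deadline)
```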
One of the implications of all of that is most likely some form of password obfuscation. This is the process where:
- a User is identified to the system by means of their personnel credentials in some form. We could wonder if swipe cards/proximity fobs are better or worse in that regard than typing in your personal Username and Password, or whether you need both the swipe card/fob and a password!
- The User never needs to know the IED password, as the system manages the login to the selected and permitted device, and then the system opens a secure "channel" from the User's PC/phone to the IED.
Such obfuscation segregates the remote device password management (which rarely needs to change if it is truly secure) from the worries and difficulties of managing personnel login authority on a rapid and daily basis.
Clearly this requires close liaison, and some powerful system tools, between OT (for the IEDs) and IT (for the corporate personnel management)!
The systems do exist in various forms and capabilities - without necessarily promoting any particular one (there are "horses for courses"), there is a range of commercial solutions, many of which I would be happy with in their right environment. It needs thinking about the risks: what you want to achieve, what you need to achieve, and what you can practically and financially achieve - noting that if you get it wrong, there could be some serious operational consequences.
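A minimal sketch of that obfuscation flow in Python follows. The names, data structures and in-memory "vault" are all hypothetical simplifications; real products do this with directory services, hardened credential vaults, protocol gateways and full audit logging.

```python
from typing import Tuple

# Hypothetical illustration of password obfuscation: the user authenticates
# with their own corporate credentials; the access system checks their
# authorisation for the requested IED, fetches the IED password from a
# vault the user never sees, and opens the session on their behalf.

USER_DIRECTORY = {"j.smith": "corporate-credential"}        # IT-managed identities
AUTHORISATIONS = {("j.smith", "ProtRelay_F1"): "engineer"}  # OT-managed permissions
IED_VAULT = {"ProtRelay_F1": "long-random-device-password"} # never shown to users

def open_engineering_session(user: str, credential: str, ied: str) -> Tuple[bool, str]:
    # 1. Authenticate the person with their own credentials (not the IED's).
    if USER_DIRECTORY.get(user) != credential:
        return False, "authentication failed"
    # 2. Check they are authorised for this specific IED, and at what level.
    level = AUTHORISATIONS.get((user, ied))
    if level is None:
        return False, f"{user} is not authorised for {ied}"
    # 3. The access system - not the person - logs in to the IED with the
    #    vaulted device password and proxies a secure session back to the user.
    device_password = IED_VAULT[ied]   # used to reach the IED; never returned
    return True, f"session opened: {user} -> {ied} ({level} level)"

print(open_engineering_session("j.smith", "corporate-credential", "ProtRelay_F1"))
```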
Now that may all sound like the OT have thought through all this "cyber security hoo-ha"! (.... so IT should stay well away)
Or it may sound like cyber security is not really an OT issue at all (.... so IT should stay well away).
In both cases, nothing could be further from the truth.
We must also be careful of unusable procedures. Companies LOVE having a policy and a procedure they can point to.
I know of one utility that was (presumably) very proud of its diagram showing the 32 individual procedures for OT cyber security ... of course all so very easy (???) for each person to follow day in, day out.
Overkill and blockages to staff doing their job will just engender creative workarounds.
CIGRE has put some good effort into looking into cyber security in OT environments. None of it is "THE" magic pill. However, it can only help to do some further reading of what the OT people in the worldwide CIGRE community have come up with on cyber security as best practice at the time - even if some of it is dated, it gives you a huge jump through the thinking process and covers off many aspects your OT and IT may not identify in their meeting room ... and maybe some people to talk to.
These Technical Brochures are available through the CIGRE on-line library/shop: http://www.e-cigre.org/
- Technical Brochure 419 (2010): Treatment of Information Security
- Technical Brochure 427 (2010): The Impact of Implementing Cyber Security Requirements Using IEC 61850
- Technical Brochure 507 (2012): Communication Architecture for IP-based Substation Applications
- Technical Brochure 603 (2014): Application and Management of Cybersecurity Measures for Protection and Control
- Technical Brochure 615 (2015): Security architecture principles for digital systems in Electric Power Utilities
- Technical Brochure 318 (2007): Wi-Fi Protected Access for protection and automation
Free Downloads
Note: Non-CIGRE members may now access and download, free of charge, all CIGRE publications (ELECTRA and Technical Brochures) that were published more than three years ago.
So there are a few eye-poppers in all of that - some you may or may not have thought of sitting in a meeting room at head office .. well you definitely can now
... but no doubt there are other subtle cyber security risks and implications that are unique to the utility.
Hopefully this treatise has given a broader perspective of the benefits in "going to have a look" and is a pointer that both IT and OT will have to keep their eyes open, and think carefully - things might surprise them ... and me!!