The European Telecommunications Standards Institute (ETSI) held its annual Security Week event and, along with a representative from the UK National Cyber Security Centre, DataArt held a one-day workshop on Network Function Virtualization (NFV) security. What is NFV? NFV has been defined as “the effort to virtualize entire classes of network functions into building blocks that may connect, or chain together, to create communication services.” While related to virtualization, it is not the same thing, and I recommend reading about NFV on the ETSI website.
“Security” frequently evokes images of firewalls, authentication tokens, encryption, intrusion prevention systems (IPS), anti-virus, malware scanners and, more recently, Artificial Intelligence / Machine Learning. For the workshop, we took a different approach: we started with the ‘lower layers’ or, as I prefer to call it, foundational security.
No computer system is 100 percent secure. Let me repeat that: no computer system is 100 percent secure. There used to be a joke about having a computer inside a Faraday cage, which itself was inside a secure vault, with the network cord cut. Guess what: it is not as secure as you would think (attacks using power analysis have been shown to be effective). My statement is not intended to be sensational or alarmist; it is just a fact. Security is, and always will be, a ‘whack-a-mole’ endeavor. It is important to have an ongoing, open and honest dialog about threats and responses.
In determining what security is appropriate, the use case matters. For the ETSI workshop, the use case was telecoms and critical/sensitive infrastructure (e.g. water treatment plants, electrical grids, nuclear power plants and air-traffic control). Implementing security is about risk mitigation; technology alone will not solve the problem. It needs to be an evolving blend of technology and process, and to be effective, security needs to be implemented in layers. Too often, security is applied after the solution has been designed or, worse, deployed. The results are predictable: the systems are insecure and, at best, give a false sense that systems and data are protected. When we talked about sensitive systems, we emphasized that “security” has a direct correlation to “safety”; just think about the threat of self-driving cars being ‘hacked’ in the future.
In the physical world, the analogy goes like this: if I make myself a less tempting target by installing security doors, a big fence and guard dogs, while my neighbor leaves his cash in plain sight (maybe posting a few pictures of it on social media for good measure) and does not even have doors on his house, a ‘bad actor’ will go for the easy target. Computer systems are different, because the incremental cost of a cyber-attack against “everything” you can target is basically zero. It does not matter that I have built a wall and have guard dogs, because it ‘costs’ the bad actor nothing extra to try me as well. And when the targets are sensitive systems and infrastructure, the attackers we need to be concerned about are not ‘script kiddies’; they are organized, well-funded and patient.
Security for sensitive NFV components needs to start with attesting the underlying hardware, operating system and virtualization layer. Back in the 1990s, the Trusted Computing Group (TCG) defined the means by which a “booting” computer system could, with the assistance of specialized tamper-resistant hardware (referred to as a Trusted Platform Module, or TPM), verify that the hardware, firmware and operating system have not been tampered with, and attest to its own integrity. The specifications for the TPM have evolved over time, and you may not even realize that most, if not all, modern Windows laptops include a TPM and go through an attestation process known as Secure Boot to make sure that the hardware and operating system have not been tampered with. On a laptop-by-laptop basis, this has been helpful in stopping entire classes of attacks. Readers with a security background will note that I have avoided the “root of trust” topic and am simplifying the issue.
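The TPM's core measurement primitive is simple to sketch: each boot stage is hashed before it runs, and the result is “extended” into a Platform Configuration Register (PCR), so the final PCR value is a tamper-evident digest of the entire boot chain. The Python below is a minimal illustration of that hash-chaining idea only; the stage names are invented, and a real TPM performs the extend operation inside the tamper-resistant hardware itself:

```python
import hashlib

def pcr_extend(pcr: bytes, measurement: bytes) -> bytes:
    """TPM-style extend: new PCR = SHA-256(old PCR || measurement)."""
    return hashlib.sha256(pcr + measurement).digest()

# PCRs start at all zeros on platform reset.
pcr = bytes(32)

# Hypothetical boot chain: each stage is measured before it executes.
for stage in [b"firmware image", b"bootloader", b"os kernel"]:
    measurement = hashlib.sha256(stage).digest()
    pcr = pcr_extend(pcr, measurement)

# Attestation compares the final PCR against a known-good value;
# changing any stage (or the order of stages) yields a different digest.
print(pcr.hex())
```

Because each extend folds the previous value into the next hash, an attacker cannot swap in a modified bootloader and still arrive at the expected final digest.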
Most modern servers have built-in mechanisms for remote management (frequently called lights-out management). If you are a systems administrator for a large number of compute resources (cloud, datacenter, telecom or enterprise), these tools are very helpful for managing your systems, but they also present a security concern. To provide lights-out management, these tools are implemented ‘beneath’ the computer: a specialized management system below what we commonly think of as the computer, sometimes referred to as rings -1, -2 and -3. By design, the computer ‘sitting’ on top of the management system generally does not have access to, or knowledge of, what the management system is doing.
If you stop to think about this, you may see a clear problem: what good is adding security on top of the operating system if the operating system or hardware has been compromised? As an example, what if the BIOS or UEFI has been changed to always make a copy of your encryption keys when they are accessed and then secretly transmit them out of the management interface? Or worse, what if it weakens the mathematics behind encryption by tampering with the random number generator? Everything would appear to be working normally, but the ‘bad actor’ would be able to decrypt anything transmitted or stored on disk at their convenience. Similarly, what if a secured virtual workload is running on a compromised hypervisor? It does not matter how much you protect or encrypt your workload; if the hypervisor has been compromised, in the end, the hypervisor sees everything. During one of the hands-on sessions, my colleague demonstrated modifying both the data (a critical database field) and the operational behavior (changing addition into subtraction) of a virtualized workload, showing what a compromised hypervisor could do.
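The RNG-tampering scenario is easy to illustrate. In the hypothetical sketch below, a “hardware” random number generator has been backdoored to use a seed the attacker already knows, so every key the victim generates can be regenerated by the attacker at will. Python's `random` module stands in for the tampered RNG here; it is never suitable for real key generation in any case:

```python
import random

def weak_keygen(bits: int = 128) -> int:
    """Generate a key from a 'tampered' RNG.

    A backdoored generator seeded with a value the attacker knows
    (here, a constant) produces keys the attacker can reproduce.
    """
    rng = random.Random(1234)  # the planted, attacker-known seed
    return rng.getrandbits(bits)

# Victim and attacker independently derive the same "random" key,
# so all traffic encrypted under it is readable at the attacker's leisure.
victim_key = weak_keygen()
attacker_key = weak_keygen()
assert victim_key == attacker_key
```

The insidious part is that the keys still *look* random to the victim; only someone who knows the planted seed can tell the difference.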
A few months back, Intel published a security bulletin disclosing that its Management Engine (ME) had a vulnerability that impacted millions of systems shipped over several years. Getting back to the BIOS example above, imagine a bad actor replacing the BIOS on your servers with a malicious one without anything or anyone knowing it (remember, this can be done with the computer “off,” so anti-virus / anti-malware is not even running yet). This is where attestation can help: if the BIOS is modified, the attestation process can detect the unauthorized change and halt the boot process. Unfortunately, attestation (or Secure Boot) works well on a one-off basis, but what do you do when you have hundreds or thousands of servers that need to be kept up to date and managed with low-level attestation information? Some vendors have released solutions, and Intel, to its credit, has released open-source tools to help with this, but sadly, they are not widely implemented. The ETSI NFV-SEC group has published a document on physical security for sensitive critical components.
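At fleet scale, the management problem boils down to comparing each server's measured firmware digest against an allowlist of known-good values and flagging the outliers. The sketch below illustrates that idea only; the server names, firmware strings and allowlist are all made up, and a real deployment would collect the digests via the attestation tooling mentioned above:

```python
import hashlib

# Hypothetical allowlist of approved BIOS image digests (SHA-256, hex).
APPROVED_BIOS = {hashlib.sha256(b"vendor BIOS v2.2").hexdigest()}

# Hypothetical fleet inventory: server name -> measured BIOS image bytes.
fleet = {
    "server-01": b"vendor BIOS v2.2",
    "server-02": b"vendor BIOS v2.2",
    "server-03": b"vendor BIOS v2.2 + implant",  # tampered firmware
}

def non_compliant(fleet: dict, approved: set) -> list:
    """Return the servers whose measured BIOS digest is not approved."""
    return sorted(name for name, image in fleet.items()
                  if hashlib.sha256(image).hexdigest() not in approved)

print(non_compliant(fleet, APPROVED_BIOS))  # flags the tampered server
```

The hard part in practice is not the comparison but keeping the allowlist current across hundreds of firmware versions and vendors, which is exactly why the tooling gap matters.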
Moving up to the next ‘layer,’ we discussed the need to verify that NFV workloads (think virtual machines) have not been tampered with. This is another layer that needs to be managed and, unfortunately, it is implemented even less frequently than attestation. While most hypervisors have mechanisms to check the integrity of a workload using a digital signature, if the signatures themselves (or the signing server) have been compromised, they offer little real-world protection. Vendors have introduced technologies to help protect data in virtualized systems. Either built into the CPU or added as a new component known as a Hardware-Mediated Encrypted Enclave (HMEE), the idea is that the hypervisor never sees critical data “in the clear,” so a compromised hypervisor is no longer the point of compromise. Unfortunately, researchers have already demonstrated retrieving the encryption keys from certain implementations.
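The workload integrity check amounts to verifying a signature over the image before launch. Real hypervisors use asymmetric signatures (e.g. RSA or ECDSA over the image digest); the sketch below substitutes an HMAC as a stand-in so it stays dependency-free, and the single shared key makes the article's caveat concrete: whoever holds the signing key, legitimately or not, can mint “valid” images.

```python
import hmac
import hashlib

# Stand-in for the signing key. Real systems use an asymmetric key pair
# so that hosts hold only the public (verification) half.
SIGNING_KEY = b"hypothetical-signing-key"

def sign_image(image: bytes) -> bytes:
    """'Sign' a VM image (HMAC stand-in for an RSA/ECDSA signature)."""
    return hmac.new(SIGNING_KEY, image, hashlib.sha256).digest()

def verify_image(image: bytes, signature: bytes) -> bool:
    """Constant-time check before the hypervisor launches the workload."""
    return hmac.compare_digest(sign_image(image), signature)

image = b"virtual machine disk image"
sig = sign_image(image)

assert verify_image(image, sig)             # untampered image: launch
assert not verify_image(image + b"!", sig)  # tampered image: refuse
# The caveat from the text: if SIGNING_KEY (or the signing server) is
# compromised, an attacker can produce valid signatures for malicious
# images and this check proves nothing.
```

`hmac.compare_digest` is used rather than `==` so the comparison does not leak timing information about how many bytes matched.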
Far from being the end of the story for NFV security, this is just the beginning. Once systems are up, they need to work together through a variety of mechanisms, communicating over and through an array of secondary and external systems and networks; in many cases, those systems are not under ‘your’ control or administration. In some use cases, an NFV infrastructure may host multiple tenants and have multiple sets of administrators. Keeping bad actors and accidental configuration mistakes at bay to ensure the integrity of systems is going to be an ongoing effort.
Computer security is frequently presented at the executive level the same way insurance policies are: as a necessary investment to mitigate an event. Given the relationship that sensitive and critical infrastructure has to safety, it is time we all stopped thinking about security as an afterthought and started treating it as mission-critical, working with our IT departments and vendors to stay as far ahead of the curve as possible.
About the Author: Michael Lazar is a veteran of the telecom industry, and has held C-level positions in system design, custom engineering and software development for the last two decades. He joined DataArt in 2016 to lead the company's telecom practice, focusing on the most demanding areas of the marketplace - systems performance, NFV, SDN & telecom security. Prior to joining DataArt, Mr. Lazar was Chief Technology Officer of Veloxum/Ambicom Holdings where he was responsible for developing system optimization software, and before that CTO of Network Physics, where he led the design and development of Voice over IP (VoIP) and Financial Information exchange (FIX) monitoring software. Prior to the CTO role, Michael was VP of Customer Advocacy at Network Physics, in charge of worldwide pre-sales engineering, post sales support, and custom engineering. Prior to Network Physics, Michael held senior technical roles at Datatec Systems and Spirian Technologies, Inc. Mr. Lazar holds a Bachelor of Science Degree in Physics from New York’s Queens College and holds a patent for Systems and Methods of Tuning an Operating System, Application or Network Component.
Edited by Mandi Nowitz