
Imagine if someone asked you to drink a glass of liquid without telling you what was inside or what the ingredients might do. Would you drink it? Maybe, if it was given to you by someone you trust. But what if that person said they couldn’t be sure what was inside? You probably wouldn’t drink it.
Consuming the unknown is exactly what IT departments do every day. They install software and updates on critical systems without knowing what’s inside or what it does. They trust their vendors, but what software vendors don’t tell IT is that they can’t vouch for all of their own upstream suppliers. Protecting every link in a software supply chain, including those beyond the control of IT, is nearly impossible. Unfortunately, bad actors take full advantage of this large “attack surface” and are scoring big wins in cyber breaches.
A big, growing problem
The most famous example is the hack of Austin, Texas-based enterprise software developer SolarWinds in 2020. Attackers inserted malicious code into software widely used by industry and the federal government. IT departments installed an update containing the malware and large volumes of sensitive and classified data were stolen.
Other software supply chain attacks have occurred at companies like Kaseya, an IT management software company where hackers added code to install ransomware, and Codecov, a tool vendor whose software was used to steal data. And compromised versions of the “coa” and “rc” open source packages have been used to steal passwords. These names may not be familiar outside of IT, but they have large user bases to tap into. Coa and rc have tens of millions of downloads.
Obviously, attackers have realized that it is much easier to hack software that people purposely install on thousands of systems than to hack each system individually. Software supply chain attacks have increased by 300% between 2020 and 2021, according to a report from Argon Security. This problem is not going away.
How could this happen?
There are two ways for hackers to attack software supply chains: they compromise software creation tools or they compromise third-party components.
Much attention has been paid to securing build tools and source code repositories. The SLSA (Supply Chain Levels for Software Artifacts) framework offered by Google allows organizations to assess the extent to which they have “locked down” these systems. This is important because there are now hundreds of commonly used build tools, many of which are easily accessible in the cloud. This month, the open source Argo CD tool was found to have a significant vulnerability, allowing access to the secrets that unlock build and release systems. Argo CD is used by thousands of organizations and has been downloaded over half a million times.
At SolarWinds, the attackers gained access to where the source code was stored and added code of their own, which was ultimately used to steal data from SolarWinds customers. SolarWinds built and shipped its software without realizing that malware was included. It was like giving an untrusted person access to the ingredients in that glass of liquid.
Even though companies control their own build environments, the use of third-party components creates huge blind spots in software. Gone are the days of companies writing a complete software package from scratch. Modern software is assembled from components built by others. Some of these third parties use components of fourth and fifth parties. All it takes is for a sub-sub-subcomponent to include malware, and the final package now includes that malware.
Examples of compromised components are incredibly common, especially in the open source world. “Namespace confusion attacks” are cases where someone publishes a malicious package and passes it off as a newer version of a legitimate one. Alternatively, hackers submit malicious code to add to legitimate packages, since open source allows anyone to contribute updates. When a developer adds a compromised component to their code, they inherit all of its current and future vulnerabilities.
The solution: a permissions framework
Industry groups and government agencies such as the Department of Commerce’s National Telecommunications and Information Administration (NTIA) are working to develop a standard and plan to use an executive order to mandate the use of a software bill of materials (SBoM) for software purchased by the government. An SBoM is a list of software ingredients that helps identify all components, but unfortunately does not indicate if they have been hacked and will misbehave. Hackers won’t list their code in the ingredients.
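To make the ingredient-list idea concrete, here is a minimal sketch of what reading an SBoM can look like. The fragment loosely follows the shape of the CycloneDX JSON format, and the components listed (the “coa” and “rc” packages mentioned above, with illustrative version numbers) are assumptions for the example:

```python
import json

# A minimal, CycloneDX-style SBoM fragment (component versions are illustrative).
sbom_json = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "components": [
    {"type": "library", "name": "coa", "version": "2.0.2"},
    {"type": "library", "name": "rc", "version": "1.2.8"}
  ]
}
"""

def list_components(sbom: dict) -> list[str]:
    """Return a 'name@version' string for every component in the SBoM."""
    return [f"{c['name']}@{c['version']}" for c in sbom.get("components", [])]

sbom = json.loads(sbom_json)
print(list_components(sbom))  # ['coa@2.0.2', 'rc@1.2.8']
```

Note what the list does and does not tell you: IT can now see which components are present, but nothing here says whether any of them behaves maliciously.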
Developers can improve the security of the build tools they control and list third-party ingredients from their vendors, but that won’t be enough for them or their users to be sure that none of the ingredients have been compromised. IT needs more than a list of ingredients. Software developers need to describe the expected behavior of code and components. IT teams can verify these statements and ensure that they are consistent with the purpose of the software. If a program is supposed to be a calculator, for example, it should not include behavior indicating that it will send data to China. Calculators don’t need to do that.
Of course, the compromised calculator might not say it intends to send data overseas, because the hackers won’t disclose that the software has been compromised. A second step is necessary. When software is running, it should be prevented from doing things it hasn’t declared. If the software did not indicate that it intended to send data to a foreign country, it would not be allowed to do so.
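The declare-then-enforce idea can be sketched in a few lines. This is a hypothetical illustration, not any real enforcement product: the software ships a list of declared behaviors, and the runtime blocks anything not on the list (deny by default). The permission names are made up for the example:

```python
# Behaviors this (hypothetical) calculator app declared up front.
DECLARED_PERMISSIONS = {
    "read_local_files",
    "display_ui",
}

class PermissionDenied(Exception):
    """Raised when software attempts a behavior it never declared."""

def check_permission(action: str) -> None:
    """Allow an action only if it was declared; deny everything else by default."""
    if action not in DECLARED_PERMISSIONS:
        raise PermissionDenied(f"undeclared behavior blocked: {action}")

check_permission("read_local_files")        # declared, so it is allowed
try:
    check_permission("send_data_overseas")  # never declared, so it is blocked
except PermissionDenied as e:
    print(e)  # undeclared behavior blocked: send_data_overseas
```

The key design choice is the default: anything the developer did not declare is refused, so compromised code cannot quietly do something extra.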
It sounds complicated, but examples already exist with mobile phone applications. Once installed, apps ask for permission to access your camera, contacts, or microphone. Any unrequested access is blocked. We need a framework to apply the concept of mobile app-like permissions to data center software. And that’s what companies like mine and many others in our industry are working on. Here are two of the challenges.
First, if a human approves “sending data outside of my company”, does that mean all data? To anywhere? Listing all data types and destinations is too detailed to review, so it becomes a linguistic and taxonomic challenge as much as a technical one. How do you describe risky behaviors in a comprehensive way that makes sense to a human without losing important distinctions or specific details that a computer needs?
Second, developers won’t use tools that slow them down. It’s a fact. As a result, much of the work of declaring the expected behavior of software can – and should – be automated. This means analyzing the code to discover the behaviors it contains in order to present the results to developers for review. Then, of course, the next challenge for everyone involved is determining the accuracy of this analysis and evaluation.
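One way such automation could work, sketched here as an assumption rather than a description of any shipping tool, is static analysis of the code itself. The toy analyzer below walks a Python module’s syntax tree and flags imports that imply network behavior, producing a coarse behavior list a developer could review:

```python
import ast

# Modules whose import suggests network behavior (an illustrative, partial list).
NETWORK_MODULES = {"socket", "urllib", "http", "requests"}

def declared_behaviors(source: str) -> set[str]:
    """Infer coarse behaviors from a module's imports for developer review."""
    behaviors = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            continue
        if any(name in NETWORK_MODULES for name in names):
            behaviors.add("network_access")
    return behaviors

calculator_src = "import math\nprint(math.sqrt(2))\n"
suspicious_src = "import socket\n"
print(declared_behaviors(calculator_src))  # set()
print(declared_behaviors(suspicious_src))  # {'network_access'}
```

A real analyzer would need to handle dynamic imports, compiled extensions, and deliberate obfuscation, which is exactly why the accuracy of the analysis is the open challenge the paragraph above describes.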
These challenges are not insurmountable. It is in everyone’s interest to develop a permissions framework for data center software. Only then will we know that it is safe to take this drink.
Lou Steinberg is founder and managing partner of CTM Insights, a research laboratory and cybersecurity incubator. He has been at the forefront of network security and technological innovation throughout his career. Prior to CTM, he was CTO of TD Ameritrade, where he was responsible for technology innovation, platform architecture, engineering, operations, risk management and cybersecurity.