How Software Supply Chain Security Is Threatened by Hackers

Introduction

In many ways, the software supply chain is similar to that of
manufactured goods, which we all know has been largely impacted by
a global pandemic and shortages of raw materials.

However, in the IT world, it is not shortages or pandemics that
have been the main obstacles to overcome in recent years, but
rather attacks that exploit the supply chain to harm hundreds or even
thousands of victims simultaneously. If you’ve heard of a cyber
attack between 2020 and today, it’s likely that the software supply
chain played a role.

When we talk about an attack on the software supply chain, we
are actually referring to two successive attacks: one that targets
a supplier, and one that targets one or more downstream users in
the chain, using the first as a vehicle.

In this article, we will dive into the mechanisms and risks of
the software supply chain by looking at a typical vulnerability of
the modern development cycle: the presence of hard-coded credentials,
or “secrets”, in the digital assets of companies. We
will also see how companies are adapting to this new situation by
taking advantage of continuous improvement cycles.

The supply chain, at the heart of the IT development cycle

What is the supply chain?

Today, it is extremely rare to see companies producing software
100% in-house. Whether it’s open source libraries, developer tools,
on-premise or cloud-based deployment and delivery systems, or
software-as-a-service (SaaS) offerings, these building
blocks have become essential in the modern software factory.

Each of these “bricks” is itself the product of a long supply
chain, making the software supply chain a concept that encompasses
every facet of IT: from hardware, to source code written by
developers, to third-party tools and platforms, but also data
storage and all the infrastructures put in place to develop, test
and distribute the software.

The supply chain is a layered structure that allows companies to
implement highly flexible software factories, which are the engine
of their digital transformation.

The mass reuse of open-source components and libraries has
dramatically accelerated the development cycle and the ability to
deliver functionality according to customer expectations. But the
counterpart to this impressive gain has been a loss of control over
the origin of the code that goes into the companies’ products. This
chain of dependencies exposes organizations and their customers to
vulnerabilities introduced by changes outside their direct
control.

This is obviously a major cybersecurity issue, and one that is
only increasing as the supply chain becomes more and more complex
year over year. So it’s no surprise that large-scale cyber attacks
have been able to exploit it to their advantage recently.

The risk of the weak link

For hackers, the software supply chain of companies represents
an interesting target for several reasons. First of all, because of
its complexity and the number of interacting “bricks” at the heart
of the software factory, its attack surface is very large.
Secondly, application security, which was historically focused on
securing the application in production (i.e. exposed to the
public), often lacks the visibility and tools to effectively secure
internal build servers and other parts of the CI/CD
pipeline.

In addition, it’s important to understand that the development
chain today is continuously evolving, adding new tools constantly.
This is one of the defining characteristics of the DevOps movement,
which has blurred the line between development and operations
enormously, leaving developers free to deliver features for their
customers as quickly as possible.

These choices, though, are often implemented without oversight and
can be very different from one team to another, even within the
same department. The accumulation of slightly different tools,
libraries and platforms makes it very difficult to create accurate
inventories, which are the cornerstone of effective security
management.

Finally, by exploiting the supply chain, hackers find ways to
maximize the impact, and therefore the yield, of an attack. To
understand this, we must consider that the products and services of
one software company are the building blocks
of other companies’ supply chains. An attacker who has successfully
infiltrated one link in a chain can compromise the entire user
base, which can have disastrous consequences.

The rise of supply chain attacks

In the SolarWinds attack, between March and June 2020,
approximately 18,000 Orion platform customers, including a number
of U.S. government agencies, downloaded updates with malicious code
injected into them. This code granted unauthorized backdoor access
to systems and private networks. SolarWinds did not discover the
breach until December 2020. An international scandal ensued.

A few weeks later, in January 2021, an attacker obtained
credentials through an error in the process used to create
Codecov’s Docker images. These credentials
allowed the attacker to hijack Codecov, a tool for measuring
code coverage, and turn it into a real Trojan horse:
since the software is used in continuous integration (CI)
environments, it has access to the secret credentials of the build
processes (we’ll come back to this).

The attacker was thus able to siphon off hundreds of credentials
from Codecov users, gaining access to as many secured systems.
The company only detected the breach a few months later, in
April.

On July 2, 2021, some ninety days later, a sophisticated
ransomware group exploited a vulnerability in Kaseya Virtual System
Administrator (VSA) servers – affecting approximately 1,500 small
businesses. Kaseya is a developer of network, system and
infrastructure management software used by managed service
providers (MSPs) and other IT contractors. Although a ransomware
attack took control of the customers’ systems, the attack was
contained and defeated after a few days.

But this is not the biggest supply chain vulnerability of 2021.
In December 2021, a few months after the Kaseya incident, what is
arguably the simplest but most widespread attack on the software
supply chain occurred. After an initial proof-of-concept (POC) was
disclosed, attackers began a massive exploitation of a
vulnerability affecting Apache Log4j, an extremely popular
open-source logging library in the Java ecosystem.

Although an update fixing the problem was proposed relatively
quickly, the fact that this library, maintained by only a handful
of people, is used on a very large scale around the world, and
rarely in a transparent way, has created a huge attack surface that
will take years to resolve: the U.S. Cybersecurity and
Infrastructure Security Agency (CISA) has described it as
“endemic” ^[1], meaning that it will
probably keep resurfacing over the next decade.

Despite its magnitude, this vulnerability is far from being an
isolated case: the number of attacks using the open source
ecosystem as a propagation vector to reach supply chains
increased by 650% between 2020 and 2021 ^[2]. The European
Union Agency for Cybersecurity (ENISA) predicts that supply chain
attacks will increase fourfold by 2022 ^[3].

All of these attacks and vulnerabilities have highlighted the
lack of visibility and tools to effectively protect the supply
chain, whether it be systems to inventory the use of open-source
components, to verify their integrity, or to prevent the leakage of
sensitive information. On this last point, it is important to take
a step back and look more closely at this key element of
security.

The key to the supply chain: secrets

Getting hold of unencrypted credentials is the perfect way for a
hacker to pivot and move down the supply chain from a supplier to
its customers: with valid credentials, attackers operate as
authorized users, and post-intrusion detection becomes much more
difficult.

From a defensive standpoint, hard-coded secrets are a unique
type of vulnerability. Source code is a very leaky asset because it
is by nature intended to be frequently cloned and distributed on
multiple machines. In fact, the secrets in the source code travel
with it. But even more problematic is that code also has a
‘memory’.

Today any code repository is managed through a version control
system (VCS), usually Git ^[4], which keeps a perfect
timeline of all the changes that have been made to the files in the
code base, sometimes over decades. The problem is that still-valid
secrets can hide anywhere on that timeline, opening up a new
dimension, this time historical, to the software attack
surface.

Unfortunately, most security scans are limited to checking the
current, deployed or soon-to-be-deployed state of an application’s
source code. In other words, when it comes to secrets buried in an
old commit or even a never-deployed branch, traditional tools are
completely blind.
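To make the historical dimension concrete, here is a minimal Python sketch of a scan that walks the whole Git timeline rather than only the current state of the code. It is an illustration under stated assumptions, not how any particular scanner works: the two regular expressions, the rule names and the function names are hypothetical, and real tools rely on far larger and more precise detector sets. The only Git feature used is `git log --all -p`, which prints the patch of every commit on every branch.

```python
import re
import subprocess

# Hypothetical, deliberately tiny rule set (an assumption for illustration).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_assignment": re.compile(
        r"(?i)(?:password|secret|token|api_key)\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}

# In `git log` output, each commit starts with "commit <hash>" at column 0.
COMMIT_HEADER = re.compile(r"^commit ([0-9a-f]{7,40})")


def scan_patch_text(patch_text):
    """Return (commit, rule, line) for every added line that looks like a secret."""
    findings = []
    current_commit = None
    for line in patch_text.splitlines():
        header = COMMIT_HEADER.match(line)
        if header:
            current_commit = header.group(1)
            continue
        # Added lines start with '+'; '+++' is the file header, not content.
        if line.startswith("+") and not line.startswith("+++"):
            added = line[1:]
            for rule, pattern in SECRET_PATTERNS.items():
                if pattern.search(added):
                    findings.append((current_commit, rule, added.strip()))
    return findings


def scan_repository_history(repo_path="."):
    """Scan every commit on every branch, not just the checked-out files."""
    patch_text = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "-p", "--no-color"],
        capture_output=True, text=True, errors="replace", check=True,
    ).stdout
    return scan_patch_text(patch_text)
```

Calling `scan_repository_history(".")` inside a clone would report a key that was added in one commit and deleted in a later one, because the added line is still present in the earlier patch. A scan limited to the current files would return nothing in that situation.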

Last year alone, more than 6 million secrets were published ^[5] in public repos on
GitHub: on average, 3 commits out of every 1,000 contained a
secret. This is a fifty percent increase from the previous year.

A large number of these secrets gave access to corporate
resources. It is important to understand that even if the majority
of open source projects hosted on GitHub are personal repositories,
it is very easy for a professional developer to inadvertently
publish code giving access to corporate resources. It happens
regularly!

It is therefore not surprising that a malicious actor looking to
carry out an attack on the software supply chain would take a close
look at the public repositories on GitHub: they would have a good
chance of discovering readily exploitable flaws, primarily secrets
present in the source code that would allow them to authenticate to a
system without arousing any suspicion.

Once a secret is published, it must immediately be considered
compromised. A simple experiment consists of deliberately publishing
a “canary token” ^[6], i.e. a string that
looks just like a valid secret, with an alert mechanism
triggered when it is used. The time between the publication and the
alert is 4 seconds on average! Public repositories are closely
monitored and actively exploited.

To neutralize the risk of intrusion as quickly as possible,
there is only one solution: the immediate revocation of the secret.
But, out of panic or lack of technical knowledge, some people try to
cover up the error by adding a commit that erases the secret, which
does not mitigate the security flaw at all: Git keeps track
of all the code added, modified or deleted over time. In
practice, this means that it is difficult to erase all traces of a
past error. It also means that, in many cases, the secret will
remain available online even after it has been removed from the
“final” state of the code.

But the problems do not end there. In our scenario, as the file
containing the secret is replaced by a “clean” file, the secret
will no longer be detectable either during manual code review by a
peer (a common practice), or by traditional application security
tools such as scanners, which also only consider the most recent
version of the source code. Worse, the flaw will be duplicated
every time the code is cloned, and therefore risks being propagated
silently for a long time. In other words, a godsend for
hackers.

On July 3, 2022, the CEO of crypto-currency giant Binance warned of a
massive breach that allegedly leaked “1 billion records of
[Chinese] residents” belonging to the Shanghai police, including
“name, address, national identity, cell phone, police and medical
records.” The cause? A fragment of source code containing the
secret for connecting to a titanic database of personal information
was allegedly copied and pasted by developers into a blog post on
the Chinese developer network CSDN ^[7].

Private repos also affected

Unsurprisingly, this is only the tip of the iceberg. Private
repositories hide many more secrets than their public counterparts.
Working in a closed environment provides a false sense of security,
making contributors a little less suspicious, and therefore
statistically more likely to “let a secret leak”. Tolerating the
presence of secrets in non-publicly exposed repositories would be a
big mistake.

Indeed, no matter how private these repositories are, the
secrets they contain could be used as leverage in an attack,
allowing adversaries who had access to the repository to pivot to
other systems or elevate their privileges. There are many hacking
scenarios, but they all have one thing in common: using any found
secrets to maximize the impact of an attack.

Application security teams are well aware of the problem.
Unfortunately, the amount of work involved in investigating,
revoking and rotating secrets every week is simply overwhelming,
let alone digging through years of unexplored code.

Cybersecurity teams are taking hard-coded secrets in source
code, and the risks they bring, very seriously. Hard-coded
credentials are ranked 15th among the most “common and impactful”
vulnerabilities in the famous CWE Top 25 list 2022 ^[8]
(Common Weakness Enumeration).

A key difference, often forgotten, that separates this
vulnerability from all others, as the previous examples have shown,
is that secrets found in the source code are exploitable without
the software being in production! In other words, it is the code
itself that carries the vulnerability, not the underlying logic.

We have therefore seen how secrets represent a critical element
in securing the supply chain. Let’s now look at how organizations
are responding to this new threat in the development cycle.

The response of organizations: bring security into the
development cycle

The emergence of DevSecOps

Software supply chains have many grey areas that are not
addressed by traditional security methods. Organizations have
realized the need to introduce security into the development
lifecycle in a way that strikes the right balance between
productivity and resilience.

This is how the DevSecOps movement was born. DevSecOps consists
of inserting security into DevOps practices. As a reminder, DevOps
is a development philosophy that brings together processes and
technologies that allow developers to cooperate more effectively
with operational teams. We often talk about the DevOps pipeline
(the backbone of the software supply chain) which is characterized
by its continuity: it is about being able to integrate, test,
validate and deliver code in pre-production, in a continuous
way.

Traditional security approaches were at odds with the
DevOps philosophy: deliver faster and faster and adapt as you go.
There was significant friction between the application security
teams and the developer teams, with very different cultures,
expertise and methods. This divide, a source of many
misunderstandings, ultimately contributed to the fragility
of the development cycle.

For security managers, the challenge was to maintain the
velocity of DevOps while improving the security posture:
including security rules from the earliest stages of the
development cycle (planning, design), disseminating best practices,
and reducing the mean time to remediation (MTTR) by capturing more
“benign” flaws earlier.

More than a method, DevSecOps is above all an ideal towards which
companies wish to strive. The path is a long one: cultural
differences are tenacious and often take years to fade away.
Several avenues have been put forward to promote this
transition.

The first avenue is to rely on modern tools. Developers adopt
intuitive tools that integrate perfectly with their work
environments: the command line, API, IDE (Integrated Development
Environment), or even their version control system (VCS). Until
recently, the typical security analyst’s tools were far removed
from this world, with very specific and often impenetrable jargon.
Security software vendors have made great strides in this area,
offering developers the opportunity to become familiar with
security concepts and become self-sufficient over a wide area.

Automation is also key for enabling the creation of effective
security systems. Software engineers are specialists in automation,
so it really made no sense that they could not implement, or even
understand, the security rules imposed on them in order to protect
the supply chain. They are also the most knowledgeable about the
systems that need to be defended. Combining their knowledge with
the expertise of security engineers allows for the best use of
available resources and overall happier teams.

Perhaps the most important element of DevSecOps is the idea that
security must be part of all the stages of the
development cycle. Security cannot exist as a simple
checklist to be ticked off just before the launch of a new
version.

To achieve this result, it is essential to address an important
concept: shared responsibility.

Shared responsibility and shift-left

The new security model means sharing responsibility among all
members involved in the project. Responsibility is shared within
cross-functional teams rather than held in silos, as was
historically the case (a single independent team in charge of
security, audit, and quality assurance).

The term “shift left” is often used to illustrate this desire to
move security out of its silo in order to move security operations
earlier and save money on detection and remediation. However, this
term, popularized in the early 2000s ^[9], describes a desired
operational outcome rather than a real way to achieve it. For an
organization wishing to embark on a DevSecOps transformation, it is
better to focus on how to induce this change in order to
effectively secure its software supply chain.

The empowerment of developers is an essential driver for this.
As the first artisans of the digital world, they must be involved
in security decisions in order to take their needs and working
methods into account. A simple but powerful guideline is to always
make the shortest path also the safest.

Thus, a tool for preventing the most common errors (such as
forgetting secrets in the source code) should be easy to use and
not create friction with the way teams develop code. A good tool
must prove its usefulness and value without feeling like it will
result in vendor lock-in. It should also be able to interface with
the security teams, which are not going to disappear! On the
contrary, security teams, which tend to be smaller than their
corresponding dev teams, must be mobilized quickly for the most
complex cases.
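One way to picture a low-friction safeguard of this kind is a Git pre-commit hook that inspects only what the developer is about to commit. The Python sketch below is an assumption-laden illustration, not a description of any vendor's product: the pattern and the function names are hypothetical, and the detection is intentionally naive. It relies on two documented Git behaviors: `git diff --cached` shows staged changes, and a pre-commit hook that exits with a non-zero status aborts the commit.

```python
import re
import subprocess
import sys

# Hypothetical pattern (an assumption): an AWS-style access key ID, or a
# quoted value assigned to a suspicious-looking name.
SECRET_PATTERN = re.compile(
    r"AKIA[0-9A-Z]{16}"
    r"|(?i:password|secret|token|api_key)\s*[:=]\s*['\"][^'\"]{8,}['\"]"
)


def find_staged_secrets(diff_text):
    """Return (path, line) pairs for added lines that look like secrets."""
    findings = []
    path = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # "+++ b/src/app.py" names the file the following hunks belong to.
            target = line[4:]
            path = target[2:] if target.startswith("b/") else target
        elif line.startswith("+") and SECRET_PATTERN.search(line[1:]):
            findings.append((path, line[1:].strip()))
    return findings


def main():
    """Check the staged diff; a non-zero return value should block the commit."""
    diff_text = subprocess.run(
        ["git", "diff", "--cached", "--unified=0", "--no-color"],
        capture_output=True, text=True, errors="replace", check=True,
    ).stdout
    findings = find_staged_secrets(diff_text)
    for path, _line in findings:
        # Report the file only, so the secret is not echoed into terminal logs.
        print(f"possible secret staged in {path}", file=sys.stderr)
    return 1 if findings else 0

# When installed as .git/hooks/pre-commit, the script would end with:
#     sys.exit(main())
```

The design choice here is that the check runs on the developer's machine, before the secret ever enters the repository history, and the message points to the file so the developer can fix it without involving the security team.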

In the past, application security was considered an area that
had to remain impenetrable to ensure its effectiveness, but those
days are gone. Today, there is a desire for security testing to be
done throughout the cycle and for the results to allow remediation
without necessarily escalating to the security teams.

Promoting ownership of security at each stage of the cycle
requires a general effort of transparency between all teams. This
is a mandatory condition for creating an environment of trust and
fostering a culture that refuses to use blame as an accountability
tool.

In fact, even functions that are further away from the technical
domain must be part of this transformation. For example, product
managers must also take into account the security of the products
they design in their decision-making process.

The response of companies to face the new risks of the software
supply chain will therefore be technical as well as
organizational ^[10]. Collaboration between
the different professions working along the supply chain is now a
priority for information systems security.

Note — This article is written and contributed by
Thomas Segura, technical content writer at GitGuardian.

References

[1] endemic (www.cisa.gov)
[2] 650% between 2020 and 2021 (blog.sonatype.com)
[3] supply chain attacks will increase fourfold by 2022 (www.enisa.europa.eu)
[4] usually Git (survey.stackoverflow.co)
[5] 6 million secrets were published (www.gitguardian.com)
[6] canary token (blog.gitguardian.com)
[7] the Chinese CSDN (threatpost.com)
[8] CWE Top 25 list 2022 (cwe.mitre.org)
[9] popularized in the early 2000s (www.drdobbs.com)
[10] technical as well as organizational (blog.gitguardian.com)