In this article I will demonstrate how an organization exposed itself to an NPM package takeover.

Vulnerability: Dependency Confusion

Impact (In increasing order of severity):

  • An attacker can submit unauthorized code into the victim’s software development ecosystem
  • An attacker can remotely execute arbitrary code within the corporate network without any need for AV/EDR or firewall bypass

Prerequisites:

  • Victim must have a private NPM package
  • The private NPM package is not registered on npmjs.com
  • The name of the private NPM package must be known (e.g. exposed or leaked in a public GitHub repo)

Attack flow:

  • Attacker identifies candidate package that meets above requirements
  • Attacker crafts a package of the same name as the victim’s private NPM package
  • Attacker embeds malicious payload inside package
  • Attacker publishes the package to the public NPM registry at npmjs.com

Note: When installing packages, the npm client by default installs the highest available version that satisfies the request. If the same package name can be resolved from both the private and the public registry, whichever registry offers the higher version wins. This is the linchpin of this exploit.
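
To make that "highest version wins" behavior concrete, here is a minimal sketch in Python. It only models a merged view of two registries and is not npm's real resolver; the function names and the "private"/"public" labels are my own assumptions, and it only handles plain MAJOR.MINOR.PATCH versions.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a plain 'MAJOR.MINOR.PATCH' string into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def resolve_from_merged_view(sources: Dict[str, List[str]]) -> Tuple[str, str]:
    """Return (source, version) for the highest version across all sources.

    `sources` maps a hypothetical registry label to the versions it offers
    for a single package name. This models a merged view only; it is not
    how the npm client is actually implemented.
    """
    candidates = [
        (parse_version(version), label, version)
        for label, versions in sources.items()
        for version in versions
    ]
    if not candidates:
        raise ValueError("no versions available")
    _, label, version = max(candidates)
    return label, version


winner = resolve_from_merged_view({
    "private": ["1.0.0", "1.2.0"],  # the victim's legitimate releases
    "public": ["2.5.5"],            # the attacker-published release
})
print(winner)  # ('public', '2.5.5')
```

Comparing versions as integer tuples rather than strings matters here: as strings, "1.10.0" would sort below "1.9.9".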

Repro Steps:

The meat of the work in this type of attack is all in the reconnaissance (more on that later). Once a candidate has been identified, the rest falls into place. Let’s take a look at how to actually exploit this.

  1. Create a working directory (mkdir does-not-exist)
  2. Initialize the project (npm init)
  3. Follow the prompts to generate the package.json

Since the organization to which this vulnerability was reported has since remediated the finding, and they elected not to allow disclosure, I will use a dummy placeholder NPM package: “does-not-exist”.

As per the prerequisites for this attack, we identify that this NPM package is not published in the NPM registry:
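
For anyone who prefers to script that check, a rough sketch follows. It assumes the public registry answers 200 for a published name and 404 for an unknown one at https://registry.npmjs.org/<name>, with the slash in a scoped name URL-encoded. The helper names are hypothetical, and the status lookup is injectable so the logic can be exercised without touching the network.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable

PUBLIC_REGISTRY = "https://registry.npmjs.org"


def registry_url(package_name: str) -> str:
    """Build the metadata URL, encoding the '/' of a scoped name as %2F."""
    return f"{PUBLIC_REGISTRY}/{urllib.parse.quote(package_name, safe='@')}"


def http_status(url: str) -> int:
    """Return the HTTP status code for a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def is_published(
    package_name: str,
    fetch_status: Callable[[str], int] = http_status,
) -> bool:
    """True if the name resolves on the public registry, False on a 404."""
    status = fetch_status(registry_url(package_name))
    if status == 200:
        return True
    if status == 404:
        return False
    raise RuntimeError(f"unexpected status {status} for {package_name}")
```

Calling is_published("does-not-exist") with the default fetcher would query the live registry; passing a stub such as lambda url: 404 keeps it offline.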

The goal is to publish our version of the package to the registry. To do this, create a working directory and use npm init to create the package and initialize it with a package.json.

Package Name: @diffie_shellman/does-not-exist

Version: 2.5.5

Description: npm dependency confusion

Entry point: <enter>

Test command: <enter>

Git repository: <enter>

Keywords: <enter>

Author: <enter>

License: <enter>

The package name in this case needs to be prefixed with the username to make it unique, since “does-not-exist” is too similar to an already-existing package, “doesnnotexist”. The version in this example is arbitrary since the package does not actually exist, but in a real scenario the value would be anything greater than whatever version is seen in the public git repo. Everything else can be left at the default.

Once the package.json is generated, edit it and add the following line inside the “scripts” field:

"preinstall": "echo \"Malicious code injection here\"",

After that, publish the package to the npm registry with npm publish --access=public
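
A side note on the package.json edit from the previous step: the nested quotes are easy to get wrong by hand, and an unescaped inner quote makes the whole file invalid JSON. As a sketch, the same edit can be done in Python, letting the json module handle the escaping. The manifest below is an assumption that mirrors the prompts above plus the default test script npm init typically generates, and add_preinstall is a hypothetical helper.

```python
import json

# Assumed manifest, mirroring the npm init answers used in this post.
package = {
    "name": "@diffie_shellman/does-not-exist",
    "version": "2.5.5",
    "description": "npm dependency confusion",
    "scripts": {"test": "echo \"Error: no test specified\" && exit 1"},
}


def add_preinstall(manifest: dict, command: str) -> dict:
    """Return a copy of the manifest with a preinstall script set."""
    updated = dict(manifest)
    updated["scripts"] = {**manifest.get("scripts", {}), "preinstall": command}
    return updated


# A harmless placeholder command, the same one used in the walkthrough.
patched = add_preinstall(package, 'echo "Malicious code injection here"')

# json.dumps escapes the inner double quotes, so the output is valid JSON.
serialized = json.dumps(patched, indent=2)
print(serialized)
```

The serialized output contains the preinstall entry with \" around the echoed message, which is the form package.json needs.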

With the package published, all that needs to happen for the exploit to kick off is for the victim to install it with npm i @diffie_shellman/does-not-exist. Subsequently, when they run npm install, the preinstall script will execute.

Once done, as with all vulnerability demos, clean up all artifacts. Unpublish the package promptly: if it stays published for 72 hours, you will need to manually submit a request to support to remove it. To unpublish: npm unpublish --force

Mitigation:

What makes supply chain attacks so insidious is that they are exceptionally effective at bypassing even the best modern EDRs. Typical malware variants offer a multitude of indicators to alert on, such as multiple C2 domain callbacks, binaries, scripts, DLLs, etc. These are what traditional signature- and heuristics-based detection tools excel at identifying and quarantining.

However, when a dependency is embedded deep in a CI/CD workflow that involves several build, test, and deploy stages, usually with overly permissive service principals, one line of code being executed blends into a much larger pool of activity, making it nearly impossible to detect and isolate without adding friction to a deployment.

This is compounded by the fact that, in many cases, build servers are granted some degree of EDR policy exclusion, if not excluded entirely. This is because builds comprise such a plethora of system activity: process creates/deletes, file creates/deletes, network connections, and uploads/downloads. Much of this organic activity would send an EDR flying and, understandably, it’s simply easier to exclude the whole thing rather than meticulously refine policy.

Which brings us to the mitigation portion of this post. Disclaimer: I do not take credit for the table below. I shamelessly poached it from DataDog’s GuardDog repo, which I highly recommend experimenting with. I include this table because it is far and away the most comprehensive approach to proactively identifying potentially malicious packages.

Perhaps with one or two (maybe even 10) packages to vet, it would be reasonable for a security team to counsel development teams to simply “verify your dependencies for suspicious preinstall and postinstall scripts”. Obviously this is not a scalable approach, however, and thus a more programmatic method is necessary for an enterprise with multiple GitHub orgs, thousands of projects, and hundreds of devs with free rein over which libraries they add to their projects.
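
Even the manual advice can be scripted as a starting point. The sketch below walks a directory tree (node_modules, for instance), reads every package.json it finds, and reports any that define install-time hooks. The function names are hypothetical, and I am assuming the hooks of interest are preinstall, install, and postinstall.

```python
import json
from pathlib import Path
from typing import Dict, Iterator, Tuple

# Lifecycle scripts that npm runs automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(manifest: dict) -> Dict[str, str]:
    """Return the install-time scripts declared in a parsed package.json."""
    scripts = manifest.get("scripts")
    if not isinstance(scripts, dict):
        return {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


def scan_tree(root: Path) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (package name, hooks) for every manifest under `root` with hooks."""
    for manifest_path in sorted(root.rglob("package.json")):
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or malformed manifest, skip it
        if not isinstance(manifest, dict):
            continue
        hooks = install_hooks(manifest)
        if hooks:
            yield manifest.get("name", str(manifest_path)), hooks
```

Running list(scan_tree(Path("node_modules"))) inside a project would list every installed package that ships such a hook. Plenty of legitimate packages do, so treat the output as a review queue rather than a verdict.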

An ideal first step toward mitigating dependency confusion and other supply chain attacks (like expired author) is the programmatic analysis of package metadata. Indicators of a potentially malicious package are generously outlined in the table below, and DataDog’s GuardDog offers exactly that.

Heuristic | Description
empty_information | Identify packages with an empty description field
release_zero | Identify packages with an release version that’s 0.0 or 0.0.0
potentially_compromised_email_domain | Identify when a package maintainer e-mail domain (and therefore package manager account) might have been compromised; note that NPM’s API may not provide accurate information regarding the maintainer’s email, so this detector may cause false positives for NPM packages. see https://www.theregister.com/2022/05/10/security_npm_email/
unclaimed_maintainer_email_domain | Identify when a package maintainer e-mail domain (and therefore npm account) is unclaimed and can be registered by an attacker; note that NPM’s API may not provide accurate information regarding the maintainer’s email, so this detector may cause false positives for NPM packages. see https://www.theregister.com/2022/05/10/security_npm_email/
typosquatting | Identify packages that are named closely to an highly popular package
direct_url_dependency | Identify packages with direct URL dependencies. Dependencies fetched this way are not immutable and can be used to inject untrusted code or reduce the likelihood of a reproducible install.
npm_metadata_mismatch | Identify packages which have mismatches between the npm package manifest and the package info for some critical fields
bundled_binary | Identify packages bundling binaries
deceptive_author | This heuristic detects when an author is using a disposable email
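
To give a feel for how simple some of these metadata checks are, here is a rough sketch of the first two rows of the table. To be clear, this is not GuardDog's implementation, only my own reading of the descriptions above. The metadata dictionary and its "description" and "version" keys are assumptions for illustration.

```python
from typing import Callable, Dict, List


def empty_information(metadata: Dict[str, str]) -> bool:
    """Flag packages whose description is missing or blank."""
    return not (metadata.get("description") or "").strip()


def release_zero(metadata: Dict[str, str]) -> bool:
    """Flag packages whose release version is 0.0 or 0.0.0."""
    return metadata.get("version") in ("0.0", "0.0.0")


# Registry of checks, keyed by the heuristic names used in the table.
HEURISTICS: Dict[str, Callable[[Dict[str, str]], bool]] = {
    "empty_information": empty_information,
    "release_zero": release_zero,
}


def triggered(metadata: Dict[str, str]) -> List[str]:
    """Return the names of all heuristics that fire for this metadata."""
    return [name for name, check in HEURISTICS.items() if check(metadata)]


print(triggered({"name": "does-not-exist", "version": "0.0.0", "description": ""}))
# ['empty_information', 'release_zero']
```

Keeping the checks in a dictionary keyed by name makes it straightforward to bolt on further heuristics from the table without touching the reporting code.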

Granted, this can require a bit of architectural setup if you don’t opt for the DataDog + GitHub Advanced Security service integration, which no doubt comes with a hefty price tag. But it is by no means an unreasonable level of effort to deploy a simple homegrown workflow that leverages GuardDog as a means of protecting your organization. That is a follow-up topic for another post though.