GitLab discovers widespread NPM supply chain attack
(about.gitlab.com)416 points by OuterVale 6 days ago
This is why you want containerisation or, even better, full virtualisation. Running programs built on node, python or any other ecosystem that makes installing tons of dependencies easy (and thus frustratingly common) on your main system where you keep any unrelated data is a surefire way to get compromised by the supply chain eventually. I don't even have the interpreters for python and js on my base system anymore - just so I don't accidentally run something in the host terminal that shouldn't run there.
No, that's not what I want; that's what I need when I use something like npm.
Which can't be the right way.
Why not? Make a bash alias for `npm` that runs it with `bwrap` to isolate it to the current directory, and you don't have to think about it again. Distributions could have a package that does this by default. With nix, you don't even need npm in your default profile, and can create a sandboxed nix-shell on the fly so that's the only way for the command to even be available.
Most of your programs are trusted, don't need isolation by default, and are more useful when they have access to your home data. npm is different. It doesn't need your documents, and it runs untrusted code. So add the 1 line you need to your profile to sandbox it.
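A rough sketch of what such a wrapper could look like, written as a Python helper that only builds the `bwrap` command line. The function name `bwrap_npm_argv` is made up for illustration, and the mount layout is an assumption (a merged-/usr distribution with npm installed under /usr); it has not been tuned for any particular distribution.

```python
import os


def bwrap_npm_argv(project_dir, npm_args):
    """Build an argv list that runs npm under bubblewrap.

    Assumption: a merged-/usr layout with npm installed under /usr.
    Only project_dir is bind-mounted writable; the rest of the home
    directory is never mounted inside the sandbox.
    """
    project_dir = os.path.abspath(project_dir)
    return [
        "bwrap",
        "--unshare-all",            # fresh namespaces for everything...
        "--share-net",              # ...except network, so npm can download
        "--die-with-parent",
        "--ro-bind", "/usr", "/usr",
        "--ro-bind", "/etc", "/etc",
        "--symlink", "usr/bin", "/bin",
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib64", "/lib64",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        "--setenv", "HOME", "/tmp",          # npm's cache lands in the throwaway tmpfs
        "--bind", project_dir, project_dir,  # the only writable host path
        "--chdir", project_dir,
        "npm", *npm_args,
    ]
```

Passing the result to `subprocess.run(bwrap_npm_argv(".", ["install"]))` would run the install with only the current directory writable. Pointing HOME at a tmpfs means the npm cache is discarded after every run, which trades speed for isolation.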
I wrote myself a handy and generalized bwrap-wrapping script: https://github.com/sandbox-utils/sandbox-run
The right way (technically) and the commercially viable way are often diametrically opposed. Ship first, ask questions later, or, move fast and break things, wins.
Here I go again: Plan9 had per-process namespaces in 1995. The namespace for any process could be manipulated to see (or not see) any parts of the machine that you wanted or needed.
I really wish people had paid more attention to that operating system.
The tooling for that exists today in Linux, and it is fairly easy to use with podman etc.
K8s's choices cloud that a little, but for vscode completions as an example, I have a pod that systemd launches on request.
I have nginx receive the socket from systemd, and it communicates with llama.cpp through a socket on a shared volume. As nginx inherits the socket from systemd, it doesn't have internet access either.
If I need a new model I just download it to a shared volume.
Llama.cpp has no internet access at all, and is usable on an old 7700k + 1080ti.
People thinking that the k8s concept of a pod, with shared UTS, net, and IPC namespaces, is all a pod can be confuses the issue.
The same unshare command that runc uses is very similar to how clone() drops the parent’s IPC etc…
I should probably spin up a blog on how to do this as I think it is the way forward even for long lived services.
The information is out there but scattered.
If it is something people would find useful please leave a comment.
That can only go so far. Assuming there is no container/VM escape, most software is built to get used. You can protect yourself from malicious dependencies in the build step. But at some point, you are going to do a production build, that needs to run on a production system, with access to production data. If you do not trust your supply chain; you need to fix that.
If you excuse me, I have a list of 1000 artifacts I need to audit before importing into our dependency store.
Absolutely, good old VMs can really provide the needed isolation while still having good UX. I just published a post on setting up dev VMs with Lima: https://www.metachris.dev/2025/11/sandbox-your-ai-dev-tools-...
Which distro do you run? Python is part of the OS in many cases.
It’s a fair angle you're taking here, but I would only expect to see it on hardened servers.
Why think about the consequences of your actions when you can use docker?
It's funny because techies love to tell people that common sense is the best antivirus, don't click suspicious links, etc. only to download and execute a laundry list of unvetted dependencies with a keystroke.
The lesson surely though is 'don't use web-tech, aimed at solving browser incompatibility issues for local scripting'.
When you're running NPM tooling you're running libraries primarily built for those problems, hence the torrent of otherwise unnecessary complexity of polyfills, that happen to be running on a JS engine that doesn't get a browser attached to it.
I'm sorry, but this is just incorrect. Have you ever heard of ljharb[0]? The NPM ecosystem is rife with polyfills[1]. I don't know how you can make a distinction on which libraries would be used for "local scripting" as I don't think many library authors make that distinction.
[0] - TC39 member who is self-described as "obsessed with backwards compatibility": https://github.com/ljharb
[1] - Here's one of many articles describing the situation: https://marvinh.dev/blog/speeding-up-javascript-ecosystem-pa...
I'm a victim of this.
In addition to concerns about npm, I'm now hesitant to use the GitHub CLI, which stores a highly privileged OAuth token in plain text in the HOME directory. After the attacker accesses it, they can do almost anything on behalf of me, for example, they turned many of my private repos to public.
Apparently, the GitHub CLI only stores its OAuth token in the HOME directory if you don't have a keyring. They also say it may not work on headless systems. See https://github.com/cli/cli/discussions/7109.
For example, in my macOS machines the token is safely stored in the OS keyring (yes, I double checked the file where otherwise it would've been stored as plain text).
I use it as my secret store provider but it has its quirks.
It would be better if you could have multiple providers attached (gnome-keyring and keepassxc) and then decide which app uses which provider.
Because only some secrets you want to share across devices, like wifi passwords, and the rest you don’t, like the key chromium uses to encrypt local cookies or the gh cli token.
That's true, but the same may already be true of your browser's cookie file. I believe Chrome on MacOS and Windows (unsure about Linux) now does use OS features to prevent it being read from other executables, but Firefox doesn't (yet)
But protecting specific directories is just whack-a-mole. The real fix is to properly sandbox code - an access whitelist rather than endlessly updating a patchy blacklist
Plan9 had per-process namespaces in 1995.
One could easily allow or restrict visibility of almost anything to any program. There were/are some definite usability concerns with how it is done today (the OS was not designed to be friendly, but to try new things) and those could easily be solved. The core of this existed in the Plan9 kernel and the Plan9 kernel is small enough to be understood by one person.
I’m kinda angry that other operating systems don’t do this today. How much malware would be stopped in its tracks and made impotent if every program launched was inherently and natively walled off from everything else by default?
Linux supports per-process namespaces too, and has tools like firejail to use them for sandboxing, but nonetheless sandboxing is not widely used.
I think this normalises running untrustworthy, abusive proprietary software, because they can at least be somewhat contained. The only reason I have apps like Facebook on my android phone is that I have sufficient trust in GrapheneOSs permissions. Then, apps like syncthing become crippled as filesystem virtualisation and restrictions prevent access and modification of files regardless of my consent.
Not disagreeing with the need for isolation though, I just think it should be designed carefully in a zero-sacrifice way (in terms of user control/pragmatic software freedom)
> But protecting specific directories is just whack-a-mole. The real fix is to properly sandbox code - an access whitelist rather than blacklist
I believe Wayland (don't quote me on this because I know exactly zero technical details) as opposed to x is a big step in this direction. Correct me if I am wrong but I believe this effort alone has been ongoing for a decade. A proper sandbox will take longer and risks being coopted by corporate drones trying to take away our right to use our computers as we see fit.
Wayland is a significant improvement in one specific area (and it's not this one).
All programs in X were trusted and had access to the same drawing space. This meant that one program could see what another one was drawing. Effectively this meant that any compromised program could see your whole screen if you were using X.
Wayland has a different architecture where programs only have access to the resources to draw their own stuff, and then a separate compositor joins all the results together.
Wayland does nothing about the REST of the application permission model - ability to access files, send network requests etc. For that you need more sandboxing e.g. Flatpak, Containers, VMs
Maybe I am missing something but how and why would a display protocol have anything to do with file access model??
this, this, this
All our tokens should be in a protected keychain, and there are no proper cross-platform solutions for this. gcloud, the AWS SDKs, gh and other tools all just store them in dotfiles.
And worst thing, afaik there is no way to do it correctly in macOS for example. I'd like to be corrected though.
What is a proper solution for this? I don't imagine gpg can help if you encrypt it but decrypt it when you login to gnome, right? However, it would be too much of a hassle to have to authenticate each time you need a token. I imagine macOS people have access to the secure enclave using touch ID but then even that is not available on all devices.
I feel like we are barking up the wrong tree here. The plain text token thing can't be fixed. We have to protect our computers from malware to begin with. Maybe Microsoft was right to use secure admin workstations (saw) for privileged access but then again it is too much of a hassle.
The way I solve the plain text problem is through a combination of direnv[1] and pass[2].
For a given project, I have a `./creds` directory which is managed with pass and it contains all the access tokens and api keys that are relevant for that project, one per file, for example, `./creds/cloudflare/api_token`. Pass encrypts all these files via gpg, for which I use a key stored on a Yubikey.
Next to the `./creds` directory, I have an `.envrc` which includes some lines that read the encrypted files and store their values in environment variables, like so: `export CLOUDFLARE_API_TOKEN=$(pass creds/cloudflare/api_token)`.
Every time that I `cd` into that project's directory, direnv reads and executes that file (just once) and all these are stored as environment variables, but only for that terminal/session.
This solves the problem of plain-text files, but of course the values remain in ENV and something malicious could look for some well known variable names to extract from there. Personally I try to install things in a new termux tab every time which is less than ideal.
I'd like to see if and how other people solve this problem
[1]: https://direnv.net/ [2]: https://www.passwordstore.org/
I think the correct solution is to use a keyring. On Linux there's gnome keyring and last time I worked on an iOS app there was something similar.
This does mean entering your keyring password a lot.
It might be possible to lash up a cross-platform solution with KeePassXC. It's got an API that can be accessed from the command line (chezmoi uses it to add secrets to dotfiles). Yes, you'd be authenticating every time you need a token but that might not be too much of a burden if you spend most of your time on a machine with a fingerprint scanner.
otoh I wouldn't do it, because I don't believe I could implement it securely.
I’ve got this working with a 1Password setup; the only issue is if you have background tasks.
I had a Borg backup script for example and 1password needed me to authenticate to run it.
Authenticating for ssh and git is great.
> And worst thing, afaik there is no way to do it correctly in MacOS for example. I'd like to be corrected though.
https://developer.apple.com/documentation/security/keychain-...
And similar services exist on Linux desktops. There are libraries that will automatically pick the right backend.
For what it’s worth, the recommended way of getting credentials for AWS would be either:
1. Piggyback off your existing auth infra (e.g. Active Directory or whatever you already have going on for user auth)
2. Failing that, use Identity Center to create user auth in AWS itself
Either way means that your machine gets temporary credentials only
Alternatively, we could write an AWS CLI helper to store the stuff into the keychain (maybe someone has)
Not to take away from your more general point
We need flatpak for CLI tools
Snap works with CLI tools. But underneath that, isn't it AppArmor and namespaces? (I don't really know. I'm getting the impression flatpak-style process isolation is possible just not widely adopted)
This doesn't sound like a technical problem to me. Even my throw-away bash scripts call to `secret-tool lookup`, since that is actually easier than implementing your own configuration.
Also this is a complete non-issue on Unix(-like) systems, because everything is designed around passing small strings between programs. Getting a secret from another program is the same amount of code, as reading it from a text file, since everything is a file.
I'm also a victim of this. Last time I try and install Backstage.
Have you wiped your laptop/infected machine? If not I would recommend it; part of it created a ~/.dev-env directory which turned my laptop into a GitHub runner, allowing for remote code execution.
I have a read-only filesystem OS (Bluefin Linux) and I don't know quite how much this has saved me, because so much of the attack happens in the home directory.
> "This creates a dangerous scenario. If GitHub mass-deletes the malware's repositories or npm bulk-revokes compromised tokens, thousands of infected systems could simultaneously destroy user data."
Pop quiz, hot shot! A terrorist is holding user data hostage, got enough malware strapped to his chest to blow a data center in half. Now what do you do?
Shoot the hostage.
The hostage naively walked past all the police and into the data centre, and you’re shooting them in the leg. They’ll probably survive, but they knowingly or incompetently made their choice. Sucks to be them.
Does anyone know why NPM seems to be the only attractive target? Python and Java are very popular, but I haven't heard anything in those ecosystems for a while. Is it because something inherently "weak" about NPM, or simply because, like Windows or JavaScript, everyone uses it?
Compared to the Java ecosystem, I think there's a couple of issues in the NPM ecosystem that makes the situation a lot worse:
1) The availability of the package post-install hook that can run any command after simply resolving and downloading a package[1].
That, combined with:
2) The culture with using version ranges for dependency resolution[2] means that any compromised package can just spread with ridiculous speed (and then use the post-install hook to compromise other packages). You also have version ranges in the Java ecosystem, but it's not the norm to use in my experience, you get new dependencies when you actively bump the dependencies you are directly using because everything depends on specific versions.
I'm no NPM expert, but that's the worst offenders from a technical perspective, in my opinion.
[1]: I'm sure it can be disabled, and it might even be now by default - I don't know.
[2]: Yes, I know you can use a lock file, but it's definitely not the norm to actively consider each upgraded version when refreshing the lockfile.
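To make the version-range point concrete, here is a simplified sketch of how a caret range such as `^1.2.3` picks up new releases without anyone editing package.json. It follows the documented caret rules for plain MAJOR.MINOR.PATCH versions only; it is not npm's actual resolver, and the function names are made up for illustration.

```python
def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no pre-release tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def caret_upper_bound(base):
    """Return the exclusive upper bound for a caret range starting at base."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def satisfies_caret(version, caret_range):
    """True if version falls inside a range such as '^1.2.3'."""
    base = parse_version(caret_range.lstrip("^"))
    candidate = parse_version(version)
    return base <= candidate < caret_upper_bound(base)


def resolve(available, caret_range):
    """Pick the highest available version that satisfies the range."""
    matching = [v for v in available if satisfies_caret(v, caret_range)]
    return max(matching, key=parse_version) if matching else None
```

With `^1.2.3` declared, a freshly published (possibly compromised) `1.3.0` is selected the next time the range is resolved without a lock file pinning the older version.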
Also badly named commands, `npm install` updates your packages to the latest version allowed by package.json and updates the lock file, `npm ci` is what people usually want to do: install the versions according to the lock file.
IMO, `ci` should be `install`, `install` should be `update`.
Plus the install command is reused to add dependencies, that should be a separate command.
This hasn't been true since version 5.4.2, released in 2017.
`npm install` will always use the versions listed in package-lock.json unless your package.json has been edited to list newer versions than are present in package-lock.json.
The only difference with `npm ci` is that `npm ci` fails if the two are out of sync (and it deletes `node_modules` first).
> The culture with using version ranges for dependency resolution
Yep, auto-updating dependencies are the main reason malware can spread so fast. I strongly recommend using `save-exact` in npm and only updating your dependencies when you actually need to.
This advice leaves you vulnerable to log4j style vulnerabilities that get discovered though.
The answer is a balance. Use Dependabot to keep dependencies up to date, but configure a dependency cooldown so you don't end up installing anything too new. A seven day cooldown would keep you from being vulnerable to these types of attacks.
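The cooldown idea can be sketched in a few lines. This is only an illustration of the selection rule, not how Dependabot implements it; the function name and the shape of the `releases` mapping are assumptions.

```python
from datetime import datetime, timedelta, timezone


def newest_outside_cooldown(releases, now, cooldown_days=7):
    """Return the most recently published version older than the cooldown.

    releases maps a version string to its publish time (timezone-aware).
    Returns None when every release is still inside the cooldown window.
    Note: this picks by publish time, not by highest version number.
    """
    cutoff = now - timedelta(days=cooldown_days)
    eligible = {version: published for version, published in releases.items()
                if published <= cutoff}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)
```

A release published yesterday is skipped until it has aged past the cutoff, which gives the registry and the community time to notice and pull a malicious version.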
To add a few:
* NPM has a culture of "many small dependencies", so there's a very long tail of small projects that are mostly below the radar that wouldn't stand out initially if they get a patch update. People don't look critically into updated versions because there's so many of them.
* Developers have developed a culture of staying up-to-date as much as possible, so any patch release is applied as soon as possible, often automated. This is mainly sold as a security feature, so that a vulnerability gets patched and released before disclosure is done. But it was (is?) also a thing where if you wait too long to update, updating takes more time and effort because things keep breaking.
One factor is that node's philosophy is to have a very limited standard library and rely on community software for a ton of stuff.
That means that not only the average project has a ton of dependencies, but also any given dependency will in turn have a ton of dependencies as well. there’s multiplicative effects in play.
This is my take as well. I've never come across a JS project where the built-in data structures were exclusively used.
One package for lists, one for sorting, and down the rabbit hole you go.
I think this is mostly historical baggage unfortunately. Every codebase I've ever worked in there was a huge push to only use native ES6 functionality, like Sets, Maps, all the Iterable methods etc., but there was still a large chunk of files that were written before these were standardized and widely used, so you get mixes of Lodash and a bunch of other cursed shit.
Refactoring these also isn't always trivial either, so it's a long journey to fully get rid of something like Lodash from an old project
This has improved recently. Packages like lodash were once popular but you can do most stuff with the standard library now. I think the only glaring exception is the lack of a deep equality function.
This is the main reason. Pythons ecosystem also has silly trends and package churn, and plenty of untrained developers. It’s the lack of a proper standard library. As bad a language as it may be, Java shows how to get this right.
Goddamnit: meant to write “node’s” not “pythons”. Human is hallucinating.
What? Python's standard library seems far more extensive than Java's.
Larger attack surface (JS has been the #1 language on GitHub for years now) and more amateur developers (who are more likely to blindly install dependencies, not harden against dev attack vectors, etc).
Also: a culture of constant churn in libraries which in combination with the potential for security bugs to be fixed in any new release leads to a common practice of ingesting a continual stream of mystery meat. That makes filtering out malware very hard. Too much noise to see the signal. None of the above cultural factors is present in the other ecosystems.
Unfortunately, blindly installing dependencies at compile-time is something that many projects will do by default nowadays. It's not just "more amateur developers" who are at risk here.
I've even seen "setup scripts" for projects that will use root (with your permission) to install software. Such scripts are less common now with containers, but unfortunately containers aren't everything.
> blindly installing dependencies at compile-time is something that many projects will do by default nowadays.
I consider this to be a sign that someone is still an amateur, and this is a reason to not use the software and quickly delete it.
If you need a dependency, you can call the OS package manager, or tell me to compile it myself. If you start a network connection, you are malware in my eyes.
Yes, exactly; I followed a Github course at one point and it was Strongly Recommended that you enable Dependabot for your project which will keep your dependencies up to date. It's basically either already enabled or a one-click setup action at this point. The norm that Github pushes is that you should trust them to keep your stuff updated and secure.
Npm has weak security boundaries.
Basically any dependency can (used to?) run any script with the developer's permissions on install. JVM and Python package managers don't do this.
Of course in all ecosystems once you actually run the code it can do whatever with the permissions of the executed program, but this is another hurdle.
Python absolutely can run scripts in installation. Before pyproject.toml, arbitrary scripts were the only way to install a package. It's the reason PyPi.org doesn't show a dependency graph, as dependencies are declared in the Turing-complete setup.py.
Wrong. Wheels were available long before pyproject.toml, and you could instruct pip to only install from wheels. setup.py was needed to build the wheels, but the build step wasn’t a necessary part of installation and could be disabled. In that sense its role is similar to that of pre-publish build step of npm packages, unless wheels aren’t available.
Deno has tackled some of these issues with their permission system, but afaik it can only be applied to apps, not to dependencies.
What we really need is a system to restrict packages in what they can do (for example, many packages don't need network access).
Lavamoat purports to do this. https://lavamoat.github.io/
There has been some promising prior research such as BreakApp attempting to mitigate unusual supply-chain compromises such as denial-of-service attacks targeting the CPU via pathological regexps or other logic-bomb-flavored payloads.
As far as I understand, NPM packages are not self-contained like e.g. Python wheels and can (and often need to) run scripts on install.
So just installing a package can get you compromised. If the compromised box contains credentials to update your own packages in NPM, then it's an easy vector for a worm to propagate.
Python wheels don't run arbitrary code on install, but source distributions do. And you can upload both to PyPI. So you would have to run
pip install <package> --only-binary :all:
to only install wheels and fail otherwise.
Maybe some technical reasons, but more like the mind set of the JS "community" that if you don't have the latest version of a package 30 seconds after it's pushed you're hopelessly behind.
In other "communities" you upgrade dependencies when you have time to evaluate the impact.
For the last 2 years, PyPI (the main Python package repository) has required 2FA.
Last time I did anything with Java, felt like use of multiple package repositories including private ones was a lot more popular.
Although higher branching factor for JavaScript and potential target count are probably very important factors as well.
I feel the upgrade cycle is slower with Python. I upgrade dependencies when something is broken or there is a known issue. That means any active vulnerabilities propagate slower. Slower propagation means lower risk. And as there are fewer upstream packages, the impact of a compromised maintainer is more limited.
The credential harvesting aspect is what concerns me most for the average developer. If you've ever run `npm install` on an affected package, your environment variables, .npmrc tokens, and potentially other cached credentials may have been exfiltrated.
The action item for anyone potentially affected: rotate your npm tokens, GitHub PATs, and any API keys that were in environment variables. And if you're like most developers and reused any of those passwords elsewhere... rotate those too.
This is why periodic credential rotation matters - not just after a breach notification, but proactively. It reduces the window where any stolen credential is useful.
> anyone potentially affected
How does one know one is affected?
What's the point of rotating tokens if I'm not sure that I've been affected - the new tokens will just be ex-filtrated as well.
First step would be to identify infection, then clean up and then rotate tokens.
The article has some indicators of compromise, the main one locally would be .truffler-cache/ in the home directory. It’s more obvious for package maintainers with exposed credentials, who will have a wormed version of their own packages deployed.
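A quick local check for the directories named in this thread could look like the sketch below. It only covers `.truffler-cache` and the `.dev-env` directory mentioned in another comment; the article lists further indicators, and the function name is made up for illustration.

```python
from pathlib import Path

# Directory names mentioned in this thread; not a complete indicator list.
SUSPICIOUS_DIRS = (".truffler-cache", ".dev-env")


def find_indicators(home):
    """Return the suspicious directories that exist directly under home."""
    home = Path(home)
    return [home / name for name in SUSPICIOUS_DIRS if (home / name).is_dir()]
```

Calling `find_indicators(Path.home())` returns an empty list when neither directory exists. An empty result does not prove the machine is clean.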
From what I’ve read so far (and this definitely could change), it doesn’t install persistent malware, it relies on a postinstall script. So new tokens wouldn’t be automatically exfiltrated, but if you npm install any of an increasing number of packages then it will happen to you again.
It does install a GitHub runner and registers the infected machine as a runner, so remote code execution remains possible. It might be a stretch to call it persistent but it definitely tries.
> if you're like most developers and reused any of those passwords elsewhere
Is this true? God I hope not, if developers don't even follow basic security practices then all hope is lost.
I'd assume this is stating the obvious, but storing credentials in environment variables or files is a big no-no. Use a security key or at the very least an encrypted file, and never reuse any credential for anything.
> Is this true? God I hope not, if developers don't even follow basic security practices then all hope is lost.
"Basic security practices" is an ever expanding set of hoops to jump through, that if properly followed, stop all work in its tracks. Few are following them diligently, or at all, if given any choice.
Places that care about this - like actually care, because of contractual or regulatory reasons - don't even let you use the same machine for different projects or customers. I know someone who often has to carry 3+ laptops on them because of this.
Point being, there's a cost to all these "basic security practices", cost that security practitioners pretend doesn't exist, but in fact it does exist, and it's quite substantial. Until security world acknowledges this fact openly, they'll always be surprised by how people "stubbornly" don't follow "basic practices".
To me, the worming aspect and taking developers data as hostages against infrastructure take down is most concerning.
Previously, you had isolated places to clean up a compromise and you were good to go again. This attack approaches the semi-distributed nature and attacks the ecosystem as a whole, and I am afraid this approach will get more sophisticated in the future. It reminds me a little of malicious transactions written into a distributed ledger.
Even with periodic rotation of credentials, an attacker gets enough time to do sufficient damage. IMO, the best way to solve this would be to not handle any sort of credentials at all at the application layer! If at all, the application must handle only very short-lived tokens. Let there be a sidecar (for example) that does the actual credential injection.
Also a good reminder that you should be storing secrets in some kind of locker, not in plain text via environment variables or config files. Impossible to get everyone on board but if you can you should as much as possible.
I hate that high profile services still default to plain text for credential storage.
How do you do this in practice?
If I just need to `fly secrets set KEY=hunter2` one time for production I can copy it from a paper pad even but if it's a key I need to use every time I run a program that I'm developing on, it's likely going to end up at least being in my program's shell environment (and thus readable from its /proc/pid/environ). So if I `npm install compromised-package` – even from some other terminal – can't it just `grep -a KEY= /proc/*/environ`?
Or are you saying the programs we hack on should use some kind of locker api to fetch secrets and do away with env vars?
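The worry about `/proc/<pid>/environ` is easy to see from the file format itself: it is just NUL-separated `KEY=VALUE` entries. The sketch below parses that format and reads only the current process's own file; the function names are made up, and it assumes a Linux procfs. As far as I know the file reflects the environment at the time the process was started and is generally readable by other processes running as the same user.

```python
def parse_environ(raw):
    """Parse the NUL-separated KEY=VALUE format used by /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue  # skips the empty trailing entry
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_own_environ():
    """Read this process's initial environment from procfs (Linux only)."""
    with open("/proc/self/environ", "rb") as handle:
        return parse_environ(handle.read())
```

Nothing here needs special privileges, which is the point: a secret exported into a shell is one file read away for any code running under the same account.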
Also the user data destruction if it stops being able to propagate itself.
Everyone is blaming npm but GitHub should be put on blast too for allowing the repos to be created and not quickly flagged.
GitHub has a massive malware problem as it is and it doesn’t get enough attention.
I love! how Github, as a corporate company now owned by Microsoft, is directly tied to GoLang as the main repository of the vast majority of packages/dependencies.
Imagine the number of things that can go wrong when they try to regulate or introduce restrictions for build workflows for the purpose of making some extra money... lol
The original Java platform is a good example to think about.
That's the collective choice of the authors of those packages. A go module path is literally just the canonical URL where you can download the module.
The golang modules core to the language are hosted at golang.org
Module authors have always been free to have their own prefix rather than github.com, even if they host their module on GitHub. If they say their module is example.com/foo and then set their webserver to respond to https://example.com/foo?go-get=1 with <meta name="go-import" content="example.com/foo git https://github.com/the_real_repository/foo"> then they will leave no hint that it's really hosted at GitHub, and they could host it somewhere else in future (including at https://example.com directly if they want)
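For reference, the tag a vanity-import server returns has three space-separated fields: the import prefix, the version control system, and the repository root. A small sketch that renders it, assuming the module lives in a git repository (so the middle field is `git`); the helper name is hypothetical.

```python
from html import escape


def go_import_meta(import_prefix, vcs, repo_root):
    """Render the go-import meta tag for a vanity module path."""
    content = f"{import_prefix} {vcs} {repo_root}"
    return f'<meta name="go-import" content="{escape(content, quote=True)}">'
```

Moving the module to another host later only means changing the third argument; the import path that users type stays the same.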
Another feature is that go uses a default proxy, https://proxy.golang.org/, if you don't set one yourself. This means that Google, who control that proxy, can choose to make a request for a package like github.com/foo/bar go to some place else, if for whatever reason Microsoft won't honour it any more.
Golang builds pulling a github.com/foo/bar/baz module don't rely on any GitHub "build workflow", so unless you mean they're going to start restricting or charging for git clones for public repos (before you mention Docker Hub, yes I know), nothing's gonna change. And even if they're crazy enough to do that, Go module downloads default to a proxy (proxy.golang.org by default, can be configured and/or self-hosted) and only fall back to vcs if the module's not available, so a module only needs to be downloaded once from GitHub anyway. Oh and once a module is cached in the proxy, the proxy will keep serving it even if the repo/tag is removed from GitHub.
"The original Java platform" had no package management though, that came with Maven and later Gradle, that have similar vectors for supply chain attacks (that is, nobody reviews anything before it's made available on package repositories).
And (to put on my Go defender hat), the Go ecosystem doesn't like having many dependencies, in part because of supply chain attack vectors and the fact that Node's ecosystem went a bit overboard with libraries.
Pushing the data to Github was a blessing in disguise. A friend wouldn't have noticed he got caught if it didn't create a repo on his account. It would have been worse if it silently sent the data to some random server.
Wouldn’t have been that hard to write a rule that matches the repositories being created by this malware. It literally does the same thing to every victim.
Sure, but until the malware spreads quickly you don't know you need the rule.
True. But this was the “second coming” of the exact same malware from a few months ago.
Most of those attacks do the same kind of things.
So I'm surprised to never see something akin to "our AI systems flagged a possible attack" in those posts. Or that GitHub, owned by AI pusher Microsoft, does not already use its AI to find this kind of attack before it becomes a problem.
Where is this miracle AI for cybersecurity when you need it?
Sonatype Lifecycle has some magic to prevent these types of attacks. They claim it is AI-based. Not sure how it all works, as it is proprietary, but it is one of the things we use at work. Sonatype IQ Server powers it.
Mitigate this attack vector by adding:
ignore-scripts=true
to your .npmrc
Is there a way to list all the packages in the dependency tree with preinstall/postinstall hooks? Preferably before doing the installation?
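One possible way to answer the question above without installing anything is to read the lockfile. This is a sketch that assumes a package-lock.json with lockfileVersion 2 or 3, where npm marks entries under "packages" with "hasInstallScript": true when they declare preinstall, install, or postinstall scripts. The function name and the sample lockfile below are made up for illustration.

```javascript
// Sketch: list lockfile entries that npm has flagged as having install scripts.
// Assumes lockfileVersion 2 or 3; older lockfiles have no "packages" map.
function packagesWithInstallScripts(lockfile) {
  const packages = lockfile.packages || {};
  return Object.entries(packages)
    .filter(([, meta]) => meta.hasInstallScript === true)
    // The root project is stored under the empty key "".
    .map(([path, meta]) => `${path || "(root)"}@${meta.version ?? "?"}`)
    .sort();
}

// Made-up sample data standing in for a parsed package-lock.json.
const sampleLockfile = {
  lockfileVersion: 3,
  packages: {
    "": { name: "demo-app", version: "1.0.0" },
    "node_modules/example-plain-lib": { version: "1.3.0" },
    "node_modules/example-native-addon": { version: "2.1.0", hasInstallScript: true },
  },
};

console.log(packagesWithInstallScripts(sampleLockfile));
```

For a real project, the sample object would be replaced by JSON.parse(fs.readFileSync("package-lock.json", "utf8")). The flag only says that a script exists, not what it does, so each listed package still needs a manual look.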
Yes, it can break deps, some will not install. Puppeteer is a good example because it installs binaries. But it also shows an error with the cmd needed to complete the installation.
Why is it allowed by default?
> it’s npm’s belief that the utility of having installation scripts is greater than the risk of worms.
NPM co-founder Laurie Voss
https://blog.npmjs.org/post/141702881055/package-install-scr...
Once you run the JavaScript of the npm library you just installed, if it's Node, what's to stop it accessing environment variables and any file it wants, and sending data to any domain it wants?
fs and net can be mitigated with `--permission`
https://nodejs.org/api/permissions.html
Regardless, it’s worth using `--ignore-scripts=true` because that’s the common vector these supply chain attacks target. Consider that when automating the attack, adding it to the application code is more difficult than injecting it into life-cycle scripts, which have well-known config lines.
Nothing, but at least you'll have time to see the audit if it's aware.
pnpm disables all install scripts by default and makes it trivial to whitelist the few you need. It's usually just one or two, or sometimes zero, depending on the project. Even without malware, most postinstall scripts are used for spam and analytics, and running them makes your life worse.
npm should have died long ago, I don't know why it's still being used.
They pulled a little sneaky on ya, mentioning GitLab security features available to GitLab users in a GitLab Security blog post with GitLab logos everywhere.
Call me a conspiracy theorist, but I start to think these people might be affiliated with GitLab.
I have an friend that starts an project next month that will rely on npm. He is quite a noob and didn't code in ages. He will have almost no clue how to harden against this, he will probably not even notice if he becomes a victim until something really bad happens.
Pretty sad.
At least make them run pnpm instead of npm, disabling post-install scripts. https://pnpm.io/supply-chain-security
"a friend" because friend starts with a consonant sound, not a vowel sound. "a project" for the same reason.
HTH.
In this narrow case, using pnpm or something similar that blocks postinstall scripts by default should be sufficient. In general, you probably want to use a container/vm/sandbox of some sort so dev stuff can’t access anything else on your machine.
Discussion: https://news.ycombinator.com/item?id=46032539
> Our internal monitoring system has uncovered multiple infected packages containing what appears to be an evolved version of the "Shai-Hulud" malware.
Although it's not entirely new, it's something else.
GitLab's post and the linked discussion thread are both from November 24th 2025. I may be misreading the parent comment, but I'm personally thankful there isn't a Return of the Return of Shai-Hulud, as I assumed this was a third recent incident. For those concerned about these attacks, Helixguard's post (from the linked discussion) lists out the packages they found to be affected, while GitLab's post gives more information on how the attack works. Since it's self-propagating though, assume the list of affected packages might be longer as more NPM tokens are compromised.
Lucky for us C programmers. Each distro provides its own trusted libc, and my code has no other dependencies. :)
C (actually POSIX) has a hashmap implementation: https://man7.org/linux/man-pages/man3/hsearch.3.html
What it doesn't have is a hashmap type, but in C types are cheap and are created on an ad-hoc basis. As long as it corresponds to the correct interface, you can declare the type any way you like.
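    #include <string.h>  /* strlen, memset, strcpy, strdup (POSIX) */

    /* Returns a heap-allocated copy of string prefixed with pad spaces.
       The caller frees it; the result is NULL if strdup fails. */
    char *
    left_pad (const char * string, unsigned int pad)
    {
      char tmp[strlen (string)+pad+1];  /* VLA: scratch buffer on the stack */
      memset (tmp, ' ', pad);           /* leading spaces */
      strcpy (tmp+pad, string);         /* original text plus terminating NUL */
      return strdup (tmp);              /* heap copy that outlives this frame */
    }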
Doesn't sound too hard in my opinion. This only works for strings that fit on the stack, so if you want to make it robust, you should check the string size. It (like everything in C) can of course fail. Also, it is quite a naive implementation, since it calculates the string size three times.
Not a C expert, but you’re using a dynamic array right on the stack, and then returning the duplicate of that. Shouldn’t that be malloc’ed instead? Is it safe to return the duplicate of a stack-allocated array; wouldn’t the copy be heap-allocated anyway? Not to mention it could blow the stack and you get a segmentation fault?
strndup would be safer if I correctly recall from my C days?
Safer for what? That opinion seems to be misguided to me.
strndup prevents you from overrunning the allocation of a string given that you pass it the containing allocation's size correctly. But if you got passed something that is not a string, there will be a buffer overrun right there in the first line. Also what outer allocation?
You use strcpy when you get a string and memcpy when you get an array of char. strncpy is for when you get something that is maybe a string, but also a limited array. There ARE use cases for it, but it isn't for safety.
While you think this is a producer problem, it's simply a userland market.
Just like in the 90s when viruses primarily went to Windows, it wasn't some magical property of Windows, it was the market of users available.
Also, following this logic, it then becomes survivorship bias, in that the more attacks they get, the more researchers spend time looking & documenting.
While it can happen to anyone, npm does preselect the users most likely to unknowingly amplify such an attack. Just today I was working on a simple JS script while disconnected from the Internet. Qwen Coder suggested I “npm install glob”, which I couldn’t because there was no internet, so I asked for an alternative, and sure enough the alternative solution was two lines of vanilla JS. This is just one example, but it is the modus operandi of the NPM ecosystem.
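The comment above does not show the two-line alternative, so the following is only a guess at the kind of thing meant: a dependency-free stand-in for a simple "**/*.ext" glob using Node's built-in fs and path modules. The function name findFiles and the temporary demo directory are assumptions for illustration.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Recursively collect files under dir whose names end with ext.
// Symlinks are skipped, because a Dirent for a symlink is neither
// isDirectory() nor isFile().
function findFiles(dir, ext) {
  const results = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      results.push(...findFiles(full, ext)); // descend into subdirectories
    } else if (entry.isFile() && entry.name.endsWith(ext)) {
      results.push(full);
    }
  }
  return results;
}

// Demo on a throwaway directory so the example does not depend on the cwd.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "glob-demo-"));
fs.mkdirSync(path.join(demoDir, "src"));
fs.writeFileSync(path.join(demoDir, "src", "a.js"), "");
fs.writeFileSync(path.join(demoDir, "notes.txt"), "");
console.log(findFiles(demoDir, ".js"));
```

This covers only suffix matching; brace expansion, negation, and the other pattern features of a real glob library are deliberately left out.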
Are there any good alternatives to ESLint? ESLint is now my only dev dependency with hundreds of dependencies of its own.
Biome: https://biomejs.dev/
Also the whole ecosystem around OXC looks very promising: https://oxc.rs/
Both of those have over 400 dependencies each [0] [1] but just in Rust instead - there hasn't been a Rust supply chain attack yet but is this any better? [2]
Admittedly you're not normally downloading the dependencies to your machine as you're often using pre-built binaries, but a malicious package could still run if a version was shipped with it.
[0]: https://github.com/biomejs/biome/blob/93182ea8e9d479fd0187ce...
[1]: https://github.com/oxc-project/oxc/blob/65bd5584bfce0c7da90f...
[2]: https://users.rust-lang.org/t/yet-another-npm-supply-chain-a...
Once upon a time I would download the source code of a library, unzip it, and personally vet the code before adding it to my project.
With some package managers these days I don't even know how to do that (and I'm not necessarily talking about Node, specifically). How do you figure out what the install process does to your computer, without becoming an expert on the manifest syntax? For those of us who care about what goes on under the hood, it is definitely not easier than the days of following well-formed (or even semi-formed) documentation by hand.
The brutal part is how "rotate secrets and move on" has become the default hygiene advice when the real pattern is that npm keeps being the soft underbelly of modern stacks. It should be mandatory for a build process to have some tool like Prismor scan for these.
Can't GitHub just block/make private all https://github.com/search?q=Sha1-Hulud%3A%20The%20Second%20C... repos as a first step?
As a Java dev, seems like only a matter of time before Maven Nexus repo attacks become commonplace.
Send them a request to have Trusted publishers support at central-support (at) sonatype.com
I did that a couple of weeks ago and received an acknowledgment "Another request on Trusted Publishing option. Assigning to Product for review and further action." so this is a bit encouraging.
At least Maven dependencies don't execute scripts on install, but Maven plugins could have a big blast radius.
Over a decade ago at Amazon, all third party dependencies needed to be manually imported. On the one hand, it makes importing new versions or packages slow. On the other hand, there is a very explicit intention and log of every external change that made it into internal projects.
At my previous company, I implemented staged dependencies with artifactory so that production could never get packages that had never gone through CR, or staging environments first. They just were never replicated. That eliminated fuzzy dependency matches that showed up for the first time in production (something that did happen). Because dev to production was about 1 week, it also afforded time to identify packages before they had a chance to be deployed. Obviously it was less robust than manually importing.
Maybe self-hosted package caches support these features now, but 6-7 years ago, that was all manual work.
I think I found some repos here: https://github.com/search?q=in:description+Sha1-Hulud&type=r...
Pardon the naive question. What I don't get is that these injected payloads are JS files; isn't there some scanning at the npm upload level to look for exfiltration behaviour, or bash executions of dangerous commands like rm or shred?
Something helpful here would be to enable developers to optionally identify themselves. Not Discord-style where only the platform knows their real identity, but publicly as well.
So, EV code signing certificates? Windows has that, and it'll verify that right in the OS. Git for instance, shows as being signed by
CN = Johannes Schindelin O = Johannes Schindelin S = Nordrhein-Westfalen C = DE
Downside is the cost. Certificates cost hundreds of dollars per year. There's probably some room to reduce cost, but not by much. You also run into issues of paying some homeless person $50 to use their identity for cyber crimes.
This was largely the reason I rejected "real name verification" ideas at GitHub after the xz attack. It's not that hard for a dedicated actor (which the xz attacker certainly was), especially a state-sponsored one, to get a quality stolen identity.
The inevitable evolution of such a feature is a button on your repo saying "block all contributors from China, Russia, and N other countries". I personally think that's the antithesis of OSS and therefore couldn't find the value in such a thing.
That would be easily defeated by a VPN. The inevitable evolution would be some kind of in-person attestation of identity backed up with some kind of insurance on the contributor's work, and, well you're converging on the employer-employee relationship then.
As I understand it, this attack works because the worm looks for improperly stored secrets/keys/credentials. Once it finds them, it publishes malicious versions of those packages. It hits NPM because it’s an easy target… but I could easily imagine it hitting pip or the repo of some other popular language.
In principle, what’s stopping the technique from targeting macOS CI runners which improperly store keys used for notarization signing? Or… is it impossible to automate a publishing step for macOS? Does that always require a human to do a manual thing from their account to get a project published?
Also layer upon layer of abstractions - to the point where no single person understands the stack from top to bottom.
Perhaps there is a light at the end of the tunnel: with AI coding assistance, the whole application can be written from scratch (like the old days). All the code is there, not buried deep within someone else's codebase.
While this does appear to be getting worse, I'm in the camp of letting it happen. The Node/JS ecosystem is imho completely unsuitable for serious work and this is merely the natural consequence. Let it burn, and perhaps something better will come from the ashes.
Microsoft should just bite the bullet and make a huge JS standard library and then send GitHub notifications to all the project maintainers who are using anything that could be replaced by something from there suggesting them to do such replacement. This would likely significantly reduce the number of supply chain attacks on the npm ecosystem.
JS also has a stability issue. The language evolved fast, the tools and the number of tools evolved fast and in different directions. The module system is a mess and trying to make it better caused more mess. There's Node.js, TypeScript and the browser. That's a lot to handle when trying to make something "std".
Meanwhile I have been using Ruby for 15 years and it has evolved in a stable way without breaking everything and without having to rewrite tons of libraries. It's not as powerful in terms of performance and I/O, it's not as far-reaching as JS is because it doesn't support the browser, it doesn't have a typescript equivalent, but it's mature and stable and its power is that it's human-friendly.
This is harder than it sounds. Look at the amount of effort it took to standardise Temporal (the new date/time API) and then for all the runtimes to implement it. It’s a lot of work.
And what’s more, people have proposed a standard library through tc39 without success - https://github.com/tc39/proposal-built-in-modules
Of course any large company could create a massive standard library on their own without going through the standards process but it might not be adopted by developers.
If you look at the list of compromised packages, very few of them could reasonably be included in a standard library. It's mostly project-specific stuff like `@asyncapi/specs` or `@zapier/zapier-sdk`. The most popular generic one I see is `get-them-args`, which is a CLI argument parser - which is something Node has in the form of `util.parseArgs` since v16.17.0.
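For reference, a small sketch of what using the built-in parser mentioned above can look like. The CLI shape (a --name string, a --verbose flag, and trailing file names) and the wrapper name parseCli are hypothetical; only util.parseArgs itself comes from Node.

```javascript
const { parseArgs } = require("node:util");

// Hypothetical CLI: tool --name <string> [--verbose|-v] [files...]
function parseCli(argv) {
  const { values, positionals } = parseArgs({
    args: argv,
    options: {
      name: { type: "string", short: "n" },
      verbose: { type: "boolean", short: "v" },
    },
    allowPositionals: true, // without this, bare arguments are rejected
  });
  return {
    name: values.name,
    verbose: values.verbose === true, // flag is undefined when not passed
    files: positionals,
  };
}

console.log(parseCli(["--name", "demo", "-v", "a.txt", "b.txt"]));
```

In a real script the argument list would usually be process.argv.slice(2). Unknown options cause parseArgs to throw by default, which may or may not be the behaviour a given tool wants.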
Well they clearly lacked marketing? Pretty sure a red text in npm every time that package was installed that says "hey we have a better way to do this with node alone" would have made a dent in the library usage, but they didn't do anything of the sort.
That is literally how the CycloneDX SBOM packages work, well, after the fact and after the disclosure process.
Pretty sure Microsoft is exponentially bigger than 99% of the library authors out there, and add to that the giant communication channel that GitHub gives it over developers, so the analogy breaks pretty fast.
Surely in this day and age we can fairly trivially find out these come from the usual suspects - China, Russia, Iran, etc. Being in such a digital age, where our economies are built on this tech...is this not effectively (economic) warfare? Why are so many governments blasé about it?
The US and Israel also have advanced penetration teams. But they wouldn't be this sloppy - they want persistent advanced access. I suspect Iran, Russia and China also wouldn't be this sloppy. This is too wide ranging and easily detectable and scattershot.
This feels like opportunistic cyber criminals, or North Korea (which acts like cyber criminals.)
Proving the attack is state-sponsored is difficult (as any attack you attribute to a country can very well be a false-flag operation), and “state sponsorship” is itself a spectrum; for example, you could argue India’s insufficient action against tech-support scammers is effectively state-sanctioned.
This can of course be resolved, but here’s the kicker: our own governments equally enjoy this ambiguity to do their own bidding; so no government truly has an incentive to actually improve cross-border identity verification and cybercrime enforcement.
Not to mention, even besides government involvement, these malicious actors still “engage” or induce “engagement” which happens to be the de-facto currency of the technology industry, so even businesses don’t actually have any incentive of fighting them.
It shouldn't be a "get the foreigners!" situation. Sure, that is a method of solving the symptoms. But what you're really asking for is ... a software bill of materials. Why don't we have that yet? Because it's cheaper to get ripped off than it is to pay for a BOM. That's the real problem.
SBOMs exist. You can get them generated for most software via package managers in standard forms like cyclonedx.
It's just not that effective when the SBOM becomes unmanageable. For example, our JS project at $work has 2.3k dependencies just from npm. I can give you that SBOM (and even include the system deps with nix) but that won't really help you.
They are only really effective when the size is reasonable.
We are still bound to our primal instincts. If you cut the throat of a baby in the middle of Times Square, the outrage will be insane. Yet, lack of financing to hospitals can do that many times over but people are numb to it.
Take the Jaguar hack: the economic loss is estimated at 2.5bn. Given an average house price in the UK of $300k, that’s like destroying ~8,000 homes.
Do you think the public and international response will be the same if Russia or China leveled a small neighborhood even with no human casualties?
I wonder that, too. Surely, this is a fantastic opportunity to claim that it comes from whoever is declared evil right now, and force a harder us-vs-them mindset. If people don't have a clearly defined "evil bad guy" that is responsible for everything bad, how will you get teenagers to die for your country in war?
Or, in other words; maybe the nature of humans and the inherent pressure of our society to perform, to be rich, to be successful, drives people to do bad things without any state actor behind it?
They aren't; in fact the very opposite happens: we are bombarded nonstop with claims that everything is the fault of actors from these countries, even when it isn't.
We should fight this kind of behavior (and for our privacy) regardless of who's involved, yet our governments in the West have nurtured this narrative of always pointing at big tech and foreign actors as scapegoats for anything privacy- or hacking-related.
Also, any cyber attack tracker will show you this is a global issue, if you think there aren't millions of attacks carried out from our own countries, you're not looking enough.
The majority of these are actually North Korea, India and America. The really disappointing ones are usually Indian and American, and the ones that lay dormant code are usually North Korean.
About a month ago I had a rather annoying task to perform, and I found an NPM package that handled it. I threw “brew install NPM” or whatever onto the terminal and watched a veritable deluge of dependencies download and install. Then I typed in ‘npm ’ and my hand hovered on the keyboard after the space as I suddenly thought long and hard about where I was on the risk/benefit curve and then I backspaced and typed “brew uninstall npm” instead, and eventually strung together an oldschool unix utilities pipeline with some awk thrown in. Probably the best decision of my life, in retrospect.