Chrome's hidden X-Browser-Validation header reverse engineered
(github.com)
348 points by dsekz 3 days ago
Making it easier to reject "unapproved" or "unsupported" browsers and take away user freedom. Trying to make it harder for other browsers to compete.
That can be done already based on User-Agent, though. Other browsers don't spoof their agent strings to look like Chrome, and never have (or, they do, but only in the sense that everyone still claims to be Mozilla). And browsers have always (for obvious reasons) been very happy to identify themselves correctly to backend sites.
The purpose here is surely to detect sophisticated spoofing by non-user-browser software, like crawlers and robots. Robots are in fact required by the net's Geneva Convention equivalent to identify themselves and respect limitations, but obviously many don't.
I have a hard time understanding robot detection as an issue of "user freedom" or "browser competition".
> I have a hard time understanding robot detection as an issue of "user freedom" or "browser competition".
The big one is that running a browser other than Chrome (or Safari) could come to mean endless captchas, degrading the experience. "Chrome doesn't have as many captchas" is a pretty good hook.
> I have a hard time understanding robot detection as an issue of "user freedom" or "browser competition".
In the name of robot detection, you can lock down devices, require device attestation, prevent users from running non-standard devices/OS/software, and prevent them from accessing websites (Cloudflare dislikes non-Chrome browsers and hates non-standard ones; ReCaptcha locks you out if you're not on Chrome-like/Safari/Firefox). Web Environment Integrity[1] is also a good example of where robot detection ends up affecting the end user.
The purpose here isn't to deal with sophisticated spoofing. This is setting a couple of headers to fixed and easily discoverable values. It wouldn't stop a teenager with curl, let alone a sophisticated adversary. There's no counter-abuse value here at all.
It's quite hard to figure out what this is for, because the mechanism is so incredibly weak. Either it was implemented by some total idiots who did not bother talking at all to the thousands of people with counter-abuse experience that work at Google, or it is meant for some incredibly specific case where they think the copyright string actually provides a deterrent.
(If I had to guess, it's about protecting server APIs only meant for use by the Chrome browser, not about protecting any kind of interactive services used directly by end-users.)
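To make that concrete: once the fixed values are known, any HTTP client can replay them verbatim. A hypothetical sketch using Python's requests library; the header values below are illustrative placeholders, not real Chrome output:

    import requests

    # Static values copied from a real Chrome request would work just as
    # well; nothing ties them to the machine or binary that sends them.
    headers = {
        "User-Agent": "Mozilla/5.0 ... Chrome/126.0.0.0 Safari/537.36",
        "x-browser-copyright": "<Google's copyright string, copied verbatim>",
        "x-browser-validation": "<hash observed in a real Chrome request>",
    }
    resp = requests.get("https://example.com/", headers=headers)
    print(resp.status_code)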
I would imagine that this serves the same purpose as the way that early home consoles would check the inserted cartridge to see that it had a specific copyright message in it, because then you can't reproduce that message without violating the copyright.
In this case, you would need to reproduce a message that explicitly states that it's Google's copyright, and that you don't have the right to copy it ("All rights reserved."). Doing that might then give Google the legal evidence it needs to sue you.
In other words, a legal deterrence rather than a technical one.
It's easy to change the User Agent and we cannot handwave this fact away for the sake of argument.
> Why do you think Chrome bothers with these extra headers? Anti-spoofing, bot detection, integrity, or something else?
Bot detection. It's a menace to literally everyone. Not to piss anyone off, but if you haven't dealt with it, you don't have anything of value to scrape or get access to.
> Bot detection. It's a menace to literally everyone. Not to piss anyone off, but if you haven't dealt with it, you don't have anything of value to scrape or get access to.
What leads you to believe that bot developers are unable to set a request header?
They managed fine to set Chrome's user agent. Why do you think something like X-Browser-Validation is off limits?
Because you would need to reproduce an explicit Google copyright statement which states that you don't have the right to copy it ("All rights reserved.") in order to do it fully.
That presumably gives Google the legal ammunition it needs to sue you if you do it.
Bullshit. You don't have anything of value either. Scrapers will ram through _anything_, and figure out if it's useful later.
> 1. Do I understand it correctly that the validation header is individual for each installation?
I'm not sure how you got that impression. It's generated from fixed constants.
https://github.com/dsekz/chrome-x-browser-validation-header?...
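A minimal sketch of the scheme as the write-up describes it: a base64-encoded SHA-1 over a fixed, Chrome-wide API key concatenated with the User-Agent string. The key below is a placeholder, not the real constant:

    import base64
    import hashlib

    API_KEY = "<fixed constant extracted in the write-up>"  # placeholder
    USER_AGENT = "Mozilla/5.0 ... Chrome/126.0.0.0 Safari/537.36"

    def x_browser_validation(api_key: str, user_agent: str) -> str:
        # Same fixed inputs -> same header value for every install.
        digest = hashlib.sha1((api_key + user_agent).encode()).digest()
        return base64.b64encode(digest).decode()

    print(x_browser_validation(API_KEY, USER_AGENT))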
This should be somewhat alarming to anyone who already knows about WEI.
I wonder if "x-browser-copyright" is an attempt at trying to use the legal system to stifle competition and further their monopoly. If so, have they not heard of Sega v. Accolade ?
I'm a bit amused that they're using SHA-1. Why not MD5, CRC32, or (as the dumb security scanners would recommend) even SHA256?
I am also alarmed. Google has to split off its development of both Chrome and Android now, this crazy vertical integration is akin to a private company building and owning both the roads AND the cars. Sure, you can build other cars, but we just need to verify that your tires are safe before you can drive on OUR roads. It's fine as long as you build your car on our complete frame, you can still choose whatever color you like! Also, the car has ads.
Ok, but the road here is the internet; how much of that does google/alphabet actually own?
All of YouTube. The vast majority of email. All sources of revenue for ad-funded sites, basically, except for those ads pushed by Meta in their respective walled gardens. They are also the gatekeepers deciding what parts of the internet the users actually see, and they continuously work towards preventing people from actually visiting other sites by siphoning off information and keeping users on Google (AMP, AI summaries). The whole Play Store ecosystem is a walled garden which pretends to be open by building on an ostensibly open source OS but adding strict integrity checks on top, which gives Google the ultimate power to decide what is allowed to run on people's phones.
They don't have to own the servers and the pipes if they own all the clients, sources of revenue, distribution platforms and financial transaction systems.
> how much of that does google/alphabet actually own?
A ton. They have stakes in a bunch of submarine cables, their properties (YouTube, Maps, Google Search) make up a large share of Internet traffic, they are via Google Search the chief traffic source for most if not all websites, and they own a large CDN as well as one of the three dominant hyperscalers...
> I wonder if "x-browser-copyright" is an attempt at using the legal system to stifle competition and further their monopoly. If so, have they not heard of Sega v. Accolade?
My first thought was the Nintendo logo used for Gameboy game attestation.
I wonder what a court would make of the copyright header. What original work is copyright being claimed for here? The HTTP request? If I used Chrome to POST this comment, would Google be claiming copyright over the POST request?
SHA-1 is a head-scratcher for sure.
I can only assume it's the flawed logic that it's "reasonably secure, but shorter than sha256". Flawed because SHA1 is broken, and SHA256 is faster on most hardware, and you can just truncate your SHA256 output if you really want it to be shorter.
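For illustration, truncation really is a one-liner; a SHA-256 digest cut to 20 bytes matches SHA-1's output length without SHA-1's known collision weakness:

    import hashlib

    data = b"example input"
    sha1_digest = hashlib.sha1(data).digest()          # 20 bytes, broken
    sha256_trunc = hashlib.sha256(data).digest()[:20]  # 20 bytes
    assert len(sha1_digest) == len(sha256_trunc) == 20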
SHA-1 is broken for being used in digital signature algorithms or for any other application that requires collision resistance.
There are a lot of applications for which collision resistance is irrelevant and for which the use of SHA-1 is fine, for instance in some random number generators.
On the CPUs where I have tested this (with hardware instructions for both hashes, e.g. some Ryzen and some Aarch64), SHA-1 is faster than SHA-256, though the difference is not great.
In this case, collision resistance appears irrelevant. There is no point in finding other strings that will produce the same validation hash. The correct input strings can be obtained by reverse engineering anyway, which has been done by the author. Here the hash was used just for slight obfuscation.
The perf difference between SHA1 and SHA256 was marginal on the systems I tested (3950x, M1 Pro), which makes SHA256 a no-brainer to me if you're just picking between those two (collision resistance is nice to have even if you "don't need it").
You're right that collision resistance doesn't really matter here, but there's a fair chance SHA1 will end up deprecated or removed from whatever cryptography library you're using for it, at some point in the future.
WEI? As in Windows Experience Index? Can you elaborate?
Web Environment Integrity: https://en.wikipedia.org/wiki/Web_Environment_Integrity
Probably any cryptographic hash function would have done.
My suspicion is that what they're trying to do here is similar to e.g. the "Readium LCP" DRM for ebooks (previously discussed at [1]): A "secret key" and a "proprietary algorithm" might possibly bring this into DMCA scope in a way that using only a copyrighted string might not.
> have they not heard of Sega v. Accolade?
My mind went here immediately as well, but some details are subtly different. For example being a remote service instead of a locally-executed copy of software, Google could argue that they are materially relying on such representation to provide any service at all. Or that without access to the service's code, someone cannot prove this string is required in order to interoperate. It also wouldn't be the first time the current Supreme Court took advantage of slightly differing details as an excuse to reject longstanding precedent in favor of fascism.
I have to imagine Google added these headers to make it easier for them to identify agentic requests vs human requests. What angers me is that this is yet another signal that can be used to uniquely fingerprint users.
How does that work, though? I have a bunch of automated tasks I use to speed up my workflows, but they all run on top of the regular browser that I also use. I don't see how this war is winnable (not without tracking things like micro-movements of the mouse that might be caused by being a human, etc.).
It doesn't really meaningfully increase the fingerprinting surface. As the OP mentioned, the hash is generated from constants that are the same for all Chrome builds. The only thing it really does is help distinguish Chrome from other Chromium forks (e.g. Edge or Brave), but there are already enough proprietary bits inside Chrome that you can easily tell it apart.
> The only thing it really does is help distinguish Chrome from other Chromium forks (e.g. Edge or Brave)
You could already do that with the user agent string. What this does is distinguish between Chrome and something else pretending to be Chrome. Like, say, a Firefox user who is spoofing a Chrome user agent on a site that blocks, or reduces functionality for, the Firefox user agent.
Plenty of bots pretend to be Chrome via user agent, but if you look closely are actually running Headless Chromium. This is a very useful signal for fraud and abuse prevention.
I spoof User-Agent and TLS/browser fingerprints all day. These are the basics. None of this bothers me, tbh; I'm constantly running tests on lots of versions of Chrome, Firefox and Brave and haven't really seen any impact on bot detection. I do a lot of browser emulation of other browsers in Chrome. PerimeterX/HUMAN seems to be the only WAF that is really good at catching this.
User-agent discrimination has been happening for literally decades at this point, but you're right that this could make things worse.
User-agent discrimination is tolerable when it's Joe Webmaster doing it out of ignorance. It is not acceptable if it is being used by a company leveraging their dominant position in one market to gain an advantage over its competitors in another market. It's not acceptable even if it's not said company's expressed intent to do so but merely a "happy accident" that is getting "overlooked".
Indeed, even for those who require a round of mental gymnastics before they concede that monopolies are, like, "bad" or whatever, GP points out precisely how this would constitute "consumer harm".
Seems unnecessary.
The same policies also offer the ability to force-install an official Google "Endpoint Verification" Chrome extension which validates browser/OS integrity using Enterprise Chrome Extension APIs ("chrome.enterprise") [0] that are only available in force-installed enterprise extensions.
FWIW, in my years of managing enterprise chrome deployments, I haven't come across the feature to force people to use Chrome (there are a lot of settings, maybe I've missed this one). But, there definitely is the ability to prevent users from mixing their work and non-work gmail accounts in the same chrome profile.
[0] https://developer.chrome.com/docs/extensions/reference/api/e...
Edit: Okay, maybe one hole in my logic is the first sign-in experience. When signing into Google for the first time in a new Chrome browser, the force-installed extension wouldn't be there yet. Although Google could hypothetically still allow the login initially, but then abort/cancel the sign-in process as part of the login flow if the extension doesn't sync and install (indicating non-Chrome use).
This might be their “context aware” security feature. Which can prevent access to certain things based on device, browser, etc.
I don’t see why any of that can’t rely on a chrome extension implementation using the privileged APIs to verify OS, Browser, etc. Struggling to understand why they need special headers for any of this functionality.
Why would they think this was a good idea after losing the Chrome antitrust trial? I don't know what the intended purpose of this is, but I can see several ways it could be used in an anti-competitive way, although now that it has been reverse engineered, an extension could spoof it. On the other hand, I wonder if they intend to claim the header is a form of DRM and such spoofing is a DMCA violation...
http://en.wikipedia.org/wiki/Sega_Enterprises_Ltd._v._Accola... is the legal precedent that says trying to do that won't work, but then again maybe Google thinks it's invincible and can do whatever it wants after it ironically defeated Oracle in a case about interoperability and copyright.
Apple famously does this with this word soup in their SMC chips, and proceeded to bankrupt a company that sold Hackintoshes and shipped it in their EFI: https://en.wikipedia.org/wiki/Psystar_Corporation
    Our hard work
    by these words guarded
    please don't steal
    (c) Apple Computer Inc
Though one could argue that they would have probably bankrupted them anyway even if they hadn't done that.

I think it's difficult to argue that Google doesn't have the right and capability to build their own private internet; I just also think they'd like to make the entire internet their own private internet, and do away with the public internet, and I'd really prefer they not do that.
Two questions:
Which version of chrome is the first to implement these headers?
What are the potential effects of these headers on chromium forks, e.g. ungoogled chromium?
Not really. It's just an API key + the user agent. There is no mechanism to detect that the browser hasn't been tampered with. If you wanted to do that, you'd at least include a hash over the browser binary, or better yet the in-memory application binary.
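A sketch of that stronger variant, hashing the installed binary; the path and the idea of sending the digest along are hypothetical:

    import hashlib

    CHROME_PATH = "/opt/google/chrome/chrome"  # example path, varies by OS

    def binary_digest(path: str) -> str:
        # Stream the file so a large binary needn't fit in memory.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    print(binary_digest(CHROME_PATH))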
That would provide no extra capability. Anybody smart enough to modify the Chrome executable could just patch the hash generation to also return a static (but correct) hash.
And why should anyone with a sane mind (except for Googlers) allow this kind of validation bs to exist?
At this point I am fully convinced that Google is abusing Chrome's dominant position to push their own agenda and shape the whole Internet the way they want. Privacy sandbox, manifest v3, you name it.
Sadly nobody can do anything about it, so far. We'll yet need to see the outcome of the antitrust trial.
If you were using a user agent spoofing extension couldn't this be used to guess your "real" UA?
It looks like it's an SHA hash, so working backwards would probably be prohibitively irritating.
It's not all that small, although probably small enough to make a rainbow table or something.
You would have to maintain the code to generate character-perfect strings (or maybe just keep a very large library of the current most popular ones) and also make sure you have the up-to-date API key salt values (which they're probably going to start rotating regularly), which, as I said before, wouldn't be impossible, just prohibitively irritating to maintain for comparatively little benefit.
And besides, it won't be too long before people just start spoofing the hash too; that would probably take less time than getting the generator up and running.
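A sketch of that lookup-table idea, assuming the hashing scheme from the write-up and a hypothetical shortlist of popular User-Agent strings:

    import base64
    import hashlib

    API_KEY = "<fixed constant from the write-up>"  # placeholder

    # Hypothetical: a short list covers most real-world Chrome installs.
    POPULAR_UAS = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ... Chrome/126.0.0.0 ...",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ... Chrome/126.0.0.0 ...",
    ]

    def validation(ua: str) -> str:
        digest = hashlib.sha1((API_KEY + ua).encode()).digest()
        return base64.b64encode(digest).decode()

    # Precompute hash -> UA once; inverting an observed header is one lookup.
    table = {validation(ua): ua for ua in POPULAR_UAS}
    print(table.get("<observed x-browser-validation value>"))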
Is an "api key" like this covered by copyright? Would that technically mean that spoofing this random sequence of numbers would require me to agree to whatever source license they offer it under, since I wouldn't know the random sequence unless I read it in their source?
That's an odd possibility.
Dug into chrome.dll and figured out how the x-browser-validation header is generated. Full write-up and PoC code here: https://github.com/dsekz/chrome-x-browser-validation-header
Why do you think Chrome bothers with these extra headers? Anti-spoofing, bot detection, integrity, or something else?