Comment by thrwaway1985882 2 days ago
The threat model most people use to talk about this is a global passive adversary: a threat actor who can see all relevant traffic on the Internet but who can't decrypt or modify it.
This adversary would have the ability to ingest massive amounts of data and metadata[0] it acquires from tier 1 ISPs all over the country[1] and the world[2]. It won't see raw HTTP traffic because nearly everything of interest is encrypted, but it can capture and store (time, srcip, srcport, dstip, dstport, bytes) for each flow.
From there, it's a statistical attack: user A sent 700 kilobytes to a VPN service at time t; at t+epsilon the VPN connected to bad site B and sent 700 kilobytes plus a little overhead. Capture enough packet flows that span the user, the VPN, and the bad site and you can build statistical confidence that user A is interacting with bad site B, even with a VPN in between.
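To make the matching step concrete, here's a rough Python sketch of the idea. Everything in it is made up for illustration: the `Flow` record, the `correlated` and `score_pairs` helpers, and the 0.5 second / 5% thresholds are assumptions, not how any real system works. A real adversary would use far more careful statistics over far more data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record: metadata only, no payload."""
    time: float   # seconds since some reference point
    src: str
    dst: str
    nbytes: int


def correlated(inbound, outbound, max_delay=0.5, max_overhead=0.05):
    """Guess whether `outbound` could be the relayed copy of `inbound`.

    Assumed heuristic: the outbound flow starts shortly after the inbound
    one and is the same size or slightly larger.
    """
    delay = outbound.time - inbound.time
    if not (0 <= delay <= max_delay) or inbound.nbytes <= 0:
        return False
    overhead = (outbound.nbytes - inbound.nbytes) / inbound.nbytes
    return 0 <= overhead <= max_overhead


def score_pairs(flows, relay):
    """Count how often each (user, destination) pair lines up via `relay`."""
    inbound = [f for f in flows if f.dst == relay]
    outbound = [f for f in flows if f.src == relay]
    counts = {}
    for i in inbound:
        for o in outbound:
            if correlated(i, o):
                key = (i.src, o.dst)
                counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    observed = [
        Flow(0.0, "userA", "vpn", 700_000),
        Flow(0.1, "vpn", "siteB", 700_900),
        Flow(10.0, "userA", "vpn", 120_000),
        Flow(10.2, "vpn", "siteB", 120_300),
    ]
    # Each additional match raises confidence that userA talks to siteB.
    print(score_pairs(observed, "vpn"))
```

One match means nothing on its own; the point is that the count for a given pair keeps growing as more flows are observed, while coincidental pairs don't.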
This could go in the other direction too. If bad site B is a Tor hidden service whose admin gets caught by the FBI and turns over access, they'll be unmasking in reverse: I got packets from Tor relay A, which relay sent packets to it at time-epsilon, (...), back to the source.
There's very little you can do to fight this kind of adversary. Adding hops and layers (VPN + VPN, Tor, Tor + VPN, etc.) only makes the attack harder, not impossible. It's certainly an expensive attack in terms of time, storage, and the massive amounts of data it requires, but if your threat model includes a global passive adversary, it's game over.
[0] https://en.wikipedia.org/wiki/XKeyscore
https://mullvad.net/fr/blog/introducing-defense-against-ai-g...
and multi-hop addresses https://news.ycombinator.com/item?id=43114966