Comment by beejiu 16 hours ago

5 replies

> So what’s happened here? Well, whoever collected these emails first converted from CRLF (i.e., “Windows” line ending coding) to “NL” (i.e., “Unix” line ending coding). This is pretty normal if you want to deal with email. But you then have one byte fewer:

I think there is a second possible conclusion, which is that the transformation happened historically. Everyone assumes these emails are an exact dump from Gmail, but isn't it possible that Epstein was syncing emails from Gmail to a third-party mail server?

Since the Stack Overflow post details the exact situation in 2011, I think we should be open to the idea that we're seeing data collected from a secondary mail server, not Gmail directly.

Do we have anything to discount this?

(If I'm not mistaken, I think you can also see the "=" issue simply by applying the Quoted-Printable encoding twice, not just by mishandling the line-endings, which also makes me think two mail servers. It also explains why the "=" symbol is retained.)
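To make the two routes concrete, here is a rough sketch using Python's standard `quopri` module. The sample text and the `naive_decode` helper are made up for illustration and are not taken from any real mail tool or from the released files. Under those assumptions, both routes leave a stray "=" followed by a line break, so the artifact alone may not tell you which one happened.

```python
import quopri

# A quoted-printable body as it might appear on the wire: a soft line
# break is "=" followed by CRLF and should vanish when decoded.
wire = b"The quick brown fox jumps over the la=\r\nzy dog"
plain = quopri.decodestring(wire)  # correct decode, soft break removed

# Route 1: line endings are normalised to LF first, then a decoder that
# only recognises "=\r\n" as a soft break runs over the result.
unix = wire.replace(b"\r\n", b"\n")  # one byte shorter per line break


def naive_decode(body: bytes) -> bytes:
    """Hypothetical buggy decoder: only strips '=' followed by CRLF."""
    return body.replace(b"=\r\n", b"")


route1 = naive_decode(unix)  # nothing matches, so "=" + LF survives

# Route 2: a second system encodes the already encoded body again
# (the soft-break "=" becomes "=3D"), and the reader decodes only once.
twice = quopri.encodestring(wire)
route2 = quopri.decodestring(twice)

print(plain)
print(route1)
print(route2)
```

Decoding `route2` a second time recovers the clean text, which is one way to check whether a given message was double-encoded rather than mangled by a line-ending conversion.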

TazeTSchnitzel 12 hours ago

In one of the email PDFs I saw an XML plist with some metadata that looked like it was from Apple's Mail.app, so these might be extracted from whatever internal format that uses.

topspin 5 hours ago

What happened here is what always happens with all printed and digital material that goes through some evidentiary process.

The shot-callers demand the material, which is a task fobbed off onto some nobody intern who doesn't matter (deliberately, because the lawyers and career LEOs don't want any "officer of the court" or other "party" to put eyes on things they might need to deny knowing about later.) They use only the most primitive, mechanical method possible, with little to no discretion. The collected mass of mangled junk is then shipped to whoever, either in boxes or on CD-ROM/DVD (yes, still) or something. Then, the reverse process is done, equally badly, again by low-level staff, also with zero discretion and little to no technical knowledge or ability, for exactly the same reasons, to get the material into some form suitable for filing or whatever.

Through all of this, the subtle details of data formats and encodings are utterly lost, and the legal archive fills with mangled garbage like raw quoted-printable emails. The parties involved have other priorities, such as minimizing the number of people involved in the process, and tight control over the number of copies created. Their instinct is not to bring in a bunch of clever folk that might make the work product come out better, because "better" for them is different than "better" for Twitter or Facebook. Also, these disclosures are inevitably and invariably challenged by time: the obligation to provide one thing or another is fought to the last possible minute, and when the word does finally go out there is next to no time to piddle around with details.

In the Epstein case, the disclosures were done years ago, the original source material (computers, accounts, file systems, etc.) has long since been (deliberately) destroyed, and what the feds have is the shrapnel we see today.

flomo 8 hours ago

When they process these emails, it's fairly common to import everything into an MS Outlook PST file (using whatever buggy tool). That's probably why these look like Outlook printouts even though it's Yahoo Mail or some other service.

ErigmolCt 10 hours ago

Yeah, I wouldn't bet on this being a single bad Gmail export; it smells much more like the accumulated scars of multiple mail systems doing "helpful" things to the same messages over time.