Comment by w10-1 4 days ago

5 replies

Their content-addressing hash would seem critical, but the "combineUnordered" hash they use just adds the input hashes together byte by byte. The API is clear that this is only a best-effort combination, and I'm not sure I would rely on it for data used in security investigations. I suspect they'll come up with something like an arbitrary but fixed order over keys, which would improve hash quality.
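To make the concern concrete, here is a minimal sketch of byte-wise additive combining. It is not the library's actual code: `leaf_hash` and `combine_by_byte_addition` are hypothetical names, and SHA-256 is an assumed leaf hash chosen only for illustration.

```python
import hashlib

DIGEST_LEN = 32  # SHA-256 digest size in bytes (assumed leaf hash)


def leaf_hash(item: bytes) -> bytes:
    """Hash a single item; stands in for whatever per-item hash is used."""
    return hashlib.sha256(item).digest()


def combine_by_byte_addition(digests):
    """Combine digests by adding them position by position, modulo 256."""
    acc = [0] * DIGEST_LEN
    for digest in digests:
        for i, byte in enumerate(digest):
            acc[i] = (acc[i] + byte) % 256
    return bytes(acc)


a = leaf_hash(b"a")
b = leaf_hash(b"b")

# Addition commutes, so input order cannot affect the result.
order_independent = (
    combine_by_byte_addition([a, b]) == combine_by_byte_addition([b, a])
)

# The same linearity means 256 copies of any digest cancel out to all zeros,
# which is also the result for an empty input.
wraps_to_zero = combine_by_byte_addition([a] * 256) == bytes(DIGEST_LEN)
```

The repeated-element case is contrived (a set would never contain 256 copies of one item), but it shows that the combined value is a linear function of its inputs, with each byte position summed independently. Whether that is exploitable depends on how the real system builds its inputs.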

More generally, while I can maybe understand what they're doing, it's hard to imagine how to QA it in a way that's convincing to customers without a lot of data/compute/coverage analysis.

paulvrutledge 4 days ago

I suspect you're right and was already having similar thoughts regarding the hashing scheme. I put a patch together and am going to supplement with some additional tests of the collision space.

Unordered hashes made more sense for arbitrary Clojure data structures where the keys might be complex compound objects, but once we're in the land of datoms with finite value data types it's pretty easy to enforce a consistent ordering.

(disclaimer: I wrote much of the feature and post)
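A rough sketch of what "consistent ordering" over datoms could look like, assuming a simplified (entity, attribute, value) triple and only three value types. The encoding, the function names, and the choice of SHA-256 are all assumptions for illustration, not the patch described above.

```python
import hashlib
import struct


def encode_value(value) -> bytes:
    """Type-tagged, length-prefixed encoding so different types stay distinct."""
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        tag, payload = b"b", (b"\x01" if value else b"\x00")
    elif isinstance(value, int):
        tag, payload = b"i", struct.pack(">q", value)  # signed 64-bit only
    elif isinstance(value, str):
        tag, payload = b"s", value.encode("utf-8")
    else:
        raise TypeError(f"unsupported value type: {type(value).__name__}")
    return tag + struct.pack(">I", len(payload)) + payload


def encode_datom(entity: int, attribute: str, value) -> bytes:
    """Canonical byte encoding of one simplified datom."""
    return encode_value(entity) + encode_value(attribute) + encode_value(value)


def hash_datoms(datoms) -> bytes:
    """Sort the canonical encodings, then hash them as one ordered stream."""
    outer = hashlib.sha256()
    for item in sorted(encode_datom(*d) for d in datoms):
        outer.update(struct.pack(">I", len(item)))  # frame each datom
        outer.update(item)
    return outer.digest()
```

Because the encodings are sorted before hashing, `hash_datoms([d1, d2])` and `hash_datoms([d2, d1])` agree, while the type tags keep `(1, "a", 1)` and `(1, "a", "1")` apart.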

  • limit499karma 4 days ago

    > where the keys might be complex compound objects

    Given that any (simple) object has a unique content-addressed (c.a.) identity, even anonymous (and unordered) containment in the parent object provides a key, that is k->k, with an implicit order over {K} in the domain of keys (say, an ascending sort on k0..kn). As you obviously know, there are various schemes for that, such as "objspace://oid/<value-hash>". By definition, that is a bounded domain, with the same cardinality as the hash space. So then there remains the matter of nesting (aka trees), in which case the unique identity of the parent is computed recursively as we walk up the tree from the leaves.
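The recursive scheme described above can be sketched as follows, assuming leaves are raw bytes and containers are plain lists or tuples of nodes. `node_id` and the "leaf:"/"node:" prefixes are hypothetical, and SHA-256 is an assumed hash.

```python
import hashlib


def node_id(node) -> bytes:
    """Content-addressed identity of a leaf (bytes) or an unordered container."""
    if isinstance(node, bytes):
        # Prefix keeps leaf hashes in a separate domain from container hashes.
        return hashlib.sha256(b"leaf:" + node).digest()
    # Children are keyed by their own identities; sorting those fixed-length
    # identities gives the implicit order over the key domain.
    child_ids = sorted(node_id(child) for child in node)
    outer = hashlib.sha256(b"node:")
    for child_id in child_ids:
        outer.update(child_id)
    return outer.digest()
```

With this sketch, `node_id([b"x", [b"y", b"z"]])` equals `node_id([[b"z", b"y"], b"x"])` because child order is normalized at every level, while flattening the nesting to `[b"x", b"y", b"z"]` gives a different identity.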

lvh 4 days ago

Re: hashing: Yes, but I'll leave that one to Paul who is a lot smarter than I am :)

Re: QA: can you say a bit more about the type of coverage you're worried about? Is your concern that we'd be missing APIs, or that the storage format itself breaks, resulting in fact elision? payne (the underlying project) has a borderline obnoxious amount of tests, but that doesn't mean we didn't miss anything :)

juoitre 4 days ago

> in a way that's convincing to customers

Customers of this sort of security consulting are largely less interested in the security itself than in the audit report that allows them to say to their customers and investors “we had these security professionals look at our stuff and this is what they said”.

  • lvh 4 days ago

    Some of our customers, like Tailscale, are a helluva lot more picky than that.