Comment by the_duke 2 days ago

3 replies

This isn't just a problem in JS.

In every language I've looked at, the OTel libraries were a bloated, over-abstracted, resource-hungry mess.

I think that's partially because it is actually difficult and complex to implement, and partially because the implementations are often written by devs without a long history of implementing similar systems.

mcronce 2 days ago

It's been a bit since I've added it to an existing project, but at least as of a year or so ago, the Rust implementation (tracing + tracing-opentelemetry + opentelemetry-jaeger specifically for that project) was similar.

The impact on compile time and code size wasn't bad (for a project that was large and already pulling in a lot of crates), but it had a huge runtime cost, mostly allocator pressure in the form of temporary allocations from what I could see. For a mostly I/O-bound workload, it more than doubled CPU utilization and ballooned OS-measured memory consumption by more than 30%.

deathanatos 2 days ago

The OpenTelemetry spec is a mess. There's so much … abstract blah blah blah? … and very few actual details.

If I go to the part of the spec that I think gets down to "here is how to concretely write OpenTelemetry stuff" [1], it seems to have the various attributes camelCased, for example, whereas the article names them "spanID" and "traceID".

AFAICT the "spec" also just links you to the implementation. "Just" read this protobuf definition, translate that to JSON in your mind's eye. I "POST" this to a hard-coded path tacked onto a URL… but do I post individual traces/logs? Can I batch them? I'm sure there's a gRPC thing I could start guessing from…
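For what it's worth, here is a rough sketch of what a batched OTLP/HTTP JSON body looks like as far as I can tell from the protobuf definitions. It only builds the payload and does not send anything. The field names follow the lowerCamelCase JSON mapping, IDs are hex strings, and the 64-bit timestamps are strings; the `/v1/traces` path in the comment is the default I believe the spec uses. The helper names (`make_span`, `make_export_request`) and the `manual-sketch` scope name are made up for illustration and are not part of any library.

```python
import json
import secrets
import time


def make_span(name, trace_id, parent_span_id=None):
    """Build one span dict in the OTLP JSON shape (hypothetical helper)."""
    now = time.time_ns()
    span = {
        "traceId": trace_id,                   # 16 bytes, hex-encoded
        "spanId": secrets.token_hex(8),        # 8 bytes, hex-encoded
        "name": name,
        "kind": 1,                             # enum as integer; 1 = INTERNAL
        "startTimeUnixNano": str(now),         # 64-bit ints are JSON strings
        "endTimeUnixNano": str(now + 1_000_000),
        "attributes": [
            {"key": "example.note", "value": {"stringValue": "hand-built"}},
        ],
    }
    if parent_span_id is not None:
        span["parentSpanId"] = parent_span_id
    return span


def make_export_request(service_name, spans):
    """Wrap any number of spans in a single request body, i.e. a batch."""
    return {
        "resourceSpans": [
            {
                "resource": {
                    "attributes": [
                        {"key": "service.name",
                         "value": {"stringValue": service_name}},
                    ]
                },
                "scopeSpans": [
                    {"scope": {"name": "manual-sketch"}, "spans": spans},
                ],
            }
        ]
    }


trace_id = secrets.token_hex(16)
root = make_span("handle-request", trace_id)
child = make_span("db-query", trace_id, parent_span_id=root["spanId"])

# Assumption: this body would be POSTed with Content-Type: application/json
# to <collector base URL>/v1/traces.
body = json.dumps(make_export_request("demo-service", [root, child]))
```

So the answer to "can I batch them?" appears to be yes: the `spans` list takes as many spans as you like, and `resourceSpans` and `scopeSpans` are lists too.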

But it seems like the JSON stuff is a second-class citizen compared to the gRPC interface. Unless that's just as bad, too…

Actually getting set up in Python isn't too terrible, though there are a few classes that you're like "what's the point of this?" and most of them are apparently just undoc'd. (E.g., [2], ^F TraceProvider, get nothing.)

It is a bit depressing how this seems to be becoming The Chosen Spec.

I also sort of hate 64-bit integers for span IDs (TFA never mentions it, but AFAICT this is required by spec). I'd much rather have "/span/ids/are/a/tree" span IDs, as this integrates much better with any logging system: I can easily ask my log viewer to filter to a specific span (span_id == "/spans/a/b/c") or to a subtree (span_id regex-matches /^\/spans\/a\/.*/).
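To make the path-style idea concrete, here is a small sketch of both filters over some in-memory log records. This is not how OpenTelemetry span IDs work; the `records` list, the `span_id` field, and the `in_subtree` helper are all made up for illustration.

```python
import re

# Hypothetical log records carrying path-style span IDs.
records = [
    {"span_id": "/spans/a", "msg": "request started"},
    {"span_id": "/spans/a/b", "msg": "auth check"},
    {"span_id": "/spans/a/b/c", "msg": "db query"},
    {"span_id": "/spans/ab", "msg": "unrelated span with a similar prefix"},
    {"span_id": "/spans/x/y", "msg": "other request"},
]

# Filter to one specific span: plain equality.
exact = [r for r in records if r["span_id"] == "/spans/a/b/c"]

# Filter to the descendants of /spans/a with the regex from above.
# The trailing slash in the pattern keeps "/spans/ab" out of the result.
descendants_re = re.compile(r"^/spans/a/.*")
descendants = [r for r in records if descendants_re.match(r["span_id"])]


def in_subtree(span_id, root):
    """True if span_id is root itself or anything nested under it."""
    root = root.rstrip("/")
    return span_id == root or span_id.startswith(root + "/")


# Same subtree, but including the root span itself, without a regex.
subtree = [r for r in records if in_subtree(r["span_id"], "/spans/a")]
```

Note that the regex as written matches only descendants, not `/spans/a` itself; whether the root belongs in the "subtree" is a choice the log query has to make.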

(And the spec bizarrely focuses on some sort of language-abstract API, instead of … actual data types / media types?)

[1]: https://opentelemetry.io/docs/specs/otlp/#otlphttp

[2]: https://opentelemetry-python.readthedocs.io/en/latest/api/tr...

snuxoll 2 days ago

The .NET implementation is about as clean as it can get, but a lot of that has to do with Microsoft caring very deeply about this kind of performance data for a very long time (hence the entire System.Diagnostics namespace).

There’s certainly still some gratuitous abstraction, but it’s better than most of the architecture-astronaut code I’ve seen targeting the CLR.