Comment by sali0 a day ago

12 replies

noob question, i'm currently adding telemetry to my backend.

I was at first implementing otel throughout my api, but ran into some minor headaches and a lot of boilerplate. I shopped a bit around and saw that Sentry has a lot of nice integrations everywhere, and seems to have all the same features (metrics, traces, error reporting). I'm considering just using Sentry for both backend and frontend and other pieces as well.

Curious if anyone has thoughts on this. Assuming Sentry can fulfill our requirements, the only thing that really concerns me is vendor lock-in. But I'm wondering what other people think.

srikanthccv a day ago

>I was at first implementing otel throughout my api, but ran into some minor headaches and a lot of boilerplate

OTel also has numerous integrations: https://opentelemetry.io/ecosystem/registry/ In contrast, Sentry lacks traditional metrics and other capabilities that OTel offers. IIRC, Sentry experimented with "DDM" (Delightful Developer Metrics), but this feature was deprecated and removed while still in alpha/beta.

Sentry excels at error tracking and provides excellent browser integration. This might be sufficient for your needs, but if you're looking for the comprehensive observability features that OpenTelemetry provides, you'd likely need a full observability platform.

vrosas a day ago

Think of otel as just a standard data format for the logs/traces/metrics that your backend(s) emit, plus some open source libraries for dealing with that data. You can pipe it straight to an observability vendor that accepts these formats (pretty much everyone does: Datadog, Stackdriver, etc.) or you can simply write the data to a database and wire up your own dashboards on top of it (e.g. Grafana).

Otel can take a little while to understand because, like many standards, it's designed by committee and the code/documentation will reflect that. LLMs can help but the last time I was asking them about otel they constantly gave me code that was out of date with the latest otel libraries.

stackskipton a day ago

Ops type here. OTel is great, but if your metrics are not there, please fix that. In particular, consider just importing prometheus_client and going from there.

Prometheus is bog easy to run and Grafana understands it. Anything involving alerting/monitoring from logs is a bad idea for future you, I PROMISE YOU, PLEASE DON'T!

  • sali0 10 hours ago

    Thank you, this is where I'll likely start.

    From other comments as well, seems it's still worth trying to integrate otel. Appreciate everyone's insights

  • avtar a day ago

    > anything involving alerting/monitoring from logs is bad idea for future you

    Why is issuing alerts for log events a bad idea?

    • _kblcuk_ 21 hours ago

      It’s trivial to alter or remove log lines without knowing or realizing that it affects some alerting or monitoring somewhere. That’s why there are dedicated monitoring and alerting systems to start with.

      • sethammons 21 hours ago

        Same with metrics.

        If you need an artifact from your system, it should be tested. We test our logs and many types of metrics. We have had too many incidents caused by logs or metrics changing and no longer triggering alerts. I never got to build out my alert test bed that exercises all known alerts in prod, verifying they continue to work.

    • stackskipton 17 hours ago

      Couple of reasons.

      Biggest one: the sample rate is much higher (every log), and this can cause problems if a service goes haywire and starts spewing logs everywhere. Logging pipelines tend to be very rigid as well, for various reasons. Metrics are easier to handle, as you can dial back the sample rate, drop certain metrics or spin up additional Prometheus instances.

      The logging format becomes very rigid, and if the company uses multiple languages, this can be problematic as different languages can behave differently. Is this exception something we care about or not? So we throw more code at it in an attempt to get log-based alerting into a state that does not drive everyone crazy, whereas if we were just doing "rate(critical_errors[5m]) > 10" in Prometheus, we would be all set!
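
For anyone wondering what an expression like `rate(critical_errors[5m]) > 10` actually checks: in PromQL, `rate()` gives the average per-second increase of a counter over the window, so the alert fires when the counter grows by more than 10 per second averaged over the last five minutes. Below is a rough stdlib-only Python sketch of that idea. It is not how Prometheus implements it (the real `rate()` also handles counter resets and extrapolates to the window edges), and the names `Sample`, `simple_rate` and `should_alert` are made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One scrape of a monotonically increasing counter (hypothetical type)."""
    timestamp: float  # seconds
    value: float      # cumulative count at that time


def simple_rate(samples: list[Sample], window_seconds: float, now: float) -> float:
    """Average per-second increase of the counter over the trailing window.

    Simplification: no counter-reset handling and no extrapolation.
    Returns 0.0 when fewer than two samples fall inside the window
    (an assumption made here to keep the sketch short).
    """
    in_window = sorted(
        (s for s in samples if now - window_seconds <= s.timestamp <= now),
        key=lambda s: s.timestamp,
    )
    if len(in_window) < 2:
        return 0.0
    first, last = in_window[0], in_window[-1]
    elapsed = last.timestamp - first.timestamp
    if elapsed <= 0:
        return 0.0
    return (last.value - first.value) / elapsed


def should_alert(
    samples: list[Sample],
    now: float,
    threshold_per_second: float = 10.0,
    window_seconds: float = 300.0,
) -> bool:
    """Mirror of `rate(critical_errors[5m]) > 10` under the simplifications above."""
    return simple_rate(samples, window_seconds, now) > threshold_per_second


if __name__ == "__main__":
    # Counter scraped every 15 s for 5 minutes, growing by 20 errors per second.
    busy = [Sample(float(t), 20.0 * t) for t in range(0, 301, 15)]
    print(should_alert(busy, now=300.0))
```

The point of the sketch is that the alert depends only on a number going up, not on the wording or format of any log line, which is why it survives refactors and language changes more easily.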

whatevermom a day ago

Sentry isn’t really a full on observability platform. It’s for error reporting only (that is annotated with traces and logs). It turns out that for most projects, this is sufficient. Can’t comment on the vendor lock-in part.

dboreham a day ago

You can run your own Sentry server (or at least you could the last time I worked with it). But as others have noted, Sentry is not going to provide the same functionality as OTel.