Comment by sali0 a day ago

Noob question: I'm currently adding telemetry to my backend.

At first I was implementing OTel throughout my API, but I ran into some minor headaches and a lot of boilerplate. I shopped around a bit and saw that Sentry has a lot of nice integrations everywhere and seems to have all the same features (metrics, traces, error reporting). I'm considering just using Sentry for the backend, the frontend, and other pieces as well.

Curious if anyone has thoughts on this. Assuming Sentry can fulfill our requirements, the only thing that really concerns me is vendor lock-in.

vanschelven 2 hours ago

I'd say "track errors first" [0] and focus on APM later (if at all). If you're worried about Sentry's lock-in, know that there are API-compatible drop-in replacements[1][2] though they are less feature-complete on the APM/observability side.

[0] https://www.bugsink.com/blog/track-errors-first/

[1] https://www.bugsink.com/

[2] https://glitchtip.com/

srikanthccv a day ago

>I was at first implementing otel throughout my api, but ran into some minor headaches and a lot of boilerplate

OTel also has numerous integrations (see https://opentelemetry.io/ecosystem/registry/). In contrast, Sentry lacks traditional metrics and other capabilities that OTel offers. IIRC, Sentry experimented with "DDM" (Delightful Developer Metrics), but the feature was deprecated and removed while still in alpha/beta.

Sentry excels at error tracking and provides excellent browser integration. This might be sufficient for your needs, but if you're looking for the comprehensive observability features that OpenTelemetry provides, you'd likely need a full observability platform.

vrosas a day ago

Think of OTel as just a standard data format for the logs/traces/metrics that your backend(s) emit, plus some open source libraries for dealing with that data. You can pipe it straight to an observability vendor that accepts these formats (pretty much everyone does: Datadog, Stackdriver, etc.) or you can simply write the data to a database and wire up your own dashboards on top of it (e.g. Grafana).

OTel can take a little while to understand because, like many standards, it's designed by committee and the code/documentation reflects that. LLMs can help, but the last time I asked them about OTel they constantly gave me code that was out of date with the latest OTel libraries.

stackskipton a day ago

Ops type here. OTel is great, but if your metrics are not there yet, please fix that first. In particular, consider just importing prometheus_client and going from there.

Prometheus is bog easy to run and Grafana understands it. Anything involving alerting/monitoring from logs is a bad idea for future you. I PROMISE YOU, PLEASE DON'T!
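
For anyone wondering what "import prometheus_client and go from there" might look like, here is a minimal sketch. It assumes the third-party prometheus_client package is installed. The metric name critical_errors, the handler label, and the handle_request function are made up for illustration and are not from the thread.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

# A private registry keeps this sketch isolated from the library's global default.
registry = CollectorRegistry()

# Hypothetical metric. The client exposes counters with a "_total" suffix,
# so this one is scraped as "critical_errors_total".
CRITICAL_ERRORS = Counter(
    "critical_errors",
    "Number of critical errors raised by request handlers",
    ["handler"],
    registry=registry,
)


def handle_request(handler: str, fail: bool = False) -> str:
    """Hypothetical handler that counts failures instead of only logging them."""
    try:
        if fail:
            raise RuntimeError("simulated failure")
        return "ok"
    except RuntimeError:
        CRITICAL_ERRORS.labels(handler=handler).inc()
        return "error"


def render_metrics() -> str:
    """Return the text exposition format that a Prometheus server scrapes."""
    return generate_latest(registry).decode("utf-8")
```

In a real service you would more likely call prometheus_client.start_http_server(port) once at startup and let Prometheus scrape it, rather than rendering the text yourself.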

sali0 10 hours ago

Thank you, this is where I'll likely start.
From the other comments, it seems it's still worth trying to integrate OTel as well. Appreciate everyone's insights.

avtar a day ago

> anything involving alerting/monitoring from logs is a bad idea for future you
Why is issuing alerts for log events a bad idea?

- _kblcuk_ 21 hours ago
  
  It’s trivial to alter or remove log lines without knowing or realizing that it affects some alerting or monitoring somewhere. That’s why there are dedicated monitoring and alerting systems to start with.
  
  sethammons 21 hours ago
  
  Same with metrics.
  If you need an artifact from your system, it should be tested. We test our logs and many types of metrics. We've had too many incidents where logs or metrics changed and no longer triggered alerts. I never got to build out my alert test bed that exercises all known alerts in prod to verify they continue to work.
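
A minimal sketch of what "testing a log line" could look like, using only the Python standard library. The logger name, the PAYMENT_GATEWAY_DOWN marker, and the idea that an alert rule matches on that marker are all assumptions for illustration.

```python
import io
import logging

logger = logging.getLogger("payments")  # hypothetical logger name

# Hypothetical marker that an alert rule is assumed to match on.
ALERT_MARKER = "PAYMENT_GATEWAY_DOWN"


def report_gateway_failure(order_id: str) -> None:
    """Emit the log line that the (assumed) alert rule depends on."""
    logger.error("%s order_id=%s", ALERT_MARKER, order_id)


def capture_logs(func, *args) -> str:
    """Run func and return everything it logged through `logger` as text."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    try:
        func(*args)
    finally:
        logger.removeHandler(handler)
    return buffer.getvalue()


def test_alert_log_line_is_stable() -> None:
    """Fails if someone renames the marker, changes the level, or drops the field."""
    output = capture_logs(report_gateway_failure, "o-123")
    assert output.startswith("ERROR " + ALERT_MARKER)
    assert "order_id=o-123" in output
```

The point is only that the alert's dependency on the log line becomes visible in the test suite, so editing the line breaks a test instead of silently breaking an alert.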
  
- stackskipton 17 hours ago
  
  Couple of reasons.
  The biggest one: the sample rate is much higher (every log line), and this can cause problems if a service goes haywire and starts spewing logs everywhere. Logging pipelines also tend to be very rigid for various reasons. Metrics are easier to handle, as you can step back the sample rate, drop certain metrics, or spin up additional Prometheus instances.
  The logging format also becomes very rigid, and if the company uses multiple languages this gets problematic because different languages can behave differently. Is this exception something we care about or not? So we throw more code at it in an attempt to get log-based alerting into a state that does not drive everyone crazy, whereas if we were just doing "rate(critical_errors[5m]) > 10" in Prometheus, we would be all set!
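
To make the rate(...) expression above concrete: in PromQL, rate() gives the per-second average increase of a counter over the window, so a threshold of 10 means ten errors per second averaged over five minutes. The sketch below is a simplified stand-in in plain Python, not Prometheus's actual algorithm (it skips the extrapolation to window boundaries that Prometheus performs). The function names and sample data are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def simple_rate(samples: List[Sample], now: float, window_seconds: float = 300.0) -> float:
    """Approximate per-second increase of a counter over the trailing window."""
    in_window = [s for s in samples if now - window_seconds <= s[0] <= now]
    if len(in_window) < 2:
        return 0.0  # not enough data points to compute a rate
    increase = 0.0
    for (_, previous), (_, current) in zip(in_window, in_window[1:]):
        # A drop means the counter restarted from zero, so the new value is all growth.
        increase += (current - previous) if current >= previous else current
    elapsed = in_window[-1][0] - in_window[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


def should_alert(samples: List[Sample], now: float, threshold_per_second: float = 10.0) -> bool:
    """Rough equivalent of the comparison rate(counter[5m]) > threshold."""
    return simple_rate(samples, now) > threshold_per_second
```

Note that the check only needs a handful of numeric samples, however many errors actually occurred, which is the sampling advantage described above.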
  
whatevermom a day ago

Sentry isn’t really a full-on observability platform. It’s for error reporting only (with the errors annotated with traces and logs). It turns out that for most projects, this is sufficient. Can’t comment on the vendor lock-in part.

dboreham a day ago

You can run your own Sentry server (or at least you could the last time I worked with it). But as others have noted, Sentry is not going to provide the same functionality as OTel.

mdaniel a day ago

The word "can" is doing a lot of work in your comment, given the now horrific number of moving parts[1], and I think David has even said the self-hosting story isn't a priority for them. Also, don't overlook the license if your shop is sensitive to non-FOSS licensing terms.

[1] https://github.com/getsentry/self-hosted/blob/25.5.1/docker-...