Comment by laxmansharma 3 days ago
Really like the direction: hot data in memory for dashboards, cold data in Parquet so you can use normal lake/OLAP tools and avoid lock-in. The streaming rollups + open formats story makes sense for cost and flexibility.
A few focused questions to understand how it behaves in the real world:
Core design & reliability
What protects recent/hot in-memory data if a node dies? Is there a write-ahead log (on disk or an external log like Kafka) or replication, or do we accept some loss between snapshots? (A rough sketch of the write path I have in mind is below this list.)
How do sharding and failover work? If a shard is down, can reads fan out to replicas, and how are writes handled?
When memory gets tight, what's the backpressure plan: slow down senders, drop data, or something smarter?
How are late or out-of-order samples handled after a rollup/export? Can Parquet be backfilled/compacted to fix history, or is that planned?
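To make the durability question concrete, here's the rough write path I have in mind, purely as a hedged sketch (the `Ingester`/`Sample` names and the fsync-per-write are my assumptions, not Okapi's design): log first, then apply to memory, so a crash loses at most the unsynced tail of the log.

```go
package ingest

import (
	"encoding/binary"
	"math"
	"os"
	"sync"
)

// Sample is a hypothetical ingest record; Okapi's real layout will differ.
type Sample struct {
	SeriesID  uint64
	Timestamp int64
	Value     float64
}

// Ingester appends each sample to a write-ahead log before applying it to the
// in-memory store, so a crash loses at most the un-synced tail of the log.
type Ingester struct {
	mu  sync.Mutex
	wal *os.File
	hot map[uint64][]Sample // stand-in for the real hot store
}

func NewIngester(walPath string) (*Ingester, error) {
	f, err := os.OpenFile(walPath, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o644)
	if err != nil {
		return nil, err
	}
	return &Ingester{wal: f, hot: make(map[uint64][]Sample)}, nil
}

// Append durably logs the sample, then applies it to the hot store. A real
// implementation would batch writes and group-commit fsyncs to keep ingest fast.
func (in *Ingester) Append(s Sample) error {
	in.mu.Lock()
	defer in.mu.Unlock()

	var buf [24]byte
	binary.LittleEndian.PutUint64(buf[0:], s.SeriesID)
	binary.LittleEndian.PutUint64(buf[8:], uint64(s.Timestamp))
	binary.LittleEndian.PutUint64(buf[16:], math.Float64bits(s.Value))
	if _, err := in.wal.Write(buf[:]); err != nil {
		return err
	}
	if err := in.wal.Sync(); err != nil { // durability point: ack the sender only after this
		return err
	}

	in.hot[s.SeriesID] = append(in.hot[s.SeriesID], s)
	return nil
}
```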
Queries
Are there plans for a richer data model in the hot in-memory store, with distinct metric types such as gauges, counters, etc.?
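For context, by "metric types" I mean something along these lines (purely illustrative, not a proposal for Okapi's API), since rollup and rate semantics differ between a monotonic counter and a point-in-time gauge:

```go
package metrics

// Illustrative only: one way a hot store could tag series with a metric type,
// so rollups know whether to compute a rate (counters) or an average (gauges).
type MetricType int

const (
	Gauge     MetricType = iota // point-in-time value, e.g. memory in use
	Counter                     // cumulative, monotonically increasing, e.g. requests served
	Histogram                   // bucketed distribution, e.g. request latency
)

type SeriesMeta struct {
	Name   string
	Labels map[string]string
	Type   MetricType
}
```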
Performance & sizing
The sub-ms reads are great. Is there a Linux version of the performance reports, so it's easier to compare with other products?
Along with the throughput/latency numbers I found on GitHub, could you share the memory footprint, CPU overhead, GC behavior, etc. for the benchmarks?
What's the rough recommended RAM/CPU sizing for different ingest rates, e.g. in bytes per sample or per unit of traffic?
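This is the kind of back-of-envelope estimate I'd like to be able to make from published numbers; every figure below is made up just to show the shape of the calculation:

```go
package main

import "fmt"

// Back-of-envelope RAM sizing; every number here is an assumption for
// illustration, not a measured Okapi figure.
func main() {
	const (
		samplesPerSec     = 200_000  // ingest rate
		bytesPerSample    = 16.0     // assumed in-memory cost per sample
		hotRetentionSec   = 2 * 3600 // keep 2h of hot data in RAM
		activeSeries      = 500_000  // distinct series
		perSeriesOverhead = 4096.0   // labels, index entries, etc. (assumed)
	)

	hotData := samplesPerSec * bytesPerSample * hotRetentionSec
	seriesOverhead := activeSeries * perSeriesOverhead

	fmt.Printf("hot samples:      %.1f GiB\n", hotData/(1<<30))
	fmt.Printf("series overhead:  %.1f GiB\n", seriesOverhead/(1<<30))
	fmt.Printf("rough RAM budget: %.1f GiB (plus headroom for GC and snapshots)\n",
		(hotData+seriesOverhead)/(1<<30))
}
```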
Lake/Parquet details
Considering most people already run something like Prometheus at this point, will Okapi offer an easier migration path?
Will Okapi be able to serve a single query seamlessly across hot (memory) and cold (Parquet) data, or should older ranges be pushed entirely to the lake side and analyzed through OLAP systems?
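On the "single query across hot + cold" point, this is the behavior I'm picturing, sketched with hypothetical interfaces (none of this is Okapi's API): split the time range at the export watermark, answer the older part from Parquet and the newer part from memory, then merge.

```go
package query

// Conceptual sketch of serving one query across hot (memory) and cold (Parquet)
// storage; the Store interface and watermark-based split are assumptions.
type Point struct {
	Timestamp int64
	Value     float64
}

type Store interface {
	Query(series string, start, end int64) ([]Point, error)
}

// QuerySplit sends the portion of the range older than the export watermark to
// the Parquet-backed store, the rest to the in-memory store, and concatenates
// the results in time order.
func QuerySplit(hot, cold Store, watermark int64, series string, start, end int64) ([]Point, error) {
	var out []Point
	if start < watermark {
		pts, err := cold.Query(series, start, min(end, watermark))
		if err != nil {
			return nil, err
		}
		out = append(out, pts...)
	}
	if end > watermark {
		pts, err := hot.Query(series, max(start, watermark), end)
		if err != nil {
			return nil, err
		}
		out = append(out, pts...)
	}
	return out, nil
}
```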
Ops & security
Snapshots can slow ingest—are those pauses tunable and bounded? Any metrics/alerts for export lag, memory pressure, or cardinality spikes?
A couple of end-to-end examples (for queries) and a Helm chart/Terraform module would make trials much easier.
Is there any built-in monitoring/observability for Okapi itself, or plans to add it?
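By self-observability I mean roughly this kind of thing, sketched with the standard Prometheus Go client; the okapi_* metric names are ones I invented for illustration, not existing metrics:

```go
package selfmetrics

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Hypothetical self-metrics I'd want Okapi to expose, so operators can alert
// on export lag, memory pressure, and cardinality spikes.
var (
	exportLagSeconds = promauto.NewGauge(prometheus.GaugeOpts{
		Name: "okapi_parquet_export_lag_seconds",
		Help: "Age of the oldest in-memory data not yet exported to Parquet.",
	})
	memoryPressureRatio = promauto.NewGauge(prometheus.GaugeOpts{
		Name: "okapi_hot_store_memory_used_ratio",
		Help: "Fraction of the configured hot-store memory budget in use.",
	})
	activeSeries = promauto.NewGauge(prometheus.GaugeOpts{
		Name: "okapi_active_series",
		Help: "Number of distinct series currently tracked (cardinality).",
	})
)

// Serve exposes the metrics endpoint for scraping.
func Serve(addr string) error {
	http.Handle("/metrics", promhttp.Handler())
	return http.ListenAndServe(addr, nil)
}
```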
Overall: promising approach with a strong cost/flexibility angle. If you share Linux+concurrency benchmarks, ingest compatibility, and lake table format plans (Iceberg/Delta), I think a lot of folks here will try it out.
I realize some of the questions I raised may not be fully addressed yet given how early the project is. My goal isn’t to nitpick but to understand the direction the Okapi core team is planning to take and how they’re thinking about these areas over time. Really appreciate the work so far and looking forward to seeing how it evolves.