Comment by vlovich123 a day ago
I’m pretty sure the reasoning and conclusion are way off in explaining the speed-up:
> The network is better utilized because successive queries can be grouped in the same network packets, resulting in less packets overall.
> the network packets are like 50 seater buses that ride with only one passenger.
The performance improvement is unlikely to come from sending larger packets: most queries transfer very little data, and the benchmark the conclusion is drawn from transfers almost no data at all. The speedup comes from no longer waiting on a round-trip ack of one batch before executing subsequent queries; the number of network packets is irrelevant.
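To put rough numbers on it, here’s a back-of-the-envelope latency model (a toy sketch; RTT_MS, SERVER_MS, and N_QUERIES are made-up assumptions, not measurements from the benchmark):

```python
# Why pipelining helps: without it, each query (or batch) pays a full
# round trip before the next one is even sent.
# All numbers below are illustrative assumptions, not benchmark results.

RTT_MS = 1.0       # assumed client<->server round-trip time
SERVER_MS = 0.05   # assumed per-query execution time on the server
N_QUERIES = 1000

# Sequential: send a query, wait for the ack/result, send the next one.
sequential_ms = N_QUERIES * (RTT_MS + SERVER_MS)

# Pipelined: keep sending without waiting; total time is roughly one
# round trip plus the server's execution time for the whole stream.
pipelined_ms = RTT_MS + N_QUERIES * SERVER_MS

print(f"sequential: {sequential_ms:.0f} ms")
print(f"pipelined:  {pipelined_ms:.0f} ms")
print(f"speedup:    {sequential_ms / pipelined_ms:.1f}x")
```

With those assumed numbers the pipelined version comes out ~20x faster, and packet size never enters into it.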
I’m not sure that’s it either. PostgreSQL has a feature (I don’t remember what it’s called) where multiple readers can share a sequential table scan.
Suppose client A runs “select * from foo” against a table with a thousand rows. The server starts streaming results to A from row 1. Now suppose it’s on row 500 when client B runs the same query. Instead of starting over for B, it can begin streaming results to B at row 501, and from then on each row it reads goes to both clients.
When it finishes with row 1000, client A’s query is done. The scan then wraps back around to row 1 for B and continues through row 500.
Hypothetically, you can serve N clients with a total of 2 table scans if they all arrive before the first client’s scan is finished.
So that’s the kind of magic where I think this is going to shine. Queue up a few queries and it’s likely that several will be able to share the same underlying work.
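To sanity-check the “2 table scans” ceiling, here’s a toy simulation of that wrap-around behavior (purely illustrative: `shared_scan` and the join positions are made up, and PostgreSQL’s real mechanism synchronizes scan start positions at the page/buffer level rather than fanning rows out to clients):

```python
# Toy simulation of the shared-scan idea described above: one cursor
# walks the table; clients that join mid-scan receive rows from the
# current position, then get the rows they missed on a wrap-around pass.

TABLE_ROWS = 1000

def shared_scan(arrivals):
    """arrivals: dict of client -> row index at which the client joins.
    Returns total rows read and rows delivered per client."""
    rows_read = 0
    delivered = {c: 0 for c in arrivals}

    # First pass: stream rows 1..N, fanning each row out to every
    # client that has already joined.
    for row in range(1, TABLE_ROWS + 1):
        rows_read += 1
        for client, joined_at in arrivals.items():
            if row >= joined_at:
                delivered[client] += 1

    # Wrap-around pass: re-read only the prefix that the latest joiner
    # missed, serving every client that still needs those rows.
    latest = max(arrivals.values())
    for row in range(1, latest):
        rows_read += 1
        for client, joined_at in arrivals.items():
            if row < joined_at:
                delivered[client] += 1

    return rows_read, delivered

# Client A starts the scan; B and C join while it is in flight.
reads, got = shared_scan({"A": 1, "B": 501, "C": 750})
print(f"rows read: {reads} (vs {3 * TABLE_ROWS} with independent scans)")
print(got)  # every client receives all 1000 rows
```

With three clients joining mid-scan, the simulation reads 1,749 rows instead of 3,000, which stays under the two-full-scans ceiling described above.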