Comment by scottcha

Comment by scottcha 6 months ago

There are a large number of reasons the AI datacenters are geographically distributed--just to list a few off the top of my head which come up as top drivers: latency, data sovereignty, resilience, grid capacity, renewable energy availability.

Karrot_Kream 6 months ago

Why does latency matter for a model that responds in 10s of seconds? Latency to a datacenter is measured in 10s or 100s of milliseconds, which is 3-4 orders of magnitude less.

Reply View 2 replies

scottcha 6 months ago

Two reasons that I understand 1. not all these AIs are LLMs and many have much lower latency SLAs than chat and 2. These are just one part of a service architecture and when you have multiple latencies across the stack they tend to have multiplicative effects.

Reply View | 0 replies
recursivecaveat 6 months ago

If you look at a model with a diverse competitive provider set like llama 3 the latency is 1/4 second, and it will definitely improve at a minimum incrementally if quality is held constant: https://artificialanalysis.ai/models/llama-3-3-instruct-70b/... Remember that as long as you experience the response linearly (very much the case for audio output for eg) then the first-chunk latency is your actual latency, not to stream the entire response.

Reply View | 0 replies