eadz 7 days ago

It would be great if there was an easy way to run their open model (https://huggingface.co/zed-industries/zeta) locally ( for latency reasons ).

I don't think Zeta is quite up to windsurf's completion quality/speed.

I get that this would go against their business model, but maybe people would pay for this - it could in theory be the fastest completion since it would run locally.

  • rfoo 6 days ago

    > the fastest completion since it would run locally

    We are living in a strange age that local is slower than the cloud. Due to the sheer amount of compute we need to do. Compute takes hundreds of milliseconds (if not seconds) on local hardware, making 100ms of network latency irrelevant.

    Even for a 7B model your expensive Mac or 4090 can't beat, for example, a box with 8x A100s running FOSS serving stack (sglang) with TP=8, in latency.

  • xmorse 7 days ago

    Running models locally is very expensive in terms of memory and scheduling requirements, maybe instead they should host their model on the Cloudflare AI network which is distributed all around the world and can have lower latency