Comment by raw_anon_1111 2 days ago

For the most part, I don’t do chatbots, except for a couple of RAG-based ones. It’s more behind-the-scenes stuff like image understanding, categorization, nuanced sentiment analysis, semantic alignment, etc.

I’ve created a framework that lets me test quality in an automated way across prompt changes and models, and I compare cost, speed, and quality.

Out of all of those, the only thing that requires humans to judge the quality is RAG results.
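
For anyone wondering what that kind of comparison might look like, here is a minimal sketch for the categorization case, not the actual framework. Everything in it is an assumption made for illustration: the `Candidate` and `Result` types, the `evaluate` function, the two fake model functions standing in for real API calls, the per-call costs, and the test cases.

```python
import time
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Candidate:
    """One model + prompt combination to evaluate (hypothetical structure)."""
    name: str
    prompt: str
    call: Callable[[str, str], str]  # (prompt, input text) -> predicted label
    cost_per_call: float             # assumed flat cost per request, in dollars


@dataclass
class Result:
    name: str
    accuracy: float
    avg_latency_s: float
    total_cost: float


def evaluate(candidate: Candidate, cases: Sequence[Tuple[str, str]]) -> Result:
    """Run every labeled case through the candidate and score it."""
    if not cases:
        raise ValueError("need at least one test case")
    correct = 0
    elapsed = 0.0
    for text, expected in cases:
        start = time.perf_counter()
        predicted = candidate.call(candidate.prompt, text)
        elapsed += time.perf_counter() - start
        # Exact-match scoring; only suitable for tasks with a single right label.
        correct += int(predicted.strip().lower() == expected.lower())
    n = len(cases)
    return Result(candidate.name, correct / n, elapsed / n, candidate.cost_per_call * n)


# Stand-ins for real model calls, so the sketch runs offline.
def fake_keyword_model(prompt: str, text: str) -> str:
    return "billing" if "invoice" in text.lower() else "support"


def fake_constant_model(prompt: str, text: str) -> str:
    return "support"


# Made-up labeled cases; a real run would use the customer's own test data.
CASES = [
    ("Where is my invoice?", "billing"),
    ("My app crashes on launch", "support"),
    ("The invoice total is wrong", "billing"),
    ("I need to reset my password", "support"),
]

candidates = [
    Candidate("constant-model", "Classify the request.", fake_constant_model, 0.001),
    Candidate("keyword-model", "Classify the request.", fake_keyword_model, 0.002),
]

# Rank by accuracy; latency and cost are reported alongside for the trade-off.
results = sorted((evaluate(c, CASES) for c in candidates),
                 key=lambda r: r.accuracy, reverse=True)

for r in results:
    print(f"{r.name}: accuracy={r.accuracy:.2f} "
          f"latency={r.avg_latency_s:.6f}s cost=${r.total_cost:.3f}")
```

Exact-match scoring like this only works where there is one correct label, which fits the point above that RAG answers still need a human to judge them.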

biophysboy 2 days ago

So who is the winner using the framework you created?

  • raw_anon_1111 2 days ago

    It depends. Amazon’s Nova Lite gave me the best speed-versus-quality trade-off when I needed really quick real-time inference for categorizing a user’s input (think call centers).

    One of Anthropic’s models did the best with image understanding, with Amazon’s Nova Pro slightly behind.

    For my tests, I used a customer’s specific set of test data.

    For RAG, I’ve forgotten which model won, but that is much more subjective anyway. I just gave the customer the ability to configure the model and modify the prompt so they could choose.

    • biophysboy 2 days ago

      Your experience matches mine then... I haven't noticed any clear, consistent differences. I'm always looking for second opinions on this (bc I've gotten fairly cynical). Appreciate it