Comment by raw_anon_1111 2 days ago

For the most part, I don’t do chatbots except for a couple of RAG-based chatbots. It’s more behind-the-scenes stuff like image understanding, categorization, nuanced sentiment analysis, semantic alignment, etc.

I’ve created a framework that lets me test quality in an automated way across prompt changes and models, and I compare cost, speed, and quality.

Out of all those, the only thing that requires humans to judge the quality is the RAG results.
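
A minimal sketch of what such a comparison harness could look like for a categorization task. This is not the commenter's actual framework: the `Candidate` and `Result` types, the stub "models", the per-call costs, and the test cases are all hypothetical stand-ins. A real setup would replace the stubs with calls to a model provider and load the customer's own test data.

```python
import time
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Candidate:
    """One (model, prompt) combination to evaluate. All fields are illustrative."""
    name: str
    prompt: str
    call: Callable[[str, str], str]  # (prompt, input_text) -> predicted label
    cost_per_call: float             # assumed flat cost; real pricing is per token


@dataclass
class Result:
    name: str
    accuracy: float
    avg_latency_s: float
    total_cost: float


def evaluate(candidate: Candidate, cases: Sequence[Tuple[str, str]]) -> Result:
    """Run every labeled case through the candidate and score exact-match accuracy."""
    if not cases:
        raise ValueError("need at least one test case")
    correct = 0
    start = time.perf_counter()
    for text, expected in cases:
        predicted = candidate.call(candidate.prompt, text)
        if predicted.strip().lower() == expected.lower():
            correct += 1
    elapsed = time.perf_counter() - start
    n = len(cases)
    return Result(candidate.name, correct / n, elapsed / n, candidate.cost_per_call * n)


# Hypothetical stand-ins for real model calls.
def keyword_stub(prompt: str, text: str) -> str:
    lowered = text.lower()
    return "billing" if "invoice" in lowered or "charge" in lowered else "support"


def constant_stub(prompt: str, text: str) -> str:
    return "support"


# Hypothetical labeled test data: (input, expected category).
CASES = [
    ("I was charged twice", "billing"),
    ("My invoice is wrong", "billing"),
    ("The app crashes on login", "support"),
    ("How do I reset my password?", "support"),
]


def run_comparison() -> list:
    candidates = [
        Candidate("constant-stub", "Classify the request.", constant_stub, 0.0005),
        Candidate("keyword-stub", "Classify the request.", keyword_stub, 0.001),
    ]
    results = [evaluate(c, CASES) for c in candidates]
    # Rank by accuracy first, then by lower cost as a tie-breaker.
    return sorted(results, key=lambda r: (-r.accuracy, r.total_cost))


if __name__ == "__main__":
    for r in run_comparison():
        print(f"{r.name}: acc={r.accuracy:.2f} "
              f"latency={r.avg_latency_s * 1000:.3f}ms cost=${r.total_cost:.4f}")
```

Exact-match scoring only works for tasks with a fixed label set, such as categorization. Free-form output like RAG answers has no single expected string, which is why it still needs human judgment.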

biophysboy 2 days ago

So who is the winner using the framework you created?

raw_anon_1111 2 days ago

It depends. Amazon’s Nova Lite gave me the best speed vs. performance tradeoff when I needed really quick real-time inference for categorizing a user’s input (think call centers).
One of Anthropic’s models did the best with image understanding, with Amazon’s Nova Pro slightly behind.
For my tests, I used a customer’s specific set of test data.
For RAG, I forget which one did best, but that’s much more subjective. I just gave the customer the ability to configure the model and modify the prompt so they could choose.

- biophysboy 2 days ago
  
  Your experience matches mine then... I haven't noticed any clear, consistent differences. I'm always looking for second opinions on this (bc I've gotten fairly cynical). Appreciate it