Comment by tmpz22

Comment by tmpz22 8 months ago

Prompt engineering is just trying that task on a variety of models and prompt variations until you can better understand the syntax needed to get the desired outcome, if the desired outcome can be gotten.

Honestly you’re trying to prove AI is ineffective by telling us it didn’t work with your ineffective protocol. That is not a strong argument.

only-one1701 8 months ago

What should I have done there? Tell it to make sure that it gives me back all 10 objects I give it? Tell it not to put brackets in the wrong place? This is a real question: what would you have done?

tmpz22 8 months ago

In no particular order:
* experiment with multiple models, preferably free high-quality models like Gemini 2.5. Make sure you're using the right model, usually NOT one of the "mini" varieties even if it's marketed for coding.
* experiment with different ways of delivering necessary context. I use repomix to compile a codebase to a text file and upload that file. I've found that more integrated tooling like cursor, aider, or copilot is less effective than dumping a text file into the prompt
* use multi-step workflows like the one described [1] to allow the llm to ask you questions to better understand the task
* similarly use a back-and-forth one-question-at-a-time conversation to have the llm draft the prompt for you
* for this prompt I would focus less on specifying 10 results and more on uploading all necessary modules (like with repomix) and then verifying all 10 were completed. Sometimes overspecifying results can corrupt the answer.
[1]: https://harper.blog/2025/02/16/my-llm-codegen-workflow-atm/

I'm a pretty vocal AI-hater, partly because I use it day to day and am more familiar with its shortfalls - and I hate the naive zealotry so many pro-AI people bring to AI discussions. BUTTT we can also be a bit more scientific in our assessments before discarding LLMs - or else we become just like those naive pro-AI-everything zealots.
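
As a rough sketch of the "verifying all 10 were completed" step from the list above: the thread never says what the 10 objects look like, so this assumes they are JSON objects with a unique `id` field and that the model was asked to reply with a JSON array. The function name `check_round_trip`, the `key` parameter, and the sample data are all hypothetical, not anything from the original task.

```python
import json


def check_round_trip(sent, raw_reply, key="id"):
    """Compare the objects sent to the model with the JSON it returned.

    Returns a list of human-readable problems. An empty list means the
    reply parsed cleanly and every object came back exactly once.
    Assumes each sent object is a dict with a unique value under `key`.
    """
    try:
        returned = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        # Misplaced or missing brackets usually surface here.
        return [f"reply is not valid JSON: {exc.msg} (line {exc.lineno})"]

    if not isinstance(returned, list):
        return ["reply is not a JSON array"]

    problems = []
    non_objects = [item for item in returned if not isinstance(item, dict)]
    if non_objects:
        problems.append(f"{len(non_objects)} item(s) are not JSON objects")

    sent_keys = [obj[key] for obj in sent]
    returned_keys = [obj.get(key) for obj in returned if isinstance(obj, dict)]

    missing = [k for k in sent_keys if k not in returned_keys]
    unexpected = [k for k in returned_keys if k not in sent_keys]
    duplicated = [
        k for k in dict.fromkeys(returned_keys) if returned_keys.count(k) > 1
    ]

    if missing:
        problems.append(f"missing {key} values: {missing}")
    if unexpected:
        problems.append(f"unexpected {key} values: {unexpected}")
    if duplicated:
        problems.append(f"duplicated {key} values: {duplicated}")
    return problems


# Stand-in for a model reply that silently dropped the last object.
sent = [{"id": i, "name": f"item-{i}"} for i in range(10)]
reply = json.dumps(sent[:9])
print(check_round_trip(sent, reply))
```

A check like this only confirms that the right objects came back in a parseable shape. It says nothing about whether their contents are correct, so it complements reading the output rather than replacing it.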

- bboozzoo 8 months ago
  
  With that many ways to try things out differently hoping for good results, it feels like this would become a huge time sink, wouldn't it?
  
simonw 8 months ago

How long ago was this? I'd be surprised to see Claude 3.7 Sonnet make a mistake of this nature.
Either way, when a model starts making dumb mistakes like that these days I start a fresh conversation (to blow away all of the bad tokens in the current one), either with that model or another one.
I often switch from Claude 3.7 Sonnet to o3 or o4-mini these days. I paste in the most recent "good" version of the thing we're working on and prompt from there.

- th0ma5 8 months ago
  
  Lol, "it didn't do it... and if it did it didn't mean it... and if it meant it it surely can't mean it now." This is unserious.
  
  simonw 8 months ago
  
  A full two thirds of the comment you replied to there were me saying "when these things start to make dumb mistakes here are the steps I take to fix the problem".
  
  th0ma5 8 months ago
  
  Not actual knowledge but adages! Lol "here is my magic potion that I can't tell you how it differs from the other magic potion!" That is not fixing anything. That's trying to string people along and insert yourself, otherwise you'd be able to point to some kind of empirical, documented experience as to why one vendor is better than the other or why older models have the problem but newer ones don't, and more importantly assurances that future models or even future unannounced changes to the models you mention will even work the same as they do today. What knowledge are you actually imparting other than "I think this is a road you could try"? Thanks.
  
  gotimo 8 months ago
  
  this is the rhetoric that you will see in reply to effectively any negative experience with LLMs in programming.
  
  th0ma5 8 months ago
  
  Yeah I'm starting to think they aren't aware of the shell game in their own rhetoric. If nothing can ever be wrong then nothing is right either in that worldview.
  
pdimitar 8 months ago

You should have dropped the LLM, of course. They are not replacing us programmers anytime soon. If they can be used as an enabler / booster, cool; if not, back to business as usual. You can only win here. You can't lose.
