Comment by verdverm
The core issue is likely not the LLM itself. Given good grounding context, clear instructions, and purposeful agents, a DAG of such agents should not produce results that are this consistently wrong.
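To make that concrete, here is a minimal sketch of what a small DAG of purposeful, grounded agents could look like; the three roles and the `ask_llm` helper are hypothetical stand-ins, not anything from the original story:

```python
def ask_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. an HTTP request to a model API)."""
    return f"[model output for: {prompt[:40]}...]"

def retrieve_context(question: str) -> str:
    """Grounding step: pull relevant documents instead of relying on model recall."""
    return f"[retrieved documents relevant to: {question}]"

def draft_answer(question: str, context: str) -> str:
    """Drafting agent: answer strictly from the supplied context."""
    return ask_llm(
        "Using only the context below, answer the question.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

def verify_answer(question: str, context: str, draft: str) -> str:
    """Verification agent: check the draft against the same grounding context."""
    return ask_llm(
        "Check this draft against the context and correct any unsupported claims.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nDraft: {draft}"
    )

def answer(question: str) -> str:
    # Simple DAG: retrieve -> draft -> verify, each node with one narrow purpose.
    context = retrieve_context(question)
    draft = draft_answer(question, context)
    return verify_answer(question, context, draft)

print(answer("What changed in release 2.3?"))
```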
There are a lot of devils in the details here, and few of them show up in the story.