Comment by zdc1

Comment by zdc1 2 days ago

And for better or worse it feels like the errors are being "pushed down" into smaller, more subtle spaces.

I asked ChatGPT a question about a made-up character in a made-up work and it came back with "I don’t actually have a reliable answer for that". Perfect.

On the other hand, I can ask it about varnishing a piece of wood and it will give a lovely table with options, tradeoffs, and Good/Ok/Bad ratings for each option, except the ratings can be a little off the mark. Same thing when asking what cable thickness is required to carry 15A in AU electrical work: depending on the journey and line of questioning, you would get either 2.5 mm² or 4 mm².

Not wrong enough to kill someone, but wrong enough that you're forced to use it as a research tool rather than a trusted expert/guru.

StephenMelon 2 days ago

I asked ChatGPT, Gemini, Grok and DeepSeek to tell me about a contemporary Scottish indie band that hasn’t had a lot of press coverage. ChatGPT, Gemini and Grok all gave good answers based on the small amount of press coverage they have had.

DeepSeek however hallucinated a completely fictional band from 30 years ago, right down to album names, a hard luck story about how they’d been shafted by the industry (and by whom), made up names of the members and even their supposed subsequent collaborations with contemporary pop artists.

I asked if it was telling the truth or making it up and it doubled down quite aggressively on claiming it was telling the truth. The whole thing was very detailed and convincing yet complete and utter bollocks.

I understand the difference in cost, parameters, etc., but it was miles behind the other 3. In fact, it wasn’t just behind; it was hurtling in the opposite direction, while being incredibly plausible.

anonymous908213 2 days ago

This is by no means unique to DeepSeek, and that it happened specifically with DeepSeek seems to be luck of the draw for you (in this case it's entirely possible the band's limited press coverage was not in DeepSeek's training data). You can easily run into it when trying to use ChatGPT as a Google search too. A couple of weeks ago I posed the question "Do any esoteric programming languages with X and Y traits exist?" and it generated three fictional languages while asserting they were real. Further prompting led it to generate great detail about their various features and tradeoffs, as well as making up the people responsible for creating the languages and other aspects of their fictional history.
