MarkusQ 8 hours ago

Could be. Someone hallucinated the arXiv reference for the Apple paper.

mfro 10 hours ago

> These findings highlight the importance of careful experimental design when evaluating AI reasoning capabilities.

I would like to carefully design my response to this article with a downvote.