Comment by raincole 5 days ago

17 replies

I know many people have negative opinions about this.

I'd also like to share what I saw. Since GPT-4o became a thing, everyone I know who submits academic papers in my non-English-speaking country (N > 5) has been writing papers in our native language and translating them with GPT-4o exclusively. It has been the norm for quite a while. If hallucination is such a serious problem, it has been one for a year and a half.

direwolf20 5 days ago

Translation is something Large Language Models are inherently pretty good at, without controversy, even though the output still should be independently verified. It's a language task and they are language models.

  • kccqzy 5 days ago

    Of course. Transformers were originally invented for Google Translate.

  • biophysboy 5 days ago

    Are they good at translating scientific jargon specific to a niche within a field? I have no doubt LLMs are excellent at translating well-trodden patterns; I'm a bit suspicious otherwise.

    • andy12_ 4 days ago

      In my experience of using it to translate ML work between English->Spanish|Galician, it seems to literally translate jargon too eagerly, to the point that I have to tell it to maintain specific terms in English to avoid it sounding too weird (for most modern ML jargon there really isn't a Spanish translation).

    • mbreese 5 days ago

      It seems to me that jargon would tend to be defined in one language and minimally adapted in other languages. So I'm not sure that would be much of a concern.

      • fuzzfactor 4 days ago

        I would look at non-English research papers along with the English ones in my field, and the more jargon, plain numbers, and equations there were, the more I could get out of them without much further translation.

    • disconcision 5 days ago

      for better or for worse, most specific scientific jargon is already going to be in English

ivirshup 5 days ago

I've heard that now that AI conferences are starting to check for hallucinated references, rejection rates are going up significantly. See also the NeurIPS hallucinated references kerfuffle [1].

[1]: https://statmodeling.stat.columbia.edu/2026/01/26/machine-le...

  • doodlesdev 5 days ago

    Honestly, hallucinated references should simply get the submitter banned from ever applying again. Anyone submitting papers or anything with hallucinated references shall be publicly shamed. The problem isn't only the LLMs hallucinating, it's lazy and immoral humans who don't bother to check the output either, wasting everyone's time and corroding public trust in science and research.

    • lionkor 4 days ago

      I fully agree. Not reading your own references should be grounds for banning, but that's impossible to check. Hallucinated references cannot be read, so by definition, they should get people banned.

      • fuzzfactor 4 days ago

        >Not reading your own references

        This could be considered in degrees.

        Like when you only need a single table from another researcher's 25-page publication, you would cite it to be thorough but it wouldn't be so bad if you didn't even read very much of their other text. Perhaps not any at all.

        Maybe one of the very helpful things is not just reading every reference in detail, but actually looking up every one in detail to begin with?

  • SilverBirch 4 days ago

    Yeah that's not going to work for long. You can draw a line in 2023 and say "Every paper before this isn't AI". But in the future, you're going to have AI-generated papers citing other AI slop papers that slipped through the cracks. Because of the cost of doing research vs the cost of generating AI slop, the AI slop papers will start to outcompete the real research papers.

    • BlueTemplar 4 days ago

      How is this different from flat earth / creationist papers citing other flat earth / creationist papers?

    • fuzzfactor 4 days ago

      >the cost of doing research vs the cost of generating

      >slop papers will start to outcompete the real research papers.

      This started to rear its ugly head when electric typewriters got more affordable.

      Sometimes all it takes is faster horses and you're off to the races :\

utopiah 5 days ago

It's quite a safe case if you maintain provenance because there is a ground truth to compare to, namely the untranslated paper.