Comment by zimpenfish a day ago

> Nothing?

It breaks. Which is weird, because you can create a string which isn't valid UTF-8 (e.g. "\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98") and print it out with no trouble; you just can't pass it to e.g. `os.Create` or `os.Open`.

(Bash and a variety of other utils will also complain about it not being valid UTF-8; neovim won't save a file under that name; etc.)

yencabulator a day ago

That sounds like your kernel refusing to create that file, nothing to do with Go.

  $ cat main.go
  package main

  import (
   "log"
   "os"
  )

  func main() {
   f, err := os.Create("\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98")
   if err != nil {
    log.Fatalf("create: %v", err)
   }
   _ = f
  }
  $ go run .
  $ ls -1
  ''$'\275\262''='$'\274'' ⌘'
  go.mod
  main.go
  • kragen 6 hours ago

    I've posted a longer explanation in https://news.ycombinator.com/item?id=44991638. I'm interested to hear which kernel and which filesystem zimpenfish is using that has this problem.

    • yencabulator 6 hours ago

      I believe macOS forces UTF-8 filenames and normalizes them to something near-but-not-quite Unicode NFD.

      Windows doing something similar wouldn't surprise me at all. I believe NTFS internally stores filenames as UTF-16, so enforcing UTF-8 at the API boundary sounds likely.

      • kragen 5 hours ago

        That sounds right. Fortunately, it's not my problem that they're using a buggy piece of shit for an OS.

  • commandersaki a day ago

    I'm confused. Is Go restricted to UTF-8-only filenames? It can read and write arbitrary byte sequences (which is what a string can hold), and that should be sufficient for dealing with other encodings.

    • yencabulator a day ago

      Go is not restricted: strings are UTF-8 only by convention, and can hold arbitrary bytes.
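
A minimal sketch of what that looks like in practice, using the example string from upthread. The helper name `describe` is made up for illustration; `utf8.ValidString` and `utf8.RuneError` are from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe (hypothetical helper) reports the byte length of s, whether it
// is valid UTF-8, and how many runes a range loop yields as U+FFFD.
// Note that a genuinely encoded U+FFFD would be counted too.
func describe(s string) (byteLen int, valid bool, replaced int) {
	for _, r := range s {
		if r == utf8.RuneError {
			replaced++
		}
	}
	return len(s), utf8.ValidString(s), replaced
}

func main() {
	// The string stores these bytes as-is; nothing validates them.
	name := "\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98"
	n, ok, bad := describe(name)
	fmt.Printf("bytes=%d valid=%t replaced=%d quoted=%q\n", n, ok, bad, name)
}
```

For that string this reports 8 bytes, not valid UTF-8, and 3 bytes decoded as U+FFFD by the range loop. Decoding is the only place the replacement shows up; the bytes in the string itself are untouched.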

      • commandersaki 21 hours ago

        Then I'm having a hard time understanding the issue in the post; it seems pretty vague. Is there any idea what specific issue is happening? Is it how they've used Go, or does Go have an inherent implementation issue? Specifically, these lines:

        > If you stuff random binary data into a string, Go just steams along, as described in this post.

        > Over the decades I have lost data to tools skipping non-UTF-8 filenames. I should not be blamed for having files that were named before UTF-8 existed.

  • zimpenfish a day ago

    > That sounds like your kernel refusing to create that file

    Yes, that was my assumption when bash et al also had problems with it.

kragen a day ago

It sounds like you found a bug in your filesystem, not in Golang's API, because you totally can pass that string to those functions and open the file successfully.