Comment by nh2 2 days ago

Why not a simple solution:

1. Programs should call close() on stdout and report errors (see the sketch after this list).

2. It's the job of whoever creates the open file description to fsync() it afterwards if desired.

3. If somebody runs a file system or hardware that ignores fsync() or hides close() errors, it's their own fault.
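
A minimal sketch of point 1 in C (fflush() first, because stdio buffering can hide a write error until flush time):

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        printf("hello\n");

        /* Flush stdio buffers, then close the underlying descriptor;
           either step can surface a deferred write error. */
        if (fflush(stdout) != 0 || close(STDOUT_FILENO) != 0) {
            perror("stdout");  /* stderr is the only channel left */
            return 1;
        }
        return 0;
    }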

If you run `hello > out.txt`, then it's not `hello` that creates and opens `out.txt`; the calling shell does. So if you use `>` redirection, you should fsync in the calling shell (sketched below).
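
A sketch of that caller-side responsibility, with a small C parent standing in for the shell (the same pattern as the Python `subprocess` example further down the thread; `hello` is the program from the redirection example above):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        /* The parent creates out.txt, so the parent owns fsync/close. */
        int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open out.txt"); return 1; }

        pid_t pid = fork();
        if (pid < 0) { perror("fork"); return 1; }
        if (pid == 0) {            /* child: fd becomes its stdout */
            dup2(fd, STDOUT_FILENO);
            close(fd);
            execlp("hello", "hello", (char *)NULL);
            _exit(127);            /* exec failed */
        }

        int status;
        waitpid(pid, &status, 0);

        /* The parent still holds fd, so it can force the data to disk
           and observe any error. This only works because we forked: a
           bare exec would replace the parent, leaving nobody to fsync. */
        if (fsync(fd) != 0 || close(fd) != 0) {
            perror("out.txt");
            return 1;
        }
        return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }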

Is there a drawback to this approach?

> LLVM tools were made to close stdout [..] and it caused so many problems that this code was eventually reverted

It would be good to know what those problems were.

bluetomcat 2 days ago

> Programs should call close() on stdout and report errors.

Programs have never called open() to obtain stdin, stdout and stderr; they inherit them from the shell. What would be a meaningful way to report errors if the basic output streams are unreliable? If close(stdout) fails, we would need to write to stderr, and then we would have exactly the same error-handling issue when closing stderr.

It's a flaw in the design of Unix where polymorphic behaviour is achieved through file descriptors. Worse is better...

  • marcosdumay 2 days ago

    > It's a flaw in the design of Unix where polymorphic behaviour is achieved through file descriptors. Worse is better...

    Looks to me like it's a flaw in the signature of `write`. There should be a way to retrieve the error status without changing the descriptor's state, and there should be a way to ensure you get the final status, blocking if necessary.

    This can even be fixed in a backwards-compatible way, by creating a new pair of functions.
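
    Purely for illustration, such a pair might look like this; these signatures are hypothetical, not existing syscalls:

        /* Hypothetical API, not part of POSIX or Linux: */

        /* Return any pending write error for fd (0 or an errno value)
           without consuming it, i.e. without touching the descriptor's
           error state. */
        int fd_peek_error(int fd);

        /* Block until every write on fd has reached its final status,
           then return that status (0 or an errno value). */
        int fd_await_final_status(int fd);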

112233 2 days ago

> Is there a drawback to this approach?

You mean, apart from no existing code working like that? It is not possible for the process that creates a descriptor to always fsync it, because in many very important cases the descriptor outlives the process.

What do you propose the shell command `exec cat a.txt > b.txt` should do?

  • nh2 2 days ago

    > no existing code working like that

    That doesn't really matter for discussing how correct code _should_ be written.

    Also, a good amount of existing code works like that. For example, if you `with open(..) as f:` a file in Python and pass it as an FD to a `subprocess` call, you can fsync and close it fine afterwards, and Python code bases that care about durability and correct error reporting do that.

    > What do you propose should "exec cat a.txt > b.txt" shell command do?

    That code would be wrong under my proposed division of responsibility (which is what the blog post discusses).

    If you create the `b.txt` FD and you want it fsync'ed, then you can't `exec`.

    It's equivalent to "if you call malloc(), you should call free()" -- you shouldn't demand that functions you invoke will call free() on your pointer. Same for open files.

    • 112233 2 days ago

      > you can fsync and close it fine afterwards

      No you cannot. Once you pass a descriptor to another process, that process can pass it to yet another process, fork and detach, send it via SCM_RIGHTS, hand the `/proc/PID/fd/N` path to something, etc.

      Never assume descriptor cleanup will happen, unless you have complete control over everything.

    • duped 2 days ago

      > That doesn't really matter for discussing how correct code _should_ be written.

      It absolutely does when you're talking about the semantics of virtually every program on earth.

      > It's equivalent to "if you call malloc(), you should call free()" -- you shouldn't demand that functions you invoke will call free() on your pointer. Same for open files.

      There are many cases where the one calling malloc cannot be the one calling free and must explicitly document to callers/callees who is responsible for memory deallocation. This is a good example of where no convention exists and it's contextual.

      But open files aren't memory, and in practice one cannot rely on file descriptors being closed without errors, so people don't; you can't just repave decades of infrastructure for no benefit out of ideological purity.

kreetx 2 days ago

I fully agree.

The blog post is essentially a long-winded way of saying that there is no way to safely call `close` that is compatible with all programs ever written. Yet I think we already knew that.

wruza 2 days ago

> It would be good to know what those problems were.

I don't know which problems LLVM had, but closing stdout (or stderr) long before exiting may make the next open() return 1 (or 2), and voilà, some stray printf() now writes right into your database.

If you have to close std*, at least dup2() the null device into it; that was common advice.
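
For instance, a sketch of that advice (the `important.db` name is just illustrative):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* Check the real stdout for errors first... */
        if (fflush(stdout) != 0) perror("stdout");

        /* ...then park fd 1 on /dev/null instead of closing it, so a
           later open() cannot be handed descriptor 1. */
        int devnull = open("/dev/null", O_WRONLY);
        if (devnull >= 0) {
            dup2(devnull, STDOUT_FILENO);
            if (devnull != STDOUT_FILENO)
                close(devnull);
        }

        int db = open("important.db", O_WRONLY | O_CREAT, 0644);
        (void)db;                        /* does NOT become fd 1 now */
        printf("stray debug output\n");  /* goes harmlessly to /dev/null */
        return 0;
    }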