Comment by 112233

Comment by 112233 2 days ago

3 replies

> Is there a drawback to this approach?

You mean, apart from no existing code working like that? It is not possible for process that creates descriptor to fsync it, because in many very important cases that descriptor outlives the process.

What do you propose should "exec cat a.txt > b.txt" shell command do?

nh2 2 days ago

> no existing code working like that

That doesn't really matter for discussing how correct code _should_ be written.

Also, a good amount of existing code works like that. For example, if you `with open(..) as f:` a file in Python and pass it as an FD to a `subprocess` call, you can fsync and close it fine afterwards, and Python code bases that care about durability and correct error reporting do that.

> What do you propose should "exec cat a.txt > b.txt" shell command do?

That code would be wrong according to my proposed approach of who should be responsible for what (which is what the blog post discusses).

If you create the `b.txt` FD and you want it fsync'ed, then you can't `exec`.

It's equivalent to "if you call malloc(), you should call free()" -- you shouldn't demand that functions you invoke will call free() on your pointer. Same for open files.

  • 112233 2 days ago

    > you can fsync and close it fine afterwards

    No you cannot. Once you pass descriptor to another process, that process can pass it to yet another process, fork and detach, send it via SCM_RIGHTS, give "/proc/PID/fd/N" path to something etc.

    Never assume descriptor cleanup will happen, unless you have complete control over everything.

  • duped 2 days ago

    > That doesn't really matter for discussing how correct code _should_ be written.

    It absolutely does when you're talking about the semantics of virtually every program on earth

    > It's equivalent to "if you call malloc(), you should call free()" -- you shouldn't demand that functions you invoke will call free() on your pointer. Same for open files.

    There are many cases where the one calling malloc cannot be the one calling free and must explicitly document to callers/callees who is responsible for memory deallocation. This is a good example of where no convention exists and it's contextual.

    But open files aren't memory and one cannot rely on file descriptors being closed without errors in practice, so people don't, and you can't just repave decades of infrastructure for no benefit out of ideological purity.