Comment by pwdisswordfishz

Comment by pwdisswordfishz 8 days ago

13 replies

> So now I am confused, am I allowed to free() the Vec's pointer directly or not?

No, you are not; simple as that. Miri is right. Rust using malloc/free behind the scenes is an internal implementation detail you are not supposed to rely on. Rust used to use a completely different memory allocator, and this code would have crashed at runtime if it were still the case. Since when is undocumented information obtained from strace a stable API?

It's not like you can rely on Rust references and C pointers being identical in the ABI either, but the sample in the post blithely conflates them.

> It might be a bit surprising to a pure Rust developer given the Vec guarantees, but since the C side could pass anything, we must be defensive.

This is just masking bugs that otherwise could have been caught by sanitizers. Better to leave it out.

josephg 8 days ago

Yep. The right philosophy is to always put the allocation and deallocation code in the same language. If you create a struct with C, also make a destructor in C and call that from rust to destroy your object. (Usually by putting an ffi call in your drop impl).

And the same is true in reverse: rust struct might be temporarily owned and used by C code, but it should always be destroyed from rust.

  • Animats 6 days ago

    > Yep. The right philosophy is to always put the allocation and deallocation code in the same language.

    And on the same side of the API. Especially in Rust, he who allocates is responsible for deallocation. Rust's model likes that symmetry.

    This example looks like someone going to considerable trouble to create a footgun, then shooting themself with it. More likely, this example exists because they're using some C library with terrible memory semantics, and this is a simple example to illustrate a problem they had with a badly designed real library.

  • flohofwoe 8 days ago

    It's also a bad idea within the same language.

    A C library returning a pointer to allocated memory and then expecting the caller to free that memory with a function outside the library (like calling stdlib free()) is just bad API design (because you can't and shouldn't need to know whether the library is actually using the stdlib alloc functions under the hood - or whether the library has been linked with the same C stdlib than your own code - for instance when the library resides in a DLL, or the library might decide to bypass malloc and directly use lower-level OS calls for allocating memory).

    If you have a 'create' function in a C library, also always have a matching 'destroy' function.

    On top of that it's also usually a good idea to let the library user override things like memory allocation or file IO functions.

    ...and of course 'general purpose' global allocators are a bad idea to begin with :)

    • filmor 8 days ago

      That is exactly what what the parent post proposed.

JJJollyjim 8 days ago

It is in fact documented that you can't do this:

"Currently the default global allocator is unspecified. Libraries, however, like cdylibs and staticlibs are guaranteed to use the System by default.", however:

"[std::alloc::System] is based on malloc on Unix platforms and HeapAlloc on Windows, plus related functions. However, it is not valid to mix use of the backing system allocator with System, as this implementation may include extra work, such as to serve alignment requests greater than the alignment provided directly by the backing system allocator."

https://doc.rust-lang.org/std/alloc/index.html https://doc.rust-lang.org/std/alloc/struct.System.html

  • astrange 8 days ago

    > such as to serve alignment requests greater than the alignment provided directly by the backing system allocator

    Surely the system allocator provides memalign() or similar? Does Windows not have one of those?

    • jsheard 8 days ago

      Kind of but not really, the Windows heap doesn't support arbitrary alignments natively, so the aligned malloc function it provides is implemented as a wrapper around the native heap which over-allocates and aligns within the allocated block. The pointer you get back isn't the start of the allocation in the native heap, so you can't pass it to the standard free() without everything exploding.

      I don't think fixing that mess was ever a priority for Microsoft because it's mainly an issue for C, and their focus has long been on C++ instead. C++ new/delete knows the alignment of the type being allocated for so it can dispatch to the appropriate path automatically with no overhead in the common <=16 byte case.

    • jcelerier 8 days ago

      On macOS, aligned_alloc works starting from macOS 10.15.

CryZe 8 days ago

> It's not like you can rely on Rust references and C pointers being identical in the ABI either.

You absolutely can rely on that: https://doc.rust-lang.org/reference/type-layout.html#pointer...

In fact, it's almost even best practice to do so, because it reduces the unsafe needed on Rust's side. E.g. if you want to pass a nullable pointer to T, declare it as e.g. a parameter x: Option<&T> on the Rust side and now can you do fully safe pattern matching as usual to handle the null case.

Want to use some Rust type (not even repr(C)) from C? Just do something like this:

#[no_mangle] pub extern "C" fn Foo_new() -> Box<Foo> { Box::new(Foo::new()) }

#[no_mangle] pub extern "C" fn Foo_free(_: Box<Foo>) { /* automatic cleanup */ }

#[no_mangle] pub extern "C" fn Foo_do_something(this: &mut Foo) { this.do_something() }

all fully safe. Unfortunately for the OP, Vec<_> isn't FFI safe, so that's still one of those cases that are on the rough side.

  • pwdisswordfishz 8 days ago

    > Pointers and references have the same layout.

    > The layout of a type is its size, alignment, and the relative offsets of its fields.

    So “layout” only constrains the memory addresses where a value and its constituents are stored; it does not cover niches. Pointers are allowed to be null, references are not. There is talk of making it illegal to have a reference be unaligned, or even point to very low addresses: <https://github.com/rust-lang/rfcs/pull/3204>. At one point, there was even talk of certain kinds of references not even being stored as memory addresses at all: <https://github.com/rust-lang/rfcs/pull/2040>. And Box<_> is not #[repr(transparent)] either. To put it more generally, the only semantics of a Box<_> or a reference is that it grants you access to a value of a given type and is inter-convertible with a pointer. Only *const _ and *mut _ have a guaranteed ABI.

    Just because you write fewer “unsafe” keywords does not mean your code is more safe.

    • zamalek 6 days ago

      Right, if references were meant to be pointers then they would simply be called pointers.

      Some abstractions are meant to be treated as a black box. It's bloody trivial to realize and point out that references are represented as pointers internally. The actually intelligent step is to figure out how to use that knowledge correctly. Do we pass small bittable types around by-value because we skip the dereference? Hell yes (pending profiling). Do we treat references as pointers? No.

      As a plausible scenario, rustc could use tagged pointers for some purpose in refs. Your code could seem to be working in tests, but segfault when rustc sets the MSB for some reason.

    • comex 6 days ago

      There is also the ABI compatibility guarantee:

      https://doc.rust-lang.org/nightly/std/primitive.fn.html#abi-...

      > The following types are guaranteed to be ABI-compatible:

      > *const T, *mut T, &T, &mut T, Box<T> (specifically, only Box<T, Global>), and NonNull<T> are all ABI-compatible with each other for all T.

      Honestly, this seems like something where the docs need to be clarified, but it's definitely intended that, if you transmute a reference into a pointer (explicitly, or implicitly through FFI as documented in that link), then you get a valid pointer to the type, not just an unknown blob that happens to have the same size and alignment. Likewise, any _valid_ pointer can be transmuted to a reference. It is already UB for a reference to be null or unaligned, but those are both cases of invalid pointers. (Note: rustc only uses null as a niche right now, not unaligned values, but unaligned values still cause UB because rustc tells LLVM's optimizer to assume alignment.)

      If you want more evidence, here is a test for the improper_ctypes lint:

      https://github.com/rust-lang/rust/blob/b91a3a05609a46f73d23e...

      You can see that it's quite conservative about what can be passed through extern "C" functions without a warning, but &[u8; 4 as usize] is allowed, as is TransparentRef (a repr(transparent) struct containing a reference).

      It is also guaranteed that you can transmute between pointers and optional references (i.e. Option<&T>), which are the same except that they do allow null:

      https://github.com/rust-lang/rust/pull/60300

      Thus, if you are writing FFI bindings for a C library, you are allowed to just declare pointer arguments as references, or as optional references, as long as the semantics line up. In practice this is not terribly common (outside of fn() types, which are treated like references), but it is allowed.

      The RFC you cited as "certain kinds of references not even being stored as memory addresses at all" was rejected as a breaking change. If it had been adopted, it would have violated even the more-strict interpretation of the rule that pointers and references have the same layout, because it would have changed the size of references to zero-sized types (to zero).

      • pwdisswordfishz 4 days ago

        > Nothing in this section should be taken as a guarantee for non-Rust-to-Rust calls, even with types from core::ffi or libc.

        Which is exactly the case here.