When O3 is 2x slower than O2

spacecadet_ 13 hours ago

In the branchy function, id is only compared if distance is equal, and since distance is a random float, this almost never happens and the corresponding branch is nearly perfectly predicted. The branchless function always compares both id and distance, effectively doing twice the work. It's only part of the reason why there's a 2x performance difference, but I thought it was interesting.

I really don't know how LLVM picks between branches or conditional moves, but my guess is that it doesn't assume that float equality is any less likely than other conditions, and some optimization pass in O3 turns unpredictable branches into conditional moves. I base this on the fact that adding std::hint::unlikely to the "equal" branch produces the same assembly for the function in both modes.

https://godbolt.org/z/erGPKaPcx

Whether it's safe to assume in general that float equality is unlikely for the purpose of program optimization, I'll leave to the compiler engineers. If you know the data your program will be handling, adding hints will avoid these surprises.

Reply View 1 reply

charleslmunger 12 hours ago

You can influence the choice of conditional moves (usually inserting them) with
__builtin_expect_with_probability(..., 0.5)
https://github.com/protocolbuffers/protobuf/commit/9f29f02a3...

Reply View | 0 replies

YesBox 14 hours ago

While working on my game, I tried implementing a custom priority queue.

It ended up being > 2x faster in debug build, but 2x-5x slower in the release build (??!!?) [1]. I haven't learned much about compilers/"lower level" C++, so I moved on at that point.

How it worked:

1.) The P.Q. created a vector and resized it to known bounds.

2.) The P.Q. kept tract of and updated the "active sorting range" each time an element was inserted or popped.

2B.) So each time an element is added, it uses the closest unused vector element and updates the ".end" range to sort

2C.) Each time an element was removed, it updated the ".start" range

3.) In theory this should have saved reallocation overhead.

[1] I believe Visual Studio uses -O0 for debug, and -O2 for release.

Reply View 5 replies

adzm 10 hours ago

These kinds of things are really interesting, and a good example of the importance of performance testing in optimized builds. I encountered something similar and realized the issue was different sizes of structures in debug vs release which ended up causing the CPU cache lines to overlap and invalidate. Though that sounds unlikely here.

Reply View | 0 replies
FuckButtons 3 hours ago

And this is why you go and look at the assembly in godbolt to see wtf is going on.

Reply View | 0 replies
shihab 12 hours ago

I'm curious what you did with the "active sorting range" after a push/pop event. Since it's a vector underneath, I don't see any option other than to sort the entire range after each event, O(N). This would surely destroy performance, right?

Reply View | 2 replies
- YesBox 9 hours ago
  
  There's no need to resort when popping an item.
  When adding an item, it gets added to the next unused vector element. The sorting range end offset gets updated.
  Then it sorts (note you would actually need a custom sort since a PriorityQueue is a pair)
  std::push_heap( vec.begin() + startOffset, vec.begin() + endOffset ) [1]
  Adding an item would be something like:
  endOffset++; vec.insert( vec.begin() + endOffset, $value ); [1]
  Or maybe I just used
  endOffset++; vec[endOffset] = $value; [1]
  Popping an item:
  startOffset++;
  [1] Im writing this from memory from my attempt many months ago. May have mistakes.. but should communicate the jist
  
  Reply View | 0 replies
- [removed] 7 hours ago
  
  [deleted]
  
  Reply View | 0 replies

pclmulqdq 18 hours ago

-O3 -march=haswell

The second one is your problem. Haswell is 15 years old now. Almost nobody owns a CPU that old. -O3 makes a lot of architecture-dependent decisions, and tying yourself to an antique architecture gives you very bad results.

Reply View 21 replies

arthur2e5 17 hours ago

The author gives a godbolt link. It takes 5 minutes to add two compilers on raptorlake and see that it gives the same result.
https://godbolt.org/z/oof145zjb
So no, Haswell is not the problem. LLVM just doesn't know about the dependency thing.

Reply View | 2 replies
- hedora 16 hours ago
  
  Also, if you’re building binaries for redistribution, requiring Haswell or newer is kind of an aggressive choice.
  I have a box that old that can run image diffusion models (I upgraded the GPU during covid.)
  My point being that compiler authors do worry about “obsolete” targets because they’re widely used for compatibility reasons.
  
  Reply View | 0 replies
- pclmulqdq 14 hours ago
  
  Oh, yeah. Looking at the code the comparison function is bad in a way that will hit issues with cmovs. The "else if" there is a nearly 100% predictable branch which follows a branch that's probably completely unpredictable. One optimization you get with -O3 is that branches not marked likely or unlikely can be turned into dataflow. The proper ordering for that code is to pull the equality check forward into an if statement and mark with "unlikely" if you're using -O3, and then return the ordering of the floating point comparison.
  Since we're using Rust there's some decent syntactic sugar for this function you can use instead:
  let cmp = |other: &Neighbor| -> Ordering { other.dist.partial_cmp(&neighbor.dist).unwrap() .then_with(|| other.id.cmp(&neighbor.id)) };
  You probably won't get any CMOVs in -O3 with this function because there are NaN issues with the original code.
  
  Reply View | 0 replies
mgaunard 17 hours ago

My desktop computer is a Sandy Bridge from 2011. I still haven't seen any compelling reason to upgrade.

Reply View | 13 replies
- bigstrat2003 11 hours ago
  
  For what it's worth, you may be pleasantly surprised by the performance if you upgrade. I went from an Ivy Bridge processor to a Zen 3 processor, and I found that there were a lot of real world scenarios which got noticably faster. For example, AI turns in a late game Civ 6 map went from 30s to 15s. I can't promise you'll see good results, but it's worth considering.
  
  Reply View | 0 replies
- alberth 17 hours ago
  
  What factors would be compelling to upgrade for?
  Just curious, since perf alone doesn’t seem to be the factor.
  https://browser.geekbench.com/processors/intel-core-i7-2600k
  https://browser.geekbench.com/processors/intel-core-i9-14900...
  
  Reply View | 11 replies
  
  jpc0 17 hours ago
  
  Because number bigger doesn’t translate to higher perceived performance…
  The only compelling reason that I want to upgrade my Sandy Lake chip is AVX2.
  So it is instruction set not perf, sure there will be improved performance but most of the things that are actually performance issues is already handed off to the GPU.
  On that note probably rebar and PCIe4 but those aren’t dramatic differences, if CPU is really a problem (renders/compilation) then it gets offloaded to different hardware.
  
  Reply View | 9 replies
  
  genter 17 hours ago
  
  Not OP, but I'm on a 10 year old laptop.
  Only thing I'd want is a higher resolution display that's usable in daylight, and longer battery life.
  
  Reply View | 0 replies
ksec 16 hours ago

Not as old but I am still typing this on a MacBook Pro Early 2015 with a Broadwell CPU. It is doing pretty well with Chrome and Firefox, not so much with Safari.

Reply View | 0 replies
cassepipe 13 hours ago

I do. Had to replace the plastics of the laptop and the screen's definition is unacceptable but with Linux and a SSD it's still fine for basic computer usage. Not my daily driver but kept it as a daily drivers for 10 years.

Reply View | 0 replies
wffurr 16 hours ago

The default is an even older instruction set. Maybe you meant to suggest -march native ?

Reply View | 0 replies
ericpauley 17 hours ago

This era of CPUs has held up surprisingly well. I built an Ivy Bridge desktop in 2012 that still sees daily productivity use (albeit with an NVMe and RAM upgrade).

Reply View | 0 replies

johnisgood 19 hours ago

Off-topic, but I seriously dislike the syntax of Rust. It is chaotic and mind-boggling to me. See the "insert" function.

Good post though.

Reply View 54 replies

Mond_ 18 hours ago

This point has been litigated to death. Read this here: https://matklad.github.io/2023/01/26/rusts-ugly-syntax.html
Almost everything that people think is ugly about Rust's syntax exists for very specific reasons. Most of the time, imo Rust made a good decision, and is just making something explicit.
Some things take time to get used to (e.g. if let), but for most people that's less an issue of syntax, and more an issue of not understanding a powerful feature (e.g. pattern matching deconstructions).

Reply View | 15 replies
- johnisgood 17 hours ago
  
  The reasons do not matter here. It is still ugly / noisy / overly-complicated and probably could have been done better.
  I understand pattern matching deconstructions, I have seen it in other languages. Funnily enough they were nowhere as ugly / noisy / complicated as Rust's is. Rust seems to have bolted on a lot of fancy shit that may be appealing to a lot of people and that is it.
  In your link, the first one is fugly, the last one is fine. Maybe Rust just encourages ugly (i.e. complicated) code a bit too much.
  
  Reply View | 3 replies
  
  embedding-shape 16 hours ago
  
  As someone who mostly writes Clojure code professionally during the day, I agree, Rust's syntax is complicated for no good reason, ugly and overly-verbose. But then I think that about most Algol-like language too, not just Rust.
  And despite that I do use Rust when I want something simple to deploy/deliver, as handing over a binary that just runs is such a nice experience, and it's real easy to make fast. As long as I don't have to maintain in long-term, Rust is fine for what it is.
  
  Reply View | 0 replies
  
  LtdJorge 14 hours ago
  
  I like Rust’s syntax and dislike Elixir/Ruby’s. I also prefer Erlang’s to Elixir, lol.
  
  Reply View | 0 replies
  
  Mond_ 14 hours ago
  
  > It is still ugly / noisy / overly-complicated and probably could have been done better.
  I don't know, it feels like you're just saying that you don't like it, missed the point of the post, and are not giving us anything concrete. Can you list a very clear example of how you'd improve the syntax?
  Again, see the post: You can remove things, but you're losing explicitness or other things. If you want a language that's more implicit, this is fine. I don't.
  
  Reply View | 0 replies
- mihaic 16 hours ago
  
  I'm actually fine with almost all the decisions that Rust made in terms of logic and concepts, but specifically don't like the synthax itself: the symbols, keywords like the consonant-only "fn" instead of "func" for instance, the fact that || {} starts a lambda instead of || -> void {}, the fact that you can return things by simply having them in an if branch. It's the main reason I don't use the language.
  
  Reply View | 5 replies
  
  DrNefario 15 hours ago
  
  Most of those are a matter of preference, implicit return is just plain better, and it would be absolutely insane if closures required the return type to be specified. I do agree that the toilet bowl `|_|()` syntax is ugly, though.
  
  Reply View | 2 replies
  
  Mond_ 14 hours ago
  
  Meh, if these are the main reasons you don't use the language, I don't know what to tell you. I get having preferences, but whether the keyword is `fn` or `func` is such a banal, trivial thing that doesn't matter at all.
  
  Reply View | 1 reply
  
  mihaic 14 hours ago
  
  Each taken individually is something I can ignore. But there are so many of them that it's no longer trivial to ignore.
  
  Reply View | 0 replies
- bernds74 15 hours ago
  
  "if let" just breaks my brain, it feels backwards to me and takes me a minute each time to work out what's going on. In 40 years of programming I've never seen a syntactic construct that I found less intuitive. And it would probably be easily fixable, if it was more along the lines of "if x matches Some(let z)".
  
  Reply View | 1 reply
  
  ai_ 9 hours ago
  
  if let makes a lot more sense when you learn that a normal let expression also takes a pattern[1].
  let Coordinates(x, y) = get_coords();
  But this is intended for "exhaustive patterns". If you can't express an exhaustive pattern, like with an Option, then you can use let ... else
  let Some((x, y)) = get_coords() else { return };
  if let is just an extension of this "let pattern" system.
  Once you internalize how patterns work (and they really work everywhere) it all starts to really make sense and feels a lot cleaner.
  [1]: https://doc.rust-lang.org/reference/patterns.html
  
  Reply View | 0 replies
- [removed] 16 hours ago
  
  [deleted]
  
  Reply View | 0 replies
- Asooka 13 hours ago
  
  They could have at least put the types on the left. Types on the right are a good fit for mathematics, where everything is single-letter symbols, so your eyes do not have to go across the line to see the entire definition, but in programming function definitions often span an entire line on screen, so the most important information should be the first thing shown! Also most programmers do not have formal education in mathematics, so the whole "function : domain -> image" syntax is foreign to them. We really should strive to democratise programming for everyone, rather than make it isolated to a small clique of "properly" educated white men who all went to the same university.
  The type of a variable and the return type of a function are the most important pieces of information regarding those, so they ought to be on the left. It also fits the data flow going right to left (i.e. how the '=' operator works). C's type declarations can get pretty gnarly, so there is definitely room for improvement there. I would say Java (and C#) got it as close to perfect as possible.
  If you want to replace C/C++ you should make your language look as inviting as possible to users of the old one. I truly think this strange fetish for putting types on the right is what gives C programmers the ick when they see Rust for the first time and is hampering its adoption.
  
  Reply View | 1 reply
  
  kelnos 10 hours ago
  
  After using C and Java for years and years I absolutely loved types on the right when I started using Scala. On the right just makes so much more sense to me, and I'm not a math guy.
  Each to their own, I guess.
  
  Reply View | 0 replies
ikkun 19 hours ago

as a beginner rust programmer, I agree. it takes me way longer to parse someone else's rust code than it does for me to read C or C++, even though I have about the same amount of experience with them. in that example, I had to look up what "if let Err() =" does, because it's not intuitive to me. it seems like every time I read rust code, I have to learn about some strange feature that's probably convenient when you know it, but there's a billion of them and they add up to hard to read code until you've spent tons and tons of time with rust. it's just so much to memorize compared to other languages.

Reply View | 5 replies
- ben-schaaf 17 hours ago
  
  I have the opposite experience: C++ is what I have the most experience with by a very wide margin, but I find reading other people's rust code way easier than other people's C++ code. There's way more weird features, language extensions and other crazy stuff in C++ land.
  
  Reply View | 2 replies
  
  johnisgood 15 hours ago
  
  I think both Rust and C++ are behemoths. I do not even know what "proper" "modern" C++ is.
  
  Reply View | 0 replies
  
  ikkun 11 hours ago
  
  I believe you, I haven't contributed a lot of C++ code and it's quite possible the projects I have contributed to (e.g. godot engine) just happen to be written very legibly.
  
  Reply View | 0 replies
- piva00 19 hours ago
  
  It reminds me the experience I had when working with Scala, I really tried to like it but the mind-boggling amount of features created similar issues.
  It took me about 2 years to feel somewhat comfortable but I'd still run into code where someone decided to use their own set of unconventional favourite features, requiring me to learn yet-another-way to do the same thing I had seen done in other ways.
  I just got tired of it, didn't feel more productive nor enlightened...
  
  Reply View | 1 reply
  
  tormeh 18 hours ago
  
  Don't understand this complaint about Rust, but I'll give you Scala. Never seen a language so finely tuned to empower architecture astronauts to make terrible decisions. And boy do they make terrible decisions.
  
  Reply View | 0 replies
Sharlin 19 hours ago

It takes at most a week to get used to just about any syntax. It should absolutely never be a reason not to try a new language. That said, it would be lovely if Rust had a more ML-like, rather than C-like, syntax (being originally inspired by O’Caml), but that would likely not help attract devs in the intended target audience!
The insert function, for what it’s worth, has nonstandard formatting; the de facto standard rustfmt tool would use more newlines, making it clearer and less dense. The function also uses a couple of syntactic features like `if let`, which may seem unfamiliar at first but become second nature in a few days at most.

Reply View | 12 replies
- alt187 19 hours ago
  
  It probably takes a week to get used to ugly pants too. OP's point is a matter of aesthetic judgement. I don't mind it either, but = |&I: can| -> Agree { Rust::Syntax }; is.a(Bit(&mut chaotic);
  
  Reply View | 2 replies
  
  Sharlin 16 hours ago
  
  I should probably have used something less ambivalent than "getting used to". There's certainly more elegance in Rust's semantics than syntax, but the latter doesn't really differ much from C, except for much better for loops, a sane declaration syntax (this is a BIG deal), and of course all the features that C just doesn't have, like closures^* Some of the syntax noise just has to be there in some way, to support the real deal which is the semantics.
  ^* I don't particularly enjoy the Ruby-borrowed || either, particularly because Alt-7 is a terrible key combination to type, but oh well.
  
  Reply View | 1 reply
  
  bombela 10 hours ago
  
  Out of curiosity, what language specific keyboard requires alt-7 to enter the pipe symbol?
  On the QWERTY keyboard, the pipe is near the enter key and easily typed.
  
  Reply View | 0 replies
- mikkupikku 18 hours ago
  
  Every time a language developer chooses to give their new language a novel, let alone chaotic, syntax, they sin against the very premise of software development being an engineering profession. Can you imagine the madness if mechanical engineers had the same laissez-faire mentality towards the conventions and learned best practices of their field? Randomly chosen bolts to use left hand threads just because it suits their arbitrary whim, every thread has a unique pitch never seen before just to make the machine stand out from the crowd. Parts of the machine use hot rivets, not because the design demands it but because it scratches a retro aesthetic itch the designer had that day. Every time a new machine is designed it has a whole class of new never before seen quirky ways of being put together and any mechanic who pleads for standardization on common sense ways of putting a thing together are just told to git gud because the creative expression of the 'engineers' is paramount.
  (This is how we end up with German cars, isn't it?)
  
  Reply View | 5 replies
  
  dzaima 18 hours ago
  
  It's of course a matter of perspective, goals, and assumed knowledge. Is it choosing to not follow best practices, or leaving outdated practices with unfixable downsides behind? Is it inventing new things or instead actually just taking already-existing things that you just happened to not know about? Is it making arbitrary changes, or necessary ones for a useful goal? Is an extra bit of temporary pain for existing engineers worth making things better for all future engineers? (not to say that Rust has necessarily made all the right decisions; but alas basically nothing is perfect, and things certainly won't get better if noone tries)
  
  Reply View | 0 replies
  
  wtetzner 17 hours ago
  
  > Every time a language developer chooses to give their new language a novel, let alone chaotic, syntax, they sin against the very premise of software development being an engineering profession.
  Exactly which syntax should every language be using then? Everyone will give you a different answer.
  > Randomly chosen bolts to use left hand threads just because it suits their arbitrary whim
  Claiming the syntax was chosen randomly on a whim is very much not true: https://matklad.github.io/2023/01/26/rusts-ugly-syntax.html
  And there are times when it does make sense to use left-hand threads.
  Just because someone looks at new syntax and doesn't immediately understand it doesn't mean that syntax doesn't exist for good reason.
  Also, if your language's constructs and semantics don't exactly match those in other languages, then giving them the same syntax would be actively misleading, and therefore a design flaw.
  Syntax is also only an issue for people who haven't taken the time to learn the language. If you haven't learned the language, then familiar syntax isn't going to help you anyway. You'll just think you understand something you don't.
  
  Reply View | 3 replies
- dman 19 hours ago
  
  Laughs in lisp
  
  Reply View | 2 replies
  
  Sharlin 16 hours ago
  
  I've written Clojure professionally. It's exactly the language I had in mind when I said that it takes a week to get accustomed to. I find it a very clean and elegant language. Shame that it's dynamically typed.
  
  Reply View | 1 reply
  
  tmtvl 16 hours ago
  
  Check out Common Lisp, it's gradually typed, so if you want you can just write your code like this:
  (defun fib (n) (declare (type (Integer 0 100) n)) (the (Integer 0 *) (if (< n 2) n (+ (fib (- n 1)) (fib (- n 2).)
  
  Reply View | 0 replies
figassis 19 hours ago

It’s promises make me interested, but the syntax is my main turnoff. I am a Go person, and I think what brings people to go is the opposite of what brings people to Rust. I am willing to sacrifice some memory safety (because I maybe naively think I can manage to write software without many memory bugs) for the simplicity and dev experience that Go offers.

Reply View | 0 replies
Levitating 18 hours ago

The first statement defines a closure. The second is an if-let statement. It's not chaotic, you're just unfamiliar with the syntax.
I actually find the Rust syntax very natural, more than C in some areas.

Reply View | 2 replies
- johnisgood 17 hours ago
  
  More than C? Highly doubt that.
  
  Reply View | 1 reply
  
  Levitating 14 hours ago
  
  Arrays are initialized with curly braces but the type notation uses brackets.
  typedef takes the identifier at the end of the statement.
  The asterisk is used to de-reference but used to denote a reference in types.
  While loops may take the condition after the block.
  
  Reply View | 0 replies
mpalmer 19 hours ago

It really reads quite simply once you're familiar with the language. You only see chaos because there are none of the semantic hooks you'd get from experience

Reply View | 2 replies
- johnisgood 17 hours ago
  
  Everyone who tells me this, I would love them to write an useful project in Forth. ;)
  Guess what? They would blame Forth, Common Lisp, etc., yet I could tell them the same things I am being told about Rust.
  
  Reply View | 1 reply
  
  mpalmer 14 hours ago
  
  Forth and Common Lisp are demanding, but they demand different things of the novice programmer than Rust.
  The latter's baroque syntax is not provably good or necessary, but I'd call it the other end of a compromise for safety and expressiveness. And once you learn it, you've learned much of the language by default.
  Not so with with the first two! The mental models involved in reasoning about a piece of code require much more cognition on the part of the programmer.
  
  Reply View | 0 replies
6r17 19 hours ago

tbh it's bit boring to get stuck on aesthetic of writing up something. It's a bit nauseous to see how this place is hostile to rust for no reason other than "it's not pretty". It's a joke at this point we could make caricature of HN viewpoint on rust. We get it, you don't like it.

Reply View | 2 replies
- christophilus 18 hours ago
  
  HN is not some monolithic single perspective. There are plenty of Rust fans here, and they show up just as frequently as the complainers. In fact, there is overlap between the two groups. You can be a fan of a tech and still complain about its warts.
  
  Reply View | 0 replies
- johnisgood 17 hours ago
  
  Oh come on, how many times have we seen "blazingly fast in Rust", "rewritten in Rust" etc.? Many times. We get it. You like Rust. This "rewrite in Rust" bullshit is just as nauseous and I am sick of seeing it, too. So what? In fact, it is quite a meme now.
  Please, every time you see someone praise Rust (every other day), tell them the same as you have told me, just the opposite way around.
  
  Reply View | 0 replies
neonz80 17 hours ago

I find the short type names for integers and float hard to read. Somehow the size of the type is more important than if it is a signed integer, unsigned integer or a floating point number.
Using Vec for arrays is also annoying, repeating the mistake from C++.

Reply View | 3 replies
- Denvercoder9 17 hours ago
  
  > Using Vec for arrays is also annoying, repeating the mistake from C++.
  Neither Rust nor C++ uses vectors as array, they're distinct things.
  An array is a fixed-size object, which never allocates and its elements thus always retain their memory address. In Rust they're `[T; N]`, in C++ `T[N]` or more recently `std::array<T, N>`.
  A vector on the other hand is dynamically sized object, that may (re)allocate to accomodate new elements, and the address of their elements can change. In Rust they're `Vec<T>`; in C++ `std::vector<T>`.
  
  Reply View | 1 reply
  
  neonz80 17 hours ago
  
  I'm aware of how they are used, but fundamentally there is nothing with the words "array" and "vector" that says that one has a fixed size and the other has a dynamic size. If anything, it should be the other way around. Using the name vector for dynamic arrays is annoying when doing any kind of math coding. Even the designer of STL says the name was a mistake.
  
  Reply View | 0 replies
- bigstrat2003 10 hours ago
  
  I genuinely don't understand how you can find the type names for numbers hard to read, nor how you can say the size is treated as more important. The first thing in the type is what kind of number it is!
  
  Reply View | 0 replies
ozgrakkurt 18 hours ago

For me it is the usage of macros and traits everywhere.
Good luck if you want to get into the code of a library to understand what a function does. You have to go through 3 macros and 5 traits across 5 files. When it could have been just a couple function calls.
People don’t stop and think if they really need that trait or macro for five seconds, they just have to use it every time

Reply View | 2 replies
- LtdJorge 14 hours ago
  
  I dislike macros for that reason when reading code on GitHub or Docs.rs, but Rust was kind of made to be used with an IDE (or LSP).
  
  Reply View | 1 reply
  
  bigstrat2003 10 hours ago
  
  That is poor practice IMO. The simple reality of the world is that not everyone uses such tools. Designing a language with the expectation they will is deliberately ignoring the needs of some users.
  
  Reply View | 0 replies
cannonpalms 17 hours ago

The insert function is very unidiomatic. Instead of defining a cmp closure, you would typically implement Ord and friends.

Reply View | 0 replies
[removed] 19 hours ago

[deleted]

Reply View | 0 replies