Comment by gorgoiler

Comment by gorgoiler 13 hours ago

12 replies

You have the right idea but, I believe, the wrong reasoning with your first two arguments.

git-bisect works best when every commit works, contains a single idea, and stacks in a linear history. These features are of most use in a publicly visible branch, and is why it is helpful to squash an entire pull-request into a single, atomic commit — one which clearly defines the change from before- to after-this-feature.

You’re welcome to do whatever you like in your private branch of course, but once you are presenting work for someone else to review then it’s consistent with “I believe this is now complete, correct, working, and ready for review” to squash everything into a single commit. (The fact that code review tools show the sum of all the minor commits is a workaround for people that don’t do this, not a feature to support them!)

In terms of ‘git commit -m wip’: no one is saying you should wear a suit and tie around the house, but when you show up for your senate hearing, presenting yourself formally is as necessary as it is to leave the slides, sweat pants, and tee shirt at home.

Yes, commit early and often while in the flow of putting together a new idea or piece of work. When it’s ready for the attention of your peers then you absolutely ought to dress it up as smartly as possible. It’s at that point that you write a cover letter for your change: what was the situation before, why that was bad, what this patch does instead, and how you proved in practice that it made things better (tests!)

Or to use a different analogy: they don’t want every draft of your master’s thesis from start to finish and they’ll be annoyed if they have to fix basic typos for you that should’ve been caught before the final draft. They don’t care about the typos you already found either, nor how you fixed them. They just want the final draft and to discuss the ideas of the final draft!

Conversely if your master’s thesis or git branch contains multiple semantically meaningful changes — invent calculus then invent gravity / add foo-interface to lib_bar then add foo-login to homepage — then it probably ought to be two code reviews.

lmm 13 hours ago

> git-bisect works best when every commit works, contains a single idea, and stacks in a linear history. That works best in a publicly visible branch, and is why it is helpful to squash an entire pull-request into a single, atomic commit — one which clearly defines the change from before- to after-this-feature.

Disagree; git-bisect works best when every commit is small and most commits work (in particular, as long as any broken commit is likely to have a neighbour that works - isolated bad commits aren't a problem (that's what skip is for, and it's easy enough to include that in your script - you do automate your bisects, right?), long chains of bad commits are). Squashing means your bisect will land on a squashed commit, when it's only really done half the job. (In particular, the very worst case, where every single one of your intermediate commits was broken, is the same as the case you get when you squash)

> When it’s ready for the attention of your peers then you absolutely ought to dress it up as smartly as possible. It’s at that point that you write a cover letter for your change: what was the situation before, why that was bad, what this patch does instead, and how you proved in practice that it made things better (tests!)

And that's what the PR description is for! You don't have to destroy all your history to make one.

  • gorgoiler 12 hours ago

    Thanks for responding. Everything you say I agree with. I think our differences lie in the scope of how much of my private activity do I want to share in public.

    You’re right that GitHub, GitLab et al let you use their tooling to write the final commit message (for the merge commit or squash commit). My preference has always been to do that in git itself.

    In both cases you end up with a single atomic commit that represents the approved change and its description. For me, the commit is created the moment a review is requested, instead of from the moment it is approved and landed. One reason this is particularly useful is that you can now treat the commit as if it had already landed on the main branch. (It is easier to share, cherry-pick, rebase, etc. — easier than doing so with a branch of many commits, in my experience.)

    Prospective changes do not change type (from branch to squashed commit or merge commit) either, when they are approved, which simplifies these workflows.

    • lmm 35 minutes ago

      > One reason this is particularly useful is that you can now treat the commit as if it had already landed on the main branch. (It is easier to share, cherry-pick, rebase, etc. — easier than doing so with a branch of many commits, in my experience.)

      > Prospective changes do not change type (from branch to squashed commit or merge commit) either, when they are approved, which simplifies these workflows.

      You can do that even earlier if you simply never squash or otherwise edit history, which is my approach - any pushed feature branch is public as far as I'm concerned, and my colleagues are encouraged to pull them if they're e.g. working on the same area at the same time. It comes at the cost of having to actually revert when you need to undo a pushed commit (and, since cherry-pick is not an option, if you're making e.g. a standalone fix and want your colleagues to be able to pull it in unrelated branches, you have to think ahead a little and make it on a new branch based from master rather than sticking it in the middle of your incomplete feature branch), but it's very much worth it IME.

    • 1313ed01 11 hours ago

      You can also use just plain git and make sure the merge-commit has a useful message, while leaving the work-branch unsquashed and available to explore and bisect when necessary. The main branch looks as neat as when using squashes, and using something like git log --merges --first-parent all the small commits on the work-branches are hidden anyway. It looks just like when using atomic commits, but the extra details are still there when someone needs them.

    • isleyaardvark 9 hours ago

      I've followed your same approach but switched to having just "most of the commits work" after I found out about git bisect --skip.

  • amelius 11 hours ago

    Can't git-bisect simply ignore commits with a commit message smaller than N characters?

    • lmm 41 minutes ago

      Yes - or, being more principled, you can tell it to only use mainline commits (first parent) if landing on a whole PR is what you want.

    • lambdaone 11 hours ago

      That might sound facile, but it's actually a great idea. Being able to ignore commits based on regexes would be even more powerful.

      • s_gourichon 10 hours ago

        Not sure it really has huge benefits, but I guess something like this should work:

        ``` #!/bin/sh N=20 msg_len=$(git log -1 --pretty=%B | wc -c) if [ "$msg_len" -lt "$N" ]; then exit 125 fi # Here you would run your actual test and report 0 (good), 1 (bad) as needed exit 0 ```

davedx 10 hours ago

> When it’s ready for the attention of your peers then you absolutely ought to dress it up as smartly as possible. It’s at that point that you write a cover letter for your change: what was the situation before, why that was bad, what this patch does instead, and how you proved in practice that it made things better (tests!)

This is an extremely opinionated and time consuming way of working. Maybe in this context it makes sense (nvidia driver kernel somethings), but I don't think it's universally the best way to write code together.

  • gorgoiler 10 hours ago

    I agree that it’s time consuming but the complexity is constant, in my personal experience and with helping others, in that once you start writing long form commit messages (a) you only ever get faster at it, as a skill; and (b) it’s hard to stop!

  • fragmede an hour ago

    One of the best things about git, and the reason it won, is that as a tool, it's extremely unopinionated on this matter, and is supportive of however you want to do it. Of course, one of the worst things about git is how unopinionated it is. If you want 300 commit messages in every branch with the commit message of "poop" and no squashing, and none of them even compile, the tool isn't going to stop you. If every commit is fully functional and rebased on top of master so the graph isn't an octopus, you can. If you'd rather use the name main as the primary branch, also totally fine. Git, the tool leaves all that up to the user and the culture they operate in.

    Naturally, I have Opinions on the right way to use git, having used it since inception within various different contexts at various places, along with other VCSs. What works at one place won't be right for another place, and vice versa. Especially given different skill levels of individuals and of teams, the tools involved, and how much weight I have to review code and commits before it gets accepted. What's important is it should work for you, not the other way around. Regardless of where I'm working though, my local commit messages are total crap. "wip" being the most common, but I commit frequently. Importantly though, before I do eg a slightly involved refactor, going back to see what it was before I started is trivial. Being a skilled operator of git is important to make it easy to run newly written tests against the old code. Being efficient at rebase -i and sorting commits into understandable chunks and squashing minor commits to keep things clean is key.

    I don't think every patch in a series has to work totally independently for every git repo, but what it comes down to is maintenance. There's nothing worse than digging around in git history, trying to figure out why things are how they are, only to dead end at a 3000 line commit from 5 years ago with the message "poop, lol". It's even worse when the person who did that was you!

    Universally, what it comes down to is maintenance. That totally rushed prototype that was just for a demo has now been in production for years, and there's this weird bug with the new database. If you hate yourself, your job, future you, your colleagues, and everybody that comes after you, and you're no good at git, by all means, shit out 300 commits, don't squash, and have the PR message be totally useless. Also believe you're hot shit after one semester of boot camp and that no one else cares just because you want to go home, get high, and play xbox. (Not remotely saying that's you, but those people are out there.)

    We could get all philosophical and try and answer the question of if there are any universal truths, nevermind universally best git commit practices.

    I don't work where you work, don't know your team, or anybody's skill levels on it, so I'll just close with a couple thoughts. the tool is there to work for you, so learn to work with it not against it. Git bisect is your friend. And that it really sucks 4 years later to be confronted by totally useless commit messages on inappropriately sized commits (too big or too small) and have to guess at things in order to ship fixes to prod (on the prototype that was supposed to get thrown away but never did) and just hope and pray that you've guessed correctly.