Hugo Hacker News

Git 2.33

chucky 2021-08-17 09:22:27 +0000 UTC [ - ]

I think this should link directly to Git's release notes (e.g. https://lore.kernel.org/git/xmqq1r6touqi.fsf@gitster.g/), or the title should be updated to somehow reflect that this is Github's blog post about the changes, not Git's official announcement.

Git is not owned or stewarded by Github, and posts like this reinforce the common and unfortunate image that it is.

jasode 2021-08-17 11:00:01 +0000 UTC [ - ]

>I think this should link directly to Git's release notes (e.g. https://lore.kernel.org/git/xmqq1r6touqi.fsf@gitster.g/),

I disagree about that suggestion because I think it's less friendly for the wider more generalized HN audience.

The very 1st sentence of the blog post already has a helpful hyperlink ref for the "released Git 2.33" that points to the official Git mailing list post.

>The open source Git project just <released Git 2.33=="lore.kernel.org..."> with features and bug fixes from over 74 contributors, 19 of them new.

The rationale for why I believe this is better for most casual readers:

- The blog post has extra context and nice illustrations and lets more hardcore readers also discover Junio C Hamano's mailing list post for Git.

- But the reverse direction of information discovery is not as easy... if the HN thread submission were the Junio C Hamano email text, there would be no (reciprocal) links to the blog post by Github.

Sometimes, the urge to avoid posts from <megacorporation> is reader hostile.

ADD EDIT reply to : >OR it can link to the blogpost with a different title.

I thought this was superfluous/redundant since "(github.blog)" in parentheses -- is already the composited title of the thread even if the submitter didn't put "Github" in the title.

>I did not notice an objection to linking to a mega corporation.

It was a general response to 3 comments (so far) I saw that indirectly complained about this thread being a Github blog post instead of something official from a Git maintainer such as a Junio C Hamano email.

capableweb 2021-08-17 14:26:22 +0000 UTC [ - ]

I think my previous comment (https://news.ycombinator.com/item?id=28208138) highlights why it's problematic to link to release notes from an organization that is not the same organization that actually does the work/releases.

In this case, GitHub didn't mention anything about `send-email`, because it's a direct competitor to themselves.

jasode 2021-08-17 15:34:36 +0000 UTC [ - ]

>I think my previous comment ([...]) highlights why it's problematic to link to release notes from an organization that is not the same organization that actually does the work/releases.

Your characterization of it being "problematic to link to Github" vs Git's official mailing list is overlooking the cause & effect of how stories end up on the front page. If most readers would prefer more "official" links from domain *.kernel.org, people can submit them, as they have done so in the past. Examples:

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

Of course, after submission, people also need to upvote them. It's wrong (and misleading) to later change a submission linking to Github blog to the kernel.org url. The upvoters may have found the context and explanations from Github's post to be more helpful than the kernel.org post. (And as stated before, the Github post has the extra bonus of leading readers to the kernel.org release notes. But the opposite is not true, the kernel.org post does not lead to Github's extra explanations about the repack algorithm.)

In your opinion, where is the root cause of the "problem" as you call it? The submitters? The upvoters? The HN mods that leave the original thread url unchanged?

>In this case, GitHub didn't mention anything about `send-email`, because it's a direct competitor to themselves.

Hang on... let's level set here with some realism and not exaggerate. The "git send-email" is not a "direct" competitor to Github. Maybe GitLab could be described as a direct competitor.

Here's a previous thread where some folks don't want to mess with send-email because it's a hassle with a bunch of replies lecturing them that they're doing it wrong: https://news.ycombinator.com/item?id=13631069

It doesn't matter if those posters' complaints about send-email are wrong. The point is that send-email caters to a different type of git user and it's not a direct threat to Github at all.

Also, if Github was truly worried about users defecting to "git send-email", why would they deliberately link to the official Git notes that highlight git send-email as the 1st bullet point ?!?

If we entertained your theory that "git send-email" kills Github's business model, wouldn't they omit the link to keep readers in the dark?

karmanyaahm 2021-08-17 18:54:03 +0000 UTC [ - ]

tyre 2021-08-17 14:48:44 +0000 UTC [ - ]

> because it's a direct competitor to themselves.

Gonna need a source on this one.

capableweb 2021-08-17 15:39:58 +0000 UTC [ - ]

Source? GitHub is meant for collaborating on Git repositories. One of their main features is Pull Requests, which is basically the `send-email` feature but without email, with a web UI and some other features on top of PRs.

sofixa 2021-08-17 20:43:39 +0000 UTC [ - ]

So in the same way that, say, a motorcycle and a car are competitors. You can use them for some of the same things (move yourself), but one of them does many more things.

nextaccountic 2021-08-17 21:14:37 +0000 UTC [ - ]

We can speculate on why they would omit such a thing, but there's a clear conflict of interest here.

gugagore 2021-08-17 11:25:39 +0000 UTC [ - ]

You're not really responding to the comment. The comment is that it should link to the release notes if it keeps the title OR it can link to the blogpost with a different title.

I did not notice an objection to linking to a mega corporation.

2021-08-17 11:37:22 +0000 UTC [ - ]

remus 2021-08-17 09:30:46 +0000 UTC [ - ]

I agree with the principle, but in practice the mailing list post is pretty inscrutable to me (as a casual git user, maybe more experienced users find it more helpful?) On the other hand the github blog post provides a lot of extra context and explanation around the changes which adds a lot of value to me.

capableweb 2021-08-17 10:09:19 +0000 UTC [ - ]

I also agree in principle and agree that the GitHub post with additional context is nice.

But I'm wondering if this is rather feedback for the Git team to do better release notes (like the GitHub team here), as GitHub will always be biased with their release notes, so it would be better for everyone if the release notes from Git were as good.

In this particular release, `send-email` is mentioned in the top in the mailing list as it has a new feature.

However, GitHub has zero interest in people using `send-email` as they are basically a competitor to the `send-email` command. So they simply don't mention it in their release notes.

This is of course bad, and I'm not sure why they wouldn't at least try to remain impartial. But this is why we need the release notes of the Git team to be better.

chucky 2021-08-17 09:41:52 +0000 UTC [ - ]

I agree, I think changing the title is a better solution than linking to the release notes, but I still think it makes sense to offer extra clarity.

The original title of the blog post is "Highlights from Git 2.33" which to me is a bit clearer that this is not release notes or any kind of "official" Git 2.33 announcement but rather someone's highlights about Git 2.33. I have no idea why the original submitter chose to edit the title like this.

usr1106 2021-08-17 15:26:37 +0000 UTC [ - ]

> I have no idea why the original submitter chose to edit the title like that

Not only that; editing the title is against HN submission guidelines. https://news.ycombinator.com/newsguidelines.html

alimbada 2021-08-17 10:33:11 +0000 UTC [ - ]

Noisy text file littered with mail header, lists of names, commit hashes and branch names, file names, commit messages and the only emphasised aspect being hyperlinks which stick out like a sore thumb vs. a well formatted blog post with clear headings, easy to read font, diagrams for architectural changes, code examples and even an animated GIF.

Yeah, I know which one I'm choosing (the latter).

lobo_tuerto 2021-08-17 15:23:18 +0000 UTC [ - ]

Or title changed to the actual title on the post: "Highlights from Git 2.33"

rattray 2021-08-17 14:00:09 +0000 UTC [ - ]

The title of the post is "Highlights from Git 2.33" which implies less ownership - perhaps the submission title should be changed to that?

throwawayswede 2021-08-17 10:24:04 +0000 UTC [ - ]

Not to mention that the actual Git release note includes the names of new and returning contributors, which the Github post just replaced with numbers.

will4274 2021-08-17 14:26:40 +0000 UTC [ - ]

> posts like this reinforce the common and unfortunate image that it is.

Do they really? The first sentence of the post begins "The open source Git project" - it seems to me that this blog post could only reinforce that incorrect perception in people who didn't read it.

CRConrad 2021-08-17 14:44:37 +0000 UTC [ - ]

What about "The open source Git project" is it that says this isn't posted by it?

teddyh 2021-08-17 10:14:31 +0000 UTC [ - ]

I guess that the next release of Linux will be announced here on HN by a link to a microsoft.com blog post?

tester34 2021-08-17 10:46:12 +0000 UTC [ - ]

I hope it'll be written by Linus Torvalds then

IshKebab 2021-08-17 09:11:12 +0000 UTC [ - ]

Is anyone working on better LFS integration with core Git? Somebody mentioned on here a while ago how much of a hack it is and how you could make it way better if it was a core part of Git.

rkangel 2021-08-17 12:44:57 +0000 UTC [ - ]

There is some discussion of sparse checkouts and 'cone mode' lower down in the article. Proper support for these is the replacement for LFS.

LFS is a hack and a workaround for the fact that git ships the whole thing and requires you to have every copy of every blob from the whole of history.

LFS cheats this by storing the big blobs elsewhere and only fetching the ones needed for your checkout. It's a giant pain as everyone knows because you've got two sources of data. Instead lots of work is being done on git to remove the "every copy of every blob" requirement. This is in 2 dimensions - only some specified depth of history, and only some subset of the checkout (LFS only helps with the first). Effectively you still only end up fetching the blobs you need from the server for the checkout you're doing. That gets the same benefit of LFS but all in core git.
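
To make the "subset of the checkout" dimension concrete, here is a rough Python sketch of which paths a cone-mode sparse checkout keeps. It is a simplified model, not Git's code: `in_cone` and the sample paths are made up for illustration, and it assumes the documented cone-mode rule that files sitting directly in the root or in any ancestor of a selected directory stay, plus everything below the selected directories.

```python
from pathlib import PurePosixPath


def in_cone(path: str, cone_dirs: list[str]) -> bool:
    """Return True if `path` would be kept by a cone-mode sparse checkout.

    Simplified model (assumption, not Git's implementation):
      * everything under a selected directory is kept, recursively;
      * files sitting directly in the root or in any ancestor of a
        selected directory are kept too.
    """
    parent = PurePosixPath(path).parent
    for raw in cone_dirs:
        cone = PurePosixPath(raw)
        # Recursive match: the file lives at or below the selected directory.
        if parent == cone or cone in parent.parents:
            return True
        # Ancestor match: the file sits directly in a parent of the cone.
        if parent in cone.parents:
            return True
    # With no cone directories selected, only top-level files remain.
    return parent == PurePosixPath(".")
```

With `["src/app"]` selected, `README.md`, `src/main.c` and anything under `src/app/` are kept, while `docs/guide.md` and `src/lib/util.c` are not.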

handrous 2021-08-17 14:30:14 +0000 UTC [ - ]

LFS has much better UX than sparse-checkout and friends. Really, the only pain points are due to its not being built-in to Git.

rkangel 2021-08-17 14:52:15 +0000 UTC [ - ]

There are two main fundamental things that cause problems with LFS:

LFS makes a 'centralised' assumption about Git usage. Git is designed to be decentralised, even if that's not how we use it, whereas LFS requires a single central object store. An example of where this is a problem: we do a lot of development for clients. It's very convenient to be able to mirror the git repo to their Gitlab and git usually makes it very easy. If you have LFS in your project then you have to jump through a lot of hoops.

The second problem is that LFS copies its data exclusively via https. This creates some security complexity if you normally authenticate via SSH.

handrous 2021-08-17 15:11:00 +0000 UTC [ - ]

Both of those could easily be solved if it were part of git proper, right? I don't see why not, or why they wouldn't be.

Regardless, I'm going to be sad if the UX we get to replace it is any of the newish/cutting-edge Git features, because they're damn ugly. In fact, despite how annoying LFS is for being a separate thing and requiring extra effort for server-side support if you're self-hosting, given Github and Gitlab (among others) support for LFS, I don't see things like sparse-checkout replacing it unless they get a lot nicer to use for that purpose specifically.

One thing that might help would be for Git to make distributing & sharing default repo config easy and built-in. Only thing that gets that treatment is the .gitignore. I feel like I need source control for my source control config, sometimes, and, indeed, some software's cropped up to do just that for e.g. hooks, but that ought to be built-in and cover more than that. One nice thing about LFS is that there's zero client-side config, aside from installing & enabling it (globally, if you like). You don't have to pass around lists of LFS-managed files or checkout-patterns or set-up scripts (hope you made them idempotent!) or anything out-of-band.

rkangel 2021-08-17 15:32:44 +0000 UTC [ - ]

> Only thing that gets that treatment is the .gitignore [...] You don't have to pass around lists of LFS-managed files or checkout-patterns

I don't quite get this point. The .gitattributes file is key to the operation of LFS and is modified and passed around in exactly the same way as .gitignore.
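
For anyone who has not looked inside an LFS repo: the tracking rule is one line in .gitattributes (for example `*.psd filter=lfs diff=lfs merge=lfs -text`), and what Git itself stores for a matching file is a small text pointer. A minimal Python sketch of building that pointer, assuming the v1 pointer format from the Git LFS spec; `make_lfs_pointer` is a made-up helper name, not part of any tool.

```python
import hashlib


def make_lfs_pointer(content: bytes) -> str:
    """Build the text of a Git LFS v1 pointer for `content`.

    Hypothetical helper for illustration; the real work is done by the
    git-lfs clean filter. The pointer records the SHA-256 of the blob and
    its size in bytes, while the blob itself lives in the LFS store.
    """
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )
```

The pointer is what travels through normal Git history, which is why the big blobs need a second store and a second transport.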

handrous 2021-08-17 15:49:56 +0000 UTC [ - ]

Good point about .gitattributes files being another kind that gets passed around—but that doesn't help with things the .gitattributes file can't capture, which is most things and was part of my point about why making config natively-sharable would be generally nice and also would help to make things like sparse-checkout more useful; and, crucially, LFS Just Works™ without ever having to touch those files manually.

A good built-in implementation should do exactly what LFS does, yes. But they don't. LFS does, and that's part of why it's nicer to use for its specific purpose than doing something with shallow clones and sparse checkouts and all the rest. The best way to replace LFS in git itself would be to copy its UI as completely as possible, because it's pretty good.

Having to install & enable it is a pain, which making it part of git would fix. Having to maintain separate support for it server-side, and a separate transport mechanism, is a pain, which making it part of git at least could fix if the authors wanted to (in fact, I'd think it'd be more work not to solve that problem, in implementing it, though it'd leave vendors who want the two to be significantly separate to fend for themselves)

swiley 2021-08-17 10:05:29 +0000 UTC [ - ]

IMO: LFS is a hack and you probably shouldn't keep large files in git.

tester34 2021-08-17 10:50:56 +0000 UTC [ - ]

Where should I store large binaries then?

swiley 2021-08-17 10:55:00 +0000 UTC [ - ]

In a binary repo (http/sftp server.) That's literally what LFS does.

maccard 2021-08-17 12:16:49 +0000 UTC [ - ]

So your suggestion is to do what lfs does despite not recommending lfs? How do you keep the versions of the binaries in sync with the source code?

clubdorothe 2021-08-17 08:30:23 +0000 UTC [ - ]

The success of github/gitlab is built on top of git. Are Github / GitLab contributing (money or code wise) to git?

est31 2021-08-17 08:39:56 +0000 UTC [ - ]

According to their github profile, 3 of the top 10 contributors to git list github as their employer. https://github.com/git/git/graphs/contributors

In addition, one lists MS. Not sure if that counts.

If you restrict yourself to contributions since Jan 1, 2020, you'll even find 4 "proper" Github employees in the top 10, and the 1 MS employee.

gitgud 2021-08-17 09:03:49 +0000 UTC [ - ]

It would be a terrible business decision on their part NOT to have any influence on git...

agilob 2021-08-17 09:53:05 +0000 UTC [ - ]

I'm sure it works both ways. If Github was somethingelseHub, somethingelse would be popular. Git started gaining popularity thanks to great UI and communities on github and gitlab.

nbsande 2021-08-17 11:28:25 +0000 UTC [ - ]

That's not strictly true. I'm sure that GitHub contributed to the awareness of git among the less technically experienced, especially in recent years, but use of version control systems has always been the norm in mid to large codebases, and git became popular because it improved greatly on the version control systems that came before it.

Another big factor contributing to its popularity was that the Linux kernel (which was already a pretty big codebase when git came out) also used it to great success.

The only real credit that can be given to GitHub in my eyes is that it allows individuals to more easily host remote repos.

astine 2021-08-17 14:34:45 +0000 UTC [ - ]

Git didn't need GitHub to become popular. Torvalds's name being attached to it is what did that. But GitHub is what gave Git the near monopoly on open source version control that it enjoys today. Once it became more convenient to contribute to open source projects by forking projects on GitHub than to self-host, that created a barrier against using any other version control system. Other version control systems would be much more popular if it weren't for GitHub.

petepete 2021-08-17 12:59:59 +0000 UTC [ - ]

GitHub made it easier than ever for people to share and collaborate on code. The options that predated it were pretty awful by comparison. It's no wonder that within a couple of years most OSS projects had moved over.

GitHub launched in Feb 2008, Git soared and SVN plummeted.

https://trends.google.com/trends/explore?date=all&q=%2Fm%2F0...

throwawayswede 2021-08-17 11:23:42 +0000 UTC [ - ]

Github didn't even feel the need to include contributors' names...

zufallsheld 2021-08-17 08:31:30 +0000 UTC [ - ]

This blog post is from github. They made these changes to git and open-sourced them. So yes, they do contribute back.

ylyn 2021-08-17 08:35:29 +0000 UTC [ - ]

> They made these changes

No, they're just describing the new features... that does not mean GitHub actually wrote these features.

IMTDb 2021-08-17 08:46:12 +0000 UTC [ - ]

First paragraph of the article:

> In a previous blog post, we discussed how GitHub was using a new mode of git repack to implement our repository maintenance jobs. In Git 2.32, many of those patches were released in the open-source Git project

It looks like GH needed a better algo for git repack. Implemented a better algo for git repack. And contributed the better algo for git repack to the open source project. You can now use the better algo for git repack yourself without using GH.

It's fair to say they made the change

vietjtnguyen 2021-08-17 09:33:27 +0000 UTC [ - ]

stoicjumbotron 2021-08-17 17:19:37 +0000 UTC [ - ]

Are there any good articles/resources which explain how to incorporate some lesser known yet useful git command in your workflow apart from the usual pull, push, rebase etc?

svnpenn 2021-08-17 13:38:29 +0000 UTC [ - ]

What I really want from Git, is to be able to have just a single executable for distribution, like `/bin/git` or `git.exe`. Currently Git is a mishmash of C, Shell (yes really), Python, Perl and who knows what else. I found from my own testing that you can actually make a static native build of the core with a few tweaks, but some big chunks of the project are still in other random languages.

Cogito 2021-08-17 14:28:17 +0000 UTC [ - ]

Git is (has been) in the process of rewriting many shell commands etc in C. There are a few reasons for this, but part of the reason is to make Git more portable.

This work is slow, and a lot of the progress seems to be driven by the Google Summer of Code program (sort of like open source interns), so I suspect it will be a while before all the shell scripts are rewritten.

makecheck 2021-08-17 14:07:28 +0000 UTC [ - ]

Actually it is very reasonable to choose the right language for the job, especially for tasks that are not performance-critical.

When you are writing and debugging something new, you don’t want it to take forever to get it into a correct state. When you are maintaining things, you want readable code. And in an open-source project, you might attract more maintainers if you have code that is more accessible (e.g. simple things appear as simple as they are because they’re scripts, and people familiar with those languages can help even if they are not C programmers). It is also more likely to work cross-platform.

I can imagine a lot of things that would be a very simple few lines of script but a mess to write, debug and read in C code. This is especially true if you could have (say) relied on a robust, debugged Python standard library function that has existed for years but instead had to do all of that yourself in C “because the project should only be using C”?

CRConrad 2021-08-17 14:49:30 +0000 UTC [ - ]

> ...but a mess to write, debug and read in C code. This is especially true if you could have (say) relied on a robust, debugged Python standard library function that has existed for years but instead had to do all of that yourself in C...

Because there are no robust, debugged standard library functions that have existed for years in C?

huntie 2021-08-17 15:43:57 +0000 UTC [ - ]

When did C add standard library functions for Vecs and HashMaps? You're putting words in his mouth, no one claimed C lacked "robust, debugged standard library functions". It's no secret that the C standard library lacks implementations of a variety of things that programmers use on a daily basis.

svnpenn 2021-08-17 14:46:42 +0000 UTC [ - ]

The Git project has had at least one full time paid employee (Junio), for over 10 years. In my mind, it's not acceptable that what I'm asking hasn't been done yet. Also, shell is about the worst possible language you could use for a large project. If you want to see for yourself, look at the code for git-bisect. Global state spread across like 5 random shell scripts.

sigjuice 2021-08-17 17:46:54 +0000 UTC [ - ]

Right, how dare that one paid employee not anticipate and address demands from 10 years in the future.

2021-08-17 16:07:36 +0000 UTC [ - ]

handrous 2021-08-17 14:23:38 +0000 UTC [ - ]

Ah, that explains why Libgit2 is so far behind, feature-wise. It'd be so damn nice if that were Git, and the standard CLI just a thin wrapper over it.

Already__Taken 2021-08-17 13:49:54 +0000 UTC [ - ]

git is a beacon of 'do one thing well' and honestly I'm not overly impressed

2021-08-17 10:26:12 +0000 UTC [ - ]

tooltower 2021-08-17 16:07:17 +0000 UTC [ - ]

I didn't know about `git commit --fixup`, having used git as a daily driver for over a decade. That's super-helpful for fixing things up pre-merge.
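
For anyone else who missed it: `git commit --fixup=<commit>` records a commit whose subject is `fixup! ` plus the target's subject, and `git rebase -i --autosquash` then moves it right after the target and marks it `fixup`. Below is a toy Python model of that reordering, matching on subject only; `autosquash` and the sample subjects are invented for illustration, and this is not how Git implements it.

```python
def autosquash(subjects: list[str]) -> list[tuple[str, str]]:
    """Reorder a todo list the way --autosquash roughly does.

    Toy model (assumption): a commit titled "fixup! X" is attached to the
    first earlier commit whose subject is exactly "X". Real Git can also
    match on commit hash and handles "squash!" and "amend!" prefixes.
    """
    todo: list[tuple[str, str]] = []
    for subject in subjects:
        if subject.startswith("fixup! "):
            target = subject[len("fixup! "):]
            # Find the target's "pick" line among commits seen so far.
            idx = next(
                (i for i, (cmd, s) in enumerate(todo)
                 if cmd == "pick" and s == target),
                None,
            )
            if idx is not None:
                # Skip past fixups already attached to the same target.
                idx += 1
                while idx < len(todo) and todo[idx][0] == "fixup":
                    idx += 1
                todo.insert(idx, ("fixup", subject))
                continue
        todo.append(("pick", subject))
    return todo
```

Feeding it `["Add parser", "Add tests", "fixup! Add parser"]` puts the fixup directly under "Add parser", which is the todo list you would see in the editor before the rebase folds the two together.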

vlovich123 2021-08-17 16:03:18 +0000 UTC [ - ]

Does merge-ort also speedup rebase?

stolee 2021-08-17 17:42:51 +0000 UTC [ - ]

Yes! Also cherry-pick and revert.

But specifically, the ORT strategy can cache computed renames across multiple commits being rebased, so the performance benefits are even greater for "git rebase".

WorldMaker 2021-08-17 19:13:36 +0000 UTC [ - ]

The highlights here point out that much of that added work specifically for "git rebase" to use caches across merges is still left to do, but it will be an additional benefit presumably in a future version.
