Git Commit Semi-Short-URLs
As you may or may not know, the KDE git infrastructure currently has two assets for viewing repository data — Redmine and gitweb.
Redmine powers Projects at http://projects.kde.org and is intended (pending theming and greater population of data) to be the real “home” of all of KDE’s projects, putting project information, news, repository browsing and more in a pleasant UI. Each project with a git repository will have a project on Redmine as well as on ReviewBoard.
However, Redmine doesn’t have *all* of the git repositories available. This is because in the short term at least there won’t be projects created for user clones of repositories. In addition, the “scratch” area, where KDE developers can maintain their own repositories for anything they wish (that’s KDE-related of course) will never be put in Redmine.
(The reason for this is that if a developer just wants to play around with some code, or perhaps wants a location to store their (versioned) emacs/vi config files or some such thing, there’s no reason that needs to be an “official project” with an entry in Redmine and ReviewBoard. Similarly, if a developer is writing new code but feels it is still far too raw for others to look at, we believe it should be up to them to decide when it goes into a project on Redmine and ReviewBoard.)
So, let’s say you’ve just pushed some code. Where would you go to see it on the Web: Redmine or gitweb? My first cut at solving this problem was to print a Redmine- or gitweb-based URL, chosen by the location of the repository you pushed to, in the output sent back to your git client. Unfortunately, these were quite long. For example:
http://projects.kde.org/projects/repo-management/repository/revisions/dcd43aacaa806a7a32779a0215b7ab8ed7b05dc8
This isn’t so nice. Especially if your terminal is only 80 characters wide.
So, I came up with a solution — commits.kde.org. It’s a simple Sinatra/Thin-based web application that parses a URL generated by the gitolite hook and forwards you to the correct place. If the repository is on Redmine, it forwards you there; otherwise it forwards you to gitweb.
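The forwarding decision can be sketched roughly like this. This is an illustration in Python, not the actual Sinatra code: the lookup table, the second repoid, the gitweb host and URL layout, and the function name are all assumptions made up for the example. Only the Redmine URL shape and the 99c5fdd6 repoid come from this post.

```python
from typing import Optional

# Hypothetical table: repoid -> (repository path, Redmine project or None).
# How the real webapp stores this mapping is not described in the post.
REPOS = {
    "99c5fdd6": ("repo-management", "repo-management"),
    "0a1b2c3d": ("scratch/someuser/dotfiles", None),  # made-up scratch repo
}

REDMINE = "http://projects.kde.org/projects/{project}/repository/revisions/{commit}"
# Assumed gitweb host and query layout, for illustration only.
GITWEB = "http://gitweb.example.invalid/?p={path}.git;a=commit;h={commit}"


def resolve(repoid: str, commit: str) -> Optional[str]:
    """Return the URL to redirect to, or None if the repoid is unknown."""
    entry = REPOS.get(repoid)
    if entry is None:
        return None
    path, project = entry
    if project is not None:
        # The repository has a Redmine project, so prefer Redmine.
        return REDMINE.format(project=project, commit=commit)
    # Otherwise fall back to gitweb.
    return GITWEB.format(path=path, commit=commit)
```

A web framework handler would then just issue an HTTP redirect to whatever `resolve` returns, or a 404 when it returns `None`.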
These aren’t tiny URLs in the style of bit.ly, but they’re deterministic and not based on a database backend. In fact, you can construct these on your own and they’ll still resolve. The form is
http://commits.kde.org/<repoid>/<commitid>
The repoid can be found by looking in the URL that is spit out when you push your code. It stays fixed for each repository (and if it does change, aliases can be added to keep old URLs alive).
In this format, the above URL shrinks to
http://commits.kde.org/99c5fdd6/dcd43aacaa806a7a32779a0215b7ab8ed7b05dc8
By doing this, the URL size drops from roughly 110 characters (depending on the name of the project, whether it’s on gitweb or Redmine, and so on) to a fixed 72. That is enough to fit on a single line in most terminals (with the “remote: ” prefix that git prepends to the output, it’s 80 characters total), and to make it relatively tweet-/dent-able if you don’t need to include much other information in the post.
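For anyone who wants to check the arithmetic or build these URLs by hand, here is a small Python sketch. The helper name `short_url` is made up for this example; the URLs, repoid and commit ID are the ones shown above.

```python
BASE = "http://commits.kde.org"


def short_url(repoid: str, commit: str) -> str:
    """Build a commits.kde.org URL of the form <base>/<repoid>/<commitid>."""
    return f"{BASE}/{repoid}/{commit}"


commit = "dcd43aacaa806a7a32779a0215b7ab8ed7b05dc8"
long_url = ("http://projects.kde.org/projects/repo-management/"
            "repository/revisions/" + commit)
url = short_url("99c5fdd6", commit)

print(len(long_url))          # 110
print(len(url))               # 72
print(len("remote: " + url))  # 80
```

The 72 is fixed because every part has a fixed width: 23 characters of scheme and host, an 8-character repoid, a slash, and the 40-character commit ID.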
Update: I should mention that you can shorten those URLs further, but how far depends on when you start hitting collisions. If gitweb encounters an ambiguous commit ID, it will 404 without saying why. Redmine, however, will simply return one or the other of the commits, so you may be shown the wrong commit without even realizing it. Worse, a very short URL that works today may stop working (or start pointing at a different commit) once a later commit collides with it. So you can use as many or as few of the hash characters as you want, but I’d stick with at least 8 or so for safety. *Also*, right now the webapp only matches full hash values in Redmine’s database, so if you shorten it you will always get a gitweb URL.
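To see how short a prefix is safe right now, you can compare a commit ID against the other commit IDs in the repository. The Python sketch below is an illustration only: the function name is made up, and the commit IDs in the usage example are invented and shortened to 8 characters for readability. It only tells you what is unique today and says nothing about future commits.

```python
from typing import Iterable


def min_unique_prefix_len(target: str, commits: Iterable[str], floor: int = 1) -> int:
    """Shortest prefix length (>= floor) of `target` shared by no other commit ID."""
    others = [c for c in commits if c != target]
    for n in range(floor, len(target) + 1):
        prefix = target[:n]
        if not any(c.startswith(prefix) for c in others):
            return n
    # Only reached if no prefix in the range was unique.
    return len(target)


# Invented, shortened IDs purely for demonstration.
ids = ["dcd43aac", "dcd4f00d", "ab12cd34"]
print(min_unique_prefix_len("dcd43aac", ids))           # 5
print(min_unique_prefix_len("ab12cd34", ids))           # 1
print(min_unique_prefix_len("ab12cd34", ids, floor=8))  # 8
```

The `floor` argument is there so you can enforce a minimum (such as the 8 characters suggested above) even when a shorter prefix happens to be unique at the moment. Inside a checkout, `git rev-parse --short <commit>` gives a similar currently-unique abbreviation.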
For anyone interested, the current webapp code is GPLv2+ and can be found right here.
For more on personal clones on git.kde.org, you may peruse the new manual: http://community.kde.org/Sysadmin/GitKdeOrgManual
You don’t have to use all 40 digits to identify a commit; the first few are enough as long as they are unique to your commit.
E.g.
http://projects.kde.org/projects/repo-management/repository/revisions/dc
or
http://commits.kde.org/99c5fdd6/dcd4
I’m aware of this, but do you know of a way to find out exactly how many characters are needed to guarantee uniqueness?
If you give Redmine too few characters it pulls out a random commit; gitweb 404s.
Also, not just the shortest currently unique value, but one that will stay unique in the future?
AFAIK there’s no exact number. That’s the issue with a hash value. The kernel guys seem to work with 7 “digits”, at least git does so in some places, see e.g. “git reflog”.
Right…I know that 7 or 8 is probably enough, but you can never really know.
Note that you *can* shorten it to whatever you want. Both tools will work with shorter URLs. The problem is that if it’s too short, while gitweb will 404 (so you know there’s a conflict, although it’s not obvious *why* it 404s), Redmine simply picks one of the commits and shows it. So you could think that the URL points to one commit when it’s really pointing to another.
I’ll update the post to note that you can shorten it if you wish but that there are caveats.
This is great. Having commits available via short URL is a great feature.
Would it be possible to make them even shorter? The repository id is strictly not needed, is it? Might be a challenge to implement, though. But from a user’s point of view…
Maybe it also would be possible to use a human-readable id for the repository instead of the numeric id. This way the URLs would not only be short, but pretty as well.
It is, because commits in clones of repositories will have the same commit ID. So it’s not specific to the repository. In Redmine *currently* it’s not strictly necessary because clones aren’t in there — but that’s planned to change at some point. It’s true that this means that you should be able to look at that commit regardless of which repository it’s in since it will be the same, but the sysadmin team didn’t like this idea because it means if you then want to browse other commits in that repo, you might be in an entirely different repo than you wish to be.
Feel free to come up with a workable scheme that is as short as 8 characters (in fact, we could probably use only 6 or 7 characters with the repository ID if we want to). Keep in mind that repos can have dashes and periods in the names, and any two user IDs can have the same repository name, and any user ID can have the same repository name in two different clone namespaces. Also, this should be able to be done automatically. I don’t think any such scheme that actually stays short will be pretty, and any scheme that is pretty won’t stay short. (The commit ID is already a hash…)
We’re already planning to eventually support aliases, so that if a repository moves (and thus gets a new repoid) we can support old URLs. For “official” repositories we could use short, human-readable values (amarok, konvi, kdelibs) but this won’t work for clones or items in the scratch space.
When I was playing around with this yesterday the repoid wasn’t needed. Did this change, or are you not advertising this feature (since it might fail sometimes)? Seems like for any given commit there would either be only one repo that it’d be in, or multiple repos with one repo clearly more equal than the others (e.g. amarok mainline vs. my personal work repo).
Anyways awesome work.
The repoid isn’t strictly needed with any repository in Redmine as those are preferred over gitweb when possible, and there are currently no clones in there. It *is* needed for gitweb because there’s no database to look up the commit id in, and the hook would spit out a different URL depending on whether the end target was gitweb or Redmine. Right now you could omit it and it would work (I think, unless I took that out).
The problem with “one repo more equal than the others” is that one repo being more equal doesn’t mean you didn’t want the URL spit out for your commit to link to your own repo, for instance so that you can then browse around from there and see the commits you expect in your own clone. This isn’t a problem for Redmine yet, but when clones get up on there it will be.