jefferai
I work for MIT doing all sorts of stuff and I hack on things in my free time, mostly in C++/Qt. I mainly work on KDE, where I do coding, sysadmin, community, event organization, security and promotion work. I also do a lot of development for Tomahawk Player. I love FOSS, and like most geeks I also love photography.
Home page: http://jefferai.org/
Jabber/GTalk: jeff@jefferai.org
Posts by jefferai
Speed never gets old. At least, in software.
As my regular readers (such as that may be, considering how rarely I post) may know, I’ve been doing all sorts of things to speed up scanning your files into Amarok’s local collection. I have some really nice news on that front: a few goodies that will be very useful to you, especially if you have very flat collections. These improvements are in the forthcoming 2.2.1.
First up, and the lesser of the two: If a directory is encountered multiple times during a scan, the scanner will now scan it only once. There are a few corner cases where this could happen (for instance, if you have a top-level directory specified as well as one of its subdirectories, and the subdirectory’s name changes, changing both its and its parent’s mtime). Not too useful for most cases, but it was implemented while working on…
Second, and the biggie: When doing an incremental recursive scan, Amarok will now no longer scan subdirectories whose mtimes have not changed. This is big news if you have a large, flat collection (which I dislike myself but some really enjoy…to each their own). For instance, in a setup like 3,000 files in your main folder, plus a few thousand files each in a few subfolders for a total of 10,000 tracks, if you added a single file to the top-level folder and had incremental recursive scanning turned on (the default), you’d cause a rescan of your entire music collection — all 10,000 tracks or so. Now, if you add a single file to the top-level folder, you’ll only cause a rescan of 3,000 tracks…which is still a lot, but even more of a reason to use a hierarchy.
This should help scanning time even for those with hierarchies, as adding a new album to an artist won’t cause the artist’s other album folders to be rescanned, or adding a new artist to a genre won’t cause all the artists and albums in that genre to rescan.
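A rough sketch of both ideas, in Python rather than Amarok's actual C++: each directory is judged on its own mtime, and a directory reached through two overlapping roots is only looked at once. The function name and the `known_mtimes` dictionary are made up for illustration; the real scanner talks to Amarok's database instead.

```python
import os


def directories_to_scan(roots, known_mtimes):
    """Return the directories whose mtime differs from the last recorded scan.

    roots: top-level collection folders (they may overlap).
    known_mtimes: hypothetical dict of directory path -> mtime from the last scan.
    """
    seen = set()   # directories already visited in this run
    changed = []
    for root in roots:
        for dirpath, _dirnames, _filenames in os.walk(root):
            real = os.path.realpath(dirpath)
            if real in seen:
                continue  # reached again through another root: skip it
            seen.add(real)
            # Compare this directory's own mtime, not its parent's.
            if known_mtimes.get(real) != os.stat(real).st_mtime:
                changed.append(real)
    return changed
```

The walk still descends into unchanged folders, because a change deep in the tree only touches the mtime of the folder it happened in. What gets skipped is the expensive part: reading the tags of every file in a folder that has not changed.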
It’s really the proper way to do things, but it required some changes to the data sent between Amarok proper and the collection scanner, so it had to be done carefully (as far as I am currently aware I didn’t cause any regressions). Regardless, I’m glad to have it in there and working, and hopefully you will be too.
Say goodbye to history
And by that, I mean say goodbye to an historical (read: old) bug.
I’d heard these whispers recently about “whenever I add or remove an album from the collection and do an incremental update, my collection gets messed up.” Okay — we’ve heard these whispers for a long time, but there were so many other things that had to be fixed first that it wasn’t clear whether this was a symptom of a different problem or a problem of its own…and it could always be fixed with a full rescan.
However, with the collection being much more solid these days and with this being the only super-visible bug left that I knew of, I decided to tackle this. It turns out that, once the rest of the collection was behaving, this wasn’t that hard to find…it just took a lot of debug tracing, because it wasn’t obvious. The cause was the wrong field being used in a DB query to remove some tracks during an incremental scan, so with anything other than a very tiny test collection, what was happening quickly stopped making any logical sense.
The reason I say the bug is historical is that the code causing it has been there since May 2008. While admitting that provides some fodder for naysayers and haters to harp on Amarok for such visible and longstanding bugs, I prefer to take the other approach: it means that all the various issues users have found since the total rewrite that was 2.0 *are* being found and *are* being solved. Some of them just take some time; it’s a *lot* of code. But the proof is in the 2.2 ChangeLog.
I managed to sneak in the fix a day before 2.2 was tagged, so when Amarok 2.2 comes out scanning (both full and incremental) should be really quite solid. In fact, since this has been fixed in git, I’ve yet to hear of any more problems, just lots of happy users. It’s just another way that the (very close!) 2.2 is going to *rock*.
AFT and MusicBrainz track identifiers, redux
A bit ago I blogged about how Amarok File Tracking can now use MusicBrainz identifiers to do its stuff.
Then, a little while later, I started getting bug reports of peoples’ music disappearing from their collection, and requested some of the reporters send me some files. One of the users did so, and I found something curious in his tags (if I had a penny for every time I’ve personally seen users have something odd or strange in their tags, I’d have…well, a few dollars at least). Several of his files had full MusicBrainz tags — with absolutely no data populating them, meaning that the MusicBrainz identifier (and all other MB data) for all of those files was ending up the same (blank) and Amarok was thinking them the same file.
It was a quick fix (use generated non-embedded AFT IDs when the MB tags are empty) but just adds to the evidence that you can never, ever trust users’ tags. Also, that users that use your Git-based version or betas really rock for finding this stuff before release…so in case I don’t say this enough: thanks users!
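The shape of the fix, as a minimal Python sketch: use the MusicBrainz identifier only when it actually contains something, and otherwise fall back to a generated ID. The function name, the ID prefixes, and hashing the file bytes with MD5 are all assumptions for illustration, not what Amarok's C++ code literally does.

```python
import hashlib


def choose_unique_id(musicbrainz_track_id, file_bytes):
    """Pick a tracking ID, falling back when the MusicBrainz tag is blank."""
    mb_id = (musicbrainz_track_id or "").strip()
    if mb_id:
        # A populated tag: every copy of this song shares the same ID.
        return "mb:" + mb_id
    # An empty or whitespace-only tag: derive an ID from the file itself,
    # so two untagged files no longer collapse into one.
    return "generated:" + hashlib.md5(file_bytes).hexdigest()
```

With this, two files carrying empty MusicBrainz tags get different IDs, while two files with the same real MusicBrainz ID still get the same one.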
AFT and MusicBrainz track identifiers
A heads-up: Amarok File Tracking can now use MusicBrainz track identifiers for its embedded IDs. This means people that have used Picard to tag their files but not amarok_afttagger can still get some embedded AFT goodness! It also enables an interesting "mode" because it essentially enables song tracking vs. actual file tracking (which you may or may not want, depending on your particular needs).
Full details are here.
Presenting the KDE network on Facebook
Many KDE developers are on Facebook. A while back I wondered if it would be possible to have an official KDE developers’ network on Facebook. After all, there are networks for schools, jobs, cities, and more (and for many developers, KDE is literally or figuratively a job…)
As it turned out, there was a "Kde" network — but something was odd. To join a work network you have to have an email address affiliated with the network. KDE owns kde.com and kde.org — so who was this? The only other "KDE" I could find that seemed like it would be legit was the Kentucky Department of Education, and I rather doubted it was them, because they would likely have used all-uppercase KDE as well. So I started an inquiry with Facebook, trying to figure out if either it was someone squatting on our name (and trademark) or whether it was some legit organization — in which case, would they mind donating the network to us?
After several months of back-and-forth with the people at Facebook, who were very nice (if a bit slow), I’m happy to say that we’ve regained the KDE network (properly capitalized) as our own. I still don’t know the whole story as to who was there before, and never will due to their privacy policies, but I’ll say this:
- If you were in the "Kde" network before and Facebook asked if you would mind donating it to us, and you did, thanks so much!
- If someone was simply squatting in the "Kde" network before, then thanks, Facebook, for kicking them out!
To join the network, go to Settings -> Networks, and enter KDE and your kde.org email address in the appropriate fields.
DB changes — call for benchmarkers!
I’ve done some work in trunk over the past week that may have a huge impact on many of you Amarokers. Read on, and if you can do some benchmarks for me, fantastic.
First, the schema/table changes.
- We’ve seen some issues where people have, for whatever reason, ended up with InnoDB tables instead of MyISAM tables. This is probably the result of their DB being created long ago, before we were explicitly telling the MySQLe startup to skip InnoDB. This mainly causes a problem because some columns cannot be as wide as we’d like them to be when using InnoDB. So, the first thing being done is that an ALTER TABLE is being forced on every table to explicitly convert to MyISAM. In addition, ENGINE parameters are now used during table creation to be more explicit in the future.
- Some of you might have seen complaints in the debug output about indexes not being able to be created due to a max key length, which by default in MySQL is 1000 (compile-time option). So, some columns have had their widths adjusted so that all indexes are now successfully created.
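To see why column widths matter for that limit, here is a back-of-the-envelope check in Python. It assumes the tables use MySQL's 3-byte utf8 character set and ignores the small per-column overhead MySQL adds, so treat the numbers as approximate; the helper names are made up.

```python
MAX_KEY_BYTES = 1000  # the default limit mentioned above


def max_indexable_chars(bytes_per_char, max_key_bytes=MAX_KEY_BYTES):
    """Widest single text column (in characters) that fits in one index."""
    return max_key_bytes // bytes_per_char


def index_fits(column_widths, bytes_per_char, max_key_bytes=MAX_KEY_BYTES):
    """True if an index over columns of these character widths stays in the limit."""
    return sum(column_widths) * bytes_per_char <= max_key_bytes
```

Under those assumptions a single indexed column tops out at 333 characters, and an index over two 255-character columns would need 1530 bytes, which is over the limit.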
Now, the other changes:
As we added more features, scanning got slow. Like, really slow. You’d spend more time running SQL queries than actually scanning your files. So I’ve been aiming to change that.
Over the past week I’ve committed changes that remove, per track, anywhere from 1 to 6 SQL queries. The exact amount is highly dependent on your file set, but there is a minimum of one less SQL query per track. If you’ve done a lot of file moves and AFT kicks in, it’ll be an even more massive speedup. I’m going to try to do some further tuning, but already results are looking positive.
Nikolaj has reported that his scan time went from 68 seconds to 18 seconds — more than 3x faster. Mikko didn’t notice a speedup, but he said that whereas scanning used to peg his CPU at 100%, it no longer does so. What I want to know is: how does this affect *you*?
If you want to help, do the following:
1. Back up your DB. If you’re using external MySQL, do a mysqldump; if you’re using internal MySQLe, back up the mysqle folder in the Amarok data directory.
2. Update to a revision from a week ago…say, 995000.
3. Wipe your DB.
4. Start Amarok. It will do a full scan because of the empty DB. Time it as it does the scan.
5. Repeat steps 3 & 4, so that you can see what the time is like after caching.
6. Update to current trunk (at least 998470).
7. Repeat step 3.
8. Repeat steps 4 and 5.
Then leave a reply here with your values. If you watch your CPU during each of the scans, report that here too. Thanks!
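If you want your numbers in a consistent format, something like this small Python helper would do. The function names and the report format are just a suggestion; the example figures are Nikolaj's from above.

```python
def speedup(old_seconds, new_seconds):
    """How many times faster the new scan is than the old one."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return old_seconds / new_seconds


def format_report(label, old_seconds, new_seconds):
    """One line suitable for pasting into a reply."""
    factor = speedup(old_seconds, new_seconds)
    return f"{label}: {old_seconds:.0f}s -> {new_seconds:.0f}s ({factor:.1f}x)"


print(format_report("full scan", 68, 18))  # full scan: 68s -> 18s (3.8x)
```

Reporting the first (cold) run and the repeated (cached) run separately makes the numbers easier to compare between people.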
More Info on Gitorious.org
Today at the Akademy General Meeting, it was mentioned that Gitorious.org is being seriously looked at as a hosting solution for our Git repositories (as opposed to running an instance of Gitorious ourselves). Since I have been a major part of pushing in that direction, I feel that it would be prudent to make sure that those interested are aware of the relevant discussion and the current status. So, for those interested, read on.
Please note that this is *not* a post about why KDE is migrating to Git, why this is a good idea/bad idea/neutral idea, etc. This is purely discussing the hosting aspect of Git.
First, I would encourage you to read this kde-scm-interest mail, which I sent to the list on July 2nd. It goes into a good amount of depth as to why Gitorious.org could be beneficial for us, and the rest of this post will assume that you have read that email and the others in the thread, as it will simply update the information therein.
On Monday a large group of interested people, including KDE sysadmins and the guys from Shortcut AS, went to lunch to discuss the technical issues. The output from that discussion is as follows:
- The vast majority of those present feel that Gitorious.org would be the best choice, with the following being the main reasons:
  - Shortcut could provide an SLA (Service Level Agreement) guaranteeing a minimum level of service, such as uptime and available bandwidth, providing professional hosting services and easing the burden on our system administrators.
  - As David Faure noted, user account creation is becoming a large burden on our system administrators, which is not something that we would have to administer if using Gitorious.org.
- It should be noted that the above was not a unanimous opinion.
- KDE does have infrastructure and bandwidth; it could keep one read-only Subversion server available for historical reasons, and convert the rest to serve as backups or possibly load-balancers. Or to put it in a more general fashion, KDE can reduce hosting costs (which will likely be covered by sponsors) by working with Shortcut. It is not a question that this could be done, but rather what the right method would be for doing so.
- The Gitorious developers have a feature branch where they have already fixed one or both of the current showstopping bugs relating to rights within the shared Git repository. They have said that this should be merged into mainline within a week (not sure if they meant a week from then, or from the end of GCDS).
- The hosting could be set up in such a way that it can be accessed via git.kde.org.
- Post-commit-hook functionality will be available; the Shortcut guys are currently working with us to determine how we can migrate or emulate pre-commit-hook functionality.
We have two projects that are chomping at the bit to get onto Git ASAP: Amarok and TagLib. Amarok will be converted first and will serve as the initial guinea pig to iron out any issues. Barring any major issues being found, TagLib will be converted in short order.
I hope this gives everyone a better idea of KDE’s Git-hosting plans. If you haven’t checked out Gitorious.org, I encourage you to do so; it’s made huge leaps and bounds in the past six months and has become quite a great tool.
Please direct any questions or feedback to the kde-scm-interest mailing list at: kde-scm-interest at kay dee eee dot ooo arr gee, not to the comments section on this blog.
AFT fixed on the Playlist
Yes, another one of my semi-habitual posts about AFT. Just a short one though.
In revision 992942, I finally fixed a bug that has kept AFT from working for the playlist in certain situations (although it had previously been working for both saved user playlists and statistics). This means that if you have a track in the playlist, move it to another location, and it is then scanned in that new location (remember, kids, it uses folder mtime to determine whether to scan a folder, so when in doubt do a "touch ."), the track in the playlist should remain valid and play the song in the new location. As the playlist use case was one of the initial reasons for the development of AFT back in Amarok 1.4, you can imagine I’m happy that it’s finally (seeming to be) working again in all scenarios, instead of failing in certain situations.
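For anyone wondering how a playlist entry can survive a file move, here is the idea boiled down to a few lines of Python. The class and method names are invented for illustration and have nothing to do with Amarok's real C++ classes: the playlist holds on to the track's unique ID, and the collection maps that ID to wherever the file was last scanned.

```python
class Collection:
    """Toy stand-in for the collection: unique ID -> last scanned path."""

    def __init__(self):
        self._path_by_uid = {}

    def record_scan(self, uid, path):
        # A later scan of the same ID simply overwrites the old location.
        self._path_by_uid[uid] = path

    def resolve(self, uid):
        # Returns None if the ID has never been scanned.
        return self._path_by_uid.get(uid)


collection = Collection()
collection.record_scan("uid-1", "/music/old/song.mp3")
playlist = ["uid-1"]  # the playlist stores IDs, not paths

# The file is moved and the new folder is scanned.
collection.record_scan("uid-1", "/music/new/song.mp3")
current_paths = [collection.resolve(uid) for uid in playlist]
```

After the second scan, `current_paths` holds the new location, so the playlist entry stays valid without the playlist itself being touched.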
MySQL Server Support — Promised and Delivered
We told you it was coming. Sure, that was a while back, so you probably thought we forgot about it. Or maybe you thought we were simply playing politics, tossing empty promises to our users.
Well…you were wrong.
It may be a bit later than planned — we wanted to have it in time for 2.1, but it didn’t happen — but as of revision 984572, there is now support for storing an Amarok database on a MySQL server instead of the embedded MySQL database. There’s no configuration dialog in the GUI yet, but it’s pretty simple to set up, as explained below. All you have to do is add a few things into your amarokrc file and make a valid user on the MySQL server instance of your choice — you don’t even need to create the database yourself. (In fact, you shouldn’t — you should let Amarok create the database so we can ensure that the character set and collation are set right.)
Here’s how to do it.
- Update to at least r984572 (of course, updating to the latest revision is probably your best bet).
- Wipe your build dir clean and rebuild. Not necessarily necessary, but as 47 files were changed in that commit, it’s not a bad idea.
- After install, run kbuildsycoca4 --noincremental, just in case.
- On your MySQL server, run a command like: "GRANT ALL ON amarokdb.* TO 'amarokuser'@'localhost' IDENTIFIED BY 'mypassword'; FLUSH PRIVILEGES;" Be sure to substitute for "amarokdb", "amarokuser", "localhost", and "mypassword" as appropriate.
- Open up your amarokrc file, usually in ~/.kde4/share/config/amarokrc. Add a [MySQL] section:
[MySQL]
UseServer=true
Database=amarokdb
Host=localhost
Password=mypassword
User=amarokuser
- Close the file and start Amarok. It should create the database and start a scan of your files. If you want to switch back to the embedded collection, simply set "UseServer" to false.
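If you want to sanity-check the section before starting Amarok, a few lines of Python with the standard configparser module will do. This parses a copy of the snippet above rather than a real amarokrc, since a full KDE config file may contain entries configparser does not like; the function name is made up.

```python
import configparser

SAMPLE = """\
[MySQL]
UseServer=true
Database=amarokdb
Host=localhost
Password=mypassword
User=amarokuser
"""


def read_mysql_settings(text):
    """Parse the [MySQL] section and return the values Amarok is told to use."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["MySQL"]  # raises KeyError if the section is missing
    return {
        "use_server": section.getboolean("UseServer"),
        "database": section["Database"],
        "host": section["Host"],
        "user": section["User"],
    }


settings = read_mysql_settings(SAMPLE)
```

A missing section or a misspelled key shows up as a KeyError here, which is easier to spot than Amarok quietly falling back to the embedded database.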
Pretty easy! Be sure to let me know if you have problems — file a bug and assign it to "mitchell" at a domain of "kde" plus a dot plus "org".