Licensing to Kill
Could FLEXlm be one of the world’s worst-designed programs?
They’ve just rechristened it FLEXnet Publisher, and I can only think it’s to try to get away from existing FLEXlm stigmas. It’s so bad that according to a VMware engineer I spoke to on the phone a few months ago, in the next release of ESX (whatever the new name is going to be) they’re ditching it to go back to their own serial-number based scheme, entirely because of a large amount of hugely negative customer feedback. This is only one major release after they switched to it.
Let me describe the structure of it (if I were to go into the user interface issues, this would become far too long a post, so maybe another time). There’s a management daemon, called lmgrd, and vendor daemons. These vendor daemons are (often along with the management daemon) provided to you, often without an installer and simply as a bunch of files. There’s also quite often a node-locking generator (based on things like hostname, MAC address, and all manner of things easily faked in a VM) provided to you by the vendors, which as far as I can tell is different for every vendor. This is possibly so that they can build in their own criteria, but more likely because the FLEXlm people are lazy and can’t be bothered to provide a comprehensive set of criteria on their own.
Now, why is there a vendor daemon and a management daemon? Good question, and I don’t know the authoritative answer, but from using it I can make an educated guess or two: either to allow vendors to validate some part of the licenses in their own manner, or to simply confuse and annoy the hell out of you. See, the different vendor daemons are built against some version of the FLEXlm software, and they may, or may not, have been built against the same version. This may, or may not, cause problems, including crashes. It may explain why every time the license server reboots (thanks, generally, to automatic Windows updates), lmgrd fails to start up successfully.
Now, you can run multiple instances of lmgrd — if you can figure out how to get it installed so that it starts up on bootup, since there isn’t much in the way of help in the official help PDF, and only one of the three vendors whose licenses I deal with actually provided an installer (thanks, VMware, I’ll be using your installer long after you’ve switched off of FLEXlm). This lets you have each license under a different copy of lmgrd, perhaps so that you can attempt matching versions, but mainly so that if one of the lmgrd instances crashes, it will only take down that license. (Actually, it’s because in this configuration you need to run them on multiple machines, so if one machine goes down, most of your licenses stay up. But I know what they really meant.)
Alternately, you can combine licenses into a single file, a process that the manual warns is time-consuming and error-prone.
Finally, you can simply have all your licenses be handled by a single lmgrd instance. Now, this isn’t really a bad idea, except for port numbers.
See, each vendor daemon needs to run on its own TCP port number. You can specify the port number in the license file, except when you can’t because some vendor hardcoded it into their software. If you don’t specify a port number in the license file, the vendor daemon will be given a port number starting at 27000 on up, giving you an address like 270...@mylicenses.thissucks.com. Now, because everyone likes autoconfiguration, if in your program you leave out the port number (giving you something like @mylicenses.thissucks.com), then it will automatically search ports 27000 through 27009 for a matching vendor daemon. Unfortunately, I haven’t seen a single vendor product that doesn’t puke on such an address, claiming it is invalid because of faulty validation, even when the FLEXlm manual in front of me says it’s perfectly legal.
If you don’t then modify your license files to specify ports (which you have to remember to do every time you get a new license file), then you better be aware of when the license server reboots, because the ports are given out in the order that the services come up. So if you’re in my situation, where the first of the vendor daemons always crashes the first time it tries to load on bootup, then the other two will have their normal ports decreased by 1.
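To make the port shuffle concrete, here is a toy Python model of the behaviour as I described it above. It is not FLEXlm code: the daemon names are made up, and the allocation rule (first free port from 27000 upward, in start-up order) is just my reading of what I observe on my own server.

```python
# Toy model of the port behaviour described in the post; not FLEXlm code.
FIRST_PORT = 27000  # first default port mentioned in the post


def assign_ports(daemons, pinned=None):
    """Give each daemon that starts a port, in start-up order.

    `daemons` is a list of (name, started_ok) tuples in start-up order.
    `pinned` maps daemon names to ports fixed in the license file.
    """
    pinned = pinned or {}
    taken = set(pinned.values())
    assigned = {}
    next_port = FIRST_PORT
    for name, started_ok in daemons:
        if not started_ok:
            continue  # a daemon that crashed on start-up never claims a port
        if name in pinned:
            assigned[name] = pinned[name]
            continue
        while next_port in taken:
            next_port += 1  # skip ports reserved by pinned daemons
        assigned[name] = next_port
        taken.add(next_port)
        next_port += 1
    return assigned


# Hypothetical daemon names, for illustration only.
clean_boot = assign_ports([("vendor_a", True), ("vendor_b", True), ("vendor_c", True)])
bad_boot = assign_ports([("vendor_a", False), ("vendor_b", True), ("vendor_c", True)])
pinned_boot = assign_ports(
    [("vendor_a", False), ("vendor_b", True), ("vendor_c", True)],
    pinned={"vendor_b": 27001, "vendor_c": 27002},
)
```

In this model, `bad_boot` puts `vendor_b` and `vendor_c` one port lower than `clean_boot` does, while `pinned_boot` keeps them where the clients expect them. That is the whole argument for editing the ports into every license file.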
I haven’t even gotten to things like functions to reread license files that don’t actually reread them, a UI that doesn’t tell you if a server is running or stopped until you hit the buttons to try to run it or stop it, a status window that displays a large amount of license data in a tiny, non-resizable window, and more. The product feels like it was developed in 1988 (it was) and is a study in market leader stagnation. I would be very surprised if they have a single developer left and haven’t fired everyone to just sit there and watch the money roll in.
If license servers are necessary for some product you’re creating, there are other license servers out there that simply run on a well-known port and don’t require any of this idiocy. Unfortunately, in my experience I’ve seen more companies switch to FLEXlm than away from it. I hope this isn’t an industry-wide trend.
Where’s the KDE angle? There isn’t one specifically, other than this: if someone in the KDE community thinks this is a good design, or can see themselves designing such a scheme, please just leave. Now. (Don’t worry, I’m sure this doesn’t apply to any of you.)
Amarok Power User Feature: Batch-mode collection scanning
A long-requested feature has been a way to decouple Amarok’s collection scanning from its GUI. There are various use cases for this. For one, it can actually help us with debugging, by allowing us to control the inputs into the scan parser. For another, many people have all of their music stored on a single machine, and would like to do the scanning locally where it’s fast, instead of on, say, the laptop running Amarok, where it’s over wireless and slow.
Yesterday and today (as of r933010) I put half of the solution into trunk. I say half, because full collection rescans are now supported in batch mode, but I am still working on the methodology for incremental scans (I have a few ideas, but have to sort out which is the most reasonable/doable/makes the most sense). Below, I’ll explain how to do it. Keep in mind that this is designed to be (lightly) scripted, not done by hand…so it can be done by hand (which I did during testing) but it has some safeguards in place so that if you script it, and forget about it, Amarok still works normally. However, there are actually some interesting things you can do now if you script the scanner…
One important thing to note: the scanner requires various bits of Amarok code, mainly centering around the extra taglib plugins we support. So you’re unlikely to have it work on a machine that doesn’t have A2 installed, and certainly won’t be able to compile it without the rest of the Amarok source, although you may be able to get it to work in a binary-only fashion with just kdelibs and taglib…YMMV.
So, here’s the flow:
- Run amarokcollectionscanner with the -b or --batch option. Although this doesn’t directly modify the output, it sets some internal flags (some for safety, some to enable other options). The output goes to stdout; save this output as amarokcollectionscanner_batchfullscan.xml (yes, it must be that specific filename). You’ll probably want to use the -r flag too…check --help for more options.
- Cool feature: if your scan is of files that have a different mount point on the local machine than the machine that will be using the output, add the --rpath option and give it the mountpoint of the directory on the consuming machine. For instance, if on the local machine your music is at /opt/music, and this directory (music) is mounted as /mnt/music on the remote machine, run amarokcollectionscanner from /opt and use --batch --rpath /mnt/music.
- Cool feature: You can scan multiple directories and simply append the output to the end of the file from step #1. Yes, directly append, including the xml version/encoding header. Amarok’s scan manager will detect and handle this. You can even scan directories that are not defined as collection directories in Amarok (these entries will work fine, but the next time you do a full rescan from within Amarok without using the saved output, they will be gone).
- Place the amarokcollectionscanner_batchfullscan.xml file in your Amarok data dir. Usually, this is ~/.kde4/share/apps/amarok.
- Trigger a full rescan from inside Amarok (in Settings->Collection). It should be super-fast.
Note that this step deletes the amarokcollectionscanner_batchfullscan.xml file. This is on purpose, so if you want to keep re-using it, you’ll have to keep copying it over.
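Since this is meant to be scripted, here is a rough Python sketch of the flow above. The only scanner options it uses are the ones mentioned in this post (--batch, -r, --rpath); the helper names are my own invention, and passing the folders to scan as trailing arguments is an assumption you should check against --help.

```python
import shutil
from pathlib import Path

# The file name Amarok looks for, as given in the post.
BATCH_FILE_NAME = "amarokcollectionscanner_batchfullscan.xml"


def scanner_command(folders, recursive=True, rpath=None):
    """Build the argument list for a batch scan (helper name is hypothetical)."""
    cmd = ["amarokcollectionscanner", "--batch"]
    if recursive:
        cmd.append("-r")
    if rpath is not None:
        cmd += ["--rpath", rpath]
    # Assumption: folders to scan are passed as trailing arguments.
    return cmd + list(folders)


def append_scan_output(batch_file, scan_xml):
    """Append one scan's stdout to the batch file, XML header and all."""
    with open(batch_file, "a", encoding="utf-8") as out:
        out.write(scan_xml)


def install_batch_file(batch_file, amarok_data_dir):
    """Copy (not move) the batch file, since Amarok deletes its copy after the rescan."""
    target = Path(amarok_data_dir) / BATCH_FILE_NAME
    shutil.copyfile(batch_file, target)
    return target
```

To actually run a scan you would feed the command to something like `subprocess.run(scanner_command(["music"], rpath="/mnt/music"), capture_output=True, text=True)` and pass its `stdout` to `append_scan_output`. Copying rather than moving the file means you keep a master copy to reinstall before the next rescan.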
As you can see, it’s designed for scripting, but it’s not terribly complicated even to do by hand. I’ll post more as I get incremental scanning working, and will at some point put a Wiki page up that has all of these instructions in a less transient form.
Camp KDE videos: Come and get ‘em
After a month’s delay, they’re done.
In the interim (after the delays mentioned in my last post on this topic), my poor, underpowered desktop machine has endured transcode after transcode (from the original source material) and my Internet connection upload after upload as I tried to figure out just why Blip.tv wouldn’t work with X or Y. (In fact, I can quantify these: X is Vorbis/Theora, which produced awful audio and very desynchronized audio/video upon their conversion to .flv; Y is a lot of things related to the original anamorphic encoding of the videos, which Blip.tv can’t handle, and finding the right combination of settings and flags and adjustments to make the aspect ratios come out so that everyone didn’t look like Gumby®.)
This was followed by a few days of uploading; Blip seems to max out uploading speeds somewhere between 100kbit and 200kbit, so uploading almost 6GB of data, one chunk at a time, took a bit.
The end result is thirteen videos, each Xvid-encoded (at least it’s OSS, although patent-encumbered…see Vorbis/Theora problems above) with a Matroska container. You can get to them through my show or to individual videos directly (these might not be in strict as-presented order):
- Welcome!
- Diversity in KDE
- Akonadi
- KDE and Business Software
- KDE4 and MS Windows
- libplasma in Applications
- Accelerating Graphics
- CMake / CTest / CPack
- Taming the Leopard
- Bringing the Free Desktop to the Mobile World
- KDE & Distros
- KDE-Games, KDE-Edu, and Avogadro
- A Case for Open-Source Coders in the Enterprise
Enjoy!
#kde-soc
Just a note to make people aware that #kde-soc exists. There’s basically no activity in there right now, which is rather expected, but as SoC-related activities pick up steam so will the channel.
Qt 4.5 RC1 packages for Intrepid, built with -graphicssystem raster
Update: These packages are for testing only! Things that rely on the default native backend WILL break. If you want a build of Qt 4.5 RC1 for Intrepid without -graphicssystem raster, try pollycoke’s PPA.
The title says it all. They’re available from my PPA. Enjoy!
Where *are* those Camp KDE videos anyways?
Hello there. I have good news and bad news regarding the Camp KDE videos.
The Bad News
I’ve run into a series of setbacks regarding getting them prepared for posting. The first was that the Monday I got back from Jamaica, I had the following schedule at work:
- 9:15 AM: Check email
- 9:16 AM: "My details have been confirmed for my upcoming trip? What trip?"
- 2:00 PM: At the airport awaiting departure, thinking that I *knew* I should have checked work email sometime over the weekend…
Regardless, I did manage to have some of the files with me, and did actually get a few of the videos done during that week that I was away. I then got held back a bit waiting for the videos on Wade’s camera to arrive; they were 16GB or so in total, and so he had to fetch them off the tapes, encode them, and get them uploaded to me. After that, I started to put Wade’s videos together with slides, only to find out that the videos would not properly work inside kdenlive. Or so I thought. After a period of long transcodings that I performed during my free time over the last few days, I found out this morning that the issue with the videos was a massive dose of PEBKAC.
There’s only one other piece of bad news, which is that due to a bug in (ffmpeg? libmlt?) kdenlive’s Ogg/Theora export is borked. So, I’ve ended up encoding in XviD, which while using a patented algorithm is at least otherwise free software. I’ve thought about transcoding everything to Ogg/Theora afterwards, and I may end up doing so before posting them up, if the quality is okay. And before you ask: it didn’t seem right to post them up without having Wade’s keynote up there to kick things off, which is why I’ve not posted some of the ones that are already done up before now.
The Good News
The good news is that 11/13 of the videos are done, and I foresee no obstacles in getting the rest of them to behave. So I am hopeful that I can start getting them up to Blip.tv within a couple days. Thanks for bearing with me.
Oh god: I had to use VBA
I’m working on putting together the Camp KDE 2009 videos using the excellent Kdenlive, a non-linear video editor whose name is, well, an acronym for KDE Non LInear Video Editor. More on Kdenlive in a later blog post from me and/or Wade, but trust me: with the latest versions it’s much more stable, and it’s getting very good.
Anyways, one of the reasons I’m using it is to splice the slides into the videos, because they’re just not readable inside the videos for the most part. So I needed a way to turn each slide into some sort of an image.
It turns out that OO.o doesn’t have this capability natively, but some users on the OO.o Forum came up with a script at the bottom of this page to export each slide into PDF. I modified it to do the following:
- Export to JPG instead of PDF (PNG export didn’t work)
- Add extra 0s to numbers such that you always have three digits
The code is pasted below. One really fun (not) thing I found out: VBA (or just OO.o’s implementation of it) doesn’t really do type checking. As a result, if r is a string instead of an integer (which I had forgotten), the following code will always execute as True:
If r < 10 Then
Anyways, here is the code, in case it helps anyone at some point:
REM ***** BASIC *****
REM Build a PropertyValue struct from an optional name and value.
Function MakePropertyValue( Optional cName As String, Optional uValue ) As com.sun.star.beans.PropertyValue
    oPropertyValue = createUnoStruct( "com.sun.star.beans.PropertyValue" )
    If Not IsMissing( cName ) Then
        oPropertyValue.Name = cName
    EndIf
    If Not IsMissing( uValue ) Then
        oPropertyValue.Value = uValue
    EndIf
    MakePropertyValue() = oPropertyValue
End Function

REM Export every slide of the current document to its own numbered JPG.
Sub SplitPDFs
    dim oDoc as object
    oDoc = ThisComponent
    dim url as string
    url = oDoc.getURL()
    REM Drop the last four characters of the URL (the extension, e.g. ".odp").
    baseURL = Left( url, Len( url ) - 4 )
    nNumPages = oDoc.getDrawPages().getCount()
    For nPageToSave = 1 To nNumPages
        dim r as string
        REM PageRange is a string of the form "N-N", selecting a single slide.
        r = Str(nPageToSave)+"-"+Str(nPageToSave)
        REM r is a string, so convert it before comparing against a number.
        If CInt(r) < 10 Then
            oDoc.storeToUrl( baseURL+"00"+nPageToSave+".jpg" ), Array( _
                MakePropertyValue( "FilterName", "impress_jpg_Export" ), _
                MakePropertyValue( "Overwrite", "True"), _
                MakePropertyValue( "FilterData", Array( _
                    MakePropertyValue( "PageRange", r ))))
        Else
            oDoc.storeToUrl( baseURL+"0"+nPageToSave+".jpg" ), Array( _
                MakePropertyValue( "FilterName", "impress_jpg_Export" ), _
                MakePropertyValue( "Overwrite", "True"), _
                MakePropertyValue( "FilterData", Array( _
                    MakePropertyValue( "PageRange", r ))))
        End If
    Next
End Sub
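For comparison, here is how the zero-padding part might look in Python. This is only a sketch: the function name and the "talk" base name are made up, and it is not a port of the macro above. A format specifier does the padding for any slide number, and the conversion to an integer is explicit, so a string sneaking in can't quietly change the result.

```python
def slide_filename(base, page, digits=3):
    """Return e.g. 'talk007.jpg' for page 7, with the page zero-padded to `digits`."""
    return f"{base}{int(page):0{digits}d}.jpg"


# Numeric strings compared as text are where the trap lies:
# strings compare character by character, so "10" sorts before "9".
text_comparison = "10" < "9"
```

Sorted by name, `talk007.jpg` lands before `talk010.jpg`, which is the whole reason for the padding when importing the slides as an image sequence.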
Avast, We Be Getting Slandered, Yar
Poisonous people. They exist everywhere, sucking the light and good out of things and repurposing them for all sorts of nasty activities. Like the rest of KDE, and the rest of the software world (both free and non-free) in general, the Amarok team has taken its fair share of abuse over the years. Normally I ignore it. However, today I got my KDE 4.0 Release Event talk twisted around in malicious ways and the blame placed right at my own feet. So I’m rising to the bait.
The latest perpetrator of bile and venom is one Antal István Miklós. This blathering idiot Web Developer called me out personally on his Great Blog (yes, that’s really what it’s called, if you want to see his post), and turned my well-received talk at the KDE 4.0 Release Event into a complaint-a-thon.
I’m not going to pick apart all of the technical wrongheadedness he portrays with MySQL/e, Akonadi, and the capabilities therein — Nikolaj posted a nice response to the blog, and the guy would be very well served to *fully* read my posting about mysqle (as well as many of the comments). Much of the blog can also easily be decoded as baseless, factless trolling ("amarok1.x is the slowest KDE3 program, if not it’s surely in the top 3 slowest KDE3 programs"), and the-developers-don’t-agree-with-me-so-they’re-wrong syndrome. But I *am* going to defend my statements at the Release Event.
Let me explain, up front, the format of my Release Event talk. I presented a short introduction to Amarok, for those that did not know it. I then stated some drawbacks we found with the KDE3 platform. With these drawbacks, plus a desire to go cross-platform like many other media players (VLC, Banshee, iTunes, etc.), we had considered the possibility of switching to a Qt-only architecture. But then, KDE4 comes along, with elegant solutions to our problems: Phonon, Solid, Plasma, targeting multiple platforms, and more. Boom: the thoughts of Qt-only go out of our heads, and we commit ourselves fully to KDE4.
Far from a long series of complaints, it’s a success story, showing how the benefits the KDE4 platform offers us solved our problems, and how they could solve yours too. Apparently, however, it simply shows (judging from Antal’s blog post title, and probably because he hasn’t bothered to watch past five minutes in) how we just don’t get KDE4.
This may not be apparent to everyone, but Amarok was an early poster child for adoption of many of the Pillars of KDE. We are the only application, to date, that has embedded Plasma inside of our application (with our developers doing a large amount of work to make that possible). (Update: we are technically the first, outside of the plasma workspace, but there are others playing with that now.) Device detection completely relies on Solid (which is one reason Mac and Windows ports have no device support right now). And we have completely standardized on Phonon for our media engine. We’ve also had Oxygen team members working on our icons and our interface.
It’s hard to imagine ways for us to more fully integrate with KDE4 than what we are doing. We’ve gone for KDE4 whole-hog, and it’s ludicrous to suggest otherwise. Picking out random Pillars that we don’t fully integrate with (yet) does not mean that we are not KDE4-oriented. After all, right now we don’t have a use for Marble (who knows? that could change) — but does that mean we don’t get KDE4?
it just reminds me of one of the KDE4.0 release event, where a KDE dev complained that how KDE3 sucked, because they couldn’t port Amarok to Windows, and KHTML had bad performance
It’d be nice for Antal to realize that there is a difference between complaining, and listing drawbacks of a platform. (This doesn’t really fit into how trolls work, of course.) Yes, I said that two of KDE3′s drawbacks were that it made it impossible to port Amarok to Windows (and Mac) and that KHTML rendering was found to be slow. No, I did not say that KDE3 sucked.
But there’s nothing new here. It’s not like Amarok was alone in wanting to port to other platforms. The Release Event had showcases of KDE4 applications running on both Windows and Mac. But Antal has this fixation that Windows and Mac are suddenly all we care about, taking an out-of-context "consider the majority" statement someone (he doesn’t say who) made on IRC about some topic (he doesn’t say what, only that it’s vaguely somehow about performance):
Using mysqle mostly benefits non-KDE4 desktops, because as I said earlier KDE4 will probably have a mysql server anyway, but isn’t improving the KDE4 user experiance top release priority anymore? Is amarok on Windows on Mac more important than getting the best out of amarok on KDE4?
[then, later]
What did the people in the IRC channel had to say about this?
…….
My favorite quote from here is: "consider the majority"
It’s like saying: "consider the majority, which are Windows and Mac users, and screw the KDE4 users"
I think Antal fails to realize that KDE is not just a desktop. Windows and Mac users that might be using Amarok are going to be using *lots* of KDE technologies in the process. Regardless of his mistake, there is certainly no evidence that Amarok cares more about Windows and Mac users, or thinks them the majority of our users. Speaking as a developer, I can tell you that the exact opposite is true.
Antal also clearly doesn’t realize that Akonadi is not a requirement of KDE to run (even if it’s installed), and therefore the best Amarok could do would be to integrate with Akonadi, but not to depend or rely upon it. Maybe he kind of gets it when he says, my emphasis, "KDE4 will probably have a mysql server anyway" — we can’t rely on probably, or maybe. We need to use what works, always.
He even contradicts himself:
He begins with complaining, how slow was rendering amarok’s context with KHTML, so it looks like performance matters in amarok, not that anyone forced them to use KHTML for rendering context…
Make up your mind, Antal. You’re right that no one forced us to use KHTML for rendering context, although WebKit wasn’t available back then, and hooking into Mozilla was a non-starter. But you want us to be integrating with other KDE technologies…right?
One last point:
Jeff Mitchell the developer who spoke at the event that I was referring to, referenced KDE as a family, but where is the love now? The lack of communication between Amarok and the rest of KDE4(Akonandi) doens’t seem to back up Amarok as being a family member.
It is not surprising, given how little he understands of Amarok, KDE, and the integration thereof, that he both thinks there is a lack of communication between Amarok and the rest of KDE4 and implies that Akonadi is the entire rest of KDE4.
I’ve covered a small fraction of the untruths and inaccuracies in his post, but it’s enough — I’ve made my points. I love KDE, I have not publicly disparaged it, we Amarok developers are fully committed to the platform, and we are not putting the Windows and Mac ports at a higher priority than the *nix base.
Any comments will be read, but I may decide not to post them.
MySQL in Amarok 2 – The Reality
There has been a lot of chatter lately regarding Amarok’s switch to MySQL as its only SQL backend. A decent amount is FUD, either by people simply pushing back against change, or by people that simply don’t understand the decision. Some of it (particularly Adriaan’s blog post) has been insightful and interesting, but misses the mark in terms of why this change was made. This post attempts to explain why this decision was made, what it really means for you the end-user, and why you should have a cup of tea and relax.
I want to point out first that I said that MySQL is going to be Amarok’s only SQL backend. A2′s collection system is very powerful. Just take a look at how varied music sources from Shoutcast, Jamendo, Magnatune, Ampache, MP3Tunes, as well as local sources like iPods and your local file system, are treated as equals in A2. A collection is a collection, and is limited only by what capabilities it advertises it can support (and of course, it can supply its own custom capabilities). It’s not currently enabled, I don’t think, but there’s a Nepomuk-based collection option too. So take heart — this change only affects Amarok’s internal SQL collection, and not other sources (although those sources can store information in the SQL database if they wish to cache information).
Since I mentioned Nepomuk, it’s time to discuss another common question/demand/complaint: KDE has this nice Strigi-Nepomuk thing going on…why aren’t we using it for scanning music and storing information? There are a couple main reasons. The first is that Strigi and Nepomuk are optional, not required. (Update: Strigi is required, but Soprano isn’t, so Nepomuk as a whole is still optional.) We can’t rely on the user installing them, and even if they are installed, we can’t rely on the user to configure them properly (remember that we’re going cross-platform, making it even less likely). The second reason is speed: Amarok’s custom collection scanner is extremely fast and pulls out specific pieces of information with TagLib. Strigi is, by comparison, very slow (it calculates hashes of all files, which means it needs to read the entire file) and pulls out less information. (Update: According to the Strigi developer, and despite what is said on kde-apps.org, Wikipedia, and even the author’s own home page, it does not calculate hashes by default. So it’s possible that Strigi, if properly configured, could be as fast as Amarok’s internal scanner, although whether it would pull out all necessary information, I don’t know. If it’s configured to calculate SHA1 hashes of all files, then it will indeed be far slower.) On a local hard drive, it may not be a big issue, but it sure is a huge issue when you throw networked storage into the picture, which is a very common scenario. I’ve also heard, though don’t remember specifics, that querying and such through Nepomuk is rather slow, compared to a normal SQL database. Regardless, though, remember that when the Nepomuk-based collection is finished, tracks sourced through a Nepomuk-based collection will have their metadata changes saved back to Nepomuk. So, it’s not that the SQL collection is in place of Nepomuk — they are entirely independent. (Update: I forgot to mention that a Nepomuk collection already exists. 
It was developed by a GSoCer over the summer. I’m not sure what its status is as far as making the 2.0 release, but we Amarokers both like Strigi/Nepomuk and are excited about the idea of opening up the app and having all your music available right then and there with no pre-configuration. But there is a place for the SQL collection too. As I said: they are complementary technologies.)
With those topics out of the way, on to the meat.
First, it is important to understand an important pair of facts. Number one: we are not database guys. Sure, we can store data in them, and more or less come up with a working schema, but none of us are gurus/wizards/jedis/etc. This leads in to number two: maintaining three databases was driving us crazy. Every time a minor schema change was needed, it had to be coded up for all three types of databases. Modifying a schema could be trivial for one database type, and super difficult (or impossible) for another. People would report bugs that we couldn’t reproduce, only to find out that it was because we didn’t quite understand how one database or another behaved (or in some cases, none of the active devs were using that type). And so on. So from the beginning of A2 development (and in our fantasies during A1 development) we knew we wanted just one database.
(We did actually look at abstraction layers like QtSQL and others. I’m not going to comment on them much, as I didn’t do the evaluation, but in general they were found to not be flexible enough to handle all of our needs without doing some custom SQL coding (especially in the cases of things like schema changes), which kind of defeats the point. If you want to know more/want to insist that they are, try asking eean, as I think he did the evaluations.)
Now we had to choose the type. At first, SQLite seemed like a good choice. Using transactions, it’s decently fast. It’s pretty stable (those that complain about odd MySQL bugs should talk to markey, as he, being the SQLite maintainer in 1.4, can attest that SQLite’s had its fair share). However, there were a few problems that in the end knocked it out of the running. The first problem is performance. Although for people with small collections it performs fairly well, people with large collections that switched to the MySQL or PostgreSQL backends in A1 would report enormous speed gains when operations performing complex or many queries were performed, such as adding many entries to the playlist, scanning files, or filtering/searching in the collection. Since we want to accommodate users with large collections just as well as those with smaller collections, and since digital music collections aren’t getting smaller, the speed increase for our users with large collections was quite important. Many of our developers, after the switch to mysqle (as we call it, though that’s not the official name), have noticed huge speed increases in their day-to-day use of A2, so that speed increase is carrying through to the embedded server as well as the normal server. That was the first knock against SQLite.
The other blow for SQLite came for a totally different reason. Many users (myself included) have multiple computers sharing a single Amarok database. Assuming all the computers have access to the music at the same mount point (and a few other things are configured right), this allows you to scan once, play everywhere, update the same ratings no matter where you play it, and more. Even if you aren’t sharing the database among multiple computers, many users want their database stored on a particular server for speed, security, or backup reasons. If you think either of these isn’t a common use-case, you’d be quite wrong. MySQL and PostgreSQL were quite happy with this workload. It’s a total no-go for SQLite, simply because it’s designed for a different purpose. So SQLite had two big knocks against it. K.O.
However, just as we can’t rely on the user to set up Strigi/Nepomuk correctly, we can’t rely on them to get their tables set up in MySQL or PostgreSQL. So we needed the database to be embeddable, so that it could just work for the user without any setup necessary on their part. MySQL, with libmysqld, had the seeds of this in the 4.1 series, it works decently in 5.0, and it’s becoming fully supported (AFAIK) in 5.1. PostgreSQL, on the other hand, does not have any such thing. (They have an interesting and cool concept of their own of embedded SQL though. Update: apparently that is part of the SQL standard. Still pretty cool. Still totally different from what we mean when we are talking about an embedded server.)
So this leaves us with — as you guessed — MySQL. It may not be any particular person’s favorite database (although it is for plenty), and I don’t know how much overhead it *really* has in embedded form, but it fit the bill. It’s both embeddable and can run standalone on the local or a separate machine (yes, this is not supported yet in A2, but it will be). It is fast and robust for large collections. It is well understood by the development team. And most of all, it is a single-backend solution that fills all of our needs.
If you’re still unhappy about our decision, I’m sorry. We try to please most and can’t please everyone. But we’re the ones that develop and support this thing, and so we made a decision based both upon our needs as developers and the real-world use-cases from the collective feedback of thousands of users that have contacted us over the last few years. Please remember that even if most of the comments on the Dot, or to this post, (i.e. much of the sudden visible feedback) are from people that are unhappy with our decision, it is a decision that will actually suit the vast, vast majority of our users better than the other options we currently have.
We’re a project that is known for being good to our users — we listen to them, we try to implement features they want, try to be responsive with support. It’s one of the things that got us where we are today. So please, dear readers — put some faith in us. This has not been an easy decision — we’ve discussed, we’ve argued, we’ve thrown things, we’ve made up, we’ve had an after-the-make-up orgy or two — but in the end it’s what we collectively felt was the right way to go, and we feel that, in the long run, it will make Amarok even mores awesomer. Hopefully you’ll feel that way too.
Amarok File Tracking
I don’t blog often, but when I do it tends to be meaty. I won’t disappoint. I’ll be covering Amarok, Amarok history, and a possible future part of kdelibs.
"We can rebuild him. We have the technology. Better than before. Better, stronger, faster."
A little-known feature in Amarok 1, starting at about 1.4.3, was what was known as Amarok File Tracking, or AFT. For every single file in your collection, on scan, a unique identifier (UID) was generated from some of the file’s attributes. If you moved your tracks around your folders, as the incremental scan kicked in, the UID would allow for the file to be identified, and integration throughout Amarok would mean that your statistics, your cached lyrics, and the current playlist would all be updated with the new path. No longer did you have to worry that moving around your files would mean losing years of statistics. Or losing your files.
But I’m getting ahead of myself.
See, AFT wasn’t born AFT. AFT could not track both a file metadata change and a file location change at once, because the UID was based on file properties such as file length, plus a portion of the file itself, all hashed together. Change the metadata and move the file in one go, and the new UID no longer matched the old one. So you could still lose track of your files. This was a limitation that was known in advance.
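To make that limitation concrete, here is a minimal Python sketch of a content-derived UID. It is not Amarok's actual code (Amarok is C++). The choice of MD5, the decision to sample the leading bytes, the 8192-byte sample size, and the name content_uid are all assumptions for illustration. The original only says that the file length and a portion of the file were hashed together.

```python
import hashlib
import os


def content_uid(path, sample_bytes=8192):
    """Derive a UID from the file length plus a hash of its leading bytes.

    Hypothetical helper: the attributes used, the sample size, and the
    hash function are assumptions, not Amarok's real choices.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    digest = hashlib.md5()
    # Mix the file length in first, then the sampled portion of the file.
    digest.update(str(size).encode("ascii"))
    digest.update(sample)
    return digest.hexdigest()
```

Nothing in the UID depends on the path, so a moved file still matches. Edit the file so that its length or the sampled bytes change, though, and the UID changes with it. That is exactly the case this scheme cannot follow if the file has also moved.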
It was also a limitation that didn’t originally exist. As I said, AFT wasn’t born AFT. It was born as Advanced Tag Features, or ATF. ATF was the same idea, but a little different — it would store the generated UID directly in the file’s metadata. This allowed for superb file tracking capabilities, because unlike generating a UID from a part of a file, if that part of the file changed, you’d still have your UID. In fact, the only way you *couldn’t* track your file was if you either removed the file’s tag entirely (or some other program removed the UID when it shouldn’t), or if you removed the corresponding information from Amarok’s database. (There are some downsides to this scheme: only certain file types are supported, for instance, determined by the kind of tag they use and the tag’s ability to store this kind of information.)
So why the change? Well, ATF had a problem, which was related to the structure of Amarok itself, and Amarok’s historical penchant for crashing (which got much better as the 1.4 series progressed). The outcome is possibly worthy of an entry in The Daily WTF. In gory detail, here’s the problem.
1. Amarok would start a collection scan. The collection scanner was the entity responsible for adding the UIDs to the file metadata. Important note: the collection scan was a separate process.
2. Amarok would crash, leaving the collection scan running, although not communicating with anything. This scanner could be very slow if it was adding the UIDs, depending on whether padding had to be added to the file’s tag. If this was the case, the entire file would have to be rewritten.
3. Amarok would be restarted by the user. Another collection scan process would start. Because UIDs would already exist for the early files, it would very quickly catch up to the first collection scan process.
4. You now had two collection scan processes generating and writing UIDs at the same time to the same file. If you were lucky, this would mess up your tag. If you were unlucky, this toasted your entire file.
5. Repeat step 4 for the rest of the scan.
ATF was never released in this state, but it did get turned on in SVN. And a few unlucky users had far too many files end up corrupted, depending on how crashy things became for them. After we finally realized what the issue was, a user came forward on the mailing list (still trying to find the exact mail or user) proposing a solution that I believe they’d seen in a class. Essentially, the solution relies on making modifications to temporary, uniquely named files instead of the original file, using MD5 checksums to find out if the original file has changed while the new file was being written, and then using filesystem atomicity guarantees to move the new file back over the old one. This became the MetaBundleSaver, and it worked quite well, but it was also extremely slow compared to a normal scan. And most importantly, no one was quite trusting of the whole ATF scheme any more.
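The scheme described above can be sketched roughly as follows. This is a Python illustration of the idea only, not the MetaBundleSaver code itself. The function names safe_rewrite and md5_of, the ".safesave-" prefix, and the modify callback are all hypothetical.

```python
import hashlib
import os
import shutil
import tempfile


def md5_of(path):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_rewrite(path, modify):
    """Run modify(temp_path) on a private copy, then atomically swap it in.

    Returns True if the original was replaced, and False if the original
    changed while we were working (in which case it is left untouched).
    """
    before = md5_of(path)
    directory = os.path.dirname(os.path.abspath(path))
    # Uniquely named temp file in the same directory, so the final rename
    # stays on one filesystem and can be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".safesave-")
    os.close(fd)
    try:
        shutil.copy2(path, tmp)   # work on a copy, never the original
        modify(tmp)               # e.g. write the UID into the copy's tag
        if md5_of(path) != before:
            return False          # someone else touched the original
        os.replace(tmp, path)     # atomic rename over the original
        tmp = None                # nothing left to clean up
        return True
    finally:
        if tmp is not None and os.path.exists(tmp):
            os.remove(tmp)
```

Two writers using this never interleave their writes inside one file, because each works on its own copy and the final rename swaps in a complete file. There is still a small window between the checksum comparison and the rename, so this narrows the race rather than replacing real locking. The two full checksum passes and the full copy also show why the approach is slow compared to a normal scan.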
So, ATF was renamed to AFT and with it came a new algorithm that wouldn’t touch anyone’s files, but couldn’t track as well.
A couple of weeks ago, I added AFT to Amarok 2’s SqlCollection. Enjoy, everyone: statistics, lyrics, and the playlist are already supported, with support for stored playlists coming eventually. But there’s more.
Fast forward to today (okay, two days ago). I’m taking a shower — Wade does insist that there’s something about showers and KDE coders — and I had a thought, which was essentially: there’s absolutely no reason why Amarok 2 can’t use a UID inside a file, if one exists, for superior tracking, and if not, generate a read-only type for normal tracking.
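In rough Python terms, that thought boils down to something like the sketch below. It is an illustration, not Amarok 2's implementation. The name track_uid, the "embedded"/"generated" labels, and the details of the fallback hash are assumptions. Reading the embedded UID out of the tag is left to the caller, since that needs a tag library.

```python
import hashlib
import os


def track_uid(path, embedded_uid=None, sample_bytes=8192):
    """Return (uid, source), preferring a UID stored in the file's tag.

    embedded_uid is whatever the caller read from the tag, or None if
    the file carries no UID. The fallback hash is a hypothetical
    stand-in for the read-only, generated kind of UID.
    """
    if embedded_uid:
        # Survives both moves and metadata edits.
        return embedded_uid, "embedded"
    # No UID in the file: derive one without writing anything.
    digest = hashlib.md5()
    digest.update(str(os.path.getsize(path)).encode("ascii"))
    with open(path, "rb") as f:
        digest.update(f.read(sample_bytes))
    return digest.hexdigest(), "generated"
```

Returning the source next to the UID lets the caller know which files get the stronger tracking and which are only covered by the weaker, content-derived kind.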
So I created a utility that is built and installed with Amarok 2. It’s called amarok_afttagger, and it will write UIDs into your files, using a class ported from MetaBundleSaver and called SafeFileSaver to ensure that files are not overwritten/interleaved, even if you run the process twice or three times at once. It optionally supports recursion if you want to pass in directories, and it can also remove UIDs from your files if you like. Right now it supports MP3s only, but Vorbis and FLAC support will be coming soon.
I’ve tested it extensively. I’ve added UIDs to files, removed them from files, regenerated the ones in files, over and over, and still everything is cherry. And Amarok 2, when it finds these files, can do some awesomely robust file tracking.
I encourage people to give it a run on their MP3s and check it out — if you’re worried by all the Dark Ages info up above and don’t have faith in the implemented solution, back up your files first, or operate on a copy of them, until you’re satisfied it won’t harm your files. And if you still don’t want to do it, you can enjoy the less awesome but still awesome power of the non-embedded UID file tracking.
Now, I promised this would talk about a possible KDE library. I’ll eventually be submitting the SafeFileSaver class for hopeful inclusion into kdelibs, so that any application that is worried about data integrity and needs to write to a user’s files can take advantage of it. It’s very simple to use — you simply give it a file path, and then operate on the file path that’s returned to you when you call prepareToSave(), instead of the original one. When you’re all done, you call doSave() and it will perform the necessary functions. That’s it.
Hope this has been enjoyable, and enjoy AFT in Amarok 2. Play with it and be amazed. Use amarok_afttagger on your files and be even more amazed. More information is available here: http://amarok.kde.org/wiki/AFT