Quarter Life Crisis

The world according to Sven-S. Porst


X.4 Spotlight

3723 words on

I’ve been waiting a long time to write about Spotlight. Mainly because my hopes for it were high and my first experiences with it were quite disappointing. And they remained disappointing until now. My verdict on it is that it is neither particularly good nor particularly fast at finding things – which will be elaborated in what follows.

Despite my disappointing first experiences with Spotlight, I still think that it’s quite cool technology. It still seems to need a lot of improvement and optimisation which I hope – without holding my breath, though – we’ll see in the future.

How it works

In case you aren’t familiar with Spotlight yet, I’ll give an ultra-brief summary of how it works. Look around at Apple’s site or on Google for details. This is not meant to be comprehensive.

Spotlight creates an index of the files on your hard drive. That index doesn’t just cover the file names but also the files’ metadata, such as the date of creation and the date of the last access. Spotlight can gather even more information about the files, such as the size of images, the authors of documents and so on. The key to doing that is that Spotlight can use plugins, so-called metadata importers, to extract the information from the files. These plugins live in some Library’s Spotlight folder (I find it rather unclear at this stage how Apple name the folders in the Library. There’s Spotlight, but not Dashboard, there’s Widgets but not Actions, there’s Automator…) or, with well-made applications, inside the application itself, where they don’t have to worry the user at all.

The system uses the new powerful concept of Uniform Type Identifiers (UTI) to figure out the types of the files it’s dealing with and then hands each file to the appropriate plugin for importing. An initial import will be done after installing X.4 and can take quite a while. In the first days after the release of X.4 you heard many people talk about their system still being busy with that initial import. And with even a moderate number of files such an import will generate a rather large index in the folder ‘.Spotlight-V100’ at the top of your startup volume. Mine is 200MB at this stage, 50MB for the ‘store’ which I assume to be the metadata and the rest for the content index which I assume to contain the text from files’ contents.

The really smart bit about Spotlight is that the system will inform it of any changes that are made to files and there’ll be a quasi-immediate import of the new or changed file, meaning that the index is always up-to-date if you’re just running X.4 and thus that you can have things like searches that keep their results current at all times.

The data gathered by Spotlight isn’t just accessible from the system’s main Spotlight interface but can also be used by applications or from the command line. Apart from its lousy performance, it’s a very powerful tool that should enable developers to do amazing things in the long run.
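To give a rough idea of the command line access, here is a minimal sketch built on the mdfind and mdls tools that come with X.4. The function name spotlight_search and the MDFIND variable are made up for this example and are not part of the system; MDFIND only exists so the real command can be swapped out for a dry run.

```shell
# Sketch: search the Spotlight index from the Terminal, restricted to one folder.
# spotlight_search and MDFIND are hypothetical names used for illustration only.
spotlight_search() {
  search_dir=$1
  shift
  # -onlyin limits the search to the given folder; the remaining arguments are the query.
  "${MDFIND:-mdfind}" -onlyin "$search_dir" "$@"
}

# Usage on a machine running X.4 (paths are placeholders):
#   spotlight_search "$HOME/Documents" "invoice"
#   mdls "$HOME/Documents/some file.pdf"   # show the metadata stored for one file
```

The results are only as good as the index, so the complaints below about missing files apply to the command line just as much as to the menu.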


My main complaints about Spotlight are nicely summarised by an animated ad campaign that Apple cursed the web with around the release of X.4, so this may be a good point to start at, with more technical bits coming later on. Two of the ad’s punchlines were ‘Find anything on your computer’ and ‘Instant results appear as quickly as you type’. If Spotlight did that, it’d be great.

Find Anything

Find Anything on your computer.

Apple’s ads claim that a feature of Spotlight is that you can find anything on your computer. This is a blatant lie – or ‘feature’ as they’d say in Cupertino. In fact, there never was a version of the Mac OS in which you could find fewer things on your computer. Back in System 7, say, you could find almost anything. There were very few ‘invisible’ files which the System hid for good reasons or which inconsiderate developers spread around. This didn’t get much worse until the move to OS X.

In OS X, parts of the system software are not visible at all in the Finder (and thus very difficult to copy), e.g. the ‘private’ folder at the top of the startup volume. In addition, invisible files are spilled everywhere – presumably to store your view preferences and other useful information that the Finder ignores. However, with a bit of patience, the Finder’s find command would find the vast majority of files for you if you only knew a bit of their name and were allowed to see them.

In X.4, however – by its clever design – a lot of this is lost. Try searching for any file which lives inside your System folder and you’re out of luck. Do a simple search for the term ‘Finder’, the name of a running application that is well known to the computer and you may get a couple of results but not the one you’re looking for. The bottom line of this is that, thanks to Spotlight, I cannot find the vast majority of files on my computer which were perfectly easy to find in previous versions.

There may be reasons not to index every single file on the computer – particularly as OS X has the bad habit of having gazillions of them, most of which are unlikely to be relevant to me at any stage. But I should be able to find them anyway. Particularly when using the world’s most advanced – or so – search technology. If the amount of data generated by putting all those files in the index (has anybody tried this? what did it do to the index size and Spotlight performance?) is too large for Spotlight to work smoothly afterwards, then having just the file names in there would seem to be both a reasonable compromise and the least that Spotlight should do.

Instant Results

Instant results appear as quickly as you type.

Apple’s claim that ‘Instant results appear as quickly as you type’ isn’t as blatantly wrong but still wrong. Perhaps I’m just typing too fast but nothing ever appears while I’m typing. I don’t consider that a massive problem, though. What’s more annoying is that it takes a couple of seconds for the first results to trickle in, around ten seconds for searches with very few results to complete and around half a minute for those with many results to complete. Even bearing in mind that my PowerBook isn’t exactly new, this is so far from being ‘instant’ and – more importantly – from being useful, that using Spotlight just isn’t the life changing experience you’d like it to be.

With that amount of time on my hands I can locate almost any of the files that Spotlight indexes on my hard drive manually – seeing that most of them are in an order that I made up myself and have used for years. I think I wouldn’t mind too much waiting half a minute to access that particular e-mail from 1996… but not-so-strangely I do mind having to wait such a long time for the computer to find a file that I saved a few minutes ago, an application that I use every other day or the item I selected yesterday when doing exactly the same search. I am tempted to say that a few rather small optimisations, such as having quick access to applications, addresses and recently used items, would make Spotlight seem much quicker and more useful.

User Interface

It has been said many times – the user interface of the System’s Spotlight feature is a complete mess. The first thing I noticed about it is that it loses keystrokes. You can’t simply hit the Spotlight keyboard equivalent and start typing your search term, because if Spotlight is swapped out or the system is busy, it’ll take a while before the search field is on screen and has focus. Anything you enter before that will just be ignored (or go elsewhere). That really sucks and destroys one of the very nice things about the Mac: even if your system wasn’t fast enough or was very busy, you could just type on and everything happened as expected as soon as the computer caught up. This is definitely not the case with the Spotlight search field, and that’s bad as it really keeps you from using it in a quick workflow.

While the quick-search menu is sort-of OK – except for the jumping around of search results and the eternities it needs to load icons, even of the same type – the separate window is a piece of junk. It doesn’t belong to the Finder and can’t be found in per-application Exposé switching. In addition, its display of data is different from everything else. And not even particularly good.

It’s hard to get to see additional metadata or the location in the file system (hint: use the menu for that and rest the mouse over an item to see its location, although you can’t simply ‘Reveal’ an item from there), and once you get to see those data, their selection seems strange (who really needs to know the type of camera used to take a photo in his search results?). It shows and sorts items by the very strange ‘last used’ date, which leads to some files appearing as ‘without date’ and to a lack of spacing in its display. It’s impossible to see an image’s full name when showing images in preview mode (which ignores the information about how the image is rotated), switching those modes doesn’t preserve selections, it’s poorly localised and so on. Loads of obvious bugs. The only nice thing is that it can do neat slide shows with the images it finds. At first this looked like a bit of an afterthought to me, as I thought slide shows could only be started from Spotlight search results; as Paul points out, the feature is available for selections in the Finder as well – hidden in the contextual menu.

Let’s hope someone will report those problems or someone at Apple actually uses that piece from UI hell and will notice what kind of crap they shipped. Everyone else just go and hope that this won’t be another eternal usability issue like the Finder.

Localisation

While Spotlight’s UI is poorly localised, the actual technology behind it takes localisation into account. Not only will the various localised names of items be indexed, but the list of all available fields of data that Spotlight stores can also be localised by a Spotlight plugin’s developer.

Either of these points is both good and bad. Generally I think it’s good that localisation has been taken into account right from the start. But practically this is very messy. The drawback of having those localised names is that Spotlight is searching for the localised name only. This means that on a German system, say, searching for ‘Calculator’ will not find the Calculator application for you because it’s called Rechner, or rather its name is displayed as ‘Rechner’ while its bundle is still called ‘Calculator.app’.

To begin with, this is annoying because I occasionally have English-speaking people recommend a certain application to me. And I have no idea what its localised name is. So I can’t easily find it. What makes this worse is that Apple’s file name localisations in the Finder just didn’t work before X.4 if you told the Finder to show you actual file names (rather than that mostly truncated stuff it shows by default). This means that my OS X experience has only had English names so far and that I still find it difficult to deal with the German ones.

But that’s not the only problem: The search results are just very broken because of this ‘feature’. When searching for ‘Rechner’ on my system, Spotlight will dutifully present both the Calculator application and the Dashboard widget. But it won’t show the Calculator’s preference file because it doesn’t have a localised file name.

This seems to be a rather hard problem to solve and probably indicates once more that localised file names aren’t as good as they look at first. Or at least that their implementation in OS X is nowhere near as good as it should be. For the time being, I am really looking for a way to make the default Spotlight search match the English (or file system) names as well as the localised ones. I think that should solve a big bunch of the problems I’ve seen.
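This doesn’t fix the default search, but from the Terminal (or a ‘Raw Query’ smart folder) both names can be matched at once. The following is a minimal sketch under the assumption that kMDItemDisplayName holds the localised name shown in the Finder and kMDItemFSName the name on disk; name_query is a helper name invented for this example, and I have not checked how it behaves for every localised bundle.

```shell
# Sketch: build a raw query that matches a term against both the displayed
# (possibly localised) name and the file system name of an item.
# name_query is a hypothetical helper, not something the system provides.
name_query() {
  # * is a wildcard and the trailing c requests a case-insensitive match.
  # Terms containing double quotes are not escaped in this sketch.
  printf 'kMDItemDisplayName == "*%s*"c || kMDItemFSName == "*%s*"c' "$1" "$1"
}

# Usage on a German system:
#   mdfind "$(name_query Calculator)"
```

Building the query in a function keeps the shell quoting in one place, which is where most of the pain with raw queries comes from.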

As for the other localisation problem I mentioned, you need to go to a smart folder and then select ‘Other’ in the list of search options. This will give you a sheet with a very long list of things you can search for. Spotlight plugins can add their own terms to that list. And these terms and their descriptions are localised. But only if the plugin is. Seeing that many third party applications aren’t fully localised and are only available in English, this means that you’ll end up with multiple languages in that list. Which doesn’t just make it harder to read the list but also makes using the wisely provided filter field a pain as you’ll have to check at least two languages.

In fact the localisations provided by Apple are quite strange as well. I knew there was the ‘Raw Query’ search option available, but there didn’t seem to be a canonical way to translate it into German. I tried a couple of translations that sounded reasonable to me but none of them did the trick. Not wanting to go through the hundreds of search terms one by one, I ended up asking Pierre to mail me a smart folder with such a search and just opened that on my system, revealing that the term I was looking for was ‘Reine Daten’ which translates to an inexplicable ‘Pure Data’.

Additional Plugins

Apple are offering a list of available Spotlight plugins on their web site. It seems to be the least used list in the trio of Automator, Dashboard and Spotlight. And for a reason. Plugins can happily live in an application’s bundle and will be used by the system once the application (version) itself has been used. So there should be very little need for such a list. Sure, there will always be some importers coming from third parties, but I hope this won’t be too many and application makers will do their job.

I haven’t really tried this but as the documentation claims that the System only uses a plugin contained in an application’s bundle once that application has been opened by the user, it may be that updating an application – which should reset the application’s status to untrusted – without running it right afterwards might bring you into a situation where you think that files you copy to your drive should be indexed but they actually aren’t.

Making Plugins

Making Spotlight plugins isn’t too difficult. When making the Spotlight importer for Rechnungs Checker, the most difficult thing was figuring out how to introduce UTIs to the application in a way that conforms with the existing file types. While Apple provide information on that topic, I wish it were more comprehensive and came with more examples. From my point of view UTIs will be big and people should try to use them as soon as possible. The fact that the property list entries in the documentation don’t really match those you see in the plugins that are installed by the system doesn’t make you more confident either.

The other difficult thing was testing. It was (and probably still is) quite unclear to me how exactly everything works and what is done when and by whom. For your plugin to get to index files, a couple of things seem to need to happen: (a) You need to define UTIs for your file types and describe them (having a syntactically correct property list is essential for that – unfortunately Xcode doesn’t give you any warnings when it isn’t). (b) The System needs to be aware of those UTIs and your Spotlight plugin (using lsregister -f seems to be quite a good way to do that; for your convenience the tool is easily accessible at /System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister – as can be found out using the stone age locate tool rather than Spotlight – cough). (c) Your Spotlight plugin needs to be in one of the Libraries’ plugin folders, or the application containing it in the Spotlight folder of its bundle’s Library folder has to have been run at least once. This will enable the machine to index your files.
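Steps (b) and (c) can be sketched as a couple of shell functions. This is only an illustration under stated assumptions: bundled_importers and register_bundle are names I made up, MyApp.app is a placeholder, and the LSREGISTER variable exists merely so the long path can be overridden.

```shell
# Sketch of steps (b) and (c). The helper names are hypothetical; only the
# lsregister path and its -f flag come from the text above.
LSREGISTER=${LSREGISTER:-/System/Library/Frameworks/ApplicationServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister}

# Step (c): list the importers that live inside an application bundle.
bundled_importers() {
  for importer in "$1"/Contents/Library/Spotlight/*.mdimporter; do
    # Skip the unexpanded pattern when the bundle contains no importer.
    [ -e "$importer" ] && printf '%s\n' "$importer"
  done
  return 0
}

# Step (b): force Launch Services to register the bundle and the UTIs it declares.
register_bundle() {
  "$LSREGISTER" -f "$1"
}

# Usage:
#   bundled_importers /Applications/MyApp.app
#   register_bundle /Applications/MyApp.app
#   mdimport -L    # list the importers Spotlight currently knows about
```

Whether registering alone is enough, or whether the application still has to be launched once, is exactly the part I couldn’t pin down, so treat this as a checklist rather than a guarantee.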

But only the files that are changed after copying the application to your disk. In the – possibly not entirely uncommon – situation that files are already around but the application will be updated after the initial scan of the hard drive, you might be out of luck. Using mdimport -r might solve your problems. But it may as well not, due to a bug I’ll describe in a second.

But first let me sum this up by saying that while coding the plugin is easy, getting it to actually index all the relevant files seems to be tricky. I tested this with a couple of people on different machines and couldn’t come to a proper conclusion. Sometimes things worked, sometimes they didn’t. It’s all a bit messy and could use proper documentation and guidelines.

Bugs

Besides the really strange bugs where Spotlight just doesn’t return any results for a perfectly good query you throw at it through its menu bar interface, or where searches by path name (which can be found in the ‘Other’ search terms) just don’t work, there’s the issue I mentioned above. It seems that at the time of importing a file Spotlight stores the file’s UTI in its database. However, the UTI associated with a file can be changed by installing a new application on the system, and that change won’t make it into the Spotlight database.

So if you have a UTI for fairly general file types (e.g. in Rechnungs Checker we have to open CSV files), that type may have been classified by the system as a text file and Spotlight will have stored that info in its database. And while the system may happily be opening those files with your application afterwards and the Finder will be displaying the correct icon and file type, Spotlight will still have the old info. This means that your files won’t turn up in ‘by type’ searches. And, at least as bad, there’s no way to force Spotlight to re-index those files as you can only tell it to re-index the files that are imported by your plugin, i.e. the files that have the correct UTI already.

The only way around this is to run mdimport on those files one more time which will correct the issue. But if all your files of that type aren’t located in the same place, actually doing that may be somewhat painful.
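When the files are scattered around, the re-import can at least be scripted. A minimal sketch, assuming the affected files can be recognised by their extension; reimport_by_extension is a name invented for this example and its optional third argument only exists to allow a dry run.

```shell
# Sketch: hand every file with a given extension below a folder to an importer
# command. reimport_by_extension is a hypothetical helper; the third argument
# defaults to mdimport and can be replaced, e.g. by echo, for a dry run.
reimport_by_extension() {
  root_dir=$1
  extension=$2
  import_cmd=${3:-mdimport}
  # -print0 and xargs -0 keep file names with spaces intact.
  find "$root_dir" -type f -name "*.$extension" -print0 | xargs -0 "$import_cmd"
}

# Usage:
#   reimport_by_extension "$HOME" csv echo    # dry run, only prints the paths
#   reimport_by_extension "$HOME" csv         # actually re-import the files
#   mdls -name kMDItemContentType some.csv    # check which type is stored now
```

This obviously only helps on your own machine. For users of your application it remains a problem that needs fixing on Apple’s side.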

Other problems I observed so far were with Apple’s PDF importing plugin which seems to choke on some large PDF files. I haven’t really tested whether it’ll find all the text in large files either – but I’d consider that an important feature. Apple’s XML/property list importer also regularly chokes on one of the property list files on my computer, meaning that it uses about a minute of processing time to index it. As the file in question is one of NewsFire’s cache files and NewsFire touches those every five minutes for some reason, the computer is continuously busy.

Apple’s MP3 importer also needs more work. Not only have there been reports that it’s not particularly good at getting the songs’ durations right. It also seems to fail to import certain kinds of older MP3 tags. My hint: Go to your iTunes library, select all songs and choose the Convert ID3 Tags command from the menu to convert all tags to the newest (2.4) version.

I’ve also read that connecting a non-trivial number of external drives to your computer will result in indexing madness and basically kill your system. I haven’t done that yet and most likely never will, but it’s certainly the kind of serious bug Apple should get rid of quickly.

Wishes

Let me finish with some wishes for future updates of Spotlight. The first and foremost ones concern the points I mentioned above: Spotlight should be able to find any file on my computer by its name at least. Spotlight should be faster, at least for a small subset of the files that are most likely to be looked for. And Spotlight’s main UI should be better.

Apart from that, there are other weaknesses of Spotlight. At this time it is designed to operate on a per-file basis, i.e. there is at most one set of metadata for any single file. As many files are little databases – bibliography files, address books, actual databases… – that could do with more detailed indexing, a more general approach that can generate several sets of metadata for a single file would be nice. To begin with, this would spare the authors of such applications the hackery that is currently done by iCal or the Address Book, where an additional file is generated for every single piece of data they manage – just to make that data available to Spotlight. This seems a bit clumsy. And once that has been improved, I’d also like (well, I didn’t see any so far…) to have a documented way that lets my application know what the user searched for when a document is opened from Spotlight search results. Applications like Preview can already do this and then automatically highlight the search term. Other applications should be able to do the same as it can really help the user.

Further Reading

I recommend reading the relevant pages of the ars technica X.4 report. And Apple’s marketing blurbs and developer documentation for Spotlight, perhaps. If you’re completely annoyed by Spotlight you might enjoy the anti-Spotlight hints from Alf.

June 8, 2005, 23:56

Tagged as X.4.

Comments

Comment by Dave2:

I have been waiting a long time to read about Spotlight in your continuing Tiger coverage… mainly because I struggle with it daily. Sometimes I love it, but most times I am so frustrated with interpreting the found results that I long to return to pre-Spotlight days.

The user interface for the results window is atrocious. Having to open up every single one of those stupid more info “i” buttons in order to display paths is just stupid. Many times, the path is the only clue I have as to finding the proper file, yet this crucial information is inexplicably buried, requiring extra clicks to see.

It also seems needlessly complex. I only recently found out that using quotes around a search term will look only at file names, but I often forget to use them and am swarmed with useless results. I also figured out that the “Apple-Key” will reveal the file in the Finder instead of opening it when you click, but that too is mostly forgotten as I work. Searching should be mindless, but I find myself thinking a lot about it as I make my way through a work day. If Spotlight worked as advertised, I shouldn’t have to.

But worse for me is that Spotlight does not fulfill its promise of looking “inside” the files I use every day. And the problem defies explanation… some files I wouldn’t dream of being found (old Adobe Illustrator files, for example) are returned. Other files I thought would be easily indexed (newer, PDF-compatible Illustrator files, for example) are completely ignored. Because of this, I never know if my Spotlight searches are complete or not (and mostly they’re not). As if that wasn’t complicated enough, I can find no resource to tell me exactly which files can and cannot be content-indexed. Very frustrating.

The saving grace of Spotlight is being able to enter Spotlight meta-data (comments) in the “Get Info” box of each file. This guarantees that files can be found, and are returned faster in search results than any other. But who wants to go through thousands of files to do this?

Spotlight shows promise, but I remain underwhelmed. Perhaps if they fix the horrible results window, make Spotlight smarter about looking into files, and document what files are actually overlooked for content… well, then at least I could manage to live with it. As it is, I wish there was a way to get “old” searches back for those times I don’t want all the Spotlight bells and whistles, and just want to find something.

June 9, 2005, 1:16

Comment by ssp:

Thanks for the command key hint in the menu, Dave. Too bad there is no visual feedback given for that.

As for telling which files can be imported, my hint would be to run an mdimport -d1 command for a file. That’ll tell you whether the file can be imported and which type it is considered to be by Spotlight. If you’re not into the Terminal, I guess this can be nicely wrapped into an Automator Finder plugin or so.

For the Illustrator files, my guess is that Adobe haven’t provided any importers yet and Apple haven’t set theirs up to take care of those files. Assuming that the files really are PDF files (does Preview display them when their name ends with .pdf?) I suppose that getting the existing plugin to import them will just take a little fiddling with a property list file to provide a UTI for the relevant creator types that conforms to the PDF UTI. Gee, this reads like gibberish when all I want to say is that it should be quite easy to make the existing PDF importer import files of other types which are PDFs as well.

June 9, 2005, 2:07

Comment by Dave2:

Adobe Illustrator 12 files still have a “.ai” extension, but they preview properly as a PDF, and sometimes even open up in “Preview” instead of Illustrator when double-clicked! So I know that Spotlight probably can index them, it just doesn’t. This is very odd, because it DOES index old Illustrator 8 files, even though they are not PDF compatible and don’t show a preview!

June 9, 2005, 15:33

Comment by ssp:

Then I am pretty positive that I can solve (well, ‘hack’, sort of) the problem for Illustrator rather quickly. As I have neither a current version of Illustrator nor a lot of Illustrator files, I’d need a playground to test this in just to be on the safe side.

If you’re interested in doing this, contact me on iChat.

June 9, 2005, 17:49

Comment by Matt:

mdimport -f /System

…will get all those files indexed by Spotlight, if you really want to do that. The “-f” flag forces indexing even if “normal path filters” would leave them out. Without it, “mdimport /System” returns immediately with no action.

June 9, 2005, 20:17

Comment by ssp:

Matt, I was aware of the -f option. My concerns are more with what using it on the System folder will do to the size of my Spotlight index and the speed of searches afterwards.

With the sheer number of files in the System folder, I am afraid that indexing will take ages and bloat the index immensely. As Spotlight’s performance on my system is already miserable with just my normal files I didn’t want to try that.

Can you offer any information on what indexing the /System folder does to the index size and search speed?

June 9, 2005, 21:04

Comment by Paul Mison:

I haven’t upgraded to 10.4 yet, but I understand that you can start slideshows from the Finder contextual menu. Mind you, I’ve no idea what conditions have to be true to allow that to show up. However, there’s a macosxhints post that says it’s based on a selection: http://www.macosxhints.com/article.php?story=20050504102137590

Part of the reason I’m sticking with 10.3 is that I really like the current Finder “find file” implementation- I hated the Sherlock 2 or Sherlock 3 behaviour, and on Mac OS 9 I used to keep Sherlock 1 around because it did the Right Thing. Your (thorough) report adds to the pile of Spotlight criticism that’s helping me be happy to stick.

June 10, 2005, 15:00

Comment by Paul Mison:

Here’s an edited Terminal transcript of some mdfind/mdimport tweaking, on a Mac mini I was able to get an account on (and whose Spotlight index I didn’t mind messing up). The summary is that indexing /System makes the index about 20MB bigger but doesn’t appear to affect search speed significantly, but this is obviously not scientific.

macmini:~ pmison$ time mdfind "blur"
/Developer/Examples/Quartz Composer/Motion Graphics Compositions/Blurry.qtz
[...]

real    0m1.306s
user    0m0.035s
sys     0m0.028s
macmini:~ pmison$ sudo du -sh /.Spotlight-V100/
 63M    /.Spotlight-V100/
macmini:~ pmison$ mdimport -f /System
PSSniffer error: No such file or directory
PSSniffer error: Not a directory
PSSniffer error: Not a directory
PSSniffer error: Not a directory
2005-06-10 14:02:24.298 mdimport[1227] *** +[NSUnarchiver unarchiveObjectWithData:]: extra data discarded
2005-06-10 14:02:24.322 mdimport[1227] *** +[NSUnarchiver unarchiveObjectWithData:]: extra data discarded
macmini:~ pmison$ sudo du -sh /.Spotlight-V100/
 85M    /.Spotlight-V100/
macmini:~ pmison$ time mdfind "blur"
/Developer/Examples/Quartz Composer/Motion Graphics Compositions/Blurry.qtz
/System/Library/Frameworks/Quartz.framework/Versions/A/Frameworks/QuartzComposer.framework/Versions/A/Resources/English.lproj/QCImageFilter.xml
[...]

real    0m1.170s
user    0m0.037s
sys     0m0.034s

June 10, 2005, 15:19

Comment by ssp:

Thanks for the useful hints, Paul.

I checked the menus and the Mac Help for the slideshows and couldn’t find anything. I keep forgetting the contextual menus. And while I’m the first one to complain about things that are available through contextual menus only, I’m quite happy that this works at all.

As for the indexing of the System folder - the numbers aren’t as bad as I feared they would be. So I might give it a try.

June 10, 2005, 16:43

Comment by ssp:

More on UTIs and how to solve the Illustrator indexing problem in a separate post.

June 18, 2005, 23:14
