Pierre recently wrote about Spotlight’s sluggish UI, and in the comments on his post I couldn’t help but mention my foremost gripe about Spotlight’s implementation once more: the allegedly most advanced and spiffy search technology isn’t good at finding things. And that is because it just ignores most of your hard drive.
While it seems to be really hard to get information from Spotlight about which files and how many files have been indexed, it looks pretty much like Spotlight prefers to ignore most of the contents of your computer’s System and main Library folders, places which contain the majority of files on most computers. And if the Spotlight engine is too inefficient to handle indexing and searching all of the files on your drives without completely choking the machine, it may be a good idea not to have it index all of them.
However, it should at least include the names of all the other files in its index as in the current situation Spotlight’s search results are the worst you can get on the market. Anything from System 7 to Linux will give you better search results on file names than Spotlight will. And that’s only because Spotlight deliberately ignores most files on your hard drive.
If you really want to find one of them – say, if you quickly want to locate some framework when programming – you’ll have to dig your way through the folders manually or use the Unixy locate tool if it is active on your system. Or you’ll have to use the Mac system’s old-fashioned find capabilities. But how is that done?
The first thing you have to be aware of is that the Spotlight menu and window behave quite differently from other instances of Spotlight UI, say Finder windows in ‘Find mode’. Not only will those dedicated Spotlight frontends display Mail messages while the Finder’s windows (and thus by default saved searches as well) won’t, they also use Spotlight exclusively to locate stuff. As far as I can tell, there’s no way to use the Spotlight menu or window to find anything in places which Apple considered inappropriate to search. So let’s look at the other methods: Find mode in Finder windows or Open/Save sheets.
Both the windows and the sheets seem to have about the same features. But those can differ vastly depending on the situation you are in. For me the standard situation starts with a window in the Finder. Say, a window displaying my home folder where the Documents folder is selected:
Starting there we give the Find (Command-F) command, which doesn’t open a new Find window but instead turns our existing window into a Find window by adding some extra decorations. Note that those decorations include the ‘grey bar’ at the top to set the scope of the search. Regardless of your location this defaults to ‘Computer’ and presumably searches everything? on your startup volume? on the internal hard drives? that Apple considers reasonable? Well, the latter. Which means that starting a find in a window of your home folder where another subfolder is selected won’t give you search results for either of those but for a completely different and unknown selection of folders on your drives.
However, note that we do get entries in that ‘grey bar’ to limit the search to your home folder or to the folder that was initially selected. I.e. it does make a difference in which folder you issued the Find command, if only for convenient access to those restrictions of the search’s scope. Selecting one of those will give you an appropriate subset of the search results you had so far. So let’s do something more interesting now. Go to the System folder:
And once you’re there do the same Find command. It gives you the same results you had before. With the only difference being that you can now restrict the search results to the System folder rather than the Documents folder:
What’s surprising is what happens when you now click on the System folder tab in the ‘grey bar’. This doesn’t give you a subset of the previous search results but will generate a completely new set – presumably because the System folder isn’t in the Spotlight index and the system just goes and searches the folder ‘manually’. And here we finally find the file we’ve been looking for!
So we now have a ‘trick’ to find files in places where Apple doesn’t want us to find them. It requires a number of extra clicks and some awareness of what you’re doing. And it sort-of blurs the whole concept of the ‘grey bar’ which seems to be a bit random, but it’s probably better than having to deal with locate and things like case sensitivity.
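For those who don’t mind the command line, a case-insensitive name search with plain find sidesteps both the Spotlight index and the case sensitivity issue. This is only a minimal sketch and not what the Finder does internally: the helper name find_by_name and the example path and pattern are assumptions made up for illustration, and -iname is an extension that both the BSD and GNU versions of find provide.

```shell
#!/bin/sh
# find_by_name is a hypothetical helper, not a system command.
# It walks the folder itself rather than consulting any index,
# so it is slow on big hierarchies but sees every file it may read.
find_by_name() {
    # $1 = folder to search, $2 = name fragment (matched case-insensitively)
    find "$1" -iname "*$2*" 2>/dev/null
}

# Example (assumed path and pattern):
#   find_by_name /System/Library 'webkit'
```

Because it walks the folder on every call, this will probably feel slower than the ‘grey bar’ trick on a big hierarchy like the System folder.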
So far I have always said that Spotlight ignores ‘most’ files on your drive. But I never really quantified that. So I thought I should do that before making the point to Pierre in my comment. Which got me into another problem with OS X: It doesn’t tell you how many files there are in a folder. Let’s just travel back in time a little and see how this worked in the early nineties: Select a folder, Get Info (Command-I) and there you are:
You won’t only see that a folder full of games easily fits in a few megabytes but you’ll also see the number of files in the selected folder right there.
Bang!, as Mr Jobs would say. In comparison to this, OS X is pretty lame. Its Finder info windows will only display the total size of the folder but won’t give you the file count. Admittedly, with all the bundle and package junk on OS X it’s a bit difficult to tell what should be considered a file. But I’m sure the clever people at Apple would’ve been able to come up with a way of counting them that at most 90% of the users hated.
However, this doesn’t really help us with our problem. How can I figure out the number of files on my hard drive or in my home folder? If you’re prepared to accept a ‘raw’ count, i.e. one that ignores packages and bundles, the answer to the first question is quite easy to get with Disk Utility. For each volume you select, it will display the number of folders and files on that volume:
But what about the folders? I’m sure that someone can come up with a ‘smart’, i.e. obfuscated, perl one-liner to do the job. But that’s nothing I can do and not particularly Mac-like. So I thought we could just use Automator for this task – it can do recursive listing of files after all. And that sort-of worked. Unfortunately the technology is too bad to handle large numbers of files – I just got zero as a result when running it on my entire home folder – and it’s also terribly slow.
After Automator failed once more, I thought we should be able to use Spotlight for the trick. Having in mind the experiences described above, all we’d need is the trivial search restricted to a particular folder. So it just remains to figure out what kind of search would be sufficiently trivial. I decided that asking for all objects that were modified after some date in the distant past should do the trick. And after a lengthy wait with high processor usage it turned out that there are 44212 files, as in packages or files, in my home folder:
Well done, Mr Sven. Using the finest high tech for such trivial tasks. But utterly failing on the way! Because I’m pretty sure that those numbers don’t include many of the files in my Library/Caches folder. So this technique ends up not being particularly useful. Unless you’re exclusively searching folder hierarchies which are either included or excluded from Spotlight indexing, I suppose.
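One way to check that suspicion would be to put a raw walk of a folder next to what the Spotlight index reports for it. The following is a rough sketch under a few assumptions: it uses the mdfind command-line tool that ships with Spotlight, the helper names count_on_disk and count_in_index are invented for this example, and the wildcard query on kMDItemFSName is just one guess at a ‘sufficiently trivial’ search. The two numbers won’t count exactly the same things (the raw walk descends into packages, for a start), so only a large gap would mean anything.

```shell
#!/bin/sh
# count_on_disk and count_in_index are hypothetical helper names.

count_on_disk() {
    # Walk the folder itself; like `find . | wc -l`, the count
    # includes the starting folder and every sub-folder.
    find "$1" 2>/dev/null | wc -l | tr -d ' '
}

count_in_index() {
    # Ask the Spotlight index for every named item below $1.
    # mdfind only exists on Mac OS X, so print "n/a" elsewhere.
    if command -v mdfind >/dev/null 2>&1; then
        mdfind -onlyin "$1" 'kMDItemFSName == "*"' 2>/dev/null | wc -l | tr -d ' '
    else
        echo "n/a"
    fi
}

# Example (assumed folder):
#   echo "disk:  $(count_on_disk "$HOME/Library/Caches")"
#   echo "index: $(count_in_index "$HOME/Library/Caches")"
```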
I must admit all this remains a bit unclear to me, and I don’t really know how to achieve a good file count in a Mac-like fashion. So send in your perl scripts…
find . -type f | wc -l
should do it, I would guess. (This explicitly excludes folders, including bundles, but includes the files therein, such as files in bundles.)
This gives me 23486 results for my home folder, compared to the 5164 items reported by Finder’s search (which does include folders, since the “Kind” query modifier ever-smartly does not have an “is not” option).
So to be entirely “fair”, we would have to include folders as well in our CLI search:
find . | wc -l
which returns 32347.
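One caveat with both commands: wc -l counts lines, so any file whose name contains a newline is counted more than once. A slightly more careful sketch is shown here, assuming a find that supports -print0 (the BSD and GNU versions both do); count_type is a made-up helper name, not a standard tool.

```shell
#!/bin/sh
# count_type is a hypothetical helper name.
count_type() {
    # $1 = folder, $2 = find type letter (f = regular file, d = directory)
    # Emit one NUL byte per match, drop everything else, count what is left.
    find "$1" -type "$2" -print0 2>/dev/null | tr -dc '\000' | wc -c | tr -d ' '
}

# Example:
#   count_type . f    # regular files only
#   count_type . d    # folders, including the starting folder itself
```

Adding the two results should come close to the plain find . | wc -l figure, minus symbolic links and other special files that are neither.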
Am I the only one whose ~/Library is searched by default? I had to manually add it to the exclude list, because I kept getting irrelevant search results. Then I had to add the /private folders too. Except for some reason (bug) it won’t let me add most of those folders to my exclude list!! I kept getting damn files from /tmp in my results! In the end, I found a folder that it would let me add to the exclude list, and that solved part of the problem, but still: WTF?
“Then I had to add the /private folders too. Except for some reason (bug) it won’t let me add most of those folders to my exclude list!”
Maybe the list doesn’t parse symlinks? For Spotlight, /private/tmp and /tmp may be two different directories.
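On Mac OS X, /tmp is indeed a symbolic link into /private, so the guess is at least plausible. Whether two paths end up at the same physical folder can be checked with a small sketch like this one; same_dir is a made-up helper name, and it says nothing about how Spotlight’s exclude list actually treats such links.

```shell
#!/bin/sh
# same_dir is a hypothetical helper name.
same_dir() {
    # Resolve both paths to their physical location (symlinks followed)
    # and compare. Returns 0 if equal, 1 if different, 2 if either
    # path cannot be entered.
    a=$(cd "$1" 2>/dev/null && pwd -P) || return 2
    b=$(cd "$2" 2>/dev/null && pwd -P) || return 2
    [ "$a" = "$b" ]
}

# Example:
#   same_dir /tmp /private/tmp && echo "same folder"
```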
[kalle:~] ssp% find . -type f | wc -l
100853
[kalle:~] ssp% find . | wc -l
111207