1889 words on Mac OS X 10.5 Leopard
Mac OS X.5 ships with the Time Machine backup technology which is designed to make the creation of backups as simple as possible. Of course it is essentially a dot-zero release by a company with no experience in backup technology at this stage and thus it doesn’t seem advisable to put your data’s existence at Time Machine’s mercy alone. Yet, the sheer convenience of it, makes having Time Machine backups handy very attractive. Problems like the accidental deletion a file or Safari trashing your cookies become much less dramatic ones if you can recover things with a few clicks of your mouse in less than a minute.
Unfortunately, Time Machine and me were off to a bad start. Even trying to create the initial backup spectacularly failed on my machine and led to kernel panics time after time. Those panics pointed to some USB problem and somehow managed to screw up the Time Machine backup in a way that the software couldn’t recover from. I started to develop a number of theories about what exactly caused that problem but things couldn’t really be narrowed down. Something fishy with USB seemed to be the cause of the problem (with the panic log pointing right there) but that’s all I could tell. Whether it’s my MacBook, the cable, the hard drive enclosure or the hard drive itself that OS X.5’s buggy USB drivers trip over remains unclear to me.
After a few rounds of that, the machine finally managed to create its initial backup. I have no idea why and what the difference to my previous setup was but I just accepted that and things worked. After more than six weeks of no problems, OS X.5 decided to switch to its nasty side again, though, for the new year. Instead of wishing me a happy new one, the machine showed me a Time Machine error message about backing up having failed.
I didn’t mind much at that stage and assumed that the next Time Machine run would rectify things. Which it didn’t. Instead, I kept getting the same message hour after hour until I started being rather weary of that. The Time Machine preferences showed this:
and clicking the promising red information button in there, failed to provide any helpful information or even a way to fix the problem, it just displayed a rather pointless sheet containing no new information at all:
My first intuition was that perhaps the backup volume had run full and Time Machine wasn’t that good at clearing out old backups to gain extra space yet. Sure, there were still 5GB free on the volume in question and the amount to be backed up couldn’t have been more than a few 100MB. But the numbers I have seen Time Machine guess always seem to be a bit on the high side – which probably is a good way of playing things safely.
So I first moved some GB to another drive in the hope that this would magically solve the problem. Which it didn’t. As a last resort I decided to take a look at the logs. What I saw – hour after hour – typically looked like this:
Jan 1 00:04:19 Kalle /System/Library/CoreServices/backupd: Backup requested by automatic scheduler Jan 1 00:04:19 Kalle /System/Library/CoreServices/backupd: Starting standard backup Jan 1 00:04:19 Kalle /System/Library/CoreServices/backupd: Backing up to: /Volumes/Snowman/Backups.backupdb Jan 1 00:06:26 Kalle /System/Library/CoreServices/backupd: No pre-backup thinning needed: 3.45 GB requested (including padding), 4.94 GB available Jan 1 00:06:30 Kalle mds: (/Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143)(Error) IndexSDB in map_update:/Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143/store.db : ERR: map_update: page offset doesn't match! 0x1191 != 0x1189 Jan 1 00:06:30 Kalle mds: /SourceCache/Spotlight/Spotlight-398.1/core-db/sdb.c:1029: failed assertion 'dbme->pgnum == pgnum' Jan 1 00:06:32 Kalle ReportCrash: Formulating crash report for process mds Jan 1 00:06:33 Kalle com.apple.launchd (0x1109e0.mdworker): Exited: Terminated Jan 1 00:06:35 Kalle com.apple.launchd (com.apple.metadata.mds): Exited abnormally: Bus error Jan 1 00:06:35 Kalle ReportCrash: Saved crashreport to /Library/Logs/CrashReporter/mds_2008-01-01-000631_Kalle.crash using uid: 0 gid: 0, euid: 0 egid: 0 Jan 1 00:06:36 Kalle mds: (/Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143)(Error) IndexCI in ContentIndexOpenBulk:Unclean shutdown of /Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143/live.0.; needs recovery Jan 1 00:06:36 Kalle mds: (/Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/50572971-97D1-41D8-9A8D-3A9A929EB339)(Error) IndexCI in ContentIndexOpenBulk:Unclean shutdown of /Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/50572971-97D1-41D8-9A8D-3A9A929EB339/0.; needs recovery Jan 1 00:06:37 Kalle /System/Library/CoreServices/backupd: MDBackupIndexFile returned -1134 for: /Applications, /Volumes/Snowman/Backups.backupdb/Kalle/2007-12-31-190422.inProgress/C52FE0AD-F4A1-4A55-8654-7A78144150E9/Kalle/Applications Jan 1 00:06:37 Kalle /System/Library/CoreServices/backupd: Aborting backup because MDBackupIndexFile failed Jan 1 00:06:37 Kalle /System/Library/CoreServices/backupd: Canceling backup. Jan 1 00:06:37 Kalle /System/Library/CoreServices/backupd: Copied 113 files (0 bytes) from volume Kalle. Jan 1 00:06:37 Kalle /System/Library/CoreServices/backupd: Copy stage failed with error:11 Jan 1 00:06:38 Kalle /System/Library/CoreServices/backupd: Backup failed with error: 11
While these messages weren’t quite as clear as I had hoped for, they point in the direction of Spotlight. Not only does Spotlight’s background
mds process run into an error, it actually crashes over this with one of its threads ending up in a state like this:
Thread 15 Crashed: 0 com.apple.spotlight.index 0x00233580 _dead_beef + 11 1 com.apple.spotlight.index 0x00150ff1 map_update + 689 2 com.apple.spotlight.index 0x001577e1 page_insert_obj + 657 3 com.apple.spotlight.index 0x0015cfa2 db_store_obj + 546 4 com.apple.spotlight.index 0x00175b41 si_writeBackDBO + 412 5 com.apple.spotlight.index 0x0016d683 si_writeBackAndIndex + 1851 6 com.apple.spotlight.index 0x0016f23e setAttributes + 6676 7 com.apple.spotlight.index 0x001879bd si_work_queue_process + 637 8 com.apple.spotlight.index 0x00187cab si_scheduler_once + 419 9 com.apple.spotlight.index 0x00187cab si_scheduler_once + 419 10 com.apple.spotlight.index 0x00188036 si_scheduler_run_waiting_timeout + 581 11 com.apple.spotlight.index 0x00174a9e runLoop + 77 12 com.apple.spotlight.index 0x00174b8a setattr_runLoop + 68 13 libSystem.B.dylib 0x9296c075 _pthread_start + 321 14 libSystem.B.dylib 0x9296bf32 thread_start + 34
This left me non-amused. As Spotlight was totally and unrecoverably broken for me while running X.4, X.5’s improved Spotlight was a big deal for me. And
mds crashes along with failed backups were pretty much the last thing I wanted to see in this brave new X.5 world.
My first attempt to solve the problem was to reindex the volume using the
sudo mdutil -E /Volumes/Snowman command. As I also store loads of huge PDF files on that volume, doing this took ages. And it didn’t seem to solve the problem. Still having the possibly vain hope that I could fix things myself, I decided to have a look at the Spotlight index itself and possibly trash it (the results of which for some reason seem to differ a bit from what the mdutil command does).
Going to the interesting folder in question I found this:
[kalle:~] ssp% sudo ls /Volumes/Snowman/.Spotlight-V100/Store-V1/ Stores VolumeConfig.plist
Of which the VolumeConfig.plist file was interesting:
[kalle:~] ssp% sudo cat /Volumes/Snowman/.Spotlight-V100/Store-V1/VolumeConfig.plist <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"> <dict> <key>50572971-97D1-41D8-9A8D-3A9A929EB339</key> <dict> <key>partialPath</key> <string></string> <key>policySearch</key> <integer>3</integer> </dict> <key>6F049E25-9A76-49D6-ADC2-9C2478199143</key> <dict> <key>partialPath</key> <string>Backups.backupdb</string> <key>policySearch</key> <integer>3</integer> </dict> </dict> </plist>
While it is clear that the Time Machine backup has to be indexed by Spotlight to be searchable, this looks like Spotlight was actually refined a bit more carefully and can now create several indices per volume. In this case one for the volume in general and another one for the Backups.backupdb folder which contains the Time Machine backups. The UUID the index for the Backups.backupdb folder has is the one that also occurred in the log messages I had. So I decided to turn off Time Machine, delete the …/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143/ folder, use the mdimport command to re-index the backup and turn Time Machine on again.
The re-indexing took quite a while. And the backup that eventually started included almost 10GB of files – of which I am not quite sure which they were (who can write a script that traverses the backup sessions and just lists the files which were freshly backed up in that session?) – but at least I seem to have a working backup system again at this moment. Let’s see how long this lasts.
Unfortunately actual information about solving Time Machine problems is still scarce in Google. In case you had similar problems, I hope this could help you. But in any case take care to not mess too badly with your data in case they are important. If you have a better understanding of what is going on, please leave a comment below or send a message.
Unfortunately I also ran into another kernel panic while trying to save my backup. The system was busy re-indexing, doing some further copying and playing a film. I also launched EyeTV. Shortly after that I got a kernel panic which points to EyeTV as the culprit:
panic(cpu 0 caller 0x001A7BED): Kernel trap at 0x0041e1bb, type 14=page fault, registers: CR0: 0x8001003b, CR2: 0x0000000c, CR3: 0x00d4d000, CR4: 0x00000660 EAX: 0x00000000, EBX: 0x03d58d80, ECX: 0x03aab998, EDX: 0x36dde96c CR2: 0x0000000c, EBP: 0x36dde8e8, ESI: 0x03d58d88, EDI: 0x03d58d90 EFL: 0x00010206, EIP: 0x0041e1bb, CS: 0x00000008, DS: 0x08d80010 Error code: 0x00000002 Backtrace, Format - Frame : Return Address (4 potential args on stack) 0x36dde6d8 : 0x12b0e1 (0x455670 0x36dde70c 0x133238 0x0) 0x36dde728 : 0x1a7bed (0x45ea20 0x41e1bb 0xe 0x45e1d4) 0x36dde808 : 0x19e517 (0x36dde820 0x0 0x36dde8e8 0x41e1bb) 0x36dde818 : 0x41e1bb (0xe 0x48 0x10 0x10) 0x36dde8e8 : 0x41da73 (0x3d58d80 0x36dde96c 0x0 0x0) 0x36dde938 : 0x41e3f4 (0x3c240c0 0x41e15e 0x36dde96c 0x0) 0x36dde988 : 0x34597ff1 (0x3d58d80 0x0 0x36dde9a8 0x1a136f) 0x36dde9f8 : 0x345ab616 (0x3c3b800 0x36ddead0 0x36ddea74 0x7) 0x36ddea48 : 0x3459f3d3 (0x3c3b800 0x36ddead0 0x36ddea74 0x7) 0x36ddea98 : 0x345aeca7 (0xc89d5c0 0x36ddead0 0x1388 0x0) 0x36ddeaf8 : 0x345aea06 (0x4aba400 0x0 0xc0 0x0) 0x36ddeb78 : 0x439b71 (0x4aba400 0x0 0x36ddec20 0x2) 0x36ddebd8 : 0x4379b0 (0x4aba400 0xd 0x36ddec20 0x345bb1a4) 0x36ddecf8 : 0x18d037 (0x4aba400 0xd 0xbbbda68 0x9) 0x36ddfdb8 : 0x12d165 (0xbbbda40 0xc11e5a0 0x0 0x0) 0x36ddfdf8 : 0x126247 (0xbbbda00 0x0 0x9c 0x36ddfee4) Backtrace continues... Kernel loadable modules in backtrace (with dependencies): com.apple.iokit.IOUSBFamily(3.0.5)@0x34595000->0x345bcfff BSD process name corresponding to current thread: EyeTV Mac OS version: 9B18 Kernel version: Darwin Kernel Version 9.1.0: Wed Oct 31 17:46:22 PDT 2007; root:xnu-1228.0.2~1/RELEASE_I386 System model name: MacBook2,1 (Mac-F4208CAA)
However, I thought that the beginning of that stack trace looks very similar to the panics I had in November, even though the version number of Apple’s IOUSBFamily stuff seems to have increased slightly since:
Sat Nov 17 18:51:07 2007 panic(cpu 1 caller 0x001A7BED): Kernel trap at 0x0041e1bb, type 14=page fault, registers: CR0: 0x8001003b, CR2: 0x0000000c, CR3: 0x01138000, CR4: 0x00000660 EAX: 0x00000000, EBX: 0x03f50280, ECX: 0x0061e216, EDX: 0x344fbbdc CR2: 0x0000000c, EBP: 0x344fbb58, ESI: 0x03f50288, EDI: 0x03f50290 EFL: 0x00010206, EIP: 0x0041e1bb, CS: 0x00000008, DS: 0x00000010 Error code: 0x00000002 Backtrace, Format - Frame : Return Address (4 potential args on stack) 0x344fb948 : 0x12b0e1 (0x455670 0x344fb97c 0x133238 0x0) 0x344fb998 : 0x1a7bed (0x45ea20 0x41e1bb 0xe 0x45e1d4) 0x344fba78 : 0x19e517 (0x344fba90 0x216 0x344fbb58 0x41e1bb) 0x344fba88 : 0x41e1bb (0xe 0x48 0x344f0010 0xba20010) 0x344fbb58 : 0x41da73 (0x3f50280 0x344fbbdc 0x0 0x0) 0x344fbba8 : 0x41e3f4 (0x3f54940 0x41e15e 0x344fbbdc 0x0) 0x344fbbf8 : 0x612cfb (0x3f50280 0x0 0x1000 0x3eed800) 0x344fbc68 : 0x6173dc (0x3eed800 0xb9ada00 0x9 0x406ac4c) 0x344fbca8 : 0xa03135 (0x406ac40 0xb9ada00 0x7530 0x7530) 0x344fbd08 : 0xa033b6 (0x4049200 0x40492e4 0x4 0x3f5db00) 0x344fbd48 : 0x612350 (0x4049200 0x40492e4 0x0 0x0) 0x344fbda8 : 0x9e5d85 (0x3eed800 0x3f5ec00 0x0 0x0) 0x344fbe48 : 0x9e6724 (0x3eed800 0x3efd7b0 0x0 0x0) 0x344fbec8 : 0x9e6779 (0x3eed800 0x0 0x0 0x408ea0) 0x344fbee8 : 0x9df998 (0x3eed800 0x0 0x344fbf18 0x9dfa7c) 0x344fbf18 : 0x41d149 (0x3eed800 0x0 0x1 0x19ccc1) Backtrace continues... Kernel loadable modules in backtrace (with dependencies): com.apple.iokit.IOUSBMassStorageClass(2.0.0)@0xa00000->0xa09fff dependency: com.apple.iokit.IOSCSIArchitectureModelFamily(2.0.0)@0x89a000 dependency: com.apple.iokit.IOUSBFamily(3.0.3)@0x60d000 dependency: com.apple.iokit.IOStorageFamily(1.5.2)@0x8b2000 com.apple.driver.AppleUSBEHCI(3.0.3)@0x9dc000->0x9edfff dependency: com.apple.iokit.IOPCIFamily(2.4)@0x63c000 dependency: com.apple.iokit.IOUSBFamily(3.0.3)@0x60d000 com.apple.iokit.IOUSBFamily(3.0.3)@0x60d000->0x634fff BSD process name corresponding to current thread: kernel_task Mac OS version: 9B18 Kernel version: Darwin Kernel Version 9.1.0: Wed Oct 31 17:46:22 PDT 2007; root:xnu-1228.0.2~1/RELEASE_I386 System model name: MacBook2,1 (Mac-F4208CAA)
Not that this is terribly meaningful to me. I’m just wondering whether both of these problems have the same roots. And if they do, who is to blame. As I used to get the crashes without EyeTV running, my bet would be on OS X.