Quarter Life Crisis

The world according to Sven-S. Porst

« Where the hell is Iowa?MainAdrian Mole »

X.5 Time Warp 2

1889 words on

Time Machine Icon Mac OS X.5 ships with the Time Machine backup technology which is designed to make the creation of backups as simple as possible. Of course it is essentially a dot-zero release by a company with no experience in backup technology at this stage and thus it doesn’t seem advisable to put your data’s existence at Time Machine’s mercy alone. Yet, the sheer convenience of it, makes having Time Machine backups handy very attractive. Problems like the accidental deletion a file or Safari trashing your cookies become much less dramatic ones if you can recover things with a few clicks of your mouse in less than a minute.

Unfortunately, Time Machine and me were off to a bad start. Even trying to create the initial backup spectacularly failed on my machine and led to kernel panics time after time. Those panics pointed to some USB problem and somehow managed to screw up the Time Machine backup in a way that the software couldn’t recover from. I started to develop a number of theories about what exactly caused that problem but things couldn’t really be narrowed down. Something fishy with USB seemed to be the cause of the problem (with the panic log pointing right there) but that’s all I could tell. Whether it’s my MacBook, the cable, the hard drive enclosure or the hard drive itself that OS X.5’s buggy USB drivers trip over remains unclear to me.

After a few rounds of that, the machine finally managed to create its initial backup. I have no idea why and what the difference to my previous setup was but I just accepted that and things worked. After more than six weeks of no problems, OS X.5 decided to switch to its nasty side again, though, for the new year. Instead of wishing me a happy new one, the machine showed me a Time Machine error message about backing up having failed.

I didn’t mind much at that stage and assumed that the next Time Machine run would rectify things. Which it didn’t. Instead, I kept getting the same message hour after hour until I started being rather weary of that. The Time Machine preferences showed this:

Time Machine preferences screenshot showing an error notification

and clicking the promising red information button in there, failed to provide any helpful information or even a way to fix the problem, it just displayed a rather pointless sheet containing no new information at all:

Sheet with text saying the backup couldn't be completed because an error occurred while copying files.

My first intuition was that perhaps the backup volume had run full and Time Machine wasn’t that good at clearing out old backups to gain extra space yet. Sure, there were still 5GB free on the volume in question and the amount to be backed up couldn’t have been more than a few 100MB. But the numbers I have seen Time Machine guess always seem to be a bit on the high side – which probably is a good way of playing things safely.

So I first moved some GB to another drive in the hope that this would magically solve the problem. Which it didn’t. As a last resort I decided to take a look at the logs. What I saw – hour after hour – typically looked like this:

Jan  1 00:04:19 Kalle /System/Library/CoreServices/backupd[52657]: Backup requested by automatic scheduler
Jan  1 00:04:19 Kalle /System/Library/CoreServices/backupd[52657]: Starting standard backup
Jan  1 00:04:19 Kalle /System/Library/CoreServices/backupd[52657]: Backing up to: /Volumes/Snowman/Backups.backupdb
Jan  1 00:06:26 Kalle /System/Library/CoreServices/backupd[52657]: No pre-backup thinning needed: 3.45 GB requested (including padding), 4.94 GB available
Jan  1 00:06:30 Kalle mds[52525]: (/Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143)(Error) IndexSDB in map_update:/Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143/store.db : ERR: map_update: page offset doesn't match! 0x1191 != 0x1189
Jan  1 00:06:30 Kalle mds[52525]: /SourceCache/Spotlight/Spotlight-398.1/core-db/sdb.c:1029: failed assertion 'dbme->pgnum == pgnum'   
Jan  1 00:06:32 Kalle ReportCrash[52661]: Formulating crash report for process mds[52525]
Jan  1 00:06:33 Kalle com.apple.launchd[1] (0x1109e0.mdworker[52531]): Exited: Terminated
Jan  1 00:06:35 Kalle com.apple.launchd[1] (com.apple.metadata.mds[52525]): Exited abnormally: Bus error
Jan  1 00:06:35 Kalle ReportCrash[52661]: Saved crashreport to /Library/Logs/CrashReporter/mds_2008-01-01-000631_Kalle.crash using uid: 0 gid: 0, euid: 0 egid: 0
Jan  1 00:06:36 Kalle mds[52662]: (/Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143)(Error) IndexCI in ContentIndexOpenBulk:Unclean shutdown of /Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143/live.0.; needs recovery
Jan  1 00:06:36 Kalle mds[52662]: (/Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/50572971-97D1-41D8-9A8D-3A9A929EB339)(Error) IndexCI in ContentIndexOpenBulk:Unclean shutdown of /Volumes/Snowman/.Spotlight-V100/Store-V1/Stores/50572971-97D1-41D8-9A8D-3A9A929EB339/0.; needs recovery
Jan  1 00:06:37 Kalle /System/Library/CoreServices/backupd[52657]: MDBackupIndexFile returned -1134 for: /Applications, /Volumes/Snowman/Backups.backupdb/Kalle/2007-12-31-190422.inProgress/C52FE0AD-F4A1-4A55-8654-7A78144150E9/Kalle/Applications
Jan  1 00:06:37 Kalle /System/Library/CoreServices/backupd[52657]: Aborting backup because MDBackupIndexFile failed
Jan  1 00:06:37 Kalle /System/Library/CoreServices/backupd[52657]: Canceling backup.
Jan  1 00:06:37 Kalle /System/Library/CoreServices/backupd[52657]: Copied 113 files (0 bytes) from volume Kalle.
Jan  1 00:06:37 Kalle /System/Library/CoreServices/backupd[52657]: Copy stage failed with error:11
Jan  1 00:06:38 Kalle /System/Library/CoreServices/backupd[52657]: Backup failed with error: 11

While these messages weren’t quite as clear as I had hoped for, they point in the direction of Spotlight. Not only does Spotlight’s background mds process run into an error, it actually crashes over this with one of its threads ending up in a state like this:

Thread 15 Crashed:
0   com.apple.spotlight.index       0x00233580 _dead_beef + 11
1   com.apple.spotlight.index       0x00150ff1 map_update + 689
2   com.apple.spotlight.index       0x001577e1 page_insert_obj + 657
3   com.apple.spotlight.index       0x0015cfa2 db_store_obj + 546
4   com.apple.spotlight.index       0x00175b41 si_writeBackDBO + 412
5   com.apple.spotlight.index       0x0016d683 si_writeBackAndIndex + 1851
6   com.apple.spotlight.index       0x0016f23e setAttributes + 6676
7   com.apple.spotlight.index       0x001879bd si_work_queue_process + 637
8   com.apple.spotlight.index       0x00187cab si_scheduler_once + 419
9   com.apple.spotlight.index       0x00187cab si_scheduler_once + 419
10  com.apple.spotlight.index       0x00188036 si_scheduler_run_waiting_timeout + 581
11  com.apple.spotlight.index       0x00174a9e runLoop + 77
12  com.apple.spotlight.index       0x00174b8a setattr_runLoop + 68
13  libSystem.B.dylib               0x9296c075 _pthread_start + 321
14  libSystem.B.dylib               0x9296bf32 thread_start + 34

This left me non-amused. As Spotlight was totally and unrecoverably broken for me while running X.4, X.5’s improved Spotlight was a big deal for me. And mds crashes along with failed backups were pretty much the last thing I wanted to see in this brave new X.5 world.

My first attempt to solve the problem was to reindex the volume using the sudo mdutil -E /Volumes/Snowman command. As I also store loads of huge PDF files on that volume, doing this took ages. And it didn’t seem to solve the problem. Still having the possibly vain hope that I could fix things myself, I decided to have a look at the Spotlight index itself and possibly trash it (the results of which for some reason seem to differ a bit from what the mdutil command does).

Going to the interesting folder in question I found this:

[kalle:~] ssp% sudo ls /Volumes/Snowman/.Spotlight-V100/Store-V1/
Stores             VolumeConfig.plist

Of which the VolumeConfig.plist file was interesting:

[kalle:~] ssp% sudo cat /Volumes/Snowman/.Spotlight-V100/Store-V1/VolumeConfig.plist 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>50572971-97D1-41D8-9A8D-3A9A929EB339</key>
    <dict>
        <key>partialPath</key>
        <string></string>
        <key>policySearch</key>
        <integer>3</integer>
    </dict>
    <key>6F049E25-9A76-49D6-ADC2-9C2478199143</key>
    <dict>
        <key>partialPath</key>
        <string>Backups.backupdb</string>
        <key>policySearch</key>
        <integer>3</integer>
    </dict>
</dict>
</plist>

While it is clear that the Time Machine backup has to be indexed by Spotlight to be searchable, this looks like Spotlight was actually refined a bit more carefully and can now create several indices per volume. In this case one for the volume in general and another one for the Backups.backupdb folder which contains the Time Machine backups. The UUID the index for the Backups.backupdb folder has is the one that also occurred in the log messages I had. So I decided to turn off Time Machine, delete the …/Stores/6F049E25-9A76-49D6-ADC2-9C2478199143/ folder, use the mdimport command to re-index the backup and turn Time Machine on again.

The re-indexing took quite a while. And the backup that eventually started included almost 10GB of files – of which I am not quite sure which they were (who can write a script that traverses the backup sessions and just lists the files which were freshly backed up in that session?) – but at least I seem to have a working backup system again at this moment. Let’s see how long this lasts.

Unfortunately actual information about solving Time Machine problems is still scarce in Google. In case you had similar problems, I hope this could help you. But in any case take care to not mess too badly with your data in case they are important. If you have a better understanding of what is going on, please leave a comment below or send a message.


Bonus Geek Material

Unfortunately I also ran into another kernel panic while trying to save my backup. The system was busy re-indexing, doing some further copying and playing a film. I also launched EyeTV. Shortly after that I got a kernel panic which points to EyeTV as the culprit:

panic(cpu 0 caller 0x001A7BED): Kernel trap at 0x0041e1bb, type 14=page fault, registers:
CR0: 0x8001003b, CR2: 0x0000000c, CR3: 0x00d4d000, CR4: 0x00000660
EAX: 0x00000000, EBX: 0x03d58d80, ECX: 0x03aab998, EDX: 0x36dde96c
CR2: 0x0000000c, EBP: 0x36dde8e8, ESI: 0x03d58d88, EDI: 0x03d58d90
EFL: 0x00010206, EIP: 0x0041e1bb, CS:  0x00000008, DS:  0x08d80010
Error code: 0x00000002

Backtrace, Format - Frame : Return Address (4 potential args on stack) 
0x36dde6d8 : 0x12b0e1 (0x455670 0x36dde70c 0x133238 0x0) 
0x36dde728 : 0x1a7bed (0x45ea20 0x41e1bb 0xe 0x45e1d4) 
0x36dde808 : 0x19e517 (0x36dde820 0x0 0x36dde8e8 0x41e1bb) 
0x36dde818 : 0x41e1bb (0xe 0x48 0x10 0x10) 
0x36dde8e8 : 0x41da73 (0x3d58d80 0x36dde96c 0x0 0x0) 
0x36dde938 : 0x41e3f4 (0x3c240c0 0x41e15e 0x36dde96c 0x0) 
0x36dde988 : 0x34597ff1 (0x3d58d80 0x0 0x36dde9a8 0x1a136f) 
0x36dde9f8 : 0x345ab616 (0x3c3b800 0x36ddead0 0x36ddea74 0x7) 
0x36ddea48 : 0x3459f3d3 (0x3c3b800 0x36ddead0 0x36ddea74 0x7) 
0x36ddea98 : 0x345aeca7 (0xc89d5c0 0x36ddead0 0x1388 0x0) 
0x36ddeaf8 : 0x345aea06 (0x4aba400 0x0 0xc0 0x0) 
0x36ddeb78 : 0x439b71 (0x4aba400 0x0 0x36ddec20 0x2) 
0x36ddebd8 : 0x4379b0 (0x4aba400 0xd 0x36ddec20 0x345bb1a4) 
0x36ddecf8 : 0x18d037 (0x4aba400 0xd 0xbbbda68 0x9) 
0x36ddfdb8 : 0x12d165 (0xbbbda40 0xc11e5a0 0x0 0x0) 
0x36ddfdf8 : 0x126247 (0xbbbda00 0x0 0x9c 0x36ddfee4) 
    Backtrace continues...
      Kernel loadable modules in backtrace (with dependencies):
         com.apple.iokit.IOUSBFamily(3.0.5)@0x34595000->0x345bcfff

BSD process name corresponding to current thread: EyeTV

Mac OS version:
9B18

Kernel version:
Darwin Kernel Version 9.1.0: Wed Oct 31 17:46:22 PDT 2007; root:xnu-1228.0.2~1/RELEASE_I386
System model name: MacBook2,1 (Mac-F4208CAA)

However, I thought that the beginning of that stack trace looks very similar to the panics I had in November, even though the version number of Apple’s IOUSBFamily stuff seems to have increased slightly since:

Sat Nov 17 18:51:07 2007
panic(cpu 1 caller 0x001A7BED): Kernel trap at 0x0041e1bb, type 14=page fault, registers:
CR0: 0x8001003b, CR2: 0x0000000c, CR3: 0x01138000, CR4: 0x00000660
EAX: 0x00000000, EBX: 0x03f50280, ECX: 0x0061e216, EDX: 0x344fbbdc
CR2: 0x0000000c, EBP: 0x344fbb58, ESI: 0x03f50288, EDI: 0x03f50290
EFL: 0x00010206, EIP: 0x0041e1bb, CS:  0x00000008, DS:  0x00000010
Error code: 0x00000002

Backtrace, Format - Frame : Return Address (4 potential args on stack) 
0x344fb948 : 0x12b0e1 (0x455670 0x344fb97c 0x133238 0x0) 
0x344fb998 : 0x1a7bed (0x45ea20 0x41e1bb 0xe 0x45e1d4) 
0x344fba78 : 0x19e517 (0x344fba90 0x216 0x344fbb58 0x41e1bb) 
0x344fba88 : 0x41e1bb (0xe 0x48 0x344f0010 0xba20010) 
0x344fbb58 : 0x41da73 (0x3f50280 0x344fbbdc 0x0 0x0) 
0x344fbba8 : 0x41e3f4 (0x3f54940 0x41e15e 0x344fbbdc 0x0) 
0x344fbbf8 : 0x612cfb (0x3f50280 0x0 0x1000 0x3eed800) 
0x344fbc68 : 0x6173dc (0x3eed800 0xb9ada00 0x9 0x406ac4c) 
0x344fbca8 : 0xa03135 (0x406ac40 0xb9ada00 0x7530 0x7530) 
0x344fbd08 : 0xa033b6 (0x4049200 0x40492e4 0x4 0x3f5db00) 
0x344fbd48 : 0x612350 (0x4049200 0x40492e4 0x0 0x0) 
0x344fbda8 : 0x9e5d85 (0x3eed800 0x3f5ec00 0x0 0x0) 
0x344fbe48 : 0x9e6724 (0x3eed800 0x3efd7b0 0x0 0x0) 
0x344fbec8 : 0x9e6779 (0x3eed800 0x0 0x0 0x408ea0) 
0x344fbee8 : 0x9df998 (0x3eed800 0x0 0x344fbf18 0x9dfa7c) 
0x344fbf18 : 0x41d149 (0x3eed800 0x0 0x1 0x19ccc1) 
    Backtrace continues...
      Kernel loadable modules in backtrace (with dependencies):
         com.apple.iokit.IOUSBMassStorageClass(2.0.0)@0xa00000->0xa09fff
            dependency: com.apple.iokit.IOSCSIArchitectureModelFamily(2.0.0)@0x89a000
            dependency: com.apple.iokit.IOUSBFamily(3.0.3)@0x60d000
            dependency: com.apple.iokit.IOStorageFamily(1.5.2)@0x8b2000
         com.apple.driver.AppleUSBEHCI(3.0.3)@0x9dc000->0x9edfff
            dependency: com.apple.iokit.IOPCIFamily(2.4)@0x63c000
            dependency: com.apple.iokit.IOUSBFamily(3.0.3)@0x60d000
         com.apple.iokit.IOUSBFamily(3.0.3)@0x60d000->0x634fff

BSD process name corresponding to current thread: kernel_task

Mac OS version:
9B18

Kernel version:
Darwin Kernel Version 9.1.0: Wed Oct 31 17:46:22 PDT 2007; root:xnu-1228.0.2~1/RELEASE_I386
System model name: MacBook2,1 (Mac-F4208CAA)

Not that this is terribly meaningful to me. I’m just wondering whether both of these problems have the same roots. And if they do, who is to blame. As I used to get the crashes without EyeTV running, my bet would be on OS X.

January 5, 2008, 10:11

Tagged as X.5.

Add your comment

« Where the hell is Iowa?MainAdrian Mole »

Comments on

Photos

Categories

Me

This page

Out & About

pinboard Links

People

Ego-Linking