A problem with the web is that it doesn't have a memory. By that I mean not only the curse of linkrot but also the fact that you can't simply rewind it to the state it was in a couple of days, weeks, months or years ago. Pages come, change and go, and there's not much that can be done about it short of archiving them yourself if you want a reliable reference.
I don't want to suggest this is entirely a bad thing. I can correct typos in my blog – should I see them – and the wrong version won't be referenced. No harm done, except perhaps that in the future my spelling will be rated 'moderately horrendous' instead of 'horrendous'. But similarly, I can change the points I made or the opinions I held in the past. And that doesn't only feel like cheating, it's a seriously bad thing. Particularly if done by larger bodies, whose opinions can influence decisions that are significant to many people.
Who Controls the Past, Controls the Future.
Who Controls the Present, Controls the Past.
George Orwell, 1984
And this control becomes much more significant at a time when many people trust the information they find on the web for their research or decision making. To change the past you don't have to go through the hassle of re-printing back-issues or periodicals; all you have to do is shift some bits and bytes around.
Basically the web lacks a temporal component. There's no way to tell Google to "Search for 'purification' as if it were January 24th 1984". All that exists is now. This is pretty obvious, but three recent things brought it to my attention more prominently:
Firstly, I tried to research the opinions my old university held on the Afghanistan war. Basically their stance was that putting up "Peace and Goodwill" banners for the so-called festive season while one's own country bombed the shit out of people and deserts in Afghanistan wasn't cynical, as the university chose to hold no opinion on that at all. A stance that's increasingly popular with politicians and institutions, I might add – refusing to answer the question, not even daring to say 'I don't care', and hoping it goes away. I don't know how much of the university's opinion was on public record anyway, but when trying to research it, I realised that it was very hard to find any information from before 2002.
Why? The answer seems to be that the web site was redone then and probably migrated to a different, allegedly more sophisticated back-end. In many of these transitions data is lost, degraded or not migrated at all, and thus,
secondly, nobody seems to care a lot about the past. OK, let me put that right: Conservatives of course do. But what they care about is less the factual past than those conveniently blurred images of it made to suit one's own imagination. So nobody really seems to care about providing proper archives, the implication being that nobody is interested in their contents anyway and the extra effort isn't justified. That's why I think that Matthew is essentially right in stressing the importance of data preservation, even if I don't agree with all of the practical conclusions he draws.
Thirdly, there's Feedster, which actually makes the temporal component live on the net. It also stores past entries from RSS feeds and has a 'Sort by Date' option. Its newly added image section is chronological only right now. I hope we'll see this aspect of Feedster taken further in the future. It's quite unique.