Tim Oren wrote an interesting piece about the ideas behind spam filtering strategies, roughly explaining the difference between Bayesian systems and latent semantic analysis. OS X's Mail seems to use the latter for its junk filter.
I find that after a bit of training Mail's filter works reasonably well, catching around eight junk mails per day and letting one through. Also, the junk mails I still get to see are those which look remarkably like proper e-mails. What's still not quite clear to me is whether the filter only uses keywords or is also capable of grasping the structure of e-mails. My experience with Mail's filter suggests the latter (or Apple's programmers throwing in a couple of special cases). For example, when I started training the filter I got a couple of false positives on Amazon orders, which, like spam, contain a long number in the subject line and a couple of links in the text.
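To make the keyword-versus-structure question a bit more concrete, here is a minimal Python sketch of a naive Bayes filter. It is the simpler Bayesian approach, not latent semantic analysis, and certainly not what Mail actually does. The two structural markers, the function name `features` and the class `NaiveBayes` are all made up for illustration. The point is only that "long number in the subject" and "several links" can be treated as tokens just like words, so a filter doesn't need to understand structure in any deep sense to trip over an order confirmation.

```python
import math
import re
from collections import Counter


def features(subject, body):
    """Turn a mail into a set of tokens: plain words plus two structural markers."""
    tokens = set(re.findall(r"[a-z]+", (subject + " " + body).lower()))
    # Hypothetical structural features, invented for this sketch.
    if re.search(r"\d{6,}", subject):
        tokens.add("<LONG_NUMBER_IN_SUBJECT>")
    if len(re.findall(r"https?://", body)) >= 2:
        tokens.add("<SEVERAL_LINKS>")
    return tokens


class NaiveBayes:
    """Tiny two-class naive Bayes over token presence, with add-one smoothing."""

    def __init__(self):
        self.token_counts = {"junk": Counter(), "ham": Counter()}
        self.mail_counts = {"junk": 0, "ham": 0}

    def train(self, label, tokens):
        self.mail_counts[label] += 1
        self.token_counts[label].update(tokens)

    def junk_probability(self, tokens):
        total = sum(self.mail_counts.values())
        log_score = {}
        for label in ("junk", "ham"):
            n = self.mail_counts[label]
            # Smoothed prior for the class.
            score = math.log((n + 1) / (total + 2))
            # Simplification: only tokens present in the mail are scored.
            for token in tokens:
                score += math.log((self.token_counts[label][token] + 1) / (n + 2))
            log_score[label] = score
        # Normalise in a numerically safe way.
        top = max(log_score.values())
        junk = math.exp(log_score["junk"] - top)
        ham = math.exp(log_score["ham"] - top)
        return junk / (junk + ham)


if __name__ == "__main__":
    nb = NaiveBayes()
    nb.train("junk", features("Offer 83920174", "cheap pills http://a.example http://b.example"))
    nb.train("junk", features("Win 55501928 now", "claim prize http://c.example http://d.example"))
    nb.train("ham", features("Lunch tomorrow", "shall we meet at noon"))
    nb.train("ham", features("Meeting notes", "notes from the meeting attached"))

    order = features("Your order 10293847566", "track your parcel http://x.example http://y.example")
    print("before correction:", nb.junk_probability(order))
    nb.train("ham", order)  # the user marks the order mail as "not junk"
    print("after correction: ", nb.junk_probability(order))
```

With this toy training set the order confirmation starts out looking like junk purely because it shares the two structural markers (and a few link-related words) with the spam, and the score drops once it has been marked as good mail. That matches the behaviour I saw, but it doesn't tell me which mechanism Mail really uses.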
There is probably a lot more to learn about this before I understand it properly.