Spam filtering (Quarter Life Crisis)

Spam filtering


Tim Oren wrote an interesting piece about the ideas behind spam filtering strategies, roughly explaining the difference between Bayesian systems and latent semantic analysis. Mac OS X's Mail seems to use the latter for its junk filter.

I find that after a bit of training Mail's filter works reasonably well, catching around eight junk mails per day and letting one through. The junk mails that still reach me are those which look remarkably like proper e-mails. What's still not quite clear to me is whether the filter only uses keywords or is also capable of grasping the structure of e-mails. My experience with Mail's filter suggests the latter (or Apple's programmers throwing in a couple of special cases). For example, when I started training the filter I got a couple of false positives on Amazon orders, which, like spam, contain some long number in the subject line and a couple of links in the text.
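To make the keyword question a bit more concrete, here is a minimal sketch of a purely keyword-based Bayesian filter (naive Bayes with add-one smoothing). It is not how Mail works, and it is not taken from Tim Oren's piece; the function names and the tiny training set are made up for illustration. It only shows why a filter that looks at nothing but words could plausibly trip over an order confirmation.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude tokenizer: lower-case and split on whitespace.
    return text.lower().split()


def train(messages):
    # messages: iterable of (text, is_spam) pairs.
    # Returns per-class word counts and per-class message counts.
    word_counts = {True: Counter(), False: Counter()}
    msg_counts = {True: 0, False: 0}
    for text, is_spam in messages:
        msg_counts[is_spam] += 1
        word_counts[is_spam].update(tokenize(text))
    return word_counts, msg_counts


def spam_log_odds(text, word_counts, msg_counts):
    # Positive result: the words look more like spam than like proper mail.
    vocab = set(word_counts[True]) | set(word_counts[False])
    total_spam = sum(word_counts[True].values())
    total_ham = sum(word_counts[False].values())
    # Prior odds from the message counts, with add-one smoothing.
    score = math.log((msg_counts[True] + 1) / (msg_counts[False] + 1))
    for word in tokenize(text):
        # Add-one smoothing keeps unseen words from producing log(0).
        p_spam = (word_counts[True][word] + 1) / (total_spam + len(vocab))
        p_ham = (word_counts[False][word] + 1) / (total_ham + len(vocab))
        score += math.log(p_spam / p_ham)
    return score


# Hypothetical training data, invented for this sketch.
training = [
    ("cheap pills order now", True),
    ("order 12345 click link now", True),
    ("lunch tomorrow at noon", False),
    ("meeting notes attached", False),
]
word_counts, msg_counts = train(training)

# An order confirmation shares "order", "click" and "link" with the spam.
order_mail_score = spam_log_odds("your order 98765 click link", word_counts, msg_counts)
lunch_mail_score = spam_log_odds("lunch meeting tomorrow", word_counts, msg_counts)
```

With this toy data the order confirmation ends up with a positive score and the lunch mail with a negative one. A filter of this kind knows nothing about structure, so whether Mail does more than this can't be decided from such a sketch.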

There is probably a lot more to learn about this before I understand it properly.

February 18, 2003, 18:51

Quarter Life Crisis

The world according to Sven-S. Porst
