Re: [OT] Confirmation Spam Blocking was: List 'linux-dvb' closed to public posts

From: Paul Jakma
Date: Fri Jan 23 2004 - 04:18:37 EST


On Thu, 22 Jan 2004, David Ford wrote:

> Considering that Bayesian filters are useless against the new spam
> that is proliferating these days, that's laughable. Spam now comes
> with a good 5-10K of random dictionary words.

Right, but random words produce strange word pairings. Statistical
filters should be working on phrases, not just individual words. To
such a filter, random words are just meaningless noise: phrases it
has never seen before, and so neither an indicator of goodness nor of
spamness (unless a spammer reuses a boilerplate 'random word'
section, in which case it'll be an indicator of spamness).
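
To make that concrete, here is a minimal sketch in Python. It is not
taken from any real filter: the function names, the toy training
messages, and the choice of 0.5 as the "neutral" score for an unseen
word pair are all assumptions made for illustration. Word pairs
(bigrams) stand in for "phrases".

```python
import re
from collections import Counter


def bigrams(text):
    """Return adjacent word pairs from text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))


def train(messages):
    """Count, for each bigram, how many messages contain it."""
    counts = Counter()
    for msg in messages:
        counts.update(set(bigrams(msg)))
    return counts


def token_spam_prob(token, spam_counts, ham_counts, n_spam, n_ham):
    """Spam probability for one bigram.

    Assumption: a bigram seen in neither corpus is treated as
    neutral (0.5), i.e. it carries no evidence either way.
    """
    s = spam_counts[token] / n_spam
    h = ham_counts[token] / n_ham
    if s + h == 0:
        return 0.5
    return s / (s + h)


def informative_tokens(text, spam_counts, ham_counts, n_spam, n_ham):
    """Return only the bigrams of text that are not neutral."""
    probs = {
        t: token_spam_prob(t, spam_counts, ham_counts, n_spam, n_ham)
        for t in set(bigrams(text))
    }
    return {t: p for t, p in probs.items() if p != 0.5}


# Toy corpora, invented for this example.
spam = ["buy cheap pills now", "cheap pills online buy now"]
ham = ["the patch fixes the scheduler bug", "please review the patch"]
spam_counts, ham_counts = train(spam), train(ham)

# Hypothetical "random dictionary words" padding.
padding = "walrus cabinet ostrich lantern"

# Padding alone yields no informative bigrams: every pair is unseen.
noise = informative_tokens(padding, spam_counts, ham_counts,
                           len(spam), len(ham))

# A spam message with padding keeps its spammy bigrams; the padding
# adds nothing for or against.
padded = informative_tokens("buy cheap pills " + padding,
                            spam_counts, ham_counts,
                            len(spam), len(ham))

# If the same padding is reused in the spam the filter trained on,
# its bigrams become spam indicators themselves.
reused_spam = [m + " " + padding for m in spam]
reused_counts = train(reused_spam)
reused = informative_tokens(padding, reused_counts, ham_counts,
                            len(reused_spam), len(ham))
```

In this sketch the padding only matters in the last case, where the
same 'random word' block shows up in the training spam.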

regards,
--
Paul Jakma paul@xxxxxxxx paul@xxxxxxxxx Key ID: 64A2FF6A
warning: do not ever send email to spam@xxxxxxxxxx
Fortune:
Men of lofty genius when they are doing the least work are most active.
-- Leonardo da Vinci