
My bogofilter puts good emails into spam. I would like to know how to reset all the "learning" so that I can train it again. Uninstalling bogofilter seems to uninstall the program but not the "learned data". How do I rip out the "learned data"? I then want to start with a clean slate. -thanks

my system:
ubuntu 14.04
evolution 3.10.4 (which I understand includes bogofilter)
classic gnome
Rinzwind
user2712329

2 Answers


I found this on their website (it seems that you can dump the word list into a text file, edit that file, and then load the filtered file again):

How can I delete all the spam (or non-spam) tokens?

Bogoutil lets you dump a wordlist and load the tokens into a new wordlist. With the added use of awk and grep, counts can be zeroed and tokens with zero counts for both spam and non-spam can be deleted. The following commands will delete the tokens from spam messages:

bogoutil -d wordlist.db | \
awk '{print $1 " " $2 " 0"}' | grep -v " 0 0" | \
bogoutil -l wordlist.new.db

The following commands will delete the tokens from non-spam messages:

bogoutil -d wordlist.db | \
awk '{print $1 " 0 " $3}' | grep -v " 0 0" | \
bogoutil -l wordlist.new.db
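To see what the awk and grep stage of those pipelines does without touching a real wordlist, here is a small sketch that runs it on a few made-up lines. The sample tokens and counts are hypothetical, and the function names `sample_dump` and `zero_third_column` are mine, not part of bogofilter. It only assumes the shape the commands above rely on: a token followed by two count columns. Which column holds the spam count and which the non-spam count is worth confirming against a dump of your own wordlist before loading anything back.

```shell
# Hypothetical sample lines in the shape the pipeline expects:
# a token followed by two count columns.
sample_dump() {
  printf '%s\n' \
    'viagra 12 0' \
    'meeting 0 9' \
    'invoice 3 4'
}

# Same awk/grep stage as the first command above: set the third
# column to zero, then drop tokens whose two counts are both zero.
zero_third_column() {
  awk '{print $1 " " $2 " 0"}' | grep -v " 0 0"
}

sample_dump | zero_third_column
```

The "meeting" line disappears because both of its counts end up zero, while the other two tokens survive with only their second column intact.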

Regarding

Uninstalling bogofilter seem to uninstall the program but not the "learned data".

You probably need to "purge" the application. In general, user-created files and settings are not removed when deleting software. See What is the correct way to completely remove an application? for a bit of information on that.
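Keep in mind that a purge only removes system-wide configuration; files under your home directory stay where they are. For a true clean slate, one option is to move the learned data aside yourself. The sketch below assumes the data sits in the default per-user directory ~/.bogofilter (check your own setup first), and the function name `reset_bogofilter_data` is just something I made up for illustration. It renames the directory rather than deleting it, so you can put it back if needed.

```shell
# Assumption: the learned data lives in ~/.bogofilter.
# Pass a different directory as the first argument if yours differs.
reset_bogofilter_data() {
  dir="${1:-$HOME/.bogofilter}"
  if [ -d "$dir" ]; then
    # Keep a timestamped backup instead of deleting anything.
    backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
    mv "$dir" "$backup" && echo "moved $dir to $backup"
  else
    echo "nothing to reset: $dir not found"
  fi
}

# Close Evolution first, then run:
# reset_bogofilter_data
```

After that, the filter has nothing learned and you would need to train it again by marking mail as junk or not junk.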

Rinzwind

Partial joy. I have done what you suggested. The ham that is classified as spam is actually email from my employer, so you can imagine it is a pressing issue. After doing what you said, on the first day it miscategorized 3 emails, but not all from my employer. Over the last week, it miscategorized about one third of the emails from my employer as spam, so I marked all those emails as ham. But it is still making mistakes. Maybe I will give it another week.

Are we really sure that the "learned" information is only the word db in .bogofilter? Is there maybe some other info it learned? (Grasping at straws.) -thanks

user2712329