About Bayesian filter
Internet Exchange Messaging Server now provides an interface that lets the system administrator hook a "Bayesian filter" into the Local Mail Delivery Module (LMDA). A Bayesian filter uses a statistical process to identify spam mail messages. For details about the operating principle of Bayesian filtering, please consult the article "A Plan for Spam" by Paul Graham.
This version of Internet Exchange Messaging Server bundles bogofilter version 0.11.2. Details about bogofilter can be found here.
Configure Bayesian filter for LMDA
Before the LMDA module can use the Bayesian filtering feature, you need to enable it. Check the Enable check box and select either a DLL or an EXE type Bayesian filter. A DLL type Bayesian filter is a software library that is loaded into the memory space of the LMDA module. An EXE type Bayesian filter, on the other hand, is a standalone binary executable that the LMDA module invokes for each mail message it processes. In general, a DLL type Bayesian filter runs faster than a standalone EXE type filter.
Configure DLL type Bayesian filter
When you install Internet Exchange Messaging Server, the library version of bogofilter is installed automatically. This library is installed in the following location:
The library should expose a function that the LMDA module can call. The function name provided by the libbogo library is "iems_bogofilter".
Configure EXE type Bayesian filter
If you are going to use an EXE type Bayesian filter, you need to provide the filter's command line arguments and the return codes that the filter uses to indicate the filtering result. The following directives are supported in the filter command line:
/var/spool/iems/msgstore/john@company.com/.bayesian
| /usr/local/bin/myfilter -d %B
Expanded to:
| /usr/local/bin/myfilter -d /var/spool/iems/msgstore/john@company.com/.bayesian
/usr/local/bin/myfilter -d %B -I %M
Expanded to:
/usr/local/bin/myfilter -d /var/spool/iems/msgstore/john@company.com/.bayesian -I /var/spool/iems/mqueue/01/1.msg
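The expansion shown in these examples can be sketched in a few lines of Python. This is only an illustration, not the LMDA module's actual code: the function name expand_directives is hypothetical, and the meaning of %B (the user's Bayesian data directory) and %M (the queued message file) is inferred from the examples above.

```python
import re


def expand_directives(command: str, bayesian_dir: str, message_file: str) -> str:
    """Replace %B and %M in a filter command line with concrete paths.

    Assumption: %B stands for the user's Bayesian data directory and %M
    for the queued message file, as the examples in the text suggest.
    """
    values = {"%B": bayesian_dir, "%M": message_file}
    # A single pass avoids re-expanding text that came from a substituted path.
    return re.sub(r"%[BM]", lambda match: values[match.group(0)], command)


cmd = expand_directives(
    "/usr/local/bin/myfilter -d %B -I %M",
    "/var/spool/iems/msgstore/john@company.com/.bayesian",
    "/var/spool/iems/mqueue/01/1.msg",
)
print(cmd)
```

Any other directive, or any text that is not a directive, is passed through unchanged in this sketch.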
Besides the filter command line, you need to define the return codes that your Bayesian filter uses to indicate different conditions. Return codes are integers ranging from 0 to 255. If your filter can return more than one value for the same condition, separate the values with a COMMA. The 4 conditions are:
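As a rough illustration of how comma-separated return code lists could be interpreted, consider the Python sketch below. Everything in it is an assumption made for the example: the condition labels ("condition_1" to "condition_4") are placeholders rather than the product's real condition names, the code values are arbitrary, and the helper names parse_codes and classify are hypothetical.

```python
from typing import Dict, Optional, Set


def parse_codes(spec: str) -> Set[int]:
    """Parse a comma-separated list of return codes such as "2, 3"."""
    codes = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma or an empty field
        value = int(part)
        if not 0 <= value <= 255:
            raise ValueError(f"return code out of range 0-255: {value}")
        codes.add(value)
    return codes


def classify(return_code: int, table: Dict[str, str]) -> Optional[str]:
    """Return the first condition whose code list contains return_code."""
    for condition, spec in table.items():
        if return_code in parse_codes(spec):
            return condition
    return None  # the code is not mapped to any condition


# Placeholder configuration: labels and values are illustrative only.
EXAMPLE_TABLE = {
    "condition_1": "0",
    "condition_2": "1",
    "condition_3": "2,3",
    "condition_4": "4",
}

print(classify(3, EXAMPLE_TABLE))
```

A return code that appears in none of the lists yields None here, so the caller can decide how to treat an unexpected result.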
About Bayesian filter learning engine
Bayesian filtering is a statistical process that requires training before it can detect spam messages accurately. Internet Exchange Messaging Server provides a program named "bayesianlearn" for this purpose. The bayesianlearn program is a command line utility that looks at the Spam and Good messages under a predefined mailbox folder of each MessageStore user. Each message is submitted to the underlying Bayesian filter training engine. Your Bayesian filter training engine must support the following features:
Configure Bayesian filter learning engine
You need to configure the command line arguments of your Bayesian filter training engine. The following directives are provided:
/var/spool/iems/msgstore/john@company.com/.bayesian
There is a locking mechanism between the LMDA module and the bayesianlearn program. When LMDA fails to acquire the lock, it keeps retrying until it reaches the TIMEOUT (default is 15 minutes). When LMDA reaches the timeout, it sends a notification to the system postmaster account and terminates. If you receive such a notification, check whether a problem is preventing the bayesianlearn program from releasing the lock. You may need to terminate the bayesianlearn program manually and remove the "bayesain.lock" file under each MessageStore user's HOME directory. Restart LMDA afterward.
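The retry-until-timeout behaviour described above can be sketched as follows. This is a generic lock file pattern written in Python, not the product's implementation: the text does not say how LMDA and bayesianlearn actually lock, so the exclusive-create approach, the function names and the retry interval are all assumptions.

```python
import os
import time


def acquire_lock(lock_path: str, timeout_seconds: float = 15 * 60,
                 retry_interval: float = 1.0) -> bool:
    """Try to create lock_path exclusively, retrying until the timeout.

    Returns True when the lock was taken and False when the timeout was
    reached. The 15 minute default mirrors the TIMEOUT in the text; the
    retry interval is an assumption.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        try:
            # O_EXCL makes creation fail if another process holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(retry_interval)
        else:
            os.close(fd)
            return True


def release_lock(lock_path: str) -> None:
    """Remove the lock file; a missing file is not treated as an error."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass
```

In a pattern like this, a process that exits without calling release_lock leaves a stale lock file behind, which is why removing the lock file by hand can become necessary.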