Levent Özgür, 2003

Thesis Title

Adaptive Anti-Spam Filtering Based on Turkish Morphological Analysis, Artificial Neural Networks and Bayes Filtering


We propose an anti-spam filtering algorithm for Turkish based on Artificial Neural Networks (ANN) and Bayes filtering. The final product is an anti-spam filtering program that works with Outlook; because it is user-specific, it adapts itself to the characteristics of each user's incoming e-mails. The algorithm has two parts: the first handles the morphology of Turkish words, and the second classifies e-mails using the word roots extracted by the morphology part. The input vectors to the ANN are constructed under two models: a binary model and a probabilistic model. Two ANN structures are employed in this study: single layer perceptron (SLP) and multilayer perceptron (MLP). The Bayes Filter is also implemented with three different approaches: Binary Model, Probabilistic Model, and Advance Probabilistic Model. Spam detection performance of the proposed system is improved by including non-Turkish words. A total of approximately 750 e-mails (410 spam and 340 normal) are used in the experiments. A success rate of over 90% is achieved.
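
As a rough illustration of the second (classification) part, the sketch below builds two kinds of input vectors from already-extracted word roots and trains a small Bayes classifier on them. It rests on several assumptions that the abstract does not confirm: the binary model is taken to mean root presence or absence, the probabilistic model is taken to mean relative root frequency within a mail, and the Bayes filter is approximated by a Bernoulli naive Bayes with Laplace smoothing. All function names and the toy data are hypothetical, and the morphological analysis is assumed to have been done beforehand.

```python
import math
from collections import Counter


def binary_vector(roots, vocabulary):
    """Assumed binary model: 1 if the root occurs in the mail, else 0."""
    present = set(roots)
    return [1 if word in present else 0 for word in vocabulary]


def frequency_vector(roots, vocabulary):
    """Assumed probabilistic model: relative frequency of each root in the mail."""
    counts = Counter(roots)
    total = len(roots) or 1  # avoid division by zero for an empty mail
    return [counts[word] / total for word in vocabulary]


def train_binary_bayes(mails, labels, vocabulary):
    """Estimate class priors and per-root presence probabilities.

    Laplace (add-one) smoothing keeps every probability strictly
    between 0 and 1, so the logarithms below are always defined.
    """
    model = {}
    for cls in set(labels):
        docs = [set(m) for m, lab in zip(mails, labels) if lab == cls]
        n = len(docs)
        model[cls] = {
            "log_prior": math.log(n / len(mails)),
            "p_word": {
                w: (sum(w in d for d in docs) + 1) / (n + 2) for w in vocabulary
            },
        }
    return model


def classify(roots, model, vocabulary):
    """Return the class with the highest log posterior under the binary model."""
    present = set(roots)

    def score(cls):
        params = model[cls]
        total = params["log_prior"]
        for w in vocabulary:
            p = params["p_word"][w]
            total += math.log(p if w in present else 1.0 - p)
        return total

    return max(model, key=score)


# Toy data (hypothetical): each mail is a list of word roots.
mails = [
    ["kazan", "para", "bedava"],
    ["toplantı", "rapor"],
    ["bedava", "para"],
    ["rapor", "proje", "toplantı"],
]
labels = ["spam", "normal", "spam", "normal"]
vocabulary = sorted({root for mail in mails for root in mail})
model = train_binary_bayes(mails, labels, vocabulary)
```

With this toy model, `classify(["bedava", "para"], model, vocabulary)` selects the spam class. The same vectors produced by `binary_vector` or `frequency_vector` could equally serve as inputs to an SLP or MLP, although the network configuration used in the thesis is not described in the abstract.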
