ARCHIVED: What should I consider when making spam filters using the Shakespeare or Jewel filter utility?

This content has been archived, and is no longer maintained by Indiana University. Information here may no longer be accurate, and links may no longer be available or reliable.

Note: UITS is replacing the Shakespeare and Jewel systems with a new email environment called Cyrus mail. For information about the new system, see the Knowledge Base document ARCHIVED: What is Cyrus mail? On the new system, you can access your mail via IU Webmail or a desktop client, but not via Pine. For more information about your email options, see the Knowledge Base document ARCHIVED: If I used Pine on the Shakespeare or Jewel systems, what do I need to know about the recent email upgrade?

Note: For the basics of creating filters in the Shakespeare or Jewel systems at Indiana University, see the Knowledge Base document ARCHIVED: At IU, on the Shakespeare and Jewel systems, how do I set up server-side mail filters?

The key to making a good filter is to look at the pattern of spam you are getting. Consider who these messages come from, and look for repeating themes within them. Look at domains (e.g., ispam.com), usernames (e.g., viagraman@), and the names associated with the mail address (e.g., "Home Loans"). Before filtering on a domain name, make sure there is no reasonable possibility that you will get legitimate mail from that domain; for example, filtering aol.com would not be wise if you have friends with email accounts there. Apply the same care to subject lines: look for buzz phrases that are typical of spam (e.g., Interest Rates, Look Hot!), but think about which legitimate words and phrases could be filtered out accidentally if you make your subject filters too general.
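
One way to spot such patterns is to tally the senders of the messages you have already identified as spam. The following is a minimal Python sketch, not part of the Shakespeare or Jewel filter utility; the function name tally_senders and the sample addresses are made up for illustration, and it assumes you have copied the From lines of your spam into a list.

```python
from collections import Counter
from email.utils import parseaddr

# Hypothetical sample data: From lines copied out of messages you consider spam.
SAMPLE_FROM_HEADERS = [
    '"Home Loans" <viagraman@ispam.com>',
    "viagraman@other-spam.example",
    '"Home Loans" <deals@ispam.com>',
    "Friend <pal@aol.com>",
]

def tally_senders(from_headers):
    """Count how often each domain and each username appears."""
    domains, usernames = Counter(), Counter()
    for header in from_headers:
        # parseaddr splits 'Name <user@host>' into ('Name', 'user@host').
        _display_name, address = parseaddr(header)
        username, _, domain = address.rpartition("@")
        if username and domain:
            usernames[username.lower()] += 1
            domains[domain.lower()] += 1
    return domains, usernames

domains, usernames = tally_senders(SAMPLE_FROM_HEADERS)
print("Most common domains:  ", domains.most_common(3))
print("Most common usernames:", usernames.most_common(3))
```

A domain or username that recurs often is a candidate for a filter, provided you do not expect legitimate mail from it.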

The following are some special characters that can be used to make your filters more powerful and effective:

.          Any character except a newline
a*         Any sequence of zero or more a
a+         Any sequence of one or more a
a?         Either zero or one a
[^-a-d]    Any character which is not either a dash (-), a, b, c, d or newline
de|abc     Either the sequence de or abc
(abc)*     Zero or more times the sequence abc
\.         Matches a single period ( . ); use \ to quote any of the special characters to get rid of their special meaning
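
To get a feel for these characters before writing a filter, you can try them out in any regular expression tool. The sketch below uses Python's re module as a stand-in; this is an assumption, since the filter utility's own matching engine may differ in details (for example, in how it treats newlines or letter case). The sample strings and the helper full_match are illustrative only.

```python
import re

# Each entry: (pattern, text the whole pattern should match, text it should not).
# Python's re module is a stand-in here; the filter utility may behave differently.
EXAMPLES = [
    (r".", "x", "\n"),              # any character except a newline
    (r"a*", "aaa", "b"),            # zero or more a
    (r"a+", "a", ""),               # one or more a
    (r"a?", "", "aa"),              # zero or one a
    (r"[^-a-d]", "z", "-"),         # not a dash, a, b, c, or d
    (r"de|abc", "abc", "ab"),       # either de or abc
    (r"(abc)*", "abcabc", "abca"),  # zero or more repetitions of abc
    (r"\.", ".", "x"),              # a literal period
]

def full_match(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

for pattern, good, bad in EXAMPLES:
    print(f"{pattern!r}: {good!r} -> {full_match(pattern, good)}, "
          f"{bad!r} -> {full_match(pattern, bad)}")
```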

To put these concepts into practice, some sample recipes are included below.

Note: In these examples, the destination folder is one called Junk. Create this folder in your mail program before using these filters. Alternatively, instead of using a junk mail folder, you can simply delete the spam by using the destination /dev/null.

If you do choose to use a junk mail folder, be aware that you should empty it frequently. If too much junk mail piles up in your account, it could put you over quota, which will prevent you from receiving any mail at all. For more information about quotas, see the Knowledge Base documents Default storage space allotments for UITS accounts and ARCHIVED: About your Cyrus mail storage space.

  • If you are getting spam from the same username, but different domains (a typical practice for spammers), your filter might look like this:
    [vicky spam]
    FROM=.*Vicky@
    CONTINUE=no
    ACTION=file
    DESTINATION=Junk
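    The .* before Vicky allows any text (such as a display name) to precede the username, and the @ character after it ties the match to the username part of the address. This ensures that the filter weeds out only the spammer whose username is (or ends in) Vicky, and not everyone who is simply named Vicky.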
  • If you are sure that the spammer uses only certain domains (e.g., spammail.com, mailspam.com, and spam-mail.com), then you might construct your filter to look like this:
    [spammail]
    FROM=.*spammail\.com|.*mailspam\.com|.*spam-mail\.com
    CONTINUE=no
    ACTION=file
    DESTINATION=Junk
  • If you tend to get spam with the word FREE in the subject line, you may want to filter using the following:
    [Get rid of the FREE email messages]
    SUBJECT=.*FREE
    CONTINUE=no
    ACTION=file
    DESTINATION=Junk
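
Before relying on recipes like these, it can help to check their patterns against a few sample headers. The following Python sketch is one way to do that, under stated assumptions: it uses Python's re.match, which anchors the pattern at the start of the header and is case-sensitive, and the real filter utility may match differently. The RECIPES table, the function matching_recipes, and the sample headers are hypothetical and exist only for this illustration.

```python
import re

# Patterns copied from the sample recipes above, keyed by a short label.
# Each value is (header field the recipe inspects, pattern).
RECIPES = {
    "vicky spam": ("from", r".*Vicky@"),
    "spammail": ("from", r".*spammail\.com|.*mailspam\.com|.*spam-mail\.com"),
    "free": ("subject", r".*FREE"),
}

def matching_recipes(headers):
    """Return the labels of all recipes whose pattern matches the headers.

    Assumption: the pattern is applied from the start of the header value,
    which is what re.match does; the filter utility may behave differently.
    """
    hits = []
    for name, (field, pattern) in RECIPES.items():
        if re.match(pattern, headers.get(field, "")):
            hits.append(name)
    return hits

# Hypothetical sample messages, reduced to the two headers the recipes use.
SAMPLES = [
    {"from": '"Hot Deals" <Vicky@ispam.com>', "subject": "Hello"},
    {"from": "Vicky Smith <vsmith@indiana.edu>", "subject": "Lunch on Friday?"},
    {"from": "offers@spam-mail.com", "subject": "FREE vacation"},
    {"from": "friend@example.org", "subject": "FREEDOM of speech talk"},
]

for message in SAMPLES:
    print(message["from"], "|", message["subject"], "->", matching_recipes(message))
```

In this sketch, the second message (from someone named Vicky whose username is vsmith) passes through untouched, while the last one shows how a subject filter as general as .*FREE also catches a legitimate subject containing FREEDOM.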

By watching your patterns of spam and using the special characters above, you can create a powerful filtering option to deal with most spam.

This is document alhh in the Knowledge Base.
Last modified on 2018-01-18 13:42:08.