PEP Documentation
Headers
Every e-mail message contains a set of "headers". A header has a name
(like "From", "To", or "Subject") and a value.
This header is named From and its value is bob@hotmail.com:
| From: bob@hotmail.com
|
This header is named Subject and its value is Have you read any good books lately?:
| Subject: Have you read any good books lately?
|
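As a rough illustration of the name/value idea (this is not how PEP itself is implemented), Python's standard email module can split a raw message into header names and values. The sample message below is made up for the example.

```python
from email import message_from_string

# A made-up message: headers, then a blank line, then the body.
raw = (
    "From: bob@hotmail.com\n"
    "Subject: Have you read any good books lately?\n"
    "\n"
    "Just wondering.\n"
)

msg = message_from_string(raw)

# Collect each header as a (lower-cased name -> value) pair.
headers = {name.lower(): value for name, value in msg.items()}

print(headers["from"])     # bob@hotmail.com
print(headers["subject"])  # Have you read any good books lately?
```

Header names are not case sensitive, which is why the sketch lower-cases them before storing.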
PEP rules work by testing header values to see if they
match values you specify. PEP can test any header value that appears in
an e-mail message. There are also several values that you can test that
aren't actual message headers, but they can be treated like headers anyway.
It would be impossible to list every possible e-mail header you may encounter,
but here is a list of the more useful ones:
- FROM
- This header is commonly misunderstood. Most people think that it contains
the name and/or e-mail address of the person who sent the message. While this
is usually the case, it doesn't have to be.
It is possible to send an e-mail message with just about anything you want
in the FROM: header. In fact, some spammers will even put the recipient's
e-mail address in here to confuse them.
Example:
| From: spammer@hotmail.com (Joe Spammer)
|
- TO
- Just like the FROM header, this header can actually contain just about
anything. If you are the only recipient of the message, then it probably
contains your e-mail address. Messages to a list of people may or may not
include all the addresses here though.
- SUBJECT
- This header contains a brief title or description of the message. This is
a good place to look for certain spam key words or phrases.
Examples:
Subject: ADV: Cable Descrambler
Subject: Make Money Fast!!
|
Special PEP Values
There are several special values that PEP can use that aren't actual message
headers, but you can test them as if they were:
- ORIGIN
- This value is a shortcut that is the same as typing from, message-id, reply-to, senderaddress, return-path, x-sender, ip, apparently-from.
- DESTINATION
- This value is a shortcut that is the same as typing to,cc,bcc,envelope-to,apparently-to.
- TOP
- This value refers to the first four kilobytes of the message body. It is not possible to test the entire
message body if it is over 8K in size.
- BOTTOM
- This value refers to the last four kilobytes of the message body. It is not possible to test the entire
message body if it is over 8K in size.
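The 8K limit follows from the two four-kilobyte windows: TOP and BOTTOM together can see at most 8K of the body. The sketch below illustrates the arithmetic only. It assumes a kilobyte is 1024 bytes and that the windows are measured in bytes, neither of which is stated above, and the function names are hypothetical.

```python
KB = 1024          # assumption: 1 kilobyte = 1024 bytes
WINDOW = 4 * KB    # size of TOP and of BOTTOM, per the descriptions above

def top(body: bytes) -> bytes:
    """First four kilobytes of the body (the whole body if it is shorter)."""
    return body[:WINDOW]

def bottom(body: bytes) -> bytes:
    """Last four kilobytes of the body (the whole body if it is shorter)."""
    return body[-WINDOW:]

def untested_bytes(body: bytes) -> int:
    """Bytes in the middle that neither TOP nor BOTTOM can see."""
    return max(0, len(body) - 2 * WINDOW)

small = b"x" * (6 * KB)    # the two windows overlap, so everything is covered
large = b"x" * (10 * KB)   # 2 KB in the middle are out of reach

print(untested_bytes(small))  # 0
print(untested_bytes(large))  # 2048
```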
- PLAINTEXT
- If it is a MIME encoded message and it contains a text/plain or text/html section, then this
value will contain the first four kilobytes of this section, stripped of any HTML. If no text
section can be found then this is the same as the TOP value. This is useful for logging and
paging.
- SENDERADDRESS
- This contains the sender's address as provided to the SMTP server via
the MAIL FROM: command. Also known as the "Envelope From" value. This will
usually match the value in Return-Path: (minus any surrounding angle brackets)
but not always.
- SENDERLOCAL
- This contains the local part of SENDERADDRESS (the part to the left of the
@ sign).
- SENDERDOMAIN
- This contains the domain part of SENDERADDRESS (the part to the right of
the @ sign).
- FROMADDRESS
- Often the From: header contains more than just an e-mail address. It
might include the sender's name, company, or other text that isn't part of
the e-mail address. This header value contains only the e-mail address
portion, if any.
So if the From: header contains the value "Bob Smith <bob@aol.com>", the
FROMADDRESS will be "bob@aol.com".
- RETURNADDRESS
- The Return-path: header usually contains the sender's address surrounded
by angle brackets. RETURNADDRESS contains only the e-mail address
portion, if any.
So if the Return-path: header contains the value "<bob@aol.com>", the
RETURNADDRESS will be "bob@aol.com".
- REPLYADDRESS
- Occasionally the Reply-to: header contains more than just an e-mail address.
It might include the sender's name, company, or other text that isn't part of
the e-mail address. This header value contains only the e-mail address
portion, if any.
So if the Reply-to: header contains the value "Bob Smith <bob@aol.com>",
the REPLYADDRESS will be "bob@aol.com".
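The address-only values above (FROMADDRESS, RETURNADDRESS, REPLYADDRESS) and the local/domain split used by SENDERLOCAL and SENDERDOMAIN can be approximated in Python as shown here. This is a sketch using the standard library, not PEP's own parser, and the function names are made up for the example.

```python
from email.utils import parseaddr

def address_only(header_value):
    """Return just the e-mail address from a header value, or '' if none."""
    _name, addr = parseaddr(header_value)
    return addr

def split_address(addr):
    """Split an address into (local part, domain part) at the last @ sign."""
    local, _sep, domain = addr.rpartition("@")
    return local, domain

print(address_only("Bob Smith <bob@aol.com>"))  # bob@aol.com
print(address_only("<bob@aol.com>"))            # bob@aol.com
print(split_address("bob@aol.com"))             # ('bob', 'aol.com')
```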
- #header
- If you place a hash mark before a header name you'll get a numeric
value that tells you how many occurrences of the header there are in the
message. For example, #from will usually have a value of 1
because there's normally a single From: header. #received will
usually be more than one.
- header(n)
- If you test a header value and there happens to
be more than one instance of that header in the message, only the first
one is tested. For example, using received will test only the
first Received: header. If you want to test the second one you'd use
received(2) instead.
To refer to the last instance of a header, use a hash mark instead
of a number. So received(#) refers to the very last Received:
header. You can also follow that with a negative number to indicate the
Nth from the last header: received(#-1) would be the second to last
Received: header, for example.
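The counting and indexing forms described for #header and header(n) map onto ordinary list operations. The Python sketch below mimics them with the standard email module. The helper names and the example.* host names are invented for illustration and are not part of PEP.

```python
from email import message_from_string

raw = (
    "Received: from c.example by d.example\n"
    "Received: from b.example by c.example\n"
    "Received: from a.example by b.example\n"
    "From: bob@hotmail.com\n"
    "\n"
    "Body\n"
)
msg = message_from_string(raw)

def count(msg, name):
    """Like #name: how many times the header occurs."""
    return len(msg.get_all(name, []))

def nth(msg, name, n):
    """Like name(n): the n-th occurrence, counting from 1."""
    return msg.get_all(name, [])[n - 1]

def nth_from_last(msg, name, back=0):
    """Like name(#) when back is 0, or name(#-back) otherwise."""
    return msg.get_all(name, [])[-1 - back]

print(count(msg, "received"))             # 3
print(nth(msg, "received", 2))            # from b.example by c.example
print(nth_from_last(msg, "received"))     # from a.example by b.example
print(nth_from_last(msg, "received", 1))  # from b.example by c.example
```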
- SCORE
- PEP maintains an internal numeric score that starts out at zero. You can
use the SCORE action to add or subtract from this
value. The idea is to score a message based on a variety of tests and then
if the score is high enough, delete it.
- MAILBOX
- This is a numeric value that indicates the current size of your mailbox
in bytes, before delivering the current message.
- MAILBOXNEW
- This is a numeric value that indicates how large your mailbox would be
if the current message gets delivered to it.
- %
- This value represents a random number from 1 to 100. It is different
each time PEP handles a new message.
This value was created specifically for our tech support department where we
needed to send 33% of the incoming support mail to one staff member, 33%
to another person, and the rest to a third person.
Example:
forward if % < 34 to tech1@yourdomain.com
forward if % < 67 to tech2@yourdomain.com
forward if * matches * to tech3@yourdomain.com
|
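Because % ranges from 1 to 100, a test of the form "% < N" matches N - 1 of the 100 possible values. The Python sketch below enumerates every value to show how a given pair of thresholds divides the mail. It assumes that a message stops at the first forward rule that matches, which the example implies but does not state.

```python
def route(percent, first_limit, second_limit):
    """Pick a recipient the way three cascading forward rules would,
    assuming the first rule that matches wins."""
    if percent < first_limit:
        return "tech1"
    if percent < second_limit:
        return "tech2"
    return "tech3"

def shares(first_limit, second_limit):
    """Count how many of the 100 possible % values reach each recipient."""
    counts = {"tech1": 0, "tech2": 0, "tech3": 0}
    for percent in range(1, 101):  # % runs from 1 to 100 inclusive
        counts[route(percent, first_limit, second_limit)] += 1
    return counts

print(shares(33, 66))  # {'tech1': 32, 'tech2': 33, 'tech3': 35}
print(shares(34, 67))  # {'tech1': 33, 'tech2': 33, 'tech3': 34}
```

Thresholds of 34 and 67 give the closest match to a 33/33/34 split.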
- IP
- This value contains the IP address of the last machine to handle the
message prior to reaching the local server. This is often, but not always,
the IP address of the remote mail server that relayed the message to
our server.
- HOSTNAME
- This value contains the host name that you get if you do a reverse lookup
on the IP value above. Note that this value is only available if you've
previously used the "resolve" command.
- LINES
- This value is numeric and indicates the number of lines there are in
the message.
- BYTES
- This value is numeric and indicates how large the message is in bytes.
- TOCOUNT
- This value is numeric and indicates how many addresses there are in the
To: header.
- CCCOUNT
- This value is numeric and indicates how many addresses there are in the
Cc: header.
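Counting addresses as TOCOUNT and CCCOUNT do can be sketched with Python's standard library. How PEP itself counts malformed or duplicate addresses is not documented here, so treat this only as an approximation. The addresses are made up.

```python
from email.utils import getaddresses

def address_count(header_values):
    """Number of addresses found across one or more header values."""
    pairs = getaddresses(header_values)
    return len([addr for _name, addr in pairs if addr])

to_header = "Bob Smith <bob@aol.com>, alice@example.com, carol@example.org"
print(address_count([to_header]))  # 3
```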
- CHALLENGEID
- This is a unique value that is meant to be used exclusively in
reply files that are sent via the
challenge action.
- PEP_ID
- This is a value that is guaranteed to be unique for every message ever
processed by PEP. It is meant for use in conjunction with the
exec action.
- RXn
- If you don't know what a regular expression is then don't worry about these
values. This is an advanced topic.
RX values refer to substrings that were matched with the last regex
test. RX0 refers to the entire matching string, RX1 refers to the first
substring, RX2 refers to the second, and so on.
So given a Subject: line of
[Llamas] Any good llama jokes?
and the rule
reply if subject regex "\\[(.+)\\]"
RX0 would contain "[Llamas]" and RX1 would contain "Llamas".
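The same capture behaviour can be reproduced with Python's re module, where group 0 plays the role of RX0 and group 1 the role of RX1. Note that Python needs only one backslash per bracket; the doubled backslashes in the rule above are assumed to be part of PEP's own quoting.

```python
import re

subject = "[Llamas] Any good llama jokes?"

# One parenthesized group, surrounded by literal square brackets.
match = re.search(r"\[(.+)\]", subject)

rx0 = match.group(0)  # the entire matching string
rx1 = match.group(1)  # the first parenthesized substring

print(rx0)  # [Llamas]
print(rx1)  # Llamas
```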
- SASCORE
- This used to trigger a SpamAssassin scan, but now it's just an alias
for the "X-Spam-Score:" header that is added automatically by our mail server.
- BFSCORE
- - IN TESTING -
- RAZOR
- Vipul's Razor is a shared
catalogue of known spam. When you test this value, PEP connects to the
Razor database and returns either "yes" or "no" to indicate whether
the message is listed.
- DCCBODY, DCCFUZ1, and DCCFUZ2
- - IN TESTING -
- NUMPARTS
- If the message in question consists of one or more MIME attachments,
this value will tell you how many there are.
- ATTACHMENT
- When a message contains one or more attachments, each one that has a
filename attribute will be assigned to a separate "attachment" value. If
there are 5 attachments with filenames, then there will be 5 "attachment"
values. You would normally test these by using a wildcard.
Example:
|
delete if attachment* matches "*.exe"
|
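To show what a wildcard test over attachment filenames amounts to, here is a Python sketch that builds a small message with two attachments and checks their names against a pattern. The message contents and helper names are invented, and the case-insensitive comparison is an assumption rather than documented PEP behavior.

```python
from email.message import EmailMessage
from fnmatch import fnmatchcase

# Build a sample message with two named attachments.
msg = EmailMessage()
msg["Subject"] = "Invoice"
msg.set_content("See attached.")
msg.add_attachment(b"%PDF-", maintype="application", subtype="pdf",
                   filename="invoice.pdf")
msg.add_attachment(b"MZ", maintype="application", subtype="octet-stream",
                   filename="Setup.EXE")

def attachment_names(msg):
    """Filenames of all attachments that declare one."""
    names = []
    for part in msg.iter_attachments():
        name = part.get_filename()
        if name:
            names.append(name)
    return names

def any_attachment_matches(msg, pattern):
    """True if any attachment filename matches the wildcard pattern.
    Case-insensitive matching is an assumption for this sketch."""
    return any(fnmatchcase(name.lower(), pattern.lower())
               for name in attachment_names(msg))

print(attachment_names(msg))                 # ['invoice.pdf', 'Setup.EXE']
print(any_attachment_matches(msg, "*.exe"))  # True
```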
- USERNAME
- This value contains the username of the account that is currently
accessing your mailrule file. Normally it will be your username, but if
you've allowed others to include your mailrule file then it will be set
to their username when PEP is processing mail for them. You can use it to
implement different rule sets depending on who's using your mailrule file.
- CALLBACK
- This is a special value that causes PEP to perform a "callback", and
report the result as either "OK" or "BAD". A callback is when PEP connects
back to the mail server(s) for the sender's email address and goes through
the motions of sending a bounce message, without actually sending it. If the
sender's address is phony, most servers will let us know about it.
So if "callback" is "bad", then you know the mail message in question is
bogus because it comes from a nonexistent address (or one that's been
closed down by the ISP, etc.). In the event that PEP is unable to connect
to the sender's mail servers, or there is some other kind of error, the
default is to assume that it's OK.
This is a very effective way to eliminate a lot of spam with no worries
about false positives, since a message sent from an invalid return address
is itself considered invalid.
Example:
|
delete if callback is bad
|
Modifiers
Modifiers can be used to perform a variety of functions on a header value prior to testing it. Examples include converting a header to upper or lower case, stripping out punctuation marks or HTML tags, etc.
For example, to test a version of the Subject header that has had all punctuation symbols removed, use "pstrip:subject" instead of just "subject". To test a version of the message body that has all HTML tags removed, use "htstrip:top" instead of just "top".
Note: modifiers can only be applied to individual headers; they do not work with wildcards. They also modify only a temporary copy of the header and do not change the original message in any way.
A list of the available modifiers follows:
- lower
- This modifier converts the value to lower case letters.
- upper
- This modifier converts the value to upper case letters.
- d3l33t
- It is not uncommon for spammers to use "l33t sp33k" (elite speak) to try to get past filters. This involves replacing certain letters with punctuation marks that resemble the original letter. For example, they might spell "Viagra" as "\/|@gr@". The d3l33t (de-leet) modifier converts the most common punctuation marks back into the most likely letters. So "\/|@gr@" would be converted back into "viagra", which could then be caught in a general test for that keyword.
- pstrip
- Another common tactic used by spammers is to spell words with lots of extra punctuation marks in between the letters. So "viagra" might become "V.I.A.G.R.A" instead. This modifier strips out all punctuation marks (anything that is not a letter from A to Z) and squeezes all the remaining characters together without spaces.
- csdecode
- Another common technique to avoid filters is to encode the subject line (or other values) in an alternate character set. The value will look normal when viewed with most mail programs, but the actual value in the message is unreadable to a human. This modifier converts a value that is encoded in this manner back into plain text.
- htstrip
- Yet another technique to bypass filters is to insert HTML "comments" into the middle of words in the body of a spam message. This modifier will strip out anything between sets of angle brackets, effectively removing any HTML code.
- length
- This modifier replaces the value with the length of the value instead. So "length:subject" when the Subject is "Hi!" will be 3.
- ulratio
- This modifier replaces the value with a number between 0 and 100 that is the ratio of upper to lower case letters. The more upper case letters the value contains, the higher the number.
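To make the modifiers more concrete, here is a Python sketch of roughly what each one does to a value. It is an approximation built from the descriptions above, not PEP's actual code. In particular, the d3l33t table contains only the three substitutions implied by the "\/|@gr@" example, and the ulratio formula (upper case letters as a percentage of all letters) is an assumption.

```python
import re
from email.header import decode_header, make_header

def lower(value):
    return value.lower()

def upper(value):
    return value.upper()

# Only the substitutions implied by the example above; the real table is unknown.
LEET = [("\\/", "v"), ("|", "i"), ("@", "a")]

def d3l33t(value):
    for symbol, letter in LEET:
        value = value.replace(symbol, letter)
    return value

def pstrip(value):
    """Drop everything that is not a letter A-Z (either case)."""
    return re.sub(r"[^A-Za-z]", "", value)

def csdecode(value):
    """Decode MIME encoded-words such as =?utf-8?b?...?= into plain text."""
    return str(make_header(decode_header(value)))

def htstrip(value):
    """Drop anything between a pair of angle brackets."""
    return re.sub(r"<[^>]*>", "", value)

def length(value):
    return len(value)

def ulratio(value):
    """Assumed formula: upper case letters as a percentage of all letters."""
    letters = [c for c in value if c.isalpha()]
    if not letters:
        return 0
    return round(100 * sum(c.isupper() for c in letters) / len(letters))

print(d3l33t("\\/|@gr@"))                # viagra
print(pstrip("V.I.A.G.R.A"))             # VIAGRA
print(csdecode("=?utf-8?b?VmlhZ3Jh?="))  # Viagra
print(htstrip("Vi<!-- x -->agra"))       # Viagra
print(length("Hi!"))                     # 3
print(ulratio("MAKE money"))             # 44
```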