This note provides guidance on balancing the need to limit spam with preserving openness of IETF mailing lists. To preserve the utility of mailing lists, some form of spam control MUST be used. Uncontrolled lists are a denial of service attack on list members; besides, complaints about the spam that arrives has been known to shut down such lists.
We focus on the most common method of spam control: mailing lists which use heuristic methods for automated message rejection to reduce the amount of spam delivered to list subscribers and to protect the subscribers from specific types of dangerous content.
This note has these goals:
a. Protect subscribers from disrupted communication due to the burden of spam.
b. Protect subscribers from dangerous content.
c. Minimize the rejection of legitimate messages and allow the list to function without undue burden on the administrator.
d. Protect the senders by giving them the information and methods they need for contacting the list administrator to request an override of an automated rejection.
e. Require that messages from legitimate senders that are not members of the list to be accepted.
The two extremes of mailing list operation run from a completely open list in which no submissions are rejected to moderated lists where a human reviewer must explicitly approve each submission. Neither approach is desirable in most cases. The more typical approach is to allow some types of messages to be distributed with no human check, while remaining messages are either rejected outright, or sent to a human reviewer for rejection or approval.
Manual filtering is acceptable, if the approver is able to respond to postings promptly (e,g,. within one working day). However, because of the massive increase in the spam rate, many sites have been forced to resort to automated filtering of various types. Under certain conditions, such automated filters MAY be used for IETF mailing lists. Note, though, the difference between spam control and moderation. If there is any reasonable doubt about whether or not a message is spam, it almost certainly is not. Moderation is governed by a separate policy, "IESG Guidance on the Moderation of IETF Working Group Mailing Lists."
If an automated spam detector is used, it should take several factors into account. These include, in decreasing order of importance:
a. Whether or not the purported sender is a subscriber to the mailing list. This alone SHOULD NOT be relied upon as evidence that a message isn't spam, given the prevalence of "joe jobs" (spam that impersonates real users) and spam-sending worms. Note that the following addresses MUST always be considered to be subscribers for this purpose: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org. It should also be possible for a subscriber to provide one or more alternate addresses, to be treated the same as their subscription address for this purpose.
It is not accepable to simply drop messages from non-subscribers.
b. The likelihood that the message is spam, based on textual evidence, statistical filters, and the like. Note the danger here of false positives;
c. Past behavior by that particular sending IP address;
A good spam detection mechanism will weight the various factors and react accordingly. A message from a known subscriber with the word "mortgage" is probably not spam; a message from a non-subscriber on a generic mail-hosting service that starts "I am Mariam Abacha" and talks about hidden bank accounts almost certainly is; email from a non-subscriber with the word "mortgage" may or may not be, and should be examined more closely.
Mail containing executable content (Unix shell scripts, Windows .exe files, and the like) or virus-infected attachments SHOULD be discarded immediately, regardless of other positive or negative indications. Mail containing other types of attachments -- HTML, Powerpoint, etc. -- is often inappropriate; posters are cautioned to use their own judgment for each such email. There is a tradeoff here -- large, bulky files are often best handled by sending their URLs instead; on the other hand, such practices mean that there will be no easily-accessible archival record of their contents. (Configuration of mailers to send HTML alone SHOULD NOT be done.)
In other cases, it MUST be possible for the sender of a legitimate message, whether a mailing list subscriber or not, to determine if it is has been dropped as apparent spam. This can be done in several ways; all of these have their advantages and disadvantages.
a. Provide a Web-accessible archive of postings rejected in the past few days. Few, if any, current mailing list packages currently do this, though some spam filters can easily be configured to do so. One tool written with this problem in mind is available at . This alternative is RECOMMENDED.
b. Provide an up-to-date archive of accepted postings. Unfortunately, while this can show dropped messages, it doesn't help if the email is merely delayed, nor does it say why a message was dropped. This MAY be used.
c. Have the receiving MTA scan the message and send a 500-class rejection to the DATA command. This can lead to mailer timeouts and retransmissions. This MAY be used, but (b) is a better alternative.
d. Email back a rejection message. In the event of forged spam -- and there's a lot of it -- this can lead to denial of service attacks on the purported sender. This SHOULD NOT be done.
Implicit in options (a) and (b) is that posting to an IETF mailing list is not guaranteed to be a reliable process -- it is up to the sender to verify that a message actually appears on the mailing list. This is not to mean that mailing lists are inherently unreliable, just that the burden for verification that postings are being delivered falls on the sender, not on the mailing list maintainer. Folks who post to lists they are not subscribed to need to pay particular attention.
If any spam filtering is done, there MUST be a bypass path by which someone can complain that a message was inappropriately blocked. Information about the bypass path SHOULD be available in the same place as other mailing list information - i.e., subscription address, mailing list owner, archive location. Such a complaint SHOULD be responded to within 2 business days. The usual appeal chain -- WG chairs, ADs, etc., per RFC 2026 -- applies.
Bill Fenner (for the IESG)
AT&T Labs - Research
75 Willow Rd.
Menlo Park, CA 94025