When Jon Postel wrote the SMTP standard (RFC 821, since replaced by RFC 2821 and then RFC 5321) in 1982, the Internet was a 13-year-old academic and military research network, and the TCP/IP portion consisted of just a handful of nodes (see the History of the Internet on Wikipedia). “Tootsie” was one of the highest grossing films released that year. The Advanced Mobile Phone System (AMPS) was approved by the FCC. Nobody had an iPhone yet, let alone an always-on connection to every other machine in the world.
In this early phase, there wasn’t much sense for Jon to add in security features such as authentication, because everyone on the network could be trusted, and they all knew each other. Fast forward three decades, and obviously the same thing can’t be said for today’s Internet. Yet despite the Internet’s pervasiveness, and the scale of the modern spam problem and other email-borne security concerns, the SMTP protocol has remained largely unchanged.
In the mid-2000s, with the exponential rise in email spam, the Internet industry got organized against email abuse, creating a number of standards that could be overlaid atop email to make it more secure and reliable. Arguably the most successful of these standards – in relation to the control of spam, anyhow – were Sender Policy Framework (SPF) and DomainKeys Identified Mail (DKIM). SPF and DKIM tie SMTP connections and email message content to a specific domain name, allowing receivers to validate that the connections and content are coming from a place that is consistent with the wishes of the domain name owner. Both standards are widely adopted, and play a helpful role in combatting abuse, by providing a new trustworthy identifier (the domain name) in addition to the IP address, that can be tracked to establish a sender’s behaviour and reputation.
However, while SPF and DKIM are useful standards, there has always been something missing: Email senders have – until now – had no way of communicating to email receivers their intentions around how their SPF and DKIM authenticated (and unauthenticated) email should be handled. For example, should bankofamerica.com email that originates from outside of Bank of America’s network be thrown in the junk, or just treated as a little more suspicious than usual? (Probably it should be junked – but you’d be surprised how difficult it is for a large organization to make this assertion without causing real harm to legitimate email.)
Thus, while senders could use SPF and DKIM to “claim ownership” of the content they sent out, senders had no standardized way to communicate their intentions to email receivers as to how they wished their SPF and DKIM policy to be enforced. Furthermore, receivers had no standardized way of letting senders know about the email they were receiving from a sender’s domain – whether that email was authentically from the sending domain, or from a fraudster. Well, now we have a solution: DMARC.
Domain-based Message Authentication, Reporting & Conformance (DMARC) provides a relatively simple new mechanism that lets domain owners tell email receivers how they want their domain treated when it comes to email traffic. DMARC also lets domain owners ask for feedback from receivers about the email traffic relating to their domain name. For example, domain owners can get a regular report from Google indicating who sent fraudulent email to Gmail users in violation of the domain owner’s DMARC policy.
Here’s what makes DMARC so cool: You can get reports back telling you exactly how your domain is being used/abused. This information can then be used to assist in identifying – very precisely – who is abusing your domain (for example, to send spam), and to what extent, so that the abuser can be taken down. Additionally, you can register to receive “forensic” reports – samples of messages sent in violation of your DMARC authentication policy. DMARC thus closes the loop between senders, receivers, and fraudsters, in a way that has never before been possible.
To get started with DMARC, simply create a TXT record in your DNS at the name _dmarc under your domain (for example, _dmarc.example.org), such as the following:
v=DMARC1; p=reject; rua=mailto:firstname.lastname@example.org; ruf=mailto:email@example.com;
Once you publish this TXT record, Gmail and other major receivers will start sending you regular feedback about how they are seeing your domain. Reports about all email from your domain will be sent to the firstname.lastname@example.org address; samples of fraudulent emails using your domain will be sent to email@example.com.
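If you want to sanity-check a record before publishing it, the tag=value layout is easy to pick apart. Here is a minimal sketch in Python; the function name parse_dmarc_record is my own invention for illustration, and it only splits the tags without validating their values against the DMARC specification:

```python
def parse_dmarc_record(txt: str) -> dict[str, str]:
    """Split a DMARC TXT record into a {tag: value} dictionary.

    Illustrative helper only: it checks the version tag but does not
    validate policy values or report addresses.
    """
    tags = {}
    for part in txt.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate the trailing semicolon
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed tag: {part!r}")
        tags[name.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record: missing v=DMARC1")
    return tags


record = ("v=DMARC1; p=reject; rua=mailto:firstname.lastname@example.org; "
          "ruf=mailto:email@example.com;")
tags = parse_dmarc_record(record)
print(tags["p"])    # requested policy
print(tags["rua"])  # where aggregate reports go
print(tags["ruf"])  # where forensic samples go
```

The p tag carries the policy you are asking receivers to apply, while rua and ruf carry the two reporting addresses described above.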
In the coming months, doubtless various firms will offer free DMARC monitoring services that can take in and process this feedback data. Until then, enjoy receiving your shiny new DMARC reports. And stay tuned to this exciting new anti-abuse mechanism.
Tags: dkim, dmarc, spf
October 2nd, 2008
The standards that govern Internet email have just been replaced. RFCs 2822 and 2821 are now officially obsoleted by RFCs 5322 and 5321. What’s changed? I downloaded both standards, ran the raw text through a filter to remove extraneous things like page headers, and then compared the documents using Microsoft Word.
It looks like most of the changes are intended to resolve ambiguities in the old standard. Here’s what I found:
Changes from RFC 2822 to 5322 (Internet Message Format)
1. Slight changes to the rules for message headers in RFC 2822/5322 (the crossed out text is from 2822; underlined text is from 5322) – looks like the standard is locking down on the definition of “header folding” somewhat, which has previously been an area of some ambiguity:
2.2. Header Fields
Header fields are lines
composed of a field name, followed by a colon (“:”), followed by a field body, and terminated by CRLF. A field
name MUST be composed of printable US-ASCII characters (i.e.,
characters that have values between 33 and 126, inclusive), except
colon. A field body may be composed of
any US-ASCII characters, except for CR and LF. However, a field body may contain CRLF when used in
header “folding” and “unfolding” as described in section
2. Unfolded header lines can be arbitrarily long. I worry when I see the words “arbitrarily long,” because it means that an implementation that scans headers for security reasons now must be able to flexibly allocate storage for headers during processing rather than allocating a fixed buffer:
The process of moving from this folded multiple-line representation
of a header field to its single line representation is called
“unfolding”. Unfolding is accomplished by simply removing any CRLF
that is immediately followed by WSP. Each header field should be
treated in its unfolded form for further syntactic and semantic
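The unfolding rule quoted above is short enough to sketch in a few lines of Python. This is only an illustration of the rule (the function name unfold is mine), not a full header parser. Because Python strings grow as needed, it sidesteps the fixed-buffer problem, although a real scanner would probably still want to cap the total size it accepts:

```python
import re

# A CRLF that is immediately followed by a space or tab marks a folded line.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(header_field: str) -> str:
    """Remove every CRLF that is immediately followed by WSP."""
    return _FOLD.sub("", header_field)


folded = "Subject: This is a\r\n test of header\r\n\tfolding\r\n"
unfolded = unfold(folded)
print(repr(unfolded))
```

Note that only the CRLF is removed; the white space that followed it stays in the unfolded value, and the CRLF that terminates the field is left alone.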
3. Tightening up the email address format. There is now a note recommending (but not requiring – one step at a time, folks) that the domain portion of an email address should actually be a legitimate domain in the context that an email message is being used:
Note: A liberal syntax for the domain portion of addr-spec is
given here. However, the domain portion contains addressing
information specified by and used in other protocols (e.g.,
[RFC1034], [RFC1035], [RFC1123], [RFC5321]). It is therefore
incumbent upon implementations to conform to the syntax of
addresses for the context in which they are used.
4. Quoted text is no longer allowed in the Message-Id header field. Message-Ids are incredibly important in email systems. They permit a receiver to identify whether two messages are actually duplicates of the same message. The old standard permitted quoted strings to be included in Message-Ids, but these are now prohibited.
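To show why Message-Ids matter, here is a rough sketch of duplicate detection using Python’s standard email package. The deduplicate function and the sample messages are made up for illustration; a real system would also have to decide what to do with messages that carry no Message-Id at all (this sketch simply keeps them):

```python
from email import message_from_string


def deduplicate(raw_messages):
    """Keep the first copy of each message, keyed on its Message-ID."""
    seen = set()
    unique = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = (msg["Message-ID"] or "").strip()
        if msg_id and msg_id in seen:
            continue  # same identifier as an earlier message: a duplicate
        if msg_id:
            seen.add(msg_id)
        unique.append(raw)
    return unique


first = "Message-ID: <1234@example.com>\nSubject: hello\n\nbody\n"
copy = "Message-ID: <1234@example.com>\nSubject: hello\n\nbody\n"
other = "Message-ID: <5678@example.com>\nSubject: hello again\n\nbody\n"
kept = deduplicate([first, copy, other])
print(len(kept))
```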
5. The Received: header definition has now been mostly moved into RFC5321 (the SMTP standard). This is helpful, because the Received: header contains information concerning the hosts through which an email message traverses, which really has more to do with SMTP than the Internet Message Format defined in RFC5322.
Changes from RFC 2821 to RFC 5321 (Simple Mail Transfer Protocol)
1. The most significant thing to note is that RFC 5321 no longer describes itself as a self-contained specification for SMTP. It merely “consolidates, updates, and clarifies several previous documents, making all or parts of most of them obsolete,” covering “SMTP extension mechanisms and best practices for the contemporary Internet…”:
This document is a self-contained specification of the basic protocol for the Internet electronic mail transport. It consolidates, updates and clarifies, but doesn’t add new or change existing functionality of the following:
- the original SMTP (Simple Mail Transfer Protocol) specification of RFC 821,
- domain name system requirements and implications for mail transport from RFC 1035 and RFC 974,
- the clarifications and applicability statements in RFC 1123, and
- material drawn from the SMTP Extension mechanisms.
It obsoletes RFC 821, RFC 974, and updates RFC 1123 (replaces the
mail transport materials of RFC 1123). However, RFC 821 specifies
some features that were not in significant use in the Internet by the
mid-1990s and (in appendices) some additional transport models.
Those sections are omitted here in the interest of clarity and
brevity; readers needing them should refer to RFC 821.
It also includes some additional material from RFC 1123 that required
amplification. This material has been identified in multiple ways,
mostly by tracking flaming on various lists and newsgroups and
problems of unusual readings or interpretations that have appeared as
the SMTP extensions have been deployed. Where this specification
moves beyond consolidation and actually differs from earlier
documents, it supersedes them technically as well as textually.
Although SMTP was designed as a mail transport and delivery protocol,
this specification also contains information that is important to its
use as a ‘mail submission’ protocol, as recommended for POP [3, 26]
and IMAP. Additional submission issues are discussed in RFC 2476
Section 2.3 provides definitions of terms specific to this document.
Except when the historical terminology is necessary for clarity, this
document uses the current ‘client’ and ‘server’ terminology to
identify the sending and receiving SMTP processes, respectively.
A companion document  discusses message headers, message bodies
and formats and structures for them, and their relationship.
2. Using port 587 is (finally) recommended. The message submission protocol basically lets you use a subset of SMTP to send messages via a trusted gateway. Most ISPs now provide port-587 access for sending email, which allows them to shut off port 25 from their subscriber networks.
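3. No spaces around the colon in MAIL FROM. This is one I hadn’t noticed in the previous standard, but which is now clarified: spaces are not permitted on either side of the colon that follows FROM in the SMTP MAIL command (or TO in the RCPT command), so “MAIL FROM:<user@example.com>” is valid but “MAIL FROM: <user@example.com>” is not.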
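For the curious, here is a minimal sketch of a strict check for this rule in Python. It relies on RFC 5321’s statement that no space may appear on either side of the colon in the MAIL command. The function name is_strict_mail_from is my own, and the pattern is deliberately simplified: it does not validate the address inside the angle brackets or the syntax of any trailing parameters.

```python
import re

# Simplified pattern: "MAIL FROM:" immediately followed by <reverse-path>,
# then optional space-separated parameters. Not a full RFC 5321 grammar.
_MAIL_FROM = re.compile(r"MAIL FROM:<[^<>\s]*>( \S.*)?", re.IGNORECASE)


def is_strict_mail_from(line: str) -> bool:
    """Return True if the MAIL command has no space around the colon."""
    return _MAIL_FROM.fullmatch(line.rstrip("\r\n")) is not None


print(is_strict_mail_from("MAIL FROM:<user@example.com>"))
print(is_strict_mail_from("MAIL FROM: <user@example.com>"))
```

Whether a receiving server should actually reject the sloppy form is a local policy decision; plenty of servers tolerate it.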
4. Explicit recognition of SPF and DKIM. The SMTP standard completely lacks a method for verifying whether the purported sender of a message is who they say they are. The new standard acknowledges external mechanisms like SPF and DKIM as ways to help identify the actual sender of a message, although it stops short of recommending any particular one:
This specification does not deal with the verification of return
paths for use in delivery notifications. Recent work, such as that
on SPF and DKIM, has been done to provide ways to
ascertain that an address is valid or belongs to the person who
actually sent the message. A server MAY attempt to verify the return
path before using its address for delivery notifications, but methods
of doing so are not defined here nor is any particular method
recommended at this time.
5. It’s now quite legal to disconnect an SMTP session after detecting a timeout. Everyone did this anyhow, so it’s nice to see it finally recognized:
An SMTP server MUST NOT intentionally close the connection
except:
- After receiving a QUIT command and responding with a 221 reply.
- After detecting the need to shut down the SMTP service and
returning a 421 response code. This response code can be issued
after the server receives any command or, if necessary,
asynchronously from command receipt (on the assumption that the
client will receive it after the next command is issued).
6. 100-series reply codes have now been removed. These codes were never really used anyhow:
1yz Positive Preliminary reply
The command has been accepted, but the requested action is being
held in abeyance, pending confirmation of the information in this
reply. The SMTP client should send another command specifying
whether to continue or abort the action. Note: unextended SMTP
does not have any commands that allow this type of reply, and so
does not have continue or abort commands.
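With the 1yz class gone, the first digit of a reply code falls into one of four classes. A small sketch of how a client might classify replies follows; the class names come from the standard’s reply-code theory, while the function name and the fallback label are my own:

```python
# First-digit classes that remain in RFC 5321.
REPLY_CLASSES = {
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}


def classify_reply(line: str) -> str:
    """Classify an SMTP reply line by the first digit of its code."""
    code = line[:3]
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an SMTP reply line: {line!r}")
    # Anything else, including the old 1yz class, has no defined meaning.
    return REPLY_CLASSES.get(code[0], "not defined in RFC 5321")


print(classify_reply("250 OK"))
print(classify_reply("421 Service not available"))
```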
7. You can now send back 550 responses after DATA when a message is refused for policy reasons. This is a great step forward, finally recognizing the concept of inline spam and virus filtering.
8. IPv6 support is now explicitly mentioned, although not required:
5.2. IPv6 and MX Records
In the contemporary Internet, SMTP clients and servers may be hosted
on IPv4 systems, IPv6 systems, or dual-stack systems that are
compatible with either version of the Internet Protocol. The host
domains to which MX records point may, consequently, contain “A RR”s
(IPv4), “AAAA RR”s (IPv6), or any combination of them. While RFC
3974  discusses some operational experience in mixed
environments, it was not comprehensive enough to justify
standardization, and some of its recommendations appear to be
inconsistent with this specification. The appropriate actions to be
taken either will depend on local circumstances, such as performance
of the relevant networks and any conversions that might be necessary,
or will be obvious (e.g., an IPv6-only client need not attempt to
look up A RRs or attempt to reach IPv4-only servers). Designers of
SMTP implementations that might run in IPv6 or dual-stack
environments should study the procedures above, especially the
comments about multihomed hosts, and, preferably, provide mechanisms
to facilitate operational tuning and mail interoperability between
IPv4 and IPv6 systems while considering local circumstances.
9. A new section dealing specifically with abusive or attack messages. Section 6.2 argues that messages should be delivered to recipients unless the receiving system is absolutely sure that they are bad, and in the case where a message is bad, a bounce should be sent if possible to the sender. Silently discarding messages is not prohibited, but it is strongly discouraged. Bounces should only be sent if the receiver knows they’ll be “usefully delivered.” This is code for: Don’t send bounces when your system rejects spam or virus traffic, or you’ll become an Internet pariah.
Conversely, if a message is rejected because it is found to contain
hostile content (a decision that is outside the scope of an SMTP
server as defined in this document), rejection (“bounce”) messages
SHOULD NOT be sent unless the receiving site is confident that those
messages will be usefully delivered.
10. Directory Harvest Attack (DHA) prevention and other kinds of SMTP server protection are now legal. This is a great thing to see in the standard, because it allows receivers to claim innocence if they refuse to service connections from hostile senders.
That’s all for now. I eagerly await your comments – by RFC 5321 email or otherwise.
Tags: dkim, email, reject, RFC, smtp, standards
April 7th, 2008
Once Promising Proposals for a Final Ultimate Solution to the Spam Problem (FUSSP)
“Two years from now, spam will be solved.”
That was Bill Gates’ famous pronouncement back in 2004. Microsoft, Yahoo and the open source community devised two techniques that they believed would eradicate spam. The first was sender authentication, which allowed email senders to provide a list of the servers permitted to send email for users within their domain. The idea was that sender authentication would eliminate spammers spoofing legitimate email addresses, and allow for the creation of a permanent, ironclad white list of trustworthy domains that never send spam, thus allowing recipients to simply block everything not on the white list and end spam forever.
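To make the sender authentication idea concrete, here is a rough sketch of the check a receiving server performs: is the connecting IP address on the list its claimed domain has published? In real deployments the list is published in DNS; the in-memory AUTHORIZED table, the domain, and the address ranges below are all hypothetical stand-ins for illustration.

```python
import ipaddress

# Hypothetical published policy: domain -> networks allowed to send its mail.
# Real systems look this up in DNS rather than in a local table.
AUTHORIZED = {
    "example.org": ["192.0.2.0/24", "2001:db8::/32"],
}


def is_authorized_sender(domain: str, client_ip: str, policy=AUTHORIZED) -> bool:
    """Return True if client_ip falls inside a network listed for domain."""
    ip = ipaddress.ip_address(client_ip)
    for entry in policy.get(domain.lower(), []):
        network = ipaddress.ip_network(entry)
        # Compare only like with like (IPv4 vs IPv4, IPv6 vs IPv6).
        if ip.version == network.version and ip in network:
            return True
    return False


print(is_authorized_sender("example.org", "192.0.2.25"))
print(is_authorized_sender("example.org", "198.51.100.7"))
```

Notice what the check does not tell you: a pass means the mail really came from the domain, not that the domain is well behaved. That gap is why the white list never materialized.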
Another idea pitched in 2004 was the computational challenge. Senders would, upon connecting to a receiving email server, have to spend considerable CPU cycles computing the answer to a mathematical challenge provided by the receiving server. Bill Gates believed this approach would stop spam by making it cost too much to send the high volumes of email required to make spamming profitable.
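A hashcash-style puzzle gives a feel for how such a challenge works. I am not claiming this is the exact scheme that was proposed in 2004; it is a minimal sketch in which the sender must find a number (a nonce) whose SHA-256 hash, combined with the server’s challenge string, falls below a target. The function names, the challenge string, and the difficulty value are all assumptions for illustration.

```python
import hashlib
from itertools import count


def _digest_value(challenge: str, nonce: int) -> int:
    """Hash the challenge and nonce together and read it as an integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def solve_challenge(challenge: str, difficulty_bits: int) -> int:
    """Sender side: brute-force a nonce. Cost doubles with each extra bit."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify_solution(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Receiver side: a single hash is enough to check the work."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


challenge = "mail-to:recipient@example.com"  # hypothetical challenge string
nonce = solve_challenge(challenge, 12)       # about 4,000 hashes on average
print(verify_solution(challenge, nonce, 12))
```

The asymmetry is the whole point: the sender pays thousands of hashes per message, the receiver pays one. It is also the weakness, because a legitimate mailing list pays exactly the same price per message as a spammer.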
Unfortunately, neither sender authentication nor the computational challenge technique resolved the spam problem. Computational challenges were rejected as being too costly for legitimate bulk email senders (airlines, banks, open source mailing lists, etc.). And sender authentication, while eventually enjoying widespread adoption in the form of DKIM and Sender ID, proved prone to errors. As a result, it has remained useful mostly for the acceptance of legitimate email and phishing protection rather than the rejection of spam.
By 2005, what the anti-spam community was getting right was content filtering. Once spam filters had reached above the 90 per cent accuracy level, spam transitioned from a problem of content to a problem of volume: the spammers simply send more spam. And they can do this because the recipient, rather than the spammer, pays the cost of content filtering.
The cost of a resource-consuming filtering system increases during high traffic loads. If you block spam content, spammers will find new ways to get around it. Bill Gates was right: the only way to stop them is to make spam too costly to send. If you do, spammers are left to find new targets that are easier to hit.
Tags: accuracy, anti-spam, bill gates, content-filters, dkim, economics, high traffic loads, microsoft, spam, spammers, spamonomics, yahoo