
An update to the email standards

By Ken Simpson | 4 minute read

The standards that govern Internet email have just been replaced. RFCs 2822 and 2821 are now officially obsoleted by RFCs 5322 and 5321. What’s changed? I downloaded both standards, ran the raw text through a filter to remove extraneous things like page headers, and then compared the documents using Microsoft Word.

It looks like most of the changes are intended to resolve ambiguities in the old standard. Here’s what I found:

Changes from RFC 2822 to 5322 (Internet Message Format)

1. Slight changes to the rules for message headers in RFC 2822/5322 (the crossed-out text is from 2822; the remaining text is from 5322). It looks like the standard is tightening the definition of “header folding,” which has previously been an area of some ambiguity:

2.2. Header Fields
Header fields are lines ~~composed of~~ beginning with a field name,
followed by a colon (“:”), followed by a field body, and terminated
by CRLF. A field name MUST be composed of printable US-ASCII
characters (i.e., characters that have values between 33 and 126,
inclusive), except colon. A field body may be composed of ~~any~~
printable US-ASCII characters~~, except for CR and LF. However, a
field body may contain CRLF~~ as well as the space (SP, ASCII value
32) and horizontal tab (HTAB, ASCII value 9) characters (together
known as the white space characters, WSP). A field body MUST NOT
include CR and LF except when used in ~~header~~ “folding” and
“unfolding”, as described in section

2. Unfolded header lines can be arbitrarily long. I worry when I see the words “arbitrarily long,” because it means that an implementation that scans headers for security reasons now must be able to flexibly allocate storage for headers during processing rather than allocating a fixed buffer:

The process of moving from this folded multiple-line representation
of a header field to its single line representation is called
“unfolding”. Unfolding is accomplished by simply removing any CRLF
that is immediately followed by WSP. Each header field should be
treated in its unfolded form for further syntactic and semantic
evaluation. An unfolded header field has no length restriction and
therefore may be indeterminately long.
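The unfolding rule quoted above is simple enough to sketch in a few lines of Python. This is a minimal illustration, not a full header parser: the function name `unfold` and the `MAX_UNFOLDED_LENGTH` cap are my own assumptions (the RFC itself sets no limit), and the cap is there only to show one way a scanner might protect itself against very long fields.

```python
import re

# Hypothetical safety cap chosen for illustration; RFC 5322 places no
# length restriction on an unfolded header field.
MAX_UNFOLDED_LENGTH = 64 * 1024

# A CRLF counts as a fold only when it is immediately followed by WSP
# (space or horizontal tab). The lookahead keeps the WSP in place.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(raw_field: str, limit: int = MAX_UNFOLDED_LENGTH) -> str:
    """Remove every CRLF that is immediately followed by WSP."""
    unfolded = _FOLD.sub("", raw_field)
    if len(unfolded) > limit:
        raise ValueError("unfolded header field exceeds local limit")
    return unfolded


if __name__ == "__main__":
    folded = "Subject: This is a\r\n test of\r\n\tfolding\r\n"
    print(repr(unfold(folded)))
```

The terminating CRLF is left alone because it is not followed by white space. Rejecting oversized fields is a local policy choice, not something the standard asks for.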

3. Tightening up the email address format. There is now a note recommending (but not requiring – one step at a time, folks) that the domain portion of an email address actually be a legitimate domain in the context in which the message is being used:

Note: A liberal syntax for the domain portion of addr-spec is
given here. However, the domain portion contains addressing
information specified by and used in other protocols (e.g.,
[RFC1034], [RFC1035], [RFC1123], [RFC5321]). It is therefore
incumbent upon implementations to conform to the syntax of
addresses for the context in which they are used.

4. Quoted text is no longer allowed in the Message-Id header field. Message-Ids are incredibly important in email systems. They permit a receiver to identify whether two messages are actually duplicates of the same message. The old standard permitted quoted strings to be included in Message-Ids, but these are now prohibited.
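As a rough sketch of what the stricter Message-Id form looks like in practice, the Python check below accepts only an unquoted left-hand side and either a dotted name or a bracketed literal on the right. The function name `is_strict_msg_id` is hypothetical. The pattern covers only the non-obsolete grammar as I read it, so a receiver that must tolerate legacy mail may still need to accept older forms.

```python
import re

# Characters allowed in an unquoted "atom" (atext in the RFC grammar).
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# One or more atoms separated by single dots, e.g. "1234.abcd".
_DOT_ATOM_TEXT = rf"{_ATEXT}+(?:\.{_ATEXT}+)*"
# A bracketed literal with no folding, e.g. "[192.0.2.1]".
_NO_FOLD_LITERAL = r"\[[\x21-\x5a\x5e-\x7e]*\]"

_MSG_ID = re.compile(
    rf"<{_DOT_ATOM_TEXT}@(?:{_DOT_ATOM_TEXT}|{_NO_FOLD_LITERAL})>"
)


def is_strict_msg_id(value: str) -> bool:
    """Return True if value is a Message-Id with no quoted strings."""
    return _MSG_ID.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    print(is_strict_msg_id("<1234.abcd@example.com>"))
    print(is_strict_msg_id('<"quoted id"@example.com>'))
```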

5. The Received: header definition has now been mostly moved into RFC 5321 (the SMTP standard). This is helpful, because the Received: header records the hosts through which an email message passes, which really has more to do with SMTP than with the Internet Message Format defined in RFC 5322.

Changes from RFC 2821 to RFC 5321 (Simple Mail Transfer Protocol)

1. The most significant thing to note is that RFC 5321 is not a self-contained specification for SMTP. It merely “consolidates, updates, and clarifies several previous documents, making all or parts of most of them obsolete,” covering, “SMTP extension mechanisms and best practices for the contemporary Internet…”:

This document is a ~~self-contained~~ specification of the basic protocol
for ~~the~~ Internet electronic mail transport. It consolidates, updates,
and clarifies~~, but doesn’t add new or change existing functionality
of the following:~~ several previous documents, making all or parts of
most of them obsolete. It covers the SMTP extension mechanisms and
best practices for the contemporary Internet, but does not provide
details about particular extensions. Although SMTP was designed as a
mail transport and delivery protocol, this specification also contains
information that is important to its use as a “mail submission”
protocol for “split-UA” (User Agent) mail reading systems and mobile
environments.
~~– the original SMTP (Simple Mail Transfer Protocol) specification of
RFC 821 [30],~~
~~– domain name system requirements and implications for mail
transport from RFC 1035 [22] and RFC 974 [27],~~
~~– the clarifications and applicability statements in RFC 1123 [2],
and~~
~~– material drawn from the SMTP Extension mechanisms [19].~~
~~It obsoletes RFC 821, RFC 974, and updates RFC 1123 (replaces the
mail transport materials of RFC 1123). However, RFC 821 specifies
some features that were not in significant use in the Internet by the
mid-1990s and (in appendices) some additional transport models.
Those sections are omitted here in the interest of clarity and
brevity; readers needing them should refer to RFC 821.~~
~~It also includes some additional material from RFC 1123 that required
amplification. This material has been identified in multiple ways,
mostly by tracking flaming on various lists and newsgroups and
problems of unusual readings or interpretations that have appeared as
the SMTP extensions have been deployed. Where this specification
moves beyond consolidation and actually differs from earlier
documents, it supersedes them technically as well as textually.~~
~~Although SMTP was designed as a mail transport and delivery protocol,
this specification also contains information that is important to its
use as a ‘mail submission’ protocol, as recommended for POP [3, 26]
and IMAP [6]. Additional submission issues are discussed in RFC 2476
[15].~~
~~Section 2.3 provides definitions of terms specific to this document.
Except when the historical terminology is necessary for clarity, this
document uses the current ‘client’ and ‘server’ terminology to
identify the sending and receiving SMTP processes, respectively.~~
~~A companion document [32] discusses message headers, message bodies
and formats and structures for them, and their relationship.~~

2. Using port 587 is (finally) recommended. The message submission protocol basically lets you use a subset of SMTP to send messages via a trusted gateway. Most ISPs now provide port-587 access for sending email, which allows them to shut off port 25 from their subscriber networks:

In many situations and configurations, the less-
capable clients discussed above SHOULD be using the message
submission protocol (RFC 4409 [18]) rather than SMTP.
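For a client, moving to the submission port is mostly a configuration change. The sketch below uses Python's standard `smtplib`. The helper names `build_message` and `submit` are made up for illustration, and the use of STARTTLS plus a login is an assumption about how a typical provider sets up port 587, not something the quoted text requires.

```python
import smtplib
from email.message import EmailMessage

SUBMISSION_PORT = 587  # message submission, as opposed to port 25 relay


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a simple text message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def submit(msg: EmailMessage, host: str, username: str, password: str) -> None:
    """Hand the message to a submission server (assumes STARTTLS and AUTH)."""
    with smtplib.SMTP(host, SUBMISSION_PORT, timeout=30) as client:
        client.starttls()                  # upgrade the session to TLS
        client.login(username, password)   # authenticate to the gateway
        client.send_message(msg)
```

Calling `submit(msg, "mail.example.net", "user", "secret")` would attempt a real connection, so the host name and credentials here are placeholders only.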

3. No spaces around the colon in MAIL FROM. This is one I hadn’t noticed in the previous standard, but which is now clarified. Neither MAIL FROM nor RCPT TO may have a space on either side of the colon:

Since it has been a common source of errors, it is worth noting that
spaces are not permitted on either side of the colon following FROM
in the MAIL command or TO in the RCPT command. The syntax is exactly
as given above.
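A server or test harness could flag the offending commands with a check along these lines. It is a deliberately simplified sketch: the function name `has_valid_colon_spacing` is hypothetical, and the patterns only look at the spacing around the colon and the angle brackets. They do not implement the full path grammar (for example, quoted local parts containing spaces are not handled).

```python
import re

# "MAIL FROM:" followed directly by a bracketed path; the path may be
# empty ("<>") for bounces. Optional parameters follow after a space.
_MAIL = re.compile(r"MAIL FROM:<[^<>\s]*>(?: .*)?", re.IGNORECASE)
# "RCPT TO:" followed directly by a non-empty bracketed path.
_RCPT = re.compile(r"RCPT TO:<[^<>\s]+>(?: .*)?", re.IGNORECASE)


def has_valid_colon_spacing(line: str) -> bool:
    """Return True if a MAIL or RCPT command has no space around the colon."""
    command = line.rstrip("\r\n")
    return bool(_MAIL.fullmatch(command) or _RCPT.fullmatch(command))


if __name__ == "__main__":
    for candidate in (
        "MAIL FROM:<alice@example.com>",
        "MAIL FROM: <alice@example.com>",
        "RCPT TO :<bob@example.org>",
    ):
        print(candidate, has_valid_colon_spacing(candidate))
```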

4. Explicit recognition of SPF and DKIM. The SMTP standard completely lacks a method for verifying whether the purported sender of a message is who they say they are. The new standard acknowledges external mechanisms such as SPF and DKIM that help identify the actual sender of a message, although it stops short of recommending any particular method:

This specification does not deal with the verification of return
paths for use in delivery notifications. Recent work, such as that
on SPF [29] and DKIM [30] [31], has been done to provide ways to
ascertain that an address is valid or belongs to the person who
actually sent the message. A server MAY attempt to verify the return
path before using its address for delivery notifications, but methods
of doing so are not defined here nor is any particular method
recommended at this time.

5. It’s now quite legal to disconnect an SMTP session after detecting a timeout. Everyone did this anyhow, so it’s nice to see it finally recognized:

An SMTP server MUST NOT intentionally close the connection ~~except:~~
under normal operational circumstances (see Section 7.8) except:
o After receiving a QUIT command and responding with a 221 reply.
o After detecting the need to shut down the SMTP service and
returning a 421 response code. This response code can be issued
after the server receives any command or, if necessary,
asynchronously from command receipt (on the assumption that the
client will receive it after the next command is issued).
o After a timeout, as specified in Section 4.5.3.2, occurs waiting
for the client to send a command or data.
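The timeout case can be sketched with nothing more than a socket timeout. The helper name `read_command` is made up for illustration, and the five-minute default is my reading of the server timeout guidance in the section the excerpt cites, so treat the exact value as an assumption.

```python
import socket
from typing import Optional

# Assumed default: five minutes of waiting for the next command.
COMMAND_TIMEOUT_SECONDS = 300.0


def read_command(conn: socket.socket,
                 timeout: float = COMMAND_TIMEOUT_SECONDS) -> Optional[bytes]:
    """Return the next chunk of client input, or None after closing on timeout."""
    conn.settimeout(timeout)
    try:
        return conn.recv(4096)
    except socket.timeout:
        # The client went quiet for too long, so dropping the
        # connection is now explicitly permitted.
        conn.close()
        return None
```

A real server would buffer input until it sees a complete CRLF-terminated line. This sketch only shows where the timeout decision sits.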

6. 100-series reply codes have now been removed. These codes were never really used anyhow:

1yz Positive Preliminary reply
The command has been accepted, but the requested action is being
held in abeyance, pending confirmation of the information in this
reply. The SMTP client should send another command specifying
whether to continue or abort the action. Note: unextended SMTP
does not have any commands that allow this type of reply, and so
does not have continue or abort commands.

7. You can now send back 550 responses after DATA, when the message could not be queued for policy violations. This is a great step forward, finally recognizing the concept of inline spam and virus filtering.
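In code, inline filtering amounts to choosing the final reply only after the message body has been received and inspected. The following is a toy sketch: `reply_after_data` and `toy_policy` are hypothetical names, the reply text is illustrative, and the "policy" is a stand-in for whatever spam or virus scanner a real system would call.

```python
from typing import Callable


def reply_after_data(message: bytes,
                     violates_policy: Callable[[bytes], bool]) -> str:
    """Pick the reply sent once the client has finished sending the message."""
    if violates_policy(message):
        # Reject during the SMTP session instead of accepting and bouncing later.
        return "550 Message rejected for policy reasons"
    return "250 OK"


def toy_policy(message: bytes) -> bool:
    """Stand-in filter: flags messages carrying a made-up test header."""
    return b"X-Test-Spam: yes" in message


if __name__ == "__main__":
    print(reply_after_data(b"Subject: hi\r\n\r\nhello\r\n", toy_policy))
    print(reply_after_data(b"X-Test-Spam: yes\r\n\r\nbuy now\r\n", toy_policy))
```

Because the sender learns of the rejection within the session, the receiving system never has to generate a bounce of its own.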

8. IPv6 support is now explicitly mentioned, although not required:

5.2. IPv6 and MX Records
In the contemporary Internet, SMTP clients and servers may be hosted
on IPv4 systems, IPv6 systems, or dual-stack systems that are
compatible with either version of the Internet Protocol. The host
domains to which MX records point may, consequently, contain “A RR”s
(IPv4), “AAAA RR”s (IPv6), or any combination of them. While RFC
3974 [39] discusses some operational experience in mixed
environments, it was not comprehensive enough to justify
standardization, and some of its recommendations appear to be
inconsistent with this specification. The appropriate actions to be
taken either will depend on local circumstances, such as performance
of the relevant networks and any conversions that might be necessary,
or will be obvious (e.g., an IPv6-only client need not attempt to
look up A RRs or attempt to reach IPv4-only servers). Designers of
SMTP implementations that might run in IPv6 or dual-stack
environments should study the procedures above, especially the
comments about multihomed hosts, and, preferably, provide mechanisms
to facilitate operational tuning and mail interoperability between
IPv4 and IPv6 systems while considering local circumstances.

9. A new section dealing specifically with abusive or attack messages. Section 6.2 argues that messages should be delivered to recipients unless the receiving system is absolutely sure that they are bad, and that where a message is bad, a bounce should be sent to the sender if possible. Silently discarding messages is not prohibited, but it is strongly discouraged. Bounces should only be sent if the receiver knows they’ll be “usefully delivered.” This is code for: Don’t send bounces when your system rejects spam or virus traffic, or you’ll become an Internet pariah.

Conversely, if a message is rejected because it is found to contain
hostile content (a decision that is outside the scope of an SMTP
server as defined in this document), rejection (“bounce”) messages
SHOULD NOT be sent unless the receiving site is confident that those
messages will be usefully delivered.

10. Directory Harvest Attack (DHA) prevention and other kinds of SMTP server protection are now legal. This is a great thing to see in the standard, because it allows receivers to claim innocence if they refuse to service connections from hostile senders.

7.8. Resistance to Attacks
In recent years, there has been an increase of attacks on SMTP
servers, either in conjunction with attempts to discover addresses
for sending unsolicited messages or simply to make the servers
inaccessible to others (i.e., as an application-level denial of
service attack). While the means of doing so are beyond the scope of
this Standard, rational operational behavior requires that servers be
permitted to detect such attacks and take action to defend
themselves. For example, if a server determines that a large number
of RCPT TO commands are being sent, most or all with invalid
addresses, as part of such an attack, it would be reasonable for the
server to close the connection after generating an appropriate number
of 5yz (normally 550) replies.
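The example in the excerpt (close the connection after enough rejected RCPT TO commands) maps onto a small per-connection counter. This is a sketch under assumptions: the class name `RecipientGuard` and the threshold of 10 are invented for illustration, since the standard leaves the "appropriate number" to the operator.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RecipientGuard:
    """Per-connection count of RCPT TO commands that named unknown users."""

    max_invalid: int = 10   # hypothetical threshold; tune for your site
    invalid_seen: int = 0

    def record(self, recipient_exists: bool) -> Tuple[str, bool]:
        """Return (reply line, whether the server should now close)."""
        if recipient_exists:
            return "250 OK", False
        self.invalid_seen += 1
        should_close = self.invalid_seen >= self.max_invalid
        return "550 No such user here", should_close


if __name__ == "__main__":
    guard = RecipientGuard(max_invalid=3)
    for exists in (False, True, False, False):
        print(guard.record(exists))
```

Every invalid recipient still gets its 550 reply. The flag only tells the caller when it would be reasonable to hang up afterwards.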

That’s all for now. I eagerly await your comments – by RFC 5321 email or otherwise.
