
Mail Filters

1. ABSTRACT
2. INTRODUCTION
3. ABOUT ORGANISATION
4. SRS DOCUMENT
5. DESIGN PRINCIPLES & EXPLANATION
6. DESIGN DOCUMENT
   6.1 SYSTEM DESIGN
7. PROJECT DICTIONARY
   7.1 UML DIAGRAMS
8. FORMS & REPORTS
   8.1 I/O SPECIMENS
   8.2 I/O SAMPLES
9. TESTING
   9.1 TEST CRITERIA & TEST CASES
   9.2 TEST REPORT & ANALYSIS
10. IMPLEMENTATION & USER MANUALS
11. CONCLUSION
12. BIBLIOGRAPHY


A Filter (also called a Mail Filter; the two terms are used interchangeably) is the core of the decision making in Mail Fetch. A filter decides on a per-mail basis whether the message should be downloaded or not. A pipeline of filters is set up in the configuration, and each message that is a candidate for download is passed through this pipeline. At any point in the pipeline, a filter can indicate that the message should not be processed any further. For example, a sender-based SPAM filter could find a match in its list of spammers and reject the message.

There are two kinds of filters: global and local. This is not an attribute of the filter itself; it depends on how the filter is used. Local filters are associated with a single maildrop, whereas global filters apply to all maildrops. For example, you might want a Message-ID filter to apply to all maildrops, but keep a sender-based filter only for the maildrop where you expect mail from that sender.

A filter has the single job of deciding whether or not to download a single message. The actual decision of whether to download a mail is made by a sequence of filters. There can be a global set of filters as well as a per-maildrop set. A maildrop represents the mailbox from which you want to download your mail.

This project concentrates on the main, basic filters: HeaderMailFilter, MessageIDMailFilter, NullFilter, ReceipientMailFilter, SenderMailFilter, SizeMailFilter, and SubjectMailFilter. More filters can be defined as requirements demand.

The project, titled MAIL FILTERS, is proposed to be developed with Windows 2000 Server as the operating system, using the JavaMail API of the J2EE technologies. The package provides for creating your own filters and using them in the appropriate places.
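The pipeline described above can be pictured with a minimal Java sketch. The names `Message`, `MailFilter`, `FilterPipeline`, `accept` and `shouldDownload` are hypothetical and chosen only for illustration; they are not taken from the project's sources. Running the global filters before the local ones is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified view of a message; real code would wrap a JavaMail message.
final class Message {
    final String messageId;
    final String sender;
    final int sizeInBytes;

    Message(String messageId, String sender, int sizeInBytes) {
        this.messageId = messageId;
        this.sender = sender;
        this.sizeInBytes = sizeInBytes;
    }
}

// A filter has a single job: decide whether one message should be downloaded.
interface MailFilter {
    boolean accept(Message message);
}

// Chains the global filters and the filters local to one maildrop.
final class FilterPipeline {
    private final List<MailFilter> filters = new ArrayList<>();

    // Assumption: global filters are consulted before the maildrop's local filters.
    FilterPipeline(List<MailFilter> globalFilters, List<MailFilter> localFilters) {
        filters.addAll(globalFilters);
        filters.addAll(localFilters);
    }

    // Returns false as soon as any filter rejects; later filters are not consulted.
    boolean shouldDownload(Message message) {
        for (MailFilter filter : filters) {
            if (!filter.accept(message)) {
                return false;
            }
        }
        return true;
    }
}
```

Because `MailFilter` has a single method, a filter can be supplied as a lambda, for example `m -> m.sizeInBytes <= 1000`, and the same filter object can be placed in the global list or in one maildrop's local list without any change to the filter itself.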


Existing System: The existing system is not computerized; all mail housekeeping is done manually. To make this laborious job simple, it is to be computerized. The administrator maintains the mailboxes of all employees of our organization and is responsible for organizing them. To delete unwanted mails, however, he must check each mail against criteria such as large size or user ID, mark the unwanted ones, and delete them by hand.

Proposed System: The first step of the analysis process is the identification of need. The success of a system depends largely on how accurately the problem is defined, how thoroughly it is investigated, and how well the chosen solution is carried out. This package has been developed to overcome the difficulties encountered with the manual system. Faster and timely deletion of unwanted mails is another motivating factor for the development of this package.

Project Scope and Objectives: Mail Filter is a tool to delete unwanted mails. A lot of effort was put into making it user friendly. Optimum utilization of the tool is possible. All basic filters are provided. It reduces the amount of user interaction. The wastage of time is reduced. It also helps in optimum distribution of funds by the management among user groups for procurement of new equipment. It is flexible: the user (administrator) can easily add any number of his own filters if he wishes.


Saparna Infotech Limited is an organization of professionals with multi-disciplinary experience in software development, services and consultancy, specializing in Web portals, banking, insurance, hospital and business solutions. Saparna, a pioneer in client/server solutions, takes pride in being part of the growing Information Technology industry at Hyderabad, which is fast emerging as yet another Silicon Valley of India.

Strengths: The major strengths of Saparna are its advanced technological skills and a committed team of professionals, which bring a constant stream of satisfied customers into the company's fold in both the export and the domestic markets. We have a dedicated team of professionals with real experience in the functional areas on a variety of state-of-the-art hardware and software platforms. Our business interest is focused on product development for Banking, Hospital Management, Hotel Management, Rice Mill Automation, Web Portals, Image Processing Tools, Factory Automation and Office Automation. Our clear vision can provide a new perspective to any area requiring an IT solution, with total commitment to quality, delivery and cost, backed by proven expertise in both well-established and cutting-edge technologies.

Focus: The company focuses on delivery of cost-effective and timely business solutions developed on cutting-edge technology. M/s BIL has an advanced software development center backed by committed and experienced software professionals. M/s BIL respects its customers, develops with quality in mind, and believes in nurturing relationships, trust and dependability.

Mission: SAPARNA's mission is to build a quality product, respect human values, and upgrade its process and technical capabilities on a continuous basis, providing an employee-centric work atmosphere where every employee feels at home and is sensitive to enhancing the quality norms of software development work in IT and state-of-the-art technology and services; to deliver innovative business solutions that maximize the success of organizations in India and abroad; and to develop and maintain strategic alliances with institutions of expertise in India and abroad.

Vision: SAPARNA's vision is to emerge as one of the fastest growing IT services companies, with special emphasis on quality, integrity, innovation, commitment and human values for varied and specific requirements, and to deliver products that will have far-reaching advantages and applications. SAPARNA envisages undertaking business outsource processing work such as data enumeration, e-catalog, insurance processing, accounting, auditing, and back-end data processing for banks, insurance companies and MNCs.

Human Resource: An aspect that makes Saparna special is its manpower, which comprises experienced, talented and dedicated computer science graduates and postgraduates. These professionals are continuously trained to provide smart business solutions using skills in the latest technologies, including Client/Server, Object Oriented Design and Programming, and Internet technologies. Saparna's human resources comprise personnel qualified in computer sciences: M.Tech, B.Tech, MCA and MSc holders.

Skill Areas:
Operating Systems   : MS-DOS, Sun Solaris, UNIX, Windows NT/2000, Windows 95/98.
RDBMS               : Oracle, SQL Server, and MS Access.
Client/Server Tools : Visual Basic, VC++.
Web technologies    : ASP, Java, WAP, WML, HTML, and XML.
Legacy Skills       : COBOL.
Middle Tier         : COM/DCOM, RMI, CORBA.

Saparna adopts new and emerging technologies to deliver innovative solutions and best-of-breed products to its customers. As part of its R&D activities, Saparna has established a structured R&D group to foster innovation.

Infrastructure: At M/s SIL the infrastructure facilities are excellent and abundantly available for the IT personnel to work with. The company is located very close to the Software Technology Park, where all facilities for data transmission are available. It is housed in its own premises of approximately 10,000 sq. ft. This center has the following facilities to handle international parties.


The Mail Filters are developed with the aim of automatically deleting unwanted mails from the specified maildrops, based on our definitions. The Mail Filter takes all the necessary definitions, in which we define the facts on the basis of which mails are deleted automatically. The administrator can define those facts to delete the unwanted mails.

1. Introduction

1.1 Purpose: The purpose of this document is to describe all external requirements for the Mail Filters. It also describes the interfaces for the system:

a. To implement Mail Filters we need a mail server capable of storing mail in the corresponding mailboxes. In our project we implemented and tested our filters on the James server, as it is openly available.

b. As a user interface we used Microsoft Outlook Express, because it is user-friendly and makes it easy to access, read and maintain our mails.

c. To send mails we need a protocol capable of sending or delivering mails, and to receive mails we need another protocol to get those mails from our mailboxes. In our project we used SMTP for sending mails and POP3 for receiving them. Both are available in a single mail server, namely the James mail server we used.

d. Using XML and basic Java we can write the script or code for filters. XML is used because it provides application interoperability.

1.2 Scope: This document describes the requirements of the system. It is meant for use by the developers, and will also be the basis for validating the final system. Any changes made to the requirements in the future will have to go through a formal change approval process. The developer is responsible for asking for clarifications when necessary, and will not make any alterations without the permission of the client.

This project intends to delete mails that are not required from the mailboxes of the organization's personnel. A lot of effort was put into making it dependable. The workload of deleting mails is avoided, and the time for processing and deleting mails is considerably reduced. This saves the administrator valuable time, which he can allot to other important activities. The system is also extensible: besides the existing filters, the administrator can easily add his own filters if they are needed in future. The filters can be applied on any other mail server to drop unwanted mails from specified maildrops.

The administrator has two options for deleting mails: he can run the filters manually whenever he wants, or he can set the filters to run automatically on a schedule.
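The two ways of running the filters, on demand or on a schedule, might look like the following Java sketch. `FilterRunner`, `runNow`, `startSchedule` and `stopSchedule` are hypothetical names, and the filter job is passed in as a plain `Runnable` because the text does not say how the project triggers its filters. Only the standard `java.util.concurrent` scheduler is used.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical runner that supports both manual and scheduled execution.
final class FilterRunner {
    private final Runnable filterJob;
    private final AtomicInteger runs = new AtomicInteger();
    private ScheduledExecutorService scheduler;

    FilterRunner(Runnable filterJob) {
        this.filterJob = filterJob;
    }

    // Option 1: the administrator runs the filters once, on demand.
    void runNow() {
        filterJob.run();
        runs.incrementAndGet();
    }

    // Option 2: the filters run repeatedly at a fixed period.
    // Note: if the job throws, scheduleAtFixedRate cancels all later runs.
    synchronized void startSchedule(long period, TimeUnit unit) {
        if (scheduler != null) {
            throw new IllegalStateException("already scheduled");
        }
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::runNow, period, period, unit);
    }

    // Stops the periodic runs; manual runs remain possible afterwards.
    synchronized void stopSchedule() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }

    // Number of completed runs, manual and scheduled together.
    int runCount() {
        return runs.get();
    }
}
```

A call such as `runner.startSchedule(1, TimeUnit.HOURS)` would correspond to the scheduled option, while `runner.runNow()` corresponds to the manual one.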

1.3 Definition: A Filter (also called a Mail Filter; the two terms are used interchangeably) is the core of the decision making in Mail Fetch. A filter decides on a per-mail basis whether the message should be downloaded or not.

1.4 Reference: Not Applicable.

1.5 Developer's Responsibilities overview: The points covered in the system requirements specification are:

1. An introduction describing mainly the purpose of the system requirements specification document and outlining the scope of the envisaged application.

2. A description of the interactions of the system with its environment, without going into the internals of the system. It also describes the constraints imposed on the system, and is thus a view from outside the envisaged application. The assumptions made are also listed. It is supported by UML diagrams.

3. A description of the internal behaviour of the system in response to the inputs and while generating the outputs. This part is also supported by detailed UML diagrams, a list of inputs, an explanation of the process, and a list of outputs.

4. The external interface requirements, which include the user, hardware and software interfaces.

5. The performance requirements of the system, together with the design constraints, which comprise software constraints and hardware constraints.

2. General Description

2.1 Product functions overview: In the organization every employee has a mailbox. Anyone can send any number of mails to the owner of a mailbox. Sometimes we suffer from spam mails, or from lengthy mails that may occupy all the memory allotted to a mailbox, and so on. Such mails are controlled by the company administrator, who is responsible for managing all these mailboxes. He can set constraints on the mailboxes, such as dropping mails of a given kind. These constraints are our filters. By embedding the filters in the company's mail server he can restrict the mails, so there is no need to mark mails and delete them manually. In this project the administrator can run the filters on specified mailboxes manually whenever he wants. Alternatively, he can set the filters to run periodically without further intervention.

Whenever the filters run, they apply the logic already written in a Java file to every mail in all mailboxes or in the specified mailboxes. Based on this logic, the filter decides whether to download the mail or not. This functionality automates the task of deleting mails.

2.2 User characteristics: In our project the user is an administrator. He must know how to implement or embed these filters on the mail server.

2.3 General constraints: The system should run on a Pentium, under Windows NT/2000 Professional or Server or later versions of Microsoft operating systems, with a minimum of 16 MB RAM for better performance. The filters can in fact be applied on any kind of mail server.

2.4 Assumptions and Dependencies:

a. It is assumed that James is a real mail server resource and that the required information already exists in the system.

b. It is assumed that the mail client is Microsoft Outlook Express or Netscape Communicator.

c. All the details provided by the user are correct.

d. The user will ask for new filters when he wants to filter mails more finely, or whenever a situation requiring such filtering arises.

3. Functional Requirements

Functional requirements specify which outputs should be produced from the given inputs. They describe the relationship between the input and output of the system. For each functional requirement, a detailed description of all data inputs, their source, and the range of valid inputs must be specified.

All the operations to be performed on the input data to obtain the output should be specified.

3.1 Inputs:

1. Null Filter: It deletes all kinds of mails irrespective of their characteristics. This filter consumes all messages and also marks them for deletion.

2. Header Mail Filter: Matches a header in the message. This requires the name of the header and the value of the header.

3. MessageID Mail Filter: Filters messages if they contain a duplicate Message-ID. This filter stores the list of downloaded Message-IDs in the specified file.

4. Recipient Mail Filter: This filter matches the recipients of the message against those provided in a list.

5. Sender Mail Filter: This filter matches the sender of the message against those provided in a list.

6. Size Mail Filter: This filters messages based on their size.

3.2 Outputs:

1. Log Files: The system writes log files according to the operations the server handled. It also writes an error message if any failure occurred, to indicate where the fault happened. All this information is represented in the form of codes assigned to each operation.
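Three of the filters listed under 3.1 Inputs could be sketched in Java as follows. All class and field names here (`MailSummary`, `SizeFilterSketch`, `SenderFilterSketch`, `MessageIdFilterSketch`) are hypothetical, and the exact matching rules are assumptions: the text does not say whether a size limit is inclusive or how addresses are compared. The Message-ID sketch keeps its list in memory, whereas the text says the project's filter stores it in a file.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical message summary; field names are illustrative only.
final class MailSummary {
    final String messageId;
    final String sender;
    final long sizeInBytes;

    MailSummary(String messageId, String sender, long sizeInBytes) {
        this.messageId = messageId;
        this.sender = sender;
        this.sizeInBytes = sizeInBytes;
    }
}

// Accepts a message only if it is no larger than the configured limit (assumed inclusive).
final class SizeFilterSketch {
    private final long maxBytes;

    SizeFilterSketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    boolean accept(MailSummary mail) {
        return mail.sizeInBytes <= maxBytes;
    }
}

// Rejects a message whose sender appears in the given list (exact string match assumed).
final class SenderFilterSketch {
    private final Set<String> blockedSenders = new HashSet<>();

    SenderFilterSketch(Iterable<String> senders) {
        for (String sender : senders) {
            blockedSenders.add(sender);
        }
    }

    boolean accept(MailSummary mail) {
        return !blockedSenders.contains(mail.sender);
    }
}

// Rejects a message whose Message-ID has been seen before.
final class MessageIdFilterSketch {
    private final Set<String> seenIds = new HashSet<>();

    // Set.add returns false when the id is already present, i.e. a duplicate.
    boolean accept(MailSummary mail) {
        return seenIds.add(mail.messageId);
    }
}
```

Each sketch answers the same question, whether a single message should be accepted, which is what lets filters of different kinds be chained one after another.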

4. External Interface Requirements

4.1 User Interface: Once the filters are embedded in the mail server and working properly, no user interaction is needed if the administrator has set them to run periodically. Otherwise it is the administrator's responsibility to run them when required. Overall, user interaction is very low.

4.2 Software Interfaces: These requirements should specify the interfaces with other software that the system will use or that will use the system. This includes the interface with the operating system and other applications. The message content and format of each interface should be given.

4.3 Hardware Interfaces: The hardware interface is very important to the documentation. If the software is to execute on existing or pre-determined hardware, all the characteristics of the hardware, including memory restrictions, should be specified. In addition, the current use and load characteristics of the hardware should be given.

5. Performance Requirements


All the requirements relating to the performance characteristics of the system must be clearly specified. There are two types of performance requirements: static and dynamic.

Static requirements are those that do not impose constraints on the execution characteristics of the system. These include requirements like the number of terminals to be supported, the number of simultaneous users to be supported, and the number of files, and their sizes, that the system has to process. These are also called the capacity requirements of the system.

Dynamic requirements specify constraints on the execution behaviour of the system. These typically include response time and throughput constraints. Processing speed, resource consumption, throughput and efficiency are the measures of performance. To achieve good performance, a few practices are to be followed, such as reducing code, using fewer controls, and minimizing repeated data. For a real-time system, software that provides the required function but does not conform to the performance requirements is unacceptable. These requirements are used to test the run-time performance of the software within the context of an integrated system.

6. Design constraints

6.1 Software constraints:
Operating System   : Windows 2000 Server/NT
Reports            : Log files
Other Applications : James Server or any Mail server

6.2 Hardware constraints:
Processor    : Pentium III
RAM          : 128 MB
Hard Disk    : 20 GB
Floppy Disk  : 1.44 MB
CD/ROM Drive : 52 Bit
VDU          : VGA
Key Board    : 101 Standard


7. Acceptance Criteria

Before accepting the system, the developer must demonstrate that the system works on the details of the user email-ids entered in the corresponding files. The developer will have to show through test cases that all conditions are satisfied.

The Java Apache Mail Enterprise Server (a.k.a. Apache James) is a 100% pure Java SMTP and POP3 Mail server and NNTP News server designed to be a complete and portable enterprise mail engine solution. James is based on currently available open protocols. The James server also serves as a mail application platform. The James project hosts the Apache Mailet API, and the James server is a Mailet container. This feature makes it easy to design, write, and deploy custom applications for mail processing. This modularity and ease of customization is one of James' strengths, and can allow administrators to produce powerful applications surprisingly easily. James is built on top of version 4.1.3 of the Avalon Application Framework. This framework encourages a set of good development practices such as Component Oriented Programming and Inversion of Control. The standard distribution of James includes version 4.0.1 of the Phoenix Avalon Framework container. This stable and robust container provides a strong foundation for the James server. This documentation is intended to be an introduction to the concepts behind the James implementation, as well as a guide to installing, configuring, (and for developers) building the James server.


The James Server


James is an open source project intended to produce a robust, flexible, and powerful enterprise class server that provides email and email-related services. It is also designed to be highly customizable, allowing administrators to configure James to process email in a nearly endless variety of fashions. The James server is built on top of the Avalon Framework. The standard James distribution deploys inside the Phoenix Avalon Framework container. In addition to providing a robust server architecture for James, the use of Phoenix allows James administrators to deploy their own applications inside the container. These applications can then be accessed during mail processing. The James server is implemented as a complete collection of servers and related components that, taken together, provide an email solution. These components are described below.

POP3 Service
The POP3 protocol allows users to retrieve email messages. It is the method most commonly used by email clients to download and manage email messages. The James version of the POP3 service is a simple and straightforward implementation that provides full compliance with the specification and maximum compatibility with common POP3 clients. In addition, James can be configured to require SSL/TLS connections for POP3 clients connecting to the server.

SMTP Service
SMTP (Simple Mail Transfer Protocol) is the standard method of sending and delivering email on the internet. James provides a full-function implementation of the SMTP specification, with support for some optional features such as message size limits, SMTP auth, and encrypted client/server communication.

NNTP Service
NNTP is used by clients to store messages on and retrieve messages from news servers. James provides the server side of this interaction by implementing the NNTP specification as well as an appropriate repository for storing news messages. The server implementation is simple and straightforward, but supports some additional features such as NNTP authentication and encrypted client/server communication.

FetchPOP
FetchPOP, unlike the other James components, is not an implementation of an RFC. Instead, it's a component that allows the administrator to configure James to retrieve email from a number of POP3 servers and deliver it to the local spool. This is useful for consolidating mail delivered to a number of accounts on different machines to a single account.

The SpoolManager, Matchers, and Mailets


James separates the services that deliver mail to James (i.e. SMTP, FetchPOP) from the engine that processes mail after it is received by James. The SpoolManager component is James' mail processing engine. James' SpoolManager component is a Mailet container. It is these mailets and matchers that actually carry out mail processing.

Repositories
James uses a number of different repositories to both store message data (email, news messages) and user information. User repositories store user information, including user names, authentication information, and aliases. Mail repositories store messages that have been delivered locally. Spool repositories store messages that are still being processed. Finally, news repositories are used to store news messages. Aside from what type of data they store, repositories are distinguished by where they store data. There are three types of storage - File, Database, and DBFile.

RemoteManager
James provides a simple telnet-based interface for control. Through this interface you can add and delete users, configure per-user aliases and forward addresses, and shut down the server.


Mailet API
The Mailet API is a simple API used to build mail processing applications. James is a Mailet container, allowing administrators to deploy Mailets (both custom and pre-made) to carry out a variety of complex mail processing tasks. In the default configuration James uses Mailets to carry out a number of tasks that are carried out deep in the source code of other mail servers (e.g. list processing, remote and local delivery).

As it stands today, the Mailet API defines interfaces for both Matchers and Mailets. Matchers, as their name would suggest, match mail messages against certain conditions. They return some subset (possibly the entire set) of the original recipients of the message if there is a match. An inherent part of the Matcher contract is that a Matcher should not induce any changes in a message under evaluation.

Mailets are responsible for actually processing the message. They may alter the message in any fashion, or pass the message to an external API or component. This can include delivering a message to its destination repository or SMTP server.

The Mailet API is currently in its second revision. Although the Mailet API is expected to undergo substantial changes in the near future, it is our aim that existing Mailets that abided purely by the prior Mailet API interfaces will continue to run with the revised specification. James bundles a number of Matchers and Mailets in its distribution.
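The division of labour between Matchers and Mailets can be illustrated with a self-contained Java sketch. The types below (`SimpleMail`, `SimpleMatcher`, `SimpleMailet`, `DomainMatcher`, `DiscardMailet`, `MiniContainer`) are simplified stand-ins invented for this illustration; they are not the interfaces of the real Mailet API, and the `state` values are hypothetical. The stand-in container also runs the mailet on the whole mail whenever any recipient matches, which is a simplification.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Simplified stand-in for a mail object; NOT the real Mailet API type.
final class SimpleMail {
    final String sender;
    final List<String> recipients;
    String state = "root"; // hypothetical processing state

    SimpleMail(String sender, List<String> recipients) {
        this.sender = sender;
        this.recipients = new ArrayList<>(recipients);
    }
}

// A matcher only inspects the mail and returns the recipients that matched.
interface SimpleMatcher {
    Collection<String> match(SimpleMail mail);
}

// A mailet does the actual processing and may change the mail.
interface SimpleMailet {
    void service(SimpleMail mail);
}

// Example matcher: selects the recipients that belong to one domain.
final class DomainMatcher implements SimpleMatcher {
    private final String suffix;

    DomainMatcher(String domain) {
        this.suffix = "@" + domain;
    }

    @Override
    public Collection<String> match(SimpleMail mail) {
        List<String> matched = new ArrayList<>();
        for (String recipient : mail.recipients) {
            if (recipient.endsWith(suffix)) {
                matched.add(recipient);
            }
        }
        return matched; // the mail itself is left untouched
    }
}

// Example mailet: marks the mail as discarded.
final class DiscardMailet implements SimpleMailet {
    @Override
    public void service(SimpleMail mail) {
        mail.state = "discarded";
    }
}

// Minimal stand-in for the container: run the mailet only when the matcher matches.
final class MiniContainer {
    static void process(SimpleMail mail, SimpleMatcher matcher, SimpleMailet mailet) {
        if (!matcher.match(mail).isEmpty()) {
            mailet.service(mail);
        }
    }
}
```

Keeping the matcher free of side effects, as the contract in the text requires, means the same matcher can be paired with any mailet.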


1. INTRODUCTION

The objective of Simple Mail Transfer Protocol (SMTP) is to transfer mail reliably and efficiently. SMTP is independent of the particular transmission subsystem and requires only a reliable ordered data stream channel. An important feature of SMTP is its capability to relay mail across transport service environments. A transport service provides an interprocess communication environment (IPCE). An IPCE may cover one network, several networks, or a subset of a network. It is important to realize that transport systems (or IPCEs) are not one-to-one with networks. A process can communicate directly with another process through any mutually known IPCE. Mail is an application or use of interprocess communication. Mail can be communicated between processes in different IPCEs by relaying through a process connected to two (or more) IPCEs. More specifically, mail can be relayed between hosts on different transport systems by a host on both transport systems.


2. THE SMTP MODEL

The SMTP design is based on the following model of communication: as the result of a user mail request, the sender-SMTP establishes a two-way transmission channel to a receiver-SMTP. The receiver-SMTP may be either the ultimate destination or an intermediate. SMTP commands are generated by the sender-SMTP and sent to the receiver-SMTP. SMTP replies are sent from the receiver-SMTP to the sender-SMTP in response to the commands.

Once the transmission channel is established, the SMTP-sender sends a MAIL command indicating the sender of the mail. If the SMTP-receiver can accept mail it responds with an OK reply. The SMTP-sender then sends a RCPT command identifying a recipient of the mail. If the SMTP-receiver can accept mail for that recipient it responds with an OK reply; if not, it responds with a reply rejecting that recipient (but not the whole mail transaction). The SMTP-sender and SMTP-receiver may negotiate several recipients. When the recipients have been negotiated the SMTP-sender sends the mail data, terminating with a special sequence. If the SMTP-receiver successfully processes the mail data it responds with an OK reply. The dialog is purposely lock-step, one-at-a-time.

            +----------+                +----------+
+------+    |          |                |          |
| User |<-->|          |      SMTP      |          |
+------+    |  Sender- |Commands/Replies| Receiver-|
+------+    |   SMTP   |<-------------->|    SMTP  |    +------+
| File |<-->|          |    and Mail    |          |<-->| File |
|System|    |          |                |          |    |System|
+------+    +----------+                +----------+    +------+

             Sender-SMTP                Receiver-SMTP

                       Model for SMTP Use

                            Figure 1

The SMTP provides mechanisms for the transmission of mail: directly from the sending user's host to the receiving user's host when the two hosts are connected to the same transport service, or via one or more relay SMTP-servers when the source and destination hosts are not connected to the same transport service. To be able to provide the relay capability the SMTP-server must be supplied with the name of the ultimate destination host as well as the destination mailbox name.

The argument to the MAIL command is a reverse-path, which specifies who the mail is from. The argument to the RCPT command is a forward-path, which specifies who the mail is to. The forward-path is a source route, while the reverse-path is a return route (which may be used to return a message to the sender when an error occurs with a relayed message). When the same message is sent to multiple recipients the SMTP encourages the transmission of only one copy of the data for all the recipients at the same destination host.

The mail commands and replies have a rigid syntax. Replies also have a numeric code. Commands and replies are not case sensitive. That is, a command or reply word may be upper case, lower case, or any mixture of upper and lower case. Note that this is not true of mailbox user names. For some hosts the user name is case sensitive, and SMTP implementations must take care to preserve the case of user names as they appear in mailbox arguments. Host names are not case sensitive.

Commands and replies are composed of characters from the ASCII character set [1]. When the transport service provides an 8-bit byte (octet) transmission channel, each 7-bit character is transmitted right justified in an octet with the high order bit cleared to zero.

When specifying the general form of a command or reply, an argument or special symbol will be denoted by a meta-linguistic variable (or constant), for example, "<string>" or "<reverse-path>". Here the angle brackets indicate these are meta-linguistic variables. However, some arguments use the angle brackets literally. For example, an actual reverse-path is enclosed in angle brackets, i.e., "<John.Smith@USC-ISI.ARPA>" is an instance of <reverse-path> (the angle brackets are actually transmitted in the command or reply).

3. THE SMTP PROCEDURES


This section presents the procedures used in SMTP in several parts. First comes the basic mail procedure defined as a mail transaction. Following this are descriptions of forwarding mail, verifying mailbox names and expanding mailing lists, sending to terminals instead of or in combination with mailboxes, and the opening and closing exchanges. At the end of this section are comments on relaying, a note on mail domains, and a discussion of changing roles.

3.1 MAIL

There are three steps to SMTP mail transactions. The transaction is started with a MAIL command which gives the sender identification. A series of one or more RCPT commands follows giving the receiver information. Then a DATA command gives the mail data. And finally, the end of mail data indicator confirms the transaction.

The first step in the procedure is the MAIL command. The <reverse-path> contains the source mailbox.

   MAIL <SP> FROM:<reverse-path> <CRLF>

This command tells the SMTP-receiver that a new mail transaction is starting and to reset all its state tables and buffers, including any recipients or mail data. It gives the reverse-path which can be used to report errors. If accepted, the receiver-SMTP returns a 250 OK reply.

The <reverse-path> can contain more than just a mailbox. The <reverse-path> is a reverse source routing list of hosts and source mailbox. The first host in the <reverse-path> should be the host sending this command.

The second step in the procedure is the RCPT command.

   RCPT <SP> TO:<forward-path> <CRLF>

This command gives a forward-path identifying one recipient. If accepted, the receiver-SMTP returns a 250 OK reply, and stores the forward-path. If the recipient is unknown the receiver-SMTP returns a 550 Failure reply. This second step of the procedure can be repeated any number of times.

The <forward-path> can contain more than just a mailbox. The <forward-path> is a source routing list of hosts and the destination mailbox. The first host in the <forward-path> should be the host receiving this command.
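The MAIL and RCPT command forms above, together with the numeric reply codes, can be sketched in Java as plain string handling. The class `SmtpLines` and its method names are hypothetical; the sketch assumes a simple mailbox without a source route, and reads the code from the first three characters of a reply line.

```java
// Hypothetical helper that formats SMTP command lines and reads reply codes.
final class SmtpLines {
    // <CRLF> in the command syntax is a carriage return followed by a line feed.
    static final String CRLF = "\r\n";

    // MAIL <SP> FROM:<reverse-path> <CRLF>; the angle brackets are sent literally.
    static String mailFrom(String reversePath) {
        return "MAIL FROM:<" + reversePath + ">" + CRLF;
    }

    // RCPT <SP> TO:<forward-path> <CRLF>
    static String rcptTo(String forwardPath) {
        return "RCPT TO:<" + forwardPath + ">" + CRLF;
    }

    // Extracts the three-digit numeric code at the start of a reply line.
    static int replyCode(String replyLine) {
        if (replyLine.length() < 3) {
            throw new IllegalArgumentException("reply too short: " + replyLine);
        }
        return Integer.parseInt(replyLine.substring(0, 3));
    }

    // 250 is the OK reply to MAIL and RCPT in the text above.
    static boolean isOk(String replyLine) {
        return replyCode(replyLine) == 250;
    }
}
```

A sender following the lock-step dialog would write the result of `mailFrom(...)`, wait for a reply, test it with `isOk(...)`, and only then send the first `rcptTo(...)` line.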


The third step in the procedure is the DATA command.

   DATA <CRLF>

If accepted, the receiver-SMTP returns a 354 Intermediate reply and considers all succeeding lines to be the message text. When the end of text is received and stored, the receiver-SMTP sends a 250 OK reply.

Since the mail data is sent on the transmission channel, the end of the mail data must be indicated so that the command and reply dialog can be resumed. SMTP indicates the end of the mail data by sending a line containing only a period. A transparency procedure is used to prevent this from interfering with the user's text.

Please note that the mail data includes the memo header items such as Date, Subject, To, Cc, From [2].

The end of mail data indicator also confirms the mail transaction and tells the receiver-SMTP to now process the stored recipients and mail data. If accepted, the receiver-SMTP returns a 250 OK reply. The DATA command should fail only if the mail transaction was incomplete (for example, no recipients), or if resources are not available.

The above procedure is an example of a mail transaction. These commands must be used only in the order discussed above. Example 1 (below) illustrates the use of these commands in a mail transaction.

Example of the SMTP Procedure

This SMTP example shows mail sent by Smith at host Alpha.ARPA to Jones, Green, and Brown at host Beta.ARPA. Here we assume that host Alpha contacts host Beta directly.

   S: MAIL FROM:<Smith@Alpha.ARPA>
   R: 250 OK
   S: RCPT TO:<Jones@Beta.ARPA>
   R: 250 OK
   S: RCPT TO:<Green@Beta.ARPA>
   R: 550 No such user here
   S: RCPT TO:<Brown@Beta.ARPA>
   R: 250 OK
   S: DATA
   R: 354 Start mail input; end with <CRLF>.<CRLF>
   S: Blah blah blah...
   S: ...etc. etc. etc.
   S: <CRLF>.<CRLF>
   R: 250 OK

The mail has now been accepted for Jones and Brown. Green did not have a mailbox at host Beta.

Example 1

3.2. FORWARDING

There are some cases where the destination information in the <forward-path> is incorrect, but the receiver-SMTP knows the correct destination. In such cases, one of the following replies should be used to allow the sender to contact the correct destination.

   251 User not local; will forward to <forward-path>

This reply indicates that the receiver-SMTP knows the user's mailbox is on another host and indicates the correct forward-path to use in the future. Note that either the host or user or both may be different. The receiver takes responsibility for delivering the message.

   551 User not local; please try <forward-path>

This reply indicates that the receiver-SMTP knows the user's mailbox is on another host and indicates the correct forward-path to use. Note that either the host or user or both may be different. The receiver refuses to accept mail for this user, and the sender must either redirect the mail according to the information provided or return an error response to the originating user.

3.3 VERIFYING AND EXPANDING

SMTP provides, as additional features, commands to verify a user name or expand a mailing list. This is done with the VRFY and EXPN commands, which have character string arguments. For the VRFY command, the string is a user name, and the response may include the full name of the user and must include the mailbox of the user. For the EXPN command, the string identifies a mailing list, and the multiline response may include the full names of the users and must give the mailboxes on the mailing list.

"User name" is a fuzzy term and is used purposely. If a host implements the VRFY or EXPN commands, then at least local mailboxes must be recognized as "user names". If a host chooses to recognize other strings as "user names", that is allowed.

In some hosts the distinction between a mailing list and an alias for a single mailbox is a bit fuzzy, since a common data structure may hold both types of entries, and it is possible to have mailing lists of one mailbox. If a request is made to verify a mailing list, a positive response can be given if, on receipt of a message so addressed, it will be delivered to everyone on the list; otherwise an error should be reported (e.g., "550 That is a mailing list, not a user"). If a request is made to expand a user name, returning a list containing one name can form a positive response, or an error can be reported (e.g., "550 That is a user name, not a mailing list").

In the case of a multiline reply (normal for EXPN), exactly one mailbox is to be specified on each line of the reply. In the case of an ambiguous request, for example "VRFY Smith" where there are two Smiths, the response must be "553 User ambiguous".

The case of verifying a user name is straightforward, as shown in Example 3.

The character string arguments of the VRFY and EXPN commands cannot be further restricted due to the variety of implementations of the user name and mailbox list concepts. On some systems it may be appropriate for the argument of the EXPN command to be a file name for a file containing a mailing list, but again there is a variety of file naming conventions in the Internet.

The VRFY and EXPN commands are not included in the minimum implementation (Section 4.5.1), and are not required to work across relays when they are implemented.

3.4. SENDING AND MAILING

The main purpose of SMTP is to deliver messages to users' mailboxes. A very similar service provided by some hosts is to deliver messages to users' terminals (provided the user is active on the host). The delivery to the user's mailbox is called "mailing"; the delivery to the user's terminal is called "sending".

Because in many hosts the implementation of sending is nearly identical to the implementation of mailing, these two functions are combined in SMTP. However, the sending commands are not included in the required minimum implementation (Section 4.5.1). Users should have the ability to control the writing of messages on their terminals. Most hosts permit the users to accept or refuse such messages.

The following three commands are defined to support the sending options. These are used in the mail transaction instead of the MAIL command and inform the receiver-SMTP of the special semantics of this transaction:

   SEND <SP> FROM:<reverse-path> <CRLF>

The SEND command requires that the mail data be delivered to the user's terminal. If the user is not active (or not accepting terminal messages) on the host, a 450 reply may be returned to a RCPT command. The mail transaction is successful if the message is delivered to the terminal.

   SOML <SP> FROM:<reverse-path> <CRLF>

The Send Or Mail command requires that the mail data be delivered to the user's terminal if the user is active (and accepting terminal messages) on the host. If the user is not active (or not accepting terminal messages), then the mail data is entered into the user's mailbox. The mail transaction is successful if the message is delivered either to the terminal or the mailbox.

   SAML <SP> FROM:<reverse-path> <CRLF>

The Send And Mail command requires that the mail data be delivered to the user's terminal if the user is active (and accepting terminal messages) on the host. In any case the mail data is entered into the user's mailbox. The mail transaction is successful if the message is delivered to the mailbox.

The same reply codes that are used for the MAIL commands are used for these commands.

3.5. OPENING AND CLOSING

At the time the transmission channel is opened there is an exchange to ensure that the hosts are communicating with the hosts they think they are.

3.6 RELAYING

The forward-path may be a source route of the form "@ONE,@TWO:JOE@THREE", where ONE, TWO, and THREE are hosts. This form is used to emphasize the distinction between an address and a route. The mailbox is an absolute address, and the route is information about how to get there. The two concepts should not be confused.

Conceptually, the elements of the forward-path are moved to the reverse-path as the message is relayed from one server-SMTP to another. The reverse-path is a reverse source route (i.e., a source route from the current location of the message to the originator of the message). When a server-SMTP deletes its identifier from the forward-path and inserts it into the reverse-path, it must use the name it is known by in the environment it is sending into, not the environment the mail came from, in case the server-SMTP is known by different names in different environments.

If, when the message arrives at an SMTP, the first element of the forward-path is not the identifier of that SMTP, the element is not deleted from the forward-path and is used to determine the next SMTP to send the message to. In any case, the SMTP adds its own identifier to the reverse-path.

Using source routing, the receiver-SMTP receives mail to be relayed to another server-SMTP. The receiver-SMTP may accept or reject the task of relaying the mail in the same way it accepts or rejects mail for a local user. The receiver-SMTP transforms the command arguments by moving its own identifier from the forward-path to the beginning of the reverse-path. The receiver-SMTP then becomes a sender-SMTP, establishes a transmission channel to the next SMTP in the forward-path, and sends it the mail.
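The path transformation performed by a relay can be sketched as follows. This is a minimal illustration, assuming the host lists of the forward-path and reverse-path are held as Python lists (the mailboxes are kept separately); relay_step and its parameter names are hypothetical.

```python
def relay_step(forward_hosts, reverse_hosts, name_here, name_outbound):
    """Transform the route lists as one relay host would.

    forward_hosts -- hosts still to be visited, e.g. ["ONE", "TWO"]
    reverse_hosts -- hosts already visited, most recent relay first
    name_here     -- the name this host is known by where the mail came from
    name_outbound -- the name this host is known by where the mail is going

    Returns (new_forward_hosts, new_reverse_hosts, next_hop). next_hop is
    None when the route is exhausted and only the destination mailbox
    remains.
    """
    forward = list(forward_hosts)
    # Delete our own identifier only if it is the first element.
    if forward and forward[0] == name_here:
        forward.pop(0)
    # In any case, add our identifier to the beginning of the reverse-path,
    # using the name known in the environment we are sending into.
    reverse = [name_outbound] + list(reverse_hosts)
    next_hop = forward[0] if forward else None
    return forward, reverse, next_hop


if __name__ == "__main__":
    # Route "@ONE,@TWO:JOE@THREE" as seen by host ONE.
    print(relay_step(["ONE", "TWO"], [], "ONE", "ONE"))
```

Keeping name_here and name_outbound as separate arguments reflects the rule that a host known by different names in different environments must insert the name valid on the outgoing side.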

The first host in the reverse-path should be the host sending the SMTP commands, and the first host in the forward-path should be the host receiving the SMTP commands.

Notice that the forward-path and reverse-path appear in the SMTP commands and replies, but not necessarily in the message. That is, there is no need for these paths, and especially this syntax, to appear in the "To:", "From:", "CC:", etc. fields of the message header.

If a server-SMTP has accepted the task of relaying the mail and later finds that the forward-path is incorrect or that the mail cannot be delivered for whatever reason, then it must construct an "undeliverable mail" notification message and send it to the originator of the undeliverable mail (as indicated by the reverse-path).

This notification message must be from the server-SMTP at this host. Of course, server-SMTPs should not send notification messages about problems with notification messages. One way to prevent loops in error reporting is to specify a null reverse-path in the MAIL command of a notification message. When such a message is relayed, it is permissible to leave the reverse-path null. A MAIL command with a null reverse-path appears as follows:

   MAIL FROM:<>

An undeliverable mail notification message is shown in Example 7.

This notification is in response to a message originated by JOE at HOSTW and sent via HOSTX to HOSTY with instructions to relay it on to HOSTZ. What we see in the example is the transaction between HOSTY and HOSTX, which is the first step in the return of the notification message.

3.7. DOMAINS

Domains are a recently introduced concept in the ARPA Internet mail system. The use of domains changes the address space from a flat global space of simple character string host names to a hierarchically structured rooted tree of global addresses. The host name is replaced by a domain and host designator, which is a sequence of domain element strings separated by periods, with the understanding that the domain elements are ordered from the most specific to the most general.

For example, "USC-ISIF.ARPA", "Fred.Cambridge.UK", and "PC7.LCS.MIT.ARPA" might be host-and-domain identifiers.

Whenever domain names are used in SMTP, only the official names are used; the use of nicknames or aliases is not allowed.

3.8 CHANGING ROLES

The TURN command may be used to reverse the roles of the two programs communicating over the transmission channel.

If program-A is currently the sender-SMTP and it sends the TURN command and receives an OK reply (250), then program-A becomes the receiver-SMTP.

If program-B is currently the receiver-SMTP and it receives the TURN command and sends an OK reply (250), then program-B becomes the sender-SMTP.

To refuse to change roles, the receiver sends the 502 reply.

Please note that this command is optional. It would not normally be used in situations where the transmission channel is TCP. However, when the cost of establishing the transmission channel is high, this command may be quite useful. For example, this command may be useful in supporting mail exchange using the public switched telephone system as a transmission channel, especially if some hosts poll other hosts for mail exchanges.

4 THE SMTP SPECIFICATIONS

4.1 SMTP COMMANDS

4.1.1 COMMAND SEMANTICS

The SMTP commands define the mail transfer or the mail system function requested by the user. SMTP commands are character strings terminated by <CRLF>. The command codes themselves are alphabetic characters terminated by <SP> if parameters follow and <CRLF> otherwise. The syntax of mailboxes must conform to receiver site conventions. The SMTP commands are discussed below. The SMTP replies are discussed in Section 4.2.

A mail transaction involves several data objects, which are communicated as arguments to different commands. The reverse-path is the argument of the MAIL command, the forward-path is the argument of the RCPT command, and the mail data is the argument of the DATA command. These arguments or data objects must be transmitted and held pending the confirmation communicated by the end of mail data indication, which finalizes the transaction. The model for this is that distinct buffers are provided to hold the types of data objects; that is, there is a reverse-path buffer, a forward-path buffer, and a mail data buffer. Specific commands cause information to be appended to a specific buffer, or cause one or more buffers to be cleared.

HELLO (HELO)

This command is used to identify the sender-SMTP to the receiver-SMTP. The argument field contains the host name of the sender-SMTP.

The receiver-SMTP identifies itself to the sender-SMTP in the connection greeting reply and in the response to this command.

This command and an OK reply to it confirm that both the sender-SMTP and the receiver-SMTP are in the initial state; that is, there is no transaction in progress and all state tables and buffers are cleared.

MAIL (MAIL)

This command is used to initiate a mail transaction in which the mail data is delivered to one or more mailboxes. The argument field contains a reverse-path.

The reverse-path consists of an optional list of hosts and the sender mailbox. When the list of hosts is present, it is a "reverse" source route and indicates that the mail was relayed through each host on the list (the first host in the list was the most recent relay). This list is used as a source route to return non-delivery notices to the sender. As each relay host adds itself to the beginning of the list, it must use its name as known in the IPCE (interprocess communication environment) to which it is relaying the mail rather than the IPCE from which the mail came (if they are different). In some types of error reporting messages (for example, undeliverable mail notifications) the reverse-path may be null (see Example 7).

This command clears the reverse-path buffer, the forward-path buffer, and the mail data buffer, and inserts the reverse-path information from this command into the reverse-path buffer.

RECIPIENT (RCPT)

This command is used to identify an individual recipient of the mail data; multiple recipients are specified by multiple use of this command.

The forward-path consists of an optional list of hosts and a required destination mailbox. When the list of hosts is present, it is a source route and indicates that the mail must be relayed to the next host on the list. If the receiver-SMTP does not implement the relay function, it may use the same reply it would for an unknown local user (550).

When mail is relayed, the relay host must remove itself from the beginning of the forward-path and put itself at the beginning of the reverse-path. When mail reaches its ultimate destination (the forward-path contains only a destination mailbox), the receiver-SMTP inserts it into the destination mailbox in accordance with its host mail conventions.

Return path line -- The return path line preserves the information in the <reverse-path> from the MAIL command. Here, final delivery means the message leaves the SMTP world. Normally, this would mean it has been delivered to the destination user, but in some cases it may be further processed and transmitted by another mail system.

It is possible for the mailbox in the return path to be different from the actual sender's mailbox, for example, if error responses are to be delivered to a special error handling mailbox rather than the message sender's.

The preceding two paragraphs imply that the final mail data will begin with a return path line, followed by one or more time stamp lines. These lines will be followed by the mail data header and body [2]. See Example 8.

Special mention is needed of the response and further action required when the processing following the end of mail data indication is partially successful. This could arise if, after accepting several recipients and the mail data, the receiver-SMTP finds that the mail data can be successfully delivered to some of the recipients but not to others (for example, due to mailbox space allocation problems). In such a situation, the response to the DATA command must be an OK reply. But the receiver-SMTP must compose and send an "undeliverable mail" notification message to the originator of the message. Either a single notification which lists all of the recipients that failed to get the message, or separate notification messages for each failed recipient, must be sent (see Example 7). All undeliverable mail notification messages are sent using the MAIL command (even if they result from processing a SEND, SOML, or SAML command).

SEND (SEND)

This command is used to initiate a mail transaction in which the mail data is delivered to one or more terminals. The argument field contains a reverse-path. This command is successful if the message is delivered to a terminal.

The reverse-path consists of an optional list of hosts and the sender mailbox. When the list of hosts is present, it is a "reverse" source route and indicates that the mail was relayed through each host on the list (the first host in the list was the most recent relay). This list is used as a source route to return non-delivery notices to the sender. As each relay host adds itself to the beginning of the list, it must use its name as known in the IPCE to which it is relaying the mail rather than the IPCE from which the mail came (if they are different).

This command clears the reverse-path buffer, the forward-path buffer, and the mail data buffer, and inserts the reverse-path information from this command into the reverse-path buffer.

SEND OR MAIL (SOML)

This command is used to initiate a mail transaction in which the mail data is delivered to one or more terminals or mailboxes. For each recipient the mail data is delivered to the recipient's terminal if the recipient is active on the host (and accepting terminal messages), otherwise to the recipient's mailbox. The argument field contains a reverse-path. This command is successful if the message is delivered to a terminal or the mailbox.

The reverse-path consists of an optional list of hosts and the sender mailbox. When the list of hosts is present, it is a "reverse" source route and indicates that the mail was relayed through each host on the list (the first host in the list was the most recent relay). This list is used as a source route to return non-delivery notices to the sender. As each relay host adds itself to the beginning of the list, it must use its name as known in the IPCE to which it is relaying the mail rather than the IPCE from which the mail came (if they are different).

This command clears the reverse-path buffer, the forward-path buffer, and the mail data buffer, and inserts the reverse-path information from this command into the reverse-path buffer.

SEND AND MAIL (SAML)

This command is used to initiate a mail transaction in which the mail data is delivered to one or more terminals and mailboxes. For each recipient the mail data is delivered to the recipient's terminal if the recipient is active on the host (and accepting terminal messages), and for all recipients to the recipient's mailbox. The argument field contains a reverse-path. This command is successful if the message is delivered to the mailbox.

The reverse-path consists of an optional list of hosts and the sender mailbox. When the list of hosts is present, it is a "reverse" source route and indicates that the mail was relayed through each host on the list (the first host in the list was the most recent relay). This list is used as a source route to return non-delivery notices to the sender. As each relay host adds itself to the beginning of the list, it must use its name as known in the IPCE to which it is relaying the mail rather than the IPCE from which the mail came (if they are different).

This command clears the reverse-path buffer, the forward-path buffer, and the mail data buffer, and inserts the reverse-path information from this command into the reverse-path buffer.
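The difference between the four transaction-beginning commands comes down to where the mail data ends up for one recipient. The following sketch tabulates that rule; delivery_targets is a hypothetical helper, and "active" here stands for "active on the host and accepting terminal messages".

```python
def delivery_targets(command: str, active: bool) -> set[str]:
    """Return where the mail data goes for one recipient.

    command -- "MAIL", "SEND", "SOML" or "SAML" (any letter case)
    active  -- True if the recipient is active and accepting
               terminal messages
    """
    code = command.upper()
    if code == "MAIL":
        return {"mailbox"}
    if code == "SEND":
        # Terminal only; nothing is delivered for an inactive user.
        return {"terminal"} if active else set()
    if code == "SOML":
        # Terminal if possible, otherwise the mailbox.
        return {"terminal"} if active else {"mailbox"}
    if code == "SAML":
        # Always the mailbox, and the terminal too if possible.
        return {"terminal", "mailbox"} if active else {"mailbox"}
    raise ValueError(f"not a transaction-beginning command: {command!r}")


if __name__ == "__main__":
    for cmd in ("MAIL", "SEND", "SOML", "SAML"):
        print(cmd, delivery_targets(cmd, True), delivery_targets(cmd, False))
```

An empty result for SEND corresponds to the case in which the transaction cannot succeed for that recipient.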

RESET (RSET)

This command specifies that the current mail transaction is to be aborted. Any stored sender, recipients, and mail data must be discarded, and all buffers and state tables cleared. The receiver must send an OK reply.

VERIFY (VRFY)

This command asks the receiver to confirm that the argument identifies a user. If it is a user name, the full name of the user (if known) and the fully specified mailbox are returned.

This command has no effect on any of the reverse-path buffer, the forward-path buffer, or the mail data buffer.
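The three-buffer model used in the command descriptions above can be sketched as a small class. This is an illustrative model only; the class and method names are hypothetical, and reply handling is left out.

```python
class TransactionBuffers:
    """Reverse-path, forward-path and mail data buffers of one receiver."""

    def __init__(self):
        self.clear()

    def clear(self):
        """Empty all three buffers (the effect of RSET)."""
        self.reverse_path = None   # None means no transaction in progress
        self.forward_paths = []
        self.mail_data = []

    def begin(self, reverse_path: str):
        """MAIL, SEND, SOML or SAML: clear everything, store the sender.

        An empty string represents the null reverse-path "<>".
        """
        self.clear()
        self.reverse_path = reverse_path

    def add_recipient(self, forward_path: str):
        """RCPT: append one forward-path to the forward-path buffer."""
        self.forward_paths.append(forward_path)

    def add_data_line(self, line: str):
        """Text after DATA: append one line to the mail data buffer."""
        self.mail_data.append(line)

    # VRFY, EXPN, HELP and NOOP deliberately have no method here:
    # they have no effect on any of the three buffers.


if __name__ == "__main__":
    buf = TransactionBuffers()
    buf.begin("Smith@Alpha.ARPA")
    buf.add_recipient("Jones@Beta.ARPA")
    buf.clear()  # RSET
    print(buf.reverse_path, buf.forward_paths, buf.mail_data)
```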

EXPAND (EXPN)

This command asks the receiver to confirm that the argument identifies a mailing list, and if so, to return the membership of that list. The full names of the users (if known) and the fully specified mailboxes are returned in a multiline reply.

This command has no effect on any of the reverse-path buffer, the forward-path buffer, or the mail data buffer.

HELP (HELP)

This command causes the receiver to send helpful information to the sender of the HELP command. The command may take an argument (e.g., any command name) and return more specific information as a response.

This command has no effect on any of the reverse-path buffer, the forward-path buffer, or the mail data buffer.

NOOP (NOOP)

This command does not affect any parameters or previously entered commands. It specifies no action other than that the receiver send an OK reply.

This command has no effect on any of the reverse-path buffer, the forward-path buffer, or the mail data buffer.

QUIT (QUIT)

This command specifies that the receiver must send an OK reply and then close the transmission channel.

The receiver should not close the transmission channel until it receives and replies to a QUIT command (even if there was an error). The sender should not close the transmission channel until it sends a QUIT command and receives the reply (even if there was an error response to a previous command). If the connection is closed prematurely, the receiver should act as if a RSET command had been received (canceling any pending transaction, but not undoing any previously completed transaction), and the sender should act as if the command or transaction in progress had received a temporary error (4xx).

TURN (TURN)

This command specifies that the receiver must either (1) send an OK reply and then take on the role of the sender-SMTP, or (2) send a refusal reply and retain the role of the receiver-SMTP.

If program-A is currently the sender-SMTP and it sends the TURN command and receives an OK reply (250), then program-A becomes the receiver-SMTP. Program-A is then in the initial state as if the transmission channel just opened, and it then sends the 220 service ready greeting.

If program-B is currently the receiver-SMTP and it receives the TURN command and sends an OK reply (250), then program-B becomes the sender-SMTP. Program-B is then in the initial state as if the transmission channel just opened, and it then expects to receive the 220 service ready greeting.

To refuse to change roles, the receiver sends the 502 reply.

There are restrictions on the order in which these commands may be used.

The first command in a session must be the HELO command. The HELO command may be used later in a session as well. If the HELO command argument is not acceptable, a 501 failure reply must be returned and the receiver-SMTP must stay in the same state.

The NOOP, HELP, EXPN, and VRFY commands can be used at any time during a session.

The MAIL, SEND, SOML, or SAML commands begin a mail transaction. Once started, a mail transaction consists of one of the transaction beginning commands, one or more RCPT commands, and a DATA command, in that order. A mail transaction may be aborted by the RSET command. There may be zero or more transactions in a session.

If the transaction beginning command argument is not acceptable, a 501 failure reply must be returned and the receiver-SMTP must stay in the same state. If the commands in a transaction are out of order, a 503 failure reply must be returned and the receiver-SMTP must stay in the same state.

The last command in a session must be the QUIT command. The QUIT command cannot be used at any other time in a session.

4.1.2. COMMAND SYNTAX

The commands consist of a command code followed by an argument field. Command codes are four alphabetic characters. Upper and lower case alphabetic characters are to be treated identically. Thus, any of the following may represent the mail command:

   MAIL    Mail    mail    MaIl    mAIl

This also applies to any symbols representing parameter values, such as "TO" or "to" for the forward-path. Command codes and the argument fields are separated by one or more spaces. However, within the reverse-path and forward-path arguments case is important. In particular, in some hosts the user "smith" is different from the user "Smith".

The argument field consists of a variable length character string ending with the character sequence <CRLF>. The receiver is to take no action until this sequence is received.

Square brackets denote an optional argument field. If the option is not taken, the appropriate default is implied.

The following are the SMTP commands:

   HELO <SP> <domain> <CRLF>
   MAIL <SP> FROM:<reverse-path> <CRLF>
   RCPT <SP> TO:<forward-path> <CRLF>
   DATA <CRLF>
   RSET <CRLF>
   SEND <SP> FROM:<reverse-path> <CRLF>
   SOML <SP> FROM:<reverse-path> <CRLF>
   SAML <SP> FROM:<reverse-path> <CRLF>
   VRFY <SP> <string> <CRLF>
   EXPN <SP> <string> <CRLF>
   HELP [<SP> <string>] <CRLF>
   NOOP <CRLF>
   QUIT <CRLF>
   TURN <CRLF>

4.2 SMTP REPLIES

Replies to SMTP commands are devised to ensure the synchronization of requests and actions in the process of mail transfer, and to guarantee that the sender-SMTP always knows the state of the receiver-SMTP. Every command must generate exactly one reply.

The details of the command-reply sequence are made explicit in Section 4.3 on Sequencing and Section 4.4 on State Diagrams.
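The case rules of the command syntax in Section 4.1.2 can be illustrated with a small parser. This is a sketch under the assumption that a complete line, including its <CRLF>, is already available as a string; parse_command is a hypothetical name.

```python
CRLF = "\r\n"


def parse_command(line: str) -> tuple[str, str]:
    """Split a command line into (command code, argument field).

    The command code is folded to upper case because letter case is not
    significant in it. The argument field is returned unchanged because
    case is significant inside the reverse-path and forward-path.
    """
    if not line.endswith(CRLF):
        # The receiver takes no action until <CRLF> has been received.
        raise ValueError("incomplete command line")
    body = line[:-len(CRLF)]
    code, _, rest = body.partition(" ")
    if len(code) != 4 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"not a four-letter command code: {code!r}")
    # Code and argument are separated by one or more spaces.
    return code.upper(), rest.lstrip(" ")


if __name__ == "__main__":
    print(parse_command("mAIl FROM:<Smith@Alpha.ARPA>\r\n"))
    print(parse_command("NOOP\r\n"))
```

A receiver built this way would answer a ValueError with a 500 syntax error reply; that mapping is left out of the sketch.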

An SMTP reply consists of a three digit number (transmitted as three alphanumeric characters) followed by some text. The number is intended for use by automata to determine what state to enter next; the text is meant for the human user. It is intended that the three digits contain enough encoded information that the sender-SMTP need not examine the text and may either discard it or pass it on to the user, as appropriate. In particular, the text may be receiver-dependent and context dependent, so there are likely to be varying texts for each reply code. A discussion of the theory of reply codes is given in Appendix E. Formally, a reply is defined to be the sequence: a three-digit code, <SP>, one line of text, and <CRLF>, or a multiline reply (as defined in Appendix E). Only the EXPN and HELP commands are expected to result in multiline replies in normal circumstances; however, multiline replies are allowed for any command.

4.2.2 NUMERIC ORDER LIST OF REPLY CODES

   211 - System status, or system help reply
   214 - Help message [Information on how to use the receiver or the meaning of a particular non-standard command; this reply is useful only to the human user]
   220 - <domain> Service ready
   221 - <domain> Service closing transmission channel
   250 - Requested mail action okay, completed
   251 - User not local; will forward to <forward-path>
   354 - Start mail input; end with <CRLF>.<CRLF>
   421 - <domain> Service not available, closing transmission channel [This may be a reply to any command if the service knows it must shut down]
   450 - Requested mail action not taken: mailbox unavailable [E.g., mailbox busy]
   451 - Requested action aborted: local error in processing
   452 - Requested action not taken: insufficient system storage
   500 - Syntax error, command unrecognized [This may include errors such as command line too long]
   501 - Syntax error in parameters or arguments
   502 - Command not implemented
   503 - Bad sequence of commands
   504 - Command parameter not implemented
   550 - Requested action not taken: mailbox unavailable [E.g., mailbox not found, no access]
   551 - User not local; please try <forward-path>
   552 - Requested mail action aborted: exceeded storage allocation
   553 - Requested action not taken: mailbox name not allowed [E.g., mailbox syntax incorrect]
   554 - Transaction failed

4.3. SEQUENCING OF COMMANDS AND REPLIES

The communication between the sender and receiver is intended to be an alternating dialogue, controlled by the sender. As such, the sender issues a command and the receiver responds with a reply. The sender must wait for this response before sending further commands.

One important reply is the connection greeting. Normally, a receiver will send a 220 "Service ready" reply when the connection is completed. The sender should wait for this greeting message before sending any commands.

Note: all the greeting type replies have the official name of the server host as the first word following the reply code. For example,

   220 <SP> USC-ISIF.ARPA <SP> Service ready <CRLF>

The table below lists alternative success and failure replies for each command. These must be strictly adhered to; a receiver may substitute text in the replies, but the meaning and action implied by the code numbers and by the specific command reply sequence cannot be altered.

COMMAND-REPLY SEQUENCES

Each command is listed with its possible replies. The prefixes used before the possible replies are "P" for preliminary (not used in SMTP), "I" for intermediate, "S" for success, "F" for failure, and "E" for error. The 421 reply (service not available, closing transmission channel) may be given to any command if the receiver-SMTP knows it must shut down. This listing forms the basis for the State Diagrams in Section 4.4.

4.4. STATE DIAGRAMS

Following are state diagrams for a simple-minded SMTP implementation. Only the first digit of the reply codes is used. There is one state diagram for each group of SMTP commands. The command groupings were determined by constructing a model for each command and then collecting together the commands with structurally identical models.

For each command there are three possible outcomes: "success" (S), "failure" (F), and "error" (E). In the state diagrams below we use the symbol B for "begin", and the symbol W for "wait for reply".

First, the diagram that represents most of the SMTP commands:

                           1,3     +---+
                      ------------>| E |
                     |             +---+
                     |
   +---+    cmd    +---+    2      +---+
   | B |---------->| W |---------->| S |
   +---+           +---+           +---+
                     |
                     |     4,5     +---+
                      ------------>| F |
                                   +---+

This diagram models the commands:

   HELO, MAIL, RCPT, RSET, SEND, SOML, SAML, VRFY, EXPN, HELP, NOOP, QUIT, TURN.
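The simple diagram can be read as a function from the first digit of the reply code to an outcome. The sketch below first splits a single-line reply into code and text, then classifies it; parse_reply and outcome are hypothetical names, and multiline replies are not handled.

```python
CRLF = "\r\n"


def parse_reply(line: str) -> tuple[int, str]:
    """Split a single-line reply into (three-digit code, text)."""
    if not line.endswith(CRLF):
        raise ValueError("incomplete reply line")
    code, _, text = line[:-len(CRLF)].partition(" ")
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"not a three-digit reply code: {code!r}")
    return int(code), text


def outcome(code: int) -> str:
    """Map a reply code to "S", "F" or "E" for the simple commands.

    Only the first digit is used: 2 is success, 4 and 5 are failure,
    and 1 and 3 are errors because these commands never expect a
    preliminary or intermediate reply.
    """
    first = code // 100
    if first == 2:
        return "S"
    if first in (4, 5):
        return "F"
    if first in (1, 3):
        return "E"
    raise ValueError(f"reply code out of range: {code}")


if __name__ == "__main__":
    code, text = parse_reply("550 No such user here\r\n")
    print(code, text, outcome(code))
```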

A more complex diagram models the DATA command:

   +---+   DATA    +---+ 1,2                 +---+
   | B |---------->| W |-------------------->| E |
   +---+           +---+        ------------>+---+
                   3| |4,5     |
                    | |        |
      --------------  -----    |
     |                     |   |             +---+
     |               ----------     -------->| S |
     |              |      |       |         +---+
     |              |  ------------
     |              | |    |
     V           1,3| |2   |
   +---+   data    +---+    ---------------->+---+
   |   |---------->| W |                     | F |
   +---+           +---+-------------------->+---+
                        4,5

Note that the "data" here is a series of lines sent from the sender to the receiver with no response expected until the last line is sent.

4.5. DETAILS

4.5.1. MINIMUM IMPLEMENTATION

In order to make SMTP workable, the following minimum implementation is required for all receivers:

   COMMANDS -- HELO
               MAIL
               RCPT
               DATA
               RSET
               NOOP
               QUIT

4.5.2. TRANSPARENCY

Without some provision for data transparency, the character sequence "<CRLF>.<CRLF>" ends the mail text and cannot be sent by the user. In general, users are not aware of such "forbidden" sequences. To allow all user composed text to be transmitted transparently, the following procedures are used.

1. Before sending a line of mail text, the sender-SMTP checks the first character of the line. If it is a period, one additional period is inserted at the beginning of the line.

2. When a line of mail text is received by the receiver-SMTP, it checks the line. If the line is composed of a single period, it is the end of mail. If the first character is a period and there are other characters on the line, the first character is deleted.

The mail data may contain any of the 128 ASCII characters. All characters are to be delivered to the recipient's mailbox, including format effectors and other control characters. If the transmission channel provides an 8-bit byte (octet) data stream, the 7-bit ASCII codes are transmitted right justified in the octets with the high order bits cleared to zero.
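The two transparency procedures can be sketched as a pair of functions that are inverses of each other. The function names are hypothetical, and lines are assumed to be Python strings without their trailing <CRLF>.

```python
CRLF = "\r\n"


def encode_mail_data(lines: list[str]) -> str:
    """Sender side: duplicate a leading period, then append the
    end of mail data indicator (a line containing only a period)."""
    out = []
    for line in lines:
        if line.startswith("."):
            line = "." + line
        out.append(line + CRLF)
    out.append("." + CRLF)
    return "".join(out)


def decode_mail_data(wire: str) -> list[str]:
    """Receiver side: stop at the lone period, delete one leading
    period from any other line that starts with a period."""
    lines = []
    for line in wire.split(CRLF):
        if line == ".":
            return lines            # end of mail data
        if line.startswith("."):
            line = line[1:]
        lines.append(line)
    raise ValueError("end of mail data indicator not found")


if __name__ == "__main__":
    text = ["Hello", ".", "..two dots", ""]
    wire = encode_mail_data(text)
    print(repr(wire))
    print(decode_mail_data(wire) == text)
```

Because the sender always adds a period to a line that starts with one, a user line consisting of a single period travels as two periods and can never be mistaken for the end of mail data.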


In some systems it may be necessary to transform the data as it is received and stored. This may be necessary for hosts that use a different character set than ASCII as their local character set, or that store data in records rather than strings. If such transforms are necessary, they must be reversible -- especially if such transforms are applied to mail being relayed.

4.5.3. SIZES

There are several objects that have required minimum maximum sizes. That is, every implementation must be able to receive objects of at least these sizes, but must not send objects larger than these sizes.

****************************************************
*                                                  *
*  TO THE MAXIMUM EXTENT POSSIBLE, IMPLEMENTATION  *
*  TECHNIQUES WHICH IMPOSE NO LIMITS ON THE LENGTH *
*  OF THESE OBJECTS SHOULD BE USED.                *
*                                                  *
****************************************************

User: The maximum total length of a user name is 64 characters.

Domain: The maximum total length of a domain name or number is 64 characters.

Path: The maximum total length of a reverse-path or forward-path is 256 characters (including the punctuation and element separators).

Command line: The maximum total length of a command line including the command word and the <CRLF> is 512 characters.

Reply line: The maximum total length of a reply line including the reply code and the <CRLF> is 512 characters.

Text line: The maximum total length of a text line including the <CRLF> is 1000 characters (but not counting the leading dot duplicated for transparency).

Recipients buffer: The maximum total number of recipients that must be buffered is 100 recipients.

****************************************************
*                                                  *
*  TO THE MAXIMUM EXTENT POSSIBLE, IMPLEMENTATION  *
*  TECHNIQUES WHICH IMPOSE NO LIMITS ON THE LENGTH *
*  OF THESE OBJECTS SHOULD BE USED.                *
*                                                  *
****************************************************

Errors due to exceeding these limits may be reported by using the reply codes, for example:

500 Line too long.
501 Path too long
552 Too many recipients.
552 Too much mail data.
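As an illustration of how a receiver might map these limits to reply codes, here is a small Java sketch. It assumes a hypothetical receiver that enforces exactly the sizes listed above (the text itself recommends imposing no limits where possible); the class `SmtpLimits` and its methods are invented for this example.

```java
import java.util.Optional;

// Hypothetical receiver-side checks using the sizes listed above.
final class SmtpLimits {
    static final int MAX_COMMAND_LINE = 512; // includes command word and <CRLF>
    static final int MAX_PATH = 256;         // includes punctuation and separators
    static final int MAX_RECIPIENTS = 100;   // recipients that must be buffered

    // Returns the error reply for an over-long command line, or empty if it fits.
    static Optional<String> checkCommandLine(String lineWithCrlf) {
        return lineWithCrlf.length() > MAX_COMMAND_LINE
                ? Optional.of("500 Line too long.")
                : Optional.empty();
    }

    // Returns the error reply for an over-long reverse-path or forward-path.
    static Optional<String> checkPath(String path) {
        return path.length() > MAX_PATH
                ? Optional.of("501 Path too long")
                : Optional.empty();
    }

    // Returns the error reply when more recipients arrive than this receiver buffers.
    static Optional<String> checkRecipientCount(int recipients) {
        return recipients > MAX_RECIPIENTS
                ? Optional.of("552 Too many recipients.")
                : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(checkCommandLine("NOOP\r\n").isPresent());
        System.out.println(checkRecipientCount(101).orElse("ok"));
    }
}
```

Since mail data is 7-bit ASCII, `String.length()` is used here as the octet count; a receiver handling other encodings would need to count bytes instead.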

Post Office Protocol:

Status of this Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

1. Introduction

On certain types of smaller nodes in the Internet it is often impractical to maintain a message transport system (MTS). For example, a workstation may not have sufficient resources (cycles, disk space) in order to permit an SMTP server [RFC821] and associated local mail delivery system to be kept resident and continuously running. Similarly, it may be expensive (or impossible) to keep a personal computer interconnected to an IP-style network for long amounts of time (the node is lacking the resource known as "connectivity").

Despite this, it is often very useful to be able to manage mail on these smaller nodes, and they often support a user agent (UA) to aid the tasks of mail handling. To solve this problem, a node which can support an MTS entity offers a maildrop service to these less endowed nodes. The Post Office Protocol - Version 3 (POP3) is intended to permit a workstation to dynamically access a maildrop on a server host in a useful fashion. Usually, this means that the POP3 protocol is used to allow a workstation to retrieve mail that the server is holding for it.

POP3 is not intended to provide extensive manipulation operations of mail on the server; normally, mail is downloaded and then deleted. A more advanced (and complex) protocol, IMAP4, is discussed in [RFC1730].

For the remainder of this memo, the term "client host" refers to a host making use of the POP3 service, while the term "server host" refers to a host that offers the POP3 service.

2. A Short Digression

This memo does not specify how a client host enters mail into the transport system, although a method consistent with the philosophy of this memo is presented here:

When the user agent on a client host wishes to enter a message into the transport system, it establishes an SMTP connection to its relay host and sends all mail to it. This relay host could be, but need not be, the POP3 server host for the client host. Of course, the relay host must accept mail for delivery to arbitrary recipient addresses; that functionality is not required of all SMTP servers.

3. Basic Operation

Initially, the server host starts the POP3 service by listening on TCP port 110. When a client host wishes to make use of the service, it establishes a TCP connection with the server host. When the connection is established, the POP3 server sends a greeting. The client and POP3 server then exchange commands and responses (respectively) until the connection is closed or aborted.

Commands in the POP3 consist of a case-insensitive keyword, possibly followed by one or more arguments. All commands are terminated by a CRLF pair. Keywords and arguments consist of printable ASCII characters. Keywords and arguments are each separated by a single SPACE character. Keywords are three or four characters long. Each argument may be up to 40 characters long.

Responses in the POP3 consist of a status indicator and a keyword possibly followed by additional information. All responses are terminated by a CRLF pair. Responses may be up to 512 characters long, including the terminating CRLF. There are currently two status indicators: positive ("+OK") and negative ("-ERR"). Servers MUST send the "+OK" and "-ERR" in upper case.

Responses to certain commands are multi-line. In these cases, which are clearly indicated below, after sending the first line of the response and a CRLF, any additional lines are sent, each terminated by a CRLF pair. When all lines of the response have been sent, a final line is sent, consisting of a termination octet (decimal code 046, ".") and a CRLF pair. If any line of the multi-line response begins with the termination octet, the line is "byte-stuffed" by pre-pending the termination octet to that line of the response.


Hence a multi-line response is terminated with the five octets "CRLF.CRLF". When examining a multi-line response, the client checks to see if the line begins with the termination octet. If so and if octets other than CRLF follow, the first octet of the line (the termination octet) is stripped away. If so and if CRLF immediately follows the termination character, then the response from the POP server is ended and the line containing ".CRLF" is not considered part of the multi-line response.

A POP3 session progresses through a number of states during its lifetime. Once the TCP connection has been opened and the POP3 server has sent the greeting, the session enters the AUTHORIZATION state. In this state, the client must identify itself to the POP3 server. Once the client has successfully done this, the server acquires resources associated with the client's maildrop, and the session enters the TRANSACTION state. In this state, the client requests actions on the part of the POP3 server. When the client has issued the QUIT command, the session enters the UPDATE state. In this state, the POP3 server releases any resources acquired during the TRANSACTION state and says goodbye. The TCP connection is then closed.

A server MUST respond to an unrecognized, unimplemented, or syntactically invalid command by responding with a negative status indicator. A server MUST respond to a command issued when the session is in an incorrect state by responding with a negative status indicator. There is no general method for a client to distinguish between a server which does not implement an optional command and a server which is unwilling or unable to process the command.

A POP3 server MAY have an inactivity autologout timer. Such a timer MUST be of at least 10 minutes' duration. The receipt of any command from the client during that interval should suffice to reset the autologout timer. When the timer expires, the session does NOT enter the UPDATE state -- the server should close the TCP connection without removing any messages or sending any response to the client.
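The client-side handling of a multi-line response described earlier in this section can be sketched in Java as below. This is a minimal sketch under stated assumptions: the response is assumed to be already split into lines on CRLF, with the first ("+OK ...") line removed, and the class `MultiLineResponse` and its methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical client-side decoder for a POP3 multi-line response.
final class MultiLineResponse {

    // True if the first line of a response carries the positive status indicator.
    static boolean isPositive(String statusLine) {
        return statusLine.startsWith("+OK");
    }

    // Decodes the lines that follow the status line: stops at the line that
    // holds only the termination octet, and strips the stuffed octet from any
    // other line that begins with it.
    static List<String> decode(List<String> linesAfterStatus) {
        List<String> result = new ArrayList<>();
        for (String line : linesAfterStatus) {
            if (line.equals(".")) {
                return result; // terminating line is not part of the response
            }
            result.add(line.startsWith(".") ? line.substring(1) : line);
        }
        throw new IllegalArgumentException("response is missing the terminating \".\" line");
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("1 120", "2 200", ".");
        System.out.println(decode(raw));
    }
}
```

Throwing on a missing terminator is a design choice for this sketch; a real client reading from a socket would more likely keep reading until the terminating line arrives or the connection fails.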

4. THE AUTHORIZATION STATE

Once the TCP connection has been opened by a POP3 client, the POP3 server issues a one-line greeting. This can be any positive response. An example might be:

S: +OK POP3 server ready


The POP3 session is now in the AUTHORIZATION state. The client must now identify and authenticate itself to the POP3 server. Two possible mechanisms for doing this are described in this document, the USER and PASS command combination and the APOP command. Both mechanisms are described later in this document. Additional authentication mechanisms are described in [RFC1734]. While there is no single authentication mechanism that is required of all POP3 servers, a POP3 server must of course support at least one authentication mechanism.

Once the POP3 server has determined through the use of any authentication command that the client should be given access to the appropriate maildrop, the POP3 server then acquires an exclusive-access lock on the maildrop, as necessary to prevent messages from being modified or removed before the session enters the UPDATE state. If the lock is successfully acquired, the POP3 server responds with a positive status indicator. The POP3 session now enters the TRANSACTION state, with no messages marked as deleted. If the maildrop cannot be opened for some reason (for example, a lock cannot be acquired, the client is denied access to the appropriate maildrop, or the maildrop cannot be parsed), the POP3 server responds with a negative status indicator. (If a lock was acquired but the POP3 server intends to respond with a negative status indicator, the POP3 server must release the lock prior to rejecting the command.) After returning a negative status indicator, the server may close the connection. If the server does not close the connection, the client may either issue a new authentication command and start again, or the client may issue the QUIT command.

After the POP3 server has opened the maildrop, it assigns a message-number to each message, and notes the size of each message in octets. The first message in the maildrop is assigned a message-number of "1", the second is assigned "2", and so on, so that the nth message in a maildrop is assigned a message-number of "n". In POP3 commands and responses, all message-numbers and message sizes are expressed in base-10 (i.e. decimal).

Here is the summary for the QUIT command when used in the AUTHORIZATION state:

QUIT

Arguments: none

Restrictions: none

Possible Responses:
+OK

Examples:
C: QUIT
S: +OK dewey POP3 server signing off


5. THE TRANSACTION STATE

Once the client has successfully identified itself to the POP3 server and the POP3 server has locked and opened the appropriate maildrop, the POP3 session is now in the TRANSACTION state. The client may now issue any of the following POP3 commands repeatedly. After each command, the POP3 server issues a response. Eventually, the client issues the QUIT command and the POP3 session enters the UPDATE state.

Here are the POP3 commands valid in the TRANSACTION state:

STAT

Arguments: none

Restrictions: may only be given in the TRANSACTION state

Discussion: The POP3 server issues a positive response with a line containing information for the maildrop. This line is called a "drop listing" for that maildrop.

In order to simplify parsing, all POP3 servers are required to use a certain format for drop listings. The positive response consists of "+OK" followed by a single space, the number of messages in the maildrop, a single space, and the size of the maildrop in octets. This memo makes no requirement on what follows the maildrop size. Minimal implementations should just end that line of the response with a CRLF pair. More advanced implementations may include other information.

NOTE: This memo STRONGLY discourages implementations from supplying additional information in the drop listing. Other, optional, facilities are discussed later on which permit the client to parse the messages in the maildrop.

Note that messages marked as deleted are not counted in either total.

Possible Responses:
+OK nn mm

Examples:
C: STAT
S: +OK 2 320
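Because the drop listing has a fixed format, a client can parse it with very little code. The following Java sketch shows one way to do so; the class `DropListing` is a hypothetical name introduced for this example, and the response line is assumed to have had its trailing CRLF removed already.

```java
// Hypothetical value class holding the two numbers of a STAT drop listing.
final class DropListing {
    final int messageCount;
    final long sizeInOctets;

    DropListing(int messageCount, long sizeInOctets) {
        this.messageCount = messageCount;
        this.sizeInOctets = sizeInOctets;
    }

    // Parses "+OK nn mm"; anything after the maildrop size is ignored,
    // since the memo places no requirement on it.
    static DropListing parse(String response) {
        String[] parts = response.trim().split(" ");
        if (parts.length < 3 || !parts[0].equals("+OK")) {
            throw new IllegalArgumentException("not a positive STAT response: " + response);
        }
        // Both numbers are expressed in base-10.
        return new DropListing(Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }

    public static void main(String[] args) {
        DropListing d = parse("+OK 2 320");
        System.out.println(d.messageCount + " messages, " + d.sizeInOctets + " octets");
    }
}
```

A size-based filter could use `sizeInOctets` (or the per-message sizes from LIST) to decide whether a download is worthwhile before issuing RETR.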

LIST [msg]

Arguments: a message-number (optional), which, if present, may NOT refer to a message marked as deleted

Restrictions: may only be given in the TRANSACTION state

Discussion: If an argument was given and the POP3 server issues a positive response, the response is a single line containing information for that message. This line is called a "scan listing" for that message.

If no argument was given and the POP3 server issues a positive response, then the response given is multi-line. After the initial +OK, for each message in the maildrop, the POP3 server responds with a line containing information for that message. This line is also called a "scan listing" for that message. If there are no messages in the maildrop, then the POP3 server responds with no scan listings -- it issues a positive response followed by a line containing a termination octet and a CRLF pair.

In order to simplify parsing, all POP3 servers are required to use a certain format for scan listings. A scan listing consists of the message-number of the message, followed by a single space and the exact size of the message in octets. Methods for calculating the exact size of the message are described in the "Message Format" section below. This memo makes no requirement on what follows the message size in the scan listing. Minimal implementations should just end that line of the response with a CRLF pair. More advanced implementations may include other information, as parsed from the message.

NOTE: This memo STRONGLY discourages implementations from supplying additional information in the scan listing. Other, optional, facilities are discussed later on which permit the client to parse the messages in the maildrop.

Note that messages marked as deleted are not listed.

Possible Responses:
+OK scan listing follows
-ERR no such message

Examples:
C: LIST
S: +OK 2 messages (320 octets)
S: 1 120
S: 2 200
S: .
...
C: LIST 2
S: +OK 2 200
...
C: LIST 3
S: -ERR no such message, only 2 messages in maildrop

RETR msg

Arguments: a message-number (required) which may NOT refer to a message marked as deleted

Restrictions: may only be given in the TRANSACTION state


Discussion: If the POP3 server issues a positive response, then the response given is multi-line. After the initial +OK, the POP3 server sends the message corresponding to the given message-number, being careful to byte-stuff the termination character (as with all multi-line responses).

Possible Responses:
+OK message follows
-ERR no such message

Examples:
C: RETR 1
S: +OK 120 octets
S: <the POP3 server sends the entire message here>
S: .

DELE msg

Arguments: a message-number (required) which may NOT refer to a message marked as deleted

Restrictions: may only be given in the TRANSACTION state

Discussion: The POP3 server marks the message as deleted. Any future reference to the message-number associated with the message in a POP3 command generates an error. The POP3 server does not actually delete the message until the POP3 session enters the UPDATE state.

Possible Responses:
+OK message deleted
-ERR no such message

Examples:
C: DELE 1
S: +OK message 1 deleted
...
C: DELE 2
S: -ERR message 2 already deleted

NOOP

Arguments: none

Restrictions: may only be given in the TRANSACTION state

Discussion: The POP3 server does nothing, it merely replies with a positive response.

Possible Responses:
+OK

Examples:
C: NOOP
S: +OK

RSET

Arguments: none

Restrictions: may only be given in the TRANSACTION state

Discussion: If any messages have been marked as deleted by the POP3 server, they are unmarked. The POP3 server then replies with a positive response.

Possible Responses:
+OK


Examples:
C: RSET
S: +OK maildrop has 2 messages (320 octets)
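The interplay of DELE and RSET in the TRANSACTION state, with actual removal deferred until the session enters the UPDATE state, can be modelled with a small in-memory sketch. Everything here is an assumption made for illustration: the class `MaildropSession`, its methods, and the exact reply texts are not taken from the project code, and no real maildrop or network connection is involved.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of deletion marks during one POP3 session.
final class MaildropSession {
    private final int messageCount;
    private final Set<Integer> markedDeleted = new TreeSet<>();

    MaildropSession(int messageCount) {
        this.messageCount = messageCount;
    }

    // DELE msg: only marks the message; nothing is removed yet.
    String dele(int msg) {
        if (msg < 1 || msg > messageCount) {
            return "-ERR no such message";
        }
        if (!markedDeleted.add(msg)) {
            return "-ERR message " + msg + " already deleted";
        }
        return "+OK message " + msg + " deleted";
    }

    // RSET: unmarks every message marked as deleted.
    String rset() {
        markedDeleted.clear();
        return "+OK";
    }

    // QUIT issued in the TRANSACTION state: returns the message-numbers the
    // server would remove on entering the UPDATE state.
    Set<Integer> quit() {
        return new TreeSet<>(markedDeleted);
    }

    public static void main(String[] args) {
        MaildropSession session = new MaildropSession(2);
        System.out.println(session.dele(1));
        System.out.println(session.dele(1));
        System.out.println(session.rset());
        System.out.println(session.quit());
    }
}
```

If the connection drops without a QUIT, `quit()` is never called in this model, which mirrors the rule that no messages are removed unless the UPDATE state is entered.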

6. THE UPDATE STATE

When the client issues the QUIT command from the TRANSACTION state, the POP3 session enters the UPDATE state. (Note that if the client issues the QUIT command from the AUTHORIZATION state, the POP3 session terminates but does NOT enter the UPDATE state.)

If a session terminates for some reason other than a client-issued QUIT command, the POP3 session does NOT enter the UPDATE state and MUST NOT remove any messages from the maildrop.

QUIT

Arguments: none

Restrictions: none

Discussion: The POP3 server removes all messages marked as deleted from the maildrop and replies as to the status of this operation. If there is an error, such as a resource shortage, encountered while removing messages, the maildrop may result in having some or none of the messages marked as deleted be removed. In no case may the server remove any messages not marked as deleted.

Whether the removal was successful or not, the server then releases any exclusive-access lock on the maildrop and closes the TCP connection.

Possible Responses:
+OK
-ERR some deleted messages not removed

Examples:
C: QUIT
S: +OK dewey POP3 server signing off (maildrop empty)
...
C: QUIT
S: +OK dewey POP3 server signing off (2 messages left)
...

7. Optional POP3 Commands

The POP3 commands discussed above must be supported by all minimal implementations of POP3 servers.

The optional POP3 commands described below permit a POP3 client greater freedom in message handling, while preserving a simple POP3 server implementation.


NOTE: This memo STRONGLY encourages implementations to support these commands in lieu of developing augmented drop and scan listings. In short, the philosophy of this memo is to put intelligence in the part of the POP3 client and not the POP3 server.

TOP msg n

Arguments: a message-number (required) which may NOT refer to a message marked as deleted, and a non-negative number of lines (required)

Restrictions: may only be given in the TRANSACTION state

Discussion: If the POP3 server issues a positive response, then the response given is multi-line. After the initial +OK, the POP3 server sends the headers of the message, the blank line separating the headers from the body, and then the number of lines of the indicated message's body, being careful to byte-stuff the termination character (as with all multi-line responses).

Note that if the number of lines requested by the POP3 client is greater than the number of lines in the body, then the POP3 server sends the entire message.

Possible Responses:
+OK top of message follows
-ERR no such message

Examples:
C: TOP 1 10
S: +OK
S: <the POP3 server sends the headers of the message, a blank line, and the first 10 lines of the body of the message>
S: .
...
C: TOP 100 3
S: -ERR no such message

UIDL [msg]

Arguments: a message-number (optional), which, if present, may NOT refer to a message marked as deleted

Restrictions: may only be given in the TRANSACTION state

Discussion: If an argument was given and the POP3 server issues a positive response, the response is a single line containing information for that message. This line is called a "unique-id listing" for that message.

If no argument was given and the POP3 server issues a positive response, then the response given is multi-line. After the initial +OK, for each message in the maildrop, the POP3 server responds with a line containing information for that message. This line is called a "unique-id listing" for that message.


In order to simplify parsing, all POP3 servers are required to use a certain format for unique-id listings. A unique-id listing consists of the message-number of the message, followed by a single space and the unique-id of the message. No information follows the unique-id in the unique-id listing.

The unique-id of a message is an arbitrary server-determined string, consisting of one to 70 characters in the range 0x21 to 0x7E, which uniquely identifies a message within a maildrop and which persists across sessions. This persistence is required even if a session ends without entering the UPDATE state. The server should never reuse a unique-id in a given maildrop, for as long as the entity using the unique-id exists.

Note that messages marked as deleted are not listed.

While it is generally preferable for server implementations to store arbitrarily assigned unique-ids in the maildrop, this specification is intended to permit unique-ids to be calculated as a hash of the message. Clients should be able to handle a situation where two identical copies of a message in a maildrop have the same unique-id.

Possible Responses:
+OK unique-id listing follows
-ERR no such message

Examples:
C: UIDL
S: +OK
S: 1 whqtswO00WBw418f9t5JxYwZ
S: 2 QhdPYR:00WBw1Ph7x7
S: .
...
C: UIDL 2
S: +OK 2 QhdPYR:00WBw1Ph7x7
...
C: UIDL 3
S: -ERR no such message, only 2 messages in maildrop

USER name

Arguments: a string identifying a mailbox (required), which is of significance ONLY to the server

Restrictions: may only be given in the AUTHORIZATION state after the POP3 greeting or after an unsuccessful USER or PASS command


Discussion: To authenticate using the USER and PASS command combination, the client must first issue the USER command. If the POP3 server responds with a positive status indicator ("+OK"), then the client may issue either the PASS command to complete the authentication, or the QUIT command to terminate the POP3 session. If the POP3 server responds with a negative status indicator ("-ERR") to the USER command, then the client may either issue a new authentication command or may issue the QUIT command.

The server may return a positive response even though no such mailbox exists. The server may return a negative response if the mailbox exists, but does not permit plaintext password authentication.

Possible Responses:
+OK name is a valid mailbox
-ERR never heard of mailbox name

Examples:
C: USER frated
S: -ERR sorry, no mailbox for frated here
...
C: USER mrose
S: +OK mrose is a real hoopy frood

PASS string

Arguments: a server/mailbox-specific password (required)

Restrictions: may only be given in the AUTHORIZATION state immediately after a successful USER command

Discussion: When the client issues the PASS command, the POP3 server uses the argument pair from the USER and PASS commands to determine if the client should be given access to the appropriate maildrop.

Since the PASS command has exactly one argument, a POP3 server may treat spaces in the argument as part of the password, instead of as argument separators.

Possible Responses:
+OK maildrop locked and ready
-ERR invalid password
-ERR unable to lock maildrop

Examples:
C: USER mrose
S: +OK mrose is a real hoopy frood
C: PASS secret
S: -ERR maildrop already locked
...
C: USER mrose
S: +OK mrose is a real hoopy frood
C: PASS secret
S: +OK mrose's maildrop has 2 messages (320 octets)


APOP name digest

Arguments: a string identifying a mailbox and an MD5 digest string (both required)

Restrictions: may only be given in the AUTHORIZATION state after the POP3 greeting or after an unsuccessful USER or PASS command

Discussion: Normally, each POP3 session starts with a USER/PASS exchange. This results in a server/user-id specific password being sent in the clear on the network. For intermittent use of POP3, this may not introduce a sizable risk. However, many POP3 client implementations connect to the POP3 server on a regular basis -- to check for new mail. Further the interval of session initiation may be on the order of five minutes. Hence, the risk of password capture is greatly enhanced.

An alternate method of authentication is required which provides for both origin authentication and replay protection, but which does not involve sending a password in the clear over the network. The APOP command provides this functionality.

A POP3 server which implements the APOP command will include a timestamp in its banner greeting. The syntax of the timestamp corresponds to the `msg-id' in [RFC822], and MUST be different each time the POP3 server issues a banner greeting. For example, on a UNIX implementation in which a separate UNIX process is used for each instance of a POP3 server, the syntax of the timestamp might be:

<process-ID.clock@hostname>

where `process-ID' is the decimal value of the process's PID, clock is the decimal value of the system clock, and hostname is the fully-qualified domain-name corresponding to the host where the POP3 server is running.

The POP3 client makes note of this timestamp, and then issues the APOP command. The `name' parameter has identical semantics to the `name' parameter of the USER command. The `digest' parameter is calculated by applying the MD5 algorithm [RFC1321] to a string consisting of the timestamp (including angle-brackets) followed by a shared secret. This shared secret is a string known only to the POP3 client and server. Great care should be taken to prevent unauthorized disclosure of the secret, as knowledge of the secret will allow any entity to successfully masquerade as the named user. The `digest' parameter itself is a 16-octet value which is sent in hexadecimal format, using lower-case ASCII characters.
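The digest calculation just described can be sketched with the standard `java.security.MessageDigest` class. The wrapper class `ApopDigest` and its method name are hypothetical; the timestamp and secret used in `main` are the sample values from the APOP example in this section, so the printed value can be compared by hand with the digest quoted there.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper computing the `digest' argument of the APOP command.
final class ApopDigest {

    // MD5 over the timestamp (including angle-brackets) followed by the
    // shared secret, rendered as 32 lower-case hexadecimal characters.
    static String digest(String timestamp, String sharedSecret) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] hash = md5.digest((timestamp + sharedSecret).getBytes(StandardCharsets.US_ASCII));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every conforming Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String timestamp = "<1896.697170952@dbc.mtview.ca.us>";
        System.out.println("APOP mrose " + digest(timestamp, "tanstaaf"));
    }
}
```

Because the server issues a different timestamp with every greeting, the digest differs from session to session even though the shared secret stays the same, which is what gives APOP its replay protection.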


When the POP3 server receives the APOP command, it verifies the digest provided. If the digest is correct, the POP3 server issues a positive response, and the POP3 session enters the TRANSACTION state. Otherwise, a negative response is issued and the POP3 session remains in the AUTHORIZATION state.

Note that as the length of the shared secret increases, so does the difficulty of deriving it. As such, shared secrets should be long strings (considerably longer than the 8-character example shown below).

Possible Responses:
+OK maildrop locked and ready
-ERR permission denied

Examples:
S: +OK POP3 server ready <1896.697170952@dbc.mtview.ca.us>
C: APOP mrose c4c9334bac560ecc979e58001b3e22fb
S: +OK maildrop has 1 message (369 octets)

In this example, the shared secret is the string `tanstaaf'. Hence, the MD5 algorithm is applied to the string

<1896.697170952@dbc.mtview.ca.us>tanstaaf

which produces a digest value of

c4c9334bac560ecc979e58001b3e22fb

8. Scaling and Operational Considerations

Since some of the optional features described above were added to the POP3 protocol, experience has accumulated in using them in large-scale commercial post office operations where most of the users are unrelated to each other. In these situations and others, users and vendors of POP3 clients have discovered that the combination of using the UIDL command and not issuing the DELE command can provide a weak version of the "maildrop as semi-permanent repository" functionality normally associated with IMAP. Of course the other capabilities of IMAP, such as polling an existing connection for newly arrived messages and supporting multiple folders on the server, are not present in POP3.

When these facilities are used in this way by casual users, there has been a tendency for already-read messages to accumulate on the server without bound. This is clearly an undesirable behavior pattern from the standpoint of the server operator. This situation is aggravated by the fact that the limited capabilities of the POP3 do not permit efficient handling of maildrops which have hundreds or thousands of messages.

Consequently, it is recommended that operators of large-scale multi-user servers, especially ones in which the user's only access to the maildrop is via POP3, consider such options as:


* Imposing a per-user maildrop storage quota or the like.

A disadvantage to this option is that accumulation of messages may result in the user's inability to receive new ones into the maildrop. Sites which choose this option should be sure to inform users of impending or current exhaustion of quota, perhaps by inserting an appropriate message into the user's maildrop.

* Enforce a site policy regarding mail retention on the server.

Sites are free to establish local policy regarding the storage and retention of messages on the server, both read and unread. For example, a site might delete unread messages from the server after 60 days and delete read messages after 7 days. Such message deletions are outside the scope of the POP3 protocol and are not considered a protocol violation.

Server operators enforcing message deletion policies should take care to make all users aware of the policies in force.

Clients must not assume that a site policy will automate message deletions, and should continue to explicitly delete messages using the DELE command when appropriate.

It should be noted that enforcing site message deletion policies may be confusing to the user community, since their POP3 client may contain configuration options to leave mail on the server which will not in fact be supported by the server.

One special case of a site policy is that messages may only be downloaded once from the server, and are deleted after this has been accomplished. This could be implemented in POP3 server software by the following mechanism: "following a POP3 login by a client which was ended by a QUIT, delete all messages downloaded during the session with the RETR command". It is important not to delete messages in the event of abnormal connection termination (i.e., if no QUIT was received from the client) because the client may not have successfully received or stored the messages. Servers implementing a download-and-delete policy may also wish to disable or limit the optional TOP command, since it could be used as an alternate mechanism to download entire messages.

9. POP3 Command Summary

Minimal POP3 Commands:

USER name        valid in the AUTHORIZATION state
PASS string
QUIT

STAT             valid in the TRANSACTION state
LIST [msg]
RETR msg
DELE msg
NOOP
RSET
QUIT

Optional POP3 Commands:

APOP name digest valid in the AUTHORIZATION state

TOP msg n        valid in the TRANSACTION state
UIDL [msg]

POP3 Replies:

+OK
-ERR

Note that with the exception of the STAT, LIST, and UIDL commands, the reply given by the POP3 server to any command is significant only to "+OK" and "-ERR". Any text occurring after this reply may be ignored by the client.

10. Example POP3 Session

S: <wait for connection on TCP port 110>
C: <open connection>
S: +OK POP3 server ready <1896.697170952@dbc.mtview.ca.us>
C: APOP mrose c4c9334bac560ecc979e58001b3e22fb
S: +OK mrose's maildrop has 2 messages (320 octets)
C: STAT
S: +OK 2 320
C: LIST
S: +OK 2 messages (320 octets)
S: 1 120
S: 2 200
S: .
C: RETR 1
S: +OK 120 octets
S: <the POP3 server sends message 1>
S: .
C: DELE 1
S: +OK message 1 deleted
C: RETR 2
S: +OK 200 octets
S: <the POP3 server sends message 2>
S: .
C: DELE 2
S: +OK message 2 deleted
C: QUIT
S: +OK dewey POP3 server signing off (maildrop empty)
C: <close connection>
S: <wait for next connection>

11. Message Format

All messages transmitted during a POP3 session are assumed to conform to the standard for the format of Internet text messages [RFC822].

It is important to note that the octet count for a message on the server host may differ from the octet count assigned to that message due to local conventions for designating end-of-line. Usually, during the AUTHORIZATION state of the POP3 session, the POP3 server can calculate the size of each message in octets when it opens the maildrop. For example, if the POP3 server host internally represents end-of-line as a single character, then the POP3 server simply counts each occurrence of this character in a message as two octets. Note that lines in the message which start with the termination octet need not (and must not) be counted twice, since the POP3 client will remove all byte-stuffed termination characters when it receives a multi-line response.

Security Considerations

It is conjectured that use of the APOP command provides origin identification and replay protection for a POP3 session. Accordingly, a POP3 server which implements both the PASS and APOP commands should not allow both methods of access for a given user; that is, for a given mailbox name, either the USER/PASS command sequence or the APOP command is allowed, but not both.

Further, note that as the length of the shared secret increases, so does the difficulty of deriving it.

Servers that answer -ERR to the USER command are giving potential attackers clues about which names are valid.

Use of the PASS command sends passwords in the clear over the network.

58

Mail Filters

Use of the RETR and TOP commands sends mail in the clear over the network. Otherwise, security issues are not discussed in this memo. 14. Acknowledgements The POP family has a long and checkered history. Although primarily a minor revision to RFC 1460, POP3 is based on the ideas presented in RFCs 918, 937, and 1081.

INTRODUCTION:

The JavaMail API provides a set of abstract classes defining objects that comprise a mail system. The API defines classes like Message, Store and Transport. The API can be extended and can be subclassed to provide new protocols and to add functionality when necessary. In addition, the API provides concrete subclasses of the abstract classes. These subclasses, including MimeMessage and MimeBodyPart, implement widely used Internet mail protocols and conform to specifications RFC822 and RFC2045. They are ready to be used in application development.

GOALS AND DESIGN PRINCIPLES:

The JavaMail API is designed to make adding electronic mail capability to simple applications easy, while also supporting the creation of sophisticated user interfaces. It includes appropriate convenience classes which encapsulate common mail functions and protocols. It fits with other packages for the Java platform in order to facilitate its use with other Java APIs, and it uses familiar programming models. The JavaMail API is therefore designed to satisfy the following development and runtime requirements:

- Simple, straightforward class design that is easy for a developer to learn and implement.
- Use of familiar concepts and programming models to support code development that interfaces well with other Java APIs:
    - Uses familiar exception-handling and JDK 1.1 event-handling programming models.
    - Uses features from the JavaBeans Activation Framework (JAF) to handle access to data based on data type and to facilitate the addition of data types and commands on those data types. The JavaMail API provides convenience functions to simplify these coding tasks.
- Lightweight classes and interfaces that make it easy to add basic mail-handling tasks to any application.
- Support for the development of robust mail-enabled applications that can handle a variety of complex mail message formats, data types, and access and transport protocols.

The JavaMail API draws heavily from IMAP, MAPI, CMC, c-client and other messaging system APIs: many of the concepts present in these other systems are also present in the JavaMail API. It is simpler to use because it uses features of the Java programming language not available to these other APIs, and because it uses the Java programming language's object model to shelter applications from implementation complexity.

The JavaMail API supports many different messaging system implementations: different message stores, different message formats, and different message transports. The JavaMail API provides a set of base classes and interfaces that define the API for client applications. Many simple applications will only need to interact with the messaging system through these base classes and interfaces. JavaMail subclasses can expose additional messaging system features. For instance, the MimeMessage subclass exposes and implements common characteristics of an Internet mail message, as defined by RFC822 and MIME standards. Developers can subclass JavaMail classes to provide the implementations of particular messaging systems, such as IMAP4, POP3, and SMTP.

ARCHITECTURE

The JavaMail architectural components are layered as shown below:

1. The Abstract Layer declares classes, interfaces and abstract methods intended to support mail handling functions that all mail systems support. API elements comprising the Abstract Layer are intended to be subclassed and extended as necessary in order to support standard data types, and to interface with message access and message transport protocols as necessary.
2. The Internet implementation layer implements part of the abstract layer using Internet standards: RFC822 and MIME.
3. JavaMail uses the JavaBeans Activation Framework (JAF) in order to encapsulate message data, and to handle commands intended to interact with that data. Interaction with message data should take place via JAF-aware JavaBeans, which are not provided by the JavaMail API.


JavaMail clients use the JavaMail API and Service Providers implement the JavaMail API. The layered design architecture allows clients to use the same JavaMail API calls to send, receive and store a variety of messages using different data-types from different message stores and using different message transport protocols.

JAVA MAIL CLASS HIERARCHY: The figure below shows major classes and interfaces comprising the JavaMail API.


THE JAVA MAIL FRAME WORK: The JavaMail API is intended to perform the following functions, which comprise the standard mail handling process for a typical client application:


MAJOR JAVA MAIL API COMPONENTS:

This section reviews major components comprising the JavaMail architecture.

The Message Class

The Message class is an abstract class that defines a set of attributes and a content for a mail message. Attributes of the Message class specify addressing information and define the structure of the content, including the content type. The content is represented as a DataHandler object that wraps around the actual data. The Message class implements the Part interface. The Part interface defines attributes that are required to define and format data content carried by a Message object, and to interface successfully to a mail system. The Message class adds From, To, Subject, Reply-To, and other attributes necessary for message routing via a message transport system. When contained in a folder, a Message object has a set of flags associated with it. JavaMail provides Message subclasses that support specific messaging implementations.

Message Storage and Retrieval

Messages are stored in Folder objects. A Folder object can contain subfolders as well as messages, thus providing a tree-like folder hierarchy. The Folder class declares methods that fetch, append, copy and delete messages. A Folder object can also send events to components registered as event listeners.

Message Composition and Transport

A client creates a new message by instantiating an appropriate Message subclass. It sets attributes like the recipient addresses and the subject, and inserts the content into the Message object. Finally, it sends the Message by invoking the Transport.send method.

The Session Class

The Session class defines global and per-user mail-related properties that define the interface between a mail-enabled client and the network. JavaMail system components use the Session object to set and get specific properties. The Session class also provides a default authenticated session object that desktop applications can share. The Session class is a final concrete class. It cannot be subclassed.

Using the JavaMail API

This section defines the syntax and lists the order in which a client application calls some JavaMail methods in order to access and open a message located in a folder:


1. A JavaMail client typically begins a mail handling task by obtaining the default JavaMail Session object.

    Session session = Session.getDefaultInstance(props, authenticator);

2. The client uses the Session object's getStore method to connect to the default store. The getStore method returns a Store object subclass that supports the access protocol defined in the user properties object, which will typically contain per-user preferences.

    Store store = session.getStore();
    store.connect();

3. If the connection is successful, the client can list available folders in the Store, and then fetch and view specific Message objects.

    // get the INBOX folder
    Folder inbox = store.getFolder("INBOX");
    // open the INBOX folder
    inbox.open(Folder.READ_WRITE);
    Message m = inbox.getMessage(1);    // get Message # 1
    String subject = m.getSubject();    // get Subject
    Object content = m.getContent();    // get content
    ...

4. Finally, the client closes all open folders, and then closes the store.

    inbox.close(false);    // Close the INBOX without expunging deleted messages
    store.close();         // Close the Store


DESIGN PRINCIPLES & METHODOLOGY


Producing the design for a large module can be an extremely complex task. Design principles are used to handle the complexity of the design process effectively; they not only reduce the effort needed for design but also reduce the scope for introducing errors during design. Large problems are solved with the time-tested principle of divide and conquer: the problem is divided into smaller pieces so that each piece can be conquered separately. For software design, the goal is to divide the problem into manageably small pieces that can be solved separately. This reduces the overall cost, because the cost of solving the entire problem at once is more than the sum of the costs of solving all the pieces.


When the degree of partitioning is high, however, the cost of partitioning itself becomes a problem, so the designer has to judge when to stop partitioning. In design, the most important quality criteria are simplicity and understandability. Each part should relate easily to the application, and each piece should be modifiable separately. Proper partitioning makes the system easier to maintain because it makes the design easier to understand. Problem partitioning also aids design verification. Abstraction is essential for problem partitioning and is used for existing components as well as components that are being designed. Abstraction of existing components plays an important role in the maintenance phase, and abstraction is also used during the design process of the system. In functional abstraction, the four main modules take in the details and compute further actions. In data abstraction, a module provides some services.

The system is a collection of modules, that is, components. The highest-level component corresponds to the total system. To design this system, we first followed the top-down approach to divide the problem into modules. Top-down design methods often result in some form of stepwise refinement. After the main modules are divided, the bottom-up approach is used to build from the most basic or primitive components up to higher-level components; the bottom-up method starts from the very bottom. The system is modular because it consists of discrete components such that each component supports a well-defined abstraction and a change to a component has minimal impact on other components. The modules are highly cohesive, and coupling is reduced because the relationships among elements in different modules are minimized.


Design Objectives

These are some of the currently implemented features:

- Complete portability: Apache James is a 100% pure Java application based on the Java 2 platform and the Java Mail 1.3 API.
- Protocol abstraction: Unlike other mail engines, protocols are seen only as "communication languages" ruling communications between clients and the server. Apache James is not tied to any particular protocol but follows an abstracted server design (like Java Mail did on the client side).
- Complete solution: The mail system is able to handle both mail transport and storage in a single server application. Apache James works alone without the need for any other server or solution.
- Mailet support: Apache James supports the Apache Mailet API. A Mailet is a discrete piece of mail-processing logic which is incorporated into a Mailet-compliant mail server's processing. This easy-to-write, easy-to-use pattern allows developers to build powerful customized mail systems. Examples of the services a Mailet might provide include: a mail-to-fax or mail-to-phone transformer, a filter, a language translator, a mailing list manager, etc. Several Mailets are included in the JAMES distribution.
- Resource abstraction: Like protocols, resources are abstracted and accessed through defined interfaces (Java Mail for transport, JDBC for spool storage or user accounts in RDBMSs, Apache Mailet API). The server is highly modular and reuses solutions from other projects.
- Secure and multi-threaded design: Based on the technology developed for the Apache JServ servlet engine, Apache James has a careful, security-oriented, fully multi-threaded design, to allow performance, scalability and mission-critical use.


System design is the process of applying various techniques and principles for the purpose of defining a system in sufficient detail to permit its physical realization. Software design is the kernel of the software engineering process. Once the software requirements have been analyzed and specified, design is the first activity. The flow of information during this process is as follows.

[Figure: flow of information during software design. Labels recovered from the diagram: information domain details, function specification, behavioral specification, other requirement modules, design, procedural design, code, program, test.]

Software design is the process through which requirements are translated into a representation of software.

Preliminary design is concerned with the transformation of requirements into data and software architecture. Detailed design focuses on refinements to the architectural representation that lead to detailed data structures and algorithmic representations for software. In the present project report, preliminary design is given more emphasis.

System design is the bridge between system and requirements analysis and system implementation. Some of the essential fundamental concepts involved in the design of applications are:

- Abstraction
- Modularity
- Verification

Abstraction is used to construct solutions to problems without having to take account of the intricate details of the various component subprograms. Abstraction allows the system designer to make stepwise refinements, by which, at each stage of the design, unnecessary details associated with representation or implementation may be hidden from the surrounding environment.

Modularity is concerned with decomposing the main module into well-defined, manageable units with well-defined interfaces among the units. This enhances design clarity, which in turn eases implementation, debugging, testing, documentation and maintenance of the software product. Modularity viewed in this sense is a vital tool in the construction of large software projects.

Verification is a fundamental concept in software design. A design is verifiable if it can be demonstrated that the design will result in an implementation which satisfies the customer's requirements.

Some of the important factors of quality that are to be considered in the design of an application are:

- Reliability: The software should behave strictly according to the original specification, satisfying the customer's requirements, and should function smoothly under normal and possible abnormal conditions. This product is highly reliable and can handle any number of mails to filter.
- Extensibility: The design of the system must be such that any new additions to the information, functional and behavioral domains may be made easily and can be adapted to new specifications. We provided this extensibility in this product: you can add any number of filters in the future.

System design is the process of developing specifications for the candidate system that meet the criteria established during the phase of system analysis. A major step in design is the preparation of input forms and the design of output reports in a form acceptable to the user. These steps in turn lead to a successful implementation of the system.

In this project we focus on Mail Fetch, which is a part of our James Server. We configure our filter logic there. Mail Fetch is the key to implementing our filters: it first consults the filters and then, based on the logic in those filters, decides whether or not to drop each message. Following is the design document:

Mail Fetch:

Mail Fetch is an application to download your email through protocols like POP3 and IMAP. It also allows you to retrieve your news messages through NNTP. In addition to the simple feature of downloading mail, Mail Fetch has the concept of mail filters. A filter has the single job of deciding whether or not to download a single message. The actual decision of whether to download a mail or not is made through a sequence of filters. There can be a global set of filters as well as a per-maildrop one. A maildrop represents your mailbox from which you want to download your mail. Mail Fetch is written in the Java programming language and has an extensible XML-based configuration. Mail Fetch is very easy to configure: all that has to be done is to edit the plain-text configuration file. I have written a fair amount of documentation, so that should help. Mail Fetch can process multiple maildrops with individual filter mechanisms and poll times.

Features:

Following is the list of features provided by Mail Fetch:
- POP3 and IMAP protocol support
- Can handle any number of maildrops
- Polling mechanism to periodically check maildrops for new messages
- Filtering system for downloading mail
- Standard filters provided, like Size, Message-ID, Sender
- Easy pluggability of user-defined filters
- Runs on all platforms supported by Java 2
- Configurable logging mechanism to keep track of mails downloaded
- Multiple delivery options provided, like Mailbox and SMTP
- Delivery options accessible at the filter level
- Experimental NNTP support

Configuring and Extending Mail Fetch:

Mail Fetch uses XML for configuration. The configuration file is MailFetch.xml, located in $MAILFETCH_HOME/conf, where $MAILFETCH_HOME is the directory containing your Mail Fetch binary distribution.


The configuration file is accompanied by a detailed document explaining how to configure Mail Fetch. I would recommend referring to that document whenever you have a problem following what I'm saying. This document is called Configuration.txt and is located in the $MAILFETCH_HOME/docs directory. Essentially, there are maildrops to download mail from; they contain all the information about accessing a maildrop. There is a global sequence of filters, which is checked for each maildrop before the maildrop-local sequence of filters. Filters can be configured through the configuration file. For example, a size-based filter needs to know what size it should filter at and what action it should take when a message exceeds that size. Each filter may have some additional configuration options. The additional configuration is totally dependent on the filter itself. You could add your own filter and have it configured from the configuration file. I shall expand on that later in this section.
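The ordering described above (global filters first, then the maildrop-local ones, stopping at the first rejection) can be sketched as follows. This is only an illustration under stated assumptions: MailSummary, SketchMailFilter and FilterPipeline are hypothetical names invented for this sketch and do not reproduce the actual Mail Fetch interfaces. Treating a size limit of 0 as "no restriction" follows the SizeMailFilter description given later in the configuration document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical message summary: just the fields the sample filters inspect. */
final class MailSummary {
    final String sender;
    final int sizeInBytes;

    MailSummary(String sender, int sizeInBytes) {
        this.sender = sender;
        this.sizeInBytes = sizeInBytes;
    }
}

/** Hypothetical filter contract: true lets the message continue, false stops the pipeline. */
interface SketchMailFilter {
    boolean accept(MailSummary mail);
}

final class FilterPipeline {
    private final List<SketchMailFilter> globalFilters;
    private final List<SketchMailFilter> localFilters;

    FilterPipeline(List<SketchMailFilter> globalFilters, List<SketchMailFilter> localFilters) {
        this.globalFilters = new ArrayList<>(globalFilters);
        this.localFilters = new ArrayList<>(localFilters);
    }

    /** Global filters are consulted before the maildrop-local ones; the first rejection wins. */
    boolean shouldDownload(MailSummary mail) {
        return passesAll(globalFilters, mail) && passesAll(localFilters, mail);
    }

    private static boolean passesAll(List<SketchMailFilter> filters, MailSummary mail) {
        for (SketchMailFilter filter : filters) {
            if (!filter.accept(mail)) {
                return false;
            }
        }
        return true;
    }

    /** Size-based filter; a limit of 0 is treated as "no size restriction". */
    static SketchMailFilter sizeFilter(int maxSize) {
        return mail -> maxSize == 0 || mail.sizeInBytes <= maxSize;
    }

    /** Sender-based filter that rejects mail from any address in the blocklist. */
    static SketchMailFilter senderBlocklist(Set<String> blocked) {
        Set<String> copy = new HashSet<>(blocked);
        return mail -> !copy.contains(mail.sender);
    }

    public static void main(String[] args) {
        FilterPipeline pipeline = new FilterPipeline(
                Arrays.asList(sizeFilter(1548576)),
                Arrays.asList(senderBlocklist(Collections.singleton("spammer@example.com"))));
        System.out.println(pipeline.shouldDownload(new MailSummary("friend@example.com", 1200)));  // true
        System.out.println(pipeline.shouldDownload(new MailSummary("spammer@example.com", 1200))); // false
    }
}
```

Keeping the two lists separate mirrors the point that "global" and "local" describe how a filter is used, not what kind of filter it is: the same filter object could appear in either list.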

There is also the option of delivery agents. After a message passes through all the filters and none of them objects to its being downloaded, it is downloaded and handed to a Delivery Agent, which is responsible for delivering it (to a mailbox, maildir, SMTP host etc). Mail Fetch supports different kinds of delivery agents, and you can choose one of them for delivery of your mail. You can go so far as to make each of your maildrops deliver messages to a different delivery agent! A maildrop needs to specify the delivery agent to use when all the filters let the message pass through. Each delivery agent has an id and can thereafter be referred to by that id. Some filters support delivering messages to a delivery agent specified in their configuration. For example, all messages from execve@users.sourceforge.net would go to the execve mailbox if the SenderMailFilter is configured accordingly.

A user can very easily add his or her own filter to the current set of provided filters by implementing certain interfaces. Examples of filters are spam control, size restrictions etc. Mail Fetch downloads the email if it matches the criteria and can then deliver it using one of its delivery options. Currently, one can choose to deliver mail to a mailbox or to an SMTP server.

To add your own filter, implement the MailFetch.filters.MailFilter interface. You will need to specify the name of the class you have implemented in the configuration, so that Mail Fetch can initialize it as required. Note that the class has to be in the system classpath. This can be easily achieved by putting the class in a jar and putting the jar in the lib directory. The startup script picks up all the jars from that directory and places them in the classpath before invoking Mail Fetch.

All the delivery agents specified in the configuration are available to the filters through a MailFetch.delivery.DeliveryManager object. This object allows access to these agents based on their ids. NOTE that the id of the agent has to be known by the filter requesting the agent. A Delivery Event is generated when a message is delivered after passing through all the filters. NOTE that there is no event generated when a filter itself delivers a message through an agent.

The easiest way to get the hang of implementing the filter of your choice is to get hold of the source and check out some of the implemented filters (like NullMailFilter!). That's about it in terms of Mail Fetch configuration. Go on, open the conf/MailFetch.xml file in the Mail Fetch directory and play with it. Do let me know of any problems you face; let me know even if you don't.

Table of Contents
===========
1. Introduction
2. Some Definitions
    2.1 Maildrops
    2.2 Filters
    2.3 DeliveryAgents
    2.4 Events
3. Detailed Configuration
    3.1 Maildrop
    3.2 Mailfilters
        3.2.1 Global filters
        3.2.2 Local filters
        3.2.3 All filters explained
    3.3 Delivery Agents
    3.4 Miscellaneous Configuration
4. Sample configuration file
5. Advanced usage

1. Introduction:

Mail Fetch is an application to access your remote email. It supports popular mail protocols like POP3 and IMAP. You can download your mail to your local machine and use an email client to read it. Mail Fetch also comes with a very powerful and flexible filtering system. In fact, Mail Fetch comes with a range of filters out of the box, so you can get started immediately. These filters range from those which prevent you from getting the same message twice to those which help in spam filtering. Mail Fetch is written in Java and so has the advantage of running on most platforms. Configuration is text-based and is an XML file. This document tells you how to configure Mail Fetch. It details the various configuration options available and also provides a sample configuration file.

2. Some Definitions:

2.1 Maildrops

A Maildrop is the mailbox from where you download your mail. Characteristics of a maildrop are the protocol (POP3, IMAP, NNTP), the username, password, hostname, port number, the default Delivery Agent for that maildrop, any filters for that maildrop and finally any protocol-specific configuration for the maildrop. For example, an NNTP maildrop would contain newsgroup information, which is not used by a POP3 or IMAP maildrop.

2.2 Filters

A Filter (also called a Mail Filter interchangeably) is the core of the decision making in Mail Fetch. A filter decides on a per-mail basis whether the message should be downloaded or not. A pipeline of filters is set up (yes, again set up in the configuration) and a message which needs to be downloaded is passed through this pipeline. At any point of the pipeline, a filter could indicate that the message should not be processed through the pipeline anymore. For example, a SPAM filter (sender based) could find a match in the list of spammers it has and reject the message. There are two kinds of filters -- global and local. These are not an attribute of a filter itself, but rather depend on the usage of a filter. Local filters are associated with a maildrop whereas global filters are applicable to all maildrops. For example, you might want a Message-ID filter to be applicable to all maildrops but keep a sender-based filter only for the maildrop where you expect mail from that sender.

2.3 DeliveryAgents

A Delivery Agent has the responsibility of delivering mail. The currently supported mediums are SMTP and mailbox. Delivery Agents are identified by unique ids in the configuration. Maildrops have a default Delivery Agent configured, which is used if the message passes through the Mail Filter pipeline successfully. Some filters also accept a Delivery Agent attribute in the configuration.
What this implies is that if the message matches the Filter's criteria, the Filter delivers the message using this Delivery Agent. This also allows for simple filtering mechanisms. For example, you might want all likely SPAM to be delivered to a special mailbox where you can later check for false positives.

2.4 Events


Event is an internal concept of MailFetch. If you are only going to use MailFetch and the filters it provides out of the box, you don't need to understand this concept. If you are extending MailFetch by developing your own MailFilters, you will need to understand this concept. Whether you actually use it depends on the MailFilter functionality itself. Events are a mechanism by which a MailFilter can be notified when something interesting happens in MailFetch. Currently, we only generate events for the delivery of a message. Let us take the example of the MessageIDMailFilter. This filter rejects messages whose message-ids have already been downloaded. This avoids receiving duplicate messages, for example when you are subscribed to two mailing lists and a cross-posting happens. It maintains a list of message-ids which we have already downloaded. The list is saved on the disk after the download of every message so that if the session is interrupted for any reason, the message is not re-downloaded. So, the Mail Filter implements a DeliveryLister and hence gets the delivery event.

3. Detailed Configuration:

3.1 Maildrop

You can have more than one maildrop for MailFetch to download mail from. MailFetch downloads mail for them in the order in which they are configured. Here is a sample maildrop configuration:

    <maildrop protocol="pop3" mda="smtp">
        <host>mail.somepopserver.com</host>
        <port>110</port>
        <user>myusername</user>
        <password>mypass</password>
        <delete>true</delete>
        <!-- filters specific to this maildrop -->
        <filters>
        </filters>
    </maildrop>

The protocol attribute can be one of pop3, imap or nntp (EXPERIMENTAL). The mda attribute specifies the default delivery agent when the message is ready to be downloaded. See Delivery Agents for more information. This requires a delivery agent called "smtp" to be configured.
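To make the structure of the sample maildrop element concrete, the sketch below reads it with the DOM parser that ships with the JDK. Mail Fetch's own configuration loader is not shown in this document, so the class name MaildropConfigSketch and its fields are assumptions made purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical holder for the settings of one maildrop element (not Mail Fetch's own class). */
final class MaildropConfigSketch {
    final String protocol;  // pop3, imap or nntp
    final String mda;       // id of the default delivery agent
    final String host;
    final int port;
    final String user;
    final boolean delete;   // delete messages from the maildrop after download

    private MaildropConfigSketch(Element maildrop) {
        protocol = maildrop.getAttribute("protocol");
        mda = maildrop.getAttribute("mda");
        host = childText(maildrop, "host");
        port = Integer.parseInt(childText(maildrop, "port"));
        user = childText(maildrop, "user");
        delete = Boolean.parseBoolean(childText(maildrop, "delete"));
    }

    /** Parses a single maildrop element given as an XML string. */
    static MaildropConfigSketch parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return new MaildropConfigSketch(root);
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid maildrop configuration", e);
        }
    }

    /** Returns the trimmed text of the first child element with the given tag name. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException("Missing <" + tag + "> element");
        }
        return nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<maildrop protocol=\"pop3\" mda=\"smtp\">"
                + "<host>mail.somepopserver.com</host><port>110</port>"
                + "<user>myusername</user><password>mypass</password>"
                + "<delete>true</delete><filters></filters></maildrop>";
        MaildropConfigSketch drop = parse(xml);
        System.out.println(drop.protocol + "://" + drop.user + "@" + drop.host + ":" + drop.port);
    }
}
```

The password is deliberately left out of the holder in this sketch; a real loader would need it for authentication.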


Host, port, user and password are the parameters for connection and authentication. Setting delete to true makes MailFetch delete messages from the maildrop once they are downloaded. The filters configured *inside* the maildrop element are the local maildrop filters and will not affect other maildrops. Some maildrops, like NNTP, have extra configuration parameters such as the newsgroups which have to be downloaded. Please note that NNTP support is EXPERIMENTAL and is not yet stable.

3.2 Mailfilters:

3.2.1 Global filters

All filters which are configured outside the maildrop elements are called global filters. These filters affect all the maildrops. The configuration is the same for both global and local filters.

Here is a sample global filters configuration:

    <filters>
        <!-- size filter -->
        <filter class="MailFetch.filters.SizeMailFilter" maxsize="1548576" delete="false">
        </filter>
        <!-- sender mail filter -->
        <filter class="MailFetch.filters.SenderMailFilter" delete="true"
                blocklist="/home/gautam/MailFetch/spool/blocklist" mda="junk">
        </filter>
        <!-- msgid filter -->
        <filter class="MailFetch.filters.MessageIDMailFilter" delete="true">
            <storage name="msgid.cache" limit="8192" destination="spool/msgid.cache"/>
        </filter>
        <!-- subject mail filter -->
        <filter class="MailFetch.filters.SubjectMailFilter" delete="true"
                blocklist="/home/gautam/MailFetch/spool/subject.blocklist" mda="junk">
        </filter>
    </filters>
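The msgid filter entry in the sample above, together with the delivery events described in section 2.4, suggests the following shape for a duplicate-suppressing filter. This is a minimal in-memory sketch, not the MessageIDMailFilter source: the class name, the method names, and the choice to evict the oldest id once the limit is reached are all assumptions, and saving the list to disk is omitted.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical duplicate filter keyed on Message-ID, with a bounded in-memory list. */
final class MessageIdFilterSketch {
    private final int limit;                                  // maximum number of ids to remember
    private final Set<String> seen = new LinkedHashSet<>();   // insertion-ordered for eviction

    MessageIdFilterSketch(int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.limit = limit;
    }

    /** Filter decision: accept only messages whose id has not been delivered before. */
    boolean accept(String messageId) {
        return !seen.contains(messageId);
    }

    /**
     * Delivery-event callback: remember the id of a delivered message.
     * Assumption: when the list is full, the oldest id is dropped first.
     */
    void messageDelivered(String messageId) {
        if (seen.add(messageId) && seen.size() > limit) {
            Iterator<String> oldest = seen.iterator();
            oldest.next();
            oldest.remove();
        }
    }

    public static void main(String[] args) {
        MessageIdFilterSketch filter = new MessageIdFilterSketch(8192);
        String id = "<1896.697170952@dbc.mtview.ca.us>";
        System.out.println(filter.accept(id));  // true: not seen yet
        filter.messageDelivered(id);
        System.out.println(filter.accept(id));  // false: duplicate
    }
}
```

Recording the id in the delivery callback rather than in accept reflects the text above: an id should only count as downloaded once the message has actually been delivered.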


See "All filters explained" for a detailed explanation of all provided filters.

3.2.2 Local filters

All filters which are configured inside the maildrop elements are called local filters. These filters affect only the maildrop associated with them. The configuration is the same for both global and local filters. Here is a sample local filters configuration:

    <maildrop protocol="pop3" mda="smtp">
        <!-- .... standard maildrop config goes here .... -->
        <!-- filters specific to this maildrop -->
        <filters>
            <!-- sender mail filter -->
            <filter class="MailFetch.filters.SenderMailFilter" delete="true"
                    blocklist="/home/gautam/MailFetch/spool/linuxlist" mda="linux">
            </filter>
        </filters>
    </maildrop>

3.2.3 All filters explained

FILTER NAME: HeaderMailFilter
DESCRIPTION: Matches a header in the message. This requires the name of the header and the value of the header.
CLASS NAME: MailFetch.filters.HeaderMailFilter
SAMPLE CONFIGURATION:

    <filter class="MailFetch.filters.HeaderMailFilter" delete="true"
            name="X-Spam-Rating" value="SPAM" mda="spambox">
    </filter>

EXPLANATION: This filter allows you to filter messages based on the value of a particular header. The mda attribute is optional and allows you to direct the message to the specified delivery agent if the message matches the criteria.

FILTER NAME: MessageIDMailFilter
DESCRIPTION: Filters messages if they contain a duplicate Message-ID. This filter stores the list of downloaded message-ids in the specified file.
CLASS NAME: MailFetch.filters.MessageIDMailFilter
SAMPLE CONFIGURATION:


    <filter class="MailFetch.filters.MessageIDMailFilter" delete="true">
        <storage name="msgid.cache" limit="8192" destination="spool/msgid.cache"/>
    </filter>

EXPLANATION: The name attribute of the storage element is a friendly name for the repository. limit specifies the maximum number of elements to allow in the list. The destination attribute is the actual file in which the list is stored.

FILTER NAME: NullMailFilter
DESCRIPTION: This filter consumes all messages. It also marks them for deletion.
CLASS NAME: MailFetch.filters.NullMailFilter
SAMPLE CONFIGURATION:

    <filter class="MailFetch.filters.NullMailFilter" />

EXPLANATION: This is a special filter; it could be used to clean up the maildrop, for example. It is also a DANGEROUS filter; you have been warned.

FILTER NAME: RecipientMailFilter
DESCRIPTION: This filter matches the recipients of the message against those provided in a list.
CLASS NAME: MailFetch.filters.RecipientMailFilter
SAMPLE CONFIGURATION:

    <filter class="MailFetch.filters.RecipientMailFilter" delete="true"
            blocklist="/home/gautam/MailFetch/spool/pers" mda="personal">
    </filter>

EXPLANATION: This filter checks if the recipients of the message (TO and CC) exist in the defined list. The mda attribute is optional. blocklist is the file containing the recipient addresses (one on each line).

FILTER NAME: SenderMailFilter
DESCRIPTION: This filter matches the sender of the message against those provided in a list.
CLASS NAME: MailFetch.filters.SenderMailFilter
SAMPLE CONFIGURATION:


    <filter class="MailFetch.filters.SenderMailFilter" delete="true"
            blocklist="/home/gautam/MailFetch/spool/block" mda="junk">
    </filter>

EXPLANATION: This filter checks if the sender of the message exists in the defined list. The mda attribute is optional. blocklist is the file containing the sender addresses (one on each line).

FILTER NAME: SizeMailFilter
DESCRIPTION: This filters messages based on their size.
CLASS NAME: MailFetch.filters.SizeMailFilter
SAMPLE CONFIGURATION:

    <filter class="MailFetch.filters.SizeMailFilter" maxsize="1548576" delete="false">
    </filter>

EXPLANATION: maxsize is the maximum size of a message which is permitted to be downloaded. The size is in bytes. A maxsize of 0 indicates that the size restriction is lifted.

FILTER NAME: SubjectMailFilter
DESCRIPTION: This filter does subject-based filtering based on a list.
CLASS NAME: MailFetch.filters.SubjectMailFilter
SAMPLE CONFIGURATION:

    <filter class="MailFetch.filters.SubjectMailFilter" delete="true"
            blocklist="spool/virus_list" mda="possible.virus">
    </filter>

EXPLANATION: This filter is similar to the sender and recipient filters except that it filters based on the subject of the message. The mda attribute is optional.

3.3 Delivery Agents

After passing through the filter pipeline, mail is delivered using a DeliveryAgent. Currently we provide two main delivery mechanisms: mbox and smtp. SMTP is the most reliable mechanism, although it requires that you have an MTA configured for delivery.

DELIVERY AGENT NAME: Mailbox
DESCRIPTION: Delivers a message to the specified mbox.
CLASS NAME: MailFetch.delivery.MailboxDeliveryAgent
SAMPLE CONFIGURATION:

<mda class="MailFetch.delivery.MailboxDeliveryAgent" id="junk">
  <destination>/home/gautam/Mail/junkmail</destination>
</mda>
EXPLANATION : The destination element identifies the location of the mbox where the delivery is made. Some basic dot-locking functionality is provided by the mbox provider to avoid simultaneous access to the mbox.

DELIVERY AGENT NAME : SMTP
DESCRIPTION : Delivers the message to a configured SMTP host.
CLASS NAME : MailFetch.delivery.SMTPDeliveryAgent
SAMPLE CONFIGURATION :
<mda class="MailFetch.delivery.SMTPDeliveryAgent" id="smtp">
  <host>localhost.localdomain</host>
  <port>25</port>
  <localuser>gautam</localuser>
  <domain>localhost</domain>
  <user></user>
  <password></password>
</mda>
EXPLANATION : The localuser element defines who the email is directed to. The domain is the domain of the local user. In this case, the email is dispatched to gautam@localhost. user and password are used if your server requires SMTP authentication.

DELIVERY AGENT NAME : NULL
DESCRIPTION : A Null Delivery Agent does nothing. It is basically equivalent to dumping the mail into /dev/null.
CLASS NAME : MailFetch.delivery.NullDeliveryAgent
SAMPLE CONFIGURATION :
<mda class="MailFetch.delivery.NullDeliveryAgent" id="null">
</mda>
EXPLANATION : There is no configuration for this delivery agent. Please use it with care, as you could very easily lose all your mail through a misconfiguration.

3.4 Miscellaneous Configuration
Polling: Polling time is the time Mail Fetch waits between mail downloading sessions. For example

<poll>120</poll>
specifies the polling time as 120 seconds (2 minutes). A non-positive polling time indicates that Mail Fetch should just run through the maildrop list and download messages once.

Logging: I would recommend turning logging on, as it gives you a very good idea of what is happening in the system. All exceptions are logged, so nothing will escape your eye. Mail Fetch does light to medium logging at the DEBUG priority.
<log target="logs/MailFetch.log" priority="DEBUG" enabled="true" />
The target attribute specifies the file where MailFetch should log all its data. The priority attribute specifies the logging priority. The logging priorities are DEBUG, INFO, WARN, ERROR and FATAL_ERROR. The enabled attribute is optional and is treated as true by default.

4. Sample configuration file
A sample configuration file is included with the MailFetch distribution. You will need to customize the configuration file according to your needs and requirements. Refer to this document when configuring the file. Below is a small configuration file to give you some idea of how to go about modifying the configuration.
<MailFetch>
  <!-- Poll for new mail every ten minutes -->
  <poll>600</poll>
  <!-- Enable logging -->
  <log target="logs/MailFetch.log" priority="DEBUG" enabled="true" />
  <!-- First the delivery agents -->
  <!-- I do my delivery of mail through SMTP -->
  <mda class="MailFetch.delivery.SMTPDeliveryAgent" id="smtp">
    <host>localhost</host>
    <port>25</port>
    <localuser>gautam</localuser>
    <domain>localhost</domain>
    <user></user>
    <password></password>
  </mda>

  <!-- Mailbox delivery for suspect SPAM -->
  <mda class="MailFetch.delivery.MailboxDeliveryAgent" id="spam">
    <destination>/home/gautam/Mail/spam</destination>
  </mda>
  <!-- Get a lot of personal mail -->
  <mda class="MailFetch.delivery.MailboxDeliveryAgent" id="pers">
    <destination>/home/gautam/Mail/personal</destination>
  </mda>
  <!-- Global filters begin; apply to all maildrops -->
  <filters>
    <!-- Size filter comes first, I am on a dialup :( -->
    <filter class="MailFetch.filters.SizeMailFilter" max-size="102400" delete="true">
    </filter>
    <!-- SPAM blocking filter is next -->
    <filter class="MailFetch.filters.SenderMailFilter" delete="true" blocklist="/home/gautam/MailFetch/conf/blocklist" mda="spam">
    </filter>
  </filters>
  <!-- My maildrops go next -->
  <maildrop protocol="pop3" mda="smtp">
    <host>mail.somepopserver.com</host>
    <port>110</port>
    <user>myusername</user>
    <password>mypass</password>
    <delete>true</delete>
    <!-- filters specific to this maildrop -->
    <filters>
      <filter class="MailFetch.filters.SenderMailFilter" delete="true" blocklist="/home/gautam/MailFetch/conf/friendlist" mda="pers">
      </filter>
    </filters>
  </maildrop>

</MailFetch>
The above is just a sample configuration file. You will need to customize your configuration depending on what kind of filtering meets your requirements.

5. Advanced usage
If you find that you need customized filtering, you may want to write your own mail filters. The easiest way to understand how to do this is to look at the filters available in the MailFetch distribution. Good filters to start with are NullMailFilter, SizeMailFilter, SubjectMailFilter and MessageIDMailFilter. These should cover most common uses. Once you have written your MailFilter, you need to include it in the filter configuration. In addition, MailFetch requires it to be on the system classpath in order to load it. This can be achieved simply by putting the relevant classes in a jar and placing it in the lib directory. The run scripts add all the jars to the classpath.

MailFetch is an application to download your email through protocols like POP3 and IMAP. The decision of whether to download a mail or not is made through a sequence of filters. By implementing certain interfaces, a user can very easily add his/her own filter to the current set of provided filters. Examples of filters are spam control, size restrictions etc. MailFetch downloads the email if it matches the criteria and can then deliver it using one of its delivery options. One can choose to deliver mail to a mailbox or to an SMTP server. I wrote MailFetch as a replacement for fetchmail when I wanted a filter mechanism before downloading the mail (since I use a dial-up line). MailFetch can process multiple maildrops with individual filter mechanisms and delivery options. MailFetch is written in the Java programming language and has an extensible XML-based configuration.

Configuration
TO SET UP MAILFETCH, FOLLOW THESE STEPS:
* If you have the source, compile using the build.bat batch file.
* Now, enter the dist directory and edit the conf/MailFetch.xml file. This is the configuration file for MailFetch. Refer to the Configuration.txt file in the docs directory for a detailed description of the configuration file.
* Now you can run MailFetch by executing the run.bat file in the dist directory.
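
To make the idea of writing your own filter (section 5, Advanced usage) more concrete, here is a minimal sketch of a filter pipeline in Java. It is not the MailFetch source: the MailSummary class, the SketchFilter interface and all class names below are hypothetical, and the real MailFetch interfaces may look different. The sketch only mirrors two documented rules: a max-size of 0 lifts the size restriction, and a sender found in the block list is matched. The delete and mda attributes are not modelled; a match simply stops the pipeline.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical message view: only the properties the filters below need.
final class MailSummary {
    final String sender;
    final String subject;
    final long sizeInBytes;

    MailSummary(String sender, String subject, long sizeInBytes) {
        this.sender = sender;
        this.subject = subject;
        this.sizeInBytes = sizeInBytes;
    }
}

// Hypothetical filter contract; the real MailFetch interface may differ.
interface SketchFilter {
    /** Returns true if the message may continue down the pipeline. */
    boolean accept(MailSummary mail);
}

// Mirrors the documented max-size rule: 0 lifts the size restriction.
final class SketchSizeFilter implements SketchFilter {
    private final long maxSize;

    SketchSizeFilter(long maxSize) {
        this.maxSize = maxSize;
    }

    @Override
    public boolean accept(MailSummary mail) {
        return maxSize == 0 || mail.sizeInBytes <= maxSize;
    }
}

// Stops messages whose sender appears in a block list (one address per entry).
final class SketchSenderFilter implements SketchFilter {
    private final Set<String> blocked = new HashSet<>();

    SketchSenderFilter(List<String> blocklist) {
        for (String address : blocklist) {
            blocked.add(address.trim().toLowerCase(Locale.ROOT));
        }
    }

    @Override
    public boolean accept(MailSummary mail) {
        return !blocked.contains(mail.sender.trim().toLowerCase(Locale.ROOT));
    }
}

final class SketchPipeline {
    private final List<SketchFilter> filters;

    SketchPipeline(List<SketchFilter> filters) {
        this.filters = filters;
    }

    // A message is downloaded only if every filter accepts it;
    // the first rejection stops further processing.
    boolean shouldDownload(MailSummary mail) {
        for (SketchFilter filter : filters) {
            if (!filter.accept(mail)) {
                return false;
            }
        }
        return true;
    }
}

final class FilterPipelineDemo {
    public static void main(String[] args) {
        SketchPipeline pipeline = new SketchPipeline(Arrays.<SketchFilter>asList(
                new SketchSizeFilter(102400),
                new SketchSenderFilter(Arrays.asList("spammer@example.com"))));
        // prints true: small message from an unlisted sender
        System.out.println(pipeline.shouldDownload(new MailSummary("stud1@nit2", "Small Mail", 769)));
        // prints false: sender is on the (made-up) block list
        System.out.println(pipeline.shouldDownload(new MailSummary("spammer@example.com", "Offer", 500)));
    }
}
```

Keeping each filter behind one small interface is what lets global and per-maildrop filters be chained in any order taken from the configuration.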

Input design is the process of converting user-originated information into a computer-based format. The goal of designing input data is to make data entry as easy and error-free as possible. An input format should be easy to understand. In this product the inputs are simply messages, i.e. mails. Every mail has properties such as sender, subject line, body, message-id and so on. These inputs are taken automatically from the messages inside the mailbox and are used to decide whether to drop a message or not. The output design relies on the input, which is used to produce the output. Hence input design needs special attention.
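
As an illustration of how the message properties mentioned above (sender, subject, message-id) could be read from a raw message, here is a small sketch in plain Java. It is an assumption-laden example, not the project's code: the project itself uses the Java Mail API, whereas the HeaderSketch class and its parseHeaders method below are hypothetical helpers that only handle simple headers and folded continuation lines.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class HeaderSketch {
    // Parses the header block (everything before the first empty line) into a
    // map keyed by lower-cased header name. Folded continuation lines (lines
    // starting with a space or tab) are appended to the previous header.
    // A repeated header keeps its last value in this simplified sketch.
    static Map<String, String> parseHeaders(String rawMessage) {
        Map<String, String> headers = new LinkedHashMap<>();
        String current = null;
        for (String line : rawMessage.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // an empty line ends the header block
            }
            boolean continuation = line.startsWith(" ") || line.startsWith("\t");
            if (continuation && current != null) {
                headers.put(current, headers.get(current) + " " + line.trim());
                continue;
            }
            int colon = line.indexOf(':');
            // Skip lines that are not "Name: value", such as the mbox
            // "From sender date" separator, whose text before the colon has spaces.
            if (colon <= 0 || line.substring(0, colon).contains(" ")) {
                current = null;
                continue;
            }
            current = line.substring(0, colon).toLowerCase(Locale.ROOT);
            headers.put(current, line.substring(colon + 1).trim());
        }
        return headers;
    }

    public static void main(String[] args) {
        String raw = "From stud1@nit2 Sun Mar 7 14:32:12 2004\n"
                + "From: \"stud1\" <stud1@nit2>\n"
                + "To: <stud2@nit2>\n"
                + "Subject: Small Mail\n"
                + "\n"
                + "This is Small Mail......\n";
        Map<String, String> headers = parseHeaders(raw);
        System.out.println(headers.get("from"));
        System.out.println(headers.get("subject"));
    }
}
```

A filter can then work on the resulting map (for example the "from" or "subject" entry) without ever looking at the message body.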

Output reflects the image of the organization. Output design involves designing form layouts, making lists, producing well-designed reports and so on; reports are the main outputs of the proposed system. Here the outputs are LOG FILES, which record everything handled by the server that is relevant to this project, including error messages.

Database design deals with databases and database management systems, and explores how to use relationships in a pool of data when developing methods for data storage and retrieval. Databases allow data to be shared among different applications. A database is not used in this product. We simply record the details of how a particular transaction is handled by the server in log files, which are stored on permanent disk at a specified location.

James Server Screen

Creating User Accounts using Telnet

Using Microsoft Outlook Express: Entering Our Mailbox

Sending a Test Mail from Our Opened Mailbox to Another Mailbox on Which Filters Are Applied

Test mail From Stud1 To Stud2

OUTPUT: Log Information

INFO 15 [JFetch ] (main): Starting up JFETCH version 1.1a
INFO 31 [filterpr] (main): FilterProcessor: set up 1 filters
INFO 47 [filterpr] (main): FilterProcessor: set up 0 filters
INFO 156 [JFetch ] (main): JFetch init done.
INFO 156 [JFetch ] (main): Starting processing of maildrops ...
INFO 172 [maildrop] (main): Connecting to stud2@localhost ...
INFO 406 [maildrop] (main): Got 2 messages in maildrop..
DEBUG 562 [mboxdeli] (main): Mailbox f:/work/mails opened
DEBUG 578 [maildrop] (main): Starting download of message #1 of 2 (769)
DEBUG 890 [mboxdeli] (main): Appended message: stud1@nit2, Sun Mar 07 02:33:34 2004, , Sub: Small Mail, 769
DEBUG 890 [maildrop] (main): Delivered, From: stud1@nit2, Sub: Small Mail, (769)
DEBUG 922 [maildrop] (main): Starting download of message #2 of 2 (8126)
DEBUG 922 [sizefilt] (main): Filtering message of size 8126
DEBUG 969 [mboxdeli] (main): Mailbox f:/work/mails closed.
INFO 984 [maildrop] (main): Closing connection to stud2@localhost ...
INFO 984 [JFetch ] (main): Done, processed 1 maildrops.

Mail Box Of Stud2

CONTENT IN F:\WORK\MAILS FILE


From stud1@nit2 Sun Mar 7 14:32:12 2004
Return-Path: <stud1@nit2>
Received: from nit2 ([127.0.0.1])
          by nit2 (JAMES SMTP Server 2.1.3) with SMTP ID 172
          for <stud2@nit2>;
          Sun, 7 Mar 2004 14:32:13 +0530 (GMT+05:30)
Message-ID: <001f01c40422$e12c6a00$0100007f@nit2>
From: "stud1" <stud1@nit2>
To: <stud2@nit2>
Subject: Small Mail
Date: Sun, 7 Mar 2004 14:32:12 +0530
MIME-Version: 1.0
Content-Type: multipart/alternative;
        boundary="----=_NextPart_000_001C_01C40450.FAD8BF20"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4927.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4927.1200
Delivered-To: stud2@nit2

This is a multi-part message in MIME format.

------=_NextPart_000_001C_01C40450.FAD8BF20
Content-Type: text/plain;
        charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

This is Small Mail......

------=_NextPart_000_001C_01C40450.FAD8BF20
Content-Type: text/html;
        charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content=3D"text/html; charset=3Diso-8859-1" =
http-equiv=3DContent-Type>
<META content=3D"MSHTML 5.00.3700.6699" name=3DGENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=3D#ffffff>
<DIV><FONT face=3DArial size=3D2>This is Small =
Mail......</FONT></DIV></BODY></HTML>

------=_NextPart_000_001C_01C40450.FAD8BF20--

Testing is one of the most important phases of software development. In the software development life cycle (SDLC), the main aim of the testing process is quality: the developed software is tested to check that it attains the required functionality and performance. During testing, the software is exercised with particular test cases, and the output of each test case is analyzed to determine whether the software is working as expected. The success of the testing process in finding errors depends mostly on the test case criteria. To test any software we need a description of the expected behaviour of the system and a method of determining whether the observed behaviour conforms to the expected behaviour. Since errors can be injected into the software at any stage, testing has to be carried out at different levels during development. The basic levels of testing are Unit, Integration, System and Acceptance Testing.

Unit testing is carried out on the code. Here, individual modules are tested against the specifications produced for them during design. In integration testing, the tested modules are combined into subsystems and tested. In system testing, the full software is tested. At the next level, acceptance testing, the system is tested against the user requirements document prepared during the SRS phase.

There are two basic approaches to testing.

In Functional Testing, test cases are decided solely on the basis of the requirements of the program or module; the internals of the program or module are not considered when selecting test cases. This is also called Black Box Testing.

In Structural Testing, test cases are generated from the actual code of the program or module to be tested. This is also called White Box Testing.

A number of activities must be performed to test software. Testing starts with a test plan. The test plan identifies all testing-related activities that need to be performed, along with the schedule and guidelines for testing. The plan also specifies the levels of testing that need to be done by identifying the different testing units. For each unit specified in the plan, first the test cases and then the reports are produced. These reports are analyzed.

The test plan is a general document for the entire project, which defines the scope, the approach to be taken and the personnel responsible for the different testing activities. The inputs for forming the test plan are:
Project plan
Requirements document
System design

Although there is one test plan for the entire project, test cases have to be specified separately for each test unit. The test case specification gives, for each item to be tested, all test cases and the outputs expected for those test cases. The steps to be performed to execute the test cases are specified in a separate document called the test procedure specification. This document specifies any special requirements that exist for setting up the test environment and describes the methods and formats for reporting the results of testing.

Unit testing focused first on the smallest, lowest-level modules, proceeding one at a time. Bottom-up testing was performed on each module. A driver program that tests the modules was developed or used. For the purpose of testing, the modules themselves were used as stubs to print verification of the actions performed. After the lower-level modules were tested, the modules at the next higher level, which make use of the lower-level modules, were tested. Each module was tested against the required functionality, and test cases were developed to test the boundary values.

Integration testing is a systematic technique for constructing the program structure while at the same time conducting tests to uncover errors associated with interfacing. As the system consists of a number of modules, the interfaces to be tested were those between the edges of two modules. The software was tested using an incremental bottom-up approach. The bottom-up integration strategy was implemented with the following steps: low-level modules were combined into clusters that perform specific software sub-functions, and the clusters were then tested.

System testing is a series of different tests whose primary purpose is to fully exercise the computer-based system. It also tests for discrepancies between the system and its original objectives and current specifications.

Mail Filter System Test Cases & System Test Report
The system test cases listed below are expected to work and give the expected behaviour if Windows Explorer is configured to run jar files as described in the project folder, the necessary library files and standard jar files are in the appropriate project directories, and the path and classpath environment variables are set appropriately.
Columns of the test table: Test C.No., INPUT, EXPECTED BEHAVIOUR, OBSERVED BEHAVIOUR, STATUS (P = Passed, F = Failed). An entry of -do- in the OBSERVED BEHAVIOUR column means the same as the expected behaviour.

Test C.No. : 1a
INPUT : The Mail Filter Tool application jar is double-clicked in Windows Explorer.
EXPECTED BEHAVIOUR : The application should be launched and show the initial Mail Filter Application swing window.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 1b
INPUT : The Mail Filter Tool application jar is double-clicked in Windows Explorer.
EXPECTED BEHAVIOUR : The application should be launched and show the initial Mail Filter Application swing window. But the application could not launch due to some configuration settings or because the jar files are not set to open appropriately. A suitable error message such as "Entry point not found" or "Main class not found" is shown.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 2
INPUT : In the tester application window the user selects FILE -> New test. The test sequence is given a name to identify and save the test parameters.
EXPECTED BEHAVIOUR : The test application main window (swing UI) is displayed with the various fields empty for user input. A new properties file will be created in the application folder if the test is to be saved.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 3
INPUT : The tester enters the URL of the web resource to be tested and then also fills in the concurrency level (no. of simulated users), iterations (no. of loops), no. of screen parameters of the target application and any other test parameters, and clicks enter. In the subsequent tab displayed, the user enters the screen parameters of the target application. The user then clicks on the continue test button.
EXPECTED BEHAVIOUR : The user is initially shown the screen to enter the generic parameters and, upon clicking enter, the new tab to enter the specific parameters is shown. Here the user is shown the screen-specific parameters as name-value pairs (since the target application's parameters will not be known beforehand): the user has to enter the name of each parameter of the target servlet and also its value. The user is then shown a new screen with options to select the type of output display. The options available are: Graph 2D (bar chart), Summary Table, Detailed Table, Line Graph.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 4
INPUT : The user selects the Graph2D/Line Graph options in the above selection.
EXPECTED BEHAVIOUR : The user is then shown the HTTP parameters tab, in which the user may enter specific HTTP parameters.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 5
INPUT : The user enters the specific HTTP parameters as required by the current run of the test.
EXPECTED BEHAVIOUR : The application validates the HTTP parameters entered by the user for illegal entries and displays a suitable error message on invalid parameters.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 6
INPUT : The user finally clicks on the test button.
EXPECTED BEHAVIOUR : The application should try to connect to the target application by invoking the connection using the supplied parameters for the HTTP headers and the application-specific parameters. It should also instantiate the thread pool, accordingly open that many connections with the target application and return a connection code of success or failure.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 7a
INPUT : Depending upon the network conditions, the target application characteristics, the validity of the HTTP headers and the validity of the target URL / URL parameters, the connection may succeed or fail. These outcomes need to be conveyed to the end user with appropriate messages.
EXPECTED BEHAVIOUR : If the connection code is a success code, the user needs to be given a "Connection successful" message.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 7b
INPUT : As for 7a.
EXPECTED BEHAVIOUR : If the connection failed, the cause of failure with the HTTP error code has to be shown, e.g. (example codes only) 404 File not found, 602 Access Denied, 550 Network Transient.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 8
INPUT : The connection was successful. The tester application proceeds with the test.
EXPECTED BEHAVIOUR : The application will have to ping the targeted servlet with all the parameters as entered by the user. The HTTP headers used for the connection request need to be maintained across all the transactions of the tester application with the servlet. (This can be examined using packet sniffers.)
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 9
INPUT : The target application is down, is not running or is not reachable from the current host.
EXPECTED BEHAVIOUR : An error message saying "http://target servlet path not found" has to be displayed.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 10
INPUT : The target application could be reached but the invoking URL does not have the right parameters.
EXPECTED BEHAVIOUR : An error message saying "http://target servlet Invalid URL" has to be displayed.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 11
INPUT : The target application could be reached but the invoking host/URL pair does not have the privileges to connect to the target application, or authorization is necessary, or the target application is not on a mutually trusted network.
EXPECTED BEHAVIOUR : An error message saying "http://target servlet Access Denied" has to be displayed.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 12
INPUT : The target application could be reached and the response was received from the destined URL.
EXPECTED BEHAVIOUR : Depending upon the selection of the user in TC.No.4, the output of the user's target application is shown in the Graph or Table formats. The graph shows the proportion of good and failed packets.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 13
INPUT : The HTTP packets are all received and every connection has succeeded in getting back valid responses.
EXPECTED BEHAVIOUR : In this case 100% valid packets are indicated by a bar chart of full scale.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 14
INPUT : The responses received have a mixture of succeeded and failed packets.
EXPECTED BEHAVIOUR : The bar chart shows proportional bars of failed and succeeded packets.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 15
INPUT : The responses received have a mixture of succeeded and failed packets, but the selection in 4 was Detailed Table.
EXPECTED BEHAVIOUR : The table shows all the threads and their connection parameters with the corresponding responses and error codes.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 16
INPUT : The user changes the URL and tries steps 1 through 15 with a different application.
EXPECTED BEHAVIOUR : Appropriate changes depending upon the target application show up and refresh the table or the graph.
OBSERVED BEHAVIOUR : -do-
STATUS : P

Test C.No. : 17
INPUT : The user clicks on quit.
EXPECTED BEHAVIOUR : The application exits and all connections with the targeted applications are closed.
OBSERVED BEHAVIOUR : -do-
STATUS : P

It is the duty of the administrator to configure the filters. For this purpose, first place our JFetch directory on the mail server machine the administrator requires. You can then find an XML file in a subdirectory named conf. This file is easily readable, so the administrator can change the corresponding values to configure it for the chosen mail server. You can see the main part of that file below:
<maildrop protocol="pop3" mda="ld">
  <host>localhost</host>
  <port>110</port>
  <user>stud2</user>
  <password>pass2</password>
  <delete>false</delete>
  <!-- filters specific to this maildrop -->
  <filters>
  </filters>
</maildrop>
Here you can observe that we configured it for the James server, which serves the POP3 protocol and runs on our local system at port number 110. These filters are applied only to the stud2 maildrop (mailbox).

After the configuration is completed, the administrator has to create mailboxes for company personnel on the mail server using the Telnet tool, and then configure those mailboxes in the local mail client in line with the configuration done earlier. Open the mail client you use and follow the instructions it gives to configure the mailboxes created earlier on the mail server. At some point it asks you to specify the incoming mail server and the outgoing mail server; there you have to specify the IP address of the server on which you configured your filters. In MS Outlook Express the screen looks like this:

After that, your local mail client creates new accounts for the mailboxes you specified. You can then access those mailboxes from your local mail client and organize them as you like. Apart from this configuration, your installed filters work on all the mailboxes you specified in the configuration file above, here named conf.xml.
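
To show how the maildrop element from this section's configuration could be read programmatically, here is a minimal sketch using the XML parser that ships with the Java platform (javax.xml.parsers). It is not taken from the JFetch source: the MaildropSketch class and its parseFirst method are hypothetical names, and the sketch reads only the attributes and child elements shown in the sample above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical holder for the values of one <maildrop> element.
final class MaildropSketch {
    final String protocol;
    final String mda;
    final String host;
    final int port;
    final String user;
    final boolean delete;

    private MaildropSketch(Element e) {
        protocol = e.getAttribute("protocol");
        mda = e.getAttribute("mda");
        host = childText(e, "host");
        port = Integer.parseInt(childText(e, "port"));
        user = childText(e, "user");
        delete = Boolean.parseBoolean(childText(e, "delete"));
    }

    // Returns the trimmed text of the first descendant element with this tag.
    private static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    // Reads the first <maildrop> element found anywhere in the given XML text.
    static MaildropSketch parseFirst(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element maildrop = (Element) doc.getElementsByTagName("maildrop").item(0);
            return new MaildropSketch(maildrop);
        } catch (Exception ex) {
            throw new IllegalStateException("Could not read maildrop configuration", ex);
        }
    }
}

final class MaildropSketchDemo {
    public static void main(String[] args) {
        String xml = "<maildrop protocol=\"pop3\" mda=\"ld\">"
                + "<host>localhost</host><port>110</port>"
                + "<user>stud2</user><password>pass2</password>"
                + "<delete>false</delete><filters></filters></maildrop>";
        MaildropSketch m = MaildropSketch.parseFirst(xml);
        System.out.println(m.protocol + "://" + m.user + "@" + m.host + ":" + m.port);
    }
}
```

Reading the values into one immutable object keeps the rest of a program independent of the XML layout, so a change to the configuration file format would touch only this class.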

Mail Filters is a tool into which a lot of effort was put to make it filter correctly and efficiently. The developed system was tested with real data, and the users are satisfied with the performance of the system and its reports. This project was developed using the Java Mail API, one of the J2EE technologies, with the help of XML. Using this tool we can drop unwanted mails or messages automatically by specifying our restrictions in the corresponding files. This greatly reduces the administrator's workload, and a copy of a deleted message can be directed to a specified location for verification. This tool is very useful for the administration department of SAPARNA InfoTech Limited. It is also extensible, so you can add your own filters in future very simply without disturbing the existing code. This tool reduces manual work, saving both time and manpower. The time for processing and producing reports is considerably reduced. All the features are implemented and developed as per the requirements.

Basic Java Concepts : Thinking in JAVA (Bruce Eckel)
Java Mail API : Wrox Publications Volume I and II
An Integrated Approach to Software Engineering : Pankaj Jalote
Introduction to System Analysis and Design : I.T.Hawryszkiewycz
For UML diagrams : UML in 24 Hours Book
Some preferred websites : www.bruceeckel.com, www.sun.com/j2ee/mailapi, www.sun.com/j2se
