good afternoon/day/night/morning ... my name is jose nazario. i'm
an openbsd developer and security analyst living in ann arbor,
michigan, in the united states. i'm going to be presenting today on
some software i wrote recently called 'vthrottle'. the slides for 
the talk are on my website here:
http://monkey.org/~jose/presentations/vthrottle.d/

the concept behind vthrottle is not mine, it belongs to an HP
labs researcher named matt williamson. he and his research
group have been investigating using activity rate limiting
mechanisms to prevent network anomalies from growing too
quickly. i became aware of this research while writing a book
on internet worms which was published last month. 

in a nutshell the technique seems to have some promise to it, 
but i'm not entirely convinced it will work for everything. so,
i decided to write up and release some software to test it to
see how well it will work in the real world. this software is
only at version 0.30, but it is very stable and has most of the
features it will need for a 1.0 release. 

[slide 2]

so, very quickly, vthrottle is a mail server plugin that allows
you to modify server behavior. it operates at the MTA layer, or
mail transport agent. this is where the SMTP transaction occurs 
between servers and clients. this is different than your mail
client, which is called an MUA (mail user agent) or the software
which drops the mail off into your inbox, which is called the 
LDA (local delivery agent).
 
vthrottle doesn't stop worms, it only slows them down. you'll 
never win the race ahead of a worm, so what you want to do is 
to try and gain some extra time to react. to do this, and to do 
this for worms we haven't seen yet, we work on generic worm
properties. we never look at the payload, we only look at the
behavior of a host. 
 
the implementation is built as a milter plugin for sendmail
servers. milter is the plugin architecture for sendmail, which
lets you interact with the server and make judgements about mail.
 
[slide 3]
 
let's take a few minutes and talk about libmilter. libmilter is
the sendmail plugin architecture, as i said on the last slide.
it is a library for the client code, and the server needs support
for it as well at compile time. it is for sendmail only, not
for postfix, qmail, exim or the like.
 
it provides a framework for interacting with mail by working on
state transitions within SMTP. these transitions occur when
a client connects, when it says HELO, when it sets the "mail from",
when it says who the mail is going to, when it sends the headers,
when the headers end, when it sends the body of the mail, when the
body of the mail ends, when the message ends, and when the
client aborts the connection.
 
milter clients listen on a local domain (UNIX) socket or a 
network (IPv4 or IPv6) socket. this lets you run a single milter
server for your whole MTA farm and have it work with multiple SMTP
servers. 
 
[slide 4]
 
people have typically been using milter plugins for the following
four reasons.
 
the first is as a mail logger mechanism. this lets you copy all
or a subset of the messages that pass through a system, for
example for archiving or record keeping purposes.
 
the second is to use it as a statistics gatherer. this can
be things like a mail quota information gatherer, a connection
counter, or the like. i'm looking at using this information
to further a generic worm detection and control mechanism.
 
the third is as an antivirus subsystem hook. you can feed
the antivirus software the mail message itself and return information
about the message to the server. here you can do things like
reject or pass a mail message after inspection. 
 
the fourth one is as an anti-spam mechanism. similar to the
antivirus mechanism, you can pass the message or header information
to the anti-spam subsystem and pass or fail messages this way. 
 
[slide 5]
 
as i alluded to in the previous slides, milter can actually 
*do* things to your mail connections. you can react to a 
state transition request by giving a pass code (a 2xx series
reply), or a failure (4xx for a temporary failure or 5xx for
a permanent failure). coming in sendmail 8.13 you can also
quarantine messages using libmilter. 
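
as a small illustration of those reply classes, here is a minimal sketch in C that sorts a three-digit SMTP reply code by its first digit. the enum and function names are made up for this example; they are not part of libmilter, which uses its own status values.

```c
#include <assert.h>

/* hypothetical names for illustration; not libmilter's own status values */
enum reply_class { REPLY_PASS, REPLY_TEMPFAIL, REPLY_PERMFAIL, REPLY_OTHER };

/* classify a three-digit SMTP reply code by its first digit */
enum reply_class classify_reply(int code)
{
    assert(code >= 100 && code <= 599);
    switch (code / 100) {
    case 2:  return REPLY_PASS;      /* e.g. 250: accepted */
    case 4:  return REPLY_TEMPFAIL;  /* e.g. 451: try again later */
    case 5:  return REPLY_PERMFAIL;  /* e.g. 550: do not retry */
    default: return REPLY_OTHER;     /* 1xx and 3xx are not pass/fail verdicts */
    }
}
```

the practical difference is in what the client does next: a well-behaved client retries after a 4xx reply and gives up after a 5xx reply.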
 
you can also modify the mail as it passes through the system.
here what you can do is add headers (such as "X-SPAM"), or
even rewrite portions of the message (such as defanging an
attachment, modifying a message body or the like). 
 
lastly you can copy messages silently using the milter system.
this can basically open up a file and dump the message body
into the file, including the headers and the body of the message.
this is useful for a permanent email storage system.
 
[slide 6]
 
using libmilter is very easy to do. the first step is to build
a sendmail program that has libmilter support added in already.
then you basically add a milter call in the configuration,
forcing messages to pass the check called by the milter program.
 
the next thing to do is to write the milter program. what you
do is you fill in a struct which tells the program what functions
to call at what points for these state transitions within an
SMTP transaction. and, of course, you can set them to NULL
if you wish them to not be evaluated. these are just function
pointers.
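
the pattern of "a struct of function pointers, with NULL for the stages you skip" can be sketched on its own in plain C. this is a stand-in, not the real libmilter API: in libmilter the struct is smfiDesc and its callbacks take a context argument, while every name below (struct callbacks, dispatch_helo and so on) is hypothetical and kept simple so the example compiles without the library.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical verdicts; real milters return libmilter's own status values */
enum verdict { V_CONTINUE, V_REJECT, V_TEMPFAIL };

/* hypothetical callback table, modelled loosely on the idea behind smfiDesc */
struct callbacks {
    enum verdict (*on_connect)(const char *hostname);
    enum verdict (*on_helo)(const char *helohost);
    enum verdict (*on_envfrom)(const char *sender);
};

/* example handler: refuse a missing or empty HELO argument */
static enum verdict reject_empty_helo(const char *helohost)
{
    return (helohost == NULL || helohost[0] == '\0') ? V_REJECT : V_CONTINUE;
}

/* only HELO is handled here; the NULL entries are skipped by the dispatcher */
static const struct callbacks example_callbacks = {
    .on_connect = NULL,
    .on_helo    = reject_empty_helo,
    .on_envfrom = NULL,
};

/* call the HELO handler if one is registered, otherwise let the stage pass */
enum verdict dispatch_helo(const struct callbacks *cb, const char *helohost)
{
    assert(cb != NULL);
    return cb->on_helo ? cb->on_helo(helohost) : V_CONTINUE;
}
```

with the real library the dispatching is done for you; you only fill in the table and hand it over before starting the main loop.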
 
next you write the functions which evaluate the portion of the
SMTP transaction. all functions can treat the body as a 
string, but you have to be careful about embedded NUL bytes. all
functions return one of pass, fail, or reject (in libmilter terms,
values such as SMFIS_CONTINUE, SMFIS_TEMPFAIL, or SMFIS_REJECT). 
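
to see why embedded NUL bytes matter, here is a short sketch in C. libmilter's body callback hands you a pointer plus a length, and the sketch below shows the difference between trusting that length and stopping at the first NUL the way string functions do. both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* count NUL bytes in a body chunk, trusting the length and not a terminator */
size_t count_nul_bytes(const unsigned char *chunk, size_t len)
{
    size_t count = 0;
    assert(chunk != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (chunk[i] == '\0')
            count++;
    }
    return count;
}

/* number of bytes before the first NUL in the chunk, which is all that
 * string functions such as strlen() would look at; len if there is none */
size_t visible_as_string(const unsigned char *chunk, size_t len)
{
    const unsigned char *nul;
    if (len == 0)
        return 0;
    nul = memchr(chunk, '\0', len);
    return nul ? (size_t)(nul - chunk) : len;
}
```

for a five byte chunk holding 'a', 'b', NUL, 'c', 'd', the first helper reports one NUL byte and the second reports that only two bytes are visible to string functions, so the last two bytes would be silently ignored.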
 
within the program you need to connect to the socket you have
set up to communicate with the MTA program (a UNIX domain
socket or a network socket). then the last thing you need to
do is call smfi_main(), which starts the milter program.
 
milter programs are threaded, allowing for high performance 
to be possible. they also don't block each other from working,
so you can have them working in parallel. 
 
[slide 7]
 
milter programs are typically written in C or C++, but bindings
have been written in Perl. perl milters have the same basic
structure as a C milter does.
 
it should be easy to write milter bindings for other languages
using the SWIG toolkit. then you could write in Python, Ruby,
tcl, C# or whatever else you like that SWIG supports. i don't
think anyone has done this yet. 
 
[slide 8] 
 
vthrottle works very very simply. it has three parts of the SMTP
transaction it watches. the first is who connects to the
mail server. the second is how you say HELO (how you start the
SMTP transaction). and the third is the address the mail is
coming from. for the connection and HELO information the
hostname is kept and compared; for the "mail from" segment
an email address is used.

for each of these pieces of information, vthrottle keeps a list 
of who it has seen and when they were seen. what vthrottle
then does is it looks at the current time and the last time it
saw any of those observations and enforces a minimum interval
between those observations. 

we make two pretty bold assumptions here. the first is that normal
hosts won't try and send mail faster than this limit. the second
is that most worms and viruses will try and send mail faster than
this limit.
 
like i said earlier, this isn't my idea, it's from matt williamson.
he's a researcher in the UK working for HP labs.

[slide 9]
 
vthrottle is very easy to install. first make sure that your
sendmail has support for milter built in and that you have the
milter library and headers installed. this is part of the
normal sendmail distribution, so you don't need any special
software. 
 
then, obviously, download the software from my site:
   http://monkey.org/~jose/software/vthrottle/
the current version is 0.30, which i released last weekend
(14 december 2003). 
 
building vthrottle can be a bit tricky, only because i don't
have a ./configure script yet. you need to modify the Makefile
to point it at the libmilter headers and the library. 
 
then you can install it wherever you like. the README lists the
configuration change you need to make to your sendmail.mc file.
once you regenerate your .cf file you're all set.
 
you start it very simply: vthrottle -s <socket>, which is
the communications socket for the program. you can 
set a different interval time with -i (it defaults to
60 seconds right now). you can create a "whitelist" using
-w, too. this file specifies different limits for mail addresses
or hosts. 

[slide 10]
 
these are the big features of vthrottle right now, over its 
basic behavior. you can configure a default interval that
works for your network or your behavior. let's say people in
your office send mail every 30 seconds on average, you can
change the behavior on the command line at runtime.
 
you can also whitelist hosts or mail addresses using the
whitelist function. this is for major peer MTA systems or
mailing list addresses. you can set a different interval
for those entries.
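
the whitelist idea boils down to "look up a per-entry interval, else fall back to the default". here is a rough sketch in C using an in-memory table. the struct, the function name and the exact-match comparison are assumptions for illustration; the real whitelist file syntax is described in the README and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* hypothetical in-memory whitelist entry: a host or address and its interval */
struct wl_entry {
    const char *key;   /* hostname or mail address */
    long interval;     /* minimum seconds between sightings for this key */
};

/* return the interval for key: the whitelist override, else the default */
long interval_for(const struct wl_entry *wl, size_t n, const char *key,
                  long default_interval)
{
    assert(key != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(wl[i].key, key) == 0)   /* assumption: exact match only */
            return wl[i].interval;
    }
    return default_interval;
}
```

a busy peer MTA could be given a 5 second interval this way while everyone else stays on the 60 second default.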
 
[slide 11]
 
this is a basic order of operations for a mail server using
vthrottle. when a host connects or says HELO, vthrottle looks at its
list of hosts and when they were seen. if it has seen the
host before, it compares the time now to when it was last
seen and how long you are supposed to wait. if it's longer
than the required interval, then vthrottle says "ok" and
the message is allowed. if it hasn't seen the host
before, it adds it to the list and moves on with an "ok".
if the connection is too soon, vthrottle tells the server
to reject the transaction.
 
when the source mail address is sent, vthrottle repeats that
check on a list of mail addresses. if they are at least 
"interval" seconds apart, then the mail can pass. if not
the message attempt fails.
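
the order of operations above can be sketched in a few lines of C. this is not the vthrottle source, just a minimal version of the same idea using a singly linked list. the names are hypothetical, and two details are assumptions: only allowed attempts refresh the timestamp, and the check lets mail through if memory runs out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

enum decision { ALLOW, THROTTLE };

/* one observed host or address and when it was last allowed through */
struct seen {
    char *key;
    time_t last_seen;
    struct seen *next;
};

/* check key against the list at time now; unseen keys are added and allowed.
 * assumption: only allowed attempts refresh the timestamp. */
enum decision throttle_check(struct seen **list, const char *key,
                             time_t now, long interval)
{
    struct seen *s;
    size_t n;

    assert(list != NULL && key != NULL);
    for (s = *list; s != NULL; s = s->next) {
        if (strcmp(s->key, key) == 0) {
            if (difftime(now, s->last_seen) < (double)interval)
                return THROTTLE;   /* too soon since the last sighting */
            s->last_seen = now;    /* at least interval seconds apart */
            return ALLOW;
        }
    }

    /* first sighting: remember it and let it through */
    s = malloc(sizeof *s);
    if (s == NULL)
        return ALLOW;              /* assumption: fail open on no memory */
    n = strlen(key) + 1;
    s->key = malloc(n);
    if (s->key == NULL) {
        free(s);
        return ALLOW;
    }
    memcpy(s->key, key, n);
    s->last_seen = now;
    s->next = *list;
    *list = s;
    return ALLOW;
}
```

with a 60 second interval, a host first seen at time 1000 is allowed, throttled if it comes back at 1030, and allowed again at 1060. the same function works for the connecting host, the HELO name and the "mail from" address if each gets its own list.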
 
whenever an attempt to send mail is prevented by a rejection
or a failure, it is logged for the administrator.

[slide 12]

vthrottle has some bugs ... some of which are my fault.
 
the first is that it sends a permanent failure to the
host when it tells it to go away for a connection or a
HELO request. this is specified by the SMTP standards and
is limited by sendmail.
 
the second is that the list used in vthrottle is global.
because vthrottle is threaded, it will probably destroy 
the list and make it useless ...
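
one common way to deal with shared state in a threaded program is to put a mutex around it. the sketch below is not a fix taken from vthrottle; it uses a single hypothetical global timestamp in place of the list to show the important part, which is that the check and the update happen inside the same critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* hypothetical shared state standing in for the global list */
static time_t last_seen;
static int have_seen;
static pthread_mutex_t seen_lock = PTHREAD_MUTEX_INITIALIZER;

/* returns 1 if the attempt is allowed, 0 if it should be throttled.
 * the test and the update are done under one lock so two threads
 * cannot both pass the check using the same stale timestamp. */
int check_and_update(time_t now, long interval)
{
    int allowed;
    int rc;

    rc = pthread_mutex_lock(&seen_lock);
    assert(rc == 0);

    allowed = !have_seen || difftime(now, last_seen) >= (double)interval;
    if (allowed) {
        last_seen = now;
        have_seen = 1;
    }

    rc = pthread_mutex_unlock(&seen_lock);
    assert(rc == 0);
    return allowed;
}
```

a single lock serializes every lookup, so it trades some of the parallelism for correctness; finer grained locking is possible but more work.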
 
the third bug is more of a performance problem i expect to
see. it uses a singly linked list to look up addresses,
so performance will degrade with the number of addresses
and hosts your servers interact with. maybe i should use
a splay tree, which is self optimizing ...

[slide 13]
 
this is a short list of things i plan to do with vthrottle 
for a 1.0 release. the first is to fix sendmail with a
patch to get it to return a temporary failure for a 
connection or HELO throttle action. this should make
clients react more sanely and try again in a few minutes.
 
the next two i already did. you should be able to vary the
default interval, which you can now do. the third is to
improve the whitelist file syntax, and that is done, too.
 
lastly i need to implement a deferment queue for messages
that have been throttled. this will be trickier but will
make it easier for the server to manage large queues of 
mail rather than hoping the client gets it right.

[slide 14]
 
sadly, i don't run mail servers anymore. so, i haven't
tested vthrottle in the real world. for all i know it's 
dead slow and ruins your mail server. it could be, because
it has to traverse this linked list of information, but that
depends on the network size. 

obviously i hope vthrottle doesn't have a negative impact on
normal network operations. the worst case scenario for 
this kind of impact is that someone will be away from the
network for a while and compose a bunch of mail. when they
sit down to send it out, that "shotgun" blast of mail
will be throttled. a deferment queue will help a lot in that
regard.

[slide 15]
 
there are some weaknesses in the design which i want to see if
i can work around. 

the first is when a host reuses its existing connection via an
SMTP RSET (reset state). what i should do here is start the
check again and pretend it connected again. should be simple
to do if libmilter has a handler for RSET ...
 
the second is the situation when the virus spoofs the HELO
information. remember you can send anything you want here, 
and the server doesn't have to verify it. i imagine some
viruses already do this.
 
the third is when the virus spoofs the source address for
the mail. in the absence of strong authentication (i.e. via
TLS) of the sender, there is no way around this one.
 
[slide 16]
 
and that's it! i need to thank matt williamson and the 
management team at HP labs, they were comfortable 
enough with someone else releasing a product which
does what their patent pending process does. :)
 
you can see the latest release of vthrottle on its homepage:
  http://monkey.org/~jose/software/vthrottle/
the website needs sexy graphics if you feel like contributing.
i have some ideas but i'm not a good artist.
 
lastly, this took only a few hours to write and extend:
getting it up to version 0.30 took about 400 lines of C
code and about 6 hours of work.