Fixing Exim's content scanning

Sun, 10 Jun 2007

Exim is a very flexible MTA, and even though I consider myself a Postfix guy, after 7 months I've gotta say I appreciate its flexibility and power. Especially interesting is the built-in Perl support, which lets you define arbitrary functions that can even change Exim's config/behavior at runtime. One of the things I use it for is some custom content filtering.
Explaining Exim's ACLs is outside the scope of this article, and the documentation has an entire chapter about them, although you don't need to understand them to follow the rest of this post.

Introducing the case scenario

From my exim4.conf:
...
acl_smtp_data = acl_check_data
...
begin acl
...
acl_check_data:
...
  deny    demime        = *
          condition     = ${perl{checkContent}\
                          {/var/spool/exim4/scan/$message_id}}
          message       = Invalid content

...
The acl_smtp_data ACL is where you want to do content inspection; it's the usual place to hook up antispam and antivirus. In plain English the above means: unpack the message, check whether it contains forbidden content and, if it does, deny it with an "Invalid content" message.
Demime is what unpacks the message into a format suitable for scanning. From the manual:
The demime condition unpacks MIME containers in the message. It detects
errors in MIME containers and can match file extensions found in the message
against a list. Using this facility produces files containing the unpacked
MIME parts of the message in the temporary scan directory.
Then the checkContent Perl function is executed, with /var/spool/exim4/scan/$message_id passed as its parameter. That path is where demime unpacks the message. If the function returns 1 the message is denied; otherwise the next ACL statement is considered.
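To make the contract concrete, here is a minimal sketch of what such a function could look like. It is not my actual checkContent: the pattern list is made up for illustration, and the file type check discussed later in this post is left out. The only parts taken from the setup above are the function name, the directory argument and the 1/0 return value.

```perl
use strict;
use warnings;

# Hypothetical patterns, for illustration only; the real list is not shown here.
my @forbidden = (qr/forbidden phrase/i, qr/another banned string/i);

# Called by Exim as ${perl{checkContent}{DIR}}.
# Returns 1 if any regular file in DIR matches a forbidden pattern, 0 otherwise.
sub checkContent {
    my ($dir) = @_;

    # If the scan directory can't be read, let the message through.
    opendir(my $dh, $dir) or return 0;
    my @files = grep { -f $_ } map { "$dir/$_" } readdir $dh;
    closedir $dh;

    for my $file (@files) {
        open(my $fh, '<', $file) or next;
        my $content = do { local $/; <$fh> };    # slurp the whole file
        close $fh;
        next unless defined $content;

        for my $re (@forbidden) {
            return 1 if $content =~ $re;         # 1 makes the deny condition true
        }
    }
    return 0;                                    # 0 lets the ACL move on
}
```

Exim treats the returned string as the condition's value, so 1 triggers the deny and 0 falls through to whatever comes next in the ACL.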

The problem[s]

Content inspection wasn't working properly with HTML messages, so I started looking into it and found a bug, which I promptly fixed. Since the bug was in the Perl function, I had the problem of telling Exim to reload the file without killing all the current connections. The init script supports reload, so I used it and got back a bunch of errors... the joy of missing semicolons, but at least that confirmed the Perl file was actually being read and reloaded. Or so I thought. Because of that evidence of the file being re-read, it took me a while to decide that maybe Exim was lying to me. Bottom line: a restart was necessary, even though Exim is supposed to start a new Perl interpreter for each process, so I'm still not entirely sure why that was the case.
But uncovering this problem brought another issue to light: headers weren't being checked, Subject being the easiest one to slip forbidden content through. Headers are sent as part of the DATA phase of the SMTP session, so I was already in the right place. Looking at the code I noticed an if checking each file's MIME type before running the content check on it, and wondered whether that might be the problem. Unfortunately I managed to find absolutely zero documentation on demime and how it works, so all I was left with were the source code and strace. Two hours of strace later I had an .eml and a .com file, the former containing the whole email, headers included, and the latter only the body of the message. And I knew that the .eml was never checked.
The if was checking for files whose type matched "text", which is a reasonable way to avoid scanning binary files we aren't interested in. So I used the file utility to see what the system thought the .eml file was, and got text/mail, as expected. But if that's the MIME type, the if should have matched and the file should have been checked!
Knowing how much Perl loves to do things in its own creative way, I went to check what this File::MMagic module was doing, and surprise!
NAME 

File::MMagic - Guess file type
SYNOPSIS 
  use File::MMagic;
  use FileHandle;

  $mm = new File::MMagic; # use internal magic file
  # $mm = File::MMagic->new('/etc/magic'); # use external magic file
  # $mm = File::MMagic->new('/usr/share/etc/magic'); # if you use Debian
  $res = $mm->checktype_filename("/somewhere/unknown/file");
So by default the module uses its own internal magic data rather than the system-wide magic file! Brilliant. According to File::MMagic the .eml was a message/rfc822, which might even be more correct from some points of view, but that isn't the problem. The problem is making something non-standard the default.
Fixed that, fixed the headers checking.
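For the record, here is a rough sketch of the two ways such a fix can go, not the exact code I run. The helper names is_scannable_type and guess_type are made up for this example; the only File::MMagic calls used are new (with or without a magic file) and checktype_filename, both from the synopsis above. The magic file path is whatever your system provides, so it's left to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: is this MIME type worth scanning?
# Accepts text/* plus message/rfc822, the type File::MMagic's internal
# magic data reported for the .eml file.
sub is_scannable_type {
    my ($type) = @_;
    return 0 unless defined $type;
    return $type =~ m{^(?:text/|message/rfc822\b)} ? 1 : 0;
}

# Hypothetical helper: guess a file's type with File::MMagic.
# With a second argument the module reads that external magic file;
# without it, the module falls back to its internal magic data.
sub guess_type {
    my ($path, $magic_file) = @_;
    require File::MMagic;    # non-core module, loaded only when needed
    my $mm = defined $magic_file
        ? File::MMagic->new($magic_file)
        : File::MMagic->new;
    return $mm->checktype_filename($path);
}
```

Inside the scanning loop the test then becomes something like is_scannable_type(guess_type($file)), optionally passing the system magic file as the second argument to guess_type. Either way the check no longer depends on which of the two databases happens to call an email "text".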




SpikeLab.org is a Filippo Spike Morelli copyright 2005-2008
This work is licensed under Creative Commons Att-SA License.