Getting around IE’s MIME type mangling

For some reason, I ended up with the MSDN article on MIME type ignoring open long enough that I actually read it. If you aren’t familiar with this misfeature, well, it’s an “everyone’s at fault” thing where Apache serves up files with unknown extensions with a default MIME type, rather than with no MIME type at all, so Internet Explorer only rarely believes the server-provided MIME type, and instead looks at the content and decides what it really is.

As a result, non-IE users quite often provide examples of things like HTML or RSS or Movable Type templates by just adding .txt to the filename so they will be served as text/plain, and IE blissfully displays them as HTML or XML, or refuses to display them because they aren’t valid XML.

There are 26 MIME types that IE “knows,” plus the two it doesn’t trust: text/plain and application/octet-stream. If your type isn’t any of those, IE will accept it without munging. If your type is one of the 28 special types, IE will decide for itself whether to believe you or not. The interesting part is that it decides by looking at just a buffer containing up to the first 256 bytes of the file. If you are serving up a file that contains angle brackets as text/plain, and you can somehow avoid using anything that makes IE think it’s really something else for 256 bytes, then you can avoid having IE users email you to say that your example can’t be displayed.

The very nice thing is that comments don’t count against text. Start with a comment over 256 bytes long and you can do anything you like in the rest of the file, and IE will still display it as text. Unfortunately, you can’t put anything before the XML declaration that way, so I don’t know of a way to stop IE from trying to interpret XML-as-text as XML, but for HTML, or PHP source that otherwise might look like HTML, a nice fat comment to start will fool IE into thinking you actually know what you’re talking about.
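
As a rough illustration of the trick, here is a minimal Python sketch that pads an HTML sample with a leading comment longer than 256 bytes. The function name `pad_with_comment` and the filler text are made up for this example, and the 256-byte figure is simply the buffer size described above. The sketch only builds the text; it makes no attempt to model what IE actually does with it.

```python
SNIFF_BUFFER_BYTES = 256  # buffer size as described in the post, not verified here


def pad_with_comment(source: str, min_bytes: int = SNIFF_BUFFER_BYTES + 1) -> str:
    """Prefix source with an HTML comment at least min_bytes long (hypothetical helper)."""
    opener, closer = "<!-- ", " -->\n"
    filler = "This file is an example meant to be read as plain text. "
    body = filler
    # Repeat the filler until the whole comment is long enough in bytes.
    while len((opener + body + closer).encode("utf-8")) < min_bytes:
        body += filler
    return opener + body + closer + source


if __name__ == "__main__":
    sample = "<html><body><p>Example template</p></body></html>\n"
    padded = pad_with_comment(sample)
    # The comment alone should cover the first 256 bytes of the output.
    print(len(padded.encode("utf-8")))
```

The filler deliberately avoids a double hyphen, since that would not be allowed inside an HTML comment. As noted above, this does not help for XML, where nothing may precede the XML declaration.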

31 Comments

Comment by Geof #
2004-04-06 14:47:24

God help me, but I just started sucking my thumb, reading this.

:sigh:

 
Comment by Jim Dabell #
2004-04-06 16:17:01

If you aren’t familiar with this misfeature, well, it’s an “everyone’s at fault” thing where Apache serves up files with unknown extensions with a default MIME type, rather than with no MIME type at all, so Internet Explorer only rarely believes the server-provided MIME type, and instead looks at the content and decides what it really is.

It certainly isn’t an “everyone’s at fault” thing. From RFC 2616 (the HTTP 1.1 specification):

If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource.

Basically, any attempt at file type guessing when a Content-Type header is supplied is a flat out violation of the specification. The blame entirely lies with Internet Explorer.

Of course, now Opera has an option to do the same thing, and I heard Firefox 0.8 added this misfeature as well.
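
The rule quoted from RFC 2616 above can be written down as a small Python sketch. The function name `may_guess_media_type` and the plain-mapping representation of the response headers are assumptions made for illustration; they are not taken from any browser or library.

```python
from typing import Mapping


def may_guess_media_type(headers: Mapping[str, str]) -> bool:
    """Return True only when the response carries no Content-Type field.

    HTTP header names are case-insensitive, so compare them in lowercase.
    """
    return not any(name.lower() == "content-type" for name in headers)


print(may_guess_media_type({"Content-Type": "text/plain"}))  # False
print(may_guess_media_type({"Content-Length": "42"}))        # True
```

Under this reading, a browser that sniffs a response labelled text/plain is outside what the specification permits, whatever the label's accuracy.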

Comment by Scott Johnson #
2004-04-06 16:59:13

For at least 75% of the users out there, this is definitely a “feature”. My grandma doesn’t want to know about mime types. She just wants the web page to display properly. She doesn’t care about XML. She just wants to be able to read the newspaper article on the web.

I’m not trying to take Microsoft’s side with this issue. In fact, I disagree with violating the standards, but sometimes it just has to be done in order to make the sale. And when you let a corporation control your browsing experience, you get what sells.

Comment by Georg Bauer #
2004-04-07 06:34:51

Your grandma will die eventually - do you really want to build up a legacy for generations to come, just to help your grandma along?

Comment by Phil Ringnalda #
2004-04-07 09:25:19

Will my putative granddaughter really enjoy having .rar17 files displayed as though they were text? Is she going to say “Oh joy! A chance to educate someone else about why they or the people they hire to run their server need to comment out something that’s uncommented in the sample apache943.conf!”?

Comment by Georg Bauer #
2004-04-09 11:32:52

And what if your granddaughter wants to serve just her XML source as text/plain and doesn’t care about weird hacks in browsers?

Sure, this might be a stupid configuration element in the apache config (although my stand on this is that people trusting default configs for any server should be shot instantly). But working around a problem that is produced by wrong configuration by hacking stuff into some software that _can’t_ be configured (at least not this special hack), is just plain stupid.

The configuration can be fixed with any text editor and apachectl reload. The hack in IE will stay forever.

Comment by Phil Ringnalda #
2004-04-09 14:45:35

The IE hack might stay forever. There are actually registry hacks (which didn’t always work right) going back to IE 5, and apparently XP SP2 is going to scale it back somewhat.

But the single absolutely certain precondition for it going away is for Apache to not ship with a default that results in most servers serving unknown file types as text/plain (or application/octet-stream, I’d think, though the Mozilla folks who certainly know better than me see that as better, somehow). Whatever we would do on that special day when we suddenly control the world, in the actual world where people on the IE team get phone calls from people who buy tens or hundreds of thousands of copies of Windows every year, if the most popular server on the internet lies about text/plain by default, then text/plain will get sniffed.

There are situations (very, very few of them) where “what’s right and proper” matters more than what works. This is not one of them. Apache is wrong: a file of an unknown type shouldn’t be served with a content-type. IE is wrong: a file with a content-type shouldn’t be sniffed. If Apache doesn’t change, there is zero chance that IE will change. If Apache does change, there is some chance that IE will change. This just doesn’t seem all that complicated. Every single man who works on Apache httpd has a larger penis and can pee further than the best the IE team has to offer, we all know that; now can we have the real change which will eventually make it possible for me to serve an XHTML template as text/plain?

 
 
 
 
Comment by Jim Dabell #
2004-04-08 05:32:05

My grandma doesn’t want to know about mime types. She just wants the web page to display properly.

I’ve seen this argument many times. Throw away standards to get a marginally better user-experience. You end up with a mess of unspecified, unreliable crud (look at the state of HTML if you don’t believe me). All the resources developers spend working around crap could be spent on actual features instead.

This is a tragedy of the commons: once ignoring the specifications is commonplace, everyone is stuck with it in perpetuity. Adhering to standards isn’t developer mind masturbation at the expense of users; it’s putting the burden of fixing the problem where the problem actually lies in an attempt to work on something that is of actual use to end-users.

Fixing the problem instead of hiding the symptom seems far preferable to me.

Comment by Phil Ringnalda #
2004-04-08 07:24:32

Fixing the problem instead of hiding the symptom seems far preferable to me.

Yes! Yes, yes, yes! Let’s fix the problem, and here it is:

#
# DefaultType is the default MIME type the server will use for a document
# if it cannot otherwise determine one, such as from filename extensions.
# If your server contains mostly text or HTML documents, "text/plain" is
# a good value.  If most of your content is binary, such as applications
# or images, you may want to use "application/octet-stream" instead to
# keep browsers from trying to display binary files as though they are
# text.
#
DefaultType text/plain

Untouched from my local installation’s copy of httpd.conf (where I certainly have no need of a default type, but I didn’t think to comment it out). If every single file on my laptop was text/plain, that would be fine. Otherwise? That’s just encouraging me to violate standards by sending things with a MIME type when I don’t know the type, not even based on a guess from the content, but based on a guess by someone working on Apache at some time in the past.

Compare and contrast:

#
# You should only enable a default MIME type if you know that every file
# on the server will either have a MIME type registered for the file
# extension (or whatever other detection method you will enable), or if it
# does not have a MIME type registered it will be of that default type.
# Otherwise, leave the default type commented out, so unknown files will
# be served without a MIME type, and browsers will be allowed to use their
# best guess. If you use a default type, and it is wrong, browsers are not
# allowed to guess the correct type.
#
# DefaultType text/plain

Comment by Jim Dabell #
2004-04-08 09:35:38

Well somebody recently posted to the Apache mailing list that they were intending to change the default, so your complaint is already being handled.

People seem perfectly happy to take the Content-Type complaint to the server admins for CSS issues, so why not for other media types?

Comment by Phil Ringnalda #
2004-04-09 15:07:14

Why not evangelism? In the CSS case, if I want to see a particular page with CSS in my browser, then I have to tell the admin of the server of that particular page to fix his setup (though actually, once Mozilla eased the rules on quirks mode, I haven’t seen that many, and lack of CSS usually doesn’t make an unreadable page). In the text/plain case, if I want to make Movable Type templates available by serving them as text/plain, I have to track down every single Apache server that hasn’t commented out the DefaultType directive, browbeat them into changing it (many times at no benefit to them, since not everyone serves content with unknown types), and then tell the IE team that I’ve done it. Do the two cases seem equal to you?

 
 
Comment by Phil Ringnalda #
2004-04-09 15:53:16

If your server contains mostly text or HTML documents, “text/plain” is a good value.

Sheesh. Just like I always use poor grammar when correcting someone’s grammar, I used poor comprehension while accusing the world of not being able to comprehend.

Apache is very much at fault for that sentence. “Mostly text or HTML documents”? Since when is serving HTML as text/plain a good idea? Even someone who carefully reads the comments, but doesn’t know from RFCs and who can sniff what, would say to themselves “yup, more HTML than executables on my server, better leave DefaultType text/plain.”

 
 
 
 
Comment by Phil Ringnalda #
2004-04-06 20:39:03

I did think about trying to expand on that, but I knew I’d just start talking backward while my head spun in circles. Luckily, I’ve got the projectile vomiting held in check. For now.

I know, anyone who violates any RFC for any reason should be shot on sight, but let’s just ask ourselves, would the IE folks have been quite as likely to start MIME type sniffing if Apache shipped with a sample config file that had the default type section commented out, with a comment saying something along the lines of “You should only enable a default MIME type if you know that every file on the server will either have a MIME type registered for the file extension (or whatever other detection method you will enable), or if it does not have a MIME type registered it will be of that default type. Otherwise, leave the default type commented out, so unknown files will be served without a MIME type, and browsers will be allowed to use their best guess.”?

See, in the actual world, there will be files that Apache doesn’t know the correct type for, and unless it’s clear to every server admin on the planet that there is almost no case where they want to enable default type, they will be served with the wrong type, leaving browser developers with two choices: they can maintain their fanatical devotion to RFCs, and produce a browser that looks broken while shifting the pain of the problem onto end users who have no idea what’s wrong, other than that their browser doesn’t work, or they can say “screw it, I’m not displaying an obvious binary file as text.”

The fact that the Firefox devs, who combine a great fondness for specs with a great fondness for users, have started doing MIME type sniffing even when the spec says they shouldn’t is a sign, and it isn’t a sign that they are backsliders in need of a good sermon.

 
 
Comment by jgraham #
2004-04-06 16:32:39

In the case of Firefox (well Mozilla in general really), the feature isn’t nearly so bad as the IE feature.

As far as I recall, it only tries to render the content if it is being sent by a server with a known bad default MIME-type setting (i.e. Apache) and if the first n bytes of the file contain a selection of non-printable ASCII characters. It’s no worse from a standards-compliance point of view than the character set autodetection code which has been around forever.

So it’s very unlikely to break if you try changing file-extensions to send xml as text/plain.

But don’t believe me, read the bug report.

Before that patch got checked in, the Mozilla people filed a bug with Apache asking for the MIME type defaults to be changed to something sane. Had they done so, the patch would probably never have made it in. Standards are very useful, but unless you plan to start imposing requirements on who can publish content on the net, servers must have reasonable (conservative) default settings; otherwise market pressure forces browsers to be more liberal than the specification allows.

Comment by Jim Dabell #
2004-04-08 09:17:38

Before that patch got checked in, the Mozilla people filed a bug with Apache asking for the MIME type defaults to be changed to something sane. Had they done so, the patch would probably never have made it in.

I assume you are talking about Bug 13986? As far as I can see, the bug hasn’t been closed yet, and judging by this email, Apache are changing the default, if they haven’t already done so. I think violating standards was a bit premature.

Comment by jgraham #
2004-04-08 15:49:44

I assume you are talking about Bug 13986?

That looks right

As far as I can see, the bug hasn’t been closed yet

Well it hasn’t been closed. It has, however, been open for a year and a half. Most of the Apache people on the bug seem to believe that it’s an admin’s problem. Which is true. Unfortunately, they also believe that because an RFC says admins should be careful they actually will be careful and that those who are “dumb” enough to assume conservative default settings should have their users suffer.

Of course evangelism is suggested, but in reality evangelism is slow, doesn’t always work, and doesn’t prevent random users who wouldn’t know an RFC if it smacked them in the face from finding MIME related problems before clued-up users do.

The bug also has a patch attached, albeit one that doesn’t provide quite the right solution.

judging by this email, Apache are changing the default

Sense at last! Although it’s not quite as good as sending no default mime-type, since that would explicitly allow browsers to use MIME type magic in the same way that apache itself is allowed to.

I think violating standards was a bit premature.

You don’t often help users with Mozilla problems then? After you’ve answered the same questions tens or hundreds of times and patiently explained that no, the Mozilla developers aren’t on crack and no they’re not trying to make things hard specifically for you. And yes, if admins just edited their server config to include the right MIME type for .RAR files* then all would be well in the world. And yes, we know IE makes this easy, but standards are good for you. Even if they taste bad at first. And yes we know you can switch back to IE and no we don’t really care if you do, you might have a different view of what constitutes ‘too soon’.

In any case, I can’t imagine that the new default will override existing httpd.conf files so all the misconfigured servers out there will still be misconfigured servers out there.

*For some reason, .rar files are the most common source of problems. They must be the most common file format that isn’t explicitly mentioned in the apache config. I have no idea what they’re used for, although I know they’re some kind of archive file.

 
 
 
Comment by Doug Ransom #
2004-04-06 17:40:55

Hm. I wonder if this is related to IE’s problem in loading JPEGs with a lot of metadata in them (JPEG allows for comments within the file).

I built the powder tour 2002 site using rdfpic to put a lot of rdf text into jpegs. If you browse this in IE, you will be lucky if you see any images and if help about still works after looking at my valid jpegs. In w9x, you might see the images or your computer might need rebooting.

rdf is xml, so likely the angle brackets in the comment make ie go squirrely.

what’s weird is that ie will display the images just fine when loading them from a file system.

I keep hoping Microsoft will issue a security patch (I think this is a buffer overrun), so that my friends who use ie can see the powder tour site.

Comment by Phil Ringnalda #
2004-04-06 20:50:32

It doesn’t seem likely that it would be the same problem, unless they are seeing XML (or a message that it’s invalid XML, since the rest of the file wouldn’t be XML). Seems like the first 256 bytes of a jpg would be plenty binary to make it clear it wasn’t XML. But, who knows? Stranger things have happened.

 
 
Comment by Brad Choate #
2004-04-06 18:35:44

Well, spaces seem to work. It creates a blank line at the start of the content, but seems to work.

Comment by Phil Ringnalda #
2004-04-09 08:42:47

Though not for XML, where you’re still stuck with “XML declaration not at start of external entity,” sadly.

 
 
Comment by Phil Ringnalda #
2004-04-06 20:53:27

Cor, it wasn’t until I saw Mark’s link with the “damn ultraliberal parsing” title that I realized I was trying to open up the liberal parsing permathread. I thought I just wanted to mention a way around not being able to easily display source as text, sometimes. Walked right into it, didn’t I?

 
Comment by Ben Meadowcroft #
2004-04-07 04:58:23

You could always use the view-source pseudo protocol, for example your stylesheet will let someone view a CSS file textually in most browsers (I’m using IE because I’m at work, but I’m sure Mozilla et al do the same?)

Comment by Phil Ringnalda #
2004-04-07 09:32:35

Heh. I’d almost forgotten about using view-source: I used to distribute some templates for Blogger, where you have two, the main and the archive index template, so the easiest way to manage going from my page to copy-and-paste in Blogger is to have them open in a couple of instances of Notepad, so I had (ulp, still have) all these IE-specific buttons all over the page.

 
 
Comment by pete #
2004-04-07 06:16:15

I had a simple cgi that forced the download of a text file by supplying the correct mime-type, but since the url ended with download.cgi?file=example.txt IE kept thinking it was a text file, based off of the txt extension, completely ignoring the mime-type. Solution? I just changed all the urls that were generated to be like so: download.cgi?file=example.txt&iesux=yes
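
A minimal Python sketch of that workaround, assuming the links are generated by code you control: append a throwaway query parameter so the URL no longer ends in `.txt`. The helper name `add_dummy_param` is hypothetical; the `iesux=yes` parameter is the one from the comment above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_dummy_param(url: str, name: str = "iesux", value: str = "yes") -> str:
    """Append a throwaway query parameter so the URL does not end in a file extension."""
    parts = urlsplit(url)
    # Keep the existing parameters in order, then add the dummy one last.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_dummy_param("download.cgi?file=example.txt"))
# prints: download.cgi?file=example.txt&iesux=yes
```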

Comment by Phil Ringnalda #
2004-04-07 09:38:34

Heh. I wonder if you could throw in 256 bytes of binary-looking data to force non-text?

Oh, nope, it has to match a “known type” to fall out before the filename match, so you would have to put the first 256 bytes of an application/x-zip-compressed file in it, and then Windows would try to hand it off for decompression anyway.

 
 
Trackback by simon's ramblings #
2004-06-23 20:03:58

IE MIME type shenanigans

 
 
Comment by Phil Ringnalda #
2004-06-23 20:37:16

Simon! Come out from behind the firewall: with a URI like http://sbserve/ we can’t really tell much about what you’re saying.

Comment by Simon #
2007-11-19 03:47:02

Sorry - I’m about 3 years late to the party on this one:

http://lieschke.net/articles/ie-mime-type-shenanagins

 
 
Comment by Zach #
2004-11-21 22:51:30

After reading this whole thing, I am left with one really serious question.

Who really cares?

Not to mention that I serve up javascript files with png extensions, html as php, php as gifs, and the toaster as the oven all day long with no problem in IE as I am specifying the filetype - Moz is the one that screws it up all the time.

In the end - any idiot can make it work - jeez - if you need to have source code show up - and it’s not - save it as a .phps and get the nice pretty coloring, don’t want that nice pretty coloring then just use pre tags or put it in a big text box or better yet - just make up your own extension - end of problem.

 
Comment by Andreas Gohr #
2007-02-14 07:30:25

What is really bad about this (besides violating the RFC) is that it opens up a Cross Site Scripting opportunity at many many websites. I wrote about it here: MSIE facilitates Cross Site Scripting.

 