w3.org DTD/xhtml1-strict.dtd blocks Windows IE users?
Updated on: February 23 2009
Updated on: February 25 2009
On a few sites I maintain, we have several man pages set up using XML and XSL. This week we started getting complaints from Windows IE users saying they can't see the man pages any more. The error message is:
    The XML page cannot be displayed
    Cannot view XML input using style sheet.
    Please correct the error and then click
    the Refresh button, or try again later.

    -----------------------------------------

    The server did not understand the request,
    or the request was invalid. Error processing resource
    'http://www.w3.org/TR/xhtm...
The header of my page has this in it:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <!-- saved from url=(0013)about:internet -->
When I try to access either http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd or http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd in any browser except Windows IE, the page loads or downloads as expected. In Windows IE the only thing that is served up is "No".
I am curious if others are seeing this. Is it a Microsoft problem? Is it a W3.org problem? As it is, Windows IE users appear to be out of luck. Perhaps w3.org simply has had enough of Windows IE and wants them to go away?
I would love to hear other people's results from trying to load these URLs, and their comments.
Updated on: February 23 2009
Further investigation into this problem shows that the User-Agent string is the key to IE being blocked from accessing the DTDs on w3.org.
    curl http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd -D ./dump.txt -A "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322)"

    Results:
    --------------
    No
    --------------

    dump.txt
    --------------
    HTTP/1.1 503 Go away
    Date: Mon, 23 Feb 2009 13:48:30 GMT
    Server: Apache/2
    Content-Location: msie7.asis
    Vary: negotiate,User-Agent
    TCN: choice
    Retry-After: 86400
    Cache-Control: max-age=21600
    Expires: Mon, 23 Feb 2009 19:48:30 GMT
    P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
    Content-Length: 2
    Connection: close
    Content-Type: text/plain
    --------------

    curl http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd -D ./dump.txt -A "Mozilla/4.0 (compatible; Windows NT 5.1; .NET CLR 1.1.4322)"

    Results:
    --------------
    .
    .
    .
    The entire DTD file successfully lists
    .
    .
    .
    --------------

    dump.txt
    --------------
    HTTP/1.1 200 OK
    Date: Mon, 23 Feb 2009 13:50:22 GMT
    Server: Apache/2
    Content-Location: xhtml1-transitional.dtd.raw
    Vary: negotiate,accept-encoding
    TCN: choice
    Last-Modified: Thu, 01 Aug 2002 18:37:56 GMT
    ETag: "7d6f-3a72ac59d0900;45a3e4327da00"
    Accept-Ranges: bytes
    Content-Length: 32111
    Cache-Control: max-age=7776000
    Expires: Sun, 24 May 2009 13:50:22 GMT
    P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
    Connection: close
    Content-Type: application/xml-dtd; charset=utf-8
    --------------
Further research shows that the offending part of the User-Agent string appears to be the MSIE token. Removing MSIE, or changing it in any way, results in a successful return of the DTD.
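For anyone who would rather script the comparison than run curl by hand, here is a rough Python sketch of the same test. It sends the two User-Agent strings from the curl runs above and reports the status line for each. The function names (build_request, probe) are my own, only the standard library is used, and whatever w3.org returns today may well differ from what I saw in February 2009, so treat it as a starting point rather than a definitive check.

```python
import urllib.error
import urllib.request

DTD_URL = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"

# The two User-Agent strings used in the curl tests; they differ only
# in the "MSIE 7.0;" token.
UA_WITH_MSIE = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322)"
UA_WITHOUT_MSIE = "Mozilla/4.0 (compatible; Windows NT 5.1; .NET CLR 1.1.4322)"


def build_request(url, user_agent):
    """Build a GET request that sends the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe(url, user_agent, timeout=10):
    """Return (status, reason) for the URL.

    HTTP error responses (such as a 503) are returned rather than raised.
    Network-level failures still raise urllib.error.URLError.
    """
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.reason
    except urllib.error.HTTPError as err:
        return err.code, err.reason


# Example (needs network access):
#   for ua in (UA_WITH_MSIE, UA_WITHOUT_MSIE):
#       print(ua, "->", probe(DTD_URL, ua))
```

Keeping the two strings identical apart from the MSIE token is the point of the test: any difference in the response can then be attributed to that token alone.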
I tried contacting w3.org last week when I first posted this, but obviously I have the wrong contact info as no one has responded yet.
Updated on: February 25 2009
I heard back from w3.org. They responded with:
This is a known issue related to W3C's excessive traffic [1]. We are
working with Microsoft, and a fix is expected in coming months.
[1] http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic
It would appear that Windows IE is attempting to load the DTD on each page load, which is improper behaviour. Perhaps the only solution at this point is to host a copy of the DTD on our own server so that Windows users can still read the XML pages.
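If we do go the route of hosting our own copy, the pages need their DOCTYPE pointed at it. Below is a minimal Python sketch of that rewrite, assuming the DTD has been copied under a prefix such as http://example.com/dtd/ (a made-up location; use_local_dtd is likewise just a name I picked). It only swaps the system identifier and leaves the public identifier alone. I have not verified that IE is happy with the result.

```python
import re

W3C_DTD_PREFIX = "http://www.w3.org/TR/xhtml1/DTD/"

# Matches the quoted system identifier of an XHTML 1.0 DOCTYPE that
# points at w3.org, capturing the DTD file name.
_SYSTEM_ID = re.compile('"' + re.escape(W3C_DTD_PREFIX) + r'(xhtml1-[a-z]+\.dtd)"')


def use_local_dtd(document, local_prefix):
    """Point the DOCTYPE system identifier at a locally hosted DTD copy.

    local_prefix is the hypothetical URL prefix where the DTD files live.
    Only the first match is rewritten; documents without a w3.org
    XHTML 1.0 DOCTYPE are returned unchanged.
    """
    def replace(match):
        return '"' + local_prefix + match.group(1) + '"'

    return _SYSTEM_ID.sub(replace, document, count=1)
```

One thing to watch: the XHTML 1.0 DTDs pull in their entity files (xhtml-lat1.ent, xhtml-symbol.ent, xhtml-special.ent) by relative name, so those need to be copied into the same directory as the DTD.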
Thoughts and suggestions are always welcome.


Maybe IE simply needs to behave how a (good?) browser is supposed to.
There are RFCs, specs, and rules for that. They are made not to be annoying but so that the whole thing works.
If I'm not mistaken, there are HTTP directives (there since the beginning) specifying how often you're supposed to check for a resource (things like cache, expires and so on…)
It would be good if one day MS engineers stopped playing only for themselves, got their finger out of their *ss, and played more nicely with other people in the world ;)