A few days ago I received a bug report from a FUDforum user about his forum members having trouble staying logged in when using the AOL 9 browser, hereafter referred to as POS. According to him, after logging in and browsing a few pages, users would suddenly find themselves logged out of FUDforum. This happened on seemingly random pages with no common element between them, making tracking down the problem ever so enjoyable. The following is a nightmarish tale of me trying to resolve this problem, which will hopefully serve as a clue to developers who encounter the same issue.
At first I thought the issue might be related to the fact that POS is a hacked-up IE that always goes through AOL proxies when fetching content. These proxies change between requests (load balancer?), so during the same session a user may go through any number of different IP addresses, of which AOL has a fair number.
The IPs of the proxies change in a totally random order and the proxies are themselves anonymous, effectively hiding the IP address of the user from the web server. As you can imagine, this makes IP-based validation for AOL users who use POS impossible. This caused a problem for older versions of FUD, where IP validation (a toggleable option) did not account for AOL peculiarities. The fix in an earlier version was to simply not check the IP data for AOL users, which seemed to solve the problem adequately until AOL 9.0 came out with new and exciting features intended to make web developers’ lives even more difficult.
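To make the idea of skipping the IP check concrete, here is a minimal sketch in Python (not the forum's actual PHP code). The function names are hypothetical, and detecting AOL clients by looking for "AOL" in the User-Agent string is an assumption; the post does not say how FUDforum recognised AOL users.

```python
def is_aol_client(user_agent: str) -> bool:
    """Heuristic (assumption): treat any User-Agent mentioning 'AOL' as an AOL client."""
    return "aol" in user_agent.lower()


def session_ip_ok(session_ip: str, request_ip: str, user_agent: str,
                  ip_check_enabled: bool = True) -> bool:
    """Decide whether the request's IP is acceptable for an existing session."""
    if not ip_check_enabled:
        # IP validation is a toggleable option; when off, any IP is accepted.
        return True
    if is_aol_client(user_agent):
        # AOL proxies rotate between requests, so the IP tells us nothing.
        return True
    return session_ip == request_ip
```

The trade-off is that AOL users lose whatever protection the IP check provided, since nothing replaces it for them.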
After getting the user to upgrade to the new FUDforum and finding that they could still replicate the issue, I was left with little choice but to install AOL MAX (which uses an existing broadband connection) on my test win32 box to try to resolve the problem. This turned into a bit of an adventure, as the box rebelled against the AOL install and promptly trashed the drive, so the next few hours involved getting a new drive and re-installing Win XP. Interestingly enough, installing AOL MAX from a CD I picked up at the store took nearly as long as installing Windows XP itself (not including security patch installs). You really have to wonder just what in the hell AOL puts on there that requires nearly an hour to install on a dual 533 MHz Celeron with 512 megs of RAM and a new 7200 rpm drive. Oh well, the things I do for FUDforum QA…
With AOL finally running, I began trying to replicate the problem, which is where the “fun” began. One of the first things I did was modify the forum to log the entirety of the request headers provided by AOL, so that I could see the exact nature of each request. My initial suspicion was that the proxies mangle browser identification in some way, causing the session to become invalidated. Unfortunately, I had been running PHP 5.1 with the latest pdo_mysql driver, which Andrey had recently modified, causing buffered queries to crash. This caused an annoying distraction that took another hour to track down and fix. We really need some tests for buffered queries in the PDO MySQL driver.
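The header logging described above can be sketched roughly as follows. This is an illustration in Python rather than the forum's PHP, and it assumes a WSGI-style `environ` dictionary in which request headers appear as `HTTP_*` keys; the function names and the JSON log format are hypothetical.

```python
import json


def request_headers(environ: dict) -> dict:
    """Collect the HTTP request headers from a WSGI-style environ dict."""
    headers = {}
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            # HTTP_USER_AGENT -> User-Agent
            name = key[5:].replace("_", "-").title()
            headers[name] = value
    return headers


def format_log_entry(environ: dict) -> str:
    """Build one JSON log line with the client address, path and all headers."""
    return json.dumps({
        "remote_addr": environ.get("REMOTE_ADDR"),
        "path": environ.get("PATH_INFO"),
        "headers": request_headers(environ),
    }, sort_keys=True)
```

Logging one line per request makes it easy to compare consecutive requests from the same session and spot a header that changes when the proxy does.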
The side effect of this problem was that the forum’s login page generated a blank page on the first request from AOL, which was promptly cached. This meant that any time I accessed the page, it would show up as blank unless I explicitly refreshed it by hand. Clearing the AOL browser’s cache, which POS refers to as “footprints”, did absolutely nothing; neither did repeated restarts of the browser and even the entire computer. This meant that the cache was located on the proxy side. OK, not a problem. I added a quick hack involving:
This is a generally accepted way of telling browsers and proxies to stop caching the page. To my great surprise, it did absolutely nothing to alleviate the problem. At the same time, looking at my request log on the server, I could clearly see the AOL proxy request the page each time. But somehow the data was not getting through to POS and was “dying” somewhere in between.
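For readers unfamiliar with the conventional approach, a typical set of anti-caching headers looks something like the sketch below. It is written in Python for illustration and is not necessarily the exact set of headers used in the forum; the function name is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def conventional_no_cache_headers() -> list:
    """Return (name, value) pairs commonly sent to discourage caching."""
    # An Expires date far in the past marks the response as already stale.
    past = format_datetime(datetime(1980, 1, 1, tzinfo=timezone.utc), usegmt=True)
    return [
        ("Expires", past),
        ("Cache-Control", "no-store, no-cache, must-revalidate"),
        ("Pragma", "no-cache"),  # HTTP/1.0 fallback
    ]
```

With most browsers and proxies this is enough, which is what made the behaviour described here so surprising.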
To get a better understanding of the situation, I decided to analyze the data being transferred between POS and the AOL proxy by using Ethereal to capture the communication. After making a few requests I turned to my Ethereal log, only to find that it didn’t contain a single HTTP connection. Instead, it seems POS talks to the proxy, whose name appears to be “AOL TurboWeb Cache”, over a proprietary, partially compressed protocol with binary headers. Fortunately, the initial component of the request and response is more or less readable, so I was able to gather some data from it. First, it seems AOL added a custom response code to the HTTP protocol, namely 236, which seems to correspond to the page being cached, leading to a 302 redirect, presumably a cached content placeholder. It also uses digest authentication, found inside the X-AOL-Auth header, presumably to confirm that the user has access to the AOL cache, and a SessionID to keep track of the user. The proxy request also contains a very interesting header, “X-z”, containing something that looks like a winning entry in a Perl obfuscation contest. Here is a short excerpt:
# there are a few KB of this garbage
As fun as analyzing the communication between POS and the proxy was, it still didn’t solve my caching problem, beyond confirming that the pages were being cached (HTTP code 236). I hit Google trying to gather information about why the no-cache headers I was sending were being ignored by AOL. My search took me to the AOL Webmaster FAQ, which talks a bit about the nature of AOL proxies but ultimately was of no help. A bit of further searching took me to another page on the same site, http://webmaster.info.aol.com/caching.html, which specifically talks about proxies and AOL. This page gave me the clues on how to solve the problem.
First, it seems that the proxy is not particularly keen on Expires headers, and even if the value is in the past (Jan 1980), its mere presence sometimes causes the proxy to cache the page. If you want the page’s content to remain uncached, you are probably better off not specifying the Expires header altogether (AOL specific). Another interesting tidbit concerns the string “no-cache” in the Cache-Control header, which according to the RFC means something along the lines of “forces caches to submit the request to the origin server for validation before releasing a cached copy, every time.” and is translated by AOL to “This object may be held in any cache but it must be revalidated every time it is requested.”. However, it seems the doc writers at AOL are not quite aware of what the code does, because the reality of the situation is that if you specify “no-cache” the proxy will almost always cache the page.
Another clue was the following bit of information:
This object may be held in any cache but it must be revalidated every time it is requested.
This means that unless you explicitly indicate that the page cannot be cached by a proxy, it will be, even if all other headers seem to suggest otherwise.
In the end, by altering the code a bit, I was able to come up with the following header line, which seems very adept at disabling caching for AOL users.
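As an illustration of the observations above, here is a minimal Python sketch that rewrites a response's headers along those lines: drop Expires entirely, avoid the “no-cache” token, and state explicitly that shared caches may not keep the page. The function name and the specific value `private, no-store, max-age=0` are assumptions chosen to match those observations, not necessarily the exact line used in FUDforum.

```python
def aol_friendly_cache_headers(headers: list) -> list:
    """Return a copy of (name, value) headers adjusted for AOL's proxy behaviour."""
    result = []
    for name, value in headers:
        # Drop Expires (its mere presence may trigger caching) and any
        # existing Cache-Control, which is replaced below.
        if name.lower() in ("expires", "cache-control"):
            continue
        result.append((name, value))
    # Assumed value: forbid shared-cache storage without using "no-cache".
    result.append(("Cache-Control", "private, no-store, max-age=0"))
    return result
```

For example, passing in a header list containing `Expires` and `Cache-Control: no-cache` yields the same list minus those two entries, plus the single replacement Cache-Control line. Other headers, such as Pragma, are left untouched in this sketch.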
This may or may not help at all since you know most of your post went over my head like it normally does.
Posted May 15, 2004:
If anyone still has problems with the site updating, and you've recently gotten AOL 9.0. The update changes all the browser settings and automatically saves pages you've visited "for faster loading". You have to turn that option off and the page will update for you. There's a tab in the Internet browser options. If anyone needs more help, you can ask.
Most of us are not in your league; all we know is that we can't stay logged in, or we are logged out at the most inconvenient times. We try to fix things ourselves and normally make a mess of our settings. AOL seems to have generated most of the worst of the problems that have plagued our members. I hope you find a solution, but it may be nice to add a warning of some sort to the registration page saying that the forums are best viewed using the Firefox browser or something like that.
Now that I have most likely said nothing that would be helpful I will go look for some moon pictures.
What a pain to debug! Imagine some poor amateur PHP developer running into this problem.
What's really sad is that caching is a wonderful thing when done properly. Even if AOL properly adhered to the "must be revalidated every time it is requested" part, that would be a step in the right direction. Technically, AOL's caches are not private, so they shouldn't cache something that has a Cache-Control: private header anyway.
Actually, AOL is adhering to the spec (RFC 2616). They use two levels of cache, one private (on the PC) and one shared. MSIE is not a hacked version, but is actually loaded as a module. Cache-Control: private says an object MAY be stored, but only in a private cache. The only header you need to prevent caching is Cache-Control: no-store; all the others you list are contrary to no-store. Fortunately for you, AOL uses caches that go by the most restrictive directive.
Simply awesome. This gave us the hints we needed to solve problems with our own software that were at first completely baffling to us, and we are not that wet behind the ears. The sarcasm was a welcome comic relief too, given our high level of frustration. Why do the big guys like AOL and MS always make me dream very twisted ultra violent dreams about their slow, painful deaths?
Thanks SO MUCH for this tip! This AOL proxy has caused so much frustration & head-scratching at our end for endless days...
I found a hint after checking the IP address when visiting my site using AOL 9.0 VR: I realized that the IP address was strange. Further checks revealed that they are using their own proxies.
Your post here really helps me to fix the problems I'm facing with AOL...
Somehow, AOL, after spending billions on their system wants to make life difficult for developers...
The least they could do is to announce it LOUDLY to developers on how their system functions, rather than hide it & let us run into all these x-files...