As a follow-up to Lorelle’s previous post, I wanted to report that we believe we’ve stabilized the Web services. Many of you may have noticed periods when the Blog and Forums were unreachable, or had occasional trouble logging into the client.
To provide a little more detail behind the issues (for those of you who are interested): as we’ve scaled up the number of users on the system, the number of simultaneous requests for the Blog and Forums has increased, as one would expect. A couple of weeks ago we reached the point where Web performance was degrading enough that we needed to shift some attention to the matter.
After some brainstorming about a long-term, scalable solution to the problem, we implemented a new, more complex architecture to handle both Web and login requests. Part of the solution included load balancing the servers. Unfortunately, as with all complex solutions, the Devil is in the details, and we had one single little configuration option set incorrectly. The result was that the vast majority of requests worked, but every once in a while a page would not load.
Those of you who’ve had to troubleshoot technical gear will probably appreciate the frustration the team felt while trying to fix the problem. A fault is much easier to find when something is actually completely broken.
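For those who like to see it in code, here is a minimal sketch of why a fault like this shows up only every once in a while. It is an illustration, not the actual setup: the round-robin policy, the pool of ten servers, the server names, and the single misconfigured server are all assumptions made for the example.

```python
from itertools import cycle


def build_pool(size, misconfigured_index):
    """Build a hypothetical server pool in which exactly one server is misconfigured."""
    return [
        {"name": f"web{i + 1}", "healthy": i != misconfigured_index}
        for i in range(size)
    ]


def simulate(pool, num_requests):
    """Hand requests to servers in round-robin order.

    Returns one boolean per request: True if the page loaded,
    False if the request landed on the misconfigured server.
    """
    rotation = cycle(pool)
    return [next(rotation)["healthy"] for _ in range(num_requests)]


def failure_rate(results):
    """Fraction of requests that failed."""
    return results.count(False) / len(results)


if __name__ == "__main__":
    # Assumed numbers: ten servers, one of them (web8) misconfigured.
    results = simulate(build_pool(size=10, misconfigured_index=7), num_requests=100)
    print(f"{results.count(False)} of {len(results)} requests failed")
```

With these assumed numbers, nine out of ten requests succeed, so most visitors see a working site and any single test page load is likely to pass. That is what makes this kind of fault harder to pin down than a complete outage.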
The team did, however, finally discover the bug, and we’ve squashed it for good! Many thanks to Scott, Greg, and the rest of the Layered Tech team for their assistance, as well as to all of you for your patience! As of this moment the site is running faster than it ever has, and we hope to keep it that way.