An Evening Tale: A "Heap" of Trouble
For the longest time -- and only on certain people's systems -- there was a strange occurrence where the player would experience a sudden 'freeze' or pause in gameplay for a split second, every few seconds. This did not seem to lower the framerate or updates-per-second, and it happened at a steady rate. On other systems it would start out the same way, but the effect would vanish after several seconds. To further mystify the problem, it would happen to me if I ran Skirmish via WebStart, but NOT if I ran it from my IDE. Très bizarre!
This started back in October or November, and I had been baffled for a while. I first began profiling to try to determine what portion of the code was causing this behavior. At first I suspected that the garbage collector was at fault, and that it must be complaining because I was creating so many 'Vector2's (a 2D vector helper class) all over my code, in rendering and elsewhere. I somehow convinced myself that this was the root of the problem, and wrote a system on top of the Vector2 class that would maintain a pool of reusable objects to keep heap size down. The details are fuzzy, but I must have convinced myself that it worked, somehow, and carried on with life.
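For readers who haven't seen the pattern, here is a minimal sketch of what an object pool for a vector class can look like. It is not the actual Skirmish code: the `Vector2` fields and the `Vector2Pool` class with its `obtain`/`release` methods are hypothetical names chosen for illustration, and the sketch assumes single-threaded use.

```java
import java.util.ArrayDeque;

/** Minimal mutable 2D vector; a stand-in for the game's real Vector2 class. */
final class Vector2 {
    float x, y;

    Vector2 set(float x, float y) {
        this.x = x;
        this.y = y;
        return this;
    }
}

/** Hypothetical pool that hands out recycled Vector2 instances. Not thread-safe. */
final class Vector2Pool {
    private final ArrayDeque<Vector2> free = new ArrayDeque<>();

    /** Reuses a released instance if one is available, otherwise allocates a new one. */
    Vector2 obtain(float x, float y) {
        Vector2 v = free.poll();   // null when the pool is empty
        if (v == null) {
            v = new Vector2();
        }
        return v.set(x, y);
    }

    /** Returns an instance to the pool; the caller must not use it afterwards. */
    void release(Vector2 v) {
        free.push(v);
    }

    /** Number of instances currently waiting to be reused. */
    int available() {
        return free.size();
    }
}
```

The catch with this approach is that every `obtain` needs a matching `release`, and a forgotten release quietly turns the pool back into plain allocation.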
Not long after, my good friend Dean told me that he still regularly experienced this "pulse lag" bug, as he so aptly named it. I had not realized this, since I was still always running the darn game from my IDE! It turns out that the bug was still present, although only a handful of the testers would experience it. Oh man! Weird!
It was at that point that I decided to have another go at the profiling, and figure out what the heck was really going on. So, I did. This time, the source of the time-wasting looked to be the networking. I ran a few (admittedly VERY basic) tests to see if this was the case, and indeed it did seem that the raw network calls (recv and send) were eating up a lot of time, because they were blocking the main thread from doing other useful things like rendering. This surely must be the root of the problem! So, I went off on a mini-project to make the entire networking system multithreaded. Several challenging weeks and a dozen or two hours of work later, that was implemented. Swell.
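The general shape of that change, moving blocking reads off the main thread, can be sketched as follows. This is only an illustration under assumptions, not the Skirmish networking code: the `NetworkReceiver` class is a made-up name, and it assumes a line-based text protocol, which the real game may not use.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical receiver: reads lines on its own thread and queues them for the game loop. */
final class NetworkReceiver implements Runnable {
    private final BufferedReader in;
    private final ConcurrentLinkedQueue<String> inbox = new ConcurrentLinkedQueue<>();

    NetworkReceiver(InputStream stream) {
        this.in = new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8));
    }

    @Override
    public void run() {
        try {
            String line;
            // readLine() blocks here, on the network thread, instead of in the game loop.
            while ((line = in.readLine()) != null) {
                inbox.add(line);
            }
        } catch (IOException e) {
            // Connection dropped; a real game would report this to the player.
        }
    }

    /** Called from the game loop each frame; returns null immediately if nothing has arrived. */
    String poll() {
        return inbox.poll();
    }
}
```

In use, something like `new Thread(new NetworkReceiver(socket.getInputStream())).start()` would run the reads in the background, while the render loop calls `poll()` once per frame and never waits on the network.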
Not swell. The testers that had the "pulse lag" bug were still experiencing it in its full glory. However, most people noted that the framerate had increased by a decent margin. I once again had not fixed the bug, but Skirmish became a little faster through my efforts, and I learned a lot about concurrency.
Once again, today, I took a stab at figuring out where the bottleneck was. It was the strangest thing. It seemed almost like every time I cleaned up some code, the bottleneck jumped to a new spot. Again and again. Puzzled, I took a moment to stop, take a few steps back, and think about this thing. Surely these few areas of code were not of pinnacle efficiency, but they couldn't account for this kind of periodic freezing. The only thing that would make sense would have to be..
..garbage collection. Or, more specifically, the heap size allocated by the Java virtual machine. In some ways, I am to blame for not allocating and handling objects a little more responsibly, but the VM was vomiting every few seconds because the heap was just too small. And, in a very anticlimactic ending, I simply increased the heap size for the VM and everything was well. Testers confirmed that the elusive bug was nowhere to be seen.
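For anyone chasing a similar problem: the maximum heap is normally set with the `-Xmx` option on the `java` command line, while a Web Start launch takes it from the `max-heap-size` attribute of the `j2se` element in the JNLP file. One plausible explanation for the IDE-versus-WebStart difference is that the two launchers simply started the VM with different limits, though that is a guess. A small check like the one below, which uses only standard JDK calls (the `HeapReport` class name is made up), shows what limit the running VM actually got and how much time it has spent collecting garbage.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper for logging the heap limit and accumulated GC time. */
final class HeapReport {
    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    /** Approximate total time spent in garbage collection so far, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 if this collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMb() + " MB, GC time: "
                + totalGcMillis() + " ms");
    }
}
```

Logging these two numbers at startup and every few seconds after would make a heap that is too small fairly obvious: the GC time keeps climbing while the limit stays low.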
Is that truly a tale of victory? Or did I just evade my own inefficiencies by increasing the heap size? I'm still not sure whether I feel satisfied or disappointed. I was really expecting a grand victory, full of music, dance, and perhaps a large feast. Still, I won't complain. It works, goshdarnit. It works. [smile]
The Other Enigma: Windows Vista
On an equally pleasing front: Windows Vista is now able to run Skirmish properly.
If this sounds like news, then you probably weren't aware of this problem. To be honest, it wasn't something that I wanted to advertise. [smile]
For the longest time, Vista users have been unable to connect to the master server. It would simply outright refuse to connect. Boggling, I know! Unfortunately none of the Vista testers were developers, so they didn't really understand how to get me the system log that the game would produce. Fortunately, I had my roommate around today, who has a Vista partition, to get one to me.
Much to my confusion, the line it produced upon connecting was an exception: "Invalid argument: sun.nio.ch.Net.setIntOption". Huh? I don't claim to be a Java guru, but I had never seen or heard of this setting before. My instinct was that one of the many integer-based parameters that standard network sockets accept wasn't appreciated by Vista.
After some searching, the closest thing I found to a satisfactory answer was on Marlon Pierce's blog, although the details there don't quite match my problem. Not being a big Windows fan, part of me wants to just say, "Oh, right, Vista stinks". But that kind of grumbling won't solve the problem.
Anyway, it turns out Vista didn't like that I was setting the TCP_NODELAY socket option when connecting to the master server. I'm still a little mystified about why Vista rejects it. (TCP_NODELAY disables Nagle's algorithm, which holds back small writes so they can be sent together; it is a separate matter from whether a socket is blocking or non-blocking.)
Once again, this fixed the problem: with the option dropped, Vista users could connect just fine. Ironically, since my previous problem required me to add concurrency to the networking, giving up the option was not a problem. Again, not a satisfying victory. Oh well. [grin]
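An alternative to removing the option outright is to treat it as optional: try to set it, and carry on with the default if the platform refuses. The sketch below shows that idea with `java.net.Socket`. It is an assumption-laden illustration rather than the fix used here: the `SocketOptions` class and `tryEnableNoDelay` method are made-up names, and it assumes the rejection surfaces as a `SocketException`, which may not hold for every code path (the exception above came from the NIO internals).

```java
import java.net.Socket;
import java.net.SocketException;

/** Hypothetical helper that treats TCP_NODELAY as a nice-to-have rather than a requirement. */
final class SocketOptions {
    /**
     * Tries to disable Nagle's algorithm on the given socket.
     * Returns true if the option was applied, false if setting it failed.
     */
    static boolean tryEnableNoDelay(Socket socket) {
        try {
            socket.setTcpNoDelay(true);
            return true;
        } catch (SocketException e) {
            // Assumption: failing to set this option is not fatal, so keep the default behavior.
            return false;
        }
    }
}
```

Logging the `false` case would also have put the failing option name straight into the system log, instead of leaving a bare "Invalid argument" to decipher.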
Despite my ironic tone, I am actually very happy about the outcomes. Two of my longest-standing bugs are finally quashed, and the game has become a little more playable and stable.
Huge thanks go out to Patrik, my roommate, for all of his time helping me trace these bugs down, and of course to the kind folks who helped me test tonight. Yeah, you know who you are. [smile]
First on the list, a little glitch. The "OK" button on the Password Confirmation box, when I attempt to register the account, is smack in the middle of the box, and my first instinct is to click it to get it out of the way so I can type. :P (Link to screenie)
Second, and more of a nitpick, I'm used to a certain flow for account login/creation dialogs, and nearly every site/game I see does this. If there's Login and Create Account buttons right next to each other, the Create Account button generally goes to a new page/dialog where you can signup, but the Login button works with the fields above. Iunno, I kinda wasn't expecting the way you have it. Like I said, it's a nit!
Third, I like to tab between fields. :(
Other than that, looks good and I can't wait to try it out when people are actually playing!
EDIT: Also, why does the chat become opaque when I hover over it? That makes it hard to see where I'm aiming if I'm aiming down there >_>