Monday, 31 March 2014

Another Entry In Smallest Blog Ever


Worked right up to the 7PM wire today, started with the usual 9AM team call. Progressing very well with six hands at three keyboards fixing bugs like some coding orchestra! 

Just some quick eye candy from a small level I created to test that the older stuff still worked with the newer hacks and changes to improve performance and stability.

We're still grappling with the stutter issue, but we have DEFINITELY reproduced it now, and just need to figure out the root cause of it. There are other issues on our growing list for the next timely beta, but the stutter is the only remaining issue from our original plan.

I can also share that I have further improved the animating object system so that EVERYTHING is instance objects unless absolutely necessary which means you can paste down a TONNE of animating vegetables with no surge in memory usage or resource hiccups.  Of course animating all that through the shader would hit performance, but as always when it comes to content, your mileage will definitely vary.

Was another good day of coding, and with a huge testing effort all lined up to receive this beta version, we should ensure we get something that works out of the box (as far as our 1.006 ambitions go at least) ;)

Friday, 28 March 2014

A Quick Update


All D3DX silent errors now removed from the engine so this should improve stability, certainly on the Ultrabook integrated devices I have been using as my super low end tests (not that they are particularly low end).  Also fixed more issues and made many tweaks, the new grass system works very well, faster and no more stall staggering when new veg is generated ahead of the player.

Next up will be some general testing of the build on a few machines to make sure the installer is up to date with everything done so far.  I will then plummet into the subject of media encryption, which we have given a high priority to in order to protect the high quality art which is being crafted for Reloaded. Built into the very fabric of the totally new asset store, we hope to make the asset purchase and download system super simple, super quick and super secure.

Anyhoo, back to TGC's holy trinity of Reloaded developers ;)

Thursday, 27 March 2014

The Developer Trinity


This is day one of a holy trinity of hard-core developers working on the exact same piece of software, and pretty atypical in the history of Reloaded. I spent the morning setting everyone up with the latest version and code sharing so we are all now dancing on everyone else's toes, but it should speed up development. Can already report some good fixes, good learning and a better product, and it's only been four hours. Posting my blog early as there is a good chance of a power cut this evening at my end.   You will be pleased to hear the memory gauge is now sensible and starts at zero and only goes up when new stuff is added to the level. It pretty much stays under 25%, no matter what I throw at the level, so that's cool!

Good News For The UK Dev Community

A momentous day for all UK developers: the EU just authorized tax relief for all UK developers who pass some criteria for Britishness.  Here is a good article on the subject:

Wednesday, 26 March 2014

Optimized Meeting


Three hours sleep, three hours drive, five hours meeting, three hours drive home and one hour knocking off emails.  Managed to get the meeting wrapped up in short order, with a view to getting you the next beta as quickly as humanly possible without sacrificing the required testing time. To that end you will be getting shorter blogs from me during the next two weeks and perhaps less posting on the forum, but you can be assured that all the extra minutes will be spent targeting the most critical of issues for the next version.  I am also resetting my body clock to daylight hours for the next two weeks, which means no more late night coding sessions and an early night tonight. In the spirit of this momentum, welcome to the end of my blog post.

Tuesday, 25 March 2014

Tuesday Prep Day


Sorry to say there is not much to report, save that I have now prepared all my materials for the two day meeting and everything is ready on my USB stick and Ultrabook. I set off at 6:30AM so that gives me a few hours sleep before I start off back to the real world.

I did continue to knock down my email inbox and organize some other bits and bobs such as an easier piece of grass code for later integration and cleaning up my dev area to give me some breathing room for continued debugging. I am also in the process of upgrading to Visual Studio 2013 as I have learned that the new NVIDIA NSIGHT visual debugging tool is suited to this version and that their improved support for DirectX 9 will be there - fingers crossed.

I did get a nice chat with the guy who basically writes the NSIGHT tool when I was at GDC and I added my vote for the ability to debug shader scripts from within the graphics debugger, so hopefully we might see that!

I also jumped into the forum to make sure all answers were forthcoming from the team and also ensured that this blog went out, despite being a little thin on the ground.

Also remember I had a nice long chat with Ravey who has been doing the software CPU occlusion stuff while I was away, and it's looking pretty good with the hierarchical z-buffer rendering 10K polygons under 1ms (without too much optimization at that) and his next stage is to create a mip-chain from that data and then run the occlusion query on the appropriate buffer slice. Once he is there, I can integrate that into the engine and replace the GPU dependent one (you remember, the one that creates that massive stall and halves your frame-rate).
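For anyone curious what a mip-chain over a software depth buffer might look like, here is a minimal sketch. It is not Ravey's code; the function names, the "larger value means farther away" depth convention and the square power-of-two buffer are all assumptions made for illustration. Each coarser level keeps the farthest depth of its four children, so testing an object's screen rectangle against one or two coarse texels stays conservative.

```python
import math


def build_mip_chain(depth):
    """Build a max-depth mip chain from a square, power-of-two depth buffer.

    depth[y][x] is the distance to the nearest occluder (larger = farther).
    Each coarser texel stores the FARTHEST of its four children, so it is a
    conservative bound for everything underneath it.
    """
    chain = [depth]
    while len(chain[-1]) > 1:
        src = chain[-1]
        half = len(src) // 2
        chain.append([
            [max(src[2 * y][2 * x], src[2 * y][2 * x + 1],
                 src[2 * y + 1][2 * x], src[2 * y + 1][2 * x + 1])
             for x in range(half)]
            for y in range(half)
        ])
    return chain


def is_occluded(chain, x0, y0, x1, y1, nearest_depth):
    """Conservatively test a screen rect (inclusive texel bounds, inside the buffer).

    Picks the mip level at which the rect spans at most two texels per axis,
    then reports 'occluded' only if the object's nearest depth lies behind
    the farthest occluder depth of every texel it touches.
    """
    size = max(x1 - x0, y1 - y0) + 1
    level = min(len(chain) - 1, max(0, math.ceil(math.log2(size))))
    mip = chain[level]
    lx0, ly0, lx1, ly1 = x0 >> level, y0 >> level, x1 >> level, y1 >> level
    farthest = max(mip[y][x]
                   for y in range(ly0, ly1 + 1)
                   for x in range(lx0, lx1 + 1))
    return nearest_depth > farthest
```

The point of choosing the level from the rectangle size is that a large object needs only a handful of reads from a coarse slice, rather than a scan over every full-resolution texel it covers.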

I also learned at GDC that a new 'direct to GPU' evolution will eventually allow us to take results from the GPU memory without any significant stalling so there is still hope for an entirely GPU occluder (but not today for me).  I also learned from the Intel graphics guys that, as the video memory is pretty much shared with the system memory in parts, I could effectively avoid a stall by using that common area for my occlusion texture results for a unique 'integrated' solution to occlusion, but again this is all theory, smoke and mirrors until I roll my sleeves up and see if it works.

For now, I have a week of brainstorming and planning, and then Friday onwards I barricade the door, take the phones off the hook (do they have hooks any more?) and dive into the remaining performance tasks in the direction of a new 'much anticipated' version.

Monday, 24 March 2014

Back From GDC - What A Week!

Work & Play

Landed Monday night, straight to sleep, up Tuesday for a full day of presentations, meetings, setting up demos and then off to the pub for a swift half. Wednesday through Friday included booth appearances, two speaker presentations, meetings and of course a quick tour of the EXPO floor to see what's hot in 2014.


I think this picture sums up my GDC week very well. On the face of it, it looks like Lee checking out the latest V.R experience, a wires-free peripheral vision virtual reality running Android OS with full calibration-free head tracking all the way from London (amazingly had to fly half way around the world to check out tech from my own backyard).

On closer inspection though, you will see that Lee is drinking a virtual beer. The combination of cool future tech and the perfect beer is very much the take-away from GDC 2014, and I was greatly honored to be part of it.

Made most of my social feeding on Twitter which meant my blog was pretty silent for the week I was away.  Naturally no development was made, but plenty of testing was done on several Ultrabooks plus a snazzy new device which was the size of a box of teabags and contained a full PC with Iris-Pro graphics.  This little brick ran my Reloaded demo at super fast speed, even though it was an integrated graphics device, and I was quick to learn that Intel are now dedicating about 75% of their available silicon on the latest processors to graphics!

For Reloaded this means we are really hitting the mark when it comes to performance, and being best friends with integrated graphics means we are also best friends with low-end graphics cards too!

As much as I enjoyed my time talking shop during the event, I also enjoyed my occasional stroll around the EXPO floor looking at the amazing tech and learning of the incredible announcements. Had a chance to meet some of the guys and gals at PrioVR and try out their body suit, which in my humble opinion is a game changer you can only appreciate by using it.

The announcements from CryTek and Epic are set once again to transform the ecology of game making solutions, and are probably giving Unity developers something to think about too :)

New hardware devices are coming thick and fast, and I managed to check out three new VR headsets as well. Wanted to try the Sony headset but the queue was a mile long, but by all accounts it is as good as the new Rift DK2. I put my order in for DK2 as soon as I got back, and will be watching and waiting for what the Sony VR device does.

I don't want my blog to descend into a GDC report as I am sure you can check out the news from dedicated bloggers. My personal impression was that GDC represented some of the best stuff happening in the games industry right now and it's cool to be part of it.

My own GDC demo could have been better I felt. The visuals needed to be a few notches higher, it had some stability issues on my older Ultrabook and my new voice-control system suffered from the overall 'noise' of a large conference style event. The introduction of effective noise-cancellation technology would do wonders to improve that scenario!  Amazingly I felt the performance was fine during my demo, but I did notice something very strange which meant even though 'mouselook' was silky fast, the 'move player' stepping was very stuttery. I suspected the physics coding was to blame, and it will be something to look at when it becomes an issue for the community.

I did return to a nice surprise in the form of a new 'grass system' which eliminates the CPU stall issue and increases overall processing speed. I also got some free shader tweaks as well, and it looks as though I get extra speed with no visual or functional drawbacks. I will be poring over the code once I have cleared my not insubstantial email mountain.

My attention was given to a forum post which raised issue over the lack of feedback during GDC week, and the lack of a beta to replace the last one which left many users without a stable version.  Hopefully I have addressed the lack of info with this blog and I will be looking at the second point this week. I should get back into the coding swing of things from Tuesday but I have meetings Wednesday and Thursday so expect this to be a slow week for development progress, but I will still blog for you, even if it's a picture of the juicy steak I ate Wednesday night ;)

Saturday, 15 March 2014

Birthday Boy = Lee


You might think I spent the day fine tuning my GDC demo and making sure all the hardware worked as expected. That would be the professional thing to do. Alas, I discovered on waking that it was my birthday, and my presents consisted of Jameson's and Guinness. What's a guy to do? The consummate professional that I am, I spent three hours in the garden raking back the soil for a new lawn and then five hours drinking my presents and watching a marathon viewing of Pawn Stars.

I am now sitting down to the PC and all remaining work at 1:30AM (after midnight) and pondering my next move. I think I can finish up the demo, and then do the tests on the Ultrabook, then copy the latest files to the USB, but whether I can do this before my pre-timed path at 2AM I am not so sure.

The demo might be doomed, but at least I had a great birthday day, and my future includes a nice soak and perhaps an episode of Poirot.  Rest assured, my built-in need to get the job done will see me through ;)

Also, as a teaser, you should get a real sense from this pictorial hint at the secret sauce I have mere hours to integrate, and I can also assure you that I did NOT play any part in its coding. As you may already know, I am 100% performance work right now!

P.S. The hat was purchased over 15 years ago from a cowboy in Florida, and despite its own personal history, still resembles a hat :)

Friday, 14 March 2014

Last Day Of The Week


Another week, and another round of performance improvements while juggling the issue of keeping the visuals sweet.

I have sacrificed a few FPS, but managed to add a system which renders the details provided by normal maps into the terrain, in addition to adding specular as well. The trick is pretty neat and means it only costs near the camera, and as the terrain pixel gets further away it fades out the high definition normal surface lighting and reverts to a cheaper lighting formula.
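The distance fade can be sketched roughly as below. The real work happens in a pixel shader; this is plain Python purely to show the blend, and the function names and the fade distances (500 and 1500 units) are made-up values for illustration, not the engine's actual settings.

```python
def detail_blend(distance, fade_start, fade_end):
    """Weight of the detailed lighting: 1.0 up close, 0.0 at fade_end and beyond."""
    if fade_end <= fade_start:
        raise ValueError("fade_end must be greater than fade_start")
    t = (distance - fade_start) / (fade_end - fade_start)
    return 1.0 - min(1.0, max(0.0, t))


def shade_terrain(cheap_light, detailed_light_fn, distance,
                  fade_start=500.0, fade_end=1500.0):
    """Blend cheap lighting with normal-mapped lighting by camera distance.

    detailed_light_fn is only called when its weight is non-zero, which is
    where the saving comes from: far pixels never pay for the detailed path.
    """
    weight = detail_blend(distance, fade_start, fade_end)
    if weight == 0.0:
        return cheap_light
    return cheap_light + (detailed_light_fn() - cheap_light) * weight
```

Because the weight falls off smoothly between the two distances, there is no visible line where the detailed lighting switches off.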

The biggest kicker, and I will be striving to add it before I jet off, is to separate the static shadows from the dynamic ones (i.e. shadows cast by dynamic moving objects like the character), so that I can cast a higher resolution shadow from the chappy and make him pop more in the scene.  It will definitely be in the next Reloaded beta but I would have liked it to be in this version by now. Ah well.

I wanted to do a few more visual/performance tweaks, but time has run out for this week. It's now 3AM and I really need to add some 'secret sauce' now for the GDC demo next week.  I can only reveal what that sauce will be at the event, but I can blog it in much detail on my return.

The GDC plan has been brewing for weeks now, and at last count my schedule is pretty packed and I hit the ground running on Tuesday AM and don't stop until Friday when the EXPO closes its doors. If you are doing GDC this year and want to know which parties I am attending, the biggest one will be the YetiZen at the stadium. I will 'not' be attempting to drink my own weight in beer (as that would be fatal) and instead will be doing my part to remain sensible enough to tweet and blog my way through the week.

Thanks to the awesome crew at Intel, I will be on their booth every day of the Expo, doing a little talk on the magic of AGK and Reloaded, joining in several discussions on the next steps in technology and meeting a few guys and gals who might help us push our products during 2014.  I am also attending a session on how to use the NVIDIA NSIGHT tool to deep dive my graphics engine and really squeeze every last analytic morsel out of what is going on. Such knowledge would have led me to the cost of the shaders months ago! It will be a busy week, and I hope to learn a lot, and then return refreshed to continue the good fight.

Thursday, 13 March 2014

One Day Closer To GDC Demo (and your next beta)


Today I have been working through more of my GDC tick-list, which for the most part is about more performance, more predictability and a little common sense. The characters now 'mostly' behave in close quarters combat but the more I work with it, the more I want to press the reset button and start simpler. I think the AI is thinking a little too much right now :)  Performance found in other areas and I have also identified a HUGE drain when MANY characters are animating at the same time. I have posted a Challenge to the Reloaded Forums to see if anyone has a good solution to the problem. It's not rocket science, but it will be interesting to see what solutions are presented.

It's already past 1 AM and despite a full day of coding, there is still plenty to do from my list and some things that are not even on the list. Right now I am making a small level using the new Veg Plus Pack to create a nice looking scene that runs fast, looks good and presents the main ingredients of what Reloaded can do so far (admittedly it could be more).

There are a few issues I need to solve on Friday such as the low resolution shadow could be higher, the gun needs some shadowing from the cheap dynamic terrain shadow, the entities could use the same, objects in the distance do not need to animate as much as those in the foreground, some collision issues with the branches of larger trees, maybe add some normals to the terrain shader when in closer range and experiment with some specular without costing performance.  In truth the list is insanely huge, but I need to select the ones that either improve (or don't cost) performance, and make the visuals better.

Not really too much time left. I fly out Monday, and drive to the airport Sunday so I only really have FRI and SAT to do my thing before packing my stuff and ensuring my head is screwed on for the trip.

I visited the forum today and made some feedback posts, and it is my intention (trips permitting) to visit the forum twice a week and respond to everything on the first page so hopefully this will feed some extra information while I keep the next version close to my chest.

OTHER NEWS: We will have an announcement on Day One of GDC, so keep checking the social feeds for this as you may be able to benefit from GDC fever and pick up some freebie goodness.

Wednesday, 12 March 2014

Great Meeting And Good Work


Had a good meeting on Tuesday (got patted on the back twice, so as you can imagine I was instantly suspicious). The cause for celebration was the fact I have been hammering performance so much it now runs at 60 fps on an integrated graphics chip. Sure, it's a snazzy Intel HD 5000 chip, but it's still a far cry from a dedicated watt gobbling monster most gaming rigs are rammed with. It is the LOWEST shader levels, and I have some visual tweaks to make, but it's a great improvement over previous versions.

I also had a power cut today, albeit my fault as I invited an electrician to look at my antiquated consumer unit and his little box of tricks tripped the whole house.  I was half way through an email too.  The great news is that my PC, Monitor and main Hubs all remained powered, and the UPS device only showed about 25% drain after what must have been 30 minutes. Pretty good! My email was safe!

Today I have done a few things, listed out of context but of some interest. Sped up the reflection render by removing grass from all but the highest setting, stopped start marker texture from disappearing, save level no longer crashes under new DXT5 compression mode, modified simplest water shader technique to use a bluer blue, fixed the jutter and freeze when on water so the player can glide along it smoothly (water antics to follow after performance solved) and some smaller tweaks.

It's just gone midnight and I am tackling the water edge boundaries for AI characters so they can miss the water's edge altogether.  Also investigating using the same technique to block off high hills so enemies cannot traverse over vertical cliffs to reach you.  I am conscious however that I want an early-ish night to get up on Thursday in the AM for some parallel development work on some items for GDC so it's a tricky one.  I will plod on some more and see where it takes me.

Monday, 10 March 2014

Ultra Look Day


Tuesday is meeting day, which means Monday is 'preparing everything for meeting day', including a series of demos for the Ultrabook I will be taking to the meet up. On this occasion I am taking my newer Haswell with integrated HD 5000 beastie with me (the same one I will be taking to GDC).

I ran some early tests and the performance is good, but it needs a few extra boosts and some cosmetic tweaks before I can present it in a good light. This is my work for the evening, but not too much as I have to be out the door at 5AM Tuesday for the drive.

Gives me a few more hours, but I can report some very good early statistics from the mobile PC which bodes well for a lot of low-end cards out there :)

Sunday, 9 March 2014

Weekend Research


Did a little research over the weekend on the subject of lighting models in relation to performance. There is an argument for implementing something like a cluster deferred lighting renderer which has the potential to reduce the 'lighting/shadow cost' which was the root cause of the present performance issues. Alas such an implementation (done correctly) will require an upgrade to DirectX11 and a new deferred rendering pipeline which combined would take a few months of dedicated coding. That is, nothing else would get done while this work happened, and the kicker would be that some users will not even notice the difference, aside from some frame rate changes and the benefit of adding thousands of lights without a performance hit.  It is a thankless task that has more long-term benefits than short term goodies.

I expect everyone wants performance yesterday so will not be willing to sanction a six month sabbatical while Lee buggers off to re-write the entire graphics engine. To that end, the smart course is to finish the optimization work on the DX9 engine and get it as fast as it needs to go, so that when we do upgrade to DX11, we still have a very good fall-back for those users still using Windows XP and Vista (DX11 won't work on those OS platforms I think). Moreover, DX11.2 only works with Windows 8.1. See the pattern ;)

Anyhoo, the reason for my quick weekend blog is to write down a small idea I had about the shadow system I am working on. Right now the fast entity shader (LOWEST) does not use the dynamic terrain shadow texture due to the relatively low resolution texture and the lack of any meta data in the shadow texture to work out whether to shade entities at higher elevations (i.e. the roof gets a shadow when it should not).

My simple (and fastish) idea is to feed in the texture holding the height map data of the terrain, which will give each XZ coordinate a world space height position. I then write how 'deep' the shadow pixel in the dynamic terrain shadow texture is instead of just black/white. From these two pieces of information, I can work out whether a single world space position of the entity is in or out of the shadow being cast.  I would have to increase the texture size of the dynamic shadow texture to get a better finish, and there is a concern that the extra per-pixel calculations and texture read might create some drag factor in terms of performance, but the theory is sound in my mind. It's not a lot of work and it would mean entities get 'almost' true shadowing, just as terrain and grass currently receives.
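To make the idea a bit more concrete, here is one way the test could be sketched. This is an interpretation rather than the finished shader: it assumes the shadow texture stores how far above the ground the shadow reaches at each XZ cell (0 meaning no shadow), and the function and parameter names are hypothetical.

```python
def in_shadow(height_map, shadow_depth_map, x, z, world_y):
    """Decide whether a world-space point sits inside the cast shadow.

    height_map[z][x]       : terrain height at this XZ cell (world units)
    shadow_depth_map[z][x] : assumed encoding, how far above the ground the
                             shadow extends here; 0.0 means no shadow at all
    world_y                : height of the entity point being shaded
    """
    ground = height_map[z][x]
    depth = shadow_depth_map[z][x]
    return depth > 0.0 and world_y < ground + depth
```

With this encoding a table standing under a canopy is shaded, while the roof casting the shadow is not, which is exactly the case a plain black/white shadow texture gets wrong.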

I have all day Monday to work on this, plus the other ideas I have, and the big job of getting it all on an Ultrabook as my meeting is over 100 miles away from my main machine. It's a good test however as GDC is even further and this trial run will be very revealing.


And now, I will forget all that stuff and see if I can boot up Thief and continue my pilfering in the dark and rain soaked streets of what looks like London. It's possible they put Big Ben in there for the 'pending' UK tax breaks. It will be interesting to see UK developed titles in the next few years coming out with all manner of Britishness crow-barred in. The next time you play a fast paced zombie-horror blood-splat gore-fest shooter, and have to consume 'cream teas and buttered scones while affecting a cockney accent' to restore your health, you can blame the politicians of Europe! Interesting times!!

Friday, 7 March 2014

End Of A Good Performance Week


As the weekend approaches, I wanted to end early (midnight) today but it's now gone 4AM. Apart from making the vegetation and entity shader follow in the footsteps of the new shadow system, I've spent probably far too much time tweaking the lighting and pixel effects of the LOWEST shaders to get as close to the HIGHEST ones without adding to the performance hit. It's educational, but it's slow work.

Here is the before shot with everything set to HIGHEST and using the expensive fragment shaders:

Here is the same shot but using my new LOWEST shaders:

The terrain and grass are rendering shadows, and the entity approximates a shadow effect (but I want to do more here somehow/somewhere). Both are rendering all four cascades and the fifth dynamic terrain shadow texture but the LOWEST has a few more tricks in that I can completely switch off the cascades and only draw to the fifth texture when something moves. I did some tests prior to these shots, and I could get another 40 fps by switching them off without losing my shadows.

The clock has beaten me (once again), but I have had lots of extra ideas on top of what I have now including the addition of meta data into the dynamic terrain shadow texture (DTST) to store information about the shadow being cast (very similar to deferred rendering but with local render targets). This extra info would allow me to shade entities 'above' the floor surface such as tables and things under canopy. I had it 'mostly' working without this, but the tops of entity roofing got shaded too which was a bit displeasing.

I also thought of reading the maximum texture size allowed on the card and then create the DTST to that size, giving my shadows greater resolution. My GeForce 9600 GT can create textures 8192x8192 large, which will increase my shadow resolution by a factor of four. For terrain and grass it is not too noticeable (but enough), but I really need a higher resolution for the entities!  It may be straying into visuals vs performance though, and there is much to do yet on the performance side (despite the early good results).

I have still to research the static vs dynamic DTST idea to avoid rendering ANY static entities after the initial blast, and using a different texture format for the DTST to reduce the memory it takes (16MB right now, 256MB if I use an 8192x8192 texture). Ouch. With an 8-bit format, this would drop to a more friendly 64MB. AND I want to see how much I can move some of the pixel shader work into the vertex shader to increase calculation efficiency. So many ideas, too little time!
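A quick way to sanity-check these texture memory figures is to work them out from the dimensions and texel size. The sketch below assumes the current DTST is 2048x2048 at 32 bits per texel (which matches the 16MB figure and the factor-of-four jump to 8192) and ignores mipmaps and any driver padding.

```python
MIB = 1024 * 1024


def texture_bytes(width, height, bits_per_texel):
    """Raw size of a single-level texture, ignoring mipmaps and padding."""
    return width * height * bits_per_texel // 8


# Assumed current DTST: 2048x2048 at 32 bits per texel.
current_mib = texture_bytes(2048, 2048, 32) / MIB   # 16.0
# Same format at the card's maximum size.
large_mib = texture_bytes(8192, 8192, 32) / MIB     # 256.0
# Maximum size with an 8-bit single-channel format.
large_8bit_mib = texture_bytes(8192, 8192, 8) / MIB  # 64.0
```

Four times the resolution on each axis means sixteen times the texels, which is why the format matters so much more at the larger size.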

Before I turn-in, I will leave you with a video monologue I made this afternoon as I was attempting to explain the new shadow system. I think it merely serves to confuse everyone, but it's material you might like:

Have a good weekend, and if I get up in time, I might have one too.  I just realized "Thief" (reboot) has been released on Steam and I had it pre-ordered, so I think I will play a few hours of my all time favorite franchise as a little treat for getting some serious performance work done this week.

Thursday, 6 March 2014

Long Day - Early Night - I Wish


As I was forced out of bed at an ungodly time (NOON), my eyes are telling me that 1AM is the time to stop working.  I can't burn the candle at both ends like I used to (at least not the candle I'm currently burning).  Progress has been steady all day, but in the last hour it has taken on a slightly sour note. My main task of creating a faster shadow system is fine, but when I promoted the code to the main editor, the engine crashed when you save levels. You could not make this stuff up!  And worse, it's one of those D3D9.DLL crashes that leave zero clue as to the root cause. Friday will certainly involve lots of undo tweaks to see when the crash stops - grrr.

Anyhoo, the boon of the day was to be a new dynamic terrain texture generator using quads and cascade detail, but it would have taken days and there was no guarantee it would give me a substantial increase once the new code and its performance hit was taken into account.

I decided instead to target something I KNEW would give me a boost, which was eliminating the expensive terrain shader without losing my shadows. The idea I had was to collect the shadow information in a large dynamic terrain texture image directly from the cascade shadow map information and then use that data at the more extreme distances. It would effectively replace 90% of the expense in the terrain shader with a simple texture read. Everything has gone smoothly for the most part, but getting the shadows to line up with the terrain heights is proving a 3D headache.  Not being able to save my test levels is the icing on the cake of this headache :)

To cheer you up, I can report that the old terrain shader would run at 53 fps with shadows on. The new 'flaky visual' version can run at 131 fps with shadows on. The shadows are more blocky and out of place right now, both to be corrected, but the performance boost is undeniable.  Let me repeat, the shadow slider was not switched off, it's still rendering 4 cascades and I still get 131 :)

The image above shows the system as it was being built. The blue square is the new dynamic terrain shadow texture and the black square inside it is the current largest shadow cascade area in the camera view. The white dots are the shadows cast from the buildings. As you move around the scene, this blue texture is updated with the shadows from the cascade and slowly builds a picture of relevant shadows that the lower-powered shadow renderer uses. Once I have everything straightened up and looking pretty again, I will write something which will populate this dynamic texture at the start of the level so even distant shadows will be rendered (as you would expect).  I have some other ideas about using 'multiple' dynamic textures for higher quality long-term storage of terrain shadows but I want to get everything back together first.
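As a rough sketch of how such a texture can build up over time, the snippet below keeps a persistent grid and stamps the current cascade's shadow values into it wherever the cascade currently sits. The class and method names are invented for the example, the grid is a plain list rather than a GPU texture, and real code would also have to map world coordinates to texels.

```python
class DynamicTerrainShadowTexture:
    """Persistent shadow grid that remembers what past cascades have seen."""

    def __init__(self, size):
        self.size = size
        # 1.0 = fully lit, 0.0 = fully shadowed; everything starts lit.
        self.texels = [[1.0] * size for _ in range(size)]

    def update_from_cascade(self, origin_x, origin_z, cascade):
        """Copy a cascade's light values into the grid at the given origin.

        Texels falling outside the grid are skipped; texels outside the
        cascade keep whatever an earlier update wrote there.
        """
        for dz, row in enumerate(cascade):
            z = origin_z + dz
            if not 0 <= z < self.size:
                continue
            for dx, value in enumerate(row):
                x = origin_x + dx
                if 0 <= x < self.size:
                    self.texels[z][x] = value

    def sample(self, x, z):
        """Cheap lookup used in place of the full cascade shadow maths."""
        return self.texels[z][x]
```

Because old texels are never cleared, shadows seen earlier remain available after the cascade has moved on, which is the "slowly builds a picture" behaviour described above.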

So in conclusion, despite the engine having wires hanging out of it and crashing, I think I am on the coat tails of a serious performance improvement and if I can get the visuals comparable to the expensive per pixel fragment shader version, we'll be laughing.

Wednesday, 5 March 2014

Strange Frame Rate Fruit


A rather mixed bag today. Once I had dispensed with the first five hours of the day on menial stuff and nonsense, I began the adventure of performance seeking in earnest. My plan was to re-introduce the quad system, but tie it to the individual static objects in the scene, so in addition to their LOD transitions they would have quads at the furthest range. This was accomplished relatively quickly, but the next part of my evening would be filled with non-quad related musings.

It turns out that after adding the quad system, and as a quick test replacing ALL static objects with their quad equivalents, and ensuring that the quad textures or quad vertex buffers were not being locked, I only gained 4 fps for my trouble on a GeForce 9600 GT using the 'run to the river' level with everything set to Low. That's right, I went from 141 fps with real objects to 145 fps with quad replacers. You could have knocked me down with a feather. My knight in shining quad armor turned out to be a total faker. It was remarkable in that the 141 fps was pushing 134K polygons with 225 draw calls and the 145 fps was pushing 86K polygons and 132 draw calls. I achieved my goal of halving the draw calls but I did not get my reward of more frame rates. I should also mention that I switched off my built-in occlusion system for this test, initially as a way to un-bias my results but in fact when I did remove my performance helping system I went from 66 fps to 141 fps. Sometimes the universe likes to have a huge belly laugh at my expense!
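Frame rates can be misleading at the high end, so it sometimes helps to convert them into milliseconds per frame. The small helper below is only an illustration (the function names are made up), using the fps numbers quoted above.

```python
def frame_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def saved_ms(fps_before, fps_after):
    """Milliseconds per frame gained by going from one frame rate to another."""
    return frame_ms(fps_before) - frame_ms(fps_after)


quad_saving = saved_ms(141, 145)       # about 0.2 ms per frame
occlusion_saving = saved_ms(66, 141)   # about 8.1 ms per frame
```

Seen this way, the quads bought roughly a fifth of a millisecond per frame, while switching off the occlusion system gave back about eight.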

Getting 163 fps on a GeForce 9600 GT - It IS possible but at too High A Price

Undeterred, I decided to abandon logic and look for something that would give me some more performance. Having done everything right by adding occlusion, quad rendering and other object thinning methods and not getting a prize in performance, I decided to spend an hour running a battery of daft tests until I saw a big jump in performance.  I finally found one such spike, which happened when I moved the camera to look at the sky but not so far that the terrain and ground objects became invisible. I noticed that the more terrain was rendered, the more frame rate drain occurred.  I then replaced the terrain shader with a single color draw and the frame rate went through the roof.  It seems the single biggest performance killer is my terrain shader, which has the job of painting most of the 100K-200K polygons in a typical scene.

The good news is that I have recruited someone (Dave The Ravey) to help me reduce how much terrain is rendered in the first instance, but now I know the shader is a crucial bottleneck to fast gaming, this is the focus of my performance hunt on Thursday. Unfortunately, I have to take a phone call during daylight hours so have to cut short my development tonight and resume when I wake.  My initial thoughts are that rendering every terrain pixel with my intense terrain shader is just daft, and that if I can render a cascade of terrain textures on the GPU and then feed those textures to the terrain rendering, it will all but eliminate the drain which can take a frame rate from over 300 fps down to 110 fps.  Naturally, whatever I use to build the dynamic terrain texture will cost something, and balance the scale a little, but having experienced almost no drop in performance when rendering LOTS of quad textures, I think I can get away with it.  There are some big holes in this approach however, such as no dynamic shadow information getting to the terrain shader, but I think I can re-channel the shadow to the dynamic terrain texture and get the shadows back.  Using this new technique, I will also be able to introduce 'texture splatting' as a freebie feature, allowing more than four textures per terrain :)  I won't be doing anything on this feature until performance is solved, but it will be a great bonus if this new technique works as I expect it might.

I think you can start to appreciate why some game developers decide to buy a middleware engine for $250,000 and skip the whole process of figuring out the best way to do everything :)  I certainly can, but I won't be beaten!!

Tuesday, 4 March 2014

Crash Alley Tuesday


Managed to delegate some more, reduce my inbox count and get an internal version out the door for more testing and veg video production, but the main headline story is the sequence of inexplicable crashes I have had to wade through. You all know the frustration as developers of not getting to the task you want to start because of a handful of totally mysterious bugs that need fixing first. Well that was my day. Even after 30 years of coding I can still be tripped up by a crash whose cause is the random corruption of some data at some point in time, and the crash event sheds no light on either.

Added a new flag in the meantime called "dividetexturesize=X" which scales down the textures loaded into the engine. I will have it default to 2 so that lower end machines are not choking their GPU video memory with 2048x2048 textures, and those with higher end cards can simply adjust this value to 1 for the highest resolution textures. In the future I might introduce some modes which do not reduce very small textures as they have much less impact on overall video memory usage.
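As a rough illustration of what the flag buys, the sketch below divides a texture's dimensions by the flag value and estimates the video memory used. It assumes uncompressed 32-bit textures with no mipmaps, which may not match what the engine actually loads. The function names and the `skip_below` parameter (standing in for the possible future mode that leaves small textures alone) are hypothetical.

```python
def scaled_size(width: int, height: int, divide_texture_size: int = 2,
                skip_below: int = 0) -> tuple:
    """Return the texture dimensions after applying dividetexturesize.

    skip_below is a hypothetical extra: textures whose largest side is at or
    below it are left untouched. 0 disables that behaviour.
    """
    if divide_texture_size < 1:
        raise ValueError("dividetexturesize must be 1 or greater")
    if skip_below and max(width, height) <= skip_below:
        return width, height
    # Never let a dimension collapse to zero.
    return (max(1, width // divide_texture_size),
            max(1, height // divide_texture_size))


def texture_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Estimated size assuming uncompressed pixels and no mipmaps (an assumption)."""
    return width * height * bytes_per_pixel


if __name__ == "__main__":
    for divide in (1, 2):
        w, h = scaled_size(2048, 2048, divide)
        mib = texture_bytes(w, h) / (1024 * 1024)
        print(f"dividetexturesize={divide}: {w}x{h}, about {mib:.0f} MiB")
```

Under those assumptions a 2048x2048 texture drops from about 16 MiB to about 4 MiB at the default of 2, because halving each side quarters the pixel count.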

I've just had a bath and waited for a Eureka moment on the crash issues, but none came. I have thus decided to approach the problem methodically. I will totally analyse the area of memory that the crash occurs on, record as much as possible of the before, during and after states, and set up monitors of the data around the memory block in case something else changes it.  It's long-winded, but it eliminates the need for intuitive guesswork, and if I can finish off today (by 3AM) with a fix of this most elusive bug, I will be a happy chappy.
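The methodical approach can be sketched in miniature: snapshot the block plus a few guard bytes either side, then re-check at chosen points and report exactly which offsets changed. This is only a toy model using a Python `bytearray` as stand-in memory; the `BlockWatch` class and its parameters are hypothetical and not how the engine or a debugger does it.

```python
import zlib


class BlockWatch:
    """Watch a region of a bytearray (plus guard bytes) for unexpected changes."""

    def __init__(self, memory: bytearray, start: int, length: int, guard: int = 16):
        self.memory = memory
        # Include some bytes before and after the block, clamped to the buffer.
        self.lo = max(0, start - guard)
        self.hi = min(len(memory), start + length + guard)
        self.snapshot = bytes(memory[self.lo:self.hi])
        self.checksum = zlib.crc32(self.snapshot)

    def check(self, label: str) -> list:
        """Return the absolute offsets that differ from the snapshot."""
        current = bytes(self.memory[self.lo:self.hi])
        if zlib.crc32(current) == self.checksum:
            return []  # cheap path: nothing has changed
        changed = [self.lo + i
                   for i, (old, new) in enumerate(zip(self.snapshot, current))
                   if old != new]
        print(f"[{label}] {len(changed)} byte(s) changed at offsets {changed}")
        return changed


if __name__ == "__main__":
    memory = bytearray(64)
    watch = BlockWatch(memory, start=16, length=8, guard=4)
    watch.check("before suspect call")
    memory[13] = 0xFF  # simulate a stray write just in front of the block
    watch.check("after suspect call")
```

Placing a `check` call before and after each suspect routine narrows down which one tramples the block, at the cost of a little run time per check.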

My plan (for Wednesday now) is to skip the LOAD OBJECT memory work (which would only yield a small saving overall) and go for the huge performance gain of making all distant objects QUAD buffers, and then extend it to shadows, reflection and light ray cameras. It will reduce draw calls by MORE than half for the same visual, so it's well worth it and early tests show a marked improvement in FPS, even at lower levels.  My run-to-the-river currently runs at 10fps with everything on and 15fps with some conservative reductions. The game plays well at 30-35fps so that will be my initial aim on the GeForce 9600 GT card I am currently using.  I also plan to recruit one of my coders to help me with some terrain work which should vastly reduce the memory and performance footprint too.

Monday, 3 March 2014

Monday Already


Having crunched code over the weekend, it feels odd that Monday is my first day back.  Despite the strange feeling, I set to work obliterating my backlog of emails, sorting out the work for the next two weeks and getting some key issues resolved in the engine.  I have sent code off to Simon to add shadows to the Construction Kit and a new version to Rick so he can make some nice videos about the new Vegetation Pack.

Still the best news is the massive memory savings I made by loading the HUGE textures used by the engine into GPU Video Memory. The consequence is that I can make seriously large levels now and the system memory creeps up very slowly compared to the old version. I have ideas to make even more savings too, but as much as I want to chase that particular tiger down, the elephant in the room (performance) remains the highest on my personal snag list. I will be running a demo of Reloaded on an Ultrabook in exactly two weeks' time at the biggest developer conference on the planet, and I don't want egg on my face. Yes I will be saving face, but you will be getting some serious performance boon as a result so I think we both win.

I also got an email asking for clarity on the license terms of media provided in Reloaded, and the legality of releasing it as part of the standalone build. The official TGC EULA for this product allows you to distribute the assets (encrypted or otherwise) provided they remain part of the standalone demo you built. If anyone remembers the culture of sharing FPSC Classic games, same deal. We do specifically exclude the extraction of those assets for use elsewhere, either as part of a library or other game creator, but within the context of using the assets to show off your Reloaded creations you are quite safe.  It's also great to see the maturity of the community in asking such a question, as it demonstrates your respect for copyrighted material and the terms of use applied to digital media.

Saturday, 1 March 2014

A Change Is As Good As A Rest


It seems my busman's holiday paid off this time. Within two hours of hitting the code again, I discovered that although the LOAD OBJECT command was gobbling a little more memory than it should, the real villain of the piece was LOAD IMAGE which had been placing MOST of the texture content in system memory as a managed backup.  My building ate 23MB in the old Image DLL, but with some extra code to ensure certain commands could still lock a video memory texture, my new Image DLL loaded the same building in at 9MB. Multiply this effect by an entire level and you will start to appreciate how this single tweak has improved your lives in Reloaded land :)

My investigations also saw the LOAD OBJECT memory usage double when the LOAD EFFECT applied its magic to it, so there is something very suspicious happening there too.  Going to stop my weekend work for now, and celebrate my code win with some good, and Sunday/Monday I will chase down the reason for the shader taking my 3MB model up to 6MB (and then find out why my model was 3MB in the first place given that the vertex data cannot be more than 1MB).


In order to think laterally, you need to approach a problem from an entirely new direction. Leaving your code for a week, then returning to it is one way to do that, and leaving it to talk geek for a week is probably the best way to prepare your brain for the comeback.