Happy New Year 2013!
I've been playing around a bit more with the Nebula3/emscripten port over the holidays. Emscripten has seen some nice improvements during the past 2 months, mainly to generate smaller and faster code, and to drastically reduce code generation time in the linker stage (you can read up on this on azakai's blog).
The work I did on my experimental Nebula3 branch was only partially emscripten-related. The biggest chunk of work went into refactoring the higher-level parts of the rendering pipeline for the new CoreGraphics2 subsystem (lighting, view volume culling, and the high-level graphics subsystem which is concerned with Stages, Views and GraphicsEntities). A lot of code was thrown away or moved around, but from the outside everything looks much the same as before. External code which depends on the Graphics subsystem must be fixed up, but not rewritten.
Another big chunk of work went into implementing instanced rendering for the new CoreGraphics2 system. OpenGL offers several extensions for instanced rendering, but since none of the current WebGL implementations support any of these extensions, I first wrote a fallback solution which works without extensions. It uses bigger "unrolled" vertex and index data, and an instance-matrix palette in the vertex shader. With the current implementation, up to 64 instances can be collapsed into a single draw call. This limit depends on the number of available vertex shader uniforms, and since the ANGLE wrapper used by Chrome and Firefox on Windows generally restricts the number of vertex shader uniforms to 254, I had to go with only 64 instances per draw call. This restricts the usage scenarios of this approach, but when rendering a Drakensang Online map (for instance), 64 comes pretty close to the average number of instances of an environment object in the view volume. For particle rendering this approach would be useless though.
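To make the "unrolled" fallback a bit more concrete, here is a minimal C++ sketch of the idea. This is not the actual Nebula3 code: the names `UnrolledMesh`, `UnrollForPseudoInstancing` and `MaxInstancesForUniformBudget` are made up for illustration, and the sketch assumes 16-bit indices plus a per-vertex instance id which the vertex shader would use to look up its matrix in the palette.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the unrolled geometry (illustration only).
struct UnrolledMesh {
    std::vector<std::uint16_t> indices;  // index data repeated once per instance
    std::vector<float> instanceIds;      // one palette slot id per unrolled vertex
};

// Repeat the base mesh numInstances times. Each copy's indices are shifted by
// the vertex count of the copies before it, and every vertex of a copy gets
// that copy's instance id.
inline UnrolledMesh UnrollForPseudoInstancing(const std::vector<std::uint16_t>& baseIndices,
                                              std::size_t baseVertexCount,
                                              std::size_t numInstances)
{
    // assumption: 16-bit indices, so all unrolled vertices must stay addressable
    assert(baseVertexCount * numInstances <= 65536);

    UnrolledMesh out;
    out.indices.reserve(baseIndices.size() * numInstances);
    out.instanceIds.reserve(baseVertexCount * numInstances);
    for (std::size_t inst = 0; inst < numInstances; ++inst) {
        const std::size_t vertexOffset = inst * baseVertexCount;
        for (std::uint16_t idx : baseIndices) {
            out.indices.push_back(static_cast<std::uint16_t>(idx + vertexOffset));
        }
        for (std::size_t v = 0; v < baseVertexCount; ++v) {
            out.instanceIds.push_back(static_cast<float>(inst));
        }
    }
    return out;
}

// How many instance transforms fit into a uniform budget, given how many
// uniform vectors are reserved for other shader parameters and how many
// vectors one instance transform occupies.
constexpr std::size_t MaxInstancesForUniformBudget(std::size_t uniformVectors,
                                                   std::size_t reservedVectors,
                                                   std::size_t vectorsPerInstance)
{
    return reservedVectors >= uniformVectors
        ? 0
        : (uniformVectors - reservedVectors) / vectorsPerInstance;
}
```

With a budget of 254 vectors and nothing reserved, `MaxInstancesForUniformBudget(254, 0, 4)` gives 63 and `MaxInstancesForUniformBudget(254, 0, 3)` gives 84. A full 4x4 matrix per instance would therefore not quite allow 64 instances, so a 64-instance palette would need a more compact layout, such as three vec4 rows per instance. That packing is an assumption here.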
I also rewrote the emscripten filesystem wrapper. The original implementation was only a quick hack to get data loaded into the engine at all. I have now wrapped this into a proper subsystem which uses new emscripten API calls to download data directly into a memory buffer without mirroring the data into a "virtual filesystem". The new implementation also handles the file compression of Drakensang Online's HTTP filesystem. It's not the complete HTTP filesystem implementation yet though: the table-of-content files are ignored, as are the per-file MD5 hashes, and there's no local file cache apart from the normal browser cache. Also, while the emscripten filesystem wrapper is asynchronous, it is not yet multithreaded through the new WebWorker API. Decompression currently happens on the main thread and may lead to frame stuttering, but the plan is to move this into separate worker threads.
Finally I've uploaded a few new demos to http://n3emscripten.appspot.com. As always you should use an up-to-date Chrome or Firefox browser to try them out.
First, here's the old Dragons demo, recompiled with the latest emscripten version (press cursor-up to add more dragons). Thanks to the improvements in emscripten, and the house-cleaning to remove old code, the (compressed) download size of the JavaScript code is now only 308 kByte.
Next is a demo for the new instanced rendering. On startup, 1000 independently animated cubes are rendered, and by pressing cursor-up you can add 1000 more. There are also 128 point lights in the scene. Every 1000 cubes require about 32 draw calls: 1000/64 rounds up to 16 because the instancing collapses up to 64 cubes into one draw call, and this is doubled because of the NormalDepth and Material passes of the Light Pre-Pass Renderer. For every cube, a world-space transform matrix is computed per frame on the CPU (a conversion from polar coordinates to cartesian coordinates, involving two sin() and two cos(), and a matrix-lookat involving several normalizations and cross-products).
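The arithmetic above can be sketched in a few lines of C++. This is an illustration only, not the demo's actual code: `DrawCallsFor`, `Vec3` and `SphericalToCartesian` are hypothetical names, and the particular spherical convention (theta measured from the Y axis, phi around it) is an assumption. The lookat step is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Number of draw calls needed when up to maxPerCall instances are collapsed
// into one call and the scene is rendered in numPasses passes.
constexpr std::size_t DrawCallsFor(std::size_t numInstances,
                                   std::size_t maxPerCall,
                                   std::size_t numPasses)
{
    // integer ceil(numInstances / maxPerCall), then once per pass
    return ((numInstances + maxPerCall - 1) / maxPerCall) * numPasses;
}

struct Vec3 { float x, y, z; };

// Polar (spherical) to cartesian conversion with exactly two sin() and two
// cos() evaluations. Assumed convention: theta is the angle from the +Y axis,
// phi is the angle around the Y axis.
inline Vec3 SphericalToCartesian(float radius, float theta, float phi)
{
    assert(radius >= 0.0f);
    const float sinTheta = std::sin(theta);
    const float cosTheta = std::cos(theta);
    const float sinPhi = std::sin(phi);
    const float cosPhi = std::cos(phi);
    return Vec3{ radius * sinTheta * cosPhi,
                 radius * cosTheta,
                 radius * sinTheta * sinPhi };
}
```

`DrawCallsFor(1000, 64, 2)` evaluates to 32, which matches the number quoted for 1000 cubes.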
By hitting the space key you can also enable a disco-light posteffect for giggles. This adds a single-pass fullscreen posteffect which does a lot of texture sampling.
And finally I wrote a little Drakensang Online monster viewer. With cursor-up/down you can switch to the next/previous monster, with cursor-right you can flip between different skin-lists (appearances), and with cursor-left you can toggle a few animations (usually idle and running anims). Obviously the material shader is different from Drakensang Online: the color texture is replaced with just white, and the specular effect is exaggerated (which actually is a nice showcase for the really good normal maps of our character models). This is only a snapshot of what's currently in the game; in particular, most of the animations are not included. The strange cubes which are displayed sometimes are the mesh-placeholder objects. I think I'll remove them and just use no placeholder as long as the mesh is not loaded, but at least it shows that the placeholder system is working right ;)
That's it for today :)