It is finally time to release the first version of the Impostor Baker tool! It is currently uploaded to GitHub since I could not get access to the UE4 community server:
- Currently for UE4.19 only. I have a 4.18 version that is a bit older if anybody really wants it, but it's not uploaded anywhere.
- Everything should be downloaded to the location: \MyProject\Plugins\ImpostorBaker\ .
- The plugin called "BlueprintMaterialTextureNodes" must be enabled. (That one is my first C++ plugin in the engine. It makes it possible for the BP to save assets.)
- This is still beta. Please let me know of any issues.
- Basic instructions are in the included map file Generate_Impostor_Map.umap.
I will be talking about this in more detail at GDC next (this?) week. Here is more info on those talks:
I have been experimenting with impostors for several years now. The idea is not exactly cutting edge. There is a version of an impostor generator built into UE4 that I added a long time ago, but at this point it is pretty dated and borderline unusable (it exists under the Render To Texture BP Library). While the biggest problem with the old method is the cumbersome usability, it also wasted lots of texture memory.
This new version improves the workflow issues and makes more optimal use of the available texture space by using a layout based on octahedra.
I was able to spend time on a new version because I was asked to help out on development of the Apple Park AR Demo, so this Impostor Generator was first built and tested for that. Unfortunately I do not have permission to show any images related to that, so instead I will use examples from Fortnite, where it has been in use in Battle Royale since the map 2.0 update. I made a video showing the transitions a skydiver will see going from Proxy LOD to actual level geometry. The tree impostors are built into the LOD actors.
In order to make it easier to understand exactly what is going on behind the scenes for these new impostors, I think it is helpful to first think of Traditional Billboards since everybody understands them.
Did I mention the Impostor Baker has a Traditional Billboard mode too?
The three modes are: Full Sphere, Upper Hemisphere and Traditional Billboards. In order to better understand the impostors, I will first show the Billboards and progressively enable expensive features that make them look better. Then the similarity/overlap should make sense.
Billboards
The above video shows what the blueprint generates for "Traditional Billboards" mode. This is a 3x3 layout of frames with 8 side views and 1 top view (texture atlas shown briefly in video). Notice how the separate cards cause noticeable artifacts. Enabling "Dither" is like a masked Fresnel that uses DitherTemporalAA to smoothly hide the cards that are not facing the camera. It helps, but TempAA is not an option on all platforms (mobile, for example), so this option is not reliable. Speedtree billboards use the vertex shader to simply hide the cards that are not facing the camera.
Next, PixelDepthOffset (PDO) is enabled in the video. This uses the captured depth to offset the pixels and cause more interesting intersections between the cards and the world. It has the same problem: it's not supported (or not a good idea to enable) on all platforms, plus it makes TempAA look smeary when parallax does not match, for complicated reasons.
Finally, POM (parallax occlusion mapping) is enabled. Notice how this makes the transitions between frames much smoother. There are some strange artifacts at the edge of the frames (especially when approaching the top view). Those artifacts are very difficult to explain and fix, but I do not intend to spend much time fixing them, since enabling full POM on a billboard like this is quite expensive, and it would still require significantly more than 9 frames for POM to look smooth for all views.
The point I wanted to make with the above example is that while decent results can be had with a few shader tricks, it does not scale to a smooth high quality version. To increase quality we need to add more billboard cards. This makes the mesh heavier in terms of both vertex and pixel shader cost, and doing POM is slow enough without it being in lots of overlapping masked triangles. The 9 card cross is already 72-81 verts, so if we add many more, what's the point? At some point a very low standard LOD would be better, but it would be harder to make.
What if there was a way to capture more view angles, but render the final material without needing geometry for each card? That is basically what impostors attempt to do.
The idea behind impostors is to capture the object from a variety of view angles and store each view to a texture. An example of cameras arranged around an upper hemisphere:
The method used here is very important. In my previous impostor generator, I defined the capture points with evenly spaced X columns split over Y rows based on pitch angle. If that made no sense to you, perhaps this image will help:
You can see how each 'ring' of the sphere has the same number of verts. So the row near the top of the sphere has verts more densely packed than the equator. This is the layout the old impostor baker used. It wastes tons of resolution around the poles by capturing views that are very close together. And yes, the very top Z frame also gets repeated instead of just being a single view. At the time the only reason I used the above layout was because it was cheap to compute in a shader, whereas the more 'fancy' methods with equal spacing that people were writing papers about required expensive trig and texture arrays because the grids did not map to 2D space uniformly (for example, BN11 in references).
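To put rough numbers on that waste, here is a small Python sketch of a ring-based layout. The function names and the exact spacing (pitch rings from the horizon to the top, a fixed number of yaw columns per ring) are assumptions for illustration and not the old baker's actual code.

```python
import math

def latlong_capture_directions(columns, rows):
    """Upper-hemisphere capture directions: `rows` pitch rings, `columns` views per ring."""
    directions = []
    for row in range(rows):
        pitch = (row / (rows - 1)) * (math.pi / 2)  # 0 at the horizon, 90 degrees at the top
        ring = []
        for col in range(columns):
            yaw = (col / columns) * 2.0 * math.pi
            ring.append((math.cos(pitch) * math.cos(yaw),
                         math.cos(pitch) * math.sin(yaw),
                         math.sin(pitch)))
        directions.append(ring)
    return directions

def angle_between(a, b):
    """Angle in degrees between two unit vectors."""
    d = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    return math.degrees(math.acos(d))

grid = latlong_capture_directions(8, 8)
equator_step = angle_between(grid[0][0], grid[0][1])    # neighbors on the horizon ring
near_pole_step = angle_between(grid[6][0], grid[6][1])  # neighbors one ring below the top
top_step = angle_between(grid[7][0], grid[7][1])        # neighbors on the top ring
```

Neighbors on the horizon ring end up far apart, neighbors near the pole are only a few degrees apart, and every view in the top ring collapses to the same straight-down capture, which is the repeated top frame mentioned above.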
For this version, I am using Octahedra and Hemi Octahedra to handle the mapping. For those not familiar with octahedra, they are a convenient way to convert between 2D and 3D space, or vice versa. They have been used in graphics for a while to compress things like V3 normals to be V2 since the full direction is preserved (no Z sign loss as with DeriveNormalZ). UE4 makes use of them in this way internally a few times. Octahedra have very little distortion: slightly more than a 6 sided cubemap, but they are much, much simpler to compute.
To better understand how Octahedrons come into play, I made this animated gif. It shows the result of converting between 2D grid points and 3D points using Hemi-Octahedron (left) and full Octahedron (right).
You may notice that the full octahedron is more evenly spaced. While this is true, the Hemi-Octahedron still gains significant resolution for cases where you never need to see below the horizon. It is also kind of convenient that it gives a bit of over-sampling to the horizon, which is where trees are seen from in most games. Most trees should use the Hemi-Octahedron layout for that reason. Both of these are significantly better than the 'old style' method I referenced at the start; notice there are no verts really close together.
Note: I have flipped the edge direction around the poles above to make them symmetrical. The current shader is not actually doing that correction because it costs a few more instructions and adds complexity. I did have it working at some point though, and it's a potential way to improve quality and ensure symmetrical flow.
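For readers who have not seen the mapping written out, here is a minimal Python sketch of the commonly used full-octahedron encode and decode. It is the textbook formulation and is not taken from the Impostor Baker shader; the function names and the [0, 1] output range are assumptions.

```python
import math

def _sign(v):
    # Treat zero as positive so the fold is well defined on the axes.
    return 1.0 if v >= 0.0 else -1.0

def octa_encode(x, y, z):
    """Map a 3D direction to a 2D point in [0, 1]^2 (full octahedron)."""
    s = abs(x) + abs(y) + abs(z)
    x, y, z = x / s, y / s, z / s  # project onto the octahedron |x|+|y|+|z| = 1
    if z < 0.0:                    # fold the lower half outward into the corners
        x, y = (1.0 - abs(y)) * _sign(x), (1.0 - abs(x)) * _sign(y)
    return (x * 0.5 + 0.5, y * 0.5 + 0.5)

def octa_decode(u, v):
    """Map a 2D point in [0, 1]^2 back to a unit 3D direction."""
    x, y = u * 2.0 - 1.0, v * 2.0 - 1.0
    z = 1.0 - abs(x) - abs(y)
    if z < 0.0:                    # undo the fold for the lower half
        x, y = (1.0 - abs(y)) * _sign(x), (1.0 - abs(x)) * _sign(y)
    n = math.sqrt(x * x + y * y + z * z)
    return (x / n, y / n, z / n)
```

The whole thing is a handful of adds, abs and one normalize, with no trig, which is why it is so cheap to evaluate in a shader. The sign of Z survives the round trip because the lower half is stored in the four corner triangles of the square.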
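The hemi-octahedron variant can be sketched the same way. This is the commonly used formulation (project onto the upper half of the octahedron, then rotate the resulting diamond 45 degrees so it fills the square); the function names are assumptions and the shader in the tool may differ in details.

```python
import math

def hemi_octa_encode(x, y, z):
    """Map an upper-hemisphere direction (z >= 0) to a 2D point in [0, 1]^2."""
    s = abs(x) + abs(y) + abs(z)
    px, py = x / s, y / s
    # The projected points form a diamond; rotating by 45 degrees fills the square.
    return ((px + py) * 0.5 + 0.5, (px - py) * 0.5 + 0.5)

def hemi_octa_decode(u, v):
    """Map a 2D point in [0, 1]^2 to a unit direction with z >= 0."""
    ex, ey = u * 2.0 - 1.0, v * 2.0 - 1.0
    x, y = (ex + ey) * 0.5, (ex - ey) * 0.5
    z = 1.0 - abs(x) - abs(y)
    n = math.sqrt(x * x + y * y + z * z)
    return (x / n, y / n, z / n)
```

The center of the square decodes to straight up and the whole border of the square decodes to the horizon, so no texture space is spent on views from below.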
To capture impostors, you specify how many XY frames you want. That determines how dense the 'grid mesh' is. Each vertex on the grid will become a rendered frame. Note that the grid itself is entirely virtual within the vertex shader, it is not actually rendered as a mesh at any point.
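As an illustration of how a frame count turns into capture views, the sketch below builds one direction per grid vertex using a hemi-octahedron decode. It assumes the grid vertices span the full unit square, so that frame i sits at i / (frames - 1); that spacing and all names here are assumptions, not the blueprint's actual logic.

```python
import math

def hemi_octa_decode(u, v):
    """Commonly used hemi-octahedron decode: [0, 1]^2 to a unit direction with z >= 0."""
    ex, ey = u * 2.0 - 1.0, v * 2.0 - 1.0
    x, y = (ex + ey) * 0.5, (ex - ey) * 0.5
    z = 1.0 - abs(x) - abs(y)
    n = math.sqrt(x * x + y * y + z * z)
    return (x / n, y / n, z / n)

def capture_directions(frames_xy):
    """Return {(col, row): direction} for a frames_xy by frames_xy grid (frames_xy >= 2)."""
    last = frames_xy - 1
    directions = {}
    for row in range(frames_xy):
        for col in range(frames_xy):
            # Each grid vertex is one frame; its 2D position decodes to the
            # direction the object is captured from for that frame.
            directions[(col, row)] = hemi_octa_decode(col / last, row / last)
    return directions

views = capture_directions(8)  # an 8x8 layout gives 64 capture views
```

Each entry would drive one capture, with the result written into the atlas cell at the same (col, row).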
If we show the virtual grid over the captured Impostor texture atlas, we can see that the vertices of the grid align with frame centers in the atlas.
What happens if we simply find the nearest frame and draw it on a regular camera facing sprite? The result is not great:
The impostor rotates wildly from the top view. This is because the view transform is changing quite a bit from those top views. If we think about the original billboard, each 'view' was always locked to its original worldspace. If we want to render with a sprite, we also need to lock the sprite to the original view projection it was captured at. That means it needs to change every time the view changes frames. That gets us looking like:
Now the sprite geometry is locked to the original view vectors, by deriving the original transform from the quantized vector and cross products. Notice it fixes the wild spinning but there is still popping. About halfway through the above video, I turn on single Parallax which distorts the frames to look like each other. It helps the popping quite a bit (this is the version currently used for FNBR mobile). But we can still do better.
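Deriving a transform from a single direction with cross products can be sketched as below. This is a generic construction under assumed conventions (Z-up reference axis, a fallback axis for the straight-up frame, and this particular handedness); the real shader's axis choices are not given in this post.

```python
import math

def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def frame_basis(frame_dir, world_up=(0.0, 0.0, 1.0)):
    """Build (right, up, forward) axes for the frame captured along frame_dir."""
    forward = _normalize(frame_dir)
    # A straight-up or straight-down frame is parallel to world_up, so the
    # cross product would vanish; fall back to another reference axis there.
    if abs(sum(f * u for f, u in zip(forward, world_up))) > 0.999:
        world_up = (0.0, 1.0, 0.0)
    right = _normalize(_cross(world_up, forward))
    up = _cross(forward, right)
    return right, up, forward
```

Because frame_dir is the quantized (snapped) frame direction rather than the live camera vector, the sprite's axes only change when the view crosses into a different frame.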
Looking back to the 'virtual grid mesh' above, we can see that for any triangle on the grid, it has 3 vertices. So if we want to blend smoothly across this grid, we need to be able to identify the 3 nearest frames. And remember how using a sprite caused messed up projection of just one frame? Well it turns out the same thing happens when you try to reuse the projection from one frame for another! This is a pain. So you actually have to render a virtual frame projection for the other 2 frames to simulate their geometry. While using the mesh UVs for one projection and 'solving' the other two does work, it falls apart for lower (~8x8) frame counts because the angular difference can be so great between cards that you see the card start to clip at grazing angles (not shown in any videos yet). As a compromise, the shader does not use ANY UVs right now. It solves all 3 frames using virtual frame projection in the vertex shader and then uses a traditional sprite vertex shader. The only downside is at close distances you occasionally see some minor clipping on the edge but it is much more acceptable this way.
First, another quick video to show how this eliminates all popping. It does slightly blur the result but that is preferable to popping and is not noticeable at typical impostor distances (0.1 screen size or so):
Now we will look at how the 3 views are found and blended using a weight function.
Since this part is done in 2D space, it is relatively simple. First we find which QUAD (not tri) we are in by doing a standard floor(View2DVec * FramesXY)/FramesXY type operation. Then we say the 'current frame' has an offset of 0, the 'right frame' has an offset of (1,0), the 'lower frame' has an offset of (0,1), and the 'lower right' frame has an offset of (1,1).
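A rough Python version of that quad lookup is below. The parameter `cells` stands for the number of grid cells per axis and is an assumption: it would be the scale used in the floor operation, which may be the frame count or one less depending on how the frames are laid out. The clamping at the far edge is also an assumption.

```python
import math

# Frame offsets relative to the 'current' (upper left) frame of the quad.
OFFSETS = {
    "current": (0, 0),
    "right": (1, 0),
    "lower": (0, 1),
    "lower_right": (1, 1),
}

def find_quad(view_uv, cells):
    """Return the four frame indices around view_uv and the position inside the quad."""
    sx = min(max(view_uv[0], 0.0), 1.0) * cells
    sy = min(max(view_uv[1], 0.0), 1.0) * cells
    # Clamp so that a coordinate of exactly 1.0 lands in the last cell
    # instead of one cell past the edge of the grid.
    col = min(int(math.floor(sx)), cells - 1)
    row = min(int(math.floor(sy)), cells - 1)
    frac = (sx - col, sy - row)
    frames = {name: (col + dx, row + dy) for name, (dx, dy) in OFFSETS.items()}
    return frames, frac
```

The fractional position inside the quad is what the weight function then works on.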
For each quad, the weights for each frame and the 'triangle mask' are used to identify which part of the triangle the view intersects.
A (or red) represents the weight for the upper left or 'current frame'. B (green) represents the weights for both the right frame and the lower frame. Notice it is symmetrical across the diagonal. That is because to reduce from 4 to 3 neighbors, we 'flip' between either the right or lower neighbor. C (blue) shows the weight for the lower right frame. D (alpha) is the triangle flip mask, i.e. when the ray hits the white part, we use the right neighbor. When the ray hits the black part of D, we use the lower neighbor. So we end up flipping which neighbor is used when the view is on the diagonal. This works because B has a weight of 0 along the diagonal. When the camera is perfectly aligned with any vertex, that vertex will always have full weight of 1 and the others will be 0.
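One weight function that has all of the properties described above can be sketched in a few lines of Python. It is a reconstruction from the description (weights sum to 1, B is zero on the diagonal, each vertex gets full weight when the view sits on it), so treat the exact formulas and the function name as assumptions.

```python
def triangle_weights(fx, fy):
    """Blend weights for a point (fx, fy) in [0, 1]^2 inside one quad.

    Returns (a, b, c, use_right):
      a: weight of the current (upper left) frame
      b: weight of the right OR lower frame, chosen by use_right
      c: weight of the lower right frame
      use_right: True in the triangle that contains the right neighbor
    """
    a = min(1.0 - fx, 1.0 - fy)
    b = abs(fx - fy)   # zero along the diagonal, so the neighbor swap is invisible
    c = min(fx, fy)
    use_right = fx > fy
    return a, b, c, use_right
```

For example, a point at (0.75, 0.25) lies in the triangle toward the right neighbor and gets weights 0.25, 0.5 and 0.25 for the current, right and lower right frames.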
This gif shows the intersection being found on both grids, before and after unwrapping. This demonstrates how finding the triangle in 3D also finds it in 2D. Notice the small white dot which is the precise camera ray to the center of each sphere. The color of the triangle at that single point is where the weights are read from.
The below image shows what the single depth offset is like, very similar to bump offset. Note that this works really well for very dense trees where the depth mostly forms a cohesive shell. For very sparse and busy trees, this single offset will show lots of noisy artifacts.
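For reference, a bump-offset style shift looks roughly like the sketch below: the frame UV is pushed along the view direction by an amount proportional to the captured depth. The parameter names, the reference plane of 0.5 and the scale are all assumptions for illustration, and the sign convention depends on how depth is stored.

```python
def single_parallax_uv(frame_uv, view_xy, depth, reference_plane=0.5, depth_scale=0.1):
    """Shift a frame UV by one depth sample, bump-offset style.

    frame_uv: UV inside the frame before the shift
    view_xy: the view direction projected into the frame's plane
    depth: captured depth sample in [0, 1] at frame_uv
    """
    shift = (depth - reference_plane) * depth_scale
    return (frame_uv[0] + view_xy[0] * shift,
            frame_uv[1] + view_xy[1] * shift)
```

Only one depth sample is read, so the result is only as good as that one sample. When the depth forms a smooth shell the shift is coherent across neighboring pixels; when the tree is sparse, neighboring pixels read very different depths and shift by very different amounts.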
This comparison shows a 12x12 layout impostor for one of the main Fortnite pine trees. It holds up pretty well at a medium distance. The smallest frame count used in Fortnite is for the smaller farm trees which get by pretty well with only an 8x8 layout.
HLOD stands for Hierarchical LOD. We use HLOD in UE4 to represent distant levels that are not actually loaded. We tend to refer to those as "Proxy HLODs". Originally, FNBR just used standard simplification for all the trees. That was causing HLOD meshes to be a ton of tris/verts, and it also meant the island did not look great while skydiving and popping was quite noticeable.
With the help of engineer Jurre de Baare, we made Impostors a built in part of HLODs. We added an option called "Use MaxLOD as Impostor" that tells an object to skip simplification and simply composite its data directly into the HLOD. This does bloat up sections because each impostor type needs its own material section.
All told, switching to impostors saved ~270,000 verts from being stored in always loaded memory (proxies never unstream), but it cost around 600 draw calls when viewing the whole island. That draw call cost mostly goes away from the ground.
This shows the impact on an individual POI (point of interest), the Ranger Station aka Lonely Lodge. Standard HLOD was costing almost 13k tris. Impostor HLODs cut that in half and made it look much nicer.
The quality difference is fairly noticeable while skydiving in: