Mobile GPU Profiling




Rendering performance on mobile hardware is even more challenging than on other platforms. Battery power and low heat requirements result in stricter limits on memory bandwidth and on CPU and GPU math performance. Unfortunately, good profiling tools are sparse and hardware comes in many variants. Even the OS version can affect the availability of features. Here we try to help with that situation by providing a simple mechanism to profile performance and by giving general hints on how to resolve performance problems.

Measuring GPU time on mobile devices

Mobile devices do not support timestamp queries, which are used on other platforms to measure GPU times directly, such as the entire frame or the cost of one feature. For example, the 'profilegpu' command logs out GPU timings for different parts of the frame on D3D9 and D3D11, and 'stat unit' displays frame GPU time. The result is that the only measurement we can use is the total frame time. Now the CPU (game thread, rendering thread) is running in parallel with the GPU and the frame time will be the longer of the two. This means that in order to use the total frame time to measure the frame GPU time, the current frame must be GPU-bound. You can verify that this is the case by looking at 'stat unit'. If the frame time is larger than the 'Game' or 'Draw' times, then it is GPU bound. Be careful when disabling many rendering features that you do not become CPU-bound and produce an incorrect measurement! In the screenshot above, the frame time is only ~2ms larger than the Draw time, so it will quickly become CPU bound when turning off GPU features.
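The GPU-bound check described above can be written down as a small helper. This is only a sketch: `UnitTimes`, `GpuBoundHeadroomMs`, `IsLikelyGpuBound` and the 1 ms noise margin are hypothetical names and values chosen for illustration, not engine API. The numbers would be read by hand from 'stat unit'.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical container for the three timings shown by 'stat unit', in milliseconds.
struct UnitTimes
{
    float FrameMs; // total frame time
    float GameMs;  // game thread time
    float DrawMs;  // rendering thread time
};

// Headroom between the frame time and the slower of the two CPU threads.
// A small value means the frame turns CPU-bound quickly once GPU features are disabled.
inline float GpuBoundHeadroomMs(const UnitTimes& T)
{
    return T.FrameMs - std::max(T.GameMs, T.DrawMs);
}

// Treat the frame as GPU-bound only if the headroom exceeds a noise margin.
// The default margin is an assumption, not a documented threshold.
inline bool IsLikelyGpuBound(const UnitTimes& T, float MarginMs = 1.0f)
{
    assert(MarginMs >= 0.0f);
    return GpuBoundHeadroomMs(T) > MarginMs;
}
```

The headroom is also an upper bound on the GPU savings that the frame time can still reveal. Once a disabled feature saves more than that, the frame time stops at the CPU time and the measurement understates the feature's cost.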

ProfileEx to measure the performance impact of some rendering feature

This is a very simple method that allows profiling of rendering features in the absence of a better method ("Ex" as in making an experiment). You specify a mask of the features you want to profile and, once enabled, it toggles those features every 5 seconds. By watching the frame time you can see the effect the features have on it. If the frame rate varies a lot, this method is not that useful. To get a stable frame rate you can put the game into pause mode or set up a scene without any movement.

The frame time is automatically shown (not smoothed and only updated once a second).
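If you note down the displayed frame times during the on and off phases, the cost of the toggled features is roughly the difference of the two averages. The following is a minimal sketch of that arithmetic. `FrameSample` and `EstimateFeatureCostMs` are hypothetical names, and the readings are assumed to have been collected by hand or by your own logging.

```cpp
#include <cassert>
#include <vector>

// One frame-time reading, tagged with whether the profiled features were enabled.
struct FrameSample
{
    float FrameMs;
    bool  bFeatureOn;
};

// Average frame time with the features on minus the average with them off.
// Only meaningful while the frame stays GPU-bound in both phases.
inline float EstimateFeatureCostMs(const std::vector<FrameSample>& Samples)
{
    float SumOn = 0.0f, SumOff = 0.0f;
    int NumOn = 0, NumOff = 0;
    for (const FrameSample& S : Samples)
    {
        if (S.bFeatureOn) { SumOn += S.FrameMs; ++NumOn; }
        else              { SumOff += S.FrameMs; ++NumOff; }
    }
    assert(NumOn > 0 && NumOff > 0); // readings from both phases are required
    return SumOn / NumOn - SumOff / NumOff;
}
```

Averaging over several 5 second phases helps because the displayed value is not smoothed.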

To activate it, set the console variable "ProfileEx".

You can get the latest help text through the console:

ProfileEx ?
        Profile experiment: to profile features only by looking at stat unit (alternating it on/off)
        This is useful on platforms that don't support any other profiling functionality.
        Make sure VSYNC is off and ideally you are in "pause".
        "Stat unit" and "StableFrameTime" are activated with the feature.
        FeatureMask in hex (add to combine, 0 is off, e.g. 0x1a = 0x10 + 0x08 + 0x02):
         0x001: Deferred shadow projection
         0x002: Depth of Field
         0x004: Bloom
         0x008: Shadows (shadow map generation and projection)
         0x010: Depth resolve (mobile only)
         0x020: Bloom/DOF blur (mobile only)
         0x040: Force Gaussian to blur with 4 samples at max
         0x080: Fog (mobile only)
         0x100: Translucency
         0x200: Skeletal meshes
         0x400: StaticMeshes
         0x800: Particles (only toggles rendering)

Example usage:

ProfileEx 0x100
        to profile translucency

ProfileEx 0x500
        to profile translucency and static meshes
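To double-check a mask before typing it into the console, the bits can be combined and decoded in a few lines. In this sketch the bit values and descriptions are copied from the help text above. The enumerator names and `DescribeMask` are invented for illustration and are not engine identifiers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Bit values from the ProfileEx help text; the enumerator names are hypothetical.
enum ProfileExFeature : std::uint32_t
{
    PEX_DeferredShadowProjection = 0x001,
    PEX_DepthOfField             = 0x002,
    PEX_Bloom                    = 0x004,
    PEX_Shadows                  = 0x008,
    PEX_DepthResolve             = 0x010,
    PEX_BloomDofBlur             = 0x020,
    PEX_GaussianMax4Samples      = 0x040,
    PEX_Fog                      = 0x080,
    PEX_Translucency             = 0x100,
    PEX_SkeletalMeshes           = 0x200,
    PEX_StaticMeshes             = 0x400,
    PEX_Particles                = 0x800
};

// Returns the help-text description of every bit set in Mask, lowest bit first.
inline std::vector<std::string> DescribeMask(std::uint32_t Mask)
{
    static const char* const Names[12] = {
        "Deferred shadow projection", "Depth of Field", "Bloom",
        "Shadows (shadow map generation and projection)",
        "Depth resolve (mobile only)", "Bloom/DOF blur (mobile only)",
        "Force Gaussian to blur with 4 samples at max", "Fog (mobile only)",
        "Translucency", "Skeletal meshes", "StaticMeshes",
        "Particles (only toggles rendering)"
    };
    assert((Mask & ~0xFFFu) == 0); // only 12 bits are documented
    std::vector<std::string> Result;
    for (int Bit = 0; Bit < 12; ++Bit)
    {
        if (Mask & (1u << Bit))
        {
            Result.push_back(Names[Bit]);
        }
    }
    return Result;
}
```

For example, `PEX_Translucency | PEX_StaticMeshes` evaluates to 0x500, the value used in the second example above.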

In code you can test for the bits with as little C++ code as this:

if(!GetProfileExState(0x2, View.Family->CurrentRealTime))
{
    // we want to profile with the feature off
    OutSettings.bEnableDOF = FALSE;
}

This feature is not available in FINAL_RELEASE.
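The implementation of GetProfileExState is not shown here. As a rough mental model only, a time-based toggle with a 5 second period could look like the sketch below. `SketchProfileExState` is a hypothetical stand-in, not the engine function. Unlike the real call, it takes the active mask as an explicit parameter, and the choice of which phase comes first is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in, NOT the engine's GetProfileExState.
// Returns true when the feature should render normally and false when it should be skipped.
inline bool SketchProfileExState(std::uint32_t ActiveMask,
                                 std::uint32_t FeatureBit,
                                 double RealTimeSeconds,
                                 double PeriodSeconds = 5.0)
{
    assert(PeriodSeconds > 0.0);
    if ((ActiveMask & FeatureBit) == 0)
    {
        return true; // this feature is not being profiled, leave it on
    }
    // Alternate between on and off every PeriodSeconds (assumed: on during even phases).
    const long long Phase = static_cast<long long>(std::floor(RealTimeSeconds / PeriodSeconds));
    return (Phase % 2) == 0;
}
```

Because the state depends only on the time value passed in, every call site that tests the same bit within a frame sees the same answer.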

How to adjust global setting features for specific platforms

Many settings exist in the ini files (e.g. BaseEngine.ini in the section [SystemSettings]). They can be overridden for a specific platform (see [SystemSettingsMobile] or [SystemSettingsIPad]). A few examples:

; 0 means no sharpening, -0.5 results in sharper textures but slightly slower rendering and more shimmering

This one was introduced recently and is more experimental:

MobileSceneDepthResolveForShadows=TRUE

It can save quite a bit of performance by avoiding a depth buffer resolve. When enabled, you might see artifacts because it might then pick up the depth buffer from the previous frame. In certain kinds of games this might be quite acceptable (e.g. a 2D-like 3D game: few camera movements, mostly flat ground as shadow receiver).