Synchronization in nDisplay

An overview of how synchronization works across multiple displays

Windows
MacOS
Linux

Synchronization is a vital part of displaying multiple sections of real-time content over a large display. All systems within the render/display ecosystem must adhere to strict timing, measured in milliseconds, to produce the illusion of a seamless display.

Multi-display setups often require syncing capabilities at both software and hardware levels. Not only should the generated content be ready at the same time on all PCs, using the same timing information for simulation, but the display swap (the changing out of the current image for the next image in the video card buffer) needs to also happen at the correct time to prevent tearing artifacts in the display.

For VR and other types of stereoscopic displays, the issue of synchronization applies doubly since the two different frames—one for each eye—must coordinate perfectly.

Determinism

There are two types of approach for managing synchronization:

  • Deterministic: Each server (PC, rendering node) is set up in such a way that the output is always predictable given a particular set of inputs, which means the only information the server needs to synchronize with other machines in the system is an accurate time, and input/output information for each individual machine.

  • Non-deterministic: To ensure synchronization, the system forces replication of the transform matrices and other relevant characteristics of all actors or objects in a scene, and reproduces them throughout the system.

Each approach has pros and cons. The main advantage of a deterministic system is project simplicity, and the data bandwidth saved by not sharing transform data for each object at every frame. The downside is that if one system diverges, the divergence will have unknown issues over time. Rendering uniformity could be severely compromised, leading to visual discontinuity and artifacts.

Hardware Sync and Genlock

While the nDisplay master PC ensures that all slave (node) PCs in the cluster are informed of timing information from a gameplay perspective (for example, which frame to render), specialized hardware sync cards and compatible professional graphics cards are necessary to synchronize the display of those rendered frames at exactly the same time on the physical display devices.

For example, in broadcast applications, it is common to synchronize many devices such as cameras, monitors, and other displays so they all switch and capture the next frame at precisely the same time. In this industry, the use of generator-locked signals (genlock) is widely adopted and used.

Typically, the setup is composed of a hardware generator that sends the clock to the hardware that requires synchronization. In the case of PCs used for real-time rendering, professional graphics cards such as those in the NVIDIA Quadro line support this technology alongside the NVIDIA Quadro Sync II card, which will lock to the received timing signal or pulse.

System Topology

Master Node (house clock generator)

Slave Nodes (receiving clock from master node GPU)

Daisy Chain vs. Direct Genlock

Daisy-chaining is another signal-locking technique that can be used instead of direct genlock, where the master clock is sent to a single PC or device, then separate cables propagate the signal to all other hardware. Although previous experience with nDisplay suggests that direct genlock (where each device receives the clock directly from the master source) is simpler and more effective than daisy-chaining, a new hardware approach based on daisy-chaining gives hope to a more reliable and cost-effective solution to signal locking. This solution is called NVIDIA Swap Sync/Lock.

NVIDIA Mosaic & AMD EyeFinity

Mosaic mode from NVIDIA and EyeFinity from AMD are similar GPU technologies that combine and synchronize multiple outputs within one graphics card so that they appear as a unique display from the OS or software perspective. This virtual display is essentially an aggregation of multiple physical displays acting as one synchronous display that can also be synced to other displays from different systems using underlying framelock or genlock techniques.

The process of locking multiple displays using a house clock is called framelocking. Genlocking is when an external clock is used instead. In the context of genlock, the external clock is used throughout all devices, including display devices, cameras and tracking sensors.

Both Mosaic and EyeFinity technologies allow multiple outputs from a single graphics card to have the exact same clock, and to be treated as one big display canvas from an operating system perspective.

To synchronize displays that are driven by multiple PCs or GPUs using daisy-chaining techniques, one GPU must act as the master clock generator and share its clock to the other GPUs—either on the same PC or different PC—using sync cards like the Nvidia Quadro Sync II or AMD Firepro S400.

For more information, see Mosaic Technology for Multiple Displays | NVIDIA and Multi-Display Eyefinity Technology .

Aspects of Display Sync

The game and render threads are always synchronized by what are called thread barriers. This is the core feature for both of the two items below. It is not possible to disable barriers. Consider them as points on a timescale that help the master node deal with slaves synchronously to provide a uniform content experience across multiple UE PCs.

Feature

Description

Software/Simulation (what you should sSee)

The process of synchronizing all software-related matters. It is about making Unreal Engine and its content properly sync: feature determinism, delta-time synchronization, input replication, custom transforms, thread barriers, cluster events, game logic and so on.

Hardware/OS (how you should see content)

The process of synchronizing hardware devices such as GPUs, display devices, and the DWM (Desktop Window Manager) using one shared clock–a house sync or genlock signal. It is how to make an operating system and hardware show what you want synchronously, without tearing. This requires NVIDIA Quadro graphics cards, plus Sync II cards, plus a genlock generator. Expect at least a one-frame hardware delay if this is not enabled.

Sync Policies

Policy

Description

swap_sync_policy=0:

This means VSync=0. All nodes show their frames after RenderThreadBarrier WITHOUT synchronization to VBlank. This gives maximum FPS at the cost of potential tearing artifacts.

swap_sync_policy=1:

This means VSync=1. All nodes show their frames after RenderThreadBarrier WITH synchronization to VBlank. Output displays won't have tearing. Note that tearing can still be possible among the displays because of DWM and runtime or driver settings.

swap_sync_policy=1 + advanced sync enabled:

The same as the previous policy, except that we try to do it on a software level.

swap_sync_policy=2:

Added in 4.25, NVIDIA hardware framelock (Nvidia SwapLock API) based on daisy-chain connection.

New sync policy in the config and [swapgroup] parameter for [cluster_node]:

In previous releases, we developed a sync mechanism called Advanced Sync for specific circumstances where the rendering of a frame on a given nDisplay PC would happen after VBlank, creating tearing on screen. This software mechanism tries to predict those events, and forces all PCs to skip and render on the next VBlank. This mechanism is not 100% perfect and can have a non-desirable performance impact when enaSbled.

In this release, we decided to implement the newly available Nvidia API for synchronization, which is a hardware-supported approach to the problem.

FrameLockConnectionsTimingServer.png

Framelock connections on a timing server: 1) Timing server 2) Clients.

SyncSource.png

1) Sync source 2) Server 3) Clients

Workflow

  1. Set up daisy-chain connection among cluster GPU sync boards using regular ethernet cable with regular RJ45 connectors.

  2. Enable MOSAIC and configure synchronization via NVIDIA control panel (set a master, then set sync slaves).

  3. In the nDisplay config file, set sync policy to 2: [general] swap_sync_policy=2;

  4. Optional: For complex systems, it's possible to customize the NVIDIA swap group and sync barriers in the nDisplay config file using a new [nvidia] config entity. For example: [nvidia] sync_group=2 sync_barrier=2

By default, sync group 1 and sync barrier 1 are used.

The swap_sync_policy=2 config setting enables an NVIDIA swap group over cluster nodes.

For complex setups, you might want a few applications to be in sync over a number of computers. For these purposes, sync groups and sync barriers are used to split sync signals among different applications within a cluster environment. By default, NVIDIA sections are not needed and can be omitted.

Use Case

Different applications utilizing NVIDIA swap-sync functionality need to be split in terms of sync groups, or for two separate clusters running on the same hardware.

Additional Comments

A 0 value is not allowed. A 0 value is used internally to detach/leave a group/barrier. Only natural numbers are allowed. On a simple system with one GPU and one sync board, you have single sync group #1 and single sync barrier #1. Those are used by default.

In complex systems with multiple GPUs and sync boards (and maybe some NVIDIA Pro settings), the amount of groups/barriers might change.

Synchronization Testing

Testing synchronization for a scaled display can be tricky because desynchronization can result from any number of issues, including:

  • The wrong frame is simulated due to incorrect timestamp (software issue).

  • The display device timing is off (display or hardware issue).

To test sync, we use a simple test project displaying a single object that moves quickly across the entire display surface. When systems are properly in sync, the object retains its form as it passes across boundaries. Otherwise, the display will show artifacts at shared edges.

ScreenOneScreenTwo.png

When systems are properly in sync, the object retains its form as it passes from Screen 1 to Screen 2.

Links to Additional Information

Select Skin
Light
Dark

새로운 언리얼 엔진 4 문서 사이트에 오신 것을 환영합니다!

문서 사이트에 대한 의견을 모을 수 있는 피드백 시스템을 포함해서 여러가지 새로운 기능을 준비하고 있습니다. 아래 Documentation Feedback 포럼(영문) 또는 언리얼 엔진 네이버 공식 카페(한글) 중 편하신 곳에 의견이나 문제점을 알려 주세요.

새 시스템이 준비되면 알려 드리겠습니다.

네이버 카페
공식 포럼