Frequently Asked Questions on High Dynamic Range and Hybrid Log-Gamma
This Q&A is from a December 2016 online BBC Project Series.
1. What is HDR? HDR stands for high dynamic range. It uses the capability of modern displays to show the picture in high contrast. High contrast also gives pictures the impression of sharpness. HDR pictures support highlights and specular reflections, plus extended details in shadows and dark parts of the picture.
2. What is video dynamic range? Some people say that it is the ratio between the blackest black and the brightest white on a display. This is wrong because some OLED displays can have zero light output, which would give infinite dynamic range. This makes no sense.
Alternatively, people say that dynamic range is the ratio between the blackest black and the brightest white that can be seen on the display. The blackest black that can be seen also depends on light reflected from the screen, not just emitted from the display. In television we set the black level, using a PLUGE test pattern, to take account of this. But this definition is still not quite right, as it takes no account of the quality of the image.
A better definition of dynamic range is the ratio between the blackest black and the brightest white that can be seen on the display without significant artefacts. The artefact most likely to be apparent is “banding” or “posterisation”, which is particularly evident “in the blacks” for conventional SDR TV and is caused by insufficient bits in the signal.
To be precise, that is actually the definition of the displayed dynamic range. The signal dynamic range is, however, slightly different, as it is independent of the black level set using PLUGE. The signal dynamic range is defined by the ITU-R as “the inverse of the quantization step between the digital code for the nominal black and the next code when the full range (nominal black to peak white) is normalized as unity”. This definition, though widely used in comparative figures, can be misleading because once again it takes no account of banding artefacts. The displayed dynamic range is therefore the more useful figure.
3. What is the dynamic range of standard dynamic range (SDR) TV? Conventional 8-bit SDR TV has a displayed dynamic range of about 32:1, or 5 stops (factors of 2) [1]. This seems low. One might expect an 8-bit system to have at least 8 stops of dynamic range, but that takes no account of banding artefacts on the display.
By comparison, printed pictures are also limited to an only slightly larger dynamic range of about 6 stops, due to the reflective properties of black ink. Printed pictures can look very good, so it is not surprising that SDR TV can also reproduce good pictures.
4. What is the dynamic range of the human visual system? The range of sensitivity of the eye is enormous, from starlight to sunlight (~10^-4 to 10^5 cd/m2). This sensitivity range is at least one billion to one. But the eye adapts to lighting conditions and, at any one time, it can only see a much narrower range of brightness. Although the eye can see both starlight and sunlight, it can’t see them both at the same time. Remember the stars are still there in the daytime sky, but we can’t see them because they are swamped by the light from the sun.
In any one scene, the human eye can only see a dynamic range of about 10 000:1 (less than 14 stops). Parts of a scene that are less than 1/10 000 of the brightness of the brightest part are lost in the shadows and can’t be seen.
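The stop counts quoted in answers 3 and 4 follow directly from the contrast ratios, since one stop is a factor of 2. A minimal sketch of that arithmetic (the function name `stops` is our own, chosen for illustration):

```python
import math

def stops(contrast_ratio: float) -> float:
    """Number of stops (factors of 2) spanned by a contrast ratio."""
    return math.log2(contrast_ratio)

print(stops(32))                  # SDR TV, 32:1 -> 5.0
print(round(stops(10_000), 1))    # single scene, 10 000:1 -> 13.3
print(round(stops(1e9), 1))       # starlight to sunlight, 10^9:1 -> 29.9
```

The last figure is why a range of one billion to one is usually rounded to "around 30 stops".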
5. What dynamic range do we need for HDR? Ideally HDR video should exceed the static dynamic range (i.e. the dynamic range that can be appreciated in a single scene) of the human visual system. So, for video, the dynamic range should be at least 10 000:1 on the final display. A dynamic range that is much greater than 10 000:1 cannot be seen by the human visual system and so is not useful.
According to the Barten model [2, p. 39] a dynamic range of 10 000:1 includes about 700 just noticeably different grey levels. Although the model applies to grey levels, in television systems the individual red, green and blue (R, G and B) components are usually specified to the same precision as the luminance component. Thus a minimum of 700 red, 700 green and 700 blue levels would be required to deliver a dynamic range of 10 000:1, whilst avoiding banding artefacts. Consequently a 10-bit signal, which provides 1024 code values, is necessary to support HDR; 8 bits are inadequate.
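The step from "about 700 levels" to "10 bits" can be checked with a few lines of arithmetic. This is only a sketch of the counting argument; the helper name `bits_needed` is hypothetical, and it ignores any code values reserved by a particular signal format.

```python
def bits_needed(levels: int) -> int:
    """Smallest bit depth whose number of code values is at least `levels`."""
    # (levels - 1).bit_length() is an exact integer form of ceil(log2(levels)).
    return (levels - 1).bit_length()

required_levels = 700  # figure quoted in the text from the Barten model

print(bits_needed(required_levels))   # 10
print(2 ** 8 >= required_levels)      # False: 256 codes are too few
print(2 ** 10 >= required_levels)     # True: 1024 codes are enough
```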
6. What is colour volume? Video signals comprise three colour components, red, green and blue (R, G and B). The three components may be thought of as a three dimensional space. The maximum (“brightest”) and minimum (“darkest”) values of the three components define a volume in that space known as the “colour volume”.
One way to describe the colour volume is the number of distinct colours and brightnesses that can be represented by the signal. Note that, due to the characteristics of the human visual system, some of those colours may be difficult to distinguish. Nonetheless, this is a useful practical definition.
For an 8-bit SDR TV signal, with about 250 distinct levels for R, G and B, there are about 16 million distinct colours and brightnesses. For HDR TV, which needs a minimum of 700 distinct levels for R, G and B (see above), there would be a minimum of around 340 million distinct colours and brightnesses. Clearly an HDR system has many times the colour volume of an SDR system. The precise ratio of volumes actually perceived is difficult to assess due to the complexities of human visual perception.
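The colour counts above are simply the number of levels per component cubed. A short sketch, using the approximate level counts given in the text (the function name `colour_count` is our own):

```python
def colour_count(levels_per_component: int) -> int:
    """Distinct R, G, B combinations for a given number of levels per component."""
    return levels_per_component ** 3

sdr = colour_count(250)   # about 250 usable levels in 8-bit SDR
hdr = colour_count(700)   # minimum of about 700 levels for HDR

print(sdr)                  # 15625000
print(hdr)                  # 343000000
print(round(hdr / sdr, 1))  # 22.0
```

So, on this simple counting definition, the HDR signal has roughly 22 times as many distinct colours and brightnesses. As the text notes, the ratio actually perceived is harder to assess.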
7. What are the HDR video formats, and what is “BT.2100”? Recommendation ITU-R BT.2100 [3] is the international standard for high dynamic range programme production and exchange. It defines two formats for HDR video, PQ and HLG.
The ITU (International Telecommunication Union) is the top level international standards body, which is an agency of the United Nations.
8. What are the principal differences between PQ and HLG? HLG is a relative, scene-referred, signal. It is an evolutionary approach to HDR. Relative scene-referred signals are also used, for example, in Recommendations ITU-R BT.601 [4], ITU-R BT.709 [5] and ITU-R BT.2020 [6], and by Sony in S-Log, ARRI in Log C, and Panavision in Panalog.
PQ is an absolute, display-referred, signal. It is a new approach to video. PQ and HLG are fundamentally different systems.
9. What are scene-referred and display-referred signals? Scene-referred signals are the conventional approach to video. The signal represents the light detected by the camera.
Display-referred signals represent the light displayed on the production or “grading” monitor.
Scene- and display-referred signals are different because the overall television system, from camera to display, is non-linear. From a movie perspective the Visual Effects Society says that “broadly speaking, film negatives encode an HDR scene-referred image, and the print embodies a display-referred tone mapping” [7]. For CGI (computer-generated imagery), images are created in scene light to allow ray-tracing.
10. What are “relative” and “absolute” signals? Relative video signals are the conventional type of video signal that have been captured by still, movie, and video cameras for at least the past century. They represent the intensity of the light relative to the peak output of the camera sensor. Of course the camera aperture and/or the shutter time is varied over a wide range to get the best-looking picture. Consequently you cannot tell from the camera signal alone what the absolute brightness of the scene is. Relative signals are a ratio (pixel light intensity to peak intensity) and therefore do not have dimensions.
An absolute video signal represents the absolute brightness of a pixel. Absolute brightnesses are usually expressed in candelas per square metre (cd/m2, also known as “nits”), which is therefore the unit of an absolute video signal.
Currently PQ is the only absolute video signal in widespread use. All other video signals are relative signals.
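To illustrate what "relative" means in practice, the sketch below normalises pixel light to the sensor peak and then applies the HLG OETF. The constants are those we understand to be specified in ITU-R BT.2100 [3] and should be checked against the Recommendation; the light levels in the example are hypothetical.

```python
import math

# HLG OETF constants, assumed here to match ITU-R BT.2100.
A = 0.17883277
B = 1 - 4 * A                     # approximately 0.28466892
C = 0.5 - A * math.log(4 * A)     # approximately 0.55991073

def hlg_oetf(e: float) -> float:
    """Map relative scene light e (0..1, dimensionless) to an HLG signal (0..1)."""
    if e <= 1 / 12:
        return math.sqrt(3 * e)              # square-root (gamma-like) segment
    return A * math.log(12 * e - B) + C      # logarithmic segment

# Two hypothetical exposures with the same pixel-to-peak ratio.
# The signal is identical, so the absolute scene brightness cannot be recovered.
bright_scene = hlg_oetf(500.0 / 2000.0)
dim_scene = hlg_oetf(5.0 / 20.0)
print(bright_scene == dim_scene)  # True
```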
11. What is “rendering intent” or the “OOTF”? In real life we view the world complete in its surroundings. But we view video signals on a display in a dimmer environment. The screen brightness and viewing environment are, in general, very different from those in the real world where the video was made. The eye adapts differently for real-world scenes compared to emissive displays in dim surroundings. Consequently we need to adjust the displayed image to allow for the difference in the adaptation of the eye, so that it looks correct.
The rendering intent is an end-to-end, camera-to-display non-linearity, intentionally introduced to the signal. Its purpose is to make the image perceptually as close as possible to the real world. A power law, or “gamma” non-linearity has been used for this purpose for many decades, both in movies and video. HLG continues to use a gamma curve for rendering intent. PQ uses a different non-linearity, defined in ITU-R BT.2100 [3].
“Rendering intent” is also known as the “OOTF”, or “opto-optical transfer function”.
12. What is “creative intent”? Creative intent is the “look” of the video that the producer, director (or sometimes the camera shader or colourist) wishes to convey to the end viewer. Ideally the end viewer would see precisely the “look”, or creative intent, intended by the producer. Creative intent is also known as “artistic intent”. In general, maintaining precise creative intent is not always possible due to the limitations of the display or the viewing environment. Simply ensuring that the brightness of each red, green and blue pixel in the image is identical to that on the production display does not ensure that the artistic intent is preserved. This is because the eye perceives displayed images differently in different environments. Hence the displayed image must be adjusted to match the creative intent as closely as possible.
13. Does PQ or HLG have the larger dynamic range? Both 10-bit HLG and PQ comfortably exceed the dynamic range of the human visual system. HLG provides about 16 stops of dynamic range on the final display (depending on black level), and considerably more during production if 12-bit signals are used. PQ provides about 28 stops of dynamic range, far more than the capabilities of cameras, displays or the human visual system. This large range is necessary in display-referred systems to accommodate different applications, ranging from dim digital cinema projectors in dark cinema environments to bright outdoor displays. In the limit, as the eye has a sensitivity range of at least one U.S. billion to one (1 000 000 000:1), a system based on absolute brightness ideally needs a dynamic range of around 30 stops.
14. Does PQ or HLG have a larger colour volume? The HLG signal represents the whole wide colour gamut specified in Recommendation ITU-R BT.2100 [3], and can reproduce it in a consistent way for all practical display brightnesses (up to and beyond 4000 cd/m2). With a peak display brightness of 1000 cd/m2, and brighter, the HLG signal chain supports a colour volume substantially greater than that supported by the end-to-end PQ signal chain.
The PQ signal can represent highly saturated, bright highlights that would not be reproduced by an HLG system. The HLG system does not present such colours because they cannot be reproduced in a perceptually consistent manner on all displays. If such colours are introduced, perhaps during the grading of a PQ signal on a bright display, then they cannot be reproduced on dimmer displays whilst still preserving creative intent. The use of such colours therefore leads to inconsistent reproduction of the picture on displays with varying brightness.
Further discussion is presented in a separate document, “Colour Volume Comparison of PQ and HLG,” available soon from the BBC R&D HDR web page.
15. What is display mapping? For HDR TV in particular we will have a wide range of displays and viewing environments; from home cinemas to TVs in living rooms, and desktops, laptops, tablets and mobiles in all sorts of environments. We have shown in industry demonstrations that HLG can be displayed at several brightness levels with negligible mid-tone mapping errors.
HLG has natural correction for different brightness displays; the formula is part of the ITU-R BT.2100 standard [3] and is applied by the display manufacturer as appropriate. So HLG can be graded on a monitor of one brightness (say 600 cd/m2), and shown on a brighter monitor with essentially the same perceptual look. Alternatively HLG may be graded on a bright monitor (say 4000 cd/m2), and shown on a dimmer monitor with essentially the same perceptual look. In this way the signal is independent of the display.
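The brightness correction referred to above is, as we understand BT.2100 [3], an adjustment of the system gamma according to the nominal peak luminance of the display. The sketch below assumes the formula gamma = 1.2 + 0.42 log10(Lw / 1000) and, for simplicity, an achromatic pixel and a display black level of zero. The range of peak luminances over which the formula applies should be checked against the Recommendation, and the function names are our own.

```python
import math

def hlg_system_gamma(peak_luminance: float) -> float:
    """System gamma for a display of nominal peak luminance Lw in cd/m2 (assumed formula)."""
    return 1.2 + 0.42 * math.log10(peak_luminance / 1000.0)

def hlg_display_luminance(scene_luminance: float, peak_luminance: float) -> float:
    """Displayed luminance in cd/m2 for normalised scene luminance (0..1).

    Simplification: achromatic pixel, black level of zero.
    """
    gamma = hlg_system_gamma(peak_luminance)
    return peak_luminance * scene_luminance ** gamma

for peak in (600, 1000, 4000):
    print(peak, f"{hlg_system_gamma(peak):.2f}")
# 600 1.11
# 1000 1.20
# 4000 1.45
```

Because the gamma is derived from the display's own peak luminance, the same HLG signal can be sent to all three displays without any accompanying metadata.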
PQ requires display mapping, which may change the creative intent of highlights if the picture is shown on a display that is dimmer than the grading monitor. Static display mapping for this case is defined in Report ITU-R BT.2390 [8], but this can change the creative intent of the image. The display mapping for showing PQ pictures on brighter displays or in brighter environments than the grading environment (which should change lowlights, mid-tones and highlights) is not defined. By using metadata the creative intent of the PQ image can be better retained after display mapping, but it is unclear how this could be implemented in practical television production.
16. What is the purpose of HLG’s natural compatibility with standard dynamic range? Was this developed to prevent orphaning the many millions of UHDTV Receivers already shipped? This is partly true but, perhaps more importantly, it was designed to facilitate the easy migration to HDR television production. The compatibility with SDR displays allows broadcasters to continue to use low cost conventional monitoring equipment. Furthermore, the scene-referred approach, with no requirements for metadata, allows the use of conventional production tools, codecs and playout systems.
Note that HLG does not provide backwards compatibility with BT.709 [5] HD displays; a colour space conversion (implemented by the broadcaster for simulcast HDTV services) is also necessary in this case. Displaying an HLG signal on a BT.709 display results in desaturated colours, but this is still sufficient for many non-critical monitoring applications.
17. We heard that PQ signals are absolute values and relate to absolute brightnesses. Yes, that’s true; PQ only uses part of the signal range when the target brightness is lower than 10 000 cd/m2.
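How much of the signal range is used can be estimated from the PQ inverse EOTF, which maps an absolute luminance to a signal value between 0 and 1. The sketch below uses the constants we understand to be specified in ITU-R BT.2100 [3]; they should be verified against the Recommendation, and the function name `pq_signal` is our own.

```python
# PQ constants, assumed here to match ITU-R BT.2100.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_signal(luminance_cd_m2: float) -> float:
    """Non-linear PQ signal (0..1) for an absolute luminance in cd/m2."""
    y = luminance_cd_m2 / 10000.0        # normalise to the 10 000 cd/m2 PQ peak
    y_m1 = y ** M1
    return ((C1 + C2 * y_m1) / (1 + C3 * y_m1)) ** M2

for peak in (100, 1000, 4000, 10000):
    print(peak, f"{pq_signal(peak):.3f}")
```

On these assumptions, content graded for a 1000 cd/m2 display reaches only about three quarters of the PQ signal range, and the full range is used only at 10 000 cd/m2.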
18. We heard that HLG is limited to 1000 cd/m2. Is this true? No, HLG is not aimed at any particular screen brightness, and is not limited in this way - it is a relative brightness system. 1000 cd/m2 is the typical brightness capability of current HDR professional monitors used in television productions, not an HLG limit. Some consumer screens are already considerably brighter than 1000 cd/m2, and work extremely well with HLG signals graded on dimmer professional displays.
19. Since the PQ standard can cope with brightnesses up to 10 000 cd/m2, does this not make it better as an archive format? Quite the opposite - if material is mastered in PQ for a 1000 cd/m2 display, then when brighter screens are developed this PQ signal will be lacklustre compared to PQ signals graded on the brighter display. In this way PQ risks obsoleting content unless it is re-graded.
HLG does not have any such constraint burnt into the signal at the time of grading, and so is a better archive format. As HLG is based on relative brightness, the capabilities of brighter displays are naturally exploited, allowing HDR signals to be viewed in brighter and brighter environments with consistent and predictable results.
Tone-mapping within a PQ display could of course increase the brightness of a PQ graded signal in a similar way, to match the capabilities of the display. But the tone-mapping is not defined, so the results would be unpredictable and vary between displays.
20. We heard that only PQ can preserve artistic intent. Is this true? PQ can only preserve artistic intent in exactly the same conditions as the colourist used (monitor type, settings, room reflectance and background illumination).
HLG will also preserve artistic intent in exactly the same conditions that the colourist used. But, in addition, it will better preserve the artistic intent across a wide range of display types (mobiles, tablets, PCs and TVs) and environmental conditions as it has inherent corrections written into the open standard, BT.2100 [3].
21. We heard that PQ needs metadata but HLG doesn’t. Tone mapping control is an inherent need of PQ when the signal is displayed in an environment or on a display that is different from the mastering set-up. The PQ signal has an end-to-end transfer function (OOTF) for the mastering set-up embedded within it. That OOTF is only appropriate for the display and environment in which the programme was produced. Metadata, essentially describing the OOTF embedded within the PQ signal, is necessary to adjust the OOTF for the new viewing set-up. This is done through tone-mapping in the display. Where the PQ signal has been mastered on a very bright display, dynamic metadata is beneficial, to identify scene-by-scene which part of the signal range has been used.
HLG is a scene-referred system based on relative brightness, and doesn’t need any content-dependent metadata, but will have static signalling to identify itself as HLG. As the HLG signal describes the scene, it has no embedded OOTF, and therefore has no requirement for metadata. The appropriate OOTF is applied entirely at the display.
The complete Q&A article is available online in PDF format, including a list of references for the article.