Transfer functions such as gamma play an important role in displaying video images. This article gives an overview of various transfer functions, their history and their purpose.
Understanding Gamma + other Transfer Functions
Getting a grasp on gamma encoding is an essential piece of knowledge for anyone doing serious work in color or mastering in film and TV, but it can also be one of the most confusing topics since our human eyesight works in a drastically different way than most electronics. The whole effort of gamma encoding and transfer functions is based around delivering an image to our human eyes that is optimized for how we see the world rather than how computers see the world.
The first step in understanding gamma is understanding the term "luma" and how it relates to digital imagery. Luminance is a scene's brightness information stripped of its color. When that scene is captured on film or digitally, the recorded brightness signal is called "luma".
Color image with both the luma and color (chromaticity) values
Same image with just the luma data displayed
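As a rough illustration of how a luma value can be derived from a color pixel, here is a minimal sketch. It assumes the Rec. 709 weighting of red, green and blue, which is one common convention and is not specified by this article; the function name is hypothetical.

```python
# Assumed Rec. 709 luma weights; other standards use different weights.
KR, KG, KB = 0.2126, 0.7152, 0.0722

def luma(r, g, b):
    """Weighted sum of normalized (0.0 to 1.0) red, green and blue values."""
    return KR * r + KG * g + KB * b

# Green contributes the most to perceived brightness, blue the least.
for name, pixel in (("red", (1, 0, 0)), ("green", (0, 1, 0)), ("blue", (0, 0, 1))):
    print(f"pure {name}: luma = {luma(*pixel):.4f}")
```

Stripping the color from an image amounts to replacing every pixel with a single value of this kind.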
When we talk about gamma, we are talking about how we map the luma values that sit between black and white. Gamma defines what we do with those mid tone values. That will end up having an effect on the colors too, but the colors are not the primary target here. In the world of TV and film, the gamma profile used changes based on where the content is being delivered. For example, computer screens have a different gamma profile than TVs, and digital cinema projectors use a different gamma profile again. Thus, it becomes critically important to know where your content will be seen, as that determines which gamma profile you should be mastering to.
To understand the effect of gamma better, here’s how the same image looks when interpreted at various common gamma profiles:
Highlighting these differences on the internet, on a small screen surrounded by other bright elements, is inherently tricky, but if you look carefully you will notice that as the gamma value gets higher our midtones get darker. Pay particular attention to sparse smoke, such as that to the left of the dancer’s lower leg, which darkens in appearance. It’s also important to note that changes in gamma do not change the black point or white point of the image; they only shift the luma values that sit in between black and white.
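The same behavior can be checked numerically. The sketch below treats gamma as a simple power function on a normalized signal, which is a simplification of real display standards; the exponents listed are chosen for illustration and are not necessarily the ones used in the comparison image.

```python
def apply_gamma(value, gamma):
    """Raise a normalized signal value (0.0 to 1.0) to the given gamma exponent."""
    return value ** gamma

# Illustrative exponents only.
for gamma in (1.8, 2.2, 2.4, 2.6):
    black = apply_gamma(0.0, gamma)
    mid = apply_gamma(0.5, gamma)
    white = apply_gamma(1.0, gamma)
    print(f"gamma {gamma}: black={black:.3f} mid={mid:.3f} white={white:.3f}")
```

Black and white stay pinned at 0 and 1 for every exponent, while the mid value drops as the exponent rises.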
While these differences may seem extremely subtle, they are more visible as image size is increased. By the time an image takes up an entire wall (such as in a cinema environment), you can be looking at quite significant changes.
So why do we have different gamma profiles and where did they come from?
The Origin of Gamma:
As mentioned at the beginning of this article, the primary reason that gamma is needed today is that electronic sensors see the world in a very different way than our human eyes do. Electronic sensors see the world in linear light, whereas we prioritize information in dark regions and don’t require as much information in bright areas. However, when examining the history of gamma, there was another primary reason for gamma encoding: the CRT (Cathode Ray Tube).
Because the CRT was the only suitable display device that existed in the early days of TV, the way it turned its input signal into light created the need for gamma encoding. Coincidentally, the CRT responds to its input signal in a non-linear way, just as our eyes respond non-linearly to light. A CRT requires large amounts of electric energy to produce even a dark image, but as light output from the screen increases the energy requirements don’t increase in a linear way. Instead they do what can be seen in the graph here:
As you can see, it takes quite an amount of energy input to get up to even half of the CRT’s maximum light output. But then, displaying the remaining 50% of light output requires disproportionately less energy.
So what happens then when an image from a camera is handed off to a CRT display?
As you can see, we have a problem. Because the CRT display doesn’t reproduce light in a linear way, the resulting image is much darker than the original scene that was recorded on camera. So how do we fix this? Ideally we want to get that line to be straight so that the linear light that the camera observed is reproduced and we can see the scene as it originally was. The answer is inverse gamma.
Inverse gamma is a way to counteract the effects of that CRT response. While it's not necessarily the exact mathematical inverse, its overall aim is to pre-correct for what the gamma profile will do. That means that images are recorded brighter than they appeared in real life, so that when the gamma curve is applied to them they are brought back down to how we’re hoping to see them. The mathematical inverse is easily calculated by dividing 1 by the gamma exponent (2.2 in this case), i.e. the inverse gamma of 2.2 is 1 / 2.2 ≈ 0.45, which looks like this graph:
However, in reality many inverse functions aren't the exact mathematical inverse, they are often tweaked a little. Sometimes this can be because a camera manufacturer may want to alter how the dynamic range of their sensor is recording, or in the case of mastering different viewing conditions may alter the desired inverse gamma function.
Just as with normal gamma curves, an inverse gamma curve doesn’t shift the black or white point but shifts all of our mid tone values. In the case of an inverse function, it’s doing this in anticipation of the fact that it’s going to have a gamma function applied to it downstream.
What does an image look like after having an inverse gamma function applied to it?
So, what we’re looking to do is apply an inverse gamma function, then apply a normal gamma function, and have that combination form the desired representation of the original scene:
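The round trip can be sketched numerically. This minimal example assumes pure power functions with the 2.2 exponent used above and a hypothetical mid-gray scene value of 0.18; real cameras and displays use tweaked curves, as noted earlier.

```python
DISPLAY_GAMMA = 2.2  # assumed CRT exponent, as used in the text

def encode(linear):
    """Inverse gamma: brighten linear scene light before transmission."""
    return linear ** (1.0 / DISPLAY_GAMMA)

def display(signal):
    """CRT-style response: the display darkens the mid tones."""
    return signal ** DISPLAY_GAMMA

scene = 0.18  # hypothetical mid-gray scene value (linear light)
print(f"scene light:               {scene:.4f}")
print(f"shown without pre-correct: {display(scene):.4f}")
print(f"encoded signal:            {encode(scene):.4f}")
print(f"shown after pre-correct:   {display(encode(scene)):.4f}")
```

Without the inverse gamma step the displayed value comes out far darker than the scene; with it, the two power functions cancel and the original value is recovered.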
The Whole Chain:
Gamma is a topic that gets confusing quickly because you simultaneously need to think about human eyes, various electronic devices and the signal path that connects all of these together. What we are ultimately aiming for is to reproduce the best representation of the original scene, so let’s have a quick look at the signal chain in order to see where these different gamma operations are being applied.
This is how our original scene looks to the human eye and our goal is to reproduce an approximation of this for an average person in their home.
As the camera records a scene it applies an inverse gamma function to counter the 2.2 gamma that the CRT TV will apply next.
The image with an inverse gamma function is broadcast to the TV. The CRT TV then outputs a corrected image due to the inherent gamma curve in CRT technology. This image is now a reasonable approximation for what the scene looked like when viewed by human eyes.
The Current Purpose of Gamma Correction
As most know, film, video and technology have changed dramatically since CRT TVs were invented. Video cameras still have sensors that read light linearly, but that data can now be saved to a variety of formats such as RAW and logarithmic encodings. As a result, inverse gamma operations aren’t necessarily hard coded into the output formats of video cameras in the way they used to be. Additionally, CRTs are a relic of the past. We’ve moved on to LCD, OLED, digital projection and laser-based display technologies since that time.
So then it may be reasonably asked – why bother with gamma at all nowadays? If we have more control over our display devices why can’t we just stick with a transfer function of 1.0 so that our input equals our output and no messy transformations are required in our signal chain? Well, we could, but to do so in a way that didn’t introduce a problem called "color banding" in shadows would require operating at bit depths beyond what display and consumer storage mediums are currently capable of.
The truth of the matter is that the gamma that was required for CRT-based technology had a really nice coincidental benefit for image quality: it encoded more detail into the shadows and less detail into the bright areas of an image, which is exactly what we prefer to see as humans. While a CRT may be hungry for electricity when producing dark shades, our human eyes are hungry for visual information in dark areas. Let’s explore the topic a bit more, as it’s critical to understanding the current purpose of gamma and other transfer functions used in HDR imagery.
Gamma to Improve Image Quality.
Our human eyes are very good at seeing into the shadows but aren’t so good at distinguishing subtle differences in very bright areas. This is different from camera sensors which have a linear response to light and prioritize all areas equally. Being that content is designed for human consumption, it makes sense to use a system that prioritizes placing data in the areas that we find most important. While inverse gamma operations were necessary to deal with a CRT’s inherent gamma response, that inverse function had a double benefit – in lifting the shadows to a higher value before transmission, it meant all the shadows were given more room/bandwidth to have their values accurately captured.
Even though a CRT’s response would correct this back to how the scene originally looked, those additional values made it through transmission and still provided additional accuracy in the shadow shading. As a result, gamma encoding has continued to be used because it has allowed us to keep more of the useful information we prefer while avoiding encoding more precision than necessary for bright values. This reduces the amount of processing power and storage required. It also means that the spectrum assigned for over the air broadcast can transmit higher quality imagery using gamma encoding than linear encoding.
Here’s a look at how an 8-bit gamma encoded image compares to one that has been created with linear luminance values and then converted back to gamma for viewing on the display currently in front of your eyes. This is a good example of why we don’t use linear luminance values in transmission and storage of deliverables.
Look closely at the shadow range of the bottom bar and you will see that the gradient begins to exhibit obvious banding. This is because linear encoding prioritizes accuracy in the bright regions more than our human eyes do, which comes at the expense of accuracy in the shadows. If we were to use linear light encoding for TVs then, unless we were using much higher bit depths, much of the content we watch would exhibit the shadow banding problems we see above.
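One way to see why the shadows band is to count how many 8-bit code values land in the darkest part of the image under each encoding. The sketch below assumes a pure 2.2 power function for the gamma case and uses an arbitrary threshold of 1% of peak linear light as "deep shadow"; both are illustrative choices.

```python
def shadow_codes(threshold, gamma, bits=8):
    """Count code values whose decoded linear light is at or below threshold."""
    max_code = 2 ** bits - 1
    return sum(
        1
        for code in range(max_code + 1)
        if (code / max_code) ** gamma <= threshold
    )

THRESHOLD = 0.01  # darkest 1% of linear light (illustrative)
linear_count = shadow_codes(THRESHOLD, gamma=1.0)
gamma_count = shadow_codes(THRESHOLD, gamma=2.2)
print(f"8-bit linear encoding:    {linear_count} codes in the deep shadows")
print(f"8-bit gamma 2.2 encoding: {gamma_count} codes in the deep shadows")
```

With the same 256 codes available, the gamma encoded signal has roughly ten times as many steps to describe the deep shadows, which is why its gradient looks smooth where the linear one bands.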
Is Gamma the Best Transfer Function for Human Vision?
Since gamma correction was really defined on the back of a CRT’s response to energy input, it raises the question: is gamma really the best transfer function for images viewed by humans? Now that we aren’t limited by CRT displays, is there another curve that would serve human vision even better? With the advent of High Dynamic Range (HDR) imagery on the horizon, Dolby did some research into this.
The Barten Ramp is an extremely handy graph that plots out where most people can begin to see banding in a gradient (i.e. the steps between each shade) when mapping out all the way to 10,000 nits for potential HDR imagery. The area in green shows where no banding can be seen, while the area in red shows where banding can be seen and is therefore problematic.
For image quality, we would prefer to have an image stay below the Barten threshold at all times. As can be seen in the image above however, in order to do that with a standard gamma transfer function we would need to allow for 15-bits per channel. If we only allocated 10-bits per channel, our darker regions would still exhibit banding artifacts. In addition to 15-bits per channel being a very high requirement for commercial systems (and even more so for consumer systems!), it’s also a very inefficient use of the data; there’s far more data assigned in the mid and bright portions of the image than what is required. A 13-bit Log encoded image stays below the Barten threshold also but has an inefficient encoding of the shadows where more values are allocated than required.
What we really want is a curve that stays below the Barten threshold and follows it as closely as possible to maintain optimal efficiency in encoding. Enter SMPTE ST 2084, otherwise known as the “PQ” transfer function.
PQ EOTF (ST 2084)
The Perceptual Quantizer (PQ) transfer function was specifically designed to maximize luminance encoding efficiency for human vision, hence the name. As can be seen below, it allows a signal to cover from 0.001-10000 nits with no perceivable banding or stepping artifacts while utilizing only 12-bits.
EOTF is an acronym that stands for Electro-Optical Transfer Function. Gamma is also an EOTF, but the term has really risen in use since HDR imagery has been developed for consumer viewing. Don’t let the name scare you; all it really refers to is the transfer function that is used to convert between electrical signals and what our eyes see optically, hence electro-optical. The conversion in the other direction happens when a camera records a scene, in which case it is called an OETF (Opto-Electronic Transfer Function).
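For readers who want to see the PQ curve as numbers, here is a minimal sketch of the EOTF and its inverse. The constants are the ones published for PQ in SMPTE ST 2084 and ITU-R BT.2100 as I understand them; the function names are hypothetical, and the code treats the signal as a normalized 0.0 to 1.0 value rather than as integer code values.

```python
# PQ constants as published in SMPTE ST 2084 / ITU-R BT.2100.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000

def pq_eotf(signal):
    """Map a normalized PQ signal (0.0 to 1.0) to display luminance in nits."""
    p = signal ** (1 / M2)
    return PEAK_NITS * (max(p - C1, 0) / (C2 - C3 * p)) ** (1 / M1)

def pq_inverse_eotf(nits):
    """Map display luminance in nits (0 to 10000) to a normalized PQ signal."""
    y = (nits / PEAK_NITS) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2

for nits in (0.1, 1, 100, 1000, 10000):
    print(f"{nits:>7} nits -> PQ signal {pq_inverse_eotf(nits):.4f}")
```

Running the loop shows how much of the signal range PQ spends on dark and mid luminance levels: a typical SDR peak of 100 nits already sits about halfway up the signal range, leaving the upper half for highlights all the way to 10,000 nits.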
Because of PQ’s efficiency through such a wide luminance range, it’s used by the HDR10, HDR10+ and Dolby Vision HDR standards. The HDR10 varieties use it at 10 bits per channel, whereas Dolby Vision encodes its content at 12 bits per channel, which allows it to stay below the Barten threshold at all times. While at first glance it may seem a curious choice for HDR10 and HDR10+ to use a 10-bit PQ function (given that it sits above the Barten threshold), in the real world image characteristics such as grain and noise break up banding artifacts, so a 10-bit signal is not as problematic as it might appear.
It should also be mentioned here that the BBC, together with NHK, has developed another EOTF for HDR that works differently than PQ, called Hybrid Log-Gamma (HLG). It uses a transfer function that combines traditional gamma encoding with a logarithmic encoding system for the brightest portions of an image. This enables the one signal to contain both an SDR (Standard Dynamic Range) and HDR image, enabling backwards compatibility with existing TV sets. As a result of this backwards compatibility, many are looking to HLG as the preferred EOTF for broadcasting HDR content where there's a mix of SDR and HDR displays.
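The hybrid nature of HLG is easiest to see in its camera-side curve. The sketch below follows the HLG OETF as published in ITU-R BT.2100, to the best of my knowledge: a square-root (gamma-like) segment for the darker part of the range and a logarithmic segment above it. The function name is hypothetical and the input is normalized scene light from 0.0 to 1.0.

```python
import math

# HLG constants as published in ITU-R BT.2100.
A = 0.17883277
B = 1 - 4 * A
C = 0.5 - A * math.log(4 * A)

def hlg_oetf(scene):
    """Map normalized linear scene light (0.0 to 1.0) to an HLG signal."""
    if scene <= 1 / 12:
        # Gamma-like segment: a square-root curve for the darker values.
        return math.sqrt(3 * scene)
    # Logarithmic segment for the brighter values.
    return A * math.log(12 * scene - B) + C

for scene in (0.0, 0.01, 1 / 12, 0.5, 1.0):
    print(f"scene {scene:.4f} -> HLG signal {hlg_oetf(scene):.4f}")
```

The two segments meet at a signal level of 0.5, so the lower half of the signal behaves much like a conventional gamma encoded image, which is what makes the format viewable on SDR sets.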
Today's Transfer Functions
Now seems the best point to have a quick look at the different transfer functions being employed today as they vary depending on the color space and intended destination of the deliverable. It should be noted that while the listing here is what is commonly done, it is possible to force color spaces to use alternate transfer functions - they aren't forcibly tied together. Color space and transfer functions are independent operations but often images that use a particular color space will also use a particular transfer function by convention.
Wrapping It Up
In this article we’ve taken a look at what gamma is, why it exists and how it affects an image. From its necessary beginnings in the case of CRT displays, gamma encoding has served a secondary purpose for decades: it uses a given bit depth more efficiently to prioritize and encode the visual details important to humans. With the advent of High Dynamic Range imagery for consumer viewing, new transfer functions have been created. The PQ EOTF enables a range of 0.001-10000 nits to be encoded with no discernible banding when operating with 12-bits per channel. HLG is an alternative HDR EOTF that combines both gamma and log encoding and has a promising future in HDR broadcast.