Automated tests for processing test files in different codecs and quality settings.

It has now been just over TWELVE YEARS since the initial release of the HAP video codecs in 2013, and now there’s a new kid on the block.

In this blog post we are going to be diving into the details behind the newest addition to the HAP codec family – HAP R, the new ultra high quality GPU accelerated codec.

With these benchmarks we’ll be looking at how HAP R compares to the other HAP codecs in metrics like image quality and file size, and discussing some of the technical details for those who are curious (the full technical specifications can be found here https://github.com/vidvox/hap).

Support for HAP R was added to VDMX in September of 2024 with the release of VDMX6.

The main highlights of HAP R:

Offers higher image quality than HAP and HAP Q.
File sizes comparable to HAP Q.
Always includes an alpha channel component, making it a great alternative to using HAP Alpha and HAP Q Alpha.
Typically slower to encode than other HAP variants, with some exceptions.
Also sometimes referred to as ‘HAP 7A’ because it uses BC7 textures and supports alpha channels.
Allows for ‘chunking’ during compression for improved performance during encoding / decoding on systems that support multi-threading.
As HAP R is relatively new, make sure the software / media server systems you are using support it before encoding media files.
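To give a rough feel for what ‘chunking’ means in practice, here is a minimal sketch of the idea: one frame's data is split into independent pieces that can be compressed and decompressed on separate threads. This is an illustration only, not the actual HAP encoder. It uses Python's built-in zlib as a stand-in for the compressor HAP uses, and the function names and the chunk count of 4 are assumptions made for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def split_chunks(data, chunk_count):
    """Split non-empty bytes into at most `chunk_count` nearly equal pieces."""
    size = -(-len(data) // chunk_count)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def compress_chunked(texture_bytes, chunk_count=4):
    """Compress each chunk independently so the work can run in parallel."""
    chunks = split_chunks(texture_bytes, chunk_count)
    with ThreadPoolExecutor(max_workers=chunk_count) as pool:
        # zlib is only a stand-in here; it is not the compressor HAP uses.
        return list(pool.map(zlib.compress, chunks))

def decompress_chunked(compressed_chunks):
    """Decompress the chunks in parallel and stitch them back in order."""
    with ThreadPoolExecutor(max_workers=len(compressed_chunks)) as pool:
        return b"".join(pool.map(zlib.decompress, compressed_chunks))

# Hypothetical stand-in for one frame of texture data.
frame = bytes(range(256)) * 1024
packed = compress_chunked(frame, chunk_count=4)
restored = decompress_chunked(packed)
print(len(packed), restored == frame)
```

Because each chunk is self-contained, a decoder can hand them to as many threads as it has available, which is where the multi-threading benefit comes from.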

You can transcode movies to HAP R on the Mac using the latest version of the free AVF Batch Exporter utility found on the VDMX DMG in the Extras folder, or in the Releases page of the HAP in AVFoundation GitHub repository.

HAP Quality and Size Benchmarks

The goal of HAP R was to create a codec that had a similar playback & size profile as HAP Q, but with an even higher quality level by taking advantage of the BC7 texture format.

Typically, every video encoding is a trade-off between the amount of disk access needed to read the file, the amount of work that your computer needs to do to decode the file, and how close each frame is to its original pixel data.

For these benchmarks we look at these four main characteristics:

Playback performance (decode fps)
Encoding rate (encode fps)
File size (data rate per frame)
Image quality (PSNR, SSIM, VMAF)

As some of these values can vary depending on the source material being compressed, the results are presented as averages along with best and worst case scenarios.

To examine the image compression quality of HAP R compared to other codecs, we used three different metrics:

PSNR: Peak Signal-to-Noise Ratio, measures full color accuracy across all channels. Measured in dB, higher values are better.
SSIM: Structural Similarity Index Measure, measures structural similarity between images, considering luminance, contrast, and structure. Values range from 0 to 1, with 1 being identical.
VMAF: Video Multi-Method Assessment Fusion, Netflix's perceptual quality metric, designed to correlate with human visual perception. Scores range from 0 to 100, with higher being better.
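For those curious how a PSNR value is actually derived, here is a minimal sketch using the standard definition, PSNR = 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between the original and compressed samples. The function name and the sample values are made up for illustration, and the sketch assumes 8-bit samples (a peak value of 255); it is not the tooling used for the benchmarks in this post.

```python
import math

def psnr(reference, distorted, max_value=255):
    """Return PSNR in dB between two equal-length sequences of pixel samples."""
    if len(reference) != len(distorted):
        raise ValueError("inputs must have the same number of samples")
    # Mean squared error across every sample (all channels flattened together).
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs have no error at all
    return 10 * math.log10(max_value ** 2 / mse)

# Hypothetical 8-bit samples where every value is off by exactly 2.
original = [10, 50, 100, 200]
compressed = [12, 48, 102, 198]
print(round(psnr(original, compressed), 2))  # prints 42.11
```

An error of just 2 levels out of 255 on every sample already lands in the low 40s, which helps put the 40 to 50 dB figures in the tables below into perspective.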

Without getting into the lengthy debate about which of these three is the best way to measure the quality of image compression, for our primary scatter charts we used PSNR. Along with comparing the relative quality of HAP R against other codecs, we wanted to verify that the Apple Texture Encoder was providing results within the PSNR ranges that have been published for other BC7 encoders. We included some of the results from the tests using SSIM and VMAF for comparison.

We did these tests across three general types of movie file content, as compression quality and performance can vary depending on the complexity of the images being encoded.

Basics: Simple tests like solid colors, checkerboards, gradients, and similar generated patterns.
Videos: Shot with a camera in a variety of settings, e.g. outdoors, indoors.
Rendered: CGI rendered graphics.

Quality vs Size Comparisons

The first goal was to compare HAP R against the existing HAP and HAP Q codecs to confirm that it offered significantly better quality results while maintaining very fast decode rates and reasonable file sizes.

The Good, The Bad, and the Average

To get an idea of what these quality metrics look like, here are the results for some actual movies from our series of tests.

We’ll start with the best and worst cases at the extremes of our examples, and then look at more typical use cases.

Color Gradient

We consider the ‘best case’ scenarios as situations where there are high quality compression results and low data rates. An example from that group is this simple color gradient.

Codec       PSNR   SSIM    VMAF   Kb/Frame
------------------------------------------
HAP_0.75    45.64  0.9941  100.0     922.2
HAP_1.0     49.59  0.9977  100.0    1090.2
HAP_Q       42.45  0.9986  100.0    2660.5
HAP_R_0.75  50.73  0.9993  100.0    2723.5
HAP_R_1.0   51.37  0.9994  100.0    2897.3

From these results we can see that when working with simple solid colors and gradients HAP R offers significant improvement over HAP and HAP Q in PSNR and SSIM.

RGB Noise

A movie filled with random noise – every video codec's worst nightmare. These tests traditionally have the lowest PSNR scores, at the highest data rates, for each variation of HAP, and most other codecs.

Codec       PSNR   SSIM    VMAF   Kb/Frame
------------------------------------------
HAP_0.75    28.32  0.9376   65.7    8286.1
HAP_1.0     29.64  0.9485   72.9    8289.3
HAP_Q       31.47  0.9750   84.5   16589.1
HAP_R_0.75  30.69  0.9752   83.3   16529.1
HAP_R_1.0   31.99  0.9828   90.2   16589.1

While random noise is not the most typical use case for video, these values give us a ‘worst case’ on the data rate and push the quality metrics to the lower limit.

It is also noteworthy that although the PSNR scores are fairly close here, the SSIM and VMAF metrics show a much larger gap for this extreme edge test: HAP R at 100% reaches a VMAF of 90.2, compared with 84.5 for HAP Q and 72.9 for HAP at 100%.

Fireplace Video

How does real world footage shot with a camera do with HAP R? Videos like this typically fall in the category of high quality compression with high data rates.

Codec       PSNR   SSIM    VMAF   Kb/Frame
------------------------------------------
HAP_0.75    41.57  0.9957   99.7    6024.7
HAP_1.0     46.60  0.9980   99.7    6991.6
HAP_Q       42.28  0.9979   99.7   13104.1
HAP_R_0.75  49.72  0.9989   99.7   13552.3
HAP_R_1.0   50.61  0.9989   99.7   14061.6

In these cases HAP R offers significant quality improvements over HAP and HAP Q.

As with our other tests, its data rates per frame are similar to those of HAP Q.

Big Buck Bunny

Nearly right in the middle of our quality vs size tests was Big Buck Bunny, a true open-movie animation classic. This works as a great example of how HAP R handles CGI based footage.

Codec       PSNR   SSIM    VMAF   Kb/Frame
------------------------------------------
HAP_0.75    41.00  0.9964   93.1    4609.7
HAP_1.0     45.02  0.9987   94.1    5358.8
HAP_Q       44.68  0.9992   96.8   10075.2
HAP_R_0.75  48.34  0.9997   98.1   10023.4
HAP_R_1.0   48.95  0.9997   98.4   10314.3

With Big Buck Bunny, HAP R shows big improvements over HAP and HAP Q, in all quality metrics. As expected, the file sizes are roughly the same as using HAP Q.
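To put numbers on "roughly the same", here is a small sketch that works out the PSNR gain and the relative data rate of HAP R at 100% against the other codecs. The values are copied from the Big Buck Bunny table above; the dictionary layout and the `compare` helper are just for this example.

```python
# Values copied from the Big Buck Bunny table: (PSNR in dB, Kb per frame).
results = {
    "HAP_1.0": (45.02, 5358.8),
    "HAP_Q": (44.68, 10075.2),
    "HAP_R_1.0": (48.95, 10314.3),
}

def compare(candidate, baseline):
    """Return (PSNR gain in dB, size ratio) of candidate relative to baseline."""
    cand_psnr, cand_size = results[candidate]
    base_psnr, base_size = results[baseline]
    return cand_psnr - base_psnr, cand_size / base_size

for baseline in ("HAP_1.0", "HAP_Q"):
    gain, ratio = compare("HAP_R_1.0", baseline)
    print(f"HAP_R_1.0 vs {baseline}: {gain:+.2f} dB PSNR, {ratio:.2f}x the data per frame")
# HAP_R_1.0 vs HAP_1.0: +3.93 dB PSNR, 1.92x the data per frame
# HAP_R_1.0 vs HAP_Q: +4.27 dB PSNR, 1.02x the data per frame
```

For this movie, HAP R at 100% costs about 2% more data per frame than HAP Q while gaining a little over 4 dB of PSNR.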

Quality Metrics Side by Sides

With bar graphs we can look at the average PSNR, SSIM, and VMAF metrics for each codec alongside each other, as well as the values of our outliers like Noise.

For this set of quality comparison tests we also included PhotoJPEG at 75% and Apple ProRes 422 for reference.

PSNR, SSIM, and VMAF Ranges by Codec:

Looking at these results we can see:

While all codecs have extreme outlier cases (e.g. noise) where they perform poorly under PSNR, the actual average value tends to be much closer to the highest quality.
HAP R offers on average a 5 dB increase (9% improvement) in pixel color accuracy over HAP Q, and a 7 dB increase (17% improvement) over regular HAP, measured with PSNR.
HAP R also has consistently better scores than HAP and HAP Q when using perceptual based metrics like SSIM and VMAF.
The average compression quality of regular HAP at 75% is similar to PhotoJPEG at 75% using PSNR and SSIM metrics.
HAP R at 100% quality has quality levels on par with ProRes 422 (despite this, you should NOT use HAP R in place of ProRes 422 for video editing and archival purposes).
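Because PSNR is a logarithmic scale, a few dB can represent a large change in the underlying error. From the standard definition, PSNR = 10 * log10(MAX^2 / MSE), a gain of N dB at the same peak value means the mean squared error shrinks by a factor of 10^(N/10). A minimal sketch of that conversion follows; the helper name is made up for this example, and the relationship is exact for a single comparison but only approximate when applied to averages taken across many movies.

```python
def mse_reduction_factor(psnr_gain_db):
    """Factor by which mean squared error shrinks for a given PSNR gain in dB."""
    return 10 ** (psnr_gain_db / 10)

for gain in (5, 7):
    print(f"+{gain} dB PSNR -> MSE reduced about {mse_reduction_factor(gain):.1f}x")
# +5 dB PSNR -> MSE reduced about 3.2x
# +7 dB PSNR -> MSE reduced about 5.0x
```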

PSNR vs Data Rate (Kb / frame)

Another way of visualizing these results is to use a scatter chart showing off various data points on a 2D grid. This gives us not just a sense of what the ‘average’ values are, but also lets us get a better sense of the relationship between two metrics. It also lets us spot “clusters” of data points for further analysis.

In this case we have PSNR on the Y-axis and the data rate per frame on the X-axis. Going from left to right, file sizes get bigger. At the bottom are data points for movies that had poor compression quality. At the top are the movies that had good compression quality.

For this scatter chart we limited the tests to a subset of 16 movies to make it more readable.

This scatter chart shows:

The size and quality of each test movie, for each HAP variation.
HAP R generally has PSNR levels in the 48 to 54 range, with some edge cases in the low 40s.
The highest quality score HAP R movies often have low to mid range data rates.
Outlier cases like ‘Noise’ result in the lowest quality and biggest file sizes, with PSNR ratings in the low 30s.
HAP R has file sizes within the same approximate range as HAP Q, even when including an alpha channel.

Encoding Speed Tests

As noted, while HAP R is higher quality than HAP Q at roughly the same file sizes, the trade-off is in encoding time. Here we have a bar graph comparing the relative encoding times for each of the HAP varieties.

The major takeaways from these results are:

HAP 75% and HAP Q have similar encoding rates on M-series Macs.
Using the 75% quality for HAP and HAP R is often much faster than the 100% level.
In some very specific cases (solid color) the HAP 100% encoder is extremely fast, and in others (noise) it is very slow.

HAP / HAP R Quality Setting

How does the encoding quality setting for HAP and HAP R change the overall output? What kind of an impact does it make on encoding speeds and file sizes? These are great questions that people ask all the time.

While we are doing these tests, let’s take a closer look at how the file size, quality, and encoding speed vary depending on the quality setting for both HAP and HAP R.

Medium vs Highest Quality Comparisons

In these scatter charts, the normalized encoding time for each data point is represented by the size of the circle, with smaller circles being faster (better) encoding rates.

Other notes about the encode quality setting for HAP / HAP R:

When using regular HAP, the encoder switches to a higher quality technique at 80%. The higher quality encoding mode is much slower, but still produces files significantly smaller than HAP Q / HAP R.
When using HAP R, the quality setting for the ATE encoder shows no variation below 99%. Using the 100% quality setting with the ATE is much slower, but produces movie files comparable to ProRes 422 in PSNR, SSIM, and VMAF.

Conclusions and Further Discussion

While HAP R marks a huge improvement in quality over the existing HAP varieties, if smaller file size and / or faster encoding speeds are needed in your workflow, you may be better off sticking with HAP / HAP Q.

As HAP R always includes an alpha channel, it is also a great replacement for usage of HAP Alpha and HAP Q Alpha. We’ll provide further analysis comparing data rates of these specific use cases in a follow up post.

Our next steps with HAP R are to investigate using some of the other BC7 encoders to see how they compare to the Apple Texture Encoder in terms of quality and speed.

Some additional notes about these tests:

We used the HAP in AVFoundation framework for encoding test files. Other similar frameworks and codebases that are designed to work with HAP may vary slightly depending on their optimizations for encoding, but should offer quality and file size results within the same ranges.
Where applicable, the usage of the ‘quality’ setting may vary depending on the specific texture encoder algorithm being used. We’ve included some analysis of how this setting maps to quality / size / encoding speeds when using the AVF Batch Exporter / HAP in AVFoundation.
HAP in AVFoundation uses the Apple Texture Encoder (ATE for short) for encoding to HAP R. The quality and size results from our tests appear to match up with published results from other BC7 encoders.
Quality metrics were generated using the VMAF capabilities in FFmpeg. In some cases we used ProRes 4444XQ as an intermediate format during this process.
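For anyone who wants to try similar measurements, here is a minimal sketch of how a comparison command might be put together using FFmpeg's `psnr`, `ssim`, and `libvmaf` filters. This is not the exact command line used for the tests in this post. The helper name and file names are placeholders, and the `libvmaf` filter is only available in FFmpeg builds compiled with libvmaf support. The sketch only builds and prints the command; it does not run FFmpeg.

```python
import shlex

def build_metric_command(distorted, reference, metric="psnr"):
    """Build an FFmpeg command line that scores `distorted` against `reference`."""
    filters = {"psnr": "psnr", "ssim": "ssim", "vmaf": "libvmaf"}
    if metric not in filters:
        raise ValueError(f"unknown metric: {metric}")
    return [
        "ffmpeg",
        "-i", distorted,    # first input: the encoded movie being judged
        "-i", reference,    # second input: the original it is compared to
        "-lavfi", filters[metric],
        "-f", "null", "-",  # discard the output; the scores appear in the log
    ]

# Placeholder file names, for illustration only.
cmd = build_metric_command("clip_hap_r.mov", "clip_source.mov", "vmaf")
print(shlex.join(cmd))
# ffmpeg -i clip_hap_r.mov -i clip_source.mov -lavfi libvmaf -f null -
```

Both inputs need to match in resolution and frame count for the scores to be meaningful, and the FFmpeg build has to be able to decode both files, which is one reason an intermediate format can be useful.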

Additional Notes for Developers

Developers looking to support HAP R in their software / media servers can find more information here:

Official HAP specification: https://github.com/vidvox/hap
Developers page on the HAP website: https://hap.video/developers
Using HAP in AVFoundation: https://github.com/Vidvox/hap-in-avfoundation

For cross platform support, the Demolition Studios team has started work on an FFmpeg branch with support for HAP R (https://github.com/DemolitionStudios/FFmpeg/tree/hap_bc6h) which will hopefully be part of the standard release in the next few months.