
[Video coding learning] - SAD and SATD

2022-08-09 14:47:00 【Stars¹⁸⁹⁵】


I. Definitions of common error measures

SAD (Sum of Absolute Differences) = SAE (Sum of Absolute Errors): the sum of the absolute values of the differences.

SATD (Sum of Absolute Transformed Differences): the sum of the absolute values of the differences after a Hadamard transform.

SSD (Sum of Squared Differences) = SSE (Sum of Squared Errors): the sum of the squares of the differences.

MAD (Mean Absolute Difference) = MAE (Mean Absolute Error): the mean of the absolute differences.

MSD (Mean Squared Difference) = MSE (Mean Squared Error): the mean of the squared differences.
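The pixel-domain measures above can be written down in a few lines. The following is only a minimal sketch: the function name `block_metrics` and the 2×2 sample blocks are made up for illustration, and blocks are assumed to be equally sized lists of rows.

```python
def block_metrics(src, pred):
    """Return SAD, SSD, MAD and MSE between two equally sized 2-D blocks."""
    # Flatten the per-pixel differences (source minus prediction).
    diffs = [s - p for row_s, row_p in zip(src, pred) for s, p in zip(row_s, row_p)]
    n = len(diffs)
    sad = sum(abs(d) for d in diffs)   # sum of absolute differences
    ssd = sum(d * d for d in diffs)    # sum of squared differences
    return {"SAD": sad, "SSD": ssd, "MAD": sad / n, "MSE": ssd / n}


# Hypothetical 2x2 blocks; the differences are 1, 0, -2, 3.
src = [[10, 12], [14, 16]]
pred = [[9, 12], [16, 13]]
metrics = block_metrics(src, pred)
print(metrics)  # {'SAD': 6, 'SSD': 14, 'MAD': 1.5, 'MSE': 3.5}
```

MAD and MSE are simply SAD and SSD divided by the number of pixels, so for a fixed block size they rank candidates in the same order.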

II. Application in video coding

Take RDO (rate-distortion optimization) as an example. In the RDO decision for the coding mode, the cost of a mode is:

J(mode) = SSD + λ*R(ref, mode, mv, residual)

Here, SSD is the sum of squared differences between the reconstructed block and the source image; λ is the Lagrange multiplier, which can be regarded as a weight; R is the actual number of bits produced by encoding the macroblock in this mode, including the bits for the reference frame, mode, motion vectors, residual, and so on. For an intra mode, only R(mode, residual) applies.
What confuses many people is this: the current macroblock has not been encoded yet, so how can its bitstream and reconstructed image be known? In fact, RDO actually encodes the macroblock once in each mode to obtain J(mode), and then selects the mode with the smallest J(mode) as the actual coding mode. This is like introducing a large feedback loop into the encoder. If a video encoder really performs RDO according to the formula above, encoding will be very slow, although the coding efficiency will be the best.
Therefore, practical encoders often use the following simplified formulas instead:

J(mode) = SAD + λ*R(ref, mode, mv)

J(mode) = SATD + λ*R(ref, mode, mv)

Here, SAD is the sum of absolute errors between the predicted block and the source image in this mode. The rate R no longer includes the residual bits, so J(mode) can be obtained directly after motion estimation, which greatly reduces the computational complexity.
SATD is the sum of the absolute values of the coefficients obtained by applying the Hadamard transform to the residual. In most cases SATD is a better estimator than SAD, but it requires an extra transform, so it costs more computation.
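The simplified decision J(mode) = distortion + λ*R can be sketched as below. This is an illustrative sketch, not any particular encoder's implementation: `choose_mode`, the mode names, the header bit counts and the λ values are all hypothetical, and the distortion function defaults to SAD but could be replaced by a SATD function.

```python
def sad(src, pred):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(s - p) for rs, rp in zip(src, pred) for s, p in zip(rs, rp))


def choose_mode(src, candidates, lam, distortion=sad):
    """Pick the candidate with the smallest J = distortion + lam * header_bits.

    candidates maps a mode name to (prediction block, estimated header bits),
    where the header bits stand in for R(ref, mode, mv) without residual bits.
    """
    costs = {
        mode: distortion(src, pred) + lam * bits
        for mode, (pred, bits) in candidates.items()
    }
    best = min(costs, key=costs.get)
    return best, costs


# Hypothetical candidates for one 2x2 source block.
src = [[10, 12], [14, 16]]
candidates = {
    "mode_a": ([[10, 12], [14, 12]], 2),   # SAD = 4, cheap to signal
    "mode_b": ([[10, 12], [14, 15]], 10),  # SAD = 1, expensive to signal
}
best, costs = choose_mode(src, candidates, lam=1.0)
print(best, costs)  # mode_a {'mode_a': 6.0, 'mode_b': 11.0}
```

With a smaller weight such as `lam=0.1`, the same candidates give `mode_b`, which shows how λ trades distortion against rate.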

III. Application scenarios of SAD and SATD

SATD: When rate-distortion optimization is not used, SATD + λ×R(mode, ref, mv) serves as the basis for mode selection; SATD is also used for intra mode selection. In motion estimation, SATD is used for the sub-pixel search.
SAD: In motion estimation, SAD is used for the integer-pixel search.

Explanation
SAD is the sum of absolute errors. It only reflects the magnitude of the residual in the pixel domain, which affects the PSNR value, and it cannot effectively reflect the size of the bitstream. SATD is the sum of the absolute values of the coefficients obtained after the prediction residual of a 4×4 block has undergone the Hadamard transform. The Hadamard transform can be regarded as a simple transform to the frequency domain, so the SATD value reflects, to a certain extent, the size of the generated bitstream. Therefore, when rate-distortion optimization is not used, SATD can serve as the basis for mode selection.
In intra prediction, all modes generally have to be tested, and SATD is chosen for intra mode selection for the same reason as above.
In motion estimation, the matching error (SAD) generally grows with the distance from the best matching point. This is the well-known unimodal error surface assumption, and most existing fast motion estimation algorithms rely on it. The transformed SATD value, however, does not satisfy this condition, so an integer-pixel search based on SATD easily falls into a local optimum. In the sub-pixel search there are few points to test and the SAD differences between them are small, so SATD can be used to select the matching position that yields fewer bits.
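As a rough illustration of the integer-pixel stage, the sketch below runs an exhaustive SAD search over a small window. The function names, the 8×8 synthetic frames and the search range are assumptions chosen for the example; real encoders use fast search patterns rather than testing every position, and the sub-pixel SATD refinement is not shown.

```python
def sad_at(cur, ref, bx, by, dx, dy, size):
    """SAD between the current block at (bx, by) and the reference block displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def full_search(cur, ref, bx, by, size, search_range):
    """Exhaustive integer-pixel search; returns ((dx, dy), sad) of the best match."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + size > w or y0 + size > h:
                continue  # skip candidates that fall outside the reference frame
            cost = sad_at(cur, ref, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Synthetic reference frame: a smooth ramp, so the SAD grows away from the true match.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [row[:] for row in ref]
# Make the 2x2 current block at (2, 2) a copy of the reference block at (3, 1),
# i.e. the true displacement is (dx, dy) = (1, -1).
for y in range(2):
    for x in range(2):
        cur[2 + y][2 + x] = ref[1 + y][3 + x]

mv, cost = full_search(cur, ref, bx=2, by=2, size=2, search_range=2)
print(mv, cost)  # (1, -1) 0
```

On this ramp the SAD at displacement (dx, dy) is 4 × |10 × (dy + 1) + (dx − 1)|, so the error surface has a single minimum, which is the kind of behavior fast search algorithms depend on.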

IV. Physical meaning of SAD and SATD

SAD is the sum of absolute errors. It only reflects the magnitude of the residual in the pixel domain, which affects the PSNR value, and it cannot effectively reflect the size of the bitstream. Its value only reflects the distortion D.
SATD is the sum of the absolute values of the Hadamard transform coefficients of the residual. The transform can be regarded as a simple conversion to the frequency domain, so the SATD value reflects, to a certain extent, the rate R of the generated bitstream.
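A small numeric example may make the difference concrete. The sketch below computes an unnormalized 4×4 SATD as the sum of |H·X·H|, where H is the 4×4 Hadamard matrix and X is the residual; the helper names and the two sample residuals are invented for illustration, and the scaling factor that real encoders usually apply is left out.

```python
# 4x4 Hadamard matrix; it is symmetric, so H4 equals its own transpose.
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def sad4x4(residual):
    """Sum of absolute residual values in the pixel domain."""
    return sum(abs(r) for row in residual for r in row)


def satd4x4(residual):
    """Unnormalized SATD: sum of the absolute values of H4 * residual * H4."""
    coeffs = matmul4(matmul4(H4, residual), H4)
    return sum(abs(c) for row in coeffs for c in row)


# Two hypothetical residuals with the same SAD.
flat = [[1] * 4 for _ in range(4)]        # constant offset of 1 everywhere
impulse = [[0] * 4 for _ in range(4)]
impulse[0][0] = 16                        # a single large error

print(sad4x4(flat), satd4x4(flat))        # 16 16
print(sad4x4(impulse), satd4x4(impulse))  # 16 256
```

Both residuals have the same SAD, but the flat one collapses into a single transform coefficient while the impulse spreads over all sixteen. SAD cannot tell the two apart, whereas SATD separates them, which is consistent with the claim that SATD tracks the rate of a transform-coded residual more closely.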