
Application of DCT transform

2022-04-23 19:13:00 【cjzcjl】

Contents

Why write this article:

One. Introduction to the Fourier transform

Two. What is DCT?

Three. Application of DCT in lossy image compression

1. The basis of lossy image compression:

2. How DCT is used for lossy image compression:

3. How IDCT is used for image decompression:

Three. Gitee address of the Java implementation of the DCT algorithm (AndroidDemo):

Four. The corresponding DCT code in libjpeg:

Five. References:

Why write this article:

Since I started working as an Android client developer, I have come into contact with audio, video, and image processing, and gradually developed a strong interest in them. I therefore want to start from the underlying theory and build my own understanding of it. Only with a solid grasp of the underlying mathematics and engineering is it possible to become an expert in this field and to innovate, instead of being led by the nose by other people's SDKs.

In addition, as an Android client developer, if Android one day stops being viable, most of the experience I have accumulated in calling the Android SDK would be reset to zero, which I think would be very damaging to a whole career. So whether out of interest or out of a wish to prepare for danger in times of peace, I want to learn knowledge that has a stable structure. Combined with my interests, audio, video, codecs, graphics and images, and AI are the closest to the kind of knowledge I want. They are built on mathematical and physical theorems. Front-end frameworks, the various client-side wheels, back-end frameworks such as Spring, and other areas that depend on other people's wheels can easily be overturned; when that happens, the whole body of knowledge is torn down, and all the time invested starts again from zero with each framework update. That is an unstable field of knowledge. In stable fields, whether embedded systems, hardware development, game development, or graphics and images, knowledge is stable and cumulative: every bit you learn adds to your skill. For example, going from ordinary normal-vector-based lighting to ray tracing uses similar mathematics, mainly trigonometric functions, with at most some probability added. This mathematics does not change on a whim like a client API; once proven correct it stays correct, and new knowledge is layered on top of it. So this is a stable field of knowledge: once you have learned it, it is yours, unlike SDK-style knowledge, where SDK developers are always leading you along with the so-called trend of the times.

This requires a certain amount of basic mathematics. So I am not going to be satisfied with learning how to use libraries such as FFMPEG and libJPEG. Instead, I will study the basic mathematical principles of audio, video, graphics, and images together with their implementation code, and internalize this knowledge as my own skill. Only then can I create something new.

One. Introduction to the Fourier transform

Every complex waveform can be decomposed into a superposition of simple waveforms; this is the basic principle of the Fourier transform. This article does not explain the Fourier transform in detail; knowing this principle is enough. Based on it, a time-domain signal can be converted into a frequency-domain signal. Recording a signal then no longer requires dense sampling and storing the value of every sample point. Instead, it is enough to record the amplitude and phase of each frequency component, and to restore the signal by generating a waveform for each component with trigonometric functions and adding them together.
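The idea of storing only amplitude and phase per frequency can be sketched in a few lines of Java. This is an illustrative sketch, not code from the original demo: the class `FourierSketch` and its methods are hypothetical names, and the analysis uses a naive single-bin DFT rather than an FFT.

```java
// Sketch only: class and method names are hypothetical, not taken from the article's demo.
class FourierSketch {
    /** Builds n samples of sum_k amps[k] * cos(2*pi*freqs[k]*i/n + phases[k]). */
    static double[] synthesize(int n, int[] freqs, double[] amps, double[] phases) {
        double[] signal = new double[n];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < freqs.length; k++) {
                signal[i] += amps[k] * Math.cos(2 * Math.PI * freqs[k] * i / n + phases[k]);
            }
        }
        return signal;
    }

    /** Naive DFT of a single bin; returns {amplitude, phase} of the cosine at that frequency. */
    static double[] analyze(double[] signal, int freq) {
        int n = signal.length;
        double re = 0;
        double im = 0;
        for (int i = 0; i < n; i++) {
            double angle = 2 * Math.PI * freq * i / n;
            re += signal[i] * Math.cos(angle);
            im -= signal[i] * Math.sin(angle);
        }
        // For 0 < freq < n/2, a real cosine of amplitude A contributes A*n/2 to this bin.
        double amplitude = 2 * Math.hypot(re, im) / n;
        double phase = Math.atan2(im, re);
        return new double[] {amplitude, phase};
    }

    public static void main(String[] args) {
        // Two components: frequency 3 (amplitude 1.0) and frequency 10 (amplitude 0.5).
        double[] signal = synthesize(64, new int[] {3, 10},
                new double[] {1.0, 0.5}, new double[] {0.0, Math.PI / 4});
        for (int f : new int[] {3, 10, 7}) {
            double[] r = analyze(signal, f);
            System.out.printf("freq %d: amplitude %.3f, phase %.3f%n", f, r[0], r[1]);
        }
    }
}
```

Bins 3 and 10 return the amplitudes that went into the signal, while bin 7, which is not part of the signal, comes back with an amplitude close to zero.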

Two. What is DCT?

DCT stands for Discrete Cosine Transform. Its principle is similar to that of the Fourier transform: it decomposes a complex time-domain signal into frequency-domain components of different strengths. As in the Fourier introduction above, a complex signal is a superposition of simple signals. The basic principle of DCT is therefore that DCT basis signals of different frequencies, each with its own strength, can be stacked to "assemble" the original signal. When recording, it is no longer necessary to store the complex original signal; it is enough to store the weight of each DCT basis. After a signal is transformed into the DCT frequency domain, the upper-left corner holds the strength of the low-frequency components and the lower-right corner holds the strength of the high-frequency components, so the frequency content and strength of the signal are clear at a glance, without the complexity of the original signal.

Compared with the two-dimensional discrete Fourier transform, whose high and low frequencies are scattered around the four corners, the DCT's frequency layout is clear at a glance: the closer to the upper-left corner, the lower the frequency, and the closer to the lower-right corner, the higher the frequency. This makes it better suited to trimming the frequency domain for lossy image compression.
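To see the corner layout in practice, the following sketch transforms two 8x8 test blocks and reports where the largest coefficient lands. It is an assumption-laden illustration: `DctCornerSketch`, `dct8x8`, and `peak` are hypothetical names, and the transform is the common 8x8 DCT-II with a 1/4 scale factor, written as a direct quadruple loop.

```java
// Sketch only: hypothetical names; a direct (slow) 8x8 DCT-II used purely for illustration.
class DctCornerSketch {
    static double[][] dct8x8(double[][] f) {
        double[][] out = new double[8][8];
        for (int u = 0; u < 8; u++) {
            for (int v = 0; v < 8; v++) {
                double cu = (u == 0) ? 1 / Math.sqrt(2) : 1;
                double cv = (v == 0) ? 1 / Math.sqrt(2) : 1;
                double sum = 0;
                for (int x = 0; x < 8; x++) {
                    for (int y = 0; y < 8; y++) {
                        sum += f[x][y]
                                * Math.cos((2 * x + 1) * u * Math.PI / 16)
                                * Math.cos((2 * y + 1) * v * Math.PI / 16);
                    }
                }
                out[u][v] = 0.25 * cu * cv * sum;
            }
        }
        return out;
    }

    /** Returns {u, v} of the coefficient with the largest magnitude. */
    static int[] peak(double[][] c) {
        int[] best = {0, 0};
        for (int u = 0; u < 8; u++) {
            for (int v = 0; v < 8; v++) {
                if (Math.abs(c[u][v]) > Math.abs(c[best[0]][best[1]])) {
                    best = new int[] {u, v};
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] flat = new double[8][8];
        double[][] checker = new double[8][8];
        for (int x = 0; x < 8; x++) {
            for (int y = 0; y < 8; y++) {
                flat[x][y] = 100;                               // no detail at all
                checker[x][y] = ((x + y) % 2 == 0) ? 1 : -1;    // every pixel differs from its neighbors
            }
        }
        int[] p1 = peak(dct8x8(flat));
        int[] p2 = peak(dct8x8(checker));
        System.out.println("flat block peak: (" + p1[0] + "," + p1[1] + ")");
        System.out.println("checkerboard peak: (" + p2[0] + "," + p2[1] + ")");
    }
}
```

The flat block puts all of its energy at (0,0), the upper-left corner, while the checkerboard, the finest detail an 8x8 block can hold, peaks at (7,7), the lower-right corner.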

Three. Application of DCT in lossy image compression

1. The basis of lossy image compression:

Most of the time, when we look at an image with the naked eye, we mainly notice its large patches of color and pay little attention to the fine detail (people with Asperger's may be the opposite, but this holds for most people). So even if the fine detail of an image is weakened to some degree, the overall look of the image is not affected. This is the fundamental principle behind the lossy JPEG image format.

In image processing, detail is characterized by large brightness fluctuations and many brightness changes over a short distance. Put mathematically, the details of an image are generally "high-frequency signals". So if a "frequency divider" can turn the image's "time-domain signal" into a "frequency-domain signal", part of the high-frequency signal can be filtered out and only the low-frequency signal kept, which reduces the space needed to represent the image.

2. How DCT is used for lossy image compression:

Let us first look at the formulas for the DCT and the inverse DCT (IDCT):

(Quoted from the book "Digital Image and Video Processing")

First look at DCT formula 1. u and v are the matrix indices of the DCT frequency-domain result we want to obtain, and f(x,y) is the value of the original 8x8 signal matrix.
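For reference, the 8x8 DCT and IDCT are commonly written in the form below, which is the form used in the JPEG standard and which matches the 1/4 scale factor and the cosine terms in the code in this article. This is given as an assumption about what the quoted formulas look like; the book's own notation may differ.

```latex
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\,
         \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F(u,v)\,
         \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}

C(k) = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k \neq 0 \end{cases}
```

Here F(u,v) is the frequency-domain coefficient at index (u,v), and C(u) and C(v) are evaluated independently of each other.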

The general meaning of the formula:

As u and v increase, the frequency of the basis increases, until at (u,v) = (7,7) each pixel differs from its neighbors. The frequency basis images for (u, v) from (0,0) to (7,7) are as follows:

Calculation code:

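    /** Computes the 8x8 DCT basis image for frequency index (u, v). */
    public void calcDCTBase(int u, int v) {
        this.mU = u;
        this.mV = v;
        // C(u) and C(v) are chosen independently: 1/sqrt(2) when the index is 0, otherwise 1.
        double c_u = (u == 0) ? 1 / Math.sqrt(2) : 1;
        double c_v = (v == 0) ? 1 / Math.sqrt(2) : 1;
        for (int y = 0; y < 8; y++) {
            for (int x = 0; x < 8; x++) {
                double base = c_u * c_v
                        * Math.cos((2 * x + 1) * u * Math.PI / 16.0)
                        * Math.cos((2 * y + 1) * v * Math.PI / 16.0);
                mDCTBaseMatrix[x][y] = base;
            }
        }
        // Request a redraw so that the new basis image is displayed.
        invalidate();
    }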

As mentioned earlier, any complex periodic signal can be obtained by superposing simple periodic functions. Our input image can likewise be formed by superposing basis images of different frequencies, each with its own strength. For each frequency index (u,v) from (0,0) to (7,7), the two-dimensional image signal f(x,y) is multiplied pixel by pixel with the corresponding basis image, the products are summed, and the sum is stored at position (u,v) of the result matrix. The value at (u,v) is the correlation (the weight) between f(x,y) and that basis image, so the full result matrix gives the weight of every frequency from (0,0) to (7,7) in the image.

Finally, following the visual characteristics of the human eye, a quantization table is applied. Low-frequency signals, to which the eye is sensitive, use a smaller quantization step to preserve precision; high-frequency signals, to which the eye is insensitive, can be quantized coarsely with a large step. This achieves data compression: after quantization, the high-frequency signals become almost 0. The quantization table I use is shown in the figure:

For example, the input image f(x,y):

After the DCT and the quantization table are applied, the output is:

Most of the signal in this image correlates most strongly with a few low-frequency basis images. At this point the image has been transformed from the time domain to the frequency domain, and the high-frequency signal has been compressed.
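The quantization step described above can be sketched as a divide-and-round followed by a multiply on the way back. The class `QuantizeSketch` is a hypothetical name, and the step sizes 16 and 99 are illustrative values chosen for this sketch, not entries from the author's table.

```java
// Sketch only: hypothetical names and illustrative step sizes, not the article's table.
class QuantizeSketch {
    /** Quantization: divide by the step and round to the nearest integer level. */
    static long quantize(double coefficient, int step) {
        return Math.round(coefficient / step);
    }

    /** Inverse quantization: multiply the stored level by the same step. */
    static double dequantize(long level, int step) {
        return (double) level * step;
    }

    public static void main(String[] args) {
        // A strong low-frequency coefficient with a small step keeps most of its value.
        long low = quantize(235.6, 16);
        System.out.println("low frequency: level " + low + ", restored " + dequantize(low, 16));

        // A weak high-frequency coefficient with a large step collapses to zero.
        long high = quantize(7.3, 99);
        System.out.println("high frequency: level " + high + ", restored " + dequantize(high, 99));
    }
}
```

The rounding is where information is discarded: 235.6 comes back as 240.0, and 7.3 comes back as 0.0. Long runs of zero levels in the high-frequency region are what make the data easy to compress afterwards.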

Calculation code:

    /**
     * For an input 8x8 signal matrix, multiplies the signal by each DCT basis image, from the
     * lowest frequency (u, v) = (0, 0) to the highest (7, 7), and sums the products. The sum is
     * scaled, divided by the brightness quantization step, and stored at (u, v) in the output.
     */
    private void signalToDCTSignalTrans(int[] inputMatrix, double[][] outputMatrix) {
        for (int u = 0; u < 8; u++) {
            for (int v = 0; v < 8; v++) {
                double base = 0;
                for (int x = 0; x < 8; x++) {
                    for (int y = 0; y < 8; y++) {
                        // inputMatrix is the 8x8 block flattened into a one-dimensional array.
                        base += inputMatrix[y * 8 + x] * mDCTBaseMatrix[u][v].getSignalDCTBaseVal(x, y);
                    }
                }
                // The quotient is kept as a double here; JPEG rounds it to an integer at this point.
                outputMatrix[u][v] = 0.25 * base / Constant.DCT_BRIGHTNESS_TRANS_TABLE[u * 8 + v];
            }
        }
    }

3. How IDCT is used for image decompression:

Because any complex periodic signal can be obtained by superposing simple periodic functions, each value stored at (0,0) to (7,7) in the DCT image is the correlation (the weight) with the corresponding basis image. Multiplying each weight by its basis image and summing the results gives the time-domain signal, that is, the source image.

Applying inverse brightness quantization and the IDCT to the DCT image obtained above gives the image:

You can see that the image that has gone through the transform differs from the original to some degree because of quantization error. This is why JPEG image quality gets worse and worse after repeated decompression and recompression.

Calculation code:

    /** IDCT: converts the DCT frequency-domain signal back into a time-domain signal. */
    private void DCTSignalInverseDCTToSignal() {
        for (int y = 0; y < 8; y++) {
            for (int x = 0; x < 8; x++) {
                double base = 0;
                for (int u = 0; u < 8; u++) {
                    for (int v = 0; v < 8; v++) {
                        // Inverse brightness quantization, then one term of the inverse DCT sum.
                        base += mDCTTransResult[u][v] * Constant.DCT_BRIGHTNESS_TRANS_TABLE[u * 8 + v] * mDCTBaseMatrix[u][v].getSignalDCTBaseVal(x, y);
                    }
                }
                mDCTReverseTransResult[x][y] = 0.25 * base;
            }
        }
    }
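To tie the forward and inverse steps together, here is a minimal self-contained round trip under stated assumptions. `DctRoundTripSketch` and its methods are hypothetical names, the quantization table is replaced by a single uniform step, and, unlike the demo code above, the quantized values are rounded to integers so that the loss becomes visible.

```java
// Sketch only: hypothetical names; a uniform quantization step stands in for the
// article's table, which is not reproduced here.
class DctRoundTripSketch {
    static double c(int k) {
        return k == 0 ? 1 / Math.sqrt(2) : 1;
    }

    /** Value of the (u, v) basis image at pixel (x, y), including C(u) * C(v). */
    static double basis(int u, int v, int x, int y) {
        return c(u) * c(v)
                * Math.cos((2 * x + 1) * u * Math.PI / 16)
                * Math.cos((2 * y + 1) * v * Math.PI / 16);
    }

    /** Forward DCT followed by quantization: divide by the step and round to an integer. */
    static long[][] compress(double[][] pixels, int step) {
        long[][] levels = new long[8][8];
        for (int u = 0; u < 8; u++) {
            for (int v = 0; v < 8; v++) {
                double sum = 0;
                for (int x = 0; x < 8; x++) {
                    for (int y = 0; y < 8; y++) {
                        sum += pixels[x][y] * basis(u, v, x, y);
                    }
                }
                levels[u][v] = Math.round(0.25 * sum / step);
            }
        }
        return levels;
    }

    /** Inverse quantization followed by the inverse DCT. */
    static double[][] decompress(long[][] levels, int step) {
        double[][] pixels = new double[8][8];
        for (int x = 0; x < 8; x++) {
            for (int y = 0; y < 8; y++) {
                double sum = 0;
                for (int u = 0; u < 8; u++) {
                    for (int v = 0; v < 8; v++) {
                        sum += levels[u][v] * step * basis(u, v, x, y);
                    }
                }
                pixels[x][y] = 0.25 * sum;
            }
        }
        return pixels;
    }

    /** Largest absolute per-pixel difference between two 8x8 blocks. */
    static double maxError(double[][] a, double[][] b) {
        double max = 0;
        for (int x = 0; x < 8; x++) {
            for (int y = 0; y < 8; y++) {
                max = Math.max(max, Math.abs(a[x][y] - b[x][y]));
            }
        }
        return max;
    }

    public static void main(String[] args) {
        double[][] block = new double[8][8];
        for (int x = 0; x < 8; x++) {
            for (int y = 0; y < 8; y++) {
                block[x][y] = 16 * x + 2 * y; // smooth gradient
            }
        }
        for (int step : new int[] {1, 8, 32}) {
            double[][] restored = decompress(compress(block, step), step);
            System.out.printf("step %d: max pixel error %.3f%n", step, maxError(block, restored));
        }
    }
}
```

The only lossy operation in this sketch is the `Math.round` in `compress`. Each coefficient is off by at most half a step after rounding, so a coarser step allows a larger reconstruction error, which is the quantization error described above.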

Screenshot of the whole demo running: