Analysis of the Core Technologies of the H.264 Video Coding Standard

    

Image communication has become an important branch of modern communications technology in recent years, and progress in image compression has been a key part of that development. The ITU recommendation H.261 distilled nearly 40 years of research on image coding and solved a problem that had long hindered visual communication in practice. It covered image coding for the whole range of audiovisual services over narrowband ISDN and gave strong impetus to the international standardization and commercialization of video conferencing, videophones, and similar services. Building on H.261, the ITU then developed H.263, a standard for image compression at very low bit rates, and most recently H.264/AVC. This article first outlines the basic principles of H.261 and H.263, then describes the new techniques introduced by H.264/AVC, and finally summarizes the H.26x family of standards.


1 Basic principles of H.261
Every image compression standard is tailored to its target applications. H.261 was the first video coding standard to be defined, and the first to combine motion-compensated predictive coding with the DCT. Its coded video bit rate ranges from 64 kbps to 1.92 Mbps, which is why it is also known as the p × 64 kbps video codec (p takes values from 1 to 30, since 30 × 64 kbps = 1.92 Mbps). H.261 is used mainly for video conferencing over ISDN and is aimed at circuit-switched networks.
The principle of the H.261 encoder is shown in Figure 1.



The recommendation mainly uses the CIF and QCIF picture formats, which solve the compatibility problem between different television standards. For each macroblock of a coded frame, H.261 applies motion-compensated inter prediction to remove the temporal correlation of the video signal. The prediction error is transformed with the DCT to remove spatial correlation, and the DCT coefficients are then quantized in a way that exploits the characteristics of human vision. Entropy coding follows, to match the code to the statistics of the data. Finally, an output buffer smooths the bitstream so that the output rate stays constant. Hybrid coders of this kind distinguish three frame types: I, P, and B. An I-frame is intra coded. A P-frame is inter coded: it is motion compensated from the preceding I-frame or P-frame, and the prediction error is then coded. A B-frame is a bidirectionally interpolated frame that is reconstructed from the surrounding I-frame and P-frame, or from two P-frames. H.261 itself, however, supports neither bidirectional motion prediction nor GOPs: the reference for each inter-coded frame is the previously coded frame.
The H.261 coded data structure is defined in four layers, from top to bottom: the picture layer, the group-of-blocks (GOB) layer, the macroblock layer, and the block layer. Motion estimation and compensation in H.261 operate on macroblocks. The encoder must first decide whether a macroblock is to be inter or intra coded: if the macroblock is highly correlated with its best match, inter coding is used; otherwise intra coding is used.
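The transform and quantization stage described above can be illustrated with a minimal Python sketch. It assumes an orthonormal 8 × 8 DCT-II and a plain uniform quantizer. The names `dct_2d` and `quantize` and the step size are illustrative choices, not taken from the H.261 recommendation, whose actual quantizer differs in detail.

```python
import math

N = 8  # block size assumed for this sketch


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantization: divide by the step and round to an integer level."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A flat residual block concentrates all of its energy in the DC coefficient.
flat = [[16] * N for _ in range(N)]
levels = quantize(dct_2d(flat), step=8)
print(levels[0][0])
```

For a flat block only the DC level is non-zero, which is why smooth prediction errors compress well after the transform.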

2 Basic principles of H.263
H.263 was developed on the basis of H.261. At low bit rates it delivers higher picture quality without much added complexity. In principle, it needs only about half the bandwidth to achieve the same video quality as H.261. H.263 has since been widely adopted in all kinds of videophone terminals.
The basic coding model and block diagram of H.263 are similar to those of H.261. It likewise uses motion-compensated prediction to reduce temporal redundancy, codes the motion-compensated prediction residual with the discrete cosine transform (DCT), and applies variable-length coding (VLC) to the quantized DCT coefficients and to side information such as motion vectors.
H.263 improves on H.261 in several respects. The default picture format is QCIF; a sub-QCIF format is introduced, and CIF may also be used. All 8 × 8 DCT blocks in a macroblock are quantized with the same quantization step. A macroblock may carry a single motion vector, or one motion vector for each of its sub-blocks, which gives block-level motion compensation and better inter prediction. Both the x and y components of a motion vector support half-pixel accuracy, the motion search window is limited to [-16, +15.5], and motion vectors are transmitted by differential predictive coding. Coding combines two-dimensional prediction with VLC, and, as in MPEG-1, pictures are classified into I-, P-, and B-frames.
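As a rough illustration of differential motion vector coding with half-pixel accuracy, the sketch below stores vectors in half-pixel units (so the window [-16, +15.5] becomes -32 to 31) and codes each vector as its difference from a predictor. The median-of-three-neighbours predictor and the names `median3`, `in_window`, and `mv_difference` are assumptions made for this example; the paragraph above does not say how the predictor is formed.

```python
def median3(a, b, c):
    """Median of three integers."""
    return sorted((a, b, c))[1]


def in_window(mv):
    """True if both components lie in [-16, +15.5] pixels (half-pixel units)."""
    return all(-32 <= component <= 31 for component in mv)


def mv_difference(mv, left, above, above_right):
    """Difference between a vector and the component-wise median predictor.

    All vectors are (x, y) tuples in half-pixel units. The three neighbour
    vectors are an assumed choice of prediction candidates.
    """
    pred_x = median3(left[0], above[0], above_right[0])
    pred_y = median3(left[1], above[1], above_right[1])
    return (mv[0] - pred_x, mv[1] - pred_y)


# (5, -2) in half-pixel units means 2.5 pixels right and 1 pixel up.
print(mv_difference((5, -2), (4, 0), (6, -2), (3, -4)))
```

Neighbouring blocks tend to move together, so the differences are usually small and cheap to code with VLC.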
To ensure good picture quality at very low bit rates, H.263 extends the hybrid coding of H.261 with four optional coding modes: unrestricted motion vector mode, syntax-based arithmetic coding mode, advanced prediction mode, and PB-frames mode. Unrestricted motion vector mode lifts the restriction that reference pixels must lie inside the coded picture area. Advanced prediction mode uses overlapped block motion compensation and also allows motion vectors to point across the picture boundary. In PB-frames mode, a B-frame is reconstructed by bidirectional prediction from the previously decoded P-frame and the P-frame currently being decoded, which raises the frame rate without a significant increase in the number of bits. These three modes all improve inter prediction. Syntax-based arithmetic coding further reduces the transmitted bit rate: in this mode, all variable-length encoding and decoding operations are replaced by arithmetic coding. These advanced modes let an application choose a balance between compression performance and complexity.

3 Core technologies and characteristics of H.264
H.264/AVC is the latest coding standard developed jointly by ITU-T and ISO/IEC. The work was initiated by ITU-T VCEG in 1997 with the goal of a video coding standard that performed better than the H.263 of the time.
Compared with earlier coding standards, H.264 inherits the strengths of H.263 and MPEG-1/2/4. Its overall structure is unchanged, but advanced techniques in each major functional module improve the coding efficiency. Its main features are as follows. Residual transform coding is no longer based on 8 × 8 blocks but on 4 × 4 blocks. The transform is no longer the DCT but an integer transform. A more efficient entropy coder, context-based adaptive binary arithmetic coding (CABAC), is used, and the quantization process differs accordingly. The H.264 algorithms are simple and easy to implement, compute with high precision and without overflow, run fast, use little memory, and reduce blocking artifacts, which makes H.264 a more practical and effective image coding standard.
The following describes the techniques that H.264/AVC adds to the earlier standards. H.264 still uses the hybrid structure that combines prediction with transform coding. The basic structure of the encoder is shown in Figure 2:





The data flow in the encoder can be divided into a forward path and a reconstruction path. An input frame Fn is coded in units of macroblocks of 16 × 16 pixels of the original image. Each macroblock is coded in either intra or inter mode. In either case, a prediction macroblock P is formed from reconstructed data. In intra mode, P is predicted from macroblocks of the current frame that have already been encoded, decoded, and reconstructed, shown as uF'n in the figure. In inter mode, P is obtained by motion compensation from one or more reference frames, shown as F'n-1. The prediction P is subtracted from the current macroblock of Fn to give the residual macroblock Dn, which is transformed and quantized into a set of quantized transform coefficients X. X is then processed in two ways. First, it is reordered and entropy coded; this path contains no feedback and is called the forward path. Second, it is inverse quantized and inverse transformed to give the residual D'n, which is added to the prediction P to form the reconstructed macroblock uF'n. After filtering, this yields the reconstructed reference frame F'n, which is used for motion estimation of the next frame. This is called the reconstruction path.
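The two paths can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the transform is omitted, a plain uniform quantizer stands in for the real one, and the names `quantize`, `encode_macroblock`, and `reconstruct_macroblock` are hypothetical.

```python
def quantize(value, step):
    """Uniform quantizer with rounding to nearest, symmetric around zero."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + step // 2) // step)


def encode_macroblock(current, prediction, step):
    """Forward path: residual Dn = current - P, then quantized levels X.

    The transform stage is left out to keep the sketch short.
    """
    return [[quantize(c - p, step) for c, p in zip(crow, prow)]
            for crow, prow in zip(current, prediction)]


def reconstruct_macroblock(levels, prediction, step):
    """Reconstruction path: X -> D'n -> uF'n = P + D'n, clipped to 8 bits."""
    return [[max(0, min(255, p + x * step)) for x, p in zip(xrow, prow)]
            for xrow, prow in zip(levels, prediction)]


current = [[10, 20], [30, 40]]
prediction = [[8, 8], [32, 32]]
levels = encode_macroblock(current, prediction, step=4)
print(levels)
print(reconstruct_macroblock(levels, prediction, step=4))
```

The encoder reconstructs from the quantized levels, not from the original pixels, so that its reference frames match what a decoder will actually see.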

3.1 Intra prediction coding mode
In video coding, the usual approach is to divide the picture into macroblocks and code each macroblock in one of two modes, intra or inter. In intra mode, earlier standards usually apply the DCT directly to the macroblock and entropy code the transform coefficients. This removes some of the spatial redundancy within the frame, but the DCT exploits only the correlation between pixels inside a block and ignores the correlation between neighbouring blocks. H.264 introduces intra prediction: the macroblock being coded is predicted from adjacent, already processed macroblocks, and the prediction error is transform coded, which removes spatial redundancy. Note that earlier standards predict in the transform domain, whereas H.264 predicts directly in the spatial domain.
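Spatial-domain intra prediction can be sketched as follows. The example predicts a 4 × 4 block from the reconstructed row above it and the column to its left, using three simple modes (vertical, horizontal, DC), and picks the mode with the smallest sum of absolute differences. The function names and the SAD-based mode decision are assumptions for illustration; the standard defines more modes than the three shown here.

```python
def predict_4x4(above, left, mode):
    """Build a 4x4 prediction from the 4 pixels above and the 4 pixels to the left."""
    if mode == "vertical":      # copy the row above downwards
        return [list(above) for _ in range(4)]
    if mode == "horizontal":    # copy the left column across
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # rounded mean of all eight neighbours
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(mode)


def sad(a, b):
    """Sum of absolute differences between two 4x4 blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_intra_mode(block, above, left):
    """Pick the mode whose prediction is closest to the block."""
    modes = ("vertical", "horizontal", "dc")
    return min(modes, key=lambda m: sad(block, predict_4x4(above, left, m)))


above = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [[10, 20, 30, 40] for _ in range(4)]
print(best_intra_mode(block, above, left))
```

Only the difference between the block and the chosen prediction then goes through the transform.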

3.2 Inter prediction coding mode
H.264 adopts many new motion estimation techniques, including variable block sizes, multi-frame motion estimation, sub-pixel accuracy motion estimation, and a deblocking filter.
(1) Deblocking filter
The deblocking filter removes the blocking artifacts that appear in the decoded picture. Blocking arises because macroblocks are quantized separately: at the boundary between adjacent blocks, different quantization steps often turn pixels whose original values were very close into reconstructed values that differ considerably, producing a visible block edge. The deblocking filter is applied to the boundaries of 4 × 4 blocks and smooths them.
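The idea of smoothing a block boundary can be shown with a deliberately simplified one-dimensional sketch. It is not the filter defined in the standard, which is considerably more elaborate; the three-tap smoothing, the threshold test, and the name `smooth_edge` are assumptions made for this example.

```python
def smooth_edge(row, edge, threshold):
    """Smooth the two samples on either side of a block edge in one row.

    `edge` is the index of the first sample of the right-hand block.
    A large step across the edge is treated as real picture content
    and left untouched.
    """
    p1, p0, q0, q1 = row[edge - 2], row[edge - 1], row[edge], row[edge + 1]
    out = list(row)
    if abs(p0 - q0) < threshold:
        out[edge - 1] = (p1 + 2 * p0 + q0 + 2) >> 2
        out[edge] = (p0 + 2 * q0 + q1 + 2) >> 2
    return out


# A small step caused by quantization is softened ...
print(smooth_edge([100, 100, 100, 100, 108, 108, 108, 108], edge=4, threshold=16))
# ... while a genuine edge in the picture is preserved.
print(smooth_edge([100, 100, 100, 100, 200, 200, 200, 200], edge=4, threshold=16))
```

The threshold is what keeps the filter from blurring true edges in the scene.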
(2) Variable block size
Block size affects the quality of motion estimation. Splitting a macroblock into motion-compensated sub-blocks of different sizes is known as tree-structured motion compensation. A macroblock and its sub-macroblocks can each be partitioned in four ways, as shown in Figure 3. Smaller blocks allow more accurate motion estimates, which give smaller motion residuals and a lower bit rate. Given the block sizes available in H.264, a macroblock can carry up to 16 different motion vectors. With multi-frame motion estimation, the different blocks of one macroblock can also be predicted from different reference frames.




Figure 3 Macroblock partitions for motion compensation
Top: macroblock partitions
Bottom: sub-macroblock partitions
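The limit of 16 motion vectors follows from simple counting, sketched below. The partition sizes listed (16 × 16, 16 × 8, 8 × 16, 8 × 8 for macroblocks and 8 × 8, 8 × 4, 4 × 8, 4 × 4 for sub-macroblocks) are the ones commonly given for H.264 and are supplied here as an assumption, since the text above does not spell them out. For brevity, the hypothetical helper `count_motion_vectors` applies the same sub-partition to all four 8 × 8 sub-macroblocks.

```python
MACROBLOCK_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
SUB_MACROBLOCK_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]


def count_motion_vectors(mb_partition, sub_partition=None):
    """Number of motion vectors for one 16x16 macroblock.

    `sub_partition` only applies when the macroblock is split into 8x8
    sub-macroblocks.
    """
    width, height = mb_partition
    count = (16 // width) * (16 // height)
    if mb_partition == (8, 8) and sub_partition is not None:
        sub_width, sub_height = sub_partition
        count *= (8 // sub_width) * (8 // sub_height)
    return count


for partition in MACROBLOCK_PARTITIONS:
    print(partition, count_motion_vectors(partition))
print("finest split:", count_motion_vectors((8, 8), (4, 4)))
```

The finest split gives 4 sub-macroblocks with 4 blocks each, hence 16 vectors.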

(3) Multi-frame motion estimation
Compared with the single-frame motion estimation of earlier video compression standards, the multi-frame motion estimation of H.264 is more efficient and more error resilient. Multi-frame motion estimation means that one or more reference frames are used to estimate a motion vector, which keeps an error in one frame from affecting the frames that follow. However, it needs more memory and has higher computational complexity.
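A minimal sketch of the idea, assuming an exhaustive search over a small window in every reference frame with the sum of absolute differences (SAD) as the matching cost. The names `block_sad` and `best_reference` are hypothetical, and a real encoder would use a faster search and a cost that also accounts for the bits spent.

```python
def block_sad(cur, ref, x, y, dx, dy, size):
    """SAD between the block at (x, y) in `cur` and the displaced block in `ref`."""
    return sum(abs(cur[y + r][x + c] - ref[y + dy + r][x + dx + c])
               for r in range(size) for c in range(size))


def best_reference(cur, refs, x, y, size, search):
    """Return (reference index, (dx, dy), cost) of the best match over all frames."""
    height, width = len(cur), len(cur[0])
    best = None
    for index, ref in enumerate(refs):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                inside = (0 <= x + dx <= width - size
                          and 0 <= y + dy <= height - size)
                if not inside:
                    continue
                cost = block_sad(cur, ref, x, y, dx, dy, size)
                if best is None or cost < best[2]:
                    best = (index, (dx, dy), cost)
    return best


ref0 = [[0] * 6 for _ in range(6)]                       # unrelated frame
ref1 = [[r * 6 + c for c in range(6)] for r in range(6)]
cur = [[ref1[r][min(c + 1, 5)] for c in range(6)] for r in range(6)]
print(best_reference(cur, [ref0, ref1], x=2, y=2, size=2, search=1))
```

The memory and complexity cost mentioned above is visible directly: every extra reference frame must be stored and adds one more full pass of the search loop.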

(4) Sub-pixel accuracy motion estimation
H.264 raises the motion estimation accuracy from the half pixel of H.263 to a quarter pixel, with 1/8 pixel as an option. As with half-pixel motion estimation, quarter-pixel motion estimation obtains the sample values at fractional pixel positions by interpolation.
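The interpolation step can be sketched in one dimension. The example assumes the six-tap filter with weights (1, -5, 20, 20, -5, 1)/32 that is commonly described for H.264 luma half-sample positions, and forms a quarter sample as the rounded average of a full sample and the neighbouring half sample. The function names are illustrative, and the two-dimensional cases are left out.

```python
def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def half_sample(row, i):
    """Half-sample value between row[i] and row[i + 1].

    Needs two samples to the left of i and three to the right.
    """
    e, f, g, h, k, m = row[i - 2:i + 4]
    return clip8((e - 5 * f + 20 * g + 20 * h - 5 * k + m + 16) >> 5)


def quarter_sample(row, i):
    """Quarter-sample value between row[i] and the half sample to its right."""
    return (row[i] + half_sample(row, i) + 1) >> 1


row = [0, 0, 0, 100, 100, 100]
print(half_sample(row, 2), quarter_sample(row, 2))
```

A motion vector in quarter-pixel units then simply selects one of these interpolated positions as the prediction.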
For inter prediction, H.264 can still use the three-step search algorithm to find the block that best matches the current macroblock. In block matching, the displacement of a block is the same as the displacement of its center or of any other point in it, so it can be understood as the displacement of the block center. In the three-step algorithm the search range is 7: taking the position of the current sub-block in the previous frame as the origin, a candidate block is moved according to fixed rules within 7 pixels up, down, left, and right, and at each position a sub-block of the same size is taken out and matched against the current sub-block. The three steps are as follows:
① With the current sub-block as the center and a step size of 4, match the sub-blocks centered on the 9 positions marked in Figure 4 against the current sub-block, and find the center of the best-matching sub-block.
② With the best sub-block center from ① as the center (for example, x = 4, y = 0) and a step size of 2, match the sub-blocks centered on the 9 positions in the figure against the current sub-block, and find the center of the best-matching sub-block.
③ With the best sub-block center from ② as the center and a step size of 1, match the sub-blocks centered on the 9 positions in the figure against the current sub-block, and find the center of the best-matching sub-block. The offset of this center from the center of the current sub-block is the estimated displacement.
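The three steps above can be sketched as follows. The sketch assumes frames stored as lists of rows, the sum of absolute differences (SAD) as the matching criterion, and candidates outside the frame being skipped; these choices and the function names are assumptions, since the text does not fix them.

```python
def sad(cur, ref, x, y, dx, dy, size):
    """SAD between the block at (x, y) in `cur` and the displaced block in `ref`."""
    return sum(abs(cur[y + r][x + c] - ref[y + dy + r][x + dx + c])
               for r in range(size) for c in range(size))


def three_step_search(cur, ref, x, y, size=4):
    """Estimate the displacement (dx, dy) of one block within a range of 7."""
    height, width = len(ref), len(ref[0])
    best_dx, best_dy = 0, 0
    best_cost = sad(cur, ref, x, y, 0, 0, size)
    for step in (4, 2, 1):                      # steps 1, 2 and 3
        center_dx, center_dy = best_dx, best_dy
        for oy in (-step, 0, step):             # 9 positions around the center
            for ox in (-step, 0, step):
                dx, dy = center_dx + ox, center_dy + oy
                inside = (0 <= x + dx <= width - size
                          and 0 <= y + dy <= height - size)
                if not inside:
                    continue
                cost = sad(cur, ref, x, y, dx, dy, size)
                if cost < best_cost:
                    best_cost, best_dx, best_dy = cost, dx, dy
    return best_dx, best_dy


# Reference frame with unique sample values; the current frame shows the
# same content displaced by 6 pixels horizontally and -4 vertically.
ref = [[100 * r + c for c in range(24)] for r in range(24)]
cur = [[100 * (r - 4) + (c + 6) for c in range(24)] for r in range(24)]
print(three_step_search(cur, ref, x=8, y=8))
```

The search evaluates at most 25 distinct positions instead of the 225 of an exhaustive search over the same range, at the price of possibly settling on a local minimum.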





3.3 Integer DCT

H.264 uses a 4 × 4 integer DCT as the basic transform for residual macroblocks. The transform operates on the 4 × 4 blocks of residual data left after motion-compensated prediction or intra prediction. It is based on the DCT but differs from it.
Because the DCT is a real-valued transform, its coefficients must be rounded for quantization, which affects the precision of the computation. In addition, the traditional DCT suffers from mismatch between encoder and decoder, which causes drift in the reference frames and directly degrades the quality of the reconstructed picture.
The integer DCT of H.264 uses integer arithmetic for all operations, and the core of the transform consists mainly of additions and shifts. The whole transform and quantization process needs only 16-bit integer arithmetic and one multiplication. As long as the inverse transform specified by H.264 is used correctly, no mismatch occurs between encoder and decoder. The forward and inverse transform matrices are, respectively:

[Equations: forward and inverse 4 × 4 transform matrices]

The elements of these matrices are essentially integers, and the factor 1/2 can be implemented as a shift. Because the multiplications in the transform can be replaced by shift operations, the complexity is reduced and the precision problem is solved as well.
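The exactness of an integer transform can be checked with a short sketch. The matrix `CF` is the 4 × 4 forward core matrix commonly published for H.264; it is supplied here as an assumption, because the matrices from the original figure are not available in this text. The inverse below uses exact rational scaling for clarity, whereas the standard folds the scaling into quantization; the function names are hypothetical.

```python
from fractions import Fraction

CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]
ROW_NORMS = [4, 10, 4, 10]   # CF times its transpose is diag(4, 10, 4, 10)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_core(block):
    """Y = CF * X * CF^T, using integer additions and multiplications by 2 only."""
    return matmul(matmul(CF, block), transpose(CF))


def inverse_exact(coeffs):
    """Undo forward_core exactly by dividing out the row norms."""
    scaled = [[Fraction(coeffs[i][j], ROW_NORMS[i] * ROW_NORMS[j])
               for j in range(4)] for i in range(4)]
    out = matmul(matmul(transpose(CF), scaled), CF)
    return [[int(value) for value in row] for row in out]


block = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
coeffs = forward_core(block)
print(coeffs[0][0])                       # sum of all samples
print(inverse_exact(coeffs) == block)     # no rounding error, hence no drift
```

Because every multiplication in `forward_core` is by 1 or 2, an implementation can replace them with additions and one-bit shifts.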
An H.264 macroblock is 16 × 16 in size. Applying the 4 × 4 DCT described above to each of its 4 × 4 blocks yields sixteen 4 × 4 coefficient matrices. To improve compression efficiency further, the recommendation also allows the DC component of each 4 × 4 coefficient matrix to be taken out and collected into a new 4 × 4 matrix, to which a Hadamard transform is applied. The transmission order of the macroblock data is shown in Figure 5.
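The second-stage transform can be sketched as below, assuming the sixteen coefficient blocks are given in raster order and using an unnormalized 4 × 4 Hadamard matrix. Any scaling applied by the standard is left out, and the names `collect_dc` and `hadamard_dc` are illustrative.

```python
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def collect_dc(coefficient_blocks):
    """Arrange the DC terms of sixteen 4x4 blocks (raster order) as a 4x4 matrix."""
    return [[coefficient_blocks[4 * r + c][0][0] for c in range(4)]
            for r in range(4)]


def hadamard_dc(dc):
    """Unnormalized 2-D Hadamard transform; H4 is symmetric, so H4 * DC * H4."""
    return matmul(matmul(H4, dc), H4)


# Sixteen blocks that all have the same DC value, as in a smooth area.
blocks = [[[8, 0, 0, 0]] + [[0] * 4 for _ in range(3)] for _ in range(16)]
print(hadamard_dc(collect_dc(blocks)))
```

In a smooth area the sixteen DC values are nearly equal, so the Hadamard transform packs them into a single significant coefficient.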








3.4 Entropy coding
H.264 adopts two entropy coding schemes: context-based adaptive binary arithmetic coding (CABAC) and variable-length coding (VLC). The VLC scheme includes context-based adaptive variable-length coding (CAVLC).
Because CABAC uses arithmetic coding, a symbol can be represented with less than 1 bit. Test data collected under error-free conditions show that CABAC outperforms CAVLC at all bit rates. CAVLC, however, is more robust to errors than CABAC, and its computational complexity is far lower. For this reason the H.264 Baseline Profile specifies CAVLC, while CABAC is used for entropy coding in the Main Profile.
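Why arithmetic coding can spend less than 1 bit per symbol follows from the entropy of a skewed binary source, which an ideal arithmetic coder approaches. The sketch below only computes that bound; it is not a CABAC implementation, and the probability 0.95 is an arbitrary example value.

```python
import math


def binary_entropy(p):
    """Average information, in bits per symbol, of a binary source with P(0) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# A variable-length code needs at least 1 bit for every binary symbol,
# whereas an arithmetic coder can approach the entropy.
for p in (0.5, 0.8, 0.95):
    print(p, round(binary_entropy(p), 3))
```

The more predictable a binary decision is in its context, the further below 1 bit its cost falls, which is the gain that context modelling aims for.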

4 Summary
Compared with earlier video coding standards, H.264 brings marked improvements in its structure, motion estimation and compensation, transform and quantization, entropy coding, and other areas, giving higher coding efficiency and better network adaptability. At the same picture quality, H.264/AVC saves about 50 percent of the bit rate compared with earlier standards such as H.263 or MPEG-4. Through its different profiles, H.264 can serve real-time communication as well as applications with less demanding delay requirements. In addition, the recommendation adds a network abstraction layer (NAL) that adapts the encoder's output bitstream to various types of network, which gives better support for network transmission. It also has strong resilience to bit errors and can cope with video transmission over wireless channels with high packet loss and severe interference. H.264 therefore supports layered coding and transmission under different network resources, delivers stable picture quality, and adapts well to video transmission over different networks.
On today's Internet, demand for multimedia services is growing rapidly. Because wireless bandwidth and transmission capacity are limited, and because most end users currently pay for wireless data services by traffic volume, higher compression efficiency is a main goal for wireless video and multimedia applications. H.264/AVC has therefore become the most competitive candidate standard for the Multimedia Messaging Service (MMS), packet-switched streaming services (PSS), and conversational services. At the same time, H.264/AVC is a public, open standard that is not owned by any single vendor. This strengthens competition among manufacturers toward low-cost production, lets prices fall quickly, and makes the technology available to more people.

