Overview of Video Compression Codec Standards

As Internet bandwidth continues to grow, technologies for transmitting video over the Internet have become a focus of research and development. Many experimental high-speed broadband networks now treat video transmission technology and its applications as a research priority. Transmitting video over the Internet is difficult, fundamentally because the Internet's connectionless forwarding mechanism was designed mainly for bursty data and is poorly suited to continuous media streams. Delivering video streams over the Internet efficiently and with high quality requires a combination of technologies, and digital video compression coding is one of the key ones. Beyond that, the transmission, processing and application of multimedia raise many further questions: How should video be transmitted over the network? How can mobile phones access the Internet and receive video and images? How can multimedia data be retrieved quickly and efficiently? How can access to multimedia information be unified?

At present, the most important codec standards for video streaming are the ITU's H.261 and H.263, M-JPEG (based on the JPEG standard of the Joint Photographic Experts Group), and the MPEG series of standards from the ISO Moving Picture Experts Group. Also widely used on the Internet are RealNetworks' RealVideo, Microsoft's WMT (Windows Media Technologies), and Apple's QuickTime. Details follow.

I. ITU H.261, H.263 standard


H.261 is also known as p×64, where p is a variable parameter from 1 to 30, giving channel rates in multiples of 64 kb/s. It was originally designed for teleconferencing applications over ISDN, especially face-to-face video telephony and video conferencing. The actual encoding algorithm is similar to the MPEG algorithm but is not compatible with it. H.261 requires much less computation than MPEG for real-time encoding. To make the best use of bandwidth, the algorithm trades image quality against the amount of motion: images with vigorous motion have lower quality than relatively still images. The method is therefore constant bit rate, variable quality coding rather than constant quality, variable bit rate coding.
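The p×64 naming can be made concrete with a little arithmetic. The following is a minimal sketch; the function name `h261_rate_kbps` is made up for illustration and is not part of any standard or library.

```python
# Channel rates available to an H.261 codec: p x 64 kbit/s, with p = 1..30.
BASE_KBPS = 64


def h261_rate_kbps(p: int) -> int:
    """Return the channel rate in kbit/s for multiplier p (1 to 30)."""
    if not 1 <= p <= 30:
        raise ValueError("p must be between 1 and 30")
    return p * BASE_KBPS


# All thirty rates, from a single 64 kbit/s channel up to 1920 kbit/s.
rates = [h261_rate_kbps(p) for p in range(1, 31)]
print(rates[0], rates[-1])
```

The lowest rate corresponds to a single 64 kb/s channel and the highest to thirty of them.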


H.263 is an ITU-T standard designed for low bit rate communication. In practice it can be used over a wide range of bit rates, not only for low bit rate applications, and it can replace H.261 in many applications. The H.263 coding algorithm is the same in outline as that of H.261, with improvements and changes that raise performance and error resilience. H.263 provides better image quality than H.261 at low bit rates. The differences between the two are: (1) H.263 motion compensation uses half-pixel precision, while H.261 uses full-pixel precision and loop filtering; (2) some parts of the data stream hierarchy are optional in H.263, so a codec can be configured for a lower data rate or for better error resilience; (3) H.263 contains four negotiable options to improve performance; (4) these options include unrestricted motion vectors and syntax-based arithmetic coding; (5) they also include advanced prediction and PB-frames, a frame prediction method similar to the P and B frames in MPEG; (6) H.263 supports five resolutions: in addition to the QCIF and CIF formats supported by H.261, it supports SQCIF, 4CIF and 16CIF. SQCIF has about half the resolution of QCIF, while 4CIF and 16CIF have 4 and 16 times the resolution of CIF respectively.
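The resolution relationships in point (6) can be checked from the luminance dimensions of the five picture formats. This is a small sketch; the dictionary and helper names are chosen here for illustration only.

```python
# Luminance picture sizes (width, height) of the five H.263 source formats.
FORMATS = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}


def pixels(name: str) -> int:
    """Number of luminance samples in one picture of the given format."""
    width, height = FORMATS[name]
    return width * height


def ratio(a: str, b: str) -> float:
    """How many times more luminance samples format a has than format b."""
    return pixels(a) / pixels(b)


for name in FORMATS:
    print(name, FORMATS[name], round(ratio(name, "CIF"), 3))
```

The ratio of SQCIF to QCIF comes out slightly under one half, which is why the text says "about half".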

H.263+, introduced by the ITU-T in 1998, is the second version of the H.263 recommendation. It provides 12 new negotiable modes and other features that further improve compression performance. For example, H.263 has only five video source formats, whereas H.263+ allows more source formats and offers multiple picture clock frequencies, which broadens its range of application. Another important improvement is scalability, which allows multiple display rates, bit rates and resolutions and so improves video delivery in heterogeneous networks that are error-prone and subject to packet loss. In addition, H.263+ improves the unrestricted motion vector mode of H.263. Together with the 12 new optional modes, this improves coding performance and makes the standard more flexible to apply. H.263 has basically replaced H.261.

II. M-JPEG

M-JPEG (Motion JPEG, after the Joint Photographic Experts Group) is a compression technique that treats a moving video sequence as a series of still images and compresses it frame by frame. It is widely used in nonlinear editing, where it allows frame-accurate editing and multi-layer image processing. Because each frame is compressed completely and independently, any frame can be accessed at random during editing and edits can be made on exact frame boundaries. In addition, M-JPEG compression and decompression are symmetrical and can be implemented by the same hardware and software. However, M-JPEG removes only the spatial redundancy within a frame and not the temporal redundancy between frames, so its compression efficiency is not high. With M-JPEG at a compression ratio of 7:1, programs with image quality equivalent to Betacam SP can be delivered.

The JPEG standard is based on the DCT (Discrete Cosine Transform) and variable length coding. The key technologies of JPEG include transform coding, quantization, differential coding, Huffman coding, and run-length coding. Motion compensation is not among them, since each image is coded on its own.
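To give a feel for how transform coding, quantization and run-length coding fit together, here is a minimal sketch in plain Python. It is not a JPEG implementation: the single quantization step of 16 is an assumption made for illustration (JPEG uses a table of step sizes), and level shifting, differential coding of the DC value and Huffman coding are left out.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def dct_2d(block):
    """Forward 8x8 DCT-II of a block given as a list of 8 rows."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, step=16):
    """Uniform quantization; one step size for all coefficients (assumption)."""
    return [[int(round(v / step)) for v in row] for row in coeffs]


def zigzag(block):
    """Read an 8x8 block in zigzag order, low frequencies first."""
    order = sorted(
        ((r, c) for r in range(N) for c in range(N)),
        key=lambda p: (p[0] + p[1], p[1] if (p[0] + p[1]) % 2 == 0 else p[0]),
    )
    return [block[r][c] for r, c in order]


def run_length(seq):
    """Encode as (zero_run, value) pairs, closed by an 'EOB' marker."""
    pairs, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by end-of-block
    return pairs


# A perfectly flat block has only a DC coefficient after the transform.
flat = [[80] * N for _ in range(N)]
print(run_length(zigzag(quantize(dct_2d(flat)))))
```

For a flat block, all 64 samples collapse into a single nonzero coefficient followed by the end-of-block marker, which is the spatial redundancy removal the text refers to.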

The advantage of M-JPEG is that frame-accurate editing is easy and the equipment is mature. The disadvantage is that its compression efficiency is not high.

In addition, M-JPEG is not a fully unified compression standard: codecs and storage formats differ between manufacturers. Each model of video server or codec board has its own version of M-JPEG, so exchanging data between servers, or between a nonlinear production network and a server, is practically impossible.

III. MPEG series standards

MPEG is the abbreviation of Moving Picture Experts Group. Established in 1988, it is a group of experts that develops compression standards for digital video and audio. It currently has more than 300 members, including world-renowned organizations such as IBM, SUN, BBC, NEC, INTEL and AT&T. The group's initial mandate was to develop standards for the coding of "moving pictures"; this was later extended to cover "moving pictures and associated audio" and their combined coding. Later still, to serve different application needs, the restriction "for digital storage media" was lifted, and MPEG became what it is today: the organization that develops standards for "the coding of moving pictures and audio". The standards developed by MPEG have different goals and applications; so far MPEG-1, MPEG-2, MPEG-4, MPEG-7 and MPEG-21 have been proposed.

1.MPEG-1 standard

The MPEG-1 standard was published in August 1993 for coding moving pictures and associated audio for digital storage media at data rates of about 1.5 Mbps. The standard consists of five parts:

The first part describes how to combine video and audio coded according to the second part (video) and the third part (audio). The fourth part describes the procedure for verifying that the output bit stream of an encoder, or a decoder, conforms to the first three parts. The fifth part is an encoder and decoder implemented entirely in the C language.

Since its publication, MPEG-1 has had a series of successes, such as the widespread use of VCD and MP3, the MPEG-1 software decoder included in Windows 95 and later versions, portable MPEG-1 cameras, and so on.

2.MPEG-2 standard

MPEG introduced the MPEG-2 compression standard in 1994 to make interoperability possible between video/audio services and applications. MPEG-2 is a detailed specification of the compression schemes and system layer for standard digital television and high-definition television in a variety of applications, with bit rates from 3 Mbps to 100 Mbps. The formal specification is ISO/IEC 13818. MPEG-2 is not a simple upgrade of MPEG-1; it specifies the system and transport aspects in more detail and refines them further. MPEG-2 is particularly suitable for coding and transmitting broadcast-grade digital television and is recognized as the coding standard for SDTV and HDTV.

MPEG-2 image compression exploits two characteristics of images: spatial correlation and temporal correlation. Both produce a large amount of redundant information in the image. If this redundancy is removed and only the small amount of uncorrelated information is kept for transmission, a great deal of transmission bandwidth can be saved. The receiver applies the decoding algorithm to this uncorrelated information and restores the original image while guaranteeing a certain image quality. A good compression coding scheme is one that removes as much of the redundant information in the image as possible.
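Temporal correlation can be illustrated with plain frame differencing. The sketch below is a simplification made for illustration: real MPEG-2 encoders use motion-compensated prediction on macroblocks rather than a direct difference, and the tiny 4x4 "frames" are invented data.

```python
def residual(prev, curr):
    """Per-pixel difference between the current and the previous frame."""
    return [[c - p for p, c in zip(prow, crow)] for prow, crow in zip(prev, curr)]


def reconstruct(prev, res):
    """Receiver side: add the residual back onto the previous frame."""
    return [[p + r for p, r in zip(prow, rrow)] for prow, rrow in zip(prev, res)]


def nonzero(frame):
    """Count the values that would actually need to be transmitted."""
    return sum(1 for row in frame for v in row if v != 0)


prev = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
curr = [row[:] for row in prev]
curr[1][1] = 52  # only one pixel changes between the two frames

res = residual(prev, curr)
print(nonzero(curr), nonzero(res))
```

Because consecutive frames are nearly identical, the residual is almost entirely zero, and the receiver can still rebuild the current frame exactly from the previous frame plus the residual.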

The encoded images of MPEG-2 are classified into three types, which are called I frames, P frames, and B frames.

I frames use intra-frame coding: only the spatial correlation within a single frame is exploited, not the temporal correlation. P frames and B frames use inter-frame coding, which exploits both spatial and temporal correlation. P frames use only forward temporal prediction, which improves compression efficiency and image quality. A P frame may contain intra-coded parts: each macroblock in a P frame can be either forward predicted or intra coded. B frames use bidirectional temporal prediction, which greatly increases the compression ratio.
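One practical consequence of bidirectional prediction is that frames are not coded in the order in which they are displayed: a B frame can only be decoded once the anchor frames (I or P) on both sides of it are available. The following is a minimal sketch of that reordering; the function name and the example frame pattern are chosen here for illustration.

```python
def coding_order(display_types):
    """Reorder frame indices from display order to coding order.

    Each anchor frame (I or P) is emitted before the B frames that
    precede it in display order, because those B frames depend on it.
    """
    order, pending_b = [], []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)
        else:
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B frames with no later anchor
    return order


display = "IBBPBBP"
order = coding_order(display)
print(order)
print("".join(display[i] for i in order))
```

For the display sequence I B B P B B P, the anchors move ahead of the B frames that reference them, so the coded sequence reads I P B B P B B.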

To represent the encoded data, MPEG-2 uses a syntax that specifies a hierarchical structure. The encoded stream is divided into six layers, from top to bottom: sequence, group of pictures (GOP), picture, slice, macroblock, and block.
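The lower layers of this hierarchy can be counted for a concrete picture. The sketch below assumes a 720x576 picture, 4:2:0 chroma sampling (four luminance and two chrominance 8x8 blocks per 16x16 macroblock), and one slice per macroblock row; these are illustrative assumptions, not requirements stated in the text.

```python
MACROBLOCK = 16  # luminance samples per macroblock side
BLOCK = 8        # samples per block side


def layer_counts(width, height, chroma_blocks=2):
    """Count slices, macroblocks and blocks in one picture.

    chroma_blocks=2 corresponds to 4:2:0 sampling (assumption).
    """
    mb_cols = width // MACROBLOCK
    mb_rows = height // MACROBLOCK
    macroblocks = mb_cols * mb_rows
    luma_blocks = (MACROBLOCK // BLOCK) ** 2  # four 8x8 blocks per macroblock
    return {
        "slices": mb_rows,  # assumption: one slice per macroblock row
        "macroblocks": macroblocks,
        "blocks": macroblocks * (luma_blocks + chroma_blocks),
    }


print(layer_counts(720, 576))
```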

The main applications of the MPEG-2 standard in the field of broadcast television are as follows:

(1) Preservation of video and audio data

Television programs, audiovisual materials and the like have traditionally been stored on tape. This approach has many drawbacks: tapes are easily damaged, take up a lot of space, are costly, and are difficult to reuse. More importantly, they are hard to preserve over the long term, hard to search, and hard to share. With advances in computer technology and video compression, high-speed broadband computer networks and large-capacity data storage systems have made it possible to store, query, share and exchange TV programs.

DVD video discs using MPEG-2 compression bring new hope for data storage. TV programs, audio and video materials and the like can be encoded by an MPEG-2 encoding system and saved to low-cost CD-R discs or high-capacity rewritable DVD-RAM discs. DVD authoring software (such as Daikin Scenarist NT or Spruce DVDMaestro) can also be used to produce standard DVD discs, which saves both money and storage space.

(2) Nonlinear editing system for TV programs and its network

In a nonlinear editing system, program material is stored, produced, and broadcast in digitally compressed form, so video compression is the technical basis of the system. At present there are two main digital compression formats: M-JPEG and MPEG-2.

M-JPEG compresses motion video as a series of still images (frame by frame), which allows frame-accurate editing, but its compression efficiency is not high.

MPEG-2 uses inter-frame compression: only I frames need full intra-frame compression, while B frames and P frames are obtained by prediction. Most of the data that is transmitted and processed is therefore derived from the temporal correlation between frames, the amount of data is relatively small, and a higher compression ratio can be achieved. Once the problem of frame-accurate editing is solved, MPEG-2 will be widely used in nonlinear editing systems and editing costs will fall considerably. At the same time, MPEG-2 decompression is standardized: data compressed by one manufacturer's compression equipment can be decompressed by a decompressor designed by another manufacturer, which ensures that equipment from different manufacturers is fully compatible.

Because MPEG-2 IBP video compression reduces the amount of data, it lowers storage cost, raises data transmission speed, and eases the load on the computer bus and network bandwidth. This makes a nonlinear editing network built purely on Ethernet feasible; Ethernet is currently the most mature network technology, its system management is relatively complete, and its price is relatively low.

The nonlinear editing system based on MPEG-2 and the nonlinear editing network will become the future development direction.

(3) Satellite transmission

MPEG-2 has been approved by ISO and is widely used in the broadcast field, for example in digital satellite video broadcasting (DVB-S), DVD video discs and video conferencing. There are currently tens of millions of DVB-S users around the world. DVB-S signals are encoded in MPEG-2 compressed format, transmitted by satellite or microwave, and decoded at the user end by an MPEG-2 satellite receiver decoder for viewing. In addition, MPEG-2 compression coding can be used to transmit and exchange TV news or programs between remote sites.

(4) Broadcasting of TV programs

Within television technology as a whole, broadcasting is the link that connects what comes before with what follows. To digitize the broadcast system, the most critical step is to build a hard disk broadcast system. MPEG-2 hard disk automatic broadcast systems are popular because they are simple to program, have large storage capacity, and achieve high video specifications. In the past, however, MPEG-2 broadcast equipment was very expensive and was used only in small numbers. As MPEG-2 technology develops and the cost of related products falls, MPEG-2 hard disk automatic broadcast systems are expected to become widespread.

3.MPEG-4 standard

The Moving Picture Experts Group MPEG officially announced the first version of the MPEG-4 (ISO/IEC 14496) standard in February 1999. At the end of the same year, the second edition of MPEG-4 was also finalized and officially became an international standard in early 2000.

MPEG-4 is very different from MPEG-1 and MPEG-2. It is not just a specific compression algorithm; it is an international standard for the integration and compression technologies needed by digital TV, interactive graphics applications (video and audio content), and interactive multimedia (WWW, data acquisition and distribution). MPEG-4 brings a wide range of multimedia applications into one complete framework, with the aim of providing standard algorithms and tools for multimedia communication and application environments and thereby establishing a uniform data format that can be used across multimedia transmission, storage, retrieval and other applications.

The most significant difference between MPEG-4 and the earlier standards is its object-based coding concept. When encoding, a scene is divided into several video and audio objects that are related in time and space. The objects are encoded separately, then multiplexed and transmitted to the receiving end, where they are decoded separately and combined into the desired video and audio. This makes it convenient to use different coding and representation methods for different objects, helps to merge different data types, and makes it easy to manipulate and edit individual objects. For example, a cartoon character can be placed in a real scene, or a real person in a virtual studio, and users on the Internet can interact easily, selectively combining video, audio, graphic and text objects according to their needs.
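The idea of composing a scene from separately handled objects can be sketched in a few lines. This is a conceptual illustration only: the `VideoObject` class and `compose` function are hypothetical and do not reflect MPEG-4 syntax or its scene description format, and the "pixels" are small invented arrays.

```python
from dataclasses import dataclass


@dataclass
class VideoObject:
    """A decoded object: a pixel patch plus its position in the scene."""
    name: str
    x: int
    y: int
    pixels: list  # rows of pixel values; None marks a transparent pixel


def compose(width, height, objects, background=0):
    """Receiver side: paint the decoded objects onto one canvas, in order."""
    canvas = [[background] * width for _ in range(height)]
    for obj in objects:
        for dy, row in enumerate(obj.pixels):
            for dx, value in enumerate(row):
                if value is None:
                    continue  # transparent: keep what is underneath
                px, py = obj.x + dx, obj.y + dy
                if 0 <= px < width and 0 <= py < height:
                    canvas[py][px] = value
    return canvas


studio = VideoObject("virtual_studio", 0, 0, [[1] * 4 for _ in range(3)])
person = VideoObject("person", 1, 1, [[9, None], [9, 9]])

# The receiver chooses which objects to combine.
for row in compose(4, 3, [studio, person]):
    print(row)
```

Because each object arrives separately, the receiver can leave one out, move it, or swap the background without touching the other objects.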

The general framework of an MPEG-4 system covers: the representation of natural or synthetic audiovisual content; the management of audiovisual content data streams, such as multiplexing, synchronization and buffer management; and support for flexibility and for the configuration of different parts of the system.

Compared with MPEG-1 and MPEG-2, MPEG-4 has the following unique advantages:

(1) Content-based interactivity

MPEG-4 provides content-based tools for accessing multimedia data, such as indexing, hyperlinking, downloading, and deleting. With these tools, users can easily obtain the object-related content they need from a multimedia database. MPEG-4 also provides content manipulation and bitstream editing functions, which can be used for interactive home shopping, digital fade effects, and so on. MPEG-4 provides efficient methods for coding natural or synthetic multimedia data, and it can combine natural scenes or objects with synthetic ones into composite multimedia data.

(2) Efficient compressibility

MPEG-4 is built on higher coding efficiency. Compared with other existing or emerging standards, it offers higher visual and audio quality at the same bit rate, which makes it possible to transmit video and audio over low-bandwidth channels. MPEG-4 can also encode simultaneous data streams: the multi-view or multi-channel data streams of a scene can be combined efficiently and synchronously into a single final data stream. This can be used for virtual 3D games, 3D movies, flight simulation training, and so on.

(3) Universal accessibility

MPEG-4 provides robustness in error-prone environments, so that it can be used over many wireless and wired networks and storage media. In addition, MPEG-4 supports content-based scalability: content, quality and complexity are divided into many small pieces to meet the different needs of different users, supporting transmission channels and receivers with different bandwidths and different storage capacities.
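Scalability of this kind is often pictured as a base layer plus enhancement layers, with each receiver taking as many layers as its channel allows. The sketch below illustrates that selection; the layer names and bit rates are invented for the example and are not taken from the standard.

```python
# Hypothetical layered stream: (layer name, bit rate in kbit/s).
# Each enhancement layer is only useful if all lower layers are received.
LAYERS = [("base", 64), ("enhancement_1", 128), ("enhancement_2", 256)]


def select_layers(layers, bandwidth_kbps):
    """Pick the base layer and as many enhancement layers as fit, in order."""
    chosen, used = [], 0
    for name, rate in layers:
        if used + rate > bandwidth_kbps:
            break  # higher layers depend on this one, so stop here
        chosen.append(name)
        used += rate
    return chosen


print(select_layers(LAYERS, 200))   # a narrowband receiver
print(select_layers(LAYERS, 1000))  # a broadband receiver
```

The same encoded content serves both receivers; only the number of layers delivered differs.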

These features will undoubtedly accelerate the development of multimedia applications. Application areas that stand to benefit include: Internet multimedia applications; broadcast television; interactive video games; real-time visual communication; interactive storage media applications; studio technology and television post-production; virtual conferences using animation technology; multimedia mail; multimedia applications over mobile communication; remote video surveillance; remote database services over ATM networks; and so on. The main applications of MPEG-4 are as follows:

(1) Applied to Internet video and audio broadcasting

As the number of Internet users grows, the audience for traditional TV broadcasting is gradually shrinking and advertising revenue is falling with it. Today's conventional TV broadcasting will therefore eventually move to Internet broadcasting based on TCP/IP, and viewing habits will change from simply selecting a channel with a remote control to online video on demand. Video on demand does not mean downloading a program to the hard disk first and then playing it; it means streaming video, which starts when the viewer clicks and plays while it is being transmitted.

Today, the main formats for audio and video broadcasting on the Internet are RealNetworks' Real Media, Microsoft's Windows Media, and Apple's QuickTime. The video and audio formats they define are mutually incompatible, which could lead to unmanageable confusion among media streams. MPEG-4 provides a set of standard tools for Internet video applications that yields consistent video and audio streams. Using MPEG-4 for audio and video playback over the Internet is therefore a safe choice.

(2) Applied to wireless communication

MPEG-4's efficient bit rate compression, interactivity and scalability make it especially suitable for multimedia communication over narrowband mobile networks. Future mobile phones will become multimedia mobile receivers that can not only make mobile video calls and access the Internet, but also receive mobile multimedia broadcasts and TV.

(3) Applied to still image compression

Still images (pictures) are widely used on the Internet, where JPEG is commonly used for image compression. Still image (texture) compression in MPEG-4 is based on the wavelet transform; at the same quality, the compressed file is about one tenth the size of the JPEG file. Converting the JPEG images used on the Internet to MPEG-4 format could greatly speed up the transfer of pictures over the network.

(4) Applied to videophone

Conventional compression coding standards for narrowband video telephony, such as H.261, reduce the bit rate through intra-frame compression, inter-frame compression, reducing the number of pixels, and dropping frames, but their coding efficiency and image quality are unsatisfactory. MPEG-4 compression coding can transmit audio and video of acceptable quality at very low bit rates, which makes video telephony possible over the narrowband public telephone network.

(5) Applied to computer graphics, animation and simulation

MPEG-4's distinctive coding and powerful interactive capabilities allow MPEG-4-based computer graphics and animation to draw material from multimedia databases of various origins and combine it in real time. Future computer graphics can therefore develop freely in any desired direction within the limits of the MPEG-4 syntax, producing animation and simulation effects that are unimaginable today.

(6) Applied to video games

MPEG-4 can mix natural images and sounds with artificially synthesized ones. It offers unprecedented flexibility in coding and can retrieve material from multimedia databases of various origins in real time. In the future this will make possible movie-like video games with an extremely high degree of interactive freedom.

4.MPEG-7 standard

The MPEG-7 standard, known as the "Multimedia Content Description Interface", provides a standardized description of all types of multimedia information. The description is associated with the content itself and allows users to find the data they are interested in quickly and efficiently. MPEG-7 extends the limited capabilities of existing content identification solutions, in particular by covering more data types. In other words, MPEG-7 specifies a standard set of descriptors for describing various types of multimedia information. The standard was proposed in October 1998.

One goal of MPEG-7 is to support a wide variety of audio and visual descriptions, including free text, N-dimensional spatio-temporal structures, statistical information, objective attributes, subjective attributes, production attributes, and composition information. For visual information, descriptions will include color, visual objects, texture, sketches, shape, volume, spatial relations, motion and deformation.

Another goal of MPEG-7 is to describe multimedia material at different levels of abstraction, so as to express users' information needs at different levels. Taking visual content as an example, the lower abstraction level includes descriptions of shape, size, texture, color, motion (trajectory), and position. For audio, the lower abstraction level includes key, mood, tempo, tempo changes, and position in sound space. The highest level gives semantic information, such as "This is a scene in which a duck is hiding behind a tree and a car is passing in the background." The abstraction level is related to how features are extracted: many low-level features can be extracted fully automatically, while high-level features require more human interaction. MPEG-7 also allows sound data to be retrieved with queries based on visual descriptions, and vice versa.

A further goal of MPEG-7 is to support flexible data management, the globalization of data resources, and interoperability.

The scope of MPEG-7 standardization includes: a set of descriptors (a descriptor is a representation of a feature; it defines the syntax and semantics of the feature); a set of description schemes (which specify the structure and semantics of the relationships between their members); a language for specifying description schemes, the Description Definition Language (DDL); and one or more ways of encoding descriptions.
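The role of a descriptor can be illustrated with a toy example: a feature is extracted from the content, stored as a compact description, and later compared against the description of a query. The coarse brightness histogram used below is an assumption made for illustration. It is not one of the descriptors defined by MPEG-7, and the function names are hypothetical.

```python
def histogram(pixels, bins=4, max_value=256):
    """Toy descriptor: normalized histogram of 8-bit pixel values."""
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // max_value] += 1
    total = len(pixels)
    return [count / total for count in counts]


def distance(desc_a, desc_b):
    """Sum of absolute differences between two descriptions."""
    return sum(abs(a - b) for a, b in zip(desc_a, desc_b))


def search(query_desc, database):
    """Return item names ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: distance(query_desc, database[name]))


# Descriptions are stored alongside the content, not computed at query time.
database = {
    "dark_scene": histogram([10, 20, 30, 40]),
    "bright_scene": histogram([200, 220, 240, 250]),
}

query = histogram([15, 25, 35, 60])
print(search(query, database))
```

The search works entirely on the stored descriptions, which is what makes content-based retrieval practical for large collections.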

In daily life, the ever-increasing amount of available audio and video data calls for effective multimedia systems for accessing and interacting with it. This need touches on a number of important social and economic issues and is pressing in many professional and consumer applications, especially now that networks are so highly developed. The ultimate goal of MPEG-7 is to make multimedia content on the network as searchable as text content is today, giving the public access to a large amount of multimedia content. The MPEG-7 standard can support a very wide range of applications, as follows:

(1) Storage and retrieval of audiovisual databases;

(2) Selection of broadcast media (radio and television programs);

(3) Personalized news services on the Internet;

(4) Intelligent multimedia and multimedia editing;

(5) Applications in the field of education (such as digital multimedia libraries);

(6) Remote shopping;

(7) Social and cultural services (history museums, art galleries, etc.);

(8) Investigation services (recognition of human characteristics, forensics, etc.);

(9) Remote sensing;

(10) Surveillance (traffic control, ground transportation, etc.);

(11) Biomedical applications;

(12) Architecture, real estate and interior design;

(13) Multimedia directory services (e.g. yellow pages, travel information, geographic information systems, etc.);

(14) Home entertainment (personal multimedia collection management systems, etc.).

In principle, any type of AV (audio-visual) material can be retrieved using any type of query material. For example, AV material can be queried with video, music, speech and so on; a search engine matches the query data against the MPEG-7 audio and video descriptions. Here are a few examples of queries:

Music: Playing a few notes on a keyboard returns a list of musical pieces that contain (or approximate) the desired tune, or images that in some way match the notes, for example in emotional terms.

Graphics: Drawing a few lines on the screen returns a set of images containing similar graphics, logos, ideograms (symbols), and so on.

Movement: For a given set of objects, describing the movements and relationships between the objects returns a list of animations that fulfil the described spatio-temporal relationships.

Movie shooting script (story description): For given content, describing the actions returns a list of movie shooting scripts (story descriptions) in which similar actions occur.
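The music query above can be sketched with a melodic contour, which records only whether each note goes up, down, or repeats. This is one simple way to match a tune regardless of the key it is played in; it is used here as an illustration and is not a method prescribed by MPEG-7. The catalogue entries and their note numbers are invented.

```python
def contour(notes):
    """Describe a melody by its shape: U (up), D (down), R (repeat)."""
    steps = []
    for previous, current in zip(notes, notes[1:]):
        if current > previous:
            steps.append("U")
        elif current < previous:
            steps.append("D")
        else:
            steps.append("R")
    return "".join(steps)


def find_tunes(query_notes, catalogue):
    """Return titles whose contour contains the contour of the query."""
    wanted = contour(query_notes)
    return [title for title, notes in catalogue.items() if wanted in contour(notes)]


# Hypothetical catalogue; numbers are note pitches (higher means higher pitch).
catalogue = {
    "tune_a": [60, 60, 67, 67, 69, 69, 67],
    "tune_b": [64, 64, 65, 67, 67, 65, 64, 62],
}

# The query is played in a different key but has the same shape as tune_a.
print(find_tunes([50, 50, 57, 57], catalogue))
```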

5.MPEG-21 standard

The Internet has changed the business model of trading physical goods; this is "e-commerce". A new market inevitably brings new problems: how to acquire digital goods such as digital video, audio and synthetic graphics, how to protect the intellectual property of multimedia content, how to provide users with transparent media information services, how to retrieve content, how to guarantee quality of service, and so on. In addition, a great deal of digital media (pictures, music, etc.) is produced and used by individual users. These "content providers" have the same concerns as commercial content providers: content management and redistribution, protection of various rights, protection against unauthorized access and modification, and protection of trade secrets and personal privacy. Although the infrastructure for delivering and consuming digital media has been built and many of the relevant elements have been identified, there is no clear way of describing the relationships between these elements and specifications. A structure or framework is urgently needed to keep digital media consumption simple and to handle the relationships between the elements of "digital consumption" properly. MPEG-21 was proposed against this background.

The aims of the MPEG-21 standard are: (1) to integrate different protocols, standards and technologies organically; (2) to develop new standards; and (3) to integrate these different standards. MPEG-21 is in effect an integration of key technologies. This integrated environment provides transparent and enhanced management of global digital media resources and supports functions such as content description, creation, distribution, use, identification, charging management, intellectual property protection, user privacy protection, terminal and network resource abstraction, and event reporting.

Any individual or group that interacts with the MPEG-21 Multimedia Framework environment or uses MPEG-21 digital item entities can be regarded as a user. From a purely technical point of view, MPEG-21 makes no distinction between "content providers" and "consumers". The MPEG-21 Multimedia Framework standard covers the following user requirements: (1) security of content delivery and value exchange; (2) understanding of digital items; (3) personalization of content; (4) business rules in the value chain; (5) operation of compatible entities; (6) incorporation of other multimedia frameworks; (7) compatibility with and support for standards outside MPEG; (8) compliance with general rules; (9) the functions of the MPEG-21 standard and the communication capabilities of its parts; (10) enhanced use of media data in the value chain; (11) protection of user privacy; (12) guaranteed integrity of data items; (13) tracking of content and transactions; (14) provision of views of business processes; (15) provision of a standard library for general commercial content processing; (16) consideration of the independent development of business and technology in long-term investment; (17) protection of user rights, including service reliability, liability and insurance, loss and damage, payment processing, and risk prevention; (18) establishment and use of new business models.

IV. Other compression coding standards

1.Real Video

Real Video is a compression technology developed by RealNetworks for multimedia transmission over narrowband networks (primarily the Internet).


2.WMT

WMT (Windows Media Technologies) is a video and audio coding and compression technology developed by Microsoft Corporation for media transmission over the Internet. It is integrated with the WMT server and client architecture and uses some of the principles of the MPEG-4 standard.


3.QuickTime

QuickTime is a file format and transport architecture for storing, transmitting and playing multimedia files. The stored and transmitted media can be compressed with a variety of compression schemes, and transmission is carried out over the RTP protocol.
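For readers unfamiliar with RTP, the sketch below builds the 12-byte fixed header that precedes the media payload in every RTP packet. It illustrates the protocol in general and says nothing about how QuickTime itself packetizes media. The function name is made up, and the payload type 96 is simply a value from the dynamic range chosen for the example.

```python
import struct


def rtp_header(seq, timestamp, ssrc, payload_type=96, marker=False):
    """Build the 12-byte fixed RTP header (version 2, no padding,
    no extension, no CSRC entries)."""
    byte0 = 2 << 6  # version = 2 in the top two bits
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",  # network byte order: two bytes, 16-bit seq, two 32-bit fields
        byte0,
        byte1,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


header = rtp_header(seq=1, timestamp=3000, ssrc=0x1234)
print(len(header), header.hex())
```

The sequence number lets the receiver detect loss and reordering, and the timestamp lets it play the media at the right pace, which is what makes RTP suitable for continuous streams.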

Standardization is a prerequisite for successful industrialization. H.261 and H.263 promoted the development of video telephony and video conferencing. Early video server products mostly adopted M-JPEG, which opened the era of nonlinear video editing. MPEG-1 successfully drove the VCD industry in China. MPEG-2 has driven a range of consumer electronics industries such as DVD and digital TV. Applications of the other MPEG standards are being implemented or developed. RealNetworks' Real Video, Microsoft's WMT and Apple's QuickTime have driven the development of streaming media on the network. Video compression codec standards closely follow the pulse of application development and advance in step with industry and applications. The future is an information society, and the transmission and storage of multimedia data is a basic problem of information processing. It is therefore certain that video compression coding standards will play an increasingly important role.
