Global Standardization Activities
Image and Video Coding Related Standardization Activities of ISO/IEC JTC 1/SC 29
This article reviews recent standardization efforts related to image and video coding of the ISO (International Organization for Standardization)/IEC (International Electrotechnical Commission). We first review the aim of international standardization of media coding. Then we discuss the standardization process and explain some of the signature standards and their widespread ripple effects. Finally, we introduce recent trends and activities toward next-generation coding technologies that will possibly be standardized in the near future.
Keywords: ISO/IEC JTC 1/SC 29, JPEG, MPEG
1. Introduction
In the analog age, the medium used for audio signals was cassette tape, and the medium used for still and moving picture signals was silver halide film. Radio and television broadcasts and telephone transmissions were made possible with analog modulation technology. With the emergence of digital media, the fidelity of media signals has advanced significantly, in synchronization with the evolution of media coding (compression) technology.
2. Activities of SC 29
It is essential to conform to internationally coherent specifications and standards when interconnecting, securing, and deploying products and services using images and video. Subcommittee (SC) 29 (Coding of audio, picture, multimedia and hypermedia information) is one of the subcommittees of the international standardization body ISO/IEC JTC 1, and it has been the hub for standardization activities related to the coding of multimedia information technology. To achieve wide acceptance and use of its standards in the market, SC 29 standardizes not only the coding technologies for audio-visual information but also streaming, searching, multiplexing, storing, and interfacing technologies.
As a result, SC 29 standards are widely utilized in home appliances such as digital cameras, camcorders and hard disk drive (HDD) recorders, audio-visual appliances such as hi-fi stereo systems, video players and personal computers (PCs), and mobile devices such as portable music players, smartphones, and tablets. These standards substantially contribute to today’s visual, audio, multimedia, and information technology products and services. Needless to say, SC 29 plays an essential role in modern life and industry.
3. International and domestic bodies of SC 29
In 1986, ISO/IEC JTC 1/SC 2/Working Group (WG) 8 and its only subgroup, JPEG (Joint Photographic Experts Group), were established. MPEG (Moving Picture Experts Group), JBIG (Joint Bi-level Image Experts Group), and MHEG (Multimedia and Hypermedia Experts Group) were established in 1987, 1988, and 1990, respectively. In 1991, ISO/IEC JTC 1/SC 29 was established in order to standardize the functionality of end-user multimedia and hypermedia terminals. Dr. Hiroshi Yasuda (with NTT at the time) worked to establish SC 29 and was appointed its first Chair. Since then, a representative and an organization from Japan have served as the Chair and the Secretariat, respectively. In 1998, Dr. Hiroshi Watanabe (with NTT at the time) succeeded him as Chair, and Mr. Kohtaro Asai of Mitsubishi Electric took on the position in 2006. The International Secretariat is led by the Information Technology Standards Commission of Japan (ITSCJ), which is part of the Information Processing Society of Japan (IPSJ).
SC 29 holds a plenary meeting once a year. In July 2014, the 27th meeting was held in Sapporo; this was the second time Japan had hosted the plenary, 23 years after the first meeting. At this meeting, the WG 1 Convener changed after 18 years. In June 2015, the 28th meeting was held in Warsaw, Poland, where Mr. Asai asked JTC 1 and the Technical Management Board for an extension of his term of office beyond the nine-year limit. His request was granted, allowing him to continue as Chair of SC 29 for an additional three-year term.
There are two WGs (WG 1 and WG 11) under the auspices of SC 29. WG 1 develops the standards for technologies such as still image coding, image retrieval, and high dynamic range extension. WG 11 develops the standards for technologies such as moving video and audio coding, systems, high efficiency coding, multiplexing, three-dimensional (3D) video coding, and video retrieval. Each WG holds three to four meetings a year in order to facilitate the standardization. Some of the major standards developed by SC 29 are listed in Table 1.
In Japan, a domestic SC 29 committee (SC 29 Japan National Body, chaired by Dr. Seishi Takamura of NTT) is responsible for carrying out SC 29 activities in Japan under the auspices of ITSCJ. It handles communications with the higher body and votes on standardization ballots. It also runs tutorial seminars to introduce its activities to the public. The SC 29 Japan National Body committee comprises five WG subcommittees: WG 1 (still image coding), WG 11/AUDIO (audio-visual coding, audio), WG 11/VIDEO (audio-visual coding, video), WG 11/SYSTEMS (audio-visual coding, systems), and WG 11/SYSTEMS/MPEG-7 SG. They work on gathering/consolidating domestic opinions, and they participate in and support the international WG activities.
4. Standards created by SC 29/WG 1
4.1 JPEG (1992)
JPEG was the first international standard for still image coding, and it spread quickly after its publication. JPEG is still the prevailing standard and is used almost exclusively in image-related products and services such as smartphones, digital cameras, and web pictures. Digital camera production peaked in 2010 at 121.5 million shipments. Every day, at least 1.8 billion JPEG images are shared worldwide via social networking services (SNSs). JPEG has long held an unshakable position as the primary image storage format, which is quite unusual in the continually changing information technology field.
4.2 JBIG (1993)
JBIG is a lossless coding standard for binary (black and white) images. It provides a 20–80% higher compression ratio than the fax image coding method (about 1/20 compression). Its successor, JBIG-2, provides lossless coding and lossy coding and offers two to four times higher compression than JBIG.
4.3 JPEG-LS (1998)
JPEG-LS is a lossless and near-lossless coding standard for continuous-tone images. Although JPEG also has a lossless coding capability, JPEG-LS has a higher compression ratio. Additionally, its computational complexity is not very high, so it is suitable for hardware implementation.
4.4 JPEG 2000 (2000)
JPEG 2000 provides higher image quality than JPEG, particularly at low bit rates. It also provides a wide range of functionalities such as seamless lossy to lossless coding, progressive decoding, ROI (region of interest) coding, error resilience, and moving picture coding. JPEG 2000 is used in numerous fields including digital cinema, digital archiving, driver’s license images, passport images, medical imaging, and satellite imaging.
5. Standards created by SC 29/WG 11
5.1 MPEG-1 (1993)
MPEG-1 aims at a video coding rate of 1.5 Mbit/s to be used in video CD (compact disc), PC, and web applications. MPEG-1 Audio Layer 3, known as MP3, is part of the MPEG-1 standard and is widely used for audio players and Internet audio streaming.
5.2 MPEG-2 (1995)
MPEG-2 aims at a video coding rate of 4–10 Mbit/s for SDTV (standard definition television) and 15–30 Mbit/s for high definition TV (HDTV). Because MPEG-2 is effective for interlaced video and high-quality video materials (as required by broadcasting studios), it is widely used for storage media such as DVDs (digital versatile discs), terrestrial/satellite digital broadcasting, HDD recorders, and digital editing systems, which disseminate services via digital video transmission.
5.3 MPEG-4 (1998)
MPEG-4 is designed for a range of video coding rates from very low (10 kbit/s) to very high (40 Mbit/s) while providing higher compression than previous standards. Furthermore, it is the first video coding standard that has a VOP (video object plane) coding capability to handle several moving objects as well as the background. It is used in a cellular video phone service in Japan.
5.4 MPEG-4 Advanced Video Coding (AVC) (2003)
MPEG-4 AVC, with the ITU-T (International Telecommunication Union-Telecommunication Standardization Sector) designation of H.264, is designed for a very wide range of video coding rates from 10 kbit/s to 240 Mbit/s while achieving a compression ratio twice that of MPEG-2. It was widely accepted by the consumer market and has been adopted in many audio-visual appliances/services such as Blu-ray Discs, the 1seg mobile terrestrial digital audio/video and data broadcasting service in Japan, and HDTV broadcasting.
5.5 MPEG-H High Efficiency Video Coding (HEVC) (2013)
HEVC was standardized by the Joint Collaborative Team on Video Coding (JCT-VC) of WG 11 and ITU-T SG16 WP3 Q.6 (Working Party 3, Question 6), also known as VCEG (Video Coding Experts Group). Its ITU-T designation is H.265. It covers an extremely wide video coding rate range from 128 kbit/s to 800 Mbit/s. The balance of coding efficiency and computational complexity was carefully examined in the standardization process, and compression performance double that of MPEG-4 AVC was finally achieved. For ultrahigh-definition TV (UHDTV) content (4K video and above), the compression performance is nearly three times better (64% bit rate reduction) than that of MPEG-4 AVC. Since its publication in 2013, it has been adopted as the core video coding method of many services such as CS (communications satellite) 4K TV test broadcasting (Channel 4K), Hikari TV 4K from NTT Plala, which is the first commercial 4K VOD (video on demand) service in Japan, and NTT DOCOMO’s Docomo Anime Store. It will be used in the BS (broadcasting satellite) 8K test broadcasting system starting in 2016, and its use is expected to continue spreading and to lead to an expansion of the broadcast, home appliance, and communication markets.
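The relation between a bit rate reduction and a compression gain quoted above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration; the function names are hypothetical, and it assumes that both codecs are compared at equal visual quality, so that a fractional bit rate reduction r corresponds to a gain factor of 1/(1 - r).

```python
def gain_from_reduction(reduction):
    """Convert a fractional bit rate reduction (0 <= r < 1) into a compression gain factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def reduction_from_gain(gain):
    """Convert a compression gain factor (>= 1) into a fractional bit rate reduction."""
    if gain < 1.0:
        raise ValueError("gain must be at least 1")
    return 1.0 - 1.0 / gain


if __name__ == "__main__":
    # 64% bit rate reduction, as quoted for UHDTV material
    print(round(gain_from_reduction(0.64), 2))  # 2.78
    # Doubling of compression performance, as quoted for HEVC vs. MPEG-4 AVC
    print(reduction_from_gain(2.0))             # 0.5
```

Under this assumption, a 64% reduction corresponds to a factor of about 2.8, which matches the "nearly three times" figure, and a doubling of compression performance corresponds to a 50% bit rate reduction.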
5.6 HEVC 2015 Edition
Right after the standardization of HEVC in 2013, its extensions were intensively studied and finalized as RExt (Range Extensions: 4:2:2/4:4:4 chroma format, high bit depth (up to 12 bits/sample)), MV-HEVC (Multi-view HEVC: 3D multi-view video coding), and SHVC (Scalable HEVC). A 3D-related extension was processed by JCT-3V, the Joint Collaborative Team on 3D Video Coding Extension of WG 11 and ITU-T SG16 WP3. These extensions were consolidated into the original HEVC and published as HEVC 2015 Edition with the ITU-T designation of H.265 (V2).
6. External evaluation of SC 29
SC 29 and its contributors are highly regarded worldwide because of the prominent industrial success they have achieved, and they have won some very prestigious awards including the First ISO Lawrence D. Eicher Award, the ISO150 Award, Emmy Awards (6 times), the Prime Minister of Japan Award, the Minister of Posts and Telecommunications Award, the Minister of Economy, Trade and Industry Awards (6 times), and the Industrial Science and Technology Policy and Environment Bureau Director-General’s Awards (11 times).
7. Trends of SC 29/WG 1
Some of this group’s work items (JPEG XT and JPSearch) have just been standardized or are about to be standardized. WG 1 is also starting work on the standardization of Advanced Image Coding (AIC), JPEG AR, JPEG Systems, JPEG Privacy & Security, JPEG PLENO, and JPEG XS. These are explained in more detail below.
7.1 JPEG XT
Recent imaging devices can handle bit depths higher than 8 bits/sample. However, JPEG Baseline, the form of JPEG that is used almost exclusively in practice, can only handle up to 8 bits/sample. A new coding technology for high dynamic range (HDR) images that remains backward compatible with JPEG Baseline is therefore desired. JPEG XT expresses an HDR image with a JPEG Baseline-compatible bitstream that also contains residual information. The bitstream can be decoded by a legacy JPEG decoder (yielding a low dynamic range image) or by a JPEG XT decoder (yielding the high dynamic range image). Part 1 of JPEG XT was standardized in 2012; currently, Parts 2 through 9, which support lossless coding and opacity features, are about to be standardized. Japan has made substantial contributions to this standardization effort.
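The base-plus-residual idea described above can be illustrated with a toy example. The following Python sketch is not the JPEG XT algorithm: it omits the DCT, quantization, and entropy coding entirely, and the gamma-style tone mapping, the function names, and the sample values are all assumptions chosen for illustration. It only shows how an 8-bit base layer can be read on its own while a residual layer lets an extended decoder recover the original HDR samples.

```python
GAMMA = 2.2  # assumed tone-mapping exponent, chosen only for illustration


def tone_map(hdr_value, hdr_max):
    """Map a linear HDR sample in [0, hdr_max] to an 8-bit base-layer value."""
    return round(255 * (hdr_value / hdr_max) ** (1 / GAMMA))


def inverse_tone_map(ldr_value, hdr_max):
    """Predict an HDR sample from an 8-bit base-layer value."""
    return round(hdr_max * (ldr_value / 255) ** GAMMA)


def encode(hdr_samples, hdr_max):
    """Split HDR samples into an 8-bit base layer and an integer residual layer."""
    base = [tone_map(v, hdr_max) for v in hdr_samples]
    residual = [v - inverse_tone_map(b, hdr_max) for v, b in zip(hdr_samples, base)]
    return base, residual


def decode_legacy(base):
    """A legacy decoder ignores the residual and shows the 8-bit base layer."""
    return list(base)


def decode_extended(base, residual, hdr_max):
    """An extended decoder adds the residual to the prediction from the base layer."""
    return [inverse_tone_map(b, hdr_max) + r for b, r in zip(base, residual)]


if __name__ == "__main__":
    hdr = [0, 100, 4095, 65535]  # hypothetical 16-bit linear samples
    base, residual = encode(hdr, 65535)
    print(decode_legacy(base))                           # 8-bit view of the image
    print(decode_extended(base, residual, 65535) == hdr)  # True
```

Because the residual is computed against the same prediction that the extended decoder uses, the reconstruction in this toy setting is exact; in a real codec the base layer is lossy-coded, so the residual would be computed against the decoded base layer instead.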
7.2 JPSearch
JPSearch is aimed at efficiently retrieving images based on metadata. The data format, query format, application programming interface (API), and content description are provided.
7.3 JPEG XR
JPEG XR (extended range) aims to support high bit depth image coding while offering a lower complexity coding scheme than JPEG 2000. It is based on various other image coding schemes, primarily HD Photo, which was developed by Microsoft Corporation.
7.4 AIC
AIC is a next-generation image coding standard that achieves higher coding performance than JPEG 2000. Prior to actually providing the coding technology, AIC is developing the image quality evaluation technology.
7.5 JPEG AR
The aim of JPEG AR is to achieve the interoperability of AR (augmented reality)-assisted applications by standardizing associated frameworks and APIs. In October 2012, the Asian Forum on Smart Media and Augmented Reality was established, and its members have been working on standardizing JPEG AR in cooperation with WG 11. The market for AR-related services is expected to grow thanks to the emergence of new devices and applications.
7.6 JPEG Systems
JPEG Systems is a large-scale standard that synthesizes, analyzes, and integrates JPEG-family standards; its standardization was initiated in 2014. While it defines the common file format and code-stream syntax of conventional JPEG, JPEG Systems is also designed to cover future JPEG-family standards.
7.7 JPEG Privacy & Security
To assure privacy and security when sharing photos online through SNSs or (stock) photography databases, JPEG Privacy & Security will provide new functionality for JPEG encoded images such as ensuring privacy, maintaining data integrity, and protecting intellectual rights, while maintaining backward and forward compatibility to existing JPEG. A public workshop will be held in October 2015 during the JPEG meeting in Brussels, Belgium. This new working item was proposed by Japan.
7.8 JPEG PLENO
JPEG PLENO is targeting a standard framework for the representation and exchange of new imaging modalities such as light-field, point-cloud, and holographic imaging. Another target is to define new tools for improved compression while providing advanced functionality support for image manipulation, metadata, image access and interaction, privacy and security, and harmonization with the conventional JPEG format. A workshop was organized during the JPEG meeting in Warsaw in June 2015.
7.9 JPEG XS
Today’s industrial applications often involve transport and storage of uncompressed images and video. This is the case, for instance, in video links, IP (Internet protocol) transport, Ethernet transport, proprietary transports, and memory buffers. In this context, the JPEG XS low-latency lightweight coding system is aimed at increasing the resolution and frame rate while ensuring good visual quality and keeping power use and bandwidth within a reasonable budget. A Call for Proposals is expected to be published at the 71st JPEG meeting in La Jolla, USA, in February 2016.
8. Trends of SC 29/WG 11
Further extensions are underway since the consolidation of HEVC 2015 Edition.
8.1 3D-HEVC
Unlike the multiview video coding extension MV-HEVC, 3D-HEVC is aimed at depth-map-based 3D video coding and is being standardized by JCT-3V (JCT on 3D Video Coding Extensions). Coupling this coding technology with depth-image-based rendering technology (outside the standardization scope) makes it possible to generate video from arbitrary viewpoints and thus to achieve a glasses-free 3D video system.
8.2 SCC
Conventional video coding standards are aimed at coding natural, camera-captured video. In contrast, SCC (screen content coding) is aimed at coding CG (computer graphics), game, and PC screen content. The Call for Proposals for SCC was issued in January 2013, and JCT-VC has been working on its standardization. Its prospective applications include wireless displays, screen sharing, remote education, and electronic books.
8.3 HDR
The dynamic range of imaging devices and displays has been rapidly advancing. The maximum contrast a human eye can detect in one scene without adaptation is estimated to be 100 thousand to 1, and with unnoticeably slight adaptation, 1 million to 1. These ratios were taken into account in developing a new coding scheme that allows HDR video with a pixel value of 16 bits/sample or floating point numbers. In February 2015, a Call for Evidence (CfE) was issued, and at the WG 11 meeting in Warsaw in June 2015, the proposed technologies were evaluated in order to accelerate the standardization.
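To relate the contrast ratios above to sample bit depths, the ratios can be expressed as a number of doublings (often called stops). The following Python sketch is a rough illustration under a stated assumption: it considers a plain linear integer code in which the smallest nonzero code is 1, so that n bits span a ratio of (2^n - 1) to 1. The function names are hypothetical, and practical HDR systems may instead use nonlinear or floating-point representations.

```python
import math


def stops(contrast_ratio):
    """Number of doublings spanned by a contrast ratio such as 100000 for 100,000:1."""
    if contrast_ratio < 1:
        raise ValueError("contrast ratio must be at least 1")
    return math.log2(contrast_ratio)


def min_linear_bits(contrast_ratio):
    """Smallest bit depth n whose linear code range (2**n - 1):1 covers the ratio."""
    return math.ceil(math.log2(contrast_ratio + 1))


if __name__ == "__main__":
    for ratio in (100_000, 1_000_000):
        print(ratio, round(stops(ratio), 1), min_linear_bits(ratio))
    # 100000 16.6 17
    # 1000000 19.9 20
```

Under this linear assumption, the two ratios correspond to roughly 17 and 20 bits, which gives a feel for why 16 bits/sample and floating-point sample formats are being considered for HDR video.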
8.4 Future Video Coding
The Future Video Coding project is aimed at next-generation video coding that attains even higher coding performance than HEVC. Just as with HEVC standardization, it is a collaboration between WG 11 and ITU-T SG16 WP3 Q.6. At the WG 11 meeting in Strasbourg, France, in October 2014, the Brainstorming on Future Video Coding was organized, and its targets, applications, and timeline were discussed. A workshop on Future Video Coding is planned for the WG 11 meeting to be held in Geneva, Switzerland, in October 2015. Further improvement of compression is desired not only for UHDTV broadcasting but also for mobile streaming and video downloads. The market for these activities is expected to grow, and thus, these trends bear watching.
8.5 FTV
Super multiview video coding (FTV: Free-viewpoint Television) goes beyond the current stereo 3D and is now under consideration for standardization. A seminar and demonstrations were held at the WG 11 meeting in Sapporo in July 2014. The main contributors were from Japan and Europe. The CfE soliciting potential technologies that outperform HEVC for this type of video was issued in June 2015. The submitted proposals will be evaluated at the WG 11 meeting in October 2015.
9. Future exploration of SC 29
This article has given an overview of the structure and activities of SC 29, which supports today’s widespread multimedia technology, and introduced several image and video coding standards that were developed by SC 29. The standardization of HEVC extensions will converge by the end of 2015, and then many future-oriented exploration activities will gradually shift to the standardization phase, which will trigger another groundswell of activity.
Space limitations did not allow me to mention the deliverables of SC 29 other than image and video coding standards. These deliverables include a hybrid content service that spans broadcasting and telecommunication using MMT (MPEG Media Transport), network bandwidth- or terminal-adaptive video streaming using DASH (Dynamic Adaptive Streaming over HTTP), next-generation 3D audio coding, and a media format for 3D printers, which are becoming increasingly common. Such diverse elements have already been introduced in the market and will surely be standardized in the future. The importance of SC 29 will increase more than ever in synchronization with the evolution of networks, broadcasting, and digital appliances.