Performance

The performance of libjpeg-turbo is monitored using the following test images:

vgl_5674_0098.ppm: A frame capture from the 3D Studio Max Viewperf test. This frame represents a movie set with significant wireframe and texture content (1240 x 960 pixels).

vgl_6434_0018.ppm: A frame capture from the Pro/ENGINEER Viewperf test. This frame represents an exploded rendering of a race car with smooth shading and lighting (1240 x 960 pixels).

vgl_6548_0026.ppm: A frame capture from the UGS NX Viewperf test. This frame represents a grayscale wireframe rendering of an engine block with partial transparency (1240 x 960 pixels).

nightshot_iso_100.ppm: Photographic content from http://www.imagecompression.info/test_images/ (8-bit version, 3136 x 2352 pixels).

The 3D frame captures were chosen because they represent workloads that are difficult for JPEG to compress. Thus, they have below-average compression ratios, but none of them are corner cases. The photograph was chosen because its performance was typical of other images in the test image set.

The raw data can be found in the following spreadsheet: libjpegturbovsipp.ods. This spreadsheet will be updated as new data is acquired.

General Performance Notes

For non-grayscale JPEG compression and decompression, libjpeg-turbo is between 1.8x and 4.5x as fast as libjpeg v6b.

For non-grayscale JPEG compression and decompression, libjpeg-turbo 64-bit is between 80% and 118% as fast as TurboJPEG/IPP. libjpeg-turbo 32-bit is between 60% and 93% as fast as TurboJPEG/IPP.

libjpeg-turbo's primary weakness relative to TurboJPEG/IPP is 32-bit performance, particularly on Intel processors and even more so on legacy Intel processors. This is largely due to the Huffman encoder/decoder running out of registers and having to swap some inner loop variables back and forth from memory. The optimizations performed by the VirtualGL project reduced this effect somewhat, but it could not be eliminated entirely.
Another weakness of the libjpeg-turbo codec is subsampling. In general, it takes more of a relative hit from enabling chrominance subsampling than TurboJPEG/IPP does. The other area in which IPP excels is compressing high-frequency content, such as images with sharp lines. Thus, the disparity between TurboJPEG/IPP and libjpeg-turbo is the most pronounced on the 3D Studio Max image and the least pronounced on the photograph. All of these represent areas which could benefit from further optimizations.

TurboJPEG/IPP has some weaknesses, however. Perhaps the most notable is that the 64-bit version requires SSE3 code in order to perform optimally, so libjpeg-turbo will have a clear advantage on older Opteron and Athlon64 systems that lack the SSE3 instruction set.

Fast Upsampling

One important note about image quality and performance: The TurboJPEG/OSS wrapper in libjpeg-turbo, which was used for all of the benchmarks on this page, is configured to use settings which duplicate, as closely as possible, the image quality of TurboJPEG/IPP. Thus, the fast integer forward DCT, the slow integer inverse DCT, and slow chrominance upsampling are used in libjpeg. A significant decompression speedup (at the expense of increased subsampling artifacts) can be gained on chrominance subsampled images by enabling fast (AKA "merged") chrominance upsampling. This is accomplished by passing a flag of

Restart Markers

The fast Huffman decoder in libjpeg-turbo originally came from Sun Microsystems' mediaLib JPEG Codec. It was open sourced via the VirtualGL project and later adapted for use by libjpeg-turbo. Unfortunately, this fast Huffman decoder does not handle restart markers in a way that makes libjpeg happy, so it is necessary to use the slow Huffman decoder when decompressing a JPEG image that has restart markers. This can cause the 64-bit decompression performance to drop by as much as 15% and the 32-bit decompression performance to drop by as much as 20%.
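To judge whether a particular file is likely to fall back to the slow Huffman decoder, one can check whether it declares a restart interval. The following is a minimal sketch, not part of libjpeg-turbo: it walks the marker segments that precede the first scan and reports the value of the DRI (Define Restart Interval, 0xFFDD) segment, as laid out in the JPEG standard. The function name `restart_interval` is hypothetical, and the sketch assumes a well-formed stream and ignores any DRI segment that appears after the first scan.

```python
import struct

# Marker codes from the JPEG standard (second byte after 0xFF).
SOS = 0xDA  # Start Of Scan: entropy-coded data follows
DRI = 0xDD  # Define Restart Interval


def restart_interval(data: bytes) -> int:
    """Return the restart interval (in MCUs) declared before the first
    scan, or 0 if the stream does not enable restart markers."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    interval = 0
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # optional fill byte before a marker
            pos += 1
            continue
        if marker == SOS:  # header segments end here
            break
        # Every header segment carries a big-endian length that
        # includes the two length bytes themselves.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            raise ValueError(f"bad segment length at offset {pos}")
        if marker == DRI:
            if length != 4 or pos + 6 > len(data):
                raise ValueError("malformed DRI segment")
            (interval,) = struct.unpack(">H", data[pos + 4:pos + 6])
        pos += 2 + length
    return interval
```

A nonzero result means the encoder asked for restart markers every that many MCUs; a result of 0 means the stream either has no DRI segment or explicitly disables restarts.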
Since many consumer packages, such as Photoshop, use restart markers when generating JPEG images, this prevents images generated by such packages from achieving full decompression performance with libjpeg-turbo (although they will still decompress much faster than with libjpeg). This is an area of low-hanging fruit for future research and optimization.
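The quality trade-off described under Fast Upsampling can be illustrated with a small sketch. It compares two ways of doubling the width of one row of chrominance samples: plain replication, which is the kind of box filter that merged upsampling relies on, and a 3/4 : 1/4 triangle filter of the kind used by the slower "fancy" upsampling path. This is an assumption-laden illustration, not libjpeg-turbo code: the function names are hypothetical, the rounding constants follow my reading of libjpeg's h2v1 upsampler, and the sketch leaves out the color conversion that the merged path performs in the same pass.

```python
def upsample_replicate(row):
    """Double the row width by repeating each sample (box filter)."""
    out = []
    for v in row:
        out.extend((v, v))
    return out


def upsample_triangle(row):
    """Double the row width with a 3/4 : 1/4 triangle filter.

    Each output sample weights the nearer input sample by 3/4 and the
    further one by 1/4. At the row edges the missing neighbor is
    replaced by the sample itself, so edge values pass through as-is.
    """
    n = len(row)
    out = []
    for i, v in enumerate(row):
        left = row[i - 1] if i > 0 else v
        right = row[i + 1] if i < n - 1 else v
        out.append((3 * v + left + 1) >> 2)   # output nearer the left neighbor
        out.append((3 * v + right + 2) >> 2)  # output nearer the right neighbor
    return out
```

For the row `[0, 100, 100, 0]`, replication yields `[0, 0, 100, 100, 100, 100, 0, 0]`, while the triangle filter yields `[0, 25, 75, 100, 100, 75, 25, 0]`. The hard steps in the replicated row are the source of the extra subsampling artifacts; the price of the smoother result is additional arithmetic per output sample.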
All content on this web-site is licensed under the Creative Commons Attribution 2.5 License. Any works containing material derived from this web-site must cite The VirtualGL Project as the source of the material and list the current URL for the VirtualGL web-site.