iSize Claims Massive Performance Savings for its Debut AI Codec


London-based startup iSize Technologies is launching an AI-powered encoding platform which it claims boosts bitrate savings by 70% versus virtually any other codec solution. 

The company says it has "extensively tested" its patent-pending SaaS, called BitSave, and further claims it will make encoding up to 500% faster than competing cloud encoding solutions. 

The company claims BitSave is the industry’s first proprietary machine learning (ML) solution for "substantial bitrate or quality gains in video compression particularly for video encoding on 'resource-constrained' devices" (drones, action cams, and smartphones are mentioned).

It is codec-independent, so it will work with AVC/H.264, HEVC/H.265, and VP9. The currently available version of the platform provides H.264/AVC encoding, with other encoders coming soon.

The product will debut at the IBC Show in Amsterdam next month.

"Advanced machine learning is now beginning to disrupt content delivery, and we believe we are the right team to allow for such advanced technologies to enter the market in a manner that is backward compatible to existing video encoding and transport standards," says iSize Technologies CEO Sergio Grce. "BitSave resolves high-bitrate quality issues by providing the capacity, performance, and know-how to cost-effectively deliver a high-quality experience for both VOD and live streaming customers."

While it has not been benchmarked against the machine learning capabilities at the core of V-Nova’s Perseus Pro codec (which drives V-Nova’s P.Link Dual UHD/8HD video encoder and decoder), iSize is targeting BitSave at the same territory as MPEG-5 Part 2 Low Complexity Enhancement Video Codec (LCEVC). LCEVC, which is based on Perseus Pro, is a codec-agnostic enhancement that is claimed to improve compression, or video quality at a given bitrate, while reducing the processing power consumed.

"Future coding standards, like the ongoing VCEG/MPEG JVET standardization to create the next generation codec that will replace HEVC, will undertake a lengthy development process that will typically culminate in 30%-40% bitrate saving for the same visual quality," explains Grce. "However, the expected timeline for delivery of the first working codecs for MPEG’s JVET current standardization is scheduled for after 2023—at the same time, the HEVC standard has still not reached large rollout to date."

On the other hand, current machine learning solutions like Magic Pony (owned by Twitter) and Wave One offer disruptive performance for still-image coding.

"Such solutions face substantial barriers when moved to video due to the unresolved challenge of incorporating temporal prediction and their deployment complexity," says Grce.

That’s where iSize comes in.

Its improvements are achieved by incorporating a proprietary downscaling-upscaling technology as a pre- and post-processing stage of a standard codec pipeline. 

Instead of abandoning the existing codec pipeline (as proposed by autoencoding solutions, such as Magic Pony), iSize’s encoder-side solution downscales the input content with a custom-designed filter. The iSize decoder-side solution upscales the decoded low-resolution video to obtain the final result. 
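The shape of that pipeline can be sketched in a few lines of Python. This is only an illustration of the downscale, encode/decode, upscale ordering described above: iSize's actual filters are proprietary and not described in detail, so the 2x2 averaging, the coarse quantization standing in for a standard codec, and the nearest-neighbour upscaling are all assumptions, as are the function names.

```python
def downscale_2x(frame):
    """Encoder side: average each 2x2 block (a stand-in for iSize's custom filter).

    Assumes a grayscale frame as nested lists with even width and height.
    """
    height, width = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def codec_roundtrip(frame, step=4):
    """Placeholder for a standard encode/decode (e.g. H.264): coarse quantization."""
    return [[(pixel // step) * step for pixel in row] for row in frame]


def upscale_2x(frame):
    """Decoder side: nearest-neighbour upscaling (a stand-in for iSize's upscaler)."""
    upscaled = []
    for row in frame:
        wide_row = [pixel for pixel in row for _ in range(2)]
        upscaled.append(wide_row)
        upscaled.append(list(wide_row))
    return upscaled


def pipeline(frame):
    """Pre-process, pass through the unchanged codec, then post-process."""
    return upscale_2x(codec_roundtrip(downscale_2x(frame)))
```

The point of the arrangement is that the codec in the middle only ever sees a quarter of the original pixels, while the output frame comes back at the original resolution.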

It says it has measured video quality via industry-standard metrics such as PSNR, SSIM, and VMAF.
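For readers unfamiliar with the first of those metrics, PSNR compares a processed frame with its original on a logarithmic decibel scale. The sketch below is a minimal pure-Python illustration, assuming 8-bit grayscale frames stored as nested lists; the function name is illustrative and is not taken from iSize's tooling.

```python
import math


def psnr(reference, processed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0
    pixel_count = 0
    for ref_row, proc_row in zip(reference, processed):
        for ref_pixel, proc_pixel in zip(ref_row, proc_row):
            squared_error += (ref_pixel - proc_pixel) ** 2
            pixel_count += 1
    mse = squared_error / pixel_count
    if mse == 0:
        return math.inf  # identical frames have no measurable error
    return 10 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of roughly 3dB corresponds to halving the mean squared error.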

"Our IP offers 20% to 40% rate saving or quality improvement (2-4dB of PSNR) over AVC/H.264 and HEVC at marginal complexity increase for the decoder and complexity reduction for the encoder," says Grce. "Unlike other machine learning efforts in this space, our solution is deployable today and can be used on top of any standards-compliant or proprietary video codec architecture with minimal increase in complexity."

In fact, iSize goes further and suggests that operators with existing HEVC encoders will find efficiencies boosted by up to 70 percent using its technology. iSize says it has the data to back this up.

It further claims to offer "the most competitive and simplest encoding pricing model" in the market. It charges £0.01 + VAT per minute of content encoding (audio included) for all resolutions up to 4K, which it describes as "50% cheaper than existing solutions at enhanced resolutions."

For example, encoding a 10-minute HD video clip with H.264/AVC at two bitrates, 5000Kbps and 3000Kbps, costs £0.20 + VAT.
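The arithmetic behind that figure appears to be one charge per minute for each output bitrate. The short sketch below works it through; the per-rendition billing rule is an assumption inferred from the example rather than a published tariff, and the function name is hypothetical.

```python
from decimal import Decimal

# Quoted rate per minute of encoded content, excluding VAT.
PRICE_PER_MINUTE_GBP = Decimal("0.01")


def encoding_cost(duration_minutes, renditions):
    """Cost in GBP before VAT, assuming each output bitrate is billed separately."""
    return PRICE_PER_MINUTE_GBP * duration_minutes * renditions


# A 10-minute clip encoded at two bitrates (5000Kbps and 3000Kbps).
cost = encoding_cost(10, 2)
print(f"£{cost} + VAT")  # £0.20 + VAT
```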

"We have tuned our encoding such that, when selecting a bitrate value, the provided video quality will correspond to the visual quality of well-known video encoding services at that bitrate," Grce explains. "However, in the vast majority of cases, our actual bitrate will be 20% to 80% lower than that value, thereby offering this saving at no compromise in visual quality—in fact, in many cases our visual quality will be even higher than that of the other services that do not offer such savings across the range of bitrates."

iSize says it is piloting and testing BitSave with companies in VOD, streaming, and broadcasting, and has partnered with CosmoCDN, a CDN provider currently piloting in several regions of Africa, to demonstrate its impact.

In the workflow integration with CosmoCDN, iSize receives the VOD content as well as a live feed (utilising SRT) and transcodes both using the iSize engine. The VOD ABR content is sent to CosmoCDN's origin storage, while the live feed goes directly to the CDN’s edge servers.

CosmoCDN streams the content to end users while caching it on SSD drives. Customers receive an embedded URL from iSize that includes the player and the CosmoCDN streaming URL. This code can be used in a CMS such as WordPress or Drupal.

Additionally, there is a tokenised plugin which generates a token to allow watching the content, so that only authorized users can use the service.

The startup has a team of five and boasts PhDs in video signal processing, ML, and advanced networking systems.

Grce holds an MSc from London’s Cass Business School and previously worked in investment banking.

Technical director Yiannis Andreopoulos (who is also Professor in Data and Signal Processing Systems at University College London) counts 17 years of experience in video coding with more than 150 research papers, three patent applications, and ten contributions to JPEG/MPEG standards.
