Buyer's Guide: Windows Production Stations
This article appears in the February/March issue of Streaming Media magazine, the annual Streaming Media Industry Sourcebook. In these Buyer's Guide articles, we don't claim to cover every product or vendor in a particular category, but rather provide our readers with the information they need to make smart purchasing decisions, sometimes using specific vendors or products as exemplars of those features and services.
If you’re an editor or a compressionist, your livelihood depends upon the stability and throughput of your production station. If you plan to buy a new Windows workstation in 2012, here are some thoughts to consider when choosing and configuring your system.
Clearly, you need a 64-bit OS. For most applications, I recommend Windows 7 Professional, which should install in 64-bit mode on computers with a 64-bit CPU. However, several enterprise encoding applications, such as Sorenson Media, Inc.’s Squeeze Server or Telestream, Inc.’s Vantage, require Windows Server 2008. So identify the programs you intend to load on the workstation over the next 12 to 18 months and choose your OS accordingly.
One CPU or Two
The single-CPU versus dual-CPU decision is critical because it impacts cost and performance dramatically. To help analyze this decision in detail, I’ve been fortunate to have two HP workstations in-house: one with a single 2.67 GHz four-core CPU (an HP Z400), the other with two 2.67 GHz four-core CPUs (an HP Z600). Both systems were configured with 24GB of RAM and the same graphics card, NVIDIA Corp.’s Quadro FX 4800.
Using these two systems, I’ve focused on the single-CPU versus dual-CPU buying decision for those editing in Adobe Systems, Inc.’s Creative Suite (CS) 5.5, as well as those encoding with a range of file-based transcoding tools. Here’s what I found.
In terms of editing, single-CPU and dual-CPU performance varies based upon the formats that you edit and whether you typically use a single CPU-intensive application on your computer or whether you multitask. Table 1, from an article I wrote titled “Configuring Windows Workstation for Premiere Pro CS5.5,” summarizes these findings. To explain, the column on the left shows, on a percentage basis, how much longer it would have taken to render my test projects on a single CPU (the Z400), rather than the dual-CPU system (the Z600). Positive numbers mean it would have taken longer on the Z400; negative numbers mean the dual-CPU system was slower.
I color-coded the results to highlight negative numbers in red, as in a big red flag. For example, when rendering my HDV test projects, the Z600 was actually 2% slower than the Z400. At the other end of the spectrum, if the dual-CPU Z600 would improve encoding time by more than 24%, the figure was highlighted in green, which means go. Granted, 24% is totally arbitrary, but it’s a meaningful number in a production environment.
The second column shows the same analysis with Adobe Encore rendering a Blu-ray project in the background. Beyond the percentage time savings shown in Table 1, the other critical fact is that the dual-CPU Z600 finished the Blu-ray project in 1:42 (hours:minutes) while the single-CPU Z400 took 2:39, almost an hour longer (or 56%). If you frequently multitask and render one project while editing another, this is the column that you should focus on, and the dominance of green means a second CPU could be very worthwhile.
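For readers who want to run the same comparison on their own render times, here is a minimal Python sketch of the arithmetic behind the Blu-ray figures above. The function names are my own for illustration, and the percentage is expressed relative to the faster (dual-CPU) time, which is how the 56% figure works out.

```python
def to_minutes(hhmm: str) -> int:
    """Convert an 'hours:minutes' string such as '1:42' to total minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def percent_longer(slow: str, fast: str) -> float:
    """How much longer the slow run took, as a percentage of the fast run."""
    slow_min, fast_min = to_minutes(slow), to_minutes(fast)
    return (slow_min - fast_min) / fast_min * 100


# Blu-ray render times quoted above: Z400 (single CPU) vs. Z600 (dual CPU).
extra_minutes = to_minutes("2:39") - to_minutes("1:42")
print(extra_minutes)                           # 57 minutes longer on the Z400
print(round(percent_longer("2:39", "1:42")))   # 56 (percent)
```

Substitute your own render times to see whether the difference would matter in your production schedule.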
Overall, if you typically don’t multitask, a dual-CPU system doesn’t make sense if you’re working with DV, HDV, AVCHD, or DVCPRO-HD video formats, but it does if you’re working with the other formats. If you frequently multitask with CPU-intensive applications, such as video encoding, DVD authoring, or rendering in Adobe After Effects, a dual-CPU system makes sense irrespective of video format.
Single-CPU vs. Dual-CPU Performance—Video Encoding
To test single-CPU and dual-CPU video-encoding performance, I worked with four programs: Adobe Media Encoder, Harmonic/Rhozet ProMedia Carbon (formerly Carbon Coder), Telestream Episode Engine, and Sorenson Squeeze. I created identical encoding tasks for each program on the two workstations and tested encoding time.
Note that I intentionally used different source files and presets for each encoding tool, so you should not draw any conclusions regarding comparative performance from these tests. It simply takes too long to create a fair, apples-to-apples encoding comparison and that’s not the point of this article. The only thing similar about all encodes is that they all involved the H.264 codec. Table 2 tells the tale.
As you can see, Squeeze, Episode Engine and Rhozet ProMedia Carbon produced the same files in very close to half the time, nearly doubling throughput. In comparison, Adobe Media Encoder saw only a 44% drop in encoding time. Let’s explore why.
Adobe Media Encoder is a serial encoder, encoding one file to completion before starting another. In comparison, the other three tools encode files in parallel with no restriction, which allows them to make more efficient use of the available CPU resources. The obvious lesson is that encoders with unlimited parallel encoding capabilities are the most efficient with the additional CPU cores that a dual-CPU system delivers.
Note, however, that both Sorenson and Telestream sell versions of their products that either can’t encode in parallel (Squeeze 8 Lite and Episode) or can only encode two files simultaneously (Episode Pro). With these tools, it’s likely that the benefits of the second CPU would be much lower.
In addition, I encoded to H.264 format because most H.264 codecs are multithreaded, or able to use multiple cores even when encoding serially, or one file at a time. In comparison, the VP6 codec is relentlessly single-threaded and will use only a single core irrespective of available resources, which can almost completely negate the benefit of additional CPU resources. For example, I reran the Adobe Media Encoder trials using the VP6 codec, and the performance advantage produced by the Z600 dropped from 44% to 7%, with the Z400 producing the files in 15:50 compared to 14:48 for the Z600.
To make a long story short, if you’re working with an encoder with unlimited parallel encoding capabilities, you should be able to efficiently leverage all available cores if you encode in high volume, regardless of the codec. With serial encoders, or with limited parallel encoding capabilities, your results will vary according to the codec. Codecs that are efficiently multithreaded will likely take advantage of the multiple cores, while codecs that are not, won’t.
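The reasoning above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: it ignores disk I/O, memory, and the fact that real multithreaded codecs do not scale perfectly, and the function and parameter names are hypothetical rather than taken from any encoder. It estimates the upper bound on how many cores an encoder can keep busy, given how many files it encodes at once and how many threads the codec uses per file.

```python
def busy_cores(total_cores: int, threads_per_job: int,
               max_parallel_jobs: int, queued_jobs: int) -> int:
    """Upper bound on busy cores (toy model; ignores I/O and scaling losses)."""
    running_jobs = min(max_parallel_jobs, queued_jobs)
    return min(total_cores, running_jobs * threads_per_job)


Z400_CORES = 4   # one four-core CPU
Z600_CORES = 8   # two four-core CPUs
QUEUE = 20       # assumed number of files waiting to be encoded

# Single-threaded codec (VP6-style) in a serial encoder: one core either way.
print(busy_cores(Z400_CORES, 1, 1, QUEUE), busy_cores(Z600_CORES, 1, 1, QUEUE))

# Same codec in an encoder limited to two simultaneous files.
print(busy_cores(Z400_CORES, 1, 2, QUEUE), busy_cores(Z600_CORES, 1, 2, QUEUE))

# Same codec in an encoder with unlimited parallel encoding.
print(busy_cores(Z400_CORES, 1, QUEUE, QUEUE), busy_cores(Z600_CORES, 1, QUEUE, QUEUE))
```

In this model, the second CPU only adds busy cores once the encoder runs enough simultaneous jobs, or the codec uses enough threads per job, to occupy more than four cores.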
Sandy Bridge and Quick Sync Video
Sandy Bridge is the CPU architecture that succeeded Intel Corp.’s highly successful Nehalem line of CPUs. One of Sandy Bridge’s most compelling features is Quick Sync Video: dedicated circuits on the CPU that accelerate encoding and decoding in programs that support it. In this regard, Quick Sync Video is a competitor to the CUDA architecture in NVIDIA GPUs, which is discussed later, and to Advanced Micro Devices, Inc.’s ATI Stream technology, which I won’t discuss because it’s not widely supported in the encoding and production workspace.
Today, the most prominent professional encoding programs that support Quick Sync Video are Microsoft’s Expression Encoder 4 Pro SP2 and Wowza Media Systems’ Transcoder AddOn. MainConcept, a subsidiary of Rovi Corp., has also released a codec that is accelerated by Quick Sync, which will likely appear in products from multiple licensees in 2012. For this reason, if you’re buying a new workstation and want to access Quick Sync acceleration, make sure that the CPUs are of the Sandy Bridge architecture, which Intel calls the “2nd generation Intel Core Processor family.”
There are multiple families of Sandy Bridge chips, including the Core i3, i5 and i7. The i7 is the workstation class of CPUs; it features Hyper-Threading technology, a larger L3 cache, and other more technical advantages. If you’re looking for the fastest possible Sandy Bridge chip, buy an i7.
One caveat with Sandy Bridge is the lack of support for ECC (error correction code) memory, which can minimize small bit errors that can cause random system crashes or other problems. Though slightly slower than Sandy Bridge CPUs, the fastest Nehalem-style CPUs can use ECC memory, so they remain favored by some producers.
As you probably know, one of the key benefits of a 64-bit operating system is the ability to use more RAM than the 4GB that 32-bit systems can address. But how much RAM is enough? Let’s explore.
To find the optimal RAM configuration for editing with Adobe CS 5.5, I ran the same multiple-format tests shown in Table 1 on the HP Z400 and Z600 systems in configurations of 24GB, 12GB and 6GB of RAM. You can read the results in another article I wrote, titled “RAM Requirements for Adobe CS5.5.”
Here’s the CliffsNotes version of this article. On the dual-CPU Z600, increasing RAM from 6GB to 12GB improved performance in all formats, but the performance boost maxed out at 20% with test projects involving footage from the Red camera. The next highest number was 9% for DSLR footage. Increasing RAM from 12GB to 24GB delivered a maximum performance boost of 4% and actually increased the rendering time for some formats. At retail, the cost of increasing RAM from 6GB to 12GB is $360, which is worth considering. Paying $1,320 to jump from 6GB to 24GB was clearly not cost justified.
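One way to put the Z600 numbers in perspective is to work out the cost per percentage point of best-case improvement. The short Python sketch below does that using only the prices and maximum gains quoted above. The $960 figure for the 12GB-to-24GB step is my assumption, obtained by subtracting the two quoted upgrade prices, and the variable and function names are hypothetical.

```python
# Retail upgrade prices quoted above (US dollars).
cost_6_to_12 = 360
cost_6_to_24 = 1320
# Assumption: the 12GB-to-24GB step costs the difference between the two.
cost_12_to_24 = cost_6_to_24 - cost_6_to_12

# Best-case rendering gains on the dual-CPU Z600 (percent).
best_gain_6_to_12 = 20   # Red footage
best_gain_12_to_24 = 4


def dollars_per_point(cost: float, gain_percent: float) -> float:
    """Cost of each percentage point of best-case improvement."""
    return cost / gain_percent


print(cost_12_to_24)                                         # 960
print(dollars_per_point(cost_6_to_12, best_gain_6_to_12))    # 18.0
print(dollars_per_point(cost_12_to_24, best_gain_12_to_24))  # 240.0
```

Even in the best case, the second step costs roughly 13 times as much per point of improvement as the first, and most formats gained less than the maximum.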
With the single-CPU Z400, the maximum performance boost when jumping from 6GB to 12GB was 3% for DV footage, while jumping from 12GB to 24GB boosted performance a maximum of 4% for footage from Sony’s XDCAM EX. Even upgrading from 6GB to 12GB on this single-CPU system makes little sense.
Note that if you’re working with extremely complex, multiple-layered projects, your RAM needs might be higher, but my test projects included a range of synthetic and real-world projects, some with up to eight layers. In addition, running multiple 64-bit programs simultaneously, such as Adobe After Effects and Adobe Premiere Pro, particularly via Dynamic Link, could also push RAM needs higher. For simple single- or dual-track projects, however, 6GB should suffice for a single-CPU workstation, while 12GB would be a safe starting point for a dual-CPU workstation.
What about RAM requirements for encoding? I tracked RAM usage of the four encoding applications that I discussed earlier, and you can see the results in Table 2. Though both computers had 24GB of RAM, most programs were very efficient in RAM usage, and the same 6GB for a single-CPU system and 12GB for a dual-CPU system seems appropriate.
From a video production perspective, the graphics market seems divided into two camps: NVIDIA and everyone else. More specifically, NVIDIA’s CUDA technology can speed rendering in Adobe Premiere Pro and Adobe Media Encoder, while also accelerating encoding in programs such as Microsoft Expression Encoder and Sorenson Squeeze. However, while rendering with CUDA in Adobe Premiere Pro is all good, the earliest versions of the GPU-accelerated MainConcept codec have exhibited some quality issues. But I get ahead of myself. Let’s take the editing and encoding markets one by one.
Briefly, the GPU, which stands for graphics processing unit, is the chip that runs the graphics card. Much as Intel built Quick Sync Video into Sandy Bridge, NVIDIA included the Compute Unified Device Architecture, or CUDA, in some of its faster GPUs. This is a separate programmable architecture that developers can use to perform non-graphics functions, such as timeline rendering or encoding video. With CS 5.0, Adobe added CUDA-based GPU acceleration to Premiere Pro to speed rendering for preview and final output; the company enhanced CUDA support in CS 5.5 by adding more GPU-accelerated effects.
If you’re buying a graphics card for CS 5.5, note that there are a limited number of supported cards, as you can see from the Adobe website. However, irrespective of the project type or the number of CPUs in your system, CUDA acceleration is always beneficial, not only in terms of rendering speed but also in rendering quality. And yes, to reach this conclusion, I performed a series of tests on multiple projects using the HP Z400 and Z600 systems that you can read about at Streaming Learning Center.
On the encoding side, however, the results are not so positive. Both Microsoft Expression Encoder and Sorenson Squeeze use the GPU-accelerated MainConcept H.264 codec. While encoding with the GPU is more than 100% faster, the quality is noticeably poorer than software-only encoding in some sequences. It’s often difficult to predict how and when these deficits will occur, which makes it tough to recommend using GPU-accelerated encoding.
Still, MainConcept is aware of the problem and is working to bring parity to its GPU acceleration, and Microsoft has reported that GPU-accelerated encoding quality has improved in its latest version, though I haven’t confirmed this. So while buying NVIDIA may not pay dividends in the short term from a pure encoding perspective, it will almost certainly do so in the long term. So I recommend an NVIDIA card with CUDA acceleration for all of your workstation purchases.