Short Bytes: NVIDIA GeForce GTX 980 in 1000 Words
To call the launch of NVIDIA's Maxwell GM204 part impressive is something of an understatement. You can read our full coverage of the GTX 980 for the complete story, but here's the short summary. Without the help of a manufacturing process shrink, NVIDIA and AMD are both looking at new ways to improve performance. The Maxwell architecture initially launched earlier this year with GM107 and the GTX 750 Ti and GTX 750, and with it we had our first viable mainstream GPU of the modern era that could deliver playable frame rates at 1080p while using less than 75W of power. The second generation Maxwell ups the ante by essentially tripling the CUDA core count of GM107, all while adding new features and still maintaining the impressive level of efficiency.
It's worth pointing out that "Big Maxwell" (or at least "Bigger Maxwell") is enough of a change that NVIDIA has bumped the model numbers from the GM100 series to GM200 series this round. NVIDIA has also skipped the desktop 800 line completely and is now in the 900 series. Architecturally, however, there's enough change going into GM204 that calling this "Maxwell 2" is certainly warranted.
NVIDIA is touting a 2X performance per Watt increase over GTX 680, and they've delivered exactly that. Through a combination of architectural and design improvements, NVIDIA has moved from 192 CUDA cores per SMX in Kepler to 128 CUDA cores per SMM in Maxwell, and a single SMM is still able to deliver around 90% of the performance of an SMX of equivalent clocks. Put another way, NVIDIA says the new Maxwell 2 architecture is around 40% faster per CUDA core than Kepler. What that means in terms of specifications is that GM204 only needs 2048 CUDA cores to compete with – and generally surpass! – the performance of GK110 with its 2880 CUDA cores, which is used in the GeForce GTX 780 Ti and GTX Titan cards.
In terms of new features, some of the changes with GM204 come on the software/drivers side of things while other features have been implemented in hardware. Starting with the hardware side, GM204 now implements the full set of D3D 11.3/D3D 12 features, where previous designs (Kepler and Maxwell 1) stopped at full Feature Level 11_0 with partial FL 11_1. The new features include Rasterizer Ordered Views, Typed UAV Load, Volume Tiled Resources, and Conservative Rasterization. Along with these, NVIDIA is also adding hardware features to accelerate what they're calling VXGI – Voxel accelerated Global Illumination – a forward-looking technology that brings GPUs one step closer to doing real-time path tracing. (NVIDIA has more details available if you're interested in learning more).
NVIDIA also has a couple new techniques to improve anti-aliasing, Dynamic Super Resolution (DSR) and Multi-Frame Anti-Aliasing (MFAA). DSR essentially renders a game at a higher resolution and then down-sizes the result to your native resolution using a high-quality 13-tap Gaussian filter. It's similar to super sampling, but the great benefit of DSR over SSAA is that the game doesn't have any knowledge of DSR; as long as the game can support higher resolutions, NVIDIA's drivers take care of all of the work behind the scenes. MFAA (please, no jokes about "mofo AA") is supposed to offer essentially the same quality as 4x MSAA with the performance hit of 2x MSAA through a combination of custom filters and looking at previously rendered frames. MFAA can also function with a 4xAA mode to provide an alternative to 8x MSAA.
The above is all well and good, but what really matters at the end of the day is the actual performance that GM204 can offer. We've averaged results from our gaming benchmarks at our 2560×1440 and 1920×1080 settings, as well as our compute benchmarks, with all scores normalized to the GTX 680. Here's how the new GeForce GTX 980 compares with other GPUs. (Note that we've omitted the overclocking results for the GTX 980, as it wasn't tested across all of the games, but on average it's around 18% faster than the stock GTX 980 while consuming around 20% more power.)
Wow. Obviously there's not quite as much to be gained by running such a fast GPU at 1920×1080, but at 2560×1440 we're looking at a GPU that's a healthy 74% faster on average compared to the GTX 680. Perhaps more importantly, the GTX 980 is also on average 8% faster than the GTX 780 Ti and 13.5% faster than AMD's Radeon R9 290X (in Uber mode, as that's what most shipping cards use). Compute performance sees some even larger gains over previous NVIDIA GPUs, with the 980 besting the 680 by 132%; it's also 16% faster than the 780 Ti but "only" 1.5% faster than the 290X – though the 290X still beats the GTX 980 in Sony Vegas Pro 12 and SystemCompute.
If we look at the GTX 780 Ti, on the one hand performance hasn't improved so much that we'd recommend upgrading, though you do get some new features that might prove useful over time. For those that didn't find the price/performance offered by GTX 780 Ti a compelling reason to upgrade, the GTX 980 sweetens the pot by dropping the MSRP down to $549, and what's more it also uses quite a bit less power:
This is what we call the trifecta of graphics hardware: better performance, lower power, and lower prices. When NVIDIA unveiled the GTX 750 Ti back in February, it was ultimately held back by performance while its efficiency was a huge step forward; it seemed almost too much to hope for that sort of product in the high performance GPU arena. NVIDIA doesn't disappoint, however, dropping power consumption by 18% relative to the GTX 780 Ti while improving performance by roughly 10% and dropping the launch price by just over 20%. If you've been waiting for a reason to upgrade, GeForce GTX 980 is about as good as it gets, though the much less expensive GTX 970 might just spoil the party. We'll have a look at the 970 next week.