Desperate Times Call for a Desperate Nintendo
This week saw Nintendo’s long-awaited 4.0 Wii U update, which in typical Nintendo fashion set out to implement all changes in the most useless manner possible. The update did not provide owners with their long-wanted [and oft half-heartedly promised] unified eShop accounts, but instead implemented Off-TV play for the Wii U’s Wii-mode. Before one becomes too excited, it is important to understand that Nintendo implemented this in such a way as to be effectively useless. There are a significant number of Wii games that are compatible with the Classic Controller, and thus perfectly suited to being played on the Wii U Gamepad – but sadly the Gamepad can only be used for outputting video, while its buttons cannot be used to play Wii software. This effectively means that Off-TV mode can only be used by propping up the Gamepad in the same room as one’s Wii U set-up, and then waggling the Wiimote in the direction of the TV as one usually would – which raises the question: why not just continue to play Wii games on the TV? At least the TV can be properly viewed at a distance greater than an arm’s length.
In other Wii U news, there has been an update to last week’s news that Nintendo products have all but vanished from the UK’s biggest supermarket retailer, Tesco. This week Nintendo have revealed a freshly-minted agreement with Tesco, which will see Nintendo purchase floor-space from them in order to continue selling Wii U in the UK. Additionally, Tesco will be sending out a five-page flyer to all 300,000 customers who previously bought a Wii through the supermarket chain. This marketing push was sorely needed as, in the words of Nintendo’s UK marketing director, “many people out there that don’t know what this is. There was a big misconception at launch about what Wii U is, and one of the big messages is that this is a new console and a new controller”. That being said, it would [presumably] have been much more effective to send out this marketing material to 3DS owners, since large portions of the Wii’s market no longer exist, and cannot be realistically expected to return to the Nintendo fold.
In one final piece of Wii U news, Donkey Kong Country: Tropical Freeze has been delayed until February of 2014, making Nintendo’s meagre holiday line-up that much less impressive. One wishes to be able to feign surprise at this news, yet it is sadly typical of how Nintendo has been caught wrong-footed by the demands of HD development. Right now Nintendo appears to be pinning all their hopes on Super Mario 3D World in order to turn around the fortunes of the Wii U this Christmas.
Even Less GPU Available to Xbone Developers than Previously Thought
For weeks now, anecdotal accounts have been surfacing on the internet of developers complaining about how relatively weak the Xbone’s GPU is. This, combined with the recent revelation that many Xbone exclusives will not run at native 1080p, has painted a troubling picture for Microsoft, but it seems that this performance shortfall may be at least partially self-imposed. This week Microsoft have revealed that a full 10% of the Xbone’s GPU resources are partitioned off for the use of the system’s OS, apps, and Kinect. The PS4’s GPU likely also quarantines a portion of its processing power for such mundane tasks, yet this is thought to amount to significantly less than 10% of GPU resources. Microsoft have stated that this allocation of GPU power was deliberately conservative on their part, and have signalled their intention to eventually free up a little more processing power for Xbone developers, yet it is anyone’s guess as to when, and by how much, the Xbone’s graphics processing capabilities will be bolstered.
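To put that 10% carve-out in concrete terms, here is a quick back-of-the-envelope sketch using the Xbone’s widely quoted paper figure of roughly 1.31 TFLOPS [the exact reserved share and whether it is a flat 10% at all times are assumptions based on Microsoft’s statement, not measured numbers]:

```python
# What a fixed 10% system reservation costs in raw shader throughput.
# The 1310 GFLOPS figure is the widely reported paper spec for the Xbone
# (12 CUs x 64 lanes x 2 ops/cycle x 853 MHz); the flat 10% split is an
# illustrative assumption based on Microsoft's own statement.

XBONE_GFLOPS = 1310          # paper peak throughput of the Xbone GPU
SYSTEM_RESERVATION = 0.10    # share carved off for the OS, apps and Kinect

reserved = XBONE_GFLOPS * SYSTEM_RESERVATION
available = XBONE_GFLOPS - reserved

print(f"Reserved for the system: {reserved:.0f} GFLOPS")
print(f"Available to games:      {available:.0f} GFLOPS")
```

In other words, games are left with roughly 1.18 TFLOPS of an already modest GPU, which goes some way towards explaining the sub-1080p launch titles.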
On the subject of Xbone titles falling well short of a 1080p resolution, two Xbone designers, Andrew Goosen and Nick Baker, have opened up this week about some of the decisions that have been made in planning Microsoft’s next console. To listen to them, a sub-optimal resolution is merely part of the Xbone feature-set:
“We’ve chosen to let title developers make the trade-off of resolution vs. per-pixel quality in whatever way is most appropriate to their game content. A lower resolution generally means that there can be more quality per pixel. With a high quality scaler and anti-aliasing and render resolutions such as 720p or ‘900p’, some games look better with more GPU processing going to each pixel than to the number of pixels; others look better at 1080p with less GPU processing per pixel.”
‘We might have fewer pixels, but our pixels will be of superior quality!’ – if that is not the mantra of a snake-oil salesman, then one does not know what is. Too bad for Microsoft that the PS4 can handle the equivalent visuals at full resolution. It has been speculated on NeoGAF that one reason for a game’s failure to render at optimal output may simply be that the Xbone’s 32MB of ESRAM is not terribly friendly to 1080p visuals, seeing as four 1080p render targets would require 48MB – thus developers must either go with a lower resolution [900p] or use the console’s much slower DDR3 RAM.
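The arithmetic is easy enough to check oneself. The sketch below assumes a typical deferred-rendering set-up of four 32-bit colour targets plus a 32-bit depth buffer [an illustrative layout, not that of any particular game – the exact total, such as the 48MB figure quoted above, depends entirely on which buffer formats one assumes]:

```python
# Back-of-the-envelope check of the NeoGAF speculation: does a typical
# deferred-rendering G-buffer at 1080p fit in the Xbone's 32MB of ESRAM?
# The layout (four 32-bit colour targets + one 32-bit depth buffer) is an
# illustrative assumption, not the setup of any actual shipping title.

def render_target_mb(width, height, bytes_per_pixel):
    """Size of a single render target in MB (1 MB = 1024 * 1024 bytes)."""
    return width * height * bytes_per_pixel / (1024 * 1024)

ESRAM_MB = 32

for label, (w, h) in {"1080p": (1920, 1080), "900p": (1600, 900)}.items():
    colour = 4 * render_target_mb(w, h, 4)  # four 32-bit colour targets
    depth = render_target_mb(w, h, 4)       # one 32-bit depth/stencil buffer
    total = colour + depth
    verdict = "fits" if total <= ESRAM_MB else "does NOT fit"
    print(f"{label}: {total:.1f} MB of render targets -> {verdict} in ESRAM")
```

Under these assumptions a 1080p G-buffer comes to roughly 40MB and spills out of ESRAM, while the same buffers at 900p squeeze in at around 27MB – which is precisely the trade-off the ‘900p’ crowd appear to be making.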
Bandwidth Bottleneck Prevents the Xbone from Effectively Utilising Additional CUs
Another tidbit of news to come out of this week’s Xbone interviews is the surprising fact that the design team apparently experimented with an Xbone GPU configuration which featured two additional CUs [compute units] before ultimately deciding to upclock the GPU by a modest 6.6%. Every Xbone unit sold will ship with a GPU containing fourteen CUs, yet two of these will be left inactive to improve the fabrication yield of the Xbone’s APU. Conventional wisdom would indicate that enabling the two additional CUs should provide roughly seventeen percent more processing power, yet the design team found that increasing the clockspeed actually made for a more appreciable impact, seemingly indicating that bandwidth bottlenecks were preventing the additional CUs from being adequately fed.
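The paper-spec arithmetic behind that trade-off is straightforward, assuming the standard GCN layout of 64 shader lanes per CU and two floating-point operations per lane per cycle [the 800 MHz base clock is the widely reported pre-upclock figure]:

```python
# Paper-spec arithmetic behind Microsoft's choice: 12 GCN compute units at
# a 6.6% higher clock versus 14 CUs at the original clock. Assumes the
# standard GCN layout: 64 shader lanes per CU, 2 FLOPs per lane per cycle.

LANES_PER_CU = 64    # shader lanes in one GCN compute unit
FLOPS_PER_CYCLE = 2  # a fused multiply-add counts as two floating-point ops

def gflops(cus, clock_mhz):
    """Theoretical peak throughput of a GCN GPU in GFLOPS."""
    return cus * LANES_PER_CU * FLOPS_PER_CYCLE * clock_mhz / 1000

base     = gflops(12, 800)  # original design: 12 CUs at 800 MHz
upclock  = gflops(12, 853)  # what shipped:    12 CUs at 853 MHz
extra_cu = gflops(14, 800)  # the experiment:  14 CUs at 800 MHz

print(f"12 CUs @ 800 MHz: {base:.0f} GFLOPS")
print(f"12 CUs @ 853 MHz: {upclock:.0f} GFLOPS (+{upclock / base - 1:.1%})")
print(f"14 CUs @ 800 MHz: {extra_cu:.0f} GFLOPS (+{extra_cu / base - 1:.1%})")
```

On paper the two extra CUs win handily [roughly 1434 versus 1310 GFLOPS], which is exactly why Microsoft’s measured result is so telling: the upclock also raises vertex rates, pixel fill and ESRAM bandwidth, whereas extra CUs merely add ALUs that the memory system apparently cannot keep fed.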
“Every one of the Xbox One dev kits actually has 14 CUs on the silicon. Two of those CUs are reserved for redundancy in manufacturing. But we could go and do the experiment – if we were actually at 14 CUs what kind of performance benefit would we get versus 12? And if we raised the GPU clock what sort of performance advantage would we get? And we actually saw on the launch titles – we looked at a lot of titles in a lot of depth – we found that going to 14 CUs wasn’t as effective as the 6.6 per cent clock upgrade that we did. Now everybody knows from the internet that going to 14 CUs should have given us almost 17 per cent more performance but in terms of actual measured games – what actually, ultimately counts – is that it was a better engineering decision to raise the clock. There are various bottlenecks you have in the pipeline that [can] cause you not to get the performance you want [if your design is out of balance].
“By fixing the clock, not only do we increase our ALU performance, we also increase our vertex rate, we increase our pixel rate and ironically increase our ESRAM bandwidth. But we also increase the performance in areas surrounding bottlenecks like the drawcalls flowing through the pipeline, the performance of reading GPRs out of the GPR pool, etc. GPUs are giantly complex. There’s gazillions of areas in the pipeline that can be your bottleneck in addition to just ALU and fetch performance.”
What Microsoft appear to be implicitly suggesting here is that the PS4’s commanding advantage in terms of CUs [+50%], shaders [+50%], texture units [+50%], and ROPs [+100%] will not actually give it much of an advantage against the Xbone’s bandwidth-starved and emaciated made-to-cost GPU.
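For those keeping score at home, those percentages fall straight out of the widely reported unit counts for the two consoles’ GCN GPUs:

```python
# The headline spec gap between the two consoles, expressed as the PS4's
# percentage advantage. Unit counts are the widely reported GCN
# configurations (PS4: 18 active CUs; Xbone: 12 active CUs).

specs = {
    # metric:         (PS4, Xbone)
    "compute units":  (18,   12),
    "shaders":        (1152, 768),
    "texture units":  (72,   48),
    "ROPs":           (32,   16),
}

for metric, (ps4, xbone) in specs.items():
    advantage = ps4 / xbone - 1
    print(f"{metric:>13}: PS4 {ps4} vs Xbone {xbone} -> +{advantage:.0%}")
```

Note that raw unit counts ignore the Xbone’s slightly higher clock, which claws back a few percent – but nowhere near enough to close a 50–100% gap in hardware.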
“Just like our friends we’re based on the Sea Islands family. We’ve made quite a number of changes in different parts of the areas. The biggest thing in terms of the number of compute units, that’s been something that’s been very easy to focus on. It’s like, hey, let’s count up the number of CUs, count up the gigaflops and declare the winner based on that. My take on it is that when you buy a graphics card, do you go by the specs or do you actually run some benchmarks?
“If you go to VGleaks, they had some internal docs from our competition. Sony was actually agreeing with us. They said that their system was balanced for 14 CUs. They used that term: balance. Balance is so important in terms of your actual efficient design. Their additional four CUs are very beneficial for their additional GPGPU work. We’ve actually taken a very different tack on that. The experiments we did showed that we had headroom on CUs as well. In terms of balance, we did index more in terms of CUs than needed so we have CU overhead. There is room for our titles to grow over time in terms of CU utilisation, but getting back to us versus them, they’re betting that the additional CUs are going to be very beneficial for GPGPU workloads. Whereas we’ve said that we find it very important to have bandwidth for the GPGPU workload and so this is one of the reasons why we’ve made the big bet on very high coherent read bandwidth that we have on our system.”
Amusingly, whether through ignorance or mischief, the Xbone’s engineers have cited a debunked rumour in order to justify their downplaying of the benefits of additional CUs. When it was revealed that four of the PS4’s CUs had been modified to boost their performance in GPGPU tasks, it was speculated that these four CUs might be reserved for compute, yet it was not long before Sony themselves came out and debunked this rumour. Eighteen of the PS4’s eighteen CUs will be available for graphics processing, and of those eighteen, all eighteen will likely prove useful, as the console is not starved of bandwidth.