Thank you.
Cautiously proceeding and relying on what we know, not on marketing spin. A very good explanation, and a very intelligent article. Thank you.
Thank you.
That's (including what I cut out) a very reasonable way of looking at it. What we know is they've said it's highly customized and not really a Jaguar. What we don't know is exactly how. Your approach of saying "we don't know specifics, let's ballpark with the old values" is reasonable as long as you specify that, which you've done well. There's a leap between "we don't have specifics, let's estimate with the prior one" and "we don't have specifics, so it *IS* the prior one" that separates your work from some other opinions.
Well, you need to, otherwise you cannot make any estimate. But I doubt MS will ever give us the actual changes they made without hiding them in buzzwords; it has been a bad practice in the console industry ever since "Mode 7" and "Blast Processing". Still, I'd love to learn about the changes, and we do know one thing about this Jaguar, something some people at NeoGaf already attacked as "useless!". But I find it quite interesting and want to learn more: its power management. It might not be "sexy", but it may well be the reason it can run 200 MHz higher than what is considered the maximum for Jaguar CPUs. Fine-tuning the voltage of every single chip can only be done if it is cost effective; you cannot spend hours per CPU tuning the voltage like overclockers do when you have a mass-produced product. So what's the catch?
What allows them to do this is the "Hovis Method". It sounds like a buzzword, but something tells me that the way a Ryzen can lower the voltage and clock speed of smaller sections of its IC to save power is also applied here. That would allow Microsoft to run a quick test on each CPU coming off the wafer to see not only whether it is viable at 2.3 GHz at a decent temperature, but also where it can be clocked lower during low-demand scenarios, and maybe even to clock some sections a few percent lower or higher to deliver the specification needed for a Scorpio Engine without wasting energy.
They say each CPU will get its own "power profile", which strongly suggests this is the case. Now, this is not very "sexy" to most, but it is something that has my interest.
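To make the idea concrete, here is a toy sketch of what per-chip binning could look like. This is purely my own illustration, not Microsoft's actual test flow: the function name, the measurement format, and the voltage numbers are all made up for the example. The point is just that a fast automated test can assign each die the lowest voltage at which it still reaches the target clock.

```python
# Toy illustration of per-chip power profiling (hypothetical, not MS's process):
# each die is quickly tested at a few voltages, and the lowest voltage that
# still sustains the target clock becomes that console's "power profile".

TARGET_CLOCK_MHZ = 2300  # Scorpio's CPU clock target

def pick_power_profile(measurements):
    """measurements: list of (voltage_mV, max_stable_clock_MHz) pairs from a
    quick post-wafer test. Returns the lowest voltage that still reaches the
    target clock, or None if the die fails binning entirely."""
    viable = [v for v, clk in measurements if clk >= TARGET_CLOCK_MHZ]
    return min(viable) if viable else None

# A "good" die reaches 2.3 GHz at a lower voltage than a leakier one,
# so it ships with a more frugal profile instead of a worst-case voltage.
good_die = [(900, 2100), (950, 2300), (1000, 2400)]
leaky_die = [(900, 1900), (950, 2150), (1000, 2300)]
print(pick_power_profile(good_die))   # 950
print(pick_power_profile(leaky_die))  # 1000
```

The contrast with a traditional console is that the old approach would give every chip the worst-case voltage (here, 1000 mV), wasting power on the good dies.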
And consider this: what if they could apply this to the old Xbox One S APU as well? A redesign of its APU to incorporate some of the lessons learned in the design of the Scorpio Engine would mean a more energy-efficient APU without having to do a node shrink, meaning an even smaller Xbox One S.
Dehnus, great article. Can you explain the sudden "secret weapon" of the Pro doing FP16 & FP32 while Scorpio only does FP32? It's sort of a hot topic on another forum that FP16 makes the Pro 8.4 TF.
It's not a secret weapon, really. FP16 is from a time of fixed pipelines, low memory, and having the CPU calculate all of your physics and transformations. It is still used in mobile, but considering the screen size and the nature of most mobile games, you wouldn't see the errors it produces. Compared to single-precision floating point, though, there is a major cut in precision. So what Sony wants to push (offloading computation to the GPU, one of their major selling points with the PS4) almost always requires FP32, as the range of an FP16 is very limited and the precision even worse. It can, however, be handy for some framebuffer operations, and that is probably exactly why they added it: if you are working on a framebuffer to scale it up via checkerboarding, FP16 will give you performance gains, especially if you can do two operations for the price of one.
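How limited is FP16, exactly? A short numpy demonstration (numpy's `float16` implements the same IEEE half-precision format GPUs use) makes the range and precision ceilings visible: the largest finite value is 65504, and above 2048 not even every integer is representable.

```python
import numpy as np

# IEEE half precision: 1 sign bit, 5 exponent bits, 10 fraction bits.
print(np.finfo(np.float16).max)      # 65504.0 -- largest finite FP16 value
print(np.float16(70000))             # inf     -- overflows immediately
print(np.float16(2049) == 2048)      # True    -- integers above 2048 start to merge
print(np.float32(2049) == 2049)      # True    -- FP32 is still exact here
```

With world-space positions or physics easily exceeding these limits, it's clear why GPU compute work generally sticks to FP32, while pixel-scale framebuffer math can tolerate FP16.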
But it doesn't magically make the PS4 Pro 8.4 TFLOPS, far from it, and it never was a secret weapon. I agree with MS's choice in removing it: if it reduces the complexity of the hardware and saves even 1% of die space, it only serves to make their APU cheaper. The CU count and clock rate are what provide real increases across the board, giving them full-4K Xbox One S titles with a little headroom for 4K assets.
So whether they wanted to reduce complexity or opted instead for efficiency upgrades like Vega's cache latency improvements, both would be preferable to the ability to run two FP16 operations in place of one FP32. Keep in mind that a lone FP16 value still occupies a full 32-bit register, so if you have one FP32 and one FP16 operation in the queue, you still need two cycles; the doubling only applies when two FP16 operations can actually be paired. Registers don't magically get wider.
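The "two for the price of one" trick is just bit packing: two 16-bit halves sitting in one 32-bit register. A small numpy sketch of the storage side of the idea (this only illustrates the layout, not the GPU's paired execution):

```python
import numpy as np

# Two FP16 values packed into the space of one 32-bit word, the way
# "rapid packed math" pairs them up. Note a lone FP16 still occupies a
# full 32-bit register slot; the speedup exists only when two FP16
# operations can be issued together.
pair = np.array([1.5, -2.25], dtype=np.float16)  # 2 x 16 bits
packed = pair.view(np.uint32)                    # reinterpret as 1 x 32 bits
print(packed.nbytes)                             # 4 -- same storage as one FP32
unpacked = packed.view(np.float16)
print(unpacked[0], unpacked[1])                  # 1.5 -2.25 -- recovered exactly
```

So an FP32 op mixed with a stray FP16 op still costs two cycles, which is exactly why the feature only pays off in workloads (like checkerboard upscaling) where FP16 operations come in steady pairs.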