In the first part of this article, we took a gander at fillrate, how it is calculated and why it is important. With the next one in the series, I would like to write a bit about memory: its size, its bandwidth and its latency. We will compare these for Project Scorpio against the original Xbox One and against its main competition from Sony.
This is a part of the specification that is pretty well known for all machines, and there is little to no “secret sauce” to be cleared up. There are some things, like compression and using 16-bit precision rather than 32-bit precision, but all these do is make better use of the available bandwidth and reduce the storage footprint within the system’s memory. They do not change the speed of the memory itself, so we can compare all machines almost 1 to 1.
So following this introduction, we will again use the familiar table comparing the known specifications, followed by a clarification of terms. After that we will focus on the two main comparisons: bandwidth and memory size.
Like before, we begin with a table listing what is known about the specs of each machine. Little has been announced since then, especially in the area of memory and bandwidth. This makes most of these machines easily comparable:
As shown in the table above (Table 1), it is very easy to compare the systems, with the possible exception of the original Xbox One and Xbox One S. Their eSRAM pool makes it more difficult to compare them to the others, but far from impossible. It is also why Microsoft’s own offerings were able to reach 1080p or 900p, even at 60 FPS, while third-party developers seem to struggle. There is a bit more going on than just the bandwidth and downgrading a game from PS4 to Xbox One levels.
This has to do with only being able to fit part of the framebuffer in the eSRAM pool, or with using a forward or partly forward rendering solution. Since most engines use deferred rendering and don’t supply or support the tools Microsoft uses itself, it is not strange that resolutions are lower. This applies to the resolution of effects, for instance, but also to precision. Simply put, an FP16 value requires less space than an FP32 value, but there are trade-offs in render quality.
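To put some rough numbers on that last point, here is a minimal sketch in Python. It assumes a 1920 by 1080 render target with four channels per pixel, and it treats the 32MB eSRAM pool of the Xbox One as 32 × 1024 × 1024 bytes. The function name and the layout are my own illustration, not something taken from an actual engine.

```python
# Illustrative only: the render target layout below is an assumption.
ESRAM_BYTES = 32 * 1024 * 1024  # assumes the 32MB pool means 32 MiB


def render_target_bytes(width: int, height: int, channels: int, bytes_per_channel: int) -> int:
    """Storage needed for one uncompressed render target."""
    return width * height * channels * bytes_per_channel


# Same 1080p target, once with 16-bit and once with 32-bit channels.
fp16_target = render_target_bytes(1920, 1080, channels=4, bytes_per_channel=2)
fp32_target = render_target_bytes(1920, 1080, channels=4, bytes_per_channel=4)

for name, size in (("FP16", fp16_target), ("FP32", fp32_target)):
    share = size / ESRAM_BYTES * 100
    print(f"{name}: {size:,} bytes, {share:.1f}% of the eSRAM pool")
```

Under these assumptions a single FP32 target nearly fills the pool on its own, while the FP16 version takes about half of it.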
Put simply: Microsoft knows its own hardware’s strengths and weaknesses and how to deal with them, whereas third parties want a quick way to port between systems. This is far from being a “lazy developer”; it is more about cost effectiveness, and the differences often are not very noticeable to your average customer.
But let’s first concentrate on a small clarification of bandwidth and latency.
Bandwidth and latency
What is bandwidth:
A common misconception is that bandwidth equals speed, but bandwidth is better compared to the size of a truck or the number of lanes on a highway. The more lanes you have, the more you can transport, but that isn’t your top speed. Simply put, speed is a measure of distance over time, while bandwidth is a measure of amount over time. We can demonstrate this with a simple example using the lanes described earlier:
Let us imagine that we have a 4-lane highway from an ice-cream factory to the frozen storage in another part of the state. Let us also imagine that both the storage and the factory have 2 entry points for trucks. We assume that all trucks drive at the same speed and are driven by perfect drivers, or automated systems, that make sure they always reach their destination at their best possible speed. We also assume that all loading and unloading is done perfectly, as “error handling” is a whole different subject deserving its own article, not related to Project Scorpio.
Now the logistics controllers at each of these two points are able to handle two trucks per hour, either incoming or outgoing. This would be called “full duplex”, as opposed to half duplex, where a controller can handle only one outgoing or one incoming truck at a time. A half-duplex controller locks down the road to allow traffic from one direction only, either to or from the ice-cream factory. Full duplex means the road is not locked down and trucks can pass each other, whether they are incoming or outgoing.
In drawing 1 we can see our 4 lanes and four trucks going about their hourly business; each hour we can have a maximum of four trucks on the road in either direction. What this means is that at maximum we can fill up 4 trucks with ice-cream, send 4 empty trucks back, or do any combination of the two within one hour. So our effective bandwidth is our number of lanes.
So our maximum bandwidth per day, provided the trucks run for the whole workday and we do not have any bottlenecks, is defined by the number of trucks per hour times the number of hours per workday, or:
trucks per workday = trucks per hour × hours per workday
Using an 8-hour workday, that would be 2 times 2 times 8 (2 entry points, each handling 2 trucks per hour, for 8 hours), or 32 trucks in either direction per workday.
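For those who prefer code over ice-cream, the same sum can be written as a tiny Python sketch. The function and parameter names are my own and exist only for this example.

```python
def trucks_per_workday(entry_points: int, trucks_per_entry_per_hour: int, hours: int) -> int:
    """Maximum number of trucks that can be handled in one workday."""
    trucks_per_hour = entry_points * trucks_per_entry_per_hour
    return trucks_per_hour * hours


# 2 entry points, 2 trucks per hour each, 8-hour workday.
print(trucks_per_workday(entry_points=2, trucks_per_entry_per_hour=2, hours=8))  # 32
```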
Simple, right? Well, that is also how we calculate our memory, network or disk bandwidth. We just have different time frames, and our trucks are bits. But let’s take Project Scorpio’s numbers and plug them in. We know that Scorpio can do 6800 million memory transfers per second (*1). It has 384 lanes (a 384-bit memory bus), on which there are no trucks carrying ice-cream to eager children, just boring old electric signals :).
We can plug these values into the same simple formula. We just replace trucks per hour with transfers per second, and rather than trucks per workday we want the number of gigabytes per second, so we divide by 8 (bits per byte) times 1000 (megabytes per gigabyte):
bandwidth (GB/s) = transfers per second (MT/s) × bus width (bits) / (8 × 1000)
Now that we have this, we can plug in our values to get our maximum theoretical throughput:
6800 × 384 / (8 × 1000) = 326.4 GB/s
If we check this value against the one from the Digital Foundry presentation, it is correct. We can also double-check the formula by calculating the bandwidth of a PS4 Pro, which has a narrower bus of 256 bits, and we’ll get:
6800 × 256 / (8 × 1000) = 217.6 GB/s
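The calculation above is easy to reproduce yourself. Below is a minimal Python sketch of it; the function name is my own, and it simply assumes the transfer rate is given in millions of transfers per second and the bus width in bits.

```python
def bandwidth_gb_per_s(mega_transfers_per_s: float, bus_width_bits: int) -> float:
    """Peak theoretical memory bandwidth in gigabytes per second."""
    megabits_per_s = mega_transfers_per_s * bus_width_bits
    megabytes_per_s = megabits_per_s / 8  # 8 bits per byte
    return megabytes_per_s / 1000         # 1000 megabytes per gigabyte


# Both consoles run their memory at the same speed; only the bus width differs.
print(f"Project Scorpio: {bandwidth_gb_per_s(6800, 384):.1f} GB/s")
print(f"PS4 Pro:         {bandwidth_gb_per_s(6800, 256):.1f} GB/s")
```

Keep in mind that this is a theoretical peak; it says nothing about what a game will actually sustain.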
So, why is this value so important? Well, the more information you wish to send, the more trucks you need to load up. The more lanes you have, the more trucks you can put out at the same time and thus the more information you can send. Now, before I go and compare the offerings made by the two respective companies, I would like to focus a bit on latency.
Latency is the time between action and response. For instance, if you order a package from a web retailer, the time it takes to deliver this parcel to your doorstep from the moment you pressed the order button is what you’d call latency. As noted earlier in this part of the article, the maximum size of the delivery truck would be your bandwidth ;). Latency is important in many areas, especially in gaming. For instance, if you play an online game, a low ping value is very important if you wish to have a great game.
But sadly we have long been conditioned to look only at bandwidth. Many of us go to the ISPs that offer the most bandwidth and never look at the latency values they offer. We do this not only when choosing our network provider, but also when comparing specs, and many don’t even consider it when buying a TV, which is why flat-screen TVs often have terrible input lag when playing a game.
Now latency-wise, the Xbox One and Xbox One S had a serious advantage. Not because they used DDR3, as the latency difference between GDDR5 and DDR3 is mostly overcome by the higher clock speed of GDDR5. No, it is that small block of eSRAM, situated on the die of the APU. This means that transfers from the GPU portion of the APU to the eSRAM are going to be lightning fast compared to transfers to DDR3 or GDDR5 memory. It is similar to ordering from Amazon while living inside the actual building, rather than 4 states away.
Sadly for the Xbox One and One S, this small area of RAM was mostly used as a frame buffer and couldn’t be used as scratchpad memory for the CPU. Within Microsoft they have been very creative with it, and it is the reason why they are able to get out of the Xbox One what they can, but one cannot expect a multi-platform title to take advantage of this. It is also not surprising that for Project Scorpio Microsoft has chosen to use this die space for different things, as for GPUs latency is not as important as bandwidth. Microsoft wisely chose to forgo the eSRAM block in favour of mapping it onto part of the higher-bandwidth, higher-clocked GDDR5 that is part of Project Scorpio.
Latency however is important for the CPU, and for that reason I will revisit this and explain why it is sad that the CPU could not make use of this low latency memory.
Now latency and bandwidth go hand in hand: just as your ping is important, you also need sufficient bandwidth to transfer the information required to play your multiplayer game and have fun. I just felt the need to explain that bandwidth does not equate to speed, and to help myself along for the article that follows this one, regarding the CPU of the consoles. For this article, however, we can just assume that latency is low enough for Project Scorpio to run Xbox One S games at a 4K resolution with improved 4K assets.
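A very simple way to see how the two interact is to model a transfer as a fixed latency plus the time needed to push the data through. The Python sketch below does just that. Both memory pools and all of their numbers are made up purely to show the effect; they are not measurements of any console, and real memory systems are far more complicated than this model.

```python
def transfer_time_s(size_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Simple model: fixed latency plus the time to push the data through."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


# Made-up numbers, chosen only to show the effect.
LOW_LATENCY_POOL = {"latency_s": 10e-9, "bandwidth_bytes_per_s": 100e9}
HIGH_BANDWIDTH_POOL = {"latency_s": 100e-9, "bandwidth_bytes_per_s": 300e9}

for label, size in (("64 bytes", 64), ("32 MB", 32e6)):
    a = transfer_time_s(size, **LOW_LATENCY_POOL)
    b = transfer_time_s(size, **HIGH_BANDWIDTH_POOL)
    print(f"{label}: low-latency pool {a * 1e6:.3f} us, high-bandwidth pool {b * 1e6:.3f} us")
```

For the tiny transfer the low-latency pool wins easily, while for the large one the extra bandwidth more than makes up for the slower start. That is roughly the difference between what a CPU and what a GPU cares about.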
So now that we have briefly explained latency and understand bandwidth, let us begin with the comparison! :)
Now that we have a better grasp of the terminology, we can spend some time comparing the different options offered by these companies. As said earlier, we can largely ignore the latency issue where graphics are concerned, but eSRAM will still be a factor in the bandwidth comparison. Let’s begin by comparing the various bandwidth numbers of Table 1 and placing them in an easy-to-read graph.
To get one point out of the way: I used the calculated number for eSRAM bandwidth. This is because the technique of writing while you read, or reading while you write, is shaky at best and can’t be relied on in a comparison. Some Microsoft games seem to get a 30% increase at best, and in some tests 70% has been maintained, but that isn’t workable for games. Third-party games don’t even make 30% (1).
Now with that out of the way, let’s look at the differences. One thing easily noticed is the massive difference in bandwidth between the earlier offerings of either company and Project Scorpio. Even compared to the PlayStation 4 Pro it is a 50% increase, which makes sense: the memory speed of both consoles is the same, but Scorpio has 128 more lanes available, or to be exact 50% more lanes ;).
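The 50% figure is quick to check. The small Python sketch below uses the bus widths and the 6800 million transfers per second mentioned earlier in this article; the helper name is my own.

```python
def percent_increase(new: float, old: float) -> float:
    """How much bigger `new` is than `old`, as a percentage."""
    return (new - old) / old * 100


# Bus width in bits: Project Scorpio vs PlayStation 4 Pro.
lanes = percent_increase(384, 256)
# Bandwidth in GB/s: transfers per second times bus width, divided by 8 times 1000.
bandwidth = percent_increase(6800 * 384 / 8000, 6800 * 256 / 8000)

print(f"Bus width: +{lanes:.0f}%, bandwidth: +{bandwidth:.0f}%")
```

Because the memory speed is identical, the bandwidth increase is exactly the increase in lanes.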
What this means is that Project Scorpio can fit in 50% more memory transfers during a given time period than a PlayStation 4 Pro. But why is this important? Well, graphics is nothing more than image processing. Of course one also does linear algebra to transform models and animate them, but most of what is actually memory intensive is frame buffers and textures. Since we wish to go for a 4K resolution with higher texture quality, our frame buffers will get bigger, as will our textures. Our resolution is 4 times as big as a 1920 by 1080 image, after all. So not only do we need the processing speed of our GPU to go up, we also need more bandwidth to keep our GPU from stalling. Or, to put it in terms of ice-cream: the number of trucks needs to match the production of ice-cream. If production is higher than what the trucks can carry away, eventually you’ll end up not being able to produce more while you wait for trucks.
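To give a feel for how resolution turns into memory traffic, here is a rough Python sketch. It assumes a plain 4 bytes per pixel and 60 frames per second, and it only counts writing every pixel of a frame once. Those are my own simplifying assumptions; a real frame touches memory many more times than that, through multiple passes and texture reads.

```python
def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed frame."""
    return width * height * bytes_per_pixel


def write_traffic_gb_per_s(width: int, height: int, fps: int, bytes_per_pixel: int = 4) -> float:
    """Traffic from writing every pixel once per frame, in GB/s."""
    return frame_bytes(width, height, bytes_per_pixel) * fps / 1e9


# How many times more pixels a 4K frame has than a 1080p frame.
pixel_ratio = (3840 * 2160) / (1920 * 1080)
print(f"4K has {pixel_ratio:.0f}x the pixels of 1080p")
print(f"1080p at 60 FPS: {write_traffic_gb_per_s(1920, 1080, 60):.2f} GB/s per full-screen write")
print(f"4K at 60 FPS:    {write_traffic_gb_per_s(3840, 2160, 60):.2f} GB/s per full-screen write")
```

Every full-screen pass costs four times as much at 4K, which is why the bandwidth has to grow along with the GPU.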
There is another issue, where our CPU is not able to feed our GPU with draw calls. Mostly this is down to a slow CPU that cannot keep its GPU fed. This can sometimes be due to bandwidth or latency issues, but that usually isn’t the reason why a GPU stalls in consoles. On PCs it more often is, as most of our rigs have CPUs that far outweigh what our GPUs can handle bandwidth-wise. It is why review sites often benchmark at lower resolutions: that way the GPU does not get stalled by bandwidth, and you can see just how well a CPU performs by looking at the frames per second.
In this regard Project Scorpio seems to deliver. Keep in mind that its design goal was to run Xbox One games at a 4K resolution, with 4K assets, and if one compares it to the Xbox One and One S, we can see it beats them by a wide margin. Keep in mind also that the eSRAM is a really small pool, especially for modern games where multiple frames are kept in memory to be blended together later into a final image. The higher the resolution, the more difficult this becomes. And with this we have stumbled upon the reason why Microsoft is able to get 1080p out of most games while third parties are not.
Basically what Microsoft does, besides rendering the UI at a different resolution, is tile-based rendering, but in a very platform-specific manner. To fit a 1080p frame into such a limited frame buffer, they split the image up into separate tiles. In fact Microsoft even goes as far as rendering some parts of the image with less detail than others, and I do not mean LOD by that. No, parts that do not change often or are not that complex get rendered with fewer render passes or deferred rendering buffers than parts with more detail, rather than going over the entire frame buffer. These techniques help first-party and second-party titles to obtain the desired resolution and frame rate that third-party titles often lack. Third parties simply cannot afford to spend that much time on a title that has to run well on multiple platforms. For them it is easier to turn off a few effects and lower the resolution to deal with the smaller framebuffer and the lower bandwidth to DDR3.
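To illustrate why splitting a frame into tiles helps, here is a rough Python sketch. It is not Microsoft’s actual technique. The 20 bytes per pixel is a purely hypothetical deferred-rendering layout that I made up for this example (say, four 4-byte buffers plus a 4-byte depth buffer), and the 32MB pool is again taken as 32 × 1024 × 1024 bytes.

```python
import math

ESRAM_BYTES = 32 * 1024 * 1024        # assumes the 32MB pool means 32 MiB
HYPOTHETICAL_BYTES_PER_PIXEL = 20     # made-up layout, not from a real engine


def tiles_needed(width: int, height: int, bytes_per_pixel: int, budget_bytes: int) -> int:
    """Smallest number of equal tiles so that one tile's buffers fit in the budget."""
    total_bytes = width * height * bytes_per_pixel
    return math.ceil(total_bytes / budget_bytes)


for label, (w, h) in (("900p", (1600, 900)), ("1080p", (1920, 1080))):
    tiles = tiles_needed(w, h, HYPOTHETICAL_BYTES_PER_PIXEL, ESRAM_BYTES)
    print(f"{label}: {tiles} tile(s) needed")
```

With this made-up layout a 900p frame fits in one go, while a 1080p frame has to be split in two. It is easy to see why lowering the resolution is the cheaper option for a multi-platform title.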
This, by the way, is not lazy developing. First-party titles or exclusives have always been able to get more out of a system, due to locking down the constraints and knowing what one can work with. But imagine Project Scorpio, where there no longer is such a constraint. Although this probably means more titles will scale down to 900p and under on the original Xbox One, it will also mean that the same title can run in 4K on Project Scorpio.
The reason for this is Microsoft’s advised approach to the Xbox ecosystem: get your game to run on Project Scorpio, then scale it down to run on the other platforms. Keep in mind that the 32MB barrier is now gone, and you have far more bandwidth to begin with, powered by a GPU that is easily 4 times as capable as that of the original Xbox One. Add to that newly supported hardware advances like delta color compression, which in turn allow for less memory use and fewer bandwidth constraints, and you have more than enough bandwidth and space to reach that goal. In fact, you have power to spare.
Unlike on the current Xbox One and One S, bandwidth will not be an issue for Project Scorpio: anything that runs at 900p or higher on an Xbox One should be able to run on Project Scorpio in 4K as far as bandwidth is concerned. And at the speeds at which the GDDR5 is running, one doesn’t have to worry too much about latency either. Of course it will be slower than eSRAM, but considering that the eSRAM is mainly used as a framebuffer, this should be little to no issue in Scorpio. As said before, one cannot use it as a scratchpad.
But what about memory size?
Well, I merged this one into this article because it seemed to fit better here than on its own. On top of that, it is a subject which would be too short for its own article. But let’s start off again with a comparison of the current memory pools of each system, where we subtract the amount of space used by the OS (2).
As one can see in the graph above (illustration 2), the OS takes a huge chunk of the available memory. There have been rumours that Microsoft has lowered this from 3.5 gigabytes to 3, and that Sony uses between 3 and 3.5 gigabytes depending on how much the OS is using. So let us be generous and assume the OS takes 3 gigabytes, leaving the rest for the games to use as they see fit.
We know that Project Scorpio takes an additional 1 gigabyte for the OS, bringing it to 4 gigabytes. This however is not a problem. It is 1/3 of the total memory, sure, but the older systems use 37.5% of their memory for OS-related features, which in relation is still higher (37.5% vs 33%), and it leaves a full 8 gigabytes to be used for games. Why do I say “full 8 gigabytes”? Well, remember that Project Scorpio is not a new generation; it is designed from the ground up to play Xbox One S games at 4K with 4K assets. But let’s place the percentage increases of Scorpio over each machine in a neat table so we can keep track.
In table 2 we list the total memory bandwidth increases alongside the percentage increase in memory available to a game. From here we can see that the bandwidth is more than enough for the console to feed that extra memory space with little to no issue. It also shows us why MS can comfortably increase the OS footprint: a game with a 60% higher memory load would still easily fit within the bandwidth constraints of Project Scorpio.
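The memory percentages used here can be checked with a few lines of Python. The sketch assumes the figures from this section: 8 gigabytes with 3 reserved for the OS on the older systems, and 12 gigabytes with 4 reserved on Project Scorpio (the totals follow from the 37.5% and one-third shares mentioned above). The names are my own.

```python
# Figures as assumed in this section, in gigabytes.
OLDER_SYSTEMS = {"total_gb": 8, "os_gb": 3}
PROJECT_SCORPIO = {"total_gb": 12, "os_gb": 4}


def os_share_percent(console: dict) -> float:
    """Share of total memory reserved for the OS."""
    return console["os_gb"] / console["total_gb"] * 100


def game_memory_gb(console: dict) -> int:
    """Memory left over for games."""
    return console["total_gb"] - console["os_gb"]


old_games = game_memory_gb(OLDER_SYSTEMS)
new_games = game_memory_gb(PROJECT_SCORPIO)
game_memory_increase = (new_games - old_games) / old_games * 100

print(f"OS share, older systems: {os_share_percent(OLDER_SYSTEMS):.1f}%")
print(f"OS share, Scorpio:       {os_share_percent(PROJECT_SCORPIO):.1f}%")
print(f"Game memory: {old_games} GB -> {new_games} GB (+{game_memory_increase:.0f}%)")
```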
This means that even with the larger frame buffers involved in 4K resolution, there is still plenty of space available for higher-quality assets to be used within these games, with plenty of bandwidth to facilitate their processing.
Conclusion and expectation
We have looked a bit deeper into the memory area of Project Scorpio, and realistically we can again expect 4K resolution to be possible for any title that runs at or above 900p on the current Xbox One S. There is however a catch the other way around: by advising a top-down approach in porting to the Xbox ecosystem, Microsoft might cause more 720p titles with far lower-quality assets on the original Xbox One and Xbox One S.
The reason for that is simple and twofold: time and money. Microsoft might have the resources on their first-party titles to get the most out of the eSRAM solution, via techniques they are familiar with by now; this however does not mean third-party developers will have the same resources to spend. And with Scorpio offering a large unified memory pool, with the bandwidth to match, it is just easier to lower the size of your frame buffers by resolution alone. It doesn’t matter whether that is the resolution of your effects or of the actual game; one has to make it fit within that small pool of memory.
But this is also the greatest boon of Project Scorpio: one large pool of memory, clocked higher to ease any latency issues one might have coming from an eSRAM solution, with bandwidth galore. There is no reason not to expect 4K on Project Scorpio as far as the memory is concerned.
And with that I wish to conclude this article. The next one will be about the CPU and what we can expect there, given the information we have so far. For instance, why Jaguar is not as big a problem as we might think it is, especially with a few changes that might not seem all that impressive on their own.