Project Scorpio Realistic Expectations Part 1: Fillrate


Below is the first part of a realistic, technical breakdown of the Xbox “Scorpio”, along with a comparison to the other technologically leading consoles on the market today, by forum member Dehnus. He will be in the thread to answer any questions you may have. Stay tuned for more articles in this series.

Introduction:

With the announcement of the specifications for Microsoft’s console refresh, the internet has been abuzz with expectations. These range from high to low, and from ridiculously unrealistic to outright FUD spreading, so I felt UnionVGF also needed an article that was more down to earth on the matter.

In the following article I will go over each of the fillrate specifications, compare them to the current offerings of Microsoft and Sony, and discuss what is realistic to expect. I will also clarify what information is still missing or has gone under my radar. This information will be used in future articles on this topic to further show the strengths, weaknesses and features of the project. The articles following this one will focus on other announced specifications of Project Scorpio and how they relate to the previous offerings of this generation.

Known specifications

Before I begin to dissect any of the announced specs, I would like to present a table of the known specifications of Project Scorpio, Microsoft’s earlier entries in this generation, and the entries of their main competitor, Sony’s PlayStation 4 range:

| Console | Xbox One | Xbox One S | PlayStation 4 | PlayStation 4 Pro | Project Scorpio |
|---|---|---|---|---|---|
| CPU cores | 8 | 8 | 8 | 8 | 8 |
| CPU clock speed | 1750 MHz | 1750 MHz | 1600 MHz | 2100 MHz | 2300 MHz |
| CPU architecture | Jaguar*1 | Jaguar*1 | Jaguar | Jaguar | Jaguar*1 |
| GPU clock speed | 853 MHz | 914 MHz | 800 MHz | 911 MHz | 1172 MHz |
| GPU GCN CUs | 12 | 12 | 18 | 36 | 40 |
| GPU ROPs | 16 | 16 | 32 | 32 | 32*2 |
| GPU TMUs | 48 | 48 | 72 | 144 | 144*2 |
| Memory type | DDR3/eSRAM | DDR3/eSRAM | GDDR5 | GDDR5 | GDDR5 |
| Memory bus width (bits) | 256/1024 | 256/1024 | 256 | 256 | 384 |
| Memory clock speed (MHz) | 2133/853 | 2133/914 | 5500*3 | 6800*3 | 6800*3 |
| Memory bandwidth | 68 GB/s / 204 GB/s | 68 GB/s / 219 GB/s | 176 GB/s | 218 GB/s | 326 GB/s |
| Amount of memory | 8GB*4 | 8GB*4 | 8GB*4 | 8GB*5 | 12GB*6 |

Table 1.

*1: For the simplicity of the article, and due to there being many unknowns, we shall assume that these are ordinary Jaguar cores. Until MS goes into detail about the changes made to these CPUs, we cannot assume otherwise.

*2: ROP and TMU counts are rumored. For the sake of argument we’ll assume that Scorpio has the same ROP and TMU count as the PS4 Pro.

*3: Effective clock rate.

*4: Not the full 8GB is available to games. For each of these consoles we are talking about roughly a 5-3 split between games and the OS.

*5: The PS4 Pro has an extra GB of DDR3 memory for the OS and swapping. This allows for speed increases to the OS and games (swapping), and also minimizes the impact the OS has on the main GDDR5 memory pool.

*6: Project Scorpio has an extra 4GB of GDDR5. This memory is reserved for the OS, thus allowing 8GB to be available for games.
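The bandwidth row of Table 1 can be roughly cross-checked against the bus width and effective clock rows. The sketch below assumes the usual relation for DDR3 and GDDR5 (bytes per transfer times effective transfers per second); the function name `peak_bandwidth_gbs` is made up for illustration, and the eSRAM figures are left out because their quoted peak does not follow from this simple formula.

```python
# Peak memory bandwidth from bus width and effective clock rate.
# Input figures come from Table 1; the formula is the common rule of thumb,
# not something stated in the article itself.
MEMORY = {
    # console: (bus width in bits, effective clock in MHz)
    "Xbox One / One S (DDR3)": (256, 2133),
    "PlayStation 4": (256, 5500),
    "PlayStation 4 Pro": (256, 6800),
    "Project Scorpio": (384, 6800),
}


def peak_bandwidth_gbs(bus_width_bits: int, effective_mhz: int) -> float:
    """Bytes per transfer times million transfers per second, in GB/s."""
    return bus_width_bits / 8 * effective_mhz / 1000


for console, (bus, clock) in MEMORY.items():
    print(f"{console}: {peak_bandwidth_gbs(bus, clock):.1f} GB/s")
```

The results land on the rounded values in the table (68, 176, 218 and 326 GB/s).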

The above table (Table 1) shows the known specs, with some assumptions on ROP and TMU count. It is the main reason for this series of articles, as a well-known tech news website, AnandTech, spread some false information on ROP count. ROP count does not increase your bandwidth; it is limited by bandwidth. Thus fillrate is the first topic I wish to focus on, to give some more information on what exactly fillrate is, what it does and how it helps Project Scorpio reach its promised 4K resolution.

So without further ado, here is the first article concentrating on fillrate.

Fillrate

As mentioned earlier, AnandTech did a hatchet job of a comparison regarding the fillrate of Project Scorpio, and this inspired me to write this article. I would like to get one thing straight: a ROP count that is similar to another system’s does NOT mean you will only be able to fill the same bandwidth, or be limited to the same bandwidth constraint, as that other system. In fact fillrate has little to do with bandwidth, other than being limited by bandwidth rather than the other way around. Having a higher ROP count does not mean you’ll get more memory bandwidth; it means you need more bandwidth to feed these ROPs with sufficient throughput.

I call it a hatchet job because a tech website the size of AnandTech should know better than to just compare two numbers without doing a proper comparison. To do this comparison I shall first explain how we obtain fillrate, how it is limited by bandwidth and what the differences are between ROP and TMU units.

ROP

A ROP, or Raster Operation Pipeline, is the hardware component that processes pixel and texel information into a final pixel or depth value. This is done via vector and matrix operations. Now imagine your screen: it is a number of pixels high and a number of pixels wide, which can also be seen as a “raster” of pixels, hence the name for this process. Now imagine that same raster in memory, and you have what is called a “framebuffer”. On this buffer we can do raster operations like anti-aliasing, motion blur, chromatic aberration, lighting and shading, combining several framebuffers into the final one to be shown on screen; the list goes on and on.

For this reason you need plenty of fillrate, as each buffer and effect requires operations to be done on these pixels. One can cheat by not rendering all effects at the same resolution and merging them later into the final framebuffer at the resolution you wish, but this causes a loss of picture quality. This is something we have seen in Xbox One games, and the reason for it is not limited to the Xbox One’s fillrate but also lies in its memory architecture. More on this in the second article of the series.

As Table 1 shows, the PlayStation 4 and PlayStation 4 Pro have 32 Raster Operation Pipelines, which means that per clock cycle they can process 32 render output operations. The Xbox One and One S have 16 of these units and thus can process 16 operations per cycle. Let us assume that AnandTech was correct (and they very likely are) about the ROP count in Project Scorpio. This would mean that Project Scorpio can process 32 operations per cycle. However, the clock speed of the GPU is also a factor in the throughput per second, also known as pixel fillrate.

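So by multiplying the number of ROP units by the GPU clock speed, one gets the maximum pixel fillrate of the GPU. This gives a very simple formula:

$$M = N_{\mathrm{ROP}} \times f_{\mathrm{GPU}}$$

Here M is the pixel fillrate in megapixels per second, N_ROP is the number of ROP units and f_GPU is the GPU clock speed in MHz. Using this formula for each console’s specification, we can place the results in the following table:

Table 2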
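As a rough illustration of the formula above, the following sketch computes the maximum pixel fillrate from the ROP counts and GPU clocks listed in Table 1. The Scorpio ROP count is the rumored value assumed in this article, and the function name `pixel_fillrate_mpixels` is made up for this example.

```python
# Maximum pixel fillrate = ROP count x GPU clock (MHz), in megapixels/s.
# ROP counts and clocks come from Table 1; Scorpio's 32 ROPs are an assumption.
CONSOLES = {
    # console: (ROPs, GPU clock in MHz)
    "Xbox One": (16, 853),
    "Xbox One S": (16, 914),
    "PlayStation 4": (32, 800),
    "PlayStation 4 Pro": (32, 911),
    "Project Scorpio": (32, 1172),
}


def pixel_fillrate_mpixels(rops: int, clock_mhz: int) -> int:
    """One pixel per ROP per clock cycle, so megapixels/s = ROPs * MHz."""
    return rops * clock_mhz


for console, (rops, clock) in CONSOLES.items():
    mpix = pixel_fillrate_mpixels(rops, clock)
    print(f"{console}: {mpix} MPixels/s ({mpix / 1000:.1f} GPixels/s)")
```

With these inputs Project Scorpio comes out at roughly 2.7 times the pixel fillrate of the original Xbox One and roughly 1.3 times that of the PlayStation 4 Pro.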

As one can see in Table 2, there is a big difference between the original Xbox One and Project Scorpio, yet the difference with the PlayStation 4 Pro is not as big. This is by design. Scorpio is not a new system; it is a system intended to play your Xbox One games in native HDR10 4K. This means it will have to push more pixels, considerably more, than an Xbox One. Not just on your screen but also in anti-aliasing, lighting and other pixel operations. And as stated earlier, doing these effects at a lower resolution than the target resolution, or not doing anti-aliasing at all, will reduce picture quality. Scorpio nonetheless has a higher fillrate than the PlayStation 4 Pro, so we can’t just compare the ROP count alone.

But since the resolution is four times as high, should the fillrate not be four times as high as well? To answer this question we can go to the biggest flaw in the design of the Xbox One: its memory setup. We’ll go more in depth about this in the second article, but for now let’s just look at the current situation from the perspective of fillrate.
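To put a number on “four times as high”, here is a small sketch that counts the pixels per frame at each resolution. It assumes 900p means 1600x900, 1080p means 1920x1080 and 4K means 3840x2160; those dimensions are the common definitions, not figures taken from this article.

```python
# Pixels per frame at the resolutions discussed in the article.
# The exact dimensions are assumptions based on the usual definitions.
RESOLUTIONS = {
    "900p": (1600, 900),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def pixel_count(width: int, height: int) -> int:
    """Number of pixels in one frame of the given size."""
    return width * height


target = pixel_count(*RESOLUTIONS["4K"])
for name, (width, height) in RESOLUTIONS.items():
    pixels = pixel_count(width, height)
    print(f"{name}: {pixels} pixels, 4K is {target / pixels:.2f}x as many")
```

Under these assumptions 4K has exactly four times the pixels of 1080p, and 5.76 times the pixels of 900p.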

As stated earlier, ROPs are not the deciding factor in how much memory bandwidth we have, but they are limited by it. Considering that the Xbox One’s memory bandwidth is very constricted, albeit alleviated somewhat by its eSRAM solution, it is easy to see that it constrains the output. It is then also easy to see why Microsoft went for a 16 ROP solution and not a 32 ROP one. But let’s see how many pixels we can push at each level of floating-point precision before running into our memory bandwidth constraints.

We do this by multiplying the size of the type in memory by the fillrate, and then dividing that value by 1024 to get the number of gigabytes per second required if all ROP units are used to the maximum with the given type:

$$MTS = \frac{S \times M}{1024}$$

In this formula S is the size of the type in bytes per pixel, M is the fillrate in megapixels per second and MTS is the required bandwidth in gigabytes per second. We shall use the following type sizes for our comparison:

  • 8-bit integer RGBA (RGBA8), which is 4 bytes per pixel.
  • 16-bit floating-point RGBA (RGBA16F), which is 8 bytes per pixel.
  • 32-bit floating-point RGBA (RGBA32F), which is 16 bytes per pixel.

Now let’s plug these values into the formula:

As can be seen in Table 3, each of the consoles could fill many gigabytes every second if the maximum available output of the ROP units were used. This is a theoretical example, however, as other processes and units need some of that bandwidth as well. We can nonetheless see that the number given by AnandTech is wrong in every sense: not only does the number of ROP units NOT increase your bandwidth, but even an Xbox One S can fill up the 218GB/s they stated. A PlayStation 4 is able to do almost double that, so the number was nothing more than the bandwidth of the PS4 Pro, used as a stick to gain clicks from a news-hungry internet crowd.

Table 3: Maximum Bandwidth Requirements per Console and type of pixel.
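The bandwidth requirement can be sketched in a few lines by applying the formula MTS = S x M / 1024 to the ROP counts and GPU clocks from Table 1. The Scorpio ROP count is the rumored value assumed in this article, and the function name `required_bandwidth_gbs` is made up for illustration.

```python
# Required bandwidth (GB/s) if every ROP writes one pixel per clock:
# MTS = S * M / 1024, with S in bytes per pixel and M in megapixels/s.
BYTES_PER_PIXEL = {"RGBA8": 4, "RGBA16F": 8, "RGBA32F": 16}

CONSOLES = {
    # console: (ROPs, GPU clock in MHz), from Table 1 (Scorpio ROPs assumed)
    "Xbox One": (16, 853),
    "Xbox One S": (16, 914),
    "PlayStation 4": (32, 800),
    "PlayStation 4 Pro": (32, 911),
    "Project Scorpio": (32, 1172),
}


def required_bandwidth_gbs(bytes_per_pixel: int, fillrate_mpixels: int) -> float:
    """Apply the article's formula, including its division by 1024."""
    return bytes_per_pixel * fillrate_mpixels / 1024


for console, (rops, clock) in CONSOLES.items():
    fillrate = rops * clock  # megapixels per second
    row = ", ".join(
        f"{fmt}: {required_bandwidth_gbs(size, fillrate):.1f} GB/s"
        for fmt, size in BYTES_PER_PIXEL.items()
    )
    print(f"{console}: {row}")
```

For example, the Xbox One S at RGBA32F comes out at 228.5 GB/s, above the 218 GB/s figure discussed in the text, and the PlayStation 4 at RGBA32F comes out at 400 GB/s.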

Now that we are talking about bandwidth, and why the Xbox One has a problem there, we should look deeper into each possible fillrate of each console and the bandwidth available to each console. To do this I plugged the values calculated into a graph, with the last two bars being the available bandwidth to each console:

Illustration 1: A graph showing each bandwidth requirement if the ROP units were used to maximum with the actual bandwidth shown in green and dark red.


Please note that I start the graph at 50GB/s (see Illustration 1), close to the smallest value found. This is to show a clearer difference between the actual bandwidth and the amount the APU in each console can fill. It does not mean that the PlayStation 4 has ten times the bandwidth of the Xbox One.

Now that that is out of the way, when we look at this graph (see Illustration 1), we can see that the main memory of the Xbox One is easily filled up even by the lowest precision of pixel information. The eSRAM seems to help it punch above its weight, but it has other issues: you can only reach that bandwidth in theory, and the actual bandwidth is quite a bit lower (as you don’t always need to do a write when you do a read, and vice versa). I will go into many of these issues a bit deeper in the article on memory and bandwidth.

We can also see that both the PlayStation 4 Pro and the original PlayStation 4 seem to be able to fill up their bandwidth with 16-bit precision floating points, even going over it, while the 16 ROPs of the original Xbox One’s APU only seem able to go over the eSRAM bandwidth in the case of 32-bit precision. Mind you, this does not mean that the Xbox One APU is better at this task, far from it. But it does explain why Microsoft opted for 16 rather than 32 ROP units, as 32 would have been overkill and could never have been fully used. It was better to save that silicon and use it for something else, considering the bandwidth constraints.

Now there is one thing standing out in this graph: Project Scorpio. Even with 16-bit precision fully used, it still manages to fit easily within its memory bandwidth. This is something even the PlayStation 4 Pro fails at. It means that the system has quite a bit of headroom to use its fillrate for native 4K, and it is why AnandTech was probably right about the ROP count. Simply put, it makes no sense to add more ROPs, as they would only be wasted silicon, especially considering that the fillrates achieved are plenty for playing 900p and 1080p Xbox One titles at 4K. In fact, with the extra headroom in bandwidth one could use that space for some other things.
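The headroom argument can be checked with a short sketch that compares the RGBA16F bandwidth requirement at full ROP utilisation with the best bandwidth each console has available according to Table 1 (eSRAM for the Xbox One models). The Scorpio ROP count is again the assumed value, and `headroom_gbs` is a made-up helper name.

```python
# Headroom = available bandwidth - bandwidth needed to keep every ROP busy.
# Uses the article's formula (bytes * megapixels/s / 1024) for RGBA16F.
RGBA16F_BYTES = 8

CONSOLES = {
    # console: (ROPs, GPU clock in MHz, best available bandwidth in GB/s)
    "Xbox One (eSRAM)": (16, 853, 204),
    "Xbox One S (eSRAM)": (16, 914, 219),
    "PlayStation 4": (32, 800, 176),
    "PlayStation 4 Pro": (32, 911, 218),
    "Project Scorpio": (32, 1172, 326),
}


def headroom_gbs(rops: int, clock_mhz: int, available_gbs: float,
                 bytes_per_pixel: int = RGBA16F_BYTES) -> float:
    """Positive means the ROPs cannot saturate the memory bus on their own."""
    required = bytes_per_pixel * rops * clock_mhz / 1024
    return available_gbs - required


for console, (rops, clock, available) in CONSOLES.items():
    spare = headroom_gbs(rops, clock, available)
    verdict = "fits" if spare >= 0 else "exceeds available bandwidth"
    print(f"{console}: {spare:+.1f} GB/s ({verdict})")
```

With the Table 1 figures, Project Scorpio is left with about 33 GB/s to spare at RGBA16F, while the PlayStation 4 Pro falls about 10 GB/s short and the PlayStation 4 about 24 GB/s short.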

Which brings us to the following, the TMUs.

TMU:

TMUs, or Texture Mapping Units, are the components in the GPU that are responsible for mapping a bitmap image onto a surface. For instance, the face of a mountain in the distance has a texture; this texture is a bitmap image in memory that has to be wrapped over the surfaces that make up this mountain. In doing so the TMU has to rotate, stretch and scale the bitmap until it fits over the model.

TMU fillrate determines how high in quality your textures can be and how many of them you can use. These images are not only textures like the asphalt of a road; they can also be a normal map(1) or a light map(2). These maps help with shadows and reflections in a 3D scene, allowing us to blend all of it together into one image with lighting effects. The bigger the texture, the more space it takes in memory, but also the more bandwidth is required to load it and warp it onto a mesh. And as stated earlier, when you go to 4K you also have to up the resolution of all the other effects or you’ll lose image quality. This includes textures and maps.

A texture pixel that is mapped over a model or surface is called a texel. The rate at which a GPU, or in the case of these consoles an APU, can map them is expressed in texels per second. A Gigatexel is a thousand Megatexels, a Megatexel is a thousand Kilotexels, and a Kilotexel is a thousand texels. Slightly off topic, but thank you Metric System, you’re a great teacher ^^.

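Back on topic: we can calculate the number of texels per second by multiplying the clock speed of each APU by its number of TMUs. Again we assume that Project Scorpio has the same number of TMUs as the PlayStation 4 Pro.

$$T = N_{\mathrm{TMU}} \times f_{\mathrm{GPU}}$$

Here T is the texel fillrate in megatexels per second, N_TMU is the number of TMUs and f_GPU is the GPU clock speed in MHz. Plugging the values into this formula, we get the following results: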

It is clear from the results (Table 4) that Project Scorpio has more than 4 times the texel fillrate of the original Xbox One, and almost 4 times that of the Xbox One S. It is also clear from this result that if Microsoft does use the same TMU count, then Project Scorpio is very well equipped to handle the higher resolution textures that are desirable at 4K. Since the resolution is 4 times as high, they seem to be meeting that requirement with room to spare.

Table 4
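As a hedged sketch of the texel fillrate calculation, the code below multiplies the TMU counts by the GPU clocks from Table 1 and compares each console with Project Scorpio. The Scorpio TMU count of 144 is the assumption made in this article, not a confirmed specification, and `texel_fillrate_mtexels` is a made-up function name.

```python
# Texel fillrate = TMU count x GPU clock (MHz), in megatexels/s.
# TMU counts and clocks come from Table 1; Scorpio's 144 TMUs are assumed.
CONSOLES = {
    # console: (TMUs, GPU clock in MHz)
    "Xbox One": (48, 853),
    "Xbox One S": (48, 914),
    "PlayStation 4": (72, 800),
    "PlayStation 4 Pro": (144, 911),
    "Project Scorpio": (144, 1172),
}


def texel_fillrate_mtexels(tmus: int, clock_mhz: int) -> int:
    """One texel per TMU per clock cycle, so megatexels/s = TMUs * MHz."""
    return tmus * clock_mhz


scorpio = texel_fillrate_mtexels(*CONSOLES["Project Scorpio"])
for console, (tmus, clock) in CONSOLES.items():
    rate = texel_fillrate_mtexels(tmus, clock)
    print(f"{console}: {rate / 1000:.1f} GTexels/s "
          f"(Scorpio is {scorpio / rate:.2f}x)")
```

Under these assumptions Scorpio ends up at about 4.1 times the original Xbox One and about 3.8 times the Xbox One S.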

It also confirms the statements made by the people at Turn 10 that it was possible to just load the 4K higher resolution texture assets without any major hit to the performance.

This brings us to the conclusion.

Conclusion, what to expect realistically:

In this first article we focused mostly on the fillrate part of Project Scorpio, and yes, in theory it seems to deliver what was promised. Looking at the fillrate aspect of the console alone, it is thus realistic to expect your Xbox One and One S games to run at native 4K with 4K texture assets. In fact, not only does Microsoft seem to deliver the minimum required, they seem to have a bit of headroom left.

However, it is wise to keep in mind that we are talking about Xbox One and One S games, so the current generation. Scorpio is not a new generation and not a generational leap. In fact, looking at the differences between the PlayStation 4 and the PlayStation 4 Pro, and then at Scorpio, one can clearly see that it is a machine designed to run current Xbox One games in 4K, not to add a lot of bells and whistles. The ROP performance is already a first indicator of this.

This image will only become clearer, pun intended, in the next few articles, starting with the article on Memory.

(1) A normal map is a map of normals. A normal is a vector perpendicular to a plane that helps with calculating lighting by telling us the direction the plane is facing. A normal map helps an otherwise flat plane look as if it has relief or depth, by mapping a set of normals over the texture that indicate different surface directions per area of the texture.

(2) A light map is a texture that holds information on how light is reflected and to what extent.
