Sunday 26 April 2020

Intel Xe Graphics: Release Date, Specs


Intel Xe Graphics will join the dedicated graphics card fray this year, but can it possibly compete with AMD and Nvidia GPUs?
Last year, Intel announced Xe Graphics and its intention to re-enter the discrete GPU space, marking the first dedicated Intel GPU we'll have seen since the i740 back in 1998. The competition among the best graphics cards is fierce, and Intel's current integrated graphics solutions don't even rank on our GPU hierarchy (they'd be about 1/3 the performance of even a low-end card like an Nvidia GT 1030). Could Intel, purveyor of low-performance integrated GPUs—"the most popular GPUs in the world"—possibly hope to compete? Yes, actually, it can.

This year promises a massive shakeup in the PC graphics card market. AMD is working on Big Navi / RDNA 2, Nvidia's RTX 3080 / Ampere GPUs are coming, and along with Intel's Xe Graphics there are rumblings of a fourth player potentially entering the PC GPU space. Huawei is reportedly entering the data center GPU market, so it's not a huge leap to imagine it making consumer models at some point. But for this article, we're focusing on Intel.

Intel Xe Graphics At A Glance:
Specs: Up to 512 EUs / 4096 shader cores
Performance: We're hoping for at least RTX 2080 level
Release Date: Summer 2020 (assuming no Coronavirus delays)
Price: Intel will need to be competitive
Intel's Xe Graphics aspirations hit center stage in 2018, with the hiring of Raja Koduri from AMD, followed by chip architect Jim Keller and graphics marketer Chris Hook, to name just a few. Raja was the driving force behind AMD's Radeon Technologies Group that was created in November 2015, along with the Vega and Navi architectures, and clearly the hope is that he can help lead Intel's GPU division into new frontiers. Not that Intel hasn't tried this before. Besides the i740, Larrabee and the Xeon Phi had similar goals back in 2009, though the GPU aspect never really panned out. So, third time's the charm, right?

Of course, there's a lot more to building a good GPU than just saying you want to make one, and Intel has a lot to prove. Here's everything we know about the upcoming Intel Xe Graphics, including release date, specifications, performance expectations, and pricing. 

Intel's Gen11 Graphics at a high level appears to be quite similar to Xe Graphics. (Image credit: Intel)

Intel Xe Graphics Architecture
While Intel may be a newcomer to the dedicated graphics card market, it's by no means new to making GPUs. Current Intel Ice Lake CPUs use the Gen11 Graphics architecture, which as the name implies is the 11th generation of Intel GPUs. Incidentally, the first generation of Intel GPUs powered its last discrete graphics card, the i740 (along with Intel's 810/815 chipsets for socket 370 Pentium III and Celeron CPUs, circa 1998-2000). Xe Graphics is round 12 for Intel GPU architectures, in other words, with Gen5 through Gen11 being integrated into Intel CPUs of the past decade. Note that Gen10 Graphics never actually saw the light of day, as it was part of the aborted Cannon Lake CPU line.

While it's common for each generation of GPUs to build on the previous architecture, adding various improvements and enhancements, Intel is reportedly making major changes with Xe Graphics. Some of those changes focus on enabling the expansion of GPU cores, others address the need for dedicated VRAM, and there will also be changes focused on improving per-core performance and IPC.

Recent Intel GPUs have been divided up into a number of 'slices' and 'sub-slices,' with the sub-slices being somewhat analogous to AMD's CUs and Nvidia's SMs. Gen9 Graphics has a sub-slice size of 8 EUs, and each EU has two 128-bit floating point units (FPUs). For FP32 computations, each EU can do up to 8 instructions per clock, and FMA (fused multiply add) instructions count as two FP operations, giving a maximum throughput of 16 FP operations per clock. So: EUs * 8 * 2 * clock speed = GFLOPS. In that sense, an EU counts as eight GPU cores when compared with AMD and Nvidia GPUs, and 8 EUs is equal to an AMD CU or Nvidia SM.

Stepping out one level, the slices in previous Intel graphics have been classified as GT1, GT2, GT3, and GT4 (with Ice Lake / Gen11 adding a GT1.5 option). For Gen9, GT2 models have three sub-slices with eight EUs each, GT1 has two sub-slices with six EUs enabled in each, and GT3 has six sub-slices with eight EUs each. Gen11 changed to each slice having eight sub-slices of eight EUs, so Ice Lake GT2 has 64 EUs and 512 GPU cores. For Xe Graphics, Intel will be going for significantly higher EU counts and larger GPU sizes.
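
To make that math concrete, here's a quick Python sketch of the EU, core and GFLOPS arithmetic for the configurations above. The 1.2 GHz clock is just an illustrative figure; actual clocks vary by SKU.

```python
# Each Intel EU has two 128-bit FPUs: up to 8 FP32 instructions per clock,
# with FMA counting as two FP operations, so 16 FP32 ops per EU per clock.
OPS_PER_EU_PER_CLOCK = 8 * 2

def gflops(eus, clock_ghz):
    """Theoretical FP32 throughput: EUs * 8 * 2 * clock (GHz) = GFLOPS."""
    return eus * OPS_PER_EU_PER_CLOCK * clock_ghz

# Slice/sub-slice configurations described above (EU counts only).
configs = {
    "Gen9 GT1 (2 sub-slices x 6 EUs)": 12,
    "Gen9 GT2 (3 sub-slices x 8 EUs)": 24,
    "Gen9 GT3 (6 sub-slices x 8 EUs)": 48,
    "Gen11 GT2 (8 sub-slices x 8 EUs)": 64,
}

for name, eus in configs.items():
    # An EU counts as eight "GPU cores" in AMD/Nvidia terms.
    print(f"{name}: {eus * 8} cores, {gflops(eus, 1.2):.1f} GFLOPS at 1.2 GHz")
```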

Gen11 was a big jump from Gen9, and Xe Graphics could scale to eight or more slices.   (Image credit: Intel) 
Current indications are that the base 'slice' size for Xe Graphics will have up to 64 EUs enabled, with different configurations having different numbers of slices and sub-slices that can be partially disabled as needed. The fundamental building block for Xe Graphics ends up being basically the same as Gen11 Graphics, at least for the first iteration. The big changes will involve adding all the logic for dedicated VRAM, scaling to much higher core counts and multi-chip support, along with any other architectural changes that have yet to be revealed. Xe Graphics will have full DX12 and Vulkan support, but beyond that is unknown.

Intel has talked about three broad classifications of Xe Graphics: Xe LP for low power / low performance devices, Xe HP for high performance solutions, and Xe HPC for data center applications. Xe LP as far as we can tell is mostly for integrated graphics solutions, likely with a single slice—maybe two in a few cases. We know Xe LP is in the upcoming Tiger Lake CPUs, and it was used in the Xe Graphics DG1 developer card. It will be the next iteration of Intel's processor graphics, in other words.

At the other end of the spectrum, there have been images and details regarding Xe HPC and Intel's Exascale ambitions for supercomputers, which as you might imagine means incredibly powerful and expensive chips—we don't anticipate Xe HPC GPUs showing up in consumer cards any time soon. The most interesting chips from our perspective will fall under the Xe HP umbrella, and these should show up in a variety of consumer graphics cards.

One thing that's still unclear is whether the first Xe Graphics solutions will support hardware ray tracing or not. Intel has said it will support ray tracing, but it hasn't specifically stated that it will happen with the initial Xe Graphics architecture. It seems more likely that ray tracing will come in the second generation of Xe Graphics, the 7nm Ponte Vecchio and related chips. Or perhaps ray tracing support will be in a limited subset of the first gen parts—high-end Xe HP or HPC, but not Xe LP, for example. We don't know yet, but it would be quite surprising to have full ray tracing arrive before AMD's ray tracing solution.

These architectural updates are critical, as current Intel GPUs are at best underwhelming when it comes to gaming performance. Take UHD Graphics 630 as an example: 24 EUs (192 cores) at 1.2 GHz in a Core i9-9900K gives a theoretical 460.8 GFLOPS—or 422.4 GFLOPS in the slightly lower clocked (1.1 GHz) Core i3-9100. The AMD Ryzen 5 3400G by comparison has 11 CUs, 704 GPU cores, and a 1.4 GHz clock speed, yielding 1971.2 GFLOPS of theoretical performance. It's no surprise that AMD's Vega 11 Graphics are roughly three times faster than Intel's UHD Graphics 630—it could have been more, but both integrated graphics solutions are at least somewhat limited by the system memory bandwidth.
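
Here's the same comparison as a minimal calculation. Note that AMD counts individual shader cores (2 FP32 ops per core per clock via FMA) while Intel counts EUs; the theoretical gap is about 4.3x, which memory bandwidth limits shrink to roughly 3x in practice.

```python
def intel_gflops(eus, clock_ghz):
    return eus * 16 * clock_ghz  # two 128-bit FPUs -> 16 FP32 ops/EU/clock

def amd_gflops(cores, clock_ghz):
    return cores * 2 * clock_ghz  # each shader core: 2 FP32 ops/clock via FMA

uhd630 = intel_gflops(24, 1.2)   # Core i9-9900K's UHD 630: 460.8 GFLOPS
vega11 = amd_gflops(704, 1.4)    # Ryzen 5 3400G's Vega 11: 1971.2 GFLOPS
print(f"UHD 630: {uhd630:.1f} GFLOPS, Vega 11: {vega11:.1f} GFLOPS")
print(f"Vega 11 advantage: {vega11 / uhd630:.1f}x on paper")  # ~4.3x
```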

Intel's Ice Lake processors have a 64 EU GPU that gives clues on how Xe Graphics could scale. (Image credit: Intel)

Intel Xe Graphics Die Shots and Analysis
Besides mostly undisclosed architectural changes, there are some other interesting tidbits on Xe Graphics that are worth discussing. For example, we can get a pretty good idea of what to expect in terms of size and transistor counts. First, we can look at Intel's Ice Lake wafer to see how big the 64 EU GPU is on Intel's 10nm node. Analyzing the die shot, it looks like 64 EUs with Gen11 takes up roughly 40-45mm² of die space. That's actually quite small, and it means Intel can scale to much larger GPUs.

Even if we take the high end of that estimate (45mm²), and then assume the Xe Graphics architecture will increase the size by nearly 50%—for all the enhancements and IPC changes it's supposed to bring—we're still only at 65mm² per 64 EU slice. There's a lot of logic related to display outputs, video codecs and more that doesn't need to be duplicated on a larger GPU, but let's aim high.

Doubling that to 130mm² would give Intel a 128 EU chip, 260mm² would be 256 EUs, and 520mm² would yield 512 EUs. And again, actual chip sizes could be quite a bit smaller, as the initial 50% larger estimate is probably excessive. If Intel takes the multi-chiplet approach with consumer cards, it could just use one base chip and then link multiples of that chip together. Alternatively, if Intel goes the custom silicon route, the 128 EU GPU could be about 150mm², but 256 EUs could fit into about 250mm², and a large 512 EU chip might only need 450mm². Such sizes are absolutely within reach for GPUs—we've seen AMD and Nvidia routinely go much larger.
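
For reference, here's a rough sketch of those die-size projections. The 45mm² slice figure and the 50% growth factor are our estimates from the die-shot analysis, not Intel-confirmed numbers.

```python
# Rough die-size projections under the article's assumptions: ~45 mm^2 for a
# 64 EU Gen11 slice on 10nm (high end of the 40-45 mm^2 die-shot estimate),
# plus a deliberately pessimistic 50% area increase for the Xe architecture.
GEN11_SLICE_MM2 = 45.0
XE_SLICE_MM2 = GEN11_SLICE_MM2 * 1.5  # ~65-68 mm^2 per 64 EU Xe slice

def xe_die_mm2(eus):
    """Naive linear scaling; shared logic (display, codecs) isn't discounted."""
    return (eus / 64) * XE_SLICE_MM2

for eus in (128, 256, 512):
    print(f"{eus} EUs: ~{xe_die_mm2(eus):.0f} mm^2 (upper bound)")
# 128 -> ~135, 256 -> ~270, 512 -> ~540 mm^2; rounding the per-slice figure
# down to 65 mm^2 gives the 130/260/520 mm^2 estimates quoted above, and
# real dies could be smaller still.
```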

512 EUs in a single chip would mean the equivalent of 4096 GPU cores, which would be pretty impressive. AMD's RX 5700 XT by comparison has 2560 GPU cores, while Nvidia's RTX 2080 Ti sports 4352 GPU cores—not that the AMD, Intel, and Nvidia GPUs are all equivalent, but it's at least a baseline measure of potential performance. Theoretical compute for a 512 EU chip could actually surpass the current kings of the desktop graphics card sector. Does that sound like fantasy land? Check out this Xe Graphics wafer shot Raja Koduri posted on Twitter in February 2020.

We've analyzed that photo, which presumably shows the first generation 10nm+ Xe HPC GPU. Frankly, the die appears to be massive! We've seen other analyses, but our own estimate is that the GPU die on that wafer is approaching maximum reticle size—around 800mm², give or take. That also coincides with what Intel has publicly stated regarding its second generation Ponte Vecchio architecture, which will move to a 7nm node.

Ponte Vecchio will include Foveros, Intel's die stacking technology, and Intel mentioned in its 2019 Investor Meeting that with the current PC-centric approach, product size is "restricted by reticle." In other words, the maximum size of a chip is a hard limit based on the fabrication machinery. This applies to all microprocessors, and the limit is right around 850mm². Intel's future plans move to a data-centric model that will allow further scaling through die stacking, but that doesn't apply to the 10nm+ Xe HPC GPU.

So, reading between the lines, Xe HPC will use the above GPU die that appears to approach the maximum reticle size. Again, that's not going to be used in a consumer product, but given what we know of Intel's Gen11 Graphics, such a GPU could have 1024 EUs and 8192 GPU core equivalents. Intel also talks about future GPUs moving to "thousands of EUs," meaning multiple Ponte Vecchio chips. Toss in some HBM2e memory, add in INT8 and FP64 support, and data centers should come running.

Now scale that down to more manageable sizes and you get the consumer focused Xe HP. An accidental Intel graphics driver posting from June 2019 gave a clear indication of what to expect. Intel has 128 EU, 256 EU, and 512 EU Xe HP graphics cards in the works, in addition to Xe LP models that most likely will be limited to 64 EUs. That also coincides with statements Intel has made regarding Xe LP scaling from 5W to 20W designs—there's no need for a dedicated graphics card with a 20W TDP GPU. That brings us to the actual Xe Graphics specifications.  

There have been various leaks and rumors about Intel Xe Graphics, each becoming slightly more credible. Intel also demonstrated the Xe Graphics DG1 developer board at CES 2020. While Intel insisted the board was not a final design for consumers, we wouldn't be surprised to see something similar shipping to consumers in the future. However, Xe Graphics DG1 also uses Xe LP silicon, which means that it's a low-power dedicated GPU for test purposes only right now. Intel also revealed that there are three brands of Xe Graphics, scaling from ultra mobile through gaming desktops, then on to workstations and data center applications. Given what we've said above, Intel plans to release a suite of Xe Graphics cards, presumably using Xe HP silicon, and here are the configurations we expect to see:

Budget: 1 chip, 96 EUs / 768 cores, 4GB VRAM, ~3 TFLOPS
Budget+: 1 chip, 128 EUs / 1024 cores, 4GB VRAM, 3-4 TFLOPS
Mid-range: 2 chips, up to 256 EUs / 2048 cores, 6GB VRAM, 6-8 TFLOPS
High-end: 4 chips, up to 512 EUs / 4096 cores, 8-16GB VRAM, 12-16 TFLOPS

Based on the chip shots and other information, we expect the Xe HP GPU to be the fundamental building block of the consumer Xe Graphics cards. Intel's EMIB (Embedded Multi-Die Interconnect Bridge) could make an appearance, allowing for multi-chip GPU configurations but without the complexity of AMD CrossFire or Nvidia SLI. It's like AMD's chiplet approach on the Ryzen CPUs, except applied to graphics instead. That's what the above table assumes.

EMIB would effectively allow two or four chips to behave as one, more or less, sharing rendering duties and memory. It's a bit ironic, as Intel initially made fun of AMD's 'gluing' chiplets together with Ryzen, and we can see how that turned out: looking at AMD vs Intel CPUs, Ryzen has quickly scaled to much higher core counts and performance that Intel currently can't match. But Intel is smart enough to recognize the advantages of such an approach, and applying it to GPUs could make a lot of sense.

Alternatively, EMIB may only be planned for the Xe HPC data center models. Then Intel would take a similar approach to AMD and Nvidia and manufacture multiple GPU variants with specs that should still be close to what we've listed in the above table. The benefit of the EMIB and multi-chip approach is that it would allow Intel to focus on just two main GPUs: Xe HP and Xe HPC (with Xe LP integrated into Tiger Lake and other CPUs).

Considering Intel will have to share 10nm+ manufacturing of Xe Graphics with its CPU families, simplifying the number of designs would be helpful. There are also questions about Intel's yields and defect rates on 10nm+, as it has yet to release a CPU with more than four CPU cores using its 10nm process. Going with a smaller die and EMIB could dramatically improve yields for Xe HP. That's why our primary guess is that the first generation Xe Graphics will make use of EMIB.

128 EUs per Xe HP GPU would mean the equivalent of 1024 GPU cores, and as noted above the underlying architecture of the cores should be improved in various as-yet-undisclosed ways. Depending on what Intel does, it could end up with GPU cores that are closer to parity with AMD and Nvidia GPU cores—that's sort of the best-case scenario, and what we're hoping happens. There are credible rumors of 2-chip and 4-chip Xe Graphics configurations, which would allow for double and quadruple the theoretical performance of the base Xe HP design.

Adding more GPU cores, slices, EUs or whatever you want to call them will help Intel a lot. 128 EUs / 1024 cores doesn't exactly set our hearts racing, considering Nvidia already offers GPUs with up to 4608 cores (Titan RTX), AMD has offered GPUs with up to 4096 cores (RX Vega 64), and both AMD and Nvidia are likely going to go even higher with the upcoming Big Navi and Ampere architectures. By the end of the year, we could see AMD and Nvidia GPUs with anywhere from 5120 to 8192 GPU cores.

Intel doesn't appear to be shooting quite that high for the consumer space, but we expect to see Xe Graphics models sporting 96 EUs up to 512 EUs, and everything in between. Combine that with clock speeds of 1.5-2.0 GHz, which seems reasonable considering previous designs plus the move to 10nm+, and Intel could be pushing 12-16 TFLOPS of computational power on a 512 EU quad-chip configuration. Add in 8GB of GDDR6 memory (or maybe double that to 16GB) and Intel's highest performance Xe Graphics card could be a viable competitor to AMD and Nvidia GPUs. That's the theory at least, though we still don't know if ray tracing support will happen.

Drop down to a smaller GPU or a dual-chiplet configuration and we get more moderate mid-range performance. Half the EUs and GPU cores, half the raw compute, but drop to 6GB VRAM and keep six memory channels—a mid-range tier GPU in 2020 without at least 6GB VRAM simply isn't going to fly. Theoretical performance of 6-8 TFLOPS would put this middle class Xe Graphics solution in the same ballpark as Nvidia's RTX 2060 and AMD's RX 5600 XT, though of course drivers and other factors still need to be tested.

And finally, at the bottom of the heap, we have the budget Xe HP configuration. This would have a single GPU chiplet (or the smallest Xe Graphics dedicated GPU), 4GB of VRAM, and roughly half the performance of the mid-range model. With 128 EUs, that's 1024 cores and a potential 3-4 TFLOPS of compute, depending on clock speeds. There would likely be a higher and lower tier model, one with 96 EUs and no PCIe power connection required, and a second higher performance budget card with 128 EUs and a 6-pin power connector.
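
Pulling those tiers together, here's a speculative calculation of theoretical compute for the expected lineup. Every EU count, VRAM amount and clock speed here is our guess rather than a confirmed spec.

```python
# Projected theoretical compute for the rumored Xe HP lineup, using the
# 16 FP32 ops per EU per clock derived earlier and a speculative
# 1.5-2.0 GHz clock range.
def tflops(eus, clock_ghz):
    return eus * 16 * clock_ghz / 1000.0

lineup = [
    ("Budget (1 chip)",      96, "4GB"),
    ("Budget+ (1 chip)",    128, "4GB"),
    ("Mid-range (2 chips)", 256, "6GB"),
    ("High-end (4 chips)",  512, "8-16GB"),
]

for name, eus, vram in lineup:
    print(f"{name}: {eus} EUs, {vram} VRAM, "
          f"{tflops(eus, 1.5):.1f}-{tflops(eus, 2.0):.1f} TFLOPS")
# The high-end card works out to 12.3-16.4 TFLOPS, matching the
# 12-16 TFLOPS estimate above; mid-range lands at 6.1-8.2 TFLOPS.
```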

It's worth noting that Intel did say at one point during its CES presentation that Xe Graphics would be four times as fast as Gen9 Graphics. The above configurations would certainly hit that mark, and even exceed it. However, we don't know if Intel was saying four times simply from the architecture, i.e. when equalizing clock speeds and EU counts, or four times as fast overall. Xe LP integrated solutions already appear to be targeting the 4x increase relative to an integrated GT2 UHD Graphics 630 configuration. A 128 EU dedicated Xe Graphics card should have no trouble surpassing everything Intel has previously offered.

This concept rendering of Intel's Xe Graphics is probably a reasonable guess at what a larger card could look like. (Image credit: Intel)

Intel Xe Graphics Card Models
What will Intel call the dedicated Xe Graphics cards? It showed off the DG1 SDV (Discrete Graphics 1 Software Development Vehicle) at CES 2020, and while it repeatedly stated that the design wasn't representative of the final product, it was a nice looking card and doesn't really need any major changes in our view. Of course, it might also be too nice, particularly given the DG1 SDV would be the 'budget' version—there's no PCIe power connector. The metal shroud is certainly an extravagance that isn't needed for a sub-75W card.

Regardless, we anticipate at least four consumer models will be released. The budget card will come without a PCIe power connector, probably with 96 EUs and a 50W TDP (give or take). A step up from that will be the full single chip variant with 128 EUs, a 75W TDP and a 6-pin PCIe power connector—just to be safe, but maybe also to enable a bit of overclocking. Then there will be mid-range cards with two chips, assuming Intel takes the EMIB path—one partially enabled (i.e., two 96 EU chiplets) and one fully enabled. Those should have <150W TDP for the former, and perhaps 175-200W TDP for the latter, depending on clock speeds. Finally, the top consumer cards would have four chiplets and up to 512 EUs total (or a single monolithic die if Intel doesn't use EMIB), with higher boost clocks and up to a 300W TDP with dual PCIe 8-pin power connectors. If Intel wants to be aggressive, it will have a step down model that comes with slightly lower clocks and EU counts and a 225-250W TDP.

But what about the naming of the various models? Intel could play it straight with names like Intel Xe Graphics 96/128/192/256/384/512. That might be too easy. Intel might also go with DG2 branding—DG1 would be reserved for the test platform, and it would also let the original Intel i740 discrete graphics solution keep that title. Or maybe Intel will go with something similar to its Core branding: Xe9, Xe7, Xe5, and Xe3 families, with varying suffixes based on clock speeds. More likely, there will be a completely new brand reveal in the coming months.

There's also a question about whether Intel will be the sole provider of Xe Graphics cards, or if it will partner with other companies for third-party cards. For the initial launch, or at least until performance is more of a known quantity, we expect Intel will be the only provider of cards. That's basically what AMD and Nvidia do at launch as well with their reference designs, and it helps set the baseline expectations for performance, power and pricing. If Xe Graphics proves to be capable and desirable, third-party boards from the various motherboard and AIB (add-in board) companies could come later.

Intel does have a history of keeping things in-house as much as possible—it makes CPUs and chipsets, SSDs, Xeon Phi cards, NUCs and more. However, graphics cards have a lot of similarities to motherboards, so it's not hard to imagine a future where Intel mostly focuses on providing the GPUs, leaving the graphics card production and assembly to its partners. Well, except for Xe HPC, which will almost certainly be an in-house only product (i.e., like Xeon Phi).

Intel's Tiger Lake processors will feature Xe Graphics and are set to launch later this year. (Image credit: Intel)

Intel Xe Graphics Release Date
Intel has repeatedly targeted a 2020 release, and all indications are that it's still on track for that. Coronavirus delays might push things back a bit, but given that Intel is primarily responsible for the manufacture of the GPUs, hopefully late summer or early fall 2020 will still happen. We also know that Intel is prepping for the launch of its 10th Gen Core processors, aka Comet Lake, and the LGA1200 socket with Z490 chipset motherboards. We think those will arrive by June and then, maybe a month or two later, Intel will launch Xe Graphics.

Xe Graphics (Xe LP) is also featured in the upcoming Tiger Lake line of CPUs. We expect those to target laptops and mobile devices, just like the current Ice Lake lineup, but Intel has recently revised its plans during its earnings call and Tiger Lake should arrive later this year. That could be good news for anyone concerned with Intel's 10nm+ yields, as it suggests things have improved—though the Tiger Lake CPU is still very small, so perhaps not. 

How Much Will Intel Xe Graphics Cost? 
This is perhaps the most difficult question to answer. Intel traditionally doesn't like dealing with low margin parts. It entered, left, and re-entered the SSD storage market multiple times over the past decade due to profitability concerns. We also know that Intel traditionally wants to sell even its lowest performance Core i3 processors for at least $125, with Core i5 usually being priced closer to $200, Core i7 at $300 and up, and Core i9 at $500 or more. We mention those as a point of reference, and note that building graphics cards inherently means far higher base costs compared to CPUs.

With a CPU all you get is a small package and maybe a cooler. A graphics card needs the GPU, VRAM for the GPU, a PCB to hold the GPU and VRAM and other components, all the video ports, power connectors, and a good cooling solution. That means higher costs and lower margins. However, unlike the CPU realm, Intel is completely unproven in the GPU world. Actually, that's not true: Intel has repeatedly proven over the past decade that it makes inferior GPUs and bundles them into its CPUs.

Put simply, there's no way Intel can charge a price premium with consumer Xe Graphics (data center Xe HPC is a different matter). It needs to clearly beat Nvidia on performance as well as pricing—and matching Nvidia on features would help as well. AMD has been coming in second place for ages, and marketing isn't going to make up for a performance deficit.

Realistically, then, an Intel budget Xe Graphics solution needs to be priced around $125-$150 and be able to clearly match or exceed the performance of the GTX 1650 Super and RX 5500 XT. A $200-$250 model needs to at least match if not beat the GTX 1660 Super and hopefully come close to RX 5600 XT performance, while high-end models priced at $300 and up will need to take on Nvidia's RTX GPUs. Depending on when Xe Graphics arrives, it will also have to contend with AMD's RDNA 2 / Big Navi as well as Nvidia's Ampere / RTX 30-series. We fully expect both of those to deliver better performance than the current RTX 20-series, and Intel will need to keep up.

Intel's steampunk Oblivion concept graphics card, coming in 2035. Or 1865. (Image credit: Intel)

Final Thoughts on Intel Xe Graphics
The bottom line is that Intel has its work cut out for it. It may be the 800 pound gorilla of the CPU world, but even there Intel has faltered over the past several years. AMD's Ryzen has gained ground, closed the gap, and is now ahead of Intel in most metrics. As the graphics underdog, Intel needs to come out with aggressive performance and pricing, and then iterate and improve at a rapid pace. And please don't talk about how Intel sells more GPUs than AMD and Nvidia. Technically, that's true, but only if you count incredibly slow integrated graphics solutions that are at best sufficient for light gaming.

If Intel quadruples the performance of its current Gen9 Graphics, meaning UHD Graphics 630, that will still fall well short of the GTX 1650 Super and RX 5500 XT. Not only does Intel need to deliver better performance at viable prices, but it needs to prove that it can do more in the way of graphics drivers and regular releases. A 'game ready' Intel driver that basically recommends you set everything to minimum quality and run at 720p, and then hope you can still break 30 fps, is a far cry from drivers and GPUs that genuinely keep up with AMD and Nvidia.

Ideally, competition from Intel should help the graphics industry. A viable third player—maybe even a fourth if Huawei starts doing consumer GPUs—means more choice, and hopefully better prices. But that's all contingent on Intel actually delivering the goods. We'll find out in the coming months if Intel can finally join the dedicated GPU market in a meaningful way, or if it needs to head back to the drawing board yet again. Stay tuned.