CPUs


IBM Pairs Xilinx FPGAs to POWER8 to Create an Education Cloud Service

Today IBM announced “SuperVessel”, an OpenStack-based cloud service that lets students and developers build applications on POWER8-based infrastructure. What makes this cloud service interesting is the statement made by Hemant Dhulla, Vice President of Data Center and Wired Communications at Xilinx:

“Xilinx is delighted to have been chosen as the provider of FPGA accelerators for the IBM SuperVessel cloud. FPGA-based compute acceleration is a critical part of the OpenPOWER Foundation vision to handle demanding workloads in the most cost and power-efficient way. For this reason, a CAPI-enabled Xilinx FPGA is attached to every IBM POWER8 node in the SuperVessel cloud. The research and development being done in the SuperVessel is helping to define the future of heterogeneous computing.”

FPGAs, or field-programmable gate arrays, are traditionally used to implement a specific algorithm in hardware. The result is a bulky and expensive chip (produced in low quantities) that runs certain algorithms at very high speed and with low latency.

Offloading some processing tasks to a specialized chip is certainly nothing new. APUs are CPUs that offload some of their tasks to integrated GPUs. But quite a few parallel algorithms run fast yet rather inefficiently on GPUs. In many of those cases, an FPGA uses a lot less power.

Intel has been delivering “customized” Xeons to large customers such as Amazon and Facebook, and has promised to integrate Altera FPGAs inside certain Xeons. Intel recently announced that it is buying Altera for $16.7 billion.

But IBM seems to have beaten Intel to the FPGA punch with CAPI, the POWER8’s Coherent Accelerator Processor Interface. IBM does not integrate an FPGA inside the POWER8 package (yet); instead, the FPGA communicates coherently with the CPU over the PCI Express interface.

The most interesting fact about “SuperVessel” is that IBM has managed to build a cloud service that makes ample use of (traditionally expensive) FPGAs, and that the necessary software is in place to make it relatively easy to use those FPGAs. What software did IBM implement to offload some of the processing work to the Xilinx FPGAs? Unfortunately, so far we have only seen the press release, and it is very light on technical details. Nevertheless, it is interesting to note that the OpenPOWER Foundation is making a lot of progress in very little time: it was founded only at the end of 2013.

Xeon E3-1200 V4 launch: only with GPU integrated

Intel’s server CPU portfolio just got more diversified and complex with the launch of the Intel Xeon E3-1200 v4 at Computex 2015. It is basically the same chip as the Core i7 “Broadwell” desktop part that Ian reviewed yesterday: inside we find four Broadwell cores and a Crystal Well-backed Iris Pro GPU, baked with Intel’s state-of-the-art 14 nm process. The Xeon enables ECC RAM support, PCI passthrough, and VT-d, the first two being features that the desktop chips obviously lack, and VT-d being present in only some desktop chips.

But the current Broadwell-based Xeon E3-1200 v4 lineup is not a simple replacement for the current Xeon E3-1200 v3 “Haswell”, which we tested a few months ago. Traditionally, the Xeon E3 was aimed at either workstations or all kinds of low-end servers.

It looks like the current Xeon E3-1200 v4 is somewhat of a niche product. Besides being a chip for workstations with moderate graphics power, Intel clearly positions the chip as a video transcoding and VDI platform. It looks like, once again, Intel is delivering what AMD promised a long time ago. AMD’s Berlin, a quad-core Steamroller chip with a Radeon GPU, was supposed to address this market, but the product did not seem to convince the OEMs.

Intel claims that the 65W TDP E3-1285L v4 was able to decode 14 1080p (at 30 fps) 20Mbps streams, four or 40% more than the Xeon E3-1286L v3, which could only sustain 10 video streams. Another use case is virtual desktops that use PCI device passthrough to give the virtual machine (VM) full access to the GPU. That way of working is very attractive for an IT manager: it enables centralized management of graphical workstations in a secure datacenter.

But it should be noted that this kind of virtualization technology comes with drawbacks. First of all, only one VM gets access to the GPU: one VM literally owns the GPU (unlike with NVIDIA’s GRID technology). Secondly, you add network latency, something that many graphic designers will not like, as it adds lag compared to working on a local workstation with a beefy OpenGL card.

Below you can find the table of the 5 new SKUs. I added a sixth column with the Xeon-D, so you can easily compare.  

Intel Xeon E3 Broadwell Lineup (last column, Xeon D-1540, for comparison)

|                    | E3-1258L v4 | E3-1265L v4 | E3-1278L v4 | E3-1285 v4 | E3-1285L v4 | Xeon D-1540 |
|--------------------|-------------|-------------|-------------|------------|-------------|-------------|
| Price              | $481 | $418 | $546 | $557 | $445 | $581 |
| Cores              | 4 | 4 | 4 | 4 | 4 | 8 |
| Threads            | 8 | 8 | 8 | 8 | 8 | 16 |
| Base CPU Freq.     | 1.8 GHz | 2.3 GHz | 2 GHz | 3.5 GHz | 3.4 GHz | 2 GHz |
| Turbo CPU Freq.    | 3.2 GHz | 3.3 GHz | 3.3 GHz | 3.8 GHz | 3.8 GHz | 2.6 GHz |
| Graphics           | P5700, 1 GHz | Iris Pro P6300 (GT3e), 1.05 GHz | Iris Pro P6300 (GT3e), 1 GHz | Iris Pro P6300 (GT3e), 1.15 GHz | Iris Pro P6300 (GT3e), 1.15 GHz | none |
| TDP                | 47W | 35W | 47W | 95W | 65W | 45W |
| DRAM Freq. (DDR3L) | 1600MHz | 1866MHz | 1600MHz | 1866MHz | 1866MHz | DDR4-2133 |
| L3 Cache           | 6MB | 6MB | 6MB | 6MB | 6MB | 12MB |
| L4 Cache           | none | 128MB (Crystal Well) | 128MB (Crystal Well) | 128MB (Crystal Well) | 128MB (Crystal Well) | none |

It is pretty clear that the Xeon-D is a much more attractive server chip for most purposes: twice the number of cores and twice the amount of L3 cache, while remaining inside a 45W TDP power envelope. On top of that, the new Xeon E3 v4 still needs a separate C226 chipset and is limited to 32 GB of RAM. The Xeon-D does not need a separate chipset and supports up to 128 GB of DDR4.
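To put rough numbers on that comparison, here is a minimal sketch that divides the table's list price, TDP and L3 cache by the core count for the 65W E3-1285L v4 and the Xeon D-1540. The dictionary layout and the `per_core` helper are made up for illustration. Keep in mind that TDP is a rating rather than measured power, and that the E3's TDP also covers its integrated GPU, so treat the result as a back-of-the-envelope figure only.

```python
# Figures copied from the comparison table; the structure and names are illustrative.
chips = {
    "E3-1285L v4": {"price_usd": 445, "cores": 4, "tdp_w": 65, "l3_mb": 6},
    "Xeon D-1540": {"price_usd": 581, "cores": 8, "tdp_w": 45, "l3_mb": 12},
}


def per_core(chip):
    """Return price, TDP and L3 cache divided by the number of cores."""
    cores = chip["cores"]
    return {
        "usd_per_core": chip["price_usd"] / cores,
        "watts_per_core": chip["tdp_w"] / cores,
        "l3_mb_per_core": chip["l3_mb"] / cores,
    }


summary = {name: per_core(spec) for name, spec in chips.items()}

for name, m in summary.items():
    print(
        f"{name}: ${m['usd_per_core']:.2f}/core, "
        f"{m['watts_per_core']:.2f} W TDP/core, "
        f"{m['l3_mb_per_core']:.1f} MB L3/core"
    )
```

On these list prices the Xeon D-1540 comes out cheaper per core and has a much lower TDP per core, while the L3 cache per core is the same for both chips.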

In summary, the current Xeon E3-1200 v4 lineup is only interesting if you need a server chip for video transcoding, centralized workstations, or a local workstation with relatively modest graphical needs.

The Atom C2000 and hopefully the X-Gene 2 chips are the SoCs to watch if you want ultra-dense and relatively cheap server CPUs for basic server processing tasks (static web content, object caching). The Xeon E3-1240L v3 is probably still the best “single/lowly threaded performance”/watt champion. And the Xeon-D? Well, we will be reviewing that one soon…

Intel Launches Five 47W Laptop Broadwell SKUs

As part of Intel’s batch of announcements today, including Broadwell on the desktop and Thunderbolt 3, the 47W laptop/mini-PC processors that were also launched offer an interesting talking point. These are essentially the drop-in models for cu…
