AMD at ISSCC 2015: Carrizo and Excavator Details

AMD is using the International Solid-State Circuits Conference this week to present a paper and announce some interesting developments regarding the next iteration of the Bulldozer architecture, codenamed ‘Excavator’, as well as other details regarding the CPU range that it will be placed in called ‘Carrizo’.

At the tail end of 2014 we reported on Carrizo and AMD’s announcement for its next generation of APUs, and more recently the discussion surrounding Carrizo not coming to desktop. In those announcements AMD revealed that Carrizo will be aimed at the laptop and notebook community first and foremost, a first for the company as previous APU designs have been aimed at both the desktop and mobile markets.

From a hardware standpoint, Carrizo will be combining a number of Excavator modules, AMD’s R-Series GCN GPUs, and the chipset/Fusion Controller Hub into a single package, bringing with it full HSA compatibility, TrueAudio, and ARM TrustZone compatibility. As with Kaveri before it, Carrizo will be built on GlobalFoundries’ 28nm Super High Performance (28SHP) node, making Carrizo a pure architecture upgrade without any manufacturing changes. Today’s ISSCC paper in turn builds on these revelations, showing some of the data from AMD’s internal silicon testing.

AMD’s presentation confirms that the new Excavator cores are low power optimized rather than desktop optimized. Support for Mantle and DirectX 12 should go without saying, and Dual Graphics support is something AMD has been working on for a number of generations. The next point is interesting from my perspective:

“Single-chip integration of the APU and the Southbridge onto a single die”

In our pre-briefing call, AMD confirmed that the Southbridge/FCH is no longer a separate chip, and is being moved on to the CPU from its previously separate package. In fact, not only is the south bridge going to be part of the CPU with Carrizo, but it’s being fully integrated into the APU die itself. This is a first for AMD, and even Intel by comparison still uses two separate dies on the same package for its similar Broadwell-Y/U processors. As a result, AMD explained, this advances the Southbridge from the older 65nm/45nm processes to the 28nm 28SHP node, reducing power consumption and operating voltage. It also allows the APU to accurately control power gating, further saving power, and reduces the length of HyperTransport interconnects between the APU and the I/O. On the flip side, it does move the south bridge’s power consumption onto the APU, along with the transistors that previously sat on a separate chip. This is explained in detail below.

The key element to Excavator’s design is a reduction in die area. Fundamentally everything is the same in terms of operation compared to Kaveri, but the internal units such as the FP scheduler and cache control have been re-engineered to take up less room on the same 28nm SHP process node. It seems a little odd applying a ‘high-density’ design to a ‘high-performance’ process node, but AMD is stating that part of this has been driven by the GPU team sharing its experiences and knowledge of small, efficient die components with the CPU team, allowing the lessons learned there to benefit AMD’s CPU designs. This is combined with a “GPU-oriented” design stack on the CPU, which AMD is showing provides significant power savings at the same frequency, or higher frequency at the same power.

The high density, power optimized design also plays a role in the GPU segment of Carrizo, offering lower leakage at high voltages as well as allowing a full 8 GCN core design at 20W. This is an improvement from Kaveri, which due to power consumption only allowed a 6 GCN design at the same power without compromising performance.

AMD revealed Voltage Adaptive Operation back with Kaveri, and it makes a reappearance in Carrizo with its next iteration. The principle here is that with a high noise line, the excess voltage will cause power to rise. If the system reduces the frequency of the CPU during high noise/voltage segments – as power is proportional to voltage squared – power consumption will be reduced and then frequency can be restored when noise returns to normal. This happens inside the CPU over nanoseconds, resulting in no serious performance loss but it helps keep the power consumption of the APU down. In the case of Carrizo, AMD is quoting a 10-20% reduction in power consumption versus what a theoretical Carrizo would look like without this technology.
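The voltage-squared relationship mentioned above can be made concrete with the standard first-order model for dynamic CMOS power, P ≈ C·V²·f. The sketch below is only an illustration of that textbook relation: the function name and the 5% voltage and frequency figures are assumptions chosen for the example, not AMD data, and leakage power is ignored.

```python
def relative_dynamic_power(voltage_scale: float, frequency_scale: float) -> float:
    """Dynamic power relative to baseline under P ~ C * V^2 * f.

    Both arguments are ratios to the baseline (1.0 means unchanged).
    Switched capacitance C is assumed constant, so it cancels out.
    """
    return voltage_scale ** 2 * frequency_scale


# Hypothetical numbers for illustration only (not AMD measurements):
# a 5% lower voltage at unchanged frequency...
lower_voltage = relative_dynamic_power(0.95, 1.0)
# ...and the same voltage reduction combined with a brief 5% frequency drop.
lower_both = relative_dynamic_power(0.95, 0.95)

print(f"5% less voltage: {(1 - lower_voltage) * 100:.1f}% less dynamic power")
print(f"5% less voltage and frequency: {(1 - lower_both) * 100:.1f}% less dynamic power")
```

Because voltage enters squared, a small voltage reduction saves roughly twice as much power in percentage terms, which is why trimming voltage margin is worth a momentary frequency dip.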

Another new addition to Excavator comes in the form of Adaptive Voltage-Frequency Scaling modules. Carrizo uses 10 in each Excavator ‘core’, and these modules can adjust the frequency and voltage of individual components depending on power requirements, temperature and other external factors in order to improve either performance, power consumption, or efficiency. With this in mind, AMD is claiming a 29% frequency increase at 10W, or if frequency is held constant then there is a 40-50% power decrease at the same 10W. At 20W, as the graph shows, there is almost no difference between the two, indicating that Excavator is truly built for lower TDP devices.

AMD is also presenting news on improvements to their ability to quickly enter and exit sleep states. With Excavator, AMD can now go from a sub-50mW S0i3 state to an active state in under a second. This should allow Carrizo devices to quickly reach and better sustain near-standby power levels, improving idle and low-load power consumption. As shown in the slide, at the S0i3 state only the ACP, PCH, and a small I/O segment are still active, while the rest of the device is completely power gated.

Meanwhile AMD is also once again showing off their technology timeline to illustrate their progress in implementing new technologies over the years. We confirmed that an interesting feature, inter-frame power gating, is active in Carrizo. This in a nutshell allows the GPU to go to a low frequency mode when the frame buffers are full. Though only a few milliseconds of power savings per instance, over time this can add up to larger increases in battery life.

Wrapping up the hardware aspects of their ISSCC presentation, AMD is also disclosing the die size and transistor counts for Carrizo. Whereas Kaveri weighed in at 2.3 billion transistors in a 245mm2 die, Carrizo will come in at a much larger 3.1 billion transistors in a 250mm2 die. This is a significant increase in transistor density for AMD, with Carrizo packing in 29% more transistors for only a marginal increase in die size. Though AMD is not explaining where all of the transistor increases come from at this time, part of the increase comes from the Southbridge/FCH being moved on-die, which AMD tells us will take up 5.5% of Carrizo’s die. As for the Excavator cores themselves, AMD is stating that they consume 40% less power and take up 23% less die area, thanks to the combination of transistor density improvements, AVFS technology, and bringing the FCH on-die.

Moving on, although AMD’s ISSCC presentation is not going to be diving deep into the Excavator architecture, AMD is claiming that Excavator will also bring with it a 5% IPC boost. We understand that this increase in IPC comes from a doubling of the L1 data cache from 64KB to 128KB, as well as further payoffs from the power improvements. Meanwhile on the fixed-function side of matters, Carrizo will be introducing a full H.265 hardware decoder. This is the first AMD part (CPU or GPU) to offer any kind of hardware support for H.265 decoding, and in the process it will be the first x86 CPU/APU to offer full hardware decode capabilities, as Intel still relies on a hybrid decode approach at this time.

Finally, AMD is also rolling out some new Heterogeneous System Architecture (HSA) functionality as part of Carrizo. HSA is seen as one of the next key factors in personal computing over the next decade. We have seen an almost ubiquitous shift in recent years towards almost every consumer processor having on-die graphics, and the ability to optimize a workload for each part of the system improves the experience. AMD has been riding this wave, announcing Kaveri as ‘HSA Ready’ and now Carrizo as ‘HSA Compliant’, fully adhering to HSA 1.0 specifications. At the moment the biggest benchmark showing off this power is PCMark, something AMD likes to promote. As for the difference between HSA Ready and HSA Compliant, I asked AMD what made Kaveri different. The answer was straightforward enough: Carrizo is able to perform GPU context switching, allowing a GPU state-save and state-restore, something Kaveri is unable to do and offering a solid hint that Carrizo’s GPU is based on AMD’s GCN 1.2 architecture.

Wrapping things up, the combination of a 5% IPC boost and 40% power savings means that AMD has a range of options for Carrizo parts, picking between increased clockspeeds at the same power levels or holding clockspeeds constant for a larger battery life gain. We expect that actual retail parts will be somewhere in the middle, as the graphs in the slides indicate that best efficiency occurs around the 10W scenario.

We did ask about absolute design numbers regarding battery life, processor frequencies and time to market. As expected, AMD is keeping its cards close to its chest, especially in a more academic environment such as ISSCC. At this point in time we were told that Carrizo is expected to come to market within Q2. With Computex taking place towards the end of Q2, this should mean that a number of Carrizo devices will either be on the market or at least on display for us to examine.

 


AMD Reports Q4 FY 2014 And Full Year Results

AMD president and CEO, Dr. Lisa Su, announced the company’s Q4 results, with revenue for the quarter coming in at $1.24 billion, with a gross margin of 29%. Earnings per share based on GAAP results was a loss of $0.47 per share. Compared to Q3, revenue dropped 13%, and year-over-year the drop was 22%. Operating income dropped $393 million from Q3 (623% decrease) and is well down from the Q4 2013 value of $135 million with a posted operating loss this quarter of $330 million. Net income fell from $17 million last quarter and $89 million last year to a $364 million loss, which is a pretty substantial change.

AMD Q4 2014 Financial Results (GAAP)
                      Q4’2014    Q3’2014    Q4’2013
Revenue               $1.24B     $1.43B     $1.59B
Operating Income      -$330M     $63M       $135M
Net Income            -$364M     $17M       $89M
Gross Margin          29%        35%        35%
Earnings Per Share    -$0.47     $0.02      $0.12
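As a quick check on the sequential and year-over-year comparisons quoted above, the percentage changes can be recomputed from the revenue row of the GAAP table. This is a minimal sketch; the helper name `percent_change` is ours and not part of any AMD material.

```python
def percent_change(new: float, old: float) -> float:
    """Percentage change from old to new, relative to the magnitude of old."""
    return (new - old) / abs(old) * 100.0


# Revenue in billions of dollars, taken from the GAAP table above.
revenue = {"Q4 2014": 1.24, "Q3 2014": 1.43, "Q4 2013": 1.59}

sequential = percent_change(revenue["Q4 2014"], revenue["Q3 2014"])
year_over_year = percent_change(revenue["Q4 2014"], revenue["Q4 2013"])

print(f"Sequential revenue change: {sequential:.0f}%")
print(f"Year-over-year revenue change: {year_over_year:.0f}%")
```

Both results line up with the 13% and 22% declines reported for the quarter.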

AMD had three large reasons for the loss this quarter, all of which hit their GAAP numbers pretty hard. First, they had yet another write down of their SeaMicro and ATI acquisitions, which they attribute to a decline in their stock price. This cost them $233 million this quarter. Second, they have had to perform a write down for their second generation APU products, which are listed higher on their balance sheet than they can now be sold for; however, they do expect to sell through their inventory. This contributed to a $58 million non-cash charge. Finally, restructuring charges based on layoffs and the departure of their CEO, as well as real estate restructuring charges, cost an additional $71 million. As these are all one time charges, AMD has also released Non-GAAP results which exclude these write downs.

On a Non-GAAP basis, operating income was $36 million, which is down 45% from last quarter’s $66 million value, and down year-over-year from the $91 million operating income from Q4 2013. Net income equates to $2 million, down from $20 million last quarter and $45 million last year, and Non-GAAP earnings per share is $0.00, which can also be spelled as zero, and which missed analysts’ expectations of $0.01 per share. The core business is getting to the break-even point, and AMD has said that they have had six consecutive quarters of Non-GAAP profitability, but even that is on a razor’s edge with this quarter’s numbers.

AMD Q4 2014 Financial Results (Non-GAAP)
                      Q4’2014    Q3’2014    Q4’2013
Revenue               $1.24B     $1.43B     $1.59B
Operating Income      $36M       $66M       $91M
Net Income            $2M        $20M       $45M
Gross Margin          34%        35%        35%
Earnings Per Share    $0.00      $0.03      $0.06

Full numbers for the year had revenue of $5.51 billion, up from $5.30 billion in 2013, and the GAAP operating loss was $155 million for 2014, down from the $103 million operating income for 2013. GAAP net income for 2014 was down as well, to a $403 million loss, exceeding 2013’s loss of $83 million. Non-GAAP values for 2014 were slightly better, with a $235 million operating income and a small profit of $51 million for the fiscal year. Once again, the core business is breaking even, but the heavy write downs are hurting the bottom line.

Looking at individual business lines, the Computing and Graphics segment had a net revenue for Q4 of $662 million, down 15% from Q3 and down 16% from Q4 2013. Lower desktop processor and GPU sales are to blame for the decline from last quarter, and desktop processor and chipset sales are called out for the decrease from last year’s numbers. Operating loss for the segment was $56 million, as compared to $17 million in Q3 and $15 million in Q4 2013. Lower channel sales were partially offset by lower operating expenses. Average selling price actually increased both sequentially and year-over-year for processors and chipsets, but GPU selling price decreased year-over-year. This is a soft spot for AMD, and they are diversifying their business outside of the traditional PC space in an attempt to keep one weak line from hurting the company so much. In 2012, about 90% of AMD’s business was based on the traditional PC industry, and by 2014 it was down to 60%, with the other 40% consisting of professional graphics, semi-custom chips, ARM based server, embedded, and ultra low-power clients. By 2015 they are estimating that 50% of their business will be these new markets.

Looking at the Enterprise, Embedded, and Semi-Custom group at AMD, you can see why they are moving in that direction. For the full fiscal year, this group contributed $2.374 billion in revenue and had an operating income of $399 million. Both are up from 2013, when the group managed only $1.577 billion in revenue and $295 million in operating income. Looking at the quarter itself, revenue fell 11% from Q3’s $648 million to $577 million, which AMD attributed to a large run-up of chips for the Xbox One and PlayStation 4 in Q3, as Microsoft and Sony built up inventory for the holiday season. Operating income for Q4 was $109 million, down from $129 million in Q4 2013 and up slightly from the $108 million last quarter.

The “All Other” category had no revenue for the quarter, but took a $383 million operating loss, bringing its operating loss for 2014 to $478 million. This is the category that is taking the write downs we have already discussed.

Looking ahead to 2015, AMD is listing the year as “profitable”, at least as far as Non-GAAP figures go. Q1 2015 guidance is for a 15% drop in revenue, plus or minus 3%. Gross margin should be up 5 percentage points to 34%.
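To put the revenue guidance in dollar terms, the sketch below applies the guided change to Q4 2014 revenue. It assumes the "plus or minus 3%" is a band of percentage points around the 15% drop (that is, a 12% to 18% decline), which is our reading of the guidance; the function name is hypothetical.

```python
def guidance_range(previous_revenue: float, change: float, tolerance: float) -> tuple[float, float, float]:
    """Return (low, midpoint, high) revenue implied by a guided change +/- tolerance.

    change and tolerance are fractions, e.g. -0.15 and 0.03.
    Assumes the tolerance is in percentage points around the guided change.
    """
    low = previous_revenue * (1 + change - tolerance)
    mid = previous_revenue * (1 + change)
    high = previous_revenue * (1 + change + tolerance)
    return low, mid, high


# Q4 2014 revenue in billions of dollars; guidance is a 15% drop, plus or minus 3%.
low, mid, high = guidance_range(1.24, -0.15, 0.03)
print(f"Implied Q1 2015 revenue: ${low:.2f}B to ${high:.2f}B (midpoint ${mid:.2f}B)")
```

Under that reading, the guidance implies Q1 2015 revenue of a little over $1 billion.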

AMD has seen some pretty serious competition in the PC segment, which is still their largest single contributor. Intel has just released their 14 nm parts, with a new CPU architecture due out later this year with Skylake. AMD does have a new APU on the horizon though with Carrizo, and they had working prototypes at CES. While CPU performance will likely not stun anyone due to the new CPU being still based on the Bulldozer architecture, GPU performance should be very competitive. These will be 15 to 35 watt parts, so as far as TDP they will compete against the just launched Broadwell-U. For lower power, AMD will have the Carrizo-L based on Puma+ CPU cores for the 10-25 watt range. All of this will still be on 28 nm though, which puts AMD at a pretty significant disadvantage for efficiency. The increased GPU power may be enough to sway some customers, since many people find they are not CPU bound anyway. Time will tell, and we look forward to seeing the new chips show up so we can test them out.

Source: AMD Investor Relations