Examining Huawei's Benchmark Optimizations in the Ascend P7

While benchmark optimization has been a hot topic, recently it has faded into the background as the industry adjusted. Previously, we saw changes such as an automatic 10% GPU overclock that was almost never achieved in normal applications, and behavior that would automatically plug in all cores and set the CPU frequency to maximum. Now, most OEMs have either stopped this behavior or added options that make it possible to use the altered CPU/GPU governor in all applications.

Unfortunately, I have to talk about a case where this isn’t true. While I’ve been working on reviewing the Ascend P7 and have found a lot to like, I am sure that the Ascend P7 alters CPU governor behavior in certain benchmarks. For those that are unfamiliar with the Huawei Ascend P7, it’s considered to be Huawei’s flagship smartphone. It’s equipped with a Kirin 910T SoC, which has four Cortex A9r4 CPUs running at a maximum of 1.8 GHz, along with two gigabytes of RAM and a five-inch 1080p display.

To test for differences in governor behavior, we’ll start by looking at how the P7 normally behaves when faced with a benchmark workload. I haven’t seen any differences in GPU behavior, as the GPU governor seems to stay clocked at an appropriate level regardless of the benchmark. On the CPU side, with a renamed version of the benchmark the governor is noticeably reluctant to reach 1.8 GHz. For the most part this only happens in short periods, and there is a great deal of variation in clock speeds, with an average of about 1.3 GHz throughout the test.

With the Play Store version of the same benchmark, we can see a significant difference in the CPU frequency curve. There’s far more time spent at 1.8 GHz, and the frequency profile is incredibly tight outside of the beginning and end. The average frequency is around 1.7 GHz, which is significantly higher than what we see in the renamed version of the benchmark.

While the graph of plugged cores for the renamed benchmark is somewhat boring, it’s important as it shows that only three cores are plugged for the full duration of the test. Any noticeable deviation from this pattern would definitely be concerning.

When running the same workload on the Play Store version of GFXBench, we see that four cores are plugged for almost the entirety of the test. While I’m not surprised to see this kind of behavior when combined with altered frequency scaling, it’s a bit disappointing. Strangely, this policy doesn’t seem to be universal either as I haven’t seen evidence of altered behavior in Huawei’s Snapdragon devices. This sort of optimization seems to be exclusive to the HiSilicon devices. Such behavior is visible in 3DMark as well, although it doesn’t seem to happen in Basemark OS II or Basemark X 1.1.
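The article does not say which tool was used to log clock speeds and plugged cores. As a rough sketch, assuming the device exposes the standard Linux sysfs files `scaling_cur_freq` and `online` (whether they are readable on the P7 is an assumption), a comparison like the one above could be summarized as follows. The helper names `parse_online_cpus`, `read_sample`, and `summarize` are hypothetical.

```python
from pathlib import Path
from statistics import mean

# Standard Linux sysfs location; access on a given phone is an assumption.
CPU_ROOT = Path("/sys/devices/system/cpu")


def parse_online_cpus(spec):
    """Expand a sysfs CPU list such as '0-2' or '0,2-3' into core indices."""
    cpus = []
    for part in spec.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_sample(cpu=0):
    """Read one (frequency in kHz, online-core string) sample from sysfs."""
    freq_file = CPU_ROOT / f"cpu{cpu}" / "cpufreq" / "scaling_cur_freq"
    khz = int(freq_file.read_text())
    online = (CPU_ROOT / "online").read_text()
    return khz, online


def summarize(samples, max_khz=1_800_000):
    """Reduce (kHz, online-core string) samples to a few summary figures."""
    freqs = [khz for khz, _ in samples]
    cores = [len(parse_online_cpus(spec)) for _, spec in samples]
    return {
        "avg_ghz": mean(freqs) / 1e6,
        "share_at_max": sum(f >= max_khz for f in freqs) / len(freqs),
        "max_cores_online": max(cores),
    }
```

Calling `read_sample()` in a loop while each copy of the benchmark runs, then passing both sample lists to `summarize`, would give the average clock, the share of time at 1.8 GHz, and the number of plugged cores for a side-by-side comparison.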

Huawei Ascend P7 Performance
                        Play Store   Renamed   Perf Increase
GFXBench T-Rex          12.3         10.6      +16%
3DMark Ice Storm U/L    7462         5816      +28.3%

While normally such optimizations have a small effect, in the case of the affected benchmarks the difference is noticeable and quite significant. Needless to say, it’s not really acceptable that Huawei is doing this, and I’m disappointed that they have chosen this path.
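For reference, the performance increase figures in the table can be reproduced from the two score columns. This is a small sketch of that arithmetic, treating the renamed build as the baseline; the function name `percent_increase` is only illustrative.

```python
def percent_increase(play_store, renamed):
    """Relative gain of the Play Store score over the renamed baseline, in percent."""
    return (play_store - renamed) / renamed * 100.0


# Scores taken from the table: (Play Store, Renamed)
results = {
    "GFXBench T-Rex": (12.3, 10.6),
    "3DMark Ice Storm U/L": (7462, 5816),
}

for name, (play_store, renamed) in results.items():
    print(f"{name}: +{percent_increase(play_store, renamed):.1f}%")
# GFXBench T-Rex: +16.0%
# 3DMark Ice Storm U/L: +28.3%
```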

In response to this issue, Huawei stated the following:

“CPU configuration is adjusted dynamically according to the workload in different scenarios. Benchmark running is a typical scenario which requires heavy workload, therefore main frequency of CPU will rise to its highest level and will remain so for a while. For P7, the highest frequency is 1.8GHz. It seldom requires CPU to work at the highest frequency for long in others scenarios. Even if the highest level appears, it will only last for a very short time (for example 400 ms). Situation is the same for most devices in the market.”

Unfortunately, I’m not sure how this statement explains the situation, as two identical workloads performed differently. While I was hoping to see an end to rather silly games like this, it seems that the road to OEMs stopping this kind of behavior will be longer than I first expected. Ultimately, such games don’t affect anyone that actually knows how to benchmark SoCs and evaluate performance, and one only needs to look to the PC industry to see that such efforts will ultimately be discovered and defeated.

 

Unity Adds Native x86 Support for Android

Intel is facing an uphill battle in the mobile space from a market share perspective, but there’s an additional challenge: the bulk of mobile apps are compiled targeting ARM-based CPU cores, not x86. With the launch of Medfield on Android, Intel introduced a binary translation (BT) software layer to enable running existing ARM-based Android apps on x86. Binary translation is a useful fix for enabling compatibility, but it does come with a performance and power penalty. Enabling native x86 applications is ultimately the goal here; BT is just used as a transitional tool.

As far as I can tell, none of the big game engines (Unity, Unreal Engine) were ported to x86 on Android. As a result, any game that leveraged these engines would be ARM code translated to run on x86. This morning Intel and Unity Technologies announced a native x86 version of the Unity game engine for Android. Selected developers have access to the x86 version today, and it’ll be made available to everyone else by the end of the year. There’s no charge for the update. Note that this only applies to the Android Unity port; the engine under Windows and all Windows tools are, obviously, already compiled for x86.

Intel’s press release mentions support for both Core and Atom families. I clarified with Intel that the Core reference mainly applies to any Core M (Broadwell Y or Skylake Y) Android tablets, and not a push into Core-based smartphones.

Intel is also working on enabling other game engines, but we’ll have to wait to see those announcements. 
