Sunday, February 8, 2015

Android 5.0 and 64-bit processors, or Nexus 9 and another round of … – PCLab.pl

All signs indicate that this year's smartphone and tablet market will be defined by 64-bit ARMv8 processors. Chips of this kind are not new in themselves: the iPhone 5S has been around for over a year, and Android phones with new-generation Snapdragons and Exynoses have been on sale for some time. But the big missing piece of the 64-bit puzzle appeared only recently: Android 5.0, which is what makes it possible to put all of this to practical use. So when the Nexus 9, the first gadget with both a 64-bit processor and Android 5.0, finally arrived at our editorial office, I started figuring out how to borrow it for a few moments.

I have wanted to take a closer look at the Nexus 9 since preparing this text, in which I even hinted that I planned to check what can be squeezed out of the combination of Android 5.0 and a 64-bit ARM processor. Unfortunately, I had less time to play than I would have liked, so I did not manage to test everything to a satisfactory degree or to unravel all the mysteries I ran into along the way. Still, in the few hours I spent changing compiler flags I managed to squeeze out a handful of interesting conclusions about ARMv8 and Android 5.0, and along the way I think I learned that Nvidia plans to fit the next Tegra with Cortex-A57 cores instead of its own Denver cores ;)

[Image: Tegra]

The advantage of ARMv8

The benefits of using the full potential of next-generation ARM chips fall into two categories. The first covers the efficiency gains that come with moving to 64-bit code: the processor then has more registers available, can perform some calculations faster, and so on. For this to work as it should, though, developers must adapt their software to chips of this type; without that, it will run only as fast (or marginally faster) as on a 32-bit processor. The second category is the performance gain from putting the new ARMv8 instructions to work, which significantly speed up certain operations (sometimes by an order of magnitude, as can be seen a little further below).

First I wanted to see what the move to 64 bits alone can give, so I tried recompiling the tools I have been using for some time to test smartphone processors: x264, 7-Zip and LAME. The first produced predictable results, the second I (unfortunately) have not yet managed to tame, and the third… I will come back to it in a moment.

x264 is a very rewarding piece of code, because everything in it that could be optimized was optimized long ago, so I was sure that if it compiled properly for ARMv8, something interesting would happen. And indeed it did. More or less.

[Charts: x264_kl, x264_1p, x264_2p]

Two of the tests show a speed improvement of a few percent, so there everything went according to plan, and I suspect they are a good representation of the average gain from moving to ARMv8 and 64 bits in most applications that have anything to do with the NDK. And how to explain the third chart? It turned out that the culprit was the Tegra K1 overheating. In theory the processor is clocked at 2.3 GHz, but during second-pass encoding the clocks hold at about 1.5 GHz. Interestingly, when compressing video with the 64-bit version of the tool, after a few minutes the core clocks were about 100-200 MHz lower than with the 32-bit version. If you think about it a little longer, it even makes sense: a 64-bit application makes fuller use of the CPU, which therefore gives off more heat at the same clock speed, so the thermal and power limiter has to cut the clocks harder, and in this tablet it does so drastically enough to wipe out any potential performance benefit. Oh, these modern smartphone-and-tablet chips. How can you not love them in winter? ;)

Much larger performance gains, however, are to be found wherever data encryption comes into play, for which ARMv8 processors received a dedicated set of instructions. For example here:
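To put rough numbers on the throttling argument, here is a minimal Python sketch. It models throughput simply as work per clock times clock speed. The 5% per-clock gain is an assumed stand-in for "a few percent", and the 1500 MHz baseline and the 100-200 MHz drops are the approximate figures quoted above, not precise measurements.

```python
def net_speedup(per_clock_gain, clock_32_mhz, clock_64_mhz):
    """Throughput of the 64-bit build relative to the 32-bit build.

    Simplified model: throughput = (work per clock) * (clock speed).
    per_clock_gain is the fractional gain of 64-bit code, e.g. 0.05 for 5%.
    """
    return (1.0 + per_clock_gain) * clock_64_mhz / clock_32_mhz


# Assumptions: 5% per-clock gain, 32-bit build sustaining about 1500 MHz.
ASSUMED_GAIN = 0.05
BASE_CLOCK_MHZ = 1500

for drop_mhz in (0, 100, 200):
    ratio = net_speedup(ASSUMED_GAIN, BASE_CLOCK_MHZ, BASE_CLOCK_MHZ - drop_mhz)
    print(f"64-bit clocks {drop_mhz} MHz lower -> relative throughput {ratio:.3f}")
```

Under these assumptions, a clock deficit of 100 MHz is already enough to push the 64-bit build slightly below the 32-bit one, which is consistent with the throttling explanation.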

[Charts: gb3_3, androsr, androsw, androrw]

At first glance these charts may not seem to have much in common with the subject of this post, but appearances can be deceptive. Looking at the first one, keep in mind that the Geekbench 3 sub-tests measuring the processor's integer performance include several that measure data encryption speed and that use the new ARMv8 instructions (if they can). The result? In these sub-tests the Nexus 9 is up to 20 times (yes, that is not a mistake) faster than devices based on Snapdragon 80x chips, which heavily influenced the overall score.

And what do the storage tests have to do with this topic? In the article about the Nexus 6 I mentioned that it has data encryption enabled out of the box, which hurts file read and write speeds as well as overall system responsiveness. The Nexus 9 has it too, but thanks to the blessings of ARMv8 this hardware does not seem bothered by it at all. So if you care about data security, switching to a new processor and Android 5.0 will give you measurable and easily noticeable benefits beyond the extra gigabytes of RAM.
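Whether a given chip exposes these encryption instructions can usually be checked from userspace. The sketch below parses `/proc/cpuinfo`-style text and looks for the flag names that 64-bit ARM Linux kernels commonly use for the crypto extensions (`aes`, `pmull`, `sha1`, `sha2`). The sample text is illustrative and was not captured from a Nexus 9; what a particular device actually reports is an assumption here.

```python
# Flag names commonly reported by 64-bit ARM Linux kernels for the
# ARMv8 cryptography extensions.
CRYPTO_FLAGS = {"aes", "pmull", "sha1", "sha2"}


def crypto_features(cpuinfo_text):
    """Return the set of ARMv8 crypto flags found in cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            found |= CRYPTO_FLAGS & set(value.split())
    return found


# Illustrative sample only (hypothetical, not taken from a real device).
SAMPLE = "processor : 0\nFeatures : fp asimd aes pmull sha1 sha2 crc32\n"

print(sorted(crypto_features(SAMPLE)))
```

On a device, the same function could be fed the contents of `/proc/cpuinfo`; an empty result would suggest that encryption falls back to ordinary instructions.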



And then there is the unfortunate LAME…

As you may have noticed, I have been avoiding the subject of the LAME test. Why? This is why:

[Chart: lame]

This result makes no sense, or at least I cannot find any. I compile the LAME library into ffmpeg, and it is ffmpeg that runs this test. I checked the build scripts to see whether I had added or left out something somewhere, but they differ only in the processor family the code is optimized for (ARMv7 + NEON vs ARMv8) and in the target Android version (4.4 vs 5.0). Somewhere in those two lines some magic happens that I do not understand, because I have never heard of a new instruction set (or a new version of Android) having such an impact on the speed of creating MP3 files. If anyone has an explanation for this phenomenon, I would be glad to hear it.



Nvidia's Denver and the ARM food chain

Back in early 2011 Nvidia announced that it was working on its own 64-bit architecture compatible with the ARMv8 instruction set (that was when the codename Denver was first mentioned). My impression is that by choosing this path Nvidia expected to get ahead of the competition in the race for supremacy in the 64-bit ARM processor market: Qualcomm seemed set to keep flogging successive Kraits until the end of the world and two days longer (or until they started melting phone cases), and Cortex-A57 cores were treated as a song of the distant future, something that would go to microservers first. So it would be enough to push hard and hit the right moment, and the staff of the green graphics card maker would find out how Gaben feels after a big Steam sale.

Too bad that in the meantime the Apple A7 appeared out of nowhere and everything changed: Nvidia no longer had a chance to lead, and the people at Samsung and Qualcomm had to get down to serious work. Admittedly this is only my speculation, but I think it is not too far from the truth, because creating your own architecture is difficult, costly and risky, so it is done only when there is a plan for the whole effort to translate into concrete gains. What am I getting at? The performance of the Tegra K1 64 (that is, the one with two Denver cores) in the Nexus 9, and how it compares with other chips, both the new ARMv8 ones and the ARMv7 ones. And given what I mentioned earlier about what happens to this processor's clocks under sustained load, it does not look very optimistic.

A handful of comparisons:

[Charts: lame, 7zip, x264_kl, x264_1p, x264_2p, gb3_5, gb3_6, gb3_4, gb3_8, 3dmphys_u]

The single-threaded LAME test shows that a single Denver core performs comparably to a single Cortex-A57 core. The 7-Zip compression test barely scales beyond two cores, so it tells us that two Denver cores clocked at 2.3 GHz are a little faster than two Cortex-A57 cores clocked at 1.9 GHz. The Tegra K1 64 does poorly in x264, which is largely the result of the thermal limiter working overtime, but even without it the result would be nothing great. Time for synthetics. Geekbench? After discarding the integer tests that strongly favor the Nexus 9, and correcting for the fact that this processor has a very good memory controller, so the memory bandwidth results also inflate the averaged final score, we are left with the floating-point test, and it tells us that Denver's advantage over the Cortex-A57 comes mainly from its higher clock speed. 3DMark? This benchmark scales almost linearly with the number of cores, so the Nexus 9's score, once doubled, looks good. But then you look at the iPhone 6 Plus result, check its CPU clock, take into account that this phablet never throttles its CPU and… finish the sentence yourselves ;)

Many of you are probably angry with me now, because "it would be enough to make a 4-core Tegra K1, which could compete with the latest 4-core Cortexes, so the performance is not bad, is it?" Not really, because as I said, the Tegra K1 already has trouble holding its clocks, so a 4-core version would probably do to phone cases what the Alien's blood did to the floors of the Nostromo. Theorizing? Maybe so, but to me it is very telling that the Tegra X1 will have four Cortex-A57 cores, and that nobody is lining up for the Tegra K1 64. I do not mean to say that the Tegra K1 is weak, because it does not lack performance, and for Nvidia's first architecture it does not look bad, but it seems to me that the company's engineers failed to achieve the intended effect. All the more so because this is a very capricious processor: its performance depends strongly on the situation and on the code being run (as explained in AnandTech's great review of the Nexus 9), and on top of that it reserves an additional 128 MB of RAM for itself, used as a cache needed to improve its efficiency.
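The "mainly due to a faster clock" reasoning is just a per-clock normalization, which can be sketched in a few lines of Python. The clock speeds are the ones quoted above; the two benchmark scores are hypothetical placeholders chosen only to illustrate the arithmetic and are not results from the charts.

```python
def per_clock_ratio(score_a, clock_a_ghz, score_b, clock_b_ghz):
    """Ratio of per-GHz performance of chip A to chip B.

    A value close to 1.0 means any lead in the raw score is explained
    by clock speed alone.
    """
    return (score_a / clock_a_ghz) / (score_b / clock_b_ghz)


DENVER_GHZ = 2.3  # clock quoted in the text
A57_GHZ = 1.9     # clock quoted in the text

clock_advantage = DENVER_GHZ / A57_GHZ
print(f"Denver clock advantage: {clock_advantage:.2f}x")

# Hypothetical scores (NOT measured): Denver leads by 21% in raw points.
HYPOTHETICAL_DENVER_SCORE = 121.0
HYPOTHETICAL_A57_SCORE = 100.0

ratio = per_clock_ratio(HYPOTHETICAL_DENVER_SCORE, DENVER_GHZ,
                        HYPOTHETICAL_A57_SCORE, A57_GHZ)
print(f"Per-clock ratio (Denver / A57): {ratio:.3f}")
```

With a raw lead of about the same size as the clock advantage, the per-clock ratio lands near 1.0, which is what "comparable clock for clock" means in the summary.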

[Image: nvidia-Tegra-x1-chip]

A completely separate topic is the satisfying performance of the graphics chip, which is fully compatible with regular OpenGL and not just with its ES-suffixed substitute (though Google came out with the Nexus 9 OpenGL driver…), but that is not what I am here to talk about today ;)

A brief summary

To finish, a quick summary and a handful of (preliminary) conclusions:

  • Moving to 64-bit code and basic optimization for ARMv8 can yield anywhere from a few to a dozen or so percent of extra performance
  • The new encryption-related instructions work and deliver tangible benefits wherever data security is needed
  • A core based on the Denver architecture has clock-for-clock performance comparable to the Cortex-A57, although it seems this is achieved at the cost of higher power consumption, capricious behavior and taking 128 MB of RAM away from the user

Thank you for your attention :)
