A12 Bionic

What Apple didn’t announce about A12
r/apple


What Apple didn’t mention about the A12 is that it’s the first consumer-available processor that supports ARMv8.3, which adds a feature called pointer authentication codes (PAC).

It essentially signs certain pointers (return addresses, for example) with a cryptographic tag and verifies that tag before the pointer is used, all on the fly, making the phone much more hardened against hacking.

Doing all that, while gaining performance and efficiency, is extremely impressive.
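To picture the idea, here is a rough sketch of pointer signing in Python. It is only an illustration: real ARMv8.3 hardware does this with dedicated instructions and its own cipher, while this sketch swaps in HMAC-SHA256 as a stand-in. The 16-bit tag width, the 48-bit address width, and the function names `sign_pointer` and `authenticate_pointer` are all assumptions made for the example.

```python
import hashlib
import hmac

PAC_BITS = 16   # assumed tag width, for illustration only
ADDR_BITS = 48  # assumed virtual-address width
ADDR_MASK = (1 << ADDR_BITS) - 1


def _tag(address: int, context: int, key: bytes) -> int:
    """Short keyed tag over the address plus a context value (e.g. a stack pointer)."""
    msg = address.to_bytes(8, "little") + context.to_bytes(8, "little")
    digest = hmac.new(key, msg, hashlib.sha256).digest()  # stand-in for the hardware cipher
    return int.from_bytes(digest[: PAC_BITS // 8], "little")


def sign_pointer(pointer: int, context: int, key: bytes) -> int:
    """Place the tag in the otherwise unused upper bits of a 64-bit pointer."""
    address = pointer & ADDR_MASK
    return (_tag(address, context, key) << ADDR_BITS) | address


def authenticate_pointer(signed: int, context: int, key: bytes) -> int:
    """Return the plain address if the stored tag matches, otherwise raise."""
    address = signed & ADDR_MASK
    stored_tag = signed >> ADDR_BITS
    if stored_tag != _tag(address, context, key):
        raise ValueError("pointer authentication failed")
    return address


if __name__ == "__main__":
    key = b"per-process-key!"  # hypothetical key; real keys live in the CPU
    signed = sign_pointer(0x7FFF12345678, context=0x1000, key=key)
    print(hex(authenticate_pointer(signed, context=0x1000, key=key)))
```

An attacker who overwrites a signed pointer without knowing the key has to guess the tag as well, which is what makes the corrupted pointer fail the check instead of being followed.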





Apple really undersold the A12 CPU. It's almost caught up to desktop chips at this point. Here's a breakdown [OC]:
r/apple


This is a long post. The title is basically the Tl;Dr... if you care about the details, read on :)

I was intrigued by the AnandTech comparison of the A12 with a Xeon 8176 on Spec2006, so I decided to find more Spec benchmarks for other chips and run them.


Comparisons to Xeon 8176, i7 6700k, and AMD EPYC 7601 CPUs.

Notes: All results are Single-Core. If the processor is multithreaded, I tried finding the Multithreaded results. In the case of Big+Little configurations (like the A12) one Big core was used. The 6700k was the fastest Intel desktop chip I could find on the Spec2006 database.

| Spec_Int 2006 | Example | Apple A12[1] | Xeon 8176[3] | i7 6700k[2] | EPYC 7601[3] |
|---|---|---|---|---|---|
| Clock speed (Single Core Turbo) | | 2.5GHz | 3.8GHz | 4.2GHz | 3.2GHz |
| Per-core power consumption (Watts) | | 3.64W | 5.89W | 18.97W | 5.62W |
| Threads (nc,nt) | | 1c,1t | 1c,2t | 1c,1t | 1c,2t |
| 400.perlbench | Spam filter | 45.3 | 50.6 | 48.4 | 40.6 |
| 401.bzip2 | Compression | 28.5 | 31.9 | 31.4 | 33.9 |
| 403.gcc | Compiling | 44.6 | 38.1 | 44.0 | 41.6 |
| 429.mcf | Vehicle scheduling | 49.9 | 50.6 | 87.1 | 44.2 |
| 445.gobmk | Game AI | 38.5 | 50.6 | 35.9 | 36.4 |
| 456.hmmer | Protein seq. analyses | 44.0 | 41.0 | 108 | 34.9 |
| 458.sjeng | Chess | 36.6 | 41 | 38.9 | 36 |
| 462.libquantum | Quantum sim | 113 | 83.2 | 214 | 89.2 |
| 464.h264ref | Video encoding | 66.59 | 66.8 | 89.2 | 56.1 |
| 471.omnetpp | Network sim | 35.73 | 41.1 | 34.2 | 26.6 |
| 473.astar | Pathfinding | 27.25 | 33.8 | 40.8 | 29 |
| 483.xalancbmk | XML processing | 57.0 | 75.3 | 74.0 | 37.8 |

The main takeaway here is that Apple’s A12 is approaching or exceeding the performance of these competing chips in Spec2006, with lower clock speeds and less power consumption. The A12 BIG core running at 2.5GHz beats a Xeon 8176 core running at 3.8GHz, in 9 out of 12 of Spec_Int 2006 tests, often by a large margin (up to 44%). It falls behind in 3 tests, but the deficiency is 2%, 6%, and 12%. It also comes quite close to a desktop 6700k.

No adjustment was made to normalize the results by clock speed. Core-for-core, Apple’s A12 has a higher IPC and at least 50% better Perf/Watt than competing chips, even with the advantage of SMT on some of these! (Apple doesn’t use SMT in the A-series chips currently.)
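For anyone who wants to check the Perf/Watt claim, here is one way to do it in Python with the numbers from the table. The post doesn't say how the scores were aggregated, so using the geometric mean of the twelve sub-scores divided by per-core watts is my assumption, and the variable names are made up for the example.

```python
from math import prod

# Spec_Int 2006 sub-scores, in table order (400.perlbench ... 483.xalancbmk).
scores = {
    "A12":       [45.3, 28.5, 44.6, 49.9, 38.5, 44.0, 36.6, 113, 66.59, 35.73, 27.25, 57.0],
    "Xeon 8176": [50.6, 31.9, 38.1, 50.6, 50.6, 41.0, 41, 83.2, 66.8, 41.1, 33.8, 75.3],
    "i7 6700k":  [48.4, 31.4, 44.0, 87.1, 35.9, 108, 38.9, 214, 89.2, 34.2, 40.8, 74.0],
    "EPYC 7601": [40.6, 33.9, 41.6, 44.2, 36.4, 34.9, 36, 89.2, 56.1, 26.6, 29, 37.8],
}

# Per-core power consumption in watts, from the same table.
watts = {"A12": 3.64, "Xeon 8176": 5.89, "i7 6700k": 18.97, "EPYC 7601": 5.62}


def geomean(values):
    """Geometric mean, the usual way to summarize SPEC ratios."""
    return prod(values) ** (1 / len(values))


# Aggregate score per watt for each chip.
perf_per_watt = {chip: geomean(s) / watts[chip] for chip, s in scores.items()}

# How many times better the A12 is than each chip on this metric.
a12_advantage = {chip: perf_per_watt["A12"] / ppw for chip, ppw in perf_per_watt.items()}

if __name__ == "__main__":
    for chip, ratio in a12_advantage.items():
        print(f"A12 perf/W relative to {chip}: {ratio:.2f}x")
```

With this aggregation the A12 comes out ahead of all three chips per watt. A different summary (arithmetic mean, or picking individual tests) would shift the exact ratios.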


CPU Width

Monsoon (A11) and Vortex (A12) are extremely wide machines – with 6 integer execution pipelines among which two are complex units, two load units and store units, two branch ports, and three FP/vector pipelines this gives an estimated 13 execution ports, far wider than Arm’s upcoming Cortex A76 and also wider than Samsung’s M3. In fact, assuming we're not looking at an atypical shared port situation, Apple’s microarchitecture seems to far surpass anything else in terms of width, including desktop CPUs.

Anandtech

By comparison, Zen and Coffee Lake have 6-wide decode + 4 Int ALUs per core. Here are the WikiChip block diagrams: Zen/Zen+ and Coffee Lake. Even IBM's Power9 is 6-wide.

Why does this matter?

Width in this case refers to the issue width of the CPU μArch, or "how many commands can I issue to this CPU per cycle." The wider the issue width, the more instructions can be issued at once. By packing these instructions very close to one another, you can execute multiple instructions per cycle, resulting in a higher IPC. This has drawbacks: it requires longer wires, since signals have to travel farther to reach all the execution units, and because you're doing so many things at once, the design complexity of the CPU increases. You also need to do things like reorder instructions so they'll fit better, and you need larger caches to keep the cores fed. On that note...
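A toy model makes the width/IPC relationship concrete. This is a deliberately simplified sketch, not how any real core schedules work: every instruction is assumed to take one cycle, the function `cycles_to_execute` and the example program are invented for illustration, and caches, branches, and port types are ignored.

```python
def cycles_to_execute(deps, width):
    """Count cycles for a program on a core that issues up to `width` instructions per cycle.

    deps[i] lists the instructions that must have finished before instruction i
    can issue. Every instruction is assumed to take exactly one cycle.
    """
    finished = set()
    remaining = list(range(len(deps)))
    cycles = 0
    while remaining:
        # Instructions whose inputs were all produced in earlier cycles.
        ready = [i for i in remaining if all(d in finished for d in deps[i])]
        issued = ready[:width]
        if not issued:
            raise ValueError("circular or unsatisfiable dependency")
        finished.update(issued)
        remaining = [i for i in remaining if i not in issued]
        cycles += 1
    return cycles


# Hypothetical program: four independent dependency chains, three instructions each.
program = [[] if i < 4 else [i - 4] for i in range(12)]

if __name__ == "__main__":
    for width in (1, 2, 4, 8):
        cycles = cycles_to_execute(program, width)
        print(f"width {width}: {cycles} cycles, IPC {len(program) / cycles:.1f}")
```

In this example, going from width 4 to width 8 gains nothing because only four instructions are ever independent at once. That is why a wide core also needs reordering and big caches: it has to find enough independent work to fill the extra slots.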

Cache sizes (per core) are quite large on the A12

Per core we have:

  • On the A12: each Big core has 128kB of L1$ and 8MB of L2$; each Little core has 32kB of L1$ and 2MB of L2$. There’s also an additional 8MB of SoC-wide $ (also used for other things)

  • On EPYC 7601: 64kB L1I$, 32kB L1D$, 512kB L2$, 2MB shared L3$ (8MB per 4-core complex)

  • On Xeon 8176: 32kB L1I$, 32kB L1D$, 1MB shared L2$, 1.375MB shared L3$

  • On 6700k: 32kB L1I$, 32kB L1D$, 256kB L2$, 2MB shared L3$ (that is 128kB L1I$, 128kB L1D$, 1MB L2$, and 8MB L3$ across all four cores)

What Apple has done is implement a really wide μArch, combined with a metric fuckton of dedicated per-core cache, as well as a decently large 8MB Shared cache. This is likely necessary to keep the 7-wide cores fed.


RISC vs CISC

Tl;Dr: RISC vs CISC is now a moot point. At its core, CISC was all about having the CPU execute commands in as few lines of code as possible (sparing lots of memory/cache). RISC was all about diluting all commands into a series of commands which could each be executed in a single cycle, allowing for better pipelining. The tradeoff was more cache requirements and memory usage (which is why the A12 cache is so big per core), plus very compiler intensive code.

RISC is better for power consumption, but historically CISC was better for performance/$, because memory prices were high and cache sizes were limited (as larger die area came at a high cost due to low transistor density). This is no longer the case on modern process nodes. In modern computing, both of these ISAs have evolved to the point where they now emulate each other’s features to a degree, in order to mitigate the weaknesses of each ISA. This IEEE paper from 2013 elaborates a bit more.

The main findings from this study are (I have access to the full paper):

  1. Large performance gaps exist across the implementations, although average cycle count gaps are ≤2.5×.

  2. Instruction count and mix are ISA-independent to first order.

  3. Performance differences are generated by ISA-independent microarchitecture differences.

  4. The energy consumption is again ISA-independent.

  5. ISA differences have implementation implications, but modern microarchitecture techniques render them moot; one ISA is not fundamentally more efficient.

  6. ARM and x86 implementations are simply design points optimized for different performance levels.

In general, there is no computing advantage that comes from a particular ISA anymore. The advantages come from μArch choices and design optimization choices. Comparing across ISAs directly is okay, as long as your benchmark is good. Spec2006 is far better than Geekbench for cross-platform comparisons, and is regularly used for ARM vs x86 server chip comparisons. Admittedly, not all the workloads are as relevant to general computing, but it does give us a good idea of where the A12 lands compared to desktop CPUs.


Unanswered Questions:

We do not know if Apple will scale up the A-series chips for laptop or desktop use. For one thing, the question of multicore scaling remains unanswered. Another question is how well the chips will handle a frequency ramp-up (IPC will scale, of course, but how will power consumption fare?). This also doesn't look at scheduler performance, because there's nothing to schedule on a single-thread workload running on 1 core. So scheduler performance remains largely unknown.

But, based on power envelopes alone, Apple could already make an A12X based 3-core fanless MacBook with 11W power envelope, and throw in 6 little cores for efficiency. The battery life would be amazing. In a few generations, they might be able to do this with a higher end MacBook Pro, throwing 8 (29W) big cores, just based on the current thermals and cooling systems available.
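The 11W and 29W figures look like the per-core power from the table multiplied by the core count. A quick sketch of that arithmetic, assuming (as the paragraph seems to) that power scales linearly with big cores and ignoring the little cores, GPU, and any clock changes:

```python
PER_CORE_WATTS = 3.64  # A12 big-core power from the Spec_Int 2006 table


def big_core_power(cores, watts_per_core=PER_CORE_WATTS):
    """Naive estimate: total power is core count times per-core power."""
    return cores * watts_per_core


if __name__ == "__main__":
    for cores in (3, 8):
        print(f"{cores} big cores: about {big_core_power(cores):.1f} W")
```

Real chips won't scale this cleanly, since shared caches, interconnect, and sustained clocks all change with core count, so treat this as a back-of-the-envelope number only.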

In any case, the A12 has almost caught up to x86 desktop and server CPUs (keep in mind that Intel’s desktop chips are faster than their laptop counterparts). Given Apple's insane rate of CPU development, and their commitment to being on the latest and best process nodes available, I predict that Apple will pull ahead in the next 2 generations, and in 3 years we could see the first ARM Mac, lining up with the potential release of Marzipan, allowing for iOS-first (and therefore ARM-first) universal apps to be deployed across the ecosystem.


Table Sources:

  1. Anandtech Spec2006 benchmark of the A12

  2. i7 6700k Spec_Int 2006

  3. Xeon 8176 + AMD EPYC 7601 1c2t Spec_Int 2006


Edits:

  • Edit 1: table formatting, grammar.

  • Edit 2: added bold text to "best" in each table.

  • Edit 3: u/andreif from Anandtech replied here suggesting some changes and I will be updating the post in a few hours.


[Ice universe] Geekbench CPU performance comparison: Apple A12, Snapdragon855, Kirin980 A12 is still the most powerful, 855 and 980 are almost the same, they are based on A76+A55, only one is unknown, Exynos M4 based Exynos9820.
r/Android





Apple A13 benchmarked in Geekbench 5: +18% single-core, +19% multi-core and +41% Metal (vs A12)
r/hardware


A new result of the Apple iPhone12,3 (aka 11 Pro Max) just popped up in Geekbench 5.

| | iPhone XS Max | iPhone12,3 | Change |
|---|---|---|---|
| Single-Core | 1119 | 1324 | +18,3% |
| Multi-Core | 2855 | 3394 | +18,9% |
| Metal | 4666 | 6557 | +40,5% |
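The percentages are plain relative changes between the two runs. A small Python sketch to reproduce them from the scores in the table (the `uplift` helper and the dictionary layout are just for this example):

```python
def uplift(old, new):
    """Relative change from old to new, in percent."""
    return (new / old - 1) * 100


# (iPhone XS Max score, iPhone12,3 score) from the Geekbench 5 table.
results = {
    "Single-Core": (1119, 1324),
    "Multi-Core": (2855, 3394),
    "Metal": (4666, 6557),
}

if __name__ == "__main__":
    for test, (a12, a13) in results.items():
        print(f"{test}: {uplift(a12, a13):+.1f}%")
```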

Of course these are individual runs so there is some statistical noise in the data. Looking at the sub-scores, on the CPU the Image Inpainting and Ray Tracing scores stand out with +30% performance, and on the Metal test, which runs on the GPU, the Stereo Matching result stands out at double the performance.

Looking at the Metal benchmark chart, this places the A13 GPU on par with the Intel Iris Plus Graphics 650/655 in the 13-inch MacBook Pro, and just above the A10X from the 2017 iPad Pros.

Just a single benchmark though, different applications may vary. Can't wait for AnandTech's review!





TSMC's 7nm FinFET process used by Apple's A12 Bionic SoC features 67,4% more transistors per mm2 than 10nm and 211,3% more than 16nm
r/hardware


| SoC | Node | Die size | MTransistors | MT/mm2 |
|---|---|---|---|---|
| A10 | 16nm FF | 125 mm2 | 3300 | 26,4 |
| A11 | 10nm FF | 87,6 mm2 | 4300 | 49,1 |
| A12 | 7nm FF | 83,3 mm2 | 6900 | 82,8 |

Sources: Tech Insights, Wikipedia

The Apple A12 contains 6,9 billion transistors on a die of 83,3 mm2. The density is 82,8 million transistors per mm2, 67,4% higher than the 49,1 Mtrans/mm2 of the A11. It's 211,3% higher than the 16nm FF process used by the A10, which is only 2 years old.
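The density numbers are just transistor count divided by die area, and the comparison is the ratio between chips. A short Python sketch using the table values (variable names are mine):

```python
# (die area in mm2, transistors in millions) from the table.
chips = {
    "A10": (125.0, 3300),
    "A11": (87.6, 4300),
    "A12": (83.3, 6900),
}

# Million transistors per mm2 for each SoC.
density = {name: mtrans / area for name, (area, mtrans) in chips.items()}


def percent_denser(new, old):
    """How much denser `new` is than `old`, in percent."""
    return (density[new] / density[old] - 1) * 100


if __name__ == "__main__":
    for name, d in density.items():
        print(f"{name}: {d:.1f} MT/mm2")
    print(f"A12 vs A11: +{percent_denser('A12', 'A11'):.1f}%")
    print(f"A12 vs A10: +{percent_denser('A12', 'A10'):.1f}%")
```

Recomputing from the table this way gives percentages a point or two higher than the ones quoted in the post, so the quoted figures were probably based on slightly different inputs or rounding.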

It will be very interesting to see what AMD and Nvidia could do given 3x the transistor density of 7nm FF compared to 16nm FF.




Apple's A12 die shot analysed and how a potential Apple A12X SoC could look like.
r/hardware


Tech Insights released a die shot with floor plan today (mirror). That means it's time for me to estimate what a more powerful A12X could look like!

First, some stats measured from the floor plan. We know that the A12 die size is 9.89mm x 8.42mm = 83.27 mm2 and contains 6.9 billion transistors.

| Part (A12) | Area | MTransistors | Percentage |
|---|---|---|---|
| Big CPU (2x Vortex) | 3,83 mm2 | 317 | 4,6% |
| Little CPU (4x Tempest) | 6,07 mm2 | 503 | 7,3% |
| L2 cache (8 MB) | 3,34 mm2 | 277 | 4,0% |
| DDR Logic (64-bit) | 2,56 mm2 | 212 | 3,1% |
| GPU (4 clusters) | 12,90 mm2 | 1069 | 15,5% |
| NPU (8 clusters) | 5,53 mm2 | 458 | 6,6% |
| Total of above parts | 34,23 mm2 | 2836 | 41,1% |
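The transistor counts per block are consistent with simply multiplying each measured area by the chip-wide average density (6900 MTransistors over 83.27 mm2). That the table was built this way is my assumption, since real density differs between cache, logic, and GPU. A Python sketch of that calculation:

```python
DIE_AREA_MM2 = 9.89 * 8.42   # about 83.27 mm2
TOTAL_MTRANSISTORS = 6900
AVG_DENSITY = TOTAL_MTRANSISTORS / DIE_AREA_MM2  # MT/mm2, averaged over the whole die

# Measured block areas in mm2, from the floor plan table.
areas = {
    "Big CPU": 3.83,
    "Little CPU": 6.07,
    "L2 cache": 3.34,
    "DDR Logic": 2.56,
    "GPU": 12.90,
    "NPU": 5.53,
}

# Estimated transistors (millions) and share of the die (percent) per block.
mtransistors = {part: area * AVG_DENSITY for part, area in areas.items()}
share = {part: area / DIE_AREA_MM2 * 100 for part, area in areas.items()}

if __name__ == "__main__":
    for part in areas:
        print(f"{part}: {mtransistors[part]:.0f} MT, {share[part]:.1f}% of die")
    print(f"Total: {sum(areas.values()):.2f} mm2")
```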

You can see the areas on this floor plan.

All the above are measured values. If you don't like speculation and calculated guesses, don't read further.

The Big CPU cluster seems a little small compared to the Little CPU cluster; a big core should be at least twice the size of a little core. But let's assume that they are right.

So let's build an A12X. I made the following modifications for my theoretical A12X.

  • Doubled Big CPU cores (2 to 4 Vortex cores)

  • Doubled the L2 cache (8 to 16 MB)

  • Doubled the GPU (4 to 8 clusters)

  • Made the NPU 50% larger (8 to 12 clusters)

  • Doubled the DDR Logic interface (64-bit to 128-bit)

That would give the following values:

| Part (A12X) | Area | MTransistors |
|---|---|---|
| Big CPU (4x Vortex) | 7,66 mm2 | 635 |
| Little CPU (4x Tempest) | 6,07 mm2 | 503 |
| L2 cache (16 MB) | 6,68 mm2 | 554 |
| DDR Logic (128-bit) | 5,12 mm2 | 424 |
| GPU (8 clusters) | 25,80 mm2 | 2138 |
| NPU (12 clusters) | 8,30 mm2 | 687 |
| Total of above parts | 59,63 mm2 | 4941 |

So far I added about 2,1 billion transistors and about 25,4mm2. We also have to take interconnect into account, even Apple can't magically teleport data to different parts of the chip, so let's add 20% extra for interconnect logic (this is a guess). Also I want a larger video decoder and display pipeline for the high-res 120Hz iPad screens, so let's add 4mm2 for that (plus 20% interconnect).

Now our A12X measures 83,27mm2 + (25,4 mm2 + 4 mm2) * 1,2 = 118,5mm2, containing 9,8 billion transistors. It's 42% bigger than our A12. Sounds about right, and I think we can expect something like this from Apple in the near future.
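The whole estimate fits in a few lines of Python. The scale factors, the 20% interconnect overhead, and the 4 mm2 for video/display come from the post; converting area to transistors with the A12's average density is my assumption about how the 9,8 billion figure was reached.

```python
A12_AREA_MM2 = 83.27
A12_MTRANSISTORS = 6900
AVG_DENSITY = A12_MTRANSISTORS / A12_AREA_MM2  # MT/mm2, averaged over the A12 die

# Measured A12 block areas (mm2) and the scale factor applied for the A12X.
blocks = {
    "Big CPU": (3.83, 2.0),     # 2 -> 4 Vortex cores
    "Little CPU": (6.07, 1.0),  # unchanged
    "L2 cache": (3.34, 2.0),    # 8 -> 16 MB
    "DDR Logic": (2.56, 2.0),   # 64-bit -> 128-bit
    "GPU": (12.90, 2.0),        # 4 -> 8 clusters
    "NPU": (5.53, 1.5),         # 8 -> 12 clusters
}

EXTRA_VIDEO_MM2 = 4.0   # larger video decoder and display pipeline (a guess in the post)
INTERCONNECT = 1.2      # 20% overhead on added area (also a guess in the post)

# Area added by scaling the blocks, before interconnect overhead.
added_mm2 = sum(area * (scale - 1) for area, scale in blocks.values())

a12x_area = A12_AREA_MM2 + (added_mm2 + EXTRA_VIDEO_MM2) * INTERCONNECT
a12x_mtransistors = a12x_area * AVG_DENSITY
growth_percent = (a12x_area / A12_AREA_MM2 - 1) * 100

if __name__ == "__main__":
    print(f"Added block area: {added_mm2:.1f} mm2")
    print(f"A12X die: {a12x_area:.1f} mm2 ({growth_percent:.0f}% bigger)")
    print(f"A12X transistors: {a12x_mtransistors / 1000:.1f} billion")
```

Changing a scale factor or the interconnect guess shows quickly how sensitive the final die size is to each assumption.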

Edit: As always, AnandTech did a much better job than both Tech Insights and me. Read their analysis here: TechInsights Publishes Apple A12 Die Shot: Our Take





Could Apple use both the A11 Bionic and A12 chips in 2018?
r/apple


In light of the news today about Apple going with a 7-nm process for the A12 chip, I think Apple could use the processor as a differentiating factor between the new 2018 models.

What's rumored to release:

  • 5.8 in iPhone X successor

  • 6.5 in OLED iPhone

  • 6.1 in LCD iPhone

I would have assumed all 3 new models would come with the new A12 chip, but now that we are hearing it will use a completely new manufacturing process, and be one of the first mass-produced 7nm chips, I could see Apple only using the A12 in the high-end OLED models, thus using the processor to differentiate the models further (in addition to the screen size/type and other features).

So it could be:

iPhone "?" ($$)

  • 5.8" OLED

  • A12

  • New/improved "Face ID 2"

  • New/improved dual camera system

iPhone "?" Plus ($$$)

  • 6.5" OLED

  • A12

  • New/improved "Face ID 2"

  • New/improved dual camera system

iPhone "?" ($)

  • 6.1" LCD

  • A11 Bionic

  • Face ID from iPhone X

  • Dual camera system from iPhone X

I think Apple will reuse a lot of the components from 2017's iPhone X on the 2018 LCD iPhone in order to save the cost of developing new systems, allowing it to be a "cheap" plus-sized phone with great internals. The A11 Bionic processor will still allow that iPhone to be competitive processor-wise for at least a year. Still have no clue what they will name these things though.

So what do you think?

edit: formatting