Apple M1 (ARM) architecture demos

category: general
Ton, I'm indeed a real person, and it seems I don't care too much about concealing my identity. Why would I, in the end..

I must have annoyed you very much if you cared to do the research, though.

Let me start over as I owe you apologies.

I reread some parts of this thread and your last long post above. I think I now get where the misunderstanding is.

I was abnormally picky about arguments, I admit. In this thread multiple people claimed MacOS is a closed platform and described it using irrational arguments. My understanding of a closed platform is one that must be jailbroken before you can develop and distribute for it freely, the way you define, not the way the platform owner tells you. Aside from the annoying behaviour of current MacOS versions, that's still not exactly the case.

I totally get the point that manually marking each executable as trusted, in a place that is disconnected from the flow, is annoying as hell. Especially when you discover it for the first time and you have no clue what's happening.

It was similarly annoying to many users who discovered UAC back in the day, and SmartScreen in modern Windows, although I believe MS manages the change of direction a bit better from the UI perspective.

In commercial deployments, there are tools to help you manage that, even centrally for multiple workstations (aside from the fact that we all should not trust apps from unknown sources).

In the demoscene context, it's just one more step to do, which, albeit annoying, doesn't totally block you from executing a non-signed app, right?

Nevertheless, I admit, calling you names was unnecessary. What perhaps we missed is just a ton more constructive words to understand each other's points better.

If there is a chance to meet at a party in this lifetime, I owe you a drink.

added on the 2021-01-11 09:57:15 by hollowone
We should start off with
Hello World
added on the 2021-01-15 15:31:36 by DaD1916
@DaD1916: Now we can start sizecoding stuff in AArch64 on an Apple M1 :-) I can also recommend the same author's (Stephen Smith) book called "Programming with 64-Bit ARM Assembly Language" ...after reading it I realise it's quite a big difference to AArch32, though at least the instruction length is still fixed at 32 bits.

And we got plenty of registers to play with, 31 of them...one reg called xzr is hardwired to be zero...weird but nice feature...looks like the engineers thought...what shall we do with all those registers...
added on the 2021-01-15 16:55:49 by Kuemmel
Well, a hardwired zero register isn't exactly a rarity... plenty of architectures have that. I'd say it's particularly useful for fixed-width instruction sets where it's cumbersome to encode immediate values.
...I guess it must have something to do either with how they wanted to save space in the instruction encoding or with something else internally. E.g. a simple MUL instruction like MUL x0,x1,x2 is actually encoded as a multiply-add with zero => MADD x0,x1,x2,xzr ...I would have thought that would kill performance, but maybe overall it doesn't.

...it's just weird to get used to having to turn on a no-alias disassembler option with the GNU tools to actually see what became of the instruction you thought you were using...
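
For anyone who wants to check the MUL-as-MADD aliasing by hand, here is a minimal Python sketch that builds the 32-bit instruction word. The helper names (encode_madd_x, encode_mul_x) are made up for illustration, and the field layout is the 64-bit A64 MADD encoding as I understand it, so treat it as a sketch rather than a reference.

```python
XZR = 31  # register number 31 reads as zero in this instruction class


def encode_madd_x(rd, rn, rm, ra):
    """Encode the 64-bit A64 instruction MADD Xd, Xn, Xm, Xa (hypothetical helper)."""
    for reg in (rd, rn, rm, ra):
        if not 0 <= reg <= 31:
            raise ValueError("register numbers must be in 0..31")
    # sf=1 (64-bit), data-processing (3 source) class, o0=0 selects MADD.
    base = 0x9B000000
    # Field positions: Rm at bit 16, Ra at bit 10, Rn at bit 5, Rd at bit 0.
    return base | (rm << 16) | (ra << 10) | (rn << 5) | rd


def encode_mul_x(rd, rn, rm):
    """MUL Xd, Xn, Xm is an alias: the same word as MADD with Xa = XZR."""
    return encode_madd_x(rd, rn, rm, XZR)


if __name__ == "__main__":
    # MUL x0, x1, x2
    print(hex(encode_mul_x(0, 1, 2)))  # prints 0x9b027c20
```

So MUL x0,x1,x2 and MADD x0,x1,x2,xzr come out as the very same word, which is why the disassembler can only show one of the two spellings at a time.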
added on the 2021-01-15 17:15:08 by Kuemmel
Ha.. typing from an M1, the experience is great. I launched my own old Mac (Intel) demo quite easily with Rosetta and performance was great. But, indeed, I needed to trick the system a bit into allowing an untrusted app to launch. A hint on how to do it easily can be found here: https://www.youtube.com/watch?v=NRlnHW-7uek

I'm checking what other benchmarks I can use besides comparing FPS in Shadertoy between computers. If you find anything that can be used for the purpose, please share.
added on the 2021-01-15 18:08:03 by hollowone
All of the CSC exegfx should run natively on M1 (not tested but they should). https://www.pouet.net/groups.php?which=14401

I recommend running a low res version tho (540p maybe), because the m1 has ~1/4 of the flops of the target hardware level.
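
A quick back-of-the-envelope check of why 540p is a sensible pick for roughly a quarter of the flops. This is a sketch that assumes the target resolution is 1080p at 16:9 (not stated above) and that per-frame cost scales with pixel count, which is only approximately true for shader-heavy exegfx.

```python
# Assumed resolutions (16:9); the 1080p target is an assumption for illustration.
RES_1080P = (1920, 1080)
RES_540P = (960, 540)


def pixel_count(resolution):
    """Number of pixels shaded per frame at the given (width, height)."""
    width, height = resolution
    return width * height


def pixel_ratio(low, high):
    """Fraction of the per-frame pixel work at `low` compared to `high`."""
    return pixel_count(low) / pixel_count(high)


if __name__ == "__main__":
    print(pixel_count(RES_1080P))            # 2073600
    print(pixel_count(RES_540P))             # 518400
    print(pixel_ratio(RES_540P, RES_1080P))  # 0.25
```

Halving both dimensions quarters the pixel count, which lines up with the ~1/4 flops estimate.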
added on the 2021-01-15 19:08:56 by alia
We should start off with
Hello World

afaik the macos/darwin syscall ABI isn't stable or guaranteed to be usable by anyone else than apple or anything (unlike on linux where the syscall ABI is *very* stable), so I'm not sure about how useful that example is

(also re: hardwired zero reg: iirc MIPS and RISC-V have this one too, and maybe SPARC does as well)
added on the 2021-01-16 18:36:17 by porocyon
anyone got any experience with porting existing simd (intel) code to m1? given you've been working with cpp and intrinsics.
added on the 2021-01-17 03:14:12 by jco
lil bit of context: i have to maintain several products. it's already hard whenever apple deprecates any arbitrary stuff (e.g. you wanna use xcode xyz for the new mac os version? buy a new mac!!!1). i have to average out the amount of work getting my shit to run on "the new xcode" or "the new sdk" or "the new macos" on a regular basis. the requirement to "notarize" software comes to mind too (some few days I had to spend with my build scripts). all this means time and money. so, how do you go/feel about this? does apple stay relevant? abandon the platform? i'm also using opengl for fairly simple drawing and shader stuff, simply because it's the lowest common denominator in terms of "put shit on screen with performance" and i only need ogl 2.1 featureset - cross platform convenience is a thing. luckily this still works on macos despite apple announcing to deprecate ogl. the thing is: with the windows side of my development, I'm pretty happy. much more so than with the apple side of things.
added on the 2021-01-17 03:26:34 by jco
No specific experience with M1, but you should be able to use the existing NEON intrinsics in clang/etc. just fine. This may become more relevant if you at some point decide to branch out to Android or Windows tablets as well. Who knows where Microsoft is going with ARM in the next few years. So NEON optimizations would be useful for several platforms, but keep in mind that not every optimization should be directly translated from SSE to NEON as-is.
buy a new mac!!!1

how do you go/feel about this? does apple stay relevant? abandon the platform?

Apple steals your application "ideas" and implements them in their own software, and at some point you become too small, and even paying for a "new Mac or licensing Xcode" becomes pointless. This is how every corporation operates on the market.

I'm pretty happy. much more so than with the apple side of things.

Apple creates software not to "have fun" or toy with it; they create software to sell, and all this "disgusting promotion" is their way to make money.
Declaring everything as "unique" and "we developed it from scratch for 10 years and no one ever had the idea of creating this before..." ... this is Apple...
added on the 2021-01-17 16:39:03 by Danilw
Could you clarify, I didn’t quite get what Apple is
added on the 2021-01-17 19:12:57 by farfar
You might find diz useful, even though M1-specific support is not added yet by the committers:

Architecture and compiler agnostic SIMD intrinsics and wrappers

It links to a few interesting Related Projects as well.
added on the 2021-01-17 21:46:43 by Hoild
Hoild, nice find. I know that JUCE (which I use for cross platform dev) comes with its own SIMD abstractions too. What worries me most is having to port the few third party libs I'm using. There are some hints at the moment that existing x64 code will work via a translation layer, and that it performs surprisingly well. Question is how long this will work ;) I'll know more soon...
added on the 2021-01-18 00:00:15 by jco
The translation layer will be there for at least a couple of hours, and from what I’ve seen seems to give ~70% of native performance. The chip is fast to begin with so there’s not an immediate rush to port I guess.

As for mac generally, I can see it getting more relevant now because these new laptops are pretty compelling - you get a machine that’s cool and quiet, very portable, long battery life and good performance... They don’t really have much competition. Sales are up a lot, seems reasonable that mac market share will grow.

But coding for it.. supporting it cross platform seems painful, GL is deprecated (and wasn’t that good before it was deprecated). And the platform is pretty fast moving. Guess it comes down to how much that small market is worth to you.

Doing native dev on the other hand (ie writing specifically for mac) is pretty great. The tools and APIs are mostly excellent. Metal is a great API, and swift is really nice. It’s not without its problems obviously but it is very productive.

I find I spend time building lots of tools, because it’s quick and easy enough that the time pays off fast. Lately I built a GPU synth tool, with live coding, a nice UI to set it up, tabbing for multiple docs, waveform display and such... took <2 days work, now I have a tool that’s super fun to make audio in.
added on the 2021-01-18 12:24:25 by alia
The translation layer will be there for at least a couple of hours

Lovely typo and totally in line with how Apple deprecates things :D
ROFL, yep, better get it fixed today 😆 (years, obviously :)
added on the 2021-01-18 12:41:35 by alia
General answer to some of the concerns written here recently, which I understand are touching backward compatibility per se.

I find Apple not good at it either. Not the worst in the game, but it definitely could learn a lot from Microsoft if they cared. I do believe they don't.

It's occasionally hard to launch an app that was recent for MacOS 10 years ago or so. It's not possible to launch a PPC OSX app on Intel OSX these days, as Rosetta is long gone. Rosetta 2 will be gone shortly as well, I assume, meaning that Apple will say goodbye to Intel in probably 2 years or so.

There are some breaking changes that Apple introduces that make tech obsolete, and when they make tech obsolete, it really means it will disappear from the system.

Windows is superior here, as it still continues to build new tech on top of COM, which was introduced in '93; it still includes the VB6 runtime today, and launching a Win9x app on Windows 10 is much easier than launching a MacOS app developed around 2000...

Unfortunately that's a platform condition to consider when you make native apps. For that reason, commercially, when focused on Macs I rely only on the web stack, not native tech. For demo purposes it's all fun to do, just differently, without any hope that this work can survive in memory longer than a single YT playback.

For a similar reason I find Linux actually much worse at supporting old productions. ELFs for 2.0, 2.2 on a current kernel... perhaps I lack knowledge and experience, but how do you launch these without installing a really old Linux in a VM, is there a way? Off topic to this subject, but still I'm quite surprised to say that in the league of backward compatibility Windows wins over MacOS, which wins over Linux, if you look at launching 10-20 year old binaries.
added on the 2021-01-18 13:04:27 by hollowone
The Linux kernel ABI is stable ("never break userland") but many old binaries will be linked against old libraries that no one uses (or can build) anymore. For example, old demos might use svgalib, which was last updated 20 years ago. Obviously that won't run on a modern system.
For a similar reason I find Linux actually much worse at supporting old productions. ELFs for 2.0, 2.2 on a current kernel... perhaps I lack knowledge and experience, but how do you launch these without installing a really old Linux in a VM, is there a way? Off topic to this subject, but still I'm quite surprised to say that in the league of backward compatibility Windows wins over MacOS, which wins over Linux, if you look at launching 10-20 year old binaries.

What Saga said (though one could always try to make an svgalib replacement that'd use SDL or so, and ALSA can do OSS emulation), but also:

You basically have to install old versions of their dependencies (e.g. they might be looking for libc.so.5, but glibc nowadays provides libc.so.6), but you'll have to compile these with an old compiler, as the calling convention/ABI changed in a binary-backwards-compat-breaking way (the call stack only had to be aligned to 4 bytes before, but nowadays it has to be 16 bytes). Which is 'kind of' annoying.
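
To make the alignment part concrete, here's a tiny Python sketch of the arithmetic involved. The function names and the example stack pointer value are made up for illustration; it only shows why a stack pointer that was fine under the old 4-byte rule can violate the 16-byte one, not how an actual shim would patch things up.

```python
def is_aligned(address, alignment):
    """True if `address` is a multiple of `alignment`."""
    return address % alignment == 0


def align_down(address, alignment):
    """Round `address` down to a multiple of `alignment` (a power of two).

    Stacks grow downwards on x86, so re-aligning a stack pointer rounds down.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return address & ~(alignment - 1)


if __name__ == "__main__":
    sp = 0xFFFFD00C  # hypothetical stack pointer inside an old binary
    print(is_aligned(sp, 4))        # True: fine under the old 4-byte rule
    print(is_aligned(sp, 16))       # False: breaks the newer 16-byte rule
    print(hex(align_down(sp, 16)))  # 0xffffd000
```

Newer library code that assumes 16-byte alignment (e.g. for aligned SSE loads and stores on the stack) can crash when called with a stack pointer like the one above, which is why the old dependencies need an old compiler or some translation in between.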

Other than that one could write something that inserts itself between these old binaries and the dynamic linker, and handles the calling convention/ABI translation stuff by itself, so you don't need to install all the new libraries (though you'll still need 32-bit versions of them installed). I don't really have time to write this tool, but if someone would like to have a try at this, I can give you a braindump about it in another Pouët thread or in #lsc on IRCnet or so.


Hm, I heard that sideloading mobile apps on M1 macs is now disabled? Nice of Apple...
added on the 2021-01-18 14:52:42 by porocyon
Here's a handful of new numbers by neAraz:


tl;dr: x64 CPUs can still easily beat the M1 with sheer core count and if the code is using AVX2, otherwise it's a really impressive little chip :)
added on the 2021-01-18 16:14:43 by kb_
Overall the 2019 MacBookPro is from “a bit faster” to “about twice as fast” as the M1, when compression is fully multi-threaded.

M1 on the other hand, only has 4-wide SIMD execution (via NEON). If a program can take really good advantage of wider SIMD, then Intel CPU has an advantage there.
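
Rough arithmetic behind the "4-wide vs wider SIMD" point, as a Python sketch. The function names are made up for illustration; the register widths (128-bit NEON, 256-bit AVX2) are the architectural ones, but counting instructions like this says nothing about how many of them each core can issue per cycle, so it is not a performance prediction.

```python
NEON_BITS = 128  # NEON vector register width
AVX2_BITS = 256  # AVX2 vector register width


def lanes(register_bits, element_bits=32):
    """How many elements of `element_bits` fit in one vector register."""
    if register_bits % element_bits:
        raise ValueError("element size must divide the register width")
    return register_bits // element_bits


def vector_ops_needed(n_elements, register_bits, element_bits=32):
    """Vector operations needed to touch `n_elements` once (ceiling division)."""
    per_op = lanes(register_bits, element_bits)
    return -(-n_elements // per_op)


if __name__ == "__main__":
    print(lanes(NEON_BITS))                    # 4 floats per NEON register
    print(lanes(AVX2_BITS))                    # 8 floats per AVX2 register
    print(vector_ops_needed(1000, NEON_BITS))  # 250
    print(vector_ops_needed(1000, AVX2_BITS))  # 125
```

So for code that vectorises cleanly over 32-bit floats, AVX2 needs half as many vector instructions for the same data, which is the advantage the quote is talking about.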
added on the 2021-01-18 16:15:10 by Danilw
Here's a handful of new numbers by neAraz:

lol just saw it on twitter, and you posted it 30 sec before me
added on the 2021-01-18 16:18:19 by Danilw