Tiny intro wiki for ARM/RISC OS (and maybe other platforms)

category: general [glöplog]
Hi tiny intro sizecoders,

Until now, sizecoding.org covered only x86. I took some time and created a section on ARM/RISC OS, focused on the Raspberry Pi (or other compatible hardware), which I will extend with more material as time permits. You can find the link on the main page or go there directly from here.

Any comments, corrections, or additions are welcome. I don't even know whether running Linux, or whatever else you can run on an RPi, could be beneficial for sizecoding as well.

HellMood also suggested that if anybody is interested in writing up information on tiny intros for other platforms, it can be added to sizecoding.org as well.
added on the 2020-06-18 08:50:38 by Kuemmel Kuemmel
Thanks for the work Kuemmel, already took a look at it yesterday ;-)
added on the 2020-06-18 09:50:12 by superogue superogue
Which emulator is good to use?
added on the 2020-06-18 11:03:11 by g0blinish g0blinish
@g0blinish: I've never worked with emulators, so I'm not sure... maybe you can check out the ones listed here
added on the 2020-06-18 11:32:23 by Kuemmel Kuemmel
Nice! But it feels like a teaser.
The most important sizecoding question, how to use THUMB, is not covered yet.

The code example that is supposed to show a possible size advantage of ARM is not suitable, because with lea eax,[ecx+edx*4] x86 would win.
added on the 2020-06-18 11:49:01 by frag frag
Nice summary
@frag: OK, that's true. Then I'll change the value from *4 to *16; can you do that ;-) ?

Regarding Thumb: What do you mean by not covered? The sources I show are Thumb sources, as also shown in the linked intros. You just take exactly the code from the wiki, put it in the 'main' file for !GCC, and assemble it. Then, as described below, the start address needs to be set to +1 to trigger Thumb mode.

Maybe I should explain the !GCC / GNU assembler setup in more detail?
added on the 2020-06-18 13:56:34 by Kuemmel Kuemmel
No, but it's still possible to do it in only one byte more. I mean, if you want to clearly show the strong size-reducing properties of the ARM ISA (conditional execution and free shifts), then using opcodes other than add and, e.g., a shift to the right is better and will require a much longer x86 equivalent.

I know nothing about ARM and THUMB, and after reading this article I still have basic questions, as other readers probably do too.

What restrictions does THUMB have compared to full 4-byte code? How do you deal with them?
E.g. is it possible to further reduce the size of this 8-byter using THUMB? (if yes, why isn't it done?)
What is the difference between THUMB and THUMB-2?
Is it possible to mix non-THUMB and THUMB code and if yes, how?
added on the 2020-06-18 14:54:43 by frag frag
OK! Thanks for your response. I'll think of a better or extended example. I should also state that conditional execution can be used multiple times after a cmp (in normal ARM mode), so you can run a whole sequence of conditionally executed instructions without a branch before the flags are set again (adding an 's', as in adds r0,r0,r1, updates the flags, as of course does another cmp).
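To make the "free shift" and conditional execution points concrete, here is a rough Python sketch (not taken from the wiki) that builds the 32-bit encoding of a classic ARM data-processing instruction with a shifted register operand. The helper name encode_dp and the small lookup tables are made up for illustration; only a handful of opcodes and conditions are included. The point is that the condition, the 's' flag and the shift all live inside the same 4-byte word.

```python
# Sketch: encode a classic 32-bit ARM data-processing instruction
# with a register operand shifted by an immediate amount.
# encode_dp and the tables below are illustrative names, not a real library.

COND = {"eq": 0x0, "ne": 0x1, "al": 0xE}               # bits 31..28
OPCODE = {"eor": 0x1, "sub": 0x2, "add": 0x4, "orr": 0xC}  # bits 24..21
SHIFT = {"lsl": 0, "lsr": 1, "asr": 2, "ror": 3}       # bits 6..5


def encode_dp(op, rd, rn, rm, cond="al", set_flags=False,
              shift="lsl", amount=0):
    """Return the encoding of e.g. 'addeq rd, rn, rm, lsl #amount'."""
    # A zero amount has a special meaning for lsr/asr/ror, so only
    # lsl may use 0 in this simplified sketch.
    lowest = 0 if shift == "lsl" else 1
    if not lowest <= amount <= 31:
        raise ValueError("shift amount out of range for this sketch")
    word = COND[cond] << 28          # condition: executed only if it holds
    word |= OPCODE[op] << 21         # which ALU operation
    word |= int(set_flags) << 20     # the 's' suffix: update the flags
    word |= rn << 16                 # first operand register
    word |= rd << 12                 # destination register
    word |= amount << 7              # shift amount for the second operand
    word |= SHIFT[shift] << 5        # shift type
    word |= rm                       # second operand register
    return word


if __name__ == "__main__":
    # add r0, r1, r2, lsl #4  (r0 = r1 + r2*16 in one 4-byte instruction)
    print(hex(encode_dp("add", 0, 1, 2, amount=4)))
    # adds r0, r0, r1  (same add, but updates the flags)
    print(hex(encode_dp("add", 0, 0, 1, set_flags=True)))
    # addeq r0, r0, r1  (only executed if the Z flag is set)
    print(hex(encode_dp("add", 0, 0, 1, cond="eq")))
```

Changing the condition, the 's' bit or the shift never changes the instruction length, which is why a run of conditional instructions after a single cmp can replace a branch at no extra size.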

Regarding your other questions, as far as I can answer them:
- Thumb has only a limited instruction set (see the reference card linked in the article), e.g. no conditional execution and basically half the registers. So it might not be as useful, depending on the code.
- As far as I can see (I've only been using Thumb for 6 weeks... normal ARM for >25 years, not professionally, just as a hobby), the 8-byter can't be reduced with Thumb. Thumb-2 provides an If-Then instruction, but I think in total it doesn't lower the byte count, even if the cmp becomes 2 bytes in Thumb.
- Thumb-2 ISA basically is "almost" Thumb + normal ARM instructions except e.g. "normal" conditional execution. See here and also the instruction set sheet in the links.
- Mixing... the CPU can run either in ARM or in THUMB mode. I haven't tried it yet, but you can call a Thumb routine from ARM mode and go back (Link)
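
As a small illustration of the mode switch and the "+1" start address, here is a Python model of what an interworking branch (BX-style) does with its target: bit 0 selects the instruction set and is not part of the fetch address. The function name bx_target is made up, and 0x8000 is assumed here as the usual RISC OS application load address.

```python
# Sketch: model how an interworking branch interprets its target address.
# bx_target is an illustrative helper, not part of any toolchain.

LOAD_ADDRESS = 0x8000  # assumption: usual RISC OS application load address


def bx_target(address):
    """Return (state, fetch_address) for an interworking branch.

    Bit 0 of the target selects the instruction set:
    1 means Thumb, 0 means ARM. The CPU fetches from the
    address with that bit cleared.
    """
    state = "thumb" if address & 1 else "arm"
    return state, address & ~1


if __name__ == "__main__":
    # Branching to the load address + 1 enters Thumb state at the same spot.
    print(bx_target(LOAD_ADDRESS + 1))
    # Branching to the plain address stays in (or returns to) ARM state.
    print(bx_target(LOAD_ADDRESS))
```

So calling a Thumb routine from ARM code and returning is mostly a matter of branching through a register whose low bit is set on the way in and clear on the way back.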

I'll try to cover more of that in the future... it may be helpful to check my prods and exoticorn's (there's a disassembly file produced by the GNU assembler), which show which instructions became 2-byte Thumb and which became 4-byte Thumb-2/VFP/NEON, as Thumb-2 includes/supports VFP/NEON. So even though you don't always get a 2-byte Thumb instruction in Thumb-2 mode, it reduced my 3dball from 248 to 204 bytes.
added on the 2020-06-18 15:55:38 by Kuemmel Kuemmel
I can provide info on GBA ARM sizecoding, and linux stuff (but there's this wiki for that already).
added on the 2020-06-18 18:03:51 by porocyon porocyon
I appreciate and welcome every bit of information for all size coding platforms. if wikis exist, please link them! =)
added on the 2020-06-18 18:35:23 by HellMood HellMood
I really enjoyed coding for Gameboy Advance in ARM assembly, that architecture rockz hard (and compared to it x86 just sucks), but unlike developing for GBA I didn't find any cross dev tools for RISC OS... I prefer coding on Linux, are there any assemblers for RISC OS? As for coding on RISC OS itself: I don't like those strange OS usage guidelines...
added on the 2020-06-19 18:57:58 by Asato Asato
Hi Asato... I've never used it myself, but I found this link, maybe it fits your purpose: Cross-compiling software with GCCSDK for Risc OS. Maybe that includes the GNU assembler that we usually use, besides the RISC OS BASIC assembler, on RISC OS directly.
added on the 2020-06-19 21:05:35 by Kuemmel Kuemmel
> I don't even know whether running Linux, or whatever else you can run on an RPi, could be beneficial for sizecoding as well.

added on the 2020-06-19 21:14:31 by utz utz
re: Linux/RPi ARM: not only what utz said, but also:

* ELF headers are quite large in a sizecoding context, and ARM makes this annoying to work around (compared to x86, where you can do a lot of tricks to shoehorn instructions into the headers). depending on what you need, it might be more beneficial to map some shellcode into memory and jump to it from a scripting language. I tried this with python before: it ended up slightly larger than the ELF headers, and while accessing one or two libc/... functions becomes quite a bit easier, it also makes a lot of things less predictable, so you might want to use a regular ELF file for more advanced stuff.
* if you want to get an accelerated graphics api, you basically need to do dynamic linking (the reasons are complicated, but the short version is that graphics drivers are userspace) which means you need 10x the header size, plus a lot of setup code (see also the revision 2019 seminar :) )
* fbdev graphics (basically a dumb framebuffer) can be used, but the code then needs to run as root, and you can't be running X11/any kind of graphical environment at the same time. this is kinda annoying. but it's possible.
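
To put a number on the header overhead mentioned in the first point, here is a small Python sketch that lays out the standard 32-bit ELF file header and one program header with the struct module and adds up their sizes. The function name minimal_elf32_overhead is made up, and the figure assumes no header-overlap tricks and no dynamic linking, so it is a baseline rather than what any particular intro achieves.

```python
# Sketch: size of the fixed headers in a minimal static 32-bit ELF file.
import struct

# Elf32_Ehdr: e_ident[16], e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx ("<" = little-endian, no padding).
EHDR = struct.Struct("<16sHHIIIIIHHHHHH")

# Elf32_Phdr: p_type, p_offset, p_vaddr, p_paddr, p_filesz,
# p_memsz, p_flags, p_align (eight 32-bit words).
PHDR = struct.Struct("<IIIIIIII")


def minimal_elf32_overhead(num_phdrs=1):
    """Bytes spent on headers before the first byte of actual code."""
    return EHDR.size + num_phdrs * PHDR.size


if __name__ == "__main__":
    print("file header:   ", EHDR.size, "bytes")
    print("program header:", PHDR.size, "bytes")
    print("total overhead:", minimal_elf32_overhead(), "bytes")
```

For comparison, a headerless format spends none of these bytes, which matters a lot in 32- or 64-byte categories.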

I have a semifinished ARM version of vondehi lying around, but I haven't finished it because only a small fraction of linux people run ARM (this might change in the future when things like the Pinebook Pro become more common), and it (unintentionally) also seems to run on Android, so Android malware authors might pay more attention to it than the scene. Also it's a bit larger than the x86 version, so, meh.


re: RISC OS assemblers for Linux: I guess GCC/GAS + manually typing out the file header (like this) (or using a linker script) might work?


re: GBA: I could put a braindump on the wiki, but, it doesn't support HTTPS, and I don't really want to create an account w/o an encrypted connection. could this be fixed please?
added on the 2020-06-20 17:20:50 by porocyon porocyon
@porocyon/@asato: Any ARM GCC toolchain will do. For Edgedancer, all I used was as and objcopy (and objdump for checking that I didn't accidentally generate 32-bit instructions where I didn't mean to).
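
The objdump check can also be automated. Below is a rough Python sketch that scans a Thumb disassembly listing and reports every instruction that was encoded in 32 bits. It assumes the usual GNU objdump layout for Thumb code (address, colon, one or two 4-digit hex halfwords, then the mnemonic); the function name and the sample listing are made up for illustration.

```python
# Sketch: find 32-bit (Thumb-2) encodings in a Thumb disassembly listing.
# Assumes lines shaped like GNU objdump output: "   2:\tf04f 0100 \tmov.w\tr1, #0"
import re

LINE = re.compile(
    r"^\s*([0-9a-f]+):\s+([0-9a-f]{4})( [0-9a-f]{4})?\s+(\S.*)$"
)


def wide_instructions(listing):
    """Return (address, text) for every instruction with two halfwords."""
    found = []
    for line in listing.splitlines():
        match = LINE.match(line)
        if match and match.group(3):          # second halfword present
            address = int(match.group(1), 16)
            text = " ".join(match.group(4).split())
            found.append((address, text))
    return found


# Made-up sample listing in the assumed layout.
SAMPLE = (
    "00000000 <_start>:\n"
    "   0:\t2001      \tmovs\tr0, #1\n"
    "   2:\tf04f 0100 \tmov.w\tr1, #0\n"
    "   6:\t4770      \tbx\tlr\n"
)

if __name__ == "__main__":
    for address, text in wide_instructions(SAMPLE):
        print(f"{address:#x}: {text}  <- 4 bytes")
```

Lines that don't fit the assumed layout (labels, ARM-mode words) are simply skipped, so this is a quick sanity check rather than a full parser.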

RISC OS binaries don't have any header that you'd have to manually tack on, the file is loaded as is into the application memory area and executed there.

Also, it's nice to see that there is some GBA sizecoding going on. I tried to get something going with Raster back in the day, but no one really took me up on it then. I guess nowadays you wouldn't count the header in the size limit, which makes some sense.

@g0blinish: rpcemu is free and works great, HOWEVER, it implements the StrongARM instruction set which predates Thumb, NEON and VFP.
So for sizecoding the only option is real hardware. Thankfully, the Raspberry Pi 3 is pretty cheap and 1GB of RAM is more than plenty for RISC OS. (One of the advantages of being an OS from the 1990s. ;) )
added on the 2020-06-20 21:44:42 by exoticorn exoticorn