2024-09-07 00:32:00 crc: GCC is really not happy with my attempts to create a global TOS register
2024-09-07 00:33:24 This is the best I've gotten (from inspection of the generated object) https://termbin.com/eaj6
2024-09-07 00:33:49 If you're willing to profile I'd be interested to see how that method compares
2024-09-07 00:34:29 It's an interesting feature, I will say it seems to be good sometimes but abysmal other times, it's not clear to me what kinds of vars it's happy putting in registers
2024-09-07 00:39:18 And that's me done, learned some interesting stuff about GCC today
2024-09-07 00:46:46 veltas: that's unfortunate! Were you able to allocate m, sp, and rp to global registers?
2024-09-07 00:47:15 I mean I see that you asked it to but I don't know whether it was willing to
2024-09-07 00:49:43 https://termbin.com/rm20
2024-09-07 00:49:51 It's definitely picked up rbx
2024-09-07 00:50:55 And r12 seems to be honoured
2024-09-07 00:51:37 And r15
2024-09-07 00:52:09 you can probably also expropriate rbp, though I don't know if you need to -fomit-frame-pointer
2024-09-07 00:52:55 did you get the C intercalated into the assembly by hand, or is there an objdump option for that?
2024-09-07 00:53:10 There's a few callee-saved regs and the reg seems to matter less than *how* the reg is used
2024-09-07 00:53:20 Pointers seem to work better as global regs, for some reason
2024-09-07 00:54:05 caller-saved regs are allowed too, but they are clobbered by external functions
2024-09-07 00:54:19 yeah, and sometimes GCC calls external functions at unexpected times
2024-09-07 00:54:42 -g to GCC and -S to objdump gives you inline source
2024-09-07 00:54:55 I use -Mintel too because I don't hate myself
2024-09-07 00:54:58 oh, thanks! I had no idea!
2024-09-07 00:55:21 yeah, I usually use -Mintel but sadly ,noprefix seems to have no effect
2024-09-07 00:56:22 like, is mov edx,DWORD PTR [rbx]
2024-09-07 00:56:32 really any clearer than mov edx, [rbx]?
2024-09-07 00:56:34 I don't think so
2024-09-07 00:57:13 I find inline source really helpful with -O3 where the code is just so far removed from the .c
2024-09-07 00:58:19 yeah. given the amount of time I've spent trying to figure out how a long subroutine works with objdump -D, I think I ... really should have read the motherfucking binutils manual well before now
2024-09-07 00:58:35 I mean one I just now compiled and could easily recompile with -g
2024-09-07 00:59:05 I've often instead used gcc -g -Wa,-adhlns=motherfucker.lst motherfucker.c -c -o motherfucker.o
2024-09-07 00:59:19 It's not clearer with DWORD PTR but I'm quite used to reading with that, it doesn't bother me
2024-09-07 00:59:20 but that's absurdly verbose and hard to follow
2024-09-07 00:59:30 this would have saved me so much time
2024-09-07 00:59:45 thank you
2024-09-07 00:59:50 No problem lol
2024-09-07 01:04:14 I think I actually *have* read *parts* of this manual
2024-09-07 01:04:27 but I never learned I should use -S
2024-09-07 02:04:16 veltas: updated w/the last vm code you posted: http://forth.works/share/lhOvbevf6E.txt
2024-09-07 02:30:52 do you guys use any kind of naming or stack effect comment conventions to indicate a word may use the "rdrop exit" idiom?
2024-09-07 02:44:42 I don't have anything consistent for this in my systems
2024-09-07 02:44:50 were any of those benchmarks performed with -fno-plt? it would be interesting to see if it'll cut out the PLT+GOT extras and if so, will it make much of an impact on performance if the test programs cause ilo to call libc functions if NOSTDLIBC is undefined
2024-09-07 02:46:49 unjust: no, but I'll run some w/that added
2024-09-07 04:05:31 unjust: http://forth.works/share/NHTZLzIumT.txt adds results with -fno-plt (I also added a redirect of output to /dev/null, removing some terminal related overhead)
2024-09-07 04:10:58 the NOSTDLIBC in this case just avoids using printf() to dump the stack on exit, so not really relevant from this
2024-09-07 04:11:37 (it can be used with a host specific set of syscall wrappers for open/close/read/write/lseek/exit instead of linking with libc, useful for creating small static builds on systems with a bloated libc)
2024-09-07 08:38:16 zelgomer: If the word is the execution-time behaviour of something then I call it a 'raw word' and surround it with parens
2024-09-07 08:38:32 And those words almost always manipulate R@
2024-09-07 08:44:10 crc: Sounds like the horrid macro inlining is the way to go then
2024-09-07 08:45:05 You can try doing register globals in the inlined one if you want, but my curiosity is satisfied for now
2024-09-07 08:46:25 zelgomer: Also how I might represent that ( R: x1 -- x2 )
2024-09-07 09:00:24 Yeah the assembly rewrite is easily the best and also the smallest
2024-09-07 13:05:13 crc: thanks for trying that out - interesting that it seems to have performed slightly worse without PLT
2024-09-07 17:04:09 veltas: it occurs to me that ilo's stack items are 32 bits, while you're on amd64, where the "underlying" registers are 64-bit. Could that have been the obstacle to getting GCC to assign T to a register?
2024-09-07 17:04:31 perhaps T was an I (int) rather than a long?