2024-12-17 00:25:25 MrMobius: That's how I plan to first attack my next system. Use gcc to "bootstrap" it. Get a RAM region executable, and then just poke code into it. Eventually jump to it and never return. The main thing I want that for is as a notebook platform for developing the portable Forth parts of my system. Eventually I want it to run on a microcontroller board I've got, which of course will require a
2024-12-17 00:25:28 different native code layer.
2024-12-17 01:22:41 KipIngram: ya that's a good way to do it whether the code ends up in RAM or ROM
2024-12-17 15:53:27 when you write a callback interface in c, how should it look? a) int somefunc(int (*cb)(int x, int y, void *ud), void *userdata); b) int somefunc(int (*cb)(void *ud, int x, int y), void *userdata); or c) int somefunc(int (*cb)(int x, int y, va_list ap), ...); ?
2024-12-17 15:54:47 these are the sort of things that keep me awake at night
2024-12-17 15:59:18 the first one
2024-12-17 15:59:54 the third one is a nightmare through most FFIs
2024-12-17 16:00:42 lol i started converting some of my code to the third one last night just to try it. at least on linux x86_64 it generates absolutely awful asm
2024-12-17 16:01:16 as https://nullprogram.com/blog/2023/12/17/ explains:
2024-12-17 16:01:28 > Note that the context pointer came after the “standard” arguments. All things being equal, “extra” arguments should go after standard ones. But don’t sweat it! In the most common calling conventions this allows stub implementations to be merely an unconditional jump. It’s as though the stubs are a kind of subtype of the original functions.
2024-12-17 16:03:48 that makes sense. my rationale for (b) was that it looks more like you're composing a closure, where you provide a function and its first arg (the closure environment) together. the other one is that i find sometimes the userdata arg is something i want to manipulate, so it ends up being the first arg to functions called from inside the callback.
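[editor's sketch] To make option (a) and the quoted point about stubs concrete, here is a minimal C sketch. Only the shape of somefunc comes from the question above; the loop inside it, and the names pair_cb, add, add_stub, scale and scaled_add, are made up for illustration. With the context pointer last, x and y are already in the right argument positions when add_stub forwards to add, which is why a compiler may be able to reduce such a stub to a plain jump on common calling conventions.

```c
#include <assert.h>
#include <stddef.h>

/* Option (a): the context pointer comes after the "standard" arguments. */
typedef int (*pair_cb)(int x, int y, void *ud);

/* Hypothetical driver: calls cb on the pairs (0,1), (1,2), (2,3)
 * and returns the sum of the results. */
int somefunc(pair_cb cb, void *userdata)
{
    assert(cb != NULL);
    int total = 0;
    for (int i = 0; i < 3; i++)
        total += cb(i, i + 1, userdata);
    return total;
}

/* A context-free function that takes only the standard arguments. */
int add(int x, int y)
{
    return x + y;
}

/* Stub adapting add() to the callback type. It ignores ud, and x and y
 * are passed through unchanged and in the same order. */
int add_stub(int x, int y, void *ud)
{
    (void)ud;
    return add(x, y);
}

/* A callback that does use its context. */
struct scale {
    int factor;
};

int scaled_add(int x, int y, void *ud)
{
    const struct scale *s = ud;
    return s->factor * (x + y);
}
```

Calling somefunc(add_stub, NULL) sums 1 + 3 + 5; passing scaled_add with a struct scale whose factor is 2 doubles that.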
2024-12-17 16:03:53 also, with the third one, there's no standards-compliant way in C to store the va_list userdata for later use
2024-12-17 16:04:39 and i still think it's stupid that int (*)(void *) isn't compatible with int (*)(specific *)
2024-12-17 16:04:44 (although of course for any particular implementation of C there is a way to do it)
2024-12-17 16:05:22 yep, if the callback is to be called after stack unwind then (c) isn't an option
2024-12-17 16:10:52 it's an option if you're writing assembly
2024-12-17 16:15:30 this is a cool blog
2024-12-17 16:15:57 i actually went down my very own closure tinkering similar to his!
2024-12-17 16:16:29 didn't spend as long as he did fully fleshing it out
2024-12-17 16:17:13 yeah, it's a great blog
2024-12-17 16:19:52 https://uncensored.citadel.org/readfwd?go=Programming?start_reading_at=2099372560#2099372560 <-- there's mine. the stupid bbs destroys all of my leading whitespace though so good luck reading it.
2024-12-17 16:20:50 i even mentioned the TLS method that he described, because i've actually done that before. i need to work with this guy.
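[editor's sketch] One reading of "the TLS method" is: when a callback interface has no userdata slot at all (standard qsort is the usual example), stash the context in a thread-local variable just before the call. That reading is an assumption, and the code below is not taken from either post. The names sort_letter, letter_at, compare_by_letter and sort_by_nth_letter are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Per-thread "closure environment" for the comparator (C11 _Thread_local). */
static _Thread_local size_t sort_letter;

/* Character at index i, or 0 if the word is too short to have one. */
static unsigned char letter_at(const char *word, size_t i)
{
    return i < strlen(word) ? (unsigned char)word[i] : 0;
}

/* qsort comparator: it has no userdata parameter, so it reads the
 * letter index from the thread-local variable instead. */
static int compare_by_letter(const void *a, const void *b)
{
    const char *wa = *(const char *const *)a;
    const char *wb = *(const char *const *)b;
    unsigned char ca = letter_at(wa, sort_letter);
    unsigned char cb = letter_at(wb, sort_letter);
    return (ca > cb) - (ca < cb);
}

/* Sort an array of words by their letter at index n (0-based). */
void sort_by_nth_letter(const char **words, size_t count, size_t n)
{
    assert(words != NULL);
    sort_letter = n;  /* set the context just before the call */
    qsort(words, count, sizeof *words, compare_by_letter);
}
```

The trade-off is that the context is implicit: a nested call to sort_by_nth_letter from inside the comparator on the same thread would overwrite it.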
2024-12-17 16:22:18 you might also want to know that GCC does this if you write nested functions (a GCC extension to C)
2024-12-17 16:22:55 Chris probably should have mentioned that in his post, but it's possible he doesn't know about it
2024-12-17 16:25:58 I wrote an example of this in http://canonical.org/~kragen/sw/dev3/qsortn.c which uses the nested-function facility to sort a list of words by their Nth letter, where N is a command-line argument
2024-12-17 16:27:58 a friend of mine said it was the most surprising thing he'd ever seen a compiler do
2024-12-17 16:34:10 on amd64 the GCC code that assembles the trampoline function on the stack is not even that long if compiled with optimization
2024-12-17 16:35:05 this is one of the few GCC extensions that Clang doesn't implement
2024-12-17 16:36:14 oooohhhh it's on the stack
2024-12-17 16:36:26 i've been staring at this the whole time trying to figure out what populated r10
2024-12-17 16:36:37 that's cool
2024-12-17 16:37:43 "less /proc/self/smaps" [stack] is not executable though?
2024-12-17 16:39:54 ah, but it is your program :)
2024-12-17 16:40:15 so gcc changes the elf header to make its stack executable. sneaky.
2024-12-17 16:45:37 yes, and GNU ld emits a warning about this
2024-12-17 18:55:05 hi there! in case someone is developing for the Jupiter ACE, I've created a tool that will help
2024-12-17 18:56:52 https://codeberg.org/pgimeno/Mazogs/src/branch/jupiter_ace/JACompiler
2024-12-17 18:58:11 not sure if that should go to #retro instead, I haven't been to that channel
2024-12-17 19:25:07 pgimeno: that's awesome!
2024-12-17 19:29:11 glad you think so! despite having created this tool, I'm a total novice in Forth, so I might be asking some novice questions
2024-12-17 19:30:56 and here comes the first question, is VOCABULARY used often?
2024-12-17 19:30:56 not anymore!
2024-12-17 19:30:56 oh heh
2024-12-17 19:31:08 vocabulary? we don't need no education
2024-12-17 19:31:11 well, would it be used often if you were programming for a Jupiter ACE?
2024-12-17 19:31:19 good question. I've never tried
2024-12-17 19:31:37 F83 is from about that time, for a similarly sized machine
2024-12-17 19:31:49 my understanding is that it implements a sort of tree of namespaces, if I've understood the descriptions right
2024-12-17 19:31:52 it uses about a dozen vocabularies
2024-12-17 19:32:06 not really a tree, just a flat vocabulary namespace
2024-12-17 19:32:35 vocabularies have two big benefits
2024-12-17 19:32:46 well, three
2024-12-17 19:33:00 (and ANS FORTH wordlists can also be used for these)
2024-12-17 19:34:17 the question was aimed at deciding whether it's worth implementing VOCABULARY in the compiler
2024-12-17 19:35:45 1. They keep your namespace small, which helps with finding things and with detecting misspelled words, and they allow you to distinguish the interface of a module from its implementation, because they're in separate vocabularies.
2024-12-17 19:35:58 This is similar to separate compilation in C allowing you to use `static`
2024-12-17 19:36:39 2. They permit you to compile the same word differently in different contexts, which is essential to metacompilation (compiling a Forth system with itself).
2024-12-17 19:37:32 3. They make it practical to use the Forth outer interpreter as a user interface for your program by putting your program's user interface actions in words with names like a s d f l.
2024-12-17 19:40:04 the bit about metacompilation made my head spin :)
2024-12-17 19:40:10 pgimeno: like xentrac mentioned, ans removed vocabulary, but not without replacing it with something different. i would say having some kind of method of grouping words into vocabularies or word lists or whatever you want to call them is idiomatic today.
2024-12-17 19:40:46 zelgomer: thanks, so that's a vote for "yes it's worth it"
2024-12-17 19:40:56 there are a few here (crc being one example) who don't do it and just let everything go into the global namespace, but they have other techniques of managing it
2024-12-17 19:42:07 pgimeno: sorry about that! What I mean is that when you're metacompiling you need to have at least two definitions of, for example, :: the one that you're using to define new subroutines, and the one that you're compiling into the new Forth system being built.
2024-12-17 19:42:24 (that is, the word whose name is pronounced "colon")
2024-12-17 19:42:30 xentrac: ah, understood
2024-12-17 19:43:42 I think I should actually rewrite the compiler and implement it as an interpreter, to make it more useful. I presume that things like [ ... ] LITERAL are quite common.
2024-12-17 19:44:05 I've definitely seen people do that more than once
2024-12-17 19:44:08 so common that i defined ],
2024-12-17 19:44:31 heh
2024-12-17 19:44:32 yeah i was going to say, so common that i've seen others and have started myself using a single word for it
2024-12-17 19:45:31 in my case, i cheated off of muforth and my "]" is like "] literal" by default, and i use "-]" for the old behavior
2024-12-17 19:45:59 I feel like most of the time it's sort of a dubious optimization
2024-12-17 19:46:30 i feel like most of the time most of what we do here is dubious
2024-12-17 19:46:43 like, you could say [ width height * ] literal but maybe it would be better to say width height * constant gridsize earlier and then just say gridsize where you would have said that
2024-12-17 19:47:07 or just say width height * instead
2024-12-17 19:47:50 I'm porting the ZX81 game Mazogs to the Jupiter ACE, and since the game contains a BASIC part, I need to rewrite it in Forth. The tool will suffice for that purpose because the BASIC part is very... well, basic, but from what I'm gathering here it's lacking in power for using it for development
2024-12-17 19:48:13 it's kind of like how about 80% of the use of Lisp macros in Emacs is just as a function-inlining optimization
2024-12-17 19:48:20 one of my biggest challenges with forth is when it forces me to name a bunch of highly local things that i wouldn't normally have to name in another language, so if i can ever minimize that then i'm in favor of it
2024-12-17 19:48:24 you could maybe write a BASIC interpreter in Forth
2024-12-17 19:49:19 well, width height * is shorter than [ width height * ] literal, and equivalent if width and height are really constants (and often a correctness improvement if they aren't)
2024-12-17 19:49:43 xentrac: probably, but it's really not worth it in this case, rewriting it in Forth is much simpler
2024-12-17 19:49:49 plausible!
2024-12-17 19:49:58 there's a small BASIC interpreter included in F83
2024-12-17 19:52:36 zelgomer: I'm trying to think of what you mean. Is it that you feel a pressure to write smaller subroutines in Forth, which necessitates naming them?
2024-12-17 19:54:07 well like your example. in c it wouldn't be unreasonable to write malloc(width * height), and i never had to name that constant. forth encourages much finer factoring, and naming things is already challenging, so now i find i have to name a bunch of small things that would normally have been anonymous expressions.
2024-12-17 19:54:45 if you turn "] literal" into just "]" then i find [ height width * ] malloc analogous to the c example
2024-12-17 19:56:28 how does one make malloc read the literal from the relevant stack in that case?
2024-12-17 19:56:40 i tried playing around with something like that but couldn't make it not crash
2024-12-17 19:58:40 malloc (which is a hypothetical word in this case) would just get the value from the parameter stack like anything else
2024-12-17 19:59:57 GreaseMonkey: to be super clear, what i wrote there won't work in most forths. i'm talking about the muforth version of "]" which not only enters compilation mode, but also compiles a literal
2024-12-17 20:00:04 ah alright
2024-12-17 20:01:15 ok in that case malloc would probably have to be immediate...?
2024-12-17 20:02:01 that's probably not what you want since it wouldn't make much sense to malloc something only once when the word is compiled
2024-12-17 20:25:41 how about just height width * malloc though?
2024-12-17 20:26:55 my cycles!!!
2024-12-17 20:28:16 if it's something like that then sure, it probably doesn't matter. what if it's not * but log10, though?
2024-12-17 20:44:56 well, you could use an optimizing compiler instead of a stack bytecode interpreter
2024-12-17 20:47:14 if it's something that runs once when your program starts on a 3GHz 64-bit processor which can issue and retire 5 microinstructions per cycle on each of its 16 cores, you can probably tolerate a fair bit of inefficiency in things that don't happen often
2024-12-17 20:47:27 like startup and shutdown kinds of stuff
2024-12-17 20:48:08 and if it's something that you're running in a loop over your 32 gibibytes of RAM, you'll probably want to figure out how to vectorize it with AVX or your local equivalent
2024-12-17 20:48:26 there's code in between those two extremes but not nearly as much as there was, say, 30 years ago
2024-12-17 21:05:37 compiler optimization seems like it would be very difficult here. it would have to be able to recognize that log10 is a pure function and given a constant input will produce a constant output
2024-12-17 21:08:12 ah, recognize the semantics of the function in order to optimize them... like gcc and most other compilers do to optimize memcpy etc.
2024-12-17 21:08:33 I'm pretty sure GCC does recognize that about log10
2024-12-17 21:09:10 for constant inputs? I'm not so sure, I would have to check
2024-12-17 21:11:26 even if it does, it would be because memcpy or log10 are part of the stdlib and either a) the compiler recognizes those functions and knows their behavior, or i would suspect b) they're declared with some __attribute__ ((global_const_or_whatever)) hint that enables it
2024-12-17 21:12:03 i guess you could have code log10 ... end-code i-promise-it's-functionally-pure
2024-12-17 21:12:35 :PURE
2024-12-17 21:12:58 put a bit in the word header if it's pure
2024-12-17 21:13:05 right
2024-12-17 21:13:33 yeah, I tried it just now. it compiled `(int)(100000*log10(3.141592653)+0.5)` to `mov $0xc233, %esi`
2024-12-17 21:13:55 so it evaluated the logarithm at compile time
2024-12-17 21:16:12 GCC has function `__attribute__`s called `pure` and `const` which declare that a function has such properties, and of course in C++ you have `constexpr`
2024-12-17 21:17:17 constexpr is kind of the opposite side though, "I want to use it as a compile-time constant and if it's not possible, please err on me"
2024-12-17 21:17:56 but even if you don't use those, it will inline and constant-fold functions. They don't even have to be `static`
2024-12-17 21:18:31 like, I wrote a function `int doit() { return (int)(100000*log10(3.141592653)+0.5); }` and a call to that function from main() still gets replaced with a mov instruction
2024-12-17 21:18:38 well i gave up trying to find it in the headers, but gcc -E reveals extern double log10 (double __x) __attribute__ ((__nothrow__ , __leaf__));
2024-12-17 21:18:53 interestingly nothing in there to indicate it can be a constant expression
2024-12-17 21:19:20 yeah, I think the floating-point compile-time partial evaluation stuff has special cases for those standard math library functions
2024-12-17 21:20:01 I don't think you need to even have an implementation of log10 in your library for GCC to optimize the above code
2024-12-17 21:20:10 (though you do have to have one in GCC!)
2024-12-17 21:20:16 I know gcc is able to run your code at compile time. I once wrote an MD5 routine and in order to check how well it was optimized, I wrote int main(){puts(md5hex("x"));} and checked the assembly. GCC replaced the md5hex call with the constant!
2024-12-17 21:20:24 hahaha
2024-12-17 21:20:34 that's a lot more than I would have expected
2024-12-17 21:20:50 do you know about the three projections of Dr. Futamura?
2024-12-17 21:20:50 not sure if there will be Forth compilers that clever
2024-12-17 21:21:08 xentrac: if you're asking me, no I don't
2024-12-17 21:22:10 (mandatory xkcd reference: https://xkcd.com/221/ )
2024-12-17 21:22:26 http://blog.sigfpe.com/2009/05/three-projections-of-doctor-futamura.html
2024-12-17 21:23:38 yup
2024-12-17 22:00:24 how hard would it even be to mark functions as pure or impure? when defining a function... i guess one would need to mark it as impure if anything it calls is impure, otherwise it's pure?
2024-12-17 22:01:57 the words @ ! c@ c! , c, would be impure, and various things which stem from it would be
2024-12-17 22:02:27 (incl. allot but probably not here)
2024-12-17 22:02:46 yeah, i think that's true. you would only have to manually mark the code words
2024-12-17 22:03:13 i'm arguing in favour of automatic marking
2024-12-17 22:03:30 i know
2024-12-17 22:04:05 code words, or primitives, would have to be manually marked, and everything else can be derived from those like you described
2024-12-17 22:04:11 and then manual marking would be a choice of one of two things: 1. i need this function to be pure, so if it's impure then abort; and 2. i don't care what you think, just mark the damn thing as pure
2024-12-17 22:05:20 ...oh right also for impure stuff it'd really only be: i want you to treat this as impure, whether it is or not
2024-12-17 22:05:49 but yeah for primitives one would have to think about each word
2024-12-17 22:06:01 ACTION thinks about each word
2024-12-17 22:06:12 at least there's not that many words one needs to think about :^)
2024-12-17 22:07:34 @ c@ could be `pure` but not `const` in GCC's parlance
2024-12-17 22:07:38 the Forth94 spec has a glossary of 7 pages of up to 54 words each and that's all word sets with extensions
2024-12-17 22:08:00 that is, it's safe to not call them if their results will be discarded
2024-12-17 22:08:06 oh right
2024-12-17 22:08:15 (or words containing them)
2024-12-17 22:08:24 there is the thing about an actual pure function, and a function with no side-effects
2024-12-17 22:08:26 as I understand it
2024-12-17 22:08:35 right
2024-12-17 22:08:56 and then there's weaker guarantees like idempotent functions
2024-12-17 22:09:17 so, ! on its own is idempotent, but +! is not
2024-12-17 22:11:03 C has a pretty big advantage here in that functions that store into their local variables can still be `pure` or even `const`
2024-12-17 22:11:58 although I guess you could, as you said, manually mark it with an "i don't care what you think, just mark the damn thing as pure"
2024-12-17 22:25:38 xentrac: the difference is that most forths dont have locals
2024-12-17 22:25:49 everything is either anonymous (ie. on the stack) or global
2024-12-17 22:26:04 it would be non-trivial but not too hard to implement named locals
2024-12-17 22:27:38 hm let's have a go at that actually
2024-12-17 22:30:57 there are a number of Forths that do have named locals
2024-12-17 22:31:06 the difficulty with subroutine-local words in a Forth that doesn't have word nesting is that they make it hard to interactively test the factors of your word
2024-12-17 22:31:20 they tend to favor making your words much larger
2024-12-17 22:33:22 anyway, as interesting as this all is, my response to the original statement, "just use an optimizing compiler instead of [ ] literal", is that i think if i'm choosing forth, i'm choosing it specifically for the simplicity of implementation and i expect for there to be no or very little compiler optimization
2024-12-17 22:34:04 yeah, which is often a reasonable choice
2024-12-17 22:34:19 but it does mean you'll have to drop to assembly more often
2024-12-17 22:35:09 i've gotten to where i don't like what c compilers have become, where we do these ridiculous rain dances and cargo culting to try to get the optimizer to produce what we wanted
2024-12-17 22:35:36 exactly, i think i would actually prefer unsurprising output and drop to assembly when i can't get exactly what i want
2024-12-17 22:51:24 yeah, I think that's reasonable too
2024-12-17 22:51:33 more reasonable now than it was 30 years ago, in fact
2024-12-17 22:51:47 see http://blog.cr.yp.to/20150314-optimizing.html
2024-12-17 22:52:35 oh, uh, that doesn't have the talk slides
2024-12-17 22:53:43 https://cr.yp.to/talks/2015.04.16/slides-djb-20150416-a4.pdf
2024-12-17 22:54:49 > Cheaper computation ⇒ users process more data.
2024-12-17 22:55:08 > Performance issues disappear for most operations. e.g. open file, clean up.
2024-12-17 22:55:17 novice question, why is it that : blah [ 1 ] literal ; works but 1 : blah literal ; doesn't?
2024-12-17 22:55:41 works on my machine
2024-12-17 22:56:00 ok, it may be a Jupiter ACE quirk
2024-12-17 22:56:07 probably is
2024-12-17 22:56:22 see if you can find a listing for the forth it runs
2024-12-17 22:56:30 it'll probably help
2024-12-17 22:56:45 works on fforth, complains on pforth that the stack depth changed during compilation
2024-12-17 22:57:29 maybe : pushes something on the stack
2024-12-17 22:57:42 that way ; can tell if you accidentally write : square dup * ; ;
2024-12-17 22:57:56 well, I guess state would tell it anyway?
2024-12-17 22:58:01 try `: BLAH [ .S ] ;`
2024-12-17 22:58:04 if it has .S
2024-12-17 22:58:38 the jupy doesn't
2024-12-17 22:58:47 write .S
2024-12-17 22:58:50 it's really useful
2024-12-17 22:59:41 could be that : unlinks the word from the dictionary and puts it on the stack rather than smudging the name as many other forths do to hide a word during its compilation
2024-12-17 23:02:41 :NONAME pushes an xt in jonesforth
2024-12-17 23:02:57 although jonesforth also uses the hidden flag to hide words during their own compilation
2024-12-17 23:03:24 how does jonesforth's ; know whether the word was a named or a :noname?
2024-12-17 23:03:42 it doesn't need to
2024-12-17 23:03:58 what reveals the word?
2024-12-17 23:04:03 ;
2024-12-17 23:04:15 :NONAME constructs a dictionary header as normal
2024-12-17 23:04:26 it just puts in a zero char long name
2024-12-17 23:04:30 ie. the empty string
2024-12-17 23:04:43 gross
2024-12-17 23:04:45 it works
2024-12-17 23:05:00 always wasting my precious cycles!!
2024-12-17 23:05:07 also i dont think ive ever used :NONAME for anything other than testing
2024-12-17 23:05:29 that's a lie
2024-12-17 23:05:37 my immediate mode IF uses :NONAME internally
2024-12-17 23:07:40 so it seems plausible that the jupy captures the stack pointer and literal checks for stack underflow with respect to the captured pointer, right?
2024-12-17 23:08:24 pgimeno: i don't know, what does "doesn't work" mean exactly?
2024-12-17 23:08:46 throws an error or crashes and burns?
2024-12-17 23:09:15 it gives an error when entering the definition, and the word doesn't get defined
2024-12-17 23:10:53 then there has to be some kind of safe underflow detection occurring there
2024-12-17 23:10:53 that's honestly pretty irritating
2024-12-17 23:12:14 i also use a LOT of [COMPILE] LITERAL
2024-12-17 23:12:21 amby: agree, but it is allowed by ans, and once i accepted it, i found it convenient to push stuff from : and pop from ;. i get around it by using the return stack instead, but i know that doesn't work at the interpreter level in most forths, either.
2024-12-17 23:14:15 e.g., for a while i was playing games like current @ >r vocabulary private also private definitions : public [ r> ] literal current ! ;
2024-12-17 23:17:10 xentrac: i needed to see these slides before now
2024-12-17 23:18:15 zelgomer: my apologies!
2024-12-17 23:18:29 "the compiler needs to be in a dialog with the programmer; it needs to know properties of the data, and whether certain cases can arise, etc. And we couldn't think of a good language in which to have such a dialog." i think this articulates pretty well what's been troubling me
2024-12-17 23:18:49 zelgomer: oh, that does seem like a reasonable guess
2024-12-17 23:20:00 "A word is not properly structured" suggests that : is indeed pushing a thing on the stack for compiler security. and maybe it's not for error checking in ; but rather in things like again and loop
2024-12-17 23:42:46 I've checked and it fails in the same way if I use this: : blah [ 1 ] ;
2024-12-17 23:43:10 so yes, it's definitely checking for stack imbalance or the like
2024-12-17 23:43:21 ; is finding your 1 is not what it expected
2024-12-17 23:43:58 since you don't have .s, how about just : blah [ dup . ] ;
2024-12-17 23:45:56 yeah it prints 10, and I've checked that this works: : blah [ drop 10 ] ;
2024-12-17 23:46:31 interesting
2024-12-17 23:46:44 so it definitely checks if the stack pointer stays the same at the end, and it seems to place a sentinel or the like on top of that
2024-12-17 23:47:39 what if you do this? : blah [ 10 ] ;
2024-12-17 23:48:01 still expect it to fail, but i wonder if it will fail differently
2024-12-17 23:48:28 hah, it doesn't fail
2024-12-17 23:48:38 it just uses a sentinel, it does not check the pointer
2024-12-17 23:48:54 : blah [ 10 ] ; . ( is there still a 10 on the stack here?)
2024-12-17 23:49:12 i mean, i guess we're pretty sure there is, but just for giggles
2024-12-17 23:50:00 no there isn't
2024-12-17 23:50:52 very interesting
2024-12-17 23:50:53 and this works the way I initially expected: 1 : blah [ drop ] literal [ 10 ] ;
2024-12-17 23:51:15 oh wait, I was mistaken, yes there's a 10 there
2024-12-17 23:51:45 ok
2024-12-17 23:54:20 it's a very cramped 8 KB ROM and it's missing many words from F79, so I'd say it uses every trick it can to keep it simple
2024-12-17 23:55:36 and of course this works: 10 : blah literal ;
2024-12-17 23:57:15 lol so as long as you only want your literal to be 10 then you're good to go
2024-12-17 23:57:49 personally i'd be poking around in the rom to remove that
2024-12-17 23:58:00 if you've only got 8k that seems like a waste of space to implement
2024-12-17 23:58:52 what, compiler security?
2024-12-17 23:59:32 compiler security that is actively getting in the way of code someone wants to write
2024-12-17 23:59:47 haha, problem is, if you shift the ROM routines by one byte (e.g. by removing or adding code), all Forth programs you've saved to tape suddenly become invalid
2024-12-17 23:59:58 by even* one byte
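[editor's sketch] The experiments above fit a simple model: : pushes the marker value 10 on the data stack, LITERAL compiles whatever is on top, and ; only checks that the top of the stack equals 10. Below is a toy C model of that reading. It is inferred purely from the behaviour reported in the chat, not from the Jupiter ACE ROM, and every name in it (struct model, SENTINEL, push, pop, colon, literal, semicolon) is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum { SENTINEL = 10, STACK_MAX = 16 };

/* Toy model of the data stack plus the last literal compiled. */
struct model {
    int stack[STACK_MAX];
    int depth;
    int compiled;
};

void push(struct model *m, int v)
{
    assert(m->depth < STACK_MAX);
    m->stack[m->depth++] = v;
}

int pop(struct model *m)
{
    assert(m->depth > 0);
    return m->stack[--m->depth];
}

/* ":" begins a definition by pushing the marker value. */
void colon(struct model *m)
{
    push(m, SENTINEL);
}

/* LITERAL compiles whatever is on top of the stack, marker or not. */
void literal(struct model *m)
{
    m->compiled = pop(m);
}

/* ";" accepts the definition only if the top of the stack equals the
 * marker; it compares values and does not check the stack depth. */
bool semicolon(struct model *m)
{
    return pop(m) == SENTINEL;
}
```

Under this model, 1 : blah literal ; compiles the marker itself and then ; finds the 1, so it is rejected; 10 : blah literal ; is accepted because the two values are indistinguishable; and : blah [ 10 ] ; is accepted while leaving a 10 behind, matching what was reported.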