2024-09-15 00:00:25 been giving some thought again to an abstract meta compiler. if you were to represent all values as abstract objects, how would , and c, work? they would have to somehow append to a definition, without losing object metadata, and in such a way that "name 5 + c@" is valid and does something sensical. or is there no way to do this and some rules would have to be established for it to work?
2024-09-15 00:10:01 MrMobius: yes, they have excellent rationales for their decisions; they have to do with making simple hardware implementations simpler
2024-09-15 00:13:12 zelgomer: I've been thinking that each cell on a stack or in a created definition can hold either a data pointer, an xt, a return address, or an integer; , or ! would transfer any of these from the operand stack to a cell in a created definition, while @ transfers them from a created definition to the operand stack
2024-09-15 00:14:23 while c, appends an integer cell to the current created definition if necessary and then changes some of the bits of the latest integer cell
2024-09-15 00:15:34 kind of like how Smalltalk works, with dynamic typing instead of no typing
2024-09-15 00:16:43 I was thinking about this for the sake of error reporting, but it's a perspective that might also help with metacompilation
2024-09-15 00:16:48 but I think existing metacompilers have simpler solutions to the problem?
2024-09-15 00:31:17 the reason why i'm going down this road is that i'm not happy with having to specify whether a definition is destined for the target image or for the host compiler. the ultimate dream imo would be that colon definitions, create words, etc. are represented as abstract data at interpretation time, and then you have words for traversing those data and compiling them to a target image
2024-09-15 00:31:32 that makes sense, yeah. it seems like a good idea
2024-09-15 00:32:11 i wanted this years ago but couldn't figure out how to do it. then i decided i was getting ahead of myself and settled for a dumber, single-pass cross compiler. it's ok, i guess, but i feel more comfortable with forth today and i've been thinking of tackling this again
2024-09-15 00:35:19 maybe the way to do it is to just have rules, like, if you append an address which isn't resolvable until it's compiled, then you can only ever access it with @ and an attempt to fetch with c@ just throws an error
2024-09-15 00:39:15 zelgomer: I'm planning on doing something similar. keeping a copy of the definition has a lot of interesting possibilities like editing a word you already entered which is one of my gripes
2024-09-15 00:39:50 you can then tokenize the text input and do optimization on it or just interpret it or compile including to machine code
2024-09-15 00:41:01 the hitch is allocating memory for all of that. you could use a garbage collected linked list but that's not very forth like and consumes more precious ram
2024-09-15 00:41:06 yes, throws an error or just gives you random garbage. "undefined behavior"
2024-09-15 00:41:20 MrMobius: agreed!
2024-09-15 00:42:07 the OpenFirmware approach was to tokenize the input stream
2024-09-15 00:42:32 so instead of "words are separated by spaces" you had "words are 32-bit pointers" (or maybe they used 16-bit, I forget)
2024-09-15 00:50:52 MrMobius: you also get dead code elimination, which means you can include a library and only the parts you use go into the binary. you can also provide high-level implementations in a common lib, and selectively overlay them with optimized code words from the target-specific lib, and then only the code word version is copied to the image as long as it's the only one referenced
2024-09-15 00:51:13 lots of cool stuff becomes possible, it's just tricky to do
2024-09-15 00:54:02 zelgomer: ya tricky but worth it I think. I get why for example you would leave references to an old word in place when you define a new one with the same name but I think changing the word everywhere opens a lot of interesting stuff too
2024-09-15 00:54:20 which you get by keeping the definition in memory so it can be modified later
2024-09-15 00:54:31 yeah, I think that's what you want normally, and I think that's what ColorForth does
2024-09-15 00:54:42 you don't need to keep the source definition in memory for that, though
2024-09-15 00:55:17 I mean if you keep the source definition in memory the user can edit it without having to retype
2024-09-15 00:55:38 and then from there create a new definition or just have it replace the existing one which I like better
2024-09-15 00:56:04 yes, agreed, and both features (editing without retyping and replacing existing definitions) are desirable for a live, modifiable system
2024-09-15 00:57:44 nowadays computers are sufficiently faster than in 01974 that you don't have to choose between memory fragmentation in your dictionary and responsive systems
2024-09-15 00:58:20 a garbage-collected linked list only consumes more precious RAM if it fragments your heap, I mean
2024-09-15 00:59:38 i was planning to terminate the process after compilation. in that case i can reclaim nodes when it's convenient to do so, but otherwise just let them leak.
2024-09-15 00:59:45 yes and people are hesitant to use garbage collection in embedded but I think you could have a good compromise by only allocating, freeing and collecting while defining words but have everything set while words execute
2024-09-15 00:59:56 yes, exactly
2024-09-15 01:00:37 also typing in a definition to play with it and get it just write is so annoying when it's finally correct then you have to type it back out correctly into your main source file. I want a way to save the current definitions out to a file as necessary
2024-09-15 01:00:45 *just right
2024-09-15 01:00:47 agreed
2024-09-15 01:02:18 a full garbage collection on a 64-kibibyte heap or a 1-mebibyte heap might take too long to do it inside your PID control loop for motor waveform generation, but it's plenty fast for an interactive development environment
2024-09-15 01:03:50 You might need to modify the standard Forth approach to creating data objects, things like create cursor 80 , 24 ,
2024-09-15 01:04:57 because you probably want changes to those variables to persist even when you're modifying your program code
2024-09-15 01:06:47 64k is actually a lot
2024-09-15 01:06:58 like when you compare it to the size of modern rams it seems tiny
2024-09-15 01:07:21 but you can fit a lot of stuff in 64k
2024-09-15 01:15:09 now imagine having 640k! that's like infinity
2024-09-15 01:21:16 yeah
2024-09-15 01:21:21 nobody's ever gonna need more than that
2024-09-15 01:23:14 oh hey, Factor isn't dead after all: https://re.factorcode.org/2024/09/factor-0-100-now-available.html
2024-09-15 01:54:30 factor's been under active development for a long time, it just has long gaps between releases
2024-09-15 03:16:54 i think i just figured it out. appending to a word with , or c, must have the same semantics as if it were defined with +field and allotted all at once. each , or c, must be accessed as an atomic field.
2024-09-15 03:32:20 i think that's probably enough to do the right thing in nearly every normal use case
2024-09-15 03:45:40 I think the tricky thing is how to handle move, cmove, fill, and cmove>
2024-09-15 03:46:44 I agree that it should have the same semantics as fields defined with +field
2024-09-15 03:48:25 hmm, yeah. this sucks.
2024-09-15 03:48:28 I don't think it's necessarily important that a series of c@ and c! or c, operations should be able to successfully copy a pointer, but move probably needs to be able to copy pointers
2024-09-15 03:49:34 so I think it's reasonable to treat {c@ c! c,} as accessing fields of a cell, which must have an integer value (rather than a code or data pointer value) for that operation to be valid
2024-09-15 03:50:56 whereas {@ ! , move cmove cmove>} need to be able to handle both integer cells and pointer cells
2024-09-15 03:50:56 yeah. i think that can be done. it would know from the metadata what the fields are, it would just have to copy them and the range would have to fall on field bounds.
2024-09-15 03:51:47 yeah, I could be wrong but after investigating this for a few weeks, I don't think you'll find much standard Forth code that will break with that approach, if that's your concern
2024-09-15 03:54:15 nah, not really concerned with adherence to the standard or compatibility with other forths. my concern is just that it behaves intuitively and there are no surprise limitations down the road.
2024-09-15 03:56:09 the integer/pointer thing is the part i've already started playing with. for that, i just have typed values and polymorphic primitives
2024-09-15 04:01:22 xentrac: the problem with using c@/c!/c, to access a cell field is that the target endianness must be known
2024-09-15 04:03:10 though tbh, when i start thinking about stuff like that to make forth portable, i start questioning the validity of what i'm doing
2024-09-15 04:08:28 i just don't want something so abstract it's like factor. i want it to still feel closer to hw than c.
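The tagged-cell idea discussed above (`,` `!` `@` move whole cells of any kind, while `c,` `c@` are only valid on integer cells and depend on the target's endianness) can be modelled roughly as follows. This is only a sketch in Python rather than Forth, and everything named here (`Body`, `Int`, `Ptr`, `CELL`, the 4-byte cell size, starting a fresh cell when `,` follows a partial `c,`) is an assumption made for illustration, not anything either participant described implementing.

```python
from dataclasses import dataclass

CELL = 4                      # assumed target cell size in bytes
MASK = (1 << (8 * CELL)) - 1  # keeps integer cells within one cell


@dataclass(frozen=True)
class Int:
    value: int


@dataclass(frozen=True)
class Ptr:
    symbol: str               # symbolic address, unresolved until compile time


class Body:
    """Data field of one created word, held as a list of tagged cells."""

    def __init__(self, byteorder):
        self.byteorder = byteorder  # "little" or "big": the *target's* order
        self.cells = []
        self.fill = 0               # bytes already used in the last c, cell

    def comma(self, cell):          # models ,
        self.cells.append(cell)
        self.fill = 0               # assumption: , always starts a new cell

    def c_comma(self, byte):        # models c,
        if self.fill == 0:
            self.cells.append(Int(0))   # append an integer cell if necessary
        raw = bytearray(self.cells[-1].value.to_bytes(CELL, self.byteorder))
        raw[self.fill] = byte & 0xFF    # change some bits of the latest cell
        self.cells[-1] = Int(int.from_bytes(raw, self.byteorder))
        self.fill = (self.fill + 1) % CELL

    def fetch(self, addr):          # models @
        if addr % CELL:
            raise ValueError("@ must fall on a cell boundary")
        return self.cells[addr // CELL]

    def c_fetch(self, addr):        # models c@
        cell = self.cells[addr // CELL]
        if not isinstance(cell, Int):
            raise TypeError("c@ on a pointer cell is not allowed")
        return (cell.value & MASK).to_bytes(CELL, self.byteorder)[addr % CELL]
```

With a little-endian target, `c,` of 3 4 5 6 after one pointer cell yields an integer cell that `@` sees as `0x06050403`; with a big-endian target the same bytes read back as `0x03040506`, while `c@` returns the same bytes at the same offsets in both cases.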
2024-09-15 04:19:47 well, target endianness will affect the results if you read a cell as an integer and write it as bytes, or vice versa
2024-09-15 04:20:10 but that doesn't mean that you can't write portable software that way
2024-09-15 04:21:19 I mean you can say : foo create , , c, c, c, c, ; 1 2 3 4 5 6 foo bar 7 8 9 10 11 12 foo baz and so far everything is fine
2024-09-15 04:22:09 and even if you then say 2 cells 4 + value foosize bar baz foosize move you can copy one to the other in a portable way
2024-09-15 04:24:04 it doesn't matter whether move copying the third cell sees a big-endian value or a little-endian value; later on when you index into baz you'll be indexing by bytes
2024-09-15 04:25:06 as long as it's consistent between your c, and your later indexing
2024-09-15 04:28:35 but what if, from the meta interpreter, i do bar 1+ c@? what if the host differs from the eventual target?
2024-09-15 04:28:59 oh
2024-09-15 04:31:18 yeah it doesn't matter, the integer compiler defined by the back end would know to swap the bytes appropriately
2024-09-15 04:32:15 i mean, the meta part has to return something, but that can be defined
2024-09-15 05:07:17 i think i don't like this. how would it handle just "create foo 200 allot foo foo 80 + !" ?
2024-09-15 05:08:12 it gets into needing array types, which is far enough to make me lose interest
2024-09-15 15:55:36 Wrote some forth, just stack push/pop, http://0x0.st/X3sj.forth
2024-09-15 15:57:43 After some messing around, I realized the 'sequential' version is somewhat complicated in forth -- as in, you need to keep the origin pointer around with DUP and friends, so later in the file there's a more 'discrete' version with explicit variables for each element. after being unsatisfied with that too, i thought binding it all with a word may be a nice solution, and that got me to Perlis' second
2024-09-15 15:57:49 epigram http://cs.yale.edu/homes/perlis-alan/quotes.html
2024-09-15 15:58:00 > 2. Functions delay binding; data structures induce binding. Moral: Structure data late in the programming process.
2024-09-15 18:34:16 well, 80 is a lot less than 200, and it's divisible by all the plausible cell sizes, so I'd think it would work fine: you store a pointer to the beginning of foo in the 11th, 21st, or 41st cell of the foo region
2024-09-15 18:36:25 i woke up this morning and started thinking about a different way of representing metadata than what i was thinking yesterday. may play around with this a little later.
2024-09-15 18:36:48 (depending on whether your cells are 8, 4, or 2 bytes)
2024-09-15 18:42:21 i think my statement about treating ,-built data the same as fields is flawed. what i'm thinking of now is treating all raw memory as contiguous array of bytes, and then have like some kind of structure (maybe binary tree) tracked in an out of band memory region that will try to journal metadata and can be traversed to translate and replay into a target image
2024-09-15 18:43:18 so this approach goes the other way around. ,-made structures aren't treated like fields, instead everything is managed with a common raw bytes array mechanism
2024-09-15 18:45:02 zelgomer: That's about a forth or forth code?
2024-09-15 18:47:52 it's re: some ideas we were discussing yesterday about an interpreter mode where things are collected in abstract data structures that can be evaluated at interpretation time, but then traversed for cross- or optimized compilation
2024-09-15 18:50:59 colon definitions can easily be linked lists of objects, those are the easy case. what makes it hard is forth's model of having direct access to raw memory.
2024-09-15 18:51:37 Hm.. that's probably fun but hardly my area of expertise. But I checked the logs and had the same thoughts about portability.
2024-09-15 18:58:49 which? that it's not worth it? :)
2024-09-15 19:00:30 Indeed, but I mean not worth the effort unless it's absolutely required.
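The "raw byte array plus out-of-band journal" idea can be illustrated with a small model of the `create foo 200 allot foo foo 80 + !` case. This is a hedged sketch in Python, not the design under discussion: `Region`, `store_ptr`, `store_bytes` and `replay` are hypothetical names, a plain dict stands in for the "maybe binary tree", and the cell size is assumed to be known when the region is created.

```python
class Region:
    """Raw bytes plus a side table of cells that hold unresolved pointers."""

    def __init__(self, size, cell=4):
        self.raw = bytearray(size)
        self.cell = cell
        self.journal = {}     # byte offset -> (symbol, addend)

    def _forget(self, off, n):
        # Any write overlapping a journalled cell invalidates that entry.
        for r in [r for r in self.journal if r < off + n and off < r + self.cell]:
            del self.journal[r]

    def store_ptr(self, off, symbol, addend=0):      # models ! of an address
        if off < 0 or off + self.cell > len(self.raw):
            raise IndexError("pointer store outside the region")
        self._forget(off, self.cell)
        self.journal[off] = (symbol, addend)

    def store_bytes(self, off, data):                # models c! / cmove of bytes
        if off < 0 or off + len(data) > len(self.raw):
            raise IndexError("byte store outside the region")
        self._forget(off, len(data))
        self.raw[off:off + len(data)] = data

    def replay(self, symbols, byteorder):
        """Produce target bytes once every symbol has a final address."""
        out = bytearray(self.raw)
        for off, (symbol, addend) in self.journal.items():
            value = symbols[symbol] + addend
            out[off:off + self.cell] = value.to_bytes(self.cell, byteorder)
        return bytes(out)
```

For example, `foo = Region(200); foo.store_ptr(80, "foo")` records one journal entry at byte offset 80 (the 21st cell with 4-byte cells), and `foo.replay({"foo": 0x1000}, "little")` only fixes the byte order and the address of `foo` at that final step.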
2024-09-15 19:53:31 zelgomer: so basically you're going to put linker relocations in your out-of-band memory? That could work, but you will also need to keep track of the same metadata for items on the operand or return stack at compile time
2024-09-15 19:53:57 the memory model in the Forth Standard doesn't actually require you to have direct access to raw memory
2024-09-15 19:55:45 https://forth-standard.org/standard/usage#section.3.3 describes it
2024-09-15 19:56:14 "Data space is the only logical area of the dictionary for which standard words are provided to allocate and access regions of memory. These regions are: contiguous regions, variables, text-literal regions, input buffers, and other transient regions, each of which is described in the following sections. A program may read from or write into these regions unless otherwise specified."
2024-09-15 19:57:15 it's fairly similar to the ANSI C model of memory, really
2024-09-15 20:46:08 i think what i'm describing is probably linker relocations, but i don't actually know how those work so i can't say for sure
2024-09-15 21:18:41 and yes, still emulating the stacks to track basic types at interpretation time
2024-09-15 23:57:22 yeah
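Tracking the same metadata on the emulated operand stack ("typed values and polymorphic primitives") might look something like this for `+` and `-`, which is what a phrase like `name 5 +` needs before any address is resolved. Again a Python sketch under assumptions: `Ptr`, `plus` and `minus` are illustrative names, and the rule that two unresolved addresses cannot be added is a choice made here, not something stated in the log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ptr:
    symbol: str        # which definition the address points into
    offset: int = 0    # byte offset, known even before the address is


def plus(stack):
    """Polymorphic + on an emulated operand stack of ints and Ptrs."""
    b = stack.pop()
    a = stack.pop()
    if isinstance(a, Ptr) and isinstance(b, Ptr):
        raise TypeError("cannot add two unresolved addresses")
    if isinstance(a, Ptr):
        stack.append(Ptr(a.symbol, a.offset + b))
    elif isinstance(b, Ptr):
        stack.append(Ptr(b.symbol, b.offset + a))
    else:
        stack.append(a + b)


def minus(stack):
    """Polymorphic -: pointer differences within one symbol are plain ints."""
    b = stack.pop()
    a = stack.pop()
    if isinstance(a, Ptr) and isinstance(b, Ptr):
        if a.symbol != b.symbol:
            raise TypeError("difference of addresses in different definitions")
        stack.append(a.offset - b.offset)
    elif isinstance(a, Ptr):
        stack.append(Ptr(a.symbol, a.offset - b))
    elif isinstance(b, Ptr):
        raise TypeError("cannot subtract an address from an integer")
    else:
        stack.append(a - b)
```

Starting from `stack = [Ptr("name"), 5]`, calling `plus(stack)` leaves a single `Ptr` into `name` at offset 5, which a later `c@` or `@` can check against whatever metadata is kept for `name`.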