2024-09-15 00:00:25 <zelgomer    > been giving some thought again to an abstract meta compiler. if you were to represent all values as abstract objects, how would , and c, work? they would have to somehow append to a definition, without losing object metadata, and in such a way that "name 5 + c@" is valid and does something sensible. or is there no way to do this and some rules would have to be established for it to work?
2024-09-15 00:10:01 <xentrac     > MrMobius: yes, they have excellent rationales for their decisions; they have to do with making simple hardware implementations simpler
2024-09-15 00:13:12 <xentrac     > zelgomer: I've been thinking that each cell on a stack or in a created definition can hold either a data pointer, an xt, a return address, or an integer; , or ! would transfer any of these from the operand stack to a cell in a created definition, while @ transfers them from a created definition to the operand stack
2024-09-15 00:14:23 <xentrac     > while c, appends an integer cell to the current created definition if necessary and then changes some of the bits of the latest integer cell
2024-09-15 00:15:34 <xentrac     > kind of like how Smalltalk works, with dynamic typing instead of no typing
2024-09-15 00:16:43 <xentrac     > I was thinking about this for the sake of error reporting, but it's a perspective that might also help with metacompilation
2024-09-15 00:16:48 <xentrac     > but I think existing metacompilers have simpler solutions to the problem?
2024-09-15 00:31:17 <zelgomer    > the reason why i'm going down this road is that i'm not happy with having to specify whether a definition is destined for the target image or for the host compiler. the ultimate dream imo would be that colon definitions, create words, etc. are represented as abstract data at interpretation time, and then you have words for traversing those data and compiling them to a target image
2024-09-15 00:31:32 <xentrac     > that makes sense, yeah.  it seems like a good idea
2024-09-15 00:32:11 <zelgomer    > i wanted this years ago but couldn't figure out how to do it. then i decided i was getting ahead of myself and settled for a dumber, single-pass cross compiler. it's ok, i guess, but i feel more comfortable with forth today and i've been thinking of tackling this again
2024-09-15 00:35:19 <zelgomer    > maybe the way to do it is to just have rules, like, if you append an address which isn't resolvable until it's compiled, then you can only ever access it with @ and an attempt to fetch with c@ just throws an error
2024-09-15 00:39:15 <MrMobius    > zelgomer: I'm planning on doing something similar. keeping a copy of the definition has a lot of interesting possibilities like editing a word you already entered which is one of my gripes
2024-09-15 00:39:50 <MrMobius    > you can then tokenize the text input and do optimization on it or just interpret it or compile including to machine code
2024-09-15 00:41:01 <MrMobius    > the hitch is allocating memory for all of that. you could use a garbage collected linked list but that's not very forth like and consumes more precious ram
2024-09-15 00:41:06 <xentrac     > yes, throws an error or just gives you random garbage.  "undefined behavior"
2024-09-15 00:41:20 <xentrac     > MrMobius: agreed!
2024-09-15 00:42:07 <xentrac     > the OpenFirmware approach was to tokenize the input stream
2024-09-15 00:42:32 <xentrac     > so instead of "words are separated by spaces" you had "words are 32-bit pointers" (or maybe they used 16-bit, I forget)
2024-09-15 00:50:52 <zelgomer    > MrMobius: you also get dead code elimination, which means you can include a library and only the parts you use go into the binary. you can also provide high-level implementations in a common lib, and selectively overlay them with optimized code words from the target-specific lib, and then only the code word version is copied to the image as long as it's the only one referenced
2024-09-15 00:51:13 <zelgomer    > lots of cool stuff becomes possible, it's just tricky to do
2024-09-15 00:54:02 <MrMobius    > zelgomer: ya tricky but worth it I think. I get why for example you would leave references to an old word in place when you define a new one with the same name but I think changing the word everywhere opens a lot of interesting stuff too
2024-09-15 00:54:20 <MrMobius    > which you get by keeping the definition in memory so it can be modified later
2024-09-15 00:54:31 <xentrac     > yeah, I think that's what you want normally, and I think that's what ColorForth does
2024-09-15 00:54:42 <xentrac     > you don't need to keep the source definition in memory for that, though
2024-09-15 00:55:17 <MrMobius    > I mean if you keep the source definition in memory the user can edit it without having to retype
2024-09-15 00:55:38 <MrMobius    > and then from there create a new definition or just have it replace the existing one which I like better
2024-09-15 00:56:04 <xentrac     > yes, agreed, and both features (editing without retyping and replacing existing definitions) are desirable for a live, modifiable system
2024-09-15 00:57:44 <xentrac     > nowadays computers are sufficiently faster than in 01974 that you don't have to choose between memory fragmentation in your dictionary and responsive systems
2024-09-15 00:58:20 <xentrac     > a garbage-collected linked list only consumes more precious RAM if it fragments your heap, I mean
2024-09-15 00:59:38 <zelgomer    > i was planning to terminate the process after compilation. in that case i can reclaim nodes when it's convenient to do so, but otherwise just let them leak.
2024-09-15 00:59:45 <MrMobius    > yes and people are hesitant to use garbage collection in embedded but I think you could have a good compromise by only allocating, freeing and collecting while defining words but have everything set while words execute
2024-09-15 00:59:56 <xentrac     > yes, exactly
2024-09-15 01:00:37 <MrMobius    > also typing in a definition to play with it and get it just write is so annoying when it's finally correct then you have to type it back out correctly into your main source file. I want a way to save the current definitions out to a file as necessary
2024-09-15 01:00:45 <MrMobius    > *just right
2024-09-15 01:00:47 <xentrac     > agreed
2024-09-15 01:02:18 <xentrac     > a full garbage collection on a 64-kibibyte heap or a 1-mebibyte heap might take too long to do it inside your PID control loop for motor waveform generation, but it's plenty fast for an interactive development environment
2024-09-15 01:03:50 <xentrac     > You might need to modify the standard Forth approach to creating data objects, things like create cursor 80 , 24 ,
2024-09-15 01:04:57 <xentrac     > because you probably want changes to those variables to persist even when you're modifying your program code
2024-09-15 01:06:47 <amby        > 64k is actually a lot
2024-09-15 01:06:58 <amby        > like when you compare it to the size of modern rams it seems tiny
2024-09-15 01:07:21 <amby        > but you can fit a lot of stuff in 64k
2024-09-15 01:15:09 <zelgomer    > now imagine having 640k! that's like infinity
2024-09-15 01:21:16 <amby        > yeah
2024-09-15 01:21:21 <amby        > nobody's ever gonna need more than that
2024-09-15 01:23:14 <xentrac     > oh hey, Factor isn't dead after all: https://re.factorcode.org/2024/09/factor-0-100-now-available.html
2024-09-15 01:54:30 <crc         > factor's been under active development for a long time, it just has long gaps between releases
2024-09-15 03:16:54 <zelgomer    > i think i just figured it out. appending to a word with , or c, must have the same semantics as if it were defined with +field and allotted all at once. each , or c, must be accessed as an atomic field.
2024-09-15 03:32:20 <zelgomer    > i think that's probably enough to do the right thing in nearly every normal use case
2024-09-15 03:45:40 <xentrac     > I think the tricky thing is how to handle move, cmove, fill, and cmove>
2024-09-15 03:46:44 <xentrac     > I agree that it should have the same semantics as fields defined with +field
2024-09-15 03:48:25 <zelgomer    > hmm, yeah. this sucks.
2024-09-15 03:48:28 <xentrac     > I don't think it's necessarily important that a series of c@ and c! or c, operations should be able to successfully copy a pointer, but move probably needs to be able to copy pointers
2024-09-15 03:49:34 <xentrac     > so I think it's reasonable to treat {c@ c! c,} as accessing fields of a cell, which must have an integer value (rather than a code or data pointer value) for that operation to be valid
2024-09-15 03:50:56 <xentrac     > whereas {@ ! , move cmove cmove>} need to be able to handle both integer cells and pointer cells
2024-09-15 03:50:56 <zelgomer    > yeah. i think that can be done. it would know from the metadata what the fields are, it would just have to copy them and the range would have to fall on field bounds.
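The rules agreed on above (byte words only touch integer cells, cell words and move handle any cell kind, and move ranges must land on field bounds) can be modelled in a few lines. This is only a sketch in Python, not anyone's actual implementation: the names `Cell`, `Definition`, and `move`, the 4-byte cell size, the little-endian byte numbering inside a cell, and the choice to align before `,` are all assumptions made for illustration. Fields are simplified to whole cells, so a range ending partway through a cell is rejected.

```python
from dataclasses import dataclass

CELL = 4  # assumed cell size in bytes; the discussion does not fix one


@dataclass
class Cell:
    kind: str      # "int" or "ptr" (hypothetical tags)
    value: object  # an int, or a symbolic name standing in for a pointer


class Definition:
    """A created definition modelled as a list of tagged cells."""

    def __init__(self):
        self.cells = []  # one Cell per CELL bytes of data space
        self.here = 0    # byte offset of the next free byte

    def comma(self, kind, value):
        """`,` : append one whole tagged cell, aligning first (assumption)."""
        self.cells.append(Cell(kind, value))
        self.here = len(self.cells) * CELL

    def c_comma(self, byte):
        """`c,` : add an integer cell if needed, then set one byte of it."""
        if self.here == len(self.cells) * CELL:
            self.cells.append(Cell("int", 0))
        self.c_store(self.here, byte)
        self.here += 1

    def fetch(self, offset):
        """`@` : valid for any cell kind, but only on a cell boundary."""
        if offset % CELL:
            raise ValueError("unaligned cell access")
        return self.cells[offset // CELL]

    def _int_cell(self, offset):
        cell = self.cells[offset // CELL]
        if cell.kind != "int":
            raise TypeError("byte access to a non-integer cell")
        return cell

    def c_fetch(self, offset):
        """`c@` : read one byte of an integer cell."""
        shift = 8 * (offset % CELL)  # little-endian numbering (assumption)
        return (self._int_cell(offset).value >> shift) & 0xFF

    def c_store(self, offset, byte):
        """`c!` : change one byte of an integer cell."""
        cell = self._int_cell(offset)
        shift = 8 * (offset % CELL)
        cell.value = (cell.value & ~(0xFF << shift)) | ((byte & 0xFF) << shift)


def move(src, src_off, dst, dst_off, nbytes):
    """`move` : copy whole tagged cells; the range must fall on cell bounds."""
    if src_off % CELL or dst_off % CELL or nbytes % CELL:
        raise ValueError("move range must fall on cell boundaries")
    for i in range(nbytes // CELL):
        c = src.cells[src_off // CELL + i]
        dst.cells[dst_off // CELL + i] = Cell(c.kind, c.value)
```

With this model, `c_fetch` on a cell that holds a pointer raises an error instead of returning garbage, while `move` copies the pointer together with its tag.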
2024-09-15 03:51:47 <xentrac     > yeah, I could be wrong but after investigating this for a few weeks, I don't think you'll find much standard Forth code that will break with that approach, if that's your concern
2024-09-15 03:54:15 <zelgomer    > nah, not really concerned with adherence to the standard or compatibility with other forths. my concern is just that it behaves intuitively and there are no surprise limitations down the road.
2024-09-15 03:56:09 <zelgomer    > the integer/pointer thing is the part i've already started playing with. for that, i just have typed values and polymorphic primitives
2024-09-15 04:01:22 <zelgomer    > xentrac: the problem with using c@/c!/c, to access a cell field is that the target endianness must be known
2024-09-15 04:03:10 <zelgomer    > though tbh, when i start thinking about stuff like that to make forth portable, i start questioning the validity of what i'm doing
2024-09-15 04:08:28 <zelgomer    > i just don't want something so abstract it's like factor. i want it to still feel closer to hw than c.
2024-09-15 04:19:47 <xentrac     > well, target endianness will affect the results if you read a cell as an integer and write it as bytes, or vice versa
2024-09-15 04:20:10 <xentrac     > but that doesn't mean that you can't write portable software that way
2024-09-15 04:21:19 <xentrac     > I mean you can say : foo create , , c, c, c, c, ;  1 2 3 4 5 6 foo bar  7 8 9 10 11 12 foo baz and so far everything is fine
2024-09-15 04:22:09 <xentrac     > and even if you then say 2 cells 4 + value foosize  bar baz foosize move you can copy one to the other in a portable way
2024-09-15 04:24:04 <xentrac     > it doesn't matter whether move copying the third cell sees a big-endian value or a little-endian value; later on when you index into baz you'll be indexing by bytes
2024-09-15 04:25:06 <xentrac     > as long as it's consistent between your c, and your later indexing
2024-09-15 04:28:35 <zelgomer    > but what if, from the meta interpreter, i do bar 1+ c@? what if the host differs from the eventual target?
2024-09-15 04:28:59 <zelgomer    > oh
2024-09-15 04:31:18 <zelgomer    > yeah it doesn't matter, the integer compiler defined by the back end would know to swap the bytes appropriately
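The endianness point can be checked with a small Python sketch. It assumes a 32-bit target and a hypothetical back-end function `emit` that lays out abstract items; neither comes from the discussion. The cells are written in the target's byte order, while the bytes appended with `c,` keep their order on every target, which is why indexing by bytes stays portable.

```python
import struct


def emit(items, endian):
    """Lay out ("cell", n) and ("byte", n) items for a 32-bit target.

    `endian` is "<" for little-endian or ">" for big-endian.
    """
    out = bytearray()
    for kind, n in items:
        if kind == "cell":
            out += struct.pack(endian + "I", n)
        else:
            out.append(n)
    return bytes(out)


# `1 2 3 4 5 6 foo bar` with `: foo create , , c, c, c, c, ;`
# Each `,` or `c,` consumes the top of the stack, so 6 is stored first.
bar = [("cell", 6), ("cell", 5),
       ("byte", 4), ("byte", 3), ("byte", 2), ("byte", 1)]

little = emit(bar, "<")
big = emit(bar, ">")

# The cell parts differ in layout between the two targets...
assert little[:4] != big[:4]
# ...but each reads back as 6 when fetched in its own byte order,
assert struct.unpack("<I", little[:4])[0] == 6
assert struct.unpack(">I", big[:4])[0] == 6
# and the `c,` bytes are identical on both.
assert little[8:] == big[8:] == bytes([4, 3, 2, 1])
```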
2024-09-15 04:32:15 <zelgomer    > i mean, the meta part has to return something, but that can be defined
2024-09-15 05:07:17 <zelgomer    > i think i don't like this. how would it handle just "create foo 200 allot  foo foo 80 + !" ?
2024-09-15 05:08:12 <zelgomer    > it gets into needing array types, which is far enough to make me lose interest
2024-09-15 15:55:36 <user51      > Wrote some forth, just stack push/pop, http://0x0.st/X3sj.forth
2024-09-15 15:57:43 <user51      > After some messing around, I realized the 'sequential' version is somewhat complicated in forth -- as in, you need to keep the origin pointer around with DUP and friends, so later in the file there's a more 'discrete' version with explicit variables for each element.  after being unsatisfied with that too, i thought binding it all with a word may be a nice solution, and that got me to Perlis' second
2024-09-15 15:57:49 <user51      > epigram http://cs.yale.edu/homes/perlis-alan/quotes.html
2024-09-15 15:58:00 <user51      > >  2. Functions delay binding; data structures induce binding. Moral: Structure data late in the programming process.
2024-09-15 18:34:16 <xentrac     > well, 80 is a lot less than 200, and it's divisible by all the plausible cell sizes, so I'd think it would work fine: you store a pointer to the beginning of foo in the 11th, 21st, or 41st cell of the foo region
2024-09-15 18:36:25 <zelgomer    > i woke up this morning and started thinking about a different way of representing metadata than what i was thinking yesterday. may play around with this a little later.
2024-09-15 18:36:48 <xentrac     > (depending on whether your cells are 8, 4, or 2 bytes)
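The cell positions quoted above follow from dividing the byte offset 80 by the cell size and counting from one. A quick Python check, where `cell_ordinal` is a hypothetical helper name:

```python
def cell_ordinal(byte_offset, cell_size):
    """Return the 1-based position of the cell starting at byte_offset."""
    if byte_offset % cell_size:
        raise ValueError("offset is not cell-aligned")
    return byte_offset // cell_size + 1


# 80 is divisible by each plausible cell size, so the store is aligned.
ordinals = [cell_ordinal(80, size) for size in (8, 4, 2)]
assert ordinals == [11, 21, 41]
```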
2024-09-15 18:42:21 <zelgomer    > i think my statement about treating ,-built data the same as fields is flawed. what i'm thinking of now is treating all raw memory as a contiguous array of bytes, and then have like some kind of structure (maybe binary tree) tracked in an out-of-band memory region that will try to journal metadata and can be traversed to translate and replay into a target image
2024-09-15 18:43:18 <zelgomer    > so this approach goes the other way around. ,-made structures aren't treated like fields, instead everything is managed with a common raw bytes array mechanism
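One way to picture the raw-bytes-plus-journal idea is the Python sketch below. It is a guess at the shape of the mechanism, not zelgomer's design: the class `Image`, its method names, the plain dict used as the journal (instead of a tree), and the assumption that host and target share a cell size are all made up for illustration. Overlapping cell stores are not handled.

```python
class Image:
    """Contiguous raw bytes plus an out-of-band journal of typed cells."""

    def __init__(self, cell_size=4):
        self.cell_size = cell_size  # assumed equal on host and target
        self.data = bytearray()     # raw data space
        self.journal = {}           # offset -> ("int", n) or ("addr", offset)

    def allot(self, n):
        """`allot` : reserve n zeroed bytes."""
        self.data += bytes(n)

    def c_comma(self, byte):
        """`c,` : append one raw byte; no metadata is needed."""
        self.data.append(byte & 0xFF)

    def store(self, offset, kind, value):
        """`!` : record what a cell holds; a later store replaces the entry."""
        if offset < 0 or offset + self.cell_size > len(self.data):
            raise IndexError("store outside data space")
        self.journal[offset] = (kind, value)

    def comma(self, kind, value):
        """`,` : reserve a cell at the end and journal its contents."""
        offset = len(self.data)
        self.allot(self.cell_size)
        self.store(offset, kind, value)

    def replay(self, base, byteorder):
        """Translate into a target image loaded at address `base`."""
        out = bytearray(self.data)
        mask = (1 << (8 * self.cell_size)) - 1
        for offset, (kind, value) in self.journal.items():
            n = (base + value if kind == "addr" else value) & mask
            out[offset:offset + self.cell_size] = n.to_bytes(
                self.cell_size, byteorder)
        return bytes(out)


# The earlier puzzle, `create foo 200 allot  foo foo 80 + !`:
img = Image()
img.allot(200)             # foo starts at offset 0
img.store(80, "addr", 0)   # a pointer to foo, stored 80 bytes into foo
target = img.replay(base=0x1000, byteorder="big")
assert target[80:84] == bytes([0x00, 0x00, 0x10, 0x00])
```

Because the address is only resolved in `replay`, the same journal can produce images for different load addresses and byte orders.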
2024-09-15 18:45:02 <user51      > zelgomer: That's about a forth or forth code?
2024-09-15 18:47:52 <zelgomer    > it's re: some ideas we were discussing yesterday about an interpreter mode where things are collected in abstract data structures that can be evaluated at interpretation time, but then traversed for cross- or optimized compilation
2024-09-15 18:50:59 <zelgomer    > colon definitions can easily be linked lists of objects, those are the easy case. what makes it hard is forth's model of having direct access to raw memory.
2024-09-15 18:51:37 <user51      > Hm.. that's probably fun but hardly my area of expertise.  But I checked the logs and had the same thoughts about portability.
2024-09-15 18:58:49 <zelgomer    > which? that it's not worth it? :)
2024-09-15 19:00:30 <user51      > Indeed, but I mean not worth the effort unless it's absolutely required.
2024-09-15 19:53:31 <xentrac     > zelgomer: so basically you're going to put linker relocations in your out-of-band memory?  That could work, but you will also need to keep track of the same metadata for items on the operand or return stack at compile time
2024-09-15 19:53:57 <xentrac     > the memory model in the Forth Standard doesn't actually require you to have direct access to raw memory
2024-09-15 19:55:45 <xentrac     > https://forth-standard.org/standard/usage#section.3.3 describes it
2024-09-15 19:56:14 <xentrac     > "Data space is the only logical area of the dictionary for which standard words are provided to allocate and access regions of memory. These regions are: contiguous regions, variables, text-literal regions, input buffers, and other transient regions, each of which is described in the following sections. A program may read from or write into these regions unless otherwise specified."
2024-09-15 19:57:15 <xentrac     > it's fairly similar to the ANSI C model of memory, really
2024-09-15 20:46:08 <zelgomer    > i think what i'm describing is probably linker relocations, but i don't actually know how those work so i can't say for sure
2024-09-15 21:18:41 <zelgomer    > and yes, still emulating the stacks to track basic types at interpretation time
2024-09-15 23:57:22 <xentrac     > yeah