2023-05-30 06:58:14 So, I want a table of variables right at the start of my image. Those variables need headers, and those headers need to refer to code via the cfa fields. But at the time I declare that table, the "dovar" code doesn't yet exist.
2023-05-30 06:58:46 So I need some kind of "forward reference" mechanism. I'm thinking of a word I might call "forward" or something like that.
2023-05-30 06:59:00 forward dovar
2023-05-30 06:59:18 : var dovar pfa, header ;
2023-05-30 06:59:47 Whenever dovar executes, it will extend a list of locations in the image which eventually need to be backpatched with that reference.
2023-05-30 07:00:39 Then later I'll be able to say "resolve dovar" - at that time all the addresses in that list will get set to the now current offset in the body section, where I will then define the dovar code using assembly.
2023-05-30 07:01:35 Further invocations of dovar will immediately slap that code address into the next cfa item.
2023-05-30 07:23:23 I think I'm going to write this so that it creates the content in blocks "ready to run in place." Then when I've got the whole thing built, I'll write a "squeezer" that trims out the leftover space between sections and compacts it into a final "image."
2023-05-30 07:23:47 That'll do the reverse of this "loading" process I've already designed - change addresses to offsets and so on.
2023-05-30 07:24:58 Wait - that's not exactly what I mean. What I meant to say is that it won't have to be squashed before running that load process on it. I'll be able to load it "spaced out."
2023-05-30 07:25:13 It won't literally run the block contents.
2023-05-30 09:14:50 One thing I'm trying to do here is to run "meta" (host system) code to create these parts of the target image, but the key variables involved (pointer to last word, next dictionary byte pointer, etc.) are IN the target image. That's basically what that initial table of variables is.
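The forward/resolve idea described above can be sketched in a few lines. This is a minimal illustration in Python rather than the host Forth, and it assumes a single flat list of cells instead of separate header and body sections. The names `Forward`, `Image`, `use`, and `resolve` are hypothetical and do not come from the log.

```python
class Forward:
    """A forward reference: collects patch sites until its target is known."""

    def __init__(self, name):
        self.name = name
        self.target = None      # unknown until Image.resolve() is called
        self.patch_sites = []   # cell indices waiting for the target


class Image:
    """Toy target image: one flat list of cells (an assumed layout)."""

    def __init__(self):
        self.cells = []

    def here(self):
        """Offset of the next free cell."""
        return len(self.cells)

    def comma(self, value):
        """Append one cell to the image."""
        self.cells.append(value)

    def use(self, fwd):
        """Compile a reference to fwd into the next cell (e.g. a cfa field)."""
        if fwd.target is None:
            fwd.patch_sites.append(self.here())
            self.comma(0)               # placeholder, patched by resolve()
        else:
            self.comma(fwd.target)      # already resolved: store it directly

    def resolve(self, fwd):
        """Bind fwd to the current offset and backpatch every earlier use."""
        fwd.target = self.here()
        for site in fwd.patch_sites:
            self.cells[site] = fwd.target
        fwd.patch_sites.clear()


image = Image()
dovar = Forward("dovar")
image.use(dovar)            # cfa of the first variable, target still unknown
image.comma(111)            # its parameter field
image.use(dovar)            # cfa of the second variable
image.comma(222)
image.resolve(dovar)        # dovar's code starts at the current offset
image.comma("dovar-code")   # stand-in for the assembled dovar routine
image.use(dovar)            # later uses get the address immediately
print(image.cells)
```

Uses made before `resolve` cost one list entry each, and uses made after it cost nothing extra, which matches the behaviour described for "forward dovar" and "resolve dovar".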
2023-05-30 09:22:11 Also, there are certain variables that we traditionally expect to be addresses, like DP. I think I'm just going to define those as offsets into some particular region - otherwise I have to "patch them" into addresses at load time. Which isn't "hard" or anything, but there are just a good many of them and defining them as offsets solves the whole problem. The items that will get patched at load time will be
2023-05-30 09:22:12 variables giving the base addresses of the various regions.
2023-05-30 15:07:26 Hi everyone. I'm building a multitasking Forth for an embedded system (no virtual memory) and I'd like to know if anyone has any thoughts or information from other Forth implementations on the system I'm thinking of.
2023-05-30 15:07:26 Let's say I have a task (A) that requires some custom definitions to run. While task (A) is running I start task (B) that also requires its own custom definitions. Then I want to kill task A and free up some memory by forgetting all of the words that pertain only to task A. Giving each task its own dictionary solves this problem as I can deallocate the entire dictionary/user area for task A. (If there were only one dictionary, then FORGETting
2023-05-30 15:07:26 words for task (A) would clobber the words for task (B) because of their definitions in chronological order.)
2023-05-30 15:07:28 But what if I have task (A) and (B) that both share a large chunk of definitions (e.g. bitmap graphics) and task (C) which is completely unrelated. I'd prefer to have tasks (A) and (B) not duplicate the bitmap graphics code in each of their own private dictionaries, but have the equivalent of a "shared library." Then, when tasks (A) and (B) are both killed, I can safely deallocate the bitmap graphics dictionary.
This seems doable to me from a
2023-05-30 15:07:33 logical standpoint, but I can't seem to come up with a nice way to handle this from a memory standpoint because creating and destroying dictionaries of arbitrary size will just make a fragmented mess of my RAM. Maybe there's a way to create a dictionary that's made of fixed-sized blocks linked together (pool allocator)?
2023-05-30 15:07:37 Have any Forth systems ever implemented the equivalent of a "shared library"? My understanding of the true utility of wordlists/vocabularies is a little weak (wrong tool for the job?), but as I understand it they are just woven throughout a single dictionary via pointers, so you can't really FORGET a wordlist.
2023-05-30 15:09:14 probably you would do the shared stuff up to some point, then forget back to that point when needed
2023-05-30 15:13:05 thrig: yeah, but if tasks are being created and killed free-form like you use a desktop OS, then chronological forget doesn't work.
2023-05-30 15:14:15 probably a dictionary for an embedded device wouldn't be designed like X11 on unix
2023-05-30 16:48:27 winduptoy: vocabularies can solve your word list problem, if you provide some kind of heap memory allocation. Anything you intended to be temporary would a) go into its own vocabulary and b) get compiled into a heap block allocated for that purpose. Then you could "forget" everything in that vocabulary, and deallocate the heap block.
2023-05-30 16:48:42 Typical Forths don't have heaps, but nothing really precludes you having one.
2023-05-30 16:49:18 a 1980s microcomputer might have different concerns than a modern "embedded" system
2023-05-30 16:49:19 If you did this sort of thing a lot you'd potentially get fragmentation, since you can't move words once they're compiled.
2023-05-30 16:50:49 KipIngram: thank you. yeah, the fragmentation is my fear. has anyone ever implemented a forth dictionary in a pool-based manner (e.g. 1kb blocks)?
I can't seem to make it work in my head, because when you're compiling you don't know how long the definition is so you don't know when to allocate a new block until it's too late.
2023-05-30 16:52:46 I have done that with 4kB blocks.
2023-05-30 16:53:02 You have to be able to recognize when a block is almost full.
2023-05-30 16:53:03 winduptoy: I would use garbage collection if I were you as long as you can stand to stall while a task ends
2023-05-30 16:53:25 The way I did it was to require headers fit completely in one block, but definitions could jump from block to block - you just compile a jump at the end.
2023-05-30 16:53:27 assuming "embedded" here means lowest memory resources
2023-05-30 16:53:41 Doing this with small fixed size blocks is feasible, and that solves your fragmentation problem.
2023-05-30 16:54:18 So my headers went to new blocks "in between headers" but definitions could straddle a jump.
2023-05-30 16:54:52 KipIngram: and just hope the jump to the next block isn't inside the hottest part of the loop :P
2023-05-30 16:55:02 You know how big a header will be as soon as you have the name, so that's not a problem.
2023-05-30 16:55:07 And definitions can "step across."
2023-05-30 16:55:18 Yes, that's true.
2023-05-30 16:55:36 I don't know of anything to do about that except just watch out for it.
2023-05-30 16:55:37 KipIngram: OH! That's awesome, thank you!
2023-05-30 16:55:53 You could always force a new block anytime you wanted to, if you have some critical code that needs to not straddle.
2023-05-30 16:55:56 winduptoy: Kip Ingram has a good idea of course. I think they are both viable. Garbage collection would just be storing offsets instead of addresses in the thread and adding those offsets to a base address that changes when you garbage collect and free up memory
2023-05-30 16:56:24 My system works by storing offsets anyway.
2023-05-30 16:56:39 It only adds one instruction to next and one to docol.
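The block-chaining scheme described above (headers must fit in one block, definition bodies may step across via a compiled jump) can be sketched as follows. This is a toy Python model, not the 4 kB implementation mentioned in the log. The cell layout, the `BlockDictionary` class, and the tiny `BLOCK_CELLS` value are assumptions chosen so that the example visibly crosses block boundaries.

```python
BLOCK_CELLS = 8  # deliberately tiny so the example crosses block boundaries


class BlockDictionary:
    """Toy dictionary built from fixed-size blocks (modelled as lists)."""

    def __init__(self):
        self.blocks = [[]]

    def room(self):
        """Free cells left in the block currently being filled."""
        return BLOCK_CELLS - len(self.blocks[-1])

    def new_block(self):
        """Take another block from the (here unlimited) pool; return its index."""
        self.blocks.append([])
        return len(self.blocks) - 1

    def compile_cell(self, value):
        """Append one cell of a definition body; bodies may straddle blocks."""
        if self.room() == 1:
            # Only the reserved last slot is left: compile a jump and move on.
            current = self.blocks[-1]
            current.append(("jump", self.new_block()))
        self.blocks[-1].append(value)

    def compile_header(self, name):
        """Headers must sit wholly in one block; the size follows from the name."""
        size = 1 + len(name)            # one link cell plus one cell per character
        if size > BLOCK_CELLS - 1:
            raise ValueError("header can never fit in a block")
        if self.room() - 1 < size:      # keep the last slot free for a jump
            self.new_block()            # leftover cells in the old block go unused
        self.blocks[-1].append(("link", name))
        self.blocks[-1].extend(name)


d = BlockDictionary()
d.compile_header("dup2")
for word in ["over", "over", "exit"]:
    d.compile_cell(word)        # "exit" lands in block 1, after a jump
d.compile_header("square")      # does not fit in block 1, so it starts block 2
for index, block in enumerate(d.blocks):
    print(index, block)
```

Reserving the last cell of every block for the jump is what makes "almost full" detectable before it is too late: the compiler never needs to know the length of a definition in advance.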
2023-05-30 16:57:30 winduptoy: how much ram do you have?
2023-05-30 16:57:41 264 KB
2023-05-30 16:58:23 Ah ok. So no big deal to waste a few kb at the end of each dictionary
2023-05-30 16:58:44 Assuming 4k blocks
2023-05-30 16:58:55 MrMobius: i'm afraid I don't follow your GC description; you're storing offsets to what?
2023-05-30 16:58:56 winduptoy: is this pi pico by chance?
2023-05-30 16:59:02 yeah rp2040
2023-05-30 16:59:25 winduptoy: cool! Check out zeptoforth if you haven't
2023-05-30 16:59:51 winduptoy: so the address of the first word would be 0 since it's 0 bytes past the start of the dictionary
2023-05-30 17:00:24 the base address of the dictionary can be anywhere so you can change it when you do garbage collection to close any holes in your memory map
2023-05-30 17:01:08 So 100% of free ram is available all the time
2023-05-30 17:01:23 Not that you have to be too worried with 264k though
2023-05-30 17:01:33 I even tested that system on 256 byte blocks (intended to be "smaller than anything I'd use in practice").
2023-05-30 17:02:15 I had a little M4 Cortex dongle with 256k of RAM, and wanted to try for a multi-session console interface that operated somewhat like screen.
2023-05-30 17:02:23 I never got that done, though.
2023-05-30 17:02:31 MrMobius: so when you close holes, don't you have to re-write the offsets?
2023-05-30 17:03:02 winduptoy: no, you only have to rewrite the base address and copy the existing memory over the hole
2023-05-30 17:03:13 oh copy, gotcha
2023-05-30 17:03:17 everything is still stored the same distance from the base address
2023-05-30 17:03:46 thank you both, this has been very helpful
2023-05-30 17:04:33 MrMobius: i've been studying all the rp2040-compatible forths. does zeptoforth implement one of these methods? I couldn't really figure it out from the docs.
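The offset-plus-base compaction described above can be illustrated with a small model. This is a sketch in Python under simplifying assumptions: RAM is a `bytearray`, each dictionary is a contiguous region, and every cell is one byte holding an offset from its region's base. `Ram`, `Region`, and `free_and_compact` are hypothetical names.

```python
class Region:
    """One dictionary: a contiguous span of RAM addressed by base + offset."""

    def __init__(self, name, base, size):
        self.name = name
        self.base = base
        self.size = size


class Ram:
    """Toy RAM holding several regions packed end to end."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.regions = []       # kept in ascending base order
        self.top = 0            # first free byte

    def allocate(self, name, data):
        """Place data at the top of used memory and return its region."""
        region = Region(name, self.top, len(data))
        self.mem[self.top:self.top + len(data)] = data
        self.top += len(data)
        self.regions.append(region)
        return region

    def fetch(self, region, offset):
        """Read a byte the way an offset-threaded inner loop would."""
        return self.mem[region.base + offset]

    def free_and_compact(self, dead):
        """Drop one region and slide the survivors down over the hole."""
        self.regions.remove(dead)
        new_top = 0
        for region in self.regions:
            if region.base != new_top:
                # Slicing copies first, so overlapping moves are safe here.
                chunk = self.mem[region.base:region.base + region.size]
                self.mem[new_top:new_top + region.size] = chunk
                region.base = new_top       # the only value that gets rewritten
            new_top += region.size
        self.top = new_top


ram = Ram(64)
a = ram.allocate("A", bytes([1, 7, 7]))
b = ram.allocate("B", bytes([2, 0, 99]))     # cell 0 holds offset 2, which holds 99
before = ram.fetch(b, ram.fetch(b, 0))
ram.free_and_compact(a)                      # B moves from base 3 to base 0
after = ram.fetch(b, ram.fetch(b, 0))
print(before, after, b.base)
```

None of the bytes inside region B change during compaction. Only `b.base` is rewritten, which is the point made in the exchange about not having to re-write the offsets.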
2023-05-30 17:05:16 The creator tabemann is here if you want to ask
2023-05-30 17:05:30 As I understand it, it writes new words to flash
2023-05-30 17:46:03 winduptoy: A simple fixed block size heap is very simple to implement in Forth. It's fairly efficient even written in Forth; much less if you added some primitives to support it.
2023-05-30 17:46:12 Or, "even better," I mean.
2023-05-30 17:48:04 KipIngram: now that you opened my mind to the trick of making the jump to the next allocated block within the definition of a word, that's the way i'm going to go. thanks for taking the time to explain it!
2023-05-30 17:48:39 Oh sure, it was fun making that work. Nice satisfied feeling when it did so the first time. :-)
2023-05-30 18:05:57 So here's a screenshot that shows the variable table layout (center) and startup code (left and right) that is at least close to how I can get a system launched from an image in block space.
2023-05-30 18:06:00 https://imgur.com/a/NgY1otf
2023-05-30 18:06:13 That table is at the beginning of the image in the block.
2023-05-30 18:07:41 It copies it to the system stack (not Forth stack), and then scans through the first few entries, allocating memory buffers, copying data to them, and also patching up all of the necessary addresses, in the data copied and in the stack frame. Then copies the stack frame back into the newly allocated body stuff. Finally initializes the Forth registers (illustrated at the bottom) and starts the vm.
2023-05-30 18:08:00 I forgot to de-allocate the table from the stack, so there's a sub rsp, line missing.
2023-05-30 18:08:24 Or, rather, an add rsp,
2023-05-30 18:10:13 The trick was to think of everything in a typical Forth that needs to be an ADDRESS and get it into that early table of variables. The idea is that nothing else would need to be touched.
2023-05-30 18:10:47 There's a bit of subtle stuff over there in memcpy that handles offset->address on the cfa and pfa tables.
2023-05-30 18:13:36 And I still need to write a little subroutine that does the actual malloc syscall.
2023-05-30 18:15:25 Absolutely likely to have a glitch or two; I slapped it out fairly fast.
2023-05-30 18:20:02 KipIngram: wow, really going above and beyond here. thanks for all the time you've put into this.
2023-05-30 18:27:45 Forth just seems to be a hobby that won't go away. :-)
2023-05-30 20:15:32 forth is quintessential
2023-05-30 20:15:40 timeless, even.
2023-05-30 20:21:58 I agree. It has an "essence" to it that's very beautiful.
2023-05-30 21:54:21 So, is there a good guide online anywhere for how the instruction encoding on x64 actually works? I mean the "core" of it - surely there's some SUBSET of it that has some kind of elegant encoding.
2023-05-30 21:54:29 Before they threw all the weed seeds into it.
2023-05-30 21:55:06 I mean, there are 16 registers. There ought to be four-bit patterns corresponding to each one that show up in the instruction SOMEWHERE.
2023-05-30 21:55:56 the elegant encoding was probably in some design doc that went out the window sometime in the 70s
2023-05-30 22:07:09 Well, this is slightly enlightening:
2023-05-30 22:07:11 https://imgur.com/a/DvWYkR5
2023-05-30 22:08:06 Encoding r8-r15 vs. rax-rdi flips a bit back in the first byte.
2023-05-30 22:09:26 So the low three bits of the source register are in the middle of the last byte; fourth is in the first byte.
2023-05-30 22:09:44 I'd LIKE to think that pattern will show up consistently.
2023-05-30 22:11:56 The low three bits of the last byte appear to encode the destination register.
2023-05-30 22:13:45 For example, I'd like to think that add, sub, xor, and, or for register/register operands would have some fairly "synthesizable" structure.
2023-05-30 22:14:04 : rr-op ...code... ;
2023-05-30 22:14:22 x:01 rr-op rradd
2023-05-30 22:14:57 x:29 rr-op rrsub
2023-05-30 22:14:59 etc.
2023-05-30 22:51:43 I think it looks like this:
2023-05-30 22:51:45 https://imgur.com/a/LFDSwZR
2023-05-30 22:52:05 That covers add, or, adc, sbb, and, sub, xor, and cmp.
2023-05-30 22:52:20 With the right bits or-ed into it, of course.
2023-05-30 23:01:48 That's enough to do something like I coded ^^ right there.
2023-05-30 23:19:16 I think it's this:
2023-05-30 23:19:18 https://imgur.com/a/UyMHv6S
2023-05-30 23:19:30 That would be 2048 instructions.
2023-05-30 23:21:17 So we could start out, little endian, with 0xC00048. Then or in the register and op bits in the right spots.
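The "start with 0xC00048 and or in the bits" approach can be written out as a short sketch. The linked screenshots are not reproduced in the log, so the bit positions below are taken from the standard x86-64 encoding of the 64-bit `op r/m64, r64` forms (REX prefix, opcode, ModRM) rather than from those images. The function name `encode_rr` and the two tables are hypothetical.

```python
# The eight ALU operations, in opcode order; the reg/reg opcode is (index << 3) | 1,
# which gives 0x01 for add and 0x29 for sub, as in the log.
OPS = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]

# Register numbers 0..15; bit 3 separates rax..rdi from r8..r15.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def encode_rr(op, dst, src):
    """Encode a 64-bit register/register instruction `op dst, src` as 3 bytes."""
    n = OPS.index(op)
    d = REGS.index(dst)
    s = REGS.index(src)
    word = 0xC00048                             # bytes 48 00 C0: REX.W, opcode, ModRM mod=11
    word |= (s >> 3) << 2                       # REX.R: fourth bit of the source register
    word |= d >> 3                              # REX.B: fourth bit of the destination
    word |= ((n << 3) | 0x01) << 8              # opcode byte
    word |= (((s & 7) << 3) | (d & 7)) << 16    # ModRM: reg = source, rm = destination
    return word.to_bytes(3, "little")


print(encode_rr("add", "rax", "rbx").hex())     # 4801d8
print(encode_rr("sub", "r8", "r15").hex())      # 4d29f8

# 8 operations x 16 destinations x 16 sources
all_forms = {encode_rr(o, d, s) for o in OPS for d in REGS for s in REGS}
print(len(all_forms))                           # 2048
```

The count of distinct encodings agrees with the 2048 figure above: three bits of operation, four bits of source, and four bits of destination, each landing in its own field.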