2023-08-25 03:38:13 Yeah with a tokenised forth I tend to have CALL8 / CALL16 / CALL32 tokens for different sized immediate relative calls 2023-08-25 04:32:46 I thought about having additional call operations that used some fixed number of bits within the cell, immediately following the instruction bits. The idea was to be able to make a short call and then still come back and execute the rest of that cell. 2023-08-25 04:33:22 I decided not to, though, because it would have required me to put not only the return address on the return stack on calls, but also the partially executed instruction word. 2023-08-25 04:33:38 I decided I didn't want that overhead associated with every call I made. 2023-08-25 04:34:06 So I just take "the remainder of the cell" as the relative offset from IP to call to. 2023-08-25 04:34:35 And since I shift the cell contents down as I execute, that "becomes" a 32-bit value in a register by the time I do the call. 2023-08-25 04:35:25 In the interests of supporting the largest possible address space on a desktop system with lots of RAM, I may add a call instruction that uses the entire following 32-bit cell as the call operand. 2023-08-25 04:36:07 Without that, then 26 bits would be my largest call operand, and if that's an unsigned number I'd be able to reach 32 MB in either direction. 2023-08-25 04:36:32 Which of course is plenty for Forth itself, but who knows where everything might be in a large RAM with a memory allocator. 2023-08-25 04:37:47 Since I'm putting each vocabulary in its own RAM region, calls to other vocabularies will have to be across region boundaries. 2023-08-25 04:53:44 That makes sense with your packed format 2023-08-25 04:53:53 I don't really know what the motivation for a packed format is, honestly 2023-08-25 04:54:46 It makes sense to me if you have word addressing I suppose?
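The shift-down execution described above can be sketched in Python. This is a toy model of the idea only, not anyone's actual implementation; the five-slot, 6-bit layout and all names are assumptions drawn from the conversation:

```python
# Toy model of "shift the cell contents down as I execute": a 32-bit
# cell holds five 6-bit opcode slots (low slot first) plus 2 residual
# bits. After each opcode is pulled off, the remainder of the word is
# ready-made as an operand - e.g. a relative call offset.

SLOT_MASK = 0x3F   # low 6 bits of the instruction word
SLOT_SHIFT = 6
SLOTS_PER_CELL = 5

def decode_cell(cell):
    """Yield (opcode, remainder) pairs for one 32-bit cell, shifting
    the word down after each opcode so 'the rest of the cell' is
    always sitting in the register, usable as a call operand."""
    word = cell
    for _ in range(SLOTS_PER_CELL):
        op = word & SLOT_MASK
        word >>= SLOT_SHIFT     # shift down: rest of cell moves into place
        yield op, word
```

As a usage sketch, a call opcode in slot two would find the remaining 24 bits (three slots plus the 2 residual bits) already shifted into position as its relative offset.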
2023-08-25 05:00:26 Well, the main benefit I see is just code compactness, and you can get more 6-bit instructions into a cell than you can 8-bit instructions. 2023-08-25 05:00:53 But if you're thinking desktop, yeah - "saving RAM" isn't of particularly huge value. 2023-08-25 05:01:58 It may wind up giving a small speed advantage too, since one fetch gets me a whole cell full of instructions. 2023-08-25 05:02:32 At some point I'll probably write a vm back end for this on my notebook, and then I'll be able to measure and see. 2023-08-25 05:02:48 I mean a real one, as opposed to this Python thing I'm working on now. 2023-08-25 05:07:52 Saving RAM is a good motivation for me 2023-08-25 05:08:24 I'm not entirely convinced you save space that way, but I'm not entirely unconvinced either 2023-08-25 05:08:32 I'd have to play with it a bit or do some tests 2023-08-25 05:11:40 Yes, it remains to be seen how it all actually falls out - there are things that can force me to a new cell before I've used very many slots in the previous cell (any call, any literal, etc.) So all of the *potentially* available space won't get used in the end. 2023-08-25 05:13:08 But I'm pretty sure it's going to turn out to be a "small" system, for some definition of small. The switch back to traditional headers, and the code-threaded aspects of the structure, will mean that words with 1-char names will only need 4 bytes for their header, and almost all of my remaining words will only need 8. 2023-08-25 05:19:34 The issue of tracking how much of the packed instruction is 'done' and where to put immediates is just a bit of a headache for me 2023-08-25 05:19:56 I definitely prefer a 'dumb' token system 2023-08-25 05:19:59 I really like how this shift down methodology just removes that problem. 2023-08-25 05:20:16 Ah but now you've got a register dedicated to holding the opcode 2023-08-25 05:20:18 It does require a smarter compiler, though. 
2023-08-25 05:20:30 And yeah compiler's a bit more complicated 2023-08-25 05:20:31 Yes, there is an instruction word register. 2023-08-25 05:20:44 I think it may be considerably more complicated, but we'll see. 2023-08-25 05:21:18 I assume the only 'optimisation' will be that you actually attempt to pack the cells 2023-08-25 05:21:27 But not reorder or anything 2023-08-25 05:21:58 Yes. Try to do what needs to be done with the bits that are left - if it's not possible, just zero the rest of that cell and move to the next one. 2023-08-25 05:22:07 Yup 2023-08-25 05:22:08 It may not be that bad. I've basically written it in Python now. 2023-08-25 05:22:35 It uses python's ** (power) operator, which isn't exactly machine code friendly. 2023-08-25 05:22:45 But it's certainly doable. 2023-08-25 05:23:38 Regarding sign-extension, maybe just use unsigned immediates but have an implicit negate for calls, jumps, and loops; and no negate for branches? 2023-08-25 05:24:02 For compiling operands or opcodes I'll have some number of remaining bits in the cell. Say N. The signed operand range that can hold is -(2**(N-1)) to 2**(N-1)-1. 2023-08-25 05:24:04 Maybe just maybe support a negated literal operand so -1 et al are shorter 2023-08-25 05:24:43 I could. I'm going to try signed operands first. The only cost of doing that is that I require two "next cell" codes instead of one. 2023-08-25 05:25:17 I'm suggesting because it seemed like that was an issue, you only get two actual ops possible for the last opcode position 2023-08-25 05:25:37 Just an idea, anyway 2023-08-25 05:25:54 Jumps will pretty much always be backward, but calls won't necessarily be, depending on how vocabularies wind up in RAM, and tail optimization turns calls into jumps sometimes. 2023-08-25 05:26:35 Remember I don't have IF (which does a forward jump). I don't think I ever have any forward jumps. And in the system I'm running now I have no forward calls.
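The signed-operand range quoted above can be checked mechanically. A small illustrative sketch (the helper names are mine, not from the system being discussed):

```python
def signed_operand_range(n_bits):
    """Range of a signed operand in n_bits remaining bits:
    -(2**(N-1)) to 2**(N-1)-1, as stated in the conversation."""
    return -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1

def fits(value, n_bits):
    """Would the compiler be able to pack this operand into the
    bits left in the current cell?"""
    lo, hi = signed_operand_range(n_bits)
    return lo <= value <= hi
```

If `fits` fails, the compiler's fallback is exactly what the conversation describes: zero the rest of the cell and move to the next one.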
2023-08-25 05:26:55 But I think I will, potentially, in this new system with RAM segregated vocabularies. 2023-08-25 05:28:23 Also, these will be cell counts, not byte counts, since all jump and call targets will be cell aligned. 2023-08-25 05:29:22 So, say I use four of the five slots for instructions, and the last one is a call. I have left the fifth 6-bit slot and the 2 residual bits, so eight bits. 256 cells -> 1024 bytes. So that would go 512 bytes either way. 2023-08-25 05:29:43 That will probably cover an awful lot of my "local helper word" situations. 2023-08-25 05:30:11 And if slot 3 is the last instruction we're up to 8kB either way. 2023-08-25 05:30:22 I'm sorry - 32kB. 2023-08-25 05:31:19 And since the code is compact, that's a lot of code "in range." 2023-08-25 05:32:19 In fact, the aspect that's going to be a little sad is that if I have a string of calls I don't get to pack them. Each call takes a whole cell. 2023-08-25 05:32:35 But that's no worse than what they'd take in a standard 32-bit Forth. 2023-08-25 05:33:14 That was why I considered the possibility of fixed width call operands, with continued execution of the cell after. 2023-08-25 05:35:20 KipIngram: I'm not sure if the last thing I posted is faster or slower but it has exactly one jump and no branches which was the idea 2023-08-25 05:35:30 I think I'll include an "inline" bit in the header that tells the compiler whether to call the word or copy its code inline. 2023-08-25 05:36:26 Ok. I don't think I saw it with complete clarity, but I think I at least have a sense of what you're going for. You'd avoid having to shift the instruction word, right? As you executed through it, you alter NEXT so that it grabbed the next slot? 2023-08-25 05:36:45 not really 2023-08-25 05:36:54 Ok. Sorry. :-( 2023-08-25 05:37:35 rather than having a 6th instruction dispatch to load the next word, you could have a counter to know what word you're on 2023-08-25 05:37:47 Ah - I see.
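The call-reach arithmetic above (8 bits left gives 256 cells, i.e. 1024 bytes, or 512 bytes either way; 14 bits left gives 32 kB either way) can be sketched as a quick check. Illustrative only; a 4-byte cell is assumed:

```python
def call_reach(bits_left, cell_bytes=4):
    """Reach in bytes, either direction, of a relative cell-aligned
    call operand packed into the bits left in the instruction cell.
    Offsets count cells, so multiply by the cell size; treated as a
    signed quantity, half the span is available each way."""
    total_cells = 2 ** bits_left
    total_bytes = total_cells * cell_bytes
    return total_bytes // 2

# 8 bits left (last slot + 2 residual bits):  call_reach(8)  -> 512 bytes
# 14 bits left (two slots + 2 residual bits): call_reach(14) -> 32768 bytes
```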
2023-08-25 05:38:13 Yet another register, though 2023-08-25 05:38:16 but instead of using a branch to check the counter, you incorporate the counter in the address calculation for the jump 2023-08-25 05:38:28 Ok, sure - that would be pretty easy to do. It would add one instruction per NEXT - so 5 instructions. 2023-08-25 05:38:34 Per cell, I mean. 2023-08-25 05:38:57 so 5 dispatches instead of 6. not sure if the extra calculation balances it out though 2023-08-25 05:39:19 I think they're about the same, then, because that sixth implicit instruction for me would be (x64) lodsd ; . 2023-08-25 05:39:23 Interesting idea 2023-08-25 05:39:31 So my bet is they'd turn out quite similar on timing. 2023-08-25 05:39:36 I think you'd have to profile it, I've got no clue 2023-08-25 05:39:51 As in "probably hard to measure the difference." 2023-08-25 05:39:57 Right. 2023-08-25 05:40:05 ya very hard to go with your gut on x86 for this kind of stuff 2023-08-25 05:40:12 Losing the extra register (?) is a design cost that will impact performance somewhere else 2023-08-25 05:40:14 me either, and it might also be one of those situations where just counting instructions fools you. 2023-08-25 05:40:29 Yeah, could be. 2023-08-25 05:41:20 I'd always just presumed that "token threading" == "slow," at least to some extent. But I'm questioning that now. 2023-08-25 05:41:38 maybe. plenty of forths leave several registers unused. depends on how often you actually use them 2023-08-25 05:41:51 Just the way this feels to me, my sense is that if I lose any performance at all it won't be much. 2023-08-25 05:42:26 I've tended toward using almost all the registers, and that actually pleases me to some extent. I feel like I'm not leaving resources on the table. 2023-08-25 05:42:43 Slows down thread switches, though.
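One way to read the counter-in-the-address-calculation idea: double the handler table so that an overflowed slot counter selects a "refill" handler through the same indexed jump, with no separate branch. This is a hypothetical sketch of that dispatch shape only; the table layout and names are my assumptions:

```python
SLOTS_PER_CELL = 5
OPCODES = 64  # 6-bit opcodes

def make_handlers(op_handlers, refill):
    """Build a table of 2*OPCODES entries: the top half all point at
    the refill handler. Folding (counter exhausted?) into the table
    index removes the explicit 'need a new cell?' branch from NEXT."""
    assert len(op_handlers) == OPCODES
    return list(op_handlers) + [refill] * OPCODES

def dispatch(handlers, opcode, counter):
    """Branchless-style dispatch: the counter overflow bit is folded
    into the jump index rather than tested separately."""
    idx = opcode + ((counter >= SLOTS_PER_CELL) << 6)
    return handlers[idx]
```

Whether this beats the shift-down scheme on a real x86 is, as the conversation says, something only profiling can settle.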
2023-08-25 05:42:48 It's stated but I don't know how tested that direct threading is as fast as or faster than subroutine threading on some x86 machines 2023-08-25 05:42:53 If it's a dumb unoptimised STC 2023-08-25 05:43:01 I absolutely believe that. 2023-08-25 05:43:07 Not always, but sometimes. 2023-08-25 05:43:14 And if that's so I wouldn't be surprised if direct token threading is about as fast 2023-08-25 05:44:07 Anyway, it suddenly makes this very attractive. Let's say it's at least similar in performance, and at least similar in compactness (I expect it to be a considerable compactness improvement). 2023-08-25 05:44:18 (direct token threading === the layout of dictionary header is as-if you have direct threading, but colon defs call token threading) 2023-08-25 05:44:23 So that means there are no serious costs, and the portability benefit is huge. 2023-08-25 05:44:55 Most of your system can be written in binary portable form (barring endianness / cell size changes). 2023-08-25 05:45:40 Yeah, I took your meaning there. 2023-08-25 05:46:00 I find this one I'm talking about interesting in that it seems like a bit of a mix of code threading with direct threading. 2023-08-25 05:46:01 I personally think token threading is "the way to go" for 64-bit x86 2023-08-25 05:46:19 Because 64-bit rules out direct threading, it's just too fat 2023-08-25 05:46:28 Variables and constants, for example, will look like direct threaded words - just using the vm instruction set instead of the native instruction set. 2023-08-25 05:46:54 And one cell there at the CFA is enough to implement variable, constant, and vocabulary, and, I think, does> words. 2023-08-25 05:46:58 I don't care what anyone says about how we have infinite RAM, we don't have infinite cache, 64-bit direct threading just sounds piss poor 2023-08-25 05:47:09 I agree. 2023-08-25 05:47:25 And I agree as well. 2023-08-25 05:47:50 Especially on microcontrollers, the cache might only be a few kB.
2023-08-25 05:47:54 Saving memory is worthy when the actual business end of the CPU acts like a super microcontroller which has to send off in the mail for RAM that didn't fit in cache 2023-08-25 05:48:01 Compactness will pay big dividends there. 2023-08-25 05:48:49 Microcontrollers I don't really care because the RAM is only a little slower than the CPU 2023-08-25 05:49:13 I shouldn't be predicting the size of this thing yet, because I don't know how much space will get "wasted" by back to back calls, operands not fitting in the remaining bits, and so on, but I'm not going to be shocked if my "complete basic system" comes in around 5kB-6kB. 2023-08-25 05:49:19 But your desktop CPU has a Commodore 64 inside that runs at lightspeed, that's where you should do all of your work 2023-08-25 05:51:11 People usually guess around 8kB for a "complete" 16-bit system. Well, six-bit slots is a LOT smaller than 16-bit cells, so I think it has a shot at being smaller than that. 2023-08-25 05:52:01 Say the mix of 4 byte and 8 byte headers is such that we average 6 bytes. 250 words - that's just 1500 bytes for all of the headers. 2023-08-25 05:52:27 250 is just a guess of course. 2023-08-25 05:54:52 I'd probably guess 7 or 8 bytes for average 2023-08-25 05:54:55 To be safe 2023-08-25 05:55:09 Although I'm making a lot of assumptions about your naming 2023-08-25 05:55:45 I've studied my naming - my average name length is five bytes. In my current system. 2023-08-25 05:55:59 I spent a little while one afternoon digging into that. I tend toward terse names. 2023-08-25 05:56:30 But, you may be right estimating 7. 2023-08-25 05:57:11 Anyway, it beats the pants off of my current headers, which have the same link and name information and then add 4 bytes of CFA and 4 bytes of PFA. 2023-08-25 05:57:59 And a couple of months ago I was giving serious consideration to a scheme that used 8 bytes each for CFA and PFA. In a 64-bit system, obviously.
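The header-budget arithmetic above is simple enough to sanity-check. The numbers here are the guesses from the chat, nothing more:

```python
def header_bytes(name):
    """Per the scheme above: a 1-char name fits a 4-byte header;
    (almost) every other word takes 8."""
    return 4 if len(name) == 1 else 8

def header_budget(word_count, avg_bytes):
    """Total header cost: e.g. 250 words at ~6 bytes -> 1500 bytes."""
    return word_count * avg_bytes
```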
2023-08-25 05:58:39 And yes, that thinking made me unhappy as soon as I started thinking about any platform other than my notebook. 2023-08-25 05:58:53 To optimise cache usage most of the 'header' should be in a totally different page anyway 2023-08-25 05:59:13 The non-executing parts of the header should be somewhere else ... if you really want to optimise cache 2023-08-25 05:59:22 Yeah, I know, but this time I'm not going to do that. Doing that requires extra pointers. 2023-08-25 05:59:32 It's up to you 2023-08-25 05:59:45 But, my vm machine code will be in an isolated part of RAM, separate from where the headers are. 2023-08-25 05:59:55 I totally understand, optimising everything isn't fun. Because you'd probably just have to stop writing Forth 2023-08-25 06:00:02 The "code" that's mingled with the headers will be vm instructions, so "data" as far as the hardware is concerned. 2023-08-25 06:00:52 I don't know yet how big the vm emulator will be. Maybe 1kB-2kB? 2023-08-25 06:01:11 I'm including all the code that the jump table points into there. 2023-08-25 06:01:38 Well that goes in a different cache anyway 2023-08-25 06:02:28 Depends pretty heavily on how big my table entries are. 2023-08-25 06:02:55 Right - my icache stuff will be off in its own place. 2023-08-25 06:09:58 I've been thinking I'd put NEXT inline at the end of each instruction implementation, but that's another thing where jumping to one NEXT instead might not be as obviously slower as one might first think. 2023-08-25 06:10:10 Cache effects might act there too. 2023-08-25 06:11:30 though if an instruction's "business code" is in cache, then an inline NEXT after it likely is as well. 2023-08-25 06:59:24 hi there 2023-08-25 06:59:31 anyone know about inet?
2023-08-25 06:59:46 https://en.wikipedia.org/wiki/Interaction_nets 2023-08-25 07:01:41 I made an inet implementation that uses forth-like syntax to build graphs :) 2023-08-25 07:01:59 swap and rot and so on 2023-08-25 07:02:02 https://github.com/cicada-lang/inet 2023-08-25 08:43:28 When I open the playground in chrome on Windows I just get scrollbars appearing and disappearing rapidly on the right pane 2023-08-25 08:43:43 Might be reproducible on a smaller window 2023-08-25 08:57:25 Well that's just great.... 2023-08-25 08:57:48 Trying to do block work and I realise now that 0.7.3 gforth's block implementation seems to be broken 2023-08-25 08:58:02 I suppose I need to write a block library first! 2023-08-25 09:00:35 Oh no it's on my end! I'll shut up... 2023-08-25 09:11:43 I like how EMPTY-BUFFERS is basically undo 2023-08-25 09:21:54 veltas: I met a similar issue during testing but forget how to reproduce it. 2023-08-25 09:21:59 I will do more tests. 2023-08-25 09:45:31 As I said, you might try a small window 2023-08-25 09:45:42 Or different window sizes until it happens 2023-08-25 09:45:58 It is handy the way Forth postpones disk updates. 2023-08-25 09:46:00 I don't know anything about inet although I know there's a few people in here who probably do 2023-08-25 09:46:11 Maybe siraben (who's absent) 2023-08-25 09:46:36 KipIngram: Well literally everything does, but it's handy how Forth is very explicit with it ... and a footgun occasionally 2023-08-25 09:46:47 But it works enough of the time to be a nice thoughtful feature 2023-08-25 09:47:10 My issue was the oldest in the book, forgetting to use UPDATE 2023-08-25 09:47:24 Heh. Done that. 2023-08-25 09:47:57 It's hard to know what the right place to put UPDATE is in a stack of words manipulating blocks 2023-08-25 09:48:01 Such as the editor 2023-08-25 09:48:04 It's a bit of an odd feature for Forth. 2023-08-25 09:48:11 I need to get on with serialisation so I can share what I'm working on.....
2023-08-25 09:48:31 I've got one direction, just need the other 2023-08-25 09:48:36 It's not "basic/primitive." It has a fair bit of fancy functionality associated with it. 2023-08-25 09:48:49 In terms of managing disk buffers. 2023-08-25 09:48:53 Well... it's blocks we're talking about 2023-08-25 09:48:57 It's still quite primitive 2023-08-25 09:49:11 It's quite simple to implement versus a filesystem + file I/O driver 2023-08-25 09:49:36 I'm happy with the level of abstraction, it's appropriate. It's meant to be more 'convenient' and 'interactive' and high-level 2023-08-25 09:49:52 Forth can do high and low level, and the block words believe it or not are more high-level 2023-08-25 09:50:23 Oh, sure. Very simple compared to a file system. 2023-08-25 09:50:40 High level yet small and manageable, easy enough for anyone to understand 2023-08-25 09:50:50 You couldn't say the same about a filesystem 2023-08-25 09:50:57 But, Forth could have just given us block-read and block-write, both with ( addr blknum --) and left it at that. 2023-08-25 09:51:15 Yes 2023-08-25 09:51:22 And left all the memory management to us. 2023-08-25 09:51:35 Which is why I believe it's "high level" and I think it's appropriate because it's meant for interactive / userfriendly use 2023-08-25 09:51:50 I'm not criticizing it - I like what they've done. 2023-08-25 09:51:53 Yeah 2023-08-25 09:52:07 Forth is the whole stack, this is further up the stack 2023-08-25 09:52:40 A bloody text editor is high up the stack, you don't need it. You can just use BLOCK/UPDATE/FLUSH/EMPTY-BUFFERS to edit blocks 2023-08-25 09:53:11 And I did use those to edit the blocks before my text editor was actually building properly 2023-08-25 09:53:32 Wasn't even that bad 2023-08-25 09:53:49 Yes, definitely. 2023-08-25 09:54:53 You can just EXPECT text into block buffers.
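The BLOCK/UPDATE/FLUSH/EMPTY-BUFFERS semantics discussed above can be modelled as a tiny dirty-buffer cache. This is an illustrative toy, not gforth's or any particular Forth's implementation:

```python
class BlockCache:
    """Toy model of Forth's block words: BLOCK fetches a 1 KiB buffer,
    UPDATE marks it dirty, FLUSH writes dirty buffers back to disk,
    and EMPTY-BUFFERS discards them unwritten - the 'basically undo'
    behaviour mentioned in the chat."""

    BLOCK_SIZE = 1024

    def __init__(self, disk):
        self.disk = disk       # stand-in for storage: block number -> bytes
        self.buffers = {}      # block number -> bytearray
        self.dirty = set()

    def block(self, n):
        """BLOCK: return the buffer for block n, reading it in if needed."""
        if n not in self.buffers:
            data = self.disk.get(n, bytes(self.BLOCK_SIZE))
            self.buffers[n] = bytearray(data)
        return self.buffers[n]

    def update(self, n):
        """UPDATE: mark block n as modified (forgetting this is
        'the oldest mistake in the book')."""
        self.dirty.add(n)

    def flush(self):
        """FLUSH: write all UPDATEd buffers back to disk."""
        for n in self.dirty:
            self.disk[n] = bytes(self.buffers[n])
        self.dirty.clear()

    def empty_buffers(self):
        """EMPTY-BUFFERS: discard all buffers, written or not."""
        self.buffers.clear()
        self.dirty.clear()
```

Note how an edit without UPDATE is silently lost on FLUSH, which is exactly the footgun described above.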
2023-08-25 09:54:57 crc's editor is quite minimal too and I imagine quite functional 2023-08-25 09:55:12 Mine's very minimal as well. 2023-08-25 09:55:39 It basically does EXPECT into RAM, with some sugar wrapped around it for convenience. 2023-08-25 09:56:01 Except I have an entry into the machinery EXPECT uses that lets me edit a range of RAM with already existing content in it. 2023-08-25 09:56:10 I like my editor; been using it (and variants of it) for about 20 years. 2023-08-25 09:56:30 I arranged it like that precisely for supporting editor functionality. 2023-08-25 09:56:37 I was doing e.g. CHAR ^ PARSE Here is some new text^ 400 BLOCK UPDATE 3 64 * + SWAP MOVE 2023-08-25 09:56:45 And then I added a couple of words that made that a bit more automated 2023-08-25 09:57:00 I think that's close to some of crc's editor functionality 2023-08-25 09:57:56 Like I know you've got 0-15 to overwrite a specific line of the block 2023-08-25 09:57:57 That bit of machinery basically lets me edit everything from a specified starting point in a block to the end of the block, but with visibility into and cursor motion within only the first line of that range. 2023-08-25 09:58:13 So if I add a character, all the text in the block all the way to the end moves up a byte. 2023-08-25 09:58:26 Yeah existing content with EXPECT is something I will add at some point 2023-08-25 09:58:38 It just drags all the invisible stuff beyond the line I'm editing along for the ride. 2023-08-25 09:58:40 Like the EDIT command in the ZX Spectrum's BASIC editor 2023-08-25 09:59:07 Then if you've got a good EXPECT, you've got good line editing 2023-08-25 09:59:55 It would have been a terrible way of doing it a few decades ago - it definitely exploits the fact that systems are fast enough that moving several kB of text around like that is painless and snappy.
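The insert-shifts-everything behaviour described above ("if I add a character, all the text in the block all the way to the end moves up a byte") is simple to sketch. A toy model with an assumed fixed-size block buffer:

```python
def insert_char(block, pos, ch):
    """Insert byte ch at pos in a fixed-size block buffer, shifting
    everything from pos to the end of the block up one byte; the
    last byte falls off the end, since the block size is fixed."""
    block[pos + 1:] = block[pos:-1]   # drag the tail along for the ride
    block[pos] = ch
```

Moving a few kB per keystroke this way is, as the conversation notes, only painless because modern machines are fast.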
2023-08-25 10:00:41 And that was a conscious thought I had too - "I can do it this way because the computer is really fast." 2023-08-25 10:01:36 I'm still a little bugged by how voluminous all the code for EXPECT is, though. Can't beat the feeling that I've overcomplicated it somehow. 2023-08-25 10:03:11 Well I'm not thinking of doing a whole block, just a line 2023-08-25 10:03:12 part of that is understandable, though - it gets complicated by the non "1 char = 1 byte" nature of utf8. 2023-08-25 10:04:12 My plan is to get most of the readline shortcuts working for EXPECT and abuse that as much as possible 2023-08-25 10:04:18 And do tab-completion like gforth 2023-08-25 10:04:31 Oh, tab completion - nice. 2023-08-25 10:04:38 (readline shortcuts are the emacs-style shortcuts in bash etc) 2023-08-25 10:05:23 You mean like moving the cursor by words and so on? 2023-08-25 10:05:29 i.e. C-u to delete a line, C-a for home, C-b for left, C-y for 'yank' (paste) etc 2023-08-25 10:05:42 Yeah, I have several of those. 2023-08-25 10:05:48 C-r for 'reverse incremental search' (i.e. search history) 2023-08-25 10:06:06 And in my last system (not this one, at least not yet) I had a command history system tied in at the QUERY level. 2023-08-25 10:06:10 M-y to cycle through previous pastes after pasting ... that's a powerful one 2023-08-25 10:06:20 That is nice. 2023-08-25 10:06:35 It's definitely a place worth injecting convenience. 2023-08-25 10:06:41 Yeah it's weird that bash is more featureful sometimes than vim... that is one thing vim doesn't make nice 2023-08-25 10:06:57 Pretty much the only emacs feature I miss in vi or vim 2023-08-25 10:08:04 I don't know how much of that I'm actually willing to implement in forth though 2023-08-25 10:23:38 That all feels to me like one of those problems where getting the approach right is critical to having it feel "painless." I've run into that in several little areas while using Forth.
2023-08-25 10:23:48 One example of it is managing a doubly linked list. 2023-08-25 10:24:06 Most ways you approach it get all clunky feeling, but there is a path in there that just feels really clean. 2023-08-25 10:24:13 I usually forget it and have to re-discover it. 2023-08-25 12:00:30 So, I'm dropping Chuck's !p (store using instruction pointer without post-increment) from my instruction set. I don't see how I would ever use it in a software system. Chuck uses it as a way to write to ports he's executing from. Makes sense in hardware. 2023-08-25 12:00:48 @p with post increment is useful, though - it loads a 32-bit literal. 2023-08-25 19:13:32 I'm pretty close to being able to run a snip of Python that will create words in the dictionary for all of the vm instructions. 2023-08-25 19:14:20 Those will all be definitions that just run the instruction followed immediately by a return. Executable by the interpreter. 2023-08-25 19:14:34 No dictionary search involved in that part of things. 2023-08-25 19:18:52 Then I'll start at the other end, with the interpreter loop, calling stubs, and flesh out the stubs downward until I reach the instruction level everywhere. 2023-08-25 19:22:28 And that will search the dictionary.