2023-05-21 08:25:59 Well, I've sketched out 38 operations that I think are close to a "portable instruction set." I haven't exercised it on real examples yet, but I think I'm in the ballpark with it.
2023-05-21 08:27:20 I think if I write code generator words for each of those for a platform, then the rest can be portable.
2023-05-21 08:28:23 They fall into six categories: single register, reg/reg, reg/mem, mem/reg, "indexed" and jumps/calls.
2023-05-21 08:29:23 They take various parameters - register callouts, numeric values, etc.
2023-05-21 08:30:33 The words that generate "families" of primitives, like conditional control words, will just use these to do the code generation.
2023-05-21 08:54:06 Hmmm. Arm v7 has a number of issues that look like they'll affect me.
2023-05-21 08:54:47 Particularly in regard to registers. Looks to me like there just aren't enough registers there to do things the way I currently do them. It nominally has 16 registers, but they're managed in somewhat funky ways.
2023-05-21 08:55:16 And the program counter is included in that list, whereas it's not included in the 16 regs x86_64 has.
2023-05-21 08:56:10 Looks to me like if I want to port my architecture there some things will have to be demoted to memory, and that will certainly affect performance.
2023-05-21 08:56:29 I'll just have to decide "where to put the pain."
2023-05-21 08:58:09 First candidate is obviously the disk buffer address, which I wasn't even sure I was going to register on x86. It definitely won't be on arm.
2023-05-21 08:58:31 And the "exception pointer" would be the next one. I rarely use that feature at all.
2023-05-21 08:58:54 None of the other possibilities are fun.
2023-05-21 09:42:00 I'm bothered by alignment issues re: literals in this new design.
2023-05-21 09:42:31 I'm using 16-bit xt's, so that means that unless I take pains to align literals they may not be 64, or even 32, bit aligned.
2023-05-21 09:43:31 I guess I could just lump it and accept the alignment penalty. In the current system xt's are 32 bits, and when I embed a string into the code stream I re-32bit-align after it.
2023-05-21 09:44:27 And I actually implement numeric literals as 32 bits. Not really floating point friendly, but I'm not doing any floating point, and I've yet to actually NEED a real 64-bit literal.
2023-05-21 09:45:20 I could implement a literal table, but then that's memory to allocate and the question of how to keep up with where it is arises.
2023-05-21 09:45:43 Anyone seen any particularly clever methods around this?
2023-05-21 10:04:35 I guess I could just have the lit runtime align BEFORE picking up the literal instead of after. Then that would work even on systems that didn't natively handle misaligned addressing.
2023-05-21 10:04:53 Not for strings of course - for numeric lits.
2023-05-21 15:00:52 You know, I'd forgotten how far along I got the BLOCK stuff when I worked on it a year or more ago.
2023-05-21 15:01:45 It does everything, just right. I haven't written any words to set dirty bits or flush yet, but if I load block 1, make a small change to it, set its dirty bit, and then load block 256 (which uses the same buffer), it does indeed write the change out to disk.
2023-05-21 15:01:48 Nice.
2023-05-21 15:06:11 certainly a better feeling than "how did this code ever work??" on revisit
2023-05-21 15:06:26 Yeah, I still see pretty clearly what I was up to.
2023-05-21 15:07:00 block itself is a primitive, and if the indicated blk# is already resident that's it - the primitive just confirms that and returns as fast as possible.
2023-05-21 15:07:15 If it's not resident it transfers control to some other words I wrote in Forth.
2023-05-21 15:07:39 One entry point if the needed buffer is dirty; another if it's clean.
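Editorial aside: the "align BEFORE picking up the literal" idea for numeric lits from earlier in the log can be sketched roughly as below. This is only an illustration in Python, not the author's Forth code. The names `align_up`, `compile_literal` and `fetch_literal` are hypothetical, and the 2-byte xt and 4-byte literal sizes are assumptions taken from the 16-bit xt and 32-bit literal remarks.

```python
XT_SIZE = 2    # 16-bit xts, as described in the log
LIT_SIZE = 4   # assumed 32-bit numeric literal


def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def compile_literal(code, lit_xt, value):
    """Append a (hypothetical) lit xt, pad to alignment, then append the value."""
    code.extend(lit_xt.to_bytes(XT_SIZE, "little"))
    while len(code) % LIT_SIZE:
        code.append(0)              # padding the runtime will skip over
    code.extend(value.to_bytes(LIT_SIZE, "little"))


def fetch_literal(code, ip):
    """Model of a lit runtime: ip points just past the lit xt.

    Align first, then fetch, so the load itself is always aligned.
    Returns (value, ip after the literal).
    """
    ip = align_up(ip, LIT_SIZE)
    value = int.from_bytes(code[ip:ip + LIT_SIZE], "little")
    return value, ip + LIT_SIZE


# Usage: the lit xt sits at offset 0, so the literal would land at offset 2
# without padding; with align-before-fetch it is read from offset 4.
code = bytearray()
compile_literal(code, 1, 0xDEADBEEF)
value, next_ip = fetch_literal(code, XT_SIZE)
```

The cost is at most one wasted xt slot per literal when xts are 2 bytes and literals are 4 bytes; a 64-bit literal would waste up to three slots.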
2023-05-21 15:26:12 Even with it being a primitive it's not the fastest thing in the world - there's just a certain amount of work that has to be done there.
2023-05-21 15:26:58 I wanted the disk buffers to fit in a 1MB RAM block. I have 255 4kB buffers, and the last 4kB range in the 1MB has 255 16-byte "descriptors."
2023-05-21 15:27:33 The last 16 bytes is just unused.
2023-05-21 15:27:49 Anyway, the buffer is chosen via (blk#-1) mod 255.
2023-05-21 15:28:15 I used a Hacker's Delight algorithm for mod 255 rather than the division instruction.
2023-05-21 15:28:33 Uses multiply and some diddling and it's faster than doing it with /.
2023-05-21 15:41:21 So that fast path winds up being about 28 instructions. The mul is the only one of those that would take more than a minimum length of time. Guess I could time it.
2023-05-21 15:45:39 Ok, it runs that fast path a million times in about 10.2 msec.
2023-05-21 15:46:23 So, about 10 ns for a resident block? That seems fairly nice.
2023-05-21 15:47:46 Actually it's more like 6 ns on repeated executions. Just happened to catch something busy going on that first time.
2023-05-21 15:48:19 Anyway, I wanted applications that work with disk resident data to be as fast as I could make them.
2023-05-21 20:19:25 Well, I did some reading today on arm architecture, and I'm a little disillusioned about this portability hope I had. I mean, I can make it work - no problem. But the way I was planning to do it I don't really think would put me close to a truly optimum arm implementation. Arm just has some fairly significant additional features, such as conditional execution of practically every instruction.
2023-05-21 20:19:56 So an imp that mirrored the conditional structure of the x86_64 imp probably wouldn't be as good as one that fully exploited what the arm has to offer.
2023-05-21 20:21:14 I was picturing a situation where the two programs had basically the same structure, just with the x86 version taking advantage of the ability to access RAM and "operate" at the same time, while the arm used the necessary load/store operations as separate instructions.
2023-05-21 20:22:06 It did, though, say that the Thumb instruction set didn't necessarily have all those features, so maybe if I used that it might make more sense.
2023-05-21 20:22:18 And that would be more compact than using the full 32-bit instructions.
2023-05-21 20:22:57 Oh, I got the additional block words written.
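Editorial aside: the block buffer scheme described in the log (255 4kB buffers in a 1MB region, 255 16-byte descriptors in the last 4kB, buffer chosen via (blk#-1) mod 255) can be modelled with the rough Python sketch below. It is a toy model under stated assumptions, not the author's primitive. The descriptor contents are not given in the log, so the `resident` and `dirty` lists are stand-ins; `BlockCache`, `mark_dirty` and the dict used as a "disk" are hypothetical. The `mod255` helper uses a generic reciprocal-multiply trick and may well differ from the Hacker's Delight routine the author used.

```python
BUF_SIZE = 4096                  # 4kB per buffer
NUM_BUFS = 255
DESC_SIZE = 16
REGION_SIZE = 1 << 20            # the 1MB RAM block
DESC_BASE = NUM_BUFS * BUF_SIZE  # descriptors start in the last 4kB


def mod255(n):
    """n mod 255 for 0 <= n < 2**32, using multiply and shift, no division."""
    q = (n * 0x80808081) >> 39   # q == n // 255 for 32-bit n
    return n - q * 255


def buffer_index(blk):
    return mod255(blk - 1)


def buffer_offset(blk):
    return buffer_index(blk) * BUF_SIZE


def descriptor_offset(blk):
    return DESC_BASE + buffer_index(blk) * DESC_SIZE


class BlockCache:
    """Toy model: fast path if resident, otherwise write back and load."""

    def __init__(self, disk):
        self.disk = disk                   # dict blk# -> bytes, stand-in for disk
        self.resident = [0] * NUM_BUFS     # assumption: 0 means "empty"
        self.dirty = [False] * NUM_BUFS
        self.data = [bytearray(BUF_SIZE) for _ in range(NUM_BUFS)]

    def block(self, blk):
        i = buffer_index(blk)
        if self.resident[i] == blk:        # fast path: already resident
            return self.data[i]
        if self.dirty[i]:                  # "dirty" entry point: flush first
            self.disk[self.resident[i]] = bytes(self.data[i])
            self.dirty[i] = False
        self.data[i][:] = self.disk.get(blk, bytes(BUF_SIZE))
        self.resident[i] = blk
        return self.data[i]

    def mark_dirty(self, blk):
        self.dirty[buffer_index(blk)] = True


# Usage, mirroring the test described in the log: blocks 1 and 256 share
# buffer 0, so loading 256 forces the modified block 1 out to "disk".
cache = BlockCache({})
buf = cache.block(1)
buf[0] = 42
cache.mark_dirty(1)
cache.block(256)
```

With these constants the descriptors occupy 255 * 16 = 4080 bytes of the final 4kB, which leaves the last 16 bytes of the 1MB region unused, as the log says.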