2022-12-30 00:27:22 https://cohost.org/offset---cyan/post/728975-the-zen-of-forth
2022-12-30 00:39:34 So, decided today that there will be two "do nothing" six-bit instruction codes.
2022-12-30 00:39:41 "nop" and "null"
2022-12-30 00:40:00 nop will still take a cycle to execute, and execution of the bundle it's in will continue past it.
2022-12-30 00:40:26 null on the other hand, upon becoming the "opcode to execute next time" will trigger a return to the instruction cell fetch state.
2022-12-30 00:40:53 So null will be used to pad out bundles that I can't use all the slots off.
2022-12-30 00:40:54 of
2022-12-30 00:41:28 And also when I shift an instruction word down as I march through the slots, the null code will shift into the high bits.
2022-12-30 00:42:19 So I won't need to keep up with how many slots I've executed in a word.
2022-12-30 00:42:44 I'll just keep shifting and executing until a null shows up in the "upcoming instruction" field.
2022-12-30 01:52:17 I think I may start the Verilog for this by writing code that just makes all the block RAM into a big 32-bit random access memory.
2022-12-30 01:52:22 Via external pins.
2022-12-30 01:52:30 Then work inward from there.
2022-12-30 01:55:50 I wonder if Verilog actually simulates RAM content when you run a model...
2022-12-30 03:20:21 KipIngram: Make 00 and FF throw an exception
2022-12-30 03:21:03 I get annoyed when an ISA allows all 00's and all FF's as an instruction
2022-12-30 10:15:20 Funny enough those were the very two I was thinking of.
2022-12-30 10:15:32 Except it would be 00 and 3F.
2022-12-30 10:57:45 Yeah make those cause an exception/crash.
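The null-padding scheme described earlier in the log can be sketched as a small software model. This is only an illustration, not the actual hardware: the five-slot layout in the low 30 bits of a cell, the slot order (first opcode in the low bits), and the `NOP` and `NULL` encodings are all assumptions made for the example.

```python
SLOT_BITS = 6
SLOTS = 5                          # assumed: five 6-bit slots per cell
WORD_BITS = SLOT_BITS * SLOTS      # 30 bits of a 32-bit cell hold opcodes
SLOT_MASK = (1 << SLOT_BITS) - 1

NOP = 0x01    # hypothetical encoding: takes a cycle, bundle continues
NULL = 0x3E   # hypothetical encoding: "nothing left in this cell"


def pack(ops):
    """Pack up to SLOTS opcodes into one word, padding unused slots with NULL."""
    if len(ops) > SLOTS:
        raise ValueError("too many opcodes for one cell")
    padded = list(ops) + [NULL] * (SLOTS - len(ops))
    word = 0
    for i, op in enumerate(padded):
        word |= (op & SLOT_MASK) << (i * SLOT_BITS)
    return word


def run_cell(word):
    """Execute slots until NULL is the upcoming opcode; no slot counter is kept."""
    executed = []
    while (word & SLOT_MASK) != NULL:
        executed.append(word & SLOT_MASK)
        # Shift the word down one slot and feed NULL in at the top.
        word = (word >> SLOT_BITS) | (NULL << (WORD_BITS - SLOT_BITS))
    return executed
```

In this model a full cell terminates after five opcodes because the shifted-in NULL reaches the low slot, and a short bundle terminates at its padding. The loop needs no state beyond the word itself.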
So if you ever 'jump' to data accidentally you've got a good chance of bailing early
2022-12-30 11:01:09 I used 0 in my token forth for 6502 as the instruction to start a word
2022-12-30 11:01:25 which is neat because 0 is the 6502 instruction for software interrupt
2022-12-30 11:02:03 so you can jump to the same piece of code from assembly or forth and it will either start the forth engine or push the IP if already in forth mode
2022-12-30 11:02:21 and begin executing in either case
2022-12-30 11:35:17 Ah, I see what you mean - you're saying don't "use them" at all.
2022-12-30 11:35:30 And that's not a bad idea, so long as I have the opcodes to spare.
2022-12-30 11:35:52 These two functionalities I described I do intend to use, though.
2022-12-30 11:36:13 But I don't have to have 0 or 63 be either of them, necessarily.
2022-12-30 11:37:10 MrMobius: that's pretty clever.
2022-12-30 11:37:47 There's one place in my x86 system where I want to start a Forth word from machine code, and I had to jump through a small hoop to do it.
2022-12-30 11:38:15 That's in BLOCK. BLOCK itself is a primitive, and if the requested block # is already in a buffer, the machine code handles the total job.
2022-12-30 11:38:23 I wanted that to be as fast as possible.
2022-12-30 11:38:46 But if it's not resident and I'm going to have to mess with the disk, then speed isn't as paramount, so the lower stuff I wrote in Forth.
2022-12-30 11:41:43 It goes from there either to (blkr) or (blkw), depending on the situation.
2022-12-30 11:43:10 I have 255 4kB buffers in a 1M region; the leftover 4k at the top is a table used for storing the information about what's in what block, whether a block is dirty or not, and so on.
2022-12-30 11:43:33 I wanted the buffers to be 4k aligned, so I keep that meta-info separate.
2022-12-30 11:44:06 There's a 16-byte record for each buffer up there, so that means there's 16 bytes at the end that are unused.
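The buffer arithmetic above can be checked with a few lines. This is a sketch of the layout as described, with offsets taken relative to the start of the 1 MB region; the constant and function names are made up for the example, and the assumption that records sit in buffer order from the base of the table is mine.

```python
REGION_SIZE = 1 << 20      # 1 MB region
BUF_SIZE = 4096            # 4 kB per buffer, so every buffer is 4k aligned
NUM_BUFS = 255
RECORD_SIZE = 16           # one metadata record per buffer

TABLE_OFFSET = NUM_BUFS * BUF_SIZE   # table occupies the leftover top 4 kB


def buffer_offset(i):
    """Offset of buffer i from the start of the region."""
    if not 0 <= i < NUM_BUFS:
        raise IndexError(i)
    return i * BUF_SIZE


def record_offset(i):
    """Offset of buffer i's record (assumed to be stored in buffer order)."""
    if not 0 <= i < NUM_BUFS:
        raise IndexError(i)
    return TABLE_OFFSET + i * RECORD_SIZE


table_bytes = REGION_SIZE - TABLE_OFFSET               # 4096
unused_bytes = table_bytes - NUM_BUFS * RECORD_SIZE    # 4096 - 4080 = 16
```

With 256 buffers the table would have no room inside the same power-of-two region, which is the trade-off behind settling on 255.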
2022-12-30 11:45:01 256 buffers would have been nicer; I have to do a mod 255 in block (I use a trick from Hacker's Delight). But I wanted the whole thing to fit in a power-of-two sized block (1 MB in this case).
2022-12-30 11:52:34 When I first started planning the op_control section, I envisioned having explicit state bits that told me where I was in an instruction word. First opcode slot, second, etc. Even when I was planning to shift the word down op by op, I still had drawn that six-long (fetch plus five ops) state diagram, and would have needed at least three state bits to identify where I was in it.
2022-12-30 11:52:44 Though I was kind of thinking of making it one hot.
2022-12-30 11:53:30 But anyway, I realized last night I don't need that information at all. If I shift this "null" code in from the top as I shift the ops down, then that CONTENT can tell me when I've exhausted the cell.
2022-12-30 11:53:35 Cleaner.
2022-12-30 11:54:11 It's kind of like having an "end of opcodes" instruction, except I don't have to manually put it in the code - it just appears naturally where it's needed.
2022-12-30 11:54:57 So the machine itself now has only two states; we're either fetching a cell or processing some opcode in a cell.
2022-12-30 11:55:25 And since the block ram business forces me to look ahead one opcode, I can respond to that null code without it actually taking a cycle.
2022-12-30 12:12:28 Best I can tell I can support a 32MB total address space while using only two levels of LUTs to do all the address decode.
2022-12-30 12:12:53 Even if I've overlooked something it won't take more than three - if it winds up being three then I could probably expand that quite a bit.
2022-12-30 14:17:28 Hmmm. Here's something interesting. In general RAM I'm using the A block RAM port for reading, and the B port for writing. That lets me keep the B register applied to the B port all the time, so it's "always ready to write."
2022-12-30 14:17:45 It means the A write port and the B read port are unused.
2022-12-30 14:18:06 I could do something interesting with those. I could use them as 32 "general input pins" and 32 "general output pins."
2022-12-30 14:18:46 then I could generate signal waveforms by marching the b register along, or capture signal waveforms by marching reg a along (with some special circuitry to tell it when to write from A).
2022-12-30 14:18:52 That's kind of nifty.
2022-12-30 14:19:41 feed those B read outputs into a DAC and all of a sudden I've got an arbitrary signal generator.
2022-12-30 14:22:02 A *fast* one.
2022-12-30 14:22:27 Set the A write inputs up right and I've got a logic analyzer.
2022-12-30 17:54:24 Ugh, got to stay up late tomorrow :(
2022-12-30 17:55:44 "Who are all these people at this party? I bet none of them have even read Starting Forth. My feet hurt."
2022-12-30 17:56:06 Hope you enjoy your New Years parties tomorrow ;)
2022-12-30 18:04:18 my what now
2022-12-30 18:05:24 we ain't having one.
2022-12-30 18:09:42 I envy you
2022-12-30 18:11:03 crc: I just found out that whether division is floored or symmetric is implementation-defined in C89
2022-12-30 18:11:25 It's symmetric in C99+
2022-12-30 18:11:28 don't trust division.
2022-12-30 18:17:18 veltas: I'll look into working around this to ensure consistency
2022-12-30 18:18:30 yeah, it's better to work on things together
2022-12-30 18:20:03 I thought I'd tell you because I'm assuming you only want symmetric and you're supporting C89
2022-12-30 18:22:29 Yes
2022-12-30 21:06:27 I've added some logic to try to correct for this; also fixed the python code which was doing floored division
2022-12-30 22:10:00 Wrote a first Verilog file this evening.
2022-12-30 22:10:35 It's just the bit that shuffles the data in lines to the RAM around, based on address 1:0 and a "size" indication that will be taken from the instructions.
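A rough software model of that kind of data-steering logic, for illustration only. The real module is Verilog and its details are not given in the log, so the size encoding (`BYTE`, `HALF`, `WORD`), little-endian lane order, right-justified source data, and the rejection of unaligned accesses are all assumptions here.

```python
BYTE, HALF, WORD = 0, 1, 2   # hypothetical "size" encoding


def steer(data, addr_lo, size):
    """Return (ram_data, write_enables) for a store.

    data          -- 32-bit source value, right-justified (assumed)
    addr_lo       -- address bits 1:0
    write_enables -- 4-bit mask, bit n enables byte lane n (little-endian, assumed)
    """
    if not 0 <= addr_lo <= 3:
        raise ValueError("addr_lo must be two bits")
    if size == BYTE:
        value, lanes = data & 0xFF, 0b0001 << addr_lo
    elif size == HALF:
        if addr_lo & 1:
            raise ValueError("unaligned halfword (not modelled)")
        value, lanes = data & 0xFFFF, 0b0011 << addr_lo
    elif size == WORD:
        if addr_lo:
            raise ValueError("unaligned word (not modelled)")
        value, lanes = data & 0xFFFFFFFF, 0b1111
    else:
        raise ValueError("unknown size")
    # Move the value onto the byte lanes selected by the low address bits.
    return (value << (8 * addr_lo)) & 0xFFFFFFFF, lanes
```

For example, `steer(0xAB, 2, BYTE)` places the byte on lane 2 and enables only that lane.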
2022-12-30 22:11:03 Lights up the data lines to the ram with the right information and produces the four write enable signals for the bytes.
2022-12-30 22:11:33 Took it all the way through the Xilinx software and looked at some of the reports.
2022-12-30 22:11:43 It was amusing how little of the device it consumed.
2022-12-30 22:12:26 The software detects free inputs and outputs and routes them to pins for you. Of course in the end this part will be buried inside.
2022-12-30 22:12:47 But I suppose in theory I could program it onto the board and then test it live if I wanted to.
2022-12-30 23:03:42 Oh, btw - I'd originally planned to use all the block rams as 32-bit wide units. With that aspect ratio you get 1k cells per block RAM.
2022-12-30 23:04:23 But it's also possible to do it other ways, and I realized that doing it that way would make me have to send all the output data from all the RAM units through a big selector tree.
2022-12-30 23:04:49 Instead for general RAM I'm going to use them in 32k 1-bit wide mode, and just gang 32 of them across to get my 32-bit cells.
2022-12-30 23:05:04 Then just a single gang will give me 128kB of RAM, which is what I'll start with.
2022-12-30 23:05:19 So there will just be one set of 32 output lines from it.
2022-12-30 23:05:27 It'll do all that work for me inside the block RAM.
2022-12-30 23:05:52 For the stacks, though, I'll use by-32, since each stack will just need one block RAM anyway.
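The block RAM numbers above work out as follows. This is a back-of-envelope sketch: the 32 kbit block capacity is inferred from "1k cells" at 32 bits wide and ignores any parity bits, and the helper names are invented for the example.

```python
BLOCK_BITS = 1024 * 32     # one block RAM: 1k cells x 32 bits, per the log
CELL_BITS = 32


def depth(width_bits):
    """Number of addressable entries in one block RAM at a given data width."""
    return BLOCK_BITS // width_bits


def mux_inputs(total_cells, width_bits):
    """How many blocks feed each output bit, i.e. the selector fan-in needed."""
    return total_cells // depth(width_bits)


gang_cells = depth(1)                      # 32k cells when 32 blocks are ganged by-1
gang_bytes = gang_cells * CELL_BITS // 8   # 131072 bytes = 128 kB

by32_fan_in = mux_inputs(gang_cells, 32)   # 32 blocks per output bit: needs a 32:1 selector
by1_fan_in = mux_inputs(gang_cells, 1)     # 1 block per output bit: no selector
```

Both arrangements use 32 block RAMs for 128 kB. The difference is that in by-32 mode each output bit has 32 candidate sources, while in by-1 mode each block drives exactly one bit of the cell.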