2023-08-24 00:08:40 xelxebar: Oh, it doesn't actually burn the flash until you're done debugging your program.
2023-08-24 00:08:58 Prior to that, it actually runs your program on the host, and uses jtag to control the processor pins.
2023-08-24 00:09:16 So you have an interactive session even on a little micro that can't actually host one - the tethered host runs it.
2023-08-24 00:09:37 And basically just uses the target as an "extension cord" sort of.
2023-08-24 00:10:11 Yes - what you said. The interpreter doesn't actually run on the target chip.
2023-08-24 00:11:52 It doesn't have enough resources to run one.
2023-08-24 00:12:10 That's what makes mecrisp across so neat - it gives you that interactive experience in spite of the resource limitations.
2023-08-24 00:12:29 And my understanding is that when it does create and burn code, it's extremely well optimized.
2023-08-24 00:13:19 I think of the wear reduction on the flash as a secondary benefit - the primary benefit is the interactivity.
2023-08-24 00:14:39 xelxebar: JTAG is a hardware protocol supported by many chips - it lets you communicate with the part over a serial connection, and exploits internal wiring in the device that lets you program the internal flash, test various things, and control all of the pins. So you can drive the chip through an emulation of it dealing with its own external peripherals and so on.
2023-08-24 00:54:39 Oh cool. So you're executing on your dev laptop and the JTAG bridge lets you mostly pretend you're running on the chip.
2023-08-24 00:55:56 Then when you're ready, mecrisp will compile the forth down to a more optimized blob and burn that.
2023-08-24 00:56:08 Am I grokking the basic idea?
2023-08-24 01:33:43 Not really on your laptop - you buy not only the target micro but also a "host board" which has a more capable microprocessor on it. The system runs on that, and you communicate with it via a USB port.
2023-08-24 01:34:05 I can't remember the specific part number for the host board, but I've got one in the garage.
2023-08-24 01:35:35 So basically notebook <--> host_board <--> target_board.
2023-08-24 01:35:58 host_board runs an interactive forth that can use jtag to control target_board, or burn its flash with a finished program.
2023-08-24 01:43:14 Ah, okay. So the host board is an intermediary with the extra hardware needed to interface with the actual hardware.
2023-08-24 01:52:51 Yes.
2023-08-24 03:14:40 That actually makes hardware devving sound fun. I've always found the dev cycle to be demotivating: write program -> burn flash -> run -> miserable failure -> tweak program -> repeat
2023-08-24 03:15:03 This is why mecrisp is so popular
2023-08-24 03:15:11 A lot of people got into Forth because of mecrisp actually
2023-08-24 03:15:21 But being able to just have the hardware plugged into a friggin' repl. Hell yeah.
2023-08-24 03:15:31 Interesting.
2023-08-24 03:15:56 Is that why Linux/FreeBSD ports exist?
2023-08-24 03:15:57 There's a #mecrisp channel ... somewhere, the main one's on hackint I think
2023-08-24 03:16:06 Which is quite active
2023-08-24 03:16:18 I suppose so?
2023-08-24 08:12:13 Hardware design is fun. I particularly enjoyed digital circuit design back in the day when we did it by wiring fixed function discrete parts together (gate level design). There was just something about that process that resonated for me.
2023-08-24 08:12:38 Now we've got Verilog, though, and big FPGAs etc. It's a different world.
2023-08-24 08:12:53 That kind of hardware design has almost become software design.
2023-08-24 08:15:07 I've always been slightly suspicious of Verilog / VHDL / etc. Feels to me like it lets you bring forth "hardware" without necessarily really understanding what you've done.
2023-08-24 08:15:33 Because one way to use Verilog is just to describe what you want the hardware to do - the synthesizer will come up with the details.
2023-08-24 08:15:42 You *can* get yourself into trouble that way.
2023-08-24 08:16:17 Once in a while a bit of behavioral Verilog that looks totally innocent will go like use half of your FPGA or something.
2023-08-24 08:16:41 I think it's important to understand what your circuit's actually going to LOOK LIKE.
2023-08-24 08:17:08 You can use Verilog that way too - you can just describe your circuit more literally. "Structural" Verilog vs. "behavioral."
2023-08-24 08:17:41 Structural is pretty much the same as drawing a schematic insofar as what you're actually doing goes.
2023-08-24 08:18:30 But I've always thought that an actual schematic can deliver information in a "visual" way that Verilog can't achieve. It's like the general "shape" of the circuit on the page is an information channel.
2023-08-24 08:18:37 General information, but still.
2023-08-24 08:19:03 There's no denying, though, that Verilog lets you design much larger, more complex things.
2023-08-24 08:20:07 And the world's never going to go back to schematics, so I just have to get over it. :-)
2023-08-24 11:53:13 Well, I went through and assigned instructions to 6-bit opcodes. I tried to be as patterned about it as possible - doesn't really matter for a software implementation, but getting that kind of thing organized right can make a real difference if one ever tries to do an FPGA implementation. I'm pretty sure it won't be "final," but at the moment I've used 60 of the 64 instructions. I was pretty liberal with
2023-08-24 11:53:15 things, though, so I'm sure I could contract aspects of it if I needed to.
2023-08-24 12:00:44 I'm making some use of "prefixes." For things you do often, it's worth splurging on a "family" of instructions, but for rare things you'd rather not. E.g., a Forth system needs all of sp@ sp! rp@ rp!. But you rarely actually use those. So I'm going to have a prefix that uses the following field to choose a specific operation.
That way I only need one instruction to get at a whole family of operations.
2023-08-24 12:04:35 I guess I could push this "all the way". I've got 64 codes. Two of them (all 0's and all 1's) get used up by the next cell function. I could take one of the remaining 62 for a "rare" prefix. Then I can have 61 frequently used instructions and as many as 64 less frequently used ones.
2023-08-24 12:10:45 I'm also going somewhat "HP calculator" with it. I'm going to think of the top four stack items as x, y, z, t. So instead of DUP I will have X. Instead of OVER I will have Y. And I'll have Z and T as well. Instead of SWAP I'll have <>Y. And I'll have <>Z and <>T as well.
2023-08-24 12:20:11 hmm... this Commander X16 seems to be going to become a thing ( https://www.commanderx16.com )
2023-08-24 12:36:34 ACTION works on https://gist.github.com/zarutian/e13c9fc3f1239a1e716fd8248db26baf
2023-08-24 13:11:32 Holy cow. $500?
2023-08-24 13:11:49 I'd expect something like that to be more like $100-$150.
2023-08-24 13:11:55 Since it's essentially a toy.
2023-08-24 13:26:09 KipIngram: I expect as much for the first batches.
2023-08-24 13:27:00 8bit guy has not gone into manufacturing before so it is expected
2023-08-24 13:28:49 That's fair. I guess we can look at $400 of it as the price of being an "early adopter."
2023-08-24 13:29:25 I just want to buy the thing as a kit
2023-08-24 13:37:13 but the fun thing is that there is an emulator in a webpage already
2023-08-24 14:02:56 Yes; I didn't really study the site but I was thinking it would be easy to play with without any hardware.
2023-08-24 18:05:47 KipIngram: just curious, why is the all 1s instruction special?
2023-08-24 22:23:18 MrMobius: some instructions need literal data that isn't necessarily 32 bits (or rather, doesn't REQUIRE 32 bits). Say to specify literal 12. You can use "the rest of the cell" for that.
2023-08-24 22:23:26 And you might want to specify a negative literal.
2023-08-24 22:23:44 Therefore, the right shift instructions used to move each 6-bit instruction down toward the low position must be arithmetic.
2023-08-24 22:23:50 That's part 1 of the answer.
2023-08-24 22:24:00 Part two is that you always have five 6-bit fields in a 32-bit cell.
2023-08-24 22:24:05 But there are two bits left over.
2023-08-24 22:24:35 If you happen to need an instruction next which only has the low two bits nonzero, you can slip a sixth instruction into the cell.
2023-08-24 22:24:57 If those two bits of that sixth instruction have 1 as the lsb, then that arithmetic right shift will shift in 1's from the top.
2023-08-24 22:25:08 So instead of eventually getting all 0's, you'll eventually get all 1's.
2023-08-24 22:25:18 So either one needs to "next cell" the thing.
2023-08-24 22:25:22 Does that make it clear?
2023-08-24 22:26:26 I guess the short answer is "because for various reasons I need to right shift with a sign-bit preserving shift operation.
2023-08-24 22:26:28 "
2023-08-24 22:27:07 And I still need to guarantee that I will execute a next cell process at the necessary time.
2023-08-24 22:28:44 Why not do a logical shift then so you get 5 zeroes shifted in regardless of what the MSB is?
2023-08-24 22:29:19 I see what you mean about literals though you may be able to sign extend with one instruction
2023-08-24 22:29:42 Because then I couldn't use instruction 0b000010 in the sixth slot.
2023-08-24 22:30:17 I'm sorry - wrong answer.
2023-08-24 22:30:25 Or, I answered the wrong question.
2023-08-24 22:30:38 What if I want to have a lit instruction in the third cell for -12?
2023-08-24 22:30:54 I need to set the top 12 bits to -12.
2023-08-24 22:31:02 And that will shift right as I execute.
2023-08-24 22:31:16 I need it to be a 32-bit -12 by the time it shifts all the way down.
2023-08-24 22:31:42 If I do a logical shift, 0's will shift in and it will look like a positive number.
2023-08-24 22:32:21 Same for jumps forward or backward. Or calls.
I'm just going to add what's left of the cell to IP.
2023-08-24 22:32:31 Well, I'm going to add 4x what's left.
2023-08-24 22:32:41 I need to be able to specify offsets in either direction.
2023-08-24 22:33:19 When an instruction that uses "the rest of the cell" as a literal value executes, that literal value will have shifted down to the bottom of the cell.
2023-08-24 22:33:25 I want it to sign extend properly.
2023-08-24 22:33:32 Which requires arithmetic shift.
2023-08-24 22:34:30 I might not have enough room left to call out the necessary value.
2023-08-24 22:34:41 If that's the case then I abandon the rest of that cell and go to the next one.
2023-08-24 22:35:06 That fresh cell will have one six bit instruction and 26 bits available for operand.
2023-08-24 22:37:41 so 26 bits is the only immediate size?
2023-08-24 22:37:47 before sign extending
2023-08-24 22:37:52 No; it's "whatever is left."
2023-08-24 22:38:15 If the instruction is in the lowest order cell, there are 26 bits. If it's in the next cell up there are 20. Then 14, etc.
2023-08-24 22:38:28 It will always have shifted down into a 32-bit register layout by the time I use it.
2023-08-24 22:38:44 And I want it to be able to capture positive or negative values, so that forces arithmetic right shift.
2023-08-24 22:39:19 I don't have to know what slot my instruction was in initially - at execution time I'll just have a register with 32 bits in it that contains the operand.
2023-08-24 22:39:28 that's one way to do it
2023-08-24 22:39:49 The "force a new cell" thing only happens if there isn't enough operand space left in the prior cell for the value at hand.
2023-08-24 22:40:07 So "short" jumps can be in more possible slots - "long" jumps have to be in a lower slot.
2023-08-24 22:41:00 and you can only encode a 1 or 2 for the final 2 bit instruction?
2023-08-24 22:41:29 where 0 and 3 are nops
2023-08-24 22:42:06 Right.
As far as using the sixth slot for an INSTRUCTION goes, really only 0b000001 and 0b000010 are beneficial values.
2023-08-24 22:42:14 I'm going to assign those to return and unext.
2023-08-24 22:42:47 Making one of them unext lets me have five-instruction-long microloops instead of just four instructions.
2023-08-24 22:43:06 And return... well, Forth does a lot of returning.
2023-08-24 22:43:28 It's a guess to some extent. Chuck included both of those in his equivalent "short cell" instructions.
2023-08-24 22:43:37 He had eight, though, because he had three extra bits.
2023-08-24 22:44:36 And he didn't need to use any of his for "next celling." His hardware kept up with that.
2023-08-24 22:45:14 I could emulate that hardware in software, of course. I could keep a count of how many cells I've executed, so I knew when to reload.
2023-08-24 22:45:24 But I'd be running that test every single instruction, and it would be inefficient.
2023-08-24 22:46:03 It was when I realized I could do it this other way, and only run next cell logic when I needed to, precisely, with no overhead the rest of the time, that this whole idea started to look attractive to me.
2023-08-24 22:46:40 I think reloading your jump pointer every time would let you keep track and might be pretty efficient
2023-08-24 22:46:52 jump pointer for next I mean
2023-08-24 22:47:19 Not sure I know what you mean by that.
2023-08-24 22:48:47 I mentioned it the other day. basically 4 copies of NEXT with labels NEXT1, NEXT2, NEXT3, NEXT4
2023-08-24 22:49:18 load NEXT1 into a pointer register. the end of each primitive jumps to the pointer register
2023-08-24 22:50:01 NEXT1 loads NEXT2 into the register then executes your NEXT_MACRO
2023-08-24 22:50:30 so that the next primitive jumps to NEXT2 when it's done
2023-08-24 22:50:51 All four are identical except NEXT4 loads a new word
2023-08-24 22:50:53 Oh. I missed it the other day - that's new to me.
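The sign-extension argument from the literal discussion above (a lit in the third slot carrying -12) can be checked with a small sketch. This is only an illustration in Python, not the actual implementation: the helper names `pack`, `step` and `to_signed32` and the opcode numbers are made up for the example. Slots are consumed from the low end of the cell, six bits at a time, and whatever is left after the literal instruction's slot is treated as its operand.

```python
MASK32 = 0xFFFFFFFF
LIT = 0x21  # hypothetical opcode number, chosen only for this example


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def pack(opcodes, operand=0):
    """Pack 6-bit opcodes into the low slots; the operand fills the rest of the cell."""
    cell = operand << (6 * len(opcodes))  # operand sits above the last opcode
    for i, op in enumerate(opcodes):
        cell |= (op & 0x3F) << (6 * i)
    return to_signed32(cell)


def step(cell):
    """Return (opcode, rest): the low six bits, and the cell shifted right by 6.

    Python's >> on a negative int is an arithmetic shift, like x86 sar.
    """
    return cell & 0x3F, cell >> 6


# Two ordinary opcodes (5 and 7, arbitrary) followed by LIT with operand -12.
cell = pack([5, 7, LIT], -12)
op1, rest = step(cell)
op2, rest = step(rest)
op3, rest = step(rest)
arithmetic_operand = rest                    # what an arithmetic shift leaves
logical_operand = (cell & MASK32) >> 18      # what a logical shift would leave
print(op1, op2, hex(op3), arithmetic_operand, logical_operand)
```

With the arithmetic shift the operand arrives as a full 32-bit -12; with the logical shift the same bits read as the 14-bit pattern of -12, which is a positive number.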
2023-08-24 22:51:16 or in this case NEXT5 since 5 instructions
2023-08-24 22:51:32 I'd have to think about it a little, but it doesn't immediately seem simple. But yes, if there's a register pointing to NEXT then I guess there are a lot of games you could play.
2023-08-24 22:52:24 NEXT1: MOV Rx, NEXT2
2023-08-24 22:52:34 I'm picturing each "primitive" (aka vm instruction code) ending with a phrase like so (x64 style): mov regA, regB ; sar regA, 6 ; and regB, 0x3F ; jmp [regC+4*regB]
2023-08-24 22:52:34 NEXT_MACRO
2023-08-24 22:52:53 basically that 5 times with extra code on the 5th one
2023-08-24 22:53:38 And that compares VERY favorably with the amount of code in my current NEXT.
2023-08-24 22:54:12 Then every few instructions I'll need a more typical looking operation, lodsd, to reload the instruction word using IP and bump ip forward.
2023-08-24 22:54:26 That's what the 0b000000 and 0b111111 "instructions" will do.
2023-08-24 22:54:30 hmm
2023-08-24 22:54:44 I think on an x64 it might JUST be lodsd.
2023-08-24 22:54:51 ya one jump at the end of each primitive is pretty clean
2023-08-24 22:54:57 Which would then be followed by the first thing I laid out.
2023-08-24 22:55:26 you could also mask or multiply or conditional move if you're avoiding branches
2023-08-24 22:55:37 I'm not going to be shocked if I discover this is actually faster than my current system, AND more compact by a large margin and more portable.
2023-08-24 22:55:53 I'm excited about it.
2023-08-24 22:56:31 I'm not quite prepared to say I'm sure it will be faster, though.
2023-08-24 22:59:02 how about using something like setz then multiplying?
2023-08-24 23:00:13 so you get a new value to jump to once your count of instructions hits 0 but otherwise keep executing
2023-08-24 23:02:54 if (count) x=1; else x=0;
2023-08-24 23:03:17 y=1-x;
2023-08-24 23:04:05 goto next_address*x+reload_address*y;
2023-08-24 23:04:40 ugly in C but potentially not many instructions in asm
2023-08-24 23:11:46 I don't see how that is better.
2023-08-24 23:12:13 I suppose we could measure them, but it feels like it would be slower.
2023-08-24 23:12:50 You have to have the table jump somewhere, and the scheme I'm looking at now has no other jumps in it. Just three reg/reg instructions.
2023-08-24 23:13:38 Well, one reg/reg and two reg/imm.
2023-08-24 23:14:23 I'll admit, though, that I haven't systematically generated a whole flock of possibilities and carefully analyzed them.
2023-08-24 23:14:33 I just worked this one method out, and it "seems very fast."
2023-08-24 23:20:25 The shift process eliminates the need for having to differentiate among the slots. The slot to be executed is always the least significant six bits.
2023-08-24 23:20:54 If there's an operand, it's always a full 32-bit value just sitting in a register by the time it gets used. So many things just "fall into place" in an almost perfect way.
2023-08-24 23:21:19 You even automatically get an all 0's or all 1's instruction when it's time to bump to the next cell.
2023-08-24 23:22:32 I guess to be honest that's not "perfect." It's an overhead operation. "Perfect" would be for you to just somehow know to do the cell bump when you finished the last REAL instruction in the cell - you wouldn't have to process and execute a "phantom extra instruction" to get that to happen.
2023-08-24 23:23:09 But to do that I'd have to keep up with slots and so on; my guess is that that would wind up imposing more overhead, just spread out instead of coming in one spot.
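The dispatch scheme described above (mask the low six bits, arithmetic-shift the rest down, and let an all-0's or all-1's opcode trigger the cell fetch) can be sketched as a toy interpreter. This is a rough Python model under stated assumptions, not the real system: the opcode numbers for `LIT`, `X`, `ADD` and `HALT` are invented, an if/elif chain stands in for the jump table, and having `LIT` zero the instruction register so that the next dispatch fetches a new cell is a guess at one way to handle "the rest of the cell was the operand".

```python
MASK32 = 0xFFFFFFFF
NEXT0, NEXT1 = 0b000000, 0b111111            # both mean "fetch the next cell"
LIT, X, ADD, HALT = 0x01, 0x02, 0x03, 0x04   # hypothetical opcode numbers


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def pack(opcodes, operand=0):
    """Pack 6-bit opcodes into the low slots; the operand fills the rest."""
    cell = operand << (6 * len(opcodes))
    for i, op in enumerate(opcodes):
        cell |= (op & 0x3F) << (6 * i)
    return to_signed32(cell)


def run(memory):
    """Execute cells from memory; return (stack, number_of_cell_fetches)."""
    ip, w, stack, fetches = 0, 0, [], 0      # w == 0 forces a fetch at startup
    while True:
        op, w = w & 0x3F, w >> 6             # the "and 0x3F" / "sar 6" pair
        if op in (NEXT0, NEXT1):             # cell exhausted: reload, like lodsd
            w = to_signed32(memory[ip])
            ip += 1
            fetches += 1
        elif op == LIT:                      # the rest of the cell is the operand
            stack.append(w)
            w = 0                            # assumption: force a fetch next
        elif op == X:                        # DUP in the x/y/z/t naming
            stack.append(stack[-1])
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(to_signed32(a + b))
        elif op == HALT:
            return stack, fetches


program = [pack([LIT], 5), pack([X, ADD, LIT], -12), pack([ADD, HALT])]
result = run(program)
print(result)
```

Nothing in the loop counts slots: the fetch happens only when the shifted-down register presents an all-0's or all-1's opcode, which is the property the discussion above is after.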
2023-08-24 23:24:01 Forth encourages us to factor/modularize, but one consequence of that is that a lot of calls are local - small distance calls.
2023-08-24 23:24:20 Those are more likely to need a smaller operand and so have a better chance of fitting in the remainder of your current cell.
2023-08-24 23:25:49 The most expensive operations will be calling native Forth words that need to be called instead of being in-lined. Those are way down in the startup dictionary, likely far away, and will be more likely to force us to a new cell so that we have room for a large operand.
2023-08-24 23:26:46 One possible way of dealing with that would be to have a second call instruction that took the operand as an offset from the base of the system instead of an offset from IP.
2023-08-24 23:27:14 All the start-up words will be down there near that point. But that wouldn't be a relocatable image then, unless I have a register pointing to my system base and offset from that.
2023-08-24 23:43:25 Actually the all 0's / all 1's instruction will have to cache a copy of the instruction word in another register. Because unext involves jumping back to the beginning of the cell, so I have to have a copy of it.
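The point about short calls fitting in the remainder of the current cell comes down to simple bit counting: an instruction in slot 0 leaves 26 operand bits, slot 1 leaves 20, then 14, 8 and 2. A compiler could decide between staying in the cell and forcing a new one roughly as below. This is a sketch only; `operand_bits`, `fits` and `place` are hypothetical names, and the operand is assumed to be a signed offset counted in cells.

```python
CELL_BITS, SLOT_BITS, SLOTS = 32, 6, 5


def operand_bits(slot):
    """Bits left above an instruction placed in the given slot (0 = lowest)."""
    return CELL_BITS - SLOT_BITS * (slot + 1)


def fits(value, slot):
    """True if value fits as a signed operand after an instruction in slot."""
    bits = operand_bits(slot)
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def place(value, next_free_slot):
    """Return (new_cell, slot) for an instruction carrying this operand.

    Stays in the current cell if the operand fits there; otherwise abandons
    the rest of the cell and uses slot 0 of a fresh one.
    """
    if next_free_slot < SLOTS and fits(value, next_free_slot):
        return (False, next_free_slot)
    if not fits(value, 0):
        raise ValueError("operand does not fit in 26 bits")
    return (True, 0)


print([operand_bits(s) for s in range(SLOTS)])
print(place(-12, 3), place(1000, 3))
```

A nearby call such as an offset of -12 can still go in the fourth slot, while an offset of 1000 from the same position forces a new cell.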