2023-05-09 06:01:45 KipIngram: I think that's a terminal option, whether \n generates the corresponding carriage return 2023-05-09 06:03:20 And this is how C on UNIX systems implements new lines in the terminal, because in the C standard \n must generate a new line (not just a line feed in terminal) 2023-05-09 06:03:31 so they just leave that option on by default 2023-05-09 06:04:02 In Windows I would imagine that instead the putc() et al functions check for \n and generate \r\n when in text mode 2023-05-09 06:04:11 And in binary mode you need to use \r\n 2023-05-09 06:41:13 Yeah, probably so. 2023-05-09 06:41:28 (terminal option). 2023-05-09 07:26:05 I started to pull together a little set of block editing words last night that looked fairly promising. Basically a line editor - word for displaying the block, words to move the target line around in the block, insert a new line somewhere, delete a line, etc. 2023-05-09 07:26:27 Won't be as smooth as my previous editor, but so far it feels like it would get the job done. 2023-05-09 07:27:04 Uses newline chars and not 64-char lines. 2023-05-09 07:27:48 And it capitalizes on the ability a component of my EXPECT has to carry the whole remainder of the block along for the ride while I'm editing a line. 2023-05-09 07:29:07 I've started to fake compile by default, although it's not good yet 2023-05-09 07:29:34 but everything becomes a function, a colon word becomes a function that executes an array of functions xD 2023-05-09 07:30:03 it's not good cause I'm not caching compiling, so everything is compiled every time 2023-05-09 07:30:17 I have to fix that, but somehow I see it a bit more similar than forth xD 2023-05-09 07:30:46 instead of a pointer to assembly instructions, it's a function 2023-05-09 07:30:59 There's a Forth out there I've seen that does interpretation that way - it compiles the whole line and then executes it. Just doesn't save it unless you're making a definition. 
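The "fake compile" idea above, where a colon word becomes a function that executes an array of functions, might look roughly like this in Python. This is only a sketch under assumptions: the names `primitives`, `words`, `compile_body`, and `define` are made up for illustration and are not taken from vms14's Perl or JS code.

```python
# Hypothetical mini-compiler; every name here is illustrative.
stack = []

def lit(n):
    # A literal compiles to a function that pushes its value.
    return lambda: stack.append(n)

def add():
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def dup():
    stack.append(stack[-1])

primitives = {"+": add, "dup": dup}
words = {}  # compiled colon words, keyed by name

def compile_body(tokens):
    """Turn a list of tokens into a list of Python callables."""
    body = []
    for tok in tokens:
        if tok in primitives:
            body.append(primitives[tok])
        elif tok in words:
            body.append(words[tok])
        else:
            body.append(lit(int(tok)))
    return body

def define(name, tokens):
    # The colon word is a function that runs its array of functions in order.
    body = compile_body(tokens)
    def run():
        for fn in body:
            fn()
    words[name] = run

define("double", ["dup", "+"])
define("quadruple", ["double", "double"])

stack.append(5)
words["quadruple"]()
print(stack)  # [20]
```

Word recognition happens once, inside `compile_body`; running the word afterwards is just a loop over already-resolved functions.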
2023-05-09 07:31:09 also realized forth wants to be written in assembly, but my lang wants to be written in a high level lang 2023-05-09 07:31:18 The nice thing about it is that you can use any words in "interpret" mode, since it's really compile mode under the hood. 2023-05-09 07:31:23 So you can have loops and so on. 2023-05-09 07:31:40 because it assumes a call stack, environments, hashes, lists and a gc exist 2023-05-09 07:32:08 Yeah, you are using some of the abstract things your underlying language offers. 2023-05-09 07:32:12 KipIngram: yeah, I had it but as optional way, in order to provide a bit of efficiency 2023-05-09 07:32:34 but this time I'm not saving words 2023-05-09 07:32:53 because I need to change the functions for objects 2023-05-09 07:33:13 or whatever it is that gives me metadata, for example to see if it's an immediate function 2023-05-09 07:33:31 so yeah, they need a header 2023-05-09 07:33:32 xD 2023-05-09 07:34:18 the cool thing is if I save the compiled version of a word, and this word is tail recursive, node can give it tco 2023-05-09 07:35:06 when compiling the definition of a word, if I find the name of the word I'm compiling, I return a function that will execute the function of the future cached version 2023-05-09 07:35:21 as in a recursive definition the word isn't yet compiled 2023-05-09 07:36:35 I'm really planning to start a new system, but I'm giving serious consideration to trying to use the existing system to develop it. Just get this metacompile thing tackled. 2023-05-09 07:36:36 I was amazed that node was able to provide tco with anon subs that come from a "hash" 2023-05-09 07:36:51 KipIngram: oh, forth metacompilers interest me 2023-05-09 07:37:04 It's mostly just a bookkeeping problem. 2023-05-09 07:37:21 I want to learn about them sooner or later 2023-05-09 07:37:26 You need two sets of various pointers, and you have to know which set to use at various times. 
2023-05-09 07:38:38 KipIngram: how do you turn a recursive word into a loop even if it's not tail recursive? 2023-05-09 07:39:04 oh you don't need that, you have a call stack 2023-05-09 07:39:33 I tried to do it, once the recursion ended it was missing to execute the remainder of the word 2023-05-09 07:39:52 I assume this is why tail call is usually optimized, they convert it in a loop 2023-05-09 07:40:13 but since it's a stack based language, I think I can take the remainder of the recursion before starting the loop 2023-05-09 07:40:23 once the loop ends, execute the remainder 2023-05-09 07:40:42 Um, there is a way to do that, but it's been a long time since I've looked into it. 2023-05-09 07:40:52 There's an "algorithm" for the conversion, sort of. 2023-05-09 07:41:09 Basically you have to provide the equivalent of the standard stack that the recursion is taking advantage of. 2023-05-09 07:41:10 I'm not sure if "saving" the remainder of the word before starting recursion, and execute it later will be enough 2023-05-09 07:41:16 it looks like it would work for me 2023-05-09 07:41:37 it would be cool as then I have recursion without blowing the stack 2023-05-09 07:41:44 You can't dodge having to save the information - you just do it explicitly instead of having it happen transparently using the standard stack. 2023-05-09 07:42:23 Sometimes your standard stack has limited capacity - if you allocate the necessary memory yourself, you avoid running into that limit. 2023-05-09 07:42:41 my stack has the same limit as the memory available 2023-05-09 07:42:46 it's perl again 2023-05-09 07:42:47 xD 2023-05-09 07:42:49 It's just particularly easy in the tail recursion case, because you don't have to save any information in that case. 2023-05-09 07:43:17 Yeah, if your system stack is unlimited, then you can go either way - recursive or iterative. 
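The conversion discussed above (save the "remainder" explicitly instead of letting the call stack hold it) can be sketched in Python with a factorial, which is not tail recursive because the multiply happens after the recursive call returns. The function names and the `pending` list are illustrative only.

```python
def fact_recursive(n):
    # Not tail recursive: "n *" is the remainder that runs after the call returns.
    return 1 if n == 0 else n * fact_recursive(n - 1)

def fact_iterative(n):
    pending = []           # explicit stack standing in for the call stack
    while n > 0:           # descent: what the chain of recursive calls would do
        pending.append(n)  # save what each remainder will need later
        n -= 1
    result = 1             # the base case
    while pending:         # unwind: run each saved remainder, innermost first
        result *= pending.pop()
    return result

print(fact_iterative(5))  # 120
```

The information still has to be saved somewhere; it just lives in a list the program allocates itself rather than on the system stack. In the tail-recursive case nothing runs after the call, so `pending` would stay empty and the whole thing collapses to a plain loop.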
2023-05-09 07:43:30 I thought perl was a playground and I had to make a C version once I had an idea of what I'm doing 2023-05-09 07:43:55 but there's no reason to use C, unless I need performance 2023-05-09 07:44:11 and I never wrote a program that needs performance 2023-05-09 07:44:23 Or if you just want to learn to solve some of the things Perl gives you for free. 2023-05-09 07:44:33 I tried, cause for example if I compare the perl version with the js one I cry 2023-05-09 07:44:48 KipIngram: that's the best benefit I have 2023-05-09 07:44:49 I've written some things worth crying over over the years. 2023-05-09 07:45:04 I can focus on whatever I want and don't have to be distracted implementing stuff I already have 2023-05-09 07:45:14 this is also the reason it differs so much from forth 2023-05-09 07:45:31 Sure. 2023-05-09 07:45:43 It's not a Forth - it's a stack-oriented "something else." 2023-05-09 07:45:46 yeah xD 2023-05-09 07:46:11 But you've had fun, and I can tell you've learned some things. 2023-05-09 07:46:38 and it never stops changing 2023-05-09 07:46:51 I don't know how many times I've rewritten 2023-05-09 07:46:58 even when I'm not using rust 2023-05-09 07:47:15 Re: metacompiling, there are really two issues. When you do that, you have two dictionaries - the one you're running from and the one you're creating. 2023-05-09 07:47:31 but it will be cross compiling? 2023-05-09 07:47:34 So you need two pointers to the next free byte in the dictionary. 2023-05-09 07:47:37 so it will be on other machine? 2023-05-09 07:47:56 When you compile a meta word, you want to compile it into the new dictionary, and advance that dictionary's pointer. 2023-05-09 07:48:11 And you also want to SEARCH that dictionary for the words in the definition you're building. 2023-05-09 07:48:18 That's all - you just have to keep those things straight. 2023-05-09 07:48:43 You don't want the target dictionary pointing at anything in the host dictionary. 
2023-05-09 07:49:07 Meanwhile, though, your host system has to continue to run properly 2023-05-09 07:49:42 And if you've written the host system without having that in mind, it's probably not going to be awfully easy. 2023-05-09 07:50:04 so you just decide whether to add stuff on the host or target dictionary? 2023-05-09 07:50:17 I'm not expecting it to be particularly easy in my case here - I will probably have to make some adjustments to the host system I've got in order to make it work. 2023-05-09 07:50:40 that's why you wanted to make a new system 2023-05-09 07:50:57 Yes, correct. The "easiest" way would be to use different words to compile to the different dictionary, but you do want the target result to have the right names for its words and so on. 2023-05-09 07:51:05 Like, if you see : foo ... ; 2023-05-09 07:51:13 are you adding a word to the host, or to the target? 2023-05-09 07:51:29 I expected to have different syntax to decide that 2023-05-09 07:51:30 You have to have a way to tell the host which thing to do. 2023-05-09 07:51:39 well a flag would also do 2023-05-09 07:51:40 Then your new system has a different syntax. 2023-05-09 07:51:48 Yes - you need a flag-based way. 2023-05-09 07:52:13 So that the two systems can share various bits of naming/syntax. 2023-05-09 07:52:28 hmm 2023-05-09 07:52:32 could just be as "simple" as having words HOST and TARGET. 2023-05-09 07:52:43 TARGET : foo ... ; 2023-05-09 07:52:49 in my case metacompiling on the same system does not make much sense 2023-05-09 07:52:56 but "metatranspiling" does 2023-05-09 07:53:03 also transpiling 2023-05-09 07:53:15 haha I like the term 2023-05-09 07:53:19 metatranspiling 2023-05-09 07:53:29 Yes - I'm interested in building a new system that runs on the same machine, but once you've got this all set up and working that's not a "requirement" - you certainly could cross-compile. 
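The bookkeeping described above (two dictionaries, two "next free" pointers, and one flag that decides which pair is in use) could be modelled like this in Python. Everything here is a toy under assumptions: the `Dictionary` and `MetaCompiler` classes, the `here` counter, and the one-cell-per-reference sizing are invented for illustration and do not describe KipIngram's actual system.

```python
class Dictionary:
    def __init__(self, name, primitives=()):
        self.name = name
        # word name -> list of words in its body (primitives get an empty body)
        self.entries = {p: [] for p in primitives}
        self.here = 0  # stand-in for the "next free byte" pointer

    def add(self, word, body):
        self.entries[word] = list(body)
        self.here += len(body)  # pretend each compiled reference takes one cell


class MetaCompiler:
    def __init__(self, host_primitives, target_primitives):
        self.host = Dictionary("host", host_primitives)
        self.target = Dictionary("target", target_primitives)
        self.current = self.host  # the single host/target flag

    def select(self, which):
        # Plays the role of the HOST and TARGET words.
        if which not in ("HOST", "TARGET"):
            raise ValueError(which)
        self.current = self.host if which == "HOST" else self.target

    def colon(self, word, body):
        # Both the search and the compile go through the selected dictionary,
        # so a target definition can never point into the host dictionary.
        for ref in body:
            if ref not in self.current.entries:
                raise KeyError(f"{ref} not found in {self.current.name} dictionary")
        self.current.add(word, body)


mc = MetaCompiler(host_primitives=["dup", "+"], target_primitives=["dup", "+"])
mc.select("TARGET")
mc.colon("double", ["dup", "+"])
mc.select("HOST")
print(sorted(mc.host.entries))    # ['+', 'dup']
print(sorted(mc.target.entries))  # ['+', 'double', 'dup']
```

The same `colon` serves both systems, which is the point of the flag: the syntax is shared and only the wiring behind it changes.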
2023-05-09 07:53:39 in your case it does make sense 2023-05-09 07:53:45 in mine I just add overhead 2023-05-09 07:54:39 KipIngram: how would you cross compile? 2023-05-09 07:55:53 Well, in addition to keeping track of your two dictionaries, you also have the ability to write primitives for the other architecture. 2023-05-09 07:56:26 And you might have a different cell size, or endianness, or whatever on the new system - you just have to "handle all that." 2023-05-09 07:56:53 Like, when you compile a primitive, you use the right instruction set. 2023-05-09 07:57:56 So, 1) which dictionary do you search, 2) which dictionary do you add new stuff to, and 3) what processor / system platform architecture do you use. 2023-05-09 07:58:30 One host/target flag can tell you what you're doing - you just have to wire that flag to all the right bits of functionality. 2023-05-09 07:59:52 Also, you're probably saving this target image in disk buffers, and later you're going to want to load it to some arbitrary place in the target's RAM and run it. That's probably going to require some adjustment to the actual image to account for where you put it in RAM when you want to run it. 2023-05-09 08:00:05 So somehow your image has to tell you how to do all that. 2023-05-09 08:00:35 That's the same as on Linux, though - the loadable image format has records in it that tell Linux how to properly load it into RAM. 2023-05-09 08:00:35 hmm 2023-05-09 08:00:44 The ELF executable spec. 2023-05-09 08:01:18 so I just need a way to "insert" primitives 2023-05-09 08:01:22 In my case, for example, those tables that hold CFAs and PFAs will probably have "offsets" in the image, but when I load it I'll need to modify all those entries to make them addresses. 2023-05-09 08:01:35 Which will just mean adding the base address I load at to every entry. 
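The load-time fix-up described above, turning image-relative offsets into addresses by adding the load base to every table entry, is small enough to sketch in Python. Assumptions: 64-bit little-endian entries, a table that contains nothing but offsets, and the helper names `build_table`, `relocate_table`, and `read_table`, none of which come from the chat.

```python
import struct

ENTRY = struct.Struct("<Q")  # assumption: 64-bit little-endian table entries

def build_table(offsets):
    """Pack image-relative offsets the way they might sit in a saved image."""
    return bytearray(b"".join(ENTRY.pack(o) for o in offsets))

def relocate_table(table, base):
    """Rewrite every entry in place from an offset to base + offset."""
    for pos in range(0, len(table), ENTRY.size):
        (offset,) = ENTRY.unpack_from(table, pos)
        ENTRY.pack_into(table, pos, base + offset)

def read_table(table):
    return [ENTRY.unpack_from(table, pos)[0]
            for pos in range(0, len(table), ENTRY.size)]

table = build_table([0, 16, 40])
relocate_table(table, 0x400000)
print([hex(a) for a in read_table(table)])  # ['0x400000', '0x400010', '0x400028']
```

A real image would also need some record of which cells are offsets and which are plain data; here the whole table is assumed to be offsets, so no such list is needed.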
2023-05-09 08:01:37 well that's mainly the whole thing about transpiling 2023-05-09 08:02:13 what you explained kind of made a bulb on my head almost be on 2023-05-09 08:02:19 :D 2023-05-09 08:02:31 If you knew exactly where you were going to load it ahead of time, you could just output the addresses right into the image. But unless you're making something for some embedded device where you control the whole address space, you probably don't know that. 2023-05-09 08:02:38 In Linux there's no telling where it will get put. 2023-05-09 08:04:38 This is part of why you rarely see Forth allow you to "save compiled programs." It dodges that problem altogether - you just compile your source every time, and since you're compiling it directly to where it will run, you can just set those addresses as you go. 2023-05-09 08:04:56 On modern arches you want to put code separate to data, because most modern machines have separate instruction and data L1 caches 2023-05-09 08:05:09 Yes. 2023-05-09 08:05:30 Which raises the interesting question of "what is a colon definition"? 2023-05-09 08:05:42 To the processor it looks like data, at least on a threaded design. 2023-05-09 08:05:47 direct or indirect, that is. 2023-05-09 08:06:03 Yeah it's data unless you're STC 2023-05-09 08:06:13 Or return-threaded as well 2023-05-09 08:06:27 Well actually no sorry 2023-05-09 08:06:30 Just STC 2023-05-09 08:06:53 I generally put my primitives and my definitions in the same zone, but in a "separated" way. Right where they meet it might overlap, but "most" of the code is separated from "most" of the definition data. 2023-05-09 08:06:54 Return threaded code the colon def is just a list of code addresses, so it's like direct threading 2023-05-09 08:07:07 Yes. 2023-05-09 08:07:29 I could always stick an align at the boundary, I guess - align to a fresh cache line. 2023-05-09 08:07:40 But ideally they'd be in separate regions. 
2023-05-09 08:07:56 Because if I added new primitives to my system the way it is now, then they'd start to get mixed up. 2023-05-09 08:08:03 Unnecessary alignment sounds bad 2023-05-09 08:08:16 Yeah, I try not to do too much of it. 2023-05-09 08:08:54 But one between "all the code" and "all the defs" wouldn't be too bad. 2023-05-09 08:08:55 You probably want code unaligned and contiguous, only align really hot bits of code to the size of the instruction (i.e. loop restart point) 2023-05-09 08:09:06 Yes, I don't align within code. 2023-05-09 08:09:31 I do align between headers, so odd length names won't throw out the alignment of stuff coming after. 2023-05-09 08:09:38 Sections tend to have a reasonably large alignment 2023-05-09 08:10:09 I think often they start at 4KB alignment? 2023-05-09 08:10:12 I think so. Maybe 4k? I think the hardware memory permission stuff works on 4k page boundaries. 2023-05-09 08:10:15 Or 1KB at least 2023-05-09 08:10:51 Maybe as I do this new system I'll fully separate machine code from definitions. 2023-05-09 08:11:18 One of the advantages of developing it in Forth instead of nasm is that I can do anything of that sort I want, without having to figure out how to get nasm to cooperate. 2023-05-09 08:11:30 It's an optimisation that's not too hard, so probably worth doing 2023-05-09 08:15:40 I have actually converted ilo-amd64 to gas format because I got fed up with nasm 2023-05-09 08:16:04 And converting I found a bug where nasm had silently 'fixed' an invalid effective address 2023-05-09 08:16:24 Ooops. 2023-05-09 08:16:34 I haven't run into any terrible problems with nasm thus far. 
2023-05-09 08:16:39 So I will probably avoid nasm from now on 2023-05-09 08:16:48 The thing is that GNU as is just far more battle-tested than nasm 2023-05-09 08:17:32 It's got a very important job so it absolutely 'works' and supports things I think nasm is lacking like anonymous labels 2023-05-09 08:17:46 vms14: Another possible way to handle the meta compiling would be to just write a completely new target compiler, instead of trying to trick the system compiler into doing it for you. Something like this: 2023-05-09 08:17:52 TARGET{ 2023-05-09 08:17:59 : foo ... ; 2023-05-09 08:18:02 : bar ... ; 2023-05-09 08:18:05 ... 2023-05-09 08:18:09 } 2023-05-09 08:18:11 doesn't nasm represent local labels with a leading .? 2023-05-09 08:18:27 I'm not sure - I don't actually use them. 2023-05-09 08:18:30 unjust: Yeah and it generates symbols for them and there's no way to turn that off, which is annoying 2023-05-09 08:18:33 Sounds reasonable, though. 2023-05-09 08:18:36 not quite the 1b / 1f that gas offers though 2023-05-09 08:18:40 Yeah 2023-05-09 08:19:00 1b / 1f don't generate extra symbols 2023-05-09 08:19:14 vms14: So TARGET{ would just take control and wouldn't give it up until it saw that closing }. 2023-05-09 08:19:21 Better debugging output, less noise in the ELF 2023-05-09 08:19:37 And if you want to use } for something else, it could be TARGET} at the tail end. 2023-05-09 08:19:55 You can even use intel syntax in gas now 2023-05-09 08:20:11 Although I've not done that because 'when in Rome' 2023-05-09 08:20:14 That's a fair bit of code to write, though - TARGET{ would have to have a FIND and so on. 2023-05-09 08:20:22 KipIngram: but what target would really do 2023-05-09 08:20:25 But it would just completely solve the juggling issue. 
2023-05-09 08:20:34 everything I assume 2023-05-09 08:20:35 KipIngram: I think it's necessary to allow handling literals properly 2023-05-09 08:20:54 It would call WORD to parse words out of the source, search for them in the target dictionary, act on them to compile the results to the target dictionary, using the target architecture, etc. 2023-05-09 08:21:07 It would do pretty much what the normal system compiler does. 2023-05-09 08:21:08 I mean, not only put those definitions on the target, but has to care about everything 2023-05-09 08:21:29 Yeah, it has to do a fair bit. It's just a way to solve the bookkeeping problem cleanly. 2023-05-09 08:21:34 It's not that big though 2023-05-09 08:21:40 at least that way you can use your host normally 2023-05-09 08:21:47 and have this almost as an extension 2023-05-09 08:21:54 You've already had most of the work done for you, you just need to rewrite the high level compiler-interpreter 2023-05-09 08:21:55 No, it wouldn't be "awful," and if you've got the source for your host system you could capitalize on it quite a lot. 2023-05-09 08:22:20 I tend to add words like that in order to not mess with the interpreter code 2023-05-09 08:22:37 It's probably what I will do, given that I didn't really plan thoroughly for metacompile while writing my current system. 2023-05-09 08:22:45 I like the interpreter to be as dumb as possible 2023-05-09 08:22:51 It would be helpful to vectorise COMPILE, and LITERAL for this 2023-05-09 08:22:54 read a word, recognize it 2023-05-09 08:23:17 And you could have a word TARGET-LOAD such that you didn't have to actually have TARGET{ and TARGET} in your source blocks themselves. 2023-05-09 08:23:21 And then add new definitions for some words like IF/THEN 2023-05-09 08:23:48 Right - it would need to handle those properly as well. But it's a "clean problem" now, instead of a noodly mess. 2023-05-09 08:23:58 Hmmmm 2023-05-09 08:24:03 nasm - noodle assembler 2023-05-09 08:24:11 Heh. 
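The TARGET{ idea discussed above, a separate little compiler that takes control of the input and does not give it up until the closing }, might be sketched in Python as follows. This is a heavily simplified assumption-laden model: `target_block` is a made-up name, the dictionary is a plain dict of word lists, and literals, IF/THEN, and recursion are not handled at all.

```python
def target_block(tokens, target_dict):
    """Consume the tokens that follow TARGET{ up to the closing },
    compiling colon definitions into target_dict only.
    Returns the tokens left over for the host interpreter."""
    it = iter(tokens)
    for tok in it:
        if tok == "}":
            return list(it)  # hand the rest back to the host interpreter
        if tok != ":":
            raise SyntaxError(f"only colon definitions are handled here, got {tok!r}")
        name = next(it)
        body = []
        for word in it:
            if word == ";":
                break
            # Its own FIND: only the target dictionary is ever searched.
            if word not in target_dict:
                raise KeyError(f"{word} is not in the target dictionary")
            body.append(word)
        else:
            raise SyntaxError("definition is missing its closing ;")
        target_dict[name] = body
    raise SyntaxError("TARGET{ block is missing its closing }")


source = ": double dup + ; : quad double double ; } 1 2 +".split()
target = {"dup": [], "+": []}  # pretend these are target primitives
rest = target_block(source, target)
print(target["quad"])  # ['double', 'double']
print(rest)            # ['1', '2', '+']
```

Because the host's own compiler is never involved, no host/target flag is needed; the cost is that this loop has to reimplement whatever parts of the normal compiler the target source relies on.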
2023-05-09 08:24:52 It feels more tractable to me than fighting with my existing code trying to fix all the things I left out while writing it. 2023-05-09 08:26:39 until nasm allows you to prevent local symbol export, you could remove the locals with objcopy if they bothered you enough 2023-05-09 08:27:26 unjust: That's just one reason, I think gas is a better choice for many reasons 2023-05-09 08:27:48 I'm hoping I won't be going back to nasm anymore. 2023-05-09 08:30:00 vms14: Another reason I like this TARGET{ idea is that I am planning a pretty different dictionary layout, and the existing FIND for example assumes a particular layout that the new system won't have. 2023-05-09 08:30:19 It would get to be a real mess trying to make the existing FIND conditionally support either/or. 2023-05-09 08:30:51 I'm even planning a different xt size. 2023-05-09 08:31:28 I will try to write the new system with native metacompile support in mind, though. 2023-05-09 08:33:43 it could be better than just a flag 2023-05-09 08:33:54 but somehow equivalent 2023-05-09 08:34:18 anyways the more you put on words and less on the "core" the better 2023-05-09 08:34:22 Well, it eliminates needing a flag, since you're no longer coaxing your system into double duty. 2023-05-09 08:34:51 every word is a world 2023-05-09 08:35:00 Then if the new system will run on your host system, you could have TARGET-RUN, which would load the image from disk buffer into a run buffer and launch it. 2023-05-09 08:35:27 And then you'd be running the new system, and BYE would drop you back to the host system. 2023-05-09 08:35:47 Once you get it that far along. 2023-05-09 08:36:01 A common step for me in early development is just getting the system to run and exit without throwing an error. 2023-05-09 08:36:03 KipIngram: why did you want metacompiling? 2023-05-09 08:36:09 just cause it was interesting? 2023-05-09 08:36:12 That would work here too. 
2023-05-09 08:36:25 Well, yes, but mainly so that I no longer have to use external development tools. 2023-05-09 08:36:31 I'd like to "cut that tether." 2023-05-09 08:37:21 So I can see in the early phase of this trying to get to where I can TARGET-RUN and it just loads the image, runs it, and pops right back out. 2023-05-09 08:37:29 Just a "no error pass" through the new stuff. 2023-05-09 08:38:06 Just make sure you have offsite backups of your binaries because you can't build from source anymore 2023-05-09 08:38:44 For sure. 2023-05-09 08:39:37 I'd keep the last nasm source and the blocks.dat file that it can use to build the new system of course. 2023-05-09 08:39:51 But you're right - keeping several different options around all that would be good. 2023-05-09 08:40:54 And I'd want a "golden" image somewhere in blocks.dat too, because I'm sure I will break my dev image many times along the way. 2023-05-09 08:41:48 I imagine having a little Linux executable that lets me say loader " 2023-05-09 08:42:14 It would allocate a bunch of RAM, load the right part of blocks.dat into it, and jump to it. 2023-05-09 08:49:35 veltas: Here's question re: separating code and data. 2023-05-09 08:50:01 In this new plan of mine, I will have registers pointing at each of the tables (CFA and PFA pointer tables). 2023-05-09 08:50:35 In theory I wouldn't need a register pointing at the code / definition zone(s) - the CFA/PFA pointers themselves will point into those zones. 2023-05-09 08:51:24 In my existing system, though, I have a register pointing to that area, and I let it do double duty by putting NEXT first so that that register points to it too, and a jmp instruction serves as my next in primitives. 2023-05-09 08:51:48 To avoid allocating a third register, I'd contemplated putting next at the base of one of the tables. 2023-05-09 08:51:55 How big an issue do you see that as? 2023-05-09 08:52:03 Probably the most frequently executed code in the system. 
2023-05-09 08:52:18 We'd only be talking about one cache line. 2023-05-09 08:52:38 And none of the "data" in the table that was shared in that cache line will be written to, ever. 2023-05-09 08:53:04 It doesn't feel to me like it would be problematic, but I think you're better tuned into this aspect of things than I am. 2023-05-09 08:53:23 Wouldn't I just wind up with copies of that line in both caches? 2023-05-09 08:54:15 And there I would put an align after the next code, just so all the table entries were aligned. 2023-05-09 08:54:22 Probably only a 16-bit align, though. 2023-05-09 08:54:35 No wait - sorry. A 64-bit align. 2023-05-09 08:54:42 Table entries are 64 bits. 2023-05-09 08:54:48 xt's in definitions will be 16 bits. 2023-05-09 08:55:15 So that would look like 2023-05-09 08:55:28 <64 bit align>

... 2023-05-09 08:55:56 Then all the rest of the machine code would go off over in the machine code section. 2023-05-09 08:59:18 Not 100% sure what you mean by putting NEXT at bottom of table 2023-05-09 09:01:34 I'll have a region for a table of CFA addresses. A register will point to it. I'm just considering putting that one bit of code - the code for NEXT - right at the start of that region, and then what followed would be table entries. 2023-05-09 09:01:48 This is just to have an already-allocated register pointing to the next code. 2023-05-09 09:02:01 Instead of allocating a third register, above and beyond the two for the tables. 2023-05-09 09:02:25 In a perfect world I'd just have a third register point to the machine code region and put next there. 2023-05-09 09:02:34 Just trying to save a register. 2023-05-09 09:02:45 I use practically all of them for various things. :-) 2023-05-09 09:03:22 Which both pleases me (I'm taking advantage of the resources) and irritates me (more to save on thread switches). 2023-05-09 09:04:28 Oh, by the way, speaking of thread switches... most of the stuff I've dredged up so far on memory allocation doesn't account for multi-threading. Probably pretty important to get that right so that I avoid a bunch of locking and contention around memory resources. 2023-05-09 09:06:17 Anyway, if I put that bit of code at the base of the table, then I have one place where code is right by data. My guess is that that line would just wind up in the instruction cache and the data cache, but it seems to me that ought to be ok since that data will really never get written and thus invalidated. 2023-05-09 09:06:36 Kicking next out of the cache would be the worst thing I could do. 2023-05-09 09:07:24 I see that that would be wasting most of one line of instruction cache, but it would be only one. 2023-05-09 09:07:27 How long is your next code anyway? 2023-05-09 09:07:38 Only 10-11 bytes. 
2023-05-09 09:07:46 I'd just inline it 2023-05-09 09:07:52 Well, maybe a bit longer than that, but it's pretty short. 2023-05-09 09:07:52 That's not long 2023-05-09 09:08:03 Can you paste the code somewhere and link it? 2023-05-09 09:08:09 I think it's nine instructions. 2023-05-09 09:08:16 Yeah, one second. 2023-05-09 09:10:19 https://pastebin.com/rVWYz5gM 2023-05-09 09:10:35 That does next, docol, and the tick counting for thread switching. 2023-05-09 09:10:58 TICXT is the xt of the code that actually does the thread switching, that gets called now and again. 2023-05-09 09:11:37 So next itself is only two instructions. 2023-05-09 09:11:44 But all of that is 9. 2023-05-09 09:12:43 For this to work, the high bits of rax must be kept zero. 2023-05-09 09:12:56 Because lodsw doesn't clear them when it picks up a 16-bit quantity. 2023-05-09 09:13:05 lodsd, 32 bits, does. 2023-05-09 09:13:21 So any primitive that uses rax is responsible for clearing it before next-ing. 2023-05-09 09:13:51 This actually doesn't decrement the tick counter every next - it decrements it every docol. 2023-05-09 09:14:08 But the code was just particularly clean this way; decided I liked it. 2023-05-09 09:15:12 What are the register assignments? 2023-05-09 09:15:36 I'm not a huge fan of renaming registers in assembly :( 2023-05-09 09:17:46 Oh, sorry. Um, rrIP is rsi, so that lodsw will work. rrW is rax. 2023-05-09 09:17:52 The others don't really matter. 2023-05-09 09:18:08 It's just that lodsw mechanism that drives those to "musts." 2023-05-09 09:20:35 UGH. 2023-05-09 09:20:41 Those *two* musts. :-( 2023-05-09 09:21:16 I particularly liked the way it worked out for handling the tick call by just making an extra pass through docol. 2023-05-09 09:21:30 It just "slips in" a call to TICXT. 2023-05-09 09:22:10 So I can write that all in Forth, and just vector it in. 2023-05-09 09:22:42 And if I wanted an uninterrupted period I could just revector it to a noop. 
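The point above about keeping the high bits of rax zero can be checked with a tiny Python model of what the two load instructions do to the register. Only the effect on RAX is modelled (the rsi update and the memory read are left out), and the function names simply mirror the instruction mnemonics.

```python
MASK64 = (1 << 64) - 1

def lodsw(rax, value16):
    """Model of LODSW on x86-64: only AX (the low 16 bits) is replaced;
    bits 16..63 of RAX keep whatever they held before."""
    return (rax & ~0xFFFF & MASK64) | (value16 & 0xFFFF)

def lodsd(rax, value32):
    """Model of LODSD on x86-64: writing EAX zero-extends into RAX,
    so the previous contents of the high 32 bits are discarded."""
    return value32 & 0xFFFFFFFF

clean = lodsw(0, 0x1234)
dirty = lodsw(0x0000_0001_0000_0000, 0x1234)
print(hex(clean))  # 0x1234
print(hex(dirty))  # 0x100001234
print(hex(lodsd(0xFFFF_FFFF_FFFF_FFFF, 0x1234)))  # 0x1234
```

With a 16-bit xt picked up by lodsw, any stale high bits would end up in whatever index or address is computed from rax next, which is why each primitive that touches rax has to clear it before jumping to next.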
2023-05-09 09:23:24 Well, I guess it would need to at least re-load the counter. 2023-05-09 09:36:41 But anyway, the real evil of mixing code and data would be writing to the data and evicting the code from the cache, wouldn't it? And also the fact that the data would take up some space in an instruction cache line? 2023-05-09 09:38:22 Yeah it's bad for performance 2023-05-09 09:38:48 But if I can guarantee those data items will never be written to, I'd avoid that eviction, right? 2023-05-09 09:39:06 I'm willing to carry a part of one cache line not used for code. 2023-05-09 09:41:48 I'd expect the NEXT code to frequently be cached, it containing mostly data would probably be bad 2023-05-09 09:42:05 Probably hard to notice without profiling though 2023-05-09 09:42:56 Well, I always could align the table data up to the next cache line, I guess. Then that part of the next cache line would just be wasted, but it's already wasted the other way too, at least insofar as the instruction cache goes. 2023-05-09 09:45:45 0 1 1000000 range [ + ] do.list . => 1.47 real 2023-05-09 09:46:00 it improved a bit, it was 3 seconds last time 2023-05-09 09:46:17 Hey, that's more than "a bit." 2023-05-09 09:46:24 Cutting in half is good. 2023-05-09 09:46:32 it's because of the fake compile 2023-05-09 09:46:43 it mainly avoids the interpreter overhead 2023-05-09 09:46:57 the interpreter overhead is at compile time 2023-05-09 09:47:12 the recognition of every word, etc 2023-05-09 09:47:25 Sure. 2023-05-09 09:47:38 but still it's wrong 2023-05-09 09:47:57 and defeats the purpose of making the whole fake compiling 2023-05-09 09:48:06 I need to cache the compiled versions 2023-05-09 09:48:36 this will remove a flaw it has, which is that dynamic behavior of words, using always the last definition 2023-05-09 09:48:53 Caching is a funny word to use for it - to me that's just compiling to the dictionary. 
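One way to read the "flaw" mentioned above (words always using the last definition) is as late binding versus binding at compile time. The Python sketch below contrasts the two; `define_late`, `define_early`, and the trace-returning words are invented for illustration and are not vms14's implementation.

```python
words = {}  # name -> callable returning a list of trace strings

def define_late(name, body_names):
    # Looks every name up each time the word runs, so redefining an
    # inner word silently changes the behaviour of older words.
    words[name] = lambda: [r for n in body_names for r in words[n]()]

def define_early(name, body_names):
    # Resolves names once, at definition time, and stores the result:
    # the "cached compiled version", or simply compiling to the dictionary.
    resolved = [words[n] for n in body_names]
    words[name] = lambda: [r for fn in resolved for r in fn()]

words["greet"] = lambda: ["hello"]
define_late("late", ["greet"])
define_early("early", ["greet"])

words["greet"] = lambda: ["goodbye"]  # redefine the inner word

print(words["late"]())   # ['goodbye']
print(words["early"]())  # ['hello']
```

The early-bound version also skips the per-run lookups, which is where the interpreter overhead went in the first place.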
2023-05-09 09:48:58 yeah xD 2023-05-09 09:49:08 caching for me is because they end in a hash table 2023-05-09 09:49:11 isn't the word for that memo'isation? 2023-05-09 09:50:06 Yeah, that's used to. I guess we could think of a Forth definition as memoizing. 2023-05-09 09:50:28 too 2023-05-09 09:50:35 vms14: congrats on 2x speed up 2023-05-09 09:50:39 Having problems with "to's" this morning. :-| 2023-05-09 09:51:04 unjust: the js version stills being much faster even without that fake compile 2023-05-09 09:51:05 xD 2023-05-09 09:51:12 js takes 0.5 seconds for that 2023-05-09 09:51:29 but I wasn't aiming at performance 2023-05-09 09:51:41 I need to get rid of a flaw, and compiling is one of the solutions 2023-05-09 09:51:46 "compiling" 2023-05-09 09:54:00 Yeah I agree with KipIngram, sounds like you're just reinventing compiling and dictionaries 2023-05-09 09:54:41 but it's not efficient anyways 2023-05-09 09:54:46 0 1 1000000 range [ 1 2 3 drop drop drop + ] do.list . 2023-05-09 09:55:09 this takes 3.57 real 2023-05-09 09:55:48 running on what though? 2023-05-09 09:56:00 unjust: what? 2023-05-09 09:56:08 ah, on a gpd micropc 2023-05-09 09:56:15 want to benchmark? 2023-05-09 09:56:43 https://termbin.com/t7hj 2023-05-09 09:56:52 time perl this.file 2023-05-09 09:56:58 veltas: Regarding register aliasing, I totally hear what you're saying. My reasoning, though, was that I was aiming at code that would eventually evolve into something portable. What I didn't show you when I posted that was another column of "portable code" over on the right, that I had toyed with. 2023-05-09 09:57:06 it contains that [ 1 2 3 drop drop drop + ] code 2023-05-09 09:57:17 And I certainly found it easier while writing primitives to not have to remember what register I'd picked for what purpose. 2023-05-09 09:57:42 The aliases I gave them were really easy for me to keep in mind. 
2023-05-09 09:59:15 unjust: the system is netbsd, I don't know how much would differ if it was linux 2023-05-09 09:59:38 and perl is v5.36.0 2023-05-09 10:01:26 I expect most of the other computers have a better result 2023-05-09 10:01:41 this is a computer that fits on my pocket :D 2023-05-09 10:02:39 KipIngram: Yeah I just find it easier in my experience to bite the bullet and memorise the allocation 2023-05-09 10:02:41 real 0m53.781s 2023-05-09 10:02:51 Linux 4.9.226-perf on armv7l 2023-05-09 10:02:53 :-) 2023-05-09 10:03:05 Which was a bit annoying in ilo-amd64 because I changed my allocation half way through :) 2023-05-09 10:03:20 perl v5.36.0 2023-05-09 10:03:30 When I realised it was probably worth using rbx for the data stack register 2023-05-09 10:03:40 And r15 for memory pointer 2023-05-09 10:04:50 real 0m3.576s (on x86-64) 2023-05-09 10:08:07 xd 2023-05-09 10:08:25 most likely that build of perl isn't as optimised as well as it could be for armv7l, but not sure it would get near 4s even if it were 2023-05-09 10:09:08 it's a fake compilation, the best way to get performance on a perl implementation would be to write perl code 2023-05-09 10:09:16 I'm also interested on that 2023-05-09 10:09:24 What sort of armv7l machine is it? 2023-05-09 10:09:36 I tried to do it and it kind of worked xD 2023-05-09 10:12:56 veltas: a phone with an armv7l targetted distribution of android, but running on qualcomm snapdragon 215 (armv8 native) 2023-05-09 10:14:20 vms14: Is your abomination code online somewhere? 2023-05-09 10:14:41 it's in the last link he posted 2023-05-09 10:14:43 veltas: only in form of a paste 2023-05-09 10:15:06 https://termbin.com/6c07 now it's almost the same 2023-05-09 10:15:20 but it's "caching" the compiled version of a word 2023-05-09 10:15:32 still it's the same performance for that code 2023-05-09 10:16:06 Have you rewritten it in C? 
2023-05-09 10:16:09 no 2023-05-09 10:16:17 I was going to, but nope 2023-05-09 10:16:40 I realized I have no reason to use any other language 2023-05-09 10:16:52 except maybe js for getting it into the browser 2023-05-09 10:17:07 I don't want performance, so why I would want C 2023-05-09 10:17:14 I only want Xlib from C 2023-05-09 10:17:23 I can get it anyways 2023-05-09 10:18:18 Other advantages of C or asm would be to reduce dependencies, produce smaller overall stack 2023-05-09 10:18:32 Makes it easier to fix things and maintain it properly 2023-05-09 10:18:37 the only dependency is perl 2023-05-09 10:18:43 and it makes it quite portable 2023-05-09 10:18:57 even if I want to use external libraries 2023-05-09 10:19:29 It's a toy language after all :D 2023-05-09 10:19:32 Yeah 2023-05-09 10:19:48 it has environments also, which mean closures and alike 2023-05-09 10:19:48 There's nothing wrong with that, it's your own recreational thing 2023-05-09 10:20:11 you can put words on the environment 2023-05-09 10:20:19 and every word has its own environment 2023-05-09 10:20:35 they inherit from other words 2023-05-09 10:20:57 C only gets on the way to implement those features 2023-05-09 10:21:11 and gives me performance in return, which I don't want 2023-05-09 10:21:12 :D 2023-05-09 10:22:07 I wonder how well it runs on my laptop 2023-05-09 10:22:24 It's an Intel Core 2 Duo 2023-05-09 10:22:28 So you're basically treating each word like its own process? 2023-05-09 10:22:46