2023-06-14 01:20:21 there is a link between concatenative programming and currying
2023-06-14 01:22:09 in both systems you pass an environment from left to right transforming it as you go
2023-06-14 01:23:11 with currying you build structure by consing tuples of anonymous type
2023-06-14 01:23:48 with forth state is ! and @ from computer memory
2023-06-14 01:39:16 I think this is related to making a purely consistent purely reversible forth
2023-06-14 02:15:32 what do you mean by the last bit: purely consistent purely reversible?
2023-06-14 02:32:58 mathematical consistency like lean or like coq
2023-06-14 02:33:16 reversibility for a quantum computer
2023-06-14 02:33:33 maybe also pure immutability
2023-06-14 02:46:28 not familiar with those languages or quantum computing, but i'm interested to see what an implementation that meets the criteria would be like
2023-06-14 02:50:15 would all data consumed and produced be restricted to the data/parameter stack only? no variable/dictionary/heap access allowed? would you be able to compile words at all?
2023-06-14 07:15:33 My take on that is that the entire computer state is regarded as the thing evolving, not just the stack.
2023-06-14 07:16:12 unjust: In quantum theory all processes of change have a mathematical trait called unitarity. A key feature of a unitary transformation is that it is reversible - no information is ever destroyed.
2023-06-14 07:17:31 So all of the "gates" in a quantum computer are fully reversible operations. It turns out that in computing theory it's actually the erasure of information that causes you to necessarily consume energy. In theory reversible processes can be reduced to zero energy consumption (though you do them very very slowly to get there).
2023-06-14 07:18:04 This really has nothing to do with the way our current computers consume energy - we're nowhere even close to that "theoretical energy consumption rate."
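The point above about erasure costing energy is usually stated as Landauer's principle. As a rough worked figure (the room temperature of 300 K is an assumption for illustration, not something from the discussion), the minimum energy to erase one bit would be:

```latex
E_{\min} = k_B T \ln 2
         \approx \left(1.380649 \times 10^{-23}\ \mathrm{J/K}\right)\left(300\ \mathrm{K}\right)\left(0.6931\right)
         \approx 2.87 \times 10^{-21}\ \mathrm{J}
```

That is a lower bound per erased bit only; it says nothing about what a real machine actually spends, which matches the remark that current hardware is nowhere near it.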
2023-06-14 07:19:15 It's ironic that quantum processes never lose information, and yet when we make a measurement using "basic quantum mechanics" we throw away the prior quantum state and replace it with a new one that's related to the measurement device.
2023-06-14 07:19:23 That doesn't seem like "not losing information."
2023-06-14 07:20:26 But it turns out that's just an approximation - in the real world that "thrown away" information in fact gets smeared out in the state of the measuring device, you, and the rest of the world. It's still there - it's just lost in the environment in a way that we can't get it back "practically speaking."
2023-06-14 07:20:48 What we're really doing there is treating our little system as quantum but the measurement device as classical.
2023-06-14 07:21:10 So tossing aside that information is a "cheat" but one that's very very close to what actually happens.
2023-06-14 07:21:56 It's similar to saying that all of the air molecules in the room *do* have specific positions and velocities, but you certainly don't and can't know what they all are.
2023-06-14 07:24:00 Basic quantum mechanics really isn't a precise model of "the world" at all - you're deliberately treating most of the world as classical, while paying "quantum attention" to just the one little bit that you're focused on.
2023-06-14 07:25:41 People get all mystified about "quantum collapse," as though it's some mystical process. But think about it - you have this tiny little quantum thing, like an electron or something, and you suddenly interact it with a ginormous measuring device that's big enough for you to read something off of with your eyes. Which thing do you THINK is going to win that contest? It's no surprise at all that the
2023-06-14 07:25:43 instrument dominates the next state.
2023-06-14 07:26:10 "Collapse" is the process of forcing the little quantum system into compatibility with the big measuring device.
2023-06-14 07:28:33 They've held whole conferences on "Quantum theory without observers," but they've never really come up with anything terribly profound. Basic quantum mechanics pretty much requires an "observer," and you treat the observer as a classical entity.
2023-06-14 07:29:31 Sometimes folks talk about "the wave function of the universe," but that's not a very helpful idea because the basic math process of the theory requires that classical / quantum divide somewhere in the picture.
2023-06-14 07:37:12 Anyway, if you think about it any Forth word at all could be thought of in the form
2023-06-14 07:37:27 system(t+1) = f(system(t))
2023-06-14 07:37:48 where system(t) is the whole computer state.
2023-06-14 07:38:26 For a lot of words you could restrict that to the stack, but not for all of them.
2023-06-14 07:39:12 If you focus only on the stack, then DROP loses information, but if you focus on the whole computer it doesn't - it's just a pointer increment/decrement.
2023-06-14 07:39:48 ! loses information regardless though, since it overwrites something.
2023-06-14 07:42:41 When a quantum computer runs, you wind up with "garbage" which is the stuff that's no longer of primary interest but holds the information you'd need to reverse the operations. Take an AND gate for example - the quantum equivalent has two bits of output - the result of the AND and some "leftovers." With the result and the leftovers you could "un-AND" and get back your original inputs.
2023-06-14 07:43:19 You have to deal with that garbage properly to get the results you want from the computer.
2023-06-14 07:45:33 The usual way of running a quantum computer is to put the inputs into a "superposition" of all possible states. And input possibilities that lead to the same output you'd like to "interfere" with each other, but they won't if they make different garbage - only when you've dealt with that garbage properly will you get the output with the interference you want.
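The "un-AND" idea above can be made concrete with the Toffoli gate, which is the usual textbook reversible AND. This is only a classical-bit sketch in Python, not a quantum simulation. Note that the Toffoli gate carries three bits in and three bits out (both inputs pass through, plus a target bit), which is slightly more than the "two bits of output" in the description above. The function name is just a label chosen for this sketch.

```python
def toffoli(a, b, c):
    """Reversible AND: flip the target bit c when a and b are both 1.

    The inputs a and b pass through unchanged; they are the "leftovers"
    that make it possible to undo the operation.
    """
    return a, b, c ^ (a & b)


# With the target bit preset to 0, the third output is a AND b.
for a in (0, 1):
    for b in (0, 1):
        out = toffoli(a, b, 0)
        assert out[2] == (a & b)
        # The gate is its own inverse: applying it again restores the input.
        assert toffoli(*out) == (a, b, 0)

# Contrast: a plain AND sends three different input pairs to 0,
# so the result alone cannot tell you which pair you started with.
plain_and_preimages_of_zero = [
    (a, b) for a in (0, 1) for b in (0, 1) if (a & b) == 0
]
```

The last list has three entries, which is the information a plain AND throws away and the Toffoli gate keeps.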
2023-06-14 07:46:09 If that sounds fuzzy it's my fault - I don't REALLY know this stuff well enough to actually work with it. But those ideas are "in there" somewhere.
2023-06-14 07:47:57 If you let that garbage slip away then you can't do the right stuff to it to get the answer you want. And that's why interaction with the environment is so bad in a quantum computer - you lose the ability to work further with the info that slips away.
2023-06-14 08:13:39 thanks for the explanation
2023-06-14 08:50:47 Well, wish I could make it a little better. It's "vaguely on track."
2023-06-14 08:52:34 It's always just amused me a little that so much "popular media" stuff on this general domain tries to draw these big sweeping philosophical conclusions about the whole world from this stuff, when actually it's a very "rough" model to start with. Just not made to talk about "the whole world."
2023-06-14 08:55:07 i feel like i need an interactive assembler right now
2023-06-14 08:55:26 fasmg is nice and all that but sometimes i feel like i need some direct feedback
2023-06-14 08:56:34 asmrepl only does x86-64 so it doesnt really help me out with what i'm currently doing
2023-06-14 08:57:40 i'm too early in my assembly learning to have an assembler powered by forth
2023-06-14 08:58:44 The main puzzle on a Forth-based assembler is just seeing some kind of pattern in the encoding of the instructions. How easy that would be would depend a lot on the particular instruction set.
2023-06-14 09:00:04 It would be neat if there was a site like this one:
2023-06-14 09:00:06 https://defuse.ca/online-x86-assembler.htm#disassembly
2023-06-14 09:00:19 compiler explorer, probably?
2023-06-14 09:00:19 that had some kind of a program accessible API.
2023-06-14 09:00:36 actually, there's an implementation of godbolt on emacs
2023-06-14 09:00:46 You could potentially wire such a thing into your Forth, so that it was doing the encoding for you.
2023-06-14 09:00:48 https://gitlab.com/jgkamat/rmsbolt
2023-06-14 09:00:56 Oh, interesting.
2023-06-14 09:01:31 can disassemble assembly
2023-06-14 09:01:41 to assembly, rather
2023-06-14 09:02:30 objdump seems to be the thing
2023-06-14 09:04:26 "Allows targeting custom compilers - which means disassembly for niche assembly targets, specific commits of compilers, and patched libraries or compilers is possible."
2023-06-14 09:04:28 I managed to get a fair subset of x64 encoding sussed out just by staring at it, but it was kind of tedious.
2023-06-14 09:04:35 aight i'm going to hack this one up to do assembly later today
2023-06-14 09:06:10 so, godbolt does exactly the same as the site you linked
2023-06-14 09:06:21 it gives you the encodings
2023-06-14 09:06:54 I'll have to take a look. I'd really prefer to have a self-contained setup, but it's a nice ace to have in a hip pocket.
2023-06-14 09:07:17 We're fortunate that we don't really need the WHOLE x64 instruction set.
2023-06-14 09:07:53 thankfully, but if you need the extensions, they're there.
2023-06-14 09:08:58 godbolt can be self-hosted, mind you
2023-06-14 09:09:14 but its a bit heavy
2023-06-14 09:11:03 https://dogbolt.org/
2023-06-14 09:11:05 neat
2023-06-14 09:11:11 there's godbolt but for decompilation
2023-06-14 09:14:49 Yeah, if you were trying to do something really fancy you might need the more esoteric stuff.
2023-06-14 09:15:05 I usually am thinking primarily about just building a basic Forth system.
2023-06-14 09:15:07 hmm, one of the godbolt developers expressed some minor interest in supporting forths
2023-06-14 09:15:15 :-)
2023-06-14 09:16:47 i found out when i joined the discord server for it
2023-06-14 09:20:50 Given godbolt's extensive capabilities, it's hard to see how it would be anything but heavy.
2023-06-14 09:24:17 to be fair, its the deployment requirements
2023-06-14 09:25:15 you need nodejs and typescript and then you have to download the dependencies for each language you want to use
2023-06-14 09:25:25 then it runs these on containers i believe?
2023-06-14 09:25:40 hmm, actually, it requires them to be installed locally
2023-06-14 09:26:05 quite well built however.
2023-06-14 09:26:08 Hmmm.
2023-06-14 09:26:10 https://www.youtube.com/watch?v=GyV_UG60dD4
2023-06-14 09:26:41 https://github.com/compiler-explorer/compiler-explorer
2023-06-14 09:30:51 https://godbolt.org/noscript there is a javascript free version of the website
2023-06-14 09:31:11 which is nice i suppose, but doesnt change that it is heavy on dependencies
2023-06-14 09:32:49 are you just trying to get something that will spit out machine code for each instruction in assembly that you feed it?
2023-06-14 09:37:28 this requires just objdump, a shell, and some utilities you probably already have: https://termbin.com/ylir
2023-06-14 09:37:33 yes.
2023-06-14 09:37:41 that's what kip's been doing with the website he linked earlier
2023-06-14 09:37:59 the website even quotes the tooling required
2023-06-14 09:38:29 real cool stuff though
2023-06-14 09:39:27 in terms of those online services, found dogbolt recently - pretty useful if you want to throw a binary at a few decompilers and not have to host them yourself
2023-06-14 09:39:34 yeh, dogbolt.
2023-06-14 09:40:04 i didnt know about some of these decompilers
2023-06-14 09:41:00 https://github.com/decompiler-explorer/decompiler-explorer
2023-06-14 09:41:01 neato.
2023-06-14 09:41:05 binary ninja seems to give the most sane output for linux x86 elf binaries of that log
2023-06-14 09:41:14 lot*
2023-06-14 09:42:07 this makes me want a whole assembler toolchain in a single binary with minimal dependencies
2023-06-14 09:42:25 a hackable toolchain, even.
2023-06-14 09:43:08 i also hate the bloat that comes with most diagnostic tools, or tools in general nowadays
2023-06-14 09:43:47 (someone was complaining that a cli tool would download chromium, spin it up, send a request to it, get a file out, then turn off chromium)
2023-06-14 09:44:20 thrig: what tool was that?
2023-06-14 09:45:46 https://thrig.me/tmp/a-rant.txt
2023-06-14 09:46:04 fasmg represents a sort of ideal state where everything is macros, and when macros are too slow, it compiles macros to speed it up
2023-06-14 09:46:42 so you can do preprocessor-like things with it
2023-06-14 09:47:13 unjust: Yeah, the modern philosophy is an "everything including the kitchen sink" approach. Just throw it in - it might come in handy.
2023-06-14 09:47:14 which allows for writing tooling with only a single binary
2023-06-14 09:47:46 i forgot the name of the tool that uses chromium for javascript test harnesses
2023-06-14 09:48:29 obsidian is basically a giant pile of javascript and html on top of a browser engine
2023-06-14 09:49:27 selenium is what i'm thinking of
2023-06-14 09:49:49 browser automation software
2023-06-14 09:50:24 selenium is probably the only tool i can think of where spawning a browser from a script for the purpose of retrieving remote data makes sense
2023-06-14 09:50:35 sure
2023-06-14 09:50:45 that one i wont complain about
2023-06-14 09:52:00 but having said that, you can definitely scrape tons of stuff with just wget/fetch with sed and/or awk
2023-06-14 09:53:33 no need to parse html+javascript and arbitrary json objects, when there are tokens (arbitrary strings) always present in some fairly fixed relation to the info that you want
2023-06-14 10:18:29 proof of concept: https://raw.githubusercontent.com/jhswartz/search/master/search-youtube
2023-06-14 10:19:08 yt-dlp breaks as they fiddle with the site, though
2023-06-14 10:20:23 this just uses wget + sed + awk, it's been ok for a few years using the same scheme
2023-06-14 10:21:59 example
output @ https://raw.githubusercontent.com/jhswartz/search/master/README
2023-06-14 12:52:16 hmm, setting up compiler explorer was surprisingly easy
2023-06-14 12:54:30 its pretty much just nodejs plus the compilers you want to use
2023-06-14 12:55:50 painless to selfhost
2023-06-14 12:57:16 (I recently tossed the last node dependency I had, which also got rid of openssl)
2023-06-14 13:03:50 unjust: objdump is very useful.
2023-06-14 13:04:02 tyvm
2023-06-14 13:05:20 the snippet you linked is missing just the flag for controlling the syntax output
2023-06-14 13:05:54 it otherwise defaults to at&t syntax
2023-06-14 13:07:15 -M intel should do the trick
2023-06-14 13:07:33 there's also intel-mnemonic
2023-06-14 13:07:57 doesnt seem to look any different though
2023-06-14 13:40:29 drakonis: ah, i didn't include that because i prefer at&t over intel syntax. glad to hear it worked out though.
2023-06-14 13:40:41 fair.
2023-06-14 13:41:09 emacs vs vi, at&t vs intel, gnome vs kde, ...
2023-06-14 13:41:43 i dont exactly have a world of choices with my current assembler with regards to syntax
2023-06-14 13:42:20 not really an issue if you use that with an objdump built for a cross-compilation toolchain though, then you're lumped with whatever the GNU assembler thinks is appropriate for your target arch
2023-06-14 13:46:33 also, how do you begin writing an assembler anyways?
2023-06-14 13:46:52 how does one write an assembler in the first place
2023-06-14 13:47:03 ed(1)
2023-06-14 13:47:20 or there's probably compiler (dragon) books about this
2023-06-14 13:47:44 i'm curious now because fasm's author doesnt exactly take patches unless its for minor changes and anything else discussed in the related forums
2023-06-14 13:50:02 not feeling sufficiently defiant to do that yet
2023-06-14 13:55:34 maybe rolling out one with smithforth might be a fun time
2023-06-14 14:02:27 1. determine the instruction encodings used for the desired target cpu, 2.
select or devise an assembly language syntax that appeals to you, 3. determine how instructions formatted according to (2) need to be transformed into the format dictated by (1), 4. write a parser that implements (3)
2023-06-14 14:07:40 then decide if you need features that make the programmer's life more comfortable, like allowing the assembler to keep track of origin and location, or if you want to handle labels and/or generate symbols. maybe you want to implicitly calculate offsets for things like branches and references based on labels or absolute numbers when underlying instructions require relative offsets
2023-06-14 14:08:38 maybe you want macros, or maybe you feel the need to implement features that a linker might normally handle
2023-06-14 14:08:40 It's pretty easy to calculate branch offsets. One way or another you save an address on the stack, and then finish up at the other end.
2023-06-14 14:09:07 You're either setting an offset for a forward jump or just marking the location for a later back jump to hit.
2023-06-14 14:09:15 Forth and its stack is just lovely for that.
2023-06-14 14:12:48 yep.
2023-06-14 14:13:02 so it might be a lot easier if i just do it in a very small forth
2023-06-14 14:13:25 i want to have macros and linker features
2023-06-14 14:13:32 begone linker dependencies
2023-06-14 14:17:18 it seems well documented
2023-06-14 14:18:05 you'll also need some reference material about the binary format you want to output, and some diagnostics to make it easy to verify the output is sane, if you aren't content with just producing raw machine code
2023-06-14 14:18:06 Yes, that's exactly what I'm going for. I don't want any other tool dependencies - I want to be able to operate entirely within my own system.
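The branch-offset trick described above (save an address on the stack, finish up at the other end) might look roughly like this. It is a Python sketch rather than Forth, with a plain list standing in for the data stack. It uses the x86 short jump (opcode 0xEB followed by a signed 8-bit offset counted from the end of the jump instruction). All function names here are invented for the illustration, and there is no range checking.

```python
code = bytearray()   # the bytes assembled so far
stack = []           # stands in for the Forth data stack


def jmp_forward():
    """Emit JMP rel8 with a placeholder offset and remember the offset slot."""
    code.extend(b"\xEB\x00")
    stack.append(len(code) - 1)


def resolve_forward():
    """Patch the pending forward jump so it lands at the current position."""
    slot = stack.pop()
    # rel8 is counted from the byte after the jump instruction (slot + 1).
    # Sketch only: assumes the distance fits in a signed byte.
    code[slot] = len(code) - (slot + 1)


def mark_back():
    """Remember the current position as the target of a later backward jump."""
    stack.append(len(code))


def jmp_back():
    """Emit JMP rel8 back to the most recently marked position."""
    target = stack.pop()
    rel = target - (len(code) + 2)   # the jump itself is 2 bytes long
    code.extend(bytes([0xEB, rel & 0xFF]))


# Forward jump over three NOPs, then a loop of one NOP jumping back to itself.
jmp_forward()
code.extend(b"\x90\x90\x90")
resolve_forward()
mark_back()
code.extend(b"\x90")
jmp_back()
```

Both directions use the same stack discipline: the forward case pushes the location of the hole to fill in later, the backward case pushes the destination itself.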
2023-06-14 14:18:12 https://www.youtube.com/playlist?list=PLZCIHSjpQ12woLj0sjsnqDH8yVuXwTy3p
2023-06-14 14:18:31 https://www.youtube.com/playlist?list=PLZCIHSjpQ12wX5m6q4dQNQcmmjq9oF3or
2023-06-14 14:18:41 "Linker" stuff - for me that has mostly to do with migrating items between "offset form" and "address form."
2023-06-14 14:19:25 Some things need to be addresses, but you don't know 'em until you know where you're putting it in your address space. So in the disk resident format they get offsets instead, and then get changed to addresses at load time.
2023-06-14 14:20:18 You have to invent a way to know where those things are - in my current plan it's "the tables." Every cell of either table is such a field, and only 2-3 other things need to become addresses - I will handle those with case-specific code.
2023-06-14 14:43:24 no time like the present.
2023-06-14 14:48:20 I think that I want to assemble into blocks in a way that is immediately executable, and then have a word to "squash" that into a final disk resident image. That means I'd assemble addresses, squash to offsets, and load offsets->addresses.
2023-06-14 15:04:10 And, well, of course I want to be able to assemble into my running dictionary too - that last was talking about meta- stuff.
2023-06-14 15:04:55 But assembling into the running dictionary and assembling into a target dictionary will be almost the same thing - just different places to deposit the stuff.
2023-06-14 15:21:06 I want to be able to assemble a metacompile target into blocks, and then "run it," as its own stand alone system - then saying bye should send me back to the system I launched it from.
2023-06-14 15:21:26 I've got the basic mechanics of that already working.
2023-06-14 15:21:51 I've made headers and I've assembled code into a block and run it - just haven't put the two together yet into a "Forth."
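The "assemble addresses, squash to offsets, load offsets->addresses" cycle above could be sketched like this in Python. This is not the system being described: the 8-byte little-endian cell, the explicit list of slot positions (standing in for "the tables"), and the names `squash` and `load` as used here are all assumptions made for the illustration.

```python
import struct

CELL = 8  # bytes per cell, assuming a 64-bit little-endian system


def squash(image, slots, base):
    """Rewrite the absolute addresses at the given byte positions as offsets."""
    for slot in slots:
        (addr,) = struct.unpack_from("<Q", image, slot)
        struct.pack_into("<Q", image, slot, addr - base)


def load(image, slots, base):
    """Rewrite the offsets at the given byte positions as absolute addresses."""
    for slot in slots:
        (offset,) = struct.unpack_from("<Q", image, slot)
        struct.pack_into("<Q", image, slot, offset + base)


# A three-cell image assembled at 0x1000: cells 0 and 2 hold addresses,
# cell 1 holds a plain number that must not be touched.
image = bytearray(struct.pack("<QQQ", 0x1010, 42, 0x1008))
address_slots = [0 * CELL, 2 * CELL]

squash(image, address_slots, base=0x1000)     # disk-resident form
on_disk = struct.unpack("<QQQ", image)

load(image, address_slots, base=0x7000)       # loaded somewhere else
in_memory = struct.unpack("<QQQ", image)
```

The only real design question is how the loader finds the address-bearing cells. Here it is an explicit list; in the plan described above it is implied by table membership plus a few special cases.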
2023-06-14 15:22:29 That was a linked list set of headers, though, and now I'm leaning toward this hash-based setup, so some of that will be to re-do.
2023-06-14 17:07:28 i'm going through this and wow this is a lot of forth words in assembly https://github.com/albertvanderhorst/yourforth/blob/master/yourforth.fas
2023-06-14 17:08:27 jonesforth is much better at being educational
2023-06-14 17:08:50 also doesnt have 210 words in assembly
2023-06-14 17:09:07 making an assembler is fun. the tough part is recognizing addressing modes and evaluating expressions if it supports algebraic expressions
2023-06-14 17:10:12 Forth style assemblers sometimes encode the addressing mode in the instruction name which is a bit cumbersome but works
2023-06-14 17:10:39 its not too different from at&t syntax, yeah?
2023-06-14 17:13:45 You can do something like "mov al, bl" in Intel which is "movb %bl, %al" or something like that in at&t. With forth it might need to be "bl al mov_r8r8"
2023-06-14 17:14:12 So extra annoyance for the user to indicate the form of mov in the instruction name
2023-06-14 17:15:56 Assuming al and bl push an ID number onto the stack that mov_r8r8 uses to figure out what registers. You could have them push a type ID onto a second stack but then you'd need an extra marker for immediates
2023-06-14 17:20:00 For example if al pushes a 5 on the stack and bl pushes a 6 then you don't know how to tell the difference between "mov al, 6", "mov 5, bl" and "mov 5, 6" which is why you need the _r8r8 suffix. You could do something like "5 IMMED al mov" for "mov al, 5" if IMMED pushes a type ID. Either way you're annotating
2023-06-14 17:58:41 hmm, how would the assembler experience be improved?
2023-06-14 17:59:12 for a specific case? or for assembly in general?
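To make the `mov_r8r8` suffix point above concrete, here is a small Python sketch of two forms of MOV that would receive identical numbers from the stack. It uses the real x86 register numbers (al=0, bl=3) rather than the 5 and 6 used as an example above. `mov_r8i8` is a name made up here for the immediate form, and the argument order follows the "source first, destination second" order of "bl al mov_r8r8".

```python
# Real x86 encodings for the low byte registers.
AL, CL, DL, BL = 0, 1, 2, 3


def mov_r8r8(src, dst):
    """mov dst, src between 8-bit registers: opcode 0x88 (MOV r/m8, r8)."""
    modrm = 0b11_000_000 | (src << 3) | dst   # mod=11 means register direct
    return bytes([0x88, modrm])


def mov_r8i8(imm, dst):
    """mov dst, imm8: opcode 0xB0 plus the register number, then the byte."""
    return bytes([0xB0 + dst, imm & 0xFF])


# Both calls see the same two numbers (3 and 0), exactly as a Forth word
# would see them on the stack, yet they must emit different bytes.
reg_form = mov_r8r8(BL, AL)   # mov al, bl
imm_form = mov_r8i8(3, AL)    # mov al, 3
```

Since nothing in the numbers themselves says "register" or "immediate", the distinction has to come from somewhere else: the word name, a type marker such as IMMED, or a second stack.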
2023-06-14 17:59:26 in general
2023-06-14 17:59:41 modern assembly is more written for compilers than the meat popsicles
2023-06-14 18:00:35 like, as previously mentioned, its annoying to append modifiers to word names
2023-06-14 18:01:12 meat popsicles, heh
2023-06-14 18:01:58 i was thinking about having a word that does addressing mode calculations
2023-06-14 18:02:17 for an architecture like x86, those modifiers/annotations exist because they are, individually, closer representations of the actual instructions encoded than the mnemonics you see in the popular syntaxes
2023-06-14 18:02:36 i see
2023-06-14 18:03:44 you can abstract that away by implementing a bunch of IF/ELSE or CASE statements that will determine which encoding (so, addressing mode, operand order and operand size) you probably meant
2023-06-14 18:05:43 it does seem like a serviceable solution
2023-06-14 18:06:20 so that logic would be in whatever you have to parse a MOV instruction ... which might determine whether you actually need MOVB, MOVW (which could be abstracting away MOVW-RR MOVW-RS MOVW-SR MOVW-RM MOVW-MR MOVW-IR MOVW-IM etc.), MOVL and MOVQ
2023-06-14 18:08:39 but as there're groups of instructions that will have more or less the same operand and addressing mode patterns, maybe you only define each case once and use the CREATE / DOES> pattern to generate the actual words for each case. each would get the appropriate opcode value thrown in at creation time.
2023-06-14 18:12:28 in the end, if you're accustomed to using those size and operand annotations included in the mnemonic - you have to ask yourself if it's worth the effort implementing all of the baggage to obtain smarter mnemonics
2023-06-14 18:12:58 hmm, it is a quite good question
2023-06-14 18:13:29 perhaps as a toy
2023-06-14 18:16:49 but it is a bit soon to be doing that
2023-06-14 18:19:18 it could also be useful as more than a toy.
if you wanted a word/macro that generated syscalls for linux/x86-64 for example, because you need to pass in the syscall function arguments into a bunch of registers - a lot of that register population is likely to be performed via MOVQ instructions most of the time, and they're always going to have a register destination. although the source operand type can vary, so if you're able to parameterize the sou
2023-06-14 18:21:31 something like ... 47 :I RDI :R MOVQ ... SOME-STRING :M R10 :R MOVQ
2023-06-14 18:22:05 where :M indicates memory/address/pointer (displacement optional?)
2023-06-14 18:25:21 drakonis: Someone was performance-inclined. ;-)
2023-06-14 18:25:39 ha, me?
2023-06-14 18:25:51 i want to experiment
2023-06-14 18:26:02 MrMobius: I added "mode words."
2023-06-14 18:26:16 rax rbx d[] mov
2023-06-14 18:26:39 That's "destination indirect," so it's mov [rax], rbx
2023-06-14 18:27:13 I have d[] s[] and also d[i] s[i].
2023-06-14 18:27:43 mov rax, [rbx+0x08] -> rax rbx 8 s[i] mov
2023-06-14 18:37:43 I actually didn't have to use a bunch of conditionals for my assembly subset.
2023-06-14 18:37:54 It's more or less just "calculated."
2023-06-14 18:38:26 I mean, there are some decisions in the logic, but not anything terribly "overt."
2023-06-14 18:39:05 Foremost is the exceptional nature of rsp and r12, and particularly rbp and r13.
2023-06-14 18:39:45 Those I just had to "check for" explicitly, because they deviate from the pattern that otherwise holds.
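For readers wondering why rsp/r12 and rbp/r13 break the pattern, here is a hedged Python sketch of 64-bit MOV with a register-indirect operand. It is not the Forth code being described. The function names are invented, `mov_store` roughly corresponds to the `d[]` / `d[i]` forms and `mov_load` to `s[]` / `s[i]`, and only signed 8-bit displacements are handled. The encoding facts it relies on are the standard x86-64 ones: a base register whose low three bits are 100 (rsp, r12) needs a SIB byte, and one whose low three bits are 101 (rbp, r13) cannot use the no-displacement form, so a zero disp8 is emitted instead.

```python
REGS = {name: number for number, name in enumerate(
    "rax rcx rdx rbx rsp rbp rsi rdi r8 r9 r10 r11 r12 r13 r14 r15".split())}


def _mem_operand(reg, base, disp):
    """Return (rex, modrm, tail) for the operand pair reg, [base + disp8]."""
    r, b = REGS[reg], REGS[base]
    # REX.W for 64-bit operands, REX.R / REX.B extend reg / base to r8-r15.
    rex = 0x48 | ((r >> 3) << 2) | (b >> 3)
    low = b & 7
    # rbp/r13: mod=00 with rm=101 means something else, so force a disp8.
    need_disp = disp != 0 or low == 5
    mod = 0b01 if need_disp else 0b00
    modrm = (mod << 6) | ((r & 7) << 3) | low
    tail = b""
    if low == 4:                 # rsp/r12: rm=100 requires a SIB byte
        tail += b"\x24"          # SIB: no index, base taken from REX.B + 100
    if need_disp:
        tail += bytes([disp & 0xFF])
    return rex, modrm, tail


def mov_store(base, src, disp=0):
    """mov [base+disp], src  (opcode 0x89, MOV r/m64, r64)."""
    rex, modrm, tail = _mem_operand(src, base, disp)
    return bytes([rex, 0x89, modrm]) + tail


def mov_load(dst, base, disp=0):
    """mov dst, [base+disp]  (opcode 0x8B, MOV r64, r/m64)."""
    rex, modrm, tail = _mem_operand(dst, base, disp)
    return bytes([rex, 0x8B, modrm]) + tail


store_plain = mov_store("rax", "rbx")       # mov [rax], rbx
load_disp = mov_load("rax", "rbx", 8)       # mov rax, [rbx+0x08]
store_rsp = mov_store("rsp", "rax")         # needs the SIB byte
store_rbp = mov_store("rbp", "rax")         # needs the zero disp8
```

Apart from the two explicit checks, everything is shifts and ORs, which fits the remark above that the subset is more or less just "calculated".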