2023-08-08 10:01:15 Any recommendations for a simple file system? something a step up from blocks. 2023-08-08 10:11:45 exFAT? 2023-08-08 10:16:20 https://dl.forth.com:8443/jfar/vol2/no3/article4.pdf 2023-08-08 10:29:35 joe9: The simplest thing to do is to form linked lists with blocks. Just use the last cell of each to point to the next. When a block is deallocated, put it on a free list, and check that list first, before advancing your frontier pointer, when you need to allocate a new block. 2023-08-08 10:30:14 Going up from there, I've felt that a b*tree structure is a good way to organize a file system, and that gives you the advantage of not using any space in your data blocks for file system info - it's all contained within the b*tree instead. 2023-08-08 10:31:12 That first thing I mentioned is just a simple "fixed size" allocator. And it is worth noting that you only use part of a data block for file system info (those links) when it's on the free list. When it's in use you can use the whole block for whatever purpose. 2023-08-08 10:31:45 A big question you need to answer is whether or not you're going to support the idea of partially used blocks (blocks that contain less than a full block's worth of user data). 2023-08-08 10:32:16 Doing that complicates things quite a bit, but it does make it possible to edit the middle of big files without having to do a lot of shuffling around. 2023-08-08 10:34:33 thanks, I have mucked around with storing direct, indirect blocks into a dentry. But, it seems simpler to just allocate a contiguous space per file/directory. 2023-08-08 10:34:40 reallocate as necessary. 2023-08-08 10:34:48 If you run out of space in a block to add data, you split it into two and insert the new block in the middle of the file using the b*tree. 2023-08-08 10:34:56 the addition of a write cache complicates stuff. 2023-08-08 10:35:21 My feeling is that few of us will ever write any applications that really demand a truly best-in-class file system. 
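For anyone skimming the log, the "linked list of blocks plus free list" allocator described above can be sketched in a few lines. This is only an illustration in Python rather than Forth: the class name, the in-memory `blocks` list standing in for a disk, and the choice to reserve block 0 as the "end of list" marker are all assumptions, not anything from the discussion.

```python
class BlockAllocator:
    """Fixed-size block allocator: reuse freed blocks first, then the frontier."""

    NIL = 0  # assumption: block 0 is reserved so that 0 can mean "end of list"

    def __init__(self, total_blocks, cells_per_block=256):
        # In-memory stand-in for the disk; each block is a list of cells.
        self.blocks = [[0] * cells_per_block for _ in range(total_blocks)]
        self.frontier = 1           # next block that has never been handed out
        self.free_head = self.NIL   # head of the free list

    def allocate(self):
        # Check the free list before advancing the frontier pointer.
        if self.free_head != self.NIL:
            block = self.free_head
            self.free_head = self.blocks[block][-1]  # last cell links to next free block
            self.blocks[block][-1] = self.NIL        # the whole block now belongs to the caller
            return block
        if self.frontier >= len(self.blocks):
            raise MemoryError("no free blocks left")
        block = self.frontier
        self.frontier += 1
        return block

    def free(self, block):
        # Only a block on the free list spends a cell on file system info.
        self.blocks[block][-1] = self.free_head
        self.free_head = block
```

Both operations are constant time, which is the appeal of the scheme; the cost is that a freed block's last cell gets overwritten by the link.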
2023-08-08 10:35:33 the simple approaches will likely cover all that most of us write. 2023-08-08 10:35:48 But it's one of those things that it's hard to resist wanting to "optimize." 2023-08-08 10:36:09 I am surprised that there are no forth fs's other than blocks. 2023-08-08 10:36:24 there must be something around that is lost to time. 2023-08-08 10:36:35 A very simple first step is just to do contiguous groups of blocks and arrange for LOAD to just pass to the next block when it reaches the end of one. 2023-08-08 10:36:49 So one load would load N blocks, if that's what the blocks told it to do. 2023-08-08 10:37:00 Rather, unless the blocks told it to stop. 2023-08-08 10:38:19 Oh, I'm not surprised. It's one of those problems people love to attack, similar to local variables. 2023-08-08 10:38:33 There's always someone who wants to stuff "modern method X" into Forth. 2023-08-08 10:38:46 File systems, locals, object orientation, etc. 2023-08-08 10:46:42 There's like a whole literature of all this out there, if you look through those old Forth Dimensions archives. 2023-08-08 10:47:46 For my desktop work I'll probably try to do the b*tree file system, because I get as tempted to show off as anyone else. On this little board I'm working on, though, I'll likely do something much simpler. Or just stick with numbered blocks. 2023-08-08 10:48:22 If you think about it, numbered blocks is kind of a "human managed file system." :-) 2023-08-08 10:49:07 There's a feature I want that b*trees will also work well for - I want to be able to store comments 'remote' from source; a b*tree could hold those links. 2023-08-08 10:49:19 does anyone have this paper? A Tokenized Rule Processing System 2023-08-08 10:49:19 Steven M. 
Lewis and Mike Fisher 2023-08-08 10:49:20 USC Department of Bioengineering and Danforth Corporation 2023-08-08 10:49:40 https://dl.forth.com:8443/jfar/vol3/no2/article34.pdf is not the full pdf 2023-08-08 10:49:45 Wherever I am in my source code, I can search that b*tree (not the same b*tree that runs the file system proper) to see if any links originate from that spot. If any do, follow them and open the commentary in a separate window. 2023-08-08 10:50:09 I figure once I have code to implement b*trees, I may as well do a full file system too. 2023-08-08 10:50:31 Oh, neat - my wife has a former colleague Steven Lewis. 2023-08-08 10:50:39 Though he may spell both names differently. 2023-08-08 11:06:23 I don't see hide nor hair of that paper, joe9. 2023-08-08 11:08:30 Here is a similar paper by Lewis alone, though: 2023-08-08 11:08:33 https://vfxforth.com/flag/jfar/vol4/no1/article2.pdf 2023-08-08 11:08:43 Probably covers similar material. 2023-08-08 11:08:56 good one. Thanks. 2023-08-08 11:09:02 Sure thing. 2023-08-08 11:16:49 That looks like a pretty interesting paper. I like how it stresses the "natural language" aspects of the application. Forth is a good choice for that. 2023-08-08 11:17:00 yes, I agree. 2023-08-08 11:21:02 joe9: Pretty sure there's a file system described in some Forth books 2023-08-08 11:21:10 Maybe polyforth manual or swiftforth manual has something 2023-08-08 11:21:15 Or thinking forth? 2023-08-08 11:21:20 Can't remember, at work 2023-08-08 11:23:03 starting forth has blocks. 
2023-08-08 11:25:04 Yes the filesystem I'm thinking of was implemented on top of blocks 2023-08-08 11:25:20 It's probably singly linked lists of blocks 2023-08-08 11:26:00 Because that has an O(1) allocation scheme with no fragmentation, but linear access time for each file instead of random access 2023-08-08 11:26:17 Which is the kind of tradeoff you'd want for a general filesystem in forth 2023-08-08 11:26:34 The block database is much better though 2023-08-08 11:27:10 not in thinking forth from what I can see. 2023-08-08 11:27:54 < veltas> The block database is much better though -- Why do you say this? it is a pita to remember all the block numbers though. 2023-08-08 11:32:15 The block database in the polyforth manual 2023-08-08 11:32:20 veltas: is this what you are thinking of? https://dl.forth.com:8443/jfar/vol2/no3/article4.pdf 2023-08-08 11:32:25 veltas thanks so much. 2023-08-08 11:33:02 That's not what I'm thinking of but it sounds similar 2023-08-08 11:33:53 All file systems are ultimately implemented on top of blocks. That's just how we talk to disk drives. 2023-08-08 11:34:08 That's why we call them "block" devices. 2023-08-08 11:34:10 Except old disk blocks are usually 512 bytes 2023-08-08 11:34:20 And new dick blocks are virtualised to 512 bytes 2023-08-08 11:34:22 disk* 2023-08-08 11:34:30 Bit of a Freudian slip there lol 2023-08-08 11:34:45 Sure - block sizes vary. But actually the native page size of modern drives is larger. If the OS still offers 512-byte "sectors" it does them via RMW. 2023-08-08 11:34:58 joe9: I just use blocks, I have directory blocks I use to organise my blocks 2023-08-08 11:35:04 A few years ago it was 4k; ours use a 16kB logical page size. 2023-08-08 11:35:24 joe9: There's also the INDEX word if you've heard of it, that just prints the first line of each block with its block number 2023-08-08 11:35:29 That reduces the amount of RAM you need on board for logical to physical mapping. 
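The RMW (read-modify-write) step mentioned above, where a 512-byte "sector" write lands on a device whose native page is larger, looks roughly like this. A sketch only: the 4096-byte page size is one of the sizes mentioned in the chat, while the `pages` list standing in for the device and the function name are made up for illustration.

```python
SECTOR = 512
PAGE = 4096  # assumed native page size; 16 kB pages work the same way


def write_sector(pages, lba, data):
    """Emulate a 512-byte sector write on a device with larger native pages."""
    if len(data) != SECTOR:
        raise ValueError("data must be exactly one sector")
    sectors_per_page = PAGE // SECTOR
    page_no, slot = divmod(lba, sectors_per_page)
    page = bytearray(pages[page_no])                 # read the whole native page
    page[slot * SECTOR:(slot + 1) * SECTOR] = data   # modify just one sector of it
    pages[page_no] = bytes(page)                     # write the whole page back
```

Every small write costs a full page read and a full page write, which is why sector emulation is slower than writing aligned native pages.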
2023-08-08 11:35:59 Yeah, you just use line 1 as a "title line" for each block, and that plays well with INDEX. 2023-08-08 11:36:39 Often "line 1" is just the first 64 characters, though. 2023-08-08 11:37:01 A normal block should have its title, author, and date. It's Forth's "dir" equivalent 2023-08-08 11:37:12 That 64-char line structure is often built into Forth's fairly deeply. 2023-08-08 11:37:14 on line 1 I mean 2023-08-08 11:37:21 And yeah line 1 is the first 64 chars 2023-08-08 11:37:25 s/Forth's/Forths/ 2023-08-08 11:39:37 And then there's shadow blocks where you can put documentation or other related info 2023-08-08 11:39:57 Forth is scratch upon scratch upon scratch 2023-08-08 11:44:52 joe9: "Shadow blocks" is basically just the idea of using odd numbered blocks for code, data, etc. and the neighboring even-numbered block for commentary/documentation. 2023-08-08 11:45:04 As a convention. And I may have it backwards. 2023-08-08 11:45:12 I read about that in the Starting Forth book. 2023-08-08 11:45:26 I have files that could be up to a GiB of binary data. 2023-08-08 11:45:34 It's just typical of Forth to adopt the "simplest, most brain-dead way" of doing things. 2023-08-08 11:46:08 Wow - that's a lot. If you need to edit the middle parts of those files then you probably do need a capable file system. 2023-08-08 11:46:43 In that b*tree idea I mentioned, the b*tree holds the sequence of blocks associated with each file and also what portion of each block has live data. 2023-08-08 11:47:08 The general idea is to keep the blocks somewhere around half to three quarters full - that leaves you room to add stuff, delete stuff, etc. 2023-08-08 11:55:44 And then if you add the idea that directories are just files that are used by the file system itself, that gets you the rest of the way there. 2023-08-08 11:56:20 I decided that using something like the Linux inode method made sense - it opens the door to keeping up with larger amounts of file meta-data. 
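To make the "partially used blocks" idea above concrete, here is a small sketch of a file kept as an ordered sequence of blocks, each holding anywhere from zero to a full block of live data, with a full block split in two when an insert lands in it. Loud assumption: a plain Python list stands in for the b*tree, so lookups here are linear rather than logarithmic; the class name and the tiny 16-byte block size are made up for readability.

```python
BLOCK_SIZE = 16  # tiny on purpose so the behaviour is easy to follow


class SparseFile:
    """A file as an ordered list of partially filled blocks."""

    def __init__(self):
        self.blocks = [bytearray()]  # each entry holds 0..BLOCK_SIZE live bytes

    def _locate(self, offset):
        """Return (block index, position inside that block) for a file offset."""
        for i, blk in enumerate(self.blocks):
            if offset <= len(blk):
                return i, offset
            offset -= len(blk)
        raise IndexError("offset past end of file")

    def insert(self, offset, data):
        """Insert data at offset without shuffling the rest of the file."""
        for k, byte in enumerate(data):
            i, pos = self._locate(offset + k)
            if len(self.blocks[i]) == BLOCK_SIZE:
                # Block is full: split it into two half-full blocks in place.
                blk = self.blocks[i]
                half = BLOCK_SIZE // 2
                self.blocks[i:i + 1] = [blk[:half], blk[half:]]
                i, pos = self._locate(offset + k)
            self.blocks[i].insert(pos, byte)

    def read(self):
        return b"".join(bytes(b) for b in self.blocks)
```

Inserting three bytes into the middle of a full 16-byte block leaves two blocks of 11 and 8 live bytes; nothing after the insertion point has to move to a different block, which is the whole point of allowing partial blocks.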
2023-08-08 11:59:01 One facet I had an internal debate about was whether or not to keep the data blocks "pure data" or whether to include some redundant "recovery info" in them. If you go pure data, then if you lose a b*tree block you can lose your data - but it doesn't take a particularly large amount of redundant data in the blocks to give you the possibility of recovering files. Something like a linked list of the block in 2023-08-08 11:59:02 any given file. 2023-08-08 12:00:12 And that kind of opens the way to an evolution path - maybe you start with a file system based only on block-to-block links, and then add the b*tree later as a "performance layer." 2023-08-08 12:01:43 You might use the last cell of each block to hold a "next block pointer" and a "bytes used" value. Then you could find offset N in any file by starting at block 1 for that file and advancing along that list until you found the one that held byte N. 2023-08-08 12:01:48 Slow, but it would work. 2023-08-08 12:02:07 Then later the b*tree would give you a much faster way of finding those interior points. 2023-08-08 12:02:50 If you kept the last cell stuff you'd have to do a bit of extra work maintaining it as well as the b*tree, but it would be fantastic to have in a recovery situation. 2023-08-08 12:02:57 I have written fs's using the direct, indirect, double indirect blocks and so on. Later, in an attempt to simplify, I just used contiguous blocks 2023-08-08 12:03:10 now, i want to simplify further. 2023-08-08 12:03:36 Is there really anything in between "pure blocks" and "contiguous blocks"? 2023-08-08 12:04:55 I had a bigger write cache to work around the slow disk access times. 2023-08-08 12:05:19 Now, I think that I should get rid of that. 2023-08-08 12:05:20 Oh, ok. That's fair. I was just thinking about the "organization part." 2023-08-08 12:05:27 But yeah, caching is kind of a separate thing. 2023-08-08 12:06:01 I saw a video on a file system that's been proven faster than btree systems. 
It involved a sort of "hierarchical caching" approach. Very fast writes. 2023-08-08 12:06:06 It adds a lot more complexity and creates one of those bigger problems that Chuck Moore talks about. 2023-08-08 12:06:09 Because writes basically always got cached in RAM. 2023-08-08 12:06:24 Then over time that cached data would get percolated out into the disk blocks themselves. 2023-08-08 12:06:35 yes, exactly. that is what I have now. 2023-08-08 12:06:45 And yes, it was quite complex. I can't remember the name. 2023-08-08 12:06:51 I mentioned it here - maybe I could find it in my logs. 2023-08-08 12:06:56 I am finding that it is a pita to keep the data in sync between the cache and the disk. 2023-08-08 12:07:06 too many locks and syncing. 2023-08-08 12:07:06 Yeah. 2023-08-08 12:07:26 But, computer science guys will kind of bend over backward in the quest for O(1). 2023-08-08 12:07:50 Generally you need a really, REALLY big application before it starts to pay off. 2023-08-08 12:08:04 With that, I have a RAM data structure with the disk contents and the cache contents and a disk data structure. 2023-08-08 12:08:14 For example, there is a way to multiply matrices faster than the standard O(N^3) time. 2023-08-08 12:08:23 But unless your matrices are huge it winds up taking longer. 2023-08-08 12:08:59 Same with multiplying numbers - we all learn the standard grade school method, but you can do it faster. 2023-08-08 12:09:11 And if you have 10,000 digit numbers or something, it makes sense to do that. 2023-08-08 12:09:47 Some Russian dude discovered it when he was just a kid in the last few decades. 2023-08-08 12:09:58 "kid" == very young adult. 2023-08-08 12:10:46 or goat 2023-08-08 12:11:09 :-) Sometimes - that's true. 2023-08-08 12:11:28 some in that area have become cockroaches, so who knows how this all works 2023-08-08 12:11:50 I saw a Magnum PI episode over the weekend where Magnum was hiding a goat on the estate and trying to keep Higgins from knowing about it. 
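Back on the multiplication aside: the faster-than-grade-school method mentioned above sounds like Karatsuba's algorithm, though the chat never names it, so treat that identification as an assumption. A minimal sketch for non-negative integers, splitting on decimal digits for readability:

```python
def karatsuba(x, y):
    """Multiply two non-negative integers using three recursive products instead of four."""
    if x < 10 or y < 10:
        return x * y  # single-digit operand: fall back to plain multiplication
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)
    y_hi, y_lo = divmod(y, base)
    lo = karatsuba(x_lo, y_lo)
    hi = karatsuba(x_hi, y_hi)
    # (x_lo + x_hi)(y_lo + y_hi) - lo - hi equals the two cross terms combined.
    mid = karatsuba(x_lo + x_hi, y_lo + y_hi) - lo - hi
    return hi * base * base + mid * base + lo
```

As the chat says, the bookkeeping only pays off for very large operands; for ordinary cell-sized numbers the grade school method (or the hardware multiplier) wins.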
2023-08-08 12:11:54 He failed, of course. 2023-08-08 12:12:42 I had forgotten how silly that show was at times. 2023-08-08 12:12:55 I think TV in general leaned more that way back in those days. 2023-08-08 12:13:28 More comedic elements, even in serious dramas. 2023-08-08 12:13:58 batman was pretty campy 2023-08-08 12:14:07 Compare the 1990's miniseries of The Stand with the recent remake. 2023-08-08 12:14:12 The first one was very campy. 2023-08-08 12:14:19 And yes - Batman was SERIOUSLY campy. 2023-08-08 12:14:34 And Batman looked soft and pudgy. 2023-08-08 12:15:04 Those celebrity cameos they would do while Batman and Robin were rope walking up the side of a building were hilarious, though. 2023-08-08 12:15:11 You never knew who would pop up. 2023-08-08 12:15:44 It never had anything to do with the plot - it was pure comedy insert. 2023-08-08 12:16:37 I still prefer the 1990 version of The Stand, though. In spite of the campy it told the story really well. 2023-08-08 12:18:17 I have a particular opinion about The Stand. You had this whole slew of characters, but I think the one that really mattered was Larry Underwood. 2023-08-08 12:18:46 It was a sort of "one man's redemption saves the world" story. 2023-08-08 12:31:44 joe9: https://forth.works/temp/fs.txt is a little file system I implemented with my son last year. I have also implemented a Unix v6 style file system, but don't have the code readily available at present. 2023-08-08 12:32:41 KipIngram, joe9: Shadow blocks actually tend to be blocks in a totally different place, like +800 or something. And you switch between e.g. block 12 and block 812 to get its documentation 2023-08-08 12:33:34 There's TRIAD etc in polyforth that will print blocks with their documentation nicely to paper 2023-08-08 12:37:01 KipIngram: TV and films today seem to lack any sense of humour 2023-08-08 12:37:36 Although I think a lot of stuff from the 80's was quite morbid, I'm too young for that! 
2023-08-08 12:39:57 I've not implemented shadow blocks in my editor; they'd be easy to add though. 2023-08-08 12:47:12 crc: I'm envious of you getting to engage in that sort of thing with your kid. I'd love that, but none of mine are into this sort of stuff. Really nice for you. 2023-08-08 12:48:25 veltas: I guess there are all kinds of ways to implement shadow blocks, and if you're doing some kind of simple contiguous range file system you'd really need something other than even/odd, I guess. And yes, modern television / film is typically fairly dark. Unless it's actually a comedy overtly. 2023-08-08 12:49:10 I enjoy some aspects of the "more mature" approach to stories, but on the other hand a little humor here and there isn't necessarily a bad thing. If it's not overdone. 2023-08-08 12:50:23 I guess the phrase "comic relief" arose for a reason. 2023-08-08 12:51:28 The movie I most remember as being totally dark with no respite is Ransom, with Mel Gibson. I swear when it was finally over I felt just wrung out. 2023-08-08 13:43:55 joe9: I'm really liking that paper. Thanks for aiming us that way. 2023-08-08 13:45:10 which one? the streaming or the tokenized paper? 2023-08-08 13:53:05 I'm still reading the tokenized paper. 2023-08-08 13:53:19 I just like how directly and straightforwardly it's written (so far). 2023-08-08 13:55:09 Hmmm. I'd expected OR to have a meaning along with AND, but it doesn't look like it does. Maybe there's some reason that's not required. 2023-08-08 13:55:42 I've seen a couple of Prolog implementations in Forth over the years. That gets into similar territory. 2023-08-08 13:58:13 This is already making me want to consider the possibility of having entire phrases (like THE ANIMAL IS A BIRD) in the dictionary as distinct items. 2023-08-08 14:16:08 Ah, he cites LISP-like functions. 
I've often thought that it might make sense to have a LISP engine alongside a Forth system, with some mechanism for the Forth to pump string structures into and out of the LISP. 2023-08-08 14:16:27 It would just be "handy" for those situations that are particularly amenable to a LISP approach. 2023-08-08 14:17:10 And I think a "basic" LISP system could be implemented without bloating the system. I regard Forth and *basic* LISP as having similar implementation complexity. 2023-08-08 14:17:41 I did get the impression, though, that state of the art performance for LISP bumps the complexity up quite a bit. 2023-08-08 14:39:02 If I were looking to implement something like this I'd at least try to use the dictionary itself for my linked list needs. That's why the notion of being able to compile "phrases" came to mind. 2023-08-08 14:39:49 Oh, he mentions possible extension to "levels of certainty" - it does seem like probabilities could be integrated into this in a useful way. Kind of like a "fuzzy logic" version. 2023-08-08 14:40:44 Seems like good prospects for using <builds does> actions in compiling these rule sets. 2023-08-08 14:50:30 I'm also thinking that my "exception return" operation would come in handy here. This is somewhat like what FIND does. It loops over a hierarchical data structure, with the fine-grain match testing occurring down in the leaves of that process. As long as you have no match, you just continue iterating over the hierarchy. But if you do find a match, you're done, and you want out as quickly and cleanly as 2023-08-08 14:50:32 possible. My {| ... |} ; structure facilitates that. 2023-08-08 14:51:15 Really made the FIND code clean - it was about seven short definitions to do a full search of an arbitrary vocabulary sequence. 2023-08-08 15:13:57 Ew... don't like the Forth coding style. It's that "C looking Forth." 2023-08-08 17:09:30 joe9: Don't know if you've read the sample rules in that tokenized paper, on page 17 of the pdf. 
I don't understand where the consequence "the kidney is not clogged" comes from. Up above, they produce consequence "the kidney is clogged," so it's obvious what they intend, but I don't see where the actual recognition of those as "complements" comes from. I didn't see that any ability to recognize such 2023-08-08 17:09:32 things was "built in." 2023-08-08 17:10:04 Maybe I just overlooked it - there are several such things in there, so it must be able to figure it out from the presence of the word "not." 2023-08-08 17:42:11 I don't think they discuss implementation of NOT, but it seems pretty clear that it would be easy to handle, just with a flag in the clauses. 2023-08-08 17:42:23 Flip it if NOT appears anywhere in the string. 2023-08-08 17:42:49 Maybe they figured it was too obvious to discuss. 2023-08-08 17:44:14 Oh, I see it on page 7. NOT reverses the flag returned by a clause. 2023-08-08 17:44:38 It's a bit in a flags byte. 2023-08-08 18:08:45 I think that for purposes of string matching they ignore the NOT - "the string to match" is the string with the not excised from it. 2023-08-08 23:10:18 <_404man > hi
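The NOT handling worked out earlier in the log (excise NOT from the string to match, record it as a bit in a flags byte, and reverse the clause's result when the bit is set) can be sketched as follows. This is a guess at the shape of it, not the paper's code: the bit position, the function names, and the set of known facts are all hypothetical.

```python
NEGATED = 0x01  # hypothetical bit position within the clause's flags byte


def compile_clause(text):
    """Split a clause into (string to match, flags), excising the word NOT."""
    words = text.upper().split()
    flags = NEGATED if "NOT" in words else 0
    key = " ".join(w for w in words if w != "NOT")
    return key, flags


def evaluate(clause, facts):
    """Look the clause up among known facts, reversing the result if NOT was present."""
    key, flags = compile_clause(clause)
    result = key in facts
    return (not result) if flags & NEGATED else result
```

With this arrangement "the kidney is clogged" and "the kidney is not clogged" share one match string, which would explain how the rule set can treat them as complements without any extra machinery.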