2023-08-31 11:54:57 should i intern strings representing words?
2023-08-31 11:55:05 e.g. "dup"
2023-08-31 14:38:19 "Intern?"
2023-08-31 14:38:47 probably a lisp thing
2023-08-31 14:38:51 All of those strings will be in your dictionary - that's how Forth recognizes them.
2023-08-31 14:39:08 It parses what you type out of the input stream, into space-delimited words, and then searches the dictionary for a match.
2023-08-31 14:39:28 whereby "foo" or |foo| or whatever become the symbol FOO in the current package. not sure what that would mean for a forth, though
2023-08-31 14:39:30 so one way or another the names have to be around in string form, in some suitable data structure.
2023-08-31 15:44:02 i understand "string interning" to mean that it tracks a cache of strings so that there are never duplicates. i.e., two strings with the same contents are guaranteed to also have the same address
2023-08-31 17:05:45 rendar: Forths don't usually intern strings, in fact I'd guess about 99% don't
2023-08-31 17:07:01 veltas, i see, but could it help?
2023-08-31 17:07:34 Not much, especially as you're unlikely to repeat most names
2023-08-31 17:08:28 And you're just adding overhead to it, Forth systems usually inline the name to the definition record in a very simplistic way
2023-08-31 17:09:00 Quite often it's a byte containing length of name (with high bit indicating whether it's IMMEDIATE) followed by the name
2023-08-31 17:09:06 Or something similarly simple
2023-08-31 17:49:24 veltas, i see, thanks
2023-08-31 20:37:48 rendar: in my library I have an s:unique word for this (pass in a string, if it's known, it returns the address of the existing copy; otherwise it saves it, returning a pointer to the newly saved one).
2023-08-31 20:38:29 I mostly use it in my larger system, where it serves to save memory for source file names for words.
2023-08-31 22:36:58 Chuck did something along those lines at one point, when he was trying to minimize his source size. There was a symbol table of some kind, and then source references to the words would be indices into that table instead.
2023-08-31 22:41:56 DNS, too
2023-08-31 22:57:40 He rigged this up so that it boosted compilation speed too - for example, all of the "built in" words had their CFAs already plugged into that symbol table. So, the source pointed directly into the table, so that when rendering the source on screen it could go get the word text there, but it also had the CFA information which could then be copied directly into the dictionary, no lookup required.
2023-08-31 22:57:55 He claimed to be able to compile 100 MB/s of source, and this was back around 2000 or so.
2023-08-31 23:00:06 DNS is slightly different in that questions or answers will repeat the 'example' of 'example.com' a lot, so you have an address for 'example' instead of repeating it in the packet
2023-08-31 23:00:46 Sounds a little like dedupe.
2023-08-31 23:02:08 it saves space, which can be tight in 1980s UDP
2023-08-31 23:09:58 I've read some about those limitations - never had direct exposure to networking back in those days.
2023-08-31 23:53:26 this is an excellent series of videos:
2023-08-31 23:53:28 https://www.youtube.com/watch?v=PbITFIGLciI&list=PLUl4u3cNGP63bAfjGas3TuA4ZCPUtN6Xf
2023-08-31 23:53:51 Basically a "history of physics." Goes into some technical detail, but it's mostly the historical and social aspects.
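
(editor's note) For anyone reading the log later: the interning behaviour discussed above (same contents, same address; a known string returns the existing copy, an unknown one is saved and its new address returned) can be sketched roughly as below. This is only an illustration in Python, not Forth, and not the code behind s:unique. The names StringPool, unique and fetch are made up for the example, the "address" is just an offset into a bytearray standing in for data space, and the dict used for lookup is an assumption; a real Forth would more likely walk a list or hash into its own memory. Strings are stored as a length byte followed by the characters, similar to the simple name layout mentioned in the log, but without the IMMEDIATE flag bit.

```python
class StringPool:
    """Append-only pool of counted strings (length byte, then the bytes).

    Hypothetical sketch: names and layout are assumptions for illustration.
    """

    def __init__(self):
        self.memory = bytearray()  # stands in for Forth data space
        self.index = {}            # string bytes -> offset of the stored copy

    def unique(self, text):
        """Return the offset of the single stored copy of text."""
        data = text.encode("ascii")
        if len(data) > 255:
            raise ValueError("string too long for a one-byte count")
        addr = self.index.get(data)
        if addr is None:
            # Not seen before: append a counted copy and remember where it is.
            addr = len(self.memory)
            self.memory.append(len(data))
            self.memory.extend(data)
            self.index[data] = addr
        return addr

    def fetch(self, addr):
        """Read the counted string stored at offset addr."""
        count = self.memory[addr]
        return self.memory[addr + 1 : addr + 1 + count].decode("ascii")


pool = StringPool()
a = pool.unique("dup")
b = pool.unique("swap")
c = pool.unique("dup")  # same contents as a, so the same offset comes back
```

With this in place, comparing two interned names is an integer comparison of offsets rather than a byte-by-byte string compare, and the second "dup" costs no extra pool space. As noted in the log, that only pays off when names actually repeat, such as source file names attached to many words.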