2024-06-25 00:06:34 That's just a lot more down the "formal CS" road than Forth goes. There's almost a fanaticism about simplicity in the community.
2024-06-25 09:21:27 A lot of people confuse reentrant with thread-safe
2024-06-25 09:21:45 If you're worried about thread-safety just bung it in USER vars
2024-06-25 09:22:07 If you're worried about reentrancy then save/restore to the return stack as zelgomer said
2024-06-25 09:26:51 You shouldn't worry about either until you really need to
2024-06-25 09:32:12 lf94: : PARSE| ( parse-xt1 parse-xt2 -- str n ) >IN @ 2>R CATCH ?DUP 0= IF 2R> >IN ! EXECUTE ELSE 2R> 2DROP THEN ;
2024-06-25 09:32:17 Something like that?
2024-06-25 09:33:02 That's totally wrong but I can't be bothered to think it 100% through
2024-06-25 09:33:14 Using exceptions is the right way to go for backtracking parsers though
2024-06-25 09:33:24 I've done that before and it works well
2024-06-25 09:35:14 : PARSE| ( parse-xt1 parse-xt2 -- str n ) >IN @ 2>R CATCH IF 2R@ >IN ! EXECUTE THEN 2R@ 2DROP ;
2024-06-25 09:35:19 That's closer to the mark
2024-06-25 09:38:31 : PARSE* ( parse-xt -- str_1 n_1 .. str_i n_i i ) >R 0 BEGIN R@ >IN @ >R CATCH 0= WHILE R> DROP ROT 1+ REPEAT R> >IN ! R> DROP ;
2024-06-25 09:38:36 Something like that too is possible
2024-06-25 09:38:55 I can't remember if >IN is restored by exceptions; THROW restores the input source spec, but I don't know if that includes the input pointer, weirdly
2024-06-25 09:39:05 I'm not sure it's even defined in the standard
2024-06-25 09:40:53 But as for why it's not a thing ... it absolutely is possible, but it's not in the spirit of Forth. There are much easier ways of achieving the same thing most of the time.
2024-06-25 09:47:49 Okay, apparently it does restore the input source spec
2024-06-25 09:47:57 And >IN is included
2024-06-25 09:48:36 So we can have : PARSE| ( parse-xt1 parse-xt2 -- str n ) >R CATCH IF R@ EXECUTE THEN R> DROP ;
2024-06-25 09:49:18 and : PARSE* ( parse-xt -- str_1 n_1 .. str_i n_i i ) >R 0 BEGIN R@ CATCH 0= WHILE ROT 1+ REPEAT R> DROP ;
2024-06-25 09:50:51 It doesn't really define this properly anywhere, but it's clear from the examples in the appendix that's what's intended
2024-06-25 17:32:30 I thought most systems just shoved >IN back to zero when recovering from a troublesome situation.
2024-06-25 17:32:41 And nothing in the TIB.
2024-06-25 17:33:12 I.e., "restart the interpreter."
2024-06-25 19:10:02 KipIngram: THROW is only 'trouble' if it's not caught
2024-06-25 20:15:26 True, but if it is caught we'd never come back from our code prematurely, would we?
2024-06-25 20:15:48 I wouldn't expect error recovery to be involved in that case.
2024-06-25 21:13:19 And yet it is in the standard
2024-06-25 21:13:29 Which makes exceptions quite convenient for parsing
2024-06-25 21:13:42 And less convenient for a lot of typical applications
2024-06-25 21:14:59 Well, basically it means we have no longjmp in the standard, apart from R> DROP
2024-06-25 21:36:09 OT: does anybody here happen to be highly knowledgeable about how search engines work? I had a technical question or two regarding indexing and PageRank (not for SEO purposes)
2024-06-25 21:37:11 The only thing I know about that is that Google's original innovation created a big matrix of the linkages it found online and then did something with the eigenvalues and eigenvectors of that matrix.
2024-06-25 21:37:23 It was very clever and "sciencey."
2024-06-25 21:37:29 I only have a vague idea based on https://www.tbray.org/ongoing/When/200x/2003/07/30/OnSearchTOC
2024-06-25 21:39:10 mainly I was wondering something about PageRank: does Google (and others) have to go through their entire index to calculate the page ranks, like once a day or something?
2024-06-25 21:39:33 I'd expect not
2024-06-25 21:39:59 Most pages don't change regularly anyway
2024-06-25 21:40:51 Is PageRank even still a thing?
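[editor's note: the exception-based backtracking that PARSE| and PARSE* implement in Forth, transliterated to Python as a hedged sketch. The `Input` class, `ParseFail` exception, and the leaf parser are illustrative inventions, not from the log; the point is saving the input pointer (Forth's >IN), and restoring it when a parser throws.]

```python
# Sketch of exception-driven backtracking parsing, assuming an explicit
# input-position field that plays the role of Forth's >IN.

class ParseFail(Exception):
    """Raised by a parser that cannot match; plays the role of THROW."""

class Input:
    def __init__(self, text):
        self.text = text
        self.pos = 0                  # analogous to >IN

def parse_alt(inp, p1, p2):
    """PARSE| : try p1; on failure restore the input pointer, try p2."""
    saved = inp.pos
    try:
        return p1(inp)                # CATCH around EXECUTE
    except ParseFail:
        inp.pos = saved               # what THROW restoring >IN buys for free
        return p2(inp)

def parse_star(inp, p):
    """PARSE* : apply p repeatedly until it fails, collecting results."""
    results = []
    while True:
        saved = inp.pos
        try:
            results.append(p(inp))
        except ParseFail:
            inp.pos = saved
            return results            # zero or more matches, never fails

def parse_digits(inp):
    """Illustrative leaf parser: one or more ASCII digits."""
    start = inp.pos
    while inp.pos < len(inp.text) and inp.text[inp.pos].isdigit():
        inp.pos += 1
    if inp.pos == start:
        raise ParseFail("expected digits")
    return inp.text[start:inp.pos]
```

The shape matches the Forth versions: the combinators never inspect what the sub-parser consumed, they only snapshot and restore the input pointer around it, which is exactly the property the channel is relying on THROW to provide.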
2024-06-25 21:41:25 https://www.tbray.org/ongoing/When/200x/2003/11/13/ResultRanking
2024-06-25 21:43:44 so, I think I get the basic idea of PageRank: you find a link to something in a page, you allocate some page rank to that other page, adjusted depending on the rank of the current page and maybe some other quality factors...
2024-06-25 21:45:07 but how would you know whether or not you had already made such an allocation based on the last time your crawler/indexer dealt with that page
2024-06-25 21:46:19 unless you made records of all those allocations tied to the target's page rank record, which I would think would be cost-prohibitive
2024-06-25 21:46:51 It may be that the value is only an estimate
2024-06-25 21:47:10 so, it would seem like you would have to forget about the old values and rebuild the whole thing from your entire collection
2024-06-25 21:47:49 There's a detailed-looking description https://en.wikipedia.org/wiki/PageRank#Algorithm
2024-06-25 21:48:00 "The PageRank algorithm outputs a probability distribution used to represent the likelihood that a person randomly clicking on links will arrive at any particular page."
2024-06-25 21:50:38 randomly clicking on links. sounds like a fun game.
2024-06-25 21:51:31 a few sentences later: "The PageRank computations require several passes, called "iterations", through the collection to adjust approximate PageRank values to more closely reflect the theoretical true value."
2024-06-25 21:52:38 and then it says that PageRank values are probabilities between 0 and 1
2024-06-25 21:53:24 I suppose the spider could update when it encounters a link to a page, along with the number of preceding pages which didn't link to it
2024-06-25 21:54:35 I suppose, trying to answer my question, you could compare the current version you had crawled with an earlier cached version
2024-06-25 21:55:13 if the link was found also in the earlier cached version, you would not have to adjust the page rank of the target
2024-06-25 21:55:43 unless maybe the crawled page's own rank had changed
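[editor's note: a minimal PageRank power iteration over a toy link graph, to make concrete the "several passes, called iterations, through the collection" that the quoted Wikipedia text describes. The graph, the damping factor 0.85, and the dangling-page handling are the standard textbook formulation, not anything from the log.]

```python
# Classic PageRank via power iteration. Each iteration scans every link in
# the collection, which is why a rank depends transitively on the whole
# graph and can't be patched locally when one page changes.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}           # start from a uniform guess
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            out = links[p]
            if out:                               # split rank over out-links
                share = damping * rank[p] / len(out)
                for q in out:
                    new[q] += share
            else:                                 # dangling page: spread evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank                                    # values sum to 1
```

Because `rank[p]` on one pass feeds `new[q]` on the next, an edit to one page can in principle ripple everywhere, which is the intuition behind the question above: incremental per-crawl adjustments are only an approximation, and the exact values come from re-running the iteration over the whole graph.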