2023-07-26 08:00:24 ACTION is looking into utf8 handling 2023-07-26 08:01:23 I thought the Unicode consortium had defined utf8 to use up to and including 7 bytes 2023-07-26 08:01:35 per codepoint that is 2023-07-26 08:01:57 Dang, drag and drop is extremely user friendly. Means it has an entire USB stack running from the get go? 2023-07-26 08:02:48 xelxebar: or just the parts needed for usb mass storage device writes 2023-07-26 08:19:56 Converting UTF-8 to codepoints is easy, it's all the other unicode shite that's nasty 2023-07-26 08:20:17 Like how characters are different widths, a single glyph can be many codepoints strung together, etc 2023-07-26 08:21:24 Also you have to know *when* to interpret different things 2023-07-26 08:22:20 Often you don't want to interpret at all, which is the mistake lots of newer languages make that try and greedily decode UTF-8 data to store in strings 2023-07-26 08:22:32 Python is the worst for this 2023-07-26 08:23:19 It's pretty rare that "unicode codepoints" is what your business logic will care about 2023-07-26 08:29:12 Pretty sure utf8 isn't more than four bytes. 2023-07-26 08:29:29 I think 'valid' UTF-8 can't be more than four 2023-07-26 08:30:04 But I think the actual encoding scheme might support more bytes in case they decide to make Unicode larger, can't remember 2023-07-26 08:30:08 I think full unicode can be more. 2023-07-26 08:30:24 But the encoding diagram for utf8 on Wikipedia only shows up to four bytes. 2023-07-26 08:30:42 the FSS version has more 2023-07-26 08:31:21 Looking at the encoding scheme, there's no limit to how long it could be 2023-07-26 08:31:33 If you wanted 1000-bit encoding then you can do it, just keep adding bytes 2023-07-26 08:32:11 https://en.wikipedia.org/wiki/UTF-8#Encoding 2023-07-26 08:32:15 Oh wait no I'm dead wrong... 
but there is room for more 2023-07-26 08:32:31 I think 7 is the limit 2023-07-26 08:32:40 The first byte determines the length 2023-07-26 08:32:47 Sorry spinning too many plates 2023-07-26 08:32:57 I know the feeling. 2023-07-26 08:33:01 I'm going to go back to my Forth testing framework :) 2023-07-26 08:35:04 I've got my system set up so that KEY and EMIT can be utf8. Nice thing is that that can still work on a 32-bit system. 2023-07-26 08:36:13 What I really like about utf8, compared to ansi control codes for terminal stuff, is that you can recognize a utf8 char traversing either direction. 2023-07-26 08:36:48 So moving a cursor point back and forth in a string can be done incrementally. 2023-07-26 08:37:17 Not that it did me any good in the end, because I also support the ansi control strings. 2023-07-26 08:37:18 KipIngram: codepoint != character 2023-07-26 08:37:44 Because of them I wound up having to start from the beginning and count over every motion. 2023-07-26 08:38:07 Yes, I know. I was sloppy with wording there. 2023-07-26 08:38:19 Well that means you will accept some characters and break on others 2023-07-26 08:38:32 I don't think I really understand the full nuances of unicode, but utf8 is manageable enough. 2023-07-26 08:38:48 I don't think I really understand why you'd need KEY to 'support' UTF-8 2023-07-26 08:39:15 Well, it just depends on where you want to modify the system for UTF8. 2023-07-26 08:39:21 You can put the support in various places. 2023-07-26 08:39:36 Putting it in early as I said earlier I consider to be a mistake 2023-07-26 08:39:37 It does take more than just code in KEY and EMIT, though. 2023-07-26 08:39:45 Unicode has 2^21 integers reserved, which happens to be exactly the limit for 4-byte utf-8. 2023-07-26 08:39:46 I have changes in WORD and EXPECT too. 2023-07-26 08:40:06 xelxebar: It's not a coincidence I don't think 2023-07-26 08:40:30 I just found it "convenient" to modify KEY and EMIT in that way.
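For reference, a minimal sketch of the two UTF-8 properties discussed above: the first byte determines the sequence length, and character boundaries can be found scanning in either direction, because continuation bytes always have the form 10xxxxxx. It assumes the current definition (RFC 3629), which caps sequences at four bytes and code points at U+10FFFF; the older FSS-UTF scheme allowed longer forms. The four-byte form carries 21 payload bits, though Unicode itself stops at U+10FFFF, somewhat short of 2^21. The helper names are made up for illustration.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the sequence length (1..4) implied by a lead byte under RFC 3629."""
    if lead < 0x80:
        return 1                      # plain ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2                      # 0xC0/0xC1 would be overlong, so excluded
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4                      # 0xF5 and up would exceed U+10FFFF
    raise ValueError(f"not a valid UTF-8 lead byte: {lead:#04x}")


def is_continuation(byte: int) -> bool:
    """Continuation bytes are exactly those matching 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def next_boundary(buf: bytes, i: int) -> int:
    """Index of the next sequence start after position i (or len(buf))."""
    i += 1
    while i < len(buf) and is_continuation(buf[i]):
        i += 1
    return i


def prev_boundary(buf: bytes, i: int) -> int:
    """Index of the sequence start before position i (i must be > 0)."""
    i -= 1
    while i > 0 and is_continuation(buf[i]):
        i -= 1
    return i
```

With a buffer holding "a", the APL iota (U+2373, three bytes), and "b", stepping forward from the iota's lead byte lands on "b", and stepping backward from "b" lands on the iota's lead byte, without rescanning from the start. This only moves by code point; it says nothing about display width or grapheme clusters.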
2023-07-26 08:40:33 KipIngram: What's the advantage? 2023-07-26 08:40:48 A totally dumb unmodified WORD can handle UTF-8, sort of the point of UTF-8 2023-07-26 08:40:54 Well, I can run KEY and enter a utf8 entity. 2023-07-26 08:40:57 It "just works." 2023-07-26 08:40:59 So can I 2023-07-26 08:41:06 I just need multiple KEY's 2023-07-26 08:41:11 I don't. 2023-07-26 08:41:18 What's the advantage of that? 2023-07-26 08:41:59 I found it preferable. I mean, I think the advantage is that I can have KEY and EMIT work in the way you expect them to, for any displayable glyph. 2023-07-26 08:42:11 If you don't think that's advantageous, then I don't really have more argument. 2023-07-26 08:42:18 It feels like an advantage to me. 2023-07-26 08:42:21 Didn't we say earlier that it doesn't manage that though? 2023-07-26 08:42:31 Would that work for i.e. a national flag glyph? 2023-07-26 08:42:39 It works for UTF8. 2023-07-26 08:42:57 If that KEY is just consuming code points, emoji will almost certainly be half-consumed. 2023-07-26 08:42:57 Doesn't work for full unicode, but I don't care about that. 2023-07-26 08:43:02 It's not pedantry to bring up codepoints vs characters here because that is rather the issue at hand 2023-07-26 08:43:20 My goal just wasn't to support full unicode. 2023-07-26 08:43:33 Well unfortunately people made the same decisions as you in things like Python and now we all suffer for it 2023-07-26 08:43:38 For the longest time I was totally content with having just ASCII support, and didn't care at all about utf8. 2023-07-26 08:43:40 Truly in your system at least it won't hurt anyone else 2023-07-26 08:43:50 But then we started talking about APL, and I started wanting *those* characters. 2023-07-26 08:44:07 I may have overlooked something, but as far as I could tell they were contained in utf8. 
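The flag question can be made concrete with a small sketch. A national flag emoji is encoded as two regional-indicator code points, so a KEY that returns one code point per call hands back half a flag. Here `read_codepoint` is a hypothetical stand-in for such a KEY, not anyone's actual implementation, and it assumes the input is valid UTF-8.

```python
import io

# Regional indicator symbols J and P; renderers that support it draw one flag.
FLAG_JP = "\U0001F1EF\U0001F1F5"


def read_codepoint(stream):
    """Hypothetical code-point-per-call KEY over a byte stream.

    Assumes valid UTF-8; malformed input raises UnicodeDecodeError.
    Returns None at end of input.
    """
    lead = stream.read(1)
    if not lead:
        return None
    b = lead[0]
    # Number of continuation bytes implied by the lead byte.
    extra = 0 if b < 0x80 else 1 if b < 0xE0 else 2 if b < 0xF0 else 3
    return (lead + stream.read(extra)).decode("utf-8")


stream = io.BytesIO(FLAG_JP.encode("utf-8"))
first_half = read_codepoint(stream)   # one regional indicator, not a flag
second_half = read_codepoint(stream)  # the other regional indicator
```

A single-code-point APL symbol comes back whole from one call, which matches the goal stated above; anything built from several code points (flags, many emoji, base letter plus combining marks) takes several calls.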
2023-07-26 08:44:31 I support UTF-8 because I don't complain about codes after 127, that's the 'fix' for UTF-8 2023-07-26 08:44:32 Yeah, I think it's a decently okay design decision to ignore extended grapheme clusters, which are a huge pain to deal with. 2023-07-26 08:44:46 And I support APL characters as names because likewise I don't complain about codes after 127 2023-07-26 08:45:11 My 'word' delimiter word is something like 32 < 2023-07-26 08:45:54 I think my decision about it is subject to "political correctness" criticism, because I just don't care about internationalization. 2023-07-26 08:46:44 I decided I wanted APL character set, and this was enough to get that. 2023-07-26 08:46:44 Come on. Don't you want to join the foray into handling mixed LTR and RTL scripts?! /s 2023-07-26 08:46:54 :-) Not so much. 2023-07-26 08:49:28 Did you mess with word delimiters for >127 points? 2023-07-26 08:50:03 Let me turn the question around. What's *disadvantageous* about having KEY and EMIT handle UTF8 "one shot"? 2023-07-26 08:51:01 Performance, perhaps, but I don't loop on EMIT in my TYPE. 2023-07-26 08:51:09 TYPE outputs the whole string with one syscall. 2023-07-26 08:52:29 Honestly, I think if I'd had this approach in mind from the beginning I wouldn't have needed to modify WORD and EXPECT. But I hacked it in after the fact, and that was the path of least resistance. 2023-07-26 08:52:41 Next time I'll try to confine the changes entirely to KEY and EMIT. 2023-07-26 08:53:05 Well, some in EXPECT may be unavoidable. 2023-07-26 08:53:19 But I shouldn't really need any in WORD. Maybe. We'll see. 2023-07-26 08:54:07 I'm not sure - but it just felt to me like the pertinent code could have been "better confined" than I wound up having it be. 2023-07-26 08:54:23 What's disadvantageous?
It increases complexity up-front with no benefit 2023-07-26 08:54:48 I don't see a benefit, other than your taste, and I see a lot of issues 2023-07-26 08:55:01 Like what if you end up reading a binary file that doesn't encode valid UTF-8? 2023-07-26 08:55:27 Why should I care if it's not valid UTF-8? Not all input is UTF-8. Likewise not all terminals are UTF-8 2023-07-26 08:55:34 It's not a contrived problem, this is an issue in python 2023-07-26 08:55:35 I wouldn't expect a binary file to be processable with code designed for displayable text. 2023-07-26 08:55:50 And then also why can't I store KEY as bytes into a byte array? That would be logical 2023-07-26 08:55:54 But with your thing I can't do that 2023-07-26 08:55:57 And I can go on 2023-07-26 08:56:34 Well, I think we're just operating with different "desire sets." Sounds like an argument neither of us can "win." 2023-07-26 08:57:43 Well all I'm saying is, you have something that is more complicated and will break on expected things, and I've got something that always works and solves even your problems, apart from your contrived technical goal 2023-07-26 08:58:04 So I will have to agree to disagree 2023-07-26 08:58:15 I think we're "expecting" different things. 2023-07-26 08:58:32 I'm not trying to write a system that works for everyone in the world. 2023-07-26 08:58:37 I mean UTF-8 in the first place was designed to allow byte-based stream programs to "just work" 2023-07-26 08:58:45 Or, rather, as everyone in the world "might expect." 2023-07-26 08:59:43 IMO less code for same effect is better. The only difference is your KEY will occasionally return a big non-byte number when mine would return a byte and then another byte 2023-07-26 09:00:17 And I say occasionally because you don't support all such characters anyway, only some 2023-07-26 09:00:46 Walking into an environment this defies my basic expectations and would confuse me. But not just me, most people.
So yes I lean on convention a bit 2023-07-26 09:00:58 I will say unconventional choices demand explanation and documentation 2023-07-26 09:01:09 All hypothetical because your work isn't public 2023-07-26 09:01:13 So doesn't need documentation 2023-07-26 09:06:36 I think the lazy codex of "codepoints are characters" is fine to support typical boring exotic characters and ignore emoji etc 2023-07-26 09:06:54 But I think that kind of codex needs to be explicit for the programmer 2023-07-26 09:07:23 And it's important enough to moan about this choice in public whenever people declare it 2023-07-26 09:07:35 Because I want to avoid people doing this in things I'll use 2023-07-26 10:27:04 Well, I'm not going to argue about it any more. I supported the characters I care about - there was no attempt to support unicode in full. Like I noted, the step away from ASCII at all was purely because of my interest in APL. Having a different set of design goals from you just isn't a good thing to argue about - we're each writing our code for ourselves. In my eyes, "one keystroke, one call to KEY" was 2023-07-26 10:27:06 a reasonable decision - I guess you just have a different opinion. That's fine. 2023-07-26 10:27:24 Also, one call to EMIT, one character on the screen - same idea. 2023-07-26 10:28:15 God, I'm dialed into a big shot "all hands" meeting. These things are really just so lame. 2023-07-26 10:28:35 Reminds me of high school pep rallies, just without the cute cheerleaders. 2023-07-26 10:31:11 I mean, I'm sitting here thinking about this discussion, and to ME your reasoning just seems entirely wrong. But I guess mine seems entirely wrong to you. So we're just sitting at such different places in our points of view that there's really no point haggling. 2023-07-26 10:40:15 At least one aspect of your reasoning here would also fit another example - let's say I popped in and announced that I'd written an English/French translation program.
It feels like you'd be telling me it was misguided because it didn't translate every language on the planet. That's EXACTLY how I feel about utf8 vs. non-utf8 unicode. 2023-07-26 10:40:41 I wasn't really bothered by the "it doesn't translate everything" aspect, don't get hung up on it 2023-07-26 10:40:58 You know what really bugs me is at some point Windows changed the way stacking order works, so now if I minimise a program it's 'next' in the alt-tab order 2023-07-26 10:41:05 The bane of heavy keyboard control users 2023-07-26 10:49:24 Really weird is that if you disable desktop window manager and/or kill explorer (can't remember which or if both are needed) it will go back to old behavior 2023-07-26 10:49:47 But lots of programs refuse to run without explorer running 2023-07-26 10:57:44 I hate how little consideration is given for "keyboarding" these days. 2023-07-26 10:57:56 It's like it's an afterthought. 2023-07-26 11:20:34 my xterm keyboard fine 2023-07-26 11:45:10 I'm not talking about just having a keyboard operate. I'm talking about integrating keyboard operation into an app's navigation. 2023-07-26 11:46:12 If I were writing an app (with limited exceptions - there ARE some apps that are just inherently graphical) I'd write it first as a console application and then add the GUI. 2023-07-26 11:46:44 What I really think is the best approach is to first implement an API that lets you "remote operate" the application, then lay the GUI and the TUI on top of that. 2023-07-26 11:47:08 Among other things that immediately lets the app and the interface run on separate systems, if that's useful. 2023-07-26 11:47:49 And I'm talking here about an app running under an OS like Linux - not about something implemented on bare metal. Though even on an embedded system I'd try to "approach" that sort of structure.
2023-07-26 11:48:52 If all user operations are actually going through an API, then you know from the jump that you can automate testing of the app, since the automation would be able to do anything the user could do. 2023-07-26 12:01:46 Well, I drug out my old ItsyBitsy M4 and updated its bootloader and CircuitPython items. 2023-07-26 12:02:28 It also uses that drag and drop method - you just boot it either to "bootloader mode" or "standard operation mode" and drop a file formatted as a UF2 binary on the drive. 2023-07-26 12:02:41 The UF2 format seems well documented. 2023-07-26 12:02:53 All I'm missing for that one is internal architectural details of the board. 2023-07-26 12:03:03 "All" - that's actually a pretty big blank. 2023-07-26 12:22:41 UF2 format is here: https://github.com/microsoft/uf2 2023-07-26 12:37:33 Oh... That's cute. So REFILL just updates #IN and TIB to point to the next line. 2023-07-26 12:38:35 Still hand disassembling SmithForth, here. Starting to get into meatier definitions it seems. 2023-07-26 12:46:33 KipIngram: You definitely like the layered architecture don't you?
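As a rough illustration of the API-first layering described above, here is a minimal sketch in which a text front end and an automated test both drive the same core operations. Everything in it (`JobState`, `JobApi`, `run_text_command`, the two commands) is hypothetical and invented for the example; a GUI would sit on top of `JobApi` in the same way the text front end does.

```python
from dataclasses import dataclass


@dataclass
class JobState:
    """State the application manipulates (fields are made up for illustration)."""
    device: str = ""
    pattern: bytes = b""


class JobApi:
    """Hypothetical core API; every front end calls only these operations."""

    def __init__(self) -> None:
        self.state = JobState()

    def select_device(self, name: str) -> None:
        self.state.device = name

    def load_pattern(self, data: bytes) -> None:
        self.state.pattern = data

    def ready(self) -> bool:
        """True once both a device and a non-empty pattern are set."""
        return bool(self.state.device and self.state.pattern)


def run_text_command(api: JobApi, line: str) -> bool:
    """Tiny text front end: translate one command line into an API call."""
    verb, _, arg = line.strip().partition(" ")
    if verb == "device":
        api.select_device(arg)
    elif verb == "pattern":
        api.load_pattern(bytes.fromhex(arg))
    else:
        raise ValueError(f"unknown command: {verb!r}")
    return api.ready()
```

Because the front end holds no state of its own, a test script can issue the same commands a user would type and then inspect `api.state` directly, which is the testability argument made in the log.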
2023-07-26 12:50:56 When I write GUI programs I tend to just keep everything simple, slurp everything into a big array or whatever 2023-07-26 12:51:15 I don't worry about separation or anything, no special API for accessing the 'business logic' 2023-07-26 12:51:21 The GUI comes first for me I suppose 2023-07-26 12:51:57 But generally the logic that could be separated ends up separated anyway with a modicum of factoring and tidiness 2023-07-26 12:57:28 KipIngram: I think an ELF would suffice 2023-07-26 12:58:00 Whenever I see a 'format' for 'loading binaries' I always roll my eyes watching people convert an ELF into this other format, when they could have just left it as an ELF 2023-07-26 12:58:20 I had a rant about this exact situation at work yesterday 2023-07-26 12:58:39 (regarding UF2) 2023-07-26 13:21:45 Well, if the layers are already there and forced on me, I think there's a way they should be used. That's why I caveated embedded applications. But let me say it this way - if there's going to be a GUI, I think it should be on top of SOMETHING. That might be "on top of a TUI," but it might be better in some cases to put both on top of an API. 2023-07-26 13:21:59 I don't think that a GUI should be the "first and foremost" interface. 2023-07-26 13:22:10 But that's the way most software is approached these days. 2023-07-26 13:23:03 Do you think there should be a 'model'? 2023-07-26 13:23:33 Not sure what you mean. Example? 2023-07-26 13:23:51 Well in the sense of e.g. 'model view controller' 2023-07-26 13:24:02 But without having 'views' or 'controllers' etc 2023-07-26 13:24:03 I guess I do think one should think in terms of manipulating a state. 2023-07-26 13:24:14 And that would lead one to consider various "basic operations." 2023-07-26 13:24:25 Then an interface would offer a pleasing way to perform those operations.
2023-07-26 13:24:31 I think having a 'model' is quite sane, and 'views' as well 2023-07-26 13:24:45 I'm not up on the terminology, but it sounds reasonable. 2023-07-26 13:24:54 I don't really know what a 'controller' is other than the app itself? I don't know the terms myself either really 2023-07-26 13:24:57 I think there should be a "rigorous view" of what one is doing. 2023-07-26 13:25:08 Shouldn't go just tossing things together. 2023-07-26 13:25:09 When I did MFC I felt it made a reasonable amount of sense 2023-07-26 13:25:46 I prefer the tossed together GUI approach, for anything less than a really complicated GUI 2023-07-26 13:25:59 If you're not writing the next Excel 2023-07-26 13:26:15 Well, I still think there's a "basic way to think about" most applications. 2023-07-26 13:26:27 Like at BP, we made equipment for programming semiconductor devices. 2023-07-26 13:27:01 MVC does actually make a lot of sense, for most basic GUI programs, I'll give it that 2023-07-26 13:27:03 So, you need to choose what device you're programming, specify your programming pattern, and in some cases specify how the parts are going to be delivered and how you want them dispensed with after they're programmed, etc. 2023-07-26 13:27:21 All of those questions would get answered somewhere in the user interface. 2023-07-26 13:27:38 Similar reasoning on the seismic equipment I dealt with later. 2023-07-26 13:27:54 There's a way seismic techs "think about" getting a job done. 2023-07-26 13:28:27 That wasn't really my part of the puzzle - there was another guy (the one I wound up butting heads with) who was a lot more knowledgeable than I was about what made a good customer experience. 2023-07-26 13:28:40 But I could make the electronics do whatever we decided it needed to do. 2023-07-26 13:29:05 Could have been a good partnership if we'd been able to work well together.
2023-07-26 13:29:22 But from my perspective he wanted to run everything, and he likely had some similar perspective about me. 2023-07-26 13:29:51 I can't imagine how I might have made him feel like I wanted to run that customer experience part, though, because I don't recall having any interest in dictating that. 2023-07-26 13:31:31 Heh. 2023-07-26 13:31:33 https://www.nature.com/articles/d41586-023-02361-7 2023-07-26 13:31:41 "ChatGPT broke the Turing test." 2023-07-26 13:32:04 Forever the Turing test has been the benchmark for AI. But apparently ChatGPT can pass it, and yet it's still obviously not "intelligent." 2023-07-26 13:32:06 Ooops. 2023-07-26 13:32:09 Need a new test. 2023-07-26 13:32:30 I predict the same thing will happen all over - sooner or later "AI" will pass the new test too, but will STILL be "missing something." 2023-07-26 13:32:44 I don't think it will ever not be "missing something." 2023-07-26 13:32:55 No matter how well it impersonates human behavior. 2023-07-26 13:33:21 The problem is that we don't even have a good scientific way of defining that "missing aspect." 2023-07-26 13:34:00 AI - the lights are on, but no one is home. 2023-07-26 14:11:07 ChatGPT doesn't pass the turing test, does it? The point is you choose which of two candidates are real, and which is AI. I feel like right now most people would 'win' the imitation game 2023-07-26 14:12:44 Apparently, if you read that article, it does pass many tests that would have been thought of as "Turing tests" in the past. The article says that it is capable of fooling people into thinking they're conversing with another human, at least for short periods of time, and not including experts who know a lot about Large Language Models and can exploit known weaknesses. 2023-07-26 14:13:08 So people are looking to invent new classes of tests. 
2023-07-26 14:14:09 But the article said LLMs work by simply making statistical inferences using an enormous body of textual information - they just calculate "most likely next words" using probability theory. 2023-07-26 14:14:19 There's clearly no "understanding of the content" going on. 2023-07-26 14:14:23 eliza also fooled some people 2023-07-26 14:14:25 The Turing Test or 'imitation game' is more about philosophy than being an actual test. It's only ever been considered a 'proper test' by pop science people 2023-07-26 14:14:40 Yeah, but evidently not enough for them to find it significant. 2023-07-26 14:14:45 The new stuff is much better. 2023-07-26 14:14:50 It's a fantastic thought experiment from Turing, endlessly abused by pop science 2023-07-26 14:15:05 Yes. 2023-07-26 14:15:13 I don't disagree with that at all. 2023-07-26 14:15:21 And that is why I didn't read the article 2023-07-26 14:15:26 Another realm of science that gets mangled by the popular press. 2023-07-26 14:15:37 Because its title indicated it's pop science and I don't have time for that 2023-07-26 14:15:43 Also it's nature.com 2023-07-26 14:15:44 I'm so tired of "electrons in two places at the same time" articles. 2023-07-26 14:16:33 I get some pretty interesting stuff from them. My main complaint about them is that they're CLEARLY politically skewed and don't hesitate to use their platform for promoting their point of view. 2023-07-26 14:17:00 They're not totally overt about it, but if you read between the lines it's very clear. 2023-07-26 14:17:29 They have a thing called "Nature Briefs" that I get in my inbox. 2023-07-26 14:17:38 Sounds like a low SNR 2023-07-26 14:17:43 I thought about subscribing to the publication once, but they want $200 a year and that's just not happening. 2023-07-26 14:18:12 Yes, but it only takes me a minute to glance at the headlines in the briefs and it's only now and then I click in, when something seems sufficiently interesting.
2023-07-26 14:18:25 Most of them are just not topics I'm interested in, regardless of how good or not they are. 2023-07-26 14:18:59 These publications are too niche and expensive 2023-07-26 14:19:04 Yes. 2023-07-26 14:19:08 Definitely too expensive. 2023-07-26 14:19:10 Well actually they're not expensive, we're all just poor 2023-07-26 14:19:16 Our money doesn't buy as much as it used to 2023-07-26 14:19:23 :-) 2023-07-26 14:19:59 When I was a child I could afford to buy the paper most days, and I did 2023-07-26 14:20:08 I can't justify that anymore 2023-07-26 14:23:45 We could see the doctor if we needed to back then 2023-07-26 14:54:08 Yes. 2023-07-26 14:56:14 Actually I think all of that "blank" information I mentioned earlier is just part of the ATSAMD51 documentation. I was thinking that the peripheral information would be related to the board level design, but it looks like all of that stuff is on chip. 2023-07-26 14:56:26 So I think this is likely all I need: 2023-07-26 14:56:28 https://ww1.microchip.com/downloads/en/DeviceDoc/SAM_D5x_E5x_Family_Data_Sheet_DS60001507G.pdf 2023-07-26 20:55:53 ACTION muses on storing toki pona text as 14 bits of a 16 bit cell using https://github.com/abalidoth/sitelen_palisa 2023-07-26 22:41:28 Well, I think my favorite sitelen_palis symbol will just have to be "kipisi" 2023-07-26 22:41:43 s/palis/palisa/ 2023-07-26 23:10:54 way to cut to the chase