00:00:00 --- log: started forth/05.08.26 01:06:09 --- quit: crc (Read error: 110 (Connection timed out)) 02:30:46 --- quit: YoyoFreeBSD_ (Read error: 110 (Connection timed out)) 03:04:31 --- join: qFox (n=C00K13S@92pc222.sshunet.nl) joined #forth 03:33:15 --- join: amca (n=plump@as-bri-3-171.ozonline.com.au) joined #forth 03:35:28 Gday 03:45:47 mornin :) 03:46:27 Evening here :) 03:48:10 Is there a difference between "variable foo 10 , 20 ," and "create foo 10 , 20 ,"? 03:49:03 yep 03:49:11 What is it? 03:49:20 "variable foo" is equivilent to "create foo 0 ," 03:49:45 Ah 03:50:17 So variable initialises the first array position? 03:50:44 it's usually used when you just want one 03:51:16 Or do you mean "variable foo 10 , 20 ," would allocate space for 3 numbers whereas "create foo 10 , 20 ," allocates space for only 2? 03:51:36 right 03:51:48 I'm not entirely sure the first would work in all forths 03:52:07 It does in gforth 03:52:14 it's possible the space for one number allocated by "variable" would be in a different place in some forths maybe 03:52:18 maybe 03:53:04 well, gotta run. ttyl 03:53:21 http://rafb.net/paste/results/8wyHhr52.html 03:53:25 ok bye 04:56:29 --- quit: amca ("d34d") 05:31:39 --- log: started forth/05.08.26 05:31:39 --- join: clog (i=nef@bespin.org) joined #forth 05:31:39 --- topic: 'Forth: One language, many dialects. #forth - general forth discussion. #c4th - ColorForth. #retro - RetroForth. #c4th-ot - social channel. #1xforth - a secret channel for 1xforthers. #concatenative - the category of language that forth belongs to (sorta).' 
05:31:39 --- topic: set by crc on [Sat Jul 23 13:29:38 2005] 05:31:39 --- names: list (clog skylan qFox ianp @JasonWoof warpzero ccfg OrngeTide saon docl Raystm2_ Snoopy42 virsys slava saon|fbsd Jim7J1AJH I440r Quartus onetom) 05:34:09 --- join: cmeme (n=cmeme@boa.b9.com) joined #forth 05:34:54 --- nick: Raystm2_ -> nanstm 05:35:21 --- quit: cmeme (Client Quit) 05:36:04 --- join: cmeme (n=cmeme@boa.b9.com) joined #forth 05:37:18 --- quit: cmeme (Client Quit) 05:38:01 --- join: cmeme (n=cmeme@boa.b9.com) joined #forth 05:39:14 --- quit: cmeme (Client Quit) 05:39:56 --- join: cmeme (n=cmeme@boa.b9.com) joined #forth 05:52:59 --- join: YoyoFreeBSD_ (n=yoyofree@222.90.3.96) joined #forth 05:57:22 --- join: snowrichard (n=richard@adsl-69-155-177-155.dsl.lgvwtx.swbell.net) joined #forth 06:06:22 --- join: madwork (n=madgarde@derby.metrics.com) joined #forth 06:25:34 --- quit: I440r ("Leaving") 06:48:08 --- join: PoppaVic (n=pete@0-1pool46-103.nas30.chicago4.il.us.da.qwest.net) joined #forth 06:48:29 G'day 06:49:05 hi 06:54:45 --- quit: snowrichard ("Leaving") 07:03:54 --- join: snowrichard (n=chatzill@adsl-69-155-177-155.dsl.lgvwtx.swbell.net) joined #forth 07:05:17 --- quit: snowrichard (Client Quit) 07:08:20 --- join: snoopy_16 (i=snoopy_1@dsl-084-058-160-031.arcor-ip.net) joined #forth 07:09:50 --- join: sproingie (i=foobar@64-121-15-14.c3-0.sfrn-ubr8.sfrn.ca.cable.rcn.com) joined #forth 07:09:59 Mornin' 07:10:40 morn 07:11:31 --- join: tathi (n=josh@pdpc/supporter/bronze/tathi) joined #forth 07:14:49 --- quit: Snoopy42 (Read error: 60 (Operation timed out)) 07:15:02 --- nick: snoopy_16 -> Snoopy42 07:18:37 --- join: derv0 (n=derv0@proxy1.nscl.msu.edu) joined #forth 07:23:20 --- join: virl (n=hmpf@chello062178085149.1.12.vie.surfer.at) joined #forth 07:29:35 --- quit: qFox ("this quit is sponsored by somebody!") 07:42:26 --- quit: PoppaVic ("Pulls the pin...") 08:07:11 --- join: bigPoppi (n=derv0@proxy1.nscl.msu.edu) joined #forth 08:09:18 --- part: bigPoppi left 
#forth 08:10:42 --- join: bigPoppi (n=derv0@proxy1.nscl.msu.edu) joined #forth 08:16:18 --- quit: derv0 (Read error: 110 (Connection timed out)) 08:46:06 --- join: PoppaVic (n=pete@0-2pool198-51.nas30.chicago4.il.us.da.qwest.net) joined #forth 08:49:17 I miss anything invigorating? 08:52:06 nope, dead calm all morning 08:52:15 dang, that's sorta' depressing 08:52:33 except amca was asking if there's a diff between create and variable 08:52:50 hehe - I'll bet that was confuzzling ;-) 08:53:04 actually in retroforth, create just makes a stub whereas variable actually allots space 08:53:24 yeah, it varies all over 08:53:32 I think variable is create allot or some such 08:53:44 it certainly can be 08:54:32 I've been looking over my stuff (again), and thinking of tathi & sproingie... Also double-thinking the database-idea 08:54:46 in #perl they were talking about bear-baiting 08:54:52 .sigh 08:55:00 perl... What a waste 08:55:05 hehe 08:55:20 folks always forget the purpose of the stupid language. 08:55:25 pragmatic and easy to learn, what's not to like? 08:55:40 :P 08:55:45 not to like... Um, I avoid it like plague 08:56:07 mostly because it almost makes C look attractive 08:56:40 nah, all that meaningless structuring and casting you have to do in c is automated in perl 08:56:53 it isn't meaningless ;-) 08:57:08 and, yes - perl will holdyer hand nicely 08:57:51 of course python does about as well if you're not into messy coding 08:58:01 I believe about 70-89% of the complaints about C come from kids that started out in perl, python, php or some sorta' doze-script 08:58:10 yup probably 08:58:40 as I did, since I had no easy alternatives 08:58:42 "why do I have to {allocate, free}?" "WHY can't I return an array?" 08:59:42 yeah that's probably why I have a mental block against c to this day. 
I'm too used to hand-holding 09:00:33 but I'm starting to get the picture on why things go the way they do in c and other medium-level languages 09:00:56 yah 09:01:08 I will guarentee forth-guts help that along 09:01:59 I was discussing with madwork on #retro about the idea of making executable delimiters, so not everything needs a space 09:02:09 heh 09:02:13 Good luck there 09:02:14 would be necessary if I want to emulate the feel of other languages 09:02:24 Yes, I've got it on my agenda 09:02:51 it necessitates a better parser-systemology 09:02:54 we might need to switch to a hashed dictionary for speed. I'm reading the wiki entry on hashes now 09:03:10 Need to lookup every character as you parse it, essentially. :) 09:03:20 (depending on state) 09:03:25 I don't really understand hashing in my guts, I always think of balanced-trees 09:03:37 I understand hashing. If you get stuck let me know. 09:03:48 madwork: right, and I sorta' plan to make it possible to switch testers/parsers in and out 09:03:53 --- quit: bigPoppi ("Client exiting") 09:04:05 we're thinking newbies might be more comfortable with other syntaxes than : ; like maybe c{ }c and the like 09:04:12 sure 09:04:28 Tis what I started to do for my minish sub-project 09:05:00 docl: it also - done properly - can clean up semantics and syntax. 09:05:30 thing is there are good and bad hash algorithms, and you have to use the right ones to get any kind of efficiency 09:05:49 yeah, that's the primary reason I just use avl-trees 09:06:31 Go for *very* simple. You're effectively hashing English text, when hashing a Forth dictionary. Multiply each character by a small prime and add; you'll find you get very good distributiojn. 09:06:34 -j 09:06:52 yeah, I've tried both. 09:07:07 hash functions for single-character delimiters should be simple. 09:07:20 I've tried that sorta' hash and then LL, and it works. 09:07:22 Er, sorry. Painkillers. Add each character and multiply the sum by a small prime. 
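A minimal sketch of the hash as corrected just above ("add each character and multiply the sum by a small prime"), read here as a running add-then-multiply over the word's characters. The multiplier 31, the 256-bucket table, and the names `word_hash` and `bucket_of` are assumptions for illustration; the log names no specific prime or table size. The bucket index is taken with a mask rather than a modulus, which only works when the table size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_BITS  8u                     /* assumed: 256 buckets */
#define TABLE_SIZE  (1u << TABLE_BITS)
#define SMALL_PRIME 31u                    /* assumed multiplier; the log names none */

/* Running "add the character, then multiply by a small prime" hash
 * over a counted string (no terminating NUL required). */
uint32_t word_hash(const char *name, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = (h + (unsigned char)name[i]) * SMALL_PRIME;
    return h;
}

/* Reduce a hash to a bucket index with a mask instead of a division. */
uint32_t bucket_of(uint32_t h)
{
    assert((TABLE_SIZE & (TABLE_SIZE - 1u)) == 0u);   /* must be a power of two */
    return h & (TABLE_SIZE - 1u);
}
```

Usage would look like `bucket_of(word_hash("dup", 3))`; how well this spreads a real dictionary is something to measure rather than assume.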
09:07:46 ..but a tree is just as easy to my mind - of course, I *LIKE* reusable code 09:07:52 If you use a heavy hash algorithm, you'll a) not get any better distribution and b) waste time, when the whole idea is to speed things up. 09:08:00 yep 09:08:24 I also recommend hashtable sizes that are a power-of-two. This is contrary to the texts, but it works fine. The texts are wrong. 09:08:30 to be honest, a mmap'd binary datafile is prolly even more economical 09:08:47 Quartus, I came up with a hash function that works well on power-of-two sized tables a little while ago, plus it's fast. 09:09:04 I'd like to see it. 09:09:30 heh - I use 2^ often, but... I found that using some prime value in there does balance the array better 09:10:09 Yes, as long as you're not using a prime-sized table and doing a modulus on the hash value. :-& 09:10:19 the kicker, to me, is to blackbox whatever you can 09:10:36 Quartus, let me track down my function... 09:10:38 aye, as long as it's good 09:10:49 Division is slow. 09:10:58 madwork: nah, I can't recall the funcright now, but I believe I hashed char-groups into a prime-sized array of tree-roots 09:11:21 seems like duplicating the symantics properly, you could import code from any language 09:11:29 yep 09:11:39 or a damn good subset 09:11:47 Quartus, look at hash_rand8(), hash_rand9(), and hash_rand0() on this page: http://www.freelunchdesign.com/cgi-bin/codwiki.pl?PaulPridham/CodePaste 09:12:02 then we'd almost instantly have the whole GNU+ collection of open source stuff for retro 09:12:19 or quartus, factor, etc. 09:12:38 sure - did that suprise you? 09:13:09 kinda. it hasn't been done has it? 09:13:21 docl: be advised, this is what doze does for all their stupidscript-engines. 09:13:24 madwork, that's the idea. Even smaller multipliers work with Forth. 09:13:38 you use a basic VN and all other shit must go thru the vm-universe 09:14:02 VN/VM 09:14:23 ok, cool. so that's kinda what cygwin does? 
09:14:24 docl: it's what I'd like to get done for *nix, too 09:14:49 particularly if we can expect/offer translation INTO a proper, lower language 09:15:33 so, sure - emulate/interpret/forth "compile"... and also offer an exporting-capability, like gcc tries with asm 09:16:19 When testing hashfunctions I compared against a hashfunction built with the MersenneTwister; can't do any better than that for distribution. 09:17:04 Quartus: I'd still say it should be blackboxed 09:17:18 Something's gotta be inside the box, whatever colour it is. :) 09:17:35 like with libdb, it should be damn-near invisible 09:18:24 Quartus: hell, sure... basically it's all key/value add/remove/find/create/delete 09:18:44 Quartus, I was thinking of trying the MersenneTwister for hashing too. But the rand calculation I do is simpler, and it's only a one-time hit. 09:19:02 madwork: all the work up front? 09:19:02 hash_rand8() turns out to be a little bit faster than superfasthash. 09:19:28 PoppaVic, at the end, but yea. 09:19:53 yah. so, "get a name", smurf!, lookup. 09:20:26 hash_rand0() would be the best one to adapt to a single-character delimiter lookup. 09:21:06 I'd be suprised if you even let them know it's 1 char. There is no need-to-know for blackboxing 09:21:08 I'll have to test it though, to see how distribution is over the range of 8-bit numbers. 09:21:10 different hash algorithms go better in different situations 09:21:18 docl, yep. 09:21:28 docl: sure, they all say that ;-) 09:21:34 so label the blackboxes with stats or something 09:21:41 hmm? 09:21:55 docl: all you dois specify an input and an output 09:21:59 PoppaVic, it only needs to be one char for simplicity. if the delimiter wants to belonger than 1 char, it can handle that case when it executes. 09:22:13 you need to make sure they can pick the right one for the job, right? 09:22:26 madwork: absolutely - what I mean is the bb can use 1 - or N - the user has noNeed To Know 09:22:43 Sure. 09:24:00 so, "+" and "+!" 
may be in the same hash, or the same tree, or different hashes/trees - all we care is "feed it" +"expect" = "returning" 09:25:07 docl: this is also why Bog created vectored-functions ;-) 09:25:53 Bog? 09:26:13 I think vocabularies are the best way to handle different syntax. 09:26:24 well, God, gog/magog are busy 09:26:28 hehe 09:26:50 madwork: yes, I'm considering a scanner/parser func-ptr for them, too 09:27:47 ..add in the search-order stack and shit, and it seems likely to me we can get along nicely w/o lex andgoddamnedyacc 09:28:07 Damn skippy. 09:28:28 Or "pee" and "poo" as I like to call them. 09:28:28 well, I LIKE delegation and structures 09:28:57 folks just never seem to grok "context structures" 09:29:20 ..close as they come is goddamned C++ and "Classes" 09:29:44 hmm context structures. nice idea 09:30:33 docl: very, very old 09:30:45 What do you mean by "context structures?" 09:30:48 predates C++ by a long margin 09:31:15 madwork: a structure passed to a collection of funcs so they are all working together. 09:31:29 or a *, actually - same-same 09:31:53 The only drawback is it immediately costs an arg in C 09:32:18 in a VM, well - perhaps it is a member of the "global" struct 09:32:43 Ahh. OK, I do this all the time. 09:32:48 sure 09:32:58 anyone that likes reusable code learns it asap 09:32:59 It's the "this" in C++ 09:33:04 righto 09:34:08 Context structure with a pointer to a vtable, I think we were talking about this a while back, PoppaVic. ;) 09:34:51 yes, add a vtable* and you now have a compact, method-func system embedded as well - well beyond C and forth deferred 09:35:39 I have been thinking about these things for years and years... I LIKE forth, and I LIKE C - but neither ever seems to learn from the other. 09:37:02 I'd even be happy to have a decent forthish shell/interp and live there while being able to say "there! now generate my C source!" 09:37:07 hashing? numbers as word names are faster ;-) 09:37:27 numbers as delimiters, ick. 
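The "context structure with a pointer to a vtable" idea from the exchange above can be sketched in C roughly as follows. Everything here is hypothetical: the `store` type, its `add`/`find` methods (loosely echoing the key/value "blackbox" mentioned earlier), and the array backend are illustrative names, not anyone's actual code.

```c
#include <assert.h>
#include <stddef.h>

struct store;                              /* the context structure */

/* Method table: every function receives the context as its first argument. */
struct store_ops {
    int (*add)(struct store *s, int key, int value);
    int (*find)(const struct store *s, int key, int *value_out);
};

#define STORE_CAP 16

struct store {
    const struct store_ops *ops;           /* the vtable pointer */
    int    keys[STORE_CAP];
    int    values[STORE_CAP];
    size_t count;
};

/* One possible backend: an unsorted array with linear search.
 * A hash or tree backend would supply its own store_ops. */
static int array_add(struct store *s, int key, int value)
{
    if (s->count == STORE_CAP)
        return -1;                         /* full */
    s->keys[s->count] = key;
    s->values[s->count] = value;
    s->count++;
    return 0;
}

static int array_find(const struct store *s, int key, int *value_out)
{
    for (size_t i = 0; i < s->count; i++) {
        if (s->keys[i] == key) {
            *value_out = s->values[i];
            return 1;                      /* found */
        }
    }
    return 0;                              /* not found */
}

const struct store_ops array_ops = { array_add, array_find };

void store_init(struct store *s, const struct store_ops *ops)
{
    assert(ops != NULL);
    s->ops = ops;
    s->count = 0;
}
```

Callers only ever write `s.ops->add(&s, key, value)`, so the backend can change without the caller knowing, at the cost of the extra argument noted above.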
09:37:49 no, he means like (int)"expect" 09:38:11 that really does no good at interpret-time. 09:38:29 You still have to look up "expect". 09:38:32 it's another of those "hashes are tossups" things 09:38:59 no, no - "expect" is a string: treat N characters as components of an int at that place 09:39:08 Ahhh... but, if you don't mind having a 256-entry table, then delimiters can just index directly into it by ASCII value. 09:39:15 yup 09:39:17 wouldn't an escape character (like \) make delimiters possible without checking every single character against an entire hash? 09:39:30 How? 09:39:31 foo: 09:39:35 folks are abusing the term "delimiter" 09:39:38 foo\: 09:39:48 docl: why bother? 09:39:50 So, you want to define new words this way? 09:40:04 I meant ': 123 ;'(but in bytecode) 09:40:25 docl: I already told you - we'd certainly want a better, distributed scanner/parser 09:41:06 ..and, certainly we could care less if 'compiling'/'interpreting' wastes a few millisecs 09:41:06 yeah 09:41:26 once we have a word built, THEN we want the sob to run like a bat outta' hell 09:41:34 which syntax do you want` 09:41:42 `=? 09:42:29 docl: one of the ways it seems quite possible is to segregate "operators" into a voc 09:42:52 and why should be numbers as word names be tossups? they would really speed up lookup 09:44:11 harder to remember 09:44:18 docl: recall that almost all forths use a (can't recall the func) ' ' word-parse sorta' thing.. the topmost int on stack is delimiter. it's a really limited system 09:44:33 wsparse ? 09:44:38 virl, if you want to use numbers as word names, compile them literally. This can already be achieved. 09:44:42 not on my machine, but sure 09:45:14 32 parse foo 09:45:21 I hate it when people don't understand me... I said that all the time 09:45:32 docl: now, if the entry was a ptr to a string where ANY is viable as a delimiter, and you can also check a VOC for a complete word... 
Yer stylin' 09:45:46 sorry virl 09:45:51 madwork: thanks, been awhile ;-) 09:46:11 virl, but direct compilation isn't really making use of language extension capabilities... it doesn't improve things for the average user. 09:46:27 I use it in my bytecode vm to define new words, and a vm which interprets bytecode doesn't need that big uplooking mechanism. 09:46:40 Certainly not! 09:46:53 But, I think we're discussing a higher-level of user interface mechanism here. 09:47:03 Not the Forth writer, but Forth user. 09:47:24 union { char buf[MAXWORD]; int hashish; } 09:47:47 I don't know that, I overread that sry. 09:48:16 but, the guts are guts - blackboxing is ALWAYS sensible 09:50:04 One of the issues with forth is allowing full low<->high in the same voc - it becomes hard to mentally track value from noise. 09:50:38 hmm yeah blackboxing in forth isn't always done enough 09:50:48 well, it's the same in C 09:51:03 virl, the main concept that I brought uplast night in #retro is that if you have executable delimiters, then you can do stuff like this: 09:51:04 foobar: 123 . ; 09:51:04 The : would be an executable delimiter, and parse backwards in the input stream to read the name of the new word definition. 09:51:27 ...I already did my screaming about functions doing loops in loops to loop and reloop - mention some should be funcs and they smirk about "Mr. Pascal" 09:51:53 madwork: HORRIBLE sematic, anywhere 09:52:13 PoppaVic, eh? 09:52:28 going forward to go back and then forward - nasty 09:52:50 : foo 123 . ; istrivial - he needs to consider some better examples 09:52:55 This is how it must be done with a one-pass compiler. 09:53:09 yes, and god save us from 1-pass 09:53:13 foo( int a, char b ) 09:53:28 There you go. 09:53:50 And I don't fear controlling the input stream, I think it's quite a powerful ability. 09:54:02 how about this: "uname -a 2>/dev/null" 09:54:12 BAS{ print "Hello" }BAS 09:54:33 Yes, " is another case. 
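A rough sketch of the executable-delimiter idea, using the 256-entry table indexed by character value that was suggested earlier. Each delimiter is a function that runs when its character is scanned and acts on the token gathered so far, so `foobar:` can define a word without a space before the colon. The `scanner` struct, the handler names, and the trace-string output are assumptions made so the behaviour is visible; a real Forth would define or execute words instead of logging.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN 32
#define MAX_LOG   128

struct scanner {
    char   token[MAX_TOKEN];   /* characters gathered since the last delimiter */
    size_t len;
    char   log[MAX_LOG];       /* trace of what the delimiters did */
};

typedef void (*delim_fn)(struct scanner *sc);

static delim_fn delimiters[256];           /* indexed directly by character value */

/* Record "<tag><token> " in the trace and reset the token. */
static void emit(struct scanner *sc, const char *tag)
{
    if (sc->len == 0)
        return;                            /* nothing gathered, nothing to do */
    sc->token[sc->len] = '\0';
    assert(strlen(sc->log) + strlen(tag) + sc->len + 2 < MAX_LOG);
    strcat(sc->log, tag);
    strcat(sc->log, sc->token);
    strcat(sc->log, " ");
    sc->len = 0;
}

static void on_space(struct scanner *sc) { emit(sc, "run:"); }
static void on_colon(struct scanner *sc) { emit(sc, "define:"); }

void scan(struct scanner *sc, const char *src)
{
    delimiters[' '] = on_space;
    delimiters[':'] = on_colon;
    sc->len = 0;
    sc->log[0] = '\0';
    for (; *src != '\0'; src++) {
        delim_fn fn = delimiters[(unsigned char)*src];
        if (fn != NULL)
            fn(sc);                        /* delimiter executes on the current token */
        else if (sc->len < MAX_TOKEN - 1)
            sc->token[sc->len++] = *src;   /* ordinary character (overlong tokens are truncated) */
    }
    on_space(sc);                          /* end of input flushes the last token */
}
```

Note that the colon never parses "backwards" here: the name is simply the token still being matched when the delimiter fires.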
09:54:45 above offered several issues, yes - " is one of them ;-) 09:54:52 " delimiter parses ahead until it sees another " No trailing space required. 09:54:57 C{ void main(void) { printf "Hello\n"; } }C 09:55:23 docl: this is not a bother to me, I can envision it. 09:55:30 system("uname -a 2>/dev/null") 09:55:47 ahh 0 NOW yer hurting - switch to popen ;-) 09:55:53 heh 09:56:09 this is where I'd like to thrash the vm/forthish guts 09:56:21 This stuff simply enables easy infix/prefix notation. 09:56:27 sure enough 09:59:32 madwork: .... " INFIX: ..... ; ;INFIX ..... ; : CINFIX .... ;" ;-) 10:00:49 although, I suspect "INFIX:" should just run to EOL 10:01:22 somehow it looks horrible 10:01:29 Heh, yea sure... but then you need to write an actual infix parser in there, which is more work than using smart delimiters. 10:01:44 the good old C looks better 10:01:59 madwork: you misunderstand... it'd trigger those vtable-settings we'd spoken of ;-) 10:02:37 which would mean the delimiters could be used in multiple-contexts ;-) 10:02:47 INFX{ ... }INFX 10:03:32 docl: yet, "}INFIX" is a substring, not a char-trigger (see where we are going?) - It might not work as wanted 10:04:08 I do believe, however, that yer attempt sorta' points out the crux of the issue 10:04:33 we need to damn-nearly get a lex-like system in place. 10:04:38 Well... with the basic, simple smart delimiter functionality, anything could be built upon it. 10:05:01 So, lex-like whatever can be an additional lib, or not. 10:05:06 yes, madwork - using a delimiter-voc and sub-parsers? 10:05:24 The delimiters themselves are the sub-parsers, yes. 10:05:50 yer assuming - what - for a "delimiter", assuming we are gathering a string into a buffer? 10:06:02 Yes, the input stream. 10:06:03 forward parsing would be better than the backward thing. 10:06:15 And the backward parsing isn't really backwards at all... 10:06:18 it's the current token. 10:06:23 right 10:06:24 We're still matching it! 10:06:37 yep. 
Just trying to envision the flow 10:06:37 When we reach the delimiter, we just handle the token as it is, however we see fit. 10:06:45 And the delimiter can call EVALUATE or whatever they need to in order to get the job done. 10:06:54 right.... I think 10:06:55 It's extremely simple. 10:07:27 well, writing for lex IS simple. Doing anything from there can suck 10:07:28 Delimiters are fully qualified words in their own right. So they have all the power of Grayskull during parsing. 10:07:52 Yea, but that's not our problem. ;) It's the problem of the language writer. 10:08:02 Delimiters gets them there much more easily. 10:08:07 ahhhh hehe - shovel it off, eh? ;-) 10:08:15 YES, dammit! ;) 10:08:18 I can't follow your thoughts, please describe it clearly so such a natively german talking person with not perfect english abilities can understand you ;-) 10:08:35 The lang itself should offer the proper demonstrative examples, mind you ;-) 10:08:36 virl, I have that problem with PoppaVic as well. ;) 10:08:49 PoppaVic, sure. 10:08:52 * PoppaVic spanks mad ;-) 10:09:05 It could come with an infix math vocabulary, or a subset of BASIC. 10:09:19 BASIC would be great, actually, to welcome the newbies to the system. 10:09:38 And imagine a language-extensible BASIC! 10:09:45 I honestly think that the parsing/scanning and vocs/words/immediate needs a rebuild.. It's a pardigm-shift, and it deserves a lot of cogitation. 10:09:54 Yep. 10:10:29 madwork, please not another Java/Python/PHP/etc. please 10:10:56 it seems to me that we start out with a forthish parsing, to get the basics cheap. 10:11:14 ..and the complexity needs to evolve/build 10:11:41 a forthish parsing? you mean wordword... 10:12:00 virl, well my point is not to design a new language. 10:12:11 My point is enable Forth to easily allow this flexibility of extension. 10:12:19 Trying to envision forth/vocs behaving like mini-lex .l files can be a bit confuzzling 10:12:42 The vocs would simply be the language context. 
10:12:58 madwork, which extension? another syntax? or to change forths syntax into anything elses syntax? 10:12:59 oh, well shit.. I have no prob thinking "new language", but mostly it's building in elements and tools that can BE used 10:13:14 You could, of course, use the facility to create a lex/yacc for creating new language vocs. :P But that would be typical overdesign. 10:13:27 madwork: remember, I mentioned vocs might have their own vectored-funcs ;-) 10:13:43 virl, yes to both questions. 10:14:08 PoppaVic, why would it matter what was in the voc? It's flexible, put what you want in there. 10:14:36 no, I mean an element OF the voc - that means switching to it from the CLI or the search-order ;-) 10:14:37 virl, but mostly to allow someone to create a new language syntax on top of the Forth. 10:15:02 madwork, that means seperating syntax from functionality. 10:15:15 How so? 10:16:09 One of the issues Ican easily envision is.. it means that you'd need a special colon-word sorta' thing for any 'word' having a delimiter-embedded. 10:16:32 Only if the vocabulary that the delimiter exists in is in the search order. 10:17:06 Really people, this is the simplest idea. These delimiters don't even have to be used. The most basic situation is that of a regular Forth syntax, and the default delimiters used are whitespace. 10:17:20 how about any word between { } or words including them, gets parsed more closely. stuff not in curlies stays as regular forth with just wsparse 10:17:20 because you need basics from where you could build a new syntax, like that the output operator must have a basic form. it's like mathematic which their fundamentals which would crash mathematic when they would be wrong. 10:17:23 yeppers, but I was considering "Let us consider "[](){};;\"\'" 10:17:28 Now if someone wants to extend the language and not be tied to purely-whitespace delimiting, they can. 
10:18:09 madwork: I cannot remotely denigrate whitespace-as-easy, it just screws up advanced infix 10:18:25 That's why you need smart delimiters. ;P 10:18:43 yes, and a smart-scanner 10:18:49 and a smart ass! 10:19:10 Well the smart-scanner looks up the delimiters as it looks up tokens. 10:19:17 I am sorta' "seeing" a delimiter-array and a delimiter subvoc 10:19:28 yeah 10:19:43 but it should be optional 10:19:44 If the only delimiter was whitespace, then the whitespace delimiter, when executing, would look up the rest of the characters in the current token and execute/compile. 10:19:50 you'd check charset-delim, then delimiter-voc, then voc-order 10:20:22 FOO 10:20:32 heh 10:20:33 executes, FINDs FOO in teh dict, compiles it. 10:20:42 This is essentially what already happens. 10:20:59 Except that it's usually a special-case handling of whitespace, specifically for dictionary lookup. 10:21:04 yep, because was either in the delim-charset, or a delimiter-voc word 10:21:08 Yep. 10:21:08 hmmm 10:21:20 ok, then we need a GOOD parser for : 10:21:32 So, you put in , alias and etc. to , and yo uhave normal Forth operation. 10:21:38 basically, we need an escapish-system for 'words' 10:21:46 a { word would just vector the wsparse to look in it's preferred delimiter vocabulary 10:22:09 PoopaVic, example? 10:22:15 then } would devector it 10:23:20 the new vector func in retroforth probably makes all this stuff easy 10:23:49 ok, wait a sec - I've a brainfart working 10:23:58 basically any word cen be pointed to a custom value, then reset to the default when it's done 10:24:32 vectors? why the hell vectors? 
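One way to picture the vectoring suggested above, where `{` points the parser at a different delimiter vocabulary and `}` points it back. This is only a sketch under assumptions: the two tables, the rule that commas delimit inside curlies, and the `|`-separated output are invented for illustration and are not RetroForth's actual vector mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct parser;
typedef void (*delim_fn)(struct parser *p);

struct parser {
    const delim_fn *active;    /* the vector: which delimiter table is in force */
    char   out[128];           /* tokens seen, each followed by '|' */
    char   tok[32];
    size_t len;
};

static delim_fn forth_table[256];          /* default: whitespace delimits */
static delim_fn curly_table[256];          /* inside { }: commas delimit too (assumed) */

static void flush(struct parser *p)
{
    if (p->len == 0)
        return;
    p->tok[p->len] = '\0';
    assert(strlen(p->out) + p->len + 2 < sizeof p->out);
    strcat(p->out, p->tok);
    strcat(p->out, "|");
    p->len = 0;
}

static void open_curly(struct parser *p)  { flush(p); p->active = curly_table; }
static void close_curly(struct parser *p) { flush(p); p->active = forth_table; }

void parse(struct parser *p, const char *src)
{
    forth_table[' '] = flush;
    forth_table['{'] = open_curly;         /* vector to the curly vocabulary */
    curly_table[' '] = flush;
    curly_table[','] = flush;
    curly_table['}'] = close_curly;        /* devector back to the default */
    p->active = forth_table;
    p->out[0] = '\0';
    p->len = 0;
    for (; *src != '\0'; src++) {
        delim_fn fn = p->active[(unsigned char)*src];
        if (fn != NULL)
            fn(p);
        else if (p->len < sizeof p->tok - 1)
            p->tok[p->len++] = *src;
    }
    flush(p);
}
```

Outside the curlies `a,b` stays a single whitespace-delimited token, as in ordinary Forth; inside them the same comma splits tokens.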
10:24:35 1) we are coming quite close to the goddamned issues with urls and idiot "folders" with the comline - where a space can ruin yer day 10:25:24 2) we are starting to talk about a different sort of lookup, too: sorting longer-words foremost (like lex) 10:25:58 hmm 10:26:33 madwork: oh, I was thinking of the sorta' '%20' workaround of urls or string-quiting for shells 10:26:48 quoting, two 10:27:08 It's a very weird situation 10:27:26 Not sure why that would be an issue. 10:27:30 you have to tokenize things somehow 10:27:39 In fact it's not at all, because you have control of delimiters. 10:27:40 spaces are easy on the eyes 10:27:47 The current forthish mess is pretty well THE most flexible, if uncomfortable 10:27:49 Also, you can parse-ahead as in normal Forth parsing. 10:28:05 The infix-mess almost makes lexx glow 10:28:36 madwork: well, recall: there are several TYPES of "delimiter" - they can be chars or substrings 10:28:59 Well, the true delimiter is just a char. 10:29:05 Then that delimiter parses its own substring. 10:29:10 or, unifying it: a delimiter is a string of 1...N chars 10:29:14 So if it wants to skip spaces, it's well and dandy. 10:29:36 yeah, thisis what I am cogitating 10:29:44 A delimiter should always be 1 char. 10:30:03 And since it's a fully qualified word, it can do anything from the point it executes. 10:30:11 however, getting a colon-def for a control-char is amusing, I'm trying to think how that'd best work 10:30:13 It could interpret the rest of the Forth program, for all it matters. 10:30:31 Putting the name before the colon instead of after. 10:30:51 in fact, getting a ^c embedded IN a word is also legal (and obtuse), then 10:30:55 Yep. 10:31:00 Could also go Unicode. :) 10:31:06 Woohooo, Wingdings! 10:31:08 not today, please 10:31:12 ;) 10:31:41 : Ö¯Ñ ... ; 10:31:49 I'm trying to envision the current forth declarators, and twist it... 
10:32:07 It almost suggests a cpp sorta' layer 10:32:09 This would be a nice word def syntax: 10:32:16 foo{ 123 . cr } 10:32:28 I could live with it, yes 10:32:41 But, it makes me think, also... 10:32:46 foo { 123 . cr } 10:32:51 "undefined" at '{' - soo? 10:32:52 This would be an error. 10:32:59 undefined at 'foo' 10:32:59 hmm,cppdoes that now 10:33:11 right 10:33:21 But you simply need to be able to handle this in the whitespace delimiter! 10:33:32 yes, I see the pt 10:33:52 So it's quite powerful in its simplicity. 10:34:01 it's interesting, yes 10:34:12 it frees up multiple chars themselves, too 10:34:15 * madwork has dibs on the patent. ;) 10:34:56 oh, hell - I looked at it all years ago -let's not get greedy ;-> I'd rather pervert the next few generations! ;->> 10:35:07 why not code { to look backwards for foo? or foo to look forwards for {? 10:35:24 docl, that's what I was saying for { 10:35:28 Same as foo: 10:35:45 ok 10:35:47 and could take "foo" and look forwards for a delimiter before giving up an error. 10:35:54 (if you wanted to go that far) 10:35:58 docl: I'm beginning to think we need to, essentially take control of word-use-of []{}() sorta-things, and consider nesting 10:36:10 This would allow both foo{ 123 . cr } and foo { 123 . cr } 10:36:15 yeppers 10:36:47 Could also do interesting things with . 10:36:49 foo.bar 10:36:53 123.456 10:36:54 madwork: I also have no issue with acting like cpp with foo[{(] 10:37:04 damn right 10:37:15 not to mention foo:bar, foo::bar 10:37:18 PoppaVic, me either... but that's really the decision of the language writer. 10:37:21 PoppaVic, yep. 10:37:41 well, hmm 10:38:34 ie. I think you're all pondering over the high-level implications, which are essentially infinite and thus irrelevant to the basic feature of smart delimiters. Sky's the limit. 10:38:44 hmm. space doesn't currently actually execute. 
I'm thinking more like foo or { just ignores whitespace 10:38:46 so, 1) we need the length-first dict sort; 2) we need to breakout the word/parse shit and offer some vectoring 10:38:52 interesting, but I see at the horizont that xml people who claim that this idea comes from xml... 10:39:07 docl: trust me, it can be done 10:39:07 docl, space doesn't execute, no, but it could execute using this facility. 10:39:23 I know but when would it ever need to? 10:39:28 and that all languages which are based on that are xml languages, that somehow reminds me on lisp. 10:39:39 docl, you might want to write a language with no whitespace allowed. 10:39:42 there is absolutely no reason to suffer the embedded-and-forever forthish " space parse ...." shit 10:39:43 in the python vm 10:39:44 1,2,3,a,b,c 10:40:08 Or, have specific meanings for tabs and whitespace, yes, as in Python. 10:40:13 yes 10:40:16 So whitespace needs tobe definable as smart delimiters as well. 10:40:18 or newline 10:40:22 No reason to restrict this! 10:40:26 right 10:40:28 PoppaVic, yep. 10:40:44 Just trying to envisionwhere it gets smacked into-place 10:40:58 And the simplest default is to handle whitespace in the Forthy interpretive fashion... ie, the default Forth dictionary. 10:41:27 absolutely, just like you presume unassigned vectables and deferred point at a failure func 10:41:41 PoppaVic, well, probably somewhere in the interpreter around "dup + c@ whitespace?" ;) 10:42:13 Er... "dup 1+ c@ whitespace?" 10:42:24 madwork: recall, I live in C and want the engine/vm in C - so, I'm thinking objects/modules/levelsm etc 10:42:38 Yea, I'm a C boy too. 10:43:23 So the parser scans a character, looks it up in the current voc search order, and executes it. 10:43:24 who isn't a C boy? 10:43:52 I gave up on c before getting good with it. like a lot of langs I've tried 10:44:06 docl: that affects everything 10:44:12 PoppaVic is more of a "C boyeeeee, aww yea biatch," IMO. 
;) 10:44:24 madwork: or, yeah - adds it to the current buffer 10:44:38 PoppaVic yea, if it's not found! Good point. 10:44:55 you almost lookup, add; goto lookup 10:45:03 Yep. 10:45:12 madwork? a macho C boy or a gay C boy? :-P 10:45:27 virl, macho, heh. 10:45:32 so, a null-string is prolly still useful (nil string) 10:45:48 Or gangsta! PoppaVic MC Gangsta C, G! 10:46:07 PoppaVic, unless you have a delimiter of '\0' :) 10:46:18 So, might want to use counted strings. 10:46:18 I hate that crap. I was rockin' and rollin' back when bbs were "the net" 10:46:24 Hehe. 10:46:35 Hey, I was a BBSer too, back on my 300baud C64 modem. 10:46:41 madwork: no prob: I actually think a \0 op is not a bad idea 10:46:46 yep 10:46:47 I had to generate my own dial tones through the SID chip. 10:46:56 300 on a kaypro 4/64 ;-) 10:47:30 PoppaVic, yea, \0 would definitely be a useful function. 10:47:39 end of text, end of buffer, etc. 10:47:58 So, ideally there are no special cases at ALL when parsing input. 10:48:02 madwork, that must be a funny a thing 10:48:09 right 10:48:27 hmmmm 10:48:45 I think we are talking about a consistent vm+lexer+vocs 10:48:46 virl, it was awesome. I wrote a program that dialed out a number, and then dropped into the SAM speech synth so that I could prank-call people with a robot voice. :) 10:49:15 I just use to daemon-dial my buddies machine when he was out of town ;-) 10:49:35 great way to ruin an answering machine 10:50:04 madwork, before I can do that I need to finish to write my OS ;-) 10:51:39 which is a forth system which uses colors, yeah it will be comparable to colorforth. 10:51:47 hmm, I think we have a couple "Clews". 10:52:07 fuck colorforth 10:52:31 color islike "trigraphs" - insane 10:53:04 I said it is compareable that doesn't mean that it is a colorforth 10:53:32 madwork: how likely is it we can eventually get an elf/mach or C or true asm-generator? 10:53:40 it is only a forth which can paints on screen 10:53:50 or, link to shared-object/libs? 
10:56:06 madwork: my "biggy" is, source is wonderful - but a tool/app often benefits from a lib. AND, it enables cross-language use or "fast load and use". 10:56:37 I just hate the fact that a lot of forth looks like goddamned autoshit. 10:57:31 I can see source-distro, and even "header-like", but once installed local, there is no sensible reason to keep parsing and scanning and looking-up 10:57:36 I don't know any forth which looks like autoconf# 10:58:24 so, what forths look like autoconf? 11:00:40 madwork: actually, hmm.. The best of all worlds would be to be able to run interpreter/RT and generate minimalized turnkey C ;-) 11:01:52 PoppaVic, heh. 11:02:13 Well I don't see why this couldn't be turnkeyed like normal native Forth systems. 11:02:15 madwork: well... C or realasm 11:02:34 The parsing is mostly about compile/interpret-time anyway. 11:02:41 I know, I have been ewatching forth for 20 years and getting more and more depressed 11:03:30 Just having a few clues how gcc does it makes me more and more adamant that forth-distros are NOT LISTENING (sorry, I got emotional) 11:05:04 minimalized turnkey C, eh? 11:05:23 Eh, just turnkey the Forth. 11:05:29 I think our "roadblock" is that forthers either think asm or forth to exclusion to the universe 11:05:47 madwork: usually, the overhead is a waste of space 11:06:05 Overhead? What overhead? 11:06:33 ideally, you'd run thru/down the fcode and extract the vm, asm-calls and then the forth colon-words ;-) 11:06:56 overhead being "extraneous data" 11:07:38 Just the code, no word headers. 11:07:51 right - and only the relative code/calls 11:08:02 Yes. 11:08:14 That's pretty typical of a turnkeyed Forth program. 11:08:23 madwork: which makes me expect they COULD generate source for... what? C? asm? 11:08:27 Headers all go in one part of memory, so it's simple to strip out. 11:08:52 madwork: ahh, yer assuming an x86 ;-) 11:08:58 Sure, why not? I guess they just figure... "why bother?" 
11:09:20 would you need to save the forth headers to generate good c headers? 11:09:22 I'm assuming flat memory. 11:09:45 right, because they don't see forthish as a unique entity andconversion as useful 11:09:47 docl, yes, unless you don't care what the C function names are. 11:10:24 right, "am I saving a special interp" versus "do this and ONLY this, w/o lookup" 11:10:35 The easiest way to convert to C would be generating Forth VM. 11:10:47 PoppaVic, for turnkeying, I'm assuming B. 11:10:48 yeah, that's what I am thinking 11:11:16 madwork: sure, but gforth and others assume a full-up system, or NO lookup at all 11:12:23 it's almost funny, because we certainly CAN 'see'/decompose a word - so calling lower and lower to ascertain mechanical-linkages and then storing pcode should never ever be an issue 11:12:47 yep. 11:13:25 so, yeah - I think we write the vm and shit in C, use a lib, and we need a "regenerator" 11:14:05 certainly I'd never expect regenerated code to work on linux as on a doze or bsd or solaris box 11:14:29 BUT, sure - if regenerating wrote C or asm, it might 11:15:03 It should! 11:15:55 Generating sub-threaded would be the quickest, I assume. You'd still use a software data-stack. 11:16:00 yeah, so - I am not off my rocker? 11:16:03 So, not really a true VM. 11:16:28 yes, converting to a "real system" could easily be considered 11:16:36 Just 11:16:36 void foo(){ bar(); baz(); swap(); dup(); } 11:17:05 but, even if we generated a copy of the VM, in C, and used data-ptrs to lists, it'd STILL be sensible. 11:17:23 You'd also have to disallow >R R> etc., and instead have a separate stack for locals or temp storage. 11:17:27 a bit slower than asm, but compact as hell 11:17:31 Yep. 11:17:33 yeppes 11:17:44 Sounds fun. :) 11:17:52 Heck, I might have to try this at some point with Forthy. 11:18:06 yeah, and I am watching tathi, sproingie and their vm-ideas as well. 11:18:19 madwork: I wish we could, really-really. 
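A minimal sketch of the "Forth word becomes a C function" idea discussed above (`void foo(){ bar(); baz(); swap(); dup(); }` with a software data stack). The stack size, the helper names `push`/`pop`, the trailing underscores on word names (to stay clear of libc names such as `dup`), and the example words `double_` and `foo` are all assumptions made for illustration. There is no overflow or underflow checking, and >R / R> are left out, as suggested in the discussion.

```c
/* Hypothetical software data stack shared by every generated word. */
#define STACK_CELLS 64
static long stack[STACK_CELLS];
static int sp = 0;                      /* index of the next free cell */

static void push(long v) { stack[sp++] = v; }
static long pop(void)    { return stack[--sp]; }

/* Primitives: each Forth word is a void function working on the stack. */
static void dup_(void)  { long a = pop(); push(a); push(a); }
static void swap_(void) { long b = pop(); long a = pop(); push(b); push(a); }
static void add_(void)  { long b = pop(); long a = pop(); push(a + b); }

/* : double  dup + ;    a colon word is just a sequence of calls */
static void double_(void) { dup_(); add_(); }

/* : foo  double swap ;  ( a b -- 2b a ) */
static void foo(void) { double_(); swap_(); }
```

Calling `push(3); push(4); foo();` leaves 8 and then 3 on the stack, with 3 on top. No word headers or dictionary lookups survive in the output, which is the point made above about turnkeying.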
11:18:35 PoppaVic, you sound like a forlorn lover. ;P 11:18:38 I have this godawful feeling we need a fist of asm, though 11:19:12 ..I tend to see too much crap w/o black-boxing and even less ABI docs 11:21:01 docs shmocs! ;) 11:21:27 no, ABI 11:21:48 I can decipher some source and most headers - docs are beyond most folks 11:23:01 madwork: Keep me in mind, and I shall continue my Mission From God. ;-) I think we are all missing a bet 11:25:47 madwork: just for fun - I'll throw this at you, as #asm sat there confused.... Could we write a universal-asm forth and generate LOCAL asm source as required? (this ain't far from forth as-is) 11:28:08 * JasonWoof pops in 11:28:17 what do you mean "universal-asm"? 11:28:19 hi JasonWoof ;-) 11:28:32 like a virtual-machine? 11:28:33 jas... oh, yer late and missed a lot. 11:28:37 or some intermediate language? 11:29:02 yes, looking over multiple CPUs and going for GCD and then globals/locals/funcs 11:29:39 have you looked into the intermediate language that gcc uses? 11:29:59 I believe the absolute easiest place to ease in a forth-like VM would be as a universal assembler 11:30:20 I tried to capture some files, yeah - on the powerbook it was pretty useless. 11:31:55 man, my fingertips are sore 11:32:06 typing? sewing? 11:32:31 helped unload six hundred something bales of straw and stack them 11:32:44 stitching a rifle-scabbard was painful 11:33:08 crap, I never try to help with baling anymore - it's waaay too painful in several ways 11:33:43 * madwork is too dumb to answer PoppaVic's question. 11:33:57 local, meaning platform-dependent 11:33:58 haven't had any severe pain involved, but it's always grueling 11:34:07 madwork: well, no prob - despair-not..
I think we sorta' agree a lot of places 11:34:28 universal meaning platform-non-dependant 11:34:34 JasonWoof: hayfever, plus: I've suffered bailers that like to pump out heavy-fuckers 11:34:41 PoppaVic, "generating local asm" from "universal asm" is a compiler 11:35:00 docl: yes 11:35:00 docl: the actual core-code would emulate what it couldn't support 11:35:04 mmm... yeah, nothing like heavy bales 11:35:28 we tried not to make ours tooo yeavy 11:35:43 JasonWoof: Igot past that, into corncribs, and finally blew the fuck up and never helped Dad again *sigh* 11:35:44 sounds like a compiler to me. but it could be styled much like a macro assembler, if there's any advantage to that 11:35:58 I do remember some bales that were delivered when I was younger that I couldn't lift at all 11:36:12 JasonWoof: well, when you expect <=50# and get a 70+, you sorta' get tired fast 11:36:23 docl: indeed 11:36:29 easy porting from regular asm, e.g. 11:36:35 docl: think also "translator" 11:36:43 more like "to asm" 11:36:53 Could define the Forth ASM wordset to be universal, and then load an platform-specific mapping underneath it. 11:36:57 just have the vm do just in time compilation 11:37:03 BUT, recall we still want ourt lovely interactive-engine 11:37:11 madwork: hell YES 11:37:20 ..tis why I've been peevedfor years 11:37:22 --- quit: virl (Remote closed the connection) 11:38:08 the prob appears to be asm-heads expect "purity", and compiler-heads do NOT expect that "purity" 11:38:49 fukm. ;) 11:38:58 I'd rather see a generic engine-based lang, and have folks link in outside objects 11:39:16 if you want to write programs that can generate code on teh fly, then I'd reccomend a VM 11:39:32 JasonWoof: also compect WHEN compiled 11:39:38 "compact: 11:39:59 compact what? 11:40:02 the idea is to aid porting, not skip the porting step 11:40:26 --- part: slava left #forth 11:40:33 if they want to take 100% control, let them write pure "asm-for...", otherwise I think we need an engine. 
11:41:25 docl: the idea is to keep our interp/compile/interactive; add exporting to and understand the world runs on at LEAST 'C' 11:41:26 --- join: snowrichard (n=richard@adsl-69-155-177-155.dsl.lgvwtx.swbell.net) joined #forth 11:41:29 --- join: virl (n=hmpf@chello062178085149.1.12.vie.surfer.at) joined #forth 11:41:54 folks that want to export source know why and it's source 11:42:08 folks that want to share forthish source can 11:42:49 ..we obviate sharing cpu-specific asm at this level, but their DATAFILES for translating - should be shareable. 11:43:39 C almost plays well, but between the commies and the lack of that really "intermediate format", we are all well and truly screwed 11:44:14 OK, I just had a stupid idea about creating Forth-like C programs using macros for creating definitions and sharing the data stack. 11:44:16 hmm. as in, C isn't truly portable because asm isn't? 11:44:22 docl: I firmly believe a forthish "engine"/vm isthe answer 11:44:46 madwork: now, consider how C and forth treat "macros" ;-) 11:45:00 Well, I'm thinking of C's #defines here. 11:45:01 docl: oh, far worse 11:45:11 madwork: that'd be "cpp" - not C 11:45:21 No... C. 11:45:29 I don't use C++ if I can help it. 11:45:34 trust me, I've looked this shit over for years and DELVED for weeks 11:45:49 madwork: "cpp" := "C preprocessor" 11:45:54 Oh. heh. ;) 11:46:20 oh, that cpp :) 11:46:24 unlike forth, cpp is not an extensible-interp 11:46:24 Anyway, include a "forth.h" and away you go. 11:46:42 unlike forth, C is not an extensible compiler 11:46:48 Of course not. 11:46:59 unlike forth, asm is specific and non-portable 11:47:00 But, this would go more along the lines of your forth->C generator. 
11:47:14 so, we have at least 3 levels of semi-interaction 11:47:23 madwork: yes, assuredly 11:47:57 in fact, even 'sh' is only partially portable 11:48:14 def foo 11:48:14 bar baz swap drop 11:48:14 end 11:48:24 yah ;-) 11:48:34 def(foo) 11:48:34 bar baz swap drop 11:48:34 end 11:48:36 (rather) 11:48:49 #alias def : 11:49:01 #alias enddef ; 11:49:22 #define def(n) void n(void){ 11:49:39 you basically need to step back, and cross-review whomdoes what to who and where 11:49:41 #define end } 11:49:47 yeppers 11:50:38 1) We can't extend the cpp; 2) we can't extend C; 3) we can neither extend the assembler or ID it; 4) WTF isthe ABI? 11:50:52 shells are not forth 11:51:05 we need a linkable ENGINE 11:51:12 cpp should be replaced with Forth. it would be the ultimate precompiler. 11:51:22 mad - see above, but I agree 11:51:36 PoppaVic, and I'm talking about source generation here, not linking. 11:52:23 problem is, right - exactly: we want to compile-in-place to "extend" and be able to write shit that CAN be compiled to a lib for linkage AND extension. 11:52:47 madwork: damn, I am soooooo happy to talk with you ;-) 11:53:04 hmm. a forth parser could be written that interprets c code, or any other language for that matter. 11:53:05 Heh. :P 11:54:58 docl, yep. 11:55:09 using smart delimiters. ;) 11:55:13 docl: now.. take another step: are we METACOMPILING? 11:55:14 Whoa... deja vu... 11:55:21 cool 11:55:47 any time you use macros that's kinda metacompiling, right? 11:56:04 or, take a twist on metacompiling... Consider the defunct libffi or ffcall 11:56:18 docl: macros are text-replacement 11:56:43 yeah. or code execution if you want them to. 
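A rough sketch of the `forth.h` idea from 11:48 to 11:49 above, using the two macros given in the chat (`def(n)` and `end`). Everything else is an assumption added for illustration: the data stack `ds`, the `w_` helper functions, and the upper-case word macros that let a body read as a bare sequence of words. The C preprocessor only does text replacement here, so this is nowhere near an extensible Forth compiler.

```c
/* The two macros from the chat. */
#define def(n) void n(void) {
#define end    }

/* Hypothetical shared data stack (no bounds checking). */
static long ds[32];
static int dsp = 0;

static void w_push(long v) { ds[dsp++] = v; }
static long w_pop(void)    { return ds[--dsp]; }
static void w_swap(void)   { long b = w_pop(); long a = w_pop(); w_push(b); w_push(a); }
static void w_drop(void)   { (void)w_pop(); }

/* Word macros expand to complete call statements, so no ';' is needed. */
#define SWAP   w_swap();
#define DROP   w_drop();
#define LIT(n) w_push(n);

/* : nip  swap drop ; */
def(nip)
    SWAP DROP
end
#define NIP nip();

/* : foo  10 20 nip ; */
def(foo)
    LIT(10) LIT(20) NIP
end
```

After `foo()` runs, the stack holds the single value 20. Each new word needs its own `#define` by hand, which illustrates the complaint above that cpp, unlike Forth, cannot be extended from inside the language.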
11:56:57 Forth Dimensions had an ooold article on honest "macros", but most forths lie anymore 11:57:09 see http://forthworks.com/blog/?p=154 11:57:17 no, macros are replacement 11:57:27 hmm 11:57:37 basically, you change input-streams or force-feed them 11:57:47 This allows for assorted voodoo 11:58:14 [compile] is just not quite the same 11:59:18 docl: remember - we are also talking about perverting, mangling and subverting the parse/word/lookup 11:59:24 well, macros are primarily text replacement. but as implemented in retroforth you can execute code at compile-time 11:59:41 hmm yeah 12:00:06 docl: indeed, and most forths handle embeddment immediates - we are changing the paradigm, remember ;-) 12:00:27 oh 12:00:34 how/why? 12:00:47 for starters, our parsing suggests at least one voc of "immediates" 12:00:54 aka "operators" 12:01:13 yeah 12:01:32 so, are immediate-markers even needed? You see why this shit has driven me for decades? 12:01:34 how does this make it necessary not to execute macro code? 12:02:22 docl: a macro is "doo foo bar snafu" - in the stream. NOT "[compile] doo [compile] foo ..." 12:02:54 I'm pretty sure we don't even have the names/words for what we are suggesting 12:03:02 you mean we need to call it something else? 12:03:18 docl: think about the difference 12:03:36 say I switch vocs first - or parsers - or both 12:04:14 "234" is a string, not a number - something GENERATES a number 12:04:47 "1.234" has the same issue, PLUS "is this a double? a float"? 12:05:35 so it is compiled later by the compiler 12:05:58 maybe - or it may execute - or be ignored - not our prob 12:06:23 hmm. 
so we think of it as text, not being compiled atm 12:06:30 right 12:06:36 while it's text we insert stuff here and there 12:06:40 a "complete something" 12:06:45 then, later we compile it 12:06:45 ..which we then process 12:06:59 after we are done treating it as text 12:07:00 compile, run, spew ;-) Yes 12:07:55 for lack of a better term, lets call this mess "sixth" 12:08:07 we may save pre-processed source or post-processed. the former would probably be more readable 12:08:41 it's like .c/.h/.o and the internmediate files 12:09:23 what we're thinking of is, what we can do while it's still text 12:09:33 I'd be quite happy with a .h (deferred) .c mentality versus sh/immediate 12:09:34 which is a lot 12:09:42 hell yes 12:10:16 that's what my perl mentality tells me :) 12:10:23 * PoppaVic lost a brainfart to append in here and hates that 12:10:33 forget perl 12:11:09 perl has almost nothing in common with forth or C - and best you can say is they have a centralized distro-pt and a way to interface 12:11:48 HOW and WHY are more interesting, and even php and java can make suggestions 12:12:30 perl is about processing text, that's all I meant. php likewise. even java, yeah 12:13:28 think regexp. you can replace anything with anything before evaluating it 12:13:40 well, the moment you can imagine an INTERACTIVE cpp, and an INTERACTIVE C, and a exportive engine, your world roates crazily 12:14:08 sub forth for C if it helps 12:14:35 hmm 12:14:38 yes 12:14:56 forth is interactive but not so exportive 12:15:04 oh, and accept that some folks will write .asm - but that code is NOT PORTABLE 12:15:18 right, think both ends to the middle 12:16:36 Remember too: not one forth I've ever seen can do what goddamned shitwipe sh can. Not neatly. 12:17:23 we need to change the top to change the middle to generate the bottom 12:18:05 it's a major paradigm-shift (and I hatte sound-bites) 12:18:59 OK, time to stagger off and prepare dinner..... 
Youse guys know my email - back tomorrow 12:19:03 --- quit: PoppaVic ("Pulls the pin...") 13:55:30 madwork, agreed about MersenneTwister -- I used it only as a baseline for ideal distribution, because it's a very good PRNG. It's far to heavy to be useful as a Forth hashfunction. 13:57:51 Quartus, did you look at hash_rand8()? 13:58:45 I did. 14:03:00 The Quartus Forth hash is approximately: hash=hash*17+tolower(nextchar) & (stacksize-1); where stacksize is a power of two 14:03:29 Presently the hashtable size is 128 chains. 14:03:50 er, read 'hashtablesize' for 'stacksize' above. I have to stop taking Tylenol 3. 14:03:53 :) 14:04:30 I initialize 'hash' with the length of the string. 14:04:45 Ahh. 14:04:50 Let me do this over. 14:05:10 hash=length; 14:05:26 for each char: hash=hash*17+tolower(char); 14:05:33 hash = hash & (hashtablesize-1) 14:05:35 There. 14:05:40 Right. 14:05:48 It works very well. 14:06:05 How's distribution? 14:06:09 The 'tolower' is actually a table-lookup that is both case- and accent-insensitive. 14:06:13 It's very good. 14:07:04 It looks to me that you'll lose bit information with that hash function, as all the bits are flowing left. 14:07:12 Sure you do. Turns out not to matter. 14:07:42 You have to lose hash information regardless of the function; you're only using a few bits of it. 14:07:46 Well, you'll get longer chains, more collisions. 14:08:13 You don't. It's a very effective hash. Competes with any I tried, only it's faster. 14:08:35 neat. 14:08:53 OK, I'll test it out tonight. 14:09:04 Try MD5 as a baseline hash. 14:09:33 It's slow as molassass on a February morning, but the distribution can't be beat. 14:10:40 My tests aren't too comprehensive anyway. ;) 14:10:50 But I can at least test speed and distribution. 14:11:15 I suspect that hash_rand8() will be faster, only because I mix 4 bytes at once. 14:11:33 Maybe. Some of that depends on whether you want case-insensitivity. I do. 
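The hash described at 14:05 above can be sketched in C as follows. This is only an approximation of what was described, not the actual Quartus Forth code: the standard `tolower` stands in for the case- and accent-insensitive lookup table mentioned in the chat, the function name and the use of `unsigned` are assumptions, and the table size of 128 is the figure quoted above.

```c
#include <ctype.h>
#include <stddef.h>

/* Number of chains; must be a power of two so that masking works. */
#define HASHTABLE_SIZE 128u

/* hash = length; for each char: hash = hash*17 + tolower(char);
   then mask the result down to a bucket index. */
static unsigned quartus_style_hash(const char *name, size_t len)
{
    unsigned hash = (unsigned)len;
    for (size_t i = 0; i < len; i++) {
        hash = hash * 17u + (unsigned)tolower((unsigned char)name[i]);
    }
    return hash & (HASHTABLE_SIZE - 1u);
}
```

With these assumptions, "dup" and "DUP" land in the same bucket, which is the behaviour wanted for a case-insensitive dictionary. Absolute values will differ from the figures quoted in the chat, since those came from a different character table.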
14:12:38 madwork: I don't understand what you mean by "all the bits are flowing left" 14:13:39 since 17 is prime, multiplying by 17 any number of times should keep changing the low bits, shouldn't it? 14:13:46 Well, the hash continuously adds and multiplies, so the bit information from previous hashings continues to shift leftwards. After so many iterations, information from earlier bytes gets lost and becomes inconsequential. this limits the size of the string. 14:14:19 But as I say, you have to truncate any hash value to fit your hashtable. 14:14:33 The hash value, yes, not the hash key. 14:14:37 but doesn't it just act like a finite ring? 14:14:42 I don't store the key. 14:14:56 since 17 is relatively prime to 2^n, it doesn't actually get *shifted* left 14:15:25 Try hashing these two keys: 14:15:25 "000000000000000000000000000000000000000000000000000a" 14:15:25 "000000000000000000000000000000000000000000000000000000000a" 14:15:35 hang about, I will 14:18:09 k. I don't think I got exactly the same number of 0s. :) 14:18:46 Well, I'm just guessing anyway. ;) 14:18:57 I'll test it out when I get home though. I gotta leave in a few minutes. 14:19:15 yah, my abstract algebra is admittedly a little rusty. 14:19:37 but...I think you would lose some bits off the top when you add, but multiplying by 17 doesn't. 14:19:39 Ok. I've hashed them both. 16088 and 11720 respectively. 14:19:49 so you should be able to hash pretty long strings. 14:20:01 And we're dealing with Forth. So 31 characters is the max. 14:20:02 and...I think you want to test with the 'a' at the beginning, not the end. 14:20:06 right 14:20:29 anyway, masked to fit a 128-chain hashtable, those wind up in different buckets. 14:20:36 yup. 14:20:45 88 and 72, respectively. 14:20:51 OK, try this: 14:20:52 Try hashing these two keys: 14:20:52 "ABCD000000000000000000000000000000000000000000" 14:20:52 "0000000000000000000000000000000000000000000000" 14:21:31 OK, gotta run. bbiab. 
14:22:12 46148 and 26302, respectively. 14:22:28 Buckets 68 and 62. 14:22:45 That's pretty cool. 14:22:56 last time I tried playing with hashing, I was messing with xor/shift hashes 14:23:09 couldn't figure out how to do a good fast one 14:23:24 Well, multiplying by 17 is a shift and an add. 14:23:34 And the add can be faked with an xor. 14:26:26 hmm, sort of 14:26:46 For this kind of work it's close-enough, if you want to avoid the add. 14:27:38 ok 14:27:46 But don't take my word for it. :) 14:27:59 * tathi keeps wishing he had more number theory and such 14:28:51 I'm an empiricist. I started from theory and found, as the quote says, that while in theory there should be no difference between theory and practice, in practice there is. 14:29:40 :) 14:30:32 So set up a test harness and run the chi-squared test against all the Forth word-names you can find. 14:30:51 right. 14:31:02 Varying for table-size, hash function, and case sensitivity. (Though I think case-sensitive Forths are a curse.) 14:31:07 hmm, I'm coming up with different hashes for those strings. 14:31:30 tathi, that's because I use a table-lookup to eliminate case and accents. 14:32:14 ah well, the exact hash values aren't important, I guess. 14:32:19 No. 14:33:55 As I mention probably every time we talk about hashing, you get wordlists for free with a hashed dictionary. 14:34:43 right 14:35:01 Which always pleases me. It's a kind of magic. :) 14:37:18 Well, with one caveat: you get as many wordlists as you have buckets. 14:39:07 but it's no big deal to make a bigger hash table if you need more. 14:39:13 True. 14:39:24 actually, I think factor even resizes its hash tables dynamically 14:40:08 I haven't seen factor's innards. Most hashtable implementations I've seen, though, have prime-sized hashtables and ungodly complex and slow hashfunctions. 14:40:24 yeah 14:42:31 And case-sensitive, too. 14:43:08 Which would be fine if I stored word-names in all lower or upper-case, but I don't. 
I keep them as entered, so the hashfunction has to be insensitive. 14:43:22 I rather like case sensitivity myself, but to each his own... 14:43:57 Think about working in Forth on the Palm -- case-sensitivity would be a nuisance, not a feature. 14:44:44 why? (I've never used a Palm much) 14:45:19 Pen-based input is sufficiently cumbersome already without having to worry about case. 14:46:46 I also think that OVER DUP IF looks like shouting. Years of USENET I guess. :) 14:47:27 Oh. Yeah, I mostly stick to lowercase. 14:47:53 Oh, right, the standard says if it's case sensitive, all standard words must be uppercase, doesn't it? 14:48:33 Well, not exactly, but close enough. 14:49:01 Yeah, says that all the Standard words must be findable in upper-case. They could also be findable in lower-case, I suppose. 14:49:19 That'd be a schizophrenic sort of system, case-wise. 14:49:25 :) 14:49:35 --- join: madgarden (n=madgarde@Kitchener-HSE-ppp3577151.sympatico.ca) joined #forth 14:50:02 Well, in that context, it definitely makes sense to be case insensitive. 14:50:32 Not quite following. 14:50:45 Back. 14:50:50 Oh, sorry. 14:51:16 mad, those last two you gave hash out to different buckets also. 14:51:31 If you're going to try and be standards compliant, and on the Palm where dealing with case in the input is an issue, then it makes sense. 14:51:40 Ah. Ok. :) 14:52:56 Plus there's a built-in dictionary of 6200+ functions and constants and structures, and I'd hate to have to get Palm's specific subCapitalizationMethod right for each one; they weren't always the same! 14:53:14 ouch 14:54:36 camelCapsKindaSucks 14:55:38 madgarden: Looks like if you pass increasing length strings ("0" "00" "000" "0000", etc.) to that hash function, you get a maximal length sequence, FWIW. 14:57:22 I just hashed that sequence up to "00000", each has a different bucket. 14:57:56 Four more. Still different. 14:58:37 Again four more. Still different. 
I'll hit a collision at some point (obviously), but it's holding up. 14:59:32 Well, I just ran it through my test. 14:59:34 and that then your hash value is dependent on a prefix (or postfix), if any. 14:59:44 anyway, just thought that was kind of interesting. 14:59:49 Doing tolower() in a loop is bad for speed. 14:59:59 Yeah. I do a table-lookup. 15:00:29 Distribution is not as good as rand8. 15:00:33 Let me adjust the function. 15:02:00 What's the hashtable size? 15:02:39 And adjust rand8 to be case-insensitive, or remove the tolower. 15:03:19 Ok, I just removed the tolower altogether. The speed is improved, it's about 0.78s vs rand8's 0.47. 15:03:47 rand0 is about 1.0s for this test 15:03:51 Rand8 is, as you say, doing four bytes at a time, and is case-sensitive. What if you make it case-insensitive? 15:03:57 So it's faster at byte-by-byte hashing. 15:04:26 Quartus, it makes no difference to distribution. 15:04:44 Sure it does. Strings that only differ by case will collide. 15:05:03 Well, yes of course. 15:05:20 And that's what you want for your Forth dict lookup, but that's not really a test of the hash function itself. 15:05:35 rand8 collisions: 262593 15:05:35 rand8 unused: 449 15:05:35 rand8 longest chain: 16 15:05:35 rand8 variance: 4.974396 15:05:35 rand8 327680: 0.470000s 15:05:36 quartus collisions: 265097 15:05:38 quartus unused: 2953 15:05:40 quartus longest chain: 20 15:05:42 quartus variance: 10.150116 15:05:44 quartus 327680: 0.781000s 15:05:58 * Table size: 65536, keys: 327680 15:06:01 Well, it is part of the test, in fact, if it's in the hashfunction that you're doing the insensitivity. 15:06:20 What are you feeding these two functions? 
15:06:47 for(i=0; i { 15:06:48 sprintf(test+i*KEY_LEN, "%05x", j); 15:06:48 strncpy((test+i*KEY_LEN)+5, "super really really really really really REALLY long test string is super super super super SUPER DUPER super SUPER DUPER super SUPER DUPER great (yes it really is!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!)", 15:06:48 KEY_LEN); 15:06:49 j++; 15:06:51 } 15:07:15 Ah. Try feeding it Forth-like word names. 1-31 chars, printable. 15:08:36 During my tests I ran all the strings with a length<32 from the 'dict' file into the function as a Forth-like simulation. 15:09:04 In fact I doubt many Forth word-names get up to 31 chars in length. 15:10:18 Yep, randomized lowercase improves your distribution. 15:10:34 quartus collisions: 262623 15:10:34 quartus unused: 479 15:10:34 quartus longest chain: 17 15:10:34 quartus variance: 5.016663 15:10:34 quartus 327680: 0.781000s 15:10:51 rand8 collisions: 262567 15:10:52 rand8 unused: 423 15:10:52 rand8 longest chain: 18 15:10:52 rand8 variance: 4.972748 15:10:52 rand8 327680: 0.481000s 15:11:19 Try it with a much smaller hashtable. 15:12:59 quartus collisions: 32821 15:13:00 quartus unused: 53 15:13:00 quartus longest chain: 15 15:13:00 quartus variance: 5.062744 15:13:00 quartus 40960: 0.020000s 15:13:00 rand8 collisions: 32818 15:13:02 rand8 unused: 50 15:13:04 rand8 longest chain: 15 15:13:06 rand8 variance: 4.920654 15:13:08 rand8 40960: 0.010000s 15:13:56 I'm certainly not claiming it's the World's Best General-Purpose Hash Function -- but it works very well in practice, and considering that the case-insensitivity happens inside the hash-function, I'd guess that's where the speeds would level out. 15:14:07 --- join: crc (i=crc@pool-151-197-17-102.phil.east.verizon.net) joined #forth 15:14:23 rand8 would have trouble doing 4-at-a-time if it had to lower-case each one. 15:14:40 I'd lowercase the word before searching for it, though. 15:14:50 Let me amend that. It works very well for Forth, in practice. 15:15:04 Yep. 
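For anyone wanting to reproduce figures like the ones pasted above, here is a hedged sketch of a distribution measurement. The posted numbers are consistent with "collisions" meaning keys that landed in an already occupied bucket (keys minus used buckets: 327680 - (65536 - 449) = 262593), so that definition is used here. Taking "variance" as the population variance of chain lengths is an assumption, as are the struct and function names; the original test program is not shown in full.

```c
#include <stddef.h>
#include <stdlib.h>

struct hash_stats {
    size_t collisions;     /* keys inserted into a non-empty bucket */
    size_t unused;         /* buckets that received no key */
    size_t longest_chain;  /* largest number of keys in one bucket */
    double variance;       /* population variance of chain lengths */
};

/* bucket_of[i] is the bucket index of key i, already reduced to
   the range 0 .. table_size-1. Returns 0 on success, -1 on error. */
static int measure_distribution(const unsigned *bucket_of, size_t nkeys,
                                size_t table_size, struct hash_stats *out)
{
    if (table_size == 0) return -1;
    size_t *count = calloc(table_size, sizeof *count);
    if (count == NULL) return -1;

    out->collisions = 0;
    out->unused = 0;
    out->longest_chain = 0;

    for (size_t i = 0; i < nkeys; i++) {
        size_t b = bucket_of[i];
        if (count[b] > 0) out->collisions++;
        count[b]++;
    }

    double mean = (double)nkeys / (double)table_size;
    double sum_sq = 0.0;
    for (size_t b = 0; b < table_size; b++) {
        if (count[b] == 0) out->unused++;
        if (count[b] > out->longest_chain) out->longest_chain = count[b];
        double d = (double)count[b] - mean;
        sum_sq += d * d;
    }
    out->variance = sum_sq / (double)table_size;

    free(count);
    return 0;
}
```

For a well-mixed hash the chain lengths are roughly Poisson distributed, so the variance should sit close to the mean load (about 5 in the runs above, where keys outnumber buckets five to one).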
15:15:16 To lowercase words as I search I'd have to copy them to another buffer first. 15:18:14 Yea, looks like a good hash for Forth words. Could likely get a little speed increase by lowercasing and hashing native ints at a time, but probably not worth the effort. 15:19:39 How bit was the final hashtable? 15:19:44 er, big? 15:19:47 (tylenol 3 again) 15:25:43 I've seen hashfunctions that don't directly use the string length, too. Never understood why. 15:25:56 8192 15:26:24 --- mode: ChanServ set +o crc 15:26:58 Hey, this is pretty cool. We got an Atari CD in a box of Lucky Charms, and it has 80 Atari games emulated on it. :D My son is lovin' it, heh. 15:27:09 :) 15:28:34 The hashtable is 128 buckets in the current version of Quartus Forth. 15:29:20 How many words in the dictionary? 15:29:25 (generally) 15:29:58 At startup -- 329, I believe. 15:30:09 Can you resize the table? 15:30:20 Have done; the gains beyond 127 are minimal. 15:30:22 er, 128. 15:30:52 There's another 6200+ names findable, but they are stored outside of the dictionary. 15:33:18 --- join: qFox (n=C00K13S@92pc222.sshunet.nl) joined #forth 15:37:00 The number of string comparisons drops closer and closer to 1 with each doubling of the table-size, but empirically the Quartus Forth test benchmarks don't show significant improvement with a larger hashtable. 15:38:14 Number of string comparisons for a successful match, that is. 15:40:21 --- nick: nanstm -> Raystm2 15:42:43 hello all :) 15:42:52 Hi Raystm2. 15:48:32 What's up? 16:07:57 --- quit: snowrichard ("Leaving") 16:33:20 Hi Quartus. Just talking to crc about stuff. 16:33:58 --- quit: qFox ("this quit is sponsored by somebody!") 16:34:23 Re-writeing the draft for ChuckBot the Cursor. 16:40:20 ACK 16:41:11 hi JasonWoof 16:41:35 hi hi 16:41:36 JasonWoof do you have a mic and a head set? 
16:41:42 yes 16:41:56 but unfortunately, can't seem to use the speakers and mic at the same time under linux 16:42:23 I've been wanting to try VoIP for a long time 16:42:34 ohhhh :) Life is good 16:42:47 Raystm2 and I have been chatting using Google Talk ;) 16:42:58 is life good? 16:43:22 much as I don't really like grueling hot phisical labor... there is no better way to fully appreciate the joys of a good shower 16:43:35 life is definitely good 16:43:37 at least! 16:44:02 this is true. Or a Beer( substitute favorite drink-- mine is root beer) 16:44:23 mmmm... 16:44:33 nah, I'll take my shower =) 16:44:52 drinks are never as refreshing as I hope they will be 16:44:58 a root beer will survive a shower. 16:45:04 :) 16:45:19 that's very true, I was thinking the same all; this week. 16:45:30 I"m being accosted, call someone 16:46:12 sorry crc, I know nobody whats to hear that stuff :) 16:46:33 :) 16:47:11 Raystm2: your cat getting out of hand again? 16:47:20 yes. 16:47:59 she's been woken up before the need to get up for work, and the kids are gone. 16:48:16 I could be in trouble. 17:05:28 ok, now what do I do? 17:05:41 about? 17:05:56 dunno 17:06:14 hehe you mean the chat? 17:06:31 no, I'm just thinking out loud 17:06:35 ah 17:07:01 maybe I'll go invest in some eggs 17:07:15 nothing wrong with that, so long as it's _Your_ voice that answers you back in your head. 17:07:27 yes get eggs :) 17:07:32 hmm? 17:07:38 as apposed to whos? 17:07:54 oh I don't know.... maybe......SATAN ! 17:08:02 buh buh buh buummmmmm 17:08:38 oh, you mean that voice that says "lets go bang rocks together! with people's heads in between!" 17:08:44 thats a V in morse code. 17:09:05 You know that voice? 17:24:38 mmmm.... 
potatoes and stake 17:25:47 steak 17:26:55 --- quit: tathi ("leaving") 17:29:24 --- quit: saon (Read error: 110 (Connection timed out)) 17:29:51 --- quit: saon|fbsd (Read error: 110 (Connection timed out)) 19:57:50 --- join: snowrichard (n=richard@adsl-69-155-177-155.dsl.lgvwtx.swbell.net) joined #forth 19:58:30 hello. 20:05:23 I've got source code for a VAX figforth that I can run under VMS. But it is basically 16bit cell size, which seems a little unusual running on a 32 bit machine. 20:13:20 yeah 20:13:30 I like being able to address more than 64KB 20:27:34 I noticed it because the next macro used a movzw to read the xt into a register. 20:28:18 that zero-extends a 16 bit to 32 if I can remember correctly 20:29:37 --- quit: madgarden (Read error: 104 (Connection reset by peer)) 20:32:17 you wouldn't be able to use the full 4GB space but you could use 2GB. Process virtual addresses can be up to 7fffffff depending on how the page tables are set up. 20:33:09 though that end is stack space, not regular data. 20:43:47 --- join: madgarden (n=madgarde@Kitchener-HSE-ppp3577151.sympatico.ca) joined #forth 20:46:37 hi again madgarden. 20:47:15 Hi there. Windows crashed. 20:47:20 NO! 20:47:29 tell me it ain't so :) 20:47:45 BSOD :) 20:47:51 YES. :-/ 20:47:53 Jeez. 20:49:26 I've been reading some propaganda about OpenVMS since I installed the hobby license for it. They joke about uptimes in years. 20:54:52 sure it's a joke? 20:55:18 no, I phrased that badly. It probably can stay up for years if power stays on 20:55:35 not unheard of 20:55:54 I'm not even sure it's terribly uncommon 20:56:16 I don't leave mine up that long because it's running on a simulator on my linux box 20:56:36 heh 20:57:56 it has a TCP/IP stack and FTP and TELNET so if I assigned it one of my static IPs I could run stuff on it from somewhere else. :) 21:00:35 I did run across a website offering remote access to their VAX boxes.
21:25:35 "The future is much like the present, only longer" -- Dan Quisenberry 23:59:59 --- log: ended forth/05.08.26