00:00:00 --- log: started retro/10.02.04
00:16:11 --- join: virl (~virl__@chello062178085149.1.12.vie.surfer.at) joined #retro
01:33:41 --- quit: sixforty (Quit: Leaving.)
03:09:19 docl: because I only use 'find' in conjunction with 'if'
03:10:02 docl: it avoids extra dup's and drop's in this case
03:12:33 docl: though I could likely refactor some stuff...
05:00:19 --- join: mat2 (~4d177c2b@gateway/web/freenode/x-vrpbrcdaxjdhrqze) joined #retro
05:00:24 moin
05:11:34 hi
05:13:38 hi crc
05:14:10 I wan't to take a look at the docs but have problems to access the wiki
05:14:42 what problems?
05:14:54 ok, the page's now loaded
05:15:23 * crcz stil needs to add caching to the wiki backend
05:19:46 hmm the compiled code follows directly the dictionary header, right ?
05:25:59 ok, so code isn't seperated form the dictionary
05:26:03 from
05:28:48 right
05:29:20 that's not required though
05:31:39 not yet, but for handling possible parallel processing in the future
05:35:46 (a folk then can share the idctionary)
05:35:50 dictionary
05:36:30 -> less memory overhead
07:24:18 less overhead is good
07:24:29 * crcz is going to start playing with avm tonight
08:22:21 ah fine (please compile for 32 bit if you haven't an x86-64 capable cpu)
08:23:14 or try 16 bit word size
08:41:08 I can do 64-bit (core 2 duo, os x)
08:43:18 then please run the two threading benchmarks in the tests directory
08:43:59 I need results from other cpu's than Athlon64 and Celeron
08:50:27 http://gist.github.com/294854
08:58:23 ups, I have forgotten updating the repro
09:00:02 I use test-64-asm only for debugging the inline assembler
09:01:39 ok
09:03:31 wait a moment I will update the repro to version 0.1c
09:07:09 the timings for the test-threading-64 are typical for the branch share-bug "feature" of gcc, compile with -O3 to get usable results
09:11:01 ok, you find now a new file in the tests directory
09:11:31 threading-BrImm.c
09:12:25 test-threading-64.c is a little benchmark for testing raw branch performance
09:13:40 threading-BrImm.c tests the threading performance (~500000000 subroutine branches are executed)
09:14:24 and test-64-asm is only used for finding bugs
09:15:44 (on last bug remains in this version, conditional executed instruction doesn't update the IP pointer. I'm working at current on this)
09:16:44 COMPILE gives some hints for compilation
09:19:43 *Mat2 should corrent some gramatical errors in COMPILE*
09:19:48 correct
09:19:54 grammatical
09:25:08 http://gist.github.com/294888
09:25:32 --- quit: virl (Remote host closed the connection)
09:27:20 thanks, fine but where are the timings ?
09:27:59 sorry, one second...
09:28:33 my actual timings:
09:29:02 test-threading-64:
09:29:03 real 0m2.790s
09:29:03 user 0m2.731s
09:29:03 sys 0m0.014s
09:29:10 test-BrImm:
09:29:17 real 0m19.025s
09:29:17 user 0m18.821s
09:29:17 sys 0m0.058s
09:29:32 real 0m15.412s user 0m14.719s sys 0m0.026s
09:29:42 for test-BrImm
09:29:58 have you compiled with -O3 flags ?
09:30:24 -O2
09:30:29 I'll redo with -O3
09:31:55 the same procedure for retro (50790400 calls)
09:31:59 real 0m28.375s user 0m27.889s sys 0m0.046s
09:32:07 a2 -O3:
09:32:08 real 0m2.767s
09:32:08 user 0m2.731s
09:32:08 sys 0m0.013s
09:32:54 a3 -O3:
09:32:55 real 0m19.631s
09:32:55 user 0m19.395s
09:32:55 sys 0m0.055s
09:33:53 hmm ok
09:35:43 I wonder why my older cpu is a bit faster on the benchmarks (I bet it it's because of some overhead of the Darwin kernel ..)
09:35:58 likely true
09:37:01 what's your clock speed?
09:37:08 still better than the threded console verson of Ngaro though
09:37:16 threaded
09:37:23 1,8 GHz
09:37:43 ok, so not faster than mine
09:39:09 so I think we can postulate an performance advantage of something between 50-90 % for subroutine threding (28.375 versus 14.6 s)
09:39:14 threading
09:40:15 bizarre is that the best timings gives the Celeron cpu inside my eeePC 70
09:40:16 1
09:44:55 I have added a test file for retro (the subroutine threading benchmark), can you run it on time to see the timings ?
09:45:53 it's in the tests directory: test-threading1.4th
09:46:03 sure, just a moment :)
09:51:49 real 0m14.854s
09:51:50 user 0m14.646s
09:51:50 sys 0m0.062s
09:51:55 with your vm
09:52:20 real 0m36.384s
09:52:20 user 0m35.871s
09:52:20 sys 0m0.096s
09:52:24 with mine
09:52:43 *shock*
09:53:21 the only reason for this can be the memory managment in OS X
09:54:17 0m35.871s, are these the timings for the threaded console vm or the switch based one ?
09:54:30 switch-based
10:00:52 after a bit googling I found this:
10:01:30 the large amount of time spent in 'mach_msg_trap' indicate that XNU kernel is spending a lot of time message passing
10:02:27 Shark reports that with the default OS X malloc, about 12% of the runtime is spent in 'mach_msg_trap
10:05:09 look at the sys timings, seems page handling is a problem for OS X
10:07:40 thangs for testing, I will lokk at an alternative for malloc (possible BSD's valloc)
10:08:26 thanks
10:14:15 np
10:15:11 fast-console give me:
10:15:22 real 0m28.447s
10:15:32 user 0m27.916s
10:15:41 sys 0m0.046s
10:17:17 console:
10:17:54 real 0m43.082s
10:18:05 user 0m41.565s
10:18:14 sys 0m0.056s
10:18:20 AVM:
10:19:00 real 0m14.528s
10:19:12 user 0m14.345s
10:19:23 sys 0m0.017s
10:20:49 similar ratios for the Celeron cpu
10:21:37 what GCC version do you use ?
10:24:47 i686-apple-darwin9-gcc-4.0.1
10:26:54 ok, that explains a lot. GCC < 4.1 was the reasion i switched to replicating-switch threading
10:27:55 I have a 4.2 install as well
10:28:05 I'll try that tonight
10:29:05 thanks, otherwise you find some compiler flags in COMPILE for handling some special "optimations" which results in impressive bad code otherwise
10:30:15 I will update the repro this night to the next version. It can be that, the replicating-switch AVM interpreter gives you better results
10:30:59 but beware: Only compile with -O
10:31:51 ok now coding, ciao
10:31:58 --- quit: mat2 (Quit: Page closed)
13:09:41 --- quit: crcz (Quit: Leaving)
13:32:55 --- join: crcx (~crcx@bespin.org) joined #retro
13:33:30 trying an even simpler irc client
13:35:52 test
13:36:53 ok, it seems to work ok
13:37:08 and doesn't have a curses interface, so scales down better
13:53:43
14:01:26 --- join: Mat2 (~4d177c2b@gateway/web/freenode/x-scvmpshqtaqlgwap) joined #retro
14:01:35 hello everyone
14:01:51 @crc: do you readin ?
14:04:20 I have just uploaded a new version of AVM
14:05:28 sttill at work
14:05:37 home in about 30 min
14:07:11 ok, sorry
14:19:20 will be back later
14:19:23 --- part: Mat2 left #retro
14:25:20 home now
14:42:47 --- join: Mat2 (~4d177c2b@gateway/web/freenode/x-pkqwiexvoypxaqkf) joined #retro
14:43:06 hi mat
14:43:12 will have results for you in a minute
14:43:22 hi,
14:43:28 thanks .)
14:43:31 :)
14:50:00 http://gist.github.com/295216
14:52:34 hmm, ok that helps
14:52:38 thanks
14:53:00 you're welcome
14:54:26 can you compile the tests with the replicating-switch interpreter ?
14:55:12 but beware, compilation would need a lot of memory
14:55:16 --- join: erider (~chatzilla@pool-173-69-160-231.bltmmd.fios.verizon.net) joined #retro
14:55:22 hi erider
14:55:22 --- quit: erider (Changing host)
14:55:23 --- join: erider (~chatzilla@unaffiliated/erider) joined #retro
14:56:15 Mat2: doing so now
14:57:34 why does it use so much memory?
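[Editorial aside on the question just above. The log never shows AVM's source, so what follows is a generic illustration and not Mat2's code. "Replicating-switch" dispatch usually means that every opcode handler ends with its own copy of the dispatch switch, instead of jumping back to one shared switch. With N opcodes that gives roughly N copies of an N-way switch, which plausibly explains why such sources are heavy on compiler memory. The opcode names and the functions run_switch and run_replicated are made up for this sketch.]

```c
/* Toy opcodes for a one-register machine (hypothetical, not AVM's). */
enum { OP_HALT, OP_INC, OP_DEC, OP_DOUBLE };

/* Classic switch dispatch: one shared switch inside a loop. */
long run_switch(const unsigned char *code)
{
    long acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INC:    acc++;    break;
        case OP_DEC:    acc--;    break;
        case OP_DOUBLE: acc *= 2; break;
        default:        return acc;   /* OP_HALT or unknown opcode */
        }
    }
}

/* Replicated dispatch: the macro pastes a full copy of the switch
 * after every handler, so each handler has its own dispatch branch. */
#define DISPATCH()                      \
    switch (*code++) {                  \
    case OP_INC:    goto do_inc;        \
    case OP_DEC:    goto do_dec;        \
    case OP_DOUBLE: goto do_double;     \
    default:        goto do_halt;       \
    }

long run_replicated(const unsigned char *code)
{
    long acc = 0;
    DISPATCH();
do_inc:    acc++;    DISPATCH();
do_dec:    acc--;    DISPATCH();
do_double: acc *= 2; DISPATCH();
do_halt:   return acc;
}
```

[Both functions compute the same result for the same program; only the shape of the generated dispatch code differs. Whether the replicated form is actually faster depends on the compiler and CPU, which is what the timings in this log are probing.]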
14:59:07 hi
14:59:18 hi erider
14:59:39 --- quit: Mat2 (Ping timeout: 248 seconds)
15:00:15 hey crc what you guys up to today
15:01:52 helping mat2 work on his vm a bit
15:02:01 reading yiyus's message to the mailing list
15:04:53 --- join: Mat2 (~4d177c2b@gateway/web/freenode/x-bsglxouafrgmqaia) joined #retro
15:05:01 server reset ?
15:05:56 * crc is still recovering from compiling both files at the same time (a definite mistake)
15:06:12 http://gist.github.com/295249
15:07:53 oh i'm sorry, the rst sources are very memory intensive
15:08:06 that's ok
15:08:16 I'll compile them separately next time :)
15:08:36 importent is compiling them with -O
15:08:43 important
15:08:55 I used -O3
15:09:03 should I redo with -O ?
15:09:25 please
15:09:37 ok
15:10:13 but be beware of the memory usage !!!
15:15:25 http://gist.github.com/295254
15:19:38 just as I guess: The timings aren't such bad as I feard because the sb benchmark for AVM iterates 2.096.001 more subroutine calls as the others and the timings for ngaro: fast-console are near
15:20:21 by the TTC interpreter of AVM
15:20:28 so this is a cache issue
15:21:12 -> The memory managment of Mac OS X seems to generate a lot more page faults than linux
15:22:51 as the ngaro vm generates smaller executables this has the effect of slightly better results
15:23:43 thanks
15:23:48 no problem
15:24:25 what I don't understand is why gforth performs so bad on my system
15:24:31 btw, if I disable the nop's, the code gets to run much faster:
15:24:34 real 0m9.302s
15:24:35 user 0m9.072s
15:24:35 sys 0m0.045s
15:24:39 with fast-console
15:24:51 after loading:
15:24:52 : .fast compiler @ if 1+ 1+ compile else execute then ;
15:24:52 ' .fast is .word
15:25:05 to have the compiler skip over the nop's
15:25:21 jes but the benchmark code for AVM is far from optimisated
15:25:30 ok
15:26:47 AVM can skip every second subroutine call completely though conditional execution
15:27:35 and because the interpreter can execute two branches in every dispatch
15:28:11 but that wouldn't be fair against the ngaro vm
15:38:32 crc what is 1+ adding to
15:38:44 : .fast ( a- )
15:38:51 classes get passed the address of a word
15:39:07 the : compiler lays down two nop instructions at the start of a word
15:39:20 1+ 1+ skips to the first actual instruction
15:40:16 (doing this loses the flexibility of vectored execution, but improves performance.)
15:44:41 crc how doe sit skip
15:45:16 compiler lays down an address the you are incrementing
15:45:58 .fast replaces the default compiler for normal words
15:46:38 that* you
15:52:49 erider: the compoilation is done by .fast
15:52:57 so it can increment the pointer before compiling it
15:53:29 thanks so it inc the pointer
15:53:35 yes
16:02:39 crc you guys working on the vm
16:02:59 Mat2 is working on a new vm that will eventually have a port of retro
16:08:00 that's not ngaro
16:08:13 nope
16:08:46 @crc: I have dropped the times under 8.6
16:10:10 but as written, the code uses some capabilitys of AVM not avariable on ngaro so this wouldn't be a fair comparision
16:11:42 hmm, capabilities would be right I think
16:12:03 time for sleep, ciao crc, erider
16:12:13 goodnight Mat
16:12:34 --- quit: Mat2 (Quit: Page closed)
16:54:29 --- quit: SimonRC (Ping timeout: 246 seconds)
16:58:50 crc hey do you think that queues are easy to work with than arrays
17:00:00 arrays are simpler, implementation-wise
17:02:01 but what is easier uses
17:02:23 I think that would vary with the task at hand
17:02:47 hmm
17:03:19 its an array but you have access to both sides
17:03:20 queues are basically a FIFO buffer, not designed for random access
17:03:45 in an array, you can access any element, not just from the ends
17:04:22 --- join: SimonRC (~sc@fof.durge.org) joined #retro
17:04:55 erider: in a proper queue, you can push to the end of the queue and pop from the front, so not full access to both ends
17:05:58 I want to make some that you can pop from both sides
17:07:08 you want a doubly-linked list I think
17:07:44 I want something to play with in forth
17:10:25 http://code.google.com/p/ffl/source/browse/trunk/ffl/dcl.fs
17:10:47 GPL, so don't ask me to read it in any detail
17:45:05 crc do doesn't put anything on the params stack right?
17:45:22 no
17:45:43 it pushes loop indexes to a control stack or to the return stack
17:45:48 but not the data stack
17:46:08 shit
17:46:16 why?
17:46:37 it is hard to write other peoples forth code
17:47:26 people define a word that is going to get its input from someone else I they don't document that
17:47:43 which word
17:50:30 no worries just getting a little frustrated
17:51:17 most forth users tend to develop their own libraries and tools, gathered from earlier projects and their usage of forth
17:52:22 readonly
17:52:54 no, just poorly documented
17:53:29 * crc finds most code from others to be hard to follow, no matter what language
17:53:50 commenting, structure, naming, etc all vary greatly between individual programmers
17:54:02 so true
17:54:34 yeah
17:54:41 agreed
17:55:55 forth just happens (like lisp and smalltalk) to be quite mallable; so it's harder to follow
17:55:58 especially at first
18:11:46 I get read lisp
18:12:01 crc hey what is BL
18:12:54 bl?
18:13:04 in forth
18:13:27 constant for blank space, in ANS
18:13:34 32 CONSTANT BL
18:13:44 ah ok
18:13:45 generally
18:14:07 docl: I'll be exposing 'space' as a word; it's reused enough
18:15:50 scratch that, I already did :)
18:18:10 cool :)
18:18:30 I must be getting forgetful, it's in 10.4 already
18:23:37 docl: what are you working on
18:28:52 * docl has been altering the theme on my forum
18:28:57 forthcommunity.com
18:29:01 it's looking nice
18:29:56 --- join: sixforty (~sixforty@pdpc/supporter/active/sixforty) joined #retro
18:31:03 thanks crc :)
19:13:30 --- quit: erider (Ping timeout: 248 seconds)
20:07:29 http://www.forthcommunity.com/forum/viewtopic.php?f=9&t=19
20:07:35 Double linked lists
20:07:57 very nice
20:08:05 :)
20:09:10 I left spaces in the 1 + part for stylistic reasons. It may be more portable that way too.
20:10:02 that just involves executing one extra instruction, I'd not worry about it unless the code gets called a lot
21:04:13 --- quit: sixforty (Quit: Leaving.)
22:03:29 --- quit: probonono (Remote host closed the connection)
23:59:59 --- log: ended retro/10.02.04
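[Editorial aside on the 17:05 to 17:07 exchange about a queue that can be popped from both sides. A minimal sketch of that idea as a doubly-linked list, written in C here instead of Forth as an assumption for illustration. It is unrelated to the ffl dcl.fs source or to the forum post linked above; the type and function names (Deque, deque_push_front and so on) are hypothetical.]

```c
#include <stdlib.h>

/* One list node; the prev and next links make the list doubly linked. */
typedef struct Node {
    int value;
    struct Node *prev;
    struct Node *next;
} Node;

/* The deque only tracks both ends of the list. */
typedef struct {
    Node *head;
    Node *tail;
} Deque;

void deque_init(Deque *d) { d->head = NULL; d->tail = NULL; }

/* Push at the front. Returns 1 on success, 0 if allocation fails. */
int deque_push_front(Deque *d, int value)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->value = value;
    n->prev = NULL;
    n->next = d->head;
    if (d->head != NULL) d->head->prev = n; else d->tail = n;
    d->head = n;
    return 1;
}

/* Push at the back. Returns 1 on success, 0 if allocation fails. */
int deque_push_back(Deque *d, int value)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->value = value;
    n->next = NULL;
    n->prev = d->tail;
    if (d->tail != NULL) d->tail->next = n; else d->head = n;
    d->tail = n;
    return 1;
}

/* Pop from the front into *out. Returns 0 if the deque is empty. */
int deque_pop_front(Deque *d, int *out)
{
    Node *n = d->head;
    if (n == NULL) return 0;
    *out = n->value;
    d->head = n->next;
    if (d->head != NULL) d->head->prev = NULL; else d->tail = NULL;
    free(n);
    return 1;
}

/* Pop from the back into *out. Returns 0 if the deque is empty. */
int deque_pop_back(Deque *d, int *out)
{
    Node *n = d->tail;
    if (n == NULL) return 0;
    *out = n->value;
    d->tail = n->prev;
    if (d->tail != NULL) d->tail->next = NULL; else d->head = NULL;
    free(n);
    return 1;
}
```

[All four operations run in constant time, which is the point crc makes at 17:07: an array gives random access, while the linked form gives cheap insertion and removal at both ends.]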