00:00:00 --- log: started forth/18.06.09
00:13:14 --- quit: nighty-- (Quit: Disappears in a puff of smoke)
00:55:48 --- join: ncv (~neceve@unaffiliated/neceve) joined #forth
01:15:10 --- quit: karswell (Read error: Connection reset by peer)
01:16:03 --- join: karswell (~user@cust125-dsl91-135-5.idnet.net) joined #forth
01:46:45 --- join: nighty-- (~nighty@s229123.ppp.asahi-net.or.jp) joined #forth
01:52:13 --- quit: nighty-- (Max SendQ exceeded)
01:55:16 --- join: dys (~dys@tmo-109-2.customers.d1-online.com) joined #forth
02:08:57 --- join: nighty-- (~nighty@s229123.ppp.asahi-net.or.jp) joined #forth
02:12:35 --- quit: mnemnion (Remote host closed the connection)
02:13:12 --- join: mnemnion (~mnemnion@2601:643:8102:7c95:1d0b:effe:e72e:7681) joined #forth
02:23:03 --- quit: mnemnion (Ping timeout: 260 seconds)
03:25:45 --- quit: pierpal (Quit: Poof)
03:26:01 --- join: pierpal (~pierpal@host23-9-dynamic.16-87-r.retail.telecomitalia.it) joined #forth
04:27:15 --- quit: pierpal (Quit: Poof)
04:27:37 --- join: pierpal (~pierpal@host23-9-dynamic.16-87-r.retail.telecomitalia.it) joined #forth
04:57:58 --- join: dddddd (~dddddd@unaffiliated/dddddd) joined #forth
05:30:16 --- quit: pierpal (Quit: Poof)
05:30:35 --- join: pierpal (~pierpal@host23-9-dynamic.16-87-r.retail.telecomitalia.it) joined #forth
06:07:55 --- quit: dave9 (Quit: dave's not here)
06:32:57 --- nick: Zarutian_2 -> Zarutian
07:24:55 --- quit: WilhelmVonWeiner (Quit: leaving)
07:25:03 --- join: WilhelmVonWeiner (dch@ny1.hashbang.sh) joined #forth
07:33:51 --- quit: ncv (Remote host closed the connection)
07:35:34 --- join: ncv (~neceve@2a02:c7d:c5c9:a900:c792:a3e8:397d:b37) joined #forth
07:35:34 --- quit: ncv (Changing host)
07:35:34 --- join: ncv (~neceve@unaffiliated/neceve) joined #forth
07:41:42 --- quit: WilhelmVonWeiner (Quit: leaving)
07:41:53 --- join: WilhelmVonWeiner (dch@ny1.hashbang.sh) joined #forth
07:59:15 So, I can't remember if it was Zarutian or zy]x[yz, but one of you guys told me you'd found that mov eax [rsi]; add rsi, 4 was faster than lodsd.
07:59:39 lodsd is one byte and the mov, add is seven.
08:00:00 So lodsd sure wins on code size. But you're saying that the other one is still faster?
08:00:13 KipIngram: not me, I detest x86 as it is a rather crappily designed ISA.
08:00:29 That's counter-intuitive to me, given that there's so much more code that has to be fetched.
08:00:36 But doesn't mean it's not so.
08:01:09 Most x86 chips nowadays fetch cache lines' worth from main memory into instruction L2 cache
08:02:18 KipIngram: you also have to take into account decoding delay and execution.
08:02:34 Right, but somewhere in there it still has to pick through those bytes. I just think lodsd *ought* to be faster, in a careful design, but again, doesn't mean it is.
08:03:11 KipIngram: rule of thumb is that the more complex an instruction is the longer it takes to execute
08:03:14 That's true - the mov, add might be "more directly" decodable.
08:03:39 Well, what I really like about the mov, add is that it doesn't confine me to rsi.
08:04:22 I'd actually like to have rdi and rsi free to be used for A and B address regs, so I can apply rep in MOVE and CMOVE without having to move a bunch of stuff around.
08:05:03 KipIngram: also take into account that if you could take a histogram of instructions executed from programs from various sources then for CISC archs such as x86 you would see quite a hilly graph
08:05:03 I'm nudging this thing around this morning so that all of the "system-used" registers are in r8-r15, leaving all of the "standard" registers completely free.
08:05:58 Then I can just forget about those extra registers, knowing that they're doing a job for me, and regard the standard regs as things not touched by the system at all.
08:06:02 KipIngram: and such a graph would inform the ISA designers to spend more silicon estate to speed up the most used instructions
08:06:13 Yep. Makes total sense.
08:06:44 yeah if you want to fall into local maxima trap
08:07:14 Well, ultimately I'd need to profile *my* code - what the world does most commonly isn't necessarily what a Forth would do most commonly.
08:07:47 As soon as I have enough in here to make a loop I can do some timing of my own.
08:08:12 sure. I read an old textfile from way back when someone tried their Forth on a Pentium and it ran slower than on his 386
08:09:40 x86 chips nowadays don't much like branch-heavy code. But such code takes up less space in caches and runs "hotter" (common routines get frequently exercised)
08:18:51 msg dave0 Ok, modified my code so that I'm using only registers in r8-r15 for "special purposes," and allocated one to allow next to be jmp .
08:18:57 Ooops.
08:19:01 Sorry guys.
08:19:14 Not enough coffee this morning. :-(
09:10:02 KipIngram> So, I can't remember if it was Zarutian or zy]x[yz, but one of you guys told me you'd found that mov eax [rsi]; add rsi, 4 was faster than lodsd.
09:10:08 that was me
09:10:44 Ok. I felt pretty sure it was one of you, and Z denied culpability earlier. :-)
09:10:51 I switched over to that this morning.
09:11:00 is it faster for you?
09:11:04 Swelled NEXT up from 11 bytes to 17.
09:11:12 I haven't gotten quite far enough to time it yet, but I will.
09:11:24 I didn't read the rest of the backlog because I'm lazy
09:11:26 cool
09:11:26 Need some sort of loop running, so I can write a huge loop and put a stopwatch on it.
09:11:45 Well, probably not a literal stopwatch.
09:11:50 But something that I can wrap timing around.
09:11:59 yeah, assuming your findings match mine, you'll have to choose between optimizing for space or optimizing for time
09:15:03 Well, I'm jumping to my next, so I only pay those extra bytes in one spot.
09:15:17 So I don't REALLY care how big NEXT is.
09:15:33 It just surprised me a little (but not a lot) that something that much bigger could be faster.
09:15:51 But Z pointed out the idea that the bigger one could be more readily decoded.
09:16:11 Once it's in cache so you're not paying the fetch overhead every time.
09:16:41 modern-day x86 is so complicated I gave up trying to understand why things are
09:17:40 I think I've seen it suggested somewhere that you should treat it as a risc isa these days
09:18:06 any multi-op instructions are probably implemented in microcode and incur some overhead
09:18:33 Yes, makes sense.
09:18:53 Well, I like leaving rsi (and rdi) for A/B address registers.
09:19:00 Since they support instructions you can apply REP to.
09:19:40 And that probably implies a slight advantage in choosing rcx as TOS.
09:20:11 I think you'll get some space back if you change next not to use rsi, too
09:20:18 And I also like it that this lets me put the virtual machine business all over in r8-r15, where it doesn't affect the "traditional" regs.
09:20:31 Yes, I've done that - I switched to r14.
09:20:40 isn't the ModR/M encoding longer for rsi
09:20:42 oh ok
09:20:54 It is for one of them; I thought it was rbp.
09:21:03 I was immersed in that stuff a few months ago, but I've forgotten now.
09:21:12 well, using r14 you suffer having to use a prefix
09:21:21 oh you're right it is rbp
09:21:32 Hmmm. I didn't notice that in the code listing. I'll check again.
09:21:49 But that seems like it would have to be, based on what I remember.
09:22:26 this is truly premature-micro-optimization at this point
09:22:42 is that instruction four bytes or five?
09:22:49 Which one?
09:23:03 I should have put that in quotes
09:23:10 Oh - I see what you're saying.
09:23:13 I meant it hypothetically
09:23:31 Well, given that I'm jumping to NEXT it really is time that matters. Space is pretty irrelevant.
09:23:45 Especially if it doesn't CORRELATE to time.
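The two forms of NEXT compared above (lodsd versus a separate load and pointer increment) do the same logical work, and the plan to "wrap timing around" a loop can be sketched in a toy model. This is only an illustration in Python, not the code from the channel: CELL, next_lodsd, next_mov_add, run and timed are hypothetical names, and the memory dict stands in for the threaded code that rsi (or r14) would walk.

```python
import time

CELL = 4  # bytes per cell, matching the "add rsi, 4" in the discussion


def next_lodsd(memory, ip):
    """Model of lodsd: load the cell and advance ip as one combined step."""
    return memory[ip], ip + CELL


def next_mov_add(memory, ip):
    """Model of the two-instruction form: load, then advance separately."""
    w = memory[ip]   # mov eax, [rsi]
    ip = ip + CELL   # add rsi, 4
    return w, ip


def run(next_fn, memory, count):
    """Apply next_fn count times from address 0; return the cells fetched."""
    ip, fetched = 0, []
    for _ in range(count):
        w, ip = next_fn(memory, ip)
        fetched.append(w)
    return fetched


def timed(fn, *args):
    """Wrap timing around one call; returns (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Ten cells at byte addresses 0, 4, ..., 36 holding the values 0..9.
memory = {addr: addr // CELL for addr in range(0, 40, CELL)}
a, t_a = timed(run, next_lodsd, memory, 10)
b, t_b = timed(run, next_mov_add, memory, 10)
```

Both variants fetch the same sequence of cells. The elapsed times measured here reflect only the Python interpreter and say nothing about which x86 encoding is faster; that still has to be measured on the real assembly.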
09:23:46 right
09:24:30 And my philosophy has always been that if I have a little bit of code that just really, really, really must be as fast as possible, I'll write a primitive to help it along.
09:24:43 Or auto-unwind a sequence of primitives into a bigger one, with just one NEXT.
09:24:55 zy]x[yz: modern-day x86 is so complicated because it is a bodge upon a kludge no one remembers why was put in
09:25:04 That's why I decided I was ok with jmp next instead of inline next.
09:25:19 Zarutian: That's exactly what I figured.
09:25:28 I imagine it's a weird culture in that team these days.
09:25:43 The design has become like a Grecian oracle or something. :-)
09:25:54 "Find a way to squeeze this in..."
09:26:09 I sometimes think that's the case with the flash controller in our storage product.
09:31:55 what I have learned from various sources regarding ISAs is that register based systems are awkward to design and use.
09:32:58 Zarutian> zy]x[yz: modern-day x86 is so complicated because it is a bodge upon a kludge no one
09:33:00 take for one instance the J1 fpga dual stack core. Because operands are always at known places you can get away with using ?muxed-output? ALU design
09:33:01 remembers why was put in
09:33:10 sounds a lot like where I work
09:33:28 (I don't work at intel)
09:34:23 zy]x[yz: such is usually result of overpromising sales drones making up features the product does not have to make a sale and more importantly for them, commission.
09:35:35 as a guy who used to be my manager years ago said last year, "don't forget our motto: 'overpromise, underdeliver'"
09:38:20 This muxed-output ALU design allows for faster execution rate as there is no need to wait for operand selection like in register based systems.
09:40:27 I think "overpromise/underdeliver" is the whole world's business model, which is part of why I can't be a businessman.
09:41:20 It's been a long time since I've designed for a living. I absolutely loved it, but these days I do performance testing.
09:41:40 KipIngram: I think it is from that loans for boondoggles and 'leveraging' are too cheap and the attitude of rapid growth.
09:41:42 I get my "creativity fixes" from designing and building my own tools, but nothing that goes out the door.
09:42:13 And also creativity fixes from doing stuff like this on the hobby side.
09:42:34 When it's a hobby I can mull over any part of it I want to for as long as I want to, without having to justify it to anyone.
09:42:56 I can at least *try* to be an artist instead of a hack.
09:43:37 KipIngram: also deadlines that have basis in reality for products that need a lot of research and development
09:44:10 Yeah, pretty much everything encourages "get it done fast" instead of "get it done superbly."
09:44:46 --- quit: pierpal (Read error: Connection reset by peer)
09:45:56 KipIngram: yeah while the managers joggle the elbows of engineers, designers and technicians.
10:16:43 --- quit: dys (Ping timeout: 256 seconds)
10:53:45 Hmmmm.
10:54:16 I have this where it will run a test definition, which is just to print "Hello, world" five times, and then exit, with the BYE primitive.
10:54:22 That works fine.
10:54:44 Then I added a .bss section to the source, and reserved some bytes there.
10:54:50 This is a step toward a "process block."
10:55:29 Just as a test, I had the startup code copy the test definition into that area, and pointed the IP at that instead of the hand-written definition of the test word.
10:55:37 That prints hello, world five times - and then segfaults.
10:55:54 So it's *working* - it's stepping through the definition, and reaching the message print word all five times.
10:56:12 So I feel like it must be REACHING the BYE word.
10:56:16 But it's unhappy about something.
10:57:27 This was a two-nested deep test; I had this:
10:57:35 : test messages bye ;
10:57:46 : messages msg msg msg msg msg ;
10:58:15 Only messages is copied to the bss region - test is still in the text region.
10:58:27 I can add a sixth msg in test, after messages, and it prints.
10:58:51 So control is coming back from the .bss section definition - it seems that something has happened that makes the actual code of bye unhappy.
11:00:48 --- join: ncv_ (~neceve@2a02:c7d:c5c9:a900:c792:a3e8:397d:b37) joined #forth
11:00:48 --- quit: ncv_ (Changing host)
11:00:48 --- join: ncv_ (~neceve@unaffiliated/neceve) joined #forth
11:01:07 Wait - I take that back. It does NOT come back from the copied definition.
11:01:16 That seems less mysterious, somehow.
11:01:53 OH.
11:01:56 I'm testing it wrong.
11:04:27 --- quit: ncv (Ping timeout: 245 seconds)
11:04:28 --- quit: bb010g (Ping timeout: 245 seconds)
11:06:38 --- quit: tadni (Ping timeout: 240 seconds)
11:07:13 --- quit: pointfree1 (Ping timeout: 240 seconds)
11:07:35 --- quit: M-jimt (Ping timeout: 260 seconds)
11:08:19 --- quit: amuck (Ping timeout: 240 seconds)
11:09:35 --- join: amuck (~amuck@152.243.185.35.bc.googleusercontent.com) joined #forth
11:43:55 Ok. It all seems to work.
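The experiment described above (test stays in the text region, while a copy of messages placed in the .bss region is the one that gets executed) can be modeled roughly as follows. This is a hedged Python sketch, not the assembly from the channel: the MSG/BYE/EXIT tokens, the TEXT and BSS offsets, and the compile_at and run helpers are all hypothetical, and cells are list slots rather than bytes. It does not reproduce or explain the segfault mentioned in the log.

```python
# Hypothetical tokens for primitives; integer cells are addresses of
# colon-definition bodies.
MSG, BYE, EXIT = "msg", "bye", "exit"

TEXT, BSS = 0, 32        # assumed region starts, counted in cells
memory = [None] * 64     # one flat cell array holding both regions


def compile_at(addr, cells):
    """Store cells starting at addr; return the address just past them."""
    memory[addr:addr + len(cells)] = cells
    return addr + len(cells)


def run(start):
    """Inner interpreter: fetch the cell at ip, advance ip, dispatch."""
    out, rstack, ip = [], [], start
    while True:
        cell = memory[ip]
        ip += 1
        if cell == MSG:
            out.append("Hello, world")
        elif cell == BYE:
            return out
        elif cell == EXIT:
            ip = rstack.pop()    # return to the calling definition
        else:
            rstack.append(ip)    # nest into another colon definition
            ip = cell


# : messages msg msg msg msg msg ;
messages = TEXT
end = compile_at(messages, [MSG] * 5 + [EXIT])
length = end - messages

# Copy the body of messages into the "bss" region.
memory[BSS:BSS + length] = memory[messages:end]

# : test messages bye ;   (test stays in "text" but calls the copy)
test = end
compile_at(test, [BSS, BYE, EXIT])

output = run(test)
```

In this model the copied body runs unchanged because its cells hold tokens and absolute addresses, so nothing in it depends on where the body itself sits. Whether that holds for a real implementation depends on how its cells are encoded.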
12:08:48 --- join: pierpal (~pierpal@host23-9-dynamic.16-87-r.retail.telecomitalia.it) joined #forth
12:26:56 --- quit: ncv_ (Ping timeout: 256 seconds)
13:36:34 --- join: bb010g (bb010gmatr@gateway/shell/matrix.org/x-aeadktpvhwzafwcq) joined #forth
13:43:01 --- join: Keshl_ (~Purple@24.115.185.149.res-cmts.gld.ptd.net) joined #forth
13:43:19 --- quit: Keshl_ (Client Quit)
13:43:31 --- quit: Keshl (Read error: Connection reset by peer)
13:52:42 --- join: Keshl (~Purple@24.115.185.149.res-cmts.gld.ptd.net) joined #forth
14:14:50 --- quit: pierpal (Quit: Poof)
14:15:07 --- join: pierpal (~pierpal@host23-9-dynamic.16-87-r.retail.telecomitalia.it) joined #forth
14:28:24 --- join: tadni (tadnimatri@gateway/shell/matrix.org/x-cgajpdlvzijqhppc) joined #forth
14:28:25 --- join: pointfree1 (pointfreem@gateway/shell/matrix.org/x-kmwvilegjfutrdcr) joined #forth
14:28:25 --- join: M-jimt (jimtmatrix@gateway/shell/matrix.org/x-xapgvevkfmetcoxn) joined #forth
14:32:44 --- join: pierpa (57100917@gateway/web/freenode/ip.87.16.9.23) joined #forth
15:04:19 --- quit: Zarutian (Ping timeout: 264 seconds)
15:04:51 --- join: Zarutian (~zarutian@173-133-17-89.fiber.hringdu.is) joined #forth
16:33:36 --- quit: dddddd (Remote host closed the connection)
17:16:20 --- join: karswell_ (~user@185.161.200.10) joined #forth
17:17:36 --- quit: karswell (Ping timeout: 255 seconds)
17:23:44 --- join: TCZ (~Johnny@ip-91.189.219.200.skyware.pl) joined #forth
17:24:03 --- nick: karswell_ -> karswell
17:42:20 --- join: karswell_ (~user@185.161.200.10) joined #forth
17:44:18 --- quit: karswell (Ping timeout: 245 seconds)
17:46:44 --- nick: karswell_ -> karswell
18:23:29 --- quit: TCZ (Quit: Leaving)
19:29:16 --- quit: pierpa (Quit: Page closed)
20:10:38 --- quit: pierpal (Quit: Poof)
20:10:54 --- join: pierpal (~pierpal@host23-9-dynamic.16-87-r.retail.telecomitalia.it) joined #forth
20:16:21 --- quit: pierpal (Remote host closed the connection)
21:08:48 --- quit: Zarutian (Ping timeout: 240 seconds)
22:11:32 --- quit: karswell (Ping timeout: 256 seconds)
23:46:19 --- quit: epony (Quit: QUIT)
23:59:59 --- log: ended forth/18.06.09