00:00:00 --- log: started forth/18.01.07
00:34:04 --- join: ncv_ (~neceve@2a02:c7d:c5c9:a900:c792:a3e8:397d:b37) joined #forth
00:34:04 --- quit: ncv_ (Changing host)
00:34:04 --- join: ncv_ (~neceve@unaffiliated/neceve) joined #forth
00:51:45 --- quit: jedb (Ping timeout: 252 seconds)
01:04:59 --- join: jedb (~jedb@71.19.249.82) joined #forth
02:09:51 --- join: mykespb (~myke@213.141.133.133) joined #forth
02:32:37 --- join: alexshendi (~yaaic@2a02:8070:218b:bd00:9987:6d45:45f2:26fd) joined #forth
02:42:41 --- quit: mykespb (Quit: Leaving)
03:04:02 --- join: dddddd (~dddddd@unaffiliated/dddddd) joined #forth
04:05:55 --- join: gravicappa (~gravicapp@h62-133-162-121.dyn.bashtel.ru) joined #forth
04:14:54 --- quit: alexshendi (Ping timeout: 240 seconds)
04:38:57 --- join: alexshendi (~alexshend@HSI-KBW-095-208-250-068.hsi5.kabel-badenwuerttemberg.de) joined #forth
05:31:07 --- quit: alexshendi (Ping timeout: 265 seconds)
08:47:33 --- quit: ThirtyOne32nds (Ping timeout: 260 seconds)
09:14:17 --- join: dys (~dys@tmo-100-224.customers.d1-online.com) joined #forth
09:20:14 --- join: Gromboli (~Gromboli@static-72-88-80-103.bflony.fios.verizon.net) joined #forth
10:14:31 --- quit: gravicappa (Ping timeout: 264 seconds)
10:25:22 --- join: proteusguy (~proteus-g@bba575250.alshamil.net.ae) joined #forth
10:25:22 --- mode: ChanServ set +v proteusguy
10:56:03 --- join: ThirtyOne32nds (~rtmanpage@180.sub-174-204-5.myvzw.com) joined #forth
11:18:58 --- join: groovy2shoes (~groovy2sh@unaffiliated/groovebot) joined #forth
11:54:57 --- join: jcob (~user@152.7.255.232) joined #forth
12:15:28 --- mode: crc set +b *!~tkeeq@165.16.66.58
12:15:33 --- mode: crc set +b *!~pekpburk@187.16.217.86
12:18:32 --- mode: crc set +o koisoke
12:27:39 --- join: diginet2_ (~diginet@107.170.146.29) joined #forth
12:35:28 --- quit: diginet2 (*.net *.split)
12:35:29 --- nick: diginet2_ -> diginet2
12:51:58 --- quit: ncv_ (Ping timeout: 255 seconds)
13:21:29 <zy]x[yz> fffffffffff I think I've made a huge mistake
13:22:01 <zy]x[yz> should stop trying to be clever and just compile 32-bit words already
13:30:53 <Zarutian_PI> zy]x[yz: hmm?
13:36:39 <zy]x[yz> the forth I'm building runs in 64-bit mode but compiles 16-bit addresses
13:37:40 <zy]x[yz> so the way I do this is that I ensure that the code is mapped to a region that's 64K-aligned, and then I preload the high-bits of the IP and NEXT just loads the lower 16-bits
13:38:26 <zy]x[yz> this is fine for now, but eventually I'd like to support shared objects, which would require position-independent code
13:38:46 <zy]x[yz> there was even a guy in here not too long ago who was doing something similar and I acted like I knew what I was doing
13:38:55 <zy]x[yz> but it just occurred to me that my plan has a fatal flaw
13:40:20 <zy]x[yz> namely, the code-field is still 64-bit.  so I'll have to either fix-up all of those when a module is mapped, or make them relative somehow (which might be nice because I could probably also make them 16-bit in that case, but it would be a super slow NEXT)
13:40:52 <zy]x[yz> alternatively I could switch to DTC but when I experimented with that, ITC was like 50% faster on x86
13:50:15 <zy]x[yz> I guess compiling 32-bit addresses doesn't really solve my PIC problem
13:50:26 <zy]x[yz> I need to think about this more...
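The scheme described above (16-bit compiled addresses inside one 64K-aligned region, with full-width code fields) can be modelled roughly as follows. This is only a sketch under assumptions: the image layout, the primitive set, and every name (`image`, `define_prim`, `compile16`, `run`, `itc16_demo`) are hypothetical and not taken from zy]x[yz's implementation. The 64K alignment is modelled by indexing a byte array with a 16-bit offset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef void (*prim_fn)(void);

/* Hypothetical 64 KiB image; the padding keeps fetches near the top in bounds. */
static uint8_t image[(1u << 16) + sizeof(prim_fn)];
static int64_t stack[16];
static int sp;           /* data stack depth */
static uint16_t ip;      /* low 16 bits of the instruction pointer */
static uint16_t here;    /* next free offset in the image */
static int running;

static uint16_t fetch16(uint16_t addr) {
    uint16_t v;
    memcpy(&v, &image[addr], sizeof v);
    return v;
}

/* Append one 16-bit cell (a compiled address or an inline literal). */
static void compile16(uint16_t cell) {
    memcpy(&image[here], &cell, sizeof cell);
    here = (uint16_t)(here + sizeof cell);
}

/* Lay down a full-width code field and return its 16-bit address. */
static uint16_t define_prim(prim_fn fn) {
    uint16_t cfa = here;
    memcpy(&image[cfa], &fn, sizeof fn);
    here = (uint16_t)(here + sizeof fn);
    return cfa;
}

static void do_lit(void)  { stack[sp++] = fetch16(ip); ip = (uint16_t)(ip + 2); }
static void do_add(void)  { assert(sp >= 2); sp--; stack[sp - 1] += stack[sp]; }
static void do_halt(void) { running = 0; }

/* Inner interpreter: each iteration plays the role of NEXT. */
static void run(uint16_t start) {
    ip = start;
    running = 1;
    while (running) {
        uint16_t cfa = fetch16(ip);           /* lodsw: fetch a 16-bit cell */
        prim_fn fn;
        ip = (uint16_t)(ip + 2);
        memcpy(&fn, &image[cfa], sizeof fn);  /* read the full-width code field */
        fn();                                 /* jmp *(%rax) */
    }
}

/* Compiles and runs the thread "2 3 +" and returns the top of stack. */
int itc16_demo(void) {
    sp = 0;
    here = 0x100;
    uint16_t lit  = define_prim(do_lit);
    uint16_t add  = define_prim(do_add);
    uint16_t halt = define_prim(do_halt);
    uint16_t start = here;
    compile16(lit); compile16(2);
    compile16(lit); compile16(3);
    compile16(add);
    compile16(halt);
    run(start);
    return (int)stack[sp - 1];
}
```

The code fields written by `define_prim` are absolute pointers, which is exactly the part that would need fixing up (or making relative) when the image is mapped at a different address.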
13:58:04 --- join: fiddlerwoaroof_ (~fiddlerwo@unaffiliated/fiddlerwoaroof) joined #forth
13:59:20 --- quit: fiddlerwoaroof (Ping timeout: 248 seconds)
15:02:15 --- quit: dddddd (Remote host closed the connection)
15:18:57 --- quit: dys (Ping timeout: 268 seconds)
16:06:20 --- quit: pointfree1 (Read error: Connection reset by peer)
16:06:21 --- quit: M-jimt (Read error: Connection reset by peer)
16:16:15 --- quit: Riviera- (Ping timeout: 276 seconds)
16:17:44 --- join: pointfree1 (pointfreem@gateway/shell/matrix.org/x-hiqnecfarlenflkq) joined #forth
16:19:32 --- join: lijero (~lijero@unaffiliated/lijero) joined #forth
16:28:17 --- join: Riviera- (Riviera@2a03:b0c0:1:d0::10:b001) joined #forth
16:35:09 <reepca> zy]x[yz: out of curiosity, what does your NEXT look like now? 
16:37:19 <zy]x[yz> lodsw  jmp *(%rax)
16:39:07 <Zarutian_PI> not quite familiar with x86_64 architecture but isn't there an instruction to load only into the lower (or upper) half of a 64-bit register?
16:40:08 <zy]x[yz> lodsw loads the bottom two bytes
16:40:51 <zy]x[yz> I mean, in general you can mostly address the lower half of all registers
16:41:12 <zy]x[yz> like eax is the low half of rax, ax is the low half of eax, etc
16:41:49 <zy]x[yz> but there's one quirk: every load into e*x also clears the top 32 bits of r*x
16:42:10 <Zarutian_PI> oh, yeah, x86 is just nasty in that regard.
16:43:39 <zy]x[yz> tbh that's not my problem though.  my problem is that I need that jmp *(%rax) to be a jump relative to rax
16:45:25 <Zarutian_PI> yeah, you are bumping against x86 peculiarities and the non-uniformity of its addressing modes
16:45:46 <zy]x[yz> I'd have to do like...  movzwq (%rax), %rcx;  jmp *(%rax, %rcx)
16:46:15 <Zarutian_PI> but that is like two memory fetches, man!
16:46:20 <zy]x[yz> assuming I can do that.  I'm not even sure if indirect jmp takes a ModR/M byte or if it's a hard-coded thing
16:46:34 <zy]x[yz> if it's the latter case then I also have to throw an lea in there
16:46:51 <Zarutian_PI> though the first causes the second to be fast as the value will be in cache the second time around.
16:47:01 <reepca> jmp *(%rax) here meaning jump to the contents of the address pointed to by rax? My assembler's a little rusty
16:47:09 <zy]x[yz> yeah
16:47:36 <zy]x[yz> att x86 is also just weird assembly
16:48:19 <reepca> but I thought lodsw cleared the high bits of %rax?
16:48:27 <zy]x[yz> oops, movswq earlier... would want that sign bit
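The relative code field idea can be sketched as plain address arithmetic. The assumptions here are loud ones: the code field is taken to hold a signed 16-bit displacement measured from the code field's own address, and all names (`set_rel_code_field`, `resolve_code_field`, `pic_demo`) are hypothetical. Note that in AT&T syntax `jmp *(%rax, %rcx)` fetches its target from memory at rax + rcx; to jump to the sum itself, the address has to be formed in a register first, which is where the lea mentioned above comes in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a signed 16-bit displacement in the code field at offset cfa. */
static void set_rel_code_field(uint8_t *image, uint16_t cfa, int16_t rel) {
    memcpy(&image[cfa], &rel, sizeof rel);
}

/* Compute the absolute jump target for an image mapped at load_base. */
static uint64_t resolve_code_field(const uint8_t *image, uint64_t load_base,
                                   uint16_t cfa) {
    int16_t rel;
    memcpy(&rel, &image[cfa], sizeof rel);            /* movswq (%rax), %rcx */
    return load_base + cfa + (uint64_t)(int64_t)rel;  /* lea (%rax, %rcx), %rcx */
}

/* Same image bytes, two load addresses: returns 1 if both targets are right. */
int pic_demo(void) {
    static uint8_t image[0x400];
    set_rel_code_field(image, 0x200, -0x40);          /* handler sits 0x40 below */
    uint64_t a = resolve_code_field(image, 0x10000, 0x200);
    uint64_t b = resolve_code_field(image, 0x7F0000, 0x200);
    assert(b - a == 0x7F0000 - 0x10000);              /* targets move with the base */
    return a == 0x101C0 && b == 0x7F01C0;
}
```

The sign extension matters because the displacement may be negative: zero-extending -0x40 would add 0xFFC0 instead, landing almost 64K past the intended handler.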
16:48:28 --- quit: jcob (Remote host closed the connection)
16:48:42 <zy]x[yz> no, only the 32-bit instructions do that
16:48:45 --- join: jcob (~user@152.7.255.232) joined #forth
16:49:11 <reepca> Huh. Weird.
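The partial-register behaviour discussed here can be modelled in a few lines. This is a sketch of the architectural rule only (in 64-bit mode, a 16-bit destination write leaves the upper 48 bits alone, while a 32-bit destination write zero-extends into the full register); the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of writing ax: bits 16..63 of rax are preserved. */
static uint64_t write16_model(uint64_t rax, uint16_t value) {
    return (rax & ~UINT64_C(0xFFFF)) | value;
}

/* Model of writing eax: the result is zero-extended, so the old rax is lost. */
static uint64_t write32_model(uint64_t rax, uint32_t value) {
    (void)rax;  /* the previous contents do not survive */
    return (uint64_t)value;
}
```

This is why a NEXT built on lodsw can keep preloaded high bits in rax, whereas one built on lodsl could not.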
16:49:33 <Zarutian_PI> this just occurred to me: what exactly is the difference between a super-scalar RISC that is running something akin to a CISC emulator switch-loop and modern x86?
16:50:14 <Zarutian_PI> (with the RISC we can presume that the emulator switch-loop is in instruction cache)
16:52:03 --- join: M-jimt (jimtmatrix@gateway/shell/matrix.org/x-ravjtnkolbxxwnyx) joined #forth
16:53:09 <zy]x[yz> I don't know?  I don't actually know how this stuff works underneath, I just work here
16:56:11 <Zarutian_PI> you get to see the RISC code is what I gather.
16:57:28 <Zarutian_PI> I was watching a 34c3 talk on how certain hackers/university researchers found out how to get at the microcode of now-old AMD processors.
17:09:37 <zy]x[yz> heh cool
17:10:01 <zy]x[yz> implement forth in the microcode
17:55:39 <reepca> hmm, my internet searching skills have failed me... what's the meaning of jmp *(%rax, %rcx)?
17:58:05 <Zarutian_PI> I presume that rax and rcx are added together before using that value as an address whose contents becomes the new instruction pointer.
17:58:58 <zy]x[yz> yeah. like I said, might not be valid for a jmp
17:59:40 <zy]x[yz> but you can do like mov (%rsp, %rax, 2), %rbx 
18:00:14 <zy]x[yz> that's rbx <= *(rsp + rax * 2)
18:01:24 <zy]x[yz> there's also an lea instruction which just calculates the address and stores it to the destination without actually fetching from memory, so you can use it for single-instruction math sometimes
18:01:45 <zy]x[yz> and also you can add an immediate offset
18:02:07 <zy]x[yz> (to all of them, not just lea)
18:03:18 <zy]x[yz> lea 12(%rdx, %rcx, 8), %rax  // rax <= rdx + rcx * 8 + 12
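The addressing form `disp(base, index, scale)` boils down to one formula, which can be written out as a small helper. The helper name and signature are hypothetical; the only facts relied on are that the scale must be 1, 2, 4, or 8 and that the displacement is signed.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address for disp(base, index, scale): base + index * scale + disp. */
static uint64_t effective_address(uint64_t base, uint64_t index,
                                  unsigned scale, int32_t disp) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint64_t)(int64_t)disp;
}
```

With rdx = 100 and rcx = 3, `lea 12(%rdx, %rcx, 8), %rax` corresponds to `effective_address(100, 3, 8, 12)`. A mov with the same operand would then read memory at that address, while lea just stores the address itself.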
18:05:13 <yunfan> Zarutian_PI: i guess the only difference is the lower speed :D
18:06:41 <zy]x[yz> it's encoded by a ModR/M byte and a SIB byte which follow a lot of the primary opcodes to specify operands.  but some opcodes don't take those bytes and just encode the operand into the bits of the instruction.  without looking it up I'd guess the indirect jmp is one where the operand is encoded in bits in the opcode
18:06:49 <Zarutian_PI> yunfan: re that RISC running CISC emulation versus x86? Not much or any speed reduction I guess.
18:17:42 <reepca> I guess ultimately all we can do is just try the slightly-longer NEXT and measure the performance
18:18:00 <yunfan> Zarutian_PI: then it must eat much more energy while implementing that emulation
18:19:05 <Zarutian_PI> yunfan: have you looked at how much power an x86 chip draws versus something like POWER 7?
18:19:12 <yunfan> do these modern CPUs have some in-CPU addressable register files?
18:19:30 <yunfan> Zarutian_PI: nope, it's all just my guessing
18:20:17 <Zarutian_PI> yunfan: from that 34c3 talk at least some AMDs have microcode registers, a lot more than x86 provides.
18:20:45 <yunfan> Zarutian_PI: but not addressable?
18:21:02 <yunfan> i mean you could address them from a segment of memory addresses
18:21:12 <yunfan> like from 0xf0 - 0xff 
18:21:18 <Zarutian_PI> not that I know of.
18:21:26 <yunfan> so people could use such things as a super speed stack
18:28:07 <Zarutian_PI> does anyone here know, or is anyone here, a VLSI chip designer or engineer? I am wondering how many small cores, say J1 or RTX2010, with associated SRAM memories you could fit on a die nowadays
18:41:58 --- quit: proteusguy (Ping timeout: 248 seconds)
20:07:47 --- quit: Gromboli (Quit: Leaving)
21:44:31 --- quit: lijero (Quit: Leaving)
21:59:14 <reepca> a 16-bit indirect-threaded forth seems like it'd be really compact. I have to wonder how small you could make executables if it were possible to strip out unused code and headers. But the issue of "given this entry point to this program, determine exactly what parts of its address space can and can't be referenced" seems impossible in the general case.
22:06:18 <yunfan> you only need 3 ins :D
22:06:36 <yunfan> i think that's a holy grail of backdoor
23:59:59 --- log: ended forth/18.01.07