00:00:00 --- log: started forth/18.01.07
00:34:04 --- join: ncv_ (~neceve@2a02:c7d:c5c9:a900:c792:a3e8:397d:b37) joined #forth
00:34:04 --- quit: ncv_ (Changing host)
00:34:04 --- join: ncv_ (~neceve@unaffiliated/neceve) joined #forth
00:51:45 --- quit: jedb (Ping timeout: 252 seconds)
01:04:59 --- join: jedb (~jedb@71.19.249.82) joined #forth
02:09:51 --- join: mykespb (~myke@213.141.133.133) joined #forth
02:32:37 --- join: alexshendi (~yaaic@2a02:8070:218b:bd00:9987:6d45:45f2:26fd) joined #forth
02:42:41 --- quit: mykespb (Quit: Leaving)
03:04:02 --- join: dddddd (~dddddd@unaffiliated/dddddd) joined #forth
04:05:55 --- join: gravicappa (~gravicapp@h62-133-162-121.dyn.bashtel.ru) joined #forth
04:14:54 --- quit: alexshendi (Ping timeout: 240 seconds)
04:38:57 --- join: alexshendi (~alexshend@HSI-KBW-095-208-250-068.hsi5.kabel-badenwuerttemberg.de) joined #forth
05:31:07 --- quit: alexshendi (Ping timeout: 265 seconds)
08:47:33 --- quit: ThirtyOne32nds (Ping timeout: 260 seconds)
09:14:17 --- join: dys (~dys@tmo-100-224.customers.d1-online.com) joined #forth
09:20:14 --- join: Gromboli (~Gromboli@static-72-88-80-103.bflony.fios.verizon.net) joined #forth
10:14:31 --- quit: gravicappa (Ping timeout: 264 seconds)
10:25:22 --- join: proteusguy (~proteus-g@bba575250.alshamil.net.ae) joined #forth
10:25:22 --- mode: ChanServ set +v proteusguy
10:56:03 --- join: ThirtyOne32nds (~rtmanpage@180.sub-174-204-5.myvzw.com) joined #forth
11:18:58 --- join: groovy2shoes (~groovy2sh@unaffiliated/groovebot) joined #forth
11:54:57 --- join: jcob (~user@152.7.255.232) joined #forth
12:15:28 --- mode: crc set +b *!~tkeeq@165.16.66.58
12:15:33 --- mode: crc set +b *!~pekpburk@187.16.217.86
12:18:32 --- mode: crc set +o koisoke
12:27:39 --- join: diginet2_ (~diginet@107.170.146.29) joined #forth
12:35:28 --- quit: diginet2 (*.net *.split)
12:35:29 --- nick: diginet2_ -> diginet2
12:51:58 --- quit: ncv_ (Ping timeout: 255 seconds)
13:21:29 fffffffffff I think I've made a huge mistake
13:22:01 should stop trying to be clever and just compile 32-bit words already
13:30:53 zy]x[yz: hmm?
13:36:39 the forth I'm building runs in 64-bit mode but compiles 16-bit addresses
13:37:40 so the way I do this is that I ensure that the code is mapped to a region that's 64K-aligned, and then I preload the high-bits of the IP and NEXT just loads the lower 16-bits
13:38:26 this is fine for now, but eventually I'd like to support shared objects, which would require position-independent code
13:38:46 there was even a guy in here not too long ago who was doing something similar and I acted like I knew what I was doing
13:38:55 but it just occurred to me that my plan has a fatal flaw
13:40:20 namely, the code-field is still 64-bit. so I'll have to either fix-up all of those when a module is mapped, or make them relative somehow (which might be nice because I could probably also make them 16-bit in that case, but it would be a super slow NEXT)
13:40:52 alternatively I could switch to DTC but when I experimented with that, ITC was like 50% faster on x86
13:50:15 I guess compiling 32-bit addresses doesn't really solve my PIC problem
13:50:26 I need to think about this more...
13:58:04 --- join: fiddlerwoaroof_ (~fiddlerwo@unaffiliated/fiddlerwoaroof) joined #forth
13:59:20 --- quit: fiddlerwoaroof (Ping timeout: 248 seconds)
15:02:15 --- quit: dddddd (Remote host closed the connection)
15:18:57 --- quit: dys (Ping timeout: 268 seconds)
16:06:20 --- quit: pointfree1 (Read error: Connection reset by peer)
16:06:21 --- quit: M-jimt (Read error: Connection reset by peer)
16:16:15 --- quit: Riviera- (Ping timeout: 276 seconds)
16:17:44 --- join: pointfree1 (pointfreem@gateway/shell/matrix.org/x-hiqnecfarlenflkq) joined #forth
16:19:32 --- join: lijero (~lijero@unaffiliated/lijero) joined #forth
16:28:17 --- join: Riviera- (Riviera@2a03:b0c0:1:d0::10:b001) joined #forth
16:35:09 zy]x[yz: out of curiosity, what does your NEXT look like now?
16:37:19 lodsw; jmp *(%rax)
16:39:07 not quite familiar with x86_64 architecture but isn't there an instruction to load only into the lower (or upper half) of a 64-bit register?
16:40:08 lodsw loads the bottom two bytes
16:40:51 I mean, in general you can mostly address the lower half of all registers
16:41:12 like eax is the low half of rax, ax is the low half of eax, etc
16:41:49 but there's one quirk that all loads into e*x also clear the top 32 bits of r*x
16:42:10 oh, yeah, x86 is just nasty in that regard.
16:43:39 tbh that's not my problem though. my problem is that I need that jmp *(%rax) to be a jump relative to rax
16:45:25 yeah, you are bumping against x86 peculiarities and its non-uniformity of address modes
16:45:46 I'd have to do like... movzwq (%rax), %rcx; jmp *(%rax, %rcx)
16:46:15 but that is like two memory fetches, man!
16:46:20 assuming I can do that. I'm not even sure if indirect jmp takes a modr/m byte or is it a hard-coded thing
16:46:34 if it's the latter case then I also have to throw an lea in there
16:46:51 though the first causes the second to be fast as the value will be in cache the second time around.
16:47:01 jmp *(%rax) here meaning jump to the contents of the address pointed to by rax? My assembler's a little rusty
16:47:09 yeah
16:47:36 att x86 is also just weird assembly
16:48:19 but I thought lodsw cleared the high bits of %rax?
16:48:27 oops, movswq earlier... would want that sign bit
16:48:28 --- quit: jcob (Remote host closed the connection)
16:48:42 no, only the 32-bit instructions do that
16:48:45 --- join: jcob (~user@152.7.255.232) joined #forth
16:49:11 Huh. Weird.
16:49:33 this just occurred to me: what exactly is the difference between a super-scalar RISC that is running something akin to a CISC emulator switch-loop and modern x86?
16:50:14 (with the RISC we can presume that the emulator switch-loop is in instruction cache)
16:52:03 --- join: M-jimt (jimtmatrix@gateway/shell/matrix.org/x-ravjtnkolbxxwnyx) joined #forth
16:53:09 I don't know? I don't actually know how this stuff works underneath, I just work here
16:56:11 you get to see the RISC code is what I gather.
16:57:28 I was watching a 34c3 talk on how certain hackers/university-researchers found out how to get at the microcode of now-old AMD processors.
17:09:37 heh cool
17:10:01 implement forth in the microcode
17:55:39 hmm, my internet searching skills have failed me... what's the meaning of jmp *(%rax, %rcx)?
17:58:05 I presume that rax and rcx are added together before using that value as an address whose contents become the new instruction pointer.
17:58:58 yeah. like I said, might not be valid for a jmp
17:59:40 but you can do like mov (%rsp, %rax, 2), %rbx
18:00:14 that's rbx <= *(rsp + rax * 2)
18:01:24 there's also an lea instruction which just calculates the address and stores it to the destination without actually fetching from memory, so you can use it for single-instruction math sometimes
18:01:45 and also you can add an immediate offset
18:02:07 (to all of them, not just lea)
18:03:18 lea 12(%rdx, %rcx, 8), %rax // rax <= rdx + rcx * 8 + 12
18:05:13 Zarutian_PI: i guess the only difference is the lower speed :D
18:06:41 it's encoded by a ModR/M byte and a SIB byte which follow a lot of the primary opcodes to specify operands. but some opcodes don't take those bytes and just encode the operand into the bits of the instruction. without looking it up I'd guess the indirect jmp is one where the operand is encoded in bits in the opcode
18:06:49 yunfan: re that RISC running CISC emulation versus x86? Not much or any speed reduction I guess.
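The addressing and register-width rules discussed above can be modelled in plain C as a rough sketch. The function names are hypothetical and exist only for illustration; the arithmetic follows the `disp(base, index, scale)` form and the 16-bit versus 32-bit write behaviour described in the log. `relative_target` shows what a 16-bit, sign-extended, word-relative code field would resolve to, which is the job the `movswq` plus add (or `lea`) sequence would do. For what it is worth, the near indirect `jmp` (opcode `FF /4`) does take a ModR/M operand, so `jmp *(%rax, %rcx)` should assemble, but it loads its target from memory at `rax + rcx` rather than jumping to `rax + rcx` itself.

```c
#include <assert.h>
#include <stdint.h>

/* Address computed by the AT&T operand disp(base, index, scale). */
static uint64_t effective_address(uint64_t base, uint64_t index,
                                  unsigned scale, int32_t disp)
{
    /* The SIB byte can only encode these four scale factors. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint64_t)(int64_t)disp;
}

/* A 16-bit write (e.g. lodsw into %ax) keeps the upper 48 bits. */
static uint64_t write_low16(uint64_t reg, uint16_t value)
{
    return (reg & ~UINT64_C(0xFFFF)) | value;
}

/* A 32-bit write (e.g. a load into %eax) zeroes the upper 32 bits. */
static uint64_t write_low32(uint64_t reg, uint32_t value)
{
    (void)reg; /* the old contents do not survive */
    return (uint64_t)value;
}

/* Sign-extend a 16-bit field to 64 bits, as movswq would. */
static int64_t sign_extend16(uint16_t field)
{
    return (field & 0x8000) ? (int64_t)field - 0x10000 : (int64_t)field;
}

/* Hypothetical relative code field: target = word address + signed offset. */
static uint64_t relative_target(uint64_t word_addr, uint16_t field)
{
    return word_addr + (uint64_t)sign_extend16(field);
}
```

With these helpers, `effective_address(rdx, rcx, 8, 12)` mirrors the `lea 12(%rdx, %rcx, 8), %rax` example, and `write_low16` shows why preloading the high bits of the register survives a `lodsw`.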
18:17:42 I guess ultimately all we can do is just try the slightly-longer NEXT and measure the performance
18:18:00 Zarutian_PI: then it must eat much more energy while implementing that emulation
18:19:05 yunfan: have you looked at how much power an x86 chip draws versus something like POWER 7?
18:19:12 do these modern CPUs have some in-CPU addressable register files?
18:19:30 Zarutian_PI: nope, it's all just my guessing
18:20:17 yunfan: from that 34c3 talk at least some AMDs have microcode registers, a lot more than x86 provides.
18:20:45 Zarutian_PI: but not addressable?
18:21:02 i mean you could address them from a segment of memory address
18:21:12 like from 0xf0 - 0xff
18:21:18 not that I know of.
18:21:26 so people could use such things as a super speed stack
18:28:07 anyone here know, or is, a VLSI chip designer or engineer? I am wondering how many small cores, say of J1 or RTX2010 with associated sram memories, you could fit on a die nowadays
18:41:58 --- quit: proteusguy (Ping timeout: 248 seconds)
20:07:47 --- quit: Gromboli (Quit: Leaving)
21:44:31 --- quit: lijero (Quit: Leaving)
21:59:14 a 16-bit indirect-threaded forth seems it'd be really compact. I have to wonder how small of executables you could make if you make it possible to strip out unused code and headers. But the issue of "given this entry point to this program, determine exactly what parts of its address space can and can't be referenced" seems impossible in the general case.
22:06:18 you only need 3 ins :D
22:06:36 i think that's a holy grail of backdoor
23:59:59 --- log: ended forth/18.01.07
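The position-independence idea from earlier in the log (16-bit cells, with the code field made relative instead of a 64-bit absolute address) can be sketched as a small C model. This is not the author's Forth: it assumes, purely for illustration, that a code field holds a small primitive ID rather than a machine-code offset, and every name here (`VM`, `run`, `build_demo_image`, the `PRIM_*` IDs, the demo layout) is hypothetical. The point it shows is that when every cell is an offset from the image base, the same bytes run unchanged wherever they are mapped, at the cost of the extra fetch in NEXT that the log worries about.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical primitive IDs stored in code fields in place of 64-bit
   machine addresses, so a freshly mapped image needs no fix-ups. */
enum { PRIM_LIT, PRIM_ADD, PRIM_DOCOL, PRIM_EXIT, PRIM_HALT };
enum { DEMO_ENTRY = 22, DEMO_SIZE = 26, STACK_DEPTH = 16 };

typedef struct {
    const uint8_t *base;          /* wherever the image happens to be mapped */
    uint16_t ip;                  /* instruction pointer as an offset from base */
    int64_t  stack[STACK_DEPTH];  /* data stack */
    uint16_t rstack[STACK_DEPTH]; /* return stack of 16-bit offsets */
    int sp, rp;
} VM;

static uint16_t get_cell(const uint8_t *base, uint16_t off)
{
    uint16_t v;
    memcpy(&v, base + off, sizeof v);
    return v;
}

static void put_cell(uint8_t *base, uint16_t off, uint16_t v)
{
    memcpy(base + off, &v, sizeof v);
}

/* Lay out a tiny image; every cell is an offset, a literal, or a primitive ID. */
static void build_demo_image(uint8_t *base)
{
    static const uint16_t cells[DEMO_SIZE / 2] = {
        PRIM_LIT,   /*  0: code field of LIT  */
        PRIM_ADD,   /*  2: code field of +    */
        PRIM_EXIT,  /*  4: code field of EXIT */
        PRIM_HALT,  /*  6: code field of HALT */
        PRIM_DOCOL, /*  8: code field of colon word FIVE */
        0, 2,       /* 10: LIT 2 */
        0, 3,       /* 14: LIT 3 */
        2,          /* 18: +     */
        4,          /* 20: EXIT  */
        8,          /* 22: entry thread calls FIVE */
        6,          /* 24: HALT  */
    };
    for (uint16_t i = 0; i < DEMO_SIZE / 2; i++)
        put_cell(base, (uint16_t)(i * 2), cells[i]);
}

/* Run from a 16-bit entry offset; returns the top of the data stack at HALT. */
static int64_t run(VM *vm, uint16_t entry)
{
    vm->ip = entry;
    vm->sp = vm->rp = 0;
    for (;;) {
        uint16_t w = get_cell(vm->base, vm->ip); /* NEXT: fetch word offset */
        vm->ip += 2;
        switch (get_cell(vm->base, w)) {         /* second fetch: code field */
        case PRIM_LIT:
            assert(vm->sp < STACK_DEPTH);
            vm->stack[vm->sp++] = get_cell(vm->base, vm->ip);
            vm->ip += 2;
            break;
        case PRIM_ADD:
            assert(vm->sp >= 2);
            vm->stack[vm->sp - 2] += vm->stack[vm->sp - 1];
            vm->sp--;
            break;
        case PRIM_DOCOL:                         /* enter a colon definition */
            assert(vm->rp < STACK_DEPTH);
            vm->rstack[vm->rp++] = vm->ip;
            vm->ip = (uint16_t)(w + 2);
            break;
        case PRIM_EXIT:
            assert(vm->rp >= 1);
            vm->ip = vm->rstack[--vm->rp];
            break;
        case PRIM_HALT:
            assert(vm->sp >= 1);
            return vm->stack[vm->sp - 1];
        default:
            assert(!"unknown code field");
            return 0;
        }
    }
}
```

To try it, call `build_demo_image` on two different buffers, point `vm.base` at each in turn, and call `run(&vm, DEMO_ENTRY)`; the image contains no absolute addresses, so both runs should behave identically. A real native implementation would still have to turn the code field into a jump target, which is where the longer NEXT comes from.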