00:00:00 --- log: started forth/10.04.24
00:01:46 --- join: ygrek (debian-tor@gateway/tor-sasl/ygrek) joined #forth
00:36:44 --- quit: kar8nga (Remote host closed the connection)
01:45:36 --- join: qFox (~C00K13S@5356B263.cable.casema.nl) joined #forth
02:25:27 --- quit: qFox (Read error: Connection reset by peer)
03:15:42 --- join: kar8nga (~kar8nga@jol13-1-82-66-176-74.fbx.proxad.net) joined #forth
03:28:20 --- join: qFox (~C00K13S@5356B263.cable.casema.nl) joined #forth
04:16:44 --- quit: ygrek (Ping timeout: 245 seconds)
04:54:08 --- quit: tathi (Quit: leaving)
05:01:08 --- quit: ASau (Quit: reboot)
05:12:00 --- join: ASau (~user@83.69.227.32) joined #forth
06:09:06 --- quit: mathrick (Remote host closed the connection)
06:10:46 --- join: mathrick (~mathrick@users177.kollegienet.dk) joined #forth
06:36:01 --- quit: qFox (Read error: Connection reset by peer)
06:55:39 --- join: ygrek (debian-tor@gateway/tor-sasl/ygrek) joined #forth
07:04:51 --- join: qFox (~C00K13S@5356B263.cable.casema.nl) joined #forth
07:30:45 --- quit: gogonkt (Ping timeout: 264 seconds)
07:32:22 --- join: gogonkt (~info@113.69.46.126) joined #forth
09:07:14 --- join: beretta (~beretta@c-24-3-44-28.hsd1.oh.comcast.net) joined #forth
09:09:52 --- quit: beretta (Client Quit)
09:12:06 --- join: beretta (~beretta@c-24-3-44-28.hsd1.oh.comcast.net) joined #forth
09:58:49 --- join: GoNoGo (~GoNoGo@2a01:e35:2ec5:dd70:646c:9f17:87b2:b018) joined #forth
10:48:22 --- quit: crc (Ping timeout: 258 seconds)
10:50:02 --- join: crc (~charlesch@184.77.185.20) joined #forth
10:53:41 --- join: alex4nder (~alexander@173-147-207-68.pools.spcsdns.net) joined #forth
10:54:01 hey
11:02:11 ping!
11:03:26 ping
11:17:24 pong
11:17:26 beretta: hey
11:18:24 ahoy!
11:18:37 sorry I'm getting carried away...
11:19:29 anyone been hacking on anything from GreenArrays recently?
11:22:52 they look interesting
11:24:39 They have "product" now?
11:26:00 aren't they just producing something like the IntellaSys things?
11:27:00 oh man. their website is messed up.
11:27:11 I like how they put a ; and have it go back in history.
11:27:41 heh
11:28:32 Chuck Moore's companies never have product.
11:28:43 They just put specs up on the website, but never ship anything.
11:28:56 Pusdesris: I have an older S24 dev kit.
11:29:04 From IntellaSys.. not a Chuck Moore company ;)
11:29:07 but nothing from GreenArrays, officially.
11:29:35 Even IntellaSys.
11:29:44 I tried getting myself a forth drive, never happened though.
11:29:55 I got an IntellaSys dev kit.
11:30:06 yup. I have a Forth drive as well.
11:30:24 the whole 40C18 eval kit, or whatever it is called.
11:30:36 yah
11:30:54 I ordered it not directly from IntellaSys but from somewhere else.. they had a link, I think.
11:31:32 I was trying to build a product based on the S24, instead of using an FPGA.. but of course, that wasn't going to happen.
11:31:45 Why didn't it happen?
11:32:54 because they couldn't even offer a low-production volume of S24s, in any reliable way.
11:33:23 Hoh.. I think I read something about that.. They were wanting to ship just single eval kits or huge amounts.
11:33:53 How did you get the forthdrive?
11:33:53 yup
11:33:58 I asked.
11:34:03 and they sent me one.
11:34:13 They used to have a link to a site that sold their products.. I can't find it now.
11:34:27 I asked too.
11:34:29 They sent me nothing.
11:34:40 Did they reply?
11:34:46 Yes.
11:34:54 Just said something like, "We ran out."
11:34:55 what did you say you were going to use it for?
11:35:16 Academic purposes.
11:35:31 Or I may have said hobby.
11:35:33 I don't remember.
11:35:44 Running out of this forthdrive seems.. I dunno. I can see how they were not making any money. But how to make money if they don't ship free products to have people play with and get hooked?
11:36:14 I ordered one to write a high-speed sensor decoder
11:36:22 i.e. decoding the crank position of an engine
11:36:25 schme: startup capital
11:36:42 Didn't they have like some competition and just the neat ones got the drive?
11:36:54 beretta: Just saying.. they should be handing those things out like candy :)
11:37:21 Agreed.
11:37:25 haha...mmmm forth candy mmmmm...
11:37:50 once you pop, you can't stop.
11:38:44 unfortunately GreenArrays isn't going to go anywhere
11:39:15 cmoore should just release all of his implementations to the public, and earn off of his patents.
11:42:34 "We recommend that you keep cats and small children away from your keyboard when colorForth is running." -- Chuck
11:48:01 alex4nder: He could then make a living as a web designer.
11:48:19 schme: hahaha
12:02:45 --- nick: Pusdesris -> Deformative
12:11:56 --- quit: Snoopy_1611 (Ping timeout: 276 seconds)
12:52:17 --- quit: GoNoGo (Quit: ChatZilla 0.9.86 [Firefox 3.6.3/20100401080539])
13:09:39 --- quit: ygrek (Ping timeout: 245 seconds)
13:30:29 --- quit: alex4nder (Ping timeout: 245 seconds)
14:25:36 --- quit: kar8nga (Remote host closed the connection)
15:33:28 --- quit: qFox (Quit: Time for cookies!)
15:38:56 --- quit: ASau (Ping timeout: 258 seconds)
15:40:56 --- join: ASau (~user@83.69.227.32) joined #forth
18:07:48 alex4nder: That just doesn't seem to be the mindset of the elite Forth guys, though - they are "close hold" in the extreme.
18:08:56 Man, it's nuts here tonight. My 4th grade daughter is having a birthday slumber party. The place is covered up in kids.
18:18:17 --- quit: crc (Quit: http://retroforth.org)
18:33:38 --- join: crc (~charlesch@184.77.185.20) joined #forth
19:19:28 Hey kip.
19:32:28 Hey.
19:32:30 How goes.
19:36:27 Spent the day relaxing.
19:36:32 Waiting for more exams.
19:36:39 To screw up on.
19:38:31 Been trying to get this research thing all worked out.
19:38:41 Trying to figure out how to get academic credits and such.
19:38:46 How about yourself?
19:44:14 Noisy house. My daughter's having a sleepover. Driving me crazy.
19:44:24 I just retreated to the bedroom.
19:44:31 Hopefully they will go upstairs soon.
19:48:22 Heh.
19:53:27 I changed !b+) to !a+). The a and b registers load as a chain, first a and then b. It made sense to put the one with the most features first in line.
19:53:40 Shortens the instruction sequence required to set up for that feature.
19:54:20 I'm also thinking of making the thing generally see nop opcodes coming and skip over them in time. So they would take up space (which is what they are for - code alignment), but they would never take up time.
19:54:23 What do you think?
19:55:42 It's over my head, I think.
19:56:46 Well, I guess what I'm asking is what do you think of giving up nop instructions as a way of "killing time" on the processor?
19:58:16 Seems valid. The only usefulness I can think of for killing time is something like "pausing" or whatever, but it can probably be implemented better with an onboard clock and loop.
20:00:20 You are not using nop for synchronization or anything are you?
20:00:47 Or with sequences like "dup drop" or "dup swap drop" (that one would take an even cell). So I don't really have to have nop for that.
20:00:52 No.
20:00:55 Just to align code.
20:01:20 Like if a ret falls on the first slot of a cell I have to put two nop codes so that I can start the next routine on an addressable cell boundary.
20:02:38 Can those opcodes ever be reached?
20:03:08 I mean, is it mandatory that they really have no operation?
20:03:16 You could always fill with arbitrary bits.
20:05:13 I think I am going to build the VGA controller and instructions right into my CPU.
20:05:25 I will have extra opcodes to do whatever I want with, and well, why not.
20:05:33 That's a damn good point.
20:05:42 Maybe I don't even need that opcode at all.
20:06:30 :)
20:06:32 Ok, I'm going to watch some TV with my wife. I'll have to ponder on whether I really *need* a nop.
20:07:28 Let's talk some more soon about a VGA controller - I've had some thoughts (very preliminary ones) about that. I think a chat would be fun.
20:07:35 Maybe later tonight or tomorrow.
20:09:48 Well.
20:10:04 The problem with not having nop is that nop is really useful for ILP.
20:10:22 But the fact that you are using a stack machine really makes ILP minimally useful.
20:10:34 Ok, talk to you later.
20:10:36 o/
20:19:09 --- quit: beretta (Quit: Poof!)
20:31:04 ...Wait.
20:31:10 Is nop useful for ILP?
20:31:15 * Deformative ponders.
20:34:40 Forget I said that.
20:34:48 I don't think it is useful for that now that I think of it.
20:54:48 --- quit: gnomon (Ping timeout: 258 seconds)
20:55:08 --- join: gnomon (~gnomon@CPE0022158a8221-CM000f9f776f96.cpe.net.cable.rogers.com) joined #forth
21:06:14 I don't know what ILP stands for.
21:06:26 Instruction level parallelism.
21:06:52 Ok.
21:07:11 My processor does a few things via look-ahead, but makes no general attempt at that.
21:07:18 I have no idea why I thought nop was useful for that.
21:07:21 * Deformative shrugs.
21:07:54 If you had a multi-processor system and were trying to line up (in time) streams of code on different processors it might be useful.
21:08:15 I was thinking, and it wouldn't be terribly difficult to implement Huffman encoding right into the instruction set.
21:09:04 One would just need to know the frequency with which each instruction is commonly used.
21:09:31 In forth, CALL is probably the most common, so that could be reduced to a single bit.
21:09:41 * Deformative shrugs.
21:11:56 Another idea I had.
21:12:28 What do you think of separate pipelines for instructions and data on a forth chip?
21:12:38 Actually, nevermind.
21:12:59 It probably wouldn't be useful, since there is so much branching and forth can easily be prefetched.
21:15:38 Well, there is potential benefit I suppose in store and fetch.
21:15:42 Not sure though.
21:18:01 It would be more difficult to write an assembler/compiler for.
21:18:23 If you wanted to put literals and stuff on their own pipeline. Lol.
21:18:34 Ok, I am just getting stupid now.
21:18:36 I should stop.
21:43:05 My CALL is a single bit, in fact. But CM had that idea years ago, so it's nothing new.
21:44:14 That's the whole MSB thing - if the MSB of a cell is zero then the other 15 bits are three five-bit opcodes. If it's 1 then the other 15 bits are used to compute an effective address and conditional info for a possible control transfer.
21:44:30 I don't know if CM had any conditional stuff in his.
21:44:46 Or the ability to do jump as well as call.
21:45:41 In my processor if bit 15 is 1 then bit 14 tells you whether to push the return address (call) or not (jump), and bits 12 and 13 specify one of four conditional possibilities.
21:47:47 Finally, if bit 11 is 0 then the 11-bit effective address (which can specify 2k cells) lies in the address range 0 - 2k. But if bit 11 is 1 then the upper five bits of the effective address are specified by a "bank register".
21:48:07 That way I can have up to 64k cells of code, in 2k cell banks.
21:48:54 The effective address is fully resolved to a 16-bit cell address at call time, and the return address that's pushed is always a full 16-bit address. So returns always work properly and go back to the proper place.
21:50:57 I see.
21:51:42 So you cannot put calls within your instruction cells.
21:51:48 I suppose that is a good thing.
21:51:48 Right now I hand code everything to specific addresses, in machine code. I find that getting the addresses specified right is quite difficult.
21:52:00 Once I write a compiler, though, it will do all that for me.
21:52:20 I do plan to add a call opcode.
21:52:42 It will pull a literal from the following cell (a 16-bit one) and will transfer control to that address, with no mapping.
21:52:59 Right now I push the literal to the return stack and then do a ret.
21:53:22 But that takes several opcodes and several cycles; I think consolidating it all to a special-purpose opcode is worth it.
21:54:28 I agree.
21:54:33 You have plenty to spare.
21:55:31 I was planning on call, branch, and 0branch.
21:55:45 I will remove branch from the list if I need an extra opcode.
22:02:28 Once I added the ability to either jump or call with an effective address cell, and once I added conditions, I needed none of those opcodes. I can call or jump, unconditionally or per three conditions, with a single 16-bit cell and no extra opcodes.
22:02:48 The only reason I'm adding a special opcode is to allow a "far call" that will go to a new mapped memory page.
22:03:14 I imagine that's what I'll replace the nop opcode with.
22:04:48 Here's my reasoning. Let's say you have call, branch, and 0branch opcodes.
22:05:03 And 16-bit address cells.
22:05:18 That means that a call takes 16+(opcode size) bits.
22:05:22 As do the other two.
22:05:52 A conditional *call* takes a lot of bits - the 16+(opcode size) for the call as well as the opcode for the conditional jump, the bits for the target address, etc.
22:06:00 I can put all of these things into a 16-bit cell.
22:06:18 The cost is that I require the mapping process to get full addressing range.
22:06:50 But if far calls and jumps are rare compared to short ones then this will pay off in code space and in the time required to fetch that code.
22:07:30 --- quit: segher (Quit: This computer has gone to sleep)
22:08:10 Hmm.
22:08:23 Is there any reason to pack instructions into cells other than for unext?
22:08:51 Code space.
22:08:58 I do not understand.
22:09:24 Plus memory access time - if you can fetch three opcodes in one read then you can run instructions at three times the memory access rate.
22:09:30 I'm not doing that yet, but I intend to.
22:09:34 Future version.
22:10:03 So, you can still fetch 3 at a time... Why do you need to be restricted to these cells?
22:10:28 You can left shift by opcode size every instruction.
22:10:55 I don't understand - "restricted to these cells"?
22:11:22 You mean why can't I do a wider fetch?
22:11:33 Ok, say you fetch 16 bits at a time.
22:11:39 That's right.
22:11:43 And you have 32 opcodes.
22:11:45 I could fetch more if I wanted to.
22:11:46 So each takes 4 bits.
22:11:53 Each takes five bits.
22:12:03 Oh, right.
22:12:11 Ok.
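The cell format sketched in the preceding messages can be summarized in a short decode routine. The C sketch below only restates what was described: bit 15 selects an opcode cell (three 5-bit slots) versus an EA cell; in an EA cell, bit 14 selects call versus jump, bits 13-12 select unconditional or one of three conditions, bit 11 selects the low 2k block versus the bank register, and bits 10-0 hold the 11-bit effective address. The slot ordering, the condition numbering, and all names here are assumptions for illustration, not the actual design.

    #include <stdint.h>
    #include <stdio.h>

    /* Resolve an EA cell to a full 16-bit cell address. Assumes a 5-bit bank
       register supplies the upper address bits when bit 11 is set, giving
       64k cells of code in 2k-cell banks. */
    static uint16_t resolve_ea(uint16_t cell, uint16_t bank_reg)
    {
        uint16_t ea = cell & 0x07FF;        /* bits 10..0: 11-bit effective address */
        if (cell & 0x0800)                  /* bit 11: map through the bank register */
            return (uint16_t)(((bank_reg & 0x1F) << 11) | ea);
        return ea;                          /* otherwise the target lies in 0..2047 */
    }

    static void decode(uint16_t cell, uint16_t bank_reg)
    {
        if ((cell & 0x8000) == 0) {         /* MSB = 0: three packed 5-bit opcode slots */
            unsigned s1 = (cell >> 10) & 0x1F;   /* slot order assumed, not confirmed */
            unsigned s2 = (cell >> 5)  & 0x1F;
            unsigned s3 =  cell        & 0x1F;
            printf("opcode cell: %u %u %u\n", s1, s2, s3);
        } else {                            /* MSB = 1: effective-address (EA) cell */
            int is_call = (cell >> 14) & 1; /* bit 14: push return address (call) or not (jump) */
            int cond    = (cell >> 12) & 3; /* bits 13..12: unconditional or one of three conditions */
            printf("EA cell: %s cond=%d target=0x%04X\n",
                   is_call ? "call" : "jump", cond, (unsigned)resolve_ea(cell, bank_reg));
        }
    }

Laid out this way, the trade-off argued in the conversation is visible: a conditional call or jump costs one 16-bit cell, at the price of the bank-mapping step for far targets, instead of separate call/branch/0branch opcodes plus a 16-bit address cell.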
22:12:11 The one left over is the one that specifies whether it's three opcodes or an effective address / conditional.
22:12:35 Let's just call them opcode cells and EA cells from now on.
22:12:58 Hmm.
22:13:28 Ok, hypothetically, let's say that there were only 16 instructions.
22:13:53 You fetch 4 at a time.
22:14:13 But you always have 4 extra prefetched.
22:15:05 So, let's just store them in a 32-bit register.
22:15:21 Until you branch before you need them, and then you haven't prefetched the ones you need.
22:15:36 I've been all down this road.
22:15:53 Makes sense.
22:15:54 Ok.
22:16:15 And then if you don't have that one leftover bit for the EA specification you have to have a call opcode.
22:16:26 So a series of colon definitions becomes this:
22:16:29 nop nop nop call
22:16:30 EA
22:16:33 nop nop nop call
22:16:35 EA
22:16:41 So you waste a bunch of slots on nops.
22:17:08 Essentially each colon definition becomes a 32-bit thing.
22:17:12 Makes sense.
22:17:41 Would it be possible to make it so you can fetch 32 bits at a time?
22:18:38 I suppose that is a lot of extra fetching if you branch quickly.
22:19:38 In an FPGA, yes, almost anything is possible. Put your memory banks in parallel. But yes, you are seeing the problems.
22:20:29 For tightly wound code with lots of calls and branches 32-bit fetches are wasteful.
22:20:36 And besides, code space counts in an FPGA.
22:20:55 Your internal memory is *much* faster than external memory, and smaller FPGAs have only modest memory. You don't want to waste it.
22:21:18 * Deformative ponders.
22:27:22 How do your calls work again?
22:27:28 With your cell packing method?
22:29:24 If the MSB of a cell is 1, then the lower 15 bits specify an EA and conditions.
22:29:39 Bit 15 specifies call vs. jump (do I push the return address).
22:29:47 I'm sorry - bit 14.
22:30:03 Ok, you have "+ USER_DEFINED_WORD"
22:30:05 Bits 12 and 13 specify unconditional or one of three conditions.
22:30:07 What does that compile to?
22:30:29 Bit 11 specifies hard reference to lowest order block vs. block specified by map register.
22:30:49 I don't know what you mean by that.
22:31:46 If someone were to type : foo * - ; : bar + foo ; what is generated?
22:32:56 --- join: alex4nder (~alexander@dsl093-145-168.sba1.dsl.speakeasy.net) joined #forth
22:33:19 In the bar definition, what does + foo look like?
22:38:21 KipIngram: ?
22:44:50 Sorry; distracted for a minute.
22:44:58 No worries.
22:45:20 At the target address for foo you get this:
22:45:30 * - ret
22:45:34 So that's just one cell.
22:45:54 At the target address for bar you get this:
22:45:59 Ok, let's represent cells as []. So [*, -, ret, x]
22:46:01 + nop nop
22:46:03 Where x is don't care.
22:46:15 EA cell with call to foo
22:46:18 Oh yeah, 3 words per cell.
22:46:20 ret nop nop
22:46:20 I forgot.
22:46:23 --- join: forther (~62d2faca@gateway/web/freenode/x-ftebgrwsdbhxfxdi) joined #forth
22:46:26 No, wait.
22:46:27 Better.
22:46:40 at the address for bar you get this:
22:46:54 nop nop +
22:46:55 hi
22:47:00 ea for jump to foo
22:47:03 and that's it.
22:47:20 Since you jump to foo the ret takes you back to bar's caller.
22:47:55 Hi forther.
22:49:19 Ok, I see.
22:50:13 It seems like most of the time if you use an opcall somewhere in a definition, you will waste most of the cell.
22:50:47 If you need just one opcall then it's inefficient.
22:51:01 This is the biggest reason I pushed so hard to get calls, jumps, conditionals, etc. into the EA cells.
22:51:02 I think that is most of the time.
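For the : foo * - ; and : bar + foo ; example just walked through, here is a hedged sketch of what a cross-assembler might emit, using the cell layout from the earlier decode sketch. The opcode numbers, macro names, and the address for foo are invented; only the packing - foo as one opcode cell, bar as an opcode cell plus an EA cell that jumps (rather than calls) to foo - follows the conversation.

    #include <stdint.h>

    enum { OP_NOP = 0, OP_MUL = 1, OP_SUB = 2, OP_ADD = 3, OP_RET = 4 };  /* made-up encodings */

    /* Pack three 5-bit slots into an opcode cell (MSB = 0); slot order assumed. */
    #define OPCELL(s1, s2, s3) \
        ((uint16_t)((((s1) & 0x1F) << 10) | (((s2) & 0x1F) << 5) | ((s3) & 0x1F)))

    /* Unconditional EA cell (MSB = 1, condition bits zero, bit 11 clear -> low 2k bank). */
    #define EACELL(is_call, target) \
        ((uint16_t)(0x8000u | ((is_call) ? 0x4000u : 0u) | ((target) & 0x07FFu)))

    enum { FOO_ADDR = 0x0100 };           /* arbitrary example address for foo */

    /* : foo * - ;  ->  one opcode cell, with ret in the last slot */
    static const uint16_t foo_code[] = {
        OPCELL(OP_MUL, OP_SUB, OP_RET),
    };

    /* : bar + foo ;  ->  [ nop nop + ] followed by an EA cell that jumps (not calls)
       to foo, so foo's ret returns straight to bar's caller */
    static const uint16_t bar_code[] = {
        OPCELL(OP_NOP, OP_NOP, OP_ADD),
        EACELL(0, FOO_ADDR),
    };

The bracket notation in the log maps one-to-one onto these two initializers; the tail jump is what lets bar fit in two cells with no explicit ret of its own.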
22:51:15 So that I didn't need isolated jmp, jz, call, etc. opcodes.
22:52:11 I don't know - I think it would vary.
22:52:34 I imagine that it would be close to 1/3 of the time for each case (one slot used, two slots used, three slots used).
22:52:43 --- part: a3i left #forth
22:52:44 Or maybe 1/2, 1/4, 1/4.
22:52:54 But not 0.9, 0.05, 0.05.
22:53:17 I won't know for sure until I can write and profile a fairly substantial body of code.
22:53:52 I am wondering if it is possible to put jump opcodes in cells as well, then use the same sort of style you used for literals.
22:54:02 It just makes return extremely difficult.
22:54:14 Actually! Not too difficult!
22:54:56 Sure; jump opcodes are easy to implement.
22:55:06 I did it that way in my early drafts.
22:55:13 No, not the way I am thinking.
22:55:29 You need to reload the offset when returning.
22:55:50 Oh, you mean return to a slot within a cell instead of to the cell itself?
22:55:54 But you can store that in the return stack, or you could do some analysis of the cell when you return.
22:56:11 I haven't even looked into that.
22:56:35 Yes, you could store two extra bits to specify the state (which specifies the slot).
22:57:04 But are you going to refetch the cell, and then go to the right slot (which means going to state 0 and then on to the right slot state)?
22:57:24 Or are you going to push the cell contents onto the return stack as well (now you have a 34-bit-wide return stack)?
22:57:29 Issues...
22:58:26 Hmm.
22:58:39 Pushing cell contents would be fascinating.
22:58:44 You could exploit that hack like crazy.
22:58:49 I decided not to.
22:58:58 But yes, I did find it an interesting idea.
22:59:35 But like I said, it would make return very complicated.
22:59:46 No, that's what would make returning easy.
23:00:07 I think it is much more difficult than your current plan.
23:00:12 The address lets you restore IP, the contents let you restore IW, and the two-bit state lets you go to the right slot in IW.
23:00:14 You're back.
23:00:34 Yes, it is more difficult and more than doubles the resources required for the return stack.
23:01:08 That was the real kicker for me - I couldn't bear the resource cost.
23:01:16 re..
23:01:27 * Deformative ponders.
23:01:40 Deformative and I are discussing my FPGA processor architecture, which runs Forth native.
23:01:55 He's going to be doing one, and we're using aspects of mine as talking points.
23:02:13 Actually we've touched on a number of others as well, like some of SEAforth's stuff and so on.
23:02:51 Right now we're dealing with issues related to opcode packing, how to encode calls, jumps, etc.
23:06:38 It leads to problems with prefetching.
23:07:24 Yeah, lots of things can cause prefetching problems.
23:07:44 Prefetching can speed you up so much, but then as soon as a control flow issue makes it "wrong" you have to deal with that.
23:08:14 I was thinking about having two pipelines.
23:08:27 One for literals and one for instructions.
23:08:48 You mentioned that earlier. I'm not sure what it buys you.
23:08:50 They would each need their own counters though.
23:09:25 Yeah, you're right, it buys nothing.
23:09:29 Nevermind.
23:10:05 I was thinking you might be able to hold the same prefetch, but you would need to branch in both.
23:10:11 So it doesn't help.
23:10:31 These thoughts are like drugs.
23:10:56 I spent months trying to find the golden path through it all, and finally gave up.
23:11:13 Of course, these are standard problems that face all processor designers.
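The return-stack trade-off discussed above - plain cell addresses, versus two extra slot bits, versus also pushing the cell contents for a roughly 34-bit entry - can be made concrete with a few illustrative struct shapes. Field and type names are invented; only the widths follow the conversation.

    #include <stdint.h>

    /* Option 1: return only to cell boundaries - a bare 16-bit cell address. */
    typedef uint16_t rs_plain;

    /* Option 2: add a two-bit state selecting the slot to resume at, so a return
       can land in the middle of a cell at the cost of refetching that cell. */
    typedef struct {
        uint16_t ip;     /* cell address; restores IP */
        uint8_t  slot;   /* two-bit state: which slot of the cell to resume at */
    } rs_with_slot;

    /* Option 3: also push the cell contents, roughly 34 bits of state per entry.
       No refetch is needed on return, but the return stack more than doubles in width. */
    typedef struct {
        uint16_t ip;     /* restores IP */
        uint16_t iw;     /* restores IW, the cell contents */
        uint8_t  slot;   /* two-bit state: selects the slot within IW */
    } rs_full;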
23:13:04 You know, I'll tell you where you need a nop.
23:13:27 When you finish a code stream that runs up to a jump target and you're not on a cell boundary.
23:13:34 You need a way to pad.
23:13:39 Yeah.
23:13:55 And besides, if nop could be done away with, Chuck Moore would have figured that out.
23:14:09 I think that's a blind alley.
23:14:26 If there is a way to get call working in a cell as well, then you don't need it.
23:14:43 Erm, I mean packed into a cell.
23:14:51 Branch is easy.
23:14:55 Once again you're talking about transferring control into the middle of a cell?
23:15:14 Yeah...
23:15:18 Remember you have to specify those bits in your target address.
23:15:28 Makes your addresses wider.
23:15:47 No, the target can always be the front of a cell.
23:15:52 Oh, you would still need nops.
23:15:54 Hmm.
23:16:49 So you would need to both branch from and to the middle of a cell.
23:16:56 I was thinking it was only from.
23:18:08 Branching from the middle of a cell takes an opcode. In my processor if you've used slot one for something and then need to branch, you nop slots two and three and put the jump as a cell after that.
23:18:29 The alternative is to have the same use for slot one, put a jump instruction in slot two, and then the target goes in the following cell.
23:18:33 You didn't really gain anything.
23:18:57 The target address goes in the same place either way, and you have one nop whereas I have two.
23:20:55 I just feel like the advantage of packing is minimal; it is only useful for auto-implementing extremely trivial examples of Duff's device and for very simple words which use a lot of machine calls.
23:22:23 In a speed-optimized processor, which mine isn't right now, the execution speed of most opcodes (the ones that don't access memory) is much faster than a memory access.
23:22:42 So getting several opcodes per memory access makes sense. Otherwise you will be limited by your memory access speed.
23:22:47 I don't want to implement branch predictors.
23:22:49 Your hardware will be waiting for data.
23:22:56 But I don't see another way to get prefetching working well.
23:22:57 I don't either.
23:23:18 I don't really mean prefetching future opcodes.
23:23:23 My processor does this:
23:23:26 S0: Fetch a cell
23:23:31 S1: execute opcode 1
23:23:36 S2: execute opcode 2
23:23:42 S3: execute opcode 3
23:24:03 When I optimize it, S1, S2, and S3 can be much shorter than S0.
23:24:05 Don't you fetch one cell in the future?
23:24:24 For things like literal?
23:24:39 The literal cell isn't fetched until the literal opcode is executed.
23:24:55 --- join: ygrek (debian-tor@gateway/tor-sasl/ygrek) joined #forth
23:25:08 So for those opcodes the execution state couldn't be shorter.
23:25:14 Oh.
23:25:16 That's why this optimization will be less than trivial.
23:25:18 I thought you had that fetched.
23:25:28 Nope.
23:25:39 IP points to it, so it's easy to do when I need it.
23:26:02 Ok, branching to the middle of a cell isn't bad at all then.
23:26:06 And I *could* fetch it and store it somewhere if I wanted to, which might speed up certain things, like literal loads in slot 3.
23:26:34 You still have to store in the target address the spot in the cell you want to branch to.
23:26:47 This doesn't all fit in a 16-bit architecture anymore.
23:26:49 Return stack cells are 20 bits the way I am currently thinking.
23:26:57 Though, I would like a way to prefetch a little extra.
23:27:05 You will have to have wider addresses in your code, wider return stack, etc. etc.
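A minimal software model of the S0-S3 cycle just described ties the earlier pieces together: one memory read per cell, then up to three slots executed from the fetched word, with the literal cell read only when a literal opcode actually executes (IP already points at it). mem[], exec_op(), take_ea(), and the slot order are assumptions for illustration; the real design is hardware state machinery, not C.

    #include <stdint.h>

    extern uint16_t mem[1u << 16];          /* code memory: one 16-bit cell per address */

    /* Hypothetical helpers: exec_op runs one 5-bit opcode and returns nonzero if it
       transferred control (ret, the planned far-call opcode, ...); take_ea performs
       an EA cell's call or jump. Either may consume the cell IP points at (literals). */
    extern int  exec_op(unsigned op, uint16_t *ip);
    extern void take_ea(uint16_t cell, uint16_t *ip);

    static void step(uint16_t *ip)
    {
        uint16_t iw = mem[(*ip)++];         /* S0: fetch the cell; IP now points at the
                                               following cell, where a literal would be read */
        if (iw & 0x8000) {                  /* EA cell: one call/jump/conditional unit */
            take_ea(iw, ip);
            return;
        }
        for (int shift = 10; shift >= 0; shift -= 5)   /* S1, S2, S3: the three packed slots */
            if (exec_op((unsigned)(iw >> shift) & 0x1F, ip))
                return;                     /* stop the cell early on a control transfer */
    }

In this shape the speed argument is easy to see: once S1-S3 are cheaper than S0, three packed opcodes per memory read keep execution ahead of the memory access rate.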
23:27:09 Oh!!
23:27:10 Idea.
23:27:24 You can always keep the top of the stack prefetched at all times.
23:27:32 That is trivial!
23:27:41 My stack is a hardware stack - it's not in memory.
23:27:48 Mine too.
23:27:54 Every stack element (data and return) is a register.
23:27:58 I am saying prefetch the address at that location.
23:28:00 Then what do you mean by prefetching TOS?
23:28:13 Oh.
23:28:29 Keep that instruction cell prefetched.
23:28:43 You mean the return stack?
23:29:13 Ok, the return stack holds addresses of instruction cells.
23:29:22 Keep the TOS instruction cell prefetched.
23:29:40 Yes, I can see some merit in that.
23:29:43 Interesting idea.
23:29:43 Overwrite when you push to return stack, and ret can re-fetch it.
23:29:56 Only takes 18 extra bits to do.
23:30:27 Maybe 20.
23:30:29 Most importantly, this shaves a cycle from return operations.
23:30:34 Yes.
23:30:43 You don't need S0 for the fetch immediately following return.
23:31:26 You're saying 16 bits for the cell content and then 2 more for the slot to return to?
23:31:44 You'd need those two bits for every element of the return stack.
23:31:45 Yes, and potentially 2 more for the PC offset.
23:31:52 Indeed.
23:31:53 Not just once.
23:31:58 I understand.
23:32:12 I was planning on having 20-bit return stack elements anyway.
23:32:28 And this is only useful if you can call from the middle of a cell, which requires opcodes and a slot.
23:32:55 --- quit: forther (Ping timeout: 252 seconds)
23:32:59 It is still useful if you return to the beginning of a cell.
23:33:01 I'm not seeing a code space win here.
23:33:14 Yes, I do like the idea of prefetching the target content.
23:33:22 Just not the extra bits for storing the return slots.
23:33:38 Well, it is still applicable for your model.
23:34:01 I am currently leaning toward return slots for mine.
23:34:03 It might not always buy me anything. If ret is in slot 1, then I see it when I'm in S0.
23:34:25 In S0 I'm busy fetching the cell that contains the ret.
23:34:42 And on the next cycle I'm "back" and fetching from the return address.
23:35:00 So only if ret is in slot 2 or 3 does it help me. But that's probably 2/3 of the time.
23:35:10 Oh yeah, I keep forgetting that the next cell isn't prefetched.
23:35:18 So that saves an average of 2/3 of a cycle per return operation.
23:36:08 I like the in-cell model because it is easily scalable to larger word sizes.
23:36:34 Still benefits if you scale to 32- and 64-bit words.
23:36:52 I need to go to bed - I want to either run or cycle tomorrow morning. Let's pick this up tomorrow.
23:37:13 Ok, maybe we will talk about the VGA controller then as well.
23:37:15 Goodnight.
23:37:21 Yes - that too.
23:37:22 Night.
23:46:01 --- quit: alex4nder (Ping timeout: 240 seconds)
23:59:59 --- log: ended forth/10.04.24