#title: Thoughtful Programming and Forth
#author: Jeff Fox / www.UltraTechnology.com
#published: 1999-12

________________________________

Thoughtful Programming and Forth

________

Preface

Computers are amazing machines. They have an incredible potential. At their best they provide an extension to our minds and our traditional media, assisting us in storing and retrieving data, making calculations and decisions, visualizing information, and communicating information between humans. At their worst they introduce nightmarish complexity and problems into our lives. They account for an entire industry that is vast and pervasive and which works in cooperation with strong media and socio-economic forces to sell and promote computer use in our culture.

The technological fads and media-hyped product feeding frenzies that we know of as the modern computer industry also have a dark side. The phenomenon known as the digital divide is the way that technology is creating a strong socio-economic division in our culture that could influence generations. Those with access to modern computers will have access to nearly unlimited information and a world of training, experience and opportunity that the have-nots will never know. The strong and disturbing evidence is that home computers, SAT test practice programs, and access to the internet have become prerequisites to enrollment in a good college and getting a good job in the future. Those without a way to get a foot up into the system will be forever kept out. One aspect of the digital divide is that computers themselves must be made to appear inconceivably complex and incomprehensible to the uninitiated.

The reality of the world we live in is that if 100 people represented the population of the world, two of them would own personal computers. Owning a personal computer is much like owning an automobile and gives you bragging rights about the model and style that represents you. To the vast majority of the people in the world just owning one puts you in an elite group of rich and affluent people, whether it is a clunker or the top of the line luxury model.

Marketing of computer hardware and software is pervasive throughout the culture. Everyone would like the most beautiful, expensive, fastest and highest quality model in an ideal world. Part of modern culture seems to be that people like to pretend, perhaps even to themselves, that they are so rich and important that money is no object to them. If they can say that quality for value is not important to them because they only want the top quality most expensive option, they get higher social status. Many people are therefore very arrogant about how wasteful they are with their computer. They are very proud of it and will tell you how they got the latest upgrades that they really didn't need, but since it is all really cheap these days anyway etc. They will say that they tried the latest most wasteful new software and had to go out and buy a faster computer but didn't care because computers are cheap. 98% of the people in the world think computers are too expensive to buy, and most of the people who do buy them don't really believe they are so cheap that no one cares about the cost. With most of the limited resources in our culture it is not fashionable to brag about conspicuous consumption, but computers seem to be thought of as an unlimited resource because of marketing. I was first exposed to the difference between programming and computer marketing almost thirty-five years ago.
If you have been a programmer perhaps you have been there too. Your manager tells you, "This program you have written is too efficient. You don't understand the big picture. The client is spending $100,000 a year now to do this manually. Based on the runtime of the program in this form this program would accumulate only $5,000 in fees for an entire year. The client is now spending $100,000 and will be happy to have it done for $50,000. Go back and rewrite this program so that it uses ten times as much computer time so that we can charge this client ten times as much. This is a business and the bottom line is making money, not writing efficient programs."

The nature of business management in the US is such that managers work their way up through the corporate structure by showing that they can manage larger and larger budgets. If you show an aspiring manager a way to reduce their budget, they know they will be expected to live with that reduced budget again next year by a maybe not so understanding level of management above them. They also know that the one of their peers with the largest budget will most likely be the one promoted up to the next level where the budgets to be managed are even bigger. These pressures lead middle managers to pad their budgets and this is one of the driving factors in the computer industry. IBM exploited this vacuum for years, with managers being promoted for pouring money into mainframe accounts with constant upgrades of machines and operating systems to address a constant list of bugs.

When I worked for Bank of America in San Francisco one of my managers maintained more mainframe accounts for employees who had quit than he had for employees who were still working for him. This allowed him to inflate his budget by about 20 phantom employees times $1000 a month to IBM for their mainframe computer accounts. I had three accounts and they were all still active three years after I had left the bank. My manager had given IBM $108,000 for my computer use after I was no longer an employee of the bank. Multiply that times twenty employees each for four thousand managers and you get the picture at the bank at the time.

As time moved on it became easier to get promotions in large companies for wasting money on Personal Computers. When I was a consultant to Pacific Bell I saw countless examples of managers spending hundreds of thousands of customers' dollars on inflated budgets for computing systems to work their way up the corporate budget ladder. Managers were looking for packages that came in the largest boxes, with the most diskettes and with the largest price tags; then they would buy hundreds or thousands of copies that were not needed for any conceivable reason. One example was the 3270 terminals. Pacific Bell had many employees who used IBM 3270 terminals to talk to their mainframes. They replaced them with PCs running 3270 emulation programs. This allowed them to continue to spend thousands of dollars per machine every year for needless hardware and software upgrades even though all of these users only ever ran one piece of software, 3270 emulation.

Whether you are talking about corporate America being marketed products that are intentionally puffed up for marketing purposes, or individuals being marketed new computer hardware and software products based on style, status, and hype, it is hard to deny that what sells are big boxes and big programs with lists of features that far exceed anything related to productive work.
There is considerable concern both in the industry and by consumers about the diminishing returns for our continued investment as users in this kind of software. The marketers tell us that if cars were like computers the cars we would be buying today would be absurdly cheap and absurdly fast, but it just isn't so. I got my first personal computer for about $1000 in 1975. That is still what they cost. The graphics are better. The computer is bigger and faster, and doing more complex things, but that is what you would expect after 25 years of progress. My first machine ran at about one hundred thousand instructions per second and my current machine runs at one hundred million, 1000x faster. My first machine was so slow that it would take several seconds to boot up and would sometimes just go away while executing system code on badly written programs, and I could not type for up to twenty seconds. My current PC is 1000 times faster, but the programs seem to be 1000 times bigger and slower, because it now takes about a minute to boot up and it still goes away for about twenty seconds sometimes while the OS and GUI do who knows what, appearing dead and refusing to accept a keystroke for that period of time.

In this world of quickly expanding computer hardware and quickly expanding computer software there seem to be very few people concerned with making computers efficient, or getting the most value out of the computers we have, or the most productivity out of the programmers. It is more fashionable to claim that everyone (who is important) is rich and doesn't care about things like efficiency. If a program is inefficient they can always just go out and buy a more expensive computer to make up for any amount of waste. In this world there are few people working on making computers simple to understand, simple to build, and simple to program. There are few people making programs that are easy to understand, easy to maintain, efficient and beautiful.

One of those people is Charles Moore, the inventor of the computer language Forth. Chuck Moore describes himself as a professional who gets personal satisfaction out of seeing a job done well. He enjoys designing computers and writing very efficient software. He has been working for nearly thirty years to come up with better software and nearly twenty years to come up with better computer hardware. His latest work involves unusually small computers, both in hardware and software. His ideas are very synergistic, as both his hardware and software are as much as 1000 times smaller than conventional hardware and software designs. However many of his ideas about software design and programming style are not limited to his tiny machines. While it is difficult to map bloated software techniques to tiny machines, it is easy to map his tight tiny software techniques to huge machines. There will always be problems bigger than our machines and there will always be people who want to get the most out of their hardware, their software, and their own productivity. Chuck's approach to VLSI CAD is a good example of the application of his style of programming to a conventional computer. The design of the code used the tiny approach to get the most productivity from the programmer and the highest performance from the software on a conventional Intel based PC.
Instead of purchasing packages of tens of megabytes of other people's code for hundreds of thousands of dollars, Chuck wrote his own code in a matter of months to make it faster, more powerful and more bug free. He does the job more quickly with thousands of times less code. The size and performance of the program are quite remarkable, and the methodology behind its design and construction involves more than a specification of the features of his language. It involves understanding how that language was intended to be used by its inventor. Chuck has moved most of his Forth language into the hardware on his computers, leaving so little for his software to do that it is very difficult for people to see how his software could possibly be so small. He has refined his approach to his language until it is difficult for people who have been extending it for twenty years to see all he has done with so little code. There are aspects of his early experiments with CAD that have led to great confusion about his software style. It has focused many people's attention on the number of keys on his keyboard, or the size or number of characters in his fonts, or the hue of the colors that he selects in CAD, the names he used for opcodes, and a long list of other distractions.

_____________

Introduction

Having spent the last ten years working with {Chuck Moore|http://www.ultratechnology.com/people.htm#CM} on his custom VLSI Forth chip development I have greatly changed my ideas about Forth. I have moved on from the concepts that I first learned about Forth twenty some years ago and studied what Chuck has done with Forth in the last fifteen years. I looked over his shoulder a lot and asked him a lot of questions.

When the Forth community first began work on the ANS Forth standard, the effort involved defining a Forth specification that provided common ground for different Forth users. The ANS Forth standard, as I think Chuck would say, was designed to cover almost all the variations on what Chuck had invented that everyone else was doing twenty years ago. There was never anything like it before, a sort of meta-Forth definition. But Chuck said he worried that this formalizing of a definition of Forth would result in a sort of crystallization of Forth. My concern was a different consequence of ANS, which was that a new style of Forth programming seems to have evolved. Traditional Forth was on a real machine where there was a hierarchy from primitive hardware through abstracted code. There was always a sense of which simple fast primitive words were best for getting the most efficient code where that was needed. In ANS there is no such sense: the 10,000th word in the system is not necessarily any more high level or complex than the first, since that is implementation dependent. Even though such a hierarchy of complexity will normally exist in Forth, common practice in ANS Forth is to ignore this reality.

Chuck's advice regarding programming is often highly contextual. He will say people should not use most standard OS services, that you should write the code yourself. He says this because if you build your code on inefficient code you will have an inefficient application and you will have to do more work to get it to work. At the same time the primitive words in Forth are also a set of standard services. On a real system you know the real tradeoffs regarding each of these services and can make informed decisions regarding which words to use.
On an abstracted model of Forth (ANS) you cannot make these kinds of informed decisions. As a result ANS Forth programmers do with Forth what Chuck would advise them to do with OS services: they try to rewrite them themselves. Instead of using perfectly beautiful Forth words the way Chuck had intended them to be used 30 some years ago, they rewrite their own version. In this case Chuck would not advise them to rewrite it themselves. I would often ask ANS programmers, "Why did you rewrite this word with all these pages of high level code when almost exactly the same thing is available in highly optimized CODE and is 1000x faster?" "Because that is the definition in the library in the system I normally use." was the answer.

Chuck and I were both convinced that this sort of abstracted approach to Forth might result in a new style of using Forth that would in turn lead to the ultimate demise of the language. We focused our efforts on building the fastest, simplest and cheapest hardware and the fastest, simplest and cleanest software as an alternate future for Forth. Chuck's ideas about Forth have evolved through {four stages|http://www.ultratechnology.com/prog.htm#stages} in this time and I have generally been a stage behind. After Chuck left Forth Inc. and began working on Forth in silicon he had the chance to start his approach to Forth again with a clean slate. He was happy with many improvements but did not stop experimenting after he did his {cmForth|http://www.ultratechnology.com/meta.html#cmForth}. He moved on through the OK phase, the Machine Forth phase, to his current {Color Forth|http://www.ultratechnology.com/color4th.html} experiment.

This document is not intended to be a programming tutorial. It is not going to present a step by step explanation of how one programs in the style that Chuck Moore is using, but will present an overview of what he is doing and why. There is an older document describing {Forth and UltraTechnology|http://www.ultratechnology.com/prog.htm}.

_________

Chapter 1

Most of the Forth community have had little exposure to the evolution of Chuck's Forth for the last fifteen years and have now become deeply entrenched in their habits from twenty years ago. Chuck has lamented that no one has published a book teaching people how to do Forth well. Chuck has seen how other people use Forth and is generally not impressed. On this page I will discuss aspects of the Forth language as I currently see them and lightly cover the subject of good Forth programming.

__________________________________________________________

A Definition for Good in the Context of Forth Programming

What is good Forth? What makes one Forth program better than another? Well of course it depends on context. The first thing in that context to me is the computer. Real programs run on real computers. By that I mean real programs are implementations, not specifications. You can specify the design of a program in a more or less portable form, or you can specify the details of an actual implementation of that program more explicitly. In either case I am talking about two aspects of the program, the source and the object. I will discuss what I mean by good source and good object code. Good object code is pretty straightforward. It is efficient in terms of system resources; it does not consume resources excessively. The particular resources for a given system and a given program will constitute a different balance of things like memory use, speed (time use), register use, cache use, I/O device use etc.
On many architectures there is the tradeoff between code size and speed. Up to the point that cache overflows, longer sequences of unfactored instructions will execute faster, so many compilers perform inlining of instructions. At the point that cache overflows things can slow down by an order of magnitude, and if the program expands to virtual memory paging from disk things will slow down by orders of magnitude. A little smaller, a little bigger, no big deal. A little faster, a little slower, no big deal. But when the ratios become quite large you really need to pay attention to the use of resources. Since there are so many layers that all multiply by one another in terms of efficiency, if a system has ten layers that each introduce a little more fat, the final code may see a small fraction of the total CPU power available.

Programmers need to remember that on most modern machines the CPU is much faster than cache memory, cache memory is much faster than on-page DRAM access, and off-page DRAM access is much slower than on-page. Regardless of other factors, the way the program organizes data in memory and how it is accessed can easily affect program speed by more than an order of magnitude. What is marketed as a 100MHz PC can easily be slowed to 10MHz by slow memory access depending on the program. It can be effectively reduced to almost nothing when the software goes away for 20 seconds at a time, unpredictably, to do some system garbage collection or something. From the user's point of view for those 20 seconds the machine has 0 user mips. Programs slow significantly when the program or dataset is so large, and access to it is so random, that the worst case memory time happens a lot. This and much worse is what happens as programs grow and spill out of cache and out of available memory. To avoid this keep things small.
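To make the memory-ordering point concrete, here is a toy illustration of my own (it is not from Chuck's code, and the array size is arbitrary). Both words touch exactly the same million cells, but the first walks them in address order, staying within cache lines and DRAM pages, while the second hops to a cell 4K bytes away on every fetch. On hardware like that described above the second can easily run many times slower, which is the whole argument for organizing data to match the access pattern.

    \ Treat a 1024x1024 array of cells as rows of 1024 cells each.
    CREATE DATA 1024 1024 * CELLS ALLOT
    : ROWS    ( -- )  \ visit cells in address order: cache friendly
      1024 0 DO  1024 0 DO  J 1024 * I + CELLS DATA + @ DROP  LOOP LOOP ;
    : COLUMNS ( -- )  \ same cells, column by column: a long stride per fetch
      1024 0 DO  1024 0 DO  I 1024 * J + CELLS DATA + @ DROP  LOOP LOOP ;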
In some cases, such as scripting languages, fat is not an issue in terms of code efficiency. It remains an issue in programmer efficiency, however, since that fat is a source of bugs just like lean code, only more so. Excessively fat programs can easily be excessively buggy and unstable because the bugs will be hard to find in all that fat. Also, if a program is grossly inefficient at runtime, that may not be as important as the time spent writing it. There are many one-off types of applications where big and slow is not an issue, such as trivial scripts that only run once in a while. But for system software it is very important that object code not be too inefficient because other things are built on top of it. Of course some people would say, who cares, just buy a more expensive and faster computer to make up the difference. Sometimes that makes sense. But for those who have been in those BIOSes and system software and seen how bad it can get, it seems like a shame to see people being forced to waste 90% of their investment in hardware or software because it means someone gets to charge more money. In this sense the inefficiency fuels the planned obsolescence and forces people down the expensive upgrade path. It's good for you if you own Intel or Microsoft, but otherwise it is a concern that has spawned the growth of PD software like Linux.

Good source code is a bit more difficult to define. It should be clear, easy to read and write, easy to debug. Again, a little smaller, a little bigger, no big deal. But computer languages are more different from one another than human languages are. When people see a language that is considerably more brief or verbose than the computer language that they are used to, their immediate reaction is usually "I can't read that, it's too little" or "it's too much." To compound this variation in point of view, the visual layout of the source is a big issue. The attention of the reader is directed by code layout, and this is also a big factor in how readable the code will be. If the comments are in a language that you don't read they don't help. If they are in a font that is too small to see they don't help. If they are printed in a color that you can't see they don't help. Fortunately some vision problems are correctable, but these are issues. For some people the code layout must be pretty. This may be more important to some people than the code contents. I can't relate to that myself. To me the layout is simply there to direct the attention of the reader. You are not trying to give them an esthetically pleasing experience so that they sigh when they look at the page and don't bother to read the contents. If you follow code layout rules, they are there just to make the code clearer.

Chuck has switched to color in his latest Forth as a replacement for some of the syntax and words that he had not already eliminated. {!: ; [ ] LITERAL DECIMAL HEX \ ( )} are some of the words that Chuck has replaced with color change tokens. What I find most interesting about this is that when reading the code, a different part of your brain is engaged in seeing the organization of the code into words, and what the compiler and interpreter are going to do with the code, than the part of your brain that decodes the meaning of the words. It seems to free the part of the brain reading words to focus on the words more clearly because there are fewer distractions. Mostly Chuck has replaced some layout information and some Forth words with color. Besides making the Forth small and fast, as Chuck puts it, it also makes it colorful. My own experience with his Color Forth is that the result is easier to read code than conventional Forth. But until I have tried using it myself I am not ready to make a final judgment about that.

As I have said, prettiness is more important to some people, and beauty is in the eye of the beholder. Some people think a system described clearly on a couple of pages is beautiful in itself, just as a concise equation in Physics is. To another a listing that looks like a telephone directory is beautiful. People will never agree about what looks best. Chuck has limited detail resolution in his vision and complains that he can't see small fonts on the screen. He uses large fonts so he can see them, and as a consequence he only has short lines and small definitions. Other people have screens with 256 characters on a line and some very long Forth definitions. Chuck complains that he can't see those small characters and that the code should be factored into smaller pieces. (When the code is printed in a larger font Chuck has also complained that he still couldn't read it, because often it would begin with lots of words that had been loaded from a user's libraries, words that are essential for the author to write anything but which can only be described as extensions to Forth. If you know all of this person's extensions you might be able to read the code.) This same author complains that he is color blind so Color Forth doesn't work for him; even if he were not color blind, the lack of layout and spelling rules would make it unreadable to him.
Of course color has been substituted for layout and some words in Color Forth. Chuck feels color is a good substitute for layout and some words; other people don't, or haven't tried it. As I say, the layout issue is very personal. One person may have a couple of rules for layout, and someone else may have about as many rules for spelling and code layout as another person needs to define the Forth system. My stance is that this is a matter of taste; I have my personal style and I can read either extreme of code. The code with pages of layout and spelling rules looks nice, and if you cross reference all the words that came from the user's private libraries the meaning is clear. I find Chuck's Color Forth very easy to read too. I think it is easier for me to read, but part of that is the same reason that a 25K source is easier to read than a 25M source.

Size becomes a significant factor when it comes to being clear, easy to read, easy to write and maintain etc. when the ratios become quite large. Very small programs can be read quickly but may include subtleties that elude easy perception on the surface. They may need to be read more than once, or they may require more documentation than the code itself to be clear. If code is too dense it will appear as nothing except meaningless cryptic symbols unless it is studied in great detail. If code is too verbose it may appear perfectly clear line by line but impossible to view because of size. Yes, I can read source code, but no, I can't read 25 megabytes of source and keep a picture of it all clearly in my mind. So the definition I am using here for good source is something that conveys meaning to the programmer effectively. I would say text, but it could include graphics in visual programming, or color, font styles etc. Just call it source to distinguish it from a formerly sourceless programming environment like OK.

_______________________

The First 10x in Forth

Forth had a surge of popularity in the seventies when FIG was distributing source and alternatives were limited. Many users who discovered Forth at that time reported elation at the increase in their productivity. They wrote programs faster, they debugged them faster, they maintained them more easily. They reported that they could write much smaller and much faster programs that could do much more than the ones they could write before. But when they reported that they had seen a 10x improvement after switching from ... they were often dismissed by mainstream programmers as kooks, because that just seemed too good to be true to many people. Those who were there know that the 10x is not all that remarkable and is really due to a bunch of numbers that when all multiplied together equal 10. No single thing gave these programmers a way to be ten times more productive; instead all the factors multiply by one another. The reasons have to do with the design of Forth. Stacks, words, blocks.

Having the data stack for data is a simple and beautiful way to handle and pass data within a program. It introduced fewer bugs than environments where programmers were working with lots of named variables or where they had to juggle register use by hand in assembler. The separation of data and return stacks made factoring more attractive. If you don't have to construct stack frames and move data in and out of a function call's local variable space before you call something, you have less overhead in calling a function and can factor the code more extensively. {!"Factor, factor, factor. Factor definitions until most definitions are one or two lines."} is Chuck's advice.
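As a small illustration of that advice, here is a variation on a familiar textbook exercise (the word names are my own, not Chuck's): instead of one long definition that draws a figure all at once, each piece gets a name, each name is one short line, and each can be tested interactively on its own before the next is written.

    : STAR    ( -- )   42 EMIT ;          \ print one asterisk
    : STARS   ( n -- ) 0 ?DO STAR LOOP ;  \ print n of them
    : MARGIN  ( -- )   CR 3 SPACES ;      \ start an indented line
    : BLIP    ( -- )   MARGIN STAR ;      \ a line with one star
    : BAR     ( -- )   MARGIN 5 STARS ;   \ a line with five stars
    : F       ( -- )   BAR BLIP BAR BLIP BLIP CR ;  \ a block letter F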
Factoring was a key to debugging and maintaining the code. Well factored code is easy to debug and maintain, so the programmer is more effective with their time. Factoring also helps achieve the desired balance between memory use and speed for a given machine, since memory and processing power are always finite.

Programs were often performance limited by their interaction with mass storage, as they are today. Forth provided the BLOCK mechanism as a very simple form of virtual memory and close to the metal mass storage access. By using BLOCKS where they could for data, instead of more complex file access, programmers reported speeding up parts of their programs by 100x as well as making them smaller and simpler. Programmers often also reported that with Forth they could change anything. They didn't spend large amounts of time caught in bugs they discovered in someone else's compiler, or in elaborate workaround schemes fighting with their software as they had before. If they wanted to change something they just did, and moved on to other productive work. They didn't get stuck like they often had before. So armed with software that was smaller and simpler and easier than what they had before, and with an interactive modular develop and debug methodology that was more effective in the integrated development environment, they were happy. They were delighted to have seen improvements in their productivity as programmers, their understanding and their freedom to do what they wanted to do. They also turned off a lot of people who didn't want to believe that these people could actually have all this stuff and made comments about how this was too good to be true, so Forth must just be a religion or cult or something.

So far everyone has said, yes, yes, we all know this ancient history of Forth. So far everyone has been with me and mostly agreeing. So let's get to the more controversial stuff. I begin with this history because it is my opinion that this is as far as most people got before they headed back toward more conventional programming practices for various reasons. Little by little, as people added their favorite features from their favorite languages to their Forth and fought to standardize the practice, Forth became bigger and more complex. Bigger computers and growing libraries of code made it possible to easily compile bigger and bigger Forths. When Forths were small and simple they didn't take much source code. There weren't too many words. Even on the slow systems of the old days a simple linked list of the name dictionary was sufficient to search things quickly. As systems became larger and larger, dictionaries became more complex, with wordlist trees and more complex methods of searching, and the more complex dictionaries introduced more complexity. In an environment of spiraling complexity in the popular operating systems and GUIs, Forths expanded their interface to keep up. Now the glue between Forth and the user interface could be hundreds of times bigger and more complex than a complete Forth system in the old days. Some of these Forth systems are advertised as having the best compilers that produce the fastest code. What they don't tell you is that it may be true given that you are ready to accept a 90% or 99% slowdown to move into that environment in the first place.
If you choose to mount your Forth under a GUI that takes 90% of the CPU power and leaves 10% for your Forth, you may need that optimizing compiler even on a fast computer. We have ported the chip simulators to various compilers. We moved from a 16 bit DOS environment to a 32 bit Windows environment to get higher performance. When we hit the OS wall we still wanted more speedup, so we ported back to the 16 bit DOS environment where we could get out from under the API load. We were able to speed up the program 1000x by switching to a Forth that wasn't crippled by its Windows interface. What is interesting is that the program runs 1000x faster in a Windows environment by ditching the Windows Forth. We have a strong incentive to replace the many megabytes of OS code with a couple of K of reasonable code to get the same functionality. We prefer programs that are 1000x smaller and 1000x faster and easier to write and maintain etc.

If you are stuck in an excessively complex environment, using Forth gets you out from under some of the complexity facing other people, but only a tiny bit of it. Complexity demands more complexity. When the source code gets really big and complex it begins to demand things like version control utilities. Now between the huge files and time spent on version control, jobs become too big for one person, so we split them up and assign a team. Now we need a team of four. Now we need a more complex version control system with multiple user access. Now programmers are spending so much more time with the complexities of version control and other people's bugs that four isn't enough, so we expand the team. Diminishing returns is the obvious result.

Many commercial and PD ANS Forth implementations have become as complex as 'C', or extensions to 'C'. The ANS standard went beyond the Forth core into extension libraries and it was common practice to start with everything from the last twenty years. We had Forths that were hundreds (or thousands) of times bigger and more complex than early Forths. They still supported the factoring and the interactive nature of Forth development, so they still had some of the factors that made up that old 10x that we loved in the old days. But often now they were carrying megabytes of baggage, and users were dealing with programs as large and complex as many other languages. Forth had changed so much that many systems require a full page or more of code to build a hello-world program, thanks to the use of dreadful APIs. There were traditional Forth programmers and new Forth programmers who could use these environments in a similar way to what they had once done, but the common practice was to introduce coding style, layout, and libraries plucked right out of other languages. Common practice became very unForthlike, and in particular beginners were often exposed to such examples in places like c.l.f between the debates about who could write the weirdest code to break the outer fringes of the ANS Forth standard.

Chuck's view of programming, as I understand his description of it, is that there is a problem, a programmer and his abstraction, and the computer. Forth was there to let the picture be as simple as possible and let the programmer map the solution to the problem to the computer.

{!Problem
Programmer with abstraction of problem
Computer}

leading to a solution that looks like this:
{!User
Programmer's simple implementation by abstraction of the problem to the computer
Computer}

{!This was Chuck's original idea of Forth, even though in the old days the normal picture was not as complex and layered as it has become today. There were only a few layers between the programmer and the computer in those days, but that was the problem that Forth was supposed to avoid. As the layers have become more numerous and deeper it has become even more important to let Forth avoid that problem.}

{!As each of the layers of abstraction was added to the model and common practice over the years, we were told that each would result in smaller, simpler programs because they would not need their own copy of things we standardized on. Programmers were supposed to become more productive, systems were supposed to become easier to understand, software would be easier to write, there would be more code reuse etc. Like the frog in a pot, the water got hotter and hotter without anyone noticing who was coming to dinner, until most of the problems most people face were introduced this way. People complain now that they spend more time looking for some code to reuse this way than they used to spend writing code. Chuck on the other hand has learned how to be more productive and write better code faster.}

{!Chuck wants there to be nothing in his way. Chuck wants to make the computer simple and easily comprehended so that no extra layers of abstraction are needed to get at it or the problem. Chuck wants to make the solution simple so that it is easy to write and efficient and has no extra layers of unneeded fat. Chuck seeks a simple efficient abstraction of the actual problem to the actual computer.}

{!Chuck does not like the idea of a generalized OS providing a thick layer of abstraction that can introduce unneeded code, unneeded complexity, and inefficiency. He will support the idea of the abstraction of an OS, but not one for everything. He and I would agree that in many environments there are layers upon layers of abstraction that introduce complexity.}

{!The people coming into computing in these times are being taught that the picture below is the reality of a computer. They face enormous problems as a result. Almost no one gets to deal with the simple reality of the problem or the computer, but must deal with the complexity of a thousand other people's abstractions at all times.}

{!Problem
Programmers' abstractions of problem(s)
Programmers' abstractions in software (example: OO w/ late binding)
Programmers' abstractions of software reuse (general source libraries)
Programmers' abstractions of optimizing compilers knowing more than they do
Programmers' abstractions of the computer: GUI API
Programmers' abstractions of the computer: OS Services
Programmers' abstractions of the computer: BIOS
Programmers' abstractions of the computer architecture ('C')
Computer (too complex for all but a few humans to grasp)}

These are two very different points of view. Chuck has said that he would like to {Dispel the User Illusion|http://www.ultratechnology.com/cm52299.htm}. He means that the user has the illusion that all these layers of abstraction {!ARE} the computer. If they could see beyond the illusion to see only the simple problem, and were only faced with mapping it to a simple computer, things would stay simple and simple methods would work. The majority of problems are avoided this way.
Those who have been working on making Forth more mainstream, extending it, and merging it with 'C' libraries and popular APIs have applied Forth in a very different way than Chuck. What made Forth popular twenty years ago was that Forth provided a simpler model and made programmers more productive because they weren't trapped behind so many barriers introduced by other environments. They could do things the way that made the most sense, not the way they had to be done. Chuck originally created Forth to avoid problems introduced by unneeded abstractions. There was the abstraction of a Forth virtual machine and the expression of a solution in terms of that abstraction. Chuck has spent years simplifying and improving the virtual machine and has moved that abstraction into hardware to simplify both hardware and software design. He has a simpler virtual machine model, implemented in hardware on his machines, and a simple Forth environment implemented on top of it.

In many discussions that I read in c.l.f, someone will ask how other people would accomplish such and such. My first thought is usually something about how it couldn't be much simpler than what we do. {!Acolor fontcolor !} to change the color of output in a numeric picture. What does it take? A store to memory, a few nanoseconds, is the answer when you keep things clean. Other people will post tens of pages of detailed code that they need because of the bizarre behavior of particular layers of their layer upon layer of abstraction introduced, problem laden environments. People have said that without all this abstraction the general purpose OS could not run on innumerable combinations of cobbled together systems made of boards and cards and drivers from a thousand different vendors. This may be true, although only in isolated cases will it sort of run anyway. It is also true that computers don't have to be built that way. They can be built with logical, simple, inexpensive but high performance designs. There is a cost to carrying around drivers for thousands of computers when in reality you are always only using one set. Neither hardware nor software has to be built that way. The problem is that the number of bugs and related problems, or the amount of waste of resources, can cripple the computer and/or the programmer.

With so many people using Forth today as a sort of scripting environment on top of the same generalized services and abstractions as everyone else, the common practice in Forth was no longer 10x compared to other ways of solving those same problems. Meanwhile I have been watching Chuck very closely. He seemed to still have a 10x up each sleeve that I saw very few other people using. He had a very different style of using Forth, continuing to explore in the direction he had been headed with Forth originally while most of the Forth community was going in the opposite direction. What are these other 10x factors?

_________________________

Chapter 2 The Second 10x

The first Forth in hardware was the Novix. Chuck wrote {cmForth|http://www.ultratechnology.com/meta.html#cmForth} for the first Forth machine. cmForth was only a few K of object code and about 30k of source. It included a metacompiler to compile itself. It included an optimizing native code compiler for the Novix that was very small and very powerful. It was so fast that when running off a floppy disk it could boot and metacompile itself before the disk could get up to full working speed.
This phase has been called hardware Forth, since the Forth that Chuck wrote was for the Novix, which was a Forth in hardware chip. Other people looked at all the innovations that went into cmForth and wrote systems to experiment with some of the things that Chuck tried in that implementation. Pygmy Forth is an example of a system on the PC that was modeled after cmForth. Chuck was not happy with cmForth however, because it was too much a hardware Forth. The Novix had some powerful features, but writing a compiler to take advantage of the specific features of this quirky hardware was too complicated and too hardware specific, for this chip only. As I say, some of the innovations, some of the simplifications in Forth that were implemented in cmForth, were ported to Pygmy on the PC; there are groups of programmers who used cmForth and loved it and there are still people using Pygmy Forth on the PC.

Forth had been a very portable language because it was based on implementing the Forth virtual machine on some hardware. Once you did that you were dealing with more or less the same virtual machine on any Forth. Chuck looked at the concept and decided to design an improved Forth virtual machine, and to design both his new chips and his new Forth for this virtual machine. He wanted to get away from a hardware specific approach to Forth and go back to using a portable approach to his new Forth. The result was the virtual machine model that became the hardware implementation on his MISC machines. At first he referred to the native code for his chips as assembler like everyone else, but after he wrote a tiny portable native code compiler as the assembler for this machine he decided it should be called Machine Forth. This new Machine Forth was not only smaller than cmForth, but the approach was portable again, because the simple optimizing compiler could be implemented on any machine by simply implementing the virtual machine, just like in the old days. This was the technique Chuck used on his remarkable OKAD VLSI CAD program.

While much of the Forth community was beginning to embrace optimizing native code Forth compilers in their own environments, these Forths were very different. Other people's optimizing compilers could be very complex. I implemented one that had many rules for inlining code and well over one hundred words that were inlined. This is very different from the optimizing compiler that Chuck designed, which just inlines a dozen or so Forth primitives and compiles everything else as calls or jumps. It was the smallest and simplest Forth Chuck had built yet.
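To show what "inline a dozen primitives, call everything else" can mean in practice, here is a minimal sketch of my own. Everything in it is invented for illustration: the 4K code buffer, the one-byte opcodes, and the call encoding describe no real chip, and this is not Chuck's code. The point is only how little machinery the approach needs.

    \ A toy native code compiler: primitives inline, everything else calls.
    VARIABLE TP  0 TP !                  \ next free byte in the target image
    CREATE TCODE 4096 ALLOT              \ hypothetical target code buffer
    : T,    ( byte -- )  TCODE TP @ + C!  1 TP +! ;
    : PRIM  ( opcode "name" -- )         \ a primitive knows its own opcode
      CREATE ,  DOES> @ T, ;             \ ...and inlines it when invoked
    HEX
    01 PRIM tDUP   02 PRIM tDROP   03 PRIM t+   04 PRIM t@
    : CALL, ( taddr -- )                 \ anything else: opcode + 16-bit address
      0F T,  DUP FF AND T,  8 RSHIFT FF AND T, ;
    DECIMAL

A dozen PRIM lines plus CALL, is essentially the whole "optimizer"; compare that with the hundred-word inlining tables mentioned above.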
It was not just a very effective programming tool for him, but has been used by dozens of other people over the years, many of whom just loved it and reported amazingly good results. There were also a lot of people who put their hands over their eyes when Chuck showed people what he was doing. All of my work for years has been either in ANS Forth or in Machine Forth. There are a number of other people who did a lot of work in Machine Forth, and there were different styles employed by different people. One of my jobs at iTV was training new people and exposing them to our chips, our tools, our Forth and our methods for using it. I was pleased to report taking users from no prior knowledge of Forth, our chips or our tools to being productive programmers after a two hour tutorial, and having them show the first of their application code for their project the next day. I like to ask people how long it takes to train someone to be an assembly programmer on the Pentium, or even a 'C' programmer, just to see if anyone will ever say two hours from scratch.

Using smaller, faster, easier to understand tools and techniques in your Forth can get you some of those factors that can get you that second 10x. Machine Forth source and the technique it uses for compiling is easy to read and understand. Chuck has discussed his reasons for introducing the innovations that he added to Machine Forth and why they produce more efficient code than the original Forth virtual machine model. Like the first 10x, there is not one factor that is 10x. 10x is the result of combining a number of smaller factors. Chuck has explained the first set of these things many times. Mostly they are unpopular because they do not map well to what you have to do once you have gone way down the complexity path. Once you have introduced a large overhead of fat to deal with, the low-fat approach doesn't apply well. Forth is stacks, words, and blocks; start there. Stay on that path, keep things simple, as simple as possible to still meet the other goals mentioned above. If you start to head off that path you start to introduce complexity that isn't needed and can eventually become the biggest part of the problem. Keep the problem small. Focus on solving the problem.

Stacks are great (for Forth) so use them. Chuck has said "Don't use locals." You have a stack for data, this is Forth; locals are for a whole different picture.

Words are great (for Forth) so use them. You have this wonderful mechanism for encapsulating code and data, for factoring and data hiding. If you want this stuff it has been there for twenty years; you don't need to extend Forth to look like some other OO language. We already have words. Most other languages implement OO by using late binding, while Forth prefers early binding. Why do things at runtime that you can do at compile time? But most OO Forth implementations use late binding and push calculations from compile time into runtime, and thus produce overhead.

Blocks are great (for Forth) so use them. I don't say never use files. That would be ridiculous. But in most cases Forth can get a big advantage in small clear source and simple fast object code _where it counts_ by using Blocks where appropriate. Twenty years ago I learned that by substituting BLOCKS for files I could get parts of my applications to go 100 times faster and be easier to write and maintain. I also learned that by using a ramdisk or file caching in memory I could get a 100 times speedup. I experimented with combining the two and using ramBLOCKs on my expanded memory devices. I could get both the BLOCK and ram speedups and tailor it to the performance balance that I wanted. Chuck has done something similar. Chuck uses BLOCKS in memory as a form of memory management. Chuck's BLOCK references compile into a memory reference. He has a couple of hundred bytes of code to transfer blocks to or from disk once in a while. The approach of using only the layers of abstraction that you need can include files, but it may be better to have your own file system that meets the needs of an application. I know from experience that a file system can be very small and matched tightly to your hardware, and provide things like file caching about as efficiently as it can be implemented. Chuck would say that if you are going to use files, you should do it where appropriate, in the most appropriate way.
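For readers who have only ever used files, a sketch of how little the BLOCK interface asks of you. BLOCK, UPDATE and FLUSH are the standard words; the block number and the SCORE words are my own invented illustration. BLOCK hands back the address of a 1024-byte buffer, reading from mass storage only when that block is not already in memory, so data access is a fetch or store plus almost nothing.

    \ Keep an array of scores in block 100 (block number chosen arbitrarily).
    : SCORE  ( slot -- addr )  CELLS  100 BLOCK + ;
    : SCORE! ( n slot -- )     SCORE !  UPDATE ;  \ UPDATE marks buffer dirty
    : SCORE@ ( slot -- n )     SCORE @ ;
    \ FLUSH writes all dirty buffers back to mass storage when you choose.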
He would say that many people no longer consider BLOCKS. Common practice has been to abandon BLOCKs altogether. Keep as much of that original 10x as you can. Don't start by giving up the advantages that you had to begin with. Don't use files for everything! Don't use files inappropriately.

Forth provides unique access to a combination of interpretation, compilation and execution that just isn't there in other languages. It is part of the power of the language and it goes well beyond interactive development. It comes from a very, very important notion in Forth: don't do at runtime what you can do at compile time, and don't do at compile time what you can do at design time. Chuck will start with a program that other people will code as A B C D E F G and rethink to find a way to do the equivalent as A B E F G, by figuring out how NOT to compile part of the program and still make it work. Then he will figure out how to make B and F happen at compile time, by switching between interpret and compile mode while compiling. Chuck's compiled code will be A E G, with C D eliminated (done at design time) and B F done at compile time. Chuck has said that one of the problems he sees with many mega-Forth systems is that they don't switch between compile and interpret modes often enough and compile too much.

On the Forth systems I have been working on for several years I can do a Forth {!DUP} in one of three ways. I can write {!DUP} in Machine Forth and compile a 2ns 5 bit opcode. I can switch to ANS Forth and write {!DUP} in a definition. It will compile the ANS word {!DUP}. In most of the ANS Forths this will be about 50 times slower than the Machine Forth, but at least it is still compiling at compile time and executing at runtime. I can also do it with late binding, as is shown in many examples in c.l.f: {!S" DUP" EVALUATE} This involves interpreting a string, searching the dictionary, and eventually executing the DUP with late binding, all at runtime. This is a powerful technique that can certainly make Forth look like a completely different language. It is also a good way to slow down the program by a factor of about 600,000 times. I know, who cares? Just buy a new computer that is 600,000 times faster to make up the difference. Is this what we want to show beginners in c.l.f to expose them to good ANS Forth practices? Give them a page of rules for spelling, a page of rules for laying out the code, and examples of how to slow things down by 100,000x. This is pushing things from design time to compile time and from compile time to runtime in a big way. The idea in Forth is to do just the opposite.

Chuck likes to say that executing compiled code requires that the code be read, compiled, and executed. You have to read it to compile it and then execute the compiled code anyway, so consider reading it when you want to do it, by using interpretation. If you are dealing with small simple source this seems to make sense. I doubt if it seems to make sense if you have introduced so much complexity that you have megabytes of code. In Chuck's software, with a 1K Forth and typically 1K applications, interpreting source is fast and involves searching short dictionaries. Interpretation becomes less attractive when it involves searching wordlists of tens of thousands of words. Chuck refers to his concept as {!ICE}: Interpret, Compile, Execute.
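A tiny made-up example of moving work "to the left" this way, using only standard words. All three definitions turn hours into seconds; the first redoes the multiply 60 60 * on every call, the second does that arithmetic once, while compiling, by dropping into interpret mode with [ and ], and the third does it at design time by just writing the answer.

    : SECS-RUN     ( hours -- secs )  60 60 * * ;              \ runtime work
    : SECS-COMPILE ( hours -- secs )  [ 60 60 * ] LITERAL * ;  \ compile time
    : SECS-DESIGN  ( hours -- secs )  3600 * ;                 \ design time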
As a metric I did some analysis of code examples Chuck has provided. The numbers I find most interesting are that {!the length of the average colon definition is 44 characters. The length of the longest colon definition was 70 characters.} This is a sign that he has factored, factored, factored. How big are the average definitions in your code? Smaller definitions are easier to code, easier to test, etc. {!The ratio of interpreted to compiled mode words in his source is 1.65/1. That is, he has 1.65 times as much being interpreted while compiling as compiled while compiling.} Chuck says that without this use of {!ICE} programs can easily become too large. Finally, although Chuck includes almost no documentation in the source code itself, it was accompanied by lots of external documentation. {!The ratio of documentation text to source code text was 339/1.} Chuck feels a description of the code and explanations can be linked to the code but should not clutter it. He feels that documentation deserves a place of its own and has created various web pages so that the documentation can be accessible to other people.

He says he can take a small Forth and interpret an application from source faster than a megaForth can execute compiled code. He is not just talking about a small difference either. The megaForth may take quite a long time loading into memory, and the compiled code may also have to come in as a DLL. I doubt if you can load and execute a few megabytes of compiled code this way as quickly or easily as simply interpreting a small source file in a system that can load into memory, and compile or interpret applications, before your finger can come up off a key.

Chuck would say if you know how to write good code you won't be so reliant on source code libraries. He would say, "{!Don't use libraries inappropriately.}" Don't compile everything including the kitchen sink just because it is there. If you don't need it leave it out. Leave it in the library. I have often seen people take a big computer and start coding with a few library includes to get started, and then run out of resources before they can get beyond the sign-on message: megabyte hello-world programs.

{!Use efficient control flow structures.} Don't overuse them. Saying {!X X X X X} is cleaner, simpler, faster and clearer than {!5 0 DO X LOOP} (there is a short sketch of this below). We have so many choices in ANS Forth that we can make it look like anything, but don't. Keep it simple. There are words like {!CASE} that Chuck refers to as abominations; keep it simple. Speaking of abominations, don't {!PICK} at your code. Don't {!PICK}, don't {!ROLL}. Don't delay doing things and never {!POSTPONE}. (Compiler writers excepted.) Stay away from the {!DEPTH}. Programs that have lost track of what is on the stack are already in big trouble. Chuck has made himself very clear about this and explained his reasoning. {!There is a long list of words in the ANS Forth standard that Chuck would say you should avoid, if not simply remove from your system.} The problem of course is that if you have ANS Forth libraries they too are full of them. But this just means that you have the opportunity to do it better this time.

{!One of Chuck's principles is that what you take away is more important than what you add. You have to add whatever you need. The problem is that you end up adding a lot of stuff that you didn't intend to add originally, but the complexity just demanded it. The more you add, the more the complexity demands that you add more, and it can easily get out of control. You have to put in what you need, but unless you make an effort to take out what you don't, fat accumulates.}
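Here is the promised sketch of the control flow advice, an invented illustration where X is just a stand-in word: five executions of X written unrolled, with the counted loop machinery, and with a BEGIN loop keeping its own counter on the stack, the style suggested in the review that follows.

    : X       ( -- )  [CHAR] * EMIT ;      \ stand-in for any action
    : FIVE-A  ( -- )  X X X X X ;          \ unrolled: simplest and clearest
    : FIVE-B  ( -- )  5 0 DO X LOOP ;      \ DO LOOP machinery
    : FIVE-C  ( -- )  5 BEGIN X 1- DUP 0= UNTIL DROP ;  \ BEGIN, counter on stack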
So a review of what I would call Chuck's second 10x: first, stay with stacks, words, blocks as a start, and don't be in a hurry to abandon them for things from other languages like locals, objects and files. Think through the problem; spend your time in the design phase figuring out how to balance the program between interpretation and compiled code execution. Don't just jump in writing code and compiling everything. Use the ICE concept. Don't abuse libraries. Compile what you need, not every library and extension. Use simple control flow and layout of the code to enhance the factoring, factoring, factoring. Use {!BEGIN}s rather than {!+LOOP} etc. Don't use the long list of unfortunate words that have crept into and been quasi-legitimized by ANS. Do all this and don't give up the 10x that you started with. Keep it simple.

I had several years to observe a team of programmers, some of whom were following these guidelines and some of whom were following common ANS practices. Chuck also observed these same practices and solidified his opinion of ANS Forth use. My position as programming manager was somewhat different than his, but we tended to agree on what we should and could do. Chuck has said this stuff many times over the years. If you introduced the word {!FOO} into the ANS standard, you don't want to hear Chuck telling people not to use it. If you charge people for fixing their buggy programs, you may not want them able to write solid efficient programs on their own. If you are promoting your megaForth system, you may not like Chuck advocating the use of small and sharp Forths. If your thing is locals or {!OOF} or Forth in 'C' or Java, then you may not like Chuck's opinions on these subjects. With so many people prejudiced in advance against Chuck's ideas on these subjects, there are a lot of people who will trivialize and distort Chuck's position out of self-defense for their own position. At the same time some of them have said that they will refuse to even look at Chuck's examples or proofs; they have closed their eyes and minds on these subjects.

___________________

Chapter 3 Third 10x

Now that brings us to the third 10x that Chuck has up his sleeve. This is the one that I found most elusive. This is the stuff that he has not explicitly and repeatedly presented to the Forth community. It is the stuff that is not obvious to everyone and which you have to dig in to find. I can only assume, given the reaction that Chuck has received for the advice he has given on good programming style, that there is no point in his trying to go on to the advanced stuff. I know I have felt very much that way over the years. When I have given presentations I always want to get into the interesting stuff, like how to implement a GUI in a couple of K, or how to compile English language rules into optimally efficient executable code structures, or how to parallelize Forth programs easily, but we rarely get beyond reviewing the most basic and simple facts to get started. People seem to begin by trying to map the ideas about the chip and Forth to other languages and environments and architectures. As soon as we try to get going people say, "that's not possible" because they are thinking 'C' or thinking RISC.
We spend most of our time simply explaining the starting point and the Forth perspective: that our programs and our hardware match very well because the hardware is 1000x smaller, cheaper, and lower in power consumption, and we only need about 1/1000 the code, so it fits. It is simple and easy, and productivity goes way up when you are not dealing with the 99.9% fat and all the unneeded complexity. The explanations often go like this: "I don't think it will work, you can't handle the ... problem." "We don't encounter the ... problem; in this approach we avoid it." "But you can't handle the ... problem." "We don't encounter the ... problem when going in this direction." "How can you handle the ... problem?" "That is what we are optimized and designed from the ground up to do." They have a very hard time seeing that we face a different set of problems by simply avoiding so many unsolvable problems. If I post a fact in c.l.f, such as the time for a specific processor to run a specific program so that people can compare it to other processors, it gets lost in the noise of all the people who post what they think are corrected estimates based on their perspective. I have asked people why they would say in c.l.f that I was not being honest in simply reporting results, and asked them where they got the information that they published. "Well it was just my estimate of how well F21 would run SpecFP in Unix." Anyone who has read what Chuck and I have said over the years would know that such a person either doesn't have a clue or is trying to be as deceptive as possible. As I say, the last 10x is the most elusive, the stuff that I don't think Chuck has explained repeatedly, although it has been put in front of us. Of course those who deny that it is there are never going to find it even when it is pointed out to them. I imagine that when Forth Inc. trains people they provide them with some of this stuff. I have seen articles in FD about some of the techniques but they tend to get lost in all the stuff about making Forth look more like other programming languages. People try to distort and trivialize Chuck's position by saying that he has removed everything that is needed from Forth and they don't understand why. But Chuck didn't just remove things to remove things. He removed things when he found a simpler way to solve the same problem that something was there to solve. Often to solve a problem you have to add x-y-z. But then x-y-z introduces some more complexity, and as a result you have to then also add a-b-c and d-e-f. Now everyone gets used to x-y-z and never considers any other way to solve the problem that x-y-z has solved for everyone for so many years. They also never consider that a-b-c and d-e-f may also be viewed as problems that are begging to be fixed by being removed. Otherwise, as time goes by, they become bigger and bigger problems, and as code is developed and added to the system they will lead to more complexity. So Chuck will rethink the problem. He will find a different way to do x-y-z, or better yet to avoid the problem that x-y-z solved. Now the x-y-z solution that everyone has been using for some problem is no longer needed because something better is available. It gets done because now he can also remove a-b-c and d-e-f, since they were just needed to support x-y-z. Things become simpler, clearer, easier to use, easier to maintain, because unneeded fat has been removed and there are fewer complications.
But people who have been doing x-y-z for twenty years are horrified at the prospect of not using it. They cannot imagine Forth without it. None of their code would run without it. They would have to rewrite code to not use x-y-z, and writing code is hard for them. They also cannot imagine living without a-b-c and d-e-f because, once again, their old code is full of them and they never considered doing it any other way. Chuck doesn't just throw stuff out because he is compelled to minimize. He is compelled to experiment and improve his code. He is not against throwing something out if he finds a way to be more productive without it. He throws stuff out because he can be more productive by doing things in a better way: cleaner, clearer, simpler. He does not think that he came up with the perfect language 30 years ago any more than he feels he has the perfect language now. He will always want to change and improve what he is doing and he will always be looking for new ideas to make it better. He has been on a clear path to make his language, Forth, smaller, simpler, easier, more powerful, faster, less buggy, more maneuverable, etc. But each time he finds a way to improve something, through extensive experimentation and pragmatic analysis, many people have a knee-jerk reaction and say, "He took out what? I can't live without that! What is he thinking?" But rather than ask what he is thinking, and ask him to explain why it is better to do Forth without x-y-z, many people just say, "x-y-z is standard practice by the average programmer and is part of the ANS standard. I don't want to even consider the idea of removing x-y-z and I have no interest in listening to the arguments. I will just post stuff for other people about how I know that Chuck knows that x-y-z is really good and that he is just trying to mislead people." Look at Chuck's {cmForth|http://www.ultratechnology.com/meta.html#cmForth} from a decade ago if you haven't. Why was SMUDGING removed? That's non-standard! Why were immediate words removed? The answer is pretty obvious: he found a better way to solve the same problems those things were there to solve; they were no longer needed and so were just in the way. He has explained the reasoning, but only a few dozen people have followed his logic in the design of cmForth. Fewer followed his transition to Machine Forth; it sort of slipped through the cracks. Very few had any interest in OK, it wasn't Forth. OK was originally called 3/4 (three-Forth) because Chuck considered it a subset of Forth at first, but he later decided that it wasn't Forth. By the time he had moved to his {Color Forth|http://www.ultratechnology.com/color4th.html} most Forth enthusiasts had lost any interest in what Chuck was doing. He wasn't doing ANS Forth like everyone else, and some of the ANS Forth people even felt that Chuck should not use the term Forth any more because they were now the official owners of the term and Chuck wasn't doing what they said was Forth. The consequence of removing SMUDGING was that compiling was simpler and recursion became automatic. As a result, if Chuck wanted to redefine a word in terms of its old definition he would have to do something like {!' MYWORD : MYWORD ( redef) ... COMPILE, ... ;} because {!MYWORD} would otherwise be a recursive call to the new {!MYWORD} without SMUDGING of the name until the definition is finished. But Chuck said he would just redefine it with a new name anyway because that is simpler.
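A minimal sketch of what automatic recursion means in practice. In ANS Forth the name being defined is hidden while it compiles, so self-reference needs {!RECURSE}:

  : FACTORIAL ( n -- n! )  DUP 2 < IF DROP 1 ELSE DUP 1- RECURSE * THEN ;

Without smudging, as in cmForth, the definition can simply name itself and {!RECURSE} disappears:

  : FACTORIAL ( n -- n! )  DUP 2 < IF DROP 1 ELSE DUP 1- FACTORIAL * THEN ;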
Chuck also added tail recursion to his Forth a decade ago. This only gave him a small speedup on many words, by converting the last call into a jump rather than compiling an {!EXIT} in {!;}. But a little speedup here and a little speedup there, and if you keep at it you see overall speedup. In the same way, a little fat and sloppiness here and a little fat there, and soon you see a lot of fat accumulating. Combining this tail recursion with the auto-recursion provided by removing {!SMUDGE} from the system gave him a looping construct. With this new, simpler control flow mechanism Chuck didn't need all those others. He had already tried replacing {!DO} with {!FOR} a decade ago, and was now able to remove other looping constructs now that he had a simpler, more efficient mechanism available.
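A sketch of the resulting looping construct, assuming a compiler that turns a call appearing directly before {!;} into a jump:

  : COUNTDOWN ( n -- )  DUP 0= IF DROP EXIT THEN  DUP . 1-  COUNTDOWN ;

The final call to {!COUNTDOWN} compiles as a jump back to its own start, so this loops without growing the return stack and without any {!DO}, {!LOOP}, or {!BEGIN}.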
When Chuck removes stuff from his Forth that other people use regularly in their Forths they wonder how he can live without ... or ... and they don't seem to realize that he didn't just take stuff out randomly or for no reason. He took stuff out after many years of extensive experimentation and consideration, because either it was replaced by something that he considered better or he felt that it was just getting in the way and no longer needed at all. People assume that since Chuck has refined his Forth down to about a 1K object, this means he has just stripped his Forth down to a 1K kernel that will boot like in the old days, and that he is going to compile a complete Forth system on top of the 1K before he starts an application. This is wrong. The complete Forth system is 1K, and the reason for that is to maximize Chuck's productivity. What stops people from doing what they need to do to solve a problem is all the time spent solving all the related sub-problems that pop up as a result of complex interconnections between components. To maximize his productivity Chuck minimizes the number of these side problems that pop up. Keep it simple, and don't get to where you are spending 90% or 99% of your time dealing with related sub-problems. Avoid unsolvable problems; don't waste your time trying to solve them. The approach that Chuck has taken is to focus on the task of solving the problem at hand. This means facing the situation of a real program. Unless you are writing a book about programming theory, if you are dealing with a program it is the implementation of a program on a real machine. There are many real issues that come up when one faces the situation at hand. You have a problem, and are facing the implementation of a program in the real world. The approach Chuck takes is to think through the problem well before you design the app. Part of that design involves issues around the real platform on which the code will run. Face the problem; think about it until you can picture the solution as about 1K of code or less. Until then you don't really understand it. If you assume that the program is going to be megabytes of code it is very intimidating. The first thing people look for is ways to crank out code and do as little coding as possible. They can't possibly write a megabyte of real code themselves so they look for code in libraries. They paste in as much code as possible and then go from there. They see no alternative. There is just too much code to deal with to study it in detail. Sometimes the prospect that programmers are so unproductive and programs are so big will make development costs look more attractive if spread across multiple platforms. The idea is good, and sometimes it is better to only spend the minimum on coding and live with inefficient code that is mostly portable to multiple platforms. That is the case when the programmers are crippled. This is not the mindset that Chuck would advocate. Don't base all the plans on the restrictions imposed by crippled programmers. Once you have made it through the second 10x you are no longer facing the unpleasant prospect of only being able to cobble together systems out of large collections of code written by lots of different people. You have the option of doing the same job with a small amount of well thought out code. The code is small so it is easy to focus on making it efficient. The code is small so it is easy to focus on making it match the real details of the platform on which it runs. Writing megabytes of code requires that you paste in a lot of stuff without trying to understand too much detail, not very Forth-like. Writing kilobyte-sized applications exposes all levels of what is going on in the code and provides the anything-can-be-changed ability that accounted for the first 10x. The applications can be easily ported, easily maintained, and easily improved. If an application deserves to be written it deserves to be written well. It deserves to be well thought out and well implemented, and perhaps even rewritten and reimplemented. Why? Because you will learn! You will learn how to make it better and better. If you paste stuff in, in an effort to avoid thinking about the details, do you really think your programming skills will improve? Do you really think you reached perfection on the last try, or that you cannot learn by rethinking the problem? {!One of the factors for that third 10x is thoughtful or mindful programming. Pay attention to what you are doing. Don't try to avoid thinking. Programming with thought really is more fun and more rewarding than programming without thought.} Now I realize that this bucks the trend in modern software. Popular software keeps trying to dumb languages down further and further so that programmers will be a commodity resource. Any programmer given the same libraries and same tools will cobble together more or less the same mess. It is a good way to turn programmers into replaceable parts. But who wants to be a replaceable part? So you really think through the problem. You study examples in libraries. You try different experiments and models and compare features before you jump in and design code. Where do you jump in? I think you think through the problem from top to bottom and from bottom to top a couple of times. Then you look at the most important part, the bottom. Most of the time in the profile will be spent at the bottom. The code at the top may execute so infrequently that it doesn't matter whether it is system style code or script style code. But the lowest level code, system level if Forth is the OS, or OS and interface code otherwise, spreads its fat over everything else. If anything deserves attention this is where you start. What routines execute most often? Where will the focus on efficiency be placed? This is the very bottom of the design. At this stage I advise programmers to study the data structures that are used by the program very carefully. There are some very important issues in the code design: data structures, ordering, and scaling. How do you design the data structures? You examine what they contain, how they will be accessed, and how they will be manipulated. It is not that much different from planning the data flow on the stack. If it is well planned out, things are often there whenever they are needed because of the planning at design time. When they are not in the right order you need stack or data structure gymnastics to manipulate the data. It is very important that the fat be removed at this level. The order in which data elements are accessed should be designed so that, as often as possible in the most frequent routines, accessing one data element leaves the pointer pointing at the next element with no overhead. When this is applied to as many steps as possible in the problem, no manipulation of the data pointers is needed. Access to the data on which other routines depend can be sped up several times by something as simple as that. It is no different than carefully planning the stack use so that no stack gymnastics are needed.
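A minimal sketch of that idea; {!P} and {!NEXT@} are names invented for this sketch, and the record's fields are stored in exactly the order the routine consumes them:

  VARIABLE P
  : NEXT@ ( -- x )  P @ @  1 CELLS P +! ;  \ fetch, leave P on the next field

  CREATE SAMPLE  10 ,  20 ,  3 ,  \ x, y, weight, laid out in access order

  : WEIGHTED ( -- n )  SAMPLE P !  NEXT@ NEXT@ +  NEXT@ * ;  \ (x+y)*weight

Because the layout matches the access order, {!WEIGHTED} contains no pointer arithmetic at all; the pointer is always already where the next access needs it.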
I can think of an example where the common practice was applied by an ANS Forth programmer who chose to ignore all of the above advice and generate more portable code. Everything else was built on top of the data structures. The program was a translation of a program from a 'C' library. The first step was to paste in 'C'-like data structures. The second step was to make a copy of the data structures in the 'C' program and access them in exactly the same order as in the 'C' program. Worst of all, the Forth implementation of 'C' structures used late binding, so instead of pushing stuff from runtime to compile time and from compile time to design time, this was pushing stuff from compile time to runtime. The pasted code had not been examined closely. Even a novice Forth programmer would have cleaned it up and sped it up a couple of times if thought had been applied. So it gave up about 2x by not removing lots of visible fat, 2x for using the ordering in the 'C' program without question, 50x for using late binding and 50x for using ANS Forth rather than Machine Forth. Combine the factors and this was far more than 10x; it was more like 10,000x. I patched the code, ran comparative benchmarks, and discussed the issue again in a staff meeting. We had been assured that any code pasted in from the FSL would get cleaned up appropriately for an embedded target. {!Translating programs from one language to another can be done mechanically or by studying the implementation in one language and improving on it in the next. Don't translate when you have the opportunity to rewrite and improve. You can spend your time carefully designing code or trying to debug and improve thoughtless code.} Translating without thinking is not thoughtful programming. Don't be afraid to think about the problem and what you are doing. {!Another factor for getting that last 10x is scaling.} Most Forth programmers don't take advantage of the tricks of scaled arithmetic. Their programs deal only with full integers or floating point. Scaling is important because it may allow you to take advantage of my favorite mathematical operation, cancellation. Scaling is something that takes planning, not unlike the ordering and arrangement of important data structures. Precision and accumulated error are some of the considerations.
{!The real value is that when using scaled math you may have the opportunity to carefully scale the factors so that the most common operations are simpler mathematically or logically than they would be without careful scaling.} In other languages you may write an infix algebraic equation and let the compiler sort out the operations to do it. You can add a routine to do that for you in Forth also. Normally we put the equation in as a comment and sort out the sequence of operations to implement it ourselves. In doing this we are used to refactoring an equation to simplify its calculation. In the same way that expressions can be simplified to remove terms, some equations can be juggled to use operations that are simpler to implement than others. Shifts and adds can replace multiplies and divides, or more complex operations, when the values are scaled to suit key constants and key calculations, or terms can be combined or made to drop out altogether. The program will execute this equation N million times; by scaling it this way these multiplies become {!1*}, this divide becomes a {!2/}, and these terms drop out. Of course we need to convert back to another range of numbers when we are done with all the calculations in the main loop, so you rescale again when you are done. If you don't think this way you can't take advantage of this type of thoughtful programming technique. Chuck showed me the equations he was using for transistor models in OKAD and compared them to the SPICE equations that required solving several differential equations. He also showed how he scaled the values to simplify the calculation. It is pretty obvious that he has sped up the inner loop a hundred times by simplifying the calculation. He adds that his calculation is not only faster but more accurate than the standard SPICE equation. In the article about OKAD from {More on Forth Engines Volume 16|http://www.ultratechnology.com/mofe16.htm} Chuck mentioned scaling of units for his transistor model. He said, {!"I originally chose mV for internal units. But using 6400 mV = 4096 units replaces a divide with a shift and requires only 2 multiplies per transistor. This pragmatic model closely fits the measured IV curves. A (spreadsheet style) display exists for manually fitting parameters."} Even the multiplies are optimized to only step through as many bits of precision as needed.
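A minimal sketch of the principle, with illustrative numbers rather than Chuck's actual OKAD code. If values are kept scaled so that full scale is 4096, a power of two, then renormalizing after a multiply is a shift:

  : S*  ( a b -- ab' )  * 12 RSHIFT ;  \ scaled multiply: a shift, no divide

Had full scale been a round decimal figure such as 6400, every renormalization would pay for a true division:

  : S*  ( a b -- ab' )  6400 */ ;  \ scaled multiply: multiply and divide

The shift version assumes non-negative values whose product fits in one cell; the point is that a scale chosen at design time decides which operators the inner loop pays for millions of times.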
The third 10x comes from putting particular attention at design time on these issues of how to move things from runtime to compile time and from compile time to design time. {!The third 10x is to some extent a recursive application of part of the second 10x.} Just because you have a solution does not mean that you should not consider trying a better solution. This is most important in low level code. Wordlists are a feature of Forth that links multiple lists of words to be searched in the name dictionary. One list of words for Forth, another for the assembler, another for the editor, or at least that is the way it worked in the old days. ANS Forth provides a mechanism for constructing and managing wordlist trees in the name dictionary. I have noticed that this has often become a very abused feature of Forth. The multiple ordering of the wordlists to do different things leads to the problem of the same names doing different things in different wordlists, confusing the programmers, as well as other problems. I have seen this carried to such an extreme that the programmer has constructed more wordlists than Chuck would write words! The worst case I have seen is the use of wordlists to implement classes in an object oriented Forth. As I said before, the words in Forth are a primitive form of object; if they don't do it for you, you can use the {!CREATE DOES>} construct for more object functionality. When systems overload the concept of OO classes onto wordlists I think it qualifies as wordlist abuse. It introduces so much complexity. Chuck has recently said that he has removed wordlists from Color Forth. With a 1K-sized Forth that can compile applications in a click and forget them in a click he isn't dealing with code that requires being spread out over a wordlist name tree. Removing so much complexity from other places made Chuck feel that wordlists were just not needed to deal with the complexity that he used to deal with. Removing wordlists is also one of the techniques that greatly simplified other words in the system and allowed Chuck to build a 1K-sized Forth that can compile applications in a click. There are several improvements that Chuck has added to his newer Forth virtual machine model. One of them is the address register and the other is the circular stacks. Chuck has explained that, hardware considerations aside, the idea of the address register was that the Forth words {!@} and {!!} (fetch and store) were clumsy at the top of the stack and were based on smaller atomic operations that the programmer could take advantage of. {!@} was broken into two operations, {!A!} and {!@A}. Likewise {!!} was broken into {!A!} and {!!A}. One advantage of this approach is that the contents of the addressing pointer can be preserved across words independently of the data stack. Another advantage is the use of auto-increment memory access opcodes, which permit auto-incrementing through items within loops using less code and less time. There are some other consequences, as he has said, as to what kind of code falls out when you write for this sort of virtual machine. All the people that I worked with reported enjoying the style and felt empowered by these techniques to improve their Forth code. I advised the programmers who were using ANS Forth only to experiment a little with addressing this way in their high level code, and gave them a simple set of definitions with a variable named A used for addressing. Code written in that style, with {!BEGIN}s instead of {!LOOP}s and other higher-overhead words that do the same thing, resulted in a simpler and clearer style and made it trivial to port their ANS Forth code to Machine Forth on the project to get a large automatic speedup on this system.
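A minimal sketch of what such a set of definitions might look like (my reconstruction, not the project's actual code), with an ordinary variable standing in for the hardware address register:

  VARIABLE A
  : A!  ( addr -- )  A ! ;  \ set the address register
  : @A  ( -- x )  A @ @ ;  \ fetch from where A points
  : !A  ( x -- )  A @ ! ;  \ store to where A points
  : @A+ ( -- x )  @A  1 CELLS A +! ;  \ fetch, then advance A one cell

With the pointer parked in {!A}, the data stack stays free for data, and a word that sums n cells needs no address juggling in its loop:

  : SUM ( addr n -- sum )  SWAP A!  0 SWAP  BEGIN DUP WHILE 1- SWAP @A+ + SWAP REPEAT DROP ;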
Another feature of Chuck's new Forth virtual machine model is his circular stacks. They greatly simplify the construction of his hardware and make the stacks in his architecture faster than general purpose registers in other architectures. They also greatly simplify Forth. Chuck has said that stack overflow and underflow errors have always been a problem, not just for Forth but for everyone. The problem is that they are destructive. Unanticipated errors happen, and when the return stack gets corrupted, errors can compound and systems can crash. Overflowing stacks can corrupt code and other data structures, causing systems to crash. Hardware and software designers have put a lot of attention into managing these errors with elaborate hardware and software. That elaborate hardware and software adds complexity in other places, like compilers and applications, and that leads to more bugs. Chuck wanted to find a simple solution that wouldn't introduce complexity. Having the bottom N elements of the stack as a circular data structure meant that there was no arbitrary starting point. When you were done it was empty. You never had to empty it out before using it again; it was always at the start if you wanted it to be. If your program had bugs, the worst thing that could happen on the stack is that stack data would be corrupted. That kind of error is a lot easier to deal with than corrupted code or memory structures. It also means that a Forth system does not have to deal with complex hardware or software mechanisms; the problem is either avoided or minimized as well as any approach can manage, almost for free. Now Chuck says to factor, factor, factor. But in fact he also sometimes does just the opposite: he inlines his code! Not all the time, only where unrolled inner loops will improve the performance of a critical routine. It is a common practice in other languages to allow the compiler to do this for you. Chuck however simply specifies it explicitly when it is better than factored code for the problem at hand. He doesn't do it with complete applications, only a few routines. These inlined, unrolled inner loops also give Chuck access to computed entry into a code table. If you only transfer a maximum of 800 pixels in sequence when copying a video line, you can inline a sequence to move up to that maximum and eliminate all loop overhead in that sequence of code. You can then compute the address in the code table into which to jump or call to transfer fewer than 800 pixels. Chuck has abandoned the traditional {!DO LOOP} constructs in Forth and replaced them with simple {!BEGIN} constructs or computed jumps into unrolled inner loop code tables. Another factor at work here is the amount of code inside the loop. The shorter the inner loop, the higher the percentage of overhead for looping and the more it begs to be unrolled. One of the most powerful techniques I have seen is what Dr. Philip Koopman documents in {STACK COMPUTERS the new wave, section 7.2.3|http://www.cs.cmu.edu/~koopman/stack_computers/sec7_2.html#723}: executable data structures. Code operates on data but they are not mutually exclusive. Code can have data embedded in it and data can be executable. This is our perspective in Forth. Some other languages and some hardware systems have a very different perspective. If you can take advantage of it, it is a very powerful technique and can generate the tightest code possible on a large class of problems, problems that use decision trees.
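A minimal ANS-style sketch of computed entry (Chuck would jump directly into unrolled code; standard Forth can only approximate that with a table of execution tokens, and the names here are invented for the sketch). The caller sets {!SRC} and {!DST} first:

  VARIABLE SRC  VARIABLE DST
  : XFER ( -- )  SRC @ @  DST @ !  1 CELLS SRC +!  1 CELLS DST +! ;  \ move one cell
  : XFER0 ( -- ) ;
  : XFER1 ( -- ) XFER ;
  : XFER2 ( -- ) XFER XFER ;
  : XFER3 ( -- ) XFER XFER XFER ;
  CREATE XFERS  ' XFER0 , ' XFER1 , ' XFER2 , ' XFER3 ,
  : MOVE-CELLS ( n -- )  CELLS XFERS + @ EXECUTE ;  \ computed entry, no loop test

{!MOVE-CELLS} dispatches on the count, and the code it lands in runs straight through with no loop overhead. Scale the table up to 800 entries and you have the video line copy described above.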
___________ Conclusion Novice programmers, or commodity component programmers in some language, are given a problem and start right in writing (or pasting) code. More experienced programmers know that time invested in understanding the problem and designing the solution is more important than jumping into coding. The master programmer knows that this stage may deserve to be done more than once, or recursively. If you cobble together megabytes of code and then start testing it you would never consider starting again. You already had to produce 1000x more code than you needed to; would you want to do it again? If you carefully construct some well thought out code to match the problem at hand, more thinking or a little more recoding may be more fun and more productive than the first try. So if you're new to Forth try it out. Perhaps you can find some way to get that exhilarating 10x improvement in your programming. If you have been doing that for years, remember how much fun it was when you made that improvement and consider that it could be fun again. Try for that second 10x with an open mind. OK, so you have been doing all this for years and your programs are about as small and simple and fast and clean as you think is possible. Great, but don't stop there. Unless you think you really have reached the end-all of computing, be willing to try other things, experiment, look for stuff you hadn't noticed before. If on the other hand you don't care about these issues and are delightfully happy to do your sentence as a replaceable commodity programmer, and you find learning painful, then none of this is applicable to you. It is only for people who want smaller, faster, cleaner, clearer and better programs and more personal satisfaction. As Chuck has said, one of the rewards of thoughtful programming is the satisfaction of a job well done. If you enjoy programming you probably enjoy doing it well, and you may not enjoy being told to crank out low quality code because it seems like a good idea to some bean-counting manager. It makes you wonder how many times programmers expressed concern and were forced to put Y2K bugs into programs against their will by short-sighted management looking at the quarterly budget. One thing I am sure many people would think is that the techniques that Chuck uses are examples of individual genius and individual effort, and that they are not going to be applicable to typical problems in computing. I know that Chuck would not agree with that, nor do I. I have seen too many other people do it, and on teams. I worked with Chuck through my own company UltraTechnology and I worked as director of programming at the iTV Corporation with Chuck. There I was able to examine the relationship between team effort and programming environment and style. I was somewhat surprised to find that a team of Machine Forth programmers using Chuck's style of problem solving and coding, combined with a management effort to guide the team, worked very well. It really didn't appear that the language or style of coding had any relationship with team coordination efforts, other than the relative number of bugs that different environments introduced. The Machine Forth programmers had no problem sharing problems, sharing code, and having their contributions to the effort work in harmony with the other programmers. It confirmed my experience that it is not the language but the way the team interacts that is the biggest factor. However, if the coding style introduces bugs they are harder to find in a multiprogrammer project. The Machine Forth programmers consistently delivered very stable, high performance, bug free code on schedule. It was also my job to train these programmers in Machine Forth, and I was pleased to see that the techniques were easy to learn and easy to apply. Another thing I am sure many people are going to think is that the techniques that Chuck uses are useful for dealing with small problems but not large ones. However, the largest and most complex, not to mention most expensive, software that I have encountered were VLSI CAD programs. I am sure people will equate the incredibly small size of OKAD and its incredibly fast code with it just being a trivial problem.
Chuck's extreme optimization of the software only shows that his approach allowed him to write a suite of application programs that perform the same functions that he wanted to use in those huge expensive VLSI CAD programs. His ability to shrink a number of megabyte-sized application programs into a set of kilobyte-sized ones is not an isolated case. We had a set of programmers who consistently wrote bug-free 1K-sized applications in Machine Forth. In one case we had an ANS Forth program that was 100K rewritten to 1K in Machine Forth and made 1000x faster by someone applying these programming techniques. So why are Chuck's ideas about computing so unpopular? Well, people tend to equate them with twenty-five-year-old Forths rather than seeing that he has significantly changed his approach to Forth several times over the years. What is funny is that they will say they have no interest in what Chuck is doing because he is still doing what he was doing twenty years ago. Chuck is doing Forth without 'C' libraries linked in, with BLOCKS, and with tiny non-ANS style Forth. Chuck would say that they are the ones who are twenty years behind and are still doing what he was doing twenty years ago; they just created a poor committee-standardized version of where he was twenty years ago. So it is almost funny that both views see the other as twenty years behind. People get very offended when Chuck just plainly says that their code is 10x bigger than required, and those are the Forth experts! The people doing 'C' or Forth in 'C' look at what he said and see those 100x numbers. They feel very insulted and sometimes claim that Chuck called them an idiot just because he reported the result of a benchmark or the time he spent coding something. They would rather refuse to believe the numbers, or even refuse to look at them, because they might shake up their ideas about computing. If you are selling megaForths then Chuck's idea that Forth only needs 1K, and that more than that is most likely unneeded fat, is not very attractive. If you are getting paid to teach people how to write ANS Forth programs you might not like Chuck advising people to try something else. If you never saw the original 10x, and don't think that other people really saw it, you are not likely to consider other hidden potential 10x factors. Chuck's ideas really fly in the face of many of the things that are being taught in computer science. They are in the face of all the folks whose computer software has been expanding about as fast as their computer hardware for years, or even sometimes falling behind. They are not applicable once you find yourself buried in self-imposed complexity. They won't help you if you mix and match them too much with the conventional ideas. While most of the Forth community has been working very hard to make Forth more like other languages, to get it to fit into the niche the world has for it, Chuck has been trying to go in the opposite direction. Rather than water it down or dumb it down to look like everything else, Chuck has continued to make it smaller, simpler, faster and more productive in iteration after iteration. Just as Forth was in the face of conventional programming, Chuck has chosen not to do what he thinks most everyone else is doing, making Forth look and work more like other languages. He feels that there are plenty of people doing that. He wants to try to make Forth smaller, simpler, faster and more Forth-like again and again.
That is in direct opposition to most of the Forth community, who want to agree to set in stone the way things were done twenty years ago and then extend Forth further and further. Neither effort seems to have done much for Forth. In this period the size of the Forth Interest Group dropped steadily, leveled for a bit, then fell off again. Today I think c.l.f is a larger Forth community than FIG. It is a shame too, in the sense that FIG asks the experts to give presentations and then ask them questions. Usenet offers equal time to newbies, green belts, black belts and masters, and the experts are more focused on exposing the fringes of Forth to people by endlessly debating ridiculous ways to break the standard with some of the weirdest code any of us have ever seen. I really wonder how anyone would get started with Forth today and how much of that original 10x will be available to them. _______________ Final Thoughts Chuck prefers the use of seven keys, cursor and function keys, rather than a mouse, for precision of control. He uses a full keyboard for writing Color Forth scripts, with function keys assigned as color change tokens. His GUI in OK uses the full graphics screen in a number of different modes of operation rather than having the look of a popular GUI with resizable, movable, layered windows on a simulated desktop. I prefer the mouse driven interface to the seven function keys, but the interface is still left, right, up, down, and a couple of buttons. I have said for years that something that looks to the user like a popular GUI, with resizable, movable, layered display windows and a simulated desktop, only takes a couple of K of code. Chuck has demonstrated that a windows accelerator in hardware only requires a few transistors. These things can be simple, fast, and mostly painless. The application of Chuck's methods fits well when creating a small simple GUI or OS with efficient implementations of the abstractions that the application demands. Chuck's idea of the future is more custom silicon and machines that are efficient at what they do. Not all machines will need Windows (tm) or Unix, or Forth for that matter. Chuck likes the abstraction of source code being interpreted in embedded applications. He has even joked that perhaps after Y2K it will be mandated. With a tiny 1K Forth or Forth in hardware, machines would embody the underlying abstractions used by programmers. Chuck's methods apply well to a team of programmers setting the abstractions that they need. Chuck has said that he would enjoy seeing a small project be funded to have a small team of a few people knock off something resembling the modern GUI desktop and set of applications, in Forth on a Forth machine, that does everything useful but is a thousand times smaller and simpler than the alternatives people have now. A tiny piece of silicon can contain hardware that performs a particular task with incredible efficiency along with a tiny general purpose central processor. One advantage of these machines will be that software will support only the devices found on chip. You can have all the special cards, video cards, analog cards, network cards, LCD support, gigabit fiber links, popular I/O interfaces etc., and for a given device there only needs to be one set of drivers to support the hardware. There is no need to support many extra layers of abstraction or inefficiency between what needs to be done, what the hardware can do, and what the software does.
After a number of years of working in these environments I must admit that I feel I have become very spoiled. I am used to having fun solving problems quickly and feeling very productive. I have a strong sense of satisfaction with the problems that I have thought through well. I enjoy showing other people how easy it is to do things this way. I recall that when I worked as a consultant to big companies, on all kinds of computers with all kinds of software, I enjoyed solving the most complex and involved problems, and I think how different that world was from one where the problems are small ones and I can get so much more done. I can also understand how most people look at Chuck's methods on the surface and think that they just don't apply to what they have to do. Chuck's methods wouldn't solve all of the problems that they have. Their problems are often related to different methods, and Chuck's strategy is to avoid most of the problems other people must face so that he can be more effective at solving the problems he wants to solve. ___________________ Related references [01] Jeff Fox, Thoughtful Programming and Forth. {Chapter 1|http://www.ultratechnology.com/forth.htm} {Chapter 2|http://www.ultratechnology.com/forth2.htm} {Chapter 3|http://www.ultratechnology.com/forth3.htm} [02] Charles H. Moore, Geoffrey C. Leach, {FORTH - A Language for Interactive Computing|gopher://gopher.anahuac.de/it/programming/forth/4th_1970.pdf}, 1970.