X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: f996b,4f3c981b56494d92
X-Google-Attributes: gidf996b,public
From: Neil Franklin
Subject: Re: Standard codes (out of topic !?)
Date: 2000/02/12
Message-ID: <6uwvoa85y6.fsf@chonsp.franklin.ch>#1/1
X-Deja-AN: 585097042
References: <883uu3$1he2$1@f1node01.rhrz.uni-bonn.de>
X-Complaints-To: news@chonsp.franklin.ch
X-Trace: 12 Feb 2000 22:31:54 +0100, dua131221.dialup800-stat.ethz.ch
Organization: My own Private Self
NNTP-Posting-Date: 12 Feb 2000 21:30:09 GMT
Newsgroups: alt.ascii-art

"Meph" writes:
>
> 1963 ASCII (American Standard Code for Information Interchange)
> - specified as 7-bit code for telecommunication and data exchange
> - 100 of 128 positions are used (only upper case)

Prepend this with: U.S./AT&T Teletype code. Existent at least in 1953
(I have used a device built then). It had 64 printable characters
(codes 32-95), identical to today's ASCII apart from a few
substitutions (no. 95 (_) was then a leftward-facing arrow, no. 94 (^)
was an actual upward-facing arrow, possibly others I do not know of).

> 1965 ECMA-6 (European National Standards Institute, Genf)

European Computer Manufacturers Association, actually. ENSI does not
exist; there is an ETSI (European Telecommunications Standards
Institute) that you may be confusing this with.

> 1981 IBM Codepage 437
> - IBM uses the 8 bit and extended the code to 256 positions

8 bit, but neither a predecessor to ISO-8859 (that is why DOS texts
make such a mess on Usenet), nor the first 8-bit code. The honour of
both of those goes AFAIK to a DEC character set from the mid/late
1970s (VT52 or VT100 terminal character set).

> 1986 ISO 8859-1, called Latin-1
> - it becomes international standard
> - identically with IBM Codepage 819 and 850

More likely those IBM Codepages are the names of their ISO
implementations.

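To illustrate the remark above about DOS texts making a mess on Usenet, here is a minimal Python sketch. It assumes Python's built-in "cp437" and "latin-1" codecs are fair stand-ins for IBM Codepage 437 and ISO 8859-1; the sample string is made up for the illustration.

```python
# Hypothetical sample text; any string with accented letters would do.
text = "é ä ü"

dos_bytes = text.encode("cp437")    # bytes a DOS program would write
iso_bytes = text.encode("latin-1")  # bytes an ISO 8859-1 system would write

# Same characters, different byte values above 127.
print(dos_bytes.hex())
print(iso_bytes.hex())

# Reading the DOS bytes as ISO 8859-1 gives code points in 128-159,
# which are not printable letters in ISO 8859-1.
misread = dos_bytes.decode("latin-1")
print([hex(ord(c)) for c in misread])
```

Only the codes 0-127 agree between the two sets, so plain ASCII text survives the trip while accented letters do not.
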
> - used in Windows with some additional characters

Particularly 128-159, which are empty in ISO, but also quite a few
other random changes that foul up 8-bit exchanges (that is why Windows
texts make such a mess on Usenet).

> (it exists no code called ANSI !)

The name ANSI properly refers to a set of terminal control codes
(function key reporting, cursor positioning, font switching etc.)
derived from the DEC VT100 terminal code. What is today often
miscalled ANSI is the PC/MS-DOS character set, which is actually pure
IBM Codepage 437.

> 1987 Unicode (UCS)
> - developed by Apple and Xerox

Actually derived from a Xerox character set, with additions by Apple
and others.

> - at the moment only used by Windows NT and Java

And Linux (as alternative to ISO8859-1, the default).

> 1992 ISO/IEC 10646-1
> - it becomes international standard

Actually ISO10646 is a 32 bit code with Unicode as its codes 0-65535.
Also, ISO10646 defines UTF-8, an encoding built from sequences of
single bytes: 1 byte for chars 0-127 (= ASCII), 2 bytes for 128-2047,
3 bytes for 2048-65535 ... up to a maximum of 6 bytes.

-- 
Neil Franklin, neil@franklin.ch.remove http://neil.franklin.ch/
Nerd, Geek, Hacker, Unix Wizzard, Sysadmin, Roleplayer, Mystic
Computer: a toy, speeds work so that you have more time to play
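
As a postscript to the UTF-8 byte counts given in the post, here is a minimal Python sketch of the length rule. The function name `utf8_length` and the table `UTF8_LIMITS` are made up for the illustration. The limits for 4, 5 and 6 bytes are not stated in the post; they are filled in from the original 1-to-6-byte definition of UTF-8. Later revisions of UTF-8 stop at 4 bytes, which is what Python's own encoder implements.

```python
# Largest code point (inclusive) that fits in 1, 2, ... 6 bytes under the
# original UTF-8 scheme. The first three match the ranges in the post;
# the last three are an addition for completeness.
UTF8_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]

def utf8_length(code_point):
    """Return the number of bytes the original UTF-8 scheme needs."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    for n_bytes, limit in enumerate(UTF8_LIMITS, start=1):
        if code_point <= limit:
            return n_bytes
    raise ValueError("code point does not fit in 31 bits")

# Compare against Python's encoder for a few sample characters.
for cp in (0x41, 0xE4, 0x20AC):
    encoded = chr(cp).encode("utf-8")
    print(hex(cp), utf8_length(cp), len(encoded), encoded.hex())
```

For the three sample characters the computed length and the length of the encoded bytes should agree, since all of them lie in the 1-to-3-byte ranges listed in the post.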