


Annex A


8-Bit Character Sets

This Annex to the Guide to the Use of Character Sets in Europe provides more detailed information about 8-bit character set standards than is found in the main body of the Guide. Annex B deals in more detail with the Universal Multi-octet Coded Character Set (UCS) specified in ISO/IEC 10646-1.

The need to represent characters by bit combinations (binary numbers) is central to the storage and processing of data by computer systems and to the interchange of data between such systems. This annex gives guidance on the many standards and other specifications that were developed to address the issues arising from this need, up to the advent of the multi-octet code structure embodied in ISO/IEC 10646-1:1993.

Table of Contents


1 More about this annex
2 Limitations of this annex
3 User Requirements
3.1 Language Support
3.2 Page and Display Formats
3.3 European Requirements
4 Introduction to character sets
4.1 Historical background
4.1.1 The first binary codes
4.1.1.1 The legacy of Baudot
4.1.1.2 Locking shifts
4.1.1.3 National variants
4.1.2 ASCII
4.1.2.1 A 7-bit code
4.1.2.2 The legacy of paper tape
4.1.2.3 94 characters
4.1.2.4 Built-in extendability
4.1.2.5 International adoption
4.1.3 The world after ASCII
4.1.3.1 7-bit codes in an 8-bit world
4.1.3.2 8-bit codes
4.1.3.3 Locking shifts again
4.1.3.4 The International Register
4.1.3.5 Limits on expansion
4.1.4 The future is 16-bit
4.2 Concepts and terminology
4.2.1 Basic principles of ISO/IEC 2022
4.2.2 Code tables
4.2.2.1 Layout and notation
4.2.2.2 Structure
4.2.2.3 Escape sequences
4.2.3 Code elements
4.2.3.1 Code elements G0, G1, G2 and G3 of graphic characters
4.2.3.2 Code elements C0 and C1 of control characters
4.2.3.3 Other control functions
4.2.4 Repertoire of a code
4.2.5 Formal definitions
5 Technical Guidance
5.1 Application Environments
5.1.1 Features of sequential access
5.1.2 Features of random access
5.1.3 Use of code extension techniques
5.1.4 Restriction to subrepertoires
5.2 Graphic Characters
5.2.1 94 and 96 position character sets
5.2.2 Single-byte and multiple-byte character sets
5.2.2.1 Nesting of character sets
5.2.2.2 Coding of nested sets
5.2.2.3 Chinese, Japanese and Korean national standards
5.2.2.4 Variable-length coding
5.2.3 Combining characters
5.3 Control Functions
5.3.1 Primary sets of control functions
5.3.2 Supplementary sets of control functions
5.3.3 Escape sequences
5.3.3.1 General construction
5.3.3.2 Two-byte escape sequences
5.3.3.3 Escape sequences with Intermediate Bytes
5.3.4 Code extension
5.3.4.1 Locking shifts
5.3.4.2 Single shifts
5.3.4.3 Designation of sets of control functions
5.3.4.4 Designation of sets of graphic characters
5.3.4.5 Announcement functions
5.3.5 Control sequences
5.3.6 Control strings
5.3.7 Control functions for text communication
6 Guides to standards
6.1 International Standards
6.1.1 ISO/IEC 646
6.1.1.1 Current edition
6.1.1.2 Description
6.1.1.3 Tutorial guidance
6.1.2 ISO/IEC 2022
6.1.2.1 Current edition
6.1.2.2 Description
6.1.2.3 Tutorial guidance
6.1.3 ISO 2375
6.1.3.1 Current edition
6.1.3.2 Description
6.1.3.3 Tutorial guidance
6.1.4 ISO/IEC 4873
6.1.4.1 Current edition
6.1.4.2 Description
6.1.4.3 Tutorial guidance
6.1.5 ISO/IEC 6429
6.1.5.1 Current edition
6.1.5.2 Description
6.1.5.3 Tutorial guidance
6.1.6 ISO/IEC 6937
6.1.6.1 Current edition
6.1.6.2 Description
6.1.6.3 Tutorial guidance
6.1.7 ISO/IEC 7350
6.1.7.1 Current edition
6.1.7.2 Description
6.1.7.3 Tutorial guidance
6.1.8 ISO/IEC 8859
6.1.8.1 Current edition
6.1.8.2 Description
6.1.8.3 Tutorial guidance
6.1.9 ISO/IEC 10367
6.1.9.1 Current edition
6.1.9.2 Description
6.1.9.3 Tutorial guidance
6.1.10 ISO/IEC 10538
6.1.10.1 Current edition
6.1.10.2 Description
6.1.10.3 Tutorial guidance
6.1.11 ISO/IEC 10646
6.1.11.1 Current edition
6.1.11.2 Description
6.1.11.3 Tutorial guidance
6.1.12 ISO/IEC ISP 12070
6.1.12.1 Current edition
6.1.12.2 Description
6.1.12.3 Tutorial guidance
6.2 International Registers
6.2.1 ISO 2375 Register (International Register of Coded Character Sets to be used with Escape Sequences)
6.2.1.1 Current edition
6.2.1.2 Description
6.2.1.3 Tutorial guidance
6.3 European Standards
6.3.1 EN 1922
6.3.1.1 Current edition
6.3.1.2 Description
6.3.1.3 Tutorial guidance
6.3.2 EN 1923
6.3.2.1 Current edition
6.3.2.2 Description
6.3.2.3 Tutorial guidance



1 More about this annex


The need for compatibility between newer and older equipment means that present-day standards carry legacies of decisions taken many years ago. The reasons behind those decisions are often no longer relevant, and their legacies may now appear merely as unnecessary oddities and complexities. This annex includes some historical background, but that background is not needed to understand the remainder of the text.

As work on character sets has developed, the concepts involved have been gradually refined. As a result, character set standards and other literature use technical terms that can be a barrier to the reader. It may be helpful to read section 4.2, Concepts and terminology, before exploring the remaining sections in detail.

This gradual evolution of character set standards has produced technical innovations designed to increase the capabilities of coded character sets while remaining backward compatible with what has gone before. Within this evolved framework it is now possible to support a wide range of languages. However, the wider the range of languages that must be supported simultaneously, the more complex the techniques required. For further information see section 3.1, Language Support.

Not all of these technical innovations are compatible with every way in which applications may use character data. Section 5.1, Application Environments, provides guidance on these limitations.

Other sections provide greater detail on particular issues.



