<!--
HP1 is EM
HP2 is STRONG
-->
<!ENTITY tab "	">
<!ENTITY vbar SDATA "font=symbol charset=fontspecific code=189">
<!ENTITY tilde CDATA "~">
<!ENTITY asterisk CDATA "*">
<!ENTITY mdash SDATA "font=symbol charset=fontspecific code=190">
<!ENTITY copyr "©">
<!ENTITY lbrk CDATA "[">
<!ENTITY rbrk CDATA "]">
<!ENTITY odq "“">
<!ENTITY eodq "”">
<!ENTITY cdq "”">
<!ENTITY amp CDATA "&">
<!ENTITY gt CDATA ">">
<!ENTITY lt CDATA "<">
<!ENTITY xclm CDATA "!">
<!ENTITY caret "¬">
<!ELEMENT break - O EMPTY>
<!ELEMENT nline - O EMPTY>
<USERDOC>
<PROLOG>
<TITLE>
<TOPIC>Information Processing -- Text and Office Systems --
Standard Generalized Markup Language (SGML)</TOPIC></TITLE>
<DOCNUM>ISO 8879:1986/A1:1988(E)</DOCNUM></PROLOG>
<FRONTM>
<TIPAGE>
<NOTICES>
<P>
<STRONG>Foreword</STRONG></P>
<P>ISO (the International Organization for Standardization) is a
worldwide federation of national standards bodies (ISO member bodies).
The work of preparing International Standards is normally carried out
through ISO technical committees.
Each member body interested in a subject for which a technical committee
has been established has the right to be represented on that committee.
International organizations, governmental and non-governmental, in
liaison with ISO, also take part in the work.</P>
<P>Draft International Standards adopted by the technical committees are
circulated to the member bodies for approval before their acceptance as
International Standards by the ISO Council.
They are approved in accordance with ISO procedures requiring at least
75 % approval by the member bodies voting.</P>
<P>International Standard ISO 8879 was prepared by Technical Committee
ISO/TC 97,
<EM>Information processing systems</EM>.</P>
<P>Users should note that all International Standards undergo revision
from time to time and that any reference made herein to any other
International Standard implies its latest edition, unless otherwise
stated.</P></NOTICES>
<VNOTICE>
<LOSTDATA></LOSTDATA>
<COPRNOTE>© International Organization for Standardization, 1986</COPRNOTE></VNOTICE>
<PREFACE TOPICID="INTRO">
<SPECHD>Introduction</SPECHD>
<LOSTDATA></LOSTDATA>
<P>This International Standard specifies a language for document representation
referred to as the
<Q>Standard Generalized Markup Language</Q> (SGML).
SGML can be used for publishing in its broadest definition,
ranging from single medium conventional publishing
to multi-media data base publishing.
SGML can also be used in office document processing
when the benefits of human readability
and interchange with publishing systems
are required.</P>
<H2>Background</H2>
<LOSTDATA></LOSTDATA>
<P>A document can be viewed in the abstract
as a structure of various types of element.
An author organizes a book into chapters
that contain paragraphs, for example,
and figures that contain figure captions.
An editor organizes a magazine into articles
that contain paragraphs that contain words, and so on.</P>
<P>Processors treat these elements in different ways.
A formatting program might print headings in prominent type face,
leave space between paragraphs,
and otherwise visually convey the structure and other attributes to the reader.
An information retrieval system would perhaps assign extra significance
to words in a heading when creating a dictionary.</P>
<P>Although this connection between a document's attributes
and its processing now seems obvious,
it tended to be obscured by early text processing methods.
In the days before automated typesetting,
an editor would
<Q>mark up</Q> a manuscript
with the specific processing instructions
that would create the desired format when executed by a compositor.
Any connection between the instructions and the document's structure
was purely in the editor's head.</P>
<P>Early computerized systems continued this approach
by adding the process-specific
<Q>markup</Q>
to the machine readable document file.
The markup still consisted of specific processing instructions,
but now they were in the language of a formatting program,
rather than a human compositor.
The file could not easily be used for a different purpose,
or on a different computer system,
without changing all the markup.</P>
<P>As users became more sophisticated,
and as text processors became more powerful,
approaches were developed that alleviated this problem.
<Q>Macro calls</Q> (or
<Q>format calls</Q>) were used
to identify points in the document where processing was to occur.
The actual processing instructions were kept outside of the document,
in
<Q>procedures</Q>
(or
<Q>macro definitions</Q> or
<Q>stored formats</Q>),
where they could more easily be changed.</P>
<P>While the macro calls could be placed anywhere in a document,
users began to recognize
that most were placed at the start or end of document elements.
It was natural, therefore, to choose names for such macros
that were
<Q>generic identifiers</Q> of the element types,
rather than names that suggested particular processing
(for example,
<Q>heading</Q> rather than
<Q>format-17</Q>),
and so the practice of
<Q>generic coding</Q>
(or
<Q>generalized tagging</Q>) began.</P>
<P>Generic coding was a major step towards
making automated text processing systems
reflect the natural relationship between document attributes and processing.
The advent of
<Q>generalized markup languages</Q> in the early 1970's
carried this trend further
by providing a formal language base for generic coding.
A generalized markup language observes two main principles:
<OL>
<LI>Descriptive markup predominates
and is distinguished from processing instructions.
<P>Descriptive markup includes both generic identifiers and other attributes
of document elements that motivate processing instructions.
The processing instructions, which can be in any language,
are normally collected outside of the document in procedures.</P>
<P>As the source file is scanned for markup
and the various elements are recognized,
the processing system executes the procedures associated
with each element and attribute for that process.
For other processes, different procedures can be associated
with the same elements and attributes
without changing the document markup.</P>
<P>When a processing instruction must be entered directly in a document,
it is delimited differently from descriptive markup
so that it can easily be located and changed for different processes.</P></LI>
<LI>Markup is formally defined for each type of document.
<P>A generalized markup language formalizes document markup
by incorporating
<Q>document type definitions</Q>.
Type definitions include a specification (like a formal grammar)
of which elements and attributes can occur in a document
and in what order.
With this information, it is possible to determine
whether the markup for an individual document is correct
(that is, complies with the type definition)
and also to supply markup that is missing,
because it can be inferred unambiguously from other markup that is present.</P></LI></OL></P>
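<NT>The first principle above can be sketched informally in Python.
This is a minimal illustration, not part of the language definition:
the document content, the generic identifiers
<Q>heading</Q> and
<Q>p</Q>, and every function and table name are hypothetical,
and the document is assumed to have been parsed already.
The same descriptive markup is handed to two different sets of procedures
without being changed.</NT>

```python
# Hypothetical illustration: generic identifiers are mapped to procedures
# that are kept outside the document.

# A tiny "document", already parsed into (generic identifier, content) pairs.
DOCUMENT = [
    ("heading", "Background"),
    ("p", "A document can be viewed as a structure of elements."),
]


def format_heading(text):
    """Formatting procedure: render a heading prominently."""
    return text.upper()


def format_paragraph(text):
    """Formatting procedure: leave space after each paragraph."""
    return text + "\n"


def index_heading(text):
    """Retrieval procedure: give words in a heading extra weight."""
    return [(word.lower(), 2) for word in text.split()]


def index_paragraph(text):
    """Retrieval procedure: give body words normal weight."""
    return [(word.lower().strip("."), 1) for word in text.split()]


# Two processes, each with its own procedure per generic identifier.
FORMATTER = {"heading": format_heading, "p": format_paragraph}
INDEXER = {"heading": index_heading, "p": index_paragraph}


def process(document, procedures):
    """Run the procedure associated with each element, in document order."""
    return [procedures[gi](content) for gi, content in document]


formatted = process(DOCUMENT, FORMATTER)
indexed = process(DOCUMENT, INDEXER)
```

<NT>Switching from formatting to indexing changes only the table of procedures
passed to
<Q>process</Q>; the marked-up document itself is untouched.</NT>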
<NT>A more detailed introduction to the concepts of generic coding
and the Standard Generalized Markup Language can be found in annex A.</NT>
<H2>Objectives</H2>
<LOSTDATA></LOSTDATA>
<P>The Standard Generalized Markup Language standardizes the application
of the generic coding and generalized markup concepts.
It provides a coherent and unambiguous syntax
for describing whatever a user chooses to identify within a document.
The language includes:
<UL>
<LI>An
<Q>abstract syntax</Q> for descriptive markup of document elements.</LI>
<LI>A
<Q>reference concrete syntax</Q> that binds the abstract syntax
to particular delimiter characters and quantities.
Users can define alternative concrete syntaxes to meet their requirements.</LI>
<LI>Markup declarations that allow the user to define a specific vocabulary
of generic identifiers and attributes for different document types.</LI>
<LI>Provisions for arbitrary data content.
In generalized markup,
<Q>data</Q> is anything that is not defined by the markup language.
This can include specialized
<Q>data content notations</Q>
that require interpretation different from general text:
formulas, images, non-Latin alphabets,
previously formatted text, or graphics.</LI>
<LI>Entity references:
a non-system-specific technique for referring to content
located outside the mainstream of the document,
such as separately written chapters, pi characters, photographs, etc.</LI>
<LI>Special delimiters for processing instructions
to distinguish them from descriptive markup.
Processing instructions can be entered when needed
for situations that cannot be handled by the procedures,
but they can easily be found and modified later
when a document is sent to a different processing system.</LI></UL></P>
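<NT>The distinction drawn above between the abstract syntax
and a concrete syntax can be sketched informally in Python.
The delimiter role names and the strings in REFERENCE_SYNTAX
follow the reference concrete syntax;
VARIANT_SYNTAX and both helper functions are hypothetical,
invented for this illustration,
and the variant has not been checked against the criteria
for variant concrete syntaxes.</NT>

```python
# Delimiter roles belong to the abstract syntax; the strings bound to them
# belong to a concrete syntax.
REFERENCE_SYNTAX = {
    "STAGO": "<",   # start-tag open
    "ETAGO": "</",  # end-tag open
    "TAGC": ">",    # tag close
    "ERO": "&",     # entity reference open
    "REFC": ";",    # reference close
}

# A hypothetical variant concrete syntax, invented for this sketch.
VARIANT_SYNTAX = {
    "STAGO": "{",
    "ETAGO": "{/",
    "TAGC": "}",
    "ERO": "@",
    "REFC": ";",
}


def element(syntax, gi, content):
    """Write an element with the delimiters of the given concrete syntax."""
    start_tag = syntax["STAGO"] + gi + syntax["TAGC"]
    end_tag = syntax["ETAGO"] + gi + syntax["TAGC"]
    return start_tag + content + end_tag


def entity_reference(syntax, name):
    """Write a general entity reference in the given concrete syntax."""
    return syntax["ERO"] + name + syntax["REFC"]


in_reference = element(REFERENCE_SYNTAX, "q", "markup")
in_variant = element(VARIANT_SYNTAX, "q", "markup")
reference_entity = entity_reference(REFERENCE_SYNTAX, "mdash")
```

<NT>Both results describe the same element in the abstract syntax;
only the characters bound to the delimiter roles differ.</NT>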
<P>To be an acceptable standard, however, a generalized markup language
must do more than provide the required functional capabilities.
The language must have metalinguistic properties,
in order to satisfy the constraints imposed
by the need to use it in a multiplicity of environments.
The major constraints,
and the means by which the Standard Generalized Markup Language
addresses them,
can be summarized as follows:
<OL>
<LI>Documents
<Q>marked up</Q> with the language must be processable
by a wide range of text processing and word processing systems.
<P>The full form of the language, with all optional features, offers
generality and flexibility that can be exploited by sophisticated systems;
less powerful systems need not support the features.
To facilitate interchange between dissimilar systems,
an
<Q>SGML declaration</Q> describes any markup features
or concrete syntax variations used in a document.</P></LI>
<LI>The millions of existing text entry devices must be supported.
<P>SGML documents, with the reference concrete syntax,
can easily be keyboarded and understood by humans,
without machine assistance.
As a result:
<UL SPREAD="COMPACT">
<LI>Use of SGML need not await the development and acceptance
of a new generation of hardware
-- just software to process the documents on existing machines.</LI>
<LI>Migration to such a new generation (when it comes) will be easier,
as users will already be familiar with SGML.</LI></UL></P></LI>
<LI>There must be no character set dependency,
as documents may be keyed on a variety of devices.
<P>The language has no dependency on a particular character set.
Any character set that has bit combinations for
letters, numerals, space, and delimiters is acceptable.</P></LI>
<LI>There must be no processing, system, or device dependencies.
<P>Generalized markup is predominantly descriptive
and therefore inherently free of such dependencies.
The occasional processing instruction is specially delimited
so it can be found and converted for interchange,
or when a different process renders the instruction irrelevant.</P>
<P>References to external parts of a document are indirect.
The mappings to real system storage
are made in
<Q>external entity declarations</Q>
that occur at the start of the document,
where they can easily be modified for interchange.</P>
<P>The concrete syntax can be changed with the SGML declaration
to accommodate any reserved system characters.</P></LI>
<LI>There must be no national language bias.
<P>The characters used for names can be augmented
by any special national characters.
Generic identifiers, attribute names,
and other names used in descriptive markup
are defined by the user in element and entity declarations.</P>
<P>The declaration names and keywords used in markup declarations
can also be changed.</P>
<P>Multiple character repertoires, as used in multi-lingual documents,
are supported.</P></LI>
<LI>The language must accommodate familiar
typewriter and word processor conventions.
<P>The
<Q>short reference</Q> and
<Q>data tag</Q> capabilities
support typewriter text entry conventions.
Normal text containing paragraphs and quotations is interpretable as SGML
although it is keyable with no visible markup.</P></LI>
<LI>The language must not depend on a particular data stream
or physical file organization.
<P>The markup language has a virtual storage model
in which documents consist of one or more storage entities,
each of which is a sequence of characters.
All real file access is handled by the processing system,
which can decide
whether the character sequence should be viewed as continuous,
or whether it should reflect physical record boundaries.</P></LI>
<LI>
<Q>Marked up</Q> text must coexist with other data.
<P>A processing system can allow text that conforms to this International Standard
to occur in a data stream with other material,
as long as the system can locate the start and end of the conforming text.</P>
<P>Similarly, a system can allow data content not defined by SGML
to occur logically within a conforming document.
The occurrence of such data is indicated by markup declarations
to facilitate interchange.</P></LI>
<LI>The markup must be usable by both humans and programs.
<P>The Standard Generalized Markup Language is intended as a suitable interface
for keyboarding and interchange without preprocessors.
It allows
extensive tailoring to accommodate user preferences in text entry conventions
and the requirements of a variety of keyboards and displays.</P>
<P>However, it is recognized that many implementers will take advantage
of the language's information capture capabilities
to provide intelligent editing or to create SGML documents
from a word-processing front-end environment.
SGML accommodates such uses by providing the following capabilities:
<UL SPREAD="COMPACT">
<LI>Element content can be stored separately from the markup.</LI>
<LI>Control characters can be used as delimiters.</LI>
<LI>Mixed modes of data representation are permitted in a document.</LI>
<LI>Multiple concurrent logical and layout structures are supported.</LI></UL></P></LI></OL></P>
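<NT>The indirection described under constraint 4 above
(references to external parts of a document are resolved
through external entity declarations)
can be sketched informally in Python.
All entity names, storage locations, and function names here are hypothetical,
and the declarations are modelled as a simple table
rather than as real markup declarations.</NT>

```python
# Hypothetical entity names and storage locations, invented for this sketch.
ENTITY_DECLARATIONS = {
    "chapter1": "/home/author/book/ch1.sgm",
    "figure1": "/home/author/book/fig1.img",
}

# The document body refers to entities by name only.
DOCUMENT_REFERENCES = ["chapter1", "figure1"]


def resolve(references, declarations):
    """Map each entity name to the storage location declared for it."""
    return [declarations[name] for name in references]


def redeclare_for_interchange(declarations, new_root):
    """Return new declarations that point at another system's storage.

    Only the declarations change; the references in the document do not.
    """
    return {
        name: new_root + "/" + location.rsplit("/", 1)[-1]
        for name, location in declarations.items()
    }


local_locations = resolve(DOCUMENT_REFERENCES, ENTITY_DECLARATIONS)
moved_declarations = redeclare_for_interchange(ENTITY_DECLARATIONS, "D:/incoming")
moved_locations = resolve(DOCUMENT_REFERENCES, moved_declarations)
```

<NT>Because the references in the document name entities rather than files,
moving the document to another system means editing the declarations only.</NT>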
<H2>Organization</H2>
<LOSTDATA></LOSTDATA>
<P>The organization of this International Standard is as follows:
<OL>
<LI>The physical organization of an SGML document as an entity structure
is specified in clause 6.</LI>
<LI>The logical organization of an SGML document as an element structure,
and its representation with descriptive markup,
is specified in clause 7.</LI>
<LI>Processing instructions are discussed in clause 8.</LI>
<LI>Common markup constructs, such as characters, entity references,
and processing instructions, are covered in clause 9.</LI>
<LI>Markup declarations with general applicability
(comment, entity, and marked section)
are specified in clause 10.</LI>
<LI>Markup declarations that are used primarily to specify
document type definitions
(document type, element, notation,
short reference mapping, and short reference use)
are defined in clause 11.</LI>
<LI>Markup declarations that are used primarily to specify
link process definitions
(link type, link attribute, link set, and link set use)
are defined in clause 12.</LI>
<LI>The SGML declaration, which specifies the document character set,
capacity set, concrete syntax, and features,
is defined in clause 13.</LI>
<LI>The reference concrete syntax
is defined in clause 14.</LI>
<LI>Conformance of documents, applications, and systems
is defined in clause 15.</LI></OL></P>
<P>There are also a number of annexes containing additional information:
they are not an integral part of the body of this International Standard.</P>
<NT>This International Standard is a formal specification of a computer language,
which may prove difficult reading
for those whose expertise is in the production of documents,
rather than compilers.
Annexes A, B, and C discuss the main concepts in an informal tutorial style
that should be more accessible to most readers.
However, the reader should be aware that
those annexes do not cover all SGML constructs,
nor all details of those covered,
and subtle distinctions are frequently ignored
in the interest of presenting a clear overview.</NT></PREFACE></FRONTM>
<BODY>
<H1>Scope</H1>
<LOSTDATA></LOSTDATA>
<P>This International Standard:
<OL>
<LI>Specifies an abstract syntax known as
the Standard Generalized Markup Language (SGML).
The language expresses the description of a document's structure
and other attributes,
as well as other information that makes the markup interpretable.</LI>
<LI>Specifies a reference concrete syntax
that binds the abstract syntax to specific characters and numeric values,
and criteria for defining variant concrete syntaxes.</LI>
<LI>Defines conforming documents in terms of
their use of components of the language.</LI>
<LI>Defines conforming systems in terms of
their ability to process conforming documents
and to recognize markup errors in them.</LI>
<LI>Specifies how data not defined by this International Standard
(such as images, graphics, or formatted text)
can be included in a conforming document.</LI></OL></P>
<NT>This International Standard does not:
<OL>
<LI>Identify or specify
<Q>standard</Q> document types,
document architectures, or text structures.</LI>
<LI>Specify the implementation, architecture, or markup error handling
of conforming systems.</LI>
<LI>Specify how conforming documents are to be created.</LI>
<LI>Specify the data stream, message handling system, file structure,
or other physical representation in which
conforming documents are stored or interchanged,
or any character set or coding scheme into or from which
conforming documents might be translated for such purposes.</LI>
<LI>Specify the data content representation or
notation for images, graphics, formatted text, etc.,
that are included in a conforming document.</LI></OL></NT>
<H1>Field of Application</H1>
<LOSTDATA></LOSTDATA>
<P>The Standard Generalized Markup Language can be used for documents
that are processed by any text processing or word processing system.
It is particularly applicable to:
<OL>
<LI>Documents that are interchanged among systems
with differing text processing languages.</LI>
<LI>Documents that are processed in more than one way,
even when the procedures use the same text processing language.</LI></OL></P>
<P>Documents that exist solely in final
<REV REFID="AMEND1">imaged</REV>
form
are not within the field of application of this International Standard.</P>
<H1>References</H1>
<LOSTDATA>
<FN ID="STGDF">At present at the stage of draft.</FN></LOSTDATA>
<P>ISO 639,
<CIT>Codes for the representation of names of languages.</CIT>
<FNREF REFID="STGDF"></P>
<P>ISO 646,
<CIT>Information processing --
<REV REFID="AMEND1">ISO</REV>
7-bit coded character set for information interchange.</CIT></P>
<P>ISO 9069,
<CIT>Information processing
-- SGML support facilities
-- SGML Document Interchange Format (SDIF).</CIT>
<FNREF REFID="STGDF"></P>
<P>ISO 9070,
<CIT>Information processing
-- SGML support facilities
-- Registration procedures for public text.</CIT>
<FNREF REFID="STGDF"></P>
<P>The following references are used in conjunction with illustrative materials:</P>
<P>ISO 2022,
<CIT>Information processing
-- ISO 7-bit and 8-bit coded character sets
-- Code extension techniques.</CIT></P>
<P>ISO 3166,
<CIT>Codes for the representation of names of countries.</CIT></P>
<P>ISO 4873,
<CIT>Information processing
-- ISO 8-bit code for information interchange
-- Structure and rules for implementation.</CIT></P>
<P>ISO 6937,
<CIT>Information processing
-- Coded character sets for text communication.</CIT></P>
<P>ISO 8632/2,
<CIT>Information processing systems
-- Computer graphics
-- Metafile for the storage and transfer
of picture description information
-- Part 2: Character encoding.</CIT>
<FNREF REFID="STGDF"></P>
<P>ISO 8632/4,
<CIT>Information processing systems
-- Computer graphics
-- Metafile for the storage and transfer
of picture description information
-- Part 4: Clear text encoding.</CIT>
<FNREF REFID="STGDF"></P>
<H1>Definitions</H1>
<LOSTDATA></LOSTDATA>
<NT>The typographic conventions described in 5.1 are employed in this clause.</NT>
<DL TSIZE="0">
<DT>4.1 abstract syntax (of SGML):</DT>
<DD>Rules that define how markup is added to the data of a document,
without regard to the specific characters used to represent the markup.</DD>
<DT>4.2 active document type (declaration):</DT>
<DD>
<REV REFID="AMEND1">A document type that the system has identified to be active.
<NT>An SGML entity is parsed with respect to its active document type, if any,
or if not, with respect to its base document type and any active link types.</NT></REV></DD>
<DT>4.3 active link type (declaration):</DT>
<DD>
<REV REFID="AMEND1">A link process that the system has identified as being active.</REV></DD>
<DT>4.4 ambiguous content model:</DT>
<DD>A
<HPV>content model</HPV> for which an element or character string
occurring in the document instance can satisfy more than
one
<HPV>primitive content token</HPV> without look-ahead.
<NT>Ambiguous content models are prohibited in SGML.</NT></DD>
<DT>4.5 application:</DT>
<DD>Text processing application.</DD>
<DT>4.6 application convention:</DT>
<DD>Application-specific rule governing the text in document areas
that SGML leaves to user choice.
<NT>There are two kinds: content conventions and markup conventions.</NT></DD>
<DT>4.7 application-specific information:</DT>
<DD>A parameter of the
<HPV>SGML declaration</HPV> that specifies information
required by an application and/or its architecture.
<NT>For example, the information could identify
an architecture and/or an application,
or otherwise enable a system to determine whether it can process the document.</NT></DD>
<DT>4.8 associated element type:</DT>
<DD>An element type associated with the subject of a markup declaration
by its
<HPV>associated element type</HPV> parameter.
<REV REFID="AMEND1">
<DT>4.8.1 associated notation (name):</DT>
<DD>A notation name associated with the subject of a markup declaration
by its
<HPV>associated notation name</HPV> parameter.</DD></REV></DD>
<DT>4.9 attribute (of an element):</DT>
<DD>A characteristic quality, other than type or content.</DD>
<DT>4.10 attribute definition:</DT>
<DD>A member of an attribute definition list;
it defines an attribute name, allowed values, and default value.</DD>
<DT>4.11 attribute definition list:</DT>
<DD>A set of one or more attribute definitions defined by the
<HPV>attribute definition list</HPV> parameter
of an attribute definition list declaration.</DD>
<DT>4.12 attribute definition list (declaration):</DT>
<DD>A markup declaration that associates an attribute definition list
with one or more element types.</DD>
<DT>4.13 attribute list:</DT>
<DD>Attribute specification list.</DD>
<DT>4.14 attribute list declaration:</DT>
<DD>Attribute definition list declaration.</DD>
<DT>4.15 attribute specification:</DT>
<DD>A member of an attribute specification list;
it specifies the value of a single attribute.</DD>
<DT>4.16 attribute (specification) list:</DT>
<DD>Markup that is a set of one or more attribute specifications.
<NT>Attribute specification lists occur in start-tags and link sets.</NT></DD>
<DT>4.17 attribute value literal:</DT>
<DD>A delimited character string that is interpreted as an
<HPV>attribute value</HPV> by replacing references
and ignoring or translating function characters.</DD>
<DT>4.18 available public text:</DT>
<DD>Public text that is available to the general public,
though its owner may require
payment of a fee or performance of other conditions.</DD>
<DT>4.19 B sequence:</DT>
<DD>An uninterrupted sequence of upper-case letter
<Q>B</Q> characters;
in a string assigned as a short reference, it denotes a blank sequence
whose minimum length is the length of the B sequence.</DD>
<DT>4.20 base document element:</DT>
<DD>A document element whose document type is the base document type.</DD>
<DT>4.21 base document type:</DT>
<DD>The document type specified by the first document type declaration in a prolog.</DD>
<DT>4.22 basic SGML document:</DT>
<DD>A conforming SGML document
that uses the reference concrete syntax and capacity set
and the SHORTTAG and OMITTAG markup minimization features.
<NT>It also uses the SHORTREF feature
by virtue of using the reference concrete syntax.</NT></DD>
<DT>4.23 bit:</DT>
<DD>Binary digit; that is, either zero or one.</DD>
<DT>4.24 bit combination:</DT>
<DD>An ordered collection of bits, interpretable as a binary number.</DD>
<DT>4.25 blank sequence:</DT>
<DD>An uninterrupted sequence of
<HPT>SPACE</HPT> and/or
<HPT>SEPCHAR</HPT>
characters.</DD>
<DT>4.26 capacity:</DT>
<DD>A named limit on some aspect of the size and complexity of a document,
expressed as a number of points that can be accumulated
for a kind of object or for all objects.
<NT>The set of capacities is defined by the abstract syntax,
but values are assigned to them by individual documents and SGML systems.</NT></DD>
<DT>4.27 capacity set:</DT>
<DD>A set of assignments of numeric values to capacity names.
<NT>In an SGML declaration,
the capacity set identifies the maximum capacity requirements of the document
(its actual requirements may be lower).
A capacity set can also be defined by an application,
to limit the capacity requirements of documents
that implementations of the application must process,
or by a system, to specify the capacity requirements
that it is capable of meeting.</NT></DD>
<DT>4.28 CDATA:</DT>
<DD>Character data.</DD>
<DT>4.29 CDATA entity:</DT>
<DD>Character data entity.</DD>
<DT>4.30 chain of (link) processes:</DT>
<DD>Processes, performed sequentially, that form a chain in which
the source of the first process is an instance of the base document type,
and the result of each process but the last is the source for the next.
Any portion of the chain can be iterated.
<NT>For example, a complex page makeup application could include
three document types
-- logical, galley, and page --
and two link processes
-- justification and castoff.
The justification process would create an instance of a galley
from an instance of a logical document,
and the castoff process would in turn create pages from the galleys.
The two processes could be iterated,
as decisions made during castoff could require
re-justification of the galleys at different sizes.</NT></DD>
<DT>4.31 character:</DT>
<DD>An atom of information with an individual meaning,
defined by a character repertoire.
<NOTEL>
<LI>There are two kinds: graphic character and control character.</LI>
<LI>A character can occur in a context in which it has a meaning,
defined by markup or a data content notation,
that supersedes or supplements its meaning in the character repertoire.</LI></NOTEL></DD>
<DT>4.32 (character) class:</DT>
<DD>A set of characters that have a common purpose in the abstract syntax,
such as non-SGML characters or separator characters.
<NT>Specific characters are assigned to character classes in four different ways:
<OL SPREAD="COMPACT">
<LI>explicitly, by the abstract syntax (
<HPC>Special</HPC>,
<HPC>Digit</HPC>,
<HPC>LC Letter</HPC>, and
<HPC>UC Letter</HPC>);</LI>
<LI>explicitly, by the concrete syntax (
<HPT>LCNMSTRT</HPT>,
<HPT>FUNCHAR</HPT>,
<HPT>SEPCHAR</HPT>, etc.);</LI>
<LI>implicitly, as the result of explicit assignments made to
delimiter roles or other character classes
(
<HPT>DELMCHAR</HPT>, and
<HPT>DATACHAR</HPT>); or</LI>
<LI>explicitly, by the document character set (
<HPT>NONSGML</HPT>).</LI></OL></NT></DD>
<DT>4.33 character data:</DT>
<DD>Zero or more characters that occur in a context in which no markup is recognized,
other than the delimiters that end the
<HPV>character data</HPV>.
Such characters are classified as data characters
because they were declared to be so.</DD>
<DT>4.34 character data entity:</DT>
<DD>An entity whose text is treated as
<HPV>character data</HPV> when referenced
and is not dependent on a specific system, device, or application process.</DD>
<DT>4.35 character entity set:</DT>
<DD>A public entity set consisting of general entities that are graphic characters.
<NOTEL>
<LI>Character entities are used for characters
that have no coded representation in the document character set,
or that cannot be keyboarded conveniently,
or to achieve device independence for characters
whose bit combinations do not cause proper display on all output devices.</LI>
<LI>There are two kinds of character entity sets: definitional and display.</LI></NOTEL></DD>
<DT>4.36 character number:</DT>
<DD>A
<HPV>number</HPV> that represents the base-10 integer
equivalent of the coded representation of a character,
obtained by treating the sequence of bit combinations
as a single base-2 integer.</DD>
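<DD>
<NT>The arithmetic in this definition can be sketched as follows.
This is only an illustration, not part of the definition:
the function name <Q>character_number</Q>, the default size of 8 bits,
and the assumption that the first bit combination is the most significant
are all hypothetical choices made for the example.</NT>

```python
def character_number(bit_combinations, bits_per_combination=8):
    """Return the base-10 integer for a coded representation.

    bit_combinations is a sequence of integers, one per bit combination.
    Assumption for this sketch: the first combination is the most significant.
    """
    number = 0
    for combination in bit_combinations:
        # Every bit combination must fit in the stated (equal) size.
        if combination not in range(2 ** bits_per_combination):
            raise ValueError("bit combination does not fit in the stated size")
        # Shift what has been read so far and append the next combination,
        # which treats the whole sequence as one base-2 integer.
        number = number * 2 ** bits_per_combination + combination
    return number


print(character_number([0b01000001]))              # prints 65
print(character_number([0b00000001, 0b00000000]))  # prints 256
```
</DD>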
<DT>4.37 character reference:</DT>
<DD>A reference that is replaced by a single character.
<NT>There are two kinds: named character reference and numeric character reference.</NT></DD>
<DT>4.38 character repertoire:</DT>
<DD>A set of characters that are used together.
Meanings are defined for each character,
and can also be defined for control sequences of multiple characters.
<NT>When a character occurs in a control sequence,
the meaning of the sequence
supersedes the meanings of the individual characters.</NT></DD>
<DT>4.39 character set:</DT>
<DD>A mapping of a character repertoire onto a code set
such that each character is associated with its coded representation.</DD>
<DT>4.40 (character) string:</DT>
<DD>A sequence of characters.</DD>
<DT>4.41 class:</DT>
<DD>Character class.</DD>
<DT>4.42 code extension:</DT>
<DD>The use of a single coded representation for more than one character,
without changing the document character set.
<NT>When multiple national languages occur in a document,
graphic repertoire code extension may be useful.</NT></DD>
<DT>4.43 code set:</DT>
<DD>A set of bit combinations of equal size,
ordered by their numeric values, which must be consecutive.
<NT>For example, a code set whose bit combinations have 8 bits
(an
<Q>8-bit code</Q>) could consist of as many as 256 bit combinations,
ranging in value from 00000000 through 11111111
(0 through 255 in the decimal number base),
or it could consist of any contiguous subset of those bit combinations.</NT></DD>
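<DD>
<NT>The two conditions in this definition (equal size and consecutive numeric values)
can be checked mechanically, as in the sketch below.
The function name <Q>is_code_set</Q> is hypothetical,
and treating an empty collection as not being a code set
is an assumption of the example, not a statement of this International Standard.</NT>

```python
def is_code_set(values, bits=8):
    """Check whether integer values could form a code set of the given size.

    Assumption for this sketch: an empty collection is rejected.
    """
    ordered = sorted(set(values))
    if not ordered:
        return False
    # Equal size: the smallest and largest values must be expressible in `bits` bits.
    fits = ordered[0] in range(2 ** bits) and ordered[-1] in range(2 ** bits)
    # Consecutive: no numeric value between the smallest and largest is missing.
    consecutive = ordered == list(range(ordered[0], ordered[-1] + 1))
    return fits and consecutive


print(is_code_set(range(256)))      # prints True  (the full 8-bit code)
print(is_code_set(range(32, 127)))  # prints True  (a contiguous subset)
print(is_code_set([0, 1, 3]))       # prints False (2 is missing)
```
</DD>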
<DT>4.44 code set position:</DT>
<DD>The numeric value of a bit combination in a code set.</DD>
<DT>4.45 coded representation:</DT>
<DD>The representation of a character
as a sequence of one or more bit combinations of equal size.</DD>
<DT>4.46 comment:</DT>
<DD>A portion of a markup declaration that contains explanations or remarks
intended to aid persons working with the document.</DD>
<DT>4.47 comment declaration:</DT>
<DD>A markup declaration that contains only comments.</DD>
<DT>4.48 concrete syntax (of SGML):</DT>
<DD>A binding of the abstract syntax to particular delimiter characters,
quantities, markup declaration names, etc.</DD>
<DT>4.49 concrete syntax parameter:</DT>
<DD>A parameter of the SGML declaration that identifies the concrete syntax
used in document elements and (usually) prologs.
<NT>The parameter consists of parameters that identify
the syntax-reference character set, function characters,
shunned characters, naming rules, delimiter use, reserved name use,
and quantitative characteristics.</NT></DD>
<DT>4.50 conforming SGML application:</DT>
<DD>An SGML application that requires documents to be conforming SGML documents,
and whose documentation meets the requirements of this International Standard.</DD>
<DT>4.51 conforming SGML document:</DT>
<DD>An SGML document that complies
with all provisions of this International Standard.</DD>
<DT>4.52 containing element:</DT>
<DD>An element within which a subelement occurs.</DD>
<DT>4.53 content:</DT>
<DD>Characters that occur between the start-tag and end-tag of an element
in a document instance.
They can be interpreted as data, proper subelements, included subelements,
other markup, or a mixture of them.
<NT>If an element has an explicit content reference,
or its declared content is "EMPTY",
the content is empty.
In such a case, the application itself may generate data
and process it as though it were content data.</NT></DD>
<DT>4.54 content convention:</DT>
<DD>An application convention governing data content,
such as a restriction on length, allowable characters,
or use of upper-case and lower-case letters.
<NT>A content convention is essentially an informal data content notation,
usually restricted to a single element type.</NT></DD>
<DT>4.55 (content) model:</DT>
<DD>Parameter of an element declaration that specifies
the
<HPV>model group</HPV> and
<HPV>exceptions</HPV>
that define the allowed
<HPV>content</HPV> of the element.</DD>
<DT>4.56 content model nesting level:</DT>
<DD>The largest number of successive
<HPD>grpo</HPD> or
<HPD>dtgo</HPD>
delimiters that occur in a
<HPV>content model</HPV> without a corresponding
<HPD>grpc</HPD> or
<HPD>dtgc</HPD> delimiter.</DD>
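<DD>
<NT>The count described here can be sketched as a single scan of the content model.
The example assumes the reference concrete syntax delimiter strings
(<Q>(</Q> and <Q>)</Q> for group open and close,
<Q>[</Q> and <Q>]</Q> for data tag group open and close)
and assumes that no literal in the model contains those characters.
The function name <Q>nesting_level</Q> and the sample model are hypothetical.</NT>

```python
def nesting_level(content_model, openers="([", closers=")]"):
    """Return the largest number of group-open delimiters left unclosed at any point.

    Assumptions for this sketch: single-character delimiters as in the
    reference concrete syntax, and no literals containing them.
    """
    depth = 0
    deepest = 0
    for char in content_model:
        if char in openers:
            # One more group is open without a corresponding close.
            depth += 1
            deepest = max(deepest, depth)
        elif char in closers:
            # The most recently opened group has been closed.
            depth -= 1
    return deepest


print(nesting_level("(head, (p | list)+, (fig, (cap | ttl))?)"))  # prints 3
```
</DD>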
<DT>4.57 content reference (attribute):</DT>
<DD>An impliable attribute whose value is referenced by the application
to generate content data.
<NT>When an element has an explicit content reference,
the element's
<HPV>content</HPV> in the document instance is empty.</NT></DD>
<DT>4.58 contextual sequence:</DT>
<DD>A sequence of one or more markup characters
that must follow a delimiter string within the same entity
in order for the string to be recognized as a delimiter.</DD>
<DT>4.59 contextually optional element:</DT>
<DD>An element
<OL SPREAD="COMPACT">
<LI>that can occur only because it is an inclusion;
or</LI>
<LI>whose
<HPV>content token</HPV> in the currently applicable model group
is a contextually optional token.</LI></OL></DD>
<DT>4.60 contextually optional token:</DT>
<DD>A
<HPV>content token</HPV> that
<OL SPREAD="COMPACT">
<LI>is an inherently optional token;
or</LI>
<LI>has a
<HPD>plus</HPD> occurrence indicator and has been satisfied;
or</LI>
<LI>is in a model group that is itself a contextually optional token,
no tokens of which have been satisfied.</LI></OL></DD>
<DT>4.61 contextually required element:</DT>
<DD>An element that is not a contextually optional element and
<OL SPREAD="COMPACT">
<LI>whose
<HPV>generic identifier</HPV> is the
<HPV>document type name</HPV>;
or</LI>
<LI>whose currently applicable model token is a contextually required token.</LI></OL>
<NT>An element could be neither contextually required nor contextually optional;
for example, an element whose currently applicable model token
is in an
<HPD>or</HPD> group that has no inherently optional tokens.</NT></DD>
<DT>4.62 contextually required token:</DT>
<DD>A
<HPV>content token</HPV> that
<OL SPREAD="COMPACT">
<LI>is the only one in its model group;
or</LI>
<LI>is in a
<HPD>seq</HPD> group
<OL SPREAD="COMPACT">
<LI>that
<UL SPREAD="COMPACT">
<LI>is itself a contextually required token; or</LI>
<LI>contains a token which has been satisfied;</LI></UL></LI>
<LI>all preceding tokens of which
<UL SPREAD="COMPACT">
<LI>have been satisfied; or</LI>
<LI>are contextually optional.</LI></UL></LI></OL></LI></OL></DD>
<DT>4.63 control character:</DT>
<DD>A character
that controls the interpretation, presentation, or other processing
of the characters that follow it;
for example, a tab character.</DD>
<DT>4.64 control sequence:</DT>
<DD>A sequence of characters, beginning with a control character,
that controls the interpretation, presentation, or other processing
of the characters that follow it;
for example, an escape sequence.</DD>
<DT>4.65 core concrete syntax:</DT>
<DD>A variant of the reference concrete syntax
that has no short reference delimiters.</DD>
<DT>4.66 corresponding content (of a content token):</DT>
<DD>The element(s) and/or data in a document instance
that correspond to a
<HPV>content token</HPV>.</DD>
<DT>4.67 current attribute:</DT>
<DD>An attribute whose current (that is, most recently specified) value
<REV REFID="AMEND1">becomes its default value.</REV>
<NT>The start-tag cannot be omitted for the first occurrence of an element
with a current attribute.</NT></DD>
<DT>4.68 current element:</DT>
<DD>The open element whose
<HPV>start-tag</HPV> most recently occurred
(or was omitted through markup minimization).</DD>
<DT>4.69 current link set:</DT>
<DD>The link set associated with the current element
by a
<HPV>link set use declaration</HPV> in the element content
<REV REFID="AMEND1">or by a link process definition.</REV>
If the current element has no associated link set,
the previous current link set continues to be the current link set.</DD>
<DT>4.70 current map:</DT>
<DD>The short reference map associated with the current element
by a
<HPV>short reference use</HPV> declaration
in the element content or document type definition.
If the current element has no associated map,
the previous current map continues to be the current map.</DD>
<DT>4.71 current rank:</DT>
<DD>A number that is appended to a rank stem in a tag
to derive the generic identifier.
For a
<HPV>start-tag</HPV> it is the
<HPV>rank suffix</HPV>
of the most recent element with the identical
<HPV>rank stem</HPV>,
or a
<HPV>rank stem</HPV> in the same
<HPV>ranked group</HPV>.
For an
<HPV>end-tag</HPV> it is the
<HPV>rank suffix</HPV>
of the most recent open element with the identical
<HPV>rank stem</HPV>.</DD>
<DT>4.72 data:</DT>
<DD>The characters of a document that represent the inherent information content;
characters that are not recognized as markup.
<REV REFID="AMEND1">
<DT>4.72.1 data attribute:</DT>
<DD>An attribute of the data conforming to a particular data content notation.
<NT>In most cases, the value of the data attributes must be known
before the data can be interpreted in accordance with the notation.</NT></DD></REV></DD>
<DT>4.73 data character:</DT>
<DD>An
<HPV>SGML character</HPV>
that is interpreted as data in the context in which it occurs,
either because it was declared to be data,
or because it was not recognizable as markup.</DD>
<DT>4.74 data content:</DT>
<DD>The portion of an element's
<HPV>content</HPV>
that is data rather than markup or a subelement.</DD>
<DT>4.75 data content notation:</DT>
<DD>An application-specific interpretation of an element's data content,
<REV REFID="AMEND1">or of a data entity,</REV>
that usually extends or differs from the normal meaning of the
document character set.
<NT>It is specified
for an element's
<HPV>content</HPV> by a notation attribute,
<REV REFID="AMEND1">and for a data entity by the
<HPV>notation name</HPV> parameter</REV>
of the entity declaration.</NT>
<REV REFID="AMEND1">
<DT>4.75.1 data entity:</DT>
<DD>An entity that was declared to be data
and therefore is not parsed when referenced.
<NT>There are three kinds:
character data entity, specific character data entity and non-SGML data entity.</NT></DD></REV></DD>
<DT>4.76 data tag:</DT>
<DD>A string that conforms to the data tag pattern of an open element.
It serves both as the
<HPV>end-tag</HPV> of an open element
and as
<HPV>character data</HPV> in the element that contains it.</DD>
<DT>4.77 data tag group:</DT>
<DD>A model group token that associates a data tag pattern
with a target element type.
<NT>Within an instance of a target element,
the data content and that of any subelements
is scanned for a string that conforms to the pattern (a
<Q>data tag</Q>).</NT></DD>
<DT>4.78 data tag pattern:</DT>
<DD>A data tag group token that defines the strings that,
if they occurred in the proper context,
would constitute a data tag.</DD>
<DT>4.79 declaration:</DT>
<DD>Markup declaration.</DD>
<DT>4.80 declaration subset:</DT>
<DD>A delimited portion of a markup declaration
in which other declarations can occur.
<NT>Declaration subsets occur only in
document type, link type, and marked section declarations.</NT></DD>
<DT>4.81 declared concrete syntax:</DT>
<DD>The concrete syntax described by the
<HPV>concrete syntax</HPV> parameter
of the
<HPV>SGML declaration</HPV>.</DD>
<DT>4.82 dedicated data characters:</DT>
<DD>Character class consisting of each
<HPV>SGML character</HPV>
that has no possible meaning as markup;
a member is never treated as anything but a
<HPV>data character</HPV>.</DD>
<DT>4.83 default entity:</DT>
<DD>The entity that is referenced by a general entity reference
with an undeclared name.</DD>
<DT>4.84 default value:</DT>
<DD>A portion of an attribute definition that specifies the attribute value
to be used if there is no
<HPV>attribute specification</HPV> for it.</DD>
<DT>4.85 definitional (character) entity set:</DT>
<DD>A character entity set whose purpose is
to define entity names for graphic characters,