[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The eltype data type has information from `ELEMENT' and `ATTLIST' declarations. It can also store data for the application.
The element types are symbols in a special oblist. The oblist is the table of element types. The symbol's name is the GI, its value is used to store three flags, and the function definition holds the content model. Other information about the element type is stored on the property list.
This function can be used as a place in setf, push and other functions from the CL library.
empty (if declared with a `<!USEMAP gi #EMPTY>') or nil (if no associated map).
The DTD data type is realised as a lisp vector using defstruct.
There are two additional fields for internal use: dependencies and merged.
The tree nodes are represented as lisp vectors, using defstruct to define basic operations.
The Element data type is a view of the tree built by the parser.
PSGML uses finite state machines and a stack to parse SGML. Every element type has an associated DFA (deterministic finite automaton). This DFA is constructed from the content model.
SGML restricts the allowed content models in such a way that it is easy to directly construct a DFA.
To be able to determine when a start-tag can be omitted, the DFA needs to contain more information than a traditional DFA. In PSGML a DFA has a set of states and two sets of edges. The edges are associated with tokens (corresponding to SGML's primitive content tokens). I call these moves. One set of moves, the optional moves, represents optional tokens. I call the other set required moves. The correspondence to the SGML definitions is: if there is precisely one required move from a state, then the associated token is required. A state is final if there is no required move from that state.
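The two kinds of moves can be illustrated with a small sketch. This is not PSGML's own code (PSGML is written in Emacs Lisp); the class name `State`, its fields and its methods are hypothetical names chosen for the illustration, and the way the content model `(a, b?)` is mapped to states is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class State:
    """One DFA state with two separate sets of moves (hypothetical layout)."""
    opts: dict = field(default_factory=dict)  # token -> State, optional moves
    reqs: dict = field(default_factory=dict)  # token -> State, required moves

    def is_final(self):
        # A state is final if there is no required move from it.
        return not self.reqs

    def required_token(self):
        # Precisely one required move means that token is required.
        if len(self.reqs) == 1:
            return next(iter(self.reqs))
        return None

    def next_state(self, token):
        # Follow an optional move if there is one, else a required move.
        if token in self.opts:
            return self.opts[token]
        return self.reqs.get(token)


# Assumed encoding of the content model (a, b?):
# s0 --required a--> s1 --optional b--> s2
s2 = State()
s1 = State(opts={"b": s2})
s0 = State(reqs={"a": s1})
```

Here `s0.required_token()` yields `"a"`, which is the information needed to decide that a start-tag for `a` could be implied, while `s1` is already final because its only move is optional.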
The SGML construct `(...&...&...)' (AND-group) is another problem. There is a simple translation to sequence- and or-connectors. For example `(a & b & c)' can be translated to:
((a, ((c, b) | (b, c))) | (b, ((a, c) | (c, a))) | (c, ((a, b) | (b, a))))
But this grows too fast to be of direct practical use. PSGML represents an AND-group with one DFA for every (SGML) token in the group. During parsing of an AND-group there is a pointer to a state in one of the group's DFAs, and a list of the DFAs for the tokens not yet satisfied. Most of this is hidden by the primitives for the state type. The parser only sees states in a DFA and moves.
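The following sketch illustrates both points: how many sequences the direct translation needs, and how a parser state can instead track one active DFA plus the DFAs not yet satisfied. It is a simplification under stated assumptions, not PSGML's implementation: each token's DFA is modelled as a plain non-empty list of tokens to be matched in order, and the names `ordering_count` and `AndGroupState` are hypothetical.

```python
from itertools import permutations


def ordering_count(tokens):
    """Number of sequences in the direct translation: one per ordering."""
    return sum(1 for _ in permutations(tokens))


class AndGroupState:
    """Tracks progress through an AND-group (simplified, hypothetical)."""

    def __init__(self, dfas):
        # DFAs for tokens not yet satisfied; each is a non-empty token list.
        self.pending = [list(d) for d in dfas]
        # Tokens still expected in the DFA currently being traversed.
        self.current = []

    def accept(self, token):
        if self.current:
            # Inside one of the group's DFAs: only its next token is allowed.
            if self.current[0] == token:
                self.current = self.current[1:]
                return True
            return False
        # Otherwise enter the first pending DFA that starts with this token.
        for i, dfa in enumerate(self.pending):
            if dfa[0] == token:
                self.current = self.pending.pop(i)[1:]
                return True
        return False

    def is_final(self):
        return not self.current and not self.pending


group = AndGroupState([["a"], ["b"], ["c"]])
```

For three tokens `ordering_count` gives 6, matching the six sequences in the translation above, and the count grows factorially with the number of tokens. The `AndGroupState` object needs only one entry per token, whatever order the tokens arrive in.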
sgml-push-to-entity (or similar).
Restore the current buffer to the buffer that was current when the push to this buffer was made.
Use sgml-pop-entity to exit from this buffer.
sgml-dtd-info which contains the DTD (type dtd).
Parse until (at least) goal, a buffer position. Optional argument extra-cond should be a function. This function is called in the parser loop, and the loop is exited if the function returns t. If third argument quit is non-nil, no "`Parsing...'" message will be displayed.
sgml-markup-start pointing to start of short reference and point pointing to the end.
File = Comment,
       File version,
       S-expression --dependencies--,
       Parameter entities,
       Document type name,
       Elements,
       General entities,
       S-expression --shortref maps--,
       S-expression --notations--

Elements = Counted Sequence of S-expression --element type name--,
           Counted Sequence of Element type description

File version = "(sgml-saved-dtd-version 5) "

Comment = (";",
           (CASE
             OF [0-9]
             OF [11-255])*,
           [10] --end of line marker--)*

Element type description = S-expression --Misc info--,
           CASE
            OF [0-7] --Flags 1:stag-opt, 2:etag-opt, 4:mixed--,
               Content specification,
               Token list --includes--,
               Token list --excludes--
            OF [128] --Flag undefined element--

Content specification = CASE
           OF [0] --cdata--
           OF [1] --rcdata--
           OF [2] --empty--
           OF [3] --any--
           OF [4] --undefined--
           OF [128] --model follows--,
              Model --nodes in the finite state automaton--

Model = Counted Sequence of Node

Node = CASE
        OF Normal State
        OF And Node

Normal State = Moves --moves for optional tokens--,
               Moves --moves for required tokens--

Moves = Counted Sequence of (Token,
                             OCTET --state #--)

And Node = [255] --signals an AND node--,
           Number --next state (node number)--,
           Counted Sequence of Model --set of models--

Token = Number --index in list of elements--

Number = CASE
          OF [0-250] --Small number 0--250--
          OF [251-255] --Big number, first octet--,
             OCTET --Big number, second octet--

Token list = Counted Sequence of Token

Parameter entities = S-expression --internal representation of parameter entities--

General entities = S-expression --internal representation of general entities--

Document type name = S-expression --name of document type as a string--

S-expression = OTHER

Counted Sequence = Number_a --length of sequence--,
                   (ARG_1)^a
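As an illustration of how the single-octet tags in this grammar can be read, here is a minimal sketch that decodes the flag octet of an element type description and the tag octet of a content specification. The bit values and tag values are taken from the grammar above; the function names and the dictionary layout are hypothetical and do not come from PSGML.

```python
# Flag bits as listed in the grammar: 1 = stag-opt, 2 = etag-opt, 4 = mixed.
STAG_OPT, ETAG_OPT, MIXED = 1, 2, 4
UNDEFINED_ELEMENT = 128  # flag octet marking an undefined element

# Tag octets of a content specification, as listed in the grammar.
CONTENT_KINDS = {
    0: "cdata",
    1: "rcdata",
    2: "empty",
    3: "any",
    4: "undefined",
    128: "model",  # a Model follows in the file
}


def decode_flags(octet):
    """Return the three flags as booleans, or None for an undefined element."""
    if octet == UNDEFINED_ELEMENT:
        return None
    if not 0 <= octet <= 7:
        raise ValueError(f"unexpected flag octet: {octet}")
    return {
        "stag-opt": bool(octet & STAG_OPT),
        "etag-opt": bool(octet & ETAG_OPT),
        "mixed": bool(octet & MIXED),
    }


def content_kind(octet):
    """Map a content specification tag octet to its name."""
    if octet not in CONTENT_KINDS:
        raise ValueError(f"unexpected content specification octet: {octet}")
    return CONTENT_KINDS[octet]
```

For example, a flag octet of 5 combines stag-opt (1) and mixed (4), and a content specification octet of 128 means that a model has to be read next.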