Class Prawn::Format::Lexer
In: lib/prawn/format/lexer.rb
Parent: Object

The Lexer class is used by the formatting subsystem to scan a string and extract tokens from it. The tokens it looks for are either text, XML entities, or XML tags.

Note that the lexer only scans for a subset of XML—it is not a true XML scanner, and understands just enough to provide a basic markup language for use in formatting documents.

The subset includes only XML entities and tags; processing instructions, comments, and the like are not supported.

Methods

each   new   next

Classes and Modules

Class Prawn::Format::Lexer::InvalidFormat

Constants

ENTITY_MAP = { "lt" => "<", "gt" => ">", "amp" => "&", "mdash" => "\xE2\x80\x94", "ndash" => "\xE2\x80\x93", "nbsp" => "\xC2\xA0", "bull" => "\342\200\242", "quot" => '"', }
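
As a rough illustration of how a table like this can be applied, the sketch below decodes named entities in a string. The `decode_entities` helper and the local `ENTITIES` copy are hypothetical names introduced for this example, not part of Prawn::Format, and leaving unknown names untouched is an assumption; the real lexer may handle them differently.

```ruby
# Local copy of the entity table above. The non-ASCII values are the same
# characters, written as code points instead of raw UTF-8 bytes.
ENTITIES = {
  "lt" => "<", "gt" => ">", "amp" => "&",
  "mdash" => "\u2014", "ndash" => "\u2013",
  "nbsp" => "\u00A0", "bull" => "\u2022", "quot" => '"'
}.freeze

# Hypothetical helper: replaces each "&name;" with its mapped character.
# Assumption: names missing from the map are left as-is.
def decode_entities(text, map = ENTITIES)
  text.gsub(/&(\w+);/) { |match| map.fetch($1, match) }
end

decoded = decode_entities("1 &lt; 2 &amp;&amp; 3 &gt; 2")
puts decoded   # prints: 1 < 2 && 3 > 2
```

Because `gsub` makes a single pass over the input, an escaped ampersand such as `&amp;lt;` decodes to the literal text `&lt;` rather than being decoded twice.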

Attributes

verbatim  [RW]  Controls whether whitespace is lexed verbatim or not. If not, adjacent whitespace is compressed into a single space character (this includes newlines).
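
The effect of the flag can be approximated with a single substitution. This is only a sketch of the behavior described above; `normalize_whitespace` is a hypothetical helper, and the lexer itself may apply the rule token by token rather than to the whole string.

```ruby
# Hypothetical helper mirroring the documented behavior: when verbatim is
# false, every run of whitespace (spaces, tabs, newlines) becomes one space.
def normalize_whitespace(text, verbatim: false)
  verbatim ? text : text.gsub(/\s+/, " ")
end

compressed = normalize_whitespace("one  two\n\tthree")
preserved  = normalize_whitespace("one  two\n\tthree", verbatim: true)

puts compressed.inspect   # prints: "one two three"
puts preserved.inspect    # prints: "one  two\n\tthree"
```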

Public Class methods

new(text)

Create a new lexer that will scan the given text. The text must be UTF-8 encoded, and must consist of well-formed XML in the subset understood by the lexer.

[Source]

    # File lib/prawn/format/lexer.rb, line 31
    def initialize(text)
      @scanner = StringScanner.new(text)
      @state = :start
      @verbatim = false
    end
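
Since the constructor expects UTF-8 input, callers holding text in another encoding may want to transcode it first. The following is a minimal sketch using only core `String` methods; `utf8_for_lexer` is a hypothetical helper, not something the library provides.

```ruby
# Hypothetical helper: returns a UTF-8 copy of +text+, or raises
# ArgumentError when the bytes are invalid in the string's declared
# encoding. The validity check comes first because encoding a UTF-8
# string to UTF-8 does not validate its bytes.
def utf8_for_lexer(text)
  unless text.valid_encoding?
    raise ArgumentError, "invalid #{text.encoding} byte sequence"
  end
  text.encode(Encoding::UTF_8)
end

# "\xE9" is e-acute in ISO-8859-1.
latin1 = "caf\xE9".dup.force_encoding(Encoding::ISO_8859_1)
utf8 = utf8_for_lexer(latin1)
puts utf8.encoding   # prints: UTF-8
```

The result could then be passed to `Prawn::Format::Lexer.new`.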

Public Instance methods

each

Iterates over the tokens in the string until the end of the string is reached, yielding each token in turn. See next for a discussion of the available token types.

[Source]

    # File lib/prawn/format/lexer.rb, line 65
    def each
      while (token = next_token)
        yield token
      end
    end
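
The iteration pattern, pulling tokens until `nil` comes back, can be sketched in isolation. `TokenSource` below is a hypothetical stand-in that serves tokens from an array; including `Enumerable` is a choice made for this example, and nothing here confirms that the lexer does the same.

```ruby
# Stand-in for a lexer: hands out pre-built tokens, then nil.
class TokenSource
  include Enumerable   # assumption for this sketch only

  def initialize(tokens)
    @tokens = tokens.dup
  end

  # Returns the next token, or nil once the tokens are exhausted.
  def next
    @tokens.shift
  end

  def each
    return enum_for(:each) unless block_given?

    # A bare `next` would be parsed as Ruby's loop-control keyword, so the
    # method call needs an explicit receiver.
    while (token = self.next)
      yield token
    end
  end
end

source = TokenSource.new([{ type: :open, tag: :b, options: {} },
                          { type: :text, text: ["Hi"] },
                          { type: :close, tag: :b }])
types = source.map { |token| token[:type] }
puts types.inspect   # prints: [:open, :text, :close]
```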

next

Returns the next token from the scanner. If the end of the string has been reached, this will return nil. Otherwise, the token itself is returned as a hash. The hash will always include a :type key, identifying the type of the token. It will be one of :text, :open, or :close.

For :text tokens, the hash will also contain a :text key, which will point to an array of strings. Each element of the array contains either a word, whitespace, or some other character at which the line may be broken.

For :open tokens, the hash will contain a :tag key which identifies the name of the tag (as a symbol), and an :options key, which is another hash that contains the options that were given with the tag.

For :close tokens, the hash will contain only a :tag key.

[Source]

    # File lib/prawn/format/lexer.rb, line 54
    def next
      if @state == :start && @scanner.eos?
        return nil
      else
        scan_next_token
      end
    end
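
To make the three token shapes concrete, here is a toy lexer built on `StringScanner`. It is not the Prawn::Format implementation: `ToyLexer` is a hypothetical class that handles only tags and plain text (no entities, no self-closing tags), and returning option keys as symbols is an assumption.

```ruby
require "strscan"

# Toy lexer that mimics the documented token hashes.
class ToyLexer
  def initialize(text)
    @scanner = StringScanner.new(text)
  end

  # Returns the next token hash, or nil at the end of the input.
  def next
    return nil if @scanner.eos?

    if @scanner.scan(%r{</(\w+)>})
      { type: :close, tag: @scanner[1].to_sym }
    elsif @scanner.scan(/<(\w+)((?:\s+\w+="[^"]*")*)\s*>/)
      tag = @scanner[1].to_sym
      pairs = @scanner[2].scan(/(\w+)="([^"]*)"/)
      options = pairs.map { |key, value| [key.to_sym, value] }.to_h
      { type: :open, tag: tag, options: options }
    elsif (chunk = @scanner.scan(/[^<]+/))
      # Split into words and whitespace runs, the points at which a line
      # could be broken.
      { type: :text, text: chunk.scan(/\s+|\S+/) }
    else
      raise ArgumentError, "unexpected input at position #{@scanner.pos}"
    end
  end
end

lexer = ToyLexer.new('<b class="x">Hi there</b>')
tokens = []
while (token = lexer.next)
  tokens << token
end

# tokens[0] is the :open token for :b with options { class: "x" }
# tokens[1] is the :text token with text ["Hi", " ", "there"]
# tokens[2] is the :close token for :b
tokens.each { |t| puts t.inspect }
```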