rexical-1.0.8/0000755000004100000410000000000014632135064013203 5ustar www-datawww-datarexical-1.0.8/Manifest.txt0000644000004100000410000000124614632135064015515 0ustar www-datawww-dataCHANGELOG.rdoc COPYING DOCUMENTATION.en.rdoc DOCUMENTATION.ja.rdoc Manifest.txt README.ja README.rdoc Rakefile bin/rex lib/rexical.rb lib/rexical/generator.rb lib/rexical/rexcmd.rb lib/rexical/version.rb sample/a.cmd sample/b.cmd sample/c.cmd sample/calc3.racc sample/calc3.rex sample/calc3.rex.rb sample/calc3.tab.rb sample/error1.rex sample/error2.rex sample/sample.html sample/sample.rex sample/sample.rex.rb sample/sample.xhtml sample/sample1.c sample/sample1.rex sample/sample2.bas sample/sample2.rex sample/simple.html sample/simple.xhtml sample/xhtmlparser.racc sample/xhtmlparser.rex test/assets/test.rex test/rex-20060125.rb test/rex-20060511.rb test/test_generator.rb rexical-1.0.8/bin/0000755000004100000410000000000014632135064013753 5ustar www-datawww-datarexical-1.0.8/bin/rex0000755000004100000410000000067514632135064014507 0ustar www-datawww-data#!/usr/bin/env ruby # # rex # # Copyright (c) 2005-2006 ARIMA Yasuhiro # # This program is free software. # You can distribute/modify this program under the terms of # the GNU LGPL, Lesser General Public License version 2.1. # For details of LGPL, see the file "COPYING". # ## --------------------------------------------------------------------- require 'rubygems' require 'rexical' Rexical::Cmd.new.run rexical-1.0.8/README.rdoc0000644000004100000410000000237614632135064015021 0ustar www-datawww-data= Rexical home :: http://github.com/sparklemotion/rexical/tree/master == DESCRIPTION Rexical is a lexical scanner generator that is used with Racc to generate Ruby programs. Rexical is written in Ruby. == SYNOPSIS Several examples of Rexical grammar files are provided in the sample directory. 
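To make the rule semantics concrete, the following is a hand-written sketch (not Rexical output) of the token loop that a scanner for the Sample definition in this synopsis carries out. It assumes only Ruby's standard StringScanner; the names TinyScanner and RULES are hypothetical and exist only for this illustration. Rules are tried in order, and a rule without an action consumes its match and emits no token.

```ruby
require 'strscan'

# Illustrative sketch only: TinyScanner and RULES are assumed names,
# not part of Rexical and not the code that rex generates.
class TinyScanner
  # [pattern, action] pairs, tried top to bottom like a rex rule section.
  # A nil action means "match and skip" (a rule with no action).
  RULES = [
    [/[ \t]+/, nil],
    [/\d+/,    ->(text) { [:digit, text.to_i] }],
    [/\w+/,    ->(text) { [:word, text] }],
    [/\n/,     nil],
    [/./,      ->(text) { [text, text] }]
  ].freeze

  def initialize(str)
    @ss = StringScanner.new(str)
  end

  # Returns the next [type, value] pair, or nil at end of input.
  def next_token
    until @ss.eos?
      text = nil
      # The first rule whose pattern matches at the current position wins.
      _, action = RULES.find { |pattern, _| text = @ss.scan(pattern) }
      raise "can not match: '#{@ss.rest}'" unless text
      return action.call(text) if action
      # No action: the matched text is dropped and scanning continues.
    end
    nil
  end
end
```

Calling next_token repeatedly on TinyScanner.new("foo 42 +\n") yields [:word, "foo"], [:digit, 42], ["+", "+"] and then nil. A generated scanner additionally tracks line numbers and start states, which this sketch leaves out.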
Here is an example of a lexical definition: class Sample macro BLANK [\ \t]+ rule {BLANK} # no action \d+ { [:digit, text.to_i] } \w+ { [:word, text] } \n . { [text, text] } end Here are examples of the command line usage: $ rex sample1.rex --stub $ ruby sample1.rex.rb sample1.c $ rex sample2.rex --stub $ ruby sample2.rex.rb sample2.bas $ racc calc3.racc $ rex calc3.rex $ ruby calc3.tab.rb The description files for lexical analysis in the sample directory are the files ending with the .rex extension. == REQUIREMENTS * ruby version 1.8.x or later. == INSTALL * sudo gem install rexical == LICENSE Rexical is distributed under the terms of the GNU Lesser General Public License version 2. Note that you do NOT need to follow LGPL for your own parser (Rexical outputs). You can provide those files under any licenses you want. See COPYING for more details. rexical-1.0.8/lib/0000755000004100000410000000000014632135064013751 5ustar www-datawww-datarexical-1.0.8/lib/rexical.rb0000644000004100000410000000015214632135064015723 0ustar www-datawww-datarequire_relative "rexical/generator" require_relative "rexical/rexcmd" require_relative "rexical/version" rexical-1.0.8/lib/rexical/0000755000004100000410000000000014632135064015400 5ustar www-datawww-datarexical-1.0.8/lib/rexical/generator.rb0000644000004100000410000002647214632135064017726 0ustar www-datawww-data# # generator.rb # # Copyright (c) 2005-2006 ARIMA Yasuhiro # # This program is free software. # You can distribute/modify this program under the terms of # the GNU Lesser General Public License version 2 or later. 
# require 'strscan' module Rexical class ParseError < StandardError ; end class Generator attr_accessor :grammar_file attr_accessor :grammar_lines attr_accessor :scanner_file attr_accessor :module_name attr_accessor :class_name attr_accessor :lineno attr_accessor :rules attr_accessor :exclusive_states attr_accessor :ignorecase attr_accessor :independent attr_accessor :debug def initialize(opts) @lineno = 0 @macro = {} @rules = [] @exclusive_states = [nil] @grammar_lines = nil @scanner_header = "" @scanner_footer = "" @scanner_inner = "" @opt = opts end def add_header( st ) @scanner_header += "#{st}\n" end def add_footer( st ) @scanner_footer += "#{st}\n" end def add_inner( st ) @scanner_inner += "#{st}\n" end def add_option( st ) opts = st.split opts.each do |opt| case opt when /ignorecase/i @opt['--ignorecase'] = true when /stub/i @opt['--stub'] = true when /independent/i @opt['--independent'] = true when /matcheos/i @opt['--matcheos'] = true end end end def add_macro( st ) ss = StringScanner.new(st) ss.scan(/\s+/) key = ss.scan(/\S+/) ss.scan(/\s+/) st = ss.post_match len = st.size ndx = 0 while ndx <= len c = st[ndx,1] ndx += 1 case c when '\\' ndx += 1 next when '#', ' ' ndx -= 1 break end end expr = st[0,ndx] expr.gsub!('\ ', ' ') key = '{' + key + '}' @macro.each_pair do |k, e| expr.gsub!(k) { |m| e } end @macro[key] = expr rescue raise ParseError, "parse error in add_macro:'#{st}'" end def add_rule( rule_state, rule_expr, rule_action=nil ) st = rule_expr.dup @macro.each_pair do |k, e| rule_expr.gsub!(k) { |m| e } end if rule_state.to_s[1,1] =~ /[A-Z]/ @exclusive_states << rule_state unless @exclusive_states.include?(rule_state) exclusive_state = rule_state start_state = nil else exclusive_state = nil start_state = rule_state end rule = [exclusive_state, start_state, rule_expr, rule_action] @rules << rule rescue raise ParseError, "parse error in add_rule:'#{st}'" end def read_grammar @grammar_lines = StringScanner.new File.read(grammar_file) end def next_line 
@lineno += 1 @grammar_lines.scan_until(/\n/).chomp rescue nil end def parse state1 = :HEAD state2 = nil state3 = nil lastmodes = [] while st = next_line case state1 when :FOOT add_footer st when :HEAD ss = StringScanner.new(st) if ss.scan(/class/) state1 = :CLASS st = ss.post_match.strip @class_name = st else add_header st end when :CLASS s = st.strip next if s.size == 0 or s[0,1] == '#' ss = StringScanner.new(st) if ss.scan(/option.*$/) state2 = :OPTION next end if ss.scan(/inner.*$/) state2 = :INNER next end if ss.scan(/macro.*$/) state2 = :MACRO next end if ss.scan(/rule.*$/) state2 = :RULE next end if ss.scan(/end.*$/) state1 = :FOOT next end case state2 when :OPTION add_option st when :INNER add_inner st when :MACRO add_macro st when :RULE case state3 when nil rule_state, rule_expr, rule_action = parse_rule(st) if rule_action =~ /\s*\{/ lastmodes = parse_action(rule_action, lastmodes) if lastmodes.empty? add_rule rule_state, rule_expr, rule_action else state3 = :CONT rule_action += "\n" end else add_rule rule_state, rule_expr end when :CONT rule_action += "#{st}\n" lastmodes = parse_action(st, lastmodes) if lastmodes.empty? state3 = nil add_rule rule_state, rule_expr, rule_action else end end # case state3 end # case state2 end # case state1 end # while end def parse_rule(st) st.strip! return if st.size == 0 or st[0,1] == '#' ss = StringScanner.new(st) ss.scan(/\s+/) rule_state = ss.scan(/\:\S+/) ss.scan(/\s+/) rule_expr = ss.scan(/\S+/) ss.scan(/\s+/) [rule_state, rule_expr, ss.post_match] end def parse_action(st, lastmodes=[]) modes = lastmodes mode = lastmodes[-1] ss = StringScanner.new(st) until ss.eos? 
c = ss.scan(/./) case c when '#' if (mode == :brace) or (mode == nil) #p [c, mode, modes] return modes end when '{' if (mode == :brace) or (mode == nil) mode = :brace modes.push mode end when '}' if (mode == :brace) modes.pop mode = modes[0] end when "'" if (mode == :brace) mode = :quote modes.push mode elsif (mode == :quote) modes.pop mode = modes[0] end when '"' if (mode == :brace) mode = :doublequote modes.push mode elsif (mode == :doublequote) modes.pop mode = modes[0] end when '`' if (mode == :brace) mode = :backquote modes.push mode elsif (mode == :backquote) modes.pop mode = modes[0] end end end #p [c, mode, modes] return modes end REX_HEADER = <<-REX_EOT.gsub(/^ {6}/, '') #-- # DO NOT MODIFY!!!! # This file is automatically generated by rex %s # from lexical definition file "%s". #++ REX_EOT REX_UTIL = <<-REX_EOT require 'strscan' class ScanError < StandardError ; end attr_reader :lineno attr_reader :filename attr_accessor :state def scan_setup(str) @ss = StringScanner.new(str) @lineno = 1 @state = nil end def action yield end def scan_str(str) scan_setup(str) do_parse end alias :scan :scan_str def load_file( filename ) @filename = filename File.open(filename, "r") do |f| scan_setup(f.read) end end def scan_file( filename ) load_file(filename) do_parse end REX_EOT REX_STUB = <<-REX_EOT if __FILE__ == $0 exit if ARGV.size != 1 filename = ARGV.shift rex = %s.new begin rex.load_file filename while token = rex.next_token p token end rescue $stderr.printf %s, rex.filename, rex.lineno, $!.message end end REX_EOT def scanner_io unless scanner_file = @opt['--output-file'] scanner_file = grammar_file + ".rb" end File.open(scanner_file, 'wb') end private :scanner_io def write_scanner f = scanner_io flag = "" flag += "i" if @opt['--ignorecase'] f.printf REX_HEADER, Rexical::VERSION, grammar_file unless @opt['--independent'] f.printf "require 'racc/parser'\n" end @scanner_header.each_line do |s| f.print s end if @opt['--independent'] f.puts "class #{@class_name}" else 
f.puts "class #{@class_name} < Racc::Parser" end f.print REX_UTIL eos_check = @opt["--matcheos"] ? "" : "return if @ss.eos?" ## scanner method f.print <<-REX_EOT def next_token #{eos_check} # skips empty actions until token = _next_token or @ss.eos?; end token end def _next_token text = @ss.peek(1) @lineno += 1 if text == "\\n" token = case @state REX_EOT exclusive_states.each do |es| if es.nil? f.printf <<-REX_EOT when #{(["nil"] + rules.collect{ |rule| rule[1].nil? ? "nil" : rule[1] }).uniq.join(', ')} REX_EOT else f.printf <<-REX_EOT when #{es} REX_EOT end f.printf <<-REX_EOT case REX_EOT rules.each do |rule| exclusive_state, start_state, rule_expr, rule_action = *rule if es == exclusive_state if rule_action if start_state f.print <<-REX_EOT when((state == #{start_state}) and (text = @ss.scan(/#{rule_expr}/#{flag}))) action #{rule_action} REX_EOT else f.print <<-REX_EOT when (text = @ss.scan(/#{rule_expr}/#{flag})) action #{rule_action} REX_EOT end else if start_state f.print <<-REX_EOT when (@state == #{start_state}) && (text = @ss.scan(/#{rule_expr}/#{flag})) ; REX_EOT else f.print <<-REX_EOT when (text = @ss.scan(/#{rule_expr}/#{flag})) ; REX_EOT end end end end # rules.each if @opt["--matcheos"] eos_check = <<-REX_EOT when @ss.scan(/$/) ; REX_EOT else eos_check = "" end f.print <<-REX_EOT #{eos_check} else text = @ss.string[@ss.pos ..
-1] raise ScanError, "can not match: '" + text + "'" end # if REX_EOT end # exclusive_states.each f.print <<-REX_EOT else raise ScanError, "undefined state: '" + state.to_s + "'" end # case state REX_EOT if @opt['--debug'] f.print <<-REX_EOT p token REX_EOT end f.print <<-REX_EOT token end # def _next_token REX_EOT @scanner_inner.each_line do |s| f.print s end f.puts "end # class" @scanner_footer.each_line do |s| f.print s end f.printf REX_STUB, @class_name, '"%s:%d:%s\n"' if @opt['--stub'] f.close end ## def write_scanner end ## class Generator end ## module Rexical rexical-1.0.8/lib/rexical/rexcmd.rb0000644000004100000410000000672114632135064017215 0ustar www-datawww-data# # rexcmd.rb # # Copyright (c) 2005-2006 ARIMA Yasuhiro # # This program is free software. # You can distribute/modify this program under the terms of # the GNU LGPL, Lesser General Public License version 2.1. # For details of LGPL, see the file "COPYING". # ## --------------------------------------------------------------------- require 'getoptlong' module Rexical class Cmd OPTIONS = <<-EOT o -o --output-file file name of output [.rb] o -s --stub - append stub code for debug o -i --ignorecase - ignore char case o -C --check-only - syntax check only o - --independent - independent mode o - --matcheos - allow match against end of string o -d --debug - print debug information o -h --help - print this message and quit o - --version - print version and quit o - --copyright - print copyright and quit EOT def run @status = 1 usage 'no grammar file given' if ARGV.empty? 
usage 'too many grammar files given' if ARGV.size > 1 filename = ARGV[0] rex = Rexical::Generator.new(@opt) begin rex.grammar_file = filename rex.read_grammar rex.parse if @opt['--check-only'] $stderr.puts "syntax ok" @status = 0 return 0 end rex.write_scanner @status = 0 rescue Rexical::ParseError, Errno::ENOENT msg = $!.to_s unless /\A\d/ === msg msg[0,0] = ' ' end $stderr.puts "#{@cmd}:#{rex.grammar_file}:#{rex.lineno}:#{msg}" ensure exit @status end end def initialize @status = 2 @cmd = File.basename($0, ".rb") tmp = OPTIONS.lines.collect do |line| next if /\A\s*\z/ === line # disp, sopt, lopt, takearg, doc _, sopt, lopt, takearg, _ = line.strip.split(/\s+/, 5) a = [] a.push lopt unless lopt == '-' a.push sopt unless sopt == '-' a.push takearg == '-' ? GetoptLong::NO_ARGUMENT : GetoptLong::REQUIRED_ARGUMENT a end getopt = GetoptLong.new(*tmp.compact) getopt.quiet = true @opt = {} begin getopt.each do |name, arg| raise GetoptLong::InvalidOption, "#{@cmd}: #{name} given twice" if @opt.key? name @opt[name] = arg.empty? ? true : arg end rescue GetoptLong::AmbiguousOption, GetoptLong::InvalidOption, GetoptLong::MissingArgument, GetoptLong::NeedlessArgument usage $!.message end usage if @opt['--help'] if @opt['--version'] puts "#{@cmd} version #{Rexical::VERSION}" exit 0 end if @opt['--copyright'] puts "#{@cmd} version #{Rexical::VERSION}" puts "#{Rexical::Copyright} <#{Rexical::Mailto}>" exit 0 end end def usage( msg=nil ) f = $stderr f.puts "#{@cmd}: #{msg}" if msg f.print <<-EOT Usage: #{@cmd} [options] Options: EOT OPTIONS.each_line do |line| next if line.strip.empty?
if /\A\s*\z/ === line f.puts next end disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5) if disp == 'o' sopt = nil if sopt == '-' lopt = nil if lopt == '-' opt = [sopt, lopt].compact.join(',') takearg = nil if takearg == '-' opt = [opt, takearg].compact.join(' ') f.printf "%-27s %s\n", opt, doc end end exit @status end end end rexical-1.0.8/lib/rexical/version.rb0000644000004100000410000000020414632135064017406 0ustar www-datawww-datamodule Rexical VERSION = "1.0.8" Copyright = "Copyright (c) 2005-2006 ARIMA Yasuhiro" Mailto = "arima.yasuhiro@nifty.com" end rexical-1.0.8/sample/0000755000004100000410000000000014632135064014464 5ustar www-datawww-datarexical-1.0.8/sample/sample.rex.rb0000644000004100000410000000366614632135064017102 0ustar www-datawww-data#-- # DO NOT MODIFY!!!! # This file is automatically generated by rex 1.0.5 # from lexical definition file "sample/sample.rex". #++ require 'racc/parser' # # sample.rex # lexical definition sample for rex # class Sample < Racc::Parser require 'strscan' class ScanError < StandardError ; end attr_reader :lineno attr_reader :filename attr_accessor :state def scan_setup(str) @ss = StringScanner.new(str) @lineno = 1 @state = nil end def action yield end def scan_str(str) scan_setup(str) do_parse end alias :scan :scan_str def load_file( filename ) @filename = filename open(filename, "r") do |f| scan_setup(f.read) end end def scan_file( filename ) load_file(filename) do_parse end def next_token return if @ss.eos? # skips empty actions until token = _next_token or @ss.eos?; end token end def _next_token text = @ss.peek(1) @lineno += 1 if text == "\n" token = case @state when nil case when (text = @ss.scan(/[ \t]+/)) ; when (text = @ss.scan(/\d+/)) action { [:digit, text.to_i] } when (text = @ss.scan(/\w+/)) action { [:word, text] } when (text = @ss.scan(/\n/)) ; when (text = @ss.scan(/./)) action { [text, text] } else text = @ss.string[@ss.pos .. 
-1] raise ScanError, "can not match: '" + text + "'" end # if else raise ScanError, "undefined state: '" + state.to_s + "'" end # case state token end # def _next_token end # class rexical-1.0.8/sample/b.cmd0000644000004100000410000000004214632135064015366 0ustar www-datawww-datacall racc xhtmlparser.racc -v %* rexical-1.0.8/sample/sample1.rex0000644000004100000410000000205114632135064016544 0ustar www-datawww-data# # sample1.rex # lexical definition sample for rex # # usage # rex sample1.rex --stub # ruby sample1.rex.rb sample1.c # class Sample1 macro BLANK \s+ REM_IN \/\* REM_OUT \*\/ REM \/\/ rule # [:state] pattern [actions] # remark {REM_IN} { self.state = :REMS; [:rem_in, text] } :REMS {REM_OUT} { self.state = nil; [:rem_out, text] } :REMS .*(?={REM_OUT}) { [:remark, text] } {REM} { self.state = :REM; [:rem_in, text] } :REM \n { self.state = nil; [:rem_out, text] } :REM .*(?=$) { [:remark, text] } # literal \"[^"]*\" { [:string, text] } # " \'[^']\' { [:character, text] } # ' # skip {BLANK} # no action # numeric \d+ { [:digit, text.to_i] } # identifier \w+ { [:word, text] } . { [text, text] } end rexical-1.0.8/sample/simple.xhtml0000644000004100000410000000037514632135064017040 0ustar www-datawww-data

XHTML 1.1

rexical-1.0.8/sample/calc3.rex.rb0000644000004100000410000000351214632135064016574 0ustar www-datawww-data# # DO NOT MODIFY!!!! # This file is automatically generated by rex 1.0.0 # from lexical definition file "calc3.rex". # require 'racc/parser' # # calc3.rex # lexical scanner definition for rex # class Calculator3 < Racc::Parser require 'strscan' class ScanError < StandardError ; end attr_reader :lineno attr_reader :filename def scan_setup ; end def action &block yield end def scan_str( str ) scan_evaluate str do_parse end def load_file( filename ) @filename = filename open(filename, "r") do |f| scan_evaluate f.read end end def scan_file( filename ) load_file filename do_parse end def next_token @rex_tokens.shift end def scan_evaluate( str ) scan_setup @rex_tokens = [] @lineno = 1 ss = StringScanner.new(str) state = nil until ss.eos? text = ss.peek(1) @lineno += 1 if text == "\n" case state when nil case when (text = ss.scan(/\s+/)) ; when (text = ss.scan(/\d+/)) @rex_tokens.push action { [:NUMBER, text.to_i] } when (text = ss.scan(/.|\n/)) @rex_tokens.push action { [text, text] } else text = ss.string[ss.pos .. -1] raise ScanError, "can not match: '" + text + "'" end # if else raise ScanError, "undefined state: '" + state.to_s + "'" end # case state end # until ss end # def scan_evaluate end # class if __FILE__ == $0 exit if ARGV.size != 1 filename = ARGV.shift rex = Calculator3.new begin rex.load_file filename while token = rex.next_token p token end rescue $stderr.printf "%s:%d:%s\n", rex.filename, rex.lineno, $!.message end end rexical-1.0.8/sample/xhtmlparser.rex0000644000004100000410000000532514632135064017562 0ustar www-datawww-data# # xhtmlparser.rex # lexical scanner definition for rex # # usage # rex xhtmlparser.rex --stub # ruby xhtmlparser.rex.rb sample.xhtml # class XHTMLParser option ignorecase macro BLANK \s+ TAG_IN \< TAG_OUT \> ETAG_IN \<\/ ETAG_OUT \/\> XTAG_IN \<\? XTAG_OUT \?\> EXT \! 
REM \-\- EQUAL \= Q1 \' Q2 \" rule # [:state] pattern [actions] {XTAG_IN} { self.state = :TAG; [:xtag_in, text] } {ETAG_IN} { self.state = :TAG; [:etag_in, text] } {TAG_IN} { self.state = :TAG; [:tag_in, text] } :TAG {EXT} { self.state = :EXT; [:ext, text] } :EXT {REM} { self.state = :REM; [:rem_in, text] } :EXT {XTAG_OUT} { self.state = nil; [:xtag_out, text] } :EXT {TAG_OUT} { self.state = nil; [:tag_out, text] } :EXT .+(?={REM}) { [:exttext, text] } :EXT .+(?={TAG_OUT}) { [:exttext, text] } :EXT .+(?=$) { [:exttext, text] } :EXT \n :REM {REM} { self.state = :EXT; [:rem_out, text] } :REM .+(?={REM}) { [:remtext, text] } :REM .+(?=$) { [:remtext, text] } :REM \n :TAG {BLANK} :TAG {XTAG_OUT} { self.state = nil; [:xtag_out, text] } :TAG {ETAG_OUT} { self.state = nil; [:etag_out, text] } :TAG {TAG_OUT} { self.state = nil; [:tag_out, text] } :TAG {EQUAL} { [:equal, text] } :TAG {Q1} { self.state = :Q1; [:quote1, text] } # ' :Q1 {Q1} { self.state = :TAG; [:quote1, text] } # ' :Q1 [^{Q1}]+(?={Q1}) { [:value, text] } # ' :TAG {Q2} { self.state = :Q2; [:quote2, text] } # " :Q2 {Q2} { self.state = :TAG; [:quote2, text] } # " :Q2 [^{Q2}]+(?={Q2}) { [:value, text] } # " :TAG [\w\-]+(?={EQUAL}) { [:attr, text] } :TAG [\w\-]+ { [:element, text] } \s+(?=\S) .*\S(?=\s*{ETAG_IN}) { [:text, text] } .*\S(?=\s*{TAG_IN}) { [:text, text] } .*\S(?=\s*$) { [:text, text] } \s+(?=$) inner end rexical-1.0.8/sample/xhtmlparser.racc0000644000004100000410000000243314632135064017671 0ustar www-datawww-data# # xml parser # class XHTMLParser rule target : /* none */ | xml_doc xml_doc : xml_header extra xml_body | xml_header xml_body | xml_body xml_header : xtag_in element attributes xtag_out xml_body : tag_from contents tag_to tag_from : tag_in element attributes tag_out tag_empty : tag_in element attributes etag_out tag_to : etag_in element tag_out attributes : /* none */ | attributes attribute attribute : attr equal quoted quoted : quote1 value quote1 | quote2 value quote2 contents : /* none 
*/ | contents content content : text | extra | tag_from contents tag_to | tag_empty extra : tag_in ext extra_texts tag_out extra_texts : /* none */ | extra_texts rem_in remtexts rem_out | extra_texts exttext remtexts : remtext | remtexts remtext end ---- header ---- # # generated by racc # require 'xhtmlparser.rex' ---- inner ---- ---- footer ---- exit if ARGV.size == 0 filename = ARGV.shift htmlparser = XHTMLParser.new htmlparser.scan_file filename rexical-1.0.8/sample/sample2.rex0000644000004100000410000000145514632135064016554 0ustar www-datawww-data# # sample2.rex # lexical definition sample for rex # # usage # rex sample2.rex --stub # ruby sample2.rex.rb sample2.bas # class Sample2 option ignorecase macro BLANK \s+ REMARK \' # ' rule {REMARK} { self.state = :REM; [:rem_in, text] } # ' :REM \n { self.state = nil; [:rem_out, text] } :REM .*(?=$) { [:remark, text] } \"[^"]*\" { [:string, text] } # " {BLANK} # no action INPUT { [:input, text] } PRINT { [:print, text] } \d+ { [:digit, text.to_i] } \w+ { [:word, text] } . { [text, text] } end rexical-1.0.8/sample/calc3.rex0000644000004100000410000000033114632135064016166 0ustar www-datawww-data# # calc3.rex # lexical scanner definition for rex # class Calculator3 macro BLANK \s+ DIGIT \d+ rule {BLANK} {DIGIT} { [:NUMBER, text.to_i] } .|\n { [text, text] } inner end rexical-1.0.8/sample/calc3.tab.rb0000644000004100000410000000706314632135064016551 0ustar www-datawww-data# # DO NOT MODIFY!!!! # This file is automatically generated by racc 1.4.4 # from racc grammar file "calc3.racc". 
# require 'racc/parser' # # generated by racc # require 'calc3.rex' class Calculator3 < Racc::Parser ##### racc 1.4.4 generates ### racc_reduce_table = [ 0, 0, :racc_error, 1, 11, :_reduce_none, 0, 11, :_reduce_2, 3, 12, :_reduce_3, 3, 12, :_reduce_4, 3, 12, :_reduce_5, 3, 12, :_reduce_6, 3, 12, :_reduce_7, 2, 12, :_reduce_8, 1, 12, :_reduce_none ] racc_reduce_n = 10 racc_shift_n = 19 racc_action_table = [ 7, 8, 9, 10, 6, 18, 3, 4, 11, 5, 7, 8, 9, 10, 3, 4, 13, 5, 3, 4, nil, 5, 3, 4, nil, 5, 3, 4, nil, 5, 3, 4, nil, 5, 7, 8, 7, 8 ] racc_action_check = [ 12, 12, 12, 12, 1, 12, 10, 10, 3, 10, 2, 2, 2, 2, 0, 0, 6, 0, 4, 4, nil, 4, 9, 9, nil, 9, 8, 8, nil, 8, 7, 7, nil, 7, 17, 17, 16, 16 ] racc_action_pointer = [ 8, 4, 7, -1, 12, nil, 16, 24, 20, 16, 0, nil, -3, nil, nil, nil, 33, 31, nil ] racc_action_default = [ -2, -10, -1, -10, -10, -9, -10, -10, -10, -10, -10, -8, -10, 19, -5, -6, -3, -4, -7 ] racc_goto_table = [ 2, 1, nil, nil, 12, nil, nil, 14, 15, 16, 17 ] racc_goto_check = [ 2, 1, nil, nil, 2, nil, nil, 2, 2, 2, 2 ] racc_goto_pointer = [ nil, 1, 0 ] racc_goto_default = [ nil, nil, nil ] racc_token_table = { false => 0, Object.new => 1, :UMINUS => 2, "*" => 3, "/" => 4, "+" => 5, "-" => 6, "(" => 7, ")" => 8, :NUMBER => 9 } racc_use_result_var = false racc_nt_base = 10 Racc_arg = [ racc_action_table, racc_action_check, racc_action_default, racc_action_pointer, racc_goto_table, racc_goto_check, racc_goto_default, racc_goto_pointer, racc_nt_base, racc_reduce_table, racc_token_table, racc_shift_n, racc_reduce_n, racc_use_result_var ] Racc_token_to_s_table = [ '$end', 'error', 'UMINUS', '"*"', '"/"', '"+"', '"-"', '"("', '")"', 'NUMBER', '$start', 'target', 'exp'] Racc_debug_parser = false ##### racc system variables end ##### # reduce 0 omitted # reduce 1 omitted module_eval <<'.,.,', 'calc3.racc', 13 def _reduce_2( val, _values) 0 end .,., module_eval <<'.,.,', 'calc3.racc', 15 def _reduce_3( val, _values) val[0] + val[2] end .,., module_eval <<'.,.,', 
'calc3.racc', 16 def _reduce_4( val, _values) val[0] - val[2] end .,., module_eval <<'.,.,', 'calc3.racc', 17 def _reduce_5( val, _values) val[0] * val[2] end .,., module_eval <<'.,.,', 'calc3.racc', 18 def _reduce_6( val, _values) val[0] / val[2] end .,., module_eval <<'.,.,', 'calc3.racc', 19 def _reduce_7( val, _values) val[1] end .,., module_eval <<'.,.,', 'calc3.racc', 20 def _reduce_8( val, _values) -(val[1]) end .,., # reduce 9 omitted def _reduce_none( val, _values) val[0] end end # class Calculator3 puts 'sample calc' puts '"q" to quit.' calc = Calculator3.new while true print '>>> '; $stdout.flush str = $stdin.gets.strip break if /q/i === str begin p calc.scan_str(str) rescue ParseError puts 'parse error' end end rexical-1.0.8/sample/sample1.c0000644000004100000410000000017214632135064016172 0ustar www-datawww-data int main(int argc, char **argv) { /* block remark */ int i = 100; // inline remark printf("hello, world\n"); } rexical-1.0.8/sample/calc3.racc0000644000004100000410000000147014632135064016305 0ustar www-datawww-data# # A simple calculator, version 3. # class Calculator3 prechigh nonassoc UMINUS left '*' '/' left '+' '-' preclow options no_result_var rule target : exp | /* none */ { 0 } exp : exp '+' exp { val[0] + val[2] } | exp '-' exp { val[0] - val[2] } | exp '*' exp { val[0] * val[2] } | exp '/' exp { val[0] / val[2] } | '(' exp ')' { val[1] } | '-' NUMBER =UMINUS { -(val[1]) } | NUMBER end ---- header ---- # # generated by racc # require 'calc3.rex' ---- inner ---- ---- footer ---- puts 'sample calc' puts '"q" to quit.' 
calc = Calculator3.new while true print '>>> '; $stdout.flush str = $stdin.gets.strip break if /q/i === str begin p calc.scan_str(str) rescue ParseError puts 'parse error' end end rexical-1.0.8/sample/sample2.bas0000644000004100000410000000006714632135064016521 0ustar www-datawww-data' inline remark i = 100 input st print "hello, world" rexical-1.0.8/sample/sample.html0000644000004100000410000000146714632135064016643 0ustar www-datawww-data Title

HTML 4.01

rexical-1.0.8/sample/sample.rex0000644000004100000410000000036614632135064016472 0ustar www-datawww-data# # sample.rex # lexical definition sample for rex # class Sample macro BLANK [\ \t]+ rule {BLANK} # no action \d+ { [:digit, text.to_i] } \w+ { [:word, text] } \n . { [text, text] } end rexical-1.0.8/sample/a.cmd0000644000004100000410000000004014632135064015363 0ustar www-datawww-datacall rex xhtmlparser.rex -s %* rexical-1.0.8/sample/c.cmd0000644000004100000410000000025114632135064015371 0ustar www-datawww-data:ruby xhtmlparser.tab.rb simple.html %* :ruby xhtmlparser.tab.rb simple.xhtml %* :ruby xhtmlparser.tab.rb sample.html %* ruby xhtmlparser.tab.rb sample.xhtml %* rexical-1.0.8/sample/sample.xhtml0000644000004100000410000000160014632135064017020 0ustar www-datawww-data Title

XHTML 1.1

rexical-1.0.8/sample/simple.html0000644000004100000410000000010714632135064016641 0ustar www-datawww-data

Hello World.

rexical-1.0.8/sample/error2.rex0000644000004100000410000000041514632135064016417 0ustar www-datawww-data# # error2.rex # lexical definition sample for rex # class Error2 macro BLANK [\ \t]+ rule {BLANK} # no action \d+ { [:digit, text.to_i] } \w+ { [:word, text] } \n . { self.state = :NONDEF ; [text, text] } end rexical-1.0.8/sample/error1.rex0000644000004100000410000000036714632135064016424 0ustar www-datawww-data# # error1.rex # lexical definition sample for rex # class Error1 macro BLANK [\ \t]+ rule {BLANK} # no action \d+ { [:digit, text.to_i] } \w+ { [:word, text] } \n # . { [text, text] } end rexical-1.0.8/test/0000755000004100000410000000000014632135064014162 5ustar www-datawww-datarexical-1.0.8/test/test_generator.rb0000644000004100000410000001424014632135064017535 0ustar www-datawww-datagem "minitest" require 'minitest/autorun' require 'tempfile' require 'rexical' require 'stringio' require 'open3' class TestGenerator < Minitest::Test def test_header_is_written_after_module rex = Rexical::Generator.new( "--independent" => true ) rex.grammar_file = File.join File.dirname(__FILE__), 'assets', 'test.rex' rex.read_grammar rex.parse output = StringIO.new rex.write_scanner output comments = [] output.string.split(/[\n]/).each do |line| comments << line.chomp if line =~ /^#/ end assert_match 'DO NOT MODIFY', comments.join assert_equal '#--', comments.first assert_equal '#++', comments.last end def test_rubocop_security rex = Rexical::Generator.new( "--independent" => true ) rex.grammar_file = File.join File.dirname(__FILE__), 'assets', 'test.rex' rex.read_grammar rex.parse output = Tempfile.new(["rex_output", ".rb"]) begin rex.write_scanner output output.close stdin, stdoe, wait_thr = Open3.popen2e "rubocop --only Security #{output.path}" if ! wait_thr.value.success?
        fail stdoe.read
      end
    ensure
      output.close
      output.unlink
    end
  end

  def test_read_non_existent_file
    rex = Rexical::Generator.new(nil)
    rex.grammar_file = 'non_existent_file'
    assert_raises Errno::ENOENT do
      rex.read_grammar
    end
  end

  def test_scanner_nests_classes
    source = parse_lexer %q{
module Foo
class Baz::Calculator < Bar
rule
  \d+       { [:NUMBER, text.to_i] }
  \s+       { [:S, text] }
end
end
    }

    assert_match 'Baz::Calculator < Bar', source
  end

  def test_scanner_inherits
    source = parse_lexer %q{
class Calculator < Bar
rule
  \d+       { [:NUMBER, text.to_i] }
  \s+       { [:S, text] }
end
    }

    assert_match 'Calculator < Bar', source
  end

  def test_scanner_inherits_many_levels
    source = parse_lexer %q{
class Calculator < Foo::Bar
rule
  \d+       { [:NUMBER, text.to_i] }
  \s+       { [:S, text] }
end
    }

    assert_match 'Calculator < Foo::Bar', source
  end

  def test_stateful_lexer
    m = build_lexer %q{
class Foo
rule
            \d      { @state = :digit; [:foo, text] }
  :digit    \w      { @state = nil; [:w, text] }
end
    }
    scanner = m::Foo.new
    scanner.scan_setup('1w1')
    assert_tokens [
      [:foo, '1'],
      [:w, 'w'],
      [:foo, '1']], scanner
  end

  def test_simple_scanner
    m = build_lexer %q{
class Calculator
rule
  \d+       { [:NUMBER, text.to_i] }
  \s+       { [:S, text] }
end
    }

    calc = m::Calculator.new
    calc.scan_setup('1 2 10')

    assert_tokens [[:NUMBER, 1],
                   [:S, ' '],
                   [:NUMBER, 2],
                   [:S, ' '],
                   [:NUMBER, 10]], calc
  end

  def test_simple_scanner_with_empty_action
    m = build_lexer %q{
class Calculator
rule
  \d+       { [:NUMBER, text.to_i] }
  \s+       # skips whitespaces
end
    }

    calc = m::Calculator.new
    calc.scan_setup('1 2 10')

    assert_tokens [[:NUMBER, 1],
                   [:NUMBER, 2],
                   [:NUMBER, 10]], calc
  end

  def test_parses_macros_with_escapes
    source = parse_lexer %q{
class Foo
macro
  w  [\ \t]+
rule
  {w}  { [:SPACE, text] }
end
    }

    assert source.index('@ss.scan(/[ \t]+/))')
  end

  def test_simple_scanner_with_macros
    m = build_lexer %q{
class Calculator
macro
  digit     \d+
rule
  {digit}   { [:NUMBER, text.to_i] }
  \s+       { [:S, text] }
end
    }

    calc = m::Calculator.new
    calc.scan_setup('1 2 10')

    assert_tokens [[:NUMBER, 1],
                   [:S, ' '],
                   [:NUMBER, 2],
                   [:S, ' '],
                   [:NUMBER, 10]], calc
  end

  def test_nested_macros
    source = parse_lexer %q{
class Calculator
macro
  nonascii  [^\0-\177]
  string    "{nonascii}*"
rule
  {string}  { [:STRING, text] }
end
    }

    assert_match '"[^\0-\177]*"', source
  end

  def test_more_nested_macros
    source = parse_lexer %q{
class Calculator
macro
  nonascii  [^\0-\177]
  sing      {nonascii}*
  string    "{sing}"
rule
  {string}  { [:STRING, text] }
end
    }

    assert_match '"[^\0-\177]*"', source
  end

  def test_changing_state_during_lexing
    lexer = build_lexer %q{
class Calculator
rule
      a     { self.state = :B ; [:A, text] }
  :B  b     { self.state = nil ; [:B, text] }
end
    }

    calc1 = lexer::Calculator.new
    calc2 = lexer::Calculator.new
    calc1.scan_setup('aaaaa')
    calc2.scan_setup('ababa')

    # Doesn't lex all 'a's
    assert_raises(lexer::Calculator::ScanError) { tokens(calc1) }

    # Does lex alternating 'a's and 'b's
    calc2.scan_setup('ababa')

    assert_tokens [[:A, 'a'],
                   [:B, 'b'],
                   [:A, 'a'],
                   [:B, 'b'],
                   [:A, 'a']], calc2
  end

  def test_changing_state_is_possible_between_next_token_calls
    lexer = build_lexer %q{
class Calculator
rule
      a     { [:A, text] }
  :B  b     { [:B, text] }
end
    }

    calc = lexer::Calculator.new
    calc.scan_setup('ababa')

    assert_equal [:A, 'a'], calc.next_token
    calc.state = :B
    assert_equal [:B, 'b'], calc.next_token
    calc.state = nil
    assert_equal [:A, 'a'], calc.next_token
    calc.state = :B
    assert_equal [:B, 'b'], calc.next_token
    calc.state = nil
    assert_equal [:A, 'a'], calc.next_token
  end

  def test_match_eos
    lexer = build_lexer %q{
class Calculator
option
  matcheos
rule
      a     { [:A, text] }
      $     { [:EOF, ""] }
  :B  b     { [:B, text] }
    }

    calc = lexer::Calculator.new
    calc.scan_setup("a")
    assert_equal [:A, 'a'], calc.next_token
    assert_equal [:EOF, ""], calc.next_token
  end

  def parse_lexer(str)
    rex = Rexical::Generator.new("--independent" => true)
    out = StringIO.new
    rex.grammar_lines = StringScanner.new(str)
    rex.parse
    rex.write_scanner(out)
    out.string
  end

  def build_lexer(str)
    mod = Module.new
    mod.module_eval(parse_lexer(str))
    mod
  end

  def tokens(scanner)
    tokens = []
    while token = scanner.next_token
      tokens << token
    end
    tokens
  end

  def assert_tokens(expected, scanner)
    assert_equal expected, tokens(scanner)
  end
end
rexical-1.0.8/test/assets/0000755000004100000410000000000014632135064015464 5ustar www-datawww-data
rexical-1.0.8/test/assets/test.rex0000644000004100000410000000026014632135064017161 0ustar www-datawww-data
module A
module B
class C < SomethingElse

macro
  w  [\s\r\n\f]*

rule

# [:state]  pattern  [actions]
  {w}~={w}  { [:INCLUDES, text] }

end
end
end
rexical-1.0.8/test/rex-20060511.rb0000644000004100000410000000727514632135064016214 0ustar www-datawww-data
#!/usr/local/bin/ruby
#
# rex
#
#   Copyright (c) 2005-2006 ARIMA Yasuhiro
#
#   This program is free software.
#   You can distribute/modify this program under the terms of
#   the GNU LGPL, Lesser General Public License version 2.1.
#   For details of LGPL, see the file "COPYING".
#

## ---------------------------------------------------------------------

REX_OPTIONS = <<-EOT

o -o --output-file file name of output [.rb]
o -s --stub             -   append stub code for debug
o -i --ignorecase       -   ignore char case
o -C --check-only       -   syntax check only
o -  --independent      -   independent mode
o -d --debug            -   print debug information
o -h --help             -   print this message and quit
o -  --version          -   print version and quit
o -  --copyright        -   print copyright and quit

EOT

## ---------------------------------------------------------------------

require 'getoptlong'
require 'rex/generator'
require 'rex/info'

## ---------------------------------------------------------------------

class RexRunner

  def run
    @status = 1
    usage 'no grammar file given' if ARGV.empty?
    usage 'too many grammar files given' if ARGV.size > 1
    filename = ARGV[0]

    rex = Rex::Generator.new(@opt)
    begin
      rex.grammar_file = filename
      rex.read_grammar
      rex.parse
      if @opt['--check-only']
        $stderr.puts "syntax ok"
        return 0
      end
      rex.write_scanner
      @status = 0

    rescue Rex::ParseError, Errno::ENOENT
      msg = $!.to_s
      unless /\A\d/ === msg
        msg[0,0] = ' '
      end
      $stderr.puts "#{@cmd}:#{rex.grammar_file}:#{rex.lineno}:#{msg}"

    ensure
      exit @status

    end
  end

  def initialize
    @status = 2
    @cmd = File.basename($0, ".rb")
    tmp = REX_OPTIONS.collect do |line|
      next if /\A\s*\z/ === line
      disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5)
      a = []
      a.push lopt unless lopt == '-'
      a.push sopt unless sopt == '-'
      a.push takearg == '-' ? GetoptLong::NO_ARGUMENT : GetoptLong::REQUIRED_ARGUMENT
      a
    end
    getopt = GetoptLong.new(*tmp.compact)
    getopt.quiet = true

    @opt = {}
    begin
      getopt.each do |name, arg|
        raise GetoptLong::InvalidOption, "#{@cmd}: #{name} given twice" if @opt.key? name
        @opt[name] = arg.empty? ? true : arg
      end
    rescue GetoptLong::AmbiguousOption, GetoptLong::InvalidOption,
           GetoptLong::MissingArgument, GetoptLong::NeedlessArgument
      usage $!.message
    end

    usage if @opt['--help']

    if @opt['--version']
      puts "#{@cmd} version #{Rex::VERSION}"
      exit 0
    end
    if @opt['--copyright']
      puts "#{@cmd} version #{Rex::VERSION}"
      puts "#{Rex::Copyright} <#{Rex::Mailto}>"
      exit 0
    end
  end

  def usage( msg=nil )
    f = $stderr
    f.puts "#{@cmd}: #{msg}" if msg
    f.print <<-EOT
Usage: #{@cmd} [options]
Options:
    EOT

    REX_OPTIONS.each do |line|
      next if line.strip.empty?
      if /\A\s*\z/ === line
        f.puts
        next
      end

      disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5)
      if disp == 'o'
        sopt = nil if sopt == '-'
        lopt = nil if lopt == '-'
        opt = [sopt, lopt].compact.join(',')

        takearg = nil if takearg == '-'
        opt = [opt, takearg].compact.join(' ')

        f.printf "%-27s %s\n", opt, doc
      end
    end

    exit @status
  end
end

RexRunner.new.run
rexical-1.0.8/test/rex-20060125.rb0000644000004100000410000000744414632135064016213 0ustar www-datawww-data
#!C:/Program Files/ruby-1.8/bin/ruby
#
# rex
#
#   Copyright (c) 2005 ARIMA Yasuhiro
#
#   This program is free software.
#   You can distribute/modify this program under the terms of
#   the GNU LGPL, Lesser General Public License version 2.1.
#   For details of LGPL, see the file "COPYING".
#

## ---------------------------------------------------------------------

REX_OPTIONS = <<-EOT

o -o --output-file file name of output [.rb]
o -s --stub             -   append stub code for debug
o -i --ignorecase       -   ignore char case
o -C --check-only       -   syntax check only
o -  --independent      -   independent mode
o -d --debug            -   print debug information
o -h --help             -   print this message and quit
o -  --version          -   print version and quit
o -  --copyright        -   print copyright and quit

EOT

## ---------------------------------------------------------------------

require 'getoptlong'
require 'rex/generator'
require 'rex/info'

## ---------------------------------------------------------------------

=begin
class Rex
  def initialize
  end
end
=end

def main
  $cmd = File.basename($0, ".rb")
  opt = get_options
  filename = ARGV[0]

  rex = Rex::Generator.new(opt)
  begin
    rex.grammar_file = filename
    rex.read_grammar
    rex.parse
    if opt['--check-only']
      $stderr.puts "syntax ok"
      return 0
    end
    rex.write_scanner

  rescue Rex::ParseError, Errno::ENOENT
    msg = $!.to_s
    unless /\A\d/ === msg
      msg[0,0] = ' '
    end
    $stderr.puts "#{$cmd}:#{rex.grammar_file}:#{rex.lineno}:#{msg}"
    return 1

  end
  return 0
end

## ---------------------------------------------------------------------

def get_options
  tmp = REX_OPTIONS.collect do |line|
    next if /\A\s*\z/ === line
    disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5)
    a = []
    a.push lopt unless lopt == '-'
    a.push sopt unless sopt == '-'
    a.push takearg == '-' ? GetoptLong::NO_ARGUMENT : GetoptLong::REQUIRED_ARGUMENT
    a
  end
  getopt = GetoptLong.new(*tmp.compact)
  getopt.quiet = true

  opt = {}
  begin
    getopt.each do |name, arg|
      raise GetoptLong::InvalidOption, "#{$cmd}: #{name} given twice" if opt.key? name
      opt[name] = arg.empty? ? true : arg
    end
  rescue GetoptLong::AmbiguousOption, GetoptLong::InvalidOption,
         GetoptLong::MissingArgument, GetoptLong::NeedlessArgument
    usage 1, $!.message
  end

  usage if opt['--help']

  if opt['--version']
    puts "#{$cmd} version #{Rex::VERSION}"
    exit 0
  end
  if opt['--copyright']
    puts "#{$cmd} version #{Rex::VERSION}"
    puts "#{Rex::Copyright} <#{Rex::Mailto}>"
    exit 0
  end

  usage(1, 'no grammar file given') if ARGV.empty?
  usage(1, 'too many grammar files given') if ARGV.size > 1

  opt
end

## ---------------------------------------------------------------------

def usage(status=0, msg=nil )
  f = (status == 0 ? $stdout : $stderr)
  f.puts "#{$cmd}: #{msg}" if msg
  f.print <<-EOT
Usage: #{$cmd} [options]
Options:
  EOT

  REX_OPTIONS.each do |line|
    next if line.strip.empty?
    if /\A\s*\z/ === line
      f.puts
      next
    end

    disp, sopt, lopt, takearg, doc = line.strip.split(/\s+/, 5)
    if disp == 'o'
      sopt = nil if sopt == '-'
      lopt = nil if lopt == '-'
      opt = [sopt, lopt].compact.join(',')

      takearg = nil if takearg == '-'
      opt = [opt, takearg].compact.join(' ')

      f.printf "%-27s %s\n", opt, doc
    end
  end

  exit status
end

## ---------------------------------------------------------------------

main
rexical-1.0.8/rexical.gemspec0000644000004100000410000000467214632135064016210 0ustar www-datawww-data
#########################################################
# This file has been automatically generated by gem2tgz #
#########################################################
# -*- encoding: utf-8 -*-
# stub: rexical 1.0.8 ruby lib

Gem::Specification.new do |s|
  s.name = "rexical".freeze
  s.version = "1.0.8"

  s.required_rubygems_version = Gem::Requirement.new(">= 0".freeze) if s.respond_to? :required_rubygems_version=
  s.require_paths = ["lib".freeze]
  s.authors = ["Aaron Patterson".freeze]
  s.date = "2024-05-23"
  s.description = "Rexical is a lexical scanner generator that is used with Racc to generate Ruby programs.
Rexical is written in Ruby.".freeze
  s.executables = ["rex".freeze]
  s.extra_rdoc_files = ["CHANGELOG.rdoc".freeze, "DOCUMENTATION.en.rdoc".freeze, "DOCUMENTATION.ja.rdoc".freeze, "README.rdoc".freeze]
  s.files = ["CHANGELOG.rdoc".freeze, "COPYING".freeze, "DOCUMENTATION.en.rdoc".freeze, "DOCUMENTATION.ja.rdoc".freeze, "Manifest.txt".freeze, "README.ja".freeze, "README.rdoc".freeze, "Rakefile".freeze, "bin/rex".freeze, "lib/rexical.rb".freeze, "lib/rexical/generator.rb".freeze, "lib/rexical/rexcmd.rb".freeze, "lib/rexical/version.rb".freeze, "sample/a.cmd".freeze, "sample/b.cmd".freeze, "sample/c.cmd".freeze, "sample/calc3.racc".freeze, "sample/calc3.rex".freeze, "sample/calc3.rex.rb".freeze, "sample/calc3.tab.rb".freeze, "sample/error1.rex".freeze, "sample/error2.rex".freeze, "sample/sample.html".freeze, "sample/sample.rex".freeze, "sample/sample.rex.rb".freeze, "sample/sample.xhtml".freeze, "sample/sample1.c".freeze, "sample/sample1.rex".freeze, "sample/sample2.bas".freeze, "sample/sample2.rex".freeze, "sample/simple.html".freeze, "sample/simple.xhtml".freeze, "sample/xhtmlparser.racc".freeze, "sample/xhtmlparser.rex".freeze, "test/assets/test.rex".freeze, "test/rex-20060125.rb".freeze, "test/rex-20060511.rb".freeze, "test/test_generator.rb".freeze]
  s.homepage = "http://github.com/sparklemotion/rexical".freeze
  s.licenses = ["LGPL-2.1-only".freeze]
  s.rdoc_options = ["--main".freeze, "README.rdoc".freeze]
  s.rubygems_version = "3.3.15".freeze
  s.summary = "Rexical is a lexical scanner generator that is used with Racc to generate Ruby programs".freeze

  if s.respond_to? :specification_version then
    s.specification_version = 4
  end

  if s.respond_to? :add_runtime_dependency then
    s.add_runtime_dependency(%q<getoptlong>.freeze, [">= 0"])
  else
    s.add_dependency(%q<getoptlong>.freeze, [">= 0"])
  end
end
rexical-1.0.8/Rakefile0000644000004100000410000000007014632135064014645 0ustar www-datawww-data
require 'minitest/test_task'
Minitest::TestTask.create
rexical-1.0.8/DOCUMENTATION.ja.rdoc0000644000004100000410000001073614632135064016465 0ustar www-datawww-data
= REX: Ruby Lex for Racc

== Overview

A lexical scanner generator for Ruby, intended for use together with Racc.

== Usage

  rex [options] grammarfile

  -o  --output-file  filename   specify the output file name
  -s  --stub                    append a main routine for debugging
  -i  --ignorecase              ignore character case
  -C  --check-only              check syntax only
      --independent             independent mode
  -d  --debug                   print debug information
  -h  --help                    print usage
      --version                 print version
      --copyright               print copyright information

== Default Output File Name

For foo.rex, the output is written to foo.rex.rb.
The output is intended to be loaded as follows.

  require 'foo.rex'

== Input File Structure

A definition consists of a header section, a rule section, and a footer section, in that order.
The rule section contains several clauses.
Each clause begins with a keyword at the start of a line.

Summary:

  [header section]
  "class" Foo
  ["option"
    [options] ]
  ["inner"
    [method definitions] ]
  ["macro"
    [macro-name  regular-expression] ]
  "rule"
    [start-state]  pattern  [actions]
  "end"
  [footer section]

=== Input File Example

  class Foo
  macro
    BLANK         \s+
    DIGIT         \d+
  rule
    {BLANK}
    {DIGIT}       { [:NUMBER, text.to_i] }
    .             { [text, text] }
  end

== Header Section (optional)

Everything written before the rule section is copied verbatim to the beginning of the output file.

== Footer Section (optional)

Everything written after the rule section is copied verbatim to the end of the output file.

== Rule Section

The rule section runs from the line that begins with the "class" keyword to the line that begins with the "end" keyword.
The name of the class to generate follows the "class" keyword.
If the name is qualified with a module name, the class is defined inside that module.
The generated class inherits from Racc::Parser.

=== Rule Section Examples

  class Foo
  class Bar::Foo

== Options (optional)

This clause begins with the "option" keyword.

  "ignorecase"   ignore character case
  "stub"         append a main routine for debugging
  "independent"  independent mode; the class does not inherit from Racc

== Inner User Code (optional)

This clause begins with the "inner" keyword.
Code defined here is placed inside the generated scanner class.

== Macro Definitions (optional)

This clause begins with the "macro" keyword.
A macro gives a name to a single regular expression.
Whitespace can be included by escaping it with a backslash (\).

=== Macro Definition Examples

  DIGIT         \d+
  IDENT         [a-zA-Z_][a-zA-Z0-9_]*
  BLANK         [\ \t]+
  REMIN         \/\*
  REMOUT        \*\/

== Scanning Rules

This clause begins with the "rule" keyword.

  [state]  pattern  [actions]

=== state: Start State (optional)

A start state is written as an identifier prefixed with ":".
If the letter that follows the colon is uppercase, the state is an exclusive start state.
If it is lowercase, the state is an inclusive start state.
The initial value of the start state, and its value when omitted, is nil.

=== pattern: String Pattern

A regular expression that identifies the string to match.
Macro definitions enclosed in braces can be used inside the regular expression.
To use a regular expression that contains whitespace, define it as a macro.

=== actions: Actions (optional)

An action is executed when its pattern matches.
It defines the processing that creates the appropriate token.
A token is either a two-element array holding a type and a value, or nil.
The following elements are available for creating a token.

  lineno    input line number    ( Read Only )
  text      matched string       ( Read Only )
  state     start state          ( Read/Write )

An action is a Ruby block enclosed in { }.
Do not use constructs that transfer control out of the block
( return, exit, next, break, ... ).
If the action is omitted, the matched string is discarded and scanning moves on.

=== Scanning Rule Examples

  {REMIN}                 { self.state = :REM ; [:REM_IN, text] }
  :REM  {REMOUT}          { self.state = nil ; [:REM_OUT, text] }
  :REM  (.+)(?={REMOUT})  { [:COMMENT, text] }
  {BLANK}
  -?{DIGIT}               { [:NUMBER, text.to_i] }
  {WORD}                  { [:word, text] }
  .                       { [text, text] }

== Comments (optional)

On any line, the text from "#" to the end of the line is a comment.

== Using the Generated Class

=== scan_setup()

An event for initialization when the scanner starts running.
Redefine it as needed.

=== scan_str( str )

Parses a string written in the defined grammar.
The tokens are stored internally.

=== scan_file( filename )

Reads a file written in the defined grammar.
The tokens are stored internally.

=== next_token

Returns the internally stored tokens one at a time.
Returns nil after the last token.

== Notice

This specification is provisional and may change without notice.
rexical-1.0.8/CHANGELOG.rdoc0000644000004100000410000000112114632135064015336 0ustar www-datawww-data
=== 1.0.8 / 2024-05-23

* Dependencies

  * Added `getoptlong` as an explicit dependency since Ruby 3.4 removes it from the standard library.

=== 1.0.7 / 2019-08-06

* Security

  * prefer File.open to Kernel.open

=== 1.0.6

* Bug fixes

  * scanner states work better. Thanks Mat.

=== 1.0.5

* Bug fixes

  * Scanners with nested classes work better

=== 1.0.4

* Bug fixes

  * Generated tokenizer only tokenizes on pulls

=== 1.0.3

* Bug fixes

  * renamed to "Rexical" because someone already has "rex".

=== 1.0.2

* Bug fixes

  * Fixed nested macros so that backslashes will work
rexical-1.0.8/COPYING0000644000004100000410000005573714632135064014257 0ustar www-datawww-data
GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999

Preamble

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users.

This license, the Lesser General Public License, applies to some specially designated software packages--typically libraries--of the Free Software Foundation and other authors who decide to use it. You can use it too, but we suggest you first think carefully about whether this license or the ordinary General Public License is the better strategy to use in any particular case, based on the explanations below.

When we speak of free software, we are referring to freedom of use, not price.
Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish); that you receive source code or can get it if you want it; that you can change the software and use pieces of it in new free programs; and that you are informed that you can do these things. To protect your rights, we need to make restrictions that forbid distributors to deny you these rights or to ask you to surrender these rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library or if you modify it. For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link other code with the library, you must provide complete object files to the recipients, so that they can relink them with the library after making changes to the library and recompiling it. And you must show them these terms so they know their rights. We protect your rights with a two-step method: (1) we copyright the library, and (2) we offer you this license, which gives you legal permission to copy, distribute and/or modify the library. To protect each distributor, we want to make it very clear that there is no warranty for the free library. Also, if the library is modified by someone else and passed on, the recipients should know that what they have is not the original version, so that the original author's reputation will not be affected by problems that might be introduced by others. Finally, software patents pose a constant threat to the existence of any free program. We wish to make sure that a company cannot effectively restrict the users of a free program by obtaining a restrictive license from a patent holder. 
Therefore, we insist that any patent license obtained for a version of the library must be consistent with the full freedom of use specified in this license. Most GNU software, including some libraries, is covered by the ordinary GNU General Public License. This license, the GNU Lesser General Public License, applies to certain designated libraries, and is quite different from the ordinary General Public License. We use this license for certain libraries in order to permit linking those libraries into non-free programs. When a program is linked with a library, whether statically or using a shared library, the combination of the two is legally speaking a combined work, a derivative of the original library. The ordinary General Public License therefore permits such linking only if the entire combination fits its criteria of freedom. The Lesser General Public License permits more lax criteria for linking other code with the library. We call this license the "Lesser" General Public License because it does Less to protect the user's freedom than the ordinary General Public License. It also provides other free software developers Less of an advantage over competing non-free programs. These disadvantages are the reason we use the ordinary General Public License for many libraries. However, the Lesser license provides advantages in certain special circumstances. For example, on rare occasions, there may be a special need to encourage the widest possible use of a certain library, so that it becomes a de-facto standard. To achieve this, non-free programs must be allowed to use the library. A more frequent case is that a free library does the same job as widely used non-free libraries. In this case, there is little to gain by limiting the free library to free software only, so we use the Lesser General Public License. In other cases, permission to use a particular library in non-free programs enables a greater number of people to use a large body of free software. 
For example, permission to use the GNU C Library in non-free programs enables many more people to use the whole GNU operating system, as well as its variant, the GNU/Linux operating system. Although the Lesser General Public License is Less protective of the users' freedom, it does ensure that the user of a program that is linked with the Library has the freedom and the wherewithal to run that program using a modified version of the Library. The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses the library". The former contains code derived from the library, whereas the latter must be combined with the library in order to run. TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any software library or other program which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Lesser General Public License (also called "this License"). Each licensee is addressed as "you". A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables. The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".) "Source code" for a work means the preferred form of the work for making modifications to it. 
For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library. Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does. 1. You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) The modified work must itself be a software library. b) You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change. c) You must cause the whole of the work to be licensed at no charge to all third parties under the terms of this License. 
d) If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful. (For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library. In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. 
You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices. Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy. This option is useful when you wish to copy part of the code of the Library into a program that is not a library. 4. You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange. If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code. 5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License. 
However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. Section 6 states terms for distribution of such executables. When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especially significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law. If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.) Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself. 6. As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications. You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. 
If the work during execution displays copyright notices, you must include the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things: a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. (It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.) b) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (1) uses at run time a copy of the library already present on the user's computer system, rather than copying library functions into the executable, and (2) will operate properly with a modified version of the library, if the user installs one, as long as the modified version is interface-compatible with the version that the work was made with. c) Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the cost of performing this distribution. d) If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place. e) Verify that the user has already received a copy of these materials or that you have already sent this user a copy. 
For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the materials to be distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute. 7. You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted, and provided that you do these two things: a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above. b) Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. 8. You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 9. 
You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it. 10. Each time you redistribute the Library (or any work based on the Library), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties with this License. 11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances. 
It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 12. If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 13. The Free Software Foundation may publish revised and/or new versions of the Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Library specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation. 14. 
If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
END OF TERMS AND CONDITIONS
rexical-1.0.8/README.ja0000644000004100000410000000332214632135064014454 0ustar www-datawww-data
Rexical README
===========

  Rexical is a scanner generator for Ruby.
  It is the Ruby counterpart of lex.
  It is designed to be used together with Racc.


Requirements
--------

  * ruby 1.8 or later


Installation
------------

  In the top directory of the package, enter the following commands.
  ($ is the prompt of a normal user, # is the root prompt.)

      $ ruby setup.rb config
      $ ruby setup.rb setup
     ($ su)
      # ruby setup.rb install

  This installs Rexical into the standard path. To install it into a
  directory of your choice, run setup.rb config with the appropriate
  options. The list of options can be displayed with

      $ ruby setup.rb --help


Testing
------

  Several sample Rexical grammar files are provided under sample/.
  Run the following commands.

      $ rex  sample1.rex  --stub
      $ ruby sample1.rex.rb  sample1.c

      $ rex  sample2.rex  --stub
      $ ruby sample2.rex.rb  sample2.bas

      $ racc calc3.racc
      $ rex  calc3.rex
      $ ruby calc3.tab.rb

  For the detailed Rexical grammar, see the doc/ directory.
  For example definitions, see the sample/ directory.


License
----------

  Rexical is licensed under the GNU Lesser General Public License (LGPL)
  version 2. However, rule files written by users, and the Ruby scripts
  that Rexical generates from them, are not covered by this license.
  Distribute them under any license you like.


Bugs
--------

  If you run into what looks like a bug while using Rexical, please send
  mail to the address below. If possible, attach a grammar file that
  reproduces the bug.


    ARIMA Yasuhiro
    arima.yasuhiro@nifty.com
    http://raa.ruby-lang.org/project/rex/

rexical-1.0.8/DOCUMENTATION.en.rdoc0000644000004100000410000001320614632135064016470 0ustar www-datawww-data
= REX: Ruby Lex for Racc

== About

Lexical Scanner Generator used with Racc for Ruby

== Usage

  rex [options] grammarfile

  -o  --output-file  filename   designate output filename
  -s  --stub                    append stub main for debugging
  -i  --ignorecase              ignore character case
  -C  --check-only              only check syntax
      --independent             independent mode
  -d  --debug                   print debug information
  -h  --help                    print usage
      --version                 print version
      --copyright               print copyright

== Default Output Filename

The destination file for foo.rex is foo.rex.rb.
To use it, include the following in the Ruby source code file.

  require 'foo.rex'

== Grammar File Format

A definition consists of a header section, a rule section, and a footer section.
The rule section includes one or more clauses. Each clause starts with a keyword.

Summary:

  [Header section]
  "class" Foo
  ["option"
    [options] ]
  ["inner"
    [methods] ]
  ["macro"
    [macro-name  regular-expression] ]
  "rule"
    [start-state]  pattern  [actions]
  "end"
  [Footer section]

=== Grammar File Description Example

  class Foo
  macro
    BLANK         \s+
    DIGIT         \d+
  rule
    {BLANK}
    {DIGIT}       { [:NUMBER, text.to_i] }
    .             { [text, text] }
  end

== Header Section ( Optional )

Everything written before the rule section is copied to the beginning of the output file.

== Footer Section ( Optional )

Everything written after the rule section is copied to the end of the output file.

== Rule Section

The rule section starts at the line beginning with the "class" keyword and ends at the line beginning with the "end" keyword.
The class name is specified after the "class" keyword.
If a module name is given as a prefix (for example Bar::Foo), the class is defined inside that module.
A class that inherits Racc::Parser is generated.

=== Example of Rule Section Definition

  class Foo
  class Bar::Foo

== Option Section ( Optional )

This section begins with the "option" keyword.

  "ignorecase"   ignore the character case when pattern matching
  "stub"         append stub main for debugging
  "independent"  independent mode, do not inherit from Racc::Parser

== Inner Section for User Code ( Optional )

This section begins with the "inner" keyword.
The code written here is placed in the body of the generated scanner class.

== Macro Section ( Optional )

This section begins with the "macro" keyword.
Each macro assigns a name to one regular expression.
A space character (0x20) can be included in the regular expression by escaping it with a backslash ( \ ).

=== Example of Macro Definition

  DIGIT         \d+
  IDENT         [a-zA-Z_][a-zA-Z0-9_]*
  BLANK         [\ \t]+
  REMIN         \/\*
  REMOUT        \*\/

== Rule Section

This section begins with the "rule" keyword.

  [state]  pattern  [actions]

=== state: Start State ( Optional )

A start state is indicated by an identifier beginning with ":", that is, a Ruby symbol.
If uppercase letters follow the ":", the state is an exclusive start state.
If lowercase letters follow the ":", the state is an inclusive start state.
The initial value and the default value of the start state are both nil.

=== pattern: String Pattern

A regular expression specifies the character string to match.
A regular expression may refer to a macro definition by enclosing the macro name in curly braces { }.
Use a macro definition when the regular expression includes whitespace.

=== actions: Processing Actions ( Optional )

An action is executed when the pattern is matched.
The action defines the process for creating the appropriate token.
A token is either a two-element array containing a type and a value, or nil.
The following elements can be used to create a token.

  lineno    Line number     ( Read Only )
  text      Matched string  ( Read Only )
  state     Start state     ( Read/Write )

The action is a block of Ruby code enclosed by { }.
Do not use statements that exit the block or otherwise change the control flow ( return, exit, next, break, ... ).
If the action is omitted, the matched character string is discarded, and the process advances to the next scan.

=== Example of Rule Section Definition

        {REMIN}                 { self.state = :REM ; [:REM_IN, text] }
  :REM  {REMOUT}                { self.state = nil ; [:REM_OUT, text] }
  :REM  (.+)(?={REMOUT})        { [:COMMENT, text] }
        {BLANK}
        -?{DIGIT}               { [:NUMBER, text.to_i] }
        {WORD}                  { [:word, text] }
        .                       { [text, text] }

== Comments ( Optional )

Any text from a "#" to the end of the line is a comment.

== Using the Generated Class

=== scan_setup( str )

Initializes the scanner with the str string argument.
This method may be redefined by the user.

=== scan_str( str )

Scans the string str according to the defined grammar.
The tokens are stored internally.

=== scan_file( filename )

Reads the file filename and scans its contents according to the defined grammar.
The tokens are stored internally.

=== next_token

Extracts the internally stored tokens one at a time.
Returns nil when no tokens remain.

== Notice

This specification is provisional and may be changed without prior notice.
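
== Example: Sketch of the Scanning Model

The following is a minimal hand-written sketch of the scanning model described under "Using the Generated Class". It is not the code that rex generates. The class name MiniScanner, the RULES table, and the use of lambdas for actions are assumptions made for illustration only; the method names scan_str and next_token and the state accessor follow the descriptions above. The sketch treats :REM as an exclusive start state in the lex sense (only rules tagged :REM are tried while it is active), which is also an assumption about the intended behavior.

```ruby
require 'strscan'

# Illustrative stand-in for a generated scanner; NOT the output of rex.
class MiniScanner
  # Each rule is [start_state, pattern, action].
  # A nil action means the matched text is discarded.
  RULES = [
    [nil,  %r{/\*},           ->(s, text)  { s.state = :REM; [:REM_IN, text] }],
    [:REM, %r{\*/},           ->(s, text)  { s.state = nil;  [:REM_OUT, text] }],
    [:REM, %r{.+?(?=\*/)}m,   ->(_s, text) { [:COMMENT, text] }],
    [nil,  /[ \t\n]+/,        nil],
    [nil,  /-?\d+/,           ->(_s, text) { [:NUMBER, text.to_i] }],
    [nil,  /\w+/,             ->(_s, text) { [:word, text] }],
    [nil,  /./,               ->(_s, text) { [text, text] }]
  ].freeze

  # Current start state; actions may read and write it.
  attr_accessor :state

  # Scan the whole string and store the resulting tokens internally.
  def scan_str(str)
    @ss = StringScanner.new(str)
    @state = nil
    @tokens = []
    until @ss.eos?
      # Pick the first rule for the current state that matches at the scan pointer.
      rule = RULES.find { |st, pat, _act| st == @state && @ss.match?(pat) }
      raise "cannot match at position #{@ss.pos}" unless rule

      text = @ss.scan(rule[1])
      action = rule[2]
      token = action && action.call(self, text)
      @tokens << token if token
    end
    self
  end

  # Hand out the stored tokens one at a time; nil when none remain.
  def next_token
    @tokens.shift
  end
end

scanner = MiniScanner.new.scan_str("x = -42 /* note */")
while (token = scanner.next_token)
  p token
end
```

Storing every token during scan_str and handing them out from next_token mirrors the "tokens are stored internally" wording above; a parser would call next_token repeatedly until it receives nil.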