Change Log for the C# Version of Coco/R

  • New release: Dec, 22 2014
  • New names for set constants in Parser.
  • New release: Apr, 19 2011
  • Minor changes, cleanup.
  • Support for #line pragmas in the generated Parser.cs.
  • New release: Nov, 16 2010
  • Frame files provided as a command line argument now take precedence over frame files in the source directory of the attributed grammar.
  • The namespace can be set as a directive in the attributed grammar:
    If the namespace is set in the attributed grammar and on the command line, the command line argument takes precedence.
  • New option checkEOF: With the option checkEOF the user can specify whether the generated parser should check if the entire input has been consumed after parsing, i.e., if the token after the start symbol of the grammar is an end-of-file token. The user can enable or disable this check by the following directive in the attributed grammar:
    $checkEOF=true // enable the end of file check (default)
    $checkEOF=false // disable the end of file check
  • Support for UTF-8 input: The token stores the character position in Token.charPos.
  • Support for copyright sections in the generated files. If a file named Copyright.frame is provided, it will be included at the top of the generated scanner and parser.
  • Cleanup: removed the marker $$$ from the end of the frame files.
  • Minor change: Code cleanup.
  • New release: Apr, 23 2010
  • Minor change: Unreachable nonterminals now trigger warnings (previously errors), as in the Java version.
  • New release: Jan, 11 2010
  • More robust scanner generation.
  • New release: Jun, 22 2009
  • More robust UTF-8 handling in ParserGen.CopySourcePart and Scanner.GetString.
  • Simplified Coco.atg (using statements handled by ANY).
  • New release: Mar, 27 2009
  • Support for pragmas which are part of terminal classes (thanks to Serge Voloshenyuk)
  • Minor change: Code cleanup.
  • New release: Nov, 8 2008
  • Bugfix in DFA.NumberNodes.
  • New release: Nov, 6 2008
  • Minor change: Code cleanup.
  • New release: Nov, 1 2008
  • Minor change: More robust Scanner. Buffer.EOF is never assigned to a char (doing so results in an overflow and crashes in a checked environment).
  • New release: Oct, 1 2008
  • Bugfix: Fixed a bug in the construction of the scanner automaton.
  • Minor change: More robust Peek method in Scanner.
  • Minor change: Allow underscores (_) in identifiers.
  • Minor change: The grammar-dependent fields Scanner.start and Parser.set are now static. This speeds up the case where many instances of the compiler are created, especially when the grammar is big but the input sentences are short.
  • Bugfix in DFA.cs, possible crash in generated Scanners with IGNORECASE feature.
  • Enhanced support for input streams: Previously only seekable streams of fixed size (e.g. files) were supported, but not non-seekable streams (e.g. network streams). Now both stream types are supported. Please note that, since the memory buffer keeps the entire history of a non-seekable stream, the maximum supported stream size is limited by the available memory and the runtime environment.
  • Bugfixes in the CSharp 2 Grammar:
    • Report an error on incomplete pre-processor directives
    • Allow unbound-type-names, see ECMA-334: 14.5.11 and 25.5. We allow them everywhere, not only in typeof statements.
  • The output path can be set with the command line option "-o".
  • The main method returns 1 if the grammar contained an error.
  • The declaration of standard whitespaces (namely space) is again done in the file Scanner.frame.
  • Misplaced resolvers now cause warnings instead of errors.
  • The scanners generated by Coco/R can now also process Unicode characters in UTF-8 format. This implies that Coco/R itself supports UTF-8 now.
  • Attributes may now also contain the characters '<' and '>' (e.g. for operators or generic types). Such attributes must be enclosed in <. and .> brackets.
  • Error messages are written to an error stream instead of to the console. The error stream can be changed by the user.
  • The scanner now also recognizes the Unicode byte order mark for UTF-8.
  • The if-else-if cascade of an alternative is no longer optimized into a switch statement if the alternative contains an LL(1) warning, so Coco at least generates compilable code in such a situation.
  • Constant declarations are generated for pragma names in the parser now (in case you want to access those names in semantic actions).
  • Bug fixed in Tab.cs. Coco reported a misplaced resolver if 2 alternatives at the end of a production were deletable and a resolver was placed in front of the first one.
  • Small bug in DFA fixed (EOF was not recognized correctly if ANY was used)
  • Coco/R as well as the generated compilers are reentrant now. That means that all fields and methods are non-static. Please look at the user manual to see how to create and initialize a scanner and a parser object in your compiler.
  • In addition to bracket comments ATG files can also contain end of line comments now (// ... cr lf)
  • Scanners can read arbitrarily large files now (needed for parsing log files with several hundred megabytes).
  • Generated scanners are substantially faster than before (about 30%)
  • Bug fixed: Lexical structures like '(' {char} ')' resulted in an endless loop in the scanner if char was defined as ANY - ')' and if the terminating ')' was missing in the input stream of the generated compiler.
  • If an expression in curly braces or square brackets is deletable (as in [[x]]) a new LL(1) warning is printed:
    contents of [...] or {...} must not be deletable.
  • Blanks are specified as white space in the scanner frame now, so one can delete this if one doesn't want to ignore blanks.
    Caution: Use the latest Coco.exe only with the latest Scanner.frame.
  • When appending a file name to the frame directory path Path.DirectorySeparatorChar is used instead of '\\'.
  • Bug fix in Coco.ATG: invalid TokenFactors and Terms caused Coco to crash.
  • Generation of case-insensitive compilers changed
    - keyword IGNORECASE instead of IGNORE CASE.
    - case is also ignored in tokens and character sets now.
    - User manual changed.
  • The scanner uses \u0100 instead of \0 as an eof character now. This allows \0 to be used in tokens (useful for parsing binary files).
  • Bug fix in the detection of tokens that cannot be distinguished.
  • Various cleanups.
  • Method Tab.IgnoreCase added
  • The frames directory is no longer specified by the environment variable CRFRAMES; it can be specified with the command line option -frames instead.
  • Bug fix in Coco.cs (incorrect handling of command line arguments)
  • Fatal errors abort with System.Environment.Exit(1) instead of System.Environment.Exit(0) now
  • Resolvers sometimes have to be ignored in the computation of symbol sets and sometimes not. The way this was done so far had subtle errors. Changes in Tab.cs:
    - field ignoreRslvs removed
    - method Expected0 added
    - method Expected0 called twice in CheckAlts and once in CheckRes
    - CheckRes: computation of all start symbols of an alternative chain modified
  • Environment variable changed from crframes to CRFRAMES, which makes a difference in Unix-like environments.
  • Characters in the range 128..255 are handled correctly now. So far they were translated to '?'.
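
The Nov, 1 2008 entry above notes that Buffer.EOF must never be assigned to a char. The following C# fragment is a minimal sketch of why, not the code from Scanner.frame. The class name EofSentinelSketch, the helper OverflowsChar, and the sentinel value char.MaxValue + 1 are assumptions made for illustration only.

```csharp
using System;

public static class EofSentinelSketch {
    // Assumption: the sentinel is one past the largest char value,
    // so it cannot collide with any real input character.
    public const int EOF = char.MaxValue + 1;

    // Compare against the sentinel while the value is still an int.
    public static bool IsEof(int ch) {
        return ch == EOF;
    }

    // Returns true if converting ch to char overflows in a checked context.
    public static bool OverflowsChar(int ch) {
        try {
            char c = checked((char)ch);
            return c != ch; // always false when the checked conversion succeeds
        } catch (OverflowException) {
            return true;
        }
    }

    public static void Main() {
        Console.WriteLine(IsEof(EOF));          // True
        Console.WriteLine(OverflowsChar(EOF));  // True
        Console.WriteLine(OverflowsChar('a'));  // False
    }
}
```

Under these assumptions, keeping the current character in an int and testing for end of file before any narrowing to char avoids the overflow in a checked environment.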