Change Log for the Java Version of Coco/R

  • New release: Dec, 22 2014
  • New names for set constants in Parser.
  • New release: Apr, 15 2013
  • Minor changes, more robust attribute handling.
  • New release: Apr, 19 2011
  • Minor changes, code cleanup.
  • New release: Nov, 16 2010
  • Frame files provided as command line argument now take precedence over frame files in the source directory of the attributed grammar.
  • The package can be set as a directive in the attributed grammar:
    If the package is set in the attributed grammar and on the command line, the command line argument takes precedence.
  • New option checkEOF: With the option checkEOF the user can specify whether the generated parser should check if the entire input has been consumed after parsing, i.e., if the token after the start symbol of the grammar is an end-of-file token. The user can enable or disable this check by the following directive in the attributed grammar:
    $checkEOF=true // enable the end of file check (default)
    $checkEOF=false // disable the end of file check
  • Support for UTF-8 input: The token stores the character position in Token.charPos.
  • Support for copyright sections in the generated files. If a file named Copyright.frame is provided, it will be included at the top of the generated scanner and parser.
  • Cleanup, removed the marker $$$ from the end of the frame files.
  • Minor change: Code cleanup.
  • New release: Jan, 11 2010
  • More robust scanner generation.
  • New release: Jun, 22 2009
  • More robust UTF-8 handling in ParserGen.CopySourcePart and Scanner.GetString.
  • Simplified Coco.atg (import statements handled by ANY).
  • New release: Mar, 27 2009
  • Support for pragmas which are part of terminal classes (thanks to Serge Voloshenyuk)
  • Support for the escape sequences vertical tab (\v) and audible bell (\a)
  • Minor change: Code cleanup
  • New release: Nov, 6 2008
  • Minor change: Code cleanup.
  • New release: Nov, 1 2008
  • Minor change: More robust Scanner, which never assigns Buffer.EOF to a char (such an assignment overflows, although it should do no harm).
  • New release: Oct, 1 2008
  • Bugfix: bug in the construction of the scanner automaton fixed.
  • Minor change: More robust Peek method in Scanner.
  • Minor change: Literal check is now handled by a hash-table-lookup instead of an if-else-if cascade.
  • Minor change: Allow underscores (_) in identifiers.
  • Minor change: The grammar-dependent fields Scanner.start and Parser.set are now static. This speeds up the case where many instances of the compiler are created, especially when the grammar is big but the input sentences are short.
  • Bug fix: possible crash in generated scanners with the IGNORECASE feature.
  • Enhanced support for input streams: Previously, files given by file name and file streams given as input streams were supported, but non-seekable streams (e.g. network streams) were not. Now both stream types are supported. Please note that, since the memory buffer keeps the entire history of a stream, the maximum supported stream size is limited by the available memory and the runtime environment.
  • The output path can now be set with the command line option "-o".
  • The main method returns 1 if the grammar contained an error.
  • The declaration of standard whitespaces (namely space) is again done in the file Scanner.frame.
  • Misplaced resolvers cause warnings instead of errors now.
  • Bug fix in handling of generic types in attributes.
  • The scanners generated by Coco/R can now also process Unicode characters in UTF-8 format. This implies that Coco/R itself supports UTF-8 now.
  • Attributes may now also contain the characters '<' and '>' (e.g. for operators or generic types). Such attributes must be enclosed in <. and .> brackets.
  • Error messages are written to an error stream instead of to the console. The error stream can be changed by the user.
  • The scanner now also recognizes the Unicode byte order mark for UTF-8.
  • The if-else-if cascade of an alternative is no longer optimized to a switch statement if the alternative contains an LL(1) warning, so Coco generates at least compilable code in such a situation.
  • Constant declarations are generated for pragma names in the parser now (in case you want to access those names in semantic actions).
  • Bug fixed in Tab.cs. Coco reported a misplaced resolver if 2 alternatives at the end of a production were deletable and a resolver was placed in front of the first one.
  • Small bug in DFA fixed (EOF was not recognized correctly if ANY was used)
  • Coco/R as well as the generated compilers are reentrant now. That means that all fields and methods are non-static. Please look at the user manual to see how to create and initialize a scanner and a parser object in your compiler.
  • In addition to bracket comments ATG files can also contain end of line comments now (// ... cr lf)
  • Scanners can read arbitrarily large files now (needed for parsing log files with several hundred megabytes)
  • Generated scanners are substantially faster than before (about 30%)
  • Bug fix: Lexical structures like '(' {char} ')' resulted in an endless loop in the scanner if char was defined as ANY - ')' and if the terminating ')' was missing in the input stream of the generated compiler.
  • If an expression in curly braces or square brackets is deletable (as in [[x]]) a new LL(1) warning is printed:
    contents of [...] or {...} must not be deletable.
  • Blanks are specified as white space in the scanner frame now, so one can delete this if one doesn't want to ignore blanks.
    Caution: Use the latest Coco.jar only with the latest Scanner.frame.
  • When appending a file name to the frame directory path System.getProperty("file.separator") is used instead of '\\'.
  • Bug fix: incorrect code was generated for CONTEXT phrases.
  • Bug fix in Coco.ATG: invalid TokenFactors and Terms caused Coco to crash.
  • Generation of case-insensitive compilers changed
    - keyword IGNORECASE instead of IGNORE CASE.
    - case is also ignored in tokens and character sets now.
    - User manual changed.
  • The scanner uses \u0100 instead of \0 as an eof character now. This allows \0 to be used in tokens (useful for parsing binary files).
  • Bug fix in the detection of tokens that cannot be distinguished.
  • IO routines changed from Java 1.0 to Java 1.1.
  • Various cleanups.
  • bug fix: Scanner backup file not generated correctly
  • bug fix: Parser backup file not generated correctly
  • Method Tab.IgnoreCase added
  • The frames directory is not specified by the environment variable CRFRAMES any more but can be specified with the command line option -frames
  • Bug fix in (incorrect handling of command line arguments)
  • Errors.errMsgFormat handled as in the C# version now (in
  • Bug fix in Sets.PrintSet (in
  • Fatal errors abort with System.exit(1) instead of System.exit(0) now
  • Resolvers sometimes have to be ignored in the computation of symbol sets and sometimes not. The previous approach had subtle errors. Changes in
    - field ignoreRslvs removed
    - method Expected0 added
    - method Expected0 called twice in CheckAlts and once in CheckRes
    - CheckRes: computation of all start symbols of an alternative chain modified
  • Some code was commented out in ParserGen.GenCond because it generated unnecessary checks in the parser.
  • Bug fix in DFA.MatchLiteral.
  • Environment variable changed from crframes to CRFRAMES, which makes a difference in Unix-like environments.
  • The "environment variable" CRFRAMES was a system property; it has now been changed to a real environment variable.
    Note: You may get a deprecation warning because System.getenv(String) has been deprecated since JDK 1.1, but it will no longer be deprecated in SDK 1.5.
  • Characters in the range 128..255 are handled correctly now. So far they were translated to '?'.
  • Bug fix: A syntactically wrong statement was generated if no IGNORE option was specified.
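
The CRFRAMES entries above distinguish a Java system property from an environment variable. The following is a minimal sketch of the two lookups, not the code used in Coco/R itself. The class name FramesDirSketch and the helper orDefault are hypothetical and only serve the illustration; System.getProperty and System.getenv are the standard library calls.

```java
final class FramesDirSketch {
    // Name taken from the change log entries above.
    static final String NAME = "CRFRAMES";

    /** Older style of lookup: a JVM system property, e.g. set with java -DCRFRAMES=/some/dir. */
    static String fromSystemProperty() {
        return System.getProperty(NAME);
    }

    /** Later style of lookup: an environment variable of the running process. */
    static String fromEnvironment() {
        return System.getenv(NAME);
    }

    /**
     * Hypothetical helper (an assumption, not part of Coco/R):
     * use the looked-up value if it is set and non-empty,
     * otherwise fall back to the given default directory.
     */
    static String orDefault(String value, String defaultDir) {
        return (value == null || value.isEmpty()) ? defaultDir : value;
    }
}
```

A caller could write FramesDirSketch.orDefault(FramesDirSketch.fromEnvironment(), ".") to fall back to the current directory. On Unix-like systems the case of an environment variable name is typically significant, which is why the change from crframes to CRFRAMES matters there.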