Changelog for version 0.0.3-alpha.1 (05/25/2014)

  • Updated example walkthrough and spouse_example
  • Added python utility ddlib for text manipulation (need exporting PYTHONPATH, see its pydoc for usage)
  • Added utility script util/extractor_input_writer.py to sample extractor inputs
  • Updated nlp_extractor format (use sentence_offset, textual sentence_id)
  • Cleaned up unused datastore code
  • Update templates
  • Bug fixes

Changelog for version 0.0.3-alpha (05/07/2014)

  • Non-backward-compatible syntax change: Developers must include id column with type bigint in any table containing variables, but they MUST NOT use this column anywhere. This column is reserved for learning and inference, and all values will be erased and reassigned during the grounding phase.

  • Updated dependency requirement: requires JDK 7 or higher.

  • Supported four new types of extractors. See documentation for details:

    • tsv_extractor
    • plpy_extractor
    • sql_extractor
    • cmd_extractor
  • Even faster factor graph grounding and serialization using better optimized SQL.

  • The previous default Java sampler is no longer supported. The C++ sampler is now the default sampler.

  • New configuration tunable supported: pipeline.relearn_from to skip extraction and grounding and only perform learning and inference with a previous version of the grounded graph. Useful for tuning sampler arguments.

  • Supported custom holdout by a holdout query.

  • Updated spouse_example with implementations of different styles of extractors.

  • The nlp_extractor example has different table requirements and usage. See here: NLP extractor.

  • In the db.default configuration, users should define dbname, host, port and user. If not defined, by default system will use the environmental variables DBNAME,PGHOST, PGPORT and PGUSER accordingly.

  • Fixed all examples.

  • Updated documentation.

  • Print SQL query execution plans for extractor inputs.

  • Skip grounding, learning and inference if no factors are active.

  • Greenplum users should add DISTRIBUTED BY clause in all CREATE TABLE commands. Do not use variable id as distribution key. Do not use distribution key that is not initially assigned.