Factor graph grounding output schema reference

Grounding is the process of building the factor graph and dumping it to files that the sampler takes as input. DeepDive uses a custom binary format to encode the factor graph. It generates four files, one each for weights, variables, factors and metadata. All files reside in the out/ directory of the latest run. The format of these files is as follows. Each row lists a field name, its type, and its size in bytes; all values are stored in network byte order (big-endian).

Weights

weightId        long    8
isFixed         bool    1
initialValue    double  8
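
As an illustration of the layout above, here is a minimal Python sketch that decodes weight records with the standard struct module. It assumes the weights file is nothing more than a back-to-back sequence of 17-byte records with no header or padding, which this reference does not state explicitly; the names WEIGHT_RECORD and read_weights are made up for the example.

```python
import struct

# weightId (long, 8), isFixed (bool, 1), initialValue (double, 8).
# ">" selects big-endian (network byte order) with no padding.
WEIGHT_RECORD = struct.Struct(">q?d")


def read_weights(data: bytes):
    """Return a list of (weight_id, is_fixed, initial_value) tuples.

    Assumption: `data` holds only consecutive weight records.
    """
    if len(data) % WEIGHT_RECORD.size != 0:
        raise ValueError("weights data is not a whole number of records")
    return list(WEIGHT_RECORD.iter_unpack(data))
```

Reading the whole file into memory keeps the sketch short; a large file could instead be read in chunks of WEIGHT_RECORD.size bytes.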

Variables

variableId      long    8
isEvidence      bool    1
initialValue    double  8
dataType        short   2
edgeCount*      long    8
cardinality     long    8
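
The variable record can be decoded the same way. The sketch below is only an illustration, under the assumption that records are packed consecutively without padding; the Variable tuple and read_variables are hypothetical names, not part of DeepDive.

```python
import struct
from typing import List, NamedTuple


class Variable(NamedTuple):
    variable_id: int
    is_evidence: bool
    initial_value: float
    data_type: int
    edge_count: int  # see the note on edgeCount in this reference
    cardinality: int


# long, bool, double, short, long, long = 8 + 1 + 8 + 2 + 8 + 8 = 35 bytes.
VARIABLE_RECORD = struct.Struct(">q?dhqq")


def read_variables(data: bytes) -> List[Variable]:
    """Decode consecutive big-endian variable records from `data`."""
    return [Variable(*fields) for fields in VARIABLE_RECORD.iter_unpack(data)]
```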

Factors

weightId        long    8
factorFunction  short   2
equalPredicate  long    8
edgeCount       long    8
variableId1     long    8
isPositive1     bool    1
variableId2     long    8
isPositive2     bool    1
...
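
Unlike weights and variables, a factor record has variable length: a fixed 26-byte header is followed by edgeCount pairs of (variableId, isPositive). The following Python sketch shows one way to walk such a buffer. It assumes records follow one another with no padding or file header, and read_factors is an illustrative name only.

```python
import struct

# weightId (long), factorFunction (short), equalPredicate (long), edgeCount (long)
FACTOR_HEADER = struct.Struct(">qhqq")
# variableId (long), isPositive (bool), repeated edgeCount times
FACTOR_EDGE = struct.Struct(">q?")


def read_factors(data: bytes):
    """Decode variable-length factor records into a list of dicts."""
    factors = []
    offset = 0
    while offset < len(data):
        weight_id, function, equal_predicate, edge_count = (
            FACTOR_HEADER.unpack_from(data, offset)
        )
        offset += FACTOR_HEADER.size
        edges = []
        for _ in range(edge_count):
            # unpack_from raises struct.error if the buffer is truncated
            edges.append(FACTOR_EDGE.unpack_from(data, offset))
            offset += FACTOR_EDGE.size
        factors.append({
            "weight_id": weight_id,
            "factor_function": function,
            "equal_predicate": equal_predicate,
            "edges": edges,  # list of (variable_id, is_positive)
        })
    return factors
```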

* Note: starting with version 0.03, the edgeCount field in Variables is always -1.

The system also generates a metadata file, which contains a single line of comma-separated values with the following fields:

Number of weights
Number of variables
Number of factors
Number of edges
Path to weights file
Path to variables file
Path to factors file
Path to edges file
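
A reader for the metadata line might look like the sketch below. It assumes the eight fields appear in exactly the order listed above and that the four counts are plain decimal integers; GraphMeta and parse_metadata are hypothetical names chosen for the example.

```python
import csv
from typing import NamedTuple


class GraphMeta(NamedTuple):
    num_weights: int
    num_variables: int
    num_factors: int
    num_edges: int
    weights_path: str
    variables_path: str
    factors_path: str
    edges_path: str


def parse_metadata(line: str) -> GraphMeta:
    """Parse the single CSV line of the metadata file."""
    row = next(csv.reader([line]))
    if len(row) != 8:
        raise ValueError(f"expected 8 fields, got {len(row)}")
    counts = [int(value) for value in row[:4]]
    return GraphMeta(*counts, *row[4:])
```

Using the csv module rather than str.split(",") keeps the sketch working if a path is quoted because it contains a comma.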

Incremental grounding schema reference

For incremental grounding, the weights, variables, and metadata files have the same format as above. The factors and edges are written to two separate files with the following formats:

Factors

factorId        long    8
weightId        long    8
factorFunction  short   2
edgeCount       long    8

Edges

variableId      long    8
factorId        long    8
position        long    8
isPositive      bool    1
equalPredicate  long    8
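
Because incremental grounding stores edges separately from factors, a reader has to join the two files on factorId. The Python sketch below illustrates one way to do that. It assumes both files are plain sequences of fixed-size big-endian records and that position gives the order of a variable within its factor; neither assumption is spelled out in this reference, and all names are illustrative.

```python
import struct
from collections import defaultdict

# factorId (long), weightId (long), factorFunction (short), edgeCount (long)
INC_FACTOR = struct.Struct(">qqhq")
# variableId (long), factorId (long), position (long),
# isPositive (bool), equalPredicate (long)
INC_EDGE = struct.Struct(">qqq?q")


def group_edges_by_factor(factor_data: bytes, edge_data: bytes):
    """Return {factor_id: (weight_id, factor_function, edges)}.

    Each edge is (position, variable_id, is_positive, equal_predicate),
    sorted by position (assumed to be the variable's slot in the factor).
    """
    edges = defaultdict(list)
    for var_id, factor_id, position, is_positive, equal_pred in (
        INC_EDGE.iter_unpack(edge_data)
    ):
        edges[factor_id].append((position, var_id, is_positive, equal_pred))

    result = {}
    for factor_id, weight_id, function, edge_count in (
        INC_FACTOR.iter_unpack(factor_data)
    ):
        ordered = sorted(edges.get(factor_id, []))
        if len(ordered) != edge_count:
            raise ValueError(f"factor {factor_id}: edge count mismatch")
        result[factor_id] = (weight_id, function, ordered)
    return result
```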