Encoders

Recsplain comes with a variety of encoders.

Note

The code for the encoders is in encoders.py.

The available encoders are listed below.

NumericEncoder

Use for numeric data. Example:

{
  "field": "fat_percentage",
  "values": np.linspace(0, 100, num=101),
  "type": "numeric",
  "weight": 1
}
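Because the example calls np.linspace, the configuration is a Python dictionary rather than plain JSON, and it needs NumPy imported. A minimal sketch of building it follows. The variable name numeric_config is only illustrative, and how the dictionary is handed to Recsplain is not shown here.

```python
import numpy as np

# np.linspace(0, 100, num=101) yields 101 evenly spaced values: 0.0, 1.0, ..., 100.0
numeric_config = {
    "field": "fat_percentage",
    "values": np.linspace(0, 100, num=101),
    "type": "numeric",
    "weight": 1,
}

print(len(numeric_config["values"]))  # 101
```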

OneHotEncoder

Use for categorical data. The first category is reserved for “unknown” entries. Example:

{
  "field": "category",
  "values": ["dairy", "pastry", "meat"],
  "type": "onehot",
  "weight": 1
}
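To illustrate the reserved “unknown” slot, here is a minimal sketch of one-hot encoding with an extra leading position. It is not Recsplain's implementation: the function name one_hot_with_unknown and the exact vector layout are assumptions made for illustration.

```python
def one_hot_with_unknown(value, values):
    """Return a one-hot list whose slot 0 is reserved for unknown values."""
    vec = [0] * (len(values) + 1)
    # Known categories occupy slots 1..len(values); anything else falls into slot 0.
    index = values.index(value) + 1 if value in values else 0
    vec[index] = 1
    return vec


categories = ["dairy", "pastry", "meat"]
print(one_hot_with_unknown("pastry", categories))  # [0, 0, 1, 0]
print(one_hot_with_unknown("fish", categories))    # [1, 0, 0, 0]
```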

StrictOneHotEncoder

Use for categorical data with no “unknown” category. Example:

{
  "field": "category",
  "values": ["dairy", "pastry", "meat"],
  "type": "strictonehot",
  "weight": 1
}
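A strict one-hot vector has exactly one slot per listed category. The sketch below is illustrative only: the name strict_one_hot is hypothetical, and mapping an unlisted value to an all-zero vector is an assumption (the library may handle that case differently, for example by raising an error).

```python
def strict_one_hot(value, values):
    """Return a one-hot list with one slot per listed category and no unknown slot."""
    vec = [0] * len(values)
    # Assumption: a value outside the list leaves the vector all zeros.
    if value in values:
        vec[values.index(value)] = 1
    return vec


categories = ["dairy", "pastry", "meat"]
print(strict_one_hot("meat", categories))  # [0, 0, 1]
print(strict_one_hot("fish", categories))  # [0, 0, 0]
```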

OrdinalEncoder

Use for ordinal data. window is the allowed similarity leakage between neighboring values. Example:

{
  "field": "price",
  "values": ["low", "mid", "high"],
  "type": "ordinal",
  "weight": 1,
  "window": [0.1, 1, 0.1]
}
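One way to read the window is as a set of weights centered on the encoded value, so that adjacent levels receive a small share of similarity. The following sketch works under that assumption. The function ordinal_encode is hypothetical, and the centering and edge clipping are illustrative choices rather than documented Recsplain behavior.

```python
def ordinal_encode(value, values, window):
    """Place the window weights around the position of `value`, clipping at the edges."""
    half = len(window) // 2
    center = values.index(value)
    vec = [0.0] * len(values)
    # Offsets run from -half to +half; the middle weight lands on the value itself.
    for offset, w in enumerate(window, start=-half):
        pos = center + offset
        if 0 <= pos < len(values):
            vec[pos] = float(w)
    return vec


levels = ["low", "mid", "high"]
window = [0.1, 1, 0.1]
print(ordinal_encode("mid", levels, window))  # [0.1, 1.0, 0.1]
print(ordinal_encode("low", levels, window))  # [1.0, 0.1, 0.0]
```

Under this reading, "low" and "mid" overlap slightly while "low" and "high" do not overlap at all.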

BinEncoder

Use for binning data. values holds the boundaries of the bins. Example:

{
  "field": "product_color",
  "values": ["blue", "red", "green"],
  "type": "bin",
  "weight": 1,
}
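As a rough illustration of what bin boundaries mean for numeric input, the sketch below maps a number to the index of the bin it falls in. The boundaries are hypothetical, bin_index is not a Recsplain function, and treating each bin as closed on the left is an assumption.

```python
from bisect import bisect_right


def bin_index(x, boundaries):
    """Return the bin number of x; boundaries must be sorted in ascending order."""
    # bisect_right counts how many boundaries are <= x, which is the bin index.
    return bisect_right(boundaries, x)


boundaries = [10, 50, 100]  # hypothetical boundaries, giving 4 bins
print(bin_index(5, boundaries))    # 0  (below 10)
print(bin_index(50, boundaries))   # 2  (50 up to, but not including, 100)
print(bin_index(75, boundaries))   # 2
print(bin_index(250, boundaries))  # 3  (100 and above)
```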

BinOrdinalEncoder

Use for binning ordinal data.

values holds the boundaries of the bins.

window is the allowed similarity leakage between neighboring values.

Example:

{
  "field": "price",
  "values": [10, 50, 100, 500, 1000],
  "type": "binordinal",
  "weight": 1,
  "window": [0.2,1,0.1]
}
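Combining the two ideas, a value can first be assigned to a bin and the window then spread around that bin. This is a sketch under stated assumptions, not the library's code: bin_ordinal_encode is a hypothetical name, bins are taken as closed on the left, and the window is assumed to be centered on the selected bin.

```python
from bisect import bisect_right


def bin_ordinal_encode(x, boundaries, window):
    """Find the bin of x, then place the window weights around that bin."""
    n_bins = len(boundaries) + 1
    center = bisect_right(boundaries, x)  # index of the bin containing x
    half = len(window) // 2
    vec = [0.0] * n_bins
    for offset, w in enumerate(window, start=-half):
        pos = center + offset
        if 0 <= pos < n_bins:
            vec[pos] = float(w)
    return vec


boundaries = [10, 50, 100, 500, 1000]
window = [0.2, 1, 0.1]
print(bin_ordinal_encode(75, boundaries, window))
# [0.0, 0.2, 1.0, 0.1, 0.0, 0.0]
```

With an asymmetric window such as this one, the lower neighboring bin receives more weight (0.2) than the higher one (0.1).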

HierarchyEncoder

Use for hierarchical data. Example:

{
  "field": "sub_category",
  "values": {"meat": ["chicken", "beef"], "dairy": ["milk", "yogurt"], "pastry": ["bread", "baguette"]},
  "type": "hierarchy",
  "weight": 1,
}
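The values mapping goes from each parent category to its children. How Recsplain turns this into a vector is not described here, so the sketch below only shows how the mapping can be read: it inverts it into a child-to-parent lookup and checks whether two items share a parent. The names parent_of and share_parent are illustrative.

```python
hierarchy = {
    "meat": ["chicken", "beef"],
    "dairy": ["milk", "yogurt"],
    "pastry": ["bread", "baguette"],
}

# Invert the parent -> children mapping into a child -> parent lookup.
parent_of = {
    child: parent
    for parent, children in hierarchy.items()
    for child in children
}


def share_parent(a, b):
    """Return True when both sub-categories belong to the same parent category."""
    return parent_of[a] == parent_of[b]


print(parent_of["beef"])               # meat
print(share_parent("milk", "yogurt"))  # True
print(share_parent("milk", "bread"))   # False
```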

NumpyEncoder

User-defined encoder, supplied as a NumPy array.

JSONEncoder

User-defined encoder, supplied as JSON.

QwakEncoder

Use with the Qwak data format.