Examples

Andrea Gazzarini edited this page Jan 11, 2018 · 17 revisions

If you are in a hurry, here is a list of concrete scenarios where this plugin can be used.

The most trivial thing you can do with this plugin

Each dimension can be associated with a different boost.

Each unit can be expressed in different forms (e.g. mt,meter,m).

Don't boost only the detected quantity but also items that fall within a range (for that quantity)

A dimension field is expressed using meters (in index), but the user queried for 120 cm

Two or more fields are using the same unit (e.g. width, height, depth): how to deal with this?

An amount is detected, without any unit (e.g. fridge 1287). Do we have to consider this a quantity?


Simplest quantity detection

In this example we have just one unit with one allowed form.

    {
      "capacity": {
        "unit": "cl"
      }
    }

| input query | q | bq | bf |
|---|---|---|---|
| beer 25cl | beer | capacity:25 | N.A. |
| beer 25 cl | beer | capacity:25 | N.A. |
| beer 25 Cl | beer | capacity:25 | N.A. |
| 25 Cl | *:* | capacity:25 | N.A. |

Note that

  • quantity detection is case-insensitive and ignores whitespace between amounts and units
  • after isolating quantities, if the resulting q is empty, it is replaced with a MatchAllDocsQuery (*:*)
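
The two notes above can be illustrated with a short Python sketch. It is not the plugin's implementation: the function name `detect_quantities` and the regular expression are assumptions made for illustration, and only the single-unit, single-form case of this example is covered.

```python
import re

def detect_quantities(query, units):
    """Return (q, bq) for a query; `units` maps a field name to its unit form.

    Hypothetical helper: it only mimics the behaviour shown in the table above.
    """
    q = query
    clauses = []
    for field, unit in units.items():
        # amount, optional whitespace, then the unit (case-insensitive);
        # the unit must end at a word boundary so "25 clips" is not matched
        pattern = re.compile(
            r"(?<![\w.])(\d+(?:\.\d+)?)\s*" + re.escape(unit) + r"\b",
            re.IGNORECASE,
        )
        for match in pattern.finditer(q):
            clauses.append(f"{field}:{match.group(1)}")
        # remove the detected quantities from the main query
        q = pattern.sub(" ", q)
    q = " ".join(q.split())
    # an empty main query falls back to match-all
    return (q or "*:*", " ".join(clauses))


print(detect_quantities("beer 25 Cl", {"capacity": "cl"}))
print(detect_quantities("25 Cl", {"capacity": "cl"}))
```

The word-boundary check is a design choice of this sketch; the source does not say how the plugin treats units that are prefixes of longer words.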

Boost

In this example we have two units configured. Both declare a single allowed form (variant), but the first also provides a boost factor.
Note that the three configurations below declare the same boost in different ways. Although the second and third forms may look verbose here, they become useful in the multi-dimension scenario described below.

    {
      "capacity": {
        "unit": "lt",
        "boost": 1.2
      },
      "voltage": {
        "unit": "v"
      }
    }

    {
      "capacity": {
        "unit": "lt",
        "boost": {
          "lt": {
             "value": 1.2
          }
        }
      },
      "voltage": {
        "unit": "v"
      }
    }

    {
      "capacity": {
        "unit": "lt",
        "boost": {
          "value": 1.2
        }
      },
      "voltage": {
        "unit": "v"
      }
    }

| input query | q | bq | bf |
|---|---|---|---|
| fridge 125lt 120v | fridge | capacity:125^1.2 voltage:120 | N.A. |
| 125lt 120v | *:* | capacity:125^1.2 voltage:120 | N.A. |

Variants

In this example there is just one unit configured, but it declares several matching variants.

    {
      "voltage": {
        "unit": "volts",
        "variants": {
          "volts": ["v","volt"]
        }
      }
    }

| input query | q | bq | bf |
|---|---|---|---|
| car battery 120v | car battery | voltage:120 | N.A. |
| car battery 120V | car battery | voltage:120 | N.A. |
| car battery 120 volts | car battery | voltage:120 | N.A. |
| car battery 120volts | car battery | voltage:120 | N.A. |
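
As a rough illustration of how variants could be resolved, the Python sketch below builds a lookup from every declared form to its canonical unit and matches the longest form first. The helper names (`build_variant_matcher`, `detect`) are hypothetical, and the matching strategy is an assumption, not the plugin's code.

```python
import re

def build_variant_matcher(unit, variants):
    """Return (compiled pattern, lookup from lower-cased form to canonical unit)."""
    lookup = {unit.lower(): unit}
    for canonical, forms in variants.items():
        lookup[canonical.lower()] = canonical
        for form in forms:
            lookup[form.lower()] = canonical
    # longest forms first, so "volts" wins over "volt" and "v"
    alternation = "|".join(
        re.escape(form) for form in sorted(lookup, key=len, reverse=True)
    )
    pattern = re.compile(
        r"(\d+(?:\.\d+)?)\s*(" + alternation + r")\b", re.IGNORECASE
    )
    return pattern, lookup

def detect(query, unit, variants):
    """Return (amount, canonical unit) for the first quantity found, else None."""
    pattern, lookup = build_variant_matcher(unit, variants)
    match = pattern.search(query)
    if match is None:
        return None
    return match.group(1), lookup[match.group(2).lower()]


variants = {"volts": ["v", "volt"]}
print(detect("car battery 120V", "volts", variants))
print(detect("car battery 120 volts", "volts", variants))
```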

Variants and gap

This example enriches the previous configuration by adding a gap, which enables range queries.
The gap has two attributes, value and mode; since the generated queries depend on them, the table below lists the available combinations with the corresponding results (in the configuration below, $value and $mode are just placeholders).

    {
      "voltage": {
        "unit": "volts",
        "variants": {
          "volts": ["v","volt"]
        },
        "gap": {
          "value": $value,
          "mode": $mode
        }
      }
    }

The "input query" and "q" columns are omitted because they are always the same (see the previous example):

| value | mode | bq | bf |
|---|---|---|---|
| 10 | PIVOT | voltage:120 voltage:[110 TO 130] | recip(abs(sub(voltage,120)),1,1000,1000) |
| 10 | MIN | voltage:120 voltage:[120 TO 130] | recip(abs(sub(voltage,120)),1,1000,1000) |
| 0 | MIN | voltage:120 voltage:[120 TO *] | recip(abs(sub(voltage,120)),1,1000,1000) |
| 10 | MAX | voltage:120 voltage:[110 TO 120] | recip(abs(sub(voltage,120)),1,1000,1000) |
| 0 | MAX | voltage:120 voltage:[0 TO 120] | recip(abs(sub(voltage,120)),1,1000,1000) |

Note that the PIVOT mode (which is the default) requires a value, unlike MIN and MAX, which assume 0 if the value is missing.
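
The mapping from (value, mode) to the range clause can be summarised in a few lines of Python. This is a sketch derived only from the table above; `range_clause` and `boost_function` are hypothetical names, not part of the plugin. For reference, Solr's `recip(x,m,a,b)` function computes `a/(m*x+b)`, so the bf shown in the table scores highest when the field value equals the detected amount.

```python
def range_clause(field, amount, value=None, mode="PIVOT"):
    """Build the range part of bq for a detected amount (illustrative only)."""
    mode = mode.upper()
    if mode == "PIVOT":
        if value is None:
            raise ValueError("PIVOT mode requires a gap value")
        low, high = amount - value, amount + value
    elif mode == "MIN":
        # a missing or zero value means an open upper bound
        low, high = amount, (amount + value if value else "*")
    elif mode == "MAX":
        # a missing or zero value means the range starts at 0
        low, high = (amount - value if value else 0), amount
    else:
        raise ValueError(f"unknown gap mode: {mode}")
    return f"{field}:[{low} TO {high}]"

def boost_function(field, amount):
    """Build the bf shown in the table: the closer to `amount`, the higher the score."""
    return f"recip(abs(sub({field},{amount})),1,1000,1000)"


for value, mode in [(10, "PIVOT"), (10, "MIN"), (0, "MIN"), (10, "MAX"), (0, "MAX")]:
    print(value, mode, range_clause("voltage", 120, value, mode))
```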

Equivalences

In this example, quantities in the query string use a different unit and therefore need to be converted using an equivalence table.

{
  "units" : {
      "capacity": {
        "unit": "lt",
        "variants": {
          "lt": ["l","liters","liter"],
          "cl": ["centiliters"],
          "ml": ["milliliters"]
        }
      },
      "height": {
        "unit": "cm",
        "variants": {
          "mm": [
            "millimeters"
          ],
          "cm": [
            "centimeters"
          ],
          "m": [
            "mt",
            "meters"
          ]
        },
        "gap": {
          "value": 10,
          "mode": "PIVOT"
        }
      }
  },
  "equivalence.table": {
      "cm" : {
        "mm" : 10,
        "m" : 0.01
      },
      "lt" : {
        "cl": 100,
        "ml": 1000
      }
    }
}

| input query | q | bq | bf |
|---|---|---|---|
| a bottle of 1.20lt | door | capacity:1.20 | recip(abs(sub(capacity,120)),1,1000,1000) |
| a bottle of 1200 cl | door | capacity:1.20 | recip(abs(sub(capacity,120)),1,1000,1000) |
| car 2.10mt | car | height:210 height:[200 TO 220] | recip(abs(sub(height,210)),1,1000,1000) |
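
A minimal sketch of the conversion step, in Python. It assumes that each entry of the equivalence table reads "one base unit equals N of the other unit" (1 cm = 10 mm = 0.01 m), which is inferred from the `2.10mt` to `height:210` row and is not confirmed by the source. The function name `to_base_unit` is hypothetical, and `Decimal` is used only to avoid floating-point noise in the output.

```python
from decimal import Decimal

# Same values as the "equivalence.table" section of the configuration above
EQUIVALENCES = {
    "cm": {"mm": 10, "m": 0.01},
    "lt": {"cl": 100, "ml": 1000},
}

def to_base_unit(amount, unit, base_unit, table=EQUIVALENCES):
    """Convert `amount` (a numeric string) from `unit` to the field's base unit."""
    if unit == base_unit:
        return Decimal(amount)
    # assumed meaning: 1 base_unit == factor * unit
    factor = Decimal(str(table[base_unit][unit]))
    return Decimal(amount) / factor


print(to_base_unit("2.10", "m", "cm"))    # metres to centimetres
print(to_base_unit("1200", "ml", "lt"))   # millilitres to litres
```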

Multifields

The scenario is the following: your schema has two or more attributes that are expressed in the same unit. The configuration lets you declare this, with different weights associated with each field.
Note that you can also use the features described above (e.g. equivalences, boosts, gaps).
In this example, we have two fields using the same unit (cm):

  • height: for height we want a range query with gap of 1.5 in PIVOT mode, and a boost of 1.1
  • width: for width we want a range query with gap of 1.1 in MIN mode, and a boost of 0.2
{
  "units" : {
      "height,width": {
        "unit": "cm",
        "variants": {
          "mm": [
            "millimeters"
          ],
          "cm": [
            "centimeters"
          ],
          "m": [
            "mt",
            "meters"
          ]
        },
        "gap": {
          "value": 1.5,
          "mode": "PIVOT",
          "width": {
            "value": 1.1,
            "mode": "MIN"
          }
        },
        "boost": {
          "value": 1.4,
          "height": {
            "value": 1.1
          },
          "width": {
            "value": 0.2
          }
        }
      }
    },
    "equivalence.table": {
      "cm" : {
        "mm" : 10,
        "m" : 0.01
      },
      "lt" : {
        "cl": 100,
        "ml": 1000
      }
    }
} 

A query like fridge 1260 cm will be translated into:

  • q=fridge
  • bq=height:1260^1.1 height:[1258.5 TO 1261.5] width:1260^0.2 width:[1260 TO 1261.1]
  • bf=recip(abs(sub(height,1260)),1,1000,1000) recip(abs(sub(width,1260)),1,1000,1000)
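
The per-field settings in the configuration above can be read as "use the field-specific entry if present, otherwise fall back to the top-level attributes". The Python sketch below encodes that reading; it is an assumption drawn from the two bullet points describing height and width, and `field_settings` is a hypothetical helper, not a plugin API.

```python
def field_settings(unit_config, field):
    """Resolve the effective gap and boost for one field of a multi-field unit."""
    def resolve(section):
        if section is None:
            return None
        if not isinstance(section, dict):
            # short form, e.g. "boost": 1.2
            return {"value": section}
        override = section.get(field)
        if isinstance(override, dict):
            # field-specific entry wins
            return override
        # otherwise keep only the top-level scalar attributes (value, mode)
        return {k: v for k, v in section.items() if not isinstance(v, dict)}

    return {
        "gap": resolve(unit_config.get("gap")),
        "boost": resolve(unit_config.get("boost")),
    }


config = {
    "unit": "cm",
    "gap": {"value": 1.5, "mode": "PIVOT", "width": {"value": 1.1, "mode": "MIN"}},
    "boost": {"value": 1.4, "height": {"value": 1.1}, "width": {"value": 0.2}},
}
print(field_settings(config, "height"))
print(field_settings(config, "width"))
```

Under this reading, the top-level boost of 1.4 would only apply to a field that has no entry of its own.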

Assumptions

The user entered a query which contains one or more amounts without any unit. If we want to treat such an amount as a quantity, the following configuration allows that.
This section is entirely optional: if it is not declared, no assumption is made, and "orphan" amounts (i.e. amounts without units) are ignored.

As you can see below, we can configure a global rule (the "default" unit) and an additional map where each unit can be associated with one or more ranges.
The global rule is useful when you have just one dimension attribute; if the schema contains several fields (e.g. capacity, voltage, height), the ranges are a valuable hint for "guessing" the target unit.

{
  "units" : {
    "height": {
      "unit": "cm",
      "variants": {
        "mm": [
          "millimeters"
        ],
        "cm": [
          "centimeters"
        ],
        "m": [
          "mt",
          "meters"
        ]
      }
    },
    "capacity": {
      "unit": "lt",
      "variants": {
        "cl": [
          "centiliters"
        ]
      }
    },
    "voltage": {
      "unit": "volt",
      "variants": {
        "volt": [
          "v",
          "volts"
        ]
      }
    }
  },
  "equivalence.table": {
      "cm" : {
        "mm" : 10,
        "m" : 0.01
      },
      "lt" : {
        "cl": 100,
        "ml": 1000
      }
  },
  "assumption.table": {
    "lt": [[0.25,0.50],[0.60,0.75]],
    "cl": [[250,1000]],
    "m": [[1.5,3]],
    "cm": [[1500,3000]],
    "default" : "volt"
  }
}    

In the example above, the assumption table has been enabled:

  • an amount detected in the user query that falls between 0.25-0.50 or 0.60-0.75 is associated with the "lt" unit (and with the capacity field)
  • an amount detected in the query that falls between 250 and 1000 is associated with the "cl" unit
  • ...and so on
  • if the amount doesn't fall within any of the declared ranges, the "volt" unit is used

IMPORTANT: the ranges must be disjoint.
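
The lookup described by the bullets above, together with a check for the disjointness requirement, can be sketched in Python as follows. The function names are hypothetical, and treating the range bounds as inclusive is an assumption; the source does not specify it.

```python
# Same values as the "assumption.table" section of the configuration above
ASSUMPTIONS = {
    "lt": [[0.25, 0.50], [0.60, 0.75]],
    "cl": [[250, 1000]],
    "m": [[1.5, 3]],
    "cm": [[1500, 3000]],
    "default": "volt",
}

def assume_unit(amount, table=ASSUMPTIONS):
    """Return the unit assumed for an orphan amount, or None if nothing applies."""
    for unit, ranges in table.items():
        if unit == "default":
            continue
        for low, high in ranges:
            if low <= amount <= high:  # inclusive bounds: an assumption
                return unit
    # no range matched: fall back to the global rule, if declared
    return table.get("default")

def ranges_are_disjoint(table=ASSUMPTIONS):
    """Check that no two ranges overlap, across all units."""
    intervals = sorted(
        (low, high)
        for unit, ranges in table.items()
        if unit != "default"
        for low, high in ranges
    )
    return all(prev[1] < cur[0] for prev, cur in zip(intervals, intervals[1:]))


print(assume_unit(0.33), assume_unit(500), assume_unit(2), assume_unit(120))
print(ranges_are_disjoint())
```

Running a check like `ranges_are_disjoint` when the configuration is loaded would catch overlapping ranges early, since an amount falling in two ranges would make the assumed unit ambiguous.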