Custom entities

To create entities, define them in the entities section of the data.json file. Don't forget to connect the dataset file to the application.

The entities section is a dictionary in which each key is an entity name and each value is an object that describes the entity.


To create an entity, you need to define:

  • values: a list of possible values and their synonyms
  • open_set (optional, default: true): set to false to restrict recognition to the values in the list
  • includes (optional): example phrases with marked-up entities (other entities can also appear in the phrase) that do not belong to any intent
  • excludes (optional): values that should not be recognized as this entity
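
As a rough illustration, the fields above could be assembled as shown below. This is a sketch in Python that builds the structure and serializes it to JSON. The entity name fruit, the sample values, the value/synonyms key names inside each value object, and the shape of the includes and excludes entries are assumptions for illustration; this page does not confirm them.

```python
import json

# Hypothetical entity definition. The "value"/"synonyms" key names and the
# plain-string entries in "includes"/"excludes" are assumed, not confirmed.
entities = {
    "fruit": {
        "open_set": False,
        "values": [
            {"value": "apple", "synonyms": ["apples", "green apple"]},
            {"value": "orange", "synonyms": ["oranges"]},
        ],
        "includes": ["(apple)[fruit] please"],
        "excludes": ["pineapple"],
    }
}

# The entities dictionary sits under the "entities" key of the dataset file.
dataset = {"entities": entities}
dataset_json = json.dumps(dataset, indent=2)
print(dataset_json)
```

The printed JSON is what would go into data.json if the assumed key names match the real schema.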

Entity markup format

For entities to be recognized in phrases, you need to mark them up. It is very important to mark up all of them. If you don't, the model will train incorrectly and won't recognize entities well enough.

There are two equivalent options to mark up entities in phrases:

  • Value-Entity brackets: (entity value)[entity_name:tag]
  • Square brackets: [entity_name:tag]
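
To make the two markup forms concrete, here is a minimal Python sketch that finds both of them in a phrase. The regular expression, the function name parse_markup, and the sample phrase are illustrative assumptions. They are not part of the platform and only mirror the syntax described above.

```python
import re

# Matches "(value)[entity]" or "(value)[entity:tag]",
# and also the bare forms "[entity]" or "[entity:tag]".
MARKUP = re.compile(
    r"(?:\((?P<value>[^()]*)\))?\[(?P<name>\w+)(?::(?P<tag>\w+))?\]"
)

def parse_markup(phrase):
    """Return (value, entity_name, tag) tuples for every markup in a phrase.

    value is None for the square-bracket form; tag is None when omitted.
    """
    return [
        (m.group("value"), m.group("name"), m.group("tag"))
        for m in MARKUP.finditer(phrase)
    ]

found = parse_markup("transfer from (savings)[account:source] to [account:target]")
print(found)
```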

Entity values and synonyms

For each entity, you need to define the values to be extracted. Each value consists of a reference value and, optionally, a list of synonyms. If a synonym is extracted, it is mapped to its reference value.
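
The mapping from synonyms to reference values can be pictured with a small lookup table. The Python sketch below is only an illustration of that idea: the value/synonyms key names, the sample data, and the case-insensitive matching are assumptions, and the real extraction is done by the trained model rather than by a dictionary lookup.

```python
def build_synonym_index(values):
    """Map every reference value and synonym (lowercased) to its reference value."""
    index = {}
    for entry in values:
        reference = entry["value"]
        index[reference.lower()] = reference
        for synonym in entry.get("synonyms", []):
            index[synonym.lower()] = reference
    return index

# Hypothetical values list for a "fruit" entity.
values = [
    {"value": "apple", "synonyms": ["apples", "green apple"]},
    {"value": "orange", "synonyms": ["oranges"]},
]

index = build_synonym_index(values)
print(index["green apple"])
```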

Includes values

There are cases when you need to recognize entities in a phrase that doesn't relate to any intent. The includes section is for such phrases.

Excludes values

You can specify values that should not be recognized as this entity.

Open set option

Sometimes you have a very specific list of named entities where you know all possible values (e.g. pizza names or drinks on a restaurant menu). In other cases you can't list all possible values (e.g. addresses, times, names, numbers, etc.). The Open Set option lets you handle both cases.

  • Open Set: false

    When you know all your values. List all the values that you know. In this case, the model will recognize only the entities in the list.

  • Open Set: true

    When you can't know all your values. Provide at least 5-10 values. People may say an entity value that is not in your list; in that case, the model will extract the entity from the context of the sentence.
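
The difference between the two settings can be sketched as a simple acceptance rule. This is a deliberately simplified Python illustration, not how the model works internally: the function name accept_candidate and the case-insensitive comparison are assumptions.

```python
def accept_candidate(candidate, known_values, open_set=True):
    """Decide whether an extracted candidate is kept as an entity value.

    With open_set=True any candidate found from context is kept;
    with open_set=False only values from the known list are kept.
    """
    if open_set:
        return True
    known = {value.lower() for value in known_values}
    return candidate.lower() in known

menu = ["margherita", "pepperoni", "hawaiian"]
print(accept_candidate("calzone", menu, open_set=False))
print(accept_candidate("calzone", menu, open_set=True))
```

With open_set set to false, the unknown value is rejected; with the default of true, it is kept.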

Value augmentations

You don't need to cover every value in your examples. The augmentation algorithm will place all values of the entity in phrases automatically.

It's enough to provide a phrase and mark up its entities. Don't create duplicate phrases.
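
To see why one marked-up phrase is enough, consider the following Python sketch of value substitution. It is an assumption-laden illustration of the general idea, not the platform's actual augmentation algorithm: the function augment, the regular expression, and the sample values are all hypothetical.

```python
import re

# Matches the value-entity form "(value)[entity]" or "(value)[entity:tag]".
VALUE_SLOT = re.compile(r"\((?P<value>[^()]*)\)\[(?P<name>\w+)(?P<tag>:\w+)?\]")

def augment(phrase, entity_values):
    """Return one phrase per known value of the first marked-up entity."""
    match = VALUE_SLOT.search(phrase)
    if match is None:
        return [phrase]
    name = match.group("name")
    tag = match.group("tag") or ""
    results = []
    # Fall back to the value written in the phrase if the entity is unknown.
    for value in entity_values.get(name, [match.group("value")]):
        replacement = f"({value})[{name}{tag}]"
        results.append(phrase[:match.start()] + replacement + phrase[match.end():])
    return results

variants = augment("I want (apple)[fruit]", {"fruit": ["apple", "orange", "banana"]})
print(variants)
```

A single phrase expands into one variant per value, which is why writing the same phrase again with a different value adds nothing.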

Entity tags

Sometimes you need to give the extracted entity an additional meaning: for example, source and target account, or departure and arrival city.
In this case, you need to specify an entity tag in your markup. A tag is a special word that clarifies the meaning of an entity. Use : after the entity name to define a tag ([entity_name:tag]). The tag can be any word you want, and you don't need to define it anywhere else. You can specify an entity with or without a tag.

Then, in DashaScript you can filter entities by this tag and get the one with a particular meaning.
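
The idea of filtering by tag can be pictured with the Python sketch below. It is not DashaScript and does not show the DashaScript API. The record layout (name, value, tag), the function filter_by_tag, and the sample data are assumptions used only to illustrate how a tag separates two entities of the same type.

```python
def filter_by_tag(extracted, name, tag):
    """Return extracted entities with the given entity name and tag."""
    return [
        entity
        for entity in extracted
        if entity["name"] == name and entity.get("tag") == tag
    ]

# Hypothetical extraction result for a phrase such as
# "a ticket from (Paris)[city:departure] to (Rome)[city:arrival]".
extracted = [
    {"name": "city", "value": "Paris", "tag": "departure"},
    {"name": "city", "value": "Rome", "tag": "arrival"},
]

arrival = filter_by_tag(extracted, "city", "arrival")
print(arrival[0]["value"])
```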