Custom entities
To create entities, define them in the entities section of the data.json file. Don't forget to connect the dataset file to the application.
The entities section is a dictionary in which each key is an entity name and each value is an object that describes the entity.
Entities
To create an entity, define:
- values: a list of possible values and their synonyms
- open_set (optional, default: true): set to false to restrict recognition to the values in the list
- includes (optional): example phrases that contain entities but belong to no intent (a phrase may also contain other entities)
- excludes (optional): values that the entity should not recognize
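As a rough illustration, the fields above could be assembled as follows. This is a sketch in Python that prints JSON, not a verified schema: the field names values, open_set, includes and excludes come from this page, while the per-value keys "value" and "synonyms", the plain-string form of excludes, the entity name "pizza" and all sample values are assumptions.

```python
import json

# Hypothetical "entities" section for data.json.
# Assumed details: the "value"/"synonyms" keys, the string-list form of
# "excludes", and every sample name and value below.
entities = {
    "pizza": {
        "open_set": False,  # recognize only the listed values
        "values": [
            {"value": "margherita", "synonyms": ["margarita", "classic"]},
            {"value": "pepperoni", "synonyms": ["salami pizza"]},
        ],
        # Phrases with marked-up entities that belong to no intent.
        "includes": ["(margherita)[pizza] please"],
        # Values the entity should not recognize.
        "excludes": ["calzone"],
    }
}

dataset = {"entities": entities}
print(json.dumps(dataset, indent=2))
```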
Entity markup format
For entities to be recognized in phrases, you need to mark them up. It is very important to mark up every entity occurrence: if any are left unmarked, the model trains on incorrect data and recognizes entities poorly.
There are two equivalent ways to mark up entities in phrases:
- Value-Entity brackets:
(entity value)[entity_name:tag]
- Square brackets:
[entity_name:tag]
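To make the two forms concrete, here is a small Python sketch that pulls marked-up entities out of a phrase with a regular expression. It only illustrates the syntax described above and is not the parser the platform uses; the function name parse_markup and the sample phrase are made up, and the tag part is treated as optional.

```python
import re

# Matches "(value)[name]" or "(value)[name:tag]" as well as the bare
# forms "[name]" and "[name:tag]".
MARKUP = re.compile(
    r"(?:\((?P<value>[^()]*)\))?\[(?P<name>\w+)(?::(?P<tag>\w+))?\]"
)

def parse_markup(phrase):
    """Return (value, entity_name, tag) for every markup in the phrase.

    value is None for the square-bracket form, tag is None when absent.
    """
    return [
        (m.group("value"), m.group("name"), m.group("tag"))
        for m in MARKUP.finditer(phrase)
    ]

found = parse_markup(
    "transfer money from (savings)[account:source] to [account:target]"
)
print(found)
```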
Entity values and synonyms
For each entity, you need to define the values to be extracted. Each value contains a reference value and may contain a list of synonyms. If any of the synonyms is extracted, it is associated with the reference value.
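The mapping from synonyms to a reference value can be pictured with a short Python sketch. The keys "value" and "synonyms", the sample cities and the case-insensitive matching are assumptions made for illustration; the real extraction is done by the trained model, not by a lookup table.

```python
# Hypothetical values list for a "city" entity.
values = [
    {"value": "new york", "synonyms": ["nyc", "big apple"]},
    {"value": "los angeles", "synonyms": ["la"]},
]

def build_lookup(values):
    """Map every reference value and synonym (lowercased) to its reference value."""
    lookup = {}
    for item in values:
        reference = item["value"]
        lookup[reference.lower()] = reference
        for synonym in item.get("synonyms", []):
            lookup[synonym.lower()] = reference
    return lookup

lookup = build_lookup(values)
resolved = lookup.get("NYC".lower())
print(resolved)
```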
Includes values
There are cases when you need to recognize entities in a phrase that doesn't relate to any intent. The includes section is for such phrases.
Open set option
Sometimes you have a specific list of named entities and know every possible value (e.g. pizza names or drinks on a restaurant menu). In other cases you can't list every possible value (e.g. addresses, times, names, numbers). The Open Set option lets you handle both cases.
Open Set: false
Use this when you know all your values. List every value you know. In this case, the model recognizes only the entities in the list.
Open Set: true
Use this when you can't know all your values. Provide at least 5-10 values. People may say an entity value that is not in your list; the model then extracts the entity from the context of the sentence.
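The difference in outcome between the two settings can be mimicked with a toy filter in Python. This is only an illustration of the effect: the actual model decides from sentence context, and the entity names, values and the helper is_accepted are hypothetical. The only detail taken from this page is that open_set defaults to true.

```python
# Hypothetical entity definitions; "value" as the key name is an assumption.
entities = {
    "pizza": {
        "open_set": False,
        "values": [{"value": "margherita"}, {"value": "pepperoni"}],
    },
    "city": {
        "open_set": True,
        "values": [
            {"value": "london"}, {"value": "paris"}, {"value": "berlin"},
            {"value": "madrid"}, {"value": "rome"},
        ],
    },
}

def is_accepted(entity_name, candidate):
    """Toy check: a closed set accepts only listed values, an open set accepts any."""
    spec = entities[entity_name]
    if spec.get("open_set", True):  # open_set defaults to true
        return True
    known = {item["value"] for item in spec["values"]}
    return candidate in known

print(is_accepted("pizza", "hawaiian"))
print(is_accepted("city", "lisbon"))
```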
Value augmentations
You don't need to cover every value in your examples. The augmentation algorithm places all values of the entity into the phrases automatically.
It is enough to provide a phrase and mark up its entities. Don't create duplicate phrases.
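The idea behind augmentation can be sketched in Python as simple substitution: one marked-up phrase is expanded into one variant per value. This is an assumption-laden illustration, not the platform's actual algorithm; the function augment and the sample values are hypothetical.

```python
import re

def augment(phrase, entity_name, values):
    """Return one copy of the phrase per value, swapping the marked-up value.

    Only the "(value)[entity_name]" and "(value)[entity_name:tag]" forms
    are rewritten; an existing tag is preserved.
    """
    pattern = re.compile(
        r"\(([^()]*)\)\[" + re.escape(entity_name) + r"(:\w+)?\]"
    )
    variants = []
    for value in values:
        variants.append(
            pattern.sub(
                lambda m: f"({value})[{entity_name}{m.group(2) or ''}]",
                phrase,
            )
        )
    return variants

variants = augment(
    "I want a (margherita)[pizza]",
    "pizza",
    ["margherita", "pepperoni", "hawaiian"],
)
for variant in variants:
    print(variant)
```

This is why a single phrase per sentence pattern is sufficient: repeating the same phrase with a different value adds nothing the substitution would not already produce.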
Entity tags
Sometimes you need to give the extracted entity an additional meaning, for example a source and a target account, or a departure and an arrival city.
In this case, specify an entity tag in your markup. A tag is a special word that clarifies the meaning of an entity.
Use : after the entity name to define a tag ([entity_name:tag]). The tag can be any word you want. You don't need to define it anywhere else.
You can specify an entity with or without a tag.
Then, in DashaScript you can filter entities by this tag and get the one with a particular meaning.
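The filtering step can be pictured with a short Python sketch. It does not show the DashaScript API; the shape of the extracted records (the keys "entity", "value" and "tag") and the helper by_tag are hypothetical and serve only to show how a tag separates two entities of the same name.

```python
# Hypothetical extraction result for the phrase
# "transfer 100 from (savings)[account:source] to (checking)[account:target]".
extracted = [
    {"entity": "account", "value": "savings", "tag": "source"},
    {"entity": "account", "value": "checking", "tag": "target"},
    {"entity": "amount", "value": "100", "tag": None},
]

def by_tag(items, entity, tag):
    """Return the values of the given entity that carry the given tag."""
    return [
        item["value"]
        for item in items
        if item["entity"] == entity and item["tag"] == tag
    ]

source = by_tag(extracted, "account", "source")
target = by_tag(extracted, "account", "target")
print(source, target)
```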