To create entities, define them in the data.json file, in the entities section. Don't forget to connect the dataset file to the application.
entities is a dictionary in which each key is an entity name and each value is an object that describes the entity.
To create an entity, define:
- values: a list of possible values and their synonyms
- open_set (optional, default: true): set to false to restrict recognition to the values in the list
- includes (optional): example phrases with marked-up entities that do not belong to any intent (other entities can also appear in the phrase)
- excludes (optional): values that should not be recognized as this entity
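As a rough illustration of how these fields fit together, here is a minimal sketch of an entities section. The entity name fruit and every value and phrase in it are made up for the example. The shape of each values item (an object with value and synonyms keys), the (value)[entity_name] markup inside includes, and the use of plain strings in excludes are assumptions about the data.json schema, not something stated above.

```json
{
  "entities": {
    "fruit": {
      "open_set": false,
      "values": [
        { "value": "apple", "synonyms": ["apples"] },
        { "value": "banana" }
      ],
      "includes": [
        "(apple)[fruit] please"
      ],
      "excludes": [
        "pineapple"
      ]
    }
  }
}
```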
For entities to be recognized in phrases, you need to mark them up. It is very important to mark up every occurrence. If you don't, the model will be trained incorrectly and won't recognize entities well enough.
There are two equivalent ways to mark up entities in phrases:
- Value-Entity brackets:
- Square brackets:
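The sketch below shows what fully marked-up phrases might look like. It assumes that the Value-Entity form is written as (value)[entity_name] and that intent phrases live in an includes list under an intents section; both are assumptions for illustration, and the intent, entity names, and phrases are hypothetical. Only the Value-Entity form is shown. Note that every entity occurrence in each phrase is marked up, including the second entity in the last phrase.

```json
{
  "intents": {
    "order_food": {
      "includes": [
        "I want a (margherita)[pizza]",
        "can I get a (pepperoni)[pizza] and a (cola)[drink]"
      ]
    }
  }
}
```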
For each entity, define the values to be extracted. Each value contains a reference value and may contain a list of synonyms. If a synonym is extracted, it is mapped to its reference value.
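A small sketch of reference values with synonyms follows. The city entity and its values are hypothetical, and the value and synonyms key names are assumed rather than taken from the text above.

```json
{
  "entities": {
    "city": {
      "values": [
        {
          "value": "New York",
          "synonyms": ["NYC", "New York City", "the Big Apple"]
        },
        {
          "value": "Los Angeles",
          "synonyms": ["LA", "L.A."]
        }
      ]
    }
  }
}
```

With a definition like this, a phrase that contains NYC would be expected to produce the reference value New York.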
Sometimes you need to recognize entities in a phrase that doesn't relate to any intent. The includes section is for such phrases.
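For instance, a short answer such as a bare drink name carries an entity but no intent. The sketch below puts such phrases into the includes list of a hypothetical drink entity; the (value)[entity_name] markup and the first_name entity in the last phrase are assumptions made for illustration.

```json
{
  "entities": {
    "drink": {
      "values": [
        { "value": "cola" },
        { "value": "lemonade" }
      ],
      "includes": [
        "(cola)[drink]",
        "just a (lemonade)[drink]",
        "a (cola)[drink] for (Anna)[first_name]"
      ]
    }
  }
}
```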
Sometimes you have a very specific list of named entities and know all possible values (e.g. pizza names or drinks on a restaurant menu). In other cases you can't list all possible values (e.g. addresses, times, names, numbers, etc.). The Open Set option lets you handle both cases.
Open Set: false
Use this when you know all your values. List every value you know. In this case, the model will recognize only the values in the list.
Open Set: true
Use this when you can't know all your values. Provide at least 5-10 values. People may say an entity value that is not in your list; in that case, the model will extract the entity from the context of the sentence.
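The two cases can be put side by side in a sketch like the one below. Both entities and all of their values are hypothetical, and the value key name is an assumption. The pizza entity is a closed list taken from a menu, while first_name only seeds the model with a handful of examples.

```json
{
  "entities": {
    "pizza": {
      "open_set": false,
      "values": [
        { "value": "margherita" },
        { "value": "pepperoni" },
        { "value": "four cheese" }
      ]
    },
    "first_name": {
      "open_set": true,
      "values": [
        { "value": "Anna" },
        { "value": "John" },
        { "value": "Maria" },
        { "value": "Wei" },
        { "value": "Ahmed" },
        { "value": "Olga" }
      ]
    }
  }
}
```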
You don't need to cover every value in your example phrases. The augmentation algorithm will automatically place all values of the entity into the phrases.
It is enough to provide a phrase and mark up its entities. Don't create duplicate phrases.
Sometimes you need to specify an additional meaning for the extracted entity, for example source and target account, or departure and arrival city.
In this case, specify an entity tag in your markup. A tag is a special word that clarifies the meaning of an entity. Use : after the entity name to define a tag ([entity_name:tag]). The tag can be any word you want. You don't need to define it anywhere else.
You can specify an entity with or without a tag.
Then, in DashaScript you can filter entities by this tag and get the one with a particular meaning.
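To make the tag markup concrete, here is a sketch of tagged phrases for the source and target account case. The intent name, entity name, tags, and phrases are hypothetical, and the (value)[entity_name:tag] form and the intents layout are assumptions for illustration. The last phrase marks up the same entity without a tag. The DashaScript filtering itself is not shown.

```json
{
  "intents": {
    "transfer_money": {
      "includes": [
        "transfer money from (savings)[account:source] to (checking)[account:target]",
        "send it to my (checking)[account:target]",
        "what is the balance of my (savings)[account]"
      ]
    }
  }
}
```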