Entity configuration

Semantria uses a machine learning model to extract entities from text. These are returned as entities of type "named". You cannot tune this base extraction model, but you can define additional entities of your own, which are returned as type "user".

To tell the system what to look for, you can use either exact phrases (Microsoft) or Boolean query syntax (Microsoft OR MSFT). Custom entities are often used for things like product names or store locations, or to normalize variations of a name to a single value, such as mapping both MSFT and Microsoft to the same name.
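To make the idea of normalization concrete, here is a minimal sketch in plain Python. It does not call Semantria or reflect how Semantria matches text internally; the names `NORMALIZED_NAMES` and `normalize_mention` are hypothetical, and the case-insensitive lookup is an assumption of this sketch.

```python
# Illustration only: plain Python, not the Semantria API.
# Maps each surface form that might appear in text to one normalized name.
NORMALIZED_NAMES = {
    "microsoft": "Microsoft",
    "msft": "Microsoft",
}


def normalize_mention(mention: str) -> str:
    """Return the normalized name for a mention, or the mention itself if unknown."""
    # Lowercasing the key is an assumption of this sketch, not documented behavior.
    return NORMALIZED_NAMES.get(mention.strip().lower(), mention)


mentions = ["MSFT", "Microsoft", "Apple"]
normalized = [normalize_mention(m) for m in mentions]
print(normalized)  # ['Microsoft', 'Microsoft', 'Apple']
```

Both MSFT and Microsoft resolve to the same value, so anything counted or grouped by name treats them as one entity.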

Adding an entity consists of configuring the following fields:

Field Name | Purpose | Example
Entity | The text to look for. Special characters and multiple-word phrases should be in quotes. To use query syntax in this definition, you must precede the definition with a '+'. | +"Dec 25th" OR "December 25" OR "12/25" OR "25.12" OR "Christmas" OR "X-mas"
Label | A label field returned in the API output to give more information about the entity, such as a link to a Wikipedia page. | Winter Holiday
Type | Usually used to differentiate types of entities from each other, such as Company, Beverage, or Competitor. | Holiday
Normalized | You can use this to normalize different forms of the entity to the same value. For instance, one entity might be Coke, and another Coca-Cola. If you enter the same normalized value for each entity, they will appear in the output as the same thing. | Christmas

If you use a query in the Entity field, you must also enter a value for the Normalized field.
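The four fields and the rule above can be sketched as a small data structure. This is a minimal illustration in plain Python, not the Semantria SDK: the class `UserEntity`, its attribute names, and the `validate` method are hypothetical, and the real API may name or check these fields differently. The values come from the example column of the field descriptions.

```python
from dataclasses import dataclass


@dataclass
class UserEntity:
    """Hypothetical container for one custom entity definition."""
    entity: str            # Entity: exact phrase, or a query prefixed with '+'
    label: str = ""        # Label: extra information returned with the entity
    entity_type: str = ""  # Type: e.g. Company, Beverage, Competitor
    normalized: str = ""   # Normalized: shared value for different forms

    def is_query(self) -> bool:
        # A leading '+' marks the definition as query syntax.
        return self.entity.startswith("+")

    def validate(self) -> None:
        # Enforce the documented rule: a query needs a Normalized value.
        if not self.entity.strip():
            raise ValueError("The Entity field must not be empty.")
        if self.is_query() and not self.normalized.strip():
            raise ValueError("A query-based entity needs a Normalized value.")


christmas = UserEntity(
    entity='+"Dec 25th" OR "December 25" OR "12/25" OR "25.12" OR "Christmas" OR "X-mas"',
    label="Winter Holiday",
    entity_type="Holiday",
    normalized="Christmas",
)
christmas.validate()  # passes: the query has a Normalized value
```

Checking the rule before submitting a definition catches a missing Normalized value early; an exact-phrase entity such as `UserEntity(entity="Microsoft")` passes validation without one.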

Although you can define entities with Boolean syntax, entity extraction differs from a query in that you can define a type and a label that are returned in the output.