
Text Analytics

Lexalytics was the first to market with a commercial sentiment analysis engine way back in 2004: in the decade since, we’ve built on our on-premise Salience Engine and in-the-cloud Semantria API/Excel plug-in offerings to provide the most powerful, most flexible text analytics in the industry. Below you will find a very general overview of the features of Lexalytics’ industry-leading text mining tools. For more information, check out our technical pages on each feature or read through their associated whitepapers. Check out our web demo to see Lexalytics in action, or get in touch to schedule a live demo with our team of data ninjas.

Text mining is the process of determining and collecting high-quality information from unstructured text. The very first step of text mining is information retrieval—that is, building a database to analyze. This database can be virtually any type of text, from a mass of Twitter posts to a collection of scientific papers, depending on the focus of the organization conducting the analysis.

Once a database of text has been established, Lexalytics engages a number of sophisticated text analytics systems that aim to answer three broad questions:

  • Who is talking and who/what are they discussing?
  • What are they saying?
  • How do they feel?

These three questions map roughly onto our core features:

  • Named entity extraction
  • Themes
  • Categories
  • Intentions
  • Sentiment analysis
  • Summarization

Named Entity Extraction

Recognizing named entities means identifying the named things mentioned in text: most often people, places, organizations, products, and brands, but Named Entity Extraction can be configured to recognize whatever your organization requires. Names of traded stocks, specific abbreviations, even specific strains of a disease can be identified and tagged as entities. In addition to specific named entities, Lexalytics identifies pattern-based entities such as street addresses, phone numbers, and email addresses.
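To make the idea of pattern-based entities concrete, here is a minimal sketch in Python using regular expressions. It is an illustration only, not Lexalytics' implementation: the `PATTERNS` table and the `extract_pattern_entities` function are hypothetical names, and the two patterns cover only simple email and US-style phone formats.

```python
import re

# Hypothetical, simplified patterns; real address and phone formats vary far more.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b(?:1-)?\d{3}-\d{3}-\d{4}\b"),
}

def extract_pattern_entities(text):
    """Return (entity_type, matched_text) pairs in order of appearance."""
    found = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), entity_type, match.group()))
    # Sort by position in the text so results read left to right.
    return [(etype, value) for _, etype, value in sorted(found)]

print(extract_pattern_entities("Call 1-800-555-0100 or write to help@example.com."))
```

Named entities such as people or brands generally cannot be found this way, which is why they are handled separately from pattern-based ones.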

Now that you’ve prepared the text, you can do things like extract the entities, and get the associated sentiment, themes, and summary (for that entity).


Contextual clues can be vital when dealing with words that have multiple meanings: the word crane, for instance, could refer to a machine used to lift heavy objects, a type of bird, or even a movement of someone’s neck. Lexalytics determines the context of entities through themes and facets, identifying the topics of discussion. Our context determination involves highly complex text mining techniques that will show you what consumers are saying and why they feel the way they do.

Themes are lexically important noun phrases. Think of them as the “buzz” from the document. They are extracted completely automatically, and they work especially well when rolled up across many documents, so you can get a feel for what, exactly, people are saying. We can also tell you the themes that are lexically associated with an entity, not just the themes that are important inside a document.


Categories are the other side of the "determining context" coin from Themes. Themes are extracted completely automatically, whereas categories need to be configured ahead of time. This is useful for sorting content into buckets that are relevant to a business. For example, a retail establishment might be interested in categories like "staff, location, parking, stock availability, lighting, pricing, etc." We also offer automatic categories: very high-level buckets that give you a preliminary view of the content. With several different ways of categorizing content, from search queries to machine learning classifiers to Wikipedia-based categories, we provide the tools necessary to segment content in the way that is most relevant to any business.

Categories are pre-configured classification buckets that let you define “what is this content about?” or “what concepts does this content mention?”
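As a rough illustration of query-style categories, the sketch below sorts retail feedback into a few pre-configured buckets by keyword match. The `CATEGORY_KEYWORDS` table and the `categorize` function are hypothetical and deliberately naive; they are not the Lexalytics query syntax, and machine learning or Wikipedia-based categorization would work quite differently.

```python
import re

# Hypothetical buckets a retailer might configure ahead of time.
CATEGORY_KEYWORDS = {
    "staff": {"staff", "cashier", "employee"},
    "parking": {"parking", "garage"},
    "pricing": {"price", "prices", "expensive", "cheap"},
}

def categorize(text):
    """Return the sorted names of all categories whose keywords appear in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(name for name, keywords in CATEGORY_KEYWORDS.items()
                  if words & keywords)

print(categorize("The cashier was friendly but parking was a nightmare."))
```

A single piece of content can land in several buckets at once, or in none if no configured category applies.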


Intentions are "predictions of future behavior." A very simple example is "Hey, I dropped my camera, guess I need to buy a new one." That's a buy intent. We have four intent types out of the box: Buy, Sell, Recommend, and Quit. Using intentions lets you find new customers as well as prevent customer churn. Unlike any other text analytics system that provides intention extraction, we don't just tell you that there is an intention; we tell you who the "intender" is, what the object of their intention is, and what the intention itself is. This lets customers act immediately on the information to jump on any opportunities to build business, as well as respond to problems without delay. In Salience Server, you can create your own intention types as well. Say you want to configure something for a "desire" or a "vote" intention: you have complete control over the intentions. Intentions are an important part of our Industry packs, as the language for an intention varies widely from industry to industry. The word "return" is a "Buy" or "Recommend" intention in the hospitality space, but is a "Quit" intention in the consumer packaged goods space.
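The industry-dependence of the word "return" can be sketched as a per-industry lookup table. This is only a toy model under stated assumptions: `INDUSTRY_TRIGGERS` and `possible_intents` are hypothetical names, the table holds a single trigger word, and the sketch does not attempt to identify the intender or the object of the intention.

```python
import re

# Hypothetical trigger table: the same word maps to different intent types per industry.
INDUSTRY_TRIGGERS = {
    "hospitality": {"return": ("Buy", "Recommend")},
    "consumer_packaged_goods": {"return": ("Quit",)},
}

def possible_intents(text, industry):
    """Return the sorted intent types suggested by trigger words for one industry."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    triggers = INDUSTRY_TRIGGERS.get(industry, {})
    intents = set()
    for word in words:
        intents.update(triggers.get(word, ()))
    return sorted(intents)

print(possible_intents("We will definitely return next summer.", "hospitality"))
print(possible_intents("I had to return the detergent.", "consumer_packaged_goods"))
```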

Sentiment Analysis

Speaking of feeling, our Sentiment Analysis feature will show you exactly how consumers feel about their subject of discussion. Our sentiment analysis is the most powerful, accurate, and reliable in the business: beyond telling you whether a given document is positive, negative, or neutral, we assign a specific score to show just how strong that sentiment is. What’s more, we attach sentiment scores to entities, themes, and facets, in addition to showing a general document sentiment score. This multi-level analysis can be configured and optimized to match your individual needs.
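The difference between a polarity label and a strength score can be shown with a small lexicon-based sketch. The word weights, the averaging rule, and the 0.25 threshold below are all assumptions chosen for illustration; they do not describe how Lexalytics computes its scores.

```python
import re

# Hypothetical sentiment lexicon: each word carries a weight between -1 and 1.
LEXICON = {"great": 1.0, "good": 0.5, "slow": -0.5, "terrible": -1.0}

def sentiment_score(text):
    """Average the weights of the lexicon words found; 0.0 if none are found."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def sentiment_label(score, threshold=0.25):
    """Collapse a numeric score into positive, negative, or neutral."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

score = sentiment_score("Great food, terrible parking, great staff")
print(round(score, 2), sentiment_label(score))
```

The same scoring function could be applied to only the sentences that mention a given entity or theme, which is one simple way to think about sentiment at more than one level.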


Summarization is meant for humans to get a quick grasp of a long document. “Long” could be a 200-page analyst report you’re reading on your laptop, or a missive from your boss that you’re trying to scan along with another 20 emails on your phone. Lexalytics has highly tunable summarization technology to give exactly the right results for your application. One of the most interesting features is the ability to give Entity Summaries, which are very useful if you’re trying to crank through a few hundred large research reports to understand just what they’re saying about the one company (of dozens) about which you need to learn.

The summaries we provide are based on the words actually in the document. We give you the most important sentences. We can also give you the summaries that are relevant to an entity, which is great for dealing with 200-page analyst reports.
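A summary built from the document's own sentences is usually called an extractive summary. The sketch below shows one simple way to do it, assuming that a sentence is "important" when its words are frequent in the document; that scoring rule, the `STOPWORDS` list, and the `summarize` function are illustrative assumptions rather than the Lexalytics method. The optional `entity` argument restricts the summary to sentences that mention a given name.

```python
import re
from collections import Counter

# Hypothetical stopword list; common words carry no weight in the scoring.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "it", "was", "for"}

def _tokens(sentence):
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]

def summarize(text, max_sentences=2, entity=None):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in _tokens(s))

    def score(sentence):
        toks = _tokens(sentence)
        return sum(freq[w] for w in toks) / len(toks) if toks else 0.0

    # For an entity summary, only consider sentences that mention the entity.
    candidates = [i for i, s in enumerate(sentences)
                  if entity is None or entity.lower() in s.lower()]
    ranked = sorted(candidates, key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]

report = "Acme revenue grew. Acme revenue beat forecasts. The weather was mild."
print(summarize(report, max_sentences=1))
print(summarize(report, entity="weather"))
```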

The Upshot

Lexalytics has spent over a decade refining our systems so that you, the busy professional, can sit back and let our products save you time, money, and headaches. Our on-premise Salience Engine and in-the-cloud Semantria API and Semantria Excel plug-in process billions of documents every day in a wide range of industries, from Hospitality to Financial Services to Customer Experience Management and beyond. Software companies around the world find our text analytics solutions easy to integrate into their own offerings, and never fail to report happy end-users. Directly or indirectly, Lexalytics provides the text mining tools that businesses need to unlock the insights hidden in unstructured text.
