It seems that Text Analytics as a technology space is coming of age, with Sue Feldman at IDC dubbing companies like ours “Eureka 2.0”:
“The sudden expansion of social networking will lead to a tsunami of unstructured data. This will lead to the emergence of ‘Eureka 2.0’ software that combines text analytics, sentiment extraction, and related technologies to distill the ‘wisdom of crowds.’”
Whilst of course this is great to hear (after all, more understanding of Text Analytics means more sales), it puts more onus on us to be at the cutting edge of technology. As with Full Text Search, technologies such as Entity Extraction and Document Level Sentiment are becoming commodities that customers expect to find on any checklist. That puts a lot of pressure on the Lexalytics Engineering team to come up with the next big thing if we are to maintain our technological edge. To that end, over the past few months we have been building a grammatical engine to complement our existing part-of-speech tagging technologies, and over the next few months we will be introducing its benefits into the core Salience Engine. Some of the things it will let us do are:
- Improved Entity Sentiment
- Automatic Brand Detection
- More complex theme extraction
The information that we derive from the grammatical parses of a document will also be used by other technologies, such as Emotive State Detection and Message Detection, to offer more ways to slice and dice your unstructured information. Over on the technical blog we are planning a series of articles about how the grammatical engine has been implemented, and you can take a look at a very early prototype at http://dev.lexalytics.com/lab/parserdemo.php where you can learn all about posterior sentence probabilities. Roll on Eureka 3.0!