Semantic technology differs from most computing in that it learns on the job. This can provide great benefits, but it can also be time-consuming. I recently talked with Jeff Catlin, CEO of Lexalytics, about the recent upgrade of their product, Salience 5.0. They came up with a clever idea to shorten the learning curve: they had their semantic engine digest Wikipedia to gain an understanding of human thought and build their Concept Matrices™. This allows it to do things that most computer technology would struggle with, such as understanding that pizza is a food even though the word “food” was never associated with pizza in the text it was looking at.
While the Wikipedia content is freely reusable under the Creative Commons Attribution-ShareAlike 3.0 Unported License, Jeff said that they made sure the Wikipedia people were comfortable with what they are doing, without asking for a formal endorsement.
These associations are established through the concept matrices. Salience 5.0 also knows that cat is more closely associated with lion than with hippo. Most semantic engines require extensive training to reach these conclusions. Devouring Wikipedia gives Salience 5.0 an “out-of-the-box” categorization system. You can then add your own categories to extend the system, and Salience will know where to put the content it encounters. Wikipedia was a good choice since it is the work of many people and is constantly refined, so there is a “wisdom of crowds” component to its content.
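To make the idea of a concept matrix more concrete, here is a toy sketch in Python. It is not Lexalytics' implementation, which was not described in detail. It assumes a much simpler technique: count which words appear together in the same article, then compare two words by the cosine similarity of their co-occurrence rows. The five one-line "articles" and the function names are made up for illustration.

```python
from collections import defaultdict
from itertools import permutations
from math import sqrt

# Hypothetical stand-ins for Wikipedia articles (invented for this sketch).
ARTICLES = [
    "cat feline whiskers claws predator mammal",
    "lion feline claws predator mane mammal",
    "hippo river herbivore mammal",
    "pizza cheese dough oven",
    "food cheese dough bread",
]


def build_concept_matrix(articles):
    """Count, for every pair of distinct words, how many articles contain both."""
    matrix = defaultdict(lambda: defaultdict(int))
    for text in articles:
        terms = set(text.split())
        for a, b in permutations(terms, 2):
            matrix[a][b] += 1
    return matrix


def similarity(matrix, a, b):
    """Cosine similarity between the co-occurrence rows of two words."""
    row_a = matrix.get(a, {})
    row_b = matrix.get(b, {})
    dot = sum(weight * row_b.get(term, 0) for term, weight in row_a.items())
    norm_a = sqrt(sum(w * w for w in row_a.values()))
    norm_b = sqrt(sum(w * w for w in row_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


matrix = build_concept_matrix(ARTICLES)

# "cat" shares most of its context with "lion" and very little with "hippo".
print(f"cat/lion:   {similarity(matrix, 'cat', 'lion'):.2f}")
print(f"cat/hippo:  {similarity(matrix, 'cat', 'hippo'):.2f}")

# "pizza" and "food" never appear in the same article, yet they share context.
print(f"pizza/food: {similarity(matrix, 'pizza', 'food'):.2f}")
```

In this toy data, "pizza" and "food" never occur together, but both occur alongside "cheese" and "dough", so they still score as related. That is the same kind of indirect association the article describes, just at a vastly smaller scale than a matrix built from all of Wikipedia.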
