What is a matrix, and what is it used for? This short article will attempt to de-mystify this fundamental concept of mathematics. Despite the complex subject matter, I'll try to keep this light and approachable.
The matrix is a fundamental object in many different fields of analysis, and natural language processing is no exception. Whether you want to study the flow of air around a speeding race car or the interaction of terms in a document, the matrix can represent the state, the changes, and the connections in a complicated problem.
What is a Matrix?
But what exactly is a matrix? A matrix is really just a rectangular grid of objects, usually numbers or mathematical expressions, arranged in rows and columns.
You can identify any single cell in the matrix by counting down the side to find its row, and then across the top to find its column.
There are many ways of building and representing matrices. In Excel, we might write "C12" (column C, row 12); in pure mathematics, the same cell would usually be written as M with the subscript 12,3 (row 12, column 3, since the row comes first). Either way, we're giving the coordinates of a particular piece of data in our matrix.
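To make the two addressing styles concrete, here is a minimal Python sketch. The helper name excel_to_row_col and the little grid are made up for illustration; the only assumption is the usual row-then-column order, with Python lists counting from 0 rather than 1.

```python
def excel_to_row_col(ref):
    """Convert an Excel-style reference such as "C12" to (row, column), both 1-based."""
    letters = "".join(ch for ch in ref if ch.isalpha()).upper()
    digits = "".join(ch for ch in ref if ch.isdigit())
    col = 0
    for ch in letters:
        # Column letters work like a base-26 number: A=1, ..., Z=26, AA=27.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits), col


# A 12-row, 3-column grid in which every cell stores its own label.
grid = [[f"r{r}c{c}" for c in range(1, 4)] for r in range(1, 13)]

row, col = excel_to_row_col("C12")
cell = grid[row - 1][col - 1]  # subtract 1 because Python counts from 0
print(row, col, cell)  # prints: 12 3 r12c3
```

The spreadsheet name and the mathematical subscript are two spellings of the same coordinates. Only the order and the starting index differ.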
You can also think of a matrix as a tool for storing information in an organized way. Given a spreadsheet (a common way to build matrices), each row and column represents a different element in a network. In that network, the cells represent connections.
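A tiny example may help here. The sketch below builds this kind of "connection" matrix (often called an adjacency matrix) for a made-up network of four people. The names and the friendships are invented purely for illustration.

```python
# Hypothetical network: four people and who knows whom.
nodes = ["Ann", "Bob", "Cy", "Dee"]
edges = [("Ann", "Bob"), ("Bob", "Cy"), ("Ann", "Dee")]

# Map each name to its row/column position in the matrix.
index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)

# Start with an n-by-n grid of zeros, then mark each connection with a 1.
adjacency = [[0] * n for _ in range(n)]
for a, b in edges:
    adjacency[index[a]][index[b]] = 1
    adjacency[index[b]][index[a]] = 1  # connections here go both ways

for name, row in zip(nodes, adjacency):
    print(f"{name:>3}", row)

# Summing a row counts how many connections that person has.
degree = {name: sum(adjacency[index[name]]) for name in nodes}
print(degree)
```

Each row and each column stands for one person, and a 1 in a cell means the two people are connected. Adding up a row is already a small piece of analysis: it tells you who is the best connected.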
What are Matrices Used For?
Technically you can use matrices to do math by hand. But this is very tedious. Imagine lots of multiplication and addition, repeated hundreds of thousands of times or more. You can do it, but you probably don't want to.
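To get a feel for how much arithmetic is involved, here is a rough sketch of the schoolbook way to multiply two matrices, with a counter added to tally the multiplications. The function name matmul and the counter are just for illustration; real numerical libraries use far more optimized routines.

```python
def matmul(a, b):
    """Multiply matrices a and b the schoolbook way and count the multiplications."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(r) != inner for r in a):
        raise ValueError("columns of a must match rows of b")
    result = [[0] * cols for _ in range(rows)]
    multiplications = 0
    for i in range(rows):          # each row of a
        for j in range(cols):      # each column of b
            for k in range(inner): # pair up the entries and add the products
                result[i][j] += a[i][k] * b[k][j]
                multiplications += 1
    return result, multiplications


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product, count = matmul(a, b)
print(product, count)  # prints: [[19, 22], [43, 50]] 8
```

Two 2-by-2 matrices already take 8 multiplications. With this approach, two 100-by-100 matrices take 100 x 100 x 100 = 1,000,000 of them, which is exactly the kind of job you'd rather hand to a machine.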
Fortunately, this sort of repetitive math is where computers excel (no pun intended). Researchers use computerized matrices to do all sorts of amazing calculations, from modeling electrons to displaying 3D shapes in video games and beyond.
In fact, you encounter matrices every day. Just think: How does Google figure out what to show you in your search results?
That's right: They use a huge web of matrices to analyze millions of webpages and the connections between them. Then, they combine that data with a bunch of other analytics to make their best guess at what you're most likely to click on.
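As a toy illustration of the idea, the sketch below ranks four made-up pages using a simplified version of the published PageRank approach. This is an assumption on my part about the flavor of the technique, not a description of what Google actually runs today. The page names, the links, and the 0.85 damping value (a conventional choice) are all hypothetical.

```python
# Hypothetical mini-web: each page lists the pages it links to.
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
pages = sorted(links)
n = len(pages)
idx = {page: i for i, page in enumerate(pages)}

# transition[i][j] = chance that a reader on page j clicks through to page i.
transition = [[0.0] * n for _ in range(n)]
for source, targets in links.items():
    for target in targets:
        transition[idx[target]][idx[source]] = 1.0 / len(targets)


def rank(transition, damping=0.85, iterations=100):
    """Repeatedly multiply the scores by the link matrix until they settle."""
    n = len(transition)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(transition[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores


scores = rank(transition)
ranking = sorted(zip(pages, scores), key=lambda pair: -pair[1])
for page, score in ranking:
    print(f"{page:<7}{score:.3f}")
```

In this toy web, "home" comes out on top because every other page links to it, and "orphan" lands at the bottom because nothing links to it. The whole calculation is one matrix multiplication, repeated until the scores stop changing.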
If you want to learn even more about matrices, give it a search in your search engine of choice, and a matrix will point the way for you.
How are Matrices Used in Natural Language Processing?
Many of the matrix techniques developed for one field have a way of spreading to others. Research in natural language processing, for example, has started borrowing some of the same techniques used by physicists to describe the laws of reality.
Many powerful machine learning algorithms can be run on matrices. Matrix factorization takes a complicated mess of data and finds simpler, more compact explanations. And eigenvector analysis can tell you which items in a matrix are the most important.
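Here is a small sketch of both ideas at once, under some loud assumptions: the term counts are invented, the function name rank_one_factor is made up, and the method (simple power iteration) is only one basic way to do this. Production libraries use more robust routines. The sketch squeezes a 3-by-3 table of word counts into one weight per term, one weight per document, and a single scale factor.

```python
import math


def rank_one_factor(matrix, iterations=100):
    """Approximate matrix as sigma * u[i] * v[j] using power iteration.

    Assumes the matrix is not all zeros.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    sigma, u = 0.0, [0.0] * rows
    for _ in range(iterations):
        # Term weights: multiply the matrix by the document weights, then rescale.
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        u_len = math.sqrt(sum(x * x for x in u))
        u = [x / u_len for x in u]
        # Document weights: multiply the transposed matrix by the term weights.
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return sigma, u, v


# Hypothetical counts: rows are terms, columns are three documents.
terms = ["matrix", "vector", "grid"]
counts = [
    [2, 4, 6],
    [1, 2, 3],
    [3, 6, 8],
]

sigma, u, v = rank_one_factor(counts)
approx = [[sigma * u[i] * v[j] for j in range(3)] for i in range(3)]

total = sum(x * x for row in counts for x in row)
explained = sigma ** 2 / total  # share of the data captured by the single factor
top_term = terms[max(range(len(terms)), key=lambda i: u[i])]
print(f"explained: {explained:.3f}, most important term: {top_term}")
```

Nine numbers shrink to seven, and the approximation still captures over 99% of this (deliberately tidy) data. The document weights v are also the dominant eigenvector of the counts matrix multiplied by its own transpose, which is where eigenvector analysis comes in: the largest weights point at the most important items.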
Generally speaking, in natural language processing, we use matrix-based algorithms to understand the construction (syntax) of a sentence.
At Lexalytics, for example, we used a complex network of matrices to build our Concept Matrix. This tool, representing the connections between words inside the top 600,000 Wikipedia™ articles, plays a large role in our natural language processing features, such as Categorization.