Well, that may be overstating things a bit, but now that I’ve got your attention, take a look at the Calais Viewer. Paste some text into the entry field and submit it, and the viewer uses natural language processing to parse the text for specific meaning.
My friend Eric Hoffer was a guest speaker at the Web 2.0 Day I co-led at SHU, and he turned us on to this. It’s not perfect, but by and large it does an amazing job of picking out persons vs. political persons, places, cities, countries, industry terms and educational references. It correctly identified “Rolling Stones” as an “organization,” not more-than-one moving rocks.
I took a few paragraphs from a New York Times business section story on Citigroup:
Citigroup said Friday morning that it lost $2.5 billion, or 54 cents a share, in the second quarter.
The loss was largely caused by $7.2 billion of write-downs of Citigroup’s investments in mortgages and other loans and by a weakness in the consumer market, which cost Citigroup $4.4 billion in credit losses and $2.5 billion to increase reserves. Analysts had expected a loss of 66 cents a share.
But the chief executive, Vikram Pandit, positioned the $2.5 billion loss as progress. Last quarter, the financial conglomerate lost $5.1 billion.
“We cut our second-quarter losses in half compared to the first quarter,” Mr. Pandit said in a statement. “While there is still much to do, we are encouraged by our progress.”
It correctly identified Citigroup as a company, and Mr. Pandit as a person. That was easy. And it identified the last paragraph as a quotation. But it also parsed the third paragraph’s “the chief executive, Vikram Pandit” as a “professional person” and “quarter, the financial conglomerate lost $5.1 billion” as a company earnings announcement.
What’s the point? Well, think about it the next time your Google search returns three million results. Accurately contextualizing a common word with multiple meanings (“orange”) opens up real possibilities for getting computers to understand what you’re looking for. Take this passage, for example:
John Nichol Lindsley, Vice President of the Orange national bank of Orange, N.J., and head of the hardware business of John N. Lindsley, Inc., founded more than a century ago by his great-grandfather, died yesterday at his home, 76 Cleveland Street, Orange. He had been President of the Board of Trustees of the First Presbyterian Church of Orange for twenty-five years. He was 72 years old. Mr. Lindsley was a director of the Orange Savings Bank and had been President of the Orange Police Commission.
“Orange” is a city; “Orange Savings Bank” is determined to be a company; “Orange Police Commission” an organization. “John N. Lindsley, Inc.” is a company, but “Mr. Lindsley” is a person. And so on.
Cut that three million down to a manageable few thousand and you’re getting somewhere!
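For the curious, here’s a rough sketch in Python of why typed entities shrink a result set. To be clear, this is a toy: the documents, the entity tags and the function names (`keyword_search`, `typed_search`) are all made up for illustration, and it says nothing about how Calais itself works or how you would call it.

```python
# Toy illustration only: the documents and entity tags below are invented,
# and nothing here reflects the real Calais service or its API.
from dataclasses import dataclass, field


@dataclass
class Doc:
    title: str
    # Maps an entity name to the type a tagger assigned it, e.g. "City".
    entities: dict = field(default_factory=dict)


DOCS = [
    Doc("Lindsley obituary", {"Orange": "City", "Orange Savings Bank": "Company"}),
    Doc("Citrus growers report", {"orange": "Fruit"}),
    Doc("Mobile carrier earnings", {"Orange": "Company"}),
]


def keyword_search(docs, word):
    """Plain keyword match: any doc mentioning the word, whatever it means."""
    word = word.lower()
    return [d for d in docs if any(word in name.lower() for name in d.entities)]


def typed_search(docs, word, entity_type):
    """Only docs where the word itself was tagged with the requested type."""
    word = word.lower()
    return [
        d
        for d in docs
        if any(
            name.lower() == word and kind == entity_type
            for name, kind in d.entities.items()
        )
    ]


print(len(keyword_search(DOCS, "orange")))                      # 3
print([d.title for d in typed_search(DOCS, "orange", "City")])  # ['Lindsley obituary']
```

The plain keyword search returns all three documents, while asking for “orange” as a city returns only one. Scale that up and you can see where the three million results start to go.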