Business analytics meets natural language processing
With SAS and Teragram, text mining and enterprise search come together for a deep dive into your corporate documents
What do you get when you cross powerful business analytics with strong search and natural language processing technologies? Actual answers to the questions that executives and managers ask – not just of numbers, but of in-depth, text-based documents as well.
At SAS Global Forum 2008 in San Antonio, Texas, SAS announced the acquisition of natural language processing (NLP) leader Teragram to enhance SAS business analytics offerings – including text mining, business intelligence and data integration – and to extend SAS Analytics into enterprise and mobile search.
Founded in 1997 by Yves Schabes and Emmanuel Roche, Teragram uses NLP technologies to distill relevant information from text. Teragram performs efficient searches and better organizes text-based information in more than 30 languages to help customers reach new markets and make better decisions.
Why acquire Teragram?
SAS generally acquires companies to gain specific and complementary technologies that enhance or extend its own software, and Teragram is no exception to this rule. Teragram offers the speed, scalability, accuracy and global language support that customers need to retrieve, organize and analyze growing volumes of digital information.
For example, let’s say a business manager wants to know the revenue for the fourth quarter of 2007 for a specific product line. He types the question into his BlackBerry and the exact answer, along with the documents that support it, is revealed immediately.
And that’s just one example.
Teragram enhances SAS text mining offerings by providing:
In essence, this is the perfect partnership: The award-winning text analytics capabilities of SAS will now sit atop a new revved-up engine for NLP. Teragram provides text parsing and rule generation for users who know what they are looking for. SAS text mining provides the intelligence to figure out what business leaders need to know when they don’t know what they’re looking for.
Teragram uses pre-defined or custom-defined rules to detect specific patterns within documents. For example, a brand-management solution needs to analyze documents to detect mentions of key opinion leaders. A custom rule to identify subject-verb-object phrases will reveal patterns such as “James Goodnight works at SAS.” NLP and rules generation provided by Teragram allow deep parsing and a deeper understanding of individual documents, offering great benefits to a wide variety of applications such as warranty analysis, fraud detection, sentiment analysis, brand management, voice of the customer and much more.
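To make the idea of a custom rule concrete, here is a minimal sketch in Python. It is not Teragram's rule syntax, which this article does not describe; the pattern, the name WORKS_AT_RULE and the function extract_employment_facts are all hypothetical. It only shows how a hand-written pattern can pull a subject-verb-object triple out of a sentence.

```python
import re

# Hypothetical rule (not Teragram syntax): a capitalized name, the verb
# phrase "works at", then a capitalized organization name.
WORKS_AT_RULE = re.compile(
    r"\b(?P<subject>[A-Z][a-z]+(?: [A-Z][a-z]+)*) "
    r"(?P<verb>works at) "
    r"(?P<object>[A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*)"
)


def extract_employment_facts(text):
    """Return (subject, verb, object) triples matched by the rule."""
    return [
        (m.group("subject"), m.group("verb"), m.group("object"))
        for m in WORKS_AT_RULE.finditer(text)
    ]


print(extract_employment_facts("James Goodnight works at SAS."))
```

A production rule would rely on real linguistic parsing rather than capitalization, but the shape is the same: the rule author states the pattern in advance, and the engine reports every place it occurs.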
Machine learning algorithms in SAS Text Miner are then added to automatically generate rules based on supervised and unsupervised learning. These rules can be deployed within a user’s business environment to discover previously hidden trends and associations within their documents.
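As a rough illustration of generating rules from labeled examples, the sketch below derives keyword rules from a handful of documents. It is a deliberately simple stand-in, not the algorithm used by SAS Text Miner; the function learn_keyword_rules and the sample labels are assumptions made for this example.

```python
import re
from collections import defaultdict


def learn_keyword_rules(labeled_docs):
    """Return {label: words that occur only in documents with that label}.

    labeled_docs is a list of (text, label) pairs. This is a toy form of
    supervised rule generation, used here for illustration only.
    """
    seen = defaultdict(set)
    for text, label in labeled_docs:
        seen[label].update(re.findall(r"[a-z]+", text.lower()))

    rules = {}
    for label, words in seen.items():
        # Collect every word seen under any other label.
        others = set().union(*(w for l, w in seen.items() if l != label))
        rules[label] = words - others
    return rules


examples = [
    ("battery defect reported", "warranty"),
    ("suspicious card charge reported", "fraud"),
]
print(learn_keyword_rules(examples))
```

Words shared by both classes, such as "reported" here, are dropped because they do not help tell the classes apart.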
Combining the rules-based approach of Teragram with the discovery approach of SAS Text Miner provides the best of both worlds: a broader detection of issues that allows customers to have a deeper understanding of why things happen.
Searching for business answers
Enterprise search presents a number of challenges that differ from standard Web searches. In particular, text-based documents stored across the enterprise do not include the inherent, manual metadata that hyperlinks provide on the Web. Web pages that link to one another across the Internet create a hierarchy and content grouping that is not captured or defined in most enterprise document stores. Teragram tackles that problem by scanning and processing the words and phrases within a collection of documents to surface the connections among them and to document the natural hierarchy they form.
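One simple way to picture how connections can be surfaced from content alone is to compare the vocabulary of each pair of documents. The sketch below uses Jaccard similarity over word sets; this is a generic technique chosen for illustration, not a description of Teragram's method, and the stopword list, threshold and document names are assumptions.

```python
import re
from itertools import combinations

# Small illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is", "on"}


def content_terms(text):
    """Lowercase the text and keep the distinct non-stopword terms."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Share of terms two sets have in common (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_documents(docs, threshold=0.3):
    """Return (name, name, score) for each pair at or above the threshold."""
    terms = {name: content_terms(text) for name, text in docs.items()}
    links = []
    for left, right in combinations(sorted(terms), 2):
        score = jaccard(terms[left], terms[right])
        if score >= threshold:
            links.append((left, right, round(score, 2)))
    return links


docs = {
    "q4_report": "Revenue for the printer product line in fourth quarter",
    "sales_memo": "Printer product line revenue rose in fourth quarter",
    "hr_policy": "Vacation policy for new employees",
}
print(related_documents(docs))
```

Here the report and the memo are linked because they share most of their terms, while the HR policy stands apart. No hyperlinks are needed to find the grouping.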
Teragram’s categorization technologies instantly classify documents according to custom criteria and apply those rules throughout the organization. This enables faster and more accurate access to documents organized by specific topics that match the interest of a given user, regardless of the original document’s location.
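Classification by custom criteria can be sketched in a few lines. The example below is a hypothetical keyword-based categorizer; the category names, keyword sets and the function categorize are assumptions, and a real categorization engine would use far richer rules than single-word matches.

```python
import re

# Hypothetical custom criteria: each category is defined by trigger words.
CATEGORY_RULES = {
    "warranty": {"warranty", "defect", "repair"},
    "fraud": {"fraud", "chargeback", "suspicious"},
}


def categorize(text, rules=CATEGORY_RULES):
    """Return the sorted list of categories whose keywords appear in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(cat for cat, keywords in rules.items() if words & keywords)


print(categorize("Customer reports a defect and requests repair under warranty"))
print(categorize("Suspicious chargeback on a repair invoice"))
```

Because the rules live in one place, the same criteria can be applied to every document store in the organization, and a document can belong to more than one category.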
Together, these technologies scan structured corporate databases and unstructured sources including text-based reports and Web pages to provide comprehensive answers from multiple information sources.
“All the solutions that we provide leverage the same idea: getting better access to content,” says Teragram’s Schabes.
When combined with SAS, Teragram’s sophisticated search capabilities will deliver an easy-to-use environment that extends the availability of BI throughout organizations. The combination also improves data integration capabilities and offers indexing for information management that is driven not just by a report’s header, but by its actual content and the metadata associated with it.
Customers to benefit from acquisition
Banks use text mining to analyze transcripts of customer calls, along with related metadata such as call length, hold time and emotion/stress detection. The results help them gauge customer satisfaction and sentiment, support brand management, and predict churn, credit risk and fraud.
Content-rich industries such as the news media analyze and organize document content with text mining, creating taxonomies to track and organize information as news appears from reporters in the field, wire services and partner outlets.
At more than 44,000 sites, SAS customers represent every major industry and include most Fortune 500® companies. Customers of both Teragram and SAS include: the Associated Press, CNN, The New York Times, Reed Business Information, Ricoh, Sony, The Washington Post, Wolters Kluwer, the World Bank and Yahoo!
SAS’ pragmatic approach to product development ensures that customer input influences product direction. Customers around the globe and in nearly every industry have varied needs to access and analyze unstructured content.
SAS sees a huge opportunity to drive unstructured data across the SAS Enterprise Intelligence Platform in response to this need. The combination of NLP, rules generation and machine learning solidifies SAS’ ability to be proactive in a fast-changing unstructured data market.
This story appears in the Third Quarter 2008 issue of