As big data grows, irrelevant "noise" in content needs to be identified and eliminated. SAS Enterprise Content Categorization add-on modules enable richer processing at the level of words, linguistic relations and word meanings, solving common problems caused by excessive volumes of electronic information.
The customizable modules provide quick starts to industry taxonomies, effective faceted search, meaningful summaries of materials, real-time alerts to new content availability and more.
Help users identify the information they need, quickly.
When documents are classified based on their content, retrieval improves because searches return more meaningful, relevant information. Relying on a limited set of predefined keywords alone is insufficient.
Add-on capabilities include search and indexing to retrieve information based on facets defined by the content itself; a crawler that automatically downloads requested documents from internal file systems and the Internet; a text summarization module that identifies the most meaningful sentences in a document and delivers them in condensed form; and a scalable, real-time alert service that notifies users through a range of alert media, including email and instant messaging.
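To make the idea of picking out a document's most meaningful sentences concrete, here is a minimal sketch of extractive summarization. It is not the SAS Text Summarization implementation, which is not described here; it assumes a simple word-frequency score, and the function name `summarize` and its `max_sentences` parameter are hypothetical.

```python
import re
from collections import Counter


def summarize(text: str, max_sentences: int = 2) -> list[str]:
    """Return the highest-scoring sentences, kept in their original order.

    Assumption: a sentence is "meaningful" if its words are frequent
    in the document as a whole. Real products use richer linguistics.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Count how often each lowercase word appears in the whole text.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in keep]
```

A practical version would at least ignore stop words such as "the" and "and", which otherwise dominate the frequency counts.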
Jump-start your categorization efforts.
The SAS Industry Taxonomy Rules provide an extensive, prebuilt suite of terms, entities, attributes and their hierarchical relationships to jump-start categorization projects. Once taxonomies are established and any linguistic rules are developed, the SAS Search and Indexing add-on module can be applied to automatically discern query semantics and enable superior drill-down capabilities that enhance users' investigative techniques.
Narrowing down information to just the relevant sources, this add-on applies stemming and automatic spelling correction to enable richer preprocessing and provide more accurate, relevant search results.
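As an illustration of what stemming and spelling correction do to a query before it is matched, here is a minimal sketch. It is an assumption-laden stand-in, not the SAS Search and Indexing add-on's behavior: the suffix list, the `cutoff` of 0.8 and the names `stem`, `correct` and `normalize_query` are all hypothetical, and the spelling step simply uses the standard library's `difflib.get_close_matches`.

```python
import difflib

# Hypothetical, deliberately tiny suffix list; real stemmers use many more rules.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word: str) -> str:
    """Strip one common English suffix, leaving at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def correct(word: str, vocabulary: list[str]) -> str:
    """Replace a word with its closest vocabulary entry, if one is close enough."""
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else word


def normalize_query(query: str, vocabulary: list[str]) -> list[str]:
    """Lowercase, spell-correct, then stem each whitespace-separated term."""
    return [stem(correct(w, vocabulary)) for w in query.lower().split()]
```

With a vocabulary of `["searching", "documents", "indexed"]`, the misspelled query `"serching documnts"` normalizes to the terms `search` and `document`, so it can match content indexed under the same stems.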
Purge content chaos that spans multiple enterprise repositories.
Enterprise repositories often contain many documents that have been duplicated, or edited and republished. Extending the categorization of similar content, the SAS Document Duplication Detection add-on helps organizations minimize their content stores by measuring materials against a similarity threshold and maintaining only those that meet the standard.
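One common way to compare documents against a similarity threshold is to break each into overlapping word sequences ("shingles") and measure the Jaccard overlap between the resulting sets. The sketch below shows that general technique only; it is not a description of how SAS Document Duplication Detection works, and the function names, the shingle size of 3 and the default threshold of 0.8 are assumptions chosen for illustration.

```python
def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping k-word sequences in a text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts become a single shingle (or none, if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_near_duplicates(
    docs: dict[str, str], threshold: float = 0.8
) -> list[tuple[str, str, float]]:
    """Return every pair of document names whose similarity meets the threshold."""
    names = sorted(docs)
    sets = {name: shingles(docs[name]) for name in names}
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            sim = jaccard(sets[first], sets[second])
            if sim >= threshold:
                pairs.append((first, second, sim))
    return pairs
```

Comparing every pair is quadratic in the number of documents, so a repository-scale system would need an indexing scheme such as MinHash to stay practical; this sketch leaves that out.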
- SAS® Industry Taxonomy Rules
- SAS® Document Duplication Detection
- SAS® Search and Indexing
- SAS® Text Summarization
- SAS® Crawler
- SAS® Content Categorization Information Workbench
- SAS® Content Alerts
- SAS® Text Data Language Pack