
Natural Language Processing (NLP)
What it is and why it matters
Natural language processing (NLP) is a branch of artificial intelligence that helps computers understand, interpret and manipulate human language. NLP draws from many disciplines, including computer science and computational linguistics, in its effort to bridge the gap between human communication and computer understanding.
Evolution of natural language processing
While natural language processing isn’t a new science, the technology is advancing rapidly thanks to increased interest in human-to-machine communication, plus the availability of big data, powerful computing and enhanced algorithms.
As a human, you may speak and write in English, Spanish or Chinese. But a computer’s native language – known as machine code or machine language – is largely incomprehensible to most people. At your device’s lowest levels, communication occurs not with words but through millions of zeros and ones that produce logical actions.
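To make the "zeros and ones" idea concrete, here is a minimal Python sketch of how a short piece of text can be stored as bits. It assumes UTF-8 encoding, and the helper names `to_bits` and `from_bits` are hypothetical, chosen only for illustration. Note that this shows how text is represented as data, not the machine-code instructions a processor executes.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of `text` as space-separated 8-bit groups."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Rebuild the original text from space-separated 8-bit groups."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


encoded = to_bits("Hi")
print(encoded)             # 01001000 01101001
print(from_bits(encoded))  # Hi
```

Even a two-letter greeting becomes sixteen binary digits, which hints at why working directly at this level is impractical for most people.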
Indeed, programmers used punch cards to communicate with the first computers decades ago. This manual and arduous process was understood by a relatively small number of people. These days, you can use generative AI (GenAI) models such as ChatGPT to create code, brainstorm new ideas or summarize research topics.
This technology is made possible by large language models (LLMs) using NLP, along with other AI elements like machine learning and deep learning.
Synthetic data and its many uses
Synthetically generated text is often used with NLP models. Want to learn more about what synthetic data is, why it’s so valuable, and how it’s being used today? Watch this explainer video with Brett Wujek, who leads product strategy for next-generation AI technologies at SAS, to hear why synthetic data is so important for the future.

NLP in today’s world
Why is NLP important?
Kia uses AI and advanced analytics to decipher meaning in customer feedback
Kia Motors America regularly collects feedback from vehicle owner questionnaires to uncover quality issues and improve products. But understanding and categorizing customer responses can be difficult. With natural language processing from SAS, Kia can make sense of the feedback. An NLP model automatically categorizes and extracts the complaint type in each response, so quality issues can be addressed in the design and manufacturing process for existing and future vehicles.
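As a rough illustration of what categorizing free-text feedback can look like, the Python sketch below assigns each response to the category with the most keyword matches. This is not the method used by Kia or SAS, which the article does not describe. The categories, keywords and the `categorize` function are all hypothetical, and a production model would typically be trained on labeled data rather than hand-written rules.

```python
import re

# Hypothetical complaint categories and trigger keywords, invented for illustration.
CATEGORY_KEYWORDS = {
    "brakes": {"brake", "brakes", "braking"},
    "infotainment": {"screen", "bluetooth", "radio", "navigation"},
    "engine": {"engine", "stall", "stalls", "stalling"},
}


def categorize(response: str) -> str:
    """Return the category with the most keyword hits, or 'other' if none match."""
    tokens = re.findall(r"[a-z]+", response.lower())
    hits = {
        category: sum(token in keywords for token in tokens)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "other"


for text in [
    "The brakes squeal and braking feels soft.",
    "Bluetooth keeps disconnecting from the screen.",
    "Love the car!",
]:
    print(categorize(text), "<-", text)
```

Simple keyword rules like these are easy to read but miss synonyms, misspellings and context, which is one reason statistical and machine learning models are often preferred for this kind of task.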
How does NLP work?
Breaking down the elemental pieces of language
Natural language processing includes many different techniques for interpreting human language, ranging from statistical and machine learning methods to rules-based and algorithmic approaches. A broad array of approaches is needed because text data and voice-based data vary widely, as do their practical applications.
Basic NLP tasks include tokenization and parsing, lemmatization/stemming, part-of-speech tagging, language detection and identification of semantic relationships. If you ever diagrammed sentences in grade school, you’ve done these tasks manually before.
In general terms, NLP tasks break down language into shorter, elemental pieces, try to understand relationships between the pieces and explore how the pieces work together to create meaning.
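A few of these basic tasks can be sketched in plain Python. The example below is a deliberately simplified illustration, not a production approach: the stopword lists, the suffix list and the function names `tokenize`, `stem` and `detect_language` are assumptions made for this sketch, and real systems rely on much larger linguistic resources or trained models.

```python
import re

# Tiny, hypothetical stopword lists; real systems use far larger resources.
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in"},
    "spanish": {"el", "la", "y", "es", "de", "en"},
}

# Crude English suffixes for illustration only; checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Tokenization: lowercase the text and split it into word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def stem(token):
    """Stemming: strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def detect_language(tokens):
    """Language detection: pick the language whose stopwords appear most often."""
    counts = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"


tokens = tokenize("The dogs barked and jumped in the park.")
print(tokens)
print([stem(t) for t in tokens])
print(detect_language(tokens))
```

Each function takes the output of the previous step and reduces the sentence to smaller pieces that are easier to compare and count, which is the pattern the higher-level capabilities build on.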
These underlying tasks are often used in higher-level NLP capabilities, such as:
- Content categorization provides a linguistic-based document summary, including search and indexing, content alerts and duplication detection.
- Large language model (LLM)-based classification, particularly BERT-based classification, is used to capture the context and meaning of words in a text to improve accuracy compared to traditional models.
- Corpus analysis is used to understand corpus and document structure through output statistics for tasks such as sampling effectively, preparing data as input for further models and strategizing modeling approaches.
- Contextual extraction automatically pulls structured information from text-based sources.
- Sentiment analysis identifies the mood or subjective opinions within a piece of text (as well as large amounts of text), including average sentiment and opinion mining.
- Speech-to-text and text-to-speech conversion transforms voice commands into written text, and vice versa.
- Document summarization automatically generates synopses of large bodies of text and detects represented languages in multi-lingual corpora (documents).
- Machine translation automatically translates text or speech from one language to another.
In all these cases, the overarching goal is to take language input and use linguistics and algorithms to transform or enrich the text in such a way that it delivers greater value.
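Sentiment analysis, including the average sentiment mentioned above, can be illustrated with a small lexicon-based scorer. This is a minimal sketch under stated assumptions: the word lists are hypothetical and far too small for real use, the negation handling only looks at the immediately preceding word, and modern systems more commonly use trained models.

```python
import re

# Hypothetical mini-lexicons, invented for illustration.
POSITIVE = {"good", "great", "love", "excellent", "reliable"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "noisy"}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(text):
    """Add +1 or -1 per lexicon word, flipping the sign right after a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    return score


def average_sentiment(texts):
    """Mean score across several pieces of text (0.0 for an empty list)."""
    if not texts:
        return 0.0
    return sum(sentiment_score(t) for t in texts) / len(texts)


reviews = [
    "The seats are great and the engine is reliable.",
    "Not good, the cabin is noisy.",
    "Delivered on Tuesday.",
]
for review in reviews:
    print(sentiment_score(review), review)
print(average_sentiment(reviews))
```

The input is raw text and the output is a number that can be sorted, averaged or charted, which is one simple way language input gets transformed into something of greater analytical value.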
NLP methods and applications
How computers make sense of textual data
SAS® Visual Text Analytics
How can you find answers in large volumes of textual data? By combining machine learning with natural language processing and text analytics. Find out how your unstructured data can be analyzed to identify issues, evaluate sentiment, detect emerging trends and spot hidden opportunities.
Recommended reading
- Fishing for the freshest data: Leading the global seafood market with analytics. The Norwegian Seafood Council uses SAS to give Norwegian seafood exporters a competitive advantage.
- Manufacturing smarter, safer vehicles with analytics. Kia Motors America relies on advanced analytics and artificial intelligence solutions from SAS to improve its products, services and customer satisfaction.
- Reducing hospital-acquired infections with artificial intelligence. Hospitals in the Region of Southern Denmark aim to increase patient safety using analytics and AI solutions from SAS.
- Your personal data scientist. Imagine pushing a button on your desk and asking for the latest sales forecasts the same way you might ask Siri for the weather forecast. Find out what else is possible with a combination of natural language processing and machine learning.