Natural Language Processing (NLP)
What it is and why it matters
Natural language processing (NLP) is a branch of artificial intelligence that helps computers understand, interpret and manipulate human language. NLP draws from many disciplines, including computer science and computational linguistics, in its pursuit to fill the gap between human communication and computer understanding.
Evolution of natural language processing
While natural language processing isn’t a new science, the technology is rapidly advancing thanks to an increased interest in human-to-machine communications, plus an availability of big data, powerful computing and enhanced algorithms.
As a human, you may speak and write in English, Spanish or Chinese. But a computer’s native language – known as machine code or machine language – is largely incomprehensible to most people. At your device’s lowest levels, communication occurs not with words but through millions of zeros and ones that produce logical actions.
Indeed, programmers used punch cards to communicate with the first computers 70 years ago. This manual and arduous process was understood by a relatively small number of people. Now you can say, “Alexa, I like this song,” and a device playing music in your home will lower the volume and reply, “OK. Rating saved,” in a humanlike voice. Then it adapts its algorithm to play that song – and others like it – the next time you listen to that music station.
Let’s take a closer look at that interaction. Your device activated when it heard you speak, understood the unspoken intent in the comment, executed an action and provided feedback in a well-formed English sentence, all in the space of about five seconds. The complete interaction was made possible by NLP, along with other AI elements such as machine learning and deep learning.
Community outreach and support for COPD patients enhanced through natural language processing and machine learning
The COPD Foundation uses text analytics and sentiment analysis, NLP techniques, to turn unstructured data into valuable insights. These findings help provide health resources and emotional support for patients and caregivers. Learn more about how analytics is improving the quality of life for those living with pulmonary disease.
Why is NLP important?
Large volumes of textual data
Natural language processing helps computers communicate with humans in their own language and scales other language-related tasks. For example, NLP makes it possible for computers to read text, hear speech, interpret it, measure sentiment and determine which parts are important.
Today’s machines can analyze more language-based data than humans, without fatigue and in a consistent, unbiased way. Considering the staggering amount of unstructured data that’s generated every day, from medical records to social media, automation will be critical to fully analyze text and speech data efficiently.
Structuring a highly unstructured data source
Human language is astoundingly complex and diverse. We express ourselves in infinite ways, both verbally and in writing. Not only are there hundreds of languages and dialects, but within each language is a unique set of grammar and syntax rules, terms and slang. When we write, we often misspell or abbreviate words, or omit punctuation. When we speak, we have regional accents, and we mumble, stutter and borrow terms from other languages.
While supervised and unsupervised learning, and specifically deep learning, are now widely used for modeling human language, there’s also a need for syntactic and semantic understanding and domain expertise that are not necessarily present in these machine learning approaches. NLP is important because it helps resolve ambiguity in language and adds useful numeric structure to the data for many downstream applications, such as speech recognition or text analytics.
NLP in today’s world
Learn more about natural language processing in many industries
Planning for NLP
How are organizations around the world using artificial intelligence and NLP? What are the adoption rates and future plans for these technologies? What are the budgets and deployment plans? And what business problems are being solved with NLP algorithms? Find out in this report from TDWI.
The untapped potential in unstructured text
People’s thoughts, research, opinions, facts and feedback transfer into the digital world through social media feeds, legal case files, electronic health records, contact center logs, warranty claims and more. Natural language processing uncovers the insights hidden in the word streams.
What can text analytics do for your organization?
Text analytics is a type of natural language processing that turns text into data for analysis. Learn how organizations in banking, health care and life sciences, manufacturing and government are using text analytics to drive better customer experiences, reduce fraud and improve society.
How does NLP work?
Breaking down the elemental pieces of language
Natural language processing includes many different techniques for interpreting human language, ranging from statistical and machine learning methods to rules-based and algorithmic approaches. We need a broad array of approaches because the text- and voice-based data varies widely, as do the practical applications.
Basic NLP tasks include tokenization and parsing, lemmatization/stemming, part-of-speech tagging, language detection and identification of semantic relationships. If you ever diagramed sentences in grade school, you’ve done these tasks manually before.
In general terms, NLP tasks break down language into shorter, elemental pieces, try to understand relationships between the pieces and explore how the pieces work together to create meaning.
These underlying tasks are often used in higher-level NLP capabilities, such as:
- Content categorization. A linguistic-based document summary, including search and indexing, content alerts and duplication detection.
- Topic discovery and modeling. Accurately capture the meaning and themes in text collections, and apply advanced analytics to text, like optimization and forecasting.
- Corpus Analysis. Understand corpus and document structure through output statistics for tasks such as sampling effectively, preparing data as input for further models and strategizing modeling approaches.
- Contextual extraction. Automatically pull structured information from text-based sources.
- Sentiment analysis. Identifying the mood or subjective opinions within large amounts of text, including average sentiment and opinion mining.
- Speech-to-text and text-to-speech conversion. Transforming voice commands into written text, and vice versa.
- Document summarization. Automatically generating synopses of large bodies of text and detect represented languages in multi-lingual corpora (documents).
- Machine translation. Automatic translation of text or speech from one language to another.
In all these cases, the overarching goal is to take raw language input and use linguistics and algorithms to transform or enrich the text in such a way that it delivers greater value.
NLP methods and applications
How computers make sense of textual data
NLP and text analytics
Natural language processing goes hand in hand with text analytics, which counts, groups and categorizes words to extract structure and meaning from large volumes of content. Text analytics is used to explore textual content and derive new variables from raw text that may be visualized, filtered, or used as inputs to predictive models or other statistical methods.
NLP and text analytics are used together for many applications, including:
- Investigative discovery. Identify patterns and clues in emails or written reports to help detect and solve crimes.
- Subject-matter expertise. Classify content into meaningful topics so you can take action and discover trends.
- Social media analytics. Track awareness and sentiment about specific topics and identify key influencers.
Everyday NLP examples
There are many common and practical applications of NLP in our everyday lives. Beyond conversing with virtual assistants like Alexa or Siri, here are a few more examples:
- Have you ever looked at the emails in your spam folder and noticed similarities in the subject lines? You’re seeing Bayesian spam filtering, a statistical NLP technique that compares the words in spam to valid emails to identify junk mail.
- Have you ever missed a phone call and read the automatic transcript of the voicemail in your email inbox or smartphone app? That’s speech-to-text conversion, an NLP capability.
- Have you ever navigated a website by using its built-in search bar, or by selecting suggested topic, entity or category tags? Then you’ve used NLP methods for search, topic modeling, entity extraction and content categorization.
A subfield of NLP called natural language understanding (NLU) has begun to rise in popularity because of its potential in cognitive and AI applications. NLU goes beyond the structural understanding of language to interpret intent, resolve context and word ambiguity, and even generate well-formed human language on its own. NLU algorithms must tackle the extremely complex problem of semantic interpretation – that is, understanding the intended meaning of spoken or written language, with all the subtleties, context and inferences that we humans are able to comprehend.
The evolution of NLP toward NLU has a lot of important implications for businesses and consumers alike. Imagine the power of an algorithm that can understand the meaning and nuance of human language in many contexts, from medicine to law to the classroom. As the volumes of unstructured information continue to grow exponentially, we will benefit from computers’ tireless ability to help us make sense of it all.
What to read next
- 5 ways to measure beehive health with analytics and hive-streaming dataThis analytical approach to understanding bee hive health can automatically alert beekeepers to changes in hive weights, temperatures, flight activity and more.
- IoT in health care: Unlocking true, value-based careGiven the potential of IoT – and the challenges of already overburdened health care systems around the world – we can’t afford not to integrate IoT in health care.
- Big data in government: How data and analytics power public programsBig data generated by government and private sources coupled with analytics has become a crucial component for a lot of public-sector work. Why? Because using analytics can improve outcomes of public programs.
- Artificial intelligence, machine learning, deep learning and moreArtificial intelligence, machine learning and deep learning are set to change the way we live and work. How do they relate and how are they changing our world?
- 5 machine learning mistakes and how to avoid themMachine learning is not magic. It presents many of the same challenges as other analytics methods. Learn how to overcome those challenges and incorporate this technique into your analytics strategy.
- Can data sharing lead to cancer discoveries?Clinical trials can bring new drugs – and new hope – to the market for cancer patients. Now, a new data sharing platform for clinical trial data brings even more hope.
- Analytic simulations: Using big data to protect the tiniest patientsAnalytic models help researchers discover the best way to care for babies in the NICU, saving lives (and millions of dollars) in the process.
- Smart cities, smart energy solutions – thanks to the IoTFind out how Envision America and CPS Energy are using the IoT and analytics to make cities smarter and transform energy programs.