Text mining for safety

Develop untapped data sources for health, safety and environmental improvements

by Bill Tuzin, Senior Oil and Gas Solution Architect, SAS

Industrial safety theory holds that each significant accident is preceded by upward of 600 less significant incidents. Thus, to be effective at predicting significant accidents, incident management systems and hazard observation programs (IM/HO) must have high volumes of input from the field.

Unfortunately, only a few organizations have implemented IM/HO systems that support both high transaction volumes and data that can be analyzed for new hazards. Many factors inhibit the efficacy of these systems, including cumbersome data entry, confusing workflows and recollection decay.

For oil and gas companies to overcome those challenges and address inherent risks in the exploration, refinement and distribution of petrochemical products, an advanced system is needed. New techniques and technologies, with a focus on text analytics, can make all health, safety and environmental protection (HSE) programs more effective by:

  • Improving the quality of data collected.
  • Increasing the utilization of the incident management system.
  • Automating manual processes.
  • Using data-based analytics to support decisions.

Let's consider each of these in more detail, with a focus on practical tips for each area.

Improve the quality of data collected
The ideal solution for improving data quality would relieve the person entering data of the burden of specifying many structured data fields, while still providing the structured data that HSE personnel need for reporting and analysis. SAS has proven technologies that can interpret textual data reliably to determine the appropriate values for the structured data fields. This capability has been honed in numerous installations where customer complaints are analyzed and emerging trends are identified up to 70 percent faster than with traditional methods.

Real-world analysis of a sample of approximately 1,000 incident records achieved more than 90 percent success in predicting the incident severity classification. Additional system tuning may provide similar results for data fields such as the nature of damages, probable damage estimates and asset IDs. These capabilities would free the person creating the incident record from specifying all of the structured data fields that HSE professionals rely on, resulting in a more user-friendly system.
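To make the idea of predicting severity from free text concrete, here is a minimal Python sketch. It is not the SAS model behind the results above: a small naive Bayes classifier stands in as an assumed technique, and the training records, labels and function names are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def train(records):
    """Count words per severity label from (description, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in records:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest naive Bayes log score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        # Add-one smoothing so unseen words do not zero out a label.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented sample records; a real model would train on historical incidents.
TRAINING = [
    ("worker slipped on wet floor, no injury", "low"),
    ("minor spill cleaned up, no injury", "low"),
    ("fire at pump station, worker hospitalized", "high"),
    ("explosion near tank, two workers hospitalized", "high"),
]

model = train(TRAINING)
severity = predict(model, "fire near tank, worker hospitalized")
```

With only four training records this is a toy, but the shape is the same at scale: historical descriptions with known severities go in, and new descriptions receive a predicted severity without the reporter having to choose one.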

SAS analytical models can analyze textual data and determine structured data field values, which can validate the user-specified structured data field values. For example, if the textual description of an incident indicated a release of product A, then the models would validate that product A was specified in the structured data fields for environmental release information.
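The cross-check described above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than how SAS implements it: the product list, field names and the `validate_release` helper are hypothetical, and simple phrase matching stands in for a real text model.

```python
import re

# Hypothetical product vocabulary; a real system would load it from master data.
KNOWN_PRODUCTS = {"crude oil", "diesel", "benzene"}


def products_mentioned(description):
    """Return the known products named in the free-text description."""
    text = description.lower()
    return {
        product
        for product in KNOWN_PRODUCTS
        if re.search(r"\b" + re.escape(product) + r"\b", text)
    }


def validate_release(description, reported_products):
    """Compare products found in the text with the structured field values."""
    mentioned = products_mentioned(description)
    reported = {p.lower() for p in reported_products}
    return {
        # Named in the narrative but absent from the structured fields.
        "missing_from_fields": sorted(mentioned - reported),
        # Entered in the structured fields but never named in the narrative.
        "not_in_text": sorted(reported - mentioned),
    }


result = validate_release(
    "Operator noticed a diesel sheen near the loading rack.",
    ["benzene"],
)
```

Either list being non-empty is a prompt for an HSE reviewer to look at the record, not proof that the record is wrong.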

Increase the utilization of the incident management system
For many systems, witness statements are collected on paper and scanned images of the documents are attached to the incident record. The contents of these statements are not analyzed other than through the diligence of the HSE professional monitoring the incident. Optical character recognition (OCR) software can digitize the contents of the witness statements to allow data mining to be applied to the statements, thus providing additional input for further analysis.

A more user-friendly data capture system will promote additional use by field personnel. For paper-based systems, the data contained in those documents can be digitized by using OCR software. Once digitized, the input would be available for analysis and reporting. The value is a consistent, automated process to analyze and classify hazard observations. These are the same observations that precede the more significant incidents HSE professionals monitor to create a safer workplace.

Automate manual processes
IM/HO data monitors can learn a lot from manufacturing and service companies that analyze and apply customer complaints and feedback from call centers to improve product quality.

Voice capture of conversations can be performed by vendors such as CallMiner, Verint, NICE Systems, Witness Systems and others. The information provided by the voice capture includes the categories created by phonetic index search, metadata about the call and the call transcriptions. SAS can read the output from any system you might be using once the audio signal is converted to text. Once the call is transcribed, SAS Text Miner can interpret the transcription to determine the classifications and categories for the incident.

For IM/HO systems, this voice-recording capability could be used as an alternative to the computer data input screens where incidents are recorded and hazards are identified. Instead of inputting data, users could call into an emergency response center where a few prompts would collect critical information, such as who is reporting and where they are located, before allowing the user to explain the details of the incident. Transcription will make the recording available for processing by SAS Analytics, where a company-specific taxonomy is used to interpret colloquialisms. Models built upon analysis of prior records would predict the structured data elements based upon the interpretation of the text.
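As a rough illustration of what a company-specific taxonomy does with colloquialisms, the Python sketch below rewrites field slang in a transcript into canonical terms before any model sees it. The taxonomy entries and the `normalize` function are hypothetical examples, not part of any SAS product.

```python
import re

# Hypothetical taxonomy: colloquial phrase -> canonical term.
TAXONOMY = {
    "christmas tree": "wellhead valve assembly",
    "doghouse": "driller's cabin",
    "pig": "pipeline inspection gauge",
}


def normalize(transcript):
    """Replace colloquial phrases with canonical terms, matching whole words."""
    text = transcript.lower()
    # Handle longer phrases first so multi-word entries are not split up.
    for phrase in sorted(TAXONOMY, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, TAXONOMY[phrase], text)
    return text


cleaned = normalize("Leak at the Christmas tree next to the doghouse")
```

Normalizing first means that two callers who describe the same equipment in different words produce text that a downstream model treats the same way.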

Use data-based analytics to support decisions
Incident management systems monitor the health of the enterprise and identify current risk exposures. Both provide critical input into decisions regarding actions that will be taken to mitigate risk. As new risks are identified, mitigation steps may include revisions to existing programs or adding new preventative programs. Analytical capabilities assist in deciding what actions should be taken, what preventative programs should be deployed and what changes should be made to existing programs.

For example, the use of audits to measure preparedness and awareness of hazards is a common risk-reduction program in the oil and gas industry. However, audits that are not focused on the current hazards of the enterprise can undermine overall HSE programs. SAS Text Miner can analyze the topics and topic emphasis within audits to identify discrepancies.
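One simple way to picture a discrepancy in topic emphasis is to compare how often each topic appears in audits with how often it appears in incident records. The Python sketch below does this with hand-picked keywords; the topics, keywords and sample texts are assumptions made for illustration, and a text mining product would derive topics from the data rather than from a fixed list.

```python
import re
from collections import Counter

# Hypothetical topics, each defined by a few keywords.
TOPICS = {
    "falls": {"ladder", "scaffold", "fall"},
    "releases": {"leak", "spill", "release"},
}


def topic_share(documents):
    """Return each topic's share of all topic hits across the documents."""
    counts = Counter()
    for doc in documents:
        words = set(re.findall(r"[a-z]+", doc.lower()))
        for topic, keywords in TOPICS.items():
            if words & keywords:
                counts[topic] += 1
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0) for t in TOPICS}


def emphasis_gap(audits, incidents):
    """Positive values mean incidents stress a topic more than audits do."""
    audit_share = topic_share(audits)
    incident_share = topic_share(incidents)
    return {t: incident_share[t] - audit_share[t] for t in TOPICS}


audits = [
    "Checked ladder tags",
    "Scaffold inspection complete",
    "Inspected spill kits",
]
incidents = [
    "Valve leak at unit 3",
    "Diesel spill at rack",
    "Worker fall from ladder",
    "Flange leak on line 7",
]

gap = emphasis_gap(audits, incidents)
```

In this invented sample the audits concentrate on falls while the incidents are mostly releases, which is the kind of mismatch that suggests refocusing the audit program.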

Text mining should also be applied to the corrective action records and root cause systems for the purpose of identifying consistency between the incident, the root causes and the corrective actions. Many times the corrective actions do not repair the root cause of the incident, resulting in a repeat incident. Better alignment of the content of those three systems for a single incident will decrease the likelihood of a repeat incident.
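A crude way to test that alignment is to measure how much vocabulary the three texts share. The Python sketch below uses Jaccard word overlap as an assumed stand-in for a real text mining comparison; the stopword list, the 0.1 threshold and the sample record are invented for illustration.

```python
import re

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "on", "in", "was", "at", "for", "with"}


def content_words(text):
    """Return the set of lowercase words that are not stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Shared words divided by all distinct words in the two texts."""
    words_a, words_b = content_words(a), content_words(b)
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def alignment(incident, root_cause, corrective_action):
    """Score how closely each stage of the record follows the previous one."""
    return {
        "incident_vs_cause": jaccard(incident, root_cause),
        "cause_vs_action": jaccard(root_cause, corrective_action),
    }


def flag_misaligned(record, threshold=0.1):
    """List the comparisons whose overlap falls below the threshold."""
    scores = alignment(**record)
    return [name for name, score in scores.items() if score < threshold]


record = {
    "incident": "Gasket failed on pump P-101 causing a leak",
    "root_cause": "Worn gasket on pump not replaced during maintenance",
    "corrective_action": "Retrain staff on ladder safety",
}

flags = flag_misaligned(record)
```

Here the corrective action shares no vocabulary with the root cause, so the record is flagged for review. Word overlap misses synonyms and paraphrase, so a flag is a reason to look, not a verdict.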

Modernizing your safety analytics
Present-day technologies enable speedier and easier collection of safety and environmental information. With the removal of data collection barriers, more data points will be processed more quickly. Empowered by insights previously hidden in paper-based or administrative systems, companies can become more responsive and agile. Faster, fact-based decision making will lead to reductions in risk and accidents.

Bio: Bill Tuzin, Solution Architect for the Oil and Gas business unit at SAS, brings 17 years of software industry experience and 15 years of industry experience. He has been applying computer technology to solve business issues for his entire career, working in the chemical, oil, oil field services, grain, transportation and software industries.
