The first part of this series explains typical steps that a survey analyst takes to prepare free-text comments for Natural Language Processing (NLP). In this part, we describe four methods by which a cleaned and standardized corpus lets NLP reveal insights.

1. Create a Word Cloud of Common Words

From the document-term matrix, software can total how many times each word appears in the corpus as a whole (the combined text of all the text questions). A bar chart of the most frequent words can depict those totals. The chart below shows the most frequently used significant words in the first 25 posts from my blog, Savvy Surveys for Lawyers, and this simple graph makes clear the thrust of the writing. The word “survey” leads the way with almost 250 appearances. At the bottom end, “people” appears about 40 times. Note that the words have not been lemmatized.
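The totaling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the chart: the sample comments are invented, the helper names `top_words` and `text_bar_chart` are hypothetical, and the document-term matrix is represented simply as one word-count table per document.

```python
from collections import Counter

# Hypothetical cleaned comments (illustrative data only, not from the blog).
comments = [
    "survey questions help lawyers",
    "lawyers answer survey questions",
    "people answer survey",
]

# A simple document-term matrix: one Counter of word counts per document.
doc_term = [Counter(text.split()) for text in comments]

# Sum the per-document counts to get corpus-wide totals for each word.
totals = Counter()
for row in doc_term:
    totals.update(row)


def top_words(counts, n):
    """Return the n most frequent (word, count) pairs, highest count first."""
    return counts.most_common(n)


def text_bar_chart(pairs):
    """Render (word, count) pairs as a plain-text horizontal bar chart."""
    width = max(len(word) for word, _ in pairs)
    return "\n".join(
        f"{word.ljust(width)} {'#' * count} {count}" for word, count in pairs
    )


print(text_bar_chart(top_words(totals, 3)))
```

A plotting library could draw the same totals as a graphical bar chart; the text version here only keeps the sketch free of dependencies.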
