Google Ngrams are Pithy

I just learned (via The New York Times) of a new Google tool that allows the curious person to type in a word or phrase and see how often that term is mentioned in over 5.2 million books over the past 500 years. The Google Ngram tool draws from 500 billion words contained in books published between 1500 and 2008 in English, French, Spanish, German, Chinese and Russian.
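For the curious, here is a rough idea of what a tool like this counts. This is only a minimal sketch under my own assumptions, not Google's actual method: the function name `ngram_frequency`, the tiny `books` list, and the whitespace word-splitting are all made up for illustration. It computes, for each year, what share of all word sequences of the same length match the phrase.

```python
from collections import Counter


def ngram_frequency(books, phrase):
    """Return {year: share of that year's n-grams that equal `phrase`}.

    `books` is a hypothetical list of (year, text) pairs; a real corpus
    would need proper tokenization and far more data.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    hits, totals = Counter(), Counter()
    for year, text in books:
        words = text.lower().split()
        # Slide a window of n words across the text to build its n-grams.
        grams = list(zip(*(words[i:] for i in range(n))))
        totals[year] += len(grams)
        hits[year] += grams.count(target)
    # Skip years with no n-grams to avoid dividing by zero.
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}


# Toy data, invented purely for illustration.
books = [
    (1950, "tax cuts are rare"),
    (2000, "tax cuts and more tax cuts"),
]
for year, share in ngram_frequency(books, "tax cuts").items():
    print(year, round(share, 3))
```

Dividing by the total for each year matters because far more books were published in 2000 than in 1900, so raw counts alone would make almost every term look like it is on the rise.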

If every picture tells a story, then perhaps we can intuit some knowledge from this incredible graphing tool, the analytical powers of which would have been unthinkable only a few years ago. Of course lines on a graph are just data, and exercises in interpretation can be creative, manipulative, dull or brilliant. That’s why it could be fun.

Here are a few I came up with:

This week the Congressional passage of a bill to extend Bush-era tax cuts was big news. Obama’s health care legislation, passed earlier this year, remains in limbo as judicial challenges and a Republican House gear up to attack. How do the terms “tax cuts, medicare, and medicaid” stack up in terms of their prevalence in our written discourse over the last 100 years?

Has the hive mind since the 1960s had a greater appetite for money in its own pocket than concern for health and welfare?

Here’s another. Searching for “diabetes, obesity” yielded predictable curves:

Obesity and diabetes are signs of the times, and will only take up more of our collective literature going forward.

And then there are “leeches, lobotomy, and liposuction,” three bizarre treatments offered by physicians over the past two hundred years:

Fearing a lawsuit for missing a diagnosis, doctors often order multiple expensive tests, a practice called defensive medicine. Patients who experience unfortunate outcomes rarely receive fair compensation, and only those patients whose injuries make for high courtroom drama (or potentially multimillion-dollar awards) ever begin the two-to-three-year process of litigation. The whole mess benefits trial lawyers to the detriment of health care quality and cost. Here’s “medical malpractice, defensive medicine, healthcare costs, bad outcomes.”

Reads like a book to me.

Searching for “abortion, birth control” produces some interesting results. Although abortion has been in the public lexicon for hundreds of years, the use of the term in books surged at about the same time as the oral contraceptive pill and the social movements of the 1960s were taking off.

Despite the furor over abortion, the term actually seems to be appearing less often in the written word.

And what about Franklin’s old adage that “an ounce of prevention is worth a pound of cure”? It would seem that prevention was all but forgotten for the better part of the past three hundred years, and only recently has gained against the more heroic cure.

With a revival of suspicion and pseudo-enlightenment, the opponents of vaccination have abetted the resurgence of diseases like pertussis, measles, varicella, and polio in recent years. How often has the phrase “vaccines are dangerous” entered our collective consciousness?

And so that I leave time in my Sunday for decorating the Christmas tree, I leave you with one last comparison of terms in the literature: “apocalypse, global warming.”

In the days leading up to the year 2000, millennial superstitions crescendoed but have since relaxed. Unfortunately, with climate change denial, oil addiction, population growth, and moral paralysis it would seem that one force more powerful than Nostradamus grows stronger as we seemingly relax in our language of doom!

This Google Ngram stuff could become addictive. I’m thinking I’ll need to limit myself to one search a day. What crazy, creative, manipulative, important, and curious connections might we uncover in our culture through the analysis of our collective conversations as they occur across the centuries in the written word?

Perhaps I’ll just have to work on a little side project – can Ngrams be pithy? Please share any thought-provoking ones you might come up with, too!