Who invented the tag cloud (word cloud)?

Posted on 08 Oct 2013 in Chart of the Week, Data Visualization, Information

Tag clouds (also known as “word clouds”) are visual summaries of large bodies of text. They take a selection of text or a set of human-created tags and display the most frequently used words in that document or collection, typically sizing each word according to how often it occurs. This makes them useful for qualitative analysis, since they highlight the major themes in a work of interest. The technique is very common among bloggers and on crowdsourcing sites such as Ideascale.
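To make the counting step concrete, here is a minimal sketch in Python of how a tag cloud might turn raw text into word sizes. It is not how any particular tool works: the function name `word_weights`, the tiny stop-word list, and the 10 to 40 point font range are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real tools use much longer ones.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "that"}


def word_weights(text, top_n=5, min_size=10, max_size=40):
    """Return (word, count, font_size) tuples for the top_n most frequent words.

    Font sizes are scaled linearly between min_size and max_size (assumed values).
    """
    # Lowercase the text and split it into runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    top = counts.most_common(top_n)
    if not top:
        return []
    highest = top[0][1]
    lowest = top[-1][1]
    # Avoid dividing by zero when every word has the same count.
    span = (highest - lowest) or 1
    return [
        (word, count, min_size + (count - lowest) * (max_size - min_size) // span)
        for word, count in top
    ]


sample = "stimulus stimulus stimulus response response the the the the arc"
for word, count, size in word_weights(sample):
    print(f"{word}: used {count} times, drawn at {size}pt")
```

A real generator would then lay the words out on a canvas; this sketch stops at the frequency-to-size mapping, which is the part that carries the information.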

Many sources cite Jim Flanagan as the originator of the idea, with his Search Referral Zeitgeist Perl script, although the basic idea (using word size to indicate importance) had previously been used by Douglas Coupland in his 1995 novel “Microserfs”.


Where are tag clouds used, and do they have uses other than web navigation?

As mentioned above, tag clouds are prominently featured on image and photo websites (such as Flickr) and on blogs, where visitors can use them to navigate the site by theme or keyword. For an example of a more academic use of this technology, see the two images below, which were generated by Dr. Christopher Green of York University using the Wordle.com website. These images (made from John Dewey’s 1896 article “The Reflex Arc Concept in Psychology” and B. F. Skinner’s 1950 article “Are Theories of Learning Necessary?”) can be studied to compare and summarize visually the similarities and differences in word usage between these two major psychological texts.


A Wordle of John Dewey’s “The Reflex Arc Concept in Psychology” (1896), produced by Dr. Christopher Green.

A Wordle of B. F. Skinner’s “Are Theories of Learning Necessary?” (1950), produced by Dr. Christopher Green.


As the two word clouds above show, the main themes of the two papers differ considerably. Dewey’s paper focuses on stimulus and sensation, whereas Skinner’s paper focuses on reinforcement and extinction. This is another use of tag clouds as a data visualization: comparing textual content.
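The same kind of comparison can be approximated without any graphics. The sketch below, again in Python, ranks the words that one text uses proportionally more than another. The function names are hypothetical, and the two input strings are toy examples made up for illustration, not excerpts from Dewey or Skinner.

```python
import re
from collections import Counter


def relative_frequencies(text):
    """Map each word to its share of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}


def distinctive_words(text_a, text_b, top_n=3):
    """Return (words favoured by text_a, words favoured by text_b)."""
    freq_a = relative_frequencies(text_a)
    freq_b = relative_frequencies(text_b)
    vocabulary = set(freq_a) | set(freq_b)
    # Positive difference: relatively more common in text_a; negative: in text_b.
    diff = {w: freq_a.get(w, 0.0) - freq_b.get(w, 0.0) for w in vocabulary}
    # Ties are broken alphabetically so the output is deterministic.
    favoured_a = sorted((w for w in vocabulary if diff[w] > 0),
                        key=lambda w: (-diff[w], w))
    favoured_b = sorted((w for w in vocabulary if diff[w] < 0),
                        key=lambda w: (diff[w], w))
    return favoured_a[:top_n], favoured_b[:top_n]


# Toy inputs (assumed for illustration only).
text_a = "stimulus sensation stimulus response"
text_b = "reinforcement extinction reinforcement response"
print(distinctive_words(text_a, text_b))
```

Words shared equally by both texts, such as "response" in the toy inputs, drop out of both lists, which mirrors how the eye skips over words that appear at similar sizes in both clouds.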

Of course, the biggest problem with word clouds is that they are often applied in situations where textual analysis is not appropriate. One could argue that word clouds make sense when the point is specifically to analyze word usage (here is an interesting alternative from the N.Y. Times), but it is ludicrous to try to make sense of a complex topic like the Iraq War by looking only at the words used to describe the events. Don’t confuse signifiers with what they signify.