“Screwing around” with Datasets (Text Analysis)

I know the ‘screwing around’ got you thinking: what the hell is he/she talking about, right? Well, contrary to your dirty little minds, ‘screwing around’ in this context means browsing; you know, when you are not too sure what you are really searching for, as Stephen Ramsay (2014) nicely explains.

What is text analysis? Text analysis tools break a text down into smaller units like words, sentences and passages, and then gather these units into new views on the text that aid interpretation.

Geoffrey Rockwell gives a brief history of electronic texts and text analysis. According to Rockwell, a good way to understand text analysis is to look at the tradition of concordancing from which it evolved. A concordance is a standard study tool where one can look up a word and find references to all the passages in the target work where that word occurs. Concordances are alphabetically sorted lists of the vocabulary of a text (its different words or phrases). Occurrences of each word (the keyword) appear under a headword, each one surrounded by enough context to make out the meaning, and each one identified by a citation to the text that gives its location in the original.
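To make the idea of a concordance a little more concrete, here is a minimal Python sketch of one. It is only an illustration under my own assumptions, not how Wordle, Many Eyes or Voyant actually work: the function name `build_concordance`, the `width` parameter, the simple rule for what counts as a word, and the use of a character offset as the "citation" are all made up for this example.

```python
import re
from collections import defaultdict


def build_concordance(text, width=30):
    """Map each headword to its occurrences, each with a location and some context."""
    concordance = defaultdict(list)
    # Break the text into word units; this regex is a simplifying assumption
    # about what counts as a word.
    for match in re.finditer(r"[A-Za-z']+", text):
        headword = match.group().lower()
        start, end = match.start(), match.end()
        # Keep `width` characters on each side so the meaning can be made out.
        context = text[max(0, start - width):end + width].replace("\n", " ")
        # The "citation" here is simply the character offset in the original text.
        concordance[headword].append((start, context))
    # Sort the headwords alphabetically, as in a printed concordance.
    return dict(sorted(concordance.items()))


sample = "The cat sat on the mat. The dog saw the cat."
concordance = build_concordance(sample, width=10)
for offset, context in concordance["cat"]:
    print(f"cat @ {offset}: ...{context}...")
```

Looking up a headword such as `concordance["cat"]` returns every occurrence with its location and surrounding words, which is the same look-up a printed concordance offers.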

In the DITA lab session, #citylis students had a wonderful experience screwing around with Wordle, Many Eyes and Voyant, all web-based tools that aid text analysis. The datasets that we were urged to save in previous lab sessions were used to experiment in Wordle and the other tools. Wordle generates word clouds from the datasets that were created. Word clouds add flair to your text by allowing you to change fonts, layouts and colour schemes. Below is an example of a text analysis produced in Wordle: (#citylis top tweeters)


Datasets of journal articles loaded into Many Eyes for text analysis produced the following:


Datasets of journal articles inserted into Voyant produced the following:


Interesting as it is to work with these tools, Jacob Harris views word clouds as harmful to journalism. For what it’s worth, I enjoyed screwing around with the text analysis tools!


Open Data, Data Visualization and Analysis: Making sense of it!

Open data is information that organizations release to the public in datasets. A dataset, if you recall, is a collection of factual information in electronic form. Open data supports Freedom of Information (FOI) and increases transparency; as A. Rae 2004 posits, ‘open data has increased transparency, improved access to information and helped places begin to understand and solve problems.’ However, this data should be presented in such a way that anyone can interpret it.

What is Data Visualization? Data visualization is a general term that describes any effort to help people understand the significance of data by placing it in a visual context. In short, it is a visual representation of data that goes beyond the standard charts and graphs commonly used in Excel spreadsheets. Today’s data visualization tools display data in more enhanced and sophisticated ways, such as heat maps and bar, pie and fever charts, among others.

This was illustrated in one of our DITA sessions, where TAGS were created for the #citylis top tweeters and a data visualization of the results was presented. What the data revealed was amazing!

Data analysis is the process of discovering and understanding the meaning of the data presented to us; it is making sense of the information. Data visualization is therefore a core, and usually essential, means of performing data analysis effectively.