Aesthetics of Investigation

From dftwiki3
Revision as of 15:27, 9 July 2008 by Thiebaut

Aesthetics of Investigation

The amount of data collected and available today in our various fields of research has reached such a scale that we lack the tools to explore it. It is not uncommon for data sets to measure in millions or billions of items. What used to be story-driven research--ask the question first, and then use the data to answer it--is becoming data-driven research: gather the data, explore it along some of its multiple dimensions, and see what kind of story is being told. This shift should be approached with care, as one should bear in mind that there is no raw data; data is collected for a purpose [1]. Peter Hall, a designer and writer, uses architecture and its geography in his remarks, but they apply as well to other fields: "What already exists is more than just the physical attributes of a terrain (topography, rivers, roads, buildings), but includes also the various hidden forces that underlie the workings of a given place." [2] This is not a new phenomenon: one only has to look at the Mappae Mundi, a set of medieval maps mostly from the 11th century, to see how the Christian views of the time significantly distorted the visualization of the world.

Furthermore, as technology evolves and allows us to garner new types of data in larger quantities, we find ourselves ill-equipped with general-purpose tools to explore and render the data. New tools are increasingly being conceived, putting us, the tool creators, in the position of design artists, and presenting us with the challenging task of mixing science, art and technology [2] with the "goal of using beauty and elegance as a path to clarity and analysis" [3]. Visualization, in particular, is experiencing a boom, as remarkably showcased in the recent Museum of Modern Art exhibit Design and the Elastic Mind. Many examples of stunning visual (computer-assisted) displays appear regularly in specialized magazines such as Seed or Wired, but also in news publications such as the New York Times or Harper's, creating new standards of aesthetics. This boom is accompanied by the invention of new programming languages, such as Processing [4], aimed at artists, designers, and scientists, whose goal is to simplify the process of rendering this explosion of huge data sets. Remarkably and unfortunately, visualization tools are seldom reused: all too often a visualization is built for one particular set of data, requiring great design and programming skills; the work is published, the tool is set aside, and the cycle restarts for a new set of data. Lacking reusability and verification on different data sets, we run the risk of sacrificing clarity and analysis in the name of elegance and the aesthetics of design, and possibly the risk of presenting erroneous information. In many ways we are at a frontier where tools are invented as the data are gathered, and used only a few times, for reasons linked to the specificity of the data and to the fact that advances in technology lead to quick obsolescence. It is as if one had to reinvent the wheel every time a new project is conceived.
Elegance and utility need not be antithetical, nor is it necessary to consign new technological tools to virtual landfills.

My interest in the aesthetics of investigation is currently rooted in the analysis of the Wikipedias that exist in many different languages. These on-line encyclopedias are both open-source and open-data. One can download not only the current database contents of all the 9 million pages of the English-language Wikipedia, but also the 190 million edits that more than 9 million contributors have performed since its creation. While it is worthwhile debating the virtues of the contents of individual pages, I am interested in the study of the growth of such on-line encyclopedias: how their contents reflect the culture of their contributors, how, like a living organism, a Wikipedia heals itself when under attack, how its pages regularly experience birth, growth, and sometimes cycles of deaths and rebirths, and how current events are reported with spectacular speed in those pages. Some examples will illustrate the richness of information hidden in a Wikipedia database. A statistical study of the most frequent 2-word concepts found in the English Wikipedia shows "United States," "New York," and "Median Income" as the most frequent occurrences. An analysis shows that the average time elapsed between the defacing of a page with electronic graffiti and its clean-up is less than 5 minutes. When bombs exploded in London on 7 July 2005, a Wikipedia page was promptly created and updated with minute-to-minute information, faster and more accurately than the Web pages of CNN or of the BBC. For me, the challenge lies in the representation of the relationships and trends that exist between page edits, words or punctuation used, between pages and their contributors, between the contributors and the pages they modify, between the geographical location of the contributors and the pages they edit, when such data sets measure in millions, if not billions, of items.
New visualization methods must be invented and designed that combine beauty and elegance in ways consistent with the goals of clarity and analysis. And because computation on such a scale may take days or weeks of computer time, it is important to know how to formulate the questions for which we seek answers through visualization.
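
To give a sense of what a question such as "which 2-word concepts are the most frequent?" looks like once formulated for a computer, here is a minimal sketch in Python. It is not the method used for the study mentioned above; the function name top_bigrams, the tokenization rule, and the three-sentence sample corpus are all assumptions made for illustration, and a real analysis over millions of pages would also need to filter out common words and stream the data from a database dump.

```python
from collections import Counter
import re


def top_bigrams(texts, n=3):
    """Count adjacent word pairs across an iterable of page texts.

    Hypothetical helper for illustration: pairs are counted within each
    text only, so a pair never spans two different pages.
    """
    counts = Counter()
    for text in texts:
        # Crude tokenization (an assumption): lowercase runs of letters.
        words = re.findall(r"[a-z']+", text.lower())
        # zip(words, words[1:]) yields every pair of neighbouring words.
        counts.update(zip(words, words[1:]))
    return counts.most_common(n)


# Made-up miniature corpus standing in for Wikipedia page texts.
sample_pages = [
    "New York is a city in the United States.",
    "The median income in New York is reported by the United States census.",
    "The United States census reports median income.",
]

for (first, second), count in top_bigrams(sample_pages):
    print(first, second, count)
```

Even on this toy corpus, pairs such as "the united" compete with "united states", which illustrates why the question has to be formulated carefully before days of computer time are spent on it.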

The goals of this Kahn workshop are to explore the new challenges of data analysis and visualization; to discuss individual approaches and learn of new methodologies used in fields other than our own; to discover new tools and new affinities in the Smith communities; to create a network of resources, including technologies and people who share common challenges; and to discuss the foremost importance of merging art, science and technologies in our explorations, visual or otherwise. Tools should be used, shared and disseminated, so that duplication of effort is reduced.

This workshop is open to all members of the Smith communities, and especially to those members who are interested in the challenges of bridging science, art and technologies in their research endeavors.

The format of the workshop will be two one-day meetings, spread over two weekends, where members will present their interests, worries, or experience with how the aesthetics of investigation permeates their field of research, with ample time left for discussion.


References

  • [1] Jack van Wijk, The Value of Visualization, in Proc. IEEE Visualization 2005, pp. 79-86.
  • [2] Peter Hall, Critical Visualization, in Design and the Elastic Mind, by Hugh Aldersey-Williams, Peter Hall, Ted Sargent, and Paola Antonelli, D.A.P./Distributed Art Publishers, Inc., NY, 2008
  • [3] Paola Antonelli, Design and the Elastic Mind, in Design and the Elastic Mind, by Hugh Aldersey-Williams, Peter Hall, Ted Sargent, and Paola Antonelli, D.A.P./Distributed Art Publishers, Inc., NY, 2008
  • [4] Ben Fry and Casey Reas, Processing, www.processing.org.