Calling Something a Dataset: Visualizing the Crystal Palace

Since December, I’ve been working with Steven Lubar on a presentation he gave at Bard College last week. The event focused on the 1853 New York Crystal Palace. Lubar has introduced his presentation over at Medium. (Parts one through three are up at the moment – the next installment will complement this piece.)

Most of my time with this project has been spent cleaning, wrangling, and interrogating the dataset. (Side note: does anyone remember that poem about literary analysis being like torture? It sometimes feels that way with datasets too.) As we’ve discussed in class, cleaning data is a never-ending process – the data we started with wasn’t raw, but carefully curated OCR from a Cornell Library database. In wrangling the data into a CSV under headings specific to our needs, we shifted the catalog as a book into a catalog as a database, raising multiple questions of historical and technical interest.

Working through the data as a whole and as individual objects, Lubar and I worked to figure out what kind of questions could be explored in the catalog. What was this exhibition? What was interesting about it? What kind of information was collected for the catalog?

Lubar’s interests lie in what this type (see: digital) of curation can do for the museum catalog. On the one hand, this is limited to understanding the New York Crystal Palace itself. Visualizing the museum catalog through maps, tree maps, graphs, and such makes it an historical resource. Exploring the dataset in multiple avenues, we settled on Tableau and a D3 visualization (with help from CDS!) for his presentation.

For the class assignment, I was hoping to explore some other elements of the dataset. I’ve been spatially focused in Tableau, but what other categories and attributes teased out from the data display interesting relationships? What does focusing on the relationships between attributes within the catalog and outside the catalog do to our understanding of the catalog? And is there another tool to use?

Some questions I came across:

  • In the case of geographic visualizations, in what ways do they allow viewers to bring the exhibition out into the world again, exploring in real time the nationalistic pride embedded in the ideals of the exhibition? In placing the object back into its “original” location, are we prioritizing the individual item in a way it wasn’t in the exhibition? What do different facets tell us?
  • What categories are prioritized by the main catalog? What quantitative data does it already contain? What questions does visualization of the catalog bring out that previous scholarship didn’t address? Are visualizations obscuring anything from the catalog? Are they reinforcing a specific narrative?
  • Can these visualizations preserve the context of the catalog? Or do we want to remove context? Does rearranging the data let us think about the Crystal Palace exhibition in new ways?



Mapping in Tableau, Colored by Court Location in the Exhibitions
Getting into the details of Tableau Maps – see how a newer map distorts where things are located? The original catalog says Venice, Austria (in reference to present-day Venice, Italy).
Tableau representation of agents by class (colored by country). We can see how this was an American fair, with the exception of a few categories.


Palladio’s Gallery feature, which reminds me of a “traditional digital collections look.” As a different kind of list, it shows each object’s title (“product”), subtitle (category), and detail (country).
Palladio’s Graph feature, showing relationships between country and location inside the exhibition (A, B, C, D, E, Gallery)
Palladio’s Graph feature, showing relationships between country and location (this time, with locations emphasized)
Palladio’s feature to facet, showing a graphical breakdown of each category


Circular Dendrogram: shows the hierarchical structure of the data in a non-weighted way. This one showcases country, and then the classes that country contributed to.
Cluster Dendrogram: also offers a way to look at the hierarchical structure. This one showcases the exhibition layout, and then the countries that are showcased in that division.
Circle Packing: also showcases the exhibition layout/country relationship.
Circle Packing: highlights the country/class relationship.
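
For readers curious about what a “hierarchical structure” means in data terms, here is a minimal sketch of how flat catalog rows could be nested into a country-then-class tree, the kind of shape that dendrograms and circle packing layouts draw. This is an assumption-laden illustration, not how the tools above work internally: the column names `country` and `class`, the function `build_hierarchy`, and the sample rows are all hypothetical placeholders rather than actual catalog data.

```python
import json

# Hypothetical placeholder rows; these are NOT entries from the real catalog.
rows = [
    {"country": "Country A", "class": "Class 1", "product": "Item 1"},
    {"country": "Country A", "class": "Class 2", "product": "Item 2"},
    {"country": "Country B", "class": "Class 1", "product": "Item 3"},
    {"country": "Country A", "class": "Class 1", "product": "Item 4"},
]


def build_hierarchy(rows, levels, root_name="catalog"):
    """Nest flat rows into {"name": ..., "children": [...]} nodes.

    `levels` is the order of columns to nest by, e.g. ["country", "class"].
    Swapping the order changes which attribute the tree foregrounds.
    """
    root = {"name": root_name, "children": []}
    for row in rows:
        node = root
        for level in levels:
            key = row[level]
            # Reuse the child node for this value if it already exists.
            child = next((c for c in node["children"] if c["name"] == key), None)
            if child is None:
                child = {"name": key, "children": []}
                node["children"].append(child)
            node = child
    return root


# Country first, then the classes each country contributed to.
tree = build_hierarchy(rows, ["country", "class"])
print(json.dumps(tree, indent=2))
```

Reordering `levels` (for example, nesting by class first and then country) produces a different tree from the same rows, which is one small way to see how the choice of hierarchy shapes what a visualization emphasizes.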



More so than distortion, how does curating or visualizing data fundamentally change the dataset? Thinking of an exhibition collection as points instead of objects, I felt myself torn with this assignment to make sure the visualization did justice to the catalog. So in each visualization, I directly thought about how these tools could function as a catalog. Was I distorting the intent of the original catalog? Did calling it a dataset distort these ideas? Or were visualizations enhancing that effect?

In handling those questions, I felt as though the visualizations failed to address context when presenting a dataset. Quantitative questions removed the qualitative analysis of these relationships – which is true in many cases of data, but unsettling in the case of museum work. Even as a thought experiment for reconstructing/representing the New York Crystal Palace, I am cautious about calling any of these “true” representations of the exhibition. Calling the catalog a dataset, and curating data instead of curating objects, puts the user in a complicated position. Interrogating the data with these quantitative questions, without contextualizing or thinking of audience, makes it difficult for the museum catalog as a form to succeed.

Granted, an exhibition catalog does this, too. We wanted to see what happens when you take humanities data into the digital – but the creators of the catalog also shifted the context of a physical exhibition in their process of documentation. So maybe this is also a larger question of what happens when we quantify data, and the visualizations make it easier for us to tease apart these questions.

Of course, it’s important to note that none of these platforms are designed to overtake the current structure of museum catalogs. (I’d argue Palladio is close to serving this end, with its goal to “understand how to design graphical interfaces based on humanistic inquiry.” As Posner’s example of the Charles Weever Cushman Collection indicates, Palladio can do a lot to uncover relationships within a collected dataset.) But it’s interesting to combine these questions with our discussion of interfaces in engaging with the collection. Too many museum catalog databases look exactly the same – structured to browse under keywords, image-based, search by category, etc. The database we’ve developed allows us to search, browse, and facet as ways of exploring the data that museum catalogs traditionally don’t offer. Can we do something differently?

What does it mean to think of the metadata embedded in the catalog’s dataset for the purpose of visualizations? How do visualizations rethink for whom the catalog was designed? What happens when we prioritize visualizations first? What if a viewer were first presented with a map of the original exhibition to poke around? What if they were instead presented with data manipulation, with visualizations and breakdowns, instead of the objects of the exhibition? Does that allow the viewer to engage effectively with the historic questions Lubar is interested in? Or does it treat all these facets as something else – something less than historical data?

This is a pressing question in many institutions as they transition into open-access collections and look to publish their information on the internet. And more broadly: What will the new museum catalog look like? What kind of catalog styles exist, and what can technology do to aid those visualizations? How can old exhibition catalogs, like the New York Crystal Palace, incorporate all the textual materials and embedded metadata to make something interactive for the digital age of museums? But most importantly: do these datasets – visualizations, metadata, and all – reflect the ideas with which we started?

Emily Esten is an MA candidate in Public Humanities at Brown University. More info on me and examples of my work may be found at
