The MoMA is one of the world's greatest museums and one of my favorite places to visit in New York. Over the years I've had the chance to see many incredible exhibits there (RIP my New York life).

Take Your Time

Sleepwalkers

The Problem Perspective

And I've also stood in line with the hordes to try and catch glimpses of the permanent collection.

Painting by Some Dutch Guy

A while back I learned that the MoMA is on GitHub and specifically that their collection is available as structured data. Every single one of those artworks and artists is available for you to explore! I started wondering what I could do with this, so I headed over to MoMA's website and thought I'd search through the catalog for inspiration. Two challenges immediately became clear.

First, there's a discovery problem. The artworks page looks like the following:

Collection Search

This is a search box optimized for hunting and pecking. If you know what you want - a specific artwork or artist - it will get you to what is available. But it's hard to explore the collection if you don't already know what's in it.

The second problem is a limitation of current search technology. You can only search the metadata about an artwork - its name, the artist's name - not the contents of the image.

Are There Really no Matches?

A term like "geometric patchwork" returns zero results despite the collection including numerous works by Tauba Auerbach or this print by Barry McGee. And this barely scratches the surface of what's in the collection; I'd argue that there are hundreds of artworks that could be called "geometric patchworks" within the MoMA collection.

These challenges gave me the inspiration to see if I could organize the MoMA's collection for "exploration." Could I create a playful way for you to explore the entire collection - all 90,000+ artworks with images - with the goal of helping you find new artworks and hidden connections amongst artworks and artists?

Before I tell you what I did and go deeper into the goals, you can try exploring the result. It's live on my website, so feel free to click around; I'll bet you find an interesting artwork or artist you never knew of before.

So here's what I wanted:

  1. A way to "see" the entire collection. Is there a map that can show me the lay of the land and let me get my bearings? 90,000 images is a lot and I lack the vocabulary to describe art. Can it be laid out for me in a way that is welcoming and encourages me to explore?
  2. To be able to pivot through the collection on multiple dimensions: go from an artwork by an artist to a similar artist to something from the same year to something that is similar and so on in an endless art exploration. I want the collection to be like a visual Wikipedia where I can find new rabbit holes to explore and the journey of discovery is the destination.

I'm a technologist in 2025 so my immediate thought was "AI can do this." Could I send all 90,000 images to an AI and ask it to explain what the image is about? Could I then vectorize those results?

This is geek speak meaning that we first convert an image into a wall of descriptive text and then translate this wall of text into 768 numbers between 0 and 1. These numbers - called embeddings - mean nothing to humans but they capture the meaning behind the words in a way that lets us do math on them. Math like "for a given artwork, what artwork is closest - i.e., most similar - to it". Or if we create a vector for a phrase like "geometric patchwork" we can find the closest artworks - presumably the ones that are actually geometric patchworks.
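To make the "math on embeddings" idea concrete, here is a minimal sketch of a nearest-neighbor lookup. It is not the code behind the project: the artwork ids and the tiny 3-number vectors are made up for illustration, and cosine similarity is just one common way to measure closeness (the post doesn't say which measure was actually used).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, embeddings, k=3):
    """Return the k artwork ids whose vectors are most similar to `query`."""
    scored = [(cosine_similarity(query, vec), art_id)
              for art_id, vec in embeddings.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [art_id for _, art_id in scored[:k]]

# Hypothetical toy data: 3 numbers per artwork instead of 768.
toy_embeddings = {
    "quilt_print": [0.9, 0.8, 0.1],
    "grid_painting": [0.8, 0.9, 0.2],
    "portrait_photo": [0.1, 0.2, 0.9],
}

# Pretend this is the embedding of the phrase "geometric patchwork".
query_vector = [1.0, 0.9, 0.0]
closest = nearest(query_vector, toy_embeddings, k=2)
print(closest)
```

The same function covers both cases from the paragraph above: pass in an artwork's own vector to find similar artworks, or pass in the vector of a search phrase to find artworks matching that phrase.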

It took 13 days of my laptop running non-stop to convert all the images into text and vectorize them - but I think this gives us a new way to organize and explore collections like the MoMA's. I also gained a newfound respect for why Nvidia is so important; it took 13 days on my machine despite it having a GPU and being one of the best laptops ever made.

But back to the MoMA.

The first thing I did was try to create a map of all the art by grouping all 90,000 images into 36 clusters. And then each cluster I grouped into another 36 sub-clusters. I picked 36 because it's a perfect square so it looks nice tiled on a large monitor, plus it could create sub-clusters with an average of 70 artworks, which seemed like a reasonable amount of "like" things to put together.
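For the curious, a two-level grouping like this can be sketched in a few lines. The post doesn't name the clustering algorithm, so the plain k-means below is an assumption, and the 2D toy points stand in for the real embeddings. The last line checks the arithmetic: 90,000 artworks over 36 x 36 sub-clusters comes to roughly 69 per sub-cluster.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means (an assumed algorithm): one label in 0..k-1 per vector."""
    k = min(k, len(vectors))  # cannot have more clusters than points
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return labels

def two_level_clusters(vectors, k_top, k_sub):
    """Cluster everything into k_top groups, then split each group again."""
    top_labels = kmeans(vectors, k_top)
    assignments = [None] * len(vectors)
    for c in range(k_top):
        indices = [i for i, label in enumerate(top_labels) if label == c]
        if not indices:
            continue
        sub_labels = kmeans([vectors[i] for i in indices], k_sub)
        for i, sub in zip(indices, sub_labels):
            assignments[i] = (c, sub)  # (cluster, sub-cluster)
    return assignments

# Hypothetical toy data: 40 points in 2D instead of 90,000 embeddings.
rng = random.Random(42)
points = [[rng.random() + cx, rng.random() + cy]
          for cx, cy in [(0, 0), (0, 10), (10, 0), (10, 10)]
          for _ in range(10)]
assignments = two_level_clusters(points, k_top=2, k_sub=2)

average_size = 90_000 / (36 * 36)  # artworks per sub-cluster
print(round(average_size, 1))
```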

If you go to the project homepage on a large monitor you'll see a 6x6 grid of images. Try refreshing the page; you'll see most of the images change but the clusters haven't. I've calculated the 10 artworks that are most like each cluster and show a random one of them. This lets you get a taste of what's in that cluster without having to click through; it also makes revisiting the page more interesting as the images change.
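Here is one way the "10 most representative artworks, show one at random" step could look. This is a sketch under an assumption: "most like that cluster" is read as "nearest to the cluster's average vector", which the post doesn't spell out. The ids and vectors are invented.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(vectors):
    """Average vector of a cluster."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def representatives(cluster, n=10):
    """cluster maps artwork id -> vector; return the n ids nearest the centroid."""
    center = centroid(list(cluster.values()))
    ranked = sorted(cluster,
                    key=lambda art_id: squared_distance(cluster[art_id], center))
    return ranked[:n]

def tile_image(cluster, n=10, rng=random):
    """Pick a random representative, so each page refresh can show a new image."""
    return rng.choice(representatives(cluster, n))

# Hypothetical toy cluster with five artworks in 2D.
toy_cluster = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [2.0, 2.0],
    "d": [10.0, 10.0],
    "e": [1.0, 2.0],
}
top_two = representatives(toy_cluster, n=2)
shown = tile_image(toy_cluster, n=2)
```

The representatives only need to be computed once per cluster; the random pick is the only thing that changes between page loads.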

And it appears that the algorithm understands some differences between the artworks:

The MoMA collection is diverse: it contains paintings, photos, sculptures, video, architectural diagrams, made objects and more. As you refresh the screen you can see the sorting in action. Go to column five, row four and you'll see objects. Row one, columns four and five are photos of people. Row four, column two is colorful artwork (I lack better terms to describe art; more on this in a second).

If we go down to a subcategory, we can further see the power of clustering. Take a look at that cluster of objects in column five, row four. We've stumbled into the MoMA's designed object collection. We can pick one of these clusters at random and now we've found a set of similar chairs. This is novel; we can now explore similar things in a way you can't using traditional search.

If we click into one of these objects - C2 Solid Chair by Patrick Jouin - we can see more of the power of this vector approach.

I've created sections called "Potential Influences", "Contemporaries" and "Later Artworks". The idea here is we can find all similar artworks and group them into a few different categories: which ones came earlier in time, at the same time and later in time. This gives us a way to explore how styles evolve over time (I did the same for artists as well).
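The grouping itself is simple once you have a list of similar artworks. A minimal sketch, assuming each similar artwork comes with a year and that "at the same time" means within a few years either side; the 5-year window, the titles and the years below are all hypothetical.

```python
def group_by_era(target_year, similar, window=5):
    """Split similar artworks into three buckets relative to target_year.

    similar: list of (title, year) pairs, already ranked by similarity.
    window: assumed number of years either side that counts as contemporary.
    """
    groups = {
        "Potential Influences": [],
        "Contemporaries": [],
        "Later Artworks": [],
    }
    for title, year in similar:
        if year < target_year - window:
            groups["Potential Influences"].append(title)
        elif year > target_year + window:
            groups["Later Artworks"].append(title)
        else:
            groups["Contemporaries"].append(title)
    return groups

# Hypothetical similar artworks for a chair dated 2000.
similar_chairs = [
    ("Chair A", 1965),
    ("Chair B", 1998),
    ("Chair C", 2003),
    ("Chair D", 2015),
]
grouped = group_by_era(2000, similar_chairs)
```

Because the input is already ranked by similarity, each bucket keeps its most similar artworks first.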

I mentioned earlier that I lack the terms to describe art. I think this is one of the reasons why art is so challenging to explore for outsiders: you don't even know the words to ask questions. So I turned again to AI. I asked it to summarize each artwork in a few search terms (this took another 3 days!) - and then I created a vector search for each term to find related artworks in the MoMA corpus.

This means that you can look at that C2 chair and see it described as geometric organic forms, minimalist chair design, contemporary furniture design, matte gray furniture and modern sculptural chair - and each of these is a clickable link to see more like that from the MoMA corpus. It enables us to endlessly explore from any starting point.

You can also explore artworks based on year (random fact: 1971 has the most artworks) or the primary colors of the artwork (check out #a22195).

All these ways of pivoting can seem a little intimidating so I've created the idea of a random path. The idea here is to give you a meandering trail through the collection. If you click in from the nav bar I pick a random cluster, then a random sub-cluster and a random artwork within that. Then we just start pivoting - based on year, similar artist, similar artwork, search term or color. It's a random walk through the collection and you never know where it's going to take you - and it's different for everyone. Plus - if you find something you like, you can jump out at any point: each artwork is a clickable point of departure, as are the descriptions of the random path.
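A stripped-down version of the random path can be sketched as a random walk over shared attributes. This is a simplification and not the site's code: it only pivots on exact matches of year, artist and color, whereas the real thing also pivots on similar artists, similar artworks and search terms. All the artwork records are invented.

```python
import random

def random_path(artworks, start_id, steps, rng):
    """Walk up to `steps` pivots from start_id.

    artworks maps id -> {"year", "artist", "color"}. Each step jumps to a
    different artwork that shares one randomly chosen attribute.
    Returns a list of (artwork_id, pivot_used) pairs.
    """
    path = [(start_id, "start")]
    current = start_id
    for _ in range(steps):
        pivots = []
        for field in ("year", "artist", "color"):
            matches = [other for other in artworks
                       if other != current
                       and artworks[other][field] == artworks[current][field]]
            if matches:
                pivots.append((field, matches))
        if not pivots:  # dead end: nothing shares an attribute
            break
        field, matches = rng.choice(pivots)
        current = rng.choice(matches)
        path.append((current, field))
    return path

# Hypothetical mini collection.
artworks = {
    "chair_1": {"year": 2004, "artist": "Designer A", "color": "gray"},
    "chair_2": {"year": 2004, "artist": "Designer B", "color": "red"},
    "poster_1": {"year": 1971, "artist": "Designer B", "color": "red"},
    "photo_1": {"year": 1971, "artist": "Photographer C", "color": "gray"},
}
path = random_path(artworks, "chair_1", steps=5, rng=random.Random(7))
for artwork_id, pivot in path:
    print(artwork_id, "via", pivot)
```

Recording which pivot was used at each step is what makes it possible to describe the path back to the visitor.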

I think there's something here as a discovery mechanism to help people get lost - in a good way - inside massive art collections. It's a quick way to explore the breadth of a collection and is like a compressed walk through six floors of a physical gallery.

So: is this actually useful? Well, I've appreciated it. I've found great artwork by Olalekan Jeyifous and Zhang Ke that I would otherwise never have found. I've taken trips to the Balkans and gone back in time to the Gorbals. I've gotten lost in posters - check the potential influences, contemporaries and later artworks - and also discovered some truly random rabbit holes.

Despite this, I feel like this approach can still be dramatically improved. The AIs that generate the text are still not great, so searches like art nouveau yield results that clearly include incorrect artworks. I haven't yet figured out how to get computers to ignore background colors or colors that are artifacts of digitization; these are not colors a human would ever use to describe an artwork, but they are ones a computer immediately and naively latches onto. Finally, our AI friends can describe an image but they cannot give you any context for it.

For example Marcel Duchamp's Bicycle Wheel is quite abstract: as the MoMA says, art is whatever the artist says it is. But our AI friends are not yet that articulate. They describe the art as follows:

The image depicts a sculpture that combines a bicycle wheel mounted on a wooden stool. The medium is a found object, specifically a bicycle wheel, which has been repurposed as part of the artwork. The technique used here is assemblage, where different materials are combined to create a new object or artwork.

The stylistic reference is clearly Dada, a movement that emerged in the early 20th century, characterized by its rejection of traditional artistic norms and its embrace of found objects and unconventional materials. This piece aligns with Dada's spirit of challenging conventional art and questioning the boundaries of what constitutes art.

What is particularly interesting about this artwork is its subversion of the ordinary. By placing a bicycle wheel on a stool, the artist transforms a mundane object into a work of art, inviting viewers to reconsider their perceptions of everyday items. The simplicity of the materials and the stark contrast between the industrial wheel and the rustic stool create a striking visual effect. This piece also challenges the viewer to think about the relationship between function and form, as the wheel, which is typically associated with movement and purpose, is now stationary and static.

Not bad - but it is not good enough for a vector search to find the items identified as readymade by a human curator at the MoMA.

Despite these shortcomings I hope you'll give the MoMA Explorer a try. I guarantee you'll find something interesting!

