Searching on Wikimedia Commons with Structured Data

Over the last few years, Wikimedia has been working on something called “Structured Data”. Without getting into the details of what structured data means and does, the important thing to know is that it lets multimedia content carry different sets of data that enhance search, discoverability and multilinguality. To do this, Wikimedia Commons uses the information stored in Wikidata, the free knowledge database of the Wikimedia projects.

You can use structured data to search for very specific types of multimedia content. For example, take a look at this amazing picture by Yann Macherez, “Harvesting seaweed in Jambiani”.

Image for post
Harvesting seaweed in Jambiani by Yann Macherez is licensed under CC BY-SA 4.0.

If you go to the file's page, you will find two “tabs”: one that says “File information” and another that says “Structured data”.

Image for post
Tab one: “File information”, tab two: “Structured data”.

File information is the default information that Wikimedia Commons shows you. But if you click on “Structured data”, you will see something like this:

Image for post
The structured data tab retrieves “items portrayed in this file”.

Depicts is a “property” of Wikidata (identified with the letter P and the number 180, as there are many other properties in Wikidata). The list under depicts is a set of items from Wikidata, identified with their Q number. For example, ocean in Wikidata is Q9430.
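
To make the P and Q numbers more concrete, here is a minimal Python sketch that pulls the depicted items out of the structured data of one file. It assumes the data arrives in the shape that, as far as I know, the Commons `wbgetentities` API returns for a file (a `statements` dictionary keyed by property, with the Q number nested under `mainsnak`). The function name `depicted_items`, the sample dictionary and the `M123` identifier are made up for illustration; they are not real data for any file.

```python
# Minimal sketch: read the "depicts" (P180) item IDs from one file's
# structured data. The dictionary layout is an assumption about the
# API response shape, not something taken from this article.

DEPICTS = "P180"


def depicted_items(entity):
    """Return the Q numbers listed under depicts (P180) for one entity."""
    # A file without structured data may carry an empty list, not a dict.
    statements = entity.get("statements") or {}
    if not isinstance(statements, dict):
        return []
    item_ids = []
    for statement in statements.get(DEPICTS, []):
        snak = statement.get("mainsnak", {})
        # Skip "unknown value" / "no value" statements: they name no item.
        if snak.get("snaktype") != "value":
            continue
        item_ids.append(snak["datavalue"]["value"]["id"])
    return item_ids


# Hand-written sample; "M123" is a hypothetical identifier.
sample_entity = {
    "id": "M123",
    "statements": {
        "P180": [
            {"mainsnak": {"snaktype": "value",
                          "datavalue": {"value": {"id": "Q9430"}}}},
            {"mainsnak": {"snaktype": "somevalue"}},
        ]
    },
}

print(depicted_items(sample_entity))  # ['Q9430']
```

The sample contains only ocean (Q9430), plus one statement without a value to show that those are skipped.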

Let’s say that after finding this picture, you want to find more images that depict (property 180, or P180) ocean (Q9430). To do that, go to the search box in Wikimedia Commons:

Image for post
The search box in Wikimedia Commons is literally that box that appears in the top right corner.

and scroll down to the bottom of the search box (in this case, I’m doing it with seaweed, but you could do it with ocean):

Image for post

This is what the search results will look like (for ocean, but you get the idea).

Image for post

This uses the structured data to enhance the search, offering better results than if you just type “ocean” or “seaweed”. Compare it with the results you get when you type only “ocean” in the search box. And what are the results if you use Special:MediaSearch?

The search produces the following string:

haswbstatement:P180=Q9430

This “haswbstatement:P180=QNUMBER” pattern can be applied to pretty much any item in Wikidata that could potentially be used in a “depicts” statement. Try searching for seaweed (Q237169) or farming (Q11451). Additionally, if there is a photo that you like, you can always go to its Structured data tab and see what it depicts to search for similar results.
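
If you want to build these search strings without typing them by hand, the pattern is simple enough to sketch in a few lines of Python. The function names `depicts_query` and `commons_search_url` are made up for this example, and the URL layout (the `search`, `title` and `type` parameters on `index.php`) is my assumption about how Special:MediaSearch links are formed, so treat it as a starting point and not as an official recipe.

```python
# Minimal sketch: build a "haswbstatement" search string and a Commons
# search link for it. Function names and URL layout are assumptions.
import re
from urllib.parse import urlencode

PROPERTY_ID = re.compile(r"^P[1-9]\d*$")  # e.g. P180
ITEM_ID = re.compile(r"^Q[1-9]\d*$")      # e.g. Q9430


def depicts_query(item_id, property_id="P180"):
    """Return a search string such as haswbstatement:P180=Q9430."""
    if not PROPERTY_ID.match(property_id):
        raise ValueError(f"not a property ID: {property_id!r}")
    if not ITEM_ID.match(item_id):
        raise ValueError(f"not an item ID: {item_id!r}")
    return f"haswbstatement:{property_id}={item_id}"


def commons_search_url(query):
    """Return a Special:MediaSearch link for the given search string."""
    params = {"search": query, "title": "Special:MediaSearch", "type": "image"}
    return "https://commons.wikimedia.org/w/index.php?" + urlencode(params)


query = depicts_query("Q9430")
print(query)  # haswbstatement:P180=Q9430
print(commons_search_url(query))
```

Swapping in another Q number, for example `depicts_query("Q11451")`, gives the search string for a different item. The check on the ID format only catches typos; it does not confirm that the item exists in Wikidata.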

I wrote this as part of a larger guide, but I find the content to be too specific. I’m publishing it here because it’s done and it’s easy.

Updated thanks to the great feedback of Sandra Fauconnier

