The ‘hidden’ connections in Google’s Knowledge Graph

As far as I know, the only way to query Google’s Knowledge Graph at the moment is through its search API. Let’s run a query on it and search, for instance, for Miles Davis’ album “Sketches of Spain”.<your_key_here>&limit=1
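For anyone who wants to script the query rather than paste it into a browser, here is a minimal Python sketch. It assumes the publicly documented Knowledge Graph Search API endpoint and its `query`, `key` and `limit` parameters; the helper names `build_search_url` and `search` are mine, purely for illustration, and you need to supply your own API key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the publicly documented Knowledge Graph Search API endpoint.
KG_SEARCH_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_search_url(query, api_key, limit=1):
    """Build the search URL for a free-text query (hypothetical helper)."""
    params = {"query": query, "key": api_key, "limit": limit}
    return f"{KG_SEARCH_ENDPOINT}?{urlencode(params)}"


def search(query, api_key, limit=1):
    """Run the query and return the parsed JSON-LD response as a dict."""
    with urlopen(build_search_url(query, api_key, limit)) as response:
        return json.load(response)


# Building the URL needs no network access; replace the placeholder key
# with a real one before calling search().
url = build_search_url("Sketches of Spain", "<your_key_here>")
print(url)
```

Calling `search("Sketches of Spain", your_key)` with a valid key should return the JSON-LD document discussed next.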

The API returns this JSON-LD fragment (thanks, Jos de Jong, for the great JSON Editor Online):


Strip out the wrapping entities and each search result returned is just a node from the Knowledge Graph, for which we get the id, type (category), name and description. Additionally, the node may be linked to a Wikipedia page that provides a detailed description of the entity. That’s what the red box highlights in the previous fragment. Visually, what we get is something like this:
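In code, the “strip out the wrapping” step might look like the sketch below. It assumes the response layout I see from the search API (an `itemListElement` list whose entries carry a `result` object, with the Wikipedia link under `detailedDescription.url`). The sample response is hand-written for illustration: the `@id` is a placeholder, not a real Knowledge Graph identifier, and `extract_nodes` is a hypothetical helper name.

```python
def extract_nodes(response):
    """Flatten a search response into plain node dictionaries."""
    nodes = []
    for element in response.get("itemListElement", []):
        result = element.get("result", {})
        nodes.append({
            "id": result.get("@id"),
            "types": result.get("@type", []),
            "name": result.get("name"),
            "description": result.get("description"),
            # The Wikipedia link is optional, so default to None.
            "wikipedia_url": result.get("detailedDescription", {}).get("url"),
        })
    return nodes


# Illustrative, hand-written response; the @id below is a placeholder.
sample_response = {
    "@type": "ItemList",
    "itemListElement": [
        {
            "@type": "EntitySearchResult",
            "result": {
                "@id": "kg:/m/ALBUM_PLACEHOLDER",
                "@type": ["MusicAlbum", "Thing"],
                "name": "Sketches of Spain",
                "description": "Album by Miles Davis",
                "detailedDescription": {
                    "url": "https://en.wikipedia.org/wiki/Sketches_of_Spain"
                },
            },
        }
    ],
}

nodes = extract_nodes(sample_response)
print(nodes[0]["name"], nodes[0]["types"])
```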


This is nice: a text search returns an entity from Google’s Knowledge Graph, and it comes back as structured data… yes, but there’s something missing. I don’t think I’d be exaggerating if I said that the most important bit is missing: the context, the connections, the other bits of the graph that this entity relates to. Let me explain what I mean. If I run the same search in a browser, I get a much richer result from the Knowledge Graph:


The dashed red box shows what the search API currently returns, and the bits connected by the arrows are the context I’m talking about: the author of the album, the producers, the awards received, the genre… The data is obviously in the graph, and JSON-LD’s capabilities for expressing rich linked data are crying out to be used. As if that were not enough, the relationships are already defined in schema.org, so it looks like we have all we need. Actually, Google, you have all you need 🙂

Right, so based on this, what would a (WAY) richer result look like? Look at the little blue box that I added to the original query output:


Or, for a probably more intuitive representation, look at the graph that this new JSON-LD fragment represents:
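The graph behind a JSON-LD node can also be read off mechanically: every property whose value is an object with an `@id` is an edge to another node. Here is a small sketch of that idea. The property names `byArtist` and `producer` are schema.org terms that I picked for illustration, every `@id` is a placeholder, and the producer names are stand-ins rather than real data.

```python
def edges(node):
    """List (subject id, property, object id) triples for linked nodes."""
    triples = []
    for prop, value in node.items():
        if prop.startswith("@"):
            continue  # skip JSON-LD keywords such as @id and @type
        items = value if isinstance(value, list) else [value]
        for item in items:
            # Only nested objects with an @id are links to other nodes;
            # plain literals (name, description) are attributes, not edges.
            if isinstance(item, dict) and "@id" in item:
                triples.append((node["@id"], prop, item["@id"]))
    return triples


# Illustrative enriched node; ids and producer names are placeholders.
album = {
    "@id": "kg:/m/ALBUM_PLACEHOLDER",
    "@type": ["MusicAlbum", "Thing"],
    "name": "Sketches of Spain",
    "byArtist": {"@id": "kg:/m/ARTIST_PLACEHOLDER", "name": "Miles Davis"},
    "producer": [
        {"@id": "kg:/m/PRODUCER_1_PLACEHOLDER", "name": "Producer 1"},
        {"@id": "kg:/m/PRODUCER_2_PLACEHOLDER", "name": "Producer 2"},
    ],
}

for subject, prop, obj in edges(album):
    print(f"{subject} -[{prop}]-> {obj}")
```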


Wouldn’t it be cool? And not only cool but also extremely useful? Let me know your thoughts.

And yes, for those of you wondering where I got the IRIs of the extra nodes and whether they are real or made up: I ran separate queries on the search API for each of the related entities and stuck the results together manually. So the IRIs are valid, but they were retrieved separately.
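I did the stitching by hand, but it could be scripted roughly like this. Treat it as a sketch under assumptions: `link_related` is a hypothetical helper, the property names `byArtist` and `genre` are schema.org terms chosen for illustration (not necessarily the ones in my figure), and all the `@id` values are placeholders standing in for IRIs you would retrieve with separate queries.

```python
import json


def link_related(node, relations):
    """Return a copy of `node` with references to related nodes attached.

    `relations` maps a property name to a list of separately fetched nodes.
    Only @id and name are embedded, so each link stays a lightweight reference.
    """
    enriched = dict(node)
    for prop, targets in relations.items():
        refs = [{"@id": t["@id"], "name": t["name"]} for t in targets]
        # A single target becomes an object, several become a list.
        enriched[prop] = refs[0] if len(refs) == 1 else refs
    return enriched


# Placeholder nodes standing in for the results of separate searches.
album = {
    "@id": "kg:/m/ALBUM_PLACEHOLDER",
    "@type": ["MusicAlbum", "Thing"],
    "name": "Sketches of Spain",
}
artist = {
    "@id": "kg:/m/ARTIST_PLACEHOLDER",
    "@type": ["Person", "Thing"],
    "name": "Miles Davis",
}
genre = {"@id": "kg:/m/GENRE_PLACEHOLDER", "name": "Genre placeholder"}

enriched = link_related(album, {"byArtist": [artist], "genre": [genre]})
print(json.dumps(enriched, indent=2))
```

The original `album` dictionary is left untouched, so the same base node can be enriched with different sets of relationships.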

One final comment: If you’re interested in publishing/sharing connected data (graph data) as JSON-LD straight from your Neo4j Graph Database, have a look at this repo.




