RDF Search Engine Glimmer Released by Yahoo

by Aaron Bradley on June 20, 2013

in Search Engines, Semantic Web

Today Peter Mika of Yahoo! announced the open source code release and public demo of Glimmer, a search engine for RDF Data.

As Mika describes the search engine in his announcement, Glimmer "provides support for offline distributed indexing of RDF data using Hadoop MapReduce. It also contains an online ranking component using a state-of-the-art method based on BM25F…."

Mika, Roi Blanco and Sebastiano Vigna describe the adaptation of BM25F used by Glimmer in the paper "Effective and Efficient Entity Search in RDF Data", presented at the International Semantic Web Conference (ISWC) 2011.
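For readers who haven't met BM25F before, here is a rough sketch of the general idea: term frequencies are counted per field, length-normalized, weighted by field, summed, and then saturated. This is textbook-style BM25F, not the specific adaptation from the paper or Glimmer's code; the function name, the field names, the weights, and the `k1` and `b` values are all assumptions chosen for illustration.

```python
import math


def bm25f_score(query_terms, doc_fields, field_weights, avg_field_len,
                doc_freq, num_docs, k1=1.2, b=0.75):
    """Generic BM25F sketch (illustrative only, not Glimmer's implementation).

    doc_fields:    field name -> list of tokens in that field
    field_weights: field name -> boost for that field (assumed values)
    avg_field_len: field name -> average token count across the collection
    doc_freq:      term -> number of documents containing the term
    """
    score = 0.0
    for term in query_terms:
        # Combine per-field term frequencies into one weighted pseudo-frequency.
        pseudo_tf = 0.0
        for field, tokens in doc_fields.items():
            tf = tokens.count(term)
            if tf == 0:
                continue
            # Per-field length normalization.
            norm = 1.0 - b + b * len(tokens) / avg_field_len[field]
            pseudo_tf += field_weights.get(field, 1.0) * tf / norm
        if pseudo_tf == 0.0:
            continue
        df = doc_freq.get(term, 0)
        # Non-negative IDF variant.
        idf = math.log(1.0 + (num_docs - df + 0.5) / (df + 0.5))
        # Saturate the combined frequency so repeated terms give diminishing returns.
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score


# Made-up example: the same term matters more in "name" than in "description".
weights = {"name": 3.0, "description": 1.0}
avg_len = {"name": 2.0, "description": 2.0}
doc_a = {"name": ["student", "clinic"], "description": ["health", "services"]}
doc_b = {"name": ["health", "clinic"], "description": ["student", "services"]}
score_a = bm25f_score(["student"], doc_a, weights, avg_len, {"student": 2}, 10)
score_b = bm25f_score(["student"], doc_b, weights, avg_len, {"student": 2}, 10)
```

With these assumed weights, `doc_a` (match in the name) scores higher than `doc_b` (match in the description), which is the kind of field-sensitive behaviour that makes BM25F a natural fit for data organized by property.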

The demo allows searching of 750 million triples – the subset of the Web Data Commons that uses schema.org. This dataset was used, Mika says, because it was previously available only as a static download, and because "providing API access will make it easier to analyze the data (which previously required AWS payment) and to develop innovative applications."

The main unified search interface supports queries that combine item properties.

The Main Yahoo! Glimmer Search Interface

Glimmer also supports more complex queries that combine schema.org types and properties, like this search for recipes that contain chicken.

Glimmer Search and Partial Results for a Query for Recipes that Contain Chicken

Glimmer's ontology-based search makes hunting down specific schema.org types easier, and takes the guesswork out of putting together the syntax for complex queries.  I was able to conduct the query below simply by selecting "Medical Clinic" and typing "student" into the name field; the same query is expressed this way in the unified search query box: type:{http://schema.org/MedicalClinic} (predicate:{http://schema.org/name} ^ object:student).

A Glimmer Ontology-Based Search
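If you'd rather assemble the unified-search form of that query yourself, a small helper can do it. The sketch below only reproduces the one pattern shown above (a type, plus a predicate and object term joined with `^`); the function name is hypothetical, and I'm not assuming anything about the rest of Glimmer's query grammar.

```python
def build_glimmer_query(type_uri, predicate_uri, object_text):
    """Build a query string in the pattern shown in the article.

    Hypothetical helper: it mirrors the single example above and is not
    an official Glimmer API.
    """
    return (
        f"type:{{{type_uri}}} "
        f"(predicate:{{{predicate_uri}}} ^ object:{object_text})"
    )


query = build_glimmer_query(
    "http://schema.org/MedicalClinic",
    "http://schema.org/name",
    "student",
)
print(query)
```

The resulting string can be pasted into the unified search query box.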

Glimmer also sports a browse pad that allows you to drill down into the schema.org type hierarchy (more generally, as per Mika's announcement, to select "a class from the taxonomy shown on the right" of the Glimmer screen).

The Glimmer Offer Hierarchy

This is an RDF search engine, so while I've selected examples above related to the schema.org namespace, Glimmer is – as the description suggests – more generally designed to facilitate the searching of structured data by type (class), predicate and object.  You know – triples.  Mika says that they are planning to "add more collections to the demo in the future," so at some point we may see triples in the collection aside from those that use the schema.org namespace.
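To make "searching by type, predicate and object" concrete, here is a toy in-memory version of that kind of lookup. It has nothing to do with how Glimmer actually indexes data (that happens offline with Hadoop, per the announcement); the `ex:` subjects and their names are invented sample data, and `find_subjects` is a hypothetical helper.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "http://schema.org/"

# Invented sample triples: (subject, predicate, object).
triples = [
    ("ex:clinic1", RDF_TYPE, SCHEMA + "MedicalClinic"),
    ("ex:clinic1", SCHEMA + "name", "Student Health Clinic"),
    ("ex:clinic2", RDF_TYPE, SCHEMA + "MedicalClinic"),
    ("ex:clinic2", SCHEMA + "name", "Downtown Walk-In Clinic"),
    ("ex:recipe1", RDF_TYPE, SCHEMA + "Recipe"),
    ("ex:recipe1", SCHEMA + "name", "Student Budget Chicken"),
]


def find_subjects(triples, type_uri, predicate_uri, keyword):
    """Return subjects of the given type whose predicate value contains keyword."""
    # First pass: which subjects are declared to be of the requested class?
    typed = {s for s, p, o in triples if p == RDF_TYPE and o == type_uri}
    keyword = keyword.lower()
    # Second pass: keep typed subjects with a matching predicate/object pair.
    matches = {
        s for s, p, o in triples
        if s in typed and p == predicate_uri and keyword in o.lower()
    }
    return sorted(matches)


clinics = find_subjects(triples, SCHEMA + "MedicalClinic", SCHEMA + "name", "student")
```

Note that the recipe also has "Student" in its name but is filtered out by the type constraint, which is the point of combining class and property in one query.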

Interesting work here from the Semantic Search research group at Yahoo! Labs, and definitely worth exploring.

1 David Deering October 10, 2013 at 9:34 am

Aaron, thanks for the post about this (even though I’m a few months behind in seeing it!). Is it possible to use this tool to find examples of schema markups out there in the wild? I’m a visual person, so it’s a lot easier for me to learn how to do something by looking and comparing rather than trying to read directions, if you know what I mean. So I’d like to find a way to find schema markup examples that are not found on schema.org, so that’s why I ask. Thanks in advance for the post and your help.
