Relevancy Ranking For RDF

ReConRank: A Scalable Ranking Method for Semantic Web Data with Context (pdf), by Aidan Hogan, Andreas Harth and Stefan Decker
This paper presents a way of transforming the results of a text query over a set of indexed RDF data into a directed graph that is suitable for ordering with a PageRank-like relevancy ranking. The cool thing here is that the ranking is done at query time, not at index time, which means a) there’s no need to re-index to change ranking scores and b) it can handle arbitrary RDF data, with no upfront knowledge of any schema required.

Basically, the index search retrieves a set of resources from which a topical subgraph is derived. This is combined with the named graphs in the dataset in which each resource is described, effectively implying a set of quads. This combined quad graph is boiled down to contain only resources and the links between them, keeping track of which resources are content and which are context (some may be both, of course). Finally, further links are inferred into the graph by propagating links upwards and downwards between the content and context layers.

The links within the resulting graph are then analysed and two ranking tables are built: inbound links to content resources inform result ordering, and inbound links into context nodes relate to provenance, i.e. the more links into a named graph, the more valuable its content. I like the idea of this and would love to try out some similar stuff in Bigfoot.
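To make the steps above a bit more concrete, here is a minimal Python sketch of the general idea. It is not the paper's algorithm, just my reading of the outline. The function names (`build_graph`, `pagerank`, `rank_tables`) are made up for illustration, quads are assumed to be plain `(subject, predicate, object, context)` tuples with literals already filtered out, and the link propagation rules (two-way links between a resource and the context describing it, plus a context-to-context link whenever a resource in one context points at a resource described in another) are my own simplification. The ranking itself is ordinary PageRank by power iteration with an assumed damping factor of 0.85.

```python
from collections import defaultdict


def build_graph(quads):
    """Boil (subject, predicate, object, context) quads down to a link graph.

    Assumes literal objects were filtered out beforehand, so every subject
    and object is a resource. The propagation rules are a simplification.
    """
    quads = list(quads)
    edges = defaultdict(set)         # node -> set of nodes it links to
    described_in = defaultdict(set)  # resource -> contexts describing it
    resources, contexts = set(), set()

    for s, _p, o, c in quads:
        resources.update((s, o))
        contexts.add(c)
        described_in[s].add(c)
        edges[s].add(o)  # content layer: resource -> resource
        edges[c].add(s)  # downwards: context -> resource it describes
        edges[s].add(c)  # upwards: resource -> context describing it

    # Inferred context-layer links: a resource in context c points at a
    # resource that is described in a different context c2.
    for _s, _p, o, c in quads:
        for c2 in described_in.get(o, ()):
            if c2 != c:
                edges[c].add(c2)

    return edges, resources, contexts


def pagerank(edges, damping=0.85, iterations=50):
    """Plain PageRank by power iteration; dangling mass is spread evenly."""
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not edges.get(v))
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in edges.items():
            if not targets:
                continue
            share = damping * rank[src] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank


def rank_tables(quads):
    """Return (resources, contexts), each ordered from highest score down."""
    edges, resources, contexts = build_graph(quads)
    scores = pagerank(edges)

    def by_score(names):
        return sorted(names, key=lambda x: (-scores[x], x))

    return by_score(resources), by_score(contexts)
```

Calling `rank_tables` on the quads pulled back for a query gives one ordering for the result resources and another for the named graphs they came from, which is roughly the pair of ranking tables described above. Because it only needs the quads for the current result set, nothing here depends on the index or on any particular schema.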

On a side note, Danny was dismayed to see that the SWSE search engine that Andreas et al. have built using ReConRank, and labelled “the George Foreman Grill of search engines”, has beaten him to the punch somewhat.

License

This work is published under a Creative Commons Attribution-Share Alike 2.5 License.