D-SPARQ: Distributed, Scalable and Efficient RDF Query Engine

Title: D-SPARQ: Distributed, Scalable and Efficient RDF Query Engine
Publication Type: Conference Papers
Year of Publication: 2013
Authors: Mutharaju, R, Sakr, S, Sala, A, Hitzler, P
Editor: Blomqvist, E, Groza, T
Conference Name: Proceedings of the ISWC 2013 Posters & Demonstrations Track
Conference Location: Sydney, Australia
Keywords: D-SPARQ, Distributed Querying, Scalable RDF querying, SPARQL

We present D-SPARQ, a distributed RDF query engine that combines the MapReduce processing framework with MongoDB, a distributed NoSQL data store. The performance of SPARQL query processing depends mainly on how efficiently the join operations between RDF triple patterns are handled. Our system has two distinguishing characteristics that address this challenge: 1) it identifies specific patterns in the input query so that different parts of the query can be run in parallel; 2) it uses triple selectivity information to reorder the individual triple patterns of the input query within the identified query patterns. Preliminary results demonstrate the scalability and efficiency of our distributed RDF query engine.
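
The two characteristics described in the abstract can be illustrated with a minimal Python sketch. It is not the D-SPARQ implementation: the abstract does not say which query patterns are identified or how selectivity is measured, so the sketch assumes that patterns sharing a subject form one independently processable group, and that selectivity is approximated by a per-predicate match count. The names `group_by_subject`, `reorder_by_selectivity`, and `estimated_matches`, as well as the sample query and counts, are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A triple pattern is (subject, predicate, object); variables start with "?".
TriplePattern = Tuple[str, str, str]


def group_by_subject(patterns: List[TriplePattern]) -> Dict[str, List[TriplePattern]]:
    """Assumption: patterns sharing a subject form one group that could be
    evaluated independently of (and in parallel with) the other groups."""
    groups: Dict[str, List[TriplePattern]] = defaultdict(list)
    for pattern in patterns:
        groups[pattern[0]].append(pattern)
    return dict(groups)


def reorder_by_selectivity(
    patterns: List[TriplePattern], estimated_matches: Dict[str, int]
) -> List[TriplePattern]:
    """Put the most selective pattern (fewest estimated matches) first, so
    later joins start from a small intermediate result. Predicates without
    a statistic are treated as least selective and sorted last."""
    return sorted(
        patterns, key=lambda p: estimated_matches.get(p[1], float("inf"))
    )


# Hypothetical per-predicate match counts (illustrative values only).
estimated_matches = {
    "rdf:type": 50000,
    "ub:name": 20000,
    "ub:advisor": 3000,
    "ub:teacherOf": 1500,
}

# Hypothetical basic graph pattern of a SPARQL query.
query_patterns = [
    ("?student", "rdf:type", "ub:GraduateStudent"),
    ("?student", "ub:name", "?name"),
    ("?student", "ub:advisor", "?prof"),
    ("?prof", "ub:teacherOf", "?course"),
    ("?prof", "ub:name", "?profName"),
]

# One reordered list per subject group.
plan = {
    subject: reorder_by_selectivity(group, estimated_matches)
    for subject, group in group_by_subject(query_patterns).items()
}

for subject, ordered in plan.items():
    print(subject, [predicate for _, predicate, _ in ordered])
```

In this sketch each subject group would be a candidate unit of parallel work, and the ordering inside a group decides which triple pattern is matched first. A real engine would draw its statistics from the data store rather than from a hard-coded dictionary.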