2. June 2015
The art of querying multimedia and geospatial data
This is SPARQL!
Over the last decade, we have witnessed the incredible growth of web technologies and services. Nowadays, information is all around us: television, video content, Facebook, Twitter, and Google. If you want to read the latest news, there are hundreds of web resources. If you need to locate an address, no problem: Google Maps knows the answer. Almost anything can be found with just a few clicks.
On the other hand, if you are a professional searching for specific information in a database, you may be in trouble. For example, if you are a journalist writing an article about the Islamic State, you would need information about all populated places in the occupied territories. Or, if your article is about US policy in the Middle East, you may need to find specific video content showing the US president next to leaders of countries in the region. So, what do you do if you have an archive with thousands of videos but need just a few specific frames? Is there a technology that can help you and save you time and effort?
The answer is yes. Such technologies exist: GeoSPARQL and SPARQL-MM. Both are extensions of SPARQL (SPARQL Protocol and RDF Query Language) and provide the functionality needed for the tasks above. Here, we want to present some of their capabilities as well as their implementation in GraphDB.
GeoSPARQL
GeoSPARQL is a standard for retrieving and inserting geospatial RDF data in a wide range of geospatial use cases, from simple point-of-interest knowledge bases to detailed authoritative geospatial data sources for transportation. The current official specification version is OGC 11-052r4 (2012-09-10). It allows two formats for expressing geometries: WKT (well-known text) and GML (an XML-based format). The default coordinate system is WGS84 (World Geodetic System 1984). The following diagram shows a simplified layout of some geometry classes and properties:
This SPARQL extension provides a very powerful instrument for querying and retrieving geospatial information from semantic repositories. Going back to our earlier example of the article about the Islamic State, it is now very easy to find all populated places in the occupied territories by using datasets such as GeoNames, which is one of the core datasets in the Linked Open Data cloud. The populated places in this area cannot be found through a place hierarchy because, at the moment, the Islamic State is an illegal organization and so does not appear in one. Instead, we need a geometry (a polygon) to retrieve all cities in that region, and here we can use the full power of GeoSPARQL.
In the past, GraphDB could only check whether a location fell inside a rectangle but now, thanks to GeoSPARQL, we are able to use a wide range of geometry shapes.
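The polygon lookup described above can be sketched as a small Python script that assembles a GeoSPARQL query. This is only an illustration under stated assumptions: the helper names `polygon_wkt` and `places_within_query` are hypothetical, the rectangle is a made-up example rather than a real boundary, and the query assumes the repository exposes each place's geometry through `geo:hasGeometry` and `geo:asWKT` (a GeoNames dump may need to be converted to that form first). The `geo:` and `geof:` terms and the `gn:featureClass gn:P` pattern for populated places come from the GeoSPARQL standard and the GeoNames ontology.

```python
def polygon_wkt(ring):
    """Return a WKT POLYGON string for a ring of (longitude, latitude) pairs."""
    points = list(ring)
    if points[0] != points[-1]:
        points.append(points[0])  # a WKT ring must end where it starts
    coords = ", ".join(f"{lon} {lat}" for lon, lat in points)
    return f"POLYGON(({coords}))"


def places_within_query(ring):
    """Build a GeoSPARQL query for populated places inside the given ring."""
    return f"""
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX gn: <http://www.geonames.org/ontology#>

SELECT ?place ?name WHERE {{
  ?place gn:featureClass gn:P ;        # GeoNames class P: populated places
         gn:name ?name ;
         geo:hasGeometry ?geometry .   # assumption: GeoSPARQL geometries are loaded
  ?geometry geo:asWKT ?wkt .
  FILTER(geof:sfWithin(?wkt, "{polygon_wkt(ring)}"^^geo:wktLiteral))
}}
"""


# Illustrative rectangle only (longitude, latitude); not a real territorial boundary.
region = [(38.0, 34.0), (43.0, 34.0), (43.0, 37.0), (38.0, 37.0)]
print(places_within_query(region))
```

The printed query can be pasted into any GeoSPARQL-capable SPARQL endpoint. Replacing the rectangle with a more detailed ring changes only the WKT literal, not the query structure.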
SPARQL-MM
SPARQL-MM is an extension of the SPARQL query language. It provides spatio-temporal filters and aggregation functions, and can be quite valuable for journalists and media analysts. It efficiently retrieves multimedia content based on annotations of images and movies, using both temporal constraints and the positioning of objects within video scenes.
The SPARQL-MM support in GraphDB implements the additional index structures it requires and the handling of the SPARQL syntax extensions. These index structures allow SPARQL-MM queries to be executed in real time. For news and media organizations, this means that authors and editors can search content archives more productively. In our example of the article about US policy, the task of retrieving video frames showing the president next to political leaders of the Middle East becomes very easy with SPARQL-MM, saving you time, money and effort.
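To make "the president next to another leader" more concrete, here is a plain-Python sketch of the kind of check a spatio-temporal filter performs. It is not SPARQL-MM's API: the class and function names are hypothetical, the URIs are invented, and the exact semantics of SPARQL-MM's own relations may differ. It assumes each annotation identifies a region of a video with a W3C Media Fragments URI of the form `#t=start,end&xywh=x,y,w,h`.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlparse


@dataclass(frozen=True)
class Fragment:
    start: float  # start time in seconds
    end: float    # end time in seconds
    x: float      # left edge of the region in pixels
    y: float      # top edge of the region in pixels
    w: float      # region width in pixels
    h: float      # region height in pixels


def parse_fragment(uri):
    """Parse the '#t=start,end&xywh=x,y,w,h' part of a media fragment URI."""
    params = parse_qs(urlparse(uri).fragment)
    start, end = (float(v) for v in params["t"][0].split(","))
    x, y, w, h = (float(v) for v in params["xywh"][0].split(","))
    return Fragment(start, end, x, y, w, h)


def temporal_overlaps(a, b):
    """True if the two time intervals overlap."""
    return a.start < b.end and b.start < a.end


def left_beside(a, b):
    """True if region a lies entirely to the left of region b."""
    return a.x + a.w <= b.x


def appears_next_to(a, b):
    """True if both regions are on screen together, side by side."""
    return temporal_overlaps(a, b) and (left_beside(a, b) or left_beside(b, a))


# Invented annotations for two people in the same video.
president = parse_fragment("http://example.org/news.mp4#t=10,20&xywh=40,60,120,200")
leader = parse_fragment("http://example.org/news.mp4#t=15,30&xywh=220,60,120,200")
print(appears_next_to(president, leader))  # True
```

In a semantic repository the same test would be written as a filter inside the query, with the index structures doing the work that this sketch does by comparing numbers one pair at a time.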
Technologies such as GeoSPARQL and SPARQL-MM add great value to our work and everyday life. Tasks that took hours of hard work a few years ago are now routine operations, such as writing a simple SPARQL query or, even better, using a graphical user interface.