Repository: https://github.com/LambdaGeo/qgisparql-triple2layer/
Creators: Sérgio Souza Costa and Nerval Junior
This plugin aims to import data from a connected database and convert it into a geographic data layer in the QGIS geographic information system (GIS) (https://qgis.org/).
💡 The screenshots for this documentation were taken in QGIS 3.26.3 running on Windows. Depending on your setup, the screens you see may look slightly different. However, the same buttons will be available, and the instructions work on any operating system. You will need QGIS 3.4 or later to use this plugin.
💡 Before starting this exercise, the **Triple2Layer** plugin must be installed on your computer.
Let's start right away!
To use Triple2Layer, open QGIS and hover the mouse over "Vector" in the menu bar, which lists the tools for manipulating vector layers. From there you can access the DBCells plugins under the QGISPARQL entry.
In Figure 1 below, arrow 3 indicates the area of active plugins. Go to "Vector" in the menu bar (arrow 1), open the QGISPARQL entry, and select the desired plugin, in this case Triple2Layer (arrow 2).
Next, you can see in Figure 2 the initial interface of the "Triple2Layer" plugin.
In the main graphical interface of the Triple2Layer plugin, you can see in Figure 3 the first and second parts. The first part contains information about the loaded file and the type of endpoint. In the second part, there is a table where the attributes necessary for the import will be loaded. In this interface, you will also find the "Import" button, which performs the import of the layer to the server. Next to it, there is the "Cancel" button, which is used to cancel the entire execution and close the graphical interface.
This plugin aims to import connected data from a repository and convert it into a geographic data layer within the QGIS geographic information system (https://qgis.org/).
With the plugin open, the first step is to enter the name that the layer (geographic layer) will have when created, as shown in Figure 4. In this example, the layer name was set to "ACRE" since we will import data about this state.
The current version of the plugin allows importing data from two connected data sources:
- Connected database servers, known as triple stores, regardless of implementation (e.g., Virtuoso or Apache Jena Fuseki).
- A data portal called data.world (https://docs.data.world/).
These data portals have a collection of triples representing connected data, where each object can be related to other objects via predicates.
To create a geographic layer, this data must have a geometric attribute, such as the `geo:asWKT` predicate in Code 1:
```turtle
@prefix cells: <https://purl.org/linked-data/dbcells#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix sdmx: <http://purl.org/linked-data/sdmx/2009/dimension#> .

<https://purl.org/dbcells/epsg4326#R0_0830Cx-34_7917Cy-6_9714> a cells:Cell ;
    cells:resolution 8.3e-02 ;
    geo:asWKT "Polygon ((-34.8333349000000041 -6.92973086900001611, -34.750001570000002 -6.92973086900001611, -34.750001570000002 -7.01306419900001643, -34.8333349000000041 -7.01306419900001643, -34.8333349000000041 -6.92973086900001611))" ;
    sdmx:refArea "PB" .
```
In the connected data paradigm (see more at: https://ceweb.br/livros/dados-abertos-conectados/), each resource has a URI. In this example, we have the following URI representing a resource:
<https://purl.org/dbcells/epsg4326#R0_0830Cx-34_7917Cy-6_9714>
This resource is connected to other information through three predicates. In this example, the resource represents a polygon that has spatial resolution (area of the polygon), its geometric shape in WKT format, and additional information indicating in which Brazilian state this polygon is located.
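To make the idea of a resource described by triples more concrete, the following Python sketch stores the statements of Code 1 as (subject, predicate, object) tuples and looks them up by predicate. The tuple layout and the helper name `objects_of` are assumptions made for illustration only; they are not part of the plugin.

```python
# Illustrative only: the statements of Code 1 as plain Python tuples.
CELL = "https://purl.org/dbcells/epsg4326#R0_0830Cx-34_7917Cy-6_9714"
CELLS = "https://purl.org/linked-data/dbcells#"
GEO = "http://www.opengis.net/ont/geosparql#"
SDMX = "http://purl.org/linked-data/sdmx/2009/dimension#"
# In Turtle, the keyword "a" is shorthand for the rdf:type predicate.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

WKT = ("Polygon ((-34.8333349000000041 -6.92973086900001611, "
       "-34.750001570000002 -6.92973086900001611, "
       "-34.750001570000002 -7.01306419900001643, "
       "-34.8333349000000041 -7.01306419900001643, "
       "-34.8333349000000041 -6.92973086900001611))")

# Each triple is (subject, predicate, object).
triples = [
    (CELL, RDF_TYPE, CELLS + "Cell"),
    (CELL, CELLS + "resolution", 8.3e-02),
    (CELL, GEO + "asWKT", WKT),
    (CELL, SDMX + "refArea", "PB"),
]


def objects_of(subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


state = objects_of(CELL, SDMX + "refArea")
print(state)
```

Here the same subject URI appears in every tuple, which is what lets the separate statements be read as a description of one resource.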
💡 In a simplified manner, a connected database, such as Data.World or a triple store server, can be understood as a collection of triples, as described in Code 1.

We can then select the type of endpoint, either "Triple Store Endpoint" or "Data.world Dataset," as shown in Figure 5 by arrows 1 and 2.
For the Triple Store, we will enter the server's URL, as shown in Figure 6.
For Data.World, we will use the Dataset name, as shown in Figure 7.
Additionally, in the case of Data.World, for importing, it is necessary to define an access token. The token can be found in the settings on the Data.World portal, as shown in Figure 8.
In "Settings" in the top left corner, we can select the "Data.world Token" option, as shown in Figure 9.
With the read/write token copied beforehand, select the text box labeled 1, paste the token into it, and then click "OK" (arrow 2), as shown in Figure 10 below.
A connected data source can hold thousands or millions of triples, so these sources support a query language called SPARQL (https://www.w3.org/TR/sparql11-query). SPARQL lets us specify which triples to load by means of triple patterns. A pattern may contain variables, which are bound to the values that make the pattern match. Considering the database described in Code 1, we could use the following SPARQL query (Code 2) to retrieve the resolution and geometry of the matching objects.
```sparql
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#>
prefix dbc: <https://purl.org/linked-data/dbcells#>

SELECT ?cell ?resolution ?wkt
WHERE {
    ?cell geo:asWKT ?wkt.
    ?cell dbc:resolution ?resolution.
    ?cell sdmx-dimension:refArea "AC".
}
```
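As a rough illustration of how a query like this can reach a triple store, the sketch below builds an HTTP GET request in the style of the SPARQL 1.1 Protocol, using only the Python standard library. The endpoint URL is a hypothetical local server and the function name `build_sparql_request` is an assumption for this example; the plugin's own implementation may differ.

```python
from urllib.parse import urlencode
from urllib.request import Request

QUERY = """
prefix geo: <http://www.opengis.net/ont/geosparql#>
prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#>
prefix dbc: <https://purl.org/linked-data/dbcells#>
SELECT ?cell ?resolution ?wkt
WHERE {
    ?cell geo:asWKT ?wkt.
    ?cell dbc:resolution ?resolution.
    ?cell sdmx-dimension:refArea "AC".
}
"""


def build_sparql_request(endpoint_url, query):
    """Build (but do not send) a GET request carrying a SPARQL query."""
    # The query text travels URL-encoded in the "query" parameter.
    url = endpoint_url + "?" + urlencode({"query": query})
    # Ask the server for results in the SPARQL JSON results format.
    headers = {"Accept": "application/sparql-results+json"}
    return Request(url, headers=headers)


# Hypothetical endpoint URL, used here only as a placeholder.
request = build_sparql_request("http://localhost:3030/dataset/sparql", QUERY)
print(request.get_method(), request.full_url[:50])
```

Passing `request` to `urllib.request.urlopen` would actually send it; that step is left out so the sketch runs without a server.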
Notice the pattern `?cell sdmx-dimension:refArea "AC".` in this query. Because of it, the query returns only the information for one Brazilian state, in this case Acre. When this query is processed on a connected database, the result can be displayed as a table with three columns, one for each variable in the `SELECT` clause: `?cell`, `?resolution`, and `?wkt` (Table 1).
Next, we'll see that the plugin will need some information from the user to transform a data table, like the one described in Table 1, into a geographic layer.
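The tabular view of a query result mentioned above can be sketched in Python. SPARQL endpoints commonly return results in the SPARQL 1.1 Query Results JSON format, where `head.vars` lists the variables and `results.bindings` holds one entry per row. The sample document below is invented for illustration (its URI and values are placeholders, not real data), and `bindings_to_rows` is a hypothetical helper name.

```python
import json

# Invented sample in the SPARQL 1.1 Query Results JSON format.
SAMPLE = """
{
  "head": {"vars": ["cell", "resolution", "wkt"]},
  "results": {"bindings": [
    {"cell": {"type": "uri", "value": "https://example.org/cell/1"},
     "resolution": {"type": "literal", "value": "0.083"},
     "wkt": {"type": "literal", "value": "POINT (-70.0 -9.0)"}}
  ]}
}
"""


def bindings_to_rows(document):
    """Flatten a SPARQL JSON result into column names and row lists."""
    columns = document["head"]["vars"]
    rows = []
    for binding in document["results"]["bindings"]:
        # A variable may be unbound in a row; use None in that case.
        rows.append([binding.get(c, {}).get("value") for c in columns])
    return columns, rows


columns, rows = bindings_to_rows(json.loads(SAMPLE))
print(columns)
for row in rows:
    print(row)
```

Every value arrives as text in this format, which is why the attribute type has to be chosen separately before the layer is created.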
We can load a SPARQL file by opening the dialog, clicking on "Open SPARQL" as shown in Figure 11.
💡 For this example, consider a SPARQL file like the one described in Code 2.

Clicking the button opens a file dialog in the computer's file explorer, where we can select the file needed for the query, as shown in Figure 12. Choose the file (arrow 1) and click "Open" to load the data.
Next, we will see how the attributes will be defined for the creation of the geographic layer in QGIS.
With the loaded file, the attribute table will appear as shown in Figure 13.
The table (2) in Figure 13 is where we define the information needed to create the geographic layer from the result of the SPARQL query. For example, we can define which attribute will be the identifier and which one will represent the geometry. We can also set the name of each attribute, which may differ from the name of the variable in the SPARQL file, as well as its data type and whether it will be imported.
First, in the attribute selection of Figure 14, we choose the attribute that represents the geometry. So far, the only representation supported by the plugin is WKT (Well-Known Text), which can define points, lines, and polygons. Once a variable is selected as the WKT geometry, as shown in Figure 14, the "Attribute name" and "Attribute type" options for it are disabled.
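To show what a WKT value carries, here is a minimal Python sketch that reads the geometry keyword of a WKT string and extracts the outer ring of a simple polygon, using the WKT from Code 1. The function names are hypothetical and the parser deliberately handles only this simple case; inside QGIS, a call such as `QgsGeometry.fromWkt` would normally do this work.

```python
import re

# WKT string taken from Code 1.
WKT = ("Polygon ((-34.8333349000000041 -6.92973086900001611, "
       "-34.750001570000002 -6.92973086900001611, "
       "-34.750001570000002 -7.01306419900001643, "
       "-34.8333349000000041 -7.01306419900001643, "
       "-34.8333349000000041 -6.92973086900001611))")


def wkt_geometry_type(wkt):
    """Return the leading geometry keyword of a WKT string, upper-cased."""
    match = re.match(r"\s*([A-Za-z]+)", wkt)
    if match is None:
        raise ValueError("not a WKT string: %r" % wkt)
    return match.group(1).upper()


def polygon_exterior(wkt):
    """Parse the outer ring of a simple WKT polygon into (x, y) tuples."""
    if wkt_geometry_type(wkt) != "POLYGON":
        raise ValueError("expected a POLYGON")
    # The outer ring sits between the first "((" and the first ")".
    ring = wkt[wkt.index("((") + 2:wkt.index(")")]
    return [tuple(float(v) for v in pair.split()) for pair in ring.split(",")]


ring = polygon_exterior(WKT)
print(wkt_geometry_type(WKT), len(ring))
```

A polygon ring is closed, so its first and last coordinate pairs are the same point.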
Not all attributes returned by the query need to be imported. The user marks the attributes that should be imported into the geographic layer.
In the attribute selection, it is possible to change the name of the attribute and its data type. To change the name of the attribute, simply double-click on the selected attribute and type the new name.
Regarding data types, attributes are imported as `String`, which represents text, by default. However, we can change the type to `Int`, which represents an integer, or `Double`, which represents a real number. In this example, the resolution is a numeric value with decimal places, representing the resolution of the cell, so it is better represented as a `Double`.
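The effect of choosing an attribute type can be sketched as a simple conversion of the text values that come back from a query. The mapping below from the type names used in the interface to Python types is an assumption made for illustration; it does not reproduce the plugin's internal code.

```python
# Assumed mapping from the type names in the interface to Python types.
CASTS = {"String": str, "Int": int, "Double": float}


def cast_value(raw, type_name="String"):
    """Convert a raw text value to the chosen attribute type."""
    return CASTS[type_name](raw)


resolution = cast_value("8.3e-02", "Double")
count = cast_value("42", "Int")
label = cast_value("AC")
print(resolution, count, label)
```

In this sketch a value such as "8.3e-02" cannot be converted with `Int`, since Python's `int` raises a `ValueError` for it, which matches the advice to treat the resolution as a `Double`.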
A geographic layer, like a table in a database, requires that each record (or row) has an identifier (known as the primary key). In some cases, this information will come as a result of querying the repository, so you can indicate which of the attributes represents this identifier. An important criterion is that the values of this attribute must be unique; repetitions are not allowed. If not selected, an ID will be defined as auto-increment, i.e., integer values from 1 to the number of objects.
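The identifier rule described above can be sketched as follows: if the user picks an attribute, its values must be unique; otherwise sequential integers starting at 1 are used. The function name `assign_ids` and the row representation (one dictionary per record) are hypothetical choices for this example.

```python
def assign_ids(rows, id_column=None):
    """Return one identifier per row.

    With no id_column, fall back to auto-increment values 1..len(rows).
    With an id_column, require that its values are unique.
    """
    if id_column is None:
        return list(range(1, len(rows) + 1))
    ids = [row[id_column] for row in rows]
    if len(set(ids)) != len(ids):
        raise ValueError("identifier values must be unique")
    return ids


# Placeholder rows, invented for illustration.
rows = [{"cell": "cell-a"}, {"cell": "cell-b"}, {"cell": "cell-c"}]
print(assign_ids(rows))
print(assign_ids(rows, "cell"))
```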
Finally, we can import the data by clicking "Import," as shown in Figure 19.
After clicking, we first see a message that the layer is being imported and, shortly afterwards, a message that it has been loaded.
The next image shows the layer 100% loaded.
Once the layer is loaded, we can select it, as shown by arrow 1 in Figure 22.
By pressing the F6 key, we can open the attribute table of the selected layer, as shown in Figure 23.