RISE to Bloome Software
Log In    
r2bsoftware.se r2bsoftware.se
Click to hide navigation tree

JSON Web Service Harvester

The JSON Web Service Harvester may be used with any web service that returns strict JSON (JavaScript Object Notation). Some web services return JavaScript rather than JSON, even though they claim to return JSON, e.g. var variable = {JSON} or function({JSON}); In these cases, the harvester will make an attempt to extract the JSON object.

Configuring the JSON web service harvester

JSON web service harvester properties
Image 1, JSON Web Service Harvester properties

Name The name of the harvester in use by the selected Query-node. To change harvester, select the name property and click on the ellipsis button. Select the appropriate harvester in the form that is displayed.
Accept The accepted content type of the response. The default value for the JSON Web service harvester is application/json.
Accept-Charset The accepted charset of the response. The default is utf-8.
Content-Type The content type of the request. This is only applicable if Method=POST. The default is application/x-www-form-urlencoded.
Custom Headers You can add any number of custom http headers to your request, e.g. Accept-Language.
Data Element If you want to retrieve data from the root JSON object, "Data Element" should be left empty. Otherwize "Data Element" should contain the path to the JSON object or array of JSON objects you want to retrieve data from, e.g. myObject/myArrayOfObjects . The Column Name field of the Query-node leaves should contain the name of the JSON object member to retrieve data from.
Method Request method. Possible values are POST and GET. The default value is POST.
Query string Method = GET: The query string is appended to the Url like Url?Query string. The arguments added to query string should be url-encoded.
Method = POST: The content of the query string is sent in the request body. If content type is application/json, the query string should contain JSON-style arguments, e.g. "arg1":"val1", "arg2":"val2". Otherwise the query string should be url-encoded.
Url The url to the service from which you want to request information.
User authentication
Password Supply a password if the service requires authentication. Leave empty for anonymous authentication.
User ID Supply a user id if the service requires authentication. Leave empty for anonymous authentication.

Adding relations

In the model displayed in image 1 we want to list all municipalities of Sweden, and for each municipality list the job advertisments publiched Swedish employment services authotory. The model actually uses two different harvesters to accomplish this. The list of municipalities, including attributes such as area, population etc. is harvested from a Wikipedia article using the XML Web Service Harvester.  For each municipality discovered in the Wikipedia article, we call the JSON enabled web service, published by the Swedish employment services authoroty, to harvest the advertisments. 

In order to accomplish this we need to describe how the information in our model is related. For HTTP-harvesters, there are two ways to do this. Which method to use depends mainly of the web service being called. You can use any of the methods, or a combination of them.

1. Parent relation

Every query node (except for the root node), independent of which harvester it uses, has a parent relation property in the XML section. 

Add parent relations
Image 2, the parent relation property

To add or edit a parent relation, select the property and click on the ellipsis button.

Add relation
Image 3, managing the parent relation

All leaves of the selected node, having Column Name specified in the Source section, are listed in the Column combo box, and all parent leaves, having Column Name specified in the Source section, are listed in the Parent column combo box. In the first column you can either select any of the listed options, or you can type your own value. In the second column you must select one of the avaliable parent column options. You can add any number of relations.

If request method = GET, the relations will be added to the query string as ColumnName=Parent column value&...
If request method = POST, and content type = application/x-www-form-urlencoded, the relation is sent in the request content according to ColumnName=Parent column value&...
If request method = POST, and content type = application/json, the relation is sent in the request content as Json, e.g {ColumnName=Parent column value, ... }

Where ColumnName is the name of the column you selected, or enetered into, the first column, and Parent column value is the value returned by the parent harvester for the selected column.

2. Tagging the url

Using any of the http-harvesters you may tag the web service url, as well as the query string, using the following syntax; <parent.ColumnName>, where ColumnName is the value of the Column Name property for the parent column/leaf. At runtime, the tag will be substituted with the harvested value of the specified parent column.

The root node does not have a parent, hence you cannot substitute any tags with parent column values for havesters applied to the root node. However, when executing the model, you can supply a filter, that is passed to the harvester of the root node. Http harvesters may make use of this filter by adding the tag <param.filter> to the Url property or the Query string property. 

When you add a http-harvester to the root node, you may notice that the default value for the Query string property is <param.filter>, i.e. by default the harvester uses the supplied filter as query string. Make sure that the filter has the appropriate formatting and encoding depending on the request method and content type used, see the description of the Query string property in the table above. 

Please see the XML Web Service Harvester article for an example of how tag substitution is performed.

Sample JSON Response

When the model is executed, the web service is executed for each municipality like this:

  • Notice how the relation is expanded into the query string. 
  • Also notice that the web service can return both XML and JSON, and that what is returned is determined by the Accept http header. This means that if you follow the links in your web browser, you might get an XML document in response, depending on the Accept http header sent by your browser.

Calling the web service http://api.arbetsformedlingen.se/platsannons/matchning?kommunid=1440, will result in a response similar to the one in image 4. The result may however be different since the advertisments are likely to have been changed since this article was written.  

Sample JSON response
Image 4, Sample JSON Response.

In the Data Element property, image 1, we have entered a JSON path to select all the advertisments from the JSON response, matchningslista/matchningdata.

For each member that we want to export, we create a leaf. As Column Name, in the Source section, we enter the name of the JSON object member that we want to store. Image 5 displays how the Id leaf node harvests its data from the annonsid member of the returned JSON object.

Leaf source properties
Image 5, Leaf source properties