Search strategies
Specify your query terms in the `q` query string parameter.
You can run a search against a single API endpoint or use the `/search` endpoint for a wide search.
GET https://data.tepapa.govt.nz/collection/search/?q=kiwi&size=1
{
"results": [
{
"id": "1514",
"type": "Category",
"title": "kahu kiwi",
"creditLine": "Matauranga Māori Thesaurus",
"prefLabel": "kahu kiwi",
"scopeNote": "Kiwi feather cloak",
"pid": "tepapa:collection/category/1514",
"iri": "http://tepapa.govt.nz/collection/category/1514",
"href": "https://data.tepapa.govt.nz/collection/category/1514",
"_meta": {
"created": "2011-10-31T05:08:15.000+0000",
"modified": "2012-05-24T04:29:49.000+0000",
"qualityScore": 1
},
"_api": {
"score": 18.588493
}
}
],
"_metadata": {
"resultset": {
"count": 1999,
"from": 0,
"size": 1
}
}
}
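A minimal sketch of building this kind of GET URL in Python. The helper name `build_search_url` is hypothetical, and the sketch only constructs the URL; sending the request (and any API key or headers the service may require) is left out.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://data.tepapa.govt.nz/collection"


def build_search_url(query, endpoint="search", size=None):
    """Return a GET URL that searches `endpoint` for `query` (hypothetical helper)."""
    params = {"q": query}
    if size is not None:
        params["size"] = size
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{BASE_URL}/{endpoint}?{urlencode(params, quote_via=quote)}"


print(build_search_url("kiwi", size=1))
print(build_search_url("batch AND crib", endpoint="object"))
```

Percent-encoding matters here because query terms often contain spaces, quotes, colons, or macrons.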
Query type | Example |
---|---|
Keyword | `q=kiwi` |
Phrase | `q="new plymouth"` |
Any word | `q=batch crib` |
All words | `q=batch AND crib` |
Wildcards | `q=aptery*` |
Range | `q=id:[400 TO 500]`<br>`q=_meta.created:>2016-03` |
Field search | `q=title:crib`<br>`q=prefLabel:"new plymouth"`<br>`q=title:(john AND smith)` |
Limit by field search | `q=kiwi AND collection:TaongaMāori` |
For more details on searching, see Collections Online's search tips.
For the full syntax available, see the Elastic query string syntax guide.
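The query types in the table can be composed from a few small string helpers. This is only an illustrative sketch: the function names are hypothetical, and no escaping of reserved characters inside values is attempted.

```python
def phrase(text):
    """Wrap text in double quotes for an exact-phrase match."""
    return f'"{text}"'


def field(name, value):
    """Restrict a term (or phrase, or group) to one field."""
    return f"{name}:{value}"


def value_range(name, low, high):
    """Inclusive range on a field, e.g. id:[400 TO 500]."""
    return f"{name}:[{low} TO {high}]"


def all_of(*terms):
    """Require every term by joining them with AND."""
    return " AND ".join(terms)


print(field("prefLabel", phrase("new plymouth")))
print(value_range("id", 400, 500))
print(field("title", "(" + all_of("john", "smith") + ")"))
```

The resulting strings still need URL-encoding before they are placed in the `q` parameter.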
Simple filtering is supported within the basic search query by using the boolean `AND` operator to add your filter:
/object?q=kiwi AND collection:TaongaMāori
/search?q=kiwi AND (type:Place OR type:Organisation)
However, this is only possible with a few root-level fields such as `type`, `collection`, and `species`.
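As a rough sketch, simple filters like the two examples above could be appended to a basic query programmatically. The helper `add_simple_filters` is hypothetical; it assumes each filter is a root-level field and that alternative values for one field should be OR-ed together.

```python
def add_simple_filters(query, filters):
    """AND root-level field filters onto a basic query string.

    `filters` maps a field name to one value or a list of alternatives.
    """
    clauses = [query]
    for name, values in filters.items():
        if isinstance(values, str):
            values = [values]
        terms = [f"{name}:{value}" for value in values]
        if len(terms) == 1:
            clauses.append(terms[0])
        else:
            # Several values for one field become a parenthesised OR group.
            clauses.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)


print(add_simple_filters("kiwi", {"collection": "TaongaMāori"}))
print(add_simple_filters("kiwi", {"type": ["Place", "Organisation"]}))
```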
The Collections API accepts more complex query statements in the body of a POST request. See more details on making a POST search request containing a query in the ApiSearchRequest query data format.
This allows you to filter on facetable fields and nested fields (which isn't possible in simple filtering with the `AND` boolean operator). See below for details on facetable fields.
POST https://data.tepapa.govt.nz/collection/search
Content-Type: application/json
{
"query" : "kiwi",
"filters": [
{
"field": "production.facetCreatedDate.decadeOfCentury",
"keyword": "1970s"
}
]
}
Response for the filtered query:
{
"results": [
{
"id": 376710,
"type": "Object",
"title": "LP Record \"Haka and Poi: Maori Concert Parties of Queen Victoria and St. Stephen's Schools\"",
"production": [
{
"title": "Queen Victoria and St. Stephen's Schools; musician; 1974",
"createdDate": "1974-01-01",
"verbatimCreatedDate": "1974",
"facetCreatedDate": {
"century": "20th century",
"dayOfWeek": "Tuesday",
"decadeOfCentury": "1970s",
"era": "Common Era (CE)",
"monthOfYear": "January",
"temporal": "1974-01-01",
"verbatim": "01 Jan 1974 / 31 Dec 1974",
"year": "1974"
}
}
]
},
...
],
"_metadata": { ... }
}
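The filtered POST request above could be assembled in Python along these lines. This is a sketch, not a definitive client: `build_filtered_search` is a hypothetical helper, the body uses only the `query` and `filters` keys shown in the example, and any authentication the API may require is not included. The request is built but not sent.

```python
import json
from urllib.request import Request

SEARCH_URL = "https://data.tepapa.govt.nz/collection/search"


def build_filtered_search(query, filters):
    """Build (but do not send) a POST search request with keyword filters."""
    body = {
        "query": query,
        "filters": [
            {"field": name, "keyword": keyword} for name, keyword in filters.items()
        ],
    }
    return Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_filtered_search(
    "kiwi", {"production.facetCreatedDate.decadeOfCentury": "1970s"}
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it.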
Faceting dynamically groups the data in a field into categories (or "terms"). It summarises all the values that exist in a field to show the most common values. Often this is used as a first step, to show the possible entry points for 'drilling down' further into the query results; one of those values is then selected to filter the query by.
You can see faceting in action in a Collections Online website search. The search results page automatically includes facets for the type and collection fields, showing the common values found and how many there are of each.
Type:
Object (732)
Specimen (1225)
Topic (56)
Publication (9)
In addition to performing basic queries, the advanced search interface allows you to perform a faceted search. The faceting implementation utilises Elasticsearch Term aggregations under the hood.
Specify the faceted fields, along with the number of results you want to receive for each facet, in the facets parameter of the search request:
POST https://data.tepapa.govt.nz/collection/search
Content-Type: application/json
{
"query" : "James Cook",
"facets": [ {
"field": "production.facetCreatedDate.decadeOfCentury",
"size": 3
}, {
"field": "production.spatial.title",
"size": 3
} ]
}
The response contains the top 3 values for the requested facets, along with the number of matching documents, in the `facets` field.
NB: Facet values are returned in alphabetical order, not by count numbers.
{
"results": [ ... ],
"facets": {
"production.facetCreatedDate.decadeOfCentury": {
"1940s": 11751,
"1960s": 12736,
"1970s": 11751
},
"production.spatial.title.verbatim": {
"New Zealand": 71945,
"North Island (New Zealand)": 5171,
"United Kingdom": 6031
}
},
"_metadata": { ... }
}
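Because facet values come back in alphabetical order, a client that wants the most common value first has to re-sort them. A minimal sketch, using the sample counts above (the helper name `facet_values_by_count` is hypothetical):

```python
def facet_values_by_count(facets, field):
    """Return (value, count) pairs for one facet, largest count first.

    Ties are broken alphabetically so the order is deterministic.
    """
    counts = facets[field]
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample_facets = {
    "production.facetCreatedDate.decadeOfCentury": {
        "1940s": 11751,
        "1960s": 12736,
        "1970s": 11751,
    }
}

for value, count in facet_values_by_count(
    sample_facets, "production.facetCreatedDate.decadeOfCentury"
):
    print(value, count)
```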
Only some fields are facetable. Faceting only makes sense on fields that have a small finite number of distinct values. If you request a facet on an unfacetable field, for example a long text field, an error is returned instead:
{
"status": 422,
"developerMessage": "An exception occurred: Field at 'productionUsedTechnique.scopeNote' is not facetable - type is not facetable: text",
[..]
}
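A client may want to detect this kind of error payload before reading `results` or `facets`. The sketch below assumes only the two keys visible in the example (`status` and `developerMessage`); the rest of the error body is elided in the source, and `api_error_message` is a hypothetical helper.

```python
def api_error_message(payload):
    """Return a readable error string, or None if the payload is not an error."""
    status = payload.get("status")
    if isinstance(status, int) and status >= 400:
        message = payload.get("developerMessage", "no developerMessage supplied")
        return f"{status}: {message}"
    return None


error = {
    "status": 422,
    "developerMessage": "An exception occurred: Field at "
    "'productionUsedTechnique.scopeNote' is not facetable - "
    "type is not facetable: text",
}
print(api_error_message(error))
print(api_error_message({"results": []}))
```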
The name of the facet returned may not exactly match the name you requested.
If a requested field is not facetable then the API may select a suitable sub-field to return.
In the example above, the faceting request for `production.spatial.title` returned a facet for `production.spatial.title.verbatim` due to the internal field mappings that are used.
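Since the returned facet name can differ from the requested one, a lookup by exact key may fail. One possible workaround, sketched here under the assumption that the substituted name is always the requested name plus a dotted suffix (as in the example), is to fall back to a prefix match. `find_facet` is a hypothetical helper.

```python
def find_facet(facets, requested_field):
    """Return (returned_name, counts) for a requested facet, or None.

    Tries the exact name first, then any key that extends it with a
    sub-field, e.g. "production.spatial.title.verbatim".
    """
    if requested_field in facets:
        return requested_field, facets[requested_field]
    prefix = requested_field + "."
    for name, counts in facets.items():
        if name.startswith(prefix):
            return name, counts
    return None


facets = {
    "production.spatial.title.verbatim": {
        "New Zealand": 71945,
        "North Island (New Zealand)": 5171,
        "United Kingdom": 6031,
    }
}
print(find_facet(facets, "production.spatial.title"))
```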
To retrieve the records that match your chosen facet value, run a new POST query containing a `filter`.
See Filtering above for an example.
Dates can be tricky for us to record in our catalogue - the dates may be unknown, partially known, fuzzy (e.g. only contain a textual description), or contain mistakes recorded by our curators over 100 years ago.
We have tried to standardise dates as much as possible in the API, using a range of date fields:
Date type | Description | Example field | Example values |
---|---|---|---|
Verbatim date | Original text as recorded | `verbatimBirthDate` | 11 June 1865 |
Encoded date | Converted to ISO 8601 (YYYY-MM-DD) | `birthDate` | 1865-06-11 |
Faceted date | Each part separated out | `facetBirthDate` | year:1865 monthOfYear:June decadeOfCentury:1860s |
Nested date | Reduced details (no faceted date) | `production.contributor.verbatimBirthDate`<br>`production.contributor.birthDate` | |
Verbatim dates are the most accurate. They are human-readable and suitable for display:
14 Aug 1940
March 2004
circa 2011
c 1940
active 1920
1870-1872
circa 132-137 million years ago
Their equivalent encoded dates follow the ISO 8601 date standard, though they may only have precision to a month or year. Note that these are auto-generated, so they are less accurate than the equivalent verbatim date:
1940-08-14
2004-03
2011
1940
1920
1870
132
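Because encoded dates vary in precision, client code cannot assume a full `YYYY-MM-DD` value. The following sketch splits an encoded date into whichever parts are present. It assumes values contain only digits and hyphens, as in the examples above, and does not handle signed years; `parse_encoded_date` is a hypothetical helper.

```python
def parse_encoded_date(value):
    """Split an encoded date into the parts it actually contains.

    "1940-08-14" has year, month and day; "2004-03" has year and month;
    "2011" has only a year.
    """
    parts = [int(part) for part in value.split("-")]
    return dict(zip(("year", "month", "day"), parts))


for encoded in ["1940-08-14", "2004-03", "2011", "132"]:
    print(encoded, parse_encoded_date(encoded))
```

The last value shows why the verbatim date should be preferred for display: the encoded `132` loses the meaning of "circa 132-137 million years ago".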
In version 1, only a few root-level dates are searchable in basic queries:
verbatimBirthDate
birthDate
verbatimDeathDate
deathDate
publicationDate
GET https://data.tepapa.govt.nz/collection/search?q=verbatimBirthDate:((Aug OR August) AND 1940)
GET https://data.tepapa.govt.nz/collection/search?q=birthDate:"1940-08"
{
"results": [
{
"id": 2533,
"type": "Person",
"title": "Dr Alan Baker",
"birthPlace": "Inglewood",
"verbatimBirthDate": "14 Aug 1940",
"birthDate": "1940-08-14",
"facetBirthDate": {
"century": "20th century",
"dayOfWeek": "Wednesday",
"decadeOfCentury": "1940s",
"era": "Common Era (CE)",
"monthOfYear": "August",
"temporal": "1940-08-14",
"verbatim": "14 Aug 1940",
"year": "1940"
}
...
}
]
}
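Verbatim dates are free text, so a search usually has to allow for several spellings of a month. A small sketch of building the first query above and encoding it for a URL (both helper names are hypothetical, and the request itself is not sent):

```python
from urllib.parse import quote

SEARCH_URL = "https://data.tepapa.govt.nz/collection/search"


def verbatim_month_query(field, month_spellings, year):
    """Match any of the month spellings together with the year."""
    months = " OR ".join(month_spellings)
    return f"{field}:(({months}) AND {year})"


def search_url(query):
    """Percent-encode the whole query so spaces, quotes and colons are safe."""
    return f"{SEARCH_URL}?q={quote(query, safe='')}"


query = verbatim_month_query("verbatimBirthDate", ["Aug", "August"], 1940)
print(query)
print(search_url(query))
print(search_url('birthDate:"1940-08"'))
```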
To search other date fields, you will need to search the faceted date fields in an advanced query (see the next section).
A faceted date field contains sub-fields describing aspects of the date, such as century, dayOfWeek, etc.
Sometimes these values contain `Unknown` or are just approximations.
Usually there are also verbatim and encoded date fields.
Faceted date fields are still experimental:
We have included the faceted date fields in the hope of assisting you in data analysis or approximate categorisation.
However, you should not rely on the faceted date fields to represent the "truth" about a date - display the equivalent verbatim date field instead.
Note that the internal `verbatim` facet field is an auto-generated value; use the named field instead, e.g. `verbatimCreatedDate`.
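In code, this advice amounts to preferring the named verbatim field when choosing what to display. A minimal sketch, assuming the `verbatim<Name>` / `<name>` naming pattern seen in fields such as `verbatimBirthDate` and `birthDate` (the helper `display_date` is hypothetical):

```python
def display_date(record, name):
    """Return the best human-readable value for a date such as "BirthDate".

    Prefers the verbatim field and never reads the faceted date field.
    """
    verbatim = record.get("verbatim" + name)
    if verbatim:
        return verbatim
    # Fall back to the encoded ISO 8601 value, e.g. "birthDate".
    return record.get(name[0].lower() + name[1:])


person = {
    "verbatimBirthDate": "14 Aug 1940",
    "birthDate": "1940-08-14",
    "facetBirthDate": {"verbatim": "14 Aug 1940", "year": "1940"},
}
print(display_date(person, "BirthDate"))
```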
An example of a `production.facetCreatedDate` field:
"createdDate": "1906-01-01",
"facetCreatedDate": {
"century": "20th century",
"dayOfWeek": "Monday",
"decadeOfCentury": "1900s",
"era": "Common Era (CE)",
"monthOfYear": "January",
"temporal": "1906-01-01",
"verbatim": "01 Jan 1906 / 31 Dec 1906",
"year": "1906"
},
"verbatimCreatedDate": "1906"
Here the original value from our catalogue is `verbatimCreatedDate: 1906`.
The `createdDate` is an ISO 8601 date approximation of that value (which may be a poor approximation if there is not enough precision in the original data).
The `facetCreatedDate` contains faceted values for that date, and date range approximations:
- `century` - The century that the original date falls into, here `20th century`
- `dayOfWeek` - The day of the week. Use with caution as we currently base this on our `temporal` field, which may not have the correct precision level. In this case a more accurate label for day of week would have been `Unknown`, as the original verbatim date was just a year, not a particular day
- `decadeOfCentury` - The decade of the century, e.g. `1900s`
- `era` - The era, either `Common Era (CE)` (a.k.a. 'AD') or `Before Common Era (BCE)` (a.k.a. 'BC')
- `monthOfYear` - The month of the year. Use with caution as we currently base this on our `temporal` field (see more details in `dayOfWeek` above)
- `temporal` - Usually equal to the "parsed" version of a date, e.g. `birthDate`. Use with caution
- `verbatim` - This is different to the main verbatim date as it is auto-generated by combining two date fields that represent a date range: the "earliest" and "latest" date. In our example, the production date was determined to be between `01 Jan 1906` and `31 Dec 1906`
- `year` - Approximation of the year.
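One way to judge whether `dayOfWeek` and `monthOfYear` can be trusted is to look at the range in the facet's `verbatim` sub-field. The sketch below is a rough heuristic, not part of the API: it assumes the `DD Mon YYYY` or `DD Mon YYYY / DD Mon YYYY` layout seen in the examples, relies on English month abbreviations (the default C locale), and raises `ValueError` on anything else. Both helper names are hypothetical.

```python
from datetime import datetime


def facet_verbatim_range(verbatim):
    """Parse a facet `verbatim` value into (earliest, latest) dates."""
    parts = [part.strip() for part in verbatim.split("/")]
    dates = [datetime.strptime(part, "%d %b %Y").date() for part in parts]
    return dates[0], dates[-1]


def covers_single_day(verbatim):
    """True when earliest and latest coincide, so day-level facets are meaningful."""
    earliest, latest = facet_verbatim_range(verbatim)
    return earliest == latest


print(facet_verbatim_range("01 Jan 1906 / 31 Dec 1906"))
print(covers_single_day("01 Jan 1906 / 31 Dec 1906"))
print(covers_single_day("14 Aug 1940"))
```

For the 1906 example the range spans the whole year, which suggests that `dayOfWeek: Monday` and `monthOfYear: January` should be ignored.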
Through the advanced search interface you can ask for all available facets on a date (the final `size: 0` below says: don't show results, just the facets):
POST https://data.tepapa.govt.nz/collection/search
{
"query" : "poster",
"facets": [ {
"field": "production.facetCreatedDate.decadeOfCentury",
"size": 5
} ],
"size": 0
}
This returns a list of the faceted sub-fields of that date in the result set:
"facets": {
"production.facetCreatedDate.decadeOfCentury": {
"2010s": 270,
"1910s": 295,
"1940s": 693,
"1950s": 156,
"1980s": 368
}
},
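A facets-only request body like the one above could be generated as follows. This is a sketch using only the keys shown in the example (`query`, `facets`, `size`); the helper name `facets_only_body` is hypothetical.

```python
import json


def facets_only_body(query, facet_fields, values_per_facet=5):
    """Build a search body that returns facet counts but no result records."""
    return {
        "query": query,
        "facets": [
            {"field": name, "size": values_per_facet} for name in facet_fields
        ],
        # A top-level size of 0 suppresses the results list.
        "size": 0,
    }


body = facets_only_body("poster", ["production.facetCreatedDate.decadeOfCentury"])
print(json.dumps(body, indent=2))
```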
To search by a particular date facet value, add it as a filter to your query in the advanced search interface.
POST https://data.tepapa.govt.nz/collection/search
Content-Type: application/json
{
"query" : "poster",
"filters": [
{
"field": "production.facetCreatedDate.decadeOfCentury",
"keyword": "1940s"
}
]
}
See above for more details on Faceting.
When we include an entity nested inside another entity, we only include a few fields as a summary, so we have excluded faceted date fields.
To see the full record, retrieve the entity directly using the `href` field.
GET https://data.tepapa.govt.nz/collection/search/?q=lomu
"refersTo": [
{
"id": 44748,
"type": "Person",
"title": "Jonah Lomu",
"verbatimBirthDate": "12 May 1975",
"birthDate": "1975-05-12",
"verbatimDeathDate": "18 Nov 2015",
"pid": "tepapa:collection/agent/44748",
"iri": "http://tepapa.govt.nz/collection/agent/44748",
"href": "https://data.tepapa.govt.nz/collection/agent/44748"
}
]
GET https://data.tepapa.govt.nz/collection/agent/44748
{
"id": 44748,
"type": "Person",
"title": "Jonah Lomu",
"birthPlace": "Auckland",
"verbatimBirthDate": "12 May 1975",
"birthDate": "1975-05-12",
"facetBirthDate": {
"century": "20th century",
"dayOfWeek": "Monday",
"decadeOfCentury": "1970s",
"era": "Common Era (CE)",
"monthOfYear": "May",
"temporal": "1975-05-12",
"verbatim": "12 May 1975",
"year": "1975"
},
"deathPlace": "Auckland",
"verbatimDeathDate": "18 Nov 2015",
"deathDate": "2015-11-18",
"facetDeathDate": {
"century": "21st century",
"dayOfWeek": "Wednesday",
"decadeOfCentury": "2010s",
"era": "Common Era (CE)",
"monthOfYear": "November",
"temporal": "2015-11-18",
"verbatim": "18 Nov 2015",
"year": "2015"
},
"ethnicity": [
"Tongan"
],
"nationality": [
"New Zealander"
],
"familyName": "Lomu",
"givenName": "Jonah",
"gender": "Male",
"pid": "tepapa:collection/agent/44748",
"iri": "http://tepapa.govt.nz/collection/agent/44748",
"href": "https://data.tepapa.govt.nz/collection/agent/44748",
"_meta": {
"created": "2011-08-31T06:00:49Z",
"modified": "2018-02-27T20:54:56Z",
"qualityScore": 1.9
}
}
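The two responses above suggest a simple pattern: collect the `href` of each nested summary, then fetch those URLs to get the full records. The sketch below covers the first step and compares the fields of a summary with a full record. The sample dictionaries are trimmed copies of the responses above, the helper names are hypothetical, and no network request is made.

```python
def nested_hrefs(record, key="refersTo"):
    """Return the href of every nested entity summary under `key`."""
    return [entity["href"] for entity in record.get(key, []) if "href" in entity]


def fields_only_in_full(summary, full):
    """List the fields that the full record has but the nested summary lacks."""
    return sorted(set(full) - set(summary))


summary = {
    "id": 44748,
    "title": "Jonah Lomu",
    "verbatimBirthDate": "12 May 1975",
    "birthDate": "1975-05-12",
    "href": "https://data.tepapa.govt.nz/collection/agent/44748",
}
# Trimmed copy of the full record; the real response has more fields.
full = dict(summary, birthPlace="Auckland", facetBirthDate={"year": "1975"})

print(nested_hrefs({"refersTo": [summary]}))
print(fields_only_in_full(summary, full))
```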