Search API: Filters walkthrough
Filters are a powerful tool to narrow down your search results. We support filters that are generalized to every document, as well as filters that are specific to different datasources.
This guide will focus on our general filters and how to utilize them.
General Filters: How to use
To filter search results to documents that match certain fields via the /search REST API, construct a list of facetFilter objects and pass it as facetFilters in the requestOptions object.
Each facetFilter object has the following relevant fields:
- fieldName - the name of the field we are filtering by (e.g. “from” to facet by user, “type” for document type, etc.). fieldName should be unique in the list of facetFilter objects.
- values - a list of facetFilterValue objects. All values are OR’d between the same field name (we AND between different field names).
A facetFilterValue object has the following relevant fields:
- value - string value that results are being filtered to.
- relationType - can take on the values below:
  - “LT” - Less than.
  - “GT” - Greater than.
  - “EQUALS” - default value.
  “LT” and “GT” can only be used for time filters (see examples below). Every other filter should use “EQUALS”.
- isNegated - not supported, don’t use.
Basic Example
To restrict search results to documents of type pdf, send the following in the facetFilters field. This is the equivalent of adding “type:pdf” to your search query.
[
  {
    "fieldName": "type",
    "values": [
      {
        "relationType": "EQUALS",
        "value": "pdf"
      }
    ]
  }
]
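If you build requests in code, a small helper keeps the facetFilter shape consistent. The following is a minimal Python sketch; the helper name facet_filter is hypothetical and not part of the API, and it only produces the JSON structure described above.

```python
import json


def facet_filter(field_name, values, relation_type="EQUALS"):
    """Build one facetFilter object; every value shares the same relationType."""
    return {
        "fieldName": field_name,
        "values": [
            {"relationType": relation_type, "value": value} for value in values
        ],
    }


# Equivalent of adding "type:pdf" to the search query.
facet_filters = [facet_filter("type", ["pdf"])]
print(json.dumps(facet_filters, indent=2))
```

Passing several values, for example facet_filter("type", ["pdf", "document"]), yields a single facetFilter whose values are OR’d, as described above.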
Universal Field Names
Topbar facet field names
last_updated_at:
from:
my:history
collection:
has:golink
type:
Entity field names
businessunit:
city:
country:
industry:
location:
region:
roletype:
startafter:
startbefore:
state:
title:
reportsto:
Exceptions to the basic example
Time filters
Time filters are the only exception to the rule that every filter uses “EQUALS”. The fieldName is always “last_updated_at”, and we use different relationTypes to specify different time ranges.
We support 2 types of values: specific dates and special values.
Specific dates
Use the “GT” and “LT” relationTypes to specify a date range. The range can also be open-ended (include only a GT or only an LT). Each date value should be a string in the form YYYY-MM-DD. Note that GT and LT are exclusive (e.g. {relationType: "GT", value: "2023-06-17"} includes dates from 2023-06-18 and later).
Each date covers the whole day: it begins at the start of the day (12:00 am) and ends at the end of the day (11:59:59 pm).
Closed date range example for filtering to documents from dates 6/16, 6/17, 6/18, 6/19:
[
  {
    "fieldName": "last_updated_at",
    "values": [
      {
        "relationType": "GT",
        "value": "2023-06-15"
      },
      {
        "relationType": "LT",
        "value": "2023-06-20"
      }
    ]
  }
]
Open date range example for filtering to documents from dates 6/11 onwards:
[
  {
    "fieldName": "last_updated_at",
    "values": [
      {
        "relationType": "GT",
        "value": "2023-06-10"
      }
    ]
  }
]
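Because GT and LT are exclusive, it is easy to be off by one day when you think in inclusive dates. The sketch below, in Python, converts an inclusive first and last day into the exclusive bounds the API expects. The function name date_range_filter and its parameters are hypothetical; only the output shape comes from this guide.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def date_range_filter(first_day=None, last_day=None):
    """Build a last_updated_at filter from inclusive bounds.

    first_day and last_day are the first and last dates that should match.
    Either may be omitted for an open-ended range.
    """
    values = []
    if first_day is not None:
        # GT is exclusive, so step back one day to include first_day.
        values.append(
            {"relationType": "GT", "value": (first_day - ONE_DAY).isoformat()}
        )
    if last_day is not None:
        # LT is exclusive, so step forward one day to include last_day.
        values.append(
            {"relationType": "LT", "value": (last_day + ONE_DAY).isoformat()}
        )
    if not values:
        raise ValueError("at least one bound is required")
    return {"fieldName": "last_updated_at", "values": values}


closed = date_range_filter(date(2023, 6, 16), date(2023, 6, 19))
open_ended = date_range_filter(first_day=date(2023, 6, 11))
```

Here closed reproduces the 6/16 to 6/19 example and open_ended reproduces the "6/11 onwards" example.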
Special Values
For special values, the relation type EQUALS accepts past_day, past_week, past_month, yesterday, today, past_n_days, past_n_weeks, past_n_months, and past_n_years, where n is a number (e.g. 5 in past_5_days). For all past_ prefixed values, we also support the last_ prefix; they mean the same thing (e.g. last_week is a viable substitute for past_week).
The relation type LT accepts the values past_day, past_week, past_month, yesterday, and today.
The relation type GT accepts the value yesterday.
If you pass in an invalid special value, you will get a 422 error letting you know you have an invalid operator. Invalid special values for time filters are the only case in which a 422 error is returned.
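To catch an invalid special value before the API rejects it with a 422, you can check it client-side. The Python sketch below mirrors the lists above literally. It rests on several assumptions that this guide does not confirm: that the last_ prefix is accepted for LT as well as EQUALS, that n is written as plain digits, and that the unit after n is always plural. The function name is hypothetical.

```python
import re

NAMED_VALUES = {"yesterday", "today"}
# past_day, past_week, past_month and their last_ equivalents.
PAST_UNIT = re.compile(r"(?:past|last)_(?:day|week|month)")
# past_n_days, past_n_weeks, past_n_months, past_n_years (assumed digit form).
PAST_N_UNITS = re.compile(r"(?:past|last)_[0-9]+_(?:days|weeks|months|years)")


def is_valid_special_value(relation_type, value):
    """Return True if value is listed as allowed for relation_type."""
    is_past_unit = PAST_UNIT.fullmatch(value) is not None
    is_past_n = PAST_N_UNITS.fullmatch(value) is not None
    if relation_type == "EQUALS":
        return value in NAMED_VALUES or is_past_unit or is_past_n
    if relation_type == "LT":
        return value in NAMED_VALUES or is_past_unit
    if relation_type == "GT":
        return value == "yesterday"
    return False
```

For example, is_valid_special_value("EQUALS", "past_5_days") is accepted, while is_valid_special_value("GT", "today") is not.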
If you are used to using operators and values in the query string, here are some examples of how a query string value translates to a REST API value.
Sample:
updated:today becomes
[{"fieldName": "last_updated_at", "values": [{"relationType": "EQUALS", "value": "today"}]}]
before:past_week becomes
[{"fieldName": "last_updated_at", "values": [{"relationType": "LT", "value": "past_week"}]}]
after:yesterday becomes
[{"fieldName": "last_updated_at", "values": [{"relationType": "GT", "value": "yesterday"}]}]
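The three translations above follow one pattern: the operator selects the relationType, and the text after the colon becomes the value. A minimal Python sketch of that mapping follows. The function name is hypothetical, and it handles only the updated:, before:, and after: operators shown in the samples.

```python
OPERATOR_TO_RELATION = {"updated": "EQUALS", "before": "LT", "after": "GT"}


def translate_time_operator(term):
    """Translate a term such as 'before:past_week' into a facetFilters list."""
    operator, separator, value = term.partition(":")
    if not separator or not value or operator not in OPERATOR_TO_RELATION:
        raise ValueError(f"not a supported time operator term: {term!r}")
    return [
        {
            "fieldName": "last_updated_at",
            "values": [
                {"relationType": OPERATOR_TO_RELATION[operator], "value": value}
            ],
        }
    ]


filters = translate_time_operator("before:past_week")
```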
Timezone considerations
We factor in the user’s timezone for all queries except when you use the value past_week, past_year, past_month, or past_day with the relationType EQUALS. These values are handled by the built-in Elastic date aggregation, which doesn’t account for timezone.
History filter
The my: facet will only ever have the value “history”. It filters to show only documents the user has viewed before. The object always looks like this:
{"fieldName": "suggested", "groupName": "", "values": [{"relationType": "EQUALS", "value": "my history"}]}
From filter (or any user filter):
To target a specific user, for example when two people share the name “User One”, use the email address they authenticated with Glean as the value. For example, user@glean.com and userone@glean.com would facet by different “User One”s even though they have the same name.
Sample query:
from:"User one" updated:today type:document
requestOptions.facetFilters:
[
  {
    "fieldName": "from",
    "values": [
      {
        "relationType": "EQUALS",
        "value": "userone@glean.com"
      }
    ]
  },
  {
    "fieldName": "last_updated_at",
    "values": [
      {
        "relationType": "EQUALS",
        "value": "today"
      }
    ]
  },
  {
    "fieldName": "type",
    "values": [
      {
        "relationType": "EQUALS",
        "value": "document"
      }
    ]
  }
]
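When filters come from several places (UI selections, a parsed query, defaults), the same field can appear more than once, while fieldName should be unique in the list. The Python sketch below groups (field, value) pairs so that each fieldName appears once and its values are collected together. The function name is hypothetical, and it covers only EQUALS filters.

```python
def build_facet_filters(pairs):
    """Group (fieldName, value) pairs into a facetFilters list.

    Values for the same field end up in one facetFilter (OR'd);
    different fields become separate facetFilter objects (AND'd).
    """
    grouped = {}
    for field_name, value in pairs:
        grouped.setdefault(field_name, []).append(
            {"relationType": "EQUALS", "value": value}
        )
    # dict preserves insertion order, so fields keep first-seen order.
    return [
        {"fieldName": name, "values": values} for name, values in grouped.items()
    ]


facet_filters = build_facet_filters(
    [
        ("from", "userone@glean.com"),
        ("last_updated_at", "today"),
        ("type", "document"),
    ]
)
```

The result matches the requestOptions.facetFilters sample above. Adding a second ("type", "pdf") pair would extend the existing type filter rather than create a duplicate fieldName.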
Datasource-specific filters
Apart from the general filters we’ve discussed, some filters are specific to one datasource. For example, Confluence has the author facet, and Slack has the channel facet. There are also custom facets that are defined for custom datasources pushed via the API.
To uncover these datasource-specific facets, you can use our Glean UI to filter by the datasource you’re curious about. You will be able to find a list of facets for your search results, and facet values on the sidebar of the Glean UI (see image below).
Getting possible facets via Search API
To fetch the facets programmatically (for example with curl), send a /search request like the one below. It returns the values that we use to populate the sidebar:
{
  "query": "test",
  "pageSize": 10,
  "requestOptions": {
    "facetBucketSize": 3000,
    "facetFilters": [
      {
        "fieldName": "app",
        "values": [
          {
            "value": "confluence",
            "relationType": "EQUALS"
          }
        ]
      }
    ]
  }
}
This will return a top-level field, facetResults.
facetResults has the following relevant fields:
- sourceName - same as the facet fieldName in the facetFilters object
- operatorName - not relevant
- buckets - a list of facet bucket objects, each corresponding to a facet value
The facet bucket object has the following relevant fields:
- count - the number of search results that would be returned if filtering by the facet value
- value - the facet value (e.g. “engineering” for the space facet)
  - stringValue - the string value
  - intValue - the integer value (not common)
  - displayLabel - alternative value used for display in the UI
  - iconConfig - optional image used to represent the facet value, such as a profile picture for people facet values.
A sample facetResults entry looks like this:
{
  "sourceName": "space",
  "operatorName": "SelectMultiple",
  "buckets": [
    {
      "count": 3,
      "value": {
        "stringValue": "engineering"
      }
    }
  ]
}
To get all of the facets you can use with a particular datasource, you can look at all of the sourceNames in the facetResults returned.
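Collecting the sourceNames and their bucket values from a parsed response can be sketched in Python as follows. The function name is hypothetical, and the sample dictionary is a hand-written stand-in shaped like the fields described above, not real API output.

```python
def facet_values_by_source(response):
    """Map each facet sourceName to a list of (value, count) tuples."""
    summary = {}
    for facet in response.get("facetResults", []):
        rows = []
        for bucket in facet.get("buckets", []):
            value = bucket.get("value", {})
            # Prefer the string value; fall back to the less common intValue.
            label = value.get("stringValue", value.get("intValue"))
            rows.append((label, bucket.get("count", 0)))
        summary[facet["sourceName"]] = rows
    return summary


sample_response = {
    "facetResults": [
        {
            "sourceName": "space",
            "operatorName": "SelectMultiple",
            "buckets": [{"count": 3, "value": {"stringValue": "engineering"}}],
        }
    ]
}
facets = facet_values_by_source(sample_response)
available_field_names = sorted(facets)
```

The keys of the returned dictionary are the facet names you can then use as fieldName in a facetFilter.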
If latency is a concern and you only want to receive facetResults, you can send a request with pageSize = 0 and add "FACET_RESULTS" to the responseHints field in requestOptions. The response then contains only facetResults and no documents. See the sample request body below, which gathers facetResults for all Confluence-specific facets:
{
  "query": "test",
  "pageSize": 0,
  "requestOptions": {
    "facetBucketSize": 3000,
    "facetFilters": [
      {
        "fieldName": "app",
        "values": [
          {
            "value": "confluence",
            "relationType": "EQUALS"
          }
        ]
      }
    ],
    "responseHints": [
      "FACET_RESULTS"
    ]
  }
}
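The facet-only request can also be assembled in code. The Python sketch below builds the same body and wraps it in a POST request object without sending it. The URL, the bearer-token header, and both function names are assumptions for illustration only; substitute the endpoint and authentication scheme that apply to your deployment.

```python
import json
import urllib.request


def facets_only_body(query, app, bucket_size=3000):
    """Build a /search body that asks only for facetResults of one datasource."""
    return {
        "query": query,
        "pageSize": 0,  # no documents, only facets
        "requestOptions": {
            "facetBucketSize": bucket_size,
            "facetFilters": [
                {
                    "fieldName": "app",
                    "values": [{"value": app, "relationType": "EQUALS"}],
                }
            ],
            "responseHints": ["FACET_RESULTS"],
        },
    }


def prepare_request(search_url, token, body):
    """Wrap the body in a POST request (assumed bearer-token auth)."""
    return urllib.request.Request(
        search_url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


body = facets_only_body("test", "confluence")
# Placeholder URL and token; urllib.request.urlopen(request) would send it.
request = prepare_request("https://example.invalid/search", "TOKEN", body)
```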
Filtering Facet Results
You can filter the list of possible facet values of a particular type (obtained in the facetResults above) using a prefix. To filter the buckets of a particular sourceName in the facetResults by prefix, set the facetBucketFilter object. It has the following two fields:
- facet - the facet fieldName that you want to filter on.
- prefix - the prefix by which you want to filter the buckets of that facet.
To use facet bucket filtering, make sure that the responseHints field under requestOptions contains "FACET_RESULTS". See the sample request body below, which filters the "type" buckets using the prefix “co”:
{
  "query": "test",
  "requestOptions": {
    "facetBucketFilter": {
      "facet": "type",
      "prefix": "co"
    },
    "facetBucketSize": 3000,
    "responseHints": [
      "FACET_RESULTS"
    ]
  }
}
Within the facetResults field in the response to this query, the buckets for sourceName = "type" contain only those values in which some word starts with the prefix "co".
You can limit the maximum number of buckets returned in the facetResults for every sourceName by passing your desired value as facetBucketSize. In the example above, we limit buckets to 3000.
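The "prefix in any word" behaviour described above can be approximated locally, for example to mock responses in tests. The Python sketch below is an illustration only: case-insensitive comparison and splitting words on whitespace are assumptions, the server may tokenize differently, and the function names and sample bucket values are hypothetical.

```python
def matches_prefix(value, prefix):
    """Return True if any whitespace-separated word in value starts with prefix."""
    wanted = prefix.casefold()
    return any(word.casefold().startswith(wanted) for word in value.split())


def filter_buckets(buckets, prefix):
    """Keep only buckets whose stringValue matches the prefix in some word."""
    return [
        bucket
        for bucket in buckets
        if matches_prefix(bucket.get("value", {}).get("stringValue", ""), prefix)
    ]


# Hand-written stand-in buckets, not real API output.
buckets = [
    {"count": 4, "value": {"stringValue": "confluence"}},
    {"count": 2, "value": {"stringValue": "source code"}},
    {"count": 9, "value": {"stringValue": "document"}},
]
kept = filter_buckets(buckets, "co")
```

With these stand-in buckets, kept holds "confluence" and "source code", and drops "document".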
Preferred Name prefix matching
For sourceNames whose buckets are people (e.g. "from", "owner", etc.), we support prefix matching on preferred names (or nicknames). So if facetBucketFilter is applied to the “from” facet with the prefix “Adi”, it returns all the people whose name or preferred name starts with “Adi”.