Add or update datasource
Add or update a custom datasource and its schema.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
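As a minimal sketch of the header format described above, the following helper builds the request headers. The function name `bearer_headers`, the `Content-Type` value, and the token are illustrative assumptions, not part of the documented API.

```python
def bearer_headers(token: str) -> dict:
    """Build HTTP headers carrying a Bearer token (hypothetical helper)."""
    # Reject empty tokens or tokens with stray whitespace, which would
    # produce a malformed Authorization header.
    if not token or token != token.strip():
        raise ValueError("token must be non-empty with no surrounding whitespace")
    return {
        "Authorization": f"Bearer {token}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }


headers = bearer_headers("example-token")  # placeholder token
```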
Body
Structure describing config properties of a custom datasource
Unique identifier of datasource instance to which this config applies.
The user-friendly instance label to display. If omitted, falls back to the title-cased name.
The type of this datasource. It is an important relevance signal, so it must be specified and cannot be UNCATEGORIZED. Please refer to this for more details.
UNCATEGORIZED, TICKETS, CRM, PUBLISHED_CONTENT, COLLABORATIVE_CONTENT, QUESTION_ANSWER, MESSAGING, CODE_REPOSITORY, CHANGE_MANAGEMENT, PEOPLE, EMAIL, SSO, ATS, KNOWLEDGE_HUB, EXTERNAL_SHORTCUT, ENTITY, CALENDAR
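The rule above (a category must be given and cannot be UNCATEGORIZED) can be checked on the client before sending a request. This is a sketch only: the enum values are copied from the list above, while `validate_category` is a hypothetical helper, and the server's own validation may differ.

```python
# Enum values as listed in this reference.
DATASOURCE_CATEGORIES = frozenset({
    "UNCATEGORIZED", "TICKETS", "CRM", "PUBLISHED_CONTENT",
    "COLLABORATIVE_CONTENT", "QUESTION_ANSWER", "MESSAGING",
    "CODE_REPOSITORY", "CHANGE_MANAGEMENT", "PEOPLE", "EMAIL", "SSO",
    "ATS", "KNOWLEDGE_HUB", "EXTERNAL_SHORTCUT", "ENTITY", "CALENDAR",
})


def validate_category(category: str) -> str:
    """Return the category if it is usable for a datasource, else raise."""
    if category not in DATASOURCE_CATEGORIES:
        raise ValueError(f"unknown datasource category: {category!r}")
    # UNCATEGORIZED is a listed value, but it is not accepted here.
    if category == "UNCATEGORIZED":
        raise ValueError("category must be specified and cannot be UNCATEGORIZED")
    return category
```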
Regular expression that matches URLs of documents of the datasource instance. The behavior for multiple matches is non-deterministic. Note: urlRegex is a required field for non-entity datasources, but it is not required if the datasource is used to push custom entities (i.e. datasources where isEntityDatasource is true). Please make the regex as specific as possible to this datasource instance.
"https://example-company.datasource.com/.*"
The URL to an image to be displayed as an icon for this datasource instance. Must have a transparency mask. SVGs are recommended over PNGs. Public, scio-authenticated and Base64 encoded data URLs are all valid (but not third-party-authenticated URLs).
The list of top-level objectTypes for the datasource.
The definition for a DocumentMetadata.objectType within a datasource.
Example text for what to search for in this datasource.
The URL of the landing page for this datasource instance. Should point to the most useful page for users, not the company marketing page.
This only applies to WEB_CRAWL and BROWSER_CRAWL datasources. Defines the seed URLs for crawling.
The URL to an image to be displayed as an icon for this datasource instance in dark mode. Must have a transparency mask. SVGs are recommended over PNGs. Public, scio-authenticated and Base64 encoded data URLs are all valid (but not third-party-authenticated URLs).
List of built-in facet types that should be hidden for the datasource.
TYPE, TAG, AUTHOR, OWNER
A list of regular expressions to apply to an arbitrary URL to transform it into a canonical URL for this datasource instance. Regexes are to be applied in the order specified in this list.
Regular expression to apply to an arbitrary string to transform it into a canonical string.
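To illustrate why the order of the list matters, here is a small sketch that applies regex rewrites in sequence. The shape of each rule (a pattern paired with a replacement string) and the sample rules are assumptions made for the example; this reference does not spell out the fields of each entry.

```python
import re


def canonicalize(value: str, rules: list) -> str:
    """Apply (pattern, replacement) rules one after another, in list order."""
    for pattern, replacement in rules:
        value = re.sub(pattern, replacement, value)
    return value


# Hypothetical rules: force https, drop the query string, trim trailing slashes.
rules = [
    (r"^http://", "https://"),
    (r"\?.*$", ""),
    (r"/+$", ""),
]

url = "http://example-company.datasource.com/docs/42/?ref=home"
canonical = canonicalize(url, rules)
# canonical == "https://example-company.datasource.com/docs/42"

# Swapping the last two rules leaves the trailing slash in place, because
# the slash is not at the end of the string until the query is removed.
reordered = canonicalize(url, [rules[0], rules[2], rules[1]])
# reordered == "https://example-company.datasource.com/docs/42/"
```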
A list of regular expressions to apply to an arbitrary title to transform it into the title that will be displayed in the search results.
Regular expression to apply to an arbitrary string to transform it into a canonical string.
A regex that identifies titles that should not be indexed.
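A title-exclusion regex of this kind can be tried out locally before it is saved in the config. The sketch below assumes that a match anywhere in the title excludes the document (`re.search` semantics); the actual matching mode is not stated here, and the pattern and helper name are hypothetical.

```python
import re

# Hypothetical pattern: skip untitled documents and explicit drafts.
EXCLUDED_TITLE_REGEX = r"^(Untitled|\[DRAFT\])"


def should_index(title: str, excluded_title_regex: str = EXCLUDED_TITLE_REGEX) -> bool:
    """Return False when the title matches the exclusion regex."""
    return re.search(excluded_title_regex, title) is None


titles = ["Untitled document", "[DRAFT] Q3 roadmap", "Q3 roadmap"]
indexable = [t for t in titles if should_index(t)]
# indexable == ["Q3 roadmap"]
```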
The source from which document content was pulled, e.g. an API crawl or browser history.
API_CRAWL, BROWSER_CRAWL, BROWSER_HISTORY, BUILTIN, FEDERATED_SEARCH, PUSH_API, WEB_CRAWL, NATIVE_HISTORY
List of actions for this datasource instance that will show up in autocomplete and app card, e.g. "Create new issue" for Jira.
An action for a specific datasource that will show up in autocomplete and app card, e.g. "Create new issue" for Jira.
The name of a render config to use for displaying results from this datasource. Any well-known datasource name may be used to render the same as that source, e.g. web or gdrive. Please refer to this for more details.
Aliases that can be used as app operator-values.
Whether or not this datasource is hosted on-premise.
True if browser activity is able to report the correct URL for VIEW events. Set this to true if the URLs reported by Chrome are constant throughout each page load. Set this to false if the page has JavaScript that modifies the URL during or after the load.
If true, a utm_source query param will be added to outbound links to this datasource within Glean.
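For illustration only, this is roughly what adding a utm_source query parameter to an outbound link looks like. The parameter value used here ("glean") and the helper name are assumptions; the value the product actually sends is not given in this reference.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm_source(url: str, source: str) -> str:
    """Return the URL with utm_source set, replacing any existing value."""
    parts = urlsplit(url)
    # Keep all existing parameters except a previous utm_source.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "utm_source"
    ]
    query.append(("utm_source", source))
    return urlunsplit(parts._replace(query=urlencode(query)))


tagged = add_utm_source(
    "https://example-company.datasource.com/docs/42?view=full", "glean"
)
# tagged == "https://example-company.datasource.com/docs/42?view=full&utm_source=glean"
```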
If true, the fragment part of the URL will be stripped when converting to a canonical URL.
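The effect of fragment stripping can be previewed with the Python standard library. This is a sketch of the idea, not the service's implementation; `strip_fragment` is a hypothetical helper.

```python
from urllib.parse import urldefrag


def strip_fragment(url: str) -> str:
    """Drop the '#fragment' part of a URL, leaving everything else intact."""
    return urldefrag(url).url


stripped = strip_fragment("https://example-company.datasource.com/docs/42?view=full#section-3")
# stripped == "https://example-company.datasource.com/docs/42?view=full"
```

With this option enabled, two links that differ only in their fragment would resolve to the same canonical URL.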
If the datasource uses another datasource for identity info, then the name of the datasource. The identity datasource must exist already.
If the datasource uses a specific product access group, then the name of that group.
Whether email is used to reference users in document ACLs and in group memberships.
True if this datasource is used to push custom entities.
True if this datasource will be used for testing purposes only. Documents from such a datasource have no effect on search rankings.
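Putting several of the properties above together, the following sketch assembles a request body and a POST request without sending it. Only `urlRegex` and `isEntityDatasource` are named in this reference; the other JSON keys (`name`, `datasourceCategory`, `isTestDatasource`), the endpoint URL, and the token are assumptions and should be checked against the official schema before use.

```python
import json
import urllib.request


def build_datasource_config(name, category, url_regex=None,
                            is_entity_datasource=False, is_test=False):
    """Assemble a config dict, enforcing the rules stated in this reference."""
    if category == "UNCATEGORIZED":
        raise ValueError("category cannot be UNCATEGORIZED")
    # urlRegex is required unless the datasource pushes custom entities.
    if not is_entity_datasource and not url_regex:
        raise ValueError("urlRegex is required for non-entity datasources")
    config = {
        "name": name,                      # assumed key name
        "datasourceCategory": category,    # assumed key name
        "isEntityDatasource": is_entity_datasource,
        "isTestDatasource": is_test,       # assumed key name
    }
    if url_regex:
        config["urlRegex"] = url_regex
    return config


def build_request(endpoint, token, config):
    """Create (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(config).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


config = build_datasource_config(
    "exampledatasource",
    "PUBLISHED_CONTENT",
    url_regex="https://example-company.datasource.com/.*",
    is_test=True,
)
# Placeholder endpoint and token; substitute the values for your deployment.
request = build_request(
    "https://example-company-be.glean.com/api/index/v1/adddatasource",
    "example-token",
    config,
)
```

Passing `request` to `urllib.request.urlopen` would send it. Starting with the test flag set keeps the new datasource from affecting search rankings while the configuration is being tuned.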
Response
OK