curl --request POST \
  --url https://{domain}-be.glean.com/api/index/v1/adddatasource \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "name": "<string>",
  "displayName": "<string>",
  "datasourceCategory": "UNCATEGORIZED",
  "urlRegex": "https://example-company.datasource.com/.*",
  "iconUrl": "<string>",
  "objectDefinitions": [
    {
      "name": "<string>",
      "displayLabel": "<string>",
      "docCategory": "UNCATEGORIZED",
      "propertyDefinitions": [
        {
          "name": "<string>",
          "displayLabel": "<string>",
          "displayLabelPlural": "<string>",
          "propertyType": "TEXT",
          "uiOptions": "NONE",
          "hideUiFacet": true,
          "uiFacetOrder": 123,
          "skipIndexing": true,
          "group": "<string>"
        }
      ],
      "propertyGroups": [
        {
          "name": "<string>",
          "displayLabel": "<string>"
        }
      ],
      "summarizable": true
    }
  ],
  "suggestionText": "<string>",
  "homeUrl": "<string>",
  "crawlerSeedUrls": [
    "<string>"
  ],
  "iconDarkUrl": "<string>",
  "hideBuiltInFacets": [
    "TYPE"
  ],
  "canonicalizingURLRegex": [
    {
      "matchRegex": "<string>",
      "rewriteRegex": "<string>"
    }
  ],
  "canonicalizingTitleRegex": [
    {
      "matchRegex": "<string>",
      "rewriteRegex": "<string>"
    }
  ],
  "redlistTitleRegex": "<string>",
  "connectorType": "API_CRAWL",
  "quicklinks": [
    {
      "name": "<string>",
      "shortName": "<string>",
      "url": "<string>",
      "iconConfig": {
        "color": "#343CED",
        "key": "person_icon",
        "iconType": "GLYPH",
        "name": "user"
      },
      "id": "<string>",
      "scopes": [
        "APP_CARD"
      ]
    }
  ],
  "renderConfigPreset": "<string>",
  "aliases": [
    "<string>"
  ],
  "isOnPrem": true,
  "trustUrlRegexForViewActivity": true,
  "includeUtmSource": true,
  "stripFragmentInCanonicalUrl": true,
  "identityDatasourceName": "<string>",
  "productAccessGroup": "<string>",
  "isUserReferencedByEmail": true,
  "isEntityDatasource": false,
  "isTestDatasource": false
}'
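
The curl sample above lists every field. As a rough sketch, the same request can be assembled in Python with only the fields this page describes as required for a non-entity datasource. The domain, token, and field values below are hypothetical placeholders, and the request is built but not sent.

```python
import json
import urllib.request


def build_add_datasource_request(domain: str, token: str, config: dict) -> urllib.request.Request:
    """Build, but do not send, the POST request shown in the curl sample."""
    return urllib.request.Request(
        url=f"https://{domain}-be.glean.com/api/index/v1/adddatasource",
        data=json.dumps(config).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
minimal_config = {
    "name": "examplewiki",
    "displayName": "Example Wiki",
    "datasourceCategory": "PUBLISHED_CONTENT",
    "urlRegex": "https://example-company.datasource.com/.*",
}
request = build_add_datasource_request("example-company", "<token>", minimal_config)

# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```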

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Structure describing the config properties of a custom datasource.

name
string
required

Unique identifier of the datasource instance to which this config applies.

displayName
string

The user-friendly instance label to display. If omitted, falls back to the title-cased name.

datasourceCategory
enum<string>
default:UNCATEGORIZED

The type of this datasource. It is an important signal for relevance, so it must be specified and must not be left at the default value, UNCATEGORIZED. Refer to Glean's datasource category documentation for more details.

Available options:
UNCATEGORIZED,
TICKETS,
CRM,
PUBLISHED_CONTENT,
COLLABORATIVE_CONTENT,
QUESTION_ANSWER,
MESSAGING,
CODE_REPOSITORY,
CHANGE_MANAGEMENT,
PEOPLE,
EMAIL,
SSO,
ATS,
KNOWLEDGE_HUB,
EXTERNAL_SHORTCUT,
ENTITY,
CALENDAR

urlRegex
string

Regular expression that matches URLs of documents of the datasource instance. The behavior for multiple matches is non-deterministic. Note: urlRegex is required for non-entity datasources, but not if the datasource is used to push custom entities (i.e. datasources where isEntityDatasource is true). Make the regex as specific as possible to this datasource instance.

Example:

"https://example-company.datasource.com/.*"

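The requirements stated so far (name is required, datasourceCategory must not stay UNCATEGORIZED, urlRegex is required unless the datasource pushes custom entities) can be checked on the client before calling the API. The sketch below is one way to do that; the function names are hypothetical. The URL check uses Python's `re.fullmatch`, which is an assumption: this page does not say which regex engine or matching mode Glean applies.

```python
import re
from typing import List


def check_datasource_config(config: dict) -> List[str]:
    """Return a list of problems found in a datasource config (empty if none)."""
    problems = []
    if not config.get("name"):
        problems.append("name is required")
    if config.get("datasourceCategory", "UNCATEGORIZED") == "UNCATEGORIZED":
        problems.append("datasourceCategory must be set to a value other than UNCATEGORIZED")
    # urlRegex is only optional when the datasource pushes custom entities.
    if not config.get("isEntityDatasource", False) and not config.get("urlRegex"):
        problems.append("urlRegex is required for non-entity datasources")
    return problems


def url_matches(url_regex: str, url: str) -> bool:
    """Assumed semantics: the whole URL must match, using Python regex syntax."""
    return re.fullmatch(url_regex, url) is not None


loose = "https://example-company.datasource.com/.*"
strict = r"https://example-company\.datasource\.com/.*"

# An unescaped "." matches any character, so the loose pattern also accepts
# look-alike hosts; escaping the dots makes the pattern more specific.
lookalike = "https://example-companyXdatasourceYcom/page"
print(url_matches(loose, lookalike))   # True
print(url_matches(strict, lookalike))  # False
```

Escaping literal dots is one concrete way to follow the advice to keep the regex as specific as possible.
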
iconUrl
string

The URL of an image to be displayed as an icon for this datasource instance. The image must have a transparency mask. SVGs are recommended over PNGs. Public, scio-authenticated and Base64-encoded data URLs are all valid (but not third-party-authenticated URLs).

objectDefinitions
object[]

The list of top-level objectTypes for the datasource.

The definition for a DocumentMetadata.objectType within a datasource.

suggestionText
string

Example text for what to search for in this datasource.

homeUrl
string

The URL of the landing page for this datasource instance. Should point to the most useful page for users, not the company marketing page.

crawlerSeedUrls
string[]

This only applies to WEB_CRAWL and BROWSER_CRAWL datasources. Defines the seed URLs for crawling.

iconDarkUrl
string

The URL of an image to be displayed as an icon for this datasource instance in dark mode. The image must have a transparency mask. SVGs are recommended over PNGs. Public, scio-authenticated and Base64-encoded data URLs are all valid (but not third-party-authenticated URLs).

hideBuiltInFacets
enum<string>[]

List of built-in facet types that should be hidden for the datasource.

Available options:
TYPE,
TAG,
AUTHOR,
OWNER

canonicalizingURLRegex
object[]

A list of regular expressions to apply to an arbitrary URL to transform it into a canonical URL for this datasource instance. Regexes are to be applied in the order specified in this list.

Regular expression to apply to an arbitrary string to transform it into a canonical string.
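
To illustrate why the order of the list matters, here is a minimal sketch that applies matchRegex/rewriteRegex pairs one after another. It assumes each pair behaves like a regex substitution and uses Python's `re.sub` with `\1`-style backreferences; the replacement syntax Glean actually accepts is not specified on this page. The rules and URL are hypothetical.

```python
import re
from typing import Dict, List


def apply_rules(value: str, rules: List[Dict[str, str]]) -> str:
    """Apply each matchRegex/rewriteRegex pair in list order."""
    for rule in rules:
        value = re.sub(rule["matchRegex"], rule["rewriteRegex"], value)
    return value


# Hypothetical rules: drop the query string, then map /view/<id> to /doc/<id>.
rules = [
    {"matchRegex": r"\?.*$", "rewriteRegex": ""},
    {"matchRegex": r"/view/(\d+)$", "rewriteRegex": r"/doc/\1"},
]

url = "https://example-company.datasource.com/view/42?ref=email"
print(apply_rules(url, rules))
# https://example-company.datasource.com/doc/42

# Reversed, the second rule no longer matches because the query string is
# still present when it runs, so the result differs.
print(apply_rules(url, list(reversed(rules))))
# https://example-company.datasource.com/view/42
```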

canonicalizingTitleRegex
object[]

A list of regular expressions to apply to an arbitrary title to transform it into the title that will be displayed in the search results.

Regular expression to apply to an arbitrary string to transform it into a canonical string.

redlistTitleRegex
string

A regex that identifies titles that should not be indexed.
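
A quick way to sanity-check a redlist pattern before submitting it is to run it over sample titles locally. The sketch below assumes the pattern may match anywhere in the title (Python's `re.search`); whether Glean uses partial or full matching is not stated here. The pattern and titles are hypothetical.

```python
import re
from typing import List


def indexable_titles(titles: List[str], redlist_title_regex: str) -> List[str]:
    """Keep only the titles that the redlist pattern does not match."""
    pattern = re.compile(redlist_title_regex)
    return [title for title in titles if pattern.search(title) is None]


# Hypothetical pattern: skip untitled documents and drafts.
redlist = r"^(Untitled|\[DRAFT\])"
sample_titles = ["Untitled document", "[DRAFT] Q3 plan", "Onboarding guide"]
print(indexable_titles(sample_titles, redlist))
# ['Onboarding guide']
```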

connectorType
enum<string>

The source from which document content was pulled, e.g. an API crawl or browser history.

Available options:
API_CRAWL,
BROWSER_CRAWL,
BROWSER_HISTORY,
BUILTIN,
FEDERATED_SEARCH,
PUSH_API,
WEB_CRAWL,
NATIVE_HISTORY
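
Since crawlerSeedUrls is documented as applying only to WEB_CRAWL and BROWSER_CRAWL datasources, a client can flag configs that combine seed URLs with another connector type. This is a sketch with a hypothetical function name; how the API itself treats such a config (ignoring the field or rejecting the request) is not stated on this page.

```python
from typing import List

# Values copied from the connectorType options listed above.
CONNECTOR_TYPES = {
    "API_CRAWL", "BROWSER_CRAWL", "BROWSER_HISTORY", "BUILTIN",
    "FEDERATED_SEARCH", "PUSH_API", "WEB_CRAWL", "NATIVE_HISTORY",
}
CRAWL_TYPES = {"WEB_CRAWL", "BROWSER_CRAWL"}


def check_connector_settings(config: dict) -> List[str]:
    """Return client-side warnings about connectorType and crawlerSeedUrls."""
    problems = []
    connector_type = config.get("connectorType")
    if connector_type is not None and connector_type not in CONNECTOR_TYPES:
        problems.append(f"unknown connectorType: {connector_type}")
    if config.get("crawlerSeedUrls") and connector_type not in CRAWL_TYPES:
        problems.append("crawlerSeedUrls only applies to WEB_CRAWL and BROWSER_CRAWL datasources")
    return problems


print(check_connector_settings({
    "connectorType": "PUSH_API",
    "crawlerSeedUrls": ["https://example-company.datasource.com/"],
}))
```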

quicklinks
object[]

List of actions for this datasource instance that will show up in autocomplete and app card, e.g. "Create new issue" for Jira.

An action for a specific datasource that will show up in autocomplete and app card, e.g. "Create new issue" for Jira.

renderConfigPreset
string

The name of a render config to use for displaying results from this datasource. Any well-known datasource name may be used to render results the same way as that source, e.g. web or gdrive. Refer to Glean's render config documentation for more details.

aliases
string[]

Aliases that can be used as app operator-values.

isOnPrem
boolean

Whether or not this datasource is hosted on-premise.

trustUrlRegexForViewActivity
boolean
default:true

True if browser activity is able to report the correct URL for VIEW events. Set this to true if the URLs reported by Chrome are constant throughout each page load. Set this to false if the page has JavaScript that modifies the URL during or after the load.

includeUtmSource
boolean

If true, a utm_source query param will be added to outbound links to this datasource within Glean.

stripFragmentInCanonicalUrl
boolean
default:true

If true, the fragment part of the URL will be stripped when converting to a canonical URL.
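
For reference, the fragment is the part of a URL after `#`. The sketch below shows what stripping it looks like using Python's standard `urllib.parse.urldefrag`; it illustrates the concept only and is not Glean's implementation. The URL is hypothetical.

```python
from urllib.parse import urldefrag


def strip_fragment(url: str) -> str:
    """Return the URL without its #fragment part."""
    return urldefrag(url).url


print(strip_fragment("https://example-company.datasource.com/doc/42#section-3"))
# https://example-company.datasource.com/doc/42
```

With the option enabled, links to different sections of the same page would presumably resolve to one canonical URL.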

identityDatasourceName
string

If the datasource uses another datasource for identity info, the name of that datasource. The identity datasource must already exist.

productAccessGroup
string

If the datasource uses a specific product access group, then the name of that group.

isUserReferencedByEmail
boolean

Whether email is used to reference users in document ACLs and in group memberships.

isEntityDatasource
boolean
default:false

True if this datasource is used to push custom entities.

isTestDatasource
boolean
default:false

True if this datasource will be used for testing purposes only. Documents from such a datasource do not have any effect on search rankings.

Response

200

OK