Read documents
Read the documents, including metadata, for the list of Glean Document IDs or URLs specified in the request. The response does not include the enhanced metadata available via /documentmetadata.
curl --request POST \
--url https://{domain}-be.glean.com/rest/api/v1/getdocuments \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data '{
"documentSpecs": [
{
"url": "<string>"
}
],
"includeFields": [
"LAST_VIEWED_AT"
]
}'
{
"documents": {}
}
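The curl call above can also be expressed in Python. The following is a minimal sketch using only the standard library; the function name `build_getdocuments_request` and the sample domain and URL are illustrative assumptions, not part of the API. It builds the request without sending it.

```python
import json
import urllib.request


def build_getdocuments_request(domain: str, token: str, urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST request for the getdocuments endpoint."""
    body = {
        # One spec per document, mirroring the curl example.
        "documentSpecs": [{"url": u} for u in urls],
        "includeFields": ["LAST_VIEWED_AT"],
    }
    return urllib.request.Request(
        url=f"https://{domain}-be.glean.com/rest/api/v1/getdocuments",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
req = build_getdocuments_request("example", "<token>", ["https://example.com/doc"])
print(req.get_method(), req.full_url)
```

To actually send the request, pass `req` to `urllib.request.urlopen` and decode the JSON body of the response.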
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Headers
Email address of a user on whose behalf the request is intended to be made (should be non-empty only for global tokens).
Auth type being used to access the endpoint (should be non-empty only for global tokens).
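As a sketch of how the two optional headers might be attached only when a global token is in use: this page does not show the header names, so `X-Glean-ActAs` and `X-Glean-Auth-Type` below are assumptions to verify against your deployment, and `build_headers` is a hypothetical helper.

```python
# Assumed header names; they are not stated on this page.
ACT_AS_HEADER = "X-Glean-ActAs"
AUTH_TYPE_HEADER = "X-Glean-Auth-Type"


def build_headers(token, act_as_email=None, auth_type=None):
    """Return request headers, adding the optional ones only when provided."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    # Both optional headers should be non-empty only for global tokens,
    # so they are omitted entirely when no value is given.
    if act_as_email:
        headers[ACT_AS_HEADER] = act_as_email
    if auth_type:
        headers[AUTH_TYPE_HEADER] = auth_type
    return headers


user_token_headers = build_headers("<token>")
global_token_headers = build_headers("<token>", act_as_email="george@example.com")
```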
Body
The specification for the documents to be retrieved.
The URL of the document.
List of Document fields to return (that aren't returned by default).
LAST_VIEWED_AT, VISITORS_COUNT, RECENT_SHARES, DOCUMENT_CONTENT
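The body described above can be assembled and checked before sending. This is a rough sketch: `build_body` is a hypothetical helper, and the `"id"` key for specifying a document by Glean Document ID is an assumption, since only the `"url"` form appears in the example on this page.

```python
ALLOWED_INCLUDE_FIELDS = {
    "LAST_VIEWED_AT",
    "VISITORS_COUNT",
    "RECENT_SHARES",
    "DOCUMENT_CONTENT",
}


def build_body(urls=(), ids=(), include_fields=()):
    """Build the JSON-serializable request body, rejecting unknown fields."""
    unknown = set(include_fields) - ALLOWED_INCLUDE_FIELDS
    if unknown:
        raise ValueError(f"unsupported includeFields: {sorted(unknown)}")
    # "url" specs follow the documented example; "id" specs are assumed.
    specs = [{"url": u} for u in urls] + [{"id": i} for i in ids]
    if not specs:
        raise ValueError("at least one document URL or ID is required")
    body = {"documentSpecs": specs}
    if include_fields:
        body["includeFields"] = list(include_fields)
    return body


body = build_body(urls=["https://example.com/doc"], include_fields=["DOCUMENT_CONTENT"])
```

Validating the field names locally gives a clearer error than a rejected request would.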
Response
The document details or the error if document is not found.
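Because each entry is either the document details or an error, callers usually want to separate the two. The sketch below assumes that `documents` maps each requested key to an object, and that a failed lookup carries an `"error"` string; both the exact shape and the sample data are assumptions, not confirmed by this page.

```python
def split_results(response):
    """Separate found documents from per-document errors."""
    found, errors = {}, {}
    for key, entry in response.get("documents", {}).items():
        if "error" in entry:
            # Assumed error shape: {"error": "<message>"}.
            errors[key] = entry["error"]
        else:
            found[key] = entry
    return found, errors


# Hypothetical response for illustration only.
sample = {
    "documents": {
        "https://example.com/a": {"id": "DOC_1", "title": "Design notes"},
        "https://example.com/b": {"error": "not found"},
    }
}
found, errors = split_results(sample)
```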
The Glean Document ID.
The app or other repository type from which the document was extracted.
The source from which document content was pulled, e.g. an API crawl or browser history.
API_CRAWL, BROWSER_CRAWL, BROWSER_HISTORY, BUILTIN, FEDERATED_SEARCH, PUSH_API, WEB_CRAWL, NATIVE_HISTORY
The datasource-specific type of the document (e.g. for Jira issues, this is the issue type such as Bug or Feature Request).
The plaintext content of the document.
The title of the document.
A permalink for the document.
The datasource instance from which the document was extracted.
The type of the result. Interpretation is specific to each datasource. (e.g. for Jira issues, this is the issue type such as Bug or Feature Request).
The name of the container (higher level parent, not direct parent) of the result. Interpretation is specific to each datasource (e.g. Channels for Slack, Project for Jira). cf. parentId
The Glean Document ID of the container. Uniquely identifies the container.
The Glean Document ID of the super container. Super container represents a broader abstraction that contains many containers. For example, whereas container might refer to a folder, super container would refer to a drive.
The id of the direct parent of the result. Interpretation is specific to each datasource (e.g. parent issue for Jira). cf. container
The index-wide unique identifier.
A unique identifier used to represent the document in any logging or feedback requests in place of documentId.
Hash of the Glean Document ID.
The display name.
An opaque identifier that can be used to request metadata for a Person.
A list of documents related to this person.
{
"department": "Movies",
"email": "george@example.com",
"location": "Hollywood, CA",
"phone": 6505551234,
"photoUrl": "https://example.com/george.jpg",
"startDate": "2000-01-23",
"title": "Actor"
}
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
The display name.
An opaque identifier that can be used to request metadata for a Person.
A list of documents related to this person.
{
"department": "Movies",
"email": "george@example.com",
"location": "Hollywood, CA",
"phone": 6505551234,
"photoUrl": "https://example.com/george.jpg",
"startDate": "2000-01-23",
"title": "Actor"
}
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
A list of people mentioned in the document.
The display name.
An opaque identifier that can be used to request metadata for a Person.
A list of documents related to this person.
{
"department": "Movies",
"email": "george@example.com",
"location": "Hollywood, CA",
"phone": 6505551234,
"photoUrl": "https://example.com/george.jpg",
"startDate": "2000-01-23",
"title": "Actor"
}
The level of visibility of the document as understood by our system.
PRIVATE, SPECIFIC_PEOPLE_AND_GROUPS, DOMAIN_LINK, DOMAIN_VISIBLE, PUBLIC_LINK, PUBLIC_VISIBLE
A list of components this result is associated with. Interpretation is specific to each datasource. (e.g. for Jira issues, these are components.)
The status or disposition of the result. Interpretation is specific to each datasource. (e.g. for Jira issues, this is the issue status such as Done, In Progress or Will Not Fix).
The status category of the result. Meant to be more general than status. Interpretation is specific to each datasource.
A list of stars associated with this result. "Pin" is an older name.
The document which should be a pinned result.
The opaque id of the pin.
Filters which restrict who should see the pinned document. Values are taken from the corresponding filters in people search.
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
The query strings for which the pinned result will show.
The document priority. Interpretation is datasource specific.
The display name.
An opaque identifier that can be used to request metadata for a Person.
A list of documents related to this person.
{
"department": "Movies",
"email": "george@example.com",
"location": "Hollywood, CA",
"phone": 6505551234,
"photoUrl": "https://example.com/george.jpg",
"startDate": "2000-01-23",
"title": "Actor"
}
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
The display name.
An opaque identifier that can be used to request metadata for a Person.
A list of documents related to this person.
{
"department": "Movies",
"email": "george@example.com",
"location": "Hollywood, CA",
"phone": 6505551234,
"photoUrl": "https://example.com/george.jpg",
"startDate": "2000-01-23",
"title": "Actor"
}
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
A list of tags for the document. Interpretation is datasource specific.
A list of collections that the document belongs to.
The unique ID of the Collection.
The unique name of the Collection.
A brief summary of the Collection's contents.
The emoji icon of this Collection.
Indicates whether edits are allowed for everyone or only admins.
The parent of this Collection, or 0 if it's a top-level Collection.
The datasource type this Collection can hold.
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
The number of items currently in the Collection. Separated from the actual items so we can grab the count without items.
The number of children Collections. Separated from the actual children so we can grab the count without children.
The items in this Collection.
Metadata describing the categories this Collection is pinned to and the categories it is eligible to be pinned to.
The names of the shortcuts (Go Links) that point to this Collection.
The children Collections of this Collection.
A list of user roles for the Collection.
A list of added user roles for the Collection.
A list of removed user roles for the Collection.
Filters which restrict who should see this Collection. Values are taken from the corresponding filters in people search.
The permissions the current viewer has with respect to a particular object.
The user-visible, datasource-specific ID (e.g. a Salesforce case number or a GitHub PR number).
The count of comments (thread replies in the case of Slack).
The count of reactions on the document.
To be deprecated in favor of reacts. A (potentially non-exhaustive) list of reactions for the document.
Describes instances of someone posting a link to this document in one of our indexed datasources.
The search endpoint fills out only numDays ago, since that is all that is needed to display the shared badge; the docmetadata endpoint fills out all the fields so that the shared badge tooltip can be displayed.
Describes the write permissions levels that a user has for a specific feature
A list of shortcuts whose destination URL is this document.
Link text following go/ prefix as entered by the user.
Canonical link text following the go/ prefix, with hyphens and underscores removed.
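The relationship between the user-entered link text and its canonical form can be sketched as follows. `canonical_alias` is a hypothetical helper that applies only the hyphen and underscore removal described above; whether the service also normalizes case or other characters is not stated here, so the sketch does not.

```python
def canonical_alias(input_alias: str) -> str:
    """Strip hyphens and underscores from the text that follows go/."""
    return input_alias.replace("-", "").replace("_", "")


# Different spellings of the same alias collapse to one canonical form.
aliases = ["team-oncall", "team_oncall", "teamoncall"]
canonical = {canonical_alias(a) for a in aliases}
```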
Title for the Go Link.
A list of user roles for the Go Link.
The opaque id of the user generated content.
Destination URL for the shortcut.
Glean Document ID for the URL, if known.
A short, plain text blurb to help people understand the intent of the shortcut.
Whether this shortcut is unlisted or not. Unlisted shortcuts are visible to author + admins only.
For variable shortcuts, contains the URL template; note that destinationUrl contains the default URL.
A list of user roles added for the Shortcut.
A list of user roles removed for the Shortcut.
The permissions the current viewer has with respect to a particular object.
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
The time the shortcut was created in ISO format (ISO 8601).
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
The time the shortcut was updated in ISO format (ISO 8601).
Document that corresponds to the destination URL, if applicable.
The URL from which the user is then redirected to the destination URL. Full replacement for https://go/<inputAlias>.
The part of the shortcut preceding the input alias when used for showing shortcuts to users. Should end with "/". e.g. "go/" for native shortcuts.
Indicates whether a shortcut is native or external.
The URL at which the user can access the edit page of the shortcut.
For file datasources such as OneDrive or GitHub, the path to the file.
Custom fields specific to individual datasources
The document's document_category(.proto).
The display name.
An opaque identifier that can be used to request metadata for a Person.
A list of documents related to this person.
{
"department": "Movies",
"email": "george@example.com",
"location": "Hollywood, CA",
"phone": 6505551234,
"photoUrl": "https://example.com/george.jpg",
"startDate": "2000-01-23",
"title": "Actor"
}
{
"name": "George Clooney",
"obfuscatedId": "abc123"
}
A list of documents that are ancestors of this document in the hierarchy of the document's datasource, for example parent folders or containers. Ancestors can be of different types and some may not be indexed. Higher level ancestors appear earlier in the list.
The Glean Document ID.
The app or other repository type from which the document was extracted.
The source from which document content was pulled, e.g. an API crawl or browser history.
API_CRAWL, BROWSER_CRAWL, BROWSER_HISTORY, BUILTIN, FEDERATED_SEARCH, PUSH_API, WEB_CRAWL, NATIVE_HISTORY
The datasource-specific type of the document (e.g. for Jira issues, this is the issue type such as Bug or Feature Request).
The title of the document.
A permalink for the document.
{
"container": "container",
"parentId": "JIRA_EN-1337",
"createTime": "2000-01-23T04:56:07.000Z",
"datasource": "datasource",
"author": { "name": "name" },
"documentId": "documentId",
"updateTime": "2000-01-23T04:56:07.000Z",
"mimeType": "mimeType",
"objectType": "Feature Request",
"components": ["Backend", "Networking"],
"status": ["Done"],
"customData": { "someCustomField": "someCustomValue" }
}
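Since higher-level ancestors appear earlier in the ancestors list, a display path can be built by joining titles in list order. This is a minimal sketch; `breadcrumb` is a hypothetical helper, the sample titles are invented, and the fallback label covers ancestors that are not indexed and so may lack a title.

```python
def breadcrumb(ancestors, leaf_title, separator=" / "):
    """Join ancestor titles (highest level first) with the document title."""
    # Ancestors that are not indexed may have no title, so use a placeholder.
    parts = [a.get("title") or "(untitled)" for a in ancestors]
    parts.append(leaf_title)
    return separator.join(parts)


# Hypothetical ancestors, ordered from highest level to direct parent.
path = breadcrumb(
    [{"title": "Engineering Drive"}, {"title": "Design Docs"}, {}],
    "Search ranking proposal",
)
```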
A list of content sub-sections in the document, e.g. text blocks with different headings in a Drive doc or Confluence page.
{
"container": "container",
"parentId": "JIRA_EN-1337",
"createTime": "2000-01-23T04:56:07.000Z",
"datasource": "datasource",
"author": { "name": "name" },
"documentId": "documentId",
"updateTime": "2000-01-23T04:56:07.000Z",
"mimeType": "mimeType",
"objectType": "Feature Request",
"components": ["Backend", "Networking"],
"status": ["Done"],
"customData": { "someCustomField": "someCustomValue" }
}
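The createTime and updateTime values in the example above are ISO 8601 strings with a trailing Z. A small sketch of parsing them into timezone-aware datetimes follows; `parse_timestamp` is a hypothetical helper, and the Z replacement is there because `datetime.fromisoformat` only accepts a trailing Z from Python 3.11 onward.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2000-01-23T04:56:07.000Z."""
    # Rewrite the UTC designator so older Python versions can parse it.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


created = parse_timestamp("2000-01-23T04:56:07.000Z")
updated = parse_timestamp("2000-01-23T04:56:07.000Z")
unchanged_since_creation = created == updated
```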