Bulk Upload Model
The bulk upload endpoints delete all entries that are not part of the most recent upload. For example, the /bulkindexdocuments
endpoint deletes all documents that are not present in the most recent upload.
Concurrent uploads are not allowed, i.e., you cannot start a new upload before the previous upload has finished.
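
The deletion rule above amounts to a set difference between what is currently indexed and what the latest upload contained. The following is a minimal sketch of that rule only, not the service's implementation; the function name `stale_ids` and the use of plain string IDs are assumptions made for illustration.

```python
from typing import Iterable, Set


def stale_ids(indexed_ids: Iterable[str], latest_upload_ids: Iterable[str]) -> Set[str]:
    """Return the IDs that would be treated as stale after a bulk upload.

    Hypothetical helper: an entry is stale when it is currently indexed
    but was not included in the most recent upload.
    """
    return set(indexed_ids) - set(latest_upload_ids)


# "a" was indexed earlier but is missing from the latest upload, so it is stale.
# "d" is new in the latest upload and is therefore not stale.
to_delete = stale_ids(["a", "b", "c"], ["b", "c", "d"])
```

Because every entry missing from the latest upload is removed, a bulk upload should always contain the complete set of entries you want to keep, not just the changed ones.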
Some fields are common to all bulk upload endpoints. They are described below:
Field Name | Description |
---|---|
uploadId | The ID that uniquely identifies an upload. Every paginated request that belongs to the same upload must carry the same uploadId, and each new upload needs its own unique uploadId. |
isFirstPage | Indicates whether the page being uploaded is the first page. Must be true for the first request of an upload and false for all subsequent requests. |
isLastPage | Indicates whether the page being uploaded is the last page. Must be true for the last request of an upload and false for all other requests. |
forceRestartUpload | Required if you want to start a new upload while the previous upload has not finished or has failed. If the previous upload was unsuccessful and this flag is not set, the request fails. |
disableStaleDocumentDeletionCheck | The /bulkindexdocuments endpoint asynchronously deletes all documents that were not part of the most recent upload session. An erroneous bulk upload can therefore wipe too many documents by accident. To mitigate this, a deletion check pauses the deletion of stale documents for 7 days if the percentage of documents being deleted exceeds 20%. If you intentionally want to delete more than 20% of your previously uploaded documents, specify disableStaleDocumentDeletionCheck = true, which disables this check and allows the documents to be deleted. Note that documents are deleted asynchronously. If you want deletions to take effect immediately, use the /processalldocuments endpoint. |
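
As an illustration of how the fields in the table fit together, here is a minimal sketch that splits a document list into page payloads and mirrors the 20% deletion check. It only builds request bodies and sends nothing. The helper names `build_bulk_pages` and `deletion_check_pauses`, the `documents` payload key, and the choices to set forceRestartUpload only on the first page and disableStaleDocumentDeletionCheck on every page are assumptions for illustration, not confirmed API behavior. The sketch also assumes the percentage is measured against the previously uploaded document count.

```python
import uuid
from typing import Any, Dict, List, Optional


def build_bulk_pages(
    documents: List[Dict[str, Any]],
    page_size: int,
    upload_id: Optional[str] = None,
    force_restart: bool = False,
    disable_stale_check: bool = False,
) -> List[Dict[str, Any]]:
    """Build one request body per page for a single bulk upload (hypothetical helper)."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    if not documents:
        # An empty upload would mark every existing document as stale.
        raise ValueError("refusing to build an empty bulk upload")

    # All pages of one upload share the same uploadId.
    upload_id = upload_id or f"upload-{uuid.uuid4()}"
    chunks = [documents[i:i + page_size] for i in range(0, len(documents), page_size)]

    pages: List[Dict[str, Any]] = []
    for index, chunk in enumerate(chunks):
        is_first = index == 0
        page: Dict[str, Any] = {
            "uploadId": upload_id,
            "isFirstPage": is_first,
            "isLastPage": index == len(chunks) - 1,
            "documents": chunk,  # assumed key name for the page contents
        }
        if is_first and force_restart:
            # Assumption: the restart flag is sent with the first page only.
            page["forceRestartUpload"] = True
        if disable_stale_check:
            # Assumption: the flag is repeated on every page.
            page["disableStaleDocumentDeletionCheck"] = True
        pages.append(page)
    return pages


def deletion_check_pauses(previous_count: int, stale_count: int, threshold: float = 0.20) -> bool:
    """Return True when the stale share exceeds the threshold (strictly greater)."""
    if previous_count <= 0:
        return False
    return stale_count / previous_count > threshold
```

Under these assumptions, five documents with a page size of 2 yield three pages: only the first has isFirstPage set to true, and only the last has isLastPage set to true. With 100 previously uploaded documents, 21 stale documents would pause deletion, while exactly 20 would not, because the check applies only when the percentage exceeds 20%.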