POST /datasets/upload
Upload Dataset
curl --request POST \
  --url https://api.kontext.dev/datasets/upload \
  --header 'x-api-key: <x-api-key>' \
  --form 'file=@<path-to-file>'

Overview

This endpoint accepts a single file as a multipart/form-data upload. If the X-As-User header maps to a valid OAuthToken(userId, developerId), the dataset is anchored to that user as a user dataset; otherwise it is stored as a population dataset.
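To make the request shape concrete, here is a minimal sketch in Python using only the standard library. It builds the multipart request without sending it. The helper name build_upload_request and its parameters are hypothetical, and the default part type of text/plain is an assumption chosen to match the documented text/* allowance.

```python
import uuid
import urllib.request

UPLOAD_URL = "https://api.kontext.dev/datasets/upload"


def build_upload_request(filename, content, api_key, as_user=None,
                         content_type="text/plain"):
    """Build (but do not send) a multipart/form-data upload request.

    Hypothetical helper: `content` is the file body as bytes, and
    `as_user` is the optional userId sent in the X-As-User header.
    """
    boundary = uuid.uuid4().hex
    # One form part named "file", as listed under Form Data.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (f'Content-Disposition: form-data; name="file"; '
         f'filename="{filename}"\r\n').encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "x-api-key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    # Omitting X-As-User should yield a population dataset.
    if as_user is not None:
        headers["X-As-User"] = as_user
    return urllib.request.Request(UPLOAD_URL, data=body,
                                  headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would perform the upload. Whether the server treats an invalid X-As-User value as an error or falls back to a population dataset is not specified beyond the sentence above.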

Authentication

x-api-key (string, required): Your API key.
X-As-User (string, optional): userId to index the upload as a specific user. The value is validated.

Form Data

file (file, required): File to upload.

Response

{ "datasetId": "ds_...", "jobId": "job_...", "status": "pending", "itemId": "it_...", "anchorUserId": "user_abc123" }
Notes:
  • anchorUserId echoes the validated user anchor; it is null for population datasets.
  • Facts are only extracted for user‑anchored datasets; population datasets are indexed for vector search.
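The notes above can be sketched as a small response reader. This is an illustration only: summarize_upload and the keys kind and facts_expected are hypothetical names, and the field names are taken from the sample response shown in this section.

```python
import json


def summarize_upload(response_text):
    """Classify an upload response as a user or population dataset.

    Hypothetical helper based on the sample response: anchorUserId is
    null (None after parsing) for population datasets.
    """
    data = json.loads(response_text)
    anchor = data.get("anchorUserId")
    kind = "user" if anchor is not None else "population"
    return {
        "datasetId": data["datasetId"],
        "jobId": data["jobId"],
        "status": data["status"],
        "kind": kind,
        # Facts are only extracted for user-anchored datasets.
        "facts_expected": kind == "user",
    }
```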

Examples

curl -X POST https://api.kontext.dev/datasets/upload \
  -H "x-api-key: ktext_abc123..." \
  -H "X-As-User: user_abc123" \
  -F "file=@notes.txt;type=text/plain"

Limits

  • Size cap: MAX_UPLOAD_SIZE_MB (default 10 MB) → 413 if exceeded
  • Types: text/*, application/json → 415 if unsupported
  • GCS required: If GCS credentials/bucket are not configured, returns 500 (upload rejected)
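A client can check the size and type limits before sending anything. The sketch below is a rough local approximation, not the server's logic. The name precheck_upload is hypothetical. It assumes 1 MB means 1024 × 1024 bytes and that the type is judged from the file extension, and the server may decide either differently.

```python
import mimetypes

# Documented default; the server's MAX_UPLOAD_SIZE_MB may be set differently.
MAX_UPLOAD_SIZE_MB = 10


def precheck_upload(filename, size_bytes, max_mb=MAX_UPLOAD_SIZE_MB):
    """Return the status code the upload would likely hit, or None if it
    passes the local checks. Hypothetical helper."""
    # Assumption: 1 MB = 1024 * 1024 bytes.
    if size_bytes > max_mb * 1024 * 1024:
        return 413
    # Assumption: the type is inferred from the file extension.
    mime, _ = mimetypes.guess_type(filename)
    allowed = mime is not None and (
        mime.startswith("text/") or mime == "application/json"
    )
    if not allowed:
        return 415
    return None
```

A None result only means the file passes these local checks. The GCS condition cannot be verified from the client side.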

Errors

  • 401 UNAUTHORIZED — missing/invalid x-api-key
  • 413 PAYLOAD TOO LARGE — file exceeds size cap
  • 415 UNSUPPORTED MEDIA TYPE — unsupported type (only text/*, application/json)
  • 500 INTERNAL SERVER ERROR — GCS not configured or upload failure
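One way to surface these errors in client code is a small lookup keyed by status code. This is a sketch: UploadError and raise_for_upload_status are hypothetical names, and treating any 2xx status as success is an assumption, since this section does not state the success code.

```python
# Reasons copied from the error list above.
UPLOAD_ERRORS = {
    401: "missing or invalid x-api-key",
    413: "file exceeds the size cap",
    415: "unsupported type (only text/* and application/json)",
    500: "GCS not configured or upload failure",
}


class UploadError(Exception):
    """Hypothetical exception carrying the HTTP status and documented reason."""

    def __init__(self, status, reason):
        super().__init__(f"{status}: {reason}")
        self.status = status
        self.reason = reason


def raise_for_upload_status(status):
    """Raise UploadError for any non-2xx status; return None otherwise."""
    # Assumption: any 2xx status means the upload was accepted.
    if 200 <= status < 300:
        return
    raise UploadError(status, UPLOAD_ERRORS.get(status, "undocumented error"))
```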

Next steps

  • Check status: /trpc/datasets.getStatus
  • Facts‑first QA: /trpc/datasets.query
  • Search Dataset: /trpc/datasets.search