Helm Parameter | Environment Variable | Default Value | Description |
---|---|---|---|
deeplake_creds | DEEPLAKE_CREDS | - | refer to the Deeplake storage credentials section below |
deeplake_root_dir | DEEPLAKE_ROOT_DIR | Helm: required; docker-compose: /var/lib/deeplake | storage path used by Deeplake for data operations |
postgres_database | POSTGRES_DATABASE | neohorizon | virtual database name used by the app; must be created beforehand |
postgres_host | POSTGRES_HOST | k8s service host of the postgres dependency | postgres database hostname (required) |
postgres_password | POSTGRES_PASSWORD | postgres | postgres database user password |
postgres_user | POSTGRES_USER | postgres | postgres database username |
postgres_port | POSTGRES_PORT | 5432 | postgres database port |
rabbitmq_url | RABBITMQ_URL | - | RabbitMQ AMQP URL; by default built from the dependency installation (required) |
al_api_token | AL_API_TOKEN | - (required) | API token used to authenticate against the deployed API |
gemini_api_key | GEMINI_API_KEY | - | optional; needed only for Gemini requests |
openai_api_key | OPENAI_API_KEY | - | needed for query generation |
text_image__matrix_of_embeddings__ingestion_url | TEXT_IMAGE__MATRIX_OF_EMBEDDINGS__INGESTION_URL | - | full URL of the Triton inference endpoint |
text_image__matrix_of_embeddings__query_url | TEXT_IMAGE__MATRIX_OF_EMBEDDINGS__QUERY_URL | - | full URL of the Triton inference endpoint |
text_image__embedding__ingestion_url | TEXT_IMAGE__EMBEDDING__INGESTION_URL | - | full URL of the Triton inference endpoint |
text_image__embedding__query_url | TEXT_IMAGE__EMBEDDING__QUERY_URL | - | full URL of the Triton inference endpoint |
text__embedding__ingestion_url | TEXT__EMBEDDING__INGESTION_URL | - | full URL of the Triton inference endpoint |
text__embedding__query_url | TEXT__EMBEDDING__QUERY_URL | - | full URL of the Triton inference endpoint |
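For a docker-compose deployment these settings map directly onto the container's environment. The snippet below is only a minimal sketch, assuming a service named `neohorizon-api` and sibling `postgres`/`rabbitmq` services; none of the service names or example values are taken from the shipped compose file and must be adapted to your setup.

```yaml
# Sketch of a docker-compose override; service and host names are
# placeholders and must match your actual compose setup.
services:
  neohorizon-api:
    environment:
      DEEPLAKE_ROOT_DIR: /var/lib/deeplake
      POSTGRES_HOST: postgres
      POSTGRES_PORT: "5432"
      POSTGRES_DATABASE: neohorizon
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      RABBITMQ_URL: amqp://guest:guest@rabbitmq:5672/
      AL_API_TOKEN: change-me          # required
      OPENAI_API_KEY: your-openai-key  # needed for query generation
```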
Neohorizon works with Triton-served models for embedding generation, both for queries and for ingestion. Both the Helm chart and docker-compose provide options to run the models. The models are described below:
Models that Activeloop uses
- colnomic: can be used for ingestion/retrieval of images; at least 16GiB RAM and an A100 GPU are recommended
- inf-retriever-v1: can be used for ingestion/retrieval of text; at least 4GiB RAM and an A10/L4 GPU are recommended
- doclayout_parser: can be used to generate images for answers; at least 4GiB RAM and an A10/L4 GPU are recommended
- qwen_06B: can be used for ingestion/retrieval; at least 4GiB RAM and an A10/L4 GPU are recommended
Note: any custom model can be used with Neohorizon; the only requirements are that the model is served with Triton and that its full inference URL is set in the deployment environment variables.
For both the Helm chart and docker-compose, the default configuration should be reviewed and adjusted to match the models you deploy. The Helm cases below illustrate typical setups; a docker-compose sketch follows them.
- case 1: only the image model is deployed; the values override YAML would look like:
```yaml
...
global:
  config:
    text_image__matrix_of_embeddings__ingestion_url: http://activeloop-neohorizon-models-svc/v2/models/colnomic/infer
    text_image__matrix_of_embeddings__query_url: http://activeloop-neohorizon-models-svc/v2/models/colnomic/infer
    text_image__embedding__ingestion_url: http://activeloop-neohorizon-models-svc/v2/models/colnomic/infer
    text_image__embedding__query_url: http://activeloop-neohorizon-models-svc/v2/models/colnomic/infer
...
models:
  - name: models
    load_models:
      - colnomic
```
- case 2: all models are deployed with a single deployment:
```yaml
...
global:
  config:
    text_image__matrix_of_embeddings__ingestion_url: http://activeloop-neohorizon-models-svc/v2/models/colnomic/infer
    text_image__matrix_of_embeddings__query_url: http://activeloop-neohorizon-models-svc/v2/models/colnomic/infer
    text_image__embedding__ingestion_url: http://activeloop-neohorizon-models-svc/v2/models/colnomic/infer
    text_image__embedding__query_url: http://activeloop-neohorizon-models-svc/v2/models/colnomic/infer
    text__embedding__ingestion_url: http://activeloop-neohorizon-models-svc/v2/models/inf-retriever-v1/infer
    text__embedding__query_url: http://activeloop-neohorizon-models-svc/v2/models/inf-retriever-v1/infer
...
models:
  - name: models
    load_models:
      - colnomic
      - inf-retriever-v1
```
- case 3: separate deployments for the image and text models:
```yaml
...
global:
  config:
    text_image__matrix_of_embeddings__ingestion_url: http://activeloop-neohorizon-colnomic-svc/v2/models/colnomic/infer
    text_image__matrix_of_embeddings__query_url: http://activeloop-neohorizon-colnomic-svc/v2/models/colnomic/infer
    text_image__embedding__ingestion_url: http://activeloop-neohorizon-colnomic-svc/v2/models/colnomic/infer
    text_image__embedding__query_url: http://activeloop-neohorizon-colnomic-svc/v2/models/colnomic/infer
    text__embedding__ingestion_url: http://activeloop-neohorizon-text-svc/v2/models/inf-retriever-v1/infer
    text__embedding__query_url: http://activeloop-neohorizon-text-svc/v2/models/inf-retriever-v1/infer
...
models:
  - name: colnomic
    load_models:
      - colnomic
  ...
  - name: text
    load_models:
      - inf-retriever-v1
```
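The cases above use Helm values; for docker-compose the same endpoints are supplied through the environment variables listed in the table. Below is a minimal sketch, assuming a Triton container reachable as `triton` on port 8000; both the host name and the port are placeholders that must match the model service defined in your compose file.

```yaml
# Sketch only: the Triton host name and port are assumptions and must
# match the model service in your actual compose setup.
services:
  neohorizon-api:
    environment:
      TEXT_IMAGE__MATRIX_OF_EMBEDDINGS__INGESTION_URL: http://triton:8000/v2/models/colnomic/infer
      TEXT_IMAGE__MATRIX_OF_EMBEDDINGS__QUERY_URL: http://triton:8000/v2/models/colnomic/infer
      TEXT_IMAGE__EMBEDDING__INGESTION_URL: http://triton:8000/v2/models/colnomic/infer
      TEXT_IMAGE__EMBEDDING__QUERY_URL: http://triton:8000/v2/models/colnomic/infer
      TEXT__EMBEDDING__INGESTION_URL: http://triton:8000/v2/models/inf-retriever-v1/infer
      TEXT__EMBEDDING__QUERY_URL: http://triton:8000/v2/models/inf-retriever-v1/infer
```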
If cloud storage is used (S3, GCS, Azure Blob Storage) and the underlying infrastructure does not provide out-of-the-box authentication to the storage, static credentials must be supplied as environment variables so Deeplake can perform storage operations. To pass credentials to Deeplake, use the DEEPLAKE_CREDS environment variable or the corresponding cloud SDK environment variables. DEEPLAKE_CREDS must be a string-serialized dictionary with the cloud credentials; examples are given below.
- AWS:
{ "profile_name": "AWS_PROFILE", "endpoint_url": "https://s3.customerendpoint.com", // OPTIONAL "region_name": "AWS_REGION" // OPTIONAL }
or
{ "aws_role_arn": "AWS_ROLE_ARN", "aws_session_name": "session-name-for-assume-role", "aws_external_id": "external-id-for-assume-role", "endpoint_url": "https://s3.customerendpoint.com", // OPTIONAL "region_name": "AWS_REGION" // OPTIONAL }
- AZURE:
{ "azure_client_id": "AZURE_CLIENT_ID", "azure_client_secret": "AZURE_CLIENT_SECRET", "azure_tenant_id": "AZURE_TENANT" }
or
{ "sas_token": "azure-storage-sas-token" }
or
{ "account_name": "...", "container_name": "...", "account_key": "...", }
- GCP:
{ "json_credentials": "SERVICE_ACCOUNT_JSON_KEY" }