Neural Sparse Search tool
Introduced 2.12
This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated GitHub issue.
The NeuralSparseSearchTool performs sparse vector retrieval. For more information about neural sparse search, see Neural sparse search.
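As a rough illustration of what sparse vector retrieval means, the following sketch scores a document against a query by summing the products of the weights of the tokens they share. The `sparse_dot` helper and all token weights are hypothetical and chosen for illustration only. In practice, the sparse encoding model produces the weights, and the scoring that OpenSearch applies may differ in detail.

```python
from typing import Dict


def sparse_dot(query: Dict[str, float], doc: Dict[str, float]) -> float:
    """Sum the weight products over tokens present in both sparse vectors."""
    return sum(weight * doc[token] for token, weight in query.items() if token in doc)


# Hypothetical token weights, not real model output.
query_vec = {"employees": 2.0, "company": 0.5}
doc_vec = {"company": 1.0, "employees": 1.5, "staff": 1.0}

score = sparse_dot(query_vec, doc_vec)  # 2.0 * 1.5 + 0.5 * 1.0 = 3.5
```

Tokens that appear in only one of the two vectors (such as `staff` here) contribute nothing to the score, which is why most of each vector can stay empty.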
Step 1: Register and deploy a sparse encoding model
OpenSearch supports several pretrained sparse encoding models. You can either use one of those models or your own custom model. For a list of supported pretrained models, see Sparse encoding models. For more information, see OpenSearch-provided pretrained models and Custom local models.
In this example, you’ll use the amazon/neural-sparse/opensearch-neural-sparse-encoding-v1 pretrained model for both ingestion and search. To register and deploy the model to OpenSearch, send the following request:
POST /_plugins/_ml/models/_register?deploy=true
{
"name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
"version": "1.0.1",
"model_format": "TORCH_SCRIPT"
}
OpenSearch responds with a task ID for the model registration and deployment task:
{
"task_id": "M_9KY40Bk4MTqirc5lP8",
"status": "CREATED"
}
You can monitor the status of the task by calling the Tasks API:
GET _plugins/_ml/tasks/M_9KY40Bk4MTqirc5lP8
Once the model is registered and deployed, the task state changes to COMPLETED and OpenSearch returns a model ID for the model:
{
"model_id": "Nf9KY40Bk4MTqirc6FO7",
"task_type": "REGISTER_MODEL",
"function_name": "SPARSE_ENCODING",
"state": "COMPLETED",
"worker_node": [
"UyQSTQ3nTFa3IP6IdFKoug"
],
"create_time": 1706767869692,
"last_update_time": 1706767935556,
"is_async": true
}
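If you script this step, you can poll the Tasks API until the state is COMPLETED and then read the model ID from the response. The following is a minimal sketch using only the Python standard library. The helper names, the `http://localhost:9200` base URL (a local cluster without authentication), the polling interval, and the treatment of `FAILED` as a terminal state are assumptions, not part of the original example.

```python
import json
import time
import urllib.request
from typing import Callable, Dict


def fetch_task(task_id: str, base_url: str = "http://localhost:9200") -> Dict:
    """Fetch one ML task document. base_url is an assumption (local cluster, no auth)."""
    with urllib.request.urlopen(f"{base_url}/_plugins/_ml/tasks/{task_id}") as resp:
        return json.load(resp)


def wait_for_model_id(
    task_id: str,
    fetch: Callable[[str], Dict] = fetch_task,
    interval_s: float = 5.0,
    max_polls: int = 60,
) -> str:
    """Poll the task until its state is COMPLETED, then return the model ID."""
    for _ in range(max_polls):
        task = fetch(task_id)
        state = task.get("state")
        if state == "COMPLETED":
            return task["model_id"]
        if state == "FAILED":  # assumed terminal failure state
            raise RuntimeError(f"Task {task_id} failed: {task}")
        time.sleep(interval_s)
    raise TimeoutError(f"Task {task_id} did not complete after {max_polls} polls")
```

The `fetch` argument is injectable so that the polling logic can be exercised without a running cluster. On a secured cluster, you would replace `fetch_task` with a function that adds credentials and TLS settings.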
Step 2: Ingest data into an index
First, you’ll set up an ingest pipeline to encode documents using the sparse encoding model set up in the previous step:
PUT /_ingest/pipeline/pipeline-sparse
{
"description": "A sparse encoding ingest pipeline",
"processors": [
{
"sparse_encoding": {
"model_id": "Nf9KY40Bk4MTqirc6FO7",
"field_map": {
"passage_text": "passage_embedding"
}
}
}
]
}
Next, create an index specifying the pipeline as the default pipeline:
PUT index_for_neural_sparse
{
"settings": {
"default_pipeline": "pipeline-sparse"
},
"mappings": {
"properties": {
"passage_embedding": {
"type": "rank_features"
},
"passage_text": {
"type": "text"
}
}
}
}
Last, ingest data into the index by sending a bulk request:
POST _bulk
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "1" } }
{ "passage_text" : "company AAA has a history of 123 years" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "2" } }
{ "passage_text" : "company AAA has over 7000 employees" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "3" } }
{ "passage_text" : "Jack and Mark established company AAA" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "4" } }
{ "passage_text" : "company AAA has a net profit of 13 millions in 2022" }
{ "index" : { "_index" : "index_for_neural_sparse", "_id" : "5" } }
{ "passage_text" : "company AAA focus on the large language models domain" }
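When the passages come from a program rather than being typed by hand, the bulk body can be generated. The following sketch builds the same kind of newline-delimited body as the preceding request. The `build_bulk_body` helper is hypothetical, and it assumes that sequential string IDs starting at 1 are acceptable for your data.

```python
import json
from typing import Iterable


def build_bulk_body(index: str, passages: Iterable[str], text_field: str = "passage_text") -> str:
    """Return a bulk request body: one action line and one document line per passage."""
    lines = []
    for doc_id, text in enumerate(passages, start=1):
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        lines.append(json.dumps({text_field: text}))
    # The Bulk API expects the body to end with a newline character.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    "index_for_neural_sparse",
    ["company AAA has a history of 123 years", "company AAA has over 7000 employees"],
)
```

Because the index sets `pipeline-sparse` as its default pipeline, documents sent this way only need the `passage_text` field. The pipeline adds `passage_embedding` at ingestion time.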
Step 3: Register a flow agent that will run the NeuralSparseSearchTool
A flow agent runs a sequence of tools in order and returns the last tool’s output. To create a flow agent, send the following request, providing the model ID for the model set up in Step 1. This model will encode your queries into sparse vector embeddings:
POST /_plugins/_ml/agents/_register
{
"name": "Test_Neural_Sparse_Agent_For_RAG",
"type": "flow",
"tools": [
{
"type": "NeuralSparseSearchTool",
"parameters": {
"description":"use this tool to search data from the knowledge base of company AAA",
"model_id": "Nf9KY40Bk4MTqirc6FO7",
"index": "index_for_neural_sparse",
"embedding_field": "passage_embedding",
"source_field": ["passage_text"],
"input": "${parameters.question}",
"doc_size":2
}
}
]
}
For parameter descriptions, see Register parameters.
OpenSearch responds with an agent ID:
{
"agent_id": "9X7xWI0Bpc3sThaJdY9i"
}
Step 4: Run the agent
Before you run the agent, make sure that you add the OpenSearch Dashboards Sample web logs dataset. To learn more, see Adding sample data.
Then, run the agent by sending the following request:
POST /_plugins/_ml/agents/9X7xWI0Bpc3sThaJdY9i/_execute
{
"parameters": {
"question":"how many employees does AAA have?"
}
}
OpenSearch returns the inference results:
{
"inference_results": [
{
"output": [
{
"name": "response",
"result": """{"_index":"index_for_neural_sparse","_source":{"passage_text":"company AAA has over 7000 employees"},"_id":"2","_score":30.586042}
{"_index":"index_for_neural_sparse","_source":{"passage_text":"company AAA has a history of 123 years"},"_id":"1","_score":16.088133}
"""
}
]
}
]
}
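The `result` value in this response is a single string in which each matching document is a JSON object on its own line. The following sketch turns that string into a list of dictionaries. The `parse_tool_result` helper is hypothetical, and it assumes that every hit occupies exactly one line, as in the sample output above.

```python
import json
from typing import Dict, List


def parse_tool_result(result: str) -> List[Dict]:
    """Parse the tool's newline-delimited result string into one dict per hit."""
    return [json.loads(line) for line in result.splitlines() if line.strip()]


def top_passages(result: str, field: str = "passage_text") -> List[str]:
    """Return the requested source field from each hit, in the order returned."""
    return [hit["_source"][field] for hit in parse_tool_result(result)]
```

Each parsed hit keeps the `_index`, `_id`, `_score`, and `_source` keys shown in the response, so downstream code can filter or rerank on `_score` if needed.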
Register parameters
The following table lists all tool parameters that are available when registering an agent.
Parameter | Type | Required/Optional | Description |
---|---|---|---|
model_id | String | Required | The model ID of the sparse encoding model to use at search time. |
index | String | Required | The index to search. |
embedding_field | String | Required | When the neural sparse model encodes raw text documents, the encoding result is saved in a field. Specify this field as the embedding_field. Neural sparse search matches documents to the query by calculating the similarity score between the query text and the text in the document’s embedding_field. |
source_field | String | Required | The document field or fields to return. You can provide a list of multiple fields as an array of strings, for example, ["field1", "field2"]. |
input | String | Required for flow agent | Runtime input sourced from flow agent parameters. If using a large language model (LLM), this field is populated with the LLM response. |
name | String | Optional | The tool name. Useful when an LLM needs to select an appropriate tool for a task. |
description | String | Optional | A description of the tool. Useful when an LLM needs to select an appropriate tool for a task. |
doc_size | Integer | Optional | The number of documents to fetch. Default is 2. |
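The required and optional parameters in the preceding table can be checked before a registration request is sent. The following sketch builds the tool entry for a flow agent and rejects a configuration that is missing a required parameter. The `build_neural_sparse_tool` helper is hypothetical. It treats `input` as required because the example registers a flow agent, and it fills in the documented `doc_size` default explicitly.

```python
from typing import Any, Dict

# Parameters marked Required in the table (input is required for flow agents).
REQUIRED_PARAMETERS = ("model_id", "index", "embedding_field", "source_field", "input")


def build_neural_sparse_tool(parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Return a tool entry for the agent's tools array, validating required parameters."""
    missing = [key for key in REQUIRED_PARAMETERS if key not in parameters]
    if missing:
        raise ValueError(f"Missing required tool parameters: {', '.join(missing)}")
    merged = dict(parameters)
    merged.setdefault("doc_size", 2)  # documented default
    return {"type": "NeuralSparseSearchTool", "parameters": merged}


tool = build_neural_sparse_tool({
    "model_id": "Nf9KY40Bk4MTqirc6FO7",
    "index": "index_for_neural_sparse",
    "embedding_field": "passage_embedding",
    "source_field": ["passage_text"],
    "input": "${parameters.question}",
})
```

The returned dictionary can be placed in the `tools` array of the agent registration body shown in Step 3.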
Execute parameters
The following table lists all tool parameters that are available when running the agent.
Parameter | Type | Required/Optional | Description |
---|---|---|---|
question | String | Required | The natural language question to send to the LLM. |