Static corpus transforms¶

When working with static corpuses, you can apply the following types of transforms:

Query transforms to fine-tune the Agentic Interface reasoning and refine its decision-making logic
Output transforms to modify and format responses provided by the Agentic Interface

Query transforms¶

Query transforms used with static corpuses allow you to refine the Agentic Interface’s decision-making and direct it to the appropriate data corpus. To adjust the AI reasoning process, you can:

Instruct the Agentic Interface which corpus to use for particular types of queries
Instruct the Agentic Interface to omit specific data corpuses

Example of use¶

Assume you have two corpuses that provide information about HTTP requests:

HTTP basics
HTTP status codes

Dialog script¶

corpus({
    title: `HTTP basics`,
    urls: [`https://www.tutorialspoint.com/http/http_responses.htm`],
    depth: 1,
    maxPages: 10,
    priority: 1,
});

corpus({
    title: `HTTP status codes`,
    urls: [`https://developer.mozilla.org/en-US/docs/Web/HTTP/Status`],
    depth: 1,
    maxPages: 1,
    priority: 2,
});

The Agentic Interface should retrieve information from corpuses in the following way:

If the user asks question about status codes, the Agentic Interface should use the HTTP status codes corpus.
For all other questions, the Agentic Interface should refer to the HTTP basics corpus.

Directing to a specific corpus

To direct the Agentic Interface to a specific corpus:

In the Agentic Interface project, under Transforms, create the code_queries transform with the following data:
1. In the Instruction field, provide general instructions on how to handle status code queries:
  Instruction¶
  If a question relates to HTTP status codes, convert it to JSON with an array of search queries and the question; else generate null.
2. In the Examples section, add an example:
  - At the top of the Query field, select the data format: text. In the field below, enter the user query: What does code 401 mean?
  - At the top of the Result field, select the data format: json. In the field below, add the result query the Agentic Interface should generate:
  Transform example¶
  { "search": [ "Code 401", "401 meaning", "401 error" ], "question": "What does code 401 mean?" }

To the HTTP status codes corpus, add the query parameter and define the created transform in it:

Dialog script¶

corpus({
    title: `HTTP status codes`,
    urls: [`https://developer.mozilla.org/en-US/docs/Web/HTTP/Status`],
    query: transforms.code_queries,
    depth: 1,
    maxPages: 1,
    priority: 2,
});

Now, the Agentic Interface will use the HTTP status codes corpus to answer questions about status codes.

../../../_images/corpus-query-transforms-direct-result.png

Omitting a specific corpus

To instruct the Agentic Interface to omit a specific corpus for particular types of queries:

In the Agentic Interface project, under Transforms, create the basic_queries transform with the following data:
1. In the Instruction field, provide general instructions on how to handle status code queries:
  Instruction¶
  If a question relates to status codes, generate null.
2. In the Examples section, add an example:
  - At the top of the Query field, select the data format: text. In the field below, enter the user query: What is a status code?.
  - At the top of the Result field, select the data format: json. In the field below, add the reasoning for the Agentic Interface:
  Transform example¶
  <thinking> This query is about status codes. </thinking> null

To the HTTP basics corpus, add the query parameter and define the created transform in it:

Dialog script¶

corpus({
    title: `HTTP basics`,
    urls: [`https://www.tutorialspoint.com/http/http_responses.htm`],
    query: transforms.basic_queries,
    depth: 1,
    maxPages: 10,
    priority: 1,
});

Now, the Agentic Interface will use only the HTTP status codes corpus to answer questions about status codes.

Output transforms¶

Assume you want the Agentic Interface to format responses in a specific way. For this, you can do the following:

To the dialog script, add the corpus() function, specify the data source and name of the transform that will be used to format the answer. In this example, we will use a transform named output.
Dialog script¶
```
corpus({
    urls: [
        `https://developer.mozilla.org/en-US/docs/Web/HTTP/Status`,
   ],
   transforms: transforms.output,
   maxPages: 3,
   depth: 1
});
```
In the Agentic Interface project, under Transforms, create the output transform with the following data:
1. In the Instruction field, provide general instructions describing transform example fields:
  Instruction¶
  Input is the source text, query is the user query, result describes the final text formatted in Markdown.
2. In the Examples section, set the Result data format to markdown.
Use the Debugging Chat to ask questions you want to be transformed, for example: What does code 304 mean?
In the code editor, open the output transform and in the top right corner, click History.
The Transforms Explorer displays the results of data transformation. To the right of the row, click the plus icon to add the result to the output examples.
In the Result field, edit the Full Answer and Links sections to format the output as needed.

Now, you can ask questions about the status codes and see how the set formatting rules are applied to the output:

../../../_images/transforms-corpus-testing.png