Data transformation

Note

The transforms functionality is available in SLU 4.2 and later.

Data preprocessing and transformation are important steps in AI assistant design. When working with raw or semi-structured data, you may need to pass this data through a series of transforms to prepare it for interactions. For example, you may want to modify the input data, format it and enrich it before it is displayed in the Alan AI Chat or used in the dialog logic.

Alan AI supports the following formats for data transformations:

  • Input format: plain text, markdown-formatted text, HTML and JSON

  • Output format: plain text, markdown-formatted text, HTML and JSON

To apply transformations to the input data, you need to perform the following steps:

  1. Add transforms to your AI assistant project

  2. Use transforms in the script

Adding transforms

In Alan AI, transforms are templates or predefined rules that must be applied to the input data to transform it to the desired output. Transforms describe the type and structure of the input data, a prompt to Alan AI on how this data must be transformed and the format of the output data.

To add a transform to your AI assistant project:

  1. In the left pane of Alan AI Studio, under Transforms, click Add and enter the name for the transform.

  2. In the Common prompt field, enter a transform prompt. The transform prompt is a short sentence that provides specific instructions on how Alan AI should interpret the transform rules.

    Note

    The common prompt is generic and applied to all examples you provide in the transform. You can extend this generic prompt with additional instructions added to each example.

  3. In the Examples section, add one or more examples showcasing how the input data must be transformed and choose the required input and output data formats.

    To edit transform data more conveniently, click the expand icon in the top left corner of any text field in the example. You can use the following options:

    • [For HTML and JSON data] To apply formatting rules and improve data visual structure, at the top of the text field, click Format HTML/Format JSON.

    • To edit data and code, in the top right corner of the text field, click Editor mode.

    • To preview the output of your data and visualize how it will appear, in the top right corner of the text field, click Split mode or Preview mode.

    • To expand and collapse specific text fields, in the top left corner of corresponding fields, click Expand panel/Collapse panel.

    ../../../_images/transforms-modes.png

    The rules provided in the example may vary depending on the input data and expected output. For example, to convert input JSON data to markdown-formatted text, you may add the following example:

    • Input: chunk of JSON-formatted input data

    • Query: instructions to Alan AI on how to interpret specific fields in the input JSON data

    • Result: example of output text formatted in markdown

  4. In the left pane of Alan AI Studio, click Save to save the transform.

    ../../../_images/transforms-common.png

Using transforms

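To apply the created transform for data preprocessing, use the transforms() function in the dialog script. The transform name follows transforms as a method name; the example below calls a transform named format: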

Dialog script
intent("Show user details", async p => {
    let u = await transforms.format({
        input: {"name": "John Smith", "age": "56", "address": "1234 Main Street"},
        query: "Name is the user's name, age is the user's age, address is the user's address"
    });
    p.play(u);
});

Here, the transforms() function takes the following parameters:

  • format: name of the transform to apply to the input data, written after transforms as the method name

  • input: input data to be transformed

  • query: instructions to Alan AI on how to interpret the input data or sample queries to be run
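The call shape described above can be tried outside Alan AI Studio with a stand-in object. The sketch below is only an illustration: makeStubTransforms and showUserDetails are hypothetical names, and the stub merely echoes its arguments. In a real dialog script, the transforms object is supplied by the Alan AI runtime and returns the transformed data.

```javascript
// Hypothetical stand-in for the runtime-provided `transforms` object.
// Each property is an async function named after a transform, mirroring
// the transforms.<name>({input, query}) call used in the dialog script.
function makeStubTransforms(names) {
    const stub = {};
    for (const name of names) {
        stub[name] = async ({input, query}) => {
            // A real transform returns the transformed data; the stub only
            // reports what it received so the call shape can be inspected.
            return `[${name}] query="${query}" input=${JSON.stringify(input)}`;
        };
    }
    return stub;
}

// Same structure as the intent handler above, minus the Alan AI runtime.
async function showUserDetails(transforms) {
    return transforms.format({
        input: {"name": "John Smith", "age": "56", "address": "1234 Main Street"},
        query: "Name is the user's name, age is the user's age, address is the user's address"
    });
}

showUserDetails(makeStubTransforms(["format"])).then(console.log);
```

Because the stub takes a list of names, the same approach can mimic any transform you have defined, which may be handy for exercising handler logic before connecting it to real data.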

Examples of use

Here are a few examples of how you can use the transforms in dialog scripts.

Let’s assume you have an infrastructure description presented as text. You want your AI assistant to answer questions about the infrastructure components and show their details as structured, formatted text in the Alan AI Chat.

For this, you can do the following:

  1. Add a transform named format with the following rules:

    Example 1

    • Common prompt: The input contains the infrastructure description in plain text, the query contains sample questions, the result contains formatted text

    • Input: Web Server VM (Web-01): hosting the company website, Ubuntu Server 20.04, 4 vCPUs, 8GB RAM, IP address 192.168.1.101, input format: text

    • Query: What is the Web-01 configuration?

    • Result: example of output data, output format: markdown:

      Result field
      ## VM name: Web Server VM (Web-01)
      
      - **Purpose**: Hosting the company website.
      - **Configuration**:
        - **OS**: Ubuntu Server 20.04
        - **Resources**: 4 vCPUs, 8GB RAM
        - **IP Address**: 192.168.1.101
      

    Example 2

    • Common prompt: The input contains the infrastructure description in plain text, the query contains sample questions, the result contains formatted text

    • Input: Web Server VM (Web-01): hosting the company website, Ubuntu Server 20.04, 4 vCPUs, 8GB RAM, IP address 192.168.1.101, input format: text

    • Query: What OS does the Web-01 VM run?

    • Result: example of output data: The **Web-01** runs **Ubuntu Server 20.04**, output format: markdown

    ../../../_images/transforms-format.png

  2. Add the following code to your dialog script:

    Dialog script
    let infrastructure = `
    
        Data Center: Main Data Center
    
        Virtual Machines:
        1. Web Server VM (Web-01): hosting the company website, Ubuntu Server 20.04, 4 vCPUs, 8GB RAM, IP address 192.168.1.101
        2. App Server VM (App-01): running business applications, CentOS 7, 2 vCPUs, 4GB RAM, IP address: 192.168.1.102
        3. Database Server VM (DB-01): hosting relational databases, Windows Server 2019, 8 vCPUs, 16GB RAM, IP address: 192.168.1.103
    `;
    
    intent("Show the $(VM* .+ ) configuration", async p => {
        p.play("Just a second...");
    
        // Transform infrastructure data
        let i = await transforms.format({
            input: infrastructure,
            query: 'What is the configuration of ' + p.VM.value
        });
        p.play(i);
    });
    
    intent(
        "(What|How many) $(RESOURCE* .+) does $(VM* .+) (run|have)?",
        "(What|How much|How many) $(RESOURCE* .+) (is|are) (available|) on $(VM* .+)?",
        async p => {
            p.play("Just a second...");
    
            // Transform infrastructure data
            let i = await transforms.format({
                input: infrastructure,
                query: p.RESOURCE.value + ' available on ' + p.VM.value
            });
            p.play(i);
        });
    

    Here, the infrastructure description in plain text is saved to infrastructure. When the user asks to show the configuration of a specific VM or asks about resources available on a specific VM, a corresponding intent is invoked and the text is transformed to the necessary format:

    ../../../_images/transforms-format-example.png
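    To see what query text each intent sends to the transform, the string concatenation can be reproduced in isolation. This is a minimal sketch: configQuery and resourceQuery are hypothetical helper names, and the sample values assume the VM and RESOURCE slots captured "Web-01", "RAM" and "DB-01" from the user's phrase.

```javascript
// Hypothetical helpers that reproduce the string concatenation used in the
// two intents above, so the resulting queries can be inspected in isolation.
function configQuery(vm) {
    return 'What is the configuration of ' + vm;
}

function resourceQuery(resource, vm) {
    return resource + ' available on ' + vm;
}

// In the dialog script, these values come from p.VM.value and p.RESOURCE.value.
console.log(configQuery('Web-01'));          // What is the configuration of Web-01
console.log(resourceQuery('RAM', 'DB-01'));  // RAM available on DB-01
```

    Note that the second query is a phrase fragment rather than a full question, so its wording differs from the sample queries given in the transform examples.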

Let’s assume you have a JSON-formatted list of product reviews. You want to pre-process this data to:

  • Present the data in a readable format using markdown

  • Extract pros and cons from the review text and present them as bulleted text

For this, you can do the following:

  1. Add a transform named summarize with the following rules:

    • Common prompt: The input contains the initial JSON, the query contains fields description, the result shows what fields should be available and how the text should be formatted

    • Input: {"Name": "Emily", "Rating": "4/5", "Review": "A solid fitness tracker with accurate measurements. I appreciate the waterproof design, which allows me to wear it while swimming. The battery life is decent, but I wish it had more customizable watch faces."}, input format: JSON

    • Query: Name is the name of the user who left feedback, rating is the rating set, review contains the product pros and cons

    • Result: example of output data, output format: markdown

      Result field
      # Product reviews
      
      ## User name
      
      - **Pros**: Accurate measurements, waterproof design, decent battery life
      - **Cons**: Few customizable watch faces
      - **Rating**: rating
      
    ../../../_images/transforms.png

  2. Add the following code to your dialog script:

    Dialog script
    let productReviews = {
        "reviews": [
            {
                "name": "Sarah",
                "rating": "5/5",
                "review": "I've been using this fitness tracker for a few weeks now, and it's been a game-changer for my fitness journey. It accurately tracks my steps and heart rate. The sleep tracking feature is a bonus, helping me improve my sleep patterns. The app is user-friendly, and the battery life is impressive. Highly recommend!"
            },
            {
                "name": "Chris",
                "rating": "4/5",
                "review": "The fitness tracker is great for tracking my daily activities. The heart rate monitor is quite accurate, and I like the variety of sports modes. My only wish is for a brighter display in direct sunlight. Overall, a solid choice."
            },
            {
                "name": "Linda",
                "rating": "5/5",
                "review": "I love this fitness tracker. It's lightweight, comfortable to wear, and the battery lasts for days. The sleep tracking is a lifesaver, helping me identify areas for improvement. The app is intuitive, and it syncs seamlessly with my phone. Couldn't be happier!"
            }
        ]
    }
    
    intent(`Show product reviews`, async (p)=> {
        p.play(`Collecting product reviews...`);
    
        // Transform review data
        let r = await transforms.summarize({
            input: productReviews,
            query: 'JSON containing a list of product reviews'
        });
        p.play(r);
    });
    

    Here, reviews in JSON format are saved to productReviews. When the user asks to show reviews, the review data is summarized and formatted to be presented using the defined template:

    ../../../_images/transforms-summary.png
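    Since the transform query describes the name, rating and review fields, it may be worth confirming that the data carries them before calling the transform. The sketch below is an optional pre-check written in plain JavaScript. findIncompleteReviews is a hypothetical helper, not part of the Alan AI API, and the sample data is a shortened stand-in for productReviews.

```javascript
// Hypothetical pre-check, not part of the Alan AI API: lists the reviews
// that lack any of the fields described in the transform query.
function findIncompleteReviews(data, requiredFields = ["name", "rating", "review"]) {
    const problems = [];
    data.reviews.forEach((review, index) => {
        // A field counts as missing if it is absent, not a string, or blank.
        const missing = requiredFields.filter(
            field => typeof review[field] !== "string" || review[field].trim() === ""
        );
        if (missing.length > 0) {
            problems.push({index, missing});
        }
    });
    return problems;
}

// Shortened sample data; the second entry deliberately has no review text.
const sample = {
    "reviews": [
        {"name": "Linda", "rating": "5/5", "review": "I love this fitness tracker."},
        {"name": "Chris", "rating": "4/5"}
    ]
};

console.log(JSON.stringify(findIncompleteReviews(sample)));
// [{"index":1,"missing":["review"]}]
```

    An empty result means every review has all three fields, so the whole object can be passed as input.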

Let’s assume you have a JSON-formatted list of apartments. When the user asks: Show available apartments, you want to display these apartments in a readable format.

For this, you can do the following:

  1. Add a transform named render with the following rules:

    • Common prompt: The input contains the apartment description in JSON, the query contains sample questions, the result contains formatted text

    • Input: {"name": "Luxury Penthouse with Panoramic Views", "location": "789 Skyline Drive, Metropolis", "price": "$1,200,000", "bedrooms": 3, "bathrooms": 3, "image": "https://vmts.ch/wp-content/uploads/2017/06/Hallenbad_2-1.jpg", "url": "https://vmts.ch/en/portfolio/project-hallenbad/"}, input format: JSON

    • Query: Show available apartments, What apartments do you offer?

    • Result: example of output data, output format: HTML

      Result field
      <div>
        <img src="https://vmts.ch/wp-content/uploads/2017/06/Hallenbad_2-1.jpg" alt="Luxury Penthouse with Panoramic Views">
        <h4>Luxury Penthouse with Panoramic Views</h4>
        <p>Apartment details:</p>
        <ul>
          <li><b>Location:</b> 789 Skyline Drive, Metropolis</li>
          <li><b>Price:</b> $1,200,000</li>
          <li><b>Bedrooms:</b> 3</li>
          <li><b>Bathrooms:</b> 3</li>
        </ul>
        <p><a href="https://vmts.ch/en/portfolio/project-hallenbad/">Learn more</a></p>
      </div>
      <br/>
      
    ../../../_images/transforms-render-template.png

  2. Add the following code to your dialog script:

    Dialog script
    let data = {
        "properties": [
            {
                "name": "Modern Apartment in the City Center",
                "location": "123 Main Street, Cityville",
                "price": "$500,000",
                "bedrooms": 2,
                "bathrooms": 2,
                "image": "https://vmts.ch/wp-content/uploads/2017/06/v1.jpg",
                "url": "https://vmts.ch/en/portfolio/project-gasterzimmer/"
            },
            {
                "name": "Suburban Family Home",
                "location": "456 Oak Avenue, Suburbia",
                "price": "$750,000",
                "bedrooms": 4,
                "bathrooms": 3,
                "image": "https://vmts.ch/wp-content/uploads/2017/05/7-2.jpg",
                "url": "https://vmts.ch/en/portfolio/project-niederteufen/"
            },
            {
                "name": "Luxury Penthouse with Panoramic Views",
                "location": "789 Skyline Drive, Metropolis",
                "price": "$1,200,000",
                "bedrooms": 3,
                "bathrooms": 3,
                "image": "https://vmts.ch/wp-content/uploads/2017/06/Hallenbad_2-1.jpg",
                "url": "https://vmts.ch/en/portfolio/project-hallenbad/"
            }
        ]
    }
    
    intent("Show available apartments", async p => {
        p.play("Just a second...");
    
        // Transform apartments data
        let a = await transforms.render({
            input: data,
            query: 'Show available apartments'
        });
        p.play(a);
    });
    

    Here, the apartment data in JSON format is saved to data. When the user asks to show available apartments, the apartment data is rendered and presented using the defined template:

    ../../../_images/transforms-render.png
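    To get a feel for the markup the render transform is expected to produce for each apartment, the Result field template can be written by hand as a plain JavaScript function. This is a sketch for comparison only: renderApartment and escapeHtml are hypothetical helpers, and the code does not reflect how Alan AI generates its output internally.

```javascript
// Minimal HTML escaping so that field values cannot break the markup.
function escapeHtml(value) {
    return String(value)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Hypothetical hand-written counterpart of the Result field template.
function renderApartment(a) {
    return [
        "<div>",
        `  <img src="${escapeHtml(a.image)}" alt="${escapeHtml(a.name)}">`,
        `  <h4>${escapeHtml(a.name)}</h4>`,
        "  <p>Apartment details:</p>",
        "  <ul>",
        `    <li><b>Location:</b> ${escapeHtml(a.location)}</li>`,
        `    <li><b>Price:</b> ${escapeHtml(a.price)}</li>`,
        `    <li><b>Bedrooms:</b> ${escapeHtml(a.bedrooms)}</li>`,
        `    <li><b>Bathrooms:</b> ${escapeHtml(a.bathrooms)}</li>`,
        "  </ul>",
        `  <p><a href="${escapeHtml(a.url)}">Learn more</a></p>`,
        "</div>",
        "<br/>"
    ].join("\n");
}

// One entry taken from the data object in the dialog script above.
const apartment = {
    "name": "Suburban Family Home",
    "location": "456 Oak Avenue, Suburbia",
    "price": "$750,000",
    "bedrooms": 4,
    "bathrooms": 3,
    "image": "https://vmts.ch/wp-content/uploads/2017/05/7-2.jpg",
    "url": "https://vmts.ch/en/portfolio/project-niederteufen/"
};

console.log(renderApartment(apartment));
```

    Comparing this output with what the transform returns for the same entry is one way to judge whether the example in the transform is specific enough.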

Transforms Explorer

Alan AI Studio comes with the Transforms Explorer, a tool that enhances transparency and understanding of the data transformation process. The Transforms Explorer allows you to gain insight into the inner workings of the transforms() function and inspect all instances of data transformations that have been executed.

To open the Transforms Explorer, click the magnifying glass icon to the left of the transforms() function in the dialog script.

../../../_images/transforms-explorer-open.png

For every transformation, the Transforms Explorer provides detailed information on the input and output data. You can examine these instances to better understand the impact of each transformation on the input data.

Additionally, you can add successful transformation instances to the list of examples specified for the transform. To do this, click the add icon to the right of the instance row in the table. Alan AI will use this example for future data transformations.

../../../_images/transforms-explorer.png