How to Split OpenAI Text Output into Multiple Values Using Open and Closed Tags

I am working on a project where I need to process the output from an OpenAI chat model. The output text includes several sections, each delineated by specific open and closed tags (e.g., [SUMMARY]...[/SUMMARY], [MAIN_POINTS]...[/MAIN_POINTS], etc.). I need to split this single string of text into multiple values, with each value containing the text from one specific section between its respective open and close tags.

Here is an example of the text structure:
```
[SUMMARY] Text for summary. [/SUMMARY]
[MAIN_POINTS] Main point 1. Main point 2. [/MAIN_POINTS]
[ACTION_ITEMS] Action item 1. [/ACTION_ITEMS]
```
I am struggling to correctly split this text using Make.com’s tools. The goal is to create individual variables for each section (e.g., summary, mainPoints, actionItems, etc.), extracting the content between the tags while excluding the tags themselves.

Could anyone advise on the best approach to achieve this in Make.com? Any insights or suggestions on specific modules or functions to use would be greatly appreciated.


Welcome to the Make community!

Please provide the output bundles of the modules by running the scenario, then click the white speech bubble on the top-right of each module, save the contents as a text file, and upload it here into this discussion thread.

Providing the output bundles will allow others to replicate what is going on in the scenario even if they do not use the external service.

This will allow others to better assist you. Thanks!


Thanks @samliew,

Here is the output from the OpenAI chat completion module:

“”"
[
{
“id”:,
“object”: “chat.completion”,
“created”: “2023-11-11T14:31:09.000Z”,
“model”: “gpt-4-0613”,
“choices”: [
{
“index”: 0,
“message”: {
“role”: “assistant”,
“content”: “[SUMMARY]\nThe transcription is a detailed plan for a YouTube video on tennis strategies. The speaker discusses the concept of synergistic strategies in tennis, using a hypothetical match between Jane Smith and John Doe to illustrate the point. The speaker also compares tennis strategies to everyday tasks like doing laundry and grocery shopping to explain the importance of sequencing in tennis. The speaker plans to delve deeper into these concepts in the video to help viewers understand strategic thinking and systems design.\n[/SUMMARY]\n\n[MAIN_POINTS]\n1. The speaker plans to create a video on synergistic strategies in tennis.\n2. The concept of synergistic strategies is explained using a hypothetical match between Jane Smith and John Doe.\n3. The speaker emphasizes the importance of understanding how different strategies interact to create outcomes greater than their individual parts.\n4. The speaker compares tennis strategies to everyday tasks to explain the importance of sequencing.\n5. The speaker plans to delve deeper into these concepts in the video to help viewers understand strategic thinking and systems design.\n[/MAIN_POINTS]\n\n[ACTION_ITEMS]\n1. Create a video on synergistic strategies in tennis.\n2. Use a hypothetical match between Jane Smith and John Doe to illustrate the concept.\n3. Explain the importance of understanding how different strategies interact.\n4. Compare tennis strategies to everyday tasks to explain the importance of sequencing.\n5. Delve deeper into these concepts in the video to help viewers understand strategic thinking and systems design.\n[/ACTION_ITEMS]\n\n[FOLLOW_UP_QUESTIONS]\n1. What other examples can be used to illustrate the concept of synergistic strategies in tennis?\n2. How can the concept of sequencing be further explained using tennis strategies?\n3. What other concepts related to strategic thinking and systems design can be discussed in the video?\n[/FOLLOW_UP_QUESTIONS]\n\n[CLIENT_NAME] None [/CLIENT_NAME]\n\n[FILE_NAME] None [/FILE_NAME]\n\n[DOCUMENT_TYPE]\nNote\n[/DOCUMENT_TYPE]”
},
“finish_reason”: “stop”
}
],
“usage”: {
“prompt_tokens”: 1817,
“completion_tokens”: 384,
“total_tokens”: 2201
}
}
]
“”"

As I mentioned, I’d like to split this single string of “content” text into multiple values, with each value containing the text between its respective open and close tags. The specific values are SUMMARY, MAIN_POINTS, ACTION_ITEMS, FOLLOW_UP_QUESTIONS, CLIENT_NAME, FILE_NAME, and DOCUMENT_TYPE.

Any tips for how to do this would be very much appreciated.

Hello @Oliver_Marler,

You could try a Text Parser module configured like this:

Be sure Global Match and Singleline are both enabled for this to work properly.
Here’s the text of the Pattern:

\[([A-Z_]+)\](.*?)\[/\1\]
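
For anyone who wants to try the idea outside Make, here is a minimal Python sketch. It assumes the intended pattern is `\[([A-Z_]+)\](.*?)\[/\1\]` (tag name in group 1, lazily matched content in group 2, and a backreference for the closing tag), which is consistent with the `$1`/`$2` output shown in this thread. The sample text and the `split_sections` helper are made up for illustration.

```python
import re

# Shortened sample in the same shape as the model output in this thread.
content = (
    "[SUMMARY]\nText for summary.\n[/SUMMARY]\n\n"
    "[MAIN_POINTS]\n1. Main point 1.\n2. Main point 2.\n[/MAIN_POINTS]\n\n"
    "[CLIENT_NAME] None [/CLIENT_NAME]"
)

# Group 1 captures the tag name; group 2 captures everything up to the
# matching close tag. re.DOTALL plays the role of the "Singleline" option:
# it lets "." match newlines as well.
TAG_PATTERN = re.compile(r"\[([A-Z_]+)\](.*?)\[/\1\]", re.DOTALL)


def split_sections(text: str) -> list[dict]:
    """Return one dict per tagged section, similar to one bundle per match."""
    return [
        {"i": index, "$1": match.group(1), "$2": match.group(2)}
        for index, match in enumerate(TAG_PATTERN.finditer(text), start=1)
    ]


bundles = split_sections(content)
for bundle in bundles:
    print(bundle["i"], bundle["$1"], repr(bundle["$2"]))
```

Using `finditer` corresponds to enabling Global Match: every tagged section is returned, not just the first one.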

Here’s a bit of how the output would look:


@Donald_Mitchell - Thank you. This worked great to split it into bundles. I just need to figure out how to convert them into individual variables so I can sort them into unique sections of a page on Notion. Any chance that’s an easy job? :grimacing:

@Donald_Mitchell - I’m getting the right output from the Text Parser module (see below), but I can’t figure out how to set the ‘$2’ value from each bundle as a unique variable. I’ve been trying to use the Set Multiple Variables module with the GPT instructions below, but I keep getting the following error: “Module references non-existing module ‘6’.”

Would really appreciate your help/guidance.

Text Parser Module Output:
“”"
[
{
“i”: 1,
“$1”: “SUMMARY”,
“$2”: “\nThe transcription is a detailed plan for a YouTube video about tennis strategies. The speaker discusses the concept of synergistic strategies in tennis, using examples from a fictional match between Jane Smith and John Doe. The speaker also plans to delve into the concept of shot sequencing and the locus of control in tennis, using these as a lens to understand strategic thinking and systems design.\n”
},
{
“i”: 2,
“$1”: “MAIN_POINTS”,
“$2”: “\n1. The speaker plans to create a video about synergistic strategies in tennis.\n2. The concept is inspired by coach Alex Harmon’s saying about the combination and execution of shots in tennis.\n3. The speaker will use a fictional match between Jane Smith and John Doe to illustrate the concept.\n4. The speaker will discuss the importance of shot sequencing and the locus of control in tennis.\n5. The speaker plans to use tennis as a lens to understand strategic thinking and systems design.\n”
},
{
“i”: 3,
“$1”: “ACTION_ITEMS”,
“$2”: “\n1. Create a video on synergistic strategies in tennis.\n2. Use a fictional match between Jane Smith and John Doe to illustrate the concept.\n3. Discuss the importance of shot sequencing and the locus of control in tennis.\n4. Use tennis as a lens to understand strategic thinking and systems design.\n”
},
{
“i”: 4,
“$1”: “FOLLOW_UP_QUESTIONS”,
“$2”: “\n1. What specific examples will be used to illustrate the concept of synergistic strategies in tennis?\n2. How will the speaker explain the importance of shot sequencing and the locus of control in tennis?\n3. What other concepts related to strategic thinking and systems design will be discussed in the video?\n”
},
{
“i”: 5,
“$1”: “CLIENT_NAME”,
“$2”: “\nNone\n”
},
{
“i”: 6,
“$1”: “FILE_NAME”,
“$2”: “\nNone\n”
},
{
“i”: 7,
“$1”: “DOCUMENT_TYPE”,
“$2”: “\nNote\n”
}
]
“”"

Instructions from GPT:
“”"
Based on the output from the Text Parser module (which is module number 6 in your automation), you can set the variable values in the Set Multiple Variables module as follows:

  1. Variable for SUMMARY:
  • Variable Name: summaryContent
  • Variable Value: {{6.1.$2}}
  2. Variable for MAIN_POINTS:
  • Variable Name: mainPointsContent
  • Variable Value: {{6.2.$2}}
  3. Variable for ACTION_ITEMS:
  • Variable Name: actionItemsContent
  • Variable Value: {{6.3.$2}}
  4. Variable for FOLLOW_UP_QUESTIONS:
  • Variable Name: followUpQuestionsContent
  • Variable Value: {{6.4.$2}}
  5. Variable for CLIENT_NAME:
  • Variable Name: clientNameContent
  • Variable Value: {{6.5.$2}}
  6. Variable for FILE_NAME:
  • Variable Name: fileNameContent
  • Variable Value: {{6.6.$2}}
  7. Variable for DOCUMENT_TYPE:
  • Variable Name: documentTypeContent
  • Variable Value: {{6.7.$2}}

In this setup, 6 refers to the Text Parser module number, i indicates the item’s index, $1 corresponds to the category (like SUMMARY, MAIN_POINTS, etc.), and $2 contains the actual content for each category.
“”"

After the parser, add an Array Aggregator. The Aggregator combines the separate bundles into a single array, on which you can then use the map() function.
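
As a rough illustration of what aggregating buys you, here is a small Python sketch. It is not how Make implements the Aggregator; the `aggregate` and `sections_by_tag` helpers and the shortened sample values are hypothetical, and only the `array`, `$1`, `$2`, and `__IMTAGGLENGTH__` keys come from the outputs posted in this thread.

```python
from typing import Iterable


def aggregate(bundles: Iterable[dict]) -> dict:
    """Collect separate bundles into one record that holds a single array."""
    items = list(bundles)
    return {"array": items, "__IMTAGGLENGTH__": len(items)}


def sections_by_tag(aggregated: dict) -> dict[str, str]:
    """Index the array by tag name so each section can be looked up directly."""
    return {item["$1"]: item["$2"].strip() for item in aggregated["array"]}


# Shortened stand-in for the seven bundles the Text Parser produces.
parser_output = [
    {"i": 1, "$1": "SUMMARY", "$2": "\nText for summary.\n"},
    {"i": 2, "$1": "MAIN_POINTS", "$2": "\n1. Main point 1.\n2. Main point 2.\n"},
    {"i": 3, "$1": "CLIENT_NAME", "$2": " None "},
]

aggregated = aggregate(parser_output)
sections = sections_by_tag(aggregated)
print(sections["MAIN_POINTS"])
```

The point is that once all matches live in one array, each section can be picked out by its tag name instead of by its position in a stream of bundles.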

Here’s a blueprint you can check out.
how-to-split-openai-text-output-into-multiple-values-using-open-and-closed-tags-19484.json (10.2 KB)


Hi @Oliver_Marler, in future, could you format your JSON as code? Otherwise the forum software converts some characters and makes the JSON invalid (as checked by pasting it into https://jsonformatter.org).

1. Format your JSON by adding three backticks ``` before and after the code, like this:

```
input/output bundle goes here
```

2. Or use the format code button in the editor.

3. Alternatively, paste and save the contents of the bundles in your text editor as a .json file, and upload it here into this discussion thread.

Once the post has been submitted, it’s too late to format it because the content is already butchered. You will need to make a fresh copy of the output bundle and format it before submitting a new forum post.
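
To see why the converted characters matter, here is a small Python sketch. The two sample strings and the `is_valid_json` helper are made up for illustration; the only difference between the strings is straight versus curly quotation marks.

```python
import json

# Same data twice: once with straight quotes, once with the curly quotes
# that unformatted forum text ends up with.
straight = '{"model": "gpt-4-0613", "index": 0}'
curly = '{“model”: “gpt-4-0613”, “index”: 0}'


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(straight))
print(is_valid_json(curly))
```

JSON only accepts the straight double quote as a string delimiter, so the curly version is rejected by any parser and cannot be pasted back into a scenario for testing.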


Thanks @Donald_Mitchell and @samliew. Unfortunately, I’m still completely confused by this.

I’ve gotten to a point where I’ve:
(1) Downloaded a voice note from Google Drive
(2) Transcribed that note with OpenAI (whisper)
(3) Analyzed the transcription and generated a response using an OpenAI Chat Completion Module, which provided the following output:

[
    {
        "id": "chatcmpl-8K58EPgCQJLKFS9ce8mtERFCMEajP",
        "object": "chat.completion",
        "created": "2023-11-12T13:49:38.000Z",
        "model": "gpt-4-0613",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": "[SUMMARY]\nThe transcription is a detailed plan for a YouTube video on tennis strategies. The speaker discusses the concept of synergistic strategies in tennis, using a hypothetical match between Jane Smith and John Doe as an example. The speaker also compares tennis strategies to everyday tasks like doing laundry and grocery shopping to illustrate the importance of strategic sequencing. The video aims to delve deeper into these concepts using tennis as a lens to understand strategic thinking and systems design.\n[/SUMMARY]\n\n[MAIN_POINTS]\n1. The speaker plans to create a YouTube video on synergistic strategies in tennis.\n2. The concept of synergistic strategies is explained using a hypothetical match between Jane Smith and John Doe.\n3. The speaker emphasizes the importance of understanding how different strategies interact to create outcomes greater than their individual parts.\n4. The speaker uses the analogy of doing laundry and grocery shopping to illustrate the importance of strategic sequencing in tennis.\n5. The video aims to delve deeper into these concepts using tennis as a lens to understand strategic thinking and systems design.\n[/MAIN_POINTS]\n\n[ACTION_ITEMS]\n1. Create a YouTube video on synergistic strategies in tennis.\n2. Use a hypothetical match between Jane Smith and John Doe to illustrate the concept.\n3. Use the analogy of doing laundry and grocery shopping to explain strategic sequencing in tennis.\n[/ACTION_ITEMS]\n\n[FOLLOW_UP_QUESTIONS]\n1. What other examples can be used to illustrate the concept of synergistic strategies in tennis?\n2. How can the concept of strategic sequencing in tennis be further explained?\n[/FOLLOW_UP_QUESTIONS]\n\n[CLIENT_NAME]\nNone\n[/CLIENT_NAME]\n\n[FILE_NAME]\nNone\n[/FILE_NAME]\n\n[DOCUMENT_TYPE]\nNote\n[/DOCUMENT_TYPE]"
                },
                "finish_reason": "stop"
            }
        ],
        "usage": {
            "prompt_tokens": 1817,
            "completion_tokens": 343,
            "total_tokens": 2160
        }
    }
]

(4) Used a Text Parser Module with this pattern: “\[([A-Z_]+)\](.*?)\[/\1\]”, to generate this output:

[
    {
        "i": 1,
        "$1": "SUMMARY",
        "$2": "\nThe transcription is a detailed plan for a YouTube video on tennis strategies. The speaker discusses the concept of synergistic strategies in tennis, using a hypothetical match between Jane Smith and John Doe as an example. The speaker also compares tennis strategies to everyday tasks like doing laundry and grocery shopping to illustrate the importance of strategic sequencing. The video aims to delve deeper into these concepts using tennis as a lens to understand strategic thinking and systems design.\n"
    },
    {
        "i": 2,
        "$1": "MAIN_POINTS",
        "$2": "\n1. The speaker plans to create a YouTube video on synergistic strategies in tennis.\n2. The concept of synergistic strategies is explained using a hypothetical match between Jane Smith and John Doe.\n3. The speaker emphasizes the importance of understanding how different strategies interact to create outcomes greater than their individual parts.\n4. The speaker uses the analogy of doing laundry and grocery shopping to illustrate the importance of strategic sequencing in tennis.\n5. The video aims to delve deeper into these concepts using tennis as a lens to understand strategic thinking and systems design.\n"
    },
    {
        "i": 3,
        "$1": "ACTION_ITEMS",
        "$2": "\n1. Create a YouTube video on synergistic strategies in tennis.\n2. Use a hypothetical match between Jane Smith and John Doe to illustrate the concept.\n3. Use the analogy of doing laundry and grocery shopping to explain strategic sequencing in tennis.\n"
    },
    {
        "i": 4,
        "$1": "FOLLOW_UP_QUESTIONS",
        "$2": "\n1. What other examples can be used to illustrate the concept of synergistic strategies in tennis?\n2. How can the concept of strategic sequencing in tennis be further explained?\n"
    },
    {
        "i": 5,
        "$1": "CLIENT_NAME",
        "$2": "\nNone\n"
    },
    {
        "i": 6,
        "$1": "FILE_NAME",
        "$2": "\nNone\n"
    },
    {
        "i": 7,
        "$1": "DOCUMENT_TYPE",
        "$2": "\nNote\n"
    }
]

(5) Added an Array Aggregator Module and mapped it to the output from the Text Parser, which resulted in this output:

[
    {
        "array": [
            {
                "i": 1,
                "$1": "SUMMARY",
                "$2": "\nThe transcription is a detailed plan for a YouTube video on tennis strategies. The speaker discusses the concept of synergistic strategies in tennis, using a hypothetical match scenario between two fictional players, Jane Smith and John Doe, to illustrate the point. The speaker also discusses the importance of shot sequencing and the concept of the locus of control in tennis. The video aims to use tennis as a lens to understand strategic thinking and systems design.\n",
                "__IMTMATCH__": null
            },
            {
                "i": 2,
                "$1": "MAIN_POINTS",
                "$2": "\n1. The speaker plans to create a video on synergistic strategies in tennis.\n2. The concept of synergistic strategies is explained using a hypothetical match between Jane Smith and John Doe.\n3. The speaker emphasizes the importance of understanding how different strategies interact to create outcomes greater than their individual parts.\n4. The speaker discusses the concept of shot sequencing using an everyday example of doing laundry and grocery shopping.\n5. The speaker introduces the concept of the locus of control in tennis, focusing on what a player can influence.\n6. The video aims to delve deeper into these concepts using tennis as a lens to understand strategic thinking and systems design.\n",
                "__IMTMATCH__": null
            },
            {
                "i": 3,
                "$1": "ACTION_ITEMS",
                "$2": "\n1. Create a video on synergistic strategies in tennis.\n2. Use a hypothetical match scenario to illustrate the concept.\n3. Discuss the importance of shot sequencing and the concept of the locus of control in tennis.\n",
                "__IMTMATCH__": null
            },
            {
                "i": 4,
                "$1": "FOLLOW_UP_QUESTIONS",
                "$2": "\n1. What other examples can be used to illustrate the concept of synergistic strategies in tennis?\n2. How can the concept of the locus of control be further explained in the context of tennis?\n",
                "__IMTMATCH__": null
            },
            {
                "i": 5,
                "$1": "CLIENT_NAME",
                "$2": " None ",
                "__IMTMATCH__": null
            },
            {
                "i": 6,
                "$1": "FILE_NAME",
                "$2": " None ",
                "__IMTMATCH__": null
            },
            {
                "i": 7,
                "$1": "DOCUMENT_TYPE",
                "$2": "\nNote\n",
                "__IMTMATCH__": null
            }
        ],
        "__IMTAGGLENGTH__": 7
    }
]

Now I want to map the “$2” value of each category into a specific field of a Notion database item, using Notion’s “Create a Database Item” module with multiple fields. However, I am only seeing a single array item when trying to map the Array Aggregator output into the individual fields.

I’m sure it has something to do with how the field value is formatted to filter the results from the Array Aggregator, but I have no idea how to set that up or where to find instructions for doing it.

As always, your guidance is very much appreciated.

Hi @Oliver_Marler,

Welcome to the Make community!

You can use a Text Parser “Match Pattern” module with this regular expression pattern:

\[SUMMARY\]\n*(?<summary>[\W\w]+?)\n*\[\/SUMMARY\]\n+\[MAIN_POINTS\]\n*(?<main_points>[\W\w]+?)\n*\[\/MAIN_POINTS\]\n+\[ACTION_ITEMS\]\n*(?<action_items>[\W\w]+?)\n*\[\/ACTION_ITEMS\]\n+\[FOLLOW_UP_QUESTIONS\]\n*(?<follow_up_questions>[\W\w]+?)\n*\[\/FOLLOW_UP_QUESTIONS\]\n+\[CLIENT_NAME\]\n*(?<client_name>[\W\w]+?)\n*\[\/CLIENT_NAME\]\n+\[FILE_NAME\]\n*(?<file_name>[\W\w]+?)\n*\[\/FILE_NAME\]\n+\[DOCUMENT_TYPE\]\n*(?<document_type>[\W\w]+?)\n*\[\/DOCUMENT_TYPE\]

Regex test: https://regex101.com/r/XzFvi6


Important Info

  • :warning: Global match must be set to YES!

Output

This will split the GPT output into individual variables, in a SINGLE bundle, with a SINGLE operation.

Aggregators not included.
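
For comparison, the named-group approach can be sketched in Python. This is an illustration only: it covers three of the seven tags to stay short, the `build_pattern` helper and sample text are hypothetical, and Python spells named groups `(?P<name>...)` rather than `(?<name>...)`.

```python
import re

# Only three of the seven sections, to keep the sketch short.
TAGS = ["SUMMARY", "MAIN_POINTS", "ACTION_ITEMS"]


def build_pattern(tags: list[str]) -> re.Pattern:
    """Build one regex with a named group per tag, in the given order."""
    parts = [
        rf"\[{tag}\]\n*(?P<{tag.lower()}>[\W\w]+?)\n*\[/{tag}\]"
        for tag in tags
    ]
    # Sections are separated by one or more newlines, as in the model output.
    return re.compile(r"\n+".join(parts))


content = (
    "[SUMMARY]\nText for summary.\n[/SUMMARY]\n\n"
    "[MAIN_POINTS]\n1. Main point 1.\n2. Main point 2.\n[/MAIN_POINTS]\n\n"
    "[ACTION_ITEMS]\n1. Action item 1.\n[/ACTION_ITEMS]"
)

match = build_pattern(TAGS).search(content)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(name, "->", repr(value))
```

One design consequence worth noting: because the whole text is matched by a single expression, the sections must all be present and appear in this exact order. If the model skips or reorders a tag, nothing matches at all, whereas the generic tag pattern earlier in the thread still returns whatever sections exist.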


For more information, see Text Parser in the Make Help Center:

Match Pattern
The Match pattern module enables you to find and extract string elements matching a search pattern from a given text. The search pattern is a regular expression (aka regex or regexp), which is a sequence of characters in which each character is either a metacharacter, having a special meaning, or a regular character that has a literal meaning.

Hope this helps!


Here is the map() command for that:

{{trim(first(map(77.array; "$2"; "$1"; "MAIN_POINTS")))}}

Basically, map() returns the $2 value of every item in the array whose $1 equals “MAIN_POINTS”.
That will leave you with an array of one element, the text for “MAIN_POINTS”.
Then, use either the first() or get() function to get the first (and what should be the only) element in the array.
trim() removes the extra whitespace from the beginning and the end.
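
If it helps to see the same logic spelled out step by step, here is a rough Python stand-in. The `make_map` and `first` helpers are hypothetical imitations written for this example, not Make's actual implementation, and the array values are shortened.

```python
def make_map(array, key, filter_key=None, filter_value=None):
    """Pick `key` from each item, keeping only items whose `filter_key`
    equals `filter_value` when a filter is given."""
    return [
        item[key]
        for item in array
        if filter_key is None or item.get(filter_key) == filter_value
    ]


def first(values):
    """Return the first element, or None for an empty list."""
    return values[0] if values else None


# Shortened stand-in for the Array Aggregator output.
array = [
    {"i": 1, "$1": "SUMMARY", "$2": "\nText for summary.\n"},
    {"i": 2, "$1": "MAIN_POINTS", "$2": "\n1. Main point 1.\n2. Main point 2.\n"},
    {"i": 3, "$1": "CLIENT_NAME", "$2": " None "},
]

# Filter on $1, take the single remaining $2, then strip the whitespace.
raw = first(make_map(array, "$2", "$1", "MAIN_POINTS"))
main_points = raw.strip() if raw is not None else None
print(main_points)
```

Changing only the last argument ("SUMMARY", "CLIENT_NAME", and so on) gives the value for each of the other fields.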

You can either use this directly in the Notion module to pull each section of information, or you could use this in a set variables module in case you plan on using them multiple times in the remainder of the scenario.
