Extracting multiple values from a long string using regex

I have a long string output in the following format:

# News From This Week
## Sports
###  Basketball
Optional related content & text..
### Ping Pong
Optional related content & text..
## Business
### Technology
Optional related content & text..
# News From Last Week
## Business
### Technology
Optional related content & text..
## Arts

In HTML terms:
‘#’ = h1
‘##’ = h2
‘###’ = h3

My desired output is

  • h1: News From This Week
  • h2: Sports
  • h3: Basketball
  • h3: Ping Pong
  • h2: Business
  • h3: Technology

I don’t care about anything after the first h1 section, i.e. everything from “# News From Last Week” onward.

What I tried:
I was able to get the h1, h2, and h3 headings separately using three separate Text Parser (Match Pattern) modules.
Module 1 match pattern: ^#\s([A-Za-z0-9]+( [A-Za-z0-9]+)+)$
Module 2 match pattern: ^##\s([A-Za-z0-9]+( [A-Za-z0-9]+)+)$
Module 3 match pattern: ^###\s([A-Za-z0-9]+( [A-Za-z0-9]+)+)$

Main issues:

  • The results are not in document order
  • The results are keyed as $1 (something is also wrong with my regex or module settings, because it returns a $2 containing the last word, but I can work with just the $1 values)
  • It doesn’t seem efficient

I looked at this post, “How to use Regex in Make?”, and I am wondering whether I can use Text Parser (Match Pattern Advanced) instead.

I can’t seem to enter a valid match pattern, so I am looking for help to determine:

  1. Can I use the Text Parser (Match Pattern Advanced) module to get the output I am looking for, or do I need to replace # with h1, ## with h2, and so on first?
  2. If I only care about the first h1 section, should I add a module beforehand to split the text, or is it better to add one afterwards?

Welcome to the Make community!

Could you go to regex101.com, paste your pattern at the top, and paste a complete example of the text you are trying to match below it?

Then, save the regex example and share the link with us here.

This will allow others to assist you here with your pattern. Thanks!

You can also join us in the Make Fans Discord server to chat with other makers!

Here are the 3 examples

h1 example
Pattern

^#\s([A-Za-z0-9]+( [A-Za-z0-9]+)+)$

Test String

# August 2024 News And Updates

Link: regex101: build, test, and debug regex

h2 example
Pattern

^##\s([A-Za-z0-9]+( [A-Za-z0-9]+)+)$

Test String

## This Months News

Link: regex101: build, test, and debug regex

h3 example
Pattern

^###\s([A-Za-z0-9]+( [A-Za-z0-9]+)+)$

Test String

### Enhancements To The Community

Link: regex101: build, test, and debug regex
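
A note on the stray $2 mentioned in the question, as a small sketch. Python's re module is used here only for illustration; it is not the engine behind Make's Text Parser, although numbered groups should behave the same way for this pattern. The inner parentheses form a second capturing group, and a repeated group keeps only its last repetition, which is why $2 holds the last word. The same pattern also requires at least two words, so a one-word heading such as "## Sports" would not match. The FIXED pattern is a suggested variant, not something posted in this thread.

```python
import re

# Pattern from the question (h1 variant): the inner parentheses are a capturing group.
ORIGINAL = re.compile(r"^#\s([A-Za-z0-9]+( [A-Za-z0-9]+)+)$")

# Suggested variant (an assumption, not from the thread): a non-capturing group,
# and * instead of + so that single-word headings also match.
FIXED = re.compile(r"^#\s([A-Za-z0-9]+(?: [A-Za-z0-9]+)*)$")

original_groups = ORIGINAL.match("# August 2024 News And Updates").groups()
fixed_groups = FIXED.match("# August 2024 News And Updates").groups()
single_word_original = ORIGINAL.match("# Sports")
single_word_fixed = FIXED.match("# Sports")

print(original_groups)             # ('August 2024 News And Updates', ' Updates')
print(fixed_groups)                # ('August 2024 News And Updates',)
print(single_word_original)        # None
print(single_word_fixed.group(1))  # Sports
```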

You’ll need two regular expressions.

1. First, remove everything from the second heading 1 onwards by replacing whatever this pattern matches with an empty string:

(?<=\n)#\s[\w\W]+?$

Proof https://regex101.com/r/9VkW82/1

2. Then, use a Text Parser “Match Pattern” module with this pattern (regular expression):

(?<=^|\n)(?<num>#+)\s+(?<header>[^\n]+)

Proof https://regex101.com/r/EQjnnI/1

Important Info

  • Warning: Global match must be set to YES!
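
The two steps above can be sketched outside Make as follows. This is an illustration in Python, not the Make implementation: the step 1 pattern is used unchanged, but the step 2 pattern is adapted because Python's re module does not accept the variable-width lookbehind (?<=^|\n) or the (?<name>...) group syntax. The sketch uses ^ with re.MULTILINE and (?P<name>...) instead, and finditer plays the role of the "Global match" setting. The variable names are made up for the example.

```python
import re

TEXT = (
    "# News From This Week\n"
    "## Sports\n"
    "###  Basketball\n"
    "Optional related content & text..\n"
    "### Ping Pong\n"
    "Optional related content & text..\n"
    "## Business\n"
    "### Technology\n"
    "Optional related content & text..\n"
    "# News From Last Week\n"
    "## Business\n"
    "### Technology\n"
    "Optional related content & text..\n"
    "## Arts"
)

# Step 1: same pattern as in the thread. Without re.MULTILINE, $ matches at the end
# of the text (TEXT has no trailing newline), so the match runs from the second
# "# " heading to the end.
SECOND_H1_ONWARDS = re.compile(r"(?<=\n)#\s[\w\W]+?$")

# Step 2: adapted for Python. ^ with re.MULTILINE stands in for (?<=^|\n), and
# (?P<name>...) replaces (?<name>...).
HEADING = re.compile(r"^(?P<num>#+)\s+(?P<header>[^\n]+)", re.MULTILINE)

first_section = SECOND_H1_ONWARDS.sub("", TEXT)

# finditer yields every match in document order.
headings = [
    f"h{len(m.group('num'))}: {m.group('header')}"
    for m in HEADING.finditer(first_section)
]

for line in headings:
    print(line)
```

Because every heading level is matched by one pattern in a single pass, the results come back in the order they appear in the text, which addresses the ordering problem of running three separate modules.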

For more information, see Text Parser in the Make Help Center:

Match Pattern
The Match pattern module enables you to find and extract string elements matching a search pattern from a given text. The search pattern is a regular expression (aka regex or regexp), which is a sequence of characters in which each character is either a metacharacter, having a special meaning, or a regular character that has a literal meaning.

Hope this helps! Let me know if there are any further questions or issues.


Module Export

You can copy and paste this module export into your scenario. This will paste the modules shown in my screenshots above.

  1. Copy the JSON code below by clicking the copy button that appears when you hover over the top-right corner of the code block.

  2. Open your scenario editor. Press ESC to close any dialogs, then press CTRL+V (the paste keyboard shortcut on Windows) to paste directly onto the canvas.

  3. Click on each imported module and save it for validation. You may be prompted to remap some variables and connections.

JSON - Copy and Paste this directly in the scenario editor

{
    "subflows": [
        {
            "flow": [
                {
                    "id": 163,
                    "module": "util:ComposeTransformer",
                    "version": 1,
                    "parameters": {},
                    "mapper": {
                        "value": "# August 2024 News And Updates\n## Sports\n###  Basketball\nOptional related content & text..\n### Ping Pong\nOptional related content & text..\n## Business\n### Technology\nOptional related content & text..\n# News From Last Week\n## Business\n### Technology\nOptional related content & text..\n## Arts"
                    },
                    "metadata": {
                        "designer": {
                            "x": 2507,
                            "y": -2963
                        },
                        "restore": {},
                        "expect": [
                            {
                                "name": "value",
                                "type": "text",
                                "label": "Text"
                            }
                        ]
                    }
                },
                {
                    "id": 164,
                    "module": "regexp:Parser",
                    "version": 1,
                    "parameters": {
                        "pattern": "(?<=^|\\n)(?<num>#+)\\s+(?<header>[^\\n]+)",
                        "global": true,
                        "sensitive": true,
                        "multiline": false,
                        "singleline": false,
                        "continueWhenNoRes": false,
                        "ignoreInfiniteLoopsWhenGlobal": false
                    },
                    "mapper": {
                        "text": "{{replace(163.value; \"/(?<=\\n)#\\s[\\w\\W]+?$/\"; emptystring)}}"
                    },
                    "metadata": {
                        "designer": {
                            "x": 2751,
                            "y": -2963
                        },
                        "restore": {
                            "parameters": {
                                "sensitive": {
                                    "collapsed": true
                                },
                                "multiline": {
                                    "collapsed": true
                                },
                                "singleline": {
                                    "collapsed": true
                                },
                                "continueWhenNoRes": {
                                    "collapsed": true
                                }
                            }
                        },
                        "parameters": [
                            {
                                "name": "pattern",
                                "type": "text",
                                "label": "Pattern",
                                "required": true
                            },
                            {
                                "name": "global",
                                "type": "boolean",
                                "label": "Global match",
                                "required": true
                            },
                            {
                                "name": "sensitive",
                                "type": "boolean",
                                "label": "Case sensitive",
                                "required": true
                            },
                            {
                                "name": "multiline",
                                "type": "boolean",
                                "label": "Multiline",
                                "required": true
                            },
                            {
                                "name": "singleline",
                                "type": "boolean",
                                "label": "Singleline",
                                "required": true
                            },
                            {
                                "name": "continueWhenNoRes",
                                "type": "boolean",
                                "label": "Continue the execution of the route even if the module finds no matches",
                                "required": true
                            },
                            {
                                "name": "ignoreInfiniteLoopsWhenGlobal",
                                "type": "boolean",
                                "label": "Ignore errors when there is an infinite search loop",
                                "required": true
                            }
                        ],
                        "expect": [
                            {
                                "name": "text",
                                "type": "text",
                                "label": "Text"
                            }
                        ],
                        "interface": [
                            {
                                "type": "text",
                                "name": "num",
                                "label": "num"
                            },
                            {
                                "type": "text",
                                "name": "header",
                                "label": "header"
                            },
                            {
                                "type": "uinteger",
                                "name": "i",
                                "label": "i"
                            },
                            {
                                "type": "any",
                                "name": "__IMTMATCH__",
                                "label": "Fallback Match"
                            }
                        ]
                    }
                },
                {
                    "id": 167,
                    "module": "util:SetVariable2",
                    "version": 1,
                    "parameters": {},
                    "mapper": {
                        "name": "num",
                        "scope": "roundtrip",
                        "value": "{{parseNumber(replace(replace(replace(164.num; \"###\"; 3); \"##\"; 2); \"#\"; 1))}}"
                    },
                    "metadata": {
                        "designer": {
                            "x": 2993,
                            "y": -2965
                        },
                        "restore": {
                            "expect": {
                                "scope": {
                                    "label": "One cycle"
                                }
                            }
                        },
                        "expect": [
                            {
                                "name": "name",
                                "type": "text",
                                "label": "Variable name",
                                "required": true
                            },
                            {
                                "name": "scope",
                                "type": "select",
                                "label": "Variable lifetime",
                                "required": true,
                                "validate": {
                                    "enum": [
                                        "roundtrip",
                                        "execution"
                                    ]
                                }
                            },
                            {
                                "name": "value",
                                "type": "any",
                                "label": "Variable value"
                            }
                        ],
                        "interface": [
                            {
                                "name": "num",
                                "label": "num",
                                "type": "any"
                            }
                        ]
                    }
                },
                {
                    "id": 168,
                    "module": "builtin:BasicAggregator",
                    "version": 1,
                    "parameters": {
                        "feeder": 164
                    },
                    "mapper": {
                        "num": "{{167.num}}",
                        "header": "{{164.header}}"
                    },
                    "metadata": {
                        "designer": {
                            "x": 3235,
                            "y": -2968,
                            "messages": [
                                {
                                    "category": "last",
                                    "severity": "warning",
                                    "message": "A transformer should not be the last module in the route."
                                }
                            ]
                        },
                        "restore": {
                            "extra": {
                                "feeder": {
                                    "label": "Text parser - Match pattern"
                                },
                                "target": {
                                    "label": "Custom"
                                }
                            }
                        }
                    }
                }
            ]
        }
    ],
    "metadata": {
        "version": 1
    }
}
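
Read as plain code, the last two modules in the export convert each match's run of # characters into a number (the nested replace in module 167) and collect the results into an array (the aggregator, module 168). A rough Python sketch of that logic follows. The function names and the hard-coded matches list are illustrative assumptions, and Python's str.replace merely stands in for Make's replace function.

```python
# Matches as module 164 would emit them for the first section: (num, header).
matches = [
    ("#", "August 2024 News And Updates"),
    ("##", "Sports"),
    ("###", "Basketball"),
    ("###", "Ping Pong"),
    ("##", "Business"),
    ("###", "Technology"),
]


def hashes_to_level(num: str) -> int:
    """Mirror the nested replace in module 167: longest run first, then parse."""
    return int(num.replace("###", "3").replace("##", "2").replace("#", "1"))


def aggregate(rows):
    """Mirror the array aggregator (module 168): one {num, header} item per match."""
    return [{"num": hashes_to_level(num), "header": header} for num, header in rows]


result = aggregate(matches)
for item in result:
    print(item["num"], item["header"])
```

The order of the replacements matters in this sketch: replacing "#" first would turn "###" into "111". For headings deeper than three levels, len(num) would be a simpler way to get the level here.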


Thanks samliew!

For 1, I’ve been racking my brain trying to create the regex for the reverse (I’m new to regex), where I want everything before the second ‘#’.
I’m getting close, but I can’t seem to figure out how to exclude the additional line break and the # that follows it.

[\w\W]+(?<=\n)#\s

Are you able to help?

Step 2 is working.
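
On the attempt above: the #\s at the end of the pattern is part of the match itself, so the result includes the line break and the # of the second heading. One possible alternative, offered as a sketch and not taken from this thread, is a lazy match that stops at a lookahead, which checks for the next heading without consuming it. The example uses Python's re module for illustration, with a shortened sample text; the pattern has not been tested in Make's Text Parser.

```python
import re

TEXT = (
    "# News From This Week\n"
    "## Sports\n"
    "###  Basketball\n"
    "Optional related content & text..\n"
    "# News From Last Week\n"
    "## Arts"
)

# Attempt from the post: the trailing #\s is consumed, so the match ends with "\n# ".
ATTEMPT = re.compile(r"[\w\W]+(?<=\n)#\s")

# Sketch: lazy match from the start, stopping just before a line break that is
# followed by "# ". \Z is a fallback for text that has only one h1 section.
FIRST_SECTION = re.compile(r"\A[\w\W]+?(?=\n#\s|\Z)")

attempt_result = ATTEMPT.search(TEXT).group(0)
first_section = FIRST_SECTION.search(TEXT).group(0)

print(repr(attempt_result[-3:]))  # '\n# '
print(first_section)
```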

I have already provided the module export, pattern, and screenshots above.

@samliew


P.S.: Did you know, the concepts of about 70% of questions asked on this forum are already covered in the Make Academy. Investing some effort into it will save you lots of time and frustration using Make later!

Ah, I missed that. Thanks again!

No problem, glad I could help!

1. If anyone has a new question in the future, please start a new thread. This makes it easier for others with the same problem to search for the answers to specific questions, and you are more likely to receive help since newer questions are monitored closely.

2. The Make Community guidelines encourage users to mark helpful replies as solutions to help keep the Community organized.

This marks the topic as solved, so that others can:

  • save time when catching up with the latest activity here, and
  • quickly jump to the solution if they come across the same problem

To do this, simply click the checkbox at the bottom of the post that answers your question.

3. Don’t forget to like and bookmark this topic so you can get back to it easily in future!
