I need to capture certain content from HTML

Hey,

I need to extract from a .HTML file the following pieces of data:

To be used during a later step with OpenAI.

can be, for example, .

Thanks.

Welcome to the Make community!

For further assistance, please provide the following:

1. Screenshots of module fields and filters, and any error messages

Please share screenshots of the relevant error messages, module fields, and filters. It would really help other community members to see what you’re looking at.

You can upload images here using the Upload icon in the text editor.

2. Scenario blueprint

Please export the scenario blueprint file to allow others to view the mapped variables in the module fields. At the bottom of the scenario editor, you can click on the three dots to find the Export Blueprint menu item.

3. And most importantly, Input/Output bundles

Please provide the output bundles of the modules by running the scenario (or get them from the scenario’s History tab), then click the white speech bubble on the top-right of each module and select “Download input/output bundles”.

A. Save each bundle’s contents in your text editor as a bundle.txt file, and upload it here into this discussion thread.

B. If you are unable to upload files on this forum, you can alternatively paste the formatted bundles in this manner:

Here are two ways to format text so that it won’t be changed by the forum:

A. Type code block manually
Add three backticks ``` before and after the content/bundle, like this:

```
content goes here
```

B. Highlight and click the format button in the editor

Providing the input/output bundles will allow others to replicate what is going on in the scenario even if they do not use the external service.

Following these steps will allow others to assist you here. Thanks!

Hi @samliew

There’s no error, but the Parser module produces no output when I run it (only an empty array, []).

I’m using this pattern:

\s*(?[^<]+)\s*<\/title>[\w\W]+\s*(?[\w\W]+?)\s*<\/article>

blueprint.json (9.8 KB)

I’m using the Send Gmail module just for testing.

Thanks

Welcome to the Make community!

You can use a Text Parser “Match Pattern” module with this Pattern (regular expression):

<title>\s*(?<title>[^<]+)\s*<\/title>[\w\W]+?<article[^>]+>\s*(?<article>[\w\W]+?)\s*<\/article>

Proof https://regex101.com/r/ahEFZt/1
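
To see what the two named groups capture, here is a minimal sketch that runs the same pattern outside Make, in TypeScript. The sample HTML and the helper name `extractTitleAndArticle` are made up for illustration, and the real page's markup may differ. Note that `<article[^>]+>` only matches an opening tag that has at least one attribute, and that `[^<]+` also captures whitespace before `</title>`, so the sketch trims the title.

```typescript
// Pattern from the reply above, written as a JavaScript regex literal.
// No "g" flag: we want a single match with its named groups.
const pattern =
  /<title>\s*(?<title>[^<]+)\s*<\/title>[\w\W]+?<article[^>]+>\s*(?<article>[\w\W]+?)\s*<\/article>/;

// Hypothetical sample input, only for illustration.
const sampleHtml = `
<html>
  <head><title> My Page </title></head>
  <body>
    <article class="post">
      <p>Hello <b>world</b></p>
    </article>
  </body>
</html>`;

// Hypothetical helper: returns both captures, or null when nothing matches.
function extractTitleAndArticle(
  html: string
): { title: string; article: string } | null {
  const match = pattern.exec(html);
  if (!match || !match.groups) return null;
  const { title = "", article = "" } = match.groups;
  // [^<]+ is greedy, so trailing whitespace before </title> is captured too.
  return { title: title.trim(), article };
}

const result = extractTitleAndArticle(sampleHtml);
console.log(result);
```

If the function returns null for your file, check whether the `<article>` tag in your HTML has attributes and whether `<title>` appears before it, since the pattern assumes both.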

Important Info

  • Warning: Global match must be set to NO!

For more information, see Text Parser in the Make Help Center:

Match Pattern
The Match pattern module enables you to find and extract string elements matching a search pattern from a given text. The search pattern is a regular expression (aka regex or regexp), which is a sequence of characters in which each character is either a metacharacter, having a special meaning, or a regular character that has a literal meaning.

Hope this helps! Let me know if there are any further questions or issues.

— @samliew

P.S.: Investing some effort into the Make Academy will save you lots of time and frustration using Make.

Thanks, @samliew. It worked just fine.

BTW @samliew, is there a way to also clean/remove HTML tags in the same process?
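
For anyone reading along, one possible approach is a second regex pass over the captured article text. The sketch below is in TypeScript and runs outside Make; the function name `stripTags` is hypothetical. It is a naive strip under stated assumptions: it does not decode entities such as `&amp;`, it does not remove the contents of `<script>` or `<style>` elements, and replacing tags with a space will split a word that has a tag in the middle of it.

```typescript
// Hypothetical helper: naive regex-based tag removal, not a full HTML parser.
function stripTags(html: string): string {
  return html
    .replace(/<[^>]*>/g, " ") // replace every tag with a single space
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim();
}

console.log(stripTags("<p>Hello <b>world</b></p>"));
```

In a Make scenario the equivalent would be a further text-processing step after the Match Pattern module, mapped to the captured article value.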

No problem, glad I could help!

1. If anyone has a new question in the future, please start a new thread. This makes it easier for others with the same problem to search for the answers to specific questions, and you are more likely to receive help since newer questions are monitored closely.

2. The Make Community guidelines encourage users to mark helpful replies as solutions to help keep the Community organized.

This marks the topic as solved, so that others can:

  • save time when catching up with the latest activity here, and
  • quickly jump to the solution if they come across the same problem

To do this, simply click the checkbox at the bottom of the post that answers your question.

3. Don’t forget to like and bookmark this topic so you can get back to it easily in future!

— @samliew