Get SRC URL from HTML tag with multiple urls inside

Experts, I need your help ASAP :slight_smile:

I’m trying to:

  1. Get the SRCs of all images from the HTML I have (my portfolio on Dribbble), based on filter criteria I created.
  2. Create an array of links to all images that pass the criteria.
  3. Send them to the Instagram module and create a Carousel post.

The problem I’m facing:
When I try to get a link from the srcSet attribute of one of the objects (an image in HTML format), I run into trouble, because this attribute holds several URLs at once.
Here is an example of the HTML code I get:

I want to get the URL with the largest image resolution from the URLs in the srcSet attribute.

So far I’ve tried to use this regex - <img[^>]?srcSet\s=\s*[“”‘]?([^’“” >]+?)[ ‘“”][^>]?href\s=\s*[“”’]?([^'“” >]+?)[ '“”][^>]*?>

But it didn’t help.

Please advise how to solve this riddle.

Thanks.

Welcome to the Make community!

When asking for help with creating a regex pattern for a text parser module, I strongly suggest you do not censor the text that you actually need to match.

Please also do not share screenshots of text (even partial text). Instead, copy and paste the text here as well so that we can run it against test patterns.

If you do not provide proper examples, you could be wasting our time as we have to guess your sample input. Not only that, you may not get the correct answer, or it may take several “guesses”.

Help us help you. Thanks

@samliew thanks for the reply.

  1. Here is the text that I get from the web page (web page link):
<img data-test="v-img" src="https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=752x" alt="" width="3200" height="2400" sizes="(max-width: 767px) 100vw, (max-width: 919px) calc(100vw - 32px), (max-width: 1278px) calc(100vw - 240px), 1024px" draggable="false" srcSet="https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=300x225 300w, https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=400x300 400w, https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=600x450 600w, https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=752x564 752w, https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=1024x768 1024w, https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=1200x900 1200w, https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=1504x1128 1504w, https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=2048x1536 2048w, https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=2400x1800 2400w" class="v-img content-block border-radius-8" data-v-5e868365>

I use an Iterator to get all images from the Text Parser module.
Then I use Set variable with filter conditions to keep only the images that I need for Instagram, that is, those that have a specific class.
Once I have these 2 or 3 images, I need to get the link and download each image as a file in order to upload it to the Instagram module. The IMG tag that I sent contains a bunch of links, and I’m trying to get the link with the largest image size (2048 or 2400).

Any ideas how to do it?

I think you don’t need to grab the srcset, since you can arbitrarily change the resize parameters to the size that you want, and it will always output the largest size it can, e.g.:

https://cdn.dribbble.com/userupload/14099353/file/original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=5000x
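As a rough illustration of that idea outside Make, the sketch below swaps the `resize` query parameter on the plain `src` URL. The helper name `with_resize` is hypothetical, and the sketch assumes the CDN keeps honoring arbitrary `resize` values as described above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_resize(url: str, size: str = "5000x") -> str:
    """Return url with its 'resize' query parameter replaced by size."""
    parts = urlsplit(url)
    # Keep every other query parameter, drop any existing 'resize'.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "resize"
    ]
    query.append(("resize", size))
    return urlunsplit(parts._replace(query=urlencode(query)))


src = (
    "https://cdn.dribbble.com/userupload/14099353/file/"
    "original-432bffd5a8a059dfee436b0653d1f13d.jpg?resize=752x"
)
print(with_resize(src))
```

Inside a scenario, a plain text replace on the `src` value should achieve the same result without any parsing of srcSet.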

If you still want to grab the last URL from the srcSet, you can use this regex

You can use a Text Parser “Match Pattern” module with this Pattern (regular expression):

<img[\w\W]+?srcSet="[^"]+(?<img>https?:\/\/[^" ]+)

Proof

https://regex101.com/r/Wah33f
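To check the pattern outside regex101, here is a minimal Python sketch. Two things in it are assumptions and not part of the original answer: the sample tag is shortened and uses a made-up `example.com` host, and the named group is written `(?P<img>...)` because Python's `re` module uses that spelling.

```python
import re

# Same pattern as above; only the named-group syntax is adapted for Python.
PATTERN = re.compile(r'<img[\w\W]+?srcSet="[^"]+(?P<img>https?://[^" ]+)')

# Shortened, hypothetical sample tag with two srcSet candidates.
html = (
    '<img src="https://example.com/a.jpg?resize=752x" '
    'srcSet="https://example.com/a.jpg?resize=300x225 300w, '
    'https://example.com/a.jpg?resize=2400x1800 2400w" class="v-img">'
)

match = PATTERN.search(html)
# The greedy [^"]+ runs to the end of the attribute and backtracks,
# so the group captures the last URL listed in srcSet.
last_url = match.group("img") if match else None
print(last_url)
```

Note that `[^"]+` must consume at least one character before the captured URL, so a srcSet that lists only a single URL would not match this pattern.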

Important Info

  • :warning: Global match must be set to YES!
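With Global match on, every `<img>` tag in the text yields a result, not only the first one. The sketch below mimics that with `re.finditer` and, as a variation, picks the candidate with the highest width descriptor, so it does not depend on the largest image being listed last. The pattern `SRCSET_RE`, the helper `largest_candidate`, and the `example.com` sample are hypothetical, and the comma split assumes the URLs themselves contain no commas.

```python
import re

# Capture the whole srcSet value of each <img> tag.
SRCSET_RE = re.compile(r'<img\b[^>]*?\bsrcSet="([^"]+)"', re.IGNORECASE)


def largest_candidate(srcset: str) -> str:
    """Return the URL with the highest width descriptor (e.g. '2400w')."""
    best_url, best_width = "", -1
    for candidate in srcset.split(","):
        parts = candidate.split()
        # Expect exactly "<url> <number>w"; skip anything else.
        if len(parts) == 2 and parts[1].endswith("w") and parts[1][:-1].isdigit():
            width = int(parts[1][:-1])
            if width > best_width:
                best_url, best_width = parts[0], width
    return best_url


# Hypothetical input with two tags; the first lists its largest image first.
html = (
    '<img src="https://example.com/a.jpg" '
    'srcSet="https://example.com/a.jpg?resize=2400x1800 2400w, '
    'https://example.com/a.jpg?resize=300x225 300w">'
    '<img src="https://example.com/b.jpg" '
    'srcSet="https://example.com/b.jpg?resize=400x300 400w, '
    'https://example.com/b.jpg?resize=1200x900 1200w">'
)

# finditer plays the role of "Global match: YES": one result per tag.
urls = [largest_candidate(m.group(1)) for m in SRCSET_RE.finditer(html)]
print(urls)
```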

For more information, see Text Parser in the Make Help Center:

Match Pattern
The Match pattern module enables you to find and extract string elements matching a search pattern from a given text. The search pattern is a regular expression (aka regex or regexp), which is a sequence of characters in which each character is either a metacharacter, having a special meaning, or a regular character that has a literal meaning.

Hope this helps!
