Text parser matching HTTP pattern not working with a specific website

Pamela1 · January 2, 2024, 10:41am

Hi,
from a Telegram Watch Updates module, I pass a message sent to Telegram to a text parser in order to extract a website URL.

It works with every website except this one: La farce de Poutine et Touadera.

This is what happens with all the other websites:

I got the input and the output.

This is what happens with this specific website:

No output.

I don’t understand why.

sachinkadam5 · January 2, 2024, 11:21am

Hi @Pamela1 ,
In the second screenshot you are missing “www.” after “https://”.
So if you add “www.” after “https://”, it will automatically fetch the whole website URL.

sachin.shrivastava · January 2, 2024, 11:25am

Add a dash (-) to the special characters in the pattern.
It will then run perfectly.

Pamela1 · January 2, 2024, 11:38am

Thanks @sachinkadam5, I already tried that, but the result was the same.

No output.

Pamela1 · January 2, 2024, 12:04pm

Thanks @sachin.shrivastava, it works!
Why is that? Is it a bug?
But now it will only catch URLs with “-” inside, right?

sachin.shrivastava · January 2, 2024, 2:26pm

The pattern matches a base URI like http://www.example.com, but the link you provided has a special character, a dash (-), in the base URL. So we have to add the dash to the special characters in the pattern.
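
To illustrate the explanation above, here is a minimal Python sketch. The actual pattern from the screenshots is not visible in this thread, so both regular expressions and both URLs below are assumptions made up for illustration, and the text parser module may differ in details. The sketch also suggests an answer to the follow-up question: adding “-” to a character class makes the dash allowed, not required, so URLs without a dash should still match.

```python
import re

# Hypothetical pattern: the hostname class allows letters, digits and dots only.
WITHOUT_DASH = re.compile(r"https?://[A-Za-z0-9.]+\.[a-z]{2,}(?:/\S*)?")

# Same hypothetical pattern, with "-" added to the hostname character class.
WITH_DASH = re.compile(r"https?://[A-Za-z0-9.-]+\.[a-z]{2,}(?:/\S*)?")


def extract_url(pattern, text):
    """Return the first URL matched by `pattern` in `text`, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


# Made-up messages: one URL has a dash in the hostname, the other does not.
hyphen_msg = "New article: https://my-site.example.com/some-article"
plain_msg = "Read this: https://www.example.com/page"

print(extract_url(WITHOUT_DASH, hyphen_msg))  # no match, the dash breaks the hostname
print(extract_url(WITH_DASH, hyphen_msg))     # the full URL is matched
print(extract_url(WITH_DASH, plain_msg))      # URLs without a dash still match
```

Under these assumptions, the first pattern stops at “my” because the next character is a dash, and since no dot follows, the whole match fails, which would explain the empty output.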

samliew · January 2, 2024, 2:47pm

Fun fact

The Base URI is also known as the hostname.

www.example.com

The full hostname with the protocol, is called the origin.

https://www.example.com

If you type location.hostname or location.origin into the browser console, these are the values you will get.

For more information, see Location - Web APIs | MDN
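
Outside the browser, the same two values can be approximated in Python with the standard library's urllib.parse. This is only a rough sketch: the URL is a made-up example, and building the origin from the scheme and netloc is a simplification that ignores edge cases such as default ports or credentials embedded in the URL.

```python
from urllib.parse import urlsplit

url = "https://www.example.com/some/page?x=1"  # hypothetical example URL
parts = urlsplit(url)

hostname = parts.hostname                    # comparable to location.hostname
origin = f"{parts.scheme}://{parts.netloc}"  # roughly comparable to location.origin

print(hostname)
print(origin)
```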
