Parsing Customer Data from Emails: Extracting Name and Address from HTML Content

Hello Community,

I’m having trouble parsing part of the data from incoming emails, specifically the client’s name and surname.

The issue is that after converting the HTML to text, I can’t reliably extract the desired text, because the name and surname may consist of 2 to 4 words.

Example of HTML to text conversion:

Auftragsnummer: 1112889199Auftragsdatum: 03.03.2025Bestellnummer: IT100315797000460ADAlcon Lieferscheinnummer: 1630130652Sendungsverfolgungsnummer: ZUGTMV0VVersanddienstleister: General Logistics SystemsLieferadressePflege Das GmbHMartin-Luther-Straße 1410777 BerlinDeutschlandLiebe Kundin, lieber Kunde,vielen Dank für Ihre Bestellung. Ihre Bestellung #1112889199 wurde versendet. Bitte überprüfen Sie die beigefügte Versandbestätigung.Mit freundlichen Grüßen,Alcon Customer Service**Dies ist eine automatisch generierte E-Mail. Bitte antworten Sie nicht darauf.**Wir senden Ihnen Versandaktualisierungen an die von Ihnen angegebene E-Mail-Adresse.  

I am particularly interested in this part:
Lieferadresse Pflege Das GmbH

Because there can be 1, 2, 3, or even 4 words after “Lieferadresse”, it’s difficult to define a fixed rule for extracting this text correctly.

I thought that parsing the data from HTML instead of text might be possible since the HTML structure provides elements to rely on, but I haven’t been successful so far.

Here’s an example of the HTML content:

(span style=“vertical-align:baseline;background-color:transparent;margin:0;padding:0;border:0;font-family:‘Arial Unicode MS’,sans-serif;font-size:10.00000pt;font-weight:bold;font-style:normal;color:#000000;text-decoration:none;” lang=“de-DE”>Lieferadresse

Pflege Das GmbH</div)

Can someone advise me on how I can parse the necessary data correctly?
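
One possible direction, sketched in Python rather than as a Make scenario: instead of flattening the whole email to text, collect each HTML text node separately and take the node that follows the `Lieferadresse` label. This assumes the label and the name sit in separate elements, which the mangled snippet above does not confirm. The `sample` markup and the names `TextNodeCollector` and `line_after_label` are made up for illustration.

```python
from html.parser import HTMLParser


class TextNodeCollector(HTMLParser):
    """Collects each non-empty text node as a separate list entry."""

    def __init__(self):
        super().__init__()
        self.nodes = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append(text)


def line_after_label(html, label="Lieferadresse"):
    """Return the first text node that follows the label, or None."""
    collector = TextNodeCollector()
    collector.feed(html)
    collector.close()
    for i, node in enumerate(collector.nodes):
        if node == label and i + 1 < len(collector.nodes):
            return collector.nodes[i + 1]
    return None


# Hypothetical markup, loosely modelled on the snippet in the question.
sample = (
    '<div><span lang="de-DE">Lieferadresse</span></div>'
    '<div>Pflege Das GmbH</div>'
    '<div>Martin-Luther-Straße 14</div>'
    '<div>10777 Berlin</div>'
)
print(line_after_label(sample))  # prints: Pflege Das GmbH
```

Because the name is taken as one whole text node, it does not matter whether it has 1 or 4 words. If the name were split by inline tags such as `<b>`, the nodes would need to be joined first.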

Welcome to the Make community!

1. This forum might have already changed your text

When pasting text into this forum, you should format the example text using the rich-text editor, otherwise the forum software might modify the displayed text, and you might get incorrect answers from others because of it.

Some things this forum software might do to mangle your text:

– remove extra spaces (which may be necessary)
– convert links to page titles (so the copied text is incorrect)
– join adjacent links incorrectly
– convert single and double quotes to smart (curly) quotes (“ ”)
– convert character sequences to emojis
– etc.

This interferes with you receiving correct answers, because it:

– makes JSON invalid (you can verify this by pasting it into https://jsonformatter.org)
– makes text examples inaccurate when we need to build a pattern for text parsing
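
As a small illustration of the first point, here is a Python sketch showing that the same JSON stops parsing once its straight quotes have been replaced by smart quotes. The `is_valid_json` helper and the sample strings are made up for this example.

```python
import json


def is_valid_json(text):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Same content, once with straight quotes and once with smart quotes.
straight = '{"name": "Pflege Das GmbH"}'
curly = "{“name”: “Pflege Das GmbH”}"

print(is_valid_json(straight))  # prints: True
print(is_valid_json(curly))     # prints: False
```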

2. To prevent this in future, please format text in code blocks

These are the two ways to format text so that it won’t be modified by the forum:

  • Method 1: Type code block manually

    Add three backticks ``` before and after the content/bundle, like this:

    ```
    content goes here
    ```

  • Method 2: Highlight the text and click the format button in the editor

3. You might need to re-copy the original text

Once the post has been submitted, it’s too late to format it because the text has already been altered. You need to copy the original text again and format it before submitting the forum post.

Please let us know once you have corrected the issue. This will avoid others potentially providing wrong answers based on incorrect text in your question.


When reaching out for assistance with extracting text, it would be super helpful if you could share the actual text you’re trying to match. Screenshots of text can be a bit tricky, so if you could copy and paste the text directly here, that would be awesome! It ensures we can run it against test patterns effectively. If there’s any sensitive info, feel free to change it to something fictional yet still valid by keeping the format intact.
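
If it helps, one simple way to hide sensitive numbers while keeping the format intact is to swap every digit for another digit and leave everything else alone. This is only a Python sketch under that assumption. The `scramble_digits` name is made up, and the result keeps the shape of the text but is not guaranteed to be a real order number or postal code.

```python
import random
import re


def scramble_digits(text, seed=0):
    """Replace every digit with a pseudo-random digit; keep all other characters."""
    rng = random.Random(seed)
    return re.sub(r"\d", lambda match: str(rng.randrange(10)), text)


original = "Auftragsnummer: 1112889199"
masked = scramble_digits(original)
print(masked)  # same label, same number of digits, different digits
```

Names and addresses still need to be replaced by hand with fictional ones of a similar shape.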

Providing clear text examples saves time on both ends and makes it more likely that you get a correct answer quickly. Without proper examples, we might end up playing a guessing game, and nobody wants that because it wastes time. So, help us help you by sharing those text snippets.

Please format the example text in a code block, using one of the two methods described above, to preserve line breaks and special characters.

Hope this helps! Let me know if there are any further questions or issues.

@samliew

P.S.: Investing some effort into the Make Academy will save you lots of time and frustration using Make.