How to erase or add empty characters with regex in text parser?

Hi Community,

I am using a text parser to split a list of products and their quantities into two values: (quantity) and (name).

This is the pattern I use:
(?<num>\d+)x (?<product>.+)

Now, I want to update the pattern in two ways to avoid grouping errors (because sometimes the data source is not perfect). I want to

  1. Erase empty characters before and after the product name.
  2. Make sure that in the product name, there is always an empty space before a number.

Example input (where I need to erase the space after the name):

  • ‘1x Coffee A 500g ’
  • ‘1x Coffee A 500g’

I am only using the quotation marks to show where the empty space is.

Example input (where I need to add space before the number):

  • 1x Coffee A 500g
  • 1x Coffee A500g

With the pattern above, “1” is the num-value and “Coffee A500g” the product-value.

I’ve tried to understand the RegEx 101, but it’s quite overwhelming. I hope somebody can help me.

You can change the greedy match to a lazy match.

(?<num>\d+)x\s+(?<product>.+?)\s*(?:\n|$)

Proof: https://regex101.com/r/b5NPy6/2
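
For anyone who wants to check the pattern outside regex101, here is a minimal Python sketch. It assumes Python's `re` module, which spells named groups `(?P<name>...)` instead of `(?<name>...)`; the helper name `parse_line` is made up for illustration. The multiline flag is enabled so that `$` also matches at the end of each line.

```python
import re

# Same pattern as above, rewritten with Python's (?P<name>...) named-group syntax.
# re.MULTILINE lets $ match at the end of every line, not only at the end of the input.
PATTERN = re.compile(r"(?P<num>\d+)x\s+(?P<product>.+?)\s*(?:\n|$)", re.MULTILINE)


def parse_line(line):
    """Return (num, product) for one order line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match.group("num"), match.group("product")


# The trailing space is left out of the product group.
print(parse_line("1x Coffee A 500g "))
```

The lazy `.+?` stops as early as it can, and the `\s*` after it takes any trailing whitespace, so the spaces never end up inside the product group.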

Hope this helps! Let me know if there are any further questions or issues.

@samliew

Adding the missing space has to be done separately. Regex only matches text; it does not insert new characters. You would need an extra replace step, which will cost you more operations.

The best solution is to sanitize the input before it reaches Make.
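
As a rough sketch of what that sanitizing could look like, assuming the data source can run a small Python step before sending the data on (the function name `sanitize_line` is hypothetical): trim the line, then insert a space wherever a letter is directly followed by a digit.

```python
import re


def sanitize_line(line):
    """Trim surrounding whitespace and put a space between a letter and a following digit."""
    trimmed = line.strip()
    # "A500g" becomes "A 500g"; "500g" is left alone because the digit comes first.
    return re.sub(r"([A-Za-z])(\d)", r"\1 \2", trimmed)


print(sanitize_line("1x Coffee A500g "))
```

Note that this rule is deliberately naive: it would also split names where a letter and a digit belong together (for example a product code like B12), so it only fits data where that does not occur.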

Thanks! I’ll work on sanitizing the input.