Parse CSV from partway into the file (omit first few lines)

I am a still learning the intricacies to Make (Integromat), so this may be a dumb question.
I am parsing data from .csv files that are saved in OneDrive and adding the data to the bottom of a table in Microsoft 365 excel.
These csv files are exports from various institutions.
One of the files is creating a problem due to the additon of three information lines before the header line for the data.
I need to start the CSV Parse on the fourth line of the file since that is where the header information is located. Otherwise, the CVS Parse module generates an error because the first line is not a header hence doesn’t have the correct number of columns for the data.
I can always remove the lines from the CSV file using a text editor, but would really like to save a step if the automation can do this for me.
I have tried adding a Text Parser prior to the CSV module; I thought this would work since the first two lines are always the same, but the third line changes from export to export.
I have looked at various funcions (i.e. slice) but the data type is ‘buffer’ and not ‘array’ so the functions didn’t work.
Can someone point me in the right direction of where to search to learn how to handle this scenario?
Thanks in advance.

A Text Parser “Replace” module is what you need.

Could you share some examples of the first 4-5 lines of the CSV?

You can paste the formatted text in this forum:

  • Either add three backticks ``` before and after the text, like this:

    ```
    content goes here
    ```

  • Or use the format code button in the editor:
    Screenshot_2023-10-02_191027

This will allow others to better assist you. Thanks!

2 Likes

I did try the Text Parser [Replace], but anything I tried seemed to only replace the searched text. I was using {{empthystring}} as the replacement. So it was looking like I would have to search the entire string, which is problematic since the third line contains a date range which will change each month.
Below is the first five lines of the CSV file, somewhat redacted by replacing some identifiers with ‘x’. You should find the header info as the fourth line down.

Account Name : Share Draft Account,,,,,,,,
Account Number : xxxxxxxxx,,,,,,,,
Date Range : 01/01/2024-01/30/2024,,,,,,,,
Transaction Number,Date,Description,Memo,Amount Debit,Amount Credit,Balance,Check Number,Fees  
20240131000000[-5:EST]*-300.00*42**Online Banking Withdrawal,1/31/2024,Online Banking Withdrawal,Transfer 0030: Internet Access 01/31/2024 15:03 893502:,-300,,968.87,,
20240125000000[-5:EST]*-500.00*42**ACH Withdrawal,1/25/2024,ACH Withdrawal,CRDT CD 000000 ONLINE PMT 240125 12114,-500,,1268.87,,
20240125000000[-5:EST]*-241.60*42**ACH Withdrawal,1/25/2024,ACH Withdrawal,xxx Payment 012424 03110,-241.6,,1768.87,,
20240123000000[-5:EST]*92.17*542**Shared branch Deposit,1/23/2024,Shared branch Deposit,xxx FCU 6011 (2024-01-23) Mobile Deposit 0009,,92.17,2010.47,,
20240123000000[-5:EST]*146.19*542**Shared branch Deposit,1/22/2024,Shared branch Deposit,xxx FCU,,146.19,1918.3,,

I have to append more to the previous since I noticed after I uploaded that the filters were in the wrong place. After putting them after the router I reran and noticed that the first line of the CSV was removed, and the rest of the data was sent to the spreadsheet.
In the Text Parser [Replace], I am searching for " : " (without the quotes) and replacing with {{emptystring}}.
I thought that the data would be iterated through the parser, but apparently not. Do I need some other module to for that text parser to iterate through the data before sending it to the spreadsheet?

You can use a Text Parser “Replace” module with this Pattern (regular expression), to match and remove the first three lines, with {{emptystring}}

^.+\n.+\n.+\n

Proof

https://regex101.com/r/AeqIBJ

Important Info

  • :warning: Global match must be set to NO!
  • :warning: Multiline must be set to NO!

Screenshot

Screenshot_2024-03-01_100339

Output


For more information, see Text Parser in the Make Help Center:

Match Pattern
The Match pattern module enables you to find and extract string elements matching a search pattern from a given text. The search pattern is a regular expression (aka regex or regexp), which is a sequence of characters in which each character is either a metacharacter, having a special meaning, or a regular character that has a literal meaning.

Hope this helps!

2 Likes

You can copy and paste this module export into your scenario. This will paste the modules shown in my screenshots above.

  1. Copy the code below by clicking the copy button when you mouseover the top-right of the code block
    Screenshot_2024-01-17_200117

  2. Enter your scenario editor. Press ESC to close any dialogs. Press CTRLV to paste in the canvas.

  3. Click on each imported module and save it. You may need to remap some variables.

Modules JSON Export

{
    "subflows": [
        {
            "flow": [
                {
                    "id": 22,
                    "module": "regexp:Replace",
                    "version": 1,
                    "parameters": {},
                    "mapper": {
                        "pattern": "^.+\\n.+\\n.+\\n",
                        "value": "{{emptystring}}",
                        "global": false,
                        "sensitive": true,
                        "multiline": false,
                        "singleline": false,
                        "text": "{{21.value}}"
                    },
                    "metadata": {
                        "designer": {
                            "x": 605,
                            "y": -278
                        },
                        "restore": {
                            "expect": {
                                "global": {
                                    "mode": "chose"
                                },
                                "sensitive": {
                                    "mode": "chose"
                                },
                                "multiline": {
                                    "mode": "chose"
                                },
                                "singleline": {
                                    "mode": "chose"
                                }
                            }
                        },
                        "expect": [
                            {
                                "name": "pattern",
                                "type": "text",
                                "label": "Pattern",
                                "required": true
                            },
                            {
                                "name": "value",
                                "type": "text",
                                "label": "New value"
                            },
                            {
                                "name": "global",
                                "type": "boolean",
                                "label": "Global match",
                                "required": true
                            },
                            {
                                "name": "sensitive",
                                "type": "boolean",
                                "label": "Case sensitive",
                                "required": true
                            },
                            {
                                "name": "multiline",
                                "type": "boolean",
                                "label": "Multiline",
                                "required": true
                            },
                            {
                                "name": "singleline",
                                "type": "boolean",
                                "label": "Singleline",
                                "required": true
                            },
                            {
                                "name": "text",
                                "type": "text",
                                "label": "Text"
                            }
                        ]
                    }
                },
                {
                    "id": 23,
                    "module": "csv:ParseCSV",
                    "version": 1,
                    "parameters": {
                        "colCount": 9,
                        "csvContainsHeaders": true,
                        "delimiterType": "other",
                        "relax": false,
                        "delimiter": ","
                    },
                    "mapper": {
                        "csv": "{{22.text}}"
                    },
                    "metadata": {
                        "designer": {
                            "x": 849,
                            "y": -280,
                            "messages": [
                                {
                                    "category": "last",
                                    "severity": "warning",
                                    "message": "A transformer should not be the last module in the route."
                                }
                            ]
                        },
                        "restore": {
                            "parameters": {
                                "delimiterType": {
                                    "label": "Other"
                                }
                            }
                        },
                        "parameters": [
                            {
                                "name": "colCount",
                                "type": "number",
                                "label": "Number of columns",
                                "required": true
                            },
                            {
                                "name": "csvContainsHeaders",
                                "type": "boolean",
                                "label": "CSV contains headers",
                                "required": true
                            },
                            {
                                "name": "delimiterType",
                                "type": "select",
                                "label": "Delimiter",
                                "required": true,
                                "validate": {
                                    "enum": [
                                        ",",
                                        "\t",
                                        "other"
                                    ]
                                }
                            },
                            {
                                "name": "relax",
                                "type": "boolean",
                                "label": "Preserve quotes inside unquoted field",
                                "required": true
                            },
                            {
                                "name": "delimiter",
                                "type": "text",
                                "label": "Delimiter character",
                                "validate": {
                                    "max": 1,
                                    "min": 1
                                },
                                "required": true
                            }
                        ],
                        "expect": [
                            {
                                "name": "csv",
                                "type": "text",
                                "label": "CSV",
                                "required": true
                            }
                        ],
                        "interface": [
                            {
                                "name": "col1",
                                "label": "Transaction Number",
                                "type": "text"
                            },
                            {
                                "name": "col2",
                                "label": "Date",
                                "type": "text"
                            },
                            {
                                "name": "col3",
                                "label": "Description",
                                "type": "text"
                            },
                            {
                                "name": "col4",
                                "label": "Memo",
                                "type": "text"
                            },
                            {
                                "name": "col5",
                                "label": "Amount Debit",
                                "type": "text"
                            },
                            {
                                "name": "col6",
                                "label": "Amount Credit",
                                "type": "text"
                            },
                            {
                                "name": "col7",
                                "label": "Balance",
                                "type": "text"
                            },
                            {
                                "name": "col8",
                                "label": "Check Number",
                                "type": "text"
                            },
                            {
                                "name": "col9",
                                "label": "Fees  ",
                                "type": "text"
                            }
                        ]
                    }
                }
            ]
        }
    ],
    "metadata": {
        "version": 1
    }
}
2 Likes

First, thank you so much for the code and the link. The link is going to be very helpful to me as a learning tool for regex.
Second, I see what some of my issue was. I was approaching the data file on a line-by-line basis. I never considered looking at is as a complete block of data. If I had, I might have considered filtering on the line feeds (newline) characters - or maybe not.
Again, thanks for all of the help and extras to help me learn more about this tool.

2 Likes

Seems like I am still in a daze here.

  • I modified my regex search pattern to match the one you suggested.
  • Using the regex101, I tested what I typed in the regex101 on the exact data from my csv and got the same result as you did.
  • I then ran the scenario again as modified, but the spreadsheet still had the top three lines included.
  • I noticed that your input to the Text parser was “Text”, while my is “Data”. Though I believe that both are strings, I still tried the function toString([Data]), but as somewhat expected get the same result.
  • I rebooted the laptop to clear any potential cache, but that didn’t change anything.
  • I am not sure why, but it seems like my Text parser is not doing anything.
    I am still trying to learn, so what should I look for to help me diagnose this?

Text parser module parameters:
image

Input Bundle to the Text parser module:
input

Output Bundle to the Text parser module:
output

You’ll have to upload your module or scenario export here, or take a full screenshot of the fields (currently collapsed)

2 Likes

Okay, please see attached below. Also, I collapsed the fields because I thought it would help save screen space on the reply and I thought it would be enough that the values appear to be shown in the collapsed version.
blueprint.json (111.2 KB)

It’s been a week since I sent the export, but it looks like I haven’t gotten a response yet.
I understand that people are busy and have their own issues to work on.
I am still trying to learn, so any suggestions that someone can offer will be appreciated.
Thanks in advance.

I ran just the text parser from your scenario export with the input you provided above, and it works, so it’s likely something to do with your data variable from OneDrive.

2 Likes

I have two OneDrive modules, one to watch the files for updates, and the next one to download the found files. There can be as many as three updated CSV files, hence the query search string in the WatchFiles module.
I assumed that I want the FileID variable for the DownloadFile module, but I see that the FileName variable is also there, but that one doesn’t make sense as I assume it is only passing the name of the file, and not the data within.
What I am confused about is that fact that each line in the CSV files data passes to the spreadsheet. It’s as if my text parser is not working, yet when others parse the same data it works. I am very perplexed.

Can you verify that the Data contains CSV text or is it binary data? Can you provide the output bundle of both OneDrive modules?

Please provide the input and output bundles of the modules by running the scenario (or get from the scenario History tab), then click the white speech bubble on the top-right of each module and select “Download input/output bundles”.
Screenshot_2023-10-06_141025

A.

Save each bundle contents in your text editor as a bundle.txt file, and upload it here into this discussion thread.

Uploading them here will look like this:

module-1-input-bundle.txt (12.3 KB)
module-1-output-bundle.txt (12.3 KB)

B.

If you are unable to upload files on this forum, alternatively you can paste the formatted bundles in this manner:

  • Either add three backticks ``` before and after the code, like this:

    ```
    input/output bundle content goes here
    ```

  • Or use the format code button in the editor:
    Screenshot_2023-10-02_191027

Providing the input/output bundles will allow others to replicate what is going on in the scenario even if they do not use the external service.

If you think this is a bug, directly contacting support can often lead to a faster resolution. They have access to your specific account details, scenario and scenario logs, server-side logs, and internal tools and resources, which allows them to investigate more thoroughly than what you have access to. Additionally, sharing sensitive information about your situation might not be suitable for an open forum discussion.

You can open a new ticket, or if you are unable to login for some reason, you also can reach support using the contact form on the website.

If you manage to get your issue resolved with support, we’d still love to hear about it! Sharing your solution on the forum can help others facing similar problems.

2 Likes

For the time being, I am not going to upload the bundles due to proprietary information.
However, I was able to determine that the Data variable contains binary data.
Based on your question, Can I assume that this is important and that I probably need another module?
Thanks for your feedback - I am learning a lot here.

1 Like