Issues with Extracting Data from Encrypted PDF Files using OpenAI Assistant in Make Automation

Hello everyone,

I’m working on an automation scenario in Make that involves processing emails with PDF attachments. The goal is to extract specific information from these PDFs and save the details to Google Sheets while also renaming the files based on their content. Here’s the flow of the process:

  1. Trigger on new email with attachments: Fetch emails with attachments from a specific folder in Gmail.
  2. Feed attachments: Process each attachment.
  3. Upload file to OpenAI: Send the PDF to OpenAI Assistant for processing.
  4. Message Assistant: Ask OpenAI Assistant to extract specific information from the PDF.
  5. Upload file to Google Drive: Save the renamed file to Google Drive.
  6. Add row to Google Sheets: Save the extracted information into a Google Sheets document.

The issue I’m encountering is that the PDFs I receive are encrypted, which prevents the OpenAI Assistant from extracting data properly. There are no errors in the process itself, but the Assistant is not able to retrieve the data from the encrypted PDFs.

Here is the blueprint of my scenario:

{
    "name": "Save Gmail emails and attachments to Google Sheets and Google Drive",
    "flow": [
        {
            "id": 1,
            "module": "google-email:TriggerNewEmail",
            "parameters": {
                "folder": "Invoices/",
                "xGmRaw": "has:attachment",
                "account": --------,
                "markSeen": false,
                "maxResults": 3,
                "searchType": "gmail"
            }
        },
        {
            "id": 2,
            "module": "google-email:FeedAttachments",
            "parameters": {},
            "mapper": {
                "array": "{{1.attachments}}"
            }
        },
        {
            "id": 11,
            "module": "openai-gpt-3:uploadFile",
            "parameters": {
                "__IMTCONN__": ---------,
                "purpose": "assistants",
                "fileData": "{{2.data}}",
                "fileName": "{{2.fileName}}"
            }
        },
        {
            "id": 12,
            "module": "openai-gpt-3:messageAssistantAdvanced",
            "parameters": {
                "__IMTCONN__": -----------,
                "role": "assistant",
                "model": "gpt-4o",
                "tools": ["file_search", "code_interpreter"],
                "message": "{{11.id}}",
                "assistantId": "--------------------",
                "instructions": }
        },
        {
            "id": 3,
            "module": "google-drive:uploadAFile",
            "parameters": {
                "__IMTCONN__": -----------,
                "data": "{{12.result.file_data}}",
                "select": "value",
                "convert": false,
                "filename": "{{12.result.file_name}}",
                "folderId": "-----------------------------",
                "destination": "drive"
            }
        },
        {
            "id": 4,
            "module": "google-sheets:addRow",
            "parameters": {
                "__IMTCONN__": ----------,
                "mode": "fromAll",
                "values": {
                    "0": "{{formatDate(1.date; \"DD.MM.YYYY\")}}",
                    "1": "{{1.subject}}{{1.from.name}}{{1.from.address}}",
                    "2": "{{1.text}}",
                    "3": " {{2.fileName}}"
                },
                "sheetId": "{{3.name}}",
                "spreadsheetId": "/{{3.name}}{{2.data}}",
                "includesHeaders": true,
                "insertDataOption": "INSERT_ROWS",
                "valueInputOption": "USER_ENTERED",
                "insertUnformatted": false
            }
        }
    ]
}

Problem:
The uploaded PDFs are encrypted, and as a result the OpenAI Assistant cannot retrieve the data properly. The process completes without any errors, but the Assistant does not receive the correct data from the file.

*The PDF files are in Hebrew (GPT has no problem reading Hebrew, but other tools might).

Has anyone faced similar issues or could suggest a way to decrypt the PDFs before processing them, or an alternative approach to extract the data from these encrypted PDFs effectively?

Thank you for your assistance!

Welcome to the Make community!

Do you have the password to decrypt the PDFs?

samliewrequest private consultation

Join the Make Fans Discord server to chat with other makers!

Thank you. The file isn’t encrypted with a password. I found out it’s encrypted after following the data path: the data passed from the mail attachment looks like a sequence of numbers, like raw bits.
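A quick way to tell real encryption apart from ordinary binary file data (which Make often displays as a buffer of byte values) is to inspect the raw attachment bytes outside of Make. The sketch below is only a heuristic, and `inspect_pdf_bytes` is a hypothetical helper name, not part of Make or OpenAI. It assumes you have saved the attachment to disk or otherwise have its bytes. A PDF normally starts with `%PDF-`, and an encrypted PDF has an `/Encrypt` entry in its trailer, so finding neither sign of trouble suggests the file is a normal PDF and the problem lies elsewhere in the scenario.

```python
def inspect_pdf_bytes(data: bytes) -> dict:
    """Heuristic look at raw attachment bytes (hypothetical helper).

    This does not parse the PDF; it only searches for well-known markers,
    so treat the result as a hint rather than a definitive answer.
    """
    head = data[:1024]
    return {
        # A PDF file carries the "%PDF-" header near the start of the file.
        "looks_like_pdf": b"%PDF-" in head,
        # "JVBERi" is how "%PDF" looks when the file is base64 text
        # instead of raw binary.
        "looks_like_base64_pdf": data.lstrip().startswith(b"JVBERi"),
        # Encrypted PDFs reference an encryption dictionary via /Encrypt.
        # The name could also appear by coincidence, so this is a hint only.
        "has_encrypt_entry": b"/Encrypt" in data,
    }


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py path/to/attachment.pdf
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as fh:
            print(inspect_pdf_bytes(fh.read()))
```

If `looks_like_pdf` is true and `has_encrypt_entry` is false, the attachment is most likely not encrypted, and the numbers seen in the data path are probably just the file's bytes.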

You can use the CloudConvert module to convert the PDF to text, which can be easily fed into the OpenAI module then.

It doesn’t work… I’ve tried many tools and nothing has helped so far. I don’t even know where the problem is, because the process doesn’t fail or send an error, but the Assistant doesn’t get the file.

If the issue is about uploading files to the GPT module, you should take a look at this thread: How to PDF into openAI (Solution!)
