What is your goal?
Want to read OCR file
What is the problem?
Hi everyone, I have roughly the following flow:
The source is a PDF file, I read the content from it, create diagrams (something like SketchWow), send them for approval on Slack, and then publish on Instagram.
I have only one problem — actually, I’m stuck at the very beginning. I already have a ready OCR file on Google Drive, but I’m not really sure what to do next: which module should I add to read that OCR so that I can then reference the extracted text, for example in OpenAI, to generate a prompt. Because in the OpenAI modules themselves there is nowhere to map an OCR file from Google Drive — I couldn’t find such an option anywhere. As a result, the Compose String always returns a value with length 0, no matter what I use, and I’m not really sure how to handle or fix this.
Screenshots: scenario setup, module configuration, errors
1 Like
Hey there,
You mean you have a pdf you want to do OCR on?
There are plenty of apps to do that, just type pdf in the search. I personally use pdf.co for OCR.
If you insist on using chatgpt, then you need to upload the file first. But I suggest using a dedicated app built for doing pdf OCR than a generic llm.
No, Google Docs does the OCR automatically for me. The problem comes later — what should I do so that the OpenAI module, using that file (based on the OCR file), generates a specific prompt? Because everywhere in the OpenAI modules I don’t see any option to add or map that OCR file. It doesn’t necessarily have to be OpenAI — it could be something else — but I’d like it to generate a specific JSON based on the knowledge from the OCR file, which I can then export further to other tools or applications.
1 Like
Yeah its a separate module that uploads the file first and returns a file id. Then you use the file id in the prompt to reference it.