Which is the better option for analyzing and extracting information from a PDF?

I have two scenarios for processing PDF files and I’m looking for the best option. In the first scenario, I download my PDF file and upload it to OpenAI using an “Assistant” for processing. Based on what I indicate, the assistant provides me with a response.

In the second scenario, I download the file and use PDF.co to extract information, which I then pass to a conversation in OpenAI, but this time without using the “Assistant”.

I’d like to ask: Which of the two scenarios is better for analyzing and extracting information from the PDF? Is there a clear advantage of one approach over the other in terms of accuracy, efficiency, or ease of use?

Hi @Mark10,

Thank you for your question. I doubt anyone can give you an accurate answer, this is a situation of experience.

I’d say that extracting information from the PDF and parse it first will give you control over what is sent to OpenAI. If you send the PDF file as a whole, you depend on OpenAI to interpret it correctly. But there is also something to say about the amount of operations.

Cheers,
Henk

1 Like