What Method and module to choose when working with Open AI GPT to analyse invoices?

Hi,
Starting with a project to get invoice information / data using AI and I have several questions:

  1. When using Create a Completion (GPT-3, GPT-3.5, GPT-4) - I see that I need to choose Select Method “Create a chat completion (GPT Models)” and / or “Create a Prompt completion” - When and why should I use each?

  2. I guess that I can convert my invoice PDF to image and sent it to the “Analyze image - vision” module, or maybe to convert it to text ans to use the “Create a Completion (GPT-3, GPT-3.5, GPT-4)” with a prompt to get the data or maybe to use the “Transform text to a structured data” and probably some other modules - any I dea / tip which one will work with such a task?

  3. Maybe I should try EDEN AI or gemini or any other such tool?

Regards,
Ram

Hello @Ram ,

I recently completed a project that achieved just that. Here’s what I did:

I utilized Nanonets to convert the PDF into text directly (without converting it into an image)—a reliable service. Additionally, I employed ChatGPT to create a chat completion using a GPT model with the role set as a user. In the message content provided to the model, I included a prompt to extract text from the previous module in JSON format (ensuring the desired JSON structure is met).

My approach involved requesting the generation of a JSON for each page and then consolidating these into a single variable.

Explaining this may sound somewhat complicated, but as the saying goes, a picture is worth a thousand words. Here is an illustration of my scenario.

4 Likes

Thanks @Rafael_Sanchez
I was also going in that direction.

You mentioned that you used “Chat completion using a GPT model with the role set as a user” - why didn’t you use a prompt? can you explain what is the difference between Chat completion and prompt?

Why are you using 2 GPT modules?

Regards,
Ram

at the end is a prompt, but for me using the role User in the module gave me better results.

2 Likes

Hi all,
For future readers:

When utilizing the OpenAI API, understanding the distinction between “Chat Completion” and “Prompt Completion” is crucial for effective text generation:

  1. Chat Completion:
  • Allows you to define various roles within the conversation:
    • System Instructions: You can provide general information and instructions to guide the model’s behavior and role within the conversation. These instructions help shape the context and expectations for the dialogue.
    • User Prompt/Request: You specify the user’s input or question that initiates the conversation. This sets the stage for the model’s response and directs the flow of the dialogue.
    • Assistant/GPT Previous Reply: You have the option to include previous responses from the assistant or GPT (if any) to maintain context continuity in the conversation. This helps create a more coherent and natural dialogue flow.
  1. Prompt Completion:
  • Focuses solely on the user’s prompt or request without specifying distinct roles for the conversation.
  • You provide the initial input or question from the user, and the model generates a response based solely on that prompt.
  • Unlike Chat Completion, there are no predefined roles for the system or assistant. The generated response is solely based on the provided prompt without additional contextual information.

In essence, Chat Completion offers more flexibility and control by allowing you to define the system’s role, incorporate previous assistant responses, and set the user’s prompt. On the other hand, Prompt Completion is simpler, focusing solely on generating a response based on the user’s input without specifying distinct roles within the conversation. Depending on your requirements and desired level of control, you can choose the appropriate method for your text generation needs.

6 Likes

Hello @Ram :wave:

I just want to quickly say thank you very much for sharing such a detailed overview of your findings with all of us here in the community. This is truly valuable and I’m sure it will simplify the lives of many who are looking for similar information. :pray:

FYI: I marked your last comment as a solution to keep the community organized and easy to look for answers.