Multimodal Telegram Bot with OpenAI Vision using raw HTTP in Make

In the Make Academy API calls with HTTP modules exercise, the Telegram scenario uses multipart/form-data to upload an image to OpenAI’s /v1/files endpoint. The file upload succeeds, and the bot replies that the image was received—but no further processing or analysis happens.

That felt incomplete, so I extended the flow by replacing the file-upload step with a call to /v1/responses, passing the Telegram image as base64 image input so it could actually be interpreted. I also added a Router so non-image messages return a helpful response instead of ending the scenario.
I turned this into a small “fortune teller” bot: upload an image, and it returns a symbolic, creative interpretation.

1 Like