I’m not sure if this post complies with the community guidelines. Apologies if it isn’t.
Our clients often send us photos of documents we need from them. These are taken with their mobile phone camera, and aren’t great, to say the least. Anyone aware of a solution similar to CamScanner that we can implement into a scenario?
Do you want to extract the text from the given photos? Do you want to detect objects? Faces?
All of this is possible with Make
How do you receive those images? Through a form? Through WhatsApp?..
The photos we get sometimes have the wrong orientation (landscape), are sometimes creased, and are almost never cropped properly. All of this is solved in one go when using CamScanner and such. Here’s a short video about it (also attached below).
However, the API pricing for these services is crazy high.
We don’t want to extract text, but perhaps it could be a good idea for future developments (pending resolution of legal issues related to privacy and such).
That’s the easy part. Once we have an image enhancer engine, we can push photos into it from any source using the magic of Make.
Ah okay, I see!
So yeah, I haven’t done it so I’m just sharing my approach to challenges like this one.
a) As you said, there are solutions out there you could buy. I saw one for ~2000€/year, which might be quite expensive.
b) For all of this AI-Text/Speech/Analyse stuff I always try Google. I found two services which might work but I am not sure.
My favorite guess: Detect multiple objects | Cloud Vision API | Google Cloud
After localizing the object, you would then crop the image to the recognized boundaries. Once you’re there, you might as well go on and scan the image for text using their OCR feature.
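To make the cropping step a bit more concrete, here is a minimal sketch in Python. I haven't run this against the real API, so treat it as an assumption-laden illustration: it assumes the object localization response gives you a bounding polygon as normalized vertices (x and y between 0 and 1), and the function name `crop_box_from_normalized_vertices` and the `margin` parameter are made up for this example.

```python
import math
from typing import Iterable, Tuple


def crop_box_from_normalized_vertices(
    vertices: Iterable[Tuple[float, float]],
    image_width: int,
    image_height: int,
    margin: float = 0.0,
) -> Tuple[int, int, int, int]:
    """Turn normalized polygon vertices into a pixel crop box.

    vertices: (x, y) pairs in the 0..1 range (assumed response format).
    margin:   hypothetical extra padding, as a fraction of the image size.
    Returns (left, top, right, bottom) in pixels.
    """
    points = list(vertices)
    if not points:
        raise ValueError("at least one vertex is required")

    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    # Grow the box by the margin, then clamp it to the image bounds.
    left = max(0.0, min(xs) - margin)
    top = max(0.0, min(ys) - margin)
    right = min(1.0, max(xs) + margin)
    bottom = min(1.0, max(ys) + margin)

    # Round outwards so the crop never cuts into the detected object.
    return (
        math.floor(left * image_width),
        math.floor(top * image_height),
        math.ceil(right * image_width),
        math.ceil(bottom * image_height),
    )


if __name__ == "__main__":
    document_polygon = [(0.25, 0.125), (0.75, 0.125), (0.75, 0.5), (0.25, 0.5)]
    print(crop_box_from_normalized_vertices(document_polygon, 800, 1600))
```

The result is in (left, top, right, bottom) order, which is the order most image libraries expect for a crop box (Pillow's `Image.crop`, for example), so it could be handed to whatever does the actual cropping. Note that this only gives an axis-aligned crop; it does nothing about creases or perspective.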
This sounds like quite a complex workflow, to be honest. That’s probably why the other services are so expensive…
Hope it helps though!