This tutorial shows how to extract text from an image programmatically using the IDOL OnDemand OCR API. The API is a web service, so you can call it from any programming language, such as PHP, Java, .NET, or Python. The examples here use PHP.
The IDOL OCR API quota is 5,000 requests per month. If you want to increase the quota, just follow the link.
IDOL supports the following image formats:
- TIFF
- JPEG
- PNG
- GIF
- BMP and ICO
- PBM, PGM, and PPM
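Before sending a file to the API, it can help to reject obviously unsupported files early. The following is only a minimal sketch: `hasSupportedExtension` is a hypothetical helper, and the extension spellings (for example `tif` and `jpg`) are my assumptions based on the format list above, not something taken from the API documentation. It checks the filename only, not the actual file contents.

```php
<?php
// Hypothetical helper: checks a filename's extension against the formats listed above.
// The extension spellings below are assumptions, not taken from the API documentation.
function hasSupportedExtension(string $fileName): bool
{
    $supported = array('tiff', 'tif', 'jpeg', 'jpg', 'png', 'gif', 'bmp', 'ico', 'pbm', 'pgm', 'ppm');
    // pathinfo() returns an empty string when the name has no extension
    $extension = strtolower(pathinfo($fileName, PATHINFO_EXTENSION));
    return in_array($extension, $supported, true);
}

var_dump(hasSupportedExtension('invoice.PNG')); // bool(true)
var_dump(hasSupportedExtension('notes.pdf'));   // bool(false)
```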
Follow these steps to extract text from an image.
Step 1) Sign up with IDOL OnDemand
Step 2) Get an API key from Manage API Keys
Step 3) Call the OCR API, which accepts public URLs and local files
3.1) Extract text from an image (URL)
To extract text from an image at a public URL, send a GET request in the following format:
https://api.idolondemand.com/1/api/sync/ocrdocument/v1?apikey=YOUR_KEY&url=IMAGE_URL&mode=document_photo
Supported image modes:
document_photo – For images taken with a mobile phone camera.
document_scan – For images taken using a flatbed scanner.
scene_photo – Use to recognize text in a scene, for example signs and billboards in a landscape.
subtitle – Use to recognize text superimposed on an image, such as TV subtitles.
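The GET request above can be assembled in PHP as a rough sketch. `buildOcrUrl` is a hypothetical helper name, and `http://example.com/receipt.png` is a placeholder image URL. The endpoint, parameter names, and mode values are the ones given in this tutorial.

```php
<?php
// Hypothetical helper: builds the OCR GET URL from the endpoint shown in this tutorial.
function buildOcrUrl(string $apiKey, string $imageUrl, string $mode = 'document_photo'): string
{
    $endpoint = 'https://api.idolondemand.com/1/api/sync/ocrdocument/v1';
    $modes = array('document_photo', 'document_scan', 'scene_photo', 'subtitle');
    if (!in_array($mode, $modes, true)) {
        throw new InvalidArgumentException("Unsupported mode: $mode");
    }
    // http_build_query() URL-encodes each value, including the image URL itself
    return $endpoint . '?' . http_build_query(array(
        'apikey' => $apiKey,
        'url'    => $imageUrl,
        'mode'   => $mode,
    ));
}

// Placeholder key and image URL, for illustration only
$requestUrl = buildOcrUrl('YOUR_KEY', 'http://example.com/receipt.png', 'document_scan');
echo $requestUrl, "\n";

// To actually send the request (needs a valid key and network access):
// $json = file_get_contents($requestUrl);
```

Encoding the image URL matters because it usually contains characters such as `:` and `/`, and often its own query string, which would otherwise be misread as part of the OCR request.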
3.2) Extract text from an image (local file)
To extract text from a local image, you need to POST multipart form data to the OCR web service.
The following HTML form does the job (extractPrivate.html):
<html>
<body>
  <form action="https://api.idolondemand.com/1/api/sync/ocrdocument/v1" method="post" enctype="multipart/form-data">
    <input type="hidden" name="apikey" value="PUT_YOUR_API_KEY" />
    <input type="hidden" name="mode" value="document_photo" />
    <input type="file" name="file" />
    <input type="submit" value="submit" />
  </form>
</body>
</html>
Sample response format:
{
  "text_block": [
    {
      "text": "EXTRACTED_TEXT_FROM_IMAGE_IS_HERE",
      "left": 0,
      "top": 0,
      "width": 762,
      "height": 1049
    }
  ]
}
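A response in this shape can be reduced to plain text with `json_decode`. This is a minimal sketch, assuming responses follow the sample format above. `extractOcrText` is a hypothetical helper, and joining multiple blocks with newlines is my own choice, not something the API prescribes.

```php
<?php
// Hypothetical helper: joins the "text" field of every block in an OCR response.
function extractOcrText(string $responseJson): string
{
    $data = json_decode($responseJson, true);
    // Return an empty string for invalid JSON or an unexpected structure
    if (!is_array($data) || !isset($data['text_block']) || !is_array($data['text_block'])) {
        return '';
    }
    $texts = array();
    foreach ($data['text_block'] as $block) {
        if (isset($block['text'])) {
            $texts[] = $block['text'];
        }
    }
    // Assumption: separate multiple blocks with a newline
    return implode("\n", $texts);
}

// The sample response from this tutorial
$sample = '{ "text_block": [ { "text": "EXTRACTED_TEXT_FROM_IMAGE_IS_HERE", "left": 0, "top": 0, "width": 762, "height": 1049 } ] }';
echo extractOcrText($sample), "\n";
```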
Note: The form above is suitable for personal use only. If the page is made public, anybody can see your API key in the page source.
To solve this, our website needs to act as a proxy:
1) The image is uploaded from the browser to our server (extract.html).
2) The text is extracted from the image on our server using the OCR API (extract_handler.php).
3) The response is returned to the browser.
The following code does the job.
HTML code (extract.html):
<html>
<body>
  <form action="extract_handler.php" method="post" enctype="multipart/form-data">
    <input type="file" name="file" />
    <input type="submit" value="submit" />
  </form>
</body>
</html>
We need to implement the extract handler (extract_handler.php), which handles the file upload and the text extraction using the OCR API.
PHP code:
<?php
$url = 'https://api.idolondemand.com/1/api/sync/ocrdocument/v1';
$output_dir = "uploads/";

if (isset($_FILES["file"]) && $_FILES["file"]["error"] === UPLOAD_ERR_OK) {
    // Unique filename. date('u') always yields 000000, so uniqid() is used instead;
    // basename() strips any path components sent by the client.
    $fileName = md5(uniqid('', true)) . basename($_FILES["file"]["name"]);

    // Move the file to the uploads folder
    if (!move_uploaded_file($_FILES["file"]["tmp_name"], $output_dir . $fileName)) {
        exit('Upload failed');
    }

    // Multipart form post using cURL
    $filePath = realpath($output_dir . $fileName);
    $post = array(
        'apikey' => 'PUT_YOUR_API_KEY', // keep your real key on the server only
        'mode'   => 'document_photo',
        // CURLFile (PHP 5.5+) replaces the old '@'.$filePath syntax, which no longer works in PHP 7
        'file'   => new CURLFile($filePath)
    );

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $result = curl_exec($ch);
    curl_close($ch);

    echo $result;

    /* // If you want to return only the text, use this instead.
    $json = json_decode($result, true);
    if ($json && isset($json['text_block'])) {
        $textblock = $json['text_block'][0];
        echo $textblock['text'];
    } */

    // Remove the file
    unlink($filePath);
}
?>