In this tutorial, I will cover how to run speech to text on video or sound files programmatically using the IDOL OnDemand Recognize Speech API. IDOL OnDemand is a web API platform, so we could use any language to send the HTTP requests needed for speech to text. We will be using PHP in this example.
Requirements:
- Sign up with IDOL OnDemand
- Get an API key from Account -> Tools -> Manage API Keys
- Note that the Recognize Speech API accepts both public URLs and local files
Speech to Text from Video/Recording (URL)
If you want to extract text from a video or recording at a public URL, you need to send a GET request in the following format:
https://api.idolondemand.com/1/api/async/recognizespeech/v1?apikey=YOUR_KEY&url=VIDEO_URL&language=LANGUAGE
Supported languages:
* de-DE – German
* en-US – US English
* en-GB – British English
* es-ES – European Spanish
* fr-FR – French
* it-IT – Italian
* zh-CN – Mandarin
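As a minimal sketch, the request URL above can be assembled in PHP with http_build_query, which takes care of URL-encoding the video address. The function name buildRecognizeSpeechUrl and the SUPPORTED_LANGUAGES constant are my own illustrative names, not part of the API, and the whitelist only mirrors the list in this tutorial.

```php
<?php
// Languages listed in this tutorial; the live API may accept others.
const SUPPORTED_LANGUAGES = ['de-DE', 'en-US', 'en-GB', 'es-ES', 'fr-FR', 'it-IT', 'zh-CN'];

// Hypothetical helper: builds the async Recognize Speech request URL.
function buildRecognizeSpeechUrl(string $apikey, string $mediaUrl, string $language): string
{
    if (!in_array($language, SUPPORTED_LANGUAGES, true)) {
        throw new InvalidArgumentException("Unsupported language: $language");
    }
    // http_build_query URL-encodes each value, including the media URL.
    $query = http_build_query([
        'apikey'   => $apikey,
        'url'      => $mediaUrl,
        'language' => $language,
    ]);
    return 'https://api.idolondemand.com/1/api/async/recognizespeech/v1?' . $query;
}

echo buildRecognizeSpeechUrl(
    'YOUR_KEY',
    'https://www.idolondemand.com/sample-content/videos/hpnext.mp4',
    'en-US'
), "\n";
```

Checking the language code before sending the request is optional, but it catches typos such as "en-DE" locally instead of waiting for the API to reject them.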
The Recognize Speech API only accepts asynchronous requests, as we can see from the “async” in the URL. Instead of returning the text directly, it returns a jobID that we use to fetch the result later.
For example, this request:
https://api.idolondemand.com/1/api/async/recognizespeech/v1?url=https%3A%2F%2Fwww.idolondemand.com%2Fsample-content%2Fvideos%2Fhpnext.mp4&apikey=APIKEY
will return a jobID:
{ "jobID": "usw3p_7175b0a9-3c3c-40ea-aaa3-7c16e2e92418" }
We can use that jobID with the job API to fetch the result:
https://api.idolondemand.com/1/job/result/usw3p_7175b0a9-3c3c-40ea-aaa3-7c16e2e92418?apikey=APIKEY
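The two steps above (read the jobID from the response, then build the job result URL) could be sketched in PHP as follows. The helper names extractJobId and buildJobResultUrl are hypothetical, and the sketch works on the response body as a string, so it does not depend on how you send the request.

```php
<?php
// Hypothetical helper: pulls the jobID out of the async response body.
// Returns null when the body is not JSON or has no jobID field.
function extractJobId(string $responseBody): ?string
{
    $json = json_decode($responseBody, true);
    return (is_array($json) && isset($json['jobID'])) ? (string) $json['jobID'] : null;
}

// Hypothetical helper: builds the job result URL for a given jobID.
function buildJobResultUrl(string $jobId, string $apikey): string
{
    return 'https://api.idolondemand.com/1/job/result/' . rawurlencode($jobId)
        . '?' . http_build_query(['apikey' => $apikey]);
}

$body  = '{ "jobID": "usw3p_7175b0a9-3c3c-40ea-aaa3-7c16e2e92418" }';
$jobId = extractJobId($body);
if ($jobId !== null) {
    echo buildJobResultUrl($jobId, 'APIKEY'), "\n";
}
```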
The output will look like this:
{
  "actions": [
    {
      "result": {
        "document": [
          {
            "content": "we want to hear from you let's get the conversation started about what's next for Hewlett Packard this is HP next this source"
          }
        ]
      },
      "status": "finished",
      "action": "recognizespeech",
      "version": "v1"
    }
  ],
  "jobID": "usw3p_7175b0a9-3c3c-40ea-aaa3-7c16e2e92418",
  "status": "finished"
}
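To get at the recognized text itself, we can walk the decoded JSON and collect every "content" field. This is only a sketch based on the sample output above; extractTranscript is a hypothetical helper name, and I am assuming other responses follow the same actions -> result -> document -> content layout.

```php
<?php
// Hypothetical helper: joins the "content" of every document in every action.
// Missing keys are skipped, so an unexpected response yields an empty string.
function extractTranscript(array $jobResult): string
{
    $parts = [];
    foreach ($jobResult['actions'] ?? [] as $action) {
        foreach ($action['result']['document'] ?? [] as $document) {
            if (isset($document['content'])) {
                $parts[] = trim($document['content']);
            }
        }
    }
    return implode(' ', $parts);
}

// Sample response taken from the output shown above.
$sample = <<<'JSON'
{ "actions": [ { "result": { "document": [ { "content": "we want to hear from you let's get the conversation started about what's next for Hewlett Packard this is HP next this source" } ] }, "status": "finished", "action": "recognizespeech", "version": "v1" } ], "jobID": "usw3p_7175b0a9-3c3c-40ea-aaa3-7c16e2e92418", "status": "finished" }
JSON;

echo extractTranscript(json_decode($sample, true)), "\n";
```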
Speech to Text from Video/Recording (LOCAL FILE)
If you want to extract text from a local video or recording, you need to POST multipart form data to the Recognize Speech API.
We will need two things:
1. extract.html – An HTML form that uploads the video or recording from the browser to our server.
2. extract_handler.php – A PHP page that posts the file to IDOL OnDemand, fetches the result, and returns the extracted text to the browser.
The code below serves this purpose.
HTML code (extract.html):
<html>
<body>
  <form action="extract_handler.php" method="post" enctype="multipart/form-data">
    <input type="file" name="file" />
    <select name="language">
      <option>en-US</option>
      <option>en-GB</option>
      <option>de-DE</option>
      <option>es-ES</option>
      <option>fr-FR</option>
      <option>it-IT</option>
      <option>zh-CN</option>
    </select>
    <input type="submit" value="submit" />
  </form>
</body>
</html>
We need to implement the extract handler (extract_handler.php), which handles the file upload and the text extraction using the Recognize Speech API.
PHP code (extract_handler.php):
<?php
$url = 'https://api.idolondemand.com/1/api/async/recognizespeech/v1';
$apikey = '<APIKEYHERE>';
$output_dir = './'; // folder where the upload is stored temporarily

// POST the given fields with cURL and return the raw response body
function post_form($url, $fields) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $result = curl_exec($ch);
    curl_close($ch);
    return (string) $result; // empty string if the request failed
}

if (isset($_FILES["file"]) && $_FILES["file"]["error"] === UPLOAD_ERR_OK) {
    $language = $_POST["language"];                // Get language from form
    $fileName = basename($_FILES["file"]["name"]); // Filename, without any path
    // move the file to the uploads folder
    move_uploaded_file($_FILES["file"]["tmp_name"], $output_dir . $fileName);
    // multipart form post using cURL
    $filePath = realpath($output_dir . $fileName);
    $post = array('apikey' => $apikey, 'language' => $language, 'file' => new CURLFile($filePath));
    $json = json_decode(post_form($url, $post), true);
    // We now need to fetch the result using the jobID
    if ($json && isset($json['jobID'])) {
        $jobID = $json['jobID']; // get the jobID
        $post = array('apikey' => $apikey);
        $joburl = "https://api.idolondemand.com/1/job/status/" . $jobID;
        $attempts = 0;
        // Poll until the status is "finished", pausing between requests
        // and giving up after 60 tries so a stuck job cannot loop forever
        do {
            sleep(2);
            $result = post_form($joburl, $post);
            $json = json_decode($result, true);
            $status = isset($json["status"]) ? $json["status"] : "unknown";
        } while ($status != "finished" && ++$attempts < 60);
        echo $result;
    }
    // remove the file
    unlink($filePath);
}
?>