Speech To Text from videos / recordings using PHP – IDOL OnDemand Speech To Text API

In this tutorial, I will cover how to run speech to text on video or sound files programmatically using the IDOL OnDemand Recognize Speech API. IDOL OnDemand is a web API platform, so any language that can send HTTP requests can be used for speech to text; we will use PHP in this example.


  • Sign up with IDOL OnDemand
  • Get an API key from Account -> Tools -> Manage API Keys
  • The Recognize Speech API accepts public URLs and local files

Speech to Text from Video/Recording (URL)

To extract text from a video or recording at a public URL, send a GET request to the Recognize Speech API with the media URL, the language, and your API key as query parameters.
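As a rough sketch, such a request URL could be assembled in PHP as shown here. The endpoint is the one used in the handler code in this tutorial; the `url` parameter name, the helper name `buildRecognizeSpeechUrl`, and the sample media URL are assumptions made for illustration.

```php
<?php
// Endpoint taken from the handler code in this tutorial.
const RECOGNIZE_SPEECH_ENDPOINT = 'https://api.idolondemand.com/1/api/async/recognizespeech/v1';

/**
 * Build the GET request URL (hypothetical helper).
 * Assumption: the media location is passed in a parameter named "url".
 */
function buildRecognizeSpeechUrl(string $mediaUrl, string $language, string $apikey): string
{
    // http_build_query() URL-encodes each value for us.
    $query = http_build_query([
        'url'      => $mediaUrl,
        'language' => $language,
        'apikey'   => $apikey,
    ]);
    return RECOGNIZE_SPEECH_ENDPOINT . '?' . $query;
}

// Placeholder values; the resulting URL can be fetched with file_get_contents() or cURL.
$requestUrl = buildRecognizeSpeechUrl('https://example.com/sample.mp4', 'en-US', '<APIKEYHERE>');
echo $requestUrl, PHP_EOL;
```

Letting `http_build_query()` do the encoding avoids broken requests when the media URL itself contains `&` or `?`.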


Supported languages:

* de-DE – German
* en-US – US English
* en-GB – British English
* es-ES – European Spanish
* fr-FR – French
* it-IT – Italian
* zh-CN – Mandarin
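
Since the language code usually comes from user input, it may be worth checking it against this list before sending a request. A minimal sketch follows; the helper name `normalizeLanguage` and the choice of `en-US` as the fallback are assumptions, not part of the API.

```php
<?php
// Language codes copied from the list of supported languages.
const SUPPORTED_LANGUAGES = ['de-DE', 'en-US', 'en-GB', 'es-ES', 'fr-FR', 'it-IT', 'zh-CN'];

/**
 * Return the language code if it is supported, otherwise a fallback
 * (hypothetical helper; the en-US default is an assumption).
 */
function normalizeLanguage(string $language, string $fallback = 'en-US'): string
{
    return in_array($language, SUPPORTED_LANGUAGES, true) ? $language : $fallback;
}

echo normalizeLanguage('fr-FR'), PHP_EOL; // supported, returned unchanged
echo normalizeLanguage('xx-XX'), PHP_EOL; // unsupported, falls back to en-US
```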

The Recognize Speech API only accepts asynchronous requests, as the “async” in the URL indicates, so it returns a jobID rather than the text itself.

For example, a request returns a jobID like this:

  {
    "jobID": "usw3p_7175b0a9-3c3c-40ea-aaa3-7c16e2e92418"
  }


We can then pass that jobID to the job status API (https://api.idolondemand.com/1/job/status/<jobID>) to fetch the result.
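
The polling step might be sketched as below. The status URL base matches the one used in the handler code in this tutorial. Passing `apikey` as a query parameter, the helper names `buildJobStatusUrl` and `waitForJob`, and the attempt and delay defaults are assumptions. The fetcher is passed in as a callable so the loop can be tried without network access.

```php
<?php
/**
 * Build the job status URL for a jobID (hypothetical helper).
 * Assumption: the API key is accepted as a query parameter.
 */
function buildJobStatusUrl(string $jobID, string $apikey): string
{
    return 'https://api.idolondemand.com/1/job/status/' . rawurlencode($jobID)
        . '?' . http_build_query(['apikey' => $apikey]);
}

/**
 * Poll until the job reports "finished" or the attempts run out.
 * $fetch is any callable that takes a URL and returns the response body
 * (or false on failure). Returns the decoded response, or null on give-up.
 */
function waitForJob(callable $fetch, string $statusUrl, int $maxAttempts = 30, int $delaySeconds = 2): ?array
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $body = $fetch($statusUrl);
        $json = is_string($body) ? json_decode($body, true) : null;
        if (is_array($json) && isset($json['status']) && $json['status'] === 'finished') {
            return $json;
        }
        sleep($delaySeconds); // pause between polls
    }
    return null;
}

// Demo with a stub fetcher so the sketch runs without network access.
$stub = function (string $url) {
    return '{"status":"finished"}';
};
var_dump(waitForJob($stub, buildJobStatusUrl('<jobID>', '<APIKEYHERE>'), 3, 0) !== null); // bool(true)
```

In real use, `'file_get_contents'` or a small cURL wrapper could be passed as `$fetch`. Capping the number of attempts keeps the script from looping forever if a job never reaches the finished state.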


The output will look like this:

  {
    "actions": [
      {
        "result": {
          "document": [
            {
              "content": "we want to hear from you let's get the conversation started about what's next for Hewlett Packard this is HP next this source"
            }
          ]
        },
        "status": "finished",
        "action": "recognizespeech",
        "version": "v1"
      }
    ],
    "jobID": "usw3p_7175b0a9-3c3c-40ea-aaa3-7c16e2e92418",
    "status": "finished"
  }
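
To pull just the recognized text out of such a response, something like the following sketch could be used. The helper name `extractTranscript` is hypothetical, and the field names follow the sample output above.

```php
<?php
/**
 * Collect the "content" strings from a finished job response
 * (hypothetical helper; field names follow the sample output).
 */
function extractTranscript(array $job): string
{
    $parts = [];
    foreach ($job['actions'] ?? [] as $action) {
        foreach ($action['result']['document'] ?? [] as $document) {
            if (isset($document['content'])) {
                $parts[] = $document['content'];
            }
        }
    }
    return implode(' ', $parts);
}

// Trimmed-down response in the same shape as the sample output.
$sample = '{"actions":[{"result":{"document":[{"content":"we want to hear from you"}]},'
        . '"status":"finished","action":"recognizespeech","version":"v1"}],"status":"finished"}';

echo extractTranscript(json_decode($sample, true)), PHP_EOL; // we want to hear from you
```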

Speech to Text from Video/Recording (LOCAL FILE)

If you want to extract text from a local video or recording, you need to POST multipart form data to the Recognize Speech API

We will need two things:
1. extract.html – An HTML form to post the file to our server.
The video or recording is uploaded from the browser to the server.

2. extract_handler.php – A PHP page to post the file to IDOL OnDemand and fetch the result.
The text is extracted from the file using the Recognize Speech API at our server and returned to the browser.


The code below serves the purpose.

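HTML code: (extract.html)

<form action="extract_handler.php" method="post" enctype="multipart/form-data">
<input type="file" name="file" />
<select name="language">
<option value="en-US">US English</option>
<option value="en-GB">British English</option>
</select>
<input type="submit" value="submit" />
</form>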


We need to implement the extract handler (extract_handler.php), which handles the file upload and text extraction using the Recognize Speech API.

PHP code: (extract_handler.php)

<?php
$url = 'https://api.idolondemand.com/1/api/async/recognizespeech/v1';
$apikey = '<APIKEYHERE>';
$output_dir = "."; // Directory for the uploaded file; must be writable

if (isset($_FILES["file"]) && $_FILES["file"]["error"] === UPLOAD_ERR_OK) {
    $language = $_POST["language"]; // Get language from form
    $fileName = basename($_FILES["file"]["name"]); // Filename, without any path components

    // Move the uploaded file into $output_dir
    move_uploaded_file($_FILES["file"]["tmp_name"], $output_dir . "/" . $fileName);

    // Multipart form post using cURL
    $filePath = realpath($output_dir . "/" . $fileName);
    $post = array('apikey' => $apikey,
                  'language' => $language,
                  'file' => new CURLFile($filePath));

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $result = curl_exec($ch);
    curl_close($ch);
    $json = json_decode($result, true);

    // We now need to fetch the result using the jobID
    if ($json && isset($json['jobID'])) {
        $jobID = $json['jobID']; // Get the jobID
        $post = array('apikey' => $apikey);
        $joburl = "https://api.idolondemand.com/1/job/status/" . $jobID;
        $attempts = 0;
        // Loop until the status is set to finished (give up after 60 polls)
        do {
            sleep(2); // Pause between polls
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, $joburl);
            curl_setopt($ch, CURLOPT_POST, 1);
            curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
            $result = curl_exec($ch);
            curl_close($ch);
            $json = json_decode($result, true);
            $status = isset($json['status']) ? $json['status'] : null;
        } while ($status != "finished" && ++$attempts < 60);
        echo $result;
    }

    // Remove the file
    unlink($filePath);
}


About Author

I am a developer and I maintain the site https://hayageek.com. The best software developers are those who can think like both a developer and a user.
All posts by Ravishanker Kusuma