
How To Read Mp3 Data From Google Cloud Using Python

I am trying to read mp3/wav data from Google Cloud and implement an audio diarization technique. The issue is that I am not able to read the result which has passed by google ap

Solution 1:

It's too late for the author of this thread, but I am posting the solution for anyone who hits this in the future, as I had a similar issue. Change result = response.results[-1] to result = response.result().results[-1] and it will work fine.
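A sketch of why that change helps, assuming the question's code called long_running_recognize (the question text is cut off, so this is a guess): that method returns an operation object rather than the response, and the response only becomes available through the operation's result() method. The helper below, whose name is made up for illustration, accepts either kind of object.

```python
def words_with_speaker_tags(response_or_operation, timeout=None):
    """Return the word list of the last result, for either call style.

    client.recognize(...) hands back the response directly, while
    client.long_running_recognize(...) hands back an operation whose
    result() blocks until the response is ready.
    """
    obj = response_or_operation
    # Only the operation exposes a callable result(); unwrap it if present.
    if callable(getattr(obj, "result", None)):
        obj = obj.result(timeout=timeout)
    if not obj.results:
        return []
    # With diarization, the last result carries the words of all results.
    return list(obj.results[-1].alternatives[0].words)
```

In the question's code this would stand in for the result = response.results[-1] line and the words lookup that follows it.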

Solution 2:

Do you have access to the wav file in your bucket? Also, is this the entire code? It seems that sample_rate_hertz and the imports are missing. Here is the code copied from the Google docs samples, edited down to just the diarization function.

#!/usr/bin/env python
"""Google Cloud Speech API sample that demonstrates speaker diarization.

Example usage:
    python diarization.py
"""


def transcribe_file_with_diarization():
    """Transcribe the given audio file synchronously with diarization."""
    # [START speech_transcribe_diarization_beta]
    # Note: speech.types and speech.enums belong to the pre-2.0
    # google-cloud-speech API.
    from google.cloud import speech_v1p1beta1 as speech
    client = speech.SpeechClient()

    # Replace the placeholders with your bucket and file names.
    audio = speech.types.RecognitionAudio(uri="gs://<YOUR_BUCKET>/<YOUR_WAV_FILE>")

    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=8000,
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)

    print('Waiting for operation to complete...')
    response = client.recognize(config, audio)

    # The transcript within each result is separate and sequential per result.
    # However, the words list within an alternative includes all the words
    # from all the results thus far. Thus, to get all the words with speaker
    # tags, you only have to take the words list from the last result:
    result = response.results[-1]

    words_info = result.alternatives[0].words

    # Printing out the output:
    for word_info in words_info:
        print("word: '{}', speaker_tag: {}".format(word_info.word,
                                                   word_info.speaker_tag))
    # [END speech_transcribe_diarization_beta]


if __name__ == '__main__':
    transcribe_file_with_diarization()

To run the code, save it as diarization.py and use the command:

python diarization.py
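The script prints one line per word, which gets hard to read for longer audio. As an optional follow-up, consecutive words with the same speaker tag could be merged into turns. This is only a sketch: speaker_turns is a made-up helper, and it assumes each item has the word and speaker_tag attributes used in the print loop.

```python
from itertools import groupby


def speaker_turns(words_info):
    """Merge consecutive words with the same speaker_tag into turns.

    Returns a list of (speaker_tag, text) tuples in spoken order.
    """
    turns = []
    # groupby only groups adjacent items, so a speaker who talks again
    # later gets a new turn instead of being merged with the earlier one.
    for tag, group in groupby(words_info, key=lambda w: w.speaker_tag):
        turns.append((tag, " ".join(w.word for w in group)))
    return turns
```

In the script it would be called as speaker_turns(words_info) in place of the print loop.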

Also, you have to install the latest google-cloud-speech library:

pip install --upgrade google-cloud-speech
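One caveat about versions: the sample relies on speech.types and speech.enums, which were removed in the 2.0.0 release of google-cloud-speech, so an upgraded install may reject it. Below is a rough sketch of the same request in the newer style, where the classes sit directly on the speech module and recognize takes keyword arguments. The function name and its parameters are placeholders, and the field names are worth checking against the version you actually have installed.

```python
def transcribe_with_diarization(gcs_uri, sample_rate_hertz=8000, speaker_count=2):
    """Sketch of the diarization request for google-cloud-speech >= 2.0.

    Returns a list of (word, speaker_tag) tuples.
    """
    # Imported here so this file still loads without the library installed.
    from google.cloud import speech_v1p1beta1 as speech

    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(uri=gcs_uri)
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate_hertz,
        language_code="en-US",
        # Diarization settings are grouped in their own message here.
        diarization_config=speech.SpeakerDiarizationConfig(
            enable_speaker_diarization=True,
            min_speaker_count=speaker_count,
            max_speaker_count=speaker_count,
        ),
    )
    response = client.recognize(config=config, audio=audio)
    if not response.results:
        return []
    # The last result holds the words, with speaker tags, of all results.
    words = response.results[-1].alternatives[0].words
    return [(w.word, w.speaker_tag) for w in words]
```

A call would look like transcribe_with_diarization("gs://<YOUR_BUCKET>/<YOUR_WAV_FILE>") with the placeholders filled in.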

You also need the credentials of your service account in a JSON file; you can check more info here
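One common way to hand that JSON file to the client is the GOOGLE_APPLICATION_CREDENTIALS environment variable, which the Google client libraries read when a client is created. The helper below is a small sketch of setting it from Python; the function name is made up, and the key path is whatever file you downloaded for your service account.

```python
import os


def use_service_account_key(key_path):
    """Point Google client libraries at a service-account JSON key file."""
    key_path = os.path.abspath(os.path.expanduser(key_path))
    if not os.path.isfile(key_path):
        raise FileNotFoundError("No key file at {}".format(key_path))
    # Application Default Credentials look this variable up when a
    # client such as speech.SpeechClient() is constructed.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
    return key_path
```

Call it before creating the SpeechClient. Exporting the variable in the shell before running python diarization.py should have the same effect.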
