Transcribe audio & video
Learn how to transcribe audio and video

The TranscribeSpeech node transcribes speech from audio or video input, with additional built-in capabilities:

  • segmentation by sentence
  • diarization (speaker identification)
  • alignment to word-level timestamps
  • automatic chapter detection

To transcribe input without further processing, provide an audio_uri. This can be a publicly hosted audio or video file, base64-encoded audio or video data, or a privately hosted external file. For best results, you may also provide a prompt that describes the content of the audio or video.

Python

from substrate import Substrate, TranscribeSpeech
# ...
transcript = TranscribeSpeech(
    audio_uri="https://media.substrate.run/dfw-clip.m4a",
    prompt="David Foster Wallace interviewed about US culture",
)
res = substrate.run(transcript)

Output

{
  "text": "language like that, the wounded inner child, the inner pain, is part of a kind of pop psychological movement in the United States that is a sort of popular Freudianism that ..."
}
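
The base64 input option mentioned earlier can be sketched as follows. This is a minimal sketch, assuming the encoded data is passed in audio_uri as a standard `data:` URI; the guide does not spell out the exact encoding format, so treat that as an assumption. The helper names `to_data_uri` and `file_to_data_uri` and the local file `clip.m4a` are hypothetical and not part of the Substrate SDK.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(data: bytes, mime: str) -> str:
    """Wrap raw media bytes in a base64 data URI.

    Assumption: the API accepts base64 input in this data-URI form.
    """
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{payload}"


def file_to_data_uri(path: str, fallback_mime: str = "application/octet-stream") -> str:
    """Read a local file and encode it, guessing the MIME type from its extension."""
    mime, _ = mimetypes.guess_type(path)
    return to_data_uri(Path(path).read_bytes(), mime or fallback_mime)


# Hypothetical usage with a local file named clip.m4a:
# transcript = TranscribeSpeech(audio_uri=file_to_data_uri("clip.m4a"))
```

Base64 inflates the payload by roughly a third, so a hosted file is likely the better fit for long recordings.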

To enable additional capabilities, set:

  • segment: True to return a list of sentence segments with start and end timestamps.
  • align: True to return a list of aligned words within sentence segments.
  • diarize: True to include speaker IDs within segments and words.
  • suggest_chapters: True to return a list of suggested chapters with titles and start timestamps.

Python

transcript = TranscribeSpeech(
    audio_uri="https://media.substrate.run/dfw-clip.m4a",
    prompt="David Foster Wallace interviewed about US culture",
    segment=True,
    align=True,
    diarize=True,
    suggest_chapters=True,
)

Output

{
  "text": "language like that, the wounded inner child, the inner pain, is part of a kind of pop psychological movement in the United States that is a sort of popular Freudianism that ...",
  "segments": [
    {
      "start": 0.874,
      "end": 15.353,
      "speaker": "SPEAKER_00",
      "text": "language like that, the wounded inner child, the inner pain, is part of a kind of pop psychological movement in the United States that is a sort of popular Freudianism that",
      "words": [
        {
          "word": "language",
          "start": 0.874,
          "end": 1.275,
          "speaker": "SPEAKER_00"
        },
        {
          "word": "like",
          "start": 1.295,
          "end": 1.455,
          "speaker": "SPEAKER_00"
        }
      ]
    }
  ],
  "chapters": [
    {
      "title": "Introduction to the Wounded Inner Child and Popular Psychology in US",
      "start": 0.794
    },
    {
      "title": "The Paradox of Popular Psychology and Anger in America",
      "start": 16.186
    }
  ]
}
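
Once the output is available as a Python dict, the segments and chapters can be post-processed with plain Python. The following is a minimal sketch, assuming the output has the shape shown above. The helper names `fmt_ts`, `speaker_lines`, and `chapter_for` are hypothetical, and `sample` is an abbreviated copy of the output with the segment text shortened.

```python
from __future__ import annotations

from bisect import bisect_right

# Abbreviated copy of the output above (segment text shortened, words omitted).
sample = {
    "segments": [
        {"start": 0.874, "end": 15.353, "speaker": "SPEAKER_00",
         "text": "language like that, the wounded inner child, ..."},
    ],
    "chapters": [
        {"title": "Introduction to the Wounded Inner Child and Popular Psychology in US",
         "start": 0.794},
        {"title": "The Paradox of Popular Psychology and Anger in America",
         "start": 16.186},
    ],
}


def fmt_ts(seconds: float) -> str:
    """Format a timestamp in seconds as M:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, rest_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(rest_ms, 1000)
    return f"{minutes}:{secs:02d}.{ms:03d}"


def speaker_lines(output: dict) -> list[str]:
    """Render each segment as '[start-end] SPEAKER: text'."""
    return [
        f"[{fmt_ts(s['start'])}-{fmt_ts(s['end'])}] "
        f"{s.get('speaker', 'UNKNOWN')}: {s['text']}"
        for s in output.get("segments", [])
    ]


def chapter_for(output: dict, t: float) -> str | None:
    """Return the title of the chapter containing time t, or None if t precedes all chapters."""
    chapters = sorted(output.get("chapters", []), key=lambda c: c["start"])
    starts = [c["start"] for c in chapters]
    i = bisect_right(starts, t) - 1  # index of the last chapter starting at or before t
    return chapters[i]["title"] if i >= 0 else None


for line in speaker_lines(sample):
    print(line)
```

Because chapters carry only a start timestamp, `chapter_for` treats each chapter as running until the next one begins.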