Not able to extract subtitles from public YouTube video

What are you trying to achieve?

The custom app was working perfectly, but after a few days it was no longer able to get the YouTube subtitles.
The subtitles are automatically generated and the videos are public.

I tried several things and even set up the YouTube API, but the result is still the same. The error should not be there.

To extract the subtitles I set up the following modules:
* a webhook module, to get the URL
* a 0CodeKit module to get the transcript:

from youtube_transcript_api import YouTubeTranscriptApi

def fetch_video_transcript(video_url):
    # Extract the video ID from the URL
    if "youtu.be" in video_url:
        video_id = video_url.split('/')[-1]
    else:
        video_id = video_url.split('v=')[-1].split('&')[0]

    # Attempt to fetch the video transcript in Spanish
    try:
        transcript_list = YouTubeTranscriptApi.get_transcript(video_id, languages=['es', 'es-419', 'es-MX', 'es-ES'])
        full_transcript = '\n'.join([item['text'] for item in transcript_list])
        return full_transcript
    except Exception as e:
        return f"Failed to fetch transcript: {str(e)}"

# Main block to set the 'result' variable, as expected by the platform
video_url = "{{1.subject}}"
result = fetch_video_transcript(video_url)
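
As a side note, the string splitting used above to get the video ID can return extra characters when a youtu.be link carries a query string (for example a share parameter). This is probably not the cause of the error shown below, but a more defensive parser could look roughly like this. It is only a sketch: the function name extract_video_id and the list of path prefixes it handles are my own assumptions, not something the platform or the library requires.

```python
from urllib.parse import urlparse, parse_qs

def extract_video_id(video_url):
    """Return the video ID from a YouTube URL, or None if none is found.

    Hypothetical helper: the handled URL shapes are an assumption.
    """
    parsed = urlparse(video_url.strip())
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>?si=...
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    # Regular links: https://www.youtube.com/watch?v=<id>&...
    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        # Path-based links such as /shorts/<id> or /embed/<id>
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("shorts", "embed", "live"):
            return parts[1]

    return None

print(extract_video_id("https://www.youtube.com/watch?v=Ym6tMseNEp0"))
print(extract_video_id("https://youtu.be/Ym6tMseNEp0?si=abc"))
```

Returning None for an unrecognised URL lets the caller report a bad link separately from a transcript failure.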

The result is always the same:

Result Failed to fetch transcript:
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=Ym6tMseNEp0! This is most likely caused by:

Subtitles are disabled for this video

If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at Issues · jdepoix/youtube-transcript-api · GitHub. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error.

Steps taken so far

Tried modifying the code to work around the issue.
Used the YouTube API; nothing changes.

The videos are public, from different channels, and I can see that they have subtitles.

Hello, look at this. My approach with HTTP calls works perfectly:

It seems the method of using a regex to find “timedtext” in the page source, and replacing “\u0026” with “&”, no longer works.

Check for yourself, the HTTP module doesn’t return “timedtext” any longer.
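
For anyone who has not seen the original method, here is roughly what it did, written as a small Python sketch rather than as HTTP and text-parser modules. The regex pattern and the function name find_timedtext_urls are my own assumptions for illustration. When the page source contains no "timedtext" URL, the function returns an empty list, which is the behaviour described above.

```python
import re

# Matches a quoted caption URL containing "timedtext" in the raw page source.
# The exact pattern is an assumption made for this illustration.
TIMEDTEXT_PATTERN = re.compile(r'"(https://www\.youtube\.com/api/timedtext[^"]*)"')

def find_timedtext_urls(page_html):
    """Return caption URLs found in page_html with escaped ampersands decoded."""
    urls = []
    for match in TIMEDTEXT_PATTERN.finditer(page_html):
        # The page source escapes "&" as the six characters \u0026
        urls.append(match.group(1).replace("\\u0026", "&"))
    return urls

# Made-up sample input, not real page source
sample = '{"baseUrl":"https://www.youtube.com/api/timedtext?v=abc\\u0026lang=es"}'
print(find_timedtext_urls(sample))
print(find_timedtext_urls("<html>no captions here</html>"))
```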

If you have another solution, I’d be happy to know :)

By the way, I also made a video about the original solution.