Videogrep: how to extract a certain word from one or more videos


In this post, we’ll see how to use Videogrep, an idea and a piece of software by Sam Lavigne.

Everything is well explained in his GitHub repository or on this page, but I think a few simple steps are easier to follow for people who are not so confident with technology.

Link to Sam's blog

This post will also serve as a reminder for me.

Download the videos with YT-DLP

Install YT-DLP with this command:

pip install yt-dlp

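We use yt-dlp to download the videos. Put the link to the video or playlist you want to download between the quotation marks in the command below. --write-subs asks yt-dlp to save the subtitles, --sub-langs it selects the Italian ones, --convert-subs srt converts them to .srt, and -f 137+140 picks YouTube's 1080p MP4 video together with its M4A audio track (-f is the short form of --format, so there is no need to pass both).

yt-dlp "" --write-subs --sub-langs it --convert-subs srt -f 137+140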

Create subtitles with Vosk

To generate .srt files we need the Vosk transcriber, an open source speech-to-text (STT) tool (GitHub repo). To install Vosk, run:

pip install vosk

If the YouTube video doesn’t have subtitles, you can create them with Vosk's speech-to-text.

This is the simplest command you can use. It uses the small model. The -o option takes the path of the output file.

vosk-transcriber -l it -i video.mp4 -t srt -o

If you want a more precise transcription, you can download the full model for your language. At the moment, you need to unzip the model before using it.

vosk-transcriber -l it -i ./video.mp4 -m /home/ale/Downloads/vosk-model-it-0.22 -o ./ -t srt

In this case, you have to specify where the full model is located with -m. Note that you can also transcribe several videos at the same time: just pass ./ to -i without specifying the name of a video.

If you are transcribing multiple videos with Vosk, the current version has a bug that doesn’t allow spaces in the video file names.

The command below works around it by replacing the spaces in every file name with a hyphen (-). Run it once you are in the folder where your videos are.

rename 's/\s+/-/g' *
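
The rename command above uses Perl syntax. On some distributions rename is a different program that doesn't understand it, so here is a minimal sketch of the same idea in plain shell. The name dash_names is one I made up for this post (it is not part of Vosk or videogrep), and the sketch assumes your mv supports -n, as the GNU and BSD versions do.

```shell
# Hypothetical helper: replace each run of whitespace in the file names
# of a folder with a single hyphen. Defaults to the current folder.
# The body runs in a subshell, so its variables don't leak out.
dash_names() (
  dir="${1:-.}"
  for path in "$dir"/*; do
    [ -e "$path" ] || continue          # empty folder: the glob matched nothing
    name="${path##*/}"                  # file name without the folder part
    case "$name" in
      *[[:space:]]*) ;;                 # contains whitespace: rename it
      *) continue ;;                    # nothing to do for this file
    esac
    new="$(printf '%s' "$name" | sed -E 's/[[:space:]]+/-/g')"
    [ "$new" != "$name" ] || continue
    # -n: never overwrite a file that already has the new name
    mv -n -- "$path" "$dir/$new"
  done
)
```

After pasting the function into your terminal, run dash_names from inside the folder with your videos, or pass the folder as an argument.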

Installing and using videogrep

To install videogrep run this.

pip install videogrep

Finally, we can use videogrep. Before cutting a video, it’s useful to know how many times each word is repeated. To do that, we can use the -n 1 option:

videogrep --input "./video.mp4" -n 1
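
If you want a rough cross-check of those counts without videogrep, the sketch below counts the words in an .srt file with standard shell tools. It is only an approximation under my own assumptions: srt_word_counts is a made-up name, words are split on spaces and punctuation (so l'amore counts as two words), and videogrep may split and count words differently.

```shell
# Hypothetical helper: print the most frequent words in an .srt file.
# Usage: srt_word_counts FILE [HOW_MANY]   (HOW_MANY defaults to 20)
# Steps: strip Windows line endings, drop timing lines and cue numbers,
# lowercase, put one word per line, then count and sort by frequency.
srt_word_counts() (
  srt="$1"
  top="${2:-20}"
  tr -d '\r' < "$srt" |
    grep -v -e '-->' -e '^[0-9][0-9]*$' |
    tr '[:upper:]' '[:lower:]' |
    tr -s '[:space:][:punct:]' '\n' |
    grep -v '^$' |
    sort | uniq -c | sort -rn | head -n "$top"
)
```

For example, srt_word_counts video.srt 10 lists the ten most common words with their counts, most frequent first.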

Once we know which word (or words!) we want to use, we remove -n 1 and pass the word to --search:

videogrep --input "./video.mp4" --search 'ciao'
