Python: YouTube Downloader with Metadata

Created: 07.07.2022 | Last edited: 19.07.2022


Sooo, there are some songs I like that are remixes and hence...only on YouTube. Artists often provide free download links to some of their songs, but these may break after a few years. Songs also sometimes get deleted over copyright claims, even when the claim is not justified. So I needed a downloader to make sure I can keep the officially free songs I love. Searching around for a tool was...frustrating. Not only do all the capable tools seem quite expensive for often simple functionality, they also don't offer the features I want. If I download a song to put it in my Apple Music library, I want it to:

  • be the best audio quality possible
  • have proper meta data (like author, genre, ...)
  • have album cover art

There must be a cheap tool around that is capable of achieving all of this, right? Maybe. I couldn't find it. And considering that achieving this takes us not even 100 lines of Python, why waste more time searching?

Spoiler: I will fix the metadata with a command line tool that I use on my Mac. On Windows or Linux you can still follow along and use a different approach for meta data down the line.

Downloading from YouTube

YouTube doesn't like downloaders. Not at all. That's why YouTube's code gets changed around from time to time to break downloaders. So if you thought about building a simple downloader yourself: it's not that easy. And it's annoying, as you have to fix it over and over again. That's why I decided to go for the package pytube. If you try it and it doesn't work: YouTube might have broken it again. In this case I usually wait 1-2 weeks until pytube gets a new fix for YouTube's adjustments and it works again. I tried fixing it myself several times, but to me...the time spent is just not worth it. So if it breaks, let's just be grateful for the community supporting pytube to do the job for us and give it a little wait.


For Windows, we need the Python packages pytube and requests as well as the command line tool ffmpeg.

For Mac, pytube, requests and ffmpeg would be sufficient as well, but to make our lives easier we add atomicparsley to our mix.

Getting started

First off, we need a new Python script:

import os
from pytube import YouTube
import subprocess
import requests

def main():
    pass

if __name__ == '__main__':
    main()

After some essential imports we should create a main function as well as the if clause as shown. It's not necessary, but best practice. For one, this way people know our file is a standalone script that can be run and works on its own. For another, by putting code inside a main function instead of just throwing it into the if block, we make sure we don't accidentally create global variables or run unexpected code if someone ever imports this file into another.

The main function

Next, we will continue our work on the main function. I want the title, artist, and genre as meta information, and also the album cover art, but we will handle that automatically later on. Feel free to add inputs for any other meta data you like.

def main():
    link = input('Link: ')
    title = input('Title: ')
    artist = input('Artist: ')
    genre = input('Genre: ')

Now it's time to get the video and its thumbnail, which will serve as cover art. For the latter, pytube has something built in. However, this often seems to malfunction and I am not sure why. Anyway, I prefer to generate the URL "manually", which still sometimes doesn't work, but was overall more reliable for me:

    # get a youtube video object
    yt = YouTube(link)

    # build a custom url to the thumbnail of the video
    thumbnail_url = '' % yt.video_id

    # get the thumbnail
    page = requests.get(thumbnail_url)

    # save the thumbnail as a file
    with open('thumbnail.jpg', 'wb') as f:
        f.write(page.content)
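
Since the hand-built URL sometimes doesn't work either, one option is to try several thumbnail sizes in turn. Below is a minimal sketch of that idea. The URL scheme and file names are my assumption of the commonly used pattern, and `thumbnail_urls`, `first_available` and the `fetch` callback are hypothetical helpers, not part of pytube or requests.

```python
# ASSUMPTION: commonly used thumbnail URL scheme, best resolution first.
THUMBNAIL_NAMES = ['maxresdefault.jpg', 'sddefault.jpg', 'hqdefault.jpg']


def thumbnail_urls(video_id):
    """Build candidate thumbnail URLs for a video id (hypothetical helper)."""
    return ['https://img.youtube.com/vi/%s/%s' % (video_id, name)
            for name in THUMBNAIL_NAMES]


def first_available(urls, fetch):
    """Return the bytes of the first URL that fetch() can load, else None.

    fetch(url) is any callable returning the image bytes, or None on failure.
    """
    for url in urls:
        data = fetch(url)
        if data is not None:
            return data
    return None
```

With requests, `fetch` could be a small function that calls `requests.get(url)` and returns `response.content` when `response.ok` is true and `None` otherwise.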

For the next part we need to look at how YouTube works. YouTube offers several data streams, for example to provide different quality levels when streaming videos. We only need audio. pytube lets us access all the streams it can extract so we can choose which one we want. So we gather its streams and filter them to audio only, as we don't need large video files if we just want music.

    print('Available audio streams: ')
    audio_streams = yt.streams.filter(only_audio=True)
    print(audio_streams)

Remember how we wanted the best possible audio quality? Well, now it's time to choose a stream. We should pick the best quality one. Even if file size is a concern later on, we can compress a high quality file down, but the other way around...not as easy. So let's stick to the best quality.

What is "best quality"? I am going to do something that will make my HiFi friends scream. I will assume that the highest bitrate equals the best audio quality. It's not true, I know. But let's just assume that the highest bitrate also implies the highest sample rate as well as bit depth, thus being the "highest quality". We have to do so, as pytube streams only give us said bitrate. So let's handle it like that.

Getting the best quality

We will add to our main function now:

    best_stream = None
    highest_quality = 1
    for item in audio_streams:
        print(item)
        # item.abr is a string like '160kbps'; keep only the digits
        quality = int(''.join(filter(str.isdigit, item.abr)))
        if quality > highest_quality:
            highest_quality = quality
            best_stream = item
    print('Best quality stream is:')
    print(best_stream)

This way we will get the audio stream with the highest bitrate.
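
The same selection can be written more compactly with `max()` and a key function. This is just a sketch: `bitrate_kbps` and `pick_best` are hypothetical helper names, and `FakeStream` only stands in for pytube's stream objects so the snippet runs without a network connection. It also treats a missing `abr` as 0 instead of crashing, which is my assumption about what you would want in that case.

```python
import re
from collections import namedtuple

# Stand-in for pytube's stream objects (hypothetical, for illustration only).
FakeStream = namedtuple('FakeStream', ['itag', 'abr'])


def bitrate_kbps(abr):
    """Turn a string like '160kbps' into 160; None or no digits gives 0."""
    match = re.search(r'\d+', abr or '')
    return int(match.group()) if match else 0


def pick_best(streams):
    """Return the stream with the highest bitrate, or None if there are none."""
    streams = list(streams)
    if not streams:
        return None
    return max(streams, key=lambda stream: bitrate_kbps(stream.abr))


streams = [FakeStream(139, '48kbps'), FakeStream(140, '128kbps'),
           FakeStream(251, '160kbps')]
print(pick_best(streams).itag)  # prints 251
```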

An example output so far looks like this:

[<Stream: itag="139" mime_type="audio/mp4" abr="48kbps" acodec="mp4a.40.5" progressive="False" type="audio">, <Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2" progressive="False" type="audio">, <Stream: itag="249" mime_type="audio/webm" abr="50kbps" acodec="opus" progressive="False" type="audio">, <Stream: itag="250" mime_type="audio/webm" abr="70kbps" acodec="opus" progressive="False" type="audio">, <Stream: itag="251" mime_type="audio/webm" abr="160kbps" acodec="opus" progressive="False" type="audio">]
<Stream: itag="139" mime_type="audio/mp4" abr="48kbps" acodec="mp4a.40.5" progressive="False" type="audio">
<Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2" progressive="False" type="audio">
<Stream: itag="249" mime_type="audio/webm" abr="50kbps" acodec="opus" progressive="False" type="audio">
<Stream: itag="250" mime_type="audio/webm" abr="70kbps" acodec="opus" progressive="False" type="audio">
<Stream: itag="251" mime_type="audio/webm" abr="160kbps" acodec="opus" progressive="False" type="audio">
Best quality stream is:
<Stream: itag="251" mime_type="audio/webm" abr="160kbps" acodec="opus" progressive="False" type="audio">

Downloading the audio

Finally, now that we know exactly which stream we want, we can download it:

    ys = yt.streams.get_by_itag(int(best_stream.itag))
    ys.download(filename='output.webm')

We will save it as a webm file, as that is what the highest quality streams usually give us. This format basically acts as a container for video and audio. Even though we know it only contains audio and we only need audio, let's keep it in that format for now so we don't accidentally re-encode it already and lose audio quality.

Converting and adding meta data

Now we need to convert this webm file to a format we can use and add our meta tags. This will be done in the next chapter; however, let's already add the function calls for it to finish up our main function:

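    ys = yt.streams.get_by_itag(int(best_stream.itag))
    ys.download(filename='output.webm')
    convert_for_apple('output.webm')
    set_album_cover_and_meta('out.m4a', 'thumbnail.jpg', title, artist, genre)
    for leftover in ('output.webm', 'out.m4a', 'thumbnail.jpg'):
        os.remove(leftover)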

Note how I also do some cleanup at the end by removing the files we no longer need once we have our final file.

I have to admit this is a poor implementation. There is no need to actually create those files only to delete them afterwards. However, I was a bit lazy here, as I had no motivation to dig deeper into ffmpeg or atomicparsley to figure out how to pass the data around properly. Overall I don't work with audio files and didn't want to spend the time learning more about them here.
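
A middle ground that still uses files but avoids manual cleanup is to do all the work inside a temporary directory. The sketch below is only an illustration: `run_in_scratch_dir` is a hypothetical helper, and it assumes the download, convert and tagging steps accept full paths instead of the fixed file names used in this post.

```python
import os
import shutil
import tempfile


def run_in_scratch_dir(work, final_name, destination='.'):
    """Run work(scratch_dir) in a temporary directory and keep one file.

    work is expected to leave a file called final_name in scratch_dir.
    That file is moved into destination; everything else is deleted
    together with the temporary directory.
    """
    with tempfile.TemporaryDirectory() as scratch:
        work(scratch)
        target = os.path.join(destination, final_name)
        shutil.move(os.path.join(scratch, final_name), target)
    return target
```

Here `work` would be a function that downloads, converts and tags inside the directory it is given, so only the final `.m4a` ever lands next to the script.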

Converting the file format

So now we need the function convert_for_apple() introduced earlier.

It basically just runs ffmpeg from the Python script and looks like this:

def convert_for_apple(filename_webm):
    # for other devices using AAC: 
    # command = ['ffmpeg', '-i', filename_webm, '-codec:a', 'aac', 'out.aac']
    # for Apple devices (.m4a):
    command = ['ffmpeg', '-i', filename_webm, '-acodec', 'alac', 'out.m4a']
    subprocess.run(command, stdout=subprocess.PIPE, stdin=subprocess.PIPE)
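
One thing the function above does not do is notice when ffmpeg is missing or fails. Here is a hedged sketch of how that could look. `build_convert_command` and `run_checked` are hypothetical helpers; `-y` (overwrite the output without asking) and `check=True` are standard ffmpeg and subprocess options, but whether you want them here is your call.

```python
import shutil
import subprocess


def build_convert_command(source, target='out.m4a', codec='alac'):
    """Return the ffmpeg argument list for converting source to target."""
    return ['ffmpeg', '-y', '-i', source, '-acodec', codec, target]


def run_checked(command):
    """Run a command, failing loudly if the tool is missing or exits non-zero."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError('%s not found on PATH' % command[0])
    # check=True raises subprocess.CalledProcessError on a non-zero exit code
    return subprocess.run(command, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, check=True)
```

Calling `run_checked(build_convert_command('output.webm'))` would then either produce `out.m4a` or raise an exception instead of silently continuing.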

Adding meta data

The last step is to add our meta data and thumbnail to the finished file. I am on a Mac, so I will use atomicparsley, a great command line tool for this.

On Windows

However, if you are on Windows you could also do it with ffmpeg, as in the step before. To do so, you can modify the command from before like this:

command = ['ffmpeg', '-i', filename_webm, '-codec:a', 'aac', '-metadata', 'author=your_author_name', '-metadata', 'title=your_title', 'out.aac']
# list of meta data tags:
# ffmpeg -i in.mp4 -i IMAGE -map 0 -map 1 -c copy -c:v:1 png -disposition:v:1 attached_pic out.mp4
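
Because the command is passed as a list rather than through a shell, each `-metadata` value is simply `key=value` without extra quotes. Below is a small sketch of building these arguments from a dictionary. `metadata_args` and `build_tag_command` are hypothetical helpers, and the `.m4a` output name in the example call is my assumption (the command above writes `out.aac`).

```python
def metadata_args(tags):
    """Turn {'title': 'x'} into ['-metadata', 'title=x'], skipping empty values."""
    args = []
    for key, value in tags.items():
        if value:
            args += ['-metadata', '%s=%s' % (key, value)]
    return args


def build_tag_command(source, target, tags):
    """Return an ffmpeg command that encodes to AAC and writes the given tags."""
    return (['ffmpeg', '-i', source, '-codec:a', 'aac']
            + metadata_args(tags) + [target])


print(build_tag_command('output.webm', 'out.m4a',
                        {'title': 'My Song', 'artist': 'Someone', 'genre': ''}))
```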

On Mac

To set our metadata and thumbnail using atomicparsley, we can define a function as follows:

def set_album_cover_and_meta(filename_audio, filename_thumbnail, title, artist, genre):
    command = ['atomicparsley', filename_audio,
               '--artwork', filename_thumbnail,
               '--title', title,
               '--artist', artist,
               '--genre', genre,
               '--output', title + ".m4a"
               ]
    subprocess.run(command, stdout=subprocess.PIPE, stdin=subprocess.PIPE)

This should be self-explanatory: we pass the function the meta data that the main function collected from input or generated, build an atomicparsley command that adds this meta data to our file and names the output after its title, and run the entire command as a subprocess.
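
One caveat with using the title as the file name: characters like `/` or `:` in a title can produce an invalid or surprising path. Here is a minimal sketch of a sanitizer, where `safe_filename` is a hypothetical helper and the set of replaced characters is my assumption of what is commonly problematic:

```python
import re


def safe_filename(title, fallback='untitled'):
    """Replace characters that are awkward in file names and trim the result."""
    cleaned = re.sub(r'[\\/:*?"<>|]', '_', title).strip(' .')
    return cleaned or fallback


print(safe_filename('AC/DC: Thunderstruck (Remix)') + '.m4a')
# prints AC_DC_ Thunderstruck (Remix).m4a
```

The result could be passed to `--output` in place of the raw `title + ".m4a"`.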


So in this post we used pytube to retrieve YouTube audio streams, determine the highest quality one, and download it. Next we used ffmpeg to convert it (and on Windows: to add meta data), and added our downloaded thumbnail as well as the other meta data using atomicparsley.