Many people now buy music only online. But there are millions, maybe billions, of music compact discs (CDs) in circulation, and you can still find a lot of music in this form. Musicbrainz, a project of the Metabrainz Foundation, offers a huge online database of music releases, including CD releases. You can use Musicbrainz to retrieve the data you need to tag the CDs you rip. This article is the first of a miniseries showing how to rip CDs to files you can store in an online library.

Install supporting libraries

The script shown later in this article requires a few libraries to work properly. To install them, open a terminal and run this command using sudo:

sudo dnf install python3-musicbrainzngs python3-requests python3-libdiscid

The musicbrainzngs library provides up-to-date support for the Musicbrainz online database. The requests library provides an easy way to make HTTP requests from within a script. Finally, the libdiscid library provides functions for computing a disc ID from the table of contents of a music CD. You can look up this ID in the online Musicbrainz database.
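As a quick check that the install worked, a short sketch like the following can report which of the three modules Python cannot find. The module names are the ones the script in this article imports. The helper name missing_modules is made up for this illustration.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names that Python cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == '__main__':
    # Import names used by the script in this article
    needed = ['musicbrainzngs', 'requests', 'libdiscid']
    missing = missing_modules(needed)
    if missing:
        print('Missing modules: ' + ', '.join(missing))
    else:
        print('All modules found.')
```

If the check lists a module as missing, run the dnf command again and look for errors in its output.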

Make an account

If you don’t have one already, make an account at Musicbrainz. You’ll use this account to access the database via the script. Visit https://musicbrainz.org and select Create Account near the top of the page. Fill out the online web form and follow instructions to make your account.

Make note of the username and password you use for the account. You’ll need both to run the script.

The script

This script reads a CD in your system’s default disc device. It then prints out information about the release from Musicbrainz. Finally, if possible, the script stores a cover image. Store this script on your system as ~/bin/get-contents and make it executable with chmod +x ~/bin/get-contents:

#!/usr/bin/python3
import libdiscid
import musicbrainzngs as mb
import requests
import json
from getpass import getpass

this_disc = libdiscid.read(libdiscid.default_device())
mb.set_useragent(app='get-contents', version='0.1')
mb.auth(u=input('Musicbrainz username: '), p=getpass())

release = mb.get_releases_by_discid(this_disc.id,
                                    includes=['artists', 'recordings'])
if release.get('disc'):
   this_release=release['disc']['release-list'][0]
   title = this_release['title']
   artist = this_release['artist-credit'][0]['artist']['name']
 
   if this_release['cover-art-archive']['artwork'] == 'true':
      # Ask the Cover Art Archive which images exist for this release
      url = 'https://coverartarchive.org/release/' + this_release['id']
      art = json.loads(requests.get(url, allow_redirects=True).content)
      for image in art['images']:
         if image['front']:
            cover = requests.get(image['image'],
                                 allow_redirects=True)
            fname = '{0} - {1}.jpg'.format(artist, title)
            print('COVER="{}"'.format(fname))
            # Save the first front cover image, then stop looking
            with open(fname, 'wb') as f:
               f.write(cover.content)
            break
 
   print('TITLE="{}"'.format(title))
   print('ARTIST="{}"'.format(artist))
   # Some releases have no date; print an empty year in that case
   print('YEAR="{}"'.format(this_release.get('date', '').split('-')[0]))
   for medium in this_release['medium-list']:
      for disc in medium['disc-list']:
         if disc['id'] == this_disc.id:
            tracks=medium['track-list']
            for track in tracks:
               print('TRACK[{}]="{}"'.format(track['number'], 
                                             track['recording']['title']))
            break

Examining the script

The first stanza of the script reads the CD device. Then it asks you to log in to the Musicbrainz service. Notice that it sets a specific user agent so the Musicbrainz service can tell this script apart from other applications.

this_disc = libdiscid.read(libdiscid.default_device())
mb.set_useragent(app='get-contents', version='0.1')
mb.auth(u=input('Musicbrainz username: '), p=getpass())

The next section asks the Musicbrainz service to use the disc ID provided by libdiscid to get the specific release data. Note that the request limits the data it asks for, as a courtesy to the service. The script then takes the title and artist from the first matching release:

release = mb.get_releases_by_discid(this_disc.id,
                                    includes=['artists', 'recordings'])
if release.get('disc'):
   this_release=release['disc']['release-list'][0]
   title = this_release['title']
   artist = this_release['artist-credit'][0]['artist']['name']
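The script prints nothing when the lookup finds no matching release. The following is a minimal sketch of a more defensive way to pick the first release. It assumes, based on the musicbrainzngs documentation, that a lookup result may hold a 'cdstub' entry instead of a 'disc' entry. The helper names first_release and lookup_status are hypothetical, and the sample dictionaries are made-up data rather than real responses.

```python
def first_release(result):
    """Return the first matching release dict, or None if there is none."""
    releases = result.get('disc', {}).get('release-list', [])
    return releases[0] if releases else None


def lookup_status(result):
    """Describe the outcome of a disc ID lookup in one line."""
    release = first_release(result)
    if release is not None:
        return 'Found release: ' + release.get('title', '(untitled)')
    if 'cdstub' in result:
        # Assumption: a CD stub is an unverified entry without full data
        return 'Only a CD stub exists for this disc.'
    return 'No release found for this disc.'


if __name__ == '__main__':
    # Made-up results shaped like the dictionary the script indexes
    print(lookup_status({'disc': {'release-list': [{'title': 'Intriguer'}]}}))
    print(lookup_status({'cdstub': {'title': 'Some stub'}}))
    print(lookup_status({}))
```

A message like this makes it easier to tell an unknown disc apart from a script that failed silently.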

The next section of the script (up to the first break statement) tries to download the front cover art for the CD. It writes the image to a file named $ARTIST - $TITLE.jpg in the current directory. You can use this cover art to tag the CD appropriately.
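The cover art step can be sketched as two small functions that work on data already downloaded, which makes the logic easy to try without a network connection. This is only an illustration: the names front_image_url and cover_filename are hypothetical, and the sample dictionary mimics just the 'images', 'front', and 'image' keys the script reads. Replacing the slash character is an added precaution, since an artist name such as AC/DC would otherwise be treated as a directory path.

```python
def front_image_url(art):
    """Return the URL of the first image flagged as a front cover, or None."""
    for image in art.get('images', []):
        if image.get('front'):
            return image.get('image')
    return None


def cover_filename(artist, title):
    """Build 'ARTIST - TITLE.jpg', replacing path separators with '_'."""
    name = '{0} - {1}.jpg'.format(artist, title)
    return name.replace('/', '_')


if __name__ == '__main__':
    # Made-up listing with the same keys the script uses
    art = {'images': [
        {'front': False, 'image': 'https://example.org/back.jpg'},
        {'front': True, 'image': 'https://example.org/front.jpg'},
    ]}
    print(front_image_url(art))
    print(cover_filename('AC/DC', 'Back in Black'))
```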

The final stanza prints out information about the disc as variable assignments the bash shell can read, with the track titles as elements of an array. Here’s an example of what this might look like for a random disc:

TITLE="Intriguer"
ARTIST="Crowded House"
YEAR="2010"
TRACK[1]="Saturday Sun"
TRACK[2]="Archer's Arrows"
TRACK[3]="Amsterdam"
TRACK[4]="Either Side of the World"
TRACK[5]="Falling Dove"
TRACK[6]="Isolation"
TRACK[7]="Twice If You're Lucky"
TRACK[8]="Inside Out"
TRACK[9]="Even If"
TRACK[10]="Elephants"
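Wrapping each value in double quotes works for the titles above, but a title containing a double quote, a dollar sign, or a backtick would not survive being read by the shell. One possible alternative, shown here as a sketch and not as the script's actual behavior, is to quote each value with shlex.quote from the Python standard library. The function name shell_lines is hypothetical. Note that shlex.quote leaves values without special characters unquoted, so the output looks slightly different from the example above.

```python
import shlex


def shell_lines(title, artist, year, tracks):
    """Return shell assignment lines with every value safely quoted.

    tracks is a list of (number, title) pairs.
    """
    lines = [
        'TITLE={}'.format(shlex.quote(title)),
        'ARTIST={}'.format(shlex.quote(artist)),
        'YEAR={}'.format(shlex.quote(year)),
    ]
    for number, name in tracks:
        lines.append('TRACK[{}]={}'.format(number, shlex.quote(name)))
    return lines


if __name__ == '__main__':
    tracks = [('1', 'Saturday Sun'), ('2', "Archer's Arrows")]
    for line in shell_lines('Intriguer', 'Crowded House', '2010', tracks):
        print(line)
```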

Why print the data out in this format? Stay tuned to Fedora Magazine and find out in a future article.


Photo by Jonathan Kriz on Flickr.