The MusicBrainz API

MusicBrainz is a community-maintained open source encyclopedia of music information. It is also the source Hawkwynd Radio uses to display metadata for the song currently playing in the application.

Have you ever wanted to know “Who’s the drummer on that song?” or “When was this album published?” Who’s singing background vocals on that song with Sting? Who’s the bass player? Details like these are all maintained within MusicBrainz.

In 2000, Gracenote took over the free CDDB project and commercialized it, essentially charging users for accessing the very data they themselves contributed. In response, Robert Kaye founded MusicBrainz. The project has since grown rapidly from a one-man operation to an international community of enthusiasts that appreciates both music and music metadata. Along the way, the scope of the project has expanded from its origins as a mere CDDB replacement to the true music encyclopedia MusicBrainz is today. MusicBrainz is operated by the MetaBrainz Foundation, a California-based 501(c)(3) tax-exempt non-profit corporation dedicated to keeping MusicBrainz free and open source.

MusicBrainz Web Service (v2) PHP class

This PHP library allows you to easily access the MusicBrainz Web Service V2 API. It is a fork of https://github.com/chrisdawson/MusicBrainz and takes some inspiration from the Python bindings.

After careful consideration, I chose to use this library as the main class for the Hawkwynd Radio software. With some adjustments, and trial and error, I believe I have built a suitable search application which meets the requirements of my project.

This post covers some of the challenges I had to overcome while writing the backend logic for the Hawkwynd Radio project.

The MusicBrainz Anatomy

To give you better insight into the MusicBrainz Web Service and how it provides data, here is a brief overview of its anatomy.

The Artist

An artist is generally a musician (or musician persona), group of musicians, or other music professional (like a producer or engineer). Occasionally, it can also be a non-musical person (like a photographer, an illustrator, or a poet whose writings are set to music), or even a fictional character.

Example: Michael Jackson or Rush.

The Release Group

A release group, just as the name suggests, is used to group several different releases into a single logical entity. Every release belongs to one, and only one, release group.

Example: “Thriller” or “A Farewell To Kings”

The Release

A MusicBrainz release represents the unique release (i.e. issuing) of a product on a specific date with specific release information such as the country, label, barcode and packaging. If you walk into a store and purchase an album or single, they are each represented in MusicBrainz as one release.

Example: “Thriller (Live)” or “40th Anniversary Edition Farewell To Kings”

The Recording

A recording is an entity in MusicBrainz which can be linked to tracks on releases. Each track must always be associated with a single recording, but a recording can be linked to any number of tracks.

Example: “Beat It” or “Cinderella Man”

The Label

Labels are one of the most complicated and controversial parts of the music industry. The main reason is that the term itself is not clearly defined and refers to at least two overlapping concepts: imprints, and the companies that control them. Fortunately, in many cases the imprint and the company controlling it have the same name.

Example: “K-Tel Records” or “Anthem Records”

A Visual Representation

[Image: visual representation of the data sets]
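
To make the hierarchy concrete, here is a minimal sketch of how the entities described above relate to each other, written as plain JavaScript objects. The property names (releaseGroup, releases, tracks, recording) and the addRelease helper are my own illustration, not the field names the MusicBrainz web service returns, and pairing these releases with Anthem Records is only for the sake of the example.

```javascript
// Hypothetical, simplified model of the MusicBrainz entities described above.
const artist = { name: 'Rush' };
const label = { name: 'Anthem Records' }; // illustrative pairing only

// A release group collects every edition of the same album.
const releaseGroup = { title: 'A Farewell To Kings', artist, releases: [] };

// A recording can be linked to any number of tracks.
const recording = { title: 'Cinderella Man', artist };

// Each release belongs to exactly one release group, and each
// track on it points at exactly one recording.
function addRelease(group, title) {
  const release = {
    title,
    label,
    releaseGroup: group,
    tracks: [{ recording }], // trimmed to a single track
  };
  group.releases.push(release);
  return release;
}

addRelease(releaseGroup, 'A Farewell To Kings');
addRelease(releaseGroup, '40th Anniversary Edition Farewell To Kings');

// Count how many tracks, across all releases, share the same recording.
const linkedTracks = releaseGroup.releases
  .flatMap((release) => release.tracks)
  .filter((track) => track.recording === recording).length;

console.log(linkedTracks); // 2
```

Two releases in one release group both carry a track that points at the same recording, which is why the count is 2.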

Use Case Scenario

The scenario is relatively simple, but it has some restrictions that presented significant challenges when building the logic for formulating the request and processing the results returned from the MusicBrainz API.

The Search Parameters

The API lets a search combine multiple filters and parameters. The base of Hawkwynd Radio’s search is simple: Artist and Title (song).

Scenario Details

The client (MIXXX running on a workstation) plays an MP3 file from a playlist and transmits the “Artist” and “Title” values, along with the audio stream of the playing track (song), to the Shoutcast service running on a remote server. The Shoutcast server then distributes the audio to visitors connecting from the web page at http://stream.hawkwynd.com. In addition, the Shoutcast server provides the Artist and Title values through an authenticated request from our web code (PHP). Now the fun begins. Having obtained the Artist and Title being played, we could simply display that information on the website in its basic form.

ZZ Top – A Fool for Your Stockings

With a little CSS styling, we’ve achieved our purpose: letting the viewer know which artist and song they are listening to, for example Queen’s “Tie Your Mother Down”.

And that’s where Hawkwynd Radio started. Crude but effective, it served two purposes: play the song, and display the Artist and Title of the song currently being played. So, we know the artist and we know the title of the song. By submitting a request to MusicBrainz through the API, we can get a treasure trove of data about these two values.

Using JavaScript, we can send a request to our own backend and render the page with the results to display a wide variety of data. The first request fetches the current stream statistics:

$.getJSON('statistics.php', function (data) {
    var stream        = data.streams[0];
    var meta          = stream.songtitle;                            // "Artist - Title"
    var artist        = meta.substring(0, meta.indexOf(' - '));      // text before the first " - "
    var title         = meta.substring(meta.indexOf(' - ') + 3);     // text after the first " - "
    var servercontent = stream.servertitle.split('-');               // configured in software - some text
    var servertitle   = servercontent.shift();
    var motd          = servercontent.join('-').trim();              // one-line motd on server
    var samplerate    = stream.samplerate;                           // sample rate, e.g. 48000
    var bitrate       = stream.bitrate;                              // bitrate, e.g. 128
    var genre         = stream.servergenre;                          // not used
    var streamstatus  = stream.streamstatus;                         // status of the stream
    var streamuptime  = stream.streamuptime;                         // how long the stream has been playing
});

statistics.php is a very simple script. For brevity, config.inc.php is not shown, as it contains security info we don’t want to share.

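require_once('include/config.inc.php');

// Fetch the current stats from the Shoutcast server as JSON
$json = file_get_contents(SHOUTCAST_HOST . '/statistics?json=1');
if ($json === false) {
    http_response_code(502); // the Shoutcast server could not be reached
    exit;
}
header('Content-Type: application/json');
echo $json;
exit;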

statistics.php returns a JSON result, from which we can obtain the artist and title of the currently streaming song:

{
  "totalstreams":1,
  "activestreams":1,
  "currentlisteners":0,
  "peaklisteners":6,
  "maxlisteners":512,
  "uniquelisteners":0,
  "averagetime":0,
  "version":"2.6.0.750 (posix(linux x64))",
  "streams":[{
    "id":1,
    "currentlisteners":469,
    "peaklisteners":500,
    "maxlisteners":512,
    "servergenre":"Classic Rock",
    "serverurl":"http:\/\/stream.hawkwynd.com",
    "servertitle":"Hawkwynd Radio",
    "songtitle":"ZZ Top - A Fool for Your Stockings",
    "streamhits":2431,
    "streamstatus":1,
    "streamuptime":69174,
    "bitrate":"128",
    "samplerate":"48000",
    "content":"audio\/mpeg"
  }]
}

From this JSON data, we take the “songtitle” value and split it on the first “ - ” to obtain the ARTIST and TITLE being played on the Shoutcast server.
Now that we have our ARTIST and TITLE values, we can send a request to our API and query for data.

var artist = meta.substring(0, meta.indexOf(' - '));
var title  = meta.substring(meta.indexOf(' - ') + 3);

So we have our two values we need:
artist is “ZZ Top”
title is “A Fool for Your Stockings”
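
One edge case is worth guarding against. If the songtitle contains no “ - ” at all, indexOf returns -1, so the two lines above produce an empty artist and a title with its first two characters chopped off. Here is a minimal sketch of a more defensive split; splitSongTitle is a hypothetical helper name, not part of the Hawkwynd Radio code.

```javascript
// Split a Shoutcast "songtitle" into artist and title on the FIRST " - ".
function splitSongTitle(meta) {
  const separator = ' - ';
  const index = meta.indexOf(separator);
  if (index === -1) {
    // No separator found: treat the whole string as the title.
    return { artist: '', title: meta.trim() };
  }
  return {
    artist: meta.slice(0, index).trim(),
    title: meta.slice(index + separator.length).trim(),
  };
}

console.log(splitSongTitle('ZZ Top - A Fool for Your Stockings'));
// { artist: 'ZZ Top', title: 'A Fool for Your Stockings' }
```

Because only the first separator is used, any later “ - ” in the string stays in the title.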

Now, we post those values to our API script and begin the process of getting the metadata from MusicBrainz.

Something like this:

function musicbrainzSearchFirst(a, t, flag) {
    // Post the artist and title to our backend, which queries MusicBrainz
    $.post("firstrecording.php", { artist: a, title: t })
        .done(function (data) {
            console.log('Got results from musicbrainz.org.');
            console.log('I searched artist:' + a + ' title:' + t);
            console.log(JSON.parse(data)); // $.parseJSON is deprecated since jQuery 3.0
        });
}

JavaScript posts the ARTIST and TITLE to our API code, which queries MusicBrainz for a match. It then sorts the results and keeps only the release with the oldest date, which is considered the FIRST release of the song. That is what I want to know: not re-releases or compilation releases, but the original release of the song. In addition, we want to set some ‘rules’ for MusicBrainz to adhere to.

firstrecording.php looks something like this:

use Guzzle\Http\Client;
use MusicBrainz\Filters\ArtistFilter;
use MusicBrainz\Filters\LabelFilter;
use MusicBrainz\Filters\RecordingFilter;
use MusicBrainz\Filters\ReleaseGroupFilter;
use MusicBrainz\HttpAdapters\GuzzleHttpAdapter;
use MusicBrainz\MusicBrainz;

// Artist and title posted by the JavaScript front end
$a = isset($_POST['artist']) ? trim($_POST['artist']) : '';
$t = isset($_POST['title'])  ? trim($_POST['title'])  : '';

// Create new MusicBrainz object
$brainz = new MusicBrainz(new GuzzleHttpAdapter(new Client()),'mySecretMusicBrainzUsername', 'myUltraSecretSuperStrongPassword');
$brainz->setUserAgent('Hawkwynd Radio', '1.0', 'http://stream.hawkwynd.com');

// Set default values

$releaseDate    = new DateTime();
$artistId       = null;
$songId         = null;
$trackLen       = -1;
$albumName      = '';
$lastScore      = null;
$firstRecording = array(
    'query'       => array(),
    'release'     => null,
    'releaseDate' => new DateTime(),
    'releaseCount' => null,    
    'recording'   => null,
    'artistId'    => null,
    'recordingId' => null,
    'trackLength' => null,
    'execution' => new stdClass(),
    'dump'      => new stdClass()
);

Now, we want to tell MusicBrainz to filter the results and return only Official releases (no bootlegs, etc.) whose primary release type is Album.

$args = array(
    "recording"     => $t,
    "artist"        => $a,
    "creditname"    => $a,
    "status"        => "Official",
    "primarytype"   => "Album"
);
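
For reference, the MusicBrainz search endpoint itself accepts a Lucene-style query string, so a filter like the one above presumably ends up as a set of field:"value" pairs. The sketch below builds such a URL by hand in JavaScript. buildRecordingSearchUrl and quoteLucene are hypothetical helpers of mine, and I have not verified that the result matches exactly what the PHP library sends.

```javascript
// Quote a value for a Lucene phrase, escaping backslashes and double quotes.
function quoteLucene(value) {
  return '"' + String(value).replace(/(["\\])/g, '\\$1') + '"';
}

// Hypothetical helper: turn a filter object into a recording search URL.
function buildRecordingSearchUrl(args) {
  const query = Object.entries(args)
    .map(([field, value]) => `${field}:${quoteLucene(value)}`)
    .join(' AND ');
  // fmt=json asks the web service for JSON instead of the default XML.
  const params = new URLSearchParams({ query, fmt: 'json' });
  return `https://musicbrainz.org/ws/2/recording?${params}`;
}

const url = buildRecordingSearchUrl({
  recording: 'A Fool for Your Stockings',
  artist: 'ZZ Top',
  creditname: 'ZZ Top',
  status: 'Official',
  primarytype: 'Album',
});

console.log(url);
```

Every field is joined with AND, so a recording has to satisfy all five conditions to be returned.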

Here, we iterate through the recordings returned and stop once the score drops below 99. A score of 100 is an exact match, but I’ve found that 99 is a better cutoff because MusicBrainz sometimes assigns the real deal a 99. I don’t know why, but it does.
Then we make sure we have the oldest release year before building our response object of metadata.

try {

    $recordings = $brainz->search(new RecordingFilter($args));
    $releases   = [];
    $out        = new stdClass();
    $lastScore  = null;
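    foreach ($recordings as $recording) {
        // Stop once the previous recording scored below 99
        // (assumes the results arrive sorted by score, highest first)
        if (null !== $lastScore && $lastScore < 99) {
            break;
        }

        $lastScore        = $recording->getScore();
        $releaseDates     = $recording->getReleaseDates();
        $oldestReleaseKey = key($releaseDates); // first key; relies on the dates being sorted oldest first

        // Skip recordings that have no release dates at all
        if (null === $oldestReleaseKey) {
            continue;
        }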

Now we filter the results: whenever a recording’s oldest release year is earlier than the one we are currently holding, we load that recording into our payload array.

        // Keep this recording only if its oldest release year predates the current candidate
        if ($releaseDates[$oldestReleaseKey]->format('Y') < $firstRecording['releaseDate']->format('Y')) {
            $firstRecording = array(
                'query'         => $args,
                'release'       => $recording->releases[$oldestReleaseKey],
                'releaseDate'   => $recording->releases[$oldestReleaseKey]->getReleaseDate(),
                'release-count' => count($recording->releases),
                'recording'     => $recording,
                'artist'        => $recording->getArtist(),
                'recordingId'   => $recording->getId(),
                'trackLength'   => $recording->getLength(),
                'execution'     => new stdClass(), // used for debugging
            );
        }
    }
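
The selection logic is easier to follow in isolation, so here is a small standalone sketch of the same idea in JavaScript. The input shape and the pickFirstRecording name are assumptions for illustration, and the sample data is made up apart from the Degüello release date. Note one deliberate difference: this version skips a low-scoring recording before looking at it, whereas the PHP above checks the score of the previous recording.

```javascript
// Assumed input: recordings sorted by score (highest first), each shaped like
// { title, score, releases: [{ title, date: 'YYYY-MM-DD' }] }.
function pickFirstRecording(recordings, minScore = 99) {
  let best = null;
  for (const recording of recordings) {
    // Stop as soon as the scores fall below the threshold.
    if (recording.score < minScore) break;
    for (const release of recording.releases) {
      if (!release.date) continue; // ignore releases without a date
      const year = Number(release.date.slice(0, 4));
      // Keep the release with the earliest year seen so far.
      if (best === null || year < best.year) {
        best = { year, release, recording };
      }
    }
  }
  return best;
}

// Made-up sample data for illustration.
const sample = [
  {
    title: 'A Fool for Your Stockings',
    score: 100,
    releases: [
      { title: 'Example Compilation', date: '1992-01-01' },
      { title: 'Degüello', date: '1979-08-27' },
    ],
  },
  {
    title: 'A Fool for Your Stockings (live)',
    score: 80,
    releases: [{ title: 'Example Bootleg', date: '1975-01-01' }],
  },
];

const first = pickFirstRecording(sample);
console.log(first.release.title, first.year); // Degüello 1979
```

The 1975 entry is never considered because its recording scores below 99, which is exactly the point of the cutoff: an older date on a poor match should not win.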

The results array is kicked back as JSON and looks like this:

{
    "firstRecording": {
        "release": {
            "id": "53e5b994-123f-319f-aba7-380dca7bca84",
            "title": "Degüello",
            "status": "Official",
            "quality": "",
            "language": "",
            "script": "",
            "date": "1979-08-27",
            "country": "US",
            "barcode": "",
            "artists": [],
            "brainz": {},
            "secondaryType": "",
            "label": null,
            "annotation": "",
            "coverart": "http://coverartarchive.org/release/53e5b994-123f-319f-aba7-380dca7bca84/11552046929-500.jpg"
        },
        "releaseDate": {
            "date": "1979-08-27 00:00:00.000000",
            "timezone_type": 3,
            "timezone": "UTC"
        },
        "recording": {
            "id": "f6150067-91bc-49b3-97de-4416c0c272a0",
            "title": "A Fool for Your Stockings",
            "score": 100,
            "brainz": {},
            "length": 256693,
            "artistID": "a81259a0-a2f5-464b-866e-71220f2739f1"
        },
        "recordingId": "f6150067-91bc-49b3-97de-4416c0c272a0",
        "trackLength": 256693,
        "artist": {
            "name": "ZZ Top",
            "id": "a81259a0-a2f5-464b-866e-71220f2739f1",
            "annotation": null,
            "disambiguation": "",
            "country": "US",
            "area": "United States",
            "begin_area": "Houston",
            "life-span": {
                "begin": "1969",
                "end": null
            }
        }
    }
}
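
On the page, a payload like this still needs a little formatting before it reads well. The following is a minimal sketch of how that might look; formatTrackLength and releaseYear are hypothetical helper names, the payload is a trimmed copy of the one above, and trackLength is taken to be in milliseconds, which is how MusicBrainz reports recording lengths.

```javascript
// Format a track length given in milliseconds as m:ss.
function formatTrackLength(ms) {
  const totalSeconds = Math.floor(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = String(totalSeconds % 60).padStart(2, '0');
  return `${minutes}:${seconds}`;
}

// Pull the four-digit year out of a "YYYY-MM-DD" (or "YYYY") date string.
function releaseYear(date) {
  return date ? date.slice(0, 4) : '';
}

// Trimmed copy of the payload shown above.
const payload = {
  firstRecording: {
    release: { title: 'Degüello', date: '1979-08-27' },
    recording: { title: 'A Fool for Your Stockings' },
    trackLength: 256693,
    artist: { name: 'ZZ Top' },
  },
};

const { release, recording, artist, trackLength } = payload.firstRecording;
const line = `${artist.name} - ${recording.title} ` +
  `(${release.title}, ${releaseYear(release.date)}) ${formatTrackLength(trackLength)}`;

console.log(line);
// ZZ Top - A Fool for Your Stockings (Degüello, 1979) 4:16
```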

Now that we have our payload array, let’s get some additional data from MusicBrainz about the artist, release and cover art.
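
As a taste of the cover art part: the Cover Art Archive exposes the front cover of a release at a predictable URL built from the release ID. Here is a minimal sketch, assuming we only want the front image. coverArtUrl is a hypothetical helper, and this is not necessarily how the coverart value in the payload above was obtained.

```javascript
// Build a Cover Art Archive URL for the front cover of a release.
// size: 250, 500 or 1200 for a thumbnail; omit it for the original image.
function coverArtUrl(releaseId, size) {
  const suffix = size ? `front-${size}` : 'front';
  return `https://coverartarchive.org/release/${encodeURIComponent(releaseId)}/${suffix}`;
}

console.log(coverArtUrl('53e5b994-123f-319f-aba7-380dca7bca84', 500));
// https://coverartarchive.org/release/53e5b994-123f-319f-aba7-380dca7bca84/front-500
```

The archive answers with a redirect to the actual image, or a 404 when a release has no front cover, so the page should be prepared to fall back to a placeholder image.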
