Ever since the release of her first album in 2006, Taylor Swift has become an international sensation. As a talented singer and songwriter, her greatest hits have spanned several genres. From the country “Fearless” album (including “You Belong With Me” and “Love Story”) to the synth-pop “1989” album (including “Blank Space” and “Wildest Dreams”) to the urban “reputation” album (including “Look What You Made Me Do”) to the indie folk and alternative rock album “evermore” (including “willow”), Swift never fails to captivate her listeners.

So what is it about her music that makes it so popular, even across multiple genres?

Using Spotify’s Web API, we can get a closer look at hundreds of her songs. Although using APIs can sound intimidating, Spotify has done a tremendous job making the process quick and easy. Additionally, Spotify has many interesting audio features for each track that we will take a look at.

(This post will only cover the data collection. The data analysis will follow in the next post)

Creating an Account and Registering an Application with Spotify

To begin, you will need to create a free account with Spotify if you do not already have one. Next, login to the developer dashboard. Here you will have to read and agree to the Terms of Service. Reading these, I determined that the following process is allowable under Spotify’s Terms of Service. Feel free to do the same if you run into any questions about what is or is not allowed when using this API.

From here, you will create and register an application. Once this is created, you can find your Client ID and your Client Secret which you will use to access Spotify in python. It is generally best practice to write down this information in a secure location and not share it with others as this information is specific to your API access.

(Much of this initial process is also covered in the Spotify Web API Documentation)

Setting up Your Python Environment

After opening your python IDE, you will need to load the following packages:

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import pandas as pd

Next, you will need to authenticate your connection with this API using your Client ID and Client Secret like so:

cid = "Insert your Client ID"
secret = "Insert your Client Secret"
client_credentials_manager = SpotifyClientCredentials(client_id = cid, client_secret = secret)
sp = spotipy.Spotify(client_credentials_manager = client_credentials_manager)

With this setup, you are ready to move on to the exciting part - importing the data!

(In this tutorial, I will show you how to import all of the songs from Taylor Swift’s albums on Spotify. However, this method applies to any other artist too. I encourage you to look at the Spotify reference for these methods. This reference is very easy to read and explains the JSON data structures that the following methods will return.)

Accessing an Artist’s Albums on Spotify

First, I would recommend opening the Spotify Web Player to follow along visually. Search for “Taylor Swift” and click on her artist profile. Then, copy and paste the URL to this page into your IDE. Your URL should look like this: “https://open.spotify.com/artist/06HL4z0CvFAxyc27GXpf02”. Notice how, after “/artist/” there is a string of letters and numbers. This is called the Spotify ID for this artist. This API makes use of similar Spotify IDs for tracks, playlists, and albums.

Now, using Taylor Swift’s artist Spotify ID, we can access Spotify’s stored information by calling the following method:

taylor_swift = sp.artist_albums('06HL4z0CvFAxyc27GXpf02')['items']

Retrieving the Track Data From Each Album

Right now, taylor_swift is a list of all of Swift’s albums. To get all tracks from each album, we will use this method. I will use a for loop to iterate through each album and extract the desired information into a track. I chose to extract the name, album name, release date, popularity, duration, and danceability, but you are welcome to choose any features that are of interest to you. To explore some of the audio features Spotify has available for each track, check out this reference.

(Caution: The following code may take a few minutes to run so feel free to only run this for the first couple of Swift’s albums)

tracks = []
for album in taylor_swift:
 album_name = album['name']
 release_date = album['release_date']
 
 album_tracks = sp.album_tracks(album['uri'])["items"]
 
 for song in album_tracks:
 name = song['name']
 duration = song['duration_ms']
 danceability = sp.audio_features(song['uri'])[0]['danceability']
 popularity = sp.track(song['uri'])['popularity']
 track = [name, album_name, release_date, popularity, duration, danceability]
 tracks.append(track)

So now you should have an array of all of Swift’s songs from these albums! To be able to analyze these, go ahead and put these into a data frame like so:

all_songs = pd.DataFrame(tracks, columns=["name", "album_name", "release_date", "popularity", "duration", "danceability"])
all_songs

This should return a nice table like this:

API Table

If you got to this point, great job! If you are stuck or have questions on any of this process please post your questions below, I am happy to help!

Also, stay tuned for the next post where I will guide you through some exciting analyses. There is so much that could be analyzed in this dataset and I am still debated what to research specifically, so if you have any suggestions please comment them below! One idea is to include trying to visualize Taylor Swift’s popularity over the years using the “popularity” and “release_date” variables. Another is to use the “duration” and “daceability” to see if they can be used to predict the popularity of her songs.

In the meantime, have fun exploring the Spotify API and try your hand at the challenges below.

(If needed, consult my GitHub Repository)

Bonus Challenges

  • Try doing the same process for your favorite artist.
  • See if you can do something similar but for a playlist, not an artist.
  • Try doing some data analysis of this data on your own and see what you find!