Scrape Spotify’s API within 20 mins

Alpar Gür
5 min read · Jul 2, 2023
Photo by Jonas Leupe on Unsplash

Recently, I worked on a project to build a machine learning model that predicts the genre of a track from its audio features. To do that, we needed tracks and their audio features. Fortunately, Spotify offers several APIs for retrieving tracks and their pre-analysed audio features.

In this post, I will share how we collected the data to train our models. Without further ado, let’s get started!


  • Calling Spotify’s APIs requires authentication, so you will need a client ID and a client secret to obtain an access token. You will find these in the settings tab of your app on Spotify for Developers. If you haven’t used it before, you can create a Spotify app there.
  • For scraping the APIs we’ll use Python, pandas and Jupyter, so make sure your development environment is set up and up to date.


First of all, create the environment variables from the command line and reload your environment:

$ export SPOTIFY_CLIENT_ID=<your-client-id>
$ export SPOTIFY_CLIENT_SECRET=<your-client-secret>
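
Before going further, it can help to confirm that the variables are actually visible to Python, since a Jupyter kernel started before the export will not see them. The following is a minimal sketch; `require_env` and the throwaway variable `DEMO_SPOTIFY_VAR` are hypothetical names used for illustration, not part of the original code:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError('Environment variable {} is not set'.format(name))
    return value


# Demonstration with a throwaway variable so the snippet runs anywhere;
# in the notebook you would pass 'SPOTIFY_CLIENT_ID' and 'SPOTIFY_CLIENT_SECRET'.
os.environ['DEMO_SPOTIFY_VAR'] = 'demo-value'
print(require_env('DEMO_SPOTIFY_VAR'))
```

Failing early here is easier to debug than a rejected token request later on.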

The code snippet below imports the necessary libraries and defines a function that calls the Spotify API to get an authentication token using the client credentials:

# import packages
import os
import json
import requests
import pandas as pd

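def get_access_token(client_id: str, client_secret: str, grant_type: str = 'client_credentials'):
    # token endpoint of the Spotify Accounts service
    url = 'https://accounts.spotify.com/api/token'
    # send the credentials as form fields in the request body
    payload = {'grant_type': grant_type, 'client_id': client_id, 'client_secret': client_secret}
    response = requests.post(url, data=payload, headers={'Content-Type': 'application/x-www-form-urlencoded'})
    # prefix the token so it can be used directly as an Authorization header value
    access_token = 'Bearer ' + json.loads(response.text)['access_token']

    return access_token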

# get access token
grant_type = 'client_credentials'
client_id = os.getenv('SPOTIFY_CLIENT_ID')
client_secret = os.getenv('SPOTIFY_CLIENT_SECRET')

access_token = get_access_token(client_id, client_secret, grant_type)
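
Note that `get_access_token` indexes `['access_token']` directly, so a rejected request (for example, a wrong client secret) surfaces as a bare `KeyError`. As a hedged sketch, the parsing step could be pulled into a small helper that reports the failure more clearly. `bearer_from_token_payload` is a hypothetical name, and the `error` / `error_description` keys are an assumption based on the usual OAuth 2.0 error format rather than something shown in this post:

```python
def bearer_from_token_payload(payload: dict) -> str:
    """Build the Authorization header value from a decoded token response."""
    if 'access_token' not in payload:
        # Assumption: failed requests carry OAuth-style 'error' / 'error_description' keys.
        reason = payload.get('error_description') or payload.get('error') or 'unknown error'
        raise RuntimeError('Token request failed: {}'.format(reason))
    return 'Bearer ' + payload['access_token']


# Hand-written payload for illustration (not real API output)
print(bearer_from_token_payload({'access_token': 'abc123', 'token_type': 'Bearer'}))
```

Inside `get_access_token`, this helper would take the place of the line that builds `access_token` from `json.loads(response.text)`.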

The following is a generic function for making API calls, given only the target URL and an access token:

def get_data(url: str, access_token: str, verbose: bool =…