How to connect Jupyter Notebook with a database using MySQL Connector
Jupyter Notebook is very handy and it’s fun to use when we’re working with data. In some cases we may want to connect to a database rather than loading a file every single time especially if the dataset is constantly growing.
In this post i will be showing you how to connect your Jupyter Notebook with a local / remote database using MySQL Connector. The only pre-requisites are having your Jupyter environment installed and a database to connect :)
So let’s get started!
First, you’ll need open terminal and install the mysql-connector package:
pip3 install mysql-connector
After installing the package open a Jupyter Notebook and follow the steps below.
import os
import mysql.connector
I would suggest you to declare your credentials and other connection informations as environment variables and then use them for a much more secure connection. Once you have your variables ready, you can enter the name of corresponding environment variables in brackets in the code below.
username = os.environ.get('')
password = os.environ.get('')
host = os.environ.get('')
And now create the database connection.
cnx = mysql.connector.connect(host = host,
port = #optional,
user = username,
passwd = password,
database = #database
)
Now you should have been connected to your database. If you are trying to connect to remote database and receiving an error about authentication plug-in add the line below into mysql.connector.connect function.
auth_plugin='mysql_clear_password'
VOILA! You are set to run SQL-Queries in your Jupyter Notebook and more importantly have the ability to process the data from your database without exporting it.
And now let’s create a dataframe using an SQL-Query.
df = pd.read_sql_query('Select * FROM Table', cnx)
After running this code above, a dataframe with the every rows & columns of selected table will be created.
This post covers a minor but highly crucial part of data analysis. Without the ingredients you can’t cook and without the data it’s extremely hard to do analysis.
It took me a while to create this connection between a database and Jupyter Notebook, so I hope this post will save some time for people out there facing with the same issue.