Setting Up a Python Environment for Data Science

A practical guide to installing Python, managing packages, and preparing Jupyter before your first dataset — Windows, macOS, and Linux.

  • python
  • setup
  • jupyter

Before analysing data, you need a working environment. This is the setup I followed at the start — install the tools, verify they run, then move on to learning. Steps below cover Windows, macOS, and Linux.

What you are installing

Tool                  Purpose
Python                The language you write code in
pip                   Installs third-party packages
venv                  Keeps project packages isolated
Jupyter               Notebook interface: run code cell by cell
pandas                Loads and manipulates tables of data
matplotlib / seaborn  Creates charts and plots

Step 1 — Install Python

Windows

  1. Download the installer from python.org/downloads
  2. Run it and check “Add python.exe to PATH” at the bottom of the first screen — easy to miss, but important
  3. Click Install Now

Open Command Prompt or PowerShell and check:

python --version

You should see something like Python 3.12.x.

On Windows the command is usually python, not python3. If python does not work, try py --version instead.

macOS

Download from python.org or use Homebrew:

brew install python
python3 --version

Linux

Python is often pre-installed. If not:

sudo apt update && sudo apt install python3 python3-venv python3-pip   # Ubuntu / Debian
python3 --version
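
On any of the three systems, you can also ask Python itself which interpreter is running. The snippet below is a minimal sketch, not part of the official install steps. The minimum of 3.9 is my own assumption for illustration; check what the packages you plan to use actually require.

```python
import sys

# Hypothetical minimum for this guide; adjust to what your packages require.
MINIMUM_VERSION = (3, 9)


def version_ok(current=None, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(current[:2]) >= tuple(minimum)


# Show which interpreter is running and whether it is recent enough.
print("Interpreter:", sys.executable)
print("Version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Meets minimum:", version_ok())
```

Save it as a file (for example check_python.py, a name chosen here only as an example) and run it with the same python or python3 command you used above.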

Step 2 — Create a virtual environment

A virtual environment is a separate folder that holds the packages for one project, so the versions you install here do not clash with those used by other projects.

Windows (Command Prompt or PowerShell):

mkdir my-data-science
cd my-data-science
python -m venv venv
venv\Scripts\activate

macOS / Linux:

mkdir my-data-science
cd my-data-science
python3 -m venv venv
source venv/bin/activate

When the environment is active, your terminal shows (venv) at the start of the prompt.

To leave the environment later, run deactivate.
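
If you are unsure whether the prompt is telling the truth, Python can confirm it. This is a small sketch based on the comparison described in the venv documentation: inside a virtual environment, sys.prefix points at the environment folder, while sys.base_prefix points at the Python installation it was created from. The function name is my own.

```python
import sys


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True when the interpreter belongs to a virtual environment.

    Inside a venv, sys.prefix is the environment folder and
    sys.base_prefix is the Python installation it was created from.
    Outside a venv, the two values are the same.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


print("Running inside a venv:", in_virtual_environment())
print("Environment folder:", sys.prefix)
```

Run it once with the environment active and once after deactivate; the first line of output should flip between True and False.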

Step 3 — Upgrade pip and install packages

pip is Python’s package manager. Upgrade it first to avoid common warnings:

python -m pip install --upgrade pip

Use python -m pip instead of typing pip alone: it runs pip for the interpreter in your active environment, so packages cannot end up in a different Python installation.
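
The same idea applies if you ever call pip from a script. The sketch below builds the command from sys.executable, which is the path of the interpreter running the script, so pip is always tied to that interpreter. The helper name pip_command is hypothetical, made up for this example.

```python
import sys


def pip_command(*args):
    """Build a pip command line bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


# Example: the command list for upgrading pip itself.
print(pip_command("install", "--upgrade", "pip"))
```

To actually run the command, pass the list to subprocess.run from the standard library, for example subprocess.run(pip_command("--version")).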

Then install the core data science stack:

python -m pip install pandas numpy matplotlib seaborn jupyter

Package     What it does
pandas      Reads CSV files into tables (DataFrames)
numpy       Fast maths on numbers and arrays
matplotlib  Base plotting library
seaborn     Easier, nicer statistical charts built on matplotlib
jupyter     Runs .ipynb notebook files in the browser
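
To see what pip actually installed, you can list the version of each package. This is a rough sketch using importlib.metadata from the standard library (Python 3.8 and later); the helper name installed_version is my own, and the package list simply mirrors the install command above.

```python
from importlib import metadata

# The packages from the install command above.
PACKAGES = ["pandas", "numpy", "matplotlib", "seaborn", "jupyter"]


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# Print one line per package: its version, or a note that it is missing.
for name in PACKAGES:
    print(f"{name:<12}{installed_version(name) or 'not installed'}")
```

Any line that says "not installed" means the install did not reach this environment, which usually points to the virtual environment not being active when pip ran.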

Step 4 — Launch Jupyter

jupyter notebook

This opens a browser tab. Click New and choose a Python 3 notebook (the exact menu label varies between Jupyter versions). Each box is a cell: type code, then press Shift + Enter to run it.

If jupyter is not recognised, try:

python -m jupyter notebook
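
When a command is "not recognised", the usual cause is that its folder is not on PATH. A quick way to check is shutil.which from the standard library, which returns the full path of a command or None. This is only a diagnostic sketch; the helper name find_command is my own.

```python
import shutil


def find_command(name):
    """Return the full path of a command on PATH, or None if it is not found."""
    return shutil.which(name)


location = find_command("jupyter")
if location is None:
    print("jupyter is not on PATH; try: python -m jupyter notebook")
else:
    print("jupyter found at:", location)
```

If the path printed is outside your project's venv folder, the terminal is picking up a different installation, and activating the environment first should fix it.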

Step 5 — Quick verification

Run this in your first cell:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

print("Setup complete")
print("Pandas version:", pd.__version__)

If no errors appear, your environment is ready.
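
If something does fail, it helps to know exactly which library is the problem. The sketch below tries each import separately and collects the error messages instead of stopping at the first one. The function name failed_imports is hypothetical, and the module list mirrors the imports in the cell above.

```python
import importlib

# The modules imported in the verification cell.
MODULES = ["pandas", "seaborn", "matplotlib.pyplot"]


def failed_imports(module_names):
    """Try importing each module; return {name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as error:
            # Broad on purpose: a broken install can fail with errors
            # other than ImportError, and this is only a diagnostic.
            failures[name] = f"{type(error).__name__}: {error}"
    return failures


problems = failed_imports(MODULES)
if problems:
    for name, message in problems.items():
        print(f"{name}: {message}")
else:
    print("All imports succeeded")
```

Each failing module can then be reinstalled on its own with python -m pip install followed by the package name.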

Common issues

python or pip not found (Windows) — reinstall Python and make sure “Add python.exe to PATH” was checked. Close and reopen your terminal after installing.

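PowerShell blocks activation (Windows): if venv\Scripts\activate fails, loosen the script execution policy for your own user account. The CurrentUser scope does not need administrator rights. Run this once in PowerShell: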

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Or use Command Prompt instead of PowerShell.

pip not found (macOS / Linux) — use python3 -m pip instead of pip.

Permission errors — make sure your virtual environment is activated before installing. You should see (venv) in your prompt.

Yellow pip warnings — usually safe to ignore after upgrading pip. If a package fails, read the last few lines of the error; it often names the missing dependency.

Jupyter won’t open — try python -m jupyter lab as an alternative, or reinstall with python -m pip install --upgrade jupyter.

What comes next

With Python running and libraries installed, you are ready to load a real dataset and explore it. That is where pandas, charts, and groupby come in — the fun part.