Setting Up a Python Environment for Data Science
A practical guide to installing Python, managing packages, and preparing Jupyter before your first dataset — Windows, macOS, and Linux.
- python
- setup
- jupyter
Before analysing data, you need a working environment. This is the setup I followed at the start — install the tools, verify they run, then move on to learning. Steps below cover Windows, macOS, and Linux.
What you are installing
| Tool | Purpose |
|---|---|
| Python | The language you write code in |
| pip | Installs third-party packages |
| venv | Keeps project packages isolated |
| Jupyter | Notebook interface — run code cell by cell |
| pandas | Loads and manipulates tables of data |
| matplotlib / seaborn | Creates charts and plots |
Step 1 — Install Python
Windows
- Download the installer from python.org/downloads
- Run it and check “Add python.exe to PATH” at the bottom of the first screen — easy to miss, but important
- Click Install Now
Open Command Prompt or PowerShell and check:
python --version
You should see something like Python 3.12.x.
On Windows the command is usually python, not python3. If python does not work, try py --version instead.
macOS
Download from python.org or use Homebrew:
brew install python
python3 --version
Linux
Python is often pre-installed. If not:
sudo apt update && sudo apt install python3 python3-venv python3-pip # Ubuntu / Debian
python3 --version
Step 2 — Create a virtual environment
A virtual environment is a separate folder for packages so they do not clash with other projects.
Windows (Command Prompt or PowerShell):
mkdir my-data-science
cd my-data-science
python -m venv venv
venv\Scripts\activate
macOS / Linux:
mkdir my-data-science
cd my-data-science
python3 -m venv venv
source venv/bin/activate
When active, your terminal shows (venv) at the start.
To leave the environment later, run deactivate.
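If you want to confirm from inside Python that the environment is really active, one option is to compare sys.prefix with sys.base_prefix; the two differ when the interpreter runs from an environment created by venv. This is a minimal sketch, and the function name in_virtualenv is only illustrative:

```python
import sys


def in_virtualenv():
    """Return True when this interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Python version:", ".".join(map(str, sys.version_info[:3])))
print("Inside a virtual environment:", in_virtualenv())
```

Run it with python while (venv) is showing in your prompt. The last line should then report True, and the interpreter path should sit inside the venv folder you just created.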
Step 3 — Upgrade pip and install packages
pip is Python’s package manager. Upgrade it first to avoid common warnings:
python -m pip install --upgrade pip
Use python -m pip instead of typing pip alone; it works reliably on every OS.
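One way to see why this form is dependable: python -m pip runs pip with whichever interpreter python resolves to, so packages are installed for that interpreter and not for some other Python on the machine. The sketch below expresses the same idea from Python using sys.executable. The helper name pip_command is hypothetical and not part of pip:

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation tied to the interpreter running this script."""
    # sys.executable is the full path of the current Python, so "-m pip"
    # targets the same environment this script is running in.
    return [sys.executable, "-m", "pip", *args]


# Ask pip for its version and location. check=False avoids raising an
# exception if pip happens to be missing from this interpreter.
result = subprocess.run(
    pip_command("--version"), capture_output=True, text=True, check=False
)
print(result.stdout.strip() or result.stderr.strip())
```

With the environment active, the path printed by pip should point inside your venv folder.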
Then install the core data science stack:
python -m pip install pandas numpy matplotlib seaborn jupyter
| Package | What it does |
|---|---|
| pandas | Reads CSV files into tables (DataFrames) |
| numpy | Fast maths on numbers and arrays |
| matplotlib | Base plotting library |
| seaborn | Easier, nicer statistical charts built on matplotlib |
| jupyter | Runs .ipynb notebook files in the browser |
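To check what actually got installed, you could list the versions with the standard library's importlib.metadata. This is a sketch, not an official check; the helper name installed_versions is made up for illustration, and the package list simply mirrors the install command above:

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as passed to "pip install" in this step.
PACKAGES = ["pandas", "numpy", "matplotlib", "seaborn", "jupyter"]


def installed_versions(names):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed for this interpreter.
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:<12} {ver or 'not installed'}")
```

Any row that says "not installed" suggests the install ran against a different Python, which is usually a sign the virtual environment was not active.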
Step 4 — Launch Jupyter
jupyter notebook
This opens a browser tab. Click New → Python 3 to create a notebook. Each grey box is a cell: type code, then press Shift + Enter to run it.
If jupyter is not recognised, try:
python -m jupyter notebook
Step 5 — Quick verification
Run this in your first cell:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
print("Setup complete")
print("Pandas version:", pd.__version__)
If no errors appear, your environment is ready.
Common issues
python or pip not found (Windows) — reinstall Python and make sure “Add python.exe to PATH” was checked. Close and reopen your terminal after installing.
PowerShell blocks activation (Windows) — if venv\Scripts\activate fails, run this once in PowerShell (the CurrentUser scope does not need an administrator session):
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
Or use Command Prompt instead of PowerShell.
pip not found (macOS / Linux) — use python3 -m pip instead of pip.
Permission errors — make sure your virtual environment is activated before installing. You should see (venv) in your prompt.
Yellow pip warnings — usually safe to ignore after upgrading pip. If a package fails, read the last few lines of the error; it often names the missing dependency.
Jupyter won’t open — try python -m jupyter lab as an alternative, or reinstall with python -m pip install --upgrade jupyter.
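Several of the issues above come down to a command not being on your PATH. As a rough diagnostic, the sketch below uses shutil.which from the standard library to show where each command named in this guide resolves. The helper name locate_commands is illustrative, and it is normal for some entries to be missing (for example, py exists only on Windows):

```python
import shutil

# Command names mentioned in this guide; not every OS provides all of them.
COMMANDS = ["python", "python3", "py", "pip", "jupyter"]


def locate_commands(names):
    """Map each command name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


for name, path in locate_commands(COMMANDS).items():
    print(f"{name:<8} {path or 'not found on PATH'}")
```

With the virtual environment activated, python, pip, and jupyter should resolve to paths inside the venv folder. If they point elsewhere, activate the environment and run the script again.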
What comes next
With Python running and libraries installed, you are ready to load a real dataset and explore it. That is where pandas, charts, and groupby come in — the fun part.