
Goal of this section

By the end of this section you will be able to:

  • Install Python in a reproducible environment
  • Start JupyterLab and run code cells
  • Know what an “environment” is and why we use it in bioinformatics

Why environments matter

In bioinformatics, results must be reproducible. If you and a collaborator install “Python + packages” on different days, you can end up with different package versions and different results.

A conda environment is a self-contained box that holds Python plus exactly the packages we choose.


Install Miniconda

Install Miniconda from the official instructions:

  • Miniconda download + install guide: https://www.anaconda.com/docs/getting-started/miniconda/main

After installation, open a terminal.

Hint: On Windows, the easiest route is to use winget on the command line:

  • Open a terminal and run winget install Miniconda3 Python.Python.3.10
  • Open a new terminal
  • In the new terminal, make conda available with & "$HOME\miniconda3\shell\condabin\conda-hook.ps1"

Create and activate the course environment

Create an environment (one time):

conda create -n PythonCourse python=3.10

Activate it:

conda activate PythonCourse

From now on, whenever you work on this course, start by activating:

## Windows only
& "$HOME\miniconda3\shell\condabin\conda-hook.ps1"
## all platforms
conda activate PythonCourse
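
To confirm that the right environment is active, you can ask Python where it is running from. The snippet below is a minimal sketch, not part of the official course material: the helper name describe_environment is made up for illustration, and it assumes that conda sets the CONDA_DEFAULT_ENV variable on activation (if that variable is missing, the value is simply None).

```python
import os
import sys


def describe_environment():
    """Collect basic facts about the running Python interpreter."""
    return {
        # Path of the interpreter that is executing this code
        "executable": sys.executable,
        # Root directory of the environment this interpreter belongs to
        "prefix": sys.prefix,
        # Interpreter version as "major.minor.micro"
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # Assumption: conda sets this on activation; None if it is not set
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Save this as a small script and run it with python from the terminal. If the environment is active, the executable and prefix paths should point inside a folder named PythonCourse, and the version should start with 3.10.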

Install the packages we’ll use

With the environment activated, install everything with pip:

pip install jupyterlab notebook nbclient ipykernel
pip install numpy pandas matplotlib seaborn scipy statsmodels scikit-learn

Notes:

  • numpy + pandas: data handling
  • matplotlib + seaborn: plotting
  • scipy: clustering and distances
  • statsmodels: linear models / ANOVA
  • scikit-learn: standard ML tools (we’ll use it to compare with “from-scratch” ideas)
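
Because reproducibility depends on knowing exactly which versions were installed, it can help to record them. The following is a hedged sketch using the standard library's importlib.metadata; the names COURSE_PACKAGES and installed_versions are hypothetical helpers introduced here, not something the course defines.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages installed for this course (names as used with pip)
COURSE_PACKAGES = [
    "jupyterlab",
    "numpy",
    "pandas",
    "matplotlib",
    "seaborn",
    "scipy",
    "statsmodels",
    "scikit-learn",
]


def installed_versions(packages):
    """Map each package name to its installed version, or 'not installed'."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # The package is missing from the current environment
            report[name] = "not installed"
    return report


for name, ver in installed_versions(COURSE_PACKAGES).items():
    print(f"{name:15s} {ver}")
```

If any line reads "not installed", the environment was probably not active when pip ran. Sharing this printout with a collaborator makes it easy to spot version differences.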


Start JupyterLab

From within the activated environment, you can now start the Jupyter server:

jupyter lab

A browser window should open. Create a new notebook using the kernel:

Python 3 (ipykernel)


Choosing an editor (later)

For notebooks / scripts, JupyterLab is great. For writing reusable packages, use an IDE:

  • VS Code (recommended): general purpose, very popular
  • PyCharm: powerful Python IDE, lots of features

For this course, we’ll mostly use JupyterLab, and you can stick with it even for the final project. If you decide to turn the final project into a Python package (optional), I recommend using an IDE.


Quick check

Run this in a notebook cell:

import numpy as np
import pandas as pd

print(np.arange(1, 6))
print(pd.DataFrame({"a": [1, 2], "b": [3, 4]}))

If this runs without errors, you’re ready.