This guide provides detailed instructions for setting up HealthProcessAI in different environments.
The fastest way to get started depends on your environment:
Conda provides isolated environments and manages both Python and system dependencies.
git clone https://github.com/ki-smile/HealthProcessAI.git
cd HealthProcessAI
# Create environment from file
conda env create -f environment.yml
# Activate the environment
conda activate healthprocessai
# Check Python version
python --version # Should show Python 3.10.x
# Test imports
python -c "import pm4py; print(f'PM4PY version: {pm4py.__version__}')"
python -c "import pandas; print(f'Pandas version: {pandas.__version__}')"
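The import checks above can also be run in one pass. The following is a minimal sketch, not part of HealthProcessAI: the helper name `missing_packages` is hypothetical, and the package list is simply copied from the install commands in this guide.

```python
import importlib.util

# Package names copied from the pip install commands in this guide (an assumption
# about what your setup needs; adjust as required).
REQUIRED = ["pm4py", "pandas", "numpy", "matplotlib", "seaborn", "graphviz", "requests"]


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

`find_spec` only locates a package without importing it, so the check stays fast even for heavy libraries.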
# Create .env file
echo "OPENROUTER_API_KEY=your-key-here" > .env
# Or export directly
export OPENROUTER_API_KEY="your-key-here"
# Navigate to examples
cd examples
# Run patient flow example
python example_patient_flow.py
# Run complete pipeline
python complete_pipeline_example.py
# Start Jupyter Lab
jupyter lab
# Or classic notebook
jupyter notebook
Google Colab provides free cloud-based Jupyter notebooks with GPU support.
Click the badge above to open our pre-configured notebook.
Create a new notebook and run these cells:
# Install required packages
!pip install pm4py pandas numpy matplotlib seaborn graphviz requests -q
!pip install scikit-learn python-dotenv markdown weasyprint -q
# Install system dependencies
!apt-get update -qq
!apt-get install -y graphviz -qq
print("✅ All packages installed!")
# Clone HealthProcessAI repository
!git clone https://github.com/ki-smile/HealthProcessAI.git
# Navigate to repository (the directory name is case-sensitive on Colab)
import os
os.chdir('HealthProcessAI')
# Add to Python path
import sys
sys.path.append('/content/HealthProcessAI')
print("✅ Repository cloned and configured!")
# Option 1: Direct input
OPENROUTER_API_KEY = "" # @param {type:"string"}
# Option 2: Use Colab secrets (recommended); only used if Option 1 is left empty
from google.colab import userdata
if not OPENROUTER_API_KEY:
    try:
        OPENROUTER_API_KEY = userdata.get('OPENROUTER_API_KEY')
        print("✅ API key loaded from secrets")
    except Exception:
        print("⚠️ Add your API key above or in Colab secrets")
# Import HealthProcessAI modules
from core.step1_data_loader import EventLogLoader
from core.step2_process_mining import ProcessMiner
from core.step3_llm_integration import LLMAnalyzer
print("✅ HealthProcessAI imported successfully!")
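# Load sample data
import io
import pandas as pd
# Option 1: Use provided data
df = pd.read_csv('data/sepsisAgregated_Infection.csv')
# Option 2: Upload your own (uncomment to replace the sample data)
# from google.colab import files
# uploaded = files.upload()  # dict mapping filename -> file contents as bytes
# df = pd.read_csv(io.BytesIO(next(iter(uploaded.values()))))
print(f"✅ Loaded {len(df)} events")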
Mount Google Drive to save results:
from google.colab import drive
drive.mount('/content/drive')
# Save results to Drive
output_path = '/content/drive/MyDrive/HealthProcessAI_Results/'
!mkdir -p {output_path}
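Once Drive is mounted, results can be written there with ordinary file operations. A minimal sketch, assuming a hypothetical helper named `save_report` and Markdown output; neither is part of HealthProcessAI:

```python
from datetime import datetime
from pathlib import Path


def save_report(text, output_dir, prefix="report"):
    """Write text to a timestamped Markdown file in output_dir and return its path."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    target = out / f"{prefix}_{stamp}.md"
    target.write_text(text, encoding="utf-8")
    return target
```

In Colab this could be called as `save_report(report_text, output_path)`, where `report_text` is whatever string you want to keep.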
Enable a GPU under Runtime > Change runtime type, then check that it is visible:
import tensorflow as tf
print("GPU Available: ", tf.config.list_physical_devices('GPU'))
# Create virtual environment
python -m venv healthprocessai-env
# Activate (Windows)
healthprocessai-env\Scripts\activate
# Activate (Mac/Linux)
source healthprocessai-env/bin/activate
# Install requirements
pip install -r requirements.txt
# Install package in development mode
pip install -e .
pip install healthprocessai
# Windows: install Graphviz, then add its bin directory to PATH
C:\Program Files\Graphviz\bin
# macOS
brew install graphviz
sudo apt-get install graphviz
# or
sudo yum install graphviz
docker pull ki-smile/healthprocessai:latest
docker run -p 8888:8888 ki-smile/healthprocessai
# Clone repository
git clone https://github.com/ki-smile/HealthProcessAI.git
cd HealthProcessAI
# Build image
docker build -t healthprocessai .
# Run container
docker run -p 8888:8888 -v "$(pwd)/data:/app/data" healthprocessai
# Solution: Install with specific version
pip install pm4py==2.7.11.7
# Solution: Install system package
conda install graphviz python-graphviz -c conda-forge
# Solution: Process in chunks
import pandas as pd
chunk_size = 10000
for chunk in pd.read_csv('large_file.csv', chunksize=chunk_size):
    process_chunk(chunk)  # replace with your own per-chunk processing
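The per-chunk function is left to the reader. One possible shape is to keep a running aggregate so that only one chunk is in memory at a time. This is a sketch under assumptions: the column name `activity` is hypothetical, and this `process_chunk` takes an extra `totals` argument that the call in the snippet above does not pass.

```python
from collections import Counter


def process_chunk(chunk, totals, column="activity"):
    """Add this chunk's value counts for `column` to the running totals."""
    totals.update(Counter(chunk[column]))
    return totals


# Stand-in chunks (plain dicts of lists) so the sketch runs without pandas.
# DataFrame chunks from pd.read_csv(..., chunksize=...) can be passed the same way,
# because chunk[column] only needs to be iterable.
chunks = [
    {"activity": ["Admission", "Triage", "Admission"]},
    {"activity": ["Triage", "Discharge"]},
]

totals = Counter()
for chunk in chunks:
    process_chunk(chunk, totals)

print(totals.most_common())
```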
# Solution: Add delays between requests
import time
time.sleep(1) # Wait 1 second between API calls
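A fixed delay is the simplest fix. If rate-limit errors persist, retrying with exponential backoff is a common alternative. The sketch below is illustrative only: `call_with_backoff` and `request_fn` are hypothetical names, and it catches every exception for brevity, whereas real code should retry only on the rate-limit error your client raises (for example an HTTP 429 response).

```python
import random
import time


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request_fn, retrying with exponentially growing delays on failure."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay * 1, 2, 4, ... seconds, plus up to 0.5 s of jitter
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.5))
```

Wrap each API call in a zero-argument function or lambda and pass it as `request_fn`. The `sleep` parameter exists so the delays can be stubbed out in tests.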
Create .env file in project root:
OPENROUTER_API_KEY=your-key-here
HEALTHPROCESSAI_DATA_DIR=/path/to/data
HEALTHPROCESSAI_OUTPUT_DIR=/path/to/output
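How HealthProcessAI itself reads these values is not shown here. As a dependency-free sketch of the idea, the hypothetical helpers below parse a simple `KEY=VALUE` file and let real environment variables take precedence over it. The `python-dotenv` package installed earlier offers a fuller implementation of the same approach.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_setting(name, file_values, default=None):
    """Prefer a non-empty environment variable, then the .env value, then the default."""
    return os.environ.get(name) or file_values.get(name, default)
```

For example, `get_setting("HEALTHPROCESSAI_OUTPUT_DIR", load_env_file(), "output")` falls back to `output` when the variable is set in neither place.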
Run the test suite:
# Run all tests
pytest tests/
# Run with coverage
pytest --cov=healthprocessai tests/
# Run specific test
pytest tests/test_data_loader.py
If you encounter issues:
Developed at SMAILE, Karolinska Institutet