System Data Flow
Data flows from raw hospital reference data through causal AI to actionable QC insights. All components communicate through the FastAPI backend.
Data Layer
Synthetic QC data is generated by data/synthetic/generate.py and calibrated against MIMIC-IV Demo distributions (PhysioNet, 2023).
180 days × 3 instruments × 8 tests × 3 QC levels = 12,960 combinations; the dataset holds 116,640 records, i.e. 9 per combination.
QC Engine
Full Westgard multi-rule implementation with clinically appropriate, tiered time windows per rule type, so that stale violations do not inflate the FAIL rate. Rules: 1-2s, 1-3s, 2-2s, R-4s, 4-1s, 10x.
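The six rules can be sketched as checks on a chronological list of z-scores. This is a minimal illustration, not the code in src/qc/rules.py: the function name `westgard_violations` is hypothetical, the rule definitions follow the common textbook forms (R-4s is taken as one of two consecutive points above +2 SD and the other below -2 SD), and the tiered time windows are left out because their durations are not stated here.

```python
from typing import List


def westgard_violations(z: List[float]) -> List[str]:
    """Return the Westgard rules violated at the most recent point in z.

    z holds chronological z-scores for one instrument/test/QC level.
    Hypothetical helper: time-window filtering is intentionally omitted.
    """
    if not z:
        return []

    def same_side(values: List[float], limit: float) -> bool:
        # True when all values lie above +limit, or all lie below -limit.
        return all(v > limit for v in values) or all(v < -limit for v in values)

    hits = []
    last = z[-1]
    if abs(last) > 2:                      # 1-2s: one point beyond 2 SD (warning)
        hits.append("1-2s")
    if abs(last) > 3:                      # 1-3s: one point beyond 3 SD
        hits.append("1-3s")
    if len(z) >= 2 and same_side(z[-2:], 2):
        hits.append("2-2s")                # two consecutive beyond 2 SD, same side
    if len(z) >= 2 and max(z[-2:]) > 2 and min(z[-2:]) < -2:
        hits.append("R-4s")                # consecutive points on opposite sides of ±2 SD
    if len(z) >= 4 and same_side(z[-4:], 1):
        hits.append("4-1s")                # four consecutive beyond 1 SD, same side
    if len(z) >= 10 and same_side(z[-10:], 0):
        hits.append("10x")                 # ten consecutive on one side of the mean
    return hits


print(westgard_violations([0.4, 2.5, -2.3]))
```

A production version would first restrict `z` to the points inside each rule's time window before applying the check, which is what keeps old violations from being counted again.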
Causal AI
DoWhy + pgmpy DAG structure. Backdoor linear regression estimates Average Treatment Effects (ATEs). Counterfactuals such as "what if temp had been 19°C?" are computed from the ATE coefficients estimated from the data.
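To make the idea concrete, the sketch below estimates an ATE by backdoor adjustment with linear regression and then applies it to a counterfactual. It deliberately uses a small pure-Python least-squares solver instead of DoWhy, and everything in it is an assumption for illustration: the simulated data, the single confounder, the true effect of 0.3 z-units per °C, and the function names `ols` and `counterfactual_z`.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def counterfactual_z(z_obs, temp_obs, temp_cf, ate):
    """Shift the observed z-score by the linear effect of changing temperature."""
    return z_obs + ate * (temp_cf - temp_obs)


# Simulated data (hypothetical): a confounder drives both temperature and z.
rng = random.Random(0)
rows, z = [], []
for _ in range(2000):
    conf = rng.gauss(0, 1)
    temp = 22 + 1.5 * conf + rng.gauss(0, 1)
    rows.append((temp, conf))
    z.append(0.3 * (temp - 22) + 0.8 * conf + rng.gauss(0, 0.5))

# Naive slope ignores the confounder; the backdoor estimate adjusts for it.
naive = ols([[1.0, t] for t, _ in rows], z)[1]
ate = ols([[1.0, t, c] for t, c in rows], z)[1]
print(f"naive slope: {naive:.2f}, backdoor-adjusted ATE: {ate:.2f}")
print(counterfactual_z(z_obs=2.8, temp_obs=24.0, temp_cf=19.0, ate=ate))
```

The coefficient on the treatment in the adjusted regression is the ATE under a linear model, which is why a counterfactual z-score can be computed analytically as the observed value plus the ATE times the change in the treatment.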
Technology Stack by Layer
Three architectural layers: Data (ingest + calibration), Intelligence (causal AI + QC rules), and Interface (API + dashboard + MCP).
Tool Reference
| Tool | Category | Role in ARIA | Why we chose it |
|---|---|---|---|
| Python 3.11 | Core | All backend logic | Ecosystem: DoWhy, FastAPI, pandas |
| pandas | Data | QC data loading & transformation | Industry standard for tabular data |
| numpy | Data | Z-score computation, clipping | Fast array math, required by DoWhy |
| MIMIC-IV Demo | Data | Calibration reference distributions | Free, real hospital lab data (PhysioNet) |
| DoWhy 0.11 | AI | Causal graph + ATE estimation | Originated at Microsoft Research; established Python causal inference library |
| pgmpy | AI | Bayesian network / DAG backend | Required by DoWhy for graph operations |
| scikit-learn | AI | Linear regression estimator | Backdoor criterion implementation in DoWhy |
| FastAPI | API | REST backend + HTML page serving | Async, auto-docs, Jinja2 support |
| Uvicorn | API | ASGI server | Production-grade, pairs with FastAPI |
| Jinja2 | Interface | HTML template engine | Native FastAPI support, clean inheritance |
| Plotly.js | Interface | All interactive charts | No build step, rich chart types, dark theme |
| SQLite | Storage | QC result history persistence | Zero-config, built into Python stdlib |
| MCP | Interface | AI assistant integration | Anthropic's Model Context Protocol standard |
| Docker | Ops | Container deployment | Reproducible environment, one-command deploy |
Project File Tree
📂 data/ — Where all data lives
- raw/mimic_demo/ — Real hospital lab data from PhysioNet MIMIC-IV Demo. Used to calibrate synthetic value distributions. Free access, no credentials required.
- processed/ — Cleaned output from loader.py after type conversion and sorting.
- synthetic/generate.py — Creates ~116,640 QC records across 180 days, 3 instruments, 8 tests, 3 QC levels, and 19 reagent lots. Z-scores clipped to ±4.0.
- synthetic/qc_data.csv — The generated dataset loaded at app startup.
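A generator of this shape might look roughly like the following. This is a sketch under assumptions, not the contents of generate.py: the function name, the record fields, the standard-normal draw, and the `runs_per_combo` parameter are all hypothetical, and reagent lots and the MIMIC-IV calibration step are left out.

```python
import itertools
import random


def generate_qc_records(days=180, instruments=3, tests=8, levels=3,
                        runs_per_combo=1, clip=4.0, seed=0):
    """Build synthetic QC records with z-scores clipped to [-clip, +clip]."""
    rng = random.Random(seed)
    records = []
    grid = itertools.product(range(days), range(instruments),
                             range(tests), range(levels))
    for day, instrument, test, level in grid:
        for _ in range(runs_per_combo):
            z = rng.gauss(0.0, 1.0)          # assumed in-control distribution
            z = max(-clip, min(clip, z))     # clip extreme values
            records.append({"day": day, "instrument": instrument,
                            "test": test, "level": level, "z_score": z})
    return records


records = generate_qc_records()
print(len(records))
```

With the defaults the grid has 180 × 3 × 8 × 3 = 12,960 combinations, so a value of `runs_per_combo=9` would be one way to arrive at 116,640 records; whether the real generator works this way is not stated here.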
🔬 src/ — All Python backend logic
- ingestion/loader.py — Reads qc_data.csv, parses timestamps, returns summary stats.
- qc/rules.py — Full Westgard implementation (1-2s, 1-3s, 2-2s, R-4s, 4-1s, 10x) with tiered time windows per rule type.
- causal/engine.py — Builds DoWhy CausalModel from networkx DiGraph. Estimates ATE for temperature, calibration hours, and reagent lot using backdoor linear regression.
- explainer/explainer.py — Generates natural language root cause explanations and computes counterfactual z-scores analytically.
- storage/db.py — Three functions: init_db(), save_result(), get_recent(). Pure sqlite3, no ORM.
- api/main.py — FastAPI app. Serves 5 HTML pages + 8 REST endpoints. Loads data once at startup; causal model computed lazily on first request.
- mcp/server.py — MCP server exposing ARIA's analysis to AI assistants (Claude, Copilot) as tools.
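For the storage layer, a three-function sqlite3 module without an ORM could be as small as the sketch below. Only the function names come from the list above; the table name `qc_results`, its columns, and the function signatures are assumptions made for illustration.

```python
import sqlite3


def init_db(path=":memory:"):
    """Open the database and create the results table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS qc_results (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               recorded_at TEXT DEFAULT CURRENT_TIMESTAMP,
               instrument TEXT NOT NULL,
               test TEXT NOT NULL,
               status TEXT NOT NULL,
               rule TEXT
           )"""
    )
    conn.commit()
    return conn


def save_result(conn, instrument, test, status, rule=None):
    """Insert one QC result and return its row id."""
    cur = conn.execute(
        "INSERT INTO qc_results (instrument, test, status, rule) "
        "VALUES (?, ?, ?, ?)",
        (instrument, test, status, rule),
    )
    conn.commit()
    return cur.lastrowid


def get_recent(conn, limit=10):
    """Return the newest results first, up to `limit` rows."""
    cur = conn.execute(
        "SELECT instrument, test, status, rule FROM qc_results "
        "ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


conn = init_db()
save_result(conn, "A1", "GLU", "PASS")
save_result(conn, "A1", "GLU", "FAIL", "1-3s")
print(get_recent(conn, limit=1))
```

Parameterized queries (`?` placeholders) keep the inserts safe without needing an ORM, and the in-memory default path makes the functions easy to exercise in tests.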
🖥️ dashboard/ — Frontend templates and assets
- templates/base.html — Shared layout: fixed sidebar (240px), fixed topbar (56px), scrollable main-content area. All pages extend this.
- templates/overview.html — 4 KPI cards, donut chart, grouped bar chart, searchable QC status table, MIMIC disclaimer.
- templates/causal.html — ATE horizontal bar chart, 7-node causal DAG, detailed results table, explanation info box.
- templates/explainer.html — Failure slider (51 failures), z-score gauge, counterfactual simulation with temp and calibration sliders.
- templates/alerts.html — Active failures table with severity borders, all 6 Westgard rule cards in 3-column grid.
- static/style.css — 600+ line dark design system. CSS custom properties, card/table/badge/slider/result-box components.
- static/charts.js — All Plotly.js chart functions. Zero dependencies beyond Plotly.
🧪 tests/ — Automated test suite
- test_qc.py — Unit tests for all 6 Westgard rules. Verifies that synthetic z-score sequences trigger correct violations.
- test_causal.py — Integration test for the causal engine. Checks that ATE values are in realistic range and DAG loads without errors.
- test_api.py — FastAPI endpoint tests using FastAPI's TestClient (built on httpx). Covers /health, /summary, /qc/status, /causal/analysis, /causal/counterfactual.