System Data Flow

9 Components

Data flows from raw hospital reference data through causal AI to actionable QC insights. All components communicate through the FastAPI backend.

① DATA SOURCES: MIMIC-IV · Synthetic generator
② INGESTION: Data loader · type coercion
③ ANALYTICS: Westgard rules · DoWhy causal engine
④ INTERFACE: FastAPI · Dashboard · SQLite · MCP

Data Layer

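Synthetic QC data is generated by data/synthetic/generate.py and calibrated against MIMIC-IV Demo distributions (PhysioNet, 2023). The dataset covers 180 days × 3 instruments × 8 tests × 3 QC levels and totals 116,640 records. Those four factors multiply to 12,960 combinations, so the total corresponds to 9 records per instrument, test, and QC level per day.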
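The record count can be checked with a minimal sketch of the generation grid. This is not the code in generate.py: the `RUNS_PER_DAY` constant is an assumption inferred from 116,640 / (180 × 3 × 8 × 3) = 9, the function name is hypothetical, and the z-scores are drawn from a plain standard normal rather than from MIMIC-calibrated distributions. The ±4.0 clipping follows the description of generate.py in the file tree.

```python
import numpy as np

DAYS, INSTRUMENTS, TESTS, LEVELS = 180, 3, 8, 3
RUNS_PER_DAY = 9  # assumption: inferred from 116,640 / (180 * 3 * 8 * 3)


def generate_z_scores(seed: int = 0) -> np.ndarray:
    """Draw one z-score per (day, instrument, test, level, run) and clip to +/-4."""
    rng = np.random.default_rng(seed)
    shape = (DAYS, INSTRUMENTS, TESTS, LEVELS, RUNS_PER_DAY)
    z = rng.normal(loc=0.0, scale=1.0, size=shape)
    # Clip extreme draws so no record exceeds four standard deviations.
    return np.clip(z, -4.0, 4.0)


z = generate_z_scores()
print(z.size)  # 116640
```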

QC Engine

Full Westgard multi-rule implementation. Each rule type is evaluated over its own clinically appropriate time window, which prevents stale violations from inflating the FAIL rate. Rules: 1-2s, 1-3s, 2-2s, R-4s, 4-1s, 10x.
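As an illustration, the six rules can be sketched on a plain sequence of z-scores, using their common textbook definitions. This is not the code in qc/rules.py: the function names are hypothetical, the rules are evaluated only at the most recent point, and the tiered time windows are left out because their lengths are not stated here.

```python
from typing import List, Sequence


def _same_side(values: Sequence[float], limit: float) -> bool:
    """True if every value lies beyond +limit, or every value lies beyond -limit."""
    return all(v > limit for v in values) or all(v < -limit for v in values)


def westgard_violations(z: Sequence[float]) -> List[str]:
    """Return the Westgard rules triggered by the most recent point in z."""
    if not z:
        return []
    hits = []
    last = z[-1]
    if abs(last) > 2:
        hits.append("1-2s")  # one control beyond 2 SD (warning)
    if abs(last) > 3:
        hits.append("1-3s")  # one control beyond 3 SD
    if len(z) >= 2 and _same_side(z[-2:], 2):
        hits.append("2-2s")  # two consecutive beyond 2 SD, same side
    if len(z) >= 2 and ((z[-1] > 2 and z[-2] < -2) or (z[-1] < -2 and z[-2] > 2)):
        hits.append("R-4s")  # consecutive controls beyond 2 SD on opposite sides
    if len(z) >= 4 and _same_side(z[-4:], 1):
        hits.append("4-1s")  # four consecutive beyond 1 SD, same side
    if len(z) >= 10 and _same_side(z[-10:], 0):
        hits.append("10x")   # ten consecutive on the same side of the mean
    return hits


print(westgard_violations([0.4, -0.2, 2.3, 2.6]))  # ['1-2s', '2-2s']
```

A windowed version would first filter the sequence to the points inside the rule's time window before applying the same checks.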

Causal AI

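The causal graph is a DAG built with DoWhy and pgmpy. Average Treatment Effects (ATEs) are estimated by backdoor adjustment with linear regression. Counterfactual queries such as "what if the temperature had been 19°C?" are answered with the estimated ATE coefficients.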
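The idea behind backdoor linear regression and the counterfactual query can be sketched without DoWhy. The example below swaps in plain NumPy least squares: it regresses the outcome on the treatment plus a confounder and reads the treatment coefficient as the ATE. Under that linear model, a counterfactual z-score is the observed value shifted by the ATE times the change in temperature. The function names, the humidity confounder, and the toy data are all assumptions for illustration, not ARIA's actual variables.

```python
import numpy as np


def estimate_ate(treatment, outcome, confounders) -> float:
    """Treatment coefficient from OLS of outcome on [1, treatment, confounders]."""
    t = np.asarray(treatment, dtype=float)
    y = np.asarray(outcome, dtype=float)
    c = np.asarray(confounders, dtype=float).reshape(len(t), -1)
    design = np.column_stack([np.ones(len(t)), t, c])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[1])


def counterfactual_z(observed_z, observed_temp, counterfactual_temp, ate):
    """Shift the observed z-score by the linear effect of the temperature change."""
    return observed_z + ate * (counterfactual_temp - observed_temp)


# Toy, noise-free data with a known effect of 0.3 z-score units per degree C.
rng = np.random.default_rng(0)
humidity = rng.normal(45.0, 5.0, size=200)  # hypothetical confounder
temp = 22.0 + 0.1 * (humidity - 45.0) + rng.normal(0.0, 1.0, size=200)
z = 0.3 * (temp - 22.0) + 0.05 * (humidity - 45.0)

ate = estimate_ate(temp, z, humidity)
print(round(ate, 3))  # 0.3
print(round(counterfactual_z(2.5, 24.0, 19.0, ate), 3))  # 1.0
```

Leaving the confounder out of the regression would bias the coefficient here, because humidity drives both temperature and the z-score. Adjusting for it is what the backdoor criterion requires.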

Technology Stack by Layer

Python 3.11

Three architectural layers: Data (ingest + calibration), Intelligence (causal AI + QC rules), and Interface (API + dashboard + MCP).

Tool Reference

| Tool | Category | Role in ARIA | Why we chose it |
|---|---|---|---|
| Python 3.11 | Core | All backend logic | Ecosystem: DoWhy, FastAPI, pandas |
| pandas | Data | QC data loading & transformation | Industry standard for tabular data |
| numpy | Data | Z-score computation, clipping | Fast array math, required by DoWhy |
| MIMIC-IV Demo | Data | Calibration reference distributions | Free, real hospital lab data (PhysioNet) |
| DoWhy 0.11 | AI | Causal graph + ATE estimation | Microsoft Research, best Python causal lib |
| pgmpy | AI | Bayesian network / DAG backend | Required by DoWhy for graph operations |
| scikit-learn | AI | Linear regression estimator | Backdoor criterion implementation in DoWhy |
| FastAPI | API | REST backend + HTML page serving | Async, auto-docs, Jinja2 support |
| Uvicorn | API | ASGI server | Production-grade, pairs with FastAPI |
| Jinja2 | Interface | HTML template engine | Native FastAPI support, clean inheritance |
| Plotly.js | Interface | All interactive charts | No build step, rich chart types, dark theme |
| SQLite | Storage | QC result history persistence | Zero-config, built into Python stdlib |
| MCP | Interface | AI assistant integration | Anthropic's Model Context Protocol standard |
| Docker | Ops | Container deployment | Reproducible environment, one-command deploy |

Project File Tree

ARIA v1.0.0

  
📂 data/ — Where all data lives
  • raw/mimic_demo/ — Real hospital lab data from PhysioNet MIMIC-IV Demo. Used to calibrate synthetic value distributions. Free access, no credentials required.
  • processed/ — Cleaned output from loader.py after type conversion and sorting.
  • synthetic/generate.py — Creates ~116,640 QC records across 180 days, 3 instruments, 8 tests, 3 QC levels, and 19 reagent lots. Z-scores clipped to ±4.0.
  • synthetic/qc_data.csv — The generated dataset loaded at app startup.
🔬 src/ — All Python backend logic
  • ingestion/loader.py — Reads qc_data.csv, parses timestamps, returns summary stats.
  • qc/rules.py — Full Westgard implementation (1-2s, 1-3s, 2-2s, R-4s, 4-1s, 10x) with tiered time windows per rule type.
  • causal/engine.py — Builds DoWhy CausalModel from networkx DiGraph. Estimates ATE for temperature, calibration hours, and reagent lot using backdoor linear regression.
  • explainer/explainer.py — Generates natural language root cause explanations and computes counterfactual z-scores analytically.
  • storage/db.py — Three functions: init_db(), save_result(), get_recent(). Pure sqlite3, no ORM.
  • api/main.py — FastAPI app. Serves 5 HTML pages + 8 REST endpoints. Loads data once at startup; causal model computed lazily on first request.
  • mcp/server.py — MCP server exposing ARIA's analysis to AI assistants (Claude, Copilot) as tools.
🖥️ dashboard/ — Frontend templates and assets
  • templates/base.html — Shared layout: fixed sidebar (240px), fixed topbar (56px), scrollable main-content area. All pages extend this.
  • templates/overview.html — 4 KPI cards, donut chart, grouped bar chart, searchable QC status table, MIMIC disclaimer.
  • templates/causal.html — ATE horizontal bar chart, 7-node causal DAG, detailed results table, explanation info box.
  • templates/explainer.html — Failure slider (51 failures), z-score gauge, counterfactual simulation with temp and calibration sliders.
  • templates/alerts.html — Active failures table with severity borders, all 6 Westgard rule cards in 3-column grid.
  • static/style.css — 600+ line dark design system. CSS custom properties, card/table/badge/slider/result-box components.
  • static/charts.js — All Plotly.js chart functions. Zero dependencies beyond Plotly.
🧪 tests/ — Automated test suite
  • test_qc.py — Unit tests for all 6 Westgard rules. Verifies that synthetic z-score sequences trigger correct violations.
  • test_causal.py — Integration test for the causal engine. Checks that ATE values are in realistic range and DAG loads without errors.
  • test_api.py — FastAPI endpoint tests using FastAPI's TestClient (built on httpx). Covers /health, /summary, /qc/status, /causal/analysis, /causal/counterfactual.
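The storage module listed above under src/ (storage/db.py, with init_db(), save_result(), and get_recent() on plain sqlite3) might look roughly like the sketch below. Only the three function names come from the file tree. The table name, the columns, and the function signatures are assumptions for illustration, and the default in-memory database is used so the example runs without touching disk.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical schema: table and column names are illustrative, not ARIA's.
SCHEMA = """
CREATE TABLE IF NOT EXISTS qc_results (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT NOT NULL,
    instrument TEXT NOT NULL,
    test TEXT NOT NULL,
    z_score REAL NOT NULL,
    status TEXT NOT NULL
)
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the results table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def save_result(conn: sqlite3.Connection, timestamp: str, instrument: str,
                test: str, z_score: float, status: str) -> None:
    """Insert one QC result using a parameterized query."""
    conn.execute(
        "INSERT INTO qc_results (timestamp, instrument, test, z_score, status) "
        "VALUES (?, ?, ?, ?, ?)",
        (timestamp, instrument, test, z_score, status),
    )
    conn.commit()


def get_recent(conn: sqlite3.Connection, limit: int = 10) -> List[Tuple]:
    """Return the most recently inserted results, newest first."""
    cur = conn.execute(
        "SELECT timestamp, instrument, test, z_score, status "
        "FROM qc_results ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


conn = init_db()
save_result(conn, "2024-01-01T08:00:00", "A1", "GLU", 0.4, "PASS")
save_result(conn, "2024-01-01T09:00:00", "A1", "GLU", 3.2, "FAIL")
print(get_recent(conn, limit=1))
# [('2024-01-01T09:00:00', 'A1', 'GLU', 3.2, 'FAIL')]
```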