pg_llm

pg_llm Architecture (v1.1)

1. Overview

pg_llm is a PostgreSQL extension that provides LLM capabilities as native SQL functions. The extension keeps state and artifacts in _pg_llm_catalog, while model invocation and orchestration run inside PostgreSQL backends.

The architecture is designed around:

2. High-Level Components

2.1 SQL API Layer (src/pg_llm.cpp)

Main responsibilities:

2.2 Model Layer (src/models/*)

2.3 Text2SQL Layer (src/text2sql/*)

2.4 Support Layer (src/utils/*)

3. Persistent Catalog Model

All extension-owned data is persisted in _pg_llm_catalog.

Core tables:

4. Public API Shape

4.1 Backward-Compatible Text APIs

These remain available and act as thin wrappers over structured execution paths.

4.2 Structured APIs

4.3 Streaming APIs

Both return SRF rows with seq_no, chunk, is_final, model_name, confidence_score, request_id.

5. Runtime Flows

5.1 Chat / Multi-Turn Chat

  1. Resolve model instance from ModelManager.
  2. Optionally assemble RAG context (options.enable_rag).
  3. Invoke model (blocking or streaming).
  4. Apply confidence threshold logic and optional local fallback.
  5. Persist session messages (for multi-turn mode).
  6. Persist audit and trace records.

5.2 Parallel Chat and Routing

  1. Run candidate models in parallel.
  2. Select the highest-confidence candidate.
  3. Evaluate effective threshold (model-level / GUC / options).
  4. Trigger fallback model when confidence is below threshold.
  5. Persist candidate scores and routing decisions to trace/audit.

5.3 Text2SQL

  1. Build schema context.
  2. Perform optional vector retrieval and similar-query lookup.
  3. Generate SQL through LLM prompt path.
  4. Execute SQL via SPI and capture tabular result.
  5. Run EXPLAIN and include plan lines.
  6. Persist trace/audit.

5.4 Reports

  1. Execute SQL and capture structured result + explain output.
  2. Ask model for narrative summary.
  3. Produce report JSON with recommendations and Vega-Lite spec.
  4. Persist report artifact in catalog.

5.5 Knowledge / Feedback

6. Security And Observability

6.1 Core GUCs

6.2 Secret Handling

6.3 Observability

7. Build And Packaging