pg_llm Architecture (v1.1)
1. Overview
pg_llm is a PostgreSQL extension that provides LLM capabilities as native SQL functions. The extension keeps state and artifacts in _pg_llm_catalog, while model invocation and orchestration run inside PostgreSQL backends.
The architecture is designed around:
- Backward-compatible text APIs
- Additive structured
jsonb and SRF streaming APIs
- Persistent session/audit/trace/report/knowledge data
- Configurable confidence routing and local fallback
- In-database observability and security controls
2. High-Level Components
2.1 SQL API Layer (src/pg_llm.cpp)
Main responsibilities:
- SQL entrypoints (
PG_FUNCTION_INFO_V1) and argument decoding
- SPI-based reads/writes to
_pg_llm_catalog
- Response shaping (
text, jsonb, SRF rows)
- Request-level audit and trace persistence
2.2 Model Layer (src/models/*)
LLMInterface: provider adapter for chat, streaming, and embeddings
ModelManager: model registration, lazy instance loading, parallel inference
- Decrypts encrypted model secrets when loading model instances
- Includes deterministic mock provider path for offline tests
2.3 Text2SQL Layer (src/text2sql/*)
- Schema discovery / optional caller-supplied schema
- Optional vector retrieval for relevant examples/context
- SQL statement generation (
generate_statement)
- SQL execution + EXPLAIN capture for structured outputs
2.4 Support Layer (src/utils/*)
- Core GUC definitions
- JSON/UUID helpers
- AES-GCM encryption and decryption for secrets
- Redaction utilities for audit/trace metadata
- PostgreSQL-native logging macros (
elog)
3. Persistent Catalog Model
All extension-owned data is persisted in _pg_llm_catalog.
Core tables:
pg_llm_models: model routing metadata and encrypted secrets
pg_llm_sessions, pg_llm_session_messages: session state and history
pg_llm_audit_log: request-level audit events
pg_llm_trace_log: intermediate decisions and pipeline traces
pg_llm_reports: persisted report artifacts
pg_llm_knowledge_documents, pg_llm_knowledge_chunks: RAG corpus
pg_llm_feedback: user feedback linked to request_id
pg_llm_queries, pg_llm_vectors: text2sql/vector support data
4. Public API Shape
4.1 Backward-Compatible Text APIs
pg_llm_chat
pg_llm_parallel_chat
pg_llm_text2sql
- Existing session APIs (
pg_llm_create_session, pg_llm_multi_turn_chat, etc.)
These remain available and act as thin wrappers over structured execution paths.
4.2 Structured APIs
pg_llm_chat_json, pg_llm_parallel_chat_json, pg_llm_text2sql_json
pg_llm_execute_sql_with_analysis, pg_llm_generate_report
pg_llm_get_session, pg_llm_get_session_messages, pg_llm_update_session_state, pg_llm_delete_session
pg_llm_add_knowledge, pg_llm_search_knowledge
pg_llm_record_feedback
pg_llm_get_audit_log, pg_llm_get_trace
4.3 Streaming APIs
pg_llm_chat_stream
pg_llm_multi_turn_chat_stream
Both return SRF rows with seq_no, chunk, is_final, model_name, confidence_score, request_id.
5. Runtime Flows
5.1 Chat / Multi-Turn Chat
- Resolve model instance from
ModelManager.
- Optionally assemble RAG context (
options.enable_rag).
- Invoke model (blocking or streaming).
- Apply confidence threshold logic and optional local fallback.
- Persist session messages (for multi-turn mode).
- Persist audit and trace records.
5.2 Parallel Chat and Routing
- Run candidate models in parallel.
- Select the highest-confidence candidate.
- Evaluate effective threshold (model-level / GUC / options).
- Trigger fallback model when confidence is below threshold.
- Persist candidate scores and routing decisions to trace/audit.
5.3 Text2SQL
- Build schema context.
- Perform optional vector retrieval and similar-query lookup.
- Generate SQL through LLM prompt path.
- Execute SQL via SPI and capture tabular result.
- Run
EXPLAIN and include plan lines.
- Persist trace/audit.
5.4 Reports
- Execute SQL and capture structured result + explain output.
- Ask model for narrative summary.
- Produce report JSON with recommendations and Vega-Lite spec.
- Persist report artifact in catalog.
5.5 Knowledge / Feedback
- Knowledge ingestion stores document, chunks, deterministic embeddings.
- Knowledge search ranks chunks by vector distance + lexical boost.
- Feedback is stored per
request_id for later workflow use.
6. Security And Observability
6.1 Core GUCs
pg_llm.master_key
pg_llm.audit_enabled
pg_llm.trace_enabled
pg_llm.redact_sensitive
pg_llm.audit_sample_rate
pg_llm.default_confidence_threshold
pg_llm.default_local_fallback
6.2 Secret Handling
- API keys and secret-bearing configs are encrypted before persistence.
- Decryption happens in backend memory only.
- Redaction is applied in audit/trace output when enabled.
6.3 Observability
pg_llm_audit_log captures request outcome and metadata.
pg_llm_trace_log captures intermediate execution decisions.
request_id is the correlation key across APIs and tables.
7. Build And Packaging
- Extension version:
1.1
- Upgrade path:
pg_llm--1.0--1.1.sql
- Primary build/install path: CMake (
contrib/pg_llm/CMakeLists.txt)
- SQL regression tests are maintained in
test/sql and test/expected