pg_llm

Text2SQL Design (v1.1)

1. Goal

pg_llm Text2SQL turns natural language prompts into executable SQL inside PostgreSQL, with optional vector/RAG context and structured observability.

Primary APIs:

2. Pipeline

Implementation entrypoint: build_text2sql_json_internal(...) in src/pg_llm.cpp.

Step 1: Context Assembly

Step 2: Optional Vector Retrieval

When use_vector_search = true:

Vector storage/search uses _pg_llm_catalog.pg_llm_vectors with corrected column naming:

Step 3: SQL Generation

Step 4: SQL Execution + Analysis

Step 5: Observability

3. Input Contract

pg_llm_text2sql_json(instance_name, prompt, schema_info, use_vector_search, options)

Current option keys parsed by implementation:

4. Output Contract (jsonb)

Typical shape:

{
  "request_id": "uuid",
  "instance_name": "model-instance",
  "prompt": "show latest 10 orders",
  "generated_sql": "SELECT ...",
  "execution": {
    "status": "SPI_OK_SELECT",
    "row_count": 10,
    "columns": ["id", "created_at"],
    "rows": [{"id": "1", "created_at": "..."}],
    "explain": ["Seq Scan on ..."]
  },
  "similar_queries": ["..."],
  "vector_hits": [
    {
      "table_name": "orders",
      "column_name": "note",
      "row_id": 123,
      "similarity": 0.91,
      "metadata": {"source": "..."}
    }
  ]
}

pg_llm_text2sql(...) returns only generated_sql for compatibility.

5. Failure Semantics

Errors are raised as PostgreSQL errors (ereport(ERROR)) for:

6. Security Considerations

7. Build And Deployment Notes