---
name: kingbase-database
description: >
  KingBaseES V8 database operations via Docker container.
  Covers: connecting with ksql, extracting view/table definitions,
  handling large output truncation, and introspecting recursive CTE views.
  Use when working with the kingbasev8 container (port 54321, user ywkuser, db yfywk).
---

# KingBaseES Database Operations

## Connection

Default container: `kingbasev8`, port `54321`, user `ykwuser`, db `yfywk`.

```bash
# Run ksql via Docker exec — ALWAYS use full path
docker exec kingbasev8 /home/kingbase/install/kingbase/bin/ksql -U ywkuser -d yfywk -c "SELECT ..."
```

**Pitfall**: ksql is NOT on PATH inside the container. Bare `ksql` and `/opt/kingbase/bin/ksql` both fail. The only working path is `/home/kingbase/install/kingbase/bin/ksql`.

---

## Extracting View Definitions

### Small results (fits in terminal)

```bash
docker exec kingbasev8 ksql -U ywkuser -d yfywk -c "
  SELECT view_definition
  FROM information_schema.views
  WHERE table_schema = 'ywdata'
    AND table_name = 'v_your_view';
"
```

### Large results (truncated output)

KingBaseES ksql truncates output at ~200k chars. For complex views with recursive CTEs:

```bash
# Redirect to file
docker exec kingbasev8 /home/kingbase/install/kingbase/bin/ksql -U ywkuser -d yfywk -c "SELECT view_definition FROM ..." > /tmp/view.txt 2>/dev/null

# Check line count
wc -l /tmp/view.txt

# Read with line limits
read_file(path="/tmp/view.txt", limit=250, offset=0)
```

### Exporting view definition as runnable .sql file

Use `sys_get_viewdef()` with `-At` (tuples-only, unaligned) to produce a clean reproducible CREATE statement:

```bash
docker exec kingbasev8 /home/kingbase/install/kingbase/bin/ksql -U ywkuser -d yfywk -At -c \
  "SELECT 'CREATE OR REPLACE VIEW ywdata.v_my_view AS' || E'\n' || sys_get_viewdef('ywdata.v_my_view'::regclass, true);" \
  > ~/exports/v_my_view.sql
```

- `-A` = unaligned, `-t` = tuples only (no header/footer)
- `sys_get_viewdef(..., true)` = pretty-printed
- Output is directly re-executable via `\i` or `ksql -f`

### Listing views by creation order (newest first)

`information_schema.views` doesn't expose creation time. Use `sys_class.oid` as a proxy — higher oid = created later:

```sql
SELECT n.nspname, c.relname, c.relkind, c.oid
FROM sys_class c JOIN sys_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('v','m') AND n.nspname = 'ywdata'
ORDER BY c.oid DESC LIMIT 20;
```

`relkind`: `v` = view, `m` = materialized view.

---

## Introspecting Recursive CTE Views

Complex views often use `WITH RECURSIVE` + `MATERIALIZED` CTEs. Analysis pattern:

1. **Extract full definition** to file (see above)
2. **Identify CTE boundaries** by grepping for `WITH RECURSIVE` and `SELECT` lines
3. **Map data flow**: trace FROM/JOIN clauses through each CTE
4. **Extract filter logic**: WHERE clauses often contain business rules (age checks, exclusion lists, status filters)
5. **Note materialization**: `MATERIALIZED` CTEs are computed once and reused — important for performance understanding

### Common patterns in ywdata schema

| Pattern | Example |
|---------|---------|
| Base table filter | `zq_zfba_xyrxx` with `isdel_dm = '0'` and `sfzh` validation |
| Exclusion via NOT EXISTS | Filter out "违法事实不能成立" or "终止侦查决定书" |
| Age validation | `age(lrsj, birthday) < 18` for juvenile cases |
| Hardcoded exclusions | `ajbh NOT IN ('A...', 'A...')` or `rybh NOT IN ('R...', 'R...')` |
| LATERAL JOIN | Used for school info lookup with priority ordering |

---

## DMP Restore (KEMDP format)

```bash
docker exec kingbasev8 sys_restore -U ywkuser -d yfywk \
  --schema-only --no-owner --no-privileges /path/to/file.dmp
```

- Format: KEMDP custom format
- Encoding: GBK

---

## Incremental View Building Pattern

When building filtered views from base views, follow this layered approach:

```
v_wcnr_minor_detail          ← 第一步：录入年龄<18 + 通用清洗 + 一案一行
    ↓
v_wcnr_wfry_has_facts        ← 第二步：过滤"有违法事实"（行政/刑事各有规则）
    ↓
v_wcnr_wfry_aggregated       ← 第三步（可选）：按人聚合，多案件用 → 连接
```

### Critical pitfall: field gaps between views

`v_wcnr_minor_detail` was originally **missing `rybh` (人员编号)**. Resolved by rebuilding the view to include `rybh` from `zq_zfba_xyrxx.xyrxx_rybh`. No other objects depended on the old view, so safe to rebuild.

**Decision tables for join keys**:
- `zq_zfba_xzcfjds`: uses `ajxx_ajbh` + `xzcfjds_rybh`
- `zq_zfba_byxzcfjds`: uses `ajxx_ajbh` + `byxzcfjds_rybh`
- `zq_zfba_wenshu`: uses `ajbh` + `xgry_xm` (name-based, NO rybh field!)

### "Has illegal facts" filter logic (for step 2 view)

Source: `v_wcnr_minor_detail` (or rebuilt version with rybh)

**Admin cases (`ajlx = '行政'`)**: person's (ajbh + rybh) must match at least one of:
- `zq_zfba_xzcfjds` (`ajxx_ajbh` + `xzcfjds_rybh`) — any record = has facts
- `zq_zfba_byxzcfjds` (`ajxx_ajbh` + `byxzcfjds_rybh`, AND `byxzcfjds_cbryj !~ '违法事实不能成立'`) — only non-dismissed records count

**Criminal cases (`ajlx = '刑事'`)**: both conditions must NOT trigger:
1. NOT EXISTS in `zq_zfba_byxzcfjds` where `ajxx_ajbh` matches AND `byxzcfjds_rybh` matches AND `byxzcfjds_cbryj ~ '违法事实不能成立'`
2. NOT EXISTS in `zq_zfba_wenshu` where `ajbh` matches AND `xgry_xm` matches (name-based!) AND `wsmc ~ '终止侦查决定书'`

> Key difference: admin uses EXISTS (keep if matched), criminal uses NOT EXISTS (keep if not matched).

---

## Session Reference: ywdata.v_wcnr_wfry Views

All four views live in `ywdata` schema and filter for juvenile cases (录入年龄 < 18).

### `v_wcnr_wfry_base` — 一案一行 (通用嫌疑人源)
- Source: `ywdata.zq_zfba_xyrxx` (通用嫌疑人表)
- CTEs: `wf_validated` → `xz_keys` → `wf_xz_filtered`
- Filters: 身份证18位+正则+生日合法+20年内出生; 案件/人员未删除; 录入年龄<18; 硬编码排除3条记录
- Key fields: `fasj_age` (发案时年龄), `current_age` (当前年龄)
- No aggregation, no school info

### `v_wcnr_wfry_jbxx` — 一人一行 (未成年专项源)
- Source: `ywdata.zq_zfba_wcnr_xyr` (未成年专项表)
- Aggregates: 案件类型/时间链/编号/案由用 `→` 连接; 户籍/现住地/办案部门用 `、` 连接
- School info: LATERAL JOIN `sh_gd_zxxxsxj_xx`, 优先"在校"状态
- Filters: 案件未删除+人员未删除+身份代码=1; 刑事排除不予处罚+终止侦查; 行政必须有处罚记录

### `v_wcnr_wfry_jbxx_base` — 一案一行 (通用嫌疑人源, 同base)
- Source: `ywdata.zq_zfba_xyrxx`
- Logic: 与 `v_wcnr_wfry_base` 完全一致, 仅多了 `ajxx_join_ajxx_ajmc` (案件名称) 字段
- Granularity: 一案一行, 无聚合

### `v_wcnr_minor_detail` — 一案一行 (通用嫌疑人源)
- Source: `ywdata.zq_zfba_xyrxx`
- Fields: 原有21个字段 + `rybh` (人员编号, 从 `xyrxx_rybh` 补入, 用于关联处罚决定书)
- Extra fields: `ajxx_jyaq` (简要案情), `ajxx_fadd` (发案地点), `ajxx_join_ajxx_cbqy_bh_dm` (承办区域代码)
- No school info, no aggregation
- **Note**: Rebuilt during session to add `rybh`; no other objects depended on it.

### `v_wcnr_wfry_fact_check` — 一案一行 (过滤"有违法事实")
- Source: `v_wcnr_minor_detail`
- Admin (`ajlx = '行政'`): (ajbh + rybh) matches `zq_zfba_xzcfjds` OR (ajbh + rybh with `!~ '违法事实不能成立'`) in `zq_zfba_byxzcfjds`
- Criminal (`ajlx = '刑事'`): NOT EXISTS in `byxzcfjds` (违法事实不能成立) AND NOT EXISTS in `wenshu` (终止侦查决定书, name-join)
- wenshu table has NO rybh field — uses `ajbh + xgry_xm` (姓名) for association

## Quick Reference

| Task | Command |
|------|---------|
| List views in schema | `SELECT table_name FROM information_schema.views WHERE table_schema = 'ywdata'` |
| List tables in schema | `SELECT table_name FROM information_schema.tables WHERE table_schema = 'ywdata'` |
| Count rows | `SELECT COUNT(*) FROM ywdata.table_name` |
| View columns | `SELECT column_name FROM information_schema.columns WHERE table_schema = 'ywdata' AND table_name = '...'` |

### Supporting reference
- `references/decision-tables.md` — field-level reference for decision tables used in "has illegal facts" filtering (`xzcfjds`, `byxzcfjds`, `wenshu`), including join key differences.
- `references/ywdata-glossary.md` — 业务术语词典：表名/字段缩写中文对照、易错点 (e.g. `byxzcf` = 不予行政处罚, NOT 必要教育)。**翻译表名/字段前先查这里**。

---

## Workflow: Building Multi-Field Derived Views (字段映射先确认)

When the user asks to extend a view with **multiple derived fields** (5+ flag/aggregation fields, especially business-rule flags like 「是否XX」), DO NOT immediately write SQL.

**Required sequence:**
1. **Read referenced source SQLs** (existing similar views, ratio_base views, the *_detail.sql in user's project folder)
2. **Build a field-mapping table** with columns: 字段名 / 数据源表 / 关联键 / 判定规则 / 维度(按人/按案)
3. **Surface ambiguities explicitly** as numbered questions (统计范围? 多对一聚合方式? 命名约定?)
4. **Wait for user confirmation** before writing the CREATE VIEW
5. Only after confirmation, write the SQL

This is the user's preferred workflow for these tasks — they want to verify business logic mapping before code is generated, because a wrong rule on field 7 means rewriting a 200-line view.

**Common ambiguities to flag**:
- 违法次数: 仅行政? 行政+刑事? 仅事实核查通过的?
- 多条处罚聚合: `string_agg`? 最新一条? MAX?
- 按人 vs 按案 标志的覆盖范围 (一人多案时所有案件共享同一标志吗?)
- 视图命名后缀 (_full / _detail / _check / _ext)
