Commit ec7d9ac

feat: add dbt integration (Phase 1)
Implement core dbt module for query composition with Jinja templating, external dataset reference resolution, and project initialization. Includes CLI commands, a model compilation engine, and comprehensive documentation.
1 parent 710c4e3 commit ec7d9ac

19 files changed: +3277 -1 lines changed

.amp-dbt/state.db

28 KB
Binary file not shown.

AMP_DBT_DESIGN.md

Lines changed: 784 additions & 0 deletions
Large diffs are not rendered by default.

USER_WALKTHROUGH.md

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
# Complete User Walkthrough: Amp DBT from Start to Finish

## Step-by-Step Guide

### Step 1: Initialize Project

```bash
# Create directory
mkdir -p /tmp/my-amp-project

# Initialize (from project root)
cd /Users/vivianpeng/Work/amp-python
uv run python -m amp.dbt.cli init --project-dir /tmp/my-amp-project
```

**Creates:** models/, macros/, tests/, docs/, dbt_project.yml
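
As a rough illustration of the layout listed above, the sketch below builds the same directories with the standard library. The function name `scaffold_project` and the placeholder text written to `dbt_project.yml` are assumptions for illustration; this is not the actual `amp.dbt` implementation.

```python
from pathlib import Path
import tempfile

# Directories named in the "Creates" line above.
PROJECT_DIRS = ("models", "macros", "tests", "docs")


def scaffold_project(project_dir: Path) -> list[str]:
    """Create the project layout and return the sorted top-level entry names."""
    project_dir.mkdir(parents=True, exist_ok=True)
    for name in PROJECT_DIRS:
        (project_dir / name).mkdir(exist_ok=True)
    config = project_dir / "dbt_project.yml"
    if not config.exists():
        # Placeholder content; the real file's keys are not shown in this walkthrough.
        config.write_text(f"name: {project_dir.name}\n")
    return sorted(p.name for p in project_dir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_project(Path(tmp) / "my-amp-project")
    print(created)  # ['dbt_project.yml', 'docs', 'macros', 'models', 'tests']
```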

---

### Step 2: Create Models

#### Model 1: Staging (External Dependency)

```bash
cd /tmp/my-amp-project
mkdir -p models/staging

cat > models/staging/stg_erc20.sql << 'EOF'
{{ config(dependencies={'eth': '_/[email protected]'}) }}
SELECT block_num, tx_hash FROM {{ ref('eth') }}.logs LIMIT 100
EOF
```

#### Model 2: Intermediate (Internal Dependency)

```bash
mkdir -p models/intermediate

cat > models/intermediate/int_stats.sql << 'EOF'
SELECT COUNT(*) as count FROM {{ ref('stg_erc20') }}
EOF
```

#### Model 3: Marts (Internal Dependency)

```bash
mkdir -p models/marts

cat > models/marts/analytics.sql << 'EOF'
SELECT * FROM {{ ref('int_stats') }} ORDER BY count DESC LIMIT 10
EOF
```
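
The three models differ in what their `ref()` calls point at: a dataset declared in `config(dependencies=...)` (external) or another model in the project (internal). A minimal sketch of that distinction follows, assuming a regular expression is enough to find `ref()` calls. `classify_refs` is a hypothetical helper, and the real compiler may detect references differently.

```python
import re

# Matches {{ ref('name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def classify_refs(sql: str, external: dict[str, str], model_names: set[str]) -> dict[str, str]:
    """Label each ref() target as 'external', 'internal', or 'unknown'."""
    kinds = {}
    for name in REF_PATTERN.findall(sql):
        if name in external:
            kinds[name] = "external"
        elif name in model_names:
            kinds[name] = "internal"
        else:
            kinds[name] = "unknown"
    return kinds


models = {"stg_erc20", "int_stats", "analytics"}
staging_sql = "SELECT block_num, tx_hash FROM {{ ref('eth') }}.logs LIMIT 100"
marts_sql = "SELECT * FROM {{ ref('int_stats') }} ORDER BY count DESC LIMIT 10"

# "EXAMPLE_DATASET" is a placeholder for the dataset reference in config().
print(classify_refs(staging_sql, {"eth": "EXAMPLE_DATASET"}, models))  # {'eth': 'external'}
print(classify_refs(marts_sql, {}, models))  # {'int_stats': 'internal'}
```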

---

### Step 3: Test/Compile

```bash
cd /Users/vivianpeng/Work/amp-python

# List models
uv run python -m amp.dbt.cli list --project-dir /tmp/my-amp-project

# Compile
uv run python -m amp.dbt.cli compile --project-dir /tmp/my-amp-project

# See compiled SQL
uv run python -m amp.dbt.cli compile --show-sql --project-dir /tmp/my-amp-project
```

**Shows:** Dependencies, execution order, compiled SQL with CTEs
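
One way to derive an execution order for the three walkthrough models is a topological sort over their internal `ref()` edges. The sketch below uses the standard-library `graphlib` module (Python 3.9+). Whether the compiler uses `graphlib` or its own sort is not stated here, so treat this as an illustration only.

```python
from graphlib import CycleError, TopologicalSorter

# Each model maps to the set of models it reads through ref().
dependencies = {
    "stg_erc20": set(),           # only an external dataset, no internal refs
    "int_stats": {"stg_erc20"},
    "analytics": {"int_stats"},
}

# static_order() yields every model after the models it depends on.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)  # ['stg_erc20', 'int_stats', 'analytics']

# A circular ref() chain cannot be ordered and raises CycleError.
try:
    list(TopologicalSorter({"a": {"b"}, "b": {"a"}}).static_order())
    cycle_detected = False
except CycleError:
    cycle_detected = True
print(cycle_detected)  # True
```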

---

### Step 4: Run

```bash
# Dry run (see plan)
uv run python -m amp.dbt.cli run --dry-run --project-dir /tmp/my-amp-project

# Actually run
uv run python -m amp.dbt.cli run --project-dir /tmp/my-amp-project
```

**Creates:** State tracking in .amp-dbt/state.db
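
The format of `.amp-dbt/state.db` is not described in this walkthrough. Assuming it is a SQLite database, per-model state tracking could look roughly like the sketch below. The table name, columns, and `record_state` helper are all hypothetical.

```python
import sqlite3
from typing import Optional

# In-memory database for the example; the walkthrough's file lives at .amp-dbt/state.db.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS model_state (
        model_name TEXT PRIMARY KEY,
        status     TEXT NOT NULL,
        last_block INTEGER
    )
    """
)


def record_state(model_name: str, status: str, last_block: Optional[int]) -> None:
    """Insert or overwrite the tracked state for one model."""
    conn.execute(
        "INSERT OR REPLACE INTO model_state (model_name, status, last_block) VALUES (?, ?, ?)",
        (model_name, status, last_block),
    )
    conn.commit()


record_state("stg_erc20", "running", 100)
record_state("stg_erc20", "done", 250)  # overwrites the earlier row
rows = conn.execute("SELECT model_name, status, last_block FROM model_state").fetchall()
print(rows)  # [('stg_erc20', 'done', 250)]
```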

---

### Step 5: Monitor

```bash
# Check status
uv run python -m amp.dbt.cli status --project-dir /tmp/my-amp-project

# Monitor dashboard
uv run python -m amp.dbt.cli monitor --project-dir /tmp/my-amp-project

# Auto-refresh
uv run python -m amp.dbt.cli monitor --watch --project-dir /tmp/my-amp-project
```

---

## Quick Command Reference

All commands run from: `/Users/vivianpeng/Work/amp-python`

```bash
# Initialize
uv run python -m amp.dbt.cli init --project-dir /tmp/my-amp-project

# List
uv run python -m amp.dbt.cli list --project-dir /tmp/my-amp-project

# Compile
uv run python -m amp.dbt.cli compile --project-dir /tmp/my-amp-project
uv run python -m amp.dbt.cli compile --show-sql --project-dir /tmp/my-amp-project

# Run
uv run python -m amp.dbt.cli run --dry-run --project-dir /tmp/my-amp-project
uv run python -m amp.dbt.cli run --project-dir /tmp/my-amp-project

# Monitor
uv run python -m amp.dbt.cli status --project-dir /tmp/my-amp-project
uv run python -m amp.dbt.cli monitor --project-dir /tmp/my-amp-project
```

apps/execute_query.py

Lines changed: 17 additions & 1 deletion
@@ -2,7 +2,23 @@
 
 from amp.client import Client
 
-client = Client('grpc://127.0.0.1:80')
+# Replace with your remote server URL
+# Format: grpc://hostname:port or grpc+tls://hostname:port for TLS
+SERVER_URL = "grpc://34.27.238.174:80"
+
+# Option 1: No authentication (if server doesn't require it)
+# client = Client(url=SERVER_URL)
+
+# Option 2: Use explicit auth token
+# client = Client(url=SERVER_URL, auth_token='your-token-here')
+
+# Option 3: Use environment variable AMP_AUTH_TOKEN
+# export AMP_AUTH_TOKEN="your-token-here"
+# client = Client(url=SERVER_URL)
+
+# Option 4: Use auto-refreshing auth from shared auth file (recommended)
+# Uses ~/.amp/cache/amp_cli_auth (shared with TypeScript CLI)
+client = Client(url=SERVER_URL, auth=True)
 
 df = client.get_sql('select * from eth_firehose.logs limit 1', read_all=True).to_pandas()
 print(df)
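
The four options in the comments above could be folded into one helper that picks the `Client(...)` keyword arguments at run time instead of hard-coding a server address. The sketch below is only an illustration: `client_kwargs`, the `AMP_SERVER_URL` variable, and the precedence between options are assumptions. Only `AMP_AUTH_TOKEN`, `auth_token=`, and `auth=True` come from the diff.

```python
import os
from typing import Mapping, Optional

DEFAULT_URL = "grpc://127.0.0.1:80"  # the address used before this change


def client_kwargs(env: Mapping[str, str], explicit_token: Optional[str] = None) -> dict:
    """Choose keyword arguments for Client(...) following Options 2-4 above."""
    kwargs = {"url": env.get("AMP_SERVER_URL", DEFAULT_URL)}  # AMP_SERVER_URL is hypothetical
    if explicit_token:
        kwargs["auth_token"] = explicit_token  # Option 2: explicit token
    elif not env.get("AMP_AUTH_TOKEN"):
        kwargs["auth"] = True  # Option 4: shared auth file
    # Otherwise Option 3: pass only the URL and let the client read AMP_AUTH_TOKEN.
    return kwargs


print(client_kwargs(os.environ))
```

A caller would then write `Client(**client_kwargs(os.environ))`, assuming the constructor accepts the keyword arguments shown in the diff.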

models/example_model.sql

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
-- Example model
{{ config(
    dependencies={'eth': '_/[email protected]'},
    description='Example model showing how to use ref()'
) }}

SELECT
    block_num,
    block_hash,
    timestamp
FROM {{ ref('eth') }}.blocks
LIMIT 10

models/staging/stg_test.sql

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
{{ config(
    dependencies={'eth': '_/[email protected]'},
    track_progress=True,
    track_column='block_num'
) }}
SELECT block_num, tx_hash FROM {{ ref('eth') }}.logs LIMIT 10

pyproject.toml

Lines changed: 7 additions & 0 deletions
@@ -29,8 +29,15 @@ dependencies = [
     # Admin API client support
     "httpx>=0.27.0",
     "pydantic>=2.0,<2.12",  # Constrained for PyIceberg compatibility
+    # DBT support
+    "jinja2>=3.1.0",
+    "pyyaml>=6.0",
+    "rich>=13.0.0",
 ]
 
+[project.scripts]
+amp-dbt = "amp.dbt.cli:main"
+
 [dependency-groups]
 dev = [
     "altair>=5.5.0",  # Data visualization for notebooks

src/amp/dbt/README.md

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
# Amp DBT - Phase 1 Implementation

## Overview

Phase 1 implements the core compilation engine for Amp DBT, providing basic query composition with Jinja templating and external dataset reference resolution.

## Features Implemented

### ✅ Project Initialization
- `amp-dbt init` command creates a new DBT project structure
- Generates `dbt_project.yml`, directory structure, and example model

### ✅ Model Loading and Parsing
- Loads SQL model files from `models/` directory
- Parses `{{ config() }}` blocks from model SQL
- Extracts configuration (dependencies, track_progress, etc.)

### ✅ Jinja Templating Support
- Full Jinja2 template rendering
- Custom `ref()` function for dependency resolution
- Support for variables and macros (basic)

### ✅ ref() Resolution (External Datasets Only)
- Resolves `{{ ref('eth') }}` to dataset references like `_/[email protected]`
- Validates dependencies are defined in config
- Replaces ref() calls in compiled SQL
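
The three bullets above (resolve, validate, replace) can be sketched as follows. This is a simplified stand-in that uses a regular expression rather than Jinja2, so it is not how the module itself renders templates. `resolve_refs` is a hypothetical name, `DependencyError` is a local stand-in for the exception exported by `amp.dbt`, and `EXAMPLE_DATASET` is a placeholder dataset reference.

```python
import re

REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


class DependencyError(Exception):
    """Local stand-in for amp.dbt.exceptions.DependencyError."""


def resolve_refs(sql: str, dependencies: dict[str, str]) -> str:
    """Replace each ref() call with the dataset reference declared in config."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in dependencies:
            # Validation step: every ref() must be declared in config(dependencies=...).
            raise DependencyError(f"ref('{name}') is not declared in config dependencies")
        return dependencies[name]

    return REF_PATTERN.sub(replace, sql)


compiled = resolve_refs(
    "SELECT block_num FROM {{ ref('eth') }}.logs",
    {"eth": "EXAMPLE_DATASET"},
)
print(compiled)  # SELECT block_num FROM EXAMPLE_DATASET.logs
```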

### ✅ Basic Config Parsing
- Parses `{{ config() }}` blocks from model SQL
- Loads `dbt_project.yml` (optional)
- Supports dependencies, track_progress, register, deploy flags
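
To make the `{{ config() }}` parsing step concrete, here is a minimal sketch that extracts the keyword arguments with the standard-library `ast` module. It is an illustration under assumptions: the module itself may instead evaluate `config()` during Jinja rendering, `parse_config` is a hypothetical name, and `EXAMPLE_DATASET` is a placeholder value.

```python
import ast
import re

CONFIG_PATTERN = re.compile(r"\{\{\s*(config\(.*?\))\s*\}\}", re.DOTALL)
# Jinja accepts lowercase literals, which are not valid Python literals.
JINJA_LITERALS = {"true": True, "false": False, "none": None}


def parse_config(sql: str) -> dict:
    """Return the keyword arguments of the first config() block, or {} if absent."""
    match = CONFIG_PATTERN.search(sql)
    if match is None:
        return {}
    call = ast.parse(match.group(1), mode="eval").body
    config = {}
    for keyword in call.keywords:
        value = keyword.value
        if isinstance(value, ast.Name) and value.id in JINJA_LITERALS:
            config[keyword.arg] = JINJA_LITERALS[value.id]
        else:
            config[keyword.arg] = ast.literal_eval(value)
    return config


model_sql = """{{ config(
    dependencies={'eth': 'EXAMPLE_DATASET'},
    track_progress=true,
    track_column='block_num'
) }}
SELECT block_num FROM {{ ref('eth') }}.logs
"""
print(parse_config(model_sql))
```

Using `ast.literal_eval` keeps the sketch from executing arbitrary code found in a model file; only literal values are accepted.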

### ✅ CLI Commands
- `amp-dbt init` - Initialize new project
- `amp-dbt compile` - Compile models
- `amp-dbt list` - List all models
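
The `amp-dbt` executable comes from the `[project.scripts]` entry `amp-dbt = "amp.dbt.cli:main"` added to `pyproject.toml` in this commit: installing the package generates a script that calls `main()` and exits with its return value. The sketch below shows that shape with `argparse`. The parser layout is an assumption, and the real `amp.dbt.cli` may be built with a different library.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the three Phase 1 subcommands."""
    parser = argparse.ArgumentParser(prog="amp-dbt")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("init", "compile", "list"):
        cmd = sub.add_parser(name)
        cmd.add_argument("--project-dir", default=".")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Entry point; with argv=None, argparse reads sys.argv[1:]."""
    args = build_parser().parse_args(argv)
    print(f"{args.command}: project at {args.project_dir}")
    return 0


exit_code = main(["compile", "--project-dir", "/tmp/my-amp-project"])
```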

## Usage

### Initialize a Project

```bash
amp-dbt init my-project
cd my-project
```

### Create a Model

Create `models/staging/stg_erc20_transfers.sql`:

```sql
{{ config(
    dependencies={'eth': '_/[email protected]'},
    track_progress=true,
    track_column='block_num',
    description='Decoded ERC20 Transfer events'
) }}

SELECT
    l.block_num,
    l.block_hash,
    l.timestamp,
    l.tx_hash,
    l.address as token_address
FROM {{ ref('eth') }}.logs l
WHERE
    l.topic0 = evm_topic('Transfer(address indexed from, address indexed to, uint256 value)')
    AND l.topic3 IS NULL
```

### Compile Models

```bash
# Compile all models
amp-dbt compile

# Compile specific models (quote the pattern so the shell does not expand it)
amp-dbt compile --select 'stg_*'

# Show compiled SQL
amp-dbt compile --show-sql
```
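
How `--select` matches model names is not specified here. Assuming shell-style wildcards, the selection could be sketched with the standard-library `fnmatch` module. `select_models` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def select_models(names: list[str], pattern: str) -> list[str]:
    """Return the model names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


all_models = ["stg_erc20_transfers", "stg_test", "token_analytics"]
print(select_models(all_models, "stg_*"))  # ['stg_erc20_transfers', 'stg_test']
```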

## Project Structure

```
my-project/
├── dbt_project.yml          # Project configuration
├── models/                  # SQL model files
│   ├── staging/
│   │   └── stg_erc20_transfers.sql
│   └── marts/
│       └── token_analytics.sql
├── macros/                  # Reusable SQL macros (future)
├── tests/                   # Data quality tests (future)
└── docs/                    # Documentation (future)
```

## Limitations (Phase 1)

- ❌ Internal model references (model-to-model dependencies) not supported
- ❌ Macros system not fully implemented
- ❌ No execution (`amp-dbt run`) - compilation only
- ❌ No monitoring or tracking
- ❌ No testing framework

## Next Steps (Phase 2)

- Internal model dependency resolution (CTE inlining)
- Dependency graph building
- Topological sort for execution order
- `amp-dbt run` command for execution

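
As a sketch of what CTE inlining with a dependency graph and topological sort might look like, the example below compiles one model and its upstream models into a single `WITH` query. It is speculative: `compile_with_ctes` is a hypothetical function, the model SQL is invented for the example, and the Phase 2 design may differ.

```python
import re
from graphlib import TopologicalSorter

REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def compile_with_ctes(target: str, models: dict[str, str]) -> str:
    """Inline every upstream model of `target` as a CTE, upstream models first."""
    # Build the dependency graph by walking ref() calls from the target.
    graph: dict[str, set[str]] = {}
    pending = [target]
    while pending:
        name = pending.pop()
        if name in graph:
            continue
        graph[name] = {ref for ref in REF_PATTERN.findall(models[name]) if ref in models}
        pending.extend(graph[name])

    def render(name: str) -> str:
        # Internal refs become bare CTE names; other refs are left untouched.
        return REF_PATTERN.sub(
            lambda m: m.group(1) if m.group(1) in models else m.group(0), models[name]
        )

    order = [name for name in TopologicalSorter(graph).static_order() if name != target]
    if not order:
        return render(target)
    ctes = ",\n".join(f"{name} AS (\n    {render(name)}\n)" for name in order)
    return f"WITH {ctes}\n{render(target)}"


models = {
    "stg_erc20": "SELECT block_num, tx_hash FROM source_logs LIMIT 100",
    "int_stats": "SELECT COUNT(*) AS count FROM {{ ref('stg_erc20') }}",
    "analytics": "SELECT * FROM {{ ref('int_stats') }} ORDER BY count DESC LIMIT 10",
}
print(compile_with_ctes("analytics", models))
```

Models without internal references compile to their own SQL unchanged, so the same function covers the Phase 1 case.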

src/amp/dbt/__init__.py

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
"""Amp DBT - Query composition and orchestration framework for Amp."""

from amp.dbt.project import AmpDbtProject
from amp.dbt.compiler import Compiler
from amp.dbt.models import CompiledModel, ModelConfig
from amp.dbt.exceptions import (
    AmpDbtError,
    CompilationError,
    ConfigError,
    DependencyError,
    ProjectNotFoundError,
)

__all__ = [
    'AmpDbtProject',
    'Compiler',
    'CompiledModel',
    'ModelConfig',
    'AmpDbtError',
    'CompilationError',
    'ConfigError',
    'DependencyError',
    'ProjectNotFoundError',
]
