YAML-based pattern matching with multi-line capabilities for log normalization using syslog-ng patterndb
patterndb-yaml brings intuitive YAML pattern definitions to syslog-ng's proven patterndb engine. Instead of writing complex XML patterns, you define rules in readable YAML and let patterndb-yaml handle the translation to syslog-ng's efficient pattern matcher.
This makes it easier to normalize heterogeneous logs - transforming different log formats into standardized output for comparison, analysis, or filtering.
- YAML rules - Readable pattern definitions instead of syslog-ng XML
- Field extraction - Pull specific data (table names, IDs, etc.) from matched lines
- Pattern matching - Powered by syslog-ng's efficient C implementation
- Multi-line sequences - Handle log entries spanning multiple lines
- Explain mode - Debug which patterns matched and why
- CLI and Python API - Use as a command-line tool or library
Requirements: Python 3.9+, syslog-ng 4.10.1+
⚠️ Important: patterndb-yaml requires syslog-ng to be installed from official repositories (distro defaults may be incompatible). See SYSLOG_NG_INSTALLATION.md for platform-specific instructions.
brew tap JeffreyUrban/patterndb-yaml && brew install patterndb-yaml

✅ Automatically installs syslog-ng as a dependency. Homebrew manages all dependencies and provides easy updates via brew upgrade.
⚠️ Manual Setup Required: You must install syslog-ng separately before using pipx.
# STEP 1: Install syslog-ng from official repos (REQUIRED)
# See docs/SYSLOG_NG_INSTALLATION.md for your platform
# STEP 2: Install patterndb-yaml
pipx install patterndb-yaml

pipx installs in an isolated environment with global CLI access. Update with pipx upgrade patterndb-yaml.
⚠️ Manual Setup Required: You must install syslog-ng separately before using pip.
# STEP 1: Install syslog-ng from official repos (REQUIRED)
# See docs/SYSLOG_NG_INSTALLATION.md for your platform
# STEP 2: Install patterndb-yaml
pip install patterndb-yaml

Use pip if you want to use patterndb-yaml as a library in your Python projects.
# Development installation
git clone https://github.com/JeffreyUrban/patterndb-yaml
cd patterndb-yaml
pip install -e ".[dev]"

Windows is not currently supported. Consider using WSL2 (Windows Subsystem for Linux) and following the Linux installation instructions.
Requirements: Python 3.9+, syslog-ng (installed automatically with Homebrew)
Create a YAML rules file (rules.yaml):
rules:
  - name: log_info
    pattern:
      - text: "["
      - text: "INFO"
      - text: "] "
      - field: message
    output: "[info:{message}]"
  - name: log_error
    pattern:
      - text: "["
      - text: "ERROR"
      - text: "] "
      - field: message
    output: "[error:{message}]"

Process your logs:
# Process from stdin
cat app.log | patterndb-yaml --rules rules.yaml
# Process a file
patterndb-yaml --rules rules.yaml --input app.log
# Get statistics
patterndb-yaml --rules rules.yaml --input app.log --stats

from patterndb_yaml import PatterndbYaml
from pathlib import Path
# Initialize with rules
processor = PatterndbYaml(rules_path=Path("rules.yaml"))
# Process logs
with open("app.log") as infile, open("clean.log", "w") as outfile:
    processor.process(infile, outfile)
# Get statistics
stats = processor.get_stats()
print(f"Matched {stats['lines_matched']} of {stats['lines_processed']} lines")
print(f"Match rate: {stats['match_rate']:.1%}")

- Log Normalization - Transform heterogeneous log formats into standardized output
- Data Extraction - Pull structured data from unstructured log lines
- Log Filtering - Identify and process specific log patterns
- Format Standardization - Convert legacy log formats to modern structured formats
- Compliance - Normalize logs for security analysis and auditing
patterndb-yaml uses syslog-ng's patterndb engine for efficient pattern matching:
- YAML → XML - Converts your readable YAML rules into syslog-ng's XML patterndb format
- Pattern Matching - Uses syslog-ng's C implementation for fast, memory-efficient matching
- Field Extraction - Pulls named fields from matched patterns
- Output Transformation - Applies output templates to normalize log format
The system processes logs line-by-line with constant memory usage, making it suitable for large files and streaming data.
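To make the four steps above more concrete, here is a minimal pure-Python sketch of what a rule expresses: literal `text` elements, a named `field` capture, and an `output` template. It is an illustration only and assumes nothing about patterndb-yaml's internals. It uses Python's `re` module as a stand-in for syslog-ng's patterndb engine, and the names `RULES`, `compile_rule`, and `normalize` are hypothetical. Returning `None` for unmatched lines is also an assumption of this sketch, not documented library behavior.

```python
import re

# Hypothetical in-memory copy of the two rules from rules.yaml.
RULES = [
    {
        "name": "log_info",
        "pattern": [{"text": "["}, {"text": "INFO"}, {"text": "] "}, {"field": "message"}],
        "output": "[info:{message}]",
    },
    {
        "name": "log_error",
        "pattern": [{"text": "["}, {"text": "ERROR"}, {"text": "] "}, {"field": "message"}],
        "output": "[error:{message}]",
    },
]


def compile_rule(rule):
    """Turn a rule's pattern elements into one regular expression.

    Stand-in for the YAML -> XML translation step: literal text is
    escaped, and each field becomes a named capture group.
    """
    parts = []
    for element in rule["pattern"]:
        if "text" in element:
            parts.append(re.escape(element["text"]))
        else:
            parts.append(f"(?P<{element['field']}>.*)")
    return re.compile("".join(parts))


def normalize(line, rules=RULES):
    """Return the output of the first matching rule, or None if none match."""
    for rule in rules:
        match = compile_rule(rule).fullmatch(line)
        if match:
            # Field extraction feeds the output template.
            return rule["output"].format(**match.groupdict())
    return None


print(normalize("[INFO] service started"))
print(normalize("[ERROR] disk full"))
```

The real engine matches with syslog-ng's own parsers rather than regular expressions, so treat this only as a mental model of how `text`, `field`, and `output` relate.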
Read the full documentation at patterndb-yaml.readthedocs.io
Key sections:
- Getting Started - Installation and quick start guide
- Use Cases - Real-world examples across different domains
- Guides - Pattern design, performance tips, common patterns
- Reference - Complete CLI and Python API documentation
# Clone repository
git clone https://github.com/JeffreyUrban/patterndb-yaml.git
cd patterndb-yaml
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run with coverage
pytest --cov=patterndb_yaml --cov-report=html
# Build documentation
cd docs && mkdocs build

- Time complexity: O(n) where n is number of log lines
- Space complexity: O(1) constant memory for processing
- Throughput: Processes logs line-by-line with streaming support
- Memory: Minimal memory footprint, suitable for large files
Performance is determined by syslog-ng's patterndb engine, which uses efficient C implementations for pattern matching.
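The O(n) time and O(1) memory claims follow from handling one line at a time. The sketch below shows that streaming shape in plain Python, assuming a hypothetical `stream_normalize` helper and a caller-supplied `normalize_line` function. It is not the library's implementation. The stats keys mirror the ones used in the Python API example, and passing unmatched lines through unchanged is an assumption made for this sketch.

```python
import io
from typing import Callable, Optional, TextIO


def stream_normalize(
    infile: TextIO,
    outfile: TextIO,
    normalize_line: Callable[[str], Optional[str]],
) -> dict:
    """Process a stream line by line, holding only the current line in memory."""
    processed = 0
    matched = 0
    for raw in infile:  # file-like objects yield one line per iteration
        line = raw.rstrip("\n")
        processed += 1
        result = normalize_line(line)
        if result is not None:
            matched += 1
            outfile.write(result + "\n")
        else:
            # Assumption: unmatched lines are written through unchanged.
            outfile.write(line + "\n")
    return {
        "lines_processed": processed,
        "lines_matched": matched,
        "match_rate": matched / processed if processed else 0.0,
    }


# Usage with in-memory streams; real code would pass open file handles.
source = io.StringIO("[INFO] started\nnoise\n")
sink = io.StringIO()
stats = stream_normalize(
    source,
    sink,
    lambda line: "[info:" + line[7:] + "]" if line.startswith("[INFO] ") else None,
)
print(stats)
```

Because only counters and the current line are kept, memory use does not grow with input size, which is what makes the same loop usable on large files and on stdin.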
MIT License - See LICENSE file for details
Jeffrey Urban