PipeGen: Streaming Data Pipeline Generator

Create and manage streaming data pipelines using Apache Kafka and FlinkSQL with AI-powered generation and real-time monitoring.

Quick Project Scaffolding

Generate complete pipeline projects with SQL statements, AVRO schemas, and Docker Compose setup in seconds.

AI-Powered Generation

Describe your pipeline in natural language and let AI create optimized FlinkSQL statements and schemas.

Dynamic Traffic Patterns

Simulate realistic traffic spikes and load patterns for comprehensive testing and capacity planning.

Local Development Stack

Complete Docker-based development environment with Kafka, Flink, and Schema Registry.

Execution Reports

Professional HTML reports with interactive charts, performance metrics, and complete configuration snapshots.

Why PipeGen?

Building streaming data pipelines traditionally requires deep knowledge of Apache Kafka, FlinkSQL, AVRO schemas, and complex deployment configurations. PipeGen eliminates this complexity by providing:

Zero-config local development - Complete stack with one command

AI-assisted pipeline creation - Natural language to production-ready code

Realistic testing capabilities - Traffic pattern simulation for load testing

Real-time visibility - Live monitoring and comprehensive reporting

DevOps-ready workflows - Automated deployment and cleanup

Quick Example

```bash
# Install PipeGen
curl -sSL https://raw.githubusercontent.com/mcolomerc/pipegen/main/install.sh | bash

# Create an AI-generated fraud detection pipeline
pipegen init fraud-detection --describe "Monitor payment transactions, detect suspicious patterns using machine learning, and alert on potential fraud within 30 seconds"

# Deploy local development stack
pipegen deploy

# Run with traffic spikes simulation and report generation
pipegen run --message-rate 100 --duration 10m --traffic-pattern "2m-4m:400%,6m-8m:300%" --reports-dir ./reports

# Or use smart consumer stopping for faster feedback
pipegen run --expected-messages 1000 --message-rate 50
```
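
To get a feel for the load that a traffic pattern such as "2m-4m:400%,6m-8m:300%" implies, the arithmetic can be sketched in a few lines of bash. This is a rough estimate, not PipeGen's own logic: it assumes that a percentage is a multiplier on the base message rate (400% means four times the base rate), that time windows are measured from the start of the run, and that spikes do not overlap. The helper names `to_seconds` and `total_messages` are hypothetical.

```bash
#!/usr/bin/env bash

# Convert a duration such as "30s", "2m" or "1h" into seconds.
to_seconds() {
  local value="${1%[smh]}" unit="${1: -1}"
  case "$unit" in
    s) echo "$value" ;;
    m) echo $(( value * 60 )) ;;
    h) echo $(( value * 3600 )) ;;
    *) echo "unsupported duration: $1" >&2; return 1 ;;
  esac
}

# Estimate the total number of messages produced in one run.
# Assumption: "2m-4m:400%" multiplies the base rate by 4 between
# minute 2 and minute 4, and spike windows do not overlap.
total_messages() {
  local base_rate="$1" duration="$2" pattern="$3"
  local total_seconds spike window percent start end
  total_seconds="$(to_seconds "$duration")" || return 1

  # Start from the baseline volume for the whole run.
  local total=$(( base_rate * total_seconds ))

  local IFS=','
  for spike in $pattern; do
    window="${spike%%:*}"      # e.g. "2m-4m"
    percent="${spike##*:}"     # e.g. "400%"
    percent="${percent%\%}"    # e.g. "400"
    start="$(to_seconds "${window%%-*}")" || return 1
    end="$(to_seconds "${window##*-}")" || return 1
    # Add only the messages above the baseline already counted.
    total=$(( total + base_rate * (end - start) * (percent - 100) / 100 ))
  done
  echo "$total"
}

# Base rate 100 msg/s for 10 minutes with the two spikes from the example.
total_messages 100 10m "2m-4m:400%,6m-8m:300%"   # prints 120000
```

Under the same assumptions, the second `pipegen run` example (1000 expected messages at 50 messages per second) corresponds to roughly 1000 / 50 = 20 seconds of producing at a steady rate.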

Comprehensive Execution Reports

Every pipeline execution automatically generates a professional HTML report, saved to the reports/ folder, that includes:

Performance analytics with interactive charts and detailed metrics
Complete configuration snapshots for reproducibility
Traffic pattern analysis and load testing insights
Resource utilization tracking and system health monitoring
Professional styling ready for stakeholder sharing
Timestamped filenames for easy historical analysis
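
Because each run adds another report to the folder, a small helper for locating the most recent one can be handy. The sketch below is illustrative only: it assumes reports are `.html` files directly inside the reports directory and picks the newest by file modification time, since the exact timestamp format of the filenames is not specified here. The function name `latest_report` is hypothetical.

```bash
#!/usr/bin/env bash

# Print the most recently modified HTML report in a directory.
# Assumption: reports are *.html files directly inside the directory.
latest_report() {
  local dir="${1:-./reports}" newest="" file
  for file in "$dir"/*.html; do
    # Skip the unexpanded glob when no report exists yet.
    [ -e "$file" ] || continue
    if [ -z "$newest" ] || [ "$file" -nt "$newest" ]; then
      newest="$file"
    fi
  done
  # Return a non-zero status when the directory holds no reports.
  [ -n "$newest" ] && echo "$newest"
}

latest_report ./reports || echo "no reports found yet"
```

The resulting path can then be opened in a browser or attached to a build artifact.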

Perfect for Teams

Enterprises

Developers
