Data Pipeline Analyst
Turn messy CSVs into clean, queryable datasets with validation
What is Data Pipeline Analyst?
A data engineering skill for agents that work with structured data. The agent inspects schemas, detects anomalies, writes transformation logic, validates output against expectations, and produces data quality reports. Supports CSV, JSON, SQL databases, and Parquet files. Includes common transforms for deduplication, normalization, and enrichment.
3 min
Advanced
What's Included
- data-pipeline-analyst.md
- transforms/deduplication.md
- transforms/normalization.md
- transforms/enrichment.md
- templates/data-quality-report.md
- templates/schema-analysis.md
- examples/csv-cleanup.md
- examples/sql-migration.md
- config/validation-rules.yaml
- README.md
Preview
# Data Pipeline Analyst Skill
## Pipeline Protocol
### 1. Schema Inspection
- Read first 100 rows and infer column types
- Report: row count, null rate per column, unique counts
- Flag: mixed types, encoding issues, date format inconsistencies
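As a rough illustration of the inspection step above, the following sketch samples the first 100 rows of a CSV with Python's standard library and reports inferred types, null rate, and unique counts per column. The function names `infer_type` and `inspect_schema` are hypothetical and not part of the skill; the type rules (int, float, otherwise string) are a simplifying assumption and do not cover dates or encodings.

```python
import csv
import io
from collections import Counter

NUMERIC = {"int", "float"}


def infer_type(value: str) -> str:
    """Classify one raw CSV cell as 'null', 'int', 'float', or 'str'.

    Uses Python's own int()/float() parsing rules (an assumption of this sketch).
    """
    text = value.strip()
    if text == "":
        return "null"
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(text)
            return name
        except ValueError:
            continue
    return "str"


def inspect_schema(lines, sample_size=100):
    """Sample up to `sample_size` rows and summarize each column."""
    reader = csv.DictReader(lines)
    columns = reader.fieldnames or []
    type_counts = {c: Counter() for c in columns}
    uniques = {c: set() for c in columns}
    rows = 0
    for row in reader:
        if rows >= sample_size:
            break
        rows += 1
        for c in columns:
            cell = row[c] or ""  # short rows yield None for missing cells
            kind = infer_type(cell)
            type_counts[c][kind] += 1
            if kind != "null":
                uniques[c].add(cell.strip())
    report = {}
    for c in columns:
        seen = {t for t in type_counts[c] if t != "null"}
        report[c] = {
            "types": sorted(seen),
            "null_rate": type_counts[c]["null"] / rows if rows else 0.0,
            "unique": len(uniques[c]),
            # Flag columns that mix text with numeric values.
            "mixed_types": "str" in seen and bool(seen & NUMERIC),
        }
    return rows, report
```

Calling `inspect_schema(open("data.csv", newline=""))` on a hypothetical file returns the sampled row count and a per-column report. Because only a sample is read, the null rate and unique counts describe the sample, not the full file.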
### 2. Quality Check
- Duplicates: check by primary key or full-row hash
- Outliers: flag values > 3 std devs from mean (numeric cols)
- Missing: report null percentage, suggest imputation strategy
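The three checks above could be sketched as follows, assuming rows are already loaded as a list of dictionaries. The helper names (`find_duplicates`, `find_outliers`, `null_percentage`) are hypothetical, and the use of the sample standard deviation is an assumption; the skill itself may define these checks differently.

```python
import hashlib
import statistics


def find_duplicates(rows, key=None):
    """Return indexes of rows that repeat an earlier row.

    Compares on `key` when given, otherwise on a hash of the full row.
    """
    seen = set()
    dupes = []
    for i, row in enumerate(rows):
        if key is not None:
            fingerprint = row[key]
        else:
            # Sort column names so column order does not affect the hash.
            joined = "\x1f".join(f"{k}={row[k]}" for k in sorted(row))
            fingerprint = hashlib.sha256(joined.encode("utf-8")).hexdigest()
        if fingerprint in seen:
            dupes.append(i)
        else:
            seen.add(fingerprint)
    return dupes


def find_outliers(values, threshold=3.0):
    """Return values more than `threshold` sample std devs from the mean."""
    if len(values) < 2:
        return []
    mean = statistics.fmean(values)
    std = statistics.stdev(values)
    if std == 0:
        return []
    return [v for v in values if abs(v - mean) > threshold * std]


def null_percentage(rows, column):
    """Return the percentage of rows where `column` is None or empty."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) in (None, ""))
    return 100.0 * missing / len(rows)
```

One caveat with the 3-standard-deviation rule: on small samples a single extreme value inflates the standard deviation enough that it may not be flagged, so the check is most meaningful on columns with many rows.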
### 3. Transform Plan
Before writing any transform code:
- State input schema -> output schema
- List every column that changes and why
- Estimate output row count
- Write validation query to confirm correctness
Installation Guide
One command to import, then assign to any agent in your company.
Option A: CLI (recommended)
Download and extract the ZIP
unzip data-pipeline-analyst.zip
Import the skill
paperclipai skill import --from ./data-pipeline-analyst/
Assign to an agent
# Via CLI:
paperclipai agent update <agent-name> --add-skill data-pipeline-analyst
# Or in the dashboard:
# Agents → [agent name] → Skills → Add "Data Pipeline Analyst"
Option B: Dashboard UI
Open Skills page
Navigate to Skills → Import Skill
Upload the product folder
From the extracted ZIP, upload the data-pipeline-analyst/ directory containing SKILL.md.
Assign to agents
Go to Agents → [agent] → Skills and add "Data Pipeline Analyst" from the list.
Related Products
Analytics Reporter
Transforms raw data into the insights that drive your next decision.
Data Consolidation Agent
Consolidates scattered sales data into live reporting dashboards.
SQL Query Builder
Natural language to optimized SQL with safety checks
Accounts Payable Agent
Moves money across any rail - crypto, fiat, stablecoins - so you don't have to.