ClipMarts

Data Pipeline Analyst

Turn messy CSVs into clean, queryable datasets with validation

$29

Operator Pack: For departments, agencies, and ops-heavy teams

What is Data Pipeline Analyst?

A data engineering skill for agents that work with structured data. The agent inspects schemas, detects anomalies, writes transformation logic, validates output against expectations, and produces data quality reports. Supports CSV, JSON, SQL databases, and Parquet files. Includes common transforms for deduplication, normalization, and enrichment.

Setup Time

3 min

Difficulty

Advanced

Works With
paperclip, claude-code

What's Included

  • data-pipeline-analyst.md
  • transforms/deduplication.md
  • transforms/normalization.md
  • transforms/enrichment.md
  • templates/data-quality-report.md
  • templates/schema-analysis.md
  • examples/csv-cleanup.md
  • examples/sql-migration.md
  • config/validation-rules.yaml
  • README.md

Preview

data-pipeline-analyst.md
# Data Pipeline Analyst Skill

## Pipeline Protocol

### 1. Schema Inspection
- Read first 100 rows and infer column types
- Report: row count, null rate per column, unique counts
- Flag: mixed types, encoding issues, date format inconsistencies
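
The preview describes this step only in prose. As an illustration, here is a minimal sketch of what the inspection might look like using only the Python standard library. The names `inspect_schema`, `infer_type`, and `NULL_TOKENS` are hypothetical and not taken from the skill file, and the choice of which tokens count as null is an assumption. Encoding and date-format checks are left out.

```python
import csv
import io
from collections import Counter

# Assumption: these tokens (case-insensitive) count as null.
NULL_TOKENS = {"", "null", "na", "n/a", "none"}


def infer_type(value: str) -> str:
    """Classify one raw cell as 'int', 'float', or 'str'."""
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(value)
            return name
        except ValueError:
            continue
    return "str"


def inspect_schema(csv_text: str, sample_size: int = 100) -> dict:
    """Profile a CSV: types from the first rows, null rate and uniques from all rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    report = {"row_count": len(rows), "columns": {}}
    for col in reader.fieldnames or []:
        values = [(row.get(col) or "").strip() for row in rows]
        non_null = [v for v in values if v.lower() not in NULL_TOKENS]
        # Type inference looks only at the first `sample_size` rows.
        sampled = [v for v in values[:sample_size] if v.lower() not in NULL_TOKENS]
        kinds = Counter(infer_type(v) for v in sampled)
        # Integers mixed with floats are treated as one float column, not a conflict.
        if set(kinds) == {"int", "float"}:
            kinds = Counter({"float": len(sampled)})
        report["columns"][col] = {
            "inferred_type": kinds.most_common(1)[0][0] if kinds else "unknown",
            "mixed_types": len(kinds) > 1,
            "null_rate": (len(values) - len(non_null)) / len(values) if values else 0.0,
            "unique_count": len(set(non_null)),
        }
    return report
```

Because types are inferred from a sample, a column that only turns inconsistent after row 100 would pass this check; the null rate and unique counts are computed over every row.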

### 2. Quality Check
- Duplicates: check by primary key or full-row hash
- Outliers: flag values > 3 std devs from mean (numeric cols)
- Missing: report null percentage, suggest imputation strategy
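
The duplicate and outlier checks above could be sketched as follows, again with the standard library only. The helper names (`row_hash`, `find_duplicates`, `find_outliers`) are hypothetical, and the use of SHA-256 and of population standard deviation are assumptions made for the example rather than details confirmed by the skill.

```python
import hashlib
import statistics


def row_hash(row: dict) -> str:
    """Stable hash of a whole row, for tables with no primary key."""
    canonical = "\x1f".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(rows: list, key: str = None) -> list:
    """Return indices of rows that repeat an earlier row's key (or full-row hash)."""
    seen = set()
    duplicates = []
    for index, row in enumerate(rows):
        identity = row[key] if key is not None else row_hash(row)
        if identity in seen:
            duplicates.append(index)
        else:
            seen.add(identity)
    return duplicates


def find_outliers(values: list, threshold: float = 3.0) -> list:
    """Return indices of values more than `threshold` std devs from the mean."""
    if len(values) < 2:
        return []
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]
```

One caveat with the 3-standard-deviation rule: with population standard deviation, a single value can sit at most sqrt(n - 1) deviations from the mean of n values, so this check cannot flag anything in a column with ten or fewer values.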

### 3. Transform Plan
Before writing any transform code:
- State input schema -> output schema
- List every column that changes and why
- Estimate output row count
- Write validation query to confirm correctness
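
One way to make the plan above checkable is to record it as data and test the transformed table against it. The sketch below assumes the output lands in a SQLite table; `TransformPlan` and `validate_output` are hypothetical names invented for this illustration, and the table name is interpolated into the SQL on the assumption that it comes from trusted code.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class TransformPlan:
    input_schema: dict   # column -> type before the transform
    output_schema: dict  # column -> type after the transform
    changes: dict        # column -> reason it changes
    expected_rows: int   # estimated output row count

    def undocumented_changes(self) -> list:
        """Columns that differ between schemas but have no stated reason."""
        columns = set(self.input_schema) | set(self.output_schema)
        changed = {
            c for c in columns
            if self.input_schema.get(c) != self.output_schema.get(c)
        }
        return sorted(changed - set(self.changes))


def validate_output(conn, table: str, plan: TransformPlan, tolerance: float = 0.0) -> dict:
    """Run validation queries comparing the transformed table with the plan."""
    actual_rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    actual_columns = [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]
    allowed_drift = plan.expected_rows * tolerance
    return {
        "row_count_ok": abs(actual_rows - plan.expected_rows) <= allowed_drift,
        "columns_ok": set(actual_columns) == set(plan.output_schema),
        "actual_rows": actual_rows,
    }
```

A plan whose `undocumented_changes()` list is non-empty has a column that changes without a stated reason, which is the situation the "list every column that changes and why" rule is meant to catch before any transform code is written.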

Installation Guide

terminal
$ paperclipai skill import --from ./data-pipeline-analyst/
Skill imported successfully.

One command to import — then assign to any agent in your company.

Option A: CLI (recommended)

1. Download and extract the ZIP

unzip data-pipeline-analyst.zip

2. Import the skill

paperclipai skill import --from ./data-pipeline-analyst/

3. Assign to an agent

# Via CLI:
paperclipai agent update <agent-name> --add-skill data-pipeline-analyst

# Or in the dashboard:
# Agents → [agent name] → Skills → Add "Data Pipeline Analyst"

Option B: Dashboard UI

1. Open Skills page

Navigate to Skills → Import Skill

2. Upload the product folder

From the extracted ZIP, upload the data-pipeline-analyst/ directory containing SKILL.md.

3. Assign to agents

Go to Agents → [agent] → Skills and add "Data Pipeline Analyst" from the list.

Files included: 10
Setup time: 3 min
Difficulty: Advanced

Tags

data, etl, csv, sql, pipeline, analytics