CSV / Excel Deep Dive (Prompt 2)

Trigger Phrase

Use prompt: CSV / Excel Deep Dive

Prompt

198 words

ROLE:
You are a data analyst using code to inspect, clean, analyse, and visualise uploaded datasets.

GOAL:
Perform a thorough analysis of an uploaded CSV, Excel, or JSON file and explain what matters clearly.

INPUT:
Data file: [UPLOAD FILE]

CONTEXT:
The user wants both a data health check and meaningful analysis, not a superficial summary. The output should help them understand quality issues, patterns, and next questions.

TASKS:
1. Run a data health check covering rows, columns, missing values, data types, and obvious outliers or errors.
2. Produce summary statistics for numerical columns.
3. Analyse categorical columns using top values and frequencies.
4. Identify strong correlations above 0.7 or below -0.7.
5. Analyse time-based trends if there is a date column.
6. Surface the 3 most interesting patterns or anomalies.
7. Create 3 to 4 clean, professional visualisations.
8. End with 3 things worth investigating further and why.

CONSTRAINTS:
- Do not invent missing inputs.
- Use code for the analysis.
- Flag uncertainty or data quality issues explicitly.
- Keep charts presentation-ready.

OUTPUT FORMAT:
- Data health check
- Statistical analysis
- Key patterns and anomalies
- Visualisations
- Further investigation points

IMPORTANT:
Wait for user data before starting. Write in British English. Prioritise clarity, evidence, and clean visual storytelling.
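
For reference only (this is not part of the prompt text to copy): tasks 1 and 4 above can be sketched in a few lines of Python. This is a minimal illustration, not the code a model will necessarily write. It assumes a small, hypothetical in-memory CSV, and the column names (`trial_days`, `conversions`, `region`) and helper names are made up for the example. It uses only the standard library so it runs anywhere; in practice an assistant would more likely reach for pandas.

```python
import csv
import io
import math
from itertools import combinations

# Hypothetical sample data; the empty cell in the third data row is a missing value.
SAMPLE_CSV = """trial_days,conversions,region
7,10,North
14,19,South
21,,North
28,41,East
"""

def is_blank(value):
    """True when a cell is absent or contains only whitespace."""
    return (value or "").strip() == ""

def to_float(value):
    """Convert a cell to float, or return None if it is blank or non-numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))

def health_check(rows):
    """Task 1 (partial): row count, column names, missing values per column."""
    columns = list(rows[0].keys()) if rows else []
    missing = {c: sum(1 for r in rows if is_blank(r[c])) for c in columns}
    return {"rows": len(rows), "columns": columns, "missing": missing}

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # undefined for a constant column
    return cov / math.sqrt(var_x * var_y)

def strong_correlations(rows, threshold=0.7):
    """Task 4: numeric column pairs whose correlation is above 0.7 or below -0.7."""
    columns = list(rows[0].keys()) if rows else []
    # A column counts as numeric if every non-blank cell parses as a number.
    numeric = [
        c for c in columns
        if any(to_float(r[c]) is not None for r in rows)
        and all(to_float(r[c]) is not None or is_blank(r[c]) for r in rows)
    ]
    results = []
    for a, b in combinations(numeric, 2):
        # Use only rows where both cells are present.
        pairs = [(to_float(r[a]), to_float(r[b])) for r in rows]
        pairs = [(x, y) for x, y in pairs if x is not None and y is not None]
        if len(pairs) < 3:
            continue  # too few complete rows to say anything useful
        r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
        if r is not None and abs(r) > threshold:
            results.append((a, b, round(r, 3)))
    return results

rows = load_rows(SAMPLE_CSV)
report = health_check(rows)
strong = strong_correlations(rows)
print(report)
print(strong)
```

Rows with a missing value are dropped pair by pair rather than filled in, which matches the constraint not to invent missing inputs. With so few rows, any correlation here is purely illustrative.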

Before & After

❌ Without this prompt

An unstructured request with unclear constraints, which tends to produce inconsistent output.

✅ With this prompt

Reusable, testable prompt/skill with clear trigger, inputs, output format, guardrails, and pass criteria.

Install Instructions

1. Copy the prompt text.
2. Paste into ChatGPT, Claude, Gemini, or any AI chat.
3. Fill in bracketed placeholders with your details.
4. Run and review output.

Test It

Test command:

Trigger with: 'Test the CSV / Excel Deep Dive with this input: [provide a short real example]'. Confirm output is specific, structured, and useful.

Expected output:

Most missing values are concentrated in the acquisition_source column, which makes channel attribution unreliable for 18% of rows. The strongest correlation is between trial length and conversion rate at 0.76.

Pass criteria:

- Output is specific to the input provided, not generic.
- Output follows the stated format and length.
- No invented statistics, facts, prices, or dates.
- Placeholders are not left unfilled.
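
Two of these criteria (format and unfilled placeholders) lend themselves to a mechanical check. The sketch below is one possible way to do that, under stated assumptions: the function name `check_output` and the placeholder regex are hypothetical, and the section names are copied from the prompt's OUTPUT FORMAT list. It cannot judge whether statistics were invented or whether the output is specific to the input; that still needs a human reviewer.

```python
import re

# Section headings taken from the prompt's OUTPUT FORMAT list, in order.
REQUIRED_SECTIONS = [
    "Data health check",
    "Statistical analysis",
    "Key patterns and anomalies",
    "Visualisations",
    "Further investigation points",
]

# Assumption: placeholders look like [UPLOAD FILE], i.e. bracketed upper-case text.
PLACEHOLDER_PATTERN = re.compile(r"\[[A-Z][A-Z0-9 _/-]*\]")

def check_output(text):
    """Return a list of human-readable problems found in a model response."""
    problems = []
    lowered = text.lower()
    position = 0
    for section in REQUIRED_SECTIONS:
        # Search from the end of the previous match so order is enforced.
        index = lowered.find(section.lower(), position)
        if index == -1:
            problems.append(f"missing or out-of-order section: {section}")
        else:
            position = index + len(section)
    for match in PLACEHOLDER_PATTERN.findall(text):
        problems.append(f"unfilled placeholder: {match}")
    return problems

# Hypothetical responses used only to exercise the checker.
good_response = "\n".join(REQUIRED_SECTIONS)
bad_response = (
    "Data health check for [UPLOAD FILE]\n"
    "Statistical analysis\n"
    "Key patterns and anomalies\n"
    "Further investigation points"
)

print(check_output(good_response))
print(check_output(bad_response))
```

An empty list means the response passed both mechanical checks. The matching is deliberately loose (case-insensitive substring search), so it may accept a heading that only appears inside a sentence.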

⚠️ Guardrails

Do not invent statistics, prices, laws, medical claims, or financial advice. Do not leave placeholders unfilled in output. Flag when inputs are too vague to produce a quality result — ask for clarification.

📁 Context File Tip

Project Context file

⚠️ Common Failure Modes

Output may become generic or over-confident, miss constraints, over-automate, or need fact-checking.

🔧 Fix Prompt

Tighten the goal, add examples, add constraints, specify the output format, and ask the model to list assumptions before final output.

🎛 Available Modes

Quick, Detailed, Critic, Final

🔌 Compatibility & Requirements

🌐 Needs web access

📎 Needs uploaded files

👤 Needs human approval

Approval point: Before publishing, sending, spending money, changing systems, or making commitments.

Required tools: Web research, File analysis, Spreadsheet tool

⚡ Automation

🔌 MCP-compatible

📋 Upgrade Notes

Upgraded for Prompt Hub Pro v9.9.5 scoring, skill metadata, importer compatibility, and reusable agent/workflow presentation.

Next step: Upgrade to a reusable skill for consistent repeated results

✅ 👤 HackTheSim · v2.0

Curated practical AI workflow and prompt library.

📅 Reviewed: 2026-06-19 🧪 Tested with: ChatGPT, Claude, Gemini
