
Pilot Phase Guide

"Week 1-2: Prove the concept. Calibrate the system. Build the foundation."

The pilot phase is designed to validate HEAT with minimal risk and maximum learning. By the end of Week 2, you'll have baseline data, calibrated thresholds, and proof of value to justify full team rollout.


Pilot Objectives

✅ Validate signal quality — Do Pain Streaks correlate with actual grind?
✅ Calibrate intensity scale — Is x8 meaningful vs x5?
✅ Identify tag patterns — Which work types matter for your team?
✅ Measure adoption friction — Does 30-second tagging actually work?
✅ Prove business value — Can you catch one burnout signal early?

Success criteria: At least 1 actionable insight per pilot participant by Week 2


Pilot Team Selection

Option A: Individual Pilot (Safest)

Who: 1-2 volunteer developers + their manager

Pros:

  • Minimal disruption
  • Easy to calibrate
  • Safe to iterate

Cons:

  • Limited pattern detection
  • No team heatmap view
  • Slower to prove ROI

Best for: Conservative organizations, initial validation


Option B: Small Team Pilot

Who: 3-5 developers + 1 manager (e.g., single squad)

Pros:

  • Team heatmap visible
  • Bus Factor detectable
  • Faster ROI proof
  • Realistic workflow test

Cons:

  • Requires group buy-in
  • More coordination

Best for: Agile teams, fast-moving organizations


Selection Criteria

Look for:

Criterion | Why It Matters
Volunteers | Adoption works when people want it
Diverse work types | Mix of Feature, Bug, Support, Config work
Reasonable manager | Will actually use the data, not weaponize it
Stable project | Not in crisis mode (skews baseline)
Technical comfort | Can install browser extension

Avoid:

❌ Teams in firefighting mode (skews data)
❌ Skeptical managers (kills adoption)
❌ Solo developers with no manager buy-in
❌ Teams with <1 week left on project


Week 1: Setup & Calibration

Day 1: Monday — Installation & Onboarding

Morning (30 minutes):

  1. Install browser extension

    • Download from internal deployment or public store
    • Configure API endpoint
    • Authenticate with SSO or team token
  2. Quick training (15 minutes)

What to tag (modeled in the sketch at the end of Day 1):

Work Type: Feature | Bug | Blocker | Support | Config | Research
Intensity: x1 (routine) to x10 (high load)

Rule of thumb:
- x1-x3: No mental fatigue
- x4-x6: Concentrated effort
- x7-x10: Grinding, stuck, frustrated
  3. Tag your first task
    • Pick task you're currently on
    • Choose work type
    • Estimate intensity (make your best guess)
    • Click "Save Tag"

Afternoon:

  • Tag 2-3 more tasks as you switch work
  • Note which parts feel unclear (we'll calibrate tomorrow)

Expected tags Day 1: 3-5 per person (getting comfortable)
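
For readers who think in code, here is a minimal sketch of what a single tag carries, matching the work types and the x1-x10 intensity scale from the quick-training step above. The class and field names (HeatTag, work_type, and so on) are illustrative assumptions, not the extension's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class WorkType(Enum):
    """Work types from the quick-training list."""
    FEATURE = "Feature"
    BUG = "Bug"
    BLOCKER = "Blocker"
    SUPPORT = "Support"
    CONFIG = "Config"
    RESEARCH = "Research"


@dataclass
class HeatTag:
    """One tag: a task reference, a work type, and an intensity from x1 to x10."""
    task_id: str                  # e.g. "PROJ-42"
    work_type: WorkType
    intensity: int                # 1 (routine) .. 10 (high load)
    note: str = ""                # optional context, e.g. "waiting on DB team"
    logged_at: datetime = field(default_factory=datetime.now)

    def __post_init__(self) -> None:
        if not 1 <= self.intensity <= 10:
            raise ValueError("intensity must be between x1 and x10")


# Example: tagging a high-intensity blocker
first_tag = HeatTag(task_id="PROJ-42", work_type=WorkType.BLOCKER, intensity=8,
                    note="waiting on DB team")
```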


Day 2: Tuesday — Intensity Calibration

Morning standup (5 minutes):

Quick round-robin:

  • "What intensity did you log yesterday?"
  • "Did any task feel harder than the number you gave it?"
  • "Any confusion on work types?"

Intensity calibration exercise:

Review these scenarios together:

Scenario | Initial Guess | Calibrated | Why
Reviewed PR for familiar code | x2 | x1 | Too easy to be x2, truly routine
Debugged SQL performance | x6 | x8 | "I was stuck for 3 hours, felt exhausting"
Built CRUD endpoint | x4 | x3 | Standard feature work, not intense
Production incident | x9 | x9 | Correctly calibrated (crisis mode)

The Physical Test:

x1: Could do this while half-asleep
x3: Normal focus, feel fine after
x5: Deep focus, need a break after
x7: Frustrated, want to walk away
x10: Mentally fried, done for the day

Afternoon:

  • Re-tag yesterday's work with calibrated understanding
  • Tag today's work with new mental model

Expected outcome: Everyone aligns on intensity scale meaning


Day 3: Wednesday — Work Type Clarity

Morning standup (5 minutes):

Discuss edge cases:

Scenario | Tag | Why
"Helping teammate debug their code" | Support, x2 | Assisting others, low personal effort
"Fixing bug in feature I just built" | Bug, x4 | It's a defect, even if recent
"Stuck waiting for QA environment" | Blocker, x5 | Can't proceed, frustration building
"Reading docs for new library" | Research, x3 | Exploration, not building yet
"Updating CI/CD config" | Config, x2 | Tooling/environment work

Common confusions resolved:

Q: "What if I'm blocked on a bug?" A: Tag as Blocker (primary) + note "waiting on DB team" (we track bottlenecks, not just bug fixing)

Q: "I spent 4 hours on a feature, but 2 hours was routine and 2 hours was hard. How do I tag?" A: Two tags: Feature, x2 and Feature, x7. Be granular.

Q: "Do I tag meetings?" A: Only if it's substantive work (architecture review = Research, x4). Skip standup.

Afternoon:

  • Tag with confidence on work types
  • Note any remaining edge cases for Friday review

Expected outcome: <10% tag ambiguity


Day 4: Thursday — Pattern Recognition

Morning:

Pilot participants get access to Developer View dashboard:

Your Heatmap (This Week):
├── Monday: 12 intensity (🟩 Normal)
├── Tuesday: 18 intensity (🟨 Warm) — Calibration adjustments
├── Wednesday: 8 intensity (🟦 Cool)
└── Thursday: (in progress)

Tag Distribution:
├── Feature: 45%
├── Bug: 25%
├── Blocker: 15%
├── Support: 10%
└── Config: 5%
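
The daily rows in this heatmap are simply the sum of that day's tag intensities mapped onto a color band. Below is a minimal sketch of that roll-up, assuming the default bands listed later under "Review thresholds" (<5 cool, 5-15 normal, 15-30 warm, 30+ critical); the function names are illustrative, not the dashboard's real code.

```python
from collections import defaultdict
from datetime import date


def color_band(total: int) -> str:
    """Map a day's total intensity onto the default color bands."""
    if total < 5:
        return "🟦 Cool"
    if total <= 15:
        return "🟩 Normal"
    if total <= 30:
        return "🟨 Warm"
    return "🟥 Critical"


def daily_heatmap(tags: list[tuple[date, int]]) -> dict[date, str]:
    """tags is a list of (day, intensity) pairs; returns one colored total per day."""
    totals: dict[date, int] = defaultdict(int)
    for day, intensity in tags:
        totals[day] += intensity
    return {day: f"{total} intensity ({color_band(total)})"
            for day, total in sorted(totals.items())}


# Example: Monday's tags sum to 12 (Normal), Tuesday's to 18 (Warm)
week = [(date(2024, 1, 1), 4), (date(2024, 1, 1), 8),
        (date(2024, 1, 2), 10), (date(2024, 1, 2), 8)]
print(daily_heatmap(week))
```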

Self-reflection questions (5 minutes):

  1. Does my heatmap match how I felt this week?
  2. Am I surprised by my tag distribution?
  3. Did I catch any Pain Streaks forming?

Pair discussion (10 minutes):

  • Share your heatmap with one other pilot participant
  • Discuss: "What does this data tell you about your week?"
  • Look for patterns: "Why was Tuesday warm?"

Afternoon:

  • Continue tagging
  • Start thinking about patterns

Expected outcome: First "aha moment" — data matches reality


Day 5: Friday — Week 1 Retrospective

Pilot team meeting (30 minutes):

Agenda:

  1. Data review (10 minutes)

    Manager shares (anonymized or open, team decides):

    • Team heatmap view
    • Tag distribution
    • Any Pain Streaks detected (probably none yet, Week 1 is baseline)
  2. Calibration check (5 minutes)

    • Does everyone agree on intensity scale now?
    • Any work types still confusing?
    • Are tags taking >30 seconds? (If yes, simplify)
  3. Early insights (10 minutes)

    Go around the table:

    • "One thing I learned about my work this week?"

    Examples:

    • "I didn't realize 30% of my time was Support"
    • "Tuesday felt bad because I had 3 blockers stacked"
    • "I thought I was productive Wednesday, but it was all Config work"
  4. Week 2 goals (5 minutes)

    • Goal: Catch one Pain Streak before it becomes a problem
    • Goal: Identify one Bus Factor = 1 situation
    • Goal: 90%+ tagging consistency

Expected outcome: Team is comfortable, sees value, ready to continue


Week 2: Pattern Detection

Day 6: Monday — Baseline Established

Morning standup (3 minutes):

  • Quick reminder: Tag as you work
  • New goal: "Let's see if we can detect a Pain Streak this week"

Manager setup:

Enable Pain Streak Alerts:

  • Threshold: 3 days on same task/tag combo
  • Alert delivery: Slack DM or email
  • Auto-check: Daily at 5pm
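
Under the hood, this alert rule only needs consecutive-day counting per task/tag combination. Below is a minimal sketch of that check, assuming a plain list of dates on which the combo was tagged; the 3-day threshold mirrors the setting above, and the function names are assumptions rather than HEAT's actual API.

```python
from datetime import date, timedelta


def pain_streak_days(tag_days: list[date], today: date) -> int:
    """Count consecutive days, ending today, on which the same task/tag combo was logged."""
    days = set(tag_days)
    streak, current = 0, today
    while current in days:
        streak += 1
        current -= timedelta(days=1)
    return streak


def should_alert(tag_days: list[date], today: date, threshold: int = 3) -> bool:
    """What the daily 5pm check would evaluate for each active task/tag combo."""
    return pain_streak_days(tag_days, today) >= threshold


# Example: the same combo logged Monday through Wednesday -> streak of 3, alert fires
logged = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)]
print(pain_streak_days(logged, today=date(2024, 1, 3)))  # 3
print(should_alert(logged, today=date(2024, 1, 3)))      # True
```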

Developers:

Tag as usual. Week 1 was calibration, Week 2 is real signal detection.


Day 7: Tuesday — First Pain Streak Alert (Maybe)

Scenario: Alice gets 🔥 Streak: 2 days notification

Task: PROJ-42 "Fix payment gateway timeout"
Tag: Blocker, x8
Streak: Day 2 (Monday + Tuesday)

Manager action:

  1. Check-in (5 minutes): "Hey Alice, I see you've been on the payment bug for 2 days at high intensity. Need help?"

  2. Options:

    • Pair with senior developer
    • Escalate to vendor support
    • Deprioritize for now if not critical
  3. Outcome:

    • Alice paired with Bob (knows payment system)
    • Blocker resolved by end of day
    • Cascade prevented: avoided a rushed fix that could have caused a production incident

Value demonstrated: $245K cascade avoided (see Cascade Effect example)


Day 8: Wednesday — Bus Factor Detection

Manager View shows:

Module Ownership (Last 2 Weeks):
├── Payment Gateway: Alice (100%)
├── Auth Service: Carol (90%), David (10%)
├── API Core: Bob (60%), Alice (40%)
└── Frontend: David (70%), Carol (30%)

🚨 Bus Factor = 1: Payment Gateway
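
The flag above falls straight out of the ownership shares: if one person carries essentially all of a module's recent work, the bus factor is 1. Below is a minimal sketch of that computation; the 95% dominance cutoff is an assumption chosen so the example reproduces the single flag above, not a documented HEAT default.

```python
def bus_factor(ownership: dict[str, float], dominance: float = 0.95) -> int:
    """Smallest number of people whose combined ownership share reaches the cutoff."""
    covered, people = 0.0, 0
    for share in sorted(ownership.values(), reverse=True):
        covered += share
        people += 1
        if covered >= dominance:
            break
    return people


modules = {
    "Payment Gateway": {"Alice": 1.00},
    "Auth Service":    {"Carol": 0.90, "David": 0.10},
    "API Core":        {"Bob": 0.60, "Alice": 0.40},
    "Frontend":        {"David": 0.70, "Carol": 0.30},
}

for name, owners in modules.items():
    bf = bus_factor(owners)
    print(f"{name}: {'🚨 ' if bf == 1 else ''}Bus Factor = {bf}")
```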

Manager action:

  1. Immediate: Schedule knowledge transfer session (Friday)
  2. This sprint: Pair Carol with Alice on payment work
  3. Next month: Cross-training plan for all critical modules

Value demonstrated: If Alice leaves, payment module continuity preserved


Day 9: Thursday — Tag Analysis Insight

Tag Analysis View shows:

Team Tag Distribution (Week 2):
├── Feature: 140 intensity (good, innovation work)
├── Bug: 80 intensity (normal)
├── Blocker: 180 intensity (🔴 Alert: 2× baseline)
├── Support: 60 intensity (normal)
└── Config: 120 intensity (🟡 Warning: Environment issues?)

Drill-down on Blocker spike:
- 3 of 5 developers blocked on "QA environment down"
- Total lost capacity: ~12 hours this week
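
The 🔴 and 🟡 markers above come from comparing this week's per-tag intensity against a baseline week. Below is a minimal sketch of that check; the baseline figures are assumptions for illustration, and the 2× alert / 1.5× warning ratios mirror the markers shown in the view.

```python
def flag_tag_spikes(current: dict[str, int], baseline: dict[str, int],
                    alert_ratio: float = 2.0, warn_ratio: float = 1.5) -> dict[str, str]:
    """Label each work type based on how far this week's intensity exceeds baseline."""
    flags = {}
    for tag, value in current.items():
        base = baseline.get(tag) or 1  # avoid division by zero for brand-new tags
        ratio = value / base
        if ratio >= alert_ratio:
            flags[tag] = f"🔴 Alert: {ratio:.1f}× baseline"
        elif ratio >= warn_ratio:
            flags[tag] = f"🟡 Warning: {ratio:.1f}× baseline"
        else:
            flags[tag] = "normal"
    return flags


# Week 2 totals from the view above, with assumed Week 1 baselines
current = {"Feature": 140, "Bug": 80, "Blocker": 180, "Support": 60, "Config": 120}
baseline = {"Feature": 130, "Bug": 85, "Blocker": 90, "Support": 55, "Config": 70}
print(flag_tag_spikes(current, baseline))  # Blocker -> Alert, Config -> Warning
```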

Manager action:

  1. Immediate: Escalate to DevOps (QA env priority)
  2. Root cause: Deployment script broken Monday, not fixed until Thursday
  3. Prevention: QA env monitoring alert added

Value demonstrated: Systemic issue caught, future cascades prevented


Day 10: Friday — Pilot Phase Retrospective

Pilot team meeting (45 minutes):

Agenda:

  1. Success stories (15 minutes)

    Share wins:

    • Alice's Pain Streak caught early
    • Bus Factor = 1 on Payment identified
    • Config spike revealed QA env issue
  2. Metrics review (10 minutes)

    Adoption:

    • Tagging consistency: 85%+ (target: 90%+)
    • Avg time per tag: 25 seconds (target: <30s)
    • Dashboard usage: Manager checked 4×/week

    Value:

    • Pain Streaks detected: 1
    • Cascades prevented: 1 (estimated $245K savings)
    • Systemic issues surfaced: 1 (QA env downtime)
    • Bus Factor = 1 situations: 1 (cross-training initiated)
  3. Learnings (10 minutes)

    What worked:

    • Daily tagging became habit by Day 3
    • Manager using data proactively, not punitively
    • Dashboard matched "gut feel" (validated framework)

    What needs improvement:

    • Tag selector UI could be faster (1-click for common tags?)
    • Some edge cases still unclear (support vs collaboration?)
    • Need reminder to tag at end of day
  4. Go/No-Go decision (10 minutes)

    Criteria for full rollout:

    Criterion | Status | Notes
    Tagging adoption >85% | ✅ 85% | Good enough
    At least 1 actionable insight | ✅ 3 insights | Exceeded
    Team sentiment positive | ✅ Unanimous | Everyone wants to continue
    Manager buy-in | ✅ Strong | Already using data
    Technical stability | ✅ No issues | Extension + API solid

    Decision: GO for full team rollout


Post-Pilot: Preparing for Rollout

Week 3 Prep Activities

Manager:

  1. Document pilot wins

    • Write 1-page summary for leadership
    • Include: Alice's Pain Streak, Bus Factor finding, QA env issue
    • Quantify value: $245K cascade prevented (conservative)
  2. Plan team onboarding

    • Schedule 30-minute team kickoff meeting
    • Prepare demo using pilot team's real data (anonymized if needed)
    • Assign onboarding buddy for each new user
  3. Review thresholds

    • Based on pilot data, are default thresholds right?
    • Intensity color bands: <5 (cool), 5-15 (normal), 15-30 (warm), 30+ (critical)
    • Adjust if needed for team norms

Pilot participants:

  1. Become champions

    • Share your experience in team channels
    • Offer to pair with new users for first tagging session
    • Be honest about what worked and what didn't
  2. Refine workflows

    • Identify optimal tagging moments (end of day vs real-time?)
    • Share tips (e.g., "I tag in standup for yesterday's work")

Common Pilot Issues & Solutions

Issue 1: "I keep forgetting to tag"

Solutions:

  • Browser extension badge reminder
  • End-of-day Slack reminder bot (see the sketch below)
  • Buddy system (pair checks in daily)
  • Make it a standup ritual
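
For the Slack reminder option above, one lightweight implementation is a scheduled job that posts to a Slack incoming webhook. A minimal sketch follows; the webhook URL and wording are placeholders, and your team may prefer the extension's badge or a bot framework instead.

```python
import requests  # third-party: pip install requests

# Placeholder incoming-webhook URL -- replace with your channel's own webhook
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"


def send_tag_reminder() -> None:
    """Post a simple end-of-day nudge to the team channel."""
    payload = {"text": "🔥 HEAT reminder: take 30 seconds to tag today's work before you log off."}
    response = requests.post(SLACK_WEBHOOK_URL, json=payload, timeout=10)
    response.raise_for_status()


if __name__ == "__main__":
    # Schedule on weekdays at 5pm, e.g. cron: 0 17 * * 1-5 python tag_reminder.py
    send_tag_reminder()
```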

Issue 2: "I'm not sure if this is x5 or x7"

Solution:

Use comparative questions:

  • "Was this harder than the last x5 task I did?"
  • "Did I need a break after, or could I continue?"
  • "On a scale of 1-10 physical exhaustion, how tired am I?"

Remember: Precision doesn't matter as much as consistency. If you're consistently calling similar work x6, the patterns will emerge.


Issue 3: "Manager is using this against me"

RED FLAG. Pause the pilot immediately.

Required conversation:

"HEAT is a team health tool, not a performance review system. Using it punitively will kill adoption and trust."

If manager doesn't course-correct, escalate or abandon pilot.


Issue 4: "This feels like surveillance"

Clarify:

HEAT Measures | HEAT Does NOT Measure
Effort intensity patterns | Hours worked
Team health signals | Individual productivity scores
Systemic friction | Who is "best" or "worst"
Knowledge distribution | Surveillance metrics

Opt-in philosophy:

  • You choose what to tag
  • Your data is visible to you first
  • Manager sees patterns, not granular activity

If concern persists, review Architecture privacy section together.


Issue 5: "Tags don't match my reality"

Investigate:

  1. Calibration issue?

    • Review intensity scale together
    • Use Physical Test framework
  2. Work type confusion?

    • Revisit the Day 3 edge-case scenarios together

  3. Legitimate gap?

    • Maybe HEAT isn't capturing your work type (e.g., heavy meeting load)
    • Consider custom tags or note this as limitation

Remember: Pilot is for finding these gaps. Iterate.


Pilot Success Metrics

Quantitative (Minimum Thresholds)

Metric | Target | Acceptable | Concern
Tagging consistency | >90% | >80% | <80%
Avg time per tag | <20s | <30s | >30s
Dashboard usage (manager) | Daily | 3×/week | <2×/week
Actionable insights | 3+ | 1+ | 0
Pain Streaks detected | 1+ | 0 (Week 1 only) | 0 (Week 2)
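
Of these, tagging consistency is the metric teams most often ask how to compute. The guide does not pin down a formula, so here is one plausible definition, stated as an assumption: the share of a participant's working days with at least one tag logged.

```python
from datetime import date


def tagging_consistency(workdays: list[date], tagged_days: set[date]) -> float:
    """Share of working days on which the participant logged at least one tag."""
    if not workdays:
        return 0.0
    covered = sum(1 for day in workdays if day in tagged_days)
    return covered / len(workdays)


# Example: 10 pilot workdays, tags logged on 9 of them -> 90%
workdays = [date(2024, 1, d) for d in range(1, 11)]
tagged = set(workdays) - {date(2024, 1, 4)}
print(f"{tagging_consistency(workdays, tagged):.0%}")  # 90%
```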

Qualitative (Must-Haves)

✅ Team sentiment: Majority positive or neutral
✅ Manager buy-in: Using data proactively
✅ Technical stability: No blocking bugs
✅ Privacy comfort: No surveillance concerns
✅ Value proof: At least 1 "aha moment"


Pilot Deliverables (End of Week 2)

For Leadership:

  1. One-page summary

    • Pilot team: 5 developers, 2-week trial
    • Key findings: [List 3-5 insights]
    • Value demonstrated: [$X cascade prevented]
    • Recommendation: Proceed to full rollout
  2. Example dashboard screenshot (anonymized)

    • Team heatmap
    • Tag distribution
    • Pain Streak example

For Team:

  1. Onboarding guide (based on pilot learnings)

    • 5-minute quick start
    • Common edge cases FAQ
    • Tag cheat sheet
  2. Manager playbook

    • Daily/weekly/monthly rituals
    • Intervention matrix (when to act on signals)
    • Example check-in scripts

Next Steps After Pilot

✅ Pilot Success? → Team Rollout Phase (Week 3-4)

🔧 Need More Calibration? → Extend pilot to Week 3, iterate

📊 Want to See More Examples? → Case Studies

❓ Questions on Framework? → Observable Signals Guide


"Pilot proves value. Rollout scales it. Sustained use delivers ROI." 🔥