08 — Assessment Design

The complete assessment specification: flow, item bank, conversation scripts, scoring model, and data-to-output mapping.

1. Design Principle — Output First

Every question exists because a specific output element needs the data. If a question doesn’t feed a visible output, it doesn’t belong.

FREE Tier Outputs → Data Needed

Output Element | Data Required
Archetype assignment | Top 2 superpowers → maps to 1 of 6 archetypes
Top 3 superpowers (visual bars) | 5 dimension scores (relative ranking)
AI Potential | Mindset Index + peak dimension + learning velocity proxy
Best Percentile stat | Industry, role, company type, seniority (multi-dimensional benchmarking)
Growth edge | Weakest dimension (positive framing)
Mode profile (Card C) | Primary + secondary mode + agentic orientation
Adaptive card layout | Overall percentile tier (determines visibility)

PAID Tier Outputs → Additional Data Needed

Output Element | Data Required
Superpowered Score (0–100) | 4 component indices: Mindset 35%, Skills 30%, Domain 25%, Technical 10%
20 sub-competency scores | Layer 2 behavioral evidence for all 20 sub-competencies
Full radar chart | All 5 dimension scores (refined with Layer 2)
Mode depth chart | Per-mode depth 0–5 + agentic depth per mode
Self-perception gap | Layer 1 vs Layer 2 score comparison per dimension
AI Potential (refined) | Learning velocity from conversation + consistency gap
3 learning path recs | Weakest sub-competencies + domain-specific growth areas
Certificate | Name, archetype, score, percentile, framework version, date

2. Assessment Flow

FREE Tier — Spark Profile (~7–8.5 min to result)

PHASE 0: AI WELCOME CHAT (2.5-3 min)
│ Conversational onboarding + mode profiling
│ Msgs 1-4: name, work context, AI usage, mindset hint
│ Msgs 5-6: mode selection (multi-select) + depth per mode
│ NLP extracts: industry, role, company type
│ Direct capture: modes used, depth per mode
▼
PHASE 2: FORCED-CHOICE ITEMS (3-4 min)
│ 22 items: tap one of 2 statements (~8-10 sec each)
▼
PHASE 3: ANCHORING + AGENTIC + SKILL (1 min)
│ 5 anchoring (Likert) + 1 agentic + 3 skill items
▼
LAYER 1 SCORING ENGINE
│ → 5 dimensions, archetype, AI Potential, mode profile
│ + Phase 0 signals (adoption, mindset, mode depth)
▼
★ EMAIL GATE ★
│ "Your profile is ready! Where should I send it?"
▼
★ SPARK PROFILE + ADAPTIVE CARD ★
│ Mode Profile labeled "your self-assessment"
▼
PHASE 4: MINI AI CHAT (2-3 min)
│ References what user said in Phase 0
│ Creates curiosity gap → upgrade trigger
▼
UPGRADE WALL

Flow note: Phase 0 combines conversational onboarding (Msgs 1–4) with interactive mode profiling (Msgs 5–6), replacing old form fields AND old MODE1/MODE2 items from Phase 3. Mode depth is self-reported with compression scoring (capped at 70/100). Email Gate captures leads at peak curiosity. Mode Profile in free tier labeled “your self-assessment” with upgrade CTA.

PAID Tier — Full Profile (~25–35 min)

EVERYTHING FROM FREE (7-8.5 min)
▼
PHASE 5: FULL AI CONVERSATION (15-20 min)
│ 5a: Experience Exploration (5-8 min)
│ 5b: Scenario Challenges (5-8 min)
│ 5c: Reflection & Closing (2-3 min)
▼
FULL SCORING ENGINE
│ L1 (30%) + L2 (70%)
│ 20 sub-competencies, 4 indices
▼
FULL PROFILE + REPORT + CERTIFICATE

3. Phase 0: AI Welcome Chat

Phase 0 combines conversational onboarding (Messages 1–4) with interactive mode profiling (Messages 5–6). It captures context, adoption level, mindset hint, AND full mode selection + depth — replacing both the old form fields and the old MODE1/MODE2 items.

Conversation Script (6–7 messages, ~2.5–3 min)

Message 1 — Name
AI: “Hi! I’m here to map your AI superpowers. Let’s start simple — what’s your name?”
User: [free text] → extracts Name
Message 2 — Work Context
AI: “Nice to meet you, [Name]! What do you do? Tell me your role and industry — just in your own words.”
User: [free text] → NLP extracts Role, Industry, Company type
Message 3 — AI Adoption
AI: “Got it. Now I’m curious — how does AI show up in your work today?”
User: [free text] → extracts AI Adoption Level + usage patterns
Message 4 — Mindset Hint
AI: “When AI gives you a result you didn’t expect, what’s your first instinct?”
User: [free text] → extracts Mindset hint
Message 5 — Mode Selection (interactive cards)
AI: “Now let me understand HOW you use AI. Which of these do you do? Tap all that apply.”
4 tappable mode cards (multi-select):
💬 I chat with AI 🎨 I create content with AI 🔧 I build with AI ⚡ I automate with AI
Message 6 — Mode Depth (per selected mode)
AI: “For each one — tap the description that best matches what you ACTUALLY DO:”
For each selected mode, show 5 descriptive cards (single-select, no numbers visible). See depth card tables below.
Message 7 — Transition
AI: “Love it. I have a great picture of how you work with AI. Now let’s find out where your real superpowers are.”
→ Animated transition to Phase 2 (forced-choice UI)

Mode Depth Cards (behavioral framing: “what do you DO?”)

User sees descriptive cards only — no numbers. Internal depth values are for scoring backend.

💬 Conversational

Internal | Card text
C-1 | I occasionally ask AI something
C-2 | I regularly use AI to solve work tasks
C-3 | I lead a dialogue with AI: I give context, iterate, refine
C-4 | AI knows my context: I know how to prompt it for my work
C-5 | AI is my daily partner for strategy and decision-making

🎨 Creative

Internal | Card text
R-1 | I occasionally generate something: an image, text, a presentation
R-2 | I regularly create content with AI, using my own workflows
R-3 | I create at a professional level, deep in one or more formats
R-4 | I have my own production system: multiple AI tools in a coordinated process
R-5 | I produce at scale: campaigns, series, dozens of outputs

🔧 Builder

Internal | Card text
B-1 | I tried, but nothing I actually use
B-2 | I built a few simple things: pages, tools
B-3 | I regularly build things I use at work
B-4 | I build apps that others use too
B-5 | I manage AI agents that code for me

⚡ Orchestration

Internal | Card text
O-1 | I tried connecting a few tools
O-2 | I have a few simple automations that save me time
O-3 | I build workflows where AI processes data by rules
O-4 | I have systems that run on their own; I check them occasionally
O-5 | My workflows adapt and make decisions on their own
Mode depth design principles (expert panel): Behavioral framing (“what do you DO”), no numbers visible, max 8–10 words per card, cumulative levels, compression scoring (capped at 70/100), free tier labeled “your self-assessment.” Expected L1 vs L2 gap: 30–40% of users self-report 1–2 levels higher than L2 confirms.

NLP Extraction Logic

Field | Extraction Method | Fallback
Name | Direct from Message 1 | Ask again
Industry | NLP classification → 12 categories | Quick picker: 12 chips
Role | NLP classification → 7 levels | Quick picker: 7 chips
Company type | NLP inference from context | Quick picker: 5 chips
AI Adoption | NLP → High / Medium / Low / None | Default Medium
Mindset Hint | NLP → Analytical / Trusting / Adaptive | Soft signal, no fallback
Mode Selection | Direct from Message 5 card taps | -
Mode Depth | Direct from Message 6 card taps | -

Signals from Phase 0

Signal | Source | Type | Usage
AI Adoption | Msg 3 | Bonus | Contextualizes L1 scores. Does NOT affect dimension scores.
Mindset Hint | Msg 4 | Bonus | Tiebreaker for archetype. Does NOT override L1.
Mode Selection | Msg 5 | Primary | Which modes appear in Mode Profile. Feeds scenario selection.
Mode Depth | Msg 6 | Primary (compressed) | Self-assessed depth. Capped at 70/100. Free tier: “self-assessment.” Paid: replaced 70% by L2.
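
The compression rule for Mode Depth can be sketched as follows. This is a minimal illustration, not the production scoring: the spec only fixes the cap (70/100) and the 70% Layer 2 weight in the paid tier, so the linear mapping from card level to score and the function names `compress_depth` and `blend_mode_depth` are assumptions.

```python
SELF_REPORT_CAP = 70.0  # from the spec: self-reported depth is capped at 70/100
L2_WEIGHT = 0.70        # from the spec: paid tier replaces 70% with L2 evidence


def compress_depth(level: int) -> float:
    """Map a tapped depth card (1-5) to a compressed 0-100 score.

    Assumption: levels scale linearly, so level 5 lands exactly on the cap.
    """
    if not 1 <= level <= 5:
        raise ValueError("depth level must be between 1 and 5")
    return level / 5 * SELF_REPORT_CAP


def blend_mode_depth(self_report: float, l2_depth: float) -> float:
    """Paid tier: keep 30% of the self-report, take 70% from Layer 2."""
    return (1 - L2_WEIGHT) * self_report + L2_WEIGHT * l2_depth
```

Under these assumptions a user who taps the top card scores 70 in the free tier, and a Layer 2 depth of 40 would pull the paid-tier value down to about 49.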

Deferred Fields

Field | When Collected | Purpose
Seniority | Post-result (optional) or NLP-inferred | Seniority percentile
Country | Auto-detect from IP/locale | Geographic percentile
Why conversational + interactive onboarding: Phase 0 combines natural conversation (Msgs 1–4) with interactive card-based UI (Msgs 5–6). This gives both qualitative context and structured mode data from the first 3 minutes. The user’s first impression is “I’m talking to an AI that gets me.”

4. Phase 2: Forced-Choice Item Bank (22 items)

Each item presents two positive statements. The user selects which resonates more. There is no wrong answer — the choice reveals which superpower is stronger. Items are tagged as Mindset (M) or Skill (S) to enable separate Mindset Index and Capability Index calculations.

Item Design Principles

  1. Both options are positive. No “wrong” choice.
  2. Each item contrasts two different dimensions.
  3. Language is natural and work-contextual.
  4. No tool names in items.
  5. Mindset items = attitude/orientation. Skill items = capability/action.
  6. Social desirability balance. Both options must sound equally “good” to a business professional. Validation required: present items to 20 people without context — if one option is chosen >65% as “more impressive,” rewrite it.

Perception Items (5)

P1 — AI Curiosity (M) vs. Critical Trust (M)
“When I hear about a new AI tool, I’m more likely to...”
(a) Try it right away to see what it can do → Perception: AI Curiosity
(b) Research its reliability and limitations before investing time → Intelligence: Critical Trust
P2 — Opportunity Recognition (S) vs. Prompt Mastery (S)
“When facing a new work challenge, my first instinct is to...”
(a) Think about which parts of this problem AI could handle → Perception: Opportunity Recognition
(b) Think about how to structure my input so AI gives me the best result → Intelligence: Prompt Mastery
P3 — AI Curiosity (M) vs. Iterative Learning (M)
“What excites me more about AI:”
(a) Discovering capabilities I didn’t know existed → Perception: AI Curiosity
(b) Seeing how fast my own workflow evolves because of it → Knowledge: Iterative Learning
P4 — Problem Reframing (S) vs. Process Decomposition (S)
“When given a complex task, I naturally...”
(a) Question whether the task itself is the right thing to do → Perception: Problem Reframing
(b) Break it down into smaller steps and assign each to the best approach → Integration: Process Decomposition
P5 — Opportunity Recognition (S) vs. Workflow Orchestration (S)
“I add the most value when I...”
(a) Spot where AI can be used in places nobody else thought of → Perception: Opportunity Recognition
(b) Design a system that connects multiple tools into a smooth workflow → Integration: Workflow Orchestration

Intelligence Items (4)

I1 — Critical Trust (M) vs. Creative Courage (M)
“When AI gives me an unexpected result, I’m more likely to...”
(a) Investigate why — I want to understand the reasoning → Intelligence: Critical Trust
(b) See if I can use it creatively — unexpected results can be valuable → Creation: Creative Courage
I2 — Prompt Mastery (S) vs. The 90/10 Craft (S)
“I spend more energy on...”
(a) Getting the right input into AI — the prompt, the context, the structure → Intelligence: Prompt Mastery
(b) Refining what comes out of AI — editing, iterating, polishing → Creation: The 90/10 Craft
I3 — Critical Trust (M) vs. Augmentation Vision (M)
“I trust AI most when...”
(a) I can verify its work against clear criteria → Intelligence: Critical Trust
(b) I’ve designed a system where AI and I each play to our strengths → Integration: Augmentation Vision
I4 — Strategic AI Dialogue (S) vs. Knowledge Architecture (S)
“My strength with AI is more about...”
(a) Using AI as a thinking partner — asking the right questions to get deep insights → Intelligence: Strategic AI Dialogue
(b) Organizing and structuring knowledge so it’s always ready to use → Knowledge: Knowledge Architecture

Knowledge Items (4)

K1 — Iterative Learning (M) vs. Critical Trust (M)
“When an AI tool I rely on starts giving worse results, I...”
(a) See it as a chance to find something better — I adapt fast → Knowledge: Iterative Learning
(b) Investigate what changed and whether the outputs can still be trusted → Intelligence: Critical Trust
K2 — Knowledge Architecture (S) vs. Opportunity Recognition (S)
“After a successful AI project, I’m more likely to...”
(a) Document what worked and create a reusable template → Knowledge: Knowledge Architecture
(b) Look for other situations where the same approach could apply → Perception: Opportunity Recognition
K3 — Iterative Learning (M) vs. Creative Courage (M)
“My relationship with change:”
(a) I actively seek it — I’m always looking for better ways to work → Knowledge: Iterative Learning
(b) I embrace it because it opens possibilities that didn’t exist before → Creation: Creative Courage
K4 — Knowledge Compounding (S) vs. Collaboration Design (S)
“When I discover a great AI workflow, I naturally...”
(a) Write it down so I or others can replicate it → Knowledge: Knowledge Compounding
(b) Show it to my team and design how we can use it together → Integration: Collaboration Design

Creation Items (4)

C1 — Creative Courage (M) vs. Augmentation Vision (M)
“I’m most proud when I...”
(a) Create something ambitious with AI that I couldn’t have done alone → Creation: Creative Courage
(b) Design a system where AI handles the heavy lifting while I focus on strategy → Integration: Augmentation Vision
C2 — The 90/10 Craft (S) vs. Workflow Orchestration (S)
“My work with AI is better described as...”
(a) I push AI outputs to a level of quality that impresses people → Creation: The 90/10 Craft
(b) I connect multiple AI tools into workflows that produce consistent results → Integration: Workflow Orchestration
C3 — Creative Courage (M) vs. Critical Trust (M)
“When considering an AI project I’ve never attempted before, I think...”
(a) “Let’s try it — worst case, I learn something” → Creation: Creative Courage
(b) “Let me first understand the risks and limitations” → Intelligence: Critical Trust
C4 — Building (S) vs. Context Engineering (S)
“My AI versatility shows in...”
(a) Building tools, apps, or prototypes — I make things that work → Creation: Building
(b) Preparing the right context for AI — the data, examples, and structure that make outputs excellent → Knowledge: Context Engineering

Integration Items (5)

G1 — Augmentation Vision (M) vs. Iterative Learning (M)
“I see AI’s biggest impact as...”
(a) Transforming how humans and technology work together as a system → Integration: Augmentation Vision
(b) Accelerating how fast we can learn, adapt, and improve → Knowledge: Iterative Learning
G2 — Workflow Orchestration (S) vs. Knowledge Architecture (S)
“My strongest contribution with AI is...”
(a) Designing multi-step processes where the right tool does each job → Integration: Workflow Orchestration
(b) Building organized knowledge systems that make information accessible and reusable → Knowledge: Knowledge Architecture
G3 — Augmentation Vision (M) vs. AI Curiosity (M)
“When I think about AI’s future, I focus on...”
(a) How human-AI collaboration will reshape how organizations work → Integration: Augmentation Vision
(b) What new capabilities will become possible that we can’t imagine today → Perception: AI Curiosity
G4 — Process Decomposition (S) vs. Problem Reframing (S)
“When I approach a big project with AI, I’m known for...”
(a) Breaking it into perfectly-sized pieces that each tool can handle → Integration: Process Decomposition
(b) Stepping back and redefining what we’re actually trying to achieve → Perception: Problem Reframing
G5 — Collaboration Design (S) vs. The 90/10 Craft (S)
“I create more value by...”
(a) Designing how my team uses AI together effectively → Integration: Collaboration Design
(b) Ensuring every AI-assisted deliverable meets the highest quality standard → Creation: The 90/10 Craft

Coverage Matrix

Ipsative measurement note: Forced-choice items produce ipsative (relative) scores — choosing one dimension suppresses another. A person cannot score high on all 5 simultaneously from forced-choice alone. The Dimension Anchoring Items in Phase 3 provide normative (absolute-level) calibration to complement the ipsative profile.
Dimension | Appearances | Mindset (M) | Skill (S)
Perception | 8 | 3 (P1, P3, G3) | 5 (P2, P4, P5, K2, G4)
Intelligence | 8 | 5 (P1, I1, I3, K1, C3) | 3 (P2, I2, I4)
Knowledge | 9 | 4 (P3, K1, K3, G1) | 5 (I4, K2, K4, G2, C4)
Creation | 8 | 4 (I1, K3, C1, C3) | 4 (I2, C2, C4, G5)
Integration | 11 | 4 (I3, C1, G1, G3) | 7 (P4, P5, K4, C2, G2, G4, G5)

Balance note: Integration has the most appearances (11), Perception/Intelligence/Creation the fewest (8 each). Acceptable — Integration is the most cross-cutting dimension by design.

5. Phase 3: Anchoring, Agentic & Skill Items (9 items)

Dimension Anchoring Items (5 — Likert scale 1–5)

These items establish absolute levels for each dimension, solving the ipsative measurement trap. Rated 1–5 (“Not at all like me” to “Exactly like me”).

ANCHOR1 — Perception
“I actively look for new ways AI could be used in my work — even in areas where nobody else is using it yet.”
1 · 2 · 3 · 4 · 5
ANCHOR2 — Intelligence
“I have a systematic approach to evaluating AI outputs — I know what to trust, what to verify, and what to reject.”
1 · 2 · 3 · 4 · 5
ANCHOR3 — Knowledge
“I organize my AI knowledge — prompts, templates, workflows — so I can reuse and build on what I’ve learned.”
1 · 2 · 3 · 4 · 5
ANCHOR4 — Creation
“I use AI to tackle ambitious creative projects I wouldn’t attempt on my own.”
1 · 2 · 3 · 4 · 5
ANCHOR5 — Integration
“I design workflows where AI handles repeatable tasks while I focus on judgment and strategy.”
1 · 2 · 3 · 4 · 5
Dimension_L1_Final = (Ipsative_Rank_Normalized × 0.6) + (Anchor_Score_Normalized × 0.4)
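
A minimal sketch of this blend, under two assumptions the spec does not state: the 1–5 Likert anchor is rescaled linearly onto 0–100, and `Ipsative_Rank_Normalized` is already a 0–100 value. The function names are illustrative.

```python
IPSATIVE_WEIGHT = 0.6  # from the formula above
ANCHOR_WEIGHT = 0.4    # from the formula above


def normalize_anchor(likert: int) -> float:
    """Assumption: rescale a 1-5 Likert answer linearly onto 0-100."""
    if not 1 <= likert <= 5:
        raise ValueError("anchor answer must be between 1 and 5")
    return (likert - 1) / 4 * 100


def dimension_l1_final(ipsative_normalized: float, anchor_likert: int) -> float:
    """Blend the relative (ipsative) score with the absolute anchor score."""
    return (IPSATIVE_WEIGHT * ipsative_normalized
            + ANCHOR_WEIGHT * normalize_anchor(anchor_likert))
```

For example, an ipsative score of 50 combined with an anchor answer of 5 gives 0.6 × 50 + 0.4 × 100 = 70.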

Agentic Orientation (1 item — single select)

Note: Mode selection and mode depth are now captured in Phase 0 (Section 3, Messages 5–6). Old MODE1/MODE2 items removed — the new Phase 0 design provides richer mode data (multi-select modes + 5-level depth per mode).
MODE3 — Agentic Orientation
“Which best describes how AI operates in your daily work?”
(a) I start every AI interaction myself → Non-agentic (0)
(b) I have AI assistants with instructions I return to regularly → Low agentic (1)
(c) Some of my AI processes run on their own, I check in periodically → Moderate agentic (2)
(d) I have AI systems that make decisions and take actions autonomously → High agentic (3)

Skill Signal Items (3 — single select)

SKILL1 — AI Usage Frequency
“How often do you use AI tools in your work?”
(a) Rarely — a few times a month → Score: 1
(b) Weekly — it’s part of my toolkit → Score: 2
(c) Daily — I use AI for multiple tasks every day → Score: 3
(d) Constantly — AI is running in the background of most of my work → Score: 4
SKILL2 — AI Tool Breadth
“How many different AI tools do you use regularly?”
(a) 1 (mostly ChatGPT or similar) → Score: 1
(b) 2–3 different tools → Score: 2
(c) 4–6 tools across different categories → Score: 3
(d) 7+ tools — I have a full AI toolkit → Score: 4
SKILL3 — Building Depth
“Have you built anything with AI? (apps, automations, custom tools)”
(a) No, I haven’t tried → Score: 0
(b) I’ve experimented but nothing I use regularly → Score: 1
(c) Yes, I’ve built tools or automations I actually use → Score: 2
(d) I regularly build apps, scripts, or systems with AI → Score: 3

6. Layer 1 Scoring Model

Dimension Scoring

Each forced-choice item assigns +1 to the chosen dimension and 0 to the other. Raw dimension scores are the sum of all choices for that dimension across 22 items.

Dimension_Normalized = (Raw_Score / Max_Possible_Score) × 100
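
The tally and normalization can be sketched as below. `Max_Possible_Score` is taken to be the number of times a dimension appears across the 22 items (the Appearances column of the Coverage Matrix); the input shape, a list with the chosen dimension for each answered item, is an assumption made for illustration.

```python
from collections import Counter

# Appearances per dimension, from the Coverage Matrix (sums to 44 = 22 items x 2).
MAX_POSSIBLE = {
    "Perception": 8,
    "Intelligence": 8,
    "Knowledge": 9,
    "Creation": 8,
    "Integration": 11,
}


def normalize_dimensions(chosen: list[str]) -> dict[str, float]:
    """Return 0-100 scores from the dimension chosen on each item."""
    unknown = set(chosen) - MAX_POSSIBLE.keys()
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    raw = Counter(chosen)  # +1 for the chosen dimension, 0 for the other
    return {dim: raw[dim] / top * 100 for dim, top in MAX_POSSIBLE.items()}
```

Dividing by each dimension's own appearance count keeps Integration (11 appearances) from outscoring the others simply because it is offered more often.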

Mindset Index (for AI Potential)

Mindset_Index_L1 = average(
  Perception_Mindset_Items,   // AI Curiosity signals
  Intelligence_Mindset_Items, // Critical Trust signals
  Knowledge_Mindset_Items,    // Iterative Learning signals
  Creation_Mindset_Items,     // Creative Courage signals
  Integration_Mindset_Items   // Augmentation Vision signals
) × normalization_factor

AI Potential (Layer 1 estimate)

AI_Potential_L1 = min(100, weighted_average(
  Mindset_Index_L1 × 1.4,      // High mindset = high ceiling
  Peak_Dimension × 1.1,         // Strongest dimension → potential
  SKILL1_frequency × 10,        // Usage frequency as velocity proxy
  100 - |Mindset - Capability| × 0.5  // Gap = untapped potential
))

Archetype Assignment

Top 2 dimensions by normalized score → Archetype lookup:

Top 2 Dimensions | Archetype
Integration + Intelligence | AI Architect
Perception + Intelligence | AI Navigator
Creation + Integration | AI Builder
Knowledge + Perception | AI Catalyst
Creation + Knowledge | AI Amplifier
Perception + Creation | AI Pioneer
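
The lookup can be sketched as an order-independent mapping over dimension pairs. Note that the table covers 6 of the 10 possible pairs; this sketch returns `None` for the other four rather than guessing a rule the spec does not give, and it does not implement the Mindset Hint tiebreaker. The function name is illustrative.

```python
from typing import Optional

ARCHETYPES = {
    frozenset({"Integration", "Intelligence"}): "AI Architect",
    frozenset({"Perception", "Intelligence"}): "AI Navigator",
    frozenset({"Creation", "Integration"}): "AI Builder",
    frozenset({"Knowledge", "Perception"}): "AI Catalyst",
    frozenset({"Creation", "Knowledge"}): "AI Amplifier",
    frozenset({"Perception", "Creation"}): "AI Pioneer",
}


def assign_archetype(scores: dict[str, float]) -> Optional[str]:
    """Look up the archetype for the two highest-scoring dimensions."""
    top_two = sorted(scores, key=scores.get, reverse=True)[:2]
    return ARCHETYPES.get(frozenset(top_two))  # None if the pair is unmapped
```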

7. Email Gate & Lead Capture

The Email Gate sits between Layer 1 scoring and result display — the moment of peak curiosity. The user has invested 6+ minutes, scoring is complete, and they’re about to see their profile. This is the optimal point for email capture.

UX Copy & Design

Scoring complete → Loading: “Analyzing your responses…”

“Your AI Superpower Profile is ready!”
Enter your email to see your results and get personalized tips for growth.

[ your@email.com ]   [ Show my profile → ]

We’ll send your profile + 3 personalized growth tips. No spam, unsubscribe anytime.
Design principles: Show a blurred preview of the Spark Profile card behind the email form (curiosity amplifier). Loading animation before the gate builds anticipation. Single field only (email) — name already captured in Phase 0. No skip option: users invested 6+ min and WILL enter email (sunk cost + curiosity). Target capture rate: 75–85%.

What Happens on Submit

  1. Email validated (format + disposable email detection)
  2. Spark Profile displayed immediately — zero delay
  3. Welcome email sent within 60 seconds (profile card, top 3 superpowers, 3 growth tips, CTA to Full Profile)
  4. Lead captured in CRM with tags: archetype, top superpower, industry, role, company type, AI adoption level

Email Follow-Up Sequence

Day | Email | Content | CTA
Day 0 | Welcome + Profile | Spark Profile card, top 3 superpowers, 3 growth tips | “Go deeper → Full Profile”
Day 3 | Growth Insight | Deep-dive on #1 superpower: what it means, how top performers use it | “See how you compare →”
Day 7 | Curiosity Nudge | Tease Full Profile reveals (sub-competencies, radar, learning path). Reference Growth Edge. | “Unlock your full potential →”
Day 14 | Social Proof | Aggregated stats: “3,200+ professionals mapped. [Archetype] types like you tend to…” | “Get your complete profile →”
Sequence principle: Each email adds NEW value (not just “buy now”). The user should feel smarter after reading each email, even if they never upgrade. This builds trust and brand — some upgrade on day 3, some on day 14, some forward it to colleagues (organic growth).

Expected Metrics

Metric | Target
Email capture rate | 75–85%
Welcome email open rate | 65–75%
Day 3 open rate | 35–45%
Day 7 open rate | 25–35%
Paid conversion from sequence | 5–12%

8. Phase 4: Mini AI Chat (FREE Tier)

2–3 personalized follow-up questions based on Layer 1 results. Gives users a taste of the AI conversation and creates the upgrade trigger.

Q1: Strongest Superpower Probe (always asked first)

Top Superpower | Question
Perception | “You seem to naturally spot AI opportunities. Can you give me a quick example: the last time you saw an AI use case that others missed?”
Intelligence | “You strike me as someone who really thinks about how they use AI. What’s an example of a time your careful approach paid off?”
Knowledge | “You have a knack for making things reusable. What’s the most valuable AI template or system you’ve built for yourself?”
Creation | “You seem like someone who ships ambitious things with AI. What’s the most creative project you’ve tackled?”
Integration | “You think in systems. What’s the most sophisticated AI workflow you’ve designed?”

Q2: Growth Area Hint (asked second)

Weakest Superpower | Question
Perception | “If you had to find ONE new area where AI could help that you haven’t explored yet, what would it be?”
Intelligence | “When AI gives you something 80% right, what’s your process for getting it to 100%?”
Knowledge | “After a great AI session, do you save what worked, or start fresh next time?”
Creation | “If you could build anything with AI this week, with no constraints, what would it be?”
Integration | “Imagine you could automate one repetitive part of your weekly work. What would it be?”

Upgrade Wall

There’s much more here.

In 3 quick answers, I can already see your [archetype_name] profile forming. But to really understand your superpowers — and show you where your biggest growth opportunity is — I need about 20 more minutes.

Unlock your Full Profile →
Deep AI conversation · Full superpower radar · Personalized learning path · Certificate

Curiosity Gap Principle (expert panel)

After each mini-chat response, the AI references something it noticed but can’t explore.

Scoring note: Mini AI chat responses are NOT scored for Layer 2. They are purely for user experience, creating informational curiosity gaps, and optional qualitative flagging.

9. Phase 5: Full AI Conversation (PAID Tier)

Conversation Architecture

Phase | Duration | Purpose
5a: Experience Exploration | 5–8 min | Deep-dive into real AI usage, 5 Observable Differentiators
5b: Scenario Challenges | 5–8 min | 3–5 adaptive scenarios, domain-specific + universal
5c: Reflection & Closing | 2–3 min | Forward-looking mindset signals, identity framing

5a: Experience Exploration

Opening:

“Let’s go deeper. Tell me about a recent project where AI played a significant role. Walk me through it — what was the task, what did you do, and how did it turn out?”

The AI probes the 2 weakest dimensions and 1 strongest (to confirm) using these probes:

Probe | Tests | Question
A: Decomposition | Process Decomposition, Opportunity Recognition | “When you face a complex problem, how do you decide what to hand to AI and what to handle yourself?”
B: Multi-Tool | Workflow Orchestration, Building | “Did you use just one tool, or did you combine several? How did they work together?”
C: Iteration | The 90/10 Craft, Prompt Mastery | “When AI gives you something close but not quite right, what’s your process?”
D: System Thinking | Knowledge Architecture, Augmentation Vision | “Do you have any AI workflows or templates you use repeatedly?”
E: Reframing | Problem Reframing, AI Curiosity | “Has there been a time when the question itself needed to be different?”

5b: Scenario Challenges

Selection logic: Focus on weakest 2 dimensions, include domain-specific scenario, include 1 outside primary mode, escalate complexity.

Universal Scenario Bank (select 2–3)

ID | Scenario | Primary Dimensions
S1 | The Competitive Analysis Sprint: 15 companies, 3 days, no analyst | Perception, Intelligence, Integration
S2 | The AI Failure: hallucinated data already sent to client | Intelligence, Knowledge, Creation
S3 | The Reluctant Team: AI tool automates 40%, half resist | Integration, Creation, Knowledge
S4 | The Urgent Presentation: 2 hours, unexpected | Creation, Intelligence, Perception
S5 | Knowledge Overflow: 200 pages, extract 10 insights | Knowledge, Integration, Intelligence
S6 | The Creative Brief: full campaign in one day | Creation, Perception, Knowledge
S7 | Process Audit: find top 3 AI opportunities in department | Integration, Knowledge, Perception
S8 | The Learning Challenge: master a new AI tool | Knowledge, Intelligence, Creation

5c: Reflection & Closing

Three final questions that capture mindset signals:

  1. Forward-looking: “Looking ahead 12 months — how do you expect your work with AI to change?”
  2. Temperature check: “Gut feeling — how much of your work will involve AI two years from now?”
  3. Identity: “If you had to describe your relationship with AI in one sentence?”
“Is there anything about how you use AI that we haven’t covered? Anything you’re particularly proud of, or struggling with?”

10. Layer 2 Scoring Model

Per-Response Scoring

Score | Meaning
0 | No signal (dimension not relevant)
1 | Weak negative signal (absence of competency)
2 | Weak positive signal (slight indication)
3 | Clear positive signal (competency demonstrated)
4 | Strong positive signal (exceptional depth)

Confidence Weighting

Confidence | Weight | When
High | 1.0 | Direct behavioral evidence: specific example, detailed process
Medium | 0.7 | Indirect evidence: general description, hypothetical
Low | 0.4 | Ambiguous: could indicate this or another dimension
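
One way to combine the per-response scores with these weights is a confidence-weighted mean. The spec gives the weights but not the aggregation rule, so the mean itself, the choice to skip score 0 ("no signal"), and the name `weighted_signal` are assumptions in this sketch.

```python
from typing import Optional

CONFIDENCE_WEIGHT = {"high": 1.0, "medium": 0.7, "low": 0.4}  # from the table


def weighted_signal(observations: list[tuple[int, str]]) -> Optional[float]:
    """Confidence-weighted mean of (score, confidence) pairs on the 0-4 scale.

    Assumption: score 0 means "dimension not relevant" and is left out
    rather than dragging the average down.
    """
    numerator = 0.0
    denominator = 0.0
    for score, confidence in observations:
        if score == 0:
            continue
        weight = CONFIDENCE_WEIGHT[confidence]
        numerator += score * weight
        denominator += weight
    return numerator / denominator if denominator else None
```

With one clear high-confidence signal (3) and one weak low-confidence signal (1), the result is (3 × 1.0 + 1 × 0.4) / 1.4, so the well-evidenced response dominates.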

5 Observable Differentiators

Differentiator | Score 1 | Score 3 | Score 5 | Maps to
Decomposition | Dumps whole problems | Sometimes breaks down | Reflexive decomposition | 5.3 + 1.2
Multi-Tool | Single tool only | 2–3 tools linked | Complex pipelines | 5.2 + 4.3
Iteration | 1–2 tries | 3–4 intentional rounds | Until quality bar met | 4.2 + 2.2
System Thinking | Everything one-off | Some templates saved | Builds reusable systems | 3.2 + 5.1
Reframing | Executes as given | Occasionally questions | Redefines before executing | 1.4 + 1.1

Layer 2 Technical Architecture

Single-call LLM design (expert panel): Use one LLM call per user turn that both continues conversation AND outputs structured scoring JSON. This halves API cost and eliminates scoring latency vs. running a separate “judge LLM.”
// Single LLM call returns both:
{
  "conversation_response": "Natural language reply...",
  "scoring": {
    "sub_competencies": { "1.1_ai_curiosity": { "score": 3, "confidence": "high" } },
    "language_signals": ["exploration", "system_thinking"],
    "next_probe_priority": "knowledge"
  }
}
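
A minimal sketch of how the backend might split and validate such a payload. It assumes the model returns strict JSON (no `//` comment line), that confidence values are the lowercase labels `high`, `medium`, `low`, and that scores use the 0–4 per-response scale; `parse_turn` is a hypothetical helper name.

```python
import json

VALID_CONFIDENCE = {"high", "medium", "low"}


def parse_turn(raw: str) -> tuple[str, dict[str, tuple[int, str]]]:
    """Split one LLM reply into the chat text and validated scores."""
    payload = json.loads(raw)
    reply = payload["conversation_response"]
    scores = {}
    for key, entry in payload["scoring"]["sub_competencies"].items():
        score, confidence = entry["score"], entry["confidence"]
        # Reject anything outside the 0-4 scale or with an unknown label.
        if score not in range(0, 5) or confidence not in VALID_CONFIDENCE:
            raise ValueError(f"invalid scoring entry for {key}: {entry}")
        scores[key] = (score, confidence)
    return reply, scores
```

Validating before storing matters in a single-call design, because there is no separate judge call to catch a malformed or out-of-range score.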

Observable Differentiator reliability: Score each differentiator at 3 separate points during conversation (not just once at end), average the 3 scores, calibrate with 50+ human-expert-scored transcripts before launch.

11. Combined Scoring (Layer 1 + Layer 2)

Sub-Competency Score

Sub_Competency_Final = (L1_Score × 0.30) + (L2_Score × 0.70)

Dimension Score

Dimension_Score =
  (Mindset_Sub × 0.30) +
  (Skill_Sub × 0.30) +
  (Additional_1 × 0.20) +
  (Additional_2 × 0.20)

Normalized to 0-100.

Superpowered Score

Superpowered_Score =
  (Mindset_Index × 0.35) +
  (Capability_Index × 0.30) +
  (Application_Index × 0.25) +
  (Technical_Depth × 0.10)

Range: 0-100
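
As a worked illustration, the weighted sum can be written as below. It assumes all four indices are already on a 0–100 scale; the dictionary keys are illustrative names for the four components in the formula.

```python
WEIGHTS = {
    "mindset_index": 0.35,
    "capability_index": 0.30,
    "application_index": 0.25,
    "technical_depth": 0.10,
}


def superpowered_score(indices: dict[str, float]) -> float:
    """Weighted sum of the four component indices (each assumed 0-100)."""
    missing = WEIGHTS.keys() - indices.keys()
    if missing:
        raise KeyError(f"missing indices: {sorted(missing)}")
    return sum(indices[name] * weight for name, weight in WEIGHTS.items())
```

For indices of 80, 60, 40 and 20 the score is 28 + 18 + 10 + 2 = 58. Because the weights sum to 1.0, the result stays in the 0–100 range whenever the inputs do.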

AI Potential (refined with Layer 2)

AI_Potential_raw = weighted_average(
  Mindset_Index × 1.4,
  Learning_Velocity × 1.2,
  Strongest_Dimension × 1.1,
  100 - Consistency_Gap × 0.5
)

// Sigmoid normalization to distribute across 40-100 range
AI_Potential_Final = sigmoid_map(AI_Potential_raw,
  pilot_mean, pilot_sd, target_range=[40, 100])
Display threshold: Only show AI Potential when AI_Potential − Score ≥ 10. For top performers where they’re nearly equal, omit it — it adds no insight at that level.
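
The spec names `sigmoid_map` without defining it. One plausible reading, shown here as a sketch only, is a logistic curve applied to the z-score against pilot data and stretched onto the target range; the clamp on extreme z-scores and the helper `show_ai_potential` are additions for illustration.

```python
import math


def sigmoid_map(raw: float, pilot_mean: float, pilot_sd: float,
                target_range: tuple[float, float] = (40.0, 100.0)) -> float:
    """Assumed form: logistic of the z-score, scaled into target_range."""
    if pilot_sd <= 0:
        raise ValueError("pilot_sd must be positive")
    low, high = target_range
    z = (raw - pilot_mean) / pilot_sd
    z = max(-60.0, min(60.0, z))  # keep math.exp well inside float range
    return low + (high - low) / (1 + math.exp(-z))


def show_ai_potential(ai_potential: float, score: float) -> bool:
    """Display rule from the spec: show only if the gap is at least 10."""
    return ai_potential - score >= 10
```

Under this form a raw value equal to the pilot mean maps to the midpoint of the range (70), and values far above or below the mean approach 100 and 40 without reaching them.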

Self-Perception Gap

Consistency = 1 - |L1_normalized - L2_normalized| / 100

> 0.8: "Highly consistent self-awareness"
0.5-0.8: "Some gaps between self-perception and practice"
< 0.5: "Significant self-perception gap"
Self-report inflation detection: If Layer 1 exceeds Layer 2 by >20 points for any dimension: “Your self-perception in [dimension] is stronger than what your practices show. This often means untapped potential — you know what good looks like but haven’t fully applied it yet.”
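
The consistency bands and the inflation check can be sketched as follows, assuming L1 and L2 scores are both on a 0–100 scale. The spec's bands leave the boundary values open; this sketch places exactly 0.8 and exactly 0.5 in the middle band. Function names are illustrative.

```python
def consistency(l1_normalized: float, l2_normalized: float) -> float:
    """Consistency = 1 - |L1 - L2| / 100, as defined above."""
    return 1 - abs(l1_normalized - l2_normalized) / 100


def consistency_label(value: float) -> str:
    """Map a consistency value onto the three feedback bands."""
    if value > 0.8:
        return "Highly consistent self-awareness"
    if value >= 0.5:
        return "Some gaps between self-perception and practice"
    return "Significant self-perception gap"


def inflated_dimensions(l1: dict[str, float], l2: dict[str, float],
                        threshold: float = 20) -> list[str]:
    """Dimensions where Layer 1 exceeds Layer 2 by more than the threshold."""
    return [dim for dim in l1 if l1[dim] - l2[dim] > threshold]
```

The inflation check is directional: a user whose Layer 2 evidence exceeds their self-report lowers the consistency value but does not trigger the inflation message.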

12. Data-to-Output Mapping

Output | Phase 0 (AI Chat) | Phase 2 (FC) | Phase 3 (Mode/Skill) | Email Gate | Phase 4 (Mini) | Phase 5 (Full)
Archetype | Mindset hint (tiebreaker) | Top 2 dimensions | - | - | - | Refined
Top 3 Superpowers | - | Dimension ranking | - | - | - | Refined
AI Potential | Adoption level (context) | Mindset + peak dim | SKILL1 velocity | - | - | Learning velocity + gap
Best Percentile | Industry, role, company type | Dimension scores | - | - | - | Refined scores
Growth Edge | - | Weakest dimension | - | - | - | Weakest sub-competency
Mode Profile | Mode selection + depth (self-report, compressed) | - | MODE3 (agentic) | - | - | Behavioral depth (70% weight)
SP Score | - | L1 estimate | Skill signals | - | - | Full 4-component
20 Sub-competencies | - | L1 signals | - | - | - | Full L2 scoring
Learning Path | Industry context | - | - | - | - | Weakest 3 + domain
Certificate | Name | - | - | - | - | Score + archetype
Lead Capture | Name, industry, role, company type, adoption | - | - | Email | - | -
Email Sequence | Growth Edge + adoption for tips | Top superpower for content | - | Triggers sequence | Curiosity gap for Day 3 | Full data for emails

13. UX Design Principles

Progress & Pacing

Visual Design

Tone

Mobile-First

14. Anti-Gaming & Quality Controls

Time-Per-Item Tracking

If 22 items completed in <60 seconds (<3s per item), flag as “speed-through.” Don’t invalidate — apply confidence: low modifier to all L1 scores, increasing L2 weight.
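
The speed-through flag reduces to a single comparison on total time. In this sketch the input is a list of per-item durations in seconds, and the label `"normal"` for unflagged sessions is an assumption; the spec does not say how much the L2 weight increases, so that step is left out.

```python
SPEED_THROUGH_SECONDS = 60.0  # from the spec: 22 items in under 60 seconds


def l1_confidence(item_durations: list[float]) -> str:
    """Return "low" for a speed-through session, otherwise "normal".

    The session is never invalidated; the flag only marks L1 scores
    as low confidence.
    """
    total = sum(item_durations)
    return "low" if total < SPEED_THROUGH_SECONDS else "normal"
```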

Semantic Consistency Check

The Consistency Score (L1 vs L2 comparison) detects both self-perception gaps (coaching insight) and gaming (reliability concern).

No explicit attention checks (“select option B”). They break immersion and feel insulting to professionals.

Social Desirability Mitigation

  1. Pre-launch validation: Present all 22 items to 20 people without context. If either option of an item is chosen by more than 65% of them, rewrite the item.
  2. Both-positive design: Both options must sound equally desirable
  3. Ipsative format helps: Forced-choice is inherently more resistant to gaming than agree/disagree scales
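
The pre-launch validation in step 1 could be checked with a small script like the following. It is a sketch under assumptions: responses are recorded as option labels per item, and item IDs such as `FC01` are hypothetical placeholders.

```python
from collections import Counter


def items_to_rewrite(responses: dict[str, list[str]],
                     max_share: float = 0.65) -> list[str]:
    """Return item IDs where one option was chosen by more than max_share of respondents.

    responses maps an item ID to the list of options chosen (one entry per person).
    """
    flagged = []
    for item_id, choices in responses.items():
        if not choices:
            continue  # no data for this item yet
        top_count = Counter(choices).most_common(1)[0][1]
        if top_count / len(choices) > max_share:
            flagged.append(item_id)
    return flagged
```

With 20 respondents, an item split 14/6 (70%) would be flagged for rewriting, while an item split 12/8 (60%) would pass.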

15. Re-Assessment Strategy

Hybrid Item Rotation (60/40)

| Component | Strategy | Rationale |
| --- | --- | --- |
| Phase 0 AI Chat | Same structure, different phrasing | AI varies wording naturally; Q3 (AI usage) captures genuine change over time |
| Forced-choice | 13 anchor items identical + 9 rotated | Anchors track change; rotation reduces memory gaming |
| Anchoring items | Always identical (all 5) | Stable absolute-level measurement |
| Mode/Skill items | Always identical | Track actual behavioral change |
| Mini AI chat | Different questions each time | Natural variation |
| Layer 2 scenarios | Always different | Adaptive nature ensures different paths |

Expanded bank needed: 35 total items (13 permanent + 22 rotatable, drawing 9 per session). Target: Q3 2026.
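
One way to assemble a re-assessment session from the expanded bank is sketched below: all 13 anchors plus 9 items drawn at random from the 22 rotatable ones. The function name, item IDs, and the optional seed are assumptions for illustration; the spec does not say how the draw is made or how items are ordered.

```python
import random


def build_session_items(anchor_ids: list[str], rotatable_ids: list[str],
                        n_rotated: int = 9, seed=None) -> list[str]:
    """Return the forced-choice item IDs for one session.

    Anchors are always included; n_rotated items are sampled without
    replacement from the rotatable pool.
    """
    if len(rotatable_ids) < n_rotated:
        raise ValueError("rotatable pool is smaller than the number of items to draw")
    rng = random.Random(seed)  # a seed makes the draw reproducible, e.g. per user and attempt
    rotated = rng.sample(list(rotatable_ids), n_rotated)
    return list(anchor_ids) + rotated
```

The sketch lists anchors first for clarity; a real session would probably interleave or shuffle the final order so anchors are not recognizable by position.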

16. Alternative Path — Conversation Upload Assessment

The Idea

Instead of answering questions about how you use AI, you show how you use AI. The user uploads 30–50 of their past AI conversations — from any tool (ChatGPT, Claude, Cursor, Copilot, Gemini, or any export) — and the system analyzes their actual behavior to generate a full SP™ AI Score profile.

This is a 100% behavioral assessment — no self-report, no forced-choice, no “how often do you…” questions. Just real evidence of how you work with AI.

Why This Path Exists

| Problem with traditional assessment | How conversation upload solves it |
| --- | --- |
| People overestimate their skills | Conversations don’t lie; they show what you actually do |
| Assessment fatigue (“not another quiz”) | Zero questions; you just share what you already have |
| Hard to assess advanced users | Power users leave the richest behavioral traces |
| Self-report misses nuance | Real conversations reveal patterns the user doesn’t even notice |
| Takes 25–35 minutes | Upload takes 2 minutes; analysis happens in the background |

What the User Provides

Input: 30–50 past conversations with any AI tool.

Accepted formats:

Minimum requirement: 30 conversations spanning at least 2 weeks. Recommended: 50+ conversations across 30+ days to avoid recency bias.
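
The upload thresholds above could be enforced with a check along these lines. This is a sketch: it assumes each conversation carries a date, interprets "spanning at least 2 weeks" as 14 or more days between the earliest and latest conversation, and uses hypothetical status labels.

```python
from datetime import date

MIN_CONVERSATIONS = 30          # minimum from the spec
MIN_SPAN_DAYS = 14              # "at least 2 weeks" (interpretation is an assumption)
RECOMMENDED_CONVERSATIONS = 50  # recommended from the spec
RECOMMENDED_SPAN_DAYS = 30


def check_upload(conversation_dates: list[date]) -> str:
    """Classify an upload as 'rejected', 'accepted', or 'recommended'."""
    if not conversation_dates:
        return "rejected"
    count = len(conversation_dates)
    span_days = (max(conversation_dates) - min(conversation_dates)).days
    if count < MIN_CONVERSATIONS or span_days < MIN_SPAN_DAYS:
        return "rejected"
    if count >= RECOMMENDED_CONVERSATIONS and span_days >= RECOMMENDED_SPAN_DAYS:
        return "recommended"
    return "accepted"
```

An upload of 40 conversations that all happened on one day would be rejected despite clearing the count threshold, which is the recency-bias case the span requirement guards against.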

What the System Analyzes

1. AI Interaction Modes

| Signal | What it reveals |
| --- | --- |
| Questions, brainstorming, back-and-forth dialogue | Conversational mode: do you think WITH AI? |
| Content creation, drafts, creative requests | Creative mode: do you produce, iterate, and refine? |
| Code, tools, apps, prototypes, technical builds | Builder mode: do you build things that work? |
| Automation, pipelines, multi-step workflows, agents | Orchestration mode: do you design systems, not just tasks? |

2. Five Superpowers

| Superpower | What we look for |
| --- | --- |
| Perception | Do you spot opportunities? Do you reframe problems? Do you use AI for things others wouldn’t think of? |
| Intelligence | Do you think strategically with AI? Do you evaluate outputs critically? Do you simulate perspectives? |
| Knowledge | Do you build systems that remember? Do you manage context? Do you compound knowledge across sessions? |
| Creation | Do you produce at quality? Do you iterate until it’s right? Do you maintain your own voice and standards? |
| Integration | Do you connect AI into your real workflows? Do you design processes, not just tasks? Do you build for others? |

3. AI Skills

| What we observe | Skill it maps to |
| --- | --- |
| How you structure requests, give context, iterate | Prompting |
| How many tools, models, and integrations you combine | Tool Selection |
| Whether you create reusable agents, instructions, or templates | AI Assistants |
| Whether you build apps, scripts, prototypes, or tools | Vibe Coding |
| Whether you create content (posts, articles, copy, docs) | Content Creation |
| Whether you analyze, clean, or enrich data | Data Work |
| Whether things run without you | Automation |
| Whether you design multi-step or multi-agent workflows | Autonomous Workflows |
| Whether you build systems that persist and reuse knowledge | Knowledge Management |
| Whether you use AI to accelerate entire projects | Project Acceleration |

4. Mindset — Observable Through Behavior

| Mindset component | Behavioral signal |
| --- | --- |
| Curiosity & Openness | How often do you try new approaches? Do you experiment? |
| Critical Trust | Do you push back on AI? Do you evaluate outputs or just accept them? |
| Augmentation Vision | Do you ask AI to DO things, or to help you THINK about things? |
| Learning Velocity | Are you visibly better in recent conversations than older ones? |
| Collaboration Design | Do you design AI interactions for others, or only for yourself? |

How the Output Looks

The output is a full SP™ AI Score profile — identical in structure to the Paid Full Profile:

  1. SP™ AI Score (0–100) — composite score based on the four weighted components
  2. Archetype — derived from top 2 superpowers
  3. Narrative — 2–3 sentences describing who you are as an AI professional
  4. Radar chart — 5 superpower dimensions
  5. Mode profile — which modes you use and at what depth
  6. Your Stack (Skills) — all 10 skills ranked and scored in human language
  7. Behavioral Evidence — the strongest patterns observed, written as insights
  8. What Would Make You Faster — 3–5 growth recommendations, each with a “Solve with AI” button that opens the prompt directly in Claude or ChatGPT

Scoring Methodology

Scoring follows the standard SP™ framework with one key difference: Layer 2 (behavioral evidence) carries 100% of the weight — there is no Layer 1 self-report.

SP™ AI Score = weighted_average(
  Mindset Index      × 35%   // observed via behavioral proxies
  Applied AI Skills  × 30%   // observed via tool usage and task types
  Domain Integration × 25%   // observed via workflow complexity
  Technical Depth    × 10%   // observed via builder-mode activity
)
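
As a worked example of the weighting, the sketch below computes the composite from four component indices. It assumes each index is already on a 0–100 scale; the dictionary keys and function name are hypothetical. Because the weights sum to 1.0, the weighted average reduces to a weighted sum.

```python
# Component weights from the spec (they sum to 1.0).
WEIGHTS = {
    "mindset": 0.35,    # Mindset Index
    "skills": 0.30,     # Applied AI Skills
    "domain": 0.25,     # Domain Integration
    "technical": 0.10,  # Technical Depth
}


def sp_ai_score(components: dict[str, float]) -> float:
    """Return the weighted composite of the four component indices (each 0-100)."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(components[name] * weight for name, weight in WEIGHTS.items())
```

For component indices of 80 (Mindset), 70 (Skills), 60 (Domain), and 50 (Technical), the composite is 28 + 21 + 15 + 5 = 69.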

Comparison to Standard Assessment

| Aspect | Standard (questionnaire) | Conversation Upload |
| --- | --- | --- |
| Input | User answers questions about behavior | User shows actual behavior |
| Time to complete | 7–35 minutes | 2 minutes + async processing |
| Self-report bias | Present (people overestimate) | Absent (behavior doesn’t lie) |
| Works for beginners | Yes: questions are accessible | Limited: needs 30+ AI conversations |
| Works for power users | May feel reductive | Excellent: richer signals |
| Scoring layers | L1 (30%) + L2 (70%) or L1 only | L2 only (100% behavioral) |
| Privacy model | Answers stored | Conversations processed and discarded |

Target Audience

Ideal for:

Not ideal for:

Pilot Results

Tested with 50+ Cursor session transcripts (Jan–Feb 2026):


17. Open Questions

Resolved (from expert panel)

| # | Question | Resolution |
| --- | --- | --- |
| 4 | AI Potential clusters 85–95 | Added sigmoid normalization + display threshold (≥ 10 gap) |
| 7 | Anti-gaming | Time-per-item tracking + semantic consistency. No attention checks. |
| 8 | Re-assessment items | Hybrid rotation (60% anchor / 40% rotated) |

Still Open

  1. Item validation (NON-NEGOTIABLE): 22 FC items + 5 anchoring items need validation with 50–100 pilot users. Required before public launch.
  2. Phase 0 + Mini chat scoring weight: Phase 0 bonus signals (AI Adoption, Mindset Hint) are used for personalization and tiebreaking, NOT primary scoring. Mini chat (Phase 4) remains UX/conversion only. Revisit if pilot shows conversational responses are highly diagnostic.
  3. Scenario bank size: Target 20 universal by Q2 2026, 35+ rotatable items by Q3 2026.
  4. Layer 1 strategic accuracy: Business decision — make L1 directionally accurate (archetype correct 70%+) but numerically imprecise. Show archetype confidently; blur dimension scores/sub-competencies in free tier.
  5. Chat technology: Start with text; add voice (ElevenLabs) as v2 feature.
  6. Observable Differentiator calibration: 50+ human-expert-scored transcripts needed. Target: Cohen’s kappa > 0.7.
  7. Enterprise path: Add “share with your team” prompt post-assessment.
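
For the calibration target in item 6, Cohen's kappa between the system's labels and a human expert's labels can be computed as sketched below. This assumes each transcript receives one categorical label from each rater; the label values and function name are illustrative, and how the Observable Differentiator is actually categorized is not specified here.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters over the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)
```

A result above 0.7 on the expert-scored transcripts would meet the stated target. For production use, an established implementation such as scikit-learn's `cohen_kappa_score` is a reasonable alternative to hand-rolled code.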