B. F. Skinner, Operant Conditioning, and What Marketers Can Learn
Why a mid-century behavioural scientist still shapes your CRM, app engagement and loyalty strategy today.
B. F. Skinner was a Harvard psychologist who argued that much of what we do is shaped by consequences. If an action reliably leads to a desirable outcome, we tend to repeat it. Skinner called this operant conditioning and spent decades showing, with almost obsessive precision, how different patterns of rewards and signals shape behaviour.
His work moved psychology beyond Pavlov’s bell-and-dog reflexes and into the realm of everyday choice. For marketers, this is gold dust: it’s a blueprint for designing environments that make desired customer behaviours more likely. For a primary introduction to Skinner’s programme, see his books The Behavior of Organisms (1938) and Science and Human Behavior (1953).
Before Skinner: Pavlov and Thorndike in one Paragraph
Pavlov’s classical conditioning linked stimuli together (bell = salivation), while Edward Thorndike’s Law of Effect established that behaviours followed by satisfying outcomes are more likely to recur. Skinner built on Thorndike, focusing on how actions operate on the environment to produce consequences.
Operant Conditioning in Plain English
- Positive reinforcement adds something desirable to increase a behaviour.
- Negative reinforcement removes something unpleasant to increase a behaviour.
- Punishment reduces a behaviour by adding something aversive or removing something desirable.
- Extinction is the fading of a behaviour when it’s no longer reinforced.
Skinner’s lab engine was the operant conditioning chamber (often called a Skinner box): a controlled environment where an animal could press a lever or peck a key to earn food, letting researchers map how consequences shape responding with clockwork accuracy.
He also showed that we can get “superstitious” behaviours if rewards happen by accident. In a classic 1948 paper, pigeons repeated odd rituals because food happened to arrive after those movements, even though the movements didn’t cause it. Sound familiar when you think of “lucky” checkout tricks or cargo-cult marketing hacks?

The Building Blocks Marketers Should Know
- Shaping – reinforcing successive approximations toward a target behaviour. Think onboarding flows that reward small steps en route to subscription or first purchase.
- Discriminative stimuli (SDs) – cues signalling that a behaviour will be rewarded. Your push notification or “Sale ends tonight” banner can be an SD if, in its presence, the desired action produces value.
- Token economies – points and tokens as generalised reinforcers exchangeable for rewards. That’s the logic behind points programmes and many gamified apps.
Schedules of Reinforcement: The Real Magic
Skinner and C. B. Ferster’s Schedules of Reinforcement showed that when you reward matters as much as what you reward. Different schedules produce distinct, predictable response patterns. Here’s the short course.
| Schedule | How it works | Behaviour pattern | Marketing analogy |
|---|---|---|---|
| Fixed Ratio (FR) | Reward after a set number of responses (e.g., every 5th action) | High rate, brief pause after reward | “Buy 4, get the 5th free” punch cards; predictable point thresholds |
| Variable Ratio (VR) | Reward after an unpredictable number of responses, around an average | Very high, steady responding; most resistant to extinction | Instant-win promotions, surprise-and-delight credits, some gamified streak bonuses |
| Fixed Interval (FI) | First response after a fixed time is rewarded | Scalloped: slow after reward, speeding up as the interval ends | Weekly member offers; predictable “Happy Hour” deals |
| Variable Interval (VI) | First response after varying time intervals is rewarded | Moderate, steady responding | Occasionally timed perks; unpredictable “drops” or content releases |
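To make the table concrete, here is a minimal Python sketch of the four schedules as small decision objects that answer one question: "does this response earn a reward?" It is an illustration, not Ferster and Skinner's procedure. The class names, the `respond` method and the uniform distributions used for the variable schedules are all assumptions made for this example.

```python
import random


class FixedRatio:
    """FR-n: reward every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now=None):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR-n: reward after an unpredictable number of responses averaging n."""

    def __init__(self, mean_n, rng=None):
        self.mean_n = mean_n
        self.rng = rng if rng is not None else random.Random()
        self._reset()

    def _reset(self):
        self.count = 0
        # Assumption: requirement drawn uniformly from 1..(2n - 1), so its mean is n.
        self.required = self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, now=None):
        self.count += 1
        if self.count >= self.required:
            self._reset()
            return True
        return False


class FixedInterval:
    """FI-t: reward the first response at least t time units after the last reward."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reward = 0.0

    def respond(self, now):
        if now - self.last_reward >= self.interval:
            self.last_reward = now
            return True
        return False


class VariableInterval:
    """VI-t: like FI, but the required wait varies around an average of t."""

    def __init__(self, mean_interval, rng=None):
        self.mean_interval = mean_interval
        self.rng = rng if rng is not None else random.Random()
        self.last_reward = 0.0
        self._draw()

    def _draw(self):
        # Assumption: wait drawn uniformly from 0..2t, so its mean is t.
        self.wait = self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, now):
        if now - self.last_reward >= self.wait:
            self.last_reward = now
            self._draw()
            return True
        return False


# A "buy 4, get the 5th free" punch card is FR-5:
card = FixedRatio(5)
print([card.respond() for _ in range(10)])
# → [False, False, False, False, True, False, False, False, False, True]
```

Note the difference in inputs: ratio schedules count responses, while interval schedules only care how much time has passed when a response arrives, which is why the interval classes need a `now` argument.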
Two advanced findings matter hugely for marketers:
- Partial reinforcement extinction effect (PREE): behaviours learned under intermittent rewards persist longer when rewards stop. This is why surprise rewards can build stickiness, but also why turning off a well-liked perk can sting for a long time.
- Accidental reinforcement risks: if you accidentally reward the wrong behaviour (e.g., panic-discounting after cart abandonment every time), you can condition customers to wait you out.
From Lab to Market: Practical Applications
1) Loyalty programmes and points systems
- Use Fixed Ratio (FR) schedules for clarity and habit formation at the start (e.g., “earn 100 points, get £5”), then blend in Variable Ratio (VR) elements for excitement and persistence (e.g., occasional mystery bonuses).
- Plan the extinction curve: if you ever remove perks, taper them or provide substitutes to avoid backlash and learned “withdrawal.” Academic work shows reward structures can undermine intrinsic motivation if they feel controlling; autonomy-supportive rewards work better.
- Remember that loyalty isn’t only conditioning. Status, habit and relational drivers all interplay, so design for more than points.
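As a rough sketch of the first point, the snippet below blends a predictable threshold ("earn 100 points, get £5") with an occasional mystery bonus. The bonus is modelled as a constant per-purchase chance, which behaves like a variable-ratio schedule averaging one bonus every `1 / bonus_chance` purchases. The function name, the 10% chance and the 50-point bonus are hypothetical values chosen for illustration, not figures from any real programme.

```python
import random

POINTS_PER_VOUCHER = 100  # from the "earn 100 points, get £5" example
VOUCHER_VALUE_GBP = 5
BONUS_CHANCE = 0.1        # hypothetical: roughly one purchase in ten
BONUS_POINTS = 50         # hypothetical mystery-bonus size


def record_purchase(balance, points_earned, rng, bonus_chance=BONUS_CHANCE):
    """Return (new_balance, vouchers_issued, got_bonus) for one purchase."""
    # Variable element: an unpredictable bonus on top of the earned points.
    got_bonus = rng.random() < bonus_chance
    if got_bonus:
        points_earned += BONUS_POINTS
    balance += points_earned
    # Fixed element: every full 100 points converts into one voucher.
    vouchers, balance = divmod(balance, POINTS_PER_VOUCHER)
    return balance, vouchers, got_bonus


rng = random.Random()
balance = 0
for points in (40, 35, 60):
    balance, vouchers, got_bonus = record_purchase(balance, points, rng)
    print(balance, vouchers * VOUCHER_VALUE_GBP, got_bonus)
```

Keeping the two elements separate makes it easy to dial the surprise component up or down, or switch it off with `bonus_chance=0.0`, without touching the threshold customers have already learned.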
2) Gamified engagement in apps and CRM
- Streaks and progress mechanics are essentially shaping plus Fixed Ratio (FR) reinforcement, often buffered with “streak freezes” to reduce the pain of breaking the chain. Duolingo is transparent about building habits this way.
- Variable rewards (VR schedules) are potent: use them sparingly and ethically. Product teams often cite Skinner when building variable reward loops.
3) Messaging as discriminative stimuli
- Treat push, email and in-app prompts as Discriminative Stimuli (SDs): clear, timely cues that, in their presence, the desired action produces value. If the promised consequence doesn’t materialise (e.g., “exclusive offer” that isn’t), you weaken the SD and condition non-response.
4) Promotions and “surprise-and-delight”
- Rotate Variable Ratio (VR) style surprise credits for advocacy or milestone moments, not just purchases, to reinforce community behaviours without teaching customers to wait for a discount.
- Time some benefits on Variable Interval (VI) schedules to maintain steady engagement without training exact “deal-hunting” moments.
5) Token economies and tiers
- Points are tokens. They become powerful when widely exchangeable and paired with meaningful “backup” reinforcers (experiences, access, recognition). This is token reinforcement 101.
Case notes: What this Looks Like in the Wild
- Duolingo: streaks and freezes are a masterclass in shaping, FR reinforcement and cushioning extinction. The company publicly discusses the habit-building science behind streaks.
- Starbucks Rewards: tiers, challenges and intermittent perks deploy token economies and mixed schedules to sustain participation. Studies and theses have examined how the interface nudges behaviour.
All These Acronyms
I Thought This Was Meant to Make Things More Clear?
Unfortunately, when you enter the world of science, especially psychology, acronyms are rife! It gets more confusing because labels like “SD” aren’t normal acronyms; they are “behaviour-analytic notation”:
- S stands for stimulus.
- The superscript D means discriminative (i.e., this stimulus sets the occasion for reinforcement). In print it’s written Sᴰ; in plain text we drop the superscript and write SD.
- When you’re talking about more than one, you just add the plural “s”: SDs = multiple discriminative stimuli.
Related symbols you’ll see in the same system:
- SΔ (S-delta): a stimulus signalling that a response will not be reinforced.
- SR: a reinforcing stimulus (sometimes written Sᴿ+ for positive, Sᴿ− for negative).
- SP: a punishing stimulus.
So “SDs” isn’t “short for the words” so much as “plural of the symbol Sᴰ” used in operant conditioning.
Hope this little interjection helps!

Guardrails: Manipulation, Regulation and Reputation
Skinner argued for a technology of behaviour, but modern marketing has guardrails. Variable rewards in games, especially loot boxes, have raised concerns about gambling-like mechanics for children. UK government and regulators have pressed for industry principles and clear advertising disclosures, with ASA guidance updated and UKIE publishing principles in 2023. If you use variable rewards, be transparent.
The ASA’s specific guidance on advertising in-game purchases is required reading if you work with gamified experiences or “advergames” in the UK.
Ethically, there’s also the intrinsic-motivation trap: heavy, controlling rewards can crowd out genuine brand affinity. Design for perceived autonomy, competence and relatedness, not just response rates.
Advanced Lessons for Advanced Marketers
- Engineer transitions between schedules. Don’t stick to one schedule forever. Many programmes start with FR/FI to teach the behaviour, then shift to VR/VI to maintain it more efficiently. But plan the handover.
- Mind the PREE. If you intermittently reward discount-seeking, you create stubborn price sensitivity. Use intermittent non-price rewards to avoid reinforcing deal-only behaviour.
- Superstition and spurious correlation. Be careful what you accidentally reward. If every abandoned basket gets an automatic coupon, you will shape more abandonment. Skinner’s pigeons are a cautionary tale.
- Design SDs that keep their promises. Consistent contingency between cue and consequence builds powerful stimulus control. Break that contingency and you teach customers to ignore you.
- Layer status, habit and relationship. Conditioning is a tool, not a religion. The best loyalty ecosystems integrate conditioning with status signalling and genuine relational benefits.
- Know the intellectual debate. Skinner’s behaviourism was later challenged by the cognitive turn, famously via Chomsky’s critique of Verbal Behavior. Useful context when you balance behavioural design with messaging and meaning.
A Quick Psychology-Based Playbook
- Define the target behaviour precisely and map the minimum rewarding path.
- Pick initial reinforcers customers genuinely value; test which consequences strengthen behaviour most.
- Start with continuous or fixed-ratio rewards to teach the behaviour, then migrate to variable schedules to maintain it.
- Use clear SDs for your prompts; make sure promised value follows.
- Shape complex behaviours with stepped rewards.
- Plan extinction: how do you phase out discounts or legacy perks without blowback?
- Monitor for unintended conditioning (e.g., training “wait for a code”).
- Stress-test against ASA/CAP guidance and age-sensitivity if gamifying.
TL;DR
Skinner showed that behaviour is shaped by its consequences and, crucially, by the schedule on which rewards arrive. For marketers, the “how and when” of rewards is the lever behind sticky loyalty, durable habits and responsive audiences. Use fixed schedules to teach behaviours and variable schedules to sustain them, but design ethically: be transparent, avoid accidental reinforcement of discount-seeking, and mind the ASA if you’re gamifying. Conditioning is a powerful tool, not a worldview; blend it with status and relationship design for loyalty that outlasts the next coupon.


