Competitor & Strategy
Competitor & Strategy
Live

Product Cannibalization

Find product pages on your site competing for the same queries.

Product list

One per line: URL | feature keywords | sales/clicks

Product cluster 1 (4 variants)

hiking
boots
merrell
waterproof
primary
1,200 sales/clicks
/products/moab-3-mid
hiking boots merrell mid waterproof
consolidate / differentiate
320 sales/clicks
/products/moab-3-mid-wide
hiking boots merrell mid waterproof wide
consolidate / differentiate
220 sales/clicks
/products/moab-2-mid
hiking boots merrell mid waterproof
consolidate / differentiate
180 sales/clicks
/products/moab-3-low
hiking boots merrell low waterproof

Product cluster 2 (2 variants)

hiking
salomon
trail
primary
880 sales/clicks
/products/x-ultra-4
hiking shoes salomon trail
consolidate / differentiate
460 sales/clicks
/products/x-ultra-4-mid
hiking boots salomon mid trail

Product cluster 3 (2 variants)

trail
running
altra
primary
540 sales/clicks
/products/lone-peak-7
trail running shoes altra zero-drop
consolidate / differentiate
80 sales/clicks
/products/lone-peak-7-mid
trail running mid altra

Start here · What is product cannibalization?

Multiple SKUs can satisfy the same shopper query when titles and bullets reuse the same phrases. Internal search and Google may rank the wrong PDP.

Each line is URL | feature keywords | sales-or-clicks. Tokens strip punctuation and stopwords before similarity math.

The tool unions SKUs whose Jaccard similarity clears your threshold, picks the highest metric SKU as primary, and surfaces shared tokens that explain the collision.

When to use this tool

  • Variant sprawl cleanup

    Wide versus mid versus low cuts of the same boot might share nearly identical descriptive tokens.

  • Marketplace seller hygiene

    Detect duplicate listings that compete for the same long-tail modifiers.

  • Merchandising strategy

    Decide which SKU keeps the head term and which needs differentiated positioning or canonical policy.

  • Data science handoffs

    Export clusters for finance teams comparing revenue concentration across near-duplicate PDPs.

Examples

Walk through these with the form above — they are practice scenarios, not live data.

Adjust sensitivity

Try this

Drag Similarity threshold down to 0.35 to see looser clusters; raise toward 0.8 for strict matches.

What to look for

Conflict cards grow or shrink depending on how aggressively you define duplication.

Primary selection

Try this

Ensure sales numbers reflect your analytics—not future forecasts—before trusting the primary badge.

What to look for

Highest sales URL becomes primary; others get consolidate guidance that still needs merchandiser judgement.

Short tutorial

Follow in order the first time you use the tool; later you can skip to the step you need.

  1. Step 1 — Build the paste file

    One PDP per line with human-readable keyword phrases and a numeric performance metric.

  2. Step 2 — Normalize phrases honestly

    If two rows share marketing fluff only, they may cluster unfairly. Edit copy upstream.

  3. Step 3 — Tune threshold

    Start at 0.5 default, then widen or tighten to match merchandising tolerance.

  4. Step 4 — Review clusters

    Read shared token chips to confirm shoppers would conflate the SKUs.

  5. Step 5 — Pair with site-wide cannibal workflows

    Use Keyword cannibalization for query-level conflicts and Ecommerce site auditor for template health.

More detail

New here? Skim Start here first, then run one Examples scenario in the form above.

Product Cannibalization does one job: find product pages on your site competing for the same queries. It lives under Competitor & Strategy on SEOToolkits, where the beginner idea is simple: Competitor SEO compares your site with ranking pages so you can find realistic gaps and advantages.

FAQ

Does this read GTIN or SKU codes?
Only the keyword phrase and URL text you enter. Add identifiers externally if needed.
Can it auto-merge products?
No. It surfaces clusters. PIM, ERP, and CMS teams still execute merges.
What metric should I use?
Sales, revenue, or Search Console clicks all work as long as the column stays numeric and consistent.
International duplicates?
Separate locales into different pastes so language tokens do not distort similarity.

Same workflow cluster on SEOToolkits — open another module without leaving context.