Duplication Detection

Vibesweep identifies duplicate code blocks, similar functions, and repeated patterns that bloat your codebase and make maintenance harder.

Types of Duplication

Exact Duplicates

Identical code blocks appearing in multiple places:

// File: utils/validation.js
function validateEmail(email) {
  const re = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return re.test(email);
}

// File: components/ContactForm.js
function validateEmail(email) {
  const re = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return re.test(email);
}
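
As a rough illustration of how exact duplicates like these could be found, the sketch below groups snippets whose text is identical once whitespace is collapsed. The helper name `findExactDuplicates` and the `{ file, code }` input shape are assumptions made for this example, not part of Vibesweep.

```javascript
// Hypothetical helper, not Vibesweep's implementation: group snippets whose
// text is identical after collapsing runs of whitespace to a single space.
// Note: this also collapses whitespace inside string literals, which a real
// tool would need to treat more carefully.
function findExactDuplicates(snippets) {
  const groups = new Map();
  for (const { file, code } of snippets) {
    const key = code.replace(/\s+/g, " ").trim();
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(file);
  }
  // Keep only the groups that occur in more than one place.
  return [...groups.values()].filter((files) => files.length > 1);
}

const snippets = [
  { file: "utils/validation.js", code: "function validateEmail(email) {\n  return re.test(email);\n}" },
  { file: "components/ContactForm.js", code: "function validateEmail(email) {\n    return re.test(email);\n}" },
  { file: "utils/format.js", code: "function formatCurrency(amount) {\n  return amount.toFixed(2);\n}" },
];

console.log(findExactDuplicates(snippets));
```

The two `validateEmail` copies differ only in indentation, so they land in the same group; the unrelated function is left out.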

Near Duplicates

Similar code with minor variations:

// Function 1
function calculateUserDiscount(user) {
  if (user.isPremium) return 0.20;
  if (user.purchases > 10) return 0.10;
  return 0.05;
}

// Function 2 - 92% similar
function calculateMemberDiscount(member) {
  if (member.isPremium) return 0.20;
  if (member.orders > 10) return 0.10;
  return 0.05;
}
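
One possible way to consolidate the two functions above, assuming the only real difference is which field holds the count. The name `calculateDiscount` and its `countField` parameter are hypothetical, introduced only for this sketch.

```javascript
// Hypothetical consolidation: the field that holds the count becomes a
// parameter, and the discount rules live in one place.
function calculateDiscount(account, countField) {
  if (account.isPremium) return 0.20;
  if (account[countField] > 10) return 0.10;
  return 0.05;
}

// Thin wrappers keep the original call sites working.
const calculateUserDiscount = (user) => calculateDiscount(user, "purchases");
const calculateMemberDiscount = (member) => calculateDiscount(member, "orders");

console.log(calculateUserDiscount({ isPremium: false, purchases: 12 }));
console.log(calculateMemberDiscount({ isPremium: true, orders: 0 }));
```

Whether this is worth doing depends on whether the user and member rules are expected to stay in step; if they are likely to diverge, keeping them separate may be the better choice.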

Structural Duplicates

Same logic pattern, different implementation:

// Pattern 1: Traditional loop
let total = 0;
for (let i = 0; i < items.length; i++) {
  total += items[i].price;
}

// Pattern 2: Modern approach
const total = items.reduce((sum, item) => sum + item.price, 0);

AI-Generated Duplication

AI assistants often create duplication by:

Rewriting Existing Functions

// Original function
export function formatCurrency(amount) {
  return `$${amount.toFixed(2)}`;
}

// AI rewrote instead of importing
function formatPrice(value) {
  return `$${value.toFixed(2)}`;
}

Verbose Implementations

// AI generated - 12 lines
function getMaxValue(numbers) {
  if (!numbers || numbers.length === 0) {
    return null;
  }
  let max = numbers[0];
  for (let i = 1; i < numbers.length; i++) {
    if (numbers[i] > max) {
      max = numbers[i];
    }
  }
  return max;
}

// Could be: numbers?.length ? Math.max(...numbers) : null
// (Math.max() on its own returns -Infinity for an empty array, not null)

Detection Algorithm

Vibesweep uses multiple strategies to find duplicates:

  1. Token-based Analysis: Compares code structure ignoring variable names
  2. AST Comparison: Matches similar code patterns at the syntax level
  3. Fuzzy Matching: Finds near-duplicates with configurable thresholds
  4. Cross-file Analysis: Detects duplication across your entire project
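
To make the first and third strategies more concrete, here is a minimal sketch of token-based comparison with a fuzzy similarity score. It is an illustration under stated assumptions, not Vibesweep's actual algorithm: the tokenizer is a simple regular expression, every non-keyword identifier is replaced with a placeholder, and similarity is the Jaccard index over 3-token windows. All function names are hypothetical.

```javascript
// Hypothetical sketch; not Vibesweep's implementation.
const KEYWORDS = new Set(["function", "return", "if", "else", "for", "while", "const", "let", "var"]);

// Split source into identifiers, numbers, and single punctuation characters,
// then replace every non-keyword identifier with "ID" so that renamed
// variables no longer affect the comparison.
function normalizeTokens(source) {
  const tokens = source.match(/[A-Za-z_$][\w$]*|\d+(?:\.\d+)?|[^\s\w]/g) ?? [];
  return tokens.map((t) => (/^[A-Za-z_$]/.test(t) && !KEYWORDS.has(t) ? "ID" : t));
}

// Collect the set of n-token windows ("shingles") of a token sequence.
function shingles(tokens, n = 3) {
  const out = new Set();
  for (let i = 0; i + n <= tokens.length; i++) {
    out.add(tokens.slice(i, i + n).join(" "));
  }
  return out;
}

// Jaccard similarity of the two shingle sets: |A ∩ B| / |A ∪ B|.
function similarity(sourceA, sourceB) {
  const setA = shingles(normalizeTokens(sourceA));
  const setB = shingles(normalizeTokens(sourceB));
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  for (const s of setA) if (setB.has(s)) shared++;
  return shared / (setA.size + setB.size - shared);
}

const first = "function calculateUserDiscount(user) { if (user.isPremium) return 0.20; return 0.05; }";
const second = "function calculateMemberDiscount(member) { if (member.isPremium) return 0.20; return 0.05; }";
console.log(similarity(first, second)); // 1: the snippets differ only in identifier names
```

A score like this would then be compared against a configurable threshold to decide whether two blocks count as near-duplicates.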

Configuration

Customize duplication detection in .vibesweeprc.json:

{
  "duplication": {
    "enabled": true,
    "minTokens": 50,
    "threshold": 0.85,
    "ignorePatterns": [
      "**/test/**",
      "**/*.config.js"
    ],
    "languages": {
      "javascript": {
        "minTokens": 40
      },
      "typescript": {
        "minTokens": 40
      }
    }
  }
}
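
The sketch below shows one plausible reading of how these settings interact. It rests on two assumptions that the configuration example does not confirm: that a per-language `minTokens` overrides the top-level value, and that a block is reported only when it is at least `minTokens` long and its similarity reaches `threshold`. The function names are hypothetical.

```javascript
// Mirrors the example .vibesweeprc.json above (ignorePatterns omitted).
const config = {
  duplication: {
    enabled: true,
    minTokens: 50,
    threshold: 0.85,
    languages: { javascript: { minTokens: 40 }, typescript: { minTokens: 40 } },
  },
};

// ASSUMPTION: a language-specific minTokens takes precedence over the top-level one.
function effectiveMinTokens(config, language) {
  const { minTokens, languages = {} } = config.duplication;
  return languages[language]?.minTokens ?? minTokens;
}

// ASSUMPTION: a candidate is reported only if detection is enabled, the block
// is long enough, and its similarity score reaches the threshold.
function wouldReport(config, language, candidate) {
  const { enabled, threshold } = config.duplication;
  return (
    enabled &&
    candidate.tokens >= effectiveMinTokens(config, language) &&
    candidate.similarity >= threshold
  );
}

console.log(effectiveMinTokens(config, "javascript")); // 40
console.log(effectiveMinTokens(config, "python"));     // 50
console.log(wouldReport(config, "javascript", { tokens: 45, similarity: 0.92 })); // true
console.log(wouldReport(config, "python", { tokens: 45, similarity: 0.92 }));     // false
```

Under this reading, lowering `minTokens` for a language surfaces shorter duplicates in that language, while raising `threshold` restricts reports to closer matches.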

Understanding the Report

🔍 Duplication Analysis
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Found 5 duplicate blocks across 12 files

📍 Duplicate Block #1 (87 tokens, 4 instances)
   Similarity: 100%
   
   src/utils/format.js:12-25
   src/components/Table.js:45-58
   src/pages/Dashboard.js:102-115
   src/helpers/display.js:8-21
   
   💡 Consider extracting to shared utility function

Fixing Duplicates

1. Extract Common Functions

// Before: Duplicated in 3 files
function parseUserData(raw) {
  return {
    id: raw.user_id,
    name: raw.full_name,
    email: raw.email_address
  };
}

// After: Shared utility
// utils/user.js
export function parseUserData(raw) {
  return {
    id: raw.user_id,
    name: raw.full_name,
    email: raw.email_address
  };
}

// Import where needed
import { parseUserData } from '@/utils/user';

2. Create Shared Components

// Before: Similar error messages everywhere
<div className="error-box">
  <p className="error-text">{error.message}</p>
</div>

// After: Reusable component
export function ErrorMessage({ error }) {
  return (
    <div className="error-box">
      <p className="error-text">{error.message}</p>
    </div>
  );
}

3. Use Configuration Objects

// Before: Repeated API configurations
const headers1 = { 'Authorization': token, 'Content-Type': 'application/json' };
const headers2 = { 'Authorization': token, 'Content-Type': 'application/json' };

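// After: Shared config
// Built on each call so the token is read fresh; a header value must be a string, not a function
export function getApiHeaders() {
  return {
    'Authorization': getToken(),
    'Content-Type': 'application/json'
  };
}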

Pro Features

Vibesweep Pro enhances duplication detection with:

  • Semantic Analysis: Understand code meaning, not just structure
  • Auto-refactoring: Safely extract and consolidate duplicate code
  • Cross-repository: Find duplication across multiple repos
  • Impact Analysis: See maintenance cost of duplicates
  • Team Reports: Track who introduces duplication

Best Practices

  1. Review Before Removing: Some duplication is intentional
  2. Consider Context: Similar code might have different purposes
  3. Test After Refactoring: Ensure behavior remains identical
  4. Document Decisions: Note why some duplicates are kept
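
For the third practice, one lightweight approach is to keep the old implementation around temporarily and compare it with the replacement on representative inputs, including edge cases. The sketch below does this for the `getMaxValue` example from earlier on this page; the names `getMaxValueOld`, `getMaxValueNew`, and the chosen inputs are assumptions for illustration.

```javascript
// Old implementation, kept temporarily so the refactor can be compared against it.
function getMaxValueOld(numbers) {
  if (!numbers || numbers.length === 0) {
    return null;
  }
  let max = numbers[0];
  for (let i = 1; i < numbers.length; i++) {
    if (numbers[i] > max) {
      max = numbers[i];
    }
  }
  return max;
}

// Refactored version: keeps the null result for missing or empty input,
// which a bare Math.max(...numbers) would not.
function getMaxValueNew(numbers) {
  return numbers && numbers.length > 0 ? Math.max(...numbers) : null;
}

// Representative inputs, including the edge cases most likely to diverge.
const cases = [[3, 1, 2], [-5, -2], [7], [], null];
const mismatches = cases.filter((input) => getMaxValueOld(input) !== getMaxValueNew(input));

console.log(mismatches.length === 0 ? "behavior matches" : mismatches);
```

Once the comparison passes and the call sites have been switched over, the old copy can be deleted.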
