# Search Indexing Guidelines

## Problem

Dynamic filtering components (like the Contract Addresses Table and DVN Addresses Table) update URLs with query parameters to make filter states shareable:

```text
/deployments/deployed-contracts?chains=ethereum,bsc&stages=mainnet
/deployments/dvn-addresses?dvns=layerzero&chains=polygon
```

Search indexers like Typesense treat each unique URL as a separate page, creating hundreds of duplicate entries for the same underlying page content.
## Solution
We implement a multi-layered approach to prevent duplicate content indexing:
### 1. Meta Tags (Primary Solution)

The `useSearchIndexing` hook automatically adds the appropriate meta tags when filters are active:

```html
<!-- When filters are active: -->
<link rel="canonical" href="/deployments/deployed-contracts" />
<meta name="robots" content="noindex, follow" />
```

How it works:

- Canonical URL: points to the base page without query parameters
- Noindex: prevents the filtered page from being indexed while still allowing its links to be followed
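
For reference, the sketch below shows one way a hook like this can manage those tags with a React `useEffect`. The names and details here are illustrative; the actual implementation in `../hooks/useSearchIndexing` may differ.

```tsx
import {useEffect} from 'react';

// Illustrative sketch only; the real hook lives in ../hooks/useSearchIndexing
// and may be implemented differently.
export function useSearchIndexing(hasActiveFilters: boolean): void {
  useEffect(() => {
    if (!hasActiveFilters) return;

    // Canonical link pointing at the base page (current path without the query string)
    const canonical = document.createElement('link');
    canonical.setAttribute('rel', 'canonical');
    canonical.setAttribute('href', window.location.pathname);

    // Robots meta tag: do not index this filtered view, but keep following its links
    const robots = document.createElement('meta');
    robots.setAttribute('name', 'robots');
    robots.setAttribute('content', 'noindex, follow');

    document.head.appendChild(canonical);
    document.head.appendChild(robots);

    // Cleanup: remove the tags when filters are cleared or the component unmounts
    return () => {
      canonical.remove();
      robots.remove();
    };
  }, [hasActiveFilters]);
}
```

Because the cleanup function runs whenever `hasActiveFilters` flips back to `false`, the tags disappear as soon as filters are cleared.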
### 2. Robots.txt (Secondary Protection)

Additional robots.txt rules block crawlers from accessing filtered URLs:

```txt
# Block crawling of filtered URLs
Disallow: /*?*chains=*
Disallow: /*?*dvns=*
Disallow: /*?*stages=*
Disallow: /*?*page=*
```
### 3. Usage in Components

For any component with filtering, use the hook:

```tsx
import {useState} from 'react';
import {useSearchIndexing} from '../hooks/useSearchIndexing';

function MyFilterableComponent() {
  const [filters, setFilters] = useState<string[]>([]);
  const [searchTerm, setSearchTerm] = useState('');

  // Determine if any filters are active (include any other filter conditions here)
  const hasActiveFilters = filters.length > 0 || searchTerm.length > 0;

  // Apply search indexing rules while filters are active
  useSearchIndexing(hasActiveFilters);

  // ... rest of component
}
```
## Benefits
- Prevents Duplicate Content: Search engines won't index multiple versions of the same page
- Maintains Shareability: URLs with filters still work for users sharing links
- Preserves SEO: Base pages retain their search ranking
- Automatic: Works without manual intervention once implemented
## Implementation Details
- Canonical URLs tell search engines which version is the "master"
- Noindex prevents duplicate content while preserving link equity
- Robots.txt provides an additional layer of protection
- Cleanup ensures meta tags are removed when filters are cleared
## Best Practices
- Always use the hook for any component that modifies URLs based on user interaction
- Test thoroughly to ensure meta tags are added and removed correctly (see the test sketch after this list)
- Monitor search console for any remaining duplicate content issues
- Consider analytics when deciding which filtered pages (if any) should be indexed
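
As a starting point for that testing, the sketch below exercises the hook with Jest and `@testing-library/react` in a jsdom environment. The test setup and the exact assertions are assumptions, so adapt them to the tags the hook actually renders.

```tsx
import {renderHook} from '@testing-library/react';
import {useSearchIndexing} from '../hooks/useSearchIndexing';

// Assumes a Jest + jsdom setup and @testing-library/react v13.1+ (which exports renderHook).
test('adds indexing meta tags while filters are active and removes them when cleared', () => {
  const {rerender, unmount} = renderHook(
    ({active}) => useSearchIndexing(active),
    {initialProps: {active: true}},
  );

  // Filters active: canonical link and noindex robots tag should be present in <head>
  expect(document.head.querySelector('link[rel="canonical"]')).not.toBeNull();
  expect(
    document.head.querySelector('meta[name="robots"]')?.getAttribute('content'),
  ).toContain('noindex');

  // Filters cleared: both tags should be removed again
  rerender({active: false});
  expect(document.head.querySelector('link[rel="canonical"]')).toBeNull();
  expect(document.head.querySelector('meta[name="robots"]')).toBeNull();

  unmount();
});
```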
## Monitoring
Check Google Search Console and other search tools for:
- Duplicate content warnings
- Pages with canonical issues
- Unexpected indexed URLs with query parameters
## References
This approach follows industry best practices used by:
- Major ecommerce sites (Amazon, eBay)
- Documentation platforms (Algolia, GitLab)
- Content management systems (WordPress, Drupal)
See Webflow's pagination approach and ecommerce faceted navigation best practices for more details.