Web Archives

Internet Archive Wayback CDX API

Historical web capture index for domain and URL research.

slug: wayback-machine-cdx-api
priority: 98
reviewed: Apr 24, 2026

Build now | Phase 1 | Low risk | API | approved

01 overview

How this source is shaped

This is one of the highest-value sources for OSINT. It supports historical reconstruction of websites, including page availability, past redirects, historical robots.txt exposure and the context of deleted content.

Source type: Archive
Access model: Free
Pricing model: Free Public API With Usage Considerations
API available: Yes
Requires account: No
Risk level: Low
Sensitivity: Normal
Integration phase: Phase 1
Integration priority: 98

02 scoring

Review dimensions

Each dimension is graded on a 0–10 scale. The overall score is a weighted aggregate.

overall score

8.74/10

Weighted aggregate across the eight review dimensions.

Authority: reputation and provenance of the source

9.30/10

Data quality: accuracy, coverage, completeness

8.50/10

Usability: how quickly an analyst can extract value

7.80/10

API: shape, stability and cost of programmatic access

8.60/10

Documentation: how well the source is explained and referenced

8.00/10

Freshness: how up-to-date the data stream is

8.20/10

Ethical fit: alignment with our ethical OSINT posture

9.30/10

Commercial value: product leverage and monetisable surface

9.40/10
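
As a rough illustration of how a weighted aggregate over these eight dimensions could be computed, here is a minimal sketch. The function name and the equal weights are hypothetical; the catalog does not state the weights it actually uses.

```python
# Dimension scores as published in this review.
SCORES = {
    "authority": 9.30,
    "data_quality": 8.50,
    "usability": 7.80,
    "api": 8.60,
    "documentation": 8.00,
    "freshness": 8.20,
    "ethical_fit": 9.30,
    "commercial_value": 9.40,
}


def weighted_score(scores, weights):
    """Weighted mean of the scores; weights need not sum to 1."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical equal weights, used only because the real weights are not published.
EQUAL_WEIGHTS = {name: 1.0 for name in SCORES}
```

With equal weights, `weighted_score(SCORES, EQUAL_WEIGHTS)` comes out at roughly 8.64. If the overall score is a weighted mean, the published 8.74 would suggest that some of the higher-scoring dimensions carry more weight.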

03 application

Where this source fits

What analysts use it for and, just as important, where it does not belong.

Primary use cases

domain_history
deleted_page_recovery
redirect_history
content_change_detection
evidence_context
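
Two of the use cases above, content_change_detection and redirect_history, can be sketched over capture records that have already been fetched. This is an illustrative sketch only: the function names are hypothetical, and the records are assumed to be dicts keyed by the CDX field names `timestamp`, `original`, `statuscode` and `digest`.

```python
def content_changes(captures):
    """Return captures whose digest differs from the previous capture of the same URL.

    The first capture of each URL is treated as a baseline and is not reported.
    """
    last_digest = {}
    changes = []
    # 14-digit CDX timestamps sort chronologically as plain strings.
    for cap in sorted(captures, key=lambda c: c["timestamp"]):
        url = cap["original"]
        if url in last_digest and last_digest[url] != cap["digest"]:
            changes.append(cap)
        last_digest[url] = cap["digest"]
    return changes


def redirect_captures(captures):
    """Return captures recorded with a 3xx status code."""
    return [c for c in captures if c.get("statuscode", "").startswith("3")]
```

A changed digest indicates that the archived payload differed between two captures. It does not say what changed, so a report should link both captures for comparison.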

Suitable for

journalists
seo_analysts
threat_researchers
compliance_analysts

Not suitable for

real_time_monitoring
private_content_recovery

data types

web_captures
historical_urls
timestamps
status_codes
mime_types
digests
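
These data types correspond to fields in a CDX response. The sketch below assumes the JSON output form (`output=json`), where the first row is a header naming the fields and each later row is one capture. The `Capture` class and `parse_cdx_rows` function are hypothetical names, and treating the timestamps as UTC is an assumption worth verifying.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Capture:
    url: str
    captured_at: datetime
    status_code: str  # kept as text because some rows carry "-" instead of a number
    mime_type: str
    digest: str


def parse_cdx_rows(rows):
    """Turn CDX JSON rows (header row first) into Capture records."""
    if not rows:
        return []
    header, *data = rows
    captures = []
    for row in data:
        record = dict(zip(header, row))
        # CDX timestamps are 14 digits: YYYYMMDDhhmmss (assumed UTC here).
        captured_at = datetime.strptime(record["timestamp"], "%Y%m%d%H%M%S")
        captures.append(
            Capture(
                url=record["original"],
                captured_at=captured_at.replace(tzinfo=timezone.utc),
                status_code=record["statuscode"],
                mime_type=record["mimetype"],
                digest=record["digest"],
            )
        )
    return captures
```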

04 opinion

Editorial take

Our qualitative read on the source: tone, framing and trust posture.

This should be one of the first real integrations. It is ethical, explainable and useful for SEO, journalism, compliance and cyber exposure reports.

05 product

Integration stance

Build, buy or defer. What shape the product integration would take, and why.

Build a Domain History module: capture timeline, first seen, last seen, status changes, content-type distribution and important archived URLs.
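
A minimal sketch of the summary such a module might compute, assuming capture records are dicts keyed by the CDX field names `timestamp`, `statuscode` and `mimetype`. The function name and the output shape are hypothetical, not a settled design.

```python
from collections import Counter


def summarize_history(captures):
    """Summarize captures into first/last seen, status changes and content types."""
    # 14-digit CDX timestamps sort chronologically as plain strings.
    ordered = sorted(captures, key=lambda c: c["timestamp"])
    if not ordered:
        return {
            "capture_count": 0,
            "first_seen": None,
            "last_seen": None,
            "status_changes": [],
            "content_types": {},
        }

    # Record the initial status, then every point where the status differs
    # from the capture before it.
    status_changes = []
    previous = None
    for cap in ordered:
        if cap["statuscode"] != previous:
            status_changes.append((cap["timestamp"], cap["statuscode"]))
            previous = cap["statuscode"]

    return {
        "capture_count": len(ordered),
        "first_seen": ordered[0]["timestamp"],
        "last_seen": ordered[-1]["timestamp"],
        "status_changes": status_changes,
        "content_types": dict(Counter(c["mimetype"] for c in ordered)),
    }
```

Here "first seen" and "last seen" mean the first and last archived captures. They are not evidence of when a page was actually created or removed.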

06 governance

Ethics and compliance

What to handle carefully, and what must not ship without sign-off.

Ethical notes

Do not imply archived content is current. Always label timestamps clearly and preserve context.
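
One way to apply this in a report is to attach an explicit capture time to every archived item. The sketch below is an assumption about how that could look: the function names and label wording are hypothetical, and the timestamp is treated as UTC.

```python
from datetime import datetime, timezone


def capture_label(timestamp, original_url):
    """Build a human-readable label that marks a capture as historical."""
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    when = captured.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"Archived capture of {original_url} taken {when} UTC; "
        "may not reflect the current page"
    )


def replay_url(timestamp, original_url):
    """Link to the Wayback Machine replay of a specific capture."""
    return f"https://web.archive.org/web/{timestamp}/{original_url}"
```

Showing the label next to the replay link keeps the capture date attached to the evidence wherever it is quoted.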

Compliance notes

Respect the Internet Archive's usage expectations: pace requests and avoid abusive bulk queries.
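
A minimal sketch of polite request pacing, assuming a bounded query (`limit`) and a fixed pause between calls. The `Throttle` class and `build_cdx_url` helper are hypothetical, and the two-second interval is an illustrative value, not a limit published by the Internet Archive.

```python
import time
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(url, limit=100, **params):
    """Build a bounded CDX query URL; extra params are passed through as-is."""
    query = {"url": url, "output": "json", "limit": limit, **params}
    return f"{CDX_ENDPOINT}?{urlencode(query)}"


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Illustrative pacing value only; tune it to the archive's current guidance.
throttle = Throttle(min_interval=2.0)
```

Calling `throttle.wait()` before each request keeps a batch job to a steady pace, and the injected `clock` and `sleep` arguments make the pacing testable without real delays.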

07 technical

Metadata

Catalog-side technical footer. Values as recorded in the source row.

source owner: Internet Archive
report module: domain_history
integration candidate: true
