## Roadmap (post v0.0.1)
Prioritized from the easiest, lowest-risk items to more involved work. Check items off as we ship them.
### Quick wins (target v0.0.2)
- [x] Add crawl metadata (startedAt, finishedAt, durationMs)
- [x] Include run parameters in report (maxDepth, concurrency, timeout, userAgent, sameHostOnly)
- [x] Status histogram (2xx/3xx/4xx/5xx totals) in summary
- [x] Normalize and dedupe trailing `/.` URL variants in output
- [ ] Add compact `reportSummary` text block to JSON
- [ ] Top external domains with counts
- [ ] Broken links sample (first N) + per-domain broken counts
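
As an illustration of the status histogram item above, here is a minimal sketch in Python. The function name `status_histogram` and the extra `other` bucket (for 1xx or out-of-range codes) are assumptions made for this example, not part of the current report schema.

```python
def status_histogram(status_codes):
    """Bucket HTTP status codes into 2xx/3xx/4xx/5xx totals.

    Hypothetical helper: the "other" bucket is an assumption so that
    unexpected codes (e.g. 1xx) are counted instead of dropped.
    """
    buckets = {"2xx": 0, "3xx": 0, "4xx": 0, "5xx": 0, "other": 0}
    for code in status_codes:
        key = f"{code // 100}xx"
        if key in buckets:
            buckets[key] += 1
        else:
            buckets["other"] += 1
    return buckets
```

Keeping every bucket present even when its count is zero means the summary has the same keys on every run, which fits the stable-JSON note at the end of this roadmap.
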
### Moderate scope
- [ ] Robots.txt summary (present, fetchedAt, sample disallow rules)
- [ ] Sitemap extras (index → child sitemaps, fetch errors)
- [ ] Per-page response time (responseTimeMs) and content length
- [ ] Basic page metadata: `<title>` and canonical URL (if present)
- [ ] Depth distribution (count of pages by depth)
- [ ] Duplicate title/canonical detection (lists of URLs)
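
The depth distribution and duplicate-title items could be computed roughly as follows. This is a sketch only: it assumes each page record is a dict with `url`, `depth`, and `title` keys, which is a hypothetical shape rather than the documented schema.

```python
from collections import Counter, defaultdict


def depth_distribution(pages):
    """Count pages per crawl depth, returned in ascending depth order."""
    counts = Counter(page["depth"] for page in pages)
    return {depth: counts[depth] for depth in sorted(counts)}


def duplicate_titles(pages):
    """Map each title shared by more than one page to its sorted URLs."""
    by_title = defaultdict(list)
    for page in pages:
        # Pages without a title are skipped rather than grouped together.
        title = (page.get("title") or "").strip()
        if title:
            by_title[title].append(page["url"])
    return {
        title: sorted(urls)
        for title, urls in sorted(by_title.items())
        if len(urls) > 1
    }
```

The same grouping approach would apply to canonical URLs by swapping the key that is read from each record.
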
### Content/asset analysis
- [ ] Extract assets (images/css/js) per page with status/type/size
- [ ] Mixed-content detection (http assets on https pages)
- [ ] Image accessibility metric (alt present ratio)
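
To make the mixed-content item concrete, the check could look something like the sketch below. The function name `mixed_content` and its inputs (a page URL plus the raw asset references found on that page) are assumptions for illustration.

```python
from urllib.parse import urljoin, urlsplit


def mixed_content(page_url, asset_urls):
    """Return assets that resolve to http:// when the page itself is https://."""
    if urlsplit(page_url).scheme != "https":
        return []
    # Resolve relative and protocol-relative references against the page URL
    # first, so only assets that are really fetched over http are flagged.
    resolved = (urljoin(page_url, asset) for asset in asset_urls)
    return [url for url in resolved if urlsplit(url).scheme == "http"]
```

Resolving before checking matters because relative and protocol-relative (`//host/path`) references inherit the page's scheme and are therefore not mixed content.
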
### Security and quality signals
- [ ] Security headers by host (HSTS, CSP, X-Frame-Options, Referrer-Policy)
- [ ] Insecure forms (http action on https page)
- [ ] Large pages and slow pages (p95 thresholds) summary
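
For the security headers item, a per-response presence check might be sketched as below. The label-to-header mapping uses the standard header names; the function name and the boolean report shape are assumptions, and a per-host summary would need to aggregate these results.

```python
# Labels as they appear in the roadmap, mapped to lowercase header names.
SECURITY_HEADERS = {
    "HSTS": "strict-transport-security",
    "CSP": "content-security-policy",
    "X-Frame-Options": "x-frame-options",
    "Referrer-Policy": "referrer-policy",
}


def security_header_report(headers):
    """Report which security headers are present in one response.

    `headers` is any mapping of header name to value; names are compared
    case-insensitively because HTTP header names are case-insensitive.
    """
    present = {name.lower() for name in headers}
    return {label: header in present for label, header in SECURITY_HEADERS.items()}
```

This only checks presence, not whether the header values are sensible; validating values would be a separate, heavier analysis.
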
### Link behavior and graph
- [ ] Redirect map (from → to, hops; count summary)
- [ ] Indegree/outdegree stats; small graph summary
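
One way to derive hop counts for the redirect map is to follow a single-hop `from → to` mapping until it ends. The sketch below assumes such a mapping has already been collected during the crawl; `resolve_redirects` and `max_hops` are hypothetical names.

```python
def resolve_redirects(redirects, start, max_hops=10):
    """Follow a from -> to map and return (final_url, hops).

    Stops when the chain ends, when a URL repeats (redirect loop),
    or after max_hops steps.
    """
    seen = {start}
    current, hops = start, 0
    while current in redirects and hops < max_hops:
        current = redirects[current]
        hops += 1
        if current in seen:
            break  # loop detected; report where it closed
        seen.add(current)
    return current, hops
```

For example, with `{"http://a": "https://a", "https://a": "https://a/"}`, starting from `"http://a"` yields `("https://a/", 2)`.
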
### Outputs and UX
- [ ] CSV exports: pages.csv, links.csv, assets.csv
- [ ] NDJSON export option for streaming pipelines
- [ ] Optional: include file/line anchors in JSON for large outputs
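
The NDJSON export amounts to writing one compact JSON object per line. A minimal sketch, assuming page records are plain dicts and that `write_ndjson` is a hypothetical helper name:

```python
import json


def write_ndjson(records, stream):
    """Write each record as one compact JSON object per line."""
    for record in records:
        # sort_keys keeps field order stable across runs; compact separators
        # avoid embedded whitespace so each record stays on a single line.
        line = json.dumps(record, sort_keys=True, separators=(",", ":"))
        stream.write(line + "\n")
```

Because each line is a complete document, downstream tools can process records as they arrive instead of waiting for the whole report.
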
### Notes
- Keep JSON output stable and sorted; avoid breaking changes. If we change fields, bump the minor version and document the change in `reports/REPORT_SCHEMA.md`.
- Favor opt-in flags for heavier analyses (assets, headers) to keep default runs fast.
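
As a sketch of what "stable and sorted" could mean in practice, the serializer below sorts object keys and orders page records by URL, so two runs over the same site produce byte-identical output. The top-level `pages` field and the function name are assumptions for this example; the real field names live in the report schema.

```python
import json


def stable_report_json(report):
    """Serialize a report deterministically: sorted keys, pages ordered by URL."""
    pages = sorted(report.get("pages", []), key=lambda page: page["url"])
    return json.dumps({**report, "pages": pages}, sort_keys=True, indent=2)
```

Deterministic output keeps diffs between report versions small and makes it easier to spot accidental schema changes in review.
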