## Roadmap (post v0.0.1)

Prioritized from easiest/low-risk to more involved work. Check off as we ship.

### Quick wins (target v0.0.2)

- [x] Add crawl metadata (startedAt, finishedAt, durationMs)
- [x] Include run parameters in report (maxDepth, concurrency, timeout, userAgent, sameHostOnly)
- [x] Status histogram (2xx/3xx/4xx/5xx totals) in summary
- [x] Normalize and dedupe trailing `/.` URL variants in output
- [x] Add compact `reportSummary` text block to JSON
- [x] Top external domains with counts
- [x] Broken links sample (first N) + per-domain broken counts

### Core additions (default, no flags)

- [ ] Robots.txt summary (present, fetchedAt)
- [ ] Sitemap extras (index → child sitemaps, fetch errors)
- [ ] Per-page response time (responseTimeMs) and content length (basic)
- [ ] Basic page metadata: ``
- [ ] Depth distribution (count of pages by depth)
- [ ] Redirect map summary (from → to domain counts)

### Outputs and UX

- [ ] CSV exports: pages.csv, links.csv
- [ ] NDJSON export option for streaming pipelines

### Notes

- All report metrics must be gathered by default with zero flags required.
- Keep JSON stable and sorted; update `reports/REPORT_SCHEMA.md` when fields change.
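
As a rough illustration of the "stable and sorted" note and the status histogram item, here is a minimal sketch. It is written in Python purely for illustration, since the project's implementation language is not stated here. The helper names `status_histogram` and `stable_json`, and the `summary.statusHistogram` field path, are hypothetical and not taken from the actual report schema.

```python
import json
from collections import Counter


def status_histogram(status_codes):
    """Count responses per status class (2xx/3xx/4xx/5xx).

    Hypothetical helper: codes outside 200-599 are ignored.
    """
    counts = Counter(
        f"{code // 100}xx" for code in status_codes if 200 <= code <= 599
    )
    # Always emit all four buckets so the report shape is the same on every run.
    return {bucket: counts.get(bucket, 0) for bucket in ("2xx", "3xx", "4xx", "5xx")}


def stable_json(report):
    """Serialize with sorted keys so reports from repeated runs diff cleanly."""
    return json.dumps(report, sort_keys=True, indent=2)


if __name__ == "__main__":
    # "statusHistogram" is an assumed field name, used only for this example.
    report = {
        "summary": {
            "statusHistogram": status_histogram([200, 200, 301, 404, 500, 503])
        }
    }
    print(stable_json(report))
```

Emitting every bucket even when its count is zero keeps the JSON shape fixed, which makes schema documentation and run-to-run diffs simpler.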