## Roadmap (post v0.0.1)

Prioritized from easiest/low-risk to more involved work. Check off as we ship.

### Quick wins (target v0.0.2)
- [x] Add crawl metadata (`startedAt`, `finishedAt`, `durationMs`)
- [x] Include run parameters in report (`maxDepth`, `concurrency`, `timeout`, `userAgent`, `sameHostOnly`)
- [x] Status histogram (2xx/3xx/4xx/5xx totals) in summary
- [x] Normalize and dedupe trailing `/.` URL variants in output
- [x] Add compact `reportSummary` text block to JSON
- [x] Top external domains with counts
- [x] Broken links sample (first N) + per-domain broken counts
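
As a rough illustration of the status histogram item above, here is a minimal Python sketch. The function name `status_histogram` and the bucket labels are assumptions made for this example, not the project's actual code; the roadmap only states that 2xx/3xx/4xx/5xx totals appear in the summary.

```python
def status_histogram(status_codes):
    """Count responses per status class (2xx, 3xx, 4xx, 5xx).

    Hypothetical helper: the name and return shape are assumptions.
    """
    # Start every bucket at zero so the report keys are always present,
    # even when a class never occurred during the crawl.
    histogram = {"2xx": 0, "3xx": 0, "4xx": 0, "5xx": 0}
    for code in status_codes:
        bucket = f"{code // 100}xx"
        if bucket in histogram:  # skip 1xx and out-of-range codes
            histogram[bucket] += 1
    return histogram
```

Pre-seeding the buckets keeps the summary shape identical across runs, which fits the note about keeping the JSON stable.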

### Core additions (default, no flags)
- [x] Robots.txt summary (`present`, `fetchedAt`)
- [x] Sitemap extras (index → child sitemaps, fetch errors)
- [x] Per-page response time (`responseTimeMs`) and content length (basic)
- [x] Basic page metadata: `<title>`
- [x] Depth distribution (count of pages by depth)
- [x] Redirect map summary (from → to domain counts)

### Outputs and UX
- [ ] CSV exports: `pages.csv`, `links.csv`
- [ ] NDJSON export option for streaming pipelines
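
The NDJSON export is not implemented yet, so the following is only one possible shape for it, written as a Python sketch. The function name `write_ndjson` and the record fields (`url`, `depth`) are assumptions for illustration. The idea is one compact JSON object per line, so a downstream tool can process pages as they arrive instead of loading the whole report.

```python
import io
import json


def write_ndjson(records, stream):
    """Write each record as one compact JSON object per line.

    Hypothetical helper: name and record fields are assumptions.
    """
    for record in records:
        # sort_keys keeps key order stable across runs; compact separators
        # guarantee the object stays on a single line.
        stream.write(json.dumps(record, sort_keys=True, separators=(",", ":")))
        stream.write("\n")


# Usage: collect two example page records into an in-memory buffer.
buffer = io.StringIO()
write_ndjson(
    [
        {"url": "https://example.com/", "depth": 0},
        {"url": "https://example.com/about", "depth": 1},
    ],
    buffer,
)
```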

### Notes
- All report metrics must be gathered by default with zero flags required.
- Keep JSON stable and sorted; update `reports/REPORT_SCHEMA.md` when fields change.
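
One way to read "stable and sorted" is that the same report data always serializes to the same bytes, regardless of the order in which fields were collected. The Python sketch below assumes that interpretation; `dump_report` is a hypothetical name, not a function from this project.

```python
import json


def dump_report(report):
    """Serialize a report dict deterministically.

    Hypothetical helper: assumes "stable" means byte-identical output
    for equal data.
    """
    # sort_keys orders object keys alphabetically at every nesting level;
    # a fixed indent and trailing newline keep diffs between runs minimal.
    return json.dumps(report, sort_keys=True, indent=2, ensure_ascii=False) + "\n"
```

With this approach, two reports that differ only in insertion order produce identical text, so a diff shows only real changes.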