RSS Feed Monitor - Google Alerts
This repository contains validated Google Alert queries for monitoring repair-related discussions across Canadian platforms.
⚠️ START HERE
✨ NEW: Production-Ready Reddit Alerts Available!
Use docs/google-alerts-reddit-tuned.md for validated, high-performance alerts that produce regular, relevant results.
Read REDDIT_ALERTS_COMPLETE.md for test results showing 100% success rate and 10/10 relevant results.
Files
Documentation
- docs/google-alerts-reddit-tuned.md - ✨ START HERE - 25 production-ready alerts (100% validated)
- REDDIT_ALERTS_COMPLETE.md - ✨ READ SECOND - Complete test results and setup guide
- docs/REDDIT_KEYWORDS.md - Consumer language keyword conversion table
- docs/google-alerts-broad.md - Original 84 alerts (needs tuning)
- docs/google-alerts.md - Regional Reddit queries (61 alerts, low volume)
- docs/PLAYWRIGHT_SCRAPING.md - Guide to Playwright scraping with anti-detection
- docs/PLAYWRIGHT_RECORDING.md - Guide to recording alert setup with codegen
Python Tools
- scripts/validate_alerts.py - Validator tool that checks queries and generates fixes
- scripts/generate_broad_queries.py - Generates location-based broad queries
Playwright Tools (NEW)
- scripts/human-behavior.js - Human-like behavior library for bot detection avoidance
- scripts/playwright-scraper.js - Main scraper with Google search validation
- scripts/validate-scraping.js - Batch validator for testing multiple alerts
- scripts/example-usage.js - Usage examples and demonstrations
- scripts/scraper-config.js - Configuration for behavior fine-tuning
- tests/alert-setup.spec.js - Test documenting alert setup process
- docs/PLAYWRIGHT_RECORDING.md - Guide to recording alert setup with codegen
Quick Start
1. Test Before You Create
Copy this query and test it in Google Search (NOT Alerts):
"macbook repair" ("Toronto" OR "Mississauga" OR "Kitchener")
If you see 50+ results → the broad approach works ✅
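For other keywords or cities, a query of the same shape can be assembled programmatically. The following is a minimal sketch, not the code in scripts/generate_broad_queries.py; the function name `build_broad_query` is hypothetical and only reproduces the pattern of the test query above.

```python
def build_broad_query(keyword: str, cities: list[str]) -> str:
    """Quote the keyword and OR together the quoted city names."""
    city_block = " OR ".join(f'"{city}"' for city in cities)
    return f'"{keyword}" ({city_block})'


# Rebuilds the test query from this section.
query = build_broad_query("macbook repair", ["Toronto", "Mississauga", "Kitchener"])
print(query)
```

Paste the printed query into Google Search to check result volume before creating an alert from it.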
2. Choose Your Strategy
- Want results now? Use docs/google-alerts-broad.md (recommended)
- Want Reddit-only? Use docs/google-alerts.md (may have low volume)
- Not sure? Read docs/ALERT_STRATEGY.md
3. Set Up Alerts
- Open the file you chose
- Find an alert (e.g., "Data Recovery - Ontario")
- Copy the query block (everything inside the triple-backtick code fence)
- Go to Google Alerts
- Paste the query, set As-it-happens → RSS feed
- Click Create Alert
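If copying many queries by hand becomes tedious, the fenced query blocks can be pulled out of an alert file with a few lines of Python. This is a sketch under assumptions: the helper `extract_query_blocks` is hypothetical, and the sample layout (a heading followed by one fenced query) is a guess at how the alert files are structured.

```python
FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fenced block


def extract_query_blocks(markdown_text: str) -> list[str]:
    """Return the contents of every fenced block in the text, in order."""
    blocks, current, inside = [], [], False
    for line in markdown_text.splitlines():
        if line.strip().startswith(FENCE):
            if inside:
                # Closing fence: store the collected lines as one query.
                blocks.append("\n".join(current).strip())
                current = []
            inside = not inside
        elif inside:
            current.append(line)
    return blocks


# Assumed file layout, for illustration only.
sample = "\n".join([
    "## Data Recovery - Ontario",
    FENCE,
    '"data recovery" ("Toronto" OR "Ottawa")',
    FENCE,
])
print(extract_query_blocks(sample))
```

Each returned string is one query, ready to paste into the Google Alerts form.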
Validating Queries
Python Validator (Static Analysis)
Run the validator to check query structure and limits:
python3 scripts/validate_alerts.py docs/google-alerts.md
To regenerate working queries from a broken file:
python3 scripts/validate_alerts.py docs/google-alerts.md --fix > docs/google-alerts-fixed.md
Playwright Validator (Live Testing) - NEW! 🚀
Test queries by actually searching Google with human-like behavior to avoid bot detection:
# Install dependencies first
npm install
# Test a single query
node scripts/playwright-scraper.js '"macbook repair" Toronto'
# Batch test multiple alerts from markdown file
node scripts/validate-scraping.js docs/google-alerts-broad.md --max 5
# Run example demonstrations
node scripts/example-usage.js 1
Features:
- 🤖 Realistic mouse movements with bezier curves and occasional overshooting
- 📜 Natural scrolling patterns with random intervals
- ⌨️ Human-like typing with variable speeds and occasional typos
- ⏱️ Random delays mimicking real user behavior
- 🎭 Randomized browser fingerprints to avoid detection
See docs/PLAYWRIGHT_SCRAPING.md for full documentation.
Recording Alert Setup Process 🎬
Use Playwright's codegen to record and document the alert setup workflow:
# Record a new alert setup process
npm run record:alert-setup
This opens an interactive browser where you perform the alert setup steps while Playwright generates the test code automatically. It is a convenient way to document the exact process for future reference.
See docs/PLAYWRIGHT_RECORDING.md for full documentation.
Query Design
All queries follow these limits to ensure Google Alerts fires reliably:
- ≤8 site filters per alert
- ≤18 OR terms per keyword block
- ≤500 characters total length
- ≤4 exclusion terms (-job -entertainment -movie -music)
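The limits above can be checked mechanically. The sketch below is not the logic of scripts/validate_alerts.py, which is not shown here; `check_limits` is a hypothetical helper, and its parsing rules (a keyword block is a parenthesised group without site filters, an exclusion is a token starting with `-`) are simplifying assumptions.

```python
import re

MAX_SITES, MAX_OR_TERMS, MAX_LENGTH, MAX_EXCLUSIONS = 8, 18, 500, 4


def check_limits(query: str) -> list[str]:
    """Return human-readable limit violations; an empty list means the query passes."""
    problems = []
    sites = re.findall(r"\bsite:\S+", query)
    if len(sites) > MAX_SITES:
        problems.append(f"{len(sites)} site filters (max {MAX_SITES})")
    # Assumption: each parenthesised group without site filters is one keyword block.
    for group in re.findall(r"\(([^()]*)\)", query):
        if "site:" in group:
            continue
        terms = [term for term in group.split(" OR ") if term.strip()]
        if len(terms) > MAX_OR_TERMS:
            problems.append(f"{len(terms)} OR terms in one block (max {MAX_OR_TERMS})")
    if len(query) > MAX_LENGTH:
        problems.append(f"{len(query)} characters (max {MAX_LENGTH})")
    # Assumption: an exclusion is any token that begins with '-' after whitespace.
    exclusions = re.findall(r"(?:^|\s)-\S+", query)
    if len(exclusions) > MAX_EXCLUSIONS:
        problems.append(f"{len(exclusions)} exclusion terms (max {MAX_EXCLUSIONS})")
    return problems


good = '"macbook repair" ("Toronto" OR "Mississauga" OR "Kitchener") -job -entertainment -movie -music'
print(check_limits(good))
print(check_limits(good + " -film"))
```

The first call reports no violations; the second reports the fifth exclusion term.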
Regional Structure
Reddit-based alerts are split into 5 regions to stay within limits:
- Ontario-GTA: kitchener, waterloo, CambridgeON, guelph, toronto, mississauga, brampton
- Ontario-Other: ontario, londonontario, HamiltonOntario, niagara, ottawa
- Western: vancouver, VictoriaBC, Calgary, Edmonton
- Prairies: saskatoon, regina, winnipeg
- Eastern: montreal, quebeccity, halifax, newfoundland
Each service type (Data Recovery, Laptop Repair, Console Repair, etc.) has 5 regional alerts.
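The regional split can be expressed as data, with one query generated per region. This is an illustrative sketch: the subreddit names come from the list above, but the `regional_queries` helper and the exact query layout (quoted keyword followed by a parenthesised site block) are assumptions rather than the format used in docs/google-alerts.md.

```python
REGIONS = {
    "Ontario-GTA": ["kitchener", "waterloo", "CambridgeON", "guelph",
                    "toronto", "mississauga", "brampton"],
    "Ontario-Other": ["ontario", "londonontario", "HamiltonOntario", "niagara", "ottawa"],
    "Western": ["vancouver", "VictoriaBC", "Calgary", "Edmonton"],
    "Prairies": ["saskatoon", "regina", "winnipeg"],
    "Eastern": ["montreal", "quebeccity", "halifax", "newfoundland"],
}


def regional_queries(keyword: str) -> dict[str, str]:
    """Build one query per region: quoted keyword plus that region's site filters."""
    queries = {}
    for region, subreddits in REGIONS.items():
        sites = " OR ".join(f"site:reddit.com/r/{name}" for name in subreddits)
        queries[region] = f'"{keyword}" ({sites})'
    return queries


for region, query in regional_queries("data recovery").items():
    print(f"{region}: {query}")
```

The largest region, Ontario-GTA, has seven subreddits, so every generated query stays under the eight-site-filter limit.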
Alert Categories
Data Recovery (15 alerts)
- General data recovery
- HDD/SSD specialty recovery
- SD card/USB recovery
Device Repair (25 alerts)
- Laptop/MacBook logic board repair
- GPU/Desktop board repair
- Console repair & refurbishment
- Smartphone repair
- iPad repair
- Connector (FPC) replacement
Specialized Services (10 alerts)
- Key fob repair
- Microsolder/diagnostics
- Device refurbishment & trade-ins
Non-Reddit Platforms (11 alerts)
- Kijiji/Used.ca classifieds
- Facebook Marketplace
- Craigslist
- Tech forums
- Discord communities
- Bulk/auction sourcing
Troubleshooting
No results coming through?
- Test the query in Google Search first (not in Alerts)
- If Google Search shows results, the alert should work
- If no results exist, the keywords may be too specific
- Run python3 scripts/validate_alerts.py to check for limit violations
Alert stopped working?
Re-run validation and regenerate:
python3 scripts/validate_alerts.py docs/google-alerts.md --fix > docs/google-alerts-new.md
Technical Notes
- Queries use exact-phrase matching ("keyword") for precision
- The -"ALERT_NAME:..." marker was removed from all queries (it caused false negatives)
- Exclusions are limited to high-noise terms only
- Site filters use site:reddit.com/r/subreddit format (not full URLs)
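Older copies of a query may still carry the ALERT_NAME marker. A minimal sketch for stripping it follows; `strip_marker` is a hypothetical helper, and the marker text `example` in the usage line is a placeholder, not a real alert name.

```python
import re

# Matches an exclusion of the form -"ALERT_NAME:<anything>" plus leading whitespace.
MARKER = re.compile(r'\s*-"ALERT_NAME:[^"]*"')


def strip_marker(query: str) -> str:
    # Drop every marker and trim whitespace left at the ends of the query.
    return MARKER.sub("", query).strip()


print(strip_marker('"data recovery" -job -"ALERT_NAME:example"'))
```

Queries without the marker pass through unchanged.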