# RSS Feed Monitor - Google Alerts

This repository contains validated Google Alert queries for monitoring repair-related discussions across Canadian platforms.

## ⚠️ START HERE

**📋 Production-Ready Google Alerts for Canadian Repair Leads**

Use `docs/google-alerts-reddit-tuned.md` for **29 validated alerts** targeting Canadian customers.

**Key Features:**

- 7 high-volume global subs (r/techsupport, r/applehelp, r/datarecovery) filtered for Canadian locations
- 22 Canadian regional subreddit alerts (r/toronto, r/vancouver, r/kitchener, etc.)
- Consumer language validated ("won't turn on" vs "logic board repair")
- Organized by priority tiers

## Files

### Alert Queries

- **`docs/google-alerts-reddit-tuned.md`** - ✨ **START HERE** - 29 production-ready alerts

### Documentation

- `docs/REDDIT_KEYWORDS.md` - Consumer language keyword conversion table
- `docs/PLAYWRIGHT_SCRAPING.md` - Guide to Playwright scraping with anti-detection
- `docs/PLAYWRIGHT_RECORDING.md` - Guide to recording alert setup with codegen
- `docs/QUICKSTART_PLAYWRIGHT.md` - Quick start guide for Playwright tools

### Python Tools

- `scripts/validate_alerts.py` - Validator tool that checks queries and generates fixes
- `scripts/generate_broad_queries.py` - Generates location-based broad queries

### Playwright Tools (NEW)

- `scripts/human-behavior.js` - Human-like behavior library for bot detection avoidance
- `scripts/playwright-scraper.js` - Main scraper with Google search validation
- `scripts/validate-scraping.js` - Batch validator for testing multiple alerts
- `scripts/example-usage.js` - Usage examples and demonstrations
- `scripts/scraper-config.js` - Configuration for behavior fine-tuning
- `tests/alert-setup.spec.js` - Test documenting alert setup process
- `docs/PLAYWRIGHT_RECORDING.md` - Guide to recording alert setup with codegen

## Quick Start

### Set Up Alerts

1. Open `docs/google-alerts-reddit-tuned.md`
2. Start with **Tier 1** alerts (global subs filtered for Canada - highest volume)
3. Copy a query from inside the ` ``` ` code blocks
4. Go to [Google Alerts](https://www.google.com/alerts)
5. Paste the query, click "Show options", configure:
   - How often: `As-it-happens`
   - Region: `Canada`
   - Deliver to: `RSS feed`
6. Click `Create Alert`
7. Click the RSS icon to get your feed URL

### Validating Queries

#### Python Validator (Static Analysis)

Run the validator to check query structure and limits:

```bash
python3 scripts/validate_alerts.py docs/google-alerts.md
```

To regenerate working queries from a broken file:

```bash
python3 scripts/validate_alerts.py docs/google-alerts.md --fix > docs/google-alerts-fixed.md
```

#### Playwright Validator (Live Testing) - NEW! 🚀

Test queries by actually searching Google with human-like behavior to avoid bot detection:

```bash
# Install dependencies first
npm install

# Test a single query
node scripts/playwright-scraper.js '"macbook repair" Toronto'

# Batch test multiple alerts from markdown file
node scripts/validate-scraping.js docs/google-alerts-broad.md --max 5

# Run example demonstrations
node scripts/example-usage.js 1
```

**Features:**

- 🤖 Realistic mouse movements with bezier curves and occasional overshooting
- 📜 Natural scrolling patterns with random intervals
- ⌨️ Human-like typing with variable speeds and occasional typos
- ⏱️ Random delays mimicking real user behavior
- 🎭 Randomized browser fingerprints to avoid detection

See `docs/PLAYWRIGHT_SCRAPING.md` for full documentation.

#### Recording Alert Setup Process 🎬

Use Playwright's codegen to record and document the alert setup workflow:

```bash
# Record a new alert setup process
npm run record:alert-setup
```

This opens an interactive browser where you can perform the alert setup steps, and Playwright will generate test code automatically. Perfect for documenting the exact process for future reference.

See `docs/PLAYWRIGHT_RECORDING.md` for full documentation.
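Once step 7 of "Set Up Alerts" has given you a feed URL, reading it needs nothing beyond the Python standard library. The sketch below is illustrative only and is not one of the scripts in this repository: the names `parse_feed` and `fetch_feed` are hypothetical, and it assumes the feed is served as Atom XML (adjust the namespace and tag names if yours turns out to be RSS 2.0).

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Atom namespace; an assumption about the feed format, see the note above.
ATOM = "{http://www.w3.org/2005/Atom}"


def parse_feed(xml_data):
    """Return a list of (title, link) tuples from Atom feed XML (str or bytes)."""
    root = ET.fromstring(xml_data)
    entries = []
    for entry in root.iter(f"{ATOM}entry"):
        title = entry.findtext(f"{ATOM}title", default="")
        link = entry.find(f"{ATOM}link")
        href = link.get("href", "") if link is not None else ""
        entries.append((title, href))
    return entries


def fetch_feed(url, timeout=10):
    """Download a feed URL and parse it; `url` is whatever the RSS icon gave you."""
    with urlopen(url, timeout=timeout) as response:
        return parse_feed(response.read())
```

Calling `fetch_feed(feed_url)` with your own feed URL returns one `(title, link)` pair per alert hit, which you can then filter or forward however you like.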
## Query Design

All queries follow these limits to ensure Google Alerts fires reliably:

- **≤8 site filters** per alert
- **≤18 OR terms** per keyword block
- **≤500 characters** total length
- **≤4 exclusion terms** (`-job -entertainment -movie -music`)

## Regional Structure

Reddit-based alerts are split into 5 regions to stay within limits:

1. **Ontario-GTA**: kitchener, waterloo, CambridgeON, guelph, toronto, mississauga, brampton
2. **Ontario-Other**: ontario, londonontario, HamiltonOntario, niagara, ottawa
3. **Western**: vancouver, VictoriaBC, Calgary, Edmonton
4. **Prairies**: saskatoon, regina, winnipeg
5. **Eastern**: montreal, quebeccity, halifax, newfoundland

Each service type (Data Recovery, Laptop Repair, Console Repair, etc.) has 5 regional alerts.

## Alert Categories

### Data Recovery (15 alerts)

- General data recovery
- HDD/SSD specialty recovery
- SD card/USB recovery

### Device Repair (25 alerts)

- Laptop/MacBook logic board repair
- GPU/Desktop board repair
- Console repair & refurbishment
- Smartphone repair
- iPad repair
- Connector (FPC) replacement

### Specialized Services (10 alerts)

- Key fob repair
- Microsolder/diagnostics
- Device refurbishment & trade-ins

### Non-Reddit Platforms (11 alerts)

- Kijiji/Used.ca classifieds
- Facebook Marketplace
- Craigslist
- Tech forums
- Discord communities
- Bulk/auction sourcing

## Troubleshooting

**No results coming through?**

1. Test the query in Google Search first (not in Alerts)
2. If Google Search shows results, the alert should work
3. If no results exist, the keywords may be too specific
4. Run `python3 scripts/validate_alerts.py` to check for limit violations

**Alert stopped working?**

Re-run validation and regenerate:

```bash
python3 scripts/validate_alerts.py docs/google-alerts.md --fix > docs/google-alerts-new.md
```

## Technical Notes

- Queries use exact-phrase matching (`"keyword"`) for precision
- The `-"ALERT_NAME:..."` marker was removed from all queries (it caused false negatives)
- Exclusions are limited to high-noise terms only
- Site filters use `site:reddit.com/r/subreddit` format (not full URLs)
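
To make the Query Design limits and the conventions above concrete, here is a minimal sketch that assembles a query for one region and checks it against those limits. It is not the logic of `scripts/validate_alerts.py`; the function names `build_query` and `check_limits` are hypothetical, and grouping the site filters and keywords into two parenthesised `OR` blocks is an assumption about how the queries are laid out.

```python
import re

# Limits copied from the Query Design section.
MAX_SITE_FILTERS = 8
MAX_OR_TERMS = 18
MAX_LENGTH = 500
MAX_EXCLUSIONS = 4


def build_query(subreddits, phrases, exclusions=()):
    """Assemble a query from subreddit names, exact phrases, and exclusion terms."""
    sites = " OR ".join(f"site:reddit.com/r/{name}" for name in subreddits)
    keywords = " OR ".join(f'"{phrase}"' for phrase in phrases)
    excluded = " ".join(f"-{term}" for term in exclusions)
    return f"({sites}) ({keywords}) {excluded}".strip()


def check_limits(query):
    """Return a list of limit violations; an empty list means the query passes."""
    problems = []
    site_filters = re.findall(r"\bsite:\S+", query)
    if len(site_filters) > MAX_SITE_FILTERS:
        problems.append(f"{len(site_filters)} site filters (max {MAX_SITE_FILTERS})")
    # Count OR terms in each parenthesised block except the site-filter block.
    for block in re.findall(r"\(([^()]*)\)", query):
        if "site:" in block:
            continue
        terms = [term for term in block.split(" OR ") if term.strip()]
        if len(terms) > MAX_OR_TERMS:
            problems.append(f"{len(terms)} OR terms in one block (max {MAX_OR_TERMS})")
    # An exclusion is a token that starts with "-" at the start or after whitespace.
    exclusions = re.findall(r"(?<!\S)-\S+", query)
    if len(exclusions) > MAX_EXCLUSIONS:
        problems.append(f"{len(exclusions)} exclusion terms (max {MAX_EXCLUSIONS})")
    if len(query) > MAX_LENGTH:
        problems.append(f"{len(query)} characters (max {MAX_LENGTH})")
    return problems


if __name__ == "__main__":
    ontario_gta = ["kitchener", "waterloo", "CambridgeON", "guelph",
                   "toronto", "mississauga", "brampton"]
    query = build_query(ontario_gta, ["won't turn on", "macbook repair"], ["job", "movie"])
    print(query)
    print(check_limits(query) or "within limits")
```

Splitting subreddits by region, as in Regional Structure, is what keeps each generated query under the site-filter and length caps; a check like this run before pasting into Google Alerts catches a query that has drifted past them.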