# RSS Feed Monitor - Google Alerts

This repository contains validated Google Alert queries for monitoring repair-related discussions across Canadian platforms.

## ⚠️ START HERE

**📋 Production-Ready Google Alerts for Canadian Repair Leads**

Use `docs/google-alerts-reddit-tuned.md` for **29 validated alerts** targeting Canadian customers.

**Key Features:**
- 7 high-volume global subs (r/techsupport, r/applehelp, r/datarecovery) filtered for Canadian locations
- 22 Canadian regional subreddit alerts (r/toronto, r/vancouver, r/kitchener, etc.)
- Consumer language validated ("won't turn on" vs "logic board repair")
- Organized by priority tiers

## Files

### Alert Queries
- **`docs/google-alerts-reddit-tuned.md`** - ✨ **START HERE** - 29 production-ready alerts

### Documentation
- `docs/REDDIT_KEYWORDS.md` - Consumer language keyword conversion table
- `docs/PLAYWRIGHT_SCRAPING.md` - Guide to Playwright scraping with anti-detection
- `docs/PLAYWRIGHT_RECORDING.md` - Guide to recording alert setup with codegen
- `docs/QUICKSTART_PLAYWRIGHT.md` - Quick start guide for Playwright tools

### Python Tools
- `scripts/validate_alerts.py` - Validator tool that checks queries and generates fixes
- `scripts/generate_broad_queries.py` - Generates location-based broad queries

### Playwright Tools (NEW)
- `scripts/human-behavior.js` - Human-like behavior library for bot detection avoidance
- `scripts/playwright-scraper.js` - Main scraper with Google search validation
- `scripts/validate-scraping.js` - Batch validator for testing multiple alerts
- `scripts/example-usage.js` - Usage examples and demonstrations
- `scripts/scraper-config.js` - Configuration for behavior fine-tuning
- `tests/alert-setup.spec.js` - Test documenting alert setup process

## Quick Start

### Set Up Alerts

1. Open `docs/google-alerts-reddit-tuned.md`
2. Start with **Tier 1** alerts (global subs filtered for Canada - highest volume)
3. Copy a query from inside the ` ``` ` code blocks
4. Go to [Google Alerts](https://www.google.com/alerts)
5. Paste the query, click "Show options", configure:
   - How often: `As-it-happens`
   - Region: `Canada`
   - Deliver to: `RSS feed`
6. Click `Create Alert`
7. Click the RSS icon to get your feed URL

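
Once an alert is delivering to a feed, the entries can be read programmatically. The sketch below is not part of this repository; it assumes the feed is served as Atom XML (which is how Google Alerts feeds are typically formatted) and uses only the Python standard library. The names `parse_atom_entries`, `fetch_entries`, and `SAMPLE_FEED` are hypothetical, and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Assumption: the alert feed is an Atom document in the standard namespace.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample so the parser can be tried without network access.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Google Alert - sample</title>
  <entry>
    <title>Laptop won't turn on</title>
    <link href="https://www.reddit.com/r/toronto/comments/example/"/>
    <published>2024-01-01T00:00:00Z</published>
  </entry>
</feed>"""


def parse_atom_entries(xml_text):
    """Return a list of {title, link, published} dicts from an Atom document."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link_el = entry.find("atom:link", ATOM_NS)
        entries.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "link": link_el.get("href", "") if link_el is not None else "",
            "published": entry.findtext("atom:published", default="", namespaces=ATOM_NS),
        })
    return entries


def fetch_entries(feed_url):
    """Download a feed URL (the one copied in step 7) and parse its entries."""
    with urlopen(feed_url, timeout=30) as response:
        return parse_atom_entries(response.read())
```

Calling `parse_atom_entries(SAMPLE_FEED)` returns one entry dict; `fetch_entries` does the same for a live feed URL.
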
### Validating Queries

#### Python Validator (Static Analysis)

Run the validator to check query structure and limits:

```bash
python3 scripts/validate_alerts.py docs/google-alerts.md
```

To regenerate working queries from a broken file:

```bash
python3 scripts/validate_alerts.py docs/google-alerts.md --fix > docs/google-alerts-fixed.md
```

#### Playwright Validator (Live Testing) - NEW! 🚀

Test queries by actually searching Google with human-like behavior to avoid bot detection:

```bash
# Install dependencies first
npm install

# Test a single query
node scripts/playwright-scraper.js '"macbook repair" Toronto'

# Batch test multiple alerts from markdown file
node scripts/validate-scraping.js docs/google-alerts-broad.md --max 5

# Run example demonstrations
node scripts/example-usage.js 1
```

**Features:**
- 🤖 Realistic mouse movements with bezier curves and occasional overshooting
- 📜 Natural scrolling patterns with random intervals
- ⌨️ Human-like typing with variable speeds and occasional typos
- ⏱️ Random delays mimicking real user behavior
- 🎭 Randomized browser fingerprints to avoid detection

See `docs/PLAYWRIGHT_SCRAPING.md` for full documentation.

#### Recording Alert Setup Process 🎬

Use Playwright's codegen to record and document the alert setup workflow:

```bash
# Record a new alert setup process
npm run record:alert-setup
```

This opens an interactive browser where you perform the alert setup steps while Playwright generates the corresponding test code automatically. The recording is useful for documenting the exact process for future reference.

See `docs/PLAYWRIGHT_RECORDING.md` for full documentation.

## Query Design

All queries follow these limits to ensure Google Alerts fires reliably:

- **≤8 site filters** per alert
- **≤18 OR terms** per keyword block
- **≤500 characters** total length
- **≤4 exclusion terms** (`-job -entertainment -movie -music`)

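
The limits above can be checked mechanically. The following is a minimal sketch, not the logic of `scripts/validate_alerts.py`: the function name `check_query`, the constants, and the regular expressions are all assumptions made for illustration. It treats a "keyword block" as any parenthesized group that contains no `site:` filter.

```python
import re

# Limits taken from the list above.
MAX_SITE_FILTERS = 8
MAX_OR_TERMS = 18
MAX_LENGTH = 500
MAX_EXCLUSIONS = 4


def check_query(query):
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []

    if len(query) > MAX_LENGTH:
        problems.append(f"length {len(query)} (max {MAX_LENGTH})")

    site_filters = re.findall(r"\bsite:\S+", query)
    if len(site_filters) > MAX_SITE_FILTERS:
        problems.append(f"{len(site_filters)} site filters (max {MAX_SITE_FILTERS})")

    # An exclusion is a token that starts with "-" at the start of a word.
    exclusions = re.findall(r'(?<!\S)-(?:"[^"]*"|\S+)', query)
    if len(exclusions) > MAX_EXCLUSIONS:
        problems.append(f"{len(exclusions)} exclusion terms (max {MAX_EXCLUSIONS})")

    # Assumption: a keyword block is a parenthesized group without site filters.
    for block in re.findall(r"\(([^()]*)\)", query):
        if "site:" in block:
            continue
        terms = [t for t in re.split(r"\s+OR\s+", block) if t.strip()]
        if len(terms) > MAX_OR_TERMS:
            problems.append(f"{len(terms)} OR terms in one block (max {MAX_OR_TERMS})")

    return problems
```

For a query such as `("won't turn on" OR "water damage") (site:reddit.com/r/toronto OR site:reddit.com/r/ottawa) -job -movie`, the function returns an empty list because every limit is respected.
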
## Regional Structure

Reddit-based alerts are split into 5 regions to stay within limits:

1. **Ontario-GTA**: kitchener, waterloo, CambridgeON, guelph, toronto, mississauga, brampton
2. **Ontario-Other**: ontario, londonontario, HamiltonOntario, niagara, ottawa
3. **Western**: vancouver, VictoriaBC, Calgary, Edmonton
4. **Prairies**: saskatoon, regina, winnipeg
5. **Eastern**: montreal, quebeccity, halifax, newfoundland

Each service type (Data Recovery, Laptop Repair, Console Repair, etc.) has 5 regional alerts.

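
To illustrate how a region maps onto a query fragment, here is a small sketch that turns each region's subreddit list into an OR-joined block of `site:reddit.com/r/<subreddit>` filters. The `REGIONS` mapping copies the list above; the function name `site_filter_block` is hypothetical and is not taken from `scripts/generate_broad_queries.py`.

```python
# Subreddits per region, copied from the list above.
REGIONS = {
    "Ontario-GTA": ["kitchener", "waterloo", "CambridgeON", "guelph",
                    "toronto", "mississauga", "brampton"],
    "Ontario-Other": ["ontario", "londonontario", "HamiltonOntario",
                      "niagara", "ottawa"],
    "Western": ["vancouver", "VictoriaBC", "Calgary", "Edmonton"],
    "Prairies": ["saskatoon", "regina", "winnipeg"],
    "Eastern": ["montreal", "quebeccity", "halifax", "newfoundland"],
}


def site_filter_block(region):
    """Build the parenthesized site-filter block for one region."""
    subreddits = REGIONS[region]
    filters = [f"site:reddit.com/r/{name}" for name in subreddits]
    return "(" + " OR ".join(filters) + ")"


if __name__ == "__main__":
    for region in REGIONS:
        print(region, "->", site_filter_block(region))
```

Because the largest region has 7 subreddits, every block produced this way stays within the 8-site-filter limit.
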
## Alert Categories

### Data Recovery (15 alerts)
- General data recovery
- HDD/SSD specialty recovery
- SD card/USB recovery

### Device Repair (25 alerts)
- Laptop/MacBook logic board repair
- GPU/Desktop board repair
- Console repair & refurbishment
- Smartphone repair
- iPad repair
- Connector (FPC) replacement

### Specialized Services (10 alerts)
- Key fob repair
- Microsolder/diagnostics
- Device refurbishment & trade-ins

### Non-Reddit Platforms (11 alerts)
- Kijiji/Used.ca classifieds
- Facebook Marketplace
- Craigslist
- Tech forums
- Discord communities
- Bulk/auction sourcing

## Troubleshooting

**No results coming through?**

1. Test the query in Google Search first (not in Alerts)
2. If Google Search shows results, the alert should work
3. If no results exist, the keywords may be too specific
4. Run `python3 scripts/validate_alerts.py` to check for limit violations

**Alert stopped working?**

Re-run validation and regenerate:

```bash
python3 scripts/validate_alerts.py docs/google-alerts.md --fix > docs/google-alerts-new.md
```

## Technical Notes

- Queries use exact-phrase matching (`"keyword"`) for precision
- The `-"ALERT_NAME:..."` marker was removed from all queries (it caused false negatives)
- Exclusions are limited to high-noise terms only
- Site filters use `site:reddit.com/r/subreddit` format (not full URLs)

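
The conventions above could be applied in code roughly as follows. This is a sketch under assumptions, not the repository's implementation: `quote_phrase`, `keyword_block`, `build_query`, and `strip_alert_marker` are hypothetical helpers, and the marker pattern simply matches any quoted token that begins with `ALERT_NAME:`.

```python
import re


def quote_phrase(keyword):
    """Wrap a keyword in double quotes for exact-phrase matching."""
    return '"' + keyword.strip().strip('"') + '"'


def keyword_block(keywords):
    """Join exact-phrase keywords into one parenthesized OR block."""
    return "(" + " OR ".join(quote_phrase(k) for k in keywords) + ")"


def build_query(keywords, site_block="", exclusions=()):
    """Assemble keyword block, optional site-filter block, and exclusions."""
    parts = [keyword_block(keywords)]
    if site_block:
        parts.append(site_block)
    parts.extend(f"-{term}" for term in exclusions)
    return " ".join(parts)


def strip_alert_marker(query):
    """Remove a legacy -"ALERT_NAME:..." marker from a query, if present."""
    return re.sub(r'\s*-"ALERT_NAME:[^"]*"', "", query).strip()


if __name__ == "__main__":
    print(build_query(["data recovery"], "(site:reddit.com/r/toronto)", ["job", "movie"]))
```

Keeping the marker removal in a separate helper means older query files can be cleaned without rebuilding the queries from scratch.
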