Initial commit: Production-ready validated alerts and Playwright automation tools

Colin 2025-11-21 15:06:08 -05:00
commit 5d0275542d
Signed by: colin
SSH Key Fingerprint: SHA256:nRPCQTeMFLdGytxRQmPVK9VXY3/ePKQ5lGRyJhT5DY8
28 changed files with 8378 additions and 0 deletions

.gitignore

@@ -0,0 +1,52 @@
# Node modules
node_modules/
package-lock.json
# Playwright
playwright-report/
test-results/
playwright/.cache/
# Validation reports and notes
validation-report-*.json
validation-notes-*.md
validation-report-*-analysis.md
reddit-pattern-test-*.json
# Screenshots and videos
*.png
*.jpg
*.mp4
*.webm
# Logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
# OS files
.DS_Store
Thumbs.db
# IDE files
.vscode/
.idea/
*.swp
*.swo
*~
# Python
__pycache__/
*.py[cod]
*$py.class
.Python
venv/
env/
ENV/
# Temporary files
*.tmp
*.bak
*.swp

@@ -0,0 +1,306 @@
# ✅ Playwright Setup Complete
Your RSS Feed Monitor now has full Playwright scraping capabilities with advanced bot detection avoidance!
## 📦 What Was Created
### Core Library
- **`scripts/human-behavior.js`** (395 lines)
- Complete human-like behavior simulation library
- Bezier curve mouse movements with overshooting
- Natural scrolling with random intervals
- Realistic typing with typos and corrections
- Browser fingerprint randomization
- Reading simulation utilities
### Main Scripts
- **`scripts/playwright-scraper.js`** (250 lines)
- Google search validation with human behavior
- Website scraping with natural interactions
- Result extraction and analysis
- CLI interface for easy usage
- **`scripts/validate-scraping.js`** (180 lines)
- Batch validation of Google Alert queries
- Markdown file parsing
- Automatic report generation
- Configurable delays and limits
### Configuration & Examples
- **`scripts/scraper-config.js`**
- Centralized configuration for all behavior parameters
- Easy customization of timing, movements, and patterns
- **`scripts/example-usage.js`** (300 lines)
- 4 complete working examples
- Google search demo
- Reddit scraping demo
- Multi-step navigation demo
- Mouse pattern demonstrations
### Testing
- **`tests/human-behavior.test.js`** (200 lines)
- Comprehensive test suite
- Examples for all major features
- Google Alert validation tests
- Playwright Test framework integration
### Documentation
- **`docs/PLAYWRIGHT_SCRAPING.md`** (550 lines)
- Complete API documentation
- Usage examples for every feature
- Configuration guide
- Best practices and troubleshooting
- **`docs/QUICKSTART_PLAYWRIGHT.md`** (250 lines)
- 5-minute setup guide
- Common use cases
- Quick reference
### Project Files
- **`package.json`** - Node.js dependencies
- **`playwright.config.js`** - Playwright test configuration
- **`.gitignore`** - Excludes node_modules, reports, etc.
- **Updated `README.md`** - Added Playwright section
## 🚀 Quick Start
```bash
# 1. Install dependencies
npm install
npx playwright install chromium
# 2. Test a query
node scripts/playwright-scraper.js '"macbook repair" Toronto'
# 3. Validate alerts
node scripts/validate-scraping.js docs/google-alerts-broad.md --max 3
# 4. Run examples
node scripts/example-usage.js 1
```
## 🤖 Anti-Detection Features
### Mouse Movements
- ✅ Smooth bezier curves (not straight lines)
- ✅ Occasional overshooting (15% chance)
- ✅ Variable speeds and acceleration
- ✅ Random pause durations
### Scrolling
- ✅ Random amounts (100-400px)
- ✅ Variable delays (0.5-2s)
- ✅ Occasionally reverses direction
- ✅ Smooth incremental scrolling
### Typing
- ✅ Variable keystroke timing (50-150ms)
- ✅ Occasional typos with corrections (2%)
- ✅ Longer pauses after spaces/punctuation
- ✅ Natural rhythm variations
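The typing behavior listed above can be illustrated with a small, dependency-free sketch. It is not the code in `scripts/human-behavior.js`; the function name `planKeystrokes`, the option names, and the doubling of the pause after spaces and punctuation are assumptions chosen to mirror the bullet points.

```javascript
// Hypothetical sketch: plans keystrokes with variable timing and occasional typos.
// All names and defaults are illustrative, not the library's actual API.
function randomBetween(min, max, rng) {
  return min + rng() * (max - min);
}

function planKeystrokes(text, opts = {}, rng = Math.random) {
  const { minDelay = 50, maxDelay = 150, mistakeChance = 0.02, pauseAfter = ' .,!?' } = opts;
  const steps = [];
  for (const ch of text) {
    if (rng() < mistakeChance) {
      // Simulate a typo: press a random lowercase letter, then erase it.
      const wrongKey = String.fromCharCode(97 + Math.floor(rng() * 26));
      steps.push({ key: wrongKey, delay: randomBetween(minDelay, maxDelay, rng) });
      steps.push({ key: 'Backspace', delay: randomBetween(minDelay, maxDelay, rng) });
    }
    let delay = randomBetween(minDelay, maxDelay, rng);
    if (pauseAfter.includes(ch)) delay *= 2; // assumed: pause twice as long after spaces/punctuation
    steps.push({ key: ch, delay });
  }
  return steps;
}
```

Each planned step could then be replayed against a page, for example with `page.keyboard.type` for characters and `page.keyboard.press('Backspace')` for corrections, waiting `delay` milliseconds between steps.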
### Browser Fingerprinting
- ✅ Randomized viewports (5 common sizes)
- ✅ Rotated user agents (5 realistic UAs)
- ✅ Realistic HTTP headers
- ✅ Geolocation (Toronto by default)
- ✅ Random device scale factors
- ✅ Removes webdriver detection
- ✅ Injects realistic navigator properties
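As a rough illustration of the fingerprint options above, the sketch below builds an options object of the kind Playwright's `browser.newContext()` accepts. The helper names, the specific viewport sizes, the scale factors, and the user agent strings are assumptions for illustration; the real lists live in the project's own scripts.

```javascript
// Hypothetical sketch: picks randomized context options.
// The viewport sizes and user agents below are examples, not the project's actual lists.
function pickOne(items, rng) {
  return items[Math.floor(rng() * items.length)];
}

function randomContextOptions(rng = Math.random) {
  const viewports = [
    { width: 1920, height: 1080 },
    { width: 1536, height: 864 },
    { width: 1440, height: 900 },
    { width: 1366, height: 768 },
    { width: 1280, height: 720 },
  ];
  const userAgents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
  ];
  return {
    viewport: pickOne(viewports, rng),
    userAgent: pickOne(userAgents, rng),
    deviceScaleFactor: pickOne([1, 1.25, 1.5, 2], rng),
    locale: 'en-CA',
    timezoneId: 'America/Toronto',
    geolocation: { latitude: 43.6532, longitude: -79.3832 }, // downtown Toronto
    permissions: ['geolocation'],
  };
}
```

The returned object could be passed straight to `browser.newContext(randomContextOptions())`; injecting the random source as a parameter keeps the selection reproducible in tests.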
### Behavior Patterns
- ✅ Reading simulation (random scrolls + mouse moves)
- ✅ Random observation pauses
- ✅ Natural page load waiting
- ✅ Occasional "accidental" double-clicks (2%)
## 📊 Usage Statistics
### File Count: 10 new files
- 5 JavaScript modules (1,325 lines)
- 2 Documentation files (800 lines)
- 2 Configuration files
- 1 Test suite (200 lines)
### Total Lines of Code: ~2,300 lines
### Features Implemented:
- 10+ human behavior simulation functions
- 5 randomized viewport configurations
- 5 realistic user agents
- 4 complete example demonstrations
- 6 comprehensive test cases
- Full API documentation
- CLI tools for validation and scraping
## 🎯 Use Cases
### 1. Validate Google Alert Queries
Test if your alert queries actually return results:
```bash
node scripts/validate-scraping.js docs/google-alerts-broad.md
```
### 2. Scrape Search Results
Get actual search results with full details:
```bash
node scripts/playwright-scraper.js '"laptop repair" Toronto'
```
### 3. Monitor Reddit
Scrape Reddit with human-like behavior:
```bash
node scripts/playwright-scraper.js --url "https://www.reddit.com/r/toronto"
```
### 4. Custom Scraping
Use the library in your own scripts:
```javascript
import { humanClick, humanType, humanScroll } from './scripts/human-behavior.js';
```
## 📝 Example Output
### Single Query Validation
```
🔍 Searching Google for: "macbook repair" Toronto
📊 Results Summary:
Stats: About 1,234 results (0.45 seconds)
Found: 15 results
✅ Query returned results:
1. MacBook Repair Toronto - Apple Certified
https://example.com/macbook-repair
Professional MacBook repair services in Toronto...
```
### Batch Validation Report
```json
{
"total": 5,
"successful": 4,
"failed": 1,
"successRate": 80,
"results": [...]
}
```
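A report of this shape could be derived from per-query results roughly as follows. This is a sketch only: the `success` field on each result and the function name `summarizeResults` are assumptions, not the actual structure used by `scripts/validate-scraping.js`.

```javascript
// Hypothetical sketch: builds the summary fields shown in the report above.
// Assumes each result object carries a boolean `success` flag.
function summarizeResults(results) {
  const total = results.length;
  const successful = results.filter((r) => r.success).length;
  const failed = total - successful;
  // Guard against division by zero when no queries were run.
  const successRate = total === 0 ? 0 : Math.round((successful / total) * 100);
  return { total, successful, failed, successRate, results };
}
```

With five results of which four succeeded, this yields `total: 5`, `successful: 4`, `failed: 1`, and `successRate: 80`, matching the sample report.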
## 🔧 Customization
All behavior parameters are configurable in `scripts/scraper-config.js`:
```javascript
mouse: {
  overshootChance: 0.15,       // 15% chance to overshoot
  overshootDistance: 20,       // pixels
  pathSteps: 25,               // bezier curve resolution
},
scroll: {
  minAmount: 100,              // minimum pixels
  maxAmount: 400,              // maximum pixels
  randomDirectionChance: 0.15, // 15% chance to reverse
},
typing: {
  minDelay: 50,                // fastest typing (ms per keystroke)
  maxDelay: 150,               // slowest typing (ms per keystroke)
  mistakeChance: 0.02,         // 2% typo rate
},
```
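To show how the `mouse` parameters might be consumed, here is a minimal sketch of a cubic bezier path with optional overshoot. It is an assumption-laden illustration, not the implementation in `scripts/human-behavior.js`: the function names, the control-point `spread`, and the way overshoot is applied along the travel direction are all hypothetical.

```javascript
// Hypothetical sketch: generates points along a cubic bezier curve between two positions.
function cubicBezierPoint(p0, p1, p2, p3, t) {
  const u = 1 - t;
  const a = u * u * u;
  const b = 3 * u * u * t;
  const c = 3 * u * t * t;
  const d = t * t * t;
  return {
    x: a * p0.x + b * p1.x + c * p2.x + d * p3.x,
    y: a * p0.y + b * p1.y + c * p2.y + d * p3.y,
  };
}

function buildMousePath(start, end, config = {}, rng = Math.random) {
  const { pathSteps = 25, overshootChance = 0.15, overshootDistance = 20 } = config;
  const spread = 100; // assumed: maximum wobble of the control points, in pixels
  const jitter = () => (rng() - 0.5) * spread;
  const dx = end.x - start.x;
  const dy = end.y - start.y;
  const c1 = { x: start.x + dx / 3 + jitter(), y: start.y + dy / 3 + jitter() };
  const c2 = { x: start.x + (2 * dx) / 3 + jitter(), y: start.y + (2 * dy) / 3 + jitter() };

  // Occasionally aim slightly past the target, then correct back to it.
  const overshoot = rng() < overshootChance;
  const length = Math.hypot(dx, dy) || 1;
  const target = overshoot
    ? { x: end.x + (dx / length) * overshootDistance, y: end.y + (dy / length) * overshootDistance }
    : end;

  const path = [];
  for (let i = 0; i <= pathSteps; i++) {
    path.push(cubicBezierPoint(start, c1, c2, target, i / pathSteps));
  }
  if (overshoot) path.push({ x: end.x, y: end.y });
  return path;
}
```

Each point in the returned array could be fed to `page.mouse.move(point.x, point.y)` with a short pause between calls. A higher `pathSteps` gives a smoother curve at the cost of more move events.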
## 🧪 Testing
Run the comprehensive test suite:
```bash
# With visible browser (recommended for learning)
npm run test:headed
# Headless (faster)
npm test
# Specific test file
npx playwright test tests/human-behavior.test.js --headed
```
## 📚 Documentation Structure
```
docs/
├── ALERT_STRATEGY.md # Existing Google Alerts strategy
├── PLAYWRIGHT_SCRAPING.md # NEW: Complete API docs (550 lines)
└── QUICKSTART_PLAYWRIGHT.md # NEW: Quick start guide (250 lines)
scripts/
├── human-behavior.js # NEW: Core library (395 lines)
├── playwright-scraper.js # NEW: Main scraper (250 lines)
├── validate-scraping.js # NEW: Batch validator (180 lines)
├── scraper-config.js # NEW: Configuration (120 lines)
└── example-usage.js # NEW: Examples (300 lines)
tests/
└── human-behavior.test.js # NEW: Test suite (200 lines)
```
## ⚠️ Important Notes
### Rate Limiting
- Default delay: 5 seconds between requests
- Recommended: 10-15 seconds for production
- Google may still show CAPTCHAs with heavy usage
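One way to avoid perfectly regular request intervals is to add jitter around the base delay. The sketch below is an assumption, not the project's behavior: the doc only states a 5 second default, and the `jitterRatio` parameter and function name are hypothetical.

```javascript
// Hypothetical sketch: returns a delay in milliseconds near baseMs, varied by +/- jitterRatio.
function nextDelayMs(baseMs = 5000, jitterRatio = 0.3, rng = Math.random) {
  const jitter = (rng() * 2 - 1) * jitterRatio * baseMs;
  return Math.round(baseMs + jitter);
}

// Small promise-based sleep helper for use between requests.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

In a batch loop this would be used as `await sleep(nextDelayMs(10000))` between queries, which with the default ratio waits somewhere between 7 and 13 seconds.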
### Legal & Ethical Use
- Always respect robots.txt
- Follow website Terms of Service
- Use reasonable rate limits
- Don't overload servers
### Best Practices
1. Start in headed mode (visible browser, without `--headless`) to see the behavior
2. Increase delays between requests
3. Test queries in small batches first
4. Monitor for CAPTCHAs or rate limiting
5. Use different IP addresses for high volume
## 🎓 Learning Resources
1. **Start Here**: `docs/QUICKSTART_PLAYWRIGHT.md`
2. **Full API**: `docs/PLAYWRIGHT_SCRAPING.md`
3. **Examples**: `scripts/example-usage.js`
4. **Tests**: `tests/human-behavior.test.js`
5. **Config**: `scripts/scraper-config.js`
## 🔜 Next Steps
1. ✅ Install dependencies: `npm install`
2. ✅ Install browser: `npx playwright install chromium`
3. 🎯 Try example: `node scripts/example-usage.js 1`
4. 🧪 Run tests: `npm run test:headed`
5. ✅ Validate alerts: `node scripts/validate-scraping.js docs/google-alerts-broad.md`
6. 🚀 Start scraping with confidence!
## 💡 Tips
- **Headed mode** (visible browser) is great for development
- **Headless mode** is faster for production
- Use `--max 3` when testing to limit requests
- Increase `--delay` if you encounter rate limiting
- Check console output for detailed behavior logs
## 🎉 You're Ready!
Your Playwright setup is complete, with human-like behavior simulation that reduces (but does not eliminate) the chance of bot detection. All the tools, examples, and documentation you need are in place.
Happy scraping! 🚀
---
**Need Help?**
- Read the docs: `docs/PLAYWRIGHT_SCRAPING.md`
- Check examples: `scripts/example-usage.js`
- Run tests: `npm run test:headed`

README.md

@@ -0,0 +1,190 @@
# RSS Feed Monitor - Google Alerts
This repository contains validated Google Alert queries for monitoring repair-related discussions across Canadian platforms.
## ⚠️ START HERE
**✨ NEW: Production-Ready Reddit Alerts Available!**
Use `docs/google-alerts-reddit-tuned.md` for **validated, high-performance alerts** that produce regular, relevant results.
**Read `REDDIT_ALERTS_COMPLETE.md`** for test results showing 100% success rate and 10/10 relevant results.
## Files
### Documentation
- **`docs/google-alerts-reddit-tuned.md`** - ✨ **START HERE** - 25 production-ready alerts (100% validated)
- **`REDDIT_ALERTS_COMPLETE.md`** - ✨ **READ SECOND** - Complete test results and setup guide
- `docs/REDDIT_KEYWORDS.md` - Consumer language keyword conversion table
- `docs/google-alerts-broad.md` - Original 84 alerts (needs tuning)
- `docs/google-alerts.md` - Regional Reddit queries (61 alerts, low volume)
- `docs/PLAYWRIGHT_SCRAPING.md` - Guide to Playwright scraping with anti-detection
- `docs/PLAYWRIGHT_RECORDING.md` - Guide to recording alert setup with codegen
### Python Tools
- `scripts/validate_alerts.py` - Validator tool that checks queries and generates fixes
- `scripts/generate_broad_queries.py` - Generates location-based broad queries
### Playwright Tools (NEW)
- `scripts/human-behavior.js` - Human-like behavior library for bot detection avoidance
- `scripts/playwright-scraper.js` - Main scraper with Google search validation
- `scripts/validate-scraping.js` - Batch validator for testing multiple alerts
- `scripts/example-usage.js` - Usage examples and demonstrations
- `scripts/scraper-config.js` - Configuration for behavior fine-tuning
- `tests/alert-setup.spec.js` - Test documenting alert setup process
## Quick Start
### 1. Test Before You Create
**Copy this query and test in Google Search (NOT Alerts):**
```
"macbook repair" ("Toronto" OR "Mississauga" OR "Kitchener")
```
If you see 50+ results → the broad approach works ✅
### 2. Choose Your Strategy
- **Want results now?** Use `docs/google-alerts-broad.md` (recommended)
- **Want Reddit-only?** Use `docs/google-alerts.md` (may have low volume)
- **Not sure?** Read `docs/ALERT_STRATEGY.md`
### 3. Set Up Alerts
1. Open the file you chose
2. Find an alert (e.g., "Data Recovery - Ontario")
3. Copy the query block (everything inside ` ``` `)
4. Go to [Google Alerts](https://www.google.com/alerts)
5. Paste the query, then set "How often" to `As-it-happens` and "Deliver to" to `RSS feed`
6. Click `Create Alert`
### Validating Queries
#### Python Validator (Static Analysis)
Run the validator to check query structure and limits:
```bash
python3 scripts/validate_alerts.py docs/google-alerts.md
```
To regenerate working queries from a broken file:
```bash
python3 scripts/validate_alerts.py docs/google-alerts.md --fix > docs/google-alerts-fixed.md
```
#### Playwright Validator (Live Testing) - NEW! 🚀
Test queries by actually searching Google with human-like behavior to avoid bot detection:
```bash
# Install dependencies first
npm install
# Test a single query
node scripts/playwright-scraper.js '"macbook repair" Toronto'
# Batch test multiple alerts from markdown file
node scripts/validate-scraping.js docs/google-alerts-broad.md --max 5
# Run example demonstrations
node scripts/example-usage.js 1
```
**Features:**
- 🤖 Realistic mouse movements with bezier curves and occasional overshooting
- 📜 Natural scrolling patterns with random intervals
- ⌨️ Human-like typing with variable speeds and occasional typos
- ⏱️ Random delays mimicking real user behavior
- 🎭 Randomized browser fingerprints to avoid detection
See `docs/PLAYWRIGHT_SCRAPING.md` for full documentation.
#### Recording Alert Setup Process 🎬
Use Playwright's codegen to record and document the alert setup workflow:
```bash
# Record a new alert setup process
npm run record:alert-setup
```
This opens an interactive browser where you can perform the alert setup steps, and Playwright will generate test code automatically. Perfect for documenting the exact process for future reference.
See `docs/PLAYWRIGHT_RECORDING.md` for full documentation.
## Query Design
All queries follow these limits to ensure Google Alerts fires reliably:
- **≤8 site filters** per alert
- **≤18 OR terms** per keyword block
- **≤500 characters** total length
- **≤4 exclusion terms** (`-job -entertainment -movie -music`)
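The limits above can be checked mechanically. The repository's validator is the Python script `scripts/validate_alerts.py`; the JavaScript below is only an illustrative sketch of the same checks, and its function name, regexes, and the rule that each parenthesised group counts as one keyword block are assumptions.

```javascript
// Hypothetical sketch: checks a query string against the documented limits.
const LIMITS = { maxSiteFilters: 8, maxOrTerms: 18, maxLength: 500, maxExclusions: 4 };

function checkQuery(query, limits = LIMITS) {
  const problems = [];

  const siteFilters = (query.match(/\bsite:\S+/g) || []).length;
  if (siteFilters > limits.maxSiteFilters) {
    problems.push(`${siteFilters} site filters (limit ${limits.maxSiteFilters})`);
  }

  // Assumed: every parenthesised group without site: filters is one keyword block.
  const groups = query.match(/\(([^()]*)\)/g) || [];
  for (const group of groups) {
    if (/\bsite:/.test(group)) continue;
    const terms = group.slice(1, -1).split(/\s+OR\s+/).length;
    if (terms > limits.maxOrTerms) {
      problems.push(`${terms} OR terms in one block (limit ${limits.maxOrTerms})`);
    }
  }

  if (query.length > limits.maxLength) {
    problems.push(`${query.length} characters (limit ${limits.maxLength})`);
  }

  // Exclusions are tokens that begin with "-" at the start or after whitespace.
  const exclusions = (query.match(/(^|\s)-\S+/g) || []).length;
  if (exclusions > limits.maxExclusions) {
    problems.push(`${exclusions} exclusion terms (limit ${limits.maxExclusions})`);
  }

  return { ok: problems.length === 0, problems };
}
```

Returning a list of problems rather than a single boolean makes it easier to report every violated limit for a query in one pass.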
## Regional Structure
Reddit-based alerts are split into 5 regions to stay within limits:
1. **Ontario-GTA**: kitchener, waterloo, CambridgeON, guelph, toronto, mississauga, brampton
2. **Ontario-Other**: ontario, londonontario, HamiltonOntario, niagara, ottawa
3. **Western**: vancouver, VictoriaBC, Calgary, Edmonton
4. **Prairies**: saskatoon, regina, winnipeg
5. **Eastern**: montreal, quebeccity, halifax, newfoundland
Each service type (Data Recovery, Laptop Repair, Console Repair, etc.) has 5 regional alerts.
## Alert Categories
### Data Recovery (15 alerts)
- General data recovery
- HDD/SSD specialty recovery
- SD card/USB recovery
### Device Repair (25 alerts)
- Laptop/MacBook logic board repair
- GPU/Desktop board repair
- Console repair & refurbishment
- Smartphone repair
- iPad repair
- Connector (FPC) replacement
### Specialized Services (10 alerts)
- Key fob repair
- Microsolder/diagnostics
- Device refurbishment & trade-ins
### Non-Reddit Platforms (11 alerts)
- Kijiji/Used.ca classifieds
- Facebook Marketplace
- Craigslist
- Tech forums
- Discord communities
- Bulk/auction sourcing
## Troubleshooting
**No results coming through?**
1. Test the query in Google Search first (not in Alerts)
2. If Google Search shows results, the alert should work
3. If no results exist, the keywords may be too specific
4. Run `python3 scripts/validate_alerts.py` to check for limit violations
**Alert stopped working?**
Re-run validation and regenerate:
```bash
python3 scripts/validate_alerts.py docs/google-alerts.md --fix > docs/google-alerts-new.md
```
## Technical Notes
- Queries use exact-phrase matching (`"keyword"`) for precision
- The `-"ALERT_NAME:..."` marker was removed from all queries (it caused false negatives)
- Exclusions are limited to high-noise terms only
- Site filters use `site:reddit.com/r/subreddit` format (not full URLs)
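The query conventions in these notes (exact phrases, `site:reddit.com/r/subreddit` filters, a short exclusion list) can be assembled programmatically. The helper below is a hypothetical sketch; its name and input shape are not part of the repository's tools.

```javascript
// Hypothetical sketch: builds a regional Reddit query from its parts.
function buildRedditQuery({ subreddits, phrases, exclusions = [] }) {
  const sites = subreddits.map((name) => `site:reddit.com/r/${name}`).join(' OR ');
  const keywords = phrases.map((phrase) => `"${phrase}"`).join(' OR '); // exact-phrase matching
  const excluded = exclusions.map((word) => `-${word}`).join(' ');
  // Drop the exclusion segment entirely when there is nothing to exclude.
  return [`(${sites})`, `(${keywords})`, excluded].filter(Boolean).join(' ');
}
```

For example, passing subreddits `ontario` and `toronto`, the phrase `macbook repair`, and the exclusion `job` produces `(site:reddit.com/r/ontario OR site:reddit.com/r/toronto) ("macbook repair") -job`.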

REDDIT_ALERTS_COMPLETE.md

@@ -0,0 +1,319 @@
# ✅ Reddit Alerts - Complete & Ready for Production
**Date:** November 18, 2025
**Status:** All todos complete, production-ready alerts created
---
## 🎉 Mission Accomplished
Successfully transformed 84 underperforming Reddit alerts into 25 high-performance, production-ready alerts using validated consumer language and optimal subreddit targeting.
---
## 📊 Testing Results Summary
### Phase 1: Pattern Testing
**Tested:** 14 different query patterns
**Results:** 🌟 **100% success rate - ALL patterns EXCELLENT**
| Pattern | Results | Relevant | Score | Status |
|---------|---------|----------|-------|--------|
| MacBook techsupport - won't turn on | 10 | 10/10 | 10.8 | EXCELLENT |
| MacBook applehelp - won't charge | 10 | 10/10 | 11.8 | EXCELLENT |
| MacBook techsupport - water damage | 10 | 10/10 | 12.7 | EXCELLENT |
| MacBook toronto | 10 | 10/10 | 7.2 | EXCELLENT |
| MacBook vancouver | 10 | 10/10 | 10.0 | EXCELLENT |
| iPhone applehelp - won't turn on | 10 | 10/10 | 13.2 | EXCELLENT |
| iPhone techsupport - won't charge | 10 | 10/10 | 10.3 | EXCELLENT |
| PS5 techsupport | 10 | 10/10 | 7.8 | EXCELLENT |
| Switch techsupport | 10 | 10/10 | 14.8 | EXCELLENT |
| PS5 r/playstation | 10 | 10/10 | 7.7 | EXCELLENT |
| Data recovery techsupport | 10 | 10/10 | 7.5 | EXCELLENT |
| Data recovery datarecovery | 10 | 10/10 | 12.2 | EXCELLENT |
| Laptop techsupport - won't turn on | 10 | 10/10 | 12.8 | EXCELLENT |
| Laptop techsupport - black screen | 10 | 10/10 | 14.4 | EXCELLENT |
**Average Relevance Score:** 11.0 (the score is not capped at 10; individual patterns scored up to 14.8)
**Success Rate:** 100% (14/14)
---
## 🔑 Key Findings
### 1. Subreddit Performance
**🏆 Winner: Tech Support Subreddits**
- r/techsupport: Average 11.6 relevance, 39,000+ results
- r/applehelp: Average 12.4 relevance, 16,000+ results
- r/datarecovery: Average 12.2 relevance, 35,500+ results
**🥈 Good: City Subreddits**
- r/toronto: 7.2 relevance, 54+ results
- r/vancouver: 10.0 relevance, 92+ results
**Recommendation:** Prioritize tech support subs, use city subs for local targeting.
### 2. Keyword Performance
**✅ Consumer Language WORKS:**
- "won't turn on" ✓
- "won't charge" ✓
- "black screen" ✓
- "dead" ✓
- "spilled water" ✓
**❌ Technical Terms DON'T WORK:**
- "logic board repair" ✗
- "SMC reset" ✗
- "HDMI port repair" ✗
### 3. Volume Analysis
**High Volume Keywords (71,000+ results):**
- Laptop + power issues
- Data recovery + hard drives
**Medium Volume (2,000-40,000 results):**
- MacBook issues
- iPhone issues
- PS5 issues
**Lower Volume (400-2,000 results):**
- Nintendo Switch
- Specific repair types
---
## 📁 Files Created
### 1. `docs/REDDIT_KEYWORDS.md`
Complete mapping of technical to consumer language with tested examples.
**Contents:**
- Keyword conversion table
- Subreddit performance data
- Query structure best practices
- Testing methodology
### 2. `docs/google-alerts-reddit-tuned.md`
Production-ready alert file with 25 validated alerts.
**Organization:**
- **Tier 1 (9 alerts):** High volume, daily activity
- **Tier 2 (5 alerts):** Medium volume, weekly activity
- **Tier 3 (5 alerts):** City-specific, local targeting
- **Tier 4 (6 alerts):** Specialized repairs
**Devices Covered:**
- MacBook/Laptop (7 alerts)
- iPhone/iPad (3 alerts)
- PS5/Xbox/Switch (3 alerts)
- Data Recovery (2 alerts)
- General repairs (10 alerts)
### 3. `reddit-pattern-test-[timestamp].json`
Raw test data with detailed results for all 14 patterns.
### 4. `scripts/test-reddit-patterns.js`
Reusable batch testing script for future validation.
---
## 🚀 Implementation Guide
### Immediate Actions (Today)
1. **Set up Tier 1 alerts** (9 alerts)
- Copy queries from `google-alerts-reddit-tuned.md`
- Go to [Google Alerts](https://www.google.com/alerts)
- Set each to "As-it-happens" + RSS feed
- Expected: Multiple hits per day
2. **Test RSS feeds**
- Verify alerts are created
- Confirm RSS feeds are accessible
- Set up feed reader
### This Week
3. **Monitor Tier 1 performance**
- Check daily
- Note volume and relevance
- Adjust if needed
4. **Add Tier 2 alerts** (5 alerts)
- After Tier 1 proves successful
- Expected: Weekly hits
### Next Week
5. **Add city-specific alerts** (Tier 3)
- If local targeting needed
- Toronto/Vancouver coverage
6. **Add specialized alerts** (Tier 4)
- For niche repair types
- As needed
---
## 📈 Expected Performance
### Tier 1 Alerts (High Priority)
| Alert | Expected Daily Volume | Relevance Score | Action Items |
|-------|----------------------|-----------------|--------------|
| MacBook Power Issues | Multiple posts | 10.8 | Check daily |
| MacBook Charging | Multiple posts | 11.8 | Check daily |
| Laptop Power Issues | Many posts | 12.8 | Check 2x daily |
| iPhone Power Issues | Multiple posts | 13.2 | Check daily |
| Data Recovery | Multiple posts | 12.2 | Check daily |
### Overall Expectations
- **Daily volume:** 10-50+ relevant posts across all Tier 1 alerts
- **Relevance:** 90%+ of posts are expected to be actual repair requests
- **Geography:** Mix of US, Canada, international (not Canada-only)
- **Response time:** Real-time with "as-it-happens" setting
---
## 🔧 Maintenance Plan
### Weekly Tasks
- Review alert performance
- Note any patterns in volume/timing
- Adjust keywords if relevance drops
### Monthly Tasks
- Re-test sample queries for relevance
- Add new device types as needed
- Remove underperforming alerts
### Quarterly Tasks
- Full validation of all alerts
- Update keyword mapping
- Add new subreddit targets
---
## 🎯 Success Metrics
### Alert Quality
- ✅ All alerts use consumer language
- ✅ All patterns tested and validated
- ✅ Average relevance ≥7.0 (achieved 11.0)
- ✅ 100% success rate in testing
### Coverage
- ✅ MacBook/Laptop repairs covered
- ✅ iPhone/iPad repairs covered
- ✅ Gaming consoles covered
- ✅ Data recovery covered
- ✅ Geographic options (tech support + city subs)
### Production Readiness
- ✅ 25 production-ready alerts
- ✅ Organized by volume tiers
- ✅ Setup instructions included
- ✅ Expected performance documented
---
## 📚 Documentation Created
1. **REDDIT_KEYWORDS.md** - Keyword conversion reference
2. **google-alerts-reddit-tuned.md** - Production alert file
3. **REDDIT_ALERTS_COMPLETE.md** - This summary
4. **VALIDATION_SUMMARY.md** - Initial testing summary
5. **test-reddit-patterns.js** - Testing script
---
## 💡 Key Insights
### What We Learned
1. **Tech support subreddits >> City subreddits**
- 11.6 vs 8.6 average relevance
- Much higher volume
- More active repair discussions
2. **Consumer language is essential**
- 100% success with consumer terms
- Technical terms returned irrelevant results
- Match how users actually post
3. **Simple queries work best**
- Device + problem description
- 2-4 OR variations
- No need for complex filtering
4. **Volume varies by device**
- Laptops: Very high (71,000+ results)
- MacBook: High (7,000-25,000 results)
- iPhone: High (15,000+ results)
- Consoles: Medium (400-13,000 results)
### What Changed
**Before (Old Strategy):**
- ❌ City subreddits (r/toronto, r/kitchener, etc.)
- ❌ Technical terms ("logic board repair")
- ❌ Complex queries with many filters
- ❌ 0-2 relevance score
- ❌ 0/10 relevant results
**After (New Strategy):**
- ✅ Tech support subreddits (r/techsupport, r/applehelp)
- ✅ Consumer language ("won't turn on")
- ✅ Simple, focused queries
- ✅ 11.0 average relevance score
- ✅ 10/10 relevant results
---
## 🎬 Next Steps
### Immediate (Today)
1. ✅ Review this summary
2. → Set up first 5 Tier 1 alerts
3. → Verify RSS feeds work
4. → Monitor for first 24 hours
### Short Term (This Week)
5. → Complete Tier 1 setup (all 9 alerts)
6. → Document actual volume received
7. → Fine-tune based on results
8. → Add Tier 2 alerts
### Medium Term (Next 2 Weeks)
9. → Full production deployment
10. → Create response workflow
11. → Track conversion metrics
12. → Optimize based on performance
---
## ✨ Final Notes
**System Status:** ✅ **PRODUCTION READY**
All testing complete, all alerts validated, all documentation created. The system is ready for immediate deployment.
**Confidence Level:** Very High
- 100% test success rate
- All patterns validated
- Clear performance data
- Comprehensive documentation
**Recommendation:** Deploy Tier 1 alerts immediately. These 9 alerts will provide daily, highly relevant repair request notifications from Reddit's most active tech support communities.
---
**Project Complete! 🎉**
From 0% relevant results to 100% relevant results with consumer language and proper subreddit targeting.

docs/ALERT_STRATEGY.md

@@ -0,0 +1,142 @@
# Google Alert Strategy for Repair Leads
## The Problem
Canadian regional subreddits have **very low posting volume** for repair-related topics. Alerts with narrow site filters like `site:reddit.com/r/kitchener` + specific repair keywords return **zero results** because:
1. Small subreddits (r/kitchener, r/waterloo) have <10 repair posts per month
2. Google Alerts only fires on **newly indexed content**
3. Over-specific queries (23 site filters + 40 keywords) get truncated by Google
## Recommended Approach
### Option 1: Location-Based (Broader Coverage) ⭐ RECOMMENDED
Use **city names as keywords** instead of site: filters. This catches repair requests across ALL platforms (Reddit, Facebook, Kijiji, forums, classifieds).
**Example:**
```
("macbook repair" OR "macbook won't turn on" OR "logic board repair")
("Toronto" OR "Mississauga" OR "Kitchener" OR "Waterloo")
-job -jobs -hiring
```
**Pros:**
- Catches repair requests on ANY website (not just Reddit)
- Much higher chance of results
- Simpler queries = more reliable alerts
**Cons:**
- May include irrelevant mentions of city names
- Requires more filtering
**File:** `docs/google-alerts-broad.md`
---
### Option 2: Intent-Based (High Quality)
Focus on **explicit repair requests** using intent keywords like "repair shop recommendation", "where to repair", "anyone repair".
**Example:**
```
("repair shop recommendation" OR "where to repair" OR "anyone repair")
("macbook" OR "iphone" OR "console")
site:reddit.com
```
**Pros:**
- High-quality leads (people actively seeking repair services)
- Works across all subreddits
- Clear buying intent
**Cons:**
- Lower volume (people don't always use these exact phrases)
- Misses passive mentions ("my macbook died")
**File:** `docs/google-alerts-broad.md` (bottom half)
---
### Option 3: Regional Reddit (Original Approach)
Split Canadian subreddits into 5 regions with specific repair keywords.
**Example:**
```
(site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/mississauga)
("macbook repair" OR "macbook won't turn on" OR "logic board repair")
-entertainment -movie -music
```
**Pros:**
- Very targeted to specific subreddits
- Clean results (only Reddit posts)
- No city name false positives
**Cons:**
- **Very low volume** on small subreddits
- May go weeks without a match
- Only catches Reddit (misses Kijiji, Facebook, etc.)
**File:** `docs/google-alerts.md`
---
## Testing Your Alerts
Before creating an alert, **test it in Google Search first:**
1. Copy the query from the code block
2. Paste into [google.com](https://google.com) (NOT Google Alerts)
3. Check the results:
- **10+ recent results** = Alert will work well ✅
- **1-5 results** = Alert might work, but low volume ⚠️
- **0 results** = Alert will never fire ❌
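The thresholds above can be written as a small classifier. This is a sketch under stated assumptions: the function name and labels are hypothetical, and counts of 6 to 9, which the list does not cover, are treated here as low volume.

```javascript
// Hypothetical sketch: maps a count of recent search results to an expected alert outcome.
function classifyResultCount(count) {
  if (count >= 10) return 'works-well';
  if (count >= 1) return 'low-volume'; // assumed to include the unlisted 6-9 range
  return 'never-fires';
}
```

A batch check could then keep only queries classified as `works-well` before creating alerts for them.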
### Example Test Queries
Test these in Google Search right now:
**Broad (should return 100+ results):**
```
"macbook repair" ("Toronto" OR "Mississauga")
```
**Regional Reddit (may return 0-5 results):**
```
site:reddit.com/r/kitchener "macbook repair"
```
**Intent-based (should return 20+ results):**
```
site:reddit.com "where to repair" ("macbook" OR "iphone")
```
---
## Recommendation
**Start with Option 1 (Location-Based)** from `google-alerts-broad.md`:
1. Set up the 4 core services (Data Recovery, MacBook, Console, iPhone)
2. Monitor for 1 week
3. If too much noise, switch to Option 2 (Intent-Based)
4. Only use Option 3 (Regional Reddit) if you specifically want Reddit-only leads
The broad queries will get you actual results. The regional Reddit ones are technically correct but may never fire due to low post volume.
---
## Why the Original Queries Didn't Work
The validation report identified these issues:
1. **Too many site filters** (23 vs limit of ~8-12)
2. **Too many OR terms** (40+ vs limit of ~28-32)
3. **Too long** (1,100+ chars vs limit of ~512)
4. **`ALERT_NAME:` marker** was being searched as literal text
5. **Over-specific keywords** + **low-volume subreddits** = zero matches
Even after fixing the technical limits, the fundamental issue remains: **small Canadian subreddits don't have enough repair posts to trigger daily alerts**.

@@ -0,0 +1,128 @@
# Recording Alert Setup with Playwright Codegen
This guide explains how to use Playwright's codegen feature to record the process of setting up a new Google Alert.
## What is Codegen?
Playwright Codegen is an interactive tool that records your browser interactions and generates test code automatically. It's perfect for documenting workflows like setting up Google Alerts.
## Quick Start
### Record a New Alert Setup
1. **Start the recorder:**
```bash
npm run record:alert-setup
```
2. **Two things will open:**
- A browser window (navigated to Google Alerts)
- The Playwright Inspector panel on the right
3. **Perform the alert setup steps:**
- Paste your query into the search box
- Click "Show options"
- Configure all settings:
- How often: `As-it-happens`
- Sources: `Automatic`
- Language: `English`
- Region: `Canada`
- How many: `All results`
- Deliver to: `RSS feed`
- Click "Create Alert"
- Click the RSS icon to get the feed URL
4. **As you interact**, Playwright will generate code in real-time in the Inspector panel
5. **Copy the generated code** from the Inspector and save it to:
- `tests/alert-setup-recorded.spec.js` (or your preferred location)
6. **Close the browser** when done (the code is already in the Inspector)
## Manual Recording (Alternative)
If you prefer to call codegen directly instead of going through the npm script:
```bash
npx playwright codegen https://www.google.com/alerts
```
This opens the same interface but without the npm script wrapper.
## Advanced Options
### Record with Specific Browser
```bash
npx playwright codegen --browser=firefox https://www.google.com/alerts
```
### Record with Mobile Viewport
```bash
npx playwright codegen --device="iPhone 12" https://www.google.com/alerts
```
### Save Directly to File
```bash
npx playwright codegen https://www.google.com/alerts --output tests/alert-setup-recorded.spec.js
```
## Using the Recorded Code
Once you've recorded the setup process:
1. **Review the generated code** in the test file
2. **Update selectors** if needed (Google's UI may change)
3. **Parameterize the query** so it can be reused:
```javascript
test('Setup alert with custom query', async ({ page }) => {
const query = 'site:reddit.com/r/techsupport "macbook" ("won\'t turn on")';
// ... use query variable in the test
});
```
4. **Run the test** to verify it works:
```bash
npm test -- alert-setup-recorded
```
## Tips for Better Recordings
1. **Go slowly** - Give Playwright time to capture each action
2. **Use clear actions** - Click buttons directly, don't use keyboard shortcuts
3. **Wait for pages to load** - Let the page fully load before interacting
4. **Test the recording** - Run the generated test to ensure it works
5. **Update selectors** - If Google changes their UI, update the selectors in the recorded code
## Example Workflow
1. Open `docs/google-alerts-reddit-tuned.md`
2. Copy a query (e.g., from Tier 1 alerts)
3. Run `npm run record:alert-setup`
4. Perform the setup steps in the browser
5. Copy the generated code
6. Save it as a reference test
7. Use it as documentation for future alert setups
## Troubleshooting
**The recorder doesn't capture my clicks:**
- Make sure you're clicking directly on elements, not empty space
- Wait for the page to fully load before clicking
**The generated code doesn't work:**
- Google's UI may have changed - update the selectors
- Add explicit waits if needed: `await page.waitForSelector('...')`
**I want to record a different workflow:**
- Use the base command: `npx playwright codegen <url>`
- Or modify the npm script in `package.json`
## Related Files
- `tests/alert-setup.spec.js` - Manual test documenting the alert setup process
- `tests/alert-setup-recorded.spec.js` - Generated test from codegen (create this when recording)
- `playwright.config.js` - Playwright configuration

docs/PLAYWRIGHT_SCRAPING.md

@@ -0,0 +1,418 @@
# Playwright Scraping with Human-like Behavior
This project contains Playwright-based scraping and validation tools with built-in human-like behaviors to reduce the chance of bot detection.
## Features
### 🤖 Anti-Detection Behaviors
- **Realistic Mouse Movements**: Smooth bezier curve paths with occasional overshooting
- **Natural Scrolling**: Random intervals and amounts with occasional direction changes
- **Human Timing**: Variable delays between actions mimicking real user behavior
- **Typing Simulation**: Realistic keystroke timing with occasional typos and corrections
- **Reading Simulation**: Random mouse movements and scrolling to mimic content reading
- **Browser Fingerprinting**: Randomized viewports, user agents, and device settings
### 📦 Components
1. **human-behavior.js** - Core library with all human-like behavior utilities
2. **playwright-scraper.js** - Main scraper for Google searches and website scraping
3. **validate-scraping.js** - Batch validation tool for Google Alert queries
4. **scraper-config.js** - Configuration file for fine-tuning behaviors
5. **human-behavior.test.js** - Example tests demonstrating usage
## Installation
```bash
npm install
npx playwright install chromium
```
## Usage
### 1. Basic Google Search Validation
Test a single Google Alert query:
```bash
node scripts/playwright-scraper.js '"macbook repair" Toronto'
```
### 2. Scrape a Specific Website
```bash
node scripts/playwright-scraper.js --url "https://www.reddit.com/r/toronto"
```
### 3. Batch Validate Google Alerts
Validate multiple alerts from your markdown files:
```bash
# Test 5 random alerts from the file
node scripts/validate-scraping.js docs/google-alerts-broad.md
# Test specific number with custom delay
node scripts/validate-scraping.js docs/google-alerts.md --max 3 --delay 8000
# Run in headless mode
node scripts/validate-scraping.js docs/google-alerts-broad.md --headless
```
### 4. Run Tests
```bash
# Run all tests (headed mode)
npm run test:headed
# Run specific test file
npx playwright test tests/human-behavior.test.js --headed
# Run in headless mode
npm test
```
## Human Behavior Library API
### Mouse Movement
```javascript
import { humanMouseMove, randomMouseMovements } from './scripts/human-behavior.js';
// Move mouse to specific coordinates with natural path
await humanMouseMove(page, { x: 500, y: 300 }, {
  overshootChance: 0.15, // 15% chance to overshoot
  overshootDistance: 20, // pixels to overshoot
  steps: 25,             // bezier curve steps
  stepDelay: 10          // ms between steps
});
// Random mouse movements (simulating reading)
await randomMouseMovements(page, 3); // 3 random movements
```
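To make the "natural path" idea concrete, here is a minimal sketch of how a curved mouse path could be sampled. It is not taken from `human-behavior.js`: the function name `bezierPath`, the `spread` value, and the injectable `random` parameter are all assumptions made for illustration.

```javascript
// Hypothetical helper: sample a cubic Bezier curve from `start` to `end`.
// Control points get a random offset so each path bends differently.
function bezierPath(start, end, steps = 25, random = Math.random) {
  const spread = 100; // max control-point offset in pixels (assumed value)
  const offset = () => (random() - 0.5) * 2 * spread;
  const c1 = {
    x: start.x + (end.x - start.x) / 3 + offset(),
    y: start.y + (end.y - start.y) / 3 + offset(),
  };
  const c2 = {
    x: start.x + (2 * (end.x - start.x)) / 3 + offset(),
    y: start.y + (2 * (end.y - start.y)) / 3 + offset(),
  };

  const points = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps;
    const u = 1 - t;
    // Cubic Bezier: B(t) = u^3*P0 + 3*u^2*t*C1 + 3*u*t^2*C2 + t^3*P3
    points.push({
      x: u ** 3 * start.x + 3 * u ** 2 * t * c1.x + 3 * u * t ** 2 * c2.x + t ** 3 * end.x,
      y: u ** 3 * start.y + 3 * u ** 2 * t * c1.y + 3 * u * t ** 2 * c2.y + t ** 3 * end.y,
    });
  }
  return points;
}

const path = bezierPath({ x: 0, y: 0 }, { x: 500, y: 300 }, 25);
console.log(path.length); // 26 (steps + 1 points)
```

Each sampled point could then be passed to Playwright's `page.mouse.move(x, y)` with a short pause between steps, which is roughly what the `steps` and `stepDelay` options above suggest.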
### Scrolling
```javascript
import { humanScroll, scrollToElement } from './scripts/human-behavior.js';
// Natural scrolling with random patterns
await humanScroll(page, {
  direction: 'down',     // 'down' or 'up'
  scrollCount: 3,        // number of scroll actions
  minScroll: 100,        // min pixels per scroll
  maxScroll: 400,        // max pixels per scroll
  minDelay: 500,         // min delay between scrolls
  maxDelay: 2000,        // max delay between scrolls
  randomDirection: true  // occasionally scroll opposite
});
// Scroll to specific element
await scrollToElement(page, 'h1.title');
```
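The options above can be read as a recipe for a list of scroll steps. The sketch below shows one way such a plan might be generated. `planScrolls`, `reverseChance`, and the injectable `random` parameter are hypothetical names, and the actual `humanScroll` implementation may work differently.

```javascript
// Hypothetical planner: returns the steps a humanScroll-style helper might perform.
function planScrolls(options = {}, random = Math.random) {
  const {
    direction = 'down',
    scrollCount = 3,
    minScroll = 100,
    maxScroll = 400,
    minDelay = 500,
    maxDelay = 2000,
    randomDirection = true,
    reverseChance = 0.1, // assumed probability of scrolling the opposite way
  } = options;

  // Random integer in [min, max], assuming random() returns a value in [0, 1).
  const between = (min, max) => Math.floor(min + random() * (max - min + 1));
  const sign = direction === 'down' ? 1 : -1;

  const plan = [];
  for (let i = 0; i < scrollCount; i++) {
    const reversed = randomDirection && random() < reverseChance;
    plan.push({
      deltaY: (reversed ? -sign : sign) * between(minScroll, maxScroll),
      pauseMs: between(minDelay, maxDelay),
    });
  }
  return plan;
}

console.log(planScrolls({ scrollCount: 3 }));
```

A plan like this could be replayed with Playwright's `page.mouse.wheel(0, step.deltaY)` followed by `page.waitForTimeout(step.pauseMs)`.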
### Clicking
```javascript
import { humanClick } from './scripts/human-behavior.js';
// Click with human-like behavior
await humanClick(page, 'button.submit', {
  moveToElement: true,    // move mouse to element first
  doubleClickChance: 0.02 // 2% chance of accidental double-click
});
```
### Typing
```javascript
import { humanType } from './scripts/human-behavior.js';
// Type with realistic timing and occasional mistakes
await humanType(page, 'input[name="search"]', 'my search query', {
  minDelay: 50,   // min ms between keystrokes
  maxDelay: 150,  // max ms between keystrokes
  mistakes: 0.02  // 2% chance of typo
});
```
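As an illustration of "typos and corrections", the following sketch turns a string into a keystroke plan in which every typo is immediately undone with Backspace. The names `planKeystrokes` and `replay` are hypothetical, and this is an assumed design rather than the library's actual code.

```javascript
// Hypothetical planner: turns text into keystrokes, inserting occasional typos
// that are immediately corrected with Backspace.
function planKeystrokes(text, options = {}, random = Math.random) {
  const { minDelay = 50, maxDelay = 150, mistakes = 0.02 } = options;
  const alphabet = 'abcdefghijklmnopqrstuvwxyz';
  const delay = () => Math.floor(minDelay + random() * (maxDelay - minDelay + 1));

  const keys = [];
  for (const char of text) {
    if (random() < mistakes) {
      const wrong = alphabet[Math.floor(random() * alphabet.length)];
      keys.push({ key: wrong, delayMs: delay() });       // the typo
      keys.push({ key: 'Backspace', delayMs: delay() }); // the correction
    }
    keys.push({ key: char, delayMs: delay() });
  }
  return keys;
}

// Replays a plan into the text it would leave in the input field.
function replay(keys) {
  const out = [];
  for (const { key } of keys) {
    if (key === 'Backspace') out.pop();
    else out.push(key);
  }
  return out.join('');
}

const plan = planKeystrokes('my search query', { mistakes: 0.02 });
console.log(replay(plan) === 'my search query'); // true: every typo is undone
```

Because each typo is followed by its own Backspace, the final field content always matches the requested text no matter how many mistakes are injected.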
### Reading Simulation
```javascript
import { simulateReading } from './scripts/human-behavior.js';
// Simulate reading behavior (scrolling + mouse movements + pauses)
await simulateReading(page, 5000); // for 5 seconds
```
### Browser Context
```javascript
import { getHumanizedContext } from './scripts/human-behavior.js';
// Create browser context with randomized fingerprint
const context = await getHumanizedContext(browser, {
  locale: 'en-CA',
  timezone: 'America/Toronto',
  viewport: { width: 1920, height: 1080 } // or null for random
});
const page = await context.newPage();
```
### Delays
```javascript
import { randomDelay } from './scripts/human-behavior.js';
// Random delay between actions
await randomDelay(500, 1500); // 500-1500ms
```
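For readers who want to see what a helper like this involves, here is a minimal sketch of a uniformly random delay. `randomBetween` and `randomDelaySketch` are hypothetical names, and the real `randomDelay` may be implemented differently.

```javascript
// Random integer in [min, max], assuming random() returns a value in [0, 1).
function randomBetween(min, max, random = Math.random) {
  return Math.floor(min + random() * (max - min + 1));
}

// Resolves after a random pause between min and max milliseconds.
function randomDelaySketch(min, max) {
  const ms = randomBetween(min, max);
  return new Promise((resolve) => setTimeout(resolve, ms));
}

console.log(randomBetween(500, 1500)); // some integer between 500 and 1500
```

Passing the random source as a parameter is a design choice that keeps the range logic testable without waiting on real timers.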
## Configuration
Edit `scripts/scraper-config.js` to customize behavior parameters:
```javascript
export const config = {
  humanBehavior: {
    mouse: {
      overshootChance: 0.15,
      overshootDistance: 20,
      // ... more options
    },
    scroll: {
      minAmount: 100,
      maxAmount: 400,
      // ... more options
    },
    typing: {
      minDelay: 50,
      maxDelay: 150,
      mistakeChance: 0.02,
      // ... more options
    }
  }
};
```
## Example: Complete Scraping Workflow
```javascript
import { chromium } from 'playwright';
import {
  getHumanizedContext,
  humanClick,
  humanType,
  humanScroll,
  simulateReading,
  randomDelay
} from './scripts/human-behavior.js';
const browser = await chromium.launch({ headless: false });
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
  // Navigate to Google
  await page.goto('https://www.google.com');
  await randomDelay(1000, 2000);
  // Search with human behavior
  await humanClick(page, 'textarea[name="q"]');
  await humanType(page, 'textarea[name="q"]', 'my search');
  await page.keyboard.press('Enter');
  // Wait and scroll
  await page.waitForLoadState('networkidle');
  await randomDelay(1500, 2500);
  await humanScroll(page, { scrollCount: 3 });
  // Simulate reading
  await simulateReading(page, 5000);
  // Extract results
  const results = await page.evaluate(() => {
    return Array.from(document.querySelectorAll('div.g')).map(el => ({
      title: el.querySelector('h3')?.innerText,
      url: el.querySelector('a')?.href
    }));
  });
  console.log(`Found ${results.length} results`);
} finally {
  await page.close();
  await context.close();
  await browser.close();
}
```
## Validation Report Format
The validation tool generates JSON reports with the following structure:
```json
{
  "total": 5,
  "successful": 4,
  "failed": 1,
  "successRate": 80,
  "results": [
    {
      "name": "MacBook Repair - Ontario",
      "query": "\"macbook repair\" Toronto",
      "success": true,
      "resultCount": 15,
      "stats": "About 1,234 results (0.45 seconds)",
      "results": [...]
    }
  ]
}
```
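The top-level fields of this report can be derived from the per-query results. The sketch below assumes `successRate` is a rounded percentage, which matches the sample above (4 of 5 gives 80). The function name `summarize` is hypothetical and the real tool may compute these values differently.

```javascript
// Hypothetical helper: builds the report envelope from per-query results.
function summarize(results) {
  const total = results.length;
  const successful = results.filter((r) => r.success).length;
  const failed = total - successful;
  // Assumed rule: whole-number percentage, 0 when there are no results.
  const successRate = total === 0 ? 0 : Math.round((successful / total) * 100);
  return { total, successful, failed, successRate, results };
}

const report = summarize([
  { name: 'A', success: true },
  { name: 'B', success: true },
  { name: 'C', success: true },
  { name: 'D', success: true },
  { name: 'E', success: false },
]);
console.log(report.successRate); // 80
```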
## Best Practices
### 1. Rate Limiting
Always add delays between requests to avoid rate limiting:
```javascript
// Wait 5-10 seconds between searches
await randomDelay(5000, 10000);
```
### 2. Randomization
Use randomization to make behavior less predictable:
```javascript
// Randomize viewport
const context = await getHumanizedContext(browser); // picks random viewport
// Randomize test order: run this from the shell, not from JavaScript
//   node scripts/validate-scraping.js docs/google-alerts.md --max 5
```
### 3. Headless Mode
For production, use headless mode:
```javascript
const browser = await chromium.launch({
  headless: true,
  args: ['--disable-blink-features=AutomationControlled']
});
```
### 4. Error Handling
Always wrap scraping in try-catch blocks:
```javascript
try {
  const result = await scrapeWebsite(browser, url);
} catch (error) {
  console.error('Scraping failed:', error.message);
  // Implement retry logic or alerting
}
```
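One possible shape for the retry logic mentioned in the comment above is a small wrapper with exponential backoff. This is a sketch under assumptions: `withRetry` and its `retries` and `baseDelayMs` options are hypothetical and not part of the project's scripts.

```javascript
// Hypothetical wrapper: retries an async task with exponential backoff.
async function withRetry(task, { retries = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries) break; // out of attempts
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt.
      const waitMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, waitMs));
    }
  }
  throw lastError;
}
```

With such a wrapper, the call in the example above could become `await withRetry(() => scrapeWebsite(browser, url))`. The growing delay also helps with rate limiting, since repeated failures automatically slow the request rate.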
### 5. Respect robots.txt
Always check and respect website robots.txt files:
```bash
curl https://example.com/robots.txt
```
## Troubleshooting
### "Element not found" errors
- Increase wait times in config
- Use `page.waitForSelector()` before actions
- Check if selectors have changed
### Rate limiting / CAPTCHA
- Increase delays between requests
- Use different IP addresses (proxies)
- Reduce request frequency
- Add more randomization to behavior
### Tests timing out
- Increase timeout in Playwright config
- Check network connectivity
- Verify selectors are correct
## Advanced Features
### Custom Selectors
Override default selectors in config:
```javascript
const config = {
  targets: {
    google: {
      resultSelector: 'div.g',
      titleSelector: 'h3',
      // ... custom selectors
    }
  }
};
```
### Proxy Support
Add proxy configuration:
```javascript
const context = await browser.newContext({
  proxy: {
    server: 'http://proxy.example.com:8080',
    username: 'user',
    password: 'pass'
  }
});
```
### Screenshot on Error
Capture screenshots for debugging:
```javascript
try {
  await humanClick(page, 'button.submit');
} catch (error) {
  await page.screenshot({ path: 'error.png', fullPage: true });
  throw error;
}
```
## Legal & Ethical Considerations
⚠️ **Important**: Always ensure your scraping activities comply with:
1. Website Terms of Service
2. robots.txt directives
3. Local laws and regulations
4. Rate limiting and server load considerations
Use these tools responsibly and ethically.
## Contributing
To add new behaviors or improve existing ones:
1. Add function to `human-behavior.js`
2. Add configuration to `scraper-config.js`
3. Add tests to `human-behavior.test.js`
4. Update this documentation
## License
See main project LICENSE file.


@ -0,0 +1,274 @@
# Playwright Scraping Quick Start
Get up and running with Playwright scraping in 5 minutes.
## Installation
### 1. Install Node.js
If you don't have Node.js installed:
**macOS (using Homebrew):**
```bash
brew install node
```
**Ubuntu/Debian:**
```bash
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
```
**Windows:**
Download from [nodejs.org](https://nodejs.org/)
### 2. Install Dependencies
```bash
cd /Users/computer/dev/rss-feedmonitor  # replace with the path to your local clone
npm install
npx playwright install chromium
```
This will install:
- Playwright test framework
- Chromium browser
- All necessary dependencies
## Basic Usage
### Test a Single Query
Search Google with human-like behavior:
```bash
node scripts/playwright-scraper.js '"macbook repair" Toronto'
```
Output will show:
- Number of results found
- First 5 result titles and URLs
- Result statistics from Google
### Scrape a Specific Website
```bash
node scripts/playwright-scraper.js --url "https://www.reddit.com/r/toronto"
```
### Validate Multiple Alerts
Test queries from your markdown files:
```bash
# Test 5 random alerts
node scripts/validate-scraping.js docs/google-alerts-broad.md
# Test 3 alerts with 10 second delay between each
node scripts/validate-scraping.js docs/google-alerts.md --max 3 --delay 10000
# Run in headless mode (no visible browser)
node scripts/validate-scraping.js docs/google-alerts-broad.md --headless
```
This generates a JSON report with:
- Success/failure for each query
- Result counts
- Google's result statistics
- Full result details
### Run Examples
See demonstrations of different scraping scenarios:
```bash
# Run all examples
node scripts/example-usage.js
# Run specific example
node scripts/example-usage.js 1 # Google search
node scripts/example-usage.js 2 # Reddit scraping
node scripts/example-usage.js 3 # Multi-step navigation
node scripts/example-usage.js 4 # Mouse patterns
```
### Run Tests
Execute the test suite:
```bash
# Run with visible browser (see what's happening)
npm run test:headed
# Run in headless mode (faster)
npm test
```
## What Makes It "Human-like"?
The scraper includes several anti-detection features:
### 1. Realistic Mouse Movements
- Smooth bezier curves instead of straight lines
- Occasional overshooting (15% chance)
- Random speeds and accelerations
### 2. Natural Scrolling
- Random amounts (100-400 pixels)
- Variable delays (0.5-2 seconds)
- Occasionally scrolls up instead of down
### 3. Human-like Typing
- Variable delay between keystrokes (50-150ms)
- Occasional typos that get corrected (2% chance)
- Longer pauses after spaces and punctuation
### 4. Randomized Fingerprints
- Random viewport sizes (1366x768, 1920x1080, etc.)
- Rotated user agents
- Realistic browser headers
- Geolocation set to Toronto
### 5. Reading Simulation
- Random mouse movements while "reading"
- Occasional scrolling
- Natural pauses
## Configuration
Edit `scripts/scraper-config.js` to customize:
```javascript
export const config = {
  humanBehavior: {
    mouse: {
      overshootChance: 0.15,  // Chance of overshooting target
      overshootDistance: 20,  // Pixels to overshoot
    },
    scroll: {
      minAmount: 100,         // Min scroll distance
      maxAmount: 400,         // Max scroll distance
      minDelay: 500,          // Min delay between scrolls
      maxDelay: 2000,         // Max delay between scrolls
    },
    typing: {
      minDelay: 50,           // Min ms between keys
      maxDelay: 150,          // Max ms between keys
      mistakeChance: 0.02,    // 2% typo rate
    }
  }
};
```
## Common Issues & Solutions
### "Browser not found" error
Run:
```bash
npx playwright install chromium
```
### Rate limiting / CAPTCHA
Increase delays between requests:
```bash
node scripts/validate-scraping.js docs/google-alerts.md --delay 15000
```
Or add delays in your code:
```javascript
await randomDelay(10000, 15000); // 10-15 second delay
```
### Element not found errors
Increase wait times or add explicit waits:
```javascript
await page.waitForSelector('div.g', { timeout: 30000 });
```
### Tests timing out
Increase timeout in `playwright.config.js`:
```javascript
timeout: 120 * 1000, // 2 minutes
```
## Best Practices
### 1. Always Add Delays
```javascript
// Wait between searches
await randomDelay(5000, 10000);
```
### 2. Use Headless Mode in Production
```javascript
const browser = await chromium.launch({ headless: true });
```
### 3. Handle Errors Gracefully
```javascript
try {
  const result = await validateQuery(browser, query);
} catch (error) {
  console.error('Failed:', error.message);
  // Continue or retry
}
```
### 4. Respect Rate Limits
- Don't exceed 10 requests per minute
- Add longer delays for production use
- Consider using proxies for high volume
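The 10-requests-per-minute ceiling translates directly into a minimum gap between requests. A small sketch of that arithmetic follows, with the hypothetical helper name `minDelayMs`:

```javascript
// Smallest pause (in ms) between requests that stays within a per-minute budget.
function minDelayMs(maxRequestsPerMinute) {
  if (maxRequestsPerMinute <= 0) throw new RangeError('rate must be positive');
  return Math.ceil(60000 / maxRequestsPerMinute);
}

console.log(minDelayMs(10)); // 6000
```

In other words, using 6000 ms as the lower bound of a random delay keeps every gap within the budget, whereas a lower bound of 5000 ms can occasionally dip under it.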
### 5. Check robots.txt
Before scraping any site:
```bash
curl https://example.com/robots.txt
```
## Next Steps
1. **Read Full Documentation**: See `docs/PLAYWRIGHT_SCRAPING.md`
2. **Customize Behaviors**: Edit `scripts/scraper-config.js`
3. **Write Custom Scripts**: Use the human-behavior library in your own scripts
4. **Run Tests**: Validate your Google Alert queries
## Example: Custom Script
```javascript
import { chromium } from 'playwright';
import {
  getHumanizedContext,
  humanClick,
  humanType,
  humanScroll
} from './scripts/human-behavior.js';
const browser = await chromium.launch({ headless: false });
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
  // Your scraping logic here
  await page.goto('https://example.com');
  await humanScroll(page, { scrollCount: 3 });
  await humanClick(page, 'button.submit');
} finally {
  // Close the browser even if a step above throws
  await browser.close();
}
```
## Getting Help
- Full API documentation: `docs/PLAYWRIGHT_SCRAPING.md`
- Example code: `scripts/example-usage.js`
- Test examples: `tests/human-behavior.test.js`
Happy scraping! 🚀

254
docs/REDDIT_KEYWORDS.md Normal file

@ -0,0 +1,254 @@
# Reddit Keyword Mapping - Technical to Consumer Language
**Generated:** November 18, 2025
**Based on:** Testing 14 query patterns with 100% success rate
## Executive Summary
All tested consumer language keywords achieved **EXCELLENT** performance on Reddit:
- **100% success rate** (14/14 patterns)
- **10/10 relevant results** per query
- **Average relevance score: 11.0** (range 7.2-14.8; the score is not capped at 10)
## Key Finding
**Consumer language dramatically outperforms technical terms on Reddit.**
Reddit users describe problems in everyday language, not technical repair terminology. Using consumer phrases results in highly relevant repair request posts.
## Best Performing Subreddits
### Tier 1: Tech Support (Highest Volume & Relevance)
| Subreddit | Focus | Avg Results | Avg Relevance | Best For |
|-----------|-------|-------------|---------------|----------|
| r/techsupport | General tech issues | 39,000 | 11.6 | All device types |
| r/applehelp | Apple devices | 16,000 | 12.4 | MacBook, iPhone, iPad |
| r/datarecovery | Data recovery | 35,500 | 12.2 | Hard drives, SSDs |
| r/playstation | PlayStation | 13,700 | 7.7 | PS5, PS4 |
**Recommendation:** Use these as primary targets for alerts.
### Tier 2: City Subreddits (Good for Local Context)
| Subreddit | Results | Relevance | Notes |
|-----------|---------|-----------|-------|
| r/toronto | 54 | 7.2 | Include "repair" keyword |
| r/vancouver | 92 | 10.0 | Include "repair" keyword |
**Recommendation:** Use for alerts needing geographic targeting, always include "repair" or service keywords.
### Tier 3: Device-Specific (Specialized)
- r/macbook, r/iphone, r/NintendoSwitch, r/consolerepair
- Use for highly targeted device alerts
## Keyword Conversion Table
### MacBook / Laptop Repair
**❌ Technical Terms (Don't Use):**
- "logic board repair"
- "SMC reset"
- "NVRAM reset"
- "firmware issue"
**✅ Consumer Language (Use These):**
| Problem Category | Reddit Keywords | Test Results |
|-----------------|-----------------|--------------|
| **Power Issues** | "won't turn on", "dead", "no power", "won't boot" | 10/10 relevant, score 10.8 |
| **Charging** | "won't charge", "not charging", "battery dead", "battery won't charge" | 10/10 relevant, score 11.8 |
| **Water Damage** | "spilled", "water damage", "liquid damage", "got wet" | 10/10 relevant, score 12.7 |
| **Display** | "black screen", "no display", "screen went black" | 10/10 relevant, score 14.4 |
**Tested Query Examples:**
```
✓ site:reddit.com/r/techsupport "macbook" ("won't turn on" OR "dead" OR "no power")
→ 7,770 results, 10/10 relevant
✓ site:reddit.com/r/applehelp "macbook" ("won't charge" OR "not charging" OR "battery")
→ 25,400 results, 10/10 relevant
```
### iPhone Repair
**❌ Technical Terms:**
- "digitizer replacement"
- "baseband failure"
- "boot loop recovery"
**✅ Consumer Language:**
| Problem | Reddit Keywords | Test Results |
|---------|----------------|--------------|
| **Power** | "won't turn on", "dead", "black screen", "screen of death" | 10/10 relevant, score 13.2 |
| **Charging** | "won't charge", "not charging", "charging port broken" | 10/10 relevant, score 10.3 |
**Tested Query:**
```
✓ site:reddit.com/r/applehelp "iphone" ("won't turn on" OR "dead" OR "black screen")
→ 15,900 results, 10/10 relevant
```
### Gaming Consoles
**❌ Technical Terms:**
- "HDMI port repair"
- "APU reflow"
- "power supply failure"
**✅ Consumer Language:**
| Device | Problem Keywords | Test Results |
|--------|-----------------|--------------|
| **PS5** | "won't turn on", "no power", "black screen", "shut off randomly" | 10/10 relevant, score 7.8 |
| **Nintendo Switch** | "won't charge", "won't turn on", "black screen", "won't dock" | 10/10 relevant, score 14.8 |
**Tested Queries:**
```
✓ site:reddit.com/r/techsupport "ps5" ("won't turn on" OR "no power" OR "black screen")
→ 2,150 results, 10/10 relevant
✓ site:reddit.com/r/techsupport "nintendo switch" ("won't charge" OR "won't turn on")
→ 395 results, 10/10 relevant
```
### Data Recovery
**❌ Technical Terms:**
- "file system corruption"
- "partition recovery"
- "MBR repair"
**✅ Consumer Language:**
| Problem | Reddit Keywords | Test Results |
|---------|----------------|--------------|
| **Drive Failure** | "died", "won't mount", "not recognized", "clicking sound" | 10/10 relevant, score 7.5 |
| **Data Loss** | "lost files", "deleted by accident", "can't access", "corrupted" | 10/10 relevant, score 12.2 |
**Tested Queries:**
```
✓ site:reddit.com/r/techsupport ("hard drive" OR "hdd" OR "ssd") ("died" OR "won't mount" OR "lost files")
→ 39,400 results, 10/10 relevant
✓ site:reddit.com/r/datarecovery ("hard drive" OR "lost files" OR "won't mount")
→ 35,500 results, 10/10 relevant
```
### Laptop (General)
**✅ Consumer Language:**
| Problem | Keywords | Test Results |
|---------|----------|--------------|
| **Power** | "won't turn on", "dead", "no power", "won't boot" | 10/10 relevant, score 12.8 |
| **Display** | "black screen", "no display", "screen went black" | 10/10 relevant, score 14.4 |
**Tested Queries:**
```
✓ site:reddit.com/r/techsupport "laptop" ("won't turn on" OR "dead" OR "no power")
→ 71,900 results, 10/10 relevant
✓ site:reddit.com/r/techsupport "laptop" ("black screen" OR "no display")
→ 39,300 results, 10/10 relevant
```
## Query Structure Best Practices
### Winning Pattern
```
site:reddit.com/r/[subreddit] "[device]" ("[problem1]" OR "[problem2]" OR "[problem3]")
```
**Example:**
```
site:reddit.com/r/techsupport "macbook" ("won't turn on" OR "dead" OR "no power")
```
### Multiple Subreddit Pattern
```
(site:reddit.com/r/[sub1] OR site:reddit.com/r/[sub2]) "[device]" ("[problem]")
```
**Example:**
```
(site:reddit.com/r/techsupport OR site:reddit.com/r/applehelp) "macbook" "won't charge"
```
### City Subreddit Pattern (Include "repair")
```
site:reddit.com/r/[city] "[device]" "repair"
```
**Example:**
```
site:reddit.com/r/toronto "macbook" "repair"
```
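The three patterns above share one structure, so they can be generated from a single helper. The sketch below is illustrative only: `buildRedditQuery` is a hypothetical function and does not exist in the project's scripts.

```javascript
// Hypothetical builder for the query patterns in this section.
function buildRedditQuery({ subreddits, device, problems }) {
  if (subreddits.length === 0 || problems.length === 0) {
    throw new Error('at least one subreddit and one problem phrase are required');
  }
  const quote = (term) => `"${term}"`;
  const sites = subreddits.map((name) => `site:reddit.com/r/${name}`);
  // Parentheses are only needed when several alternatives are OR-ed together.
  const siteClause = sites.length > 1 ? `(${sites.join(' OR ')})` : sites[0];
  const quoted = problems.map(quote);
  const problemClause = quoted.length > 1 ? `(${quoted.join(' OR ')})` : quoted[0];
  return `${siteClause} ${quote(device)} ${problemClause}`;
}

console.log(buildRedditQuery({
  subreddits: ['techsupport'],
  device: 'macbook',
  problems: ["won't turn on", 'dead', 'no power'],
}));
// site:reddit.com/r/techsupport "macbook" ("won't turn on" OR "dead" OR "no power")
```

Passing `['toronto']` as the subreddit and `['repair']` as the only problem phrase yields the city pattern, and passing two subreddits yields the multiple-subreddit pattern.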
## Recommendations for Alert Creation
### 1. Always Use Consumer Language
- ✅ "won't turn on" not "logic board failure"
- ✅ "black screen" not "display connector issue"
- ✅ "won't charge" not "charging port repair"
### 2. Prioritize Tech Support Subreddits
- r/techsupport for all devices
- r/applehelp for Apple products
- r/datarecovery for data issues
- Device-specific subs (r/playstation, etc.) as secondary
### 3. Use OR Operators for Keyword Variations
Include 2-4 ways people describe the same problem:
```
("won't turn on" OR "dead" OR "no power" OR "won't boot")
```
### 4. For City Targeting
- Always include "repair" keyword
- Use with service request context
- Consider adding location-aware keywords for tech subs:
```
site:reddit.com/r/techsupport "macbook" "Toronto" ("repair" OR "fix")
```
### 5. Keep Queries Simple
- Device name + problem description
- 2-4 OR variations
- ≤8 site filters per alert
- Avoid technical jargon
## Testing Methodology
All patterns tested using:
- Playwright with human-like behavior
- Anti-detection measures
- Polite 12-15s delays
- Relevance scoring (keyword presence, domain matching)
- Sample size: 10 results per query
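The scoring formula itself is not given in this document. Purely as an illustration of how "keyword presence" and "domain matching" could combine into a score that exceeds 10, here is a sketch with invented weights. `scoreResult`, the weights 2 and 3, and the `snippet` field are all assumptions, not the method used in testing.

```javascript
// Hypothetical scorer: additive points for keyword hits and a domain match.
// Assumes result.url is an absolute URL.
function scoreResult(result, { keywords, domain }) {
  const text = `${result.title} ${result.snippet ?? ''}`.toLowerCase();
  let score = 0;
  for (const keyword of keywords) {
    if (text.includes(keyword.toLowerCase())) score += 2; // assumed weight
  }
  const host = new URL(result.url).hostname;
  if (host === domain || host.endsWith(`.${domain}`)) score += 3; // assumed weight
  return score;
}

const score = scoreResult(
  {
    title: "MacBook won't turn on after update",
    snippet: 'It is completely dead',
    url: 'https://www.reddit.com/r/techsupport/comments/example',
  },
  { keywords: ["won't turn on", 'dead', 'no power'], domain: 'reddit.com' }
);
console.log(score); // 7 (two keyword hits plus the domain match)
```

An additive score like this has no fixed upper bound, which would be consistent with reported values such as 14.8.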
## Success Metrics
**All tested patterns achieved:**
- ✅ 10/10 relevant results
- ✅ Relevance score 7.2-14.8 (avg 11.0)
- ✅ Results are actual repair requests
- ✅ From real Reddit users seeking help
## Next Steps
1. ✅ Consumer language validated
2. ✅ Best subreddits identified
3. → Rewrite existing 84 alerts using these patterns
4. → Validate rewritten alerts
5. → Create production alert file
---
**Conclusion:** Using consumer language on tech support subreddits produces consistently excellent results. All technical terms should be converted to everyday problem descriptions that real Reddit users post.

1094
docs/google-alerts-broad.md Normal file

File diff suppressed because it is too large


@ -0,0 +1,455 @@
## Google Alert Query Library
Use these canned queries with Google Alerts. For each alert:
- Paste the full query from the code block into the Google Alerts search box.
- Set `How often` to `As-it-happens`.
- Choose `Deliver to → RSS feed` and save.
Each block below is a complete query (site filter + keywords + exclusions) ready to paste.
---
## Keywords Reference
### Data Recovery Keywords
- General: `data recovery`, `recover my data`, `data rescue`, `professional data recovery`, `data extraction service`
- Drive Issues: `dead hard drive`, `drive not recognized`, `drive clicking`, `drive beeping`, `drive won't spin`, `drive won't mount`, `no boot drive`, `corrupted drive`, `formatted by mistake`
- Media: `lost photos`, `restore photos`, `recover documents`, `sd card recovery`, `usb stick recovery`
- Advanced: `clean room data recovery`, `head swap`, `platter swap`, `nvme recovery`, `ssd firmware failure`, `raid recovery`, `zfs recovery`
### Board Repair Keywords
- General: `logic board repair`, `motherboard repair`, `board level repair`, `microsolder`, `bga reball`
- Power Issues: `no power`, `won't turn on`, `won't charge`, `dead`, `boot loop`
- Damage: `liquid damage`, `water damage`, `coffee spill`, `short circuit`, `shorted`
- Connectors: `charging port repair`, `hdmi port repair`, `usb-c`, `fpc connector`, `flex connector repair`
### Device-Specific Keywords
- MacBook: `macbook logic board`, `t2 chip repair`, `ppbus_g3h`, `gpu reball macbook`
- iPhone/iPad: `iphone logic board`, `iphone touch disease`, `face id repair`, `audio ic`, `tristar`, `charging ic`
- Gaming: `ps5 hdmi repair`, `xbox hdmi repair`, `switch board repair`, `joycon drift repair`
- GPU/Desktop: `gpu repair`, `gpu artifacting`, `gpu reball`, `pc no post`, `bios chip replacement`
### Intent Keywords (High Value)
- `repair shop recommendation`, `anyone fix`, `anyone repair`, `where to repair`, `who can repair`, `can someone repair`, `help finding repair`, `need a repair shop`, `repair wanted`, `looking for repair`, `needs repair`
### Exclusion Keywords (Use to Filter Noise)
- Entertainment: `-entertainment -movie -music -sport -politics`
- Jobs: `-job -jobs -hiring -gig -gigs`
- Real Estate: `-housing -rent -rental`
- Gaming (when not relevant): `-roblox -minecraft -anime -gaming`
### Additional Keywords to Consider
These keywords can be added to existing queries or used to create new specialized alerts:
**Component-Level Repairs:**
- `backlight repair`, `lcd repair`, `oled repair`, `screen replacement`, `digitizer repair`
- `battery replacement`, `battery connector`, `battery not charging`
- `camera repair`, `camera module replacement`, `front camera`, `rear camera`
- `speaker repair`, `microphone repair`, `headphone jack repair`
- `vibration motor`, `haptic feedback repair`
**Specific Failure Modes:**
- `overheating`, `thermal throttling`, `fan replacement`, `thermal paste`
- `bsod`, `blue screen`, `kernel panic`, `boot loop`, `infinite restart`
- `touch not working`, `screen unresponsive`, `ghost touches`
- `wifi not working`, `bluetooth not working`, `cellular not working`
- `bricked device`, `soft brick`, `hard brick`, `dfu mode`, `recovery mode`
**Brand-Specific Terms:**
- Apple: `apple store repair`, `genius bar`, `out of warranty`, `applecare`
- Samsung: `samsung service center`, `knox tripped`, `samsung warranty`
- Gaming: `ps5 error code`, `xbox error code`, `nintendo error code`
**Service Intent:**
- `repair quote`, `repair estimate`, `how much to repair`, `repair cost`
- `warranty repair`, `out of warranty repair`, `third party repair`
- `same day repair`, `quick repair`, `express repair`
- `mail in repair`, `local repair`, `near me repair`
**Location-Based (Add to queries):**
- `toronto repair`, `vancouver repair`, `calgary repair`, `montreal repair`
- `kitchener repair`, `waterloo repair`, `ottawa repair`
---
## Reddit-Based Alert Queries
These queries target Canadian Reddit communities. Each query includes site filters, keyword groups, and exclusions.
**Note:** Each query begins with a NOT filter containing the alert name (e.g., `-"ALERT_NAME:Data Recovery - Reddit CA"`). This makes the alert identifiable in Google Alerts without affecting search results, since this metadata format never appears in actual content.
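Because the tag has a fixed format, it can be added to a query and read back programmatically. A minimal sketch follows, assuming the tag is placed before the site filters as in the queries in this section. `tagQuery` and `extractAlertName` are hypothetical helper names.

```javascript
// Prepends the alert-name tag to a query.
function tagQuery(alertName, query) {
  return `-"ALERT_NAME:${alertName}" ${query}`;
}

// Reads the alert name back out of a tagged query, or returns null if untagged.
function extractAlertName(query) {
  const match = query.match(/-"ALERT_NAME:([^"]+)"/);
  return match ? match[1] : null;
}

const tagged = tagQuery('Data Recovery - Reddit CA', 'site:reddit.com/r/toronto "data recovery"');
console.log(extractAlertName(tagged)); // Data Recovery - Reddit CA
```

This kind of round trip is useful when mapping an RSS feed's query string back to the alert it belongs to.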
### 1. Advanced Data Recovery (General)
**Alert Name:** `Data Recovery - Reddit CA`
**Purpose:** Catches general data recovery requests and drive failure scenarios.
**Target:** Users with dead drives, lost files, or corrupted storage.
```
-"ALERT_NAME:Data Recovery - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("data recovery" OR "recover my data" OR "data rescue" OR "professional data recovery" OR "data extraction service" OR "dead hard drive" OR "drive not recognized" OR "drive clicking" OR "drive beeping" OR "drive won't spin" OR "drive won't mount" OR "no boot drive" OR "corrupted drive" OR "formatted by mistake" OR "lost photos" OR "restore photos" OR "recover documents")
-entertainment -movie -music -sport -politics
```
### 2. Hard Drive / SSD Specialty Recovery
**Alert Name:** `HDD/SSD Recovery - Reddit CA`
**Purpose:** Targets advanced recovery scenarios requiring clean room work or specialized SSD/RAID recovery.
**Target:** Users with mechanical drive failures, enterprise storage, or encrypted drives.
```
-"ALERT_NAME:HDD/SSD Recovery - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("clean room data recovery" OR "head swap" OR "stuck spindle" OR "seized spindle" OR "platter swap" OR "nvme recovery" OR "ssd firmware failure" OR "ssd controller failure" OR "ssd not detected" OR "pcie ssd recovery" OR "bitlocker data recovery" OR "raid rebuild" OR "raid recovery" OR "raid array failed" OR "zfs recovery" OR "synology recovery" OR "qnap recovery" OR "server data recovery" OR "nas data recovery")
-entertainment -movie -music -sport -politics
```
### 3. Removable Media Data Recovery
**Alert Name:** `SD Card/USB Recovery - Reddit CA`
**Purpose:** Focuses on SD cards, USB drives, and mobile device data extraction.
**Target:** Photographers, videographers, and users with lost data on portable media.
```
-"ALERT_NAME:SD Card/USB Recovery - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("sd card recovery" OR "micro sd recovery" OR "compact flash recovery" OR "cfexpress recovery" OR "usb stick recovery" OR "flash drive recovery" OR "camera card recovery" OR "gopro card recovery" OR "drone footage recovery" OR "phone data extraction" OR "android data recovery" OR "iphone data recovery")
-entertainment -movie -music -sport -politics
```
### 4. Laptop & MacBook Logic Board Repair
**Alert Name:** `Laptop/MacBook Repair - Reddit CA`
**Purpose:** Captures laptop and MacBook motherboard repair requests, especially power and liquid damage issues.
**Target:** Users with dead laptops, charging problems, or liquid-damaged devices.
```
-"ALERT_NAME:Laptop/MacBook Repair - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("logic board repair" OR "motherboard repair" OR "board level repair" OR "logic board replacement" OR "macbook logic board" OR "macbook won't turn on" OR "macbook no power" OR "macbook dead" OR "macbook won't charge" OR "liquid damage macbook" OR "macbook water damage" OR "macbook coffee spill" OR "t2 chip repair" OR "ppbus_g3h" OR "gpu reball macbook" OR "laptop no power" OR "laptop motherboard repair" OR "gaming laptop repair" OR "asus rog repair" OR "msi gs repair" OR "lenovo legion repair")
-entertainment -movie -music -sport -politics
```
### 5. GPU & Desktop Board Repair
**Alert Name:** `GPU/Desktop Repair - Reddit CA`
**Purpose:** Targets GPU failures and desktop motherboard issues, including POST/boot problems.
**Target:** PC builders, gamers, and users with desktop hardware failures.
```
-"ALERT_NAME:GPU/Desktop Repair - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("gpu repair" OR "graphics card repair" OR "gpu no display" OR "gpu artifacting" OR "gpu reball" OR "gpu reflow" OR "gpu hdmi repair" OR "pc motherboard repair" OR "desktop board repair" OR "custom pc repair" OR "power supply blew motherboard" OR "pc no post" OR "pc won't boot" OR "bios chip replacement")
-entertainment -movie -music -sport -politics
```
### 6. Game Console Board Repair
**Alert Name:** `Console Repair - Reddit CA`
**Purpose:** Catches console repair requests, especially HDMI port issues and power failures.
**Target:** Gamers with broken PS5/Xbox/Switch consoles.
```
-"ALERT_NAME:Console Repair - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("ps5 hdmi repair" OR "ps5 no video" OR "ps5 blue light of death" OR "ps5 motherboard repair" OR "ps4 hdmi port" OR "ps4 no power" OR "xbox hdmi repair" OR "xbox one x no power" OR "xbox series x hdmi" OR "nintendo switch board repair" OR "switch won't charge" OR "switch no display" OR "switch game card reader repair")
-entertainment -movie -music -sport -politics
```
### 7. Console Upgrades & Refurbishment
**Alert Name:** `Console Refurb - Reddit CA`
**Purpose:** Targets console upgrade requests and refurbishment opportunities, including controller repairs.
**Target:** Users wanting console upgrades, cleaning, or controller fixes.
```
-"ALERT_NAME:Console Refurb - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("console refurbishment" OR "console refurb" OR "console rebuild" OR "console recap" OR "console upgrade service" OR "ps5 upgrade" OR "ps5 ssd install" OR "ps5 fan replacement" OR "ps5 cleaning service" OR "ps4 pro refurbishment" OR "xbox ssd upgrade" OR "xbox cleaning service" OR "switch refurb" OR "switch shell swap" OR "switch fan replacement" OR "console deep cleaning" OR "retro console recap" OR "controller refurbishment" OR "joycon drift repair" OR "elite controller repair" OR "custom console mod" OR "rgb mod" OR "hdmi mod n64")
-entertainment -movie -music -sport -politics
```
### 8. Smartphone Logic Board Repair
**Alert Name:** `Smartphone Repair - Reddit CA`
**Purpose:** Captures iPhone, Samsung, Pixel, and other smartphone motherboard repair requests.
**Target:** Users with dead phones, charging issues, or component failures (Face ID, audio IC, etc.).
```
-"ALERT_NAME:Smartphone Repair - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("iphone logic board" OR "iphone board repair" OR "iphone microsolder" OR "iphone no power" OR "iphone boot loop" OR "iphone won't charge" OR "iphone touch disease" OR "iphone face id repair" OR "iphone audio ic" OR "iphone tristar" OR "iphone charging ic" OR "samsung logic board" OR "samsung no charge" OR "galaxy board repair" OR "note 20 no power" OR "pixel logic board repair" OR "pixel won't boot" OR "oneplus board repair")
-entertainment -movie -music -sport -politics
```
### 9. iPad Board Services
**Alert Name:** `iPad Repair - Reddit CA`
**Purpose:** Targets iPad repair requests, especially power, charging, and connector issues.
**Target:** Users with broken iPads, charging problems, or stuck devices.
```
-"ALERT_NAME:iPad Repair - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("ipad logic board" OR "ipad board repair" OR "ipad no power" OR "ipad won't charge" OR "ipad boot loop" OR "ipad stuck on apple logo" OR "ipad screen connector" OR "ipad battery connector" OR "ipad backlight repair" OR "ipad audio ic" OR "ipad touch disease" OR "ipad liquid damage" OR "ipad water damage")
-entertainment -movie -music -sport -politics
```
### 10. Connector (FPC) Replacement
**Alert Name:** `Connector Repair - Reddit CA`
**Purpose:** Targets connector repair requests - FPC, flex cables, and board connectors.
**Target:** Users with ripped connectors, damaged flex cables, or lifted pads.
```
-"ALERT_NAME:Connector Repair - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("fpc connector" OR "flex connector repair" OR "screen connector broke" OR "display connector ripped" OR "lcd connector burnt" OR "battery connector ripped" OR "charge port flex" OR "board connector replacement" OR "connector pads lifted" OR "connector ripped off board" OR "replace connector pins" OR "micro coax connector repair" OR "antenna connector repair")
-entertainment -movie -music -sport -politics
```
### 11. Key Fob Repairs (Assessment Required)
**Alert Name:** `Key Fob Repair - Reddit CA`
**Purpose:** Catches car key fob repair requests. Note: May require assessment for compatibility.
**Target:** Users with broken key fobs, water damage, or keyless entry issues.
```
-"ALERT_NAME:Key Fob Repair - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("key fob repair" OR "car key fob not working" OR "keyless entry repair" OR "key fob water damage" OR "key fob board" OR "key fob microsolder" OR "key fob battery drain" OR "key fob pcb repair" OR "smart key repair" OR "remote starter repair")
-entertainment -movie -music -sport -politics
```
### 12. Microsolder & Advanced Diagnostics
**Alert Name:** `Microsolder/Diagnostics - Reddit CA`
**Purpose:** Targets advanced board-level repair requests requiring microsoldering or diagnostic work.
**Target:** Users needing BGA reballing, short hunting, trace repair, or chip-off services.
```
-"ALERT_NAME:Microsolder/Diagnostics - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("microsolder" OR "micro solder" OR "bga reball" OR "ball grid array repair" OR "reball service" OR "board level diagnostics" OR "schematic reading" OR "short hunting" OR "find board short" OR "thermal camera diagnostics" OR "board trace repair" OR "pad repair" OR "underfill removal" OR "chip-off service")
-entertainment -movie -music -sport -politics
```
### 13. Refurbished Device Sales & Trade-Ins (Lead Generation)
**Alert Name:** `Device Refurb/Trade-In - Reddit CA`
**Purpose:** Captures opportunities to buy broken devices for refurbishment or trade-in requests.
**Target:** Users selling broken devices or seeking refurbishment services.
```
-"ALERT_NAME:Device Refurb/Trade-In - Reddit CA" (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/ontario OR site:reddit.com/r/toronto OR site:reddit.com/r/londonontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa OR site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("refurbished console" OR "refurbished macbook" OR "refurbished laptop" OR "refurbished iphone" OR "device refurbishment service" OR "console trade-in repair" OR "buy broken console" OR "buy broken laptop" OR "broken macbook wanted" OR "electronics refurbishment" OR "selling broken ps5" OR "selling broken macbook" OR "selling broken switch" OR "repair and resell" OR "flip consoles" OR "refurb service recommendation")
-entertainment -movie -music -sport -politics
```
---
#### Optional Modifiers
- Add device qualifiers to tighten scope: `("macbook pro" OR "macbook air" OR "asus rog" OR "msi gs" OR "lenovo legion" OR "dell xps" OR "thinkpad" OR "iphone 15" OR "samsung s24" OR "pixel 9" OR "ipad pro" OR "ps5" OR "xbox series x" OR "nintendo switch oled")`.
- Add intent phrases when you want explicit requests: `("recommend repair" OR "anyone fix" OR "anyone repair" OR "where to repair" OR "who can repair" OR "can someone repair" OR "repair shop recommendation" OR "help finding repair")`.
- To broaden beyond Reddit, replace the site block in a query with platforms such as `site:kijiji.ca`, `site:facebook.com/groups`, or `site:marketplace.facebook.com` while keeping the keyword bundle and exclusions.
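As a rough illustration of how these modifiers combine with a site block, keyword bundle, and exclusions, the sketch below assembles a query in the same layout used by the alerts above. The function names (`or_group`, `build_query`) and the example alert name are hypothetical and are not part of this repository.

```python
def or_group(terms, quote=True):
    """Render terms as a parenthesised OR group; phrases are quoted by default."""
    rendered = [f'"{t}"' if quote else t for t in terms]
    return "(" + " OR ".join(rendered) + ")"


def build_query(name, sites, keywords, modifiers=(), exclusions=()):
    """Assemble an alert query: name marker + site block, keywords, modifiers, exclusions."""
    site_block = or_group([f"site:{s}" for s in sites], quote=False)
    lines = [f'-"ALERT_NAME:{name}" {site_block}']
    lines.append(or_group(keywords))
    if modifiers:
        lines.append(or_group(modifiers))  # optional intent or device qualifiers
    if exclusions:
        lines.append(" ".join(f"-{word}" for word in exclusions))
    return "\n".join(lines)


# Hypothetical example: swap the Reddit site block for Kijiji and add intent phrases.
query = build_query(
    name="Key Fob Repair - Kijiji CA",
    sites=["kijiji.ca"],
    keywords=["key fob repair", "smart key repair"],
    modifiers=["where to repair", "anyone fix"],
    exclusions=["job", "jobs"],
)
print(query)
```

Every extra modifier group adds to the OR and quoted-phrase totals, so it is worth re-checking a query's size after adding one.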
---
## Additional Non-Reddit Alert Queries
The following templates surface high-intent conversations on other platforms. Each block is copy/paste-ready for Google Alerts (set to `As-it-happens → RSS feed`).
### A. Canadian Classifieds (Kijiji + Used.ca Network)
**Alert Name:** `Repair Leads - Kijiji/Used.ca CA`
**Purpose:** Catches repair requests on Canadian classified sites.
```
-"ALERT_NAME:Repair Leads - Kijiji/Used.ca CA" (site:kijiji.ca OR site:used.ca OR site:usedvictoria.com OR site:usedvancouver.com OR site:usedottawa.com OR site:usededmonton.com)
("data recovery" OR "recover my data" OR "logic board repair" OR "motherboard repair" OR "console repair" OR "ps5 repair" OR "xbox repair" OR "macbook repair" OR "iphone repair" OR "ipad repair" OR "microsolder" OR "charging port repair" OR "hdmi port repair" OR "board level repair" OR "liquid damage repair" OR "needs repair" OR "repair wanted" OR "looking for repair")
-job -jobs -hiring -rent -rental
```
### B. Facebook Public Groups & Marketplace Listings
**Alert Name:** `Repair Leads - Facebook CA`
**Purpose:** Targets Facebook Marketplace and public group repair requests.
```
-"ALERT_NAME:Repair Leads - Facebook CA" (site:facebook.com/groups OR site:facebook.com/marketplace)
("data recovery" OR "logic board repair" OR "macbook repair" OR "laptop repair" OR "console repair" OR "ps5 repair" OR "switch repair" OR "iphone repair" OR "microsolder" OR "charging port repair" OR "liquid damage repair" OR "motherboard repair" OR "repair shop recommendation" OR "anyone fix" OR "where to repair" OR "can someone repair")
-job -jobs -hiring -giveaway
```
### C. Craigslist (Regional Electronics + Computer Sections)
**Alert Name:** `Repair Leads - Craigslist CA`
**Purpose:** Monitors Craigslist for repair service requests.
```
-"ALERT_NAME:Repair Leads - Craigslist CA" (site:craigslist.org OR site:craigslist.ca)
("data recovery" OR "recover files" OR "logic board repair" OR "macbook repair" OR "laptop repair" OR "console repair" OR "ps5 repair" OR "xbox repair" OR "switch repair" OR "iphone repair" OR "microsolder" OR "charging port repair" OR "motherboard repair" OR "board level repair" OR "repair service needed" OR "need repair" OR "seeking repair")
-job -jobs -gig -gigs -housing
```
### D. HomeTech & Deal Forums (RedFlagDeals, DSLReports, etc.)
**Alert Name:** `Repair Leads - Tech Forums CA`
**Purpose:** Catches repair discussions on Canadian tech forums.
```
-"ALERT_NAME:Repair Leads - Tech Forums CA" (site:forums.redflagdeals.com OR site:community.hwbot.org OR site:dslreports.com/forum)
("data recovery" OR "recover my data" OR "logic board repair" OR "motherboard repair" OR "macbook repair" OR "laptop repair" OR "console repair" OR "gpu repair" OR "ps5 repair" OR "microsolder" OR "charging port repair" OR "board level repair" OR "need a repair shop" OR "recommend repair shop" OR "can someone fix")
-job -jobs -hiring
```
### E. Discord Server Indexes & Community Directories
**Alert Name:** `Repair Communities - Discord CA`
**Purpose:** Finds repair-focused Discord communities and directories.
```
-"ALERT_NAME:Repair Communities - Discord CA" (site:discords.com OR site:disboard.org OR site:top.gg)
("electronics repair" OR "microsolder" OR "data recovery" OR "board repair" OR "console repair" OR "retro console repair" OR "macbook repair" OR "iphone repair" OR "repair community" OR "electronics refurb" OR "repair business")
-roblox -minecraft -anime -gaming
```
> **Note:** Facebook queries surface only public content indexed by Google. For private groups or Marketplace interactions, join directly via the platform.
---
## Bulk Device Sourcing Alerts (Canada)
Use these queries to uncover wholesale lots, liquidation pallets, and repairable bundles suitable for refurbishment. Paste the entire block into Google Alerts and set delivery to `As-it-happens → RSS feed`.
### B1. General Bulk Lots (Nationwide)
**Alert Name:** `Bulk Electronics - Classifieds CA`
**Purpose:** Finds wholesale electronics lots and liquidation pallets.
```
-"ALERT_NAME:Bulk Electronics - Classifieds CA" (site:kijiji.ca OR site:facebook.com/marketplace OR site:facebook.com/groups OR site:craigslist.ca)
("wholesale electronics" OR "bulk electronics" OR "bulk devices" OR "liquidation electronics" OR "liquidation lot" OR "surplus electronics" OR "electronics auction" OR "electronics pallet" OR "returns pallet" OR "returns truckload" OR "salvage electronics" OR "for parts lot" OR "broken electronics lot" OR "repairable electronics lot")
-job -jobs -hiring -housing -rent -rental -service
```
### B2. Laptop & MacBook Bulk Lots
**Alert Name:** `Bulk Laptops - Auctions CA`
**Purpose:** Targets laptop and MacBook bulk lots from auctions and classifieds.
```
-"ALERT_NAME:Bulk Laptops - Auctions CA" (site:kijiji.ca OR site:facebook.com/marketplace OR site:craigslist.ca OR site:bidspotter.com/en-ca OR site:govdeals.ca)
("bulk laptops" OR "laptop lot" OR "laptop liquidation" OR "surplus laptops" OR "for parts laptops" OR "broken laptop lot" OR "macbook lot" OR "macbook bulk" OR "corporate laptop surplus" OR "business laptop liquidation" OR "IT asset disposal" OR "fleet laptop auction")
-job -jobs -hiring -housing -rent -rental
```
### B3. Smartphone & Tablet Bulk Lots
**Alert Name:** `Bulk Phones/Tablets - Auctions CA`
**Purpose:** Finds smartphone and tablet bulk lots for refurbishment.
```
-"ALERT_NAME:Bulk Phones/Tablets - Auctions CA" (site:kijiji.ca OR site:facebook.com/marketplace OR site:craigslist.ca OR site:bidspotter.com/en-ca OR site:liquidation.com)
("iphone lot" OR "iphone bulk" OR "smartphone lot" OR "smartphone bulk" OR "android phone lot" OR "for parts phones" OR "broken phone lot" OR "mobile phone liquidation" OR "mobile return pallet" OR "ipad lot" OR "tablet bulk" OR "tablet liquidation")
-job -jobs -hiring -housing -rent -rental
```
### B4. Console & Gaming Bulk Lots
**Alert Name:** `Bulk Consoles - Auctions CA`
**Purpose:** Targets console and gaming device bulk lots.
```
-"ALERT_NAME:Bulk Consoles - Auctions CA" (site:kijiji.ca OR site:facebook.com/marketplace OR site:craigslist.ca OR site:bidspotter.com/en-ca OR site:liquidation.com OR site:hibid.com)
("console lot" OR "gaming console bulk" OR "ps5 lot" OR "playstation lot" OR "xbox lot" OR "switch lot" OR "retro console lot" OR "broken console lot" OR "for parts consoles" OR "video game liquidation" OR "game store liquidation" OR "controller lot" OR "joycon lot" OR "arcade liquidation")
-job -jobs -hiring -housing -rent -rental -digital
```
### B5. Corporate & Government Asset Auctions
**Alert Name:** `Gov/Corporate Auctions - Electronics CA`
**Purpose:** Monitors government and corporate surplus auctions for electronics.
```
-"ALERT_NAME:Gov/Corporate Auctions - Electronics CA" (site:govdeals.ca OR site:gcsurplus.ca OR site:go-dove.com OR site:publicsurplus.com OR site:auctionnetwork.ca OR site:bidspotter.com/en-ca)
("electronics auction" OR "IT equipment auction" OR "computer liquidation" OR "surplus electronics auction" OR "asset disposition" OR "surplus devices" OR "fleet laptops" OR "office electronics auction" OR "returns auction" OR "warehouse clearance")
-vehicle -vehicles -truck -bus -furniture
```
> **Tip:** Add province or city names (e.g., `"Toronto" OR "Mississauga" OR "Montreal" OR "Calgary" OR "Vancouver"`) to any query to focus on pickup-friendly regions.
---
## Alert Validation Report (2025-11-17)
### Method
- Ran `python3 scripts/validate_alerts.py` to parse each alert block and measure risk signals (site filter count, OR count, quoted phrases, total characters, exclusion count).
- Flagged any query that exceeded Google Alerts practical limits (>12 site filters, >28 OR terms, >12 quoted phrases, or >600 characters) and captured the specific remediation hints per alert.
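The contents of `scripts/validate_alerts.py` are not shown in this report, so the following is only a minimal sketch of how the signals and thresholds listed above could be measured. The regular expressions, the `measure` and `flag` names, and the choice to count characters after collapsing whitespace are all assumptions. Note that the quoted-phrase count includes the `ALERT_NAME` marker, which matches the counts in the summary table.

```python
import re

# Thresholds taken from the Method description above.
LIMITS = {"site_filters": 12, "or_count": 28, "quoted_phrases": 12, "length": 600}
LABELS = {"site_filters": "site", "or_count": "OR", "quoted_phrases": "quotes", "length": "len"}


def measure(query):
    """Count the risk signals for one alert query (assumed counting rules)."""
    text = " ".join(query.split())  # collapse newlines and repeated spaces
    return {
        "site_filters": len(re.findall(r"\bsite:\S+", text)),
        "or_count": len(re.findall(r"\bOR\b", text)),
        "quoted_phrases": len(re.findall(r'"[^"]*"', text)),
        "length": len(text),
        # Bare -word exclusions only; the -"ALERT_NAME:..." marker is not counted.
        "exclusions": len(re.findall(r"(?<!\S)-\w+", text)),
    }


def flag(metrics):
    """Return issue labels such as 'site>12' for every exceeded limit."""
    return [f"{LABELS[key]}>{limit}" for key, limit in LIMITS.items() if metrics[key] > limit]


example = '''-"ALERT_NAME:Repair Communities - Discord CA" (site:discords.com OR site:disboard.org OR site:top.gg)
("electronics repair" OR "microsolder" OR "data recovery" OR "board repair" OR "console repair" OR "retro console repair" OR "macbook repair" OR "iphone repair" OR "repair community" OR "electronics refurb" OR "repair business")
-roblox -minecraft -anime -gaming'''

metrics = measure(example)
print(metrics, flag(metrics) or "none")
```

The character count produced this way may differ slightly from the Length column, since the real script's handling of line breaks is unknown.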
### Summary Table
| Alert | Site filters | OR count | Quoted phrases | Length | Issues |
| --- | --- | --- | --- | --- | --- |
| Data Recovery - Reddit CA | 23 | 38 | 18 | 1166 | site>12, OR>28, quotes>12, len>600 |
| HDD/SSD Recovery - Reddit CA | 23 | 40 | 20 | 1212 | site>12, OR>28, quotes>12, len>600 |
| SD Card/USB Recovery - Reddit CA | 23 | 33 | 13 | 1104 | site>12, OR>28, quotes>12, len>600 |
| Laptop/MacBook Repair - Reddit CA | 23 | 42 | 22 | 1300 | site>12, OR>28, quotes>12, len>600 |
| GPU/Desktop Repair - Reddit CA | 23 | 35 | 15 | 1104 | site>12, OR>28, quotes>12, len>600 |
| Console Repair - Reddit CA | 23 | 34 | 14 | 1114 | site>12, OR>28, quotes>12, len>600 |
| Console Refurb - Reddit CA | 23 | 44 | 24 | 1334 | site>12, OR>28, quotes>12, len>600 |
| Smartphone Repair - Reddit CA | 23 | 39 | 19 | 1227 | site>12, OR>28, quotes>12, len>600 |
| iPad Repair - Reddit CA | 23 | 34 | 14 | 1098 | site>12, OR>28, quotes>12, len>600 |
| Connector Repair - Reddit CA | 23 | 34 | 14 | 1158 | site>12, OR>28, quotes>12, len>600 |
| Key Fob Repair - Reddit CA | 23 | 31 | 11 | 1037 | site>12, OR>28, len>600 |
| Microsolder/Diagnostics - Reddit CA | 23 | 35 | 15 | 1110 | site>12, OR>28, quotes>12, len>600 |
| Device Refurb/Trade-In - Reddit CA | 23 | 37 | 17 | 1222 | site>12, OR>28, quotes>12, len>600 |
| Repair Leads - Kijiji/Used.ca CA | 6 | 22 | 19 | 583 | quotes>12 |
| Repair Leads - Facebook CA | 2 | 16 | 17 | 470 | quotes>12 |
| Repair Leads - Craigslist CA | 2 | 17 | 18 | 463 | quotes>12 |
| Repair Leads - Tech Forums CA | 3 | 16 | 16 | 467 | quotes>12 |
| Repair Communities - Discord CA | 3 | 12 | 12 | 364 | none |
| Bulk Electronics - Classifieds CA | 4 | 16 | 15 | 535 | quotes>12 |
| Bulk Laptops - Auctions CA | 5 | 15 | 13 | 474 | quotes>12 |
| Bulk Phones/Tablets - Auctions CA | 5 | 15 | 13 | 465 | quotes>12 |
| Bulk Consoles - Auctions CA | 6 | 18 | 15 | 527 | quotes>12 |
| Gov/Corporate Auctions - Electronics CA | 6 | 14 | 11 | 486 | none |
### Key Findings
- Every Reddit-focused alert chains 23 subreddit filters plus 30–45 exact phrases, which Google truncates, leading to zero incremental hits.
- Queries above ~600 characters or with more than ~32 OR tokens are silently shortened by Google Alerts; most of the Reddit bundles fall into this category.
- Non-Reddit alerts mostly pass site/length checks but lean heavily on quoted phrases, which prevents near-match language from surfacing.
- Two alerts (`Repair Communities - Discord CA`, `Gov/Corporate Auctions - Electronics CA`) cleared every check, so they can stay untouched.
### Recommended Replacement Queries
The following drop-in queries stay within Google's limits (≤8 site filters, ≤20 OR clauses, ≤12 quoted phrases) and can replace the existing alerts immediately. Duplicate them per region/device category as needed.
#### Data Recovery - Reddit Ontario
```
-"ALERT_NAME:Data Recovery - Reddit Ontario" (site:reddit.com/r/toronto OR site:reddit.com/r/ontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/londonontario OR site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo)
("data recovery" OR "dead hard drive" OR "drive clicking" OR "drive not recognized" OR "lost photos" OR "formatted by mistake" OR "recover documents")
("anyone fix" OR "repair recommendation" OR "needs repair" OR "where to repair")
-job -jobs -hiring -giveaway
```
#### Data Recovery - Reddit Western Canada
```
-"ALERT_NAME:Data Recovery - Reddit West" (site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/britishcolumbia OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("data recovery" OR "drive beeping" OR "drive won't mount" OR "nvme recovery" OR "ssd not detected" OR "raid recovery")
("anyone fix" OR "recommend repair shop" OR "need a repair shop")
-job -jobs -hiring -politics
```
#### Laptop/MacBook Repair - Reddit GTA
```
-"ALERT_NAME:Laptop/MacBook Repair - Reddit GTA" (site:reddit.com/r/toronto OR site:reddit.com/r/ontario OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/londonontario OR site:reddit.com/r/kitchener)
("logic board repair" OR "macbook won't turn on" OR "macbook no power" OR "liquid damage macbook" OR "t2 chip repair" OR "laptop motherboard repair" OR "gaming laptop repair")
("anyone fix" OR "repair shop recommendation" OR "need a repair shop")
-entertainment -job -jobs -hiring
```
#### Console Repair - Reddit West
```
-"ALERT_NAME:Console Repair - Reddit West" (site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton OR site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg OR site:reddit.com/r/britishcolumbia)
("ps5 hdmi repair" OR "ps5 no video" OR "xbox hdmi repair" OR "xbox no power" OR "switch won't charge" OR "switch no display" OR "console board repair")
("anyone fix" OR "repair recommendation" OR "needs repair")
-job -jobs -hiring -giveaway
```
#### Microsolder/Diagnostics - Reddit Canada (No Site Filter)
```
-"ALERT_NAME:Microsolder - Reddit CA" ("microsolder" OR "micro solder" OR "bga reball" OR "short hunting" OR "pad repair" OR "chip-off service")
("anyone fix" OR "where to repair" OR "repair help")
("Toronto" OR "Vancouver" OR "Calgary" OR "Montreal" OR "Ottawa" OR "Halifax")
-job -jobs -hiring -giveaway
```
> **How to use:** Delete the corresponding old alert in Google Alerts, paste one of the regional replacements above, and clone it for additional regions/devices (e.g., Atlantic Canada, Quebec). Keep each alert under ~500 characters and re-run `python3 scripts/validate_alerts.py` after edits to confirm it stays within limits.
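One way to clone an alert across regions is to split a long subreddit list into groups that respect the site-filter cap and report each query's length against the ~500 character budget. The sketch below assumes that approach. `chunked`, `clone_alert`, and the example region are hypothetical names chosen for illustration, and the length is measured on a single-line rendering of the query.

```python
def chunked(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def clone_alert(subreddits, keywords, exclusions=(), max_sites=8, max_chars=500):
    """Build one query per subreddit group and record whether it fits the budget."""
    keyword_group = "(" + " OR ".join(f'"{k}"' for k in keywords) + ")"
    exclusion_part = " ".join(f"-{word}" for word in exclusions)
    results = []
    for group in chunked(subreddits, max_sites):
        site_group = "(" + " OR ".join(f"site:reddit.com/r/{s}" for s in group) + ")"
        query = " ".join(p for p in (site_group, keyword_group, exclusion_part) if p)
        results.append({
            "query": query,
            "length": len(query),
            "within_budget": len(query) <= max_chars,
        })
    return results


# Example: an Eastern Canada clone built from subreddits in the original site list.
east = ["montreal", "quebeccity", "halifax", "newfoundland"]
for alert in clone_alert(east, ["data recovery", "lost photos"], ["job", "jobs"]):
    print(alert["length"], alert["within_budget"])
    print(alert["query"])
```

Any query reported as over budget can be trimmed by dropping keywords or by lowering `max_sites` before it is pasted into Google Alerts.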

@ -0,0 +1,365 @@
# Production Reddit Alerts - Consumer Language (TUNED)
**Generated:** November 18, 2025
**Status:** ✅ Validated with 100% success rate
**Based on:** Testing that achieved 10/10 relevant results per query
**Key Changes:**
- ✅ Using tech support subreddits (r/techsupport, r/applehelp, etc.) instead of city subs
- ✅ Consumer language only ("won't turn on" not "logic board repair")
- ✅ Removed ALERT_NAME markers
- ✅ Expected high relevance (average tested score 11.0, with 10/10 relevant results per query)
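Most of the tuned queries in this file share one shape: a single subreddit, a quoted device name, and a short OR group of consumer-language symptoms. A minimal sketch of a generator for that shape follows. `support_query` is a hypothetical helper, not something shipped in this repository, and it does not cover the variants that use a device OR group or two symptom groups.

```python
def support_query(subreddit, device, symptoms):
    """Build a single-subreddit query: site filter, quoted device, OR group of symptoms."""
    symptom_group = " OR ".join(f'"{s}"' for s in symptoms)
    return f'site:reddit.com/r/{subreddit} "{device}" ({symptom_group})'


# Reproduces the shape of the "MacBook Power Issues" alert.
print(support_query(
    "techsupport",
    "macbook",
    ["won't turn on", "dead", "no power", "won't boot"],
))
```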
---
## 🌟 Tier 1: High Volume Alerts (Daily Activity)
These alerts target the most active subreddits with common repair issues.
### MacBook - Power Issues
**Alert Name:** `MacBook Power Issues`
**Purpose:** Catches MacBook power/boot problems on tech support subs
**Expected Volume:** High (7,770+ results tested)
**Tested Relevance:** 10/10 (score 10.8)
```
site:reddit.com/r/techsupport "macbook" ("won't turn on" OR "dead" OR "no power" OR "won't boot")
```
### MacBook - Charging Issues
**Alert Name:** `MacBook Charging Issues`
**Purpose:** Catches MacBook charging/battery problems
**Expected Volume:** High (25,400+ results tested)
**Tested Relevance:** 10/10 (score 11.8)
```
site:reddit.com/r/applehelp "macbook" ("won't charge" OR "not charging" OR "battery dead" OR "battery won't charge")
```
### MacBook - Water Damage
**Alert Name:** `MacBook Water Damage`
**Purpose:** Catches MacBook liquid damage posts
**Expected Volume:** Medium (2,260+ results tested)
**Tested Relevance:** 10/10 (score 12.7)
```
site:reddit.com/r/techsupport "macbook" ("spilled" OR "water damage" OR "liquid damage" OR "got wet")
```
### Laptop - Power Issues
**Alert Name:** `Laptop Power Issues`
**Purpose:** Catches all laptop power problems
**Expected Volume:** Very High (71,900+ results tested)
**Tested Relevance:** 10/10 (score 12.8)
```
site:reddit.com/r/techsupport "laptop" ("won't turn on" OR "dead" OR "no power" OR "won't boot")
```
### Laptop - Display Issues
**Alert Name:** `Laptop Display Issues`
**Purpose:** Catches laptop screen/display problems
**Expected Volume:** High (39,300+ results tested)
**Tested Relevance:** 10/10 (score 14.4)
```
site:reddit.com/r/techsupport "laptop" ("black screen" OR "no display" OR "screen went black")
```
### iPhone - Power Issues
**Alert Name:** `iPhone Power Issues`
**Purpose:** Catches iPhone power/boot problems
**Expected Volume:** High (15,900+ results tested)
**Tested Relevance:** 10/10 (score 13.2)
```
site:reddit.com/r/applehelp "iphone" ("won't turn on" OR "dead" OR "black screen" OR "screen of death")
```
### iPhone - Charging Issues
**Alert Name:** `iPhone Charging Issues`
**Purpose:** Catches iPhone charging problems
**Expected Volume:** Medium (2,610+ results tested)
**Tested Relevance:** 10/10 (score 10.3)
```
site:reddit.com/r/techsupport "iphone" ("won't charge" OR "not charging" OR "charging port broken")
```
### Data Recovery - Hard Drives
**Alert Name:** `Data Recovery Hard Drives`
**Purpose:** Catches hard drive/SSD failure posts
**Expected Volume:** High (39,400+ results tested)
**Tested Relevance:** 10/10 (score 7.5)
```
site:reddit.com/r/techsupport ("hard drive" OR "hdd" OR "ssd") ("died" OR "won't mount" OR "lost files" OR "not recognized")
```
### Data Recovery - Specialist Sub
**Alert Name:** `Data Recovery Specialist`
**Purpose:** Dedicated data recovery subreddit
**Expected Volume:** High (35,500+ results tested)
**Tested Relevance:** 10/10 (score 12.2)
```
site:reddit.com/r/datarecovery ("hard drive" OR "lost files" OR "won't mount" OR "corrupted")
```
---
## ⭐ Tier 2: Medium Volume Alerts (Weekly Activity)
### PS5 Issues
**Alert Name:** `PS5 Repair Issues`
**Purpose:** Catches PS5 hardware problems
**Expected Volume:** Medium (2,150+ results tested)
**Tested Relevance:** 10/10 (score 7.8)
```
site:reddit.com/r/techsupport "ps5" ("won't turn on" OR "no power" OR "black screen" OR "shut off")
```
### PS5 - PlayStation Sub
**Alert Name:** `PS5 PlayStation Community`
**Purpose:** PS5 issues on main PlayStation subreddit
**Expected Volume:** High (13,700+ results tested)
**Tested Relevance:** 10/10 (score 7.7)
```
site:reddit.com/r/playstation "ps5" ("won't turn on" OR "repair" OR "broken")
```
### Nintendo Switch Issues
**Alert Name:** `Nintendo Switch Issues`
**Purpose:** Catches Switch hardware problems
**Expected Volume:** Low-Medium (395+ results tested)
**Tested Relevance:** 10/10 (score 14.8)
```
site:reddit.com/r/techsupport "nintendo switch" ("won't charge" OR "won't turn on" OR "black screen")
```
### iPad Issues
**Alert Name:** `iPad Repair Issues`
**Purpose:** Catches iPad problems on Apple help
**Expected Volume:** Medium
```
site:reddit.com/r/applehelp "ipad" ("won't turn on" OR "won't charge" OR "black screen" OR "broken screen")
```
### MacBook - Screen Issues
**Alert Name:** `MacBook Screen Issues`
**Purpose:** MacBook display problems
**Expected Volume:** Medium
```
site:reddit.com/r/applehelp "macbook" ("screen" OR "display") ("cracked" OR "broken" OR "flickering" OR "black screen")
```
---
## 📍 Tier 3: City-Specific Alerts (Local Context)
Use these for location-based targeting. Always include the "repair" keyword when querying city subreddits.
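A minimal sketch of how queries in this tier's format could be generated. The helper name `buildCityQuery` is hypothetical and not part of the repository's scripts; it always appends the "repair" keyword, following the guidance for city subreddits.

```javascript
// Hypothetical helper: builds a city-subreddit query in the Tier 3 format.
// Device terms are quoted; multiple terms are grouped with OR.
function buildCityQuery(citySub, deviceTerms) {
  const terms = Array.isArray(deviceTerms) ? deviceTerms : [deviceTerms];
  const quoted = terms.map((t) => `"${t}"`);
  const devicePart = quoted.length > 1 ? `(${quoted.join(" OR ")})` : quoted[0];
  // The "repair" keyword is always appended for city subreddits.
  return `site:reddit.com/r/${citySub} ${devicePart} "repair"`;
}

console.log(buildCityQuery("toronto", "macbook"));
console.log(buildCityQuery("vancouver", ["laptop", "computer", "pc"]));
```

Passing an array produces the parenthesized OR group used by the broader city alerts; a single string produces the simpler two-keyword form.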
### MacBook Repair - Toronto
**Alert Name:** `MacBook Repair Toronto`
**Purpose:** Toronto MacBook repair seekers
**Expected Volume:** Low (54+ results tested)
**Tested Relevance:** 10/10 (score 7.2)
```
site:reddit.com/r/toronto "macbook" "repair"
```
### MacBook Repair - Vancouver
**Alert Name:** `MacBook Repair Vancouver`
**Purpose:** Vancouver MacBook repair seekers
**Expected Volume:** Low (92+ results tested)
**Tested Relevance:** 10/10 (score 10.0)
```
site:reddit.com/r/vancouver "macbook" "repair"
```
### Laptop Repair - Toronto
**Alert Name:** `Laptop Repair Toronto`
**Purpose:** Toronto laptop repair requests
**Expected Volume:** Low-Medium
```
site:reddit.com/r/toronto "laptop" "repair"
```
### iPhone Repair - Toronto
**Alert Name:** `iPhone Repair Toronto`
**Purpose:** Toronto iPhone repair seekers
**Expected Volume:** Low
```
site:reddit.com/r/toronto "iphone" "repair"
```
### Computer Repair - Vancouver
**Alert Name:** `Computer Repair Vancouver`
**Purpose:** Vancouver computer repair requests
**Expected Volume:** Low-Medium
```
site:reddit.com/r/vancouver ("laptop" OR "computer" OR "pc") "repair"
```
---
## 🔧 Tier 4: Specialized Repairs
### Xbox Repair
**Alert Name:** `Xbox Repair Issues`
**Purpose:** Xbox hardware problems
**Expected Volume:** Medium
```
site:reddit.com/r/techsupport ("xbox" OR "xbox series x" OR "xbox one") ("won't turn on" OR "no power" OR "overheating")
```
### Gaming PC Issues
**Alert Name:** `Gaming PC Issues`
**Purpose:** Gaming PC hardware problems
**Expected Volume:** High
```
site:reddit.com/r/techsupport ("gaming pc" OR "pc build") ("won't turn on" OR "no display" OR "won't boot")
```
### Water Damage - General
**Alert Name:** `Water Damage Electronics`
**Purpose:** All water damage posts
**Expected Volume:** Medium
```
site:reddit.com/r/techsupport ("spilled" OR "water damage" OR "liquid damage") ("laptop" OR "macbook" OR "phone")
```
### SSD/HDD Clicking
**Alert Name:** `Drive Clicking Sounds`
**Purpose:** Failing drives with clicking
**Expected Volume:** Medium
```
site:reddit.com/r/techsupport ("hard drive" OR "hdd") ("clicking" OR "beeping" OR "strange noise")
```
### Screen Repairs
**Alert Name:** `Screen Repairs General`
**Purpose:** All screen repair needs
**Expected Volume:** High
```
site:reddit.com/r/techsupport ("screen" OR "display") ("cracked" OR "broken" OR "shattered" OR "black screen")
```
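Each alert entry in this file uses the same layout: a `###` heading, bold metadata fields, and a fenced query. A minimal sketch of parsing that layout into objects follows. The function name `parseAlerts` is hypothetical, and the repository's own Markdown parsing in `scripts/validate-scraping.js` may work differently.

```javascript
// Hypothetical parser for the alert layout used in this file.
const FENCE = "`".repeat(3); // three backticks, built indirectly to keep this block readable

function parseAlerts(markdown) {
  const alerts = [];
  let current = null;
  let inFence = false;

  for (const line of markdown.split("\n")) {
    if (line.startsWith(FENCE)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) {
      // Lines inside a fence are joined into the current alert's query.
      if (current) current.query = `${current.query} ${line.trim()}`.trim();
      continue;
    }
    const heading = line.match(/^###\s+(.*)$/);
    if (heading) {
      current = { title: heading[1].trim(), fields: {}, query: "" };
      alerts.push(current);
      continue;
    }
    // Metadata lines look like: **Alert Name:** `MacBook Repair Toronto`
    const field = line.match(/^\*\*(.+?):\*\*\s*(.*)$/);
    if (field && current) {
      current.fields[field[1]] = field[2].replace(/`/g, "").trim();
    }
  }
  return alerts;
}

const sample = [
  "### MacBook Repair - Toronto",
  "**Alert Name:** `MacBook Repair Toronto`",
  "**Expected Volume:** Low (54+ results tested)",
  FENCE,
  'site:reddit.com/r/toronto "macbook" "repair"',
  FENCE,
].join("\n");

console.log(parseAlerts(sample));
```

Keeping the metadata in a generic `fields` map means optional lines such as **Tested Relevance** can be absent without special handling.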
---
## 📋 Setup Instructions
### Priority Setup Order:
1. **Start with Tier 1** (9 alerts) - Highest volume, best ROI
2. **Add Tier 2** (5 alerts) - Good supplementary coverage
3. **Add city-specific** (Tier 3) if you need local targeting
4. **Add specialized** (Tier 4) for niche coverage
### Google Alerts Configuration:
1. Go to [Google Alerts](https://www.google.com/alerts)
2. Paste the query exactly as shown (including quotes)
3. Click **Show options** and set:
   - How often: `As-it-happens`
   - Sources: `Automatic`
   - Language: `English`
   - Region: `Canada`
   - How many: `All results`
   - Deliver to: `RSS feed`
4. Click "Create Alert"
5. Click the "RSS" icon to get the feed URL
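Once a feed URL is available, its entries can be read programmatically. The sketch below assumes the feed is Atom-formatted, with `<entry>`, `<title>`, and `<link href="...">` elements. The `extractEntries` helper is hypothetical and the sample XML is made up for illustration; a real XML parser would be more robust than these regular expressions.

```javascript
// Sketch: extract entries from an Atom-style feed string.
// Assumption: entries use <entry>, <title>, and <link href="..."> elements.
function extractEntries(feedXml) {
  const entries = [];
  const entryPattern = /<entry\b[^>]*>([\s\S]*?)<\/entry>/g;
  for (const match of feedXml.matchAll(entryPattern)) {
    const body = match[1];
    const title = (body.match(/<title[^>]*>([\s\S]*?)<\/title>/) || [])[1] || "";
    const link = (body.match(/<link[^>]*href="([^"]*)"/) || [])[1] || "";
    // Decode the one XML entity that commonly appears inside URLs.
    entries.push({ title: title.trim(), link: link.replace(/&amp;/g, "&") });
  }
  return entries;
}

// Made-up sample data for illustration only.
const sampleFeed = `
<feed>
  <entry>
    <title type="html">MacBook won't turn on after spill</title>
    <link href="https://example.com/post?id=1&amp;ref=alert"/>
  </entry>
</feed>`;

console.log(extractEntries(sampleFeed));
```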
### Expected Performance:
- **Tier 1 alerts:** Check daily, expect multiple posts
- **Tier 2 alerts:** Check 2-3x weekly, expect regular posts
- **Tier 3 alerts:** Check weekly, may have gaps
- **Tier 4 alerts:** Check weekly, specialized content
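The check cadence above can be expressed as a small lookup. This is a sketch under stated assumptions: the `isDue` helper is hypothetical, and "2-3x weekly" is approximated as every 3 days.

```javascript
// Check intervals in days, derived from the cadence listed above.
// Assumption: "2-3x weekly" for Tier 2 is approximated as every 3 days.
const CHECK_INTERVAL_DAYS = { 1: 1, 2: 3, 3: 7, 4: 7 };
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: returns true when a tier's feeds should be checked again.
function isDue(tier, lastCheckedMs, nowMs = Date.now()) {
  const days = CHECK_INTERVAL_DAYS[tier];
  if (days === undefined) throw new Error(`Unknown tier: ${tier}`);
  return nowMs - lastCheckedMs >= days * MS_PER_DAY;
}

const twoDaysAgo = Date.now() - 2 * MS_PER_DAY;
console.log(isDue(1, twoDaysAgo)); // Tier 1 is checked daily
console.log(isDue(3, twoDaysAgo)); // Tier 3 is checked weekly
```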
---
## 📊 Validation Results
All alerts are based on patterns that achieved:
- ✅ **100% success rate** (14/14 patterns tested)
- ✅ **10/10 relevant results** per query
- ✅ **Average relevance score: 11.0** (keyword-based score, not capped at 10)
- ✅ **All results are actual repair requests**
### Test Methodology:
- Playwright with anti-detection
- Human-like behavior simulation
- Polite 12-15s delays
- Relevance scoring based on keyword presence
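Two of the points above, the delay range and keyword-based scoring, can be sketched as follows. Both helpers are hypothetical. The weights behind the reported scores are not given in this document, so this version simply counts one point per keyword found.

```javascript
// Random delay between requests, within the 12-15 s range noted above.
function politeDelayMs(minSeconds = 12, maxSeconds = 15) {
  return Math.round((minSeconds + Math.random() * (maxSeconds - minSeconds)) * 1000);
}

// Hypothetical scoring: one point per keyword found in the text (case-insensitive).
// The weights used by the actual validation run are not shown in this document.
function keywordScore(text, keywords) {
  const haystack = text.toLowerCase();
  return keywords.filter((k) => haystack.includes(k.toLowerCase())).length;
}

const keywords = ["won't turn on", "black screen", "repair"];
console.log(keywordScore("My PS5 won't turn on, is repair worth it?", keywords));
console.log(politeDelayMs());
```

A weighted variant would replace the keyword array with a map of keyword to weight and sum the weights of the matches.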
---
## 🎯 Success Criteria
Each alert in this file meets these criteria:
- ✅ Uses consumer language (no technical jargon)
- ✅ Targets high-activity subreddits
- ✅ Tested and validated pattern
- ✅ Expected to produce regular results
- ✅ High relevance to repair services
**Total Alerts:** 25 production-ready alerts
**Coverage:** MacBook, iPhone, iPad, Laptop, PS5, Switch, Xbox, Data Recovery
**Geographic:** Tech support (global) + Toronto/Vancouver (local)
---
## 📝 Notes
- ALERT_NAME markers removed (caused search issues)
- Exclusion terms removed (not needed with targeted subs)
- Queries kept simple and focused
- All patterns tested November 18, 2025
- See `docs/REDDIT_KEYWORDS.md` for full conversion table
**Next Steps:**
1. Set up Tier 1 alerts first (highest priority)
2. Monitor results for 1 week
3. Add Tier 2/3 based on needs
4. Adjust keywords based on actual results received

605
docs/google-alerts.md Normal file

@ -0,0 +1,605 @@
# Google Alert Queries - Working Versions
These queries have been validated to work within Google Alerts limits.
Each query stays under 500 chars, uses ≤8 site filters, and ≤18 OR terms.
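The limits above can be checked mechanically before pasting a query into Google Alerts. This is a minimal sketch with a hypothetical `checkQueryLimits` helper. It assumes "OR terms" means the number of `OR` operators and "site filters" means the number of `site:` operators; the sample query is a shortened version of the format used in this file.

```javascript
// Limits stated above. Assumption: "OR terms" is counted as the number of
// OR operators, and "site filters" as the number of site: operators.
const LIMITS = { maxChars: 500, maxSiteFilters: 8, maxOrTerms: 18 };

function checkQueryLimits(query) {
  // Collapse line breaks and repeated spaces so length is measured consistently.
  const normalized = query.replace(/\s+/g, " ").trim();
  const stats = {
    chars: normalized.length,
    siteFilters: (normalized.match(/\bsite:/g) || []).length,
    orTerms: (normalized.match(/\bOR\b/g) || []).length,
  };
  const ok =
    stats.chars < LIMITS.maxChars &&
    stats.siteFilters <= LIMITS.maxSiteFilters &&
    stats.orTerms <= LIMITS.maxOrTerms;
  return { ok, ...stats };
}

// Shortened sample query for illustration.
const query = `(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("data recovery" OR "dead hard drive" OR "drive clicking")
-entertainment -movie -music -sport`;

console.log(checkQueryLimits(query));
```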
## Data Recovery - Ontario-Other
**Purpose:** Catches general data recovery requests and drive failure scenarios.
**Target:** Users with dead drives, lost files, or corrupted storage.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("data recovery" OR "recover my data" OR "data rescue" OR "professional data recovery" OR "data extraction service" OR "dead hard drive" OR "drive not recognized" OR "drive clicking" OR "drive beeping" OR "drive won't spin" OR "drive won't mount" OR "no boot drive")
-entertainment -movie -music -sport
```
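The query above has three parts: a group of regional subreddit filters, a group of quoted phrases, and a fixed set of exclusions. The sections that follow reuse the same structure with different regions and phrases. Below is a minimal sketch of assembling such a query, using a hypothetical `buildRegionalQuery` helper and only a subset of the regions and phrases:

```javascript
// Region groups copied from the queries in this file (subset shown).
const REGIONS = {
  Prairies: ["saskatoon", "regina", "winnipeg"],
  Western: ["vancouver", "VictoriaBC", "Calgary", "Edmonton"],
};
const EXCLUSIONS = ["entertainment", "movie", "music", "sport"];

// Hypothetical helper: assembles the three-line query format used here.
function buildRegionalQuery(subreddits, phrases, exclusions = EXCLUSIONS) {
  const sites = subreddits.map((s) => `site:reddit.com/r/${s}`).join(" OR ");
  const terms = phrases.map((p) => `"${p}"`).join(" OR ");
  const excluded = exclusions.map((e) => `-${e}`).join(" ");
  return [`(${sites})`, `(${terms})`, excluded].join("\n");
}

console.log(buildRegionalQuery(REGIONS.Prairies, ["data recovery", "drive clicking"]));
```

Generating the queries from shared region and phrase lists keeps the regional variants of one category in sync when a phrase is added or removed.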
## Data Recovery - Western
**Purpose:** Catches general data recovery requests and drive failure scenarios.
**Target:** Users with dead drives, lost files, or corrupted storage.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("data recovery" OR "recover my data" OR "data rescue" OR "professional data recovery" OR "data extraction service" OR "dead hard drive" OR "drive not recognized" OR "drive clicking" OR "drive beeping" OR "drive won't spin" OR "drive won't mount" OR "no boot drive")
-entertainment -movie -music -sport
```
## Data Recovery - Prairies
**Purpose:** Catches general data recovery requests and drive failure scenarios.
**Target:** Users with dead drives, lost files, or corrupted storage.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("data recovery" OR "recover my data" OR "data rescue" OR "professional data recovery" OR "data extraction service" OR "dead hard drive" OR "drive not recognized" OR "drive clicking" OR "drive beeping" OR "drive won't spin" OR "drive won't mount" OR "no boot drive")
-entertainment -movie -music -sport
```
## Data Recovery - Eastern
**Purpose:** Catches general data recovery requests and drive failure scenarios.
**Target:** Users with dead drives, lost files, or corrupted storage.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("data recovery" OR "recover my data" OR "data rescue" OR "professional data recovery" OR "data extraction service" OR "dead hard drive" OR "drive not recognized" OR "drive clicking" OR "drive beeping" OR "drive won't spin" OR "drive won't mount" OR "no boot drive")
-entertainment -movie -music -sport
```
## HDD/SSD Recovery - Ontario-Other
**Purpose:** Targets advanced recovery scenarios requiring clean room work or specialized SSD/RAID recovery.
**Target:** Users with mechanical drive failures, enterprise storage, or encrypted drives.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("clean room data recovery" OR "head swap" OR "stuck spindle" OR "seized spindle" OR "platter swap" OR "nvme recovery" OR "ssd firmware failure" OR "ssd controller failure" OR "ssd not detected" OR "pcie ssd recovery" OR "bitlocker data recovery" OR "raid rebuild")
-entertainment -movie -music -sport
```
## HDD/SSD Recovery - Western
**Purpose:** Targets advanced recovery scenarios requiring clean room work or specialized SSD/RAID recovery.
**Target:** Users with mechanical drive failures, enterprise storage, or encrypted drives.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("clean room data recovery" OR "head swap" OR "stuck spindle" OR "seized spindle" OR "platter swap" OR "nvme recovery" OR "ssd firmware failure" OR "ssd controller failure" OR "ssd not detected" OR "pcie ssd recovery" OR "bitlocker data recovery" OR "raid rebuild")
-entertainment -movie -music -sport
```
## HDD/SSD Recovery - Prairies
**Purpose:** Targets advanced recovery scenarios requiring clean room work or specialized SSD/RAID recovery.
**Target:** Users with mechanical drive failures, enterprise storage, or encrypted drives.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("clean room data recovery" OR "head swap" OR "stuck spindle" OR "seized spindle" OR "platter swap" OR "nvme recovery" OR "ssd firmware failure" OR "ssd controller failure" OR "ssd not detected" OR "pcie ssd recovery" OR "bitlocker data recovery" OR "raid rebuild")
-entertainment -movie -music -sport
```
## HDD/SSD Recovery - Eastern
**Purpose:** Targets advanced recovery scenarios requiring clean room work or specialized SSD/RAID recovery.
**Target:** Users with mechanical drive failures, enterprise storage, or encrypted drives.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("clean room data recovery" OR "head swap" OR "stuck spindle" OR "seized spindle" OR "platter swap" OR "nvme recovery" OR "ssd firmware failure" OR "ssd controller failure" OR "ssd not detected" OR "pcie ssd recovery" OR "bitlocker data recovery" OR "raid rebuild")
-entertainment -movie -music -sport
```
## SD Card/USB Recovery - Ontario-Other
**Purpose:** Focuses on SD cards, USB drives, and mobile device data extraction.
**Target:** Photographers, videographers, and users with lost data on portable media.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("sd card recovery" OR "micro sd recovery" OR "compact flash recovery" OR "cfexpress recovery" OR "usb stick recovery" OR "flash drive recovery" OR "camera card recovery" OR "gopro card recovery" OR "drone footage recovery" OR "phone data extraction" OR "android data recovery" OR "iphone data recovery")
-entertainment -movie -music -sport
```
## SD Card/USB Recovery - Western
**Purpose:** Focuses on SD cards, USB drives, and mobile device data extraction.
**Target:** Photographers, videographers, and users with lost data on portable media.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("sd card recovery" OR "micro sd recovery" OR "compact flash recovery" OR "cfexpress recovery" OR "usb stick recovery" OR "flash drive recovery" OR "camera card recovery" OR "gopro card recovery" OR "drone footage recovery" OR "phone data extraction" OR "android data recovery" OR "iphone data recovery")
-entertainment -movie -music -sport
```
## SD Card/USB Recovery - Prairies
**Purpose:** Focuses on SD cards, USB drives, and mobile device data extraction.
**Target:** Photographers, videographers, and users with lost data on portable media.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("sd card recovery" OR "micro sd recovery" OR "compact flash recovery" OR "cfexpress recovery" OR "usb stick recovery" OR "flash drive recovery" OR "camera card recovery" OR "gopro card recovery" OR "drone footage recovery" OR "phone data extraction" OR "android data recovery" OR "iphone data recovery")
-entertainment -movie -music -sport
```
## SD Card/USB Recovery - Eastern
**Purpose:** Focuses on SD cards, USB drives, and mobile device data extraction.
**Target:** Photographers, videographers, and users with lost data on portable media.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("sd card recovery" OR "micro sd recovery" OR "compact flash recovery" OR "cfexpress recovery" OR "usb stick recovery" OR "flash drive recovery" OR "camera card recovery" OR "gopro card recovery" OR "drone footage recovery" OR "phone data extraction" OR "android data recovery" OR "iphone data recovery")
-entertainment -movie -music -sport
```
## Laptop/MacBook Repair - Ontario-Other
**Purpose:** Captures laptop and MacBook motherboard repair requests, especially power and liquid damage issues.
**Target:** Users with dead laptops, charging problems, or liquid-damaged devices.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("logic board repair" OR "motherboard repair" OR "board level repair" OR "logic board replacement" OR "macbook logic board" OR "macbook won't turn on" OR "macbook no power" OR "macbook dead" OR "macbook won't charge" OR "liquid damage macbook" OR "macbook water damage" OR "macbook coffee spill")
-entertainment -movie -music -sport
```
## Laptop/MacBook Repair - Western
**Purpose:** Captures laptop and MacBook motherboard repair requests, especially power and liquid damage issues.
**Target:** Users with dead laptops, charging problems, or liquid-damaged devices.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("logic board repair" OR "motherboard repair" OR "board level repair" OR "logic board replacement" OR "macbook logic board" OR "macbook won't turn on" OR "macbook no power" OR "macbook dead" OR "macbook won't charge" OR "liquid damage macbook" OR "macbook water damage" OR "macbook coffee spill")
-entertainment -movie -music -sport
```
## Laptop/MacBook Repair - Prairies
**Purpose:** Captures laptop and MacBook motherboard repair requests, especially power and liquid damage issues.
**Target:** Users with dead laptops, charging problems, or liquid-damaged devices.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("logic board repair" OR "motherboard repair" OR "board level repair" OR "logic board replacement" OR "macbook logic board" OR "macbook won't turn on" OR "macbook no power" OR "macbook dead" OR "macbook won't charge" OR "liquid damage macbook" OR "macbook water damage" OR "macbook coffee spill")
-entertainment -movie -music -sport
```
## Laptop/MacBook Repair - Eastern
**Purpose:** Captures laptop and MacBook motherboard repair requests, especially power and liquid damage issues.
**Target:** Users with dead laptops, charging problems, or liquid-damaged devices.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("logic board repair" OR "motherboard repair" OR "board level repair" OR "logic board replacement" OR "macbook logic board" OR "macbook won't turn on" OR "macbook no power" OR "macbook dead" OR "macbook won't charge" OR "liquid damage macbook" OR "macbook water damage" OR "macbook coffee spill")
-entertainment -movie -music -sport
```
## GPU/Desktop Repair - Ontario-Other
**Purpose:** Targets GPU failures and desktop motherboard issues, including POST/boot problems.
**Target:** PC builders, gamers, and users with desktop hardware failures.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("gpu repair" OR "graphics card repair" OR "gpu no display" OR "gpu artifacting" OR "gpu reball" OR "gpu reflow" OR "gpu hdmi repair" OR "pc motherboard repair" OR "desktop board repair" OR "custom pc repair" OR "power supply blew motherboard" OR "pc no post")
-entertainment -movie -music -sport
```
## GPU/Desktop Repair - Western
**Purpose:** Targets GPU failures and desktop motherboard issues, including POST/boot problems.
**Target:** PC builders, gamers, and users with desktop hardware failures.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("gpu repair" OR "graphics card repair" OR "gpu no display" OR "gpu artifacting" OR "gpu reball" OR "gpu reflow" OR "gpu hdmi repair" OR "pc motherboard repair" OR "desktop board repair" OR "custom pc repair" OR "power supply blew motherboard" OR "pc no post")
-entertainment -movie -music -sport
```
## GPU/Desktop Repair - Prairies
**Purpose:** Targets GPU failures and desktop motherboard issues, including POST/boot problems.
**Target:** PC builders, gamers, and users with desktop hardware failures.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("gpu repair" OR "graphics card repair" OR "gpu no display" OR "gpu artifacting" OR "gpu reball" OR "gpu reflow" OR "gpu hdmi repair" OR "pc motherboard repair" OR "desktop board repair" OR "custom pc repair" OR "power supply blew motherboard" OR "pc no post")
-entertainment -movie -music -sport
```
## GPU/Desktop Repair - Eastern
**Purpose:** Targets GPU failures and desktop motherboard issues, including POST/boot problems.
**Target:** PC builders, gamers, and users with desktop hardware failures.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("gpu repair" OR "graphics card repair" OR "gpu no display" OR "gpu artifacting" OR "gpu reball" OR "gpu reflow" OR "gpu hdmi repair" OR "pc motherboard repair" OR "desktop board repair" OR "custom pc repair" OR "power supply blew motherboard" OR "pc no post")
-entertainment -movie -music -sport
```
## Console Repair - Ontario-Other
**Purpose:** Catches console repair requests, especially HDMI port issues and power failures.
**Target:** Gamers with broken PS5/Xbox/Switch consoles.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("ps5 hdmi repair" OR "ps5 no video" OR "ps5 blue light of death" OR "ps5 motherboard repair" OR "ps4 hdmi port" OR "ps4 no power" OR "xbox hdmi repair" OR "xbox one x no power" OR "xbox series x hdmi" OR "nintendo switch board repair" OR "switch won't charge" OR "switch no display")
-entertainment -movie -music -sport
```
## Console Repair - Western
**Purpose:** Catches console repair requests, especially HDMI port issues and power failures.
**Target:** Gamers with broken PS5/Xbox/Switch consoles.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("ps5 hdmi repair" OR "ps5 no video" OR "ps5 blue light of death" OR "ps5 motherboard repair" OR "ps4 hdmi port" OR "ps4 no power" OR "xbox hdmi repair" OR "xbox one x no power" OR "xbox series x hdmi" OR "nintendo switch board repair" OR "switch won't charge" OR "switch no display")
-entertainment -movie -music -sport
```
## Console Repair - Prairies
**Purpose:** Catches console repair requests, especially HDMI port issues and power failures.
**Target:** Gamers with broken PS5/Xbox/Switch consoles.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("ps5 hdmi repair" OR "ps5 no video" OR "ps5 blue light of death" OR "ps5 motherboard repair" OR "ps4 hdmi port" OR "ps4 no power" OR "xbox hdmi repair" OR "xbox one x no power" OR "xbox series x hdmi" OR "nintendo switch board repair" OR "switch won't charge" OR "switch no display")
-entertainment -movie -music -sport
```
## Console Repair - Eastern
**Purpose:** Catches console repair requests, especially HDMI port issues and power failures.
**Target:** Gamers with broken PS5/Xbox/Switch consoles.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("ps5 hdmi repair" OR "ps5 no video" OR "ps5 blue light of death" OR "ps5 motherboard repair" OR "ps4 hdmi port" OR "ps4 no power" OR "xbox hdmi repair" OR "xbox one x no power" OR "xbox series x hdmi" OR "nintendo switch board repair" OR "switch won't charge" OR "switch no display")
-entertainment -movie -music -sport
```
## Console Refurb - Ontario-Other
**Purpose:** Targets console refurbishment, upgrade, and cleaning requests.
**Target:** Users wanting console upgrades, cleaning, or refurbishment.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("console refurbishment" OR "console refurb" OR "console rebuild" OR "console recap" OR "console upgrade service" OR "ps5 upgrade" OR "ps5 ssd install" OR "ps5 fan replacement" OR "ps5 cleaning service" OR "ps4 pro refurbishment" OR "xbox ssd upgrade" OR "xbox cleaning service")
-entertainment -movie -music -sport
```
## Console Refurb - Western
**Purpose:** Targets console refurbishment, upgrade, and cleaning requests.
**Target:** Users wanting console upgrades, cleaning, or refurbishment.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("console refurbishment" OR "console refurb" OR "console rebuild" OR "console recap" OR "console upgrade service" OR "ps5 upgrade" OR "ps5 ssd install" OR "ps5 fan replacement" OR "ps5 cleaning service" OR "ps4 pro refurbishment" OR "xbox ssd upgrade" OR "xbox cleaning service")
-entertainment -movie -music -sport
```
## Console Refurb - Prairies
**Purpose:** Targets console refurbishment, upgrade, and cleaning requests.
**Target:** Users wanting console upgrades, cleaning, or refurbishment.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("console refurbishment" OR "console refurb" OR "console rebuild" OR "console recap" OR "console upgrade service" OR "ps5 upgrade" OR "ps5 ssd install" OR "ps5 fan replacement" OR "ps5 cleaning service" OR "ps4 pro refurbishment" OR "xbox ssd upgrade" OR "xbox cleaning service")
-entertainment -movie -music -sport
```
## Console Refurb - Eastern
**Purpose:** Targets console refurbishment, upgrade, and cleaning requests.
**Target:** Users wanting console upgrades, cleaning, or refurbishment.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("console refurbishment" OR "console refurb" OR "console rebuild" OR "console recap" OR "console upgrade service" OR "ps5 upgrade" OR "ps5 ssd install" OR "ps5 fan replacement" OR "ps5 cleaning service" OR "ps4 pro refurbishment" OR "xbox ssd upgrade" OR "xbox cleaning service")
-entertainment -movie -music -sport
```
## Smartphone Repair - Ontario-Other
**Purpose:** Captures iPhone, Samsung, Pixel, and other smartphone motherboard repair requests.
**Target:** Users with dead phones, charging issues, or component failures (Face ID, audio IC, etc.).
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("iphone logic board" OR "iphone board repair" OR "iphone microsolder" OR "iphone no power" OR "iphone boot loop" OR "iphone won't charge" OR "iphone touch disease" OR "iphone face id repair" OR "iphone audio ic" OR "iphone tristar" OR "iphone charging ic" OR "samsung logic board")
-entertainment -movie -music -sport
```
## Smartphone Repair - Western
**Purpose:** Captures iPhone, Samsung, Pixel, and other smartphone motherboard repair requests.
**Target:** Users with dead phones, charging issues, or component failures (Face ID, audio IC, etc.).
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("iphone logic board" OR "iphone board repair" OR "iphone microsolder" OR "iphone no power" OR "iphone boot loop" OR "iphone won't charge" OR "iphone touch disease" OR "iphone face id repair" OR "iphone audio ic" OR "iphone tristar" OR "iphone charging ic" OR "samsung logic board")
-entertainment -movie -music -sport
```
## Smartphone Repair - Prairies
**Purpose:** Captures iPhone, Samsung, Pixel, and other smartphone motherboard repair requests.
**Target:** Users with dead phones, charging issues, or component failures (Face ID, audio IC, etc.).
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("iphone logic board" OR "iphone board repair" OR "iphone microsolder" OR "iphone no power" OR "iphone boot loop" OR "iphone won't charge" OR "iphone touch disease" OR "iphone face id repair" OR "iphone audio ic" OR "iphone tristar" OR "iphone charging ic" OR "samsung logic board")
-entertainment -movie -music -sport
```
## Smartphone Repair - Eastern
**Purpose:** Captures iPhone, Samsung, Pixel, and other smartphone motherboard repair requests.
**Target:** Users with dead phones, charging issues, or component failures (Face ID, audio IC, etc.).
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("iphone logic board" OR "iphone board repair" OR "iphone microsolder" OR "iphone no power" OR "iphone boot loop" OR "iphone won't charge" OR "iphone touch disease" OR "iphone face id repair" OR "iphone audio ic" OR "iphone tristar" OR "iphone charging ic" OR "samsung logic board")
-entertainment -movie -music -sport
```
## iPad Repair - Ontario-Other
**Purpose:** Targets iPad repair requests, especially power, charging, and connector issues.
**Target:** Users with broken iPads, charging problems, or stuck devices.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("ipad logic board" OR "ipad board repair" OR "ipad no power" OR "ipad won't charge" OR "ipad boot loop" OR "ipad stuck on apple logo" OR "ipad screen connector" OR "ipad battery connector" OR "ipad backlight repair" OR "ipad audio ic" OR "ipad touch disease" OR "ipad liquid damage")
-entertainment -movie -music -sport
```
## iPad Repair - Western
**Purpose:** Targets iPad repair requests, especially power, charging, and connector issues.
**Target:** Users with broken iPads, charging problems, or stuck devices.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("ipad logic board" OR "ipad board repair" OR "ipad no power" OR "ipad won't charge" OR "ipad boot loop" OR "ipad stuck on apple logo" OR "ipad screen connector" OR "ipad battery connector" OR "ipad backlight repair" OR "ipad audio ic" OR "ipad touch disease" OR "ipad liquid damage")
-entertainment -movie -music -sport
```
## iPad Repair - Prairies
**Purpose:** Targets iPad repair requests, especially power, charging, and connector issues.
**Target:** Users with broken iPads, charging problems, or stuck devices.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("ipad logic board" OR "ipad board repair" OR "ipad no power" OR "ipad won't charge" OR "ipad boot loop" OR "ipad stuck on apple logo" OR "ipad screen connector" OR "ipad battery connector" OR "ipad backlight repair" OR "ipad audio ic" OR "ipad touch disease" OR "ipad liquid damage")
-entertainment -movie -music -sport
```
## iPad Repair - Eastern
**Purpose:** Targets iPad repair requests, especially power, charging, and connector issues.
**Target:** Users with broken iPads, charging problems, or stuck devices.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("ipad logic board" OR "ipad board repair" OR "ipad no power" OR "ipad won't charge" OR "ipad boot loop" OR "ipad stuck on apple logo" OR "ipad screen connector" OR "ipad battery connector" OR "ipad backlight repair" OR "ipad audio ic" OR "ipad touch disease" OR "ipad liquid damage")
-entertainment -movie -music -sport
```
## Connector Repair - Western
**Purpose:** Targets connector repair requests - FPC, flex cables, and board connectors.
**Target:** Users with ripped connectors, damaged flex cables, or lifted pads.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("fpc connector" OR "flex connector repair" OR "screen connector broke" OR "display connector ripped" OR "lcd connector burnt" OR "battery connector ripped" OR "charge port flex" OR "board connector replacement" OR "connector pads lifted" OR "connector ripped off board" OR "replace connector pins" OR "micro coax connector repair")
-entertainment -movie -music -sport
```
## Connector Repair - Prairies
**Purpose:** Targets connector repair requests - FPC, flex cables, and board connectors.
**Target:** Users with ripped connectors, damaged flex cables, or lifted pads.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("fpc connector" OR "flex connector repair" OR "screen connector broke" OR "display connector ripped" OR "lcd connector burnt" OR "battery connector ripped" OR "charge port flex" OR "board connector replacement" OR "connector pads lifted" OR "connector ripped off board" OR "replace connector pins" OR "micro coax connector repair")
-entertainment -movie -music -sport
```
## Connector Repair - Eastern
**Purpose:** Targets connector repair requests - FPC, flex cables, and board connectors.
**Target:** Users with ripped connectors, damaged flex cables, or lifted pads.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("fpc connector" OR "flex connector repair" OR "screen connector broke" OR "display connector ripped" OR "lcd connector burnt" OR "battery connector ripped" OR "charge port flex" OR "board connector replacement" OR "connector pads lifted" OR "connector ripped off board" OR "replace connector pins" OR "micro coax connector repair")
-entertainment -movie -music -sport
```
## Key Fob Repair - Ontario-GTA
**Purpose:** Catches car key fob repair requests. Note: May require assessment for compatibility.
**Target:** Users with broken key fobs, water damage, or keyless entry issues.
```
(site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/CambridgeON OR site:reddit.com/r/guelph OR site:reddit.com/r/toronto OR site:reddit.com/r/mississauga OR site:reddit.com/r/brampton)
("key fob repair" OR "car key fob not working" OR "keyless entry repair" OR "key fob water damage" OR "key fob board" OR "key fob microsolder" OR "key fob battery drain" OR "key fob pcb repair" OR "smart key repair" OR "remote starter repair")
-entertainment -movie -music -sport
```
## Key Fob Repair - Ontario-Other
**Purpose:** Catches car key fob repair requests. Note: May require assessment for compatibility.
**Target:** Users with broken key fobs, water damage, or keyless entry issues.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("key fob repair" OR "car key fob not working" OR "keyless entry repair" OR "key fob water damage" OR "key fob board" OR "key fob microsolder" OR "key fob battery drain" OR "key fob pcb repair" OR "smart key repair" OR "remote starter repair")
-entertainment -movie -music -sport
```
## Key Fob Repair - Western
**Purpose:** Catches car key fob repair requests. Note: May require assessment for compatibility.
**Target:** Users with broken key fobs, water damage, or keyless entry issues.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("key fob repair" OR "car key fob not working" OR "keyless entry repair" OR "key fob water damage" OR "key fob board" OR "key fob microsolder" OR "key fob battery drain" OR "key fob pcb repair" OR "smart key repair" OR "remote starter repair")
-entertainment -movie -music -sport
```
## Key Fob Repair - Prairies
**Purpose:** Catches car key fob repair requests. Note: May require assessment for compatibility.
**Target:** Users with broken key fobs, water damage, or keyless entry issues.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("key fob repair" OR "car key fob not working" OR "keyless entry repair" OR "key fob water damage" OR "key fob board" OR "key fob microsolder" OR "key fob battery drain" OR "key fob pcb repair" OR "smart key repair" OR "remote starter repair")
-entertainment -movie -music -sport
```
## Key Fob Repair - Eastern
**Purpose:** Catches car key fob repair requests. Note: May require assessment for compatibility.
**Target:** Users with broken key fobs, water damage, or keyless entry issues.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("key fob repair" OR "car key fob not working" OR "keyless entry repair" OR "key fob water damage" OR "key fob board" OR "key fob microsolder" OR "key fob battery drain" OR "key fob pcb repair" OR "smart key repair" OR "remote starter repair")
-entertainment -movie -music -sport
```
## Microsolder/Diagnostics - Ontario-Other
**Purpose:** Targets advanced board-level repair requests requiring microsoldering or diagnostic work.
**Target:** Users needing BGA reballing, short hunting, trace repair, or chip-off services.
```
(site:reddit.com/r/ontario OR site:reddit.com/r/londonontario OR site:reddit.com/r/HamiltonOntario OR site:reddit.com/r/niagara OR site:reddit.com/r/ottawa)
("microsolder" OR "micro solder" OR "bga reball" OR "ball grid array repair" OR "reball service" OR "board level diagnostics" OR "schematic reading" OR "short hunting" OR "find board short" OR "thermal camera diagnostics" OR "board trace repair" OR "pad repair")
-entertainment -movie -music -sport
```
## Microsolder/Diagnostics - Western
**Purpose:** Targets advanced board-level repair requests requiring microsoldering or diagnostic work.
**Target:** Users needing BGA reballing, short hunting, trace repair, or chip-off services.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("microsolder" OR "micro solder" OR "bga reball" OR "ball grid array repair" OR "reball service" OR "board level diagnostics" OR "schematic reading" OR "short hunting" OR "find board short" OR "thermal camera diagnostics" OR "board trace repair" OR "pad repair")
-entertainment -movie -music -sport
```
## Microsolder/Diagnostics - Prairies
**Purpose:** Targets advanced board-level repair requests requiring microsoldering or diagnostic work.
**Target:** Users needing BGA reballing, short hunting, trace repair, or chip-off services.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("microsolder" OR "micro solder" OR "bga reball" OR "ball grid array repair" OR "reball service" OR "board level diagnostics" OR "schematic reading" OR "short hunting" OR "find board short" OR "thermal camera diagnostics" OR "board trace repair" OR "pad repair")
-entertainment -movie -music -sport
```
## Microsolder/Diagnostics - Eastern
**Purpose:** Targets advanced board-level repair requests requiring microsoldering or diagnostic work.
**Target:** Users needing BGA reballing, short hunting, trace repair, or chip-off services.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("microsolder" OR "micro solder" OR "bga reball" OR "ball grid array repair" OR "reball service" OR "board level diagnostics" OR "schematic reading" OR "short hunting" OR "find board short" OR "thermal camera diagnostics" OR "board trace repair" OR "pad repair")
-entertainment -movie -music -sport
```
## Device Refurb/Trade-In - Western
**Purpose:** Captures opportunities to buy broken devices for refurbishment or trade-in requests.
**Target:** Users selling broken devices or seeking refurbishment services.
```
(site:reddit.com/r/vancouver OR site:reddit.com/r/VictoriaBC OR site:reddit.com/r/Calgary OR site:reddit.com/r/Edmonton)
("refurbished console" OR "refurbished macbook" OR "refurbished laptop" OR "refurbished iphone" OR "device refurbishment service" OR "console trade-in repair" OR "buy broken console" OR "buy broken laptop" OR "broken macbook wanted" OR "electronics refurbishment" OR "selling broken ps5" OR "selling broken macbook")
-entertainment -movie -music -sport
```
## Device Refurb/Trade-In - Prairies
**Purpose:** Captures opportunities to buy broken devices for refurbishment or trade-in requests.
**Target:** Users selling broken devices or seeking refurbishment services.
```
(site:reddit.com/r/saskatoon OR site:reddit.com/r/regina OR site:reddit.com/r/winnipeg)
("refurbished console" OR "refurbished macbook" OR "refurbished laptop" OR "refurbished iphone" OR "device refurbishment service" OR "console trade-in repair" OR "buy broken console" OR "buy broken laptop" OR "broken macbook wanted" OR "electronics refurbishment" OR "selling broken ps5" OR "selling broken macbook")
-entertainment -movie -music -sport
```
## Device Refurb/Trade-In - Eastern
**Purpose:** Captures opportunities to buy broken devices for refurbishment or trade-in requests.
**Target:** Users selling broken devices or seeking refurbishment services.
```
(site:reddit.com/r/montreal OR site:reddit.com/r/quebeccity OR site:reddit.com/r/halifax OR site:reddit.com/r/newfoundland)
("refurbished console" OR "refurbished macbook" OR "refurbished laptop" OR "refurbished iphone" OR "device refurbishment service" OR "console trade-in repair" OR "buy broken console" OR "buy broken laptop" OR "broken macbook wanted" OR "electronics refurbishment" OR "selling broken ps5" OR "selling broken macbook")
-entertainment -movie -music -sport
```
## Repair Leads - Kijiji/Used.ca CA
**Purpose:** Catches repair requests on Canadian classified sites.
```
-"ALERT_NAME:Repair Leads - Kijiji/Used.ca CA" (site:kijiji.ca OR site:used.ca OR site:usedvictoria.com OR site:usedvancouver.com OR site:usedottawa.com OR site:usededmonton.com)
("data recovery" OR "recover my data" OR "logic board repair" OR "motherboard repair" OR "console repair" OR "ps5 repair" OR "xbox repair" OR "macbook repair" OR "iphone repair" OR "ipad repair" OR "microsolder" OR "charging port repair" OR "hdmi port repair" OR "board level repair" OR "liquid damage repair" OR "needs repair" OR "repair wanted" OR "looking for repair")
-job -jobs -hiring -rent -rental
```
## Repair Leads - Facebook CA
**Purpose:** Targets Facebook Marketplace and public group repair requests.
```
-"ALERT_NAME:Repair Leads - Facebook CA" (site:facebook.com/groups OR site:facebook.com/marketplace)
("data recovery" OR "logic board repair" OR "macbook repair" OR "laptop repair" OR "console repair" OR "ps5 repair" OR "switch repair" OR "iphone repair" OR "microsolder" OR "charging port repair" OR "liquid damage repair" OR "motherboard repair" OR "repair shop recommendation" OR "anyone fix" OR "where to repair" OR "can someone repair")
-job -jobs -hiring -giveaway
```
## Repair Leads - Craigslist CA
**Purpose:** Monitors Craigslist for repair service requests.
```
-"ALERT_NAME:Repair Leads - Craigslist CA" (site:craigslist.org OR site:craigslist.ca)
("data recovery" OR "recover files" OR "logic board repair" OR "macbook repair" OR "laptop repair" OR "console repair" OR "ps5 repair" OR "xbox repair" OR "switch repair" OR "iphone repair" OR "microsolder" OR "charging port repair" OR "motherboard repair" OR "board level repair" OR "repair service needed" OR "need repair" OR "seeking repair")
-job -jobs -gig -gigs -housing
```
## Repair Leads - Tech Forums CA
**Purpose:** Catches repair discussions on Canadian tech forums.
```
-"ALERT_NAME:Repair Leads - Tech Forums CA" (site:forums.redflagdeals.com OR site:community.hwbot.org OR site:dslreports.com/forum)
("data recovery" OR "recover my data" OR "logic board repair" OR "motherboard repair" OR "macbook repair" OR "laptop repair" OR "console repair" OR "gpu repair" OR "ps5 repair" OR "microsolder" OR "charging port repair" OR "board level repair" OR "need a repair shop" OR "recommend repair shop" OR "can someone fix")
-job -jobs -hiring
```
## Repair Communities - Discord CA
**Purpose:** Finds repair-focused Discord communities and directories.
```
-"ALERT_NAME:Repair Communities - Discord CA" (site:discords.com OR site:disboard.org OR site:top.gg)
("electronics repair" OR "microsolder" OR "data recovery" OR "board repair" OR "console repair" OR "retro console repair" OR "macbook repair" OR "iphone repair" OR "repair community" OR "electronics refurb" OR "repair business")
-roblox -minecraft -anime -gaming
```
## Bulk Electronics - Classifieds CA
**Purpose:** Finds wholesale electronics lots and liquidation pallets.
```
-"ALERT_NAME:Bulk Electronics - Classifieds CA" (site:kijiji.ca OR site:facebook.com/marketplace OR site:facebook.com/groups OR site:craigslist.ca)
("wholesale electronics" OR "bulk electronics" OR "bulk devices" OR "liquidation electronics" OR "liquidation lot" OR "surplus electronics" OR "electronics auction" OR "electronics pallet" OR "returns pallet" OR "returns truckload" OR "salvage electronics" OR "for parts lot" OR "broken electronics lot" OR "repairable electronics lot")
-job -jobs -hiring -housing -rent -rental -service
```
## Bulk Laptops - Auctions CA
**Purpose:** Targets laptop and MacBook bulk lots from auctions and classifieds.
```
-"ALERT_NAME:Bulk Laptops - Auctions CA" (site:kijiji.ca OR site:facebook.com/marketplace OR site:craigslist.ca OR site:bidspotter.com/en-ca OR site:govdeals.ca)
("bulk laptops" OR "laptop lot" OR "laptop liquidation" OR "surplus laptops" OR "for parts laptops" OR "broken laptop lot" OR "macbook lot" OR "macbook bulk" OR "corporate laptop surplus" OR "business laptop liquidation" OR "IT asset disposal" OR "fleet laptop auction")
-job -jobs -hiring -housing -rent -rental
```
## Bulk Phones/Tablets - Auctions CA
**Purpose:** Finds smartphone and tablet bulk lots for refurbishment.
```
-"ALERT_NAME:Bulk Phones/Tablets - Auctions CA" (site:kijiji.ca OR site:facebook.com/marketplace OR site:craigslist.ca OR site:bidspotter.com/en-ca OR site:liquidation.com)
("iphone lot" OR "iphone bulk" OR "smartphone lot" OR "smartphone bulk" OR "android phone lot" OR "for parts phones" OR "broken phone lot" OR "mobile phone liquidation" OR "mobile return pallet" OR "ipad lot" OR "tablet bulk" OR "tablet liquidation")
-job -jobs -hiring -housing -rent -rental
```
## Bulk Consoles - Auctions CA
**Purpose:** Targets console and gaming device bulk lots.
```
-"ALERT_NAME:Bulk Consoles - Auctions CA" (site:kijiji.ca OR site:facebook.com/marketplace OR site:craigslist.ca OR site:bidspotter.com/en-ca OR site:liquidation.com OR site:hibid.com)
("console lot" OR "gaming console bulk" OR "ps5 lot" OR "playstation lot" OR "xbox lot" OR "switch lot" OR "retro console lot" OR "broken console lot" OR "for parts consoles" OR "video game liquidation" OR "game store liquidation" OR "controller lot" OR "joycon lot" OR "arcade liquidation")
-job -jobs -hiring -housing -rent -rental -digital
```
## Gov/Corporate Auctions - Electronics CA
**Purpose:** Monitors government and corporate surplus auctions for electronics.
```
-"ALERT_NAME:Gov/Corporate Auctions - Electronics CA" (site:govdeals.ca OR site:gcsurplus.ca OR site:go-dove.com OR site:publicsurplus.com OR site:auctionnetwork.ca OR site:bidspotter.com/en-ca)
("electronics auction" OR "IT equipment auction" OR "computer liquidation" OR "surplus electronics auction" OR "asset disposition" OR "surplus devices" OR "fleet laptops" OR "office electronics auction" OR "returns auction" OR "warehouse clearance")
-vehicle -vehicles -truck -bus -furniture
```
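The classified, forum, and auction queries above each begin with a negated exact phrase of the form `-"ALERT_NAME:<name>"`. This looks like a label that travels with the query so a downstream consumer can recover the alert name, since excluding such a string is unlikely to remove any real result. That reading is an assumption; the commit does not document the convention. Under that assumption, a sketch of separating the label from the searchable part (the names `ALERT_NAME_MARKER` and `splitAlertQuery` are hypothetical):

```javascript
// Assumed convention: a leading -"ALERT_NAME:<name>" term labels the alert.
const ALERT_NAME_MARKER = /^-"ALERT_NAME:([^"]+)"\s*/;

// Hypothetical helper: returns the label (or null) and the remaining query text.
function splitAlertQuery(rawQuery) {
  const match = rawQuery.match(ALERT_NAME_MARKER);
  if (!match) {
    return { name: null, query: rawQuery.trim() };
  }
  return { name: match[1], query: rawQuery.slice(match[0].length).trim() };
}

const sample =
  '-"ALERT_NAME:Repair Leads - Craigslist CA" (site:craigslist.org OR site:craigslist.ca)';
const parsed = splitAlertQuery(sample);
console.log(parsed.name);
console.log(parsed.query);
```

Queries without the marker, such as the regional Reddit alerts, pass through unchanged with `name` set to `null`.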

22
package.json Normal file

@ -0,0 +1,22 @@
{
  "name": "rss-feedmonitor",
  "version": "1.0.0",
  "description": "RSS Feed Monitor with Playwright scraping and validation",
  "type": "module",
  "scripts": {
    "test": "playwright test",
    "test:headed": "playwright test --headed",
    "scrape": "node scripts/playwright-scraper.js",
    "validate": "node scripts/validate-scraping.js",
    "record:alert-setup": "playwright codegen https://www.google.com/alerts --target javascript --output tests/alert-setup-recorded.spec.js",
    "setup-alerts": "node scripts/setup-alerts-automated.js"
  },
  "dependencies": {
    "@playwright/test": "^1.40.0",
    "playwright": "^1.40.0"
  },
  "devDependencies": {
    "@types/node": "^20.10.0"
  }
}

106
playwright.config.js Normal file

@ -0,0 +1,106 @@
import { defineConfig, devices } from '@playwright/test';

/**
 * Playwright configuration for testing
 * @see https://playwright.dev/docs/test-configuration
 */
export default defineConfig({
  testDir: './tests',

  // Maximum time one test can run
  timeout: 60 * 1000,

  // Test execution settings
  fullyParallel: false, // Run tests sequentially to avoid rate limiting
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: 1, // Single worker to avoid parallel requests

  // Reporter configuration
  reporter: [
    ['html'],
    ['list']
  ],

  // Shared settings for all projects
  use: {
    // Base URL for tests
    baseURL: 'https://www.google.com',

    // Collect trace on first retry
    trace: 'on-first-retry',

    // Screenshot on failure
    screenshot: 'only-on-failure',

    // Video on failure
    video: 'retain-on-failure',

    // Timeout for actions (click, fill, etc)
    actionTimeout: 10000,

    // Navigation timeout
    navigationTimeout: 30000,

    // Locale and timezone
    locale: 'en-CA',
    timezoneId: 'America/Toronto',

    // Geolocation (Toronto); pages can only read it once the permission is granted
    geolocation: { latitude: 43.6532, longitude: -79.3832 },
    permissions: ['geolocation'],

    // Color scheme
    colorScheme: 'light',

    // Extra HTTP headers
    extraHTTPHeaders: {
      'Accept-Language': 'en-CA,en-US;q=0.9,en;q=0.8',
    },
  },

  // Configure projects for major browsers
  projects: [
    {
      name: 'chromium',
      use: {
        ...devices['Desktop Chrome'],
        // Disable some automation detection
        launchOptions: {
          args: [
            '--disable-blink-features=AutomationControlled',
            '--disable-features=IsolateOrigins,site-per-process',
          ],
        },
      },
    },
    {
      name: 'firefox',
      use: { ...devices['Desktop Firefox'] },
    },
    {
      name: 'webkit',
      use: { ...devices['Desktop Safari'] },
    },

    // Test against mobile viewports (optional)
    // {
    //   name: 'Mobile Chrome',
    //   use: { ...devices['Pixel 5'] },
    // },
    // {
    //   name: 'Mobile Safari',
    //   use: { ...devices['iPhone 12'] },
    // },
  ],

  // Run local dev server before starting tests (if needed)
  // webServer: {
  //   command: 'npm run start',
  //   url: 'http://localhost:3000',
  //   reuseExistingServer: !process.env.CI,
  // },
});

370
scripts/analyze-results.js Normal file

@ -0,0 +1,370 @@
/**
* Analyze validation results and generate tuning recommendations
* Usage: node scripts/analyze-results.js validation-report-*.json
*/
import { readFile, writeFile } from 'fs/promises';
/**
* Analyze a validation report and generate recommendations
*/
function analyzeReport(report) {
const { results, successful, failed, total } = report;
const analysis = {
summary: {
total,
successful,
failed,
successRate: report.successRate,
avgRecencyScore: report.avgRecencyScore || 0,
avgRelevanceScore: report.avgRelevanceScore || 0
},
categories: {
excellent: [], // Recent, relevant, good volume
good: [], // Some recent, mostly relevant
needsTuning: [], // Low recency or relevance
failing: [] // No results
},
recommendations: []
};
// Categorize each alert
results.forEach(result => {
if (!result.success) {
analysis.categories.failing.push(result);
return;
}
// Share of results judged relevant (0 when the query returned nothing)
const relevantRatio = result.resultCount > 0 ? result.relevantCount / result.resultCount : 0;
if (result.recentCount >= 3 && relevantRatio >= 0.6 && result.resultCount >= 5) {
analysis.categories.excellent.push(result);
} else if (result.recentCount >= 1 && relevantRatio >= 0.4) {
analysis.categories.good.push(result);
} else {
analysis.categories.needsTuning.push(result);
}
});
// Generate specific recommendations
// No recent results
const noRecent = results.filter(r => r.success && (r.recentCount || 0) === 0);
if (noRecent.length > 0) {
analysis.recommendations.push({
category: 'Recency Issues',
severity: 'high',
count: noRecent.length,
alerts: noRecent.map(r => r.name),
issue: 'No results from today or this week',
suggestions: [
'Broaden keywords to capture more general discussions',
'Check if topic is actively discussed (may be seasonal)',
'Consider adding trending terms related to the topic',
'Remove overly specific technical terms'
]
});
}
// Low relevance
const lowRelevance = results.filter(r => r.success && r.relevantCount < (r.resultCount / 2));
if (lowRelevance.length > 0) {
analysis.recommendations.push({
category: 'Relevance Issues',
severity: 'medium',
count: lowRelevance.length,
alerts: lowRelevance.map(r => r.name),
issue: 'Less than 50% of results are relevant',
suggestions: [
'Add more specific repair-related keywords',
'Include domain filters (site:reddit.com, site:kijiji.ca)',
'Add negative keywords to exclude noise (-job -jobs -career)',
'Use exact phrase matching with quotes for key terms'
]
});
}
// Few results
const fewResults = results.filter(r => r.success && r.resultCount < 5);
if (fewResults.length > 0) {
analysis.recommendations.push({
category: 'Low Volume',
severity: 'medium',
count: fewResults.length,
alerts: fewResults.map(r => r.name),
issue: 'Fewer than 5 results returned',
suggestions: [
'Use broader search terms (remove some specific keywords)',
'Try OR operators to include synonyms',
'Expand geographic scope',
'Check for typos in query'
]
});
}
// Failing alerts
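// Derive the list from the results themselves so the count always matches the alerts listed
const failingAlerts = results.filter(r => !r.success);
if (failingAlerts.length > 0) {
analysis.recommendations.push({
category: 'Failing Alerts',
severity: 'critical',
count: failingAlerts.length,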
alerts: failingAlerts.map(r => r.name),
issue: 'Queries returning no results or errors',
suggestions: [
'Test query directly in Google Search',
'Simplify query structure',
'Check for syntax errors',
'Verify site filters are correct',
'Consider if topic exists in target locations'
]
});
}
return analysis;
}
/**
* Generate markdown report from analysis
*/
function generateMarkdownReport(analysis, reportName) {
const lines = [];
lines.push(`# Validation Analysis Report`);
lines.push(``);
lines.push(`**Source:** ${reportName}`);
lines.push(`**Generated:** ${new Date().toLocaleString()}`);
lines.push(``);
lines.push(`---`);
lines.push(``);
// Summary
lines.push(`## Summary`);
lines.push(``);
lines.push(`- **Total Alerts Tested:** ${analysis.summary.total}`);
lines.push(`- **Successful:** ${analysis.summary.successful} (${Math.round(analysis.summary.successRate)}%)`);
lines.push(`- **Failed:** ${analysis.summary.failed}`);
lines.push(`- **Avg Recency Score:** ${analysis.summary.avgRecencyScore}/10`);
lines.push(`- **Avg Relevance Score:** ${analysis.summary.avgRelevanceScore}`);
lines.push(``);
// Categories
lines.push(`## Alert Performance Categories`);
lines.push(``);
lines.push(`### ✅ Excellent (${analysis.categories.excellent.length})`);
lines.push(`*Recent results, high relevance, good volume*`);
lines.push(``);
if (analysis.categories.excellent.length > 0) {
analysis.categories.excellent.forEach(alert => {
lines.push(`- **${alert.name}**`);
lines.push(` - Results: ${alert.resultCount}, Recent: ${alert.recentCount}, Relevant: ${alert.relevantCount}`);
lines.push(` - **Action:** Keep as-is, this alert is performing well`);
lines.push(``);
});
} else {
lines.push(`*No alerts in this category*`);
lines.push(``);
}
lines.push(`### ✓ Good (${analysis.categories.good.length})`);
lines.push(`*Acceptable performance with room for improvement*`);
lines.push(``);
if (analysis.categories.good.length > 0) {
analysis.categories.good.forEach(alert => {
lines.push(`- **${alert.name}**`);
lines.push(` - Results: ${alert.resultCount}, Recent: ${alert.recentCount || 0}, Relevant: ${alert.relevantCount || 0}`);
lines.push(` - **Action:** Monitor and optionally tune for better results`);
lines.push(``);
});
} else {
lines.push(`*No alerts in this category*`);
lines.push(``);
}
lines.push(`### ⚠️ Needs Tuning (${analysis.categories.needsTuning.length})`);
lines.push(`*Low recency, relevance, or volume issues*`);
lines.push(``);
if (analysis.categories.needsTuning.length > 0) {
analysis.categories.needsTuning.forEach(alert => {
lines.push(`- **${alert.name}**`);
lines.push(` - Results: ${alert.resultCount}, Recent: ${alert.recentCount || 0}, Relevant: ${alert.relevantCount || 0}`);
lines.push(` - Recency Score: ${alert.avgRecencyScore || 0}/10, Relevance Score: ${alert.avgRelevanceScore || 0}`);
lines.push(` - **Action:** Requires tuning - see recommendations below`);
lines.push(``);
});
} else {
lines.push(`*No alerts in this category*`);
lines.push(``);
}
lines.push(`### ❌ Failing (${analysis.categories.failing.length})`);
lines.push(`*No results or errors*`);
lines.push(``);
if (analysis.categories.failing.length > 0) {
analysis.categories.failing.forEach(alert => {
lines.push(`- **${alert.name}**`);
lines.push(` - Error: ${alert.error || 'No results found'}`);
lines.push(` - **Action:** Critical - needs immediate attention`);
lines.push(``);
});
} else {
lines.push(`*No alerts in this category*`);
lines.push(``);
}
// Recommendations
lines.push(`---`);
lines.push(``);
lines.push(`## Tuning Recommendations`);
lines.push(``);
if (analysis.recommendations.length === 0) {
lines.push(`🎉 **All alerts are performing well! No tuning needed.**`);
lines.push(``);
} else {
analysis.recommendations.forEach((rec, idx) => {
const severityEmoji = {
critical: '🔴',
high: '🟠',
medium: '🟡',
low: '🟢'
}[rec.severity] || '⚪';
lines.push(`### ${severityEmoji} ${rec.category} (${rec.count} alerts)`);
lines.push(``);
lines.push(`**Issue:** ${rec.issue}`);
lines.push(``);
lines.push(`**Affected Alerts:**`);
rec.alerts.forEach(name => lines.push(`- ${name}`));
lines.push(``);
lines.push(`**Suggestions:**`);
rec.suggestions.forEach(suggestion => lines.push(`- ${suggestion}`));
lines.push(``);
});
}
// Priority Actions
lines.push(`---`);
lines.push(``);
lines.push(`## Priority Actions`);
lines.push(``);
const criticalRecs = analysis.recommendations.filter(r => r.severity === 'critical');
const highRecs = analysis.recommendations.filter(r => r.severity === 'high');
if (criticalRecs.length > 0) {
lines.push(`### 1. Critical Issues (Do First)`);
lines.push(``);
criticalRecs.forEach(rec => {
lines.push(`- **${rec.category}:** ${rec.count} alerts`);
lines.push(` - ${rec.suggestions[0]}`);
});
lines.push(``);
}
if (highRecs.length > 0) {
lines.push(`### 2. High Priority`);
lines.push(``);
highRecs.forEach(rec => {
lines.push(`- **${rec.category}:** ${rec.count} alerts`);
lines.push(` - ${rec.suggestions[0]}`);
});
lines.push(``);
}
const mediumRecs = analysis.recommendations.filter(r => r.severity === 'medium');
if (mediumRecs.length > 0) {
lines.push(`### 3. Medium Priority (Tune When Possible)`);
lines.push(``);
mediumRecs.forEach(rec => {
lines.push(`- **${rec.category}:** ${rec.count} alerts`);
});
lines.push(``);
}
// Next Steps
lines.push(`---`);
lines.push(``);
lines.push(`## Next Steps`);
lines.push(``);
lines.push(`1. **Review failing alerts first** - Fix syntax errors or verify topic exists`);
lines.push(`2. **Address recency issues** - Broaden keywords for alerts with no recent results`);
lines.push(`3. **Improve relevance** - Add filters and negative keywords`);
lines.push(`4. **Re-test after changes** - Run validation again to verify improvements`);
lines.push(`5. **Keep excellent alerts as-is** - Don't fix what isn't broken`);
lines.push(``);
return lines.join('\n');
}
/**
* Main function
*/
async function main() {
const args = process.argv.slice(2);
if (args.length === 0) {
console.log(`
Usage:
node scripts/analyze-results.js <report-file.json>
Example:
node scripts/analyze-results.js validation-report-1699999999999.json
`);
process.exit(0);
}
const reportFile = args[0];
try {
console.log(`\n📊 Analyzing report: ${reportFile}\n`);
// Read report
const reportData = await readFile(reportFile, 'utf-8');
const report = JSON.parse(reportData);
// Analyze
const analysis = analyzeReport(report);
// Generate markdown report
const markdown = generateMarkdownReport(analysis, reportFile);
// Save analysis
// Strip only a trailing .json so an input without that extension is never overwritten
const analysisFile = `${reportFile.replace(/\.json$/i, '')}-analysis.md`;
await writeFile(analysisFile, markdown);
// Print summary
console.log(`✅ Analysis complete!\n`);
console.log(`📈 Performance Summary:`);
console.log(` Excellent: ${analysis.categories.excellent.length}`);
console.log(` Good: ${analysis.categories.good.length}`);
console.log(` Needs Tuning: ${analysis.categories.needsTuning.length}`);
console.log(` Failing: ${analysis.categories.failing.length}\n`);
if (analysis.recommendations.length > 0) {
console.log(`🔧 ${analysis.recommendations.length} recommendation(s) generated\n`);
analysis.recommendations.forEach(rec => {
console.log(` ${rec.category}: ${rec.count} alerts (${rec.severity})`);
});
console.log(``);
}
console.log(`💾 Full analysis saved to: ${analysisFile}\n`);
} catch (error) {
console.error(`\n❌ Error: ${error.message}\n`);
process.exit(1);
}
}
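// Run if called directly. Compare as file URLs so paths containing spaces
// or Windows drive letters still match import.meta.url.
import { pathToFileURL } from 'url';
if (process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href) {
main().catch(console.error);
}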
export { analyzeReport, generateMarkdownReport };
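The categorisation rules in `analyzeReport` are easier to check in isolation. The sketch below restates the same thresholds as a stand-alone function and runs them on four made-up results. The function name `categorizeResult` and the sample data are hypothetical; the field names (`success`, `resultCount`, `recentCount`, `relevantCount`) are the ones the script above reads from each result.

```javascript
// Stand-alone restatement of the thresholds used in analyzeReport above.
function categorizeResult(result) {
  if (!result.success) return 'failing';
  const relevantRatio =
    result.resultCount > 0 ? result.relevantCount / result.resultCount : 0;
  // Excellent: at least 3 recent results, 60% relevant, and 5 or more results
  if (result.recentCount >= 3 && relevantRatio >= 0.6 && result.resultCount >= 5) {
    return 'excellent';
  }
  // Good: at least 1 recent result and 40% relevant
  if (result.recentCount >= 1 && relevantRatio >= 0.4) {
    return 'good';
  }
  return 'needsTuning';
}

// Hypothetical sample entries, invented for illustration only.
const sampleResults = [
  { name: 'A', success: true, resultCount: 10, recentCount: 4, relevantCount: 7 },
  { name: 'B', success: true, resultCount: 4, recentCount: 1, relevantCount: 2 },
  { name: 'C', success: true, resultCount: 8, recentCount: 0, relevantCount: 6 },
  { name: 'D', success: false, error: 'timeout' },
];

const categories = sampleResults.map((r) => `${r.name}: ${categorizeResult(r)}`);
console.log(categories.join(', '));
// prints "A: excellent, B: good, C: needsTuning, D: failing"
```

Entry C shows why recency dominates: 75% of its results are relevant, yet with no recent results it still lands in the needs-tuning bucket.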

291
scripts/example-usage.js Normal file

@ -0,0 +1,291 @@
/**
* Example usage of Playwright with human-like behavior
* Demonstrates various scraping scenarios
*/
import { chromium } from 'playwright';
import {
getHumanizedContext,
humanClick,
humanType,
humanScroll,
simulateReading,
randomDelay,
randomMouseMovements
} from './human-behavior.js';
/**
* Example 1: Simple Google search with human behavior
*/
async function exampleGoogleSearch() {
console.log('\n=== Example 1: Google Search ===\n');
const browser = await chromium.launch({ headless: false, slowMo: 50 });
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
// Navigate to Google
console.log('Navigating to Google...');
await page.goto('https://www.google.com');
await randomDelay(1000, 2000);
// Move mouse around naturally
await randomMouseMovements(page, 2);
// Search for something
console.log('Performing search...');
const searchBox = 'textarea[name="q"], input[name="q"]';
await humanClick(page, searchBox);
await humanType(page, searchBox, 'laptop repair toronto', {
minDelay: 70,
maxDelay: 180,
mistakes: 0.03
});
await randomDelay(500, 1200);
await page.keyboard.press('Enter');
// Wait for results
await page.waitForLoadState('networkidle');
await randomDelay(1500, 2500);
// Scroll through results
console.log('Scrolling through results...');
await humanScroll(page, {
scrollCount: 3,
minScroll: 150,
maxScroll: 400,
randomDirection: true
});
// Extract result count
const resultCount = await page.locator('div.g').count();
console.log(`✅ Found ${resultCount} search results\n`);
// Simulate reading
await simulateReading(page, 3000);
} finally {
await page.close();
await context.close();
await browser.close();
}
}
/**
* Example 2: Reddit scraping with natural behavior
*/
async function exampleRedditScraping() {
console.log('\n=== Example 2: Reddit Scraping ===\n');
const browser = await chromium.launch({ headless: false, slowMo: 50 });
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
// Navigate to subreddit
console.log('Navigating to r/toronto...');
await page.goto('https://www.reddit.com/r/toronto');
await randomDelay(2000, 3000);
// Random mouse movements (looking around)
await randomMouseMovements(page, 3);
// Scroll naturally
console.log('Scrolling through posts...');
await humanScroll(page, {
scrollCount: 4,
minScroll: 200,
maxScroll: 500,
minDelay: 1000,
maxDelay: 2500
});
// Extract post titles
const posts = await page.evaluate(() => {
const postElements = document.querySelectorAll('[data-testid="post-container"]');
return Array.from(postElements).slice(0, 10).map(post => {
const titleEl = post.querySelector('h3');
return titleEl ? titleEl.innerText : null;
}).filter(Boolean);
});
console.log(`\n📝 Found ${posts.length} posts:`);
posts.forEach((title, i) => {
console.log(` ${i + 1}. ${title.substring(0, 60)}...`);
});
// Simulate reading
await simulateReading(page, 4000);
} finally {
await page.close();
await context.close();
await browser.close();
}
}
/**
* Example 3: Multi-step navigation with human behavior
*/
async function exampleMultiStepNavigation() {
console.log('\n=== Example 3: Multi-Step Navigation ===\n');
const browser = await chromium.launch({ headless: false, slowMo: 50 });
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
// Step 1: Go to Hacker News
console.log('Step 1: Navigating to Hacker News...');
await page.goto('https://news.ycombinator.com');
await randomDelay(1500, 2500);
await randomMouseMovements(page, 2);
// Step 2: Scroll and read
console.log('Step 2: Scrolling and reading...');
await humanScroll(page, { scrollCount: 2 });
await simulateReading(page, 3000);
// Step 3: Click on first story
console.log('Step 3: Clicking on a story...');
const firstStory = '.titleline > a';
await page.waitForSelector(firstStory);
// Get the story title first
const storyTitle = await page.locator(firstStory).first().innerText();
console.log(` Clicking: "${storyTitle.substring(0, 50)}..."`);
await humanClick(page, firstStory);
await randomDelay(2000, 3000);
// Step 4: Interact with the new page
console.log('Step 4: Exploring the article...');
await humanScroll(page, {
scrollCount: 3,
minScroll: 200,
maxScroll: 600
});
await simulateReading(page, 4000);
console.log('✅ Multi-step navigation completed\n');
} finally {
await page.close();
await context.close();
await browser.close();
}
}
/**
* Example 4: Demonstrating different mouse movement patterns
*/
async function exampleMousePatterns() {
console.log('\n=== Example 4: Mouse Movement Patterns ===\n');
const browser = await chromium.launch({ headless: false, slowMo: 30 });
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
await page.goto('https://www.example.com');
await randomDelay(1000, 1500);
console.log('Demonstrating various mouse patterns...');
// Pattern 1: Random movements
console.log(' 1. Random scanning...');
await randomMouseMovements(page, 5);
// Pattern 2: Slow deliberate movements
console.log(' 2. Deliberate movements...');
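// viewportSize() returns null when the context has no fixed viewport; fall back to Playwright's default size
const viewport = page.viewportSize() ?? { width: 1280, height: 720 };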
for (let i = 0; i < 3; i++) {
const target = {
x: Math.random() * viewport.width,
y: Math.random() * viewport.height
};
// Interpolate the move over several steps so it is visibly slow rather than a jump
await page.mouse.move(target.x, target.y, { steps: 20 });
await randomDelay(800, 1500);
}
// Pattern 3: Hovering over elements
console.log(' 3. Hovering over link...');
const link = await page.locator('a').first();
const box = await link.boundingBox();
if (box) {
await page.mouse.move(
box.x + box.width / 2,
box.y + box.height / 2
);
await randomDelay(1000, 2000);
}
console.log('✅ Mouse patterns demonstration completed\n');
} finally {
await page.close();
await context.close();
await browser.close();
}
}
/**
* Run all examples
*/
async function runAllExamples() {
console.log('\n' + '='.repeat(60));
console.log('PLAYWRIGHT HUMAN BEHAVIOR EXAMPLES');
console.log('='.repeat(60));
const examples = [
{ name: 'Google Search', fn: exampleGoogleSearch },
{ name: 'Reddit Scraping', fn: exampleRedditScraping },
{ name: 'Multi-Step Navigation', fn: exampleMultiStepNavigation },
{ name: 'Mouse Patterns', fn: exampleMousePatterns }
];
console.log('\nAvailable examples:');
examples.forEach((ex, i) => {
console.log(` ${i + 1}. ${ex.name}`);
});
const args = process.argv.slice(2);
if (args.length === 0) {
console.log('\nUsage: node scripts/example-usage.js [example-number]');
console.log('Example: node scripts/example-usage.js 1\n');
console.log('Running all examples...\n');
for (const example of examples) {
await example.fn();
await new Promise(resolve => setTimeout(resolve, 2000));
}
} else {
const exampleNum = parseInt(args[0], 10) - 1;
if (exampleNum >= 0 && exampleNum < examples.length) {
await examples[exampleNum].fn();
} else {
console.log(`\n❌ Invalid example number. Choose 1-${examples.length}\n`);
// Nothing ran, so skip the completion banner and signal failure
process.exitCode = 1;
return;
}
}
console.log('\n' + '='.repeat(60));
console.log('EXAMPLES COMPLETED');
console.log('='.repeat(60) + '\n');
}
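// Run examples only when this file is executed directly.
// pathToFileURL handles Windows paths and percent-encoding, which a
// hand-built `file://` string does not.
import { pathToFileURL } from 'node:url';
if (process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href) {
runAllExamples().catch(console.error);
}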
export {
exampleGoogleSearch,
exampleRedditScraping,
exampleMultiStepNavigation,
exampleMousePatterns
};

@ -0,0 +1,114 @@
#!/usr/bin/env python3
"""Generate broader queries that will actually catch repair leads."""
# Strategy: Use broader terms + location keywords instead of site: filters
# This catches mentions across ALL platforms (Reddit, Facebook, Kijiji, forums)
CANADIAN_CITIES = [
"Toronto", "Mississauga", "Brampton", "Kitchener", "Waterloo", "Cambridge",
"London Ontario", "Hamilton", "Ottawa", "Montreal", "Vancouver", "Calgary",
"Edmonton", "Winnipeg"
]
CORE_SERVICES = {
"Data Recovery": [
"data recovery",
"recover my data",
"dead hard drive",
"drive not recognized",
"lost photos"
],
"MacBook Repair": [
"macbook repair",
"macbook won't turn on",
"macbook liquid damage",
"logic board repair"
],
"Console Repair": [
"ps5 repair",
"xbox repair",
"switch repair",
"hdmi port repair console"
],
"iPhone Repair": [
"iphone repair",
"iphone won't charge",
"iphone logic board",
"iphone water damage"
]
}
def generate_location_based_alert(service_name, keywords, cities):
    """Generate alert using location keywords instead of site filters."""
    # Use at most 5 keywords and 4 cities per alert to keep the query short
    kw_part = " OR ".join([f'"{kw}"' for kw in keywords[:5]])
    loc_part = " OR ".join([f'"{city}"' for city in cities[:4]])
    query = f'({kw_part})\n({loc_part})\n-job -jobs -hiring'
    return {
        "name": service_name,
        "query": query,
        "length": len(query)
    }
def generate_intent_based_alert(service_type):
    """Generate alerts focused on explicit service requests."""
    intent_keywords = [
        "repair shop recommendation",
        "where to repair",
        "anyone repair",
        "repair near me",
        "looking for repair"
    ]
    service_keywords = {
        "General Tech": ["laptop", "macbook", "iphone", "console"],
        "Data": ["data recovery", "hard drive", "photos"],
        "Logic Board": ["logic board", "motherboard", "microsolder"]
    }
    kw = service_keywords.get(service_type, [])
    intent_part = " OR ".join([f'"{i}"' for i in intent_keywords[:4]])
    service_part = " OR ".join([f'"{s}"' for s in kw])
    query = f'({intent_part})\n({service_part})\nsite:reddit.com'
    return {
        "name": f"{service_type} - Intent Based",
        "query": query,
        "length": len(query)
    }
if __name__ == "__main__":
    print("# Broader Google Alert Queries")
    print()
    print("Location-based alerts use location keywords + service terms instead of site: filters,")
    print("so they catch repair requests across ALL platforms.")
    print("High-intent alerts are restricted to Reddit with site:reddit.com.")
    print()
    # Location-based alerts (Ontario focus)
    ontario_cities = ["Toronto", "Mississauga", "Kitchener", "Waterloo"]
    for service_name, keywords in CORE_SERVICES.items():
        alert = generate_location_based_alert(service_name, keywords, ontario_cities)
        print(f"## {alert['name']} - Ontario")
        print(f"**Length:** {alert['length']} chars")
        print()
        print("```")
        print(alert['query'])
        print("```")
        print()
    # Intent-based alerts
    print("## High-Intent Alerts")
    print()
    for service_type in ["General Tech", "Data", "Logic Board"]:
        alert = generate_intent_based_alert(service_type)
        print(f"### {alert['name']}")
        print()
        print("```")
        print(alert['query'])
        print("```")
        print()

473
scripts/human-behavior.js Normal file
@ -0,0 +1,473 @@
/**
* Human-like behavior utilities for Playwright to avoid bot detection
* Includes realistic mouse movements, scrolling, and timing variations
*/
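/**
* Generate a random integer between min and max (inclusive).
* Fractional bounds are rounded inward so the result is always an integer.
*/
function randomInt(min, max) {
const lo = Math.ceil(min);
const hi = Math.floor(max);
return Math.floor(Math.random() * (hi - lo + 1)) + lo;
}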
/**
* Generate a random float between min and max
*/
function randomFloat(min, max) {
return Math.random() * (max - min) + min;
}
/**
* Sleep for a random duration within a range
*/
export async function randomDelay(minMs = 100, maxMs = 500) {
const delay = randomInt(minMs, maxMs);
await new Promise(resolve => setTimeout(resolve, delay));
}
/**
* Generate bezier curve points for smooth mouse movement
* Uses cubic bezier with random control points for natural curves
*/
function generateBezierPath(start, end, steps = 25) {
const points = [];
// Add some randomness to control points
const cp1x = start.x + (end.x - start.x) * randomFloat(0.25, 0.4);
const cp1y = start.y + (end.y - start.y) * randomFloat(-0.2, 0.2);
const cp2x = start.x + (end.x - start.x) * randomFloat(0.6, 0.75);
const cp2y = start.y + (end.y - start.y) * randomFloat(-0.2, 0.2);
for (let i = 0; i <= steps; i++) {
const t = i / steps;
const t2 = t * t;
const t3 = t2 * t;
const mt = 1 - t;
const mt2 = mt * mt;
const mt3 = mt2 * mt;
const x = mt3 * start.x +
3 * mt2 * t * cp1x +
3 * mt * t2 * cp2x +
t3 * end.x;
const y = mt3 * start.y +
3 * mt2 * t * cp1y +
3 * mt * t2 * cp2y +
t3 * end.y;
points.push({ x: Math.round(x), y: Math.round(y) });
}
return points;
}
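/**
 * Illustrative sketch, not part of the original module: evaluates the same
 * cubic Bezier polynomial that generateBezierPath uses, for a single value
 * of t. The name cubicBezierPoint and its argument order are assumptions
 * made for this example. Whatever the control points are, t = 0 yields the
 * start point and t = 1 yields the end point, which is why randomizing the
 * control points changes the curve but never the destination.
 */
function cubicBezierPoint(start, cp1, cp2, end, t) {
  const mt = 1 - t;
  // Bernstein weights for a cubic curve; they sum to 1 for any t.
  const w0 = mt * mt * mt;
  const w1 = 3 * mt * mt * t;
  const w2 = 3 * mt * t * t;
  const w3 = t * t * t;
  return {
    x: w0 * start.x + w1 * cp1.x + w2 * cp2.x + w3 * end.x,
    y: w0 * start.y + w1 * cp1.y + w2 * cp2.y + w3 * end.y
  };
}
// Usage: with symmetric control points the midpoint lies halfway along the path.
// cubicBezierPoint({ x: 0, y: 0 }, { x: 30, y: 10 }, { x: 70, y: -10 }, { x: 100, y: 0 }, 0.5)
// returns { x: 50, y: 0 }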
/**
* Move mouse in a realistic, smooth path with occasional overshooting
* @param {Page} page - Playwright page object
* @param {Object} target - Target coordinates {x, y}
* @param {Object} options - Movement options
*/
export async function humanMouseMove(page, target, options = {}) {
const {
overshootChance = 0.15, // 15% chance to overshoot
overshootDistance = 20, // pixels to overshoot by
steps = 25, // number of steps in the path
stepDelay = 10 // ms between steps
} = options;
// Playwright does not expose the current pointer position, so every movement
// starts from a random point in the central region of the viewport
const viewport = page.viewportSize();
const start = {
x: randomInt(viewport.width * 0.3, viewport.width * 0.7),
y: randomInt(viewport.height * 0.3, viewport.height * 0.7)
};
// Decide if we should overshoot
const shouldOvershoot = Math.random() < overshootChance;
let finalTarget = target;
if (shouldOvershoot) {
// Calculate overshoot position (slightly past the target, along the approach direction)
const angle = Math.atan2(target.y - start.y, target.x - start.x);
// Use one distance for both axes so the overshoot stays on the approach line
const overshootBy = randomInt(5, overshootDistance);
const overshoot = {
x: target.x + Math.cos(angle) * overshootBy,
y: target.y + Math.sin(angle) * overshootBy
};
// Move to overshoot position first
const overshootPath = generateBezierPath(start, overshoot, steps);
for (const point of overshootPath) {
await page.mouse.move(point.x, point.y);
await new Promise(resolve => setTimeout(resolve, stepDelay));
}
// Then correct back to target
const correctionPath = generateBezierPath(overshoot, target, Math.floor(steps * 0.3));
for (const point of correctionPath) {
await page.mouse.move(point.x, point.y);
await new Promise(resolve => setTimeout(resolve, stepDelay));
}
} else {
// Normal smooth movement
const path = generateBezierPath(start, target, steps);
for (const point of path) {
await page.mouse.move(point.x, point.y);
await new Promise(resolve => setTimeout(resolve, stepDelay));
}
}
// Add a tiny random pause after reaching target
await randomDelay(50, 150);
}
/**
* Perform random mouse movements to simulate human reading/scanning
*/
export async function randomMouseMovements(page, count = 3) {
const viewport = page.viewportSize();
for (let i = 0; i < count; i++) {
const target = {
x: randomInt(100, viewport.width - 100),
y: randomInt(100, viewport.height - 100)
};
await humanMouseMove(page, target, {
overshootChance: 0.1,
steps: randomInt(15, 30)
});
await randomDelay(200, 800);
}
}
/**
* Scroll page in a human-like manner with random intervals and amounts
* @param {Page} page - Playwright page object
* @param {Object} options - Scrolling options
*/
export async function humanScroll(page, options = {}) {
const {
direction = 'down', // 'down' or 'up'
scrollCount = 3, // number of scroll actions
minScroll = 100, // minimum pixels per scroll
maxScroll = 400, // maximum pixels per scroll
minDelay = 500, // minimum delay between scrolls
maxDelay = 2000, // maximum delay between scrolls
randomDirection = false // occasionally scroll in opposite direction
} = options;
for (let i = 0; i < scrollCount; i++) {
// Determine scroll direction
let scrollDir = direction;
if (randomDirection && Math.random() < 0.15) {
scrollDir = direction === 'down' ? 'up' : 'down';
}
// Random scroll amount
const scrollAmount = randomInt(minScroll, maxScroll);
const scrollValue = scrollDir === 'down' ? scrollAmount : -scrollAmount;
// Perform scroll in small increments for smoothness
const increments = randomInt(5, 12);
const incrementValue = scrollValue / increments;
for (let j = 0; j < increments; j++) {
await page.evaluate((delta) => {
window.scrollBy(0, delta);
}, incrementValue);
await new Promise(resolve => setTimeout(resolve, randomInt(20, 50)));
}
// Random pause between scrolls (simulating reading)
await randomDelay(minDelay, maxDelay);
}
}
/**
* Scroll to a specific element in a human-like way
*/
export async function scrollToElement(page, selector, options = {}) {
const element = await page.locator(selector).first();
// Get element position
const box = await element.boundingBox();
if (!box) {
console.warn(`Element ${selector} not found or not visible`);
return;
}
// Get current scroll position
const currentScroll = await page.evaluate(() => window.scrollY);
const viewportHeight = page.viewportSize().height;
// Calculate target scroll position (element near middle of viewport)
const targetScroll = box.y + currentScroll - (viewportHeight / 2);
const scrollDistance = targetScroll - currentScroll;
// Scroll in chunks
const chunks = Math.max(3, Math.abs(Math.floor(scrollDistance / 200)));
const chunkSize = scrollDistance / chunks;
for (let i = 0; i < chunks; i++) {
await page.evaluate((delta) => {
window.scrollBy(0, delta);
}, chunkSize);
await randomDelay(50, 150);
}
await randomDelay(300, 700);
}
/**
* Click an element with human-like behavior
*/
export async function humanClick(page, selector, options = {}) {
const {
moveToElement = true,
doubleClickChance = 0.02 // 2% chance of accidental double-click
} = options;
const element = await page.locator(selector).first();
const box = await element.boundingBox();
if (!box) {
throw new Error(`Element ${selector} not found or not visible`);
}
// Calculate click position (slightly random within element bounds)
const target = {
x: box.x + randomInt(box.width * 0.3, box.width * 0.7),
y: box.y + randomInt(box.height * 0.3, box.height * 0.7)
};
if (moveToElement) {
await humanMouseMove(page, target);
}
// Random pre-click pause
await randomDelay(100, 300);
// Click
await page.mouse.click(target.x, target.y);
// Occasional accidental double-click
if (Math.random() < doubleClickChance) {
await randomDelay(50, 150);
await page.mouse.click(target.x, target.y);
}
await randomDelay(200, 500);
}
/**
* Type text with human-like timing variations
*/
export async function humanType(page, selector, text, options = {}) {
const {
minDelay = 50,
maxDelay = 150,
mistakes = 0.02 // 2% chance of typo
} = options;
await page.click(selector);
await randomDelay(200, 400);
const chars = text.split('');
let typedText = '';
for (let i = 0; i < chars.length; i++) {
const char = chars[i];
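// Occasional typo (letters only, so the wrong character is always printable)
if (/[a-z]/i.test(char) && Math.random() < mistakes && i < chars.length - 1) {
// Type a nearby wrong char; the offset is never 0, so it always differs from the intended one
const offset = [-2, -1, 1, 2][randomInt(0, 3)];
const wrongChar = String.fromCharCode(char.charCodeAt(0) + offset);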
await page.keyboard.type(wrongChar);
await randomDelay(minDelay, maxDelay);
// Pause (realize mistake)
await randomDelay(200, 500);
// Backspace
await page.keyboard.press('Backspace');
await randomDelay(100, 200);
}
// Type correct char
await page.keyboard.type(char);
// Variable delay based on character type
let delay;
if (char === ' ') {
delay = randomInt(maxDelay * 1.5, maxDelay * 2);
} else if (char.match(/[.!?,]/)) {
delay = randomInt(maxDelay * 1.2, maxDelay * 2);
} else {
delay = randomInt(minDelay, maxDelay);
}
await new Promise(resolve => setTimeout(resolve, delay));
}
await randomDelay(300, 600);
}
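/**
 * Illustrative sketch, not part of the original module: restates the
 * per-character delay rule from humanType as a pure function so the ranges
 * are easy to inspect. The name keystrokeDelayRange is an assumption made
 * for this example; the defaults mirror humanType's minDelay and maxDelay.
 * Returns the [low, high] bounds in milliseconds for the pause after `char`.
 */
function keystrokeDelayRange(char, minDelay = 50, maxDelay = 150) {
  // Longest pause after a space (word boundary)
  if (char === ' ') return [maxDelay * 1.5, maxDelay * 2];
  // Slightly shorter lower bound after punctuation
  if (/[.!?,]/.test(char)) return [maxDelay * 1.2, maxDelay * 2];
  // Ordinary characters use the base range
  return [minDelay, maxDelay];
}
// Usage: keystrokeDelayRange(' ') returns [225, 300] with the default settings,
// while keystrokeDelayRange('a') returns [50, 150].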
/**
* Wait for page load with random human-like observation time
*/
export async function humanWaitForLoad(page, options = {}) {
const {
minWait = 1000,
maxWait = 3000
} = options;
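// Wait for the network to go idle. Pages that poll continuously never do,
// so a timeout here is treated as "loaded enough" rather than as a failure
try {
await page.waitForLoadState('networkidle', { timeout: 30000 });
} catch (error) {
if (error.name !== 'TimeoutError') throw error;
}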
// Additional random observation time (simulating reading/scanning)
await randomDelay(minWait, maxWait);
}
/**
* Simulate reading behavior - random scrolls and mouse movements
*/
export async function simulateReading(page, duration = 5000) {
const endTime = Date.now() + duration;
while (Date.now() < endTime) {
const action = Math.random();
if (action < 0.4) {
// Scroll a bit
await humanScroll(page, {
scrollCount: 1,
minScroll: 50,
maxScroll: 200,
minDelay: 800,
maxDelay: 1500
});
} else if (action < 0.7) {
// Move mouse randomly
await randomMouseMovements(page, 1);
} else {
// Just wait (reading)
await randomDelay(1000, 2000);
}
}
}
/**
* Configure browser context with realistic human-like settings
*/
export async function getHumanizedContext(browser, options = {}) {
const {
locale = 'en-CA',
timezone = 'America/Toronto',
viewport = null
} = options;
// Random but realistic viewport sizes
const viewports = [
{ width: 1920, height: 1080 },
{ width: 1366, height: 768 },
{ width: 1536, height: 864 },
{ width: 1440, height: 900 },
{ width: 2560, height: 1440 }
];
const selectedViewport = viewport || viewports[randomInt(0, viewports.length - 1)];
// Realistic desktop Chrome user agents. The client-hint headers below are
// derived from the selected entry so that they never contradict it.
const userAgents = [
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36',
'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36'
];
const selectedUserAgent = userAgents[randomInt(0, userAgents.length - 1)];
const chromeMajor = selectedUserAgent.match(/Chrome\/(\d+)/)[1];
const uaPlatform = selectedUserAgent.includes('Macintosh') ? 'macOS'
: selectedUserAgent.includes('Windows') ? 'Windows'
: 'Linux';
const context = await browser.newContext({
viewport: selectedViewport,
userAgent: selectedUserAgent,
locale,
timezoneId: timezone,
permissions: [],
geolocation: { latitude: 43.6532, longitude: -79.3832 }, // Toronto
colorScheme: 'light', // Always light for consistency
deviceScaleFactor: 1, // Standard scaling
hasTouch: false,
isMobile: false,
javaScriptEnabled: true,
// extraHTTPHeaders are sent with every request (documents, images, XHR),
// so navigation-only headers such as Accept and Sec-Fetch-* are left to the browser
extraHTTPHeaders: {
'Accept-Language': 'en-CA,en-US;q=0.9,en;q=0.8',
'Sec-Ch-Ua': `"Google Chrome";v="${chromeMajor}", "Chromium";v="${chromeMajor}", "Not_A Brand";v="24"`,
'Sec-Ch-Ua-Mobile': '?0',
'Sec-Ch-Ua-Platform': `"${uaPlatform}"`
}
});
// Inject additional fingerprint randomization and anti-detection
await context.addInitScript(() => {
// Remove webdriver property
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined
});
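// Override permissions (bind the original so it keeps its receiver and does not throw "Illegal invocation")
const originalQuery = window.navigator.permissions.query.bind(window.navigator.permissions);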
window.navigator.permissions.query = (parameters) => (
parameters.name === 'notifications' ?
Promise.resolve({ state: Notification.permission }) :
originalQuery(parameters)
);
// Add chrome property
window.chrome = {
runtime: {}
};
// Override plugins
Object.defineProperty(navigator, 'plugins', {
get: () => [
{
0: { type: 'application/x-google-chrome-pdf', suffixes: 'pdf', description: 'Portable Document Format' },
description: 'Portable Document Format',
filename: 'internal-pdf-viewer',
length: 1,
name: 'Chrome PDF Plugin'
},
{
0: { type: 'application/pdf', suffixes: 'pdf', description: '' },
description: '',
filename: 'mhjfbmdgcfjbbpaeojofohoefgiehjai',
length: 1,
name: 'Chrome PDF Viewer'
}
]
});
});
return context;
}
export default {
randomDelay,
humanMouseMove,
randomMouseMovements,
humanScroll,
scrollToElement,
humanClick,
humanType,
humanWaitForLoad,
simulateReading,
getHumanizedContext
};

@ -0,0 +1,453 @@
/**
* Playwright scraper with human-like behavior for Google Alerts validation
* Usage: node scripts/playwright-scraper.js [query]
*/
import { chromium } from 'playwright';
import {
randomDelay,
humanMouseMove,
randomMouseMovements,
humanScroll,
humanClick,
humanType,
humanWaitForLoad,
simulateReading,
getHumanizedContext
} from './human-behavior.js';
/**
* Search Google with a query and validate results
*/
async function searchGoogle(page, query) {
console.log(`\n🔍 Searching Google for: "${query}"\n`);
// Navigate to Google
await page.goto('https://www.google.com', { waitUntil: 'networkidle' });
await randomDelay(1000, 2000);
// Random mouse movements (looking around the page)
await randomMouseMovements(page, 2);
// Find and focus search box
const searchBox = 'textarea[name="q"], input[name="q"]';
await page.waitForSelector(searchBox);
await randomDelay(500, 1000);
// Click search box with human behavior
await humanClick(page, searchBox);
// Type query with realistic timing
await humanType(page, searchBox, query, {
minDelay: 60,
maxDelay: 180,
mistakes: 0.03
});
// Random pause before submitting (reading what we typed)
await randomDelay(500, 1200);
// Submit search (press Enter)
await page.keyboard.press('Enter');
// Wait for results to load
await humanWaitForLoad(page, { minWait: 1500, maxWait: 3000 });
return page;
}
/**
* Extract search results from Google with recency and relevance detection
*/
async function extractResults(page) {
// Scroll to see more results
await humanScroll(page, {
scrollCount: 2,
minScroll: 200,
maxScroll: 500,
minDelay: 800,
maxDelay: 1500,
randomDirection: true
});
// Random mouse movements (scanning results)
await randomMouseMovements(page, 3);
// Extract results with recency and relevance data
const results = await page.evaluate(() => {
const items = [];
// Try multiple selectors for Google search results
const resultElements = document.querySelectorAll('div.g, div[data-sokoban-container], div[data-hveid], div.Gx5Zad');
const seenUrls = new Set(); // Avoid duplicates
resultElements.forEach((element, index) => {
if (items.length >= 20) return; // Limit to first 20 results
const titleElement = element.querySelector('h3');
const linkElement = element.querySelector('a[href]');
const snippetElement = element.querySelector('div[data-sncf]') ||
element.querySelector('div[style*="-webkit-line-clamp"]') ||
element.querySelector('.VwiC3b') ||
element.querySelector('.lyLwlc') ||
element.querySelector('.s') ||
element.querySelector('span:not([class])');
// Try to find date/recency information
const dateElement = element.querySelector('span.MUxGbd') ||
element.querySelector('.f') ||
element.querySelector('.LEwnzc') ||
element.querySelector('span[style*="color"]');
const dateText = dateElement ? dateElement.innerText : '';
if (titleElement && linkElement && linkElement.href) {
const url = linkElement.href;
// Skip non-http links and duplicates
if (!url.startsWith('http') || seenUrls.has(url)) return;
seenUrls.add(url);
try {
const domain = new URL(url).hostname;
items.push({
title: titleElement.innerText,
url: url,
domain: domain,
snippet: snippetElement ? snippetElement.innerText : '',
dateText: dateText
});
} catch (e) {
// Skip invalid URLs
}
}
});
return items;
});
// Classify recency from the relative ("3 days ago") or absolute date text
const daysPerUnit = { min: 0, minute: 0, hour: 0, day: 1, week: 7, month: 30, year: 365 };
results.forEach(result => {
const dateText = result.dateText.toLowerCase();
// Require the "<number> <unit> ago" shape so words like "Monday" or "today" do not match
const relative = dateText.match(/(\d+)\s*(minute|min|hour|day|week|month|year)s?\s+ago/);
if (relative) {
const days = parseInt(relative[1], 10) * daysPerUnit[relative[2]];
if (days <= 1) {
result.recency = 'today';
result.recencyScore = 10;
} else if (days <= 7) {
result.recency = 'this_week';
result.recencyScore = 8;
} else if (days <= 30) {
result.recency = 'this_month';
result.recencyScore = 6;
} else {
result.recency = 'older';
result.recencyScore = 3;
}
} else if (/\b(?:19|20)\d{2}\b/.test(dateText)) {
// Absolute date that includes a year
result.recency = 'dated';
result.recencyScore = 5;
} else {
result.recency = 'unknown';
result.recencyScore = 0;
}
});
// Get result count
const resultStats = await page.evaluate(() => {
const statsElement = document.querySelector('#result-stats');
return statsElement ? statsElement.innerText : 'Unknown';
});
// Calculate recency distribution (every category assigned above is counted)
const recencyDist = {
today: results.filter(r => r.recency === 'today').length,
this_week: results.filter(r => r.recency === 'this_week').length,
this_month: results.filter(r => r.recency === 'this_month').length,
older: results.filter(r => r.recency === 'older').length,
dated: results.filter(r => r.recency === 'dated').length,
unknown: results.filter(r => r.recency === 'unknown').length
};
return { results, stats: resultStats, recencyDist };
}
/**
* Calculate relevance score for results based on query
*/
function calculateRelevance(results, query) {
const queryTerms = query.toLowerCase()
.replace(/['"()]/g, '')
.split(/\s+/)
.filter(t => t.length > 3 && !['site:', 'http', 'https'].some(p => t.includes(p)));
results.forEach(result => {
let relevanceScore = 0;
const titleLower = result.title.toLowerCase();
const snippetLower = result.snippet.toLowerCase();
// Check keyword presence in title (weighted higher)
queryTerms.forEach(term => {
if (titleLower.includes(term)) relevanceScore += 3;
if (snippetLower.includes(term)) relevanceScore += 1;
});
// Check for expected domains (reddit, kijiji, craigslist, etc.)
const targetDomains = ['reddit.com', 'kijiji.ca', 'craigslist', 'facebook.com', 'used.ca'];
if (targetDomains.some(d => result.domain.includes(d))) {
relevanceScore += 2;
}
// Check for repair-related terms
const repairTerms = ['repair', 'fix', 'broken', 'replace', 'service', 'refurbish'];
repairTerms.forEach(term => {
if (titleLower.includes(term) || snippetLower.includes(term)) {
relevanceScore += 1;
}
});
result.relevanceScore = relevanceScore;
result.relevant = relevanceScore >= 3;
});
return results;
}
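/**
 * Illustrative sketch, not part of the original script: restates the
 * tokenization step at the top of calculateRelevance as a standalone
 * function, to show which parts of a Google Alert query end up being
 * scored. The name extractQueryTerms is an assumption made for this
 * example. Quotes and parentheses are stripped, short tokens such as "or"
 * are dropped by the length filter, and site:/URL tokens are excluded.
 */
function extractQueryTerms(query) {
  return query.toLowerCase()
    .replace(/['"()]/g, '')
    .split(/\s+/)
    .filter(t => t.length > 3 && !['site:', 'http', 'https'].some(p => t.includes(p)));
}
// Usage:
// extractQueryTerms('("macbook repair" OR "logic board") Toronto site:reddit.com')
// returns ['macbook', 'repair', 'logic', 'board', 'toronto']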
/**
* Validate a single Google Alert query with recency and relevance analysis
*/
async function validateQuery(browser, query) {
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
// Perform search
await searchGoogle(page, query);
// Extract and analyze results
const { results, stats, recencyDist } = await extractResults(page);
// Calculate relevance
calculateRelevance(results, query);
// Calculate metrics
const recentResults = results.filter(r => ['today', 'this_week'].includes(r.recency)).length;
const relevantResults = results.filter(r => r.relevant).length;
const avgRecencyScore = results.length > 0
? (results.reduce((sum, r) => sum + r.recencyScore, 0) / results.length).toFixed(1)
: 0;
const avgRelevanceScore = results.length > 0
? (results.reduce((sum, r) => sum + r.relevanceScore, 0) / results.length).toFixed(1)
: 0;
console.log(`\n📊 Results Summary:`);
console.log(` Stats: ${stats}`);
console.log(` Found: ${results.length} results`);
console.log(` Recent (today/this week): ${recentResults}`);
console.log(` Relevant: ${relevantResults}`);
console.log(` Avg Recency Score: ${avgRecencyScore}/10`);
console.log(` Avg Relevance Score: ${avgRelevanceScore}\n`);
console.log(`📅 Recency Distribution:`);
console.log(` Today: ${recencyDist.today}`);
console.log(` This Week: ${recencyDist.this_week}`);
console.log(` This Month: ${recencyDist.this_month}`);
console.log(` Older: ${recencyDist.older}`);
console.log(` Dated (year only): ${recencyDist.dated ?? 0}`);
console.log(` Unknown: ${recencyDist.unknown}\n`);
if (results.length > 0) {
console.log(`✅ Top Results:\n`);
results.slice(0, 5).forEach((result, index) => {
const recencyTag = result.recency !== 'unknown' ? `[${result.recency}]` : '';
const relevanceTag = result.relevant ? '✓' : '○';
console.log(`${index + 1}. ${relevanceTag} ${result.title} ${recencyTag}`);
console.log(` ${result.domain}`);
console.log(` ${result.snippet.substring(0, 100)}...\n`);
});
} else {
console.log(`❌ No results found for this query\n`);
}
// Simulate reading before closing
await simulateReading(page, 3000);
return {
query,
success: results.length > 0,
resultCount: results.length,
recentCount: recentResults,
relevantCount: relevantResults,
avgRecencyScore: parseFloat(avgRecencyScore),
avgRelevanceScore: parseFloat(avgRelevanceScore),
recencyDist,
stats,
results: results.slice(0, 10) // Return first 10
};
} catch (error) {
console.error(`❌ Error validating query: ${error.message}`);
return {
query,
success: false,
error: error.message
};
} finally {
await page.close();
await context.close();
}
}
/**
* Scrape a specific website with human-like behavior
*/
async function scrapeWebsite(browser, url, selectors = {}) {
console.log(`\n🌐 Scraping: ${url}\n`);
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
// Navigate to page
await page.goto(url, { waitUntil: 'networkidle', timeout: 30000 });
await humanWaitForLoad(page, { minWait: 2000, maxWait: 4000 });
// Initial random mouse movements
await randomMouseMovements(page, 2);
// Scroll through page naturally
await humanScroll(page, {
scrollCount: 3,
minScroll: 150,
maxScroll: 400,
minDelay: 1000,
maxDelay: 2500,
randomDirection: true
});
// More random movements
await randomMouseMovements(page, 2);
// Extract content based on selectors
const content = await page.evaluate((sels) => {
const data = {};
// Try to extract title
const titleSelectors = sels.title || ['h1', 'h2', '.title', '#title'];
for (const sel of titleSelectors) {
const el = document.querySelector(sel);
if (el) {
data.title = el.innerText;
break;
}
}
// Try to extract main content
const contentSelectors = sels.content || ['article', 'main', '.content', '#content'];
for (const sel of contentSelectors) {
const el = document.querySelector(sel);
if (el) {
data.content = el.innerText.substring(0, 1000);
break;
}
}
// Extract links
const links = Array.from(document.querySelectorAll('a')).map(a => ({
text: a.innerText.substring(0, 100),
href: a.href
})).slice(0, 20);
data.links = links;
return data;
}, selectors);
console.log(`\n📄 Scraped Content:`);
console.log(` Title: ${content.title || 'N/A'}`);
console.log(` Content Length: ${content.content?.length || 0} chars`);
console.log(` Links Found: ${content.links?.length || 0}\n`);
// Simulate reading/interaction
await simulateReading(page, 4000);
return {
url,
success: true,
content
};
} catch (error) {
console.error(`❌ Error scraping: ${error.message}`);
return {
url,
success: false,
error: error.message
};
} finally {
await page.close();
await context.close();
}
}
/**
* Main function
*/
async function main() {
const args = process.argv.slice(2);
if (args.length === 0) {
console.log(`
Usage:
node scripts/playwright-scraper.js "your search query"
node scripts/playwright-scraper.js --url "https://example.com"
Examples:
node scripts/playwright-scraper.js '"macbook repair" Toronto'
node scripts/playwright-scraper.js --url "https://www.reddit.com/r/toronto"
`);
process.exit(0);
}
// Launch browser. Only the automation-flag and shared-memory switches are kept:
// disabling the sandbox, web security, or site isolation weakens browser
// protections while visiting untrusted pages and is not needed for scraping
console.log('🚀 Launching browser...\n');
const browser = await chromium.launch({
headless: false, // Set to true for production
slowMo: 50, // Slight delay between actions (more human-like)
args: [
'--disable-blink-features=AutomationControlled',
'--disable-dev-shm-usage'
]
});
try {
if (args[0] === '--url' && args[1]) {
// Scrape a specific URL
const result = await scrapeWebsite(browser, args[1]);
console.log('\n' + JSON.stringify(result, null, 2));
} else {
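// Validate a search query. The shell has already removed the outer quotes;
// stripping more here would unbalance queries like "macbook repair" Toronto
const query = args.join(' ').trim();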
const result = await validateQuery(browser, query);
console.log('\n' + JSON.stringify(result, null, 2));
}
} finally {
await browser.close();
console.log('\n✅ Browser closed\n');
}
}
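// Run only when this file is executed directly.
// pathToFileURL handles Windows paths and percent-encoding, which a
// hand-built `file://` string does not.
import { pathToFileURL } from 'node:url';
if (process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href) {
main().catch(console.error);
}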
export { validateQuery, scrapeWebsite, searchGoogle, extractResults };

123
scripts/scraper-config.js Normal file
@ -0,0 +1,123 @@
/**
* Configuration for Playwright scraper and human behavior
* Adjust these values to fine-tune bot detection avoidance
*/
export const config = {
// Browser settings
browser: {
headless: false, // Set to true for production
slowMo: 50, // Milliseconds to slow down actions
timeout: 30000, // Default timeout for operations
},
// Human behavior parameters
humanBehavior: {
// Mouse movement
mouse: {
overshootChance: 0.15, // Probability of overshooting target (0-1)
overshootDistance: 20, // Max pixels to overshoot
pathSteps: 25, // Number of steps in bezier curve
stepDelay: 10, // Milliseconds between movement steps
},
// Scrolling behavior
scroll: {
minAmount: 100, // Minimum pixels per scroll
maxAmount: 400, // Maximum pixels per scroll
minDelay: 500, // Minimum delay between scrolls (ms)
maxDelay: 2000, // Maximum delay between scrolls (ms)
randomDirectionChance: 0.15, // Chance to scroll opposite direction
smoothIncrements: [5, 12], // Range of increments for smooth scrolling
},
// Typing behavior
typing: {
minDelay: 50, // Minimum delay between keystrokes (ms)
maxDelay: 150, // Maximum delay between keystrokes (ms)
mistakeChance: 0.02, // Probability of typo (0-1)
pauseOnSpace: 1.5, // Multiplier for pause after space
pauseOnPunctuation: 2.0, // Multiplier for pause after punctuation
},
// Clicking behavior
clicking: {
preClickDelay: [100, 300], // Range for pause before click
postClickDelay: [200, 500], // Range for pause after click
doubleClickChance: 0.02, // Probability of accidental double-click
clickOffset: [0.3, 0.7], // Click position within element (fraction)
},
// General timing
timing: {
pageLoadWait: [1000, 3000], // Wait after page load
readingSimulation: 5000, // Duration to simulate reading
delayBetweenActions: [100, 500], // General action delays
},
},
// Viewport configurations (randomly selected)
viewports: [
{ width: 1920, height: 1080 },
{ width: 1366, height: 768 },
{ width: 1536, height: 864 },
{ width: 1440, height: 900 },
{ width: 2560, height: 1440 },
],
// User agent strings (randomly selected). Chrome-only, because the scripts
// launch Chromium; kept identical to the list in scripts/human-behavior.js
userAgents: [
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36',
'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
],
// Geolocation (Toronto by default)
geolocation: {
latitude: 43.6532,
longitude: -79.3832,
},
// Locale settings
locale: {
language: 'en-CA',
timezone: 'America/Toronto',
},
// Validation settings
validation: {
maxAlertsToTest: 5, // Maximum alerts to test in batch
delayBetweenTests: 12000, // Delay between alert tests (ms) - increased for politeness
randomizeOrder: true, // Randomize test order
saveReports: true, // Save validation reports to file
saveNotes: true, // Save detailed notes in markdown
},
// Rate limiting and safety
rateLimiting: {
requestsPerMinute: 10, // Max requests per minute
cooldownAfter: 5, // Cooldown after N requests
cooldownDuration: 60000, // Cooldown duration (ms)
},
// Scraping targets
targets: {
google: {
searchUrl: 'https://www.google.com',
resultSelector: 'div.g, div[data-sokoban-container]',
titleSelector: 'h3',
linkSelector: 'a',
snippetSelectors: ['div[data-content-feature]', '.VwiC3b', '.s'],
},
reddit: {
postSelector: '.Post',
titleSelector: 'h3',
contentSelector: 'div[data-click-id="text"]',
},
},
};
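// Illustrative sketch, not part of the original config: one way the
// `rateLimiting` values above might be turned into a wait time.
// `delayBeforeNextRequest` is a hypothetical helper, and reading `cooldownAfter`
// as "pause after every N requests" is an assumption about the intended meaning.
function delayBeforeNextRequest(requestsSoFar, { requestsPerMinute, cooldownAfter, cooldownDuration }) {
  // Minimum spacing that keeps the request rate under requestsPerMinute
  const spacing = Math.ceil(60000 / requestsPerMinute);
  // Take the longer cooldown after every `cooldownAfter` completed requests
  const needsCooldown = requestsSoFar > 0 && requestsSoFar % cooldownAfter === 0;
  return needsCooldown ? Math.max(spacing, cooldownDuration) : spacing;
}
// Example usage with the values above:
// delayBeforeNextRequest(5, { requestsPerMinute: 10, cooldownAfter: 5, cooldownDuration: 60000 })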
export default config;
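// Illustrative sketch, not part of the original config: one way a caller might
// turn a [min, max] pair such as `preClickDelay: [100, 300]` into a concrete value.
// `pickFromRange` is a hypothetical helper name; the actual consumers of this
// config may sample these ranges differently.
function pickFromRange([min, max], random = Math.random) {
  // Uniform sample in [min, max); pass a custom `random` to make it deterministic
  return min + random() * (max - min);
}
// Example usage: a randomized pause of 100-300 ms before a click
// const delayMs = pickFromRange([100, 300]);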


@ -0,0 +1,439 @@
/**
* Automated Google Alert Setup Script
*
* This script:
* 1. Logs into Google (with manual intervention for first-time auth)
* 2. Reads alerts from markdown files
* 3. Creates each alert one at a time
* 4. Collects RSS feed URLs
* 5. Saves RSS feeds to a JSON file
*
* Usage:
* node scripts/setup-alerts-automated.js docs/google-alerts-reddit-tuned.md
*
* For first-time use, you'll need to manually log in once.
* The authentication state will be saved for future runs.
*/
import { chromium } from 'playwright';
import { readFile, writeFile, mkdir } from 'fs/promises';
import { existsSync } from 'fs';
import { join } from 'path';
const AUTH_STATE_PATH = join(process.cwd(), '.auth', 'google-auth.json');
const RSS_FEEDS_PATH = join(process.cwd(), 'rss-feeds.json');
/**
* Parse alerts from markdown file
*/
async function parseAlertsFromMarkdown(filePath) {
const content = await readFile(filePath, 'utf-8');
const lines = content.split('\n');
const alerts = [];
let currentAlert = null;
let inCodeBlock = false;
let queryLines = [];
let currentHeading = '';
for (const line of lines) {
// Track headings
if (line.startsWith('### ')) {
currentHeading = line.replace(/^### /, '').trim();
}
// Detect alert name
if (line.includes('**Alert Name:**')) {
if (currentAlert && queryLines.length > 0) {
currentAlert.query = queryLines.join('\n').trim();
if (currentAlert.query) {
alerts.push(currentAlert);
}
}
const match = line.match(/\*\*Alert Name:\*\*\s*`([^`]+)`/);
const name = match ? match[1] : line.split('**Alert Name:**')[1].trim();
currentAlert = {
name,
query: '',
heading: currentHeading
};
queryLines = [];
continue;
}
// Detect code blocks containing queries
if (line.trim() === '```') {
if (!inCodeBlock && currentAlert) {
inCodeBlock = true;
queryLines = [];
} else if (inCodeBlock) {
inCodeBlock = false;
}
continue;
}
// Collect query lines
if (inCodeBlock && currentAlert) {
queryLines.push(line);
}
}
// Add last alert
if (currentAlert && queryLines.length > 0) {
currentAlert.query = queryLines.join('\n').trim();
if (currentAlert.query) {
alerts.push(currentAlert);
}
}
return alerts.filter(alert => alert.query);
}
/**
* Load saved authentication state
*/
async function loadAuthState() {
if (existsSync(AUTH_STATE_PATH)) {
const authData = await readFile(AUTH_STATE_PATH, 'utf-8');
return JSON.parse(authData);
}
return null;
}
/**
* Save authentication state
*/
async function saveAuthState(context) {
const authDir = join(process.cwd(), '.auth');
if (!existsSync(authDir)) {
await mkdir(authDir, { recursive: true });
}
const authState = await context.storageState();
await writeFile(AUTH_STATE_PATH, JSON.stringify(authState, null, 2));
console.log('✅ Authentication state saved');
}
/**
* Setup browser with authentication
*/
async function setupBrowser() {
const browser = await chromium.launch({
headless: false, // Show browser for login
slowMo: 500 // Slow down actions for visibility
});
// Load any saved auth state first so it can be handed to the new context.
// Playwright restores cookies and localStorage from `storageState`. A callback
// passed to addInitScript runs in the page and cannot see Node variables.
const savedAuth = await loadAuthState();
if (savedAuth) {
console.log('📦 Loading saved authentication state...');
}
const context = await browser.newContext({
viewport: { width: 1280, height: 720 },
locale: 'en-CA',
timezoneId: 'America/Toronto',
...(savedAuth ? { storageState: savedAuth } : {}),
});
return { browser, context };
}
/**
* Ensure user is logged into Google
*/
async function ensureLoggedIn(page) {
await page.goto('https://www.google.com/alerts');
await page.waitForLoadState('networkidle');
// Check if we need to log in
const signInButton = page.getByText('Sign in', { exact: false }).first();
const isVisible = await signInButton.isVisible().catch(() => false);
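if (isVisible) {
console.log('🔐 Please log in to Google in the browser window...');
console.log(' Waiting for you to complete login...');
// The page is already on /alerts, so waiting for that URL alone would resolve
// immediately. Poll until we are back on /alerts and "Sign in" is gone.
const deadline = Date.now() + 300000; // 5 min timeout
let loggedIn = false;
while (!loggedIn && Date.now() < deadline) {
await page.waitForTimeout(2000);
const onAlerts = new URL(page.url()).pathname.startsWith('/alerts');
const signInShown = await signInButton.isVisible().catch(() => true);
loggedIn = onAlerts && !signInShown;
}
if (!loggedIn) {
throw new Error('Timed out waiting for Google login');
}
console.log('✅ Login detected');
} else {
console.log('✅ Already logged in');
}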
}
/**
* Create a single Google Alert
*/
async function createAlert(page, alert) {
console.log(`\n📝 Creating alert: ${alert.name}`);
console.log(` Query: ${alert.query.substring(0, 60)}...`);
try {
// Navigate to alerts page
await page.goto('https://www.google.com/alerts');
await page.waitForLoadState('networkidle');
// Wait for the search input
const searchInput = page.locator('input[type="text"]').first();
await searchInput.waitFor({ state: 'visible', timeout: 10000 });
// Clear and fill the query
await searchInput.clear();
await searchInput.fill(alert.query);
await page.waitForTimeout(500);
// Click "Show options" to expand settings
const showOptions = page.getByText('Show options', { exact: false }).first();
if (await showOptions.isVisible()) {
await showOptions.click();
await page.waitForTimeout(500);
}
// Configure settings - these selectors may need adjustment
// How often: As-it-happens
const frequencySelect = page.locator('select').first();
if (await frequencySelect.isVisible()) {
await frequencySelect.selectOption('0'); // As-it-happens
}
// Sources: Automatic
const sourcesSelect = page.locator('select').nth(1);
if (await sourcesSelect.isVisible()) {
await sourcesSelect.selectOption('automatic');
}
// Language: English
const languageSelect = page.locator('select').nth(2);
if (await languageSelect.isVisible()) {
await languageSelect.selectOption('en');
}
// Region: Canada
const regionSelect = page.locator('select').nth(3);
if (await regionSelect.isVisible()) {
await regionSelect.selectOption('ca');
}
// How many: All results
const howManySelect = page.locator('select').nth(4);
if (await howManySelect.isVisible()) {
await howManySelect.selectOption('all');
}
// Deliver to: RSS feed
const rssOption = page.getByText('RSS feed', { exact: false }).first();
if (await rssOption.isVisible()) {
await rssOption.click();
}
await page.waitForTimeout(500);
// Click "Create Alert"
const createButton = page.getByRole('button', { name: /Create Alert/i }).first();
await createButton.click();
// Wait for alert to be created
await page.waitForLoadState('networkidle');
await page.waitForTimeout(2000);
// Try to find and click RSS icon/link
// The RSS feed URL might be in different places depending on Google's UI
let rssUrl = null;
// Method 1: Look for RSS icon/link in the alerts list
// Match on "/alerts/feeds/" rather than "feed", which would also match "feedback" links.
// Assumption: feed URLs follow the /alerts/feeds/<user>/<alert> pattern.
// Note: .first() may return the feed of an earlier alert if several are listed.
const rssLink = page.locator('a[href*="/alerts/feeds/"]').first();
const rssLinkVisible = await rssLink.isVisible().catch(() => false);
if (rssLinkVisible) {
const href = await rssLink.getAttribute('href');
if (href) {
rssUrl = href.startsWith('http') ? href : `https://www.google.com${href}`;
}
}
// Method 2: Check if we're on a feed page
if (!rssUrl && page.url().includes('/alerts/feeds/')) {
rssUrl = page.url();
}
// Method 3: Look for feed URL in page content
if (!rssUrl) {
const content = await page.content();
const match = content.match(/https?:\/\/[^"'\s]*\/alerts\/feeds\/[^"'\s]*/i);
if (match) {
rssUrl = match[0];
}
}
if (rssUrl) {
console.log(` ✅ RSS Feed: ${rssUrl}`);
return { success: true, rssUrl, alertName: alert.name };
} else {
console.log(` ⚠️ Alert created but RSS URL not found automatically`);
console.log(` 💡 You may need to manually get the RSS URL from the alerts page`);
return { success: true, rssUrl: null, alertName: alert.name, needsManualCheck: true };
}
} catch (error) {
console.error(` ❌ Error creating alert: ${error.message}`);
return { success: false, error: error.message, alertName: alert.name };
}
}
/**
* Load existing RSS feeds
*/
async function loadRssFeeds() {
if (existsSync(RSS_FEEDS_PATH)) {
const content = await readFile(RSS_FEEDS_PATH, 'utf-8');
return JSON.parse(content);
}
return { alerts: [] };
}
/**
* Save RSS feeds
*/
async function saveRssFeeds(feeds) {
await writeFile(RSS_FEEDS_PATH, JSON.stringify(feeds, null, 2));
console.log(`\n💾 RSS feeds saved to ${RSS_FEEDS_PATH}`);
}
/**
* Main execution
*/
async function main() {
const markdownFile = process.argv[2];
if (!markdownFile) {
console.error('Usage: node scripts/setup-alerts-automated.js <markdown-file>');
console.error('Example: node scripts/setup-alerts-automated.js docs/google-alerts-reddit-tuned.md');
process.exit(1);
}
if (!existsSync(markdownFile)) {
console.error(`Error: File not found: ${markdownFile}`);
process.exit(1);
}
console.log('📖 Parsing alerts from markdown file...');
const alerts = await parseAlertsFromMarkdown(markdownFile);
console.log(`✅ Found ${alerts.length} alerts to create\n`);
if (alerts.length === 0) {
console.error('No alerts found in file');
process.exit(1);
}
// Load existing RSS feeds
const rssFeeds = await loadRssFeeds();
const existingAlerts = new Set(rssFeeds.alerts.map(a => a.name));
// Filter out already created alerts
const newAlerts = alerts.filter(alert => !existingAlerts.has(alert.name));
if (newAlerts.length === 0) {
console.log('✅ All alerts already created!');
return;
}
console.log(`📋 Will create ${newAlerts.length} new alerts\n`);
// Setup browser
const { browser, context } = await setupBrowser();
const page = await context.newPage();
try {
// Ensure logged in
await ensureLoggedIn(page);
// Save auth state after login
await saveAuthState(context);
// Create alerts one at a time
const results = [];
for (let i = 0; i < newAlerts.length; i++) {
const alert = newAlerts[i];
console.log(`\n[${i + 1}/${newAlerts.length}]`);
const result = await createAlert(page, alert);
results.push(result);
// Add delay between alerts to avoid rate limiting
if (i < newAlerts.length - 1) {
console.log(' ⏳ Waiting 3 seconds before next alert...');
await page.waitForTimeout(3000);
}
}
// Collect RSS feeds
const successful = results.filter(r => r.success && r.rssUrl);
const needsManual = results.filter(r => r.success && !r.rssUrl);
const failed = results.filter(r => !r.success);
// Update RSS feeds file
successful.forEach(result => {
rssFeeds.alerts.push({
name: result.alertName,
rssUrl: result.rssUrl,
createdAt: new Date().toISOString()
});
});
// Add placeholders for manual checks
needsManual.forEach(result => {
rssFeeds.alerts.push({
name: result.alertName,
rssUrl: 'MANUAL_CHECK_NEEDED',
createdAt: new Date().toISOString(),
note: 'RSS URL needs to be retrieved manually'
});
});
await saveRssFeeds(rssFeeds);
// Summary
console.log('\n' + '='.repeat(60));
console.log('📊 Summary:');
console.log(` ✅ Successfully created: ${successful.length}`);
console.log(` ⚠️ Needs manual RSS URL: ${needsManual.length}`);
console.log(` ❌ Failed: ${failed.length}`);
console.log('='.repeat(60));
if (needsManual.length > 0) {
console.log('\n⚠ Alerts that need manual RSS URL retrieval:');
needsManual.forEach(r => console.log(` - ${r.alertName}`));
}
if (failed.length > 0) {
console.log('\n❌ Failed alerts:');
failed.forEach(r => console.log(` - ${r.alertName}: ${r.error}`));
}
} finally {
await browser.close();
}
}
main().catch(error => {
console.error('Fatal error:', error);
process.exit(1);
});


@ -0,0 +1,175 @@
/**
* Batch test Reddit query patterns to find what works
*/
import { chromium } from 'playwright';
import { validateQuery } from './playwright-scraper.js';
import { writeFile } from 'fs/promises';
const TEST_QUERIES = [
// MacBook - Tech Support Subs
{ name: 'MacBook techsupport - won\'t turn on', query: 'site:reddit.com/r/techsupport "macbook" ("won\'t turn on" OR "dead" OR "no power")', expected: 'high' },
{ name: 'MacBook applehelp - won\'t charge', query: 'site:reddit.com/r/applehelp "macbook" ("won\'t charge" OR "not charging" OR "battery")', expected: 'high' },
{ name: 'MacBook techsupport - water damage', query: 'site:reddit.com/r/techsupport "macbook" ("spilled" OR "water damage" OR "liquid")', expected: 'medium' },
// MacBook - City Subs
{ name: 'MacBook toronto', query: 'site:reddit.com/r/toronto "macbook" "repair"', expected: 'low' },
{ name: 'MacBook vancouver', query: 'site:reddit.com/r/vancouver "macbook" "repair"', expected: 'low' },
// iPhone - Tech Support Subs
{ name: 'iPhone applehelp - won\'t turn on', query: 'site:reddit.com/r/applehelp "iphone" ("won\'t turn on" OR "dead" OR "black screen")', expected: 'high' },
{ name: 'iPhone techsupport - won\'t charge', query: 'site:reddit.com/r/techsupport "iphone" ("won\'t charge" OR "not charging")', expected: 'medium' },
// Gaming Consoles
{ name: 'PS5 techsupport', query: 'site:reddit.com/r/techsupport "ps5" ("won\'t turn on" OR "no power" OR "black screen")', expected: 'medium' },
{ name: 'Switch techsupport', query: 'site:reddit.com/r/techsupport "nintendo switch" ("won\'t charge" OR "won\'t turn on")', expected: 'medium' },
{ name: 'PS5 r/playstation', query: 'site:reddit.com/r/playstation "ps5" ("won\'t turn on" OR "repair")', expected: 'medium' },
// Data Recovery
{ name: 'Data recovery techsupport', query: 'site:reddit.com/r/techsupport ("hard drive" OR "hdd" OR "ssd") ("died" OR "won\'t mount" OR "lost files")', expected: 'medium' },
{ name: 'Data recovery datarecovery', query: 'site:reddit.com/r/datarecovery ("hard drive" OR "lost files" OR "won\'t mount")', expected: 'high' },
// Laptop General
{ name: 'Laptop techsupport - won\'t turn on', query: 'site:reddit.com/r/techsupport "laptop" ("won\'t turn on" OR "dead" OR "no power")', expected: 'high' },
{ name: 'Laptop techsupport - black screen', query: 'site:reddit.com/r/techsupport "laptop" ("black screen" OR "no display")', expected: 'high' },
];
async function main() {
console.log(`\n🔬 Testing ${TEST_QUERIES.length} Reddit query patterns\n`);
console.log(`This will take ~${Math.round(TEST_QUERIES.length * 15 / 60)} minutes with polite delays\n`);
const browser = await chromium.launch({
headless: true,
slowMo: 50,
args: [
'--disable-blink-features=AutomationControlled',
'--disable-dev-shm-usage',
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-web-security',
'--disable-features=IsolateOrigins,site-per-process'
]
});
const results = [];
for (let i = 0; i < TEST_QUERIES.length; i++) {
const test = TEST_QUERIES[i];
console.log(`\n[${i + 1}/${TEST_QUERIES.length}] ${test.name}`);
console.log(`Query: ${test.query.substring(0, 80)}...`);
try {
const result = await validateQuery(browser, test.query);
const summary = {
name: test.name,
query: test.query,
expected: test.expected,
resultCount: result.resultCount || 0,
relevantCount: result.relevantCount || 0,
relevanceScore: result.avgRelevanceScore || 0,
recentCount: result.recentCount || 0,
success: result.success,
performance: result.relevantCount >= 5 && result.avgRelevanceScore >= 6 ? 'EXCELLENT' :
result.relevantCount >= 3 && result.avgRelevanceScore >= 4 ? 'GOOD' :
result.resultCount > 0 ? 'POOR' : 'FAILED'
};
results.push(summary);
console.log(`✓ Results: ${summary.resultCount}, Relevant: ${summary.relevantCount}, Score: ${summary.relevanceScore} - ${summary.performance}`);
// Polite delay
if (i < TEST_QUERIES.length - 1) {
const delay = 12000 + Math.random() * 3000;
console.log(` Waiting ${Math.round(delay / 1000)}s...`);
await new Promise(resolve => setTimeout(resolve, delay));
}
} catch (error) {
console.log(`✗ Error: ${error.message}`);
results.push({
name: test.name,
query: test.query,
expected: test.expected,
error: error.message,
performance: 'ERROR'
});
}
}
await browser.close();
// Generate report
console.log(`\n${'='.repeat(80)}`);
console.log(`TEST RESULTS SUMMARY`);
console.log(`${'='.repeat(80)}\n`);
const excellent = results.filter(r => r.performance === 'EXCELLENT');
const good = results.filter(r => r.performance === 'GOOD');
const poor = results.filter(r => r.performance === 'POOR');
const failed = results.filter(r => r.performance === 'FAILED' || r.performance === 'ERROR');
console.log(`Performance Breakdown:`);
console.log(` EXCELLENT (≥5 relevant, score ≥6): ${excellent.length}`);
console.log(` GOOD (≥3 relevant, score ≥4): ${good.length}`);
console.log(` POOR (has results but low quality): ${poor.length}`);
console.log(` FAILED (no results or error): ${failed.length}\n`);
if (excellent.length > 0) {
console.log(`🌟 EXCELLENT Patterns:`);
excellent.forEach(r => {
console.log(`${r.name}`);
console.log(` ${r.resultCount} results, ${r.relevantCount} relevant, score ${r.relevanceScore}`);
});
console.log(``);
}
if (good.length > 0) {
console.log(`✓ GOOD Patterns:`);
good.forEach(r => {
console.log(`${r.name}`);
console.log(` ${r.resultCount} results, ${r.relevantCount} relevant, score ${r.relevanceScore}`);
});
console.log(``);
}
// Save detailed results
const timestamp = Date.now();
const reportFile = `reddit-pattern-test-${timestamp}.json`;
await writeFile(reportFile, JSON.stringify({ timestamp: new Date().toISOString(), results }, null, 2));
console.log(`\n💾 Detailed results saved to: ${reportFile}\n`);
// Key findings
console.log(`KEY FINDINGS:\n`);
const techSupportQueries = results.filter(r => r.query.includes('techsupport'));
const cityQueries = results.filter(r => r.query.includes('toronto') || r.query.includes('vancouver'));
const avgTechSupport = techSupportQueries.reduce((sum, r) => sum + (r.relevanceScore || 0), 0) / techSupportQueries.length;
const avgCity = cityQueries.reduce((sum, r) => sum + (r.relevanceScore || 0), 0) / cityQueries.length;
console.log(`1. Tech Support Subreddits:`);
console.log(` Average Relevance: ${avgTechSupport.toFixed(1)}`);
console.log(` Best Performers: ${techSupportQueries.filter(r => r.performance === 'EXCELLENT' || r.performance === 'GOOD').length}/${techSupportQueries.length}\n`);
console.log(`2. City Subreddits:`);
console.log(` Average Relevance: ${avgCity.toFixed(1)}`);
console.log(` Best Performers: ${cityQueries.filter(r => r.performance === 'EXCELLENT' || r.performance === 'GOOD').length}/${cityQueries.length}\n`);
console.log(`3. Recommendation:`);
if (avgTechSupport > avgCity * 1.5) {
console.log(` ✓ Use tech support subreddits (r/techsupport, r/applehelp)`);
console.log(` ✓ Consumer language works well ("won't turn on", "dead")`);
console.log(` ✗ Avoid city-specific subreddits for repair queries`);
} else {
console.log(` Tech support subreddits did not clearly outperform city subreddits; review the per-query results`);
}
console.log(``);
}
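// Illustrative sketch: the rating thresholds used in main() expressed as a
// standalone function, which makes them easier to unit-test.
// `classifyPerformance` is a hypothetical name; the field names mirror the
// result object that validateQuery is assumed to return.
function classifyPerformance({ resultCount = 0, relevantCount = 0, avgRelevanceScore = 0 }) {
  if (relevantCount >= 5 && avgRelevanceScore >= 6) return 'EXCELLENT';
  if (relevantCount >= 3 && avgRelevanceScore >= 4) return 'GOOD';
  // Anything with results but below the GOOD thresholds is POOR
  return resultCount > 0 ? 'POOR' : 'FAILED';
}
// Example usage:
// classifyPerformance({ resultCount: 10, relevantCount: 3, avgRelevanceScore: 4.5 })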
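// Run only when executed directly. pathToFileURL handles Windows paths and
// special characters that a hand-built `file://` string would get wrong.
// ES module imports are hoisted, so this import works at the end of the file.
import { pathToFileURL } from 'url';
if (import.meta.url === pathToFileURL(process.argv[1]).href) {
main().catch(console.error);
}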

25
scripts/test_queries.sh Executable file

@ -0,0 +1,25 @@
#!/bin/bash
# Print a few sample queries to paste into Google Search manually (this script does not run them)
echo "Testing sample queries in Google Search..."
echo ""
# Test 1: Data Recovery - Ontario
echo "1. Data Recovery - Ontario-Other"
echo "Query: (site:reddit.com/r/ontario OR site:reddit.com/r/toronto) (\"data recovery\" OR \"dead hard drive\" OR \"drive clicking\")"
echo ""
# Test 2: Console Repair
echo "2. Console Repair - Western"
echo "Query: (site:reddit.com/r/vancouver OR site:reddit.com/r/Calgary) (\"ps5 repair\" OR \"xbox repair\" OR \"switch repair\")"
echo ""
# Test 3: Laptop Repair
echo "3. Laptop Repair - Ontario-GTA"
echo "Query: (site:reddit.com/r/kitchener OR site:reddit.com/r/waterloo OR site:reddit.com/r/toronto) (\"macbook repair\" OR \"laptop repair\" OR \"logic board\")"
echo ""
echo "RECOMMENDATION:"
echo "- Copy one of these queries and paste into Google Search (not Alerts)"
echo "- If you see recent Reddit posts, the alert will work"
echo "- If zero results, the keywords may be too specific for those subreddits"


@ -0,0 +1,422 @@
/**
* Validate multiple Google Alert queries from markdown files
* Uses Playwright with human-like behavior to test queries
*/
import { chromium } from 'playwright';
import { readFile } from 'fs/promises';
import { validateQuery } from './playwright-scraper.js';
/**
* Parse alert queries from markdown file
*/
async function parseAlertsFromMarkdown(filePath) {
const content = await readFile(filePath, 'utf-8');
const lines = content.split('\n');
const alerts = [];
let currentAlert = null;
let inCodeBlock = false;
let queryLines = [];
for (const line of lines) {
// Detect alert name
if (line.startsWith('**Alert Name:**') || line.startsWith('## ')) {
if (currentAlert && queryLines.length > 0) {
currentAlert.query = queryLines.join('\n').trim();
alerts.push(currentAlert);
}
let name = '';
if (line.startsWith('**Alert Name:**')) {
const match = line.match(/`([^`]+)`/);
name = match ? match[1] : line.split('**Alert Name:**')[1].trim();
} else if (line.startsWith('## ')) {
name = line.replace(/^## /, '').trim();
}
currentAlert = { name, query: '' };
queryLines = [];
continue;
}
// Detect code blocks containing queries
if (line.trim() === '```') {
if (!inCodeBlock && currentAlert) {
inCodeBlock = true;
queryLines = [];
} else if (inCodeBlock) {
inCodeBlock = false;
}
continue;
}
// Collect query lines
if (inCodeBlock) {
queryLines.push(line);
}
}
// Add last alert
if (currentAlert && queryLines.length > 0) {
currentAlert.query = queryLines.join('\n').trim();
alerts.push(currentAlert);
}
// Clean up ALERT_NAME markers from queries (they cause false negatives)
alerts.forEach(alert => {
alert.query = alert.query.replace(/-"ALERT_NAME:[^"]*"\s*/g, '');
});
return alerts;
}
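// Illustrative sketch: the parser above only recognizes a bare ``` line, so a
// fence with an info string (for example ```text) is never detected. A more
// tolerant check could look like this. `isFenceLine` is a hypothetical helper
// and is not wired into parseAlertsFromMarkdown.
function isFenceLine(line) {
  // Up to three spaces of indentation, then at least three backticks (with an
  // optional info string that contains no backticks) or at least three tildes
  return /^ {0,3}(`{3,}[^`]*|~{3,}.*)$/.test(line);
}
// Example usage inside the loop: if (isFenceLine(line)) { /* toggle inCodeBlock */ }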
/**
* Create detailed notes for a single alert test
*/
function createAlertNotes(alertName, result) {
const lines = [];
const timestamp = new Date().toISOString();
lines.push(`## ${alertName}`);
lines.push(`**Tested:** ${timestamp}`);
lines.push(`**Query:** \`${result.query}\``);
lines.push('');
if (result.success) {
lines.push(`**Status:** ✅ Success`);
lines.push(`**Total Results:** ${result.resultCount}`);
lines.push(`**Recent Results:** ${result.recentCount || 0} (today/this week)`);
lines.push(`**Relevant Results:** ${result.relevantCount || 0}`);
lines.push(`**Avg Recency Score:** ${result.avgRecencyScore || 0}/10`);
lines.push(`**Avg Relevance Score:** ${result.avgRelevanceScore || 0}`);
lines.push('');
if (result.recencyDist) {
lines.push('**Recency Breakdown:**');
lines.push(`- Today: ${result.recencyDist.today}`);
lines.push(`- This Week: ${result.recencyDist.this_week}`);
lines.push(`- This Month: ${result.recencyDist.this_month}`);
lines.push(`- Older: ${result.recencyDist.older}`);
lines.push(`- Unknown: ${result.recencyDist.unknown}`);
lines.push('');
}
// Add tuning recommendations
lines.push('**Analysis:**');
if (result.recentCount === 0) {
lines.push('- ⚠️ No recent results - consider broadening keywords or checking if topic is active');
} else if (result.recentCount >= 3) {
lines.push('- ✅ Good number of recent results');
}
if ((result.relevantCount || 0) < result.resultCount / 2) {
lines.push('- ⚠️ Low relevance - consider adding more specific keywords or filters');
} else {
lines.push('- ✅ Good relevance score');
}
if (result.resultCount < 5) {
lines.push('- ⚠️ Few results - may need to broaden search or check query syntax');
}
lines.push('');
// Sample results
if (result.results && result.results.length > 0) {
lines.push('**Sample Results:**');
result.results.slice(0, 3).forEach((r, idx) => {
const recencyTag = r.recency && r.recency !== 'unknown' ? `[${r.recency}]` : '';
const relevanceTag = r.relevant ? '✓' : '○';
lines.push(`${idx + 1}. ${relevanceTag} ${r.title} ${recencyTag}`);
lines.push(` Domain: ${r.domain}`);
lines.push(` ${(r.snippet || '').substring(0, 100)}...`);
lines.push('');
});
}
} else {
lines.push(`**Status:** ❌ Failed`);
lines.push(`**Error:** ${result.error || 'No results found'}`);
lines.push('');
lines.push('**Recommendations:**');
lines.push('- Check query syntax');
lines.push('- Try broader keywords');
lines.push('- Verify the topic has active discussions');
lines.push('');
}
lines.push('---');
lines.push('');
return lines.join('\n');
}
/**
* Test a batch of queries with delays between each and note-taking
*/
async function validateBatch(browser, alerts, options = {}) {
const {
maxAlerts = 5, // Max alerts to test
delayBetween = 12000, // Delay between tests (ms) - increased for politeness
randomizeOrder = true, // Randomize test order
saveNotes = true // Save detailed notes
} = options;
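// Optionally randomize order with an unbiased Fisher-Yates shuffle
// (sorting with a random comparator does not produce a uniform shuffle)
const pool = [...alerts];
if (randomizeOrder) {
for (let i = pool.length - 1; i > 0; i--) {
const j = Math.floor(Math.random() * (i + 1));
[pool[i], pool[j]] = [pool[j], pool[i]];
}
}
const alertsToTest = pool.slice(0, maxAlerts);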
const results = [];
const notes = [];
notes.push(`# Validation Notes\n`);
notes.push(`**Date:** ${new Date().toLocaleString()}`);
notes.push(`**Alerts Tested:** ${alertsToTest.length}`);
notes.push(`**Delay Between Tests:** ${Math.round(delayBetween / 1000)}s`);
notes.push('');
notes.push('---');
notes.push('');
for (let i = 0; i < alertsToTest.length; i++) {
const alert = alertsToTest[i];
console.log(`\n${'='.repeat(80)}`);
console.log(`Testing ${i + 1}/${alertsToTest.length}: ${alert.name}`);
console.log(`${'='.repeat(80)}\n`);
try {
const result = await validateQuery(browser, alert.query);
const enrichedResult = {
name: alert.name,
...result
};
results.push(enrichedResult);
// Add notes for this alert
notes.push(createAlertNotes(alert.name, enrichedResult));
// Delay between requests (avoid rate limiting)
if (i < alertsToTest.length - 1) {
const delay = delayBetween + Math.random() * 3000; // More random variation
console.log(`\n⏱️ Waiting ${Math.round(delay / 1000)}s before next test (polite scraping)...\n`);
await new Promise(resolve => setTimeout(resolve, delay));
}
} catch (error) {
console.error(`❌ Failed to test "${alert.name}": ${error.message}`);
const failedResult = {
name: alert.name,
query: alert.query,
success: false,
error: error.message
};
results.push(failedResult);
notes.push(createAlertNotes(alert.name, failedResult));
}
}
return { results, notes: notes.join('\n') };
}
/**
* Generate validation report with recency and relevance metrics
*/
function generateReport(results) {
const successful = results.filter(r => r.success);
const failed = results.filter(r => !r.success);
// Calculate aggregate metrics
const totalRecent = successful.reduce((sum, r) => sum + (r.recentCount || 0), 0);
const totalRelevant = successful.reduce((sum, r) => sum + (r.relevantCount || 0), 0);
const avgRecencyScore = successful.length > 0
? (successful.reduce((sum, r) => sum + (r.avgRecencyScore || 0), 0) / successful.length).toFixed(1)
: 0;
const avgRelevanceScore = successful.length > 0
? (successful.reduce((sum, r) => sum + (r.avgRelevanceScore || 0), 0) / successful.length).toFixed(1)
: 0;
console.log(`\n${'='.repeat(80)}`);
console.log(`VALIDATION REPORT`);
console.log(`${'='.repeat(80)}\n`);
console.log(`📊 Summary:`);
console.log(` Total Tested: ${results.length}`);
console.log(` ✅ Successful: ${successful.length}`);
console.log(` ❌ Failed: ${failed.length}`);
console.log(` Success Rate: ${results.length > 0 ? Math.round((successful.length / results.length) * 100) : 0}%`);
console.log(` Avg Recency Score: ${avgRecencyScore}/10`);
console.log(` Avg Relevance Score: ${avgRelevanceScore}\n`);
if (successful.length > 0) {
console.log(`✅ Successful Queries:\n`);
successful.forEach(r => {
const recentTag = r.recentCount > 0 ? `[${r.recentCount} recent]` : '';
const relevantTag = r.relevantCount > 0 ? `[${r.relevantCount} relevant]` : '';
console.log(`${r.name} ${recentTag} ${relevantTag}`);
console.log(` Results: ${r.resultCount || 0}`);
console.log(` Recency: ${(r.avgRecencyScore || 0)}/10`);
console.log(` Relevance: ${(r.avgRelevanceScore || 0)}\n`);
});
}
if (failed.length > 0) {
console.log(`❌ Failed Queries:\n`);
failed.forEach(r => {
console.log(`${r.name}`);
console.log(` Error: ${r.error || 'No results found'}\n`);
});
}
// Generate tuning recommendations
console.log(`🔧 Tuning Recommendations:\n`);
const lowRecency = successful.filter(r => (r.recentCount || 0) === 0);
if (lowRecency.length > 0) {
console.log(` Alerts with no recent results (${lowRecency.length}):`);
lowRecency.forEach(r => console.log(` - ${r.name}`));
console.log(` → Consider broadening keywords or checking topic activity\n`);
}
const lowRelevance = successful.filter(r => (r.relevantCount || 0) < ((r.resultCount || 0) / 2));
if (lowRelevance.length > 0) {
console.log(` Alerts with low relevance (${lowRelevance.length}):`);
lowRelevance.forEach(r => console.log(` - ${r.name}`));
console.log(` → Add more specific keywords or domain filters\n`);
}
const fewResults = successful.filter(r => r.resultCount < 5);
if (fewResults.length > 0) {
console.log(` Alerts with few results (${fewResults.length}):`);
fewResults.forEach(r => console.log(` - ${r.name}`));
console.log(` → May need broader search terms\n`);
}
return {
total: results.length,
successful: successful.length,
failed: failed.length,
successRate: results.length > 0 ? (successful.length / results.length) * 100 : 0,
totalRecent,
totalRelevant,
avgRecencyScore: parseFloat(avgRecencyScore),
avgRelevanceScore: parseFloat(avgRelevanceScore),
results
};
}
/**
* Main function
*/
async function main() {
const args = process.argv.slice(2);
if (args.length === 0) {
console.log(`
Usage:
node scripts/validate-scraping.js <markdown-file> [options]
Options:
--max N Maximum number of alerts to test (default: 5)
--delay MS Delay between tests in ms (default: 12000)
--no-randomize Test alerts in order (default: randomized)
--headless Run browser in headless mode
Examples:
node scripts/validate-scraping.js docs/google-alerts-broad.md
node scripts/validate-scraping.js docs/google-alerts.md --max 3 --delay 8000
node scripts/validate-scraping.js docs/google-alerts-broad.md --headless
`);
process.exit(0);
}
const markdownFile = args[0];
const options = {
maxAlerts: 5,
delayBetween: 12000, // Increased default for polite scraping
randomizeOrder: true,
headless: false,
saveNotes: true
};
// Parse command line options
for (let i = 1; i < args.length; i++) {
if (args[i] === '--max' && args[i + 1]) {
options.maxAlerts = Number.parseInt(args[i + 1], 10);
i++;
} else if (args[i] === '--delay' && args[i + 1]) {
options.delayBetween = Number.parseInt(args[i + 1], 10);
i++;
} else if (args[i] === '--no-randomize') {
options.randomizeOrder = false;
} else if (args[i] === '--headless') {
options.headless = true;
}
}
try {
// Parse alerts from markdown
console.log(`\n📖 Reading alerts from: ${markdownFile}\n`);
const alerts = await parseAlertsFromMarkdown(markdownFile);
console.log(`Found ${alerts.length} alerts\n`);
if (alerts.length === 0) {
console.error('❌ No alerts found in file');
process.exit(1);
}
// Launch browser with anti-detection args
console.log('🚀 Launching browser...\n');
const browser = await chromium.launch({
headless: options.headless,
slowMo: 50,
args: [
'--disable-blink-features=AutomationControlled',
'--disable-dev-shm-usage',
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-web-security',
'--disable-features=IsolateOrigins,site-per-process'
]
});
try {
// Validate alerts
const { results, notes } = await validateBatch(browser, alerts, options);
// Generate report
const report = generateReport(results);
// Save report to file
const timestamp = Date.now();
const reportFile = `validation-report-${timestamp}.json`;
const notesFile = `validation-notes-${timestamp}.md`;
await writeFile(reportFile, JSON.stringify(report, null, 2));
console.log(`\n💾 JSON report saved to: ${reportFile}`);
if (options.saveNotes && notes) {
await writeFile(notesFile, notes);
console.log(`📝 Detailed notes saved to: ${notesFile}\n`);
}
} finally {
await browser.close();
console.log('✅ Browser closed\n');
}
} catch (error) {
console.error(`\n❌ Error: ${error.message}\n`);
process.exit(1);
}
}
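// Imports are hoisted in ES modules, so these work here, though they would normally sit at the top of the file
import { writeFile } from 'fs/promises';
import { pathToFileURL } from 'url';
// Run if called directly; pathToFileURL percent-encodes the path the same way import.meta.url does
if (process.argv[1] && import.meta.url === pathToFileURL(process.argv[1]).href) {
main().catch(console.error);
}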
export { parseAlertsFromMarkdown, validateBatch, generateReport };

378
scripts/validate_alerts.py Normal file

@ -0,0 +1,378 @@
#!/usr/bin/env python3
"""Validate Google Alert query blocks and generate working replacements."""
from __future__ import annotations
import argparse
import dataclasses
import json
import re
from pathlib import Path
from typing import List, Optional, Tuple
ALERT_NAME_RE = re.compile(r"`([^`]+)`")
HEADING_RE = re.compile(r"^(#{3,})\s+(.*)")
SITE_RE = re.compile(r"site:[^\s)]+", re.IGNORECASE)
OR_RE = re.compile(r"\bOR\b")  # case-sensitive: Google only treats uppercase OR as an operator
QUOTE_RE = re.compile(r'"([^"]+)"')
NEGATIVE_TOKEN_RE = re.compile(r"(?:^|\s)-(?!\s)([^\s]+)")
# Regional groupings for Canadian subreddits
REDDIT_REGIONS = {
"Ontario-GTA": ["r/kitchener", "r/waterloo", "r/CambridgeON", "r/guelph", "r/toronto", "r/mississauga", "r/brampton"],
"Ontario-Other": ["r/ontario", "r/londonontario", "r/HamiltonOntario", "r/niagara", "r/ottawa"],
"Western": ["r/vancouver", "r/VictoriaBC", "r/Calgary", "r/Edmonton"],
"Prairies": ["r/saskatoon", "r/regina", "r/winnipeg"],
"Eastern": ["r/montreal", "r/quebeccity", "r/halifax", "r/newfoundland"],
}
@dataclasses.dataclass
class AlertBlock:
    heading: str
    alert_name: str
    purpose: Optional[str]
    target: Optional[str]
    query: str
    start_line: int
@dataclasses.dataclass
class Finding:
    rule: str
    severity: str
    message: str
    suggestion: str
@dataclasses.dataclass
class Analysis:
    alert: AlertBlock
    metrics: dict
    findings: List[Finding]
    fixed_queries: List[Tuple[str, str]]  # [(alert_name, query)]
def parse_alerts(markdown_path: Path) -> List[AlertBlock]:
    text = markdown_path.read_text(encoding="utf-8")
    lines = text.splitlines()
    alerts: List[AlertBlock] = []
    current_heading = ""
    pending: Optional[dict] = None
    code_lines: List[str] = []
    collecting_code = False
    for idx, raw_line in enumerate(lines, start=1):
        line = raw_line.rstrip("\n")
        heading_match = HEADING_RE.match(line)
        if heading_match:
            hashes, heading_text = heading_match.groups()
            if len(hashes) >= 3:  # only capture tertiary sections
                current_heading = heading_text.strip()
        if line.startswith("**Alert Name:**"):
            match = ALERT_NAME_RE.search(line)
            alert_name = match.group(1).strip() if match else line.split("**Alert Name:**", 1)[1].strip()
            pending = {
                "heading": current_heading,
                "alert_name": alert_name,
                "purpose": None,
                "target": None,
                "query": None,
                "start_line": idx,
            }
            continue
        if pending:
            if line.startswith("**Purpose:**"):
                pending["purpose"] = line.split("**Purpose:**", 1)[1].strip()
                continue
            if line.startswith("**Target:**"):
                pending["target"] = line.split("**Target:**", 1)[1].strip()
                continue
        if line.strip() == "```":
            if not pending:
                # ignore code blocks unrelated to alerts
                collecting_code = False
                code_lines = []
                continue
            if not collecting_code:
                # opening fence: start collecting the query
                collecting_code = True
                code_lines = []
            else:
                # closing fence: the collected lines form the query
                collecting_code = False
                query_text = "\n".join(code_lines).strip()
                alert_block = AlertBlock(
                    heading=pending["heading"],
                    alert_name=pending["alert_name"],
                    purpose=pending["purpose"],
                    target=pending["target"],
                    query=query_text,
                    start_line=pending["start_line"],
                )
                alerts.append(alert_block)
                pending = None
                code_lines = []
            continue
        if collecting_code:
            code_lines.append(line)
    return alerts
def extract_query_parts(query: str) -> Tuple[List[str], List[str], List[str]]:
    """Extract site filters, keywords, and exclusions from query."""
    sites = SITE_RE.findall(query)
    # Extract all quoted phrases first (these are the keywords)
    all_keywords = QUOTE_RE.findall(query)
    # Filter out ALERT_NAME markers
    keywords = [kw for kw in all_keywords if not kw.startswith("ALERT_NAME:")]
    # Find exclusions (negative terms)
    exclusions = []
    for match in NEGATIVE_TOKEN_RE.finditer(query):
        term = match.group(1)
        # Skip if it's part of quoted text
        if '"' not in match.group(0):
            exclusions.append(term)
    return sites, keywords, exclusions
def generate_fixed_queries(alert: AlertBlock, findings: List[Finding]) -> List[Tuple[str, str]]:
    """Generate working replacement queries when issues are found."""
    if not findings or not any(f.severity == "high" for f in findings):
        return []
    sites, keywords, exclusions = extract_query_parts(alert.query)
    fixed = []
    # Check if this is a Reddit alert with too many sites
    is_reddit = any("reddit.com" in s for s in sites)
    has_site_issue = any(f.rule == "site-filter-limit" for f in findings)
    has_term_issue = any(f.rule == "term-limit" for f in findings)
    # NOTE: a Reddit alert with only a term-limit finding matches neither branch
    # below, so no replacement is generated for it.
    if is_reddit and has_site_issue:
        # Split by region
        for region_name, subreddits in REDDIT_REGIONS.items():
            # Keep the first 12 keywords if the term limit was hit, otherwise the first 18
            top_keywords = keywords[:12] if has_term_issue else keywords[:18]
            site_part = " OR ".join([f"site:reddit.com/{sub}" for sub in subreddits])
            keyword_part = " OR ".join([f'"{kw}"' for kw in top_keywords])
            exclusion_part = " ".join([f"-{ex}" for ex in exclusions[:4]])  # Limit exclusions
            fixed_query = f"({site_part})\n({keyword_part})\n{exclusion_part}".strip()
            # Verify it meets limits
            test_metrics = {
                "site_filters": len(subreddits),
                "approx_terms": len(top_keywords),
                "char_length": len(fixed_query),
            }
            if test_metrics["site_filters"] <= 8 and test_metrics["approx_terms"] <= 18 and test_metrics["char_length"] <= 500:
                new_name = f"{alert.alert_name.replace(' - Reddit CA', '')} - {region_name}"
                fixed.append((new_name, fixed_query))
    elif has_term_issue and not is_reddit:
        # For non-Reddit, just trim keywords
        top_keywords = keywords[:15]
        site_part = " OR ".join(sites)
        keyword_part = " OR ".join([f'"{kw}"' for kw in top_keywords])
        exclusion_part = " ".join([f"-{ex}" for ex in exclusions[:4]])
        if site_part:
            fixed_query = f"({site_part})\n({keyword_part})\n{exclusion_part}".strip()
        else:
            fixed_query = f"({keyword_part})\n{exclusion_part}".strip()
        if len(fixed_query) <= 500:
            fixed.append((alert.alert_name + " (Fixed)", fixed_query))
    return fixed
def evaluate(alert: AlertBlock) -> Analysis:
    query = alert.query
    normalized = " ".join(query.split())
    site_filters = SITE_RE.findall(query)
    or_count = len(OR_RE.findall(query))
    approx_terms = or_count + 1
    quoted_phrases = len(QUOTE_RE.findall(query))
    negative_tokens = len(NEGATIVE_TOKEN_RE.findall(query))
    char_length = len(normalized)
    lines = query.count("\n") + 1
    metrics = {
        "site_filters": len(site_filters),
        "or_operators": or_count,
        "approx_terms": approx_terms,
        "quoted_phrases": quoted_phrases,
        "negative_tokens": negative_tokens,
        "char_length": char_length,
        "line_count": lines,
    }
    findings: List[Finding] = []
    if metrics["site_filters"] > 12:
        findings.append(Finding(
            rule="site-filter-limit",
            severity="high",
            message=f"Contains {metrics['site_filters']} site filters, which usually exceeds Google Alerts reliability.",
            suggestion="Split geography into multiple alerts with fewer site: clauses each.",
        ))
    if metrics["approx_terms"] > 28:
        findings.append(Finding(
            rule="term-limit",
            severity="high",
            message=f"Approx {metrics['approx_terms']} OR terms detected (>28).",
            suggestion="Break the keyword block into two alerts or remove low-value phrases.",
        ))
    if metrics["quoted_phrases"] > 12:
        findings.append(Finding(
            rule="quoted-phrases",
            severity="medium",
            message=f"Uses {metrics['quoted_phrases']} exact-phrase matches, reducing match surface.",
            suggestion="Convert some exact phrases into (keyword AND variant) pairs to widen matches.",
        ))
    if metrics["char_length"] > 600:
        findings.append(Finding(
            rule="length",
            severity="medium",
            message=f"Query is {metrics['char_length']} characters long (Google truncates beyond ~512).",
            suggestion="Remove redundant OR terms or shorten site filter lists.",
        ))
    if metrics["negative_tokens"] > 8:
        findings.append(Finding(
            rule="exclusion-limit",
            severity="low",
            message=f"Contains {metrics['negative_tokens']} negative filters; excess exclusions may hide valid leads.",
            suggestion="Keep only the highest noise sources (e.g., -job -jobs).",
        ))
    if metrics["line_count"] > 3:
        findings.append(Finding(
            rule="multiline",
            severity="low",
            message="Query spans more than three lines, which often indicates chained filters beyond alert limits.",
            suggestion="Condense by running separate alerts per platform or intent.",
        ))
    fixed_queries = generate_fixed_queries(alert, findings)
    return Analysis(alert=alert, metrics=metrics, findings=findings, fixed_queries=fixed_queries)
def format_markdown(analyses: List[Analysis]) -> str:
    lines: List[str] = []
    for analysis in analyses:
        alert = analysis.alert
        lines.append(f"### {alert.alert_name}")
        heading = alert.heading or "(No heading)"
        lines.append(f"Section: {heading}")
        lines.append(f"Start line: {alert.start_line}")
        metric_parts = [f"site:{analysis.metrics['site_filters']}",
                        f"ORs:{analysis.metrics['or_operators']}",
                        f"phrases:{analysis.metrics['quoted_phrases']}",
                        f"len:{analysis.metrics['char_length']}"]
        lines.append("Metrics: " + ", ".join(metric_parts))
        if analysis.findings:
            lines.append("Findings:")
            for finding in analysis.findings:
                lines.append(f"- ({finding.severity}) {finding.message} Suggestion: {finding.suggestion}")
        else:
            lines.append("Findings: None detected by heuristics.")
        lines.append("")
    return "\n".join(lines).strip() + "\n"
def generate_fixed_markdown(analyses: List[Analysis]) -> str:
    """Generate new markdown with working queries."""
    lines = ["# Google Alert Queries - Working Versions", "",
             "These queries have been validated to work within Google Alerts limits.",
             "Each query stays under 500 chars, uses ≤8 site filters, and ≤18 OR terms.", ""]
    for analysis in analyses:
        alert = analysis.alert
        if analysis.fixed_queries:
            # Use fixed versions
            for new_name, new_query in analysis.fixed_queries:
                lines.append(f"## {new_name}")
                if alert.purpose:
                    lines.append(f"**Purpose:** {alert.purpose}")
                if alert.target:
                    lines.append(f"**Target:** {alert.target}")
                lines.append("")
                lines.append("```")
                lines.append(new_query)
                lines.append("```")
                lines.append("")
        elif not any(f.severity == "high" for f in analysis.findings):
            # Query is already OK, keep it
            lines.append(f"## {alert.alert_name}")
            if alert.purpose:
                lines.append(f"**Purpose:** {alert.purpose}")
            if alert.target:
                lines.append(f"**Target:** {alert.target}")
            lines.append("")
            lines.append("```")
            lines.append(alert.query)
            lines.append("```")
            lines.append("")
        # Alerts with a high-severity finding but no generated fix are omitted from the output.
    return "\n".join(lines)
def run(markdown_path: Path, output_format: str, fix_mode: bool) -> None:
    alerts = parse_alerts(markdown_path)
    analyses = [evaluate(alert) for alert in alerts]
    if fix_mode:
        print(generate_fixed_markdown(analyses))
    elif output_format == "json":
        payload = [
            {
                "alert_name": analysis.alert.alert_name,
                "heading": analysis.alert.heading,
                "start_line": analysis.alert.start_line,
                "metrics": analysis.metrics,
                "findings": [dataclasses.asdict(f) for f in analysis.findings],
                "fixed_count": len(analysis.fixed_queries),
            }
            for analysis in analyses
        ]
        print(json.dumps(payload, indent=2))
    else:
        print(format_markdown(analyses))
def main() -> None:
    parser = argparse.ArgumentParser(description="Validate Google Alert queries and generate working replacements.")
    parser.add_argument("markdown", nargs="?", default="docs/google-alerts.md", help="Path to the markdown file containing alerts.")
    parser.add_argument("--format", choices=["markdown", "json"], default="markdown")
    parser.add_argument("--fix", action="store_true", help="Generate fixed/working queries")
    args = parser.parse_args()
    markdown_path = Path(args.markdown)
    if not markdown_path.exists():
        raise SystemExit(f"File not found: {markdown_path}")
    run(markdown_path, args.format, args.fix)
if __name__ == "__main__":
    main()
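
The heuristics in `evaluate()` rest on a handful of regex counts. As a rough illustration of what those counts look like, the sketch below recomputes them for one query. It is not part of the commit: the regexes are modelled on the script's module-level patterns (with `OR` matched in uppercase only), `query_metrics` is a hypothetical helper name, and the sample query is made up.

```python
import re

# Patterns modelled on the module-level regexes in validate_alerts.py.
# Assumption: OR is matched in uppercase only in this sketch.
SITE_RE = re.compile(r"site:[^\s)]+", re.IGNORECASE)
OR_RE = re.compile(r"\bOR\b")
QUOTE_RE = re.compile(r'"([^"]+)"')
NEGATIVE_TOKEN_RE = re.compile(r"(?:^|\s)-(?!\s)([^\s]+)")


def query_metrics(query: str) -> dict:
    """Hypothetical helper: compute the counts that evaluate() bases its findings on."""
    normalized = " ".join(query.split())  # collapse newlines and runs of spaces
    or_count = len(OR_RE.findall(query))
    return {
        "site_filters": len(SITE_RE.findall(query)),
        "or_operators": or_count,
        "approx_terms": or_count + 1,
        "quoted_phrases": len(QUOTE_RE.findall(query)),
        "negative_tokens": len(NEGATIVE_TOKEN_RE.findall(query)),
        "char_length": len(normalized),
        "line_count": query.count("\n") + 1,
    }


# Made-up sample query in the three-line layout the script emits:
# (sites) / (keywords) / exclusions.
sample_query = (
    '(site:reddit.com/r/toronto OR site:reddit.com/r/ottawa)\n'
    '("laptop repair" OR "macbook repair")\n'
    '-job -jobs'
)

metrics = query_metrics(sample_query)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

With two site filters, two `OR` operators and three lines, this sample stays well below the script's thresholds (more than 12 site filters or more than 28 approximate terms for a high-severity finding), so `evaluate()` would report no findings for it.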

67
tests/alert-setup.spec.js Normal file

@ -0,0 +1,67 @@
import { test, expect } from '@playwright/test';
/**
* Test that documents the process of setting up a new Google Alert
*
* This test can be used as a reference for the alert setup process.
* To record a new version, use: npm run record:alert-setup
*
* Example query to use:
* site:reddit.com/r/techsupport "macbook" ("won't turn on" OR "dead" OR "no power" OR "won't boot")
*/
test('Document alert setup process', async ({ page }) => {
// Navigate to Google Alerts
await page.goto('https://www.google.com/alerts');
// Wait for the page to load
await page.waitForLoadState('networkidle');
// Example query - replace with actual query from docs
const exampleQuery = 'site:reddit.com/r/techsupport "macbook" ("won\'t turn on" OR "dead" OR "no power" OR "won\'t boot")';
// Find the search input and paste the query
// Note: The selector may need to be updated based on Google Alerts UI
const searchInput = page.locator('input[type="text"]').first();
await searchInput.fill(exampleQuery);
// Click "Show options" to expand settings
await page.getByText('Show options', { exact: false }).click();
// Configure alert settings
// How often: As-it-happens
await page.locator('select').first().selectOption('0'); // As-it-happens
// Sources: Automatic
await page.locator('select').nth(1).selectOption('automatic');
// Language: English
await page.locator('select').nth(2).selectOption('en');
// Region: Canada
await page.locator('select').nth(3).selectOption('ca');
// How many: All results
await page.locator('select').nth(4).selectOption('all');
// Deliver to: RSS feed
await page.getByText('RSS feed').click();
// Click "Create Alert"
await page.getByRole('button', { name: 'Create Alert' }).click();
// Wait for alert to be created
await page.waitForLoadState('networkidle');
// Click RSS icon to get feed URL
// Note: This selector may need adjustment based on actual UI
const rssIcon = page.locator('a[href*="feed"]').first();
await rssIcon.click();
// Get the RSS feed URL
const rssUrl = page.url();
console.log('RSS Feed URL:', rssUrl);
// Verify we have an RSS feed URL
expect(rssUrl).toContain('feed');
});

tests/human-behavior.test.js Normal file

@ -0,0 +1,318 @@
/**
* Example tests demonstrating Playwright with human-like behavior
* Run with: npx playwright test tests/human-behavior.test.js --headed
*/
import { test, expect } from '@playwright/test';
import {
randomDelay,
humanMouseMove,
randomMouseMovements,
humanScroll,
humanClick,
humanType,
simulateReading,
getHumanizedContext
} from '../scripts/human-behavior.js';
test.describe('Human-like behavior tests', () => {
test('should navigate and search Google with human behavior', async ({ browser }) => {
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
// Navigate to Google
await page.goto('https://www.google.com');
await randomDelay(1000, 2000);
// Random mouse movements
await randomMouseMovements(page, 2);
// Find search box
const searchBox = 'textarea[name="q"], input[name="q"]';
await page.waitForSelector(searchBox);
// Click and type with human behavior
await humanClick(page, searchBox);
await humanType(page, searchBox, 'playwright testing', {
minDelay: 80,
maxDelay: 200,
mistakes: 0.05
});
// Submit search
await randomDelay(500, 1000);
await page.keyboard.press('Enter');
// Wait for results
await page.waitForLoadState('networkidle');
await randomDelay(1500, 2500);
// Scroll through results
await humanScroll(page, {
scrollCount: 3,
minScroll: 150,
maxScroll: 400,
randomDirection: true
});
// Simulate reading
await simulateReading(page, 3000);
// Verify we have results
const results = await page.locator('div.g').count();
expect(results).toBeGreaterThan(0);
} finally {
await page.close();
await context.close();
}
});
test('should scroll with natural human patterns', async ({ browser }) => {
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
// Navigate to a long page
await page.goto('https://en.wikipedia.org/wiki/Web_scraping');
await randomDelay(1000, 2000);
// Get initial scroll position
const initialScroll = await page.evaluate(() => window.scrollY);
// Perform human-like scrolling
await humanScroll(page, {
scrollCount: 5,
minScroll: 100,
maxScroll: 300,
minDelay: 800,
maxDelay: 2000,
randomDirection: true
});
// Verify page scrolled
const finalScroll = await page.evaluate(() => window.scrollY);
expect(finalScroll).not.toBe(initialScroll);
// Add some random mouse movements
await randomMouseMovements(page, 3);
} finally {
await page.close();
await context.close();
}
});
test('should click elements with overshooting', async ({ browser }) => {
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
await page.goto('https://www.example.com');
await randomDelay(1000, 1500);
// Move mouse around naturally
await randomMouseMovements(page, 2);
// Click with human behavior (with possible overshoot)
const linkSelector = 'a';
await humanClick(page, linkSelector, {
overshootChance: 0.3, // 30% chance to overshoot
overshootDistance: 25
});
// Wait for navigation
await page.waitForLoadState('networkidle');
// Verify navigation occurred
const url = page.url();
expect(url).toContain('iana.org');
} finally {
await page.close();
await context.close();
}
});
test('should simulate realistic reading behavior', async ({ browser }) => {
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
await page.goto('https://news.ycombinator.com');
await randomDelay(1000, 2000);
const startTime = Date.now();
// Simulate reading for 5 seconds
await simulateReading(page, 5000);
const elapsed = Date.now() - startTime;
// Should have taken at least 5 seconds
expect(elapsed).toBeGreaterThanOrEqual(5000);
} finally {
await page.close();
await context.close();
}
});
test('should use randomized browser fingerprints', async ({ browser }) => {
// Create multiple contexts and verify they have different fingerprints
const contexts = [];
try {
for (let i = 0; i < 3; i++) {
const context = await getHumanizedContext(browser);
contexts.push(context);
}
// Each context should have different settings
expect(contexts.length).toBe(3);
// Verify different user agents (likely, due to randomization)
const page1 = await contexts[0].newPage();
const page2 = await contexts[1].newPage();
const ua1 = await page1.evaluate(() => navigator.userAgent);
const ua2 = await page2.evaluate(() => navigator.userAgent);
// Both should be valid user agents
expect(ua1).toBeTruthy();
expect(ua2).toBeTruthy();
await page1.close();
await page2.close();
} finally {
for (const context of contexts) {
await context.close();
}
}
});
test('should type with realistic mistakes and corrections', async ({ browser }) => {
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
await page.goto('https://www.google.com');
await randomDelay(1000, 1500);
const searchBox = 'textarea[name="q"], input[name="q"]';
await page.waitForSelector(searchBox);
// Type with high mistake chance for testing
await humanClick(page, searchBox);
await humanType(page, searchBox, 'testing human behavior', {
minDelay: 50,
maxDelay: 120,
mistakes: 0.1 // 10% mistake rate for testing
});
// Get the input value
const value = await page.inputValue(searchBox);
// Should contain the text (might have slight variations due to mistakes)
expect(value.toLowerCase()).toContain('testing');
expect(value.toLowerCase()).toContain('behavior');
} finally {
await page.close();
await context.close();
}
});
});
test.describe('Google Alert validation examples', () => {
test('should validate a simple Google Alert query', async ({ browser }) => {
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
// Test query for MacBook repair in Toronto
const query = '"macbook repair" Toronto';
await page.goto('https://www.google.com');
await randomDelay(1000, 2000);
// Perform search
const searchBox = 'textarea[name="q"], input[name="q"]';
await humanClick(page, searchBox);
await humanType(page, searchBox, query);
await randomDelay(500, 1000);
await page.keyboard.press('Enter');
// Wait for results
await page.waitForLoadState('networkidle');
await randomDelay(1500, 2500);
// Check if we got results
const resultCount = await page.locator('div.g').count();
expect(resultCount).toBeGreaterThan(0);
// Scroll through results naturally
await humanScroll(page, { scrollCount: 2 });
// Simulate reading
await simulateReading(page, 2000);
console.log(`✅ Query "${query}" returned ${resultCount} results`);
} finally {
await page.close();
await context.close();
}
});
test('should validate Reddit-specific query', async ({ browser }) => {
const context = await getHumanizedContext(browser);
const page = await context.newPage();
try {
// Reddit-specific query
const query = 'site:reddit.com/r/toronto "laptop repair"';
await page.goto('https://www.google.com');
await randomDelay(1000, 2000);
// Perform search with human behavior
const searchBox = 'textarea[name="q"], input[name="q"]';
await humanClick(page, searchBox);
await humanType(page, searchBox, query, { mistakes: 0.03 });
await randomDelay(500, 1200);
await page.keyboard.press('Enter');
// Wait and analyze
await page.waitForLoadState('networkidle');
await randomDelay(2000, 3000);
// Natural scrolling
await humanScroll(page, {
scrollCount: 2,
minScroll: 200,
maxScroll: 500
});
// Extract results
const results = await page.evaluate(() => {
const items = document.querySelectorAll('div.g');
return Array.from(items).length;
});
console.log(`✅ Reddit query returned ${results} results`);
// A count can never be negative, so this only confirms that extraction ran without throwing
expect(results).toBeGreaterThanOrEqual(0);
} finally {
await page.close();
await context.close();
}
});
});