Canadian Repair RSS Feed Monitor - Development Guide
For Developers: Technical documentation for maintaining and extending the RSS feed generation system.
🏗️ Project Architecture
Directory Structure
rss-feedmonitor/
├── README.md # User-facing documentation
├── DEVELOPMENT.md # This technical guide
├── .gitignore # Git ignore rules
├── docs/ # User documentation
│ ├── WORKFLOW.md
│ ├── QUICK_CREATE_GUIDE.md
│ ├── KEYWORD_OPTIMIZATION.md
│ ├── canadian-subreddits.md
│ └── canadian-repair-searches.md
├── scripts/ # Python generation scripts
│ ├── generate_modular_rss_feeds.py
│ ├── generate_optimized_rss_feeds.py
│ ├── generate_practical_rss_feeds.py
│ ├── generate_rss_feeds.py
│ ├── extract_website_keywords.py
│ └── update_keywords_from_website.py
├── data/ # Source data files
│ ├── repair_keywords.json
│ ├── canadian_subreddits.json
│ └── remaining_queries.txt
├── feeds/ # Generated RSS feeds
│ ├── rss-feeds.json
│ ├── *.md (generated RSS docs)
│ └── *.opml (RSS reader imports)
└── archive/ # Old/deprecated files
  ├── ALERT_CREATION_PROCESS.md
  └── ALL_SEARCH_LINKS_COMPLETE.txt
🔧 Development Setup
Prerequisites
- Python 3.8+
- No external dependencies for core functionality
- Optional: pyyaml for advanced keyword extraction
Installation
# Clone the repository
git clone <repository-url>
cd rss-feedmonitor
# No pip installs required for basic functionality
# Optional: pip install pyyaml (for extract_website_keywords.py)
📊 Data Sources
repair_keywords.json
Purpose: Defines all repair keyword categories and search terms.
Structure:
{
  "categories": {
    "iphone_repairs": {
      "name": "iPhone Repair Requests",
      "description": "...",
      "devices": ["iPhone", "iPhone 12", ...],
      "problems": ["repair", "fix", "broken", ...]
    }
  },
  "additional_keywords": {
    "urgency_indicators": ["emergency", "urgent", ...],
    "location_indicators": ["local", "near me", ...]
  }
}
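A minimal sketch of reading this file, assuming the structure shown above. The name load_keywords matches the call used later in this guide, but the body here is an illustration rather than the project's actual code; the default path and the helper summarize_categories are assumptions.

```python
import json
from pathlib import Path


def load_keywords(path="data/repair_keywords.json"):
    """Read the keyword file and return the parsed dictionary."""
    # The default path is an assumption based on the directory layout.
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def summarize_categories(keywords):
    """Return (category_id, device_count, problem_count) for each category."""
    # summarize_categories is a hypothetical helper, not part of the scripts.
    return [
        (category_id, len(data.get("devices", [])), len(data.get("problems", [])))
        for category_id, data in keywords.get("categories", {}).items()
    ]
```

Calling summarize_categories(load_keywords()) from the repository root gives a quick overview of how many devices and problems each category contributes to its search query.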
canadian_subreddits.json
Purpose: Defines Canadian subreddits with metadata.
Structure:
{
  "priorities": {
    "critical": {
      "subreddits": [
        {
          "name": "toronto",
          "province": "ON",
          "population": "2.9M",
          "priority_score": 10
        }
      ]
    }
  }
}
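The nesting (priority level, then a list of subreddit entries) can be flattened before feeds are generated. This is a sketch under the assumption that the file follows the structure above; iter_subreddits and sorted_by_score are hypothetical helper names.

```python
def iter_subreddits(config):
    """Yield (priority_level, subreddit_entry) pairs from the parsed file."""
    for level, group in config.get("priorities", {}).items():
        for entry in group.get("subreddits", []):
            yield level, entry


def sorted_by_score(config):
    """Return subreddit names ordered from highest to lowest priority_score."""
    entries = [entry for _, entry in iter_subreddits(config)]
    # Entries without a score are treated as 0 (an assumption for this sketch).
    entries.sort(key=lambda entry: entry.get("priority_score", 0), reverse=True)
    return [entry["name"] for entry in entries]
```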
🛠️ RSS Generation Scripts
generate_modular_rss_feeds.py (Primary)
Purpose: Main RSS feed generation script.
Features:
- Reads from data/ source files
- Generates both Markdown and OPML outputs
- Modular design for easy maintenance
- Handles keyword categorization automatically
Usage:
cd scripts
python3 generate_modular_rss_feeds.py
Output:
- feeds/rss_feeds_[timestamp].md - Human-readable RSS feed documentation
- feeds/rss_feeds_[timestamp].opml - RSS reader import file
Keyword Update Scripts
update_keywords_from_website.py
Purpose: Manually update keywords from motherboardrepair.ca.
Usage:
cd scripts
python3 update_keywords_from_website.py
extract_website_keywords.py
Purpose: Extract keywords from website YAML/CSV files (requires pyyaml).
Usage:
pip install pyyaml
cd scripts
python3 extract_website_keywords.py
🔄 RSS Feed Generation Process
1. Keyword Processing
# Load keywords from data/repair_keywords.json
keywords = load_keywords()
# For each category (iphone_repairs, macbook_repairs, etc.)
for category, data in keywords["categories"].items():
    # Extract devices and problems
    devices = data["devices"]
    problems = data["problems"]
    # Generate search query: (device1 OR device2) AND (problem1 OR problem2)
    search_query = build_search_query(devices, problems)
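The guide does not show the body of build_search_query. The following is one possible sketch, assuming every term is quoted and the two groups are joined exactly as the comment above describes; quote_term is a hypothetical helper.

```python
def quote_term(term):
    """Wrap a term in double quotes so multi-word terms match as a phrase."""
    # Embedded double quotes are dropped to keep the query well-formed.
    return '"{}"'.format(term.replace('"', ""))


def build_search_query(devices, problems):
    """Combine devices and problems as (d1 OR d2) AND (p1 OR p2)."""
    if not devices or not problems:
        raise ValueError("both devices and problems must be non-empty")
    device_group = " OR ".join(quote_term(d) for d in devices)
    problem_group = " OR ".join(quote_term(p) for p in problems)
    return "({}) AND ({})".format(device_group, problem_group)
```

Rejecting empty lists avoids emitting a query such as `() AND ("repair")`, which would be unlikely to return useful results.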
2. URL Generation
import urllib.parse

# Reddit search RSS format
base_url = "https://www.reddit.com/r/{}/search.rss?q={}&sort=new&type=link"
# URL encode the search query
encoded_query = urllib.parse.quote(search_query)
rss_url = base_url.format(subreddit_name, encoded_query)
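The same step can be wrapped in a small function. This is a sketch rather than the script's actual code: build_rss_url is a hypothetical name, and passing safe="" is a choice made here so that "/" is percent-encoded as well (urllib.parse.quote leaves it unencoded by default).

```python
import urllib.parse

BASE_URL = "https://www.reddit.com/r/{}/search.rss?q={}&sort=new&type=link"


def build_rss_url(subreddit_name, search_query):
    """Return the search RSS URL for one subreddit and one query."""
    # safe="" percent-encodes every reserved character, including "/" and "&".
    encoded_query = urllib.parse.quote(search_query, safe="")
    return BASE_URL.format(subreddit_name, encoded_query)
```

Encoding "&" matters because an unencoded ampersand inside the query would be read as the start of the next URL parameter.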
3. Output Generation
Markdown Output
- Hierarchical structure by priority/city/category
- Search queries and RSS URLs for each feed
- Device and problem breakdowns
- Implementation guidance
OPML Output
- XML format for RSS reader bulk import
- Nested outlines by priority/subreddit/category
- RSS XML URLs with proper encoding
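A minimal sketch of OPML generation with the standard library, assuming a simplified nesting of subreddit then feed (the priority level is omitted for brevity). build_opml and its input shape are assumptions for illustration, not the generator's real interface.

```python
import xml.etree.ElementTree as ET


def build_opml(feeds, title="Canadian Repair RSS Feeds"):
    """Build an OPML 2.0 document string.

    `feeds` maps a subreddit name to a list of (feed_title, rss_url) pairs.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for subreddit, entries in feeds.items():
        # One outline per subreddit, with one child outline per feed.
        group = ET.SubElement(body, "outline", text="r/" + subreddit)
        for feed_title, rss_url in entries:
            ET.SubElement(
                group,
                "outline",
                type="rss",
                text=feed_title,
                title=feed_title,
                xmlUrl=rss_url,
            )
    # ElementTree escapes "&" in attribute values as "&amp;" when serializing.
    return ET.tostring(opml, encoding="unicode")
```

Building the tree with ElementTree rather than string concatenation means the "&" separators in each RSS URL are escaped automatically, which is a common source of OPML import failures.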
📝 Adding New Keywords
1. Edit repair_keywords.json
{
  "categories": {
    "new_category": {
      "name": "New Device Repairs",
      "description": "New device type repair requests",
      "devices": ["Device1", "Device2"],
      "problems": ["issue1", "issue2", "issue3"]
    }
  }
}
2. Regenerate RSS Feeds
cd scripts
python3 generate_modular_rss_feeds.py
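Before regenerating, a newly added category can be checked for the four fields used in the example. This is a sketch: validate_category is a hypothetical helper, and treating all four fields as required is an assumption based on the structure shown in this guide.

```python
REQUIRED_FIELDS = ("name", "description", "devices", "problems")


def validate_category(category_id, data):
    """Return a list of problems found in one category entry (empty if none)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if field not in data:
            errors.append("{}: missing field '{}'".format(category_id, field))
    for field in ("devices", "problems"):
        # Only check the shape of fields that are present; absence is reported above.
        if field in data and (not isinstance(data[field], list) or not data[field]):
            errors.append("{}: '{}' must be a non-empty list".format(category_id, field))
    return errors
```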
🏙️ Adding New Canadian Cities
1. Edit canadian_subreddits.json
{
  "priorities": {
    "medium": {
      "subreddits": [
        {
          "name": "newcity",
          "province": "AB",
          "population": "500K",
          "priority_score": 5
        }
      ]
    }
  }
}
2. Regenerate RSS Feeds
cd scripts
python3 generate_modular_rss_feeds.py
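When a city is added, it is easy to list the same subreddit under two priority levels, which would presumably produce duplicate feeds. A sketch of a duplicate check follows; find_duplicate_subreddits is a hypothetical helper, and comparing names case-insensitively is an assumption.

```python
from collections import Counter


def find_duplicate_subreddits(config):
    """Return subreddit names that appear more than once, ignoring case."""
    names = [
        entry["name"].lower()
        for group in config.get("priorities", {}).values()
        for entry in group.get("subreddits", [])
    ]
    return sorted(name for name, count in Counter(names).items() if count > 1)
```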
🔍 Reddit Search RSS Format
URL Structure
https://www.reddit.com/r/[subreddit]/search.rss?q=[query]&sort=new&type=link
Query Syntax
- AND operations: Use AND between device and problem groups
- OR operations: Use OR within device/problem groups
- Exact phrases: Use "quotes" for multi-word terms
- URL encoding: All special characters must be URL-encoded
Examples
import urllib.parse

# iPhone repairs
query = '("iPhone" OR "iPhone 12") AND ("repair" OR "broken")'
# URL encoded
encoded = urllib.parse.quote(query)
url = f"https://www.reddit.com/r/toronto/search.rss?q={encoded}&sort=new&type=link"
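To confirm that a generated URL still carries the intended query, the q parameter can be decoded again and compared with the original string. A small sketch using only the standard library; extract_query is a hypothetical helper name.

```python
import urllib.parse


def extract_query(rss_url):
    """Return the decoded value of the q parameter from a search RSS URL."""
    parts = urllib.parse.urlsplit(rss_url)
    params = urllib.parse.parse_qs(parts.query)
    # parse_qs returns a list of values per parameter; take the first q value.
    return params["q"][0]
```

If extract_query(url) differs from the query that was encoded, the URL was most likely assembled without encoding a reserved character such as "&".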
🧪 Testing RSS Feeds
Manual Testing
- Copy RSS URL to browser
- Verify feed loads and shows recent posts
- Check that search results match expected keywords
- Test OPML import in RSS reader
Automated Testing
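For a check that avoids third-party packages, a downloaded feed can be parsed with the standard library. This sketch assumes the feed body is an Atom document, which is what Reddit's .rss endpoints appear to return; verify that against a live feed before relying on it. summarize_feed and SAMPLE_FEED are illustrative names, and the sample content is made up.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# A made-up sample for offline testing, not real Reddit output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>toronto search results</title>
  <entry><title>Where can I get my iPhone screen repaired?</title></entry>
  <entry><title>MacBook will not turn on, any repair shops?</title></entry>
</feed>"""


def summarize_feed(xml_text):
    """Return (feed_title, entry_titles) for an Atom feed given as a string."""
    root = ET.fromstring(xml_text)
    feed_title = root.findtext("atom:title", default="", namespaces=ATOM)
    entry_titles = [
        entry.findtext("atom:title", default="", namespaces=ATOM)
        for entry in root.findall("atom:entry", ATOM)
    ]
    return feed_title, entry_titles
```

An empty list of entry titles on a feed that parses correctly usually points to the search query rather than to the feed URL.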
# Test feed validity (requires feedparser)
pip install feedparser
python3 -c "
import feedparser
feed = feedparser.parse('YOUR_RSS_URL')
print(f'Feed title: {feed.feed.title}')
print(f'Entries: {len(feed.entries)}')
"
🚀 Deployment
Git Workflow
# Update source files
git add data/*.json
git commit -m "Update keywords/subreddits"
# Regenerate feeds
python3 scripts/generate_modular_rss_feeds.py
# Commit generated files
git add feeds/
git commit -m "Regenerate RSS feeds with updated data"
# Push changes
git push origin main
Version Control Strategy
- Source files (data/*.json): Always commit changes
- Generated files (feeds/*.md, *.opml): Regenerate as needed, commit for distribution
- Scripts: Version controlled, update as needed
🐛 Troubleshooting
Common Issues
RSS Feed Not Loading:
- Verify subreddit name is correct
- Check if subreddit restricts RSS access
- Ensure URL encoding is proper
No Search Results:
- Simplify search query (Reddit search has limitations)
- Check keyword spelling and relevance
- Verify subreddit has active repair discussions
OPML Import Issues:
- Validate XML structure
- Check for special characters in URLs
- Test with a single feed first
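The first two OPML checks can be sketched with the standard XML parser, which raises an error on malformed input such as an unescaped "&" in a URL. list_opml_feed_urls is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def list_opml_feed_urls(opml_text):
    """Parse OPML text and return every xmlUrl attribute found.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed XML.
    """
    root = ET.fromstring(opml_text)
    return [node.get("xmlUrl") for node in root.iter("outline") if node.get("xmlUrl")]
```

Running this over a generated .opml file before importing it narrows a failed import down to either the file itself or the RSS reader.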
Debug Mode
Add debug prints to scripts:
# In generate_modular_rss_feeds.py
print(f"Processing {len(feeds)} feeds...")
for feed in feeds[:5]:  # Debug first 5
    print(f"  {feed['subreddit']}: {feed['category_name']}")
📈 Performance Optimization
Feed Count Management
- Current: ~322 feeds across 23 subreddits × 14 categories
- Monitor RSS reader performance with high feed counts
- Consider priority-based feed generation for large deployments
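The feed count is the product of subreddits and categories, so trimming the subreddit list is the most direct way to reduce it. A sketch of both the arithmetic and a priority filter follows, assuming entries carry the priority_score field shown earlier; both function names are hypothetical.

```python
def expected_feed_count(subreddit_count, category_count):
    """One feed is generated per (subreddit, category) pair."""
    return subreddit_count * category_count


def filter_by_priority(subreddits, min_score):
    """Keep only subreddit entries whose priority_score is at least min_score."""
    # Entries without a score are treated as 0 (an assumption for this sketch).
    return [s for s in subreddits if s.get("priority_score", 0) >= min_score]
```

With the current figures, expected_feed_count(23, 14) gives 322. Limiting generation to, say, 10 subreddits would bring that down to 140 feeds.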
Update Frequency
- Daily: Regenerate feeds for latest subreddit activity
- Weekly: Update keywords based on lead quality analysis
- Monthly: Add new cities/subreddits as market expands
🔒 Security Considerations
- No API keys or authentication required (Reddit RSS is public)
- Source files contain only public subreddit information
- Generated RSS URLs are safe for public distribution
- No sensitive data stored in repository
📚 Related Documentation
- docs/WORKFLOW.md - User-facing workflow guide
- docs/QUICK_CREATE_GUIDE.md - Fast RSS feed creation
- docs/canadian-repair-searches.md - Search strategy details
- README.md - Project overview for users