Canadian Repair RSS Feed Monitor - Development Guide

For Developers: Technical documentation for maintaining and extending the RSS feed generation system.

🏗️ Project Architecture

Directory Structure

rss-feedmonitor/
├── README.md                           # User-facing documentation
├── DEVELOPMENT.md                      # This technical guide
├── .gitignore                         # Git ignore rules
├── docs/                              # User documentation
│   ├── WORKFLOW.md
│   ├── QUICK_CREATE_GUIDE.md
│   ├── KEYWORD_OPTIMIZATION.md
│   ├── canadian-subreddits.md
│   └── canadian-repair-searches.md
├── scripts/                           # Python generation scripts
│   ├── generate_modular_rss_feeds.py
│   ├── generate_optimized_rss_feeds.py
│   ├── generate_practical_rss_feeds.py
│   ├── generate_rss_feeds.py
│   ├── extract_website_keywords.py
│   └── update_keywords_from_website.py
├── data/                              # Source data files
│   ├── repair_keywords.json
│   ├── canadian_subreddits.json
│   └── remaining_queries.txt
├── feeds/                             # Generated RSS feeds
│   ├── rss-feeds.json
│   ├── *.md (generated RSS docs)
│   └── *.opml (RSS reader imports)
└── archive/                           # Old/deprecated files
    ├── ALERT_CREATION_PROCESS.md
    └── ALL_SEARCH_LINKS_COMPLETE.txt

🔧 Development Setup

Prerequisites

  • Python 3.8+
  • No external dependencies for core functionality
  • Optional: pyyaml for advanced keyword extraction

Installation

# Clone the repository
git clone <repository-url>
cd rss-feedmonitor

# No pip installs required for basic functionality
# Optional: pip install pyyaml (for extract_website_keywords.py)

📊 Data Sources

repair_keywords.json

Purpose: Defines all repair keyword categories and search terms.

Structure:

{
  "categories": {
    "iphone_repairs": {
      "name": "iPhone Repair Requests",
      "description": "...",
      "devices": ["iPhone", "iPhone 12", ...],
      "problems": ["repair", "fix", "broken", ...]
    }
  },
  "additional_keywords": {
    "urgency_indicators": ["emergency", "urgent", ...],
    "location_indicators": ["local", "near me", ...]
  }
}
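
A loader can check this shape before any feeds are generated. The sketch below is a minimal example that assumes the field names shown above; `validate_keywords` is a hypothetical helper, not a function taken from the project's scripts.

```python
import json

REQUIRED_FIELDS = ("name", "description", "devices", "problems")

def validate_keywords(data):
    """Return a list of problems found in a repair_keywords-style dict."""
    categories = data.get("categories")
    if not isinstance(categories, dict) or not categories:
        return ["'categories' must be a non-empty object"]
    errors = []
    for key, category in categories.items():
        for field in REQUIRED_FIELDS:
            if field not in category:
                errors.append(f"{key}: missing '{field}'")
        # devices and problems feed the search query, so they must be non-empty lists
        for field in ("devices", "problems"):
            value = category.get(field)
            if field in category and (not isinstance(value, list) or not value):
                errors.append(f"{key}: '{field}' must be a non-empty list")
    return errors

# In-memory sample mirroring the structure above (values are illustrative).
sample = json.loads("""
{
  "categories": {
    "iphone_repairs": {
      "name": "iPhone Repair Requests",
      "description": "Sample description",
      "devices": ["iPhone", "iPhone 12"],
      "problems": ["repair", "fix", "broken"]
    }
  }
}
""")
print(validate_keywords(sample))
```

An empty list means the basic checks passed; any other result names the category and field that need attention.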

canadian_subreddits.json

Purpose: Defines Canadian subreddits with metadata.

Structure:

{
  "priorities": {
    "critical": {
      "subreddits": [
        {
          "name": "toronto",
          "province": "ON",
          "population": "2.9M",
          "priority_score": 10
        }
      ]
    }
  }
}
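
One way to walk this nested structure is to flatten it into (priority, subreddit) pairs. The following is a sketch under the assumption that every entry carries the fields shown above; `iter_subreddits` is a hypothetical name.

```python
def iter_subreddits(data):
    """Yield (priority, subreddit_dict) pairs from a canadian_subreddits-style dict."""
    for priority, group in data.get("priorities", {}).items():
        for subreddit in group.get("subreddits", []):
            yield priority, subreddit

# Illustrative data in the same shape as canadian_subreddits.json.
sample = {
    "priorities": {
        "medium": {"subreddits": [
            {"name": "newcity", "province": "AB", "population": "500K", "priority_score": 5},
        ]},
        "critical": {"subreddits": [
            {"name": "toronto", "province": "ON", "population": "2.9M", "priority_score": 10},
        ]},
    }
}

# Rank subreddits so the highest priority_score comes first.
ranked = sorted(iter_subreddits(sample), key=lambda pair: pair[1]["priority_score"], reverse=True)
names = [sub["name"] for _, sub in ranked]
print(names)
```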

🛠️ RSS Generation Scripts

generate_modular_rss_feeds.py (Primary)

Purpose: Main RSS feed generation script.

Features:

  • Reads from data/ source files
  • Generates both Markdown and OPML outputs
  • Modular design for easy maintenance
  • Handles keyword categorization automatically

Usage:

cd scripts
python3 generate_modular_rss_feeds.py

Output:

  • feeds/rss_feeds_[timestamp].md - Human-readable RSS feed documentation
  • feeds/rss_feeds_[timestamp].opml - RSS reader import file

Keyword Update Scripts

update_keywords_from_website.py

Purpose: Manually update keywords from motherboardrepair.ca.

Usage:

cd scripts
python3 update_keywords_from_website.py

extract_website_keywords.py

Purpose: Extract keywords from website YAML/CSV files (requires pyyaml).

Usage:

pip install pyyaml
cd scripts
python3 extract_website_keywords.py

🔄 RSS Feed Generation Process

1. Keyword Processing

# Load keywords from data/repair_keywords.json
keywords = load_keywords()

# For each category (iphone_repairs, macbook_repairs, etc.)
for category, data in keywords["categories"].items():
    # Extract devices and problems
    devices = data["devices"]
    problems = data["problems"]

    # Generate search query: (device1 OR device2) AND (problem1 OR problem2)
    search_query = build_search_query(devices, problems)
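
The snippet above does not show the body of `build_search_query`. A minimal sketch of what such a function might look like follows; it quotes only multi-word terms, which is an assumption, and the project's real implementation may differ.

```python
def quote_term(term):
    """Wrap multi-word terms in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term

def build_search_query(devices, problems):
    """Combine terms as (device1 OR device2) AND (problem1 OR problem2)."""
    device_group = " OR ".join(quote_term(d) for d in devices)
    problem_group = " OR ".join(quote_term(p) for p in problems)
    return f"({device_group}) AND ({problem_group})"

query = build_search_query(["iPhone", "iPhone 12"], ["repair", "broken"])
print(query)
```

Keeping the quoting rule in its own helper makes it easy to switch to quoting every term if Reddit's search treats bare words too loosely.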

2. URL Generation

# Reddit search RSS format
import urllib.parse

base_url = "https://www.reddit.com/r/{}/search.rss?q={}&sort=new&type=link"

# URL-encode the search query; safe="" also encodes "/" so it cannot be read as a path separator
encoded_query = urllib.parse.quote(search_query, safe="")
rss_url = base_url.format(subreddit_name, encoded_query)
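
The same step can be wrapped in a helper that lets the standard library assemble the query string. This is a sketch only; `build_rss_url` is a hypothetical name and the project's script may build its URLs differently.

```python
from urllib.parse import quote, urlencode

def build_rss_url(subreddit_name, search_query):
    """Build a Reddit search RSS URL in the format shown above."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    params = urlencode(
        {"q": search_query, "sort": "new", "type": "link"},
        quote_via=quote,
    )
    return f"https://www.reddit.com/r/{quote(subreddit_name, safe='')}/search.rss?{params}"

print(build_rss_url("toronto", "iPhone repair"))
```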

3. Output Generation

Markdown Output

  • Hierarchical structure by priority/city/category
  • Search queries and RSS URLs for each feed
  • Device and problem breakdowns
  • Implementation guidance

OPML Output

  • XML format for RSS reader bulk import
  • Nested outlines by priority/subreddit/category
  • RSS XML URLs with proper encoding
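
The nested outline structure can be produced with `xml.etree.ElementTree`, which also escapes `&` in feed URLs automatically. The sketch below assumes each feed is a dict with `priority`, `subreddit`, `category_name` and `rss_url` keys; `priority` and `rss_url` are illustrative names, and `build_opml` is hypothetical.

```python
import xml.etree.ElementTree as ET

def build_opml(title, feeds):
    """Return OPML text with outlines nested by priority, subreddit, category."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    groups = {}  # priority -> (outline element, {subreddit -> outline element})
    for feed in feeds:
        p_key = feed["priority"]
        if p_key not in groups:
            groups[p_key] = (ET.SubElement(body, "outline", text=p_key), {})
        p_node, subs = groups[p_key]
        s_key = feed["subreddit"]
        if s_key not in subs:
            subs[s_key] = ET.SubElement(p_node, "outline", text=f"r/{s_key}")
        # Leaf outline: one RSS feed per category.
        ET.SubElement(subs[s_key], "outline", type="rss",
                      text=feed["category_name"], xmlUrl=feed["rss_url"])
    return ET.tostring(opml, encoding="unicode")

feeds = [{
    "priority": "critical",
    "subreddit": "toronto",
    "category_name": "iPhone Repair Requests",
    "rss_url": "https://www.reddit.com/r/toronto/search.rss?q=iPhone%20repair&sort=new&type=link",
}]
opml_text = build_opml("Canadian Repair RSS Feeds", feeds)
print(opml_text)
```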

📝 Adding New Keywords

1. Edit repair_keywords.json

{
  "categories": {
    "new_category": {
      "name": "New Device Repairs",
      "description": "New device type repair requests",
      "devices": ["Device1", "Device2"],
      "problems": ["issue1", "issue2", "issue3"]
    }
  }
}

2. Regenerate RSS Feeds

cd scripts
python3 generate_modular_rss_feeds.py

🏙️ Adding New Canadian Cities

1. Edit canadian_subreddits.json

{
  "priorities": {
    "medium": {
      "subreddits": [
        {
          "name": "newcity",
          "province": "AB",
          "population": "500K",
          "priority_score": 5
        }
      ]
    }
  }
}
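
Before regenerating, it can help to confirm that the new entry does not duplicate an existing subreddit, since a duplicate would produce the same feeds twice. A small sketch, with `find_duplicate_subreddits` as a hypothetical helper:

```python
from collections import Counter

def find_duplicate_subreddits(data):
    """Return subreddit names (lowercased) that appear more than once."""
    names = [
        sub["name"].lower()
        for group in data.get("priorities", {}).values()
        for sub in group.get("subreddits", [])
    ]
    return sorted(name for name, count in Counter(names).items() if count > 1)

# Illustrative data: "Toronto" is listed under two priorities.
sample = {
    "priorities": {
        "critical": {"subreddits": [{"name": "toronto"}]},
        "medium": {"subreddits": [{"name": "newcity"}, {"name": "Toronto"}]},
    }
}
print(find_duplicate_subreddits(sample))
```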

2. Regenerate RSS Feeds

cd scripts
python3 generate_modular_rss_feeds.py

🔍 Reddit Search RSS Format

URL Structure

https://www.reddit.com/r/[subreddit]/search.rss?q=[query]&sort=new&type=link

Query Syntax

  • AND operations: Use AND between device and problem groups
  • OR operations: Use OR within device/problem groups
  • Exact phrases: Use "quotes" for multi-word terms
  • URL encoding: All special characters must be URL-encoded

Examples

# iPhone repairs
import urllib.parse

query = '("iPhone" OR "iPhone 12") AND ("repair" OR "broken")'

# URL-encoded
encoded = urllib.parse.quote(query, safe="")
url = f"https://www.reddit.com/r/toronto/search.rss?q={encoded}&sort=new&type=link"
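
When debugging a generated feed URL, decoding it back into its parts is a quick way to confirm the encoding round-trips. A minimal sketch using only the standard library:

```python
from urllib.parse import parse_qs, quote, urlsplit

query = '("iPhone" OR "iPhone 12") AND ("repair" OR "broken")'
url = f"https://www.reddit.com/r/toronto/search.rss?q={quote(query, safe='')}&sort=new&type=link"

parts = urlsplit(url)
params = parse_qs(parts.query)

# The path looks like /r/<subreddit>/search.rss
subreddit = parts.path.split("/")[2]
decoded_query = params["q"][0]

print(subreddit)
print(decoded_query)
```

If `decoded_query` differs from the query that was fed in, the encoding step is the first place to look.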

🧪 Testing RSS Feeds

Manual Testing

  1. Copy RSS URL to browser
  2. Verify feed loads and shows recent posts
  3. Check that search results match expected keywords
  4. Test OPML import in RSS reader

Automated Testing

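# Test feed validity (requires feedparser)
pip install feedparser
python3 -c "
import feedparser
feed = feedparser.parse('YOUR_RSS_URL')
# bozo is set when the feed could not be parsed cleanly
print(f'Parse problems: {bool(feed.bozo)}')
title = feed.feed.get('title', '(no title)')
print(f'Feed title: {title}')
print(f'Entries: {len(feed.entries)}')
"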

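For an offline check that needs no extra packages, a saved feed response can be summarized with the standard library. The sketch below assumes the response is either Atom or RSS 2.0; `summarize_feed` is a hypothetical helper and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

def summarize_feed(xml_text):
    """Return (title, entry_count) for an Atom or RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    if root.tag == f"{ATOM_NS}feed":
        title = root.findtext(f"{ATOM_NS}title", default="")
        return title, len(root.findall(f"{ATOM_NS}entry"))
    channel = root.find("channel")
    if root.tag == "rss" and channel is not None:
        return channel.findtext("title", default=""), len(channel.findall("item"))
    raise ValueError(f"unrecognized feed root element: {root.tag}")

# Made-up sample standing in for a saved feed response.
sample_atom = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>sample search results</title>
  <entry><title>Cracked iPhone screen, where to repair?</title></entry>
  <entry><title>MacBook will not boot</title></entry>
</feed>"""

print(summarize_feed(sample_atom))
```
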
🚀 Deployment

Git Workflow

# Update source files
git add data/*.json
git commit -m "Update keywords/subreddits"

# Regenerate feeds
python3 scripts/generate_modular_rss_feeds.py

# Commit generated files
git add feeds/
git commit -m "Regenerate RSS feeds with updated data"

# Push changes
git push origin main

Version Control Strategy

  • Source files (data/*.json): Always commit changes
  • Generated files (feeds/*.md, *.opml): Regenerate as needed, commit for distribution
  • Scripts: Version controlled, update as needed

🐛 Troubleshooting

Common Issues

RSS Feed Not Loading:

  • Verify subreddit name is correct
  • Check if subreddit restricts RSS access
  • Ensure URL encoding is proper

No Search Results:

  • Simplify search query (Reddit search has limitations)
  • Check keyword spelling and relevance
  • Verify subreddit has active repair discussions

OPML Import Issues:

  • Validate XML structure
  • Check for special characters in URLs
  • Test with a single feed first
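
The first two checks can be automated. The sketch below is one possible approach, with `check_opml` as a hypothetical helper; it confirms the file is well-formed XML and that each feed outline carries an http(s) URL.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

def check_opml(opml_text):
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(opml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    if root.tag != "opml":
        return [f"root element is <{root.tag}>, expected <opml>"]
    problems = []
    leaves = [o for o in root.iter("outline") if o.get("xmlUrl") is not None]
    if not leaves:
        problems.append("no outline elements with an xmlUrl attribute")
    for outline in leaves:
        parts = urlsplit(outline.get("xmlUrl"))
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"suspicious xmlUrl: {outline.get('xmlUrl')}")
    return problems

good = ('<opml version="2.0"><body><outline text="a" type="rss" '
        'xmlUrl="https://www.reddit.com/r/toronto/search.rss?q=x&amp;sort=new"/>'
        '</body></opml>')
print(check_opml(good))
# An unescaped "&" in a URL is a common cause of failed imports.
print(check_opml(good.replace("&amp;", "&")))
```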

Debug Mode

Add debug prints to scripts:

# In generate_modular_rss_feeds.py
print(f"Processing {len(feeds)} feeds...")
for feed in feeds[:5]:  # Debug first 5
    print(f"  {feed['subreddit']}: {feed['category_name']}")
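
As an alternative to temporary prints, the `logging` module lets the same output be switched on and off without editing the script. A sketch, assuming each feed is a dict with the `subreddit` and `category_name` keys used above; the logger name and `log_feed_summary` are hypothetical.

```python
import logging

logger = logging.getLogger("rss_feedmonitor")

def log_feed_summary(feeds, limit=5):
    """Log the feed count and the first few feeds; return how many were listed."""
    logger.debug("Processing %d feeds...", len(feeds))
    for feed in feeds[:limit]:
        logger.debug("  %s: %s", feed["subreddit"], feed["category_name"])
    return min(len(feeds), limit)

# Set level=logging.INFO to silence the debug output.
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log_feed_summary([{"subreddit": "toronto", "category_name": "iPhone Repair Requests"}])
```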

📈 Performance Optimization

Feed Count Management

  • Current: ~322 feeds across 23 subreddits × 14 categories
  • Monitor RSS reader performance with high feed counts
  • Consider priority-based feed generation for large deployments
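
The feed count is the product of subreddits and categories, so limiting generation to high-priority subreddits shrinks it proportionally. The sketch below illustrates the arithmetic; the helper names and the threshold of 8 are assumptions for the example.

```python
def feed_count(subreddit_count, category_count):
    """Each subreddit gets one feed per keyword category."""
    return subreddit_count * category_count

def select_by_priority(subreddits, min_score):
    """Keep only subreddits whose priority_score is at least min_score."""
    return [s for s in subreddits if s["priority_score"] >= min_score]

subreddits = [
    {"name": "toronto", "priority_score": 10},
    {"name": "newcity", "priority_score": 5},
]

print(feed_count(23, 14))  # full deployment
selected = select_by_priority(subreddits, min_score=8)
print(feed_count(len(selected), 14))  # high-priority subreddits only
```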

Update Frequency

  • Daily: Regenerate feeds for latest subreddit activity
  • Weekly: Update keywords based on lead quality analysis
  • Monthly: Add new cities/subreddits as market expands

🔒 Security Considerations

  • No API keys or authentication required (Reddit RSS is public)
  • Source files contain only public subreddit information
  • Generated RSS URLs are safe for public distribution
  • No sensitive data stored in repository

📚 Related Documentation

  • docs/WORKFLOW.md - User-facing workflow guide
  • docs/QUICK_CREATE_GUIDE.md - Fast RSS feed creation
  • docs/canadian-repair-searches.md - Search strategy details
  • README.md - Project overview for users