Compare commits: f2d7ba2f5d...7968da9b60 (2 commits)

| Author | SHA1 | Date |
|---|---|---|
| | 7968da9b60 | |
| | 467b7dcd1a | |
@@ -3,8 +3,8 @@
|
|||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<meta name="description" content="ScanSnap WebDAV Service - High-performance receipt digitization for buildersclub.ca">
|
||||
<title>ScanSnap WebDAV Service - Colin Knapp Portfolio</title>
|
||||
<meta name="description" content="ScanSnap Scanner Service - High-performance receipt digitization for buildersclub.ca">
|
||||
<title>ScanSnap Scanner Service - Colin Knapp Portfolio</title>
|
||||
<link rel="icon" type="image/x-icon" href="../favicon.ico">
|
||||
<link rel="stylesheet" href="../styles.css" integrity="sha256-Y+6RTuKMnPfNa1TjCQCcFhxwo0G+xNy7t1MaAvn5SuU=">
|
||||
<script src="../theme.js" integrity="sha256-+dDNTo7WAOmn2YC875+vn9oH4UkMwlVOGlARp2uq3A4="></script>
|
||||
|
@@ -19,222 +19,90 @@
|
|||
<a href="../index.html">← Back to Portfolio</a>
|
||||
</nav>
|
||||
|
||||
<h1>ScanSnap WebDAV Service for buildersclub.ca</h1>
|
||||
<h1>ScanSnap Scanner Service for buildersclub.ca</h1>
|
||||
|
||||
<div class="project-meta">
|
||||
<p><strong>Timeframe:</strong> 2025-Present</p>
|
||||
<p><strong>Role:</strong> Full-Stack Developer & DevOps Engineer</p>
|
||||
<p><strong>Technologies:</strong> Python, WebDAV, WsgiDAV, macOS Integration</p>
|
||||
<p><strong>Role:</strong> Developer</p>
|
||||
<p><strong>Technologies:</strong> Python, macOS Integration</p>
|
||||
<p><strong>Client:</strong> <a href="https://buildersclub.ca" target="_blank" rel="noopener noreferrer">buildersclub.ca</a></p>
|
||||
</div>
|
||||
|
||||
<hr>
|
||||
|
||||
<section class="project-overview">
|
||||
<h2>The Challenge</h2>
|
||||
<h2>Scanner Service Overview</h2>
|
||||
<p>
|
||||
Running a business means dealing with receipts. Lots of them. And for buildersclub.ca members juggling multiple projects,
|
||||
managing receipt documentation was becoming a serious time sink. Traditional scanning workflows involved multiple steps:
|
||||
scan, save, organize, upload. Multiply that by dozens of receipts, and you're looking at hours of manual work every week.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
I needed a solution that could handle the club's Fujitsu ScanSnap iX1500 scanner—a beast of a machine capable of digitizing
|
||||
50 receipts at nearly one scan per second—but without the usual friction of file management systems.
|
||||
For buildersclub.ca members, I created a simple network scanner endpoint that makes digitizing receipts and documents fast and easy.
|
||||
The service is available at <a href="http://192.168.0.119:9876" target="_blank">http://192.168.0.119:9876</a> on the clubhouse network.
|
||||
</p>
|
||||
|
||||
<div class="highlight-box">
|
||||
<h3>What We Built</h3>
|
||||
<p>
|
||||
A custom WebDAV server optimized specifically for high-speed document scanning. Load 50 receipts, hit scan,
|
||||
and watch them all digitize in under a minute. Files are immediately accessible via macOS Finder (just like
|
||||
a network drive), with automatic daily cleanup to prevent storage bloat. Zero maintenance required.
|
||||
</p>
|
||||
<h3>Key Features</h3>
|
||||
<ul>
|
||||
<li><strong>Processing Speed:</strong> ~1 receipt per second</li>
|
||||
<li><strong>Fast Processing:</strong> ~1 receipt per second</li>
|
||||
<li><strong>Batch Capacity:</strong> Up to 50 documents at once</li>
|
||||
<li><strong>File Access:</strong> Native Finder integration</li>
|
||||
<li><strong>Cleanup:</strong> Automated daily at 3:00 AM</li>
|
||||
<li><strong>Network Protocol:</strong> WebDAV compliance classes 1 and 2</li>
|
||||
<li><strong>Simple Access:</strong> Just press Command+K in Finder and enter the URL</li>
|
||||
<li><strong>Automatic Cleanup:</strong> Files are automatically removed at 3:00 AM daily</li>
|
||||
<li><strong>Zero Maintenance:</strong> No user management required</li>
|
||||
</ul>
|
||||
<p class="highlight-note">
|
||||
<strong>For buildersclub.ca members:</strong> <a href="http://192.168.0.119:9876" target="_blank">Access the scanner service here</a> (clubhouse network only)
|
||||
<strong>Access URL:</strong> <a href="http://192.168.0.119:9876" target="_blank">http://192.168.0.119:9876</a> (clubhouse network only)
|
||||
</p>
|
||||
</div>
|
||||
</section>
|
||||
|
||||
<section class="technical-story">
|
||||
<h2>The Technical Journey</h2>
|
||||
<section class="real-world-impact">
|
||||
<h2>How It Works</h2>
|
||||
|
||||
<h3>Simple Network Scanner Access</h3>
|
||||
<p>
|
||||
The system provides a straightforward network location where the ScanSnap scanner can send documents directly.
|
||||
Just connect with Command+K in Finder, enter the URL, and you have instant access to a network drive ready for scanning.
|
||||
</p>
|
||||
<p>
|
||||
This creates a seamless experience: load your documents, hit scan, and they're immediately available on your computer
without any additional steps or software.
|
||||
</p>
|
||||
<h3>Simple Setup</h3>
|
||||
<ol>
|
||||
<li>Connect to the clubhouse network</li>
|
||||
<li>Press Command+K in Finder</li>
|
||||
<li>Enter <code>http://192.168.0.119:9876</code></li>
|
||||
<li>Click "Connect"</li>
|
||||
<li>The scanner folder appears in Finder</li>
|
||||
</ol>
|
||||
|
||||
<h3>Security Without the Headache</h3>
|
||||
<p>
|
||||
Here's the thing about receipt scanners: you want them to be fast and frictionless. Authentication dialogs kill that flow.
|
||||
But you also can't just leave a wide-open file server exposed to the internet.
|
||||
</p>
|
||||
<p>
|
||||
The solution is custom permissions at the protocol level. The scanner can upload files and delete them when needed,
but it can't create subdirectories or move, copy, or rename anything. The provider is also rooted at
<code>~/scansnap-dav/scans</code>, so requests cannot reach files outside that directory.
|
||||
</p>
|
||||
<pre><code>from wsgidav.dav_error import DAVError
from wsgidav.fs_dav_provider import FilesystemProvider

class ScanSnapProvider(FilesystemProvider):
    def create_collection(self, path):
        # No creating subdirectories
        raise DAVError(403, "Creating directories not allowed")
|
||||
<h3>Scanning Process</h3>
|
||||
<ol>
|
||||
<li>Load documents into the ScanSnap scanner</li>
|
||||
<li>Select the network folder as the destination</li>
|
||||
<li>Press scan</li>
|
||||
<li>Documents appear in the folder instantly</li>
|
||||
<li>Copy or move files as needed</li>
|
||||
</ol>
|
||||
|
||||
    def copy_resource(self, src_path, dest_path, depth):
        # No copying files around
        raise DAVError(403, "Copying not allowed")

    def move_resource(self, src_path, dest_path):
        # No moving or renaming
        raise DAVError(403, "Moving/renaming not allowed")</code></pre>
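<p>The directory isolation described above can be illustrated with a small standard-library sketch. This is not the service's code: <code>resolve_inside</code> is a hypothetical helper, and it only shows the general idea of resolving a requested path and rejecting anything that lands outside the scans root.</p>

```python
from pathlib import Path


def resolve_inside(root: Path, requested: str) -> Path:
    """Return the absolute path for `requested`, or raise if it escapes `root`.

    Hypothetical helper for illustration; the real provider may enforce
    isolation differently.
    """
    root = root.resolve()
    # Strip leading slashes so the request is always joined relative to root.
    candidate = (root / requested.lstrip("/")).resolve()
    # After resolving ".." segments and symlinks, the result must be the root
    # itself or something underneath it.
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{requested!r} is outside the scans directory")
    return candidate
```

<p>A request such as <code>/receipt.pdf</code> resolves to a file under the root, while <code>../outside.pdf</code> raises <code>PermissionError</code>.</p>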
|
||||
<p>
|
||||
For the clubhouse environment, this works perfectly. It's on a trusted network, accessible only to members,
|
||||
and the restricted permissions mean there's no risk of accidentally messing up the file system.
|
||||
<strong>Note:</strong> All files are automatically deleted at 3:00 AM daily to keep the system clean.
|
||||
Make sure to copy important files to your own storage before then.
|
||||
</p>
|
||||
|
||||
<h3>The Storage Problem Nobody Thinks About</h3>
|
||||
<p>
|
||||
When you're scanning 50 receipts at a time, storage fills up fast. Even with PDF compression, you're looking at
|
||||
several megabytes per scan session. Do that a few times a day, and suddenly you're managing gigabytes of receipt data.
|
||||
</p>
|
||||
<p>
|
||||
The fix? Automatic cleanup. Every night at 3 AM, a Python scheduler wipes the scans directory clean. Receipts
|
||||
are meant to be temporary anyway—scan them, grab what you need, move on. The cleanup runs silently in the background,
|
||||
and members never have to think about storage management.
|
||||
</p>
|
||||
<pre><code>import os
import schedule

def cleanup_scans():
    # Remove every regular file in the scans directory; subdirectories are left alone
    scans_dir = os.path.expanduser("~/scansnap-dav/scans")
    for filename in os.listdir(scans_dir):
        file_path = os.path.join(scans_dir, filename)
        if os.path.isfile(file_path):
            os.remove(file_path)

# Daily cleanup at 3:00 AM
schedule.every().day.at("03:00").do(cleanup_scans)</code></pre>
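<p>For readers who want to try the nightly cleanup idea without the third-party <code>schedule</code> package, here is a minimal standard-library sketch. It is an assumption-laden illustration, not the deployed code: <code>seconds_until</code> is a hypothetical helper, and <code>cleanup_scans</code> takes the directory as a parameter so it can be pointed at a test folder.</p>

```python
import os
from datetime import datetime, timedelta


def cleanup_scans(scans_dir: str) -> int:
    """Delete regular files directly inside scans_dir; return how many were removed."""
    removed = 0
    for filename in os.listdir(scans_dir):
        file_path = os.path.join(scans_dir, filename)
        # Only regular files are removed; subdirectories are skipped.
        if os.path.isfile(file_path):
            os.remove(file_path)
            removed += 1
    return removed


def seconds_until(hour: int, minute: int, now: datetime) -> float:
    """Seconds from `now` until the next occurrence of hour:minute (hypothetical helper)."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()
```

<p>A background loop could sleep for <code>seconds_until(3, 0, datetime.now())</code> and then call <code>cleanup_scans</code> on the scans directory, repeating each day.</p>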
|
||||
</section>
|
||||
|
||||
<section class="real-world-impact">
|
||||
<h2>Real-World Impact</h2>
|
||||
<h2>Benefits</h2>
|
||||
|
||||
<h3>From Hours to Minutes</h3>
|
||||
<p>
|
||||
Before this system, processing a week's worth of receipts meant:
|
||||
</p>
|
||||
<ol>
|
||||
<li>Scan receipts one by one (or in small batches)</li>
|
||||
<li>Wait for files to save to the local machine</li>
|
||||
<li>Open file manager and organize scans</li>
|
||||
<li>Upload to cloud storage or accounting software</li>
|
||||
<li>Clean up local copies to free up space</li>
|
||||
</ol>
|
||||
<p>
|
||||
That's easily 20-30 minutes of manual work for a typical batch of receipts.
|
||||
</p>
|
||||
<p>
|
||||
Now? Load the scanner hopper, hit scan, wait 60 seconds, grab the PDFs from Finder. Done. The time savings
|
||||
are dramatic—what used to take half an hour now takes maybe two minutes.
|
||||
</p>
|
||||
|
||||
<h3>The Numbers</h3>
|
||||
<ul>
|
||||
<li><strong>Time Reduction:</strong> 95% decrease in manual document processing</li>
|
||||
<li><strong>Batch Efficiency:</strong> 50 receipts in under 60 seconds</li>
|
||||
<li><strong>Storage Overhead:</strong> Zero (automated cleanup handles everything)</li>
|
||||
<li><strong>User Training Required:</strong> Literally just "Command+K, enter the URL"</li>
|
||||
<li><strong>Time Savings:</strong> 95% reduction in document processing time</li>
|
||||
<li><strong>Efficiency:</strong> Process 50 receipts in under 60 seconds</li>
|
||||
<li><strong>Simplicity:</strong> No special software or training needed</li>
|
||||
<li><strong>Reliability:</strong> Automatic maintenance keeps the system running smoothly</li>
|
||||
</ul>
|
||||
|
||||
<h3>Why It Works</h3>
|
||||
<p>
|
||||
The beauty of this solution is its simplicity. There's no complex web interface, no database, no authentication system
|
||||
to maintain. It's just a WebDAV endpoint that does exactly what the scanner needs and nothing more.
|
||||
</p>
|
||||
<p>
|
||||
For buildersclub.ca members, it means one less thing to think about. Receipts get scanned, files are immediately
|
||||
available, and storage never becomes an issue. The system just works, quietly and reliably, in the background.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<section class="technical-details">
|
||||
<h2>Under the Hood</h2>
|
||||
|
||||
<h3>The Tech Stack</h3>
|
||||
<ul>
|
||||
<li><strong>Server Framework:</strong> WsgiDAV with Cheroot WSGI server</li>
|
||||
<li><strong>Language:</strong> Python 3.13+</li>
|
||||
<li><strong>Automation:</strong> Python schedule library for cleanup</li>
|
||||
<li><strong>macOS Integration:</strong> launchd for auto-start on boot</li>
|
||||
<li><strong>Protocol:</strong> WebDAV with macOS-specific optimizations</li>
|
||||
</ul>
|
||||
|
||||
<h3>Key Configuration</h3>
|
||||
<pre><code>config = {
    "host": "0.0.0.0",
    "port": 9876,
    "provider_mapping": {
        "/": ScanSnapProvider(scans_dir)
    },
    "hotfixes": {
        "emulate_win32_lastmod": True,
        "unquote_path_info": True,
        "win_accept_anonymous": True,
    },
    "property_manager": True,
    "lock_storage": True,
}</code></pre>
|
||||
|
||||
<h3>Security Considerations</h3>
|
||||
<ul>
|
||||
<li><strong>Network Scope:</strong> Clubhouse network only, no internet exposure</li>
|
||||
<li><strong>File Isolation:</strong> Cannot access anything outside the scans directory</li>
|
||||
<li><strong>Operation Restrictions:</strong> Upload, read, and delete only—no move/copy/rename</li>
|
||||
<li><strong>Authentication:</strong> None required (trusted network environment)</li>
|
||||
</ul>
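<p>The operation restrictions listed above can be pictured as an allow-list over WebDAV request methods. The sketch below is illustrative only: the exact method sets are an assumption (for example, <code>LOCK</code> and <code>UNLOCK</code> are included because the configuration enables lock storage), and <code>is_allowed</code> is a hypothetical helper rather than part of WsgiDAV.</p>

```python
# Assumed method sets for illustration; the real service enforces its
# restrictions inside the DAV provider rather than with a lookup like this.
ALLOWED_METHODS = frozenset(
    {"OPTIONS", "PROPFIND", "GET", "HEAD", "PUT", "DELETE", "LOCK", "UNLOCK"}
)
BLOCKED_METHODS = frozenset({"MKCOL", "COPY", "MOVE"})


def is_allowed(method: str) -> bool:
    """Return True if the request method fits the upload/read/delete-only policy."""
    name = method.upper()
    if name in BLOCKED_METHODS:
        return False
    # Anything not explicitly allowed is refused as well.
    return name in ALLOWED_METHODS
```

<p>Under these assumptions, <code>is_allowed("PUT")</code> is true, while <code>is_allowed("MOVE")</code> and any unlisted method are refused.</p>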
|
||||
</section>
|
||||
|
||||
<section class="lessons-learned">
|
||||
<h2>Lessons Learned</h2>
|
||||
|
||||
<h3>Sometimes Simple is Better</h3>
|
||||
<p>
|
||||
I could have built a full web application with user accounts, file organization features, OCR processing,
|
||||
automatic categorization, cloud sync... but none of that was actually needed. The scanner needed a place to
|
||||
dump files quickly, and users needed to grab those files easily. Mission accomplished with a fraction of the complexity.
|
||||
</p>
|
||||
|
||||
<h3>Simple Network Integration</h3>
|
||||
<p>
|
||||
The solution integrates directly with macOS Finder, making it immediately familiar to users without requiring
|
||||
any special software or training. Connect once, and the scanner endpoint is always ready to receive your documents.
|
||||
</p>
|
||||
|
||||
<h3>Automatic Cleanup Changes Everything</h3>
|
||||
<p>
|
||||
The daily cleanup feature turned this from a "nice to have" into a "set it and forget it" solution. Nobody
|
||||
thinks about storage, nobody worries about running out of space, and the system stays lean indefinitely.
|
||||
This simple solution dramatically reduces the time buildersclub.ca members spend on receipt management,
|
||||
allowing them to focus on their projects instead of paperwork.
|
||||
</p>
|
||||
</section>
|
||||
|
||||
<hr>
|
||||
|
||||
<div class="project-links">
|
||||
<h3>Related Links</h3>
|
||||
<h3>Quick Links</h3>
|
||||
<ul>
|
||||
<li><a href="../index.html">← Back to Portfolio</a></li>
|
||||
<li><a href="https://buildersclub.ca" target="_blank" rel="noopener noreferrer">buildersclub.ca</a></li>
|
||||
<li><strong>For Members:</strong> <a href="http://192.168.0.119:9876" target="_blank">Scanner Service Access</a> (clubhouse network)</li>
|
||||
<li><a href="https://github.com/mar10/wsgidav" target="_blank" rel="noopener noreferrer">WsgiDAV Framework</a></li>
|
||||
<li><a href="https://www.fujitsu.com/us/products/computing/peripheral/scanners/scansnap/" target="_blank" rel="noopener noreferrer">Fujitsu ScanSnap Scanners</a></li>
|
||||
<li><strong>Scanner Access:</strong> <a href="http://192.168.0.119:9876" target="_blank">http://192.168.0.119:9876</a> (clubhouse network)</li>
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||