27 Aug 2004
Jessica has some interesting thoughts on combining blogs with file management. I think she has the right idea, but she's finding it hard to elaborate the details of how it would work because she's so used to using software that was just built wrong.
Here's what I mean: the concept of a "file manager" as a business/collaboration tool is broken. A file manager focuses on files, which come in different formats and byte lengths, gobble up disk quotas, sit in one folder, and can only be viewed by certain software. So the file manager's job is to help you navigate each of these things.
None of these capabilities, emphatically not a single one, belongs in frassle or any other blog community system.
Files contain information. Blogs are about sharing information. However, sharing files is a lot harder than sharing information in blogs. This is because:
1. Files come in a variety of incompatible formats.
- you often can't view them in your web browser
- they might not be easy to read for all of your users
- they're hard to excerpt and republish
- they're hard to search
2. If you have a file, it's difficult to find out:
- who created it
- who's in charge of it
- if you're looking at the latest version
- if any discussion has flowed from this information
3. Files expose you to some major risks and annoyances:
- you might get a virus
- you might run out of disk space
- you might copy a file somewhere and forget about it, and years later be unable to decide whether it is a crucial document or just some cruft you decided to hold temporarily.
If you invert each item in this list, you see the major strengths of blogging over other information sharing approaches. Specifically:
- thanks to the personal focus of blogs, you can always tell who is in charge of some post
- thanks to permalinks, you can always ensure you're looking at the latest version
- thanks to the simplicity of HTML and RSS, you can always easily excerpt, republish, and search blog posts
- thanks to comments, trackback, and Technorati/Feedster/Google, you can find out what discussion has flowed from any post
- thanks to the ubiquity and architecture of the web, you never have to risk viruses, running out of disk space, etc.
So I think the requirements for integrating file management into a blog community system are simple. Take all the things that make blogging so valuable and easy, and support a few additional input formats.
That is an easily stated but profound requirement. Let's consider the example of the PDF format, which is used for printable multi-page documents. PDF is about your best bet for preserving exact print formatting in a document while still displaying OK on screen. You've probably already viewed PDF documents in your web browser, using Adobe's excellent Acrobat Reader. This gives you some things for free:
- you can view documents through your browser without saving files (of course, you must have the plugin, which is not quite as ubiquitous as web browsers)
- you probably can't get a virus from reading a PDF file
But a few challenges remain. Let's go through the positive features of blogs and speculate on how to integrate PDF files.
- you can always tell who is in charge of some post
One way to do this with a PDF file is to link it to a particular blog post. Ideally, the blog community system would actually edit the PDF file and insert that entry's permalink at the top of the first page. This should be clickable (to load the blog post page in your browser) but should not automatically be printed in hard copy.
- you can always ensure you're looking at the latest version
PDF files may include a mechanism for checking for alternate versions in some sort of repository. If so, it should be used. If not, the link approach described above is a suitable workaround.
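To make the last two points concrete, here is a minimal sketch of the link approach, assuming the blog system stores each uploaded file as a version attached to exactly one post. The names `Post`, `FileVersion`, `attach`, `latest`, and `is_latest` are hypothetical, invented for illustration; they are not part of frassle or any real system, and the sketch does not touch the PDF itself.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class FileVersion:
    """One uploaded revision of an attached file (hypothetical model)."""
    filename: str
    uploaded_at: datetime


@dataclass
class Post:
    """A blog post that owns its attached files (hypothetical model)."""
    author: str      # answers "who is in charge of this file?"
    permalink: str   # the link that would be stamped onto the file
    versions: List[FileVersion] = field(default_factory=list)

    def attach(self, filename: str, uploaded_at: datetime) -> FileVersion:
        """Record a new revision of the file under this post."""
        version = FileVersion(filename, uploaded_at)
        self.versions.append(version)
        return version

    def latest(self) -> Optional[FileVersion]:
        """Return the most recently uploaded revision, or None if there is none."""
        if not self.versions:
            return None
        return max(self.versions, key=lambda v: v.uploaded_at)

    def is_latest(self, version: FileVersion) -> bool:
        """True if the given revision is the newest one attached to this post."""
        return version is self.latest()


# Example: two revisions of one document, both reachable from one permalink.
post = Post(author="alice", permalink="http://example.org/blog/report")
old = post.attach("report-v1.pdf", datetime(2004, 8, 1))
new = post.attach("report-v2.pdf", datetime(2004, 8, 27))
```

Anyone holding `old` can follow the permalink back to the post, see who wrote it, and learn from `post.is_latest(old)` that a newer revision exists.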
- you can always easily excerpt, republish, and search blog posts
It's really hard to do any of these on PDFs. The likely solution is to just convert them into HTML, and make the actual PDF file a secondary format primarily useful for printing. But while extracting the text from a PDF file is easy, translating the meaning of a page's design to a non-paginated medium like HTML is not something computers can do perfectly. If you're not convinced, try out Google's impression of IRS tax form 1040. Keep in mind that Google is a wealthy company with some of the best engineers in the world. They have a fairly readable HTML rendering of the form, but commentary on a form like this really requires post-it notes, highlighters, and drawings in red ink—none of which we can do or emulate well on the web.
- you can find out what discussion has flowed from any post
PDFs are bad for discussion because they are one step removed from immutable hard copy. As a result they draw few responses, and those that exist are likely to be in emails or other inaccessible media (phone calls, face-to-face meetings). So there probably isn't much discussion to be found, but we can likely just divert people onto the standard comment/response system.
- you never have to risk viruses, running out of disk space, etc.
This mostly comes for free with PDF files when you use the browser plugin.
Hmm. That's not so bad. The other question is, can we do all of these things without compromising the rest of the blogging experience? Clearly, despite our best efforts, a PDF will never be as easy to work with as a plain HTML blog post. It will waste more time and cause more frustration. What about more complex/proprietary formats, like MS Word? What about video? audio? source code? executable software?
Actually, I suspect many of these will have their own holistic community collaboration platforms, using the metaphors for authority, discussion, republication, and convenient use that apply in that medium. Cool.