computers/content management


link

Ideas for hooking frassle into Planet OSCOM's aggregator.

link

After much background thought, I believe I know how best to add file management to frassle. (This is a long-term goal, but it has been getting some attention recently.) It's simple.

Each file is a blog post, just like any other content fragment in frassle.

When you view the blog on a web page or in an RSS feed, the note body shows an HTML representation of that content fragment, along with a link to the original file. So if you uploaded a PDF, it would be converted to HTML for the note body. A photo would be represented as a thumbnail. An XML document would have any appropriate XSLT transformations applied and be displayed as XHTML, or just as a document tree. A text file would become preformatted HTML. An unknown format would just give you the file name and size.
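
To make the per-format idea concrete, here is a minimal Python sketch of how a note body might be chosen by file type. Everything in it is hypothetical: the `render_note_body` function, the `KIND_BY_EXTENSION` table, and the `thumb_url` parameter are illustrations, not frassle code. PDF and XML conversion are left out because they would need real converters; those files fall through to the "unknown format" case here.

```python
import html
import os

# Hypothetical mapping from file extension to a renderer kind.
KIND_BY_EXTENSION = {
    ".txt": "text",
    ".jpg": "image",
    ".jpeg": "image",
    ".png": "image",
    ".gif": "image",
}


def render_note_body(filename, data, file_url, thumb_url=None):
    """Return an HTML note body for an uploaded file, plus a link to the original."""
    ext = os.path.splitext(filename)[1].lower()
    kind = KIND_BY_EXTENSION.get(ext, "unknown")
    safe_name = html.escape(filename)

    if kind == "text":
        # A text file becomes preformatted HTML.
        text = data.decode("utf-8", errors="replace")
        body = "<pre>" + html.escape(text) + "</pre>"
    elif kind == "image" and thumb_url is not None:
        # A photo is shown as a thumbnail (assumed to be generated elsewhere).
        body = '<img src="%s" alt="%s">' % (html.escape(thumb_url), safe_name)
    else:
        # Anything else just gets its file name and size.
        body = "<p>%s (%d bytes)</p>" % (safe_name, len(data))

    link = '<p><a href="%s">original file</a></p>' % html.escape(file_url)
    return body + "\n" + link
```

Calling `render_note_body("notes.txt", b"x < y", "/files/notes.txt")` would give a `<pre>` block with the text escaped, followed by the link back to the file. Because the result is ordinary HTML, the regular search and feed machinery could treat it like any other note.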

This approach has some significant advantages over email-style attachments. In no particular order:

  • you can categorize individual files
  • you automatically get the same renaming, editing, deleting, and sharing/privacy options as anything else in your blog
  • you can easily browse by timestamp or category
  • you can subscribe to updates
  • you can use the regular search to search the HTML representations of files; that is, you get text search over PDFs and MS Word documents.

On the other hand, if you're used to handling files as attachments, this seems a little weird. Some files just don't stand on their own: photos are part of an album; PDFs are final print-ready renderings of a long-term effort; source code files are parts of complicated projects. If we just treat them all like time-ordered blog posts, won't it just be chaos?

Yes. And that's a good thing.

See, this problem is orthogonal to file uploading. Any piece of content has a context that is more or less important to properly understanding what it means. Showing you that context is one of frassle's most fundamental goals. Time, categorizations, and inter-feed threading of conversations are the basic tools we use to accomplish that goal. Why not expand the challenge? Detecting and expressing more meaningful relationships between content will make all of frassle more powerful.

And hey, I have the feeling categories and timestamps will prevent a lot of chaos by themselves.


Example: One of the first applications of the multimedia post framework can be a tool to help you show posts in context. The scrapbook is a simple list of blog posts (notes in frassle parlance), chosen from any blog. Each note is identified by permalink, and can be included in whole, selectively excerpted, or simply referenced. You can also add text anywhere in the scrapbook. As you browse through frassle, each note has an "add to scrapbook" button.

I like the way this sounds. You could not only use the scrapbook to, say, collect a number of related discussions from around the web—you could actually build photo albums with it! The same concept—a binder with pages covered in taped-on papers in real life, a list of permalinks online—works to gather and share related pieces of content.

Still better, the scrapbook itself can be a piece of content. Under the multimedia post framework, we could represent the scrapbook as an XML document, probably in OPML. Then you can share scrapbooks easily, rename/delete/edit them in the standard way, and even incorporate one scrapbook into another!
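
Here is a rough Python sketch of what a scrapbook serialized as OPML might look like. It is only an illustration under several assumptions: the `scrapbook_to_opml` function and the entry dictionaries are made up for this example, and the `mode` attribute (whole, excerpt, or reference) is not part of OPML, just a hypothetical extension. The `text`, `type="link"`, and `url` attributes follow the OPML 2.0 outline conventions.

```python
import xml.etree.ElementTree as ET


def scrapbook_to_opml(title, entries):
    """Serialize a scrapbook to an OPML string.

    Each entry is a dict with a "text" key. Entries that point at a note
    also carry a "permalink" and an optional "mode"; entries without a
    permalink are free text added by the scrapbook's owner.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")

    for entry in entries:
        attrs = {"text": entry["text"]}
        if "permalink" in entry:
            attrs["type"] = "link"
            attrs["url"] = entry["permalink"]
            # Hypothetical, non-standard attribute: how to include the note.
            attrs["mode"] = entry.get("mode", "reference")
        ET.SubElement(body, "outline", attrs)

    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    print(scrapbook_to_opml("Zurich trip", [
        {"text": "Photos from the first day"},
        {"text": "Lake photo", "permalink": "http://example.org/notes/1", "mode": "whole"},
        {"text": "Another scrapbook", "permalink": "http://example.org/notes/2"},
    ]))
```

Since a scrapbook would itself be a note with a permalink, incorporating one scrapbook into another would just be one more `outline` entry whose `url` points at the other scrapbook.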

link

Jessica has some interesting thoughts on combining blogs with file management. I think she has the right idea, but she's finding it hard to elaborate the details of how it would work because she's so used to using software that was just built wrong.

Here's what I mean: the concept of a "file manager" as a business/collaboration tool is broken. A file manager focuses on files, which tend to have different formats and byte lengths and gobble up disk quotas and sit in one folder and can only be viewed by certain software. So the file manager's job is to help you navigate each of these things.

None of these capabilities, emphatically not a single one, belongs in frassle or any other blog community system.

Files contain information. Blogs are about sharing information. However, sharing files is a lot harder than sharing information in blogs. This is because:

1. Files come in a variety of incompatible formats.

  • you often can't view them in your web browser
  • they might not be easy to read for all of your users
  • they're hard to excerpt and republish
  • they're hard to search

2. If you have a file, it's difficult to find out:

  • who created it
  • who's in charge of it
  • if you're looking at the latest version
  • if any discussion has flowed from this information

3. Files expose you to some major risks and annoyances:

  • you might get a virus
  • you might run out of disk space
  • you might copy a file somewhere and forget about it, and years later be unable to decide whether it is a crucial document or just some cruft you decided to hold temporarily.

If you invert each item in this list, you see the major strengths of blogging over other information sharing approaches. Specifically:

  • thanks to the personal focus of blogs, you can always tell who is in charge of some post
  • thanks to permalinks, you can always ensure you're looking at the latest version
  • thanks to the simplicity of HTML and RSS, you can always easily excerpt, republish, and search blog posts
  • thanks to comments, trackback, and technorati/feedster/google, you can find out what discussion has flowed from any post
  • thanks to the ubiquity and architecture of the web, you never have to risk viruses, running out of disk space, etc.

So I think the requirements for integrating file management into a blog community system are simple. Take all the things that make blogging so valuable and easy, and support a few additional input formats.

Analysis

That is an easily stated but profound requirement. Let's consider the example of the PDF format, which is used for printable multi-page documents. PDF is about your best bet for preserving exact print formatting in a document while still displaying OK on screen. You've probably already viewed PDF documents in your web browser, using Adobe's excellent Acrobat Reader. This gives you some things for free:

  • you can view documents through your browser without saving files (of course, you must have the plugin, which is not quite as ubiquitous as web browsers)
  • you probably can't get a virus from reading a PDF file

But a few challenges remain. Let's go through the positive features of blogs and speculate on how to integrate PDF files.

  • you can always tell who is in charge of some post
    One way to do this with a PDF file is to link it to a particular blog post. Ideally, the blog community system would actually edit the PDF file and insert that entry's permalink at the top of the first page. This should be clickable (to load the blog post page in your browser) but should not automatically be printed in hard copy.
  • you can always ensure you're looking at the latest version
    PDF files may include a mechanism for checking for alternate versions in some sort of repository. If so, it should be used. If not, the link approach described above is a suitable workaround.
  • you can always easily excerpt, republish, and search blog posts
    It's really hard to do any of these on PDFs. The likely solution is to just convert them into HTML, and make the actual PDF file a secondary format primarily useful for printing. But while extracting the text from a PDF file is easy, translating the meaning of a page's design to a non-paginated medium like HTML is not something computers can do perfectly. If you're not convinced, try out Google's impression of IRS tax form 1040. Keep in mind that Google is a wealthy company with some of the best engineers in the world. They have a fairly readable HTML rendering of the form, but commentary on a form like this really requires post-it notes, highlighters, and drawings in red ink—none of which we can do or emulate well on the web.
  • you can find out what discussion has flowed from any post
    PDFs are bad for discussion because they are one step removed from immutable hard copy. Therefore there are few responses, but those that exist are likely to be in emails or other inaccessible media (phone calls, face-to-face meetings). So there probably isn't much discussion to be found, but we can likely just divert people onto the standard comment/response system.
  • you never have to risk viruses, running out of disk space, etc.
    This is mostly free with PDF files when you use the browser plugin.

Hmm. That's not so bad. The other question is, can we do all of these things without compromising the rest of the blogging experience? Clearly, despite our best efforts, a PDF will never be as easy to work with as a plain HTML blog post. It will waste more time and cause more frustration. What about more complex/proprietary formats, like MS Word? What about video? audio? source code? executable software?

Actually, I suspect many of these will have their own holistic community collaboration platforms, using the metaphors for authority, discussion, republication, and convenient use that apply in that medium. Cool.

link

Great news! Our proposal for a talk called Interpersonal Content Management has been accepted into the 4th conference on Open-Source Content Management. Josh and I will be speaking at 10:15am on Friday, October 1 at ETH in Zurich, Switzerland.

In addition to providing me with a great excuse to travel to Europe, this conference will give me an opportunity to meet many interesting people. On the work end of things, it implies a few tasks for Josh and me to finish before leaving the country:

  1. Write a user/developer/administrator's manual for frassle.
  2. Set up a version control system.
  3. Release frassle under the GPL.
  4. Write the actual talk!

The talk will be about an emerging genre of content management we call Interpersonal Content Management. Here's the talk proposal:

Content management systems have found two compelling applications. The organizational CMS focuses many contributors around common business goals. The personal CMS, typically a blog, eliminates barriers to individual publishing. While organizational CMS recreates its social structures based on existing business relationships, personal CMS leaves its users to develop relationships from the ground up.

Bloggers express these relationships using simple mechanisms like linking and republishing. Because blogs provide a lasting, personal identity, they make it possible for social phenomena like reputation and trust to develop online. These in turn support informal communities of interest, offering their members ad hoc ways to collaborate without establishing a typical business relationship. We call this blossoming new usage Interpersonal Content Management. It is characterized by a fusion of content consumer and producer roles. By contributing incremental commentary on others' content—even by the implicit endorsement of linking—individuals make connections between their own interests and the interests of others.

Readers already use these connections informally to evaluate the meaning and relevance of new content. But by designing software to make the creation and discovery of such connections easier, we can make interpersonal content management more practical and scalable. First, users are empowered to quickly bookmark, rate, categorize, and remix content from anywhere on the web. Second, the associations they form in these actions are tracked by the interpersonal CMS, which then automatically suggests ratings and categorizations for new content. By making personally meaningful judgments about content, individuals not only prioritize items within their own CMS, but help those who trust them do the same.

This presentation explains Interpersonal CMS in terms of its technological features, the relationships it supports between people, the goals that drive its deployment, and the challenges it will face in the future. In each of these aspects, we contrast iCMS with existing notions of personal and organizational CMS. Specific references will be made to our proof-of-concept iCMS, frassle (frassle.rura.org), which will be open-sourced this summer.
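
As a small illustration of the suggestion step in the proposal, here is a Python sketch that picks a category for a new item from the categorizations made by people you trust. This is not how frassle does it; the `suggest_category` function, the tuple layout, and the numeric trust weights are all assumptions invented for the example, and a simple trust-weighted vote stands in for whatever the real system would use.

```python
from collections import defaultdict


def suggest_category(item_url, categorizations, trust):
    """Suggest a category for item_url by trust-weighted vote.

    categorizations: list of (user, item_url, category) tuples.
    trust: dict mapping user -> how much the reader trusts them (hypothetical scale).
    Returns the best-scoring category, or None if no trusted user filed the item.
    """
    scores = defaultdict(float)
    for user, url, category in categorizations:
        weight = trust.get(user, 0.0)
        # Only count votes on this item from users with positive trust.
        if url == item_url and weight > 0:
            scores[category] += weight

    if not scores:
        return None
    # Sorting first makes ties resolve alphabetically, so results are stable.
    return max(sorted(scores), key=lambda c: scores[c])
```

The same shape would work for ratings: replace the category with a score and take a trust-weighted average instead of a vote.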