kind of writing/predictions


About two and a half years ago, I wrote a post called How to Design an Interactive RSS Scraper. A scraper is a tool that extracts data from a web page; its most common use is to generate an RSS feed for a blog that doesn’t already have one. While there have been lots of scrapers, most of them focused on automatically figuring stuff out given just a URL. It seemed to me that a fully automatic scraper wouldn’t get reliably good performance across lots of different page styles, but given a little bit of interactive selection (here’s a date, here’s a title, here’s the story) you could guide the scraper’s initial guesses and make a good feed without much complicated effort.
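Here is a rough sketch of the interactive idea in Python. It assumes the user has already pointed at one example title, date, and story, so the tool knows which CSS class marks each field. The `FieldPicker` class, the `scrape` helper, the class names, and the sample HTML are all invented for illustration; I am not claiming any real scraper works this way internally.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not go on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class FieldPicker(HTMLParser):
    """Collects text from elements whose class the user marked as a field."""

    def __init__(self, field_classes):
        super().__init__()
        self.field_classes = field_classes  # e.g. {"hd": "title"}
        self.items = []     # finished feed items
        self.current = {}   # item being assembled
        self.stack = []     # field name (or None) for each open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        field = next((self.field_classes[c] for c in classes
                      if c in self.field_classes), None)
        if field is not None and field in self.current:
            # Seeing a field a second time means a new item has begun.
            self.items.append(self.current)
            self.current = {}
        self.stack.append(field)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Text belongs to the nearest enclosing element that is a field.
        field = next((f for f in reversed(self.stack) if f), None)
        text = data.strip()
        if field and text:
            joined = self.current.get(field, "") + " " + text
            self.current[field] = joined.strip()


def scrape(html, field_classes):
    """Return a list of {field: text} items found in the page."""
    picker = FieldPicker(field_classes)
    picker.feed(html)
    picker.close()
    if picker.current:
        picker.items.append(picker.current)
    return picker.items


# Hypothetical page, plus the classes the user "clicked on".
SAMPLE = """
<div class="post"><h2 class="hd">First post</h2>
  <span class="when">2004-11-02</span>
  <div class="body">Hello <b>world</b></div></div>
<div class="post"><h2 class="hd">Second post</h2>
  <span class="when">2004-11-03</span>
  <div class="body">More text</div></div>
"""
PICKS = {"hd": "title", "when": "date", "body": "story"}
feed_items = scrape(SAMPLE, PICKS)
```

The hard part of a real tool is turning a click into a selector that generalizes; this sketch skips that and starts from the selector.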

I recently found out about Dapper, a scraping service that takes this approach. It works quite well. The UI is pretty nice, and although there are some parts I still can’t figure out, I am able to generate RSS feeds. So if you’re looking for a scraper, try it! Here’s one feed I made with Dapper.

If you were impressed by GMail, prepare for a surprise. Some very talented individuals have been creating demos that push the graphical power of Javascript and DHTML into the realm of 3D games. Of particular note are redbug (online demo, IE recommended) and neja.

Bonus: The author of neja, P01, has a brilliant technique for drawing lines, and explains it in a very lucid tutorial.

It's funny, but I'm still impressed as I watch browser-based apps pick up functions we used to think were limited to desktop software. This isn't going to stop, though, and pretty soon the web will be a platform that blends the best of client-server software and the best of desktop software. When that happens, Microsoft is going to be pissed off.

Ajax stands for asynchronous Javascript and XML. It's a set of techniques for building web applications that don't need full page refreshes to transfer data to and from the server. Instead, they package it up as XML, send it in the background using the XMLHttpRequest object, and use the Javascript DOM to load the new data into the page. The result is a web-based application that can support fine-grained interface changes as well as larger-scale page changes, more like a desktop application.

What just struck me is just how similar this really is to a desktop application. The UI runs in code on the client, and only relevant data moves to/from the server. You could almost describe the server as a disk storage abstraction layer.

As soon as web application developers start to realize this, look for some of them to leverage their investments in ajax to turn their web apps into desktop apps. If you're sending a bunch of asynchronous requests, why not just queue them up and send them later when you're connected? If internet bandwidth isn't literally ubiquitous in a couple of years, this will still be a problem worth solving.
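To make the queue-them-up idea concrete, here is a small model of it in Python rather than browser Javascript. `OfflineQueue`, its `send` callback, and the `online` flag are names I made up for this sketch; a real ajax app would wrap its XMLHttpRequest calls in something along these lines.

```python
from collections import deque


class OfflineQueue:
    """Holds requests while disconnected and replays them in order later."""

    def __init__(self, send):
        self.send = send          # callable that actually talks to the server
        self.pending = deque()    # requests waiting for a connection
        self.online = False

    def request(self, payload):
        # Always queue first, so ordering is preserved even when online.
        self.pending.append(payload)
        if self.online:
            self.flush()

    def flush(self):
        # Send oldest-first; stop at the first failure so nothing is
        # lost or reordered.
        while self.pending:
            try:
                self.send(self.pending[0])
            except ConnectionError:
                self.online = False
                return
            self.pending.popleft()

    def set_online(self, online):
        self.online = online
        if online:
            self.flush()


sent = []                                   # stand-in for the server
queue = OfflineQueue(sent.append)
queue.request({"action": "add", "item": "buy milk"})    # offline: queued
queue.request({"action": "done", "item": "buy milk"})   # still offline
queue.set_online(True)                                   # both go out, in order
```

A request is only removed from the queue after `send` returns, so a dropped connection mid-flush leaves the remaining requests waiting for the next reconnect.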

If you're a programmer, you are probably at this point thinking, "Shimon, you eeeeeeeeeeediot, you can't run ajax apps offline and queue things up because javascript can't save files." This is really an arcane technical reason, and one that could probably be worked around, but that doesn't actually matter. Javascript can store persistent data on your disk, in the form of cookies. Sure, there are limits to the size of cookies, but you could just generate lots of them—with little cookie allocation tables and parity checksums, like a little cookie RAID storing your to-do list items. Kluge? Hell yeah! This is javascript, after all.
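For the curious, the cookie kluge might look roughly like this. It is a Python model with a plain dict standing in for the browser's cookie jar; the `save` and `load` helpers, the cookie naming scheme, and the tiny chunk size are all my own assumptions. It also only detects a damaged cookie via a CRC; it does not rebuild one from parity the way a real RAID would.

```python
import zlib

CHUNK_SIZE = 16  # tiny for the demo; a real cookie holds roughly 4 KB


def save(jar, name, text):
    """Spread text across numbered 'cookies', each with its own checksum."""
    data = text.encode("utf-8")
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    # The "allocation table" cookie just records how many chunks exist.
    jar[name + "_table"] = str(len(chunks))
    for index, chunk in enumerate(chunks):
        checksum = zlib.crc32(chunk)
        # Hex-encode the bytes so the value is safe to store as a cookie.
        jar[f"{name}_{index}"] = f"{checksum:08x}:{chunk.hex()}"


def load(jar, name):
    """Reassemble the text, raising ValueError if any chunk is damaged."""
    count = int(jar[name + "_table"])
    data = b""
    for index in range(count):
        checksum, _, payload = jar[f"{name}_{index}"].partition(":")
        chunk = bytes.fromhex(payload)
        if f"{zlib.crc32(chunk):08x}" != checksum:
            raise ValueError(f"cookie {name}_{index} failed its checksum")
        data += chunk
    return data.decode("utf-8")


jar = {}  # stand-in for document.cookie
todo = "1. water plants; 2. call mom; 3. buy more cookies"
save(jar, "todo", todo)
restored = load(jar, "todo")
```

Hex encoding doubles the storage cost, which is exactly the sort of waste you accept when you are already this deep into a kluge.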

So… keep an eye out for the next big thing: Save-As deployment!

A friend emailed me asking what I thought about Ruby. I've only recently picked up a book on Ruby and am charmed by its comprehensive non-annoyingness. Here's what I wrote back to him. What do you think?

Will it take off? I doubt it. I think it has been steadily gaining over
the past few years, but there isn't anything so revolutionary about its
design or its culture that I think will let it dethrone Perl, PHP, and
Python (perhaps it's the lack of a letter P). At least not in the US. Ruby
originated in Japan, and may well be a leading scripting language there.
It's well-enough known, it has its O'Reilly books, but I don't see it
becoming THE HOT THING. It certainly has little traction in big software
companies.

There is a web application framework for Ruby called Rails [here's a promising recent article], which is rumored
to be a very lightweight, easy-to-start-with framework. Their website has a
10 minute video where someone installs it and builds a sample site. This
is highly laudable in a world where suits insist on web applications written
in J2EE, a framework so byzantine that it makes Software Architects
confident they cannot understand the details of programming, while making Programmers confident they cannot understand the scope of design.

Tech journalists, excited by IBM's sale of its PC division to Chinese computer giant Lenovo, are trying to guess what IBM will do next. This guy thinks they should buy Apple.

It's not an original suggestion, but it is especially hilarious. Apple succeeds in a business that other tech companies fail at, and fail horrendously. What sets it apart is taste. Apple has taste because of egomaniac CEO Steve Jobs. Does IBM have taste? Well, the ThinkPad is a nice machine, but it's clear Big Blue has never garnered the same customer devotion as Apple.

The most obvious way to break Apple would be to make its products bland and uncreative. IBM, especially since Lou Gerstner took over, is all about making and marketing products and services at low risk and high scale, milking existing brands for all they're worth. The only IBM product I use on a regular basis is Lotus Notes, and starting that program is like hitting a magical 1992 button on my PC.

An obvious way to achieve blandness and uncreativity is to fire Steve Jobs. As if the theory of this option weren't compelling enough, there is actually historical precedent that Apple sucks without Steve Jobs. And yet, the author of this piece actually contemplates that, in order to make an acquisition by IBM work, Jobs might be fired.

Just ask yourself: when was the last time a bunch of engineers started a site to collect genuinely interesting personal stories about working with John Sculley or Lou Gerstner?

Few of my friends will be happy to see Bush win. But on the other hand, Kerry is a rather pathetic candidate too. Assuming that Bush does win, what's going to happen? It's time to make some predictions.

1.

With the assistance of a bold Republican congress, Bush will get additional steam for the continued occupation of Iraq. This is badly needed, but will not be enough. The election win will only bolster Bush's overconfidence on Iraq, and the administration will hardly even try to strike the deals we need with India, China, and Russia to bring 100,000 non-American troops into Iraq. After two years, Russia will pledge some troops but not enough. Near the end of Bush's second term, China will begin to consider some sort of military partnership with the US to bring the middle east out of total chaos, and it will be up to the next president to develop the partnership.

2.

Increased spending on Iraq will further balloon the budget deficit. I believe improving world security is a worthwhile infrastructure investment, so I'm not freaked out about this. Additionally, the administration will come close to neglect of the domestic economy, which is probably the best possible outcome. Without a reelection to worry about, we're likely to get lied to, er, spoken to less.

Where Bush does damage domestically may be Social Security. The risk here is not exactly privatization itself, which could be done without harm, but that whatever changes they have in mind will create a domestic budget crisis. By characterizing the crisis in terms of "welfare spending," Bush could strike a powerful political triple-play:

  1. reunite fiscally conservative Republicans with the Christian right
  2. distract nation from ongoing costs of reconstruction in Iraq
  3. actually cut welfare spending.

Pulling off this sort of thing would be a huge political win for the Republicans. It is likely to make many domestic issues marginally worse, including healthcare (especially for the poor).

3.

A new generation of voters is outraged at the secretive and deceptive manners of our nation's leadership. Looking for a leader, we will probably not find anyone actually good. But the process of looking will set out our values as a political bloc, and these values will guide the mainstream voters in 20 years. The foremost will state that leaders must be communicators. A good politician cleverly leverages the media, including websites/weblogs she directly authors, to communicate with voters and constituents.

4.

John Kerry will not be doing Viagra advertisements, and it will be possible to take a walk around Beacon Hill without being accosted by the Secret Service.

Today a friend invited me into a social network site called Multiply. It has some nice features, like photo albums, review sharing, and even a built-in journal. But the incentive isn't there for me to use it, because for it to be useful I have to get some critical mass of friends on it. The ones that aren't already on Friendster or Orkut are probably averse to the whole idea of telling a computer who their friends are. And the ones that are on Friendster and Orkut are unlikely to want to re-enter all that same data onto some other site, in hope that more of their friends will come. The barriers to entry are so high that any new entrant to the market has to do something really novel, or the market leaders have to really, really suck.

It's easy to see the logic behind an endeavor like Multiply: if we build services that are greatly enhanced by a social network, people will join the social network. Unfortunately, since people can't see the magic of your services until the social network exists, each potential user has to solve a frustrating chicken-and-egg problem. This exact problem will be pervasive in the social networking tools market until standards for data interchange are worked out.

Frassle escapes this problem. Because of RSS syndication, you don't need to convince a dozen friends to switch blogging platforms in order to use frassle's magic on their content. Because of OPML subscriptions import (it will be in the next version, I swear), you can easily try out frassle's aggregator on your familiar subscriptions list.

Now here's the kicker: I bet I can leverage this flexibility to work around the lack of a standard in the social networking space. I don't expect thousands of users to flock to frassle—and I couldn't support them if they did. But I do think that frassle will support a social network system that delivers more value, in easy-to-sample, titillating increments, than other systems.

Consider the procedure for getting value out of yet another social network system:

  1. Sign up
  2. Paste in email addresses of a dozen friends
  3. Wait for them to join; probably 3 of them do
  4. Wait a few more days for them to invite some friends
  5. Assuming you know a lot of their friends, you have a social network with 10 people in it. Unfortunately, this network doesn't map to any real-life community; it's an arbitrary grouping based primarily on how bored people were.
  6. Now you can realign your friendships based on how capable people are of using Yet Another Social Network System™. Over time, those 10 people might become a coherent community of friends, so that YASN™ might make sense.

Needless to say, I don't think many people will have the naïveté or wherewithal to restructure their actual network of friends based on YASN™. Therefore, Friendster and Orkut have a great deal of lock-in because they have a mass of users.

Worst of all, even if you've entered this data into another social network tool, you have to re-enter it. Later, you have to maintain the same information in two different places. This is ridiculous, but none of the companies that have the power to change this situation actually want to, because it would reduce their lock-in.

Now, contrast the steps it takes for a new user to get value out of frassle's social features. Note that most of these features are in the design phase right now; they don't yet actually exist. But they could, er, will.

  1. Sign up
  2. Upload an OPML file from your aggregator.
     • Right away, get access to related content in feeds you haven't seen using the taxonomies of blogs you read.
  3. Register your blog's RSS feed.
     • Frassle will make up a random number and ask you to put it inside of a blog entry. Once frassle sees this number in your feed, it knows you are in control of that feed, much like how some sites verify your email address. You can then delete that blog entry, if you like.
     • Right away, you can browse all of the content inside frassle via your own categories. You didn't have to type them in to frassle.
     • If a frassle user has already subscribed to your blog, you're in luck! Not only are we likely to have hundreds of entries from your blog, rather than just the latest ten, but there are probably lots of recommendations in frassle, just waiting for you to show up and read them!
  4. Right away, you can enrich frassle's understanding of your categories by categorizing other people's posts. You get to use frassle's excellent categorization interface, and categories in your RSS are automatically copied into frassle, so there is no need to re-enter data.
  5. You can also re-syndicate data from frassle, so your data is never just stuck. Frassle gives you RSS for everything and helps you put dynamic data on your own web pages.
  6. Keep on using the blog and aggregator tools you're comfortable with, and frassle will keep on
     • finding relevant material for you
     • supporting personalized search
     • tracking emergent communities of interest based on what you write about and categorize
     • using your content to support these enhanced services for other frassle users and readers of your blog.

For someone who already has a blog and aggregator habit, not only does frassle eliminate redundant data entry, it also gives you more deeply personalized information—and all in fewer steps!
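The feed-registration step in the list above (frassle makes up a random number, you post it, frassle spots it in your feed) is simple enough to sketch. This is only an illustration of the idea in Python: `issue_challenge` and `owns_feed` are hypothetical names, and fetching and parsing the RSS is left out, so entries are just strings here.

```python
import secrets


def issue_challenge():
    """Make up a random token for the blogger to paste into a post."""
    return secrets.token_hex(8)


def owns_feed(token, feed_entries):
    """True once the token shows up in any entry of the claimed feed."""
    return any(token in entry for entry in feed_entries)


token = issue_challenge()
before = ["Hello world", "Notes on OPML"]
after = before + [f"Verifying my blog for frassle: {token}"]

verified_before = owns_feed(token, before)   # False: token not posted yet
verified_after = owns_feed(token, after)     # True: token is in the feed
```

Once the check passes, the entry can be deleted, just like throwing away a confirmation email.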

And that's why, even if frassle doesn't make it, someone will figure out how to support all of this. And they will finally bring us a social network system that's not only nice in theory but also, and more important, feasible.

I was travelling a bit this weekend, to my old college town of Williamstown, Massachusetts. While there I met up with many friends, most of whom asked what I like to do in my spare time. Which led to me explaining frassle, in pretty abstract terms, to a bunch of smart people that weren't all familiar with blogs.

While explaining frassle five times in a day isn't exactly continuous entertainment, it did let me get a few interesting, big-picture perspectives on frassle. Which got me thinking: where could frassle go? Dreaming big now—what ideas that are in frassle today could grow into major new kinds of tools and services? Here are a few answers.

1. The most accurately targeted advertising ever
Frassle defines shared understanding in a computable, measurable way. It can quickly give you an explanation of what one of my categories means, delivered in your own terms. It can probably be scaled to do this across multi-level social networks. To the point, frassle can give you a pretty good idea of who (among a very large network) would be interested in a particular thing, given only a small sample.
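One way to picture "my category, in your terms": treat each category as the set of posts filed under it, and rank your categories by how much they overlap with mine. The sketch below uses Jaccard overlap, which is my stand-in for illustration and not a description of what frassle actually computes; the function names and the sample data are made up.

```python
def jaccard(a, b):
    """Overlap between two sets of post IDs, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def translate(category_posts, other_user):
    """Rank another user's categories by how well they match a set of posts."""
    scored = [(jaccard(category_posts, posts), name)
              for name, posts in other_user.items()]
    return [(name, round(score, 2))
            for score, name in sorted(scored, reverse=True)
            if score > 0]


# Hypothetical data: post IDs each of us has filed under our own categories.
mine = {"feeds": {1, 2, 3, 4}}
yours = {"rss": {2, 3, 4, 5}, "politics": {9, 10}, "tools": {1, 7}}

matches = translate(mine["feeds"], yours)
# matches == [("rss", 0.6), ("tools", 0.2)]
```

So my "feeds" category would be explained to you as "mostly what you call rss, with a bit of tools."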

That is exactly what advertisers want to do. Given a product, they want to make the people who are likely to buy it aware of how great it is. The nature of mass media favors advertising that pushes products almost everyone needs, because the message always goes to everyone. Products that are likely to be highly useful only to a smaller group of people are therefore underserved by mass media advertising. Google knows this and delivers a higher clickthrough rate on ads hosted through their AdWords program because they keep track of how successful different ads are on different websites.

Frassle's inter-category mappings would simply be another way to track the same kind of correlations. Since frassle delivers a publishing and authoring interface to individual users, not just to websites with a box on the right for ads, you could even target ads based on data about the viewer herself, rather than the particular website she is currently viewing.

Could it work as well as Google's AdWords? Could it work better?

2. A personalized search engine

The same kind of data frassle would collect for targeted advertising would also make for a much smarter search engine. First, search could be contextualized for each user, based on their category structure and recent interests. Second, relevance data could be used to better rank search results. The right stuff, right at the top of the list.
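A toy version of the contextualized ranking, assuming each search result already carries a base relevance score and some category tags, and that the user's interests are known as a set of tags. The `personalize` function, the `boost` weight, and the sample results are all hypothetical.

```python
def personalize(results, interests, boost=0.5):
    """Re-rank (score, title, tags) results, favoring the user's interests."""
    def adjusted(result):
        score, _title, tags = result
        # Each tag shared with the user's interests adds a fixed boost.
        return score + boost * len(tags & interests)

    ranked = sorted(results, key=adjusted, reverse=True)
    return [title for _score, title, _tags in ranked]


results = [
    (0.9, "Python the snake", {"animals"}),
    (0.8, "Python the language", {"programming"}),
    (0.4, "Ruby on Rails intro", {"programming", "web"}),
]
ranked = personalize(results, interests={"programming", "web"})
# ranked == ["Ruby on Rails intro", "Python the language", "Python the snake"]
```

The same query from someone who blogs about reptiles would put the snake back on top.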

3. Integrated knowledge management for workgroups

While it may be very hard to make all of frassle's features usable for the general internet-using public, it might be a lot easier to set it up within an existing organization. Companies have oodles of documents, scattered all over the place; they have oodles of people, struggling to communicate through overloaded tools like email and insufficient physical spaces; they have oodles of shared projects that ought to involve certain people but can't make those people aware they even exist. The ideas in frassle could help with all of these, and might find excellent uptake in a small, close-knit community of knowledge workers.


Is it scalable? When I first discussed the ideas behind frassle with Scott Johnson, founder of feedster, he seemed skeptical that it could scale enough to cover a significant part of the internet or even just the blogosphere. I have a lot of respect for Scott's opinion—he's been building search engines for a long, long time. So I know a system that tracks hundreds of category systems and thousands of inter-category relationships probably can't be scaled to millions and trillions just by adding off-the-shelf software. This is still the big question. Do we need a Google-style cluster of thousands of machines? Can we make it a distributed, peer-to-peer application?

Is anyone else doing this? Some related ventures are under way from Technorati, Amazon.com and A9, and certainly others. How do I find them?

There are some great observations in this overly long essay.

One interesting trend is the shift of value away from software and toward the network effects surrounding software-based services. What this means is that while the software of ebay or amazon or orkut is fairly easy to clone, each of these businesses has its competitive advantage in the scale and involvement of its user base. The advantage is not in writing software, but in developing self-sustaining communities that invite and reward effective participation. This is dependent on software in roughly the same way that good cities are dependent on the layout of public spaces, roads, parks, transit networks, and buildings. Given enough money, you could clone all of these aspects of a city, but your clone wouldn't have any life until it was full of people constantly occupying the physical space and gradually reshaping it to fit their own lives.

In other words, skills now crucial in making software aren't taught in The Art of Computer Programming. If you want to make software, read Philip and Alex's Guide to Web Publishing, or better yet, A Pattern Language.


There is also some grist for the prediction mill in this essay. Here are mine:

  • Microsoft will ship open-source software within 10 years. Leading up to this point, they will transition to a business focused primarily on helping people find and use content (including software) created by third parties. Their software margins will crumble during this time period, but they may be able to sustain a profitable software business by driving quality up and cost down due to explosive growth in the number of devices that use software.
  • Some interesting stuff is going to happen when people start figuring out how to commoditize network effects. This problem will require figuring out how to make software more responsive to user intentions, and less brittle at the mercy of incompatible formal interfaces. The driving forces in the next generation of programming systems will be social, not technical.

The New York Times has a Circuits piece on Amplify, a tool that lets you easily combine stuff from multiple websites on a single page. So you can create a tiled view of different wallpaper and furniture patterns, or combine info from several sites on the same topic. Here's an example.

Sounds like blogging, eh? Jeff Jarvis, Steve Rubel, and Rafat Ali deride it as a weak attempt to do blogging in a proprietary format. I guess they're right. But to the extent that Amplify is useful or successful, what can we learn from it? What can we learn from its suckage?

First the successes, or anticipated successes. The frames design is horrid for most things, like the Bushisms example linked above, but it is good for some things. Sometimes you want to compare things side-by-side. Doing this with frames might work for some people.

Getting in the New York Times is good. Perhaps it's paid placement, but in any case it is a good way to reach out to thousands of people likely to be interested in a high-tech product.

Now the suckage. Frames-based design is usually bad. People are good at using scrollbars. The site has some classic design flaws, but the chief problem is that there are lots of links with vague feel-good titles that nobody will ever click on. Consider the seven things in the HUGE Amplify bar at the top of each amp page, from left to right:

  1. Cone-shaped doohickey. Turns out this just goes to Amplify homepage, as does the huge logo on the right. It almost looks like some sort of magical control widget that allows you to set the volume, but that metaphor makes no sense here.
  2. Back button. BACK button? What the fuck? It's just like the back button in my browser except it's in the wrong place and doesn't work.
  3. "What is Amplify?". Another link to the homepage. Brilliant!
  4. "Get Amplify". Download for MSIE/Windows. The most straightforward item of the bunch, though not something you'd click more than once.
  5. "Amplify Community". Links to a collection of hierarchically categorized pages by other amp users. This is a reasonable feature, but if it said "see 5 other amps about animals having sex" instead of something totally generic maybe people could be interested enough to click on it. Comment links on every fragment of an amp page would be better.
  6. "Share this amp". Send a link via email. Useful enough, but why not just say "email this amp"?
  7. Huge amplify logo. Goes to, shockingly, the home page.

Oh, and I found out what the back button does. It takes you to the previous amp you were at. Rather than just letting your browser's back button work, it introduces a puzzling UI behavior by opening any amps on top of each other in the same window. If you use your browser's back button, it goes to some intermediate page momentarily and then forwards you to the page you were just viewing. Lame.

Well, I guess I turned from taking an optimistic look at Amplify to ragging on it hard Jarvis-style. Sorry Amplify, but maybe these suggestions can help you improve your interface. I'll leave you with some wisdom from Strongbad, whose love for scrolling could serve as a good lesson for the frames-addicted amplify developers.

Every day you come a-scrollin' back, scroll buttons gettin' ill like a heart-attack. Uh!
