June 29, 2004

Cool, I'm LiveJournal'd

I don't know how such things are made to happen, but apparently I have a LiveJournal presence for this blog. Which seems to be the clever sort of hack that allows external blogs to be read by the internal LiveJournal system.

Posted by danyelf at 11:22 PM | Comments (0) | TrackBack

June 28, 2004

Disabilities and Visualization

I'm e-attending a conference on Visualization and Disabilities run online and through the phones by the W3C. The first speaker is the director of the SETI project at Berkeley, Kent Cullers, who, I am interested to note, was the first totally blind physicist to earn a PhD in physics. His solution is an image-to-braille printer that produces "grayscale"--that is, variably-sized dots that he can run his fingers over and explore. (Others simulate this with a Phantom device--those are interactive, however, which means that you can synchronize the pen-feeling with sounds and audio cues.)

A different speaker points us to AUDIODOOM (ref, google), which seems to make sounds based on positions within a space. So does this project at HCIL, which tries to summarize chunks of maps ("east to west", "north to south").

A speaker from IBM has an attempt to write textual descriptions of charts: a chart is then simplified to the readable text. I have some concerns about whether this is sufficient, but it's a start... in particular, it summarizes bar charts with a single word ("medium"), which strikes me as a little painful.

(Other speakers mention a markup for enhancing web pages with sign language gesture-notation for the deaf, or an attempt to add more navigation information for the learning-disabled).

The lesson that I'm beginning to get is that for most forms, you need to simplify the data in some way or another. Yet if you remove data, then people can't do everything. Kent Cullers, in particular, rebels hard against the idea of translations that lose information and fidelity: "The more dimensions I can look at at once, the happier I am."

The challenge, I think, is working out hierarchies of information: how do you provide a fast top-level of information, and then how do you allow a user to dig deeper into the data?

Posted by danyelf at 08:57 AM | Comments (0) | TrackBack

June 25, 2004

danah gets funny email about friendster

danah boyd has a great marketing-survey-thing that Friendster is apparently sending out to SuperFriends. It seems they are offering four template letters, and asking the SuperFriends which they like most as the center of a new campaign.


Subject: We Still Care

We miss you. There, we said it. It feels better. So we're going to do everything we can to bring you back to Friendster, all the way up to that John Cusack boom box Say Anything bit. So before it all comes to that, just come back to Friendster. We've already made it easier for you, getting much faster and clearer as we've grown. Now, in just minutes you can find people you've been wondering about: friends from summer camp, college roommates, high school buddies, cousins, people you used to date, people you wanted to date, these people you know, and don't know, are connected to each other and what a beautifully small world it really is. Or date, or help a friend find a date. We don't care. We just want you back in our lives. And we can tell you that you want the same thing. We can see it... in your eyes. The light, the heat. Your eyes, we feel complete. See, we told you.



Oh, to make sure you keep getting these vaguely sarcastic emails, please add Friendster to your email address book now. If for no other reason than it will look cool to have Friendster in your address book.

Rockin', right? Hip, up to date, and trying way too hard. But that's danah's territory.

What I want to focus on is the interview questions at the bottom.

*Which of the two versions do you prefer? (version 1 or 2) _____

*Is the email appropriate to send to active Friendster members? (yes/no)

*Would you be likely to click the link and go to Friendster if you
received this email? (yes/no) _____

Friendster is asking its SuperFriends (who, presumably, click on the links fairly frequently) to judge whether someone else who hasn't clicked on the links for a while is likely to. These are--I think--very different populations.

*Would you be open to receiving similar emails from Friendster in the
future? (yes/no) _____

Here, Friendster is asking a user who has just gotten an unsolicited email from a service they use whether an inactive user is likely to enjoy a different unsolicited email from a service they don't use.

The mind boggles.

Posted by danyelf at 04:51 PM | Comments (0) | TrackBack

June 24, 2004

Still thinking about categorization

Just an interesting schema: Learning the Lessons of Nixon blogs under the Dewey Decimal System

Part of their catalog, then, includes:

770 Photography
790 Sports
800 Literature
808.7 Humor
900 Geography & History

Of course, since WordPress allows multiple categorization, a fair number of posts are under more than one Dewey number.

Posted by danyelf at 06:52 PM | Comments (0) | TrackBack

The fate of books

William Tozier just bought a barnful of books:

Authors: reflect with care. Conceit and vanity may lead you to imagine the final resting place of your work as a well-dusted University library shelf, or a remainder stack, or even (at worst) in a decorative stack in a furniture store. Nope. For these few hundreds, as with untold millions more, the final resting place is a combination of (a) the guts of some bugs, and (b) the Buckeye Waste Handling tip just north of Bellefontaine, OH.


Posted by danyelf at 06:37 PM | Comments (0) | TrackBack

June 23, 2004

When "Don't Be Evil" is hard

People have certainly pointed it out before. But once again, Google finds that its adwords program's idealistic constraints match poorly against reality.

One site receives a letter:

At this time, Google policy does not permit the advertisement of websites that contain "language that advocates against an individual, group, or organization". As noted in our advertising terms and conditions, we reserve the right to exercise editorial discretion when it comes to the advertising we accept on our site. I have reviewed your site and it contains language such as 'secretive, paranoid and vengeance-filled' which we will not allow to run on our site at this time.

(Perrspectives: Articles: Google's Gag Order)

It's an interesting issue. I have to admit I don't have a definite answer on this: I can see that Google doesn't want to sell adwords to hate groups, but it's hard to figure out where the right line is. A few years ago, a friend put together a website expressing a strong opinion about the Nike corporation, and bought adwords on "Nike Sweatshop." Would that have been against Google rules? It's hard to figure out -- but it's pretty clear from a search for adwords google censorship that this is not the only time this issue has been encountered.

Posted by danyelf at 05:20 PM | Comments (0) | TrackBack

Asimov, and Simplistic Social Networks

Cory Doctorow blames Isaac Asimov's simplistic three laws of robotics for a growing societal oversimplification of social and moral issues:

Yet Asimov's reductionist approach to human interaction may be his most lasting influence. His thinking is alive and well and likely filling your inbox at this moment with come-ons asking you to identify your friends and rate their "sexiness" on a scale of one to three. Today's social networking services like Friendster and Orkut collapse the subtle continuum of friendship and trust into a blunt equation that says, "So-and-so is indeed my friend," and "I trust so-and-so to see all my other 'friends.'" These systems demand that users configure their relationships in a way that's easily modeled in software. It reflects a mechanistic view of human interaction: "If Ann likes Bob and Bob hates Cindy, then Ann hates Cindy." The idea that we can take our social interactions and code them with an Asimovian algorithm ("allow no harm, obey all orders, protect yourself") is at odds with the messy, unpredictable world. The Internet succeeds because it is nondeterministic and unpredictable: The Net's underlying TCP/IP protocol makes no quality of service guarantees and promises nothing about the route a message will take or whether it will arrive.

- Cory Doctorow, in Wired: Rise of the Machines

While I respect Cory, and enjoy his thoughtful work (such as his recent DRM rant), my instinctive reaction is to disagree--both about his characterization of Asimov, and about his characterization of social networks.

I can meaningfully disagree with his reading of Asimov: each of the stories in "I, Robot" suggests, in one way or another, that the three laws aren't nearly as simple as they seem. In "Evidence," an entity that may or may not be a robot runs for mayor of New York. He is observed to punch someone--a clear violation of the First Law, which means he must not be a robot, right? In "Liar!", a robot tries to figure out how to define "injury." And so on.

Similarly, his claim that social network services reflect a "mechanistic view" of human relationships is, I think, a little unfair. Perhaps Orkut (and Friendster, and similar) do force an odd feeling set of limitations on the world, but in that, they are doing no better (or worse) than any other operationalization of social interaction.

Look, for example, at this article on social network analysis that I picked, more or less at random.

Asking people about their interaction with others – their communications, their exchange of advice and other resources – remains the source of most sociocentric network data. When groups are small (up to 150, but usually 20-60) the researcher can list the members’ names and ask each person how well they know each other person (on a scale of 0 to 5, for example), or how often they interact with each other person (for example, once a week, once a month or never). For example, a researcher may present all students in a class with a list of all the students in the class and ask them to rate how well they know each one. (Christopher McCarty, "Social Network Analysis," 2003; see note 1 below).

It's reductionist, but it's the source of a lot of useful data. For better or worse, a social network analyst needs to decide--at some point or another--what to do relative to a pair of people. Are they connected, for purposes of information transfer or social dependence or something? If no decisions are made, you don't get pretty pictures. Scientists constantly have to model systems (where "model" basically means "to yank out the confusing bits, leaving the parts that can be usefully operationalized.")
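To make that "decide whether a pair is connected" step concrete, here is a minimal sketch. The threshold of 3 and the rule that both people must rate each other highly are assumptions picked for illustration; neither comes from the McCarty article, and the names are made up.

```python
def ties_from_ratings(ratings, threshold=3):
    """Turn 0-5 'how well do you know X' ratings into an undirected edge set.

    `ratings` maps (rater, other) pairs to scores. A tie is kept only when
    both directions meet the threshold (an assumed rule, chosen for this sketch).
    """
    edges = set()
    for (rater, other), score in ratings.items():
        if score >= threshold and ratings.get((other, rater), 0) >= threshold:
            edges.add(frozenset((rater, other)))
    return edges


# Hypothetical survey answers for three people.
ratings = {
    ("ann", "bob"): 4,
    ("bob", "ann"): 5,
    ("ann", "cindy"): 4,
    ("cindy", "ann"): 1,
}
ties = ties_from_ratings(ratings)
print(ties)
```

Everything below the threshold, and every one-sided acquaintance, simply vanishes from the picture. That is the reduction at work.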

Essentially, you can hate Orkut for forcing a "Yes/No" (or 1-5) answer, but you risk hating the entire field of sociology at once. (There is another argument that those techniques should not be applied to online services--an argument that I largely agree with--but that doesn't seem to be what Cory is saying here.)

Which means that scientists in all fields are busily scraping away details, leaving stark and ugly systems. Newton's laws tell us very little about the color or shape--or even friction--of the object in motion. Yes, it's a messier world than that, but we can get a lot of work done by simplifying the model.

The notion that Asimov's three laws are somehow to blame for scientific oversimplification, then, becomes an odd joke. Asimov was surely influenced by the scientific desire to constrain systems to simple laws: he was trained as a scientist and earned a PhD; his work contains not only the three laws but the odd notion of Psychohistory: a deterministic set of equations that describe the actions of many people. Psychohistory is a condensed attempt to simplify all of sociology, history, anthropology, and psychology in a series of rather complex equations.

Asimov promptly spends the entire Foundation series showing various failures of the psychohistorical model: mutations the model didn't predict, systems that tumbled faster or slower than the model expected, and similar.

Perhaps a wiser critic than I might write about whether Asimov's work reflects this tension, whether there is a deeper purpose to the way he sets up models only to tear them down and poke holes in them. Cory might argue, too, that even if Asimov knew it, others don't get it, and so he would be disappointed by the application of this.

I'll simply suggest that Asimov was clearly aware of the limitations of these models.

1 From the Sage Encyclopedia of Community, Karen Christiansen and David Levinson, eds.

Posted by danyelf at 05:02 PM | Comments (0) | TrackBack

June 22, 2004

Red Ted says it for me

Red Ted also, apparently, was just told by his advisor to have something in by Tuesday:

Dear Advisor

Attached are four chapters, some polished and some still with splinters, totalling about [104] pages.

I am off to work on the fifth of my [eight] chapters. Have a nice weekend.

Posted by danyelf at 08:03 PM | Comments (0) | TrackBack

June 18, 2004

Gmail Review

Ok, so I'm a late adopter. (Well, for an early adopter, I'm a late adopter.) This means that there are already a lot of GMail reviews out there. (Here and here and, well, generally, here.)

But last week I got a gmail account, and so early this week I started forwarding all my mail (see note 1) to danyel-at-gmail.com, and--after a week now--I have 232 messages in there. Which is nothing, really, but it's enough to start making a meaningful statement about the service.

I should comment that I am not contrasting Gmail to (free) Yahoo mail or Hotmail or mail.com or any of the free services. If you want a disposaccount to catch spam or register for personal ads or whatever, you can do that fairly easily. And there's no question that GMail is head and shoulders above those systems: fast, responsive, lots of storage, unobtrusive ads, it's good.

What I'm doing is comparing GMail to my day to day mailer (Mozilla Thunderbird). My conclusion is that it's a close matchup: see below the fold for more.

1 My procmail script seems to be slipping up once in a while; four or five crucial messages didn't get forwarded. I still haven't figured out why.

Update 6/26 under "Search"

You know, by now, the basics. GMail provides you--by invitation only--a gigabyte of free storage space for your email. In exchange you get unobtrusive (see note 2) but nominally targeted ads. (Incidentally, a gigabyte isn't really that much email. Even without big attachments, I seem to collect a megabyte a day. If I start sending out drafts of my papers with any frequency, I'd expect this to be done in about a year.)

  • Labels

Rather than having a folder system, GMail uses a label system (like Lotus Notes!): you tag a message with labels, which allows you to dig into folders to retrieve them. The difference is that messages can be multiply-labelled, and so the same message can live in several folders.

(This means that you should probably not delete a message--it might exist somewhere else. The UI makes it clear that you shouldn't, but why set bad habits? You've got a gigabyte to play with.)

Once you see the world from a "label" perspective, a few things make sense. Messages can be "starred" (a label that has its own folder, and has a visible user-interface widget); "sent" mail is just another sort of label. So is the "inbox": you move things out of the inbox and they just drop into the endless pool of "all mail", referred to as the "archive".

Irritatingly, "spam" is also a label. I've found messages in my spam folder that were labelled with a filter (see below): that is, I had told the system I wanted the message, and it was tagged as spam anyway. How rude!

Oops, a correction: "star" applies to a single message, not a thread. But the starred folder holds all threads with stars in them.
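A minimal sketch of the label idea, to show why "inbox", "sent" and "starred" can all be the same kind of thing. This is a toy model with made-up names, not a description of how Gmail is actually built.

```python
from collections import defaultdict


class LabelStore:
    """Toy label index: one message can carry many labels (hypothetical design)."""

    def __init__(self):
        # message id -> set of label names
        self.labels_by_message = defaultdict(set)

    def add_label(self, message_id, label):
        self.labels_by_message[message_id].add(label)

    def remove_label(self, message_id, label):
        self.labels_by_message[message_id].discard(label)

    def view(self, label):
        """Return the ids that would show up in the 'folder' for this label."""
        return sorted(m for m, labels in self.labels_by_message.items() if label in labels)

    def all_mail(self):
        """Every message stays here, whatever labels it has gained or lost."""
        return sorted(self.labels_by_message)


store = LabelStore()
store.add_label("msg-1", "inbox")
store.add_label("msg-1", "JUNG")
store.add_label("msg-2", "inbox")
store.remove_label("msg-1", "inbox")  # "archiving" is just dropping the inbox label

print(store.view("inbox"), store.view("JUNG"), store.all_mail())
```

In this model, moving a message out of the inbox never copies or deletes anything. The message loses one label and stays in "all mail".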

  • Unread messages

Unread messages are bold-faced; in addition, any labels that contain unread messages are bold-faced. Except "all mail", which means that things that you don't read because you aren't really interested can go away. Drop to your all mail view, hit "select unread", and you've got them to go surf the CompUSA weekly specials from March. The window title--something like Gmail - Inbox (1)--indicates the number of unread inbox messages. (Even when the user is reading a different folder, the page label counts inbox messages.)

  • Conversations

GMail makes an effort to lump together messages in the same conversation. It seems to be mostly right: it seems to rely a little bit on heuristics, a little bit on the "in-reply-to" header, and a little bit on content. I haven't seen it slip up yet.

On the other hand, I wish I could manually intervene ("These three messages? Same thread. Really. And that one? Isn't.") when there are things to be lumped together. I suppose that's partially what labels are for, but I also like seeing things in one continuous set.

The thread visualization, however, is weak. It's just a stack of headers. (See image). I'm too spoiled by good work at Microsoft and IBM and even Google Groups to understand why I can't get at least some texture to reply relationships.

Email threads aren't flat, and participation changes, and it would be nice to have some way of knowing what the connections between messages are.
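As a rough illustration of the "in-reply-to" part only, here is a sketch that groups messages by following reply links up to a root. The message shape (dicts with 'id' and 'in_reply_to') is an assumption for the example, and I have no idea whether Gmail does anything like this internally.

```python
def group_into_threads(messages):
    """Group message ids by the root reached through In-Reply-To links.

    `messages` is a list of dicts with an 'id' key and an optional
    'in_reply_to' key (a hypothetical shape chosen for this sketch).
    """
    parent = {m["id"]: m.get("in_reply_to") for m in messages}

    def root_of(message_id):
        seen = set()
        # Walk upward while the parent is a known message; `seen` guards against cycles.
        while parent.get(message_id) in parent and message_id not in seen:
            seen.add(message_id)
            message_id = parent[message_id]
        return message_id

    threads = {}
    for m in messages:
        threads.setdefault(root_of(m["id"]), []).append(m["id"])
    return threads


messages = [
    {"id": "a"},
    {"id": "b", "in_reply_to": "a"},
    {"id": "c", "in_reply_to": "b"},
    {"id": "d"},
]
threads = group_into_threads(messages)
print(threads)
```

The `parent` map is the reply tree itself, so the structure needed to show who answered whom is already there even when the display flattens it into a stack.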

  • Filters

Gmail comes with a straightforward mechanism for labelling messages automatically, and shuffling them out of the inbox automatically. This allows you the workflow of "deal with all messages about that side-project all at once, and LATER". For example, messages that come in about JUNG (http://jung.sourceforge.net) get both labelled and moved directly to the archive: essentially, they are tucked directly into my JUNG folder, and I try to check that only once a day. Unread messages that have been moved to the archive don't show in the window title. It's also possible to label, to move directly to archive without labelling, and to delete messages.

Rather elegantly, the filter comes with a search. When you create a new filter, the system shows you all messages that the filter applies to, so you can judge whether it's correct or not.

The filter system seems to run all messages through all filters: that is, if the message is to be labelled (due to, say, the subject line) and it's to be moved to the archive (due to the sender), then both those actions will be performed. I don't know how the "move to trash" option works in conjunction with the other flags.
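A minimal sketch of that "every filter gets a turn" behaviour, as I understand it from the outside. The filter format, field names and example address are all invented for illustration, and the trash case is left out because I don't know how it interacts.

```python
def apply_filters(message, filters):
    """Run one message through every filter and collect all resulting actions.

    Each filter is a dict with a 'matches' predicate plus optional 'labels'
    and 'archive' entries (a hypothetical format for this sketch).
    """
    labels, archive = set(), False
    for f in filters:
        if f["matches"](message):
            labels.update(f.get("labels", ()))
            archive = archive or f.get("archive", False)
    return labels, archive


filters = [
    # Label by subject line.
    {"matches": lambda m: "JUNG" in m["subject"], "labels": ["JUNG"]},
    # Skip the inbox by sender.
    {"matches": lambda m: m["sender"].endswith("@lists.example.org"), "archive": True},
]
message = {"subject": "[JUNG] graph layout question", "sender": "dev@lists.example.org"}
result = apply_filters(message, filters)
print(result)
```

The point of the design is that no filter stops the others: the first match does not win, so the labels and the archive flag accumulate.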

  • Searches

I like inline, powerful full-text search. It's good. After I graduate, in my Copious Spare Time ™, I will write a powerful full-text search for Thunderbird, and then my life will be much better. (The inline is important! Don't make me read my email from some other application!)

I found that the search fit right into my workstyle: I just tap a few letters, and whatever it was reappears. We'll see how that extends over a year of mail, rather than a few days. (I also found myself missing it when I used my desktop client.)

*Update, 6/26: Ok, I have a real annoyance with the search system. I got email from someone with a weird email address: "markjenols-at-aol.com". (Not really, but close enough.) It's from my aunt and uncle "Mark" and "Jenny" "Olson", who have an AOL account together. Now, I can't easily remember whether it's MarkJenOls, or MarkJennOls, or MarkJennyOlson, or what. So I did the obvious thing, and entered the name "Mark and Jen Olson" into the contact book. When I compose a new message to "Mark", it auto-completes to offer me "Mark and Jenny Olson" (same for "Jen", or "Olson"). But I can't search my mail for "Mark", "Jen", "Jenny", "Olson", or "Ols." Google neither does substrings for "from" lines, nor does it search contacts.

I ultimately found the message through searching on (of all things) "AOL".

  • Contacts

Gmail tracks contacts who you send to or get email messages from and lists them all. You can correct the information, but you can't merge them ("bob@ics.uci.edu" is the same guy as "x.bob@uci.edu"). You also can't currently upload your contact list from some other source. However, the name completion for contacts is very elegant, and it even checks things like middle names.

  • Speaking of uploading...

There's also no easy connection between GMail and the rest of my archives. All messages from before last Thursday still languish in my IMAP folders. A couple of tools have emerged that are meant to help with that, but I haven't tried them out yet.

Gmail loader

(Note that there's another GMail out there, at http://packages.debian.org/stable/mail/gmail.html, associated vaguely with the Gnome project. It also has tools like Mbox2GMail and GMail2MBox, but they are for the wrong GMail as far as I know.)

  • It's all online

When it comes down to it, my biggest gripe with gmail is its best feature: it's online. I send a lot of my best emails offline: without a 'Net to distract me (mmm... strongbad), without news to check, and with a couple years' archives to surf through, I can write interesting, thoughtful responses to things. When I next plug in, I synch up, and the email disappears to the aether. Yes, I can fake that with an external notepad, and maybe keep a copy of my gmail around to check, but it's really not the same experience: online-only email encourages me to write one-line responses, quick messages.

The interface encourages these one-line messages: between the automatic folding-away of old message content (pages of little arrows disappear into a single link "- Show quoted text -"), and the convenient type-to-reply box at the bottom, there's really no reason why I should edit my response much when I can dash off a post-it. Many times, a post-it is exactly what I need, and I have found the interface extremely convenient.

On the other hand, when I'm writing a longer email, I don't like the fact that I can't save an intermediate draft, that I can't freeze the computer and wander away, that I need to stay logged in and online throughout composition.

It gets a little worse. Even though you can pull a response into a new window, you need to do so before you start typing. Gmail can't transfer a draft into the new window. Which means that the one-line response that has suddenly morphed into a half-hour project is blocking your other email from being visible! You can't return to the inbox while you're working in an entry box without losing what you're doing. For those of us who really do use their email as a workplace--as a storage for short thoughts, long thoughts, and everything in the middle--gmail forces us to create a new window before we start typing, or lose out on inflow. (And on outflow: I can't compose a second message while I work on the first.)

Update 6/23/04: This seems to be recently fixed.

My work around has been to open a second window with a GMail inbox. It works, but it doesn't feel elegant.

  • Contacts aren't queries

Bear with me a moment: perhaps I live too much inside the space of my upcoming dissertation. But when I'm looking at my contact list, I need to exit it and then start a new search in order to get messages from or to that person. I fairly strongly feel that everything should be able to be turned into a query on the database. That would give me one-click access to my conversations with X when I see their names. Indeed, Gmail doesn't allow "sort by" anything except date.

  • And so, in conclusion:

I like Gmail. I think I like it enough to keep it as my primary mailer for the next few weeks while I work on my dissertation: it might discipline me to send shorter messages more frequently, and to stay off email more easily the rest of the time. After I'm done, I suppose I'll use Pop Goes the Gmail to get mail out of Gmail and back to my desktop. If I stop using it soon, it's most likely because I find forwarding to it too unreliable--which wouldn't be gmail's fault.

Of course, when I get proper synchronization between Gmail and the offline world and my del.icio.us and, someday, StuffIveSeen, then I will be in a position to not care where my data is, because it's all available. Someday.

2 On my 18" monitor the ads are so far over that I never see them:
I'm too busy looking at the left side of the screen, where the content
is. So I can't say whether the ads are any good or not.

Posted by danyelf at 01:13 PM | Comments (0) | TrackBack

Brad DeLong uses Google as his storage

Brad DeLong mentions that Google is the best way of finding his own materials. It's faster to type a few keywords than it is to search through his records--and as such, he finds himself posting more and more to his weblog.

I do the same: Google is, in some sense, "distance one" from everything. A quick check of my Google search history shows "StringLabeller" (a term unique to JUNG), "Everyday collaboration" to find a copy of my CHI paper, and even "Danyel Fisher resume" to find my own resume.

Yes, it's two clicks off of my home page. But Google is faster.

Posted by danyelf at 12:37 PM | Comments (0) | TrackBack

June 17, 2004

Some messages take a year to deliver

For those who read SMTP headers for fun:

Received: from imap-1.xxxxx.net [xxx.xxx.xxx.xxx]
	by localhost with IMAP (fetchmail-5.9.11)
	for xxxxx@localhost (single-drop); Thu, 17 Jun 2004 17:51:17 -0700 (PDT)
Received: (qmail 15766 invoked by uid 89); 6 Aug 2003 07:46:33 -0000
Received: from geekchic.geekchic.com (
  by mailbox.xxxx.net with SMTP; 6 Aug 2003 07:46:33 -0000
Received: from BlackBox (pv106199.reshsg.uci.edu [])
	by geekchic.geekchic.com (8.9.3/8.9.3) with SMTP id AAA26513
	for <bonnie@darrouzet-nardi.net>; Wed, 6 Aug 2003 00:46:30 -0700

I sent the original message on Wednesday, August 6, 2003. It was received at the mail server at the destination on August 6, 2003. It was relayed to IMAP yesterday. Just under a year later.
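For the curious, a small sketch of how to measure that gap from the headers themselves, using Python's standard `email.utils`. The two strings are shortened versions of the first and last Received lines above, and `hop_time` is just a helper name for this example.

```python
from email.utils import parsedate_to_datetime


def hop_time(received_header):
    """Return the timestamp that follows the last ';' in a Received header."""
    return parsedate_to_datetime(received_header.rsplit(";", 1)[1].strip())


# Abbreviated from the headers quoted above: first hop and final hop.
first_hop = "by geekchic.geekchic.com with SMTP; Wed, 6 Aug 2003 00:46:30 -0700"
last_hop = "by localhost with IMAP; Thu, 17 Jun 2004 17:51:17 -0700 (PDT)"

delay = hop_time(last_hop) - hop_time(first_hop)
print(delay.days)  # 316
```

Received headers are stacked newest-first, so the bottom line is where the message started and the top line is where it finally landed.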

I'm not the only one, either: apparently, someone just reconfigured the mail server, and the system poured out its guts.

Posted by danyelf at 06:15 PM | Comments (0) | TrackBack

unwashed richard stallman's laundry

Without attempting to interpret the comment on danah's blog, I simply must reprint it here:

to earstap, this is not easy and requires sofisticated signal processing systems either used by military/police, or a deep understanding of standard consummer technology that is re configured to be p2p to the nth power, within a hamilton arc: {laplace} lost within schroedinger's equation, and unwashed richard stallman's laundry...

(Courtesy one stephanos).

Posted by danyelf at 02:58 PM | Comments (0) | TrackBack

June 16, 2004

Addictive Services and Ethics

I don't like being prophetic. A few months ago, I asked about abandoning users, and wondered what the ethical implications of shutting down free and research systems were.

Researchers become very interested in building systems, and we study adoption a great deal: how many users can we get, how fast can we get them, and how many hours can we get them to use the system? Those are all good things: the CSCW paper that does wonderfully is the one where the speaker stands up and says, "It was more addictive than crack; we have the entire English-speaking world using it; and they never log off."

Unfortunately, the next line is--all too often--"then we completed the six week study and turned it off. What a wonderful lesson we learned!"

Now we've seen two instances of this in the weblogs world recently.

Six Apart, the makers of this very weblog software, announced a new upgrade policy that was startlingly expensive and made difficult demands on many users. Even though the upgrade policy has been revised based on user feedback, some users are still irritated.

Yes, that's about upgrades, but very few people want to stay behind when the technology moves forward.

Of course, that's nothing on l'affaire d'Weblogs, in which Dave Winer shut down (rather suddenly) his free webhosting service, thus shutting down several thousand weblogs at once.

Others have written elegantly about this situation: Many-to-Many sees this as an issue of managing expectations ("plan for success"); Brad DeLong suggests that "An internet in which you can expect persistence, et cetera only if you pay for it is a quite different animal."

And LawMeme claims that Dave "had a serious obligation not to leave them in the lurch," and sees this as a blow to Winer's credibility as a weblog guru.

I'm just realizing that it's an ethical step: there's a serious danger to building a multi-user system of any sort. If people are dependent on it, they'll want it. And you need to think of a plan for what to do when your funding, or enthusiasm, or access runs out.

Posted by danyelf at 02:39 PM | Comments (0) | TrackBack

GMail invite?

I'll be posting a "how Gmail was my primary mail client for a week, and what I thought of it" in a few days. Until then, I have a steady supply of invites, it seems... just in case you somehow don't have one, and want one.

update: Er, I'm not shipping 'em out anymore. Go to gmail swap, or ebay, or something....

Posted by danyelf at 10:55 AM | Comments (0) | TrackBack

June 15, 2004

Take THAT, Ubicomp!

William Tozier is unimpressed by the ubiquitous computing visions he's heard. Apparently, someone told him that his fridge will someday order his milk.

But, if I may say so, it’s the most irredeemably boring vision of the future I’ve heard for several decades. My fridge will order my milk? Thousands of man-hours of research and thought by diligent creative grad students and technicians and a few professors leads to the disintermediation of the ... shopping list industry? What happens to all the innumerable real advances in multi-agent systems, smart materials, affective computing, and ubiquitous computing? We forget them, like the people in the Star Trek universe all forgot how to use an automatic pilot or a computer targeting system? (ref)

He wants a more exciting future vision:

My milk will sense it’s not feeling well, and will chat with the fridge and maybe ask it have a look-see with its extra senses and bring its extra smarts to bear, or ask some friends. Together they concoct a plan to remedy the situation. Maybe they do some chemistry. Maybe they develop some antibodies. Maybe they try to talk the bacteria out of their harshness, convert to a nice communal yoghurt and seek a permanent existence as a collective, nurtured and supported by the sheltering fridge. The least they can do is see it off to a noble end, with a little dignity, and make arrangements to take care of its progeny. (ref)

Posted by danyelf at 04:25 PM | Comments (0) | TrackBack

Four dimensions of English words

Via Language Log, a nifty visualization of 4-letter English words here (written with Processing, which is incredibly pretty looking).

Each letter is a different dimension. X, Y, and Z are the second, third, and fourth letters of the word, respectively.
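A minimal sketch of that mapping, assuming each letter is placed by its position in the alphabet (a=0 through z=25). That numbering is my guess at the obvious encoding, not something taken from the visualization's source, and the sketch leaves the first letter out because the axes only cover letters two through four.

```python
def word_to_point(word):
    """Map a 4-letter word to (x, y, z) from its 2nd, 3rd and 4th letters.

    Letters are numbered a=0 ... z=25 (an assumed encoding for this sketch).
    """
    if len(word) != 4 or not (word.isascii() and word.isalpha()):
        raise ValueError("expected a 4-letter alphabetic word")
    return tuple(ord(c) - ord("a") for c in word.lower()[1:])


print(word_to_point("word"))  # (14, 17, 3)
print(word_to_point("able"))  # (1, 11, 4)
```

With this encoding "a" sits one step from "b" and four steps from "e" on every axis, purely because of alphabetical order.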


It's interesting, although it's got a few difficulties. I agree with Language Log that it really should do some sort of dimensionality reduction, and perhaps a shuffling of the elements: there's a lot of noise in the system from coincidences. (Specifically, the visualization implies that there is meaning to adjacency, that A is "closer" to B than it is to E. Is that true, linguistically?)

Don't we have ways of measuring these things? Perhaps transforming it into another space--a pronunciation space, for instance--would be informative. Sort it by phonemes.

Posted by danyelf at 03:52 PM | Comments (0) | TrackBack

June 13, 2004

Irritating Vertical Lines

My computer is acting oddly these days. Regularly, bits of the screen overlay themselves with these little vertical lines.
At the moment, as I write this entry, the top-left quarter of my screen shows them (except in parts that have recently refreshed--causing it to redraw always fixes the problem); so does the entire taskbar across the bottom. Ever seen this before?

It oddly recurs--this bit cleans up, then something sets it off on another screen, and it becomes visible again...

Posted by danyelf at 03:58 PM | Comments (0) | TrackBack

In another life...

.... I want to go back into the liberal arts and be a linguist. Doesn't this sound like fun?

Here's an entry from the OED's Day in the Life
(via LanguageHat )

Apart from othering (see above), I have been preoccupied with participles. Assigned the simple task of putting together a definition and quotation paragraph for the literal sense of the adjectival compound ‘oiled-up’ — which one would expect to appear in print well before the more fanciful metaphorical extensions of the term — I've hit upon a possible first quotation from Dickens' Hard Times: ‘All the melancholy-mad elephants, polished and oiled up for the day's monotony, were at their heavy exercise again.’ But is this, on second thought, more verb than adjective, with its implied ‘elephants [that had been]...oiled up’? It would be a shame not to be able to use the quotation: it would make a fine counterbalance to the ‘oiled-up love god’ who features in the last quotation in my draft entry, from a 2002 issue of Smash Hits magazine.

Posted by danyelf at 03:44 PM | Comments (0) | TrackBack

June 12, 2004

No more cell phone

I have finally become totally and entirely frustrated with the fact that my AT&T Wireless phone refuses to make a connection on campus.

It's particularly bad because when I re-upped my contract, I was assured that with the new AT&T digital, I'd have no problem getting reception anywhere a Cingular phone could. So there I sat, my officemate's Cingular phone at full signal, mine offline.

Incidentally, everyone at AT&T customer service or sales was stunned by this: "but--it's Cingular! you should have no trouble with them!" Everyone I spoke to at AT&T technical services was unsurprised.

So I returned the phone today and cancelled the contract. I will try to get a phone soon; however, my old (617) number is dead and gone.

Posted by danyelf at 05:40 PM | Comments (0) | TrackBack

June 11, 2004


So I'm trying to debug a problem with JUnit and a friend's Eclipse install. He's been fighting with it for hours: he's checked newsgroups, he's installed old versions. He's sure that his particular install of JUnit and Eclipse is--must be!--broken.

I ask him to confirm that the system basically works. Can he run any other code? Can he get stuff up and going?

He can't. It's not just JUnit, it's some bigger misconfiguration. And from there he's off and running.

Yodalike, I write:

Never hyperfocus on one problem. It never is. (Except when it is, but then it's often something else.)

Posted by danyelf at 04:04 PM | Comments (0) | TrackBack

Full text search for mail

My officemate just installed Lookout for Outlook. Would be nice if a parallel version came out for Thunderbird... yes, I'd pay for it. (Heck, I'd code it if I needed to, and if I wasn't dissertatin' all the time.)

Never could get Zoe to work. I've got GMAIL up and running, but my mail archive is huge, and I'm not eager to email it all to myself quite yet. And I'm not really eager to lose my email when I'm offline. (My best archive work happens when I'm offline).

Posted by danyelf at 11:28 AM | Comments (0) | TrackBack

June 09, 2004


Just started a "social bookmark" set on del.icio.us (named, of course, madeoutofpeople). It's an interesting system. Perhaps a useful way to save bookmarks--I haven't decided yet.

Awkward interface for tagging and categories: you don't know how other people have labelled something until you've already added it; you can't easily surf stuff under a given tag when you are on someone else's page.

What I like about it is the multi-partite, time-coded graphs linking people to categories to content. Would love to throw it into JUNG and start looking at the evolution of topics and categorization...

Posted by danyelf at 11:40 AM | Comments (0) | TrackBack

Random Notes

Suspension of Disbelief: Airplane Movies "Love Actually." Watching it in late May takes several willing suspensions of disbelief. Ranging from sitting out here with the sun streaming in over my shoulder watching winter on TV, all the way out through believing that Hugh Grant is the Prime Minister of England. It's all very cute that he's a bumbling kind of funny prime minister. But it's not even remotely real. (On the other hand, Billy Bob Thornton really seems to enjoy the concept of being President of the USA, in a cowboy sort of way.)

Coffeeshop notes It's a Diedrichs coffee in Irvine. I'm procrastinating on my dissertation, of course, but I seem to do so more effectively here than in the office. Around me, students work; a group of friends laughs loudly, and, side-by-side on the couch next to me, a couple is on a first date. They've run into a few conversational dead ends (he didn't know that her home, Kansas City, was in Missouri; she simply looked dazed at his stories). But they both want to make something of the evening, and so they've decided to probe other routes. She's working on identifying odd features of his hand (she can't actually palm read, but she's going for the full wrist sensuality thing); he's saying things to her in Pig Latin, which causes her to giggle and him to whisper.

And something in me wants to come in, compete with him, sweep her off her feet: "Arlingday, ehay eakspay igpay atinlay ikelay away ikerpay. Unray away ithway emay, andway ivelay anway incomprehensibleway ifelay."

Posted by danyelf at 09:50 AM | Comments (0) | TrackBack

On Email, To read and parse

The INBOX event

I'm not convinced (as others seem to be) that Sarbanes-Oxley and compliance rules are really something that we can just see as noise and route around. Dourish has pointed out that some of his students see email as highly formal: it's the way they communicate with profs and the school, but not the way they communicate with each other.

Sarbanes-Oxley, and corporate restrictions on how much email you can retain, are both signs that the corporate world is trying to get its head around where email operates in the formal business process. Rather than bemoan the loss of archives and institutional memory, let's watch this process develop, try to understand the legal implications, and develop communication sources that address the underlying problems.

Posted by danyelf at 09:34 AM | Comments (0) | TrackBack

Corporate Technology Transfer

I’ve been interviewing recently at a couple of Very Large Organizations (henceforth, VLOs) that have a strong research point of view. The names are elided largely because I’d like to discuss a general problem. These particular organizations aren’t the only ones that I’ve heard of that suffer from this; rather, it seems to happen across the board. On the other hand, I have more experience talking with the VLOs.

The problem that I’m thinking of is the technology transfer issue. An article in Scientific American this month, for example, worries that MSR may not be able to transfer ideas into products; famously, it was exactly this problem that hurt Xerox PARC.

The VLOs are certainly aware of these challenges, and are beginning to tilt their internal incentive systems to reward good, technology-transferred behavior. There are always internal and structural difficulties (research, development, and marketing are not always well-harmonized, at virtually any organization); more so, there is an odd issue that can happen in the use and harnessing of creativity.

The VLOs that I’ve spoken to have all expressed concern about the development groups. In particular, development (and I write here purely from the biased research perspective) is hesitant to adopt untested technologies. They don’t want to build things that may not work; they don’t want to work with pieces that haven’t been compellingly demonstrated and tested.

Even when a full package is visible, with all the pieces in place, it can take time to bring a product to market. Several years ago, when I worked at IBM Research in Cambridge, the team was working on getting the REMAIL project into development. Their solution was to build a complete prototype, and thence to pass that full prototype into development teams. The teams, then, could take pieces from that prototype, implementing them in a product.

One interesting aspect—what I want to focus on—is the increasingly-tangled web that patents and intellectual property draw. Researchers are fairly constrained in what they can put in: they cannot violate patents, nor (since they need to publish) may they particularly work on ground that others have covered in depth.

Unfortunately, under this model, they also need to build those prototypes, and time spent building a prototype of known work is time lost; time spent building a prototype based on someone else’s patent might be seen as constraining the system’s ultimate use (or inviting a lawsuit).

Which means that the prototypes are built with an odd internal gap: they reflect a lot of local knowledge, but very little from outside the organization, even in a field where many people are working on the problem. This is not necessarily a tremendously constraining problem, as academics and researchers can be expert at recasting their work in terms that make it seem unique.

None the less: when something doesn’t make the prototype, it doesn’t get seen by the development and production team. And the development team can’t build things that it doesn’t see1.

This is, of course, a variant of the “not built here” syndrome that many places suffer from, on all levels. People often don’t trust code they didn’t write, or systems they haven’t put together themselves. And so a great deal of time and effort is often spent, in many contexts, reinventing, if not the wheel, then at least the tire. (That way, you get to write “VLO” on the rim). I’m concerned about it in this case because, it seems to me, it constrains not only the amount of time spent developing, but also the deeper issues of what ultimately becomes part of the product.

How does one bridge this gap?

1 There is an entirely different issue in trying to figure out how close to full implementations prototypes should be: I’ve heard everything argued from nothing-but-flash all the way out to “build a full version, and let the developers repackage and publish it.”

Posted by danyelf at 08:05 AM | Comments (0) | TrackBack

June 06, 2004

Random note

SEATTLE. Visiting Microsoft as a "guest researcher".

This evening, I saw my host's wife wandering about with a keyboard. "I need to reset my tablet pc," she said. "And to do that, I need to hit control-alt-delete. So I'm getting the remote control keyboard."

Wow. Keeping the keyboard for control-alt-delete.

Posted by danyelf at 10:46 PM | Comments (0) | TrackBack

June 04, 2004

Getting Stuff Done: A Hierarchy (School Organization, Part I)

Last year, as a graduate student, I joined the Graduate Council. This is a division of the Academic Senate, and is responsible for stamping and approving (or disapproving) issues that may affect graduate life. Student funding and housing are higher-profile work, but don't quite get the depth of time and energy that program reviews, catalog changes, and the like do. The vast majority of the Council's decisions end in writing a letter, recommending that a program provide more information, or approving a particular plan. This isn't budgetary approval (although budgetary issues may be raised); it is instead weighing proposals on their academic, and educational, impact.

In general, of course, graduate students don't know a lot about their schools. I think we all pick up, pretty quickly, the structure of the department--deans and professors and suchlike. There's a hierarchy of getting stuff done and checks written, and we all figure out how it goes.

It goes from God, to Jerry, to you, to me, to the cleaners.
-"Real Genius":http://www.imdb.com/title/tt0089886/quotes

In particular, we learn that administrative assistants know everything; and that anything can be fixed or finessed if an administrative assistant and a professor both agree on it. But I don't think we think much more about the rest of the university, outside our own department or school. Most are generally unaware of roles like the vice-chair for graduate affairs in their own department, much less the higher levels. So while the Graduate Council grinds through the (very important) "Graduate Student Rights and Responsibilities" document, I'll throw out a few notes. These apply to UCI; I'm sure there are parallel versions at your institution.

[more inside]

  • Association for Graduate Students is a representative body of grad students who are meant to resolve issues that affect students. At UCI, at least, many positions are vacant, so it's effectively self-appointed. (If you run, you'll get in the top few.) This is potentially an effective way to get a message through, partially because the administration is often eager to work through the official channels, and the AGS is an official channel for student voices. This is for generalized issues; ones that might affect policies...
  • ... which is distinct from employment issues, which may have to go through the union contract
  • ... which is distinct from formal complaints, which go through grievance procedures and possibly the academic senate's procedures for issues like discrimination. (Which is, in turn, different from taking legal action).
  • Graduate work is generally supervised by Research and Graduate Studies' Office of Graduate Studies (henceforth RGS and OGS). Ignoring most of the stuff on the website (would you really look here to find a list of movies showing in Irvine?) it takes a little digging to find the Staff Publications page, with cool things like the graduate advisor's handbook (ought to be required skimming for every grad student, I think). There's all sorts of cool people there, like various Deans who are the people to Go To when Bad Things Have Happened. And the Graduate Council, which is the academic senate committee most closely related to RGS.

(Indeed, the chain--at least for systemic or rules-based problems--is "your advisor, your chair, your dean, the dean of graduate studies or the grad council, vice chancellor for student affairs". Each will be disappointed if you haven't tried to address it more locally.)

There's an ambiguity, there, in which issues are Grad Council and which ones are Dean. I don't really know how to address that--the Council studies curricula and roles, while Deans deal with people, might be one version of the division of labor. But the distinctions are loosely-drawn, and of the various mysterious mechanisms, the Council flies the most under the radar.

For the record, lots of these organizations are surprisingly answerable to students. For example, Council minutes are a matter of public record. The question is how the public record should be made available. Do students know that they can simply speak to the Council Analyst and get them?

Posted by danyelf at 10:33 PM | Comments (0) | TrackBack

Center for Unconventional Security Affairs

To read, later: Center website.

Posted by danyelf at 04:06 PM | Comments (0) | TrackBack

Hybrid Vigor

UCI's ACE program had a bit of an administratively rough start, but is now off and running. I've got long happy associations with artists who compute, programmers who do art, and all the various combinations. There's weird art that messes with your head (like Eric and his Legal Tender and Dispersion); there's cool work that explores boundaries in new ways (like the PARC XFR project), and there's all sorts of other stuff.

Here's real live grad students doing it, right here at UCI!

Well, you can now see what they've got for themselves at the Beall Center . Most of it should be seen in person, but fortunately, the webserver-in-a-frog is online and reachable.


Unfortunately, the exhibit doesn't have a web client there, which led Paul and me to sit there with our (GPRS, web enabled) cell phones, trying to get to the website.

Then, failing that, calling various friends. "Go to that website. Good, now click there! Make it twitch!"

Posted by danyelf at 04:01 PM | Comments (0) | TrackBack

Sudden cultural realization

The latest "Onion" column, Ask a Jostens Class-Ring Salesman, made me check out the Jostens web site. Which has, it seems, a section of home-schooling options. (Indeed, the Onion article even references home-schooling options.)

Graduation gowns. Class rings. Graduation keyrings and memory books and all the paraphernalia of high school graduation.

Now, I should admit a fair bit of ignorance up front: I was not home-schooled, and I didn't get a class ring for my high school. (Also, I missed my tenth reunion.)

I see that they have Latin mottoes1 for home-schooled class rings: "Deus Veritas Familia (God Truth Family)" and "Veritas Familia Sapientia (Truth Family Wisdom)". But there are still lingering questions:

  1. Which mascot2 goes on it?
  2. What school colors do you choose?

I guess I misunderstand the notion of class rings: maybe I just know too many brass-rat wearers. But there's something in the class ring, I thought, about shared pain and shared experience. An identity-labelled sign of community and all that.

Perhaps this is partially based on a mistaken assumption: after all, class rings are so customizable that there's no easy way to tell whether you and I went to the same school without squinting. Which suggests it's about something a little more personal, and a little less Goffman.

1 Technically, these are just lists of nouns. How about "Deus Veritas et Familia" ("God, Truth, and Family")? Or even "In Familia, Veritas et Sapientia" ("In family, truth and wisdom")?

2 You can get a number of different sides on the ring. In the "beyond school" list, for example, are

Blading / Bowling / Bull Riding / Calf Roping / Checkered Flag / Cycling / Deer Hunting / Duck Hunting / Equestrian / Fishing / Low Rider / Martial Arts / Motorcycle / Mountain Biking / Pheasant Hunting / Riding / Rodeo / Scuba / Skateboarding / Snowboarding / Snowmobiling / Surfing / Water-skiing / Wildlife / Windsurfing

There are also possibilities for various clubs, afterschool activities and teams, and associations.

Posted by danyelf at 11:37 AM | Comments (0) | TrackBack

Online Communities: more pointers

Some stuff I've run into in the last few days that seems like I should keep track of (isn't this what del.icio.us is for? yes.)

Full Circle Associates Online Blog
Full Circle Online Community Toolkit

And, also,
Teleconference on Making Visualizations of Complex Information Accessible for People with Disabilities

... an issue of some interest to me, partially since I understand that large software makers with government contracts end up making decisions about how accessible their software is relative to the Americans with Disabilities Act. And that brings up visualization design issues. More later, after I hear back from my favorite ADA compliance expert...

Posted by danyelf at 09:22 AM | Comments (0) | TrackBack

June 03, 2004

Weird email concept

Bumplist (via FullCircle)

BumpList is a mailing list aiming to re-examine the culture and rules of online email lists. BumpList only allows for a minimum amount of subscribers so that when a new person subscribes, the first person to subscribe is "bumped", or unsubscribed from the list. Once subscribed, you can only be unsubscribed if someone else subscribes and "bumps" you off. BumpList actively encourages people to participate in the list process by requiring them to subscribe repeatedly if they are bumped off. The focus of the project is to determine if by attaching simple rules to communication mediums, the method and manner of correspondences that occur as well as behaviors of connection will change over time.

Currently, BumpList is limited to six people. The statistics page shows that the top people have accumulated 100-odd days online, with 1000-odd postings each (high traffic!) and 7000 or so bumps. That's seventy per day. You show up, you post, you get bumped. You resubscribe.
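The bump rule is easy to sketch as a fixed-size first-in, first-out queue. This is only my reading of the description above, not BumpList's actual code; the class name and the choice to ignore a repeat subscription from a current member are assumptions.

```python
from collections import deque

class BumpListSketch:
    """Hypothetical model of the rule: a full list bumps its oldest subscriber."""

    def __init__(self, capacity=6):
        self.capacity = capacity
        self.members = deque()  # oldest subscriber sits on the left

    def subscribe(self, person):
        """Add person; return whoever got bumped, or None."""
        if person in self.members:
            return None  # assumption: current members can't re-subscribe
        bumped = None
        if len(self.members) >= self.capacity:
            bumped = self.members.popleft()  # first in, first bumped
        self.members.append(person)
        return bumped

bump_list = BumpListSketch()
for name in "abcdef":
    bump_list.subscribe(name)
print(bump_list.subscribe("g"))  # a

# The per-day figure: roughly 7000 bumps over roughly 100 days.
print(7000 / 100)  # 70.0
```

With six slots, staying on the list means re-subscribing every time six other people have joined after you.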

I realize it's an experiment, but:
* Why do people feel it's so compelling to keep getting back on? Or have they written scripts to do so?
* How do you handle that much traffic coming through?
* Is the exclusivity of the conversation so compelling that the content doesn't matter? I usually join communities or conversations that I have something in common with.

Posted by danyelf at 08:12 AM | Comments (0) | TrackBack

Periodically perl

My friends have scared me with their unabashed love for LaTeX.

I do not write my dissertation in LaTeX. That's because I like WYSIWYG, and dislike the idea that a minor code being confused on page three of a document may cause page fifteen to look wrong. And because I dislike compiling my documents. Or getting compiler errors.

Which makes me think of my various other computer scientist heresies.
But the question of "what about scripts" is always a curiously pressing one. Somehow, I always find myself with vast quantities of data that need clever find-and-replace, small fixes, or a refactoring. Which is why one usually uses Perl, a computer language that seems to inspire phrases like bletcherous hack.

With no further ado, then, I'll suggest you check out Mark Lentczner's Periodic Table of Operators to get an idea of just how weird Perl really is.


Posted by danyelf at 07:37 AM | TrackBack

June 02, 2004

Sequence Analysis

[technical pointer, to self]

Software for the Analysis of Interaction Sequences

GSEQ (General Sequential Querier) is a program devised for sequential analysis. It reads compiled SDIS files and provides a variety of sequential statistics, including tables of lag frequencies, chi-squares, and adjusted residuals. Several kinds of data modifications are permitted, including recoding, lumping, chaining, time-windowing, and removing of behavioral codes. GSEQ can export results for further analyses using SPSS, BMDP, SAS, ILOG, etc. Users interact with GSEQ by a specific command language.
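For my own reference, a "table of lag frequencies" can be sketched in a few lines: count how often one behavioral code is followed by another a fixed number of events later. This is a toy illustration of the idea, not GSEQ's implementation or file format, and the event codes below are made up.

```python
from collections import Counter

def lag_frequencies(codes, lag=1):
    """Count how often code b occurs exactly `lag` events after code a.

    Assumes lag >= 1 and that `codes` is a list of event labels in order.
    """
    return Counter(zip(codes, codes[lag:]))

# Hypothetical coded interaction sequence.
events = ["talk", "look", "talk", "look", "smile", "talk"]
table = lag_frequencies(events)
print(table[("talk", "look")])  # 2
```

Statistics like chi-squares and adjusted residuals would then be computed on top of a table like this one.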

Didn't even know this field existed.

On the other hand, can't say I'm delighted by the notion of "buy the book, then buy the software" as a way of learning anything about a field...

Posted by danyelf at 10:25 PM | Comments (0) | TrackBack

Version Control and the Single Dissertation

As my dissertation moves along1, I'm finding that I need to track versions like never before. Here are the constraints (and I suspect they are not uncommon):

  1. I use binary files. In particular, I use Microsoft Word. Which means that things like CVS won't really give me a good "compare" for merges.
  2. I work on a variety of machines, including a desktop at home, a desktop in the office, and a laptop.
  3. I use network storage (always available from the office, usually from home, sometimes from the laptop)
  4. I use a pen drive (annoying to put in anything except the laptop2)
  5. I edit on vast quantities of printouts.

This leaves a nasty synchronization problem or two. Some are just the fault of the existence of paper (if I leave behind a draft, I really don't have it.) But some are just the fact that HotSync-for-desktops really isn't there. Below the fold is a (painstaking!) discussion of how I solve it. How do you?

1 Chapter 3 nearly done; Chapter 4 over halfway there; now working on 5; 1-2, 6 still in need of work; still working on the source document that will become 7...

2 Dude, I've got a Dell. And the front-side USB slots all point downward, under a weird plastic flap.

When I collaborate on papers with my advisor, we incrementally number. I send him draft 1, he sends me back draft 2. I respond with 3, and possibly 3a if I have more edits before I hear back from him. And so on.

This numbering goes in the filename. CHI-PAPER-1.DOC. And so on.

I've adapted that for this problem. Chapter-#-version.doc is the way an entire directory of my disk looks right now. When I copy to another medium, I increment. When I mail a copy to someone else, I increment. Paper printouts use Microsoft Word's rather nice auto-time-and-date feature, as well as the filename feature. Which means I have piles of paper around labelled "Chapter 3-5.doc 16 11:41 AM " on each page.

Now the numbers go up annoyingly fast, and there's no real way to synchronize myself between chapters temporally. But at least I can track versions, a little, and know something about which are obsolete. (It's harder to know which is the newest. Occasionally, I'll hit version K on one machine, and irritably remember that I've already created K on another.)
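The bookkeeping could be partly automated. Here is a rough sketch that assumes filenames look like "Chapter 3-5.doc" (chapter, then version, with an optional letter suffix such as "5a"); the regular expression, the function names, and the sample file lists are all hypothetical, and it only looks at names, never at file contents.

```python
import re

# Assumed pattern: "Chapter 3-5.doc" or "Chapter-3-5a.doc".
PATTERN = re.compile(r"^chapter[ -](\d+)-(\d+)([a-z]?)\.doc$", re.IGNORECASE)

def parse(name):
    """Return (chapter, version, suffix), or None if the name doesn't match."""
    m = PATTERN.match(name)
    if not m:
        return None
    return int(m.group(1)), int(m.group(2)), m.group(3).lower()

def newest_per_chapter(names):
    """Map each chapter number to its highest-versioned filename."""
    best = {}
    for name in names:
        parsed = parse(name)
        if parsed is None:
            continue  # skip files that aren't chapter drafts
        chapter, key = parsed[0], parsed[1:]
        if chapter not in best or key > best[chapter][0]:
            best[chapter] = (key, name)
    return {chapter: name for chapter, (key, name) in best.items()}

def shared_labels(machine_a, machine_b):
    """Version labels present on both machines: possible collisions."""
    labels_a = {parse(n) for n in machine_a} - {None}
    labels_b = {parse(n) for n in machine_b} - {None}
    return sorted(labels_a & labels_b)

office = ["Chapter 3-5.doc", "Chapter 3-4.doc", "Chapter 4-2.doc", "notes.txt"]
laptop = ["Chapter-3-5.doc", "Chapter 3-5a.doc"]
print(newest_per_chapter(office + laptop))
print(shared_labels(office, laptop))
```

A shared label isn't necessarily a conflict (it might just be a copy), so this only flags candidates to check by hand.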

I don't have a canonical place to store versions. That's because I've been travelling to various places, and when I travel, I want the canonical version on my laptop; when I'm home, I want it on my network drive.



Looking back on this, it looks even worse. How do you do it?

Update: Red Ted has even longer names than I do. Of course, my dissertation is a smaller entity than his--they mean different things in the social sciences.

Posted by danyelf at 12:57 AM | TrackBack