The second article details ways that a Usenet conferencing system can be used in a corporate environment to reduce the amount of info that people wade through on a daily basis.
The third article (not yet written) is intended as a technical implementation document. It will describe the steps needed to set up a working Usenet conferencing system in your environment, and the things to consider when tying the new system into your existing email system, so as to extend your organization's capabilities while reducing "info-glut" in the office and letting people focus more clearly on their actual tasks.
First off, Usenet is the largest distributed database project in the history of Mankind. It's been under development, and growing, since about 1980. It's a loose collection of machines across the planet that share data with each other. There are hundreds of thousands, perhaps millions of "nodes" in this database, and they try to keep each other up to date with all the latest news. They communicate with each other using NNTP, the Network News Transfer Protocol.
From an end-user perspective, Usenet appears like a bunch of "bulletin boards" or "chat areas". (Unlike the current usage of the term "chat", though, discussions take place over the course of weeks or months, rather than in real time like the "chat rooms" on the Web.) These "boards", or newsgroups, are where people post messages on certain topics. Someone posts a question, or opinion, or interesting factoid, and others are able to respond, either via private email to the poster, or as another public post that goes into the group as a "followup message".
Each time one of these postings is made, it is branded with a unique
"Message ID", which is a long string of characters generated by the news
server software. They usually look something like this:
Estimates of Usenet usage vary greatly, and there's no definitive way to
tell for sure. Besides, the number is growing every day, so it doesn't
really matter. The last number I heard was something around 30-33
million people, and that was a while ago. It may or may not be accurate,
but it either was, or will be, at some point. :-)
At the time of this writing, Usenet is generating about 6-8 gigabytes of
data per day. This data gets dutifully shuttled around to all of
the nodes participating in the Usenet network, and machines try to keep
each other up to date. They will get a note from one machine, and
immediately turn around and ask all of their "peers" if they've "heard
the news". They pass on the new Message ID, and say "Do you have this
one, yet?" If the answer is "Yes, I've got that one", then nothing is
done, but if the answer is "No, I haven't seen that one, yet", then the
message is passed on, and the next machine begins asking its peers. In
this way, news is passed quickly and efficiently around the globe, often
in a matter of minutes.
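To make the "Do you have this one, yet?" exchange concrete, here is a minimal Python sketch of the flooding idea. It is a toy simulation, not NNTP itself: the `Node` class, the `link` and `flood` helpers, and the example Message ID are all made up for illustration.

```python
from collections import deque


class Node:
    """A hypothetical news server that remembers the Message IDs it has seen."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.seen = {}  # Message ID -> article text

    def has(self, message_id):
        return message_id in self.seen


def link(a, b):
    """Make two nodes peers of each other."""
    a.peers.append(b)
    b.peers.append(a)


def flood(origin, message_id, article):
    """Spread one article from origin; return how many full transfers happened."""
    origin.seen[message_id] = article
    transfers = 0
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in node.peers:
            if peer.has(message_id):
                continue  # "Yes, I've got that one": nothing is sent
            peer.seen[message_id] = article  # "No, I haven't seen that one"
            transfers += 1
            queue.append(peer)  # the peer now asks its own peers
    return transfers


a, b, c, d = (Node(name) for name in "ABCD")
for left, right in [(a, b), (b, c), (c, d), (d, a), (a, c)]:
    link(left, right)

transfers = flood(a, "<example-1@news.example.com>", "Hello, world")
print(transfers)  # 3
```

Even though these four machines share five links, the article itself is transferred only three times, once for each machine that lacked it. Every other offer is declined on the strength of the Message ID alone.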
Usenet isn't really any "place", just as the World Wide Web isn't any
"place", but it does form a very integral part of the virtual space of
the Net. Conceptually, it's the place where people go to talk. It's like
the coffee shops and meeting halls, the town squares and workrooms, of
our physical space.
Each newsgroup represents a different area of interest. Sometimes
they are on totally different topics (such as sci.chem and alt.swedish.chef.bork.bork.bork)
and sometimes they are splinter groups that differ in some very specific
way (such as rec.aquaria.freshwater.cichlids
and rec.aquaria.freshwater.goldfish).
The last count I saw was that there were more than 100,000 different groups
around the planet. Some of them are global, some are extremely regional,
dealing with one specific region, or city, or school, or even one class in
a school.
Luckily, many Usenet news reading programs have tools built in that
let you avoid most
of that spam. Learning how to use those tools is pretty easy, and gives
you surprising levels of control over what pops up on your screen.
A thread is not limited to one subject line, or a specific group of
people, or even a particular length of time. (Some go on for months,
some are only a message or two long, and then die out due to lack of
responses.) Think of it like a gathering in a coffee shop: Some people
are sitting around, someone starts up a discussion, and conversation
ensues. Over the next hour or two, some people get up and leave, some
come in and join the conversation, and eventually, everyone finishes up,
has their say, and heads home. The coffee shop closes, and the
discussion is complete. (Or, if it's a 24-hour shop, like Usenet, then
perhaps people keep on talking 'round the clock.) At the end, the
conversation may have wandered all over the place, and may not resemble
the initial question/statement at all. This is very common.
If you were in the coffee shop, you might join in with the conversation
at one table, or it might bore you, so you go to a different table, to
see what people are talking about there. You can do the same thing in
Usenet newsgroups, with the added benefit of being able to filter out
boring threads, and auto-select the ones that interest you.
At its simplest level, you can say "filter out anything about topic X"
and "select anything about topic Y", where you supply the search
criteria. In many newsreaders, the default is to look at the
Subject: line. However, you can also say things like "select
anything by person A", and "filter out things by person B", or even
"select things by person A, except when they're responding to
person B, because I'm tired of listening to them argue." (!) These
involve scanning the From: line in postings, and in some more
advanced newsreaders, you can also look in other (or any) header lines,
such as Keywords:, Summary:, Distribution:,
Newsgroups:, check the number of lines in the post, etc. You can
also use some newsreaders to scan through the body of the article. Thus,
you can create complex sorting rules like the following:
Since the Subject: line can vary so widely, and since it's not
even required, it's not the best thing to filter with. It's useful as a
rule-of-thumb, but not dependable. Also, filtering by author is only
good if someone uses the same account all the time, and that's not
guaranteed. What is guaranteed (if the software is written
correctly) is that each and every message will have a unique
Message ID.
But if it's unique, you can't know in advance what it will
be, so how can you filter on it? Usenet posting software (if it's
written correctly) will keep track of those Message IDs, and when a new
message is posted, it gets a unique ID, as well as a References:
line that lists the ID or IDs that it's replying to. Thus, you can trace
back the thread, and find earlier posts, if you want to check what
someone wrote earlier.
Thus, every response refers back to an earlier post, and every fork in
the conversation, or every change in topic, is traceable by its Message
ID, even if the subject line is changed, or dropped.
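As a rough illustration of how a newsreader might rebuild a thread from these headers, here is a small Python sketch. The article dictionaries, the example Message IDs, and the `build_children` and `render` helpers are invented for this example. It also assumes the usual convention that the last ID on the References: line is the direct parent.

```python
from collections import defaultdict

# Invented sample articles; "references" lists ancestor IDs, oldest first.
articles = [
    {"message_id": "<1@a.example>", "references": [],
     "subject": "Original question"},
    {"message_id": "<2@b.example>", "references": ["<1@a.example>"],
     "subject": "Re: Original question"},
    {"message_id": "<3@c.example>",
     "references": ["<1@a.example>", "<2@b.example>"],
     "subject": "Something else entirely"},
    {"message_id": "<4@d.example>", "references": ["<1@a.example>"],
     "subject": ""},
]


def build_children(articles):
    """Map each parent Message ID (None for thread roots) to its replies."""
    children = defaultdict(list)
    for art in articles:
        refs = art["references"]
        parent = refs[-1] if refs else None
        children[parent].append(art["message_id"])
    return children


def render(children, root=None, depth=0):
    """Return the thread as indented lines, one Message ID per line."""
    lines = []
    for message_id in children.get(root, []):
        lines.append("  " * depth + message_id)
        lines.extend(render(children, message_id, depth + 1))
    return lines


tree = render(build_children(articles))
print("\n".join(tree))
```

Note that the third article has a completely different subject and the fourth has none at all, yet both land in the right place in the tree, because only the IDs are consulted.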
You can see that this part of the thread starts in the upper left with
the first post, and within a few replies, begins to fork and fork again,
as people reply to replies, and reply to replies of replies. Somewhere
down the line, someone changes the subject (each subject is noted by a
number like [1], [2], [3], etc.), but the posts all still pertain to the
original post, as determined by the Message-ID: and
References: lines.
Thus, if you want to keep from drowning in useless information, it
becomes necessary to proactively "prune" that tree, before it grows out
of control. In the above example, if you weren't interested in the
sub-thread that starts out section [3] (Defamation Suit), then adjusting
your newsreader to prune that message, and all things matching the
subject of "Defamation Suit" would save you from seeing a further 10
messages. However, just filtering on the subject wouldn't stop you from
seeing the two messages marked [5] (Lawyer-Like Dishonesty) which
stemmed off from the [3] branch of the thread-tree. If you were really,
truly not interested in anything to do with the [3] tree, then those two
[5] responses would be a waste of your time. By setting your newsreader
to kill the entire sub-thread, starting with the original [3] posting,
and any message posted in response to it, regardless of subject
change, you'd get rid of all of the [3] posts, the two [5] posts,
and any further responses in the future. In a very real sense,
you've removed the non-interesting stuff from bothering you, so you can
get on with things that interest you.
It's important to note that the original [1] tree, and the [4] tree are
still preserved by a sub-thread kill. You can follow those parts, and
kill off other sub-threads as you go. Or, you could say "this entire
thread is crap, junk it" and the newsreader can trace forward and
backward in the tree, taking note of each message ID, so that you don't
see them or any response to any message in the entire thread
ever again.
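The difference between filtering on a subject and killing a sub-thread can be sketched in a few lines of Python. The sample data below is a tiny invented stand-in for the thread discussed here (the IDs `m1` to `m6` and the helper names are hypothetical), and this is only one way a newsreader might go about it.

```python
from collections import defaultdict

# (message_id, parent_id, subject): invented sample data, not the real thread.
ARTICLES = [
    ("m1", None, "Note to Jol fans"),
    ("m2", "m1", "Note to Jol fans"),
    ("m3", "m1", "Defamation Suit"),
    ("m4", "m3", "Defamation Suit"),
    ("m5", "m4", "Lawyer-Like Dishonesty"),
    ("m6", "m1", "Jol's accusations"),
]


def children_of(articles):
    """Map each parent ID to the IDs of its direct replies."""
    children = defaultdict(list)
    for message_id, parent_id, _subject in articles:
        children[parent_id].append(message_id)
    return children


def kill_by_subject(articles, subject):
    """Kill only the articles whose subject matches exactly."""
    return {mid for mid, _parent, subj in articles if subj == subject}


def kill_subthread(articles, root_id):
    """Kill root_id and every reply below it, whatever the subject says."""
    children = children_of(articles)
    killed, stack = set(), [root_id]
    while stack:
        message_id = stack.pop()
        if message_id not in killed:
            killed.add(message_id)
            stack.extend(children.get(message_id, []))
    return killed


by_subject = kill_by_subject(ARTICLES, "Defamation Suit")
by_subthread = kill_subthread(ARTICLES, "m3")
print(sorted(by_subject))    # ['m3', 'm4']
print(sorted(by_subthread))  # ['m3', 'm4', 'm5']
```

The subject filter misses `m5`, which changed its subject line, while the sub-thread kill catches it. The sibling branches (`m2` and `m6`) survive either way.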
Think about the long-term effect of this. If one message can lead to
more than 100 messages, as we've seen above, then what could those 100
messages lead to? By cutting the threads as soon as you decide you're
not interested, you signal the computer to go to work making sure not to
bother you with boring things. (And of course, you can undo your
selections and filter rules, if you want to start reading a thread again
later.) Thus, if you approach each message you read with the question of
"Is this interesting to me? Do I want to read any more about this?" and
set your rules right then, you can prune the news as you go: killing
this thread, selecting that one, following responses to one specific
article, but removing responses to another, etc.
Suppose you've got a fairly decent ruleset, and you've got your
newsreader weeding out lots of stuff for you. There's another technique
called "scoring" that gives you even finer-grained control over what
hits your eyeballs.
This capability is not in every newsreader, but it exists in a few of
the more advanced ones, such as trn, strn, and gnus.
Suppose you normally like the posts of person A, but occasionally they
post something really bone-headed. So you don't really want to kill all
their posts, but sometimes you wonder why you're bothering to read their
stuff. With scoring, you are allowed to "rank" articles, and assign a
point value to them. A "normal" Usenet article will start off with a
value of 0, and can be adjusted either upwards or downwards. Bumping it
up a point makes it "more interesting" than an average post, and
likewise, bumping it down makes it "less interesting". The newsreader
keeps track of your assigned scores, and will calculate article scores
when you enter a group. Often, people will score over a wide range, such
as +/- 1000 points. That way, any one specific article won't have a huge
effect on the overall score, unless you specify so.
Thus, if you normally enjoy reading person A's posts, and you've given
them a score of +50, their posts will show up in the newsreader as "more
interesting" than Joe Average's posts. However, every time you see a
boneheaded post, you can say "subtract 5 points", and after 11
boneheaded posts, person A's posts are LESS interesting than Joe
Average's. (Of course, if they say something that really ticks you off,
you can give them a score of -500 and write them off.)
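The arithmetic behind that "11 boneheaded posts" figure is easy to check. Here is a tiny Python sketch; the function name `posts_until_below` is made up for this example.

```python
def posts_until_below(start, penalty, baseline=0):
    """Count the penalties needed before a score drops below the baseline."""
    score, count = start, 0
    while score >= baseline:
        score -= penalty
        count += 1
    return count, score


count, final_score = posts_until_below(start=50, penalty=5)
print(count, final_score)  # 11 -5
```

After ten penalties person A is merely tied with Joe Average at 0. It takes the eleventh to push them down to -5.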
You can also set visibility thresholds, so that things below a certain
score just vanish from sight, and things above a certain score get
auto-selected for you to read. You can even set sub-thresholds that say
things like "If the score gets to -400, warn me that it's about to
vanish, and if it gets to -500, just remove it. Or if it's above +200,
auto-select it for me."
This allows you to get complex with your scoring rules, just as you did
with your thread-filtering rules. You can say "Bump up the score of
person A by 50 points, unless the subject is about 'Blah', then drop the
score by 100 points. If person A is responding to person B (whom I
like), then it gets 50 points for A, 50 for B, and another 25 because
it's a subject I'm interested in. (It's at 125.) If this is in response
to a thread that I started, give it an extra 200 points."
Suddenly it's at 325, and it's auto-selected for you to read and marked
as "VERY interesting."
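Here is one way such a compound rule and its thresholds could be written down, as a hedged Python sketch. The article fields (`from`, `reply_to_author`, `subject`, `references`), the author names "A" and "B", the phrase "interesting topic", and both function names are assumptions made for illustration. Real newsreaders express these rules in their own score-file syntax.

```python
KILL_AT = -500       # at or below this score, the article is removed
WARN_AT = -400       # at or below this score, warn that it is about to vanish
SELECT_ABOVE = 200   # above this score, the article is auto-selected


def score_article(article, my_message_ids):
    """Add up the points for one article, following the rule in the text."""
    score = 0
    subject = article["subject"].lower()
    if article["from"] == "A":
        # +50 for person A, unless the subject is about 'Blah'
        score += -100 if "blah" in subject else 50
    if article.get("reply_to_author") == "B":
        score += 50   # a response to person B, whom I like
    if "interesting topic" in subject:
        score += 25   # a subject I'm interested in
    if my_message_ids.intersection(article["references"]):
        score += 200  # a response within a thread that I started
    return score


def action(score):
    """Turn a score into what the newsreader does with the article."""
    if score <= KILL_AT:
        return "remove"
    if score <= WARN_AT:
        return "warn"
    if score > SELECT_ABOVE:
        return "auto-select"
    return "show"


article = {
    "from": "A",
    "reply_to_author": "B",
    "subject": "Re: An interesting topic",
    "references": ["<mine@example.com>", "<reply@example.com>"],
}
total = score_article(article, {"<mine@example.com>"})
print(total, action(total))  # 325 auto-select
```

Without the 200-point bonus for being in your own thread, the same article would score 125 and simply be shown, not auto-selected.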
Some newsreaders will let you rank articles by score, so you can say
"show me the most interesting article" and it will pull it up for
you. When you get to the point where the most interesting thing the
newsreader has to offer you isn't really all that interesting, you can
say "catch up the rest, and move on to the next group".
If all this scoring and ranking seems like a hassle, there is another
solution: Auto-scoring. I've only seen this amazing little feature in
the gnus newsreader, but hope to see it in others as time goes
on. Essentially, while reading news, you are leaving little marks as you
go through the articles. Some are getting read, some are getting killed,
some are just plain ignored. The Auto-scoring facility takes a look at
what you did while in a group, and when you leave, it updates some
internal score files automagically. You can (of course) adjust the point
values for everything, and choose what does or doesn't get watched for
scoring, but the defaults are fairly good to start with. (The gnus docs
basically say "Just turn this on, then go read news for a week. You'll
start to notice more interesting news popping up, and the boring stuff
will just start to go away.")
Here's a rough idea of how it works: (These numbers are rough, but
adjustable.)
All articles start at 0.
The net effect of this is that, if you went into a group, and killed a
thread with a 30-post argument between two people, then the Subject: of
that argument gets a net score of -300 (30 * -10 for killing), and each
author gets a -45 (15 each * -3 for killing). That subject won't come up
again if your threshold cuts off articles at -300, and both of those
people are pretty low on your list. Also, other things in the group all
get bumped up or down appropriately, based on whether you read them or
not. So over a very short period of time, a lot of the noise of Usenet
just goes away.
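The bookkeeping described here can be sketched in Python. The point values are the rough numbers given in this article, not the real defaults of gnus or any other newsreader. The `ADJUSTMENTS` table, the `apply_actions` function, and the authors "alice" and "bob" are invented for the example.

```python
from collections import Counter

# action -> (Subject: adjustment, author adjustment), per the rough numbers
ADJUSTMENTS = {
    "read": (1, 3),
    "ignore": (-1, -1),
    "delete": (-10, -3),
    "kill": (-10, -3),     # applied to each article in a killed thread
    "catchup": (-3, -1),
}


def apply_actions(events):
    """Tally score changes from (action, subject, author) events."""
    subject_scores, author_scores = Counter(), Counter()
    for act, subject, author in events:
        subject_delta, author_delta = ADJUSTMENTS[act]
        subject_scores[subject] += subject_delta
        author_scores[author] += author_delta
    return subject_scores, author_scores


# A 30-post argument between two people, 15 posts each, killed as a thread.
events = [
    ("kill", "Some argument", "alice" if i % 2 == 0 else "bob")
    for i in range(30)
]
subject_scores, author_scores = apply_actions(events)
print(subject_scores["Some argument"])               # -300
print(author_scores["alice"], author_scores["bob"])  # -45 -45
```

This reproduces the figures above: 30 articles at -10 apiece put the subject at -300, and 15 articles at -3 apiece put each author at -45.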
This is how the Usenet pros manage to follow along on relevant
discussions, skip over all the flame-bait (and subsequent flames), carry
on long-term conversations with friends & cohorts, answer questions by
newbies (that get through the filters as something unique and new, not
just a FAQ or response to something non-interesting), and still manage
to hold down a regular job. :-)
It's this capability of the newsreader software to manage the huge
volumes of data, and get the interesting stuff to you, that makes Usenet
into the treasure-trove that it is. Sure, there's spam. Sure, there's
lots of off-topic cruft. But there's also a wealth of info out
there. There are millions of minds, experts in every field and interest
area, all carrying on conversations. Quite often, they're willing to
answer honest questions, share some of their knowledge, impart some of
their wisdom, and help you find references so you can learn new
things.
That's why Usenet is the largest distributed database project in
the history of Mankind.
Dive in...and have fun. :-)
<Rrussia-minersVRlAO_8JJ.P@lnk.clari.net>
These are displayed in the headers on the Message-ID: line, and
they are always unique for each and every Usenet message
posted. Ever. This is a very important point to note, and one which we'll
return to shortly.
What about all that spam?
Yes, there's spam on Usenet. In fact, Usenet is where the concept of spam
originated, as well as the term. In 1994, a couple of misguided
lawyers hit upon the brilliant idea of sending their message out to the
(then) 6,000 Usenet newsgroups in existence, to tell everyone about their
hot new service, the "Green Card Lottery". With that, a new phenomenon was
born, which everyone is still battling with to this day.
An introduction to the concept of "threads"
The conversations in Usenet are many, and varied. They weave a very
interesting tapestry, and the individual pieces that make up this tapestry
are called "threads". A thread is a topic of discussion. Someone posts a
message, someone posts a reply, then another, and another... Over time,
the thread will grow, sometimes split into other threads, and eventually,
it dies out.
"Choose any message in the talk.politics group that mentions the
phrase 'tax break', but filter out any posts from Persons X, Y, or Z,
(and anything from person A responding to X, Y, or Z), but select
anything else from person A, because I like what they have to say. Make
sure the post is less than 200 lines, otherwise, get rid of it, because
it's too much blather. Make sure that it's not cross-posted to more
than 3 other groups (checking the Newsgroups: line), otherwise,
chances are that it's spam. Also, make sure it doesn't have certain key
phrases in it like 'make money fast', '$50,000', or 'fast cash'...I know
those are spam."
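A rule like the one quoted above might be sketched in Python roughly as follows. Everything here is an assumption made for illustration: the article is a plain dictionary, the addresses are placeholders, `reply_to_author` is a hypothetical field holding the author of the post being answered, and the order in which the checks run is just one reasonable reading of the rule.

```python
BLOCKED = {"x@example.com", "y@example.com", "z@example.com"}  # Persons X, Y, Z
LIKED = "a@example.com"                                        # Person A
SPAM_PHRASES = ("make money fast", "$50,000", "fast cash")


def classify(article):
    """Return 'kill', 'select', or 'pass' for one article."""
    body = article["body"].lower()
    subject = article.get("subject", "").lower()
    author = article["from"].lower()
    groups = [g.strip() for g in article["newsgroups"].split(",")]

    if any(p in body or p in subject for p in SPAM_PHRASES):
        return "kill"  # known spam phrases
    if len(groups) - 1 > 3:
        return "kill"  # cross-posted to more than 3 other groups
    if article["lines"] >= 200:
        return "kill"  # too much blather
    if author in BLOCKED:
        return "kill"
    if author == LIKED:
        if article.get("reply_to_author", "").lower() in BLOCKED:
            return "kill"  # person A arguing with X, Y, or Z
        return "select"    # anything else from person A
    if "talk.politics" in groups and ("tax break" in body or "tax break" in subject):
        return "select"
    return "pass"


example = {
    "from": "someone@example.com",
    "subject": "New tax break proposal",
    "newsgroups": "talk.politics",
    "lines": 40,
    "body": "Thoughts on the proposed tax break?",
}
print(classify(example))  # select
```

Running the kill checks first means that a post full of spam phrases gets dropped even if it happens to mention "tax break".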
As you can see, there's a surprising amount of flexibility in the filter
rules, if the newsreader software is sufficiently advanced. The net result
is that once you've constructed some rules, the computer will cut out
much of the stuff you're not interested in before you even see it.
What if someone changes the subject?
Scanning on a specific Subject: line is all fine and good, until
someone changes the subject. This is all too common on Usenet, and can
quickly make your hand-tailored rulesets useless, if they're only
looking at the Subject: line. An age-old example would be the
subject line progression of something like this (which you can find at
almost any time in the alt.religion.* and talk.religion.*
groups):
Subject: Jesus Loves You!
Subject: Jesus Loves You! (And wants you to change your sinning ways!)
Subject: Satan Loves You!
Subject: Bob Dobbs Loves You! (And wants you to send $1...)
Subject: Buddha is indifferent!
Subject: My girlfriends and I would love for you to call 1-900-.....
As you can see, while all of these may be responses to one original post,
the topic-drift is pretty rapid, and would be difficult to filter. (Except
perhaps for the word "Subject:" ;^) ) This is where "threads" come to the
rescue, and where that unique Message ID that we mentioned earlier comes
into play.
A real-life example of a discussion thread
This is (most of) a thread tree taken from the talk.politics newsgroup
at the time of this writing. It gives a good example of how a topic can
grow, change course, change subject, die out, and be reborn. It was
constructed by the trn newsreader, written by Wayne Davison.
Subjects:
[1] Note to Jol fans (if there are any):
[2] Why the Lawyer-Like-Dishonesty Posts?
[3] Defamation Suit
[4] Jol's accusations
[5] Lawyer-Like Dishonesty
(1)--[1]+-[1]
\-[1]+-[1]--[1]--[1]+-[1]--[1]--[1]--[1]--[1]--[1]--[1]
| |-[1]--[1]+-[1]--[1]+-[1]
| | | |-[1]--[1]+-[1]
| | | | \-[1]+-[1]--( )--[1]--[1]--[1]
| | | | \-[1]
| | | \-[1]--[1]+-[1]
| | | |-[1]
| | | \-[1]--( )--[1]
| | \-[1]--[1]--[1]--[1]--[1]--[1]--[1]--[1]--[1]--( )+-[1]
| | \-[1]--( )+
| |-[2]+-[2]
| | |-[2]--[2]
| | \-[2]
| \-[1]--[1]--[1]--[1]--( )--[1]--( )--[1]
|-[1]+-[1]--[1]--[1]+-[1]--[1]+-[1]--[1]
| | | \-[1]--[1]+-[1]--[1]
| | | \-[1]+-[1]+-[1]--[1]+-[1]--[1]--[1]--[1]--[1]
| | | | | \-[1]
| | | | \-[1]--[1]--[1]--[1]--[1]--[1]--[1]-
| | | |-[1]
| | | \-[1]
| | |-[1]--[1]
| | \-[1]--[1]
| \-[1]+-[1]+-[1]--[1]--[1]
| | \-[1]--[1]+-( )--[1]--[1]+-[1]--[1]+-[1]--[1]
| | | | \-[1]--[1]+-[1]
| | | | \-[1]
| | | \-[1]
| | \-[1]
| \-[1]
|-[1]--[1]--[1]+-[1]
| \-[1]+-[1]+-[1]--[1]
| | \-[1]
| \-[1]
\-[1]
[3]+-[3]
|-[3]
|-[3]
|-[3]+-[3]
| |-[3]
| \-[3]
|-[3]--[3]
|-[3]
\-( )--[5]--[5]
-[1]
-[4]+-[4]--[4]--[4]--[4]--[4]--[4]--[4]--[4]+-[4]--( )--[4]--( )--[4]
| \-[5]
|-[4]
\-[4]
The lower section, with the [3], [4], and [5] entries, consists of forks
from an earlier part of the discussion. (Apparently, this thread has been
going on for quite a while.) As you can see, the paths that the threads
weave can get quite complex, and they have the potential to grow for a lot
longer than you're likely to be interested in them.
Scoring - an alternative to simple kill/select modes
But wait, there's more. :-)
If you read an article, the Subject: gets bumped up by +1, and the
author gets bumped up by +3.
If you ignore an article, Subject: gets -1, author gets -1.
If you delete the article, Subject: gets -10, author gets -3.
If you kill the thread, each following article's Subject: gets
-10, author gets -3.
When you leave the group and catchup articles, they all get something
like Subject: -3, author -1.
Diligence pays off
All this filtering, killing and scoring takes a bit of diligence, but it
pays off in the long run. As you can hopefully see, by taking the time
to consider while you're reading whether you want to see topics
in the future, you can tailor the info-tide to bring you the things you
want, and remove the things you don't want. And since "Joe Average"
posts all start off as "presumed innocent" (with a score of 0), it's
only their content, headers, author, keywords, etc. that determine
whether you see them or not. If an article is actually "news to
you", and doesn't match any of your filters, then it will come through,
and you'll see it. So you get the news, but not the "olds" (unless
you've flagged it for reading).
Bio: Patrick Salsbury
lives in a dome in the mountains near Silicon Valley. He has held
various "Newsmaster" and "Postmaster" positions at various companies,
and is sorely acquainted with the problems of info-overload. His
resume can be found here.