Subject: Re: .loc.au stats for October
Newsgroups: lugnet.loc.au
Date: Tue, 5 Nov 2002 12:19:09 GMT
Uhhh... I just hope this last post was intended as a satire on "post
quality"!
:-)

Cheers,
Paul

"Kerry Raymond" <kerry@dstc.edu.au> wrote in message
news:H53tIA.6EF@lugnet.com...
Of course, to do this we'd need to work out how to measure the quality per
word or post quality values...

The quality of a post can be coarsely approximated by applying quality metrics
to each individual word, giving an overall posting content quality. So, say, a
message involving quality words like "sensate", "post-expressionalism" and
"45.7%" would generate a higher quality score than a message involving words
like "lots", "stuff", and "wow". Indeed, one can even apply severe negative
volume metrics to such words as "wasssssssssuuupppp" and "insectoids", thus
reducing any postings involving such words to hideously low levels,
unredeemable even in such contexts as "45.7% of all insectoid sets incorporate
post-expressionalism detectable to the more sensate builder". Any message
involving lots of numbers is almost certainly good, as these are either set
numbers, part numbers, or Richie's monthly statistics, so numbers will generate
high quality word ratings.
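
A minimal sketch of the per-word scoring described above, in Python. The
numeric weights, the default score for unlisted words, and the choice to sum
(rather than average) the word scores are all assumptions; the post names the
example words but gives no figures.

```python
import re

# Hypothetical weights: the post names these words but assigns no numbers.
WORD_SCORES = {
    "sensate": 3.0,
    "post-expressionalism": 3.0,
    "lots": 0.5,
    "stuff": 0.5,
    "wow": 0.5,
    "insectoid": -50.0,
    "insectoids": -50.0,
}
NUMBER_SCORE = 4.0    # assumed: numbers earn a high rating
DEFAULT_SCORE = 1.0   # assumed: score for any word not listed above
WASSUP_SCORE = -50.0  # assumed: severe negative weight

NUMBER_RE = re.compile(r"\d+(\.\d+)?%?")  # e.g. "8880", "45.7%"
WASSUP_RE = re.compile(r"wa+s+u+p+")      # e.g. "wasssssssssuuupppp"


def word_quality(word: str) -> float:
    """Score one word after lowercasing and trimming edge punctuation."""
    token = word.lower().strip('.,;:!?"()')
    if NUMBER_RE.fullmatch(token):
        return NUMBER_SCORE
    if WASSUP_RE.fullmatch(token):
        return WASSUP_SCORE
    return WORD_SCORES.get(token, DEFAULT_SCORE)


def content_quality(text: str) -> float:
    """Posting content quality, taken here as the sum of word scores."""
    return sum(word_quality(w) for w in text.split())
```

With these assumed weights, the "insectoid" sentence quoted above still comes
out negative, since one severe penalty outweighs every other word in it.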

However, the posting content quality is not the final score. You must then
multiply it by bytes of new content divided by the bytes of included quoted
content. This ensures that "Me too" messages are forced down the quality metric
no matter how frequently post-expressionalism is mentioned.
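
The new-to-quoted ratio might be sketched as follows. Treating lines that
begin with ">" as quoted content, and flooring the quoted byte count at 1 so
that a post with no quotes does not divide by zero, are assumptions; the post
does not cover either case.

```python
def novelty_ratio(body: str) -> float:
    """Bytes of new content divided by bytes of quoted content."""
    new_bytes = 0
    quoted_bytes = 0
    for line in body.splitlines():
        size = len(line.encode("utf-8"))
        if line.lstrip().startswith(">"):  # assumed quoting convention
            quoted_bytes += size
        else:
            new_bytes += size
    # Assumption: floor the divisor at 1 byte for posts that quote nothing.
    return new_bytes / max(quoted_bytes, 1)


def weighted_quality(content_quality: float, body: str) -> float:
    """Scale a posting content quality score by the novelty ratio."""
    return content_quality * novelty_ratio(body)
```

For example, a post that quotes 100 bytes and adds only "Me too" gets a ratio
of 6/100, so a content quality of 10.0 drops to 0.6.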

Then you take any quoted content and recursively apply the quality analysis to
it, and then compare the quality of the new content with the quoted content.
The quality of the quoted content is then a ceiling on the possible quality of
the new content. This is intended to discourage the proliferation of threads of
low quality by downgrading all subsequent contributions no matter how sensate.

This produces the primary quality score for the posting.
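
One way the recursive ceiling could look, as a sketch only. The helper names
are hypothetical, the word scorer is passed in as a parameter, and stripping
one ">" per level of recursion is an assumed way of reaching the quoted
content's own quotes.

```python
from typing import Callable, Tuple


def split_post(body: str) -> Tuple[str, str]:
    """Separate new text from quoted text, removing one '>' quote level."""
    new_lines, quoted_lines = [], []
    for line in body.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(">"):
            quoted_lines.append(stripped[1:].lstrip(" "))
        else:
            new_lines.append(line)
    return "\n".join(new_lines), "\n".join(quoted_lines)


def primary_quality(body: str, content_quality: Callable[[str], float]) -> float:
    """Content quality times the byte ratio, capped by the quoted score."""
    new_text, quoted_text = split_post(body)
    new_bytes = len(new_text.encode("utf-8"))
    quoted_bytes = len(quoted_text.encode("utf-8"))
    # Assumption: floor the divisor at 1 byte when nothing is quoted.
    score = content_quality(new_text) * new_bytes / max(quoted_bytes, 1)
    if quoted_text.strip():
        # Each recursion strips one quote level, so this terminates.
        ceiling = primary_quality(quoted_text, content_quality)
        score = min(score, ceiling)
    return score
```

With a simple word-count scorer, a long reply to a quoted "wow" is capped at
whatever "wow" scores on its own, however sensate the reply.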

Finally, some people are fundamentally low-quality authors of postings, and
this must be incorporated to produce the modified quality score. This cannot be
done initially but can be introduced once a certain volume of primary quality
score data has been collected. By determining average primary quality for a
given author, one can determine an author quality as a moving average. You then
take the ratio of specific author quality divided by average author quality to
get the author-multiplier, which is then (as the name suggests) multiplied by
the primary quality score to produce the modified quality score.
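
A sketch of the author-multiplier, under several assumptions: the moving
average is a simple one over each author's last `window` primary scores,
"average author quality" is the mean of the per-author averages, and an
unknown author or a zero overall average falls back to a multiplier of 1.0.
The class and method names are hypothetical.

```python
from collections import defaultdict, deque
from statistics import mean


class AuthorQuality:
    """Tracks primary scores per author and derives an author-multiplier."""

    def __init__(self, window: int = 50):  # window size is an assumption
        self._scores = defaultdict(lambda: deque(maxlen=window))

    def record(self, author: str, primary_score: float) -> None:
        """Store a *primary* score; modified scores are never fed back in."""
        self._scores[author].append(primary_score)

    def multiplier(self, author: str) -> float:
        history = self._scores.get(author)
        if not history:
            return 1.0  # assumed: no history means no adjustment
        overall = mean(mean(scores) for scores in self._scores.values())
        if overall == 0:
            return 1.0  # assumed: avoid dividing by a zero average
        return mean(history) / overall

    def modified_quality(self, author: str, primary_score: float) -> float:
        return primary_score * self.multiplier(author)
```

For example, if one author averages 15 and another averages 5, the overall
average is 10 and the multipliers are 1.5 and 0.5 respectively.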

Note. It is very important to determine the specific author quality and average
author quality scores from the *primary* quality scores, and not from the
modified quality scores. As can be appreciated, the use of modified quality
scores to determine specific author quality and average author quality will
introduce unbounded escalating feedback.

Having determined the modified quality score for each posting, it is then
entirely mechanical to determine the total quality of posts by that author in
any given time period (e.g. a month) as well as to derive the average quality
of posts by that author over the same time period.
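
The mechanical step might look like this sketch, which groups modified scores
by author and calendar month. The input shape, a sequence of
(author, date, score) tuples, is an assumption.

```python
from collections import defaultdict
from datetime import date


def monthly_summary(posts):
    """Map (author, year, month) to (total quality, average quality).

    `posts` is an iterable of (author, posted_on, modified_score) tuples,
    where posted_on is a datetime.date.
    """
    buckets = defaultdict(list)
    for author, posted_on, score in posts:
        buckets[(author, posted_on.year, posted_on.month)].append(score)
    return {
        key: (sum(scores), sum(scores) / len(scores))
        for key, scores in buckets.items()
    }


example = monthly_summary([
    ("kerry", date(2002, 10, 3), 4.0),
    ("kerry", date(2002, 10, 21), 6.0),
    ("paul", date(2002, 11, 5), 2.0),
])
```

Here `example[("kerry", 2002, 10)]` holds the total and the average for that
author's October posts.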

However, over a sufficiently long period, the average quality of posts over a
given period is likely to approximate the author-multiplier (as a relative but
not absolute scale). This makes it very difficult for anyone to significantly
lift their game quality-wise. So, the use of moving average for computing the
author-multiplier will need (over time, but probably not immediately) to
incorporate an aging of older post quality data, probably based on some kind of
inverse Poisson differential decay. I'm not sure what chi-value to use to
stretch the probability curve, but I would think we'd be aiming for a half-life
of around 3 months, so probably something in the range of 1.5 to 2 should be OK
(using base "e" not 10, of course).
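
The post does not define its decay function, so the sketch below swaps in a
plain exponentially weighted average with a half-life, which is a different
and simpler technique than the one named. Taking "3 months" as 90 days is also
an assumption.

```python
def decayed_author_quality(scored_posts, half_life_days: float = 90.0) -> float:
    """Weighted mean of primary scores; a post's weight halves every half-life.

    `scored_posts` is an iterable of (age_in_days, primary_score) pairs.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for age_days, score in scored_posts:
        weight = 0.5 ** (age_days / half_life_days)
        weighted_sum += weight * score
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("no posts to average")
    return weighted_sum / total_weight
```

For example, a score of 10.0 posted today and a score of 4.0 posted 90 days
ago get weights 1 and 0.5, giving (10 + 2) / 1.5 = 8.0. In base e, a 3-month
half-life corresponds to a decay rate of ln 2 / 3, roughly 0.23 per month.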

So, while some people will argue that quantitative evaluation of quality is
fundamentally flawed no matter what interpretation of Chomsky conceptual
analysis you use, I say (quoting from The Matrix here) "never send a person to
do a machine's job". My method can be entirely automated (essential for ongoing
maintenance) and we could have aggregated ratings on each newsgroup, allowing
the LUGNET traffic page to be based on quality rather than quantity metrics.
Then when loc.au wants to take on the Italians (say), we have to do it based on
superior quality of postings and not just hitting send a lot.

Indeed, increasing the quality ratings on a newsgroup can be achieved either by
sending many high-quality posts, or by discouraging the sending of low-quality
postings. Thus some newsgroups could achieve a higher quality aggregate through
having no postings than most of the market newsgroups (which are likely to
generate large negative aggregates).
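
A small sketch of the aggregate, assuming it is a plain sum of modified
quality scores (the post does not say how the aggregate is formed). The sample
scores are made up for illustration.

```python
def newsgroup_aggregate(modified_scores) -> float:
    """Aggregate quality of a newsgroup as the sum of its post scores."""
    return sum(modified_scores)


silent_group = newsgroup_aggregate([])             # no postings at all
market_group = newsgroup_aggregate([-12.5, -3.0])  # hypothetical scores
```

Under a sum, the silent group scores zero and so outranks any group whose
aggregate is negative.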

Waddya reckon?

Kerry






©2005 LUGNET. All rights reserved. - hosted by steinbruch.info GbR