Subject: Re: .loc.au stats for October
Newsgroups: lugnet.loc.au
Date: Tue, 5 Nov 2002 13:01:17 GMT
Viewed: 787 times

> Of course, to do this we'd need to work out how to measure the quality per
> word or post quality values...
The quality of a post can be coarsely approximated by applying quality metrics
to each individual word, giving an overall posting content quality. So, say, a
message involving quality words like "sensate", "post-expressionalism" and
"45.7%" would generate a higher quality score than a message involving words
like "lots", "stuff", and "wow". Indeed, one can even apply severe negative
volume metrics to such words as "wasssssssssuuupppp" and "insectoids", thus
reducing any postings involving such words to hideously low levels,
unredeemable even in such contexts as "45.7% of all insectoid sets incorporate
post-expressionalism detectable to the more sensate builder". Any message
involving lots of numbers is almost certainly good, as these are either set
numbers, part numbers, or Richie's monthly statistics, so numbers will generate
high quality word ratings.
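
For anyone who wants to see the word-level step spelled out, here is a rough Python sketch. The score table, the number bonus, the regular expressions, and the choice to sum (rather than average) the word scores are all assumptions made for illustration; the post names example words but gives no actual values.

```python
import re

# Hypothetical per-word scores; only the example words come from the post.
WORD_SCORES = {
    "sensate": 3.0,
    "post-expressionalism": 3.0,
    "lots": -1.0,
    "stuff": -1.0,
    "wow": -1.0,
    "insectoid": -50.0,
    "insectoids": -50.0,
}
NUMBER_SCORE = 2.0    # assumed bonus for anything that looks like a number
SEVERE_PENALTY = -50.0  # assumed "severe negative" value
DEFAULT_SCORE = 0.0   # assumed score for words not in the table

TOKEN = re.compile(r"[A-Za-z0-9][A-Za-z0-9.%-]*")
NUMBER = re.compile(r"\d+(\.\d+)?%?")
WASSUP = re.compile(r"wa+s+u+p+")  # matches stretched spellings of "wassup"


def word_quality(word):
    """Score a single word using the hypothetical table above."""
    w = word.lower().rstrip(".")
    if NUMBER.fullmatch(w):
        return NUMBER_SCORE
    if WASSUP.fullmatch(w):
        return SEVERE_PENALTY
    return WORD_SCORES.get(w, DEFAULT_SCORE)


def content_quality(text):
    """Posting content quality as the sum of the per-word scores."""
    return sum(word_quality(w) for w in TOKEN.findall(text))
```

With these made-up values, the "45.7% of all insectoid sets..." example still comes out well below zero, as intended.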
However, the posting content quality is not the final score. You must then
multiply it by bytes of new content divided by the bytes of included quoted
content. This ensures that "Me too" messages are forced down the quality metric
no matter how frequently post-expressionalism is mentioned.
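
A minimal sketch of the new/quoted scaling, assuming quoted lines are the ones starting with ">" and that a post with no quoted content is treated as having one quoted byte to avoid dividing by zero (the post does not say what to do in that case). The function names are hypothetical.

```python
def split_post(body):
    """Separate a post body into (new_text, quoted_text) by the '>' prefix."""
    new_lines, quoted_lines = [], []
    for line in body.splitlines():
        if line.lstrip().startswith(">"):
            quoted_lines.append(line)
        else:
            new_lines.append(line)
    return "\n".join(new_lines), "\n".join(quoted_lines)


def new_to_quoted_ratio(body):
    """Bytes of new content divided by bytes of quoted content."""
    new_text, quoted_text = split_post(body)
    new_bytes = len(new_text.encode("utf-8"))
    quoted_bytes = len(quoted_text.encode("utf-8"))
    # Assumption: clamp to 1 so unquoted posts do not divide by zero.
    return new_bytes / max(quoted_bytes, 1)


def scaled_quality(content_score, body):
    """Multiply a posting content quality score by the new/quoted ratio."""
    return content_score * new_to_quoted_ratio(body)
```

A long quote followed by "Me too" gets a ratio far below 1, so whatever content score it had shrinks accordingly.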
Then you take any quoted content and recursively apply the quality analysis to
it, and then compare the quality of the new content with the quoted content.
The quality of the quoted content is then a ceiling on the possible quality of
the new content. This is intended to discourage the proliferation of threads of
low quality by downgrading all subsequent contributions no matter how sensate.
This produces the primary quality score for the posting.
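
The recursive ceiling could be sketched like this. The `Post` class is a hypothetical stand-in for a parsed message: it holds the already-scaled score of the new text plus an optional link to the post it quotes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    new_content_score: float         # score of the new text, already ratio-scaled
    quoted: Optional["Post"] = None  # the post being quoted, if any


def primary_quality(post):
    """Primary quality: the new-content score, capped by the quoted post's quality."""
    if post.quoted is None:
        return post.new_content_score
    ceiling = primary_quality(post.quoted)  # recurse down the quote chain
    return min(post.new_content_score, ceiling)
```

So one poor post early in a thread caps every reply that quotes it, however sensate the reply.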
Finally, some people are fundamentally low-quality authors of postings, and this
must be incorporated to produce the modified quality score. This cannot be done
initially but can be introduced once a certain volume of primary quality score
data has been collected. By determining average primary quality for a given
author, one can determine an author quality as a moving average. You then take
the ratio of specific author quality to average author quality to get the
author-multiplier, which is then (as the name suggests) multiplied by the
primary quality score to produce the modified quality score.
Note. It is very important to determine the specific author quality and average
author quality scores from the *primary* quality scores, and not from the
modified quality scores. As can be appreciated, the use of modified quality
scores to determine specific author quality and average author quality will
introduce unbounded escalating feedback.
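
A sketch of the author-multiplier, computed from primary scores only, as the note insists. It uses a plain mean in place of a moving average and assumes the average author quality is non-zero; both simplifications, and the function names, are mine.

```python
from collections import defaultdict
from statistics import mean


def author_multipliers(primary_scores):
    """primary_scores: list of (author, primary_quality_score) pairs."""
    by_author = defaultdict(list)
    for author, score in primary_scores:
        by_author[author].append(score)
    # Specific author quality: mean of that author's primary scores.
    author_quality = {a: mean(s) for a, s in by_author.items()}
    # Average author quality: mean over authors (assumed non-zero here).
    average_author_quality = mean(author_quality.values())
    return {a: q / average_author_quality for a, q in author_quality.items()}


def modified_quality(primary_score, author, multipliers):
    """Modified quality score: primary score times the author-multiplier."""
    return primary_score * multipliers[author]
```

Because `author_multipliers` never sees a modified score, the multipliers cannot feed back into themselves.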
Having determined the modified quality score for each posting, it is then
entirely mechanical to determine the total quality of posts by that author in
any given time period (e.g. a month) as well as to derive the average quality
of posts by that author over the same time period.
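
The mechanical part might look like the following, assuming each scored post is available as an (author, month, modified score) tuple with the month written as "YYYY-MM". That input shape is hypothetical.

```python
from collections import defaultdict


def monthly_summary(scored_posts):
    """scored_posts: iterable of (author, 'YYYY-MM', modified_score) tuples.

    Returns {(author, month): {"total": ..., "average": ...}}.
    """
    buckets = defaultdict(list)
    for author, month, score in scored_posts:
        buckets[(author, month)].append(score)
    return {
        key: {"total": sum(scores), "average": sum(scores) / len(scores)}
        for key, scores in buckets.items()
    }
```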
However, over a sufficiently long period, the average quality of an author's
posts is likely to approximate the author-multiplier (as a relative but not
absolute scale). This makes it very difficult for anyone to
significantly lift their game quality-wise. So, the use of moving average for
computing the author-multiplier will need (over time, but probably not
immediately) to incorporate an aging of older post quality data, probably based
on some kind of inverse Poisson differential decay. I'm not sure what chi-value
to use to stretch the probability curve, but I would think we'd be aiming for a
half-life of around 3 months, so probably something in the range of 1.5 to 2
should be OK (using base "e" not 10, of course).
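
I can't offer code for an inverse Poisson differential decay, so the sketch below swaps in ordinary exponential decay, which is the simplest thing that gives a half-life. The 90-day constant is my reading of "around 3 months", and the function names are hypothetical.

```python
HALF_LIFE_DAYS = 90.0  # assumed stand-in for "around 3 months"


def decay_weight(age_days, half_life_days=HALF_LIFE_DAYS):
    """Weight of a post that is age_days old; halves every half_life_days.

    Equivalent to exp(-ln(2) * age_days / half_life_days), i.e. base e.
    """
    return 0.5 ** (age_days / half_life_days)


def aged_author_quality(scored_posts, half_life_days=HALF_LIFE_DAYS):
    """scored_posts: list of (age_days, primary_quality_score) pairs.

    Returns the decay-weighted average of the primary scores.
    """
    weights = [decay_weight(age, half_life_days) for age, _ in scored_posts]
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0  # no posts to average
    weighted = sum(w * score for w, (_, score) in zip(weights, scored_posts))
    return weighted / total_weight
```

A post from three months ago then counts half as much as one from today, which leaves some room for an author to lift their game.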
So, while some people will argue that quantitative evaluation of quality is
fundamentally flawed no matter what interpretation of Chomsky conceptual
analysis you use, I say (quoting from The Matrix here) "never send a human to
do a machine's job". My method can be entirely automated (essential for ongoing
maintenance) and we could have aggregated ratings on each newsgroup, allowing
the LUGnet traffic page to be based on quality rather than quantity metrics.
Then when loc.au wants to take on the Italians (say), we have to do it
based on superior quality of postings and not just hitting send a lot.
Indeed, increasing the quality ratings on a newsgroup can be achieved either by
sending many high-quality posts, or by discouraging the sending of low-quality
postings. Thus some newsgroups could achieve a higher quality aggregate through
having no postings than most of the market newsgroups (which are likely to
generate large negative aggregates).
Waddya reckon?
Kerry

Message has 7 Replies:

Re: .loc.au stats for October
Uhhh... I just hope this last post was intended as a satire on "post quality"! :-) Cheers, Paul "Kerry Raymond" <kerry@dstc.edu.au> wrote in message news:H53tIA.6EF@lugnet.com... (...) per (...) metrics (...) say, a (...) words (...) negative (...) (...) (22 years ago, 5-Nov-02, to lugnet.loc.au)

Re: .loc.au stats for October
(...) My concern here with taking on the Italians is that they post in Italian. Therefore any quality metric has to be multi lingual. That goes without saying so that's presumably why Kerry didn't say it, but I needed to boost my new/quoted material (...) (22 years ago, 5-Nov-02, to lugnet.loc.au)

Re: .loc.au stats for October
Ah the perils of cutting and pasting, I forgot this bit while reformatting In lugnet.loc.au, Kerry Raymond writes: (what any mechanical grading system would SURELY have to rate as a very high quality post, as it contained lots of big and obscure (...) (22 years ago, 5-Nov-02, to lugnet.loc.au)

RE: .loc.au stats for October
Okay, hands up all those who think Kerry's spent too long in post-modern academia. The use of the word 'sensate' is obviously a troll to all those who are familiar with academic discourse. David. (22 years ago, 5-Nov-02, to lugnet.loc.au)

Re: .loc.au stats for October
(...) [judicious snipping raises new/quoted text ratio.] ["judicious" earns points] [phrase "new/quoted text ratio" earns bonus points] [please note bonus points for bracketed editorial comment] [ooohhh... and the phrase "bracketed editorial (...) (22 years ago, 5-Nov-02, to lugnet.loc.au)

Message is in Reply To:

Re: .loc.au stats for October
I'm still waiting on the scale that is based on quality of post rather than quantity.... Perhaps some ideas for grading could be: -Average word quality multiplied by the total number of words that a LUGNetter posts -The Sum of all Quality posts -The (...) (22 years ago, 4-Nov-02, to lugnet.loc.au)