Subject: Re: .loc.au stats for October
Newsgroups: lugnet.loc.au
Date: Tue, 5 Nov 2002 12:19:09 GMT
Viewed: 743 times

Uhhh... I just hope this last post was intended as a satire on "post
quality"!
:-)
Cheers,
Paul
"Kerry Raymond" <kerry@dstc.edu.au> wrote in message
news:H53tIA.6EF@lugnet.com...
> > Of course, to do this we'd need to work out how to measure the quality per
> > word or post quality values...
>
> The quality of a post can be coarsely approximated by applying quality metrics
> to each individual word, giving an overall posting content quality. So, say, a
> message involving quality words like "sensate", "post-expressionalism" and
> "45.7%" would generate a higher quality score than a message involving words
> like "lots", "stuff", and "wow". Indeed, one can even apply severe negative
> volume metrics to such words as "wasssssssssuuupppp" and "insectoids", thus
> reducing any postings involving such words to hideously low levels,
> unredeemable even in such contexts as "45.7% of all insectoid sets incorporate
> post-expressionalism detectable to the more sensate builder". Any message
> involving lots of numbers is almost certainly good, as these are either set
> numbers, part numbers, or Richie's monthly statistics, so numbers will generate
> high quality word ratings.
>
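
For anyone who wants to try the word-level step, here is a minimal sketch. The post names the words but gives no numbers, so every score below, the averaging over words, and the names `WORD_SCORES`, `word_quality` and `content_quality` are my own assumptions.

```python
import re

# Hypothetical per-word scores: the post lists the words, not the values.
WORD_SCORES = {
    "sensate": 5.0,
    "post-expressionalism": 5.0,
    "lots": 0.5,
    "stuff": 0.5,
    "wow": 0.5,
}
SEVERE_PENALTY = -1000.0  # assumed size of the "severe negative" metric
NUMBER_SCORE = 5.0        # assumed score for any word containing a digit
DEFAULT_SCORE = 1.0       # assumed score for words not listed above


def word_quality(word: str) -> float:
    """Score one word after lower-casing and trimming common punctuation."""
    w = word.lower().strip('.,;:!?"()')
    # Penalised words win over everything else, so context cannot redeem them.
    if re.fullmatch(r"wa+s+u+p+", w) or w.startswith("insectoid"):
        return SEVERE_PENALTY
    if any(ch.isdigit() for ch in w):
        return NUMBER_SCORE
    return WORD_SCORES.get(w, DEFAULT_SCORE)


def content_quality(text: str) -> float:
    """Mean word score (averaging is an assumption; the post does not say)."""
    words = text.split()
    if not words:
        return 0.0
    return sum(word_quality(w) for w in words) / len(words)
```

With these made-up numbers the "45.7% of all insectoid sets..." sentence still comes out heavily negative, as described above.
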
> However, the posting content quality is not the final score. You must then
> multiply it by bytes of new content divided by the bytes of included quoted
> content. This ensures that "Me too" messages are forced down the quality metric
> no matter how frequently post-expressionalism is mentioned.
>
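
The new-to-quoted ratio could be sketched roughly like this. I am assuming that quoted lines are the ones starting with ">", that bytes are counted in UTF-8, and that a post with no quoted content divides by 1 rather than by zero; the post specifies none of this, and the function names are mine.

```python
def split_new_and_quoted(message: str) -> tuple[str, str]:
    """Separate lines starting with '>' (quoted) from everything else (new)."""
    new, quoted = [], []
    for line in message.splitlines():
        if line.lstrip().startswith(">"):
            quoted.append(line)
        else:
            new.append(line)
    return "\n".join(new), "\n".join(quoted)


def volume_factor(message: str) -> float:
    """Bytes of new content divided by bytes of quoted content."""
    new, quoted = split_new_and_quoted(message)
    new_bytes = len(new.encode("utf-8"))
    quoted_bytes = len(quoted.encode("utf-8"))
    # Assumption: clamp the divisor to 1 so unquoted posts do not divide by zero.
    return new_bytes / max(quoted_bytes, 1)
```

The posting content quality would then be multiplied by `volume_factor(message)`, so a "Me too" under a long quote is scaled down to almost nothing.
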
> Then you take any quoted content and recursively apply the quality analysis to
> it, and then compare the quality of the new content with the quoted content.
> The quality of the quoted content is then a ceiling on the possible quality of
> the new content. This is intended to discourage the proliferation of threads of
> low quality by downgrading all subsequent contributions no matter how sensate.
>
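
A rough sketch of the recursive ceiling, assuming each quoting level adds one ">" prefix and that `score` is whatever function rates a block of unquoted text (both assumptions of mine, as are the function names):

```python
from typing import Callable


def strip_one_quote_level(line: str) -> str:
    """Remove a single leading '>' and one optional space after it."""
    rest = line.lstrip()[1:]
    return rest[1:] if rest.startswith(" ") else rest


def primary_quality(message: str, score: Callable[[str], float]) -> float:
    """Score the new text, capped by the recursively scored quoted text."""
    new_lines, quoted_lines = [], []
    for line in message.splitlines():
        if line.lstrip().startswith(">"):
            quoted_lines.append(strip_one_quote_level(line))
        else:
            new_lines.append(line)
    own = score("\n".join(new_lines))
    if not quoted_lines:
        return own  # nothing quoted, so no ceiling applies
    # The quoted material is itself a message one level shallower.
    ceiling = primary_quality("\n".join(quoted_lines), score)
    return min(own, ceiling)
```

The recursion ends because each level removes one ">" from every quoted line, so a single low-scoring ancestor caps the whole thread beneath it.
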
> This produces the primary quality score for the posting.
>
> Finally some people are fundamentally low-quality authors of postings, and this
> must be incorporated to produce the modified quality score. This cannot be done
> initially but can be introduced once a certain volume of primary quality score
> data has been collected. By determining average primary quality for a given
> author, one can determine an author quality as a moving average. You then take
> the ratio of specific author quality divided by average author quality to get
> the author-multiplier, by which (as the name suggests) the primary quality
> score is multiplied to produce the modified quality score.
>
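
The author-multiplier step might look something like the sketch below. I am reading "average author quality" as the mean of the per-author averages, which is one interpretation among several, and using a plain mean rather than a moving one; the function names and the authors in the usage note are hypothetical.

```python
from statistics import mean


def author_multipliers(primary_scores: dict[str, list[float]]) -> dict[str, float]:
    """Map each author to (their mean primary score) / (mean over all authors).

    Only primary scores go in here, never modified ones, so the multiplier
    does not feed back into itself.
    """
    author_quality = {a: mean(s) for a, s in primary_scores.items() if s}
    # Raises ZeroDivisionError if the average author quality is exactly zero.
    average_author_quality = mean(author_quality.values())
    return {a: q / average_author_quality for a, q in author_quality.items()}


def modified_quality(primary: float, multiplier: float) -> float:
    """Apply the author-multiplier to one post's primary quality score."""
    return primary * multiplier
```

For two hypothetical authors with primary scores `[4.0, 6.0]` and `[1.0, 3.0]`, the per-author means are 5 and 2, the average author quality is 3.5, and the multipliers are 5/3.5 and 2/3.5.
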
> Note. It is very important to determine the specific author quality and average
> author quality scores from the *primary* quality scores, and not from the
> modified quality scores. As can be appreciated, the use of modified quality
> scores to determine specific author quality and average author quality will
> introduce unbounded escalating feedback.
>
> Having determined the modified quality score for each posting, it is then
> entirely mechanical to determine the total quality of posts by that author in
> any given time period (e.g. a month) as well as to derive the average quality
> of posts by that author over the same time period.
>
> However, over a sufficiently long period, the average quality of posts over
> a given period is likely to approximate the author-multiplier (as
> a relative but not absolute scale). This makes it very difficult for anyone to
> significantly lift their game quality-wise. So, the use of a moving average for
> computing the author-multiplier will need (over time, but probably not
> immediately) to incorporate an aging of older post quality data, probably based
> on some kind of inverse Poisson differential decay. I'm not sure what chi-value
> to use to stretch the probability curve, but I would think we'd be aiming for a
> half-life of around 3 months, so probably something in the range of 1.5 to 2
> should be OK (using base "e" not 10, of course).
>
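
I don't know what an inverse Poisson differential decay would look like, so the sketch below swaps in ordinary exponential decay with a half-life instead. The 90-day default stands in for "around 3 months", and the function name and input format are assumptions of mine.

```python
def aged_author_quality(scores_with_age: list[tuple[float, float]],
                        half_life_days: float = 90.0) -> float:
    """Weighted mean of primary scores, halving a post's weight every half-life.

    Each item is (primary_score, age_in_days).
    """
    weights = [0.5 ** (age / half_life_days) for _, age in scores_with_age]
    total = sum(weights)
    if total == 0:
        return 0.0  # no posts to average
    weighted = sum(w * s for w, (s, _) in zip(weights, scores_with_age))
    return weighted / total
```

In base "e" the same weighting is exp(-age * ln(2) / half_life_days), so a 90-day half-life corresponds to a decay rate of about 0.0077 per day.
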
> So, while some people will argue that quantitative evaluation of quality is
> fundamentally flawed no matter what interpretation of Chomsky conceptual
> analysis you use, I say (quoting from The Matrix here) "never send a person to
> do a machine's job". My method can be entirely automated (essential for ongoing
> maintenance) and we could have aggregated ratings on each newsgroup, allowing
> the LUGnet traffic page to be based on quality rather than quantity metrics.
> Then when loc.au wants to take on the Italians (say), we have to do it
> based on superior quality of postings and not just hitting send a lot.
>
> Indeed, increasing the quality ratings on a newsgroup can be achieved either by
> sending many high-quality posts, or by discouraging the sending of low-quality
> postings. Thus some newsgroups could achieve a higher quality aggregate through
> having no postings than most of the market newsgroups (which are likely to
> generate large negative aggregates).
>
> Waddya reckon?
>
> Kerry
>