Subject:	Re: .loc.au stats for October
Newsgroups:	lugnet.loc.au
Date:	Tue, 5 Nov 2002 13:01:17 GMT
Viewed:	667 times

> Of course, to do this we'd need to work out how to measure the quality per
> word or post quality values...

The quality of a post can be coarsely approximated by applying quality metrics to each individual word, giving an overall posting content quality. So, say, a message involving quality words like "sensate", "post-expressionalism" and "45.7%" would generate a higher quality score than a message involving words like "lots", "stuff", and "wow". Indeed, one can even apply severe negative volume metrics to such words as "wasssssssssuuupppp" and "insectoids", thus reducing any postings involving such words to hideously low levels, unredeemable even in such contexts as "45.7% of all insectoid sets incorporate post-expressionalism detectable to the more sensate builder". Any message involving lots of numbers is almost certainly good, as these are either set numbers, part numbers, or Richie's monthly statistics, so numbers will generate high quality word ratings.

However, the posting content quality is not the final score. You must then multiply it by the bytes of new content divided by the bytes of included quoted content. This ensures that "Me too" messages are forced down the quality metric no matter how frequently post-expressionalism is mentioned.

Then you take any quoted content, recursively apply the quality analysis to it, and compare the quality of the new content with that of the quoted content. The quality of the quoted content is then a ceiling on the possible quality of the new content. This is intended to discourage the proliferation of threads of low quality by downgrading all subsequent contributions no matter how sensate. This produces the primary quality score for the posting.

Finally, some people are fundamentally low-quality authors of postings, and this must be incorporated to produce the modified quality score. This cannot be done initially but can be introduced once a certain volume of primary quality score data has been collected.
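The primary quality score described above (word ratings, the new/quoted byte ratio, and the recursive ceiling from quoted content) could be sketched roughly as follows. This is only an illustration: the numeric ratings, the use of a mean over words, the `Post` class, and every function name are assumptions, and only the example words come from the post itself.

```python
from dataclasses import dataclass
from typing import Optional

# Example words are from the post; the sets and ratings are assumed.
GOOD_WORDS = {"sensate", "post-expressionalism"}
POOR_WORDS = {"lots", "stuff", "wow"}
SEVERE_WORDS = {"wasssssssssuuupppp", "insectoids", "insectoid"}

GOOD, NEUTRAL, POOR, SEVERE = 2.0, 1.0, 0.5, -100.0  # assumed values


@dataclass
class Post:
    new: str                         # text written by the poster
    quoted: Optional["Post"] = None  # the post being quoted, if any


def word_quality(word: str) -> float:
    """Rate one word; anything containing a digit counts as a number."""
    w = word.strip('.,;:!?"\'()').lower()
    if w in SEVERE_WORDS:
        return SEVERE
    if any(ch.isdigit() for ch in w) or w in GOOD_WORDS:
        return GOOD
    if w in POOR_WORDS:
        return POOR
    return NEUTRAL


def content_quality(text: str) -> float:
    """Posting content quality: mean word rating (0.0 for empty text)."""
    words = text.split()
    if not words:
        return 0.0
    return sum(word_quality(w) for w in words) / len(words)


def primary_quality(post: Post) -> float:
    """Content quality, scaled by new/quoted bytes, capped by the quoted post."""
    score = content_quality(post.new)
    if post.quoted is None:
        return score
    new_bytes = len(post.new.encode("utf-8"))
    # Simplification: count only the quoted post's own text as quoted bytes.
    quoted_bytes = len(post.quoted.new.encode("utf-8"))
    if quoted_bytes:
        score *= new_bytes / quoted_bytes
    # The quoted content's own (recursive) score is a ceiling on this post.
    return min(score, primary_quality(post.quoted))
```

With these assumed ratings, a "Me too" reply to a high-scoring post is pulled down by the byte ratio, and even a very sensate reply to a "lots stuff wow" post cannot score above the post it quotes.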
By determining the average primary quality for a given author, one can determine an author quality as a moving average. You then take the ratio of specific author quality divided by average author quality to get the author-multiplier, which is then (as the name suggests) multiplied by the primary quality score to produce the modified quality score.

Note: it is very important to determine the specific author quality and average author quality scores from the *primary* quality scores, and not from the modified quality scores. As can be appreciated, the use of modified quality scores to determine specific author quality and average author quality will introduce unbounded escalating feedback.

Having determined the modified quality score for each posting, it is then entirely mechanical to determine the total quality of posts by that author in any given time period (e.g. a month) as well as to derive the average quality of posts by that author over the same time period.

However, over a sufficiently long period, the average quality of posts over a given period is likely to approximate the author-multiplier (as a relative but not absolute scale). This makes it very difficult for anyone to significantly lift their game quality-wise. So, the use of a moving average for computing the author-multiplier will need (over time, but probably not immediately) to incorporate an aging of older post quality data, probably based on some kind of inverse Poisson differential decay. I'm not sure what chi-value to use to stretch the probability curve, but I would think we'd be aiming for a half-life of around 3 months, so probably something in the range of 1.5 to 2 should be OK (using base "e" not 10, of course).

So, while some people will argue that quantitative evaluation of quality is fundamentally flawed no matter what interpretation of Chomsky conceptual analysis you use, I say (quoting from The Matrix here) "never send a person to do a machine's job".
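The author-multiplier with aging of older scores might look something like the sketch below. It swaps in plain exponential decay with a half-life of about 3 months (91 days) in place of the decay named in the post, and it works only from primary scores, as the post insists. The function names, the `(age_days, primary_score)` history format, and the fallback multiplier of 1.0 are all assumptions.

```python
HALF_LIFE_DAYS = 91.0  # roughly 3 months, the half-life suggested in the post


def decay_weight(age_days: float, half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Weight of a score that is age_days old; halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def author_quality(scores: list[tuple[float, float]]) -> float:
    """Decay-weighted average of (age_days, primary_score) pairs."""
    total_weight = sum(decay_weight(age) for age, _ in scores)
    if total_weight == 0:
        return 0.0
    return sum(decay_weight(age) * s for age, s in scores) / total_weight


def author_multiplier(author: str,
                      history: dict[str, list[tuple[float, float]]]) -> float:
    """Specific author quality divided by the average over all authors."""
    qualities = {name: author_quality(s) for name, s in history.items()}
    average = sum(qualities.values()) / len(qualities)
    if average == 0:
        return 1.0  # assumed fallback when no meaningful average exists
    return qualities[author] / average


def modified_quality(primary_score: float, multiplier: float) -> float:
    """Modified quality score: primary score times the author-multiplier."""
    return primary_score * multiplier
```

Because recent scores carry more weight, an author whose latest posts score well sees their decayed average rise above a plain mean of the same scores, which is what gives them room to lift their game.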
My method can be entirely automated (essential for ongoing maintenance) and we could have aggregated ratings on each newsgroup, allowing the LUGnet traffic page to be based on quality rather than quantity metrics. Then when loc.au wants to take on the Italians (say), we have to do it based on superior quality of postings and not just hitting send a lot.

Indeed, increasing the quality ratings on a newsgroup can be achieved either by sending many high-quality posts, or by discouraging the sending of low-quality postings. Thus some newsgroups could achieve a higher quality aggregate through having no postings than most of the market newsgroups (which are likely to generate large negative aggregates).

Waddya reckon?

Kerry

Message has 7 Replies:

		Re: .loc.au stats for October
Uhhh... I just hope this last post was intended as a satire on "post quality"! :-) Cheers, Paul "Kerry Raymond" <kerry@dstc.edu.au> wrote in message news:H53tIA.6EF@lugnet.com... (...) per (...) metrics (...) say, a (...) words (...) negative (...) (...) (22 years ago, 5-Nov-02, to lugnet.loc.au)

		Re: .loc.au stats for October
(...) My concern here with taking on the Italians is that they post in Italian. Therefore any quality metric has to be multi lingual. That goes without saying so that's presumably why Kerry didn't say it, but I needed to boost my new/quoted material (...) (22 years ago, 5-Nov-02, to lugnet.loc.au)

		Re: .loc.au stats for October
Ah, the perils of cutting and pasting: I forgot this bit while reformatting. In lugnet.loc.au, Kerry Raymond writes: (what any mechanical grading system would SURELY have to rate as a very high quality post, as it contained lots of big and obscure (...) (22 years ago, 5-Nov-02, to lugnet.loc.au)

		Re: .loc.au stats for October
Don't you mean 'post-expressionism'? (...) (22 years ago, 5-Nov-02, to lugnet.loc.au)

		RE: .loc.au stats for October
Okay, hands up all those who think Kerry's spent too long in post-modern academia. The use of the word 'sensate' is obviously a troll to all those who are familiar with academic discourse. David. (22 years ago, 5-Nov-02, to lugnet.loc.au)

		Re: .loc.au stats for October
(...) So obviously that post would receive a hideously low quality rating, because it mentions "insectoids" twice.... ROSCO (22 years ago, 5-Nov-02, to lugnet.loc.au)

		Re: .loc.au stats for October
(...) [judicious snipping raises new/quoted text ratio.] ["judicious" earns points] [phrase "new/quoted text ratio" earns bonus points] [please note bonus points for bracketed editorial comment] [ooohhh... and the phrase "bracketed editorial (...) (22 years ago, 5-Nov-02, to lugnet.loc.au)

Message is in Reply To:

		Re: .loc.au stats for October
I'm still waiting on the scale that is based on quality of post rather than quantity.... Perhaps some ideas for grading could be: -Average word quality multiplied by the total number of words that a LUGNetter posts -The Sum of all Quality posts -The (...) (22 years ago, 4-Nov-02, to lugnet.loc.au)
