Subject:	Re: .loc.au stats for October
Newsgroups:	lugnet.loc.au
Date:	Tue, 5 Nov 2002 12:19:09 GMT
Viewed:	1326 times

Uhhh... I just hope this last post was intended as a satire on "post
quality"! :-)

Cheers,
Paul

"Kerry Raymond" <kerry@dstc.edu.au> wrote in message
news:H53tIA.6EF@lugnet.com...
> > Of course, to do this we'd need to work out how to measure the quality
> > per word or post quality values...
>
> The quality of a post can be coarsely approximated by applying quality
> metrics to each individual word, giving an overall posting content
> quality. So, say, a message involving quality words like "sensate",
> "post-expressionalism" and "45.7%" would generate a higher quality score
> than a message involving words like "lots", "stuff", and "wow". Indeed,
> one can even apply severe negative volume metrics to such words as
> "wasssssssssuuupppp" and "insectoids", thus reducing any postings
> involving such words to hideously low levels, unredeemable even in such
> contexts as "45.7% of all insectoid sets incorporate post-expressionalism
> detectable to the more sensate builder". Any message involving lots of
> numbers is almost certainly good, as these are either set numbers, part
> numbers, or Richie's monthly statistics, so numbers will generate high
> quality word ratings.
>
> However, the posting content quality is not the final score. You must
> then multiply it by bytes of new content divided by the bytes of included
> quoted content. This ensures that "Me too" messages are forced down the
> quality metric no matter how frequently post-expressionalism is mentioned.
>
> Then you take any quoted content and recursively apply the quality
> analysis to it, and then compare the quality of the new content with the
> quoted content. The quality of the quoted content is then a ceiling on the
> possible quality of the new content. This is intended to discourage the
> proliferation of threads of low quality by downgrading all subsequent
> contributions no matter how sensate.
>
> This produces the primary quality score for the posting.
>
> Finally some people are fundamentally low-quality authors of postings, and
> this must be incorporated to produce the modified quality score. This
> cannot be done initially but can be introduced once a certain volume of
> primary quality score data has been collected. By determining average
> primary quality for a given author, one can determine an author quality as
> a moving average. You then take the ratio of specific author quality
> divided by average author quality to get the author-multiplier which is
> then (as the name suggests) multiplied to the primary quality score to
> produce the modified quality score.
>
> Note. It is very important to determine the specific author quality and
> average author quality scores from the *primary* quality scores, and not
> from the modified quality scores. As can be appreciated, the use of
> modified quality scores to determine specific author quality and average
> author quality will introduce unbounded escalating feedback.
>
> Having determined the modified quality score for each posting, it is then
> entirely mechanical to determine the total quality of posts by that author
> in any given time period (e.g. a month) as well as to derive the average
> quality of posts by that author over the same time period.
>
> However, over a sufficiently long period, the average quality of posts
> over a given period is likely to approximate the author-multiplier (as a
> relative but not absolute scale). This makes it very difficult for anyone
> to significantly lift their game quality-wise. So, the use of moving
> average for computing the author-multiplier will need (over time, but
> probably not immediately) to incorporate an aging of older post quality
> data, probably based on some kind of inverse Poisson differential decay.
> I'm not sure what chi-value to use to stretch the probability curve, but I
> would think we'd be aiming for a half-life of around 3 months, so probably
> something in the range of 1.5 to 2 should be OK (using base "e" not 10, of
> course).
>
> So, while some people will argue that quantitative evaluation of quality
> is fundamentally flawed no matter what interpretation of Chomsky
> conceptual analysis you use, I say (quoting from The Matrix here) "never
> send a person to do a machine's job". My method can be entirely automated
> (essential for ongoing maintenance) and we could have aggregated ratings
> on each newsgroup, allowing the LUGnet traffic page to be based on quality
> rather than quantity metrics. Then when loc.au wants to take on the
> Italians (say), then we have to do it based on superior quality of
> postings and not just hitting send a lot.
>
> Indeed, increasing the quality ratings on a newsgroup can be achieved
> either by sending many high-quality posts, or by discouraging the sending
> of low-quality postings. Thus some newsgroups could achieve a higher
> quality aggregate through having no postings than most of the market
> newsgroups (which are likely to generate large negative aggregates).
>
> Waddya reckon?
>
> Kerry
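
For anyone curious how the scheme quoted above might look in practice, here is a rough Python sketch of the primary and modified scores. It is only an illustration under assumptions, not Kerry's actual method: the numeric word scores, the default score, the use of a plain mean to combine word scores, the order of applying the byte ratio before the ceiling, and the names `Post`, `primary_score` and `modified_score` are all made up for the example. The aging of old scores is left out, since the post does not pin down a usable formula for it.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical scores: the post names these words but assigns no numbers.
WORD_SCORES = {
    "sensate": 2.0,
    "post-expressionalism": 2.0,
    "lots": 0.5,
    "stuff": 0.5,
    "wow": 0.5,
    "wasssssssssuuupppp": -5.0,
    "insectoids": -5.0,
}
DEFAULT_SCORE = 1.0  # assumed score for any word not listed above
NUMBER_SCORE = 2.0   # assumed: any token containing a digit rates highly


@dataclass
class Post:
    new_text: str                    # content written by the poster
    quoted: Optional["Post"] = None  # the post being quoted, if any


def word_score(token: str) -> float:
    """Score one word; numbers count as high quality."""
    if any(ch.isdigit() for ch in token):
        return NUMBER_SCORE
    return WORD_SCORES.get(token.lower(), DEFAULT_SCORE)


def content_quality(text: str) -> float:
    """Mean word score of the text (the mean is an assumption)."""
    tokens = [t.strip('.,;:!?"()') for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(word_score(t) for t in tokens) / len(tokens)


def total_bytes(post: Post) -> int:
    """Bytes of a post including everything it quotes."""
    size = len(post.new_text.encode("utf-8"))
    if post.quoted is not None:
        size += total_bytes(post.quoted)
    return size


def primary_score(post: Post) -> float:
    """Content quality x (new bytes / quoted bytes), capped by the quoted post."""
    quality = content_quality(post.new_text)
    if post.quoted is None:
        return quality  # assumed: no quoted text means no ratio and no ceiling
    quoted_bytes = max(1, total_bytes(post.quoted))
    ratio = len(post.new_text.encode("utf-8")) / quoted_bytes
    # The quoted content's own (recursive) score is a ceiling on the reply.
    return min(quality * ratio, primary_score(post.quoted))


def modified_score(primary: float, author_mean: float, overall_mean: float) -> float:
    """Apply the author-multiplier; both means must come from primary scores."""
    return primary * (author_mean / overall_mean)
```

With these made-up numbers, `Post("Me too", quoted=Post("45.7% of all sets are sensate"))` scores well below the post it quotes, and a reply to `Post("wow lots stuff")` cannot score above that post however sensate the reply is.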
