I think the thread has touched on just how hard it is to use numbers to reach any "objective" conclusion without doing a huge amount of work. We want "simple" answers that appear to be based on hard data, and life is usually not that way.
I have given some thought to the statistics available from Social and have also looked at what methods have been developed to analyze these types of data. There are in fact some reasonable approaches, but they need to be instituted while a forum is fully active, so they are not applicable in our case.
It would be possible to do the following two things with existing information for all or a sample of members as long as Social is running and Profile pages can be accessed:
- rank every member from "most" to "least" helpful based on objective criteria (there are 4 metrics: Comments, Discussions, Hearts, and Likes, with 2 showing the contribution by the member and 2 showing the reaction to it)
- produce multiple and conflicting rankings depending on the weight given to each factor (for instance, using the 4 measures, Karl can be shown to be either the most or the least helpful of the four members I listed, and the same is true for the other 3; with more complex weighting, which certainly seems sensible here, still further results are possible)
So, just dealing with the numbers that exist, very different and contradictory conclusions are possible depending on how one chooses to use them, as the sketch below illustrates.
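To make that concrete, here is a minimal sketch of how the choice of weights alone can flip a ranking. All member names other than Karl, and every number, are invented for illustration; nothing here comes from the actual Social data.

```python
# Toy data: invented metric totals for four members (not real Social numbers).
members = {
    "Karl":    {"Comments": 520, "Discussions": 15, "Hearts": 30,  "Likes": 90},
    "MemberB": {"Comments": 60,  "Discussions": 85, "Hearts": 240, "Likes": 310},
    "MemberC": {"Comments": 300, "Discussions": 40, "Hearts": 120, "Likes": 150},
    "MemberD": {"Comments": 150, "Discussions": 60, "Hearts": 180, "Likes": 200},
}

def score(metrics, weights):
    # "Helpfulness" as a weighted sum of the 4 metrics.
    return sum(weights[k] * metrics[k] for k in weights)

# Two defensible but different weightings: one rewards contribution
# (what the member posted), the other rewards reaction (what others gave back).
contribution_weights = {"Comments": 1, "Discussions": 1, "Hearts": 0, "Likes": 0}
reaction_weights     = {"Comments": 0, "Discussions": 0, "Hearts": 1, "Likes": 1}

for label, w in [("contribution", contribution_weights), ("reaction", reaction_weights)]:
    ranking = sorted(members, key=lambda name: score(members[name], w), reverse=True)
    print(f"{label:>12}: {ranking}")
# With these invented numbers Karl ranks first under one weighting
# and last under the other, using exactly the same underlying data.
```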
THAT CANNOT BE CHANGED AT THIS POINT IN TIME--IT IS TOO LATE
Next comes the question of whether the numbers were "rigged."
If one is trying to determine whether a group of members conspired to "inflate" totals, that would require a formal investigation, which is simply not going to happen. So there will never be a "firm" answer on that. In any case, it seems implausible that there was conscious "rigging."
There is a much simpler explanation. The forum consisted of stratified layers of users. Each layer had its own intrinsic defining properties and could easily be shown, using ANOVA or other standard tools, to have a separate identity. Comparing across these layers is wrong in any type of analysis and must be rejected on methodological grounds. It is the equivalent of comparing the commute time of someone in Los Angeles who lives 40 miles from the job to that of someone in Manhattan who lives two blocks from the office. Yes, it is possible to measure both times, but no sensible person would think that truly similar things are being compared.
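For anyone who wants to see what such a test looks like, here is a minimal one-way ANOVA sketch. The strata labels and the ratio values are entirely hypothetical; the point is only the mechanics.

```python
# Minimal one-way ANOVA sketch: do user strata differ on some per-member
# metric? All groups and numbers below are hypothetical.
from scipy.stats import f_oneway

# Hypothetical Likes-per-Comment ratios for members of three strata.
casual_users = [0.8, 1.1, 0.9, 1.3, 0.7, 1.0]
regulars     = [1.9, 2.1, 1.8, 2.4, 2.0, 2.2]
power_users  = [3.5, 4.1, 3.8, 3.2, 4.4, 3.9]

f_stat, p_value = f_oneway(casual_users, regulars, power_users)
print(f"F = {f_stat:.1f}, p = {p_value:.2g}")
# A small p-value says the strata behave as statistically distinct
# populations, which is precisely the argument against pooling them
# into a single ranking.
```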
Even within strata there are very serious methodology problems. For instance, some members would simply state the correct answer; others would explain all the intermediate steps. Were both members equally helpful?
Then there is the enormous variety of types of posts. Some extremely knowledgeable members posted rarely and on arcane subjects that few members commented on. How should such contributions be viewed?
Then there is the "Ring Plus Bias," which distorts the data to the point that they are impossible to use for any serious purpose.
The Ring Plus Bias showed up in two areas: threads about how members were benefitting from RingPlus service, and threads on anything else related to RingPlus. Both always attracted extremely high volumes of posts. Posters who expressed "positive" feelings in these threads would have dramatically better profiles than those who did not. However, these threads were primarily morale-boosting in nature, or opportunities for mutual emotional support. They did not, and by their nature could not, solve any problem or provide any really new information. Yet they greatly influence the statistics. In any analysis these threads would need to be broken out and treated separately, as sketched below.
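Here is a minimal sketch of that break-out step, assuming the posts could be exported to a table. The file name and every column name are assumptions, since no real export exists.

```python
# Sketch: separate RingPlus morale/support threads from substantive ones
# before computing any per-member statistics. "posts.csv" and all column
# names (thread_category, member, likes) are hypothetical.
import pandas as pd

posts = pd.read_csv("posts.csv")
is_morale = posts["thread_category"] == "ringplus_morale"

morale      = posts[is_morale]
substantive = posts[~is_morale]

# Per-member totals computed separately, so the morale threads cannot
# inflate "helpfulness" numbers derived from substantive threads.
print(substantive.groupby("member")["likes"].sum().sort_values(ascending=False))
```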
Then there is the post-count bias. As the number of posts a member makes increases, any reaction ratio (such as Likes/Comments) can be expected to decline. The larger the number of posts, the bigger the decline, but the relationship is not linear.
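A toy illustration of that non-linear decline, under an assumed model in which each additional post draws fewer marginal likes (the model and all numbers are invented, not fitted to anything):

```python
# Toy model (invented): marginal likes per post fall off as 1/sqrt(n),
# so the cumulative Likes/Comments ratio declines non-linearly with volume.
import numpy as np

n = np.arange(1, 1001)                 # post counts 1..1000
marginal_likes = 5.0 / np.sqrt(n)      # assumed diminishing attention per post
cumulative_ratio = np.cumsum(marginal_likes) / n

for posts in (10, 100, 1000):
    print(f"{posts:4d} posts -> Likes/Comments ratio ~ {cumulative_ratio[posts - 1]:.2f}")
# Under this model the ratio roughly halves for every 4x increase in post
# count: the penalty for being prolific is real but far from linear.
```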
Next there is the nature-of-the-post bias. Shorter and less technical posts are likely to lead to higher ratios even though they actually provide less "useful" information.
Then there is the VM bias. It is simply not correct to compare any VM member with a non-VM member. It is indeed valid to compare within categories of VMs (within Moderators and within EC).
However, such comparisons are almost certainly poor measures and may produce completely erroneous results. The reason is obvious: both EC and Moderators have functions that are handled out of sight, and there is no way to adjust the data for this.
These are just a few of the issues involved. The full picture is actually a lot more complex and confused.
The bottom line is the numbers could be used to produce one or many measures of “helpful.”
The meaning of “helpful” in any of these contexts is in the mind of the user not in the numbers themselves.
And now for an actual statistical question: