Worth noting that we have trended down on our traitstone percentage since the first three weeks (finally sitting under a 30% average), and the post-patch and pre-patch samples are within each other's margin of error when using a 95% confidence interval (roughly 17.5~21.5% at 95% CI pre-patch and roughly 20~39.5% at 95% CI post-patch... we don't have a great sample size post-patch, but if the two rates really are "the same", then it is far more likely that we collectively have "bad luck" now than that we collectively had "good luck" when taking the earlier samples).
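For anyone who wants to sanity-check intervals like those, here is a minimal sketch of the kind of calculation involved, using the usual normal approximation for a binomial proportion. The success/trial counts below are illustrative placeholders chosen to produce intervals of roughly that shape; they are not the actual spreadsheet totals.

```python
# Sketch of a 95% confidence interval for an observed proportion
# (normal approximation). Counts below are ILLUSTRATIVE, not real data.
from math import sqrt

def proportion_ci(successes, trials, z=1.96):
    """Return (low, high) 95% CI for an observed proportion."""
    p = successes / trials
    half_width = z * sqrt(p * (1 - p) / trials)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical tallies: a large pre-patch sample and a small post-patch one.
for label, hits, n in [("pre-patch", 295, 1500), ("post-patch", 25, 85)]:
    low, high = proportion_ci(hits, n)
    print(f"{label}: {hits}/{n} traitstone tasks -> {low:.1%} to {high:.1%}")
```

With made-up tallies like these, the small post-patch sample gives an interval several times wider than the pre-patch one, which is why the two can overlap even if the underlying rate actually moved.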
However, I'm a bit concerned by this comment:
We are dealing with supposedly random and supposedly independent events, right? Speaking, at least, of any given day of adventure board rolls, since repeats of the same task within a day have never been shown. In that case, the accuracy of the estimate is not affected by how much of the data went unsampled; it is determined by the size of the sample itself. Sure, you'd get more accurate data if you took more samples, but if you sampled, say, 100 adventure board pulls when only 100 pulls had ever been done, versus sampling 598 out of 800,000, you'd be more likely to have an accurate estimate of the rates at which tasks are supposed to appear on the adventure board with the latter. Again, assuming independent and random.
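To illustrate that point, here is a tiny simulation sketch: with independent rolls, the spread of the observed rate around the true rate depends on how many rolls you sample, not on how many rolls exist in total. The 20% "true" task rate is an assumption purely for the demo.

```python
# Demo: estimate error shrinks with sample size, regardless of how many
# total rolls exist. The 20% "true" rate is an assumed value for the demo.
import random
import statistics

TRUE_RATE = 0.20
TRIALS = 2000

def typical_error(sample_size):
    """Average absolute error of the observed rate over many repeated samples."""
    errors = []
    for _ in range(TRIALS):
        hits = sum(random.random() < TRUE_RATE for _ in range(sample_size))
        errors.append(abs(hits / sample_size - TRUE_RATE))
    return statistics.mean(errors)

# A census of 100 rolls vs. a 598-roll sample out of a notional 800,000:
print(f"n=100: typical error ~{typical_error(100):.1%}")
print(f"n=598: typical error ~{typical_error(598):.1%}")
```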
So, at the moment, our traitstone rates are still indistinguishable from "bad luck" with any real degree of certainty... but only just barely. It is also worth noting that while sampling will never be "definitive", it can still be used to say things about a much larger data set. And right now it is saying "maaaaybe something changed here".
If you are just saying "trust us, it's correct, and it hasn't changed"... I'll be blunt: that has been said before when it wasn't true, and each time it happens it becomes a little harder to just accept it when there is any evidence to the contrary. I know you guys are busy, and I know a lot of threads about something being "wrong with the RNG" are a combination of cognitive biases and misunderstandings with no hard data attached. But being presented with actual evidence that something might be off, and then having it summarily marked "not a bug" without even checking with the dev(s) who configured the numbers, really makes me wonder why we bother reporting anything that takes this much effort, only to get blown off without so much as a "we will keep an eye on it".
So, work with us here. Maybe you don't want to bother the team with this yet (even though it's been almost a full patch cycle, and it will probably be two more before the margin of error on the post-patch sample is small enough to show whether the samples converge or diverge, and that's assuming the ABs aren't just changed in that time): what level does this have to rise to for it to be checked out by someone in the know?
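For context on that "two more patch cycles" guess, here is a rough back-of-envelope sketch of how many recorded boards it takes to shrink the 95% margin of error to a given width, again assuming a normal-approximation interval and an illustrative ~20% traitstone rate.

```python
# Back-of-envelope: boards needed so that z*sqrt(p(1-p)/n) <= half_width.
# The ~20% rate is an assumption for the demo, not a confirmed value.
from math import ceil, sqrt

def boards_needed(rate, half_width, z=1.96):
    """Sample size needed to reach the target 95% margin of error."""
    return ceil((z / half_width) ** 2 * rate * (1 - rate))

for target in (0.05, 0.03, 0.02):
    print(f"+/-{target:.0%} margin: ~{boards_needed(0.20, target)} boards")
```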