Vault Weekend data collection - for w/e 13 Sept

Mithran · August 17, 2020, 12:44pm

Sort of…

If we assume the given rate is correct, the cumulative probability of getting a sample set like the one collected is worse than 179 million to 1. I’ve seen samples for other random events that described outliers or anomalies where nothing was actually wrong, and I’m willing to give the benefit of the doubt in a lot of situations for way longer than I should given past events (though these days anything that doesn’t pass even a 99% confidence level test makes me nervous, especially if I see a trend), but we are way, way beyond that. We have >99.9999% confidence from the samples taken that the true rate falls outside what was stated, and 300 samples is absolutely a significant amount given a stated rate of 10% and our sampled rate of 1.6%. It is much harder to zero in on the actual rate through sampling, for sure; we’d have way less confidence saying “the actual drop rate is 1%, so yours is wrong”, for example, but we can be pretty confident in saying “based on this, I’m pretty sure it isn’t actually 10%”. It is also worth noting that even the best “luck” in the data sets posted here, person to person, would still fall on the side of “bad luck” for a true rate of 10%.
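
For anyone who wants to check the arithmetic, here is a minimal sketch of that tail probability. It assumes each vault key is an independent trial with a fixed 10% chance of being epic, and it takes the pooled sample to be 5 epic keys out of 303 (the figures used elsewhere in this thread). The helper name `binom_cdf` is just for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Assumed pooled sample: 5 epic keys out of 303 vault keys, stated rate 10%.
tail = binom_cdf(5, 303, 0.10)
print(f"P(5 or fewer epics in 303 at 10%) = {tail:.3e}")
print(f"roughly 1 in {1 / tail:,.0f}")
```

This only tests "is 10% plausible"; it says nothing about what the real rate is.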

For a similar situation, imagine running a coin flip trial, getting heads 24 out of 25 times, and then declaring there is no way to tell whether or not this is a “fair” coin. Yes, you’d want more data to prove it with “statistical significance”, but it is super easy to just flip a coin more times whereas, on our end, trying to collect Epic Vault Key data is extremely tedious and, going forward without an event weekend, borderline impossible until it rolls around again. By the way, hitting this hypothetical coin situation with a “fair” coin is a roughly 1 in 1.3 million chance, more than 100 times more likely than hitting 5 (or fewer) successes in a sample of 303 with an actual rate of 10%.
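
A rough sketch of the coin comparison, for the curious. It reads "24 out of 25" as "at least 24 heads" (an assumption; counting exactly 24 gives nearly the same figure) and again treats every key as an independent 10% trial.

```python
from math import comb

# P(at least 24 heads in 25 flips of a fair coin): 24 or 25 heads.
coin_tail = sum(comb(25, k) for k in (24, 25)) / 2**25
print(f"fair coin, 24+ heads of 25: roughly 1 in {1 / coin_tail:,.0f}")

# Same kind of tail for the key data (assumed: 5 or fewer epics in 303 at 10%).
key_tail = sum(comb(303, k) * 0.1**k * 0.9**(303 - k) for k in range(6))
print(f"the coin result is about {coin_tail / key_tail:.0f} times more likely")
```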

At some point, we have to request that, even if it is potentially “wasting the devs’ time”, this is at least “enough” for some checks to be performed on their side, both to see if the “correct” rate is in game (during the vault event and outside it) and/or to see if we were quoted the “correct” rate in the 5.1 update notes. I’d imagine (or hope) they have a tool to show them all vault key drops over a weekend from Live data (rather than simulated) as well as all Epic Vault keys, at which point it wouldn’t take more than a glance to see how close it was to 10%. For example, if 10,000 total vault keys were dropped and fewer than 800 were epic, that is a big enough sample to declare with similar confidence that the rates are “not 10%”, even though it is much less “off” than our number, percentage wise.
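
To put a number on the 10,000-key example, here is a sketch under the same independent-trials assumption. The tally itself is hypothetical, as in the paragraph above. The sum is done in log space because the individual terms are far too small for ordinary floating point at this sample size; `log_binom_pmf` is an illustrative name.

```python
from math import exp, lgamma, log

def log_binom_pmf(k, n, p):
    """log P(X = k) for X ~ Binomial(n, p); log space avoids underflow at large n."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log(1 - p)

# Hypothetical server-side tally: 10,000 vault keys, fewer than 800 epic.
n, p = 10_000, 0.10
tail = sum(exp(log_binom_pmf(k, n, p)) for k in range(800))
print(f"P(fewer than 800 epics in {n:,} at 10%) = {tail:.2e}")
```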

The point being, any scenario where we have “enough data” to “prove” the rates are a given number with any degree of certainty is a scenario where a potential bug has been in the game way too long and way too much legwork had to be done by the playerbase if it is ever revealed as an actual bug.

I’d be lying if I said that past mistakes with very similar things haven’t factored into how much benefit of the doubt I’m willing to extend here, but I can picture about a dozen ways this could have gone wrong, including a bunch where, even if they pulled simulated data, it would line up with “expected” while still being discrepant from the data given by the players.

At this point, it just seems far more likely someone screwed up in setting/communicating the rates (or both) than that all the people who happened to be recording results and willing to post them (and/or just happened to be recording and spreadsheeting all their results but only decided to post after they were “unlucky”) had some degree of “bad luck” ranging from “minor” to “extreme outlier”. It’s not impossible everything is working correctly, but…

cyberkiwi · August 17, 2020, 12:58pm

You have to also consider out of 11 people, 5 got an epic key each, and the 5 ALL used fewer tries.
Statistics lie all the time when you have no control over the sample.

If the samples were chosen before data collection and everyone followed through with submitting data, I agree the data is compelling. But when the data is heavily skewed by 2-3 sets that did not get the epic vault key, it becomes questionable.

This is like taking a huge stream of die rolls that print onto a ticker tape, and then selectively choosing the most interesting portions - the 11 sixes in a row multiple times, 20 sixes in a row, then a few normal runs, and using that very tiny sample of the population for chi-squared tests and projecting confidence intervals. A recipe for disaster.

I am not vouching for the developers at all, I’m just not willing to make any conclusion from an extremely small and limited data set that is not robust. For all we know, there is probably a bug somewhere in the application of said 10%. I can agree with the possibility.

Let me give you a hypothetical interpretation of the data:

the population of players is way more than 3000 (plucked from thin air)
this volume provides for 0/51 as a possibility 0.5% of the time, or 15 players
2 of those 15 players self-elect to report their outcomes
you have your very biased 11-person dataset
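
The numbers in that hypothetical can be reproduced with a couple of lines. This is only a sketch of the scenario as stated: the 3000 players and 51 keys each are the made-up figures from the list above, and each key is assumed to be an independent 10% trial.

```python
# Assumptions from the hypothetical above: 3000 players, each with 51 vault
# keys, and a true epic rate of 10% per key.
players = 3000
p_zero_in_51 = 0.9 ** 51          # chance a single player sees 0 epics in 51 keys
expected_unlucky = players * p_zero_in_51

print(f"P(0 of 51) = {p_zero_in_51:.2%}")
print(f"expected players with 0 of 51: {expected_unlucky:.1f}")
```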

actreal · August 17, 2020, 1:15pm

I’m with @Mithran. Bayesian probability says that: since every other time players have had data that looks as bad as this, there’s been a bug or a miscommunication inside the dev team or both, it is extraordinarily likely that this is a dev error.

I didn’t collect data sufficiently accurately for inclusion in a data set, but no one in my guild got a single Epic Vault Key during a busy weekend of playing, and I’m not seeing any player gleefully post about getting more than one Epic key.

Whatever went wrong, I hope the devs fix it before next vault weekend so we can all have a real chance of getting one.

Mithran · August 17, 2020, 2:33pm

I will concede that this is a possible scenario, but I do not believe it is the most probable one. Based on all the information we have now (including past data gathering endeavors by at least some of the people posting here and past mistakes made by the dev team), it is far more likely that this is a mistake on their end rather than some extreme confluence of a bunch of people recording and then withholding data. This isn’t the first data gathering endeavor on this forum, and skewing to the degree needed to “corrupt” this data set to the point where it didn’t at least point to something being “off” would imply a bias far, far greater than any I have experienced in any other data set I collected here. For example, if we removed all the people that got 0 vault keys from the larger set (leaving 5 out of 148), we still have a situation where our sample set, even at a 99.999% confidence level, does not include the stated rate of 10% within its confidence interval. We’d have to assume nearly every sample is skewed or biased, and while we can’t prove the rigor of this data, it is odd that there is no evidence to the contrary to be found.
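
One way to see the 5 out of 148 interval is the sketch below. The post doesn't say which interval method was used, so the normal-approximation (Wald) interval here is an assumption, and `wald_interval` is an illustrative name. The Wald approximation is rough when the number of successes is this small, so treat the exact confidence level as indicative rather than precise.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, confidence):
    """Normal-approximation (Wald) interval around the sampled proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

low, high = wald_interval(5, 148, 0.99999)
print(f"99.999% interval for 5/148: {low:.4f} to {high:.4f}")
print("stated 10% inside interval:", low <= 0.10 <= high)
```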

Unless I’m completely misinterpreting the data?

The sheet shows every one of the five who got a key having at best 1 out of 11 successes, and at worst 1 out of 56. That would make every single data set “bad luck” if the true rate were 1 in 10 (neutral luck if it were 1 in 11; depending on how you interpret the 10% in the original post, it could be describing 1 EVK per 10 vault keys, which is closer to 9.1% successes).

Now, absence of evidence is not necessarily evidence of absence. But given even inconclusive data that, no matter how you arrange it, trends heavily away from the conclusion “the rate of 10% is correct as stated”, and weighing the history of the multiple reliable reporting sources against the proven fact that very similar game issues have happened more than once in the past, if forced to draw a conclusion, it is far more likely that the mistake occurred on their end.

Yes, absolutely, the “correct” response for an objective statistical analysis as proof is to gather more data and ensure it is more “pure”, but that is also probably underestimating what a monumental task this can be, even over multiple participants, and the fact that it should not be on us, the playerbase, to ever need to collect enough data on something being wrong to rise to the level of “statistical proof” (both in terms of real time that needs to elapse, months or sometimes even years, in addition to the time spent actually gathering the data) before it will be examined from the developer side, where they have much better tools to find out what may have gone wrong in a fraction of the time and effort. Which was the larger point I was trying to make in my previous post.

But, at this point, given the inability to just “gather more data” in any reasonable way, even with an organized effort, until a point at which we’ve already resigned ourselves to leaving a potential drop error unaddressed in the game for at least another month, which is more likely?

extreme reporting bias due to unhappy people, including several known data collectors, with no evidence to the contrary, where we have to assume we only got bad data in addition to some amount of “bad luck”
any one of a myriad of human errors that could explain the discrepancy, from a logic/programming error to a data input error to simple miscommunication, combined with a rushed development schedule and lack of testing and compounded by sub-optimal working conditions, causing errors similar to those that have already happened outside of these conditions on more than one occasion

If you weren’t around for the Chaos Portal drop rate saga, you probably have a higher internal burden of proof before you are willing to even raise it as “probably an issue”. Having gone through that once already, I really, really hope it never has to make it that far, because it just drags everything down. To the point where even if it turns out to be “a waste of time”, the smart move would be for the devs to at least give it a check. Thus, yes, they should lower their burden of proof as well.

tl;dr: I can’t call the data “proof”, but I can definitely say there is enough of a trend that it needs to be double checked on the dev side, and if something is wrong and it gets to the level of “proof” before it is fixed, that is a slew of negative attention that they don’t need piled onto it just being wrong in the first place. If it were me sitting on the other side of the fence, I’d be pushing to check this out asap, on my own time if I had to, because of the implications of continued repeated problems and what that means for the game’s image and how it affects overall discourse.

cyberkiwi · August 17, 2020, 2:42pm

You read it perfectly right. 5/148 is pretty abysmal
I am captive to logic, and your reasoning is perfectly sound - I acquiesce.

I am a fan of data, and I agree the burden should not be on players to find proof. But on a side-note, my data gathering exercise wasn’t aimed at proving anything - in fact I started it before the 10% drop rate was disputed.
I just want data to look at and play with, to reverse engineer what the devs won’t publish.

actreal · August 17, 2020, 2:46pm

That’s what pretty much everyone else who collects data on GoW is doing too.

It’s just that sometimes we reverse engineer stuff that the devs don’t yet know was not going to plan.

Magnusimus · August 17, 2020, 5:44pm

To add to the pile, though I don’t have data regarding when gnomes did/did not drop:

Personally killed enough gnomes to get 5 free Vault Keys from the tracker. Got 6 or 7 dropped from the gnomes I killed.

Even less robust stats (but I still think worth mentioning): The Unforgiven Family in total (6 guilds) got hundreds-of-thousands worth of trophies played last week; even if we assume only a fifth of these were gathered during the weekend for eligible gnome-generating gameplay, it’s pretty crazy that only 1 or 2 people (to my knowledge) pulled an Epic Vault Key across our guild family.

Could be we have a lot of unreported keys, but we’re pretty communicative, so I’m thinking that’s unlikely

Jefferson · August 17, 2020, 10:09pm

I can post some other numbers from members in SoA.

57 vault keys 0 epic
51 vault keys 0 epic (mine previously posted)
44 vault keys 0 epic
32 vault keys 0 epic

As a guild only 2 people got an Epic Vault key.
Not everyone reported their exact vault key totals, but from the ones that did report all their keys we farmed 497 keys and 2 of them were Epic Vault Keys.

Graeme · August 17, 2020, 10:53pm

30 vault keys 0 epic for me.

Only 1 person in our guild reported finding an EVK at the weekend.
Don’t have the data from any other members but we are a very active and communicative guild so would have expected more if the ‘10%’ was correct.

cyberkiwi · August 17, 2020, 11:02pm

Many thanks @Graeme, @Jefferson, @Magnusimus for your feedback.
Sorry that I haven’t added them to my stats since the data’s quite light on details. I might revisit that later, but it looks like 11 or 15 samples won’t make a difference with this one.

EraserParticles · August 17, 2020, 11:03pm

Made my post on the wrong thread! Serves me right for browsing on mobile in multiple tabs. I deleted the last one to move it here, apologies for the confusion. Mine (I didn’t post mine to the guild discord) looked like:

10.5k trophies earned during vault weekend (explore d7, so over 5k battles at 2 trophies per)
Nearly 600 gnomes from tracker (fell asleep before getting the exact number)
57 vault keys
14 tracker keys
0 epic vault keys

Pandemic + vault weekend

Jonathan · August 18, 2020, 4:58am

On this note, I think a few of the players that submitted data always collect it, e.g. Dust Angel, and had probably been collecting it before seeing the thread, so they could realistically be treated as pre-selected samples.

Fleg · August 18, 2020, 5:57am

410 gnomes
31 vk (not including pity keys)
0 evk

Neritar · August 18, 2020, 6:03am

182 gnomes
18 vault keys (not including 8 keys from tracker)
0 epic vault keys

Our guild was quite active on the weekend (85k trophies in the previous week), but I’m aware of only 1 player who got an epic vault key; many others got 10-20 vault keys and 0 epic vault keys.

HounganAtLarge · August 19, 2020, 3:07am

78 Gnomes.
5 Vault keys.
0 Epic keys.

Sytro · August 20, 2020, 7:26am

Just one suggestion on the topic.

Vault week probably increased the drop rate of vault keys, but did not increase the drop rate of epic vault keys.

I don’t remember the multipliers (for getting a vault key during a vault event), but it might be that we’re looking at a much lower ratio of EVK vs VK than 1:10 on Vault weeks. Maybe 1:30 or even lower?
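
As a rough illustration of how much better a lower ratio would fit, here is a likelihood-ratio sketch. It uses the pooled 5 epic keys in 303 from earlier in the thread and reads "1:30" as a 1-in-30 chance per key; both the reading and the helper name `likelihood_ratio` are assumptions, and nothing here says what the real multipliers are.

```python
def likelihood_ratio(successes, n, p_alt, p_stated):
    """How many times more likely the observed count is under p_alt than p_stated.

    The binomial coefficient is the same for both rates, so it cancels out.
    """
    failures = n - successes
    return (p_alt / p_stated) ** successes * ((1 - p_alt) / (1 - p_stated)) ** failures

# Assumed pooled sample from earlier in the thread: 5 epic keys in 303.
# Reading "1:30" as a 1-in-30 chance per key is an assumption.
ratio = likelihood_ratio(5, 303, 1 / 30, 1 / 10)
print(f"5/303 is about {ratio:,.0f} times more likely at 1 in 30 than at 1 in 10")
```

A large ratio only says 1 in 30 explains the sample better than 1 in 10; plenty of other rates would fit about as well.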

Planet13 · August 20, 2020, 1:13pm

Thank you Dust_Angel

Mithran · September 8, 2020, 9:32pm

With another Vault event fast approaching, it would be great if we could get some people stating ahead of time their willingness to record and post data so we can a) eliminate as much sampling bias as possible and b) see if the trend from last month holds.

covertmuffin123 · September 8, 2020, 9:38pm

I plan to do this, and was going to make a thread asking people to precommit to recording data. I guess we could just do it here.

Since gnomes are automatically tracked by the game, we really just need people to record keys/epic keys. Total battles doesn’t seem relevant because the gnome rate itself isn’t in question.

Fleg · September 8, 2020, 11:40pm

Imho it’s already confirmed from the above data collections that it’s not 10%. If the coming weekend shows a drop rate more in line with what we expect, then for sure it means a stealth fix from the devs, with no admission of error in the 1st vault event of course.