Server Issues - An Explanation

Hi All,

As you all know, the servers had some errors!

I wanted to share exactly what happened… a TL;DR version, and then a longer version for the tech folks out there, just to keep you all informed.

TL;DR

Our new servers behave differently from our old servers during login… they create a bunch of stuff and don’t delete it… we missed that it was happening, and our database grew too large, ate up all our memory, and caused a problem.

Longer (Technical) Version

We were previously using a service for our servers called Parse. Parse closed down today, but we had successfully migrated off of it… the database was moved off last May to the infamous MongoDB (which is actually pretty awesome… MongoDB errors you have received have actually been an issue with the servers, not Mongo), and the actual servers were moved away over the last 2 weeks to something called “Parse Server” - written by the guys who made Parse to emulate how Parse operates.

What we didn’t realise is that Parse Server creates an object during the LOGIN operation and doesn’t delete it.
Parse also created an object during LOGIN, but automatically revoked it on the next LOGIN.
This was undocumented, and frankly most applications don’t get enough logins for it to matter… GEMS OF WAR ON THE OTHER HAND has rather a lot of users, tens upon tens of thousands of them logging in every day, in many cases multiple times!
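For the tech folks, here is a minimal sketch of the difference described above. It is a toy in-memory model, not Parse or Parse Server code: `SessionStore`, `simulate` and the `revoke_previous` flag are hypothetical names made up for illustration, and the user and login counts are arbitrary.

```python
from collections import defaultdict
from itertools import count


class SessionStore:
    """Toy in-memory stand-in for a sessions collection (hypothetical)."""

    def __init__(self, revoke_previous: bool):
        # revoke_previous=True models the old behaviour described above:
        # the previous session object is revoked on the next login.
        # revoke_previous=False models the new behaviour: nothing is deleted.
        self.revoke_previous = revoke_previous
        self.sessions = {}                # session_id -> user_id
        self.by_user = defaultdict(set)   # user_id -> {session_id, ...}
        self._ids = count(1)

    def login(self, user_id: str) -> int:
        if self.revoke_previous:
            # Drop every session this user still holds before creating a new one.
            for old_id in self.by_user.pop(user_id, set()):
                del self.sessions[old_id]
        session_id = next(self._ids)
        self.sessions[session_id] = user_id
        self.by_user[user_id].add(session_id)
        return session_id


def simulate(revoke_previous: bool, users: int, logins_per_user: int) -> int:
    """Return how many session objects exist after everyone has logged in."""
    store = SessionStore(revoke_previous)
    for _ in range(logins_per_user):
        for u in range(users):
            store.login(f"user-{u}")
    return len(store.sessions)


if __name__ == "__main__":
    print("revoking on next login:", simulate(True, users=100, logins_per_user=5))
    print("never deleting:        ", simulate(False, users=100, logins_per_user=5))
```

With revocation the store levels off at one session per user; without it the count grows with every login, which is the kind of growth that only becomes visible at a large login volume.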

As any of you who work with databases will know, you have a LOT of telemetry to examine to ensure your database is healthy. We have access to all of that and look at it regularly… we’d been a little concerned about the increase in disk space, but hadn’t figured out what was causing it. We thought perhaps it was the new server storing some extra data, and made a note to investigate if it kept growing… and then BAM! We started causing page faults like crazy last night. It turns out that one table, SESSIONS, which we hadn’t looked at because it had never been a problem, had almost 50 million entries, and its indices were filling our memory.
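As a rough illustration of the kind of check that could surface a table like this earlier, the sketch below compares entry counts sampled over time and flags anything that has grown by a large factor. It is not the team's actual tooling: `fastest_growing`, the `min_growth` threshold and every number except the roughly 50 million SESSIONS figure are assumptions made up for the example.

```python
def fastest_growing(samples, min_growth=2.0):
    """Flag tables whose entry count grew by at least `min_growth` times.

    `samples` maps a table name to a list of entry counts, oldest first.
    Returns (name, growth_ratio) pairs, largest ratio first.
    """
    flagged = []
    for name, counts in samples.items():
        # Need at least two samples and a non-zero starting point to compare.
        if len(counts) < 2 or counts[0] <= 0:
            continue
        ratio = counts[-1] / counts[0]
        if ratio >= min_growth:
            flagged.append((name, ratio))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample history; only the final sessions count echoes the post.
    history = {
        "players": [100_000, 101_000, 102_000],
        "sessions": [1_000_000, 5_000_000, 50_000_000],
    }
    for name, ratio in fastest_growing(history):
        print(f"{name}: grew {ratio:.1f}x over the sampled period")
```

The point of comparing ratios rather than raw sizes is that a table nobody normally looks at still stands out once it starts growing much faster than everything else.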

Finally…

We haven’t solved the problem yet… though our tech leads assure us it’s well underway. We’ve added extra memory to our database servers while we fix this over the next couple of days, and we’ll keep you all posted on how it goes.

I apologize that we weren’t around to keep you filled in last night… I’d seen the initial spikes at 2am, done some maintenance, watched them go back down and called it a day. We’ll try to keep a presence online a little more regularly while we’re fixing things this week.

37 Likes

The militancy of some players towards this has been pretty shameful, in my view. The staff should be allowed to sleep some time… this game is free for the majority of players.

However, it’s human nature to get angry and frustrated when we can’t get what we want when we want it. I was at work around 12 hours yesterday, so the server issues only affected me on my 40 minute train home - but if I’d been on leave, or been travelling (I often have 5-6 hours in a day on a train going oop north) I’d have been pretty frustrated at the lack of gems. Angry enough to cuss the team and threaten to quit? Dunno.

Surely this game is now big and successful enough to justify the cost of increased tech support, and overnight support. Without the creative lead spending his nights doing it.

17 Likes

:sweat_smile:

It does seem they’re working on increasing their staff numbers. (E.g. Saltypatra)
How many are going into tech/support staff? Dunno, but I wouldn’t be surprised if their numbers have gone up from even a month ago.


Thanks for the update Sirrian! I’ve taken the liberty of quoting you for the Steam side of things.
Hope things can be resolved shortly for all your sakes.

I can only imagine you’re speaking about me, (in addition to others) but I’d really appreciate it if you didn’t misquote and exaggerate what I write while you’re busy kissing ass. I never said the staff shouldn’t sleep; far from it. I stated that they should have someone scheduled for the time when the rest of the team is off which is pretty standard stuff.

Thanks, @Sirrian; that is all I’ve been wanting and I appreciate you addressing it. Fingers crossed it’s not necessary and everything goes smoothly with the rest of the database work.

4 Likes

Oh Sirrian :slight_smile:

Nothing really happened :slight_smile:

Nothing which could not be fixed by giving free mythic to everyone :slight_smile: (I would appreciate Pharos Ra)

4 Likes

Mythics aside, I won’t be surprised if that’s the reason they created that Mongo fellow, so they can give him as compensation next time the servers misbehave.

3 Likes

Technical City - where all developers live as units, plus several bugs as monsters. Imagine Sirrian as a legendary troop, with banhammer as a legendary skill, and a skill called “Login Issues” as a theme:
“Login Issue” - affected troop is “AFK” - it disappears from the team until the effect vanishes. If the other troops are killed while the unit is AFK, you lose.

1 Like

thank you for the explanation, feels better to see you guys care about us enough to share the info :slight_smile:

the positive aspect of those server issues is/was - i got to actually try a daemon summoning deck xD if not for the errors i’d never have considered it haha even with this event

i hope you will fix it soon :slight_smile:

Even though the server was a pain in the ass yesterday, I managed to get 150 snot stones. I just did the last 50 now. Thanks for the explanations, and the game has been great here since I woke up.

1 Like

@Sirrian thank you for your and the rest of the dev team’s tireless effort to keep those 10’s and 10’s of thousands of us crazy players that can’t get through a day without matching those pretty little circles. I appreciate everything you bunch do, just as so many others do.

Keep up the good work. :thumbsup:

3 Likes

@Sirrian this is not exactly my area of expertise, and I do not know which parse.com functionality you used or what your future plans are, but just in case you have not seen it, here is a list of alternatives: GitHub - relatedcode/GraphQLite: Rapid GraphQL prototyping, development, and testing. Core Server, Admin Console, iOS SDK, Web SDK.

1 Like

Wait… You’re only now using MongoDB? What were you using before? Or was that hidden behind Parse, so you never had to deal with it directly?

Just want to add my thanks for the team’s hard work and dedication. If we didn’t love this game, we wouldn’t really care, so KUDOS!

2 Likes

I agree. Some users act extremely childish here, entitled even. Sure, it sucks when servers go down, but my god, if you have any idea or experience with online games at all, you’d know that the GoW devs handled it pretty well and fix stuff quickly. Honestly, devs, don’t beat yourselves up over it. Lots of people nowadays like to get outraged at every single little thing that goes wrong.

2 Likes

Like someone raging about being kicked from a guild because they weren’t playing?

8 Likes

Serious stuff :slight_smile:

1 Like

3 Likes

I work 3rd shit computer operations, so I’m there to look out for the scary night problems that crash databases and websites. Which means I surf the Web and play gems when nothing is broke :slight_smile:

When stuff does break :frowning:

1 Like

Lol not sure if typo or perfect description of 3rd shift

3 Likes