How did so many Dungeon Crawl: Stone Soup players miss such an obvious bug?
Gunax 2021-08-17 08:06:48 +0000 UTC [ - ]
The game is so noisy that no one can really predict what will happen. If I hit the grue with my sword--it could reasonably do 10% of damage or 40%--and both seem plausible.
In a statistical sense, the greater the standard deviation is, the more difficult it is to detect a sample population that differs from the established distribution.
In a Bayesian sense, the players may have been perfectly rational--doubling or tripling your winrate is just not that unlikely when you're playing a game which is very random. Perhaps players can rationally conclude they were in the top 1% of luck at the moment, and that the chance of a major bug is less than that.
As evidenced by the referenced posts, it only becomes obvious when the sample size is huge (e.g. the sample of all players reading the subreddit).
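To put a rough number on the standard-deviation point, here is a minimal sketch using the textbook sample-size formula for a one-sided z-test with known standard deviation. The damage figures, the 5% false-alarm rate and the 80% power target are assumptions chosen for illustration, not anything measured from Crawl.

```python
import math
from statistics import NormalDist

def required_samples(sigma, delta, alpha=0.05, power=0.80):
    """Approximate number of observations needed to detect a shift of
    `delta` in the mean of a quantity whose standard deviation is `sigma`
    (one-sided z-test, known sigma)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # false-alarm threshold
    z_beta = NormalDist().inv_cdf(power)       # desired detection power
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical numbers: trying to notice 5 extra points of damage per hit.
for sigma in (5, 10, 20):
    print(f"sigma={sigma}: about {required_samples(sigma, 5)} hits needed")
```

The required sample grows with the square of the standard deviation, so doubling the noise roughly quadruples the number of hits a player would need to observe.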
ajuc 2021-08-17 09:51:40 +0000 UTC [ - ]
You are trained for years to NEVER BLAME THE LOWER LAYERS because that makes you a worse programmer. So when it is actually their fault - you might look stupid for exhausting every other possibility before checking these basic assumptions.
But this is still a good rule, because it breaks very rarely.
Retric 2021-08-17 10:33:56 +0000 UTC [ - ]
ajuc 2021-08-17 10:43:21 +0000 UTC [ - ]
It bricked all our devices in Germany.
We worked around it, notified the device driver developers and forgot about it.
A year later we shipped with a new version of the kernel and it broke again - because the bug was fixed in the kernel and our workaround was now causing the bug :)
hinkley 2021-08-17 19:34:43 +0000 UTC [ - ]
jerf 2021-08-17 13:33:58 +0000 UTC [ - ]
In this case, while I agree that there was enough data to have a good guess that something had gone wrong after a certain period of time, it definitely is the sort of noisy data that would take some non-trivial aggregation to be sure, and even in hindsight I'm not so sure the signal was so strong that it's worth metaphorically beating anyone up about.
Moreover, only in hindsight is it obvious that the culprit was a doubling of melee damage. Prior to knowing that, the learning rate was constrained by the multiplicity of possible theories that could have explained this. It's not like it was simply a matter of "Either 1. the game is unchanged and people have just Got Gud or 2. the melee damage rate has been doubled", the span of possibilities was much larger than that. Had an oracle come down from wherever and offered that choice to the devs, sure, gathering enough data to resolve that one bit's worth of information would have been easy. But presented with the full, massive array of possibilities, it's a much harder task to even determine that there is a problem, let alone what it is.
Upshot, don't beat yourselves up too much. It may be obvious in hindsight but these are really quite hard problems to solve in the deliberate absence of information.
eutectic 2021-08-17 11:22:57 +0000 UTC [ - ]
gnramires 2021-08-17 12:28:21 +0000 UTC [ - ]
> Parthenocarpy:
> Well that explains me going from a 2% winrate to 17.95% in the span of two weeks
In that case, as soon as you got more than 1 win you should have noticed an abnormality (by a back-of-the-napkin frequentist estimate, about 8 trials are enough to perceive the difference).
Also, that assumes you only get information from win/loss. In reality, every interaction (monster damage) should provide information (too easy kills, abnormal damage).
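One way to reproduce that kind of napkin estimate is a plain binomial tail probability. The sketch below assumes each game is an independent trial and reads "more than 1 win" as at least 2 wins in 8 games; both are interpretations, not something stated in the quoted post.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))

old_rate, new_rate = 0.02, 0.1795  # winrates quoted from Parthenocarpy's post
games = 8                          # assumed number of games played

# How surprising are 2+ wins if nothing changed, and how likely after the change?
p_if_unchanged = prob_at_least(2, games, old_rate)
p_if_changed = prob_at_least(2, games, new_rate)
print(f"2+ wins in {games} games at {old_rate:.2%}: {p_if_unchanged:.3f}")
print(f"2+ wins in {games} games at {new_rate:.2%}: {p_if_changed:.3f}")
```

Under the old winrate, two wins in eight games is a roughly one-in-a-hundred event, while under the new rate it happens a large fraction of the time.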
eutectic 2021-08-17 15:11:05 +0000 UTC [ - ]
Agreed that win rates are not the only data available, and that players probably should have noticed the difference. Attributing it to a bug is arguably a harder problem.
cridenour 2021-08-17 03:19:14 +0000 UTC [ - ]
This might be my favorite way to describe programming.
kibwen 2021-08-17 05:26:13 +0000 UTC [ - ]
notanzaiiswear 2021-08-17 09:14:18 +0000 UTC [ - ]
thaumasiotes 2021-08-17 10:06:51 +0000 UTC [ - ]
Tom4hawk 2021-08-17 11:31:00 +0000 UTC [ - ]
So.. technical debt and "hacks" ?:D
tygrak 2021-08-17 10:05:55 +0000 UTC [ - ]
gerdesj 2021-08-17 09:33:56 +0000 UTC [ - ]
"Anthill Inside" 8)
solarmist 2021-08-17 21:11:55 +0000 UTC [ - ]
On some computers there's a mysterious file that has all of the (world/universe's)? object variables, such as a person/object's location or height, age, etc, and magic is performed by manipulating those via verbal macros.
New spells are researched by literally programming and debugging.
It's a fun series, but it feels like the author's done about as much as he can with it.
wccrawford 2021-08-17 15:53:41 +0000 UTC [ - ]
It was quite entertaining. But most of what he did was either off-the-cuff, or the book hid the hours and hours of testing and research so that it wouldn't bore the reader.
Unfortunately, I couldn't find the name of the book in my reading history.
notanzaiiswear 2021-08-18 09:43:28 +0000 UTC [ - ]
wccrawford 2021-08-18 12:02:37 +0000 UTC [ - ]
Apocalypse: Generic System (Systems of the Apocalypse Book 1)
I haven't read the sequel yet, though.
whatshisface 2021-08-17 15:09:11 +0000 UTC [ - ]
notanzaiiswear 2021-08-17 16:52:52 +0000 UTC [ - ]
And in HP people have the innate talent for magic.
whatshisface 2021-08-17 19:56:57 +0000 UTC [ - ]
vidarh 2021-08-17 11:42:34 +0000 UTC [ - ]
jholman 2021-08-17 06:45:48 +0000 UTC [ - ]
"Hey. Don't oversimplify. First you gotta hammer them flat and trap lightning inside them."
meowster 2021-08-17 04:16:58 +0000 UTC [ - ]
teruakohatu 2021-08-17 04:50:23 +0000 UTC [ - ]
tsimionescu 2021-08-17 09:39:05 +0000 UTC [ - ]
kaibee 2021-08-17 14:05:09 +0000 UTC [ - ]
p1necone 2021-08-17 06:05:27 +0000 UTC [ - ]
ben_w 2021-08-17 07:35:53 +0000 UTC [ - ]
fishtoaster 2021-08-17 06:27:21 +0000 UTC [ - ]
kleinsch 2021-08-17 03:26:39 +0000 UTC [ - ]
There’s an element of luck in roguelikes too. It’s part of what makes both types of games compelling. The outcome is uncertain when you sit down to play.
So are there psychological traits that make players likely to miss this bug? Maybe. But on an individual basis, without aggregated winrate data, how would I know it’s not just a lucky winstreak? Especially when I’m playing a style of game where that’s the point?
thaumasiotes 2021-08-17 03:53:18 +0000 UTC [ - ]
A lot of historians and archaeologists would be ecstatic if this were actually true.
rozab 2021-08-17 11:42:55 +0000 UTC [ - ]
thaumasiotes 2021-08-17 12:40:48 +0000 UTC [ - ]
Jarwain 2021-08-17 05:27:01 +0000 UTC [ - ]
slim 2021-08-17 05:45:17 +0000 UTC [ - ]
goodcanadian 2021-08-17 08:37:47 +0000 UTC [ - ]
tsimionescu 2021-08-17 09:49:47 +0000 UTC [ - ]
thaumasiotes 2021-08-17 09:34:58 +0000 UTC [ - ]
soared 2021-08-17 03:55:44 +0000 UTC [ - ]
Having only had this job for a couple months I honestly cannot tell if the input -> output is just very very muddy, or if the documentation is wrong and the models all behave differently. Similar to Crawl I can't see behind the curtain.. my inputs are generic "factor 1", "factor 2" and my outputs have more meaning but are subject to unknowable fluctuations from outside sources.
> holding the system constant enough that you can evaluate your changes over time
I do not know if I can learn from the system because I'm not sure if it's trustworthy. Good thing there are directions on how to test it included in the post :)
eeegnu 2021-08-17 05:07:20 +0000 UTC [ - ]
[1]: https://github.com/crawl/crawl/commit/ab847a317b82e2fb0316bf...
ballenf 2021-08-17 09:54:58 +0000 UTC [ - ]
That is, the only way you can get better at a game like this is to assume that the system is stable (otherwise you'll attribute failure to bad luck instead of bad strategy).
> It’s also important to avoid refuting observations by appealing to system definitions. If someone says “This feels like it’s doing more damage than it says”, resist the impulse to say “Nope, it does exactly this value; it’s written right here.” Instead, try to design an experiment that would prove whether the value written in the system is correct. If designing the experiment is very hard, that should be interpreted as a risk factor - if no one can check that the system is doing what it ought to, then maybe it really is wrong! Play yes-and with systems skeptics, letting them invest their time into correspondence work if they think something is wrong. Or be the system skeptic yourself, if a certain observation sits wrong with you.
That skill is a superpower in software development when dealing with complex systems that behave unexpectedly.
throwawaygal7 2021-08-17 15:38:53 +0000 UTC [ - ]
Simplified combat and magic, removal of extraneous skills, crippled food system... soon no doubt they'll try to 'balance' the mutation system, which is the last of these old school roguelike features to still maintain a large presence.
I think the dispute is generational... I love crawl and had some modest victories but never defined myself around it. To me it was a distraction and just a fun game to sink time into in undergrad - not something to be relentlessly optimized.
kibwen 2021-08-17 17:25:33 +0000 UTC [ - ]
zijoud 2021-08-17 05:03:19 +0000 UTC [ - ]
I was only playing the game for a week before I had a "streak" of wins, which is something you shoot for. And it was all during a tournament. I felt cool then, but not now.
personjerry 2021-08-17 02:23:31 +0000 UTC [ - ]
Supermancho 2021-08-17 02:29:05 +0000 UTC [ - ]
The article leads with a clickbait question.
The reason they didn't notice is that the stats showing the sharp uptick in wins aren't readily available and public to everyone, and the game is filled with opaque systems. No player playing for the first time would notice or have a chance to notice.
To put it another way, this is an article about a (secret and mistaken) massive change to a very difficult game that made it slightly less difficult. Same article turns around and asks why nobody noticed, like that's an interesting question.
kibwen 2021-08-17 05:44:21 +0000 UTC [ - ]
The stats have always been public and instantly available, impressively so. Here's the bot command to query the stats in question:
!lg * start>"2015-03-06 00:00:00" start<"2015-03-21 00:00:00" cv=0.16 / won
Output: 1180/39437 games for * (start>'2015-03-06 00:00:00' start<'2015-03-21 00:00:00' cv=0.16): N=1180/39437 (2.99%)
So that's a 3% winrate for the major version in question during the dates the bug existed. Using another query to determine the all-time winrate across all versions:
!lg * / won
We see the historic winrate to be about 1%. The listgame interface is documented at https://github.com/crawl/sequell/blob/master/docs/listgame.m... , and the bots can be found on IRC and Discord.
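For a sense of how strong that aggregate signal is, here is a rough sketch that plugs the query output above into a normal-approximation z-score. It treats "about 1%" as exactly 0.01 and treats every game as an independent trial; both are simplifying assumptions.

```python
import math

wins, games = 1180, 39437  # from the !lg output quoted above
baseline = 0.01            # assumption: "about 1%" taken as exactly 1%

def z_score(wins, games, p0):
    """Standard deviations between the observed win count and the count
    expected under baseline winrate p0 (normal approximation to the binomial)."""
    expected = games * p0
    std_dev = math.sqrt(games * p0 * (1 - p0))
    return (wins - expected) / std_dev

observed_rate = wins / games
z = z_score(wins, games, baseline)
print(f"observed winrate: {observed_rate:.2%}")
print(f"z-score against a {baseline:.0%} baseline: {z:.1f}")
```

At this sample size the deviation is dozens of standard deviations, which is why the change is unmistakable in the pooled data even though any single player's handful of games says very little.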
thaumasiotes 2021-08-17 03:41:59 +0000 UTC [ - ]
No, lots of players noticed. The question is phrased as "how did the players miss this?", but that's not what the author means - what he means is "why didn't the community come to a consensus on what had happened within two weeks of the introduction of the change?"
Which is a much stupider question.
mandmandam 2021-08-17 07:54:55 +0000 UTC [ - ]
I don't think it's a stupid question at all; not when the effect is so large. It was literally double what it should be, leading to a 3x overall win rate.
Coming from Dota 2, if there was a bug like this it would be caught within minutes, if not seconds, and patched within an hour or two.
- Yes, the Dota player base is orders of magnitude larger, and the damage numbers are easily available. Also I have never played Crawl.
tsimionescu 2021-08-17 10:00:41 +0000 UTC [ - ]
The article also brings evidence that many players didn't notice or believe that anything changed. They specifically believed that nothing did change, which is a surprising belief for such a huge change - but understandable in the context of such an opaque system. I also believe that other players noticed something had made the game easier, but no one realized what specifically had changed, or most likely suspected how big of a change it was.
thaumasiotes 2021-08-17 10:11:39 +0000 UTC [ - ]
Not really. The article cites responses in online discussion to that effect. But there is no evidence that those responses came from someone who had ever played the altered version of the game. With certainty, many of them didn't. You don't have to do a playthrough on the latest version before leaving a new comment in a forum thread.
It's a safe bet that the majority of the player base never encountered the bug at all, since it was only available if you downloaded the game within a two-week window.
djmips 2021-08-17 02:56:22 +0000 UTC [ - ]
hnxs 2021-08-17 02:33:57 +0000 UTC [ - ]
cortesoft 2021-08-17 03:12:28 +0000 UTC [ - ]
muzani 2021-08-17 04:14:13 +0000 UTC [ - ]
How would someone design tests that work well for roguelikes? It's not simply Human strikes Orc for 10 damage. Damage range may be a bell curve of 5-15. There's misses and hits. There's little bonuses that increase the miss probability, probably to ridiculous levels. There's armor, calculated from things like race, constitution, magic, divine bonuses, class mastery bonus.
And then you have procedurally generated... stuff. Equipment is the easiest to control. Some games have procedurally generated monsters, some have procedurally generated deities. To be able to reduce this stuff to tests kind of defeats the purpose of doing it, which is to be unpredictable.
Sometimes you have a combo of things that add up to 40% damage resistance. Is 40% too much or too little?
After all this, how do you detect that the bugged spear is doing an unreasonable amount of damage?
personjerry 2021-08-17 04:28:55 +0000 UTC [ - ]
muzani 2021-08-18 07:30:21 +0000 UTC [ - ]
camtarn 2021-08-17 13:39:06 +0000 UTC [ - ]
Wouldn't detect extremely improbable things, though, like having a spear with a super-rare effect + having armour with another super-rare effect, since those would show up so infrequently that they wouldn't skew the overall stats much.
a_e_k 2021-08-17 08:00:04 +0000 UTC [ - ]
and/or
(b) Have each test loop enough to get a good estimate of the probability distribution.
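A minimal sketch of what option (b) could look like in practice: loop a damage roll many times with a fixed seed and check that the sample mean stays within a few standard errors of the designed mean. `roll_damage` and its uniform 0..max distribution are hypothetical stand-ins, not Crawl's actual damage formula.

```python
import math
import random

def roll_damage(rng, max_damage):
    """Hypothetical damage roll: uniform integer in [0, max_damage]."""
    return rng.randint(0, max_damage)

def buggy_roll_damage(rng, max_damage):
    """The same roll with a doubling bug applied, for illustration."""
    return 2 * roll_damage(rng, max_damage)

def mean_is_plausible(roll, max_damage, trials=20_000, sigmas=5.0, seed=12345):
    """Return True if the sample mean of `roll` lies within `sigmas`
    standard errors of the mean of a uniform roll on [0, max_damage]."""
    rng = random.Random(seed)  # fixed seed keeps the test reproducible
    total = sum(roll(rng, max_damage) for _ in range(trials))
    sample_mean = total / trials
    expected_mean = max_damage / 2
    # Variance of a discrete uniform distribution on 0..n is ((n + 1)**2 - 1) / 12.
    variance = ((max_damage + 1) ** 2 - 1) / 12
    std_error = math.sqrt(variance / trials)
    return abs(sample_mean - expected_mean) <= sigmas * std_error

print("correct roll passes:", mean_is_plausible(roll_damage, 10))
print("doubled roll passes:", mean_is_plausible(buggy_roll_damage, 10))
```

The wide tolerance keeps the check from flaking on ordinary randomness, yet a doubling bug moves the mean far outside it. A check on the mean alone would miss bugs that change only the shape of the distribution, so a fuller test would also compare variance or a histogram.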
muzani 2021-08-18 07:33:49 +0000 UTC [ - ]
nimih 2021-08-17 04:56:01 +0000 UTC [ - ]
soared 2021-08-17 04:22:23 +0000 UTC [ - ]
dota_fanatic 2021-08-17 03:46:34 +0000 UTC [ - ]
Part of the joy of being human is understanding systems and working with them efficiently to achieve goals. Good luck doing that with crawl. You can sort of achieve this by dumping a lot of time and building up intuition via experience, but you still won't be able to say why you chose this over that for all your choices.
I found myself getting into scenarios, wondering, which choice is better along one particular dimension? STR vs DEX? This armor vs that? What are the trade-offs? The only real way to find that out would be to dump the game at its current state and then simulate both choices against a suite of enemies to see how it plays out at least in 1v1 situations, never mind when handling many enemies. It's frustrating playing a game with so many numbers when the output of the systems they feed into is so opaque.
It's really unfortunate because although it already feels like a gem of a game, it could be soo much better if the interplay of the various combat / character systems wasn't so ridiculously complected. I would love to see some of the sensibilities of the Factorio team applied to crawl's UI and underlying combat/character systems.
eeegnu 2021-08-17 04:48:26 +0000 UTC [ - ]
dota_fanatic 2021-08-17 13:43:55 +0000 UTC [ - ]
With DCSS? Well... how much damage you do is going to depend on whether or not you're wearing a shield (what size?), what kind of armor you're wearing if any, how much you've trained for that weapon class, how much STR you have, what bonuses it has, and what bonuses you have from gear. Then RNG will play a significant role on top of that. Maybe RNG applies twice, once for your swing, once for their defense? It's been too long, I can't say confidently.
I realize it's challenging for the interface design because everything is already tight but I'm sure there's a solution. If I equip this armor/weapon/jewelry, how fast will I be attacking compared to now? What will my damage distribution look like before enemy resistances? Same with allocating stats, please give feedback in the form of derived data that is closer to the thing I care about (surviving).
harpiaharpyja 2021-08-17 14:51:24 +0000 UTC [ - ]
kibwen 2021-08-17 17:10:24 +0000 UTC [ - ]
tsywke44 2021-08-17 08:37:26 +0000 UTC [ - ]
dota_fanatic 2021-08-17 14:00:14 +0000 UTC [ - ]
Don't get me wrong, there's a lot of gold in those hills and I had a lot of fun getting better and getting my first win. But after that I was very frustrated with how hard it was to just stop at a fork in the road and try to answer, which choice is objectively better? I like difficult games that are fair and require skill growth to succeed in, but I want to be able to understand them so I can make informed choices as I get better as well. Too much: because that's what the black box determined. sorry, you died. should have been more cautious.
gambiting 2021-08-17 12:28:51 +0000 UTC [ - ]
I had the same experience in Caves of Qud(which is great btw) - there are some builds that really really really work, and others which are unwinnable.
spywaregorilla 2021-08-17 15:27:03 +0000 UTC [ - ]
There are easier builds and there are harder builds, but if you aren't good, you will lose. Case in point, the bug described here effectively doubled damage output and win rates were still pretty low.
kibwen 2021-08-17 17:18:49 +0000 UTC [ - ]
lmohseni 2021-08-17 13:16:40 +0000 UTC [ - ]
slingnow 2021-08-17 15:58:46 +0000 UTC [ - ]
This would be like attempting to play Go, and complaining that it's nothing like Chess because the optimal move generally isn't obvious even after 10,000 hours of play. And then loosely tying it into "part of the joy of being human" instead of simply stating that "this is my preference".
dota_fanatic 2021-08-17 19:19:21 +0000 UTC [ - ]
Go read their manifesto: https://github.com/crawl/crawl/blob/master/crawl-ref/docs/cr...
Specifically the section titled "Clarity".
> Things ought to work in an intuitive way.
Spoiler alert, they most certainly do not. There's nothing intuitive about being completely unable to know the practical effects of something as simple as equipping a piece of armor or a shield or allocating a point to STR vs DEX. I gave an example in another comment of how other games do provide information when comparing between options so you can intuit their effects. I have faith they'll get there, though. I believe they'll gain many more players if they can, too. :) I'm sure I'll be back in a year or two after they improve the gameplay more.
gnramires 2021-08-17 16:27:48 +0000 UTC [ - ]
In reality, both Chess and Go and all board games are fairly simple, you can learn all the rules in a few sittings and then just focus on strategy.
In roguelikes, often learning the system is the point (or a point) of the game. The fun, at least to me, is in dealing with the unknown and discovering how it works and what works.
Nethack (a classic roguelike) takes it to the extreme -- a bit too much in fact, because an essential game mechanic is an obscure passage you'd have to deduce (Elbereth). But the fun is there: you grab a potion and have no idea what happens if you drink it -- maybe you will be killed instantly, maybe it will make you into a god -- it's not just "which of those effects am I going to get?". You have to rely on previous knowledge, as well as expectations from similar games and what you know about the game.
But isn't that quite common IRL? Like, okay, figuring out what you should be eating has a science to back it (albeit quite extensive): you read about the most effective diets, what risks each reduces, your health, your weight, and need to make a decision. Already not a very simple strategic decision. And then you need to take into account satisfying your tastes, how often diet fads reverse (is eating butter healthy? go figure), how to do it when sharing a meal with other people, and so on. The unknowns are as significant as skillful strategic planning. The same could be said of running a company: you need to manage uncertainties in a vast decision space, with opportunities like calling for partnerships, merging, buying other companies, spending on uncertain research, and much more. You slowly acquire domain-specific knowledge (with help from the vast literature, plus a few insights of yours) and use that to in turn reduce and manage uncertainties and plan ahead (here your model of other players and capability to find the 'best move' -- least risky, greatest payoff, more robust -- will shine).
hinkley 2021-08-17 19:27:47 +0000 UTC [ - ]
2021-08-17 09:07:22 +0000 UTC [ - ]