The summer of 100K

Srynerson

Senior Member
Joined
Jun 4, 2011
Messages
2,605
Reaction score
367
Location
Denver
Country
United States
I am not sure mate. Making errors / omissions is part of a legit game - A.2

He who makes the fewest mistakes is in better shape. No? :)
That would make sense if there were so many playings for a scenario that we could be confident errors/omissions were distributed roughly equally between players reporting on each side, but that's not the case for a lot of scenarios. Also, errors/omissions are probably not randomly distributed, and those errors that are more common than others may disproportionately impact one side or another in a scenario.
 

volgaG68

Fighting WWII One DR At A Time
Joined
Jun 15, 2012
Messages
3,212
Reaction score
1,549
Location
La Crosse, KS
First name
Chris
Country
United States
Welcome back. Haven't seen you around in a while!
That would make sense if there were so many playings for a scenario that we could be confident errors/omissions were distributed roughly equally between players reporting on each side, but that's not the case for a lot of scenarios. Also, errors/omissions are probably not randomly distributed, and those errors that are more common than others may disproportionately impact one side or another in a scenario.
I see it this way. All of the reputable publishers use a fairly in-depth play-testing process. MMP and BFP even take it to the point of perfectionism, and will continue to iron out wrinkles months after we think the item should have already been released. They finally release the item and scenario X2 flashes onto ROAR with a 10-2 record. I immediately think that there is something the players out in ASL-Land are obviously not seeing. PT'd that many times, by that many people, of that many levels of capability... some of the customers are just missing something. Yes, months after a release, sometimes someone will find a quirk that allows one to 'break' the scenario. Yet I cannot believe a glitch was found that quickly by that many people straight out of the gate; one that completely eluded the stable of PTers. The PTers do get the private ear of the designer, though, and if something is misunderstood or unappreciated, they can be corrected and have at it again.

Then one follows the AARs, questions, and tourney reports, and sees that more than one player (some top-notch) has failed to correctly grasp the VC or an SSR, or to appreciate a tactic the designer almost requires for a win. A "10-2 out of the gate" scenario is actually one of my favorite types to play, because I want to try and figure out what the new customers are not seeing, but what the PTers did see and saw well enough to judge it balanced. Invariably, someone pipes in with the solution: "use a fallback defense centered on this landmark", "all of your AFV are expendable, VBM-freeze him to death", or "deploy as much as possible each turn and it moves the balance back towards 50/50".

That is why I so rarely 'believe' the initial accounts of new scenarios being dogs. In fact, it is why I believe dogs are such an actual rarity, especially from perfectionists like MMP and BFP. Just my two cents...
 

Philippe D.

Elder Member
Joined
Jul 1, 2016
Messages
2,132
Reaction score
1,393
Location
Bordeaux
Country
France
It allows you to "enter the balance used by each side," isn't that the same thing?
Yes, you're absolutely right; I just forgot because I hadn't submitted a game in quite some time. But then, if it allows you to _see_ this statistic, I really don't know how - so I'm not sure what the point of collecting the info is.
 

jrv

Forum Guru
Joined
May 25, 2005
Messages
21,998
Reaction score
6,206
Location
Teutoburger Wald
Country
Iceland
Yes, you're absolutely right; I just forgot because I hadn't submitted a game in quite some time. But then, if it allows you to _see_ this statistic, I really don't know how - so I'm not sure what the point of collecting the info is.
When you look at the page for the individual scenario you will see the wins/losses broken down by balance used. The scenario name on one of the list pages is also a link to the individual page. This is the individual scenario page for "Fighting Withdrawal". I am not aware of a good way to put this on a list page to make it comprehensible, and some players have complained that they don't understand the individual scenario page either.

For "Fighting Withdrawal", as of this moment, the scenario was played 11 times with the Russians receiving the balance and the Finns not (2 Russian wins to 9 Finnish wins), 6 times with the Finns receiving the balance and the Russians not (5 Russian wins to 1 Finnish win), and 3 times with both sides receiving the balance (1 Russian win to 2 Finnish wins). The remaining 317 playings were reported with no balance, with 162 Russian wins, 153 Finnish wins, 1 tie and 1 No Result reported.

I wonder about the reports where both sides receive the balance in this particular scenario, but ROAR allows such an entry because it is occasionally the recommended way to play some scenarios. Similarly, a tie is not a result listed on the card for "Fighting Withdrawal." That may be either a mistaken entry, or the players may have agreed to the tie. The "No Result" is available to allow players to record a playing that couldn't be finished in a satisfactory way (ran out of time, the dog knocked over the game, etc.), was badly botched (boards set up the wrong way), or had a similar ending that could not be classified as a win for either side.

For all practical purposes, most or possibly all scenarios are played without any balance.
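To put rough numbers on that, here is a minimal Python sketch that tallies the "Fighting Withdrawal" counts quoted above by balance category. The dictionary layout and the function name `share_by_balance` are made up for illustration; this is not anything ROAR itself provides.

```python
# Counts quoted above for "Fighting Withdrawal":
# (Russian wins, Finnish wins, ties, no result) per balance category.
playings = {
    "Russian balance only": (2, 9, 0, 0),
    "Finnish balance only": (5, 1, 0, 0),
    "both balances": (1, 2, 0, 0),
    "no balance": (162, 153, 1, 1),
}


def share_by_balance(rows):
    """Return each category's share of all reported playings."""
    total = sum(sum(counts) for counts in rows.values())
    return {label: sum(counts) / total for label, counts in rows.items()}


for label, share in share_by_balance(playings).items():
    russian, finnish, ties, no_result = playings[label]
    print(f"{label:22s} {russian:3d}-{finnish:<3d} {share:6.1%} of playings")
```

On these figures, 317 of the 337 reported playings (about 94%) used no balance at all.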

JR
 

bprobst

Elder Member
Joined
Oct 31, 2003
Messages
2,532
Reaction score
1,437
Location
Melbourne, Australia
First name
Bruce
Country
Australia
I would say that a player should only enter a play on ROAR if they are fairly confident that they played the scenario without making any significant errors or omissions with regard to the rules of the game in general, plus the SSRs and VCs for the scenario
Well, firstly, you're wrong. ROAR needs the raw numbers to work. The totality of the raw numbers erases the individual foibles of the individual games. If a scenario is played 100 times, and 10 of them were played "weirdly", who cares?

Secondly, your method requires that no-one ever record a game, ever. No living individual (nor any dead ones, I would wager) has ever played a game of ASL without fucking something up. So if you think that affects the results, then congratulations: all of the results are affected equally.
 

Michael Dorosh

der Spieß des Forums
Joined
Feb 6, 2004
Messages
15,733
Reaction score
2,765
Location
Calgary, AB
First name
Michael
Country
Canada
I would say that a player should only enter a play on ROAR if they are fairly confident that they played the scenario without making any significant errors or omissions with regard to the rules of the game in general, plus the SSRs and VCs for the scenario
Does that ever happen?
 

boylermaker

Senior Member
Joined
Jan 22, 2012
Messages
581
Reaction score
526
Location
Virginia
Country
United States
On the issue of whether or not to log "crappy" games into ROAR, I think that everybody is right in a sense. If the point of ROAR is to tell us whether a scenario is balanced under ideal conditions, then we probably don't want to add "crappy" records to ROAR, because there isn't any guarantee that the error that rules mistakes, etc., add is going to be symmetrically distributed around whatever the true balance is--and my guess would be that Srynerson is right that more often than not it won't be! In these sorts of cases, adding more records doesn't make things better.

On the other hand, I don't think that the best purpose of ROAR is to tell us whether a scenario is balanced under ideal conditions. "Ideal conditions" are probably something like "played by experienced players with ample time to prepare who didn't make any rules mistakes," and players who can do that probably don't have a lot of need for ROAR to tell them whether a scenario is balanced in the first place! I think that ROAR is probably more helpful to people like me, who aren't good enough to look at a scenario card and know whether it's balanced, and who also aren't good enough to play it without making any avoidable errors. And for us, adding records from people like us makes ROAR more useful.

So imagine a scenario that depended on a perfect understanding of the Overrun rules. ASL pros could play it, and it's perfectly balanced, but if you don't understand all the defensive options available to infantry, it's a dog. All the people like me play it, and one side always wins. This is good information for our fellow dumb people to have, and it's hard to imagine where else we could get this information!

So all in all, I would encourage everybody to submit records of any scenario in which both players were trying to win (I think I would recommend against recording teaching scenarios if the teacher is trying to help his opponent win, but if he's just playing with one hand behind his back, then probably record it).

Also, here is a helpful tool to tell you whether there are enough records to say that a scenario might be unbalanced:
https://graphpad.com/quickcalcs/binomial1.cfm

For "number of successes", enter the recorded wins by either side (doesn't matter which).
For "number of trials", enter the total number of games played (technically the total number of games that didn't end in a tie, but it doesn't really matter).
Leave "probability" at 0.5 and hit calculate.
Then look at the "Two-tail P value" result. This is the probability that a perfectly balanced scenario would produce a record at least this lopsided just due to chance. If the number is low, then you might say that it's a dog. In my field (biology), we consider anything less than 0.05 to be low.

(Be careful with scenarios with a lot of playings, however. With these, you can get a low p-value even when the scenario isn't that tilted. For instance, if there are 350 playings of a scenario, and the attacker won 150 times, this has a quite low p-value, but the attacker is still winning 43% of the time, which isn't that unbalanced. So while the p-value gives you evidence that a scenario is unbalanced, it doesn't necessarily tell you how unbalanced it is.)
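For anyone who would rather check this offline, here is a minimal Python sketch of the same kind of calculation: an exact two-tailed binomial test against a 50/50 scenario. The function name `two_tailed_p` is invented for this example, and I am assuming (not claiming) that the web calculator computes its two-tail value the same way when the probability is left at 0.5.

```python
from math import comb


def two_tailed_p(wins: int, games: int) -> float:
    """Exact two-tailed binomial p-value against a 50/50 scenario."""
    # Use the smaller of the two win counts; with probability 0.5 the
    # distribution is symmetric, so both tails are the same size.
    k = min(wins, games - wins)
    tail = sum(comb(games, i) for i in range(k + 1)) / 2 ** games
    # Doubling can exceed 1 when the record is dead even, so cap it.
    return min(1.0, 2 * tail)


for wins, games in [(10, 12), (150, 350)]:
    rate = wins / games
    print(f"{wins} wins in {games} games: win rate {rate:.0%}, "
          f"two-tailed p = {two_tailed_p(wins, games):.4f}")
```

With these inputs a 10-2 record comes out at roughly 0.039, and 150 wins in 350 playings comes out well under 0.05 even though that side is still winning about 43% of the time.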
 

bprobst

Elder Member
Joined
Oct 31, 2003
Messages
2,532
Reaction score
1,437
Location
Melbourne, Australia
First name
Bruce
Country
Australia
If the point of ROAR is to tell us whether a scenario is balanced under ideal conditions
No ASL game has ever been played, or ever will be played, under "ideal conditions". It follows, therefore, that this cannot be the "point" of ROAR.

The point of ROAR is to give the players some numbers that might be useful in deciding whether a particular scenario that they have no experience with is likely to offer a "fair" situation before either player has rolled any dice or made any bad tactical decisions.

It's not complex and you don't need to study statistics. ROAR is a funnel. For any given playing, there are so many variables involved that a single result cannot tell you anything. A second result may or may not confirm the first result, but otherwise can't set a trend. Additional results can help to determine a trend, but when the total number of results is small, each individual result can generate a quite substantial change in the trend.

IMO when you get to about 40 results, the possibility of any single further result introducing a significant change to the trend is quite small. As the total number of results increases, you need an increasing number of results that don't follow the trend to introduce any further significant changes. Thus, ROAR has funneled the results into a fairly stable trend: either one side or the other seems to win a disproportionate amount of the time, and thus we can speculate that the scenario is not that balanced; or, each side seems to have about the same chance of winning, and hence we can speculate that the scenario is probably balanced. (If a scenario has fewer than 40 playings, I will usually ignore the ROAR numbers completely.)
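The funnel effect is easy to see with a little arithmetic. Below is a rough Python sketch (the function name `max_shift` is made up here, and the even record is just an assumed starting point) showing how far one more reported result can move the win percentage at different totals.

```python
def max_shift(wins: int, games: int) -> float:
    """Largest change, in percentage points, that one more result can cause."""
    before = wins / games
    if_win = (wins + 1) / (games + 1)   # next game is a win for this side
    if_loss = wins / (games + 1)        # next game is a loss for this side
    return 100 * max(abs(if_win - before), abs(if_loss - before))


for games in (4, 10, 40, 100):
    wins = games // 2  # assumption: the record so far is dead even
    print(f"{games:3d} results: one more result moves the win rate "
          f"by up to {max_shift(wins, games):.1f} points")
```

At 4 results a single game swings the percentage by 10 points; at 40 results the swing is a little over 1 point.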

Note that it's only ever "speculation". Your personal experience will always count for more than the ROAR numbers. It's quite possible (for various reasons) for an unbalanced scenario to have "good" ROAR numbers, even with many results. (And vice-versa, of course.) ROAR can be a useful tool, but it has numerous limitations and you can't treat it as gospel.

As far as "balanced" is concerned, I think 60:40 (with a minimum of 40 total results) is the upper limit. I wouldn't modify a scenario that has numbers in that range. If I was a much more experienced player than my opponent, I might give him the "favoured" side but that's as far as I'd go. On the other hand, for anything more extreme than 60:40, I wouldn't want to play the scenario without modifying it in some way to help the "unfavoured" side.
 

Philippe D.

Elder Member
Joined
Jul 1, 2016
Messages
2,132
Reaction score
1,393
Location
Bordeaux
Country
France
(Be careful with scenarios with a lot of playings, however. With these, you can get a low p-value even when the scenario isn't that tilted. For instance, if there are 350 playings of a scenario, and the attacker won 150 times, this has a quite low p-value, but the attacker is still winning 43% of the time, which isn't that unbalanced. So while the p-value gives you evidence that a scenario is unbalanced, it doesn't necessarily tell you how unbalanced it is.)
Or you can, in the same calculator, enter some probability other than .5 ("perfect balance") and find out what the p-value is with this parameter. For instance, with a .45 probability (not terribly unbalanced), 150 wins for one side and 200 for the other side (with .45 for the 150-win side, of course) gives a .22 probability of getting this number of wins or fewer - not low enough to be confident that the scenario is that unbalanced.

As for Bruce's "40:60" rule of thumb, above: with only 40 results and perfect balance, the probability of getting 16 wins or fewer is 13.4% (or double that, 26.8%, if you include "16 or fewer, or 24 or higher", that is, "at least 40:60 imbalance"). It's a bit low, but not low enough to be confident that the scenario is unbalanced (using this as a threshold, you run more than a 25% chance of declaring a perfectly balanced scenario to be unbalanced). With a 40:60 ratio, the standard 5% risk level is reached at about 100 playings.
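These one-sided figures can be reproduced with a short Python sketch. The function name `prob_at_most` is invented for this example; it simply sums the binomial probabilities for every win count up to the observed one, under whatever win probability you assume.

```python
from math import comb


def prob_at_most(wins: int, games: int, p_win: float = 0.5) -> float:
    """P(X <= wins) when X follows a Binomial(games, p_win) distribution."""
    return sum(
        comb(games, k) * p_win ** k * (1 - p_win) ** (games - k)
        for k in range(wins + 1)
    )


# 150 wins or fewer out of 350, assuming that side wins 45% of the time.
print(f"150 or fewer of 350 at .45: {prob_at_most(150, 350, 0.45):.3f}")

# 16 wins or fewer out of 40 under perfect balance, then both tails together.
lower = prob_at_most(16, 40)
print(f"16 or fewer of 40 at .50:   {lower:.3f} (both tails: {2 * lower:.3f})")
```

Doubling the lower tail is only valid here because the distribution is symmetric at .5; for other probabilities the two tails differ.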
 

bprobst

Elder Member
Joined
Oct 31, 2003
Messages
2,532
Reaction score
1,437
Location
Melbourne, Australia
First name
Bruce
Country
Australia
As for Bruce's "40:60" rule of thumb, above: with only 40 results and perfect balance, the probability of getting 16 wins or fewer is 13.4% (or double that, 26.8%, if you include "16 or fewer, or 24 or higher", that is, "at least 40:60 imbalance").
I don't understand what you mean by "perfect balance". If the scenario has "perfect balance", it's 50:50, not 40:60. It's not an issue though, since no scenario is "perfectly balanced", and no game is ever played "perfectly". I think this is an example of statistics confusing the issue, not offering clarification or simplification.

It's a bit low, but not low enough to be confident that the scenario is unbalanced (using this as a threshold, you run more than a 25% chance of declaring a perfectly balanced scenario to be unbalanced). With a 40:60 ratio, the standard 5% risk level is reached at about 100 playings.
I never declare any scenario to be "unbalanced" until after I've played it. Prior to playing it, it's just speculation, based on many factors of which ROAR is only one. [EXC: Burzevo is a turd of the lowest caliber, and you don't need to play it to work that out.]
 

Philippe D.

Elder Member
Joined
Jul 1, 2016
Messages
2,132
Reaction score
1,393
Location
Bordeaux
Country
France
"Perfect balance" is the ideal situation where each player has an even chance to win. I'm not claiming this exists - it's just a model to work from.

What I am saying is that such a gem would have a 26.8% chance of showing 40:60 imbalance or worse after 40 playings (i.e. one of the sides getting 16 wins or fewer). This is a rather high probability to start drawing conclusions. Now, reading your comment again, I realize I misread you - I thought you were saying this was about the point at which you'd stop agreeing to play the scenario as is, but you were saying roughly the opposite.

boylermaker above said the standard threshold is usually 5% risk - one would not reject the hypothesis if the data had a 5% chance or more of showing up under the hypothesis [here the hypothesis is that the scenario is 50-50 balanced]. And based on that, it would take (about) 100 reported playings to reach this level of risk at 40-60 reported results. I only mentioned this because it's somewhat counterintuitive; most people would look at the numbers, see 40-60 reports with a total of 40 or 50 playings, and say, "this is clearly unbalanced".
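To see where a 40-60 record crosses the 5% line, here is a small Python sketch that holds the split at exactly 40:60 and varies the number of playings. Both function names are hypothetical, and the sizes are restricted to multiples of 5 so that 40% of them is a whole number of wins.

```python
from math import comb


def two_tailed_p(wins: int, games: int) -> float:
    """Exact two-tailed binomial p-value against a 50/50 scenario."""
    k = min(wins, games - wins)
    tail = sum(comb(games, i) for i in range(k + 1)) / 2 ** games
    return min(1.0, 2 * tail)


def sweep_40_60(sizes):
    """Map each number of playings to the p-value of an exact 40:60 split."""
    # Each size must be a multiple of 5 so that 2 * n // 5 is exactly 40%.
    return {n: two_tailed_p(2 * n // 5, n) for n in sizes}


for n, p in sweep_40_60(range(40, 141, 20)).items():
    wins = 2 * n // 5
    print(f"{n:3d} playings ({wins}-{n - wins}): two-tailed p = {p:.3f}")
```

Under these assumptions the p-value is still close to 0.27 at 40 playings and only drops below 0.05 somewhere between 100 and 120 playings, which fits the "about 100" estimate.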
 

Danno

Ost Front Fanatic
Joined
Mar 12, 2005
Messages
1,472
Reaction score
873
Location
Land of OZ
Country
United States
ROAR just passed 99,300 games logged. The Texas tournament games are still to be logged.
 

djohannsen

Senior Member
Joined
Jan 3, 2017
Messages
762
Reaction score
620
Location
Within 800 meters.
Country
United States
We're all novices. ASL is like golf: there are only a handful of people in the world who are really good at it. The rest of us are all just weekend duffers.
I know that there may never have been a game played where no rule was butchered; however, for my first half-dozen plays, I'm a complete novice playing against VERY experienced players (who in all probability are intentionally "keeping it close"). I don't know that these results are indicative of anything. Now that I am somewhere north of a half-dozen face-to-face games and MAY have an inkling of how the game is supposed to be played, it's probably getting to be time to start logging results. In fact, once I finally finish "Wise's War" (sadly, due to scheduling difficulties the play has dragged over a couple of months) I think that I've acquitted myself well enough to report the results (no matter which way it ends).
 

Danno

Ost Front Fanatic
Joined
Mar 12, 2005
Messages
1,472
Reaction score
873
Location
Land of OZ
Country
United States
ROAR passed 99,500 games logged. June was over 500 games.
 