No Goblins Allowed
http://862838.jrbdt8wd.asia/

Land Count Odds
http://862838.jrbdt8wd.asia/viewtopic.php?f=38&t=17943
Page 1 of 9

Author:  VT2WA [ Sat Feb 18, 2017 11:47 am ]
Post subject:  Land Count Odds

It's taken a while, but I've collected 1,700 games' worth of mana data.
(FYI, my background is a Master's in Aeronautical Engineering from Rensselaer Polytechnic Institute, and my Bachelor's included a minor in Mathematics. So yes, keeping track of data is a fault of mine.)

Controls:
Play with a deck that contains 26 lands and minimal (if possible, no) land manipulation or draw manipulation effects: for example, no Evolving Wilds, no ramp, no shuffling, no tutor effects. For each game I played with a qualifying deck, I committed to keeping whatever starting hand I got. I tracked the number of cards drawn off the top of the deck and counted lands vs. non-land cards. Scry is OK, but I have to count what was on top: I can put cards on the bottom, but I still count what WAS on top, in order.

I did occasionally play with Nissa, Vastwood Seer. I marked those games.

My wife recently went on a vacation to visit her family (we can't both travel at the same time; our GSD doesn't do well in a kennel). During this time I went through the mind-numbing process of assembling a deck in real life, with 26 lands and 34 non-lands, and simulating 400 games.

I randomly chose a batch of games and looked at the number of cards drawn in each. I then shuffled the cards thoroughly by hand and drew that many cards. I kept track of the percentage of Nissa games and mimicked it in my real-life control deck by putting in a single copy of Nissa. I collected the data by counting the number of lands and non-lands off the top.

Attached are the results: a graph showing the percentage of games in which I drew X lands. The number of lands is on the X-axis; the percentage of occurrences is on the Y-axis.

I know some will say that 1700 games is not enough data. I can assure you when measuring a stat of land vs non-land with an expected long term average rate of 43%/57%, 1700 games is close to giving us an expected 3-sigma result. Yes, 400 games in-real-life is a little low, but it's not an insignificant number (and you try doing it for 6+ hours!)

The blue line represents the real-life games' data.
The orange line represents the Duels games' data.

Attachments:
Land %.png

Author:  InFaMoUsGeMiNi [ Sat Feb 18, 2017 11:56 am ]
Post subject:  Re: Land Count Odds

So, in summary, what exactly is your graph telling us in simple terms?

Author:  VT2WA [ Sat Feb 18, 2017 12:01 pm ]
Post subject:  Re: Land Count Odds

The above graph shows how often I would draw a given number of lands.

Basically the thing to look at is the shape of the curves. They are normalized, and therefore should be the same shape if all things are equal.

Here is another way of looking at it. I summarized the data as what percentage of lands I drew each game. Again, all things equal, the shape should be the same.

Trying to stay unbiased, but based on this data I would conclude that the percentage of lands you draw in real life is not accurately represented in Duels. In Duels, it appears you are more likely to draw a statistically low or statistically high number of lands. Basically, it appears that the game manipulates the standard deviation when it comes to the "randomness" of drawing a land.

Again, in the graph below, the blue line is real-life data and the orange line is Duels data.

Attachments:
Land %2.png

Author:  Black Barney [ Sat Feb 18, 2017 12:30 pm ]
Post subject:  Re: Land Count Odds

Same types of decks being played in both? The 2nd graph is much better.

If I'm reading it right, you get mana starved less in Duels but you get flooded WAY more often. That supports a theory I've heard that Duels wants you to draw land more often because being mana starved is an NPE (negative play experience).

The Duels curve is suspiciously flat, which shows a non-normal distribution of occurrences.

Really interesting, great work. I think your sample size is good.

The uptick at the end is weird: unless you were playing ramp, you shouldn't be more likely to draw all your land than almost all of your land.

Your blue line testing is excellent. The curve is what you'd expect, with the flat tail.

Author:  Sjokwaave [ Sat Feb 18, 2017 12:42 pm ]
Post subject:  Re: Land Count Odds

Yes, that all looks correct.

Author:  DJ0045 [ Sat Feb 18, 2017 12:51 pm ]
Post subject:  Re: Land Count Odds

X axis? Y axis? I'm not sure what you're measuring in either graph.

Also, real quick extra question... Duels n = 1700, shuffled real cards n = 400?

If I'm understanding your post, this is exactly the relationship you'd expect: precisely what probability theory would predict.

Author:  TheFlakyMage [ Sat Feb 18, 2017 1:01 pm ]
Post subject:  Re: Land Count Odds

Those were the number of Duels games vs. the number of IRL games. He collected data from 1700 Duels games and from 400 paper games.

Edit
You didn't happen to keep a count of how many cards were drawn/scried each game, along with the # of lands drawn, did you? I'd be interested to see your graph with the x axis showing the ratio of land over nonland draws per game along with a total average for paper and duels ratios. There are a lot of factors that might alter the number of cards drawn per game like the game duration or the density of draw/scry effects in the decks being used. Your sample size is probably plenty large enough to even out these factors, but a simple ratio would make them entirely irrelevant.

Edit2

Nm. Your follow-up post answers this. Actually, you say as much in the OP. My bad for skimming.

Author:  VT2WA [ Sat Feb 18, 2017 1:05 pm ]
Post subject:  Re: Land Count Odds

Here's another way of looking at the data without a graph:

Duels games = 1700

% Land Drawn    # of Games
0% - 10%        131
10% - 20%       234
20% - 30%       306
30% - 40%       329
40% - 50%       245
50% - 60%       226
60% - 70%       111
70% - 80%       49
80% - 90%       21
90% - 100%      48


Real Life

% Land Drawn    # of Games
0% - 10%        6
10% - 20%       15
20% - 30%       35
30% - 40%       107
40% - 50%       121
50% - 60%       69
60% - 70%       39
70% - 80%       6
80% - 90%       2
90% - 100%      0
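
For anyone who wants to redo the normalization behind the graphs, here is a minimal Python sketch. It only turns the bucket counts from the two tables above into percentages of all games, so that 1700 Duels games and 400 real-life games can be compared on the same scale. The variable and function names are my own, not part of any tool used for the original charts.

```python
# Bucket counts copied from the two tables above (0-10%, 10-20%, ..., 90-100%).
duels_counts = [131, 234, 306, 329, 245, 226, 111, 49, 21, 48]
real_counts = [6, 15, 35, 107, 121, 69, 39, 6, 2, 0]

def to_percent(counts):
    """Convert raw game counts per bucket into percent of all games."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]

duels_pct = to_percent(duels_counts)
real_pct = to_percent(real_counts)

for i, (d, r) in enumerate(zip(duels_pct, real_pct)):
    label = f"{10 * i}%-{10 * (i + 1)}%"
    print(f"{label:<9} Duels {d:5.1f}%  Real {r:5.1f}%")
```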

Author:  DJ0045 [ Sat Feb 18, 2017 1:11 pm ]
Post subject:  Re: Land Count Odds

VT2WA wrote:
Here's another way of looking at the data without a graph:

Duels games = 1700

% Land Drawn # of Games
0-10 131
10-20 234
20-30 306
30-40 329
40-50 245
50-60 226
60-70 111
70-80 49
80-90 21
90-100 48


Real Life

% Land Drawn # of Games
0-10 6
10-20 15
20-30 35
30-40 107
40-50 121
50-60 69
60-70 39
70-80 6
80-90 2
90-100 0


Opening hand? Or over the course of an entire game, which has a random number of draws? For example, is 90-100 a 7/8 land opening hand, or a game where you drew 26 lands over a random-length (e.g. non-standard) number of draws?

Author:  VT2WA [ Sat Feb 18, 2017 1:16 pm ]
Post subject:  Re: Land Count Odds

It's total lands drawn vs non-lands drawn through the entire game.

Example:

I could have a starting hand of 3 lands and 4 non-lands. The game goes 10 turns, so I draw 10 additional cards: 4 are lands and 6 are non-lands.

Total game count: 7 lands and 10 non-lands, which equates to 41.2% lands for the game (7 of 17 cards).
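
The arithmetic in the example above, as a small Python sketch. The function name is made up for illustration; it just divides lands seen by total cards seen.

```python
def land_percentage(lands, non_lands):
    """Percent of all cards seen in one game that were lands."""
    total = lands + non_lands
    if total == 0:
        raise ValueError("no cards drawn")
    return 100.0 * lands / total

# Starting hand: 3 lands + 4 non-lands; ten draws: 4 lands + 6 non-lands.
lands = 3 + 4
non_lands = 4 + 6
print(f"{land_percentage(lands, non_lands):.1f}% lands")  # 7 of 17 cards
```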

Author:  DJ0045 [ Sat Feb 18, 2017 1:28 pm ]
Post subject:  Re: Land Count Odds

VT2WA wrote:
It's total lands drawn vs non-lands drawn through the entire game.

Example:

I could have a starting hand of 3 lands and 4 non-lands. The game goes 10 turns, so I draw 10 additional cards: 4 are lands and 6 are non-lands.

Total game count: 7 lands and 10 non-lands, which equates to 41.2% lands for the game.


Ah okay, so roughly 3% of your games in Duels (48 of 1700) end with you drawing 90-100% lands for the entire game before your opponent kills you, regardless of game length. For example, you could have drawn 20 lands in a row, or died on t4 having seen 11 lands, and they would be given the same weight in your results. Am I right?

I presume you also had an opponent with real cards, right? So these results are similarly generated with real cards?

Also, could you adjust those graphs slightly? The expected value is 26/60 (about 43.3%), so it's really odd to group things as you have, since we can't tell where the majority of the points surrounding the expected mean actually lie. E.g., 30-40% groups numbers that are relatively far from the expected value with numbers that are nearly dead on; same issue with 40-50%. It's a real visualization problem, and it could be making things look more interesting than they actually are.



TL;DR: above is important; below is also important, but the graph change is more so.

The other major problem is that by drawing a random number of cards each time, the variance of each of your results is different (I don't mean Real vs. Duels, btw... I mean each X has a different variance). So we can't even calculate the expected distributions from this (even if we assume Normality, which is probably okay), we may only be able to guess at it - and our guess is not likely to be correct.

Ex: E(mean) is 26/60, or about 0.4333. The simplest way to envision the variance for this system is 0.4333 * (1 - 0.4333) / n, where n is the number of draws you took. (I'm treating the system as a Bernoulli random variable where land = 1 and non-land = 0, which is probably about right, in case you want to look it up.) For a 4-turn match, n = 11 or 12; for a ten-turn match, n = 17 or 18. But from what I can tell, you've grouped them all together.

Regardless of how we guess at the appropriate variance (which is tricky at best) it will still have that n factor, which your data seemingly ignores.
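
To make the n factor concrete, here is a rough Python sketch of the Bernoulli approximation described above. It treats each draw as independent with p = 26/60, which is a simplification (real draws are without replacement), and the function name is just for illustration.

```python
import math

P_LAND = 26 / 60  # 26 lands in a 60-card deck

def bernoulli_std(n, p=P_LAND):
    """Std. dev. of the land fraction over n draws, treating draws as independent."""
    return math.sqrt(p * (1 - p) / n)

# Game lengths mentioned above: 4-turn and ten-turn matches.
for n in (11, 12, 17, 18):
    print(f"n = {n:2d}: expected {P_LAND:.4f}, std dev {bernoulli_std(n):.4f}")
```

The expected land fraction is the same in every row, but the spread around it shrinks as n grows, so short and long games are not directly comparable when lumped together.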

Layman terms: you're counting average sizes of fruit, but failing to distinguish between apples, oranges, grapes, etc... while comparing the distributions of those averages as though everything were the same exact fruit.

Put in more technical terms: if the above is correct your estimator could be both biased (questionably important, but still obviously true: the result 26/60 is not possible for all n, since card numbers have discrete values*) and inconsistent (highly likely), which would make the results hard to interpret.

* the bias here stems from the fact that game length average may not be the same for Duels and Real... let's say the average real game ends on t6, while the average duels ends on t7, then the expected mean values for each will be different, based solely on the fact that the maximum likely number of expected lands seen by t6 is 6 or 46%, whereas t7 is also 6 or 43%. This could also happen if you were more often on the play or the draw, etc... the point being, your average n matters a lot, both in terms of the possible results, and in terms of the meanings of those results.
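
A quick way to check the t6 vs. t7 point is to brute-force the most likely land count for 13 and 14 cards seen. This is a sketch assuming a 60-card deck with 26 lands and draws without replacement; the helper names are hypothetical.

```python
from math import comb

DECK, LANDS = 60, 26

def pmf(k, n):
    """Probability of exactly k lands among n cards drawn without replacement."""
    return comb(LANDS, k) * comb(DECK - LANDS, n - k) / comb(DECK, n)

def most_likely_lands(n):
    """Land count with the highest probability for n cards seen."""
    return max(range(0, min(n, LANDS) + 1), key=lambda k: pmf(k, n))

for n in (13, 14):
    k = most_likely_lands(n)
    print(f"n = {n}: most likely {k} lands = {100 * k / n:.0f}% of cards seen")
```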

End TL;DR



Did you keep data on the number of cards drawn? If you did, your data is still totally usable. You should just group real vs. Duels while keeping total cards drawn fixed (i.e., n is the same for all values on both lines), and if you want to be really smart about presentation, make sure to mark the highest-likelihood value on each chart. Also, I'm happy to help if you want me to.

Author:  The Secret of TIMH [ Sat Feb 18, 2017 3:04 pm ]
Post subject:  Re: Land Count Odds

Are the results saved individually, or were they just saved in the aggregate? I'd be curious to see your first 400 duels games compared to the next 400.

Author:  DJ0045 [ Sat Feb 18, 2017 3:34 pm ]
Post subject:  Re: Land Count Odds

Are the results saved individually, or were they just saved in the aggregate? I'd be curious to see your first 400 duels games compared to the next 400.


Me too, fwiw, but just to be clear, the issues I describe above have nothing to do with the total number of trials... 1700 and 400 are both plenty. The issue arises from the number of draws in each trial not being the same. I'll try to be consistent going forward... n = number of draws per trial, t = number of trials. So basically, no problem with t, but the different n's are surely causing spurious results.

@BB it's worth mentioning that 1700 trials should cause additional smoothing, so it's the opposite of what you said - this is the expected result if the distribution is Normal, or similar. What you are confusing this with, is what happens if we resample from the result. For example, take 5 trials chosen at random, find the mean, then do it 100 times, and you should see a normal distribution for the resulting means with a peak density at the expected mean of 43.333333%.
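
Here is a rough sketch of that resampling idea in Python. Note the big assumptions: the games are simulated, not taken from the posted data, and the game length (10 to 25 cards seen) is a range I made up for illustration. The seed is fixed only so the run is repeatable.

```python
import random
import statistics

rng = random.Random(2017)          # fixed seed so the sketch is repeatable
deck = [1] * 26 + [0] * 34         # 1 = land, 0 = non-land

def play_game():
    """Land fraction for one simulated game of assumed random length."""
    n = rng.randint(10, 25)        # assumption: cards seen per game, not from the data
    return sum(rng.sample(deck, n)) / n

trials = [play_game() for _ in range(1700)]

# Resample: mean of 5 randomly chosen trials, repeated 100 times.
resampled_means = [statistics.mean(rng.sample(trials, 5)) for _ in range(100)]

print(f"mean of resampled means: {statistics.mean(resampled_means):.4f}")
print(f"spread of single trials: {statistics.stdev(trials):.4f}")
print(f"spread of 5-trial means: {statistics.stdev(resampled_means):.4f}")
```

The resampled means should cluster around 26/60 with a noticeably tighter spread than the individual trials.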

The other reason for the smoothing is the variance issue I mentioned above, but that is WAY too technical for NGA. You guys would fall asleep from the explanation, assuming you haven't already from my above post.

Author:  Haven_pt [ Sat Feb 18, 2017 4:16 pm ]
Post subject:  Re: Land Count Odds

Great work. The Duels curve is quite flat as has been mentioned and somewhat un-bell-like as would be expected from a normal distribution, which adds weight to our long-time Suspicions that the RNG of Duels is skewed.

So now you can be certain that if you keep that 5-lander hand, your next draw will be Land. :)

Author:  DJ0045 [ Sat Feb 18, 2017 4:39 pm ]
Post subject:  Re: Land Count Odds

Haven_pt wrote:
Great work. The Duels curve is quite flat as has been mentioned and somewhat un-bell-like as would be expected from a normal distribution, which adds weight to our long-time Suspicions that the RNG of Duels is skewed.

So now you can be certain that if you keep that 5-lander hand, your next draw will be Land. :)


You're not going to get a bell curve for the mean. No wonder you guys think something is off. It would be skewed to the left unless you play with a 30 land deck. And even then I don't think it would be a normal distribution as the tails are truncated. I'm actually trying to find a site that graphs this... if I can I'll post it here. Otherwise, I may just generate one myself. The graph posted above shows about the level of skew I'd expect to see, but let me see if I can convince myself otherwise.

edit: actually, I'm hitting a bit of a wall with this one. I can simulate a Bernoulli random variable, and I get something like what I describe above, but technically this isn't a Bernoulli random variable. TBH, this may be beyond my programming skills to accomplish without spending hours figuring it out.

The bottom line, though, is that because of the limits on each side of the distribution (e.g., P(#lands < 0) = 0 and P(#lands > n) = 0), you have a really odd distribution for the means. Furthermore, since we are drawing without replacement, even the model is pretty darn complicated. I've asked a couple of friends of mine for their input, so I'll see if they have any bright ideas on how to crack this nut, but I don't think it's going to be easy.

Author:  The Secret of TIMH [ Sat Feb 18, 2017 6:26 pm ]
Post subject:  Re: Land Count Odds

DJ0045 wrote:
Are the results saved individually, or were they just saved in the aggregate? I'd be curious to see your first 400 duels games compared to the next 400.


Me too, fwiw, but just to be clear, the issues I describe above have nothing to do with the total number of trials... 1700 and 400 are both plenty.


I didn't think the issue would be the total number of trials. I'm curious to see what the variance would look like in comparing Duels to itself.

Curious how much is simply due to chance. I'm looking at it a little like he flipped a quarter 100 times, then flipped a penny 100 times and the results indicate quarters and pennies don't have the same distribution of heads to tails over 100 flips.

Author:  DJ0045 [ Sat Feb 18, 2017 6:52 pm ]
Post subject:  Re: Land Count Odds

DJ0045 wrote:
Are the results saved individually, or were they just saved in the aggregate? I'd be curious to see your first 400 duels games compared to the next 400.


Me too, fwiw, but just to be clear, the issues I describe above have nothing to do with the total number of trials... 1700 and 400 are both plenty.


I didn't think the issue would be the total number of trials. I'm curious to see what the variance would look like in comparing Duels to itself.

Curious how much is simply due to chance. I'm looking at it a little like he flipped a quarter 100 times, then flipped a penny 100 times and the results indicate quarters and pennies don't have the same distribution of heads to tails over 100 flips.


It's not a coin flip though. We can try to model it that way, using a Bernoulli random variable with p=~.4333333, but that's actually not going to be right since each draw has different probability from the last. And his distribution is also being affected by the randomness of n, because these are discrete results, a problem I mentioned earlier: your most likely result is the same for n=13 and 14, namely 6, this will cause weird things to happen when you aggregate them all together. This is why I offered to look at the data, if he lets me.

A coin flip is random with replacement; this is not. And the discreteness of the result (meaning each result is a whole number, not a continuous value; e.g., drawing 7.234 lands is not possible) means that certain results are more probable than others. None of this would matter if n was the same in every trial, e.g., always look at the top 7 cards of the deck, as has been done before. If you do that, after 1700 trials you should get almost exactly 43.333333% land as the average (+- a tiny variance). But this is not what we've got here, as this data was not gathered in that fashion.
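
To put a number on that "tiny variance" for the fixed top-7 case, here is a rough calculation using the standard hypergeometric mean and variance formulas (draws without replacement). The constant names are mine, and the 1700 trials of exactly 7 cards are the hypothetical setup described above, not the posted data.

```python
import math

DECK, LANDS, HAND, TRIALS = 60, 26, 7, 1700
p = LANDS / DECK

# Hypergeometric mean and variance of the land count in a 7-card look.
mean_lands = HAND * p
var_lands = HAND * p * (1 - p) * (DECK - HAND) / (DECK - 1)

# Standard error of the average land fraction across all trials.
std_fraction = math.sqrt(var_lands) / HAND
std_error = std_fraction / math.sqrt(TRIALS)

print(f"expected lands in top {HAND}: {mean_lands:.3f} ({100 * p:.2f}%)")
print(f"std error of the average over {TRIALS} trials: {100 * std_error:.2f} percentage points")
```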

Also, fwiw, the way he's clustered the data is also really confusing the issue... those are by no means equal probability bands. Not even close. They aren't grouped by standard deviation, so of course they don't look (on visual inspection) right. Especially the second graph, which has all results from 30-40% and 40-50% lumped together, which they absolutely should not be.

Author:  DJ0045 [ Sat Feb 18, 2017 7:06 pm ]
Post subject:  Re: Land Count Odds

The data I'm interested in is the Duels data, btw... I'm not going to question his shuffling ability or anything so ridiculous as that. But if the question is about the duels shuffler, we can examine just his 1700 trial results and figure out whether the shuffler is probably right or not. That is more than enough of a controlled experiment.

Speaking of which, we could also use it to determine whether the probabilities of the results are inappropriately skewed, etc. E.g., we get the right mean, but probabilities in the tail are too high, as people have often suggested, and which I have often doubted.

The first part would take literally two seconds running a quick statistics test with clustered errors on n. The second is slightly trickier, but also totally doable.

Author:  Eonblueapocalypse [ Sat Feb 18, 2017 7:58 pm ]
Post subject:  Re: Land Count Odds

DJ0045 wrote:
DJ0045 wrote:

Me too, fwiw, but just to be clear, the issues I describe above have nothing to do with the total number of trials... 1700 and 400 are both plenty.


I didn't think the issue would be the total number of trials. I'm curious to see what the variance would look like in comparing Duels to itself.

Curious how much is simply due to chance. I'm looking at it a little like he flipped a quarter 100 times, then flipped a penny 100 times and the results indicate quarters and pennies don't have the same distribution of heads to tails over 100 flips.


It's not a coin flip though. We can try to model it that way, using a Bernoulli random variable with p=~.4333333, but that's actually not going to be right since each draw has different probability from the last. And his distribution is also being affected by the randomness of n, because these are discrete results, a problem I mentioned earlier: your most likely result is the same for n=13 and 14, namely 6, this will cause weird things to happen when you aggregate them all together. This is why I offered to look at the data, if he lets me.

A coin flip is random with replacement; this is not. And the discreteness of the result (meaning each result is a whole number, not a continuous value; e.g., drawing 7.234 lands is not possible) means that certain results are more probable than others. None of this would matter if n was the same in every trial, e.g., always look at the top 7 cards of the deck, as has been done before. If you do that, after 1700 trials you should get almost exactly 43.333333% land as the average (+- a tiny variance). But this is not what we've got here, as this data was not gathered in that fashion.

Also, fwiw, the way he's clustered the data is also really confusing the issue... those are by no means equal probability bands. Not even close. They aren't grouped by standard deviation, so of course they don't look (on visual inspection) right. Especially the second graph, which has all results from 30-40% and 40-50% lumped together, which they absolutely should not be.


This.

A binomial distribution isn't going to work in this scenario, as the cards are drawn without replacement. A hypergeometric distribution has to be used in this case.

I am functionally useless when it comes to mathematics beyond theoretical knowledge, so I won't even attempt to plot it out and make myself look like an idiot in the process.

That said, I know for a fact that there is a spreadsheet function for the hypergeometric distribution which can be used to plot out things of this nature. While I don't remember the function, or the exact link to where to find it, I distinctly remember it because it was part of a pretty old MTG article about how probability works in relation to the game.

Maybe tonight I will do some searching for it to see if I can still find the original article around somewhere.

Author:  DJ0045 [ Sat Feb 18, 2017 8:01 pm ]
Post subject:  Re: Land Count Odds

The function is totally straightforward. It's just a ratio of n choose k functions. What I don't know how to do is get a random simulation of them.
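
One possible way to get a random simulation, sketched in Python: build a 60-card list, draw n cards without replacement, and count the lands, then compare the observed frequencies with the ratio-of-choose formula. This is only a sketch under stated assumptions: n = 17 and 20000 simulated games are arbitrary choices, and the function names are made up.

```python
import random
from collections import Counter
from math import comb

DECK, LANDS = 60, 26

def hypergeom_pmf(k, n):
    """P(exactly k lands in n cards): a ratio of n-choose-k terms."""
    return comb(LANDS, k) * comb(DECK - LANDS, n - k) / comb(DECK, n)

def simulate(n, games, rng):
    """Shuffle-and-draw simulation: count lands in the top n cards, many times."""
    deck = [1] * LANDS + [0] * (DECK - LANDS)   # 1 = land, 0 = non-land
    counts = Counter()
    for _ in range(games):
        counts[sum(rng.sample(deck, n))] += 1   # sample = draw without replacement
    return counts

rng = random.Random(0)   # fixed seed for repeatability
n, games = 17, 20000
counts = simulate(n, games, rng)
for k in range(n + 1):
    print(f"{k:2d} lands: exact {hypergeom_pmf(k, n):.4f}  simulated {counts[k] / games:.4f}")
```

`random.sample` picks cards without putting them back, which is what separates this from a coin-flip (Bernoulli) simulation.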

All times are UTC - 6 hours [ DST ]