BGonline.org Forums

Take the "O" out of "point" and it's "pint".

Posted By: MK
Date: Saturday, 10 August 2024, at 5:34 a.m.

Some of you who are old enough may remember the 80's song with the lyrics:

Take the "L" out of "lover" and it's "over".

If you want to reminisce for a minute or hear it for the first time, here's a link:

https://www.youtube.com/watch?v=wmpOkqwaPzA

Inspired by it, I'm trying to compose a gamblegammon song with the lyrics:

Take the "O" out of "point" and it's "pint".

It's not romantic at all. It's about "Bill and Bob" picking berries in the woods. But wait. Let me back up for a minute to tell you what led me to write this "musin'cal" paper to begin with.

===================================================

As you may know from my postings about them here, I had recently done two cubeful, money-game experiments just out of curiosity to find out what would happen if a mutant player always made the second best checker move or the worst checker move at his first turn.

In the first experiment, cubeful results were as follows: In 20,000 games the bot won 46,782 points (92.54%) and the mutant won 3,773 points (7.46%), with the mutant losing 2.15 ppg, i.e. (46,782-3,773)/20,000.

And cubeless results were as follows: In 20,000 games the bot won 28,619 points (93.45%) and the mutant won 2,006 points (6.55%), with the mutant losing 1.33 ppg, i.e. (28,619-2,006)/20,000.

In the second experiment, cubeful results were as follows: In 20,000 games the bot won 32,208 points (57.85%) and the mutant won 23,467 points (42.15%), with the mutant losing 0.44 ppg, i.e. (32,208-23,467)/20,000.

And cubeless results were as follows: In 20,000 games bot won 16,319 (61.38%) mutant won 10,267 (38.62%) points, with mutant losing 0.30 ppg, (16,319-10,267)/20,000.
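For anyone who wants to recheck the arithmetic, the figures above can be recomputed with a minimal Python sketch. The point totals are the ones quoted in this post; the function name `summarize` and the dictionary layout are illustrative assumptions, not output from any bot.

```python
# Point totals quoted in the post: (bot points, mutant points), 20,000 games each.
GAMES = 20_000
RESULTS = {
    "exp1 cubeful": (46_782, 3_773),
    "exp1 cubeless": (28_619, 2_006),
    "exp2 cubeful": (32_208, 23_467),
    "exp2 cubeless": (16_319, 10_267),
}


def summarize(bot_points, mutant_points, games=GAMES):
    """Return (mutant's net loss in ppg, mutant's share of all points in %)."""
    net_ppg = (bot_points - mutant_points) / games
    share = 100 * mutant_points / (bot_points + mutant_points)
    return round(net_ppg, 2), round(share, 2)


for label, (bot, mutant) in RESULTS.items():
    net_ppg, share = summarize(bot, mutant)
    print(f"{label}: mutant loses {net_ppg} ppg, wins {share}% of the points")
```

The two numbers come from the same totals but answer different questions: the first divides the point difference by the number of games, the second divides the mutant's points by all points scored.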

The purpose of my cubeless versions of these experiments was to see whether the cube adds anything, (and what), to backgammon. But before we could get to that discussion, I and a few people in DailyGammon got mired in endlessly repeating arguments on how to go about it.

Zorba and Ian argued, based on the ppg's lost, that mutant did worse in cubeful. I'm not naming them to single them out but because they were the most vocal. Zorba said "ppg is the only figure that matters here" and Ian said "ppg tells me what I need to know in order to gauge the relative strength of the opponents", etc. As nobody else said anything to correct them, I assumed that everybody agreed with them.

I concluded, based on the percentages of points lost, that mutant did worse in cubeless. I added that percentage is a universal mathematical concept, while ppg is useless for comparing the performances of different players.

===================================================

Now back to points and pints. Yesterday I hired "Bill and Bob", (I picked the names that mathematicians most often use to give examples;), to pick berries for me. I sent them into the woods with 10 buckets and lots of pint size plastic bags so that they could write their names on them and get paid $1 per pint properly.

When they returned with all buckets full, we counted the bags. Bill had picked 250 pints vs Bob had picked 150 pints. Bill bragged that he had picked 10 ppb, (pints per bucket), more berries than Bob, i.e. (250-150)/10. Bob didn't object.

Today, I sent them into the woods again with 10 buckets and lots of pint size bags. When they returned with all 10 buckets full, we again counted the bags. Bill had picked 340 pints vs Bob had picked 220 pints. Bill boasted that he did better than yesterday, since he picked 12 ppb, (pints per bucket), more than Bob, i.e. (340-220)/10. But this time Bob objected that it was actually him who did better than yesterday, since compared to the 37.5% of total pints picked yesterday, i.e. 150/(250+150), he picked 39.3% of total pints of berries picked today, i.e. 220/(340+220).

They asked me to arbitrate. At first I was puzzled for a second. But then, I said "Aha! I know. Yesterday I gave you guys 5-gallon buckets, (with an average of 40 pints per bucket), but today I gave you 7-gallon buckets (with an average of 56 pints per bucket). Thus, Bob is right."

They both had picked more pints today than yesterday but in comparison, Bill had picked about 1.8 percentage points less, Bob about 1.8 percentage points more, of the total berries today.

As you all can see, ppb (pints per bucket) doesn't work even to compare the same two berry pickers' performance two days in a row, unless you also use the same size buckets consistently. What is true for "ppb" in berry picking is also true for "ppg" in gamblegammon.
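The berry story can be replayed in a few lines of Python. This is only a sketch of the arithmetic in the story; the function name `compare` and the dictionary keys are made up for illustration.

```python
BUCKETS = 10  # both days, all ten buckets came back full


def compare(bill_pints, bob_pints, buckets=BUCKETS):
    """Summarize one day of picking for Bill and Bob."""
    total = bill_pints + bob_pints
    return {
        # Bill's bragging figure: his lead over Bob, per bucket.
        "bill_lead_ppb": (bill_pints - bob_pints) / buckets,
        # Bob's figure: his share of everything picked that day.
        "bob_share_pct": round(100 * bob_pints / total, 1),
        # The hidden variable: how many pints a bucket actually held.
        "pints_per_bucket": total / buckets,
    }


yesterday = compare(250, 150)
today = compare(340, 220)

for day, stats in (("yesterday", yesterday), ("today", today)):
    print(day, stats)
```

Bill's lead per bucket goes up from one day to the next, and so does Bob's share of the berries, because the bucket itself grew from 40 to 56 pints.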

In my first experiment, it looked like mutant did better in cubeless ppg's but cubeful *average bucket size* was 2.53 ppg, i.e. (46,782+3,773)/20,000 and cubeless *average game size* was 1.53 ppg, i.e. (28,619+2,006)/20,000, thus mutant actually did worse in cubeless.

In my second experiment, again it looked like mutant did better in cubeless version also based on ppg's, but cubeful *average game size* was 2.78 ppg, (32,208+23,467)/20,000 and cubeless *average bucket size* was 1.33 ppg, (16,319+10,267)/20,000, so again mutant actually did worse cubeless.

I used *average bucket size* and *average game size* interchangeably not to confuse you but to help you understand better.
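One way to put the cubeful and cubeless sessions on the same footing, sketched here in Python, is to divide the mutant's net loss by the average game size. This normalization is my reading of the argument above and an assumption, not a standard backgammon statistic; the function names are illustrative.

```python
GAMES = 20_000


def average_game_size(bot_points, mutant_points, games=GAMES):
    """Total points scored by both sides, per game (the 'bucket size')."""
    return (bot_points + mutant_points) / games


def normalized_loss(bot_points, mutant_points):
    """Mutant's net loss as a fraction of all points scored.

    Equals (net ppg) / (average game size); the number of games cancels out.
    """
    return (bot_points - mutant_points) / (bot_points + mutant_points)


sessions = {
    "exp1 cubeful": (46_782, 3_773),
    "exp1 cubeless": (28_619, 2_006),
    "exp2 cubeful": (32_208, 23_467),
    "exp2 cubeless": (16_319, 10_267),
}

for label, (bot, mutant) in sessions.items():
    size = average_game_size(bot, mutant)
    loss = normalized_loss(bot, mutant)
    print(f"{label}: average game size {size:.2f} ppg, normalized loss {loss:.3f}")
```

Measured this way, the mutant's loss is larger in the cubeless session of each experiment, which matches the conclusion drawn from the percentages.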

===================================================

Where does this sloppily meaningless, useless usage of ppg come from?

Well, at least one culprit source seems to be GnuBG. In its analysis window, it doesn't even give the points counts for players, nor the number of games. It only gives ppg's to compare the strengths of the players.

For my "worst first checker move" experiment, I imported the actual SGF files for the batches of 1,000 games with the mutant's win percents closest to the mutant's overall win percents for all 20,000 games. Here's what GnuBG shows.

Cubeful session, batch #12: 42.21% (42.15% for the entire session):

Actual result              +651.000  -651.000
Advantage (actual) in ppg    +0.651    -0.651

Cubeless session, batch #17: 38.72% (38.62% for the entire session):

Actual result              +304.000  -304.000
Advantage (actual) in ppg    +0.304    -0.304

As I illustrated above using the "average pints per bucket" for picking berries, GnuBG's ppg figures are completely useless without knowing the "average points per game", which is 2.73 ppg, i.e. (1,577+1,152)/1,000, for the cubeful batch and 1.35 ppg, i.e. (826+522)/1,000, for the cubeless batch of 1,000 games. A player's ppg is naturally higher when the average ppg for the game itself is higher, and comparing ppg's in different types of games is meaningless because of this.

===================================================

How far back does this fallacious usage of ppg go? I wouldn’t know exactly, but at least one source seems to trace it to Gerald Tesauro in 1991, in this paper:

http://www.scholarpedia.org/article/User:Gerald_Tesauro/Proposed/Td-gammon

Things start out not too badly at first. Right under the first diagram, it says: "Performance is measured by expected points per game (ppg) won or lost against a benchmark opponent (Sun Microsystems' Gammontool program)". The keywords here are "against a benchmark opponent". Later, it also says: "against a fixed opponent".

After switching to "Pubeval", it still refers to it as a "benchmarking opponent". But after Version 0.0 there is no more mentioning of Gammontool or Pubeval.

Starting with Version 1.0 he turns to comparing his bot to top human players, i.e. in 51 games against Bill Robertie, Paul Magriel and Malcolm Davis his bot lost only 13 points, for an average rate of about one-quarter point per game.

He does at least give the number of games and points lost but the 0.25 ppg is still totally useless for strength comparison among four different players. He dumps apples, oranges and potatoes into a blender and produces some nutritious smoothie rich in ppgs, for the congregation of gamblegammon to slurp up for years and decades to come.

This is another example of the harm Tesauro has caused to backgammon AI, in addition to ending up creating a frankensteinic bot by incorporating human bias into it, based on advice from the gamblegammon giants of that era, apparently because he needed their blessing and sought recognition for his bot.

In the ensuing paragraphs about later versions, he compares his bot to more human players, (seemingly all gamblers), and then starts doing rollouts using Snowie to compare it to Robertie, etc. In the second paragraph, it’s disclosed that "Jellyfish and Snowie" were "directly inspired by TD-Gammon". Thus, doing rollouts with Snowie surely couldn’t be more objective than doing them with TD-Gammon itself.

It's really noteworthy that while talking about Version 3.1, he feels a need to clarify how his bot "made one doubling blunder costing 32 points in a single game (Davis redoubled to 16 and won a gammon). As a result, TD-Gammon ended up with an overall loss by 8 points". Ah, so, it appears that the "cpb", ("cubes per bucket") does indeed matter for making sense out of “ppg”, eh?

===================================================

In conclusion, ppg (points per game), ppb (pints per bucket), ppps (pounds of potatoes per sack), etc. are all "brown math". You can’t compare things with units of measure of undefined size, such as “game”, “bucket”, “sack”, etc.

Let this paper full of "Muratisms" be my unintended/unmeant contribution to the pathetic small world of gamblegammon.

Copyright © 2024 Murat Kalinyaprak

 

