1. Title: Poker Hand Dataset

2. Source Information

a) Creators:

    Robert Cattral (cattral@gmail.com)

    Franz Oppacher (oppacher@scs.carleton.ca)
    Carleton University, Department of Computer Science
    Intelligent Systems Research Unit
    1125 Colonel By Drive, Ottawa, Ontario, Canada, K1S5B6

c) Date of release: Jan 2007

3. Past Usage:
    1. R. Cattral, F. Oppacher, D. Deugo. Evolutionary Data Mining
       with Automatic Rule Generalization. Recent Advances in Computers,
       Computing and Communications, pp. 296-300, WSEAS Press, 2002.
       - Note: This was a slightly different dataset that had more
         classes, and was considerably more difficult.

    - Predicted attribute: Poker Hand (labeled class)
    - Found to be a challenging dataset for classification algorithms
    - Relational learners have an advantage for some classes
    - The ability to learn high-level constructs is an advantage

4. Relevant Information:
    Each record is an example of a hand consisting of five playing
    cards drawn from a standard deck of 52. Each card is described
    using two attributes (suit and rank), for a total of 10 predictive
    attributes. There is one Class attribute that describes the
    Poker Hand. The order of the cards matters, which is why there
    are 480 possible Royal Flush hands rather than 4 (one for each
    suit): each hand can be dealt in 5! = 120 different orders.
    This is explained in more detail below.

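The effect of card order on the counts can be checked with a few lines of Python. This is only an illustrative sketch; the constant and variable names are made up for the example and are not part of the dataset.

```python
import math

HAND_SIZE = 5    # cards per record
DECK_SIZE = 52   # standard deck

# Every unordered five-card hand can be dealt in 5! different orders.
orderings_per_hand = math.factorial(HAND_SIZE)

# One Royal Flush per suit, each counted once per ordering.
royal_flush_ordered = 4 * orderings_per_hand

# Ordered draws of 5 distinct cards from 52 (the size of the full domain).
domain_size = math.perm(DECK_SIZE, HAND_SIZE)

# Unordered five-card hands, for comparison.
unordered_hands = math.comb(DECK_SIZE, HAND_SIZE)

print(orderings_per_hand, royal_flush_ordered, domain_size, unordered_hands)
```

Note that `math.perm` and `math.comb` require Python 3.8 or later.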
5. Number of Instances: 25,010 training, 1,000,000 testing

6. Number of Attributes: 10 predictive attributes, 1 goal attribute

7. Attribute Information:
    1) S1 Suit of card #1
       Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

    2) C1 Rank of card #1
       Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

    3) S2 Suit of card #2
       Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

    4) C2 Rank of card #2
       Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

    5) S3 Suit of card #3
       Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

    6) C3 Rank of card #3
       Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

    7) S4 Suit of card #4
       Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

    8) C4 Rank of card #4
       Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

    9) S5 Suit of card #5
       Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

    10) C5 Rank of card #5
        Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

    11) CLASS Poker Hand
        Ordinal (0-9)

        0: Nothing in hand; not a recognized poker hand
        1: One pair; one pair of equal ranks within five cards
        2: Two pairs; two pairs of equal ranks within five cards
        3: Three of a kind; three equal ranks within five cards
        4: Straight; five cards, sequentially ranked with no gaps
        5: Flush; five cards with the same suit
        6: Full house; pair + different rank three of a kind
        7: Four of a kind; four equal ranks within five cards
        8: Straight flush; straight + flush
        9: Royal flush; {Ace, King, Queen, Jack, Ten} + flush

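The attribute layout and class definitions above can be sketched as a small decoder and rule-based labeler. This is a hedged illustration, not the creators' code: the function names (`parse_row`, `card_name`, `classify`) are hypothetical, the comma-separated record layout (S1,C1,...,S5,C5,CLASS) is an assumption about the data files, and the sample record is made up for the example. Ten-Jack-Queen-King-Ace is assumed to count as a sequential run, which is consistent with the Royal flush definition.

```python
from collections import Counter

SUITS = {1: "Hearts", 2: "Spades", 3: "Diamonds", 4: "Clubs"}
FACE_RANKS = {1: "Ace", 11: "Jack", 12: "Queen", 13: "King"}


def parse_row(line):
    """Split one record into ([(suit, rank), ...], class label).

    Assumes comma-separated integers in the order S1,C1,...,S5,C5,CLASS.
    """
    values = [int(v) for v in line.strip().split(",")]
    if len(values) != 11:
        raise ValueError("expected 10 predictive attributes plus CLASS")
    cards = list(zip(values[0:10:2], values[1:10:2]))
    return cards, values[10]


def card_name(suit, rank):
    """Human-readable name for one (suit, rank) pair."""
    return f"{FACE_RANKS.get(rank, str(rank))} of {SUITS[suit]}"


def classify(cards):
    """Return the class label (0-9) for five (suit, rank) pairs."""
    suits = [s for s, _ in cards]
    ranks = sorted(r for _, r in cards)
    # Sizes of the groups of equal ranks, largest first, e.g. [2, 1, 1, 1].
    groups = sorted(Counter(ranks).values(), reverse=True)
    flush = len(set(suits)) == 1
    ace_high = ranks == [1, 10, 11, 12, 13]
    straight = ace_high or (len(set(ranks)) == 5 and ranks[4] - ranks[0] == 4)

    if flush and ace_high:
        return 9
    if flush and straight:
        return 8
    if groups == [4, 1]:
        return 7
    if groups == [3, 2]:
        return 6
    if flush:
        return 5
    if straight:
        return 4
    if groups == [3, 1, 1]:
        return 3
    if groups == [2, 2, 1]:
        return 2
    if groups == [2, 1, 1, 1]:
        return 1
    return 0


# Illustrative record: 10, Jack, King, Queen, Ace of Hearts, labeled 9.
cards, label = parse_row("1,10,1,11,1,13,1,12,1,1,9")
print([card_name(s, r) for s, r in cards], classify(cards) == label)
```

Because the labeler sorts the ranks before testing them, the order of the cards in the record does not affect the result.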
8. Missing Attribute Values: None

9. Class Distribution:

    The first percentage in parentheses is the representation
    within the training set. The second is the probability in the full domain.

    Training set:

    0: Nothing in hand, 12493 instances, (49.95202% / 50.117739%)
    1: One pair, 10599 instances, (42.37905% / 42.256903%)
    2: Two pairs, 1206 instances, (4.82207% / 4.753902%)
    3: Three of a kind, 513 instances, (2.05118% / 2.112845%)
    4: Straight, 93 instances, (0.37185% / 0.392465%)
    5: Flush, 54 instances, (0.21591% / 0.19654%)
    6: Full house, 36 instances, (0.14394% / 0.144058%)
    7: Four of a kind, 6 instances, (0.02399% / 0.02401%)
    8: Straight flush, 5 instances, (0.01999% / 0.001385%)
    9: Royal flush, 5 instances, (0.01999% / 0.000154%)

    The Straight flush and Royal flush hands are not as representative of
    the true domain because they have been over-sampled. The Straight flush
    is 14.43 times more likely to occur in the training set than in the
    full domain, while the Royal flush is 129.82 times more likely.

    Total of 25,010 instances in a domain of 311,875,200.

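The over-sampling factors quoted for the training set can be approximated by dividing a class's share of the training set by its share of the full domain. The sketch below is illustrative only; the function name is hypothetical, and the domain combination counts (4320 and 480) are taken from the Statistics section of this file. Working from exact counts, the Royal flush factor comes out near 129.90 rather than the quoted 129.82, presumably because the quoted figure was derived from rounded percentages.

```python
TRAIN_SIZE = 25010
DOMAIN_SIZE = 311_875_200


def oversampling_factor(train_count, domain_combinations):
    """Ratio of a class's training-set share to its full-domain share."""
    train_share = train_count / TRAIN_SIZE
    domain_share = domain_combinations / DOMAIN_SIZE
    return train_share / domain_share


straight_flush_factor = oversampling_factor(5, 4320)
royal_flush_factor = oversampling_factor(5, 480)

print(f"Straight flush: {straight_flush_factor:.2f}")
print(f"Royal flush:    {royal_flush_factor:.2f}")
```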
    Testing set:

    The value in parentheses indicates how each class is represented in
    the test set relative to the entire domain. A value of 1.0 is perfect
    representation; values below 1.0 are under-represented and values
    above 1.0 are over-represented.

    0: Nothing in hand, 501209 instances, (1.000063)
    1: One pair, 422498 instances, (0.999832)
    2: Two pairs, 47622 instances, (1.001746)
    3: Three of a kind, 21121 instances, (0.999647)
    4: Straight, 3885 instances, (0.989897)
    5: Flush, 1996 instances, (1.015569)
    6: Full house, 1424 instances, (0.988491)
    7: Four of a kind, 230 instances, (0.957934)
    8: Straight flush, 12 instances, (0.866426)
    9: Royal flush, 3 instances, (1.948052)

    Total of one million instances in a domain of 311,875,200.

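The representation values for the test set can be recomputed from the instance counts listed above and the combination counts in the Statistics section. This is a rough sketch with hypothetical names. Because it uses exact counts rather than rounded probabilities, the results may differ from the listed values in the later decimal places (most visibly for the Royal flush).

```python
TEST_SIZE = 1_000_000
DOMAIN_SIZE = 311_875_200

# class name: (test-set instances, ordered combinations in the full domain)
COUNTS = {
    "Nothing in hand": (501209, 156304800),
    "One pair": (422498, 131788800),
    "Two pairs": (47622, 14826240),
    "Three of a kind": (21121, 6589440),
    "Straight": (3885, 1224000),
    "Flush": (1996, 612960),
    "Full house": (1424, 449280),
    "Four of a kind": (230, 74880),
    "Straight flush": (12, 4320),
    "Royal flush": (3, 480),
}


def representation(test_count, domain_combinations):
    """Test-set share of a class divided by its share of the full domain."""
    return (test_count / TEST_SIZE) / (domain_combinations / DOMAIN_SIZE)


ratios = {name: representation(*pair) for name, pair in COUNTS.items()}

for name, ratio in ratios.items():
    print(f"{name:16s} {ratio:.6f}")
```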
10. Statistics

Poker Hand        # of hands   Probability   # of combinations
Royal Flush                4   0.00000154                  480
Straight Flush            36   0.00001385                 4320
Four of a kind           624   0.0002401                 74880
Full house              3744   0.00144058               449280
Flush                   5108   0.0019654                612960
Straight               10200   0.00392464              1224000
Three of a kind        54912   0.02112845              6589440
Two pairs             123552   0.04753902             14826240
One pair             1098240   0.42256903            131788800
Nothing              1302540   0.50117739            156304800

Total                2598960   1.0                   311875200

The number of combinations represents the number of instances in the entire domain.

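The hand counts in the table can be reproduced with standard counting arguments, and each combination count is the hand count multiplied by 5! = 120 orderings. The sketch below is one way to do this, not necessarily how the creators derived the figures. It assumes ten possible rank sequences for a straight (Ace-low through Ace-high), and the variable names are chosen for the example.

```python
from math import comb, factorial

STRAIGHT_SEQUENCES = 10  # assumed: A-2-3-4-5 up to 10-J-Q-K-A

hands = {
    "Royal Flush": 4,
    # All same-suit straights minus the four royal flushes.
    "Straight Flush": STRAIGHT_SEQUENCES * 4 - 4,
    # Rank of the four, then any of the 48 remaining cards.
    "Four of a kind": 13 * 48,
    # Rank and suits of the triple, then rank and suits of the pair.
    "Full house": 13 * comb(4, 3) * 12 * comb(4, 2),
    # Any five ranks in one suit, minus straight and royal flushes.
    "Flush": comb(13, 5) * 4 - STRAIGHT_SEQUENCES * 4,
    # Any suits on a rank sequence, minus the same-suit cases.
    "Straight": STRAIGHT_SEQUENCES * 4**5 - STRAIGHT_SEQUENCES * 4,
    # Triple, then two other distinct ranks in any suits.
    "Three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4**2,
    # Two pair ranks with their suits, then one of the 44 other cards.
    "Two pairs": comb(13, 2) * comb(4, 2) ** 2 * 44,
    # Pair, then three other distinct ranks in any suits.
    "One pair": 13 * comb(4, 2) * comb(12, 3) * 4**3,
}
# Everything that is left over is "Nothing".
hands["Nothing"] = comb(52, 5) - sum(hands.values())

combinations = {name: count * factorial(5) for name, count in hands.items()}

for name, count in hands.items():
    print(f"{name:16s} {count:8d} {combinations[name]:10d}")
```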