1. Title: Poker Hand Dataset

2. Source Information

a) Creators:

   Robert Cattral (cattral@gmail.com)

   Franz Oppacher (oppacher@scs.carleton.ca)
   Carleton University, Department of Computer Science
   Intelligent Systems Research Unit
   1125 Colonel By Drive, Ottawa, Ontario, Canada, K1S5B6

c) Date of release: Jan 2007

3. Past Usage:
   1. R. Cattral, F. Oppacher, D. Deugo. Evolutionary Data Mining
      with Automatic Rule Generalization. Recent Advances in Computers,
      Computing and Communications, pp. 296-300, WSEAS Press, 2002.
      - Note: This was a slightly different dataset that had more
        classes, and was considerably more difficult.

   - Predicted attribute: Poker Hand (labeled ‘class’)
   - Found to be a challenging dataset for classification algorithms
   - Relational learners have an advantage for some classes
   - The ability to learn high-level constructs is an advantage

4. Relevant Information:
   Each record is an example of a hand consisting of five playing
   cards drawn from a standard deck of 52. Each card is described
   using two attributes (suit and rank), for a total of 10 predictive
   attributes. There is one Class attribute that describes the
   “Poker Hand”. The order of cards is important, which is why there
   are 480 possible Royal Flush hands as compared to 4: each of the
   4 suits yields one Royal Flush, and its five cards can appear in
   5! = 120 different orders (see the Statistics section for the
   same count per hand type).

5. Number of Instances: 25,010 training, 1,000,000 testing

6. Number of Attributes: 10 predictive attributes, 1 goal attribute

7. Attribute Information:
   1) S1 “Suit of card #1”
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   2) C1 “Rank of card #1”
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   3) S2 “Suit of card #2”
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   4) C2 “Rank of card #2”
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   5) S3 “Suit of card #3”
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   6) C3 “Rank of card #3”
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   7) S4 “Suit of card #4”
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   8) C4 “Rank of card #4”
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   9) S5 “Suit of card #5”
      Ordinal (1-4) representing {Hearts, Spades, Diamonds, Clubs}

   10) C5 “Rank of card #5”
      Numerical (1-13) representing (Ace, 2, 3, ... , Queen, King)

   11) CLASS “Poker Hand”
      Ordinal (0-9)

      0: Nothing in hand; not a recognized poker hand
      1: One pair; one pair of equal ranks within five cards
      2: Two pairs; two pairs of equal ranks within five cards
      3: Three of a kind; three equal ranks within five cards
      4: Straight; five cards, sequentially ranked with no gaps
      5: Flush; five cards with the same suit
      6: Full house; a pair + three of a kind of a different rank
      7: Four of a kind; four equal ranks within five cards
      8: Straight flush; straight + flush
      9: Royal flush; {Ace, King, Queen, Jack, Ten} + flush


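The attribute encoding and class definitions above can be turned into a small decoder and labeller. The sketch below is illustrative only. It assumes each record is a single line of eleven comma-separated integers in the order S1,C1,...,S5,C5,CLASS (the file layout is not stated here), and it assumes the Ace may close a straight at either end, which is consistent with the 10,200 straights listed under Statistics. The names `parse_record`, `card_name` and `classify` are hypothetical helpers, and the sample record is made up for the example.

```python
from collections import Counter

# Suit and rank codes as listed under "Attribute Information".
SUITS = {1: "Hearts", 2: "Spades", 3: "Diamonds", 4: "Clubs"}
FACE_RANKS = {1: "Ace", 11: "Jack", 12: "Queen", 13: "King"}


def parse_record(line):
    """Split one record into five (suit, rank) pairs and the class label.

    Assumption: comma-separated integers, S1,C1,...,S5,C5,CLASS.
    """
    values = [int(v) for v in line.strip().split(",")]
    if len(values) != 11:
        raise ValueError("expected 10 predictive attributes plus CLASS")
    cards = [(values[i], values[i + 1]) for i in range(0, 10, 2)]
    return cards, values[10]


def card_name(card):
    """Human-readable name for a (suit, rank) pair."""
    suit, rank = card
    return f"{FACE_RANKS.get(rank, str(rank))} of {SUITS[suit]}"


def classify(cards):
    """Return the CLASS code (0-9) for five (suit, rank) pairs."""
    ranks = sorted(rank for _, rank in cards)
    counts = sorted(Counter(ranks).values(), reverse=True)
    flush = len({suit for suit, _ in cards}) == 1
    # Ace-high run: Ten, Jack, Queen, King, Ace (Ace is encoded as 1).
    ace_high = ranks == [1, 10, 11, 12, 13]
    # Otherwise: five distinct ranks spanning exactly five values.
    straight = ace_high or (counts[0] == 1 and ranks[4] - ranks[0] == 4)
    if flush and ace_high:
        return 9
    if flush and straight:
        return 8
    if counts[0] == 4:
        return 7
    if counts == [3, 2]:
        return 6
    if flush:
        return 5
    if straight:
        return 4
    if counts[0] == 3:
        return 3
    if counts == [2, 2, 1]:
        return 2
    if counts[0] == 2:
        return 1
    return 0


# Made-up sample record: Ten, Jack, King, Queen, Ace of Hearts, class 9.
cards, label = parse_record("1,10,1,11,1,13,1,12,1,1,9")
print([card_name(c) for c in cards], label, classify(cards))
```

Because `classify` sorts the ranks before testing them, it gives the same label for every ordering of the same five cards, even though the dataset treats each ordering as a separate instance.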
8. Missing Attribute Values: None

9. Class Distribution:

   The first percentage in parentheses is the share of the class
   within the training set. The second is its probability in the
   full domain.

   Training set:

   0: Nothing in hand, 12493 instances (49.95202% / 50.117739%)
   1: One pair, 10599 instances (42.37905% / 42.256903%)
   2: Two pairs, 1206 instances (4.82207% / 4.753902%)
   3: Three of a kind, 513 instances (2.05118% / 2.112845%)
   4: Straight, 93 instances (0.37185% / 0.392465%)
   5: Flush, 54 instances (0.21591% / 0.19654%)
   6: Full house, 36 instances (0.14394% / 0.144058%)
   7: Four of a kind, 6 instances (0.02399% / 0.02401%)
   8: Straight flush, 5 instances (0.01999% / 0.001385%)
   9: Royal flush, 5 instances (0.01999% / 0.000154%)

   The Straight flush and Royal flush hands are not as representative of
   the true domain because they have been over-sampled. The Straight flush
   is 14.43 times more likely to occur in the training set than in the
   full domain, while the Royal flush is 129.82 times more likely.

   Total of 25,010 instances in a domain of 311,875,200.

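The over-sampling factors quoted for the training set are the ratio of a class's share in the training set to its share in the full domain. A minimal sketch of that calculation follows, using the exact domain counts from the Statistics section rather than the rounded percentages; the function name is illustrative.

```python
DOMAIN_SIZE = 311_875_200    # ordered five-card hands in the full domain
TRAINING_SIZE = 25_010       # instances in the training set


def oversampling_factor(train_count, domain_count):
    """Share of a class in the training set divided by its domain share."""
    train_share = train_count / TRAINING_SIZE
    domain_share = domain_count / DOMAIN_SIZE
    return train_share / domain_share


straight_flush = oversampling_factor(5, 4_320)
royal_flush = oversampling_factor(5, 480)
print(f"Straight flush: {straight_flush:.2f}")
print(f"Royal flush:    {royal_flush:.2f}")
```

With exact counts the Straight flush factor agrees with the 14.43 quoted above, while the Royal flush factor comes out near 129.90. The quoted 129.82 appears to follow from dividing by the rounded probability 0.000154%.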
   Testing set:

   The value in parentheses indicates how well each class is represented
   in the test set relative to the entire domain. A value of 1.0 is
   perfect representation; values below 1.0 are under-represented and
   values above 1.0 are over-represented.

   0: Nothing in hand, 501209 instances (1.000063)
   1: One pair, 422498 instances (0.999832)
   2: Two pairs, 47622 instances (1.001746)
   3: Three of a kind, 21121 instances (0.999647)
   4: Straight, 3885 instances (0.989897)
   5: Flush, 1996 instances (1.015569)
   6: Full house, 1424 instances (0.988491)
   7: Four of a kind, 230 instances (0.957934)
   8: Straight flush, 12 instances (0.866426)
   9: Royal flush, 3 instances (1.948052)

   Total of one million instances in a domain of 311,875,200.


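The representation values for the testing set can be reproduced by dividing each class's share of the one million test instances by its domain probability. The sketch below does this for three classes, using the rounded probabilities listed in this file; the dictionary and function names are illustrative.

```python
TEST_SIZE = 1_000_000

# (test-set count, domain probability) copied from this file.
classes = {
    "Nothing in hand": (501_209, 0.50117739),
    "Four of a kind": (230, 0.0002401),
    "Royal flush": (3, 0.00000154),
}


def representation(count, probability):
    """Observed share in the test set relative to the domain probability."""
    return (count / TEST_SIZE) / probability


ratios = {name: representation(c, p) for name, (c, p) in classes.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.6f}")
```

Rare classes swing widely for small absolute differences: three Royal flush instances instead of the one or two expected is enough to give a ratio close to 2.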
10. Statistics

    Poker Hand       # of hands  Probability  # of combinations
    Royal Flush               4  0.00000154                 480
    Straight Flush           36  0.00001385                4320
    Four of a kind          624  0.0002401                74880
    Full house             3744  0.00144058              449280
    Flush                  5108  0.0019654               612960
    Straight              10200  0.00392465             1224000
    Three of a kind       54912  0.02112845             6589440
    Two pairs            123552  0.04753902            14826240
    One pair            1098240  0.42256903           131788800
    Nothing             1302540  0.50117739           156304800

    Total               2598960  1.0                  311875200

    The number of combinations is the number of instances in the entire
    domain. Because card order matters, it equals the number of hands
    multiplied by 5! = 120.

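The hand counts in the Statistics table follow from standard counting arguments, and the last column is each count times 5!. The following sketch recomputes the table as a cross-check; the counting formulas are the usual textbook ones and are not taken from this file.

```python
from math import comb, factorial

TOTAL = comb(52, 5)  # unordered five-card hands

hands = {
    "Royal Flush": 4,                                   # one per suit
    "Straight Flush": 10 * 4 - 4,                       # minus royal flushes
    "Four of a kind": 13 * 48,
    "Full house": 13 * comb(4, 3) * 12 * comb(4, 2),
    "Flush": 4 * comb(13, 5) - 40,                      # minus straight/royal flushes
    "Straight": 10 * 4**5 - 40,                         # minus straight/royal flushes
    "Three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4**2,
    "Two pairs": comb(13, 2) * comb(4, 2) ** 2 * 44,
    "One pair": 13 * comb(4, 2) * comb(12, 3) * 4**3,
}
# Everything not covered by a named hand.
hands["Nothing"] = TOTAL - sum(hands.values())

for name, count in hands.items():
    ordered = count * factorial(5)  # orderings of the same five cards
    print(f"{name:16}{count:>9}  {count / TOTAL:.8f}  {ordered:>11}")
```

The ordered counts sum to 311,875,200, the domain size quoted throughout this file.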