Questions tagged [tic-tac-toe]

For questions about the game of tic-tac-toe (also known as noughts and crosses) in the context of artificial intelligence.

15 questions
9 votes · 3 answers

How should I represent the input to a neural network for the games of tic-tac-toe, checkers or chess?

I've been reading a lot about TD-Gammon recently as I'm exploring options for AI in a video game I'm making. The video game is a turn-based positional sort of game, i.e. a unit's, or game piece's, position will greatly impact its usefulness in that…
6 votes · 2 answers

Why does self-playing tic-tac-toe not become perfect?

I trained a DQN that learns tic-tac-toe by playing against itself with a reward of -1/0/+1 for a loss/draw/win. Every 500 episodes, I test the progress by letting it play some episodes (also 500) against a random player. As shown in the picture…
6 votes · 1 answer

What are good learning strategies for Deep Q-Network with opponents?

I am trying to find out what some good learning strategies are for a Deep Q-Network with opponents. Let's consider the well-known game tic-tac-toe as an example: how should an opponent be implemented to get good and fast improvements? Is it better to…
4 votes · 1 answer

How do we find the length (depth) of the game tic-tac-toe in adversarial search?

When we play tic-tac-toe using adversarial search, I know how to build the tree. Is there a way to find the depth of the tree, and to determine which level is the last?
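
Since the question asks directly for the tree's depth, here is a minimal brute-force sketch (illustrative code, not taken from the question) that enumerates every legal game: the deepest game fills the board at 9 plies, and the shallowest finished game ends after 5 moves.

```python
# Brute-force the full tic-tac-toe game tree and record the depth
# (number of moves from the empty board) of every terminal position.

WIN_LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(b):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for i, j, k in WIN_LINES:
        if b[i] is not None and b[i] == b[j] == b[k]:
            return b[i]
    return None

def terminal_depths(b, player, depth):
    """Yield the depth of every terminal position reachable below board b."""
    if winner(b) is not None or all(c is not None for c in b):
        yield depth
        return
    for i in range(9):
        if b[i] is None:
            b[i] = player
            yield from terminal_depths(b, 'O' if player == 'X' else 'X', depth + 1)
            b[i] = None

depths = list(terminal_depths([None] * 9, 'X', 0))
print(max(depths), min(depths))  # 9 5 -- a full-board draw vs. the fastest win
```

So the last level of the tree is level 9 (a full board), while some branches already terminate at level 5, the earliest a win is possible.
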
4 votes · 1 answer

Why is tic-tac-toe considered a non-deterministic environment?

I have been reading about deterministic and stochastic environments, when I came across an article that states that tic-tac-toe is a non-deterministic environment. But why is that? An action will lead to a known state of the game and an agent has…
3 votes · 2 answers

How can both agents know the terminal reward in self-play reinforcement learning?

There seems to be a major difference in how the terminal reward is received/handled in self-play RL vs "normal" RL, which confuses me. I implemented TicTacToe the normal way, where a single agent plays against an environment that manages the state…
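
A sketch of the usual fix (the helper name and buffer layout here are hypothetical, assuming win = +1 / loss = -1 / draw = 0): in self-play, the player who did *not* make the final move never acts again, so the terminal reward must also be written back into that player's last stored transition.

```python
# Each buffer holds (state, action, reward, next_state, done) tuples,
# one buffer per player.

def finish_episode(buffer_x, buffer_o, result):
    """result: +1 if X won, -1 if O won, 0 for a draw."""
    if buffer_x:
        s, a, _, s2, _ = buffer_x[-1]
        buffer_x[-1] = (s, a, float(result), s2, True)   # X's view of the outcome
    if buffer_o:
        s, a, _, s2, _ = buffer_o[-1]
        buffer_o[-1] = (s, a, float(-result), s2, True)  # zero-sum: O gets the negation
```

With this bookkeeping, both agents see the terminal reward from their own perspective, which is exactly what a single-agent setup gets for free from the environment.
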
3 votes · 1 answer

Given these two reward functions, what can we say about the optimal Q-values, in self-play tic-tac-toe?

This corresponds to Exercise 1.1 of Sutton & Barto's book (2nd edition), and a discussion followed from this answer. Consider the following two reward functions Win = +1, Draw = 0, Loss = -1 Win = +1, Draw or Loss = 0 Can we say something about…
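
A hedged aside on the exercise (assuming, as in the book's setup, undiscounted returns with a reward only at the terminal state): for any policy $\pi$, the two reward schemes give

$$V_\pi^{(1)}(s) = p_{\text{win}}(s) - p_{\text{loss}}(s), \qquad V_\pi^{(2)}(s) = p_{\text{win}}(s),$$

where $p_{\text{win}}$ and $p_{\text{loss}}$ are the win and loss probabilities from state $s$ under $\pi$. The second scheme is indifferent between a draw and a loss, so a policy optimal under it may fail to avoid losses from states where no win is reachable.
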
2 votes · 3 answers

What is the optimal score for Tic Tac Toe for a reinforcement learning agent against a random opponent?

I guess this problem is encountered by everyone trying to solve Tic Tac Toe with various flavors of reinforcement learning. The answer is not "always win" because the random opponent may sometimes be able to draw the game. So it is slightly less…
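
That optimal score can be computed exactly with expectimax (a sketch, not from the question; rewards win = +1 / draw = 0 / loss = -1): X maximizes expected score while O is modeled as a uniformly random player. Since a perfect X never loses, the result is strictly between 0 and 1.

```python
from functools import lru_cache

WIN_LINES = ((0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6))

def winner(b):
    for i, j, k in WIN_LINES:
        if b[i] is not None and b[i] == b[j] == b[k]:
            return b[i]
    return None

@lru_cache(maxsize=None)
def value(board):
    """Expected score for X: X plays optimally, O plays uniformly at random."""
    b = list(board)
    w = winner(b)
    if w is not None:
        return 1.0 if w == 'X' else -1.0
    moves = [i for i, c in enumerate(b) if c is None]
    if not moves:
        return 0.0                        # draw
    to_move = 'X' if len(moves) % 2 == 1  else 'O'   # X moves first
    vals = []
    for m in moves:
        b[m] = to_move
        vals.append(value(tuple(b)))
        b[m] = None
    # X picks the best move; the random O averages over all of its moves.
    return max(vals) if to_move == 'X' else sum(vals) / len(vals)

print(round(value((None,) * 9), 4))  # strictly between 0 (all draws) and 1 (always wins)
```

The printed number is the benchmark the question is after: the best achievable average score against this particular random opponent.
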
2 votes · 1 answer

Why can I still easily beat my Q-learning agent that was trained against another Q-learning agent to play tic tac toe?

I implemented the Q-learning algorithm to play tic-tac-toe. The AI plays against the same algorithm, but they don't share the same Q matrix. After 200,000 games, I still beat the AI very easily and it's rather dumb. My selection is made by epsilon…
2 votes · 1 answer

Why isn't my Q-Learning agent able to play tic-tac-toe?

I tried to build a Q-learning agent that you can play tic-tac-toe against after training. Unfortunately, the agent performs pretty poorly. It tries to win but does not try to stop me from winning, which ends with me beating the agent no matter…
1 vote · 0 answers

How do I improve my RL tic-tac-toe agent?

I have coded a neural-network-based RL tic-tac-toe agent. It trains well enough to win against random agents almost all the time; the larger the board (the code allows training on NxN boards with a winning line longer than 3), the closer to…
1 vote · 1 answer

How are rewards calculated for episodic tasks like playing chess or tic-tac-toe?

I am new to reinforcement learning and trying to understand the concept of reaping rewards during episodic tasks. I think in games like tic-tac-toe, rewards will be in terms of a win or a loss. But does that mean we need to finish the entire game to…
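
A Monte Carlo-style sketch of the standard answer (assuming a terminal-only reward): even though the only nonzero reward arrives when the game ends, every earlier move receives credit through the discounted return.

```python
# Compute the discounted return G_t for each move of one finished episode.

def returns(rewards, gamma=0.9):
    """rewards: one entry per move, typically [0, 0, ..., +1/-1/0]."""
    g, out = 0.0, []
    for r in reversed(rewards):
        g = r + gamma * g        # fold the terminal reward backwards in time
        out.append(g)
    return list(reversed(out))

print(returns([0, 0, 0, 0, 1]))  # earlier moves get geometrically discounted credit
```

So with Monte Carlo methods the episode does need to finish before updating; temporal-difference methods instead bootstrap from value estimates and can update after every move.
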
1 vote · 2 answers

Non-Neural Network algorithms for large state space in zero sum games

I was reading online that tic-tac-toe has a state space of $3^9 = 19,683$. From my basic understanding, this sounds too large to use tabular Q-learning, as the Q table would be huge. Is this correct? If that is the case, can you suggest other…
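
A quick sanity check on that premise (the base-3 encoding below is just one illustrative choice): $3^9$ states is actually tiny by tabular standards, so a plain in-memory Q-table is entirely feasible for tic-tac-toe.

```python
# Encode a board as a base-3 integer and size the full tabular Q-function.

def encode(board):
    """board: 9 entries in {0: empty, 1: X, 2: O} -> integer in [0, 3**9)."""
    s = 0
    for cell in board:
        s = s * 3 + cell
    return s

n_states = 3 ** 9                               # 19,683; many boards are unreachable
q_table = [[0.0] * 9 for _ in range(n_states)]  # one row per state, one entry per move
print(n_states)                                  # small enough for a plain table
```

At roughly 19,683 × 9 floats, the whole table fits in a couple of megabytes; non-neural alternatives only become necessary at much larger state spaces.
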
1 vote · 1 answer

How to make the RL player a perfect/expert (tic-tac-toe/chess) player?

I asked a question related to tic-tac-toe playing in RL. From the answer, it seems to me a lot is dependent on the opponent (rightly so, if we write down the expectation equations). My questions are (in the context of tic-tac-toe or chess): How to…
1 vote · 1 answer

In tic-tac-toe, what is the effect of the starting state on the state and action value function?

I am simulating a Tic-Tac-Toe game with a human opponent. The way the RL agent trains is through policy/value iteration for a fixed number of iterations, all specified by the user. Now, whether the human player has turn 1 or turn 2 will decide the…