numpy - Python method changing all key-value pairs in dictionary


I am in the process of building a little Python script that learns to play tic-tac-toe. The process is to store each move made in a game and score the move based on whether it leads to a winning outcome. I will then attempt to train on many rounds of play.

My problem lies in the update_weights() method. I expect it to take the stored moves (accessed from the board object and represented as a list of [row, col] pairs) and iterate through that list of moves. The method should then reference the board's stored weights (a dictionary of (3,3) NumPy arrays) and update the corresponding weight for the appropriate move.

E.g., assume a win occurred. In the winning sequence, move #2 was at board position [0,1]. The method should access the dictionary of weights (the keys are the move #) and multiply position [0,1] of that array by a factor of 1.05.

The problem is that the method is changing all of the arrays in the weight dictionary, not just the one associated with the correct move # key. I can't figure out how this is happening.

    import numpy as np
    import random


    class ttt_board():

        def __init__(self):
            self.board_state = np.array([[0,0,0],[0,0,0],[0,0,0]])
            self.board_weight = self.reset_board_weights()
            self.moves = []

        def reset_board_weights(self):
            board_weight_instance = np.zeros((3,3))
            board_weight_instance[board_weight_instance >= 0] = 0.5

            board_weight = {0: board_weight_instance,
                            1: board_weight_instance,
                            2: board_weight_instance,
                            3: board_weight_instance,
                            4: board_weight_instance}

            return board_weight

        def reset_board(self):
            self.board_state = np.array([[0,0,0],[0,0,0],[0,0,0]])

        def reset_moves(self):
            self.moves = []

        def is_win(self):
            board = self.board_state

            if board.trace() == 3 or np.flipud(board).trace() == 3:
                return True
            for i in range(3):
                if board.sum(axis=0)[i] == 3 or board.sum(axis=1)[i] == 3:
                    return True
            else:
                return False

        def is_loss(self):
            board = self.board_state

            if board.trace() == 12 or np.flipud(board).trace() == 12:
                return True
            for i in range(3):
                if board.sum(axis=0)[i] == 12 or board.sum(axis=1)[i] == 12:
                    return True
            else:
                return False

        def is_tie(self):
            board = self.board_state
            board_full = True
            for i in range(len(board)):
                for k in range(len(board)):
                    if board[i][k] == 0:
                        board_full = False
            if board_full and not self.is_win() and not self.is_loss():
                return True
            else:
                return False

        def update_board(self,player,space):
            #takes player 1 or 4
            #takes space as a list, e.g. [0,0]
            self.board_state[space[0],space[1]] = player

            if player == 1:
                self.store_move(space)
            return

        def get_avail_spots(self):
            avail_spots = []
            board = self.board_state
            for i in range(len(board)):
                for k in range(len(board)):
                    if board[i][k] == 0:
                        avail_spots.append([i,k])
            return avail_spots

        def gen_next_move(self):
            avail_spots = self.get_avail_spots()
            move = random.randrange(len(avail_spots))
            return avail_spots[move]

        def update_weights(self,win):
            moves = self.moves
            if win:
                factor = 1.05
            else:
                factor = 0.95
            for i in range(len(moves)):
                row = moves[i][0]
                col = moves[i][1]
                old_weight = self.board_weight[i][row,col]
                new_weight = old_weight*factor
                self.board_weight[i][row,col] = new_weight
            return

        def store_move(self,move):
            self.moves.append(move)
            return


    if __name__ == '__main__':

        board = ttt_board()

        while not board.is_win() and not board.is_loss() and not board.is_tie():
            try:
                board.update_board(1,board.gen_next_move())
                board.update_board(4,board.gen_next_move())
            except ValueError:
                break

        if board.is_win():
            board.update_weights(1)
            print('player 1 wins: {w}'.format(w=board.is_win()))
        elif board.is_loss():
            board.update_weights(0)
            print('player 2 wins: {l}'.format(l=board.is_loss()))
        elif board.is_tie():
            print('game ends in tie: {t}'.format(t=board.is_tie()))

        print('here is the final board')
        print(board.board_state)
        print(board.board_weight)
        print(board.moves)

As you can see from running the script, the printed dictionary of weights after a single game has identical array values for each key. I would expect each array to be changed in only one position, the one accessed through the move # key it is associated with.

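The problem is that every entry in the dictionary shares the same reference to the single board_weight_instance array, so an update through any key modifies the one array that all keys point to: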

    board_weight_instance = np.zeros((3,3))
    board_weight_instance[board_weight_instance >= 0] = 0.5

    board_weight = {0: board_weight_instance,
                    1: board_weight_instance,
                    2: board_weight_instance,
                    3: board_weight_instance,
                    4: board_weight_instance}
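To see the aliasing in isolation, here is a minimal sketch outside the class. The names `shared` and `weights` are mine, not from the question, and `np.full` is used only as a shorthand for "zeros, then fill with 0.5":

```python
import numpy as np

# One array object, stored under five different keys.
shared = np.full((3, 3), 0.5)
weights = {move: shared for move in range(5)}

# Intended to touch only move #2 ...
weights[2][0, 1] *= 1.05

# ... but every key still refers to the very same object,
# so the change is visible through all of them.
all_same_object = all(w is shared for w in weights.values())
print(all_same_object)  # True
print(weights[0][0, 1] == weights[4][0, 1])
```

The `is` check is the quickest way to confirm that two dictionary values are one object and not two equal-looking copies.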

I would build it in a dictionary comprehension, creating a new array for each element using a helper method:

    @staticmethod
    def create_element():
        board_weight_instance = np.zeros((3,3))
        board_weight_instance[:] = 0.5  # simpler method
        return board_weight_instance

    board_weight = {i: self.create_element() for i in range(0,5)}
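A standalone sketch of the same idea, written as a plain function so it runs without the class. `np.full` is my substitution for the zeros-then-assign step; the rest mirrors the helper above:

```python
import numpy as np

def create_element():
    # np.full allocates a brand-new (3, 3) array on every call.
    return np.full((3, 3), 0.5)

# The comprehension calls the helper once per key, so each key
# gets its own array.
board_weight = {i: create_element() for i in range(5)}

board_weight[2][0, 1] *= 1.05

# Only the array stored under key 2 should have changed.
changed = [i for i, w in board_weight.items() if w[0, 1] != 0.5]
print(changed)  # [2]
```

The important part is that the array is created inside the comprehension, once per key, not once before it.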

In that case, why use a dictionary when you can use a list? The keys are just the consecutive integers 0 to 4, and a list needs no hashing, so lookups are faster:

    board_weight = [self.create_element() for _ in range(0,5)]

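You can access it the same way: board_weight[i][row,col] works unchanged, because the list indices are the same integers that were used as dictionary keys.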
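Putting it together, a rough sketch of the update step on a list of independent arrays. The free functions `make_weights` and `update_weights` and the sample `moves` sequence are hypothetical stand-ins for the class methods and a real game:

```python
import numpy as np

def make_weights(n_moves=5):
    # One fresh (3, 3) array per move number.
    return [np.full((3, 3), 0.5) for _ in range(n_moves)]

def update_weights(weights, moves, win):
    # Scale the weight of each played square in the array
    # that belongs to that move number.
    factor = 1.05 if win else 0.95
    for i, (row, col) in enumerate(moves):
        weights[i][row, col] *= factor

weights = make_weights()
moves = [[1, 1], [0, 1], [2, 2]]  # made-up winning sequence
update_weights(weights, moves, win=True)

print(weights[1])  # only position [0, 1] differs from 0.5
print(weights[3])  # untouched: no fourth move was played
```

With separate arrays, move #1 at [0,1] changes only `weights[1]`, which is the behaviour the question expected.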

