Automating chess with my voice

A chess board is an 8x8 grid. Rows are numbered 1-8 and columns are lettered a-h, so each square can be named by combining the two, for example b5 or h3.
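
As a small illustration of this naming scheme, the sketch below builds all 64 square names in plain Python. The names FILES, RANKS, square_name, and ALL_SQUARES are made up for this example and are not part of the project code.

```python
FILES = "abcdefgh"   # columns, left to right from White's side
RANKS = "12345678"   # rows, bottom to top from White's side

def square_name(file_index, rank_index):
    """Return the name of the square at zero-based (file, rank) indices."""
    return FILES[file_index] + RANKS[rank_index]

# All 64 squares, a1 first and h8 last
ALL_SQUARES = [square_name(f, r) for r in range(8) for f in range(8)]

print(square_name(1, 4))   # b5
print(square_name(7, 2))   # h3
print(len(ALL_SQUARES))    # 64
```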

I decided that Python would be a good choice for this project because of how quickly and easily I could implement and test everything.

I used the following libraries:

python-chess - used for handling move generation, move validation, and storing the board state.
pyautogui - used for automated mouse and keyboard control on my computer.
speech_recognition - used for interfacing with the speech recognition API.
ImageGrab from PIL - used for taking screenshots and reading pixel colors.
time - used for controlling the timing of program execution.
webbrowser - used for opening my browser to the correct page.

I needed a way to know where each square of the board was displayed on the screen and which color I was playing as, so I made a calibration function to initialize everything:

import pyautogui as p
from PIL import ImageGrab
def calibrate():
  global result, res, square_size
  x_positions = []
  y_positions = []

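  # Look for the board from White's side first, then from Black's side
  board_pos = None
  for image in ("white.png", "black.png"):
    try:
      board_pos = p.locateOnScreen(image, confidence = 0.85)
    except (p.ImageNotFoundException, AttributeError):
      board_pos = None
    if board_pos is not None:
      break
  if board_pos is None:
    # Stop here: nothing below works without the board location
    print("Board not found")
    return None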

  left = board_pos.left
  top = board_pos.top
  right = left + board_pos.width
  bottom = top + board_pos.height
  square_size = board_pos.width / 8
  print(square_size)
  left -= square_size / 2
  top -= square_size / 2
  color = "black"
  for i in range(8):
      x_positions.append(left + square_size*(i + 1))

  for i in range(8):
      y_positions.append(top + square_size*(i + 1))

  for i in range(7, -1, -1):
      for j in range(7, -1, -1):
          square = letters[i] + str(numbers[j])
          positions[square] = [x_positions[len(x_positions) - i - 1], y_positions[j]]

  im = ImageGrab.grab().load()
  for squ, coor in positions.items():
    squares[squ] = im[positions[squ][0] + 55, positions[squ][1] + 55]

  for key,value in squares.items():
    if result.count(value) <= 2:
      result.append(value)
      res.append(key)

  first_color = im[positions["e4"][0], positions["e4"][1]]
  second_color = im[positions["d4"][0], positions["d4"][1]]

  return first_color, second_color
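
The geometry behind the calibration can be shown on its own. The sketch below is not the calibrate() function above; it is a simplified, pure-function version that assumes the board's bounding box is already known and that the board is square. The names square_centers and white_at_bottom are hypothetical.

```python
def square_centers(left, top, width, white_at_bottom=True):
    """Map each square name to the (x, y) pixel at its center."""
    size = width / 8
    centers = {}
    for file_index, file_letter in enumerate("abcdefgh"):
        for rank_index in range(8):
            # Column and row are counted from the top-left corner of the board
            col = file_index if white_at_bottom else 7 - file_index
            row = 7 - rank_index if white_at_bottom else rank_index
            x = left + size * (col + 0.5)
            y = top + size * (row + 0.5)
            centers[file_letter + str(rank_index + 1)] = (x, y)
    return centers

centers = square_centers(left=100, top=200, width=800)
print(centers["a1"])   # (150.0, 950.0)
print(centers["h8"])   # (850.0, 250.0)
```

When playing as Black the board is drawn flipped, so passing white_at_bottom=False mirrors both axes and a1 ends up in the top-right corner.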

After creating a new board and assigning a state, I had to figure out how to programmatically control the board. To do this, I needed to know where the board was on the screen, which color I was playing as, and what moves my opponent made.

The next step was figuring out how to get words from sound. I considered using the browser's Web Speech API, but it was not compatible with the Firefox browser I was using. Instead, I opted to use Google's API through the speech_recognition library, as it was very easy to use and only took about 10 lines of code. I defined a function that records from the microphone and sends the audio off for recognition. It returns the recognized string, or prints an error if recognition fails. It looks like this:

import speech_recognition as sr
def record():
  r = sr.Recognizer()
  with sr.Microphone() as source:
    print("Recording")
    audio = r.listen(source)
    text = None  # returned unchanged when recognition fails
    try:
      text = r.recognize_google(audio)
    except sr.UnknownValueError:
      print("Audio could not be understood")
    except sr.RequestError:
      print("Request could not be completed")
    return text
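
Speech recognition fails often enough that it helps to ask again instead of giving up. The sketch below shows one way to do that, assuming a recording function that returns the recognized text, or None when nothing usable was heard. The name listen_until_understood and its parameters are hypothetical, and a stand-in replaces the microphone so the example runs anywhere.

```python
def listen_until_understood(listen_once, attempts=3):
    """Call listen_once() until it returns text, up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        text = listen_once()
        if text:
            return text
        print(f"Attempt {attempt} failed, please repeat the move")
    return None

# Stand-in for a real recording function: fails once, then succeeds
replies = iter([None, "knight f3"])
result = listen_until_understood(lambda: next(replies))
print(result)
```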

I then needed to parse the move from the recognized text.

def parse_move(text):
  san = "" 
  pieces = {"K": ["king"],
          "Q": ["queen"],
          "R": ["rook"],
          "B": ["bishop"],
          "N": ["knight", "night"],
          "0-0": ["castle", "castles"],
          "0-0-0": ["long"],
          "": ["pawn"]
          }

  move = text.lower().split()
  print("full-text: " + text)
  if not move:
    print("No move recognized")
    return "error"

  piece = move[0]
  square = move[-1]

  # Translate the spoken piece name into its notation letter
  for key, value in pieces.items():
    if piece in value:
      san += key

  # Castling has no destination square, so handle it before reading one
  if san in ("0-0", "0-0-0"):
    try:
      return board.parse_san(san)
    except ValueError:
      print("ValueError")
      return "error"

  if len(square) < 2:
    print("IndexError")
    return "error"
  fyle = square[0]
  try:
    rank = int(square[1])
  except ValueError:
    print("Issue getting move")
    return "error"

  if fyle not in letters:
    print("Coordinate not recognized")
    return "error"

  if rank < 1 or rank > 8:
    print("Rank not recognized")
    return "error"

  print("file: " + fyle)
  print("rank: " + str(rank))

  san += fyle + str(rank)
  print(san)

  # parse_san raises ValueError for illegal or ambiguous moves
  try:
    san = board.parse_san(san)
  except ValueError:
    print("ValueError")
    return "error"
  return san
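
Recognized text does not always arrive in a tidy "piece square" form. The sketch below is one possible clean-up step that could run before parsing. It is not part of the original project, and the mishearings it handles ("for" heard instead of "four", a square split into "f 3") are assumptions about what a recognizer might return. All names here are hypothetical.

```python
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "for": "4",
                "five": "5", "six": "6", "seven": "7", "eight": "8"}
FILE_LETTERS = set("abcdefgh")
DIGITS = set("12345678")

def normalize(text):
    """Lower-case the text, turn number words into digits, join 'f 3' into 'f3'."""
    words = [NUMBER_WORDS.get(w, w) for w in text.lower().split()]
    joined = []
    for word in words:
        # Glue a lone digit onto a preceding lone file letter
        if word in DIGITS and joined and joined[-1] in FILE_LETTERS:
            joined[-1] += word
        else:
            joined.append(word)
    return " ".join(joined)

print(normalize("Knight F 3"))     # knight f3
print(normalize("bishop e for"))   # bishop e4
```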

This function validates the move and plays it, using the python-chess library to keep track of the game.

def chess_make_move(board, move):
  global game_over
  # A UCI string is origin + destination, plus a fifth character on promotion
  from_square = move.uci()[:2]
  to_square = move.uci()[2:4]
  if is_legal(board, move.uci()):
    make_move(from_square, to_square)
    board.push(move)
  else:
    if board.is_game_over():
      game_over = True
    else:
      print("Move not able to be played")
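
A UCI move string is four characters for a normal move and five when a pawn promotes, with the last character naming the new piece. The sketch below shows how such a string can be taken apart. The helper name split_uci is made up for this example.

```python
def split_uci(uci):
    """Split a UCI move string into (origin, destination, promotion)."""
    origin = uci[:2]
    destination = uci[2:4]
    promotion = uci[4:] or None   # e.g. "q" when a pawn promotes to a queen
    return origin, destination, promotion

print(split_uci("g1f3"))    # ('g1', 'f3', None)
print(split_uci("e7e8q"))   # ('e7', 'e8', 'q')
```

Only the first two parts are needed to look up screen coordinates. The promotion letter matters for the game state but is not a square.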

This function uses pyautogui to control the mouse and make the move on the board based on its location on the screen.

def make_move(origin, destination):
  p.moveTo(positions[origin][0], positions[origin][1])
  p.click()
  p.dragTo(positions[destination][0],
           positions[destination][1],
           0.35, button='left')

handle_moves() takes a screenshot, checks each square on the board, and works out the move that the opponent made. Once the move is found, it is pushed onto the 'board' object to maintain the game state.

def handle_moves():
  global res, result, move_number
  p.hotkey("command", "tab")
  im = ImageGrab.grab().load()
  p.hotkey("command", "tab")
  for squ, coor in positions.items():
    squares[squ] = im[positions[squ][0] + 50, positions[squ][1] + 50]

  for key,value in squares.items():
    if result.count(value) <= 2:
      result.append(value)
      res.append(key)
  print(res)
  li1 = res[-1]
  li2 = res[-2]
  if im[positions[li1][0], positions[li1][1]] == squares[li1]:
    move = li1 + li2
    print("move - " + move)
    try:
      board.push_uci(move)
      move_number += 1
    except ValueError:
      print("Move - " + move + " invalid")
  if im[positions[li2][0], positions[li2][1]] == squares[li2]:
    move = li2 + li1
    print("move - " + move)
    try:
      board.push_uci(move)
      move_number += 1
    except ValueError:
      print("Move - " + move + " invalid")
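
The detection idea can be tried without a screen. The sketch below is a simplified stand-in for handle_moves(), not the function itself. It assumes the site highlights the origin and destination squares of the last move, and that an empty square has the same color at its center as near its corner. Color labels replace real pixel values, and the function names are hypothetical.

```python
from collections import Counter

def highlighted_squares(corner_colors):
    """Return the squares whose corner color appears on at most two squares."""
    counts = Counter(corner_colors.values())
    return [sq for sq, color in corner_colors.items() if counts[color] <= 2]

def guess_move(corner_colors, center_colors):
    """Guess the last move as a UCI string, or None if it is unclear."""
    marked = highlighted_squares(corner_colors)
    if len(marked) != 2:
        return None
    first, second = marked
    # The origin square is now empty, so its center matches its corner
    first_empty = center_colors[first] == corner_colors[first]
    second_empty = center_colors[second] == corner_colors[second]
    if first_empty == second_empty:
        return None
    return first + second if first_empty else second + first

# Build a plain board, then highlight e2 and e4 and put a piece on e4
corner = {f + str(r): ("dark" if (i + r) % 2 == 1 else "light")
          for i, f in enumerate("abcdefgh") for r in range(1, 9)}
corner["e2"] = corner["e4"] = "highlight"
center = dict(corner)
center["e4"] = "piece"

print(guess_move(corner, center))   # e2e4
```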

The main game loop begins by initializing the board state and calibrating, then loops until the game is over, prompting the user to speak a move whenever it is their turn.
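
The loop itself is not shown here, so the following is only a sketch of the control flow described above. Every parameter is a hypothetical callable standing in for project code such as record(), parse_move(), chess_make_move(), and handle_moves(). Simple stand-ins are used so the example runs without a board or a microphone.

```python
def game_loop(my_turn, game_over, get_my_move, play_my_move, read_opponent_move):
    """Alternate between spoken moves and opponent moves until the game ends."""
    while not game_over():
        if my_turn():
            move = get_my_move()
            if move == "error":
                continue          # nothing usable was heard, so ask again
            play_my_move(move)
        else:
            read_opponent_move()

# Stand-ins: count half-moves and stop after four of them
log = []
state = {"ply": 0}
spoken = iter(["error", "e4", "knight f3"])

def advance(entry):
    log.append(entry)
    state["ply"] += 1

game_loop(
    my_turn=lambda: state["ply"] % 2 == 0,
    game_over=lambda: state["ply"] >= 4,
    get_my_move=lambda: next(spoken),
    play_my_move=lambda move: advance("me: " + move),
    read_opponent_move=lambda: advance("opponent moved"),
)
print(log)
# ['me: e4', 'opponent moved', 'me: knight f3', 'opponent moved']
```

A real loop would likely also pause between screen checks while waiting for the opponent, which this sketch leaves out.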