Path-finding Webapp Part 2: Back-end

This project was focused around developing a few different algorithms for path-finding. By following a series of directions and rules, an algorithm could search across an area (grid) and find the end. This project also focused on the development of an application where a user can create a maze on an interactive gird, choose and algorithm, and send it back for processing. This post goes into depth the algorithms and functions used in the back end of the application.


Endpoint:

Once the data is sent to the endpoint, a numpy array is created with the 2d array. Depending on the algorithm that was chosen, the respective function will be called with the numpy array being passed to the function. The start and end arguments are set to (29,0), and (0,29). In the future, users will be able to set the start and end points, and this data will be sent in the post request.


@application.route("/pathfinding_receiver",methods=["POST"])
def worker():
    if request.method == "POST":
        data = request.json['data']
        algorithm = request.json["algorithm"]
        data_np = np.array(data,dtype=np.int)
        if algorithm == "breadth_first_search":
            visited_list,path_list = breadth_first_search(data_np,start=(29,0),end=(0,29))
        elif algorithm == "depth_first_search":
            visited_list = depth_first_search(data_np, start=(29, 0), end=(0, 29), stack=[], visited_list=[], dead_end_list=[])
            path_list = [[29,0]]
        elif algorithm == "a_star_search":
            visited_list,path_list = a_star_search(data_np, start=(29,0),end=(0,29))

        return json.dumps({"status":"OK","visited_list":visited_list,"path_list":path_list})


Breadth-First Algorithm:

If the breadth-first algorithm was chosen in the drop-down, then the breadth_first_search function will be called. The function first creates a query list and a visited list. Next, a movement dictionary is created to provide the appropriate calculations when looking for child cells (neighboring cells). Finally, a node dictionary is created to house information of each cell, specifically their parent cells. This is later used in creating the shortest pathway.


q = list()
visited_list = list()

move_dict = {
    "north": [-1, 0],
    "east": [0, 1],
    "south": [1, 0],
    "west": [0, -1],
}

node_dict = dict()

While the query list is true (the list has at least one element) the current cell in the list is set as the parent cell. It is then appended to the visited list and a loop is created to loop through the elements in the movement dictionary. This will allow each surrounding cell to be gathered by taking the value of the element (i.e. calculation values) and adding them to the values of the parent cell. By adding those values to the parent cell, a child cell is found. So if the parent cell is at (1,0), and the north cell is one above the parent cell, then the calculation would be: north_cell =( 1 - 1,0 + 0). However, this would prove problematic with cells on the edges of the grid.


q.append(start)
while q:
    parent = q.pop(0)
    visited_list.append(parent)

    for key, value in move_dict.items():
        next_cell = (parent[0] + value[0], parent[1] + value[1])
        ...

Several checks would be included to prevent several problems from occurring. These checks would be supporting functions that would either check if the potential child cell was still within the bounds of the grid, or if one of the values of the cell was negative. The latter problem was unique since in python, [-1] would refer to the same element as if the last position was used instead (e.g if 5 was the last position, then [-1] = [5]). For this reason, it seemed simpler to keep everything in a positive context, including the position of the start and end cells.


while q:
  
  ...

  for key, value in move_dict.items():
        next_cell = (parent[0] + value[0], parent[1] + value[1])

    if cell_length_check(next_cell, grid) and cell_lists_check(next_cell,q,visited_list) and grid[next_cell] == 0 and cell_check_negative(next_cell):
          try:
              g = grid[next_cell]
          except IndexError:
              pass
  ...


def cell_length_check(next_cell, grid):
    if next_cell[0] < len(grid) and next_cell[1] < len(grid):
        return True


def cell_lists_check(next_cell, q, visited_list):
    if next_cell not in visited_list and next_cell not in q:
        return True


def cell_check_negative(next_cell):
    if next_cell[0] >= 0 and next_cell[1] >= 0:
        return True


If the potential child cell met all the checks, then it would be appended in the queue list and the node dictionary. The current parent was recorded in the node dictionary. Note, a try and except block was used to attempt to create a variable based on the next cells location within the grid. The checks and the try and except block may be redundant and thus unnecessary.


while q: 
   ...
  
    q.append(next_cell)
    node_dict[next_cell]={"parent":parent}
    ...

A final check was done to see if the end cell was in the visited list if it was then a 2d array of the visited list was recreated. A pathway list was created by using a function that would take the node dictionary, and the start and endpoints. The function would append the ending cell’s position and simply iterate over the node dictionary appending each cell's parent’s position. This would be returned as a list. Once the lists were created, they would be returned, ending the processing.


while q:

  ...
  
    if end in visited_list:
        visited_list_altered = create_2d_array(visited_list)
        path_list = shortest_path(node_dict,start,end)
    
        return visited_list_altered,path_list


Depth-First Algorithm:

The second algorithm that was developed was based on the depth-first search algorithm. The algorithm was designed to be recursive, arguments include a grid, start, and ending points, stack list, visited list, and a dead-end list.


def depth_first_search(grid, start, end, stack, visited_list, dead_end_list):


First, a movement dictionary was defined which allowed for cells north, east, south, and west of the parent cell to be considered. Next, a node dictionary was defined to capture information about a node such as its parent and children. Next, if the stack list is empty (which by default on the first run is should be empty), the starting point is appended to the list.


move_dict = {
    "north": [-1, 0],
    "east": [0, 1],
    "south": [1, 0],
    "west": [0, -1],
}

node_dict = {
    "node":{
        "parent":None,
        "childern":[],
    }
}

if len(stack) == 0:
    stack.append(start)

While the stack list contains and element, the parent key of the node dictionary is set to the last element in the stack list. Next, the last element in the stack list is also appended to the visited list. An if-statement is used to catch if the ending point is in the node dictionary; if it is then the visited list is returned.

while stack:
    node_dict["node"]["parent"] = stack[-1]
    print(f"node: {stack[-1]}")
    visited_list.append(stack[-1])
    if node_dict["node"]["parent"] == end:
        return visited_list
    ... 

Next, a loop is created to iterate over the items in the movement dictionary ( to capture the cells around the parent cell). The child cell is calculated based on the index of the parent cell added to the calculation values found in the movement dictionary. Once the child cell is calculated, a series of checks are created to ensure the child cell's location meets certain parameters. This check is similar to the above checks a cell in the breadth-first search function would undergo. One difference in the checks is to see if the cell is not in the dead-end list.


while stack:

    ...  
  
    for key,value in move_dict.items():
        child = (node_dict["node"]["parent"][0] + value[0],node_dict["node"]["parent"][1] + value[1])
        print(child)
        if child not in stack and child not in dead_end_list and cell_length_check(child,grid) and grid[child] == 0 and cell_check_negative(child):
            try:
                g = grid[child]
            except IndexError:
                pass
            node_dict["node"]["childern"].append(child)

            ...


Once the child cell has passed all the checks, and the child cell could be found within the grid, the child cell would be appended to the children key in the node dictionary.


while stack:
    ...
 
    node_dict["node"]["childern"].append(child)

    ...


Next, another loop was created to iterate over all the child cells in the node dictionary. First, the child cell would be appended to the stack, and the function would call itself to repeat the process. If the end was within the visited list, then the visited list would be returned. Otherwise, if the loop manages to iterate over all the child cells, then the parent cell is appended to the dead-end list and the visited list is returned. This function does not find the shortest path, but rather if a path to the end exists or not.


while stack:

  ...
  
  for child_node in node_dict["node"]["childern"]:
      stack.append(child_node)
      depth_first_search(grid, start, end, visited_list=visited_list, stack=stack, dead_end_list=dead_end_list)
      if end in visited_list:
          return visited_list
      stack.remove(child_node)
  dead_end_list.append(node_dict["node"]["parent"])

return visited_list
  


A* Algorithm:

The final algorithm that was used in this project was the A* search algorithm. This was a function that uses an algorithm that could think a bit more on how to move towards the end by considering the cost it takes to get to a neighboring cell, and a heuristic. The function accepts the grid, and the starting and ending points. Several nested functions were created to calculate the heuristic in different ways. These functions include the formulas for Euclidean distance, Manhattan distance, and diagonal distance. In a future update, users will be able to change the heuristic; the diagonal distance formula is currently the default.


def euclidean_distance(child_node,end):
    return math.sqrt((pow(child_node[0]-end[0],2))+(pow(child_node[1]-end[1],2)))

def manhattan_distance(child_node,end):
    return (abs(child_node[0] - end[0]) + abs(child_node[1]-end[1])) * 10

def diagnonal_distance(child_node,end):
    dx = abs(child_node[0] - end[0])
    dy = abs(child_node[1] - end[1])

    return 10 * (dx+dy) + (14 - 2 * 10) * min(dx,dy)


This was chosen based on how the algorithm is allowed to move, which is in the eight cardinal directions within a grid. First, two different cost variables are created which represent the cost required for one cell to move to another. By default, cells straight across are set to cost 10, while cells that are diagonal cost 14. A node dictionary is defined to store node information; this includes the node itself, the node’s parent, the g value (the cost to get to another cell), the h (heuristic), and f (g + h). The f value will ultimately determine how cost-efficient it is to move to a cell. An open, closed, and path list is defined. These lists will act similarly to the queue list/stack list and the visited list found above. Another movement dictionary is defined; however, four new directions are included, along with a cost key that holds the appropriate cost value (10 or 14).


str_line_cell_cost = 10
diag_line_cell_cost = 14

open_list = list()
closed_list = list()
closed_list_full = list()
path_list = list()

move_dict = {
    "north_west": {
        "calc": [-1, -1],
        "cost": diag_line_cell_cost
    },
    "north": {
        "calc": [-1, 0],
        "cost": str_line_cell_cost
    }, 

    ...

    "south_east":{
        "calc":[1, 1],
        "cost":diag_line_cell_cost
    },



A new node dictionary is created for the starting point, with the g value set to 0 and the parent, h, and f set to None.


open_list.append({
    "node": start,
    "parent": None,
    "g": 0,
    "h": None,
    "f": None
})



While the open list contains an element, a variable called parent is set to the value of the first element in the open list (which is a dictionary). This element's node value is appended to the closed list, which means that the node was essentially visited.


while open_list:
    parent = open_list[0]
    closed_list.append(parent["node"])
    closed_list_full.append(parent)

    ...



Next, a loop is created to iterate over the values in the movement dictionary. The child cell is calculated in the same method found in the other pathfinding functions and undergoes the same series of checks.


while open_list:

    ...

    for key,value in move_dict.items():
        child = (parent["node"][0] + value["calc"][0],parent["node"][1] + value["calc"][1])

        



If the child cell passes the checks, then g is calculated as the cost of the parent cells g value added to child cells g cost. The parent cell’s g value is the summation of all previous g values. Next, h is calculated by calling the diagonal distance function which is: insert equation. Next, f is the g added to h. The calculated values, child cells, and the parent cell are set as values for the node dictionary.

while open_list:

    ...


        if cell_length_check(child,grid) and cell_check_negative(child) and grid_cell_value(child, grid):
                if child not in closed_list and child not in open_list:
                    g = parent["g"] + value["cost"]
                    h = diagnonal_distance(child,end)
                    f = g+h
            
                    node_dict={
                        "node":child,
                        "parent": parent["node"],
                        "g":g,
                        "h":h,
                        "f":f
                    }
                  ...



Next, a loop and a series of conditionals are used to check if the current child cell is in the open list. If the child cell is found to be equal to a child cell in the open list, then the parent cells are check. If they are not equal then this means that a child cell is currently being considered as a child to multiple parent cells. To rectify this, the g values are a check against both child cells. If the current child cell’s g value is less then the child cell’s g value found in the list, then the child cell found in the list is overwritten with the current child cell. This effectively “re-parents” the child cell. This essentially means that from the current parent cell to the current child cell has a lower cost compared to the previous parent cell. If the child cell is not found in the open list, then it is appended to the open list.


while open_list:

  ...
  
  skip_append = False
  for index,item in enumerate(open_list):
      if item["node"] == node_dict["node"]:
          skip_append = True
          if item["parent"] != node_dict["parent"]:
              if node_dict["g"] < item["g"]:
                  open_list[index] = node_dict
                  break
  
  if skip_append == False:
      open_list.append(node_dict)
  ...



If the child cell is found to be the ending point, then the current node dictionary is appended to the closed list. The ending point is appended to the path list, and a current node variable is set to the ending point. The closed list is iterated on; if a node in the closed list is equal to the current node and is not equal to the start position, then the node is appended to the path list. This essentially travels through each cell and their associated parent, from the end position to the start position. The closed list and the path list are then returned.


while open_list:

...
      
      if child == end:
          closed_list_full.append(node_dict)
          path_list.append(end)
          current_node = end
          for item in closed_list_full[::-1]:
              if item["node"] == current_node:
                  if item["node"] == start:
                      break
                  else:
                      path_list.append(item["parent"])
                      current_node = item["parent"]
      
          return closed_list,path_list



If the child cell is not equal to the end, then the first element of the opened list is popped, and the open list is sorted based on all the f values, from least to greatest. If the open list runs out of elements, then the current closed list is returned. 


Conclusion:

This was a fun project to run through the process of reading an algorithm on paper and put it into practice using python. The part I found the most interesting was creating an application that could accept an input based on a visual created from the user and funnel it to the back-end for processing. I do have plans to revisit the project later to refactor code, add more algorithms, and add more features to the application. 


Comments: