1

The code below is adapted from this implementation.

from math import floor

basehash = hash

class IHT:
    "Structure to handle collisions"
    def __init__(self, sizeval):
        self.size = sizeval
        self.overfullCount = 0
        self.dictionary = {}

    def count (self):
        return len(self.dictionary)

    def getindex (self, obj, readonly=False):
        d = self.dictionary
        if obj in d: return d[obj]
        elif readonly: return None
        size = self.size
        count = self.count()
        if count >= size:
            if self.overfullCount==0: print('IHT full, starting to allow collisions')
            self.overfullCount += 1
            return basehash(obj) % self.size
        else:
            d[obj] = count
            return count

def hashcoords(coordinates, m, readonly=False):
    if type(m)==IHT: return m.getindex(tuple(coordinates), readonly)
    if type(m)==int: return basehash(tuple(coordinates)) % m
    if m==None: return coordinates


def tiles(ihtORsize, numtilings, floats, ints=[], readonly=False):
    """returns num-tilings tile indices corresponding to the floats and ints"""
    qfloats = [floor(f*numtilings) for f in floats]
    Tiles = []
    for tiling in range(numtilings):
        tilingX2 = tiling*2
        coords = [tiling]
        b = tiling
        for q in qfloats:
            coords.append( (q + b) // numtilings )
            b += tilingX2
        coords.extend(ints)
        Tiles.append(hashcoords(coords, ihtORsize, readonly))
    return Tiles

if __name__ == '__main__':
    tc=IHT(4096)
    tiles = tiles(tc, 8, [0, 0.5], ints=[], readonly=False)
    print(tiles)

I'm trying to figure out how the function tiles() works. It implements tile coding, which is explained in "Reinforcement Learning: An Introduction" (2020) Sutton and Barto on page 217.

So far I've figured out:

  • qfloats rescales the floating numbers to be the largest integer less than or equal to the original floating number, these are then re-scaled by the number of tilings.

  • Then, for each tiling, a list is created, the first element of the list is the tiling number, followed by the coordinates of the floats for that tiling, i.e. each of the re-scaled numbers is offset by the tiling number, b and integer divided by numtilings.

  • Finally, hashcoords first checks the dictionary d to see if this list has appeared before, if it has it returns the number relating to that list. It not it either creates a new entry with that list at the key and the count as the value or if the count is more than or equal to the size it adds one to overfullCount and returns basehash(obj) % self.size.

I'm struggling to understand two parts:

  1. What is the tilingX2 part doing?

  2. Why is tilingX2 added to b after the first coordinate has been calculated? It seems to me that each coordinate should be treated separately

  3. And why by a factor of 2?

  4. What is the expression basehash(obj) % self.size doing? I'm quite new to the concept of hashing. I know that, generally, they create a unique number for a given input (up to a limit), but I'm really struggling to understand what is going on in the line above.

nbro
  • 39,006
  • 12
  • 98
  • 176
Bazman
  • 111
  • 2
  • 1
    Hello. Welcome to AI SE :) Your post was not properly formatted, so I took the time to format it properly. Please, take a look at https://ai.stackexchange.com/help/formatting and at the current version of your post so that you know how a "good-looking" posts looks :) Take also a look at this https://ai.meta.stackexchange.com/q/1654/2444 if you have some time. – nbro Jan 21 '21 at 02:22
  • 1
    To partially answer your 4th question. The operator `a % b` is the [modulo operator](https://www.khanacademy.org/computing/computer-science/cryptography/modarithmetic/a/what-is-modular-arithmetic), i.e. it returns the remainder of the _integer_ division between `a` and `b`. So, say that you have `a = 4` and `b = 3`, then `a % b = 4 % 3 = 1`, because `3 * 1 + 1 = 4`. – nbro Jan 21 '21 at 02:33
  • 1
    So, basically, `basehash(obj) % self.size` is a number that is ensured to never be greater or equal to `self.size`. So, if you have an list of size `self.size`, you can index it with `basehash(obj) % self.size` and be sure you will not get an `IndexError`. – nbro Jan 21 '21 at 02:33

0 Answers0