Reinventing Keyboard Layout: Simulated Annealing Optimization

I got the chance to work on a project with a simple premise: if we didn’t already have the current keyboard layout, what would the best layout be?

To determine the best layout, we were given an objective function that, given a layout, returns the expected words per minute (wpm). The objective function takes into account the bigram frequencies of letters; since the assignment was given in Finland, it uses the bigram frequencies of the Finnish language. At this point, the problem turned into an optimization problem.


Moving on to optimization algorithms, I surveyed several of them (including hill climbing, genetic algorithms, gradient descent, and ant colony optimization), but I was most comfortable with simulated annealing [1].

The main idea of any local-search optimization algorithm is that it explores the neighborhood of a randomly chosen initial solution: if it finds a better solution, it moves to it; if not, it stays put. This usually gets the approach stuck in a local maximum, since the algorithm never leaves that neighborhood for a more distant one where the global maximum might be. The advantage of simulated annealing is that it injects just the right amount of randomness to overcome the local maximum problem. A layout is defined as a string of 33 characters, with 4 empty spots (represented by dashes) and a space (also treated as a dash). Here is the algorithm:

  1. First, generate a random solution
  2. Calculate wpm using the objective function given
  3. Generate a random neighboring solution (change one character randomly)
  4. Calculate the new solution’s wpm
  5. Compare them:
    • If wpm_new > wpm_old: move to the new solution
    • If wpm_new < wpm_old: maybe move to the new solution
  6. Repeat steps 3-5 above until an acceptable solution is found or you reach some maximum number of iterations.
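The loop above can be sketched in Python. This is my own minimal sketch, not the assignment’s code: the provided objective function is stood in for by a generic `score` callable, and “change one character randomly” is interpreted here as swapping two random positions so the layout stays a permutation of the same characters.

```python
import math
import random

def random_neighbor(layout):
    """Swap two random positions: one way to 'change one character randomly'
    while keeping the layout a permutation of the same characters."""
    i, j = random.sample(range(len(layout)), 2)
    chars = list(layout)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)

def simulated_annealing(initial, score, t_start=1.0, k=0.9, t_min=1e-6,
                        layouts_per_t=100):
    """Maximize score(layout); score plays the role of the wpm objective."""
    current, current_score = initial, score(initial)
    t = t_start
    while t > t_min:
        for _ in range(layouts_per_t):
            candidate = random_neighbor(current)
            cand_score = score(candidate)
            if cand_score > current_score:
                accept = True  # always move to a better solution
            else:
                # acceptance probability e^((wpm_new - wpm_old) / T)
                accept = math.exp((cand_score - current_score) / t) > random.random()
            if accept:
                current, current_score = candidate, cand_score
        t *= k  # cool down at the end of each temperature level
    return current, current_score
```

The `accept` branch is the “maybe” from step 5, explained next.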

The algorithm is pretty straightforward until it comes to the “maybe” part. If the new solution performs worse than the old one, we will “maybe” move to it anyway. This uncertainty is what injects the randomness and prevents getting stuck in local maxima. To decide, we calculate ap (the acceptance probability): if it is bigger than a random value from 0 to 1, we move to the new solution. ap is calculated as follows:

ap = e^((wpm_new - wpm_old) / T)

You are probably now wondering what the heck T is. T is the temperature: a value that reflects how far along the run you are. Usually you start it at 1, then decrease it at the end of each iteration by multiplying by a constant k, typically between 0.8 and 0.99.

So, to sum it up, we replace the comparison in step 5 of the algorithm with calculating the acceptance probability: if ap > random[0, 1], move to the new solution.

Let’s analyze the ap value:

  1. If the new wpm is better than the old wpm: the exponent is always positive, so ap is greater than 1, which means it is always greater than the random value, and we take the new solution.
  2. If the new wpm is worse than the old wpm: the exponent is negative, and the value of T determines the chance of taking the new solution. When T is high, ap is larger, so the chance of accepting the new (worse) solution is higher; as T gets smaller, the chance of moving to a worse solution decreases.
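To make the effect of T concrete, here is a quick check (my own illustrative numbers, not from the assignment) of ap for a move that is 0.5 wpm worse, across a few temperatures:

```python
import math

delta = -0.5  # wpm_new - wpm_old for a slightly worse layout
for t in (1.0, 0.5, 0.1, 0.01):
    ap = math.exp(delta / t)  # acceptance probability at this temperature
    print(f"T={t}: ap={ap:.3g}")
```

Early on (T near 1) this slightly worse layout is still accepted about 61% of the time; by the time T has cooled to 0.01, the probability is effectively zero.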

I ran the optimization algorithm with the following parameters: T_start = 1, T_next = T * 0.9, T_min = 0.000001. For each value of T, I tried 100 layouts.
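These parameters also predict the loop count. A quick back-of-the-envelope calculation shows T falls below T_min after about 132 cooling steps, so at 100 layouts per temperature the run evaluates roughly 13,000 layouts:

```python
import math

t_start, t_min, k, layouts_per_t = 1.0, 1e-6, 0.9, 100

# how many multiplications by k until T drops below t_min
cooling_steps = math.ceil(math.log(t_min / t_start) / math.log(k))
total_layouts = cooling_steps * layouts_per_t
print(cooling_steps, total_layouts)  # → 132 13200
```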

My initial results were this keyboard (where A and O are special Finnish characters):

Selected Layout: ‘qyvuikrpf__zhAlsnjdc__wOmtaeogbx_’

q y v u i k r p f _ _
z h A l s n j d c _ _
w O m t a e o g b x _

This layout gave a wpm score of 40.9306 words per minute. I thought I had reached a global maximum, as you can see from the graph:

[Graph: wpm over loops]
After ~13k loops and 14 minutes, I thought I made it!

That was not it. It turns out that in my approach I visit some solutions again, since I am not keeping track of the layouts (neighborhoods) I have visited before. Enter tabu search [2]. Tabu search keeps a tabu list of previously visited solutions so that they are not visited again. I ran my algorithm again, this time with tabu search. The new algorithm ran as follows:

  1. First, generate a random solution
  2. Calculate wpm using the objective function given
  3. Generate a random neighboring solution (change one character randomly)
  4. Check if the solution is in the tabu list; if yes, go back to step 3, if no, add it to the tabu list
  5. Calculate the new solution’s wpm
  6. Compare them:
    • If wpm_new > wpm_old: move to the new solution
    • If wpm_new < wpm_old: maybe move to the new solution
  7. Repeat steps 3-6 above until an acceptable solution is found or you reach some maximum number of iterations (or the minimum value of T).

I set the maximum size of the tabu list to 1000, to give the algorithm an opportunity to revisit some old solutions in case it gets stuck. If the list exceeds this size, the first (oldest) added layout is removed.
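One way to sketch such a bounded tabu list in Python (the class and its names are mine, not from the assignment) is a deque for eviction order plus a set for fast membership checks:

```python
from collections import deque

class TabuList:
    """Remember the last `maxsize` visited layouts."""
    def __init__(self, maxsize=1000):
        self.order = deque()   # layouts in insertion order, oldest first
        self.seen = set()      # same layouts, for O(1) membership tests
        self.maxsize = maxsize

    def __contains__(self, layout):
        return layout in self.seen

    def add(self, layout):
        if layout in self.seen:
            return
        self.order.append(layout)
        self.seen.add(layout)
        if len(self.order) > self.maxsize:
            # evict the first (oldest) added layout
            self.seen.discard(self.order.popleft())
```

In the loop, step 4 then becomes: `if candidate in tabu: continue` followed by `tabu.add(candidate)`.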

Guess what: this worked! I got a better wpm and a whole new layout in fewer loops and less time.

Time/Loops: 10 mins, 7901 loops

New wpm: 40.9970

New Layout: qfgvasoAyw__xprkiejhO__zbutlnmdc_

q f g v a s o A y w _
_ x p r k i e j h O _
_ z b u t l n m d c _
New WPM, New Layout!

[1] The Simulated Annealing Algorithm

[2] Simulated Annealing and Tabu Search
