Skip to content

4. Custom categorical problems

Kilian Brachtendorf edited this page Jan 28, 2019 · 5 revisions

Darwin requires a little help when dealing with categorical problems, due to the fact that mutation and crossover operations aren't simply performed by summing or averaging numbers. This example will guide you through 2 different approaches to solve these. The first discussed problem will be searching for a known string starting with randomly distributed letters. In the second problem we will tackle the traveling salesman.

String construction with "Simple Individuals"

Individuals represent solutions to a given problem. It's up to us to extends a class which describes our problem and computes how good a solution a given set of variable represents.

The simple individual class already offers default implementations for discrete crossover and mutation, a quick way to get started. It is really important to note that all instances of individuals are expected to be immutable. Meaning that it's internal state does not change once the object was created. Similar to Strings or boxed primitives (Integer, Long ...) altering operations return a new instance of the class instead.

public class TextIndividualSimple extends SimpleIndividual {

	protected Individual createIndividual(Object[] values) {}

	public int getVariableCount() {}

	protected double calculateFitness() {}

	protected Object mutateValue(int index, Object oldGene, double scaleFactor) {}
}

Describing the problem

To test how good a given solution is we first need to define our target value. In our case we want to find the String "Evolution is all about passing on the genome to the next generation, adapting and surviving through generation after generation.". We are planning on finding and changing individual letters. Luckily a string can be represented as a char[] array, each char representing a single letter. The method calculateFitness() is supposed to return a value to the fitness function as we have already discussed in example 1 - 3. To be able to calculate a distance metric our individual needs to know about our target. Lets add it to our individual. We could go ahead and use a static field but for re usability and potential multi threading sake we pass the value to each individual separately for now.

public class TextIndividualSimple extends SimpleIndividual {

   private final char[] target;

   public TextIndividualSimple(char[] target, Object currentGenes) {
      super(currentGenes);
      this.target = target;
   }
	
   protected Individual createIndividual(Object[] values) {
   }

   public int getVariableCount() {
   }

   protected double calculateFitness() {
   }

   protected Object mutateValue(int i, Object object, double scaleFactor) {
   }
}

The method public int getVariableCount() {} is expected to return the number of genes/variables we are altering. Since we want to mutate each letter we are returning the length of our target string.

public int getVariableCount() {
   return target.length;
}

Fitness function

Lets handle our fitness function. We have multiple ways to define the quality of our solution. Remember that we want to return a value of 0 if we have found our perfect solution, the higher the value the worse the solution is. For strings one of the edit distance formulas is the way to go. Is a solution better than if the letters are shifted? This depends a little bit on if our crossover and mutation operations are able to create a better result from such an individual. For now we roll with a simple value which counts the number of mismatching characters. e.g. if we are looking for HelloWorld and currently have an individual representing HelloWorll this would result in a distance value of 1.

protected double calculateFitness() {
   int fitness = 0; 
   for(int i = 0; i < getVariableCount(); i++) {
      if(target[i]!=(char)this.getValue(i)) {
         fitness++;
      }
   }
   return fitness;
}

The method this.getValue(i) returns the gene at the ith position. Since we know we are working with chars we can cast it.

Mutation

Next up is the mutateValue method which will be called if a value needs to be altered during a mutation operation. The index is number of the gene we want to mutate, the oldGene is the gene currently present in this individual and the scaleFactor is a number ranging from [0 - 1] indicating how drastic of a change the value should face.

For numerical problems the scale factor can indicate the amount of change to a value (e.g. a scale factor of 1 means the new value may deviate by 10 points, a scale factor of 0.5 indicating only a maximum shift of 1), for categorical problems this value can mostly be neglected.

Remember that individuals are immutable. We do not change the gene, but rather return a new copy if the gene is altered. The variable RNG is implemented by the superclass and holds a reference to a fast thread safe high quality random number generation. Looking at the asci table we can see that chars are represented as numerical values. Generating a random int between 32 - 126 covers all potential letters in the string.

@Override
protected Object mutateValue(int index, Object value, double scaleFactor) {
   //Return a random int between 32 and 126
   return (char)(RNG.nextInt(95)+32);
}

Finishing up

Finally the createIndividual method needs to be overwritten which needs to return a new individual given the new genes. After plugging everything together we are left with the following code

public class TextIndividualSimple extends SimpleIndividual {

    private final char[] target;

    public TextIndividualSimple(char[] target, Object currentGenes) {
        super(currentGenes);
        this.target = target;
    }
    
    @Override
    protected Individual createIndividual(Object newGenes) {
        return new TextIndividualSimple(target, newGenes);
    }

    @Override
    public int getVariableCount() {
        return target.length;
    }

    @Override
    protected double calculateFitness() {
        /*
         * Return a fitness value. 0 for an optimal solution.
         */
        int fitness = 0; 
        for(int i = 0; i < getVariableCount(); i++) {
            if(target[i]!=(char)this.getValue(i)) {
                fitness++;
            }
        }
        return fitness;
    }

    @Override
    protected Object mutateValue(int index, Object value, double scaleFactor) {
        //Return a random int between 32 and 126
        return (char)(RNG.nextInt(95)+32);
    }
}

Individual

To get a bit more power and performance it's recommended that you extends the Individual class directly. As you can see you'll end up with a few more methods you have to override yourself, but are able to use primitive arrays erasing the need to constant casting operations.

public class TextIndividual extends Individual {

    @Override
    public int getVariableCount() {
    }

    @Override
    public Individual crossover(CrossoverStrategyFuzzy crossoverStrategy, Individual... crossoverParent) {
    }

    @Override
    public Individual crossover(CrossoverStrategyDiscrete crossoverStrategy, Individual... crossoverParent) {
    }

    @Override
    public Individual mutate(double probability, double scaleFactor) {
    }

    @Override
    protected double calculateFitness() {
    }

    @Override
    public <T> T getValue(int index) {
    }

    @Override
    public int hashCode() {
    }

    @Override
    public boolean equals(Object obj) {
    }
}

The target, variable count and calculate fitness method is exactly the same as before. Simply copy and paste it from the example above. Equals and hashcode can later be autogenerated. We are left with implementing the crossover, mutate and getValue methods ourselves.

First of all we need a variable holding our current genes. Remember we are working with a string so a char array will be sufficient. The getValue method is an optional method which returns the gene at the ith position allowing for charting operations to be performed.

private final char[] variable; 

@Override
public <T> T getValue(int index) {
    return (T) Character.valueOf(variable[index]);
}

Crossover

Crossover operations take existing values and simply pass them on to new children.

The fuzzy crossover operation assumes that genes can be split and combined in a meaningful way. This is true for numbers, on the other hand it does not make sense to combine 1/3rd A with 2/3 of a D therefore we decide that we don't support fuzzy crossovers. In this case simply throw an exception for the method.

@Override
public Individual crossover(CrossoverStrategyFuzzy crossoverStrategy, Individual... crossoverParent) {
    throw new UnsupportedOperationException("Does not support fuzzy crossover");
}

Implementing the discrete crossover is straightforward.

    @Override
    public Individual crossover(CrossoverStrategyDiscrete crossoverStrategy, Individual... crossoverParent) {
        //Get an array telling us which parent should deliver which gene
        int[] crossoverVector = crossoverStrategy.getCrossoverVector(crossoverParent);
        
        //The genes of the new offspring
        char[] newValues = new char[variable.length];

        //Get the corresponding gene from the parent
        for (var i = 0; i < variable.length; i++) {
            int parentIndex = crossoverVector[i];
            newValues[i] = crossoverParent[parentIndex].getValue(i);
        }

        return new TextIndividual(target, newValues);
    }

Mutation

Mutation is essentially the same as crossover but instead of taking the values of existing parents we modify the value of a current parent with a certain probability.

@Override
    public Individual mutate(double probability, double scaleFactor) {

        // The scale factor indicates how much mutation is desired. The value usually
        // starts at 1 and linearily scales down to 0 once the algorithm hits maxGenerations
        // The probability stays the same thoughout the entire run. Since we usually want less mutation
        // at the end we multiply both values. If we work with numerical mutation we can stick with the
        // probability and just mutate the desired value a little bit less

        char[] newValues = new char[variable.length];

        for (int i = 0; i < variable.length; i++) {
            if (RNG.nextDouble() <= probability*scaleFactor) {
                newValues[i] = generateRandomChar();
            } else {
                //Copy the parent value
                newValues[i] = variable[i];
            }
        }
        return new TextIndividual(target, newValues);
    }

Putting everything together

public class TextIndividual extends Individual{

    /**
     * The goal string used to compute how good our solution is.
     */
    private final char[] target;

    /**
     * The current solution of this individual
     */
    private final char[] variable;

    /**
     * Restrict this individual to use a-zA-Z letters only. This reduces the search
     * space. If false use the full range of ASCII 32 - 127
     * 
     * @see http://www.asciitable.com/
     */
    private final static boolean LIMIT_TO_LETTERS = false;

    public TextIndividual(String target, char values[]) {
        this(target.toCharArray(), values);
    }

    public TextIndividual(char target[], char values[]) {
        this.target = target;
        this.variable = values;

    }

    public char getVariable(int index) {
        return variable[index];
    }

    @Override
    public int getVariableCount() {
        return target.length;
    }

    @Override
    public Individual crossover(CrossoverStrategyDiscrete crossoverStrategy, Individual... crossoverParent) {

        // Get an array telling us which parent should deliver which gene
        int[] crossoverVector = crossoverStrategy.getCrossoverVector(crossoverParent);

        // The genes of the new offspring
        char[] newValues = new char[variable.length];

        for (var i = 0; i < variable.length; i++) {
            int parentIndex = crossoverVector[i];
            newValues[i] = crossoverParent[parentIndex].getValue(i);
        }

        return new TextIndividual(target, newValues);
    }

    @Override
    public Individual crossover(CrossoverStrategyFuzzy crossoverStrategy, Individual... crossoverParent) {
        throw new UnsupportedOperationException("Does not support fuzzy crossover");
    }

    public String getResult() {
        return new String(variable);
    }

    @Override
    public Individual mutate(double probability, double scaleFactor) {

        // The scale factor indicates how much mutation is desired. The value usually
        // starts at 1 and linearily scales down to 0 once the algorithm hits maxGenerations
        // The probability stays the same thoughout the entire run. Since we usually want less mutation
        // at the end we multiply both values. If we work with numerical mutation we can stick with the
        // probability and just mutate the desired value a little bit less

        char[] newValues = new char[variable.length];

        for (int i = 0; i < variable.length; i++) {
            if (RNG.nextDouble() <= probability * scaleFactor) {
                newValues[i] = generateRandomChar();
            } else {
                newValues[i] = variable[i];
            }
        }
        return new TextIndividual(target, newValues);
    }

    @Override
    protected double calculateFitness() {

        /*
         * A very basic fitness function. We could also use a bit more complex edit
         * distance function like levenshtein distance if our crossover or mutation
         * allows for shifting of inputs.
         */
        int distance = 0;
        for (int i = 0; i < target.length; i++) {
            if (variable[i] != target[i]) {
                distance++;
            }
        }
        return distance;
    }

    @Override
    public String toString() {
        return "TextIndividual [variable=" + new String(variable) + ", fitness=" + calculateFitness() + ", birth="
                + getBirth() + "]";
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + Arrays.hashCode(target);
        result = prime * result + Arrays.hashCode(variable);
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        TextIndividual other = (TextIndividual) obj;
        if (!Arrays.equals(target, other.target))
            return false;
        if (!Arrays.equals(variable, other.variable))
            return false;
        return true;
    }

    /**
     * Optional method allowing to output the individual to a csv file for further
     * investigation.
     */
    @Override
    public String[] toCSV() {
        return new String[] { new String(variable) };
    }

    /**
     * Randomly generate a char
     * @return a random char 
     */
    public static char generateRandomChar() {
        var RNG = GeneticAlgorithm.RNG;
        if (LIMIT_TO_LETTERS) {
            int randomInt = RNG.nextInt(25);
            if (RNG.nextBoolean()) {
                // capital letter ASCI 65 - 90
                return (char) (randomInt + 65);
            } else {
                // lower case ASCI 97 - 122
                return (char) (randomInt + 97);
            }
        } else {
            int randomInt = RNG.nextInt(95) + 32;
            return (char) (randomInt);
        }
    }

    /**
     * Create an individual with the given target. We could also implement the individual prorotype
     * interface here
     * @param target the target char sequence of the individual
     * @return a randomly initialized TextIndividual
     */
    public static Individual createRandomIndividual(String target) {
        int vars = target.length();
        char[] values = new char[vars];

        for (int i = 0; i < vars; i++) {
            values[i] = generateRandomChar();
        }
        return new TextIndividual(target, values);
    }

    @SuppressWarnings("unchecked")
    @Override
    public <T> T getValue(int index) {
        return (T) Character.valueOf(variable[index]);
    }

Constructing the ga and running the example

private static int performAlgo() {
        int populationSize = 500;
        String needle = "Evolution is all about passing on the genome to the next generation, adapting and surviving through generation after generation.";

        var initialPopulation = new Individual[populationSize];

        for (int i = 0; i < populationSize; i++) {
            initialPopulation[i] = TextIndividual.createRandomIndividual(needle);
        }

        var ga = GeneticAlgorithm.builder().withInitialPopulation(initialPopulation)
                .withTargetFitness(0)
                .population()
                .advanced()
                .withForceCloneMutation(true, 10)
                .withMutationScalingStrategy(new LinearFitnessMutationScaling())
                .withCrossoverStrategy(new SinglePointDiscrete(2))
                .migration(100)
                .withNewSubpopulation()
                .withNewSubpopulation()
                .withCrossoverStrategy(new ScatteredDiscrete(2))
                .build();

        Result r = ga.calculate(10, Integer.MAX_VALUE, true);
        System.out.println(r.getBestResult());
        return r.getGenerationCount();
    }

Further examples

....