eco-algorithms or artificial ecologies

The principle of artificial ecologies (or eco-algorithms) is the use of inspiration from living ecologies to develop algorithms for learning and related searches. Standard genetic algorithms take their model from selective breeding with each individual regarded separately in assessing how well it matches some ideal. In contrast, eco-algorithms are based more on evolution 'in the wild' with competition for resources a central part of the fitness and breeding success.

For me this started out with looking for the solution to a specific problem, avoliding greedy rules in rule induction, but is also a more general way of looking at existing algorithms and also for suggesting new approaches.

a domain - rule induction

Rule induction is common in AI and is also very important in data mining.

Rule induction operates over some set of examples (lets say from a set X) and tries to produce rules to explain the examples. There are various types of examples and rules. The simplest are where there are two sets of examples, a set of positive ones (P) that we want to recognise and a set of negative ones (N) that we want to reject. The rule induction problem becomes:

(i) find a rule R: X Bool such that:: x P: R(x) is true; x N: R(x) is false

Variants of this include cases where each example is a pair (x,y). Instead of positive and negative examples, we simply have examples of 'seen' example pairs (let's call them S) and we want to find a rule that generates the ys from the xs to apply to unseen examples in the future:

(ii) find a rule R: X Y such that:: (x,y) S: R(x) = y

The positive/negative examples are a special case where the set Y is Boolean.

There are various algorithms for doing rule induction. I have used Quinlan's inductive decision tree algorithm in Query-by-Browsing which is a deterministic algorithm. The rule 'R' is then a decison tree.

Often algorithms build the overall rules 'R' from a number of simpler rules. In the case of examples which are simple key/attribute pairs, the rule set 'R' is a set of simple rules r_i, each of which is itself some set of attribute/value conditions (colour=green,cost<50).

In case (i) the overall rule R(x) would be the disjunction of all the individual ri(x).

	R(x)		= true		if i such that r_i(x) = true
			= false otherwise

In case (ii), each rule r_i would be of the form:

if condition(x) then y = constant or formula(x)

We can write this for short: r_i = c_i(x) f_i(x)

	R(x)		= f_i(x)		if i such that c_i(x) = true
			= undefined		if no c_i matches

To keep things simple I will assume the simpler positive/negative example problem (i) form now on, but most or all would adapt to the more general case.

Again various methods can be used to find suitable simpler rules that together make up the rule set, including genetic algorithms.

Each member of the population is s simple set of conditions for each attribute from a small set of possibilities such as:

(a)	attribute = value	- primarily for symbolic values
(b)	attribute < value	- for numeric attributes
(c)	don't care about this attribute

Cross over then simply consists of choosing for each attribute the condition from one or other parent. Mutation consists in either (i) changing the condition of an attribute from don't care to either (a) or (b) as appropriate, (ii) changing (a) or (b) to don't care, or possibly (iii) changing the value in condition (b). Fitness is simply a measure of how many positive examples are matched minus some cost for negative examples wrongly matched.

a problem - greedy rules

Genetic algorithms work well when the whole example set can be described by a single simple rule.

However, when several rules would be required things get more complicated.

Imagine there are two main groups of positive examples amongst many negative ones. Call them P1 and P2. Let's also assume that P1 has many more examples than P2. Rules that match examples in all or most of P1 will have higher fitness than rules matching P2 as the latter has less examples. So when rules are chosen for breeding the P1 rules are highest and eventually dominate. After breeding for some while it will typically be found that we have several candidate rules that match P1 pretty well, but none that match P2.

One way to deal with this is to breed until there is a best rule, r1. Then all examples that match r1 are removed. Breeding then recommences using only the remaining examples. Of course, this doesn't take advantage of any similarities between the two groups. Some solutions can help this too, for example, seeding the second breeding round with the successful rules from the first.

So, there are solutions, but it is all very ad hoc.

the solution - niches and artificial ecologies

In nature where there is a perfectly uniform environment, a single species can dominate as in prairie grasses. However, it is more common to see a mixture of species, each adapted to their own ecological niche.

Nature is not about survival of the fittest; it is survival of the fittest competing for a particular resource.

So a solution strategy to the greedy rule is to construct the algorithm so that the different patches of examples become niches within an artificial ecosystem.

To do this fitness is based not just on how many examples it matches, but also on how much competition there is for those resources.

in practice?

In fact I have done very little in developing this underlying principle. I started to use a simple version in a visual basic version of Query-by-Browsing, but it fell foul of lack of time!

Some fairly basic variants of a simple eco-algorithm follow.

(i) random resource allocation: At each breeding cycle for each positive example assign it to a randomly chosen matching rule in the population. The fitness of each rule is then the number of allocated examples minus the number of wrongly matched negative ones (the negatives could also be randomly allocated).

(ii) weight resource by competition: At each breeding cycle and each example x in P, let the weight of x , w_x, be one over the number of rules in the population matching x. The fitness of a rule is then the sum of the weights of the matching examples minus the (possibly weighted) negative matches.

(iii) grow unconsumed examples: Maintain for each example a running weight. Increase or decrease this depending on the number of rules in the population that match it. That is try to mimic the growth of plants that are not eaten. Use this with either normal fitness function or in combination with either of the above.

Note that (i) and (ii) are essentially the same except that (i) is a Monte Carlo version. Both should lead to populations where the number of rules that match a patch of positive examples will be proportionate to the number of examples in it.

speciation

The above modifications to standard genetic algorithms allow breeding between rules operating on different patches. This is useful sometimes if patches have some common features, but more liklely that such cross-breeds will have very low fitness. So, it is likely to be the case that in-breeding amongst rules that are matching the same patch will lead to faster optimisation. That is we would like some form of separation of species to take place.

There are two main ways we can achieve this:

by biasing breeding to occur between individuals matching similar elements
by starting off with groups of examples that are already 'species' and letting each species develop

An implementation of (1) could involve working out for each pair rules ri and rj in the population a similarity measure s_ij, for example:

s_ij =	number of examples x in P that both rules match



	some measure of each rule's spread

When breeding choose partners weighted by sij.

This has the disadvantage of an O(n²) calculation (where n is the population size) on each breeding cycle, but given the cycle involves an O(nm) cost anyway of matching rules and examples (where m is the number of examples), where the matches are more computationally expense, this may be an acceptable cost.

An alternative, cheaper approach is to choose breeding pairs by randomly choosing an example as a matchmaker and then breeding from the riles that most closely match the example. If this is done during normnal training it will have virtuall;y no additional cost. It also has the advantage of generating potential good rules for niches that are not covered by any existing species.

The second option (2) would involve having pre-assigned sub-populations, potentially on different machines on a distributed implementation. The sub-populations then compete for examples using one of the mechanisms already suggested, and then do a normal genetic algorithm within the sub-population based on the allocated examples.

Hybrid approaches between these two could involve periodically using common match similarity as in (1) to cluster the rules and then inbreeding as in (2).

existing eco-algorthims

To some extent any clustering algorithm could be regarded as a form of eco-algorithm, but more specifically both self-organising Kohonen nets and also ART nets (adaptive resonance theory) have elements of speciation and competition.

I also know that there is some work on co-evolution in the ALife community.

http://www.hcibook.com/alan/topics/ai/eco-algorithms.html

maintained by Alan Dix