Next: New Generation Up: Genetic Algorithm Previous: Crossover

Mutation

Let the parent be:

The mutation operator is defined so as to contribute to the exploratory behavior of the population of profiles. An offspring resulting from mutation searches a newer domain not covered by the parent or other members of the population. This helps the population of profiles to adapt to changing user interests. It also introduces a serendipitious element to the profiles.

The offspring is slightly different from the parent, so that the offspring explores a newer domain. At the same time, the offspring retains some attributes of the parent so that it can exploit the knowledge of the parent. This effect is achieved by modifying the newsgroup field of the profile, since this field controls the search domain of the profile. One of the newsgroups is randomly selected for replacement. This newsgroup is replaced by a randomly selected ``similar'' newsgroup.

A similarity function is defined over the set of all newsgroups, as explained in the following paragraph. This similarity function is used to compute a priori similarities between all pairs of newsgroups. When the original newsgroup needs to be replaced, its nearest neighbors are the candidates for replacement. A pre-decided threshold, say , is used to choose the number of candidates to be considered. Since the mutated offspring must be different from the parent, none of the candidate newsgroups should be already present in the parent. Once the set of nearest neighbors not already present in the parent is found, one of the candidate newsgroups is selected at random. This is used as the replacement in the offspring.

To calculate the similarity between two newsgroups, both are first represented as a vector of terms. The document frequencies of terms in a newsgroup are used as the term weights. The document frequency for a term in newsgroup is the fraction of the total number of documents in which contain the term . It is calculated as shown in equation . This gives a vector of terms weighted by document frequency. A similar vector of terms is calculated for the other newsgroup . The similarity between the two newsgroups is given by the cosine product of the two vectors, as shown in equation .

where is the document frequency of term in newsgroup , is the number of documents in which contain the term and is the total number of documents in . gives the similarity between newsgroups and .

As in the crossover operator, the fitness of the offspring is set to the default initial weight of .



Next: New Generation Up: Genetic Algorithm Previous: Crossover


MIT Media Lab - Autonomous Agents Group - agentmaster@media.mit.edu