
Representation

The representation used for profiles and documents is based on the vector space representation commonly used in the information retrieval literature [38]. In this representation, documents and queries are both represented as vectors in a common high-dimensional space, over which a distance metric is defined that measures how close two vectors are to each other. When a query is received, it is translated into its vector representation, and the document vectors closest to the query vector are retrieved in response to the search. The advantage of using a common vector space for both documents and queries is that a document can itself be used as a query, i.e., one can find documents that are similar to a given document. Once the document serving as a query is translated into a vector, the same distance metric can be used as for any other query. This property of vector spaces is quite useful for the current application, since users can provide samples of interesting articles as an alternative to constructing intelligent queries.
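The retrieval scheme described above can be illustrated with a minimal sketch. The text does not fix a particular weighting scheme or distance metric at this point, so the sketch assumes raw term-frequency vectors and cosine similarity, both common choices in the information retrieval literature. The function names (to_vector, cosine_similarity, retrieve) are hypothetical and introduced only for illustration.

```python
import math
from collections import Counter


def to_vector(text):
    """Map text to a sparse term-frequency vector (term -> count).

    Assumption: whitespace tokenization and raw counts; the actual
    system may use a different term weighting.
    """
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors (assumed metric)."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_text, documents, top_k=2):
    """Return up to top_k documents closest to the query, best first.

    The query is translated with the same function as the documents,
    so a whole document can be passed as query_text.
    """
    query_vec = to_vector(query_text)
    scored = [(cosine_similarity(query_vec, to_vector(doc)), doc)
              for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no terms with the query.
    return [doc for score, doc in scored[:top_k] if score > 0]


if __name__ == "__main__":
    docs = [
        "agents filter news articles",
        "vector space retrieval of documents",
        "cooking pasta at home",
    ]
    # An ordinary keyword query.
    print(retrieve("news articles", docs, top_k=1))
    # A document used as a query: same translation, same metric.
    print(retrieve(docs[1], docs, top_k=1))
```

Because queries and documents pass through the same to_vector function, using a sample article as a query requires no special handling, which is the property the paragraph above relies on.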

Profiles are analogous to queries in information retrieval: a profile searches a part of the database for articles that are similar to it. The representations used for profiles and documents are described below.



MIT Media Lab - Autonomous Agents Group - agentmaster@media.mit.edu