The Filtering Engine

YAIF is the filtering engine in the current implementation of Newt. It is inadequate on a number of counts and can be improved upon. There are two interesting directions for future work on the filtering engine. One is to make incremental improvements to the existing engine without changing the basic keyword-based search engine. The other is to make fundamental changes by replacing the engine with one that uses a different filtering method. Both directions are discussed below.

A disadvantage of YAIF is that full-text indexing and matching take too much time. Hence, agents cannot search for articles in real time. For example, if the user creates a new profile, it is difficult to test it right away since the filtering module is executed off-line. Even if the filtering module were executed on demand, it would take too long to index all the articles in a newsgroup. One way to solve this problem is to keep pre-computed indexes for every article for as long as the article is alive. An interesting challenge is to devise an efficient, incremental text indexing algorithm. The term frequencies can be computed for individual documents as they enter the database. However, the inverse document frequencies depend on the whole collection, so they need to be updated incrementally as documents enter or leave the database.
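The incremental bookkeeping described above can be sketched as follows. This is a minimal illustration, not YAIF's actual indexing code: the class name IncrementalIndex, the whitespace tokenizer, and the log(N/df) form of the inverse document frequency are all assumptions made for the example.

```python
import math
from collections import Counter


class IncrementalIndex:
    """Hypothetical index: per-document term counts plus global document frequencies."""

    def __init__(self):
        self.term_freqs = {}        # doc_id -> Counter of term counts in that document
        self.doc_freqs = Counter()  # term -> number of live documents containing it

    def add(self, doc_id, text):
        """Index one article as it enters the database."""
        if doc_id in self.term_freqs:
            self.remove(doc_id)  # re-adding replaces the old entry
        counts = Counter(text.lower().split())  # assumed tokenizer: whitespace split
        self.term_freqs[doc_id] = counts
        for term in counts:
            self.doc_freqs[term] += 1

    def remove(self, doc_id):
        """Drop an expired article and update only the terms it contained."""
        counts = self.term_freqs.pop(doc_id, None)
        if counts is None:
            return
        for term in counts:
            self.doc_freqs[term] -= 1
            if self.doc_freqs[term] == 0:
                del self.doc_freqs[term]

    def idf(self, term):
        """Inverse document frequency, computed on demand from the current counts."""
        n_docs = len(self.term_freqs)
        df = self.doc_freqs.get(term, 0)
        if n_docs == 0 or df == 0:
            return 0.0
        return math.log(n_docs / df)

    def weight(self, doc_id, term):
        """Term frequency times inverse document frequency for one article."""
        return self.term_freqs[doc_id][term] * self.idf(term)
```

In this sketch, adding or removing an article touches only the terms that article contains, and the inverse document frequency is derived from the stored counts when a weight is requested, so no pass over the whole collection is needed.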

Instead of trying to make improvements to the current keyword-based filtering engine, a new engine could be implemented using a completely different filtering method. For example, social or economic filtering could be used instead of the current cognitive filtering engine. The modular design of Newt makes it quite easy to replace the current filtering engine with another one. Taking this one step further, different agents in the system could have different filtering engines for recommending documents to the user. Each agent would filter documents using a different approach. However, the documents would be presented to the user through a single interface. The advantage is that the user will have a common interface to support her diverse filtering needs.
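The idea of several agents, each with its own filtering engine, feeding a single presentation layer might look like the sketch below. None of these names come from Newt: Agent, keyword_scorer, rating_scorer, and recommend are hypothetical, and the rating lookup is only a crude stand-in for social filtering.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class Agent:
    """Hypothetical agent: a name plus an engine-specific scoring function."""
    name: str
    score: Callable[[str], float]  # maps article text to a relevance score in [0, 1]
    threshold: float = 0.5


def keyword_scorer(keywords: Iterable[str]) -> Callable[[str], float]:
    """Keyword-based engine: fraction of the profile keywords found in the article."""
    wanted = {k.lower() for k in keywords}

    def score(article: str) -> float:
        words = set(article.lower().split())
        return len(wanted & words) / len(wanted) if wanted else 0.0

    return score


def rating_scorer(ratings: Dict[str, float]) -> Callable[[str], float]:
    """Stand-in for a social engine: looks up a rating supplied by other readers."""

    def score(article: str) -> float:
        return ratings.get(article, 0.0)

    return score


def recommend(agents: List[Agent], articles: List[str]) -> List[Tuple[str, str, float]]:
    """Single interface: merge every agent's picks into one list, best score first."""
    picks = []
    for article in articles:
        for agent in agents:
            s = agent.score(article)
            if s >= agent.threshold:
                picks.append((article, agent.name, s))
    return sorted(picks, key=lambda p: p[2], reverse=True)
```

Because each agent exposes only a scoring function, an engine can be swapped or added without changing how the merged list is presented to the user.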

Keyword-based filtering schemes have their limitations, as mentioned in Chapter . An option to overcome these limitations is to use techniques from the fields of natural language understanding and knowledge representation. For instance, FRAMER [17] is a knowledge representation library which can draw analogies in natural language texts. For example, if FRAMER infers that news story A is analogous to news story B and the user likes story A, the agent could also recommend story B. Another option is to use the contextual/structural retrieval paradigm [30]. In this framework, the user can also provide qualitative relevance feedback explaining why a document is not relevant. It is worth exploring the possibility of using these approaches for building better filtering engines.

There are certain issues that one must bear in mind while designing filtering systems. Representations of user interests must be sufficiently flexible to allow incremental modifications. The search engine must also be able to provide good reasons for selecting or rejecting a document. The representations must also satisfy the "clustering" requirement, i.e., people whose interests are similar should also have similar representations. This is necessary for exploration, where a slight modification of the profile should produce a commensurate change in the documents selected.

MIT Media Lab - Autonomous Agents Group - agentmaster@media.mit.edu