The representations of all documents to be evaluated are now available. The
field scores are computed using the scalar products of the profile and document
term vectors. The fields considered by YAIF are Newsgroups, Location, Authors
and Keywords. The document score is computed as a weighted linear sum of the
field scores (see Section ).
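
The scoring step described above can be sketched as follows. This is a minimal illustration, not YAIF's actual code: the sparse dictionary representation of term vectors, the function names `field_score` and `document_score`, and the `field_weights` argument are all assumptions made for this example.

```python
from typing import Dict

# Assumed representation: a sparse term vector mapping term -> weight.
TermVector = Dict[str, float]

# The four fields named in the text.
FIELDS = ("Newsgroups", "Location", "Authors", "Keywords")


def field_score(profile_vec: TermVector, doc_vec: TermVector) -> float:
    """Scalar (dot) product of two sparse term vectors."""
    # Iterate over the shorter vector; terms missing from the other
    # vector contribute zero to the product.
    if len(profile_vec) > len(doc_vec):
        profile_vec, doc_vec = doc_vec, profile_vec
    return sum(w * doc_vec.get(term, 0.0) for term, w in profile_vec.items())


def document_score(
    profile: Dict[str, TermVector],
    document: Dict[str, TermVector],
    field_weights: Dict[str, float],
) -> float:
    """Weighted linear sum of the per-field scores."""
    return sum(
        field_weights[f] * field_score(profile.get(f, {}), document.get(f, {}))
        for f in FIELDS
    )
```

A field that is absent from either the profile or the document is treated here as an empty vector and therefore contributes a score of zero.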
The number of articles selected for presentation by each profile is proportional
to its fitness. The total number of articles to be presented by the agent can
be specified by the user. The number of articles to be presented by each profile
is calculated and stored in the profile. This is used by YAIF to select the
right number of articles for presentation. Only the pointers to the selected
articles are stored by YAIF in the profile, as in the ``ArtScore'' list in table .
The article is not stored but is retrieved upon user request.
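
One way to divide the user's total among profiles in proportion to fitness is sketched below. The text does not say how fractional shares are rounded, so the largest-remainder rounding used here is an assumption, as are the function name `allocate_articles` and the use of profile names as dictionary keys. Fitness values are assumed to be non-negative.

```python
from typing import Dict


def allocate_articles(fitness: Dict[str, float], total: int) -> Dict[str, int]:
    """Split `total` article slots among profiles in proportion to fitness."""
    total_fitness = sum(fitness.values())
    if total_fitness <= 0 or total <= 0:
        return {name: 0 for name in fitness}

    # Exact (fractional) share of each profile.
    exact = {name: total * f / total_fitness for name, f in fitness.items()}
    # Integer part of each share (int() floors non-negative values).
    quota = {name: int(share) for name, share in exact.items()}

    # Assumed rounding rule: give the leftover slots to the profiles
    # with the largest fractional parts, so the quotas sum to `total`.
    leftover = total - sum(quota.values())
    by_remainder = sorted(fitness, key=lambda n: exact[n] - quota[n], reverse=True)
    for name in by_remainder[:leftover]:
        quota[name] += 1
    return quota
```

With this rounding rule the per-profile quotas always add up to the total requested by the user, which plain rounding of each share would not guarantee.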
While selecting articles, care is taken to avoid presenting the same article
again in a later session. The profile keeps a list of articles that have
been presented to the user during previous sessions, as in the ``ArtRead'' list
in table .
The top-scoring articles are compared to this list to prevent repetitions. The
list of previously read articles could potentially grow without bound. However,
USENET articles have expiry dates after which they are no longer available and
their Message-IDs become invalid. The growth of the list of previously read
articles is bounded by retaining only those pointers that point to articles
that have not expired from the database. A single pass over the Message-IDs
collected by YAIF identifies the articles that are still ``alive''. The upper
bound on the length of the list of article pointers is the total number of alive
articles in the newsgroups searched by a profile.
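
The duplicate check and the pruning of expired pointers can be sketched as two set operations. This is only an illustration under stated assumptions: Message-IDs are represented as strings, the ``ArtRead'' list as a Python set, and the names `select_unread` and `prune_expired` are hypothetical.

```python
from typing import Dict, List, Set


def select_unread(scores: Dict[str, float], art_read: Set[str], quota: int) -> List[str]:
    """Return up to `quota` top-scoring Message-IDs not presented before."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [mid for mid in ranked if mid not in art_read][:quota]


def prune_expired(art_read: Set[str], alive_ids: Set[str]) -> Set[str]:
    """Keep only pointers to articles that are still available."""
    # A single pass: the result can never be larger than `alive_ids`,
    # which gives the upper bound on the list length.
    return art_read & alive_ids
```

For example, with scores for four articles, a previously read set containing one of them plus one expired Message-ID, and a quota of two, `select_unread` skips the read article and `prune_expired` drops the expired pointer.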
Finally, YAIF writes back the updated profile, including the pointers to the filtered articles. To summarize, the modifications that YAIF can make to a profile are changes due to feedback, the list of new articles filtered for the user, and the list of previously read articles.