Methods defined here:
- __init__(self, ConceptNetDB_handle=None)
- apply_swaplist(self, text)
- arg_grammar_accept_p(self, arg_chunked, re_pattern)
- chunk(self, text)
- chunk_and_lemmatise(self, text)
- generate_extraction(self, text)
  - inputs a raw text document
  - outputs an extraction object which contains a
    parsed digest of the text
  - the extraction object can be passed as an argument
    to the jist_*() methods
- guess_concept(self, text, simple_results_p=0)
  - inputs a raw text document
  - the input is an egocentric description whose
    subject is a mystery concept being described
  - for example: 'Foo is red and delicious. Foo
    spoils when it's been laid out too long. Foo has
    a stem and meat and skin.'
  - uses structure mapping, similar to get_analogous_concepts(),
    to guess what the mystery concept might be
  - the return type is the same as that of get_analogous_concepts()
  - if simple_results_p = 1, the output is simply
    a rank-ordered list of concepts
- guess_mood(self, text)
  - inputs a raw text document
  - computes an affective classification of the document
    using Paul Ekman's six basic emotions
  - outputs the following six-tuple of pairs:
      (
        ('happy', 0.5),
        ('sad', 0.4),
        ('angry', 0.3),
        ('fearful', 0.9),
        ('disgusted', 0.0),
        ('surprised', 0.0),
      )
  - the cdr of each pair is a score in the range [0.0, 1.0]
    representing the relative presence of that mood in the
    document
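  The six-tuple above can be consumed like any sequence of
  (mood, score) pairs. A minimal sketch, assuming the example
  return value shown above (the method itself is not called here):

  ```python
  # Hypothetical guess_mood() output, copied from the example above.
  mood_scores = (
      ('happy', 0.5),
      ('sad', 0.4),
      ('angry', 0.3),
      ('fearful', 0.9),
      ('disgusted', 0.0),
      ('surprised', 0.0),
  )

  # The cdr (second element) of each pair is the score in [0.0, 1.0];
  # the dominant mood is the pair with the highest score.
  dominant_mood, dominant_score = max(mood_scores, key=lambda pair: pair[1])
  print(dominant_mood)   # fearful
  ```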
- guess_topic(self, text, max_results=1000, flow_pinch=500, max_node_visits=1000)
  - inputs a raw text document
  - extracts events, adjectives, and things from the text
    and finds their weighted contextual intersection
  - returns a pair whose car is a trace of the
    list of entities extracted from the text,
    and whose cdr is the results (in the same format as the
    return type of get_context())
- jist_adjs(self, extraction)
  - inputs an extraction object
  - returns a list of all the adjectival phrases
    and modifiers in the extraction object
- jist_entities(self, extraction)
  - inputs an extraction object
  - returns a list of all the noun chunks
    in the extraction object
- jist_pxs(self, extraction)
  - inputs an extraction object
  - returns a list of all the prepositional phrases
    in the extraction object
- jist_subj_events(self, extraction)
  - inputs an extraction object
  - returns entries like: ('I', 'read book')
- jist_vsoos(self, extraction)
  - inputs an extraction object
  - returns a list of Verb-Subject-Object-Object
    tuples, each of the form: ('read', 'I', 'book', 'in bed')
  - all words are lemmatised, and semantically empty tokens
    are stripped (e.g. the, a, modals)
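  The stripping of semantically empty tokens can be illustrated in
  isolation. A minimal sketch with an assumed, illustrative stopword
  list; the class's actual filter is internal and may differ:

  ```python
  # Illustrative set of semantically empty tokens (articles and modals);
  # an assumption for this sketch, not the class's real list.
  EMPTY_TOKENS = {'the', 'a', 'an', 'would', 'could', 'should', 'might', 'may'}

  def strip_empty_tokens(phrase):
      """Remove semantically empty tokens from a whitespace-split phrase."""
      kept = [tok for tok in phrase.split() if tok.lower() not in EMPTY_TOKENS]
      return ' '.join(kept)

  # A raw tuple like ('read', 'I', 'the book', 'in a bed') reduces to
  # the cleaned VSOO form shown in the docstring above.
  vsoo = ('read', 'I', 'the book', 'in a bed')
  cleaned = tuple(strip_empty_tokens(part) for part in vsoo)
  print(cleaned)   # ('read', 'I', 'book', 'in bed')
  ```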
- lemmatise(self, text)
  - inputs a raw text document
  - outputs lemmatised text
- parse_pred_arg(self, pp)
  - parses the predicate-argument string
    returned by jist_predicates(), of the form:
    '("pred name" "arg 1" "arg 2" etc)'
    and returns its components as a list
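  The string format described above can be parsed with a simple
  quoted-string scan. A minimal sketch of the described behavior,
  not the class's actual implementation:

  ```python
  import re

  def parse_pred_arg_sketch(pp):
      """Parse a string of the form '("pred name" "arg 1" "arg 2")'
      into ['pred name', 'arg 1', 'arg 2'].

      Assumes the predicate and arguments contain no embedded
      double quotes."""
      return re.findall(r'"([^"]*)"', pp)

  print(parse_pred_arg_sketch('("eat" "dog" "bone")'))
  # ['eat', 'dog', 'bone']
  ```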
- postchunk_px(self, chunked)
- refine_arg(self, arg, re_pattern=None)
- repair_arg(self, arg_chunked, re_pattern)
- summarize_document(self, text, summary_size=5)
  - inputs a raw text document
  - outputs an algorithmically generated summary
  - summary_size sets the size of the summary to generate,
    in sentences
  - the summarization algorithm is:
    1) compute a document vector using guess_topic()
    2) compute sentence vectors using get_context()
    3) saliency is the strength of each sentence vector
    4) sentences are rank-ordered by saliency and pruned
       to the summary size
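  Steps 3-4 can be sketched independently of the vector machinery.
  The saliency values here are stand-in scores (in the real method
  they come from the strength of each get_context() sentence vector),
  and keeping the pruned sentences in document order is a design
  choice of this sketch, not something the docstring specifies:

  ```python
  def prune_by_saliency(sentences, saliencies, summary_size=5):
      """Rank sentences by saliency, keep the top summary_size,
      and emit them in their original document order."""
      ranked = sorted(range(len(sentences)),
                      key=lambda i: saliencies[i], reverse=True)
      keep = sorted(ranked[:summary_size])   # restore document order
      return [sentences[i] for i in keep]

  sentences = ['A intro.', 'B detail.', 'C key point.', 'D aside.']
  saliencies = [0.2, 0.7, 0.9, 0.1]          # stand-in vector strengths
  print(prune_by_saliency(sentences, saliencies, summary_size=2))
  # ['B detail.', 'C key point.']
  ```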
- tag(self, text)
- unpp_predicate(self, pp_pred)