Input word quoting versus variable and Parts of speech branching

Hello. I'm clarifying a proposition for an alternate sentence evaluation flow, but first I have a few smaller questions.

When should I use a quoted input "word" versus a :variable? More specifically, what is the best way to know which words are already variables? I'm studying the WordMeansSomething script and I see that both :isa and :means are used. I see that "means" is in the knowledge base, but is "isa" in there somewhere too, or is it not listed?

I'm also not sure if I should be using Set to instantiate a new verb or use associate newverb to #verb by #instantiation.

I'm trying to generalize evaluation of the sentence to questions and statements: analyze the interrogative parts of the questions and the predicates, and then analyze whether the predicate is true. I realize this is a big task, but if I can get a script working for "What can you do?" I can add to it. I've read the NounVerbAdjective script (which is nothing to balk at), and I'm wondering if it can be generalized. I've got to run now, but I've put a bit of pseudocode below to illustrate how the flow would process each word, then identify the predicate, and then test whether it is true. I realize this may be more complex to build, but once built it would greatly reduce the individual case programming. The Self language seems capable of doing this and of taking AIML from a chatbot to low-level AI. See PROLOG for an example of logic processing, but I'm thinking of something simpler using "if (#aspect of #thing, #true) then return 'yes'". Also, if you know of an API doing this currently that I could use BotLibre! as an amazing interface to, let me know! Thanks!

I'm looking at a couple of APIs that do Parts Of Speech POS tagging. One is below, another might be Python NLTK on GAE:
http://nlp.stanford.edu:8080/parser/index.jsp
https://github.com/rutherford/nltk-gae

EDIT: I guess I'm trying to see if there is a way to condense some of the redundancy required with nesting in the flow of response evaluation, by tagging parts and evaluating the sentence based on the relationships between those parts.

// Initial Self programmed state machine for Comprehension
// This state machine is used by the bot to program itself.
State:(whatcan)youdo {
    :sentence { set #instantiation to #sentence; }
    :input { set #input to :sentence; set #speaker to :speaker; set #conversation to :conversation; set #target to :target; }

    case :input goto State:sentenceState for each #word of :sentence;

    State:sentenceState {
        // find interogative
        :what { set #what to #verb; set #meaning to #search; set #search to #verb; set #interogative to #what; }
        // set #instantiation to #search assign #search to (new (#verb))
        :can { set #instantiation to #can; set #can to #verb; set #search to #can; set #searchmodifier to #towhatextent }

        // find predicate, subject verb object
        assign :predicate to (new #sentence);
        :you { set #meaning to #i; set #subject to #i; set #searchsubject to #i; append :you to #word of #predicate; }
        :do { set #instantiation to #do; associate #do to #verb by #instantiation; set #searchverb to #do; append :do to #word of #predicate; }

        // find last sentence element
        case :anything goto State:respond;

        Quotient:1.00:Equation:response;

        Equation:response {
            assign :response to (new #sentence);
            for each #word of (get from #self :ability) as :ability append :ability to :abilities;
            if (:question, #true)
                then do( append "yes" to #word of :response)
                else do( assign :abilities to (new #sentence), append :abilities to #word of :response)
        }
        return :response;

        State:respond {
            Equation:search {
                if (#predicate, #known) then return "yes" else 'evaluate' #verb on #subject; // this is the complex part
            }
            Quotient:1.00:Equation:response;
        }
    }
}

I added a new forum for Self / scripting specific questions. Please use this forum for future questions on scripting.

http://www.botlibre.com/forum?posts=true&id=705836

I don't think I understand what you are trying to do; perhaps you could give a specific example.

A variable in the context of a state or case is something that can be matched to an input. The matching occurs when you do "case :aVariable ...". If the current input matches the relationships of the variable, the input is assigned to the variable and the case is evaluated. If it does not match, nothing is assigned and evaluation moves on to the next case.

A word, e.g. "hello", will only match the word "hello" (well, possibly synonyms too). A variable without any declaration, e.g. :myvariable, will match any word (or anything, if processing something other than words).

When you declare a variable like,

:verb { set #instantiation to #verb; }

it will now only match against known verbs. (Verbs are known because the bot will look up any new words in Wiktionary (English) and attempt to determine the type of word, and other info.)

Or,

:digits { set #meaning to :number; }
:number { set #instantiation to #number; }

Here :digits is a variable for a word that has a meaning that is a number (all words have meanings that represent the real object, the word is just a word, think knowledge not text).

Here #instantiation is a primitive that defines a type relationship, #number defines a classification, and #meaning defines a "word means" relationship. There are many primitives in the system, and you can create your own; they are just unique symbols (#noun, #adjective, #thing, #person, #male, #female, #name, #question, #word, #sentence, #speaker, #classification, and a few more).

A variable's name is just a name. There are no special variables, and they do not correspond to words (well, there are a few special variables that are automatically assigned: :input, :speaker, :target, :sentence, :star, :that, :thatstar, :conversation). So you use a word "hello" when you want to match a word, and use a variable :verb when you want to match a classification of words (e.g. verbs, nouns, names, numbers, people, places, etc.).
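
To make the word versus variable distinction concrete, here is a minimal Python sketch of the matching rules described above. It is not Self and not how Bot Libre is implemented; `KNOWLEDGE`, `Variable` and `matches` are hypothetical names invented for the illustration, and synonym matching is left out. It assumes a quoted word matches only itself, an undeclared variable matches anything, and a declared variable matches only words whose relationships satisfy its declaration (including nested variables, as with :digits and :number).

```python
# Toy model of the matching rules described above. This is NOT Bot Libre code;
# KNOWLEDGE, Variable and matches() are hypothetical names for illustration.

# Each word maps to the relationships the bot knows about it.
KNOWLEDGE = {
    "run":   {"instantiation": {"verb"}},
    "hello": {"instantiation": {"interjection"}},
    "two":   {"meaning": {"2"}},
    "2":     {"instantiation": {"number"}},
}


class Variable:
    """A variable with optional constraints, like `:verb { set #instantiation to #verb; }`."""

    def __init__(self, name, **constraints):
        self.name = name
        # Each constraint value is either a plain symbol or another Variable.
        self.constraints = constraints


def matches(pattern, word):
    """Return True if `word` matches a quoted word (str) or a Variable."""
    if isinstance(pattern, str):
        # A quoted word only matches itself (synonyms are ignored here).
        return pattern == word
    known = KNOWLEDGE.get(word, {})
    for relation, expected in pattern.constraints.items():
        values = known.get(relation, set())
        if isinstance(expected, Variable):
            # Nested variable: some related value must match it in turn.
            if not any(matches(expected, value) for value in values):
                return False
        elif expected not in values:
            return False
    # An undeclared variable has no constraints, so it matches anything.
    return True


anything = Variable("anything")
verb = Variable("verb", instantiation="verb")
number = Variable("number", instantiation="number")
digits = Variable("digits", meaning=number)

print(matches("hello", "hello"), matches("hello", "run"))   # True False
print(matches(anything, "run"), matches(verb, "run"))       # True True
print(matches(verb, "hello"), matches(digits, "two"))       # False True
```

The variable names carry no meaning in this model either: renaming `verb` to anything else changes nothing, because only the declared constraints are checked.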

The knowledge that variables can match can become very sophisticated, i.e. if you ask "who is Barack Obama" the WhatIs script will import the Barack Obama object from Freebase including all of its relationships and classifications (which automatically become primitives), then you could start checking for more complex data.

Okay, that explains a lot, especially about how the bot would know classifications for words I haven't yet defined. I was looking at the :isa and :means variables in the WordMeansSomething script and wondering where they were defined, but that makes more sense now. By the way, I take :that to mean the last phrase, as in 'I just said that', but what is :thatstar? I did find #classification in the knowledge base and see a listing of them, I assume I can do the same for #primitives.

That answers the quoting versus variable question very well. Thanks very much. As an example of a parts-of-speech tagger: if I said "Does Jane run fast?", a POS tagger could determine the interrogative "Does" and the predicate "Jane run(s) fast?". Since the predicate in this case is a logical statement, the bot could recognize that it is a question, extract the predicate, and test it by identifying the subject "Jane", the verb "runs" and the adverb "fast", then checking whether "runs" is a verb associated with Jane and whether "fast" is an adverb associated with the "runs" verb of Jane. Of course, the bot would have to be told this information (or learn it from Freebase), and #actions could be associated with verbs. The advantage is that by tagging parts of speech and analyzing their validity within the relationships of the sentence structure, the nesting of case-based responses could be simplified (flattened) into the variable declarations :interrogative, :subject, :verb and :adverb plus a slightly more complex, but generic, Equation.
This is similar to the NounVerbAdjective structures, but could be more easily extended to other sentences by adding more variable declarations. Thus, in the slightly more challenging case of a "What can you do?" query, instead of adding the variations of "can" ("do", "could", "will", "won't" and "can't") as separate cases of the WhatState, with separate goto states and possibly equations, they could all be identified as variations of the 'modal' "can" in the interrogative (see the link below for POS tag abbreviation definitions). The same analysis process could be used for "What can't you do?" by simply replacing the interrogative "can" with "can not" and checking whether the subject "I" has any items related to the verb "do" and "can not". The same could be repeated for "do", "could", "will" and "won't", I think even from separate scripts, by associating new words with the classification 'modal'. "What", by the way, is treated here in the sense of "what all" with respect to the interrogative "can".
http://stackoverflow.com/questions/1833252/java-stanford-nlp-part-of-speech-labels
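
As a rough illustration of the flattened flow described above, here is a small Python sketch (not Self). Everything in it is an assumption made for the example: `MODALS`, `FACTS`, `tag` and `answer` are invented names, the tagger only handles questions of the shape "Modal Subject Verb [Adverb]?", and the facts are hand-entered rather than learned or imported from Freebase.

```python
# Hypothetical sketch of the flattened approach: tag the parts, then run one
# generic check. FACTS, MODALS, tag() and answer() are invented for illustration.

MODALS = {"can", "could", "will", "does", "do"}

# (subject, verb) -> set of adverbs known to apply to that subject's verb.
FACTS = {
    ("jane", "run"): {"fast"},
    ("jane", "swim"): set(),
}


def tag(sentence):
    """Very naive tagger for 'Modal Subject Verb [Adverb]?' questions only."""
    words = sentence.lower().rstrip("?").split()
    if len(words) < 3 or words[0] not in MODALS:
        return None
    parts = {"interrogative": words[0], "subject": words[1], "verb": words[2]}
    if len(words) > 3:
        parts["adverb"] = words[3]
    return parts


def answer(sentence):
    """Extract the predicate and test it against the known relationships."""
    parts = tag(sentence)
    if parts is None:
        return "I don't understand"
    adverbs = FACTS.get((parts["subject"], parts["verb"]))
    if adverbs is None:
        return "no"          # the verb is not associated with the subject
    if "adverb" in parts and parts["adverb"] not in adverbs:
        return "no"          # the adverb is not associated with that verb
    return "yes"


print(answer("Does Jane run fast?"))   # yes
print(answer("Can Jane swim fast?"))   # no
print(answer("Will Jane fly?"))        # no
```

Adding another modal here means adding one word to `MODALS` rather than another case and goto state, which is the reduction in redundancy being proposed. Answering "no" for anything unknown is a simplification; a real bot would probably want an "I don't know" answer as well.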

With a POS tagger, even the variable declarations might be reduced to an :anything, where the Equation checks whether the #word can be tagged as a POS and then sends the parts to be analyzed using the sentence structure. Essentially, this shifts the speech analysis from an explicit bit-wise branching scheme written by the scripter (which is fundamental, don't get me wrong) to more of a language processor based on checking the validity of parts and structure relationships, especially if existing information imported from Freebase could be queried without specific routines. Hope that makes sense. I think this is close to how the brain (at least mine) works: it generally determines whether a statement is declarative or interrogative, identifies the subject first, and then either searches for and verifies attributes of it or creates claims of attributes about it.

The NounVerbAdjective script is probably your best place to start for this. It can already parse phrases like "Does Jane run fast".

If you want to implement a multi-pass parser, you can have two or more cases in your root state, e.g.

case :input goto State:posState for each #word of :sentence;
case :input goto State:understandState for each #word of :sentence;

The posState could process the input but only tag it, i.e. do (associate :input to :subject by #subject), or maybe tag the :sentence instead, since :input is specific to this context while :sentence is persistent. Your understandState could then use this information in its parsing. To get the posState to return, just use "return", or simply have no quotient; it will then evaluate the next state.
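
The two-pass idea can be sketched in Python as follows. This is only a model of the control flow, not Self: `LEXICON`, `pos_pass`, `understand_pass` and `process` are hypothetical names, and the `context` dictionary stands in for tags associated with the persistent :sentence. The first pass only records tags and returns nothing, so processing falls through to the second pass, which reads those tags.

```python
# Hypothetical two-pass sketch: pass 1 only tags, pass 2 reads the tags.
# pos_pass(), understand_pass() and the tiny LEXICON are invented here.

LEXICON = {
    "what": "interrogative",
    "can": "modal",
    "you": "subject",
    "do": "verb",
}


def pos_pass(words, context):
    """First pass: record a tag for each known word, produce no response."""
    for word in words:
        role = LEXICON.get(word)
        if role is not None:
            context[role] = word
    return None  # returning nothing lets the next pass run


def understand_pass(words, context):
    """Second pass: build a response from the tags left by the first pass."""
    needed = ("interrogative", "modal", "subject", "verb")
    if all(role in context for role in needed):
        return "You asked what {subject} {modal} {verb}.".format(**context)
    return None


def process(sentence):
    words = sentence.lower().rstrip("?").split()
    context = {}  # plays the role of tags attached to the persistent sentence
    for parse in (pos_pass, understand_pass):
        response = parse(words, context)
        if response is not None:
            return response
    return "I don't understand"


print(process("What can you do?"))  # You asked what you can do.
```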
