Chimrod

PyBorg is a chat bot written in Python. It can connect to IRC, records the sentences it reads along with their structure, and uses that data to generate new answers when someone talks to it. Answers are generated using Markov chains.
Unfortunately, the program has disappeared from the web, so I have taken over its development and am publishing a new release today, with enhanced features!

The principle:

Like MegaHAL[2], PyBorg is a bot which "learns" the language as people speak to it. It is not programmed to answer in one particular language; it can answer in any language, as long as its database allows it. For each sentence it reads, the bot records the contents of the sentence and the relations between the words. When it has to generate an answer, it uses this information to build the answer word by word.

For example, the words "how" and "are" are most often followed by "you", but never by "dog".
When the program has chosen an answer beginning with "how", there is a good chance that it will continue with "are" and "you".
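
The idea can be sketched in a few lines of Python. This is only an illustration, not PyBorg's actual code: the names `followers`, `learn` and `next_word` and the way the data is stored are assumptions. The sketch counts which word follows which, then picks the next word in proportion to those counts.

```python
import random
from collections import defaultdict

# Hypothetical storage: followers[word][next_word] = number of times seen.
followers = defaultdict(lambda: defaultdict(int))

def learn(sentence):
    """Record, for each word of the sentence, the word that follows it."""
    words = sentence.lower().split()
    for current, following in zip(words, words[1:]):
        followers[current][following] += 1

def next_word(word, rng=random):
    """Pick a follower of `word`, weighted by how often it was seen."""
    candidates = followers.get(word)
    if not candidates:
        return None  # nothing is known to follow this word
    choices = list(candidates)
    weights = [candidates[c] for c in choices]
    return rng.choices(choices, weights=weights, k=1)[0]

learn("how are you")
learn("how are you today")
learn("the dog barks")
```

With this small database, "how" can only be followed by "are" and "are" only by "you"; "dog" is never proposed after "are" because that pair was never seen.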

To begin the answer, the program chooses a word from the sentence it was given. It selects the word it knows least well, which sets aside all the grammatical words as well as the words which do not add any meaning to the sentence (for example “it”, “a”, “of”…). It then generates the sentence backwards for everything that precedes the chosen word, and finally generates the rest of the answer forwards, starting from the selected word.

For example, when asked “Are you a robot, isn't it?”, the program retains the word “robot”, then generates its answer:
“robot” is preceded by “a”, “a robot” is often preceded by “are”, and “are” by “u”.
Then “a robot” is followed by “,”, “,” is followed by “it”, then by “is”, and finally by “that”. Answer: “u are a robot, it is that”
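
The backward-then-forward generation can be sketched as follows. This is a simplified illustration under several assumptions, not the real implementation: the tables `word_count`, `before` and `after` and the functions `pick_seed` and `reply` are hypothetical names, words are split on spaces only, and the most frequent neighbour is always taken (instead of a random choice weighted by frequency) so that the result is reproducible.

```python
from collections import Counter, defaultdict

# Hypothetical tables, filled while learning.
word_count = Counter()
before = defaultdict(Counter)  # before[w][p]: p was seen just before w
after = defaultdict(Counter)   # after[w][n]: n was seen just after w

def learn(sentence):
    words = sentence.lower().split()
    word_count.update(words)
    for prev, nxt in zip(words, words[1:]):
        after[prev][nxt] += 1
        before[nxt][prev] += 1

def pick_seed(sentence, ignore=()):
    """Choose the known word of the input that was seen least often."""
    known = [w for w in sentence.lower().split()
             if w in word_count and w not in ignore]
    return min(known, key=lambda w: word_count[w]) if known else None

def reply(sentence, ignore=(), max_words=20):
    seed = pick_seed(sentence, ignore)
    if seed is None:
        return None
    words = [seed]
    # Walk backwards from the seed towards the start of the sentence.
    while before[words[0]] and len(words) < max_words:
        words.insert(0, before[words[0]].most_common(1)[0][0])
    # Then walk forwards from the seed towards the end.
    while after[words[-1]] and len(words) < max_words:
        words.append(after[words[-1]].most_common(1)[0][0])
    return " ".join(words)

# The comma is written as a separate token because of the naive split.
learn("u are a robot , it is that")
learn("u are a cat")
print(reply("are you a robot ?"))  # → u are a robot , it is that
```

Here “robot” is the seed because it was seen only once, whereas “are” and “a” were seen twice. The `max_words` limit is there only to stop the sketch from looping forever on repeated word pairs.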

This makes it possible to generate answers which can sustain a conversation, provided the other party does not know whom they are talking to.

Of course, this is not intelligence, just statistical calculation used to formulate the answers (with a good database, it can generate “correct” sentences).

How does it work?

The program is divided into two parts. On the one hand there is the bot itself, and on the other the interfaces through which it is possible to “communicate” with it (line-in, IRC, telnet, MSN…).

To launch the program, run `pyborg-irc.py` or `pyborg-linein.py`, depending on whether you want to test it online or offline. In offline mode, just type a sentence and the program generates its answer. On first launch, the program stays quiet until it has been given enough sentences to build an answer.
In online mode, the program connects to IRC using the information given in the file “pyborg-irc.cfg” (server, channel, nick, etc.), automatically answers everyone who talks to it in private, and answers from time to time on the channels it has joined. A parameter (owner) defines which nick can control the bot.


Caution! If you were using the old version of PyBorg, you must convert the dictionary by running convert2.py before launching the bot.

The program has been tested and works without problems under Linux, Windows XP and FreeBSD, and “should” also work on other platforms.


As for memory use, I observe (with my configuration) that the program occupies 32 MB of RAM, and that the files take up 3.5 MB on disk for a dictionary of 5,000 words with 437,000 different sentences (a ratio of about 90 sentences per word).

Commands:

All commands start with a “!”. The list of commands differs depending on whether or not the bot is connected to IRC. Here is the list of commands that are always available:

  • !help: help about the commands! :)
  • !version: display the version of the bot
  • !quit: quit the program
  • !save: save the dictionary and the configuration files
  • !words: display the number of known words and sentences
  • !known [Word]: display whether [Word] is known and the number of sentences in which it appears
  • !unlearn [Word]: erase the word from the dictionary
  • !purge: display the number of words which appear in only one context
  • !purge [a number]: erase [a number] words from the dictionary, among the least frequent words
  • !replace [word1] [word2]: replace all occurrences of [word1] with [word2]
  • !censor [Word]: censor the word [Word]
  • !uncensor [Word]: remove [Word] from the censor list
  • !learning [on|off]: allow or forbid the learning of new words
  • !limit [a number]: limit the number of known words to [a number] (default 6000)
  • !alias: list all existing aliases
  • !alias [alias]: list all the words which refer to [alias]
  • !alias [alias] [word1] [wordN]: create an alias which replaces all occurrences of [word1] to [wordN] with [alias]
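
To give an idea of how such “!” commands can be handled, here is a minimal sketch of a command dispatcher. It is an assumption-laden illustration, not PyBorg's code: the `contexts` dictionary, the handler names and the wording of the replies are all invented for the example, and only three of the commands above are covered.

```python
# Hypothetical in-memory dictionary: word -> list of sentences containing it.
contexts = {
    "robot": ["you are a robot"],
    "are": ["you are a robot", "how are you"],
}

def cmd_words(args):
    sentences = {s for group in contexts.values() for s in group}
    return f"{len(contexts)} words, {len(sentences)} sentences"

def cmd_known(args):
    if not args:
        return "usage: !known [Word]"
    word = args[0].lower()
    if word in contexts:
        return f"{word} is known, sentences: {len(contexts[word])}"
    return f"{word} is unknown"

def cmd_unlearn(args):
    if not args:
        return "usage: !unlearn [Word]"
    word = args[0].lower()
    if contexts.pop(word, None) is not None:
        return f"{word} erased"
    return f"{word} is unknown"

# Map each command name to the function that handles it.
COMMANDS = {"!words": cmd_words, "!known": cmd_known, "!unlearn": cmd_unlearn}

def handle(line):
    """Return a reply for a command line, or None if it is not a command."""
    if not line.startswith("!"):
        return None  # ordinary sentence: left to the learning/reply code
    name, *args = line.split()
    handler = COMMANDS.get(name.lower())
    return handler(args) if handler else f"unknown command {name}"
```

For instance, `handle("!known robot")` reports that the word is known, while `handle("hello")` returns `None` so that the line can be treated as a normal sentence.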

The following commands are available only on IRC:

  • !nick [nick]: change the bot's nick
  • !shutup: prevent the bot from speaking on the channels
  • !wakeup: cancel a !shutup
  • !replyrate [a number]: probability that the bot answers a message sent on the channel
  • !talk [nick] [message]: send [message] to [nick] on behalf of the bot
  • !join [channel]: connect the bot to [channel]
  • !leaves [channel]: leave the channel [channel]
  • !chans: list the channels the bot has joined
  • !ignore [nick]: ignore [nick]
  • !unignore [nick]: remove [nick] from the list of ignored people
  • !owner [password]: add yourself to the list of owners (the password is defined in the file pyborg-irc.cfg)

There are two configuration files, both with the extension .cfg.
Whenever a variable is written between square brackets, several values can be given as follows: [“key1”, “key2”, …]

pyborg.cfg:
  • num_aliases: internal variable, do not change; the number of known aliases
  • num_contexts: internal variable, do not change; the number of known sentences
  • ignore_list: the list of words which are not relevant in a sentence (e.g. [“one”, “a”, “of”, “some”])
  • max_words: upper limit on the number of known words; can be changed with the command !limit
  • learning: whether the bot should learn or not; can be changed with the command !learning
  • aliases: the list of aliases; can be changed with the command !alias
  • censored: the list of censored words; can be changed with the commands !censor and !uncensor
  • num_words: internal variable, do not change; the number of known words
pyborg-irc.cfg:
  • owners: the list of owners of the bot
  • reply_chance: percentage chance that the bot answers a message sent on the channel; can be changed with the command !replyrate
  • reply_to_ignored: 0 or 1, whether or not to answer people who are in the ignore list
  • chans: the list of channels the bot should join (not modified by the command !join)
  • servers: the list of servers the bot should connect to
  • ignorelist: the list of people the bot will not answer (see the commands !ignore and !unignore)
  • quit_message: the message sent on disconnection
  • password: the password for the command !owner
  • speaking: 0 or 1, whether the bot may talk on the channels; can be changed with the commands !shutup and !wakeup
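
As an illustration of how reply_chance could be applied, here is a minimal sketch. It assumes the value is a percentage between 0 and 100; the function `should_reply` and its `may_speak` parameter (standing for the flag toggled by !shutup and !wakeup) are hypothetical names, not taken from the program.

```python
import random

def should_reply(reply_chance, may_speak=True, rng=random):
    """Decide whether to answer a message seen on a channel.

    reply_chance is a percentage: 0 never answers, 100 always answers.
    """
    if not may_speak:
        return False  # the bot was silenced
    # rng.random() is in [0, 1), so the product is in [0, 100).
    return rng.random() * 100 < reply_chance
```

With a reply_chance of 25, the bot would answer roughly one channel message in four; private messages, which are always answered, would bypass this check.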

Files:

The latest version of PyBorg (currently version 1.1.1, updated 23-11-06).
You can also visit the project's development page.

Links:

[1]: wiki404, the wiki page about PyBorg (old version)
[2]: MegaHAL, another chat bot.