Sunday, October 4, 2015

Bayes 1: basic terms

A couple of years ago, while at Google, I did a "20% project" on a Bayesian expert system. The expert system existed before me but I was able to improve it. While doing so, I figured out a few things about the Bayes logic. Maybe they're common knowledge, but I haven't seen them on the Internet, and moreover, most of the descriptions I've seen on the Internet tend to get things somewhat wrong, especially in Wikipedia. Obviously, I can't talk about the expert system itself, and after a couple of years I don't remember that much about it anyway. But I took notes about the general formulas I figured out. I've recently found those notes, and want to share them.

This really has nothing to do with CEP, though I guess CEP can be used to drive expert systems. I just want to share this knowledge somewhere, and why not here?

Some 25 years or so ago I read a British collection of articles on artificial intelligence. One of these articles, on expert systems based on the Bayesian formulas, quite impressed me. Then I played a little more with this stuff in college. I vaguely remember that the author of that article had developed a system called Microexpert, but what I can find about Microexpert now doesn't seem to quite match it, so maybe that one was from another article.

When I started working on that recent expert system and went to refresh my knowledge, I thought: by now Bayes is used all over the place, so the descriptions should be all over the Internet, right? Nope. Not only is there not much, but a lot of what there is applies the Bayes formula somewhat wrong. So I had to restore this stuff from vague memories and logical reasoning.

First I'll describe what I restored from memory, and then I'll go into what I came up with while actually mucking with the realities of expert systems. This might be well known and published in some obscure articles or maybe even in well-known books, but I don't know where (if you have pointers, you're welcome to comment). But it's not on the Internet and not in a popular form. I want to put it there.

Let's start with the basics. The basic description of the formula created by Bayes can actually be found all over the Internet. You can look at Wikipedia (in a crappy article), or, say, read a lengthy explanation at http://yudkowsky.net/rational/bayes or http://lesswrong.com/lw/2b0/bayes_theorem_illustrated_my_way/ . But I'll try to give my own short descriptions and then go further. Read on.

Probability theory talks about hypotheses and events. Both of them relate to some system being examined, some kind of a "black box" whose contents we're trying to guess based on observations. If we could just look inside the box, there would be no need for all this, but in reality we can only poke the box in certain ways and then have to deduce its contents to the best of our knowledge based on that limited experimentation. The contents of the box are usually not very random. We normally know a good deal about what could be in the black boxes we get, we just don't know what is in this particular black box. The goal of an expert system is to look at a black box, do some poking, and classify the contents of the black box into one of the known categories with good certainty. The expert system example in that old British book was trying to diagnose automobile breakdowns. A medical expert system would be trying to diagnose what sickness a patient has.

It's kind of intuitive that the hypotheses are the descriptions of what might be in the box, and thus we're trying to select which contents seem most likely based on our experiments, that is, which hypothesis is most probable.

The events are the experiments. We might be able to select the experiments to perform or might be limited to observing what the mechanism in the black box does on its own.

The hypotheses are usually denoted H or Hsomething. The events are usually denoted E or Esomething.
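
To make the terms a bit more concrete, here is a minimal sketch in Python of how hypotheses and events might be written down for the automobile example. Everything in it is my own illustration, not something from that expert system or from the book: the class names, the H_/E_ names, and the probability numbers are all made up, and there is no Bayes formula in it yet, just the two kinds of objects and the idea of picking the most probable hypothesis.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One guess about what is inside the black box."""
    name: str           # hypothetical name, e.g. "H_battery"
    probability: float  # current belief in this hypothesis, from 0 to 1


@dataclass
class Event:
    """One experiment or observation made on the black box."""
    name: str       # hypothetical name, e.g. "E_lights_dim"
    observed: bool  # the outcome: did the event happen or not?


def most_probable(hypotheses):
    """Return the hypothesis with the highest current probability."""
    return max(hypotheses, key=lambda h: h.probability)


# Made-up numbers for a car that won't start.
hypotheses = [
    Hypothesis("H_battery", 0.5),
    Hypothesis("H_starter", 0.3),
    Hypothesis("H_fuel", 0.2),
]

# The results of poking the box.
events = [
    Event("E_lights_dim", True),
    Event("E_engine_cranks", False),
]

best = most_probable(hypotheses)
print(best.name, best.probability)
```

In this sketch the events don't change the probabilities at all; connecting the two is exactly what the Bayes formula is for.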
