spindown: 2. "1/3 position." You can argue for this position based on long-run statistics. Imagine you repeat the experiment 1,000 times. Then the coin is expected to land heads 500 times and tails 500 times. Beauty would be awoken 500 times on Monday after the coin landed heads, 500 times on Monday after the coin landed tails, and 500 times on Tuesday after the coin landed tails. So she would be awoken 1,500 times, but only in 500 cases would the coin have landed heads. That is, in 1/3 of her awakenings the coin had landed heads, so her credence for heads should be 1/3. (You can also argue for this position in a more analytic way that doesn't involve repetitions.)
RealWeaponX: Individually it may seem that you have a 1/2 chance of it being heads, but in the long run it is always going to be twice as likely to be tails. This is in some ways similar to Monty Hall - with the MH problem, you are statistically more likely to win if you swap after the first box is open, but only in long run stats. Individually, your first guess is as good as your second, as you may have picked correctly.

So, in both cases, individually it may as well be a guess (1/2 in SB, 1/3 in MH), but statistically speaking there is a "correct answer".

In the case of Sleeping Beauty, she is specifically asked for a statistical probability of the coin having landed heads, therefore the answer is always 1/3.
I don't see this as so much a paradox as a matter of whether or not you count the two "awakenings" resulting from tails as one event or two. From Sleeping Beauty's perspective, I'm pretty sure you should count them as one event. It doesn't, as dmetras mentioned, actually matter that she is awoken twice for tails versus once for heads if you count the set of awakenings as one event, because despite the fact that you get multiple guesses, there was still only one coin toss.

The only difference between tails and heads is that you are allowed two (or more) independent guesses for tails and only one guess for heads - however, your ability to guess multiple times rather than once does not influence the prior probability of how the coin landed. In Monty Hall you get extra information for your second guess, so the guesses are not independent. Here she doesn't get extra information (and naturally doesn't even know whether she's guessed before), so the guesses are completely independent.
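To make the counting concrete, here is a minimal simulation sketch (my own addition, assuming a fair coin and the standard protocol: one awakening on heads, two on tails):

```python
import random

# Minimal sketch: simulate many runs of the experiment with a fair coin.
# Heads -> one Monday awakening; tails -> Monday and Tuesday awakenings.
random.seed(0)
runs = 100_000
heads_runs = 0
heads_awakenings = 0
total_awakenings = 0

for _ in range(runs):
    heads = random.random() < 0.5
    if heads:
        heads_runs += 1
        heads_awakenings += 1      # the single heads awakening
        total_awakenings += 1
    else:
        total_awakenings += 2      # Monday and Tuesday awakenings

print("per awakening:", heads_awakenings / total_awakenings)  # ~1/3 (thirder count)
print("per experiment:", heads_runs / runs)                   # ~1/2 (halfer count)
```

Both numbers come out of the same simulation; the whole dispute is over which of them "her credence in heads" is supposed to track.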
Post edited March 02, 2012 by crazy_dave
spindown: I think there's some confusion about the meaning of the word "credence." Basically, it means the following: From Beauty's point of view, which probability should she assign to the hypothesis that the coin landed heads? What it doesn't mean is: What side did the coin land on? That, of course, she cannot answer.
I fully understand the word "credence". Read what you just wrote again, carefully. "Assigning credence to the hypothesis that the coin landed heads" and asking her which side the coin landed on are essentially the same question. That's not a philosophical or mathematical issue, it's a linguistic issue, and a basic one at that.

If she were able to give credence to the hypothesis that the coin landed on heads, then in turn, she would have to conclude that in all likelihood, the coin landed on heads.

The fact is, she can't give credence to either scenario, as she doesn't know what the hell is going on. She just woke up from being asleep, and has no way of guessing how long she was asleep for.

spindown: 2. "1/3 position." You can argue for this position based on long-run statistics. Imagine you repeat the experiment 1,000 times. Then the coin is expected to land heads 500 times and tails 500 times. Beauty would be awoken 500 times on Monday after the coin landed heads, 500 times on Monday after the coin landed tails, and 500 times on Tuesday after the coin landed tails. So she would be awoken 1,500 times, but only in 500 cases the coin would have landed heads. That is, in 1/3 of her awakenings the coin had landed heads, so her credence for heads should be 1/3. (You can also argue for this position in a more analytic way that doesn't involve repetitions.)
This doesn't make sense on a number of levels. Firstly, the experiment isn't repeated a thousand times in your example. It's presented as a one-off situation, decided by a coin toss. That means there is a 50% likelihood of either outcome.

You also seem to be talking about the probability of her being asked on a Monday, which is entirely different from ascertaining which side the coin landed on. The probability of it being a Tuesday when she is asked is only 1/3, but that doesn't change the odds of the result of the coin toss.

And again, this isn't a paradox.

spindown: Regarding the philosophical implications, I'd like to get into those a little later since they require more explanation
Ready when you are :)
Vestin: While my research of this issue is limited, I'd love to share with you the marvelous Placek's Paradox (also known as Kraków-Warsaw Paradox) which happens to undermine the theory.
hmmm ... I'll have to think about it some more, but I'm not really sure that this is a true paradox either. I don't understand the setup - you know enough about the car to assign a probability, but not enough to know the actual deterministic behavior of the car? That seems more of a problem of knowledge and models than a genuine undermining of Bayes' Theorem and the applications it actually is used for in Science. I dunno ... I'll have to think about it some more, but it seems like a misapplication ... but maybe that was the point? I guess I'm more just confused about the setup of the Kraków-Warsaw problem than the solution.
Post edited March 02, 2012 by crazy_dave
crazy_dave: That seems more of a problem of knowledge and models (...)
As far as I remember - that's what the theory is all about. With perfect information your certainty would be at 100% and there would be no need for a theory dealing with probability and predictions.

crazy_dave: (...) Bayes' Theorem and the applications it actually is used for in Science.
The theory itself is from the realm of epistemology, which is why I've heard about it in the first place.
I'm thinking about an extreme situation. Let's say that if the coin comes up tails, Sleeping Beauty needs to go through the experiment for 1,000 days rather than 2. Now the "1/3 logic" no longer makes sense, since it would imply that Sleeping Beauty only has a 1/1001 chance of escaping this torture.
In fact, I think the problem with the "1/3 logic" is that it mixes up two entirely different probabilities - the probability of the coin flip, and the probability of which day Sleeping Beauty wakes up on (assuming the coin comes up tails).
I think the actual odds here are 2:1:1 (heads & Monday : tails & Monday : tails & Tuesday).
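For what it's worth, the arithmetic behind that 1/1001 figure, as a minimal sketch (assuming a fair coin and 1,000 awakenings on tails):

```python
# Minimal sketch: count awakenings over many imagined repetitions.
runs = 1_000                           # imagined repetitions, half heads / half tails
heads_awakenings = (runs // 2) * 1     # one awakening per heads run
tails_awakenings = (runs // 2) * 1000  # 1,000 awakenings per tails run

# Per-awakening frequency of heads (the "1/3 logic" generalised):
print(heads_awakenings / (heads_awakenings + tails_awakenings))  # 1/1001
# Per-experiment frequency of heads:
print((runs // 2) / runs)                                        # 0.5
```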
Why does this question need to be pondered? What's its value?

I am just curious here. You get lots of famous philosophical conundrums like this that people consider extensively but I don't see their worth. Perhaps I just need to be enlightened.
dmetras: Ahh, ok. I stand by the 1/2 position. More awakenings from a tails landing doesn't change the toss-up probability. The coin will always have two sides.
I agree with this; there isn't really a paradox here, except that Sleeping Beauty will not know which day she is waking up on.

There are only two outcomes of the toss:

1- Heads: she gets woken up on Monday.
2- Tails: she gets woken up on Monday, made to forget, and then woken up on Tuesday.

There are no other options, so I would not consider it a paradox. The only thing that gets close to a paradox is the viewpoint of the sleeper, as she cannot be sure whether she wakes up on Tuesday without some sort of observation.

For me this felt like a poor man's Schrödinger's cat, but there it can at least be argued that the cat exists in two states until it is observed. That does not apply here, since it is only from the sleeper's viewpoint, and I would find it hard to argue that the observer (the sleeper) can exist in two states...
Post edited March 02, 2012 by amok
PandaLiang: I'm thinking about an extreme situation. Let's say that if the coin comes up tails, Sleeping Beauty needs to go through the experiment for 1,000 days rather than 2. Now the "1/3 logic" no longer makes sense, since it would imply that Sleeping Beauty only has a 1/1001 chance of escaping this torture.
In fact, I think the problem with the "1/3 logic" is that it mixes up two entirely different probabilities - the probability of the coin flip, and the probability of which day Sleeping Beauty wakes up on (assuming the coin comes up tails).
I think the actual odds here are 2:1:1 (heads & Monday : tails & Monday : tails & Tuesday).
I have no idea where you got the part about probability of escape from, but you got it right in your second paragraph. To sum the whole thing up in short:

- In a single, one off event decided by a coin toss, there is a 50% chance of either outcome. Bayes' Theorem is irrelevant, because there is no new information to be subjective about. A coin was tossed, it either landed on heads or tails, and that's all she's got to go on.

- If you're talking about probability of her being woken on a Monday or a Tuesday, yes, they have different odds, but they are irrelevant to the initial question because they don't affect the result of the coin toss.
Vestin: As far as I remember - that's what the theory is all about. With perfect information your certainty would be at 100% and there would be no need for a theory dealing with probability and predictions.

The theory itself is from the realm of epistemology, which is why I've heard about it in the first place.
So the problem is a misapplication of Bayes' Theorem which I take it is the point of the paper - i.e. people applying the theorem when they shouldn't?

One can construct an even more extreme example:

You have two coins which at the start of the experiment are both facing tails-end up: Both coins (labeled 1 and 2) are weighted to land heads-up 80% of the time and tails-up 20% of the time. You can only toss coin 2 if you get tails on coin 1. What is the probability after the experiment of coin 2 now facing heads-up given that coin 1 landed tails?

By the logic stated in the paper:
P(C2-Heads | C1-Tails) = P(C1-Tails | C2-Heads) * P(C2-Heads) / P(C1-Tails)

Since one knows one has to have had tails on C1 to even have tossed coin 2, C1 must have been tails (similarly, to reach Warsaw one must travel through Janki), so P(C1-Tails | C2-Heads) = 1. We know P(C2-Heads) = 0.8 and P(C1-Tails) = 0.2, so P(C2-Heads | C1-Tails) = 0.8/0.2 = 4. Clearly nuts. :)

So what went wrong? Bayes' Theorem was misapplied. See it's true that one must reach a certain result for coin 1 in order to toss coin 2 at all but that only determines whether or not the coin 2 toss happens - not the result of the toss. So if coin 1 lands tails-up, coin 2's toss is now independent. When dealing with independent events:

P(A | B) = P(A and B) / P(B). However, since A and B are independent, P(A and B) = P(A) * P(B). So P(A | B) = P(A). That would be the correct application of the theorem.
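A quick simulation sketch (my addition, using the stated parameters: both coins land heads 80% of the time, and coin 2 is tossed only when coin 1 lands tails) backs up that independence point:

```python
import random

# Estimate P(C2-Heads | C1-Tails) by simulation; it should be ~0.8, not 4.
random.seed(0)
trials = 200_000
c1_tails = 0
c2_heads_given_c1_tails = 0

for _ in range(trials):
    coin1_heads = random.random() < 0.8
    if not coin1_heads:              # coin 2 is only tossed on a coin-1 tails
        c1_tails += 1
        if random.random() < 0.8:    # coin 2: heads with probability 0.8
            c2_heads_given_c1_tails += 1

print(c2_heads_given_c1_tails / c1_tails)  # ~0.8: the two tosses are independent
```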

Let's do a quick sanity check: in the coin example, what is the probability that coin 2 lands tails-up given that coin 1 lands tails-up? From the law of total probability:

P(C2-Heads | C1-Tails) + P(C2-Tails | C1-Tails) = 1
0.8 + x = 1
x = 0.2 = P(C2-Tails)

Now let's do the other one:

P(C2-Heads | C1-Tails) + P(C2-Tails | C1-Tails) = 1
4 + x = 1
x = -3 = just as impossible as P(C2-Heads | C1-Tails) = 4 :)

I can make my example act exactly like the Kraków-Warsaw example - just make the first coin weighted at 0.3/0.7 heads/tails and the second one 0.5/0.5. However, since I can play with the weights of the coins in more ways than the distances can be played with, I can push the logic in the Kraków-Warsaw example further.

Essentially, the setup of the Kraków-Warsaw problem misleads one into thinking one might have extra information upon reaching Janki, but because one doesn't really have new information, Bayes' theorem is then misapplied in this instance.

It's an interesting example of how one can be confused about whether or not there is a statistical dependency between two events. BUT Bayes' still works just dandy if you get the dependency right. :) That's my take on it at least. That may be the point of the paper, though I'm simply too tired to comprehend that. :)

EDIT: One can formulate a near-identical problem where one does have more information upon reaching Janki - let's say you are given a car that is equally likely to have anywhere from 1 to 10 whole gallons of gas in it. You are told you need at least 4 gallons to reach Janki (7/10 probability) and at least 6 gallons to reach Warsaw (5/10 probability). You know you have at least 4 gallons in the car upon reaching Janki - that leaves only 4-10 as possible values - so the probability of reaching Warsaw has been raised to 5/7. In this instance, using Bayes' theorem upon reaching Janki is correct, the conditionalist is right, and the determinist is wrong. However, the paper does apply the constraint that gas consumption is indeed known to be linear ... in which case the conditionalist is probably right (I'll have to think about it) ... remove that restriction and the paradox definitely still "works".
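A minimal enumeration sketch of that edited example, under the assumptions just stated (1-10 whole gallons, each equally likely; at least 4 gallons reaches Janki, at least 6 reaches Warsaw):

```python
from fractions import Fraction

gallons = range(1, 11)                                 # 1-10, equally likely
p_janki = Fraction(sum(g >= 4 for g in gallons), 10)   # 7/10
p_warsaw = Fraction(sum(g >= 6 for g in gallons), 10)  # 5/10, i.e. 1/2
# Conditioning on having reached Janki restricts the possibilities to 4-10:
p_warsaw_given_janki = Fraction(sum(g >= 6 for g in gallons),
                                sum(g >= 4 for g in gallons))  # 5/7
print(p_janki, p_warsaw, p_warsaw_given_janki)
```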
Post edited March 02, 2012 by crazy_dave
crazy_dave: It's an interesting example of how one can be confused about whether or not there is a statistical dependency between two events.
In a compendium I have here, it is merely a step in analyzing Bayesianism as one of the inductive methods of reasoning.

crazy_dave: BUT Bayes' still works just dandy if you get the dependency right.
3 pages later I get to something like this (my own translation):
"The discussion above suggests that to use Bayes' Theorem with merit we need to exert knowledge about the level of relevancy of accounts for the hypothesis in question".
This conclusion makes things quite easy, since we need to know additional things to know the things we want to know... and we get our beloved ad infinitum.

Don't take this personally, this doesn't make the method useless, it's merely flawed, to an extent... But that is merely to say that it isn't perfect, which is a state of affairs I'm more than willing to agree upon ;).

BTW - Bas van Fraassen's response to Placek is hilarious and I'm glad you didn't follow the same path. VF said that when we arrive in Janki, this means that out of 10 cars, 3 have stalled already, so ours is one of the remaining 7 and therefore it is more likely that it's going to get to Warsaw. In response to which my wonderful book basically notes that there aren't 9 additional cars out there ;P.

EDIT:
crazy_dave: so the probability of reaching Warsaw has been raised to 5/7
Well - this is what I get for praising you. Before I finish my post, you do this xD.

Fun fact: the amount of gas in the tank is discussed as another factor in my version (my book assumes the tank is full at the start and variations occur because of travel speed, weather conditions, etc. The version I linked assumes linear change but unknown amount of gas...)
Post edited March 02, 2012 by Vestin
Vestin: In a compendium I have here, it is merely a step in analyzing Bayesianism as one of the inductive methods of reasoning.

3 pages later I get to something like this (my own translation):
"The discussion above suggests that to use Bayes' Theorem with merit we need to exert knowledge about the level of relevancy of accounts for the hypothesis in question".
This conclusion makes things quite easy, since we need to know additional things to know the things we want to know... and we get our beloved ad infinitum.

Don't take this personally, this doesn't make the method useless, it's merely flawed, to an extent... But that is merely to say that it isn't perfect, which is a state of affairs I'm more than willing to agree upon ;).

BTW - Bas van Fraassen's response to Placek is hilarious and I'm glad you didn't follow the same path. VF said that when we arrive in Janki, this means that out of 10 cars, 3 have stalled already, so ours is one of the remaining 7 and therefore it is more likely that it's going to get to Warsaw. In response to which my wonderful book basically notes that there aren't 9 additional cars out there ;P.

Well - this is what I get for praising you. Before I finish my post, you do this xD.
I'm not taking it personally :) I'm not a Bayesian or Frequentist statistician, so I have no horse in the race of whose interpretation is "better" - I apply the tools and techniques of one or the other when I happen to feel the situation merits or requires it. That doesn't mean I always pick the right one - I am human - and similarly I feel this is more a human error in the application of Bayes' theorem than a flaw in the philosophy itself.

I just think that for any theorem you have to have the information relevant to using it - or you can't apply it correctly. That seems somewhat tautological. If all you have is the information given to you in the problem, you might not be able to use Bayes' theorem (although on second thought I think for this particular instance of Kraków-Warsaw they might give you enough to use it correctly - read my example in the new edit above). But obviously if you don't have enough information - i.e. the state of B doesn't actually give new information about A - then Bayes' still works fine; you just have to not be misled into thinking B does actually give new information about A. I don't see an ad infinitum or a flaw in the method, but merely a flaw in what the method was applied to - i.e. this is a result of human error applying the wrong equation and set of conditions; if you apply the right ones within Bayes, you do get the right answer out.

Now, if there are Bayesian philosophers who are applying conditional logic when they have only shaky grounds for believing that the information from one set actually does impact the knowledge about a different set (when they don't actually know that, or the sets are actually independent) - then yes, I completely agree that they are doing something very wrong, as can be shown by the coin example.
Vestin: Fun fact: the amount of gas in the tank is discussed as another factor in my version (my book assumes the tank is full at the start and variations occur because of travel speed, weather conditions, etc. The version I linked assumes linear change but unknown amount of gas...)
I think your book is right, but the idealizations in the version you linked to are too strict and may make the conditionalist right. If the gas consumption per unit of travel is unknown, but on average there is a 7/10 chance of reaching Janki and a 5/10 chance of reaching Warsaw, then getting to Janki doesn't tell you jack about getting to Warsaw, since you don't know how much gas you have left or what mileage you're going to get from it. If on the other hand you know what the gas consumption is and you know you had enough to reach Janki, then you do indeed get extra information from reaching Janki (because you know what the minimum amount of gas in the tank had to have been). It all hinges on whether or not getting to Janki provides you with extra information. For instance, my coin example is like your book's example - getting to Janki doesn't tell you anything, any more than getting tails on coin 1 tells you anything about getting heads on coin 2. One can still use Bayes' theorem, but it's pointless, because the events (coin 2 versus coin 1 given that coin 1 is tails, or getting to Warsaw versus getting to Janki) are independent, so you're really just going to get back the original probability of each event occurring individually. And if one misapplies Bayes' theorem in either case, then one can easily get a shockingly wrong answer, as I showed when I took the coin example to its logical extreme (i.e. showed how one might plausibly apply a conditional probability and get a clearly nonsense result). :)
Post edited March 02, 2012 by crazy_dave
crazy_dave: I apply the tools and techniques of one or the other when I happen to feel the situation merits or requires it.
Oh, are you of the "particular sciences" ?

crazy_dave: I am human - and similarly I feel this is more a human error in the application
crazy_dave: this is a result of human error applying the wrong equation and set of conditions
Dude, WTF ? This is not a "mistake", it's a part of some paper or other. You can argue that for "how far can I drive ?" the information "I have driven this far and I'm still driving" is irrelevant but that only raises further concerns. Besides - this was probably a jab at the iterative process of - consider this - some form of this theory, perhaps a more naive (earlier ?) one.

EDIT: Martial Law, relevant to the Kraków-Warsaw example, had been in effect since 13 XII 1981, so this is an argument against Bayesianism from roughly 30 years ago.

crazy_dave: if you apply the right ones within Bayes, you do get the right answer out.
That's circular logic... which is the whole point. How do you know which conditions are relevant and to what extent ? Will you use the Bayesian method to reveal them (ad infinitum) or some other method (in which case - why bother with Bayesianism if there's another way of finding reliable answers ?).

crazy_dave: It all hinges on whether or not getting to Janki provides you with extra information.
No, that's not the question - that's the ANSWER. Placek claims that the theory would assume it as evidence and modify probability... but he states that it's unreasonable, since you've gained no relevant information. It's under the side-title "problems with the conditionalization rule".

crazy_dave: And if one misapplies Bayes' theorem in either, then one can easily get a shockingly wrong answer
That's the iPhone of theorems: "The theory is correct, you're simply using it wrong" @_@.

crazy_dave: i.e. showed how one might reasonably apply a conditional probability and get a clearly nonsense result
I think the point of his paper was to show that the theory CAN lead to nonsense. Your explanation is very vague - as long as there's no clear way to know what you need to know BEFORE using the theory... it's not that great of a theory, especially since it's supposed to deal with learning.
Post edited March 02, 2012 by Vestin
Vestin: No, that's not the question - that's the ANSWER. Placek claims that the theory would assume it as evidence and modify probability... but he states that it's unreasonable, since you've gained no relevant information. It's under the side-title "problems with the conditionalization rule".

That's the iPhone of theorems: "The theory is correct, you're simply using it wrong" @_@.

I think the point of his paper was to show that the theory CAN lead to nonsense. Your explanation is very vague - as long as there's no clear way to know what you need to know BEFORE using the theory... it's not that great of a theory, especially since it's supposed to deal with learning.
In the coin example, coin 2 ending up as heads or tails has nothing to do with whether or not coin 1 was tails, once coin 1's result of 'tails' is known. If someone were then to claim that the condition gave them extra information relevant to coin 2's state, they would be mistaken and would reach a laughably wrong conclusion. They cannot use the information of coin 1 landing tails to help them gauge coin 2's result. And that's clear before you begin.

In the K-W example it is trivial to construct a version where the information gained during a run is relevant, and the example given I feel is flawed because I think you can gain relevant information whose relevance you would already know from the problem setup - not something you only discover by gaining the information. Here, take another example: there are 3 cars, and in each car is a chip that kills the engine at a given marker on the road: 1, 2, or 3. Each marker will stop 1 car and 1 car only. You don't know at which marker the car you are in will stop. At the beginning of the experiment you thus have a 1/3 probability of reaching marker 3. If, however, while driving you don't stop at marker 1, the problem has essentially reset. You are now at the start of a new experiment where there are only two cars and two markers. You know, by the constraints imposed by the problem, that one of the two cars MUST reach marker 3. By passing marker 1, you have shifted the probability of being in the car that reaches the end from 1/3 to 1/2. The constraints of the problem give you that information. The Janki example I gave was identical - if you reach Janki you can work out the minimum fuel you must have had, and you know how much fuel you need. The problem has essentially shifted from the initial starting point you left to Janki. At the first starting point, you didn't know how much fuel you had: anywhere from 1 (minimum) to 10 (maximum) gallons. Taking Janki to now be your new "starting position", you still don't know how much fuel you have, but the unknown you are now dealing with is 0-6 gallons. Thus while at the first starting site you had a 5/10 chance of reaching Warsaw, you now have a 5/7 chance of reaching Warsaw.
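A minimal enumeration sketch of the three-car example (my addition; it assumes, as stated above, that one car stops at each of markers 1, 2, and 3, and that yours is equally likely to be any of them):

```python
from fractions import Fraction

stop_markers = [1, 2, 3]          # the marker at which "your" car could stop
prior = Fraction(sum(m == 3 for m in stop_markers), len(stop_markers))  # 1/3
# Passing marker 1 eliminates that outcome and resets the problem:
remaining = [m for m in stop_markers if m > 1]
posterior = Fraction(sum(m == 3 for m in remaining), len(remaining))    # 1/2
print(prior, posterior)
```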

Now one can easily impose a looser set of constraints by which you still wouldn't have any better idea which car you were in or how much fuel you had until you actually reached the final marker or Warsaw.

In science for instance, one often has an unknown prior. There are scientists who will then run their model on a data set with a flat prior, then use the results of that model run as the prior for the next run of the model on the exact same data, and then continue to iterate. That process is obviously circular in exactly the manner you describe and condemn.
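As a toy illustration of that circularity (my own sketch, not anyone's actual model): a Beta-Binomial posterior fed back in as the prior on the very same data keeps sharpening even though no new information has arrived.

```python
# Hypothetical fixed data set: 6 heads, 4 tails, used over and over.
heads, tails = 6, 4
alpha, beta = 1.0, 1.0            # flat Beta(1, 1) prior to start

for step in range(1, 5):
    alpha += heads                # conjugate update with the *same* data
    beta += tails
    mean = alpha / (alpha + beta)
    sd = (alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))) ** 0.5
    print(f"pass {step}: posterior mean {mean:.3f}, sd {sd:.3f}")
# The mean stays near 0.6 while the spread collapses, overstating certainty.
```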
Post edited March 02, 2012 by crazy_dave
crazy_dave: the example given I feel is flawed
I think the point is this: find a way to force the Bayesian probability to "recalculate" while, on the intuitive level, you know that the probability hasn't changed one bit. As soon as you demand assessment of relevance using prior knowledge - the problem changes a bit.

crazy_dave: Here, take another example: there are 3 cars, and in each car is a chip that kills the engine at a given marker on the road: 1, 2, or 3.
That's precisely the way it does NOT work.

crazy_dave: Taking Janki to now be your new "starting position", you still don't know how much fuel you have, but the unknown you are now dealing with is 0-6 gallons. Thus while at the first starting site you had a 5/10 chance of reaching Warsaw, you now have a 5/7 chance of reaching Warsaw.
Cute note #1: Think about it as a "half empty / half full" issue. You say that by getting this far I am more certain that I have enough fuel to reach Warsaw... You know what I say ? I say "Damn. With all the fuel I've used up until this point, I'm sure to run out any moment now". In other words - I'd argue that the probability of reaching Warsaw keeps DECREASING with every gallon burnt while simultaneously INCREASING with every mile driven... ultimately staying the same throughout the trip. Note - this is my personal speculation.
Cute note #2: Imagine taking a test and thinking there's a 10% chance you got an A. You then find out you passed the test. Does this increase your probability of getting an A ?

crazy_dave: That process is obviously circular in exactly the manner you describe and condemn.
No - that's not what I meant at all ! The circularity is BACKWARDS - before you begin your experiment, you need to know which information is relevant and which isn't. What method are you going to use to determine that ? Arbitrary a priori assignment of a probability also bugs me, but that's a different issue.

All in all - I'm out of my element. Philosophy of Science is not my strong suit, so I'll leave things at this (at least until I do further research). I just wanted you to know that I'm fairly certain Placek's argument makes sense in SOME way - that guy is one of the smartest people I've had the honor of meeting... but that's just a personal hunch, so make of it what you will.

Also - it seems we've monopolized (not to mention derailed) this discussion beyond all forum decency >_>.
Vestin: I think the point is this: find a way to force the Bayesian probability to "recalculate" while, on the intuitive level, you know that the probability hasn't changed one bit. As soon as you demand assessment of relevance using prior knowledge - the problem changes a bit.

That's precisely the way it does NOT work.

Cute note #1: Think about it as a "half empty / half full" issue. You say that by getting this far I am more certain that I have enough fuel to reach Warsaw... You know what I say ? I say "Damn. With all the fuel I've used up until this point, I'm sure to run out any moment now". In other words - I'd argue that the probability of reaching Warsaw keeps DECREASING with every gallon burnt while simultaneously INCREASING with every mile driven... ultimately staying the same throughout the trip. Note - this is my personal speculation.
Cute note #2: Imagine taking a test and thinking there's a 10% chance you got an A. You then find out you passed the test. Does this increase your probability of getting an A ?

No - that's not what I meant at all ! The circularity is BACKWARDS - before you begin your experiment, you need to know which information is relevant and which isn't. What method are you going to use to determine that ? Arbitrary a priori assignment of a probability also bugs me, but that's a different issue.

All in all - I'm out of my element. Philosophy of Science is not my strong suit, so I'll leave things at this (at least until I do further research). I just wanted you to know that I'm fairly certain Placek's argument makes sense in SOME way - that guy is one of the smartest people I've had the honor of meeting... but that's just a personal hunch, so make of it what you will.

Also - it seems we've monopolized (not to mention derailed) this discussion beyond all forum decency >_>.
Actually I think we're done, because I think I understand where the confusion lies :) I actually agree with you. Bayesian arguments do contain an element of circularity, AND it is in the prior. However, once you have a prior and a conditional as defined by the problem, calculating the posterior from a conditional distribution is simple and neither circular nor misleading - it's simply a calculation.

So as to your cute notes #1 and #2: yes, given the distribution of scores and gas you can calculate the probability of getting an A or reaching Warsaw once you know/are told which part of the distribution your score/gas fill-age lies in. That's simply probability - in fact, beyond the Bayesian/Frequentist statistical philosophies, that's just the law of total probability at that point, and Bayes' formula is simply a convenient equation to use (i.e. Frequentists and Bayesians agree on this point - they may disagree about how one thinks about scores and likelihoods, but they both use many of the same tools, including Bayes' formula). It's never "wrong". However, in real life and in Science the prior (i.e. the overall distribution of scores or gas) may be wrong or unknown, and building that prior is where Frequentist and Bayesian analyses and interpretations of results differ. I'll explain below:

In the problems given, I was under the impression that the prior can be assumed true - i.e. you've been "told" the coin has been weighted, you've been told there is an equal probability of having 1-10 gallons, you've been told the parameters of the chips that stop the cars, and the people telling you this a priori information know for an absolute fact that these are the right priors, because they are essentially the Gods of the experiments.

However, how does one really know how the coin is weighted - the prior of the coin problem? And I began to suspect that might be what you were aiming for. That's why I brought up the last point about what scientists sometimes do to create "priors" - it wasn't the initial use of the flat prior that was the issue. Yes, you can always argue that the use of an arbitrary prior will lead to wrong results, and in fact Bayesians frequently argue that against frequentist statistics - that frequentists are simply Bayesians with arbitrary flat priors that don't fit reality. The argument against Bayesians by the frequentists is that in order to construct a prior you're supposed to have pre-information ... which comes from what, exactly? If you get it from your model, a la the method I described where you apply your model, get a posterior distribution, then use that as the new prior for your model on the same data, that is clearly circular - you're using the same information and model to develop the prior, so you are just going to reconfirm your prior over and over and over again. Other methods might have a lot less, to arguably no, circularity, at least methodologically, but epistemically they are still circular, because you need some seed information to start yourself off with a reasonable and yet not "flat" prior. However, the Bayesians argue that's all of Science - that in order to make a hypothesis you first need an observation, something that doesn't fit current theory or for which there is no theory. That hypothesis then needs to make predictions, which you test by making other observations. Epistemically, they argue, all Science contains this element of circularity, so you might as well use it to your advantage.

However, again, to stress this point: Bayes' formula is never wrong. :) If you have a prior handed to you by the designer of the problem (and they know how they've designed the problem), and the problem is set up so that there is a conditional distribution, the posterior is indeed probabilistically and fundamentally sound. The trouble is that in real life, most of the time, the prior is not known in that fundamental way and must be gotten from previous information or trials, which themselves could be methodologically circular or have unknown errors. Bayesians argue this is a strength of the approach; Frequentists hate it to their very soul. :)
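To make the "it's simply a calculation" point concrete, here is a minimal sketch of cute note #2 with made-up numbers (the 10% prior comes from the note itself; the 70% overall pass rate is purely my assumption):

```python
from fractions import Fraction

p_a = Fraction(1, 10)            # prior: 10% chance of an A
p_pass = Fraction(7, 10)         # assumed overall pass rate (hypothetical)
p_pass_given_a = Fraction(1)     # an A always counts as a pass
# Bayes' formula, applied as a plain calculation given prior and conditional:
p_a_given_pass = p_pass_given_a * p_a / p_pass
print(p_a_given_pass)            # 1/7, so learning "passed" does raise it a bit
```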
Post edited March 03, 2012 by crazy_dave