
Estimating Sample Sizes for the Negative Binomial.

Imagine you have an event (let's call it a "failure", though we could equally well call it a success if we felt it was a 'good' event) that you know will occur in 1 in N trials. You may want to know how many trials you need to conduct to be P% sure of observing at least k such failures. If the failure events follow a negative binomial distribution (each trial either succeeds or fails) then the static member function `negative_binomial_distribution<>::find_minimum_number_of_trials` can be used to estimate the minimum number of trials required to be P% sure of observing the desired number of failures.

The example program neg_binomial_sample_sizes.cpp demonstrates its usage.

It centres around a routine that prints out a table of minimum sample sizes (number of trials) for various probability thresholds:

```void find_number_of_trials(double failures, double p);
```

First define a table of significance levels: each is the maximum acceptable probability that `failures` or fewer events will be observed.

```double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
```

The confidence value as a percentage is (1 - alpha) * 100, so alpha = 0.05 corresponds to 95% confidence that the desired number of failures will be observed. The values range from a very low 0.5 (50% confidence) up to an extremely high 0.00001 (99.999% confidence).

Much of the rest of the program is pretty-printing. The important part is the calculation of the minimum number of trials required for each value of alpha, using:

```(int)ceil(negative_binomial::find_minimum_number_of_trials(failures, p, alpha[i]));
```

find_minimum_number_of_trials returns a double, so `ceil` rounds this up to ensure we have an integral minimum number of trials.

```void find_number_of_trials(double failures, double p)
{
  // trials = number of trials
  // failures = number of failures before achieving required success(es).
  // p        = success fraction (0 <= p <= 1.).
  //
  // Calculate how many trials we need to ensure the
  // required number of failures DOES exceed "failures".

  cout << "\n""Target number of failures = " << (int)failures;
  cout << ",   Success fraction = " << fixed << setprecision(1) << 100 * p << "%" << endl;
  cout << "____________________________\n"
          "Confidence        Min Number\n"
          " Value (%)        Of Trials \n"
          "____________________________\n";
  // Now print out the data for the alpha table values.
  // sizeof(alpha)/sizeof(alpha[0]) is the number of elements in the table.
  for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
  { // Confidence values %:
    cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]) << "      "
      // find_minimum_number_of_trials
      << setw(6) << right
      << (int)ceil(negative_binomial::find_minimum_number_of_trials(failures, p, alpha[i]))
      << endl;
  }
  cout << endl;
} // void find_number_of_trials(double failures, double p)
```

Finally we can produce some tables of minimum trials for the chosen confidence levels:

```int main()
{
  find_number_of_trials(5, 0.5);
  find_number_of_trials(50, 0.5);
  find_number_of_trials(500, 0.5);
  find_number_of_trials(50, 0.1);
  find_number_of_trials(500, 0.1);
  find_number_of_trials(5, 0.9);

  return 0;
} // int main()
```

Note: Since we're calculating the minimum number of trials required, we err on the safe side and take the ceiling of the result. Had we been calculating the maximum number of trials permitted to observe fewer than a certain number of failures, we would have taken the floor instead, and we would have called `find_maximum_number_of_trials` rather than `find_minimum_number_of_trials`, like this: `floor(negative_binomial::find_maximum_number_of_trials(failures, p, alpha[i]))`, which would give us the largest number of trials we could conduct and still be P% sure of observing `failures` or fewer failure events, when the probability of success is p.
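
The two rounding directions described in the note can be illustrated with a pair of small helpers. This is a sketch only: `as_minimum_trials` and `as_maximum_trials` are hypothetical names, not part of Boost.Math, and the estimate of 17.3 used below is an invented value chosen for illustration.

```cpp
#include <cmath>

// Hypothetical helpers, not part of Boost.Math.

// A minimum is rounded up: if 17.3 trials are needed, 17 is not enough.
int as_minimum_trials(double estimate)
{
    return static_cast<int>(std::ceil(estimate));
}

// A maximum is rounded down: if 17.3 trials are allowed, 18 is too many.
int as_maximum_trials(double estimate)
{
    return static_cast<int>(std::floor(estimate));
}
```

For the illustrative estimate 17.3, `as_minimum_trials` gives 18 and `as_maximum_trials` gives 17. An estimate that is already a whole number is left unchanged by both.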

We'll finish off by looking at some sample output. First, suppose we wish to observe at least 5 "failures" with a 50/50 (0.5) chance of success or failure:

```Target number of failures = 5,   Success fraction = 50%

____________________________
Confidence        Min Number
 Value (%)        Of Trials
____________________________
    50.000          11
    75.000          14
    90.000          17
    95.000          18
    99.000          22
    99.900          27
    99.990          31
    99.999          36
```

So 18 trials or more would yield a 95% chance that at least our 5 required failures would be observed.
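
As a cross-check that does not depend on Boost, the table entries can be approximated by brute force using only the standard library. The sketch below assumes the criterion is "the risk of observing `k` or fewer failures in `n` trials is at most alpha", with failures counted as binomial with per-trial failure probability 1 - p. That reading is an assumption on our part rather than the library's implementation, and the function names are hypothetical. It requires 0 < p < 1 and 0 < alpha < 1.

```cpp
#include <cmath>

// P(X <= k) for X ~ Binomial(n, q): the probability of observing k or fewer
// failures in n trials when each trial fails with probability q (0 < q < 1).
double prob_at_most_k_failures(int n, int k, double q)
{
    double term = std::pow(1.0 - q, n);   // P(X = 0)
    double sum = term;
    for (int j = 1; j <= k && j <= n; ++j)
    {
        // P(X = j) from P(X = j - 1), using the ratio of successive terms.
        term *= static_cast<double>(n - j + 1) / j * q / (1.0 - q);
        sum += term;
    }
    return sum;
}

// Smallest whole number of trials n for which the risk of seeing k or fewer
// failures is at most alpha. p is the success fraction, as in the example.
int brute_force_min_trials(int k, double p, double alpha)
{
    const double q = 1.0 - p;   // per-trial failure probability
    int n = k + 1;              // fewer trials cannot give more than k failures
    while (prob_at_most_k_failures(n, k, q) > alpha)
        ++n;
    return n;
}
```

Under these assumptions, `brute_force_min_trials(5, 0.5, 0.05)` returns 18, matching the 95% row: with 17 trials the risk of 5 or fewer failures is 9402/131072, about 7.2%, while with 18 trials it is 12616/262144, about 4.8%. The linear search is adequate for small tables like this one but is not a substitute for the library's direct estimate.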

Compare that to what happens if the success ratio is 90%:

```Target number of failures = 5.000,   Success fraction = 90.000%

____________________________
Confidence        Min Number
 Value (%)        Of Trials
____________________________
    50.000          57
    75.000          73
    90.000          91
    95.000         103
    99.000         127
    99.900         159
    99.990         189
    99.999         217
```

So now 103 trials are required to observe at least 5 failures with 95% certainty.

 Copyright © 2006-2010, 2012-2014, 2017 Nikhar Agrawal, Anton Bikineev, Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert Holin, Bruno Lalande, John Maddock, Jeremy Murphy, Johan Råde, Gautam Sewani, Benjamin Sobotta, Nicholas Thompson, Thijs van den Berg, Daryle Walker and Xiaogang Zhang Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)