mmcdermott committed Aug 16, 2016
1 parent 4496de3 commit 2cc560b
Showing 11 changed files with 107 additions and 288 deletions.
Binary file added assets/Example Output Expanded.png
Binary file added assets/Example Output.png
Binary file added assets/Global Facts.png
36 changes: 21 additions & 15 deletions basic_modeling/input_types.md
@@ -18,15 +18,20 @@ the word 'to', like this: `low to high`.

Guesstimate can convert your confidence interval into three different possible formal distributions.

1. [Normal Distributions](https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0ahUKEwjZ1ZDQyrfLAhVkr4MKHXOxDHsQFggcMAA&url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FNormal_distribution&usg=AFQjCNEOuAsc3h-p3E2f0u3Cnkdz3Np1kQ&sig2=u_UI4k0Y9zBKC7DA8vx6VQ&bvm=bv.116573086,d.dmo): This should be used when you think that values near the center of your range are more likely than values near the edges of your range, and values outside your range are possible, but increasingly unlikely.
Guesstimate interprets your input as a 90% CI distributed symmetrically about the mean.
2. [Uniform Distributions](https://en.wikipedia.org/wiki/Uniform_distribution_(continuous\)) : This should be used when
1. [Normal Distributions](https://en.wikipedia.org/wiki/Normal_distribution):
This should be used when you think that values near the center of your range are more likely than values near the
edges of your range, and values outside your range are possible, but increasingly unlikely. Guesstimate interprets
your input as a 90% CI distributed symmetrically about the mean.
2. [Uniform Distributions](https://en.wikipedia.org/wiki/Uniform_distribution_(continuous\)): This should be used when
you are 100% certain that the value would fall within your range, and it is equally likely it would fall anywhere
within your range.
Guesstimate interprets your input as the full range of possible values the metric could take.
3. [Lognormal Distributions](https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0ahUKEwjxk_XRyrfLAhXswYMKHUxfB6sQFggdMAA&url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FLog-normal_distribution&usg=AFQjCNH6r87BB9IaLASYhr0DIL88rh0OGQ&sig2=Uckv53L7BzDE_SuiDJKmqQ): This should be used when your value must be positive, and is more likely near the left edge of your range than the right, but has a long tail of possibility to the right (more specifically, when values are symmetrically likely on the log scale across the multiplicative center of your range).
Guesstimate interprets your input as a 90% CI with your left endpoint at the 5th percentile and your right at the 95th percentile.
For lognormal, both endpoints of your confidence interval must be positive.
3. [Lognormal Distributions](https://en.wikipedia.org/wiki/Log-normal_distribution):
This should be used when your value must be positive, and is more likely near the left edge of your range than the
right, but has a long tail of possibility to the right (more specifically, when values are symmetrically likely on
the log scale across the multiplicative center of your range). Guesstimate interprets your input as a 90% CI with
your left endpoint at the 5th percentile and your right at the 95th percentile. For lognormal, both endpoints of
your confidence interval must be positive.
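
As a rough illustration of how a 90% CI maps onto distribution parameters, the sketch below places the endpoints at the 5th and 95th percentiles, as described above, and solves for the normal and lognormal parameters. The function names are hypothetical and this is not Guesstimate's own code; it assumes only the standard relationship between percentiles and the normal distribution.

```python
import math
from statistics import NormalDist

# z-score of the 95th percentile of a standard normal (about 1.645).
Z_95 = NormalDist().inv_cdf(0.95)


def normal_from_ci(low, high):
    """Return (mean, sd) of the normal whose 5th/95th percentiles are low/high."""
    mean = (low + high) / 2
    sd = (high - low) / (2 * Z_95)
    return mean, sd


def lognormal_from_ci(low, high):
    """Return (mu, sigma) on the log scale for a lognormal with a 90% CI of low..high."""
    if low <= 0 or high <= 0:
        raise ValueError("both endpoints must be positive for a lognormal")
    # The interval is symmetric on the log scale around the multiplicative center.
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z_95)
    return mu, sigma


def uniform_from_range(low, high):
    """A uniform input uses the endpoints as the full range of possible values."""
    return low, high
```

For an input such as `1 to 100`, the lognormal's multiplicative center `exp(mu)` is the geometric mean of the endpoints, which is why the distribution leans toward the left edge of the range.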

##### Proportions

@@ -44,19 +49,20 @@ reflective of the precision in your estimate (e.g. `1 of 5` will have more unce

![](https://s3.amazonaws.com/elevio-article-assets/565e550e67ffc/5674c67d9330e_function.png)

Values can be functions of other metrics. To do this, simply begin the field with an 'equals' sign, followed by the
formula. Each metric has a two letter variable name.
Values can be functions of other metrics. To do this, simply begin the field with an `=` sign, followed by the
formula. Each metric has a two letter variable name which will show as soon as you type the `=` sign in the value field.
To use another metric in your function, you simply refer to it by its associated two letter variable name. You can type
this variable name explicitly or simply click on that metric while the function is selected to insert it.

You can type this explicitly or simply click on that metric while the function is selected to insert it. You can also
use functions to specify specific distributions, with additional parameters. For example, if you wish to specify a
normal distribution by mean and standard deviation, you can do this via the functional form. This is covered in the
[Additional Distributions](../functions/distributions.md) article.
You can also use functions to specify specific distributions, with additional parameters. For example, if you wish to
specify a normal distribution by mean and standard deviation, you can do this via the functional form. This is covered
in the [Additional Distributions](../functions/distributions.md) article. You can read more about all supported
functions [here](../functions/README.md).

#### Custom Data

![](https://s3.amazonaws.com/elevio-article-assets/565e550e67ffc/56df472b84913_custom-data-example.png)

Custom data can be entered directly into the 'value' field by simply pasting a stream of comma, enter, or space
separated values. You can also expand the card into its full, expanded view, then edit the custom data field directly.
This view accepts comma separated values as well. Data will be up or down sampled to approximately 5000 samples, to
match the other nodes.
separated values. You can also expand the card into its full, expanded view, then edit the custom data samples directly.
Inputted samples will be used as an empirical distribution in all downstream functions.
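
To make the accepted input format concrete, here is a minimal sketch of splitting pasted text on commas, spaces, and newlines, then drawing from the result as an empirical distribution. Both helper names are hypothetical, and the resampling step is an assumption about what "empirical distribution" means here rather than a description of Guesstimate's internals.

```python
import random
import re


def parse_custom_data(text):
    """Split pasted text on commas and any whitespace, returning floats."""
    tokens = re.split(r"[,\s]+", text.strip())
    # Empty tokens appear for blank input or stray leading/trailing commas.
    return [float(token) for token in tokens if token]


def draw_from_empirical(samples, n, rng):
    """Draw n values with replacement, each input sample being equally likely."""
    return rng.choices(samples, k=n)


pasted = "12, 15 9\n22,18"
samples = parse_custom_data(pasted)
draws = draw_from_empirical(samples, 10, random.Random(0))
```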
16 changes: 11 additions & 5 deletions basic_modeling/interpreting_your_results.md
@@ -1,11 +1,17 @@
# Interpreting Your Results

![](https://s3.amazonaws.com/elevio-article-assets/565e550e67ffc/56e86fc169786_small-card.png)
![](../assets/Example Output.png)

The standard view shows the average observed value and the 90% confidence interval (CI) around that value; in this example, the node says that 90% of the time, the number of months you could survive is within 6.5 - 3.2 = 3.3 and 6.5 + 3.2 = 9.7 months.
The metric card will show the expected value of that metric and a 90% confidence interval (CI) around that value; in this
example, the card shows that the metric has an expected value of 78, and 90% of the time the value of the metric is
between 37 and 160. This 90% confidence interval is formed from the upper 95% of the samples below the mean and the
lower 95% of the samples above the mean.

![](https://s3.amazonaws.com/elevio-article-assets/565e550e67ffc/56e873a551d5b_both.png)
![](../assets/Example Output Expanded.png)

The expanded view shows the same mean and CI (red box) and a table of percentiles (blue box). Percentiles show how likely it is that the observed value would fall below a threshold; here, there is a 1% chance that you could only survive for fewer than 3.08 months, and a 95% chance that you could only survive for fewer than 9.539 months.
The expanded view shows the same mean and CI (red box) and a table of percentiles (blue box). Percentiles show how
likely it is that the observed value would fall below a threshold; here, there is a 1% chance that you could only
survive for fewer than 15.781 months, and a 95% chance that you could only survive for fewer than 46.562 months.

To ask, "How often will I be able to survive for more than 9.539 months?", simply subtract the percentile for that value (95%) from 100%. Here, you could survive for more than 9.539 months only 5% of the time.
To ask, "How often will I be able to survive for more than 46.562 months?", simply subtract the percentile for that value
(95%) from 100%. Here, you could survive for more than 46.562 months only 5% of the time.
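
The percentile arithmetic above can be sketched on raw samples. This assumes a metric is represented as a plain list of sampled values; the function names are hypothetical and chosen only for illustration.

```python
def chance_at_or_below(samples, threshold):
    """Fraction of samples at or below the threshold (its percentile, as a fraction)."""
    return sum(1 for value in samples if value <= threshold) / len(samples)


def chance_above(samples, threshold):
    """Complement of the percentile: how often the value exceeds the threshold."""
    return 1 - chance_at_or_below(samples, threshold)


# Toy data: the integers 1..100, so 95 sits exactly at the 95th percentile.
toy_samples = list(range(1, 101))
below = chance_at_or_below(toy_samples, 95)
above = chance_above(toy_samples, 95)
```

With the toy data, `below` is 0.95 and `above` is the remaining 5%, mirroring the subtraction from 100% described above.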
27 changes: 21 additions & 6 deletions facts.md
@@ -1,8 +1,20 @@
# Facts

*Facts are available only to organizations with private plans.*
## Global Facts

Have a few similar metrics that are used in multiple sheets? You can use facts to have some constants that will be similar among all of your models.
Guesstimate maintains a database of global facts that can be used in any function in any model. Right now, this
dataset is limited to population sizes for a select but extensive list of cities. To use these facts, type
an `@` sign, followed by the city name, then a `.` followed by the word `population` within a functional form, like
this:

![Global facts use expressions like '=@Chicago.population'](./assets/Global Facts.png)

## Organizational Facts

*Organizational facts are available only to organizations with private plans.*

Have a few similar metrics that are used in multiple sheets? You can use facts to have some constants that will be
similar among all of your models.

Organization level facts are currently all private and are only available for use in private models.

@@ -18,18 +30,21 @@ or

``40K to 43K``.

The value can be a data input. It cannot be a function of other facts. Also, if this is a range, you will not be able to choose between Normal, Lognormal, and Uniform distributions.
The value can be a data input. It cannot be a function. Also, if this is a range, you will not be able to choose between
the standard distribution options for that range.

The **name** is whatever you want to use to refer to the fact. It is only used for your own reference.

The **hashtag** is what you can use to refer to the fact inside of a function. For instance,

``= #monthly_revenue / 12 ``

Facts can be simply added or edited. They are findable on the organizations' page. We suggest being careful with deleting facts: while this is possible, if you have a model that uses that fact, that model may break.
Facts can be simply added or edited. They are findable on the organizations' page. We suggest being careful with
deleting facts: while this is possible, if you have a model that uses that fact, that model may break.

### Using Facts

Facts are simple to use in models. In a function, type the hashtag that represents a fact in order to refer to it. This should auto-complete, so after typing the first few characters, you can click **TAB** to complete the word. This is sometimes a bit buggy; if there are issues, we suggest copying & pasting the name.
Facts are simple to use in models. In a function, type the hashtag that represents a fact in order to refer to it.
This should auto-complete, so after typing the first few characters, you can press **TAB** to complete the word.

In the case that the fact does not exist any more, there should be an error that indicates this.
Facts are still an experimental feature; if you encounter any issues in their usage, please contact us.
4 changes: 4 additions & 0 deletions functions/README.md
@@ -1,5 +1,9 @@
## Functions

Guesstimate uses [math.js](http://mathjs.org/) for math parsing. Most operators, functions, and constants from math.js
are available, but units are not. The math.js documentation lists the constants
[here](http://mathjs.org/docs/reference/constants.html).

* [Operators](operators.md)
* [Available Functions](existing_functions.md)
* [Financial Functions](finance_functions.md)
76 changes: 22 additions & 54 deletions functions/distributions.md
@@ -1,56 +1,24 @@
# Distributions

Guesstimate supports a variety of statistical distributions beyond those selectable from confidence intervals. If the input parameters to these distribution functions are deterministic, 5000 samples will be generated at those parameter values. If the inputs are themselves sampled, one sample will be drawn, per input sample.

[Beta](https://en.wikipedia.org/wiki/Beta_distribution)
`=beta(α, β)`

[Central F](https://en.wikipedia.org/wiki/F-distribution)
`=centralF(d<sub>1</sub>,d<sub>2</sub>)`

[Cauchy](https://en.wikipedia.org/wiki/Cauchy_distribution)
`=cauchy(x<sub>0</sub>,γ)`

[Chi-squared](https://en.wikipedia.org/wiki/Chi-squared_distribution)
`=chisquare(k)`

[Exponential](https://en.wikipedia.org/wiki/Exponential_distribution)
`=exponential(λ)`

[Inverse-gamma](https://en.wikipedia.org/wiki/Inverse-gamma_distribution)
`=invgamma(α, β)`

[Gamma](https://en.wikipedia.org/wiki/Gamma_distribution)
`=gamma(k, θ)`

[Lognormal](https://en.wikipedia.org/wiki/Lognormal_distribution)
`=lognormal(μ, σ)`

[Normal](https://en.wikipedia.org/wiki/Normal_distribution)
`=normal(μ, σ)`

[Student's T](https://en.wikipedia.org/wiki/Student%27s_t-distribution)
`=studentt(ν)`


[Weibull](https://en.wikipedia.org/wiki/Weibull_distribution)
`=weibull(λ,k)`


[Uniform (continuous)](https://en.wikipedia.org/wiki/Uniform_distribution_(continuous))
`=uniform(a,b)`


[Bernoulli](https://en.wikipedia.org/wiki/Bernoulli_distribution)
`=bernoulli(p), =test(p)`


[Binomial](https://en.wikipedia.org/wiki/Binomial_distribution)
`=binomial(n,p)`

[Negative Binomial](https://en.wikipedia.org/wiki/Negative_binomial_distribution)
`=negBinomial(r,p)`


[Poisson](https://en.wikipedia.org/wiki/Poisson_distribution)
`=poisson(λ)`
Guesstimate supports a variety of statistical distributions beyond those selectable from confidence intervals. If the
input parameters to these distribution functions are deterministic, 5000 samples will be generated at those parameter
values. If the inputs are themselves sampled, one sample will be drawn, per input sample.
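
The sampling rule above can be sketched as follows. This is an illustration under stated assumptions, not Guesstimate's implementation: `sample_normal` is a hypothetical helper, a sampled input is modeled as a plain Python list, and `N_SAMPLES` simply mirrors the 5000 mentioned in the text.

```python
import random

N_SAMPLES = 5000  # sample count used when every parameter is deterministic


def sample_normal(mu, sigma, rng):
    """Sample a normal whose parameters may be numbers or lists of samples."""
    # A sampled parameter fixes the output length: one draw per input sample.
    if isinstance(mu, list):
        n = len(mu)
    elif isinstance(sigma, list):
        n = len(sigma)
    else:
        n = N_SAMPLES
    mus = mu if isinstance(mu, list) else [mu] * n
    sigmas = sigma if isinstance(sigma, list) else [sigma] * n
    return [rng.gauss(m, s) for m, s in zip(mus, sigmas)]


rng = random.Random(0)
fixed = sample_normal(10, 2, rng)             # deterministic inputs: 5000 draws
uncertain_mu = sample_normal(100, 10, rng)    # another metric, itself sampled
nested = sample_normal(uncertain_mu, 1, rng)  # one draw per sample of the mean
```

Pairing each draw with a single input sample lets the uncertainty in the parameter carry through to the result, instead of collapsing the parameter to its average first.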

| Distribution Name | Use Cases | Syntax |
| ----------------- | --------- | ------ |
| [Beta](https://en.wikipedia.org/wiki/Beta_distribution) | Estimating Proportions or Percentages | `=beta`$$(\alpha, \beta)$$ |
| [Central F](https://en.wikipedia.org/wiki/F-distribution) | Testing the Variance of Observed Samples | `=centralF`$$(d_1, d_2)$$ |
| [Cauchy](https://en.wikipedia.org/wiki/Cauchy_distribution) | The x-intercept of a ray with uniformly distributed angle | `=cauchy`$$(x_0, \gamma)$$ |
| [Chi-squared](https://en.wikipedia.org/wiki/Chi-squared_distribution) | The sum of the squares of normal random variables | `=chisquare`$$(k)$$ |
| [Exponential](https://en.wikipedia.org/wiki/Exponential_distribution) | The waiting time until the occurrence of a rare event with a specified rate. | `=exponential`$$(\lambda)$$ |
| [Gamma](https://en.wikipedia.org/wiki/Gamma_distribution) | A generalization of the sum of exponential random variables | `=gamma`$$(k, \theta)$$ |
| [Inverse-gamma](https://en.wikipedia.org/wiki/Inverse-gamma_distribution) | The reciprocal of a gamma random variable | `=invgamma`$$(\alpha, \beta)$$ |
| [Lognormal](https://en.wikipedia.org/wiki/Lognormal_distribution) | The product of many positive, independent random variables | `=lognormal`$$(\mu, \sigma)$$ |
| [Normal](https://en.wikipedia.org/wiki/Normal_distribution) | The sum of many independent random variables | `=normal`$$(\mu, \sigma)$$ |
| [Student's T](https://en.wikipedia.org/wiki/Student%27s_t-distribution) | An estimator for the difference between the true mean and the mean of N independent samples of a random variable, for small N. | `=studentt`$$(\nu)$$ |
| [Weibull](https://en.wikipedia.org/wiki/Weibull_distribution) | The lifetime of a component for which failure rate is proportional to a power of time | `=weibull`$$(\lambda, k)$$ |
| [Uniform (continuous)](https://en.wikipedia.org/wiki/Uniform_distribution_(continuous)) | An estimate where all equally sized intervals have the same likelihood | `=uniform`$$(a,b)$$ |
| [Bernoulli](https://en.wikipedia.org/wiki/Bernoulli_distribution) | The value 1 (success) with probability $$p$$, and 0 (failure) otherwise. Used for accounting for discrete trials. | `=bernoulli`$$(p)$$, `=test`$$(p)$$ |
| [Binomial](https://en.wikipedia.org/wiki/Binomial_distribution) | The sum of $$n$$ independent Bernoulli distributions with parameter $$p$$ | `=binomial`$$(n,p)$$ |
| [Negative Binomial](https://en.wikipedia.org/wiki/Negative_binomial_distribution) | The number of successes before $$r$$ failures are reached in a series of Bernoulli trials with parameter $$p$$ | `=negBinomial`$$(r,p)$$ |
| [Poisson](https://en.wikipedia.org/wiki/Poisson_distribution) | The number of events occurring in a fixed interval, with known average rate $$\lambda$$, if events occur independently. | `=poisson`$$(\lambda)$$ |