Monday, July 28, 2014

Workflow diagram

So apparently the web has moved on to HTML5; let's get some diagrams going.
[Workflow diagram: new issues arrive from users or automated crash reports as untriaged bugs; triage, popular vote, and module owners confirm them (or mark them unusable); a developer takes possession, builds a patch on an mq branch, pushes it through a try run (buildbot build cache, test slaves, TBPL build/test results) and review (r? / r- / push), and lands it on incoming; sheriffs watch TBPL for regressions and respond with tree closures or backouts; changes then flow through the experimental, unstable/master, testing, and frozen/beta branches to stable and out to users via common usage; fixed bugs are QA-examined, documented on the wiki, verified, and closed.]
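The graph itself is easy to regenerate programmatically. Below is a minimal sketch using the graphviz Python package (an assumption — the original diagram may have been produced with a different tool), covering only a slice of the nodes and edges above:

    # Hedged sketch: rebuilding a slice of the workflow graph with the
    # graphviz Python package. Node names and edge labels come from the
    # diagram above; the choice of tool is an assumption.
    from graphviz import Digraph

    g = Digraph("workflow")
    g.edge("users", "bug_untriaged", label="new issue or automated crash report")
    g.edge("bug_untriaged", "bug_new", label="popular vote")
    g.edge("bug_new", "developer", label="looked at")
    g.edge("developer", "bug_assigned", label="takes possession")
    g.edge("bug_assigned", "mq", label="new branch")
    g.edge("mq", "patch")
    g.edge("patch", "try", label="testsuite")
    g.edge("try", "reviewer", label="r?")
    g.edge("reviewer", "incoming", label="push")
    g.edge("incoming", "unstable")
    g.edge("unstable", "testing")
    g.edge("testing", "frozen", label="release driver")
    g.edge("frozen", "stable")
    g.edge("stable", "users", label="common usage")

    print(g.source)                       # emit DOT text
    # g.render("workflow", format="svg")  # or write workflow.svg for embedding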

Thursday, July 24, 2014

Decisions

On unfamiliar ground, such as a changing industry or a new product or business launch, models can be dangerous. In the messy world we live in, they often make invalid assumptions or identify non-existent patterns. Few research efforts have focused on analyzing historical data or identifying similarities between phenomena (which would, in effect, treat them as replications amenable to statistical tools). Thus, most statistical methods are untested and, at best, partial solutions, since there is often no consensus on the relevant variables and their relationships with outcomes. Although complex regression models have been developed and shown to be somewhat accurate within their domains, they generally consider only a small subset of the factors that predict success. Even once a statistical method is proven and applicable, many challenges remain unsolved in visualization and user interaction. Interactively and quickly exploring algorithms, parameters, metadata, and data sets requires a great deal of functionality: zooming, highlighting, filtering, clipping, sorting, smoothing, searching, plotting, focusing, lensing, and side-by-side visualization. Few graphical toolkit libraries implement all of these, let alone in a fashion that scales to large data sets.

Given this, many decisions still require human consideration and judgment. Unfortunately, humans are subject to many biases. When presented with a data sample, people naturally begin comparing the problem with the sample. Due to anchoring, even an irrelevant sample will influence their decision. Due to effects such as backfire, bandwagon, decoy, and focusing effects, it can even push their answer in the wrong direction. Even supposing the data does correspond deeply to the problem at hand and leads to the correct conclusion, hindsight bias still distorts estimates of the uncertainty of the results, and hyperbolic discounting distorts their perceived payoffs.

Using multiple data samples encourages a statistical and historical view of the problem and thus produces more accurate forecasts. It compresses the natural try-fail cycle people use to formulate solutions, letting them learn from the past and infer relationships between determinants and outcomes. On average, using more data samples generates more candidate strategies and makes it more likely that the chosen strategy succeeds.

The similarity between the samples and the problem has an interesting effect. When generating strategies, it is most useful to have a set of distantly related or even unrelated samples to consider, as these generate better ideas. However, closely related examples are a better basis for predicting performance or evaluating strategies, and can greatly reduce over-optimistic evaluations.

It is thus necessary to have both distantly and closely related samples for a creative and correct decision. Unfortunately, even when encouraged to do so, people are not naturally inclined to form a broad set of samples; they suffer from recall or availability bias. Using a sample set selected randomly from a representative reference class by an unbiased method, such as partitioning into intervals or deciles, generally reduces this bias. Focusing on the relative similarity of samples and the aspects of them that generalize to multiple cases shifts the discussion towards the empirical facts and deep structures of the problem at hand, avoiding superficial details and their unwanted effects. Robust analogizing, with differential attention to the most and least similar cases, increases the precision of outcome analysis and greatly facilitates learning from other people's experiences.
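As a concrete illustration, here is a minimal sketch of decile-based selection, assuming each case in the reference class can be summarized by a single numeric score (the data and names are purely illustrative):

    # Minimal sketch: draw one case per decile of a reference class so the
    # subset spans the whole distribution rather than whatever comes to mind.
    # The scores are synthetic stand-ins for real case data.
    import numpy as np

    rng = np.random.default_rng(0)
    scores = rng.normal(size=200)                        # summary score per case
    edges = np.quantile(scores, np.linspace(0, 1, 11))   # decile boundaries
    decile = np.digitize(scores, edges[1:-1])            # decile 0..9 per case

    sample = [rng.choice(np.flatnonzero(decile == d)) for d in range(10)]
    print(sorted(int(i) for i in sample))                # indices of the chosen cases

Repeating the draw with a different seed also gives a cheap check on how fragile any conclusion is to the particular subset chosen, as noted in step 3 of the procedure below.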

The following procedure incorporates these ideas:
  1. Define the problem and your purpose
    1. predictors, relevant categories or potentialities
    2. the plan/timeline/budget/milestone variables desired
    3. What kind of comparison is being made
  2. Generate a reference class of similar/analogous problems/cases, using as many analogies and references as possible
    1. The class should be universal ("all possible states/choices/outcomes/things of the world")
    2. Common analogies include position, diversity, resources, cycles, economics, and ecology
    3. Make design choices about the study - data sources etc.
  3. Select a set of samples from the reference class. The set should be unbiased and distribution-based, e.g. a "random subset".
    1. predetermine the method of selection, using a rigorously structured approach
    2. avoid sampling from memory (recall/availability bias)
    3. avoid using a large set of samples - subsampling allows you to either measure the fragility of a conclusion (if the procedure is repeated) or reduce labor (if not)
  4. Assess the source cases - research
    1. strategies that were pursued, relevant qualities/quantities, and how they turned out ("results")
    2. interaction effects between states, choices, and outcomes - historical analogies
  5. Assess the similarity of the source cases to the target
    1. subjective weighting
      1. rank & rate for similarity - crowdsourcing survey, multiple observers, estimate observer reliability
      2. aggregate with a robust mean function (e.g. a trimmed mean; see the sketch after this list) - give small weight to outliers
      3. can use non-experts or experts depending on domain / availability of info
    2. rule-based similarity - if relevant features/importance are known
  6. Construct an estimate of results, using the similarity weightings
    1. create measure for comparison of outcome, e.g. money
    2. obtain priors over the probability of the outcome occurring
    3. hierarchical cluster analysis - avg. of ~6 - identify the "top" cluster containing the samples most similar to the problem (see the sketch after this list)
    4. estimate the average outcomes of those samples - use them to predict the problem
  7. Assess the predictability of outcomes - regression on samples using estimate from (6) and other known variables
    1. Make statistical corrections of the estimate
    2. Adapt or translate results
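To make steps 5 and 6 concrete, here is a hedged sketch of the core computations: trimmed-mean aggregation of the observers' similarity ratings (5.1.2), average-linkage hierarchical clustering (reading "avg. of ~6" in 6.3 as roughly six clusters, which is an assumption), and a similarity-weighted outcome estimate (6.4). It assumes numpy and scipy are available; all data and names are illustrative placeholders, not a prescription:

    # Hedged sketch of steps 5-6. All data here is synthetic; in practice the
    # features, outcomes, and ratings come from the research in step 4.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.stats import trim_mean

    rng = np.random.default_rng(1)
    n_cases, n_raters = 30, 7
    features = rng.normal(size=(n_cases, 4))                  # researched case qualities
    outcomes = (features @ np.array([1.0, -0.5, 0.2, 0.0])
                + rng.normal(scale=0.3, size=n_cases))        # observed results
    ratings = rng.uniform(0, 1, size=(n_raters, n_cases))     # similarity to the target

    # 5.1.2: aggregate the observers' ratings with a robust (trimmed) mean,
    # so outlying raters get little weight.
    similarity = trim_mean(ratings, 0.2, axis=0)

    # 6.3: average-linkage hierarchical clustering into ~6 clusters; the "top"
    # cluster is the one whose members are rated most similar to the problem.
    labels = fcluster(linkage(features, method="average"), t=6, criterion="maxclust")
    top = max(set(labels), key=lambda c: similarity[labels == c].mean())

    # 6.4: the similarity-weighted average outcome of the top cluster becomes
    # the estimate for the problem at hand.
    members = labels == top
    estimate = np.average(outcomes[members], weights=similarity[members])
    print(f"top cluster {top}: estimated outcome {estimate:.2f}")

Step 7 would then regress the known outcomes of the samples on estimates produced this way, plus any other known variables, to calibrate and correct the prediction before adapting it to the target.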