We have compiled and organized some of the most frequently used terms and definitions for your convenience. Click through to view all the definitions available.




See Artificial Intelligence.

A mathematical formula, set of rules, procedure, or some combination of these. These are often the “raw materials” or components for machine learning. Algorithms can be well or poorly designed. They can be easy to understand, or inscrutable black boxes. The Association for Computing Machinery has offered seven principles for algorithmic transparency and accountability.

(See Machine Learning, Black Box, Glass Box)

A term with multiple meanings:

- The discipline and practice of using logical, Boolean, and quantitative methods to find answers and support decisions, whether for business analysis, scientific analysis, risk analysis, academic research, or for some other purpose such as Sabermetrics in baseball
- Results of analytic efforts reported to decision makers and clients
Lone Star ® *(this is what we mean when we use the term)*

A term with multiple meanings:

- Discovery and communication of data patterns
- Use of analysis methods of all types
- The discipline and practice of different types of analysis
- The work done during a project that delivers analysis
Lone Star ® *(this is what we mean when we use the term)*

ANOVA attempts to understand whether data samples came from populations with the same average. We expect random draws won’t add up to the same average all the time. ANOVA gives insight about whether the differences might be due simply to randomness.
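A minimal sketch of the idea, computing the one-way ANOVA F statistic directly (the sample values and group sizes are invented for illustration):

```python
import statistics

def one_way_anova_f(*groups):
    """One-way ANOVA F statistic: between-group variance over within-group variance."""
    all_data = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_data)
    k = len(groups)       # number of groups
    n = len(all_data)     # total observations
    # Between-group sum of squares: each group mean vs. the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: each point vs. its own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Samples drawn from populations with similar averages give a small F;
# a shifted third group inflates it by orders of magnitude.
similar = one_way_anova_f([5.1, 4.9, 5.0], [5.2, 4.8, 5.1], [5.0, 5.1, 4.9])
shifted = one_way_anova_f([5.1, 4.9, 5.0], [5.2, 4.8, 5.1], [9.0, 9.1, 8.9])
```

A small F suggests the group averages differ no more than random draws would; a large F suggests the samples came from populations with different averages.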

A term with more than one meaning. Today, most people (including Lone Star) mean endowing computers with abilities which would have required a human in the past. This includes doing things which would have required human skills, as well as mimicking processes which seem to be “cognitive,” including making choices, communicating on human terms, and adaptive learning.

This definition is broader than older definitions from the early days of AI. Early AI definitions were rooted in concepts from Utility Theory and biological concepts based on the limited understanding of neurons at the time. These included the assumption humans were “rational” and thus a good basis for AI architectures. Early AI also embraced the claim that human intelligence might “… be so precisely described that a machine can be made to simulate it.”

Two AI valleys of despair (AI winters) occurred in the late 20^{th} century. These setbacks led to broader inclusion of adaptive computing and optimal estimation which might not have been considered “real AI” earlier. Other moderating influences have been Prospect Theory (humans are not really very rational) and the realization that human neural processes are more varied and complex than previously thought.

A good example of the shifting nature of AI definitions can be seen in aircraft autopilots. Autopilots were built long before digital computers (Sperry built the first autopilot in 1912; the first computers were built in the 1940s). Digital flight control (autopilots with computers) didn’t appear until the 1970s. In today’s semantics, these might have been “AI” but were not considered AI at the time. Today, even Sperry’s hydraulic autopilot might be considered AI; it replaced human flight control and worked a lot better in 1912 than some digital bots in 2017.

(See Game Theory, Prospect Theory, Machine Learning, Neural Net, Utility Theory)

An important Prospect Theory bias, which distorts human judgment in dealing with uncertain prospects. It occurs when a decision maker judges information which is more easily recalled from memory (remembered) as being more probable. Information which is vivid (because the decision maker is already familiar with or has prior experience concerning this information) is more easily recalled and evaluated as being more probable than equally probable information with which the decision maker is not familiar.

(See Prospect Theory)

An important Prospect Theory bias, which distorts human judgment in dealing with uncertain prospects. Base rate data are ignored or devalued in favor of other, less relevant data. Often occurs when base rate data are somewhat abstract in comparison to more concrete, but less relevant additional data.

(See Prospect Theory)

The base rate fallacy is misunderstanding how real odds interact with the odds of detection. For example, if your mother can tell whether you are fibbing or telling the truth 99% of the time, what are the odds she is right when she says you are fibbing? It turns out we can’t really answer this question without knowing whether you fib a lot, a little, or not at all. But, if we know you tell the truth 90% of the time, and fib the other 10%, then your mom’s performance looks like this for 1,000 times she assesses your truthfulness:

You tell the truth 900 times. Mom correctly says you are telling the truth 891 times; 1% of the time when you tell the truth (9 times), she says you are fibbing when you are not. You fib 100 times. Mom catches 99 of those, and you get away with fibbing once out of the hundred. So – even though mom is 99% accurate, she calls you a fibber 108 times, and 9 of those accusations (about 1 in 12) are wrong.

The base rate fallacy is depending on the accuracy of the test while ignoring the base rate of what is being tested. This is just one example of how humans get confused about conditional probability, but it has life and death implications for understanding health screening, law enforcement and other serious matters. To formally do this kind of math, we might use Bayes Theorem (see).
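The arithmetic above is Bayes’ theorem in disguise. A minimal sketch (the function name is ours, and it assumes, for simplicity, that mom is equally accurate at spotting fibs and truths):

```python
def posterior_fib(p_fib, accuracy):
    """P(actually fibbing | mom says "fibbing"), via Bayes' theorem.

    `accuracy` is her hit rate in both directions (a simplifying assumption).
    """
    p_truth = 1 - p_fib
    true_accusations = p_fib * accuracy            # fibs she catches
    false_accusations = p_truth * (1 - accuracy)   # truths she mislabels
    return true_accusations / (true_accusations + false_accusations)

# With a 10% fib rate and 99% accuracy, 99 of every 108 accusations are correct.
p = posterior_fib(0.10, 0.99)   # ≈ 0.917
```

Note how the base rate drives the answer: at a 1% fib rate, the same 99%-accurate mom is wrong half the time she accuses.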

Bayes theorem is usually defined as a formula which specifies how to do probability math. In a broader sense, it has been said Bayes was to probability what Pythagoras was to triangles and geometry.

Bayesian methods specify how to change old (a priori) estimates of probabilities after receiving new information. In classic Bayes-speak, updated estimates are “posterior probabilities.”

Bayes’ Theorem and Bayesian methods let us make better estimates in the face of uncertainty. These methods are applied to a wide range of problems, from search and rescue strategy to spam filtering. For example: a lost vessel might, or might not be in some portion of a search area, and a given email might, or might not be spam. Bayesian methods provide the means to determine the optimal method for solving these problems.
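For the spam example, Bayesian updating looks like this in miniature (the word frequencies and prior are invented for illustration):

```python
def bayes_update(prior, likelihood_if_spam, likelihood_if_ham):
    """Turn a prior P(spam) into a posterior after observing one piece of evidence."""
    joint_spam = prior * likelihood_if_spam
    joint_ham = (1 - prior) * likelihood_if_ham
    return joint_spam / (joint_spam + joint_ham)

# Start from a 20% prior that a message is spam; suppose the word "winner"
# appears in 40% of spam but only 1% of legitimate mail (made-up rates).
posterior = bayes_update(prior=0.20, likelihood_if_spam=0.40, likelihood_if_ham=0.01)
```

The same call can be chained as evidence arrives: today’s posterior becomes tomorrow’s prior, which is the essence of Bayesian updating.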

Most filters that operate under uncertainty (or “noise”) are Bayesian, whether their designers know it or not. This applies to oil exploration, signal processing, control theory, economic forecasting, and any other field where such filters are used.

(See Explicit Bayesian and Implicit Bayesian, Filter, Noise)

Richard Bellman was a mathematician who made important contributions on a number of fronts. After working on the Manhattan Project in World War II, he made the first major post-war improvements in optimization mathematics. He made important contributions related to the application of Markov mathematics, and his work is still useful in optimal control theory and economics. Bellman published roughly 700 books and papers, including over 100 papers after brain surgery left him with complications. Bellman’s work is cited in a number of Ron Howard’s writings (e.g., Howard’s 1960 book, “Dynamic Programming and Markov Processes”) which in turn played a role in the founding of Decision Analysis.

(See Curse of Dimensionality, Decision Analysis, Markov Decision Processes, Howard)

Benefit Cost Analysis (BCA) is a term with several meanings. It is sometimes confused with Business Case Analysis, which shares the BCA acronym. Benefit Cost Analysis often seeks to understand, and in some sense optimize, the trade-offs between the outcomes of spending money (the “benefit”) and how much money to spend (the “cost”). BCA can have a very specific meaning, such as those imposed by the Office of Management and Budget (OMB) circular A-94 and Executive Order 12866, or it can have a more general meaning. Other related terms are cost-benefit and cost-effectiveness analysis.

(See Business Case Analysis, Business Case Analysis Simulation)

“Big Data” doesn’t have a single definition. It is a collection of concepts related by their size and their existence as databases. Usually, the term deals with more than one very large data set. These databases are so complex (or just so large) as to be impractical to explore with traditional database tools or processing methods. As data sets reach Terabyte size they contain valuable information due to their sheer mass, but present challenges of many types. At some point, Big Data must be interpreted, visualized, modeled or somehow “reduced.”

There is also a band named “Big Data,” but we don’t vouch for them. They are an electronic pop duo focused on digital user privacy and government surveillance.

A term with negative connotations suggesting an algorithm or process is hidden and mysterious. It may also suggest something secretive or untrustworthy. This charge is often levied at some forms of Artificial Intelligence, and in particular at deep learning.

(See Algorithm, Artificial Intelligence, Deep Learning, Glass Box)

A “Black Swan” is the occurrence or observation of something which seemed impossible, or previously unknown. This is also the title of a popular book on randomness, models, and markets. Swans of unusual character are labeled black, when, in fact, we can predict something of their behavior. In doing so, they are no longer black, but can be thought of as gray. They only seem black if we fail to acknowledge their potential existence and fail to look. See below:

- **White Swan** – A true white swan is familiar and predictable, well understood with proven, testable cause and effect – good predictions are made today.
- **Light Gray Swan** – A light gray swan is not as well understood as everyone thinks – surprises lurk here, hidden behind paradigms which need to be challenged; useful predictions are being made, but better, safer predictions are within reach.
- **Medium Gray Swan** – A medium gray swan is not mysterious but simply not modeled and measured – useful predictions about most causes and effects can be made, if effort is expended.
- **Dark Gray Swan** – A dark gray swan has some inherently difficult and/or complex features, but one which, nonetheless, can be understood well enough to make some useful predictions, even if some things are not knowable.
- **Black Swan** – A true black swan is completely unfamiliar and essentially unpredictable except in abstract terms; few if any useful predictions can be made.

Bootstrapping is a method used to deal with limited data sets. When available samples are much smaller than a real universe, Bootstrapping uses sampling with replacement from available data. The goal is to gain insight into a larger set of data by treating the limited set as if it were larger.
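A sketch of the basic recipe (the sample values are invented; the 90% interval and resample count are arbitrary choices):

```python
import random
import statistics

def bootstrap_mean_interval(sample, n_resamples=2000, seed=0):
    """Approximate a 90% interval for the mean by resampling WITH replacement."""
    rng = random.Random(seed)
    means = sorted(
        statistics.mean(rng.choices(sample, k=len(sample)))  # one resample
        for _ in range(n_resamples)
    )
    return means[int(0.05 * n_resamples)], means[int(0.95 * n_resamples)]

# A small sample standing in for a much larger universe we cannot observe.
small_sample = [12.1, 9.8, 11.4, 10.2, 13.0, 9.5, 10.9, 11.7]
low, high = bootstrap_mean_interval(small_sample)
```

Each resample is the same size as the original but, because draws are replaced, it differs from it; the spread across resamples hints at how much the estimate would wobble with more data.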

George Edward Pelham Box was, like Bayes, a member of the Royal Society and worked in probability and statistics. His work in World War II provided a practical, applied perspective to his approach to analysis and applied mathematics.

Box was a polymath, whose interests were wide ranging from chemistry to quality control, time-series methods, design of experiments (DOE), biography, and Bayesian inference. Box published a number of papers and books, some of which foreshadowed Big Data. A number of methods and principles bear his name, including Box–Jenkins models, Box–Cox transformations, and Box–Behnken designs.

Box is credited with being one of the 20^{th} century voices who brought attention back to Bayesian principles, while at the same time, raising concerns about what we would now call “unsupervised machine learning” and arguing for “mechanistic models” which represent cause and effect.

Box is famous for having said, *“all models are wrong, but some are useful.”*

Business Case Analysis (BCA) is a term with several meanings. In some contexts it just means a PowerPoint presentation with a list of reasons why the speaker wants to take a course of action. In other cases, it involves a cost-benefit trade study, with math models. When Lone Star ® uses the term, we usually mean a procedure with simulations to conduct a range of side-by-side scenarios and courses of action.

(See Benefit Cost Analysis, Business Case Analysis Simulation, Scenario Based Planning)

Business Case Analysis Simulation (BCAS) is a form of BCA, and/or a BCA tool. BCAS conducts BCA using simulation tools. The result is usually a richer, more insightful understanding of the cost-benefit trade space than a PowerPoint or spreadsheet analysis can provide.

(See Business Case Analysis, Simulation, Benefit Cost Analysis)

See Streetlight Effect.

The Central Limit Theorem describes how distributions, which may NOT be normal distributions, will tend to create normal distributions when samples are randomly drawn from them. Central Limit Theorem explains why a complex system, with many random variables, can have well behaved outputs in spite of unruly inputs.
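A quick demonstration: single draws from a uniform distribution are flat, anything but bell-shaped, yet averages of 50 such draws pile up around the middle with a much smaller spread (draw counts are arbitrary):

```python
import random
import statistics

rng = random.Random(42)

# 10,000 single draws from a uniform distribution on [0, 1]
single_draws = [rng.uniform(0, 1) for _ in range(10_000)]

# 10,000 averages, each over 50 uniform draws
averages = [statistics.mean(rng.uniform(0, 1) for _ in range(50))
            for _ in range(10_000)]

spread_single = statistics.stdev(single_draws)   # ≈ 0.29 for a uniform
spread_avg = statistics.stdev(averages)          # ≈ 0.29 / sqrt(50), much tighter
```

Plotting `averages` as a histogram would show the familiar bell curve emerging from a distinctly non-normal input, which is the Central Limit Theorem at work.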

(See Distribution, Normal Distribution)

A signal processing concept which also explains the limits of information channels, including human information processing, comprehension, and cognition. Usually attributed to Claude Shannon. George A. Miller of Harvard was one of the first to apply Shannon’s principles to humans, in 1956.

(See Likert Scale, Shannon)

One of the really useful things from formal statistics. This is a rule about how many times we expect a statistical test to be true. It does not depend on assumptions about the nature of our data or its distribution, making it universal and very powerful. It is also one of the uses of standard deviation that will probably survive in the age of artificial intelligence and super computers.

If we have a sample of data (a list of numbers) the fraction of data samples which are k standard deviations or more from the arithmetic mean is, at most, 1/k^2.

So, for example, if we have a sample of 100,000,000 web shoppers, and we know their average shopping cart has $25 in it, and we know the standard deviation of shopping cart value is $20, then we can think about high rollers – people who are 10 standard deviations from the mean. 10 x $20 = $200. How many will be more than $200 from the $25 mean (spending more than $225)? 1/k^2 will be 1/10^2, or 1/100. So we expect at most 1% to be over the $225 shopping cart size. But, out of our huge database, this is still roughly a million people. And, remember that the bound covers both tails; we need to know if we have to worry about the really small shopping carts too.

This only works for *common sense* standard deviations. Don’t try it with zero, negative numbers, or other strange things. What the heck would that mean anyway?
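A sketch of the inequality in use, checked against a deliberately lopsided, invented data set:

```python
import statistics

def chebyshev_bound(k):
    """Upper bound on the fraction of data at least k standard deviations from the mean."""
    if k <= 0:
        raise ValueError("k must be a positive number of standard deviations")
    return 1 / k ** 2

# An arbitrary, skewed data set with one large outlier.
data = [1, 1, 1, 2, 2, 3, 5, 8, 13, 40]
mu = statistics.mean(data)
sigma = statistics.pstdev(data)   # population standard deviation
k = 2
outliers = sum(1 for x in data if abs(x - mu) >= k * sigma)
# Chebyshev promises outliers / len(data) <= 1 / k**2, i.e. at most 25% here,
# with no assumption at all about the shape of the distribution.
```

The bound is often loose (here only one point of ten is beyond two standard deviations), but it can never be violated, which is what makes it so useful.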

The illogical or irrational processing of information, often by simply ignoring it. The human brain seems to use a number of strategies to reduce the time and energy required to think. In some cases, we might be predisposed to assume things are good (optimism bias), while in others we may assume the worst. Researchers claim there are over 175 observed biases. Although some may exist only to provide a means to publish peer reviewed papers, it is pretty clear humans have a lot of biases.

Framing is one of the most common biases: how we respond to the way information is arranged and presented to us. Many of our most important framing biases are described by Prospect Theory.

Cognitive bias is a critical consideration for modeling human behavior. Lone Star’s Competitive Differentiation offerings use very advanced methods and tools to help our clients avoid these traps, and to gain advantage when competitors fall prey to them.

(See Availability Cognitive Bias, Base Rate Bias, Prospect Theory, Kahneman, Utility Theory)

Control Systems are machines which control other machines. Some applications include thermostats, cruise control, and auto-pilot systems. Complex Control Systems are implemented in digital processors. Most Complex Control Systems must deal with noisy, uncertain or probabilistic data inputs in order to generate control signals.

(See Kalman Filter, Prescriptive Analysis)

A data sampling method chosen because it is easy, rather than controlled to be representative. For example, if we ask our co-workers whether they think a competitor’s products are priced fairly, it seems likely the results will not match the general population. But it might make more sense to ask potential customers who must choose between the firms. It is very easy to ask our colleagues, and fairly easy to buy web respondents from the general population. These both might be “convenience samples” if the real question is what real customers think.

The problem with errors from convenience samples is that it can be so hard to know when you have fallen into this trap. Even worse, data collected carefully for one purpose may (or may not) be a convenience sample in another case.

See Benefit Cost Analysis.

See Benefit Cost Analysis.

A concept of price forecasting. The price (or cost) something “could cost,” estimated without relying on historical precedents. Could cost is often a component of Business Analysis.

(See Should Cost)

Criticality is the state of a system which is neither stable nor unstable. A snow-pack just before an avalanche is critical. Systems which approach criticality seem stable, but have lost much of the constraint which retains stability.

So called “Black Swan” events can be caused by unlikely, but possible, effects on systems near criticality. Ski patrols cause controlled avalanches by firing cannons (a large input, unlikely to be seen in nature), rather than waiting for the snow-pack to build and finally collapse due to a puff of wind at criticality.

(See Black Swan)

The “curse of dimensionality” is a problem which occurs as new variables are added to data sets and statistical data models. The term was coined by Richard Bellman in 1961 who lucidly explained how some seemingly simple data sets can grow to sizes which are difficult, or impossible to manage. In some modeling methods, the data space becomes increasingly sparse. Statistical classifiers and predictors break down when available data becomes insufficient to generate useful results across all the dimensions defined by the variables. The curse also causes problems for researchers in complex problems, driving them to the use of DOE.

For most statistical methods, computational intensity, memory storage and other measures of difficulty grow exponentially as new variables are added. The problem plagues Big Data, including Big Data in word mapping or gene mapping, where an analysis might have to deal with thousands of words or genes. Lone Star ® methods help avoid the curse of dimensionality.
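A back-of-the-envelope illustration of the curse, assuming a hypothetical grid of 10 bins per variable and a fixed budget of one million samples:

```python
SAMPLES = 1_000_000  # a seemingly generous, hypothetical data set

def samples_per_cell(dimensions, bins=10):
    """Average samples landing in each cell of a uniform grid."""
    return SAMPLES / bins ** dimensions

density = {d: samples_per_cell(d) for d in (1, 2, 3, 6, 10)}
# 1 variable: 100,000 samples per cell; 6 variables: 1 per cell;
# 10 variables: 0.0001 per cell. The space has become almost entirely empty,
# and statistical estimates in those cells are hopeless.
```

Adding a variable multiplies the number of cells by the bin count, so the data requirement grows exponentially, which is exactly the effect Bellman named.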

(See Bellman, DOE, Data Reduction, Decision Analysis, Big Data)

The conversion of information into forms which can be processed to create more value and insight. Datafication is a big data idea, and a big data driver. For example, when information which had been trapped on paper becomes available for processing, Datafication has occurred. Another meaning is connected to organizations: when an organization adopts tools and processes like those associated with IoT and IIoT, it might be said to be going through Datafication. Datafication is not the same as digitization (see).

Data Architecture usually refers to the structure and architecture of a database, or the architectural relationships among databases.

The process of dealing with issues in the data used for analysis. Issues include errors, missing data, mismatched indices, and out-of-sync timing, among other things.

(See Fourth Great Lie)

A term with multiple meanings:

- An analysis technique involving the selective, and systematic withholding of data from an algorithm or simulation, with the objective of learning which information is most important by some measure, usually accuracy
- A human tendency of denying data which challenges comfort and prejudice. For example, “When we showed her the data, the CEO said, ‘Your information is incorrect.’” Data denial comes in many subtle sub-species; “your data is _____” (old, not applicable, wrong, previously refuted, etc….)

A term with multiple meanings:

- Blending diverse phenomenology such as infrared and visible light images, or electromagnetic and seismic survey information
- Dimensional diversity, comparing similar information across dimensions such as age or location in demographics
- Redundancy methods to provide security against accidental or malicious information corruption
- Redundancy methods to provide richer, higher confidence analysis

There are other less common meanings, and legal meanings related to social measures of workforce or educational diversity.

Using methods sometimes associated with Data Mining to assess stochastic model outputs. In some cases it is naturally connected to Design of Experiments (DOE). Data Farming is often required because of the Fourth Great Lie.

(See Data Mining, Design of Experiments, Fourth Great Lie)

Data mining involves extracting summaries and patterns in data. The term often is used in connection with large databases. In Big Data applications a wide range of techniques fall under the umbrella of “Data Mining”. These apply to Business Analysis, Risk Analysis, Risk Modeling, and Predictive Analytics.

(See Analytics, Business Case Analysis, Machine Learning)

Data Reduction has two meanings. Originally, it meant distilling useful information, or “signal,” from a larger set of noisy data, or untangling mixed signals into their original, “pure” components. More recently it has become an insult, hurled by Big Data advocates who imply that nearly all the data has some kind of information in it; they would claim “reduction” involves loss of value and loss of insight.

(See Flaw of Averages)

The term “Data Rules” has multiple meanings:

- In analysis terms, it can mean the rules about data usage and restrictions such as privacy, intellectual property, copyrights, governance and other rules which control the data.
- In database terms, it can mean the rules which govern the data set, relationships and other items related to Data Architecture

(See Data Architecture)

A collection of methods used to process data, usually data sets. In some cases number crunching and guessing are called “data science,” but these often fail to employ anything like the scientific method. At Lone Star ®, we believe Data Science should actually use the scientific method.

(See Extrapolation, Falsification)

A practitioner of Data Science, if actually using scientific methods.

Decision analysis (DA) includes mathematics and methodology addressing decisions in a formal manner. The term “Decision Analysis” was used in 1964 by Ronald A. Howard of Stanford University. He defined most of the practice and professional application of DA, and founded more than one company performing DA. Two Lone Star ® founders used DA and brought it to our company. DA had three generations before Enhanced Decision Analysis (EDA) defined a new level of performance.

(See Enhanced Decision Analysis)

Decision Quality is a term with multiple meanings. Generally, it means use of information which is good enough for making a choice. For example, if you have $20,000 to use for buying a car, you may not know exactly which car to buy, but you know a great deal about what cars you are NOT going to buy.

Decision Quality has a wide range of applications in Business Analysis, Risk Analysis, and Enterprise Simulation, among others.

At Lone Star ®, we make the distinction between this level of knowledge and an “execution quality” plan (knowing exactly what car to buy, where to buy it…). Practitioners of Decision Analysis use the term to describe a DA model which is “directional.”

(See Directional Model, Decision Analysis)

A term with more than one meaning. A common meaning refers to a neural network with many layers. In this context, the extra layers of the neural net often are intended to avoid the need for humans to specify the attributes (or “features”) which define the goals of the algorithm the neural net is supposed to achieve. These extra layers are sometimes called “hidden layers” and give rise to charges that deep learning lacks transparency and is a black box.

(See Black Box, Glass Box, Neural Net)

Design of Experiments (DOE) is a term with multiple meanings. Usually, DOE refers to information-gathering in circumstances where the observations are expected (or known) to be variable, or even random. DOE methods are most useful when a full observation of a process or universe is not feasible, and when noisy data is expected (or known) to be not fully under the control of the experimenter. In these cases, DOE helps to guide how the limited data set will be collected in a way to generate the most insight, given the constraints.

DOE is not new. The 18^{th} century experiments on Scurvy are sometimes cited as an early example. Ronald A. Fisher was one of the most influential 20^{th} century proponents, in particular through his 1935 book, “The Design of Experiments.” G.E.P. Box and his coworkers furthered Fisher’s work.

Among statisticians, chemical process engineers, semiconductor process engineers, market researchers, and others who use DOE methods, there are differences in semantics, and some real disagreement. Generally, the disagreements have to do with the underlying assumptions related to the application.

(See Box, G.E.P, curse of dimensionality)

A term with more than one meaning. It can refer to a representation of a design, such as a Computer Aided Design (CAD) model. It can refer to a dynamic model, such as a flight simulation.

At Lone Star ® we usually mean a simulation of a system with a degree of fidelity adequate to predict outcomes such as desired results, failures, or both. We also mean prescriptive models which suggest what to do in pursuit of desired outcomes, or to avoid problems.

(see Predictive Analysis, Prescriptive Analysis)

The conversion of analog data and records to a digital format. When we convert from an old paper picture to a JPEG file, we have digitized the image. This is not the same as Datafication (see).

A model which helps indicate a preferred choice, or predicts the ranking of outcomes, but may not provide calibrated predictions. Using a Likert scale, we can test how many people think a room is “hot” but we can’t use that method to actually measure temperature. Likert voting can also help distinguish between temperatures, but still won’t give us a calibrated measurement. Directional models often provide decision quality information.

(See Decision Quality)

The term “distribution” can have a number of meanings in business and mathematics. A probability distribution is a concept used in statistics and probability. A distribution represents the probability of any possible group of outcomes within the set that defines the distribution. Distributions can be represented by data sets, or by formulas which define a distribution.

The “Domino Test” is more of a rule of thumb than a hard diagnostic test for analytics. As a rule of thumb, if you can’t order a Domino’s Pizza at a location, you probably can’t afford to ship out your Big Data stream, either. The Domino Test is an example of how New York’s Silicon Alley and California’s Silicon Valley can easily become disconnected from “fly over country.” For example, there are eight counties in Texas with fewer than 1,000 people in them. Odds are good that you can’t get a pizza delivered in most of them, and you can’t get a cheap fiber connection for your oil well, either. Loving County, Texas for example has roughly one person per eight square miles.

Econometrics literally means “economic measurement.” It is concerned with applying mathematics (including probability and statistics) to economics and economic theory. It can be used for market estimation, pricing, and other uses. It applies to large markets (telecommunications, oil & gas, national security) and to smaller markets (your local farmer’s market).

The “edge” refers to the edge of the network. The meaning of “edge” can vary with context, network type, and other factors. For Lone Star ®, the edge is really important. We mean analytics “at the edge” (near the point of need). This is critical because networks can’t quickly or cheaply transport the terabytes of data from aircraft, oil rigs, pipelines, etc… So, local intelligence is needed for operations, safety and other important functions.

The edge is important when speed, cost, or data volume are important. The concept of edge computing is related to Fog computing.

(see IoT, IIoT, Fog)

EDA is analysis from a set of tools, mathematics, and methodology which can be characterized as the 4^{th} and 5^{th} generations of DA. EDA naturally supports ensemble methods. Lone Star is an EDA leader.

The distinctions between 4^{th} generation EDA and earlier DA include a more open decision frame (not a fixed, or frozen, set of decision criteria), as well as other differences. 5^{th} generation DA involves still more distinctions, including automation to create models of very large scale.

(See Decision Analysis)

Ensemble methods refers to modeling methods which blend more than one “pure” technique. A very simple example is the practice of averaging several models and blending their predictions.
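A sketch of that simplest possible ensemble, averaging three invented toy forecasting models (the models, names, and numbers are illustrative only):

```python
import statistics

# Three crude, hypothetical "models" of monthly sales, each with a different bias.
def model_trend(month):
    return 100 + 5 * month                      # sees only the trend

def model_seasonal(month):
    bump = 10 if month % 12 in (10, 11) else 0  # sees a holiday bump
    return 110 + 5 * month + bump

def model_flat(month):
    return 130                                  # sees nothing but a baseline

def ensemble_forecast(month):
    """The simplest ensemble: average the individual predictions."""
    return statistics.mean(m(month) for m in (model_trend, model_seasonal, model_flat))
```

Real ensembles, like the weather and retail forecasters mentioned below, weight and combine members far more carefully, but the principle of blending imperfect models is the same.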

Ensemble Methods are often found in Big Data applications in Business Analytics, and Risk Analytics; a common use is Predictive Analytics.

Weather forecasters and retail sales forecasts often involve much more sophisticated ensemble methods.

Many Lone Star models are based on ensemble methods.

Enterprise optimization is the use of enterprise simulation to improve the performance and productivity of a system or “enterprise”. In some cases, the term incorporates the meaning of optimization, or finding the “best” solution. At Lone Star ® we use both definitions, although we are one of the few organizations capable of finding truly optimal solutions for complex systems (“enterprises”).

(See enterprise simulation, optimization)

Enterprise simulation is a term with several meanings. It can mean the simulation of a specific business or “enterprise”. At Lone Star ®, we usually mean a simulation which incorporates several attributes and disciplines of a problem into a single simulation. When a simulation includes competitive issues using game theory, physics, or engineering representations of processes, business analysis, and risk analysis, Lone Star ® would call this multi-disciplinary simulation tool an “enterprise model.”

Bayesian math in the classic form. This means a very large number of conditional probability estimates for a problem of even moderate complexity. Explicitly Bayesian methods are best applied to fairly simple problems.

(See Curse of Dimensionality)

Extrapolation is one of the most common and least accurate ways to predict the future. Extrapolation is the mathematical extension of current trends into the future.

Extrapolation in July would predict, based on the temperature increase since January, all life in the Northern Hemisphere will be wiped out by New Year’s Day. The daily high rose about 70 degrees in six months, and another 70 degree increase will be fatal. This is a silly example, but the idea is the same one behind serious predictions made nearly every week in the popular press. Sadly, these are often attributed to so-called “Data Scientists” and to “Data Science”.
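The same mistake in miniature: a least-squares line fit to January–July daily highs (invented temperatures), then extended to December:

```python
# Monthly daily-high temperatures, January (0) through July (6), in deg F.
months = [0, 1, 2, 3, 4, 5, 6]
highs = [20, 28, 42, 55, 68, 80, 90]   # illustrative numbers only

# Ordinary least-squares line: slope = (Σxy − n·x̄·ȳ) / (Σx² − n·x̄²)
n = len(months)
mean_x = sum(months) / n
mean_y = sum(highs) / n
slope = (sum(x * y for x, y in zip(months, highs)) - n * mean_x * mean_y) / \
        (sum(x * x for x in months) - n * mean_x ** 2)
intercept = mean_y - slope * mean_x

december_forecast = intercept + slope * 11   # extrapolate to December
# The line fits the first half of the year well, then predicts a scorching
# winter of roughly 150 degrees: the trend is all extrapolation can see.
```

The fit itself is perfectly sound; the error is in pushing it past the data, where the mechanism (seasons) changes.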

(See Data Science)

Falsification is proving a theory untrue. Falsifiability was promoted by philosopher Karl Popper. Popper saw the topic as a watershed, or a demarcation—separating the scientific from the unscientific. Models and simulations allow us to test ideas, promoting some, and falsifying others.

(See Null Hypothesis)

A Fast Fourier Transform (FFT) is a method to convert a time-based series of observations into the frequency domain, and to convert frequency-domain representations back into time-based ones. FFTs are examples of tools which apply to a wide range of business problems, such as seismic processing for energy exploration, supply chains, and retail transaction pattern analysis. For example, Wal-Mart uses daily sales (time series) to estimate the local payroll cycles (frequency) around each store. An FFT can be used as a type of filter.
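
As a sketch of that retail use case (with synthetic data, since the real figures are not public), an FFT can pull a weekly cycle out of noisy daily sales:

```python
import numpy as np

# Synthetic "daily sales" for 182 days (26 weeks): a 7-day payroll cycle
# buried in random noise. The figures are invented for illustration.
rng = np.random.default_rng(0)
days = np.arange(182)
sales = 100 + 20 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 5, days.size)

# Convert the time series into the frequency domain.
spectrum = np.abs(np.fft.rfft(sales - sales.mean()))
freqs = np.fft.rfftfreq(days.size, d=1.0)  # cycles per day

# The strongest frequency component reveals the hidden weekly cycle.
period = 1 / freqs[np.argmax(spectrum)]
print(round(period, 1))  # → 7.0
```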

(See Filter, Fit, Frequency)

A filter can be an algorithm which “filters out” some information or noise, or a filter can be an algorithm which forms an estimate of information within a noisy signal. Filters sometimes operate in the frequency domain. Economic adjustment for seasonal variation and FM radio tuners are two examples of frequency domain filters. Filters can be rules, or algorithms. They can be as simple as taking an average, or as complex as those found in “optimal filtering.”

(See Fit, Flaw of Averages, Frequency, Noise)

Fit means to form a summary of data which “fits” a formula or a rule. Fitting often involves finding the “best fit,” the one with the smallest error in matching the data, which may be noisy.
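
A minimal least-squares sketch: recover a straight line from noisy samples by choosing the parameters with the smallest squared error (the numbers are invented for illustration):

```python
import numpy as np

# Noisy observations of the line y = 2x + 1.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 0.5, x.size)

# A degree-1 polynomial fit is the classic "best fit" straight line.
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)  # close to 2 and 1, despite the noise
```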

(See Noise, Over Fit, Flaw of Averages, Filter)

A book and concept written and promoted by Sam Savage. The Flaw of Averages occurs in all types of business and scientific estimates that focus on single average values. The average (or any single-number representation) fails to convey extremes, uncertainty, and risk. In a sense, an average can be an extreme case of over fitting.
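
A small illustration with invented numbers: two projects with identical average payoffs can carry very different risks, which the average alone cannot show:

```python
import numpy as np

# Two hypothetical projects, both averaging a payoff of 100.
rng = np.random.default_rng(2)
safe = rng.normal(100, 5, 100_000)    # small spread around the average
risky = rng.normal(100, 80, 100_000)  # huge spread around the same average

print(round(safe.mean()), round(risky.mean()))  # the averages match
print((safe < 0).mean(), (risky < 0).mean())    # the chance of a loss does not
```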

(See Data Reduction, Over Fitting)

A term originally attributed to Cisco which refers to computing distributed across the network, particularly at the Edge. Over time, computing has swung back and forth between centralized and distributed processing. Fog computing is also called Edge computing, or Fogging. The central argument for Fog computing is (to quote Cisco) that “connecting more and different kinds of things directly to the cloud is impractical.”

(See IoT, Edge)

There are three other great lies (#3 is the answer to a Carole King song). The 4th applies to analysis, and it occurs when clients tell us “we have the data.” It’s OK; we can work around it. Lone Star is proficient in finding ways to quantify what you thought was in your database.

Frequency is number of repetitions per unit of time. A metronome that counts off 90 beats per minute is counting at that frequency. The Christmas shopping season and the Lone Star ® Chili Cook-off both happen once per year. These are examples stating frequency but not quantity. Frequency does not tell us how many total beats the metronome counts out during a practice session.

Scientific measurement of frequency is often expressed in events per second, called hertz (Hz) in honor of Heinrich Hertz.

In some cases, frequency analysis describes signals by their sinusoidal characteristics, because some systems and some mathematical techniques are easier to express and assess as sums of sines and cosines, usually with each component of the sum having a different frequency and magnitude. One common method for doing this uses the FFT.

(See FFT, Filter)

Game theory is a mathematical and logical study of independent competing actors. The theory applies to a wide range of applications, but is particularly useful in competitions such as spectrum auctions, competitive bids, oil and gas leases, and other business settings.

Modern game theory is usually seen as starting with a proof about equilibria in two-person zero-sum games by John von Neumann. Von Neumann and Oskar Morgenstern followed the proof with a 1944 book, “Theory of Games and Economic Behavior.” A later edition developed an axiomatic theory of expected utility, which allowed modeling of uncertain competitive decisions. In the 1950s and ’60s it was hoped Game Theory would extend Utility Theory, explain the underlying “rationality” of human decision makers, and allow computers to mimic humans. This hope proved unfounded, for reasons largely unexplained until the development of Prospect Theory.

Lone Star ® frequently uses Prospect-Game Theory Ensemble methods.

(See Utility Theory, Prospect Theory)

A term with positive connotations, suggesting an algorithm or process is transparent and understandable. It may also suggest trustworthiness. Lone Star is strongly in favor of glass box methods.

(See Algorithm, Black Box)

Hadoop is an open-source distributed computing system, typically running across several processors. It is often installed on low-cost commodity computer clusters. “Big Data” companies like Facebook and Google use it.

Dr. Ron Howard of Stanford first used the term “Decision Analysis”. He earned a doctorate from MIT in 1958 and later joined Stanford. He pioneered methods for Markov decision problems, and was an early leader in the concept of Influence Diagrams as visual representations of decision context. In publications from 1964 through 1968, he codified much of what defined the first three generations of decision analysis, and helped found companies that offered services in decision analysis. The methods Howard described can be characterized as “Implicit Bayesian” and “Boxian” because they use probability distributions (not conditional probabilities) and because they map cause and effect (not just correlation).

(See Box, Decision Analysis, Enhanced Decision Analysis, Implicit Bayes).

The “Internet of Things” refers to the connection of embedded computing and sensing via the internet. The origins of the Internet assumed (in 1981) that computers, manned by human operators would be connected by Internet Protocol (IP). Since then, automobiles, mobile phones, and many other devices have been assigned IP addresses, and the Internet has become an Internet of Things, not just humans.

Some early proponents of the Internet of Things assumed the “Things” would be owned (mostly) by individual consumers. But IP connections to retail scanners, factory equipment, security cameras, and oil wells offer compelling value for business. It seems likely industrial applications of IOT will exceed consumer uses.

IP version 4 (IPv4) provides a theoretical maximum of about 4.2 billion unique addresses. Given the population of Earth, and the fact that not everyone has a computer, 4.2 billion seemed like enough at one time. Today it seems clear we will soon exceed a trillion sensors providing data via the IoT. While not all these devices will require unique public IP addresses, the IoT is one reason why IPv6, with far more addresses, is needed.

(See Control System, Prescriptive Analysis)

A data science acronym for Ingest, Model, Query, Analyze, and Visualize; it has more than one meaning. It can refer to the steps the data scientist takes in approaching a problem. IMQAV may also refer to how a data science effort is organized, with different participants responsible for different aspects of the effort. Some use it as a project “architecture” and by this they mean that each letter may refer to different tools, project steps, and participants.

Implicit Bayes refers to a number of methods using probability and obeying Bayes’ Theorem, but without the explicit definition of conditional probabilities. Markov Chain Monte Carlo (MCMC) methods are one example of this, as are the Monte Carlo methods used by most Decision Analysis practices. EDA is implicitly Bayesian.

A codified cause and effect mapping used in Decision Analysis. Generally, the first three generations of DA use similar diagram symbols and rules. Beyond the 3rd generation, the mappings are richer with more alternatives and more flexibility than earlier generations. At Lone Star ® we often use the term “Influence Representation” to make the distinction between our 5th generation methods and earlier methods. Influence Diagrams are sometimes confused with Decision Trees, but they are different tools. A Decision Tree breaks out possible decisions into their consequences, whereas an Influence Diagram maps out the relationships which comprise a decision.

(See Decision Analysis, Enhanced Decision Analysis)

Information economics is the economic measurement of the value of information. Specifically, it attempts to assign value to additional information. For example, in energy exploration, electromagnetic surveys may provide more information than traditional seismic surveys. Another example from advertising would be the value of targeting demographics in ad placement.

This is generally considered to be a sub-field of microeconomics and has application to game theory (information asymmetry), contract law, information technology, advertising, pharmaceutical development, oil exploration, and many other fields.

“The Use of Knowledge in Society” is a 1945 article written by economist Friedrich Hayek, and is sometimes cited as the origin of modern information economics. More recently, the writings of Douglas W. Hubbard have popularized the term.

(See Game Theory)

A mathematical construct and the foundation for signal processing, cryptography, pattern recognition, data storage and retrieval, statistical inference, and nearly all of our digital world. Data processing professionals who ignore information theory are, by definition, not practicing “data science,” regardless of their business card, or LinkedIn profile.

(See Nyquist, Shannon, Box)

Internet of Everything.

Daniel Kahneman is a Nobel Prize-winning scientist, best known for first describing a set of cognitive biases in 1979 with his coauthor, Amos Tversky. Their paper “Prospect Theory: An Analysis of Decision under Risk” has been called a “seminal paper in behavioral economics” and refutes Utility Theory.

(See Cognitive Bias, Prospect Theory, NPV, Game Theory, Utility Theory)

Kalman Filters are usually used in control systems, and are a means to find the optimal response to noisy inputs. Kalman Filters are Bayesian and are indirectly related to the EDA methods used by Lone Star ®.
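
A one-dimensional sketch of the idea (an illustration, not Lone Star’s implementation): each noisy measurement is blended with the running estimate in proportion to their uncertainties, so the estimate steadily converges on the true value:

```python
import random

random.seed(3)
true_value = 10.0   # the quantity we are trying to estimate
meas_var = 4.0      # variance of the measurement noise

estimate, est_var = 0.0, 1000.0  # start with a vague prior
for _ in range(200):
    z = random.gauss(true_value, meas_var ** 0.5)  # noisy measurement
    gain = est_var / (est_var + meas_var)          # Kalman gain
    estimate += gain * (z - estimate)              # blend in the new reading
    est_var *= 1 - gain                            # uncertainty shrinks

print(estimate)  # close to 10.0
```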

(See Control System, Enhanced Decision Analysis, Prescriptive Analysis)

Under the Law of Large Numbers, the odds improve that a sample set of data is a good representation of the entire universe of the data as the sample set gets larger. Powerful Lone Star ® methods use the law of large numbers to provide better results for our clients.
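
A quick demonstration with simulated die rolls: as the sample grows, the sample mean closes in on the true mean of 3.5:

```python
import numpy as np

# One million simulated rolls of a fair six-sided die.
rng = np.random.default_rng(4)
rolls = rng.integers(1, 7, 1_000_000)

# Larger samples give better estimates of the true mean (3.5).
for n in (10, 1_000, 1_000_000):
    print(n, rolls[:n].mean())
```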

A Likert scale is a psychometric scale commonly involved in questionnaires. It is named for psychologist Rensis Likert. Practitioners make distinctions about Likert related semantics, such as “Likert Item” vs. “Likert Scale.” Generically, the term is used to refer to an ordered set of response choices presented to respondents. Often there are five or seven choices, with the neutral response in the center of the ordered choices. Although there is controversy about the best number of choices to present, most experts agree respondents can only convey a limited number of shades of distinction. This is usually less than nine, and for most purposes five alternatives will work well. The reason for this is found in the limited channel capacity of humans, and the inability to perceive fine shades of distinction in most cases.

(See Channel Capacity)

A linear model defines a simple, linear scale-factor relationship between a result (dependent variable) and a cause (independent variable). If you use 2,500 kilowatt-hours (kWh), your usage is the independent variable. At 10 cents per kWh, your bill (the dependent variable) will be $250. Many linear models are simplifications of a non-linear world. Traditionally, most optimization methods in OR have been linear models. Lone Star ® is a leader in non-linear models and optimization.
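
The electricity bill above is a one-line linear model:

```python
def bill(kwh, rate=0.10):
    """Cost (dependent variable) as a linear function of usage."""
    return rate * kwh

print(bill(2500))  # → 250.0
```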

(see OR, optimization)

Machine learning means different things in different domains. At Lone Star ®, we mean any form of analytics based on using computers to “infer” or “learn” from data sets.

The specific form we use most often is “supervised machine learning” (because machines get confused about cause and effect). This cause-effect confusion applies to both “big data” and other data-dominated methods.

(See Artificial Intelligence, Curse of Dimensionality, Neural Net)

Markov Decision Processes (MDPs) are used to model decisions which are the result of both random, uncontrolled causes (like weather) as well as choices made by one or more decision makers. MDP is named for the Markovs, a family of mathematicians, and in particular, Andrey Markov. MDPs are related to a number of analysis and optimization topics, including dynamic programming, learning, robotics, automated control, economics, manufacturing, and other decision processes.

Not all MDP “decisions” are “decisions” in the common meaning of the word. For example, a “Cart-Pole” process is a well-known control system problem related to balance on a moving platform. It can be solved with MDP mathematics, but may not seem to be a “decision.” Other problems are more obviously “decisions” such as a Queueing theory problem called the “Cashier’s Nightmare.”

Bellman and Howard both made important contributions to MDPs during the mid-20th century. The mathematics in Decision Analysis are related to, though different from, MDPs. These relationships between MDPs and DA presage the expansion of DA to EDA, and the broader set of problems Lone Star can address with our EDA toolset. Some of the topics our EDA tools can address may not always seem to be “decisions” either.

(See Bellman, Howard, Decision Analysis, Enhanced Decision Analysis)

Harry Markowitz is a polymath who has made contributions in economics, simulation, analysis, and probability. He is best known for Modern Portfolio Theory, for which he won a Nobel Prize.

(see Modern Portfolio Theory)

The average value of a set. In the set {1, 1, 1, 3, 4, 5, 13}, 4 is the average.

The “middle number” of an ordered set. Roughly speaking, it is the value which, in a set of numbers, is greater than half of them and smaller than the other half. Probabilistically, it is the number which has an equal chance of being smaller or larger than another number drawn from the set. In the set {1, 1, 1, 3, 4, 5, 13}, 3 is the median, because it is in the middle. The median is often used in place of the mean to describe the central tendency of a data set, since it is not affected by extreme outliers.

The most likely value drawn from a probabilistic set. In the set {1, 1, 1, 3, 4, 5, 13}, 1 is the mode, because it is the most likely outcome.
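
Python’s standard library computes all three measures of central tendency for the set used above:

```python
import statistics

data = [1, 1, 1, 3, 4, 5, 13]
print(statistics.mean(data))    # → 4
print(statistics.median(data))  # → 3
print(statistics.mode(data))    # → 1
```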

A model is an abstraction that represents something else. A model is an understanding of a thing’s characteristics, separated from ultimate realities, or actual objects. At Lone Star ®, we are interested in mathematical models. In a mathematical model, we need a knowledge framework with numbers as part of the representation.

Models are found in a range of applications, including business modeling, risk modeling, and decision analysis, to name a few.

Some models observe relationships and correlation without describing cause and effect relationships. There is a danger in these models, often built with machine learning. They can come to conclusions like “Cancer causes smoking.” Other models focus on cause and effect, which Box called “mechanistic” models. Mechanistic models require model architecture based on knowledge about the relationships of the things represented in the model. There is a danger in these models that relationships are omitted or misstated.

(See Box, Business Case Analysis, Simulation, Machine Learning)

The Nobel-winning work of Harry Markowitz, and derivative work which led to several other Nobel Prizes. Markowitz simply called it “portfolio theory”; he derived the mathematical relationships among risk, return, and portfolio construction.

The basic idea of MPT is that some risks are correlated. If we have two investments with exactly the same expected return, we have lower net risk if their returns are uncorrelated. The insight that some asset performance is positively correlated, uncorrelated, and negatively correlated forms the basis of portfolio construction.

MPT shows that for a given level of risk tolerance, there is a maximum expected return. Across the two-dimensional risk/return space there is a boundary defining the feasible portfolios. On that boundary, we have selected the best portfolio we can expect to choose for a given level of risk.

(See Markowitz, Utility Theory)

Monte Carlo simulations approximate random or uncertain phenomena. This type of simulation is widely used in Decision Analysis, Econometrics, “practical statistics,” in queuing theory, control theory, market analysis, and other fields where many variables are unknown, or unknowable as fixed, deterministic quantities. These apply to pricing, competitive bids, telecommunications, networking, oil and gas, and other fields. Many Lone Star ® models use Monte Carlo simulation methods.
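
A small sketch with invented numbers: two sequential project tasks have uncertain durations, and a Monte Carlo simulation estimates the chance of finishing within 30 days, something the point estimate (12 + 15 = 27 days, “easily on time”) cannot convey:

```python
import random

random.seed(5)
trials = 100_000

# Each trial draws uncertain durations for two sequential tasks.
on_time = sum(
    random.gauss(12, 3) + random.gauss(15, 4) <= 30
    for _ in range(trials)
)
print(on_time / trials)  # roughly a 1-in-4 chance of being late
```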

The Monty Hall problem is a well-known puzzle based on a television game show hosted by Monty Hall. It deals with risk and uncertainty. Humans tend to choose their strategy incorrectly when given this problem. Work by Walter Herbranson shows pigeons do better than people. Herbranson’s work confirms that humans tend to fit information into the pre-conceived and biased information frameworks described by Kahneman and Tversky, which form the basis for Prospect Theory.

(See Prospect Theory)

A term with more than one meaning. Generally, in data science, it refers to making data usable. This assumes, of course, data exists (see the Fourth Great Lie). In computer science, it can refer to irreversible changes in a file, program, or data. Since true-believing big data zealots cringe at the thought of throwing anything away, irreversibility is a less frequent implication at the moment.

A term referring to a class of algorithms which measure the “distance” between items. Usually the items are the data defining something in a database. For example, in Sabermetrics, baseball players can be measured in many ways. Each measurement is one dimension of comparison. The difference in one dimension, such as base-running speed, describes the “distance” in that measure. Combining all the measures available describes the distance between the players. This method often allows us to group objects in ways that are otherwise not obvious.
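
A sketch with invented player statistics: combining several measured dimensions into one Euclidean distance shows which players group together:

```python
import math

# Hypothetical player stats: (speed, power, fielding) on comparable scales.
players = {
    "A": (8.0, 3.0, 7.0),
    "B": (7.5, 3.5, 6.5),
    "C": (2.0, 9.0, 4.0),
}

def distance(p, q):
    """Euclidean distance across all measured dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# B is far "nearer" to A than C is, so A and B group together.
print(round(distance(players["A"], players["B"]), 2))  # → 0.87
print(round(distance(players["A"], players["C"]), 2))  # → 9.0
```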

(See Sabermetrics)

A concept promoted by G.E.P. Box. Box warned that “needless elaboration” does not improve, and often obscures, our understanding of the problem we analyze. For example, the average value of a six-sided die is 3.5. To test this experimentally, we might roll 10 or 100 dice to gain confidence, but at some point we just don’t need to roll more. There is no need to roll 100,000 or a million.

A term with multiple meanings. Analysis uses “artificial neural networks” or ANNs, which are distinct from biological systems, though biology was the inspiration. Today biological systems are better understood, so the reference is more of an analogy.

ANNs are a form of AI and of machine learning, though not all AI (or machine learning) uses ANNs.

Some ANNs seem to be mathematically equivalent to other algorithms. For example, Markov Decision Processes (MDPs) pre-date ANNs, but ANNs can use MDPs.

There are many types of ANNs, including:

- Feedforward neural network (FNN)
- Recurrent neural network (RNN)
- Probabilistic neural network (PNN)
- Time delay neural network (TDNN)
- Regulatory feedback network (RFNN)
- Convolutional neural network (CNN)
- Associative neural network (ASNN)

(See Artificial Intelligence, Machine Learning, Markov Decision Processes)

In reliability, this refers to the probability of conformance, or reliability. A “two nines” system will conform over 99% of the specified time spans. For example, if 100 cars are rented for one-day rentals with a requirement of 99% reliability, we are specifying that at most one breakdown is allowed. The exact details of specifications of this type can be very complex. “Five nines” systems are 99.999% reliable, a benchmark of very high reliability in some industries.
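
The arithmetic behind each “nines” level is simple; this converts reliability into allowed downtime per (non-leap) year:

```python
# Allowed downtime per year implied by each "nines" reliability level.
for nines in (2, 3, 4, 5):
    availability = 1 - 10 ** -nines            # e.g. 0.99 for "two nines"
    downtime_hours = (1 - availability) * 365 * 24
    print(f"{nines} nines: {downtime_hours:.2f} hours/year")
```

Five nines allows only about five minutes of downtime in a whole year.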

Noise means somewhat different things in different domains. In mathematical terms, the definition is usually related to the signal processing meaning, which is the concept of signal (or data) other than information being processed. It is often illustrated by different types of interference including audio noise, which can be a residual low level “hiss or hum,” video noise which can be seen as different kinds of image degradation and snow in moving images, or the image noise in digital photography.

The mathematics of noise owes a great deal to John B. Johnson and Harry Nyquist. Working at Bell Labs in 1926, Johnson made the first measurements of noise. Nyquist formulated a mathematical understanding of Johnson’s work, and together they published their findings in 1928.

Today, the concept is applied to any kind of natural or man-made interference, jamming, or signal degradation. And, it has been applied to a wide range of topics. For example, the concept of noise applies to a range of business analysis, risk analysis, and market analysis applications. Fischer Black (co-author of the Black-Scholes equation for option pricing) wrote an important essay in 1986, titled “Noise,” related to financial markets and trading.

Most data is noisy. John Volpi is fond of saying, “there is a lot of information in noisy data”; it’s often a mistake to think that noisy data is “bad.”

(See Filter, Fit, Overfit)

A real-time voting technique developed by Lone Star ® and used to quickly separate the Significant Few from the Trivial Many. It is especially useful when a substantial list of items is under consideration.

The normal distribution is also called the “Bell Curve” or “Gaussian distribution,” although there are other “bell curves” which are not Gaussian. Many disciplines use the normal distribution, assuming most things are Gaussian unless there is evidence otherwise. However, real-world examples of normal distributions are less common than some claim.

The normal distribution is important to probabilistic mathematics because of the central limit theorem. In many cases, sums of independent random variables have a distribution close to the normal, and the solutions to some other mathematics problems are also the normal distribution.
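
A quick demonstration: single die rolls are uniformly distributed, but averages of 50 rolls pile up in a bell shape around 3.5, as the central limit theorem predicts:

```python
import numpy as np

# 100,000 experiments, each averaging 50 rolls of a fair die.
rng = np.random.default_rng(6)
averages = rng.integers(1, 7, (100_000, 50)).mean(axis=1)

print(round(averages.mean(), 2))  # near 3.5, the die's true mean
print(round(averages.std(), 2))   # near 1.71 / sqrt(50) ≈ 0.24
```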

Normal distributions are sometimes assumed to apply because they are easy to manipulate in probability equations. Such assumptions can be flawed, leading to a belief that something possible (or even probable) is really impossible. When these “impossible” things happen, they are called “Black Swans.”

Misapplication of normal distributions is not the only cause of Black Swans making surprise appearances, but this is one reason they appear.

(See Central Limit Theorem, Distribution, t Distribution)

Stands for Net Present Value, the estimate of the current worth of future cash flows. NPV is often taught as a “factual” basis for business analysis, but it is nearly always based on point estimates of several future conditions which are, in fact, uncertain. NPV can be useful for comparing alternatives, but it can be difficult to determine whether the underlying assumptions are comparably accurate. Easy-to-use NPV formulas in spreadsheets have increased the popularity, and the misuse, of NPV. Measures like NPV are important for large, fixed-cost capital projects and apply to oil and gas, utilities, and networking projects.
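
The basic calculation is straightforward, which is part of why NPV is so easy to misuse; every number below is a point estimate:

```python
def npv(rate, cash_flows):
    """cash_flows[0] occurs today, cash_flows[1] in one year, and so on."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Invest 100 today; receive 50 at the end of each of the next three years,
# discounted at 10% per year.
print(round(npv(0.10, [-100, 50, 50, 50]), 2))  # → 24.34
```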

(See Prospect Theory, Flaw of Averages, Utility Theory)

In statistical hypothesis testing, we attempt to disprove the null hypothesis by using available data. In probabilistic settings, we can never truly “prove” something is true, but we can often prove the alternative seems to be, most likely, false.

(See Falsification)

Harry Nyquist worked for AT&T doing R&D and joined Bell Labs when it was created. He made important contributions to understanding the relationship between bandwidth, channel capacity, and noise. He laid the groundwork for Claude Shannon’s later creation of information theory, and he authored a classic paper on closed-loop control systems which is still the standard model more than 80 years later. A number of things are named for Nyquist:

- **Nyquist Criterion** – Defines conditions which help ensure robust communications channel integrity.
- **Nyquist Frequency** – A sampling frequency used in signal processing and analysis of things which vary over time. It is half the sampling rate of the system, and defines how the system will misbehave with naughtiness like aliasing.
- **Nyquist Plot** – A graphical representation of the response of a system across frequency.
- **Nyquist Rate** – Twice the bandwidth of a finite-bandwidth channel or function (please do not confuse this with the Nyquist frequency); this is the minimum sampling rate which meets the Nyquist Criterion.
- **Nyquist-Shannon Sampling Theorem** – Defines the sample rate needed to capture all the information in a continuous (analog) signal. This is the foundation of understanding the connection between the analog world and the digital domain. (Ignore it at your very great peril.)
- **Nyquist Stability Criterion** – A simple but powerful test for stability in feedback systems.

(See Information Theory, Shannon, Volpi’s Rule of Decision Making, Channel Capacity)

A way to express probability. 5:3 (“five to three”) means that, on average, if we saw eight events, five would come out one way and three would come out the other. Humans tend to understand odds ratios, or expressions like “five times out of eight,” better than percentage expressions of probability like “62.5% probable.”
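
The conversion between odds and probability is a one-liner:

```python
# 5:3 odds in favor, expressed as a probability.
wins, losses = 5, 3
probability = wins / (wins + losses)
print(probability)  # → 0.625, i.e. "five times out of eight"
```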

Online Analytical Processing.

A term familiar to Texans, but obscure to some other people. It has to do with highly competent, highly empowered people, which we embrace at Lone Star ®. For the full story, put the term in a search engine – it’s worth reading.

Operations research (OR) is also called, or is closely related to, operational research, operations analysis, decision science, management science, and other related activities and terms. It is a systematic approach to solving problems which uses one or more analytical tools to improve decisions. OR often seeks the largest/best outcome, or the smallest/least outcome. The origins of OR are debated, but are usually assigned to military efforts before or during World War II. Today it applies to a wide range of markets and industries including transportation, defense, natural resources, energy, telecommunications, networks, social media, supply chains, logistics, and marketing, among others.

(See INFORMS, optimization)

Optimization is a term with multiple meanings. In mathematical optimization, economics, game theory and other quantitative fields, the meanings are related, and roughly the same. In these fields, optimization means finding the best of some set of available alternatives, based on some criteria defining “best” and frequently including one or more defined limiting factors. The set of alternatives may be a discrete list of choices, a complex, multi-dimensional space, or a topological surface.

Optimization is often associated with the term “linear programming,” which refers to optimization and dates to an era when “programming” did not refer to computers. Like many methods of the day, linear representations (linear models) were used to simplify the methods to be compatible with slide rules and other pre-computer methods.

Many of the great minds in mathematics and science have explored optimization, including Newton, Fermat, Gauss, and others. Important 20th century discoveries were made by von Neumann, Dantzig and Kantorovich during and shortly after World War II. Much of the early support and semantics for “linear programming” came from U.S. military logistics problems and schedules.

A number of advances emerged after the mid-20th century, including the Bellman equation and other methods supporting optimization of dynamic and non-linear systems. Lone Star ® methods support optimization of large, non-linear models.

(See Linear Models, Curse of Dimensionality, Operations Research)

Overfitting occurs when a model has so many details it becomes less useful than a simpler model. Usually this occurs when there is an attempt to fit both the noise and the data. Since noise is random, we expect the next batch of data samples to have different noise, so the old fit will not work.
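
A sketch of the effect: fit noisy samples of a straight line with a modest and an extravagant polynomial, then score both against fresh data drawn from the same process. The high-degree fit chases the training noise, so it usually does worse on the new batch:

```python
import warnings
import numpy as np

warnings.simplefilter("ignore")  # high-degree polyfit warns about conditioning

# Two batches of noisy samples from the same underlying line y = 2x.
rng = np.random.default_rng(7)
x = np.linspace(0, 1, 20)
y_train = 2 * x + rng.normal(0, 0.3, x.size)
y_test = 2 * x + rng.normal(0, 0.3, x.size)  # fresh noise, same process

errors = {}
for degree in (1, 15):
    coeffs = np.polyfit(x, y_train, degree)  # fit the training batch
    errors[degree] = np.mean((np.polyval(coeffs, x) - y_test) ** 2)
    print(degree, errors[degree])            # error on the fresh batch
```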

(See Fit, Noise)

A visual representation of Michael Porter’s “Five Forces” model. Lone Star ® uses a “Nine Forces” model, which better represents the complexity of the 21st century, while preserving Porter’s groundbreaking ideas. Porter Plots are a visual representation of qualitative business analysis.

See Modern Portfolio Theory.

Predictive Analysis takes many forms, all of which provide some prediction about what might happen. Best practice is to also estimate the probability related to the prediction. Weather forecasts (“there is a 40% chance of rain this afternoon”) are one type of forecast. Some big data algorithms are (or claim to be) predictive. Most Lone Star ® work in EDA is Predictive Analysis. (See Enhanced Decision Analysis)

Prescriptive Analysis is relatively rare. It is an automated model (computer simulation) which prescribes actions based on the analysis of complex data. Prescriptive analysis is related to control theory but deals with much more complex Big Data and IOT data streams. Lone Star ® is one of the few companies capable of Prescriptive Analysis.

(See Big Data, Control System, IoT)

The estimated price to win a competition or auction, related to Should Cost, Could Cost, Game Theory and Prospect Theory. It applies to competitions among sellers (major Federal contracts) and to competitions among buyers (spectrum auctions, oil and gas leases, Merger & Acquisition auctions).

Prospect theory has roots in mathematical psychology, economics and mathematics. It is particularly useful in pricing and competitive bids. It is a framework, or collection of observations and phenomena describing how people make decisions when facing uncertain “prospects” such as risk, gains, and losses. It refutes NPV, expected value, and Utility Theory which all suggest a fully “rational” basis for choosing a final outcome.

Prospect Theory describes how humans evaluate the “prospect” of gains or losses. It also lays out the basic rules humans tend to use, which are described as “heuristics.” These characteristics make Prospect Theory useful for Risk Analysis and Risk Modeling. It also provides more accuracy and power to the use of Game Theory.

(See NPV, Game Theory, Utility Theory)

A qualitative variable has values which are adjectives, such as colors, genders, or nationalities, and is related to the idea of categorical variables. A blue pixie living in zip code 22001 would be categorized by the zip code, but might also be categorized as a blue person, and as a pixie. For most purposes there is no difference between “qualitative” and “categorical” variables. For Big Data problems we expect to find zero difference between these ideas in most cases. The real challenge with qualitative variables is the urge to use adjectives when we really need a number. Saying “the risk is high” is NOT the same as saying “there is a 42% chance of going broke.”

Mathematical treatment of ordered or gated events, such as waiting in line, or waiting for a factory process to cycle from one activity to another. Generally credited to the work of Erlang. Related to the “theory of constraints.”

See Data Reduction.

Regression, or Regression Analysis attempts to find (or “fit”) an equation to describe the relationship between a dependent variable and one or more independent variables. Different disciplines tend to use different semantics to describe regression, and different regression methods apply to different types of problems and data sets. Regression is sometimes automated as an element of machine learning and “big data.”
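
A minimal regression sketch (with invented data): fit y = b0 + b1·x1 + b2·x2 by ordinary least squares:

```python
import numpy as np

# Synthetic data generated from y = 3 + 1.5*x1 - 0.5*x2 plus noise.
rng = np.random.default_rng(8)
n = 200
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 10, n)
y = 3 + 1.5 * x1 - 0.5 * x2 + rng.normal(0, 1, n)

# Design matrix: a column of ones (the intercept) plus the two predictors.
X = np.column_stack([np.ones(n), x1, x2])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coeffs)  # close to [3, 1.5, -0.5]
```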

(See correlation, big data, fit, filtering)

Doing what is supposed to happen. If your computer turns on every time you press the power key, it is reliable. If it randomly shuts down every 2 hours, then it is not.

A concept with different meanings in different contexts. Generally, risk means some possibilities, which are uncertain, are undesirable. Douglas Hubbard makes the distinction between uncertainty and risk; uncertainty means more than one possibility exists, while risk means that some of these possibilities are bad.

- **Big Data** – The concepts of risk in big data analytics are still emerging, chiefly the risk of drawing faulty conclusions from what seem to be compelling correlations. Standardized confidence measures are not yet reliable or widely agreed upon.
- **Investments** – The possibility of returns different than expected, including potential loss.
- **Insurance** – Specified contingencies which may not occur.
- **Oil & Gas** – Beyond the general corporate meaning, energy firms deal with political risk and geological risks. Geological risks are defined in probabilistic terms, which attempt to standardize the meaning of terms like “Economic Ultimate Recovery.”
- **Corporate Legal** (Sarbanes-Oxley) – The uncertainties which might lead to material non-performance, non-compliance, or loss to investors.
- **General Legal** – The definitions and theory of legal risk are complex, including some wonderful Latin: *Ubi periculum, ibi et lucrum collocatur* (he who risks a thing should receive the profit from it); *Cujus est dominium ejus est periculum* (he who has ownership should bear the risk).

The effort to understand risk is ancient, as shown by the Latin expressions, but humans are just not good at this.

(See Prospect Theory, Monty Hall Problem, Uncertainty)

Refers to resiliency. Essentially, it is how well a system can keep doing roughly what it is supposed to do when it is under stress or pressure, or when conditions are not “nominal.” A great number of analytics solutions are correct 99% of the time when nothing interesting is happening, but when the 1% occurs they crash. This is not robust.

Sabermetrics refers to the quantitative analysis of baseball. The term was coined by Bill James, derived from SABR, the Society for American Baseball Research. It was the basis of the movie *Moneyball*, about Billy Beane and the Oakland Athletics.

A family of mathematicians, including Leonard Jimmie Savage, Sam Savage, and Richard Savage. The family has been called “statistical royalty,” but is marked by self-deprecation. Leonard preferred being called “Jimmie.” Richard once claimed his duties as Chair of Yale’s Statistics Department were “None.” Sam claims he became a professor after concluding he could not support himself as a musician.

Jimmie Savage was a polymath, and influenced an impressive array of mid-20th century thinkers, including von Neumann, Friedman, Samuelson, and others. Like Box, he helped change the term “Bayesian” from an insult to a compliment. His 1954 book, *The Foundations of Statistics*, is still considered a classic. The Savage Award for Bayesian dissertations is named for him.

Richard (I. Richard), Jimmie’s brother, made an impact at a number of important universities, served as President of the American Statistical Association, and pushed for the use of statistical understanding in public policy, including AIDS diffusion, DNA fingerprinting, human rights, and national defense.

Sam, Jimmie’s son, is the founder of Probability Management and author of *The Flaw of Averages*, perhaps the most approachable book ever written on probability and uncertainty. He has also made contributions to simulation, computer science, and mathematics education, and is the inventor of the SIP, an open data structure for conveying uncertainty.

(See Bayes, Box, SIP)

A “what if” that a model or simulation can address. A scenario might be “What if it rains?” If your mental model is about a party, the answer might be “move it inside.” But if your model is about a baseball game, the answers are more complicated. One or more scenarios are a way to specify an analysis task, whether using a complex simulation or having a conversation.

A process of presenting detailed future scenarios to explore what issues and questions need to be dealt with. SBP is often associated with government policies, defense planning, and business analysis for strategy.

SBP tends to suggest needed strategic alternatives, and is usually considered a strategy tool; it is often used in facilitated group sessions. Lone Star ® uses proprietary methods to design meaningful scenarios used in SBP.

Claude Shannon is indisputably the father of information theory. He was recognized for his innovation and brilliance before he joined Bell Labs, where he would work with Harry Nyquist, Hendrik Bode, and Alan Turing. This was the genesis of modern control systems, modern cryptography, modern signal processing, modern communication theory, and the dawn of the digital age.

In 1948, all of this converged in a two-part article published in the July and October issues of the Bell Systems Technical Journal: “A Mathematical Theory of Communication.” The article had impact across a range of sciences, including the so-called soft science of psychology. Shannon quantified information as “entropy,” and showed how noise limits the information which can be transmitted.

Demand to better understand this foundational work soon led to a book with an important change. The book’s title was “THE,” not “A,” Mathematical Theory… and everyone understood this was not an egotistical change. Shannon had defined THE mathematics of information.

Because much of his work was classified, Shannon is not as well-known as other, lesser lights. He has been called “The Greatest Genius No One Has Heard Of.” That being said, it is not clear he would have cared. Shannon spent his life working on things he was interested in, and having fun with his friends, his family, and his wife.

(See Nyquist, Information Theory, channel capacity)

A concept in price forecasting: the price (or cost) something “should cost,” based on historical precedents.

(See Could Cost)

A simulation is an automated model. In a sense, any model which runs in a computer is a simulation, but some make the distinction that simulations should map cause and effect. Such cause and effect simulations are usually based on “mechanistic models.”

Simulations can include scientific simulations, engineering simulations, signal processing simulations, conflict simulation, business simulation, and many others. These simulations can be useful for business analysis, risk modeling, system modeling, and enterprise simulations.

(See model, Box, EDA, Enterprise Simulation)

Stochastic Information Packet. A standard for sharing probabilistic information. The SIP standard is managed by ProbabilityManagement.org, a non-profit which hosts the standard committee.
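In essence, a SIP stores an uncertainty as an array of simulated trials, so that arithmetic on uncertainties can be done trial by trial. This sketch (our example, not part of the standard itself) shows why that matters, using Sam Savage’s “Flaw of Averages”:

```python
# A SIP is an array of trials. Doing arithmetic trial-by-trial preserves
# uncertainty; averaging first throws it away.
import random

random.seed(0)
trials = 10_000
sip_a = [random.uniform(0, 10) for _ in range(trials)]  # uncertain task A
sip_b = [random.uniform(0, 10) for _ in range(trials)]  # uncertain task B

# Correct: take the max in each trial, then average.
mean_of_max = sum(max(a, b) for a, b in zip(sip_a, sip_b)) / trials

# Flawed: average each input first, then take the max.
max_of_means = max(sum(sip_a) / trials, sum(sip_b) / trials)
```

The trial-by-trial answer is meaningfully larger than the average-first answer, which is exactly the planning error SIPs are designed to prevent.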

Stochastic Multi-Objective Decision Analysis. A method to model complex decisions in which there is uncertainty about the importance of the objectives, uncertainty about achieving objectives, or both. SMODA is particularly valuable in Game Theory applications when imperfect information exists about the decision criteria or other characteristics of the various players.

SMODA allows a competitor to make better decisions in spite of uncertainty and imperfect information about other competitors, and the referee, if the contest has one.

SMODA is a method pioneered by Lone Star ® for clients facing complex economic competitions.

Systems of Systems.

Lone Star ® methods support modeling large, complex, systems of systems. SoS problems are an example of large scale representations which require EDA to address, and which could be difficult using earlier methods.

(See Enhanced Decision Analysis)

Spectral Analysis assesses frequency components of what may seem to be a random process or time series, and can be based on a wide range of data types. It applies to a range of technical and business applications in telecom, oil and gas, retail, signal processing, and other fields.
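A minimal sketch using NumPy’s FFT (our tool choice; the entry names none): recovering a 5 Hz component buried in noise.

```python
# Spectral analysis sketch: find the dominant frequency in a noisy series.
import numpy as np

rng = np.random.default_rng(1)
fs = 100.0                            # sample rate, Hz
t = np.arange(0, 10, 1 / fs)          # 10 seconds of samples
# A 5 Hz sine wave hidden under unit-variance noise.
signal = np.sin(2 * np.pi * 5.0 * t) + rng.normal(0, 1.0, t.size)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(t.size, d=1 / fs)
peak_hz = freqs[np.argmax(spectrum[1:]) + 1]   # skip the DC bin
```

The spectrum concentrates the sine wave’s energy in one bin, so the 5 Hz component stands far above the noise floor even though it is invisible in the raw time series.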

(See FFT)

A simplification: cows have complicated shapes. However, it may be okay to assume a cow is just a big ball of meat and bone for some analysis purposes. Or it might be a disaster to make such a simplification.

Good analysis requires using spherical cows at times, and avoiding them at others. G.E.P. Box taught that models should be as complex as necessary but not needlessly elaborate. John Volpi has some wonderful stories about spherical cows.

(See Box)

Stochastic Optimization refers to maximizing or minimizing a mathematical output when one or more variables have uncertain conditions. Most early optimizers required deterministic mathematics. Non-linear stochastic optimization is difficult; Lone Star ® is one of the few organizations that can deliver it.
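One simple approach (our illustration, using sample-average approximation; the entry names no method) is to optimize the average outcome over many simulated trials. Here, a newsvendor-style decision: pick an order quantity under uncertain demand.

```python
# Stochastic optimization sketch: choose the order quantity q that
# maximizes expected profit, averaged over simulated demand trials.
import random

random.seed(3)
demand = [random.gauss(100, 20) for _ in range(5_000)]  # uncertain demand
price, cost = 10.0, 4.0

def expected_profit(q):
    # Sell the lesser of stock and demand; pay for everything ordered.
    return sum(price * min(q, d) - cost * q for d in demand) / len(demand)

best_q = max(range(50, 151), key=expected_profit)
```

Note that `best_q` lands above the average demand of 100: with a healthy margin, the cost of stocking out exceeds the cost of overstock, something a deterministic optimizer run on average demand would miss.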

(See Richard Bellman, Optimization)

The selection of data which might not be meaningful. Also known as the streetlight bias, or the streetlight fallacy. A term with multiple meanings, it can refer to a number of data biases in social science, cognition, analysis, statistics, data science, and IoT. For example, in time and motion studies, we must look at BOTH how long a task takes and how long preparation for the task took.

It takes 10 minutes to properly slice the classic Texas BBQ (Beef Brisket, of course). But it takes nearly an hour to prep the meat and the fire. It can take 16 hours on the smoke, and the meat must rest about an hour before slicing. The 10 minutes of slicing which we can see in a serving line tells us nothing about the real time needed to create the classic Texas meat candy: often more than 18 hours.

The term refers to an old joke: a policeman sees a drunk on his knees under a streetlight, looking for something, and asks what he’s looking for. “I lost my keys,” the drunk says. “Where?” the cop asks. “A couple of blocks away,” the drunk replies. “Why look here?” asks the cop. “The light is better!”

The term is roughly the same idea as the “spotlight fallacy” and the “caveman effect.”

We call prehistoric humans “cavemen” because the easy information about them has been preserved on the walls and floors of caves. We really don’t have a good idea how many of them actually lived in caves, but their preserved history in caves is the most accessible to us. Thus, these effects can be forms of Convenience Sampling.

The streetlight effect is related to cognitive biases. The analyst or data scientist can fall prey to their own assumptions and prejudices, leading to flawed data selection; the flawed human creates a flawed model. Kahneman’s WYSIATI is an example. It can also be related to collecting data at the wrong intervals and losing information; some government reports are issued once a year, which may distort or hide time variant information.

Properly designed IoT systems avoid the streetlight effect, by collecting the right data.

(See Cognitive Bias, Convenience Sample, Nyquist, WYSIATI)

(or “Student’s T”) A widely quoted, and perhaps the most widely misunderstood, continuous statistical distribution. It is useful for testing and understanding the outcome of statistical findings. In particular, the “T-Test” is a measure of whether an outcome might have been due to chance, rather than a pattern or cause.

T-distribution is specified by just one parameter – the number of degrees of freedom.

W. S. Gosset first described it. Gosset understood what many people today seem to miss: it is an approximation (and just an approximation) of the distribution of the means of randomly drawn samples from a fixed population, and it is not Gaussian. He used the name “Student” in his 1908 publication, leading to the name of the distribution. It is one of many distributions we may need less often, given today’s rich data and computers.
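A minimal one-sample t-test sketch, using only the Python standard library (our example; the critical value 2.262 is the standard two-sided 5% cutoff for 9 degrees of freedom): does this sample’s mean differ from 5.0, or might the difference be due to chance?

```python
# One-sample t-test: is the sample mean far from 5.0, relative to the
# sampling variability we would expect by chance?
import statistics

sample = [5.1, 4.9, 5.3, 5.2, 4.8, 5.4, 5.0, 5.2, 5.1, 5.3]
n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)               # sample standard deviation
t_stat = (mean - 5.0) / (sd / n ** 0.5)     # n - 1 = 9 degrees of freedom

# Two-sided 5% critical value for 9 degrees of freedom is about 2.262.
significant = abs(t_stat) > 2.262
```

Here the statistic comes out near 2.18, just short of the cutoff, so despite a sample mean above 5.0 we cannot rule out chance, exactly the kind of judgment the T-Test exists to support.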

A very simple concept with several meanings in different contexts. All Monte Carlo methods deal with uncertainty. Most big data analytics cope with uncertainty. Some examples of different meanings:

- In most contexts; the condition or state of being indefinite, or indeterminate (this is what Lone Star ® means).
- Some use a very narrow definition, restricted to things which cannot be measured (this is NOT what Lone Star ® means).
- In weather forecasting or oil exploration; having a probabilistic assessment about a future event, like next week’s weather, or the results of drilling a well next year (future uncertainty).
- In historical assessments; a wide range of specific meanings deal with uncertainty about the past, and may include the cause and form of uncertainties in past spatial, temporal, and other attributes.
- In a legal context; an additional meaning can include “dubious” or having serious doubt.

An important theorem in Artificial Intelligence and Neural Networks: a feed-forward network with at least one hidden layer and a finite number of neurons can approximate any continuous function on a closed, bounded domain, provided the activation function is

- Continuous
- Non-constant
- Bounded
- Monotonically Increasing

At Lone Star ®, we call these “The 4 Conditions” or T4C. A first version of this was proved by George Cybenko in 1989 for sigmoid activation functions.

This is the foundation of much thinking and optimism for AI.
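The theorem can be seen in miniature with one hidden layer of sigmoid neurons approximating sin(x). (The training shortcut here, random hidden weights plus a least-squares output layer, is our illustration choice, not part of the theorem.)

```python
# One hidden sigmoid layer approximating sin(x) on [0, 3].
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 3, 200).reshape(-1, 1)
y = np.sin(x).ravel()

n_hidden = 50
W = rng.normal(0, 2, size=(1, n_hidden))    # random input weights
b = rng.normal(0, 2, size=n_hidden)         # random biases
H = 1 / (1 + np.exp(-(x @ W + b)))          # sigmoid hidden activations

# Fit only the output layer, by least squares.
out, *_ = np.linalg.lstsq(H, y, rcond=None)
max_err = np.max(np.abs(H @ out - y))
```

Even this crude construction drives the worst-case error on the sample points to a small value, which is the practical content of the theorem: enough sigmoid neurons can fit any continuous shape.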

Utility theory applies to several disciplines, notably finance and economics. It attempts to explain how value is attributed to an object, right, or other asset. It observes the non-linear relationship in changes of attributed value as the asset changes characteristics. It was published in 1738 by Daniel Bernoulli, seeking to explain how the marginal desirability of money decreases as more is gained.

Utility Theory is often applied to pricing studies and economic studies. The result is often a measure of “indifference.” Indifference is the point, or boundary curve, between equally desirable choices. These are defined by utility functions, which are usually related to cost, value, and risk measures. In particular, Utility Theory can be useful when combined with Game Theory.

Prospect Theory shows that Utility Theory ignores many human behaviors and limitations, and as such Utility Theory is an incomplete description of human choices and decision making. In some markets, such as Oil and Gas, rational behavior can make Utility Theory useful, in spite of its limitations elsewhere.
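Bernoulli’s 1738 idea of diminishing marginal desirability can be sketched with a logarithmic utility function (the gamble amounts are our example):

```python
# Log utility: a 50/50 gamble between $100 and $10,000 is "worth" less
# than its expected value to a decision maker with diminishing returns.
import math

outcomes = [100.0, 10_000.0]
expected_value = sum(outcomes) / 2                        # $5,050
expected_utility = sum(math.log(x) for x in outcomes) / 2
# The certain amount with the same utility as the gamble.
certainty_equivalent = math.exp(expected_utility)
```

The certainty equivalent works out to $1,000, far below the $5,050 expected value: the gap is risk aversion, expressed as curvature in the utility function.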

(See Prospect Theory, Game Theory)

In any market, the participant with the third largest market share is the best place to look for the innovation which will eventually win out, even though the innovating company may not succeed (the rule of three varies with author and source; John Volpi never claimed this was an original insight, but his version of it is very colorful).

If the mean time between reorganization is less than the mean time to make a major decision, your group will probably never make major decisions (see Nyquist)

If you have to ask, “is it too late” then most likely it is far, far too late.

Never bet against a technology with 100 Million times more volume than yours.

What you see is all there is. A concept which explains a number of our cognitive limitations, popularized by Daniel Kahneman, who won a Nobel Prize for Prospect Theory.

(See Availability Cognitive Bias, Prospect Theory)

*Zero Jerk Factor*™. This is a term we use at Lone Star ® to describe a work environment where people collaborate as an empowered team, where each member is respected.