##### Machine Learning

Machine learning means different things in different domains. At Lone Star ®, we mean any form of analytics based on using computers to “infer” or “learn” from data sets.

The specific form we use most often is “supervised machine learning,” in part because unsupervised machines tend to get confused about cause and effect. This cause-effect confusion applies to both “big data” and other data-dominated methods.
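
As a minimal sketch of the supervised flavor (illustrative only, not any particular Lone Star tool), the fragment below “learns” a straight-line rule from labeled examples and then infers an answer for an input it has never seen. The numbers are invented.

```python
# Supervised learning in miniature: "learn" a rule from labeled examples.
# Training data (invented): maintenance hours (input) vs. failures/year (label).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [9.8, 8.1, 6.2, 3.9, 2.1]

# The "learning" step: fit y = a + b*x by ordinary least squares.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx

# The "inference" step: predict a label for an unseen input.
print(f"predicted failures at 3.5 hours: {a + b * 3.5:.2f}")
```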

(See Artificial Intelligence, Curse of Dimensionality, Neural Net)

##### Markov Decision Process

Markov Decision Processes (MDPs) are used to model decisions which are the result of both random, uncontrolled causes (like weather) and choices made by one or more decision makers. MDPs are named for the Markovs, a family of mathematicians, and in particular for Andrey Markov. MDPs are related to a number of analysis and optimization topics, including dynamic programming, learning, robotics, automated control, economics, manufacturing, and other decision processes.

Not all MDP “decisions” are “decisions” in the common meaning of the word. For example, the “Cart-Pole” process is a well-known control system problem involving balance on a moving platform. It can be solved with MDP mathematics, but may not seem to be a “decision.” Other problems are more obviously “decisions,” such as a queueing theory problem called the “Cashier’s Nightmare.”
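
A minimal sketch of how an MDP can be solved numerically, using value iteration (a dynamic programming method associated with Bellman, discussed below). The states, transition probabilities, and rewards are invented for illustration.

```python
# A toy Markov Decision Process solved by value iteration.
# States: "ok", "broken". Actions: "run", "repair". All numbers invented.
# P[state][action] = list of (probability, next_state, reward) triples.
P = {
    "ok": {
        "run":    [(0.9, "ok", 10.0), (0.1, "broken", 0.0)],
        "repair": [(1.0, "ok", 5.0)],
    },
    "broken": {
        "run":    [(1.0, "broken", 0.0)],
        "repair": [(0.8, "ok", -2.0), (0.2, "broken", -2.0)],
    },
}
gamma = 0.9  # discount factor: how much future reward matters

V = {s: 0.0 for s in P}  # estimated long-run value of each state
for _ in range(200):     # repeated Bellman backups until (near) convergence
    V = {s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in P[s])
         for s in P}

print({s: round(v, 1) for s, v in V.items()})
```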

Bellman and Howard both made important contributions to MDPs during the mid-20th century. The mathematics of Decision Analysis is related to, though different from, that of MDPs. These relationships between MDPs and DA presage the expansion of DA to EDA, and the broader set of problems Lone Star can address with our EDA toolset. Some of the topics our EDA tools can address may not always seem to be “decisions” either.

(See Bellman, Howard, Decision Analysis, Enhanced Decision Analysis)

##### Markowitz

Harry Markowitz is a polymath who has made contributions in economics, simulation, analysis, and probability. He is best known for Modern Portfolio Theory, for which he won a Nobel Prize.

(see Modern Portfolio Theory)

##### Mean

The average value of a set: the sum of the values divided by the number of values. In the set {1, 1, 1, 3, 4, 5, 13}, 4 is the mean.

##### Median

The “middle number” of an ordered set. Roughly speaking, it is the value which is greater than half the numbers in the set and smaller than the other half. Probabilistically, a number drawn from the set has an equal chance of falling above or below it. In the set {1, 1, 1, 3, 4, 5, 13}, 3 is the median, because it is in the middle. The median is often used in place of the mean to describe the central tendency of a data set, since it is not affected by extreme outliers.

##### Mode

The most likely value drawn from a probabilistic set. In the set {1, 1, 1, 3, 4, 5, 13}, 1 is the mode, because it is the most likely outcome.
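
The three measures can disagree for the same data, as a quick check of the example set with Python’s standard library shows:

```python
import statistics

data = [1, 1, 1, 3, 4, 5, 13]
print(statistics.mean(data))    # 4 -- the average
print(statistics.median(data))  # 3 -- the middle of the sorted set
print(statistics.mode(data))    # 1 -- the most common value
```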

##### Model

A model is an abstraction that represents something else. A model captures an understanding of a thing’s characteristics, separated from ultimate realities, or actual objects. At Lone Star ®, we are interested in mathematical models. A mathematical model requires a knowledge framework with numbers as part of the representation.

Models are found in a range of applications, including business modeling, risk modeling, and decision analysis, to name a few.

Some models observe relationships and correlation without describing cause and effect. These models, often built with machine learning, carry a danger: they can reach conclusions like “Cancer causes smoking.” Other models focus on cause and effect; Box called these “mechanistic” models. Mechanistic models require a model architecture based on knowledge about the relationships of the things represented in the model. The danger in these models is that relationships may be omitted or misstated.

(See Box, Business Case Analysis, Simulation, Machine Learning)

##### Modern Portfolio Theory (MPT)

The Nobel-winning work of Harry Markowitz, and the derivative work which led to several other Nobel prizes. Markowitz simply called it “portfolio theory” and derived the mathematical relationships among risk, return, and portfolio construction.

The basic idea of MPT is that some risks are correlated. If we hold two investments with exactly the same expected return, we have lower net risk if their returns are uncorrelated. The insight that asset performance can be positively correlated, uncorrelated, or negatively correlated forms the basis of portfolio construction.

MPT shows that for a given level of risk tolerance, there is a maximum expected return. Across a two-dimensional space (risk/return) there is a boundary defining feasible portfolios. On that boundary, we have selected the best portfolio we can expect to choose for a given level of risk.
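
A worked sketch of the core insight, for two hypothetical assets with identical expected returns and identical individual risk; only the correlation of their returns changes. The figures are invented.

```python
import math

sigma = 0.20  # standard deviation of each asset's return (invented)
w = 0.5       # a 50/50 split between the two assets

def portfolio_sigma(rho):
    """Std dev of the 50/50 portfolio for a given return correlation rho."""
    var = (w * sigma) ** 2 + ((1 - w) * sigma) ** 2 \
        + 2 * w * (1 - w) * rho * sigma * sigma
    return math.sqrt(var)

for rho in (1.0, 0.0, -1.0):
    print(f"correlation {rho:+.1f} -> portfolio risk {portfolio_sigma(rho):.3f}")
# +1.0 -> 0.200  (no diversification benefit)
#  0.0 -> 0.141  (uncorrelated: risk falls, expected return unchanged)
# -1.0 -> 0.000  (a perfect hedge)
```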

(See Markowitz, Utility Theory)

##### Monte Carlo Simulation

Monte Carlo simulations approximate random or uncertain phenomena. This type of simulation is widely used in Decision Analysis, Econometrics, “practical statistics,” queueing theory, control theory, market analysis, and other fields where many variables are unknown, or unknowable, as fixed, deterministic quantities. Applications include pricing, competitive bids, telecommunications, networking, oil and gas, and other fields. Many Lone Star ® models use Monte Carlo simulation methods.
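
A minimal sketch of the method: an uncertain project cost modeled as the sum of two uncertain line items. The distributions and parameters are invented for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable
trials = 100_000
costs = sorted(
    random.triangular(80, 140, 100)   # labor: (low, high, most likely)
    + random.triangular(30, 90, 45)   # materials
    for _ in range(trials)
)
print(f"mean cost:       {sum(costs) / trials:.1f}")
print(f"90th percentile: {costs[int(0.9 * trials)]:.1f}")
```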

##### Monty Hall Problem

The Monty Hall problem is a well-known puzzle based on a television game show hosted by Mr. Hall. It deals with risk and uncertainty. Humans tend to choose their strategy incorrectly when given this problem. Work by Walter Herbranson shows pigeons do better than people. Herbranson’s work confirms that humans tend to fit information into the pre-conceived and biased information frameworks described by Kahneman and Tversky, which form the basis for Prospect Theory.
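
A short simulation makes the right strategy plain: switching wins about two times in three, because it wins exactly when the first pick was wrong.

```python
import random

random.seed(1)
trials = 100_000
stay = switch = 0
for _ in range(trials):
    car = random.randrange(3)    # door hiding the car
    pick = random.randrange(3)   # contestant's first choice
    # The host then opens a losing door the contestant did not pick.
    # Staying wins only if the first pick was right; switching wins
    # exactly when the first pick was wrong.
    stay += (pick == car)
    switch += (pick != car)
print(f"stay wins:   {stay / trials:.3f}")    # ~0.333
print(f"switch wins: {switch / trials:.3f}")  # ~0.667
```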

(See Prospect Theory)

##### Munging

A term with more than one meaning. In data science, it generally refers to making data usable. This assumes, of course, that data exists (see the Fourth Great Lie). In computer science, it can refer to irreversible changes to a file, program, or data. Since true-believing big data zealots cringe at the thought of throwing anything away, irreversibility is the less frequent implication at the moment.

##### Nearest Neighbor

A term referring to a class of algorithms which measure the “distance” between items. Usually the items are the data defining something in a database. For example, in Sabermetrics, baseball players can be measured in many ways. Each measurement is one dimension of comparison. The difference in one dimension, such as base-running speed, describes the “distance” in that measure. Combining all the available measures describes the distance between the players. This method often allows us to group objects in ways that are otherwise not obvious.
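
A minimal sketch of the idea, with invented player measurements. In real work each dimension would first be rescaled so no single measure dominates the distance.

```python
import math

# Each "player" is a point in a small feature space (values invented).
players = {
    "A": (28.5, 0.340, 0.450),
    "B": (27.0, 0.360, 0.520),
    "C": (30.1, 0.310, 0.380),
}

def distance(p, q):
    """Euclidean distance: one number combining all measured dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

target = (28.0, 0.345, 0.465)
nearest = min(players, key=lambda name: distance(players[name], target))
print(nearest)  # the most similar player by these measures
```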

(See Sabermetrics)

##### Needless Elaboration

A concept promoted by G.E.P. Box. Box warned that “needless elaboration” does not improve, and often obscures, our understanding of the problem we analyze. For example, the average value of a six-sided die is 3.5. To test this experimentally, we might roll 10, or 100, dice to gain more confidence, but at some point we just don’t need to roll more. There is no need to roll 100,000 or a million.
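
A quick experiment makes the point; the estimate settles near 3.5 long before the roll counts get heroic.

```python
import random

random.seed(1)
for n in (10, 100, 10_000, 1_000_000):
    mean = sum(random.randint(1, 6) for _ in range(n)) / n
    print(f"{n:>9} rolls -> estimated mean {mean:.3f}")
# Past a few thousand rolls the answer barely moves;
# the extra million rolls are needless elaboration.
```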

##### Neural Net

A term with multiple meanings. Analysis uses “artificial neural networks” or ANNs, which are distinct from biological systems, though biology was the inspiration. Today biological systems are better understood, so the reference is more of an analogy.

ANNs are a form of AI and of machine learning, though not all AI (or machine learning) uses ANNs.

Some ANNs seem to be mathematically equivalent to other algorithms. For example, Markov Decision Processes (MDPs) pre-date ANNs, but ANNs can make use of MDPs.
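
As a minimal illustration of the feedforward idea, the sketch below pushes inputs through one hidden layer using hand-picked weights that compute XOR; a real ANN would learn its weights from data instead.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x1, x2):
    """A 2-input, 2-hidden-unit, 1-output feedforward pass."""
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # roughly: x1 OR x2
    h2 = sigmoid(20 * x1 + 20 * x2 - 30)    # roughly: x1 AND x2
    return sigmoid(20 * h1 - 20 * h2 - 10)  # OR but not AND = XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", round(forward(a, b)))
```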

There are many types of ANNs, including:

- Feedforward neural network (FNN)
- Recurrent neural network (RNN)
- Probabilistic neural network (PNN)
- Time delay neural network (TDNN)
- Regulatory feedback network (RFNN)
- Convolutional neural network (CNN)
- Associative neural network (ASNN)

(See Artificial Intelligence, Machine Learning, Markov Decision Processes)

##### Nines

In reliability, this refers to the probability of conformance. A “Two Nines” system will conform over 99% of the specified time spans. For example, if 100 cars are rented for one-day rentals with a requirement of 99% reliability, we are specifying that either zero or one breakdown is allowed. The exact details of specifications of this type can be very complex. “Five Nines” systems are 99.999% reliable, a benchmark of very high reliability in some industries.
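
Two quick calculations illustrate the idea: the downtime budget implied by five nines, and, under one simple binomial reading of the rental-car example, the chance of staying within the zero-or-one-breakdown allowance.

```python
import math

# Five nines = 99.999% availability, as a yearly downtime budget.
downtime = (1 - 0.99999) * 365 * 24 * 60
print(f"five nines allows ~{downtime:.1f} minutes of downtime per year")

# Rental-car example: 100 one-day rentals, each car 99% reliable.
# Probability of zero or one breakdown (binomial, n=100, p=0.01):
n, p = 100, 0.01
p_ok = sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in (0, 1))
print(f"P(0 or 1 breakdowns) = {p_ok:.3f}")
```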

##### Noise

Noise means somewhat different things in different domains. In mathematical terms, the definition is usually related to the signal processing meaning: anything in a signal (or data) other than the information being processed. It is often illustrated by different types of interference, including audio noise, which can be a residual low-level “hiss” or “hum”; video noise, which can be seen as various kinds of image degradation and snow in moving images; or image noise in digital photography.

The mathematics of noise owes a great deal to John B. Johnson and Harry Nyquist. Working at Bell Labs in 1926, Johnson made the first measurements of noise. Nyquist formulated a mathematical understanding of Johnson’s work, and both published their findings in 1928.

Today, the concept is applied to any kind of natural or man-made interference, jamming, or signal degradation, and it has been extended to a wide range of topics. For example, the concept of noise applies to a range of business analysis, risk analysis, and market analysis applications. Fischer Black (co-author of the Black-Scholes equation for option pricing) wrote an important 1986 essay titled “Noise,” related to financial markets and trading.

Most data is noisy. John Volpi is fond of saying, “there is a lot of information in noisy data”; it’s often a mistake to think that noisy data is “bad.”

(See Filter, Fit, Overfit)

##### Non-Linear Voting

A real-time voting technique developed by Lone Star ® and used to quickly separate the Significant Few from the Trivial Many. It is especially useful when a substantial list of items is under consideration.

##### Normal Distribution

The normal distribution is also called the “Bell Curve” or “Gaussian distribution,” although there are other “bell curves” which are not Gaussian. Many disciplines assume most things are Gaussian unless there is evidence otherwise. However, real-world examples of normal distributions are less common than some claim.

The normal distribution is important to probabilistic mathematics because of the central limit theorem. In many cases, sums of independent random variables have a distribution close to the normal, and the solutions to a number of other mathematical problems are also the normal distribution.
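
A small demonstration of the theorem: individual uniform draws are flat, but sums of a dozen independent draws pile up in a bell shape.

```python
import random

random.seed(1)
sums = [sum(random.random() for _ in range(12)) for _ in range(50_000)]

# A crude text histogram of the sums (mean 6, roughly normal):
for lo in range(2, 10):
    count = sum(lo <= s < lo + 1 for s in sums)
    print(f"{lo}-{lo + 1}: {'#' * (count // 500)}")
```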

Normal distributions are sometimes assumed to apply because they are easy to manipulate in probability equations. Such assumptions can be flawed, leading to a belief that something possible (or even probable) is really impossible. When these “impossible” things happen, they are called “Black Swans.”

Misapplication of normal distributions is not the only reason Black Swans make surprise appearances, but it is one reason they appear.

(See Central Limit Theorem, Distribution, t Distribution)

##### NPV

Stands for Net Present Value, which is an estimate of the current worth of future cash flows. NPV is often taught as a “factual” basis for business analysis, but it is nearly always based on point estimates of several future conditions which are, in fact, uncertain. NPV can be useful for comparing alternatives, but it can be difficult to determine whether the underlying assumptions are comparably accurate. Easy-to-use NPV formulas in spreadsheets have increased the popularity, and the misuse, of NPV. Measures like NPV are important for large, fixed-cost capital projects and apply to oil and gas, utilities, and networking projects.
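
The underlying arithmetic is simple, which is part of both the popularity and the danger. A minimal sketch with invented cash flows:

```python
# Net Present Value: discount each future cash flow back to today,
# NPV = sum of CF_t / (1 + r)**t over all periods t.
def npv(rate, cashflows):
    """cashflows[0] is today (t = 0), cashflows[1] is one period out, etc."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# An up-front cost of 1000, then 300 per year for four years, at 8%:
print(round(npv(0.08, [-1000, 300, 300, 300, 300]), 2))
# The answer moves with the assumed rate and the point-estimate cash
# flows -- exactly the uncertainty noted above.
```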

(See Prospect Theory, Flaw of Averages, Utility Theory)

##### Null Hypothesis

In statistical hypothesis testing, we attempt to disprove the null hypothesis using available data. In probabilistic settings, we can never truly “prove” something is true, but we can often show that the alternative is, most likely, false.
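
A minimal worked example with an invented coin-flip experiment: we compute how surprising the data would be if the null hypothesis (“the coin is fair”) were true.

```python
import math

# Null hypothesis: p(heads) = 0.5. Data: 62 heads in 100 flips.
n, heads = 100, 62
# Two-sided p-value: probability of a result at least this extreme
# under the null (binomial with p = 0.5).
tail = sum(math.comb(n, k) * 0.5**n for k in range(heads, n + 1))
print(f"p-value ~ {2 * tail:.3f}")
# A small p-value does not prove the coin is biased; it says the data
# would be unlikely if the null hypothesis were true.
```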

(See Falsification)

##### Nyquist

Harry Nyquist worked for AT&T doing R&D and joined Bell Labs when it was created. He made important contributions to understanding the relationships between bandwidth, channel capacity, and noise. He laid the groundwork for Claude Shannon’s later creation of information theory, and he authored a classic paper on closed-loop control systems which is still the standard model more than 80 years later. A number of things are named for Nyquist:

**Nyquist Criterion** – Defines conditions which help ensure robust communications channel integrity.
**Nyquist Frequency** – A frequency characterizing sampled systems in signal processing and the analysis of things which vary over time. It is half the sampling rate of the system, and defines how the system will misbehave with naughtiness like aliasing.
**Nyquist Plot** – A graphical representation of the response of a system across frequency.
**Nyquist Rate** – Twice the bandwidth of a finite-bandwidth channel or function (please do not confuse this with the Nyquist frequency); this is the minimum sampling rate which meets the Nyquist Criterion.
**Nyquist-Shannon Sampling Theorem** – Defines the sample rate needed to capture all the information in a continuous (analog) signal. This is the foundation of understanding the connection between the analog world and the digital domain. Ignore it at your very great peril; a small demonstration follows this list.
**Nyquist Stability Criterion** – A simple but powerful test for stability in feedback systems.
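
A minimal demonstration of what the sampling theorem protects against: a 9 Hz cosine sampled at 10 samples per second (Nyquist frequency 5 Hz) produces samples identical to those of a 1 Hz cosine.

```python
import math

fs = 10.0  # sampling rate (Hz); Nyquist frequency = fs / 2 = 5 Hz
for n in range(10):
    t = n / fs
    c9 = math.cos(2 * math.pi * 9 * t)  # signal above the Nyquist frequency
    c1 = math.cos(2 * math.pi * 1 * t)  # its alias below it
    print(f"t={t:.1f}s  9 Hz: {c9:+.3f}   1 Hz: {c1:+.3f}")  # identical columns
```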

(See Information Theory, Shannon, Volpi’s Rule of Decision Making, Channel Capacity)