In 1994, Jim Collins and Jerry Porras, former and current Stanford professors, published the best-seller Built to Last, which described how “long-term sustained performance can be engineered into the DNA of an enterprise.” It sold over a million copies. Buoyed by that success, Collins and a research team set out to find the characteristics of companies… Continue reading “Good to Great”
Blog
Selection Bias
Selection bias is a sampling or data collection process that yields a biased, or unrepresentative, sample. It can occur in numerous situations; here are just a few:
Space Shuttle Explosion
In 1986, the U.S. space shuttle Challenger exploded about 73 seconds after launch. A later investigation found that the cause of the disaster was O-ring failure, due to cold temperatures. The temperature at launch was 39 degrees, colder than any prior launch. The cold caused the O-rings to become stiff and brittle, losing the flexibility that… Continue reading “Space Shuttle Explosion”
Alaskan Generosity
People in Alaska are extraordinarily generous – that’s what a predictive model showed, when applied to a charitable organization’s donor list. A closer examination revealed a flaw – while the original data was for all 50 states, the model’s training data for Alaska included donors, but excluded non-donors. The reason? The data was 99% non-donors,… Continue reading “Alaskan Generosity”
Industry Spotlight – The Military
Abraham Wald, a persecuted Jewish mathematician who fled Austria just before World War II, led an analysis of Allied bombers returning from missions. Hitherto, the Air Force had focused on reinforcing areas that showed the most damage on return. Wald convinced them instead to focus on the areas that consistently showed no damage. He reasoned… Continue reading “Industry Spotlight – The Military”
Why Analytics Projects Fail – 5 Reasons
With the news full of so many successes in the fields of analytics, machine learning and artificial intelligence, it is easy to lose sight of the high failure rate of analytics projects. McKinsey just came out with a report that only 8% of big companies (revenue > $1 billion) have successfully scaled and integrated… Continue reading “Why Analytics Projects Fail – 5 Reasons”
Political Analytics and Microtargeting
The statistics of targeting individual voters with specific messages, as opposed to messaging that went to whole groups, began in the U.S. over a decade ago with the Democrats. Political targeting is now an established business, or at least a discipline within the broader realm of political consulting. By 2016, the Republicans had surged well… Continue reading “Political Analytics and Microtargeting”
The Statistics of Persuasion
The Art of Persuasion is the title of more than one book in the self-help genre, books that have spawned blogs, podcasts, speaking gigs and more. But the science of persuasion is actually of more interest, because it produces useful rules that can be studied and deployed. Marketers and politicians have long been enthusiastic users… Continue reading “The Statistics of Persuasion”
Historical Spotlight – ISOQOL
Twenty-five years ago, the International Society for Quality of Life Research was founded with a mission to advance the science of quality of life and related patient-centered outcomes in health research, care and policy. While focusing on quality of life (QOL) in healthcare may seem like a no-brainer, measuring it is not as easy as… Continue reading “Historical Spotlight – ISOQOL”
Book Review: Thinking Fast and Slow
Daniel Kahneman won a Nobel Prize in Economics for his work in behavioral economics, much of it with his colleague Amos Tversky, who died in 1996. Kahneman’s 2011 classic, Thinking, Fast and Slow, is a superbly written non-technical summary of their fascinating research and its often counter-intuitive findings. The best feature of the book is the… Continue reading “Book Review: Thinking Fast and Slow”
Likert Scale
A “Likert scale” is used in self-report rating surveys to allow users to express an opinion or assessment of something on a gradient scale. For example, a response could range from “agree strongly” through “agree somewhat” and “disagree somewhat” on to “disagree strongly.” Two key decisions the survey designer faces are
- How many gradients to allow, and
- Whether to include a neutral midpoint
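The two design decisions can be made concrete with a small sketch. It assumes the four response labels quoted above, plus a hypothetical midpoint label ("neither agree nor disagree") that does not come from the original entry; the function names are also illustrative only.

```python
# Four-point scale using the labels from the example above (no neutral midpoint).
FOUR_POINT = [
    "disagree strongly",
    "disagree somewhat",
    "agree somewhat",
    "agree strongly",
]

# Five-point variant: same labels plus a hypothetical neutral midpoint.
FIVE_POINT = FOUR_POINT[:2] + ["neither agree nor disagree"] + FOUR_POINT[2:]


def encode(responses, scale):
    """Map each response label to its 1-based position on the scale."""
    codes = {label: i + 1 for i, label in enumerate(scale)}
    return [codes[r] for r in responses]


def has_neutral_midpoint(scale):
    """A scale with an odd number of gradients has a middle category."""
    return len(scale) % 2 == 1


print(encode(["agree strongly", "disagree somewhat"], FOUR_POINT))
print(has_neutral_midpoint(FOUR_POINT), has_neutral_midpoint(FIVE_POINT))
```

An even number of gradients forces every respondent to lean one way or the other, while an odd number leaves room for a neutral answer.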
Football Analytics
Preparing for the Super Bowl
Your team is at midfield, you have the ball, it’s 4th down with 2 yards to go. Should you go for it? (Apologies in advance to our many readers, especially those outside the U.S., who are not aficionados of American football, but it’s Super Bowl week in the U.S. A quick guide… Continue reading “Football Analytics”
Job Spotlight: Digital Marketer
A digital marketer handles a variety of tasks in online marketing – managing online advertising and search engine optimization (SEO), implementing tracking systems (e.g. to identify how a person came to a retailer), web development, preparing creatives, implementing tests, and, of course, analytics. There are typically three types of employers: Marketing agencies that contract out… Continue reading “Job Spotlight: Digital Marketer”
Dummy Variable
A dummy variable is a binary (0/1) variable created to indicate whether a case belongs to a particular category. Typically a dummy variable will be derived from a multi-category variable. For example, an insurance policy might be residential, commercial or automotive, and there would be three dummy variables created:
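As a minimal sketch of the insurance example, the snippet below derives one 0/1 column per policy type. The sample records and the helper name `make_dummies` are hypothetical, chosen only for illustration.

```python
def make_dummies(values, categories):
    """Return one 0/1 list per category: 1 where the case belongs to it."""
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical policy records, one category label per case.
policies = ["residential", "commercial", "automotive", "commercial"]
categories = ["residential", "commercial", "automotive"]

dummies = make_dummies(policies, categories)
for name, column in dummies.items():
    print(name, column)
```

Each case gets a 1 in exactly one of the three columns, so the original multi-category variable can always be recovered from the dummies.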
Things are Getting Better
In the visualization below, which line do you think represents the UN’s forecast for the number of children in the world in the year 2100? Hans Rosling, in his book Factfulness, presents this chart and notes that in a sample of Norwegian teachers, only 9% identified the correct answer. Rosling, who died two years… Continue reading “Things are Getting Better”
Artificial Lawyers
Can statistical and machine learning methods replace lawyers? A host of entrepreneurs think so, as do the folks who run www.artificiallawyer.com. Text mining and predictive model products are available now to predict case staffing requirements and perform automated document discovery, and natural language algorithms conduct legal research and case review. In 2017, a predictive algorithm… Continue reading “Artificial Lawyers”
Entity Resolution and Identifying Bad Guys
Earlier, we described how Jen Golbeck (who teaches Network Analysis at Statistics.com) analyzed Facebook connections to identify fake accounts (the account holders’ friends all had the same number of friends, which is highly improbable statistically). Network analysis and studying connections lie at the heart of entity resolution. To a sales and marketing person, entity resolution… Continue reading “Entity Resolution and Identifying Bad Guys”
Work and Heat
If you are working on New Year’s Eve or New Year’s Day, odds are it is from home, where you can (usually) control the temperature. Which, from the standpoint of productivity, is a good thing. According to a study from Cornell, raising the office temperature from 68 degrees to 77 degrees increased… Continue reading “Work and Heat”
Curbstoning
Curbstoning, to an established auto dealer, is the practice of unlicensed car dealers selling cars from streetside, where the cars may be parked along the curb. With a pretense of being an individual selling a car on his or her own, and with no fixed location, such dealers avoid the fixed costs and regulations that… Continue reading “Curbstoning”
Snowball Sampling
Snowball sampling is a form of sampling in which the selection of new sample subjects is suggested by prior subjects. From a statistical perspective, the method is prone to high variance and bias, compared to random sampling. The characteristics of the initial subject may propagate through the sample to some degree, and a sample derived by starting with subject 1 may differ from that produced by starting with subject 2, even if the resulting sample in both cases contains both subject 1 and subject 2. However, …
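The dependence on the starting subject can be illustrated with a toy sketch. The contact network below is hypothetical (two tight groups joined by a single link), and the referral rule, in which each subject names the first unsampled contacts on their list, is a simplifying assumption rather than a standard procedure.

```python
from collections import deque


def snowball_sample(contacts, seed, size, referrals=2):
    """Grow a sample from `seed`: each sampled subject refers up to
    `referrals` contacts who are not yet in the sample."""
    sample = [seed]
    queue = deque([seed])
    while queue and len(sample) < size:
        person = queue.popleft()
        named = [c for c in contacts[person] if c not in sample][:referrals]
        for c in named:
            if len(sample) >= size:
                break
            sample.append(c)
            queue.append(c)
    return sample


# Hypothetical network: A, B, C know each other; D, E, F know each other;
# C and D are the only link between the two groups.
contacts = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}

from_a = snowball_sample(contacts, "A", size=4)
from_f = snowball_sample(contacts, "F", size=4)
print(from_a)
print(from_f)
```

Starting from A, the sample is dominated by A's group; starting from F, it is dominated by F's group. Both samples have the same size and are drawn from the same population, yet their composition follows the initial subject.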