Social Network Analysis (SNA) in Medicine

In hospitals, “sentinel events” are events that carry with them a significant risk of unexpected death or harm. It is estimated that ⅔ of such sentinel events result from communications failures during the handoff of a patient from one provider to another (e.g. during a nursing shift change). In a recent paper, a team of …

Matching Algorithms

Some applications of machine learning and artificial intelligence are recognizably impressive – predicting future hospital readmission of discharged patients, for example, or diagnosing retinopathy. Others – self-driving cars, for example – seem almost magical. The matching problem, though, is one where your first reaction might be “What’s so hard about that?” For example, to take …

Feature Engineering and Data Prep – Still Needed?

It is a truism of machine learning and predictive analytics that 80% of an analyst’s time is consumed in cleaning and preparing the needed data. I saw an estimate by a Google engineer that 25% of the time was spent just looking for the right data. A big part of this process is human-driven feature …

Confusing Terms in Data Science – A Look at Synonyms, Homonyms and more

To a statistician, a sample is a collection of observations (cases). To a machine learner, it’s a single observation. Modern data science has its origin in several different fields, which leads to potentially confusing homonyms and synonyms, like these:
Homonyms (words with multiple meanings):
Bias: To a lay person, bias refers to an opinion about something …

Handling the Noise – Boost It or Ignore It?

In most statistical modeling or machine learning prediction tasks, there will be cases that can be easily predicted based on their predictor values (signal), as well as cases where predictions are unclear (noise). Two statistical learning methods, boosting and ProfWeight, use those difficult cases in exactly opposite ways – boosting up-weights them, and ProfWeight down-weights …

Good to Great

In 1994, Jim Collins and Jerry Porras, former and current Stanford professors, published the best-seller Built to Last that described how “long-term sustained performance can be engineered into the DNA of an enterprise.” It sold over a million copies. Buoyed by that success, Collins and a research team set out to find the characteristics of companies …

The False Alarm Conundrum

False alarms are one of the most poorly understood problems in applied statistics and biostatistics. The fundamental problem is the wide application of a statistical or diagnostic test in search of something that is relatively rare. Consider the Apple Watch’s new feature that detects atrial fibrillation (afib). Among people with irregular heartbeats, Apple claims a …
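The rare-condition problem this excerpt describes can be illustrated with a little arithmetic. The sketch below applies Bayes’ rule to compute the share of positive test results that are real. The prevalence, sensitivity, and specificity values are hypothetical numbers chosen for illustration; they are not Apple’s figures, which the excerpt does not give.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a positive test result is a true positive (Bayes' rule)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs (assumptions, not taken from the article):
# 1% of people tested have the condition, and the test is 95% sensitive
# and 95% specific.
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.95, specificity=0.95)
print(f"Share of alarms that are real: {ppv:.1%}")
```

Under these assumed numbers only about 16% of alarms are real, even though the test is right 95% of the time in each group, because the healthy group is so much larger than the affected one.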

Random Selection for Harvard Admission?

An ethical algorithm… Ethics in algorithms is a popular topic now. Usually the conversation centers around the possible unintentional bias or harm that a statistical or machine learning algorithm could do when it is used to select, score, rate, or rank people. For example – a credit scoring algorithm may include a predictor that is …

GE Regresses to the Mean

Thirty years ago, GE became the brightest star in the firmament of statistical ideas in business when it adopted Six Sigma methods of quality improvement. Those methods had been introduced by Motorola, but Jack Welch’s embrace of the same methods at GE, a diverse manufacturing powerhouse, helped bring stardom to industrial statisticians. Last week, GE’s …

Benford’s Law Applies to Online Social Networks

Fake social media accounts and Russian meddling in US elections have been in the news lately, with Mark Zuckerberg (Facebook founder) testifying this week before the US Congress. Dr. Jen Golbeck, who teaches Network Analysis at Statistics.com, published an ingenious way to determine whether a Facebook, Twitter or other social media account is fraudulent. Her …

The Real Facebook Controversy

Cambridge Analytica’s wholesale scraping of Facebook user data is big news now, and people are shocked that personal data is being shared and traded on a massive scale on the internet. But the real issue with social media is not harm to individual users whose information was shared, but sophisticated and sometimes subtle mass manipulation …

Masters Programs versus an Online Certificate in Data Science from Statistics.com

This week we attended the INFORMS (Institute for Operations Research and the Management Sciences) analytics conference in Baltimore, where a special meeting was held for directors of academic analytics programs to better align what universities are producing with what industry is seeking. The number of such programs is still growing rapidly (>200), …

“Money and Brains” and “Furs and Station Wagons”

“Money and Brains” and “Furs and Station Wagons” were evocative customer shorthands that the marketing company Claritas came up with over a half century ago. These names, which facilitated the work of marketers and sales people, were shorthand descriptions of segments of customers identified through statistical cluster analysis. Cluster analysis is also used in market …

Quotes about Data Science

“The goal is to turn data into information, and information into insight.” – Carly Fiorina, former CEO, Hewlett-Packard Co. Speech given at Oracle OpenWorld
“Data is the new science. Big data holds the answers.” – Pat Gelsinger, CEO, EMC, Big Bets on Big Data, Forbes
“Hiding within those mounds of data is knowledge that could change the life …

College Credit Recommendation

Statistics.com Receives College Recommendation from the American Council on Education (ACE)
College Credit Recommendation for Online Data Science Courses from The Institute for Statistics Education at Statistics.com LLC
The American Council on Education’s College Credit Recommendation Service (ACE CREDIT) has evaluated and recommended college credit for 5 more of The Institute for Statistics Education at …

Big Data and Clinical Trials in Medicine

There was an interesting article a couple of weeks ago in the New York Times magazine section on the role that Big Data can play in treating patients – discovering things that clinical trials are too slow, too expensive, and too blunt to find. The story was about a very particular set of lupus symptoms, …