Job Spotlight: Data Scientist
Data science is one of a host of similar terms. “Artificial intelligence” has been around since the 1960’s and “data mining” for at least a couple of decades. “Machine learning” came out of the computer science community, and “analytics,” “data analytics,” and “predictive analytics” came out of the statistics and operations research (OR) communities. Among all of…
Author Archives: Dave Flatley
Course Spotlight: Survival Analysis
Convinced that he, like his father, would die in his 40’s, Winston Churchill lived his early life in a frenetic hurry. He had participated in four wars on three continents by his mid-20’s, served in multiple ministerial positions by his 30’s, and published 12 books by his 40’s. Little did he know that more than…
Instructor Spotlight: David Kleinbaum
David Kleinbaum developed several courses for Statistics.com, including Survival Analysis, Epidemiologic Statistics, and Designing Valid Statistical Studies. David retired a little over a year ago from Emory University, where he was a popular and effective teacher with the ability to distill and explain difficult statistical concepts with clarity and concision. David had a flair for…
Likert scale assessment surveys
Do you work with multiple choice tests, or Likert scale assessment surveys? Rasch methods help you construct linear measures from these forms of scored observations and analyze the results from such surveys and tests. In the course “Practical Rasch Measurement – Core Topics,” you will learn practical aspects of data setup, analysis, output interpretation, fit analysis, differential…
Historical Spotlight: Jacob Wolfowitz
World War II was a crucible of technological innovation, including advances in statistics. Jacob Wolfowitz, born a century ago (1920), looked at the problem of noisy radio transmissions. Coded radio transmissions were critical elements of military command and control, and they were plagued by the problem of atmospheric or other interference – “noise”. The weaker…
Certificate Graduate: Karolis Urbonas, Amazon
The Statistics.com courses have helped me a lot, pushing me to the limit and making me learn much more than I expected I could. The knowledge I gained I could immediately leverage in my job … then eventually led to landing a job in my dream company – Amazon. - Karolis Urbonas, Global Head of Machine…
Certificate Graduate: Cristobal Bazan, United Nations Agency
Certificate Student Profile of Cristobal Bazan: “My courses help me look at more complex problems using different approaches to show more interesting aspects of conditions, beyond just tables and charts, more than just sampling or descriptive statistics.” (Cristobal Bazan, United Nations Agency) How do you use statistics in your job? I work in a statistical…
Feature Engineering and Data Prep – Still Needed?
It is a truism of machine learning and predictive analytics that 80% of an analyst’s time is consumed in cleaning and preparing the needed data. I saw an estimate by a Google engineer that 25% of the time was spent just looking for the right data. A big part of this process is human-driven feature…
Job Spotlight: Risk Analyst
Many jobs are centered around risk management. If you’re looking through job postings, of course, you’ll see lots of jobs whose purpose is to make sure that nothing bad happens – the equivalent of locking the doors and closing the windows. More interesting from a statistical perspective are the jobs that assume that bad things…
Problem of the Week: The Value of Bedrooms
Question: You work for an internet real-estate company, building statistical models to predict home price on the basis of square footage, number of bedrooms, number of bathrooms, property type (single family home, townhouse, multiplex), and age. Surprisingly, you find the coefficient for bedrooms is negative, meaning that adding bedrooms decreases value. What might account for…
Industry Spotlight – Pharma
The cost of bringing a new drug to market is over $2 billion, by some estimates. This covers the R&D, clinical trial testing and regulatory approval costs of the drug that makes it through the whole process, and also the same costs of the 9 drugs that fail for each drug that succeeds.
Statistically Significant – But Not True
If you are looking for the Feature Engineering blog post, you can find it here: https://www.statistics.com/feature-engineering-data-prep-still-needed/ In 2015, at an Alzheimer’s conference, Biogen researchers presented dramatic brain scans showing that the antibody aducanumab effectively cleared out plaque in the brain, plaque that was associated with Alzheimer’s disease. Their study involved 166 patients in a randomized,…
Book Review: Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
This week’s book review is of Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are, Seth Stephens-Davidowitz’s fascinating book about how social media data reveals all sorts of things about us that we barely know ourselves. For example, did you know that the ages 8-12 are…
Industry Spotlight – Precision Agriculture
The application of analytics to agriculture has given rise to what is called “precision agriculture”, a science that seeks to use detailed information that is local in time and place. Tractors and farm equipment are being equipped with sensors and software that allow them to respond automatically to external data, and…
Historical Spotlight: Ronald A. Fisher
In 1919, Ronald A. Fisher was appointed as chief statistician at the agricultural research station in Rothamsted, a post created for him. His work there resulted, in 1925, in the publication of his classic Statistical Methods for Research Workers. An important message of his book was that statisticians needed to be involved at a practical…
Instructor Spotlight: Prof. David Unwin
Prof. David Unwin has guided, developed and taught the spatial analysis curriculum at Statistics.com since 2005. David lives in central England, about an hour north of the storied Rothamsted agricultural research center. Until his retirement in 2002, he was Professor of Geography at Birkbeck College, University of London, where he retains an Emeritus Chair in…
Statistics in Agriculture: Encycloweedia
Weeds are big business – the global herbicide market is over $35 billion annually. Weeds are also big government (think “invasive species”). California’s listing of weeds is called Encycloweedia, and the state publishes a quarterly newsletter called Noxious Times. Colorado publishes a similar periodical, Invader. The weed-killer Roundup is the focus of lawsuits that illustrate…
Tensor
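A tensor is the multidimensional extension of a matrix: a scalar is a single number, a vector is a one-dimensional array, a matrix is a two-dimensional array, and a tensor extends this to three or more dimensions (i.e., scalar > vector > matrix > tensor).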
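The progression from scalar to tensor can be illustrated with plain nested lists. This is only a sketch: the helper `shape` and the example values are hypothetical, and real work would normally use an array library rather than nested lists.

```python
def shape(x):
    """Return the dimensions of a regularly nested list as a tuple.

    Hypothetical helper for illustration; it assumes every sub-list at a
    given level has the same length.
    """
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        if not x:          # an empty list has no further levels to inspect
            break
        x = x[0]
    return tuple(dims)


scalar = 5.0                                # 0 dimensions: a single number
vector = [1.0, 2.0, 3.0]                    # 1 dimension
matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]                  # 2 dimensions: rows x columns
tensor = [matrix, matrix, matrix, matrix]   # 3 dimensions: a stack of 4 matrices

for name, obj in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("tensor", tensor)]:
    dims = shape(obj)
    print(f"{name}: {len(dims)} dimension(s), shape {dims}")
```

Each step adds one level of nesting, which is the sense in which a tensor generalizes a matrix.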
Problem of the Week: Missing Data
Question: You have a supervised learning task with 30 predictors, in which 5% of the observations are missing. The missing data are randomly distributed across variables and records. If your strategy for coping with missing data is to drop records with missing data, what proportion of the records will be dropped? Is the assumption of…
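One way to approach the first part of the question is sketched below. It reads "5% of the observations are missing" as each individual value being missing independently with probability 0.05, which is an assumption rather than something the excerpt states. Under that reading, a record is kept only if all 30 of its predictor values are present. The function name `fraction_dropped` is hypothetical.

```python
def fraction_dropped(n_predictors, p_missing):
    """Expected share of records with at least one missing value.

    Assumes each value is missing independently with probability p_missing,
    so a record is complete with probability (1 - p_missing) ** n_predictors.
    """
    p_complete = (1.0 - p_missing) ** n_predictors
    return 1.0 - p_complete


share = fraction_dropped(n_predictors=30, p_missing=0.05)
print(f"Expected share of records dropped: {share:.1%}")
```

If missingness is instead concentrated in a few variables or records, the independence assumption fails and the share dropped will differ from this estimate.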
Student Spotlight: Barry Eggleston
Barry Eggleston is a health research statistician who has worked on both clinical trials and observational studies, and is currently with RTI in North Carolina. In his early career, his work was solely designing and analyzing clinical trials using typical biostatistics methods ranging from t-test to survival analysis and mixed models. After moving to RTI…