In this week’s Brief, we look at social categories and the role that statistics and data science have played in social engineering – 100 years ago and today. Our course spotlight is:
- April 3 – May 1: Categorical Data Analysis
See you in class!
The Normal Share of Paupers
In 2009, China began regional pilot programs that extended credit scoring to a broader purpose – scoring a person’s “social credit.” A century earlier, at the height of the eugenics craze, the famous statistician Francis Galton set out to repurpose statistical concepts in service of social engineering.[…]
Word of the Week
Purity
In classification, purity measures the extent to which the records in a group share the same class. It is also termed class purity or homogeneity; sometimes its opposite, impurity, is measured instead. Gini impurity, for example, is calculated in the two-class case as 2p(1-p), where p is the proportion of records belonging to class 1. The lower the Gini impurity, the purer the group. Measuring impurity is particularly important in decision tree algorithms, which split data so as to maximize purity in the resulting partitions […]
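As a rough illustration of the idea, the sketch below computes Gini impurity in its general multi-class form, 1 minus the sum of squared class proportions. For two classes this works out to 2p(1-p), which is proportional to the p(1-p) expression and ranks groups the same way. The function names `gini_impurity` and `weighted_impurity` are made up for this example and do not come from any particular library; the size-weighted average is one common way a tree algorithm might score a candidate split, not necessarily the only one.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a group of class labels: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one record")
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_impurity(left, right):
    """Size-weighted average impurity of the two partitions of a candidate split."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n


pure = ["a", "a", "a", "a"]    # every record in the same class
mixed = ["a", "a", "b", "b"]   # evenly divided between two classes

print(gini_impurity(pure))     # 0.0
print(gini_impurity(mixed))    # 0.5
```

A split that sends all the "a" records one way and all the "b" records the other has a weighted impurity of 0, while a split that leaves both partitions evenly mixed scores 0.5, so a tree-building algorithm would prefer the first.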
Company Spotlight
Rightship (Maritime Risk Assessment)
The maritime industry has developed as something of a “stateless” business – crews, flags of registry, ship owners, chartering agencies, and cargo owners can all be subject to different legal and regulatory regimes, creating myriad unknowns and risks for those involved. Ship collisions, sinkings, oil spills, mechanical failures and the strandings that result, terrorist attacks – these are some of the risks that must be assessed and weighed.
Rightship is a company that applies predictive analytics to this problem, yielding a score that assesses a ship’s likelihood of becoming involved in an incident. Elder Research supplied important external validation for the Rightship system.
Quote
“I only believe in statistics that I doctored myself”
– Winston Churchill, who set up the British state statistical service, S Branch, in 1939
Course Spotlight
April 3 – May 1: Categorical Data Analysis
In this course you will learn how to analyze categorical and count data. Your instructor is Brian Marx, Professor of Statistics at Louisiana State University and the Coordinating Editor of Statistical Modelling: An International Journal. You will learn how to:
- Calculate joint, marginal and conditional probabilities for R by C tables
- Test for independence and equality of proportions
- Fit and interpret logistic models for binary data
- Fit and interpret Poisson models for count data
- Check model assumptions, analyze residuals, and assess goodness of fit
See you in class!