Intersection of Sports, Analytics Moves From Field to Front Office

Analytics in sports has tended to focus on players and roster management. This thought remains rooted in the Moneyball-supported view of sports. Yet, sports extends beyond players and roster management. Analytics is moving from the field to the front office as these job openings show.

A person looking to break into this field needs to know cluster analysis and multidimensional scaling. To set his or her resume apart, this person should know factor analysis and conjoint analysis.

Additionally, this person would need to know how to communicate results to different audiences, in both written and oral forms. A demonstrated ability to link analysis to tactical recommendations is essential.

Finally, this candidate should have experience with creating, collecting, and analyzing data from surveys.

Through the Otterbein marketing major, we offer the opportunity to acquire these experiences and skills.

Google Makes Goat Farmer's List

There's burying the lede, and then there's Ethan Baron's effort. Four screens, and more than 20 paragraphs into a story, Baron finally gets around to how a goat farmer can generate sufficient traffic to his MBA ranking website. The money accusation:

Then how did Kenan-Flagler get into those rankings? The MBA@UNC program is run by Kenan-Flagler’s executive education unit, which is a limited liability company, in conjunction with educational technology company 2U, which delivers and markets the program. Susan Cates, associate dean of executive education for Kenan-Flagler – whose name was on the first email Poets&Quants received after requesting information on the school via BestCollegeReviews – describes MBA@UNC’s business relationship with the dubious rankings websites as 'a tiny subset of a subset of the marketing that we do for the program.'

UNC underwrites this service, which benefits from a solid SEO effort. Because the goat farmer's website appears higher in search results, prospective students are more likely to click on it.

The bigger question remains, though. Rankings offer no value to prospective students. Perhaps the fact that anyone, including a goat farmer, can create and promote a ranking supports the argument for ending rankings.

All the Colleges That's Fit to Rank

Feeling left out of the college rankings racket, the New York Times published its own list. And, like lists from US News & World Report, Washington Monthly, and Money, the New York Times' list appears light on logic and analytical support. Despite the presence of so many lists, as well as the Princeton Review's exhaustive effort, the Times decided to rank colleges based on their economic diversity. This ranking relies on two measures:

The College Access Index is based on the share of freshmen in recent years who came from low-income families (measured by the share receiving a Pell grant) and on the net price of attendance for low- and middle-income families.

No justification was given for choosing these two measures over another set of measures. The Times then set an arbitrary cutoff: a four-year graduation rate of 75% or higher. The measures produce a list that places Vassar, Grinnell, and Swarthmore among the top 10 schools. Other academically elite schools such as the University of California at Berkeley and Washington University fail to make the list because they either fall below the arbitrary 75% cutoff or do not admit many Pell Grant recipients. Inside Higher Education provides more zaniness created by the arbitrariness of the list.

If the Times' list appears rather arbitrary, it, like almost all college lists, is arbitrary by definition. No set of measures could explain why one college appears better than another college. Yet, very few outside of some higher education senior executives question the wisdom of ranking colleges. Rankings are intended to establish preference or order.

A better exercise would be to create clusters of colleges based on some measures. Indeed, I have argued for this approach here and here, and would be happy to support a Distinction project in this area.

Rankings of colleges create more chaos than order.


Yet Another College List

Money magazine joins the college ranking racket with its twist on the money-making, analytics-light list. To distinguish its list from all lists available on the market, Money uses two measures. One, Money determines how much money parents will borrow to pay for their child's degree from a particular college. Two, Money estimates how much the child will generate in salary with a degree from a particular college.

Based on these simple measures, Money adds BYU, Harvey Mudd, and Babson to the standard top-10 schools such as MIT and Stanford.

Instead of trying to create a list that appears somewhat different from all available college lists, perhaps Money could have developed a cluster analysis to group universities. Such an effort would provide something of greater value to parents.

(Too Early or Too Late) Friday Questions

Usually on Fridays, a question from a Principles of Marketing student gets posted. This Monday, several questions are being answered in what is either a very early or a very late Friday Questions post.

What is an example of a nice graph to use in our PowerPoint?

Robin Williams includes some examples in her book on layouts for the non-designer. Stephen Few's blog and books also discuss this concept. You should be able to find Williams's book at the Otterbein library and Few's books through interlibrary loan.

By clicking on the "Samples" link under the "MKTG3100" heading, you will find five reports that serve as good guides for design, including tables and graphs.

I need more Excel knowledge (e.g., step by step, how to run different formulae, etc.).

For step-by-step instructions, there are dozens of videos posted on YouTube. If you search for a topic such as "correlation in Excel," "Poisson distribution in Excel," or "scatter chart in Excel," then you will receive several recommendations. You will need to watch a few to determine the video that best suits your needs.

I want to know how to do correlation metrics in a form of charts and graphs and Excel.

An X-Y scatter plot should provide some insight. Also, you should look at the chandoo Excel blog for instructions, suggestions, and guides on using various pieces of Excel.
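For readers curious about what the number behind a scatter plot measures, here is a minimal sketch in Python (an assumption on my part, since the course itself uses Excel). The ad-spend and sales figures are made up for illustration, and `pearson_r` is a hypothetical helper name. Excel's CORREL function reports the same quantity.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: weekly ad spend (in $000s) and units sold.
ad_spend = [1, 2, 3, 4, 5]
units_sold = [12, 15, 21, 24, 30]

r = pearson_r(ad_spend, units_sold)
print(round(r, 3))
```

A value near +1 or -1 suggests the points on the scatter plot fall close to a straight line; a value near 0 suggests no linear pattern.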

When a cluster analysis does not work, is there another effective way to segment the market without an undifferentiated approach?

First, consider theory. How or why would there be a difference in the market? In the Kirin and Dell cases, no one is more likely to drink Kirin or use a Dell than another person based on the given characteristics.

Second, try a different analytical tool such as factor analysis. This tool follows a similar idea to cluster analysis in that you are looking for groups. However, factor analysis explains why a group exists. That is, it provides insight into how groups are unique but not necessarily how groups differ.

Third, build a contingency table and run a chi-square test. This approach does not require you to impose a distribution on the data. However, you need to select two or three items at a time. For example, from the Kirin case, we could have created a contingency table using (a) likelihood to purchase an import beer, and (b) preference for a strong taste. The remaining questions would not have been considered.
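The contingency-table idea can be sketched in a few lines of Python. The counts below are invented and do not come from the Kirin case, `chi_square_statistic` is a hypothetical helper name, and the sketch skips the continuity correction that some packages apply to 2x2 tables. The critical value of 3.841 is the standard chi-square cutoff for one degree of freedom at the 0.05 level.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two items were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts:
# rows = likely / unlikely to purchase an import beer
# cols = prefers / does not prefer a strong taste
observed = [[30, 10],
            [20, 40]]

stat = chi_square_statistic(observed)
CRITICAL_VALUE = 3.841  # chi-square, 1 degree of freedom, alpha = 0.05
reject_null = stat > CRITICAL_VALUE
print(round(stat, 2), reject_null)
```

If the statistic exceeds the critical value, the two items appear related, which hints that they may be useful for separating segments.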

In MKTG3850, we cover factor analysis and contingency tables in greater detail.

Friday Fun Question

Most Fridays, I will answer a question or questions from Principles of Marketing students about marketing. This week's question: How do you determine the target market?

Several methods exist to segment a market beyond the simplistic (everyone wants my product) and the gut (young women are very interested in my market offering). More specific methods include parametric analysis such as regression or ANOVA, and non-parametric analysis such as the chi-square test and Spearman's rank correlation (rho). These methods are covered in the Marketing 3850 (Marketing Analytics) course.

Parametric and nonparametric tests provide a critical region in which the analyst can decide to reject, or fail to reject, the null hypothesis. Typically, the null hypothesis states that there is no difference between groups, that the means are equal, or that the medians are equal, depending on the analytical tool.

Beyond those approaches exist cluster analysis and factor analysis. We cover cluster analysis in both Principles of Marketing (MKTG3100) and Marketing Analytics (MKTG3850). The thought is to form groups based on distance, or loss of information, to determine how similar or dissimilar the observations appear. Statsoft provides a longer, more detailed explanation as well as an embedded video.

Similarly, factor analysis relies on the amount of variance shared within each group, or factor. Through this approach, the number of observations or variables can be reduced to some type of groups, which allows for classification. We will most likely cover this approach in MKTG3850 for Autumn 2014. This entry from Statsoft provides additional discussion.

Please note that these responses appear brief.

After segmenting the market, a target market should be selected based on several considerations, including:

  • Size and growth of the market segment;
  • Cost to service the market segment;
  • Firm's ability to meet the demands of that market segment.

That is, the firm should evaluate the attractiveness of each segment before selecting the segment to target.

A Dram for Coolest Use of Cluster Analysis

Luba Gloukhov enjoys single malt scotch. While enjoying a sip, Gloukhov wondered if scotch distilleries could be clustered.

Using a dataset and K-means clustering, Gloukhov was able to create groups, or segments, of single malt scotch.
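To give a feel for what K-means does, here is a bare-bones sketch in Python. It is not Gloukhov's code or dataset: the flavor scores are invented, the `kmeans` function is a hypothetical helper, and starting from the first k points is a simplification (real tools usually pick starting points at random and try several runs).

```python
import math

def kmeans(points, k, iterations=100):
    """Plain k-means: returns (centroids, labels). Starts from the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels

# Hypothetical (smoky, sweet) scores on a 0-4 scale for six distilleries.
flavor_scores = [(4, 1), (1, 4), (4, 0), (0, 3), (3, 1), (1, 3)]

centroids, labels = kmeans(flavor_scores, k=2)
print(labels)
```

Each label says which group a distillery landed in; the centroids describe the "typical" flavor profile of each group.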

The analysis is well done and well interpreted. It would have been easier if Marketing Engineering had been used.

Hopefully, Gloukhov will inspire other people to explore cluster analysis to form groups that would not otherwise be apparent.

Cluster analysis would provide a better way to organize this bar.

Cluster Analysis Thought

The IBM dataset contains more observations than are probably necessary to conduct cluster analysis. Instead of using the full sample, a savvy researcher would split the sample. Splitting the sample would create two outcomes. One, the reduced dataset would be more manageable. Two, a test/retest would be possible. That is, develop a cluster model with the first half and confirm (or disconfirm) it with the second half.
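One simple way to split a sample is a random half-and-half cut, sketched below in Python. The respondent IDs stand in for rows of the dataset, `split_sample` is a hypothetical helper name, and the fixed seed is only there so the split can be reproduced.

```python
import random

def split_sample(rows, seed=42):
    """Shuffle a copy of the rows and cut it into two halves."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

respondent_ids = list(range(1, 11))  # stand-in for dataset rows
build_half, confirm_half = split_sample(respondent_ids)

print(len(build_half), len(confirm_half))
```

The clusters would be developed on `build_half` and then checked against `confirm_half`; if similar groups emerge in both, the solution is more believable.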

As to the process for splitting the sample, plenty of resources exist on sampling procedures.

Make Clusters, Not Lists

The appearance of autumnal colors signals the annual onslaught of best college lists. US News & World Report, which seemingly exists solely as a brand for college rankings, provides one of the better known lists along with the more comprehensive Princeton Review lists, and the zanier Washington Monthly list. Joe Nocera dives head first into this pile of lists with this rank money quote:

U.S. News likes to claim that it uses rigorous methodology, but, honestly, it’s just a list put together by magazine editors. The whole exercise is a little silly. Or rather, it would be if it weren’t so pernicious.

US News & World Report's ranking method has been under attack since the list made its first appearance. The attacks do have merit, as Nocera's column illustrates.

Furthermore, schools game these lists including public schools like Clemson, selective private schools like Claremont McKenna, and law schools like the University of Illinois. Other tricks to improve spots on these rankings include encouraging lots of students to apply for an admission spot, and ignoring test scores from certain applicant groups.

The rankings remain silly. As with any ranking, the distance between observations has been lost. For example, US News & World Report ranked Yale third behind Harvard and Princeton. However, no one knows if the distance between Yale and the pair of Harvard and Princeton is one point or a hundred points. Yale's closeness to Harvard and Princeton remains a mystery.
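The lost-distance point can be shown with a tiny Python sketch. Both sets of scores are made up and do not reflect any actual US News data; `rank_schools` is a hypothetical helper name.

```python
def rank_schools(scores):
    """Return school names ordered from highest score to lowest."""
    return sorted(scores, key=scores.get, reverse=True)

# Two invented score sets: one where Yale is close, one where it is far behind.
tight_race = {"Harvard": 100, "Princeton": 99, "Yale": 98}
blowout = {"Harvard": 100, "Princeton": 99, "Yale": 40}

same_ranking = rank_schools(tight_race) == rank_schools(blowout)
print(same_ranking)
```

Both score sets produce the identical ordering, so a reader who sees only the ranking cannot tell which situation is the real one.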

Ultimately, these lists tell us nothing about the schools. We want to believe that the Ivy League schools are better than their brethren who comprise the New England Small College Athletic Conference. The comparison, though, is apples (e.g., Harvard, Princeton, Penn) to oranges (e.g., Williams, Amherst, Trinity). Yet, by definition, rankings impose a structure on the roughly 500 institutes of higher learning based in the United States.

Instead of arbitrarily and capriciously creating a ranking out of whole cloth, high school juniors and their parents would be better served if schools were grouped, or clustered, by some attributes. The attributes could be varied and vast, including:

  • Acceptance rate
  • Median ACT/SAT scores
  • Mean, median amount of student loan debt
  • Number of states and countries represented in prior entering classes
  • Mean, median time to degree completion
  • Number of graduate degrees offered
  • Presence of professional graduate schools

The above list of attributes represents a starting point, not an ending point. With the attributes coded and measured, a savvy analyst could use factor analysis, which SPSS offers, or discriminant analysis, which Marketing Engineering offers, to create the clusters. Finally, the analyst would label, or name, each cluster with something clever such as "Leafy Rural Elite" and "Everybody Welcomed."
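Before attributes on such different scales can be compared, they usually need to be put on a common footing. The Python sketch below standardizes two attributes as z-scores and then measures how far apart schools sit. The three schools and their values are hypothetical, `standardize` is an illustrative helper, and the choice of z-scores with straight-line (Euclidean) distance is an assumption rather than the only option.

```python
import math
import statistics

def standardize(column):
    """Convert raw values to z-scores so no attribute dominates by scale.

    Assumes the column is not constant (its standard deviation is not zero).
    """
    mean = statistics.mean(column)
    spread = statistics.pstdev(column)
    return [(value - mean) / spread for value in column]

# Hypothetical schools and values, for illustration only.
schools = ["School A", "School B", "School C"]
acceptance_rate = [0.10, 0.12, 0.80]   # share of applicants admitted
median_act = [33, 32, 21]              # median ACT score

z_columns = [standardize(acceptance_rate), standardize(median_act)]
profiles = list(zip(*z_columns))       # one standardized row per school

dist_ab = math.dist(profiles[0], profiles[1])
dist_ac = math.dist(profiles[0], profiles[2])
print(dist_ab < dist_ac)
```

Schools A and B sit close together while School C sits far from both, which is the kind of pattern a clustering tool would turn into groups.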

Also, the cluster approach minimizes schools' attempts to game the system. A school could continue to shade reporting numbers. With so many attributes, these efforts would have minimal effect.

Finally, this analytical tool would reveal groupings that otherwise would not be apparent. For example, based on these attributes, Miami University could be considered part of the same group as the University of Texas and the University of Michigan. Initially, we would be hesitant to create such a grouping. By analyzing the data, we would have support for such an argument.

Looking at each cluster, high school juniors could rely on more meaningful information. It makes no difference that Harvard is ranked ahead of Yale on some criterion. A cluster that includes Harvard and Yale, along with a separate and distinct cluster that contains Eastern Connecticut State and Midwestern State, would give a better indication of where students and their families should begin collecting information, scheduling campus tours, and focusing their efforts.

Fruit by the Clusterful