Choosing a Machine Learning Classifier

Translation in Chinese:


Data Science: What are the best blogs for data miners and data scientists to read?


How to hire data scientists and get hired as one


As you might have heard before if you read McKinsey reports, the New York Times or just about any technology news site, data scientists are in high demand. Heck, the Harvard Business Review called it the sexiest job of the 21st century. But landing a gig as a data scientist isn’t easy — especially a top-notch gig at a major web or e-commerce company where merely talented people are a dime a dozen.

However, companies are starting to talk openly about what they look for in data scientists, including the skills someone should have and what they’ll need to know to survive an interview. I spent a day at the Predictive Analytics World conference on Monday and heard both Netflix and Orbitz give their two cents. That’s also the same day Hortonworks published a blog post about how to build a data science team.

Granted that…


Misc to read

How to pick good metrics:
4 Business Metrics You Can’t Afford to Ignore
Measuring What Matters: How To Pick A Good Metric
The Lean Analytics Cycle: Metrics > Hypothesis > Experiment > Act case study
Amazon’s business strategy and revenue model: A history and 2014 update
All the posts from this author are worth reading.

Basic statistics
How to deal with multicollinearity when performing variable selection?
Logistic regression assumptions
Do your data violate linear regression assumptions?
Testing the assumptions of linear regression
One Sample z-Test for Proportions
A website that may help:
Linear mixed model implementation in lme4
Fitting multilevel models in complex survey data with design weights: Recommendations
Comparison of two multi-level models and Application of WinBUGS

Venture Capital

How to Break Into Venture Capital

What Do You Do as a Venture Capitalist?


Two blogs (read later)

hierarchical survival models

posterior predictive checks



1. SQL: introduction to databases

2. Books to read:

  • Mixed effect models: Data Analysis Using Regression and Multilevel/Hierarchical Models
  • Business questions: Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking
  • Brain teasers: A Practical Guide to Quantitative Finance Interviews, Chapter 2
  • Machine Learning / Data Mining: The Elements of Statistical Learning

3. Learn Python

4. Optional

Big data and Hadoop:

  • MapReduce abstraction
  • Distributed file system
  • MapReduce process
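
As a study aid for the map reduce topics above, here is a minimal single-process sketch of the map, shuffle, and reduce steps, using word counting as the example. It is a toy illustration in plain Python, not Hadoop's API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this note, and a real cluster would run the map and reduce steps in parallel over blocks stored in a distributed file system.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit one (word, 1) pair for every word in a document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of strings."""
    # In a real framework each document (or file block) is mapped on a
    # separate worker; here everything runs in one process.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big Hadoop"]))
```

The point of the abstraction is that only map_phase and reduce_phase are problem-specific; the grouping in between is generic, which is what lets a framework distribute the work.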

Resume preparation