Data Mining vs. Research Questions

  • Section 2.2, pp. 13-16
  • Data mining is good at finding patterns and making predictions when conditions are stable
  • Not good at improving understanding or theory. Main reasons:
    1. Answers what is in the data, not why. Correlation != causation
    2. Does not deal with abstraction: it can surface observations but cannot develop theory
    3. Produces false positives: patterns found in the sample that do not hold outside it. Test everything and random relationships will eventually turn up
  • Can lead to Research Questions
    • Come to the data without a theory and notice interesting patterns
    • Confirm the patterns hold up in other data, i.e. replicate them
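
The false-positive and replication points can be illustrated with a small simulation. This is a minimal sketch, not something from the reading: the outcome and all predictors are pure random noise, and the sample size (30), number of predictors (1000), seed, and cutoff (|r| > 0.361, roughly the conventional 5% level for 30 observations) are illustrative assumptions. Function and variable names are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def draw_sample(rng, n_obs, n_predictors):
    """Draw one outcome and many predictors, all independent noise."""
    outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
    predictors = [[rng.gauss(0, 1) for _ in range(n_obs)]
                  for _ in range(n_predictors)]
    return outcome, predictors


N_OBS = 30          # assumed sample size
N_PREDICTORS = 1000  # "testing everything"
THRESHOLD = 0.361   # assumed cutoff, about the 5% level for n = 30

rng = random.Random(0)  # fixed seed so the run is repeatable

# Step 1: mine the original sample for "interesting" correlations.
outcome, predictors = draw_sample(rng, N_OBS, N_PREDICTORS)
found = [j for j in range(N_PREDICTORS)
         if abs(pearson(predictors[j], outcome)) > THRESHOLD]

# Step 2: check the same predictors against fresh data (replication).
new_outcome, new_predictors = draw_sample(rng, N_OBS, N_PREDICTORS)
replicated = [j for j in found
              if abs(pearson(new_predictors[j], new_outcome)) > THRESHOLD]

print(f"patterns found in original sample: {len(found)}")
print(f"patterns that held up in new data: {len(replicated)}")
```

No real relationship exists in this data, so every pattern found in step 1 is a false positive. With a 5% cutoff, about 5% of the predictors are expected to pass by chance alone, and only a small fraction of those should pass again in the fresh sample, which is why replication works as a filter.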