One weekend a few months ago Vad [1] and I were hanging around the new Metamarkets office reading Hacker News.  We noticed something strange: two different headlines, both linking to identical content, resulted in dramatically different popularity ranks.  Do headlines matter so much? What drives observed popularity?

We started to investigate.

(Above: Rolling 10 days of article ranks. Click for an interactive version.)

The right way to answer this question was pretty obvious: crunch the data.  We started scraping HN titles along with article ranks and fed the resulting data into our online feature learning stack.

Below is a distilled summary of the results, our “Top Ten Hacker News Headline Hacks,” each listed with its feature weight, standard error, and p-value versus zero. A positive weight means the feature is predictive of a high article rank.

Hack #1:  Maximize Controversy

1.4 ± 0.5 [p<1e-5] | essential
1.3 ± 0.5 [p<1e-5] could
1.2 ± 0.4 [p<1e-5] problem
1.3 ± 0.8 [p<1e-5] survived the
1.0 ± 0.5 [p<1e-5] controversy
0.9 ± 0.3 [p<1e-5] impossible

Hack #2:  Question Authority

0.7 ± 0.2 [p<1e-5] why ____ future
0.4 ± 1.0 [p=0.2] the ____ behind
0.2 ± 0.3 [p=0.04] why don’t
0.1 ± 0.3 [p=0.06] | lessons

Hack #3:  Avoid False Promises

-1.5 ± 0.8 [p<1e-5] tricks
-0.7 ± 0.5 [p<1e-5] the world |
-0.7 ± 0.2 [p<1e-5] the greatest
-0.6 ± 0.3 [p<1e-5] awesome
-0.6 ± 0.7 [p=0.003] anatomy of a
-0.5 ± 0.3 [p<1e-5] guide to

Hack #4:  Short is Sweet

-0.3 ± 0.04 [p<1e-5] {# WORDS}

Hack #5:  Execution not Ideas

2.6 ± 2.1 [p<1e-5] showing
1.5 ± 0.7 [p<1e-5] | building
0.6 ± 0.3 [p<1e-5] makes
0.5 ± 0.4 [p<1e-5] starting a company
0.3 ± 0.3 [p<1e-3] join a startup

-1.1 ± 0.3 [p<1e-5] ideas
-1.1 ± 0.3 [p<1e-5] idea?

Hack #6:  Everybody Loves a Winner

1.7 ± 0.7 [p<1e-5] | ____ acquires
0.5 ± 0.3 [p<1e-5] hire
0.4 ± 0.7 [p=0.02] worth

Hack #7:  Everybody Loves Data

1.9 ± 1.8 [p<1e-4] data |
0.6 ± 0.8 [p=0.004] data –
0.5 ± 0.1 [p<1e-5] visualize data in

-1.3 ± 0.7 [p<1e-5] algorithm

Hack #8: Nobody Cares About You

-0.2 ± 0.3 [p=0.008] my startup
-0.9 ± 0.2 [p<1e-5] silicon valley

Hack #9:  Some Topics are Just Miserable

-0.4 ± 0.3 [p<1e-5] angry birds
-0.2 ± 0.1 [p<1e-5] harry potter
-0.5 ± 0.4 [p<1e-4] taxes
-1.5 ± 1.0 [p<1e-5] downtime

Hack #10:  Social is For Losers

-0.6 ± 0.9 [p=0.007] social
-0.5 ± 0.4 [p<1e-4] gamification
-0.3 ± 0.6 [p=0.04] twitter |
-2.4 ± 1.5 [p<1e-5] airbnb

Standard disclaimer: the above coefficients are provided for entertainment purposes only. Feature interactions in text are a bitch. Correlation does not imply causation. Past performance does not guarantee future success.

How We Did It

We extracted n-gram features (e.g., “Harry Potter”, “Google”, “Silicon Valley”) and skip features (e.g., “a ____ for”, “| ____ acquires”) for each title, including start- and end-of-sentence markers and, optionally, punctuation. For learning we used boosted stochastic gradient descent with logistic loss [2], predicting whether an article reached the top 20 at any point during its observed lifetime. We applied strong regularization to eliminate spurious features, and used twenty bootstrap replicates to measure the significance of coefficients and the classification accuracy.
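The feature extraction is easy to sketch. Here is a minimal, hypothetical version in Python — the function name, tokenization regex, and exact feature encodings are our illustration, not the production code:

```python
import re

def title_features(title, max_n=2):
    """Sketch of headline feature extraction: '|' marks the start and
    end of the title; n-grams up to max_n words are emitted, plus skip
    features that wildcard the middle word of each trigram, plus a
    word-count feature."""
    words = re.findall(r"[\w']+", title.lower())
    tokens = ["|"] + words + ["|"]
    feats = set()
    # n-grams, with the boundary markers included
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.add(" ".join(tokens[i:i + n]))
    # skip features, e.g. "why ____ ideas"
    for i in range(len(tokens) - 2):
        feats.add(f"{tokens[i]} ____ {tokens[i + 2]}")
    # headline length feature (see Hack #4)
    feats.add("{# WORDS}=%d" % len(words))
    return feats
```

A title like “Why don't ideas matter” then yields features such as “| why”, “why don't”, “why ____ ideas”, and “matter |”.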

For this untuned, first-pass model, we achieved 64% classification accuracy on a holdout set covering the past two months. Positive predictive value was 25.7%, negative predictive value 73.1%, sensitivity 18.2%, and specificity 80.9% [3]. Despite this weak predictive power, we found some interesting correlations, more of which we’ll release as the model improves.
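For reference, the diagnostics in [3] fall straight out of the confusion matrix on the holdout set. A quick sketch (the counts in the comment are made up for illustration):

```python
def diagnostics(tp, fp, fn, tn):
    """Predictive diagnostics from confusion-matrix counts:
    tp/fp = true/false positives, fn/tn = false/true negatives."""
    return {
        "accuracy":    (tp + tn) / (tp + fp + fn + tn),
        "ppv":         tp / (tp + fp),   # positive predictive value
        "npv":         tn / (tn + fn),   # negative predictive value
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# e.g. diagnostics(tp=50, fp=10, fn=40, tn=100) gives 75% accuracy
```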

[1] Of Koalas to the Max fame.
[2] Think: wabbit style.
[3] Predictive diagnostics.