math - Where do mathematical algorithms for Reddit's ranking, as an example, come from?


I was recently looking at Reddit's algorithm for determining what makes a post a "hot" topic, and what content is suitable for the Reddit homepage.

The article I was reading is here: http://amix.dk/blog/post/19588

I've noticed that they have mathematical algorithms, and that they have created some kind of mathematical function to determine the hotness/relevance of a post.

In the formulas used, where does each of the mathematical components come from, and how do they know to use them?

Thank you!

-- bakz

Edit: To clarify, I just graduated high school, so I apologize if the answer to this question seems pretty obvious. Thanks again!

I'll tackle the first formula, for the "hotness" of posts. The formulas come from the requirements. The designers of Reddit will have thought about what they want to achieve, and designed the formulas accordingly. I can't tell you what requirements they had in mind, but I can look at the implementation and guess that they wanted a system along these lines:

  1. Scores shouldn't need to be recomputed unless the number of votes changes. This reduces the number of changes to the database, and makes it easier to achieve consistency if the data is replicated. (So a scoring system based on scores getting lower as an article ages is no good.)

  2. If two stories are equally old, the one with more upvotes should rank higher. (So there needs to be a contribution from votes.)

  3. The more upvotes a story gets, the longer it should remain near the top of the ranking.

  4. Old stories shouldn't stay at the top of the rankings forever, even if they had lots of upvotes. Soon (after a day or two), new stories need to outrank them. (So there needs to be a contribution from the date, and it must outweigh the score due to votes fairly soon, no matter how many votes a story gets.)

  5. Stories with more downvotes than upvotes should not appear in the rankings at all.

Now let's look at the formula: log z + yt / 45000 (with log taken to base 10), and see how it satisfies these requirements.

  1. If the number of votes does not change, then z, y and t are unchanged, so the score is unchanged. This satisfies requirement (1).

  2. If two stories have the same age, they have the same value of t. The one with more upvotes has a higher value of z, and since log is monotonic, it has a higher score. This satisfies requirement (2).

  3. The more upvotes a story has, the higher its z, and so the longer it takes until a story with a higher t can outrank it. This satisfies requirement (3).

  4. The logarithm function grows more slowly as its argument gets larger (take a look at a graph). So a story needs more and more upvotes as time goes on to keep up with newer stories. This satisfies requirement (4).

  5. If a story has more downvotes than upvotes, then z ≥ 1 and y = −1, so the time term −t / 45000 dominates and the score is negative. This satisfies requirement (5).
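To make the five checks above concrete, here is a minimal Python sketch of the formula. It is not Reddit's actual code. The definitions of z, y and t are my reading of the linked article (z is the absolute net vote count, at least 1; y is the sign of the net vote count; t is the submission time in seconds after some fixed reference date), and the function name `hot` and its parameters are hypothetical.

```python
import math

SCALE = 45000  # the scale factor from the formula, in seconds


def hot(ups, downs, t):
    """Sketch of: log10(z) + y * t / 45000.

    Assumptions (taken from the linked article, not verified against Reddit):
      x = ups - downs              net votes
      z = max(|x|, 1)              so that log10(z) is always defined
      y = sign(x)                  +1, 0 or -1
      t = submission time in seconds after a fixed reference date
    """
    x = ups - downs
    z = max(abs(x), 1)
    y = (x > 0) - (x < 0)  # sign of x as an integer
    return math.log10(z) + y * t / SCALE


if __name__ == "__main__":
    t = 1_000_000  # an arbitrary submission time, for illustration only
    print(hot(100, 0, t) > hot(10, 0, t))        # same age, more upvotes
    print(hot(1, 0, t + 86_400) > hot(50, 0, t)) # a day newer vs. 50 votes
    print(hot(3, 10, t) < 0)                     # more downvotes than upvotes
```

Note that the score depends only on the vote counts and the fixed submission time, never on the current time, which is what lets requirement (1) hold.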

The constant 45,000 is a scale factor that brings upvotes and age into balance. There are 86,400 seconds in a day, so t gets larger by that amount each day. Dividing by 45,000 gives 1.92, which means that one day's relative newness is worth 10^1.92 ≈ 83 votes, and two days' relative newness is worth about 7,000 votes.
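The arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation described here; the helper name `votes_equivalent` is made up for illustration.

```python
SECONDS_PER_DAY = 86_400
SCALE = 45_000  # the scale factor from the formula


def votes_equivalent(days):
    """Net upvotes whose log10 equals the score gained from `days` of newness."""
    return 10 ** (days * SECONDS_PER_DAY / SCALE)


if __name__ == "__main__":
    for days in (1, 2):
        print(days, round(votes_equivalent(days)))
```

One day comes out at roughly 83 votes and two days at roughly 6,900, which is the "about 7,000" quoted above. Each extra day multiplies the number of votes needed by the same factor of about 83, which is why old stories cannot hold on for long.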

