math - Where do mathematical algorithms for Reddit's ranking, as an example, come from?
I was recently looking at Reddit's algorithm for determining what makes a post a "hot" topic and what content is suitable for the Reddit homepage.
The article I was reading is here: http://amix.dk/blog/post/19588
I've noticed that they use mathematical logarithms, and that they have created some kind of mathematical function to determine the hotness/relevance of a post.
In the formulas used, where does each of the mathematical components come from, and how do they know to use them?
Thank you!
-- bakz
Edit: To clarify, I just graduated from high school, and I apologize if the answer to this question seems pretty obvious. Thanks again!
I'll tackle the first formula, for the "hotness" of posts. The formulas come from the requirements. The designers of Reddit will have thought about what they wanted to achieve, and then designed the formulas accordingly. I can't tell you what requirements they actually had in mind, but we can look at the implementation and guess that they wanted a system along these lines:
1. Scores shouldn't need to be recomputed unless the number of votes changes. This reduces the number of changes to the database, and makes it easier to achieve consistency if the data is replicated. (So a scoring system based on scores getting lower as an article ages is no good.)
2. If two stories are equally old, the one with more upvotes should rank higher. (So there needs to be a contribution from votes.)
3. The more upvotes a story gets, the longer it should remain near the top of the ranking.
4. Old stories shouldn't stay at the top of the rankings forever, even if they had lots of upvotes. After a while (a day or two), new stories need to outrank them. (So there needs to be a contribution from the date, and it must outweigh the score due to votes soon enough, no matter how many votes a story gets.)
5. Stories with more downvotes than upvotes should not appear in the rankings at all.
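Now let's look at the formula, log z + yt / 45000, and see how it satisfies these requirements. (As the linked article defines them: x is upvotes minus downvotes, y is the sign of x, z is the larger of |x| and 1, t is the submission time in seconds measured from a fixed reference date, and the logarithm is base 10.)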
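1. If the number of votes does not change, then z, y and t are unchanged, so the score is unchanged. This satisfies requirement (1).
2. If two stories have the same age, they have the same value of t. The one with more upvotes has a higher value of z, and since log is monotonic, it has the higher score. This satisfies requirement (2).
3. The more upvotes a story has, the higher its z, and so the longer it takes until a story with a higher t can outrank it. This satisfies requirement (3).
4. The logarithm function grows more slowly as its argument gets larger (take a look at its graph). So a story needs more and more upvotes over time to keep up with newer stories. This satisfies requirement (4).
5. If a story has more downvotes than upvotes, then y = −1, so the term yt / 45000 is large and negative. log z is tiny by comparison, so the score is negative. This satisfies requirement (5).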
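To make the checks above concrete, here is a minimal Python sketch of the formula. It is not Reddit's actual source code: the function name hot, the constant names, and the sample submission time t0 are my own, and I'm assuming t is measured in seconds from some fixed reference date (any fixed date works when comparing posts against each other).

```python
from math import log10

SCALE = 45000   # the constant from the formula, in seconds
DAY = 86400     # seconds in a day


def hot(ups, downs, t):
    """Return log10(z) + y * t / 45000 for a post submitted at time t.

    t is seconds since a fixed reference date (an assumption of this sketch).
    """
    x = ups - downs              # net votes
    y = (x > 0) - (x < 0)        # sign of x: 1, 0 or -1
    z = max(abs(x), 1)           # clamp to 1 so the logarithm is defined
    return log10(z) + y * t / SCALE


t0 = 150_000_000  # hypothetical submission time, chosen only for illustration

# Requirement (2): same age, more upvotes ranks higher.
print(hot(200, 0, t0) > hot(100, 0, t0))

# Requirements (3) and (4): a story that is one day newer beats an older
# story with 5 times the votes, but not one with 100 times the votes.
print(hot(10, 0, t0 + DAY) > hot(50, 0, t0))
print(hot(10, 0, t0 + DAY) < hot(1000, 0, t0))

# Requirement (5): more downvotes than upvotes gives a negative score.
print(hot(3, 10, t0) < 0)
```

Every comparison prints True. Requirement (1) holds by construction, because the function only reads the vote counts and the submission time, never the current time.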
The constant 45,000 is a scale factor that brings upvotes and age into balance. There are 86,400 seconds in a day, so t gets larger by that amount each day. Dividing 86,400 by 45,000 gives 1.92, which means that one day's relative newness is worth a factor of 10^1.92 ≈ 83 in votes, and two days' relative newness is worth a factor of about 7,000.
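The arithmetic is easy to check yourself. In this small sketch the helper name vote_multiplier is my own invention; it assumes both stories have a positive net vote count, so that only log z and t / 45000 matter.

```python
SECONDS_PER_DAY = 86400
SCALE = 45000  # the constant from the formula


def vote_multiplier(days):
    """Factor by which an older story's net votes must exceed a newer
    story's for the two scores to tie, given an age gap of `days` days."""
    return 10 ** (days * SECONDS_PER_DAY / SCALE)


for days in (1, 2):
    print(days, round(vote_multiplier(days)))
# 1 83
# 2 6918
```

So a story posted yesterday needs about 83 times the net votes of a story posted today to tie with it, and one posted two days ago needs nearly 7,000 times as many.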