Many websites now score their content, whether they are e-commerce, information-sharing, or download sites. These scores fall into two main categories. The first is scores that users give to content, which mainly reflect user experience, such as product ratings on e-commerce sites or content ratings on content-sharing sites. This is currently the most common scoring model, and the overall score is relatively simple to calculate: in most cases it is the mean of all users' scores. The second is scores that the website itself assigns to content, based mainly on historical user behavior data, such as rating how popular a piece of content is from how often users access it.
This article introduces the second kind of scoring model. The score scale is relatively fixed (100 points, 10 points, or 5 points), whereas the behavior data that users generate for each piece of content varies widely in magnitude: it may be in the tens, the thousands, or even the millions. The key problem to solve here is how to convert these data into the standard scoring scale so that the final score distribution is reasonable and effective, and so that genuinely high-quality content gets a high score and is recommended to users.
Content scoring example
Before introducing the case, we should first describe the application environment and the specific requirements. Suppose there is a content-sharing website that needs to score its content on a 5-point scale, so that each piece of content can receive only one of five scores, 1 to 5. The score shows how popular each piece of content is and gives users a reference for choosing what to read.
This is one of the simplest applications of content scoring. The requirement is already clear about both the purpose of the score, which is to distinguish content by popularity, and the final presentation, which is a 5-point scale. For such a well-defined data requirement, we can select metrics, build a model, and eventually output results.
1. Select the metric
Evaluating content popularity seems quite simple: wouldn't it be enough to use page views (PV) directly as the metric? Indeed, PV is a good choice and the simplest one, but there are better options: the number of visits (Visits) and the number of unique visitors (UV). Both of these exclude the effect of a user refreshing the same content repeatedly within a short time, so we may choose UV as the evaluation metric.
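The difference between PV and UV can be seen in a small sketch. The record format (pairs of user ID and content ID) and the function name `count_pv_uv` are assumptions made for illustration, not part of any particular analytics tool.

```python
from collections import defaultdict


def count_pv_uv(page_views):
    """Return {content_id: (pv, uv)} from (user_id, content_id) records.

    PV counts every view; UV counts each user once per content item.
    The record layout is a hypothetical one chosen for this example.
    """
    pv = defaultdict(int)
    users = defaultdict(set)
    for user_id, content_id in page_views:
        pv[content_id] += 1
        users[content_id].add(user_id)
    return {cid: (pv[cid], len(users[cid])) for cid in pv}


# User "u1" refreshes content "a" three times in a row.
log = [("u1", "a"), ("u1", "a"), ("u1", "a"), ("u2", "a"), ("u3", "b")]
stats = count_pv_uv(log)
print(stats)
```

Here content "a" has a PV of 4 but a UV of only 2, because the repeated refreshes by the same user are counted once.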
2. Build the scoring model
Now comes the key part. With popularity established as what we are evaluating, we first need to eliminate the metric's unit of measurement and then keep the distribution of scores within the required range: 1-5.
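As a rough illustration of these two requirements, the sketch below uses min-max scaling, which is one common way to remove units and map values into a fixed range. It is an assumption chosen for the example, not necessarily the method used in the rest of this article, and the function name `min_max_score` and the sample UV values are hypothetical.

```python
def min_max_score(uv_by_content, low=1, high=5):
    """Map raw UV counts to integer scores in [low, high] via min-max scaling."""
    values = list(uv_by_content.values())
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # All items are equally popular; giving them the lowest score
        # is an arbitrary choice made for this sketch.
        return {cid: low for cid in uv_by_content}
    scores = {}
    for cid, uv in uv_by_content.items():
        # Unit-free value in [0, 1].
        scaled = (uv - smallest) / (largest - smallest)
        # Stretch to the score range and round to a whole score.
        scores[cid] = low + round(scaled * (high - low))
    return scores


# Hypothetical UV counts spanning several orders of magnitude.
uv = {"a": 10, "b": 2500, "c": 120000, "d": 60000}
scores = min_max_score(uv)
print(scores)
```

With heavily skewed data like this, plain min-max scaling pushes most items toward the lowest score, which is why the shape of the final score distribution needs attention as well as its range.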
Eliminate the unit of measurement? Maybe you’ve already thought of it, yes >