What is product quality?

What are examples or evidence of machine-learning-based movie/music/book recommendation systems that contribute to a better product (either higher in quality or more profitable) than human-based ones?

  • For example, if the winning Netflix Prize solution (awarded in 2009) was never actually implemented (http://www.forbes.com/sites/ryanholiday/2012/04/16/what-the-failed-1m-netflix-prize-tells-us-about-business-advice/), are ML-based recommendation systems for movies/books/music really only useful indirectly, mostly for research purposes?

  • Answer:

    I imagine that Netflix expected that they might not be able to use the winning solution. The average solution was some hairy blend of R, Perl, Java and a manual workflow running over days -- utterly unready for production. I imagine it was conceived in large part as marketing. It succeeded in establishing a brand as a serious high-tech company. It might have been worth it for recruiting alone. Happily, it also brought interest in unsupervised learning in general (clustering, collaborative filtering), which is still underdeveloped compared to supervised learning (classifiers, regression).

    The key problem with the Netflix Prize was that it required optimizing only for minimum reconstruction error of some input. Not scalability, speed, or maintainability. Not even, actually, the goal of recommendations: picking the top 1-5 things you will most likely interact with now. This was probably inevitable for this contest, where some understood and objective criterion had to be established. But academic papers have historically also optimized for these same things: improve RMSE by 2% on a known small data set and publish.

    The Netflix quote in the Forbes article highlights this truth: accuracy is not the entire story in the real world, and diminishing returns come quickly. Their original Cinematch system was not exactly cutting edge, but was by definition about 85% as good as the money-is-no-object solution.

    I see this repeatedly with users of Apache Mahout. The mature recommender parts are not sexy: solid, general implementations of simple approaches. But they are reasonably well packaged and intended to be consumed in a real production system. As an easy "85%" solution, Mahout turns out to be a great point on the cost-performance curve. ML tools can get this right and add value.
    But in 2013 they will have to:

    • Not require a PhD on staff
    • Be accessible to a developer within a day of attention and effort
    • Answer in real time, like we expect every other bit of infrastructure to do
    • Be readily evaluatable -- should help you tune and evaluate its effectiveness
    • Handle big data that's been so meticulously stored, with the view that eventually ML would turn it to $$$

    Optimal vs merely good is just not the priority right now -- it's getting that first easy solution that provides most of the business value.
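
    The gap between "minimum reconstruction error" and "picking the top few things" mentioned in the answer can be made concrete with a small sketch. Everything below is hypothetical and for illustration only: the item names, the ratings, the two "models", and the 4-star threshold for "liked" are assumptions, not data from Netflix or Mahout. It compares RMSE with precision-at-k, one common top-N metric, for a single user.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error over the items that have a known rating."""
    squared = [(predicted[item] - actual[item]) ** 2 for item in actual]
    return math.sqrt(sum(squared) / len(squared))


def precision_at_k(predicted, liked, k):
    """Fraction of the k highest-scored items that the user actually liked."""
    top_k = sorted(predicted, key=predicted.get, reverse=True)[:k]
    return sum(1 for item in top_k if item in liked) / k


# Hypothetical held-out ratings for one user (1-5 stars).
actual = {"A": 5.0, "B": 4.5, "C": 2.0, "D": 1.0, "E": 3.0}

# Assumption: "liked" means a rating of 4 stars or more.
liked = {item for item, rating in actual.items() if rating >= 4.0}

# Model X: close on most items, but overrates the mediocre item "E",
# which pushes it to the top of the ranking.
model_x = {"A": 4.2, "B": 4.1, "C": 2.0, "D": 1.0, "E": 4.3}

# Model Y: larger errors on every item, but the two liked items rank first.
model_y = {"A": 3.6, "B": 3.4, "C": 1.0, "D": 2.0, "E": 2.5}

if __name__ == "__main__":
    for name, model in (("X", model_x), ("Y", model_y)):
        print(name,
              "RMSE:", round(rmse(model, actual), 3),
              "precision@2:", precision_at_k(model, liked, 2))
```

    In this toy setup model X has the lower RMSE, yet only one of its top two picks is something the user liked, while both of model Y's are. A contest scored on RMSE alone would prefer X; a product that shows two recommendations might well prefer Y.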

Sean Owen at Quora
