How to measure ensemble classifier performance?

When using multiple classifiers, how do you measure the ensemble's performance [SK-Learn]?

  • How do I get 'combined' performance metrics for combined classifiers? I have a classification problem (predicting whether a sequence belongs to a class or not), for which I decided to use multiple classification methods in order to help filter out false positives. (The problem is in bioinformatics: classifying protein sequences as neuropeptide precursor sequences; see http://bioinformatics.oxfordjournals.org/content/early/2013/12/13/bioinformatics.btt725.abstract if anyone's interested.)

    The classifiers have roughly similar performance metrics (83-94% accuracy/precision/etc. on the training set with 10-fold CV), so my 'naive' approach was simply to use several classifiers (Random Forests, ExtraTrees, SVM (linear kernel), SVM (RBF kernel) and GRB) and take a simple majority vote.

    My question is: how can I get the performance metrics for the different classifiers and/or their combined (voted) predictions? That is, I want to see whether using multiple classifiers improves my performance at all, and which combination of them does. My intuition is to use the ROC score, but I don't know how to "combine" the results and get it for a combination of classifiers. (I can already see the ROC curve for each classifier alone; what I want is the ROC curve or AUC on the training data for combinations of classifiers.)

    (I currently filter the predictions using "predict probabilities", arbitrarily discarding results with a predicted score below 0.85. Additional filtering is based on how many classifiers agreed on a protein's positive classification.) (The implementation is at http://neuropid.cs.huji.ac.il/ .) The whole shebang is implemented using scikit-learn and Python. Thank you very much!!

  • Answer:

    You can do three things, in order of difficulty:

    Bagging

    Evaluate it the same way you have evaluated each of the classifiers themselves. After you take the majority vote, you have a set of predicted classes, right? Get your confusion matrix and calculate precision, recall, etc. (a minimal scikit-learn sketch of this evaluation follows the answer). This is an "equal-weighted" majority-vote evaluation, since each model's vote has the same importance; averaging models this way mainly reduces variance. You can think about it like this: each model performs better in some scenarios and worse in others, and with bagging you are "averaging out" the weaknesses of each model with the strengths of the others. I'd randomly choose a few subsets of models and compare their performance against the majority vote over all models to see which is better (in the same way a random forest works).

    Boosting

    With boosting, you have each model specialize on certain subsets of examples. At a high level, you use a new model to predict the instances that another model misclassified. Theoretically, boosting reduces both variance and bias and is "one step" better than bagging. But in my experience the extra incremental work of implementing boosting hasn't been worth the small gains. The people who won the Netflix prize might disagree: http://www.netflixprize.com/assets/GrandPrize2009_BPC_BellKor.pdf Great reference: http://people.cs.pitt.edu/~milos/courses/cs2750-Spring04/lectures/class23.pdf

    Stacking

    Fundamentally, stacking is a form of meta-learning: it uses the predictions of each model as inputs to train a new model that outputs a "combined" prediction. It is a learning approach to the ensemble problem, i.e. "learned" bagging/boosting. This is the most difficult approach, since you'll have to do some feature engineering and tune the new model while keeping in mind that it is a hierarchical model operating on the outputs of several other models. I've seen extremely varied approaches to this, and most have been very effective. It also automates a lot of the manual work you would otherwise do with bagging and boosting, such as model selection and weighting. Difficult but ideal; in the end, you get out of this approach what you put in.

    TL;DR

    Start with bagging, see how well your models do and how much more you want to invest, then decide whether you need anything more (sketches of each approach follow the answer).

Ferris Jumah at Quora
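A minimal sketch of the equal-weighted majority-vote evaluation described in the answer, assuming the five classifiers from the question (with "GRB" read as gradient boosting), binary 0/1 labels, and a held-out test split; all variable names and parameters here are illustrative assumptions, and the 0.85 threshold only mirrors the question:

    import numpy as np
    from sklearn.ensemble import (RandomForestClassifier, ExtraTreesClassifier,
                                  GradientBoostingClassifier)
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import confusion_matrix, classification_report, roc_auc_score

    # X, y: feature matrix and 0/1 labels for the protein sequences (assumed to exist).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0)

    models = {
        "rf": RandomForestClassifier(n_estimators=200, random_state=0),
        "extra_trees": ExtraTreesClassifier(n_estimators=200, random_state=0),
        "svm_linear": SVC(kernel="linear", probability=True, random_state=0),
        "svm_rbf": SVC(kernel="rbf", probability=True, random_state=0),
        "gbm": GradientBoostingClassifier(random_state=0),
    }
    for clf in models.values():
        clf.fit(X_train, y_train)

    # Equal-weighted hard vote: every model predicts a class and the majority wins.
    votes = np.array([clf.predict(X_test) for clf in models.values()])
    majority = (votes.mean(axis=0) >= 0.5).astype(int)

    # Evaluate the combined predictions exactly as you would a single classifier.
    print(confusion_matrix(y_test, majority))
    print(classification_report(y_test, majority))

    # Soft vote: average the predicted probabilities. This gives one score per
    # sequence that can be thresholded (e.g. at 0.85, as in the question) and
    # passed to ROC/AUC, which answers "what is the AUC of the combination?".
    proba = np.mean([clf.predict_proba(X_test)[:, 1] for clf in models.values()], axis=0)
    print("ensemble AUC:", roc_auc_score(y_test, proba))

    # To see which combination of models votes best, repeat the block above over
    # subsets of `models`, e.g. itertools.combinations(models.items(), 3).

Scikit-learn also ships a VotingClassifier that wraps the hard/soft vote, if you prefer not to hand-roll the combination step.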

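The boosting section of the answer is high level; as a concrete illustration, one of scikit-learn's built-in boosted ensembles can be dropped into the same evaluation. Note this boosts a single base learner rather than combining the question's heterogeneous models, and it reuses the hypothetical X_train/X_test split from the sketch above:

    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.metrics import roc_auc_score

    # AdaBoost re-weights the training examples so that each new weak learner
    # focuses on the instances the previous ones misclassified -- the "specialize
    # on the cases another model got wrong" idea from the answer.
    boosted = AdaBoostClassifier(n_estimators=200, random_state=0)
    boosted.fit(X_train, y_train)
    print("AdaBoost AUC:", roc_auc_score(y_test, boosted.predict_proba(X_test)[:, 1]))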

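For stacking, one way to realise "train a new model on the other models' predictions" is to build meta-features from cross-validated out-of-fold probabilities and fit a simple meta-model on them. A minimal sketch, assuming the same hypothetical models dict and splits as above and a recent scikit-learn that provides cross_val_predict (newer releases also package this pattern as StackingClassifier):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_predict
    from sklearn.metrics import roc_auc_score, classification_report

    # Meta-features: each column is one base model's out-of-fold predicted
    # probability on the training set. Out-of-fold predictions keep the
    # meta-model from learning on predictions made for data the base model
    # has already seen.
    meta_train = np.column_stack([
        cross_val_predict(clf, X_train, y_train, cv=10, method="predict_proba")[:, 1]
        for clf in models.values()
    ])

    # The meta-model (the "stacker") learns how to weight and combine the base models.
    stacker = LogisticRegression()
    stacker.fit(meta_train, y_train)

    # At test time, the base models (already fitted on X_train above) produce the
    # features the stacker needs.
    meta_test = np.column_stack([
        clf.predict_proba(X_test)[:, 1] for clf in models.values()
    ])
    stacked_proba = stacker.predict_proba(meta_test)[:, 1]

    print("stacked AUC:", roc_auc_score(y_test, stacked_proba))
    print(classification_report(y_test, (stacked_proba >= 0.5).astype(int)))

A plain logistic regression is a common choice of meta-model because it effectively learns a weighted vote over the base classifiers, which makes it easy to compare against the equal-weighted majority vote.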