
What do you think about the quality of existing tools for sentiment analysis (API included)?

  • This is a follow-up question to .

  • Answer:

    The quality mainly depends on the type of seed data you have. Say you have a seed set of 1,000 positive and negative reviews of toothpaste products; then you can probably classify the sentiment of new toothpaste reviews fairly accurately. The problem is that you don't always have seed data, or if you do, it isn't categorized, which makes most tools fairly inaccurate (and is why most tools have a human override that the software then uses to improve). Another problem area is tweets, which at only 140 characters offer very little text from which to determine sentiment. With the current tools you can usually get statistically significant results when looking at trends, but when you narrow down to the individual level, they aren't that accurate.
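To make the seed-data point concrete, here is a minimal sketch of how a tool might learn sentiment from categorized seed reviews: a tiny Naive Bayes classifier in pure Python, trained on a toy labeled set (the data and function names are illustrative, not any particular product's implementation):

```python
from collections import Counter
import math

def train(seed):
    """Build per-label word counts from (text, label) seed pairs."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in seed:
        counts[label].update(text.lower().split())
    totals = {label: sum(c.values()) for label, c in counts.items()}
    vocab = set(counts["pos"]) | set(counts["neg"])
    return counts, totals, vocab

def classify(text, counts, totals, vocab):
    """Score each label with add-one smoothing; return the higher-scoring one."""
    scores = {}
    for label in counts:
        score = 0.0
        for w in text.lower().split():
            score += math.log((counts[label][w] + 1) / (totals[label] + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy "seed data": categorized toothpaste reviews, as in the answer above.
seed = [
    ("great toothpaste whitens well", "pos"),
    ("love this toothpaste fresh taste", "pos"),
    ("terrible taste waste of money", "neg"),
    ("tube leaked awful product", "neg"),
]
counts, totals, vocab = train(seed)
print(classify("fresh taste love it", counts, totals, vocab))  # → pos
print(classify("awful waste", counts, totals, vocab))          # → neg
```

With only four seed reviews the classifier is fragile, which is exactly the answer's point: quality tracks the size and domain match of the categorized seed data.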

Alex Kaminski at Quora


Other answers

We've published an extensive methodology explaining how we measure the quality of our output for different projects: http://support.semantria.com/customer/portal/articles/973525-measurement-methodology---accuracy-recall-precision It allows us to estimate the accuracy, precision, and recall of our sentiment or categorization output against human raters for any given dataset.

There are a lot of tools (including ConveyAPI below - http://www.3scale.net/2012/07/converseon-enabling-the-integration-of-its-technology-into-multiple-tools-and-apps-via-api/) that claim 99% or so accuracy, precision, and recall. In reality, you can play with these numbers any way you want and make things look extremely good when you need to. The real picture is not that bright: it all depends on what kind of content you are analyzing and whether it belongs to a specific vertical domain.

Out of the box, we get 60-65% precision/recall for sentiment analysis on general Twitter data (test it for yourself - http://www.semantria.com/demo). If your content belongs to a specific vertical, you can bring that up by augmenting our base sentiment dictionary with your own training data; I've seen people reach 90-95% precision/recall on domain-specific Twitter data. Facebook content, being much cleaner and more grammatical, usually hits 70-75% out of the box, and with some training within a specific vertical you can get that to 90%. With longer-form content (blogs/news) you start at around 75-80% and can bring it to 90-95% after some work.

Now, contrary to many claims, it is virtually impossible to have 90% precision and recall out of the box for all verticals, if only because "sucks" means opposite things in "this sucks" and "my vacuum sucks well". Handling that requires contextual understanding, which nobody has been able to crack yet.
We've made our first inroads into contextual understanding of sentiment by ingesting all of Wikipedia and learning the semantic differences between, for example, coke (the drug) and coke (the drink). That understanding lets us disambiguate between the two and react properly while doing sentiment analysis.

The best way to answer your question is to try. With Semantria you can either try our web demo or simply register for a trial (http://www.semantria.com/trial), download our Excel add-in (http://www.semantria.com/excel), and process your own content. As far as I know, the other players don't expose public demos, so you would have to contact their sales teams and sit through a demo call. If you give them a dataset to process for testing purposes, make sure the timeframe is short, to avoid any manual tinkering (this is the third text-mining company I've worked for, and I've seen all kinds of things happen to client data during bake-offs).
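The coke-(drug)-vs-coke-(drink) idea can be illustrated with a toy word-sense disambiguator that picks the sense whose context profile overlaps most with the surrounding words. The sense profiles below are hand-written stand-ins for what a system might learn from Wikipedia; this is an illustrative sketch, not Semantria's actual implementation:

```python
# Hand-crafted context profiles for two senses of "coke" (illustrative only;
# a real system would derive these from a corpus such as Wikipedia).
SENSES = {
    "coke (drink)": {"bottle", "drink", "sugar", "cold", "can", "soda"},
    "coke (drug)":  {"dealer", "gram", "snort", "arrest", "police", "addiction"},
}

def disambiguate(sentence):
    """Pick the sense whose profile shares the most words with the sentence."""
    words = set(sentence.lower().split())
    return max(SENSES, key=lambda sense: len(SENSES[sense] & words))

print(disambiguate("he bought a cold can of coke from the machine"))
# → coke (drink)
print(disambiguate("police found a gram of coke on the dealer"))
# → coke (drug)
```

Once the sense is fixed, a sentiment engine can apply the right polarity rules for that sense, which is the "react properly" step the answer describes.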

Oleg Rogynskyy

If you are working with a large corpus, with individual documents of reasonable size written in proper English (or German, etc.), I believe a lot of the standard NLP toolkits and algorithms function reasonably well. However, it's my understanding that the effectiveness of existing methodologies drops off sharply for certain mediums and forms of expression. For example, I don't know of many (if any) existing tools or methodologies that handle textual sarcasm or satire well, which makes whole classes of data unusable, especially comments and posts on blogs and social media. Heck, thanks to Poe's Law, even humans have a hard time distinguishing this stuff sometimes, so it's debatable whether it's even reasonable for machines to try to determine whether a review is sarcastic.

Vaibhav Mallya

Determining quality is a subjective matter that depends largely on what's important to you. Is it precision (the degree to which records labeled as positive are not false positives)? Or recall (how many of the total positive mentions in the dataset were actually identified as positive)? You could have high precision and low recall without much difficulty, or high recall and low precision, and there are cases where either is quite acceptable. Achieving both high recall and high precision is hard - not impossible, but certainly expensive to build.

Furthermore, are all of the system's mistakes equally bad? If the system labels a positive record as neutral, is that a big problem? Probably not as big as confusing positive and negative mentions. Again, your tolerance depends on your use case.

Finally, make sure you're measuring the system's performance against human performance - and, as you know, humans don't always agree. Read this blog post from the Chief Scientist behind ConveyAPI for some good insights into how to evaluate text-analytics technologies: http://blog.converseon.com/2012/06/11/social-media-analytics-performance-measurement/ Hope this helps.
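The precision/recall definitions and the "not all mistakes are equally bad" point can both be sketched in a few lines of Python. The labels, costs, and sample data below are made up for illustration; the cost table encodes the idea that confusing positive with negative is worse than confusing positive with neutral:

```python
def precision_recall(gold, pred, target="pos"):
    """Precision and recall for one target class, given gold and predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p == target)   # true positives
    fp = sum(1 for g, p in zip(gold, pred) if p == target and g != target)
    fn = sum(1 for g, p in zip(gold, pred) if g == target and p != target)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return prec, rec

# Hypothetical cost table: pos<->neg confusions cost twice as much as
# confusions involving neutral.
COST = {("pos", "neg"): 2.0, ("neg", "pos"): 2.0}

def weighted_error(gold, pred):
    """Sum mistake costs, defaulting to 1.0 for confusions not in COST."""
    return sum(COST.get((g, p), 1.0) for g, p in zip(gold, pred) if g != p)

gold = ["pos", "pos", "neg", "neutral", "pos", "neg"]
pred = ["pos", "neutral", "neg", "neutral", "neg", "neg"]
prec, rec = precision_recall(gold, pred, "pos")
print(prec, rec)                    # high precision, low recall for "pos"
print(weighted_error(gold, pred))   # pos→neutral costs 1.0, pos→neg costs 2.0
```

Here every record predicted "pos" really is positive (precision 1.0), but two of three positives were missed (recall 1/3), which is exactly the high-precision/low-recall trade-off described above.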

Vidar Brekke
