What academic reference management tools provide successful automatic metadata extraction from a collection of PDF files?
It seems like it shouldn't be hard to detect the title of each file and look up the metadata online automatically, but I haven't yet encountered a tool that does this robustly for computer science papers. This is a follow-up question to .
Answer:
I have been using Mendeley Desktop for biology PDFs, and it generally works quite well. If I think the automatically extracted metadata is wrong and I know the paper's PubMed ID, Mendeley can use it to correct the record. It also supports retrieval through a few other databases. I have not tried it with computer science papers, but it might do the trick for you.
Joydeep Banerjee at Quora
Other answers
PDF files generally do not contain the metadata needed to produce the output that citation styles require. Apart from the title, it is very difficult to extract the correct information from the text of a PDF file, and the title alone may not be enough to find the rest of the information in machine-readable form in a bibliographic database.
Citavi (https://www.citavi.com) extracts the metadata from a PDF file if there is any, and if the file contains a DOI somewhere on the first pages, it uses the DOI to look up the information in external databases like CrossRef or PubMed.
Journal pages, on the other hand, very often do contain the bibliographic information in machine-readable form, either in the COinS format or as HighWire Press tags, so you should use a citation manager that can extract this information from the journal landing page and then attach the PDF file of the paper to the reference.
Please note that information extracted from a PDF or imported from a landing page may not be complete or correct, so make sure to double-check everything that you did not type yourself (and you would check what you typed, too). Citavi lets you create tasks like "Verify bibliographic information" and tag the references as "verified".
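The DOI-based lookup described above can be sketched roughly as follows. This is only an illustration of the general idea, not how Citavi or any other tool actually implements it. It assumes the text of the first pages has already been extracted from the PDF by some other means, and the function names (`find_doi`, `crossref_url`, `lookup_metadata`) are made up for this example. The regular expression is a common heuristic for modern DOIs and will miss some unusual ones.

```python
import json
import re
import urllib.parse
import urllib.request

# Heuristic pattern for modern DOIs (prefix "10.", registrant code, "/", suffix).
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_doi(text):
    """Return the first DOI-like string in the text, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop sentence punctuation that often follows a DOI in running text.
    return match.group(0).rstrip(".,;:")


def crossref_url(doi):
    """Build the CrossRef REST API URL for a single work."""
    return "https://api.crossref.org/works/" + urllib.parse.quote(doi, safe="/")


def lookup_metadata(doi, timeout=10):
    """Fetch basic bibliographic fields for a DOI from CrossRef (needs network)."""
    with urllib.request.urlopen(crossref_url(doi), timeout=timeout) as response:
        record = json.load(response)["message"]
    return {
        "title": (record.get("title") or [None])[0],
        "authors": [
            " ".join(filter(None, [a.get("given"), a.get("family")]))
            for a in record.get("author", [])
        ],
        "journal": (record.get("container-title") or [None])[0],
        "doi": record.get("DOI"),
    }
```

A typical use would be `doi = find_doi(first_pages_text)` followed by `lookup_metadata(doi)` when a DOI was found. As noted above, the returned record should still be checked by hand, since fields can be missing or wrong.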
Patrick Hilt
Zotero can do it if the PDF has metadata embedded in it. http://www.zotero.org/support/retrieve_pdf_metadata
Stephen Francoeur
Mendeley Desktop is the most hassle-free way to extract metadata from PDFs. Zotero can do the same, but Mendeley has 2 GB of free storage while Zotero has only 100 MB. Also check out Colwiz Desktop, an excellent tool.
Karthik Bala
Both RefWorks and Zotero support limited collection of metadata from various sources, including PDFs. However, some people find this functionality frustrating because it depends heavily on the creator of the original document formatting the data correctly and entering it in the appropriate fields. No bibliographic tool can match the data to the fields on the fly from the raw information about a document.
Oliver Starr