How to close instance of an XML DOCUMENT?

Wildcard search using Lucene in a large file containing 100 million strings is taking too much time. I want the result in 1-2 seconds. Any help?

I have a 1.43 GB file containing 100 million strings (3 to 80 characters long), one per line. I am running a wildcard search over it with Lucene. Right now I create one document for each string. I want the total count of matches for the search keyword (*searchkeyword*). Here is my code.

LuceneDemo.java

    import java.io.FileNotFoundException;
    import java.io.IOException;

    import org.apache.lucene.index.CorruptIndexException;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.Searcher;
    import org.apache.lucene.search.WildcardQuery;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class LuceneDemo {
        // Path to the directory where Lucene will store index files
        private static String indexDirectory = "C:\\indextofile";
        // Path to the directory that contains the data files to be indexed
        private static String dataDirectory = "C:\\indexofilef";
        public static int count = 0;
        private Searcher indexSearcher;

        public static void main(String[] args) throws FileNotFoundException, IOException {
            LuceneDemo luceneDemo = new LuceneDemo();
            // Create the Lucene index
            luceneDemo.createLuceneIndex();
            // Create the IndexSearcher
            luceneDemo.createIndexSearcher();
            luceneDemo.termQueryExample();
        }

        private void createLuceneIndex() {
            Indexer indexer = new Indexer(indexDirectory, dataDirectory);
            // Create the IndexWriter
            System.out.println("testing-4");
            indexer.createIndexWriter();
            try {
                // Index the data
                indexer.indexData();
            } catch (FileNotFoundException e) {
                throw new RuntimeException(e);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }

        private void createIndexSearcher() throws CorruptIndexException, IOException {
            // Create an instance of IndexSearcher
            indexSearcher = new IndexSearcher(indexDirectory);
        }

        private void termQueryExample() throws CorruptIndexException, IOException {
            try {
                Directory directory = FSDirectory.getDirectory(indexDirectory);
                // IndexSearcher indexSearcher = new IndexSearcher(directory);
                BooleanQuery.setMaxClauseCount(102400000);
                Term term = new Term("reversecontent", "bubble*com");
                Query query = new WildcardQuery(term);
                Hits hits = indexSearcher.search(query);
                System.out.println("######## Hits :" + hits.length());
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

Indexer.java

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.io.IOException;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.KeywordAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.Field.Index;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.index.IndexDeletionPolicy;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class Indexer {
        private IndexWriter indexWriter;
        /* Location of the directory where index files are stored */
        private String indexDirectory;
        /* Location of the data directory */
        private String dataDirectory;
        public String FIELD_CONTENTS = "contents";

        public Indexer(String indexDirectory, String dataDirectory) {
            this.indexDirectory = indexDirectory;
            this.dataDirectory = dataDirectory;
        }

        /**
         * Creates an instance of IndexWriter, which is used
         * to add Documents and write indexes to disk.
         */
        void createIndexWriter() {
            if (indexWriter == null) {
                try {
                    // Directory where index files will be stored
                    Directory fsDirectory = FSDirectory.getDirectory(indexDirectory);
                    // KeywordAnalyzer treats each whole string as a single token
                    Analyzer analyzer = new KeywordAnalyzer();
                    // Create a new index
                    boolean create = true;
                    // Deletion policy
                    IndexDeletionPolicy deletionPolicy = new KeepOnlyLastCommitDeletionPolicy();
                    indexWriter = new IndexWriter(fsDirectory, analyzer, create,
                            deletionPolicy, IndexWriter.MaxFieldLength.UNLIMITED);
                } catch (IOException ie) {
                    System.out.println("Error in creating IndexWriter");
                    throw new RuntimeException(ie);
                }
            }
        }

        void indexData() throws FileNotFoundException, IOException {
            File[] files = getFilesToBeIndxed();
            for (File file : files) {
                BufferedReader br = new BufferedReader(new FileReader(file));
                try {
                    String data;
                    System.out.println("start");
                    // One Lucene document per input line
                    while ((data = br.readLine()) != null) {
                        String newdata = data + ".com";
                        Document doc = new Document();
                        // doc.add(new Field("content", newdata,
                        //         Store.NO, Index.NOT_ANALYZED));
                        doc.add(new Field("reversecontent",
                                new StringBuffer(newdata).reverse().toString(),
                                Store.NO, Index.NOT_ANALYZED));
                        indexWriter.addDocument(doc);
                    }
                    System.out.println("end");
                } finally {
                    // Close the reader even if indexing fails part-way
                    br.close();
                }
            }
            /* Requests an "optimize" operation on the index, priming it
               for the fastest available search */
            indexWriter.optimize();
            System.out.println("optimization done");
            /* Commits all changes to the index and closes all associated files */
            indexWriter.close();
        }

        private File[] getFilesToBeIndxed() {
            File dataDir = new File(dataDirectory);
            if (!dataDir.exists()) {
                throw new RuntimeException(dataDirectory + " does not exist");
            }
            File[] files = dataDir.listFiles();
            return files;
        }
    }
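
The indexer above stores every string reversed in the "reversecontent" field. The question does not say why, so the following is an assumption: a common reason is that a pattern with a leading wildcard, such as *bubble.com, becomes a plain prefix lookup once both the stored text and the pattern are reversed. The sketch below illustrates that idea in plain Java, with a TreeSet standing in for a sorted term dictionary. It makes no Lucene calls, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class ReversedTermDemo {

    /** Reverses a string, as the indexer does before storing "reversecontent". */
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** Builds a sorted set of reversed values (a stand-in for a sorted term dictionary). */
    static SortedSet<String> buildReversedTerms(List<String> values) {
        SortedSet<String> terms = new TreeSet<>();
        for (String value : values) {
            terms.add(reverse(value));
        }
        return terms;
    }

    /** Counts distinct values ending with the given suffix via a prefix range scan. */
    static int countEndingWith(SortedSet<String> reversedTerms, String suffix) {
        String prefix = reverse(suffix);
        // Strings starting with prefix sort inside [prefix, prefix + '\uffff'),
        // assuming no stored value contains the character '\uffff'.
        return reversedTerms.subSet(prefix, prefix + Character.MAX_VALUE).size();
    }

    public static void main(String[] args) {
        SortedSet<String> terms = buildReversedTerms(Arrays.asList(
                "bubble.com", "bubbletea.com", "soap.org", "mybubble.com"));
        // "*bubble.com" matches bubble.com and mybubble.com
        System.out.println(countEndingWith(terms, "bubble.com")); // prints 2
    }
}
```

Reversal on its own only helps when the wildcard is at the front of the pattern. A pattern with wildcards on both sides, like *searchkeyword*, still has no fixed prefix in either direction.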
Answer:

a. I believe the problem is that you are using a KeywordAnalyzer. This means each of your 100 million strings will get a different index term, unless some of them are identical. Try switching to, say, a StandardAnalyzer. This will allow the index to be much more efficient.

b. Try testing this on a small scale (say 10000 strings), to see that you are getting proper results.

c. I believe the Lucene users mailing list should give you better responses than Quora for a specific technical Lucene question.

d. Other than that, read http://www.manning.com/hatcher3/. This is the best book I know about information retrieval, and it explains Lucene like no other resource.
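
For the small-scale test suggested in point b, one option is a brute-force count to compare against the number of hits Lucene reports. The sketch below is a minimal version in plain Java. The class and method names are hypothetical, and it assumes a match means "the line contains the keyword", which is what *searchkeyword* asks for.

```java
import java.util.stream.Stream;

class SubstringBaseline {

    /** Counts the lines that contain the keyword, i.e. the lines matching *keyword*. */
    static long countContaining(Stream<String> lines, String keyword) {
        return lines.filter(line -> line.contains(keyword)).count();
    }

    public static void main(String[] args) {
        Stream<String> sample = Stream.of("bubble", "bubbletea", "soap", "mybubblewrap");
        System.out.println(countContaining(sample, "bubble")); // prints 3
    }
}
```

For a real sample file, the lines could come from java.nio.file.Files.lines(path). Since the indexer appends ".com" to each line and reverses it before indexing, the same transformation would have to be applied to the lines (or undone in the query) for the two counts to be comparable.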

Yuval Feinstein at Quora
