Wojtek212 | 1 Aug 2008 02:51

FileNotFoundException during indexing


Hi,
I'm sometimes receiving FileNotFoundExceptions during indexing.

java.io.FileNotFoundException: /tmp/content/3615.0-3618.0/_3p.fnm (No such file or directory)
	at com.test.vcssearch.DefaultServiceIndexer$2.run(DefaultServiceIndexer.java:245)
	at java.lang.Thread.run(Thread.java:595)
Caused by: com.test.search.IndexingException: java.io.FileNotFoundException: /tmp/content/3615.0-3618.0/_3p.fnm (No such file or directory)
	at com.test.search.impl.lucene.IndexManager.removeDocuments(IndexManager.java:293)
	at com.test.search.impl.lucene.IndexManager.removeDocuments(IndexManager.java:199)
	at com.test.search.impl.lucene.IndexManager.reindex(IndexManager.java:250)
	at com.testsearch.impl.lucene.IndexManager.reindex(IndexManager.java:301)
	at com.test.vcssearch.DefaultServiceIndexer$2.run(DefaultServiceIndexer.java:239)
	... 1 more
Caused by: java.io.FileNotFoundException: /tmp/content/3615.0-3618.0/_3p.fnm (No such file or directory)
	at java.io.RandomAccessFile.open(Native Method)
	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
	at org.apache.lucene.store.FSIndexInput$Descriptor.<init>(FSDirectory.java:497)
	at org.apache.lucene.store.FSIndexInput.<init>(FSDirectory.java:522)
	at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:434)
	at org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java:204)

Michael McCandless | 1 Aug 2008 03:12

Re: FileNotFoundException during indexing


Are you only creating one instance of IndexManager and then sharing  
that instance across all threads?

Can you put some logging/printing where you call IndexReader.unlock,  
to see how often that's happening?  That method is dangerous because  
if you unlock a still-active IndexWriter it leads to exactly this kind  
of exception.

Mike

Wojtek212 wrote:

>
> Hi,
> I'm sometimes receiving FileNotFoundExceptions during indexing.
>
> java.io.FileNotFoundException: /tmp/content/3615.0-3618.0/_3p.fnm  
> (No such
> file or directory)
> 	at
> com.test.vcssearch.DefaultServiceIndexer 
> $2.run(DefaultServiceIndexer.java:245)
> 	at java.lang.Thread.run(Thread.java:595)
> Caused by: com.test.search.IndexingException:  
> java.io.FileNotFoundException:
> /tmp/content/3615.0-3618.0/_3p.fnm (No such file or directory)
> 	at
> com 
> .test 

Christopher M Collins | 1 Aug 2008 04:06

SpanRegexQuery


Hello,

I'm trying to use SpanRegexQuery as one of the clauses in my SpanQuery.
When I give it a regex like: "L[a-z]+ing" and do a rewrite on the final
query I get terms like "Labinger" and "Lackonsingh" along with the expected
terms "Labeling", "Lacing", etc.  It's as if the regex is treated as a
"find()" and not a "match()" in Java.  Is there a way to make it behave
like a full match, and not a prefix regex?

Thanks!

Christopher

______________________________________________________________
Christopher Collins \ http://www.cs.utoronto.ca/~ccollins
Department of Computer Science \ University of Toronto
Collaborative User Experience Group \ IBM Research
Daniel Noll | 1 Aug 2008 04:27

Re: SpanRegexQuery

Christopher M Collins wrote:
> Hello,
> 
> I'm trying to use SpanRegexQuery as one of the clauses in my SpanQuery.
> When I give it a regex like: "L[a-z]+ing" and do a rewrite on the final
> query I get terms like "Labinger" and "Lackonsingh" along with the expected
> terms "Labeling", "Lacing", etc.  It's as if the regex is treated as a
> "find()" and not a "match()" in Java.  Is there a way to make it behave
> like a full match, and not a prefix regex?

Have you tried appending $ onto the end of it?  I think we noticed the 
same issue with regex queries here and had to apply a workaround of that 
sort.
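
A minimal sketch of the difference, using only java.util.regex and no Lucene classes. It assumes the prefix-style lookingAt() matching described elsewhere in this thread; the class name RegexAnchoringDemo and its two helper methods are made up for illustration and are not part of SpanRegexQuery.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Lucene.
class RegexAnchoringDemo {

    /** Prefix-style match: true if the pattern matches starting at index 0,
     *  even when the term continues after the matched part. */
    static boolean prefixMatch(String regex, String term) {
        return Pattern.compile(regex).matcher(term).lookingAt();
    }

    /** Whole-term match: true only if the pattern consumes the entire term. */
    static boolean fullMatch(String regex, String term) {
        return Pattern.compile(regex).matcher(term).matches();
    }

    public static void main(String[] args) {
        String regex = "L[a-z]+ing";
        String[] terms = {"Labeling", "Lacing", "Labinger", "Lackonsingh"};
        for (String term : terms) {
            // Third column: the same prefix-style match with "$" appended.
            System.out.printf("%-12s lookingAt=%-5b matches=%-5b lookingAt+$=%b%n",
                    term,
                    prefixMatch(regex, term),
                    fullMatch(regex, term),
                    prefixMatch(regex + "$", term));
        }
    }
}
```

With lookingAt(), "Labinger" is accepted because its prefix "Labing" satisfies the pattern. Appending $ forces the match to end at the end of the term, which gives the same accept/reject decision as matches() for these four terms.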

Daniel

-- 
Daniel Noll
Ganesh - yahoo | 1 Aug 2008 07:44

Re: Using lucene as a database... good idea or bad idea?

Thanks Andy and Karsten.

----- Original Message ----- 
From: "Andy Liu" <andyliu1227 <at> gmail.com>
To: <java-user <at> lucene.apache.org>
Sent: Thursday, July 31, 2008 8:16 PM
Subject: Re: Using lucene as a database... good idea or bad idea?

> If essentially all you need is key-value storage, Berkeley DB for Java 
> works
> well.  Lookup by ID is fast, can iterate through documents, supports
> secondary keys, updates, etc.
>
> Lucene would work relatively well for this, although inserting documents
> might not be as fast, because segments need to be merged and data ends up
> getting copied over again at certain points.  So if you're running a batch
> process with a lot of inserts, you might get better throughput with BDB as
> opposed to Lucene, but, of course, benchmark to confirm ;)
>
> Andy
>
> On Thu, Jul 31, 2008 at 9:12 AM, Karsten F.
> <karsten-lucene <at> fiz-technik.de>wrote:
>
>>
>> Hi Ganesh,
>>
>> in this Thread nobody said, that lucene is a good storage server.
>> Only "it could be used as storage server" (Grant: Connect data storage 
>> with

Wojtek212 | 1 Aug 2008 09:38

Re: FileNotFoundException during indexing


Hi Mike,
I'm sharing one instance of IndexManager across all threads and as I've
noticed only this one is used during indexing.

I'm unlocking before every indexing operation to make sure it would be
possible.
When IndexWriter is closed I assume it releases the lock and finishes its
work.
Does IndexWriter start some threads and not wait for them to finish?
That's the only situation I can imagine in which there are 2 IndexWriters...

-- 
View this message in context: http://www.nabble.com/FileNotFoundException-during-indexing-tp18766343p18769652.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Michael McCandless | 1 Aug 2008 11:42

Re: FileNotFoundException during indexing


Wojtek212 wrote:

>
> Hi Mike,
> I'm sharing one instance of IndexManager across all threads and as  
> I've
> noticed only this one is used during indexing.

OK, maybe triple check this -- because that's the only way in your  
code I can see 2 IWs being live at once.

> I'm unlocking before every indexing operation to make sure it would be
> possible.

This is what makes me nervous (and why I suggest you print every time  
IndexReader.isLocked returns true, to be 100% sure IndexReader.unlock  
is not being called).

You should only very, very rarely (after a JVM crash, or if the JVM  
exits without your IndexWriter having been closed) actually need to use  
IndexReader.unlock, and if you call it when you shouldn't (because  
another IW is in fact still "live"), disaster ensues.

> When IndexWriter is closed I assume it releases the lock and  
> finishes its
> work.
> Does IndexWriter start some threads and not wait for them to finish?
> That's the only situation I can imagine in which there are 2 IndexWriters...

Michael McCandless | 1 Aug 2008 11:45

Re: FileNotFoundException during indexing


Another option is to switch to native locks (dir.setLockFactory(new  
NativeFSLockFactory())), at which point you will never have to call  
IndexReader.unlock because native locks are always properly released  
by the OS when the JVM exits/crashes.

If on switching to native locks, and removing the call to  
IndexReader.unlock, you see the IndexWriter constructor hitting  
LockObtainFailedException, then that means somehow you are trying to  
open two live writers on the same index.
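
To illustrate the idea behind native locks, here is a minimal sketch in plain java.nio, the file-locking facility that NativeFSLockFactory builds on. It is not Lucene's code: the class name NativeLockDemo, the temporary directory, and the use of a file called write.lock are all assumptions made for the example. While one holder has the lock, a second attempt simply fails, so nothing ever needs to be force-unlocked.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Lucene.
class NativeLockDemo {

    /** Tries to take an exclusive OS-level lock; returns null if it is already held. */
    static FileLock tryAcquire(FileChannel channel) throws IOException {
        try {
            return channel.tryLock(); // null when another process holds the lock
        } catch (OverlappingFileLockException e) {
            return null;              // this JVM already holds the lock
        }
    }

    /** Returns {first attempt ok, second attempt ok, attempt after release ok}. */
    static boolean[] demo() {
        boolean[] result = new boolean[3];
        try {
            Path dir = Files.createTempDirectory("index-demo");
            Path lockFile = dir.resolve("write.lock");
            try (FileChannel writerA = FileChannel.open(lockFile,
                         StandardOpenOption.CREATE, StandardOpenOption.WRITE);
                 FileChannel writerB = FileChannel.open(lockFile,
                         StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                FileLock held = tryAcquire(writerA);
                result[0] = held != null;

                // A second "writer" must not get the lock while the first is live.
                FileLock second = tryAcquire(writerB);
                result[1] = second != null;
                if (second != null) second.release();

                // Once the first holder releases, the lock is available again.
                if (held != null) held.release();
                FileLock again = tryAcquire(writerB);
                result[2] = again != null;
                if (again != null) again.release();
            }
            Files.deleteIfExists(lockFile);
            Files.deleteIfExists(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("first=" + r[0] + " second=" + r[1] + " afterRelease=" + r[2]);
    }
}
```

The failed second attempt is the rough analogue of a LockObtainFailedException: it signals that two writers are competing for the same index, which is the condition worth tracking down rather than overriding.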

Mike

Wojtek212 wrote:

>
> Hi Mike,
> I'm sharing one instance of IndexManager across all threads and as  
> I've
> noticed only this one is used during indexing.
>
> I'm unlocking before every indexing operation to make sure it would be
> possible.
> When IndexWriter is closed I assume it releases the lock and  
> finishes its
> work.
> Does IndexWriter start some threads and not wait for them to finish?
> That's the only situation I can imagine in which there are 2 IndexWriters...
>

Erik Hatcher | 1 Aug 2008 12:24

Re: SpanRegexQuery


On Jul 31, 2008, at 10:06 PM, Christopher M Collins wrote:
> I'm trying to use SpanRegexQuery as one of the clauses in my  
> SpanQuery.
> When I give it a regex like: "L[a-z]+ing" and do a rewrite on the  
> final
> query I get terms like "Labinger" and "Lackonsingh" along with the  
> expected
> terms "Labeling", "Lacing", etc.  It's as if the regex is treated as a
> "find()" and not a "match()" in Java.  Is there a way to make it  
> behave
> like a full match, and not a prefix regex?

There are two implementations of the regex engine built into  
SpanRegexQuery, one using Java's java.util.regex, the other using  
Jakarta Regexp.  The default implementation is java.util.regex, which  
matches like this:

   pattern.matcher(string).lookingAt()

And Jakarta Regexp matches like this:

   regexp.match(string)

I'm not sure myself of the differences between these two without doing  
some tests, but certainly they should, ahem, match in at least the  
expectation of whether there is an implied ^string$ or not.  But at a  
quick glance at the respective javadocs, it does seem like the  
java.util.regex implementation should be using  
pattern.matcher(string).matches() instead.  lookingAt() always starts  

Wojtek212 | 1 Aug 2008 13:46

Re: FileNotFoundException during indexing


I've checked unlock, and it is not called until the exception occurs.

BTW, I've tried to use FSDirectory with NativeFSLockFactory and I
didn't get a LockObtainFailedException. I also removed the part that
does the unlocking (IndexReader.unlock).

The exception is:
Exception in thread "Thread-95" org.apache.lucene.index.MergePolicy$MergeException: java.io.FileNotFoundException: /tmp/content/3615.0-3618.0/_5.cfs (No such file or directory)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: java.io.FileNotFoundException: /tmp/content/3615.0-3618.0/_5.cfs (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
        at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
        at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
        at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:70)
        at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:277)
        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)