Mukundaraman Valakumaresan | 23 Jul 06:49 2014

Query using doc Id

Hi,

Is it possible to execute queries using the doc Id as a query parameter?

For example, querying docs whose doc Id is between 100 and 200.
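If "doc Id" here means the uniqueKey field rather than Lucene's internal document number (internal doc IDs are not stable or directly queryable), a standard range query works. A minimal sketch, assuming a core at localhost:8983 whose uniqueKey field is named `id`:

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host/core for your setup.
base_url = "http://localhost:8983/solr/collection1/select"

# Range query on the uniqueKey field "id".
params = {
    "q": "id:[100 TO 200]",
    "wt": "json",
}
query_url = base_url + "?" + urlencode(params)
print(query_url)
```

Note that on a string-typed `id` field the range is lexicographic ("1000" would also fall between "100" and "200"); a numeric field type is needed for true numeric ranges.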

Thanks & Regards
Mukund
Ameya Aware | 22 Jul 22:50 2014

NoClassDefFoundError while indexing in Solr

Hi

I am running into the error below while indexing a file in Solr.

Can you please help to fix this?

ERROR - 2014-07-22 16:40:32.126; org.apache.solr.common.SolrException;
null:java.lang.RuntimeException: java.lang.NoClassDefFoundError:
com/uwyn/jhighlight/renderer/XhtmlRendererFactory
        at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:790)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:439)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)

Warren Bell | 22 Jul 22:29 2014

How to get Lacuma to match Lucuma

What field type or filters do I use to get a search for something like the word “Lacuma” to return results
containing “Lucuma”? The word “Lucuma” has been indexed in a field with field type text_en_splitting
that came with the original Solr examples.
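One option (an editorial suggestion, not something from this thread) that needs no schema change is a fuzzy query: "Lacuma" is a single-character substitution away from "Lucuma", so a fuzzy query with a maximum edit distance of 1 should match. A sketch, assuming a hypothetical field named `name`:

```python
from urllib.parse import urlencode

# "Lacuma" -> "Lucuma" is edit distance 1, so the ~1 fuzzy operator
# on the query term should match the indexed word.
params = {"q": "name:Lacuma~1", "wt": "json"}
print(urlencode(params))
```

If this kind of vowel variation is systematic, a schema-side alternative is a phonetic filter (e.g. solr.PhoneticFilterFactory with the DoubleMetaphone encoder) in the analysis chain.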

Thanks,

Warren

   <fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <!-- Case insensitive stop word removal.
        -->
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="lang/stopwords_en.txt"
                />
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory"

Ameya Aware | 22 Jul 19:37 2014

Java heap space error

Hi

I am running into a Java heap space issue. Please see the log below.

ERROR - 2014-07-22 11:38:59.370; org.apache.solr.common.SolrException;
null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:790)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:439)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)

Johannes Siegert | 22 Jul 13:26 2014

wrong docFreq while executing query based on uniqueKey-field

Hi.

My Solr index (version 4.7.2) has an id field:

<field name="id" type="string" indexed="true" stored="true"/>
...
<uniqueKey>id</uniqueKey>

The index will be updated once per hour.

I use the following query to retrieve some documents:

"q=id:2^2 id:1^1"

I would expect document(2) to always be ranked before document(1). But
after many index updates, document(1) appears before document(2).

With debug=true I could see the problem. Document(1) has a
docFreq=2, while document(2) has a docFreq=1.

How can the docFreq of the uniqueKey field be higher than 1? Could
anyone explain this behavior to me?

Thanks!

Johannes

elisabeth benoit | 22 Jul 09:12 2014

spatial search: find result in bbox OR first result outside bbox

Hello,

I am using solr 4.2.1. I have the following use case.

I need to find results inside the bbox OR, if there are none, the first result
outside the bbox within a 1000 km distance. I was wondering what the best way
to proceed is.

I was considering doing a geofilt search from the center of my bounding box
and post filtering results.

fq={!geofilt sfield=store}&pt=45.15,-93.85&d=1000

From a performance point of view I don't think it's a good solution though,
since Solr will have to calculate the distance for every document and then sort.

I was wondering if there is another way to do this that avoids sending more
than one request to Solr.
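One two-step pattern (an editorial sketch, not from the thread; the field name `store` follows the example query above, the rest are assumptions) keeps the common case cheap: filter on the bbox first with no distance sorting, and only when that request returns numFound == 0 send a second request that geofilters to 1000 km and sorts by distance with rows=1. The expensive distance sort then runs only on the rare fallback request:

```python
from urllib.parse import urlencode

pt = "45.15,-93.85"  # center point (example values from the post)

# Request 1: cheap bbox-only filter, no distance computation beyond the box.
bbox_params = {
    "q": "*:*",
    "fq": "{!bbox}",
    "sfield": "store",
    "pt": pt,
    "d": "1000",
}

# Request 2, sent only if request 1 returns numFound == 0:
# nearest single hit within 1000 km, sorted by distance.
fallback_params = {
    "q": "*:*",
    "fq": "{!geofilt}",
    "sfield": "store",
    "pt": pt,
    "d": "1000",
    "sort": "geodist() asc",
    "rows": "1",
}

print(urlencode(bbox_params))
print(urlencode(fallback_params))
```

Note that {!bbox} filters on the box circumscribing the d-radius circle; if the bbox actually comes from a map viewport, a range query on the location field is the corresponding first request.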

Thanks,
Elisabeth
Michael Ryan | 22 Jul 04:50 2014

DocValues without re-index?

Is it possible to use DocValues on an existing index without first re-indexing?

-Michael
Jeff Wartes | 22 Jul 01:37 2014

SolrCloud extended warmup support


I’d like to ensure an extended warmup is done on each SolrCloud node prior to that node serving traffic.
I can do certain things prior to starting Solr, such as pumping the index dir through /dev/null to pre-warm the
filesystem cache, and post-start I can use the ping handler with a health-check file to keep the node
out of the clients’ load balancer until I’m ready.
What I seem to be missing is control over when a node starts participating in queries sent to the other nodes.

I can, of course, add firstSearcher queries to solrconfig.xml, which I assume (and fervently hope!) run
before a node registers itself as ready for work in the ZK clusterstate.json, but that doesn’t scale so well
if I want the initial warmup to run thousands of queries, or to run them with some parallelism. I’m storing
solrconfig.xml in ZK, so I’m sensitive to the size.
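For reference, the firstSearcher mechanism mentioned here is configured with a QuerySenderListener in solrconfig.xml, along these lines (the queries shown are placeholders, not a real warmup set):

```xml
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- a few representative queries; replace with ones that exercise your caches -->
    <lst><str name="q">*:*</str><str name="rows">10</str></lst>
  </arr>
</listener>
```

For thousands of warmup queries, one alternative that keeps solrconfig.xml small in ZK is an external script replaying a query log against the node while the ping health-check file still marks it down — though, as noted above, that does not gate internal shard requests from other nodes.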

Any ideas, or corrections to my assumptions?

Thanks.
Darren Lee | 22 Jul 01:14 2014

SolrCloud replica dies under high throughput

Hi,

I'm doing some benchmarking with Solr Cloud 4.9.0. I am trying to work out exactly how much throughput my
cluster can handle.

Consistently in my tests I see a replica go into the recovering state forever, caused by what looks like a timeout
during replication. I can understand the timeout and failure (I am hitting it fairly hard), but what seems
odd to me is that when I stop the heavy load it still does not recover the next time it tries; it seems broken
forever until I manually go in, clear the index, and let it do a full resync.

Is this normal? Am I misunderstanding something? My cluster has 4 nodes (2 shards, 2 replicas) (AWS
m3.2xlarge). I am indexing with ~800 concurrent connections and a 10 sec soft commit. I consistently get
this problem with a throughput of around 1.5 million documents per hour.

Thanks all,
Darren

Stack Traces & Messages:

[qtp779330563-627] ERROR org.apache.solr.servlet.SolrDispatchFilter - null:org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
        at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
        at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:422)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:233)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

pramodEbay | 22 Jul 01:13 2014

How do I disable distributed search feature when I have only one shard

Hi there,

We have a SolrCloud setup with only one shard. There is one leader and 15
followers, so the data is replicated on 15 nodes. When we run a Solr query,
only one node should handle the request, and we do not need any distributed
search features, as all the nodes are exact copies of each other.

Under certain load scenarios, we are seeing the SolrJ API add
isShard=true&distrib=false&shard.url=A,B,C etc. to all the queries. Is the
Solr query waiting for responses from A, B, and C before returning to
the client? If so, it is unnecessary and causes problems for us
under heavy load.

The thing is, these parameters are somehow added automagically at query
time. How do we disable this? The SolrJ query that we build programmatically
does not add these three parameters. Is there some configuration we can turn
on to tell SolrJ not to add these parameters to the Solr request?
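For what it's worth (an editorial note, not a confirmed diagnosis): isShard and shard.url are normally appended server-side by the node that first receives the query when it fans the request out to shard replicas, not by the SolrJ client itself. Sending distrib=false from the client asks the receiving replica to answer purely from its local index, skipping the fan-out hop; in SolrJ that would be query.set("distrib", "false"). As plain request parameters:

```python
from urllib.parse import urlencode

# distrib=false asks the node that receives this request to answer from
# its own local index instead of performing a distributed search.
params = {
    "q": "*:*",
    "distrib": "false",
}
print(urlencode(params))
```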

Thanks,
Pramod


prashantc88 | 21 Jul 17:29 2014

Solr schema.xml query analyser


I am a complete beginner to Solr and need some help.

My task is to provide a match when the search term contains the indexed
field.

For example:

    If query= foo bar and textExactMatch= foo, I should not get a MATCH
    If query= foo bar and textExactMatch= foo bar, I should get a MATCH
    If query= foo bar and textExactMatch= xyz foo bar/foo bar xyz, I should
get a MATCH

I am indexing my field as follows:

<fieldType name="textExactMatch" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

So I'm indexing the text for the field as-is, without breaking it down
further. Could someone help me out with how I should tokenize and filter the
field at query time?
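Reading the three examples, the rule appears to be that the indexed text must contain the query as a contiguous phrase (note this is the reverse of the task statement's wording, "the search term contains the indexed field"). If that reading is right, one sketch (my suggestion, not from the thread) is to drop KeywordTokenizerFactory, tokenize normally on both sides, and issue the search as a phrase query, e.g. q=textExactMatch:"foo bar":

```xml
<fieldType name="textExactMatch" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

A phrase query for "foo bar" then matches fields containing "foo bar" or "xyz foo bar" but not "foo" alone. If instead the query really must contain the indexed field, the roles reverse: keep KeywordTokenizerFactory at index time and add solr.ShingleFilterFactory to the query analyzer so every contiguous sub-phrase of the query is emitted as a candidate token.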


