sraav | 4 Mar 01:40 2015

Solr join + Boost in single query

David,

Is it possible to write a query to join two cores and either bring back data
from the two cores or to boost on the data coming back from either of the
cores? Is that possible with Solr? 

Raavi

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-join-Boost-in-single-query-tp4190825.html
Sent from the Solr - User mailing list archive at Nabble.com.

Tom Devel | 3 Mar 22:51 2015

Search over a multiValued field

Hi,

I am running Solr 5.0.0 and have a question about proximity search and
multiValued fields.

I am indexing XML files of the following form, with foundField being a field
defined as multiValued and of type text_en in my schema.xml.

<?xml version="1.0" encoding="UTF-8"?>
<add><doc>
<field name="id">8</field>
<field name="foundField">"Oranges from South California - ordered"</field>
<field name="foundField">"Green Apples - available"</field>
<field name="foundField">"Black Report Books - ordered"</field>
</doc></add>

There are several such documents, and for instance, I would like to query
all documents having in the foundField "Oranges" and "ordered". The
following proximity query takes care of it:

q=foundField:("oranges AND ordered"~2)

However, a field could have more words, and I cannot know the
proximity of the desired query words in advance. Setting the proximity
value too high results in false positives: the following query also returns
the document (although "available" was in the entry about Apples):

foundField:("oranges AND available"~200)

I do not think that tweaking a proximity value is the correct approach.
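
A minimal sketch of why the large slop matches across values, under a simplified model that is not the exact Lucene scoring logic. It assumes plain word tokenizing and a position gap of 100 between values of a multiValued field (the positionIncrementGap setting; the value 100 is an assumption, not something stated in this message). Real text_en analysis also stems and drops stop words.

```python
import re

POSITION_GAP = 100  # assumed positionIncrementGap for the field type


def token_positions(values, gap=POSITION_GAP):
    """Return {token: [positions]} across all values of one field."""
    positions = {}
    next_pos = 0
    for value in values:
        for token in re.findall(r"\w+", value.lower()):
            positions.setdefault(token, []).append(next_pos)
            next_pos += 1
        next_pos += gap  # jump inserted between consecutive values
    return positions


def min_distance(positions, a, b):
    """Smallest position difference between any occurrence of a and b."""
    return min(abs(x - y) for x in positions[a] for y in positions[b])


values = [
    "Oranges from South California - ordered",
    "Green Apples - available",
    "Black Report Books - ordered",
]
pos = token_positions(values)
same_value = min_distance(pos, "oranges", "ordered")
cross_value = min_distance(pos, "oranges", "available")
print(same_value, cross_value)
```

In this model, terms in the same value are closer together than the gap, while terms in different values are at least one gap apart, so a slop of 200 spans two values and a slop below the gap does not.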

Darin Amos | 3 Mar 17:23 2015

Grouping on multivalued fields. Alternative approaches?

Hi All,

I have read over and over that Solr still does not support grouping on multivalued fields, however I have a
requirement for which grouping on multivalued fields would be the perfect solution.

Has anyone ever worked on a 3rd party library to do this, or are there any alternative ways to do this using a
tokenized text field? Surely someone must have solved this problem by now.

I am currently running SOLR 4.3.

Thanks!

Darin

johnmunir | 3 Mar 15:32 2015

Access permission


Hi,

I'm indexing data off a DB.  The data is secured with access permission.  That is record-A can be seen by
users-x, while record-B can be seen by users-y and yet record-C can be seen by users x and y.  Even more, the
group access permission can change over time.

The question I have is this: how do I handle this in Solr?  Is there anything I can do at index and/or search
time?  What's the best practice for handling access permissions in search?

Thanks!

- MJ
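
One common pattern for this kind of requirement, sketched here under assumptions rather than taken from the message: store the groups allowed to see each record in a multiValued field at index time, then add a filter query with the searching user's groups at search time. The field name acl_groups and the helper acl_filter are hypothetical.

```python
from urllib.parse import urlencode


def acl_filter(user_groups, field="acl_groups"):
    """Build an fq clause matching documents readable by any given group.

    `field` is a hypothetical multiValued field holding allowed groups.
    """
    if not user_groups:
        raise ValueError("user must belong to at least one group")
    quoted = " OR ".join(
        '"%s"' % group.replace('"', '\\"') for group in sorted(user_groups)
    )
    return "%s:(%s)" % (field, quoted)


# record-C in the message would carry both groups in its acl_groups field.
fq = acl_filter({"users-x", "users-y"})
params = urlencode({"q": "*:*", "fq": fq})
print(fq)
```

With this approach a change in group permissions means the affected records have to be re-indexed so that the stored groups stay current.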

Aman Tandon | 3 Mar 12:26 2015

Help needed to understand zookeeper in solrcloud

Hi,

I read in various blogs that we should use an odd number of ZooKeeper nodes in
the ensemble. Why is that so?

With Regards
Aman Tandon
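
The usual reasoning can be worked through with a little arithmetic. ZooKeeper needs a strict majority of the ensemble (a quorum) to be up, so the sketch below computes how many node failures each ensemble size tolerates. The function names are illustrative only.

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Nodes that can fail while a quorum is still available."""
    return ensemble_size - quorum_size(ensemble_size)


table = {n: tolerated_failures(n) for n in range(1, 7)}
for n, failures in table.items():
    print("%d nodes: quorum %d, tolerates %d failure(s)"
          % (n, quorum_size(n), failures))
```

Going from 3 to 4 nodes, or from 5 to 6, raises the quorum size without raising the number of tolerated failures, which is why even-sized ensembles bring no extra fault tolerance.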
Aman Tandon | 3 Mar 12:21 2015

How to start solr in solr cloud mode using external zookeeper ?

Hi,

I am new to SolrCloud. I have connected the ZooKeeper instances located on 3
remote servers. All the configs are uploaded and linked successfully.

Now I am stuck on how to start Solr in cloud mode using these external
ZooKeeper instances, which are remotely located.

ZooKeeper is installed on 3 servers and uses 2181 as the client port. On
all three servers, a Solr server is present along with the external ZooKeeper.

solrcloud1.com (solr + zookeeper is present)
solrcloud2.com
solrcloud3.com

Now I have to start Solr and tell it to use the external
ZooKeeper. How should I do that?

Thanks in advance.

With Regards
Aman Tandon
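
A minimal sketch of the kind of start command this setup calls for, assuming the bin/solr script that ships with Solr 5.0 and its -c (cloud mode), -z (ZooKeeper connection string) and -p (port) options. The host names and ZooKeeper port come from the message; the Solr port 8983 and the helper names are assumptions.

```python
def zk_host_string(hosts, port=2181):
    """Join ZooKeeper hosts into a host:port,host:port connection string."""
    return ",".join("%s:%d" % (host, port) for host in hosts)


def solr_start_command(hosts, solr_port=8983):
    """Argument list for starting one Solr node against external ZooKeeper."""
    return ["bin/solr", "start", "-c",
            "-z", zk_host_string(hosts),
            "-p", str(solr_port)]


hosts = ["solrcloud1.com", "solrcloud2.com", "solrcloud3.com"]
command = solr_start_command(hosts)
print(" ".join(command))
```

The same command would be run on each of the three servers, each node pointing at the full list of ZooKeeper hosts.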
Clemens Wyss DEV | 3 Mar 10:00 2015

ex(clude) facet.query ?

[Solr 5.0]
Whereas in 

fq={!tag="facet15"}facet15__d_i:1.8 facet15__d_i:2.2
&q=(*:*)
&facet=true
&facet.mincount=1
&facet.field={!key="facet15" ex="facet15"}facet15__d_i

"facet15" is not affected by the fq (as desired). This does not hold true for the facet.query

fq={!tag="till2"}facet15__d_i:[* TO 2.0]
&q=(*:*)
&facet=true
&facet.mincount=1
&facet.query={!key="till2" ex="till2"}facet15__d_i:[* TO 2.0]
&facet.query={!key="from2" ex="from2"}facet15__d_i:[2.0 TO *]

from2-facet returns 0

Does "ex" not work for "facet.query"?
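
One possible reading of the result above, offered as a sketch and not confirmed in this message: the "ex" local parameter names the tag of a filter to exclude, not the key of the facet. The only tagged filter in the second request is "till2", so ex="from2" excludes nothing and the from2 facet is still restricted by the fq. The helper names below are illustrative.

```python
def tagged_fq(tag, query):
    """Filter query carrying a tag that facets can exclude."""
    return '{!tag="%s"}%s' % (tag, query)


def facet_query(key, exclude_tag, query):
    """facet.query whose `ex` refers to the tag of the filter to ignore."""
    return '{!key="%s" ex="%s"}%s' % (key, exclude_tag, query)


FILTER_TAG = "till2"
params = [
    ("fq", tagged_fq(FILTER_TAG, "facet15__d_i:[* TO 2.0]")),
    ("q", "(*:*)"),
    ("facet", "true"),
    ("facet.mincount", "1"),
    # Both facet queries exclude the same tagged filter.
    ("facet.query", facet_query("till2", FILTER_TAG, "facet15__d_i:[* TO 2.0]")),
    ("facet.query", facet_query("from2", FILTER_TAG, "facet15__d_i:[2.0 TO *]")),
]
for name, value in params:
    print("%s=%s" % (name, value))
```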

Zheng Lin Edwin Yeo | 3 Mar 07:00 2015

Unable to show the indexed content in Solr 5.0

Hi,

The content field is not shown in search results, even though the
following line has been added to the schema using curl from the resource
named in 'managedSchemaResourceName'.

<field name="content" stored="true" type="text_general" indexed="true"/>

I'm using the schema from ManagedIndexSchemaFactory.

The ExtractRequestHandler is already defined in solrconfig.xml by
default, and I'm using the ManagedIndexSchemaFactory. I have added the
content field line to allow the indexed content to be shown when a user runs
a query, as the default setting is for the content not to be shown. I added
it using curl as follows:

$ curl -X POST -H 'Content-type:application/json' --data-binary '{
"update-field" :

{ "name":"text", "type":"text_general", "stored":true, "indexed":true,
"storeOffsetsWithPositions":true}

}' http://localhost:8983/solr/collection1/schema

I have indexed the document using the following command:
java -Dc=collection1 -Dauto=true -jar example\exampledocs\post.jar
example\exampledcos\solr-word.pdf.

The document is successfully indexed, and when I search for any word
from the content, the search is able to return document ID and other
(Continue reading)

Matt Hilt | 2 Mar 23:48 2015

Slow highlighting on Solr 5.0.0

Short form:
While testing Solr 5.0.0 within our staging environment, I noticed that highlight-enabled queries are much slower than I saw with 4.10. Are there any obvious reasons why this might be the case? As far as I can tell, nothing has changed with the default highlight search component or its parameters.


A little more detail:
The bulk of the collection config set was stolen from the basic 4.X example config set. I changed my schema.xml and solrconfig.xml just enough to get 5.0 to create a new collection (removed non-trie fields, some other deprecated response handler definitions, etc). I can provide my version of the solr.HighlightComponent config, but it is identical to the sample_techproducts_configs example in 5.0.  Are there any other config files I could provide that might be useful?


Numbers on “much slower”:
I indexed a very small subset of my data into the new collection and used the /select interface to do a simple debug query. Solr 4.10 gives the following pertinent info:
"response": { "numFound": 72628,
...
"debug": {
"timing": { "time": 95, "process": { "time": 94, "query": { "time": 6 }, "highlight": { "time": 84 }, "debug": { "time": 4 } }
---------------------------
Whereas solr 5.0 is:
"response": { "numFound": 1093,
...
"debug": {  
"timing": { "time": 6551, "process": { "time": 6549, "query": { "time": 0 }, "highlight": { "time": 6524 }, "debug": { "time": 25 }



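
The timings quoted in the message can be put side by side with a little arithmetic. This sketch only restates the reported numbers; the dictionary layout and names are illustrative.

```python
# Values copied from the two debug outputs quoted in the message (milliseconds).
timings = {
    "4.10": {"total_ms": 95, "highlight_ms": 84, "num_found": 72628},
    "5.0": {"total_ms": 6551, "highlight_ms": 6524, "num_found": 1093},
}


def highlight_share(entry):
    """Fraction of the total request time spent in the highlight component."""
    return entry["highlight_ms"] / entry["total_ms"]


share_410 = highlight_share(timings["4.10"])
share_50 = highlight_share(timings["5.0"])
slowdown = timings["5.0"]["highlight_ms"] / timings["4.10"]["highlight_ms"]
print(round(share_410, 2), round(share_50, 3), round(slowdown, 1))
```

Highlighting dominates the request time in both versions, and the highlight step itself is roughly 78 times slower in the 5.0 run. The two runs report different numFound values, so they are not a strictly like-for-like comparison.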
Matt B | 2 Mar 21:04 2015

Slow cross-core joins

I've recently inherited a Solr instance that is required to perform numerous joins between two cores,
usually as filter queries, similar to the one below:

q=firstName=Matt&fq=-({!to=emailAddress toIndex=accounts type=join fromIndex=lists
from=listValue}list_id:000038f2-351b-11e4-9579-001e67654bce OR {!to=emailDomain
toIndex=accounts type=join fromIndex=lists
from=listValue}list_id:000038f2-351b-11e4-9579-001e67654bce OR {!to=emailDomainReversed
toIndex=accounts type=join fromIndex=lists from=listValue}list_id:000038f2-351b-11e4-9579-001e67654bce)

The accounts core is about 35GB with ~40,000,000 documents and the lists core is about 9 GB with 90,0000,000
documents.  There may be anywhere from one to one million documents in the lists core matching any
particular list_id.  The idea is to filter a search query on the accounts core to include or exclude any
documents with an email address, email domain, or reverse email domain that is found within the lists core
for a particular list id.  The lists core is frequently updated on a daily basis with both additions and deletions.

Not surprisingly, such queries are very slow, usually taking minutes to return any results.

Are there any possible strategies to significantly increase the performance of such queries?  The JVM max
heap size is set to 16 GB and the server has 64 GB RAM.
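
To make the structure of the filter above easier to see, here is a sketch that rebuilds the same fq string from its parts: one join clause per target field, all against the same list_id, wrapped in a negation. It reproduces the local parameters exactly as written in the message and does not address the performance question. The helper names are illustrative.

```python
LIST_ID = "000038f2-351b-11e4-9579-001e67654bce"
TARGET_FIELDS = ["emailAddress", "emailDomain", "emailDomainReversed"]


def join_clause(to_field, list_id):
    """One cross-core join from the lists core into an accounts field."""
    return ("{!to=%s toIndex=accounts type=join fromIndex=lists "
            "from=listValue}list_id:%s" % (to_field, list_id))


def exclusion_filter(list_id, to_fields):
    """Negated OR of one join clause per target field."""
    clauses = " OR ".join(join_clause(field, list_id) for field in to_fields)
    return "-(%s)" % clauses


fq = exclusion_filter(LIST_ID, TARGET_FIELDS)
print(fq)
```

Written this way it is clear that each request runs three separate joins over the lists core for a single list_id.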
Tang, Rebecca | 2 Mar 20:19 2015

solr bug 6143 (facet count and CollapsingQParserPlugin)

We use the CollapsingQParser to group possible duplicate records.  We are running into the issue reported
in bug 6143.  CollapsingQParser only supports facet.truncate, but it returns counts that confuse our
customers.  What we need is group.facets.

I wanted to check if a "new feature" bug has been logged for implementing group.facets for
CollapsingQParserPlugin.  If not, could I log it?

Rebecca Tang
Applications Developer, UCSF CKM
Industry Documents Digital Libraries
E: rebecca.tang <at> ucsf.edu

