Koji Sekiguchi | 1 Jun 2010 02:29

Re: NPE error when extending DefaultSolrHighlighter

(10/06/01 6:45), Gerald wrote:
> I was looking at solr-386 and thought I would try to create a custom
> highlighter for something I was doing.
>
> I created a class that looks something like this:
>
> public class CustomOutputHighlighter extends DefaultSolrHighlighter {
>     @Override
>     public NamedList doHighlighting(DocList docs, Query query,
>             SolrQueryRequest req, String[] defaultFields) throws IOException {
>         // Delegate to the default implementation first.
>         NamedList highlightedValues = super.doHighlighting(docs, query,
>                 req, defaultFields);
>
>         // do more stuff here
>
>         return highlightedValues;
>     }
> }
>
> and have replaced the <highlighting> line in my solrconfig xml so that it
> looks something like this:
>
> <highlighting class="com.xxxxxxx.solr.highlight.CustomOutputHighlighter">
>
> and left all the existing default highlighting parameters as-is
>
> The code compiles with no problem, and should simply perform the normal
> highlighting (since all I am doing is calling the original doHighlighting
> code and returning the results).  However, when I start Solr, I get an NPE
> error:

jlist9 | 1 Jun 2010 02:58

Re: Luke browser does not show non-String Solr fields?

The id field has type "long" in schema.xml. In Luke, its values are shown
as a hex dump. When viewing a doc (returned by *:*), I pick the ID field
and press the "Show" button; Luke pops up a dialog that lets me
change the "Show Content As" value. When I choose "Number",
I get an error message:

"Some values could not be properly represented in this format. They
are marked in grey and presented as a hex dump."

So it seems that Luke does not understand Solr's long type. Is this
not a native Lucene type?

On Mon, May 31, 2010 at 9:52 AM, Chris Hostetter
<hossman_lucene <at> fucit.org> wrote:
>
> : 1. Queries like "id:123" which work fine in /solr/admin web interface but
> : returns nothing in Luke. Query "*:*" returns all records fine in Luke. I
> : expect Luke returns the same result as /solr/admin since it's essentially
> : a Lucene query?
>
> you haven't told us what fieldtype you are using for the "id" field -- but
> i'm going to go out on a limb and guess it's TrieIntFieldType (or possibly
> a SortedIntFieldType) ... those field types encode their values in such a
> way that they sort lexicographically and produce faster range queries -- if
> Luke doesn't know about that special encoding, it can't search on them (or
> even display the terms properly)
>
> Luke has a "view terms" feature right? ... look at the raw terms in your
> "id" field and i bet you'll see they look nothing like numbers -- and
> that's why you can't search on them as numbers in Luke
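
The point about encoding can be illustrated with a minimal Java sketch. This
is NOT Solr's actual Trie encoding; it is a simplified stand-in (flip the sign
bit, write 16 hex digits) that only shows why an indexed term built to sort
lexicographically looks nothing like the decimal number. The class and method
names are hypothetical.

```java
import java.util.Arrays;

final class SortableLongDemo {
    // Flip the sign bit so negative values order before positive ones,
    // then format as fixed-width hex so string order equals numeric order.
    // (Simplified illustration; not the encoding Solr really uses.)
    static String encode(long value) {
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }

    // Reverse of encode(): parse the hex digits and flip the sign bit back.
    static long decode(String term) {
        return Long.parseUnsignedLong(term, 16) ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        long[] ids = {123L, -5L, 9L, 1000L};
        String[] plain = new String[ids.length];
        String[] encoded = new String[ids.length];
        for (int i = 0; i < ids.length; i++) {
            plain[i] = Long.toString(ids[i]);
            encoded[i] = encode(ids[i]);
        }
        // Plain decimal strings sort in the wrong numeric order.
        Arrays.sort(plain);
        System.out.println("plain:   " + Arrays.toString(plain));
        // Encoded terms sort in numeric order but are unreadable as numbers.
        Arrays.sort(encoded);
        for (String term : encoded) {
            System.out.println(term + " -> " + decode(term));
        }
    }
}
```

A tool that does not know the encoding sees only the raw terms, so a query
for the literal text 123 has nothing to match.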

findbestopensource | 1 Jun 2010 08:17

Re: newbie question on how to batch commit documents

Add the commit after the loop. I would also advise issuing commits from a
separate thread. I keep a separate timer thread that commits every minute
and optimizes the index at the end of every day.
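
A rough sketch of such a timer thread, using only the JDK scheduler. The
Committer interface is a hypothetical stand-in for whatever call performs the
commit (for example server.commit() in solrj); the 50 ms period in main() is
only there to keep the demo short, where the setup described above would use
one minute.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PeriodicCommitDemo {
    /** Hypothetical stand-in for a call such as solrj's server.commit(). */
    interface Committer {
        void commit() throws Exception;
    }

    /** Runs commit() repeatedly on one background thread, with a fixed delay between runs. */
    static ScheduledExecutorService startCommitTimer(Committer committer, long period, TimeUnit unit) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(() -> {
            try {
                committer.commit();
            } catch (Exception e) {
                // Report and continue, so one failed commit does not cancel the schedule.
                System.err.println("commit failed: " + e);
            }
        }, period, period, unit);
        return timer;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger commits = new AtomicInteger();
        // Indexing threads would only add documents; this timer is the only committer.
        ScheduledExecutorService timer =
                startCommitTimer(commits::incrementAndGet, 50, TimeUnit.MILLISECONDS);
        Thread.sleep(300);
        timer.shutdown();
        timer.awaitTermination(1, TimeUnit.SECONDS);
        System.out.println("commits issued: " + commits.get());
    }
}
```

Because a single thread issues all commits with a fixed delay, two commits
never overlap, regardless of how many threads are adding documents.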

Regards
Aditya
www.findbestopensource.com

On Tue, Jun 1, 2010 at 2:57 AM, Steve Kuo <kuosenhao <at> gmail.com> wrote:

> I have a newbie question on what is the best way to batch add/commit a
> large collection of document data via solrj.  My first attempt was to
> write a multi-threaded application that did the following.
>
> Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
> for (Widget w : widgets) {
>    SolrInputDocument doc = new SolrInputDocument();
>    doc.addField("id", w.getId());
>    doc.addField("name", w.getName());
>    doc.addField("price", w.getPrice());
>    doc.addField("category", w.getCat());
>    doc.addField("srcType", w.getSrcType());
>    docs.add(doc);
>
>    // commit docs to solr server
>    server.add(docs);
>    server.commit();
> }
>
> And I got this exception.

olivier sallou | 1 Jun 2010 09:01

Re: newbie question on how to batch commit documents

I would additionally suggest using EmbeddedSolrServer for large uploads if
possible; performance is better.

2010/5/31 Steve Kuo <kuosenhao <at> gmail.com>

> I have a newbie question on what is the best way to batch add/commit a
> large collection of document data via solrj.  My first attempt was to
> write a multi-threaded application that did the following.
>
> Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
> for (Widget w : widgets) {
>    SolrInputDocument doc = new SolrInputDocument();
>    doc.addField("id", w.getId());
>    doc.addField("name", w.getName());
>    doc.addField("price", w.getPrice());
>    doc.addField("category", w.getCat());
>    doc.addField("srcType", w.getSrcType());
>    docs.add(doc);
>
>    // commit docs to solr server
>    server.add(docs);
>    server.commit();
> }
>
> And I got this exception.
>
> org.apache.solr.common.SolrException:
> Error_opening_new_searcher_exceeded_limit_of_maxWarmingSearchers2_try_again_later
>
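
Each commit opens a new searcher, so committing once per document from
several threads can easily exceed the maxWarmingSearchers limit named in the
exception. A minimal sketch of the alternative, sending documents in batches
and committing once at the end, is below. IndexClient is a hypothetical
stand-in for the add/commit calls of the solrj server object, and the batch
size of 1000 is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchIndexDemo {
    /** Hypothetical stand-in for the add/commit calls of a solrj server object. */
    interface IndexClient<D> {
        void add(List<D> batch);
        void commit();
    }

    /** Sends docs in batches of batchSize and commits once after the loop. */
    static <D> void indexAll(Iterable<D> docs, int batchSize, IndexClient<D> client) {
        List<D> batch = new ArrayList<>(batchSize);
        for (D doc : docs) {
            batch.add(doc);
            if (batch.size() == batchSize) {
                client.add(new ArrayList<>(batch)); // hand over a copy, then reuse the buffer
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            client.add(new ArrayList<>(batch)); // flush the final partial batch
        }
        client.commit(); // single commit, issued after the loop
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            docs.add(i);
        }
        final int[] counts = new int[2]; // [0] = add calls, [1] = commit calls
        indexAll(docs, 1000, new IndexClient<Integer>() {
            public void add(List<Integer> batch) { counts[0]++; }
            public void commit() { counts[1]++; }
        });
        System.out.println("add calls: " + counts[0] + ", commits: " + counts[1]);
    }
}
```

With 2500 documents and a batch size of 1000 this makes three add calls and
one commit, instead of 2500 of each.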

NarasimhaRaju | 1 Jun 2010 10:45

Re: Interleaving the results

Can somebody suggest some ideas on how to achieve interleaving from within the application, especially in
a distributed setup?

 “ There are only 10 types of people in this world:-
Those who understand binary and those who don’t “ 

Regards, 
P.N.Raju,

________________________________
From: Lance Norskog <goksron <at> gmail.com>
To: solr-user <at> lucene.apache.org
Sent: Sat, May 29, 2010 3:04:46 AM
Subject: Re: Interleaving the results

There is no interleaving tool. There is a random number tool. You will
have to achieve this in your application.
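
One way this could be done on the application side is a round-robin merge
over per-customer groups, sketched below. The class name, the
"customer_id:doc_id" strings, and the key function are all hypothetical; in
practice the input would be the documents already returned by Solr (in a
distributed setup, after the merged result list comes back).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class InterleaveDemo {
    /** Round-robin over groups, keeping each group's original order. */
    static <T, K> List<T> interleave(List<T> results, Function<T, K> groupKey) {
        // LinkedHashMap keeps groups in order of first appearance.
        Map<K, Deque<T>> groups = new LinkedHashMap<>();
        for (T r : results) {
            groups.computeIfAbsent(groupKey.apply(r), k -> new ArrayDeque<>()).add(r);
        }
        List<T> out = new ArrayList<>(results.size());
        while (out.size() < results.size()) {
            // Take at most one item from every non-empty group per pass.
            for (Deque<T> queue : groups.values()) {
                if (!queue.isEmpty()) {
                    out.add(queue.poll());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Each entry is "customer_id:doc_id"; the data is made up for illustration.
        List<String> hits = Arrays.asList("A:1", "A:2", "A:3", "B:4", "C:5", "C:6");
        System.out.println(interleave(hits, h -> h.substring(0, 1)));
        // prints [A:1, B:4, C:5, A:2, C:6, A:3]
    }
}
```

The merge only reorders what was fetched, so the application has to request
enough rows for every customer to be represented on the page it displays.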

On Fri, May 28, 2010 at 8:23 AM, NarasimhaRaju <rajuxgen <at> yahoo.com> wrote:
> Hi,
> How can I achieve custom ordering of the documents when there is a general query?
>
> Use case:
> Interleave documents from different customers one after the other.
>
> Example:
> Say I have 10 documents in the index belonging to 3 customers (customer_id field in the index) and I am using
> the query *:*, so all the documents in the results score the same,
> but I want the results to be interleaved

jlist9 | 1 Jun 2010 10:47

MoreLikeThis: /solr/mlt NOT_FOUND

I have some experience using MLT with the StandardRequestHandler from Python,
but I can't figure out how to do it with solrj. It seems that to do MLT with
solrj I have to use MoreLikeThisRequestHandler, and there seems to be no way
to use StandardRequestHandler for MLT with solrj (please correct me if I'm
wrong).

So I tried to test it by following this page:
http://wiki.apache.org/solr/MoreLikeThisHandler
but I get this error:

HTTP ERROR: 404
NOT_FOUND
RequestURI=/solr/mlt

Do I need to do something in the config file before I can use MLT?

Thanks
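
A 404 on /solr/mlt usually means that no request handler is registered under
the name /mlt in solrconfig.xml. Separately, MLT through the standard handler
is driven by plain request parameters (mlt, mlt.fl, mlt.count), so a client
that can set arbitrary parameters should be able to use it. The sketch below
only builds the query string with JDK classes; the field names title and
body are hypothetical, and whether the solrj version in use exposes a generic
parameter setter for these names is an assumption to verify.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class MltParamsDemo {
    /** MoreLikeThis parameters for the standard handler; field names are hypothetical. */
    static Map<String, String> mltParams(String query, String similarityFields, int count) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);
        params.put("mlt", "true");              // switch the MLT component on
        params.put("mlt.fl", similarityFields); // fields used to judge similarity
        params.put("mlt.count", Integer.toString(count));
        return params;
    }

    /** Builds a URL query string, URL-encoding every name and value. */
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println("/solr/select?" + toQueryString(mltParams("id:123", "title,body", 5)));
    }
}
```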

rabahb | 1 Jun 2010 10:59

Re: Solr Architecture discussion


Hi Chris,

Thanks for your insights. I totally understand your point about steps 4 and
5. I wanted to control the moment when the swap would happen on the slave
side, but as you say there is no use for that. It only adds complexity
that Solr's internal mechanisms already handle.

For the replication aspect, I re-read the whole documentation and, in the
light you shed on that topic, I realize that the only problem here is the
huge amount of data that can be passed over the wire, depending on the
segments that the indexing will update. As you say, optimizing can have a
devastating effect on the replication phase because, if I understood you
correctly, it could potentially rewrite all the index segments.

OK! So if I rephrase it, the best strategy in my case is to limit the
optimization phases in order to prioritize replication performance, and
to optimize only when replication activity is not so crucial, in order to
avoid degrading search performance.

Thank you very much. That helps a lot.

-- 
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Architecture-discussion-tp825708p860767.html
Sent from the Solr - User mailing list archive at Nabble.com.

rabahb | 1 Jun 2010 12:19

Re: Solr Architecture discussion


Thinking twice about this architecture ...

I'm concerned about the way I'm going to automate the following steps:

A- The slaves would regularly poll Master-core1 for changes
B- A backup of the current index would be created
C- Re-Indexing will happen on Master-core2 
D- When Indexing is done, we'll trigger a swap between Master-core1 and
core2
E- Slaves will then poll and pickup the freshly updated index segments
F- and so on!

This seems simple when it's done manually. But I cannot just sit
there and press a button to send the events. To reach that goal, I
realized that one solution would be to have 2 cores on the master side, while
the slaves would only have one core (as previously discussed). We'll just
need to configure the slave polling period (A, E) and send the right HTTP
requests (B, C, D).

Well OK, step A is automated "natively". Easy enough, using the internal
Solr capabilities.
But what about B, C, and D? I could do them manually. Wait! I'm not sure my
boss will pay for that.

All right, so I imagine that I should implement a process that will automate
the phases that I would otherwise do manually. This would be an external
process, not based on Solr's own mechanisms.
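
A minimal sketch of what such an external process might look like. It assumes
the replication handler's backup command and the CoreAdmin SWAP action are
reachable at the paths shown (the admin path depends on solr.xml); the host
name, core names, and the fake transport are hypothetical, and step C is left
as a comment because it depends on how the application indexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

final class ReindexCycleDemo {
    /** Step B: replication handler backup command for one core. */
    static String backupUrl(String baseUrl, String core) {
        return baseUrl + "/" + core + "/replication?command=backup";
    }

    /** Step D: CoreAdmin SWAP command exchanging two cores. */
    static String swapUrl(String baseUrl, String core, String other) {
        return baseUrl + "/admin/cores?action=SWAP&core=" + core + "&other=" + other;
    }

    /** Calls each URL in order, stopping at the first non-200 status; returns the URLs that succeeded. */
    static List<String> runSteps(List<String> urls, ToIntFunction<String> httpGet) {
        List<String> done = new ArrayList<>();
        for (String url : urls) {
            if (httpGet.applyAsInt(url) != 200) {
                break; // do not swap cores if an earlier step failed
            }
            done.add(url);
        }
        return done;
    }

    public static void main(String[] args) {
        String base = "http://master:8983/solr"; // hypothetical host
        List<String> steps = Arrays.asList(
                backupUrl(base, "core1"),         // B: back up the live index
                // C: re-index into core2 here, by whatever means the application uses
                swapUrl(base, "core1", "core2")); // D: swap the cores
        // A fake transport that always answers 200, so the demo needs no server.
        System.out.println(runSteps(steps, url -> 200));
    }
}
```

A scheduler such as cron could then run the process, with a real HTTP client
passed in place of the fake transport.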

My questions are:

stockii | 1 Jun 2010 13:12

DIH, Full-Import, DB and Performance.


Hello..

We have about 4 million products in our database and the import takes
about 1.5 hours. During this time the performance of the database is very bad
and our server sometimes crashes. It seems that DIH sends only ONE select
to the db ?!?! Is that right?

All other processes cannot connect to the db =(...

That's very bad! What is the best way to improve a full-import
so that we don't have such problems? An import with PHP takes too
long for us!

That's the query:
query="select *
       FROM items_de.shop_items as i, shops as s
       WHERE s.id=i.shop_id AND s.is_active=1 AND s.is_testmode=0 AND parent_id IS NULL" >

AND the Mappings for the categories:
<entity name="item_category" pk="id, shop_item_id" dataSource="items"
        query="select shop_category_id, order_index FROM shop_item_category_mappings WHERE shop_item_id='${item.id}'" >

  <field column="shop_category_id" name="shop_category_id" />
  <field column="order_index" name="popularity" />

  <entity name="categoryName" pk="id" dataSource="items"
          query="select name, path from shop_categories where id =

Darren Govoni | 1 Jun 2010 13:14

Spatial Query with LatLonType

Hi,
  I read over the SpatialWiki. It wasn't clear how to query for
documents with LatLon fields that lie inside a specific bounding box
(as opposed to within a distance of a point). Simply put, I have a Google map
and want to construct a query for single LatLon fields that are inside
the map view (between the lat/lon corners).

A range filter won't work because lat and lon are not separate fields in this
case (and that doesn't produce correct results for me anyway).

thanks for any tips.

Darren
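
One thing that may be worth trying, assuming the Solr build in use accepts
range queries directly on a LatLonType field, is a filter of the form
field:[minLat,minLon TO maxLat,maxLon]. The sketch below only builds that
string from two opposite map corners; the field name store is hypothetical,
and a map view that crosses the 180th meridian is not handled.

```java
final class BoundingBoxQueryDemo {
    /**
     * Builds a range filter on a single lat/lon field from two opposite corners.
     * Does not handle boxes that cross the 180th meridian.
     */
    static String boundingBoxFilter(String field, double lat1, double lon1, double lat2, double lon2) {
        double south = Math.min(lat1, lat2);
        double north = Math.max(lat1, lat2);
        double west = Math.min(lon1, lon2);
        double east = Math.max(lon1, lon2);
        // Lower-left corner first, then upper-right corner.
        return field + ":[" + south + "," + west + " TO " + north + "," + east + "]";
    }

    public static void main(String[] args) {
        // Corners may be given in any order; the coordinates are made up.
        System.out.println("fq=" + boundingBoxFilter("store", 46, -93, 45, -94));
    }
}
```

The resulting string would be passed as a filter query (fq) alongside the
main query.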
