Zeeshan Lakhani | 25 Mar 17:33 2015

Re: Riak2.0 with Solr Search: index on one node contains not all entries

Ok, no worries.

Also, sometimes, AAE may take a little time to complete the exchanges. Definitely read through my responses in http://lists.basho.com/pipermail/riak-users_lists.basho.com/2015-March/016926.html if you continue to see issues with Riak Search being out-of-sync.

Thanks.

Zeeshan Lakhani
programmer | 
software engineer at <at> basho | 
org. member/founder of <at> papers_we_love | paperswelove.org
twitter => <at> zeeshanlakhani

On Mar 25, 2015, at 12:27 PM, Michael Weibel <michael.weibel <at> gmail.com> wrote:

Hi Zeeshan,

Ok, will do that and report back as soon as I have it. It might take a while, though, because I first have to figure out whether I still have the same issue or not. ;)

Thanks!
Michael


2015-03-25 16:56 GMT+01:00 Zeeshan Lakhani <zlakhani <at> basho.com>:
Hey Michael,

Ideally, for this "testing" setup, n_val=2 would be the effective choice. I’d create a new bucket_type/bucket and re-PUT your data in and test search again to be sure.
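
To illustrate the n_val advice, here is a minimal Python sketch of replica placement on a two-node ring. It assumes that partitions alternate strictly between the two nodes (an idealised picture; the node names are hypothetical) and that a preference list is the n_val consecutive partitions starting at the key's partition. It does not diagnose the missing index entry; it only shows that with n_val=3 one node always carries two of the three replicas.

```python
from collections import Counter

RING_SIZE = 64                  # ring_num_partitions from the status output
NODES = ["node_a", "node_b"]    # hypothetical names for the two cluster members


def preflist(first_partition, n_val):
    """Owners of the n_val consecutive partitions starting at first_partition.

    Assumption: partition i is owned by NODES[i % 2], i.e. strict alternation.
    """
    return [NODES[((first_partition + i) % RING_SIZE) % len(NODES)]
            for i in range(n_val)]


def replicas_per_node(first_partition, n_val):
    """How many replicas of one key each node ends up holding."""
    return dict(Counter(preflist(first_partition, n_val)))


if __name__ == "__main__":
    for n_val in (2, 3):
        print(n_val, replicas_per_node(0, n_val))
```

With n_val=2 each node holds exactly one replica of every key, which is why it is the more natural setting for a two-node test cluster.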

Let me know. Thanks.

Zeeshan Lakhani
programmer | 
software engineer at <at> basho | 
org. member/founder of <at> papers_we_love | paperswelove.org
twitter => <at> zeeshanlakhani

On Mar 25, 2015, at 11:02 AM, Michael Weibel <michael.weibel <at> gmail.com> wrote:

Hi Zeeshan,

Thanks for your answer.
> Just to be sure, does your custom schema include the required fields, as mentioned in the docs: http://docs.basho.com/riak/latest/dev/advanced/search-schema/#Custom-Schemas?

Yes, I double checked that now to make sure, and the schema includes the required fields.
 
> Are these Riak nodes joined? What’s your ring size, n_val value?

They run in a cluster, yes. Output of "riak-admin status":

ring_creation_size : 64
ring_members : ['riak <at> IPADDRESS','riak <at> IPADDRESS']
ring_num_partitions : 64
ring_ownership : <<"[{'riak <at> IPADDRESS',32},{'riak <at> IPADDRESS',32}]">>
rings_reconciled : 0
rings_reconciled_total : 33

n_val is 3, the default. According to the docs, we should probably either add another node or reduce it to 2, correct? (Sorry, Riak newbie here :D)
 
> How are you querying the Solr nodes to know which node has the data and which one doesn't? Coverage is R=1, so you would be getting a different number on some search queries (using the standard /search/query/<index_name>?...) if it's only on one of the Solr cores.

Yes, exactly. That's how I figured out that there's a difference. Later on I queried the separate Solr instances using the Solr admin interface itself. I then also fetched the missing key on both Riak nodes (without going through Solr, just fetching it directly using the Riak HTTP API), and it exists on both nodes.
 
> Can you also post me a screenshot of your search AAE exchanges, e.g. `riak-admin search aae-status`? You could look at this thread, http://lists.basho.com/pipermail/riak-users_lists.basho.com/2015-March/016926.html, for answers on how to perform read-repair/repair the AAE tree.

aae-status is in the .log file attached.

So you'd propose to perform a read-repair on the AAE tree?

Best,
Michael

 

Thanks.

Zeeshan Lakhani
programmer | 
software engineer at <at> basho | 
org. member/founder of <at> papers_we_love | paperswelove.org
twitter => <at> zeeshanlakhani

On Mar 25, 2015, at 6:42 AM, Michael Weibel <michael.weibel <at> gmail.com> wrote:

Hi all,

In a test environment I have two Riak nodes, each with Solr activated and indexing 3 buckets using a custom schema.
After some back-and-forth testing, one entry is missing from the index on one Solr node (I know which node).
Fetching that key from the bucket works, however; both nodes have the entry.

1) How can this happen? I don't see any error/warning in the logs (neither solr nor riak logs). 
2) Is there a possibility to fix this without having to do e.g. a PUT on the specific key with the same content in order to update it?

I tried to run a repair on the failing node using the guide: http://docs.basho.com/riak/1.4.7/ops/running/recovery/repairing-indexes/#Repairing-Search-Indexes
However, when I ran the repair command on the partitions I was given, it returned a [{<PartitionId>, down}, {...}] response, which makes me uneasy; I haven't yet figured out what this means exactly.

Thanks a lot for your help :)

Best,
Michael
_______________________________________________
riak-users mailing list
riak-users <at> lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


<riak-admin-aae-status.log>



Michael Weibel | 25 Mar 11:42 2015

Riak2.0 with Solr Search: index on one node contains not all entries

Hi all,

In a test environment I have two Riak nodes, each with Solr activated and indexing 3 buckets using a custom schema.
After some back-and-forth testing, one entry is missing from the index on one Solr node (I know which node).
Fetching that key from the bucket works, however; both nodes have the entry.

1) How can this happen? I don't see any error/warning in the logs (neither solr nor riak logs).
2) Is there a possibility to fix this without having to do e.g. a PUT on the specific key with the same content in order to update it?

I tried to run a repair on the failing node using the guide: http://docs.basho.com/riak/1.4.7/ops/running/recovery/repairing-indexes/#Repairing-Search-Indexes
However, when I ran the repair command on the partitions I was given, it returned a [{<PartitionId>, down}, {...}] response, which makes me uneasy; I haven't yet figured out what this means exactly.

Thanks a lot for your help :)

Best,
Michael
John O'Brien | 24 Mar 17:36 2015

Discrepancies between /stats http endpoint and riak-admin stat show metrics...

Quick question...

Is it safe to assume that these two metrics endpoints are expected to display the same stats?

watch -n 1 riak-admin stat show **get_fsm_time**

shows 0s for node_get_fsm_time_99/95 (occasionally I see values build up for them, but they then drop back to 0).

The stats interface via :8098/stats seems to be more reliable, in that I never see 0 entries.

This is running Riak 2.0.5, although I'm pretty sure I recall seeing this on previous deployments.
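
For a side-by-side comparison with the CLI output, the HTTP side can be sampled with a few lines of Python. This is only a sketch: the base URL is an assumed local address, and the filter simply keeps every key containing "get_fsm_time", which covers the node_get_fsm_time_95/99 entries named above.

```python
import json
from urllib.request import urlopen


def fsm_time_stats(stats):
    """Pick the get_fsm_time entries out of a parsed /stats document."""
    return {key: value for key, value in stats.items() if "get_fsm_time" in key}


def fetch_stats(base_url="http://localhost:8098"):
    """Fetch and parse the /stats JSON (base_url is an assumed address)."""
    with urlopen(base_url + "/stats") as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print one sample; run it in a loop next to `riak-admin stat show`
    # to see whether the two sources diverge at the same moment.
    print(fsm_time_stats(fetch_stats()))
```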

Thanks,

John
Jonathan Koff | 23 Mar 16:25 2015

Ensembles failing to reach "Leader ready" state

Hi all,

I recently used Riak’s Strong Consistency functionality to get auto-incrementing IDs for a feature of an application I’m working on, and although this worked great in dev (5 nodes in 1 VM) and staging (3 servers across NA) environments, I’ve run into some odd behaviour in production (originally 3 servers, now 4) that prevents it from working.

I initially noticed that consistent requests were immediately failing as timeouts, and upon checking `riak-admin ensemble-status` saw that many ensembles were at 0 / 3, from the vantage point of the box I was SSH’d into. Interestingly, SSH-ing into different boxes showed different results. Here’s a brief snippet of what I see now, after adding a fourth server in a troubleshooting attempt:

*Machine 1* (104.131.39.61)

============================== Consensus System ===============================
Enabled:     true
Active:      true
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

================================== Ensembles ==================================
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       0 / 6         3 / 6      --
    2         0 / 3         3 / 3      --
    3         3 / 3         3 / 3      riak <at> 104.131.130.237
    4         3 / 3         3 / 3      riak <at> 104.131.130.237
    5         3 / 3         3 / 3      riak <at> 104.131.130.237
    6         0 / 3         3 / 3      --
    7         0 / 3         3 / 3      --
    8         0 / 3         3 / 3      --
    9         3 / 3         3 / 3      riak <at> 104.131.130.237
    10        3 / 3         3 / 3      riak <at> 104.131.130.237
    11        0 / 3         3 / 3      --

*Machine 2* (104.236.79.78)

============================== Consensus System ===============================
Enabled:     true
Active:      true
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

================================== Ensembles ==================================
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       0 / 6         3 / 6      --
    2         3 / 3         3 / 3      riak <at> 104.236.79.78
    3         3 / 3         3 / 3      riak <at> 104.131.130.237
    4         3 / 3         3 / 3      riak <at> 104.131.130.237
    5         3 / 3         3 / 3      riak <at> 104.131.130.237
    6         3 / 3         3 / 3      riak <at> 104.236.79.78
    7         0 / 3         3 / 3      --
    8         0 / 3         3 / 3      --
    9         3 / 3         3 / 3      riak <at> 104.131.130.237
    10        3 / 3         3 / 3      riak <at> 104.131.130.237
    11        3 / 3         3 / 3      riak <at> 104.236.79.78

*Machine 3* (104.131.130.237)

============================== Consensus System ===============================
Enabled:     true
Active:      true
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

================================== Ensembles ==================================
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       0 / 6         3 / 6      --
    2         0 / 3         3 / 3      --
    3         3 / 3         3 / 3      riak <at> 104.131.130.237
    4         3 / 3         3 / 3      riak <at> 104.131.130.237
    5         3 / 3         3 / 3      riak <at> 104.131.130.237
    6         0 / 3         3 / 3      --
    7         0 / 3         3 / 3      --
    8         0 / 3         3 / 3      --
    9         3 / 3         3 / 3      riak <at> 104.131.130.237
    10        3 / 3         3 / 3      riak <at> 104.131.130.237
    11        0 / 3         3 / 3      --

*Machine 4* (162.243.5.87)

============================== Consensus System ===============================
Enabled:     true
Active:      true
Ring Ready:  true
Validation:  strong (trusted majority required)
Metadata:    best-effort replication (asynchronous)

================================== Ensembles ==================================
 Ensemble     Quorum        Nodes      Leader
-------------------------------------------------------------------------------
   root       0 / 6         3 / 6      --
    2         3 / 3         3 / 3      riak <at> 104.236.79.78
    3         3 / 3         3 / 3      riak <at> 104.131.130.237
    4         3 / 3         3 / 3      riak <at> 104.131.130.237
    5         3 / 3         3 / 3      riak <at> 104.131.130.237
    6         3 / 3         3 / 3      riak <at> 104.236.79.78
    7         3 / 3         3 / 3      riak <at> 162.243.5.87
    8         3 / 3         3 / 3      riak <at> 162.243.5.87
    9         3 / 3         3 / 3      riak <at> 104.131.130.237
    10        3 / 3         3 / 3      riak <at> 104.131.130.237
    11        3 / 3         3 / 3      riak <at> 104.236.79.78


Interestingly, Machine 4 has full quora for all ensembles except for root, while Machine 3 only sees itself as a leader.
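
Comparing the four views by eye is error-prone, so here is a small Python sketch that parses the "Ensembles" table from saved `riak-admin ensemble-status` output and reports which ensembles are leaderless on some machines but not on others. The row format is taken from the output above; the function names are hypothetical and nothing here talks to Riak directly.

```python
import re

# One table row: ensemble id, "q / n" quorum, "q / n" nodes, leader or "--".
ROW = re.compile(
    r"^\s*(?P<ensemble>\S+)\s+\d+\s*/\s*\d+\s+\d+\s*/\s*\d+\s+(?P<leader>.+?)\s*$"
)


def leaderless(status_text):
    """Ensemble ids whose Leader column is '--' in one machine's output."""
    found = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match and match.group("leader") == "--":
            found.append(match.group("ensemble"))
    return found


def disagreements(views):
    """Ensembles that are leaderless on some machines but not on all.

    `views` maps a machine label to its saved ensemble-status text.
    """
    sets = [set(leaderless(text)) for text in views.values()]
    if not sets:
        return set()
    return set.union(*sets) - set.intersection(*sets)
```

Feeding it the four outputs above would single out the ensembles whose state depends on the vantage point, as opposed to root, which is leaderless everywhere.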

Another interesting point is the output of `riak-admin ensemble-status root`:

================================= Ensemble #1 =================================
Id:           root
Leader:       --
Leader ready: false

==================================== Peers ====================================
 Peer  Status     Trusted          Epoch         Node
-------------------------------------------------------------------------------
  1    (offline)    --              --           riak <at> 104.131.45.32
  2      probe      no              8            riak <at> 104.131.130.237
  3    (offline)    --              --           riak <at> 104.131.141.237
  4    (offline)    --              --           riak <at> 104.131.199.79
  5      probe      no              8            riak <at> 104.236.79.78
  6      probe      no              8            riak <at> 162.243.5.87

This is consistent across all 4 machines, and seems to include some old IPs from machines that left the cluster quite a while back, almost definitely before I’d used Riak's Strong Consistency. Note that the reason I added the fourth machine (104.131.39.61) was to see if this output would change, perhaps resulting in a quorum for the root ensemble.

For reference, here’s the status of a sample ensemble that isn’t “Leader ready”, from the perspective of Machine 2:
================================ Ensemble #62 =================================
Id:           {kv,1370157784997721485815954530671515330927436759040,3}
Leader:       --
Leader ready: false

==================================== Peers ====================================
 Peer  Status     Trusted          Epoch         Node
-------------------------------------------------------------------------------
  1    following    yes             43           riak <at> 104.131.130.237
  2    following    yes             43           riak <at> 104.236.79.78
  3     leading     yes             43           riak <at> 162.243.5.87


My config consists of riak.conf with:

strong_consistency = on

and advanced.config with:

[
  {riak_core,
    [
      {target_n_val, 5}
    ]},
  {riak_ensemble,
    [
      {ensemble_tick, 5000}
    ]}
].

though I’ve experimented with the latter in an attempt to get this resolved.

I didn’t see any relevant-looking log output on any of the servers.

Has anyone come across this before?

Thanks!

Jonathan Koff B.CS.
co-founder of Projexity

follow us on facebook at: www.facebook.com/projexity
follow us on twitter at: twitter.com/projexity

Toby Corkindale | 23 Mar 04:54 2015

Prometheus stats gathering for Riak? Riak CS?

Hi,
I wondered if anyone has written a stats gathering plugin for Prometheus?
It doesn't look like it'd be too hard to do; but I'm still lazy enough to hope that someone else has done it first :)
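
As a rough idea of how small such an exporter could be, here is a hedged Python sketch that turns an already-parsed Riak /stats JSON document into Prometheus text exposition lines. The "riak_" prefix and the function name are assumptions, only numeric values are exported, and it is not an existing plugin.

```python
import re


def to_prometheus(stats, prefix="riak_"):
    """Render the numeric entries of a /stats dict as 'name value' lines."""
    lines = []
    for name in sorted(stats):
        value = stats[name]
        # bool is a subclass of int, so exclude it explicitly together with
        # strings, lists and other non-numeric values.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        # Prometheus metric names may only contain [a-zA-Z0-9_:].
        metric = prefix + re.sub(r"[^a-zA-Z0-9_]", "_", name)
        lines.append(f"{metric} {value}\n")
    return "".join(lines)
```

Serving the returned text from an HTTP endpoint that Prometheus scrapes would be the remaining piece.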


Toby
Jason W | 23 Mar 03:03 2015

riak search java client - sql injection

Hello,

I'm trying to use the Riak Search Java client, specifically the Search.Builder class, like the following:

Search search = new Search.Builder("test", "_yz_rb:accounts AND email:" + [user-email]).


"[user-email]" is what the user entered in the login form. My question is about SQL injection: it seems that the Java search client API doesn't prevent it. Are there any other APIs/methods that I can use to prevent this? Thank you.

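
Regarding the login-form query in the message above: one common defence is to backslash-escape the Lucene/Solr query metacharacters in the user-supplied value before concatenating it, similar in spirit to SolrJ's ClientUtils.escapeQueryChars. The sketch below is in Python for brevity and the same logic ports to Java; the helper names are hypothetical and the character list is an assumption based on the standard Lucene query syntax.

```python
import re

# Lucene/Solr query metacharacters plus whitespace, each escaped with "\".
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|]|\s)')


def escape_solr_term(text):
    """Backslash-escape characters that have meaning in a Solr query."""
    return _SPECIAL.sub(r"\\\1", text)


def build_login_query(user_email):
    """Build the query from the question with the user input neutralised."""
    return "_yz_rb:accounts AND email:" + escape_solr_term(user_email)
```

With this in place an input such as `x OR *:*` is searched for as a literal string instead of widening the query.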
Matt Brooks | 22 Mar 21:13 2015

Trouble with Riak Search JSON Extractor

Hello,

I have a quick question about a search schema that would index an array of strings in a JSON object. I am storing user data in JSON that looks something like this:

{ "name" : "John Smith", "email" : "jsmith <at> gmail.com", "groups" : [ "3304cf79", "abe155cf" ] }
The custom schema I use for users includes this field entry: 

<field name="groups" type="string" indexed="true" stored="false" multiValued="true"/>
I also have the following field type entry: 

<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
I assumed that these entries would allow me to search for users using a query like: 

curl $RIAK/search/query/user?wt=json&q=groups:3304cf79
But no docs are returned. I'm almost certain I have an issue with my schema. How can I query users by string entries in the "groups" array?
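
One thing that can be ruled out independently of the schema: when a URL containing & is passed to a shell without quotes, the shell treats & as a command separator, so the q= parameter never reaches Riak. Building the URL programmatically avoids that. The Python sketch below only constructs the URL; the base address in the usage line is an assumption and the function name is hypothetical.

```python
from urllib.parse import urlencode


def search_url(base, index, query):
    """Build a Riak Search query URL with properly encoded parameters."""
    params = urlencode({"wt": "json", "q": query})
    return f"{base}/search/query/{index}?{params}"


if __name__ == "__main__":
    # Assumed local node; substitute the value of $RIAK.
    print(search_url("http://localhost:8098", "user", "groups:3304cf79"))
```

The printed URL can be passed to curl inside double quotes, or fetched directly with urllib.request.urlopen.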

Thank you, 
Matt.

Mohamed Abd Elhafez | 18 Mar 23:25 2015

How to run basho bench distributed

Hi guys,

I am very new to Riak and I am trying to run some benchmarks for our cluster.

I saw this blog post about running basho bench distributed to be able to get high throughput.

I also found the example config that can run in distributed mode, but when I try to run it:

./basho_bench examples/riakc_pb_distributed.config, I get the message: "Basho bench not started in distributed mode, and distribute_work = true" 

Is there some documentation on how to run basho bench distributed?


Thanks,

Mohamed 

Matthew Brender | 13 Mar 17:40 2015

Riak Recap - March 13 2015

Hello Riakators!
 
Welcome back to The Recap. We're trying out another new format this week.
 
## Feedback
The first change you'll notice is that we're going with plaintext this time. Do you like it? Do you prefer Markdown? 
* Give a definitive answer in this 3 question survey: https://surveyplanet.com/5501f6ac52dada35360b26a5
 
## Important announcements
We've had two advisories worth noting:
* Advisory in response to POODLE security concern
* Advisory for 2.0.4 and Map Data Type Disk Incompatibility
 
 
## Recently answered
* Christopher answered a question on link walking data modeling [1]
* There is a great thread on recommended Operating System for Riak starting here [2]
* Shawn found a discrepancy between Riak CS & S3 for setting ACLs and opened an issue for it [3]
* Steve & Zeeshan have a dialogue about this known issue with Solr [4]
* Patrick ran into a Riak CS disagreement with URLs that's a known issue [5]
* And there are more Riak CS & Solr conversations that can be reviewed here [6]
 
 
## Still open
I found no threads that I needed to ask people to help solve this week. That's exciting!
 
## Event news
* NYC Riak Meetup happening this Wednesday - you can sign up here [7]
* Incredible engineers are speaking at Erlang Factory at the end of the month [8]
* We have a speaking slot at Enterprise Data World in D.C. [9]
* Manu and others will be at NoSQL Matters Paris [10]
 
 
## For the weekend
* Former Basho CTO Justin Sheehy wrote on the challenges of time in distributed systems, titled "There is No Now" [11]
 
 
 
Thanks for reading!
Matt
 
Developer Advocate
Basho Technologies
-----------------------------------

Roma Lakotko | 12 Mar 10:59 2015

Different numFound request to riak search

Each request to Riak Search returns different results: numFound differs from one request to the next.

I use request like this:

http://localhost:8098/search/query/assets?wt=json&q=type:*&sort=_yz_rk%20asc

If I add a start offset, it can return:

"response": { "numFound": 1248, "start": 1247, "docs": [ { "_yz_id": "1*default*assets*fff63ecf-a0c4-4ecf-b24d-c493ca3a302f*44", "_yz_rk": "fff63ecf-a0c4-4ecf-b24d-c493ca3a302f", "_yz_rt": "default", "_yz_rb": "assets" } ] }
On the next request it returns something like this:
"numFound": 1224, "start": 1247, "docs": []
I have a 1-node installation, and no process writes to Riak. I have the same problem on a production cluster with 7 nodes.
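
To put numbers on the variation, the same query can be repeated and the numFound values tallied. The Python sketch below assumes the response layout shown above ("response" containing "numFound") and an assumed local node address; it only measures the spread and does not explain it.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen


def num_found_counts(responses):
    """Tally how often each numFound value occurs in parsed responses."""
    return Counter(r["response"]["numFound"] for r in responses)


def sample(base_url="http://localhost:8098", index="assets", runs=20):
    """Run the query from the message `runs` times and collect the replies."""
    params = urlencode({"wt": "json", "q": "type:*", "sort": "_yz_rk asc"})
    url = f"{base_url}/search/query/{index}?{params}"
    replies = []
    for _ in range(runs):
        with urlopen(url) as resp:
            replies.append(json.load(resp))
    return replies


if __name__ == "__main__":
    print(num_found_counts(sample()))
```

A single value in the tally means the results are stable; several values confirm that identical queries are being answered differently.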
Schema for the documents:
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="schedule" version="1.5">
  <fields>
    <field name="objectId" type="string_ci" indexed="true" stored="false" />
    <field name="type" type="string_ci" indexed="true" stored="false" />
    <field name="objectType" type="string_ci" indexed="true" stored="false" />
    <field name="contentType" type="string_ci" indexed="true" stored="false" />
    <field name="properties" type="string_ci" indexed="true" stored="false" multiValued="true" />
    <field name="tag" type="string_ci" indexed="true" stored="false" />
    <field name="isUploaded" type="boolean" indexed="true" stored="false" />
    <field name="published" type="boolean" indexed="true" stored="false" />
    <field name="drm" type="boolean" indexed="true" stored="false" />
    <field name="dateCreated" type="date" indexed="true" stored="false" />

    <!-- All of these fields are required by Riak Search -->
    <field name="_yz_id" type="_yz_str" indexed="true" stored="true" multiValued="false" required="true"/>
    <field name="_yz_ed" type="_yz_str" indexed="true" stored="false" multiValued="false"/>
    <field name="_yz_pn" type="_yz_str" indexed="true" stored="false" multiValued="false"/>
    <field name="_yz_fpn" type="_yz_str" indexed="true" stored="false" multiValued="false"/>
    <field name="_yz_vtag" type="_yz_str" indexed="true" stored="false" multiValued="false"/>
    <field name="_yz_rk" type="_yz_str" indexed="true" stored="true" multiValued="false"/>
    <field name="_yz_rt" type="_yz_str" indexed="true" stored="true" multiValued="false"/>
    <field name="_yz_rb" type="_yz_str" indexed="true" stored="true" multiValued="false"/>
    <field name="_yz_err" type="_yz_str" indexed="true" stored="false" multiValued="false"/>

    <dynamicField name="*" type="ignored"/>
  </fields>

  <uniqueKey>_yz_id</uniqueKey>

  <types>
    <!-- YZ String: Used for non-analyzed fields text_ru -->
    <fieldType name="date" class="solr.TrieDateField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="double" class="solr.TrieDoubleField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="int" class="solr.TrieIntField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="_yz_str" class="solr.StrField" sortMissingLast="true" />
    <fieldtype name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" />
    <fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class='solr.PatternReplaceFilterFactory' pattern='ё' replacement='е' replace='all'/>
      </analyzer>
    </fieldType>
  </types>
</schema>

Best regards,
Roman
Patrick F. Marques | 11 Mar 19:32 2015

putObjectPolicy fails due to wrong Content-Type

Hi,

I'm trying to use node-aws-sdk to set bucket policies, but the SDK sends the request with the wrong Content-Type, "application/octet-stream". I'm not sure if there is something wrong on my side or if AWS S3 also supports this, since it is its SDK.

To find the problem I had to dig into the SDK code, because the server response is just a 500 Internal Server Error. It would be nice to return at least something like "Unaccepted Content-Type in request".

Regards,
Patrick Marques
