Toby Corkindale | 6 Jul 06:40 2015

Re: haproxy issues with protocol buffers on 2.x releases

Ah hah!
Coming back to this problem after a month seems to have jogged my thoughts. I've figured it out.

The Basho guide to haproxy for Riak CS specifies:
    timeout client    5000
    timeout server    5000

Those are the values in the example configuration, and they're in milliseconds. So if the client or server does not send any data over the connection for 5000 ms, haproxy considers it dead and closes the connection.

On a moderately well-loaded production system, you probably have enough data coming and going to keep those connections alive, but on a testing instance the connections will be getting closed down all the time.

My fix is to add these lines:
    timeout tunnel 7d
    timeout client-fin 30s
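For reference, the timeout section of my haproxy.cfg now looks like this (the 5000 ms values are from the Basho example; 7d and 30s are my own experiment, not an official recommendation):

    timeout client     5000
    timeout server     5000
    timeout tunnel     7d
    timeout client-fin 30s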

But I'd appreciate your thoughts too.


On Mon, 6 Jul 2015 at 13:13 Toby Corkindale <toby <at>> wrote:
Hi Matt,
I've tested against both haproxy 1.4 and 1.5, both Riak 2.0.5 and 2.1.1, and Riak CS 2.0.1 with Stanchion 2.0.0.
I've tested both KVM and LXC virtualisation.

In all combinations I tested, the problem persists. If Riak CS connects to Riak via haproxy, then Riak CS frequently loses the connection and returns a failure to the s3 client.

The failures show up very quickly, if you want to try to replicate this. I'll upload the configurations for you. They're for a three-node cluster, which is obviously too small for production, but should be the minimum needed to demonstrate this bug, right? I'm sure it'd manifest on a four or five node cluster too.

I run this command:
    perl; s3cmd -c s3cfg sync *.txt s3://test/
(The Perl part just creates 100 small unique files by writing a bit of junk and an index value to each.)
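For anyone trying to reproduce without the Perl one-liner, here's a stand-in shell loop (my substitute, not the original script) that creates an equivalent set of files:

```shell
# Create 100 small unique .txt files to sync with s3cmd,
# mimicking the elided Perl one-liner above.
mkdir -p testfiles
for i in $(seq 1 100); do
    printf 'junk-data-%s\n' "$i" > "testfiles/file-$i.txt"
done
# Then: s3cmd -c s3cfg sync testfiles/*.txt s3://test/
```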

If Riak-CS is talking to the haproxy port instead of directly to Riak, it'll fail partway through the sync. Every time. The number of files it gets through varies between 1 and 40ish, I'd say, but I haven't actually kept track.


On Sat, 4 Jul 2015 at 01:30 Matthew Brender <mbrender <at>> wrote:
Hey Toby, 

Did you find anything further during this testing? I'd love to make sure others on riak-user know how to configure local testing to prevent this situation. 


Matt Brender | Developer Advocacy Lead
Basho Technologies
c: +1 617.817.3195 

On Fri, Jun 5, 2015 at 2:16 AM, Toby Corkindale <toby <at>> wrote:
Hi Kota,
Our production nodes are Riak CS 1.5 and Riak 1.4.x -- they're running haproxy 1.4.x, and it's all been happy for some time now.

Testing the new nodes, still the same haproxy versions, but Riak CS 2.0.1 and Riak 2.0.5.
Very confused as to why the connections are being dropped when going through haproxy. The problem persists even after restarting CS.
I tried staggering the restarts.. increasing the PB request pool.. etc..  no change.

But it works fine if CS connects directly to the localhost riak pb.
(Which isn't a great idea: big Riak instances sometimes take too long to start, and if CS points at localhost it can come up too fast, fail to connect, and fall over.)

Confusing! I'm wondering if it's because the testing machines are in virtual machines, compared to production which is real hardware.
But.. normally haproxy still works fine on VMs.

I'll continue to play around.. Must be something that's botched on the testing setup... but don't want to replicate that into production!

On Fri, 5 Jun 2015 at 13:59 Kota Uenishi <kota <at>> wrote:

As PB connection management hasn't changed between CS 1.5 and
2.0, I think it should work. With which version was the load
balancing working stably? It depends on why the connection was cut,
but I would recommend restarting just the CS node to recreate the
connection pool.

On Thu, Jun 4, 2015 at 2:33 PM, Toby Corkindale <toby <at>> wrote:
> Hi,
> I've been happily using haproxy in front of Riak and Riak CS 1.x in
> production for quite a while.
> I've been trying to bring up a new cluster based on riak/cs 2.0.x recently,
> as you've probably noticed from the flurry of emails to this list :)
> I'm discovering that if I have haproxy sitting between riak-cs and riak,
> then I get a lot of errors about disconnections. Initially I thought this
> must be related to pb backlogs or pool sizes or file handle limits -- but
> I've played with all those things to no avail.
> I *have* noticed that if I get riak-cs to connect directly to a riak
> (bypassing haproxy) then everything is fine, including with the original
> default request pool and backlog sizes.
> I am essentially using the recommended haproxy.cfg, which has worked fine in
> production elsewhere.
> Any suggestions?
> Error message sample follows:
> 2015-06-04 15:26:16.447 [warning] <0.283.0> <at> riak_cs_riak_client:get_user_with_pbc:293 Fetching user record with strong option failed: disconnected
> 2015-06-04 15:26:16.447 [warning] <0.2095.0> <at> riak_cs_pbc:check_connection_status:97 Connection status of <0.287.0> at maybe_create_user: {false,[]}
riak-users mailing list
riak-users <at>
Matthew Brender | 3 Jul 18:09 2015

Riak Recap - July 3, 2015

Here's The Recap!

## Basho Docs Announcement
* A brief reminder that Basho's documentation is going through a series of updates in the coming months. Be sure to review the "How To Contribute" section of our README [0] before opening a PR.

## ICYMI Code Announcements
* The Riak Ruby client 2.2.1 dropped recently. See the announcement details from Bryce [1]

## Recently answered
* There is a thoughtful thread about mass-deletion of keys from a bucket [2]
* Toby found that the Riak S2 (aka CS) documentation should mention Stanchion 2.0, not 1.5 [3]
* Toby ran into an environmental issue with local testing on Riak S2 (aka CS) with haproxy [4]
* Alexander has an indexing problem that Zeeshan helped resolve here [5]
* Matthew Von-Maszewski helped Joe configure his multi-tier leveldb backend [6]
* Alexander gives a thorough explanation about ring size in a question about a potentially missing key [7]
* Bryan helped find an old race condition of 2.0.1 in Alex's cluster [8]
* Cian digs into memory_ets in response to Meltwater's blog post [9]
* Cosmin and Sargun help think through whether Strong Consistency is needed for a new deployment [10]
* Jaska ran into an Erlang version error during installation that Drew helped along [11]
* Henning was able to reproduce a condition with the Java client that Luke opened an issue for [12]

## Worth Re-reading
* How do you manage your Riak cluster on AWS? Share your ideas with Alexander here [13]

## New projects
* Customers at Sqor open sourced an AWS backup tool for Riak. Thank you for sharing it! [14]
* I started to gather Vagrantfiles for local testing of Riak KV here [15]. Please PR!

## What'd we miss?
Have you brought Riak into production at a new place? Did you find a blog or presentation helpful? Good! Share it in our release notes:

And for everyone celebrating, Happy 4th of July! Enjoy the fireworks.

       .''.      .        *''*    :_\/_:     .
      :_\/_:   _\(/_  .:.*_\/_*   : /\ :  .'.:.'.
  .''.: /\ :    /)\   ':'* /\ *  : '..'.  -=:o:=-
 :_\/_:'.:::.  | ' *''*    * '.\'/.'_\(/_ '.':'.'
 : /\ : :::::  =  *_\/_*     -= o =- /)\     '  *
  '..'  ':::' === * /\ *     .'/.\'.  ' ._____
      *        |   *..*         :       |.   |' .---"|
        *      |     _           .--'|  ||   | _|    |
        *      |  .-'|       __  |   |  |    ||      |
     .-----.   |  |' |  ||  |  | |   |  |    ||      |
 ___'       ' /"\ |  '-."".    '-'   '-.'    '`      |____
                       ~-~-~-~-~-~-~-~-~-~   /|
          )      ~-~-~-~-~-~-~-~  /|~       /_|\
        _-H-__  -~-~-~-~-~-~     /_|\    -~======-~
~-\XXXXXXXXXX/~     ~-~-~-~     /__|_\ ~-~-~-~
~-~-~-~-~-~    ~-~~-~-~-~-~    ========  ~-~-~-~
      ~-~~-~-~-~-~-~-~-~-~-~-~-~ ~-~~-~-~-~-~

Share and enjoy,
Matt Brender
Developer Advocacy <at> Basho
<at> mjbrender

Anita Wong | 2 Jul 13:55 2015

Riak-CS RAM usage


I’ve got a problem using Riak-CS as a file-storage cluster.

Now the bitcask folder has 9GB of data in it, and the RAM usage of Riak is >8GB.

I’ve just read that the bitcask backend keeps all keys in RAM, but how could that lead to this much RAM being used?

Is there any way to change Riak-CS to use leveldb instead of bitcask? Also, I currently have lots of
data in bitcask; how can I migrate it to leveldb?



Wendy Liu | 1 Jul 17:47 2015

Unexpected stored value for multivalued field in riak search

Hi all,

I'm having an issue with stored multiValued fields in Riak search not being stored the way I would expect.

If I use the default schema, and store the following JSON document (from the documentation)

{"people_ss": ["Ryan", "Eric", "Brett"]}

and then search with the query "people_ss:Ryan", the document that I get back contains "people_ss": ["Ryan", "Eric", "Brett"], as I would expect.

However, if I instead create the document

{"people_ss": ["Ryan"]}

then searching with the query "people_ss:Ryan" results in a document with "people_ss": "Ryan", where the value associated with "people_ss" is a string, not a list of strings as I would expect.

I couldn't find anything about this in any of the Github issue trackers or the mailing list. Is this a bug, or desired behaviour? Is there a way to ensure that lists with a single element are stored in Solr as lists, instead of as strings?

I'm using the official Riak Python client with Riak 2.1.1 and search enabled.
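In the meantime, here's the kind of client-side normalization I'm considering as a workaround (a sketch only; the `_ss` suffix check follows the default schema's dynamic-field naming, and the helper name is my own):

```python
def normalize_multivalued(doc, multivalued_suffixes=("_ss",)):
    """Return a copy of a search result doc in which any field whose name
    ends with a multi-valued suffix is always a list, even when Solr has
    collapsed a single-element list down to a bare string."""
    out = {}
    for field, value in doc.items():
        if field.endswith(multivalued_suffixes) and not isinstance(value, list):
            out[field] = [value]
        else:
            out[field] = value
    return out

print(normalize_multivalued({"people_ss": "Ryan"}))
# {'people_ss': ['Ryan']}
```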

Thanks in advance!
Henning Verbeek | 30 Jun 13:17 2015

Java Client: minConnections

I've reported in the past that my application is seeing strange
exceptions when running under even the slightest load (see

The observation is that some threads simply hang when this exception
occurs, waiting forever with no indication of what is going on. This
unreliability is very bad for my application, where a lot of
Riak-related tasks are handled in background threads.

I keep trying to find out what is causing this, but I have no real
clue. However, after playing around with
RiakNode.Builder.min/maxConnections, the problem no longer occurs.

Specifically, I have now set:
RiakNode riakNode = new RiakNode.Builder()

If I leave minConnections at 1, the problem occurs, reproducibly. With
higher values, it doesn't.

Is there maybe an issue in the code? Are connections reaped too quickly?

Timo Gatsonides | 29 Jun 21:07 2015

handoff taking very, very long

I had a cluster with 7 Riak nodes. They have multiple backends: a LevelDB on “fast disc” and a LevelDB on “big disc”. The “big disc” was filling up, among other reasons because deletion of old data is not working properly (I’m still running 1.4.12, getting ready to switch to 2.x).

So I added a new node, number 8, with bigger discs, and planned a migration of all data from one of the nodes (number 5) to the new node. However, that has been running for almost a week now. I had to restart the node a few times because I hit a kernel BUG (seems like a race condition under heavy write activity, triggered after 10-20 hours, running ZFS … yes, ZFS on Linux, but I want a snapshot before migrating to Riak 2.x).

The directories on the NEW node actually hold more data than the OLD node; see the “du -sm” output below (new node first, then old):

434021 /bigdata/riak/bigleveldb/456719261665907161938651510223838443642478919680
456331 /bigdata/riak/bigleveldb/730750818665451459101842416358141509827966271488
413217 /bigdata/riak/bigleveldb/822094670998632891489572718402909198556462055424
409491 /bigdata/riak/bigleveldb/91343852333181432387730302044767688728495783936

411476 /bigdata/riak/bigleveldb/456719261665907161938651510223838443642478919680
412018 /bigdata/riak/bigleveldb/730750818665451459101842416358141509827966271488
376401 /bigdata/riak/bigleveldb/822094670998632891489572718402909198556462055424
386021 /bigdata/riak/bigleveldb/91343852333181432387730302044767688728495783936
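Summing each block with a quick one-liner (my addition for clarity; the inputs are the du figures above):

```shell
# Total the per-partition "du -sm" megabyte counts for the new node.
printf '434021\n456331\n413217\n409491\n' | awk '{s += $1} END {print s " MB"}'
# prints: 1713060 MB (the old node's block sums to 1585916 MB)
```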

I’ve also tried changing “transfer-limit”, to see if things would move more quickly with 1, 2 or 4, and also forcing a stop/start.

But it looks like the ownership will never complete.

Can someone please advise how to proceed? E.g. can I just mark node 5 as down with riak-admin, since it looks like the new node already has all the data? And it will read-repair whatever it’s missing? (Note that I’ve switched off AAE, as that doesn’t seem to work well with this data either.)

As a side note, I know that I’m using values that are considered too large for typical Riak usage, ranging from 1 to 10 MB per value.


jaska | 26 Jun 15:18 2015

Installing error??

What do these errors mean when installing Riak?


Missing plugins: [rebar_lock_deps_plugin]
ERROR: OTP release 18 does not match required regex R16|17

I have Erlang/OTP v17 installed and running:

I have OpenSSL installed (1.0.1f).

I did install all other required dependencies needed to install riak.

Can't seem to find a solution to this. Please help!

View this message in context:
Sent from the Riak Users mailing list archive at
Lauren Rother | 22 Jun 17:16 2015

Basho docs to no longer have 'master' branch.

Hi Community,

If you have or were thinking of contributing to the Basho docs project, a few things are changing that you should know about!

First and most importantly, in order to make sure we don't lose history, we're going to version our docs via branches.
What this means for you: We will be removing the master branch as of July 1st, 2015. Use the most current version branch as HEAD (right now, that's riak/2.1.1), and submit PRs to the version you want to fix/change/add to. (See instructions below on updating your existing fork.)

Second, we're moving to a point where we vastly prefer that you fork the basho_docs repo and submit a PR from there. 
What this means for you: See [1] for instructions on forking. (We know not everyone can do this, and we are not going to be monsters about it. It is simply a strong preference.)

Third, we're busy planning changes to the Docs platform. This will likely mean that more things and workflows are shifting around. Stay tuned for further announcements! 

As always, if you have any questions, concerns, or suggestions, please reply here or email us at docs <at>

The Basho Docs team

Instructions for updating your existing fork

If you had forked the basho_docs repo at some point and would like to continue contributing, you'll probably want to update your origin/HEAD to point at the most current version of the docs: riak/2.1.1. This will make sure you're pulling the most recent version of the doc you're updating. (We know this is a bit annoying and we are working on changes that will likely make this obsolete.)

Here's how: 
`git remote set-head upstream riak/2.1.1`

Note: the above command assumes you have forked the repo and set basho/basho_docs as your upstream remote.
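Put together, updating an existing fork might look like this (assumes the upstream remote described above; `my-docs-fix` is just an example branch name):

```shell
# Fetch the upstream version branches, point upstream/HEAD at the
# current version branch, and start a work branch from it.
git fetch upstream
git remote set-head upstream riak/2.1.1
git checkout -b my-docs-fix upstream/riak/2.1.1
```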

tele | 20 Jun 09:59 2015

uWSGI + Riak connection issues

Hi All,

I'm trying to troubleshoot an issue, and I'm posting here because it's
caused by connecting to Riak, even if I may be missing some configuration on
uwsgi. This is my environment:

nginx + uwsgi + flask app

The flask app uses Riak and Redis.
The connection between nginx and uwsgi is via unix socket.

If I use only one process in uwsgi I can easily run simultaneous
requests without hitting the issue I'm having. When I add even one
more process, all the workers get busy and the app hangs. If I remove
the Riak code, it works fine, so the issue has to be somewhere
in the connection pooling or something else.

I'm experiencing the same issues as this user:

If I use the protobuf protocol I hit DecodeError messages; sometimes I
don't get any error and the app just hangs. If I use the HTTP protocol with
Riak, I don't get any exception but it just hangs.

It hangs on a simple snippet:

user_bucket = riak_client.bucket_type('user_type').bucket('users')
user_info = user_bucket.get(user_id)

I'm using Locust to generate traffic:

1 uwsgi worker, locust 10 users hatch 2 seconds = no issues
2+ uwsgi workers, locust 10 users hatch 2 seconds = app hangs after a few

For Riak I have 3 nodes running on the same box; I'm using the latest
version from git.

The app hangs in any of those connection scenarios:

riak_client = riak.RiakClient(host='', pb_port=10017, protocol='pbc')

riak_client = riak.RiakClient(protocol='pbc',
    nodes=[{'host': '', 'pb_port': 10017},
           {'host': '', 'pb_port': 10027},
           {'host': '', 'pb_port': 10037}])

riak_client = riak.RiakClient(protocol='http', http_port=10018, host='')

riak_client = riak.RiakClient(protocol='http',
    nodes=[{'host': '', 'http_port': 10018},
           {'host': '', 'http_port': 10028},
           {'host': '', 'http_port': 10038}])

My uwsgi config is the following:

vhost = true
socket = /tmp/app.sock
venv = /opt/app/venv
chdir = /opt/app/
module = myapp
callable = app
processes = 2
master = true
post-buffering = 1
carbon =
stats = /tmp/stats.sock
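One thing I still want to try (just my own guess, not something anyone has confirmed): by default uwsgi loads the app in the master process and then forks, so a Riak connection created at import time ends up shared by all workers. uwsgi's lazy-apps option makes each worker load the app, and thus open its own connections, after the fork:

    lazy-apps = true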

If I sniff the network traffic, when it hangs uwsgi basically stops
sending any requests to Riak, all the workers become busy, and the only
way to restore it is to restart uwsgi.

My SW versions are the following:

Riak latest from git.

Python libs:
riak (2.2.0)
riak-pb (
protobuf (2.5.0)

UWSGI: 2.0.10

Any idea how I can troubleshoot this issue? It seems related to
uwsgi, but it's happening only when using the Riak connection.

Thank you

Bryce Kerley | 20 Jun 02:12 2015

[ANN] Riak Ruby Client 2.2.1

The Riak Ruby client version 2.2.1 is now available. This gem is an officially-supported client for the Riak key-value database, including support for key-value objects, buckets, bucket types, bucket properties, Riak Search/Yokozuna, secondary indexes, and more.

This version, 2.2.1, is a bugfix release, and includes additional testing of character encodings.

Bug fixes:

* Support bucket-typed buckets when creating secondary-index input phases
  for map-reduce, thanks to Takeshi Akima.
* Support Riak Search 2 / Yokozuna results as input phases for map-reduce,
  thanks again to Takeshi Akima.
* `BucketTyped::Bucket#get_index` now includes the bucket type name in the
  2i request.
* `Bucket#==` now performs an encoding-independent comparison on bucket names.
* `BucketType#==` also does an encoding-independent comparison on type names.

Testing enhancements:

* Non-ASCII UTF-8 strings, and binary strings containing byte 255 are tested
  with key-value, secondary index, CRDT, and Riak Search interfaces. These
  findings are available on our documentation site:

Updating your applications that use Bundler is easy: make sure the Gemfile doesn’t exclude 2.2.1, and `bundle update riak-client`. Gems and applications that transitively depend on riak-client may have a more complex update procedure, or it may be as simple as a `bundle update`. Make sure your tests pass before putting it into production!

Thanks to everyone who has participated in developing, testing, using, and improving this gem!

Bryce Kerley
Software Engineer
Matthew Brender | 19 Jun 18:09 2015

Riak Recap - June 19, 2015

The riak-user list has been incredibly active! Here are some of the
conversations that have not yet been part of The Recap.

## Basho Docs Announcement
* Basho's documentation is going through a series of updates in
the coming months. Be sure to review the "How To Contribute" section of
our README [0] before opening a PR

## ICYMI Code Announcements
* The Riak Ruby client 2.2.0 dropped recently. See the announcement
details from Bryce [1]

## Recently answered
* Sambit ran into an error after renaming a Riak node. Charles reminds
him of this thorough document on the matter [2]
* Konstantin was reassured about his rolling upgrade considerations [3]
and reviewed the requirements for leveldb [4]
* Sinh has a really interesting question regarding SolrSpatial with
Riak Search [5]
* Adam asked about a build for Ubuntu Trusty Tahr [6] and found what
he needed on Packagecloud [7]
* Riak S2 (aka CS) had an incorrect URL pointed out by Toby [8] and
corrected by Chris' PR [9]
* Robert's JVM had a memory mismatch between available memory and
allocated memory that Luke helped with [10]
* Cosmin simulated an error and requested a more detailed error [11]
* Shunichi pointed out a confusion that Toby ran into in his Riak S2
(aka CS) riak.conf file [12]. This complexity will be reduced in 2.1
* This thread, starting here [13], gets into the details of deleting
massive amounts of data from an existing Riak KV system

## Worth Re-reading
* If you haven't reviewed it already, it is well worth revisiting the
Syslog-ng usage of Riak KV here [14]
* If you're interested in the difference between Riak and Orleans, you
should review Christopher Meiklejohn's side-by-side comparison here

## Community Notes
* Webmachine is now its own organization. If you're into the
Webmachine code base, get involved [16]
* Have you brought Riak KV or S2 into production at a new place? Did
you find a blog or presentation helpful? Good! Share it in our release
notes
Share and enjoy,
Matt Brender
Developer Advocacy  <at>  Basho
 <at> mjbrender