RE: Non empty error stack failing non-blocking SSL IO
Uri Simchoni <uris <at> ctera.com>
2011-01-04 05:26:01 GMT
I realize that I must be doing all that. The difference I see from errno (and the reason I wrote this) is that if
you fail to read errno, it does not affect the outcome of the NEXT system call (save for a few documented cases
which specifically instruct you to clear errno before calling the function). That strikes me as a design problem.
It's difficult, in a large system, to make sure that everyone "plays by the book", and with non-blocking IO
it's common that the same thread deals with unrelated tasks (with a select() loop and "socket handlers").
So what can happen here is that "my code" runs OpenSSL with full error checking, and "somebody else's" code
runs OpenSSL with no error checking, breaking "my code". It's preferable for a general-purpose library
to be designed to avoid such scenarios, and in this particular case it appears to have a solution - check
socket blocking state before checking the error queue. That's what I was getting at.
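
To illustrate the errno comparison above, here is a minimal sketch, assuming a POSIX system. The function name errno_after_empty_read() is made up for this example. It leaves a stale value in errno, as if an earlier failure had gone unchecked, and then reads from an empty non-blocking pipe. The read() still reports EAGAIN, because the stale value has no influence on the next call.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Leaves a stale value in errno, then reads from an empty non-blocking pipe.
 * Returns the errno reported by that read(), or -1 if setup failed or the
 * read unexpectedly succeeded. */
int errno_after_empty_read(int stale_errno)
{
    int fds[2];
    char byte;
    int result = -1;

    if (pipe(fds) != 0)
        return -1;

    int flags = fcntl(fds[0], F_GETFL);
    if (flags != -1 && fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) != -1) {
        errno = stale_errno;            /* simulate an error nobody checked */
        if (read(fds[0], &byte, 1) == -1)
            result = errno;             /* value set by this read() alone */
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Calling errno_after_empty_read(ENOENT) should return EAGAIN on a typical system, whatever stale value is passed in.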
From: owner-openssl-dev <at> openssl.org [mailto:owner-openssl-dev <at> openssl.org] On Behalf Of aerowolf <at> gmail.com
Sent: Tuesday, January 04, 2011 5:51 AM
To: openssl-dev <at> openssl.org
Subject: Re: Non empty error stack failing non-blocking SSL IO
If your program ignores the error queue, your program is doing the equivalent of not checking errno after
every system call. The program is required to deal with the error queue, because it is OpenSSL's only
mechanism for informing the application code of the wide variety of potential protocol and
The program should absolutely not be doing the same things in the cases of SSL_get_error() returning
SSL_ERROR_SSL and SSL_ERROR_WANT_READ. (It may be that someone missed a break statement at the end of one case
and it's falling through to the next.) Either way, this is not anomalous behavior on OpenSSL's part.
After you call SSL_read() and get zero bytes, you must determine why you got zero bytes, and that's where you
should call SSL_get_error(). If it returns SSL_ERROR_SSL, you must check the error queue to determine
exactly why the SSL session is in an error state. (The queue exists because you're supposed to be
interested in and handle every error that comes up in the process, not merely the most recent one.)
On Mon, Jan 3, 2011 at 4:22 AM, Uri Simchoni <uris <at> ctera.com> wrote:
> I’m using OpenSSL 0.9.8i, and have noticed the following scenario:
> - Some OpenSSL crypto function returns with an error, leaving a
> description of the error on the error queue
> - The application neglects to call ERR_clear_error()
> - SSL_read() is then called on a non-blocking socket and returns
> because there’s no input available
> - Calling SSL_get_error() returns SSL_ERROR_SSL instead of
> SSL_ERROR_WANT_READ, because the error queue is not empty.
> Would it be possible to modify the code so that the socket's would-block
> state takes precedence over the error queue?
> If not, what is the recommended programming practice with non-blocking
> - ensure that everybody calls ERR_clear_error() after an error
> - call ERR_clear_error() before SSL read/write (but if that’s
> recommended why isn’t it inside SSL_read/SSL_write)
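
The two orderings discussed in this thread can be compared with a toy model. Nothing below is OpenSSL code: the toy_state struct, the toy_error codes and both classify functions are hypothetical, and the model covers only the two conditions mentioned in the report (a non-empty error queue and a read that would block).

```c
#include <stdbool.h>

/* Hypothetical result codes for this toy model (not OpenSSL constants). */
enum toy_error { TOY_ERROR_NONE, TOY_ERROR_WANT_READ, TOY_ERROR_SSL };

/* What the model can observe after a read that returned no data. */
struct toy_state {
    int  queued_errors;  /* entries left on the thread's error queue */
    bool would_block;    /* the transport reported "no input yet" */
};

/* Ordering described in the report: a non-empty queue wins. */
enum toy_error classify_queue_first(const struct toy_state *s)
{
    if (s->queued_errors > 0)
        return TOY_ERROR_SSL;
    if (s->would_block)
        return TOY_ERROR_WANT_READ;
    return TOY_ERROR_NONE;
}

/* Ordering proposed in the report: the would-block state wins. */
enum toy_error classify_io_first(const struct toy_state *s)
{
    if (s->would_block)
        return TOY_ERROR_WANT_READ;
    if (s->queued_errors > 0)
        return TOY_ERROR_SSL;
    return TOY_ERROR_NONE;
}
```

With one stale queue entry and a read that would block, the first function reports TOY_ERROR_SSL and the second reports TOY_ERROR_WANT_READ. The two orderings differ only in that combination.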