marc.wyburn | 11 Sep 16:58 2014

protocol stream is incorrect when using to_sql

Hi,
I'm new to pandas and am trying to push some data to a SQL table: just a single column, with the SQL data type set to 'text'.
The error I receive is
09/11/2014 02:40:32 PM DEBUG:
(ProgrammingError) ('42000', '[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]
The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 4 (""):
The supplied value is not a valid instance of data type float.
Check the source data for invalid values.
An example of an invalid value is data of numeric type with scale greater than precision. (8023)
(SQLExecDirectW)') u'INSERT INTO [t] ([Level3]) VALUES (?)' (nan,)
I can see from the highlighted part of the error that the supplied value isn't a valid float, but I can't figure out why pandas has the column typed as float.
 
>>> c[['Level3']]
   FRCLevel3
0        NaN
>>> v = c[['Level3']]
>>> v.dtypes
FRCLevel3    float64
dtype: object
I'm stumped as to why pandas has the column set to float64. Or am I misunderstanding how pandas assigns types?
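
A minimal sketch of the likely cause: pandas represents missing values as NaN, which is a float, so a column containing only NaN gets dtype float64 even if it was meant to hold text. One common workaround (assuming the goal here is to send SQL NULL rather than a float NaN) is to cast to object and swap NaN for None before calling to_sql:

    import numpy as np
    import pandas as pd

    # An all-NaN column comes out as float64, because NaN is a float.
    v = pd.DataFrame({'Level3': [np.nan]})
    print(v.dtypes)          # Level3    float64

    # Cast to object and replace NaN with None so to_sql sends NULL.
    v = v.astype(object).where(pd.notnull(v), None)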
 
Thanks, Marc.

Bryan Van de Ven | 10 Sep 15:05 2014

ANN: Bokeh 0.6 release


On behalf of the Bokeh team, I am very happy to announce the release of Bokeh version 0.6!

Bokeh is a Python library for visualizing large and realtime datasets on the web. Its goal is to provide
developers (and domain experts) with the capability to easily create novel and powerful visualizations
that extract insight from local or remote (possibly large) data sets, and to easily publish those
visualizations to the web for others to explore and interact with.

This release includes many bug fixes and improvements over our most recent 0.5.2 release:

  * Abstract Rendering recipes for large data sets: isocontour, heatmap
  * New charts in bokeh.charts: Time Series and Categorical Heatmap
  * Full Python 3 support for bokeh-server
  * Much expanded User and Dev Guides
  * Multiple axes and ranges capability
  * Plot object graph query interface
  * Hit-testing (hover tool support) for patch glyphs

See the CHANGELOG for full details.

I'd also like to announce a new GitHub organization for Bokeh: https://github.com/bokeh. Currently it is
home to the Scala and Julia language bindings for Bokeh, but the Bokeh project itself will be moved there
before the next 0.7 release. Implementors of new language bindings who are interested in hosting their
projects under this organization are encouraged to contact us.

In upcoming releases, you should expect to see new layout capabilities (colorbar axes, better grid
plots, and improved annotations), additional tools, even more widgets and charts, R language
bindings, Blaze integration, and cloud hosting for Bokeh apps.

Don't forget to check out the full documentation, interactive gallery, and tutorial at

    http://bokeh.pydata.org

as well as the Bokeh IPython notebook nbviewer index (including all the tutorials) at:

    http://nbviewer.ipython.org/github/ContinuumIO/bokeh-notebooks/blob/master/index.ipynb

If you are using Anaconda, you can install with conda:

    conda install bokeh

Alternatively, you can install with pip:

    pip install bokeh
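
Once installed, a minimal first-plot sketch using the bokeh.plotting interface (these names follow the documented plotting API; the exact 0.6-era interface may differ slightly):

    from bokeh.plotting import figure, output_file, show

    # Write the plot to a standalone HTML file.
    output_file("lines.html")

    p = figure(title="simple line example")
    p.line([1, 2, 3, 4], [6, 7, 2, 4], line_width=2)
    show(p)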

BokehJS is also available via CDN for use in standalone JavaScript applications:

    http://cdn.pydata.org/bokeh-0.6.min.js
    http://cdn.pydata.org/bokeh-0.6.min.css

Issues, enhancement requests, and pull requests can be made on the Bokeh Github page: 

    https://github.com/continuumio/bokeh

Questions can be directed to the Bokeh mailing list: bokeh@...

If you have interest in helping to develop Bokeh, please get involved!

Thanks,

Bryan Van de Ven
Continuum Analytics
http://continuum.io


'Michael' via PyData | 10 Sep 12:24 2014

DataFrame.index.asof() issue or bug

This is an example:

In [103]: dfcon.index.asof(P.to_datetime('1990-1-1'))
Out[103]: Timestamp('1989-12-31 06:29:27')

In [104]: dfcon.index.asof('1990-1-1')
Out[104]: Timestamp('1990-01-01 00:00:00')

In [105]: dfcon.iloc[_]
Traceback (most recent call last):

  File "<ipython-input-105-27da9bd60abe>", line 1, in <module>
    dfcon.iloc[_]

  File "/Users/ifmichael/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1144, in __getitem__
    return self._getitem_axis(key, axis=0)

  File "/Users/ifmichael/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1408, in _getitem_axis
    key = self._convert_scalar_indexer(key, axis)

  File "/Users/ifmichael/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 158, in _convert_scalar_indexer
    return ax._convert_scalar_indexer(key, typ=self.name)

  File "/Users/ifmichael/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 498, in _convert_scalar_indexer
    return self._convert_indexer_error(key, 'label')

  File "/Users/ifmichael/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 638, in _convert_indexer_error
    "type ({2})".format(msg, key, self.__class__.__name__))

TypeError: the label [1990-01-01 00:00:00] is not a proper indexer for this index type (DatetimeIndex)


Apparently dfcon.index.asof('1990-1-1') knows, to some degree, to convert the string to a Timestamp, but it returns midnight ('1990-01-01 00:00:00') rather than the preceding index entry.
I cannot say more about this, but I suspect it's a bug.
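
Two notes that may explain parts of this. First, .iloc takes integer positions, so passing the Timestamp returned by asof will always raise; label-based lookup (.loc) is what accepts the returned label. Second, converting the key explicitly sidesteps the string-handling discrepancy. A minimal sketch with a hypothetical stand-in for dfcon.index:

    import pandas as pd

    # Hypothetical irregular index standing in for dfcon.index.
    idx = pd.DatetimeIndex(['1989-12-31 06:29:27', '1990-01-02 12:00:00'])

    # asof on an explicit Timestamp returns the last label at or before
    # the key, and that label is an actual member of the index.
    key = pd.to_datetime('1990-1-1')
    label = idx.asof(key)        # Timestamp('1989-12-31 06:29:27')
    pos = idx.get_loc(label)     # 0, usable with .iloc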


Fabian Braennstroem | 8 Sep 20:01 2014

pandas read_html with german decimal

Hello,

does anyone have an idea how to read HTML tables from German sites, which use German decimal
notation (comma as the decimal separator, period as the thousands separator)?
For example, the tables at the bottom of http://www.finanzen.net/bilanz_guv/SAP

Right now, I try to read it with:

    df = pd.read_html(url, infer_types=False, parse_dates=False, header=0,
                      skiprows=0, thousands=".", match=pattern, index_col=0)
    dfR = df[0].replace(",", ".")
    dfR = pd.DataFrame(dfR, dtype='float')

But this is not really working.
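
A sketch of one way to get this working. Two points: thousands="." already strips the grouping dots during the read, and DataFrame.replace only matches whole cell values unless regex=True is passed, which is why the replace above appears to do nothing. (Later pandas versions also accept a decimal argument to read_html directly.)

    import pandas as pd

    url = "http://www.finanzen.net/bilanz_guv/SAP"
    tables = pd.read_html(url, thousands=".", header=0, index_col=0)
    df = tables[0]

    # Substitute the decimal comma inside the strings, then cast
    # (assuming all remaining columns are numeric).
    df = df.replace(",", ".", regex=True).astype(float)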

Thanks in advance!
Best Regards
Fabian


Dr. Leo | 8 Sep 07:37 2014

Announcing new module pandaSDMX-0.1 - a pandas-powered client for statistical data and metadata exchange

Hi,

[this is a repost of an earlier msg, now as a separate thread.]

I have just released the new module pandaSDMX on PyPI. It is a fork of
www.github.com/widukind/pysdmx, where development has slowed down.

The goal of this project is to develop a client to query statistical
data from as many SDMX data providers as possible and to expose datasets
as pandas timeseries or dataframes. v0.1 smoothly queries data from
Eurostat, the statistical office of the EU. Eurostat disseminates about
4500 datasets on 28 EU countries and on everything you can imagine.
Querying ECB data is a bit bumpy, and FAO and ILO are even worse...

While pandaSDMX currently supports only some core features of SDMX, it
has some exciting features. It automatically generates multi-indexed
dataframes based on the structural metadata provided with the datasets.
Dataflow information is stored in SQLite tables so you can easily search
the descriptions and create your own tables referencing your favorite
datasets.

Install it with pip and see the code examples at

https://pypi.python.org/pypi/pandaSDMX/0.1
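
For reference, the install command itself (the package name as published on PyPI):

    pip install pandaSDMX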

Help and feedback are highly welcome. See the ToDo list included in
the package.

Enjoy!

Leo


Josh Quigley | 8 Sep 03:57 2014

groupby object cannot be unpickled

Hi,


It appears that groupby objects cannot be unpickled. The following code reproduces the problem for me:

import pandas as pd
import pickle

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
gr = df.groupby(0)

b = pickle.dumps(gr)
gr2 = pickle.loads(b)     # Fails with RuntimeError


<snip 50 repeat lines...>


  File "C:\Miniconda3\lib\site-packages\pandas\core\groupby.py", line 482, in __getattr__

    if attr in self.obj:


  File "C:\Miniconda3\lib\site-packages\pandas\core\groupby.py", line 482, in __getattr__

    if attr in self.obj:


RuntimeError: maximum recursion depth exceeded while calling a Python object



While it might not be that useful to be able to pickle groupby objects (although they can sometimes be expensive to compute), my two cents is that it's nicer if these things 'just work', or at least fail during pickling, to save users the time and effort of debugging.
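
A workaround sketch in the meantime: pickle the inputs rather than the groupby object, and rebuild the groupby after loading. Constructing a groupby is lazy and cheap, so little is lost by re-creating it:

    import pandas as pd
    import pickle

    df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

    payload = pickle.dumps((df, 0))    # (frame, grouping key)
    df2, key = pickle.loads(payload)
    gr2 = df2.groupby(key)             # equivalent groupby, rebuilt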

Regards,

Josh.

Adam Hughes | 8 Sep 01:33 2014

Getting memory address of dataframe/series?

Hi, 

This is probably a dumb question, but is it possible to get the memory address of a pandas DataFrame? I'm working on some code where I constantly need to test whether frames are being passed by reference or created anew, and this would help. In the past, for classes inheriting from Python's object, I could just fall back on super's __repr__, because object's default repr prints the memory address. I was wondering whether there is a straightforward way to do the same for frames.
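
A minimal sketch: in CPython, id() returns the object's memory address, which is exactly what object's default __repr__ prints. Note that this identifies the DataFrame wrapper only; it says nothing about whether the underlying data buffers are shared.

    import pandas as pd

    df = pd.DataFrame({'a': [1, 2, 3]})
    print(hex(id(df)))           # the frame's memory address in CPython

    view = df                    # same object: same id
    copy = df.copy()             # new object: different id
    print(id(view) == id(df))    # True
    print(id(copy) == id(df))    # False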

fabio.zanini | 4 Sep 09:35 2014

read excel: column-wise convert_float?

Dear all,

(This is my first post, I hope I don't mess up.)

I regularly use pandas to parse Excel sheets in which only some of the numerical columns are integers, the others being floats. I suggest implementing a column-wise variant of the convert_float argument of pandas.io.excel.read_excel, similar to what is done for CSV files. My current alternative is to read the columns separately and then stitch the table back together by concatenation, but that's quite awkward and inefficient.
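
For comparison, a lighter post-read workaround (column names here are hypothetical, and this assumes the integer columns contain no missing values, since astype(int) cannot represent NaN):

    import pandas as pd

    df = pd.read_excel('book.xlsx')
    int_cols = ['year', 'count']             # hypothetical integer columns
    df[int_cols] = df[int_cols].astype(int)  # cast just those columns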

Would that be a useful thing? I can try to code it up; it should be simple enough.

Cheers, thanks
Fabio

Seth P | 2 Sep 21:29 2014

pivot_panel()?

Is there any reason there isn't a pivot_panel() function, analogous to pivot_table()? It would take items, major_axis, and minor_axis rather than index and columns, and would return a Panel rather than a DataFrame. This would be quite useful when reading three-dimensional data that is stored in a "skinny" (long-format) table.

(I don't think pivot() quite does this, but I can't quite figure it out.)
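
In the absence of pivot_panel(), a workaround sketch: pivot each item's slice and assemble a Panel from the resulting dict (the skinny-table column names here are hypothetical):

    import pandas as pd

    skinny = pd.DataFrame({
        'item':  ['a', 'a', 'b', 'b'],
        'major': [1, 2, 1, 2],
        'minor': ['x', 'x', 'y', 'y'],
        'value': [1.0, 2.0, 3.0, 4.0],
    })

    # One 2-D slice per item, pivoted from the long table.
    panel = pd.Panel({name: grp.pivot(index='major', columns='minor', values='value')
                      for name, grp in skinny.groupby('item')})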

Adam | 29 Aug 01:56 2014

read_csv() simple datetimes, can't seem to parse correctly

I have a file with the header being datetimes:

Wavelength,2014-05-22 15:38:23,2014-05-22 15:38:26,2014-05-22 15:38:30,2014-05-22 15:38:34,2014-05-22 15:38:37,2014-05-22 15:38:41,2014-05-22 15:38:45,2014-05-22

I've been following the IO Tools tutorial, trying various permutations of parse_dates, infer_datetime_format, etc., but I keep getting the same plain (non-Datetime) Index:

Index([u'2014-05-22 15:38:23', u'2014-05-22 15:38:26', u'2014-05-22 15:38:30',

I'm wondering if the unicode is screwing something up. While I can simply pass this into a DatetimeIndex after the standard Index has been created, I'm really trying to understand what might be hanging up the parser. I figured this is probably a common occurrence and might have an obvious answer, without my needing to share files or code.
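
A sketch of the likely explanation: parse_dates converts date values in columns, not the header row, so the column labels come back as plain strings regardless. Converting the labels after the read is straightforward:

    import pandas as pd
    from io import StringIO

    csv = ("Wavelength,2014-05-22 15:38:23,2014-05-22 15:38:26\n"
           "400,0.10,0.20\n")
    df = pd.read_csv(StringIO(csv), index_col='Wavelength')
    df.columns = pd.to_datetime(df.columns)   # now a DatetimeIndex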

Thanks

andy hayden | 28 Aug 20:40 2014

help running the vbench suite

I tried to run the vbench suite last night (I was hoping to get some timeseries for all commits between 0.13.1 and 0.14.1). It ran a few of the recent commits OK (I think), but produced LOTS of NaN times, with the following traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/vbench/benchmark.py", line 112, in run
    ns = self._setup()
  File "/usr/local/lib/python2.7/dist-packages/vbench/benchmark.py", line 70, in _setup
    exec self.setup in ns
  File "<string>", line 1, in <module>
ImportError: No module named pandas_vb_common

That's when I run:

 % python vb_suite/run_suite.py
2014-08-28 11:24:51,199 [INFO  ] Initializing benchmark runner for 442 benchmarks (runner.py:53)
2014-08-28 11:24:51,199 [INFO  ] Initializing GitRepo to look at /home/andy/pandas (git.py:27)
2014-08-28 11:25:17,192 [INFO  ] Initializing DB at /home/andy/pandas/vb_suite/benchmarks.db (db.py:16)
2014-08-28 11:25:17,200 [INFO  ] Cloning http://www.github.com/pydata/pandas.git over to /home/andy/tmp/vb_pandas_tmp (git.py:164)
2014-08-28 11:25:17,200 [INFO  ] Deleting /home/andy/tmp/vb_pandas_tmp first (git.py:167)
2014-08-28 11:25:32,506 [WARNING] stderr: Cloning into '/home/andy/tmp/vb_pandas_tmp'...
|  (utils.py:109)
2014-08-28 11:25:32,506 [INFO  ] Cloning /home/andy/tmp/vb_pandas_tmp over to /home/andy/tmp/vb_pandas (git.py:164)
2014-08-28 11:25:32,506 [INFO  ] Deleting /home/andy/tmp/vb_pandas first (git.py:167)
2014-08-28 11:25:32,636 [WARNING] stderr: Cloning into '/home/andy/tmp/vb_pandas'...
| done.
|  (utils.py:109)
2014-08-28 11:25:32,822 [INFO  ] Getting benchmarks (runner.py:145)
2014-08-28 11:25:32,828 [INFO  ] Registering 442 benchmarks (runner.py:148)
2014-08-28 11:26:46,220 [INFO  ] Collecting revisions to run (runner.py:93)
2014-08-28 11:26:46,361 [INFO  ] Running benchmarks for 1005 revisions (runner.py:96)
2014-08-28 11:26:46,366 [INFO  ] No benchmarks need running at 698ee45 (runner.py:160)
2014-08-28 11:26:46,369 [INFO  ] No benchmarks need running at 6e793a1 (runner.py:160)
2014-08-28 11:26:46,372 [INFO  ] No benchmarks need running at 6c537e4 (runner.py:160)
2014-08-28 11:26:46,375 [INFO  ] Running 72 benchmarks for revision 5e9c404 (runner.py:163)
2014-08-28 11:26:46,376 [INFO  ] Switching to revision 5e9c404 (git.py:195)
2014-08-28 11:26:46,376 [INFO  ] Cloning /home/andy/tmp/vb_pandas_tmp over to /home/andy/tmp/vb_pandas (git.py:164)
2014-08-28 11:26:46,376 [INFO  ] Deleting /home/andy/tmp/vb_pandas first (git.py:167)
2014-08-28 11:26:46,506 [WARNING] stderr: Cloning into '/home/andy/tmp/vb_pandas'...
| done.
|  (utils.py:109)
2014-08-28 11:26:46,743 [ERROR ] stderr: cp: cannot stat ‘pandas_vb_common.py’: No such file or directory
|  (utils.py:109)
2014-08-28 11:26:51,856 [WARNING] Returned with non-0 code: 72 (runner.py:188)
# These last few lines then basically repeat (none of the tests run, so all have NaN times)


Has anyone seen this? Or am I missing something / not doing this the correct way? (In the past I've only vbenched against a specific commit, which works great!)

Thanks,
Andy

