Miki Tebeka | 4 Apr 16:54 2014

Splitting a string and adding fields as columns

Greetings,

I have a DataFrame where one of the columns is a string. I'd like to split that string and add the fields as columns to the DataFrame.

The working solution I have is

sr = df['%r'].str.split()
df = pd.concat([df, pd.DataFrame(sr.tolist(), columns=req_cols)], axis=1)

Is there a better way? I mostly would like to avoid sr.tolist() since it might be big.

Thanks,
--
Miki
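
A possible alternative, sketched under assumptions: current pandas releases offer `Series.str.split(expand=True)`, which returns the fields as a DataFrame directly, so no explicit `tolist()` call is needed. The sample rows and the contents of `req_cols` below are hypothetical stand-ins for the poster's data, and this sketch does not measure whether peak memory is actually lower.

```python
import pandas as pd

# Hypothetical stand-ins for the poster's data: '%r' holds a request line.
req_cols = ["method", "path", "protocol"]
df = pd.DataFrame({
    "host": ["a.example", "b.example"],
    "%r": ["GET /index.html HTTP/1.1", "POST /form HTTP/1.0"],
})

# Split on whitespace; expand=True yields one column per field.
fields = df["%r"].str.split(expand=True)
fields.columns = req_cols

# join() aligns on the index, so it also works when df has a non-default index.
df = df.join(fields)
```

If rows can contain different numbers of fields, the number of resulting columns follows the longest row, so assigning `req_cols` would need a check first.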

--
You received this message because you are subscribed to the Google Groups "PyData" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pydata+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
For more options, visit https://groups.google.com/d/optout.
Paul Hobson | 1 Apr 16:14 2014

Re: Quick excel -> remote pandas session

%cpaste (maybe %paste) magics should disable the tab-completion.
-paul


On Mon, Mar 31, 2014 at 8:40 PM, Jeffrey Tratner <jtratner-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

You can also use the IPython magic function (I think it's %edit) to copy/paste into an editor

On Mar 31, 2014 4:12 PM, "Jeff Reback" <jeffreback <at> gmail.com> wrote:
you can specify a sep (separator)
try '\s+' or '\t'

this is essentially what read_clipboard does


On Mar 31, 2014, at 6:58 PM, Brett Thomas <brettpthomas-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

Thanks Jeff, will do that. Just two minor annoyances, both easy to work around: 
- Excel copies data with tab separators, which triggers tab completion even in a triple quoted string 
- Have to figure out the dimensions for reshaping 


On Mon, Mar 31, 2014 at 1:29 PM, Jeff Reback <jeffreback-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
well

assign the actual text to a variable in IPython? (e.g. by pasting it)

then it's straightforward

from io import StringIO
import pandas as pd

data = """
data pasted here
"""

df = pd.read_csv(StringIO(data), sep=r'\s+')

should work

this is how I copy / paste from a browser for example 
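
A minimal sketch of the tab-separated variant mentioned in this thread, assuming the cells arrive as Excel usually puts them on the clipboard (one row per line, tabs between cells). The sample values are hypothetical. Using `sep="\t"` keeps an empty cell as a missing value, whereas a whitespace regex would likely collapse the adjacent tabs and shift the row.

```python
from io import StringIO

import pandas as pd

# Hypothetical block of cells copied from a spreadsheet (tab-separated).
data = "name\tqty\tprice\nwidget\t3\t1.50\ngadget\t\t2.25\n"

# read_csv infers the shape from the text, so no manual reshaping is needed.
df = pd.read_csv(StringIO(data), sep="\t")
```

The empty `qty` cell in the second data row comes back as NaN rather than pulling `price` into its place.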

On Mar 31, 2014, at 3:25 PM, Brett Thomas <brettpthomas-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

My most common workflow frustration with pandas is as follows: 

I'm working in a remote ipython shell, and I want to load data from an excel file that was emailed to me. I'd love to open the Excel spreadsheet locally, copy a group of cells, and paste it into a DataFrame on the remote machine. 

I tried unsuccessfully to get read_clipboard() to work over X windows. Is there a way to reliably paste a block of cells from Excel as plain text, and load it into a DataFrame? 

Apologies for the simplistic question, but I've found that the friction here stops me from using pandas for quite a few simple tasks. 

Brett Thomas | 31 Mar 21:25 2014

Quick excel -> remote pandas session

My most common workflow frustration with pandas is as follows: 

I'm working in a remote ipython shell, and I want to load data from an excel file that was emailed to me. I'd love to open the Excel spreadsheet locally, copy a group of cells, and paste it into a DataFrame on the remote machine. 

I tried unsuccessfully to get read_clipboard() to work over X windows. Is there a way to reliably paste a block of cells from Excel as plain text, and load it into a DataFrame? 

Apologies for the simplistic question, but I've found that the friction here stops me from using pandas for quite a few simple tasks. 

Moritz Beber | 28 Mar 22:53 2014

(ab-)using pandas.Index

Dear all,

In order to have non-integer indexes to matrices, I would like to use something like the pandas.Index. It is sorted and it can be sliced.

I am wondering, however, if it is overkill. Finding the index of an element will be slower than if I used a dict, and when I looked at the source for the Index class, there was a lot of magic I'm sure I wouldn't need.

Do you recommend using an OrderedDict for my use case, or what do you think?

Basically the ability to get a slice of indexes is tempting me.

Best,
Moritz
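
For illustration, a small sketch of what label lookups and label slices on a `pandas.Index` look like when the Index sits next to a plain NumPy matrix. The labels and the matrix are hypothetical. `Index.get_loc` and `Index.slice_indexer` are existing pandas methods, and pandas backs `get_loc` with a hash table, so single lookups may be closer to dict speed than the question assumes (not benchmarked here).

```python
import numpy as np
import pandas as pd

# Hypothetical sorted, unique row labels for a 4x4 matrix.
labels = pd.Index(["alpha", "beta", "delta", "gamma"])
matrix = np.arange(16).reshape(4, 4)

# Single label -> integer position.
row = labels.get_loc("beta")

# Label range -> positional slice; the end label is included.
rows = labels.slice_indexer("beta", "delta")
block = matrix[rows, :]
```

An OrderedDict gives the single-label lookup, but the label-range slice would have to be written by hand.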

Skipper Seabold | 23 Mar 18:06 2014

Expanding dates?

Is there an easy way to expand a time-series index in one direction or
the other, assuming we have an offset? Something like shift, but that
increases the length? Assume I only have an index.

I'm currently doing something like

    dates = pd.DatetimeIndex(start='3/1/1958', periods=30, freq='M')
    pd.DatetimeIndex(start=dates[0] - 1, periods=len(dates) + 1,
                     freq=dates.inferred_freq)

Skipper
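
A hedged sketch of the same idea for current pandas, where, as far as I know, `DatetimeIndex(start=..., periods=...)` and subtracting an integer from a Timestamp are no longer accepted. It assumes the index carries a `freq` (here a month-end offset, matching the `'M'` in the post) and uses `pd.date_range` with that offset to prepend one period.

```python
import pandas as pd

# Month-end index starting in March 1958, as in the post.
dates = pd.date_range(start="3/1/1958", periods=30, freq=pd.offsets.MonthEnd())

# Step back one period with the index's own offset, then rebuild one element longer.
expanded = pd.date_range(
    start=dates[0] - dates.freq,
    periods=len(dates) + 1,
    freq=dates.freq,
)
```

To extend at the other end, keep `start=dates[0]` and only increase `periods`.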

aaelony | 20 Mar 23:08 2014

Row percents for Categorical variable counts

 Hi -

Pardon my question, I'm very new to Python and Pandas.

What is the best way to get row percents from a data frame with fields A, B, C, and D? I've seen comments on Stack Overflow, but there appear to be hidden indexing or axis settings that yield errors, and no examples start from a data frame where all fields are categorical variables.


In my data frame d all fields are strings. 

I have the following:

d2 = d[ (d['B']== 2)]

d3 = d2[['A','B', 'C','D']].drop_duplicates().groupby(['B','C']) 

d3_sums = d3['D'].value_counts().unstack().T.sum()

But

d3['D'].value_counts().unstack() / d3_sums does not work...

Many thanks,
Avram
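
One likely cause, offered as an assumption since the error message is not shown: dividing a DataFrame by a Series with `/` aligns the Series index against the columns, not the rows. `DataFrame.div(..., axis=0)` divides row-wise instead. The sketch below uses a small hypothetical frame of strings. Note also that if B really holds strings, a comparison with the integer `2` selects nothing, so the filter here uses `"2"`.

```python
import pandas as pd

# Hypothetical data; every field is a string, as in the question.
d = pd.DataFrame({
    "A": ["a1", "a2", "a3", "a4", "a5", "a6"],
    "B": ["2", "2", "2", "2", "1", "2"],
    "C": ["x", "x", "x", "y", "y", "y"],
    "D": ["p", "p", "q", "p", "q", "q"],
})

d2 = d[d["B"] == "2"]

# Counts of D within each (B, C) group, one column per D value.
counts = (
    d2[["A", "B", "C", "D"]]
    .drop_duplicates()
    .groupby(["B", "C"])["D"]
    .value_counts()
    .unstack(fill_value=0)
)

# Divide each row by its own total (axis=0 aligns on the row index).
row_pct = counts.div(counts.sum(axis=1), axis=0) * 100
```

Each row of `row_pct` then sums to 100.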

Rishabh SHARMA | 12 Mar 14:46 2014

Data Frame Indexing Help


Hi

Could someone point me to how a DataFrame object is able to index rows by an array-like structure?
Even a brief overview of the process would be helpful. I tried reading the DataFrame code, but it didn't help.


Thanks
Rishabh Sharma
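
As a rough, simplified picture (not a description of the actual internals): array-like row selection comes down to turning the array into integer positions and taking those positions from the underlying data. The frame below is hypothetical. `Index.get_indexer` and `DataFrame.take` are public methods that expose those two steps.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [10, 20, 30, 40]}, index=["a", "b", "c", "d"])

# Boolean array: keep the rows where the mask is True.
mask = np.array([True, False, True, False])
by_mask = df[mask]

# Integer positions and labels.
by_pos = df.iloc[[0, 3]]
by_label = df.loc[["b", "c"]]

# The label case, spelled out: labels -> positions -> take.
positions = df.index.get_indexer(["b", "c"])
same = df.take(positions)
```

Here `same` holds the same rows as `by_label`.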

Keith Brown | 11 Mar 23:45 2014

data frame help

Hi,

I am trying to build a data frame structure that will help with querying this type of data. I have several hundred of these entries, and I intend to use it to answer questions like:
What is the average price everyone bids? Who had the lowest/highest bid in the trial? etc.

So, here is what my data looks like. The winner of each trial is in the first row of that trial. 

winner(trial, Name, how much he sold, prize)
trial, Name, how much s(he) sold, units

Trial0,Joseph,$200,IPod
Trial0,Amanda,$150,25
Trial0,Karl,$160,35
Trial0,Bob,$170,46

Trial1,Karl,$200,Zune
Trial1,Bob,$150,11
Trial1,Tom,$160,33
Trial1,Mike,$170,56
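
A minimal sketch of one way to load and query this layout, under stated assumptions: the first row of each trial is the winner, the dollar column is the bid, and the column names (`trial`, `name`, `amount`, `extra`, `winner`) are made up for illustration.

```python
from io import StringIO

import pandas as pd

raw = """Trial0,Joseph,$200,IPod
Trial0,Amanda,$150,25
Trial0,Karl,$160,35
Trial0,Bob,$170,46

Trial1,Karl,$200,Zune
Trial1,Bob,$150,11
Trial1,Tom,$160,33
Trial1,Mike,$170,56
"""

# Blank lines between trials are skipped by default.
df = pd.read_csv(StringIO(raw), header=None,
                 names=["trial", "name", "amount", "extra"])

# "$200" -> 200
df["amount"] = df["amount"].str.lstrip("$").astype(int)

# Assumption: the first row seen for each trial is its winner.
df["winner"] = ~df.duplicated("trial")

mean_amount = df["amount"].mean()
by_trial = df.groupby("trial")["amount"]
lowest = df.loc[by_trial.idxmin(), ["trial", "name", "amount"]]
highest = df.loc[by_trial.idxmax(), ["trial", "name", "amount"]]
```

The `extra` column mixes prizes and unit counts, so it stays as strings here. Splitting it into separate `prize` and `units` columns would be a natural next step.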


Hans Skov-Petersen | 11 Mar 16:06 2014

Pandas: DLL load failed: The specified procedure could not be found

Hi there,

I am trying to make Pandas run with my present Python version.
For a number of reasons I am presently (still) running Python 2.6.
I am on Win7, 64 bit.

I downloaded pandas-0.13.1.win-amd64-py2.6.exe (md5) from https://pypi.python.org/pypi/pandas#downloads

Install is ok, but when importing the Pandas module I get this error:
DLL load failed: The specified procedure could not be found

Any clues?

Cheers
Hans

Felix Lawrence | 10 Mar 06:15 2014

Redesigning handling of multidimensional data

Hi,

I've been using pandas for a couple of months now, and I've found it great, but have encountered some awkwardness with indexing multidimensional data. To start a discussion, I've blogged about the problems and have proposed some solutions [1]. The changes I suggest are fairly radical and could transform the way people index multidimensional data in pandas, making people less reliant on groupby et al., and removing the need to stack/unstack.

I don't have the time or technical chops to pull this off by myself in the near future.

Does this vision have any support? Can it be refined and implemented? How do we start?

TLDR: please improve MultiIndexes to the point that MultiIndex + Series is the preferred way to store matrix-style data.

[1] http://camelcode.wordpress.com/2014/02/28/index-to-the-koalas-series-of-posts/

Cheers,
Felix

Bryan Van de Ven | 10 Mar 19:17 2014

ANN: Bokeh 0.4.2

I am happy to announce the release of Bokeh version 0.4.2!

Bokeh is a Python library for visualizing large and realtime datasets on the web.  Its goal is to provide
elegant, concise construction of novel graphics in the style of Protovis/D3, while delivering
high-performance interactivity to thin clients.  Bokeh includes its own JavaScript library (BokehJS)
that implements a reactive scenegraph representation of the plot, and renders efficiently to HTML5
Canvas. Bokeh works well with IPython Notebook, but can generate standalone graphics that embed into
regular HTML.

Check out the full documentation, interactive gallery, and tutorial at

     http://bokeh.pydata.org

If you are using Anaconda, you can install with conda:

     conda install bokeh

Alternatively, you can install with pip:

     pip install bokeh

Some of the new features in this release include:

* Additional Matplotlib and Seaborn compatibility (PolyCollection)
* Extensive tutorial with exercises and solutions added to docs
* new %bokeh magic for improved IPython notebook integration
* Windows support for bokeh-server with two new storage backends (in-memory and shelve)

Also, we've fixed lots of little bugs - see the CHANGELOG for full details.

BokehJS is also available by CDN for use in standalone JavaScript applications:

     http://cdn.pydata.org/bokeh-0.4.2.js
     http://cdn.pydata.org/bokeh-0.4.2.css
     http://cdn.pydata.org/bokeh-0.4.2.min.js
     http://cdn.pydata.org/bokeh-0.4.2.min.css

Some examples of BokehJS use can be found on the Bokeh JSFiddle page:

     http://jsfiddle.net/user/bokeh/fiddles/

The release of Bokeh 0.5 is planned for late March. Some notable features we plan to include are:

* Abstract Rendering for semantically meaningful downsampling of large datasets
* Better grid-based layout system, using Cassowary.js
* More MPL/Seaborn/ggplot.py compatibility and examples
* Additional tools, improved interactions, and better plot frame
* Touch support

Issues, enhancement requests, and pull requests can be made on the Bokeh GitHub page: https://github.com/continuumio/bokeh

Questions can be directed to the Bokeh mailing list: bokeh@...

Special thanks to recent contributors: Melissa Gymrek, Amy Troschinetz, Ben Zaitlen, Damian Avila, and
Terry Jones

Regards,

Bryan Van de Ven
Continuum Analytics
http://continuum.io
