Pierre GM | 1 Dec 01:12

Re: scikits.timeseries question

On Nov 30, 2009, at 6:58 PM, Christopher Barker wrote:
> HI all,
> 
> Maybe I'm missing something, but I can't seem to get this to work as I'd 
> like.

I guess you're confusing DateArrays and TimeSeries. DateArrays are just arrays of dates (think of an ndarray
of datetime objects, or an ndarray with a datetime64 dtype). TimeSeries are like MaskedArrays: the
combination of an ndarray of values with two other ndarrays, one array of booleans (the mask) and one DateArray.
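
As a rough illustration of that distinction, here is a minimal sketch that uses NumPy's masked arrays and datetime64 in place of scikits.timeseries itself. The variable names are made up for the example, and the three separate arrays only mimic what a TimeSeries bundles into one object.

```python
import numpy as np

# "DateArray" stand-in: a plain array of daily dates starting 2001-01-01.
dates = np.datetime64("2001-01-01") + np.arange(4).astype("timedelta64[D]")

# "TimeSeries" stand-in: values plus a boolean mask, kept alongside the dates.
values = np.ma.masked_array([1.5, 2.0, 0.0, 3.5],
                            mask=[False, False, True, False])

# Dates of the observations that are not masked.
valid_dates = dates[~np.ma.getmaskarray(values)]
print(valid_dates)
print(values.count())  # number of unmasked values
```

The dates carry no data of their own; the values and the mask are what turn them into a series.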

> I have a bunch of data that is indexed by "day since Jan 1, 2001". It 
> seemed I should be able to do a DateArray like this:
> 
> In [40]: import scikits.timeseries as ts
> 
> In [41]: sd = ts.Date(freq='D', year=2001, month=1, day=1)
> 
> In [42]: sd
> Out[42]: <D : 01-Jan-2001>

All is well here.

> In [43]: da = ts.date_array((1,2,3,4), start_date=sd)

Check the doc for date_array: the first argument can be
        * an existing :class:`DateArray` object;
        * a sequence of :class:`Date` objects with the same frequency;
        * a sequence of :class:`datetime.datetime` objects;
        * a sequence of dates in string format;
        * a sequence of integers corresponding to the representation of 

Christopher Barker | 1 Dec 01:23

Re: scikits.timeseries question

Pierre GM wrote:
> On Nov 30, 2009, at 6:58 PM, Christopher Barker wrote:
> I guess you're confusing DateArrays and TimeSeries.

> DateArrays are just arrays of dates (think a ndarray of datetime
 > objects, or a ndarray with a datetime64 dtype). TimeSeries are like
 > MaskedArrays, the combination of a ndarray of values with 2 others
 > ndarrays: one array of booleans (the mask), one DateArray.

Actually, I think I got that.

>> In [41]: sd = ts.Date(freq='D', year=2001, month=1, day=1)
>>
>> In [42]: sd
>> Out[42]: <D : 01-Jan-2001>
> 
> All is well here.

yup.

>> In [43]: da = ts.date_array((1,2,3,4), start_date=sd)
> 
> Check the doc for date_array: the first argument can be

...

>         * a sequence of integers corresponding to the representation of 
>           :class:`Date` objects.

That's what I'm trying to give it.

Pierre GM | 1 Dec 01:49

Re: scikits.timeseries question

On Nov 30, 2009, at 7:23 PM, Christopher Barker wrote:
> Pierre GM wrote:
> ...
> 
>>        * a sequence of integers corresponding to the representation of 
>>          :class:`Date` objects.
> 
> That's what I'm trying to give it.

Ah OK. Well, the answer is: that depends. If you know that your dates are just in daily increments from
2001-01-01 (like a range), then just use start_date and length.

If you may have several duplicated dates (like 2001-01-01, 2001-01-02, 2001-01-02, 2001-01-03...),
then the easiest is probably:
>>> da = ts.date_array(np.array([0, 1, 1, 2]) + sd)

np.array(...) + sd gives you a ndarray of Date objects (so its dtype is np.object), and you use that as the
input of date_array. The frequency should be recognized properly.

Note that if 1 in your data set means '2001-01-01', then use (sd-1) instead, but you would have guessed that.
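
The offset convention in that last note is easy to get wrong, so here is a small sketch of the arithmetic. It uses the standard-library datetime module as a stand-in for Date objects; the helper days_since_to_dates and its first_offset parameter are hypothetical names, not part of scikits.timeseries.

```python
from datetime import date, timedelta

def days_since_to_dates(offsets, start, first_offset=0):
    """Convert integer "days since" offsets to dates.

    first_offset is the integer that stands for `start` itself:
    0 if the data counts from zero, 1 if the value 1 means the start date.
    """
    return [start + timedelta(days=int(n) - first_offset) for n in offsets]

start = date(2001, 1, 1)

# 0 means 2001-01-01; duplicates are allowed.
zero_based = days_since_to_dates([0, 1, 1, 2], start)

# 1 means 2001-01-01, which is the "(sd-1)" case mentioned above.
one_based = days_since_to_dates([1, 2, 3, 4], start, first_offset=1)

print(zero_based)
print(one_based)
```

Subtracting first_offset from every value is the same as shifting the start date back by that many days.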

> While I'm at it -- what I really have is a big 'ol 3-d array, which is 
> gridded model output, of shape: (time, lat, lon). Time is expressed in 
> days since...
> 
> I need to do a moving average of the whole grid over time. Can a
> time_series be n-d, with time as one of the axes?

Well, I never tried, so I can't tell you. Check whether lib.moving_funcs supports 2D data. If not, not a big deal:
just fill the missing dates (so that you have a regular-spaced series with masked elements for missing
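
For the gridded case, a moving average along the time axis can be sketched in plain NumPy. This does not use lib.moving_funcs; it assumes a regularly spaced time axis with no missing or masked steps, and the function name moving_average_time is made up for the example.

```python
import numpy as np

def moving_average_time(grid, window):
    """Trailing moving average along axis 0 of a (time, lat, lon) array.

    Returns an array of shape (time - window + 1, lat, lon).
    """
    grid = np.asarray(grid, dtype=float)
    if not 1 <= window <= grid.shape[0]:
        raise ValueError("window must be between 1 and the number of time steps")
    # Cumulative sum over time, with a leading slab of zeros so that
    # csum[k + window] - csum[k] is the sum of steps k .. k + window - 1.
    csum = np.cumsum(grid, axis=0)
    csum = np.concatenate([np.zeros((1,) + grid.shape[1:]), csum], axis=0)
    return (csum[window:] - csum[:-window]) / window

# Toy grid: 5 time steps on a 2 x 3 lat/lon grid, increasing linearly in time.
grid = np.arange(5 * 2 * 3, dtype=float).reshape(5, 2, 3)
smoothed = moving_average_time(grid, window=3)
print(smoothed.shape)
```

If some days are missing, the series would first have to be put on a regular daily axis, and the masked steps would need separate handling that this sketch does not attempt.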

Robert Ferrell | 1 Dec 01:53

Re: scikits.timeseries question


On Nov 30, 2009, at 5:23 PM, Christopher Barker wrote:

> Pierre GM wrote:
>> On Nov 30, 2009, at 6:58 PM, Christopher Barker wrote:
>> I guess you're confusing DateArrays and TimeSeries.
>
>> DateArrays are just arrays of dates (think a ndarray of datetime
>> objects, or a ndarray with a datetime64 dtype). TimeSeries are like
>> MaskedArrays, the combination of a ndarray of values with 2 others
>> ndarrays: one array of booleans (the mask), one DateArray.
>
> Actually, I think I got that.
>
>>> In [41]: sd = ts.Date(freq='D', year=2001, month=1, day=1)
>>>
>>> In [42]: sd
>>> Out[42]: <D : 01-Jan-2001>
>>
>> All is well here.
>
> yup.
>
>>> In [43]: da = ts.date_array((1,2,3,4), start_date=sd)
>>
>> Check the doc for date_array: the first argument can be
>
> ...
>
>>        * a sequence of integers corresponding to the representation  

Pierre GM | 1 Dec 02:06

Re: scikits.timeseries question

On Nov 30, 2009, at 7:53 PM, Robert Ferrell wrote:
> 
> I may be misunderstanding what you are trying to do, but here's what I  
> do:
> 
> In [68]: sd = ts.Date('d', '2001-01-01')
> 
> In [69]: dates = ts.date_array(cumsum(ones(4)) + sd)
> 
> In [70]: dates
> Out[70]:
> DateArray([02-Jan-2001, 03-Jan-2001, 04-Jan-2001, 05-Jan-2001],
>           freq='D')

The cumsum approach works only if you have irregular time steps as inputs (as in 1 day after the first, 1 day
after that, 3 days after that...). If you have regular time steps of 1, just use arange+start_date (or even
just length+start_date).

Christopher Barker | 1 Dec 02:16

Re: scikits.timeseries question

Pierre GM wrote:

> Ah OK. Well, the answer is: that depends. If you know that your
> dates are just in daily increments from 2001-01-01 (like a range),
> then just use start_date and length.

right -- but I don't know that.

> If you may have several duplicated dates (like 2001-01-01,
> 2001-01-02, 2001-01-02, 2001-01-03...), then the easiest is probably:
> 
>>>> da = ts.date_array(np.array([0, 1, 1, 2])+sd)

nope -- not duplicated, but maybe there are missing ones. The point is 
that I have an array of "days since", and I want array of 
timeseries.dates (which is a DateArray, yes?)

> np.array(...) + sd gives you a ndarray of Date objects (so its dtype
> is np.object), and you use that as the input of date_array. The
> frequency should be recognized properly.

OK -- though it seems I SHOULD be able to go straight to a DateArray, 
and I'm still confused about what this means:

>> In [43]: da = ts.date_array((1,2,3,4), start_date=sd)
> 
> Check the doc for date_array: the first argument can be
>         * an existing :class:`DateArray` object;
>         * a sequence of :class:`Date` objects with the same frequency;
>         * a sequence of :class:`datetime.datetime` objects;

Pierre GM | 1 Dec 02:39

Re: scikits.timeseries question

On Nov 30, 2009, at 8:16 PM, Christopher Barker wrote:

> nope -- not duplicated, but maybe there are missing ones. The point is 
> that I have an array of "days since", and I want array of 
> timeseries.dates (which is a DateArray, yes?)

Got it. Duplicated and/or missing dates correspond to the same problem: you can't assume that your dates
are regularly spaced, so you can't use start_date and length.
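
A quick way to decide whether the start_date and length shortcut applies is to test the offsets themselves. The sketch below is plain Python with hypothetical helper names; it only inspects the integers and does not touch scikits.timeseries.

```python
def is_regular_daily(offsets):
    """True if the offsets increase by exactly 1 at every step."""
    return all(b - a == 1 for a, b in zip(offsets, offsets[1:]))

def missing_offsets(offsets):
    """Return the integer offsets absent from the span min..max."""
    present = set(offsets)
    return [n for n in range(min(offsets), max(offsets) + 1) if n not in present]

regular = [0, 1, 2, 3]
with_gap = [0, 1, 3, 4]
with_duplicate = [0, 1, 1, 2]

for name, offsets in [("regular", regular),
                      ("with_gap", with_gap),
                      ("with_duplicate", with_duplicate)]:
    print(name, is_regular_daily(offsets), missing_offsets(offsets))
```

Only the first list qualifies for the shortcut; a gap and a duplicate both fail the same check, which is why they count as the same problem here.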

>> np.array(...) + sd gives you a ndarray of Date objects (so its dtype
>> is np.object), and you use that as the input of date_array. The
>> frequency should be recognized properly.
> 
> OK -- though it seems I SHOULD be able to go straight to a DateArray, 
> and I'm still confused about what this means:

Well, that depends on the type of starting date, actually. If it's a Date, adding an ndarray to it will give you
an ndarray of Date objects. If it's a DateArray of length 1, it'll give you a DateArray. (Note to self: we
could probably be a bit more consistent on this one...)
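
The first case can be reproduced without scikits.timeseries. In this sketch, Day is a hypothetical stand-in for a Date scalar, built on the standard-library datetime module; the point is only to show why NumPy ends up with an object array when it adds an integer array to a scalar type it does not know.

```python
import numpy as np
from datetime import date, timedelta

class Day:
    """Hypothetical stand-in for a daily-frequency date scalar."""

    def __init__(self, value):
        self.value = value

    def __add__(self, n):
        return Day(self.value + timedelta(days=int(n)))

    # int + Day falls back to this, element by element.
    __radd__ = __add__

    def __repr__(self):
        return "Day(%s)" % self.value.isoformat()

sd = Day(date(2001, 1, 1))

# NumPy treats sd as an opaque Python object, so the addition runs
# element-wise and the result is an ndarray with dtype=object.
result = np.array([0, 1, 1, 2]) + sd
print(result.dtype)
print(list(result))
```

Getting a dedicated array type back, as in the DateArray case, requires the other operand to be an ndarray subclass (or otherwise take control of the operation), which Day does not do.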

>>> In [43]: da = ts.date_array((1,2,3,4), start_date=sd)
>> 
>> Check the doc for date_array: the first argument can be
>>        * an existing :class:`DateArray` object;
>>        * a sequence of :class:`Date` objects with the same frequency;
>>        * a sequence of :class:`datetime.datetime` objects;
>>        * a sequence of dates in string format;
>>        * a sequence of integers corresponding to the representation of 
>>          :class:`Date` objects.
> 

Robert Ferrell | 1 Dec 03:59

Re: scikits.timeseries question


On Nov 30, 2009, at 6:16 PM, Christopher Barker wrote:

> Pierre GM wrote:
>
>> Ah OK. Well, the answer is: that depends. If you know that your
>> dates are just in daily increments from 2001-01-01 (like a range),
>> then just use start_date and length.
>
> right -- but I don't know that.
>
>> If you may have several duplicated dates (like 2001-01-01,
>> 2001-01-02, 2001-01-02, 2001-01-03...), then the easiest is probably:
>>
>>>>> da = ts.date_array(np.array([0, 1, 1, 2])+sd)
>
> nope -- not duplicated, but maybe there are missing ones. The point is
> that I have an array of "days since", and I want array of
> timeseries.dates (which is a DateArray, yes?)

I don't think so.  An array of dates is not a DateArray.

In [98]: sd = ts.Date('d', '2001-01-01')

In [99]: zeros(4) + sd
Out[99]: array([01-Jan-2001, 01-Jan-2001, 01-Jan-2001, 01-Jan-2001],  
dtype=object)

This seems natural to me, (array + Date = array) although I do have to  
include an extra line sometimes to get a DateArray if I need it.  If I  

Robert Ferrell | 1 Dec 04:15

Re: scikits.timeseries question


On Nov 30, 2009, at 6:06 PM, Pierre GM wrote:

> On Nov 30, 2009, at 7:53 PM, Robert Ferrell wrote:
>>
>> I may be misunderstanding what you are trying to do, but here's  
>> what I
>> do:
>>
>> In [68]: sd = ts.Date('d', '2001-01-01')
>>
>> In [69]: dates = ts.date_array(cumsum(ones(4)) + sd)
>>
>> In [70]: dates
>> Out[70]:
>> DateArray([02-Jan-2001, 03-Jan-2001, 04-Jan-2001, 05-Jan-2001],
>>          freq='D')
>
> The cumsum approach works only if you have irregular time steps as  
> inputs (as in 1 day after the first, 1 day after that, 3 days after  
> that...). If you have regular time steps of 1, just use arange 
> +start_date (or even just length+start_date)

Sort of.  The cumsum approach works even if the intervals are uniform,  
of course, but it may be overkill and arange may be sufficient.

In any case, I get the impression that the OP has an array of integer  
offsets generated in some other fashion entirely.

Pierre GM | 1 Dec 05:03

Re: scikits.timeseries question

On Nov 30, 2009, at 9:59 PM, Robert Ferrell wrote:

> This seems natural to me, (array + Date = array) although I do have to  
> include an extra line sometimes to get a DateArray if I need it.  If I  
> need a timeseries, sometimes I can skip making the DateArray explicitly.

Well, keep in mind that Date was implemented a few years ago already, long before the new datetime64 dtype,
and it was the easiest way we had to define a new datatype (well, a kind of datatype). I'll check how we can
merge the two approaches when I have some time.
Anyhow, in practice, a Date object will be seen as a np.object by numpy, and you end up having an ndarray with a
np.object dtype.

> Is the issue that sd is a Date and not a DateArray?  You can always  
> make a DateArray with sd, of the correct length, and then add to that:
> 
> In [83]: sd = ts.Date('d', '2001-01-01')
> 
> In [84]: d1 = ts.date_array(zeros(4) + sd)

Wow, that's overkill! Just make sd a DateArray:
>>> np.arange(4) + ts.DateArray(sd)

Now, because DateArray is a subclass of ndarray with a higher priority, its __add__ method takes over and the
output is a DateArray.

> 
>> and I'm still confused about what this means:
>> 
>>>> In [43]: da = ts.date_array((1,2,3,4), start_date=sd)
> 