1 Jul 01:45 2011

### Time Series using 15 minute intervals using scikits.timeseries

Hi,

Using scikits.timeseries I can create daily and hourly time series, no problem.

But...

I have a time series at 15-minute intervals, and this I don't know how to do.

Can a timeseries array handle 15-minute intervals?
Do I use minute intervals and a masked array for the missing minutes?
Also, I can't figure out how to create an array at minute intervals.

So, what is best practice? Any examples?

Thanks

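import scikits.timeseries as ts

# ts_start_date / ts_end_date expose .year, .month and .day;
# ts_start_hour / ts_end_hour are defined elsewhere.
st = ts.Date('H', year=ts_start_date.year, month=ts_start_date.month,
             day=ts_start_date.day, hour=ts_start_hour)
ed = ts.Date('H', year=ts_end_date.year, month=ts_end_date.month,
             day=ts_end_date.day, hour=ts_end_hour)
st_beg = st.asfreq('H', relation='START')
ed_end = ed.asfreq('H', relation='END')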
1 Jul 09:42 2011

### Re: Fitting procedure to take advantage of cluster

On 29/06/11 18:18, Giovanni Luca Ciampaglia wrote:
> Hi,
> there are several strategies, depending on your problem. You could use a
> surrogate model, like a Gaussian Process, to fit the data (see for
> example Higdon et al
> http://epubs.siam.org/sisc/resource/1/sjoce3/v26/i2/p448_s1?isAuthorized=no).
> I have personally used scikits.learn for GP estimation but there is also
> PyMC that should do the same (never tried it).
>
I can also immodestly recommend my own code for Gaussian processes. It
is not based on Markov chain Monte Carlo but rather a maximum likelihood
approach:
http://sysbio.mrc-bsu.cam.ac.uk/group/index.php/Gaussian_processes_in_python
1 Jul 09:48 2011

### Question: scipy.stats.gamma.fit

Dear scipy-users,

I'm using scipy.stats.gamma.fit to fit a gamma distribution to a set of
random samples. To validate the results I also use the fitdistr
function in R. However, the two packages give different results: the
shape and scale parameters of the gamma pdf differ. The difference is
not large, but I'm wondering what causes it. I think both of them use
maximum likelihood estimation to fit the distribution.

Best regards!
Ning
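
One possible explanation, offered as an assumption rather than a diagnosis of this particular data set: scipy.stats.gamma.fit estimates a location parameter in addition to shape and scale unless the location is fixed with `floc`, whereas R's fitdistr for "gamma" reports shape and rate (rate = 1/scale) with no location shift. The sketch below uses synthetic data (the sample size, seed and true parameters are made up) to compare the two SciPy fits.

```python
import numpy as np
from scipy import stats

# Synthetic sample; the true parameters here are illustrative only.
rng = np.random.default_rng(0)
true_shape, true_scale = 2.0, 3.0
data = rng.gamma(true_shape, true_scale, size=5000)

# Default call: shape, location and scale are all estimated.
shape3, loc3, scale3 = stats.gamma.fit(data)

# Location pinned at zero: only shape and scale are estimated.
shape2, loc2, scale2 = stats.gamma.fit(data, floc=0)

print("3-parameter fit:", shape3, loc3, scale3)
print("2-parameter fit:", shape2, scale2, "rate =", 1.0 / scale2)
```

If the fit with `floc=0` agrees with R once the rate is converted to a scale, the free location parameter is the likely source of the difference.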
1 Jul 11:07 2011

### Re: Time Series using 15 minute intervals using scikits.timeseries

On Jul 1, 2011, at 1:45 AM, David Montgomery wrote:

> Hi,
>
> Using scikits.timeseries I can create daily and hourly time series, no problem.
>
> But...
>
> I have a time series at 15-minute intervals, and this I don't know how to do.
>
> Can a timeseries array handle 15-minute intervals?
> Do I use minute intervals and a masked array for the missing minutes?
> Also, I can't figure out how to create an array at minute intervals.
>
> So, what is best practice? Any examples?

First possibility: get the latest experimental version of scikits.timeseries from GitHub. It supports
multiples of a base frequency (like 15min).
If you're not comfortable tinkering with experimental code, you have several solutions, depending on your needs:
1. Create a minute-frequency series and mask 14/15 of the data. Simple, but wasteful and problematic if you
have a large series. Still, it's the easiest solution.
2. Create an hour-frequency series as a 2D array: each column would correspond to the data for one quarter of
the hour. That's more compact in terms of memory, but you'll have to jump through some extra hoops if you
need to convert the array to another frequency (conversion routines don't really like 2D arrays...).
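
The two data layouts above can be sketched with plain NumPy masked arrays. This is only an illustration of the storage trade-off and does not use scikits.timeseries itself; the `quarter_hourly` sample values are made up.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical data: 8 quarter-hour readings covering two hours.
quarter_hourly = np.arange(8, dtype=float)

# Option 1: one slot per minute, with only every 15th slot holding data.
n_minutes = 15 * len(quarter_hourly)
per_minute = ma.masked_all(n_minutes, dtype=float)
per_minute[::15] = quarter_hourly  # assignment unmasks these slots

# Option 2: one row per hour, one column per quarter of that hour.
per_hour = quarter_hourly.reshape(-1, 4)

print(per_minute.count(), "of", per_minute.size, "minute slots hold data")
print(per_hour.shape)
```

Option 1 stores 15 slots for every real value, which is the memory cost mentioned above; option 2 stores only the data but gives up the one-dimensional layout.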
1 Jul 11:22 2011

### Re: Time Series using 15 minute intervals using scikits.timeseries

Awesome...

For the GitHub version, are there any docs or an example for creating a 15-minute array?

1 Jul 12:11 2011

### Re: Time Series using 15 minute intervals using scikits.timeseries

On Jul 1, 2011, at 11:22 AM, David Montgomery wrote:

> Awesome...
>
> For the GitHub version, are there any docs or an example for creating a 15-minute array?

Use the optional 'timestep' argument of scikits.timeseries.date_array.

BTW, make sure you're using the https://github.com/pierregm/scikits.timeseries-sandbox/
repository (that's the experimental one I was telling you about).
Note that support is *very* limited, as I don't really have time to work on scikits.timeseries these days.
Anyhow, there'll be a major overhaul in the medium term, once Mark W.'s new datetime dtype is stable.
1 Jul 12:36 2011

### Weird error in fmin_l_bfgs_b

Hi,
I'm getting an error in scipy.optimize.fmin_l_bfgs_b, apparently related to the Fortran wrapper. This is strange, because exactly the same problem works well with the TNC solver. I have a function that returns both a scalar value (the one to be minimised) and the derivative of the function at that point. The error from the L-BFGS-B solver is:
File "/usr/lib/python2.7/dist-packages/scipy/optimize/lbfgsb.py", line 181, in fmin_l_bfgs_b
isave, dsave)
ValueError: failed to initialize intent(inout) array -- input not fortran contiguous

My code looks like this:

# x0 is the starting point, a 1d array
>>> solution, x, info = scipy.optimize.fmin_tnc(cost_function, x0, args=([operators]), bounds=bounds)
# Using fmin_tnc works well, solution is what I expect it to be

>>> solution, cost, information = scipy.optimize.fmin_l_bfgs_b(cost_function, solution, bounds=bounds, args=[operators], iprint=101)
2011-07-01 11:34:24,703 - eoldas.Model - INFO - 46 days, 46 quantised days
tnc: Version 1.3, (c) 2002-2003, Jean-Sebastien Roy (js <at> jeannot.org)
tnc: RCS ID: <at> (#) \$Jeannot: tnc.c,v 1.205 2005/01/28 18:27:31 js Exp \$
NIT   NF   F                       GTG
0    1  1.988301629303336E+02   8.17118991E+06
tnc: fscale = 0.000249879
1    5  1.338514420154698E+01   1.82689516E+04
tnc: fscale = 0.00528464
2    9  9.476573219561992E+00   2.21390020E+04
3   19  6.684083971679802E+00   3.88897225E+03
4   69  6.274247682836059E+00   2.43671753E+03
tnc: |fn-fn-1] = 4.5037e-13 -> convergence
5  120  6.274247682835608E+00   2.43671753E+03
tnc: Converged (|f_n-f_(n-1)| ~= 0)
RUNNING THE L-BFGS-B CODE

* * *

Machine precision = 1.084D-19
N =           46     M =           10

L = -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01
-2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01
-2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01
-2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01
-2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01
-2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01
-2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01
-2.0000D-01 -2.0000D-01 -2.0000D-01 -2.0000D-01

X0 =  5.6013D-02  1.1717D-01  1.9201D-01  2.7557D-01  3.7013D-01  4.5702D-01
5.3491D-01  6.0661D-01  6.7624D-01  7.4649D-01  8.0318D-01  8.5203D-01
8.8633D-01  9.0102D-01  8.9914D-01  8.7521D-01  8.2816D-01  7.6529D-01
7.0559D-01  6.5371D-01  6.0520D-01  5.5814D-01  5.0991D-01  4.4783D-01
3.7790D-01  3.0041D-01  2.1894D-01  1.5147D-01  1.0832D-01  8.3926D-02
6.6473D-02  4.8621D-02  3.2567D-02  2.0086D-02  1.0881D-02  2.4890D-03
8.8000D-04 -4.2729D-03 -4.6658D-03 -5.5940D-03 -4.1690D-03 -1.2577D-02
-2.2529D-02 -2.9114D-02 -1.5938D-02  1.9755D-02

U =  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00
1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00
1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00
1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00
1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00
1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00
1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00
1.2000D+00  1.2000D+00  1.2000D+00  1.2000D+00

At X0         0 variables are exactly at the bounds
Traceback (most recent call last):
File "example_identity.py", line 199, in <module>
main ( sys.argv )
File "example_identity.py", line 166, in main
solution, cost, information = scipy.optimize.fmin_l_bfgs_b (  cost_function, solution, bounds=bounds,  args=[ operators ], iprint=101 )
File "/usr/lib/python2.7/dist-packages/scipy/optimize/lbfgsb.py", line 181, in fmin_l_bfgs_b
isave, dsave)
ValueError: failed to initialize intent(inout) array -- input not fortran contiguous

Any clues of where to look for issues?
Thanks!
jose

_______________________________________________
SciPy-User mailing list
SciPy-User <at> scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user
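
One plausible place to look, stated as an assumption since the thread does not confirm it: the gradient returned by `cost_function` may not be a contiguous 1-D float64 array (for example a slice, a transposed view, or a column of a 2-D array), which a Fortran `intent(inout)` argument can reject. The sketch below wraps a cost function so its outputs are normalised before the solver sees them. The names `as_lbfgsb_cost` and `toy_cost` are hypothetical.

```python
import numpy as np

def as_lbfgsb_cost(func):
    """Wrap func(x, *args) -> (value, gradient) so that the value is a plain
    float and the gradient is a C-contiguous 1-D float64 array."""
    def wrapped(x, *args):
        value, grad = func(x, *args)
        grad = np.ascontiguousarray(np.asarray(grad, dtype=np.float64).ravel())
        return float(value), grad
    return wrapped

# Toy cost function (hypothetical) whose gradient is a strided view.
def toy_cost(x):
    padded = np.zeros(2 * x.size)
    padded[::2] = 2.0 * x
    return np.sum(x ** 2), padded[::2]  # non-contiguous slice

x0 = np.linspace(0.0, 1.0, 5)
_, raw_grad = toy_cost(x0)
_, safe_grad = as_lbfgsb_cost(toy_cost)(x0)
print(raw_grad.flags['C_CONTIGUOUS'], safe_grad.flags['C_CONTIGUOUS'])
```

Passing `as_lbfgsb_cost(cost_function)` to fmin_l_bfgs_b in place of `cost_function` would test this hypothesis. Checking `grad.flags` and `grad.dtype` inside the original function is an even quicker first step.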
1 Jul 18:52 2011

### Re: Time Series using 15 minute intervals using scikits.timeseries

On Fri, Jul 1, 2011 at 6:11 AM, Pierre GM <pgmdevlist <at> gmail.com> wrote:
>
> On Jul 1, 2011, at 11:22 AM, David Montgomery wrote:
>
>> Awesome...
>>
>> For the GitHub version, are there any docs or an example for creating a 15-minute array?
>
> Use the optional 'timestep' argument of scikits.timeseries.date_array.
>
> BTW, make sure you're using the https://github.com/pierregm/scikits.timeseries-sandbox/
> repository (that's the experimental one I was telling you about).
> Note that support is *very* limited, as I don't really have time to work on scikits.timeseries these days.
> Anyhow, there'll be a major overhaul in the medium term, once Mark W.'s new datetime dtype is stable.
>

Depending on your data manipulation needs, you could also give pandas
a shot. Generating 15-minute date ranges, for example, is quite simple:

In [3]: DateRange('7/1/2011', '7/2/2011', offset=datetools.Minute(15))
Out[3]:
<class 'pandas.core.daterange.DateRange'>
offset: <15 Minutes>, tzinfo: None
[2011-07-01 00:00:00, ..., 2011-07-02 00:00:00]
length: 97

The date range can be used to conform a time series you loaded from some source:

('pad' a.k.a. "ffill" propagates values forward into holes, optional)

I've got some resampling code in the works that would help with e.g.
converting 15-minute data into hourly data or that sort of thing but
it's in less-than-complete form at the moment so like I said depends
on what you need to do. Give me a few weeks on that bit =)

best,
Wes
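
The date-range and reindexing steps described above can be sketched in more recent pandas releases, where `pd.date_range` takes the place of the `DateRange` class shown in the 2011 session. This is an illustration under that assumption, not the original code from the message; the 30-minute `raw` series is made-up sample data.

```python
import numpy as np
import pandas as pd

# 15-minute grid covering one day, end points included.
grid = pd.date_range("2011-07-01", "2011-07-02", freq="15min")

# Hypothetical source data: readings every 30 minutes.
coarse_index = pd.date_range("2011-07-01", "2011-07-02", freq="30min")
raw = pd.Series(np.arange(len(coarse_index), dtype=float), index=coarse_index)

# Conform to the 15-minute grid; 'pad' carries the last value forward into holes.
conformed = raw.reindex(grid, method="pad")

# Down-sample the 15-minute series to hourly means.
hourly = conformed.resample("60min").mean()

print(len(grid), len(conformed), len(hourly))
```

Leaving out `method="pad"` keeps the holes as NaN instead of filling them, which may be preferable when missing readings should stay visibly missing.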
1 Jul 19:11 2011

### Re: Fitting procedure to take advantage of cluster

On 06/29/2011 11:54 AM, J. David Lee wrote:
> Hello,
>
> I'm attempting to perform a fit of a model function's output to some
> measured data. The model has around 12 parameters, and takes tens of
> minutes to run. I have access to a cluster with several thousand
> processors that can run the simulations in parallel, so I'm wondering if
> there are any algorithms out there that I can use to leverage this
> computing power to efficiently solve my problem - that is, besides grid
> searches or Monte-Carlo methods.
>
>
> David
>
I want to thank everyone for their suggestions. I've read through most
of the links presented, and am getting a clearer idea of what I need to do.

Here's a quick clarification of my problem for those who are interested:

I'm running a single-processor plasma simulation modeling an experiment.
It has tens or hundreds of parameters, but most are constrained by
measurements. For my purposes, the output consists of several x-ray
spectra which I am trying to match against measured spectra. I have
about 12 or 14 parameters in all that I am changing in order to match
the spectra. Each run of the simulation takes a few to a few tens of
minutes. I have the ability to run the compiled code on a number of
machines, but I can't easily run python scripts on the machines.

After some thinking, I'm considering the feasibility of parallelizing
the routines in scipy's optimize module. My initial thought is to allow
the user to specify a function that would run the objective function on
multiple inputs. This would be useful, for example, when performing a
simplex shrink, or in numerical gradient / hessian calculations with
multiple variables.

From my point of view, this would allow me to use a hybrid
Monte-Carlo/minimization procedure to look for a global minimum.

I'm interested to hear others' opinions on the matter.

Thanks again,

David
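
The idea of a user-supplied function that evaluates the objective on several inputs at once can be sketched for the numerical-gradient case as follows. This is a minimal illustration, not part of scipy.optimize; `batch_evaluate`, `forward_difference_gradient` and the `sphere` stand-in objective are hypothetical names. Threads are used on the assumption that each evaluation mostly waits on an external simulation process.

```python
from concurrent.futures import ThreadPoolExecutor

def batch_evaluate(objective, points, max_workers=8):
    """Evaluate objective at every point concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(objective, points))

def forward_difference_gradient(objective, x, step=1e-6, evaluate=batch_evaluate):
    """Forward-difference gradient from one batch of len(x) + 1 evaluations."""
    points = [list(x)]
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += step
        points.append(shifted)
    values = evaluate(objective, points)
    base = values[0]
    return [(v - base) / step for v in values[1:]]

# Stand-in objective (hypothetical); the real one would launch a simulation.
def sphere(x):
    return sum(xi * xi for xi in x)

grad = forward_difference_gradient(sphere, [1.0, -2.0, 0.5])
print(grad)
```

With 12 to 14 parameters, one gradient costs a single batch of 13 to 15 simulations that can run side by side, so the wall-clock time is roughly that of one run. Replacing `batch_evaluate` with a function that submits jobs to the cluster would keep the rest unchanged.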
