James Gibbons | 1 Jul 17:32

Re: Model specification with RPy/vegan integration (Gregory, Matthew)

> 
> Hi all,
> 
> (This might be a bit of a tangential and technical question for this
> listserve.  Please advise if there is a more appropriate venue.)
> 
> I'm trying to call vegan 'cca' from Python using Rpy.  I read my species
> and environmental data into R data frames without an issue and can even
> run the 'full' cca model without a problem, e.g. 
> 
>   from rpy import *
>   set_default_mode(NO_CONVERSION)
> 
>   # Import the vegan library
>   r.library('vegan')
> 
>   # Open up the data files and read into data frames
>   spp_file = r.read_csv('spp.csv', header=r.TRUE)
>   env_file = r.read_csv('env.csv', header=r.TRUE)
>   spp = r.data_frame(spp_file, row_names='FCID')
>   env = r.data_frame(env_file, row_names='FCID')
> 
>   # Square-root transform the species matrix
>   spp = r.sqrt(spp)
> 
>   # Create the vegan CCA
>   out_cca = r.cca(spp, env)
> 
> However, when I try to specify the model to only include a subset of the
> environmental variables, e.g.
[...]

Gregory, Matthew | 1 Jul 19:41

Re: Model specification with RPy/vegan integration (Gregory, Matthew)

Hi James and all,

James Gibbons wrote:
> I don't use RPy but I would guess this should be something like
> 
> model = r.formula("spp ~ ANNPRE + ANNTMP + DEM + SLPPCT")

Thanks to all (offline and online) for suggesting solutions for this
problem.  My problem was that the names of my data frames (spp, env)
were not being 'seen' by R as valid objects when using the formula
version of cca (cca.formula).  Deep within this function, there is a
call to eval.parent() and 'spp' was not being recognized as a valid
name.

Through some examples on the RPy wiki
(http://rpy.wiki.sourceforge.net/Manual+-+Linear+modeling), it appears
that it is sometimes necessary (especially with formulae?) to register a
name with the object by using R's assign.  When I added the code below,
everything worked splendidly.

  r.assign('spp', spp)
  r.assign('env', env)
  model = r("spp ~ ANNPRE + ANNTMP + DEM + SLPPCT")
  out_cca = r.cca(model, env)

Again, thanks for the helpful suggestions for a newbie R user.

matt

Diego Fontaneto | 2 Jul 11:37

adonis/vegan


Why does order matter in specifying a model in adonis?

Call:
adonis(formula = presence ~ DO + ph + T + EC, data = presence.env,
permutations = 1000, method = "jaccard") 

                Df SumsOfSqs  MeanSqs  F.Model     R2 Pr(>F)  
DO         1.00000   0.59095  0.59095  1.47514 0.0448  0.035 *
ph         1.00000   0.59785  0.59785  1.49235 0.0453  0.855  
T          1.00000   0.53954  0.53954  1.34681 0.0409  1.000  
EC         1.00000   0.65139  0.65139  1.62601 0.0494  1.000  
Residuals 27.00000  10.81641  0.40061          0.8197         
Total     31.00000  13.19614                   1.0000         

But when the order is changed, both R2 and P change.

Call:
adonis(formula = presence ~ EC + DO + ph + T, data = presence.env,
permutations = 1000, method = "jaccard") 

                Df SumsOfSqs  MeanSqs  F.Model     R2 Pr(>F)   
EC         1.00000   0.69313  0.69313  1.73019 0.0525  0.005 **
DO         1.00000   0.51736  0.51736  1.29143 0.0392  0.958   
ph         1.00000   0.50494  0.50494  1.26043 0.0383  1.000   
T          1.00000   0.66431  0.66431  1.65826 0.0503  1.000   
Residuals 27.00000  10.81641  0.40061          0.8197          
Total     31.00000  13.19614                   1.0000          

----------------------------------------------------------------
[...]

Gavin Simpson | 2 Jul 12:54

Re: adonis/vegan

On Wed, 2008-07-02 at 10:37 +0100, Diego Fontaneto wrote:
> Why does order matter in specifying a model in adonis?

Because they are sequential sums of squares. See the Value section
of ?adonis.
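
To see why sequential sums of squares depend on term order, here is a
minimal sketch in plain Python. It uses ordinary least squares on made-up
data rather than adonis on a distance matrix, so it only illustrates the
principle; the function name sequential_ss and the toy values are
assumptions for illustration. Each term is credited only with the
variation left unexplained by the terms listed before it.

```python
def center(values):
    """Subtract the mean from every element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def sequential_ss(y, predictors, order):
    """Sequential sums of squares for the terms, taken in `order`.

    Each predictor is orthogonalised against the predictors that
    precede it, so it only gets credit for what they left unexplained.
    """
    y_centered = center(y)
    basis = []  # orthogonalised predictors entered so far
    result = {}
    for name in order:
        q = center(predictors[name])
        for b in basis:
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        result[name] = dot(q, y_centered) ** 2 / dot(q, q)
        basis.append(q)
    return result


# Toy data (hypothetical): two correlated predictors and one response.
predictors = {
    "DO": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "EC": [1.0, 1.0, 2.0, 3.0, 3.0, 5.0],
}
y = [2.0, 3.0, 5.0, 6.0, 8.0, 11.0]

ss_do_first = sequential_ss(y, predictors, ["DO", "EC"])
ss_ec_first = sequential_ss(y, predictors, ["EC", "DO"])
print(ss_do_first)
print(ss_ec_first)
```

The sum of squares credited to each term changes with the order, while
the total explained by the model stays the same. That matches the two
adonis tables above, where the Residuals row is identical in both runs.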

<snip />

> 
> ----------------------------------------------------------------
> 
> Another question in adonis: is there a way to specify an error term, like
> adding "random=~1|locality" to the previous models?

No, not with the current implementation.

Do you want to do this to take account of group structure in your data?
If so, the permutation test can be altered so that it doesn't permute
freely but permutes within levels of 'locality' for example.
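
As a rough illustration of what permuting within levels means, here is
a minimal sketch in plain Python. It is not the vegan implementation;
the function name permute_within_strata and the example localities are
made up for illustration. Row indices are shuffled only among rows that
share the same level.

```python
import random


def permute_within_strata(strata, rng):
    """Return a permutation of row indices that never moves a row
    out of its own stratum."""
    groups = {}
    for index, level in enumerate(strata):
        groups.setdefault(level, []).append(index)

    permutation = list(range(len(strata)))
    for members in groups.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        # Position `target` receives the row that was at `source`.
        for target, source in zip(members, shuffled):
            permutation[target] = source
    return permutation


# Hypothetical grouping variable, one entry per sample.
locality = ["A", "A", "A", "B", "B", "C", "C", "C"]
rng = random.Random(42)  # fixed seed so the run is repeatable
perm = permute_within_strata(locality, rng)
print(perm)
```

Every index in the result points at a row from the same locality, so
the group structure is preserved in each permutation.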

I have added code to vegan to generate these restricted permutations
(and for spatial grids and time series/transects, amongst others), in
the function permuted.index2. However, the code is not quite finished and
the API is still bedding down whilst I make the last few additions to the
code. Once this is done, we vegan developers can begin moving functions
that use permuted.index internally to use the new permutation code from
permuted.index2. This isn't a minor job, however, so it may take a little
time.

> Thanks,
[...]

Philip Dixon | 2 Jul 14:18

Why use Rpy?

What is the benefit of using Rpy?

I'm familiar with Python as a high-level compiled language.  Python
was very popular here (Iowa State, Statistics) a few years ago to
vastly speed up S+ simulation studies.  One could recode into Python
a lot more quickly than recoding into C++.  Both Python and C++ were
much faster than S+; I don't know how Python compares with native R.

Thanks,
Philip Dixon

Gregory, Matthew | 2 Jul 19:31

Re: Why use Rpy?

Philip Dixon wrote:
> What is the benefit of using Rpy?
> 
> I'm familiar with Python as a high-level compiled language.  
> Python was very popular here (Iowa State, Statistics) a few 
> years ago to vastly speed up S+ simulation studies.  One 
> could recode into Python a lot more quickly than recoding 
> into C++.  Both Python and C++ were much faster than S+; I 
> don't know how Python compares with native R.

I guess since I think I was the one who brought it up, I should probably
explain my rationale.  The really glib answer (for me) is that I know
Python and wasn't willing to learn another programming language for the
small bit that I needed from R (probably not a very popular opinion on
an R listserve ...).  RPy provides that relatively seamless link into R
and given that most of our codebase is in Python, this was the path of
least resistance for me.  There is also the RSPython package which has
similar functionality to RPy.  

I can only speak to the benefits that I've found from Python, which
isn't to say they don't exist in R - I'm just not fully aware of them.
The main strength of Python for me is the vast array of amazing packages
that are written for it.  This includes: 

-  Numpy/Scipy/matplotlib for array handling, scientific computing and
graphing
-  bindings for GDAL for abstract spatial translation and projection
support
-  PIL for image processing
-  numerous other packages that have nothing to do with statistical
[...]

Olivia LeDee | 3 Jul 21:29

help with assignment


Would someone mind helping with the following code:

1) After attaching the primary dataset

ambi1<-read.table("ambi.txt",header=TRUE,sep='\t')

attach(ambi1)

2) I would like to change na's for variable (pwd) to 0

pwd[is.na(pwd)]<-0

3) Then, I need the results of the following based on all observations.

sqrt(area)

m1<-lm(prs~area+sqrt(area))

summary(m1)

residuals(m1)

4) Next, I would like to omit na's from the rest of the dataset. I need this
truncated dataset to include 0's for variable pwd.

ambi2<-na.omit(ambi1)

In this process, I am overwriting step 2. Can someone help (other than
omitting na's for each column)?
[...]

Nicholas Lewin-Koh | 4 Jul 00:11

Why use Rpy?

Hi Phil,
The main reason would be for accessing R functionality in 
broader applications. Python does use a more efficient
memory model than R; however, when R functions are called
through rpy, R will still make copies on assignment within
those functions, so I am not sure that there is any
gain on that front. Python is a very structured
language and is perhaps more consistent under the hood than
R is, with its many legacy hiccups due to the next letter in
the alphabet. Python also has a cleaner object model than S4.

So really it is a nice mechanism for accessing R functionality from
Python programs, for example for creating plots in web applications
using Django, or for using other Python treats that are much further
developed than in R.

Hope this helps

Nicholas 

> ------------------------------
> 
> Message: 2
> Date: Wed, 02 Jul 2008 07:18:06 CDT
> From: Philip Dixon <pdixon@...>
> Subject: [R-sig-eco] Why use Rpy?
> To: r-sig-ecology@...
> Message-ID: <200807021218.HAA03665@...>
> 
> What is the benefit of using Rpy?
[...]

tyler | 4 Jul 01:11

Re: help with assignment

"Olivia LeDee" <lede0025@...> writes:

> Would someone mind helping with the following code:
>
> 1) After attaching the primary dataset
>
> ambi1<-read.table("ambi.txt",header=TRUE,sep='\t')
> attach(ambi1)

> 2) I would like to change na's for variable (pwd) to 0
> pwd[is.na(pwd)]<-0
>
> 3) Then, I need the results of the following based on all observations.
> sqrt(area)
> m1<-lm(prs~area+sqrt(area))
> summary(m1)
> residuals(m1)
>
> 4) Next, I would like to omit na's from the rest of the dataset. I need this
> truncated dataset to include 0's for variable pwd.
>
> ambi2<-na.omit(ambi1)
>
> In this process, I am overwriting step 2. Can someone help (other than
> omitting na's for each column)?
>

Hi Olivia,

If I understand you correctly, the problem is that after this command:
[...]

Mike Dunbar | 4 Jul 10:30

Re: help with assignment

Exactly: STEER WELL CLEAR OF attach(). This is just the sort of problem you can get into.
Mike
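
The underlying issue is that attach() exposes copies of the columns, so
pwd[is.na(pwd)]<-0 changes a separate pwd object and leaves ambi1
untouched. In R the replacement has to target the data frame itself,
e.g. ambi1$pwd[is.na(ambi1$pwd)] <- 0, before calling na.omit(ambi1).
The intended order of operations can be sketched in plain Python as
follows; the helper names and the toy rows are made up for
illustration, with None standing in for NA.

```python
def fill_missing(rows, column, value):
    """Return new rows with None in `column` replaced by `value`."""
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row[column] is None:
            new_row[column] = value
        filled.append(new_row)
    return filled


def drop_incomplete(rows):
    """Keep only rows that have no missing (None) values."""
    return [row for row in rows if all(v is not None for v in row.values())]


# Hypothetical stand-in for the ambi1 table.
ambi1 = [
    {"prs": 3.0, "area": 4.0, "pwd": None},
    {"prs": None, "area": 9.0, "pwd": 2.0},
    {"prs": 5.0, "area": 16.0, "pwd": 1.0},
]

# Step 1: fix pwd in the table itself, not in a detached copy.
ambi1 = fill_missing(ambi1, "pwd", 0)
# Step 2: only now drop rows that still have missing values elsewhere.
ambi2 = drop_incomplete(ambi1)
print(ambi2)
```

Because the zeros are written into the table before incomplete rows are
dropped, the first row survives with pwd equal to 0 and only the row
with a missing prs is removed.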

>>> tyler <tyler.smith@...> 04/07/2008 00:11 >>>
"Olivia LeDee" <lede0025@...> writes:

> Would someone mind helping with the following code:
>
> 1) After attaching the primary dataset
>
> ambi1<-read.table("ambi.txt",header=TRUE,sep='\t')
> attach(ambi1)

> 2) I would like to change na's for variable (pwd) to 0
> pwd[is.na(pwd)]<-0
>
> 3) Then, I need the results of the following based on all observations.
> sqrt(area)
> m1<-lm(prs~area+sqrt(area))
> summary(m1)
> residuals(m1)
>
> 4) Next, I would like to omit na's from the rest of the dataset. I need this
> truncated dataset to include 0's for variable pwd.
>
> ambi2<-na.omit(ambi1)
>
> In this process, I am overwriting step 2. Can someone help (other than
> omitting na's for each column)?
>
[...]

