Jonathan DuBois | 1 Mar 2011 01:14

regression with categorical nuisance variable

Hi,

I am new to R, so I am unsure of the formula to set up this analysis.
I would like to run a linear model with a continuous dependent
variable (brain volume) and a continuous independent variable (age)
while controlling for a categorical nuisance variable (gender).

Age and brain volume are correlated.
There are no gender differences in age but there are significant
gender differences in brain volume.
Therefore, I would like to control for gender when assessing the
association between brain volume and age.

Any help would be very much appreciated.

Jon

Ista Zahn | 1 Mar 2011 01:31

Re: regression with categorical nuisance variable

Hi Jon,
Just enter it as a predictor in the model. You almost can't go wrong
with this one. Usually I would caution you to convert your categorical
variables to factors and make sure the contrasts are set how you want
them, but in this case it doesn't matter because there are (I assume)
only two levels of gender, and you don't really care about
interpreting the coefficient anyway.
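
As a minimal sketch of that advice, the model could be set up as below. The data are simulated and the names `dat`, `volume`, `age` and `gender` are placeholders, not taken from Jon's data set; the coefficient for `age` is then the age association adjusted for gender.

```r
# Simulated stand-in data; column names are assumptions for illustration.
set.seed(1)
n <- 100
dat <- data.frame(
  age    = runif(n, 20, 80),
  gender = factor(sample(c("F", "M"), n, replace = TRUE))
)
dat$volume <- 1500 - 3 * dat$age + 100 * (dat$gender == "M") +
  rnorm(n, sd = 40)

# Gender enters as an additive predictor alongside age.
fit <- lm(volume ~ age + gender, data = dat)

# Row for the age slope, adjusted for gender.
summary(fit)$coefficients["age", ]
```

Adding `age:gender` to the formula would additionally allow the age slope to differ by gender, which is a separate question from adjusting for it.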

Best,
Ista

On Mon, Feb 28, 2011 at 7:14 PM, Jonathan DuBois
<jonathan.m.dubois <at> gmail.com> wrote:
> Hi,
>
> I am new to R, so I am unsure of the formula to set up this analysis.
> I would like to run a linear model with a continuous dependent
> variable (brain volume) and a continuous independent variable (age)
> while controlling for a categorical nuisance variable (gender).
>
> Age and brain volume are correlated.
> There are no gender differences in age but there are significant
> gender differences in brain volume.
> Therefore, I would like to control for gender when assessing the
> association between brain volume and age.
>
> Any help would be very much appreciated.
>
> Jon
>

Peter Ehlers | 1 Mar 2011 01:31

Re: nls not solving

On 2011-02-28 14:14, Schatzi wrote:
> I am not sure how you simplified the model to:
> y = a + b(1 - exp(kl)) - b exp(-kx)
>
> I tried simplifying it but only got to:
> y = a + b - b * exp(kl) * exp(-kx)
>
> I agree that the model must not be identifiable. That makes sense,
> especially given that removing either a or l makes the model work. Can you
> please further explain the math though, as I am not understanding it? I do
> not see how you obtained your equation, and when I tried to solve using your
> equation I got quite different numbers. Thank you.

You can obviously write your function as

  f <- function(x, A, B, K) {A - B * exp(-K * x)}

i.e. in terms of *3* parameters. In that form,
it's apple pie for nls().

  fm <- nls(y ~ f(x, A, B, K),
            start = list(A = 50, B = 60, K = 1))

  coef(fm)
  xx <- seq(0, 72, length = 101)
  yy <- predict(fm, newdata = list(x = xx))
  plot(x, y)
  lines(xx, yy, col = "red")
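
Starting from the expansion quoted above, y = a + b - b * exp(kl) * exp(-kx), the three-parameter form follows by setting A = a + b, B = b * exp(kl) and K = k. Only A, B and K are determined by the data, so many different (a, b, l) triples give exactly the same curve. The sketch below illustrates the fit on simulated data, since the original data are not shown in this thread; the true values and starting values are assumptions chosen for the illustration.

```r
# Simulated data from the 3-parameter form (values are illustrative only).
set.seed(123)
x <- seq(0, 72, by = 3)
y <- 100 - 80 * exp(-0.1 * x) + rnorm(length(x), sd = 2)

# Three-parameter model: A = a + b, B = b * exp(k * l), K = k.
f <- function(x, A, B, K) A - B * exp(-K * x)

fm <- nls(y ~ f(x, A, B, K),
          start = list(A = 95, B = 75, K = 0.08))
coef(fm)

# Two different (a, b, l) triples that map to the same A and B,
# and therefore to the same fitted curve.
A <- coef(fm)[["A"]]; B <- coef(fm)[["B"]]; K <- coef(fm)[["K"]]
b1 <- 50; a1 <- A - b1; l1 <- log(B / b1) / K
b2 <- 70; a2 <- A - b2; l2 <- log(B / b2) / K
curve1 <- a1 + b1 * (1 - exp(-K * (x - l1)))
curve2 <- a2 + b2 * (1 - exp(-K * (x - l2)))
max(abs(curve1 - curve2))   # numerically zero
```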

Peter Ehlers

Lao Meng | 1 Mar 2011 01:52

Re: regression with categorical nuisance variable

You may try analysis of covariance.
But, as you say, "There are no gender differences in age", so why not pool
the two genders and ignore gender?

2011/3/1 Jonathan DuBois <jonathan.m.dubois <at> gmail.com>

> Hi,
>
> I am new to R, so I am unsure of the formula to set up this analysis.
> I would like to run a linear model with a continuous dependent
> variable (brain volume) and a continuous independent variable (age)
> while controlling for a categorical nuisance variable (gender).
>
> Age and brain volume are correlated.
> There are no gender differences in age but there are significant
> gender differences in brain volume.
> Therefore, I would like to control for gender when assessing the
> association between brain volume and age.
>
> Any help would be very much appreciated.
>
> Jon
>
> ______________________________________________
> R-help <at> r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>

array chip | 1 Mar 2011 02:02

Re: nested case-control study

Terry, thanks very much! 

Professor Langholz used a SAS software trick to estimate absolute risk by 
creating a fake variable "entry_time" that is 0.001 less than the variable 
"exit_time" (i.e. time to event), and then using both variables in PHREG. Is this 
equivalent to your creating a dummy survival with time=1?

Another question: is using offset(logweight) inside the formula of coxph() 
the same as using the weights=logweight argument in coxph()? My understanding 
of Professor Langholz's approach for nested case-control studies is that it is 
weighted regression.
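
One way to check the second question empirically is to fit both forms and compare. The sketch below uses simulated data with made-up weights `w` (not a real nested case-control sample) and assumes the survival package is installed. An offset adds a fixed term to the linear predictor, whereas case weights rescale each observation's contribution to the likelihood, so the two fits generally differ.

```r
library(survival)

# Simulated data; the weights are arbitrary and purely illustrative.
set.seed(1)
n <- 200
d <- data.frame(
  time   = rexp(n),
  status = rbinom(n, 1, 0.7),
  x      = rnorm(n),
  w      = runif(n, 1, 5)
)
d$logweight <- log(d$w)

# log(weight) as an offset in the linear predictor
fit_offset <- coxph(Surv(time, status) ~ x + offset(logweight), data = d)

# the same weights supplied as case weights
fit_weights <- coxph(Surv(time, status) ~ x, data = d, weights = w)

coef(fit_offset)
coef(fit_weights)
```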

Thank you very much for the help.

John

________________________________
From: Terry Therneau <therneau <at> mayo.edu>

Cc: r-help <at> r-project.org
Sent: Mon, February 28, 2011 6:59:23 AM
Subject: Re: [R] nested case-control study

> Hi, I am wondering if there is a package for doing conditional logistic
> regression for nested case-control study as described in "Estimation of
> absolute risk from nested case-control data" by Langholz and Borgan (1997)
> where

Umesh Rosyara | 1 Mar 2011 01:58

Re: stuk at another point: simple question

Dear All

I now realize that it is not simple to deal with real-world problems!

This is what I tried, without any success:

a <- seq(1, nvar, by = 2)
b <- seq(2, nvar, by = 2)

#df2 <- transform(df2, ima1p1 = df2$x1[df2$Parent1],   # Parent 1's allele 1
      #ima2p1 = df2$x2[df2$Parent1],                  # Parent 1's allele 2
      #ima1p2 = df2$x1[df2$Parent2],                  # Parent 2's allele 1
      #ima2p2 = df2$x2[df2$Parent2])                  # Parent 2's allele 2

out <- lapply(1:nmark, function(ind){
      n <- nvar/2
      transform(df2, ima1p1 = df2[, a[ind]][df2$Parent1],   # Parent 1's allele 1
      ima2p1 = df2[, b[ind]][df2$Parent1],                  # Parent 1's allele 2
      ima1p2 = df2[, a[ind]][df2$Parent2],                  # Parent 2's allele 1
      ima2p2 = df2[, b[ind]][df2$Parent2])})                # Parent 2's allele 2

I could not go any further because I already had an error! I am particularly
confused about how I can apply the index in df2$Parent1 or df2$Parent2.

Please help.
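
For a single marker, the parent lookup might be sketched as follows. The layout of `df2` is not shown in this message, so the toy data frame below, including the `id` column, is an assumption. Using `match()` turns parent IDs into row positions, which also works when IDs are not simply 1, 2, 3, ... and gives `NA` for founders with unknown parents.

```r
# Toy pedigree: individuals 1 and 2 are founders, 3 and 4 are their offspring.
# The `id` column and the allele codes are hypothetical.
df2 <- data.frame(
  id      = c(1, 2, 3, 4),
  Parent1 = c(NA, NA, 1, 1),
  Parent2 = c(NA, NA, 2, 2),
  x1      = c("A", "C", "A", "B"),
  x2      = c("B", "D", "C", "D")
)

# Row positions of each individual's parents (NA where unknown)
p1 <- match(df2$Parent1, df2$id)
p2 <- match(df2$Parent2, df2$id)

out <- transform(df2,
                 ima1p1 = x1[p1],   # Parent 1's allele 1
                 ima2p1 = x2[p1],   # Parent 1's allele 2
                 ima1p2 = x1[p2],   # Parent 2's allele 1
                 ima2p2 = x2[p2])   # Parent 2's allele 2
out
```

The same `p1` and `p2` indices can be reused for every marker, so only the allele columns need to change inside a loop or `lapply()` call.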

ufuk beyaztas | 1 Mar 2011 01:39

selection of a subset from a loop

Dear all,

My code looks like this:

e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
x0 <- c(rep(1,50))
x1 <- rnorm(n=50,mean=2,sd=1)
x2 <- rnorm(n=50,mean=2,sd=1)
x3 <- rnorm(n=50,mean=2,sd=1)
x4 <- rnorm(n=50,mean=2,sd=1)
y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
x2[1] = 10     # influential observation
y[1] = 10      # influential observation

X <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
Y <- matrix(y,ncol=1)
Design.data <- cbind(X, Y)

result <- list ()

for( i in 1: 3100) {

data <- Design.data[sample(50,50,replace=TRUE),]
dataX <- data[,1:5]
dataY <- data[,6]

B.cap.simulation <- solve(crossprod(dataX)) %*% crossprod(dataX, dataY)
P.simulation <- dataX %*% solve(crossprod(dataX)) %*% t(dataX)
Y.cap.simulation <- P.simulation %*% dataY
e.simulation <- dataY - Y.cap.simulation

Matt Shotwell | 1 Mar 2011 03:59

Re: Robust variance estimation with rq (failure of the bootstrap?)

Jim, 

If repeated measurements on patients are correlated, then resampling all
measurements independently induces an incorrect sampling distribution
(=> incorrect variance) on a statistic of these data. One solution, as
you mention, is the block or cluster bootstrap, which preserves the
correlation among repeated observations in resamples. I don't
immediately see why the cluster bootstrap is unsuitable.
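
A rough sketch of the cluster bootstrap idea, resampling whole patients rather than single rows, is given below. The data are simulated, and `lm()` is used only to keep the example self-contained in base R; the same resampling loop could wrap a quantile regression fit such as `rq()` from the quantreg package.

```r
# Simulated repeated measures: 30 patients, 4 observations each,
# with a shared patient effect that induces within-patient correlation.
set.seed(42)
n_pat <- 30
n_rep <- 4
dat <- data.frame(
  id = rep(seq_len(n_pat), each = n_rep),
  x  = rnorm(n_pat * n_rep)
)
u <- rnorm(n_pat)
dat$y <- 1 + 0.5 * dat$x + u[dat$id] + rnorm(nrow(dat), sd = 0.5)

# Row indices grouped by patient
rows_by_id <- split(seq_len(nrow(dat)), dat$id)

# One bootstrap replicate: draw patients with replacement and keep
# all rows of each drawn patient together.
boot_slope <- function() {
  ids  <- sample(names(rows_by_id), replace = TRUE)
  rows <- unlist(rows_by_id[ids], use.names = FALSE)
  coef(lm(y ~ x, data = dat[rows, ]))[["x"]]
}

slopes <- replicate(500, boot_slope())
se_cluster <- sd(slopes)   # cluster-bootstrap standard error of the slope
se_cluster
```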

Beyond this, I would be concerned about *any* variance estimates that
are blind to correlated observations.

The bootstrap variance estimate may be larger than the asymptotic
variance estimate, but that alone isn't evidence to favor one over the
other.

Also, I can't justify (to myself) why skew would hamper the quality of
bootstrap variance estimates. I wonder how it affects the sandwich
variance estimate...

Best,
Matt

On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote:
> I am fitting quantile regression models using data collected from a
> sample of 124 patients.  When modeling cross-sectional associations, I
> have noticed that nonparametric bootstrap estimates of the variances
> of parameter estimates are much greater in magnitude than the
> empirical Huber estimates derived using summary.rq's "nid" option.
> The outcome variable is severely skewed, and I am afraid that this may

chen jia | 1 Mar 2011 03:57

Re: Data type problem when extract data from SQLite to R by using RSQLite

Hi Seth,

Thanks for the reply. Below are the output of sessionInfo() and the
schema information that you asked for. Please take a look.

The output from sessionInfo() is
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:

Laura Clasemann | 1 Mar 2011 03:17

Entering table with multiple columns & rows


Hi,

I'm having difficulty getting a table to show with
multiple rows and columns. Below are the commands that I've typed in and
the errors that I am getting. Thank you.

Laura 

 
Table trying to enter:

Diet         Binger-yes   Binger-no   Total
None                 24         134     158
Healthy               9          52      61
Unhealthy            23          72      95
Dangerous            12          15      27

> diet=matrix(c(24,134,9,52,23,72,12,15),ncol=4,byrow=TRUE)
> rownames(diet)=c("none", "healthy", "unhealthy", "dangerous")
Error in dimnames(x) <- dn :
  length of 'dimnames' [1] not equal to array extent
> diet=matrix(c(24,134,9,52,23,72,12,15), ncol=4, byrow=4)
> rownanes(diet)=c("none", "healthy", "unhealthy", "dangerous")
Error in rownanes(diet) = c("none", "healthy", "unhealthy", "dangerous") :
  could not find function "rownanes<-"
> rownames(diet)=c("none", "healthy", "unhealthy", "dangerous")
Error in dimnames(x) <- dn :
  length of 'dimnames' [1] not equal to array extent
> diet=matrix(c(24,134,9,52,23,72,12,15), ncol=4, byrow=4)
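
The eight counts with ncol=4 produce a 2 x 4 matrix, which is why four row names do not fit. A possible way to enter the table, sketched under the assumption that the two count columns are wanted first and the totals are computed afterwards (the column names are made up for the example):

```r
# Eight counts, filled row by row into 4 rows x 2 columns
diet <- matrix(c(24, 134,
                  9,  52,
                 23,  72,
                 12,  15),
               ncol = 2, byrow = TRUE)
rownames(diet) <- c("none", "healthy", "unhealthy", "dangerous")
colnames(diet) <- c("binger_yes", "binger_no")

# Append the row totals as a third column
diet_total <- cbind(diet, total = rowSums(diet))
diet_total
```

Note that `byrow` expects `TRUE` or `FALSE`; a value such as 4 is only coerced to `TRUE`.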