1 Mar 2011 01:14

### regression with categorical nuisance variable

```Hi,

I am new to R, so I am unsure of the formula to set up this analysis.
I would like to run a linear model with a continuous dependent
variable (brain volume) and a continuous independent variable (age)
while controlling for a categorical nuisance variable (gender).

Age and brain volume are correlated.
There are no gender differences in age but there are significant
gender differences in brain volume.
Therefore, I would like to control for gender when assessing the
association between brain volume and age.

Any help would be very much appreciated.

Jon

```
1 Mar 2011 01:31

### Re: regression with categorical nuisance variable

```Hi Jon,
Just enter it as a predictor in the model. You almost can't go wrong
with this one. Usually I would caution you to convert your categorical
variables to factors and make sure the contrasts are set how you want
them, but in this case it doesn't matter because there are (I assume)
only two levels of gender, and you don't really care about
interpreting the coefficient anyway.

Best,
Ista

On Mon, Feb 28, 2011 at 7:14 PM, Jonathan DuBois
<jonathan.m.dubois <at> gmail.com> wrote:
> Hi,
>
> I am new to R, so I am unsure of the formula to set up this analysis.
> I would like to run a linear model with a continuous dependent
> variable (brain volume) and a continuous independent variable (age)
> while controlling for a categorical nuisance variable (gender).
>
> Age and brain volume are correlated.
> There are no gender differences in age but there are significant
> gender differences in brain volume.
> Therefore, I would like to control for gender when assessing the
> association between brain volume and age.
>
> Any help would be very much appreciated.
>
> Jon
>
```
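A minimal sketch of the model described above, on simulated data. The data frame `d`, its column names, and all effect sizes are made up for illustration and are not Jon's data.

```r
# Simulated stand-in data: names and effect sizes are assumptions
set.seed(1)
n <- 100
d <- data.frame(
  age    = runif(n, 20, 80),
  gender = factor(sample(c("F", "M"), n, replace = TRUE))
)
d$volume <- 1500 - 3 * d$age + 120 * (d$gender == "M") + rnorm(n, sd = 40)

# Gender enters as an additive predictor; the age coefficient is then
# the age slope adjusted for gender
fit <- lm(volume ~ age + gender, data = d)
print(summary(fit)$coefficients)

age_slope <- coef(fit)[["age"]]
```

With the default treatment contrasts, the `genderM` row is the M minus F difference at a given age. The `age` row is the quantity of interest here.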

1 Mar 2011 01:31

### Re: nls not solving

```On 2011-02-28 14:14, Schatzi wrote:
> I am not sure how you simplified the model to:
> y = a + b(1 - exp(kl)) - b exp(-kx)
>
> I tried simplifying it but only got to:
> y = a + b - b * exp(kl) * exp(-kx)
>
> I agree that the model must not be identifiable. That makes sense,
> especially given that removing either a or l makes the model work. Can you
> please further explain the math though as I am not understanding it? I do
> not see how you obtained your equation, and when I tried to solve using your
> equation I got quite different numbers. Thank you.

You can obviously write your function as

f <- function(x, A, B, K) {A - B * exp(-K * x)}

i.e. in terms of *3* parameters. In that form,
it's apple pie for nls().

fm <- nls(y ~ f(x, A, B, K),
          start = list(A = 50, B = 60, K = 1))

coef(fm)
xx <- seq(0, 72, length = 101)
yy <- predict(fm, newdata = list(x = xx))
plot(x, y)
lines(xx, yy, col = "red")

Peter Ehlers
```
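A hedged sketch of why the four-parameter form is not identifiable. It assumes the original model was `y = a + b * (1 - exp(-k * (x - l)))`, which is not shown in this excerpt but is consistent with Schatzi's expansion `a + b - b * exp(kl) * exp(-kx)`. Under that assumption, `A = a + b` and `B = b * exp(k * l)`, so different `(a, b, l)` triples give the same curve. The data below are simulated, and the helper `g` and all numeric values are illustrative.

```r
# Four-parameter form (assumed) and its three-parameter equivalent
g <- function(x, a, b, k, l) a + b * (1 - exp(-k * (x - l)))
f <- function(x, A, B, K) A - B * exp(-K * x)

x <- seq(0, 72, length.out = 40)
k <- 0.1

# Two different (a, b, l) triples with the same A = a + b = 100
# and the same B = b * exp(k * l) = 80
y1 <- g(x, a = 20, b = 80, k = k, l = 0)
b2 <- 80 * exp(-k * 5)
y2 <- g(x, a = 100 - b2, b = b2, k = k, l = 5)
same_curve <- isTRUE(all.equal(y1, y2))

# The three-parameter form fits without trouble on noisy simulated data
set.seed(123)
y <- f(x, A = 100, B = 80, K = 0.1) + rnorm(length(x), sd = 1)
fm <- nls(y ~ f(x, A, B, K), start = list(A = 95, B = 75, K = 0.12))
est <- coef(fm)
print(est)
```

Because `y1` and `y2` coincide, no amount of data can separate `a` from `l`, which matches the observation that dropping either one makes the fit work.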

1 Mar 2011 01:52

### Re: regression with categorical nuisance variable

```You may try analysis of covariance.
But, as you say, "There are no gender differences in age", so why not pool
the two genders and ignore gender?

```
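On the pooling suggestion, a small simulated comparison may help. When gender is unrelated to age but does shift brain volume, dropping gender leaves the age slope roughly unbiased but pushes the gender difference into the residual, which typically widens the standard error of the slope. All names and numbers below are assumptions for illustration.

```r
# Simulated data: gender is balanced and independent of age,
# but shifts volume by a fixed amount (values are made up)
set.seed(2)
n <- 200
age    <- runif(n, 20, 80)
gender <- factor(rep(c("F", "M"), each = n / 2))
volume <- 1500 - 3 * age + 150 * (gender == "M") + rnorm(n, sd = 50)

fit_pooled   <- lm(volume ~ age)            # gender ignored
fit_adjusted <- lm(volume ~ age + gender)   # analysis of covariance

se_pooled   <- summary(fit_pooled)$coefficients["age", "Std. Error"]
se_adjusted <- summary(fit_adjusted)$coefficients["age", "Std. Error"]
print(c(pooled = se_pooled, adjusted = se_adjusted))
```

In this setup the adjusted model gives the tighter standard error for `age`, which is the usual argument for keeping a nuisance covariate that explains variation in the outcome.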

1 Mar 2011 02:02

### Re: nested case-control study

```Terry, thanks very much!

Professor Langholz used a SAS software trick to estimate absolute risk by
creating a fake variable "entry_time" that is 0.001 less than the variable
"exit_time" (i.e. time to event), and then using both variables in PHREG. Is this
equivalent to your creating a dummy survival with time=1?

Another question: is using offset(logweight) inside the formula of coxph()
the same as using the weights=logweight argument in coxph()? My understanding
of Professor Langholz's approach for nested case-control studies is that it is
a weighted regression.

Thank you very much for the help.

John

________________________________
From: Terry Therneau <therneau <at> mayo.edu>

Cc: r-help <at> r-project.org
Sent: Mon, February 28, 2011 6:59:23 AM
Subject: Re: [R] nested case-control study

> Hi, I am wondering if there is a package for doing conditional logistic
> regression for nested case-control study as described in "Estimation of
> absolute risk from nested case-control data" by Langholz and Borgan (1997)
> where
```
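One way to explore the offset-versus-weights question empirically is to fit both on the same data and compare. This is only a sketch on simulated data, not a reconstruction of the Langholz and Borgan method: the variable `logweight` is filled with arbitrary values, and the weighted fit uses `exp(logweight)` on the assumption that the weights themselves, not their logs, would be passed as case weights.

```r
library(survival)

# Simulated data; all values are illustrative assumptions
set.seed(1)
n <- 200
x <- rnorm(n)
logweight <- log(runif(n, 1, 5))
time   <- rexp(n, rate = exp(0.5 * x))
status <- rep(1, n)

# offset(): adds logweight to the linear predictor with coefficient fixed at 1
fit_offset <- coxph(Surv(time, status) ~ x + offset(logweight))

# weights=: treats each row as a case weight on its likelihood contribution
fit_weight <- coxph(Surv(time, status) ~ x, weights = exp(logweight))

print(c(offset = unname(coef(fit_offset)), weights = unname(coef(fit_weight))))
```

The two calls are different mechanisms and generally return different estimates, so which one matches a given published method has to be checked against that method's likelihood.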

1 Mar 2011 01:58

### Re: stuk at another point: simple question

```Dear All

I now realize that it is not simple to deal with real-world problems!

This is what I tried, without any success:

a <- seq(1, nvar, by = 2)
b <- seq(2, nvar, by = 2)

#df2 <- transform(df2, ima1p1 = df2$x1[df2$Parent1],   # Parent 1's allele 1
#ima2p1 = df2$x2[df2$Parent1],                  # Parent 1's allele 2
#ima1p2 = df2$x1[df,                    # Parent 2's allele 1
#ima2p2 = df2$x2[df2$Parent2])                 # Parent 2's allele 2

out <- lapply(1:nmark, function(ind){
n <- nvar/2
transform(df2, ima1p1 = df2[, a[ind]][df$Parent1],   # Parent 1's allele 1
ima2p1 = df2[, b[ind]][df2$Parent1],                  # Parent 1's allele 2
ima1p2 = df2[, a[ind]][df2$Parent2],                  # Parent 2's allele 1
ima2p2 = df2[, a[ind]][df2$Parent2])}                  # Parent 2's allele 2

I could not go any further because I already got an error! I am particularly
confused about how to apply the index in df2$Parent1 or df2$Parent2.

```
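A possible working version of the lookup, on a hypothetical toy pedigree. It assumes `Parent1` and `Parent2` hold the row index of each parent (NA for founders) and that the allele columns come in pairs per marker; the column names `m1a1`, `m1a2`, and so on are invented. Three things in the posted attempt look like candidate sources of the error: `df$Parent1` refers to `df` rather than `df2`, the last line reuses `a[ind]` where `b[ind]` seems intended, and the `lapply(` call is never closed.

```r
# Toy pedigree: rows 1-2 are founders, rows 3-4 are their offspring
df2 <- data.frame(
  Parent1 = c(NA, NA, 1, 1),
  Parent2 = c(NA, NA, 2, 2),
  m1a1 = c(1, 2, 1, 1), m1a2 = c(1, 2, 2, 2),   # marker 1, alleles 1 and 2
  m2a1 = c(3, 4, 3, 3), m2a2 = c(4, 4, 4, 4)    # marker 2, alleles 1 and 2
)

allele_cols <- c("m1a1", "m1a2", "m2a1", "m2a2")
nvar  <- length(allele_cols)
nmark <- nvar / 2
a <- allele_cols[seq(1, nvar, by = 2)]   # first allele of each marker
b <- allele_cols[seq(2, nvar, by = 2)]   # second allele of each marker

# For each marker, look up each individual's parents' alleles by row index
out <- lapply(seq_len(nmark), function(ind) {
  data.frame(
    ima1p1 = df2[[a[ind]]][df2$Parent1],   # Parent 1's allele 1
    ima2p1 = df2[[b[ind]]][df2$Parent1],   # Parent 1's allele 2
    ima1p2 = df2[[a[ind]]][df2$Parent2],   # Parent 2's allele 1
    ima2p2 = df2[[b[ind]]][df2$Parent2]    # Parent 2's allele 2
  )
})
print(out[[1]])
```

Indexing a column vector by `df2$Parent1` returns, for every row, the value found in that row's parent, and NA where the parent index is NA.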

1 Mar 2011 01:39

### selection of a subset from a loop

```Hi dear all,

The code is like this:

e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
x0 <- c(rep(1,50))
x1 <- rnorm(n=50,mean=2,sd=1)
x2 <- rnorm(n=50,mean=2,sd=1)
x3 <- rnorm(n=50,mean=2,sd=1)
x4 <- rnorm(n=50,mean=2,sd=1)
y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
x2[1] = 10     #influential observation
y[1] = 10      #influential observation

X <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
Y <- matrix(y,ncol=1)
Design.data <- cbind(X, Y)

result <- list ()

for( i in 1: 3100) {

data <- Design.data[sample(50,50,replace=TRUE),]
dataX <- data[,1:5]
dataY <- data[,6]

B.cap.simulation <- solve(crossprod(dataX)) %*% crossprod(dataX, dataY)
P.simulation <- dataX %*% solve(crossprod(dataX)) %*% t(dataX)
Y.cap.simulation <- P.simulation %*% dataY
e.simulation <- dataY - Y.cap.simulation
```

1 Mar 2011 03:59

### Re: Robust variance estimation with rq (failure of the bootstrap?)

```Jim,

If repeated measurements on patients are correlated, then resampling all
measurements independently induces an incorrect sampling distribution
(=> incorrect variance) on a statistic of these data. One solution, as
you mention, is the block or cluster bootstrap, which preserves the
correlation among repeated observations in resamples. I don't
immediately see why the cluster bootstrap is unsuitable.

Beyond this, I would be concerned about *any* variance estimates that
are blind to correlated observations.

The bootstrap variance estimate may be larger than the asymptotic
variance estimate, but that alone isn't evidence to favor one over the
other.

Also, I can't justify (to myself) why skew would hamper the quality of
bootstrap variance estimates. I wonder how it affects the sandwich
variance estimate...

Best,
Matt

On Mon, 2011-02-28 at 17:50 -0600, James Shaw wrote:
> I am fitting quantile regression models using data collected from a
> sample of 124 patients.  When modeling cross-sectional associations, I
> have noticed that nonparametric bootstrap estimates of the variances
> of parameter estimates are much greater in magnitude than the
> empirical Huber estimates derived using summary.rq's "nid" option.
> The outcome variable is severely skewed, and I am afraid that this may
```
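A rough sketch of the contrast Matt describes, using base R only. The simulated data, the patient IDs, and the choice of the sample median as the statistic are assumptions made to keep the example self-contained; with quantreg one would refit the `rq` model inside the resampling function instead.

```r
# Simulated repeated measures: 124 patients, 4 correlated measurements each
set.seed(42)
n_pat <- 124
n_rep <- 4
id <- rep(seq_len(n_pat), each = n_rep)
y  <- rep(rnorm(n_pat, sd = 2), each = n_rep) + rnorm(n_pat * n_rep, sd = 0.5)

B <- 500

# Naive bootstrap: resample individual measurements, ignoring patients
naive <- replicate(B, median(y[sample(length(y), replace = TRUE)]))

# Cluster bootstrap: resample patients, keeping each patient's rows together
rows_by_id <- split(seq_along(y), id)
cluster <- replicate(B, {
  ids <- sample(n_pat, replace = TRUE)
  median(y[unlist(rows_by_id[ids], use.names = FALSE)])
})

se_naive   <- sd(naive)
se_cluster <- sd(cluster)
print(c(naive = se_naive, cluster = se_cluster))
```

With strong within-patient correlation, the naive resampling treats the measurements as if they were independent and so tends to understate the variability that the cluster version picks up.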

1 Mar 2011 03:57

### Re: Data type problem when extract data from SQLite to R by using RSQLite

```Hi Seth,

The output from sessionInfo() is
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
```

1 Mar 2011 03:17

### Entering table with multiple columns & rows

```
Hi,

I'm having difficulty getting a table to show with
multiple rows and columns. Below are the commands that I've typed in and
the errors that I am getting. Thank you.

Laura

Table trying to enter:

Diet:        Binger-yes:   Binger-No:   Total:
None                 24          134      158
Healthy               9           52       61
Unhealthy            23           72       95
Dangerous            12           15       27

> diet=matrix(c(24,134,9,52,23,72,12,15),ncol=4,byrow=TRUE)
> rownames(diet)=c("none", "healthy", "unhealthy", "dangerous")
Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent
> diet=matrix(c(24,134,9,52,23,72,12,15), ncol=4, byrow=4)
> rownanes(diet)=c("none", "healthy", "unhealthy", "dangerous")
Error in rownanes(diet) = c("none", "healthy", "unhealthy", "dangerous") :
could not find function "rownanes<-"
> rownames(diet)=c("none", "healthy", "unhealthy", "dangerous")
Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent
> diet=matrix(c(24,134,9,52,23,72,12,15), ncol=4, byrow=4)
```
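A sketch of one way to enter the table. The eight counts fill four diet rows, so the matrix needs `ncol = 2`; with `ncol = 4` R builds a 2 x 4 matrix, and four row names no longer fit, which is what the `dimnames` error reports. The dimension labels `Diet` and `Binger` and the name `diet_total` are choices made here for illustration.

```r
counts <- c(24, 134,
             9,  52,
            23,  72,
            12,  15)

# Four rows (diet groups) by two columns (binger yes / no), filled row by row
diet <- matrix(counts, ncol = 2, byrow = TRUE,
               dimnames = list(Diet   = c("none", "healthy", "unhealthy", "dangerous"),
                               Binger = c("yes", "no")))

# Append the row totals as a third column
diet_total <- cbind(diet, Total = rowSums(diet))
print(diet_total)
```

Two smaller points from the transcript: `byrow` expects `TRUE` or `FALSE` rather than a count, and `rownanes` is a typo for `rownames`.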