genome | 15 Dec 18:11 2014

Digest for genome <at> soe.ucsc.edu - 1 update in 1 topic

Nadine Mwemena <nadmub03 <at> hotmail.com>: Dec 14 09:56AM -0500

Hello, my name is Nadine and I am a grad student at UMUC. I need some help completing an assignment for my Introduction to Bioinformatics class. I tried doing it myself, but I couldn't find my way around the website. Please help. Here are the questions:
a. Using the human genome assembly Feb. 2009 (hg19), locate the gene.
Clear all tracks and display only 1. Base Position (full), 2. RefSeq
Genes (full), 3. ENCODE Regulation (show). Take a screenshot of the gene with
those three tracks, including the regions 1000 bp immediately upstream and
1000 bp downstream of the gene.
 
b. How many mRNA transcripts does this gene have? How many
exons are in each transcript?
 
c. Using 5 sentences or fewer, discuss what information the ENCODE track provides.
I look forward to your response.
Thanks,
Nadine
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to genome+unsubscribe <at> soe.ucsc.edu.

genome | 5 Dec 18:06 2014

Digest for genome <at> soe.ucsc.edu - 9 updates in 7 topics

Jonathan Casper <jcasper <at> soe.ucsc.edu>: Dec 04 05:43PM -0800

Hello Hussain,
 
Thank you for your question about the divergence values displayed for items
in the RepeatMasker track. These values are calculated by the RepeatMasker
software, so your best source of information on this topic is most likely
the authors themselves. That said, the information at
http://www.repeatmasker.org/webrepeatmaskerhelp.html suggests that the
divergence value is simply the percentage of bases in the matching region
that were substitutions with respect to the consensus repeat sequence.
 
I hope this is helpful. If you have any further questions, please reply to
genome <at> soe.ucsc.edu or genome-mirror <at> soe.ucsc.edu. Questions sent to those
addresses will be archived in publicly-accessible forums for the benefit of
other users. If your question contains sensitive data, you may send it
instead to genome-www <at> soe.ucsc.edu.
 
--
Jonathan Casper
UCSC Genome Bioinformatics Group
 
On Fri, Nov 28, 2014 at 11:33 AM, Askree, Hussain <hussain.askree <at> emory.edu>
wrote:
 
Matthew Speir <mspeir <at> soe.ucsc.edu>: Dec 04 04:15PM -0800

Hi Brijesh,
 
Thank you for your question about getting SNP data for the IL10 gene.
There are many different variation tracks available for the hg19
assembly of the human genome,
http://genome.ucsc.edu/cgi-bin/hgGateway?db=hg19. These variation tracks
are separated into two track groups in the UCSC Genome Browser, the
"Phenotype and Literature" group and the "Variation" group. SNP tracks
under "Phenotype and Literature", such as OMIM AV SNPs or ClinVar
Variants, contain variants that have been associated with a disease
phenotype. Most of the SNP tracks under "Variation", such as All
SNPs(141) or EVS Variants, contain SNPs and other variants from various
SNP data repositories. If you are interested in any and all SNPs that
may occur in the IL10 gene, then you will most likely be interested in
the All SNPs(141) track,
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=snp141, which
contains SNPs from the dbSNP 141 release.
 
You can get the SNPs from the All SNPs(141) track that occur in both the
exons and introns of the IL10 gene using the Table Browser. To get this
information, use the following steps:
 
1. Navigate to the Table Browser, http://genome.ucsc.edu/cgi-bin/hgTables.
 
2. Select your assembly and tracks
 
clade: Mammal
genome: Human
assembly: Feb. 2009 (GRCh37/hg19)
group: Variation
track: All SNPs(141)
table: snp141
region: click the button next to position, and then enter the
position: chr1:206,940,948-206,945,839
output: all fields from selected table
output file: enter a file name to save your results to a file, or
leave blank to display results in your browser
 
3. Click "get output".
 
You can see an explanation of what the various columns in your output
represent by clicking the "describe table schema" button on the Table
Browser after you've selected the snp141 table. You can also replace the
track and table with your variation track of interest if the All
SNPs(141) track does not fit your need.
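
For readers who prefer scripting, the same region filter can be sketched in Python. This is only an illustration under stated assumptions: the helper names are made up, and the query assumes the usual UCSC table convention of 0-based, half-open chromStart/chromEnd columns, which is why 1 is subtracted from the start of the browser position. The resulting SQL could be run against UCSC's public MySQL server (for example with the mysql client, user "genome", database hg19); that step is not shown here.

```python
from typing import Tuple


def parse_position(position: str) -> Tuple[str, int, int]:
    """Convert a browser position such as 'chr1:206,940,948-206,945,839'
    (1-based, inclusive) to (chrom, chromStart, chromEnd) in 0-based,
    half-open table coordinates."""
    chrom, span = position.strip().split(":")
    start_text, end_text = span.replace(",", "").split("-")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"bad position: {position!r}")
    return chrom, start - 1, end


def overlap_query(table: str, position: str) -> str:
    """Build a SQL query selecting all rows of `table` that overlap `position`."""
    chrom, start, end = parse_position(position)
    return (
        f"SELECT * FROM {table} "
        f"WHERE chrom = '{chrom}' AND chromStart < {end} AND chromEnd > {start}"
    )


print(overlap_query("snp141", "chr1:206,940,948-206,945,839"))
```

The Table Browser steps above remain the simplest route; the sketch is mainly useful when the same lookup has to be repeated for many genes.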
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 12/4/14, 5:13 AM, Brijesh Dabhi wrote:
Scott Wilson <livvy01 <at> uab.edu>: Dec 04 07:50PM

Dear UCSC,
 
Do your mouse genome maps include BAC contains? I am trying to find a BAC that has the MLK3 gene. Thanks.
 
Scott
Matthew Speir <mspeir <at> soe.ucsc.edu>: Dec 04 03:57PM -0800

Hi Scott,
 
Thank you for your question about getting BAC information for the MLK3
gene. We do have BAC End Pair information on the Genome Browser for the
mm9 assembly, http://genome.ucsc.edu/cgi-bin/hgGateway?db=mm9. From that
Gateway page, you can search for your MLK3 gene to find its position
within the genome. Please note that in our UCSC Genes track, MLK3 is
labelled as Map3k11. Click the only result under the "UCSC Genes"
section to navigate to the Genome Browser position for that gene. Then,
you will need to turn on the BAC End Pairs track. To do so, find the BAC
End Pairs track under the "Mapping and Sequencing" group, select "pack"
or "full" from the drop-down below the track name and then click
refresh. You should now see a gene track containing the MLK3 gene and
another track showing the BAC End Pairs.
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 12/4/14, 11:50 AM, Scott Wilson wrote:
Clara BENOIT <clara.benoitp <at> gmail.com>: Dec 04 07:51PM +0100

Hi,
 
I'm trying to visualize 2 bigwig files on UCSC.
I followed the step-by-step description of how to set up a track hub and I
loaded it in UCSC (with no error message). The tracks appear in the browser
but there is not any data inside... (see screenshot attached)
Here's my hub link :
https://fasterdb.lyon.unicancer.fr/UCSC_hubs/sipp_hub/hub.txt.
 
I know that my bigwig files are correct because I can visualize them in IGV
(see screenshot attached).
 
When I ran hubCheck with the link above, I got this error:
*Couldn't open /Users/clarabenoit/.hg.conf , No such file or directory*
I used the following workaround: HGDB_CONF=/dev/null hubCheck
https://fasterdb.lyon.unicancer.fr/UCSC_hubs/sipp_hub/hub.txt.
And I didn't get any error message.
 
Is there a problem with my hub that is not reported by hubCheck?
 
Thanks in advance,
Clara Benoit
Brian Lee <brianlee <at> soe.ucsc.edu>: Dec 04 12:30PM -0800

Dear Clara Benoit,
 
Thank you for using the UCSC Genome Browser and for your question about
hub bigWig files that display in IGV but not at UCSC. The issue stems from
the chromosome names used in the bigWig files. The files have 1 for
chromosome 1 and MT for the mitochondrion, which are acceptable notations in
IGV, whereas the UCSC browser needs the prefix "chr" in front, and uses M
instead of MT.
 
Thank you for your input about hubCheck. We are planning to improve it and
will definitely include reporting this kind of error. If you have more
suggestions, we would greatly enjoy hearing them!
 
Here are some steps I tested that you could use to change your annotation
files so they will display at UCSC in a hub. If you do not have the
original wiggle/bedGraph files, you will need to expand your bigWigs with
the utility "bigWigToWig". Then you would use the command line to add "chr"
to the front of each line and swap MT with M (and remove any comment
lines). Lastly, you would repackage the wig/bedGraph into a new binary
bigWig with "wigToBigWig".
 
The utilities are available in this directory, organized by operating
system (run "uname -a" on the command line to identify yours). Download
"bigWigToWig" and "wigToBigWig" from: http://hgdownload.soe.ucsc.edu/admin/exe/
 
To run wigToBigWig you will also need a file that lists the size of each
chromosome (chrom.sizes). You can acquire it by running this command:
curl -O http://genome.ucsc.edu/goldenPath/help/hg19.chrom.sizes
 
1. Expand your bigWig (if you don't have it already in the non-bigWig
state).
bigWigToWig accepted_hits.bw new.wig
 
2. Remove comment lines, change MT to M, and add chr to the start of
every line (this assumes bedGraph-style output, where every data line
begins with the chromosome name):
grep -v "^#" new.wig | sed -e "s/^MT/M/" -e "s/^/chr/" > new.fixed.wig
 
3. Turn new.fixed.wig, which now has chrM instead of MT and chr20 instead of
20, into a bigWig:
wigToBigWig new.fixed.wig hg19.chrom.sizes new.fixed.bw
 
Now you can put the URL to that new.fixed.bw into your trackDb.txt, or even
load it directly via custom tracks:
http://genome.ucsc.edu/cgi-bin/hgTracks?hgt.customText=http://university.edu/lab/directory/new.fixed.bw
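
The one-line edit in step 2 prefixes every line, which suits bedGraph-style output. If bigWigToWig emits fixedStep or variableStep wiggle instead, the chromosome name sits in a chrom= attribute on the declaration line, and prefixing every line would corrupt the file. The following Python sketch handles both layouts. It is an illustration only: the function names are made up, and it renames only simple names (1, X, MT); unplaced contigs have different UCSC names and would still need a lookup table.

```python
import re

_CHROM_ATTR = re.compile(r"\bchrom=(\S+)")


def to_ucsc_name(name: str) -> str:
    """Map a name such as 1, X or MT to UCSC style (chr1, chrX, chrM)."""
    if name.startswith("chr"):
        return name
    return "chrM" if name == "MT" else "chr" + name


def fix_line(line: str):
    """Return the rewritten line, or None if the line should be dropped."""
    if line.startswith("#"):
        return None  # comment line
    if line.startswith(("track", "browser")):
        return line  # header lines carry no chromosome name
    if line.startswith(("fixedStep", "variableStep")):
        # wiggle declaration: rename the value of chrom=...
        return _CHROM_ATTR.sub(
            lambda m: "chrom=" + to_ucsc_name(m.group(1)), line
        )
    fields = line.split()
    if len(fields) == 4:
        # bedGraph data line: chrom, start, end, value
        fields[0] = to_ucsc_name(fields[0])
        return "\t".join(fields)
    return line  # wiggle data line (value, or position and value)


def fix_text(text: str) -> str:
    """Apply fix_line to every line of a wig/bedGraph text."""
    kept = (fix_line(line) for line in text.splitlines())
    return "\n".join(line for line in kept if line is not None)


if __name__ == "__main__":
    import sys
    # Usage: python fix_chroms.py < new.wig > new.fixed.wig
    sys.stdout.write(fix_text(sys.stdin.read()) + "\n")
```

The output can then be passed to wigToBigWig exactly as in step 3.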
 
Thank you again for using the UCSC Genome Browser hubs feature and sharing
the issues you are having. Please feel free to reply with more feedback or
send further suggestions to our suggestion box at
http://genome.ucsc.edu/cgi-bin/hgUserSuggestion.
 
If you have any further questions, please reply to genome <at> soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-www <at> soe.ucsc.edu.
 
All the best,
 
Brian Lee
UCSC Genome Bioinformatics Group
 
On Thu, Dec 4, 2014 at 10:51 AM, Clara BENOIT <clara.benoitp <at> gmail.com>
wrote:
 
Luvina Guruvadoo <luvina <at> soe.ucsc.edu>: Dec 04 10:49AM -0800

Hello Martin,
 
Thank you for your question. From the Gateway page (
http://genome.ucsc.edu/cgi-bin/hgGateway), choose your assembly and click
the 'track search' button. Type in a search term and click 'search'. For a
description of the track, click on the track name. To turn a track on,
click the checkbox next to the track and then click 'view in browser'. For
more information about using Track Search, please refer to this help page:
http://genome.ucsc.edu/goldenPath/help/trackSearch.html.
 
If you are unfamiliar with the Genome Browser, we have a number of
resources and tutorials to help get you started:
http://genome.ucsc.edu/training.html
 
To answer your question about the Gene Sorter, yes, your results will show
matches in all chromosomes. Please see this help page for more information
on how to use and understand the Gene Sorter display:
http://genome.ucsc.edu/goldenPath/help/hgNearHelp.html.
 
If you have any further questions, please reply to genome <at> soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-www <at> soe.ucsc.edu.
 
- - -
Luvina Guruvadoo
UCSC Genome Bioinformatics Group
 
 
On Wed, Dec 3, 2014 at 3:13 AM, Martin Garbe <
Brian Lee <brianlee <at> soe.ucsc.edu>: Dec 04 10:11AM -0800

Dear Eric,
 
Thank you for using the UCSC Genome Browser and your question about the
last two columns in the Txn Fac ChIP tables referring to experiment IDs and
experiment scores.
 
When viewing the Table Schemas in the Table Browser, if you scroll down you
should see a Track Description page, such as the below:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeRegTfbsClusteredV3
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeRegTfbsClusteredV2
 
These Track Descriptions often provide more detail than the one-line
field descriptions in the schema. For example, they include these sentences: "The
first list field (expNums) contains numeric identifiers for experiments,
keyed to the wgEncodeRegTfbsClusteredInputsV3 table, which includes such
information as the experiment's underlying Uniform TFBS table name, factor
targeted, antibody used, cell type, treatment (if any), and laboratory
source. The second list field (expScores) contains the scores for the
corresponding experiments."
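
To make the pairing of the two list fields concrete, here is a small Python sketch that parses one line of the clustered TFBS BED download and zips each experiment number with its score. The column order (chrom, chromStart, chromEnd, name, score, expCount, expNums, expScores) is an assumption based on the field names quoted above, and the sample line is invented for illustration, not real data.

```python
def parse_cluster_line(line: str) -> dict:
    """Split one tab-separated line and pair expNums with expScores."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end, name, score, exp_count, exp_nums, exp_scores = fields[:8]
    # Comma-separated lists may carry a trailing comma, so skip empty items.
    nums = [int(x) for x in exp_nums.split(",") if x]
    scores = [float(x) for x in exp_scores.split(",") if x]
    if len(nums) != len(scores):
        raise ValueError("expNums and expScores differ in length")
    return {
        "chrom": chrom,
        "chromStart": int(start),
        "chromEnd": int(end),
        "name": name,
        "score": int(score),
        "expCount": int(exp_count),
        "experiments": list(zip(nums, scores)),
    }


# Invented example line, for illustration only
example = "chr1\t100\t200\tCTCF\t1000\t2\t12,45,\t800,1000,"
print(parse_cluster_line(example)["experiments"])
```

Each experiment number can then be looked up in the inputs table named in the track description to recover the cell type and antibody.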
 
Also on the Track Description page you will find a "downloads" link that
will bring you to a page where you can quickly acquire files, versus using
the Table Browser to output the information to a named.txt file. In this
case you could click the links for the wgEncodeRegTfbsClusteredV2.bed.gz
and wgEncodeRegTfbsClusteredV3.bed.gz files:
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeRegTfbsClustered/
 
In essence, these final two columns are metadata: coded references into
further associated tables that identify the experiments (and hence cell
types) providing evidence for transcription factor binding at each location.
If this information is not of interest, you could ignore those last two
columns. Please see this previously answered mailing list question for
further information:
https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/93Rqrrs--y8/VceGRwgyh88J
 
Thank you again for your inquiry and using the UCSC Genome Browser. If you
have any further questions, please reply to genome <at> soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-www <at> soe.ucsc.edu.
 
All the best,
 
Brian Lee
UCSC Genome Bioinformatics Group
 
 
 
Matthew Speir <mspeir <at> soe.ucsc.edu>: Dec 04 10:10AM -0800

Hi Jie,
 
Thank you for your question about transcription factor data for Rat
(RGSC 6.0/rn6). Unfortunately, it appears that we don't provide
transcription factor data for any of the rat assemblies that we host
here at UCSC. However, you may be able to find other online resources
that contain this information.
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 12/4/14, 9:03 AM, Hou, Jie (MU-Student) wrote:
genome | 2 Dec 18:12 2014

Digest for genome <at> soe.ucsc.edu - 6 updates in 5 topics

Matthew Speir <mspeir <at> soe.ucsc.edu>: Dec 01 04:36PM -0800

Hi Juan,
 
Thank you for your question about the RepeatMasker track in the UCSC
Genome Browser. It appears that you are confusing the "GIRI Repbase
Reports library" with the "GIRI Repbase-derived RepeatMasker library".
When using RepeatMasker to look for repeats in your species of interest,
you should use the GIRI Repbase-derived RepeatMasker Library. In there,
you will find the name, sequence and phylogenetic labels for each repeat
so you can determine if that repeat is found in your organism of
interest. One of the creators of RepeatMasker notes that if you download
that library along with the RepeatMasker package, you can use one of the
included utilities to find the repeats specific to an organism by running:
 
<RepeatMaskerDir>/util/queryRepeatDatabase.pl -species "human" -stat
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
"Fu, Yong" <yfu <at> coh.org>: Dec 01 04:19PM -0800

Dear Sir/Madam,
 
I am a postdoc at Beckman Research Institute, City of Hope. I downloaded all protein-coding gene intron sequences for the whole genome from assembly hg19, UCSC Genes track. However, I want to download the intron sequences sorted into the following five categories:
 
 
1. Introns flanked by coding region on both sides;
 
2. Introns flanked by coding region on left side only;
 
3. Introns flanked by coding region on right side only;
 
4. Introns not flanked by coding region, but on the 5' side of a coding gene;
 
5. Introns not flanked by coding region, but on the 3' side of a coding gene;
 
 
It would be very helpful for my project if the length of the flanking coding regions were also included in the downloaded sequence file or in a separate file.
 
I tried to figure out the method but I couldn't. Could you give me instructions for how to do the above downloads?
 
Thanks for your help in advance!
 
Simon
 
Department of Molecular Pharmacology
Beckman Research Institute
City of Hope
1500 East Duarte Road
Duarte, CA 91010-3000
Tel: (626) 256-HOPE ext. 65988
Email: yfu <at> coh.org
 
 
 
Clint Christensen <clintc <at> txbiomedgenetics.org>: Dec 01 04:29PM -0600

Howdy!
 
I am trying to download restriction site data for the entire cow, rhesus, and baboon genomes from the Table Browser but cannot seem to find it. The track is present in the "Mapping and Sequencing" group of the Genome Browser, and data is available at the appropriate zoom level. If it is not possible to get the data from the Genome Browser website or Table Browser, can someone offer advice on how to do this? I have checked other online resources with no luck (such as REBASE and ExPASy). Thank you in advance for any help.
 
 
Clint Christensen
Senior Research Assistant, Department of Genetics
Texas Biomedical Research Institute
7620 NW Loop 410
San Antonio, TX 78245
210-258-9779
DENNIS XING <yxing <at> berkeley.edu>: Nov 29 11:25AM -0800

Hi,
 
I am interested in analyzing the Ebola virus, and I need the aligned 148
complete ebola genomes from UCSC genome browser.
 
Could you email me the file?
 
Thanks,
 
Dennis
"Steve Heitner" <steve <at> soe.ucsc.edu>: Dec 01 10:18AM -0800

Hello, Dennis.
 
Our Ebola downloads are located at http://hgdownload.cse.ucsc.edu/downloads.html#ebola_virus. The file you are likely interested in is http://hgdownload.cse.ucsc.edu/goldenPath/eboVir3/bigZips/160sequences.tar.gz. The corresponding MAF file can be found at http://hgdownload.cse.ucsc.edu/goldenPath/eboVir3/multiz160way/eboVir3.multiz160way.maf.gz.
 
Please contact us again at genome <at> soe.ucsc.edu if you have any further questions. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
---
Steve Heitner
UCSC Genome Bioinformatics Group
 

 
From: DENNIS XING [mailto:yxing <at> berkeley.edu]
Sent: Saturday, November 29, 2014 11:25 AM
To: genome <at> soe.ucsc.edu
Subject: [genome] download entire aligned ebola virus sequece
 

 
 
--
Sergei Manakov <siarheimanakov <at> gmail.com>: Nov 28 08:16PM -0800

Hi Ray,
 
You said that you have .bam files and the screen shot that you are showing
looks like bedgraph or bigWig track. To get a similar view of your
libraries you have to generate bedgraph or bigWig files from your BAM files
and then create UCSC tracks from there. Yes, it is possible to create
tracks directly from BAM files, but as you said, they look quite different
(you get a view of individual aligned read, rather than a view of per-base
coverage as in bedgraph/bigWig). You can read about bedgraph and bigWig
following links on this page:
 
http://genome.ucsc.edu/goldenPath/help/customTrack.html
 
I am sure there are many ways to do it, but when I create bigWig files from
BAM files I do the following:
 
1. Out of BAM create BED format files using bamToBed utility
http://bedtools.readthedocs.org/en/latest/content/tools/bamtobed.html
2. Out of BED create bedgraph format files using genomeCoverageBed program
http://bedtools.readthedocs.org/en/latest/content/tools/genomecov.html
3. Out of bedgraph create bigWig format files using bedGraphToBigWig
utility http://genome.ucsc.edu/goldenpath/help/bigWig.html
 
Then you can use the bigWig file to make a track that looks like what you have
in the screenshot.
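
To show what step 2 computes, here is a minimal pure-Python sketch of turning read intervals into bedGraph-style coverage runs. It is not the bedtools implementation and is not meant to replace it for real BAM files; the function name is made up, and the input is assumed to be BED-style (chrom, start, end) tuples with 0-based, half-open coordinates.

```python
from collections import defaultdict


def coverage_bedgraph(intervals):
    """Collapse (chrom, start, end) intervals into bedGraph-style runs.

    Returns a list of (chrom, start, end, depth) tuples, skipping
    stretches with zero coverage.
    """
    # Record +1 where an interval starts and -1 where it ends.
    deltas = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in intervals:
        deltas[chrom][start] += 1
        deltas[chrom][end] -= 1

    runs = []
    for chrom in sorted(deltas):
        depth = 0
        prev = None
        for pos in sorted(deltas[chrom]):
            if prev is not None and depth > 0:
                last = runs[-1] if runs else None
                if last and last[0] == chrom and last[2] == prev and last[3] == depth:
                    # Same depth continues: extend the previous run.
                    runs[-1] = (chrom, last[1], pos, depth)
                else:
                    runs.append((chrom, prev, pos, depth))
            depth += deltas[chrom][pos]
            prev = pos
    return runs


reads = [("chr1", 0, 10), ("chr1", 5, 15)]
for run in coverage_bedgraph(reads):
    print("\t".join(str(x) for x in run))
```

This is the difference Sergei describes: a BAM track draws each read, while the bedGraph/bigWig track draws the summed depth at each position.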
 
Sergei.
 
 
 
--
Sergei (Siarhei Manakou) Manakov
 
California Institute of Technology
MC 147-75
 
land: +1 626 395 3593
 
mobile: + 1 858 729 4531
genome | 30 Nov 18:13 2014

Digest for genome <at> soe.ucsc.edu - 1 update in 1 topic

Robert Kuhn <kuhn <at> soe.ucsc.edu>: Nov 29 04:34PM -0800

Hello, Tingfen,
 
For any of our data tracks, you can read the track Description by clicking
on the link above the pulldown track controls below the Browser graphic.
For the Genome Variants (GV) track, you need to scroll down the resulting
page to read the Description section, where you learn that the GV track
does not confine itself to non-coding variants, but is a track of
individual genomes that have been sequenced, including James Watson,
Craig Venter and others.
 
The track Description page for this track is found here:
 
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&c=chr21&g=pgSnp
 
regards,
 
--b0b kuhn
ucsc genome bioinformatics group
 
On Fri, Nov 28, 2014 at 7:36 AM, Yan, Tingfen (NIH/NIMHD) [F] <
genome | 22 Nov 18:25 2014

Digest for genome <at> soe.ucsc.edu - 1 update in 1 topic

"Steve Heitner" <steve <at> soe.ucsc.edu>: Nov 21 10:10AM -0800

Hello, Tom.
 
This is actually not the first time we’ve received a report like this. A feature request was opened to add this feature, but it has yet to be implemented. I will re-open an internal discussion to see if we can move this up in priority.
 
In the meantime, there is a workaround. This question has also been asked at Biostars (https://www.biostars.org/p/93112/). One of the recommended solutions was to create a diff of your own list of IDs and the list of IDs output by the Table Browser. One possible method is described in detail at http://linux.die.net/man/1/comm.
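
The diff approach can also be sketched in a few lines of Python, as an alternative to comm (which needs both lists sorted). This is a hypothetical helper, not a UCSC tool. It assumes the Table Browser output has been reduced to a list of identifiers of the same kind that was submitted (for example gene symbols); since the Table Browser also matches aliases, an identifier matched through an alias may appear under a different name in the output.

```python
def missing_identifiers(submitted, found):
    """Return submitted identifiers absent from `found`, in input order,
    without duplicates. Comparison is exact and case-sensitive."""
    found_set = {item.strip() for item in found if item.strip()}
    seen = set()
    missing = []
    for raw in submitted:
        ident = raw.strip()
        if ident and ident not in found_set and ident not in seen:
            seen.add(ident)
            missing.append(ident)
    return missing


# Illustrative lists; in practice read one identifier per line from two files.
submitted = ["TP53", "UQCC2", "BRCA1", "UQCC2"]
found = ["BRCA1", "TP53"]
print(missing_identifiers(submitted, found))
```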
 
Please contact us again at genome <at> soe.ucsc.edu if you have any further questions. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
---
Steve Heitner
UCSC Genome Bioinformatics Group
 

 
From: Thomas Cullup [mailto:Thomas.Cullup <at> gosh.nhs.uk]
Sent: Friday, November 21, 2014 6:18 AM
To: 'genome <at> soe.ucsc.edu'
Subject: [genome] List of 'not found' identifiers
 

 
Hi
 

 
I’m hoping you can help; I am trying to submit a list of gene identifiers to the table browser in order that I can pull out exon start and end coordinates. It seems that some of the identifiers are not found:
 

 
* “Note: 8 of the 142 given identifiers (e.g. UQCC2) have no match in table knownGene, field name or in alias table kgAlias, field alias. Try the "describe table schema" button for more information about the table and field.”
 
However, without knowing which 8 in the above query (other than UQCC2) have no match, it is hard for me to correct this. It would be great if all the unmatched identifiers are given, so that I know which I need to check/change.
 

 
Is this possible please?
 

 
Thanks very much for your help
 

 
Tom
 

 
Thomas Cullup
 
Clinical Scientist
 
Regional Molecular Genetics Laboratory
 
Great Ormond Street Hospital for Children NHS Foundation Trust
 
Level 6 York House
 
37 Queen Square
 
London
 
WC1N 3BH
 

 
Tel: 020 7762 6877
 
Fax: 020 7813 8196
 
thomas.cullup <at> gosh.nhs.uk
 

 

 
 
--
genome | 18 Nov 18:20 2014

Digest for genome <at> soe.ucsc.edu - 2 updates in 2 topics

Matthew Speir <mspeir <at> soe.ucsc.edu>: Nov 17 02:56PM -0800

Hi Gleb,
 
Thank you for your question about lifting coordinates from hg38 to hg19.
If you are ever unsure about liftOver results, I recommend using the
Genome Browser to look at the region and the sequence in detail. You can
use the Genome Browser to grab a 20bp region that includes your 2 bases
of interest from hg38. Here's the sequence I used:
 
>hg38_dna range=chr1:450732-450751
TATATCATTTATGAGATCCT
 
If you use BLAT, http://genome.ucsc.edu/cgi-bin/hgBlat?db=hg38, to map
that sequence to hg19 and hg38, you'll notice that there are hits to a
few different positions in the genome, all with 100% identity. It
appears that, because the sequence you're interested in occurs at
several positions, liftOver outputs the chr5 position first. You can
use the "-multiple" option for liftOver to output the other two
positions.
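
As a sketch of preparing input for such a run, the Python snippet below turns the FASTA header shown earlier into a BED line. The range in the header is 1-based and inclusive, while BED is 0-based and half-open, so the start is reduced by one. The helper name, the item name "region1" and the file names in the comment are all made up for illustration; a name column is included on the assumption that it helps tell the multiple output regions apart.

```python
import re

_RANGE = re.compile(r"range=(\w+):(\d+)-(\d+)")


def header_to_bed(header: str, name: str = "region1") -> str:
    """Convert '>hg38_dna range=chr1:450732-450751' to a BED4 line."""
    match = _RANGE.search(header)
    if match is None:
        raise ValueError(f"no range= field in header: {header!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    return f"{chrom}\t{start - 1}\t{end}\t{name}"


print(header_to_bed(">hg38_dna range=chr1:450732-450751"))

# The line could be saved to a file (say region.bed) and lifted with
# something like the following, where the chain file is the hg38-to-hg19
# one from the UCSC downloads server and the other names are placeholders:
#   liftOver -multiple region.bed hg38ToHg19.over.chain.gz lifted.bed unmapped.bed
```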
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 11/14/14, 11:42 AM, GLEB KICHAEV wrote:
Luvina Guruvadoo <luvina <at> soe.ucsc.edu>: Nov 17 12:57PM -0800

Hi Omar,
 
Instead of selecting "BED" as your output format, select "selected fields
from primary and related tables". After clicking "get output", select
"wgEncodeGencodeAttrsV20" under Linked Tables, and click "allow selection
from checked tables". Finally, select any appropriate fields under
hg38.wgEncodeGencodeBasicV20, scroll down and check "transcriptType" under
hg38.wgEncodeGencodeAttrsV20 fields. Click "get output".
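
Conceptually, the linked-table output is a join between the gene table and the attributes table. The Python sketch below shows that join on rows already loaded as dictionaries. It assumes the two tables link through the transcript identifier (the gene table's name field and the attributes table's transcriptId field), and the sample rows are invented placeholders, not real annotations.

```python
def add_transcript_type(basic_rows, attrs_rows):
    """Attach transcriptType from the attributes rows to each gene row."""
    type_by_id = {row["transcriptId"]: row["transcriptType"] for row in attrs_rows}
    # Rows with no matching attributes entry get transcriptType None.
    return [dict(row, transcriptType=type_by_id.get(row["name"])) for row in basic_rows]


# Invented placeholder rows, for illustration only
basic = [
    {"name": "TX_EXAMPLE_1", "chrom": "chr1", "txStart": 100, "txEnd": 900},
    {"name": "TX_EXAMPLE_2", "chrom": "chr2", "txStart": 50, "txEnd": 400},
]
attrs = [
    {"transcriptId": "TX_EXAMPLE_1", "transcriptType": "protein_coding"},
    {"transcriptId": "TX_EXAMPLE_2", "transcriptType": "lincRNA"},
]
for row in add_transcript_type(basic, attrs):
    print(row["name"], row["chrom"], row["transcriptType"])
```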
 
If you have any further questions, please reply to genome <at> soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-www <at> soe.ucsc.edu.
 
- - -
Luvina Guruvadoo
UCSC Genome Bioinformatics Group
 
On Wed, Nov 12, 2014 at 2:17 PM, Omar Shams <oshams <at> physics.rutgers.edu>
wrote:
 
genome | 20 Sep 19:07 2014

Digest for genome <at> soe.ucsc.edu - 7 updates in 4 topics

<M.S.Roost <at> lumc.nl>: Sep 19 05:44PM

Dear Sir or Madam,
 
I'm trying to fetch genomic data from UCSC via R (Biomart), and I've been having problems getting it for a few hours now. Could it be there is some maintenance going on?
 
Thank you very much!
 
Best Regards,
Matthias
Matthew Speir <mspeir <at> soe.ucsc.edu>: Sep 19 04:01PM -0700

Hi Matthias,
 
Thank you for your question about the UCSC Genome Browser. We are not
doing any maintenance on our public MySQL server, nor am I aware of any
issues with this server. Are you still experiencing issues at this time?
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
Matthew Speir <mspeir <at> soe.ucsc.edu>: Sep 19 03:53PM -0700

Hi Pengyao,
 
Thank you for your question about gene track updates for the dm3
assembly. There are two gene tracks on the dm3 assembly that are updated
on a regular basis. The RefSeq Genes track is updated with new
information that we download from RefSeq on a nightly basis. The data
for the RefSeq Genes track was last updated 2014-09-13. The Ensembl
Genes track is updated whenever Ensembl releases a new version of their
gene predictions for the dm3 assembly. The data for the Ensembl Genes
track was last updated 2014-03-11. You can always see when the data in a
track was last updated by looking at a track's description page. You can
get to a track's description page by clicking the track name in the
groups below the main browser display, or by clicking the grey bar next
to the track on the far left of the main browser display. For example,
on the dm3 RefSeq Genes description page,
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=dm3&g=refGene, you can see
the date just below the track settings on the line labeled "Data last
updated".
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 9/18/14, 12:07 PM, Pengyao Jiang wrote:
Matthew Speir <mspeir <at> soe.ucsc.edu>: Sep 19 11:16AM -0700

Hi Nelson,
 
Thank you for your question about updating the D. melanogaster assembly
in the UCSC Genome Browser. A preview of the BDGP Release 6/dm6 Genome
Browser is available on our preview site at
http://genome-preview.ucsc.edu/cgi-bin/hgGateway?db=dm6. However, please
keep in mind that this is our test server, and that much of the data has
not undergone our standard quality review process and is subject to
change. We do have plans to release this to our public website at
http://genome.ucsc.edu/, but I do not have a projected date of when that
might happen.
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 9/18/14, 8:15 PM, Nelson Lau wrote:
Nelson Lau <nlau <at> brandeis.edu>: Sep 19 02:41PM -0400

Thanks, Matthew. Is the Liftover tool able to help us convert the
Dm3/Release5 coordinates to the Dm6/Release6 coordinates? We have a paper
accepted where the work was done on Dm3, but the journal would prefer all
the coordinates be adjusted to Dm6, including BED files. I understand this
build is on the test server, but presumably the new build will be released
soon? Please advise how we can convert all our file coordinates to Dm6.
 
 
 
*From:* Matthew Speir [mailto:mspeir <at> soe.ucsc.edu]
*Sent:* Friday, September 19, 2014 2:16 PM
*To:* Nelson Lau; genome <at> soe.ucsc.edu
*Subject:* Re: [genome] Release 6 Drosophila melanogaster genome?
 
 
 
On 9/18/14, 8:15 PM, Nelson Lau wrote:
 
Hello,
 
 
 
Do you have an ETA when the newest Release 6 of the Drosophila Melanogaster
genome will be loaded into the UCSC Browser? The current latest build is
Dm3, Release 5.
 
 
 
Thanks,
 
 
 
Nelson Lau, Ph.D.
 
Assistant Professor - Biology
 
Brandeis University
 
415 South St, MS029
 
Science Receiving
 
Waltham MA 02454, USA
 
Ph: 781-736-2445
http://www.bio.brandeis.edu/laulab
 
 
 
 
 
--
Matthew Speir <mspeir <at> soe.ucsc.edu>: Sep 19 01:54PM -0700

Hi Nelson,
 
Yes, you can use the liftOver tool just like you normally would to
convert dm3 to dm6 coordinates. You can access the web interface for
liftOver on our preview site at
http://genome-preview.ucsc.edu/cgi-bin/hgLiftOver. To convert the
coordinates from dm3 to dm6 use the following settings:
 
Original Genome: D. melanogaster
Original Assembly: BDGP R5
New Genome: D. melanogaster
New Assembly: BDGP Release 6 + ISO1 MT
 
If you want to use the command line liftOver utility, you can find the
liftOver chain file on our preview downloads server here:
http://hgdownload-test.soe.ucsc.edu/goldenPath/dm3/liftOver/dm3ToDm6.over.chain.gz.
Again, please keep in mind that the data on our preview servers has yet
to undergo our standard quality assurance process and may be subject to
change.
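
For the command-line route, a minimal Python sketch of the invocation might look like the following. It assumes the liftOver binary is installed and on your PATH and that the chain file above has been downloaded to the working directory. The BED file names are hypothetical placeholders. The argument order (input file, chain file, output file, unmapped file) follows liftOver's usage message.

```python
import os
import shutil
import subprocess


def build_liftover_command(bed_in, chain, bed_out, unmapped):
    """Return the argument list for: liftOver oldFile map.chain newFile unMapped."""
    return ["liftOver", bed_in, chain, bed_out, unmapped]


cmd = build_liftover_command(
    "my_regions.dm3.bed",       # hypothetical input file, in dm3 coordinates
    "dm3ToDm6.over.chain.gz",   # chain file downloaded from the preview server
    "my_regions.dm6.bed",       # converted records are written here
    "my_regions.unmapped.bed",  # records that could not be converted
)

# Run the tool only if it is installed and the input exists;
# otherwise just show the command that would be run.
if shutil.which("liftOver") and os.path.exists(cmd[1]):
    subprocess.run(cmd, check=True)
else:
    print(" ".join(cmd))
```

Records that fail to convert end up in the unmapped file, so it is worth checking that file before relying on the converted coordinates.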
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 9/19/14, 11:41 AM, Nelson Lau wrote:
Jonathan Casper <jcasper <at> soe.ucsc.edu>: Sep 19 12:20PM -0700

Hello Nitika,
 
Thank you for your question about finding genes that match your probe sets.
I suggest that you start by building a custom track from your probe sets
and then use the "intersection" part of the UCSC Table Browser (
http://genome.ucsc.edu/cgi-bin/hgTables) to get the genes that overlap with
your probe sets. See
http://genome.ucsc.edu/goldenPath/help/customTrack.html for
more information about building a custom track, and
http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#Intersection for
more information on using intersections in the Table Browser.
 
Depending on what gene information you need, the intersection tool might
not be enough for your research. If that is the case, the tools at Galaxy (
https://usegalaxy.org) can help you to get more specific gene information.
The following mailing list questions describe how to use the Table Browser
together with the tools at Galaxy to find gene information:
https://groups.google.com/a/soe.ucsc.edu/d/topic/genome/LdYxhGwswno/discussion
https://groups.google.com/a/soe.ucsc.edu/d/topic/genome/lF0UpVKYi8I/discussion
https://groups.google.com/a/soe.ucsc.edu/d/topic/genome/y7zQDKO-8Hk/discussion
 
I hope this is helpful. If you have any further questions, please reply to
genome <at> soe.ucsc.edu or genome-mirror <at> soe.ucsc.edu. Questions sent to those
addresses will be archived in publicly-accessible forums for the benefit of
other users. If your question contains sensitive data, you may send it
instead to genome-www <at> soe.ucsc.edu.
 
--
Jonathan Casper
UCSC Genome Bioinformatics Group
 
genome | 13 Sep 19:05 2014

Digest for genome <at> soe.ucsc.edu - 5 updates in 5 topics

Matthew Speir <mspeir <at> soe.ucsc.edu>: Sep 12 03:41PM -0700

Hi Bill,
 
Thank you for your question about getting gene symbols as part of your
output from the Table Browser. You are on the right track with your current
Table Browser settings, and the only issue is your output settings. The
reason you are not getting gene symbols as part of your output is
because these are not stored in the knownGene table for the UCSC Genes
track, but instead stored in a linked table. When you select the output
option "all fields from selected table", you are only getting the
information contained in the knownGene table. I recommend using the
"selected fields from primary and related tables" output option. After
you click "get output", you will be taken to another page where you will
be able to select fields from both the knownGene table and various
linked tables that you want as part of your output. On this page, select
those fields from the "Select Fields from hg19.knownGene" section that
you are interested in. In the "hg19.kgXref fields" section, you will
find a number of alternative IDs for the transcripts in the knownGene
table. Check the box next to "geneSymbol", and any other IDs you are
interested in. Finally, click "get output". Your output will consist of
the fields you selected as columns in order starting from the top of the
"hg19.knownGene" section. While this output option doesn't necessarily
format your output in a terribly useful way, you can use a simple UNIX
command line utility such as awk to rearrange the columns however you want.
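
If awk is not to hand, the same column rearrangement can be sketched in Python. This is only an illustration: it assumes the Table Browser output is tab-separated, and the column layout and the sample rows below are made up rather than real knownGene records.

```python
def reorder_columns(text, order):
    """Rearrange tab-separated columns; `order` lists 0-based column indexes."""
    lines = []
    for line in text.splitlines():
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        lines.append("\t".join(fields[i] for i in order))
    return "\n".join(lines) + "\n"


# Made-up rows in an assumed layout: name, chrom, txStart, txEnd, geneSymbol.
sample = (
    "#name\tchrom\ttxStart\ttxEnd\tgeneSymbol\n"
    "tx1\tchr1\t100\t200\tGENE_A\n"
    "tx2\tchr2\t300\t400\tGENE_B\n"
)

# Put the gene symbol first, then chrom, start and end, with the transcript ID last.
reordered = reorder_columns(sample, [4, 1, 2, 3, 0])
print(reordered, end="")
```

The header line is reordered along with the data rows, so the column labels stay aligned with their values.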
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 9/11/14, 7:54 AM, LaFramboise, William A wrote:
Luvina Guruvadoo <luvina <at> soe.ucsc.edu>: Sep 12 02:45PM -0700

Hello Peter,
 
Our funding mandates that we focus on vertebrate genomes, so we have been
unable to pursue an update to C. elegans. You are welcome to create an
assembly hub, however, as noted in this mailing list question:
https://groups.google.com/a/soe.ucsc.edu/d/topic/genome/eZ_wBLH66I0/discussion.
Assembly hubs are a tool we developed to allow users to display their own
genome assemblies and accompanying annotation in the UCSC Genome Browser.
 
For more information on creating assembly hubs, please review the track hub
help pages at http://genome.ucsc.edu/goldenPath/help/hgTrackHubHelp.html and
the assembly hubs wiki page at
http://genomewiki.ucsc.edu/index.php/Assembly_Hubs.
If you have any further questions, please reply to genome <at> soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-www <at> soe.ucsc.edu.
 
- - -
Luvina Guruvadoo
UCSC Genome Bioinformatics Group
 
 
On Fri, Sep 12, 2014 at 1:07 AM, Peter Frommolt <peter.frommolt <at> uni-koeln.de
Luvina Guruvadoo <luvina <at> soe.ucsc.edu>: Sep 12 12:56PM -0700

Hello,
 
Thanks for your question. I'm assuming you used the Human mRNAs track on
hg19 to download the sequence:
http://genome.ucsc.edu/cgi-bin/hgc?db=hg19&g=mrna&i=AY590150. This track is
generated by aligning GenBank human mRNAs against the genome using BLAT. It
appears PIF (proteolysis inducing factor) has not yet been annotated in the
mouse reference assembly, see http://www.ncbi.nlm.nih.gov/gene/100126829.
 
If you have any further questions, please reply to genome <at> soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-www <at> soe.ucsc.edu.
 
- - -
Luvina Guruvadoo
UCSC Genome Bioinformatics Group
 
 
Brian Lee <brianlee <at> soe.ucsc.edu>: Sep 12 12:16PM -0700

Dear Hai-Bing Xie,
 
Thank you for using the UCSC Genome Browser and your question about the
chromosomal coordinates for Factorbook-identified canonical motifs seen as
green highlighted bars in the clustered transcription factor binding sites
track.
 
The Factorbook motif identifications and localizations were provided by
the Zlab (http://zlab.umassmed.edu/zlab/) at the UMass Medical School and
are available in two tables, the first providing the position of each
factorbook item, factorbookMotifPos, the second providing the position
weight matrix, factorbookMotifPwm.
 
These are located in the general hg19 annotation database section of our
hgdownload server along with a corresponding .sql file:
http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/factorbookMotifPos.txt.gz
http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/factorbookMotifPwm.txt.gz
 
You can access these tables via the Public MySQL server:
http://genome.ucsc.edu/goldenPath/help/mysql.html
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e 'show tables like "factorbook%";' hg19
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e 'select * from factorbookMotifPos;' hg19
 
There are two additional tables, factorbookMotifCanonical and
factorbookGeneAlias, that help map the information from the Zlab to the
target terms used in the UCSC Genome Browser.
 
You can alternatively use the hg19 Table Browser to access these tables:
http://genome.ucsc.edu/cgi-bin/hgTables
1. Set the "group:" to "All tables"
2. Set the table to "factorbookMotifPos"
3. Click "genome" to get the entire table, or click the "define regions"
button and enter coordinates of interest, such as "chrX 14000000
150000000".
4. Click "get output". If desired, you could set "output format" to "custom
track" and see the results in the browser.
 
What is displayed in the wgEncodeRegTfbsClustered track is the result of a
computational mapping of the factorbookMotifPos items to the clustered TFBS
locations filtered for the highest score per cluster. There is not an easy
path to obtain these exact mappings, but you can perform similar operations
with the Table Browser.
 
For example if you were looking at the region around SOD1,
chr21:33,031,597-33,041,570, you could enter this as the defined region in
the Table Browser (step 3).
4. Click the "create" button next to "filter".
5. Set "score" to ">" and enter a desired threshold, such as "2", then click
"submit".
6. Click the "create" button next to "intersection".
7. Select "group: Regulation" and "track: Txn Factor ChIP" and "table:
wgEncodeRegTfbsClusteredV3" then click "submit".
8. Click "get output". If desired, you could set "output format" to "custom
track" and see the results in the browser.
 
Thank you again for your inquiry and using the UCSC Genome Browser. If you
have any further questions, please reply to genome <at> soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-www <at> soe.ucsc.edu.
 
All the best,
 
Brian Lee
UCSC Genome Bioinformatics Group
 
 
Matthew Speir <mspeir <at> soe.ucsc.edu>: Sep 12 10:35AM -0700

Hi Turgay,
 
Thank you for your questions about the UCSC Genome Browser. You are
correct about the coordinate systems used in our tables, and
subsequently in the files you downloaded. In our tables, we use 0-based
start coordinates and 1-based end coordinates. You can read more about this on
the following pages:
 
* http://genome.ucsc.edu/FAQ/FAQtracks#tracks1
* http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
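
As a rough illustration of that convention, the Python sketch below converts a table-style interval (0-based start, 1-based end) into the 1-based position string shown in the Browser. The function names are made up for this example.

```python
def table_to_position(chrom, start, end):
    """Convert a 0-based-start, 1-based-end interval to a 1-based position string."""
    return f"{chrom}:{start + 1}-{end}"


def interval_length(start, end):
    """With a 0-based start and 1-based end, the length is simply end - start."""
    return end - start


# The first 100 bases of a chromosome are stored as start=0, end=100.
print(table_to_position("chr1", 0, 100))  # chr1:1-100
print(interval_length(0, 100))            # 100
```

Only the start changes between the two systems; the end coordinate is the same number in both.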
 
 
By default, the Genome Browser displays the sequence of the plus strand.
The NBPF1 gene, however, is on the minus strand. When you were looking
at this position in the Browser, it is likely that you were looking at
the sequence for this codon on the plus strand, which is "AGA". If you
were to view this on the minus strand, you would see that the sequence
for this codon is "UCU", which does in fact code for serine. Please see
this answer to a previous mailing list question for a great explanation
of our strand display for genes and how to view the minus strand in the
Browser:
https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/_EjI7ddU_PY/O9UB7DwBvc8J.
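
The strand relationship described above can be checked with a few lines of Python. This is a small sketch with made-up helper names: it reverse-complements the plus-strand bases and then writes the result as RNA.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def to_rna(dna):
    """Write a DNA sequence as RNA by replacing T with U."""
    return dna.replace("T", "U").replace("t", "u")


plus_codon = "AGA"                             # as displayed on the plus strand
minus_codon = reverse_complement(plus_codon)
print(minus_codon)                             # TCT
print(to_rna(minus_codon))                     # UCU
```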
 
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 9/11/14, 8:40 AM, Turgay Aytac wrote:
genome | 6 Sep 19:15 2014

Digest for genome <at> soe.ucsc.edu - 2 updates in 2 topics

Jonathan Casper <jcasper <at> soe.ucsc.edu>: Sep 05 11:11AM -0700

Hello Alireza,
 
You are absolutely right. When looking at a chain file, the coordinates for
things on the - strand are changed around. To find the item on the +
strand, you will need to use the coordinates (tSize-tEnd, tSize-tStart].
When using a GTF file, you do not need to change the coordinates for items
on the - strand.
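
The coordinate flip can be sketched in Python as follows. This is a minimal illustration that assumes the interval is given as a 0-based start and an exclusive end, and that `size` is the full length of the sequence. The function name and the numbers are hypothetical.

```python
def flip_strand(size, start, end):
    """Map an interval (0-based start, exclusive end) onto the opposite strand."""
    if not 0 <= start <= end <= size:
        raise ValueError("interval must lie within the sequence")
    return size - end, size - start


# Hypothetical numbers: a 1,000-base sequence with a minus-strand item at 100..250.
print(flip_strand(1000, 100, 250))  # (750, 900)
```

Applying the function twice returns the original interval, which is a quick sanity check when converting between formats.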
 
It is unfortunate that differences like this exist, which can be confusing
for people who need to work with many file formats. That is part of why we
try to provide tools for working with these file formats and converting
between them, so that these differences are hidden away.
 
A word of warning: you may also need to work with the PSL data format (
http://genome.ucsc.edu/FAQ/FAQformat.html#format2) at some point. PSL has
its own way of describing items on the - strand. The start and stop
position for each alignment use + strand coordinates, but the list of start
positions of the blocks within each alignment uses - strand coordinates.
This can be confusing.
 
I hope this is helpful. If you have any further questions, please reply to
genome <at> soe.ucsc.edu or genome-mirror <at> soe.ucsc.edu. Questions sent to those
addresses will be archived in publicly-accessible forums for the benefit of
other users. If your question contains sensitive data, you may send it
instead to genome-www <at> soe.ucsc.edu.
 
--
Jonathan Casper
UCSC Genome Bioinformatics Group
 
 
On Fri, Sep 5, 2014 at 7:33 AM, Alireza Fotuhi Siahpirani <
Matthew Speir <mspeir <at> soe.ucsc.edu>: Sep 05 11:10AM -0700

Hello Fei,
 
Thank you for your question about visualizing Multiz alignments. We
recommend using the UCSC Genome Browser to visualize your alignments. If
you have your alignments in MAF format, and we host the assembly used
for the alignment, you can upload these alignments as a custom track.
For more information on the MAF format, refer to the following help
page: http://genome.ucsc.edu/FAQ/FAQformat.html#format5. More
information on uploading custom tracks can be found here:
http://genome.ucsc.edu/goldenPath/help/customTrack.html.
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 9/4/14, 4:21 PM, Liu, Fei wrote:
genome | 28 Aug 19:13 2014

Digest for genome <at> soe.ucsc.edu - 6 updates in 5 topics

"Simonson, Tatum" <tsimonson <at> mail.ucsd.edu>: Aug 28 10:05AM

Thank you for your quick response, Steve.
 
We will reference the links and the manuscript associated with these data.
 
Sincerely,
Tatum
________________________________________
From: Steve Heitner [steve <at> soe.ucsc.edu]
Sent: Monday, August 25, 2014 11:06 AM
To: Simonson, Tatum; genome <at> soe.ucsc.edu
Cc: joseptarrago11 <at> gmail.com
Subject: RE: [genome] UCSC browser display of expression data
 
Hello, Tatum.
 
I would first like to point out that the link you provided is from our
official European mirror, http://genome-euro.ucsc.edu. I'm not certain if
you're intentionally using that mirror, but if not, our US site
(http://genome.ucsc.edu) would certainly be faster for you.
 
On the results page you mentioned, the data in the "Microarray Expression
Data" section comes from our GNF Atlas 2 track,
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=gnfAtlas2. You can see
the equivalent GNF Atlas 2 results page for uc010aux.3 at
http://genome.ucsc.edu/cgi-bin/hgc?c=chr14&o=94844709&t=94849578&g=gnfAtlas2&i=202833_s_at&i2=202833_s_at.
I believe the description page and the color
key will help to answer your questions. On the UCSC Genes results page, the
"Absolute" data is just another representation of the "Ratios" data using a
different color scheme.
 
Please contact us again at genome <at> soe.ucsc.edu if you have any further
questions. Questions sent to that address will be archived in a
publicly-accessible forum for the benefit of other users. If your question
contains sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
---
Steve Heitner
UCSC Genome Bioinformatics Group
 
-----Original Message-----
From: Simonson, Tatum [mailto:tsimonson <at> mail.ucsd.edu]
Sent: Monday, August 25, 2014 9:07 AM
To: genome <at> soe.ucsc.edu
Cc: joseptarrago11 <at> gmail.com
Subject: [genome] UCSC browser display of expression data
 
Hi there,
 
I am working with students on the UCSC Genome Browser and have a few
questions regarding microarray data presented on the site (please see link
provided below for our gene of interest).
 
https://genome-euro.ucsc.edu/cgi-bin/hgGene?hgg_gene=uc010aux.3&hgg_prot=P01009&hgg_chrom=chr14&hgg_start=94843083&hgg_end=94857029&hgg_type=knownGene&db=hg19&hgsid=198846835_U1gmMFViIJiN78El0AmBYxlY04FQ
 
Is there a thorough description provided on the site or elsewhere as to how
the absolute and ratio data are determined and displayed other than "high"
and "low"?
What does the white to dark grey gradient indicate for absolute data (i.e.,
is this standardized for each tissue and compared relative to other
tissues)?
What criteria are used to determine high and low values (what is the
baseline used for comparison and again are these tissue-specific), and what
does a black box indicate (baseline or no data)?
 
Thank you in advance for your assistance!
 
Sincerely,
Tatum Simonson
 
--
Ya Hu <teresayahu <at> gmail.com>: Aug 28 03:50PM +0800

Dear sir or madam,
 
I am trying to check whether a mutation at a given site in the human genome
is synonymous or non-synonymous (given the position and the alleles before
and after the mutation). Which files on the UCSC download site should I
refer to, please?
 
Many thanks,
 
Ya Hu
Matthew Speir <mspeir <at> soe.ucsc.edu>: Aug 27 01:13PM -0700

Hi Fahad,
 
Thank you for your question about the LiftOver utility. Could you share
a few lines of the BED file you've been using as input for standalone
LiftOver? If you do not wish to share this information with the public
list, you can send them to me directly. This difference in output
coordinates is likely due to different input formats. The standalone
LiftOver utility takes bed format as input. The web LiftOver utility can
take both position and bed formats as input. Putting these two different
formats into the web LiftOver utility will give you different results.
This is because the position format is one-based and the bed format is
zero-based, and each one is interpreted by the Genome Browser
differently. You can read more about the differences between the two and
how they are interpreted by the Genome Browser here:
http://genome.ucsc.edu/FAQ/FAQtracks#tracks1.
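
To make the difference concrete, here is a small Python sketch that turns a one-based position string into the equivalent zero-based BED fields. The function name and the example position are hypothetical; the conversion itself is just "subtract one from the start".

```python
def position_to_bed(position):
    """Convert 'chrom:start-end' (one-based) to BED fields (zero-based start)."""
    chrom, span = position.rsplit(":", 1)
    start_text, end_text = span.replace(",", "").split("-")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"not a valid one-based position: {position}")
    return chrom, start - 1, end


# Hypothetical region: bases 1,001 through 2,000 of chr1.
print(position_to_bed("chr1:1,001-2,000"))  # ('chr1', 1000, 2000)
```

Feeding the same numbers to a tool as a position and as a BED line therefore describes regions that differ by one base at the start, which matches the kind of discrepancy described above.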
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 8/21/14, 7:00 AM, Fahad Syed wrote:
Jeremy Johnson <jjohnson <at> broadinstitute.org>: Aug 28 09:13AM -0400

Hi Fahad,
 
A few things:
 
1) We're not responsible for maintaining UCSC's tools. I suggest you
bring this up with them (genome <at> soe.ucsc.edu), but off-by-one problems are
quite common when dealing with genomic data (computer science people like
to start numbering at 0, biologists at 1).
 
2) We've already done this liftover, and have the data hosted on the
CanFam3.1 genome on UCSC via a Track Hub. If you go to UCSC, load the
CanFam3.1 assembly, then hit the "Track Hubs" button and load the "Broad
Improved Canine Annotation v1
<https://www.broadinstitute.org/ftp/pub/vgb/dog/trackHub/hub.txt>" Track
Hub, there is a track there listed as "Survey SNPs." These contain the
SNPs from CanFam2, as well as some additional SNPs we generated when we
designed the dog SNP array.
 
3) You can check the results yourself between the discrepant tools, as you
have the truth set of genotypes from CanFam2. Whichever result matches
those genotypes (i.e. returns the same bases from CanFam3.1) is the correct
tool.
 
Hope this helps, and good luck!
 
~Jeremy
 
 
Jeremy Johnson
Project Coordinator
Vertebrate Biology Group
Broad Institute
 
Tel: (617) 714-7935
Brian Lee <brianlee <at> soe.ucsc.edu>: Aug 27 10:46AM -0700

Dear Gert,
 
Thank you for using the UCSC Genome Browser and your message about the
dragAndDrop documentation.
 
The notation in the documentation should read "dragAndDrop subTracks" and
is being updated, thank you for bringing this to our attention!
 
Thank you again for using and helping to improve the UCSC Genome Browser.
If you have any further questions, please reply to genome <at> soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-www <at> soe.ucsc.edu.
 
All the best,
 
Brian Lee
UCSC Genome Bioinformatics Group
 
 
On Wed, Aug 27, 2014 at 8:40 AM, Gert Hulselmans <hulselmansgert <at> gmail.com>
wrote:
 
robert kuhn <kuhn <at> soe.ucsc.edu>: Aug 27 10:36AM -0700

Hello, Emma,
 
> I do think this is inconvenient... but I assume there are technical
> reasons behind it.
 
We agree that this situation is inconvenient, but thanks for understanding.
We built genome-euro to make genome browsing faster over there, and
carefully considered the tradeoffs between allowing the two machines to
diverge at the Session-storage function vs keeping them in sync.
 
Because we believed that most researchers would use one or the other
machine and not very often find themselves on the other machine, the cost
of the very large storage and transmission to keep them in sync was not
justified. Our hosts at the University of Bielefeld are generous with their
support of the European mirror, and we chose to not stress their bandwidth
with the constant syncing of data between the two sites.
 
We were about to contact you again to suggest that you sync up your
sessions in one place using save/reload, but are happy to see that you
found that solution on your own.
 
best regards, and thanks for using the Genome Browser.
 
--b0b kuhn
ucsc genome bioinformatics group
 
On 8/27/2014 12:17 AM, Emma Ivansson wrote:
genome | 23 Aug 19:06 2014

Digest for genome <at> soe.ucsc.edu - 5 updates in 4 topics

Matthew Speir <mspeir <at> soe.ucsc.edu>: Aug 22 04:54PM -0700

Hi Anton,
 
We have been unable to replicate this issue on our end. Is it possible
that you are using a mirror site? Please confirm that you are using our
main site at http://genome.ucsc.edu/, or our European site at
http://genome-euro.ucsc.edu/. If you are using one of these two sites,
could you please provide us with your browser and operating system
information? Also, could you provide us with a detailed set of
instructions so that we can try to replicate this issue ourselves?
 
If you have any further questions, please reply to genome <at> soe.ucsc.edu.
All messages sent to that address are archived on a publicly-accessible
Google Groups forum. If your question includes sensitive data, you may
send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 8/21/14, 9:45 PM, Anton Kratz wrote:
Steve Fischer <sfischer <at> upenn.edu>: Aug 22 06:41PM -0400

Hi,
 
We are evaluating a number of genome browser systems, such as UCSC
Browser, JBrowse and GenomeViewer for use as the Genome Browser on the
EuPathDB websites. These sites are databases that hold genome
information for parasites.
 
As part of our evaluation we like to look at sites, besides the main
UCSC Browser and mirrors, that use this software. I am having trouble
finding examples. Can you refer me to any?
 
Thanks,
Steve
 
--
~~~ *><>* ~~~
Robert Kuhn <kuhn <at> soe.ucsc.edu>: Aug 22 04:35PM -0700

Hi, Steve,
 
You probably want to post this to the genome-mirror mailing list.
Then people who are mirroring the Browser are likely to see it and
can contact you with their URLs.
 
Or you could look in the genome-mirror list archives:
 
https://groups.google.com/a/soe.ucsc.edu/forum/#!forum/genome-mirror
 
Alternatively, you should consider using the Assembly Hubs mechanism,
which allows users to put their genomes on our public instance of the
Browser without downloading or installing the code locally. In that case,
you simply host the relevant files to make your own assembly.
 
The following link:
 
http://genome.ucsc.edu/cgi-bin/hgHubConnect
 
goes to a page with a further link to the User's Guide as well as offering
several remotely-hosted non-UCSC genome assemblies, such as Arabidopsis
and Drosophila simulans, for you to look at.
 
regards,
 
--b0b kuhn
ucsc genome bioinformatics group
 
 
 
Matthew Speir <mspeir <at> soe.ucsc.edu>: Aug 22 04:08PM -0700

Hi Alireza,
 
Thank you for your question about using the mm9 and hg19 pairwise
alignment files. Please see this previously answered mailing list
question for instructions on how to extract the alignments for your
regions of interest here:
https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/tOEkoCYem3c/3FMCjaxVQGUJ.
While that answer is dealing with danRer7 and hg19, you can easily
replace any reference to a danRer7 file with an equivalent file in mm9.
You can find the mm9 2bit file here:
http://hgdownload.soe.ucsc.edu/goldenPath/mm9/bigZips/mm9.2bit, and the
hg19/mm9 pairwise alignment file here:
http://hgdownload.soe.ucsc.edu/goldenPath/hg19/vsMm9/hg19.mm9.all.chain.gz.
You can find the twoBitInfo, chainToAxt, axtToMaf, and mafsInRegion
utilities referenced in those instructions under the directory for your
system here: http://hgdownload.soe.ucsc.edu/admin/exe/.
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 8/21/14, 4:45 PM, Alireza Fotuhi Siahpirani wrote:
Matthew Speir <mspeir <at> soe.ucsc.edu>: Aug 22 01:39PM -0700

Hi Venkat,
 
Thank you for your question about converting Illumina exm IDs. I'm not
familiar with exm IDs, and I couldn't find any information on them after
some web searches. I would suggest you contact Illumina, the creators of
the OmniExpressExome chip, for an explanation of what these exm IDs
represent. You can find contact information for their customer support
here: http://www.illumina.com/company/contact-us.ilmn. If you can
provide me with a few sample lines from your file that contains these
IDs and their positions, I may be able to suggest a technique for using
the Genome Browser to convert them. If this file contains private
information that you do not wish to share with the public list, you may
send the file to me directly.
 
I hope this is helpful. If you have any further questions, please reply
to genome <at> soe.ucsc.edu. All messages sent to that address are archived
on a publicly-accessible Google Groups forum. If your question includes
sensitive data, you may send it instead to genome-www <at> soe.ucsc.edu.
 
Matthew Speir
UCSC Genome Bioinformatics Group
 
 
On 8/21/14, 9:49 AM, Venkat Addala wrote: