genome | 11 May 2013 20:32

Digest for genome <at> soe.ucsc.edu - 1 Message in 1 Topic

Group: http://groups.google.com/a/soe.ucsc.edu/group/genome/topics

    Jackie Jia Zhou <dr.jzhou <at> gmail.com> May 10 01:36PM -0700  

    Hi,
     
    I downloaded the phastCons scores for rat with 8 other vertebrates from this
    link: http://hgdownload.soe.ucsc.edu/goldenPath/rn4/phastCons9way/
    I have a couple of questions regarding the files/scores:
    1) There are many genomic regions (the genomic coordinates in these regions
    are not in the downloaded files) that do not have a phastCons score in the
    file. Why is that? Does that mean these regions are not conserved at all?
    Can I treat these regions as having a score=0 when handling the file?
    2) Is there any threshold value for the conservation score based on which we
    can define a particular block of a gene as conserved?
     
    Thank you,
     
    -Jackie

     

You received this message because you are subscribed to the Google Group genome.
You can post via email.
To unsubscribe from this group, send an empty message.
For more options, visit this group.

--
 
 
 
genome | 10 May 2013 20:21

Digest for genome <at> soe.ucsc.edu - 6 Messages in 5 Topics


    Brian Lee <brianlee <at> soe.ucsc.edu> May 10 09:27AM -0700  

    Dear David,
     
    Thank you for using the UCSC Genome Browser and your appreciation of our
    support. It is not readily clear what you may be attempting for your
    research without examples.
     
    Perhaps you may be submitting through the table browser a selection of
    defined regions, and then perhaps clicking the "summary/statistics" button
    against a DNase track to get desired output like "Item bases 2,128,151
    (3.95%)"? There may be some technical solutions, depending on an
    understanding of what you are attempting, however, they may be more
    challenging to pursue for someone less familiar with computers as you
    mentioned.
     
    You might be interested in the DNase Clusters track,
    http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeRegDnaseClusteredV2.
    This track shows DNase hypersensitive areas assayed in a large collection
    of cell types, aggregated and uniformly processed. In the table browser you
    can select the "Regulation" group, "DNase Clusters" track, and the
    "wgEncodeRegDnaseClusteredV2" table, and then click the "describe table
    schema" button to learn more about this data in BED format.
     
    If you created a custom track of your regions, one approach you can take is
    to do an intersection of your data with the tables you may be interested
    in, such as the wgEncodeRegDnaseClusteredV2 table. However, performing an
    intersection with a custom track with 2.5 million regions using the Table
    Browser will likely time out. To learn more about creating a custom track
    please review the User Guide:
    http://genome.ucsc.edu/goldenPath/help/customTrack.html. I also recommend
    watching this Table Browser Custom Track tutorial:
    http://www.openhelix.com/cgi/tutorialInfo.cgi?id=28. Also, read about
    performing intersections with the Table Browser here:
    http://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#Intersection
     
    A more technical solution would be to download some tools available from our
    utilities directory, http://hgdownload.cse.ucsc.edu/admin/exe/,
    specifically bigWigAverageOverBed. This program computes the average
    bigWig signal over each item in a BED file. Depending on what you are
    attempting to do, you could perhaps also perform MySQL queries on the
    tables of interest. Please read more here:
    http://genome.ucsc.edu/goldenPath/help/mysql.html
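
As a rough illustration of what an intersection like the one described above computes, here is a minimal Python sketch. It is not how the Table Browser or bigWigAverageOverBed is implemented. It assumes both inputs are (chrom, start, end) tuples in BED-style 0-based, half-open coordinates and that items within each input do not overlap one another; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def covered_bases(regions, track_items):
    """Return (total bases in regions, region bases overlapped by track items).

    Both inputs are iterables of (chrom, start, end) in BED-style
    0-based, half-open coordinates. Assumes that items within each
    input do not overlap one another, so no base is counted twice.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in track_items:
        by_chrom[chrom].append((start, end))

    total = 0
    covered = 0
    for chrom, start, end in regions:
        total += end - start
        for item_start, item_end in by_chrom[chrom]:
            # Length of the overlap between two half-open intervals.
            overlap = min(end, item_end) - max(start, item_start)
            if overlap > 0:
                covered += overlap
    return total, covered


# Hypothetical example data, not real DNase clusters.
regions = [("chr1", 100, 200), ("chr2", 0, 50)]
dnase = [("chr1", 150, 250), ("chr1", 90, 110), ("chr2", 60, 70)]

total, covered = covered_bases(regions, dnase)
print(f"Item bases {covered:,} ({100 * covered / total:.2f}%)")
```

This brute-force loop compares every region with every track item on the same chromosome, so for millions of regions a sorted or indexed approach would be needed instead.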
     
    Without a clear picture of what you are attempting it is hard to direct you
    on the best path. Thank you again for your inquiry and using the UCSC
    Genome Browser. If you have further questions, please feel free to contact
    the mailing list again at genome <at> soe.ucsc.edu.
     
    All the best,
     
    Brian Lee
    UCSC Genome Bioinformatics Group
     
     
    On Tue, May 7, 2013 at 12:09 PM, Rosenbaum, David J - (djr1) <

     

    Pauline Fujita <pauline <at> soe.ucsc.edu> May 09 04:04PM -0700  

    Hi Yi,
     
    In the case of NM_001271872 the problem is that the human reference
    hg19 has a deletion that includes the first exon of NM_001271872. This
    has been repaired in a patch which you can see by turning on the "GRC
    patch" and "GRC incident" tracks.
     
    For the rest of the discrepancies you're seeing, they are likely to be
    caused by a difference between the hg19 reference sequence and the
    sequence of whoever contributed the mRNA to RefSeq (i.e. RefSeq
    mRNAs come from many different individuals whose genome sequence
    doesn't necessarily match the reference sequence, either because there
    is an error in the assembly of the reference sequence, or because of
    polymorphism in the population).
     
    Best regards,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     

     

    grant hartzog <hartzog <at> ucsc.edu> May 09 03:03PM -0700  

    I am trying to install the kentsourcetools on my Mac (x86_64, OS 10.8.3) and am failing. It appears to be a problem with zlib, which is on my machine. Any clues as to what might be going wrong here?
     
    I exported the following two variables
     
    MYSQLLIBS="/usr/local/mysql/lib/libmysqlclient.a -lz"
    MYSQLINC=/usr/local/mysql/include/mysql
     
    and then did this:
     
    dhcp-151-90:~ grant$ cd kent/src
    dhcp-151-90:src grant$ make
    cd lib && make
    make[1]: `x86_64/jkweb.a' is up to date.
    cd jkOwnLib && make
    make[1]: `../lib/x86_64/jkOwnLib.a' is up to date.
    cd hg/lib && make
    make[1]: `../../lib/x86_64/jkhgap.a' is up to date.
    mkdir -p /Users/grant/bin/scripts
    mkdir -p /Users/grant/bin/x86_64
    ameme
    gcc -O -g -o /Users/grant/bin/x86_64/ameme ameme.o fragFind.o ../lib/x86_64/jkweb.a -pthread -lssl -lcrypto -lpng -lm -lm
    true /Users/grant/bin/x86_64/ameme
     
    [snip]
     
    gcc -O -g -o /Users/grant/bin/x86_64/bigBedInfo bigBedInfo.o ../../lib/x86_64/jkweb.a -pthread -lssl -lcrypto -lpng -lm -lz -lm
    true /Users/grant/bin/x86_64/bigBedInfo
    bigBedNamedItems
    gcc -O -g -o /Users/grant/bin/x86_64/bigBedNamedItems bigBedNamedItems.o ../../lib/x86_64/jkweb.a -pthread -lssl -lcrypto -lpng -lm -lm
    Undefined symbols for architecture x86_64:
    "_compress", referenced from:
    _zCompress in jkweb.a(zlibFace.o)
    _zSelfTest in jkweb.a(zlibFace.o)
    "_uncompress", referenced from:
    _zUncompress in jkweb.a(zlibFace.o)
    _zSelfTest in jkweb.a(zlibFace.o)
    ld: symbol(s) not found for architecture x86_64
    collect2: ld returned 1 exit status
    make[2]: *** [all] Error 1
    make[1]: *** [all] Error 2
    make: *** [utils] Error 2

     

    Hiram Clawson <hiram <at> soe.ucsc.edu> May 09 04:00PM -0700  

    Good Afternoon Grant:
     
    Please correct the 'makefile' in the bigBedNamedItems
    directory to read:
     
    kentSrc = ../..
    A = bigBedNamedItems
    include $(kentSrc)/inc/userApp.mk
    L += -lm -lz
     
    Your 'makefile' is probably missing the -lz on the L += line.
    It was updated after the most recent source code release.
     
    --Hiram
     
    On 5/9/13 3:03 PM, grant hartzog wrote:

     

    Pauline Fujita <pauline <at> soe.ucsc.edu> May 09 12:42PM -0700  

    Hello Hongxin,
     
    Unfortunately this is beyond the scope of this mailing list - which
    is to address specific questions about the UCSC Genome Browser
    software, database, genome assemblies and release cycles.
     
    You might want to consult the "support" section of the GMOD site:
     
    http://gmod.org/wiki/Main_Page
     
     
    Best of luck with your research,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     
    On Thu, May 9, 2013 at 8:04 AM, Zhang, Hongxin (MU-Student)

     

    Pauline Fujita <pauline <at> soe.ucsc.edu> May 09 12:29PM -0700  

    Hello Ruti,
     
    You can use our liftover tool to convert coordinates between
    assemblies. You will find the tool here:
     
    https://genome.ucsc.edu/cgi-bin/hgLiftOver
     
    (also accessible by clicking the "utilities" menu on the left,
    followed by the "batch coordinate conversion" link)
     
    Note that your coordinates will need to be in BED or chrN:start-end
    (e.g. chr18:43484054-72498703) format. For more info on BED format
    see this help doc: https://genome.ucsc.edu/FAQ/FAQformat.html#format1
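
For readers who need to switch between the two input formats mentioned above, the following is a minimal Python sketch. It assumes the usual UCSC convention that chrN:start-end positions are 1-based and inclusive while BED uses a 0-based start and an exclusive end; the function name is hypothetical.

```python
import re


def position_to_bed(position):
    """Convert 'chrN:start-end' (1-based, inclusive) to a tab-separated BED line.

    BED uses a 0-based start and an exclusive end, so only the start
    shifts by one. Thousands separators (commas) are accepted.
    """
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", position.strip())
    if match is None:
        raise ValueError(f"not a chrN:start-end position: {position!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    return f"{chrom}\t{start - 1}\t{end}"


print(position_to_bed("chr18:43,484,054-72,498,703"))
```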
     
    If you are looking at a coordinate within a browser (e.g. you're
    looking at chr18:43,484,054-72,498,703 in the hg18 browser) you can
    also jump to the corresponding location in another browser (e.g. hg19)
    by clicking on the "View" menu in the top bar, and selecting "In Other
    Genomes (Convert)".
     
    If you have further questions please feel free to contact the mailing
    list again at genome <at> soe.ucsc.edu.
     
    Best regards,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     

     


--
 
 
 
genome | 4 May 2013 19:05

Digest for genome <at> soe.ucsc.edu - 1 Message in 1 Topic


    Pauline Fujita <pauline <at> soe.ucsc.edu> May 03 04:52PM -0700  

    Hello Paul,
     
    You might find this tutorial on the Genome Browser to be a useful
    starting point:
     
    http://www.openhelix.com/cgi/tutorialInfo.cgi?id=27
     
    In response to your specific questions:
     
    1) The first position of a chromosome in the Genome Browser is 1. To
    see it, navigate to the beginning of a chromosome (chr1:1-20, for
    instance) and look at how the bases are numbered. Note that in our
    tables, though, the first base is numbered 0. See this page to read
    about why: http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms.
     
    2) Ns take up space, just as A, C, G, and T do.
     
    3) Masking does not affect positioning.
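
The three points above can be illustrated with a small Python sketch. The toy sequence and the function name are hypothetical; the sketch only assumes that a chromosome is held as one string in which gap Ns and soft-masked (lowercase) bases are ordinary characters.

```python
def browser_slice(sequence, start, end):
    """Return bases start..end, numbered as in the Browser (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range falls outside the sequence")
    # Python slices are 0-based and half-open, like the table coordinates.
    return sequence[start - 1:end]


# Hypothetical toy chromosome: a 5-base gap of Ns, a soft-masked
# (lowercase) stretch, then unmasked sequence.
toy_chrom = "NNNNNacgtacgtACGTTGCA"

print(browser_slice(toy_chrom, 1, 5))          # the Ns occupy positions 1-5
print(browser_slice(toy_chrom, 6, 9))          # masked bases keep their positions
print(browser_slice(toy_chrom.upper(), 6, 9))  # unmasking changes case only
```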
     
    Hopefully this is enough to get you started. If you have further
    questions please feel free to contact the mailing list again at
    genome <at> soe.ucsc.edu.
     
    Best regards,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     

     


--
 
 
 
genome | 3 May 2013 20:52

Digest for genome <at> soe.ucsc.edu - 5 Messages in 3 Topics


    Michael Krahn <krahnmichael <at> hotmail.com> May 03 03:32PM +0200  

    Hi!
    I would like to search for a specific BAC for my gene of interest. I remember that this was possible in the genome viewer, but I cannot find out how.
    Many thanks for your help !
    Michael

     

    Brooke Rhead <rhead <at> soe.ucsc.edu> May 02 12:15PM -0700  

    Hi Brad,
     
    I spoke to our developers, and there aren't any straightforward ways to
    find the custom track number. There might be an easier way to
    accomplish your goal, though. You could potentially use a track hub
    instead, which would allow you to control the contents of the hub and
    make changes to it without interacting with the Genome Browser at all
    (see http://genome.ucsc.edu/goldenPath/help/hgTrackHubHelp.html for more
    info). Alternatively, perhaps you could use a saved session
    (http://genome.ucsc.edu/goldenPath/help/hgSessionHelp.html); for
    instance, we have an FAQ that describes how to link to specific tracks
    in the Genome Browser using a saved session with all tracks hidden:
    http://genome.ucsc.edu/FAQ/FAQlink.html#link4.
     
    What is your usage scenario? Maybe we can suggest some alternatives.
     
    --
    Brooke Rhead
    UCSC Genome Bioinformatics Group
     
     
    On 5/1/13 3:57 PM, Brad Davis wrote:

     

    Brad Davis <bdavis <at> bcgsc.ca> May 02 12:28PM -0700  

    Hello Brooke,
     
    We are building an epigenome portal to allow users to view the reference
    epigenomes being created at the BC Cancer Agency. It is similar in function
    to the Epigenome Roadmap Atlas. Currently, when a user selects a set of
    tracks they want to visualize, the track information is written to a hub
    file at an internet-accessible address, and this hub file (e.g.
    http://edccwashu.bcgsc.ca/hubs/hub.ucsc.Thu-May--2-04-27-39-2013.wmS5iRPtf0)
    is then transmitted as part of the HTTP request to the UCSC browser. The
    issue is that if a user selects a track, decides to visualize it, and then
    comes back to the table and unselects it, the track continues to be
    displayed in the genome browser.
     
    I understand that the user can remove the track from their display using
    the manage tracks feature of the UCSC browser, but I was hoping to make our
    table a little bit more transparent for users. The manage tracks tool is
    great, but I am aiming to make as few assumptions about the user's
    understanding of genome browsers as possible. Admittedly, any experienced
    user will be familiar with them, but I would like to not have to make that
    assumption. I know that there is a feature which allows all of the tracks
    to be removed as a group, but that goes too far. I can imagine a scenario
    where a user has loaded their own tracks manually (or via some other
    genomics data center) and then comes to ours to load tracks. If we
    automatically cleared the track history for them every time they selected
    a track, it would undo any previous work they had done, so the remove all
    tracks feature really isn't an option. The TrackHub suggestion might work.
    I'll have to talk with our guys, but it seems like a promising solution.
    Ideally, what we want is something that we can integrate into our current
    system.
     
    Thanks,
    Brad Davis
     

     

    Brooke Rhead <rhead <at> soe.ucsc.edu> May 02 05:42PM -0700  

    Hi Brad,
     
    We don't currently have a mechanism in track hubs or in custom tracks to
    control visibilities of data hubs, or a way to disconnect from a
    particular hub via URL, so on second thought, track hubs might not be
    the best solution for now.
     
    Instead, you may be able to get to the custom track table names with the
    numbers appended to them by constructing a URL to the Table Browser and
    then doing some screen scraping. For instance, this URL will take you
    to the Table Browser for hg19 with the Custom Tracks track group selected:
     
    http://genome.ucsc.edu/cgi-bin/hgTables?clade=mammal&org=Human&db=hg19&hgta_group=user
     
    You might have to do some experimenting to make it work.
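
A Table Browser URL like the one above can be assembled programmatically. The following minimal Python sketch uses only the parameters that appear in that URL; the function name is hypothetical, and the screen-scraping step is left out because the layout of the returned page is not described here.

```python
from urllib.parse import urlencode

TABLE_BROWSER = "http://genome.ucsc.edu/cgi-bin/hgTables"


def table_browser_url(clade, org, db, group):
    """Build a Table Browser URL from the parameters seen in the example URL."""
    params = {"clade": clade, "org": org, "db": db, "hgta_group": group}
    # urlencode percent-escapes values and joins the pairs with "&".
    return f"{TABLE_BROWSER}?{urlencode(params)}"


# hgta_group=user selects the Custom Tracks group, as in the example URL.
print(table_browser_url("mammal", "Human", "hg19", "user"))
```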
     
    --
    Brooke Rhead
    UCSC Genome Bioinformatics Group
     
     
    On 5/2/13 12:28 PM, Brad Davis wrote:

     

    Andrew Fritz <ajfritz <at> buffalo.edu> May 02 03:46PM -0400  

    Hi,
    I am trying to find the exact base pair start and end sites for the
    nucleolar organizing regions on chromosomes 13, 14, 15, 21, and 22, but I
    can't seem to find this information anywhere. All I know is that they are on
    the p arms of these chromosomes, but I need to know much more specifically.
    Thank you!
    Andrew Fritz
    University at Buffalo

     


--
 
 
 
genome | 1 May 2013 19:19

Digest for genome <at> soe.ucsc.edu - 4 Messages in 3 Topics


    sjammula <sjammula <at> myriad.com> May 01 10:15AM -0600  

    Hello,
     
    In the past, I have seen the genome coordinates increase from left to right when any gene is displayed. But now I notice that the display shows them decreasing from left to right.
     
    Is there a way to change the display to go back to the old way of displaying coordinates ?
     
    I use Safari on an iMac to access the Genome Browser.
     
    Thanks,
    Srikanth Jammulapati

     

    Michael Mueller <michael.mueller <at> imperial.ac.uk> Apr 30 10:07PM +0100  

    Hi,
     
    I was wondering if there is a way of getting bigwig/BAM tracks to work
    with SFTP connections?
     
    Michael

     

    Brian Lee <brianlee <at> soe.ucsc.edu> Apr 30 03:03PM -0700  

    Dear Michael,
     
    Thank you for using the UCSC Genome Browser and your question about using
    SFTP for bigWig and BAM custom tracks.
     
    Unfortunately, SFTP connections are not currently supported. We may support
    them in the future, and your interest in such a feature has been noted. We
    do support FTP, HTTP, and HTTPS. For HTTPS connections, special characters
    in passwords need to be replaced by their hexadecimal representation. For
    example, in the password mypwd@wk, the @ character should be replaced by
    %40, resulting in the modified password mypwd%40wk.
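
The replacement described above is standard URL percent-encoding, so it does not have to be done by hand. A minimal Python sketch using the standard library follows; the function name is hypothetical, and the password is the one from the example.

```python
from urllib.parse import quote


def encode_password(password):
    """Percent-encode a password for use inside a URL."""
    # safe="" makes quote() escape every reserved character,
    # including "/" which it would otherwise leave alone.
    return quote(password, safe="")


print(encode_password("mypwd@wk"))
```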
     
    Thank you again for your inquiry and using the UCSC Genome Browser. If you
    have further questions, please feel free to contact the mailing list again
    at genome <at> soe.ucsc.edu.
     
    All the best,
     
    Brian Lee
    UCSC Genome Bioinformatics Group
     
     
    On Tue, Apr 30, 2013 at 2:07 PM, Michael Mueller <

     


--
 
 
 
genome | 26 Apr 2013 20:42

Digest for genome <at> soe.ucsc.edu - 4 Messages in 4 Topics


    Moonmoon Deb <moonmoondeb <at> gmail.com> Apr 26 06:07PM +0530  

    Sir,
    Please let me know how I can download the sequence of my selected region of
    a chromosome.
    --
    Thanking You with Regards.
    Moonmoon Deb

     

    Pauline Fujita <pauline <at> soe.ucsc.edu> Apr 25 04:31PM -0700  

    Hello Venkat,
     
    The best option for that number of gene regions would be to use the Table
    Browser and the define regions function. To do this go to the Table Browser
    (http://genome.ucsc.edu/cgi-bin/hgTables) and select:
     
    group: Variation and Repeats
    track: All SNPs (137)
    table: snp137
     
    region: click the "define regions" button
     
    In the region definition menu, you can paste or upload a file containing
    the coordinates of your gene regions of interest. These coordinates will
    need to conform to the syntax used in the browser - which is described just
    below the box where you can enter them.
     
    Note also that our coordinates always have a zero-based start and a
    one-based end - you can read more about that in this FAQ:
     
    http://genome.ucsc.edu/FAQ/FAQtracks#tracks1
     
    If you do not already know the coordinates of your regions of interest, you
    could also generate a list of coordinates based on a list of genes using
    the Table Browser. To do this select:
     
    group: Genes and Gene Prediction Tracks
    track: your preferred gene prediction track (e.g. UCSC Genes)
    table: knownGene
     
    region: genome
    identifiers: click "paste list" and enter your list of gene names - note
    that they will have to be of the form described by the examples
     
    output format: BED
     
    Then, after clicking 'get output', you will need to specify which genic
    regions you want to include (e.g. introns, exons, etc.). You can then use this
    BED output in the "define regions" step above.
     
    If you have further questions please feel free to contact the mailing list
    again at genome <at> soe.ucsc.edu.
     
    Best regards,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     
     
     
     
     

     

    Mehar <mehar.com <at> gmail.com> Apr 25 11:05PM +0300  

    Hi,
     
    I am in need of a known SNPs file for the canFam3 build, and I have been
    regularly checking
    http://hgdownload.soe.ucsc.edu/goldenPath/canFam3/database/ since
    September 2011 for a known SNPs file to download. Unfortunately, it is not
    available in the canFam3 annotation database.
     
    Could someone help to provide the SNP file for canFam3?
     
    Thanks in advance.
     
     
    Regards
    Mehar

     

    Pauline Fujita <pauline <at> soe.ucsc.edu> Apr 25 12:34PM -0700  

    Hello Rohit,
     
    This depends where you are looking at the exon numbers. For output from the
    Table Browser (genome.ucsc.edu/cgi-bin/hgTables) exons will be numbered in
    order of increasing chromosomal coordinate (not 5' to 3').
     
    However, if you are looking at RefSeq in the browser, and have the
    next/previous exon navigation arrows turned on (enabled by clicking on the
    "configure" button below the main browser display - and checking the box at
    the top), then the exon numbers you will see as you hover your mouse above
    the next/prev arrow are numbered 5' to 3' to match RefSeq.
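
The two numbering schemes described above differ only for genes on the minus strand, where the 5' end lies at the highest coordinate. A minimal Python sketch of the conversion, assuming a 0-based exon index in coordinate order (as in the comma-separated exonStarts lists) and a hypothetical function name:

```python
def transcript_exon_number(index, exon_count, strand):
    """Map a 0-based exon index in coordinate order to a 1-based 5'-to-3' number."""
    if not 0 <= index < exon_count:
        raise ValueError("exon index out of range")
    if strand == "+":
        # Plus strand: coordinate order already runs 5' to 3'.
        return index + 1
    if strand == "-":
        # Minus strand: the last exon by coordinate is the first 5' exon.
        return exon_count - index
    raise ValueError(f"unknown strand: {strand!r}")


# For a 7-exon gene, the lowest-coordinate exon is exon 1 on the
# plus strand but exon 7 on the minus strand.
print(transcript_exon_number(0, 7, "+"))
print(transcript_exon_number(0, 7, "-"))
```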
     
    Hopefully this clarifies things. If you have further questions please feel
    free to contact the mailing list again at genome <at> soe.ucsc.edu.
     
     
    Best regards,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     
     
     
    On Thu, Apr 25, 2013 at 4:05 AM, Rohit Gavval <rohit_gavval <at> persistent.co.in

     


--
 
 
 
genome | 19 Apr 2013 19:41

Digest for genome <at> soe.ucsc.edu - 3 Messages in 2 Topics


    David Nguyen <d.hh.nguyen <at> gmail.com> Apr 18 02:55PM -0400  

    Hi there,
     
    I'm interested in looking at the TFBS data and downloaded the files. One
    is called:
     
    spp.optimal.wgEncodeBroadHistoneGm12878CtcfStdAlnRep0_VS_wgEncodeBroadHistoneGm12878ControlStdAlnRep0
     
    However, I don't have a program that can open this file.
    What and where is the program for this?
     
    Thanks,
     
    -dave

     

    xiewanchen2013 <xiewanchen2013 <at> gmail.com> Apr 19 10:14PM +0800  

    Dear Madam/Sir,
     
    I want to know if there is any normalized RNA-seq and ChIP-seq data available for differential analysis between cell lines, such as with edgeR or DESeq.
     
    Thank you for your time!
     
     
     
     
    xiewanchen2013

     


--
 
 
 
genome | 15 Apr 2013 20:28

Digest for genome <at> soe.ucsc.edu - 3 Messages in 2 Topics


    Meenakshi Sharma <meenakshi8 <at> gmail.com> Apr 14 10:28PM -0500  

    Hello,
     
    I have a question. Suppose I have genomic regions (with start and
    end locations) from the human genome, and I want to know whether these
    regions are part of genes or repetitive elements. I am interested in the
    biological function of these regions.
     
    Is there a combined general database which I can use to query these regions
    against?
    Currently I scan these regions against refGene db.
     
    Thanks and regards,
    Meenakshi Sharma

     

    "Rosenfeld, Jeffrey" <rosenfj1 <at> umdnj.edu> Apr 13 09:11PM -0400  

    Hi,
     
    Is there a mirror for the ENCODE data? The download is extremely slow.
     
    Thanks,
     
    Jeff
     
    ------------------------------------------------------
    Jeffrey Rosenfeld, Ph. D
    IST/High Performance and Research Computing
    University of Medicine and Dentistry of New Jersey (UMDNJ)
    973-972-1004 (voice)
    973-972-7412 (fax)
    MSB-C631
    185 South Orange Avenue
    Newark, NJ 07101
     
    Sackler Institute for Comparative Genomics
    American Museum of Natural History

     


--
 
 
 
genome | 13 Apr 2013 20:35

Digest for genome <at> soe.ucsc.edu - 2 Messages in 2 Topics


    Luvina Guruvadoo <luvina <at> soe.ucsc.edu> Apr 12 01:21PM -0700  

    Hi Brad,
     
    Your URL (http://www.bcgsc.ca/downloads/edcc/test/fib03_Standard.bed)
    already contains a track line, which is why you are receiving this
    error. Replace the second line of your customTracks.txt file with only
    the URL, and it should upload without any errors.
     
    If you have further questions please feel free to contact the mailing
    list again at genome <at> soe.ucsc.edu.
     
    ---
    Luvina Guruvadoo
    UCSC Genome Bioinformatics Group
     
    On 4/12/2013 9:50 AM, Brad Davis wrote:

     

    Luvina Guruvadoo <luvina <at> soe.ucsc.edu> Apr 12 01:12PM -0700  

    Hi Marco,
     
    You can read more about the SNP track on its track description page.
    This can be accessed by clicking on the gray bar to the left of the
    track in the main display, or by clicking on the track title above its
    pull down menu below the main display. See the SNP 137 track description
    here: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=19&g=snp137. This
    should provide the answers to questions 1 and 3.
     
    To answer the second part of question 1, it really depends on the
    purpose of your analysis. You could use the "All SNP" track, but note
    that it contains not only polymorphisms, but also rare SNPs, known
    disease SNPs, and variants that are questionable because they map to
    multiple locations in the genome.
     
    For your last question, the usual reason for inconsistency between the
    dbSNP website and ours is that they constantly update their web
    database, while we only load our SNPs tracks once when they make
    official release download files.
     
    I hope this helps. If you have additional questions, feel free to
    contact us again at genome <at> soe.ucsc.edu.
     
    ---
    Luvina Guruvadoo
    UCSC Genome Bioinformatics Group
     
    On 4/12/2013 3:25 AM, Marco Santagostino wrote:

     


--
 
 
 
genome | 5 Apr 2013 19:36

Digest for genome <at> soe.ucsc.edu - 12 Messages in 11 Topics


    chenjh <chenjhbio <at> 163.com> Apr 05 04:24PM +0800  

    Dear Sir,
    Very sorry to trouble you.
     
    We have run into problems. Could you tell us how to download human dbSNP 137?
     
    Many thanks in advance.

     

    Luvina Guruvadoo <luvina <at> soe.ucsc.edu> Apr 05 09:50AM -0700  

    Hello,
     
    You can download dbSNP 137 tables directly from our downloads server.
    From the main page, click on 'Downloads' (located on left-side
    navigation menu). On the following page, click on 'Human' then
    'Annotation Database'. This will take you here:
    http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/. Scroll down
    and you will find all download files associated with dbSNP 137.
     
    Please contact us again at genome <at> soe.ucsc.edu if you have any further
    questions.
     
    ---
    Luvina Guruvadoo
    UCSC Genome Bioinformatics Group
     
     
     
    On 4/5/2013 1:24 AM, chenjh wrote:

     

    Peng Yu <pengyu.ut <at> gmail.com> Apr 05 11:31AM -0500  

    Hi,
     
    The following mysql code shows that many proteins appear more than
    once in knownGene.
     
    ~~~
    mysql> select count(*) from knownGene where proteinID!="";
    +----------+
    | count(*) |
    +----------+
    |    41735 |
    +----------+
    1 row in set (0.32 sec)
     
    mysql> select count(distinct proteinID) from knownGene where proteinID!="";
    +---------------------------+
    | count(distinct proteinID) |
    +---------------------------+
    |                     34117 |
    +---------------------------+
    1 row in set (0.23 sec)
     
    mysql> select proteinID from knownGene group by proteinID having count(*) >=2 limit 5;
    +-----------+
    | proteinID |
    +-----------+
    |           |
    | A0ELI5    |
    | A0JNT0    |
    | A0JNY8    |
    | A0JNZ2    |
    +-----------+
    5 rows in set (0.26 sec)
     
    mysql> select * from knownGene where proteinID="A0ELI5";
    +------------+-------+--------+----------+----------+----------+----------+-----------+-----------------------------------------------------------------+-----------------------------------------------------------------+-----------+------------+
    | name       | chrom | strand | txStart  | txEnd    | cdsStart | cdsEnd   | exonCount | exonStarts                                                      | exonEnds                                                        | proteinID | alignID    |
    +------------+-------+--------+----------+----------+----------+----------+-----------+-----------------------------------------------------------------+-----------------------------------------------------------------+-----------+------------+
    | uc009pvp.1 | chr9  | +      | 57556375 | 57597969 | 57561204 | 57596302 |         7 | 57556375,57561187,57563754,57574992,57587696,57592391,57595967, | 57556506,57561368,57564074,57575328,57587850,57592609,57597969, | A0ELI5    | uc009pvp.1 |
    | uc009pvq.1 | chr9  | +      | 57556375 | 57597969 | 57575057 | 57596302 |         6 | 57556375,57561187,57574992,57587696,57592391,57595967,          | 57556506,57561368,57575328,57587850,57592609,57597969,          | A0ELI5    | uc009pvq.1 |
    +------------+-------+--------+----------+----------+----------+----------+-----------+-----------------------------------------------------------------+-----------------------------------------------------------------+-----------+------------+
    2 rows in set (0.07 sec)
    ~~~
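
To make the two A0ELI5 rows above easier to compare, here is a minimal Python sketch that parses the comma-terminated exonStarts and exonEnds columns and reports which exons differ. The helper name is hypothetical and the values are copied from the query output. It only restates what the table already shows and does not explain how proteinID is assigned.

```python
def parse_exons(exon_starts, exon_ends):
    """Turn comma-terminated exonStarts/exonEnds strings into (start, end) pairs."""
    starts = [int(x) for x in exon_starts.split(",") if x]
    ends = [int(x) for x in exon_ends.split(",") if x]
    if len(starts) != len(ends):
        raise ValueError("exonStarts and exonEnds differ in length")
    return list(zip(starts, ends))


# exonStarts and exonEnds copied from the two rows in the query output.
rows = {
    "uc009pvp.1": (
        "57556375,57561187,57563754,57574992,57587696,57592391,57595967,",
        "57556506,57561368,57564074,57575328,57587850,57592609,57597969,",
    ),
    "uc009pvq.1": (
        "57556375,57561187,57574992,57587696,57592391,57595967,",
        "57556506,57561368,57575328,57587850,57592609,57597969,",
    ),
}

exons = {name: set(parse_exons(s, e)) for name, (s, e) in rows.items()}

# Exons present in one transcript but not in the other.
print(sorted(exons["uc009pvp.1"] - exons["uc009pvq.1"]))
print(sorted(exons["uc009pvq.1"] - exons["uc009pvp.1"]))
```

In this output the two rows are distinct transcripts at the same locus: uc009pvp.1 has one exon that uc009pvq.1 lacks, and their cdsStart values differ.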
     
    The following shows that UniProt has more mouse proteins than the ones
    in knownGene.
    ~~~
    ~$ wget -qO- ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/proteomes/MOUSE.fasta.gz | gunzip | grep '^>' | wc -l
    50798
    ~~~
     
    Does anybody know the reason for each of these two observations?
     
    --
    Regards,
    Peng

     

    "Steve Heitner" <steve <at> soe.ucsc.edu> Apr 05 08:19AM -0700  

    Hello, Shijian.
     
    If you are unable to solve the connection problem at your institution, you
    may consider one of the many public storage solutions available. Also, we
    recommend using HTTP rather than FTP. The URL will operate more
    efficiently, requiring only one connection and a single request rather than
    two connections and several setup requests.
     
    Please contact us again at genome <at> soe.ucsc.edu if you have any further
    questions.
     
    ---
    Steve Heitner
    UCSC Genome Bioinformatics Group
     

     
    From: genome <at> soe.ucsc.edu [mailto:genome <at> soe.ucsc.edu] On Behalf Of Zhangsj
    Sent: Monday, April 01, 2013 4:39 PM
    To: genome
    Subject: Re:RE: [genome] Custom Track URL error
     

     
    Sorry, I should have mentioned that I had already tried this: I can view my
    data in a browser, and I have not set listen_address for vsftp. So I guess
    the problem is not that UCSC's IP address is denied access to my FTP server.
     

     

     
    ------------------ Original ------------------
     
    From: "Steve Heitner"<steve <at> soe.ucsc.edu>;
     
    Date: Tue, Apr 2, 2013 01:59 AM
     
    To: "'Zhangsj'"<zhangsjsky <at> foxmail.com>; "'genome'"<genome <at> soe.ucsc.edu>;
     
    Subject: RE: [genome] Custom Track URL error
     

     
    Hello, Shijian.
     
    I just attempted to browse to your FTP site and it was unavailable. If you
    are unable to view the contents of the directory in a web browser, you will
    likewise be unable to load your files as a custom track. Make sure that
    when you browse to ftp://162.105.138.90, you can see your files in the
    directory. Once you can do that, try loading your custom track again.
     
    Please contact us again at genome <at> soe.ucsc.edu if you have any further
    questions.
     
    ---
    Steve Heitner
    UCSC Genome Bioinformatics Group
     

     
    From: genome <at> soe.ucsc.edu [mailto:genome <at> soe.ucsc.edu] On Behalf Of Zhangsj
    Sent: Monday, April 01, 2013 8:39 AM
    To: genome
    Subject: [genome] Custom Track URL error
     

     
    Hi, dear UCSC members
     

     
    I want to upload a BAM file to UCSC, but I got the following error:
     
    Can't access My BAM's bigDataUrl ftp://162.105.138.90/adipose.sorted.bam
    and/or the associated index file
    ftp://162.105.138.90/adipose.sorted.bam.bai: TCP non-blocking connect() to
    162.105.138.90 timed-out in select() after 10000 milliseconds - Cancelling!
     

     
    The track line is: track type=bam name="My BAM"
    bigDataUrl=ftp://162.105.138.90/adipose.sorted.bam
     

     
    I have set up an FTP server on 162.105.138.90 with vsFTP and configured it
    for passive connection mode, as UCSC requires. Anonymous FTP is on. The file
    adipose.sorted.bam.bai is in the same directory as adipose.sorted.bam.
     

     
    Did I miss anything else? I look forward to your reply.
     

     
    ------------------
     
    With best wishes,
     
    Shijian Sky Zhang
     
    ---------------------------------------
    Institute of Molecular Medicine, Peking University
    Yingjie Exchange Center
    Beijing 100871, China
    E-mail: zhangsjsky <at> foxmail.com
     
    zhangsjsky <at> pku.edu.cn
     

     

     

    Pauline Fujita <pauline <at> soe.ucsc.edu> Apr 04 05:13PM -0700  

    Hello Mark,
     
    We think we might have corrected the issue. If you are still having
    trouble with this please contact the mailing list again at
    genome <at> soe.ucsc.edu.
     
    Best regards,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     
     

     

    Pauline Fujita <pauline <at> soe.ucsc.edu> Apr 04 01:51PM -0700  

    Hello Herty,
     
    If you want to draw graphs of signal data from ENCODE, you probably
    want to use the ENCODE uniform signals, which are in bigWig format. These
    are available from the ENCODE downloads page:
     
    encodeproject.org/ENCODE/downloads.html
     
    You may also find our documentation on ENCODE file formats useful:
     
    http://encodeproject.org/FAQ/FAQformat.html#ENCODE
     
    As far as R packages are concerned, unfortunately that is beyond the scope
    of this list. There are quite a few packages available online; you will
    have to see which best fits your needs. A good starting point might be the
    Bioconductor site:
     
    http://www.bioconductor.org/help/search/index.html?q=R
     
    Best of luck with your research! If you have further questions regarding
    Genome Browser usage, please feel free to contact the mailing list again at
    genome <at> soe.ucsc.edu.
     
    Best regards,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     
    On Thu, Apr 4, 2013 at 2:45 AM, Herty LIANY (GIS)

     

    Pauline Fujita <pauline <at> soe.ucsc.edu> Apr 04 01:25PM -0700  

    Hello Sarah,
     
    You can find the files you're looking for in the downloads area of our test site:
     
    http://hgdownload-test.soe.ucsc.edu/goldenPath/nemVec1/bigZips/
     
    Note that in general, items on our test server have not been through
    our quality assurance process and are subject to change.
     
     
    Best regards,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     

     

    Kate Rosenbloom <kate <at> soe.ucsc.edu> Apr 04 12:59PM -0700  

    Hello Brenden,
     
    The Txn Factor ChIP-seq browser track contains clusters of enriched
    sites from ChIP-seq performed in cell types from
    all 3 tiers of ENCODE. You can see the complete list of cell types
    included in this track on the track description page:
     
    http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeRegTfbsClusteredV2
     
    The table you pulled from the Table Browser includes information for
    each cluster indicating which cell types contributed to the cluster, and
    what the signal strength for that cell type was in the peak contributing
    to the cluster, but not the specific coordinates for the peaks on a
    cell-specific basis. For that, you would need the individual
    per-cell-type datasets. The uniform ChIP-seq peaks that were used to
    generate this version of the Txn Factor ChIP track are available from the
    ENCODE Analysis Track Hub, both for download and for access with the Table
    Browser (and Genome Browser). For a limited number of datasets, you can
    readily extract the peak coordinates for your region of interest using
    the Table Browser after loading the track hub. You can do this directly
    from the browser Track Connect page, or indirectly from the ENCODE
    downloads page (click the TFBS link in the Uniform Peaks section
    header): http://encodeproject.org/ENCODE/downloads.html
     
    Unfortunately the ENCODE data is so voluminous that often downloading
    the files and processing at your site is necessary to extract data for
    your specific needs. To extract all ChIP-seq peaks in your region of
    interest from the 400+ datasets represented here, you would need to
    download the TFBS Peaks (SPP) via the download link, and filter the
    resulting narrowPeak BED files for your regions of interest.
    Fortunately the Peak files are the most compact of the ENCODE files, so
    should be relatively easy to download.
     
    Note that the datasets represented here are based on data submitted for
    the ENCODE January 2011 data freeze, with analysis reported by the
    ENCODE Analysis Working Group in coordinated publications in September
    2012: http://encodeproject.org/ENCODE/analysis.html.
     
    Cheers,
    Kate
    ---
    Kate Rosenbloom
    UCSC Genome Bioinformatics
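
The filtering step described above (keeping only the peaks that fall in your regions of interest after downloading the narrowPeak files) can be sketched roughly as follows. This assumes tab-separated narrowPeak lines whose first three fields are chrom, chromStart and chromEnd with BED-style 0-based, half-open coordinates; the function names and the sample lines are hypothetical, not real ENCODE data.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if two half-open intervals [start, end) share at least one base."""
    return start_a < end_b and start_b < end_a


def filter_narrowpeak(lines, chrom, start, end):
    """Yield the fields of each peak that overlaps chrom:start-end."""
    for line in lines:
        # Skip blank lines and track/browser/comment header lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[0] != chrom:
            continue
        if overlaps(int(fields[1]), int(fields[2]), start, end):
            yield fields


if __name__ == "__main__":
    # Made-up peaks for illustration only.
    sample = [
        "chr1\t100\t200\t.\t500\t.\t10.0\t-1\t2.5\t50\n",
        "chr1\t900\t950\t.\t300\t.\t4.0\t-1\t1.2\t20\n",
        "chr2\t100\t200\t.\t700\t.\t12.0\t-1\t3.1\t60\n",
    ]
    for peak in filter_narrowpeak(sample, "chr1", 150, 400):
        print("\t".join(peak))
```

For the 400+ datasets mentioned above, the same generator could be run over each downloaded file in turn, tagging each kept peak with the file it came from.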
     
    On 4/4/13 8:32 AM, Chen, Brenden wrote:

     

    Brooke Rhead <rhead <at> soe.ucsc.edu> Apr 04 12:18PM -0700  

    Hi Bob,
     
    Input signal values are multiplied by a normalization factor calculated
    as the ratio of the maximum score value (1000) to the signal value at 1
    standard deviation above the mean, with values exceeding 1000 capped at
    1000. This has the effect of distributing scores up to mean + 1 std
    across the score range, while assigning the max score to everything above.
     
    We are not aware of a standard threshold for weak interactions.
     
    If you have further questions, please feel free to contact us again at
    genome <at> soe.ucsc.edu.
     
    --
    Brooke Rhead
    UCSC Genome Bioinformatics Group
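
The normalization described above can be sketched as follows. This is an illustration of the stated rule only, not the code UCSC uses: the use of the population standard deviation, rounding to the nearest integer, and the function names are all assumptions.

```python
import statistics

MAX_SCORE = 1000


def normalization_factor(signals):
    """Ratio of the max score (1000) to the signal value at mean + 1 std."""
    # Assumption: population standard deviation; the source does not say which.
    reference = statistics.mean(signals) + statistics.pstdev(signals)
    if reference <= 0:
        raise ValueError("mean + 1 std must be positive")
    return MAX_SCORE / reference


def normalize(signals):
    """Scale each signal by the factor and cap the result at 1000."""
    factor = normalization_factor(signals)
    # Assumption: scores are rounded to integers.
    return [min(MAX_SCORE, round(value * factor)) for value in signals]


if __name__ == "__main__":
    # Made-up signal values for illustration.
    print(normalize([1, 2, 3, 4, 5]))
```

With these made-up values, every signal above mean + 1 std receives the maximum score, which matches the capping behavior described in the message.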
     
     
    On 4/2/13 3:58 PM, Robert O'Connor wrote:

     

    Brooke Rhead <rhead <at> soe.ucsc.edu> Apr 04 12:00PM -0700  

    Hi Navin,
     
    If you are using the Table Browser and the "CDS FASTA" output format,
    then the CDS alignments from the UCSC Genes track (table: knownGene) are
    in phase. If you are using RefSeq Genes (table: refGene), a few of the
    refGene exons might not be in phase after the first exon since the mRNA
    may have bases in it that aren't in the reference sequence, but the huge
    majority will be in phase.
     
    --
    Brooke Rhead
    UCSC Genome Bioinformatics Group
     
     
    On 4/3/13 12:15 PM, Navin Rustagi wrote:

     

    Brian Lee <brianlee <at> soe.ucsc.edu> Apr 04 11:47AM -0700  

    Dear Mark,
     
    Thank you for using the UCSC Genome Browser and for your question about
    obtaining the chromosomal location of genes from unique identifiers such
    as UCSC transcripts and Entrez Gene accession numbers.
     
    You can use our Table Browser "identifiers (names/accessions):"
    function to upload a list of genes and select the desired output from
    the corresponding track. For example, you can pull the coordinates
    from the UCSC genes knownGene table if you upload a file of UCSC gene
    names similar to this:
    uc007aet.1
    uc007aeu.1
    uc007aev.1
    ...
     
    1. Navigate to the Table Browser Tool,
    http://genome.ucsc.edu/cgi-bin/hgTables, and make the following
    selections:
     
    Clade: Mammal
    Genome: Mouse
    Assembly: mm9
    Group: Genes and Gene Prediction Tracks
    Track: UCSC Genes
    Table: knownGene
    Region: genome
     
    2. Click the "upload list" button to use your UCSC genes identifiers
    file in the format described above.
     
    3. Set "output format:" to "selected fields from primary and related
    tables" and click "get output".
     
    4. On the "Select Fields from mm9.knownGene" page, select your desired
    information. For example you can select "name", "chrom", "txStart",
    and "txEnd" and then click "get output" to get information similar to
    this:
     
    #name chrom txStart txEnd
    uc007aet.1 chr1 3195984 3205713
    uc007aeu.1 chr1 3204562 3661579
    uc007aev.1 chr1 3638391 3648985
    ...
     
    If you have Entrez Gene identifiers such as "382301", which
    corresponds to UCSC identifier "uc009vfk.1", you can get the same
    output using the Table Browser "filter" tool.
     
    1. Repeat step 1 above.
     
    2. Click the "clear list" button to remove the previous identifier
    operation. Next to "filter:", click the "create" button.
     
    3. Scroll down to the "Linked Tables" section and put a check mark
    next to the "knownToLocusLink" table and click the "allow filtering
    using fields in checked tables" button.
     
    4. Under "mm9.knownToLocusLink based filters", find the "values does
    match" line. Replace the "*" with a list of your Entrez Gene
    identifiers; for example, paste "18019 11735 11496 16709 12916 ...".
    The box is small, but it will accept a list of identifiers.
     
    5. Repeat step 3 from above.
     
    6. Repeat step 4 from above. Here you could use the "Linked Tables" to
    add the "value" from knownToLocusLink in your output and you will get
    output like:
    #mm9.knownGene.name mm9.knownGene.chrom mm9.knownGene.txStart
    mm9.knownGene.txEnd mm9.knownToLocusLink.value
    uc008oav.1 chr2 168301909 168415848 18019
     
    When selecting a table to browse in the Table Browser, you can learn
    more about other tables by clicking the "describe table schema" button.
    For example, you can also change the Table Browser settings in the above
    examples to "Track: RefSeq Genes" and "Table: refGene" if you are
    using identifiers like "NM_001195025". Or you can make similar filter
    selections using Linked Tables based on the RefSeq refGene table rather
    than UCSC knownGene to obtain RefSeq coordinate output. For example,
    by linking through "knownToRefSeq" and then "knownToLocusLink", you can
    filter on Entrez Gene IDs and select output like:
    #mm9.refGene.name mm9.refGene.chrom mm9.refGene.txStart
    mm9.refGene.txEnd mm9.knownToLocusLink.value
    NM_010899 chr2 168301909 168415783 18019
     
    Unfortunately, it looks like we do not have any tracks or tables that
    contain MGI identifiers of the type "MGI:1341105".
     
    Thank you again for your inquiry and for using the UCSC Genome Browser. If
    you have further questions, please feel free to contact the mailing
    list again at genome <at> soe.ucsc.edu.
     
    All the best,
     
    Brian Lee
    UCSC Genome Bioinformatics Group
     

     


--
 
 
 
genome | 23 Mar 2013 18:51

Digest for genome <at> soe.ucsc.edu - 3 Messages in 3 Topics

Group: http://groups.google.com/a/soe.ucsc.edu/group/genome/topics

    Luvina Guruvadoo <luvina <at> soe.ucsc.edu> Mar 22 03:14PM -0700  

    Hi Chen,
     
    There are actually two UCSC gene models for IQCF5:
     
    uc003dbq.4 chr3:51,840,728-51,937,386 (which matches Ensembl)
    uc011bdx.2 chr3:51,907,737-51,909,600
     
    Note that the transcript you referred to in your email corresponds to
    IQCF5. You can read more about how UCSC genes are built on the track
    description page:
    http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=knownGene
     
    If you have further questions please feel free to contact the mailing
    list again at genome <at> soe.ucsc.edu.
     
    ---
    Luvina Guruvadoo
    UCSC Genome Bioinformatics Group
     
     
    On 3/22/2013 9:27 AM, mingchen wrote:

     

    Pauline Fujita <pauline <at> soe.ucsc.edu> Mar 22 12:12PM -0700  

    Hello Ganesh,
     
    uc010dhl.1 is the unique identifier given to this transcript variant
    within the "UCSC genes" gene prediction dataset. These identifiers are
    generated when we create this track here at UCSC, and updated if the
    transcript changes. You can read more about the UCSC genes track on
    its description page here:
     
    http://genome.ucsc.edu/cgi-bin/hgTrackUi?g=knownGene
     
    Hopefully that is enough to get you started. If you have further
    questions please feel free to contact the mailing list again at
    genome <at> soe.ucsc.edu.
     
    Best regards,
     
    Pauline Fujita
    UCSC Genome Bioinformatics Group
    http://genome.ucsc.edu
     
     
     

     

    Brooke Rhead <rhead <at> soe.ucsc.edu> Mar 22 11:47AM -0700  

    Hi Matt,
     
    I see now. The links from Affymetrix load four custom tracks into the
    Genome Browser (they show up in the track controls under the group
    "other" at the bottom of the page). You will need to contact Affy
    directly about the contents of the tracks, since we don't actually host
    the data here.
     
    --
    Brooke Rhead
    UCSC Genome Bioinformatics Group
     
     
    On 3/22/13 2:32 AM, matt arno wrote:

     
