If you have more questions, please contact annolnc@mail.cbi.pku.edu.cn.

  1. What is AnnoLnc?

    AnnoLnc is a web server for the integrative annotation of novel human lncRNAs. Designed as a one-stop portal, it accepts human lncRNA sequences as input and generates a full spectrum of annotations covering sequence and structure features, regulation, expression, protein interaction, genetic association and evolution. These heterogeneous annotations are then integrated to help reveal biologically meaningful clues.

  2. Why did you develop AnnoLnc?

    In recent years, long noncoding RNAs (lncRNAs) have been shown to be as essential and widespread as proteins. While the number of newly identified lncRNAs keeps increasing, their functions remain largely elusive. Existing lncRNA databases cannot handle newly identified lncRNAs, so we developed this server to help biologists interrogate novel lncRNAs from various perspectives and discover useful clues in heterogeneous annotations. AnnoLnc also accepts known lncRNAs.

  3. Is AnnoLnc a web server or a database?

    AnnoLnc is a web server: it analyzes input sequences on the fly rather than simply retrieving stored data. Every valid input sequence is first mapped to the human genome to obtain its genomic location, and the downstream analyses follow. For efficiency, we employ a “cache” strategy, so results for previously analyzed sequences can be reused. Note that even if a submitted sequence differs from a cached lncRNA by a single base, AnnoLnc treats it as a novel sequence and reanalyzes it.
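
    The exact-match behaviour of such a cache can be illustrated with a minimal Python sketch. This is not AnnoLnc's actual implementation: the class `AnnotationCache`, the `analyze` callback and the choice of SHA-256 as the cache key are all assumptions made for illustration.

```python
import hashlib


class AnnotationCache:
    """Exact-match cache: any difference in the sequence produces a new key."""

    def __init__(self, analyze):
        # `analyze` is a hypothetical function: sequence -> annotation result.
        self._analyze = analyze
        self._store = {}
        self.misses = 0  # number of sequences that had to be (re)analyzed

    @staticmethod
    def _key(sequence):
        # Assumption: the key is a digest of the sequence exactly as submitted.
        return hashlib.sha256(sequence.encode("ascii")).hexdigest()

    def get(self, sequence):
        key = self._key(sequence)
        if key not in self._store:
            # Cache miss: run the full analysis and remember the result.
            self.misses += 1
            self._store[key] = self._analyze(sequence)
        return self._store[key]
```

    With this design, submitting the same sequence twice triggers one analysis, while a sequence that differs by one base is analyzed from scratch.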

  4. How does each annotation module of AnnoLnc work?

    Please see the details on the method page.

  5. How do I submit lncRNA sequences? What are the requirements for input sequences?

    There are 3 ways to submit lncRNA sequences:

    Submission requirements (AnnoLnc will discard sequences that don't meet them):

    • Fasta format: your sequences must be in FASTA format, so every sequence needs a name.
    • Number of sequences: up to 100 sequences per submission if you paste them; up to 500 sequences per submission if you upload a FASTA file.
    • Name requirement: sequence names must be shorter than 100 characters. Only characters in [A-Za-z0-9_.,-] are allowed; illegal characters will be discarded.
    • Sequence requirement: sequences must be longer than 20 bp and shorter than 100,000 bp. Only characters valid in DNA and RNA sequences are allowed.
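
    To check a file before submitting it, the requirements above can be approximated with the following Python sketch. It is not the server's own code. The accepted alphabet (`ACGTUN`, case-insensitive), the function names, and raising an error when the sequence limit is exceeded are assumptions; the server may behave differently in these details.

```python
import re

MAX_NAME_LEN = 100      # names must be shorter than 100 characters
MIN_SEQ_LEN = 20        # sequences must be longer than 20 bp
MAX_SEQ_LEN = 100_000   # ... and shorter than 100,000 bp
ILLEGAL_NAME_CHARS = re.compile(r"[^A-Za-z0-9_.,-]")
# Assumed alphabet for "characters in DNA and RNA sequences".
VALID_SEQUENCE = re.compile(r"[ACGTUN]+", re.IGNORECASE)


def parse_fasta(text):
    """Yield (name, sequence) pairs; lines before the first '>' header are ignored."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_submission(text, max_sequences=100):
    """Return the (name, sequence) records that pass the stated requirements."""
    records = list(parse_fasta(text))
    if len(records) > max_sequences:
        # Assumption: an oversized submission is rejected as a whole.
        raise ValueError(f"too many sequences: {len(records)} > {max_sequences}")
    accepted = []
    for raw_name, seq in records:
        name = ILLEGAL_NAME_CHARS.sub("", raw_name)  # illegal characters are dropped
        if not name or len(name) >= MAX_NAME_LEN:
            continue
        if not (MIN_SEQ_LEN < len(seq) < MAX_SEQ_LEN):
            continue
        if not VALID_SEQUENCE.fullmatch(seq):
            continue
        accepted.append((name, seq))
    return accepted
```

    Use `max_sequences=100` for pasted input and `max_sequences=500` for an uploaded file, matching the limits listed above.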

  6. Why don't some of my sequences have a "Locus" on the "Overview" page?

    There are two possible situations:

    • The analysis has just started and the sequence hasn't been mapped to the human genome yet. In this case, the page refreshes automatically every 15 seconds.
    • All the analyses have finished and the sequence still has no "Locus". This means that it cannot be located in the human genome (hg19); it may not be a human sequence. In this case, AnnoLnc still runs the "sequence level" analyses for it.

  7. Why do some of my sequences have multiple "Locus" entries on the "Overview" page?

    This happens when a sequence maps to multiple loci in the human genome (hg19). AnnoLnc runs the analyses for all of these loci.

  8. How do I interpret the annotation results?

    Click the help icon on the annotation page to see a detailed explanation of each annotation result. If you have more questions, please contact annolnc@mail.cbi.pku.edu.cn.

  9. How are the ChIP-Seq datasets analyzed?

    We directly incorporated the uniform peak files generated by the ENCODE project, which are widely used by the community. The original dataset contains 690 peak files; after filtering out replicated experiments, 498 remain. For details of the original data analysis, please refer to the guideline and scripts provided with the original data.

  10. How are the RNA-Seq datasets analyzed?

    Please see the details on the method page (Expression).

  11. How are the CLIP-Seq datasets analyzed?

  12. What is the integrated view?

    The integrated view is a set of pre-tuned custom tracks in the UCSC genome browser. Because the genome browser lays out all tracks along the same genomic coordinates, spatial correlations across different kinds of annotations are easy to spot. Transcript-level annotations, including transcript structure, TF binding sites, miRNA binding sites, protein binding sites and SNP locations, are sent to the UCSC genome browser by URL. PhyloP scores and conserved elements are presented as UCSC local tracks. Note that only annotations near the lncRNA are displayed; for the definition of "near", please refer to the details of each module on the method page.
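
    As an illustration of sending tracks "by URL", the Python sketch below builds a UCSC genome browser link that loads a custom track file and opens at the lncRNA locus. `db`, `position` and `hgt.customText` are documented UCSC URL parameters, but whether AnnoLnc uses exactly these parameters is an assumption, and the function name, the `flank` argument and the track file URL are hypothetical.

```python
from urllib.parse import urlencode

UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def integrated_view_url(chrom, start, end, track_url, flank=0):
    """Build a UCSC link showing a custom track file around chrom:start-end (hg19)."""
    query = {
        "db": "hg19",
        # Widen the window by `flank` bases on each side, never going below 1.
        "position": f"{chrom}:{max(1, start - flank)}-{end + flank}",
        # URL of a custom track file (e.g. BED) that the browser will fetch.
        "hgt.customText": track_url,
    }
    return f"{UCSC_HGTRACKS}?{urlencode(query)}"


if __name__ == "__main__":
    # Hypothetical track file location, for illustration only.
    print(integrated_view_url("chr1", 1000, 2000,
                              "https://example.org/tracks/job123.bed", flank=500))
```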

    Tracks are:
    • User submitted sequence: the transcript structure of the lncRNA.
    • Trait associated SNPs.
    • PhyloP score of mammals: the UCSC local track.
    • Conserved elements: the UCSC local track.
    • Protein binding sites: binding sites from the same kind of samples (the same cell type treated with the same condition) are merged. Different colors mean different cell lines.
    • miRNA binding sites: binding sites are colored red if they are supported by AGO CLIP-Seq data.
    • Transcriptional regulation: TF binding sites from the same kind of samples (the same cell type treated with the same condition) are merged. Different colors mean different cell lines.
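
    The merging of binding sites within one kind of sample can be sketched as a standard interval merge. The Python example below assumes that "merged" means combining overlapping sites into a single interval, which is not spelled out here; the function name and the sample labels are hypothetical.

```python
from collections import defaultdict


def merge_binding_sites(sites):
    """Merge overlapping sites per (sample, chromosome).

    `sites` is an iterable of (sample, chrom, start, end) tuples, where
    `sample` identifies one cell type under one treatment condition.
    """
    grouped = defaultdict(list)
    for sample, chrom, start, end in sites:
        grouped[(sample, chrom)].append((start, end))

    merged = {}
    for key, intervals in grouped.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Overlaps the previous interval: extend it.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[key] = [tuple(iv) for iv in out]
    return merged
```

    Sites from different samples are never merged with each other, so each cell line keeps its own set of intervals and can be drawn in its own color.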

  13. Is there an introduction or tutorial in Chinese?