How to add conditional annotation from another column of an R data frame in ggplot2

Question

I'm trying to make a plot from a data frame of over 300k rows in which the high and low values of Y are annotated with values from another column rather than with the x and y values. How can I do that?

Data frame:

         chr   start   end.x     StoZ.x Hscore end.y StoZ.y Tier Gene
    1   chr1       1   10000  0.0000000      0    NA     NA <NA> <NA>
    2   chr1   10001   20000  3.6488202      2    NA     NA <NA> <NA>
    3   chr1   20001   30000 -0.8475483      0    NA     NA <NA> <NA>
    4   chr1   30001   40000  2.1279359      2    NA     NA <NA> <NA>
    5   chr1   40001   50000  1.4119515      0    NA     NA <NA> <NA>
        ................
        ................
        ................
  256   chr1 2550001 2560000 -2.363378       1 2560000 -2.363378 T1 TNFRSF14
  257   chr1 2560001 2570000 -0.796173       0 2570000 -0.796173 T1 TNFRSF14
       ................
       ................
       ................
  305   chr1 3040001 3050000 0.0564608       0 NA NA NA NA
  306   chr1 3050001 3060000 1.4822029       0 NA NA NA NA
  307   chr1 3060001 3070000 1.7718186       0 3070000 1.7718186 T1 PRDM16
  308   chr1 3070001 3080000 1.5650776       0 3080000 1.5650776 T1 PRDM16

       ................
       ................
       ................

What I have done so far:

    library(ggplot2)
    library(ggpmisc)  # assumed source of stat_peaks() and stat_valleys()

    # Hscore is assumed to be a factor; scale_colour_manual() needs a discrete variable
    p2 <- ggplot(BreastCanT1T2mer, aes(start, StoZ.x)) +
      geom_jitter(aes(color = Hscore), pch = 23, size = 0.2) +
      scale_colour_manual(values = c("gray70", "#525480", "#9F5370", "blue", "red"),
                          labels = c("NEUT", "LOSS", "GAIN", "DEL", "AMP")) +
      theme(legend.position = c(0.5, .06), legend.direction = "horizontal") +
      theme(legend.title = element_blank()) +
      geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
      guides(colour = guide_legend(override.aes = list(size = 1.5))) +
      xlab("Chromosomes") + ylab("Stoffers-Z Score") +
      # one panel per chromosome, ordered chr1 to chr22, then chrX and chrY
      facet_wrap(~factor(chr, paste0("chr", c(1:22, "X", "Y"))), nrow = 1, strip.position = "bottom") +
      theme(panel.spacing.x = unit(0.1, "lines")) +
      theme(axis.text.x = element_blank()) +
      theme(legend.key = element_rect(fill = "transparent", colour = NA))
    p2

    # mark local peaks and valleys as points, then add their text labels
    p2 + stat_peaks(span = 7, size = 1, ignore_threshold = 0.95, color = "brown") +
      stat_peaks(geom = "text", size = 1.5, span = 7, ignore_threshold = 0.95, color = "red", angle = 90, hjust = -0.1, check_overlap = TRUE) +
      stat_valleys(span = 7, size = 1, ignore_threshold = 0.95, color = "skyblue") +
      stat_valleys(geom = "text", size = 1.5, span = 7, ignore_threshold = 0.95, color = "black", angle = 90, hjust = 1.1, check_overlap = TRUE)

What I expect is that each gene name appears only once, on rows where Gene is available (probably where StoZ is highest or lowest), and only where the StoZ value is less than -10 or greater than +10.

Improvement after @Elin's suggestion:

    p2 + geom_text(aes(label = Gene))

Here, each gene name should appear only once, and only where y is below -2 or above +2.

Answer 1

Score: 2

Here is what I have so far, although I believe the output could be better.

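    library(ggrepel)  # provides geom_text_repel()
    # as.character(Gene) rather than unique(Gene): ifelse() would recycle the shorter vector and misalign labels
    p2 + geom_text_repel(aes(label = ifelse(StoZ.x < -20, as.character(Gene), "")), size = 1.5, angle = 45) +
      geom_text_repel(aes(label = ifelse(StoZ.x > 20, as.character(Gene), "")), size = 1.5, angle = 45)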
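
The two layers above still label every bin beyond the threshold, so a gene that spans several bins gets printed several times. One possible way to print each gene once is to build a small label table first and hand it to the text layer through `data =`. The sketch below is only an illustration under assumptions: the toy data frame, its values, the gene name `GENE_X`, the tier `T2` and the `cutoff` of 2 are all invented, and dplyr is not used in the original post.

```r
library(ggplot2)
library(dplyr)
library(ggrepel)

# Toy stand-in for BreastCanT1T2mer (all values invented for illustration)
toy <- data.frame(
  chr    = "chr1",
  start  = seq(1, by = 10000, length.out = 8),
  StoZ.x = c(0.5, 3.1, -2.4, -2.1, 1.8, 2.6, 0.1, -3.0),
  Hscore = factor(c(0, 2, 1, 1, 0, 2, 0, 1)),
  Tier   = c(NA, NA, "T1", "T1", "T1", "T1", NA, "T2"),
  Gene   = c(NA, NA, "TNFRSF14", "TNFRSF14", "PRDM16", "PRDM16", NA, "GENE_X")
)

cutoff <- 2  # hypothetical threshold; the question mentions both 2 and 10

# Keep rows that have a gene and lie beyond the cutoff,
# then keep only the most extreme bin of each gene
gene_labels <- toy %>%
  filter(!is.na(Gene), abs(StoZ.x) > cutoff) %>%
  group_by(chr, Gene) %>%
  slice_max(abs(StoZ.x), n = 1, with_ties = FALSE) %>%
  ungroup()

# The text layer reads the small label table; the points still use the full data
p_once <- ggplot(toy, aes(start, StoZ.x)) +
  geom_point(aes(color = Hscore)) +
  geom_text_repel(data = gene_labels, aes(label = Gene), size = 3)
```

With the real data, the same `gene_labels` table could presumably be built from `BreastCanT1T2mer` and added to `p2`. Because the table keeps the `chr` column, the labels should land in the matching facet panel, provided `chr` holds the same values as in the main data.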

Now I'm wondering whether I can color the geom_text labels by the Tier column.

So far, what I've tried hasn't yielded the desired result. Because I have already called scale_color_manual in p2, I can't call it again here, so I'm attempting to use scale_fill_manual instead, but it's not working.
Any advice would be appreciated.

    p2 + geom_text_repel(aes(label = ifelse(StoZ.x < -10, unique(Gene), ""), hjust = 1, vjust = .5, size = 0.5, angle = 45), fill = as.factor(BreastCanT1T2mer$Tier)) +
      scale_fill_manual(values = c("brown", "navy"))
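
A likely reason the attempt above has no effect is that text layers are drawn with the `colour` aesthetic, not `fill`, and `fill` is passed there as a fixed parameter outside `aes()`, so `scale_fill_manual()` has nothing to act on. A plot also keeps only one scale per aesthetic, which is why a second `scale_colour_manual()` replaces the first. One possible workaround, sketched here under assumptions, is the ggnewscale package, which is not part of the original post. The toy data, the gene name `GENE_X` and the tier `T2` are invented for illustration.

```r
library(ggplot2)
library(ggrepel)
library(ggnewscale)  # assumption: extra package, not used in the original post

# Toy data (invented values) standing in for the real table
toy <- data.frame(
  start  = seq(1, by = 10000, length.out = 6),
  StoZ.x = c(0.5, -2.4, 1.8, 2.6, 0.1, -3.0),
  Hscore = factor(c(0, 1, 0, 2, 0, 1)),
  Tier   = c(NA, "T1", NA, "T1", NA, "T2"),
  Gene   = c(NA, "TNFRSF14", NA, "PRDM16", NA, "GENE_X")
)

# Rows that carry a gene name are the ones to label
gene_labels <- toy[!is.na(toy$Gene), ]

# Hypothetical tier-to-colour lookup, reusing the two colours from the attempt
tier_cols <- c(T1 = "brown", T2 = "navy")

p_tier <- ggplot(toy, aes(start, StoZ.x)) +
  geom_point(aes(colour = Hscore)) +
  scale_colour_manual(values = c("gray70", "#525480", "#9F5370")) +
  new_scale_colour() +  # layers added after this line get their own colour scale
  geom_text_repel(data = gene_labels, aes(label = Gene, colour = Tier), size = 3) +
  scale_colour_manual(values = tier_cols)
```

In this sketch `size` stays outside `aes()`, since it is a constant rather than a column of the data. Mapping `colour = Tier` inside `aes()` is what lets the second manual scale pick the tier colours and draw a separate legend for them.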

huangapple
  • Posted on May 21, 2023, 09:29:25
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76297945.html