How to add conditional annotation from another column of an R data frame in ggplot2
Question
I'm trying to make a plot from a data frame of over 300k rows in which the high and low values of Y are annotated with text from another column, rather than with the x and y values.
How can I do that?
Data frame:
chr start end.x StoZ.x Hscore end.y StoZ.y Tier Gene
1 chr1 1 10000 0.0000000 0 NA NA <NA> <NA>
2 chr1 10001 20000 3.6488202 2 NA NA <NA> <NA>
3 chr1 20001 30000 -0.8475483 0 NA NA <NA> <NA>
4 chr1 30001 40000 2.1279359 2 NA NA <NA> <NA>
5 chr1 40001 50000 1.4119515 0 NA NA <NA> <NA>
................
................
................
256 chr1 2550001 2560000 -2.363378 1 2560000 -2.363378 T1 TNFRSF14
257 chr1 2560001 2570000 -0.796173 0 2570000 -0.796173 T1 TNFRSF14
................
................
................
305 chr1 3040001 3050000 0.0564608 0 NA NA NA NA
306 chr1 3050001 3060000 1.4822029 0 NA NA NA NA
307 chr1 3060001 3070000 1.7718186 0 3070000 1.7718186 T1 PRDM16
308 chr1 3070001 3080000 1.5650776 0 3080000 1.5650776 T1 PRDM16
................
................
................
Here is what I have done so far:
library(ggplot2)
library(ggpmisc)  # assumption: stat_peaks() and stat_valleys() come from ggpmisc

# Facet order: chr1 ... chr22, chrX, chrY
chr_levels <- paste0("chr", c(1:22, "X", "Y"))

# Note: Hscore must be discrete (a factor) for scale_colour_manual() to apply
p2 <- ggplot(BreastCanT1T2mer, aes(start, StoZ.x)) +
  geom_jitter(aes(color = Hscore), pch = 23, size = 0.2) +
  scale_colour_manual(values = c("gray70", "#525480", "#9F5370", "blue", "red"),
                      labels = c("NEUT", "LOSS", "GAIN", "DEL", "AMP")) +
  theme(legend.position = c(0.5, .06), legend.direction = "horizontal") +
  theme(legend.title = element_blank()) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
  guides(colour = guide_legend(override.aes = list(size = 1.5))) +
  xlab("Chromosomes") + ylab("Stoffers-Z Score") +
  facet_wrap(~factor(chr, chr_levels), nrow = 1, strip.position = "bottom") +
  theme(panel.spacing.x = unit(0.1, "lines")) +
  theme(axis.text.x = element_blank()) +
  theme(legend.key = element_rect(fill = "transparent", colour = NA))
p2

# Mark local peaks and valleys, then label them
p2 + stat_peaks(span = 7, size = 1, ignore_threshold = 0.95, color = "brown") +
  stat_peaks(geom = "text", size = 1.5, span = 7, ignore_threshold = 0.95, color = "red", angle = 90, hjust = -0.1, check_overlap = TRUE) +
  stat_valleys(span = 7, size = 1, ignore_threshold = 0.95, color = "skyblue") +
  stat_valleys(geom = "text", size = 1.5, span = 7, ignore_threshold = 0.95, color = "black", angle = 90, hjust = 1.1, check_overlap = TRUE)
What I expect is that each gene name appears only once, at a position where Gene is available (probably where StoZ is highest or lowest), and only where the StoZ value is less than -10 or greater than +10.
Improvement after @Elin's suggestion:
p2+geom_text(aes(label=Gene))
Here, each gene name should appear only once, and only where y is below -2 or above +2.
Answer 1
Score: 2
Here is what I have so far, although I believe the output could be better.
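library(ggrepel)
# as.character(Gene) keeps labels row-aligned; unique(Gene) is shorter than the data and ifelse() would recycle it
p2 + geom_text_repel(aes(label = ifelse(StoZ.x < -20, as.character(Gene), "")), size = 1.5, angle = 45) +
  geom_text_repel(aes(label = ifelse(StoZ.x > 20, as.character(Gene), "")), size = 1.5, angle = 45)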
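One possible way to get each gene name only once is to build a small label data frame first and pass it to the text layer through its data argument. The following is only a sketch under assumptions: the toy df stands in for BreastCanT1T2mer with made-up values, and threshold and gene_labels are hypothetical names that do not appear in the original code.

```r
library(ggplot2)
library(ggrepel)

# Toy stand-in for BreastCanT1T2mer (values are made up for illustration)
df <- data.frame(
  chr    = "chr1",
  start  = c(1, 10001, 20001, 30001, 40001, 50001),
  StoZ.x = c(0.5, -12.3, -25.1, 22.4, 21.0, 3.2),
  Gene   = c(NA, "TNFRSF14", "TNFRSF14", "PRDM16", "PRDM16", "PRDM16")
)

threshold <- 20  # hypothetical cutoff; use 2 or 10 as needed

# Keep only rows that have a gene name and an extreme score
cand <- df[!is.na(df$Gene) & abs(df$StoZ.x) > threshold, ]

# Within each gene, sort by decreasing |StoZ.x| and keep the first row
cand <- cand[order(cand$Gene, -abs(cand$StoZ.x)), ]
gene_labels <- cand[!duplicated(cand$Gene), ]

# The label layer uses gene_labels, so each gene is drawn exactly once
p <- ggplot(df, aes(start, StoZ.x)) +
  geom_point(size = 0.5) +
  geom_text_repel(data = gene_labels, aes(label = Gene), size = 1.5, angle = 45)
```

With the real data, the same layer could be added to p2 instead of p. Because gene_labels is a subset of the original rows, it keeps the chr column and so should land in the right facet.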
Now I'm wondering whether I can color the geom_text labels by the Tier column.
What I have tried so far hasn't given the desired result. Because I already call scale_color_manual in p2, I can't call it again here, so I tried scale_fill_manual instead, but that doesn't work.
p2 + geom_text_repel(aes(label = ifelse(StoZ.x < -10, unique(Gene), ""), hjust = 1, vjust = 0.5, size = 0.5, angle = 45), fill = as.factor(BreastCanT1T2mer$Tier)) +
scale_fill_manual(values = c("brown", "navy"))
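A likely reason the attempt above has no effect is that text layers are colored through the colour aesthetic, not fill, and that fill is passed outside aes(), so no scale is ever applied to it. One way around the already-used colour scale is the ggnewscale package, whose new_scale_colour() starts a fresh colour scale for the layers added after it. This is a sketch under assumptions: it needs ggnewscale installed, the toy df uses made-up values, and the tier levels T1 and T2 (only T1 is visible in the sample data) and their colors are guesses based on the code above.

```r
library(ggplot2)
library(ggrepel)
library(ggnewscale)  # assumption: an extra package is acceptable here

# Toy data (made-up values); T2 is an assumed second tier level
df <- data.frame(
  start  = c(1, 10001, 20001, 30001),
  StoZ.x = c(0.5, -25.1, 22.4, 3.2),
  Hscore = factor(c(0, 1, 2, 0)),
  Gene   = c(NA, "TNFRSF14", "PRDM16", NA),
  Tier   = c(NA, "T1", "T2", NA)
)

# Rows to label: gene present and score beyond a hypothetical cutoff of 20
gene_labels <- df[!is.na(df$Gene) & abs(df$StoZ.x) > 20, ]

# Named vector so each tier always gets the same color
tier_cols <- c(T1 = "brown", T2 = "navy")

p <- ggplot(df, aes(start, StoZ.x)) +
  geom_point(aes(colour = Hscore)) +
  scale_colour_manual(values = c("0" = "gray70", "1" = "#525480", "2" = "#9F5370")) +
  new_scale_colour() +  # layers below this line get their own colour scale
  geom_text_repel(data = gene_labels, aes(label = Gene, colour = Tier), size = 3) +
  scale_colour_manual(values = tier_cols, name = "Tier")
```

Constants such as size, angle, hjust and vjust belong outside aes(); only columns that should be mapped to a scale (here label and colour) go inside it.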