How can I build a histogram with factor intervals?
Question
I need to build a histogram from some factors, but those factors describe numeric intervals, for example 0-2000, 2000-4000, 4000-6000, 6000-8000, and 8000-10000. I know how frequently items fall into each interval. How would I do it?
I've tried turning the intervals into numbers, but didn't really get anywhere.
Answer 1
Score: 0
Your problem is how to convert factors that describe number ranges into something number-like, so that you can plot a histogram from them.
quux <- data.frame(x = factor(c("0-2000", "2000-4000", "4000-6000", "6000-8000", "8000-10000")))
quux
# x
# 1 0-2000
# 2 2000-4000
# 3 4000-6000
# 4 6000-8000
# 5 8000-10000
I think the easiest start is to come up with two values for each string, each of them a number, by splitting every level on its non-digit characters.
nums <- lapply(strsplit(levels(quux$x), "[^0-9]+"), as.numeric)
str(nums)
# List of 5
# $ : num [1:2] 0 2000
# $ : num [1:2] 2000 4000
# $ : num [1:2] 4000 6000
# $ : num [1:2] 6000 8000
# $ : num [1:2] 8000 10000
You can convert these pairs into whatever "number" you want each interval to represent. Examples:
### first of each pair
sapply(nums, `[[`, 1)
# [1] 0 2000 4000 6000 8000
### min, different from above if they are not always in order;
### this time showing addition of the 'na.rm=TRUE' in case
### there are non-numbers
sapply(nums, min, na.rm = TRUE)
# [1] 0 2000 4000 6000 8000
### average of each pair
sapply(nums, mean)
# [1] 1000 3000 5000 7000 9000
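Note that the values above are computed per factor level, not per row. If your data has one row per observation, with levels repeated and in arbitrary order, you can index the per-level values by the factor's integer codes. This is a minimal sketch; the sample rows and the `mid` column name are made up for illustration.

```r
# Hypothetical per-observation data: levels repeat and are not in order
quux <- data.frame(x = factor(c("4000-6000", "0-2000", "4000-6000", "2000-4000")))

# One pair of bounds per level, in the order given by levels()
nums <- lapply(strsplit(levels(quux$x), "[^0-9]+"), as.numeric)
mids <- sapply(nums, mean)

# as.integer() on a factor returns each row's level index,
# so this picks the midpoint belonging to each row's level
quux$mid <- mids[as.integer(quux$x)]
quux
```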
Whichever you choose, you can then place that value into whatever hist-plotting expression you're planning to use.
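As one possible way to finish, here is a sketch that assumes you already have a count for each interval. The `freq` column and its values are invented for illustration and are not from the question. It repeats each interval's midpoint by its count and passes the interval bounds to `hist()` as `breaks`, so each bar covers exactly one interval.

```r
# Hypothetical data: the 'freq' counts are made up for illustration
quux <- data.frame(
  x    = factor(c("0-2000", "2000-4000", "4000-6000", "6000-8000", "8000-10000")),
  freq = c(5, 12, 20, 9, 3)
)

# Parse the bounds row by row (as.character keeps the row order)
nums   <- lapply(strsplit(as.character(quux$x), "[^0-9]+"), as.numeric)
mids   <- sapply(nums, mean)
breaks <- sort(unique(unlist(nums)))

# Repeat each midpoint 'freq' times so hist() recovers the known counts
h <- hist(rep(mids, times = quux$freq), breaks = breaks,
          main = "Items per interval", xlab = "Value")
h$counts
```

If you only need the bars and not a numeric axis, `barplot(quux$freq, names.arg = quux$x)` draws the counts directly without any parsing.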