Is there an approach to finding the ASCII distance between two strings of 5 characters

Question

I am trying to find a way to calculate and print the ASCII distance between a string from user input

    Scanner scan = new Scanner(System.in);
    System.out.print("Please enter a string of 5 uppercase characters:");
    String userString = scan.nextLine();

and a randomly generated string

    int leftLimit = 65; // Upper-case 'A'
    int rightLimit = 90; // Upper-case 'Z'
    int stringLength = 5;
    Random random = new Random();
    String randString = random.ints(leftLimit, rightLimit + 1)
        // Note: every value in 65..90 already passes this filter, so it has no effect here
        .filter(i -> (i <= 57 || i >= 65) && (i <= 90 || i >= 97))
        .limit(stringLength)
        .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
        .toString();

Is there a way to calculate the distance without having to separate each individual character from the two strings, compare them, and add the results back together?

Answer 1

Score: 2

Use Edit distance (Levenshtein distance)

You can

you can also check
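For reference, a minimal sketch of Levenshtein distance in Java follows. The class and method names are hypothetical, chosen only for illustration. Note that edit distance counts insertions, deletions, and substitutions, so it measures something different from a sum of character-code differences.

```java
class EditDistance {
    // Classic dynamic-programming Levenshtein distance, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i chars of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("HELLO", "HALLO")); // prints 1
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
    }
}
```

Because any substitution costs 1 regardless of how far apart the two letters are, two 5-letter strings can never be more than 5 apart under this measure.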

Answer 2

Score: 0

Create a 2D array and fill it with the distances; you can then index directly into the array to pull out the distance between any two characters.
The total is then one expression that sums a set of array accesses.
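A rough sketch of that lookup-table idea in Java might look like the following. The names DistanceTable, TABLE, and distance are hypothetical, and the code assumes both strings hold exactly 5 characters in the 7-bit ASCII range and that "distance" means the absolute difference of the character codes.

```java
class DistanceTable {
    // TABLE[x][y] holds |x - y| for every pair of 7-bit ASCII codes.
    static final int[][] TABLE = new int[128][128];

    static {
        for (int x = 0; x < 128; x++) {
            for (int y = 0; y < 128; y++) {
                TABLE[x][y] = Math.abs(x - y);
            }
        }
    }

    // One expression that sums five array accesses (assumes 5-character ASCII strings).
    static int distance(String a, String b) {
        return TABLE[a.charAt(0)][b.charAt(0)]
             + TABLE[a.charAt(1)][b.charAt(1)]
             + TABLE[a.charAt(2)][b.charAt(2)]
             + TABLE[a.charAt(3)][b.charAt(3)]
             + TABLE[a.charAt(4)][b.charAt(4)];
    }

    public static void main(String[] args) {
        System.out.println(distance("ABCDE", "HELLO")); // prints 37
    }
}
```

The table still has to be filled by comparing character codes once, so this mostly moves the per-character work into setup rather than removing it.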

Answer 3

Score: 0

Streams are, well, as the name says, streams. They don't work very well unless you can define an operation strictly on the basis of one input: One element from a stream, without knowing its index or referring to the entire collection.

Here, that is a problem; after all, to operate on, say, the 'H' in your input, you need the matching character from your random code.

I'm not sure why you find 'separate each individual character, compare them, and add them back together' so distasteful. Isn't that a pretty clean mapping from the problem description to instructions for your computer to run?

The alternative is more convoluted: You could attempt to create a mixed object that contains both the letter and its index, stream over this, and use the index to look up the character in the second string. Alternatively, you could attempt to create a mixed object containing both characters (so, for inputs ABCDE and HELLO, an object containing both A and H), but you'd be writing far more code to get that set up than with the simple, no-streams way.

So, let's start with the simple way:

int total = 0; // running sum of the per-character differences
for (int i = 0; i < stringLength; i++) {
    char a = inString.charAt(i);
    char b = randomString.charAt(i);
    total += difference(a, b);
}

You'd have to write the difference method yourself, but it'd be a very simple one-liner.
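As one possible reading of that one-liner, here is a self-contained sketch that takes the distance between two characters to be the absolute difference of their codes. That definition, and the class name AsciiDistance, are assumptions for illustration; the question does not pin down what "distance" means.

```java
class AsciiDistance {
    // Assumed definition: absolute difference of the two character codes.
    static int difference(char a, char b) {
        return Math.abs(a - b);
    }

    // Sums the per-character differences of two equal-length strings.
    static int distance(String inString, String randomString) {
        if (inString.length() != randomString.length()) {
            throw new IllegalArgumentException("strings must have the same length");
        }
        int total = 0;
        for (int i = 0; i < inString.length(); i++) {
            total += difference(inString.charAt(i), randomString.charAt(i));
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(distance("HELLO", "ABCDE")); // prints 37
    }
}
```

For HELLO against ABCDE the per-character differences are 7, 3, 9, 8, and 10, which is where the 37 comes from.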

Trying to take two collections of some sort and create from them a single stream where each element is the pair of matching elements from each collection (so, a stream of ["HA", "EB", "LC", "LD", "OE"]) is generally called 'zipping' (no relation to the popular file compression algorithm and product), and Java doesn't really support it (yet?). There are some third-party libraries that can do it, but given that the above is so simple, I don't think zipping is what you're looking for here.

If you absolutely must, I guess it'd look something like:

// a stream of 0,1,2,3,4
IntStream.range(0, stringLength)
    // map 0 to "HA", 1 to "EB", etcetera
    .mapToObj(idx -> "" + inString.charAt(idx) + randomString.charAt(idx))
    // map "HA" to the difference score
    .mapToInt(x -> difference(x))
    // and sum it.
    .sum();

public int difference(String a) {
   // exercise for the reader
}
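One way the exercise could be filled in, again assuming the distance is the absolute difference of the two character codes in each pair, is sketched here. The class name StreamAsciiDistance and the static helper layout are illustrative choices, not part of the original answer.

```java
import java.util.stream.IntStream;

class StreamAsciiDistance {
    // Takes a two-character pair such as "HA" and returns |'H' - 'A'|.
    static int difference(String pair) {
        return Math.abs(pair.charAt(0) - pair.charAt(1));
    }

    // Assumes both strings have the same length.
    static int distance(String inString, String randomString) {
        return IntStream.range(0, inString.length())
            // map 0 to "HA", 1 to "EB", and so on
            .mapToObj(idx -> "" + inString.charAt(idx) + randomString.charAt(idx))
            // map each pair to its difference score
            .mapToInt(StreamAsciiDistance::difference)
            .sum();
    }

    public static void main(String[] args) {
        System.out.println(distance("HELLO", "ABCDE")); // prints 37
    }
}
```

Building an intermediate String per index is wasteful; mapping each index straight to Math.abs(inString.charAt(idx) - randomString.charAt(idx)) with IntStream.map would avoid the pairs entirely.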

Answer 4

Score: -2

Here is my code for this (ASCII distance) in MATLAB

function z = asciidistance(input0)

if nargin ~= 1
    error('please enter a string');
end

size0 = size(input0);

if size0(1) ~= 1
    error('please enter a string');
end

length0 = size0(2);

rng('shuffle');

a = 32;
b = 127;

% Generate a 1-by-length0 row so the subtraction below is element-wise;
% a column vector here would expand input0 - x into a length0-by-length0 matrix.
string0 = (b-a).*rand(1,length0) + a;

x = char(floor(string0));

z = (input0 - x);

ascii0 = sum(abs(z),'all');
ascii1 = abs(sum(z,'all'));

disp(ascii0);
disp(ascii1);

disp(ascii0/ascii1/length0);

end

This script also differentiates between the absolute ASCII distance on a per-character basis vs that on a per-string basis, thus resulting in two integers returned for the ASCII distance.

I have also included the ratio of these two values, divided by the string length. Its value approaches the inverse of the length of the strings being compared, and it approximates the entropy, E, of every random string generation event when run.

After standard error checking, the script first finds the length of the input string. The rng function seeds the random number generator. The a and b variables bound the printable range of the ASCII table, which ends at 126, inclusive. 127 is used as the upper bound because the values are floored afterwards, so the next line of code can generate a vector of random codes of the input length. The following line of code turns those codes into the printable characters provided by the ASCII table. The line after that subtracts the two strings element-wise and stores the result. The next two lines of code sum up the ASCII distances in the two ways mentioned in the first paragraph. Finally, the values are printed out, along with the entropy, E, of the random string generation event.
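Since the question is about Java, the two sums this answer distinguishes can be sketched in Java as well. The class and method names are hypothetical, and both methods assume strings of equal length.

```java
class TwoAsciiSums {
    // Per-character basis: sum of |a[i] - b[i]|.
    static int perCharacter(String a, String b) {
        int total = 0;
        for (int i = 0; i < a.length(); i++) {
            total += Math.abs(a.charAt(i) - b.charAt(i));
        }
        return total;
    }

    // Per-string basis: |sum of (a[i] - b[i])|, so opposite-signed differences cancel.
    static int perString(String a, String b) {
        int signed = 0;
        for (int i = 0; i < a.length(); i++) {
            signed += a.charAt(i) - b.charAt(i);
        }
        return Math.abs(signed);
    }

    public static void main(String[] args) {
        System.out.println(perCharacter("AZ", "ZA")); // prints 50
        System.out.println(perString("AZ", "ZA"));    // prints 0
    }
}
```

The AZ and ZA pair shows why the two values can differ: the signed differences are -25 and +25, which cancel in the per-string sum. It also shows that the per-string sum can be zero, in which case the ratio printed by the MATLAB script is not a finite number.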

huangapple
  • Posted on 2020-09-28 02:55:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64092167.html