How do you remove double quotes from the image of a JavaCC token?
Question
In JavaCC, I'm accepting strings that match this token definition:
< STRING : "\"" ("\\" ~[] | ~["\"","\\"] )* "\"" >
So the token's image ends up being the string wrapped in another set of double quotes.
For example, I'll input: "This is a sentence."
And the value stored in a String variable will be: ""This is a sentence.""
Is there a way in Java to remove the extra set of double quotes so that it only prints: "This is a sentence."?
Answer 1
Score: 2
If the input matched by your token is "Hello", then the value of the image field of the token will be a 7-character string whose first and last characters are double-quote characters. They're not really extra; they were there in the input. Say you write
void foo() : {
Token t ; }
{
t = <STRING>
{ System.out.println( t.image ) ; }
}
That'll print 7 characters and then a newline.
Now if you don't want those characters, well, @Bryan's answer will do it.
void foo() : {
Token t ; }
{
t = <STRING>
{ { String nakedImage = t.image.substring(1,t.image.length()-1) ;
System.out.println( nakedImage ) ; } }
}
It should be noted that no quotes are actually removed. String objects in Java are immutable, meaning they can't be changed. What really happens is that a new String object gets created and a reference to it is assigned to the nakedImage variable. The String object that t.image refers to remains the same.
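To make the immutability point concrete, here is a minimal standalone sketch outside of any JavaCC grammar. The class name NakedImageDemo and the helper stripQuotes are hypothetical names chosen for illustration, and the string literal stands in for what t.image would hold for the input "Hello".

```java
final class NakedImageDemo {
    // Hypothetical helper: returns a NEW string without the first and last
    // character. Assumes the image has at least two characters, which holds
    // for anything matched by the STRING token (opening and closing quote).
    static String stripQuotes(String image) {
        return image.substring(1, image.length() - 1);
    }

    public static void main(String[] args) {
        String image = "\"Hello\"";          // stands in for t.image
        String naked = stripQuotes(image);   // a different String object

        System.out.println(image.length()); // original still has its quotes
        System.out.println(naked.length()); // new string is two characters shorter
        System.out.println(image);          // the original is unchanged
    }
}
```

The original string keeps its length of 7 after the call, while the new string has length 5.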
Now you still have the problem of dealing with the backslashes. If the input is "Hello\tWorld", then t.image will be 14 characters long and nakedImage will be 12 characters long. What I do at this point is run the string through a function that builds a new string that has single characters where nakedImage has escape sequences. So the result of that function on this example would be 11 characters long.
void foo() : {
Token t ; }
{
t = <STRING>
{ { String nakedImage = t.image.substring(1,t.image.length()-1) ;
String unescapedImage = unescape( nakedImage ) ;
System.out.println( unescapedImage ) ; } }
}
Here's such a function, based on one I wrote for a Java compiler.
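private static String unescape( String str ) {
    // StringBuilder suffices here; no synchronization is needed.
    StringBuilder result = new StringBuilder() ;
    for( int i=0, len = str.length() ; i<len ; ) {
        char ch = str.charAt(i) ;
        // Set ch to the character to append and advance i past what was consumed.
        if( ch == '\\' ) {
            // Assumes str came from the STRING token, which guarantees that
            // a character follows every backslash.
            ch = str.charAt(i+1) ;
            switch( ch ) {
                case 'b' : ch = '\b' ; i += 2 ; break ;
                case 't' : ch = '\t' ; i += 2 ; break ;
                case 'n' : ch = '\n' ; i += 2 ; break ;
                case 'f' : ch = '\f' ; i += 2 ; break ;
                case 'r' : ch = '\r' ; i += 2 ; break ;
                case '"' : case '\'' : case '\\' : i += 2 ; break ;
                default :
                    // TODO Deal with errors. For now the escaped character is kept;
                    // i must still advance, or the loop never terminates.
                    i += 2 ; break ; } }
        else {
            i += 1 ; }
        result.append( ch ) ; }
    return result.toString() ;
}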
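As a possible alternative to a hand-written function, Java 15 and later provide String.translateEscapes(), which translates Java-style escape sequences. The sketch below assumes that the escapes in your language match Java's; if they don't, a custom function like the one above is the safer choice. The class name TranslateEscapesDemo and the method unquoteAndUnescape are hypothetical, and the literal stands in for t.image.

```java
final class TranslateEscapesDemo {
    // Hypothetical helper: strips the surrounding quotes, then translates
    // Java-style escape sequences (requires Java 15 or later).
    // translateEscapes() throws IllegalArgumentException on an invalid escape.
    static String unquoteAndUnescape(String image) {
        String naked = image.substring(1, image.length() - 1);
        return naked.translateEscapes();
    }

    public static void main(String[] args) {
        // The 14 characters "Hello\tWorld", with a literal backslash and 't'.
        String image = "\"Hello\\tWorld\"";
        String naked = image.substring(1, image.length() - 1);
        String unescaped = unquoteAndUnescape(image);

        // Prints: 14 12 11
        System.out.println(image.length() + " " + naked.length() + " " + unescaped.length());
    }
}
```

The three lengths match the counts worked out earlier for this input: 14 for the image, 12 without the quotes, and 11 once the escape sequence becomes a single tab character.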
Answer 2
Score: 0
str = str.substring(1, str.length() - 1);
For an alternative approach in JavaCC, see:
https://stackoverflow.com/questions/11878392/parsing-strings-with-javacc