I would like to find out how to use sed to remove ONLY the space AND the bizarre characters from the output of the following echo command:

echo -e "A \xd8\xa8"

So I tried:

echo -e "A \xd8\xa8" | sed -r "s/[^[:print:]]//g"

but that doesn't remove anything,

echo -e "A \xd8\xa8" | sed -r "s/[^[:alnum:]]//g"

only removes the space

echo -e "A \xd8\xa8" | sed -r "s/[^[:alpha:]]//g"

(same result),

echo -e "A \xd8\xa8" | sed -r "s/[^[:ascii:]]//g"

returns an error (invalid character class name), and

echo -e "A \xd8\xa8" | sed -r "s/[^\w ]//g"

removes everything...

Expected result: "A"

Any ideas?

Thanks!

3 solutions

#1


If you don't want sed to treat e.g. Arabic characters as alphabetic (which they are), you need to run it under a locale that doesn't classify them that way.

The "C" locale only covers the basic character set, so only [A-Za-z] count as alphabetic. I am assuming you want to delete everything that is not a letter in that range (your question is vague about what you really want):

echo -e "A \xd8\xa8" | LC_CTYPE=C sed -r "s/[^[:alpha:]]//g" | hexdump -C

Output:

00000000  41 0a
00000002
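
One caveat with the command above: if LC_ALL is set in your environment, it takes precedence over LC_CTYPE, so the override may have no effect. Below is a minimal sketch of the same idea using LC_ALL=C instead. The function name keep_ascii_letters is made up for illustration, and printf with octal escapes is used in place of echo -e only because it behaves the same across shells (\330\250 are the same two bytes as \xd8\xa8).

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the question):
# delete every byte that is not an ASCII letter. Under LC_ALL=C sed works
# byte by byte and [:alpha:] covers only A-Z and a-z. sed keeps the
# trailing newline because it operates on the contents of each line.
keep_ascii_letters() {
    LC_ALL=C sed 's/[^[:alpha:]]//g'
}

# \330\250 is the octal spelling of the bytes 0xd8 0xa8 from the question.
result=$(printf 'A \330\250\n' | keep_ascii_letters)
printf '%s\n' "$result"    # prints: A
```

If digits should survive as well, [:alnum:] can be swapped in for [:alpha:] under the same locale setting.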
