I have a list of words sorted using the g_ascii_strcasecmp function. I need to process this list in Java. What is the equivalent sorting function in Java? To implement binary search I need a correct comparison function. So far I have the function below, but it does not always produce the correct result.

public int compareStrings(String str) {
    Collator collator = Collator.getInstance();//TODO: implement locale?
    return collator.compare(this.wordString, str);
}

UPDATE. List example: "T, t, T'ai Chi Ch'uan, t'other, T-, T-bone, T-bone steak, T-junction, tabasco, Tabassaran, tabby".

2 Solutions

#1


I wouldn't use Collator, having read its Javadoc, because you have no control over how the strings get compared. You can pick the locale, but how that locale tells Collator how to compare strings is out of your hands.

If you know that the characters in your strings are all ASCII characters, then I'd just use the String.compareTo() method, which sorts lexicographically based on Unicode character value. For ASCII characters the Unicode value is the same as the ASCII value, so sorting lexicographically on Unicode values is the same as sorting lexicographically on ASCII values, which appears to be what g_ascii_strcasecmp does. And if you need case-insensitivity, you could use String.compareToIgnoreCase().
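For an all-ASCII word list, the suggestion above could be wired into a binary search roughly as follows. This is only a sketch: the class name CaseInsensitiveLookup and the method find are made up for illustration, and it assumes the list really is sorted case-insensitively and contains only ASCII characters. String.CASE_INSENSITIVE_ORDER is the standard comparator that orders strings the same way as compareToIgnoreCase().

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original question or answer.
class CaseInsensitiveLookup {
    // Returns the index of key in sortedWords, or a negative value if it is absent.
    // Assumes sortedWords is already sorted case-insensitively and is ASCII-only.
    static int find(List<String> sortedWords, String key) {
        return Collections.binarySearch(sortedWords, key, String.CASE_INSENSITIVE_ORDER);
    }
}
```

With the sample list from the question, find(words, "T-bone") returns 5. Entries that differ only in case, such as "T" and "t", compare as equal, so a search for either may land on whichever of the two the search reaches first.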


As I noted in the comment, I think you'll need to write your own comparison function. You'll need to loop through the characters in the string, skipping over the ones that aren't in the ASCII range. So something like this, which is a simple, stupid implementation and needs to be beefed up to cover the corner cases I imagine g_ascii_strcasecmp handles:

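import java.util.ArrayList;
import java.util.List;

// These members belong in the class that holds wordString.
public int compareStrings(String str) {
    List<Character> myAsciiChars = onlyAsciiChars(this.wordString);
    List<Character> theirAsciiChars = onlyAsciiChars(str);

    // Compare character by character over the common prefix first;
    // comparing lengths first would not give a lexicographic order.
    // Note: this comparison is still case-sensitive as written.
    int commonLength = Math.min(myAsciiChars.size(), theirAsciiChars.size());
    for (int i = 0; i < commonLength; i++) {
        char mine = myAsciiChars.get(i);
        char theirs = theirAsciiChars.get(i);
        if (mine != theirs) {
            return mine < theirs ? -1 : 1;
        }
    }

    // Equal over the common prefix: the shorter string sorts first.
    return Integer.compare(myAsciiChars.size(), theirAsciiChars.size());
}

private static final char MAX_ASCII_VALUE = 127; // (Or 255 if using extended ASCII)

// Returns the characters of s that fall within the ASCII range, in order.
private List<Character> onlyAsciiChars(String s) {
    List<Character> asciiChars = new ArrayList<>();
    for (char c : s.toCharArray()) {
        if (c <= MAX_ASCII_VALUE) {
            asciiChars.add(c);
        }
    }
    return asciiChars;
}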
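
One way the hand-written comparison could be beefed up is sketched below. It is not the answer's code: the class name AsciiCaseInsensitive and the helper asciiToLower are invented for illustration. It assumes, based on my reading of the GLib documentation, that g_ascii_strcasecmp folds only the letters A-Z and treats everything else as a non-letter, so non-ASCII characters are left unfolded here rather than skipped. GLib compares bytes while Java compares UTF-16 code units, so the relative order of non-ASCII characters may still differ.

```java
import java.util.Comparator;

// Hypothetical comparator: case-insensitive for ASCII letters only.
final class AsciiCaseInsensitive implements Comparator<String> {
    // Lowercases 'A'..'Z' and returns every other char unchanged.
    static char asciiToLower(char c) {
        return (c >= 'A' && c <= 'Z') ? (char) (c + ('a' - 'A')) : c;
    }

    @Override
    public int compare(String a, String b) {
        int commonLength = Math.min(a.length(), b.length());
        for (int i = 0; i < commonLength; i++) {
            char ca = asciiToLower(a.charAt(i));
            char cb = asciiToLower(b.charAt(i));
            if (ca != cb) {
                return ca - cb; // first differing character decides
            }
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return a.length() - b.length();
    }
}
```

An instance of this comparator can be passed to Collections.binarySearch or List.sort in place of a Collator.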
