问题
Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.
According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."
Note: If there are several possible values for h, the maximum one is taken as the h-index.
例子
For example, given citations = [3, 0, 6, 1, 5], which means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, his h-index is 3.
分析
题目有点绕,通俗点理解就是:有一个数组A,大小为n。现在要找一个数h,使得A有至少h个元素不小于h。
现假设数组为[3, 0, 6, 1, 5]。
方法一:将数组从小到大排序,然后从大到小(n-->0)枚举h值,再计算数组中不小于h的元素个数。数组排序后为[0, 1, 3, 5, 6]。假设h=5,不小于5的元素个数为2,2<5,不满足要求;h=4,不小于4的元素个数为2,仍不满足要求;h=3,不小于3的元素个数为3,3>=3,满足要求,所以h=3。在排序数组中计算不小于某个数的元素个数可以使用stl的lower_bound函数,时间复杂度为O(logn)。再加上之前的排序,总的复杂度为O(nlogn)。
方法二:将数组从大到小排序,然后找到一个最大的数组下标,该下标对应的元素大于该下标。数组排序后为[6, 5, 3, 1, 0]。满足条件的下标为3,所以h=3。因为数组从大到小排序,假设下标为i,i其实就等于数组中不小于A[i]的元素的个数。当A[i]>=i时,则必然有A存在至少i个元素不小于i。由于需要排序,时间复杂度为O(nlogn)。
方法三:其实我们不需要将数组排序。考虑有n+1个bucket,编号为0到n。遍历数组,如果数组元素a不小于n,bucket[n]加1;如果数组元素小于n,bucket[a]加一。接着定义一个total变量,从后往前累加bucket,如果total大于等于bucket下标i,则h=i。实际上total的含义的就是数组中不小于第i个元素的个数。
数组为[3, 0, 6, 1, 5],bucket初始化为[0, 0, 0, 0, 0],计数之后为[1, 1, 0, 1, 0, 2]。当i=5时,total=2, 2<5,不满足要求;i=4,total=2+0=2<4,不满足要求;i=3,total=2+1=3>=3,满足要求。因此h=3。由于该方法不需要排序,只变量了两次数组,因此时间复杂度为O(n)。
要点
bucket sort. 大于n的元素被压缩到第n个bucket里。
时间复杂度
O(n)
空间复杂度
O(n)
代码
方法一
class Solution {
public:
int hIndex(vector<int>& citations) {
sort(citations.begin(), citations.end());
for (int i = citations.size(); i >= 0; i--) {
if (lower_bound(citations.begin(), citations.end(), i) - citations.begin() <= citations.size() - i)
return i;
}
return 0;
}
};
方法二
class Solution {
public:
int hIndex(vector<int>& citations) {
sort(citations.begin(), citations.end(), greater<int>());
for (int i = citations.size(); i >= 1; i--) {
if (citations[i - 1] >= i)
return i;
}
return 0;
}
};
方法三
class Solution {
public:
int hIndex(vector<int>& citations) {
int n = citations.size(), count = 0;
vector<int> bucket(n + 1, 0);
for (int c : citations)
bucket[min(c, n)]++;
for (int i = n; i >= 0; i--) {
count += bucket[i];
if (count >= i) return i;
}
return 0;
}
};