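The result of part-of-speech (POS) tagging depends on capitalization. For example, China is tagged NNP (proper noun), whereas china is tagged NN (common noun).
Run the following program: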
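# Requires the NLTK tokenizer and tagger data (installable via nltk.download).
from nltk import pos_tag
from nltk import word_tokenize


def get_word_label(sent):
    # Return a list of (token, POS tag) pairs; an empty string yields an empty list.
    if len(sent) < 1:
        return []
    word_label = pos_tag(word_tokenize(sent))
    return word_label


if __name__ == "__main__":
    # The two sentences differ only in capitalization.
    s = "I love China"
    word_pos = get_word_label(s)
    print(word_pos)
    s = "i love china"
    word_pos = get_word_label(s)
    print(word_pos)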
The output is:

[Image: image.png, screenshot of the program output]
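
To see which tokens are affected by capitalization without comparing the two printed lists by eye, the comparison can be sketched as a small helper. This is only an illustrative sketch: `case_sensitive_tags` and `toy_tagger` are hypothetical names that do not appear in the original program, and `toy_tagger` is a crude stand-in rule (title-cased token gives NNP, anything else gives NN) so that the block runs without NLTK data. It does not reproduce NLTK's actual tagging behavior.

```python
from typing import Callable, List, Tuple

Tagged = List[Tuple[str, str]]


def case_sensitive_tags(
    tokens: List[str],
    tagger: Callable[[List[str]], Tagged],
) -> List[Tuple[str, str, str]]:
    """Return (token, tag as written, tag when lowercased) for every
    token whose tag changes once the whole sentence is lowercased."""
    original = tagger(tokens)
    lowered = tagger([tok.lower() for tok in tokens])
    return [
        (tok, tag, low_tag)
        for (tok, tag), (_, low_tag) in zip(original, lowered)
        if tag != low_tag
    ]


def toy_tagger(tokens: List[str]) -> Tagged:
    # Hypothetical stand-in, NOT NLTK: title-cased tokens are tagged NNP,
    # everything else NN. Used only to keep the sketch self-contained.
    return [(tok, "NNP" if tok.istitle() else "NN") for tok in tokens]


if __name__ == "__main__":
    print(case_sensitive_tags(["love", "China"], toy_tagger))
    # prints [('China', 'NNP', 'NN')]
```

With NLTK installed and its data downloaded, the same helper should accept the real tagger in place of the stand-in, for example `case_sensitive_tags(word_tokenize("I love China"), pos_tag)`. Passing the tagger as a parameter keeps the comparison logic independent of any particular tagging model.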