Python word frequency counting

"""Count words."""

def count_words(s, n):
    """Return the n most frequently occurring words in s."""
    # Count the number of occurrences of each word in s.
    # split() with no argument splits on any run of whitespace
    # (including newlines) and never yields empty strings.
    word_list = s.lower().split()
    count_dict = {}
    for word in word_list:
        count_dict[word] = count_dict.get(word, 0) + 1

    # Sort the occurrences in descending order (alphabetically in case of ties).
    # Negating the count sorts it descending while the word stays ascending.
    count_list = sorted(count_dict.items(), key=lambda item: (-item[1], item[0]))

    # Return the top n most frequent words as (word, count) pairs.
    return count_list[:n]

def test_run():
    """Test count_words() with some inputs."""
    print(count_words("cat bat mat cat bat cat", 3))
    print(count_words("betty bought a bit of butter but the butter was bitter", 3))

if __name__ == '__main__':
    test_run()
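
The same counting step can also be done with the standard library's collections.Counter. The version below is only a sketch of that alternative, not part of the original exercise; the function name count_words_counter is made up for illustration. It still calls sorted() with an explicit key, because Counter.most_common() does not order ties alphabetically.

```python
from collections import Counter


def count_words_counter(s, n):
    """Return the n most frequent words in s as (word, count) pairs.

    Hypothetical variant of count_words() built on collections.Counter.
    """
    # Counter tallies each whitespace-separated, lowercased word.
    counts = Counter(s.lower().split())
    # Descending by count, then ascending by word to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


print(count_words_counter("cat bat mat cat bat cat", 3))
# [('cat', 3), ('bat', 2), ('mat', 1)]
print(count_words_counter("betty bought a bit of butter but the butter was bitter", 3))
# [('butter', 2), ('a', 1), ('betty', 1)]
```

In the second example every word except "butter" occurs once, so the alphabetical tie-break is what selects "a" and "betty".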
