im2txt: Paper Walkthrough and Keras Implementation

Processing the data
- Load vgg_feats.mat
- Load

Data and introduction

Flickr8k 2013

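Data processing
Each line of Flickr8k.token.txt has the format: 000268201_693b08cb0e.jpg#0 A child in a pink dress is climbing up a set of stairs in an entry way .
The part before `#` is the image file name, the number after it is the caption index, and the rest of the line is the caption. We parse each line and store the result in a dict() named img_to_caps, keyed by the image file name without its extension:
img_to_caps['000268201_693b08cb0e'] = [caption1, caption2, caption3, caption4, caption5]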

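    # Assumes `from os import path` at module level; inside the method,
    # `path` refers to the file path argument, not the module.
    def __init__(self,path=path.join('Flickr8k_text','Flickr8k.token.sample.txt'),n_vocab=100,max_seq_len=16):
        '''
            The format of every line in Flickr8k.token.txt:
            000268201_693b08cb0e.jpg#0  A child in a pink dress is climbing up a set of stairs in an entry way .
            Each line is parsed and stored in img_to_caps, whose format is
            img_to_caps['000268201_693b08cb0e'] = [caption1,caption2,caption3,caption4,caption5]
        '''
        self.img_to_caps = dict()

        with open(path) as f:
            for line in f:
                # Split on any whitespace so tab- and space-separated files both parse.
                tokens = line.split()
                if not tokens:
                    continue  # skip blank lines
                img_fname, cap_idx = tokens[0].split('#')
                # Drop the '.jpg' extension so the key matches the documented format.
                img_id = img_fname.rsplit('.', 1)[0]
                caption = ' '.join(tokens[1:])
                if img_id not in self.img_to_caps:
                    self.img_to_caps[img_id] = []
                self.img_to_caps[img_id].append(caption)
        # list() yields an indexable list instead of a dict view (Python 3).
        self.img_fnames = list(self.img_to_caps.keys())
        print(self.img_fnames)
        print(self.img_to_caps)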
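
To see what the parsed structure looks like without the full dataset, the same parsing idea can be sketched as a standalone function that works on in-memory lines. This is only an illustration under stated assumptions: the function name `parse_token_lines` is hypothetical and not part of the original code, and the second sample caption is made up for the demo.

```python
import os
from collections import defaultdict


def parse_token_lines(lines):
    """Group captions by image id (file name without extension).

    Hypothetical helper for illustration; not part of the original code.
    """
    img_to_caps = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split once on the first whitespace run: "<file>#<idx>" then the caption.
        head, caption = line.split(None, 1)
        img_fname, _cap_idx = head.rsplit('#', 1)
        img_id = os.path.splitext(img_fname)[0]  # drop ".jpg"
        img_to_caps[img_id].append(caption.strip())
    return dict(img_to_caps)


sample = [
    # First line taken from the format example in the text.
    "000268201_693b08cb0e.jpg#0\tA child in a pink dress is climbing up a set of stairs in an entry way .",
    # Made-up caption, only here to show that captions are grouped per image.
    "000268201_693b08cb0e.jpg#1\tA made-up second caption .",
]
caps = parse_token_lines(sample)
print(caps)
```

Both sample lines end up under the single key `000268201_693b08cb0e`, each contributing one entry to its caption list.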

Last edited: 2017.12.11 13:50:42
