Which data items do the three data files under annotations contain?
And which data items do the images/metadata files contain?

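What files does the training dataset contain?

1. The dataset consists of two parts: annotations and images/metadata.

annotations

Contains three JSONL files: train.jsonl, val.jsonl, and test.jsonl (the training, validation, and test sets).

Contents of `train.jsonl`

Checking the data items

The following code prints the fields of every training record so we can check whether they are all the same: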

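import json

# Read the JSONL file: one JSON object per line.
records = []
with open("train-temp.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        if line.strip():  # skip blank lines
            records.append(json.loads(line))

# Print the keys of every record so they can be compared.
for record in records:
    print(record.keys())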


The output shows that every record contains the same attributes.
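Comparing printed key lists by eye gets tedious for a large file. A minimal sketch of an automatic check is shown below; the helper name `distinct_key_sets` and the two sample records are made up for illustration and are not real VCR data.

```python
import json


def distinct_key_sets(lines):
    """Return the set of distinct key sets found across JSONL lines."""
    key_sets = set()
    for line in lines:
        if line.strip():  # ignore blank lines
            key_sets.add(frozenset(json.loads(line).keys()))
    return key_sets


# Hypothetical records standing in for lines of train.jsonl.
sample_lines = [
    '{"movie": "a", "objects": ["person"]}',
    '{"objects": ["dog"], "movie": "b"}',
]

# Exactly one distinct key set means every record has the same fields.
print(len(distinct_key_sets(sample_lines)) == 1)
```

To run it on the real file, pass the open file object (for example `open("train-temp.jsonl", encoding="utf-8")`) instead of `sample_lines`.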

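Meaning of each attribute

`movie` attribute

The official documentation says: "movie: the movie that the image comes from. For MovieClips, we had to guess; for LSMDC, it’s given by the dataset." I did not fully understand what "guess" and "given" mean here.

The original dataset draws on two movie sources, MovieClips and LSMDC. The value written in the jsonl file corresponds to the dataset files as follows: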

# LSMDC
train.jsonl: {"movie": "xxxxxx"}
In the dataset: lsmdc_xxxxxx\xxxxxx_00.03.47.191-00.03.52.313@0.jpg
In the dataset: lsmdc_xxxxxx\xxxxxx_00.03.47.191-00.03.52.313@0.json
# MovieClips
train.jsonl: {"movie": "xxxxxx"}
In the dataset: movieclips_xxxxxx\yyyyyy@0.jpg
In the dataset: movieclips_xxxxxx\yyyyyy@0.json
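Given the naming pattern above, the source (LSMDC or MovieClips) can be read off the directory name. The sketch below assumes the prefix always ends at the first underscore; `split_source` is a hypothetical helper, not part of any official tooling.

```python
def split_source(dirname):
    """Split a directory name such as 'lsmdc_xxxxxx' into (source, movie)."""
    # Assumption: everything before the first underscore is the source prefix.
    source, _, movie = dirname.partition("_")
    return source, movie


print(split_source("lsmdc_3038_ITS_COMPLICATED"))
print(split_source("movieclips_xxxxxx"))
```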

`objects` attribute (important)

a list of objects detected

For example:

'objects':['person', 'person', 'bottle']

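`interesting_scores` attribute

A list containing a single number that indicates how interesting the example is.

1 means very interesting; 0 means okay; -1 means boring.

For example:

{'interesting_scores': [0]}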

`answer_likelihood` attribute

How likely the turker said that their answer was

There are three possible values: unlikely, possible, or likely.

'answer_likelihood': 'possible'

`img_fn` attribute (important)

the filename of the image, within the vcr1images directory

'img_fn': 'lsmdc_3038_ITS_COMPLICATED/[email protected]'

`img_id` attribute

A shorter ID of this image

It will look something like {SPLIT}-{NUMBER}

'img_id': 'train-1'

`metadata_fn` attribute (important)

The filename of the JSON metadata file that corresponds to the image, within the vcr1images directory

'metadata_fn': 'lsmdc_3038_ITS_COMPLICATED/[email protected]'
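Since both `img_fn` and `metadata_fn` are relative to the vcr1images directory, the full paths are just that directory joined with each value. A small sketch follows; the record's filenames are invented placeholders, and `resolve_paths` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def resolve_paths(image_root, record):
    """Return (image path, metadata path) for one annotation record."""
    root = PurePosixPath(image_root)
    return root / record["img_fn"], root / record["metadata_fn"]


# Placeholder filenames, not taken from the real dataset.
record = {
    "img_fn": "lsmdc_example/frame@0.jpg",
    "metadata_fn": "lsmdc_example/frame@0.json",
}
img_path, meta_path = resolve_paths("vcr1images", record)
print(img_path)
print(meta_path)
```

`PurePosixPath` only builds the path strings; swap in `pathlib.Path` when the files actually need to be opened.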

`{question/answer/rationale}_orig` attribute

The original question/answer/rationale text

Written by human annotators

'answer_orig': '2 has spinach in her teeth.'
'question_orig': 'Why is 1 smiling at 2?'
'rationale_orig': '1 is smiling and appears to be looking at the mouth of 2'

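`question` attribute (important)

The tokenized version of the original question

Detected objects are represented as lists whose elements are the indices of those objects in 'objects' (starting from 0)

'question': ['Why', 'is', [0], 'smiling', 'at', [1], '?']
# There is also the following case
'question_orig':'what are 1 and 2 doing?'
'question':["What", "are", [0,1], "doing", "?"]

So object references are numbered from 1 in the original text but from 0 in the tokenized version.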
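To make the index convention concrete, here is a minimal sketch that turns a tokenized question back into readable text by looking each index up in `objects`. The output style (class name plus a 1-based number, joined with "and") is an arbitrary choice for illustration, and `render_tokens` is a hypothetical helper.

```python
def render_tokens(tokens, objects):
    """Replace object-index lists with readable names and join the tokens."""
    words = []
    for token in tokens:
        if isinstance(token, list):
            # Indices are 0-based in the tokenized form; show them 1-based.
            names = [f"{objects[i]}{i + 1}" for i in token]
            words.append(" and ".join(names))
        else:
            words.append(token)
    return " ".join(words)


objects = ['person', 'person', 'bottle']
question = ['Why', 'is', [0], 'smiling', 'at', [1], '?']
print(render_tokens(question, objects))
# prints: Why is person1 smiling at person2 ?
```

The same function works for the elements of answer_choices and rationale_choices, since they use the same token format.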

`question_number` attribute

The first question/answer/rationale for this img-id/img-fn has qid=0, the second has qid=1, etc.

'question_number': 1

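`{answer/rationale}_match_iter` attribute

For each answer choice, the iteration of adversarial matching in which it was introduced

Counting starts from 0. 0 means the choice is the ground-truth answer; 1 means the choice was introduced in the first iteration

'answer_match_iter': [3, 1, 0, 2]
# The list has one entry for each answer choice
### Explanation:
# 3: the first choice is a wrong answer introduced in the third iteration
# 1: the second choice is a wrong answer introduced in the first iteration
# 0: the third choice is the ground-truth answer, so answer_label=2
# 2: the fourth choice is a wrong answer introduced in the second iteration
###
'rationale_match_iter': [3, 0, 2, 1]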
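Because 0 marks the ground-truth choice, the position of 0 in a match_iter list should coincide with the corresponding label. The sketch below checks this on the two example lists; `label_from_match_iter` is a hypothetical helper, and it assumes exactly one 0 per list.

```python
def label_from_match_iter(match_iter):
    """Return the position of the ground-truth choice (the entry equal to 0)."""
    return match_iter.index(0)


print(label_from_match_iter([3, 1, 0, 2]))  # prints 2, matching answer_label
print(label_from_match_iter([3, 0, 2, 1]))  # prints 1, matching rationale_label
```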

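`match_fold` attribute

The data are divided into multiple folds to avoid overlap between questions and answers

# Format: '{split}-{number}'
'match_fold': 'train-0'

`match_index` attribute

The index of the question, answer, and rationale within its match_fold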

'match_index': 4

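`{answer/rationale}_sources` attribute

The match_index that each wrong answer corresponds to. In the example below, the entry at the position of the correct choice (index 2 for answers, index 1 for rationales) equals this record's own match_index, 4.

I did not understand the example given in the official documentation (if you know, please tell me in the comments).

'answer_sources': [9336, 8843, 4, 3308] # both lists have length 4
'rationale_sources': [6658, 4, 5296, 8618] # both lists have length 4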

`{answer/rationale}_choices` attribute (important)

A list of four elements

Each element has the same format as the question attribute

'answer_choices':[[[1], 'has', 'given', 'her', 'some', 'spare', 'change', '.'], 
                  ['She', 'is', 'happy', 'to', 'see', 'him', 'happy', '.'], 
                  [[1], 'has', 'spinach', 'in', 'her', 'teeth', '.'], 
                  [[0], 'complimented', 'how', 'she', 'looks', '.']
                 ]
'rationale_choices':[[[0], 'is', 'sitting', 'behind', [1], 'and', 'is', 'jealous', '.'], 
                     [[0], 'is', 'smiling', 'and', 'appears', 'to', 'be', 'looking','at', 'the', 'mouth', 'of', [1], '.'], 
                     ['Men', 'often', 'look', 'and', 'smile', 'at', 'attractive','women', '.'], 
                     ['People', 'will', 'often', 'smile', 'when', 'lying', '.']
                    ]

`{answer/rationale}_label` attribute (important)

Which answer (0 to 3) is right in answer_choices/rationale_choices.

'answer_label': 2
'rationale_label': 1
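The label is simply an index into the matching choices list. A short sketch using the example values above (`correct_choice` is a hypothetical helper):

```python
def correct_choice(record, kind):
    """Return the correct tokenized choice; kind is 'answer' or 'rationale'."""
    return record[f"{kind}_choices"][record[f"{kind}_label"]]


record = {
    'answer_choices': [
        [[1], 'has', 'given', 'her', 'some', 'spare', 'change', '.'],
        ['She', 'is', 'happy', 'to', 'see', 'him', 'happy', '.'],
        [[1], 'has', 'spinach', 'in', 'her', 'teeth', '.'],
        [[0], 'complimented', 'how', 'she', 'looks', '.'],
    ],
    'answer_label': 2,
}
print(correct_choice(record, 'answer'))
# prints: [[1], 'has', 'spinach', 'in', 'her', 'teeth', '.']
```

The selected choice corresponds to answer_orig, '2 has spinach in her teeth.', with the object reference shifted from 1-based to 0-based.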

`annot_id` attribute

The index of this question, answer, and rationale. It will look something like {SPLIT}-{NUMBER}

'annot_id': 'train-4'

Contents of test.jsonl

Which data items it contains

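import json

# Read the JSONL file: one JSON object per line.
records = []
with open("test-temp.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        if line.strip():  # skip blank lines
            records.append(json.loads(line))

# Print the keys of every record.
for record in records:
    print(record.keys())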

Compared with train.jsonl, the following attributes are missing:

interesting_scores, answer_likelihood, {question/answer/rationale}_orig,

{answer/rationale}_match_iter

{answer/rationale}_sources

{answer/rationale}_label
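Rather than comparing two printed key lists by hand, the missing attributes can be computed as a set difference. The two records below are cut-down, made-up stand-ins for one line of each file, and `missing_keys` is a hypothetical helper.

```python
def missing_keys(train_record, test_record):
    """Return the keys present in the train record but absent from the test record."""
    return sorted(set(train_record) - set(test_record))


# Made-up, heavily shortened records for illustration only.
train_record = {"question": [], "answer_choices": [], "answer_label": 2, "rationale_label": 1}
test_record = {"question": [], "answer_choices": []}
print(missing_keys(train_record, test_record))
# prints: ['answer_label', 'rationale_label']
```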

images files

Contains the images and the metadata file for each image (that is, image + JSON file)

Contents of the JSON files

import json

file_list = ["image1.json", "image2.json", "image3.json"]

# Load each metadata file; each file is expected to hold one JSON object.
metadata = []
for path in file_list:
    with open(path, "r", encoding="utf-8") as f:
        metadata.append(json.load(f))

print(len(metadata))
for item in metadata:
    print(item.keys())

Each metadata file contains the following attributes: boxes, segms, names, width, height

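`boxes` attribute

Each object in the image is represented as [x1, y1, x2, y2, score]

The score is the probability output by Detectron

So (x1, y1) and (x2, y2) are the two endpoints of one diagonal of the bounding rectangle

# The number of sublists equals the number of objects detected in the image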
'boxes':[[903.2135009765625, 701.7916259765625, 1014.09423828125, 1055.47412109375, 0.9703705906867981], 
         [994.289794921875, 704.1527099609375, 1111.846435546875, 947.826904296875, 0.8998854756355286], 
         [1009.2838134765625, 840.1869506835938, 1194.0753173828125, 1074.7960205078125, 0.9434481859207153], 
         [720.6085815429688, 283.8370666503906, 795.2588500976562, 487.2865295410156, 0.9889575839042664], 
         [791.5657348632812, 234.8731231689453, 858.7852172851562, 438.1197509765625, 0.9607705473899841], 
         [1429.427001953125, 928.6102905273438, 1574.16259765625, 1075.0577392578125, 0.8753396272659302], 
         [1479.6678466796875, 773.1539306640625, 1625.8377685546875, 907.2779541015625, 0.9426108598709106], 
         [502.2203369140625, 854.4915161132812, 615.8026123046875, 1041.707763671875, 0.7735394239425659], 
         [669.2015991210938, 991.70947265625, 738.2836303710938, 1075.1796875, 0.7525131106376648], 
         [1267.6439208984375, 751.4714965820312, 1386.6234130859375, 917.6212768554688, 0.87295001745224], 
         [1004.145263671875, 533.8333740234375, 1118.03173828125, 692.065673828125, 0.7064476013183594], 
         [447.1738586425781, 970.9711303710938, 603.43896484375, 1075.8031005859375, 0.8442613482475281], 
         [1111.0369873046875, 658.0494384765625, 1250.2884521484375, 1002.32421875, 0.8830368518829346], 
         [218.87393188476562, 392.3623352050781, 818.292724609375, 913.132568359375, 0.7114404439926147]
        ]
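As a quick illustration of the [x1, y1, x2, y2, score] layout, the sketch below computes a box's width and height and filters boxes by score. The helper names and the 0.9 threshold are arbitrary choices for this example; the two boxes are the first two entries of the list above.

```python
def box_size(box):
    """Return (width, height) of a box given as [x1, y1, x2, y2, score]."""
    x1, y1, x2, y2, _score = box
    return x2 - x1, y2 - y1


def confident_boxes(boxes, threshold):
    """Keep only the boxes whose detection score is at least the threshold."""
    return [box for box in boxes if box[4] >= threshold]


boxes = [
    [903.2135009765625, 701.7916259765625, 1014.09423828125, 1055.47412109375, 0.9703705906867981],
    [994.289794921875, 704.1527099609375, 1111.846435546875, 947.826904296875, 0.8998854756355286],
]
print(box_size(boxes[0]))
print(len(confident_boxes(boxes, 0.9)))
```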

`segms` attribute

Each object corresponds to a list of polygons

Each polygon is a list of [x, y] points, which cv2.findContours identified as the outline of the object

'segms':[[[x1,y1],[x2,y2],...,[xn,yn]],
         [],
         ...
         []
        ]
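One simple thing to do with a polygon is to recover the rectangle that encloses it, which ties the contour points back to the box format. This is only a sketch on a made-up square; `polygon_bounds` is a hypothetical helper, and it takes a single list of [x, y] points, so check the nesting depth of segms in your own files before passing data in.

```python
def polygon_bounds(polygon):
    """Return (x_min, y_min, x_max, y_max) for a list of [x, y] points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# A made-up square, not real segmentation data.
polygon = [[10, 20], [30, 20], [30, 50], [10, 50]]
print(polygon_bounds(polygon))
# prints: (10, 20, 30, 50)
```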

`width` and `height` attributes

The width and height of the image

'width':1920
'height':1080
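A common use of width and height is to rescale pixel coordinates to the range [0, 1], for example before feeding boxes to a model. The sketch below is one way to do it; `normalize_box` is a hypothetical helper and the box values are invented so the arithmetic is easy to follow.

```python
def normalize_box(box, width, height):
    """Scale [x1, y1, x2, y2, score] so coordinates lie in [0, 1]; score is kept."""
    x1, y1, x2, y2, score = box
    return [x1 / width, y1 / height, x2 / width, y2 / height, score]


# Invented box on a 1920x1080 image.
print(normalize_box([960, 270, 1920, 1080, 0.9], 1920, 1080))
# prints: [0.5, 0.25, 1.0, 1.0, 0.9]
```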
