Tensorflow queues - Switching between training and validation data

Date: 2023-09-29

Problem description

I am trying to make use of queues for loading data from files in Tensorflow.

I would like to run the graph with validation data at the end of each epoch to get a better feel for how the training is going.

That is where I am running into problems. I can't seem to figure out how to switch between training data and validation data when using queues.

I have stripped down my code to a bare-minimum toy example to make it easier to get help. Instead of including all the code that loads the image files, performs inference, and trains, I have cut it off at the point where the filenames are loaded into the queue.

import tensorflow as tf

#  DATA
train_items = ["train_file_{}".format(i) for i in range(6)]
valid_items = ["valid_file_{}".format(i) for i in range(3)]

# SETTINGS
batch_size = 3
batches_per_epoch = 2
epochs = 2

# CREATE GRAPH
graph = tf.Graph()
with graph.as_default():
    file_list = tf.placeholder(dtype=tf.string, shape=None)
    
    # Create a queue consisting of the strings in `file_list`
    q = tf.train.string_input_producer(train_items, shuffle=False, num_epochs=None)
    
    # Create batch of items.
    x = q.dequeue_many(batch_size)
    
    # Inference, train op, and accuracy calculation after this point
    # ...


# RUN SESSION
with tf.Session(graph=graph) as sess:
    # Initialize variables
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    
    # Start populating the queue.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    
    try:
        for epoch in range(epochs):
            print("-"*60)
            for step in range(batches_per_epoch):
                if coord.should_stop():
                    break
                train_batch = sess.run(x, feed_dict={file_list: train_items})
                print("TRAIN_BATCH: {}".format(train_batch))
    
            valid_batch = sess.run(x, feed_dict={file_list: valid_items})
            print("\nVALID_BATCH : {} \n".format(valid_batch))
    
    except Exception as e:
        coord.request_stop(e)
    finally:
        coord.request_stop()
        coord.join(threads)

Variations and experiments

Trying different values for `num_epochs`

num_epochs=None

If I set the num_epochs argument in tf.train.string_input_producer() to None, it gives me the following output, which shows that it is running two epochs as intended, but it is using data from the training set when running evaluation.

------------------------------------------------------------
TRAIN_BATCH: ['train_file_0' 'train_file_1' 'train_file_2']
TRAIN_BATCH: ['train_file_3' 'train_file_4' 'train_file_5']

VALID_BATCH : ['train_file_0' 'train_file_1' 'train_file_2']

------------------------------------------------------------
TRAIN_BATCH: ['train_file_3' 'train_file_4' 'train_file_5']
TRAIN_BATCH: ['train_file_0' 'train_file_1' 'train_file_2']

VALID_BATCH : ['train_file_3' 'train_file_4' 'train_file_5']
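
One way to read the output above: in the code as posted, the queue is built directly from `train_items`, and the `file_list` placeholder is never connected to it, so whatever is passed through `feed_dict` cannot change what `x` dequeues. The following is a plain-Python sketch of that behaviour, not TensorFlow code. `make_dequeue_many` is a hypothetical stand-in for the queue, and the sketch assumes that a producer with `shuffle=False, num_epochs=None` simply cycles through the list in order.

```python
from itertools import cycle, islice

train_items = ["train_file_{}".format(i) for i in range(6)]
batch_size = 3
batches_per_epoch = 2
epochs = 2


def make_dequeue_many(items):
    """Hypothetical stand-in for an endless, unshuffled filename queue."""
    stream = cycle(items)  # num_epochs=None: repeat the list forever

    def dequeue_many(n):
        return list(islice(stream, n))

    return dequeue_many


# Only train_items ever enter the queue, as in the posted graph.
dequeue_many = make_dequeue_many(train_items)

log = []
for epoch in range(epochs):
    for step in range(batches_per_epoch):
        log.append(("TRAIN", dequeue_many(batch_size)))
    # The "validation" step pulls from the same queue, so it gets training files.
    log.append(("VALID", dequeue_many(batch_size)))

for kind, batch in log:
    print("{}_BATCH: {}".format(kind, batch))
```

The order of batches in `log` follows the same pattern as the output above: the validation step just receives whichever training files are next in line.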

num_epochs=2

If I set the num_epochs argument in tf.train.string_input_producer() to 2, it gives me the following output, which shows that it does not even run the full two batches of the second epoch (and evaluation is still using training data).

------------------------------------------------------------
TRAIN_BATCH: ['train_file_0' 'train_file_1' 'train_file_2']
TRAIN_BATCH: ['train_file_3' 'train_file_4' 'train_file_5']

VALID_BATCH : ['train_file_0' 'train_file_1' 'train_file_2']

------------------------------------------------------------
TRAIN_BATCH: ['train_file_3' 'train_file_4' 'train_file_5']

num_epochs=1

If I set the num_epochs argument in tf.train.string_input_producer() to 1, in the hope that it will flush any additional training data out of the queue so that it can make use of the validation data, I get the following output, which shows that it terminates as soon as it gets through one epoch of training data and never gets to load the evaluation data.

------------------------------------------------------------
TRAIN_BATCH: ['train_file_0' 'train_file_1' 'train_file_2']
TRAIN_BATCH: ['train_file_3' 'train_file_4' 'train_file_5']
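
The two finite cases are consistent with simple counting, assuming that `num_epochs=k` makes the producer emit each filename exactly `k` times before the queue closes. Every pass of the outer loop asks for `batches_per_epoch + 1` batches (two training pulls plus one validation pull), all from the same queue. The helper `batches_served` below is a hypothetical name used only for this illustration.

```python
def batches_served(num_items, num_epochs, batch_size, batches_per_epoch, epochs):
    """Return the labels of the loop steps that receive a batch before the queue runs dry."""
    # Full batches the queue can supply in total.
    available = (num_items * num_epochs) // batch_size
    # Batches the training loop asks for, in order.
    requested = []
    for epoch in range(epochs):
        requested += ["TRAIN"] * batches_per_epoch + ["VALID"]
    return requested[:available]


print(batches_served(6, 2, 3, 2, 2))  # ['TRAIN', 'TRAIN', 'VALID', 'TRAIN']
print(batches_served(6, 1, 3, 2, 2))  # ['TRAIN', 'TRAIN']
```

With 6 files and a batch size of 3, `num_epochs=2` supplies only 4 batches and `num_epochs=1` only 2, which matches the points at which the two runs above stop.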

Setting `capacity` argument to various values

I have also tried setting the capacity argument in tf.train.string_input_producer() to small values, such as 3 and 1, but these had no effect on the results.

What other approach could I take to switch between training and validation data? Would I have to create separate queues? I am at a loss as to how to get that to work. Would I have to create additional coordinators and queue runners as well?
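
For reference, the separate-queues idea could look roughly like the sketch below. It is only one possible shape, not a confirmed solution. It assumes the TF 1.x queue API reached through `tensorflow.compat.v1` (an assumption, since the code above uses a plain `import tensorflow as tf`), and the names `train_q`, `valid_q`, `x_train` and `x_valid` are hypothetical.

```python
# Assumption: TF >= 1.13 or TF 2.x, using the v1 compatibility API.
import tensorflow.compat.v1 as tf

train_items = ["train_file_{}".format(i) for i in range(6)]
valid_items = ["valid_file_{}".format(i) for i in range(3)]
batch_size = 3

graph = tf.Graph()
with graph.as_default():
    # One filename queue per data split (hypothetical names).
    train_q = tf.train.string_input_producer(train_items, shuffle=False, num_epochs=None)
    valid_q = tf.train.string_input_producer(valid_items, shuffle=False, num_epochs=None)
    x_train = train_q.dequeue_many(batch_size)
    x_valid = valid_q.dequeue_many(batch_size)

with tf.Session(graph=graph) as sess:
    coord = tf.train.Coordinator()
    # Starts the runners of every queue registered in the graph.
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        train_batch = sess.run(x_train)
        valid_batch = sess.run(x_valid)
        print("TRAIN_BATCH: {}".format(train_batch))
        print("VALID_BATCH: {}".format(valid_batch))
    finally:
        coord.request_stop()
        coord.join(threads)
```

A single coordinator and a single `start_queue_runners` call are used here because both producers register their queue runners in the same graph collection. How best to route the two batches into one shared model is left open in this sketch.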
