2016-10-20

Training TensorFlow im2txt fails with "truncated record at"

im2txt trains for several thousand steps and then stops with the error below. I checked the training files, and they look fine.

Running on Ubuntu 16.04, TF r0.11, in GPU mode on a GTX 970 with 4 GB.

Could it be running out of RAM? I'm not sure.

INFO:tensorflow:global step 56396: loss = 2.4654 (0.41 sec/step) 
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors.DataLossError'>, truncated record at 369740238 
    [[Node: ReaderRead = ReaderRead[_class=["loc:@TFRecordReader", "loc:@filename_queue"], _device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReader, filename_queue)]] 

Caused by op u'ReaderRead', defined at: 
    File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 114, in <module> 
    tf.app.run() 
    File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run 
    sys.exit(main(sys.argv[:1] + flags_passthrough)) 
    File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/train.py", line 65, in main 
    model.build() 
    File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/show_and_tell_model.py", line 352, in build 
    self.build_inputs() 
    File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/show_and_tell_model.py", line 153, in build_inputs 
    num_reader_threads=self.config.num_input_reader_threads) 
    File "/home/john/Developer/tensorflow/tensorflow/models/im2txt/bazel-bin/im2txt/train.runfiles/im2txt/im2txt/ops/inputs.py", line 115, in prefetch_input_data 
    _, value = reader.read(filename_queue) 
    File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/io_ops.py", line 277, in read 
    return gen_io_ops._reader_read(self._reader_ref, queue_ref, name=name) 
    File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 211, in _reader_read 
    queue_handle=queue_handle, name=name) 
    File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 748, in apply_op 
    op_def=op_def) 
    File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2403, in create_op 
    original_op=self._default_original_op, op_def=op_def) 
    File "/home/john/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1305, in __init__ 
    self._traceback = _extract_stack() 

DataLossError (see above for traceback): truncated record at 369740238 
    [[Node: ReaderRead = ReaderRead[_class=["loc:@TFRecordReader", "loc:@filename_queue"], _device="/job:localhost/replica:0/task:0/cpu:0"](TFRecordReader, filename_queue)]] 

INFO:tensorflow:global step 56397: loss = 2.5540 (0.40 sec/step) 

Answer

I have the same problem and don't know why. I saw no errors when creating the TFRecords. During training, the error appears near the end of the records. BTW, I'm using TF 0.11rc.
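
One way to narrow this down might be to scan each shard for a record that runs past the end of the file. The sketch below assumes the shards are uncompressed TFRecord files with the standard framing (8-byte little-endian length, 4-byte CRC of the length, payload, 4-byte CRC of the payload). The function name find_truncation is hypothetical, and the code only checks the framing; it does not verify the CRC checksums, so it will not catch every kind of corruption.

```python
import os
import struct


def find_truncation(path):
    """Walk the record framing of an uncompressed TFRecord file.

    Returns (complete_records, offset), where offset is the byte position
    of the first record that runs past the end of the file, or None if
    every record is complete. CRCs are NOT verified (assumption: we only
    care about truncation here).
    """
    size = os.path.getsize(path)
    offset = 0
    count = 0
    with open(path, "rb") as f:
        while offset < size:
            f.seek(offset)
            # Header: 8-byte little-endian payload length + 4-byte CRC.
            header = f.read(12)
            if len(header) < 12:
                return count, offset
            (length,) = struct.unpack("<Q", header[:8])
            # Record ends after the payload and its trailing 4-byte CRC.
            end = offset + 12 + length + 4
            if end > size:
                return count, offset
            offset = end
            count += 1
    return count, None
```

Calling find_truncation on every training shard and printing any result whose second value is not None should show which file is affected. If an offset is reported, that shard was probably cut short while it was being written and would need to be regenerated.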
