1. Installing tensorboard
tensorboard relies on tensorflow, so tensorflow has to be installed as well.
pip install tensorflow
-----------------------
On Linux, the installation failed with this error:
ERROR: Cannot uninstall 'wrapt'. It is a distutils installed project and thus
we cannot accurately determine which files belong to it which would lead to only
a partial uninstall.
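Cause: wrapt was installed through distutils, so pip has no record of which files belong to it and refuses to uninstall it when it tries to replace it.
Solution: tell pip to skip the uninstall step for wrapt with --ignore-installed: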
pip install tf-nightly-gpu-2.0-preview --ignore-installed wrapt
----------------------
Then install tensorboardX, which provides the SummaryWriter used in section 2:
pip install tensorboardX
After installing, try the command in a terminal:
tensorboard --logdir
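If you see tensorboard: error: argument --logdir: expected one argument
the installation succeeded and the tensorboard command can run.
If only tensorboard is installed, without tensorflow,
running it reports: Tensorflow installation not found
If installation still fails after following the steps above, download the .whl file for the matching version from the official site and install it with pip.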
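The command-line check above can also be done from Python. The following is a minimal sketch, assuming the packages are importable under the names tensorflow, tensorboard and tensorboardX; the helper name check_install is made up for illustration.

```python
import importlib.util


def check_install(packages=("tensorflow", "tensorboard", "tensorboardX")):
    """Return a dict mapping each package name to True if it can be found."""
    status = {}
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        status[name] = importlib.util.find_spec(name) is not None
    return status


for name, found in check_install().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

A package reported as MISSING here is the first thing to reinstall before debugging the tensorboard command itself.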
2. Using tensorboard
- Use SummaryWriter from tensorboardX. It writes event files named events.out.tfevents.**** into the logs directory.
from tensorboardX import SummaryWriter
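A minimal sketch of how the writer is typically used, assuming tensorboardX is installed. The function names, the tag train/loss, the logs directory and the loss values are placeholders, not taken from the original project.

```python
def log_losses(writer, losses, tag="train/loss"):
    """Write one scalar per step; writer is expected to be a SummaryWriter."""
    for step, loss in enumerate(losses):
        writer.add_scalar(tag, loss, global_step=step)


def run_demo(logdir="logs"):
    # Imported here so log_losses can be reused without loading tensorboardX.
    from tensorboardX import SummaryWriter

    writer = SummaryWriter(logdir)        # event files are created under logdir
    log_losses(writer, [0.9, 0.5, 0.3])   # placeholder loss values
    writer.close()                        # flush pending events to disk
```

Calling run_demo() and then running tensorboard --logdir logs should show the curve in the scalars view.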
- When tensorboard has to run on a server, and that intranet server is reachable only through a jump host, tensorboard can be viewed as follows:
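The concrete steps are not listed in these notes. One common approach is SSH local port forwarding through the jump host. The sketch below only builds the command; every host name, user name and port in it is a hypothetical placeholder. It assumes OpenSSH 7.3 or later (for the -J option) and that tensorboard is already running on the server on port 6006.

```python
import shlex


def tunnel_command(server, jump_host, local_port=6006, remote_port=6006):
    """Build an ssh command forwarding local_port to tensorboard on the server."""
    return [
        "ssh",
        "-N",                 # no remote command, port forwarding only
        "-J", jump_host,      # hop through the jump host
        "-L", f"{local_port}:localhost:{remote_port}",
        server,
    ]


cmd = tunnel_command("user@gpu-server", "user@jump-host")  # placeholder hosts
print(shlex.join(cmd))
```

Running the printed command on the local machine and then opening http://localhost:6006 in a browser should show the remote tensorboard.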
3. Common tensorboard errors
- Data fails to load
W0625 11:53:14.949339 123145584033792 core_plugin.py:172] Unable to get first event timestamp for run .
W0625 11:53:14.992231 123145578778624 core_plugin.py:172] Unable to get first event timestamp for run .
W0625 12:01:30.308917 123145599799296 core_plugin.py:172] Unable to get first event timestamp for run .
W0625 12:02:00.313975 123145578778624 core_plugin.py:172] Unable to get first event timestamp for run .
W0625 12:02:30.315774 123145578778624 core_plugin.py:172] Unable to get first event timestamp for run .
W0625 12:03:00.318829 123145599799296 core_plugin.py:172] Unable to get first event timestamp for run .
-------------------
Cause and solution
Call writer.close() once you are done with the writer, so that buffered events are flushed to the event file.
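To make sure close() is always reached, even when the logging loop raises an exception, the writer can be wrapped in contextlib.closing. A minimal sketch; train_and_log and its arguments are illustrative names, and writer stands for a tensorboardX SummaryWriter.

```python
from contextlib import closing


def train_and_log(writer, losses, tag="train/loss"):
    """Log scalars and guarantee writer.close() runs, even if logging fails."""
    with closing(writer):  # calls writer.close() when the block exits
        for step, loss in enumerate(losses):
            writer.add_scalar(tag, loss, global_step=step)
```

With tensorboardX this would be called as train_and_log(SummaryWriter("logs"), losses).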
- add_graph fails while exporting the computation graph
Traceback (most recent call last):
File "/Users/eleanorcc/Desktop/Tests/detection_attention/classifier/models/resnet_attention_2.py", line 191, in <module>
writer.add_graph(model, x)
File "/Users/eleanorcc/anaconda3/lib/python3.6/site-packages/tensorboardX/writer.py", line 697, in add_graph
self._get_file_writer().add_graph(graph(model, input_to_model, verbose, **kwargs))
File "/Users/eleanorcc/anaconda3/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 317, in graph
_optimize_trace(trace, operator_export_type)
File "/Users/eleanorcc/anaconda3/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 243, in _optimize_trace
trace.set_graph(_optimize_graph(trace.graph(), operator_export_type))
File "/Users/eleanorcc/anaconda3/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 274, in _optimize_graph
graph = torch._C._jit_pass_onnx(graph, operator_export_type)
File "/Users/eleanorcc/anaconda3/lib/python3.6/site-packages/torch/onnx/__init__.py", line 52, in _run_symbolic_function
return utils._run_symbolic_function(*args, **kwargs)
File "/Users/eleanorcc/anaconda3/lib/python3.6/site-packages/torch/onnx/utils.py", line 504, in _run_symbolic_function
return fn(g, *inputs, **attrs)
TypeError: min() missing 1 required positional argument: 'dim_or_y' (occurred when translating min)
-------------------
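There is no need to hunt for the bug layer by layer. Check first whether the network contains leftover test code. For example, to verify that the matrix-optimized version of otsu_method produced the same results as the earlier version, I had added a piece of test code:
self.otsu_method = OtsuMethod()
self.otsu_method_2 = OtsuMethod_2()
and forward contained: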
delta = self.otsu_method(M) - self.otsu_method_2(M)
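Commenting out these leftover lines fixes the problem. As long as the network runs end to end and produces normal output, add_graph usually does not fail.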
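One way to keep such a consistency check without leaving it inside forward is to run the comparison as a separate script outside the network. The sketch below is plain Python with made-up function names and toy stand-ins for the two implementations; with PyTorch tensors, torch.allclose would be the usual way to do the tolerance check.

```python
def max_abs_diff(reference, optimized, inputs):
    """Largest absolute difference between two implementations over the inputs."""
    return max(abs(reference(x) - optimized(x)) for x in inputs)


def same_result(reference, optimized, inputs, tol=1e-6):
    """True if both implementations agree on every input within tol."""
    return max_abs_diff(reference, optimized, inputs) <= tol


# Toy stand-ins for the original and the optimized routine (hypothetical).
def square_loop(x):
    return sum(x for _ in range(int(x)))


def square_direct(x):
    return x * x


print(same_result(square_loop, square_direct, [1, 2, 3, 4]))
```

Because the check lives outside the model, nothing extra is traced when add_graph exports the network.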
4. tensorboard usage examples
- Viewing the computation graph
- Visualizing feature maps