Depth Anything Metric | Practical Notes

Quick notes, to avoid redoing this work later.

How to tell positive depth from inverse depth

Depth Anything v1/v2 both output inverse depth: after normalization to 0-255,
0 represents infinitely far,
255 represents a distance of 0 (nearest).

  • For relative depth: check whether near and far regions of the relative depth map render as black or white. A quick probe: build an array with the same shape as the output depth, fill it entirely with zeros, run it through the same visualization, and see which color it comes out as:
zeros_depth = np.zeros(np.shape(depth))
depth = zeros_depth
  • For absolute depth: same idea. Stretch the value range to 0-255 first, then apply the relative-depth check (see the sketch below).
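A minimal sketch of that 0-255 stretch, assuming depth is the raw metric prediction as a NumPy array (the helper name to_uint8 is mine, not from the repo):

import numpy as np

# Sketch: stretch an absolute (metric) depth map to 0-255 so the same
# black/white inspection used for relative depth applies.
def to_uint8(depth):
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)  # normalize to [0, 1]
    return (d * 255.0).astype(np.uint8)             # 0 = black, 255 = white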

Environment

Env depth_anything for /home/zzx/Depth-Anything/run.py
Env depth_anything_metric for /home/zzx/Depth-Anything/metric_depth/depth_to_pointcloud.py

Need to fix

File "/home/cjd/Depth-Anything/torchhub/facebookresearch_dinov2_main/vision_transformer.py", line 219, in prepare_tokens_with_masks
x = x + self.interpolate_pos_encoding(x, w, h)
File "/home/cjd/Depth-Anything/torchhub/facebookresearch_dinov2_main/vision_transformer.py", line 199, in interpolate_pos_encoding
patch_pos_embed = nn.functional.interpolate(
TypeError: interpolate() got an unexpected keyword argument 'antialias'
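The antialias keyword was only added to torch.nn.functional.interpolate in PyTorch 1.11, so this TypeError usually means the installed torch is older. Upgrading PyTorch is the clean fix; otherwise, a small compatibility wrapper around the failing call works. A sketch (you would patch the call site in vision_transformer.py yourself):

import torch.nn as nn

# Sketch: retry interpolate without `antialias` on PyTorch < 1.11,
# where the kwarg does not exist.
def interpolate_compat(x, **kwargs):
    try:
        return nn.functional.interpolate(x, **kwargs)
    except TypeError:
        kwargs.pop("antialias", None)  # drop the unsupported kwarg and retry
        return nn.functional.interpolate(x, **kwargs)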

Run Metric Notes

/home/zzx/Depth-Anything/metric_depth/depth_to_pointcloud.py

  1. Configure the config file

    DATASET = 'nyu'

    /home/zzx/Depth-Anything/metric_depth/zoedepth/utils/config.py

  2. Choose the .pt checkpoint

    Set the checkpoint path in the script's parser.add_argument default (see the sketch after this list).

    Indoor:
    /home/zzx/Depth-Anything/checkpoints/depth_anything_metric_depth_indoor.pt

    Outdoor:
    /home/zzx/Depth-Anything/checkpoints/depth_anything_metric_depth_outdoor.pt
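For reference, a sketch of what that argument can look like. The flag name and default below are assumptions, so check the actual parser.add_argument definitions in depth_to_pointcloud.py:

import argparse

# Sketch (assumed flag name): point the pretrained resource at the
# indoor or outdoor checkpoint depending on the scene.
parser = argparse.ArgumentParser()
parser.add_argument(
    "-p", "--pretrained_resource", type=str,
    default="local::/home/zzx/Depth-Anything/checkpoints/depth_anything_metric_depth_indoor.pt",
    help="swap in depth_anything_metric_depth_outdoor.pt for outdoor scenes",
)
args = parser.parse_args()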

Process Images and MonoDepth

import os
import cv2
import torch
from PIL import Image
import torchvision.transforms as transforms

DATASET = 'nyu'
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
FINAL_HEIGHT = 256
FINAL_WIDTH = 256

# `model` is the metric depth model; see "Use the metric depth" below for how it is built.
# `image_path` is the input image, `OUTPUT_DIR` the output folder.
color_image = Image.open(image_path).convert('RGB')
image_tensor = transforms.ToTensor()(color_image).unsqueeze(0).to(DEVICE)  # 1, 3, 1552, 2326

pred = model(image_tensor, dataset=DATASET)

if isinstance(pred, dict):
    pred = pred.get('metric_depth', pred.get('out'))  # 1, 1, 392, 518
elif isinstance(pred, (list, tuple)):
    pred = pred[-1]

pred = pred.squeeze().detach().cpu().numpy()  # 392, 518

# depth pred: scale metric depth into 0-255 and colorize
pred_color = cv2.applyColorMap(cv2.convertScaleAbs(pred, alpha=15), cv2.COLORMAP_JET)
pred_color = cv2.cvtColor(pred_color, cv2.COLOR_BGR2RGB)  # OpenCV is BGR; PIL expects RGB
pred_color = Image.fromarray(pred_color).resize((FINAL_WIDTH, FINAL_HEIGHT), Image.NEAREST)

# photo
resized_color_image = color_image.resize((FINAL_WIDTH, FINAL_HEIGHT), Image.LANCZOS)

# save: stack the RGB image on top of the colorized depth
photo = Image.new('RGB', (FINAL_WIDTH, FINAL_HEIGHT * 2))
photo.paste(resized_color_image, (0, 0))
photo.paste(pred_color, (0, FINAL_HEIGHT))
photo.save(os.path.join(OUTPUT_DIR, os.path.splitext(os.path.basename(image_path))[0] + ".png"))

View the size of a PNG file

from PIL import Image

image = Image.open('/home/zzx/Depth-Anything/assets/examples/demo20.png')

width, height = image.size

print("Width:", width)
print("Height:", height)

Merge left and right images (indoor result on the left, outdoor on the right)

from PIL import Image
import os

# Folder paths
indoor_folder = "/home/zzx/Depth-Anything/output/Metric/Indoor"
outdoor_folder = "/home/zzx/Depth-Anything/output/Metric/Outdoor"
OUTPUT_DIR = '/home/zzx/Depth-Anything/output/Metric/Compare'
os.makedirs(OUTPUT_DIR, exist_ok=True)

indoor_files = os.listdir(indoor_folder)

for filename in indoor_files:
    if filename.endswith(".png"):
        indoor_path = os.path.join(indoor_folder, filename)
        outdoor_path = os.path.join(outdoor_folder, filename)
        if not os.path.exists(outdoor_path):  # skip files without an outdoor counterpart
            continue

        indoor_image = Image.open(indoor_path)
        outdoor_image = Image.open(outdoor_path)

        # Paste the indoor result on the left, the outdoor result on the right
        combined_image = Image.new("RGB", (indoor_image.width * 2, indoor_image.height))
        combined_image.paste(indoor_image, (0, 0))
        combined_image.paste(outdoor_image, (indoor_image.width, 0))

        output_path = os.path.join(OUTPUT_DIR, filename)
        combined_image.save(output_path)

Use the metric depth

import sys
import torch
from PIL import Image
import torchvision.transforms as transforms

sys.path.append('/home/cjd/Depth-Anything/metric_depth')
from zoedepth.models.builder import build_model
from zoedepth.utils.config import get_config

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
DATASET = 'kitti'  # 'kitti' pairs with the outdoor checkpoint; use 'nyu' for indoor
config = get_config('zoedepth', "eval", DATASET)
config.pretrained_resource = 'local::/home/cjd/Depth-Anything/checkpoints/depth_anything_metric_depth_outdoor.pt'
mono_model = build_model(config).to(DEVICE)
mono_model.eval()

color_image = Image.open(imfile).convert('RGB')
image_size = (color_image.height, color_image.width)  # (H, W) for interpolate
image_tensor = transforms.ToTensor()(color_image).unsqueeze(0).to(DEVICE)
with torch.no_grad():
    depth_tensor = mono_model(image_tensor, dataset=DATASET)
if isinstance(depth_tensor, dict):
    depth_tensor = depth_tensor.get('metric_depth', depth_tensor.get('out'))
elif isinstance(depth_tensor, (list, tuple)):
    depth_tensor = depth_tensor[-1]
# Resize the prediction back to the input resolution
depth_tensor = torch.nn.functional.interpolate(
    depth_tensor.float(), size=(image_size[0], image_size[1]), mode="nearest"
).squeeze()
depth_tensor[depth_tensor > 60.0] = 0.0  # zero out far / unreliable predictions beyond 60 m
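Since the target script is depth_to_pointcloud.py, the usual next step is back-projecting the metric depth into 3D. A minimal sketch, assuming pinhole intrinsics fx, fy, cx, cy (placeholder parameters, not values from the repo):

import numpy as np

# Sketch: back-project a metric depth map (H, W) into an N x 3 point
# cloud with assumed pinhole intrinsics fx, fy, cx, cy.
def depth_to_points(depth, fx, fy, cx, cy):
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels zeroed out by the 60 m cut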