darknet_video modifications

Line-by-line explanation
Sure, here's a line-by-line explanation:

def convert2original(image, bbox):
    x, y, w, h = convert2relative(bbox)

    image_h, image_w, __ = image.shape

    orig_x       = int(x * image_w)
    orig_y       = int(y * image_h)
    orig_width   = int(w * image_w)
    orig_height  = int(h * image_h)

    bbox_converted = (orig_x, orig_y, orig_width, orig_height)

    return bbox_converted

This function takes an image and a bounding box bbox given as a tuple of (x, y, width, height). It first converts the box to relative coordinates with the convert2relative() function (defined elsewhere in the script), then scales those relative values by the width and height of the original image and returns the result as an (x, y, width, height) tuple in pixels of the original image.
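As a worked example of the scaling step, here is a minimal sketch. It assumes a 416x416 network input and that convert2relative() simply divides the box by the network width and height; neither the helper's body nor these sizes appear above, so both are assumptions. The network size is passed in explicitly instead of being read from globals, and a small stand-in object replaces the real image so that only its shape matters.

```python
from types import SimpleNamespace

def convert2relative(bbox, net_w, net_h):
    # Assumed behavior: normalize a box from network-input pixels to the 0..1 range.
    x, y, w, h = bbox
    return x / net_w, y / net_h, w / net_w, h / net_h

def convert2original(image, bbox, net_w=416, net_h=416):
    # Scale the relative box to the pixel size of the original image.
    x, y, w, h = convert2relative(bbox, net_w, net_h)
    image_h, image_w = image.shape[:2]
    return (int(x * image_w), int(y * image_h), int(w * image_w), int(h * image_h))

# Stand-in for a 1280x720 NumPy frame; only the .shape attribute is used here.
frame = SimpleNamespace(shape=(720, 1280, 3))
box = convert2original(frame, (208, 208, 104, 52))
print(box)  # (640, 360, 320, 90)
```

The box center at (208, 208) in the 416x416 input maps to the center of the 1280x720 frame, and the width and height are scaled by the same ratios.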

def convert4cropping(image, bbox):
    x, y, w, h = convert2relative(bbox)

    image_h, image_w, __ = image.shape

    orig_left    = int((x - w / 2.) * image_w)
    orig_right   = int((x + w / 2.) * image_w)
    orig_top     = int((y - h / 2.) * image_h)
    orig_bottom  = int((y + h / 2.) * image_h)

    if (orig_left < 0): orig_left = 0
    if (orig_right > image_w - 1): orig_right = image_w - 1
    if (orig_top < 0): orig_top = 0
    if (orig_bottom > image_h - 1): orig_bottom = image_h - 1

    bbox_cropping = (orig_left, orig_top, orig_right, orig_bottom)

    return bbox_cropping

This function takes the same inputs: an image and a bounding box bbox given as a tuple of (x, y, width, height), which it converts to relative coordinates using the convert2relative() function. From the box center (x, y) and its width and height it computes the absolute pixel coordinates of the left, right, top, and bottom edges, then clamps each edge to the image boundaries. It adds no padding. The result is returned as (left, top, right, bottom), which makes it easy to crop the detected object out of the image later with array slicing.
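The center-to-corner conversion and the clamping can be tried in isolation. The sketch below is not the script's function: the hypothetical helper corners_for_cropping() takes an already relative box and the image size directly, so it runs without an image or the network globals.

```python
def corners_for_cropping(rel_bbox, image_w, image_h):
    # rel_bbox is (center x, center y, width, height), each relative to the image (0..1).
    x, y, w, h = rel_bbox
    left = int((x - w / 2.0) * image_w)
    right = int((x + w / 2.0) * image_w)
    top = int((y - h / 2.0) * image_h)
    bottom = int((y + h / 2.0) * image_h)
    # Clamp every edge to the valid pixel range of the image.
    left = max(left, 0)
    right = min(right, image_w - 1)
    top = max(top, 0)
    bottom = min(bottom, image_h - 1)
    return left, top, right, bottom

inside = corners_for_cropping((0.5, 0.5, 0.25, 0.5), 1280, 720)
clamped = corners_for_cropping((0.0, 0.5, 0.5, 0.5), 1280, 720)
print(inside)   # (480, 180, 800, 540)
print(clamped)  # (0, 180, 320, 540)
```

The second box is centered on the left border, so its left edge would fall at -320 and is clamped to 0. With a NumPy image, the crop is then image[top:bottom, left:right].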

def video_capture(frame_queue, darknet_image_queue):
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frame_resized = cv2.resize(frame_rgb, (darknet_width, darknet_height),
                                   interpolation=cv2.INTER_LINEAR)
        frame_queue.put(frame)
        img_for_detect = darknet.make_image(darknet_width, darknet_height, 3)
        darknet.copy_image_from_bytes(img_for_detect, frame_resized.tobytes())
        darknet_image_queue.put(img_for_detect)
    cap.release()

This function runs in its own thread and reads successive frames from the capture device (for example a webcam or a video file). For each frame it converts the color space from BGR to RGB and resizes the result to the input size expected by the YOLOv3 model (darknet_width by darknet_height). It puts the original, unresized frame on frame_queue for the drawing thread, copies the resized frame into a newly allocated Darknet image, and puts that image on darknet_image_queue for the inference thread. When no more frames can be read, the loop exits and the capture device is released.
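The consumer threads in this script loop on cap.isOpened(), so a blocking get() may never return once the producer has stopped. One common way to handle this is for the producer to put a sentinel value on the queue when it finishes. The sketch below shows the idea with stand-in strings instead of a real capture device. The names producer, consumer, and STOP are illustrative and not part of darknet_video.py.

```python
import threading
from queue import Queue

STOP = None  # sentinel marking the end of the stream

def producer(frames, frame_queue):
    # Put every frame on the queue, then signal that no more will follow.
    for frame in frames:
        frame_queue.put(frame)
    frame_queue.put(STOP)

def consumer(frame_queue, seen):
    while True:
        frame = frame_queue.get()  # blocks until an item is available
        if frame is STOP:
            break
        seen.append(frame)

frame_queue = Queue(maxsize=2)  # a bounded queue keeps the producer from running ahead
seen = []
worker = threading.Thread(target=consumer, args=(frame_queue, seen))
worker.start()
producer(["frame0", "frame1", "frame2"], frame_queue)
worker.join()
print(seen)  # ['frame0', 'frame1', 'frame2']
```

With several consumers, each one needs its own sentinel, or the sentinel has to be put back on the queue before a consumer exits.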

def inference(darknet_image_queue, detections_queue, fps_queue):
    while cap.isOpened():
        darknet_image = darknet_image_queue.get()
        prev_time = time.time()
        detections = darknet.detect_image(network, class_names, darknet_image, thresh=args.thresh)
        detections_queue.put(detections)
        fps = int(1/(time.time() - prev_time))
        fps_queue.put(fps)
        print("FPS: {}".format(fps))
        darknet.print_detections(detections, args.ext_output)
        darknet.free_image(darknet_image)
    cap.release()

This function also runs in its own thread. It takes each Darknet image from darknet_image_queue, runs detection on it, and puts the resulting detections (label, confidence, and bounding box) on detections_queue for the drawing thread. It also measures the inference time, converts it to frames per second (FPS), puts that value on fps_queue, prints it to the console, and frees the memory allocated for the Darknet image.
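The expression int(1/(time.time() - prev_time)) raises ZeroDivisionError if the two timestamps happen to be equal. A slightly more defensive variant is sketched below. It uses time.perf_counter(), which is intended for measuring intervals, and a hypothetical helper measure_fps() that times an arbitrary callable in place of darknet.detect_image().

```python
import time

def measure_fps(work):
    # Time one call to `work` and convert the elapsed time to frames per second.
    start = time.perf_counter()
    result = work()
    elapsed = time.perf_counter() - start
    fps = int(1 / elapsed) if elapsed > 0 else 0  # avoid dividing by zero
    return result, fps

# Stand-in for an inference call that takes about 50 ms.
result, fps = measure_fps(lambda: time.sleep(0.05) or "detections")
print(result, fps)
```

Because drawing() passes this value to cv2.waitKey(), and a delay of 0 makes waitKey block until a key is pressed, a caller may want to substitute a small positive delay when the measured FPS is 0.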

def drawing(frame_queue, detections_queue, fps_queue):
    random.seed(3)  # deterministic bbox colors
    video = set_saved_video(cap, args.out_filename, (video_width, video_height))
while cap.is




PaddleOCR
To run character recognition on the detected bounding boxes with PaddleOCR and output the recognized text, you can modify the `drawing()` function so that it starts an additional thread, which takes detections from the queue and runs OCR on the image regions cropped from those bounding boxes. Here's an example of what the modified code might look like:

from queue import Queue
from threading import Thread

from paddleocr import PaddleOCR

# Build the OCR engine once; creating it for every crop would reload the models.
# lang='ch' selects the Chinese models (assumes the PaddleOCR 2.x API).
ocr_engine = PaddleOCR(lang='ch')

def drawing(frame_queue, detections_queue, fps_queue):
    random.seed(3)  # deterministic bbox colors
    video = set_saved_video(cap, args.out_filename, (video_width, video_height))

    # Start OCR thread
    ocr_queue = Queue()
    Thread(target=perform_ocr, args=(detections_queue, ocr_queue)).start()

    while cap.isOpened():
        frame = frame_queue.get()
        # NOTE: perform_ocr() reads from detections_queue as well, so each list of
        # detections reaches only one of the two threads.
        detections = detections_queue.get()
        ocr_results = ocr_queue.get()
        fps = fps_queue.get()
        detections_adjusted = []
        if frame is not None:
            for label, confidence, bbox in detections:
                bbox_adjusted = convert2original(frame, bbox)
                detections_adjusted.append((str(label), confidence, bbox_adjusted))

                # Extract cropped image and perform OCR if no result is cached
                left, top, right, bottom = convert4cropping(frame, bbox)
                cropped_image = frame[top:bottom, left:right]
                ocr_text = ocr_results.get(str(bbox_adjusted))
                if ocr_text is None:
                    ocr_text = perform_paddleocr(cropped_image)
                    ocr_results[str(bbox_adjusted)] = ocr_text

            image = darknet.draw_boxes(detections_adjusted, frame, class_colors)
            if not args.dont_show:
                cv2.imshow('Inference', image)
            if args.out_filename is not None:
                video.write(image)
            if cv2.waitKey(fps) == 27:
                break
    cap.release()
    video.release()
    cv2.destroyAllWindows()

def perform_ocr(detections_queue, ocr_queue):
    ocr_results = {}
    while cap.isOpened():
        detections = detections_queue.get()
        for label, confidence, bbox in detections:
            # NOTE: `frame` is not defined in this thread. The frame that belongs to
            # these detections has to be passed in before this function can run.
            bbox_adjusted = convert2original(frame, bbox)

            # Extract cropped image and perform OCR
            left, top, right, bottom = convert4cropping(frame, bbox)
            cropped_image = frame[top:bottom, left:right]
            ocr_text = perform_paddleocr(cropped_image)
            ocr_results[str(bbox_adjusted)] = ocr_text

        ocr_queue.put(ocr_results)

def perform_paddleocr(image):
    ocr_result = ''
    result = ocr_engine.ocr(image)
    # Assumed result layout (PaddleOCR 2.6 and later): one entry per image, each a
    # list of [box, (text, score)] items, or None when no text was found.
    for line in result or []:
        for word in line or []:
            ocr_result += word[1][0]
    return ocr_result


In this modified `drawing()` function, we start a new thread by calling `Thread(target=perform_ocr, args=(detections_queue, ocr_queue)).start()` right after creating `ocr_queue`. The `perform_ocr()` function reads from the same `detections_queue` as `drawing()` and writes to an additional queue, `ocr_queue`, which carries the OCR results for the detected bounding boxes back to `drawing()`.

Inside the main while loop of `drawing()`, we crop the region of each bounding box out of the frame using the corners returned by `convert4cropping()`. We then check whether the `ocr_results` dictionary already holds an OCR result for this bounding box. If not, we run PaddleOCR on the cropped image and store the result in `ocr_results`, keyed by the bounding box coordinates.

The OCR thread works in the other direction: it takes detections from `detections_queue`, runs PaddleOCR on each detected bounding box, stores the text in its own `ocr_results` dictionary, and puts that dictionary on `ocr_queue`. The drawing thread picks it up with `ocr_queue.get()` and can then look up the text for each bounding box when drawing on the original frame. Two caveats apply to this sketch: an item taken from a `Queue` is delivered to only one consumer, so `drawing()` and `perform_ocr()` compete for the same detections, and `perform_ocr()` also needs access to the frame that the detections belong to.

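Note that this example assumes PaddleOCR is already installed and that a pre-trained model for Chinese character recognition is available. In PaddleOCR the language is normally selected with the `lang` argument (for example `lang='ch'`), whereas `det_model_dir` and `rec_model_dir` expect paths to model directories, not a language code. You may need to adjust this configuration if you are using a different model or language.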
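The step that turns a PaddleOCR result into a single string can be tested without installing PaddleOCR. The sketch below assumes the nested layout returned by the `ocr()` method in PaddleOCR 2.6 and later (one entry per image, each a list of `[box, (text, score)]` items, or `None` when nothing was found). Other versions may return a different structure. The helper `join_ocr_text()` and the `sample` data are hand-written for illustration and are not real OCR output.

```python
def join_ocr_text(result, min_score=0.0):
    # Concatenate the recognized text of all lines whose score is high enough.
    parts = []
    for page in result or []:
        for _box, (text, score) in page or []:  # `page` is None if no text was found
            if score >= min_score:
                parts.append(text)
    return "".join(parts)

# Hand-written sample in the assumed layout: two text lines on one image.
sample = [[
    [[[0, 0], [40, 0], [40, 12], [0, 12]], ("AB", 0.98)],
    [[[0, 14], [40, 14], [40, 26], [0, 26]], ("12", 0.40)],
]]
print(join_ocr_text(sample))                 # AB12
print(join_ocr_text(sample, min_score=0.5))  # AB
print(repr(join_ocr_text([None])))           # ''
```

Filtering by score is optional; it is included here because low-confidence fragments are a common source of noise when the crop contains little or no text.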








tesseract

To perform character recognition on the detected bounding boxes, you can modify the `drawing()` function to include another thread that takes in detections from the queue and performs optical character recognition (OCR) on the cropped images within those bounding boxes. Here's an example of what this modified function might look like:

from queue import Queue
from threading import Thread

import pytesseract

def drawing(frame_queue, detections_queue, fps_queue):
    random.seed(3)  # deterministic bbox colors
    video = set_saved_video(cap, args.out_filename, (video_width, video_height))

    # Start OCR thread
    ocr_queue = Queue()
    Thread(target=perform_ocr, args=(detections_queue, ocr_queue)).start()

    while cap.isOpened():
        frame = frame_queue.get()
        # NOTE: perform_ocr() reads from detections_queue as well, so each list of
        # detections reaches only one of the two threads.
        detections = detections_queue.get()
        ocr_results = ocr_queue.get()
        fps = fps_queue.get()
        detections_adjusted = []
        if frame is not None:
            for label, confidence, bbox in detections:
                bbox_adjusted = convert2original(frame, bbox)
                detections_adjusted.append((str(label), confidence, bbox_adjusted))

                # Extract cropped image and perform OCR if no result is cached
                left, top, right, bottom = convert4cropping(frame, bbox)
                cropped_image = frame[top:bottom, left:right]
                ocr_text = ocr_results.get(str(bbox_adjusted))
                if ocr_text is None:
                    ocr_text = pytesseract.image_to_string(cropped_image)
                    ocr_results[str(bbox_adjusted)] = ocr_text

            image = darknet.draw_boxes(detections_adjusted, frame, class_colors)
            if not args.dont_show:
                cv2.imshow('Inference', image)
            if args.out_filename is not None:
                video.write(image)
            if cv2.waitKey(fps) == 27:
                break
    cap.release()
    video.release()
    cv2.destroyAllWindows()

def perform_ocr(detections_queue, ocr_queue):
    ocr_results = {}
    while cap.isOpened():
        detections = detections_queue.get()
        for label, confidence, bbox in detections:
            # NOTE: `frame` is not defined in this thread. The frame that belongs to
            # these detections has to be passed in before this function can run.
            bbox_adjusted = convert2original(frame, bbox)

            # Extract cropped image and perform OCR
            left, top, right, bottom = convert4cropping(frame, bbox)
            cropped_image = frame[top:bottom, left:right]
            ocr_text = pytesseract.image_to_string(cropped_image)
            ocr_results[str(bbox_adjusted)] = ocr_text

        ocr_queue.put(ocr_results)

In this modified `drawing()` function, we start a new thread by calling `Thread(target=perform_ocr, args=(detections_queue, ocr_queue)).start()` right after creating `ocr_queue`. The `perform_ocr()` function reads from the same `detections_queue` as `drawing()` and writes to an additional queue, `ocr_queue`, which carries the OCR results for the detected bounding boxes back to `drawing()`.



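In terms of political thinking, I believe I have consistently taken Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era as my guide, conscientiously fulfilled my responsibilities as a citizen, and actively practiced the core socialist values in daily life. At the same time, I am keenly aware that my ideological awareness still needs to improve, and that I need to study and understand Marxism-Leninism, Mao Zedong Thought, and Deng Xiaoping Theory in greater depth.
In study and research, I have always kept a good attitude toward learning and a desire for knowledge. I listen carefully to my teachers, take an active part in discussions, and try to broaden my thinking and look at problems from several angles. During university I also took part in various research projects, and this practice improved my hands-on skills and my ability to do scientific research.
In social activities, I have always put serving others first and actively taken part in volunteer work of all kinds, such as organizing students to join off-campus public welfare activities and contributing to community cultural projects. These activities trained my organizational, management, communication, and coordination skills, and strengthened my sense of responsibility and mission toward society.
In terms of abilities and strengths, I have reasonable programming skills and language skills. I love computer programming, am proficient in several programming languages, and have some understanding of and practical experience with data analysis. I also regularly take part in English speech contests and debates, which has trained my spoken English and improved my logical thinking and analytical skills.
In terms of achievements, I have received a second prize in the National English Competition for College Students, a third prize in the innovation and entrepreneurship contest of the campus science and technology festival, and the title of Outstanding Volunteer, among other honors. These achievements are the result of sustained effort and give me more confidence and motivation to pursue higher goals.
In terms of existing problems, I think I still have many shortcomings and areas to improve. For example, my time management needs work, and I sometimes fall into procrastination and waste time. My social skills also need to improve: I need to handle interpersonal relationships better and try to interact with more people from different backgrounds.
Going forward, I will continue to expand my knowledge, study and research a range of fields in depth, and improve my overall abilities. I will also pay more attention to practical skills, and use project work to keep training my hands-on ability and teamwork. In social activities, I will actively take part in volunteer work and try to organize meaningful activities to give back to society. Finally, I hope to make better use of my potential, find a suitable career direction as soon as possible, and contribute to my country and society.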




Day2
Section 1
To draw a box around the detections and print OCR results in the box, you can modify the `perform_ocr` function as follows:

```python
def perform_ocr(detections_queue, ocr_queue):
    # Assumes frame_queue, fps_queue, cap, args, video_width and video_height are
    # module-level names, as in darknet_video.py.
    video = set_saved_video(cap, args.out_filename, (video_width, video_height))
    ocr_results = {}
    while cap.isOpened():
        frame = frame_queue.get()
        detections = detections_queue.get()
        fps = fps_queue.get()
        if frame is None:
            continue
        for label, confidence, bbox in detections:
            bbox_adjusted = convert2original(frame, bbox)
            # Extract cropped image and perform OCR
            left, top, right, bottom = convert4cropping(frame, bbox)
            cropped_image = frame[top:bottom, left:right]
            if cropped_image.size == 0:
                continue  # degenerate box, nothing to recognize
            ocr_text = perform_paddleocr(cropped_image)
            ocr_results[str(bbox_adjusted)] = ocr_text

            # Draw box and print OCR results. bbox_adjusted is (center x, center y,
            # width, height), so the corners from convert4cropping() are used here.
            cv2.rectangle(frame, (left, top), (right, bottom), (255, 0, 0), 2)
            cv2.putText(frame, ocr_text, (left, top - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 1, cv2.LINE_AA)

        ocr_queue.put(ocr_results)

        if not args.dont_show:
            cv2.imshow('Inference', frame)
        if args.out_filename is not None:
            video.write(frame)
        if cv2.waitKey(fps) == 27:
            break

    cap.release()
    video.release()
```

This modified code adds two calls that draw a rectangle around each detected object and print the OCR result just above it. The cv2.rectangle function takes the frame, the top-left and bottom-right corners of the rectangle, the color, and the line thickness as arguments. The cv2.putText function takes the frame, the text to display, the position of the text's bottom-left corner, the font face, the font scale, the color, the thickness, and the line type as arguments. The built-in Hershey fonts cover only ASCII characters, so non-ASCII text such as Chinese will not be rendered correctly by cv2.putText.

Section 2
The line ocr_text = ocr_results.get(str(bbox_adjusted)) retrieves the OCR results for the current detection bbox from the ocr_results dictionary.
The str(bbox_adjusted) is used as the dictionary key because it turns the box coordinates into a string that identifies the detection, so the OCR text can be matched to the box it came from. Two detections share a key only if their coordinates are identical, and the same object gets a new key whenever its box moves between frames. The bbox_adjusted tuple holds the center (x, y), width, and height of the detection bbox in pixels of the original frame, that is, after undoing the resize that was applied for detection.
The get() method is called on the ocr_results dictionary with str(bbox_adjusted) as its argument. If a matching key exists in the dictionary, the corresponding OCR text value is returned and assigned to the ocr_text variable. If no matching key is found, the get() method returns None.
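A short sketch of this lookup follows. The coordinates and the text "AB12" are made up for illustration.

```python
ocr_results = {}
bbox_a = (640, 360, 320, 90)
bbox_b = (641, 360, 320, 90)  # the same object, shifted by one pixel

ocr_results[str(bbox_a)] = "AB12"

hit = ocr_results.get(str(bbox_a))   # key exists
miss = ocr_results.get(str(bbox_b))  # key does not exist, so get() returns None
print(hit, miss)  # AB12 None
```

The one-pixel shift is enough to miss the cached entry, which is why a cache keyed by exact coordinates rarely hits on video. A tuple is hashable, so bbox_a could also be used as the key directly without converting it to a string.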

Section 3
The line ocr_results = ocr_queue.get() retrieves the OCR results dictionary from the ocr_queue.
The ocr_queue is a Python Queue object that is used to pass data between different threads in a synchronized way. In this code, it is used to pass the OCR results dictionary from the thread that performs OCR on each detection frame to the main thread that displays the detected objects with their OCR results.
The get() method of the ocr_queue blocks until an item is available in the queue. Once an item is available, it removes and returns it from the queue. When the OCR thread puts the OCR results dictionary into the queue using ocr_queue.put(ocr_results), the get() method of the main thread can retrieve it for further processing.
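The blocking behavior can be observed with a tiny stand-in worker. In this sketch the dictionary contents are made up, and a timeout is passed to get() so that the example cannot hang. With a timeout, get() raises queue.Empty when nothing arrives in time.

```python
import threading
from queue import Queue, Empty

ocr_queue = Queue()

def ocr_worker():
    # Stand-in for the OCR thread: publish one results dictionary.
    ocr_queue.put({"(640, 360, 320, 90)": "AB12"})

threading.Thread(target=ocr_worker).start()

first = ocr_queue.get(timeout=1.0)  # waits until the worker has put its item
try:
    second = ocr_queue.get(timeout=0.1)  # nothing else is coming
except Empty:
    second = None
print(first, second)
```

Without the timeout, the second get() would block forever, which is what happens to a consumer whose producer has already exited.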
