OpenCV does not work in Google Colaboratory

Problem description

I was practicing OpenCV in Google Colaboratory because I don't know how to use OpenCV on a GPU, and running OpenCV on my own hardware uses a lot of CPU, so I moved to Google Colaboratory. The link to my notebook is here.

If you don't want to look at it, the code is below:

import cv2

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
cap = cv2.VideoCapture(0)

while True:
    _, img = cap.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)

    cv2.imshow('img', img)

    k = cv2.waitKey(30) & 0xff
    if k==27:
        break
    
cap.release()

The same code runs fine on my PC, but not on Google Colaboratory. The error is:

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-5-0d9472926d8c> in <module>()
      6 while True:
      7         _, img = cap.read()
----> 8         gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
      9         faces = face_cascade.detectMultiScale(gray, 1.1, 4)
     10         for (x, y, w, h) in faces:

error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

PS: In Google Colaboratory I put the haarcascade file in the same directory as my notebook.

How can I handle this? And if not, is there any concrete way to run OpenCV on a CUDA-enabled GPU instead of the CPU? Thanks in advance!


Solution

_src.empty() means there was a problem getting a frame from the camera: img is None, and when cvtColor(None, ...) is attempted it fails with _src.empty().

You should check if img is not None:, because cv2 does not raise an error when it cannot get a frame from the camera or read an image from a file. Also, sometimes the camera needs time to warm up, and it can return a few empty frames (None) at the start.


VideoCapture(0) reads frames from a camera connected directly to the computer that runs the code. When you run the code on the Google Colaboratory server, that means a camera connected directly to the Google Colaboratory server (not your local camera), but this server has no camera, so VideoCapture(0) cannot work on Google Colaboratory.

cv2 running on a server cannot get images from your local camera. Your web browser may have access to your camera, but it needs JavaScript to grab frames and send them to the server, and the server needs code to receive those frames.


I searched Google for whether Google Colaboratory can access a local webcam, and it seems they created a snippet for this - Camera Capture - the first cell defines a function take_photo() that uses JavaScript to access your camera and display the picture in the browser, and the second cell uses this function to display the image from the local camera and take a snapshot.

You should use this function instead of VideoCapture(0) to work with your local camera on the server.


btw: Below take_photo() there is also information about cv2.imshow(), because it too only works with a monitor connected directly to the computer that runs the code (and this computer has to run a GUI, like Windows on Windows or X11 on Linux). When you run it on a server, it wants to display on a monitor connected directly to the server, but a server usually runs without a monitor (without a GUI).

Google Colaboratory has a special replacement that displays in the web browser:

 from google.colab.patches import cv2_imshow

btw: if you have a problem loading the haarcascade .xml file, you may need to add the folder to the filename. cv2 has a special variable for this: cv2.data.haarcascades

import os

path = os.path.join(cv2.data.haarcascades, 'haarcascade_frontalface_default.xml')
face_cascade = cv2.CascadeClassifier(path)

You can also check what else is in this folder:

import os

filenames = os.listdir(cv2.data.haarcascades)
filenames = sorted(filenames)
print('\n'.join(filenames))

EDIT:

I created code that gets frames from the local camera in a loop, without using a button and without saving to a file. The problem is that it is slow, because it still has to send every frame from the local web browser to the Google Colab server and back to the local web browser.

Python code with JavaScript functions

#
# based on: https://colab.research.google.com/notebooks/snippets/advanced_outputs.ipynb#scrollTo=2viqYx97hPMi
#

from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode, b64encode
import numpy as np
import cv2

def init_camera():
  """Create objects and functions in HTML/JavaScript to access local web camera"""

  js = Javascript('''

    // global variables to use in both functions
    var div = null;
    var video = null;   // <video> to display stream from local webcam
    var stream = null;  // stream from local webcam
    var canvas = null;  // <canvas> for single frame from <video> and convert frame to JPG
    var img = null;     // <img> to display JPG after processing with `cv2`

    async function initCamera() {
      // place for video (and eventually buttons)
      div = document.createElement('div');
      document.body.appendChild(div);

      // <video> to display video
      video = document.createElement('video');
      video.style.display = 'block';
      div.appendChild(video);

      // get webcam stream and assign it to <video>
      stream = await navigator.mediaDevices.getUserMedia({video: true});
      video.srcObject = stream;

      // start playing stream from webcam in <video>
      await video.play();

      // Resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      // <canvas> for frame from <video>
      canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      //div.appendChild(input_canvas); // there is no need to display to get image (but you can display it for test)

      // <img> for image after processing with `cv2`
      img = document.createElement('img');
      img.width = video.videoWidth;
      img.height = video.videoHeight;
      div.appendChild(img);
    }

    async function takeImage(quality) {
      // draw frame from <video> on <canvas>
      canvas.getContext('2d').drawImage(video, 0, 0);

      // stop webcam stream
      //stream.getVideoTracks()[0].stop();

      // get data from <canvas> as JPG image decoded base64 and with header "data:image/jpg;base64,"
      return canvas.toDataURL('image/jpeg', quality);
      //return canvas.toDataURL('image/png', quality);
    }

    async function showImage(image) {
      // it needs string "data:image/jpg;base64,JPG-DATA-ENCODED-BASE64"
      // it will replace previous image in `<img src="">`
      img.src = image;
      // TODO: create <img> if it doesn't exist,
      // TODO: use `id` to use a different `<img>` for each image - like `name` in `cv2.imshow(name, image)`
    }

  ''')

  display(js)
  eval_js('initCamera()')

def take_frame(quality=0.8):
  """Get frame from web camera"""

  data = eval_js('takeImage({})'.format(quality))  # run JavaScript code to get image (JPG as string base64) from <canvas>

  header, data = data.split(',')  # split header ("data:image/jpg;base64,") and base64 data (JPG)
  data = b64decode(data)  # decode base64
  data = np.frombuffer(data, dtype=np.uint8)  # create numpy array with JPG data

  img = cv2.imdecode(data, cv2.IMREAD_UNCHANGED)  # uncompress JPG data to array of pixels

  return img

def show_frame(img, quality=0.8):
  """Put frame as <img src="data:image/jpg;base64,...."> """

  ret, data = cv2.imencode('.jpg', img)  # compress array of pixels to JPG data

  data = b64encode(data)  # encode base64
  data = data.decode()  # convert bytes to string
  data = 'data:image/jpg;base64,' + data  # join header ("data:image/jpg;base64,") and base64 data (JPG)

  eval_js('showImage("{}")'.format(data))  # run JavaScript code to put image (JPG as string base64) in <img>
                                           # argument in `showImage` needs `" "` 

And the code that uses it in a loop

# 
# based on: https://colab.research.google.com/notebooks/snippets/advanced_outputs.ipynb#scrollTo=zo9YYDL4SYZr
#

#from google.colab.patches import cv2_imshow  # I don't use it; I use my own function `show_frame()` instead

import cv2
import os

face_cascade = cv2.CascadeClassifier(os.path.join(cv2.data.haarcascades, 'haarcascade_frontalface_default.xml'))

# init JavaScript code
init_camera()

while True:
    try:
        img = take_frame()

        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        #cv2_imshow(gray)  # it creates a new image for every frame (it doesn't replace the previous image) so it is useless here
        #show_frame(gray)  # it replaces the previous image

        faces = face_cascade.detectMultiScale(gray, 1.1, 4)

        for (x, y, w, h) in faces:
            cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)

        #cv2_imshow(img)  # it creates a new image for every frame (it doesn't replace the previous image) so it is useless here
        show_frame(img)  # it replaces the previous image
        
    except Exception as err:
        print('Exception:', err)

I don't use from google.colab.patches import cv2_imshow because it always adds a new image on the page instead of replacing the existing one.


The same code as a notebook on Google Colab:

https://colab.research.google.com/drive/1j7HTapCLx7BQUBp3USiQPZkA0zBKgLM0?usp=sharing
