Can someone give me a tip on what to research for predicting human pose with computer vision using the OpenVINO toolkit? I don't have much experience here, so any tips are appreciated, even at a high level, on what I need to learn or research.

I preprocess an image with:

import cv2
import numpy as np

def preprocessing(input_image, height, width):
    '''
    Given an input image, height and width:
    - Resize to width and height
    - Transpose the final "channel" dimension to be first
    - Reshape the image to add a "batch" of 1 at the start 
    '''
    image = np.copy(input_image)
    image = cv2.resize(image, (width, height))
    image = image.transpose((2,0,1))
    image = image.reshape(1, 3, height, width)

    return image
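
The layout change done by the transpose and reshape steps can be checked in isolation with NumPy alone. This is a minimal sketch that uses a dummy array in place of a real resized image; the 256x456 size is an assumption and should be replaced with whatever input shape the model actually reports.

```python
import numpy as np

# Assumed network input size; check the model's input shape before relying on it.
height, width = 256, 456

# Stand-in for a resized image in height x width x channels (HWC) layout.
dummy = np.zeros((height, width, 3), dtype=np.uint8)
dummy[5, 7] = (10, 20, 30)  # one marked pixel: B, G, R values

# Move the channel axis first (HWC -> CHW), then add a batch axis of 1.
chw = dummy.transpose((2, 0, 1))
batch = chw[np.newaxis, ...]

print(batch.shape)  # (1, 3, 256, 456)
# The marked pixel is still at row 5, column 7, now spread across 3 channels.
print(batch[0, :, 5, 7])
```

Adding the batch axis with `np.newaxis` gives the same result as `reshape(1, 3, height, width)` here, but it does not silently succeed if the channel count is not 3.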

and run inference with one of the OpenVINO toolkit models:

# Run inference.
predicted_poses = compiled_pose_model([image])[compiled_pose_model.output(0)]

This returns a NumPy array, but I get lost on what the next steps are to turn an image:

[image: input image]

into this:

[image: desired output]

This is the code I am working with. Any tips appreciated.

EDIT

I'm working with the Open Model Zoo (part of Intel's OpenVINO project), and the model I am working with is human-pose-estimation-0001.
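
One possible next step, sketched under assumptions: models of this kind typically emit per-keypoint heatmaps, and the simplest way to read them is to take the strongest peak in each channel and scale it back to image coordinates. The function `keypoints_from_heatmaps` below is a hypothetical helper, not part of OpenVINO, and the shape `(1, 19, 32, 57)` with a trailing background channel is an assumption to verify against the model documentation. It handles one person only; grouping keypoints for several people (which the linked notebook does with part affinity fields) is not covered, and `compiled_pose_model.output(0)` may not be the heatmap output at all.

```python
import numpy as np

def keypoints_from_heatmaps(heatmaps, image_height, image_width, threshold=0.1):
    """Hypothetical helper: strongest peak per channel (single-person only).

    heatmaps: array of shape (num_keypoints, H, W).
    Returns (num_keypoints, 3) rows of x, y (image pixels) and peak score.
    Keypoints scoring below `threshold` get NaN coordinates.
    """
    num_keypoints, hm_h, hm_w = heatmaps.shape
    flat = heatmaps.reshape(num_keypoints, -1)
    peak_index = flat.argmax(axis=1)
    scores = flat[np.arange(num_keypoints), peak_index]
    ys, xs = np.unravel_index(peak_index, (hm_h, hm_w))

    # Map the center of each heatmap cell back to image pixel coordinates.
    x_img = (xs + 0.5) * image_width / hm_w
    y_img = (ys + 0.5) * image_height / hm_h

    keypoints = np.stack([x_img, y_img, scores], axis=1).astype(np.float32)
    keypoints[scores < threshold, :2] = np.nan
    return keypoints

# Synthetic stand-in for a model output; the shape is an assumption.
heatmaps = np.zeros((1, 19, 32, 57), dtype=np.float32)
heatmaps[0, 0, 10, 20] = 0.9  # fake peak for keypoint 0 at row 10, column 20

# Assumed: the last channel is background, so it is dropped here.
keypoints = keypoints_from_heatmaps(heatmaps[0, :18], image_height=256, image_width=456)
print(keypoints[0])  # x, y, score for keypoint 0
```

The resulting `(x, y)` pairs could then be drawn on the original frame, for example with `cv2.circle` for joints and `cv2.line` for limbs, once the keypoint order of the model is known.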

bbartling
  • It helps to know what exactly your model is. Specifically, what does it produce as output? Is it a probability map, is it a list of bounding boxes, what is it? – Minuano Oct 11 '22 at 06:09
  • I added an EDIT and am trying to follow this example: https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/402-pose-estimation-webcam/402-pose-estimation.ipynb – bbartling Oct 11 '22 at 08:37

0 Answers