Mission

Project Name: Custom Depth Estimation System

- This mission is an individual project.
- Create a custom depth estimation program that uses the zetabot camera.
- Carry out the following mission on your own computer.

Writing Custom DepthNet Program

As we did in our team assignment, create a new Python file and name it 02_8-2. depth_camera.py.

Create a new python file in the Jupyter Notebook Environment:

  • Press the blue plus button in the top left corner of the web page.


  • Create a new python file by pressing the Python File button


  • Rename the untitled Python file to 02_8-2. depth_camera.py

  • In the new Python file, import the necessary libraries. For our depth task, we need modules from the Jetson inference and Jetson utility libraries:

    • argparse: This library reads and initializes the flags or parameters set by the user when invoking the program.

    • sys: This library lets us use system functions within our Python programs.

    • jetson_inference: This library contains the pre-built networks that can be used for inference tasks, as well as functions that allow custom models to be used for inference.

      • depthNet: We are importing depthNet module for our depth task.

    • jetson_utils: This library contains modules that are responsible for processing input and output sources along with output stream methods. We will be importing the following modules:

      • videoSource: used to process input source (whether it is a camera, an image, or a video).

      • videoOutput: used to process the output stream.

      • cudaOverlay: overlays one image onto another, which we use to build the output image.

      • cudaDeviceSynchronize: waits until pending CUDA work on the device has finished.

    • depth_utils: This helper library provides depthBuffers, which manages the buffers used for the depth task.

    import argparse
    import sys
    
    from jetson_inference import depthNet
    from jetson_utils import videoSource, videoOutput, cudaOverlay, cudaDeviceSynchronize
    
    from depth_utils import depthBuffers
    
  • After all the libraries are imported, create the parser variable with argparse.ArgumentParser.

    For our mission, we must receive the network name and the camera input name. Additionally, we add flags for minor functionality.

    # Parse the command line.
    # For our mission, we receive the network name and the camera name.
    # Set up the argument parser so that command line parameters can be read within the program.
    parser = argparse.ArgumentParser(description="Mono depth estimation on a video/image stream using depthNet DNN.",
                             formatter_class=argparse.RawTextHelpFormatter,
                             epilog=depthNet.Usage() + videoSource.Usage() + videoOutput.Usage())
    
    # Major functionality parameters (normally set by the user; defaults apply if omitted)
    parser.add_argument("input_CAMERA", type=str, default="", nargs='?', help="use csi://0 for Raspberry Pi camera")
    parser.add_argument("--network", type=str, default="fcn-mobilenet", help="pre-trained model to load, see below for options")
    
    # Minor functionality parameters (optional)
    parser.add_argument("--visualize", type=str, default="input,depth", help="visualization options (can be 'input', 'depth', or 'input,depth')")
    parser.add_argument("--depth-size", type=float, default=1.0, help="scales the size of the depth map visualization, as a percentage of the input size (default is 1.0)")
    parser.add_argument("--filter-mode", type=str, default="linear", choices=["point", "linear"], help="filtering mode used during visualization, options are:\n  'point' or 'linear' (default: 'linear')")
    parser.add_argument("--colormap", type=str, default="viridis-inverted", help="colormap to use for visualization (default is 'viridis-inverted')",
                              choices=["inferno", "inferno-inverted", "magma", "magma-inverted", "parula", "parula-inverted",
                                       "plasma", "plasma-inverted", "turbo", "turbo-inverted", "viridis", "viridis-inverted"])
    
  • Parse the command line into the opt variable, a namespace object that holds every flag as an attribute (for example, opt.network). If the arguments cannot be parsed, print the help text and terminate the program:

    # If the command line cannot be parsed, print the help text and exit
    try:
        opt = parser.parse_known_args()[0]
    except:
        print("")
        parser.print_help()
        sys.exit(0)
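
The behavior of parse_known_args() can be checked without a Jetson device. The sketch below rebuilds only two of the arguments defined above using the standard argparse module and passes an explicit argument list instead of reading sys.argv. The name demo_parser and the sample argument list are illustrative assumptions. It shows that recognized arguments become attributes of the returned namespace, while unrecognized flags such as --input-width are returned separately instead of causing an error.

```python
import argparse

# Illustrative parser with two of the arguments defined above.
demo_parser = argparse.ArgumentParser()
demo_parser.add_argument("input_CAMERA", type=str, default="", nargs="?")
demo_parser.add_argument("--network", type=str, default="fcn-mobilenet")

# parse_known_args() returns (namespace, list_of_unrecognized_arguments).
opt, unknown = demo_parser.parse_known_args(["csi://0", "--input-width=640"])

print(opt.input_CAMERA)   # csi://0
print(opt.network)        # fcn-mobilenet (default, because the flag was not given)
print(unknown)            # ['--input-width=640']

# With no arguments at all, the optional positional falls back to its default.
empty_opt, _ = demo_parser.parse_known_args([])
print(repr(empty_opt.input_CAMERA))   # ''
```

Taking element [0] of the result, as the program does, keeps only the namespace. The unrecognized flags are still present in sys.argv, which is presumably why sys.argv is also handed to the video source and output objects.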
    
  • Initialize the necessary variables. Since we wish to run inference on the camera stream and show the results on our output stream, we will need:

    1. net variable for holding the NVIDIA pre-built network. For this mission we are using the fcn-mobilenet network.

    2. input variable for handling the input stream. Using the opt variable created in the previous step, we pass opt.input_CAMERA to videoSource.

    3. output variable for handling the output stream. Although we access the code from a remote computer, the zetabot is equipped with a touch screen display. The output is sent to DISPLAY://0.

    4. buffers variable for managing the image buffers.

    # load the depth network
    net = depthNet(opt.network, sys.argv)
    
    # create buffer manager
    buffers = depthBuffers(opt)
    
    # create video sources & outputs
    input = videoSource(opt.input_CAMERA, argv=sys.argv)
    output = videoOutput("DISPLAY://0", argv=sys.argv)
    
  • For this task we are using our camera. In our previous trials, we ran inference on a single image: the program received one image, ran it through the network, and output a single result.

    With a camera, we need to run the inference repeatedly so that we capture each incoming frame and output a continuous stream of results.

    • We can achieve this by running a while loop until the output stream window is closed by the user.

      # process frames until the user exits
      while output.IsStreaming():
      
    • Within the while loop:

      • Capture the current frame from the camera, allocate buffers that match the frame size, and run the frame through the trained model.

        # capture the next image
        img_input = input.Capture()
        
        if img_input is None: # timeout
            continue
        
        # allocate buffers for this size image
        buffers.Alloc(img_input.shape, img_input.format)
        
        # process the mono depth and visualize
        net.Process(img_input, buffers.depth, opt.colormap, opt.filter_mode)
        
      • Overlay the input and depth images onto the composite image if they are enabled in the buffers object (buffers.use_input and buffers.use_depth).

        # composite the images
        if buffers.use_input:
            cudaOverlay(img_input, buffers.composite, 0, 0)
        
        if buffers.use_depth:
            cudaOverlay(buffers.depth, buffers.composite, img_input.width if buffers.use_input else 0, 0)
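
The x offset passed to cudaOverlay decides where each image lands in the composite. The following sketch reproduces that placement with plain integers so it can be run anywhere. composite_layout is a hypothetical helper, and the side-by-side sizing rule (widths added, heights taken as the maximum) is an assumption about how depthBuffers sizes the composite, not something this document confirms.

```python
def composite_layout(input_width, input_height, visualize="input,depth", depth_size=1.0):
    """Return (composite_width, composite_height, input_x, depth_x).

    Hypothetical helper: an offset is None when that image is not shown.
    """
    use_input = "input" in visualize
    use_depth = "depth" in visualize

    # The depth visualization is scaled relative to the input size.
    depth_width = int(input_width * depth_size)
    depth_height = int(input_height * depth_size)

    width, height = 0, 0
    input_x = depth_x = None

    if use_input:
        input_x = 0
        width += input_width
        height = max(height, input_height)

    if use_depth:
        # Same rule as the cudaOverlay call: right of the input if it is shown.
        depth_x = input_width if use_input else 0
        width += depth_width
        height = max(height, depth_height)

    return width, height, input_x, depth_x


print(composite_layout(640, 360))                      # (1280, 360, 0, 640)
print(composite_layout(640, 360, visualize="depth"))   # (640, 360, None, 0)
```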
        
      • Render the result output and update the title bar of the output window.

        # render the output image
        output.Render(buffers.composite)
        
        # update the title bar
        output.SetStatus("{:s} | Network {:.0f} FPS".format(opt.network, net.GetNetworkFPS()))
        
        # exit on input/output EOS
        if not input.IsStreaming() or not output.IsStreaming():
            break
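
The control flow of the loop can be tried without a camera. In this sketch, FakeSource, FakeOutput, and run_loop are hypothetical stand-ins and are not part of jetson_utils. The fake source returns None once to imitate a capture timeout and stops streaming when its frames run out. It only illustrates the skip-on-timeout and exit-on-end-of-stream logic; no inference is performed.

```python
class FakeSource:
    """Hypothetical stand-in for a camera: hands out queued frames, then stops."""

    def __init__(self, frames):
        self._frames = list(frames)

    def Capture(self):
        # A queued None imitates a capture timeout.
        return self._frames.pop(0) if self._frames else None

    def IsStreaming(self):
        return bool(self._frames)


class FakeOutput:
    """Hypothetical stand-in for an output window that is never closed."""

    def IsStreaming(self):
        return True


def run_loop(source, output):
    processed = []
    while output.IsStreaming():
        frame = source.Capture()
        if frame is None:   # timeout: skip this iteration and try again
            continue
        processed.append(frame)   # inference and rendering would happen here
        # exit on input/output end of stream
        if not source.IsStreaming() or not output.IsStreaming():
            break
    return processed


print(run_loop(FakeSource(["frame0", None, "frame1"]), FakeOutput()))
# ['frame0', 'frame1']
```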
        

Executing the Custom Program

  • Open the 02_8-3. depth_camera.ipynb notebook.


  • Run the cell that sets up the display for the output stream and defines the CAMERA environment variable, which makes the program use the camera stream as its input.

    %env DISPLAY=:0
    %env csi=:0
    %env CAMERA=csi://0
    
  • Check if your python notebook can read the python code you have written:

    !cat /home/zeta/notebook/lecture/'2.AI Training Examples'/'02_8-2. depth_camera.py'
    
  • Execute the depth_camera Python code.

    Note the major parameters of the program:

    • --network: selects the network used for the depth task. It is not passed in the command below, so the default fcn-mobilenet is used.

      • You may change the pre-trained network to one of the previously discussed networks.

    • input_CAMERA: selects the input stream used for the task. It is set to the CAMERA environment variable, which holds the string csi://0.

    !python3 /home/zeta/notebook/lecture/'2.AI Training Examples'/'02_8-2. depth_camera.py' $CAMERA --input-width=640 --input-height=360
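
Because the script path contains spaces, the quoting in the command matters. As a rough check, Python's standard shlex module splits a command line using POSIX shell quoting rules. The sketch below uses a shortened form of the command with $CAMERA already replaced by csi://0 (shlex does not expand environment variables), and shows that the quoted path segments are joined into a single argument.

```python
import shlex

# Shortened command: script path plus the camera URI only.
command = ("python3 /home/zeta/notebook/lecture/'2.AI Training Examples'/"
           "'02_8-2. depth_camera.py' csi://0")

# Split the command the way a POSIX shell would, honoring the quotes.
argv = shlex.split(command)

for index, argument in enumerate(argv):
    print(index, repr(argument))
# 0 'python3'
# 1 '/home/zeta/notebook/lecture/2.AI Training Examples/02_8-2. depth_camera.py'
# 2 'csi://0'
```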