Motion Capture System
Motion Capture System (MoCap for short) refers to a system that captures the motion (movement) of people or objects.
It often consists of many optical cameras that emit and receive infrared light (the emitters are placed around each camera). By placing reflective markers on an object, the MoCap system can uniquely identify and record the pose of the object at a very high frequency (~100 Hz to 400 Hz).
Passive markers (ball-shaped markers that reflect infrared light) are most often used. However, there is another type, active markers, which emit infrared light on their own.
Active markers require a power supply from the robot, and are hence suitable for mid-to-large-sized robots. They are often used together with a synchronization module, which syncs the emitting moment of the active markers with the camera shutters of the MoCap system.
At least 3 markers must be rigidly attached to an object in order to compute a valid 3D pose. For redundancy, we often put 4 or 5 markers on the object and make sure that, at every moment of the motion, at least 3 markers are visible to cameras from different positions.
The markers must not be placed in a degenerate arrangement (e.g., co-planar or collinear), and should form an asymmetric pattern. Only then can a unique pose be computed.
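To make this concrete, the following is a minimal sketch of how a full pose can be recovered from 3 or more matched marker positions via a least-squares rigid fit (the Kabsch algorithm). This illustrates the underlying math only; it is not the vendor's actual implementation.

```python
import numpy as np

def pose_from_markers(body_pts, measured_pts):
    """Least-squares rigid transform (R, t) mapping the known marker
    layout in the body frame onto the positions measured by the MoCap.
    body_pts, measured_pts: (N, 3) arrays, N >= 3, rows in correspondence.
    """
    cb, cm = body_pts.mean(axis=0), measured_pts.mean(axis=0)
    H = (body_pts - cb).T @ (measured_pts - cm)  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # sign fix guarantees a proper rotation (det(R) = +1, no reflection)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cm - R @ cb
    return R, t
```

Note that if the markers are collinear, H is rank-deficient and the rotation is not unique, which is exactly the degeneracy warned about above.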
Make sure the relative positions between markers are rigid and that the arrangement does not deform as the object moves; otherwise, tracking errors can result.
In the above example, the three markers on the side of the propellers form a symmetric pattern, which may cause ambiguity between the bottom side and the right side (in terms of the viewpoint of this image). In other words, cameras viewing from the bottom may falsely conclude that they are viewing from the right.
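As a quick sanity check before experiments, one could script something like the following heuristic (a sketch, not part of any vendor software): it flags collinear layouts as well as repeated pairwise distances between markers, the latter being a common symptom of symmetric arrangements.

```python
import numpy as np
from itertools import combinations

def check_marker_layout(markers, tol=1e-3):
    """markers: (N, 3) marker positions in the body frame.
    Returns (is_collinear, has_repeated_distances)."""
    centered = markers - markers.mean(axis=0)
    # rank < 2 means all markers lie on one line, so the rotation
    # about that line is unobservable
    is_collinear = np.linalg.matrix_rank(centered, tol=tol) < 2
    dists = sorted(np.linalg.norm(markers[i] - markers[j])
                   for i, j in combinations(range(len(markers)), 2))
    has_repeated = any(b - a < tol for a, b in zip(dists, dists[1:]))
    return is_collinear, has_repeated
```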
It is possible to track the robot pose with fewer than 3 markers. If only the 3D position (without orientation) is needed, tracking with a single marker is possible.
This can be achieved by accessing the raw position data of each marker and applying the Iterative Closest Point algorithm on the user end.
This is often used in robot swarms, where a unique marker arrangement for each robot becomes hard to find as the number of robots increases; robots with similar marker arrangements can confuse the MoCap system.
It is still true that 3 markers are the minimum requirement to form an object and compute a full 6-DoF pose in the MoCap software. The approach above, however, does not rely on the object poses computed by the MoCap software; it directly accesses the raw marker positions and applies a user-defined algorithm to track the objects.
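For illustration, a minimal per-frame tracker based on greedy nearest-neighbor association might look like the sketch below. It assumes one marker per robot and that the raw marker positions are already available as numpy arrays; max_jump is a hypothetical tuning parameter bounding how far a robot can move between frames.

```python
import numpy as np

def track_swarm(prev_positions, raw_markers, max_jump=0.05):
    """prev_positions: (N, 3) last known position of each robot;
    raw_markers: (M, 3) unlabeled marker positions in the current frame;
    max_jump: largest plausible per-frame motion in meters.
    Returns an (N, 3) array of updated positions (NaN rows if lost)."""
    updated = np.full(prev_positions.shape, np.nan)
    if len(raw_markers) == 0:
        return updated
    claimed = set()
    for i, p in enumerate(prev_positions):
        d = np.linalg.norm(raw_markers - p, axis=1)
        d[list(claimed)] = np.inf  # each marker can serve only one robot
        j = int(np.argmin(d))
        if d[j] < max_jump:
            updated[i] = raw_markers[j]
            claimed.add(j)
    return updated
```

Running this at the MoCap frame rate preserves each robot's identity as long as robots do not pass within max_jump of one another between frames.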
A MoCap system requires at least two networks. One is the camera network (local, typically shown as an unidentified network, with no connection to the Internet), and the other is the user (client) network. An example network setup of a MoCap system is the following.
The data flow in this setup is as follows: each camera captures markers/objects and sends the data to the MoCap PC via a switch on the camera network; the PC runs the MoCap software to process the data, compute object poses, and send them to the user end via Ethernet or WiFi on the user network.
On the user PC, we often run a ROS package that processes the incoming data, converts it into ROS message formats, and publishes ROS topics accordingly.
In our college, the MoCap PC is a machine managed by the IT department, and therefore this PC must have a connection to the BCOE engineering network at all times. In this case, the MoCap PC connects to 3 networks at the same time. Since most PCs do not ship with 3 network cards by default, a network card must be purchased separately and installed manually. (For example, we use the Intel I210-T1 network adapter.)
This also means you need a BCOE engineering account in order to log in to the MoCap machine. (Every BCOE student should have one; please work with the systems staff if you don't remember yours.)
System calibration is needed in the following cases: 1) on first use, or after the system has not been used for a long time; 2) whenever the relative poses between cameras change (e.g., someone accidentally kicked a tripod or twisted a camera).
After a successful calibration, the parameters can be saved and reused in the following days, until you feel recalibration is necessary or you require higher accuracy.
This step calibrates (computes) the relative poses between all pairs of cameras.
To do this, a person holds a calibration wand (with a marker arrangement predefined by the manufacturer) and walks around the field, waving the wand to cover as much of each camera's field of view as possible.
In the software, the trajectories of the wand are recorded and visualized during the calibration process, so you can see whether the past trajectories have covered most of the area in each camera's view.
Note: Avoid wearing clothes or items (watch, glasses, phone) with reflective materials. You may walk around in the MoCap volume and check whether any part of your body is falsely identified as a marker.
The basic steps are: 1) in the MoCap software, click start calibration; 2) have a person wave the wand in the volume; 3) in the MoCap software, click stop calibration (or compute poses).
The theory behind this is triangulation-based localization and least-squares estimation.
In the calibration step, the relative distances between the markers on the calibration tool are known, whereas the relative poses between cameras are the variables to be estimated by a least-squares solution.
In normal use, the relative poses between cameras are known, and the poses of objects are the variables to be estimated by minimizing the least-squares re-projection errors across the multiple cameras seeing the same object.
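As a concrete instance of the normal-use direction, below is a minimal linear triangulation (DLT) sketch that recovers one marker's 3D position from several calibrated views. The 3x4 projection matrices are assumed to be known from the calibration above; this is the textbook method, not necessarily the vendor's exact solver.

```python
import numpy as np

def triangulate(proj_mats, pixels):
    """proj_mats: list of 3x4 camera projection matrices (known after
    calibration); pixels: list of (u, v) observations of the same
    marker in the corresponding camera images."""
    rows = []
    for P, (u, v) in zip(proj_mats, pixels):
        # each view adds two linear constraints on the homogeneous
        # point X, derived from u = (P[0] @ X) / (P[2] @ X), etc.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.vstack(rows)
    # the least-squares solution is the right singular vector for the
    # smallest singular value; dehomogenize to get (x, y, z)
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

With more than two cameras the system is overdetermined, and the SVD yields the (algebraic) least-squares solution; the more cameras that see a marker, the more stable the estimate.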
Extrinsic parameters constrain the relative poses between cameras, but do not specify the origin of the coordinate system (without an origin being set, the cameras are free-floating objects in 3D space). Setting the ground plane tells the cameras where the origin of the MoCap volume is, adding one more constraint to all cameras.
This is done by placing another calibration tool (with a marker arrangement predefined by the manufacturer, often in a triangular shape) at the center of the ground of the MoCap volume and clicking a single button in the MoCap software.
There are many MoCap system providers in the world. The leading companies in the US are VICON and OptiTrack, which are the two we most often see on campus.
Both provide specialized software running on Windows on the MoCap PC, where we can create objects from selected markers, perform system calibration, record marker/object trajectories, and configure data streaming. The appearance of the software differs, but the basic functionality is the same.
On the user end, we use the corresponding ROS packages to convert the raw, low-level TCP/IP data packets into ROS topics.
VICON does not provide an official ROS package. We often use one of the following two packages:
https://github.com/ethz-asl/vicon_bridge (pose published in geometry_msgs::TransformStamped format, without velocity estimation)
https://github.com/KumarRobotics/vicon (pose published in nav_msgs::Odometry format, with velocity estimation)
In both ROS packages, you only need to specify the name of the object you created in the VICON software on the MoCap PC.
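For example, a minimal user-side node consuming vicon_bridge output could look like the following sketch (ROS 1 / rospy). The object and segment names in the topic are placeholders; vicon_bridge publishes under vicon/<object_name>/<segment_name>.

```python
#!/usr/bin/env python
import rospy
from geometry_msgs.msg import TransformStamped

def callback(msg):
    # print the translation part of the tracked object's pose
    t = msg.transform.translation
    rospy.loginfo("x=%.3f y=%.3f z=%.3f", t.x, t.y, t.z)

if __name__ == "__main__":
    rospy.init_node("vicon_listener")
    # "quad/quad" is a placeholder for <object_name>/<segment_name>
    rospy.Subscriber("vicon/quad/quad", TransformStamped, callback)
    rospy.spin()
```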
OptiTrack has an official ROS package available: https://github.com/ros-drivers/mocap_optitrack
You need to configure a yaml file in the ROS package, similar to the following, where you specify the object ID (a unique integer), which pose information to publish, and some settings for the software on the MoCap PC (e.g., the network IP address).
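A minimal sketch, modeled on the sample config shipped with mocap_optitrack (the object ID, topic names, and frame names below are placeholders to adapt to your setup, and the multicast address must match the streaming settings on the MoCap PC):

```yaml
rigid_bodies:
  '1':                              # object ID assigned in the MoCap software
    pose: robot1/pose               # PoseStamped topic to publish
    pose2d: robot1/ground_pose      # projected 2D pose topic
    child_frame_id: robot1/base_link
    parent_frame_id: world
optitrack_config:
  multicast_address: 239.255.42.99  # must match the MoCap PC streaming settings
```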
Regarding spikes in the plotted pose curve: it is likely that VICON lost track for a moment during the "spikes". To debug this, you may enable recording (in the VICON software) during experiments, then check the playback (animation) to see whether all markers are stable from beginning to end (visible, continuous, not shown as ghost markers, etc.).
If the markers are stable all the time (less likely), then check the ROS package that uses the VICON data for pose estimation.
If it is confirmed that the markers are not stable enough, try the following fixes.
Double-check the marker configuration, and see whether visibility, rigidity, or the asymmetry of the pattern could be lost during experiments.
Switch each camera to grayscale mode and adjust its parameters (focal length, exposure, circle-filtering algorithm, etc.) to make sure each marker is seen clearly, at a proper size, and without overlapping any other marker at all times.
Double-check the valid capture volume and whether the quadrotor has traveled to any spot outside this volume (minimum requirement: seen by 4 cameras). This can be tested by manually moving a few markers around and checking the playback. It is also possible to fine-tune the field of view of each camera (i.e., change the direction/angle each camera faces) to enlarge this volume.