Vision¶
This page documents the implementation of the visual input received by the simulated fly. Note that in the typical use case, the user should not have to access most of the functions described here. Instead, the visual inputs are given as part of the observation returned by NeuroMechFly at each time step. Nonetheless, the full API reference is provided here for greater transparency.
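For orientation, this is the typical access pattern (a minimal sketch; it assumes an already-constructed simulation object sim with vision enabled and a valid action for it):

```python
# A minimal sketch, assuming `sim` is a NeuroMechFly-style simulation with
# vision enabled and `action` is a valid action for it.
obs, reward, terminated, truncated, info = sim.step(action)

vision = obs["vision"]  # shape (2, N, 2): two eyes, N ommatidia, 2 channels
left_eye, right_eye = vision[0], vision[1]
```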
Note
For the API reference of the NeuroMechFly simulation with the connectome-constrained model proposed in Lappalainen et al., 2024, see the Advanced Vision page.
Retina simulation¶
- class flygym.vision.Retina(ommatidia_id_map: ndarray | None = None, pale_type_mask: ndarray | None = None, distortion_coefficient: float | None = None, zoom: float | None = None, nrows: int | None = None, ncols: int | None = None)¶
Bases: object
This class handles the simulation of the fly’s visual input. Calculations in this class are vectorized and parallelized using Numba.
- Parameters:
- ommatidia_id_map : np.ndarray
Integer NumPy array of shape (nrows, ncols) where the value indicates the ID of the ommatidium (starting from 1). 0 indicates background (outside the hex lattice). By default, the map indicated in the configuration file is loaded.
- pale_type_mask : np.ndarray
Integer NumPy array of shape (max(ommatidia_id_map),) where the value of each element indicates whether the ommatidium is pale-type (1) or yellow-type (0). By default, the mask indicated in the configuration file is used.
- distortion_coefficient : float
A coefficient determining the extent of the fisheye effect applied to the raw MuJoCo camera images. By default, the value indicated in the configuration file is used.
- zoom : float
A coefficient determining the zoom level when the fisheye effect is applied. By default, the value indicated in the configuration file is used.
- nrows : int
The number of rows in the raw image rendered by the MuJoCo camera. By default, the value indicated in the configuration file is used.
- ncols : int
The number of columns in the raw image rendered by the MuJoCo camera. By default, the value indicated in the configuration file is used.
- Attributes:
- ommatidia_id_map : np.ndarray
Integer NumPy array of shape (nrows, ncols) where the value indicates the ID of the ommatidium (starting from 1). 0 indicates background (outside the hex lattice).
- num_pixels_per_ommatidia : np.ndarray
Integer NumPy array of shape (max(ommatidia_id_map),) where the value of each element indicates the number of raw pixels covered within each ommatidium.
- pale_type_mask : np.ndarray
Integer NumPy array of shape (max(ommatidia_id_map),) where the value of each element indicates whether the ommatidium is pale-type (1) or yellow-type (0).
- distortion_coefficient : float
A coefficient determining the extent of the fisheye effect applied to the raw MuJoCo camera images.
- zoom : float
A coefficient determining the zoom level when the fisheye effect is applied.
- nrows : int
The number of rows in the raw image rendered by the MuJoCo camera.
- ncols : int
The number of columns in the raw image rendered by the MuJoCo camera.
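For illustration, a Retina can be constructed with all defaults taken from the configuration file (a minimal sketch; the exact values printed depend on the bundled configuration):

```python
from flygym.vision import Retina

# All parameters default to the values indicated in the configuration file
retina = Retina()

print(retina.nrows, retina.ncols)             # raw MuJoCo camera image size
print(retina.ommatidia_id_map.shape)          # (nrows, ncols)
print(retina.num_pixels_per_ommatidia.shape)  # (num_ommatidia,)
```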
- correct_fisheye(img: ndarray) → ndarray¶
The raw image rendered by the MuJoCo camera is rectilinear. This distorts the image and overrepresents the periphery of the field of view (the same angle near the periphery is reflected by a greater angle in the rendered image). This method applies a fisheye effect so that the same angle is represented roughly equally anywhere within the field of view.
- Parameters:
- img : np.ndarray
The raw MuJoCo camera rendering as a NumPy array of shape (nrows, ncols, 3).
- Returns:
- np.ndarray
The corrected camera rendering as a NumPy array of shape (nrows, ncols, 3).
Notes
This implementation is based on https://github.com/Gil-Mor/iFish, MIT License.
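For example (a sketch using a blank placeholder array in place of an actual MuJoCo rendering):

```python
import numpy as np

from flygym.vision import Retina

retina = Retina()

# Placeholder for a raw, rectilinear rendering from one eye camera
raw_img = np.zeros((retina.nrows, retina.ncols, 3), dtype=np.uint8)

corrected_img = retina.correct_fisheye(raw_img)
assert corrected_img.shape == (retina.nrows, retina.ncols, 3)
```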
- hex_pxls_to_human_readable(ommatidia_reading: ndarray, color_8bit=False) → ndarray¶
Given the intensity readings for all ommatidia in one eye, convert them to an (nrows, ncols) image with hexagonal blocks that can be visualized as a human-readable image.
- Parameters:
- ommatidia_reading : np.ndarray
Our simulation of what the fly might see through its compound eyes. It is a (N,) or (N, …) array where the first dimension is for the number of ommatidia.
- color_8bit : bool
If True, the returned image will be in 8-bit color. This speeds up rendering. Otherwise, the image will be in the same data type as the input ommatidia_reading.
- Returns:
- np.ndarray
An (nrows, ncols, …) image with hexagonal blocks that can be visualized as a human-readable image. The shape after the 0th dimension matches that of the input ommatidia_reading.
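For example, per-ommatidium readings can be mapped back onto the hexagonal lattice for inspection (a sketch using random intensities; the number of ommatidia is taken from the pale-type mask):

```python
import numpy as np

from flygym.vision import Retina

retina = Retina()
num_ommatidia = len(retina.pale_type_mask)

# Hypothetical two-channel intensity readings in [0, 1]
readings = np.random.rand(num_ommatidia, 2)

viewable = retina.hex_pxls_to_human_readable(readings)
# viewable has shape (nrows, ncols, 2), matching the trailing shape of the input
```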
- raw_image_to_hex_pxls(raw_img: ndarray) → ndarray¶
Given a raw image from an eye (one camera), simulate what the fly would see.
- Parameters:
- raw_img : np.ndarray
RGB image with the shape (H, W, 3) returned by the camera.
- Returns:
- np.ndarray
Our simulation of what the fly might see through its compound eyes. It is a (N, 2) array where the first dimension is for the N ommatidia, and the second dimension is for the two channels.
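Put together, the two methods above take a raw camera rendering to simulated ommatidia readings (a sketch; it assumes the fisheye correction should be applied before the hex-pixel conversion):

```python
import numpy as np

from flygym.vision import Retina

retina = Retina()

# Placeholder for a raw rendering from one eye camera
raw_img = np.zeros((retina.nrows, retina.ncols, 3), dtype=np.uint8)

# Assumption: correct the fisheye distortion first, then convert to readings
readings = retina.raw_image_to_hex_pxls(retina.correct_fisheye(raw_img))
# readings has shape (N, 2): N ommatidia, 2 channels
```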
Note that sometimes it is helpful to hide certain objects in the arena when rendering the fly’s vision. For example, markers for odor sources that are meant for user visualization only should not be seen by the fly. To accomplish this, we have provided two hook methods in BaseArena that allow the user to modify the arena as needed before and after the fly’s vision is simulated (for example, changing the alpha value of the odor source markers, as sketched after the list below):
- BaseArena.pre_visual_render_hook(physics: dm_control.mjcf.Physics, *args, **kwargs) → None
Make necessary changes (e.g. make certain visualization markers transparent) before rendering the visual inputs. By default, this does nothing.
- BaseArena.post_visual_render_hook(physics: dm_control.mjcf.Physics, *args, **kwargs) → None
Make necessary changes (e.g. make certain visualization markers opaque) after rendering the visual inputs. By default, this does nothing.
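As a sketch of how these hooks might be overridden (ArenaWithHiddenMarkers and its _marker_geoms attribute are hypothetical; FlatTerrain is used only as a convenient concrete base):

```python
import numpy as np

from flygym.arena import FlatTerrain


class ArenaWithHiddenMarkers(FlatTerrain):
    """Hypothetical arena whose marker geoms are hidden from the fly's eyes."""

    def __init__(self, marker_geoms=(), **kwargs):
        super().__init__(**kwargs)
        self._marker_geoms = list(marker_geoms)  # dm_control MJCF geom elements

    def pre_visual_render_hook(self, physics, *args, **kwargs):
        # Make the markers fully transparent before the fly's vision is rendered
        for geom in self._marker_geoms:
            physics.bind(geom).rgba = np.array([*physics.bind(geom).rgba[:3], 0])

    def post_visual_render_hook(self, physics, *args, **kwargs):
        # Restore the markers' opacity for user-facing renders
        for geom in self._marker_geoms:
            physics.bind(geom).rgba = np.array([*physics.bind(geom).rgba[:3], 1])
```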
Visualization tool¶
We have also provided a utility function to generate a video of the visual input during a simulation:
- flygym.vision.visualize_visual_input(retina: Retina, output_path: Path, vision_data_li: list[ndarray], raw_vision_data_li: list[ndarray], vision_update_mask: ndarray, vision_refresh_rate: float = 500, playback_speed: float = 0.1)¶
Convert lists of vision readings into a video and save it to disk.
- Parameters:
- retina : Retina
The retina object used to generate the visual input.
- output_path : Path
Path to which the output video will be saved. Should end with “.mp4”.
- vision_data_li : list[np.ndarray]
List of ommatidia readings. Each element is an array of shape (2, N, 2) where the first dimension is for the left and right eyes, the second dimension is for the N ommatidia, and the third dimension is for the two channels. The length of this list is the number of simulation steps.
- raw_vision_data_li : list[np.ndarray]
Same as vision_data_li but with the raw RGB images from the cameras instead of the simulated ommatidia readings. The shape of each element is therefore (2, H, W, 3) where the first dimension is for the left and right eyes, and the remaining dimensions are for the RGB image.
- vision_update_mask : np.ndarray
Mask indicating which simulation steps have vision updates. This should be taken from NeuroMechFly.vision_update_mask.
- vision_refresh_rate : float, optional
The refresh rate of visual inputs in Hz. This should be consistent with the MuJoCoParameters.vision_refresh_rate given to the simulation. By default 500.
- playback_speed : float, optional
Speed, as a multiple of the 1x speed, at which the video should be rendered. By default 0.1.
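A usage sketch (the nmf simulation object, its retina and vision_update_mask attributes, and the data-collection loop are assumptions about the surrounding setup):

```python
from pathlib import Path

from flygym.vision import visualize_visual_input

# Assumed to be collected during the simulation loop, e.g.
#   vision_data_li.append(obs["vision"].copy())
#   raw_vision_data_li.append(obs["raw_vision"].copy())
visualize_visual_input(
    retina=nmf.retina,
    output_path=Path("outputs/vision.mp4"),
    vision_data_li=vision_data_li,
    raw_vision_data_li=raw_vision_data_li,
    vision_update_mask=nmf.vision_update_mask,
    vision_refresh_rate=500,
    playback_speed=0.1,
)
```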
- flygym.vision.add_insets(retina, viz_frame, visual_input, panel_height=150)¶
Add insets to the visualization frame.
- Parameters:
- retina : Retina
The retina object used to generate the visual input.
- viz_frame : np.ndarray
The visualization frame to add insets to.
- visual_input : np.ndarray
The visual input to the retina. Should be of shape (2, N, 2) as returned in the observation of the environment (obs["vision"]).
- panel_height : int, optional
Height of the panel that contains the insets. By default 150.
- Returns:
- np.ndarray
The visualization frame with insets added.
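For example (a sketch; viz_frame would typically be a frame rendered by the simulation camera and obs["vision"] would come from the observation):

```python
from flygym.vision import add_insets

# viz_frame: (H, W, 3) rendered frame; obs["vision"]: (2, N, 2) ommatidia readings
frame_with_insets = add_insets(retina, viz_frame, obs["vision"], panel_height=150)
```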
- flygym.vision.save_video_with_vision_insets(sim, cam, path, visual_input_hist, stabilization_time=0.02)¶
Save a list of frames as a video with insets showing the visual experience of the fly. This is almost a drop-in replacement for NeuroMechFly.save_video, but as a static function (instead of a class method) and with an extra argument visual_input_hist.
- Parameters:
- sim : Simulation
The Simulation object.
- cam : Camera
The Camera object that has been used to generate the frames.
- path : Path
Path to which the output video will be saved. Should end with “.mp4”.
- visual_input_hist : list[np.ndarray]
List of ommatidia readings. Each element is an array of shape (2, N, 2) where N is the number of ommatidia per eye.
- stabilization_time : float, optional
Time (in seconds) to wait before starting to render the video. This may be desirable because it takes a few frames for the position controller to move the joints from the default, all-stretched position to the specified angles. By default 0.02 s.
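A usage sketch (sim, cam, and the per-frame collection of visual_input_hist are assumptions about the surrounding simulation loop):

```python
from pathlib import Path

from flygym.vision import save_video_with_vision_insets

# Assumed to be collected during the loop: one (2, N, 2) array appended per
# rendered frame, e.g. visual_input_hist.append(obs["vision"].copy())
save_video_with_vision_insets(
    sim, cam, Path("outputs/vision_video.mp4"), visual_input_hist,
    stabilization_time=0.02,
)
```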