visionsim.dataset package¶
Submodules¶
visionsim.dataset.dataset module¶
- class visionsim.dataset.dataset.PathTransforms(paths: Sequence[Path], iter_npys: bool = True, **kwargs)[source]¶
Bases: object
Given a sequence of paths to load from, yield the minimal transforms dictionary for each path.
Specifically, for image paths we just yield {"file_path": paths[idx]} for every index, but if some of the paths are numpy arrays and iter_npys is true, we unpack the array’s first dimension and also return the corresponding offset. For instance, if paths points to a png, a npy of shape (4, H, W, C), and another png, an index of 3 will return the path to the numpy file and an offset of 3.
- __init__(paths: Sequence[Path], iter_npys: bool = True, **kwargs) None[source]¶
Initialize a sequence of “dummy” transform dictionaries from a set of paths.
- Parameters:
paths (Sequence[Path]) – Paths to yield from
iter_npys (bool, optional) – If true, yield from numpy arrays before moving on to the next path. Defaults to True.
**kwargs (dict[str, Any]) – Additional key/value pairs to include in each transform dict.
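The indexing behavior described above can be sketched in plain Python. This is an illustrative stand-in, not the actual implementation: npy_lengths is a hypothetical argument standing in for the first-dimension length that the real class would read from each .npy file.

```python
from pathlib import Path


def path_transforms(paths, npy_lengths, iter_npys=True, **kwargs):
    """Sketch of the index -> transforms-dict mapping described above.

    `npy_lengths` maps each .npy path to its first-dimension length
    (a hypothetical stand-in for reading the array header from disk).
    """
    transforms = []
    for path in paths:
        if iter_npys and path.suffix == ".npy":
            # Unpack the array's first dimension: one entry per slice,
            # recording the offset into the array.
            for offset in range(npy_lengths[path]):
                transforms.append({"file_path": path, "offset": offset, **kwargs})
        else:
            transforms.append({"file_path": path, **kwargs})
    return transforms
```

With a png, a 4-long npy, and another png, this yields six entries: one per image and one per array slice.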
- class visionsim.dataset.dataset.Dataset(transforms: Sequence[dict[str, Any]], root: str | PathLike | None = None, cameras: set[Camera] | None = None)[source]¶
Bases: Dataset
Main dataset class for loading a .db/.json dataset or a set of image/exr/npy files.
- __init__(transforms: Sequence[dict[str, Any]], root: str | PathLike | None = None, cameras: set[Camera] | None = None) None[source]¶
Initialize a dataset object.
Note
No data validation is performed here; you likely want to use one of the classmethods such as from_path() or from_pattern() instead.
- Parameters:
transforms (Sequence[dict[str, Any]]) – A sequence of transforms dicts, which at a minimum should have a file_path key defined.
root (str | os.PathLike | None, optional) – Dataset root directory; if supplied, all file_paths are assumed to be relative to it. Defaults to None.
cameras (set[Camera] | None, optional) – Set of camera objects. Defaults to None.
- classmethod from_path(root: str | PathLike) Self[source]¶
Load a dataset from a path.
- Parameters:
root (str | os.PathLike) – Path to dataset file (either a .db or .json file) or a directory containing a valid dataset.
- Raises:
RuntimeError – raised if a dataset is not found at the provided path, or if multiple datasets are found.
- Returns:
instantiated Dataset object
- Return type:
Self
- classmethod from_paths(paths: Sequence[Path], iter_npys: bool = True, root: str | PathLike | None = None, cameras: set[Camera] | None = None, **kwargs) Self[source]¶
Create a dataset object from a collection of data files.
- Parameters:
paths (Sequence[Path]) – Paths to load data from.
iter_npys (bool, optional) – If true, step into the first dimension of any numpy files when iterating over data. Defaults to True.
root (str | os.PathLike | None, optional) – Dataset root directory; if supplied, all file_paths are assumed to be relative to it. Defaults to None.
cameras (set[Camera] | None, optional) – Set of camera objects. Defaults to None.
**kwargs (dict[str, Any]) – Optional keyword arguments passed to PathTransforms.
- Raises:
ValueError – raised if provided paths do not exist or if they are not subpaths of root (when provided).
- Returns:
instantiated Dataset object
- Return type:
Self
- classmethod from_pattern(root: str | os.PathLike, pattern: str, cameras: set[Camera] | None = None, iter_npys: bool = True, key: Callable[[Any], Any] = functools.partial(natsort_key, ...), **kwargs) Self[source]¶
Same as from_paths() but will search for all paths that match the provided pattern (as found by pathlib’s glob).
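The default key performs a natural sort via natsort’s natsort_key, so globbed paths come back in numeric rather than lexicographic order. A simplified stand-in, assuming the usual text/number tokenization, shows why this matters for frame filenames:

```python
import re
from pathlib import PurePath


def natural_key(path):
    # Simplified stand-in for natsort's natsort_key: split the filename
    # into text and integer runs so "frame_10" sorts after "frame_2".
    parts = re.split(r"(\d+)", PurePath(path).name)
    return [int(p) if p.isdigit() else p for p in parts]


names = ["frame_10.png", "frame_2.png", "frame_1.png"]
# Plain lexicographic sort interleaves frame_10 before frame_2:
assert sorted(names) == ["frame_1.png", "frame_10.png", "frame_2.png"]
# Natural sort restores the intended frame order:
assert sorted(names, key=natural_key) == ["frame_1.png", "frame_2.png", "frame_10.png"]
```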
- property poses: list[list[list[float]] | ndarray[tuple[Any, ...], dtype[floating]]] | None[source]¶
List of all camera poses, if available
- static load_data(path: str | os.PathLike, idx: tuple[int | slice, ...] = (), auto_collapse: bool = True, bitpack_dim: Literal[0, 1, 2] | None = None, unpacked_size: int | None = None) int | float | npt.NDArray[source]¶
Load data from provided path, optionally slicing it.
Supports various image formats (as provided by imageio’s imread), EXR files, and numpy arrays (optionally bitpacked).
Note
This function uses OpenEXR to read exr files, as both imageio and opencv cannot read an exr file when the data is stored in any channel other than RGB(A). As of Blender v4, single-channel data, such as depth maps, is correctly saved as single-channel exrs, in the V channel. Previously, Blender saved these as RGB by duplicating the data channel-wise. This function (optionally) auto-detects this issue and returns only a single-channel numpy array.
Note
Numpy arrays are not loaded into memory, instead they are memory mapped, making this function safe to use with very large arrays.
- Parameters:
path (str | os.PathLike) – Path to the image file or numpy array.
idx (tuple[int | slice], optional) – If present, slice the data using this index. In most cases this is equivalent to slicing the data after loading it, but for bitpacked numpy arrays, the slice needs to be modified first. Defaults to empty tuple (no slicing).
auto_collapse (bool, optional) – If true, when loading an EXR file that has duplicated channels, collapse them down into a single channel. See note for more. Only used when loading an EXR file that is saved using the “RGB” channel. Defaults to True.
bitpack_dim (Literal[0, 1, 2] | None, optional) – Axis along which bits have been packed. Only used when loading data from a numpy file. Defaults to None.
unpacked_size (int | None, optional) – Length of the bitpacked axis once unpacked; if not specified, data will be returned in a larger array whose size is a multiple of 8. Only used when loading from a numpy array that is bitpacked.
- Returns:
Data loaded from path
- Return type:
int | float | npt.NDArray
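To see why a slice into bitpacked data must be remapped before loading, consider the index arithmetic for a single element. This is a minimal sketch of the underlying mapping (assuming numpy’s default big-endian bit order), not the actual load_data implementation:

```python
def read_bit(packed: bytes, idx: int) -> int:
    # Element idx of the unpacked axis lives in byte idx // 8 of the
    # packed buffer, at bit position 7 - idx % 8 (big-endian bit order,
    # matching numpy.packbits' default).
    return (packed[idx // 8] >> (7 - idx % 8)) & 1


# 0b10100000 packs the eight elements [1, 0, 1, 0, 0, 0, 0, 0]
packed = bytes([0b10100000])
assert [read_bit(packed, i) for i in range(8)] == [1, 0, 1, 0, 0, 0, 0, 0]
```

This is also why unpacked_size exists: without it, the unpacked axis can only be recovered up to the next multiple of 8.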
visionsim.dataset.models module¶
- class visionsim.dataset.models.Camera(*, camera_model: Literal['OPENCV', 'OPENCV_FISHEYE'] | None = None, fl_x: float | None = None, fl_y: float | None = None, cx: float | None = None, cy: float | None = None, h: int | None = None, w: int | None = None, c: int | None = None, k1: float | None = None, k2: float | None = None, k3: float | None = None, k4: float | None = None, p1: float | None = None, p2: float | None = None, fps: float | None = None, **extra_data: Any)[source]¶
Bases: BaseModel
Camera Intrinsics
- model_config = {'extra': 'allow', 'frozen': True}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- camera_model: Literal['OPENCV', 'OPENCV_FISHEYE'] | None¶
camera model type
- fl_x: float | None¶
focal length x
- fl_y: float | None¶
focal length y
- cx: float | None¶
principal point x
- cy: float | None¶
principal point y
- h: int | None¶
image height
- w: int | None¶
image width
- c: int | None¶
image channels
- k1: float | None¶
first radial distortion parameter, used by [OPENCV, OPENCV_FISHEYE]
- k2: float | None¶
second radial distortion parameter, used by [OPENCV, OPENCV_FISHEYE]
- k3: float | None¶
third radial distortion parameter, used by [OPENCV_FISHEYE]
- k4: float | None¶
fourth radial distortion parameter, used by [OPENCV_FISHEYE]
- p1: float | None¶
first tangential distortion parameter, used by [OPENCV]
- p2: float | None¶
second tangential distortion parameter, used by [OPENCV]
- fps: float | None¶
framerate of camera
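As a concrete illustration of how these intrinsics are typically used, here is a hedged sketch of pinhole projection with the OPENCV parameters above (distortion terms k1..k3, p1, p2 omitted for brevity). This follows the standard OpenCV convention; it is not a function exported by visionsim:

```python
def project(x, y, z, fl_x, fl_y, cx, cy):
    # Pinhole projection: scale the normalized image coordinates by the
    # focal lengths and shift by the principal point.
    u = fl_x * (x / z) + cx
    v = fl_y * (y / z) + cy
    return u, v
```

A point on the optical axis lands exactly on the principal point (cx, cy).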
- class visionsim.dataset.models.Data(*, file_path: Path | None = None, bitpack_dim: int | None = None, bitplanes: int | None = None, **extra_data: Any)[source]¶
Bases: BaseModel
Frame data
- model_config = {'extra': 'allow', 'frozen': True}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- file_path: Path | None¶
path to data, usually an image or ndarray file
- bitpack_dim: int | None¶
dimension that has been bitpacked
- bitplanes: int | None¶
number of summed bitplanes in image
- class visionsim.dataset.models.Frame(*, file_path: Path | None = None, bitpack_dim: int | None = None, bitplanes: int | None = None, camera_model: Literal['OPENCV', 'OPENCV_FISHEYE'] | None = None, fl_x: float | None = None, fl_y: float | None = None, cx: float | None = None, cy: float | None = None, h: int | None = None, w: int | None = None, c: int | None = None, k1: float | None = None, k2: float | None = None, k3: float | None = None, k4: float | None = None, p1: float | None = None, p2: float | None = None, fps: float | None = None, transform_matrix: Annotated[list[list[float]], AfterValidator(func=_validate_transform_matrix)], offset: int | None = None, **extra_data: Any)[source]¶
Frame information
- model_config = {'extra': 'allow', 'frozen': True}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- transform_matrix: Annotated[_Matrix4x4, AfterValidator(_validate_transform_matrix)]¶
camera pose (orientation and position) as a 4x4 matrix
- offset: int | None¶
index of frame, used when file_path is an .npy file
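To illustrate the convention, a 4x4 transform_matrix can be applied to a 3D point in homogeneous coordinates. A minimal sketch, assuming the matrix is stored as row-major nested lists as in the signature above:

```python
def apply_pose(transform_matrix, point):
    # Append the homogeneous coordinate 1, multiply by the top three
    # rows of the 4x4 pose, and return the transformed 3D point.
    v = (*point, 1.0)
    return tuple(sum(row[j] * v[j] for j in range(4)) for row in transform_matrix[:3])


# Identity rotation with a translation of (1, 2, 3):
identity_with_shift = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 0.0, 1.0],
]
assert apply_pose(identity_with_shift, (0.0, 0.0, 0.0)) == (1.0, 2.0, 3.0)
```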
- class visionsim.dataset.models.Metadata(*, camera_model: Literal['OPENCV', 'OPENCV_FISHEYE'] | None = None, fl_x: float | None = None, fl_y: float | None = None, cx: float | None = None, cy: float | None = None, h: int | None = None, w: int | None = None, c: int | None = None, k1: float | None = None, k2: float | None = None, k3: float | None = None, k4: float | None = None, p1: float | None = None, p2: float | None = None, fps: float | None = None, frames: list[Frame], **extra_data: Any)[source]¶
Bases: Camera
A superset of the Nerfstudio transforms.json format which enables use of numpy arrays for single photon data, and allows for additional data paths (e.g. flow/segmentation) and metadata attributes such as a channels dimension.
- model_post_init(context: Any, /) None¶
This function is meant to behave like a BaseModel method to initialise private attributes.
It takes context as an argument since that’s what pydantic-core passes when calling it.
- Parameters:
self – The BaseModel instance.
context – The context.
- model_config = {'extra': 'allow', 'frozen': True}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- classmethod load(path: str | PathLike, rename_to: str = 'file_path') Self[source]¶
Load metadata from a .json or .db transforms file.
- Parameters:
path (str | os.PathLike) – Path to load metadata from.
rename_to (str, optional) – Load data paths from a .db file as a different key. Defaults to “file_path”.
- Raises:
RuntimeError – raised if loading camera configurations fail.
ValueError – raised if file format is not understood.
- Returns:
instantiated Metadata object
- Return type:
Self
- classmethod from_path(path: str | PathLike, rename_to: str = 'file_path') Self[source]¶
Same as load() with the added bonus of path disambiguation, where path can also be the directory containing the metadata file.
- save(path: str | PathLike, *, indent: int = 2) None[source]¶
Save metadata to a .json or .db transforms file.
- Parameters:
path (str | os.PathLike) – Path to save metadata to.
indent (int, optional) – Indent amount to use when saving JSON file. Defaults to 2.
- classmethod from_dense_transforms(transforms: Sequence[dict[str, Any]]) Self[source]¶
Load metadata from a sequence of dictionaries which contain all frame and camera information.
- Parameters:
transforms (Sequence[dict[str, Any]]) – Dictionaries containing frame information such as “file_path”, “transform_matrix” and camera parameters.
- Returns:
instantiated Metadata object
- Return type:
Self
- classmethod from_frames(frames: Sequence[Frame] | Sequence[dict[str, Any]], camera: Camera | dict[str, Any] | None = None) Self[source]¶
Load metadata from Frame objects (or their model dicts) and a single Camera object (or model dict).
- Parameters:
frames (Sequence[Frame] | Sequence[dict[str, Any]]) – Frame objects, or dicts from which Frame objects can be constructed.
camera (Camera | dict[str, Any] | None, optional) – Shared Camera object, or a dict from which one can be constructed. Defaults to None.
- Returns:
instantiated Metadata object
- Return type:
Self
- iter_dense_transforms(data_type: str | None = None, rename_to: str = 'path', relative_to: Path | None = None) Iterator[dict[str, Any]][source]¶
Yield dictionaries containing all frame and camera information, one per frame.
- Parameters:
data_type (str | None, optional) – Select which data type to iterate over, since there might be multiple (“file_path”, “mask_path”, etc). Defaults to None (all available).
rename_to (str, optional) – Rename key of iterated data, for instance from “file_path” to “path”. Only used if data_type is set. Defaults to “path”.
relative_to (Path | None, optional) – Make data paths relative to the provided path. Defaults to not modifying paths (None).
- Yields:
Iterator[dict[str, Any]] – Dictionaries containing all relevant frame data
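Conceptually, each yielded dict merges the shared camera attributes with per-frame data. A simplified sketch of that merge, with frame-level keys overriding camera-level ones (the real method additionally handles data-type selection and path rewriting):

```python
def iter_dense(camera: dict, frames: list[dict]):
    # Merge shared camera intrinsics into each frame's dict;
    # frame-level keys take precedence over camera-level ones.
    for frame in frames:
        yield {**camera, **frame}
```

For example, merging {"fl_x": 500.0, "h": 480} with a frame that overrides "h" keeps the frame's value.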
- to_dense_transforms(*args, **kwargs) list[dict[str, Any]][source]¶
Same as iter_dense_transforms() but returns a list instead of a generator.
- property data_types: set[str]¶
Data types that are defined for each frame, such as file_path or depth_file_path.
- property poses: list[list[list[float]] | ndarray[tuple[Any, ...], dtype[floating]]]¶
Pose matrices of all frames.
- property path: Path | None¶
Path to loaded metadata file, may be undefined.