Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support splat export in original dataset coordinates #2951

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

brentyi
Copy link
Collaborator

@brentyi brentyi commented Feb 23, 2024

The .ply export for Gaussians previously defaulted to saving in Nerfstudio's auto-oriented / auto-scaled coordinate frame.

I added support for exporting this in the original dataset coordinate frame, which the point cloud export supports via a "Save in world frame" checkbox.

I matched this in both the GUI and the export script CLI:

image

To me it makes the most sense to default this to True (I also flipped this for the point cloud export), but open to thoughts!

cc @jb-ye, #2909

@brentyi brentyi requested a review from kerrj February 23, 2024 07:59
@brentyi brentyi changed the title Export splats in "world frame" by default Support splat export in original dataset coordinates Feb 23, 2024
kerrj
kerrj previously approved these changes Feb 23, 2024
Copy link
Collaborator

@kerrj kerrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems good

@jb-ye
Copy link
Collaborator

jb-ye commented Feb 23, 2024

I don't think the option would work actually for spherical harmonic parameters. The spherical harmonics are saved in the transformed frame and cannot be "re-orient" back to the original coordinates. (We can apply translation and scale, but not rotations, otherwise SH becomes inconsistent.)

My recommendation is to save the transforms as meta data as part of exporter.

@brentyi
Copy link
Collaborator Author

brentyi commented Feb 24, 2024

I don't think the option would work actually for spherical harmonic parameters. The spherical harmonics are saved in the transformed frame and cannot be "re-orient" back to the original coordinates. (We can apply translation and scale, but not rotations, otherwise SH becomes inconsistent.)

My recommendation is to save the transforms as meta data as part of exporter.

Agree the current implementation is flawed, will revisit this after ECCV deadline!

@kerrj kerrj dismissed their stale review February 26, 2024 18:25

See comments about SH parameters

@brentyi brentyi marked this pull request as draft February 28, 2024 13:09
…dio-project/nerfstudio into brent/splat_export_world_frame
Copy link
Contributor

@pwais pwais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jb-ye Many viewers ignore spherical harmonics, and in my experience doing a rigid transform on splats looks plenty fine. I think ideally Nerfstudio just does a scale and re-center internally as part of the model rather than changing any of the camera poses, initial points etc that are part of the dataset. It should be up to the model to normalize as needed, not up to the user.

)
)

CONSOLE.log("Caching / undistorting eval images")
with ThreadPoolExecutor() as executor:
with ThreadPoolExecutor(max_workers=2) as executor:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make this configurable? or maybe this is just for debugging

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not sure why this shows up in this diff, the change is from #2969. it speeds up undistortion a lot for big datasets I've been toying with!

it's hardcoded to 2 because we can really only expect benefits from one thread doing IO while the other thread is doing undistortion; the implementation is still weird given this (ideally we'd just have 1 worker doing IO while the main one is sequentially undistorting) but I'm a fan of not letting perfect be the enemy of... better 🤷

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW each worker might need cv2.setNumThreads(1) or else it can choke the CPU. I seem to see way more than 200% util here hence why i brought it up so maybe it's just not tuned well for all users.

maybe in a future refactor this stuff will just get pushed to a torch dataloader... the pinned memory part breaks for me for large datasets anyways, literally I got a OOM and hard lock-up because too too too much much much pinned memory


output_scale = 1 / dataparser_scale
output_transform = np.zeros((3, 4))
output_transform[:3, :3] = dataparser_transform[:3, :3].T
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty please don't do transform math w/out at least comments, this sort of code is 110% likely to put a future reader in transform hell

also pretty please use pipeline.datamanager.train_dataparser_outputs.transform_poses_to_original_space() because
(1) that's what's used elsewhere in this file
(2) using that function ensures future refactors won't break things, and most past nerfstudio refactors have indeed broken lots of things

Comment on lines +524 to +525
np.einsum("ij,bj->bi", output_transform[:3, :3], model.means.cpu().numpy() * output_scale)
+ output_transform[None, :3, 3]
Copy link
Contributor

@pwais pwais Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't do this, ESPECIALLY w/out comments. I have lost a lot of time reading nerfstudio code that's like this. instead consider:

positions = model.means.cpu().numpy()

poses = np.eye(4, dtype=np.float32)[None, ...].repeat(positions.shape[0], axis=0)[:, :3, :]
poses[:, :3, 3] = positions
poses = pipeline.datamanager.train_dataparser_outputs.transform_poses_to_original_space(
    torch.from_numpy(poses)
)

for i in range(3):
map_to_tensors[f"scale_{i}"] = scales[:, i, None]

quats = model.quats.data.cpu().numpy()
def quaternion_multiply(wxyz0: np.ndarray, wxyz1: np.ndarray) -> np.ndarray:
Copy link
Contributor

@pwais pwais Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

??? First of all, scipy.spatial.transform already has quaternion multiply... at least this code is clear about scalar-first versus scalar-last.

this could be made more concise, but consider instead:

from scipy.spatial.transform import Rotation as ScR
# ns gplat says quaternions are [w,x,y,z] scalar-first format
# scipy is [x, y, z, w] scalar-last format

raw_quats = model.quats.data.cpu().numpy().squeeze()
R_quats = ScR.from_quat(raw_quats[:, [1, 2, 3, 0]])

# apply the inverse dataparser transform to the splat rotations
poses = np.eye(4, dtype=np.float32)[None, ...].repeat(raw_quats.shape[0], axis=0)[:, :3, :]
poses[:, :3, :3] = R_quats.as_matrix()
poses = pipeline.datamanager.train_dataparser_outputs.transform_poses_to_original_space(
    torch.from_numpy(poses)
)
rots_in_input = poses[:, :3, :3].numpy()
quat_in_input = ScR.from_matrix(rots_in_input)
quats = quat_in_input.as_quat()[:, [3, 0, 1, 2], None]

Again, this uses transform_poses_to_original_space(), which might amalgamate several different transforms and scales, who knows? Instead of trying to re-derive the transform as the current PR does. And hopefully transform_poses_to_original_space() gets maintained. But it's really really important to be clear about frames etc, and transform_poses_to_original_space() helps with that a ton.

@brentyi
Copy link
Collaborator Author

brentyi commented Mar 15, 2024

I think ideally Nerfstudio just does a scale and re-center internally as part of the model rather than changing any of the camera poses, initial points etc that are part of the dataset.

Tough to overcome the inertia here but I agree that this would solve a lot of problems!

@pwais
Copy link
Contributor

pwais commented Mar 15, 2024

Tough to overcome the inertia here but I agree that this would solve a lot of problems!

Yes inertia but at least the PR discussions leave breadcrumbs. Things that fall off the rolling katamari ball become seeds for the next re-write.

@jkulhanek
Copy link
Contributor

Wait, spherical harmonics can easily be rotated by using Wigner matrices, right?

@jb-ye
Copy link
Collaborator

jb-ye commented Mar 15, 2024

@jkulhanek Could you share a gist of sample code of rotating spherical harmonics? I am not aware of Wigner matrices.

@jkulhanek
Copy link
Contributor

jkulhanek commented Mar 15, 2024

Just a note here. Since SH is a complete basis, the rotation is possible exactly. Here is the code:
https://github.com/jkulhanek/nerfbaselines/blob/develop/nerfbaselines/math_utils.py

The code is heavy, and I don't understand it fully, but I played with it in a notebook, and it seems to rotate the SH correctly. I can also share the notebook if you want.

It's the recursive impl (whatever it means) which is supposed to be more stable (for higher order). The good thing about it is that the wigner matrix can be computed once and then applied to all SHs at once.

@Frenchman997
Copy link

Is there any update on this? I would also be interested in the ability to export splats in their original world frame. I haven't looked into it yet at all but would be willing to contribute.

@hardikdava
Copy link
Contributor

@jkulhanek can you provide the notebook? Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants