MultiBands, and tools improvements (mostly) #138

ocourtin · 2018-11-14T21:03:58Z

A) MultiBands aka #56

Switch from PIL to OpenCV for slippy map images (to allow handling of multiband images)
SlippyMapTileConcatenation now produces an aggregate C,W,H NumPy tensor rather than a list of RGB images
The first UNet-like encoder layer can now be adjusted according to num_channels (no longer hardcoded to 3, but now 1-N)
The ONNX export tool was updated accordingly
Default weights (before reusing any ImageNet RGB pretrained band weights) use Xavier initialization
Nothing specific has been done yet about the ResNet mean/stdev values (still to explore if/when needed)
The dataset configuration file allows configuring the channels (bands) for each chosen slippy map dir,
and the chosen channels are reported in the train output logs
Add an experimental option in the export tool to reduce channels, from one pth to another pth.
At this stage, it allows some experiments on how to reuse ImageNet weights in a MultiBands/Fusion context.
Finding the best training approach (i.e. rather than training from scratch) is still an open subject.
Nota: alternate NN topologies to deal with multibands are still to explore,
for example one where extra bands are copied onto each encoder layer (and not only onto the first one, as in this PR), cf. https://hal.archives-ouvertes.fr/hal-01523573/document
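
To make the Xavier initialization point concrete, here is a minimal stdlib-only sketch of how the uniform Xavier bound depends on the number of input channels of a first convolution layer. It is an illustration and not the PR's code: the function name, the 7x7 kernel and the 64 output channels are assumptions, and whether the PR uses the uniform or the normal variant is not stated above. In PyTorch the equivalent call would be `torch.nn.init.xavier_uniform_`.

```python
import math
import random


def xavier_uniform_conv(out_channels, in_channels, kernel_size, seed=0):
    """Return (bound, weights) for a conv kernel shaped [out, in, k, k].

    Xavier/Glorot uniform draws from U(-bound, bound) with
    bound = sqrt(6 / (fan_in + fan_out)).
    """
    receptive_field = kernel_size * kernel_size
    fan_in = in_channels * receptive_field
    fan_out = out_channels * receptive_field
    bound = math.sqrt(6.0 / (fan_in + fan_out))

    rng = random.Random(seed)  # seeded only to keep the sketch reproducible
    weights = [
        [
            [[rng.uniform(-bound, bound) for _ in range(kernel_size)]
             for _ in range(kernel_size)]
            for _ in range(in_channels)
        ]
        for _ in range(out_channels)
    ]
    return bound, weights


# Hypothetical first layer: 5 input bands instead of the usual 3 (RGB).
bound, weights = xavier_uniform_conv(out_channels=64, in_channels=5, kernel_size=7)
```

Going from 3 to N bands only changes `fan_in`, so the bound shrinks slightly as bands are added.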

Data Augmentation

Refactored rotate and flip
Removed what looks like a useless crop treatment
The data_augmentation occurrence probability (flip or rotate for now) became a hyperparameter in userland
Select the best resize interpolation as previously, but now also when downsampling
Keep in mind that even basic data augmentation such as flip/rotate can't be performed with PyTorch on GPU, cf. https://discuss.pytorch.org/t/torch-from-numpy-not-support-negative-strides/3663
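
As an illustration of the occurrence probability hyperparameter, the sketch below applies the same flip or rotation to an image and its mask with probability `p`. It uses plain nested lists instead of OpenCV/NumPy arrays, and the names `hflip`, `rotate90` and `augment` are hypothetical, not the ones used in this PR.

```python
import random


def hflip(image):
    """Mirror an H x W image (list of rows) left to right."""
    return [list(reversed(row)) for row in image]


def rotate90(image):
    """Rotate an H x W image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image, mask, p, rng=random):
    """With probability p, apply one random transform to both image and mask."""
    if rng.random() >= p:
        return image, mask  # leave the pair untouched
    transform = rng.choice([hflip, rotate90])
    # The image and its mask must always receive the same transform.
    return transform(image), transform(mask)


image = [[1, 2], [3, 4]]
mask = [[0, 1], [1, 0]]
aug_image, aug_mask = augment(image, mask, p=0.5)
```

With `p=0.0` the pair is always returned unchanged, and with `p=1.0` a transform is always applied.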

B) Tools:

Download:

Add WMS and TMS server type handling (XYZ is still the default)
The timeout can be specified in userland (some WMS servers performing pre/post-treatments can take time)
Use and improve log handling (e.g. giving more information on already downloaded tiles)
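
For readers unfamiliar with the three server types, here is a rough sketch of how a tile request differs between them: XYZ uses the tile indices directly, TMS flips the row index, and WMS needs the tile bounding box (here in EPSG:3857). The URL templates and their `{x}`, `{y}`, `{z}`, `{bbox}` placeholders are assumptions made for this example, not necessarily those of the download tool.

```python
import math

ORIGIN = math.pi * 6378137.0  # half of the Web Mercator extent, in meters


def tms_row(y, z):
    """Convert an XYZ row index to TMS (row origin at the bottom)."""
    return (2 ** z) - 1 - y


def tile_bbox_3857(x, y, z):
    """Return (left, bottom, right, top) of an XYZ tile in EPSG:3857 meters."""
    size = 2 * ORIGIN / (2 ** z)
    left = x * size - ORIGIN
    top = ORIGIN - y * size
    return (left, top - size, left + size, top)


def tile_url(kind, template, x, y, z):
    """Fill a (hypothetical) URL template according to the server type."""
    if kind == "XYZ":
        return template.format(x=x, y=y, z=z)
    if kind == "TMS":
        return template.format(x=x, y=tms_row(y, z), z=z)
    if kind == "WMS":
        bbox = ",".join(f"{value:.6f}" for value in tile_bbox_3857(x, y, z))
        return template.format(bbox=bbox)
    raise ValueError(f"unknown server type: {kind}")


url = tile_url("TMS", "https://example.com/{z}/{x}/{y}.png", x=1, y=0, z=2)
```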

Cover:

Add ability to render a cover from a lat/lon bbox, or from a slippy map directory
(as a consequence the features parameter becomes optional)
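
A cover from a lat/lon bbox boils down to listing the tiles that intersect the box at a given zoom. The sketch below uses the standard slippy map tile formula; the function names are hypothetical, and the actual tool may rely on a library such as mercantile instead.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) slippy map tile containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the extent border stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def cover_from_bbox(west, south, east, north, zoom):
    """List the (x, y, z) tiles intersecting a lon/lat bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # the top-left tile
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # the bottom-right tile
    return [
        (x, y, zoom)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]


tiles = cover_from_bbox(west=-179.0, south=-85.0, east=179.0, north=85.0, zoom=1)
```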

Predict:

Add an option to directly generate masks (rather than probs)
Refactor the tile_buffer provider (slightly improves perfs)
The model parameter is no longer mandatory (as we use cuda if CUDA_VISIBLE_DEVICES is not empty)
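
Generating masks rather than probs amounts to taking, for each pixel, the class with the highest probability. A minimal sketch on nested lists, assuming the probabilities come as one H x W plane per class (the real tool works on tensors; `probs_to_mask` is a hypothetical name):

```python
def probs_to_mask(probs):
    """Turn C probability planes (each H x W) into an H x W mask of class indices."""
    num_classes = len(probs)
    height, width = len(probs[0]), len(probs[0][0])
    return [
        [
            # Index of the most probable class; ties go to the lowest index.
            max(range(num_classes), key=lambda c: probs[c][i][j])
            for j in range(width)
        ]
        for i in range(height)
    ]


background = [[0.9, 0.2], [0.6, 0.5]]
foreground = [[0.1, 0.8], [0.4, 0.5]]
mask = probs_to_mask([background, foreground])
```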

Compare:

Full refactor, the Compare tool now allows 3 different modes:
- side: close to the previous behaviour (and so the default), but allows 2 to N images to be compared (cf. Compare tool works with just images and labels #80)
- diff: works with images/masks/labels and computes a single diff image for an efficient QoD check
- list: renders a cover with the IoU metric for each mask/label tile pair

Tile:

Add a first tiling tool implementation, to tile either images or labels from a raster;
performance is decent for a single-process tool,
and it is able to deal with no_data borders (by removing the related tiles)

Rasterize:

Refactor the GeoJSON parser, to cleanly handle all GeoJSON surface geometries
(GeometryCollection, MultiPolygon, Polygon) and N-dimensional GeoJSON coordinates
Nota: we still blindly assume that input GeoJSON coordinates are EPSG:4326
Use and improve log handling
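
The geometry handling described above can be sketched as a small recursive generator that walks a GeoJSON geometry, yields every polygon it finds, and drops any coordinate beyond x and y. This is an illustration under assumptions (the name `iter_polygons` is hypothetical), not the refactored parser itself.

```python
def iter_polygons(geometry):
    """Yield polygons (lists of rings of (x, y) tuples) from a GeoJSON geometry."""
    kind = geometry.get("type")
    if kind == "Polygon":
        yield [[tuple(point[:2]) for point in ring] for ring in geometry["coordinates"]]
    elif kind == "MultiPolygon":
        for rings in geometry["coordinates"]:
            yield [[tuple(point[:2]) for point in ring] for ring in rings]
    elif kind == "GeometryCollection":
        for child in geometry.get("geometries", []):
            yield from iter_polygons(child)
    # Points and lines are not surfaces, so they are skipped silently.


square_3d = [[[0.0, 0.0, 12.5], [1.0, 0.0, 12.5], [1.0, 1.0, 12.5], [0.0, 0.0, 12.5]]]
polygons = list(iter_polygons({"type": "Polygon", "coordinates": square_3d}))
```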

Predict, Subset, Download, Rasterize, Compare, Masks, Tile:

Add a Leaflet client generation option, to allow easy visual inspection of a slippy map

C) Maintenance:

Bugfixes

Fix a ZeroDiv issue that still remained in the mIoU metrics
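
A zero-division-resilient mIoU can be sketched as follows, in line with the "Use NaN rather than Inf on 0 div exception" commit listed further down: a class with an empty denominator yields NaN and is ignored in the mean. The function names and the two-class (background/foreground) setup are assumptions made for this example.

```python
import math


def iou(intersection, *misses):
    """Intersection over union from counts, NaN when the union is empty."""
    try:
        return intersection / (intersection + sum(misses))
    except ZeroDivisionError:
        return float("nan")


def miou(tn, fn, fp, tp):
    """Mean IoU over background and foreground, ignoring undefined classes."""
    scores = [iou(tn, fn, fp), iou(tp, fn, fp)]  # background, foreground
    valid = [score for score in scores if not math.isnan(score)]
    return sum(valid) / len(valid) if valid else float("nan")


score = miou(tn=10, fn=0, fp=0, tp=0)  # only background pixels, all correct
```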

Performances

Switch from Pillow to Pillow-simd (cf https://github.com/uploadcare/pillow-simd)
Requires a recent Intel/AMD processor, but brings significant perf improvements on PIL treatments (about x6)
Nota: didn't see a significant difference for the RoboSat use case between SSE4 and AVX2 (i.e. SSE4 is 'enough')

Versions stuff

Upgrade rasterio version to the latest stable one (1.0.9)
nn.functional.upsample -> nn.functional.interpolate (to avoid a 0.4.1 related warning)

Credits

echoed my name >> AUTHORS.md

D) Userland considerations:

The channels configuration is now mandatory in dataset.toml (as an array of tables)
The data_augmentation ratio and resnet pretrained are new mandatory hyperparameters in model.toml
The cuda bool is no longer needed in model.toml (by default all available CUDA_VISIBLE_DEVICES are now used)
libwebp and libjpeg are mandatory for pillow-simd (which implies updating the install doc)
In the compare tool, the maximum parameter was removed (couldn't imagine a use case) and minimum was renamed to minimum_fg
In the cover tool, the features parameter is no longer positional (it became optional)
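
Regarding the removal of the cuda bool, the device choice driven by CUDA_VISIBLE_DEVICES can be sketched without PyTorch as below. This is a simplification that follows the description above (cuda when the variable is not empty); the helper names are hypothetical, and the real code presumably also asks PyTorch whether CUDA is actually available.

```python
import os


def visible_cuda_devices(environ=os.environ):
    """Return the device ids listed in CUDA_VISIBLE_DEVICES (possibly empty)."""
    raw = environ.get("CUDA_VISIBLE_DEVICES", "")
    return [device.strip() for device in raw.split(",") if device.strip()]


def pick_device(environ=os.environ):
    """Choose 'cuda' when at least one device is listed, else 'cpu'."""
    return "cuda" if visible_cuda_devices(environ) else "cpu"


device = pick_device({"CUDA_VISIBLE_DEVICES": "0,1"})
```

Passing a plain dict instead of `os.environ` keeps the helper easy to try out without touching the real environment.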

NOTES:

Daniel, could you give me a hint on how the robosat unit tests are supposed to be launched?
It remains unclear to me, and I haven't investigated it yet.
Was developed (and so tested) on CUDA 9.2/PyTorch 0.4.1 (single and multi GPU)
and also quickly checked on a single GPU with CUDA 10/PyTorch 1.0 nightly build.
CPU only has barely been tested at this point.
The cpu and gpu Docker images have not been tested at all (cf. pillow-simd stuff)
Didn't yet find an easy/efficient way to deal with PNG palettes with OpenCV2.
Doing so could lead to removing Pillow, as OpenCV2 is faster.
Code concision is kept, with fewer than ~400 additional lines in the codebase for this whole PR ^^
Thanks in advance Daniel for the coming code review :)

ocourtin · 2018-11-27T01:28:24Z

Following your previous code review, I've made some new improvements,
to take some of your comments into account, and to keep going with the tools refactoring.

Compare:

Whole refactor of my previous whole refactor (sic)
Code is far more factorized, with a clear separation between rendering and QoD filtering.
This allows, for instance, 2-N images in stack mode.
Add GeoJSON and vertical output options.
Put back the fg_maximum and qod_maximum parameters (could finally imagine a use case ^^)

Subset:

Use the cover file as a filter, rather than parsing the whole slippy dir.
Add an optional move parameter (rather than the default copy)
Improve log.
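
Using the cover as a filter means only the listed tiles are looked up, instead of walking the whole slippy map directory. A rough sketch, assuming a cover CSV with one `x,y,z` row per tile and a `z/x/y.ext` directory layout; the function name and signature are hypothetical.

```python
import csv
import os
import shutil


def subset_tiles(cover_csv, src_dir, dst_dir, move=False):
    """Copy (or move) the tiles listed in a cover CSV from src_dir to dst_dir."""
    transfer = shutil.move if move else shutil.copyfile
    written = []
    with open(cover_csv, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) != 3:
                continue  # skip blank or malformed lines
            x, y, z = (field.strip() for field in row)
            tile_dir = os.path.join(src_dir, z, x)
            if not os.path.isdir(tile_dir):
                continue  # this tile is not present in the source
            for name in sorted(os.listdir(tile_dir)):
                if os.path.splitext(name)[0] != y:
                    continue  # match on the tile row, whatever the extension
                target_dir = os.path.join(dst_dir, z, x)
                os.makedirs(target_dir, exist_ok=True)
                target = os.path.join(target_dir, name)
                transfer(os.path.join(tile_dir, name), target)
                written.append(target)
    return written
```

Called as `subset_tiles("cover.csv", "images", "subset")` it copies tiles, and with `move=True` it moves them instead.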

Web UI:

Generalize the previous Leaflet concept, to become more abstract:
- Now allows choosing among several templates, including in userland.
- The GeoJSON of the selected tiles (i.e. the grid) is separated from the HTML,
  and as the Compare tool can produce GeoJSON too, they can be combined.
Add a new Web UI tool, for compare in side mode.
Allows efficient prev/next navigation.
Also allows selecting tiles, and grabbing the selection as a cover in the clipboard.

Features and Extract dynamic handlers:

Dynamic module loading, so that an end user doesn't have to modify anything inside robosat.
The module path can either use the robosat ones, or be specified in userland.
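
The dynamic handler idea can be sketched with the standard `importlib` machinery: load a module by name either from a package or from a user-supplied directory. The `load_handler` name and its parameters are hypothetical and only illustrate the mechanism.

```python
import importlib
import importlib.util
import os


def load_handler(name, package=None, path=None):
    """Load module `name` from a user directory if `path` is set, else from `package`."""
    if path:
        filename = os.path.join(os.path.expanduser(path), name + ".py")
        spec = importlib.util.spec_from_file_location(name, filename)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # actually runs the user file
        return module
    qualified = f"{package}.{name}" if package else name
    return importlib.import_module(qualified)


# Built-in example: load the "decoder" module shipped in the "json" package.
decoder = load_handler("decoder", package="json")
```

A userland handler would be loaded the same way with `path` pointing at the user's own directory, so nothing inside the installed package needs to change.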

Colors:

Handle all CSS3 colors (and not only a small subset) and also allow the hex pattern #RRGGBB.
The webcolors package is used to do so.
Labels use the complementary color (of the class color defined in the dataset file)
in order to create a nice stacked visual result with the compare tool.
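
A small stdlib-only sketch of the two color operations mentioned above: parsing a #RRGGBB pattern and deriving a complementary color. The PR relies on the webcolors package for CSS3 names; here only the hex case is shown, and "complementary" is assumed to mean the per-channel inverse (255 minus each channel), which may differ from the definition used in the code.

```python
import re

HEX_PATTERN = re.compile(r"#([0-9a-fA-F]{6})")


def hex_to_rgb(color):
    """Parse a '#RRGGBB' string into an (r, g, b) tuple of ints."""
    match = HEX_PATTERN.fullmatch(color)
    if not match:
        raise ValueError(f"not a #RRGGBB color: {color!r}")
    digits = match.group(1)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def complementary(rgb):
    """Return the per-channel inverse of an (r, g, b) color."""
    return tuple(255 - channel for channel in rgb)


label_color = complementary(hex_to_rgb("#FF8000"))
```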

A few little things:

Remove the tile_to_bbox function, and use mercantile.xy_bounds instead
Improve GeoJSON output to be lightweight (keep only meaningful precision and properties)
Use argparse choices each time we have a mode/type parameter with several values to choose from
Update pytests to pass
Update to a later rasterio version (again ^^)
In the extract tool, allow dealing with 1-N GeoJSON files
The train tool uses a positional output parameter (rather than the one hard-written in the model file)
Avoid overwriting an already existing dir when relaunching a command (on train and subset)
Add a new tile_image function, allowing a single tile to be loaded when you don't know its file extension
In tools, put optional args first in the help section.
In the cover tool, use a type parameter to specify the kind of input. Seems more user-friendly.
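
As an illustration of the argparse choices point, the sketch below restricts a mode parameter to a fixed set of values, so that invalid input is rejected by the parser itself with a usage message. The `--mode` flag, the `out` positional and the program name are hypothetical; the mode values are taken from the Compare description earlier in this PR.

```python
import argparse


def build_parser():
    """Build a parser whose mode parameter only accepts known values."""
    parser = argparse.ArgumentParser(prog="compare")
    parser.add_argument(
        "--mode",
        choices=["side", "diff", "list"],
        default="side",
        help="compare mode [default: side]",
    )
    parser.add_argument("out", help="output directory")
    return parser


args = build_parser().parse_args(["--mode", "list", "out_dir"])
```

An unknown value such as `--mode bogus` makes argparse print the allowed choices and exit, without any hand-written validation.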

ocourtin · 2018-11-27T01:31:00Z

So any new comments welcome :)

DavidDohmen · 2019-04-04T09:46:33Z

Hi @ocourtin and @daniel-j-h I'm glad to see this amazing merge request! I wanted to ask if there's any progress on this and if there are any blockers left for a merge? The discussion seems to be stalled.

ocourtin · 2019-04-04T09:56:30Z

Hi @DavidDohmen !
Current dev, including this PR and a lot more enhancements, continues right now on the RoboSat.pink fork.

GitHub: https://github.com/datapink/robosat.pink (use the master till the coming 0.4.0 release)
Gitter Chat: https://gitter.im/RoboSatPink

I'll let @daniel-j-h answer whether there's a way to resume dev in this repo.

DavidDohmen · 2019-04-04T13:55:50Z

Thanks for your quick answer and pointing me in the right direction! I will look into this!

ocourtin added 30 commits October 10, 2018 22:37

Display Hyperparameters weights, only if needed and used by the choos…

213540e

…en loss

Data Augmentation: Add upscale_factor, remove useless CenterCrop

6991719

Add in rs download, TMS and WMS services support. Expose timeout in u…

bbd86a3

…serland

Homogenize reprojection calls, pass throught rasterio

9c9b307

Add GeoJSON MultiPolygon support

d51393d

Add url in user error message. Neat for WMS use cases

d8f1686

Change image_upscale hyperparameter name and add entry in log

43a6050

Fix mIoU to be 0 div resilient

b20e5a4

Move to PIL Image to OpenCV H,W,C RGB for DataAugmentation. Refactor …

73933fd

…Data Augmentation

add tile tools

fd4ca3b

black format

29935f6

refactor resize data augmentation

2e97d16

typo

dd15457

Fix issue relative to OpenCV switch.

822bae7

Fix axes order on labels tiles. Add label_thresold option. Few clean …

0b86808

…stuff

Add masks_ouput option in predict tool

1e43632

Update comments (related to output_masks)

48115d2

in train, resume option don't need an explicit yes. User Friendly

82c2bcd

Add leaflet ouput option

f2e6405

reverse map templates changes. Add leaflet html template

b0c554c

Move from pillow to pillow-simd. Significant perf improvement. Did'nt…

fab73f9

… test Docker instances

Variable name homogenize

e68d8f5

Add robosat log handling. User friendly log output if some tiles was …

236f471

…already downloaded

Add bbox option to cover tool

69ef557

remove period from extension

249107b

Tile tool: few debug fixes and cleanup

1407730

add complemenraty_palette function

2fe3864

allow to explicitly choose open file mode, in log

3f27682

Use NaN rather than Inf on 0 div exception. Allow to use metric eithe…

8da0e8b

…r with mask or prob

Bugfix on leaflet tiles

b2a76dd

ocourtin added 9 commits November 25, 2018 14:54

Handle 1-N geojson features files. Few cleanup.

57f3bc4

use args.parse choices

01ea0b2

Update to latest rasterio version

b4391d3

Dynamic handler arch.

3612d70

Use a single type parameter, more user understandable prototype

0433e26

add web_ui_template parameter. order args in tools to put optionals f…

ed16e8f

…irsts

Protect already existing data in dest dir to be overwritten

53b0b4e

update existing tests to pass

e02dffd

few cleanup. Add path parameter to allow an user to set his own exten…

1a451d3

…sion dir

ocourtin and others added 14 commits November 27, 2018 02:44

Black is black

0fa123e

Merge branch 'master' into multi

0c6fff7

One rasterio version is enough ^^

24716db

Create destination repository if needed

bcf13ce

add expanduser path

552ffb5

additional stdout log by default

0504df5

Add user hint

9f3eb8f

Allow to override most common train option from the command line (lr …

0aa7f46

…and epochs)

Refactor config file, to a single one. Allow to override dataset path…

d7e8559

… in train tool

polish

c628975

Add geojson slicer

a59d971

Fix: propagate config classes titles

9b8f94e

Typo

bb2dcf8

Add pillow-simd dependancy libs

b28b2c1

ocourtin mentioned this pull request Jan 10, 2020

questions related to offline geotiff image and export option #200

Closed

daniel-j-h mentioned this pull request Aug 9, 2021

Bringin own data #222

Open
