[performance] feed_to_graph_path is slow on larger feeds #12

Open
kuanb opened this issue Dec 21, 2017 · 11 comments

kuanb commented Dec 21, 2017

test_feed_to_graph_path itself is the slowest test by far. Create benchmarks and identify which steps are slowest. Find ways to speed up operations and get the graph creation process to be as fast as possible.
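
For reference, a minimal wall-clock benchmark of the end-to-end graph creation call might look like the sketch below. The feed path and time window are placeholders, and it assumes the usual peartree entry points (get_representative_feed / load_feed_as_graph):

import time

import peartree as pt

# Placeholder GTFS zip and analysis window (seconds after midnight).
path = 'gtfs/ac_transit.zip'
start = 7 * 60 * 60   # 7:00 AM
end = 10 * 60 * 60    # 10:00 AM

feed = pt.get_representative_feed(path)

t0 = time.perf_counter()
G = pt.load_feed_as_graph(feed, start, end)
print('graph creation took', round(time.perf_counter() - t0, 2), 's')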


kuanb commented Dec 25, 2017

Addressed (but still slow) via #14


kuanb commented Mar 28, 2018

Used snakeviz with cProfile; this is what the performance breakdown of the operation looks like at present:
[snakeviz cProfile breakdown screenshot]
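
For reproducibility, a rough sketch of how this kind of breakdown can be generated (the output filename is arbitrary; snakeviz is installed separately and pointed at the dumped stats file):

import cProfile
import pstats

import peartree as pt

# feed, start, and end as in the benchmark sketch above.
profiler = cProfile.Profile()
profiler.enable()
G = pt.load_feed_as_graph(feed, start, end)
profiler.disable()

# Dump stats for snakeviz (`snakeviz graph_creation.prof`) and print a
# quick text summary of the heaviest cumulative-time functions.
profiler.dump_stats('graph_creation.prof')
pstats.Stats(profiler).sort_stats('cumulative').print_stats(15)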

generate_edge_and_wait_values is the real hog here. It is primarily comprised of two steps:

  1. generate_wait_times (60% of the runtime of parent function)
  2. linearly_interpolate_infill_times (20% of the runtime of parent function)

Both are executing Pandas functions, so beneath them the time is spent in plain Pandas ops and groupby calls, respectively. To speed this module up, I'll need to better manage those Pandas operations and identify optimizations in how I use them in the logic.

For example, since these are all wrapped in a single route iteration, the whole operation is embarrassingly parallelizable.
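
As a rough illustration of that per-route split, here is a toy multiprocessing sketch; summarize_route is a stand-in for the real per-route work (the wait time generation and interpolation steps), and the DataFrame is fabricated for the example:

from multiprocessing import Pool

import pandas as pd

def summarize_route(args):
    # Toy per-route worker: in peartree this would wrap the
    # generate_wait_times / linearly_interpolate_infill_times logic for
    # one route's slice of stop_times; here it just averages a column.
    route_id, route_df = args
    return route_id, route_df['wait'].mean()

if __name__ == '__main__':
    # Fabricated stand-in for a stop_times-like frame keyed by route.
    df = pd.DataFrame({'route_id': ['A', 'A', 'B', 'B'],
                       'wait': [120.0, 300.0, 90.0, 210.0]})
    groups = list(df.groupby('route_id'))

    # Each route's slice is independent, so the groups can be mapped
    # across worker processes and recombined afterwards.
    with Pool() as pool:
        results = pool.map(summarize_route, groups)
    print(dict(results))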


kuanb commented Apr 11, 2018

Parallelization with performant pickling enabled via #12


kuanb commented Apr 21, 2018

Noticing that the unaccounted-for stop id management step is taking quite a while:

Some unaccounted for stop ids. Resolving 2457...

^ Example from LA Metro GTFS zip file.


kuanb commented Jun 30, 2018

On smaller feeds (or even mid-sized feeds, like AC Transit), MP is slower. I need to figure out how to intelligently navigate away from using MP in these situations; one rough gating idea is sketched after the timing numbers below.

Sigh, this whole performance issue is not good.

Example:

%%time
# Notebook cell: `feed`, `start`, and `end` are defined in earlier cells.
import time
import peartree as pt

st = time.time()
G_orig = pt.load_feed_as_graph(feed, start, end)
et = time.time()

# Runtime in seconds
print(round(et - st, 2))

The above was run once with MP set to False and once with it set to True.

No MP:

238.4
CPU times: user 3min 57s, sys: 350 ms, total: 3min 57s
Wall time: 3min 58s

Yes MP:

286.01
CPU times: user 1min 13s, sys: 390 ms, total: 1min 14s
Wall time: 4min 46s
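
One way to navigate away from MP on small and mid-sized feeds would be a simple gate like the sketch below. The thresholds are guesses rather than benchmarked values, and it assumes the feed object exposes its routes table as a DataFrame (as partridge feeds do):

import os

def should_use_multiprocessing(feed, min_routes=500, min_cpus=4):
    # Rough heuristic: only pay the process start-up and pickling cost
    # when the feed is large enough, and there are enough spare cores,
    # to amortize that overhead.
    n_routes = len(feed.routes)
    return n_routes >= min_routes and (os.cpu_count() or 1) >= min_cpus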


kuanb commented Jul 12, 2018

Huge performance gain found right here: #87

(Thank you @yiyange)


kuanb commented Jul 15, 2018

Updated performance, with the last few updates incorporated (see all commits from Wednesday to today):

Without MP: 87.5s (63.3% faster)
With MP: 93.97s (67% faster)

cc @yiyange


yiyange commented Jul 16, 2018

I am curious in what cases using multiprocessing is faster; when I played with it, it was much slower than without it.


kuanb commented Jul 16, 2018

There is a higher initialization cost to using multiprocessing. The gains can be seen primarily on larger datasets, such as LA Metro. I should benchmark that.
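
A small harness for that benchmark might look like the sketch below; the use_multiprocessing keyword name is an assumption here, standing in for whatever flag toggles MP in load_feed_as_graph:

import time

import peartree as pt

def time_graph_creation(feed, start, end, use_mp):
    # Time a single end-to-end graph creation call, in seconds.
    t0 = time.perf_counter()
    pt.load_feed_as_graph(feed, start, end, use_multiprocessing=use_mp)
    return round(time.perf_counter() - t0, 2)

# Run both modes against the same feed (e.g. AC Transit vs. LA Metro).
for use_mp in (False, True):
    label = 'MP' if use_mp else 'no MP'
    print(label, time_graph_creation(feed, start, end, use_mp), 's')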

kuanb closed this as completed Jul 16, 2018
kuanb reopened this Jul 16, 2018

kuanb commented Jul 16, 2018

Whoops, sorry, didn't mean to close.


kuanb commented Jul 16, 2018

LA Metro (without digging around for the exact numbers) used to take 12-15 minutes.

It now takes:
Without MP: 231s
With MP: 229s

So, no observable improvement. Of course, it's running in a Docker environment that only has access to 2 CPUs on my '16 MacBook Pro. A better test would be to use a virtual machine on AWS / GCloud or wherever and see what gains are achieved there.

That said, we can observe that there are pretty limited (essentially no observable) gains to be had from MP for the typical user/use case (local machine, in a Notebook-like environment). This is something that should be addressed long term.
