Apply functools.lru_cache to RegexFSM to Improve CFGFSM performance #621

Open: lapp0 wants to merge 1 commit into main

lapp0 commented Feb 7, 2024

Improves performance of the lark_lark_self_grammar.lark.test-True benchmark (the -True suffix means the cache is already populated, simulating a second run): 16 tokens / second -> 39 tokens / second on the second run.

(Profiled with #587)

Addresses #620

Problem

The vast majority of the time in the second run of a CFGFSM is spent retrieving cached RegexFSM instances.

  • Total run time (under the profiler): 179 seconds
  • RegexFSM.__init__ (called 2455 times): 144 seconds

Specifically, the following operations are slow:

  • sorting tokenizer.vocabulary for hashing
  • hashing the vocabulary
  • unpickling the states mapping
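
The three costs listed above can be illustrated with a minimal sketch. It is not the code in outlines/caching.py: cache_key and load_states are hypothetical names, a plain dict stands in for the disk cache, and the standard-library pickle and hashlib replace whatever the real cache uses. The point is only that the sort, the hash, and the unpickle all run on every lookup, even when the entry already exists.

```python
import hashlib
import pickle


def cache_key(regex_string, vocabulary):
    """Hypothetical key function: sort the vocabulary, serialize it, hash it."""
    ordered = sorted(vocabulary.items())  # O(n log n) sort on every lookup
    payload = pickle.dumps((regex_string, ordered))  # serialize the whole vocabulary
    return hashlib.sha256(payload).hexdigest()  # hash the serialized bytes


def load_states(disk_cache, regex_string, vocabulary):
    """Hypothetical cache hit: the key is recomputed, then the value is unpickled."""
    key = cache_key(regex_string, vocabulary)
    return pickle.loads(disk_cache[key])


# Toy data: a 50,000-entry vocabulary and a tiny states mapping.
vocabulary = {f"tok{i}": i for i in range(50_000)}
disk_cache = {cache_key("[a-z]+", vocabulary): pickle.dumps({0: {1: 1}})}

# Even though the entry is already cached, this call sorts, hashes, and unpickles again.
states = load_states(disk_cache, "[a-z]+", vocabulary)
```

With one such lookup per generated token, the per-call overhead adds up to the figures shown in the profile.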

Solution

To alleviate this, RegexFSM is wrapped with lru_cache(). This operation is safe because we never actually mutate RegexFSM.
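
As a rough sketch of the idea (not the actual diff), an in-memory functools.lru_cache can sit in front of an expensive constructor. ToyTokenizer, ToyRegexFSM, and cached_regex_fsm are hypothetical stand-ins, and the sketch assumes the arguments are hashable, which lru_cache requires.

```python
from functools import lru_cache


class ToyTokenizer:
    """Hypothetical tokenizer; hashable by identity via the default object hash."""

    def __init__(self, vocabulary):
        self.vocabulary = vocabulary


class ToyRegexFSM:
    """Hypothetical stand-in for an FSM whose construction is expensive."""

    init_calls = 0  # counts how often the expensive constructor runs

    def __init__(self, regex_string, tokenizer):
        ToyRegexFSM.init_calls += 1
        self.regex_string = regex_string
        # Stand-in for the expensive part (sorting, hashing, unpickling).
        self.sorted_vocabulary = tuple(sorted(tokenizer.vocabulary.items()))


@lru_cache(maxsize=None)
def cached_regex_fsm(regex_string, tokenizer):
    """Return one shared instance per (regex_string, tokenizer) pair."""
    return ToyRegexFSM(regex_string, tokenizer)


tokenizer = ToyTokenizer({"a": 0, "b": 1})
first = cached_regex_fsm("[ab]+", tokenizer)
second = cached_regex_fsm("[ab]+", tokenizer)  # served from memory, no __init__ call

assert first is second
assert ToyRegexFSM.init_calls == 1
```

Because both calls return the same object, any mutation through one reference would be visible through the other. That is why the cache is only safe as long as instances are never mutated.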

This simple change more than doubles throughput (about 2.4x), decreasing the runtime of lark_lark_self_grammar.lark.test's second run from 173 seconds to 71 seconds in the benchmark.
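
The throughput figures can be checked against the benchmark output in this description. The short calculation below uses the token count and the benchmark timings reported there; the variable names are illustrative only.

```python
# Values taken from the benchmark output in this PR description.
num_tokens = 2771
before_seconds = 172.8449  # initial run, cache already populated
after_seconds = 71.0852  # same run with this PR applied

before_tps = num_tokens / before_seconds
after_tps = num_tokens / after_seconds
speedup = after_tps / before_tps

print(f"{before_tps:.3f} -> {after_tps:.3f} tokens / second ({speedup:.2f}x)")
```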

Initial performance:

tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]:
	Tokens / Second: 16.032
	(Num Tokens: 2771, Time: 172.845 seconds)


------------------------------------------------------------------------- benchmark: 1 tests ------------------------------------------------------------------------
Name (time in s)                                                              Min       Max      Mean  StdDev    Median     IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]     172.8449  172.8449  172.8449  0.0000  172.8449  0.0000       0;0  0.0058       1           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Profile:

tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]
ncalls	tottime	percall	cumtime	percall	filename:lineno(function)
1	0.0003	0.0003	178.8296	178.8296	outlines/tests/benchmark/test_benchmark_cfg_generation.py:130(<lambda>)
1	3.5873	3.5873	178.8293	178.8293	outlines/tests/benchmark/test_benchmark_cfg_generation.py:51(run_until_eos)
2772	4.5557	0.0016	174.7364	0.0630	outlines/outlines/fsm/fsm.py:222(allowed_token_ids)
2455	8.7711	0.0036	144.3058	0.0588	outlines/outlines/fsm/fsm.py:95(__init__)
7753/7706	70.4711	0.0091	70.4712	0.0091	~:0(<built-in method builtins.sorted>)
2455	0.0374	0.0000	65.0214	0.0265	outlines/outlines/caching.py:64(wrapper)
2455	3.1791	0.0013	37.6745	0.0153	outlines/outlines/caching.py:39(hash_arguments)
4910	0.0452	0.0000	31.9835	0.0065	outlines/.myenv/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1464(dumps)
4910	0.0086	0.0000	31.8976	0.0065	outlines/.myenv/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1243(dump)
4910	31.8890	0.0065	31.8890	0.0065	~:0(<function Pickler.dump at 0x7fd57eeec400>)
2454	0.0111	0.0000	25.2837	0.0103	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:1224(__getitem__)
2454	0.0240	0.0000	25.2726	0.0103	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:1123(get)
2454	0.0243	0.0000	25.2196	0.0103	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:254(fetch)
2454	25.0553	0.0102	25.0553	0.0102	~:0(<built-in method _pickle.load>)
2455	0.1836	0.0001	15.8389	0.0065	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:103(accepts)
149119/26899	0.1850	0.0000	15.3162	0.0006	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:66(copy)
24444	0.0285	0.0000	15.2249	0.0006	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:61(__copy__)
24444	0.0610	0.0000	14.8236	0.0006	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_parser_state.py:56(__copy__)
4403485/24444	5.7888	0.0002	14.7345	0.0006	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:128(deepcopy)
1649638/24444	1.5524	0.0001	14.6750	0.0006	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:201(_deepcopy_list)
1625194/53426	1.1849	0.0000	14.2413	0.0003	outlines/.myenv/lib/python3.11/site-packages/lark/tree.py:206(__deepcopy__)
2455	0.0389	0.0000	4.6914	0.0019	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:47(exhaust_lexer)
194248	0.1144	0.0000	4.6525	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:35(iter_parse)
2454	4.1548	0.0017	4.1548	0.0017	outlines/outlines/fsm/fsm.py:272(<listcomp>)
216237	0.1050	0.0000	2.5984	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:28(feed_token)
4910	2.4741	0.0005	2.4741	0.0005	~:0(<method 'update' of '_hashlib.HASH' objects>)
216237	1.5775	0.0000	2.4934	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_parser_state.py:67(feed_token)
4403485	1.5684	0.0000	2.1144	0.0000	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:243(_keep_alive)
32	1.3233	0.0414	1.3235	0.0414	outlines/outlines/fsm/regex.py:461(state_scan_tokens)
1344890	0.8671	0.0000	1.1167	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:213(_future_new)
10485389	0.6935	0.0000	0.6935	0.0000	~:0(<built-in method builtins.id>)
5226	0.6614	0.0001	0.6659	0.0001	outlines/outlines/fsm/fsm.py:128(allowed_token_ids)
9005538	0.6520	0.0000	0.6520	0.0000	~:0(<method 'get' of 'dict' objects>)
1344890	0.6396	0.0000	1.7562	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:202(__new__)
1128653	0.6172	0.0000	2.0550	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:262(__deepcopy__)
194248	0.5555	0.0000	2.0572	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:590(next_token)
8795712	0.5146	0.0000	0.5146	0.0000	~:0(<method 'append' of 'list' objects>)
305361	0.4096	0.0000	0.4096	0.0000	~:0(<method 'match' of '_regex.Pattern' objects>)
1900847	0.3678	0.0000	0.3678	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/tree.py:61(__init__)
2928940	0.3527	0.0000	0.3527	0.0000	~:0(<built-in method builtins.getattr>)
2878528	0.2306	0.0000	0.2306	0.0000	~:0(<built-in method builtins.issubclass>)
1473314	0.1171	0.0000	0.1171	0.0000	~:0(<built-in method builtins.len>)
1371802	0.2532	0.0000	0.2532	0.0000	~:0(<built-in method __new__ of type object at 0x7fd61dfb1ba0>)
1240974	0.1310	0.0000	0.1395	0.0000	~:0(<built-in method builtins.isinstance>)
1021461	0.0764	0.0000	0.0764	0.0000	~:0(<method 'add' of 'set' objects>)
735283	0.0330	0.0000	0.0330	0.0000	~:0(<method 'setdefault' of 'dict' objects>)
555623	0.1320	0.0000	0.1852	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/grammar.py:124(__eq__)
489362	0.0533	0.0000	0.0533	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/grammar.py:121(__hash__)
407204	0.1553	0.0000	0.1917	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:265(__eq__)
304840	0.2316	0.0000	0.6925	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:387(match)
304840	0.1944	0.0000	0.2542	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:292(feed)
304840	0.1701	0.0000	0.9150	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:587(match)
9272	0.0022	0.0000	0.0022	0.0000	~:0(<method 'keys' of 'dict' objects>)
90	0.0000	0.0000	0.0000	0.0000	~:0(<method 'group' of 're.Match' objects>)
90	0.0000	0.0000	0.0000	0.0000	~:0(<method 'tell' of '_io.BytesIO' objects>)
9	0.0001	0.0000	0.0002	0.0000	outlines/.myenv/lib/python3.11/site-packages/packaging/version.py:186(__init__)
9	0.0000	0.0000	0.0000	0.0000	~:0(<method 'search' of 're.Pattern' objects>)
9	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/packaging/version.py:503(_cmpkey)
9	0.0000	0.0000	0.0002	0.0000	outlines/.myenv/lib/python3.11/site-packages/packaging/version.py:45(parse)
9	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/packaging/version.py:276(release)
9	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/packaging/version.py:491(_parse_local_version)
9	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/packaging/version.py:518(<lambda>)
89	0.0001	0.0000	0.0023	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:585(atom)
89	0.0000	0.0000	0.0001	0.0000	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:217(commit_frame)
89	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:35(__eq__)
87/1	0.0002	0.0002	0.0140	0.0140	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:535(save)
87/1	0.0001	0.0001	0.0142	0.0142	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/py_utils.py:607(save)
87/1	0.0001	0.0001	0.0141	0.0141	outlines/.myenv/lib/python3.11/site-packages/dill/_dill.py:365(save)
87	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/dill/_dill.py:314(get)
87	0.0000	0.0000	0.0000	0.0000	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:605(persistent_id)
86	0.0000	0.0000	0.0000	0.0000	~:0(<method 'issubset' of 'set' objects>)
85	0.0006	0.0000	0.0029	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:129(to_fsm)
85	0.0000	0.0000	0.0001	0.0000	outlines/outlines/fsm/regex.py:70(<genexpr>)
8/5	0.0000	0.0000	0.0006	0.0001	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:525(extension_group)
8	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/utils/simple_parser.py:109(any)
1	0.2836	0.2836	1.7119	1.7119	outlines/outlines/fsm/regex.py:494(create_fsm_index_end_to_end)
1	0.0224	0.0224	0.0224	0.0224	outlines/outlines/fsm/regex.py:584(<dictcomp>)
1	0.0115	0.0115	0.0115	0.0115	~:0(<built-in method _pickle.dumps>)
1	0.0105	0.0105	1.7596	1.7596	outlines/outlines/fsm/regex.py:558(create_fsm_index_tokenizer)
1	0.0013	0.0013	0.0033	0.0033	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:230(_write)
131/1	0.0012	0.0012	0.0062	0.0062	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:69(get_alphabet)
339	0.2901	0.0009	0.2901	0.0009	outlines/outlines/fsm/fsm.py:303(<listcomp>)
496/1	0.0003	0.0003	0.0032	0.0032	outlines/.myenv/lib/python3.11/site-packages/interegular/utils/simple_parser.py:34(w)
2	0.0005	0.0002	0.0072	0.0036	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:561(reversed)
1	0.0002	0.0002	0.0002	0.0002	~:0(<method 'intersection' of 'frozenset' objects>)
26/1	0.0001	0.0001	0.0561	0.0561	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:447(to_fsm)
1	0.0000	0.0000	1.8271	1.8271	outlines/outlines/fsm/fsm.py:96(create_states_mapping)
1	0.0000	0.0000	0.0151	0.0151	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:814(__setitem__)
1	0.0000	0.0000	0.0151	0.0151	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:749(set)
1	0.0000	0.0000	0.0149	0.0149	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:179(store)
1	0.0000	0.0000	0.0144	0.0144	outlines/outlines/models/transformers.py:177(__hash__)
1	0.0000	0.0000	0.0144	0.0144	outlines/.myenv/lib/python3.11/site-packages/datasets/fingerprint.py:231(hash)
1	0.0000	0.0000	0.0144	0.0144	outlines/.myenv/lib/python3.11/site-packages/datasets/fingerprint.py:227(hash_default)
1	0.0000	0.0000	0.0143	0.0143	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/py_utils.py:723(dumps)
1	0.0000	0.0000	0.0142	0.0142	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/py_utils.py:700(dump)
1	0.0000	0.0000	0.0142	0.0142	outlines/.myenv/lib/python3.11/site-packages/dill/_dill.py:416(dump)
1	0.0000	0.0000	0.0142	0.0142	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:476(dump)
117	0.0001	0.0000	0.0001	0.0000	~:0(<method 'write' of '_io.BytesIO' objects>)
3908	0.0016	0.0000	0.0016	0.0000	~:0(<method 'write' of '_io.BufferedWriter' objects>)
2474	0.0017	0.0000	0.0017	0.0000	~:0(<method 'values' of 'dict' objects>)
1	0.0001	0.0001	0.0001	0.0001	~:0(<method 'update' of 'xxhash.xxh64' objects>)
13315	0.0010	0.0000	0.0010	0.0000	~:0(<method 'update' of 'set' objects>)
12	0.0000	0.0000	0.0000	0.0000	~:0(<method 'union' of 'set' objects>)
161	0.0004	0.0000	0.0007	0.0000	~:0(<method 'union' of 'frozenset' objects>)
71	0.0000	0.0000	0.0000	0.0000	~:0(<method 'translate' of 'str' objects>)
16318	0.0096	0.0000	0.0096	0.0000	~:0(<method 'startswith' of 'str' objects>)
17	0.0000	0.0000	0.0000	0.0000	~:0(<method 'split' of 'str' objects>)
3	0.0000	0.0000	0.0000	0.0000	~:0(<method 'rstrip' of 'str' objects>)
3	0.0000	0.0000	0.0000	0.0000	~:0(<method 'rpartition' of 'str' objects>)
60769	0.0114	0.0000	0.0114	0.0000	~:0(<method 'rindex' of 'str' objects>)
3	0.0000	0.0000	0.0000	0.0000	~:0(<method 'rfind' of 'str' objects>)
27710	0.0027	0.0000	0.0027	0.0000	~:0(<method 'replace' of 'str' objects>)
2116	0.0004	0.0000	0.0004	0.0000	~:0(<method 'remove' of 'set' objects>)
32	0.0000	0.0000	0.0000	0.0000	~:0(<method 'pop' of 'set' objects>)
21	0.0000	0.0000	0.0000	0.0000	~:0(<method 'pop' of 'list' objects>)
2774	0.0006	0.0000	0.0006	0.0000	~:0(<method 'pop' of 'dict' objects>)
15	0.0000	0.0000	0.0000	0.0000	~:0(<method 'match' of 're.Pattern' objects>)

This PR's performance:

Additional Benchmark Details:
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]:
	Tokens / Second: 38.981
	(Num Tokens: 2771, Time: 71.085 seconds)


----------------------------------------------------------------------- benchmark: 1 tests ----------------------------------------------------------------------
Name (time in s)                                                             Min      Max     Mean  StdDev   Median     IQR  Outliers     OPS  Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]     71.0852  71.0852  71.0852  0.0000  71.0852  0.0000       0;0  0.0141       1           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------

Profile:

--------------------------------------------------------------------------------- cProfile (time in s) ---------------------------------------------------------------------------------
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]
ncalls	tottime	percall	cumtime	percall	filename:lineno(function)
1	0.0003	0.0003	389.3176	389.3176	outlines/tests/benchmark/test_benchmark_cfg_generation.py:130(<lambda>)
1	5.2589	5.2589	389.3173	389.3173	outlines/tests/benchmark/test_benchmark_cfg_generation.py:51(run_until_eos)
2772	6.1756	0.0022	383.3901	0.1383	outlines/outlines/fsm/fsm.py:224(allowed_token_ids)
2455	3.1861	0.0013	263.2231	0.1072	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:103(accepts)
153859/27689	0.3803	0.0000	259.2035	0.0094	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:66(copy)
25234	0.0834	0.0000	259.0552	0.0103	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:61(__copy__)
25234	1.2313	0.0000	257.8610	0.0102	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_parser_state.py:56(__copy__)
53988175/25234	97.4954	0.0039	256.5626	0.0102	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:128(deepcopy)
20196252/25234	25.6356	0.0010	256.4766	0.0102	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:201(_deepcopy_list)
20171018/55104	20.5832	0.0004	255.9176	0.0046	outlines/.myenv/lib/python3.11/site-packages/lark/tree.py:206(__deepcopy__)
2455	0.4501	0.0002	74.1375	0.0302	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:47(exhaust_lexer)
2411113	1.5916	0.0000	73.6874	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:35(iter_parse)
13620905	13.3024	0.0000	48.7314	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:262(__deepcopy__)
16054797	16.6253	0.0000	39.5422	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:202(__new__)
2433892	1.3488	0.0000	38.6249	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_interactive_parser.py:28(feed_token)
2433892	23.1669	0.0000	37.2761	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_parser_state.py:67(feed_token)
2411113	1.8932	0.0000	34.2022	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:661(lex)
2455	0.0480	0.0000	32.7434	0.0133	outlines/outlines/fsm/fsm.py:96(__init__)
2455	0.0185	0.0000	32.6171	0.0133	outlines/outlines/caching.py:64(wrapper)
2455	0.0129	0.0000	32.2840	0.0132	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:1224(__getitem__)
2455	0.0271	0.0000	32.2711	0.0131	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:1123(get)
2455	0.0294	0.0000	32.2104	0.0131	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:254(fetch)
2455	32.0279	0.0130	32.0279	0.0130	~:0(<built-in method _pickle.load>)
2411113	7.8974	0.0000	31.9108	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:590(next_token)
53988175	23.2596	0.0000	31.8389	0.0000	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:243(_keep_alive)
16054797	18.8740	0.0000	22.9169	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:213(_future_new)
128197836	11.8729	0.0000	11.8729	0.0000	~:0(<built-in method builtins.id>)
108168351	9.9834	0.0000	9.9834	0.0000	~:0(<method 'get' of 'dict' objects>)
107566250	7.5129	0.0000	7.5129	0.0000	~:0(<method 'append' of 'list' objects>)
3626455	6.5151	0.0000	12.9878	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:387(match)
3626455	5.7808	0.0000	5.7808	0.0000	~:0(<method 'match' of '_regex.Pattern' objects>)
23848985	5.5810	0.0000	5.5810	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/tree.py:61(__init__)
2455	5.5302	0.0023	5.5302	0.0023	outlines/outlines/fsm/fsm.py:274(<listcomp>)
33972815	5.0665	0.0000	5.0665	0.0000	~:0(<built-in method builtins.getattr>)
16082486	4.0508	0.0000	4.0508	0.0000	~:0(<built-in method __new__ of type object at 0x7f62985b1ba0>)
1986232	3.3941	0.0000	4.4274	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parse_tree_builder.py:145(__call__)
33920548	3.3124	0.0000	3.3124	0.0000	~:0(<built-in method builtins.issubclass>)
3626455	2.5135	0.0000	3.2301	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:292(feed)
17465659	1.6116	0.0000	1.6116	0.0000	~:0(<built-in method builtins.len>)
14338433	1.5978	0.0000	1.6085	0.0000	~:0(<built-in method builtins.isinstance>)
6682337	1.8151	0.0000	2.5300	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/grammar.py:124(__eq__)
6608887	0.8644	0.0000	0.8644	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/grammar.py:121(__hash__)
5163159	2.2580	0.0000	2.8340	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:265(__eq__)
3626455	2.3552	0.0000	16.0163	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:587(match)
3626455	0.6919	0.0000	0.6919	0.0000	~:0(<method 'group' of '_regex.Match' objects>)
3626455	0.6733	0.0000	0.6733	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:581(scanner)
9230	0.0036	0.0000	0.0036	0.0000	~:0(<method 'keys' of 'dict' objects>)
7365	0.0384	0.0000	0.0384	0.0000	~:0(<method '__exit__' of '_io._IOBase' objects>)
7365	0.0152	0.0000	0.0251	0.0000	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/typing.py:352(inner)
57787	0.0214	0.0000	0.0214	0.0000	~:0(<method 'isupper' of 'str' objects>)
532129	0.1161	0.0000	0.1161	0.0000	~:0(<method 'rindex' of 'str' objects>)
5227	0.9069	0.0002	0.9114	0.0002	outlines/outlines/fsm/fsm.py:129(allowed_token_ids)
5226	0.0198	0.0000	0.0198	0.0000	~:0(<built-in method builtins.sorted>)
4910	0.1251	0.0000	0.1251	0.0000	~:0(<method 'execute' of 'sqlite3.Connection' objects>)
4910	0.0296	0.0000	0.0296	0.0000	outlines/.myenv/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1253(__init__)
4910	0.0289	0.0000	0.0795	0.0000	outlines/.myenv/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1464(dumps)
4910	0.0252	0.0000	0.0252	0.0000	~:0(<method 'fetchall' of 'sqlite3.Cursor' objects>)
4910	0.0125	0.0000	0.0125	0.0000	~:0(<function Pickler.dump at 0x7f61f96e8360>)
4910	0.0109	0.0000	0.0236	0.0000	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:608(_con)
4910	0.0064	0.0000	0.0299	0.0000	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:646(_sql)
4910	0.0055	0.0000	0.0055	0.0000	~:0(<method 'update' of '_hashlib.HASH' objects>)
4910	0.0048	0.0000	0.0048	0.0000	~:0(<built-in method posix.getpid>)
4910	0.0045	0.0000	0.0170	0.0000	outlines/.myenv/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1243(dump)
4910	0.0039	0.0000	0.0039	0.0000	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:139(put)
4910	0.0022	0.0000	0.0022	0.0000	~:0(<built-in method time.time>)
4910	0.0012	0.0000	0.0012	0.0000	~:0(<method 'getvalue' of '_io.BytesIO' objects>)
339	0.2992	0.0009	0.2992	0.0009	outlines/outlines/fsm/fsm.py:305(<listcomp>)
2771	0.3501	0.0001	0.6461	0.0002	outlines/outlines/fsm/fsm.py:309(next_state)
2455	0.0911	0.0000	0.0911	0.0000	~:0(<built-in method io.open>)
2771	0.0546	0.0000	0.0546	0.0000	~:0(<method 'decode' of 'tokenizers.Tokenizer' objects>)
2455	0.0389	0.0000	0.0731	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parser_frontends.py:96(_make_lexer_thread)
2771	0.0375	0.0000	0.1555	0.0001	outlines/.myenv/lib/python3.11/site-packages/transformers/utils/generic.py:232(to_py_obj)
2455	0.0244	0.0000	0.1445	0.0001	outlines/outlines/caching.py:39(hash_arguments)
2771	0.0104	0.0000	0.2833	0.0001	outlines/outlines/models/transformers.py:159(decode)
2771	0.0078	0.0000	0.2729	0.0001	outlines/.myenv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:3686(batch_decode)
2771	0.0143	0.0000	0.2651	0.0001	outlines/.myenv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:3710(<listcomp>)
2771	0.0123	0.0000	0.2507	0.0001	outlines/.myenv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:3720(decode)
2455	0.0015	0.0000	0.0015	0.0000	~:0(<method 'values' of 'dict' objects>)
16310	0.0138	0.0000	0.0138	0.0000	~:0(<method 'startswith' of 'str' objects>)
27710	0.0034	0.0000	0.0034	0.0000	~:0(<method 'replace' of 'str' objects>)
2116	0.0004	0.0000	0.0004	0.0000	~:0(<method 'remove' of 'set' objects>)
2771	0.0008	0.0000	0.0008	0.0000	~:0(<method 'pop' of 'dict' objects>)
2455	0.0022	0.0000	0.0022	0.0000	~:0(<method 'join' of 'str' objects>)
32915	0.0129	0.0000	0.0129	0.0000	~:0(<method 'items' of 'dict' objects>)
2455	0.0071	0.0000	0.0071	0.0000	~:0(<method 'hexdigest' of '_hashlib.HASH' objects>)
2771	0.0005	0.0000	0.0005	0.0000	~:0(<method 'extend' of 'list' objects>)
2455	0.0012	0.0000	0.0012	0.0000	~:0(<method 'endswith' of 'str' objects>)
1	0.0000	0.0000	0.0000	0.0000	~:0(<method 'disable' of '_lsprof.Profiler' objects>)
1112206	0.2710	0.0000	0.2710	0.0000	~:0(<method 'count' of 'str' objects>)
27591	0.2041	0.0000	0.2041	0.0000	~:0(<method 'copy' of 'list' objects>)

lapp0 mentioned this pull request Feb 7, 2024
lapp0 changed the title from "Apply lru_cache to RegexFSM" to "Apply functools.lru_cache to RegexFSM to Improve CFGFSM performance" Feb 7, 2024

lapp0 commented Feb 13, 2024

@rlouf this, #622, and #623 are relatively simple changes. When merged, they will increase CFGFSM performance 20-fold :)

Could you please review when you have a chance?
