Commit
These benchmarks help us make sure the optimisations are going in the right direction.

The benchmark added to time `Segment.divide` also shows that two proposed changes to `Segment.divide` using binary search ended up being slower than the current linear search. They're included here because:

1. they _may_ come in handy in the future (unlikely); and
2. I'm sad about throwing this code away.

The first implementation replaces the (very clever) linear search with a binary search similar to what we did for `set_cell_size`. My benchmarks showed that this was only 10% faster, so I thought maybe I could replace the list slicing with iterator slicing and it would be better, which resulted in my second implementation. They're both equally fast, and the rich benchmarks showed both were actually slower than the current linear search.

```py
# First implementation: binary search with list slicing.
@classmethod
def divide(
    cls, segments: Iterable["Segment"], cuts: Iterable[int]
) -> Iterable[List["Segment"]]:
    """Divides an iterable of segments into portions.

    Args:
        segments (Iterable[Segment]): The segments to divide.
        cuts (Iterable[int]): Cell positions where to divide.

    Yields:
        Iterable[List[Segment]]: An iterable of Segments in List.
    """
    _cell_len = cached_cell_len

    segments = list(segments)
    cuts = list(cuts)
    widths = [0 if s.control else _cell_len(s.text) for s in segments]
    lengths = list(accumulate(widths))

    offset = 0
    for cut in cuts:
        if cut == offset:
            yield []
            continue

        segment_idx = bisect.bisect_left(lengths, cut)
        if segment_idx >= len(lengths):
            yield segments
            return

        if lengths[segment_idx] == cut:
            yield segments[: segment_idx + 1]
            segments = segments[segment_idx + 1 :]
            lengths = lengths[segment_idx + 1 :]
        else:
            start_width = lengths[segment_idx - 1] if segment_idx > 0 else offset
            before, after = segments[segment_idx].split_cells(cut - start_width)
            yield segments[:segment_idx] + [before]
            segments = segments[segment_idx:]
            segments[0] = after
            lengths = lengths[segment_idx:]

        offset = cut


# Second implementation: binary search with iterator slicing.
@classmethod
def divide(
    cls, segments: Iterable["Segment"], cuts: Iterable[int]
) -> Iterable[List["Segment"]]:
    """Divides an iterable of segments into portions.

    Args:
        segments (Iterable[Segment]): The segments to divide.
        cuts (Iterable[int]): Cell positions where to divide.

    Yields:
        Iterable[List[Segment]]: An iterable of Segments in List.
    """
    _cell_len = cached_cell_len

    segments = list(segments)
    cuts = list(cuts)
    widths = [0 if s.control else _cell_len(s.text) for s in segments]
    lengths = list(accumulate(widths))
    segments_iter = iter(segments)

    idx_offset = 0
    offset = 0
    for cut in cuts:
        if cut == offset:
            yield []
            continue

        length_idx = bisect.bisect_left(lengths, cut)
        if length_idx >= len(lengths):
            yield list(segments_iter)
            return

        if lengths[length_idx] == cut:
            segments = list(islice(segments_iter, length_idx - idx_offset + 1))
            yield segments
            idx_offset += len(segments)
        else:
            start_width = lengths[length_idx - 1] if length_idx > idx_offset else offset
            segments = list(islice(segments_iter, length_idx - idx_offset + 1))
            before, after = segments[-1].split_cells(cut - start_width)
            segments_iter = chain([after], segments_iter)
            segments[-1] = before
            yield segments
            idx_offset += len(segments) - 1

        offset = cut
```
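For readers who want the core idea without the `Segment` machinery, here is a minimal sketch of the lookup both binary-search variants rely on: accumulate the cell widths, then use `bisect.bisect_left` to find the piece that contains a cut. It works on plain integer widths, and `locate_cut` is a hypothetical helper written for illustration. It is not part of rich and makes no claim about performance.

```python
import bisect
from itertools import accumulate
from typing import List, Tuple


def locate_cut(widths: List[int], cut: int) -> Tuple[int, int]:
    """Return (index, offset) for the piece containing cell position `cut`.

    `index` is the first piece whose cumulative end position is >= cut and
    `offset` is how many cells into that piece the cut falls.
    Hypothetical helper for illustration only.
    """
    ends = list(accumulate(widths))  # cumulative end position of each piece
    index = bisect.bisect_left(ends, cut)
    if index >= len(ends):
        return len(widths), 0  # the cut lies beyond the last piece
    start = ends[index - 1] if index > 0 else 0
    return index, cut - start


widths = [5, 3, 4]  # cell widths of three hypothetical segments
print(locate_cut(widths, 5))   # (0, 5): the cut lands exactly at the end of piece 0
print(locate_cut(widths, 6))   # (1, 1): the cut falls one cell into piece 1
print(locate_cut(widths, 20))  # (3, 0): the cut is past the end
```

When the offset equals the width of the piece, the cut lands on a boundary and no split is needed, which corresponds to the `lengths[...] == cut` branch in both implementations above.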