Add get_start_time/get_end_time functions and use them in get_duration #3623
|
Where do you think the loading into memory is happening? The duration only requires the first and last timestamps, so maybe we can just skip the transformation to an array when the timestamps are memmap and/or zarr, provided they are read-only? My concern is that changing the repr with these defaults will display wrong information in some cases, but maybe I am wrong. |
|
I think that transforming to an array is very convenient for caching the timestamps in memory, and overall it improves performance. Another option could be to have get_start_time and get_end_time functions, which would only access the required samples. |
|
Fair enough, changing the way that Adding private methods like this:
Seems like a great idea to me. That way, we can use them in get_duration without touching anything else. |
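For readers following the thread, here is a minimal sketch of the idea discussed above: reading only the first and last timestamps instead of materializing the whole time vector. The class name SegmentSketch and its attributes are hypothetical and are not the library's actual implementation. Adding one sample period in get_duration is also an assumption, chosen so that uniformly sampled data gives num_samples / sampling_frequency.

```python
import numpy as np


class SegmentSketch:
    """Hypothetical stand-in for a recording segment (not the library's class)."""

    def __init__(self, num_samples, sampling_frequency, t_start=None, time_vector=None):
        self.num_samples = num_samples
        self.sampling_frequency = sampling_frequency
        self.t_start = t_start
        # May be a NumPy array, np.memmap, zarr array, or None.
        self.time_vector = time_vector

    def get_start_time(self):
        if self.time_vector is not None:
            # Indexing one element lets lazy arrays read only that sample.
            return float(self.time_vector[0])
        return float(self.t_start) if self.t_start is not None else 0.0

    def get_end_time(self):
        if self.time_vector is not None:
            return float(self.time_vector[-1])
        # Without timestamps, derive the last sample time from the sample count.
        return self.get_start_time() + (self.num_samples - 1) / self.sampling_frequency

    def get_duration(self):
        # Assumption: add one sample period so the last sample's width is counted.
        return self.get_end_time() - self.get_start_time() + 1.0 / self.sampling_frequency


segment = SegmentSketch(1000, 100.0, time_vector=np.arange(1000) / 100.0)
print(segment.get_start_time(), segment.get_end_time(), segment.get_duration())
```

The same indexing works when time_vector is an np.memmap or a zarr array, in which case only the first and last elements need to be read.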
Title changed from "use_times option in get_duration/get_total_duration" to "get_start_time/get_stop_time functions and use them in get_duration"
|
@h-mayorquin done :) much faster now |
|
This looks good but there are some tests that are failing due to tolerance I guess? Two points for discussion:
|
|
Changed to I'm OK with exposing these functions as public, since their behavior is quite straightforward. |
Title changed from "get_start_time/get_stop_time functions and use them in get_duration" to "get_start_time/get_end_time functions and use them in get_duration"
Since the __repr__ uses the get_durations function, I added an option to avoid using timestamps to estimate duration. In fact, this can be quite slow for long zarr datasets with timestamps, since the whole array is loaded in memory when calling get_times().
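
To illustrate the trade-off described here, the sketch below compares a duration estimated from timestamps with one estimated from the sample count alone. The function estimate_duration is hypothetical. The use_times flag mirrors the option named in the original PR title, but its real signature and defaults are not shown in this thread. The example also shows why skipping timestamps can report a different duration when the time vector contains a gap.

```python
import numpy as np


def estimate_duration(num_samples, sampling_frequency, times=None, use_times=True):
    """Hypothetical helper: duration in seconds, with or without timestamps."""
    if use_times and times is not None:
        # Only the first and last timestamps are read, never the full array.
        return float(times[-1]) - float(times[0]) + 1.0 / sampling_frequency
    # Fallback that ignores timestamps entirely: cheap, but blind to gaps.
    return num_samples / sampling_frequency


fs = 100.0
# Two 5 s chunks separated by a 5 s gap in the recorded timestamps.
times = np.concatenate([np.arange(500), np.arange(500) + 1000]) / fs

with_times = estimate_duration(len(times), fs, times=times, use_times=True)
without_times = estimate_duration(len(times), fs, times=times, use_times=False)
print(with_times, without_times)
```

With a gap in the timestamps the two estimates disagree, which is the case where a repr that ignores timestamps would display a shorter duration than the recording actually spans.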