Throughout this guide, you will iterate across a dataset and measure the performance. Making reproducible performance benchmarks can be difficult, since factors such as the current CPU load and caching effects affect the measured times.

To get a reproducible benchmark, you will build an artificial example.

Start by defining a class called ArtificialDataset that inherits from tf.data.Dataset. This dataset:

- Generates num_samples samples (default is 3)
- Sleeps for some time before the first item to simulate opening a file
- Sleeps for some time before producing each item to simulate reading data (line, record) from the file

```python
class ArtificialDataset(tf.data.Dataset):
    def _generator(num_samples):
        # Opening the file
        time.sleep(0.03)
        for sample_idx in range(num_samples):
            # Reading data (line, record) from the file
            time.sleep(0.015)
            yield (sample_idx,)

    def __new__(cls, num_samples=3):
        return tf.data.Dataset.from_generator(
            cls._generator,
            output_signature=tf.TensorSpec(shape=(1,), dtype=tf.int64),
            args=(num_samples,),
        )
```

This dataset is similar to the tf.data.Dataset.range one, adding a fixed delay at the beginning of and in-between each sample.

Next, write a dummy training loop that measures how long it takes to iterate over a dataset. Training time is simulated by a short sleep per sample:

```python
def benchmark(dataset, num_epochs=2):
    start_time = time.perf_counter()
    for epoch_num in range(num_epochs):
        for sample in dataset:
            # Performing a training step
            time.sleep(0.01)
    print("Execution time:", time.perf_counter() - start_time)
```

To exhibit how performance can be optimized, you will improve the performance of the ArtificialDataset. Start with a naive pipeline using no tricks, iterating over the dataset as-is.

Under the hood, this is how your execution time was spent. Performing a training step involves opening a file if it hasn't been opened yet, reading a data entry from it, and then using that data for training. However, in a naive synchronous implementation like the one here, while your pipeline is fetching the data, your model is sitting idle. Conversely, while your model is training, the input pipeline is sitting idle. The training step time is thus the sum of the opening, reading and training times.

The next sections build on this input pipeline, illustrating best practices for designing performant TensorFlow input pipelines.

Prefetching overlaps the preprocessing and model execution of a training step. While the model is executing training step s, the input pipeline is reading the data for step s+1. Doing so reduces the step time to the maximum (as opposed to the sum) of the training time and the time it takes to extract the data.

The tf.data API provides the tf.data.Dataset.prefetch transformation. It can be used to decouple the time when data is produced from the time when data is consumed. In particular, the transformation uses a background thread and an internal buffer to prefetch elements from the input dataset ahead of the time they are requested. The number of elements to prefetch should be equal to (or possibly greater than) the number of batches consumed by a single training step. You could either manually tune this value, or set it to tf.data.AUTOTUNE, which will prompt the tf.data runtime to tune the value dynamically at runtime.
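The background-thread-plus-bounded-buffer mechanism that prefetching relies on can be sketched in plain Python, independent of tf.data. This is a minimal sketch, not tf.data's actual implementation; the name `prefetch_iter` and the helper structure are illustrative assumptions:

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the stream

def prefetch_iter(iterable, buffer_size=1):
    """Yield items from `iterable`, reading ahead in a background thread.

    A bounded queue plays the role of the internal buffer: the producer
    thread keeps filling it while the consumer is busy "training", so
    production and consumption overlap instead of alternating.
    """
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in iterable:
            buf.put(item)      # blocks when the buffer is full
        buf.put(_SENTINEL)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is _SENTINEL:
            return
        yield item

# Elements still arrive in order; they are just produced ahead of time.
items = list(prefetch_iter(range(5), buffer_size=2))
```

The bounded `maxsize` mirrors the buffer-size choice discussed above: a larger buffer lets the producer run further ahead, at the cost of memory, which is exactly the trade-off tf.data.AUTOTUNE tunes for you.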