forked from Meelfy/pytorch_pretrained_BERT
nohup.out
12/28/2018 20:54:12 - INFO - __main__ - device: cuda n_gpu: 2, distributed training: False, 16-bits training: False
12/28/2018 20:54:12 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file /data/nfsdata/meijie/data/uncased_L-12_H-768_A-12/vocab.txt
12/28/2018 20:54:13 - INFO - pytorch_pretrained_bert.modeling - loading archive file /data/nfsdata/meijie/data/uncased_L-12_H-768_A-12
12/28/2018 20:54:13 - INFO - pytorch_pretrained_bert.modeling - Model config {
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"type_vocab_size": 2,
"vocab_size": 30522
}
12/28/2018 20:54:23 - INFO - pytorch_pretrained_bert.modeling - Weights of BertForQuestionAnswering not initialized from pretrained model: ['qa_outputs.weight', 'qa_outputs.bias']
12/28/2018 20:54:23 - INFO - pytorch_pretrained_bert.modeling - Weights from pretrained model not used in BertForQuestionAnswering: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
12/28/2018 20:54:29 - INFO - __main__ - ***** Running training *****
12/28/2018 20:54:29 - INFO - __main__ - Num orig examples = 1177
12/28/2018 20:54:29 - INFO - __main__ - Num split examples = 1196
12/28/2018 20:54:29 - INFO - __main__ - Batch size = 12
12/28/2018 20:54:29 - INFO - __main__ - Num steps = 2942
Epoch: 0%| | 0/30 [00:00<?, ?it/s]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
/home/meefly/.local/lib/python3.6/site-packages/torch/nn/parallel/_functions.py:58: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
Iteration: 1%| | 1/100 [00:06<10:58, 6.65s/it][A
Iteration: 2%|β | 2/100 [00:07<07:57, 4.87s/it][A
Iteration: 3%|β | 3/100 [00:07<05:46, 3.58s/it][A
Iteration: 4%|β | 4/100 [00:08<04:16, 2.67s/it][A
Iteration: 5%|β | 5/100 [00:09<03:12, 2.03s/it][A
Iteration: 6%|β | 6/100 [00:09<02:28, 1.58s/it][A
Iteration: 7%|β | 7/100 [00:10<01:57, 1.26s/it][A
Iteration: 8%|β | 8/100 [00:10<01:35, 1.04s/it][A
Iteration: 9%|β | 9/100 [00:11<01:20, 1.14it/s][A
Iteration: 10%|β | 10/100 [00:11<01:09, 1.30it/s][A
Iteration: 11%|β | 11/100 [00:12<01:01, 1.44it/s][A
Iteration: 12%|ββ | 12/100 [00:12<00:57, 1.53it/s][A
Iteration: 13%|ββ | 13/100 [00:13<00:53, 1.62it/s][A
Iteration: 14%|ββ | 14/100 [00:13<00:51, 1.67it/s][A
Iteration: 15%|ββ | 15/100 [00:14<00:49, 1.71it/s][A
Iteration: 16%|ββ | 16/100 [00:14<00:48, 1.74it/s][A
Iteration: 17%|ββ | 17/100 [00:15<00:47, 1.76it/s][A
Iteration: 18%|ββ | 18/100 [00:15<00:45, 1.79it/s][A
Iteration: 19%|ββ | 19/100 [00:16<00:44, 1.81it/s][A
Iteration: 20%|ββ | 20/100 [00:17<00:44, 1.81it/s][A
Iteration: 21%|ββ | 21/100 [00:17<00:43, 1.81it/s][A
Iteration: 22%|βββ | 22/100 [00:18<00:43, 1.81it/s][A
Iteration: 23%|βββ | 23/100 [00:18<00:42, 1.81it/s][A
Iteration: 24%|βββ | 24/100 [00:19<00:41, 1.82it/s][A
Iteration: 25%|βββ | 25/100 [00:19<00:41, 1.83it/s][A
Iteration: 26%|βββ | 26/100 [00:20<00:40, 1.83it/s][A
Iteration: 27%|βββ | 27/100 [00:20<00:39, 1.84it/s][A
Iteration: 28%|βββ | 28/100 [00:21<00:39, 1.83it/s][A
Iteration: 29%|βββ | 29/100 [00:21<00:38, 1.83it/s][A
Iteration: 30%|βββ | 30/100 [00:22<00:38, 1.83it/s][A
Iteration: 31%|βββ | 31/100 [00:23<00:37, 1.83it/s][A
Iteration: 32%|ββββ | 32/100 [00:23<00:37, 1.83it/s][A
Iteration: 33%|ββββ | 33/100 [00:24<00:36, 1.83it/s][A
Iteration: 34%|ββββ | 34/100 [00:24<00:35, 1.84it/s][A
Iteration: 35%|ββββ | 35/100 [00:25<00:35, 1.85it/s][A
Iteration: 36%|ββββ | 36/100 [00:25<00:34, 1.84it/s][A
Iteration: 37%|ββββ | 37/100 [00:26<00:34, 1.83it/s][A
Iteration: 38%|ββββ | 38/100 [00:26<00:33, 1.83it/s][A
Iteration: 39%|ββββ | 39/100 [00:27<00:33, 1.84it/s][A
Iteration: 40%|ββββ | 40/100 [00:27<00:32, 1.85it/s][A
Iteration: 41%|ββββ | 41/100 [00:28<00:31, 1.86it/s][A
Iteration: 42%|βββββ | 42/100 [00:29<00:31, 1.87it/s][A
Iteration: 43%|βββββ | 43/100 [00:29<00:30, 1.86it/s][A
Iteration: 44%|βββββ | 44/100 [00:30<00:30, 1.85it/s][A
Iteration: 45%|βββββ | 45/100 [00:30<00:29, 1.84it/s][A
Iteration: 46%|βββββ | 46/100 [00:31<00:29, 1.84it/s][A
Iteration: 47%|βββββ | 47/100 [00:31<00:28, 1.87it/s][A
Iteration: 48%|βββββ | 48/100 [00:32<00:27, 1.89it/s][A
Iteration: 49%|βββββ | 49/100 [00:32<00:26, 1.91it/s][A
Iteration: 50%|βββββ | 50/100 [00:33<00:26, 1.92it/s][A
12/28/2018 21:00:36 - INFO - __main__ - device: cuda n_gpu: 2, distributed training: False, 16-bits training: False
12/28/2018 21:00:36 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file /data/nfsdata/meijie/data/uncased_L-12_H-768_A-12/vocab.txt
12/28/2018 21:00:36 - INFO - pytorch_pretrained_bert.modeling - loading archive file /data/nfsdata/meijie/data/uncased_L-12_H-768_A-12
12/28/2018 21:00:36 - INFO - pytorch_pretrained_bert.modeling - Model config {
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"type_vocab_size": 2,
"vocab_size": 30522
}
12/28/2018 21:00:47 - INFO - pytorch_pretrained_bert.modeling - Weights of BertForQuestionAnswering not initialized from pretrained model: ['qa_outputs.weight', 'qa_outputs.bias']
12/28/2018 21:00:47 - INFO - pytorch_pretrained_bert.modeling - Weights from pretrained model not used in BertForQuestionAnswering: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
12/28/2018 21:00:52 - INFO - __main__ - ***** Running training *****
12/28/2018 21:00:52 - INFO - __main__ - Num orig examples = 1177
12/28/2018 21:00:52 - INFO - __main__ - Num split examples = 1196
12/28/2018 21:00:52 - INFO - __main__ - Batch size = 12
12/28/2018 21:00:52 - INFO - __main__ - Num steps = 2942
Epoch: 0%| | 0/30 [00:00<?, ?it/s]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
/home/meefly/.local/lib/python3.6/site-packages/torch/nn/parallel/_functions.py:58: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
Iteration: 1%| | 1/100 [00:06<10:49, 6.56s/it][A
Iteration: 2%|β | 2/100 [00:07<07:48, 4.78s/it][A
Iteration: 3%|β | 3/100 [00:07<05:40, 3.51s/it][A
Iteration: 4%|β | 4/100 [00:08<04:11, 2.62s/it][A
Iteration: 5%|β | 5/100 [00:08<03:09, 2.00s/it][A
Iteration: 6%|β | 6/100 [00:09<02:26, 1.56s/it][A
Iteration: 7%|β | 7/100 [00:09<01:56, 1.25s/it][A
Iteration: 8%|β | 8/100 [00:10<01:35, 1.04s/it][A
Iteration: 9%|β | 9/100 [00:10<01:21, 1.12it/s][A
Iteration: 10%|β | 10/100 [00:11<01:10, 1.27it/s][A
Iteration: 11%|β | 11/100 [00:12<01:03, 1.40it/s][A
Iteration: 12%|ββ | 12/100 [00:12<00:58, 1.51it/s][A
Iteration: 13%|ββ | 13/100 [00:13<00:54, 1.60it/s][A
Iteration: 14%|ββ | 14/100 [00:13<00:51, 1.67it/s][A
Iteration: 15%|ββ | 15/100 [00:14<00:49, 1.72it/s][A
Iteration: 16%|ββ | 16/100 [00:14<00:47, 1.75it/s][A
Iteration: 17%|ββ | 17/100 [00:15<00:46, 1.79it/s][A
Iteration: 18%|ββ | 18/100 [00:15<00:45, 1.81it/s][A
Iteration: 19%|ββ | 19/100 [00:16<00:44, 1.83it/s][A
Iteration: 20%|ββ | 20/100 [00:16<00:43, 1.83it/s][A
Iteration: 21%|ββ | 21/100 [00:17<00:43, 1.83it/s][A
Iteration: 22%|βββ | 22/100 [00:18<00:42, 1.83it/s][A
Iteration: 23%|βββ | 23/100 [00:18<00:41, 1.83it/s][A
Iteration: 24%|βββ | 24/100 [00:19<00:41, 1.84it/s][A
Iteration: 25%|βββ | 25/100 [00:19<00:40, 1.85it/s][A
Iteration: 26%|βββ | 26/100 [00:20<00:39, 1.85it/s][A
Iteration: 27%|βββ | 27/100 [00:20<00:39, 1.86it/s][A
Iteration: 28%|βββ | 28/100 [00:21<00:38, 1.87it/s][A
Iteration: 29%|βββ | 29/100 [00:21<00:37, 1.88it/s][A
Iteration: 30%|βββ | 30/100 [00:22<00:37, 1.88it/s][A
Iteration: 31%|βββ | 31/100 [00:22<00:36, 1.88it/s][A
Iteration: 32%|ββββ | 32/100 [00:23<00:36, 1.88it/s][A
Iteration: 33%|ββββ | 33/100 [00:23<00:35, 1.88it/s][A
Iteration: 34%|ββββ | 34/100 [00:24<00:35, 1.89it/s][A
Iteration: 35%|ββββ | 35/100 [00:24<00:34, 1.89it/s][A
Iteration: 36%|ββββ | 36/100 [00:25<00:34, 1.88it/s][A
Iteration: 37%|ββββ | 37/100 [00:26<00:33, 1.87it/s][A
Iteration: 38%|ββββ | 38/100 [00:26<00:33, 1.87it/s][A
Iteration: 39%|ββββ | 39/100 [00:27<00:32, 1.87it/s][A
Iteration: 40%|ββββ | 40/100 [00:27<00:32, 1.87it/s][A
Iteration: 41%|ββββ | 41/100 [00:28<00:31, 1.87it/s][A
Iteration: 42%|βββββ | 42/100 [00:28<00:30, 1.88it/s][A
Iteration: 43%|βββββ | 43/100 [00:29<00:30, 1.87it/s][A
Iteration: 44%|βββββ | 44/100 [00:29<00:30, 1.87it/s][A
Iteration: 45%|βββββ | 45/100 [00:30<00:29, 1.86it/s][A
Iteration: 46%|βββββ | 46/100 [00:30<00:29, 1.85it/s][A
Iteration: 47%|βββββ | 47/100 [00:31<00:28, 1.85it/s][A
Iteration: 48%|βββββ | 48/100 [00:31<00:28, 1.84it/s][A
Iteration: 49%|βββββ | 49/100 [00:32<00:27, 1.84it/s][A
Iteration: 50%|βββββ | 50/100 [00:33<00:27, 1.84it/s][A
Iteration: 51%|βββββ | 51/100 [00:33<00:26, 1.84it/s][A
Iteration: 52%|ββββββ | 52/100 [00:34<00:26, 1.84it/s][A
Iteration: 53%|ββββββ | 53/100 [00:34<00:25, 1.84it/s][A
Iteration: 54%|ββββββ | 54/100 [00:35<00:24, 1.84it/s][A
Iteration: 55%|ββββββ | 55/100 [00:35<00:24, 1.84it/s][A
Iteration: 56%|ββββββ | 56/100 [00:36<00:23, 1.84it/s][A
Iteration: 57%|ββββββ | 57/100 [00:36<00:23, 1.85it/s][A
Iteration: 58%|ββββββ | 58/100 [00:37<00:22, 1.86it/s][A
Iteration: 59%|ββββββ | 59/100 [00:37<00:22, 1.85it/s][A
Iteration: 60%|ββββββ | 60/100 [00:38<00:21, 1.85it/s][A
Iteration: 61%|ββββββ | 61/100 [00:38<00:21, 1.84it/s][A
Iteration: 62%|βββββββ | 62/100 [00:39<00:20, 1.85it/s][A
Iteration: 63%|βββββββ | 63/100 [00:40<00:20, 1.83it/s][A
Iteration: 64%|βββββββ | 64/100 [00:40<00:19, 1.83it/s][A
Iteration: 65%|βββββββ | 65/100 [00:41<00:19, 1.82it/s][A
Iteration: 66%|βββββββ | 66/100 [00:41<00:18, 1.82it/s][A
Iteration: 67%|βββββββ | 67/100 [00:42<00:18, 1.82it/s][A
Iteration: 68%|βββββββ | 68/100 [00:42<00:17, 1.83it/s][A
Iteration: 69%|βββββββ | 69/100 [00:43<00:16, 1.84it/s][A
Iteration: 70%|βββββββ | 70/100 [00:43<00:16, 1.85it/s][A
Iteration: 71%|βββββββ | 71/100 [00:44<00:15, 1.85it/s][A
Iteration: 72%|ββββββββ | 72/100 [00:44<00:15, 1.85it/s][A
Iteration: 73%|ββββββββ | 73/100 [00:45<00:14, 1.85it/s][A
Iteration: 74%|ββββββββ | 74/100 [00:46<00:13, 1.86it/s][A
Iteration: 75%|ββββββββ | 75/100 [00:46<00:13, 1.87it/s][A
Iteration: 76%|ββββββββ | 76/100 [00:47<00:12, 1.87it/s][A
Iteration: 77%|ββββββββ | 77/100 [00:47<00:12, 1.87it/s][A
Iteration: 78%|ββββββββ | 78/100 [00:48<00:11, 1.87it/s][A
Iteration: 79%|ββββββββ | 79/100 [00:48<00:11, 1.88it/s][A
Iteration: 80%|ββββββββ | 80/100 [00:49<00:10, 1.88it/s][A
Iteration: 81%|ββββββββ | 81/100 [00:49<00:10, 1.88it/s][A
Iteration: 82%|βββββββββ | 82/100 [00:50<00:09, 1.87it/s][A
Iteration: 83%|βββββββββ | 83/100 [00:50<00:09, 1.86it/s][A
Iteration: 84%|βββββββββ | 84/100 [00:51<00:08, 1.86it/s][A
Iteration: 85%|βββββββββ | 85/100 [00:51<00:08, 1.85it/s][A
Iteration: 86%|βββββββββ | 86/100 [00:52<00:07, 1.84it/s][A
Iteration: 87%|βββββββββ | 87/100 [00:53<00:07, 1.85it/s][A
Iteration: 88%|βββββββββ | 88/100 [00:53<00:06, 1.88it/s][A
Iteration: 89%|βββββββββ | 89/100 [00:54<00:05, 1.91it/s][A
Iteration: 90%|βββββββββ | 90/100 [00:54<00:05, 1.91it/s][A
Iteration: 91%|βββββββββ | 91/100 [00:55<00:04, 1.93it/s][A
Iteration: 92%|ββββββββββ| 92/100 [00:55<00:04, 1.90it/s][A
Iteration: 93%|ββββββββββ| 93/100 [00:56<00:03, 1.88it/s][A
Iteration: 94%|ββββββββββ| 94/100 [00:56<00:03, 1.87it/s][A
Iteration: 95%|ββββββββββ| 95/100 [00:57<00:02, 1.85it/s][A
Iteration: 96%|ββββββββββ| 96/100 [00:57<00:02, 1.85it/s][A
Iteration: 97%|ββββββββββ| 97/100 [00:58<00:01, 1.82it/s][A
Iteration: 98%|ββββββββββ| 98/100 [00:58<00:01, 1.81it/s][A
Iteration: 99%|ββββββββββ| 99/100 [00:59<00:00, 1.81it/s][A
Iteration: 100%|ββββββββββ| 100/100 [00:59<00:00, 1.90it/s][A
[AEpoch: 3%|β | 1/30 [00:59<28:58, 59.94s/it]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
Iteration: 1%| | 1/100 [00:00<00:54, 1.83it/s][A
Iteration: 2%|β | 2/100 [00:01<00:53, 1.84it/s][A
Iteration: 3%|β | 3/100 [00:01<00:52, 1.84it/s][A
Iteration: 4%|β | 4/100 [00:02<00:51, 1.85it/s][A
Iteration: 5%|β | 5/100 [00:02<00:51, 1.85it/s][A
Iteration: 6%|β | 6/100 [00:03<00:50, 1.86it/s][A
Iteration: 7%|β | 7/100 [00:03<00:49, 1.86it/s][A
Iteration: 8%|β | 8/100 [00:04<00:49, 1.86it/s][A
Iteration: 9%|β | 9/100 [00:04<00:48, 1.86it/s][A
Iteration: 10%|β | 10/100 [00:05<00:48, 1.85it/s][A
Iteration: 11%|β | 11/100 [00:05<00:48, 1.84it/s][A
Iteration: 12%|ββ | 12/100 [00:06<00:47, 1.84it/s][A
Iteration: 13%|ββ | 13/100 [00:07<00:47, 1.84it/s][A
Iteration: 14%|ββ | 14/100 [00:07<00:46, 1.84it/s][A
Iteration: 15%|ββ | 15/100 [00:08<00:46, 1.84it/s][A
Iteration: 16%|ββ | 16/100 [00:08<00:45, 1.84it/s][A
Iteration: 17%|ββ | 17/100 [00:09<00:44, 1.85it/s][A
Iteration: 18%|ββ | 18/100 [00:09<00:44, 1.84it/s][A
Iteration: 19%|ββ | 19/100 [00:10<00:44, 1.84it/s][A
Iteration: 20%|ββ | 20/100 [00:10<00:43, 1.84it/s][A
Iteration: 21%|ββ | 21/100 [00:11<00:42, 1.84it/s][A
Iteration: 22%|βββ | 22/100 [00:11<00:42, 1.84it/s][A
Iteration: 23%|βββ | 23/100 [00:12<00:41, 1.84it/s][A
Iteration: 24%|βββ | 24/100 [00:13<00:41, 1.83it/s][A
Iteration: 25%|βββ | 25/100 [00:13<00:40, 1.83it/s][A
Iteration: 26%|βββ | 26/100 [00:14<00:40, 1.83it/s][A
Iteration: 27%|βββ | 27/100 [00:14<00:39, 1.87it/s][A
Iteration: 28%|βββ | 28/100 [00:15<00:38, 1.89it/s][A
Iteration: 29%|βββ | 29/100 [00:15<00:37, 1.91it/s][A
Iteration: 30%|βββ | 30/100 [00:16<00:36, 1.92it/s][A
Iteration: 31%|βββ | 31/100 [00:16<00:35, 1.93it/s][A
Iteration: 32%|ββββ | 32/100 [00:17<00:35, 1.91it/s][A
Iteration: 33%|ββββ | 33/100 [00:17<00:35, 1.90it/s][A
Iteration: 34%|ββββ | 34/100 [00:18<00:35, 1.88it/s][A
Iteration: 35%|ββββ | 35/100 [00:18<00:34, 1.86it/s][A
Iteration: 36%|ββββ | 36/100 [00:19<00:34, 1.85it/s][A
Iteration: 37%|ββββ | 37/100 [00:19<00:34, 1.85it/s][A
Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
tensor(35.7190, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.5284, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.6775, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.6649, device='cuda:0', grad_fn=<MeanBackward1>) tensor(36.0738, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.6258, device='cuda:0', grad_fn=<MeanBackward1>) tensor(36.3579, device='cuda:0', grad_fn=<MeanBackward1>) tensor(36.2161, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.5443, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.5191, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.7809, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.5433, device='cuda:0', grad_fn=<MeanBackward1>) tensor(36.2103, device='cuda:0', grad_fn=<MeanBackward1>) tensor(36.0296, device='cuda:0', grad_fn=<MeanBackward1>) tensor(36.1959, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.4065, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.7888, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.0770, device='cuda:0', grad_fn=<MeanBackward1>) tensor(36.2357, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.6453, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.5829, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.5944, device='cuda:0', grad_fn=<MeanBackward1>) tensor(36.0803, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.4188, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.3667, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.4666, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.6988, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.8574, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.6057, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.1521, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.4946, device='cuda:0', grad_fn=<MeanBackward1>) tensor(36.1301, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.2759, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.5610, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.9933, 
device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.2952, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.4758, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.0467, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.4220, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.8205, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.2079, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.8988, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.1239, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.1106, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.4959, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.1998, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.4077, device='cuda:0', grad_fn=<MeanBackward1>) tensor(35.2854, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.8783, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.3283, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.4950, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.9320, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.4494, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.6740, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.3991, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.3139, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.0329, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.8319, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.2566, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.0327, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.8646, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.2083, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.8513, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.9766, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.7691, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.0155, device='cuda:0', grad_fn=<MeanBackward1>) tensor(34.1935, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.4364, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.7150, device='cuda:0', 
grad_fn=<MeanBackward1>) tensor(33.8655, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.9587, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.1649, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.5211, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.7156, device='cuda:0', grad_fn=<MeanBackward1>) tensor(32.3505, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.0286, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.1351, device='cuda:0', grad_fn=<MeanBackward1>) tensor(33.1459, device='cuda:0', grad_fn=<MeanBackward1>) tensor(32.2339, device='cuda:0', grad_fn=<MeanBackward1>) tensor(32.9354, device='cuda:0', grad_fn=<MeanBackward1>) tensor(31.9212, device='cuda:0', grad_fn=<MeanBackward1>) tensor(32.0864, device='cuda:0', grad_fn=<MeanBackward1>) tensor(31.9971, device='cuda:0', grad_fn=<MeanBackward1>) tensor(32.5682, device='cuda:0', grad_fn=<MeanBackward1>) tensor(31.3718, device='cuda:0', grad_fn=<MeanBackward1>) tensor(31.2569, device='cuda:0', grad_fn=<MeanBackward1>) tensor(31.0495, device='cuda:0', grad_fn=<MeanBackward1>) tensor(32.3168, device='cuda:0', grad_fn=<MeanBackward1>) tensor(29.0517, device='cuda:0', grad_fn=<MeanBackward1>) tensor(30.3467, device='cuda:0', grad_fn=<MeanBackward1>) tensor(30.4135, device='cuda:0', grad_fn=<MeanBackward1>) tensor(30.1993, device='cuda:0', grad_fn=<MeanBackward1>) tensor(30.8677, device='cuda:0', grad_fn=<MeanBackward1>) tensor(29.3833, device='cuda:0', grad_fn=<MeanBackward1>) tensor(28.3529, device='cuda:0', grad_fn=<MeanBackward1>) tensor(28.7499, device='cuda:0', grad_fn=<MeanBackward1>) tensor(28.7933, device='cuda:0', grad_fn=<MeanBackward1>) tensor(28.4807, device='cuda:0', grad_fn=<MeanBackward1>) tensor(28.5933, device='cuda:0', grad_fn=<MeanBackward1>) tensor(19.8771, device='cuda:0', grad_fn=<MeanBackward1>) tensor(27.2718, device='cuda:0', grad_fn=<MeanBackward1>) tensor(27.7017, device='cuda:0', grad_fn=<MeanBackward1>) tensor(28.1623, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(26.9756, device='cuda:0', grad_fn=<MeanBackward1>) tensor(28.6890, device='cuda:0', grad_fn=<MeanBackward1>) tensor(25.2464, device='cuda:0', grad_fn=<MeanBackward1>) tensor(26.9185, device='cuda:0', grad_fn=<MeanBackward1>) tensor(28.6738, device='cuda:0', grad_fn=<MeanBackward1>) tensor(25.9392, device='cuda:0', grad_fn=<MeanBackward1>) tensor(27.9710, device='cuda:0', grad_fn=<MeanBackward1>) tensor(26.0500, device='cuda:0', grad_fn=<MeanBackward1>) tensor(26.6261, device='cuda:0', grad_fn=<MeanBackward1>) tensor(26.4814, device='cuda:0', grad_fn=<MeanBackward1>) tensor(26.8821, device='cuda:0', grad_fn=<MeanBackward1>) tensor(25.4883, device='cuda:0', grad_fn=<MeanBackward1>) tensor(24.4640, device='cuda:0', grad_fn=<MeanBackward1>) tensor(23.5847, device='cuda:0', grad_fn=<MeanBackward1>) tensor(23.7877, device='cuda:0', grad_fn=<MeanBackward1>) tensor(26.0315, device='cuda:0', grad_fn=<MeanBackward1>) tensor(26.6754, device='cuda:0', grad_fn=<MeanBackward1>) tensor(27.4110, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.5436, device='cuda:0', grad_fn=<MeanBackward1>) tensor(22.5789, device='cuda:0', grad_fn=<MeanBackward1>) tensor(26.0395, device='cuda:0', grad_fn=<MeanBackward1>) tensor(26.2672, device='cuda:0', grad_fn=<MeanBackward1>) tensor(25.1768, device='cuda:0', grad_fn=<MeanBackward1>) tensor(22.8602, device='cuda:0', grad_fn=<MeanBackward1>) tensor(22.4364, device='cuda:0', grad_fn=<MeanBackward1>) tensor(24.2987, device='cuda:0', grad_fn=<MeanBackward1>) tensor(23.7889, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.5325, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.4440, device='cuda:0', grad_fn=<MeanBackward1>) tensor(23.0542, device='cuda:0', grad_fn=<MeanBackward1>) tensor(20.4385, device='cuda:0', grad_fn=<MeanBackward1>) tensor(22.1981, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.7050, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.1709, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.6080, 
device='cuda:0', grad_fn=<MeanBackward1>)
Iteration: 38%|ββββ | 38/100 [00:20<00:33, 1.85it/s][A
Iteration: 39%|ββββ | 39/100 [00:21<00:32, 1.85it/s][A
Iteration: 40%|ββββ | 40/100 [00:21<00:32, 1.85it/s][A
Iteration: 41%|ββββ | 41/100 [00:22<00:31, 1.85it/s][A
Iteration: 42%|βββββ | 42/100 [00:22<00:31, 1.85it/s][A
Iteration: 43%|βββββ | 43/100 [00:23<00:31, 1.83it/s][A
Iteration: 44%|βββββ | 44/100 [00:23<00:30, 1.83it/s][A
Iteration: 45%|βββββ | 45/100 [00:24<00:30, 1.83it/s][A
Iteration: 46%|βββββ | 46/100 [00:24<00:29, 1.82it/s][A
Iteration: 47%|βββββ | 47/100 [00:25<00:29, 1.82it/s][A
Iteration: 48%|βββββ | 48/100 [00:25<00:28, 1.82it/s][A
Iteration: 49%|βββββ | 49/100 [00:26<00:27, 1.82it/s][A
Iteration: 50%|βββββ | 50/100 [00:27<00:27, 1.81it/s][A
Iteration: 51%|βββββ | 51/100 [00:27<00:26, 1.82it/s][A
Iteration: 52%|ββββββ | 52/100 [00:28<00:26, 1.82it/s][A
Iteration: 53%|ββββββ | 53/100 [00:28<00:25, 1.83it/s][A
Iteration: 54%|ββββββ | 54/100 [00:29<00:25, 1.83it/s][A
Iteration: 55%|ββββββ | 55/100 [00:29<00:24, 1.84it/s][A
Iteration: 56%|ββββββ | 56/100 [00:30<00:23, 1.84it/s][A
Iteration: 57%|ββββββ | 57/100 [00:30<00:23, 1.85it/s][A
Iteration: 58%|ββββββ | 58/100 [00:31<00:22, 1.85it/s][A
Iteration: 59%|ββββββ | 59/100 [00:31<00:22, 1.84it/s][A
Iteration: 60%|ββββββ | 60/100 [00:32<00:21, 1.85it/s][A
Iteration: 61%|ββββββ | 61/100 [00:33<00:21, 1.84it/s][A
Iteration: 62%|βββββββ | 62/100 [00:33<00:20, 1.84it/s][A
Iteration: 63%|βββββββ | 63/100 [00:34<00:20, 1.84it/s][A
Iteration: 64%|βββββββ | 64/100 [00:34<00:19, 1.83it/s][A
Iteration: 65%|βββββββ | 65/100 [00:35<00:19, 1.84it/s][A
Iteration: 66%|βββββββ | 66/100 [00:35<00:18, 1.83it/s][A
Iteration: 67%|βββββββ | 67/100 [00:36<00:17, 1.85it/s][A
Iteration: 68%|βββββββ | 68/100 [00:36<00:17, 1.87it/s][A
Iteration: 69%|βββββββ | 69/100 [00:37<00:16, 1.87it/s][A
Iteration: 70%|βββββββ | 70/100 [00:37<00:15, 1.89it/s][A
Iteration: 71%|βββββββ | 71/100 [00:38<00:15, 1.91it/s][A
Iteration: 72%|ββββββββ | 72/100 [00:38<00:14, 1.88it/s][A
Iteration: 73%|ββββββββ | 73/100 [00:39<00:14, 1.87it/s][A
Iteration: 74%|ββββββββ | 74/100 [00:39<00:13, 1.87it/s][A
Iteration: 75%|ββββββββ | 75/100 [00:40<00:13, 1.86it/s][A
Iteration: 76%|ββββββββ | 76/100 [00:41<00:12, 1.86it/s][A
Iteration: 77%|ββββββββ | 77/100 [00:41<00:12, 1.85it/s][A
Iteration: 78%|ββββββββ | 78/100 [00:42<00:11, 1.84it/s][A
Iteration: 79%|ββββββββ | 79/100 [00:42<00:11, 1.83it/s][A
Iteration: 80%|ββββββββ | 80/100 [00:43<00:10, 1.83it/s][A
Iteration: 81%|ββββββββ | 81/100 [00:43<00:10, 1.82it/s][A
Iteration: 82%|βββββββββ | 82/100 [00:44<00:09, 1.82it/s][A
Iteration: 83%|βββββββββ | 83/100 [00:44<00:09, 1.83it/s][A
Iteration: 84%|βββββββββ | 84/100 [00:45<00:08, 1.84it/s][A
Iteration: 85%|βββββββββ | 85/100 [00:45<00:08, 1.85it/s][A
Iteration: 86%|βββββββββ | 86/100 [00:46<00:07, 1.84it/s][A
Iteration: 87%|βββββββββ | 87/100 [00:47<00:07, 1.85it/s][A
Iteration: 88%|βββββββββ | 88/100 [00:47<00:06, 1.85it/s][A
Iteration: 89%|βββββββββ | 89/100 [00:48<00:05, 1.85it/s][A
Iteration: 90%|βββββββββ | 90/100 [00:48<00:05, 1.86it/s][A
Iteration: 91%|βββββββββ | 91/100 [00:49<00:04, 1.85it/s][A
Iteration: 92%|ββββββββββ| 92/100 [00:49<00:04, 1.85it/s][A
Iteration: 93%|ββββββββββ| 93/100 [00:50<00:03, 1.85it/s][A
Iteration: 94%|ββββββββββ| 94/100 [00:50<00:03, 1.85it/s][A
Iteration: 95%|ββββββββββ| 95/100 [00:51<00:02, 1.86it/s][A
Iteration: 96%|ββββββββββ| 96/100 [00:51<00:02, 1.86it/s][A
Iteration: 97%|ββββββββββ| 97/100 [00:52<00:01, 1.86it/s][A
Iteration: 98%|ββββββββββ| 98/100 [00:53<00:01, 1.84it/s][A
Iteration: 99%|ββββββββββ| 99/100 [00:53<00:00, 1.83it/s][A
Iteration: 100%|ββββββββββ| 100/100 [00:54<00:00, 1.90it/s][A
[AEpoch: 7%|β | 2/30 [01:53<27:08, 58.17s/it]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
Iteration: 1%| | 1/100 [00:00<00:53, 1.84it/s][A
Iteration: 2%|β | 2/100 [00:01<00:53, 1.82it/s][A
Iteration: 3%|β | 3/100 [00:01<00:53, 1.83it/s][A
Iteration: 4%|β | 4/100 [00:02<00:52, 1.83it/s][A
Iteration: 5%|β | 5/100 [00:02<00:51, 1.84it/s][A
Iteration: 6%|β | 6/100 [00:03<00:51, 1.84it/s][A
Iteration: 7%|β | 7/100 [00:03<00:50, 1.85it/s][A
Iteration: 8%|β | 8/100 [00:04<00:49, 1.86it/s][A
Iteration: 9%|β | 9/100 [00:04<00:48, 1.86it/s][A
Iteration: 10%|β | 10/100 [00:05<00:48, 1.86it/s][A
Iteration: 11%|β | 11/100 [00:05<00:47, 1.86it/s][A
Iteration: 12%|ββ | 12/100 [00:06<00:47, 1.87it/s][A
Iteration: 13%|ββ | 13/100 [00:07<00:46, 1.87it/s][A
Iteration: 14%|ββ | 14/100 [00:07<00:46, 1.86it/s][A
Iteration: 15%|ββ | 15/100 [00:08<00:45, 1.86it/s][A
Iteration: 16%|ββ | 16/100 [00:08<00:45, 1.86it/s][A
Iteration: 17%|ββ | 17/100 [00:09<00:44, 1.85it/s][A
Iteration: 18%|ββ | 18/100 [00:09<00:44, 1.85it/s][A
Iteration: 19%|ββ | 19/100 [00:10<00:43, 1.85it/s][A
Iteration: 20%|ββ | 20/100 [00:10<00:43, 1.86it/s][A
Iteration: 21%|ββ | 21/100 [00:11<00:42, 1.86it/s][A
Iteration: 22%|βββ | 22/100 [00:11<00:42, 1.85it/s][A
Iteration: 23%|βββ | 23/100 [00:12<00:41, 1.85it/s][A
Iteration: 24%|βββ | 24/100 [00:12<00:41, 1.85it/s][A
Iteration: 25%|βββ | 25/100 [00:13<00:40, 1.85it/s][A
Iteration: 26%|βββ | 26/100 [00:14<00:39, 1.86it/s][A
Iteration: 27%|βββ | 27/100 [00:14<00:39, 1.86it/s][A
Iteration: 28%|βββ | 28/100 [00:15<00:38, 1.85it/s][A
Iteration: 29%|βββ | 29/100 [00:15<00:38, 1.86it/s][A
Iteration: 30%|βββ | 30/100 [00:16<00:37, 1.85it/s][A
Iteration: 31%|βββ | 31/100 [00:16<00:37, 1.84it/s][A
Iteration: 32%|ββββ | 32/100 [00:17<00:37, 1.83it/s][A
Iteration: 33%|ββββ | 33/100 [00:17<00:36, 1.83it/s][A
Iteration: 34%|ββββ | 34/100 [00:18<00:36, 1.83it/s][A
Iteration: 35%|ββββ | 35/100 [00:18<00:35, 1.83it/s][A
Iteration: 36%|ββββ | 36/100 [00:19<00:34, 1.84it/s][A
Iteration: 37%|ββββ | 37/100 [00:20<00:34, 1.84it/s][A
Iteration: 38%|ββββ | 38/100 [00:20<00:33, 1.83it/s][A
Iteration: 39%|ββββ | 39/100 [00:21<00:33, 1.83it/s][A
Iteration: 40%|ββββ | 40/100 [00:21<00:32, 1.83it/s][A
Iteration: 41%|ββββ | 41/100 [00:22<00:32, 1.83it/s][A
Iteration: 42%|βββββ | 42/100 [00:22<00:31, 1.86it/s][A
Iteration: 43%|βββββ | 43/100 [00:23<00:30, 1.88it/s][A
Iteration: 44%|βββββ | 44/100 [00:23<00:29, 1.90it/s][A
Iteration: 45%|βββββ | 45/100 [00:24<00:28, 1.91it/s][A
Iteration: 46%|βββββ | 46/100 [00:24<00:28, 1.92it/s][A
Iteration: 47%|βββββ | 47/100 [00:25<00:27, 1.90it/s][A
Iteration: 48%|βββββ | 48/100 [00:25<00:27, 1.88it/s][A
Iteration: 49%|βββββ | 49/100 [00:26<00:27, 1.87it/s][A
Iteration: 50%|βββββ | 50/100 [00:26<00:26, 1.88it/s][A
Iteration: 51%|βββββ | 51/100 [00:27<00:26, 1.87it/s][A
Iteration: 52%|ββββββ | 52/100 [00:28<00:25, 1.86it/s][A
Iteration: 53%|ββββββ | 53/100 [00:28<00:25, 1.86it/s][A
Iteration: 54%|ββββββ | 54/100 [00:29<00:24, 1.86it/s][A
Iteration: 55%|ββββββ | 55/100 [00:29<00:24, 1.86it/s][A
Iteration: 56%|ββββββ | 56/100 [00:30<00:23, 1.84it/s][A
Iteration: 57%|ββββββ | 57/100 [00:30<00:23, 1.85it/s][A
Iteration: 58%|ββββββ | 58/100 [00:31<00:22, 1.85it/s][A
Iteration: 59%|ββββββ | 59/100 [00:31<00:22, 1.86it/s][A
Iteration: 60%|ββββββ | 60/100 [00:32<00:21, 1.85it/s][A
Iteration: 61%|ββββββ | 61/100 [00:32<00:21, 1.84it/s][A
Iteration: 62%|βββββββ | 62/100 [00:33<00:20, 1.84it/s][A
Iteration: 63%|βββββββ | 63/100 [00:33<00:20, 1.84it/s][A
Iteration: 64%|βββββββ | 64/100 [00:34<00:19, 1.84it/s][A
Iteration: 65%|βββββββ | 65/100 [00:35<00:18, 1.85it/s][A
Iteration: 66%|βββββββ | 66/100 [00:35<00:18, 1.83it/s][A
Iteration: 67%|βββββββ | 67/100 [00:36<00:18, 1.83it/s][A
Iteration: 68%|βββββββ | 68/100 [00:36<00:17, 1.83it/s][A
Iteration: 69%|βββββββ | 69/100 [00:37<00:16, 1.83it/s][A
Iteration: 70%|βββββββ | 70/100 [00:37<00:16, 1.84it/s][A
Iteration: 71%|βββββββ | 71/100 [00:38<00:15, 1.84it/s][A
Iteration: 72%|ββββββββ | 72/100 [00:38<00:15, 1.86it/s][A
Iteration: 73%|ββββββββ | 73/100 [00:39<00:14, 1.86it/s][A
Iteration: 74%|ββββββββ | 74/100 [00:39<00:13, 1.86it/s][A
Iteration: 75%|ββββββββ | 75/100 [00:40<00:13, 1.87it/s][A
Iteration: 76%|ββββββββ | 76/100 [00:41<00:12, 1.86it/s][A
Iteration: 77%|ββββββββ | 77/100 [00:41<00:12, 1.85it/s][A tensor(19.5217, device='cuda:0', grad_fn=<MeanBackward1>) tensor(19.4931, device='cuda:0', grad_fn=<MeanBackward1>) tensor(25.0291, device='cuda:0', grad_fn=<MeanBackward1>) tensor(23.7079, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.1116, device='cuda:0', grad_fn=<MeanBackward1>) tensor(28.6230, device='cuda:0', grad_fn=<MeanBackward1>) tensor(20.6984, device='cuda:0', grad_fn=<MeanBackward1>) tensor(14.1518, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.5892, device='cuda:0', grad_fn=<MeanBackward1>) tensor(16.8634, device='cuda:0', grad_fn=<MeanBackward1>) tensor(22.6581, device='cuda:0', grad_fn=<MeanBackward1>) tensor(17.4013, device='cuda:0', grad_fn=<MeanBackward1>) tensor(19.9627, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.4767, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.8233, device='cuda:0', grad_fn=<MeanBackward1>) tensor(16.7135, device='cuda:0', grad_fn=<MeanBackward1>) tensor(19.6345, device='cuda:0', grad_fn=<MeanBackward1>) tensor(22.5731, device='cuda:0', grad_fn=<MeanBackward1>) tensor(20.7074, device='cuda:0', grad_fn=<MeanBackward1>) tensor(21.9474, device='cuda:0', grad_fn=<MeanBackward1>) tensor(17.1525, device='cuda:0', grad_fn=<MeanBackward1>) tensor(18.7872, device='cuda:0', grad_fn=<MeanBackward1>) tensor(19.1300, device='cuda:0', grad_fn=<MeanBackward1>) tensor(18.2275, device='cuda:0', grad_fn=<MeanBackward1>) tensor(15.7096, device='cuda:0', grad_fn=<MeanBackward1>) tensor(19.0289, device='cuda:0', grad_fn=<MeanBackward1>) tensor(16.3454, device='cuda:0', grad_fn=<MeanBackward1>) tensor(16.1548, device='cuda:0', grad_fn=<MeanBackward1>) tensor(18.1166, device='cuda:0', grad_fn=<MeanBackward1>) tensor(22.0522, device='cuda:0', grad_fn=<MeanBackward1>) tensor(19.4081, device='cuda:0', grad_fn=<MeanBackward1>) tensor(19.6903, device='cuda:0', grad_fn=<MeanBackward1>) tensor(19.2761, device='cuda:0', grad_fn=<MeanBackward1>) tensor(14.6820, 
device='cuda:0', grad_fn=<MeanBackward1>) tensor(20.4354, device='cuda:0', grad_fn=<MeanBackward1>) tensor(18.7604, device='cuda:0', grad_fn=<MeanBackward1>) tensor(17.6286, device='cuda:0', grad_fn=<MeanBackward1>) tensor(18.7149, device='cuda:0', grad_fn=<MeanBackward1>) tensor(22.4134, device='cuda:0', grad_fn=<MeanBackward1>) tensor(17.6916, device='cuda:0', grad_fn=<MeanBackward1>) tensor(17.4945, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.7009, device='cuda:0', grad_fn=<MeanBackward1>) tensor(18.8083, device='cuda:0', grad_fn=<MeanBackward1>) tensor(13.9626, device='cuda:0', grad_fn=<MeanBackward1>) tensor(16.2995, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.6117, device='cuda:0', grad_fn=<MeanBackward1>) tensor(14.8884, device='cuda:0', grad_fn=<MeanBackward1>) tensor(13.5581, device='cuda:0', grad_fn=<MeanBackward1>) tensor(13.1327, device='cuda:0', grad_fn=<MeanBackward1>) tensor(15.0537, device='cuda:0', grad_fn=<MeanBackward1>) tensor(10.8806, device='cuda:0', grad_fn=<MeanBackward1>) tensor(14.3943, device='cuda:0', grad_fn=<MeanBackward1>) tensor(16.5873, device='cuda:0', grad_fn=<MeanBackward1>) tensor(16.0849, device='cuda:0', grad_fn=<MeanBackward1>) tensor(17.2781, device='cuda:0', grad_fn=<MeanBackward1>) tensor(10.7485, device='cuda:0', grad_fn=<MeanBackward1>) tensor(16.5903, device='cuda:0', grad_fn=<MeanBackward1>) tensor(20.1337, device='cuda:0', grad_fn=<MeanBackward1>) tensor(13.2532, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.1556, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.8779, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.5918, device='cuda:0', grad_fn=<MeanBackward1>) tensor(14.2135, device='cuda:0', grad_fn=<MeanBackward1>) tensor(10.0630, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.3797, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.6937, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.2402, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.3532, device='cuda:0', 
grad_fn=<MeanBackward1>) tensor(14.2157, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.9583, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.8444, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.4632, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.9739, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.2960, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.2640, device='cuda:0', grad_fn=<MeanBackward1>) tensor(15.9773, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.6414, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.3632, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.8825, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.3027, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.9856, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.9571, device='cuda:0', grad_fn=<MeanBackward1>) tensor(10.6716, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.5309, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.9184, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.0655, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.9719, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.8102, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.1327, device='cuda:0', grad_fn=<MeanBackward1>) tensor(10.6970, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.4478, device='cuda:0', grad_fn=<MeanBackward1>) tensor(13.9610, device='cuda:0', grad_fn=<MeanBackward1>) tensor(10.3259, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.7051, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.1586, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.1950, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.5813, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.8647, device='cuda:0', grad_fn=<MeanBackward1>) tensor(17.3659, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.3702, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.3091, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.6662, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.2644, 
device='cuda:0', grad_fn=<MeanBackward1>) tensor(13.6302, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.4936, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.3281, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.7969, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.7796, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.5798, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.2386, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.3152, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.0419, device='cuda:0', grad_fn=<MeanBackward1>) tensor(13.3992, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.6593, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.6577, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.5716, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.2528, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.1090, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.2988, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.8447, device='cuda:0', grad_fn=<MeanBackward1>) tensor(12.5544, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.2986, device='cuda:0', grad_fn=<MeanBackward1>) tensor(11.6539, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.0875, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.0189, device='cuda:0', grad_fn=<MeanBackward1>) tensor(13.9192, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.7054, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.6291, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.6385, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.2710, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.9630, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.3582, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.1131, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.4887, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.5389, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.1681, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.3563, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(3.8153, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.2369, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.6461, device='cuda:0', grad_fn=<MeanBackward1>)
Iteration: 78%|ββββββββ | 78/100 [00:42<00:11, 1.84it/s][A
Iteration: 79%|ββββββββ | 79/100 [00:42<00:11, 1.84it/s][A
Iteration: 80%|ββββββββ | 80/100 [00:43<00:10, 1.84it/s][A
Iteration: 81%|ββββββββ | 81/100 [00:43<00:10, 1.84it/s][A
Iteration: 82%|βββββββββ | 82/100 [00:44<00:09, 1.87it/s][A
Iteration: 83%|βββββββββ | 83/100 [00:44<00:08, 1.89it/s][A
Iteration: 84%|βββββββββ | 84/100 [00:45<00:08, 1.91it/s][A
Iteration: 85%|βββββββββ | 85/100 [00:45<00:07, 1.92it/s][A
Iteration: 86%|βββββββββ | 86/100 [00:46<00:07, 1.93it/s][A
Iteration: 87%|βββββββββ | 87/100 [00:46<00:06, 1.90it/s][A
Iteration: 88%|βββββββββ | 88/100 [00:47<00:06, 1.89it/s][A
Iteration: 89%|βββββββββ | 89/100 [00:47<00:05, 1.87it/s][A
Iteration: 90%|βββββββββ | 90/100 [00:48<00:05, 1.85it/s][A
Iteration: 91%|βββββββββ | 91/100 [00:49<00:04, 1.85it/s][A
Iteration: 92%|ββββββββββ| 92/100 [00:49<00:04, 1.85it/s][A
Iteration: 93%|ββββββββββ| 93/100 [00:50<00:03, 1.83it/s][A
Iteration: 94%|ββββββββββ| 94/100 [00:50<00:03, 1.84it/s][A
Iteration: 95%|ββββββββββ| 95/100 [00:51<00:02, 1.83it/s][A
Iteration: 96%|ββββββββββ| 96/100 [00:51<00:02, 1.83it/s][A
Iteration: 97%|ββββββββββ| 97/100 [00:52<00:01, 1.81it/s][A
Iteration: 98%|ββββββββββ| 98/100 [00:52<00:01, 1.81it/s][A
Iteration: 99%|ββββββββββ| 99/100 [00:53<00:00, 1.82it/s][A
Iteration: 100%|ββββββββββ| 100/100 [00:53<00:00, 1.91it/s][A
Epoch: 10%|β | 3/30 [02:47<25:36, 56.89s/it]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
Iteration: 1%| | 1/100 [00:00<00:54, 1.83it/s][A
Iteration: 2%|β | 2/100 [00:01<00:53, 1.84it/s][A
Iteration: 3%|β | 3/100 [00:01<00:52, 1.85it/s][A
Iteration: 4%|β | 4/100 [00:02<00:51, 1.85it/s][A
Iteration: 5%|β | 5/100 [00:02<00:51, 1.85it/s][A
Iteration: 6%|β | 6/100 [00:03<00:51, 1.84it/s][A
Iteration: 7%|β | 7/100 [00:03<00:50, 1.84it/s][A
Iteration: 8%|β | 8/100 [00:04<00:50, 1.83it/s][A
Iteration: 9%|β | 9/100 [00:04<00:49, 1.84it/s][A
Iteration: 10%|β | 10/100 [00:05<00:49, 1.83it/s][A
Iteration: 11%|β | 11/100 [00:05<00:48, 1.83it/s][A
Iteration: 12%|ββ | 12/100 [00:06<00:48, 1.83it/s][A
Iteration: 13%|ββ | 13/100 [00:07<00:47, 1.83it/s][A
Iteration: 14%|ββ | 14/100 [00:07<00:46, 1.84it/s][A
Iteration: 15%|ββ | 15/100 [00:08<00:46, 1.84it/s][A
Iteration: 16%|ββ | 16/100 [00:08<00:45, 1.83it/s][A
Iteration: 17%|ββ | 17/100 [00:09<00:45, 1.83it/s][A
Iteration: 18%|ββ | 18/100 [00:09<00:44, 1.83it/s][A
Iteration: 19%|ββ | 19/100 [00:10<00:44, 1.82it/s][A
Iteration: 20%|ββ | 20/100 [00:10<00:44, 1.81it/s][A
Iteration: 21%|ββ | 21/100 [00:11<00:43, 1.81it/s][A
Iteration: 22%|βββ | 22/100 [00:11<00:42, 1.85it/s][A
Iteration: 23%|βββ | 23/100 [00:12<00:41, 1.88it/s][A
Iteration: 24%|βββ | 24/100 [00:13<00:40, 1.90it/s][A
Iteration: 25%|βββ | 25/100 [00:13<00:39, 1.91it/s][A
Iteration: 26%|βββ | 26/100 [00:14<00:38, 1.92it/s][A
Iteration: 27%|βββ | 27/100 [00:14<00:38, 1.89it/s][A
Iteration: 28%|βββ | 28/100 [00:15<00:38, 1.88it/s][A
Iteration: 29%|βββ | 29/100 [00:15<00:37, 1.88it/s][A
Iteration: 30%|βββ | 30/100 [00:16<00:37, 1.87it/s][A
Iteration: 31%|βββ | 31/100 [00:16<00:37, 1.85it/s][A
Iteration: 32%|ββββ | 32/100 [00:17<00:36, 1.86it/s][A
Iteration: 33%|ββββ | 33/100 [00:17<00:36, 1.85it/s][A
Iteration: 34%|ββββ | 34/100 [00:18<00:35, 1.84it/s][A
Iteration: 35%|ββββ | 35/100 [00:18<00:35, 1.84it/s][A
Iteration: 36%|ββββ | 36/100 [00:19<00:34, 1.83it/s][A
Iteration: 37%|ββββ | 37/100 [00:20<00:34, 1.83it/s][A
Iteration: 38%|ββββ | 38/100 [00:20<00:33, 1.83it/s][A
Iteration: 39%|ββββ | 39/100 [00:21<00:33, 1.81it/s][A
Iteration: 40%|ββββ | 40/100 [00:21<00:32, 1.82it/s][A
Iteration: 41%|ββββ | 41/100 [00:22<00:32, 1.82it/s][A
Iteration: 42%|βββββ | 42/100 [00:22<00:31, 1.83it/s][A
Iteration: 43%|βββββ | 43/100 [00:23<00:31, 1.82it/s][A
Iteration: 44%|βββββ | 44/100 [00:23<00:30, 1.82it/s][A
Iteration: 45%|βββββ | 45/100 [00:24<00:30, 1.82it/s][A
Iteration: 46%|βββββ | 46/100 [00:24<00:29, 1.83it/s][A
Iteration: 47%|βββββ | 47/100 [00:25<00:28, 1.83it/s][A
Iteration: 48%|βββββ | 48/100 [00:26<00:28, 1.83it/s][A
Iteration: 49%|βββββ | 49/100 [00:26<00:27, 1.83it/s][A
Iteration: 50%|βββββ | 50/100 [00:27<00:27, 1.83it/s][A
Iteration: 51%|βββββ | 51/100 [00:27<00:26, 1.84it/s][A
Iteration: 52%|ββββββ | 52/100 [00:28<00:25, 1.85it/s][A
Iteration: 53%|ββββββ | 53/100 [00:28<00:25, 1.84it/s][A
Iteration: 54%|ββββββ | 54/100 [00:29<00:25, 1.83it/s][A
Iteration: 55%|ββββββ | 55/100 [00:29<00:24, 1.83it/s][A
Iteration: 56%|ββββββ | 56/100 [00:30<00:24, 1.83it/s][A
Iteration: 57%|ββββββ | 57/100 [00:30<00:23, 1.83it/s][A
Iteration: 58%|ββββββ | 58/100 [00:31<00:22, 1.83it/s][A
Iteration: 59%|ββββββ | 59/100 [00:32<00:22, 1.83it/s][A
Iteration: 60%|ββββββ | 60/100 [00:32<00:21, 1.83it/s][A
Iteration: 61%|ββββββ | 61/100 [00:33<00:21, 1.82it/s][A
Iteration: 62%|βββββββ | 62/100 [00:33<00:20, 1.83it/s][A
Iteration: 63%|βββββββ | 63/100 [00:34<00:20, 1.82it/s][A
Iteration: 64%|βββββββ | 64/100 [00:34<00:19, 1.83it/s][A
Iteration: 65%|βββββββ | 65/100 [00:35<00:19, 1.83it/s][A
Iteration: 66%|βββββββ | 66/100 [00:35<00:18, 1.84it/s][A
Iteration: 67%|βββββββ | 67/100 [00:36<00:17, 1.83it/s][A
Iteration: 68%|βββββββ | 68/100 [00:36<00:17, 1.85it/s][A
Iteration: 69%|βββββββ | 69/100 [00:37<00:16, 1.85it/s][A
Iteration: 70%|βββββββ | 70/100 [00:38<00:16, 1.85it/s][A
Iteration: 71%|βββββββ | 71/100 [00:38<00:15, 1.84it/s][A
Iteration: 72%|ββββββββ | 72/100 [00:39<00:15, 1.84it/s][A
Iteration: 73%|ββββββββ | 73/100 [00:39<00:14, 1.83it/s][A
Iteration: 74%|ββββββββ | 74/100 [00:40<00:14, 1.83it/s][A
Iteration: 75%|ββββββββ | 75/100 [00:40<00:13, 1.83it/s][A
Iteration: 76%|ββββββββ | 76/100 [00:41<00:13, 1.83it/s][A
Iteration: 77%|ββββββββ | 77/100 [00:41<00:12, 1.84it/s][A
Iteration: 78%|ββββββββ | 78/100 [00:42<00:11, 1.84it/s][A
Iteration: 79%|ββββββββ | 79/100 [00:42<00:11, 1.83it/s][A
Iteration: 80%|ββββββββ | 80/100 [00:43<00:10, 1.84it/s][A
Iteration: 81%|ββββββββ | 81/100 [00:44<00:10, 1.84it/s][A
Iteration: 82%|βββββββββ | 82/100 [00:44<00:09, 1.84it/s][A
Iteration: 83%|βββββββββ | 83/100 [00:45<00:09, 1.84it/s][A
Iteration: 84%|βββββββββ | 84/100 [00:45<00:08, 1.83it/s][A
Iteration: 85%|βββββββββ | 85/100 [00:46<00:08, 1.81it/s][A
Iteration: 86%|βββββββββ | 86/100 [00:46<00:07, 1.79it/s][A
Iteration: 87%|βββββββββ | 87/100 [00:47<00:07, 1.80it/s][A
Iteration: 88%|βββββββββ | 88/100 [00:47<00:06, 1.81it/s][A
Iteration: 89%|βββββββββ | 89/100 [00:48<00:06, 1.82it/s][A
Iteration: 90%|βββββββββ | 90/100 [00:48<00:05, 1.82it/s][A
Iteration: 91%|βββββββββ | 91/100 [00:49<00:04, 1.81it/s][A
Iteration: 92%|ββββββββββ| 92/100 [00:50<00:04, 1.82it/s][A
Iteration: 93%|ββββββββββ| 93/100 [00:50<00:03, 1.82it/s][A
Iteration: 94%|ββββββββββ| 94/100 [00:51<00:03, 1.82it/s][A
Iteration: 95%|ββββββββββ| 95/100 [00:51<00:02, 1.83it/s][A
Iteration: 96%|ββββββββββ| 96/100 [00:52<00:02, 1.84it/s][A
Iteration: 97%|ββββββββββ| 97/100 [00:52<00:01, 1.88it/s][A
Iteration: 98%|ββββββββββ| 98/100 [00:53<00:01, 1.90it/s][A
Iteration: 99%|ββββββββββ| 99/100 [00:53<00:00, 1.91it/s][A
Iteration: 100%|ββββββββββ| 100/100 [00:54<00:00, 2.01it/s][A
Epoch: 13%|ββ | 4/30 [03:42<24:18, 56.10s/it]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
Iteration: 1%| | 1/100 [00:00<00:51, 1.94it/s][A
Iteration: 2%|β | 2/100 [00:01<00:51, 1.90it/s][A
Iteration: 3%|β | 3/100 [00:01<00:51, 1.87it/s][A
Iteration: 4%|β | 4/100 [00:02<00:51, 1.86it/s][A
Iteration: 5%|β | 5/100 [00:02<00:51, 1.85it/s][A
Iteration: 6%|β | 6/100 [00:03<00:50, 1.85it/s][A
Iteration: 7%|β | 7/100 [00:03<00:50, 1.85it/s][A
Iteration: 8%|β | 8/100 [00:04<00:49, 1.86it/s][A
Iteration: 9%|β | 9/100 [00:04<00:48, 1.86it/s][A
Iteration: 10%|β | 10/100 [00:05<00:48, 1.85it/s][A
Iteration: 11%|β | 11/100 [00:05<00:48, 1.84it/s][A
Iteration: 12%|ββ | 12/100 [00:06<00:47, 1.83it/s][A
Iteration: 13%|ββ | 13/100 [00:07<00:47, 1.83it/s][A
Iteration: 14%|ββ | 14/100 [00:07<00:46, 1.83it/s][A
Iteration: 15%|ββ | 15/100 [00:08<00:46, 1.84it/s][A
Iteration: 16%|ββ | 16/100 [00:08<00:45, 1.84it/s][A
Iteration: 17%|ββ | 17/100 [00:09<00:45, 1.84it/s][A
Iteration: 18%|ββ | 18/100 [00:09<00:44, 1.83it/s][A
Iteration: 19%|ββ | 19/100 [00:10<00:44, 1.82it/s][A tensor(9.6170, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.2795, device='cuda:0', grad_fn=<MeanBackward1>) tensor(10.3238, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.1224, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.7240, device='cuda:0', grad_fn=<MeanBackward1>) tensor(10.1101, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.9173, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.3745, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.4841, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.4282, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.4338, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.0449, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.6309, device='cuda:0', grad_fn=<MeanBackward1>) tensor(10.4466, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.2646, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.0426, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.9309, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.2764, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.3907, device='cuda:0', grad_fn=<MeanBackward1>) tensor(14.8181, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.4094, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4657, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.1746, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.1144, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7568, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.5564, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.0714, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.3006, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.3838, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.7347, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8105, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.8774, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.4126, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.8704, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(3.6489, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.5497, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.9307, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.0221, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.3174, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8198, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8656, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5784, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.1578, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.8998, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5024, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1204, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.4208, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0420, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.2843, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.7830, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.7764, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.7746, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.6879, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.4311, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5775, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.4169, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.2333, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.6881, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.6291, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.3746, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.0429, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2904, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.0441, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2340, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.8735, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.4843, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.1458, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.5512, device='cuda:0', grad_fn=<MeanBackward1>) tensor(9.6382, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(3.3505, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.9071, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.4752, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.6990, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.4931, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.9806, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.0400, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.9883, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.2144, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.9120, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.2387, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9885, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.4355, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.4066, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.2418, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.3946, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.7523, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.5369, device='cuda:0', grad_fn=<MeanBackward1>) tensor(8.2282, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.8749, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.5602, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.7205, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.2373, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.4980, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.8264, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.9793, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.1585, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.3616, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.8725, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.3016, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.4284, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.5315, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.9057, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7600, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9048, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(7.0392, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.4467, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.8938, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.3599, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.3952, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5287, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.5668, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.5549, device='cuda:0', grad_fn=<MeanBackward1>) tensor(7.7698, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.0979, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.0863, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.7086, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5307, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.5706, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.8031, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.0357, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.9941, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1504, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9705, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8705, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.4674, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5110, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.2759, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.3212, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.4925, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3221, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.4245, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6410, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7257, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9895, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.8878, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.7267, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7761, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9041, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9714, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(0.8723, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5012, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7350, device='cuda:0', grad_fn=<MeanBackward1>)
Iteration: 20%|ββ | 20/100 [00:10<00:44, 1.82it/s][A
Iteration: 21%|ββ | 21/100 [00:11<00:43, 1.83it/s][A
Iteration: 22%|βββ | 22/100 [00:11<00:42, 1.82it/s][A
Iteration: 23%|βββ | 23/100 [00:12<00:42, 1.82it/s][A
Iteration: 24%|βββ | 24/100 [00:13<00:41, 1.81it/s][A
Iteration: 25%|βββ | 25/100 [00:13<00:41, 1.82it/s][A
Iteration: 26%|βββ | 26/100 [00:14<00:40, 1.82it/s][A
Iteration: 27%|βββ | 27/100 [00:14<00:39, 1.83it/s][A
Iteration: 28%|βββ | 28/100 [00:15<00:39, 1.84it/s][A
Iteration: 29%|βββ | 29/100 [00:15<00:38, 1.84it/s][A
Iteration: 30%|βββ | 30/100 [00:16<00:38, 1.84it/s][A
Iteration: 31%|βββ | 31/100 [00:16<00:37, 1.84it/s][A
Iteration: 32%|ββββ | 32/100 [00:17<00:37, 1.83it/s][A
Iteration: 33%|ββββ | 33/100 [00:17<00:36, 1.83it/s][A
Iteration: 34%|ββββ | 34/100 [00:18<00:36, 1.83it/s][A
Iteration: 35%|ββββ | 35/100 [00:19<00:35, 1.84it/s][A
Iteration: 36%|ββββ | 36/100 [00:19<00:35, 1.83it/s][A
Iteration: 37%|ββββ | 37/100 [00:20<00:34, 1.83it/s][A
Iteration: 38%|ββββ | 38/100 [00:20<00:33, 1.83it/s][A
Iteration: 39%|ββββ | 39/100 [00:21<00:33, 1.83it/s][A
Iteration: 40%|ββββ | 40/100 [00:21<00:32, 1.82it/s][A
Iteration: 41%|ββββ | 41/100 [00:22<00:32, 1.81it/s][A
Iteration: 42%|βββββ | 42/100 [00:22<00:31, 1.82it/s][A
Iteration: 43%|βββββ | 43/100 [00:23<00:31, 1.82it/s][A
Iteration: 44%|βββββ | 44/100 [00:24<00:30, 1.82it/s][A
Iteration: 45%|βββββ | 45/100 [00:24<00:30, 1.82it/s][A
Iteration: 46%|βββββ | 46/100 [00:25<00:29, 1.83it/s][A
Iteration: 47%|βββββ | 47/100 [00:25<00:28, 1.83it/s][A
Iteration: 48%|βββββ | 48/100 [00:26<00:28, 1.84it/s][A
Iteration: 49%|βββββ | 49/100 [00:26<00:27, 1.84it/s][A
Iteration: 50%|βββββ | 50/100 [00:27<00:27, 1.85it/s][A
Iteration: 51%|βββββ | 51/100 [00:27<00:26, 1.85it/s][A
Iteration: 52%|ββββββ | 52/100 [00:28<00:25, 1.85it/s][A
Iteration: 53%|ββββββ | 53/100 [00:28<00:25, 1.86it/s][A
Iteration: 54%|ββββββ | 54/100 [00:29<00:24, 1.85it/s][A
Iteration: 55%|ββββββ | 55/100 [00:29<00:24, 1.84it/s][A
Iteration: 56%|ββββββ | 56/100 [00:30<00:23, 1.85it/s][A
Iteration: 57%|ββββββ | 57/100 [00:31<00:23, 1.86it/s][A
Iteration: 58%|ββββββ | 58/100 [00:31<00:22, 1.86it/s][A
Iteration: 59%|ββββββ | 59/100 [00:32<00:21, 1.86it/s][A
Iteration: 60%|ββββββ | 60/100 [00:32<00:21, 1.86it/s][A
Iteration: 61%|ββββββ | 61/100 [00:33<00:21, 1.84it/s][A
Iteration: 62%|βββββββ | 62/100 [00:33<00:20, 1.84it/s][A
Iteration: 63%|βββββββ | 63/100 [00:34<00:20, 1.83it/s][A
Iteration: 64%|βββββββ | 64/100 [00:34<00:19, 1.83it/s][A
Iteration: 65%|βββββββ | 65/100 [00:35<00:19, 1.81it/s][A
Iteration: 66%|βββββββ | 66/100 [00:35<00:18, 1.81it/s][A
Iteration: 67%|βββββββ | 67/100 [00:36<00:18, 1.81it/s][A
Iteration: 68%|βββββββ | 68/100 [00:37<00:17, 1.81it/s][A
Iteration: 69%|βββββββ | 69/100 [00:37<00:17, 1.79it/s][A
Iteration: 70%|βββββββ | 70/100 [00:38<00:16, 1.78it/s][A
Iteration: 71%|βββββββ | 71/100 [00:38<00:16, 1.78it/s][A
Iteration: 72%|ββββββββ | 72/100 [00:39<00:15, 1.78it/s][A
Iteration: 73%|ββββββββ | 73/100 [00:39<00:15, 1.80it/s][A
Iteration: 74%|ββββββββ | 74/100 [00:40<00:14, 1.81it/s][A
Iteration: 75%|ββββββββ | 75/100 [00:40<00:13, 1.82it/s][A
Iteration: 76%|ββββββββ | 76/100 [00:41<00:13, 1.83it/s][A
Iteration: 77%|ββββββββ | 77/100 [00:42<00:12, 1.84it/s][A
Iteration: 78%|ββββββββ | 78/100 [00:42<00:11, 1.85it/s][A
Iteration: 79%|ββββββββ | 79/100 [00:43<00:11, 1.85it/s][A
Iteration: 80%|ββββββββ | 80/100 [00:43<00:10, 1.84it/s][A
Iteration: 81%|ββββββββ | 81/100 [00:44<00:10, 1.84it/s][A
Iteration: 82%|βββββββββ | 82/100 [00:44<00:09, 1.84it/s][A
Iteration: 83%|βββββββββ | 83/100 [00:45<00:09, 1.84it/s][A
Iteration: 84%|βββββββββ | 84/100 [00:45<00:08, 1.82it/s][A
Iteration: 85%|βββββββββ | 85/100 [00:46<00:08, 1.82it/s][A
Iteration: 86%|βββββββββ | 86/100 [00:46<00:07, 1.81it/s][A
Iteration: 87%|βββββββββ | 87/100 [00:47<00:07, 1.81it/s][A
Iteration: 88%|βββββββββ | 88/100 [00:48<00:06, 1.82it/s][A
Iteration: 89%|βββββββββ | 89/100 [00:48<00:06, 1.82it/s][A
Iteration: 90%|βββββββββ | 90/100 [00:49<00:05, 1.82it/s][A
Iteration: 91%|βββββββββ | 91/100 [00:49<00:04, 1.82it/s][A
Iteration: 92%|ββββββββββ| 92/100 [00:50<00:04, 1.81it/s][A
Iteration: 93%|ββββββββββ| 93/100 [00:50<00:03, 1.81it/s][A
Iteration: 94%|ββββββββββ| 94/100 [00:51<00:03, 1.81it/s][A
Iteration: 95%|ββββββββββ| 95/100 [00:51<00:02, 1.83it/s][A
Iteration: 96%|ββββββββββ| 96/100 [00:52<00:02, 1.83it/s][A
Iteration: 97%|ββββββββββ| 97/100 [00:53<00:01, 1.82it/s][A
Iteration: 98%|ββββββββββ| 98/100 [00:53<00:01, 1.82it/s][A
Iteration: 99%|ββββββββββ| 99/100 [00:54<00:00, 1.82it/s][A
Iteration: 100%|ββββββββββ| 100/100 [00:54<00:00, 1.90it/s][A
Epoch: 17%|ββ | 5/30 [04:36<23:11, 55.65s/it]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
Iteration: 1%| | 1/100 [00:00<00:54, 1.82it/s][A
Iteration: 2%|β | 2/100 [00:01<00:53, 1.82it/s][A
Iteration: 3%|β | 3/100 [00:01<00:53, 1.82it/s][A
Iteration: 4%|β | 4/100 [00:02<00:52, 1.82it/s][A
Iteration: 5%|β | 5/100 [00:02<00:52, 1.82it/s][A
Iteration: 6%|β | 6/100 [00:03<00:51, 1.81it/s][A
Iteration: 7%|β | 7/100 [00:03<00:50, 1.84it/s][A
Iteration: 8%|β | 8/100 [00:04<00:49, 1.86it/s][A
Iteration: 9%|β | 9/100 [00:04<00:48, 1.87it/s][A
Iteration: 10%|β | 10/100 [00:05<00:47, 1.90it/s][A
Iteration: 11%|β | 11/100 [00:05<00:46, 1.91it/s][A
Iteration: 12%|ββ | 12/100 [00:06<00:46, 1.88it/s][A
Iteration: 13%|ββ | 13/100 [00:07<00:47, 1.85it/s][A
Iteration: 14%|ββ | 14/100 [00:07<00:46, 1.83it/s][A
Iteration: 15%|ββ | 15/100 [00:08<00:46, 1.83it/s][A
Iteration: 16%|ββ | 16/100 [00:08<00:45, 1.84it/s][A
Iteration: 17%|ββ | 17/100 [00:09<00:45, 1.84it/s][A
Iteration: 18%|ββ | 18/100 [00:09<00:44, 1.83it/s][A
Iteration: 19%|ββ | 19/100 [00:10<00:44, 1.83it/s][A
Iteration: 20%|ββ | 20/100 [00:10<00:44, 1.82it/s][A
Iteration: 21%|ββ | 21/100 [00:11<00:43, 1.82it/s][A
Iteration: 22%|βββ | 22/100 [00:11<00:42, 1.83it/s][A
Iteration: 23%|βββ | 23/100 [00:12<00:41, 1.85it/s][A
Iteration: 24%|βββ | 24/100 [00:13<00:40, 1.86it/s][A
Iteration: 25%|βββ | 25/100 [00:13<00:40, 1.86it/s][A
Iteration: 26%|βββ | 26/100 [00:14<00:39, 1.86it/s][A
Iteration: 27%|βββ | 27/100 [00:14<00:39, 1.85it/s][A
Iteration: 28%|βββ | 28/100 [00:15<00:39, 1.83it/s][A
Iteration: 29%|βββ | 29/100 [00:15<00:38, 1.83it/s][A
Iteration: 30%|βββ | 30/100 [00:16<00:38, 1.81it/s][A
Iteration: 31%|βββ | 31/100 [00:16<00:37, 1.83it/s][A
Iteration: 32%|ββββ | 32/100 [00:17<00:37, 1.83it/s][A
Iteration: 33%|ββββ | 33/100 [00:17<00:36, 1.83it/s][A
Iteration: 34%|ββββ | 34/100 [00:18<00:36, 1.83it/s][A
Iteration: 35%|ββββ | 35/100 [00:19<00:35, 1.84it/s][A
Iteration: 36%|ββββ | 36/100 [00:19<00:34, 1.85it/s][A
Iteration: 37%|ββββ | 37/100 [00:20<00:33, 1.85it/s][A
Iteration: 38%|ββββ | 38/100 [00:20<00:33, 1.84it/s][A
Iteration: 39%|ββββ | 39/100 [00:21<00:33, 1.83it/s][A
Iteration: 40%|ββββ | 40/100 [00:21<00:33, 1.81it/s][A
Iteration: 41%|ββββ | 41/100 [00:22<00:32, 1.80it/s][A
Iteration: 42%|βββββ | 42/100 [00:22<00:32, 1.81it/s][A
Iteration: 43%|βββββ | 43/100 [00:23<00:31, 1.81it/s][A
Iteration: 44%|βββββ | 44/100 [00:23<00:30, 1.82it/s][A
Iteration: 45%|βββββ | 45/100 [00:24<00:30, 1.82it/s][A
Iteration: 46%|βββββ | 46/100 [00:25<00:29, 1.82it/s][A
Iteration: 47%|βββββ | 47/100 [00:25<00:28, 1.85it/s][A
Iteration: 48%|βββββ | 48/100 [00:26<00:27, 1.88it/s][A
Iteration: 49%|βββββ | 49/100 [00:26<00:26, 1.90it/s][A
Iteration: 50%|βββββ | 50/100 [00:27<00:26, 1.90it/s][A
Iteration: 51%|βββββ | 51/100 [00:27<00:25, 1.91it/s][A
Iteration: 52%|ββββββ | 52/100 [00:28<00:25, 1.88it/s][A
Iteration: 53%|ββββββ | 53/100 [00:28<00:25, 1.87it/s][A
Iteration: 54%|ββββββ | 54/100 [00:29<00:24, 1.85it/s][A
Iteration: 55%|ββββββ | 55/100 [00:29<00:24, 1.85it/s][A
Iteration: 56%|ββββββ | 56/100 [00:30<00:23, 1.85it/s][A
Iteration: 57%|ββββββ | 57/100 [00:30<00:23, 1.84it/s][A
Iteration: 58%|ββββββ | 58/100 [00:31<00:22, 1.83it/s][A
Iteration: 59%|ββββββ | 59/100 [00:32<00:22, 1.83it/s][A
Iteration: 60%|ββββββ | 60/100 [00:32<00:21, 1.82it/s][A
Iteration: 61%|ββββββ | 61/100 [00:33<00:21, 1.81it/s][A tensor(1.1081, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8177, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5648, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.8934, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.6917, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4977, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.4188, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.5360, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2302, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6974, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.9631, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.1517, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.0394, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0805, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.2111, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.0659, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1721, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.3797, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7653, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9124, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9455, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.0043, device='cuda:0', grad_fn=<MeanBackward1>) tensor(6.6603, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.7757, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.9480, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9029, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8141, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9449, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.3505, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8941, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7033, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.7577, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.7977, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6812, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(2.6969, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.1521, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.1447, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.7582, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1282, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.0421, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.0774, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.5804, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.8813, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.4468, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4122, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.5550, device='cuda:0', grad_fn=<MeanBackward1>) tensor(5.1962, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.9861, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9668, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1332, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9180, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.6219, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.5055, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8967, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.2863, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.5136, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1325, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.8104, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0977, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7893, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.0940, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.9422, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1497, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.9089, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4366, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.7261, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9153, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.2642, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2475, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(2.5557, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.1212, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2402, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5081, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8284, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1039, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.1556, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.0285, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5621, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.5597, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0610, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.3153, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0219, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1338, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.4598, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.3830, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.5409, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2138, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5985, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3845, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9587, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6141, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.3082, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4773, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1898, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.4770, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9666, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8754, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4914, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6849, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5988, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3344, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3399, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.1651, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7354, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(2.3303, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4454, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5658, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2901, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8821, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8476, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5372, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1198, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9477, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5039, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4734, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.1818, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6576, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2035, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4744, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2483, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2178, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5691, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8304, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1605, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5521, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.8657, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.1824, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5191, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8885, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.5927, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3377, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5256, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8835, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.6478, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.2770, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4553, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6152, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5821, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3016, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(0.3521, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7716, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8367, device='cuda:0', grad_fn=<MeanBackward1>)
Iteration: 62%|βββββββ | 62/100 [00:33<00:21, 1.81it/s][A
Iteration: 63%|βββββββ | 63/100 [00:34<00:20, 1.80it/s][A
Iteration: 64%|βββββββ | 64/100 [00:34<00:19, 1.80it/s][A
Iteration: 65%|βββββββ | 65/100 [00:35<00:19, 1.80it/s][A
Iteration: 66%|βββββββ | 66/100 [00:35<00:18, 1.80it/s][A
Iteration: 67%|βββββββ | 67/100 [00:36<00:18, 1.80it/s][A
Iteration: 68%|βββββββ | 68/100 [00:37<00:17, 1.81it/s][A
Iteration: 69%|βββββββ | 69/100 [00:37<00:17, 1.82it/s][A
Iteration: 70%|βββββββ | 70/100 [00:38<00:16, 1.81it/s][A
Iteration: 71%|βββββββ | 71/100 [00:38<00:15, 1.82it/s][A
Iteration: 72%|ββββββββ | 72/100 [00:39<00:15, 1.80it/s][A
Iteration: 73%|ββββββββ | 73/100 [00:39<00:14, 1.82it/s][A
Iteration: 74%|ββββββββ | 74/100 [00:40<00:14, 1.83it/s][A
Iteration: 75%|ββββββββ | 75/100 [00:40<00:13, 1.83it/s][A
Iteration: 76%|ββββββββ | 76/100 [00:41<00:13, 1.84it/s][A
Iteration: 77%|ββββββββ | 77/100 [00:41<00:12, 1.86it/s][A
Iteration: 78%|ββββββββ | 78/100 [00:42<00:11, 1.86it/s][A
Iteration: 79%|ββββββββ | 79/100 [00:42<00:11, 1.86it/s][A
Iteration: 80%|ββββββββ | 80/100 [00:43<00:10, 1.85it/s][A
Iteration: 81%|ββββββββ | 81/100 [00:44<00:10, 1.84it/s][A
Iteration: 82%|βββββββββ | 82/100 [00:44<00:09, 1.83it/s][A
Iteration: 83%|βββββββββ | 83/100 [00:45<00:09, 1.83it/s][A
Iteration: 84%|βββββββββ | 84/100 [00:45<00:08, 1.82it/s][A
Iteration: 85%|βββββββββ | 85/100 [00:46<00:08, 1.82it/s][A
Iteration: 86%|βββββββββ | 86/100 [00:46<00:07, 1.85it/s][A
Iteration: 87%|βββββββββ | 87/100 [00:47<00:06, 1.88it/s][A
Iteration: 88%|βββββββββ | 88/100 [00:47<00:06, 1.90it/s][A
Iteration: 89%|βββββββββ | 89/100 [00:48<00:05, 1.91it/s][A
Iteration: 90%|βββββββββ | 90/100 [00:48<00:05, 1.92it/s][A
Iteration: 91%|βββββββββ | 91/100 [00:49<00:04, 1.90it/s][A
Iteration: 92%|ββββββββββ| 92/100 [00:49<00:04, 1.87it/s][A
Iteration: 93%|ββββββββββ| 93/100 [00:50<00:03, 1.86it/s][A
Iteration: 94%|ββββββββββ| 94/100 [00:51<00:03, 1.85it/s][A
Iteration: 95%|ββββββββββ| 95/100 [00:51<00:02, 1.84it/s][A
Iteration: 96%|ββββββββββ| 96/100 [00:52<00:02, 1.84it/s][A
Iteration: 97%|ββββββββββ| 97/100 [00:52<00:01, 1.85it/s][A
Iteration: 98%|ββββββββββ| 98/100 [00:53<00:01, 1.86it/s][A
Iteration: 99%|ββββββββββ| 99/100 [00:53<00:00, 1.86it/s][A
Iteration: 100%|ββββββββββ| 100/100 [00:54<00:00, 1.93it/s][A
Epoch: 20%|ββ | 6/30 [05:30<22:05, 55.22s/it]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
Iteration: 1%| | 1/100 [00:00<00:55, 1.77it/s][A
Iteration: 2%|β | 2/100 [00:01<00:54, 1.78it/s][A
Iteration: 3%|β | 3/100 [00:01<00:53, 1.80it/s][A
Iteration: 4%|β | 4/100 [00:02<00:52, 1.82it/s][A
Iteration: 5%|β | 5/100 [00:02<00:51, 1.83it/s][A
Iteration: 6%|β | 6/100 [00:03<00:51, 1.83it/s][A
Iteration: 7%|β | 7/100 [00:03<00:50, 1.83it/s][A
Iteration: 8%|β | 8/100 [00:04<00:49, 1.84it/s][A
Iteration: 9%|β | 9/100 [00:04<00:49, 1.85it/s][A
Iteration: 10%|β | 10/100 [00:05<00:48, 1.86it/s][A
Iteration: 11%|β | 11/100 [00:05<00:48, 1.85it/s][A
Iteration: 12%|ββ | 12/100 [00:06<00:47, 1.84it/s][A
Iteration: 13%|ββ | 13/100 [00:07<00:47, 1.85it/s][A
Iteration: 14%|ββ | 14/100 [00:07<00:46, 1.85it/s][A
Iteration: 15%|ββ | 15/100 [00:08<00:45, 1.85it/s][A
Iteration: 16%|ββ | 16/100 [00:08<00:45, 1.86it/s][A
Iteration: 17%|ββ | 17/100 [00:09<00:44, 1.86it/s][A
Iteration: 18%|ββ | 18/100 [00:09<00:44, 1.84it/s][A
Iteration: 19%|ββ | 19/100 [00:10<00:44, 1.80it/s][A
Iteration: 20%|ββ | 20/100 [00:10<00:44, 1.80it/s][A
Iteration: 21%|ββ | 21/100 [00:11<00:43, 1.80it/s][A
Iteration: 22%|βββ | 22/100 [00:12<00:43, 1.80it/s][A
Iteration: 23%|βββ | 23/100 [00:12<00:42, 1.81it/s][A
Iteration: 24%|βββ | 24/100 [00:13<00:42, 1.81it/s][A
Iteration: 25%|βββ | 25/100 [00:13<00:41, 1.81it/s][A
Iteration: 26%|βββ | 26/100 [00:14<00:40, 1.85it/s][A
Iteration: 27%|βββ | 27/100 [00:14<00:38, 1.88it/s][A
Iteration: 28%|βββ | 28/100 [00:15<00:38, 1.88it/s][A
Iteration: 29%|βββ | 29/100 [00:15<00:37, 1.88it/s][A
Iteration: 30%|βββ | 30/100 [00:16<00:37, 1.89it/s][A
Iteration: 31%|βββ | 31/100 [00:16<00:37, 1.86it/s][A
Iteration: 32%|ββββ | 32/100 [00:17<00:36, 1.85it/s][A
Iteration: 33%|ββββ | 33/100 [00:17<00:36, 1.85it/s][A
Iteration: 34%|ββββ | 34/100 [00:18<00:35, 1.85it/s][A
Iteration: 35%|ββββ | 35/100 [00:18<00:35, 1.86it/s][A
Iteration: 36%|ββββ | 36/100 [00:19<00:34, 1.86it/s][A
Iteration: 37%|ββββ | 37/100 [00:20<00:34, 1.85it/s][A
Iteration: 38%|ββββ | 38/100 [00:20<00:33, 1.85it/s][A
Iteration: 39%|ββββ | 39/100 [00:21<00:33, 1.83it/s][A
Iteration: 40%|ββββ | 40/100 [00:21<00:32, 1.83it/s][A
Iteration: 41%|ββββ | 41/100 [00:22<00:32, 1.83it/s][A
Iteration: 42%|βββββ | 42/100 [00:22<00:31, 1.82it/s][A
Iteration: 43%|βββββ | 43/100 [00:23<00:31, 1.80it/s][A
Iteration: 44%|βββββ | 44/100 [00:23<00:31, 1.80it/s][A
Iteration: 45%|βββββ | 45/100 [00:24<00:30, 1.80it/s][A
Iteration: 46%|βββββ | 46/100 [00:25<00:29, 1.81it/s][A
Iteration: 47%|βββββ | 47/100 [00:25<00:29, 1.81it/s][A
Iteration: 48%|βββββ | 48/100 [00:26<00:28, 1.82it/s][A
Iteration: 49%|βββββ | 49/100 [00:26<00:28, 1.82it/s][A
Iteration: 50%|βββββ | 50/100 [00:27<00:27, 1.82it/s][A
Iteration: 51%|βββββ | 51/100 [00:27<00:26, 1.83it/s][A
Iteration: 52%|ββββββ | 52/100 [00:28<00:26, 1.84it/s][A
Iteration: 53%|ββββββ | 53/100 [00:28<00:25, 1.85it/s][A
Iteration: 54%|ββββββ | 54/100 [00:29<00:24, 1.85it/s][A
Iteration: 55%|ββββββ | 55/100 [00:29<00:24, 1.85it/s][A
Iteration: 56%|ββββββ | 56/100 [00:30<00:23, 1.84it/s][A
Iteration: 57%|ββββββ | 57/100 [00:31<00:23, 1.83it/s][A
Iteration: 58%|ββββββ | 58/100 [00:31<00:22, 1.84it/s][A
Iteration: 59%|ββββββ | 59/100 [00:32<00:22, 1.83it/s][A
Iteration: 60%|ββββββ | 60/100 [00:32<00:21, 1.83it/s][A
Iteration: 61%|ββββββ | 61/100 [00:33<00:21, 1.83it/s][A
Iteration: 62%|βββββββ | 62/100 [00:33<00:20, 1.83it/s][A
Iteration: 63%|βββββββ | 63/100 [00:34<00:20, 1.83it/s][A
Iteration: 64%|βββββββ | 64/100 [00:34<00:19, 1.83it/s][A
Iteration: 65%|βββββββ | 65/100 [00:35<00:19, 1.82it/s][A
Iteration: 66%|βββββββ | 66/100 [00:35<00:18, 1.83it/s][A
Iteration: 67%|βββββββ | 67/100 [00:36<00:17, 1.84it/s][A
Iteration: 68%|βββββββ | 68/100 [00:37<00:17, 1.83it/s][A
Iteration: 69%|βββββββ | 69/100 [00:37<00:16, 1.84it/s][A
Iteration: 70%|βββββββ | 70/100 [00:38<00:16, 1.85it/s][A
Iteration: 71%|βββββββ | 71/100 [00:38<00:15, 1.85it/s][A
Iteration: 72%|ββββββββ | 72/100 [00:39<00:15, 1.83it/s][A
Iteration: 73%|ββββββββ | 73/100 [00:39<00:14, 1.83it/s][A
Iteration: 74%|ββββββββ | 74/100 [00:40<00:14, 1.84it/s][A
Iteration: 75%|ββββββββ | 75/100 [00:40<00:13, 1.85it/s][A
Iteration: 76%|ββββββββ | 76/100 [00:41<00:13, 1.85it/s][A
Iteration: 77%|ββββββββ | 77/100 [00:41<00:12, 1.83it/s][A
Iteration: 78%|ββββββββ | 78/100 [00:42<00:12, 1.83it/s][A
Iteration: 79%|ββββββββ | 79/100 [00:43<00:11, 1.84it/s][A
Iteration: 80%|ββββββββ | 80/100 [00:43<00:10, 1.85it/s][A
Iteration: 81%|ββββββββ | 81/100 [00:44<00:10, 1.84it/s][A
Iteration: 82%|βββββββββ | 82/100 [00:44<00:09, 1.83it/s][A
Iteration: 83%|βββββββββ | 83/100 [00:45<00:09, 1.83it/s][A
Iteration: 84%|βββββββββ | 84/100 [00:45<00:08, 1.83it/s][A
Iteration: 85%|βββββββββ | 85/100 [00:46<00:08, 1.83it/s][A
Iteration: 86%|βββββββββ | 86/100 [00:46<00:07, 1.83it/s][A
Iteration: 87%|βββββββββ | 87/100 [00:47<00:07, 1.83it/s][A
Iteration: 88%|βββββββββ | 88/100 [00:47<00:06, 1.84it/s][A
Iteration: 89%|βββββββββ | 89/100 [00:48<00:05, 1.84it/s][A
Iteration: 90%|βββββββββ | 90/100 [00:49<00:05, 1.84it/s][A
Iteration: 91%|βββββββββ | 91/100 [00:49<00:04, 1.83it/s][A
Iteration: 92%|ββββββββββ| 92/100 [00:50<00:04, 1.83it/s][A
Iteration: 93%|ββββββββββ| 93/100 [00:50<00:03, 1.81it/s][A
Iteration: 94%|ββββββββββ| 94/100 [00:51<00:03, 1.82it/s][A
Iteration: 95%|ββββββββββ| 95/100 [00:51<00:02, 1.80it/s][A
Iteration: 96%|ββββββββββ| 96/100 [00:52<00:02, 1.81it/s][A
Iteration: 97%|ββββββββββ| 97/100 [00:52<00:01, 1.82it/s][A
Iteration: 98%|ββββββββββ| 98/100 [00:53<00:01, 1.81it/s][A
Iteration: 99%|ββββββββββ| 99/100 [00:53<00:00, 1.82it/s][A
Iteration: 100%|ββββββββββ| 100/100 [00:54<00:00, 1.90it/s][A
Epoch: 23%|βββ | 7/30 [06:25<21:04, 54.99s/it]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
Iteration: 1%| | 1/100 [00:00<00:55, 1.79it/s][A
Iteration: 2%|β | 2/100 [00:01<00:54, 1.79it/s][A
Iteration: 3%|β | 3/100 [00:01<00:53, 1.82it/s][A tensor(2.1750, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.2600, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.5678, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.1569, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.1075, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4970, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9506, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1941, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.7229, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3056, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2301, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4112, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4023, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9740, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4024, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.5570, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9153, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3852, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3258, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.2480, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4198, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1784, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1883, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2148, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5897, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9266, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.7152, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4451, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4336, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8127, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.6028, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7686, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9159, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.1304, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(1.8454, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5320, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5033, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4037, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1693, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.6606, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5839, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4180, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.7493, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3283, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0153, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5268, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5855, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1863, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5388, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.3705, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2602, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6562, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3641, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5900, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5265, device='cuda:0', grad_fn=<MeanBackward1>) tensor(4.1762, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.3156, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6298, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7736, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6361, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3744, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6704, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0662, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9536, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0500, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0606, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4916, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4672, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5074, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(1.0915, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5359, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.6047, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0464, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2997, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2797, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6955, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6950, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4739, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5573, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9329, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1302, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4602, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2103, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4366, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1661, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2588, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.0366, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.7053, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.7450, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4361, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.0249, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4535, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.7366, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8457, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2021, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2838, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.7775, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2273, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6774, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1070, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.3469, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9098, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.7884, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6653, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(0.1182, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7700, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2538, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3085, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.0935, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0880, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.6114, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.9359, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5215, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.6319, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.2486, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4777, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1273, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3742, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0937, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2066, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0391, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8095, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1030, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.3667, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.7630, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3822, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.5481, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1493, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.5742, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0682, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0222, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2346, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.9199, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7022, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8941, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2483, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0275, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0141, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0229, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(1.7532, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0169, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1179, device='cuda:0', grad_fn=<MeanBackward1>)
Iteration: 4%|β | 4/100 [00:02<00:52, 1.83it/s][A
Iteration: 5%|β | 5/100 [00:02<00:51, 1.84it/s][A
Iteration: 6%|β | 6/100 [00:03<00:50, 1.84it/s][A
Iteration: 7%|β | 7/100 [00:03<00:50, 1.83it/s][A
Iteration: 8%|β | 8/100 [00:04<00:50, 1.82it/s][A
Iteration: 9%|β | 9/100 [00:04<00:49, 1.83it/s][A
Iteration: 10%|β | 10/100 [00:05<00:48, 1.84it/s][A
Iteration: 11%|β | 11/100 [00:05<00:48, 1.85it/s][A
Iteration: 12%|ββ | 12/100 [00:06<00:47, 1.86it/s][A
Iteration: 13%|ββ | 13/100 [00:07<00:46, 1.86it/s][A
Iteration: 14%|ββ | 14/100 [00:07<00:46, 1.84it/s][A
Iteration: 15%|ββ | 15/100 [00:08<00:46, 1.83it/s][A
Iteration: 16%|ββ | 16/100 [00:08<00:46, 1.81it/s][A
Iteration: 17%|ββ | 17/100 [00:09<00:46, 1.80it/s][A
Iteration: 18%|ββ | 18/100 [00:09<00:45, 1.80it/s][A
Iteration: 19%|ββ | 19/100 [00:10<00:44, 1.81it/s][A
Iteration: 20%|ββ | 20/100 [00:10<00:43, 1.83it/s][A
Iteration: 21%|ββ | 21/100 [00:11<00:43, 1.83it/s][A
Iteration: 22%|βββ | 22/100 [00:12<00:42, 1.84it/s][A
Iteration: 23%|βββ | 23/100 [00:12<00:41, 1.84it/s][A
Iteration: 24%|βββ | 24/100 [00:13<00:41, 1.83it/s][A
Iteration: 25%|βββ | 25/100 [00:13<00:41, 1.83it/s][A
Iteration: 26%|βββ | 26/100 [00:14<00:40, 1.83it/s][A
Iteration: 27%|βββ | 27/100 [00:14<00:39, 1.84it/s][A
Iteration: 28%|βββ | 28/100 [00:15<00:38, 1.85it/s][A
Iteration: 29%|βββ | 29/100 [00:15<00:38, 1.84it/s][A
Iteration: 30%|βββ | 30/100 [00:16<00:38, 1.83it/s][A
Iteration: 31%|βββ | 31/100 [00:16<00:37, 1.82it/s][A
Iteration: 32%|ββββ | 32/100 [00:17<00:37, 1.81it/s][A
Iteration: 33%|ββββ | 33/100 [00:18<00:36, 1.82it/s][A
Iteration: 34%|ββββ | 34/100 [00:18<00:36, 1.82it/s][A
Iteration: 35%|ββββ | 35/100 [00:19<00:35, 1.82it/s][A
Iteration: 36%|ββββ | 36/100 [00:19<00:35, 1.82it/s][A
Iteration: 37%|ββββ | 37/100 [00:20<00:34, 1.82it/s][A
Iteration: 38%|ββββ | 38/100 [00:20<00:34, 1.82it/s][A
Iteration: 39%|ββββ | 39/100 [00:21<00:33, 1.82it/s][A
Iteration: 40%|ββββ | 40/100 [00:21<00:32, 1.82it/s][A
Iteration: 41%|ββββ | 41/100 [00:22<00:32, 1.80it/s][A
Iteration: 42%|βββββ | 42/100 [00:23<00:32, 1.80it/s][A
Iteration: 43%|βββββ | 43/100 [00:23<00:31, 1.80it/s][A
Iteration: 44%|βββββ | 44/100 [00:24<00:30, 1.81it/s][A
Iteration: 45%|βββββ | 45/100 [00:24<00:30, 1.82it/s][A
Iteration: 46%|βββββ | 46/100 [00:25<00:29, 1.82it/s][A
Iteration: 47%|βββββ | 47/100 [00:25<00:29, 1.82it/s][A
Iteration: 48%|βββββ | 48/100 [00:26<00:28, 1.82it/s][A
Iteration: 49%|βββββ | 49/100 [00:26<00:28, 1.82it/s][A
Iteration: 50%|βββββ | 50/100 [00:27<00:27, 1.82it/s][A
Iteration: 51%|βββββ | 51/100 [00:27<00:26, 1.82it/s][A
Iteration: 52%|ββββββ | 52/100 [00:28<00:26, 1.82it/s][A
Iteration: 53%|ββββββ | 53/100 [00:29<00:25, 1.83it/s][A
Iteration: 54%|ββββββ | 54/100 [00:29<00:25, 1.83it/s][A
Iteration: 55%|ββββββ | 55/100 [00:30<00:24, 1.84it/s][A
Iteration: 56%|ββββββ | 56/100 [00:30<00:23, 1.84it/s][A
Iteration: 57%|ββββββ | 57/100 [00:31<00:23, 1.84it/s][A
Iteration: 58%|ββββββ | 58/100 [00:31<00:22, 1.84it/s][A
Iteration: 59%|ββββββ | 59/100 [00:32<00:22, 1.83it/s][A
Iteration: 60%|ββββββ | 60/100 [00:32<00:21, 1.82it/s][A
Iteration: 61%|ββββββ | 61/100 [00:33<00:21, 1.83it/s][A
Iteration: 62%|βββββββ | 62/100 [00:33<00:20, 1.84it/s][A
Iteration: 63%|βββββββ | 63/100 [00:34<00:20, 1.85it/s][A
Iteration: 64%|βββββββ | 64/100 [00:34<00:19, 1.85it/s][A
Iteration: 65%|βββββββ | 65/100 [00:35<00:19, 1.83it/s][A
Iteration: 66%|βββββββ | 66/100 [00:36<00:18, 1.81it/s][A
Iteration: 67%|βββββββ | 67/100 [00:36<00:18, 1.81it/s][A
Iteration: 68%|βββββββ | 68/100 [00:37<00:17, 1.82it/s][A
Iteration: 69%|βββββββ | 69/100 [00:37<00:16, 1.82it/s][A
Iteration: 70%|βββββββ | 70/100 [00:38<00:16, 1.83it/s][A
Iteration: 71%|βββββββ | 71/100 [00:38<00:15, 1.83it/s][A
Iteration: 72%|ββββββββ | 72/100 [00:39<00:15, 1.86it/s][A
Iteration: 73%|ββββββββ | 73/100 [00:39<00:14, 1.89it/s][A
Iteration: 74%|ββββββββ | 74/100 [00:40<00:13, 1.90it/s][A
Iteration: 75%|ββββββββ | 75/100 [00:40<00:13, 1.92it/s][A
Iteration: 76%|ββββββββ | 76/100 [00:41<00:12, 1.93it/s][A
Iteration: 77%|ββββββββ | 77/100 [00:41<00:12, 1.91it/s][A
Iteration: 78%|ββββββββ | 78/100 [00:42<00:11, 1.88it/s][A
Iteration: 79%|ββββββββ | 79/100 [00:43<00:11, 1.85it/s][A
Iteration: 80%|ββββββββ | 80/100 [00:43<00:10, 1.84it/s][A
Iteration: 81%|ββββββββ | 81/100 [00:44<00:10, 1.84it/s][A
Iteration: 82%|βββββββββ | 82/100 [00:44<00:09, 1.85it/s][A
Iteration: 83%|βββββββββ | 83/100 [00:45<00:09, 1.85it/s][A
Iteration: 84%|βββββββββ | 84/100 [00:45<00:08, 1.84it/s][A
Iteration: 85%|βββββββββ | 85/100 [00:46<00:08, 1.84it/s][A
Iteration: 86%|βββββββββ | 86/100 [00:46<00:07, 1.84it/s][A
Iteration: 87%|βββββββββ | 87/100 [00:47<00:07, 1.85it/s][A
Iteration: 88%|βββββββββ | 88/100 [00:47<00:06, 1.86it/s][A
Iteration: 89%|βββββββββ | 89/100 [00:48<00:05, 1.86it/s][A
Iteration: 90%|βββββββββ | 90/100 [00:49<00:05, 1.84it/s][A
Iteration: 91%|βββββββββ | 91/100 [00:49<00:04, 1.82it/s][A
Iteration: 92%|ββββββββββ| 92/100 [00:50<00:04, 1.81it/s][A
Iteration: 93%|ββββββββββ| 93/100 [00:50<00:03, 1.79it/s][A
Iteration: 94%|ββββββββββ| 94/100 [00:51<00:03, 1.78it/s][A
Iteration: 95%|ββββββββββ| 95/100 [00:51<00:02, 1.78it/s][A
Iteration: 96%|ββββββββββ| 96/100 [00:52<00:02, 1.80it/s][A
Iteration: 97%|ββββββββββ| 97/100 [00:52<00:01, 1.80it/s][A
Iteration: 98%|ββββββββββ| 98/100 [00:53<00:01, 1.80it/s][A
Iteration: 99%|ββββββββββ| 99/100 [00:54<00:00, 1.81it/s][A
Iteration: 100%|ββββββββββ| 100/100 [00:54<00:00, 1.89it/s][A
Epoch: 27%|βββ | 8/30 [07:19<20:06, 54.85s/it]
Iteration: 0%| | 0/100 [00:00<?, ?it/s][A
Iteration: 1%| | 1/100 [00:00<00:54, 1.80it/s][A
Iteration: 2%|β | 2/100 [00:01<00:54, 1.81it/s][A
Iteration: 3%|β | 3/100 [00:01<00:53, 1.81it/s][A
Iteration: 4%|β | 4/100 [00:02<00:52, 1.83it/s][A
Iteration: 5%|β | 5/100 [00:02<00:51, 1.83it/s][A
Iteration: 6%|β | 6/100 [00:03<00:50, 1.85it/s][A
Iteration: 7%|β | 7/100 [00:03<00:50, 1.85it/s][A
Iteration: 8%|β | 8/100 [00:04<00:49, 1.85it/s][A
Iteration: 9%|β | 9/100 [00:04<00:49, 1.84it/s][A
Iteration: 10%|β | 10/100 [00:05<00:48, 1.84it/s][A
Iteration: 11%|β | 11/100 [00:06<00:49, 1.81it/s][A
Iteration: 12%|ββ | 12/100 [00:06<00:48, 1.83it/s][A
Iteration: 13%|ββ | 13/100 [00:07<00:47, 1.85it/s][A
Iteration: 14%|ββ | 14/100 [00:07<00:46, 1.87it/s][A
Iteration: 15%|ββ | 15/100 [00:08<00:44, 1.89it/s][A
Iteration: 16%|ββ | 16/100 [00:08<00:44, 1.91it/s][A
Iteration: 17%|ββ | 17/100 [00:09<00:43, 1.90it/s][A
Iteration: 18%|ββ | 18/100 [00:09<00:44, 1.86it/s][A
Iteration: 19%|ββ | 19/100 [00:10<00:44, 1.84it/s][A
Iteration: 20%|ββ | 20/100 [00:10<00:43, 1.83it/s][A
Iteration: 21%|ββ | 21/100 [00:11<00:42, 1.85it/s][A
Iteration: 22%|βββ | 22/100 [00:11<00:42, 1.85it/s][A
Iteration: 23%|βββ | 23/100 [00:12<00:41, 1.85it/s][A
Iteration: 24%|βββ | 24/100 [00:12<00:40, 1.86it/s][A
Iteration: 25%|βββ | 25/100 [00:13<00:40, 1.84it/s][A
Iteration: 26%|βββ | 26/100 [00:14<00:40, 1.84it/s][A
Iteration: 27%|βββ | 27/100 [00:14<00:39, 1.83it/s][A
Iteration: 28%|βββ | 28/100 [00:15<00:39, 1.83it/s][A
Iteration: 29%|βββ | 29/100 [00:15<00:38, 1.84it/s][A
Iteration: 30%|βββ | 30/100 [00:16<00:38, 1.83it/s][A
Iteration: 31%|βββ | 31/100 [00:16<00:38, 1.81it/s][A
Iteration: 32%|ββββ | 32/100 [00:17<00:37, 1.81it/s][A
Iteration: 33%|ββββ | 33/100 [00:17<00:36, 1.81it/s][A
Iteration: 34%|ββββ | 34/100 [00:18<00:36, 1.82it/s][A
Iteration: 35%|ββββ | 35/100 [00:19<00:35, 1.83it/s][A
Iteration: 36%|ββββ | 36/100 [00:19<00:34, 1.83it/s][A
Iteration: 37%|ββββ | 37/100 [00:20<00:34, 1.84it/s][A
Iteration: 38%|ββββ | 38/100 [00:20<00:33, 1.84it/s][A
Iteration: 39%|ββββ | 39/100 [00:21<00:33, 1.83it/s][A
Iteration: 40%|ββββ | 40/100 [00:21<00:32, 1.82it/s][A
Iteration: 41%|ββββ | 41/100 [00:22<00:32, 1.82it/s][A
Iteration: 42%|βββββ | 42/100 [00:22<00:31, 1.82it/s][A
Iteration: 43%|βββββ | 43/100 [00:23<00:31, 1.82it/s][A
Iteration: 44%|βββββ | 44/100 [00:23<00:30, 1.81it/s][A
Iteration: 45%|βββββ | 45/100 [00:24<00:30, 1.81it/s][A
tensor(0.1441, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1497, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9167, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1851, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8080, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.9862, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0317, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0425, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1542, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0861, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0225, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0574, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2105, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3675, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4448, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.4845, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3738, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8085, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5127, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6588, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4577, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9319, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4344, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1920, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0266, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5289, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2385, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2580, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5123, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1011, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9435, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3350, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1632, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8856, device='cuda:0', grad_fn=<MeanBackward1>)
tensor(0.3756, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6524, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4251, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9508, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1952, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1232, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1657, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8077, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0481, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.8182, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0821, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4515, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3619, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9553, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4308, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5923, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.6533, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.8023, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4351, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.9200, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6540, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.4698, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0301, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3490, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0802, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2242, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3890, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0521, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0418, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1429, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0895, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5439, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2903, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1201, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3739, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(0.2464, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0344, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3257, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5080, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6775, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.8394, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.2206, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2421, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0650, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.5008, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.8323, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3994, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9032, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2641, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4107, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.7043, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2151, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6050, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1040, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3156, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5811, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.8203, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2720, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3273, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.4406, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1958, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9906, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0668, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1414, device='cuda:0', grad_fn=<MeanBackward1>) tensor(3.7456, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0483, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.5880, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2942, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1597, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0977, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(0.2877, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2098, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1108, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.7298, device='cuda:0', grad_fn=<MeanBackward1>) tensor(2.5583, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4492, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2240, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0434, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0773, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.2337, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4881, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1105, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.0712, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1268, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9801, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0778, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0115, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0174, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.3667, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0037, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2425, device='cuda:0', grad_fn=<MeanBackward1>) tensor(1.1389, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5085, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2370, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0373, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2490, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.6200, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5539, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1891, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.4070, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.5485, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0907, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9299, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.2755, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.9592, device='cuda:0', grad_fn=<MeanBackward1>) 
tensor(0.2240, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.0988, device='cuda:0', grad_fn=<MeanBackward1>) tensor(0.1550, device='cuda:0', grad_fn=<MeanBackward1>)
Iteration: 46%|βββββ | 46/100 [00:25<00:30, 1.79it/s][A
Iteration: 47%|βββββ | 47/100 [00:25<00:29, 1.79it/s][A
Iteration: 48%|βββββ | 48/100 [00:26<00:29, 1.79it/s][A
Iteration: 49%|βββββ | 49/100 [00:26<00:28, 1.80it/s][A
Iteration: 50%|βββββ | 50/100 [00:27<00:27, 1.81it/s][A
Iteration: 51%|βββββ | 51/100 [00:27<00:27, 1.81it/s][A
Iteration: 52%|ββββββ | 52/100 [00:28<00:26, 1.83it/s][A
Iteration: 53%|ββββββ | 53/100 [00:28<00:25, 1.83it/s][A
Iteration: 54%|ββββββ | 54/100 [00:29<00:25, 1.82it/s][A
Iteration: 55%|ββββββ | 55/100 [00:30<00:24, 1.82it/s][A
Iteration: 56%|ββββββ | 56/100 [00:30<00:24, 1.82it/s][A
Iteration: 57%|ββββββ | 57/100 [00:31<00:23, 1.84it/s][A
Iteration: 58%|ββββββ | 58/100 [00:31<00:22, 1.85it/s][A
Iteration: 59%|ββββββ | 59/100 [00:32<00:22, 1.84it/s][A
Iteration: 60%|ββββββ | 60/100 [00:32<00:21, 1.85it/s][A
Iteration: 61%|ββββββ | 61/100 [00:33<00:21, 1.85it/s][A
Iteration: 62%|βββββββ | 62/100 [00:33<00:20, 1.84it/s][A
Iteration: 63%|βββββββ | 63/100 [00:34<00:20, 1.84it/s][A
Iteration: 64%|βββββββ | 64/100 [00:34<00:19, 1.84it/s][A
Iteration: 65%|βββββββ | 65/100 [00:35<00:19, 1.84it/s][A
Iteration: 66%|βββββββ | 66/100 [00:35<00:18, 1.83it/s][A
Iteration: 67%|βββββββ | 67/100 [00:36<00:18, 1.83it/s][A
Iteration: 68%|βββββββ | 68/100 [00:37<00:17, 1.81it/s][A
Iteration: 69%|βββββββ | 69/100 [00:37<00:17, 1.81it/s][A
Iteration: 70%|βββββββ | 70/100 [00:38<00:16, 1.83it/s][A
Iteration: 71%|βββββββ | 71/100 [00:38<00:15, 1.84it/s][A
Iteration: 72%|ββββββββ | 72/100 [00:39<00:15, 1.84it/s][A
Iteration: 73%|ββββββββ | 73/100 [00:39<00:14, 1.83it/s][A
Iteration: 74%|ββββββββ | 74/100 [00:40<00:14, 1.83it/s][A
Iteration: 75%|ββββββββ | 75/100 [00:40<00:13, 1.82it/s][A
Iteration: 76%|ββββββββ | 76/100 [00:41<00:13, 1.82it/s][A
Iteration: 77%|ββββββββ | 77/100 [00:42<00:12, 1.81it/s][A
Iteration: 78%|ββββββββ | 78/100 [00:42<00:12, 1.81it/s][A